parse-stack-next 5.1.1 → 5.2.0

Files changed (58)
  1. checksums.yaml +4 -4
  2. data/.env.sample +12 -0
  3. data/.env.test +4 -4
  4. data/CHANGELOG.md +545 -0
  5. data/Gemfile +3 -0
  6. data/Gemfile.lock +6 -1
  7. data/README.md +167 -38
  8. data/Rakefile +56 -10
  9. data/docs/atlas_vector_search_guide.md +110 -9
  10. data/docs/mcp_guide.md +433 -0
  11. data/docs/mongodb_direct_guide.md +66 -1
  12. data/docs/mongodb_index_optimization_guide.md +22 -1
  13. data/docs/usage_guide.md +15 -0
  14. data/lib/parse/agent/approval_gate.rb +0 -0
  15. data/lib/parse/agent/constraint_translator.rb +90 -19
  16. data/lib/parse/agent/describe.rb +1 -0
  17. data/lib/parse/agent/errors.rb +16 -0
  18. data/lib/parse/agent/mcp_client.rb +9 -0
  19. data/lib/parse/agent/mcp_dispatcher.rb +139 -7
  20. data/lib/parse/agent/mcp_rack_app.rb +621 -17
  21. data/lib/parse/agent/mcp_subscriptions.rb +607 -0
  22. data/lib/parse/agent/metadata_dsl.rb +58 -0
  23. data/lib/parse/agent/metadata_registry.rb +141 -1
  24. data/lib/parse/agent/prompt_hardening.rb +213 -0
  25. data/lib/parse/agent/result_formatter.rb +18 -3
  26. data/lib/parse/agent/tools.rb +167 -24
  27. data/lib/parse/agent.rb +692 -21
  28. data/lib/parse/client/request.rb +55 -4
  29. data/lib/parse/client/response.rb +4 -0
  30. data/lib/parse/client.rb +205 -7
  31. data/lib/parse/model/classes/installation.rb +27 -10
  32. data/lib/parse/model/classes/user.rb +8 -0
  33. data/lib/parse/model/core/actions.rb +58 -4
  34. data/lib/parse/model/core/embed_managed.rb +19 -14
  35. data/lib/parse/model/core/indexing.rb +108 -16
  36. data/lib/parse/model/core/querying.rb +29 -0
  37. data/lib/parse/model/model.rb +34 -3
  38. data/lib/parse/model/object.rb +1 -0
  39. data/lib/parse/query.rb +90 -24
  40. data/lib/parse/retrieval/agent_tool.rb +369 -0
  41. data/lib/parse/retrieval/chunk.rb +74 -0
  42. data/lib/parse/retrieval/chunker.rb +208 -0
  43. data/lib/parse/retrieval/retriever.rb +274 -0
  44. data/lib/parse/retrieval.rb +10 -0
  45. data/lib/parse/schema.rb +69 -20
  46. data/lib/parse/stack/version.rb +2 -2
  47. data/parse-stack-next.gemspec +1 -1
  48. data/scripts/docker/docker-compose.atlas.yml +14 -10
  49. data/scripts/docker/docker-compose.test.yml +24 -20
  50. data/scripts/docker/mongo-init.js +3 -3
  51. data/scripts/start-parse.sh +10 -0
  52. data/scripts/start_mcp_server.rb +1 -1
  53. data/scripts/test_server_connection.rb +1 -1
  54. data/scripts/vector_prototype/create_vector_index.js +1 -1
  55. data/scripts/vector_prototype/fetch_embeddings.py +2 -2
  56. data/scripts/vector_prototype/query_prototype.rb +1 -1
  57. data/scripts/vector_prototype/run.sh +4 -4
  58. metadata +10 -2
data/docs/mcp_guide.md CHANGED
@@ -360,6 +360,433 @@ Common uses for the direct dispatcher:
360
360
 
361
361
  ---
362
362
 
363
+ ## Connecting Claude Desktop (stdio bridge)
364
+
365
+ Parse Stack speaks MCP over **HTTP** (the standalone server and the
366
+ Rack adapter both expose a JSON-RPC-over-HTTP endpoint). Claude Desktop,
367
+ however, launches MCP servers as local **stdio** subprocesses — it does
368
+ not dial an HTTP URL directly. Bridge the two with
369
+ [`mcp-remote`](https://www.npmjs.com/package/mcp-remote), a small stdio↔HTTP
370
+ proxy that Claude Desktop runs as the subprocess and which forwards to your
371
+ HTTP endpoint.
372
+
373
+ 1. Start the Parse Stack MCP endpoint over HTTP (standalone or Rack — see
374
+ Deployment Modes above) and note its URL and the bearer token your
375
+ `agent_factory` expects, e.g. `http://localhost:3001/` with
376
+ `Authorization: Bearer <token>`.
377
+
378
+ 2. Add the bridge to `claude_desktop_config.json` (macOS:
379
+ `~/Library/Application Support/Claude/claude_desktop_config.json`;
380
+ Windows: `%APPDATA%\Claude\claude_desktop_config.json`):
381
+
382
+ ```json
383
+ {
384
+ "mcpServers": {
385
+ "parse-stack": {
386
+ "command": "npx",
387
+ "args": [
388
+ "-y",
389
+ "mcp-remote",
390
+ "http://localhost:3001/",
391
+ "--header",
392
+ "Authorization: Bearer ${PARSE_MCP_TOKEN}"
393
+ ],
394
+ "env": {
395
+ "PARSE_MCP_TOKEN": "your-mcp-token"
396
+ }
397
+ }
398
+ }
399
+ }
400
+ ```
401
+
402
+ 3. Restart Claude Desktop. The Parse Stack tools (`query_class`,
403
+ `get_schema`, `semantic_search`, …) appear in the client.
404
+
405
+ Notes:
406
+
407
+ - `mcp-remote` requires Node.js on the machine running Claude Desktop.
408
+ - For a public endpoint, terminate TLS in front of the HTTP server and use
409
+ an `https://` URL; the bearer token rides the `Authorization` header.
410
+ - The same bridge works for any stdio-only MCP client (e.g. some IDE
411
+ integrations). Clients that support remote MCP connectors natively can
412
+ point at the HTTP URL without the bridge.
413
+ - Approval workflows (elicitation) need the streaming/listening-stream
414
+ prerequisites described under Approval Workflows — confirm the bridge and
415
+ client forward the SSE channel before relying on human-in-the-loop gating.
416
+
417
+ ---
418
+
419
+ ## Resource Subscriptions (LiveQuery bridge)
420
+
421
+ MCP lets a client `resources/subscribe` to a resource URI and then receive
422
+ unsolicited `notifications/resources/updated` messages whenever the underlying
423
+ data changes. Parse Stack bridges that surface onto Parse LiveQuery: a
424
+ subscribed `parse://<Class>/count` or `parse://<Class>/samples` resource is
425
+ backed by a LiveQuery subscription on `<Class>`, and any matching
426
+ create/update/delete/enter/leave event is debounced into a single coarse
427
+ update for that URI. The client re-reads the resource via `resources/read` to
428
+ obtain the new value — row payloads are never streamed through the resource
429
+ surface.
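
The coalescing behaviour described above can be sketched as follows. This is a minimal illustration, not the gem's implementation: `UpdateCoalescer`, its `window:` and `clock:` parameters, and the choice to measure the window from the first unflushed event are all assumptions, since the guide does not specify the debounce strategy.

```ruby
# Hypothetical coalescer: collapses a burst of change events for one resource
# URI into a single "updated" notification per window.
class UpdateCoalescer
  def initialize(window: 0.25, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @window = window
    @clock = clock
    @pending = {} # uri => time the first unflushed event arrived
  end

  # Record a LiveQuery-style event (create/update/delete/enter/leave).
  # The event payload is ignored on purpose: only the URI is marked dirty.
  def record(uri, _event = nil)
    @pending[uri] ||= @clock.call
  end

  # Return one notification per URI whose window has elapsed, then clear it.
  def flush
    now = @clock.call
    due = @pending.select { |_uri, first_seen| now - first_seen >= @window }.keys
    due.each { |uri| @pending.delete(uri) }
    due.map do |uri|
      { jsonrpc: "2.0", method: "notifications/resources/updated", params: { uri: uri } }
    end
  end
end
```

Because the notification carries only the URI, a client still has to issue `resources/read` to see the new value, which matches the "row payloads are never streamed" rule.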
430
+
431
+ This is opt-in and requires a streaming-capable Rack server (Puma, Falcon —
432
+ WEBrick buffers responses and cannot hold the listening stream open) plus
433
+ LiveQuery enabled and configured.
434
+
435
+ ```ruby
436
+ # Boot: enable LiveQuery and point it at the server.
437
+ Parse.setup(
438
+ server_url: "https://your-parse-server.com/parse",
439
+ application_id: "your_app_id",
440
+ api_key: "your_api_key",
441
+ live_query_url: "wss://your-parse-server.com",
442
+ )
443
+ Parse.live_query_enabled = true
444
+
445
+ # Mount the Rack app with resource subscriptions enabled.
446
+ app = Parse::Agent::MCPRackApp.new(resource_subscriptions: true) do |env|
447
+ token = env["HTTP_AUTHORIZATION"].to_s.delete_prefix("Bearer ")
448
+ MyAuth.agent_for_token!(token) # returns a Parse::Agent or raises Unauthorized
449
+ end
450
+ ```
451
+
452
+ When enabled and LiveQuery is available, the `initialize` handshake advertises
453
+ `resources.subscribe: true`. When LiveQuery is not enabled/available — or on
454
+ the WEBrick `MCPServer`, which cannot stream — the capability stays
455
+ `subscribe: false` and `resources/subscribe` returns a "not supported" error.
456
+ The capability is a contract: it is never advertised unless the server can
457
+ actually deliver updates.
458
+
459
+ ### Protocol flow
460
+
461
+ 1. **`initialize`** — the response carries a server-issued `Mcp-Session-Id`
462
+ header. The client echoes it on every subsequent request.
463
+ 2. **`GET` listening stream** — the client opens a long-lived `GET` to the same
464
+ endpoint with `Accept: text/event-stream` and the `Mcp-Session-Id` header.
465
+ This is the server→client channel; it stays open and emits
466
+ `notifications/resources/updated` events until the client disconnects.
467
+ 3. **`resources/subscribe`** — a normal `POST` with
468
+ `{ "uri": "parse://Post/count" }`. Returns an empty result; updates begin
469
+ flowing on the listening stream.
470
+ 4. **`resources/unsubscribe`** — stops one subscription. `DELETE` with the
471
+ session id tears the whole session down.
472
+
473
+ Only `count` and `samples` resources are subscribable. `schema` is rejected
474
+ with an invalid-params error because schema changes are not LiveQuery events.
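
A client can check subscribability before sending step 3. The sketch below is illustrative only: the helper names and the URI regular expression are assumptions, and the server remains the authority on what it accepts.

```ruby
require "json"

# Resource kinds the guide lists as subscribable.
SUBSCRIBABLE_KINDS = %w[count samples].freeze

# Split "parse://<Class>/<kind>" into [class_name, kind]; nil when malformed.
def parse_resource_uri(uri)
  match = %r{\Aparse://([A-Za-z_][A-Za-z0-9_]*)/([a-z]+)\z}.match(uri)
  match && [match[1], match[2]]
end

def subscribable?(uri)
  _class_name, kind = parse_resource_uri(uri)
  SUBSCRIBABLE_KINDS.include?(kind)
end

# Build the JSON-RPC body for `resources/subscribe`. The id is client-chosen.
def subscribe_request(uri, id:)
  raise ArgumentError, "not subscribable: #{uri}" unless subscribable?(uri)

  JSON.generate(jsonrpc: "2.0", id: id, method: "resources/subscribe", params: { uri: uri })
end
```

The body would be sent as a `POST` with the `Mcp-Session-Id` header from step 1; the HTTP call itself is omitted here.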
475
+
476
+ ### Access control (important)
477
+
478
+ The bridge enforces the same scope rules as the rest of the SDK. LiveQuery
479
+ filters events server-side using the credential on the subscribe frame, so the
480
+ subscription's credentials are derived from the subscribing agent:
481
+
482
+ | Agent scope | LiveQuery credential | Events seen |
483
+ |-------------|----------------------|-------------|
484
+ | session-token agent | that session token | only rows the user can read (ACL/CLP enforced by Parse Server) |
485
+ | master-key agent | master key | every event |
486
+ | `acl_user:` / `acl_role:` agent | **refused** | none — see below |
487
+
488
+ `acl_user:` / `acl_role:` agents are an SDK-side, mongo-direct-only construct
489
+ with no Parse Server REST or LiveQuery equivalent (Parse Server has no
490
+ "act as this user pointer / role" handshake). Bridging them would force a
491
+ silent downgrade to either master key (a row-level leak) or an unscoped
492
+ session, so the bridge **fails closed** and refuses the subscription with a
493
+ security error. Subscribe with a session-token or master-key agent instead.
494
+
495
+ Because Parse Server fixes ACL-bypass authorization at LiveQuery *connect*
496
+ time (there is no per-subscription master key), the bridge keeps two
497
+ connections and routes by credential: master-posture subscriptions ride a
498
+ dedicated **admin** connection
499
+ (`Parse::LiveQuery::Client.new(use_master_key: true)`), while session-token
500
+ subscriptions ride a normal connection and pass their token per subscription.
501
+ Either way, an update only fires for an object the subscription's scope is
502
+ permitted to read — LiveQuery filters events by ACL server-side. (Whether a
503
+ master connection additionally surfaces master-key-only rows depends on the
504
+ Parse Server version and its `masterKeyIps` configuration.)
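
The routing rule in this section can be summarised in a few lines. This is a sketch under stated assumptions: the agent is modelled as a plain Hash, and `live_query_credential` and `SubscriptionRefused` are hypothetical names, not part of the gem.

```ruby
# Hypothetical error type standing in for the bridge's security error.
class SubscriptionRefused < StandardError; end

# Decide which LiveQuery connection and credential a subscription would use.
# The ACL-scope check comes first so that an agent carrying both an ACL scope
# and a stronger credential is still refused (fail closed).
def live_query_credential(agent)
  if agent[:acl_user] || agent[:acl_role]
    raise SubscriptionRefused, "acl_user:/acl_role: scopes have no LiveQuery equivalent"
  elsif agent[:session_token]
    { connection: :session, session_token: agent[:session_token] }
  elsif agent[:master_key]
    { connection: :admin }
  else
    raise SubscriptionRefused, "no usable credential"
  end
end
```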
505
+
506
+ ### Operational notes and limitations
507
+
508
+ - **Single-process.** Subscription state lives in the `MCPRackApp` instance
509
+ (like the cancellation registry), so in a clustered / multi-process
510
+ deployment a LiveQuery event observed on one worker does not reach a
511
+ listening stream held on another. The delivery seam
512
+ (`Parse::Agent::MCPSubscriptions::Notifier`) is isolated so a Redis-backed
513
+ pub/sub adapter can be supplied later without changing the bridge or the
514
+ dispatcher; pass it via `subscription_manager:`.
515
+ - **Subscriptions do not survive a listening-stream reconnect.** Closing the
516
+ `GET` stream tears down the session's LiveQuery subscriptions; a client that
517
+ reconnects must re-issue its `resources/subscribe` calls.
518
+ - **Session id is a bearer capability.** The listening stream authenticates via
519
+ the agent factory and keys delivery off the server-issued `Mcp-Session-Id`,
520
+ which the client must keep secret — possession of a valid session id (plus a
521
+ valid agent) is sufficient to attach. This matches the cancellation model.
522
+ - **Per-session and global caps.** A client that subscribes but never opens (or
523
+ later drops) its listening stream leaves LiveQuery subscriptions running until
524
+ the session is torn down. A per-session ceiling (default 100,
525
+ `max_subscriptions_per_session:` on the manager) bounds one session's
526
+ footprint, and a global ceiling on the number of distinct subscribing sessions
527
+ (default 10,000, `max_sessions:`) bounds total growth. The global cap is a
528
+ rejection cap (new sessions are refused with a JSON-RPC error once it is
529
+ reached) and fails closed.
530
+ - **Concurrent listening streams are bounded separately from request SSE.**
531
+ `max_concurrent_dispatchers:` does **not**, by itself, bound the GET listening
532
+ streams used for resource subscriptions and notifications — those get their own
533
+ soft cap *equal to* `max_concurrent_dispatchers`. So the effective steady-state
534
+ ceiling across both surfaces is up to **2× `max_concurrent_dispatchers`** (up
535
+ to N request-scoped SSE dispatchers plus N listening streams). Size the value
536
+ with that 2× factor in mind (e.g. relative to your Puma `max_threads`). Leaving
537
+ it unset (the default `nil`) leaves both surfaces uncapped; the app logs a
538
+ one-time warning at construction when a streaming or subscription/notification
539
+ surface is enabled without a cap.
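
The 2× arithmetic from the last bullet, written out. Both helpers are hypothetical, and the sizing rule (halve whatever thread budget remains after a reservation) is one reasonable reading of "size the value with that 2× factor in mind", not a documented formula.

```ruby
# Effective steady-state ceiling implied by one max_concurrent_dispatchers value.
def streaming_ceiling(max_concurrent_dispatchers)
  return nil if max_concurrent_dispatchers.nil? # nil leaves both surfaces uncapped

  n = Integer(max_concurrent_dispatchers)
  { request_sse: n, listening_streams: n, total: 2 * n }
end

# Hypothetical sizing rule: the largest cap whose 2x total fits a thread
# budget after reserving some threads for ordinary non-streaming requests.
def dispatcher_cap_for(max_threads:, reserved: 0)
  [(max_threads - reserved) / 2, 0].max
end
```

For example, with a Puma `max_threads` of 16 and 4 threads reserved, `dispatcher_cap_for` suggests a cap of 6, for a total of 12 long-lived streams.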
540
+
541
+ ---
542
+
543
+ ## Approval Workflows (MCP elicitation)
544
+
545
+ `:write` / `:admin` tier tool calls can require human approval before they run,
546
+ using the MCP 2025-06-18 spec-native `elicitation/create` channel. Off by
547
+ default, so existing clients are unaffected.
548
+
549
+ ```ruby
550
+ # Opt tiers in (process-wide). Has teeth only when an approval gate is installed
551
+ # (the MCP transport installs one per session; see below).
552
+ Parse::Agent.require_approval_for = [:write, :admin]
553
+ ```
554
+
555
+ The approval gate is a pluggable `agent.approval_gate` consulted inside
556
+ `Parse::Agent#execute` — so it is reachable on the non-MCP path and
557
+ unit-testable with a fake approver. `Parse::Agent::MCPElicitationGate` is the
558
+ spec-native implementation; `Parse::Agent::NullGate` (the default) approves.
559
+
560
+ Round-trip over the streaming transport:
561
+
562
+ 1. A `tools/call` for a gated tier pauses before execution. The server builds an
563
+ `elicitation/create` request whose payload carries the **approval preview**
564
+ (for `call_method` the *effective* tier is resolved from the target
565
+ `agent_method`'s declared permission, so write/admin methods invoked through
566
+ the readonly `call_method` tool are gated correctly). The preview is a real
567
+ before/after only for methods that declare `supports_dry_run`; for the
568
+ built-in `update_object` / `delete_object` it is the proposed `{ tool, args }`
569
+ call, **not** a fetched before/after of the target row.
570
+ 2. The request is pushed to the client over the open **GET listening stream**
571
+ (the same bus as resource subscriptions).
572
+ 3. The client replies with a JSON-RPC response (`{ result: { action: "accept" |
573
+ "decline" | "cancel" } }`) as a separate POST. The server routes it,
574
+ session-bound, into a pending registry that wakes the blocked tool thread.
575
+ 4. `accept` → the tool runs. Anything else → a structured refusal; the tool
576
+ never executes.
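
Steps 3 and 4 describe a session-bound pending registry that wakes a blocked thread. A minimal sketch of that idea follows; `PendingApprovals` and its method names are assumptions for illustration and do not reflect the gem's internal classes.

```ruby
# Hypothetical registry: a tool thread blocks in #await until the client's
# reply arrives via #resolve, or until the timeout elapses.
class PendingApprovals
  Entry = Struct.new(:session_id, :action)

  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @pending = {} # request_id => Entry
  end

  def register(request_id, session_id)
    @mutex.synchronize { @pending[request_id] = Entry.new(session_id, nil) }
  end

  # Called when the reply POST arrives. Returns false for an unknown request
  # or a reply from a different session, so one session cannot answer another's.
  def resolve(request_id, session_id, action)
    @mutex.synchronize do
      entry = @pending[request_id]
      return false unless entry && entry.session_id == session_id

      entry.action = action
      @cond.broadcast
      true
    end
  end

  # Blocks the calling thread. Only "accept" approves; decline, cancel, and
  # timeout all resolve to :refused (fail closed).
  def await(request_id, timeout:)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @mutex.synchronize do
      entry = @pending.fetch(request_id)
      while entry.action.nil?
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        break if remaining <= 0

        @cond.wait(@mutex, remaining)
      end
      @pending.delete(request_id)
      entry.action == "accept" ? :approved : :refused
    end
  end
end
```

The loop around `@cond.wait` guards against spurious wake-ups and against replies addressed to other pending requests sharing the same condition variable.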
577
+
578
+ Client capability + transport requirements (the server READS, does not
579
+ advertise, the client's `elicitation` capability at `initialize`):
580
+
581
+ ```ruby
582
+ Parse::Agent::MCPRackApp.new(
583
+ streaming: true,
584
+ resource_subscriptions: true, # or notifications: true — either opens the GET bus
585
+ approval_timeout: 300, # seconds to wait for a human; default 300
586
+ agent_factory: ->(env) { ... },
587
+ )
588
+ ```
589
+
590
+ **Three prerequisites — miss any one and every gated write fails closed,
591
+ which looks like a bug rather than a config gap:**
592
+
593
+ 1. **`streaming: true`** on the `MCPRackApp` (it defaults to `false`). Approval
594
+ needs a server→client request, which only the streaming transport can send.
595
+ 2. **An open GET bus** — `notifications: true` *or* `resource_subscriptions:
596
+ true`. `notifications: true` is the lighter choice if you don't need
597
+ LiveQuery resource subscriptions. Without a bus there is no channel to
598
+ deliver `elicitation/create`.
599
+ 3. **A concurrent server (Puma), not the bundled `MCPServer`.** The bundled
600
+ server runs on WEBrick and is non-streaming, so approval can never round-trip
601
+ there — mount `Parse::Agent.rack_app` under Puma for any deployment that uses
602
+ approval.
603
+
604
+ Operator aid: a write/admin agent served over MCP with `require_approval_for`
605
+ empty emits a one-time `[Parse::Agent:SECURITY]` warning (writes run ungated).
606
+ Approval round-trips also emit a `parse.agent.approval` `ActiveSupport::Notifications`
607
+ event carrying `outcome`, `reason`, and the measured wait — subscribe to it to
608
+ spot a non-answering client holding a dispatcher thread for the full
609
+ `approval_timeout` (default 300s).
610
+
611
+ **Fails closed.** When approval is required but the client did not advertise the
612
+ `elicitation` capability, no listening stream is open, the transport is
613
+ non-streaming (WEBrick), or the approver times out, the destructive operation is
614
+ **refused** — never blocked forever, never silently executed. Replies are bound
615
+ to the answering session's `Mcp-Session-Id`, so one session cannot answer (or
616
+ guess the id of) another's prompt.
617
+
618
+ ---
619
+
620
+ ## Server-initiated Notifications (general purpose)
621
+
622
+ The GET listening-stream bus also backs arbitrary server→client notifications,
623
+ without requiring LiveQuery resource subscriptions:
624
+
625
+ ```ruby
626
+ app = Parse::Agent::MCPRackApp.new(streaming: true, notifications: true,
627
+ agent_factory: ->(env) { ... })
628
+
629
+ # From application code that holds the app reference:
630
+ app.notify("the-session-id", method: "notifications/custom", params: { foo: 1 })
631
+ ```
632
+
633
+ `notifications: true` builds the listening-stream manager in a `supported:
634
+ false` posture: the GET stream and `#notify` work, but `resources.subscribe`
635
+ stays unadvertised and `resources/subscribe` POSTs fail closed. `#notify` builds
636
+ a JSON-RPC **notification** (never an `id` — that distinguishes it from the
637
+ server-initiated *request* used by elicitation) and returns `false` when no
638
+ stream is attached for the session. `app.subscription_manager` is exposed for an
639
+ out-of-band / clustered publisher that needs the lower-level `publish` seam.
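
The distinction between a notification and a server-initiated request is just the presence of `id`. The sketch below illustrates it with a hypothetical in-memory `StreamBus`; the real `#notify` lives on the Rack app, and the SSE `data:` framing shown is the standard event-stream format rather than something this guide specifies.

```ruby
require "json"

# A JSON-RPC notification never carries an id.
def jsonrpc_notification(method, params = {})
  { jsonrpc: "2.0", method: method, params: params }
end

# A JSON-RPC request (as used by elicitation) carries an id to correlate the reply.
def jsonrpc_request(id, method, params = {})
  { jsonrpc: "2.0", id: id, method: method, params: params }
end

# Hypothetical stand-in for the listening-stream bus: session id => writable IO.
class StreamBus
  def initialize
    @streams = {}
  end

  def attach(session_id, io)
    @streams[session_id] = io
  end

  # Writes one SSE event and returns true, or returns false when no stream
  # is attached for the session.
  def notify(session_id, method:, params: {})
    io = @streams[session_id]
    return false unless io

    io << "data: #{JSON.generate(jsonrpc_notification(method, params))}\n\n"
    true
  end
end
```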
640
+
641
+ ---
642
+
643
+ ## Built-in Agent Hardening & Telemetry
644
+
645
+ 5.2 adds several agent-side controls, all configured on `Parse::Agent`:
646
+
647
+ - **Impersonation** — `Parse::Agent.new(impersonate_user: <id|Pointer|User>,
648
+ impersonate_mint: false, impersonation_label:)` (or `agent.impersonate(user)`
649
+ / `agent.stop_impersonating!`) resolves a real session token for a `_User`
650
+ (reusing an active `_Session`, or minting a restricted one with
651
+ `impersonate_mint: true`) and binds it as if `session_token:` had been passed.
652
+ Master-key client required; fails closed if no session resolves. An
653
+ `impersonation_label:` (also usable with `acl_role:`) is emitted on the
654
+ `parse.agent.tool_call` payload alongside `impersonated_user_id`.
655
+ - **Prompt hardening** (`Parse::Agent::PromptHardening`) — schema descriptions
656
+ surfaced by `get_schema` / `get_all_schemas` are sanitized (non-identifier
657
+ field names dropped with a `[Parse::Agent:PROMPT]` warning, control/zero-width
658
+ chars stripped, capped, marker-wrapped); untrusted tool content has embedded
659
+ wrapper markers neutralized (`Parse::Agent.prompt_marker_strict = true` to
660
+ refuse instead). Operator canary phrases via
661
+ `Parse::Agent.prompt_injection_canaries = ["IGNORE PREVIOUS", /system:/i]`
662
+ emit `parse.agent.prompt_injection_detected`; set
663
+ `Parse::Agent.canary_action = :refuse` to raise on a hit.
664
+ `Parse::Agent::PROMPT_VERSION` is surfaced via
665
+ `agent.describe[:prompt][:version]`. A one-time warning fires when
666
+ `allowed_llm_endpoints` is left unrestricted (nil).
667
+ - **Embedding-cost telemetry** — embedding calls made inside a tool span add
668
+ `embed_calls`, `embed_tokens`, and (when
669
+ `Parse::Agent.embed_cost_per_million_tokens` is set) `embed_cost_usd` to the
670
+ `parse.agent.tool_call` payload. The per-tool span does **not** cover
671
+ corpus/ingestion embeds fired at `Model.save` time (typically the dominant
672
+ spend) — wrap those in `Parse::Agent.measure_embeddings { … }`, which returns
673
+ `{ calls:, tokens:, cost_usd: }` for the work done on the calling thread:
674
+
675
+ ```ruby
676
+ stats = Parse::Agent.measure_embeddings do
677
+ KnowledgeArticle.save_all(batch) # embed-on-save
678
+ end
679
+ stats # => { calls: 1200, tokens: 4_300_000, cost_usd: 0.43 }
680
+ ```
681
+
682
+ Thread-local: embeds fanned out to other threads/fibers are not captured —
683
+ measure inside each worker. `Parse::Agent.embed_cost_usd(tokens)` converts a
684
+ token count to USD using the configured rate (nil when unset).
685
+ - **Provenance** — `Parse::Agent.include_source_provenance = true` (default
686
+ false) stamps each read-tool row with `_source = { class, tool, object_id }`,
687
+ applied after field-allowlist projection and redaction.
688
+ - **`semantic_search` tool** — registered readonly + `client_safe`; opt a model
689
+ in with `agent_searchable field:, filter_fields:`. See the
690
+ [Atlas Vector Search Guide](./atlas_vector_search_guide.md#retrieval-rag).
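
To make the embedding-cost bullet concrete, here is a sketch of how a fiber-local meter of this kind can work. `EmbedMeter` is a hypothetical stand-in, not the gem's code, and the rate of 0.10 USD per million tokens is an assumed figure chosen because it reproduces the numbers in the example above.

```ruby
# Hypothetical meter: counts embedding calls made inside a block on the
# calling thread/fiber and converts tokens to USD at a configured rate.
module EmbedMeter
  KEY = :embed_meter_stats

  # Called by the (hypothetical) embedding client after each call.
  def self.record(tokens)
    stats = Thread.current[KEY]
    return unless stats # no measurement active on this thread/fiber

    stats[:calls] += 1
    stats[:tokens] += tokens
  end

  def self.measure(rate_per_million: nil)
    previous = Thread.current[KEY]
    stats = { calls: 0, tokens: 0 }
    Thread.current[KEY] = stats
    yield
    cost = rate_per_million && (stats[:tokens] / 1_000_000.0 * rate_per_million).round(6)
    stats.merge(cost_usd: cost)
  ensure
    Thread.current[KEY] = previous
  end
end
```

`Thread.current[]` storage is local to the current fiber, which is why work fanned out to other threads or fibers has to be measured inside each worker.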
691
+
692
+ ### Runtime denial gates
693
+
694
+ Beyond the permission-tier and env-gate checks, several gates refuse a tool
695
+ call at runtime based on its arguments. They fail closed; a caller sees a
696
+ structured error (the built-in tools return `{ success: false, error:,
697
+ error_code: }`, which surfaces as `isError: true` over MCP). Knowing them up
698
+ front avoids discovering each only on impact:
699
+
700
+ | Gate | When it fires | Surfaced as |
701
+ |------|---------------|-------------|
702
+ | Missing tenant scope | A searchable class has no `agent_tenant_scope` while other classes do (tenant-aware deployment) | `Parse::Agent::MissingTenantScope` (search path); a one-time `[Parse::Agent:SECURITY]` lint warning on the general query path |
703
+ | No tenant binding | A scoped class is queried by an agent whose tenant value resolves to `nil` | `Parse::Agent::AccessDenied` (`kind: :tenant`) |
704
+ | Hidden class | A tool targets an `agent_hidden` class (or one outside a per-instance `classes:` allowlist) | `Parse::Agent::AccessDenied` (`kind: :hidden_class`) / off-allowlist refusal |
705
+ | Reserved underscore key | A `filter:` / `vector_filter:` / `where:` contains an underscore-prefixed key (`_rperm`, `_p_*`, …) at any depth | `ArgumentError` / `ValidationError` (recursive refusal) |
706
+ | Filter-field allowlist | A `filter:` / `vector_filter:` names a field not in the class's `agent_searchable filter_fields:` | `ValidationError` naming the offending field(s) |
707
+ | `text_field` not embedded | `semantic_search` `text_field:` names a field that isn't a declared `embed` source | `ValidationError` listing the allowed sources |
708
+ | Tool filtered | A tool/method removed by a per-instance `tools:` / `methods:` filter is invoked | `error_code: :tool_filtered` |
709
+ | Approval denied/unavailable | A gated write/admin op is rejected or the approver is unreachable | `error_code: :approval_denied` |
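
The "at any depth" wording of the reserved-underscore gate implies a recursive walk. A minimal sketch of such a check, with the hypothetical helper `reserved_key_paths`, might look like this; the gem's actual traversal and error types may differ.

```ruby
# Return the dotted path of every underscore-prefixed key found anywhere in a
# filter, descending through nested hashes and arrays.
def reserved_key_paths(value, path = [])
  case value
  when Hash
    value.flat_map do |key, child|
      here = path + [key.to_s]
      hits = key.to_s.start_with?("_") ? [here.join(".")] : []
      hits + reserved_key_paths(child, here)
    end
  when Array
    value.each_with_index.flat_map { |child, index| reserved_key_paths(child, path + [index.to_s]) }
  else
    []
  end
end

# Refuse the whole filter when any reserved key is present (fail closed).
def assert_no_reserved_keys!(filter)
  paths = reserved_key_paths(filter)
  raise ArgumentError, "reserved keys in filter: #{paths.join(', ')}" unless paths.empty?

  filter
end
```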
710
+
711
+ ---
712
+
713
+ ## Token Economy
714
+
715
+ The MCP surface is paid for in LLM context tokens — the tool schemas sent every
716
+ session, and the data every tool returns. 5.2 adds controls to keep that cost
717
+ down.
718
+
719
+ ### Lean tool profile
720
+
721
+ A full `:readonly` `tools/list` payload is roughly **7.9K context tokens** every
722
+ session. For small-context models or token-sensitive deployments, the `:lean`
723
+ profile narrows the surface to the six core read tools (`get_all_schemas`,
724
+ `get_schema`, `query_class`, `count_objects`, `get_object`, `aggregate`) —
725
+ about **2.6K tokens, a ~67% reduction**:
726
+
727
+ ```ruby
728
+ Parse::Agent.new(permissions: :readonly, tools: :lean)
729
+ ```
730
+
731
+ A profile is an allowlist: it composes with the permission tier and can only
732
+ narrow, never elevate. Profiles are Symbol-only (`Parse::Agent::TOOL_PROFILES`);
733
+ for finer control, pass an explicit Array or `{ only:, except: }` instead. An
734
+ unknown profile raises rather than silently exposing the full surface.
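
"Can only narrow, never elevate" amounts to a set intersection with the tier's permitted tools. The sketch below shows that rule in isolation. `resolve_tools` is a hypothetical helper and the tier list passed to it is made up for the example; only the six `:lean` tool names come from this guide.

```ruby
LEAN_TOOLS = %i[get_all_schemas get_schema query_class count_objects get_object aggregate].freeze
TOOL_PROFILES = { lean: LEAN_TOOLS }.freeze

# Intersect the tier's permitted tools with a per-instance filter.
# filter may be nil, a profile Symbol, an Array, or { only:, except: }.
def resolve_tools(tier_tools, filter)
  allowed =
    case filter
    when nil then tier_tools
    when Symbol
      TOOL_PROFILES.fetch(filter) { raise ArgumentError, "unknown tool profile: #{filter.inspect}" }
    when Array then filter
    when Hash then (filter[:only] || tier_tools) - Array(filter[:except])
    else raise ArgumentError, "unsupported tools filter: #{filter.inspect}"
    end
  tier_tools & allowed # intersection: a filter can remove tools, never add them
end
```

With a read-only tier that lacks `delete_object`, `resolve_tools(tier, [:delete_object])` yields an empty list, which is the same outcome the guide describes for `tools: { only: [:delete_object] }` on a `:readonly` agent.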
735
+
736
+ ### Leaner tool responses
737
+
738
+ Read tools return rows in an LLM-friendly form (Pointers as `{_type, class,
739
+ id}`, Dates as bare ISO strings) and now **strip the raw `ACL` map** — it is
740
+ operationally useless to a model (effective authority is enforced server-side
741
+ regardless) and is pure token overhead plus a minor role/user-id disclosure.
742
+ `get_objects` and the Atlas Search tools now go through the same normalization
743
+ `query_class` always used, instead of shipping raw wire-form.
744
+
745
+ Defaults that bound response size: `query_class` `limit:` defaults to 100 (cap
746
+ 1000) with the rendered array capped at 50 (`truncated_note`); `aggregate`
747
+ auto-injects a terminal `$limit: 200`. Pass a smaller `limit:` / project fewer
748
+ fields via `keys:` when you want a tighter result.
749
+
750
+ ### `semantic_search` — deduped sources and a token budget
751
+
752
+ The `semantic_search` result hoists each chunk's parent record **once** into a
753
+ `documents` map keyed by `objectId`, instead of duplicating the full source on
754
+ every chunk — map a chunk back to its source via `metadata.object_id`:
755
+
756
+ ```jsonc
757
+ {
758
+ "chunks": [
759
+ { "id": "a#0", "score": 0.82, "content": "…", "metadata": { "object_id": "a", "chunk_index": 0 } },
760
+ { "id": "a#1", "score": 0.82, "content": "…", "metadata": { "object_id": "a", "chunk_index": 1 } }
761
+ ],
762
+ "documents": { "a": { "objectId": "a", "title": "…" } },
763
+ "count": 2
764
+ }
765
+ ```
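
On the consuming side, joining each chunk back to its hoisted document is a single lookup. The following sketch assumes the result has already been parsed into a Ruby Hash with string keys; `chunks_with_sources` is a hypothetical helper name.

```ruby
require "json"

# Pair every chunk with its parent record from the shared `documents` map.
def chunks_with_sources(result)
  documents = result.fetch("documents", {})
  result.fetch("chunks", []).map do |chunk|
    source_id = chunk.dig("metadata", "object_id")
    { "chunk_id" => chunk["id"], "content" => chunk["content"], "source" => documents[source_id] }
  end
end
```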
766
+
767
+ A `max_total_tokens` budget (default 20,000; estimated as chars/4) trims the
768
+ lowest-ranked chunks so a few long documents can't silently blow the context
769
+ window — the count caps (`k * max_chunks_per_document`) bound the chunk *count*
770
+ but not their total size. When the budget trims, the result adds
771
+ `budget_truncated: true` and `budget_dropped: <n>` so the truncation is never
772
+ silent. Pass `max_total_tokens: 0` to disable.
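
The budget rule can be sketched as a greedy pass over chunks in rank order. This is an illustration under assumptions, not the gem's algorithm: `apply_token_budget` is a hypothetical helper, chunks are modelled as Hashes with `:score` and `:content`, and the cut happens at the first chunk that would exceed the budget so that only the lowest-ranked chunks are dropped.

```ruby
# Rough token estimate used by the guide: characters divided by four.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Keep the highest-ranked chunks that fit within max_total_tokens.
# A budget of 0 disables trimming.
def apply_token_budget(chunks, max_total_tokens: 20_000)
  return { chunks: chunks } if max_total_tokens.zero?

  # Sort by score descending; the index keeps ties in their original order.
  ranked = chunks.each_with_index.sort_by { |chunk, index| [-chunk[:score], index] }.map(&:first)
  kept = []
  used = 0
  ranked.each do |chunk|
    cost = estimate_tokens(chunk[:content])
    break if used + cost > max_total_tokens

    kept << chunk
    used += cost
  end

  dropped = ranked.size - kept.size
  result = { chunks: kept }
  result.merge!(budget_truncated: true, budget_dropped: dropped) if dropped.positive?
  result
end
```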
773
+
774
+ ### Structured error metadata on the wire
775
+
776
+ A failing `tools/call` already carries `error_code`, a structured `details:`
777
+ hash (e.g. `allowed_fields`, `suggested_rewrite`), and `retry_after` — these are
778
+ now forwarded on the MCP error envelope under `_meta` (`parse.error_code`,
779
+ `parse.retry_after`, `parse.details`) so a client can branch deterministically
780
+ and honor `retry_after` instead of re-parsing the prose message. The
781
+ human-readable `content` text is unchanged.
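
A client might branch on the forwarded metadata roughly as follows. The exact envelope shape is an assumption here (a tool result Hash with `isError` and a `_meta` Hash keyed by `parse.error_code`, `parse.retry_after`, and `parse.details`), and `next_step` and its action names are hypothetical.

```ruby
# Decide what to do with a tools/call result without parsing the prose message.
def next_step(tool_result)
  return { action: :use_result } unless tool_result["isError"]

  meta = tool_result.fetch("_meta", {})
  details = meta["parse.details"] || {}

  if meta["parse.retry_after"]
    { action: :retry_later, wait_seconds: meta["parse.retry_after"] }
  elsif details["suggested_rewrite"]
    { action: :retry_with, rewrite: details["suggested_rewrite"] }
  else
    { action: :give_up, reason: meta["parse.error_code"] || "unknown" }
  end
end
```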
782
+
783
+ `get_schema` on a mistyped class name now raises a `ValidationError` carrying a
784
+ "Did you mean: …?" hint (near matches from the locally-known classes), so the
785
+ model self-corrects in one retry instead of falling back to a full
786
+ `get_all_schemas` sweep.
787
+
788
+ ---
789
+
363
790
  ## Custom Authentication
364
791
 
365
792
  The agent factory pattern gives you full control over authentication. Every request passes through the factory before any Parse operation is attempted.
@@ -739,6 +1166,10 @@ agent = Parse::Agent.new(tools: { only: [:query_class, :get_schema, :aggregate],
739
1166
 
740
1167
  # Denylist only
741
1168
  agent = Parse::Agent.new(tools: { except: [:emit_artifact] })
1169
+
1170
+ # Named profile (Symbol) — :lean narrows to the six core read tools
1171
+ # (~67% smaller tools/list). See "Token Economy" above.
1172
+ agent = Parse::Agent.new(tools: :lean)
742
1173
  ```
743
1174
 
744
1175
  **Resolution order** is strict: env-gates ▷ permission tier ▷ per-instance filter. The filter cannot elevate — `tools: { only: [:delete_object] }` on a `:readonly` agent still excludes `delete_object` because `delete_object` is not in the readonly tier's permitted set in the first place.
@@ -1667,6 +2098,8 @@ Known `details[:kind]` subcodes for `:access_denied`:
1667
2098
 
1668
2099
  The top-level `error_code` stays at `:access_denied` for back-compat with consumers that only branch on it. The new subcode is purely additive — clients that ignore `details:` see no change in behavior.
1669
2100
 
2101
+ **On the wire (5.2+):** `error_code`, `retry_after`, and `details` are forwarded on the MCP tool-error envelope under `_meta` — `parse.error_code`, `parse.retry_after`, `parse.details` — so a spec-compliant client can branch deterministically (and honor `retry_after`) without parsing the prose `content` text. The `content` text and `isError: true` are unchanged.
2102
+
1670
2103
  ---
1671
2104
 
1672
2105
  ## Performance and Timeouts
@@ -757,6 +757,15 @@ non-master scope so this enforcement always runs. See the
757
757
  [Atlas Vector Search Guide](./atlas_vector_search_guide.md) for the
758
758
  full API surface.
759
759
 
760
+ `Parse::Retrieval.retrieve` and the `semantic_search` agent tool (v5.2)
761
+ build directly on `find_similar`, so they inherit this exact Layer 1-5
762
+ mongo-direct enforcement. The earlier RAG plan's "two-stage" REST
763
+ re-query was intentionally NOT adopted — there is no REST vector path,
764
+ and `acl_user:` / `acl_role:` scopes have no REST equivalent, so the
765
+ post-`$vectorSearch` `_rperm` `$match` is the single enforcement
766
+ boundary. The retrieval layer adds a tenant-scope fold into the Atlas
767
+ pre-filter on top of this, never a substitute for it.
768
+
760
769
  ### Timeouts
761
770
 
762
771
  ```ruby
@@ -947,7 +956,7 @@ three independent flags that every mutation re-checks per call.
947
956
  Requires `clusterMonitor` privilege on the reader; returns `{}`
948
957
  when not granted so callers degrade gracefully.
949
958
 
950
- ### Model DSL: `mongo_index` / `mongo_geo_index` / `mongo_relation_index`
959
+ ### Model DSL: `mongo_index` / `unique_index_on` / `mongo_geo_index` / `mongo_relation_index`
951
960
 
952
961
  Index declarations are class-level metadata on `Parse::Object`
953
962
  subclasses. They run validation at registration time so a typo,
@@ -967,6 +976,7 @@ class Car < Parse::Object
967
976
 
968
977
  mongo_index :make, :model, :year # compound
969
978
  mongo_index :vin, unique: true
979
+ unique_index_on :registration # dedup floor; unique { registration: 1 }
970
980
  mongo_index :owner # pointer auto-rewrites to _p_owner
971
981
  mongo_geo_index :location # 2dsphere on GeoJSON Point
972
982
  mongo_index :tags # array field
@@ -1007,6 +1017,61 @@ class Author < Parse::Object
1007
1017
  end
1008
1018
  ```
1009
1019
 
1020
+ ### `unique_index_on` — the `first_or_create!` correctness floor
1021
+
1022
+ `unique_index_on(*fields, sparse: false, partial: nil, name: nil)` declares
1023
+ a unique index on the exact dedup tuple that `first_or_create!` and
1024
+ `create_or_update!` key on. It is thin sugar over
1025
+ `mongo_index(*fields, unique: true, …)` — same registration, same validation
1026
+ (sensitive-field guard, pointer auto-rewrite, parallel-array / relation /
1027
+ `_id` rejection), same `apply_indexes!` writer path — but the name states the
1028
+ intent: these fields are the create-or-update identity.
1029
+
1030
+ ```ruby
1031
+ class Subscription < Parse::Object
1032
+ property :email, :string
1033
+ belongs_to :tenant, as: :user
1034
+
1035
+ unique_index_on :email, :tenant # key: { email: 1, _p_tenant: 1 } unique
1036
+ end
1037
+
1038
+ Subscription.apply_indexes! # provisions the index via the writer gate
1039
+ ```
+
+ **Why it matters.** The Redis-backed `synchronize:` lock on `first_or_create!`
+ is a *latency optimization*: in the common path it collapses concurrent
+ callers so only one issues the create. The unique index is the *correctness
+ floor* that survives the lock being bypassed: a Redis outage, a TTL expiring
+ between the existence check and the write, a caller passing
+ `synchronize: false`, or two app servers whose lock secrets disagree. When a
+ race slips past the lock, the loser's insert fails with `DuplicateValue`
+ (Parse error 137), which `first_or_create!` rescues and resolves to the
+ winning row. Together, lock and index make the net invariant (*exactly one
+ row, every caller sees the same id*) hold under any race, not just the happy
+ path.
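The rescue-and-resolve step can be illustrated without a server. The following is a minimal sketch, assuming an in-memory `ToyStore` whose `insert` raises on a duplicate key the way a unique index would; `ToyStore`, `toy_first_or_create`, and the `before_insert` hook are hypothetical names, and the local `DuplicateValue` class merely stands in for Parse error 137.

```ruby
# Stand-in for the duplicate-key failure (Parse error 137).
class DuplicateValue < StandardError; end

# In-memory "collection" with a unique constraint on email.
class ToyStore
  def initialize
    @rows = {}
    @next_id = 0
  end

  def find(email)
    @rows[email]
  end

  def insert(email)
    raise DuplicateValue, "duplicate #{email}" if @rows.key?(email)
    @rows[email] = { id: (@next_id += 1), email: email }
  end

  def count
    @rows.size
  end
end

# Check, then insert; on a duplicate, re-run the same lookup to find the winner.
def toy_first_or_create(store, email, before_insert: nil)
  existing = store.find(email)
  return existing if existing
  before_insert&.call # hook used below to let a rival writer win the race
  store.insert(email)
rescue DuplicateValue
  store.find(email)
end

store = ToyStore.new
winner = nil
# Simulate the race: a rival inserts between our existence check and our write.
loser = toy_first_or_create(store, "a@example.com",
                            before_insert: -> { winner = store.insert("a@example.com") })
puts "rows: #{store.count}, same id: #{loser[:id] == winner[:id]}"
```

The recovery lookup succeeds only because it queries on the same key the constraint enforces, which is the reason the index key and the recovery query are kept identical.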
+
+ **Defaults are non-sparse, on purpose.** The index key is kept identical to
+ the query `first_or_create!` re-runs on recovery (`_scoped_first` on the same
+ `query_attrs`), so a 137 always corresponds to a row the recovery query can
+ find. A sparse or partial index that fires on a condition the recovery query
+ doesn't reproduce would surface a 137 the rescue can't resolve, and the error
+ would re-raise. `sparse:` only changes behavior for a document missing *every*
+ field in the tuple, because a compound sparse index indexes a doc when it has
+ at least one key. `first_or_create!` always writes the full tuple, so sparse
+ never weakens the floor. Leave it off unless out-of-band writers create
+ tuple-less rows you want excluded.
+
+ For "unique within a subset" (unique email per tenant, but rows with no
+ tenant may repeat) a partial filter is the right tool, **not** `sparse:`,
+ because a compound sparse index still collides two rows that share the
+ present fields. You own the filter's lifecycle and must keep the recovery
+ query consistent with it:
+
+ ```ruby
+ # Unique email per tenant; tenant-less rows are not constrained.
+ unique_index_on :email, :tenant,
+                 partial: { "_p_tenant" => { "$exists" => true } }
+ ```
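To see why sparse and partial behave differently on a compound key, here is a small simulation of index membership. It is a sketch under assumptions: `index_entry` and `collide?` are hypothetical helpers that model the rules described above (a compound sparse index skips a document only when every indexed field is absent; a partial index skips a document that fails the filter), and the filter is expressed as a Ruby lambda rather than a MongoDB expression.

```ruby
# Returns the index entry for a doc, or nil when the doc is left out of the index.
def index_entry(doc, fields, sparse: false, partial: nil)
  return nil if partial && !partial.call(doc)
  return nil if sparse && fields.none? { |f| doc.key?(f) }
  fields.map { |f| doc[f] } # absent fields are indexed as nil (null)
end

# Two docs violate a unique index when both are indexed with equal entries.
def collide?(a, b, fields, **opts)
  entry_a = index_entry(a, fields, **opts)
  entry_b = index_entry(b, fields, **opts)
  !entry_a.nil? && entry_a == entry_b
end

fields = [:email, :_p_tenant]
has_tenant = ->(doc) { doc.key?(:_p_tenant) } # models { "$exists" => true }

# Two tenant-less rows sharing an email.
row1 = { email: "a@example.com" }
row2 = { email: "a@example.com" }

sparse_collides  = collide?(row1, row2, fields, sparse: true)
partial_collides = collide?(row1, row2, fields, partial: has_tenant)

# Two rows sharing both email and tenant must still collide under the filter.
t1 = { email: "a@example.com", _p_tenant: "_User$abc" }
t2 = { email: "a@example.com", _p_tenant: "_User$abc" }
tenant_collides = collide?(t1, t2, fields, partial: has_tenant)

puts "sparse: #{sparse_collides}, partial: #{partial_collides}, with tenant: #{tenant_collides}"
```

In this model the sparse index still rejects the second tenant-less row, while the partial filter admits it and continues to enforce uniqueness once a tenant is present.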
+
  ### Migrator: `indexes_plan` (dry-run) / `apply_indexes!` (mutate)
   
  `Parse::Schema::IndexMigrator` reconciles declared indexes against the
@@ -66,6 +66,7 @@ paying for an index nobody uses.
  | Regular B-tree | `mongo_index :field` | Equality, range, sort on a scalar field |
  | Compound | `mongo_index :a, :b, :c` | Multi-field queries with a common prefix |
  | Unique | `mongo_index :field, unique: true` | Enforce uniqueness at the DB layer |
+ | Unique dedup floor | `unique_index_on :a, :b` | Name the `first_or_create!` / `create_or_update!` dedup tuple; sugar for `unique: true` (non-sparse) |
  | Sparse | `mongo_index :field, sparse: true` | Field present on only some documents |
  | Partial | `mongo_index :field, partial: { … }` | Index only documents matching a filter |
  | TTL | `mongo_index :field, expire_after: N` | Auto-delete documents N seconds after the timestamp |
@@ -318,7 +319,27 @@ fails.
   
  **Better:** `unique: true, sparse: true` for "unique when present".
  This is exactly what `parse_reference` auto-registers, and it's the
- right pattern for any optional uniqueness constraint.
+ right pattern for any optional uniqueness constraint *on a single
+ field*.
+
+ **Sparse does NOT generalize to compound keys.** A compound sparse
+ index excludes a document only when it is missing *every* indexed
+ field; a document that has at least one key is still indexed. So for a
+ two-field tuple, two rows that share the present field and both omit
+ the other still collide under `sparse: true`. For "unique within a
+ subset" (e.g. unique `email` per `tenant`, but tenant-less rows may
+ repeat) use a **partial filter**, not sparse:
333
+ ```ruby
334
+ unique_index_on :email, :tenant,
335
+ partial: { "_p_tenant" => { "$exists" => true } }
336
+ ```
337
+
338
+ For the `first_or_create!` / `create_or_update!` dedup tuple, prefer
339
+ `unique_index_on` (sugar for `unique: true`, **non-sparse** by default
340
+ so the index key matches the query the upsert re-runs on recovery). It
341
+ is the durable correctness floor behind the synchronize-create lock —
342
+ see the MongoDB Direct guide for the full rationale.
322
343
 
323
344
  ### Geo without proper coordinate order
324
345
 
data/docs/usage_guide.md CHANGED
@@ -133,6 +133,21 @@ song = Song.first_or_create!({ title: "My Song" }, { artist: "Unknown" })
  song = Song.create_or_update!({ title: "My Song" }, { plays: 100 })
  ```
   
+ Under concurrency these have a TOCTOU (time-of-check to time-of-use) window:
+ two callers can both miss on the find and both create. Pass `synchronize: true` to
+ serialize the find→create→save through a Moneta-backed lock, and declare a
+ `unique_index_on` on the dedup tuple as the durable correctness floor. The lock
+ optimizes latency; the unique index guarantees a single row even if the lock is
+ bypassed:
+
+ ```ruby
+ class Song < Parse::Object
+   property :title, :string
+   unique_index_on :title # provisioned via Song.apply_indexes!
+ end
+
+ Song.first_or_create!({ title: "My Song" }, { artist: "Unknown" }, synchronize: true)
+ ```
+
  ## ACLs (Access Control)
   
  ```ruby
Binary file