parse-stack-next 5.1.1 → 5.2.0
- checksums.yaml +4 -4
- data/.env.sample +12 -0
- data/.env.test +4 -4
- data/CHANGELOG.md +545 -0
- data/Gemfile +3 -0
- data/Gemfile.lock +6 -1
- data/README.md +167 -38
- data/Rakefile +56 -10
- data/docs/atlas_vector_search_guide.md +110 -9
- data/docs/mcp_guide.md +433 -0
- data/docs/mongodb_direct_guide.md +66 -1
- data/docs/mongodb_index_optimization_guide.md +22 -1
- data/docs/usage_guide.md +15 -0
- data/lib/parse/agent/approval_gate.rb +0 -0
- data/lib/parse/agent/constraint_translator.rb +90 -19
- data/lib/parse/agent/describe.rb +1 -0
- data/lib/parse/agent/errors.rb +16 -0
- data/lib/parse/agent/mcp_client.rb +9 -0
- data/lib/parse/agent/mcp_dispatcher.rb +139 -7
- data/lib/parse/agent/mcp_rack_app.rb +621 -17
- data/lib/parse/agent/mcp_subscriptions.rb +607 -0
- data/lib/parse/agent/metadata_dsl.rb +58 -0
- data/lib/parse/agent/metadata_registry.rb +141 -1
- data/lib/parse/agent/prompt_hardening.rb +213 -0
- data/lib/parse/agent/result_formatter.rb +18 -3
- data/lib/parse/agent/tools.rb +167 -24
- data/lib/parse/agent.rb +692 -21
- data/lib/parse/client/request.rb +55 -4
- data/lib/parse/client/response.rb +4 -0
- data/lib/parse/client.rb +205 -7
- data/lib/parse/model/classes/installation.rb +27 -10
- data/lib/parse/model/classes/user.rb +8 -0
- data/lib/parse/model/core/actions.rb +58 -4
- data/lib/parse/model/core/embed_managed.rb +19 -14
- data/lib/parse/model/core/indexing.rb +108 -16
- data/lib/parse/model/core/querying.rb +29 -0
- data/lib/parse/model/model.rb +34 -3
- data/lib/parse/model/object.rb +1 -0
- data/lib/parse/query.rb +90 -24
- data/lib/parse/retrieval/agent_tool.rb +369 -0
- data/lib/parse/retrieval/chunk.rb +74 -0
- data/lib/parse/retrieval/chunker.rb +208 -0
- data/lib/parse/retrieval/retriever.rb +274 -0
- data/lib/parse/retrieval.rb +10 -0
- data/lib/parse/schema.rb +69 -20
- data/lib/parse/stack/version.rb +2 -2
- data/parse-stack-next.gemspec +1 -1
- data/scripts/docker/docker-compose.atlas.yml +14 -10
- data/scripts/docker/docker-compose.test.yml +24 -20
- data/scripts/docker/mongo-init.js +3 -3
- data/scripts/start-parse.sh +10 -0
- data/scripts/start_mcp_server.rb +1 -1
- data/scripts/test_server_connection.rb +1 -1
- data/scripts/vector_prototype/create_vector_index.js +1 -1
- data/scripts/vector_prototype/fetch_embeddings.py +2 -2
- data/scripts/vector_prototype/query_prototype.rb +1 -1
- data/scripts/vector_prototype/run.sh +4 -4
- metadata +10 -2
data/docs/mcp_guide.md
CHANGED
|
@@ -360,6 +360,433 @@ Common uses for the direct dispatcher:
|
|
|
360
360
|
|
|
361
361
|
---
|
|
362
362
|
|
|
363
|
+
## Connecting Claude Desktop (stdio bridge)
|
|
364
|
+
|
|
365
|
+
Parse Stack speaks MCP over **HTTP** (the standalone server and the
|
|
366
|
+
Rack adapter both expose a JSON-RPC-over-HTTP endpoint). Claude Desktop,
|
|
367
|
+
however, launches MCP servers as local **stdio** subprocesses — it does
|
|
368
|
+
not dial an HTTP URL directly. Bridge the two with
|
|
369
|
+
[`mcp-remote`](https://www.npmjs.com/package/mcp-remote), a small stdio↔HTTP
|
|
370
|
+
proxy that Claude Desktop runs as the subprocess and which forwards to your
|
|
371
|
+
HTTP endpoint.
|
|
372
|
+
|
|
373
|
+
1. Start the Parse Stack MCP endpoint over HTTP (standalone or Rack — see
|
|
374
|
+
Deployment Modes above) and note its URL and the bearer token your
|
|
375
|
+
`agent_factory` expects, e.g. `http://localhost:3001/` with
|
|
376
|
+
`Authorization: Bearer <token>`.
|
|
377
|
+
|
|
378
|
+
2. Add the bridge to `claude_desktop_config.json` (macOS:
|
|
379
|
+
`~/Library/Application Support/Claude/claude_desktop_config.json`;
|
|
380
|
+
Windows: `%APPDATA%\Claude\claude_desktop_config.json`):
|
|
381
|
+
|
|
382
|
+
```json
|
|
383
|
+
{
|
|
384
|
+
"mcpServers": {
|
|
385
|
+
"parse-stack": {
|
|
386
|
+
"command": "npx",
|
|
387
|
+
"args": [
|
|
388
|
+
"-y",
|
|
389
|
+
"mcp-remote",
|
|
390
|
+
"http://localhost:3001/",
|
|
391
|
+
"--header",
|
|
392
|
+
"Authorization: Bearer ${PARSE_MCP_TOKEN}"
|
|
393
|
+
],
|
|
394
|
+
"env": {
|
|
395
|
+
"PARSE_MCP_TOKEN": "your-mcp-token"
|
|
396
|
+
}
|
|
397
|
+
}
|
|
398
|
+
}
|
|
399
|
+
}
|
|
400
|
+
```
|
|
401
|
+
|
|
402
|
+
3. Restart Claude Desktop. The Parse Stack tools (`query_class`,
|
|
403
|
+
`get_schema`, `semantic_search`, …) appear in the client.
|
|
404
|
+
|
|
405
|
+
Notes:
|
|
406
|
+
|
|
407
|
+
- `mcp-remote` requires Node.js on the machine running Claude Desktop.
|
|
408
|
+
- For a public endpoint, terminate TLS in front of the HTTP server and use
|
|
409
|
+
an `https://` URL; the bearer token rides the `Authorization` header.
|
|
410
|
+
- The same bridge works for any stdio-only MCP client (e.g. some IDE
|
|
411
|
+
integrations). Clients that support remote MCP connectors natively can
|
|
412
|
+
point at the HTTP URL without the bridge.
|
|
413
|
+
- Approval workflows (elicitation) need the streaming/listening-stream
|
|
414
|
+
prerequisites described under Approval Workflows — confirm the bridge and
|
|
415
|
+
client forward the SSE channel before relying on human-in-the-loop gating.
|
|
416
|
+
|
|
417
|
+
---
|
|
418
|
+
|
|
419
|
+
## Resource Subscriptions (LiveQuery bridge)
|
|
420
|
+
|
|
421
|
+
MCP lets a client `resources/subscribe` to a resource URI and then receive
|
|
422
|
+
unsolicited `notifications/resources/updated` messages whenever the underlying
|
|
423
|
+
data changes. Parse Stack bridges that surface onto Parse LiveQuery: a
|
|
424
|
+
subscribed `parse://<Class>/count` or `parse://<Class>/samples` resource is
|
|
425
|
+
backed by a LiveQuery subscription on `<Class>`, and any matching
|
|
426
|
+
create/update/delete/enter/leave event is debounced into a single coarse
|
|
427
|
+
update for that URI. The client re-reads the resource via `resources/read` to
|
|
428
|
+
obtain the new value — row payloads are never streamed through the resource
|
|
429
|
+
surface.
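
The debouncing described above can be pictured with a small coalescer. This is a minimal sketch, not the gem's implementation: the `UpdateCoalescer` class, its `window:` parameter, and the explicit timestamps are all assumptions made so the example runs deterministically without LiveQuery.

```ruby
# Hypothetical coalescer: collapses a burst of events into one update per URI.
# Timestamps are passed in explicitly so the example is deterministic.
class UpdateCoalescer
  def initialize(window:)
    @window = window      # seconds of quiet required before a burst is flushed
    @last_event_at = {}   # uri => timestamp of the most recent event
  end

  # Record a LiveQuery-style event (create/update/delete/...) for a resource URI.
  def record(uri, at:)
    @last_event_at[uri] = at
  end

  # Return one notification per URI whose burst has been quiet for at least
  # `window` seconds, and forget those URIs so each burst yields one update.
  def flush(now:)
    ready = @last_event_at.select { |_uri, at| now - at >= @window }.keys
    ready.each { |uri| @last_event_at.delete(uri) }
    ready.map { |uri| { method: "notifications/resources/updated", params: { uri: uri } } }
  end
end

coalescer = UpdateCoalescer.new(window: 0.5)
coalescer.record("parse://Post/count", at: 0.0)  # create
coalescer.record("parse://Post/count", at: 0.1)  # update
coalescer.record("parse://Post/count", at: 0.2)  # delete
early = coalescer.flush(now: 0.4)  # still inside the quiet window
late  = coalescer.flush(now: 0.8)  # burst has settled
```

Three events produce a single notification that carries only the URI, which matches the "re-read via `resources/read`" model: no row payload travels with the update.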
|
|
430
|
+
|
|
431
|
+
This is opt-in and requires a streaming-capable Rack server (Puma, Falcon —
|
|
432
|
+
WEBrick buffers responses and cannot hold the listening stream open) plus
|
|
433
|
+
LiveQuery enabled and configured.
|
|
434
|
+
|
|
435
|
+
```ruby
|
|
436
|
+
# Boot: enable LiveQuery and point it at the server.
|
|
437
|
+
Parse.setup(
|
|
438
|
+
server_url: "https://your-parse-server.com/parse",
|
|
439
|
+
application_id: "your_app_id",
|
|
440
|
+
api_key: "your_api_key",
|
|
441
|
+
live_query_url: "wss://your-parse-server.com",
|
|
442
|
+
)
|
|
443
|
+
Parse.live_query_enabled = true
|
|
444
|
+
|
|
445
|
+
# Mount the Rack app with resource subscriptions enabled.
|
|
446
|
+
app = Parse::Agent::MCPRackApp.new(resource_subscriptions: true) do |env|
|
|
447
|
+
token = env["HTTP_AUTHORIZATION"].to_s.delete_prefix("Bearer ")
|
|
448
|
+
MyAuth.agent_for_token!(token) # returns a Parse::Agent or raises Unauthorized
|
|
449
|
+
end
|
|
450
|
+
```
|
|
451
|
+
|
|
452
|
+
When enabled and LiveQuery is available, the `initialize` handshake advertises
|
|
453
|
+
`resources.subscribe: true`. When LiveQuery is not enabled/available — or on
|
|
454
|
+
the WEBrick `MCPServer`, which cannot stream — the capability stays
|
|
455
|
+
`subscribe: false` and `resources/subscribe` returns a "not supported" error.
|
|
456
|
+
The capability is a contract: it is never advertised unless the server can
|
|
457
|
+
actually deliver updates.
|
|
458
|
+
|
|
459
|
+
### Protocol flow
|
|
460
|
+
|
|
461
|
+
1. **`initialize`** — the response carries a server-issued `Mcp-Session-Id`
|
|
462
|
+
header. The client echoes it on every subsequent request.
|
|
463
|
+
2. **`GET` listening stream** — the client opens a long-lived `GET` to the same
|
|
464
|
+
endpoint with `Accept: text/event-stream` and the `Mcp-Session-Id` header.
|
|
465
|
+
This is the server→client channel; it stays open and emits
|
|
466
|
+
`notifications/resources/updated` events until the client disconnects.
|
|
467
|
+
3. **`resources/subscribe`** — a normal `POST` with
|
|
468
|
+
`{ "uri": "parse://Post/count" }`. Returns an empty result; updates begin
|
|
469
|
+
flowing on the listening stream.
|
|
470
|
+
4. **`resources/unsubscribe`** — stops one subscription. `DELETE` with the
|
|
471
|
+
session id tears the whole session down.
|
|
472
|
+
|
|
473
|
+
Only `count` and `samples` resources are subscribable. `schema` is rejected
|
|
474
|
+
with an invalid-params error because schema changes are not LiveQuery events.
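
As a rough illustration of steps 3 and 4, the request bodies can be built with nothing but the standard library. The helper names (`jsonrpc_request`, `subscribable_uri?`) and the numeric ids are hypothetical, and the URI check is only a client-side convenience that mirrors the rule stated above; the server remains the authority.

```ruby
require "json"

SUBSCRIBABLE_KINDS = %w[count samples].freeze  # per the guide; schema is rejected

# Build a JSON-RPC 2.0 request body. The id scheme is an illustrative choice.
def jsonrpc_request(id, method, params)
  JSON.generate({ "jsonrpc" => "2.0", "id" => id, "method" => method, "params" => params })
end

# True when the URI has the parse://<Class>/<kind> shape with a subscribable kind.
def subscribable_uri?(uri)
  match = %r{\Aparse://([A-Za-z_][A-Za-z0-9_]*)/([a-z]+)\z}.match(uri)
  !match.nil? && SUBSCRIBABLE_KINDS.include?(match[2])
end

subscribe_body   = jsonrpc_request(2, "resources/subscribe",   { "uri" => "parse://Post/count" })
unsubscribe_body = jsonrpc_request(3, "resources/unsubscribe", { "uri" => "parse://Post/count" })
puts subscribe_body
```

Each body would be sent as a `POST` carrying the `Mcp-Session-Id` header returned by `initialize`; the HTTP client is left out here so the sketch stays self-contained.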
|
|
475
|
+
|
|
476
|
+
### Access control (important)
|
|
477
|
+
|
|
478
|
+
The bridge enforces the same scope rules as the rest of the SDK. LiveQuery
|
|
479
|
+
filters events server-side using the credential on the subscribe frame, so the
|
|
480
|
+
subscription's credentials are derived from the subscribing agent:
|
|
481
|
+
|
|
482
|
+
| Agent scope | LiveQuery credential | Events seen |
|
|
483
|
+
|-------------|----------------------|-------------|
|
|
484
|
+
| session-token agent | that session token | only rows the user can read (ACL/CLP enforced by Parse Server) |
|
|
485
|
+
| master-key agent | master key | every event |
|
|
486
|
+
| `acl_user:` / `acl_role:` agent | **refused** | none — see below |
|
|
487
|
+
|
|
488
|
+
`acl_user:` / `acl_role:` agents are an SDK-side, mongo-direct-only construct
|
|
489
|
+
with no Parse Server REST or LiveQuery equivalent (Parse Server has no
|
|
490
|
+
"act as this user pointer / role" handshake). Bridging them would force a
|
|
491
|
+
silent downgrade to either master key (a row-level leak) or an unscoped
|
|
492
|
+
session, so the bridge **fails closed** and refuses the subscription with a
|
|
493
|
+
security error. Subscribe with a session-token or master-key agent instead.
|
|
494
|
+
|
|
495
|
+
Because Parse Server fixes ACL-bypass authorization at LiveQuery *connect*
|
|
496
|
+
time (there is no per-subscription master key), the bridge keeps two
|
|
497
|
+
connections and routes by credential: master-posture subscriptions ride a
|
|
498
|
+
dedicated **admin** connection
|
|
499
|
+
(`Parse::LiveQuery::Client.new(use_master_key: true)`), while session-token
|
|
500
|
+
subscriptions ride a normal connection and pass their token per subscription.
|
|
501
|
+
Either way, an update only fires for an object the subscription's scope is
|
|
502
|
+
permitted to read — LiveQuery filters events by ACL server-side. (Whether a
|
|
503
|
+
master connection additionally surfaces master-key-only rows depends on the
|
|
504
|
+
Parse Server version and its `masterKeyIps` configuration.)
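
The routing rules in this section reduce to a small decision function. The sketch below is an assumption-laden illustration: `AgentScope`, `SubscriptionRefused`, and `livequery_credential` are hypothetical names, not the gem's classes, and the returned hashes only label which connection a subscription would ride.

```ruby
# Hypothetical error type and scope struct; not the gem's classes.
class SubscriptionRefused < StandardError; end

AgentScope = Struct.new(:kind, :session_token, keyword_init: true)

# Map an agent scope to the LiveQuery credential described in the table above.
def livequery_credential(scope)
  case scope.kind
  when :session_token
    # Normal connection; the token is passed per subscription.
    { connection: :normal, session_token: scope.session_token }
  when :master_key
    # Dedicated admin connection; authorization is fixed at connect time.
    { connection: :admin }
  else
    # acl_user / acl_role (and anything unrecognized) has no LiveQuery
    # equivalent, so refuse rather than silently downgrade.
    raise SubscriptionRefused, "scope #{scope.kind.inspect} cannot be bridged to LiveQuery"
  end
end

user_cred  = livequery_credential(AgentScope.new(kind: :session_token, session_token: "r:abc"))
admin_cred = livequery_credential(AgentScope.new(kind: :master_key))
```

Putting the refusal in the `else` branch means any scope kind added later is refused by default, which is the fail-closed behavior the section describes.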
|
|
505
|
+
|
|
506
|
+
### Operational notes and limitations
|
|
507
|
+
|
|
508
|
+
- **Single-process.** Subscription state lives in the `MCPRackApp` instance
|
|
509
|
+
(like the cancellation registry), so in a clustered / multi-process
|
|
510
|
+
deployment a LiveQuery event observed on one worker does not reach a
|
|
511
|
+
listening stream held on another. The delivery seam
|
|
512
|
+
(`Parse::Agent::MCPSubscriptions::Notifier`) is isolated so a Redis-backed
|
|
513
|
+
pub/sub adapter can be supplied later without changing the bridge or the
|
|
514
|
+
dispatcher; pass it via `subscription_manager:`.
|
|
515
|
+
- **Subscriptions do not survive a listening-stream reconnect.** Closing the
|
|
516
|
+
`GET` stream tears down the session's LiveQuery subscriptions; a client that
|
|
517
|
+
reconnects must re-issue its `resources/subscribe` calls.
|
|
518
|
+
- **Session id is a bearer capability.** The listening stream authenticates via
|
|
519
|
+
the agent factory and keys delivery off the server-issued `Mcp-Session-Id`,
|
|
520
|
+
which the client must keep secret — possession of a valid session id (plus a
|
|
521
|
+
valid agent) is sufficient to attach. This matches the cancellation model.
|
|
522
|
+
- **Per-session and global caps.** A client that subscribes but never opens (or
|
|
523
|
+
later drops) its listening stream leaves LiveQuery subscriptions running until
|
|
524
|
+
the session is torn down. A per-session ceiling (default 100,
|
|
525
|
+
`max_subscriptions_per_session:` on the manager) bounds one session's
|
|
526
|
+
footprint, and a global ceiling on the number of distinct subscribing sessions
|
|
527
|
+
(default 10,000, `max_sessions:`) bounds total growth. The global cap is a
|
|
528
|
+
rejection cap (new sessions are refused with a JSON-RPC error once it is
|
|
529
|
+
reached) and fails closed.
|
|
530
|
+
- **Concurrent listening streams are bounded separately from request SSE.**
|
|
531
|
+
`max_concurrent_dispatchers:` does **not**, by itself, bound the GET listening
|
|
532
|
+
streams used for resource subscriptions and notifications — those get their own
|
|
533
|
+
soft cap *equal to* `max_concurrent_dispatchers`. So the effective steady-state
|
|
534
|
+
ceiling across both surfaces is up to **2× `max_concurrent_dispatchers`** (up
|
|
535
|
+
to N request-scoped SSE dispatchers plus N listening streams). Size the value
|
|
536
|
+
with that 2× factor in mind (e.g. relative to your Puma `max_threads`). Leaving
|
|
537
|
+
it unset (the default `nil`) leaves both surfaces uncapped; the app logs a
|
|
538
|
+
one-time warning at construction when a streaming or subscription/notification
|
|
539
|
+
surface is enabled without a cap.
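
The 2× factor is easiest to see as arithmetic. The helper below is a hypothetical sizing aid, not part of the gem; `reserved_threads` is an assumed knob for how many Puma threads you want to keep free for ordinary, non-streaming requests.

```ruby
# Hypothetical sizing helper: choose max_concurrent_dispatchers so that the
# worst case (N request SSE dispatchers + N listening streams) still leaves
# some threads free for ordinary requests.
def dispatcher_cap_for(max_threads:, reserved_threads:)
  streaming_budget = max_threads - reserved_threads
  raise ArgumentError, "no threads left for streaming" if streaming_budget < 2

  streaming_budget / 2  # integer division: the two surfaces split the budget
end

cap = dispatcher_cap_for(max_threads: 32, reserved_threads: 8)
worst_case_streams = 2 * cap
puts "max_concurrent_dispatchers: #{cap}, worst-case long-lived streams: #{worst_case_streams}"
```

With 32 threads and 8 held back, the cap comes out at 12, so at most 24 threads can be pinned by long-lived streams at once.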
|
|
540
|
+
|
|
541
|
+
---
|
|
542
|
+
|
|
543
|
+
## Approval Workflows (MCP elicitation)
|
|
544
|
+
|
|
545
|
+
`:write` / `:admin` tier tool calls can require human approval before they run,
|
|
546
|
+
using the MCP 2025-06-18 spec-native `elicitation/create` channel. Off by
|
|
547
|
+
default, so existing clients are unaffected.
|
|
548
|
+
|
|
549
|
+
```ruby
|
|
550
|
+
# Opt tiers in (process-wide). Has teeth only when an approval gate is installed
|
|
551
|
+
# (the MCP transport installs one per session; see below).
|
|
552
|
+
Parse::Agent.require_approval_for = [:write, :admin]
|
|
553
|
+
```
|
|
554
|
+
|
|
555
|
+
The approval gate is a pluggable `agent.approval_gate` consulted inside
|
|
556
|
+
`Parse::Agent#execute` — so it is reachable on the non-MCP path and
|
|
557
|
+
unit-testable with a fake approver. `Parse::Agent::MCPElicitationGate` is the
|
|
558
|
+
spec-native implementation; `Parse::Agent::NullGate` (the default) approves.
|
|
559
|
+
|
|
560
|
+
Round-trip over the streaming transport:
|
|
561
|
+
|
|
562
|
+
1. A `tools/call` for a gated tier pauses before execution. The server builds an
|
|
563
|
+
`elicitation/create` request whose payload carries the **approval preview**
|
|
564
|
+
(for `call_method` the *effective* tier is resolved from the target
|
|
565
|
+
`agent_method`'s declared permission, so write/admin methods invoked through
|
|
566
|
+
the readonly `call_method` tool are gated correctly). The preview is a real
|
|
567
|
+
before/after only for methods that declare `supports_dry_run`; for the
|
|
568
|
+
built-in `update_object` / `delete_object` it is the proposed `{ tool, args }`
|
|
569
|
+
call, **not** a fetched before/after of the target row.
|
|
570
|
+
2. The request is pushed to the client over the open **GET listening stream**
|
|
571
|
+
(the same bus as resource subscriptions).
|
|
572
|
+
3. The client replies with a JSON-RPC response (`{ result: { action: "accept" |
|
|
573
|
+
"decline" | "cancel" } }`) as a separate POST. The server routes it,
|
|
574
|
+
session-bound, into a pending registry that wakes the blocked tool thread.
|
|
575
|
+
4. `accept` → the tool runs. Anything else → a structured refusal; the tool
|
|
576
|
+
never executes.
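
Steps 3 and 4 hinge on a registry that parks the tool thread until a session-bound reply arrives or the timeout lapses. The following is a minimal sketch of that idea using only `Mutex` and `ConditionVariable`; `PendingApprovals` and its method names are hypothetical and are not the gem's internal classes.

```ruby
# Hypothetical session-bound registry of pending approvals (not the gem's class).
class PendingApprovals
  Entry = Struct.new(:session_id, :action)

  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @pending = {}  # request id => Entry
  end

  # Called on the tool thread: block until answered or until `timeout` elapses.
  # Returns true only for an explicit "accept"; anything else is a refusal.
  def await(request_id, session_id:, timeout:)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @mutex.synchronize do
      entry = (@pending[request_id] = Entry.new(session_id, nil))
      while entry.action.nil?
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        break if remaining <= 0

        @cond.wait(@mutex, remaining)
      end
      @pending.delete(request_id)
      entry.action == "accept"
    end
  end

  # Called when the client's reply POST arrives. Replies from a different
  # session are ignored, so one session cannot answer another's prompt.
  def resolve(request_id, session_id:, action:)
    @mutex.synchronize do
      entry = @pending[request_id]
      return false if entry.nil? || entry.session_id != session_id

      entry.action = action
      @cond.broadcast
      true
    end
  end
end

registry = PendingApprovals.new
waiter = Thread.new { registry.await("req-1", session_id: "s1", timeout: 2) }
sleep 0.01 until registry.resolve("req-1", session_id: "s1", action: "accept")
approved = waiter.value
```

A timeout and a `decline` both fall through to the same `false` result, so the caller has a single refusal path and never executes the tool by default.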
|
|
577
|
+
|
|
578
|
+
Client capability and transport requirements (the server reads, but does not
|
|
579
|
+
advertise, the client's `elicitation` capability at `initialize`):
|
|
580
|
+
|
|
581
|
+
```ruby
|
|
582
|
+
Parse::Agent::MCPRackApp.new(
|
|
583
|
+
streaming: true,
|
|
584
|
+
resource_subscriptions: true, # or notifications: true — either opens the GET bus
|
|
585
|
+
approval_timeout: 300, # seconds to wait for a human; default 300
|
|
586
|
+
agent_factory: ->(env) { ... },
|
|
587
|
+
)
|
|
588
|
+
```
|
|
589
|
+
|
|
590
|
+
**Three prerequisites — miss any one and every gated write fails closed,
|
|
591
|
+
which looks like a bug rather than a config gap:**
|
|
592
|
+
|
|
593
|
+
1. **`streaming: true`** on the `MCPRackApp` (it defaults to `false`). Approval
|
|
594
|
+
needs a server→client request, which only the streaming transport can send.
|
|
595
|
+
2. **An open GET bus** — `notifications: true` *or* `resource_subscriptions:
|
|
596
|
+
true`. `notifications: true` is the lighter choice if you don't need
|
|
597
|
+
LiveQuery resource subscriptions. Without a bus there is no channel to
|
|
598
|
+
deliver `elicitation/create`.
|
|
599
|
+
3. **A concurrent server (Puma), not the bundled `MCPServer`.** The bundled
|
|
600
|
+
server runs on WEBrick and is non-streaming, so approval can never round-trip
|
|
601
|
+
there; mount `Parse::Agent.rack_app` under Puma for any deployment that uses
|
|
602
|
+
approval.
|
|
603
|
+
|
|
604
|
+
Operator aid: a write/admin agent served over MCP with `require_approval_for`
|
|
605
|
+
empty emits a one-time `[Parse::Agent:SECURITY]` warning (writes run ungated).
|
|
606
|
+
Approval round-trips also emit a `parse.agent.approval` `ActiveSupport::Notifications`
|
|
607
|
+
event carrying `outcome`, `reason`, and the measured wait — subscribe to it to
|
|
608
|
+
spot a non-answering client holding a dispatcher thread for the full
|
|
609
|
+
`approval_timeout` (default 300s).
|
|
610
|
+
|
|
611
|
+
**Fails closed.** When approval is required but the client did not advertise the
|
|
612
|
+
`elicitation` capability, no listening stream is open, the transport is
|
|
613
|
+
non-streaming (WEBrick), or the approver times out, the destructive operation is
|
|
614
|
+
**refused** — never blocked forever, never silently executed. Replies are bound
|
|
615
|
+
to the answering session's `Mcp-Session-Id`, so one session cannot answer (or
|
|
616
|
+
guess the id of) another's prompt.
|
|
617
|
+
|
|
618
|
+
---
|
|
619
|
+
|
|
620
|
+
## Server-initiated Notifications (general purpose)
|
|
621
|
+
|
|
622
|
+
The GET listening-stream bus also backs arbitrary server→client notifications,
|
|
623
|
+
without requiring LiveQuery resource subscriptions:
|
|
624
|
+
|
|
625
|
+
```ruby
|
|
626
|
+
app = Parse::Agent::MCPRackApp.new(streaming: true, notifications: true,
|
|
627
|
+
agent_factory: ->(env) { ... })
|
|
628
|
+
|
|
629
|
+
# From application code that holds the app reference:
|
|
630
|
+
app.notify("the-session-id", method: "notifications/custom", params: { foo: 1 })
|
|
631
|
+
```
|
|
632
|
+
|
|
633
|
+
`notifications: true` builds the listening-stream manager in a `supported:
|
|
634
|
+
false` posture: the GET stream and `#notify` work, but `resources.subscribe`
|
|
635
|
+
stays unadvertised and `resources/subscribe` POSTs fail closed. `#notify` builds
|
|
636
|
+
a JSON-RPC **notification** (never an `id` — that distinguishes it from the
|
|
637
|
+
server-initiated *request* used by elicitation) and returns `false` when no
|
|
638
|
+
stream is attached for the session. `app.subscription_manager` is exposed for an
|
|
639
|
+
out-of-band / clustered publisher that needs the lower-level `publish` seam.
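
The `id` distinction is a JSON-RPC 2.0 rule rather than anything specific to this gem, and it can be shown in a few lines. The helper names are hypothetical, and the `elicitation/create` params are trimmed to a single illustrative `message` field.

```ruby
require "json"

# A JSON-RPC 2.0 notification: no "id", so the receiver must not reply.
def build_notification(method, params)
  { "jsonrpc" => "2.0", "method" => method, "params" => params }
end

# A JSON-RPC 2.0 request: carries an "id" that the reply must echo back.
def build_request(id, method, params)
  build_notification(method, params).merge("id" => id)
end

note = build_notification("notifications/custom", { "foo" => 1 })
ask  = build_request("approval-1", "elicitation/create", { "message" => "Approve this write?" })
puts JSON.generate(note)
puts JSON.generate(ask)
```

Because `note` has no `id`, a client has nothing to correlate a reply with; `ask` does, which is what lets an elicitation answer be routed back to the waiting tool call.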
|
|
640
|
+
|
|
641
|
+
---
|
|
642
|
+
|
|
643
|
+
## Built-in Agent Hardening & Telemetry
|
|
644
|
+
|
|
645
|
+
5.2 adds several agent-side controls, all configured on `Parse::Agent`:
|
|
646
|
+
|
|
647
|
+
- **Impersonation** — `Parse::Agent.new(impersonate_user: <id|Pointer|User>,
|
|
648
|
+
impersonate_mint: false, impersonation_label:)` (or `agent.impersonate(user)`
|
|
649
|
+
/ `agent.stop_impersonating!`) resolves a real session token for a `_User`
|
|
650
|
+
(reusing an active `_Session`, or minting a restricted one with
|
|
651
|
+
`impersonate_mint: true`) and binds it as if `session_token:` had been passed.
|
|
652
|
+
Master-key client required; fails closed if no session resolves. An
|
|
653
|
+
`impersonation_label:` (also usable with `acl_role:`) is emitted on the
|
|
654
|
+
`parse.agent.tool_call` payload alongside `impersonated_user_id`.
|
|
655
|
+
- **Prompt hardening** (`Parse::Agent::PromptHardening`) — schema descriptions
|
|
656
|
+
surfaced by `get_schema` / `get_all_schemas` are sanitized (non-identifier
|
|
657
|
+
field names dropped with a `[Parse::Agent:PROMPT]` warning, control/zero-width
|
|
658
|
+
chars stripped, capped, marker-wrapped); untrusted tool content has embedded
|
|
659
|
+
wrapper markers neutralized (`Parse::Agent.prompt_marker_strict = true` to
|
|
660
|
+
refuse instead). Operator canary phrases via
|
|
661
|
+
`Parse::Agent.prompt_injection_canaries = ["IGNORE PREVIOUS", /system:/i]`
|
|
662
|
+
emit `parse.agent.prompt_injection_detected`; set
|
|
663
|
+
`Parse::Agent.canary_action = :refuse` to raise on a hit.
|
|
664
|
+
`Parse::Agent::PROMPT_VERSION` is surfaced via
|
|
665
|
+
`agent.describe[:prompt][:version]`. A one-time warning fires when
|
|
666
|
+
`allowed_llm_endpoints` is left unrestricted (nil).
|
|
667
|
+
- **Embedding-cost telemetry** — embedding calls made inside a tool span add
|
|
668
|
+
`embed_calls`, `embed_tokens`, and (when
|
|
669
|
+
`Parse::Agent.embed_cost_per_million_tokens` is set) `embed_cost_usd` to the
|
|
670
|
+
`parse.agent.tool_call` payload. The per-tool span does **not** cover
|
|
671
|
+
corpus/ingestion embeds fired at `Model.save` time (typically the dominant
|
|
672
|
+
spend) — wrap those in `Parse::Agent.measure_embeddings { … }`, which returns
|
|
673
|
+
`{ calls:, tokens:, cost_usd: }` for the work done on the calling thread:
|
|
674
|
+
|
|
675
|
+
```ruby
|
|
676
|
+
stats = Parse::Agent.measure_embeddings do
|
|
677
|
+
KnowledgeArticle.save_all(batch) # embed-on-save
|
|
678
|
+
end
|
|
679
|
+
stats # => { calls: 1200, tokens: 4_300_000, cost_usd: 0.43 }
|
|
680
|
+
```
|
|
681
|
+
|
|
682
|
+
Thread-local: embeds fanned out to other threads/fibers are not captured —
|
|
683
|
+
measure inside each worker. `Parse::Agent.embed_cost_usd(tokens)` converts a
|
|
684
|
+
token count to USD using the configured rate (nil when unset).
|
|
685
|
+
- **Provenance** — `Parse::Agent.include_source_provenance = true` (default
|
|
686
|
+
false) stamps each read-tool row with `_source = { class, tool, object_id }`,
|
|
687
|
+
applied after field-allowlist projection and redaction.
|
|
688
|
+
- **`semantic_search` tool** — registered readonly + `client_safe`; opt a model
|
|
689
|
+
in with `agent_searchable field:, filter_fields:`. See the
|
|
690
|
+
[Atlas Vector Search Guide](./atlas_vector_search_guide.md#retrieval-rag).
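
For the embedding-cost bullet above, the token-to-USD conversion is plain arithmetic on a per-million rate. The function below is a hypothetical stand-in, not the gem's `Parse::Agent.embed_cost_usd`; the explicit `rate_per_million:` argument, the example rate of 0.10, and the rounding to four decimals are all assumptions.

```ruby
# Hypothetical stand-in for the cost conversion; the rate is an example value
# and the rounding is an illustrative choice, not documented behavior.
def embed_cost_usd(tokens, rate_per_million:)
  return nil if rate_per_million.nil?  # mirrors "nil when unset"

  (tokens / 1_000_000.0 * rate_per_million).round(4)
end

priced   = embed_cost_usd(4_300_000, rate_per_million: 0.10)  # 4.3M tokens at $0.10 per million
unpriced = embed_cost_usd(4_300_000, rate_per_million: nil)
puts priced.inspect
puts unpriced.inspect
```

At that assumed rate, the 4.3M-token batch from the `measure_embeddings` example works out to the same `0.43` shown there.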
|
|
691
|
+
|
|
692
|
+
### Runtime denial gates
|
|
693
|
+
|
|
694
|
+
Beyond the permission-tier and env-gate checks, several gates refuse a tool
|
|
695
|
+
call at runtime based on its arguments. They fail closed; a caller sees a
|
|
696
|
+
structured error (the built-in tools return `{ success: false, error:,
|
|
697
|
+
error_code: }`, which surfaces as `isError: true` over MCP). Knowing them up
|
|
698
|
+
front avoids discovering each only on impact:
|
|
699
|
+
|
|
700
|
+
| Gate | When it fires | Surfaced as |
|
|
701
|
+
|------|---------------|-------------|
|
|
702
|
+
| Missing tenant scope | A searchable class has no `agent_tenant_scope` while other classes do (tenant-aware deployment) | `Parse::Agent::MissingTenantScope` (search path); a one-time `[Parse::Agent:SECURITY]` lint warning on the general query path |
|
|
703
|
+
| No tenant binding | A scoped class is queried by an agent whose tenant value resolves to `nil` | `Parse::Agent::AccessDenied` (`kind: :tenant`) |
|
|
704
|
+
| Hidden class | A tool targets an `agent_hidden` class (or one outside a per-instance `classes:` allowlist) | `Parse::Agent::AccessDenied` (`kind: :hidden_class`) / off-allowlist refusal |
|
|
705
|
+
| Reserved underscore key | A `filter:` / `vector_filter:` / `where:` contains an underscore-prefixed key (`_rperm`, `_p_*`, …) at any depth | `ArgumentError` / `ValidationError` (recursive refusal) |
|
|
706
|
+
| Filter-field allowlist | A `filter:` / `vector_filter:` names a field not in the class's `agent_searchable filter_fields:` | `ValidationError` naming the offending field(s) |
|
|
707
|
+
| `text_field` not embedded | `semantic_search` `text_field:` names a field that isn't a declared `embed` source | `ValidationError` listing the allowed sources |
|
|
708
|
+
| Tool filtered | A tool/method removed by a per-instance `tools:` / `methods:` filter is invoked | `error_code: :tool_filtered` |
|
|
709
|
+
| Approval denied/unavailable | A gated write/admin op is rejected or the approver is unreachable | `error_code: :approval_denied` |
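
The "at any depth" wording in the reserved-key row implies a recursive walk over the filter. A minimal sketch of such a walk follows; `reserved_key_path` and `validate_filter!` are hypothetical helpers, and the gem's own validator may differ in error type and message.

```ruby
# Hypothetical validator: find the first underscore-prefixed key at any depth.
# Returns the path to the offending key as an array of strings, or nil.
def reserved_key_path(node, path = [])
  case node
  when Hash
    node.each do |key, value|
      here = path + [key.to_s]
      return here if key.to_s.start_with?("_")

      found = reserved_key_path(value, here)
      return found if found
    end
    nil
  when Array
    node.each_with_index do |value, index|
      found = reserved_key_path(value, path + [index.to_s])
      return found if found
    end
    nil
  end
end

# Raise on a reserved key; otherwise hand the filter back unchanged.
def validate_filter!(filter)
  offending = reserved_key_path(filter)
  raise ArgumentError, "reserved key at #{offending.join('.')}" if offending

  filter
end

validate_filter!({ "status" => "open", "$or" => [{ "title" => "a" }, { "views" => { "$gt" => 10 } }] })
```

Operator keys such as `$or` and `$gt` pass, because only a leading underscore is treated as reserved in this sketch.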
|
|
710
|
+
|
|
711
|
+
---
|
|
712
|
+
|
|
713
|
+
## Token Economy
|
|
714
|
+
|
|
715
|
+
The MCP surface is paid for in LLM context tokens — the tool schemas sent every
|
|
716
|
+
session, and the data every tool returns. 5.2 adds controls to keep that cost
|
|
717
|
+
down.
|
|
718
|
+
|
|
719
|
+
### Lean tool profile
|
|
720
|
+
|
|
721
|
+
A full `:readonly` `tools/list` payload is roughly **7.9K context tokens** every
|
|
722
|
+
session. For small-context models or token-sensitive deployments, the `:lean`
|
|
723
|
+
profile narrows the surface to the six core read tools (`get_all_schemas`,
|
|
724
|
+
`get_schema`, `query_class`, `count_objects`, `get_object`, `aggregate`) —
|
|
725
|
+
about **2.6K tokens, a ~67% reduction**:
|
|
726
|
+
|
|
727
|
+
```ruby
|
|
728
|
+
Parse::Agent.new(permissions: :readonly, tools: :lean)
|
|
729
|
+
```
|
|
730
|
+
|
|
731
|
+
A profile is an allowlist: it composes with the permission tier and can only
|
|
732
|
+
narrow, never elevate. Profiles are Symbol-only (`Parse::Agent::TOOL_PROFILES`);
|
|
733
|
+
for finer control, pass an explicit Array or `{ only:, except: }` instead. An
|
|
734
|
+
unknown profile raises rather than silently exposing the full surface.
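
"Can only narrow, never elevate" amounts to a set intersection between the tier's tools and the profile's allowlist. The sketch below assumes that reading; `resolve_tools`, `PROFILES`, and the abbreviated tier list are illustrative and are not the gem's `TOOL_PROFILES` data.

```ruby
# Illustrative data only: the tier list below is abbreviated, not the gem's real set.
LEAN_TOOLS = %i[get_all_schemas get_schema query_class count_objects get_object aggregate].freeze
PROFILES = { lean: LEAN_TOOLS }.freeze

# Apply a named profile on top of whatever the permission tier already allows.
def resolve_tools(tier_tools, profile)
  allowlist = PROFILES.fetch(profile) { raise ArgumentError, "unknown tool profile: #{profile.inspect}" }
  tier_tools & allowlist  # intersection: can narrow the tier, never widen it
end

readonly_tier = LEAN_TOOLS + %i[get_objects semantic_search]
lean_surface = resolve_tools(readonly_tier, :lean)

# Sanity check on the quoted figures: 7.9K tokens down to 2.6K.
reduction_percent = ((7.9 - 2.6) / 7.9 * 100).round
puts "#{lean_surface.size} tools, about #{reduction_percent}% smaller"
```

Using `fetch` with a raising block is what makes a typo such as `:leen` fail loudly instead of falling back to the full surface.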
|
|
735
|
+
|
|
736
|
+
### Leaner tool responses
|
|
737
|
+
|
|
738
|
+
Read tools return rows in an LLM-friendly form (Pointers as `{_type, class,
|
|
739
|
+
id}`, Dates as bare ISO strings) and now **strip the raw `ACL` map** — it is
|
|
740
|
+
operationally useless to a model (effective authority is enforced server-side
|
|
741
|
+
regardless) and is pure token overhead plus a minor role/user-id disclosure.
|
|
742
|
+
`get_objects` and the Atlas Search tools now go through the same normalization
|
|
743
|
+
that `query_class` has always used, instead of returning rows in raw wire format.
|
|
744
|
+
|
|
745
|
+
Defaults that bound response size: `query_class` `limit:` defaults to 100 (cap
|
|
746
|
+
1000) with the rendered array capped at 50 (`truncated_note`); `aggregate`
|
|
747
|
+
auto-injects a terminal `$limit: 200`. Pass a smaller `limit:` / project fewer
|
|
748
|
+
fields via `keys:` when you want a tighter result.
|
|
749
|
+
|
|
750
|
+
### `semantic_search` — deduped sources and a token budget
|
|
751
|
+
|
|
752
|
+
The `semantic_search` result hoists each chunk's parent record **once** into a
|
|
753
|
+
`documents` map keyed by `objectId`, instead of duplicating the full source on
|
|
754
|
+
every chunk — map a chunk back to its source via `metadata.object_id`:
|
|
755
|
+
|
|
756
|
+
```jsonc
|
|
757
|
+
{
|
|
758
|
+
"chunks": [
|
|
759
|
+
{ "id": "a#0", "score": 0.82, "content": "…", "metadata": { "object_id": "a", "chunk_index": 0 } },
|
|
760
|
+
{ "id": "a#1", "score": 0.82, "content": "…", "metadata": { "object_id": "a", "chunk_index": 1 } }
|
|
761
|
+
],
|
|
762
|
+
"documents": { "a": { "objectId": "a", "title": "…" } },
|
|
763
|
+
"count": 2
|
|
764
|
+
}
|
|
765
|
+
```
|
|
766
|
+
|
|
767
|
+
A `max_total_tokens` budget (default 20,000; estimated as chars/4) trims the
|
|
768
|
+
lowest-ranked chunks so a few long documents can't silently blow the context
|
|
769
|
+
window — the count caps (`k * max_chunks_per_document`) bound the chunk *count*
|
|
770
|
+
but not their total size. When the budget trims, the result adds
|
|
771
|
+
`budget_truncated: true` and `budget_dropped: <n>` so the truncation is never
|
|
772
|
+
silent. Pass `max_total_tokens: 0` to disable.
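
One way to picture the budget is as a greedy pass over chunks in descending score order, using the chars/4 estimate. This is a sketch under that assumption, not the retriever's actual code; `apply_token_budget` and the chunk hash shape (`:score`, `:content`) are hypothetical.

```ruby
# Hypothetical budget trim: keep the highest-scoring chunks while the running
# estimate (characters / 4, rounded up) stays within max_total_tokens.
def apply_token_budget(chunks, max_total_tokens: 20_000)
  return { chunks: chunks } if max_total_tokens.zero?  # 0 disables the budget

  used = 0
  kept = []
  chunks.sort_by { |chunk| -chunk[:score] }.each do |chunk|
    cost = (chunk[:content].length / 4.0).ceil
    break if used + cost > max_total_tokens  # everything ranked lower is dropped too

    used += cost
    kept << chunk
  end

  result = { chunks: kept }
  dropped = chunks.size - kept.size
  result.merge!(budget_truncated: true, budget_dropped: dropped) if dropped.positive?
  result
end

chunks = [
  { id: "a#0", score: 0.9, content: "a" * 400 },  # about 100 tokens
  { id: "b#0", score: 0.8, content: "b" * 400 },
  { id: "c#0", score: 0.7, content: "c" * 400 }
]
trimmed = apply_token_budget(chunks, max_total_tokens: 250)
```

Stopping at the first chunk that does not fit keeps the result a clean prefix of the ranking, at the cost of possibly leaving a little budget unused.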
|
|
773
|
+
|
|
774
|
+
### Structured error metadata on the wire
|
|
775
|
+
|
|
776
|
+
A failing `tools/call` already carries `error_code` and a structured `details:`
|
|
777
|
+
hash (e.g. `allowed_fields`, `suggested_rewrite`) and `retry_after` — these are
|
|
778
|
+
now forwarded on the MCP error envelope under `_meta` (`parse.error_code`,
|
|
779
|
+
`parse.retry_after`, `parse.details`) so a client can branch deterministically
|
|
780
|
+
and honor `retry_after` instead of re-parsing the prose message. The
|
|
781
|
+
human-readable `content` text is unchanged.
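
On the client side, branching on `_meta` might look like the following. The envelope shape is an assumption: the sketch treats `parse.error_code`, `parse.retry_after`, and `parse.details` as flat string keys under `_meta`, and the action symbols it returns are invented for the example.

```ruby
# Assumed envelope shape: flat "parse.*" keys under "_meta" on a tool result.
# Returns [action, wait_seconds]; the action names are illustrative only.
def next_step(tool_result)
  return [:ok, nil] unless tool_result["isError"]

  meta = tool_result.fetch("_meta", {})
  retry_after = meta["parse.retry_after"]
  return [:retry, retry_after] if retry_after

  case meta["parse.error_code"].to_s
  when "approval_denied" then [:ask_user, nil]
  when "tool_filtered"   then [:pick_other_tool, nil]
  else [:report, nil]
  end
end

rate_limited = {
  "isError" => true,
  "content" => [{ "type" => "text", "text" => "Too many requests" }],
  "_meta" => { "parse.error_code" => "rate_limited", "parse.retry_after" => 5 }
}
decision = next_step(rate_limited)
```

The prose in `content` is never inspected, which is the point of forwarding the structured fields.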
|
|
782
|
+
|
|
783
|
+
`get_schema` on a mistyped class name now raises a `ValidationError` carrying a
|
|
784
|
+
"Did you mean: …?" hint (near matches from the locally-known classes), so the
|
|
785
|
+
model self-corrects in one retry instead of falling back to a full
|
|
786
|
+
`get_all_schemas` sweep.
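
A near-match hint of this kind can be produced with a plain edit-distance comparison against the known class names. The sketch below is one possible approach and is not claimed to be the gem's matcher; `edit_distance`, `did_you_mean`, and the cutoff of 2 are assumptions.

```ruby
# Classic dynamic-programming edit distance (insert, delete, substitute).
def edit_distance(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Hypothetical hint builder; the distance cutoff of 2 is an arbitrary choice.
def did_you_mean(name, known_classes, max_distance: 2)
  near = known_classes.select { |known| edit_distance(name.downcase, known.downcase) <= max_distance }
  near.empty? ? nil : "Did you mean: #{near.join(', ')}?"
end

hint = did_you_mean("Psot", %w[Post Comment _User])
puts hint
```

Returning `nil` when nothing is close keeps the error message free of misleading suggestions for names that are simply unknown.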
|
|
787
|
+
|
|
788
|
+
---
|
|
789
|
+
|
|
363
790
|
## Custom Authentication
|
|
364
791
|
|
|
365
792
|
The agent factory pattern gives you full control over authentication. Every request passes through the factory before any Parse operation is attempted.
|
|
@@ -739,6 +1166,10 @@ agent = Parse::Agent.new(tools: { only: [:query_class, :get_schema, :aggregate],
|
|
|
739
1166
|
|
|
740
1167
|
# Denylist only
|
|
741
1168
|
agent = Parse::Agent.new(tools: { except: [:emit_artifact] })
|
|
1169
|
+
|
|
1170
|
+
# Named profile (Symbol) — :lean narrows to the six core read tools
|
|
1171
|
+
# (~67% smaller tools/list). See "Token Economy" above.
|
|
1172
|
+
agent = Parse::Agent.new(tools: :lean)
|
|
742
1173
|
```
|
|
743
1174
|
|
|
744
1175
|
**Resolution order** is strict: env-gates ▷ permission tier ▷ per-instance filter. The filter cannot elevate — `tools: { only: [:delete_object] }` on a `:readonly` agent still excludes `delete_object` because `delete_object` is not in the readonly tier's permitted set in the first place.
|
|
@@ -1667,6 +2098,8 @@ Known `details[:kind]` subcodes for `:access_denied`:
|
|
|
1667
2098
|
|
|
1668
2099
|
The top-level `error_code` stays at `:access_denied` for back-compat with consumers that only branch on it. The new subcode is purely additive — clients that ignore `details:` see no change in behavior.
|
|
1669
2100
|
|
|
2101
|
+
**On the wire (5.2+):** `error_code`, `retry_after`, and `details` are forwarded on the MCP tool-error envelope under `_meta` — `parse.error_code`, `parse.retry_after`, `parse.details` — so a spec-compliant client can branch deterministically (and honor `retry_after`) without parsing the prose `content` text. The `content` text and `isError: true` are unchanged.
|
|
2102
|
+
|
|
1670
2103
|
---
|
|
1671
2104
|
|
|
1672
2105
|
## Performance and Timeouts
|
|
@@ -757,6 +757,15 @@ non-master scope so this enforcement always runs. See the
|
|
|
757
757
|
[Atlas Vector Search Guide](./atlas_vector_search_guide.md) for the
|
|
758
758
|
full API surface.
|
|
759
759
|
|
|
760
|
+
`Parse::Retrieval.retrieve` and the `semantic_search` agent tool (v5.2)
|
|
761
|
+
build directly on `find_similar`, so they inherit this exact Layer 1-5
|
|
762
|
+
mongo-direct enforcement. The earlier RAG plan's "two-stage" REST
|
|
763
|
+
re-query was intentionally NOT adopted — there is no REST vector path,
|
|
764
|
+
and `acl_user:` / `acl_role:` scopes have no REST equivalent, so the
|
|
765
|
+
post-`$vectorSearch` `_rperm` `$match` is the single enforcement
|
|
766
|
+
boundary. The retrieval layer adds a tenant-scope fold into the Atlas
|
|
767
|
+
pre-filter on top of this, never a substitute for it.
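The shape described above (tenant scope folded into the Atlas pre-filter, followed by a post-`$vectorSearch` `_rperm` `$match`) might look roughly like the pipeline below. This is a simplified sketch, not the library's pipeline: the index name, the `embedding` path, the `_p_tenant` filter field, and the `secured_vector_pipeline` helper are assumptions, and the real Layer 1-5 enforcement covers more cases than a single `$match`.

```ruby
# Build an aggregation pipeline with a tenant pre-filter inside $vectorSearch
# and a read-permission $match immediately after it.
def secured_vector_pipeline(query_vector:, tenant_pointer:, readers:, limit: 10)
  [
    { "$vectorSearch" => {
        "index"         => "embedding_index",   # hypothetical index name
        "path"          => "embedding",         # hypothetical vector field
        "queryVector"   => query_vector,
        "numCandidates" => limit * 10,
        "limit"         => limit,
        # Tenant scope narrows candidates early; it is not the ACL check.
        "filter"        => { "_p_tenant" => tenant_pointer } } },
    # The ACL boundary: keep only rows whose _rperm lists a permitted reader.
    { "$match" => { "_rperm" => { "$in" => readers } } }
  ]
end

pipeline = secured_vector_pipeline(
  query_vector: [0.1, 0.2],
  tenant_pointer: 'Tenant$abc123',    # hypothetical pointer value
  readers: ["*", "user123"],
  limit: 5
)
```

Because the `$match` runs after the search stage, it can return fewer than `limit` rows; the pre-filter reduces that shortfall but does not replace the ACL check.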
|
|
768
|
+
|
|
760
769
|
### Timeouts
|
|
761
770
|
|
|
762
771
|
```ruby
|
|
@@ -947,7 +956,7 @@ three independent flags that every mutation re-checks per call.
|
|
|
947
956
|
Requires `clusterMonitor` privilege on the reader; returns `{}`
|
|
948
957
|
when not granted so callers degrade gracefully.
|
|
949
958
|
|
|
950
|
-
### Model DSL: `mongo_index` / `mongo_geo_index` / `mongo_relation_index`
|
|
959
|
+
### Model DSL: `mongo_index` / `unique_index_on` / `mongo_geo_index` / `mongo_relation_index`
|
|
951
960
|
|
|
952
961
|
Index declarations are class-level metadata on `Parse::Object`
|
|
953
962
|
subclasses. They run validation at registration time so a typo,
|
|
@@ -967,6 +976,7 @@ class Car < Parse::Object
|
|
|
967
976
|
|
|
968
977
|
mongo_index :make, :model, :year # compound
|
|
969
978
|
mongo_index :vin, unique: true
|
|
979
|
+
unique_index_on :registration # dedup floor; unique { registration: 1 }
|
|
970
980
|
mongo_index :owner # pointer auto-rewrites to _p_owner
|
|
971
981
|
mongo_geo_index :location # 2dsphere on GeoJSON Point
|
|
972
982
|
mongo_index :tags # array field
|
|
@@ -1007,6 +1017,61 @@ class Author < Parse::Object
|
|
|
1007
1017
|
end
|
|
1008
1018
|
```
|
|
1009
1019
|
|
|
1020
|
+
### `unique_index_on` — the `first_or_create!` correctness floor
|
|
1021
|
+
|
|
1022
|
+
`unique_index_on(*fields, sparse: false, partial: nil, name: nil)` declares
|
|
1023
|
+
a unique index on the exact dedup tuple that `first_or_create!` and
|
|
1024
|
+
`create_or_update!` key on. It is thin sugar over
|
|
1025
|
+
`mongo_index(*fields, unique: true, …)` — same registration, same validation
|
|
1026
|
+
(sensitive-field guard, pointer auto-rewrite, parallel-array / relation /
|
|
1027
|
+
`_id` rejection), same `apply_indexes!` writer path — but the name states the
|
|
1028
|
+
intent: these fields are the create-or-update identity.
|
|
1029
|
+
|
|
1030
|
+
```ruby
|
|
1031
|
+
class Subscription < Parse::Object
|
|
1032
|
+
property :email, :string
|
|
1033
|
+
belongs_to :tenant, as: :user
|
|
1034
|
+
|
|
1035
|
+
unique_index_on :email, :tenant # key: { email: 1, _p_tenant: 1 } unique
|
|
1036
|
+
end
|
|
1037
|
+
|
|
1038
|
+
Subscription.apply_indexes! # provisions the index via the writer gate
|
|
1039
|
+
```
|
|
1040
|
+
|
|
1041
|
+
**Why it matters.** The Redis-backed `synchronize:` lock on `first_or_create!`
|
|
1042
|
+
is a *latency optimization*: in the common path it collapses concurrent
|
|
1043
|
+
callers so only one issues the create. The unique index is the *correctness
|
|
1044
|
+
floor* that survives the lock being bypassed — a Redis outage, a TTL expiring
|
|
1045
|
+
between the existence check and the write, a caller passing
|
|
1046
|
+
`synchronize: false`, or two app servers whose lock secrets disagree. When a
|
|
1047
|
+
race slips past the lock, the loser's insert fails with `DuplicateValue`
|
|
1048
|
+
(Parse error 137), which `first_or_create!` rescues and resolves to the
|
|
1049
|
+
winning row. Lock plus index make the net invariant — *exactly one row, every
|
|
1050
|
+
caller sees the same id* — hold under any race, not just the happy path.
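The rescue-and-requery pattern can be simulated without Parse or MongoDB. In the sketch below, `FakeCollection` is a hypothetical in-memory stand-in whose `insert` behaves like a unique index, and `first_or_create` is a simplified imitation of the recovery path, not the gem's implementation. No application-level lock is taken, so every caller races.

```ruby
class DuplicateValue < StandardError; end

# In-memory stand-in for a collection with a unique index on the email key.
class FakeCollection
  def initialize
    @rows = {}
    @guard = Mutex.new   # models the database's atomic uniqueness check
    @next_id = 0
  end

  def find(email)
    @guard.synchronize { @rows[email] }
  end

  # A second insert for the same key fails, as a unique index would.
  def insert(email)
    @guard.synchronize do
      raise DuplicateValue, email if @rows.key?(email)
      @rows[email] = { id: (@next_id += 1), email: email }
    end
  end

  def count
    @guard.synchronize { @rows.size }
  end
end

# find, then create; on a duplicate, re-run the same query to get the winner.
def first_or_create(coll, email)
  coll.find(email) || coll.insert(email)
rescue DuplicateValue
  coll.find(email)
end

coll = FakeCollection.new
ids = Array.new(8) do
  Thread.new { first_or_create(coll, "a@example.com")[:id] }
end.map(&:value)
```

However the threads interleave, `coll.count` ends at 1 and every entry in `ids` is the same, which is the invariant the paragraph above describes.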
|
|
1051
|
+
|
|
1052
|
+
**Defaults are non-sparse, on purpose.** The index key is kept identical to
|
|
1053
|
+
the query `first_or_create!` re-runs on recovery (`_scoped_first` on the same
|
|
1054
|
+
`query_attrs`), so a 137 always corresponds to a row the recovery query can
|
|
1055
|
+
find. A sparse or partial index that fires on a condition the recovery query
|
|
1056
|
+
doesn't reproduce would surface a 137 the rescue can't resolve, and the error
|
|
1057
|
+
would re-raise. `sparse:` only changes behavior for a document missing *every*
|
|
1058
|
+
field in the tuple — a compound sparse index indexes a doc when it has at
|
|
1059
|
+
least one key, and `first_or_create!` always writes the full tuple, so sparse
|
|
1060
|
+
never weakens the floor. Leave it off unless out-of-band writers create
|
|
1061
|
+
tuple-less rows you want excluded.
|
|
1062
|
+
|
|
1063
|
+
For "unique within a subset" — unique email per tenant, but rows with no
|
|
1064
|
+
tenant may repeat — a partial filter is the right tool, **not** `sparse:`
|
|
1065
|
+
(a compound sparse index still collides two rows that share the present
|
|
1066
|
+
fields). You own the filter's lifecycle and must keep the recovery query
|
|
1067
|
+
consistent with it:
|
|
1068
|
+
|
|
1069
|
+
```ruby
|
|
1070
|
+
# Unique email per tenant; tenant-less rows are not constrained.
|
|
1071
|
+
unique_index_on :email, :tenant,
|
|
1072
|
+
partial: { "_p_tenant" => { "$exists" => true } }
|
|
1073
|
+
```
|
|
1074
|
+
|
|
1010
1075
|
### Migrator: `indexes_plan` (dry-run) / `apply_indexes!` (mutate)
|
|
1011
1076
|
|
|
1012
1077
|
`Parse::Schema::IndexMigrator` reconciles declared indexes against the
|
|
@@ -66,6 +66,7 @@ paying for an index nobody uses.
|
|
|
66
66
|
| Regular B-tree | `mongo_index :field` | Equality, range, sort on a scalar field |
|
|
67
67
|
| Compound | `mongo_index :a, :b, :c` | Multi-field queries with a common prefix |
|
|
68
68
|
| Unique | `mongo_index :field, unique: true` | Enforce uniqueness at the DB layer |
|
|
69
|
+
| Unique dedup floor | `unique_index_on :a, :b` | Name the `first_or_create!` / `create_or_update!` dedup tuple; sugar for `unique: true` (non-sparse) |
|
|
69
70
|
| Sparse | `mongo_index :field, sparse: true` | Field present on only some documents |
|
|
70
71
|
| Partial | `mongo_index :field, partial: { … }` | Index only documents matching a filter |
|
|
71
72
|
| TTL | `mongo_index :field, expire_after: N` | Auto-delete documents N seconds after the timestamp |
|
|
@@ -318,7 +319,27 @@ fails.
|
|
|
318
319
|
|
|
319
320
|
**Better:** `unique: true, sparse: true` for "unique when present".
|
|
320
321
|
This is exactly what `parse_reference` auto-registers, and it's the
|
|
321
|
-
right pattern for any optional uniqueness constraint
|
|
322
|
+
right pattern for any optional uniqueness constraint *on a single
|
|
323
|
+
field*.
|
|
324
|
+
|
|
325
|
+
**Sparse does NOT generalize to compound keys.** A compound sparse
|
|
326
|
+
index excludes a document only when it is missing *every* indexed
|
|
327
|
+
field; a document that has at least one key is still indexed. So for a
|
|
328
|
+
two-field tuple, two rows that share the present field and both omit
|
|
329
|
+
the other still collide under `sparse: true`. For "unique within a
|
|
330
|
+
subset" — e.g. unique `email` per `tenant`, but tenant-less rows may
|
|
331
|
+
repeat — use a **partial filter**, not sparse:
|
|
332
|
+
|
|
333
|
+
```ruby
|
|
334
|
+
unique_index_on :email, :tenant,
|
|
335
|
+
partial: { "_p_tenant" => { "$exists" => true } }
|
|
336
|
+
```
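The difference between the two options can be checked with a small model of which documents each index covers. This is a plain-Ruby simulation of the rule stated above, with hypothetical helper names; it does not talk to MongoDB.

```ruby
# Index key for a compound index; a missing field is keyed as nil.
def index_key(doc, fields)
  fields.map { |f| doc[f] }
end

# Compound sparse: a document is skipped only when it lacks every indexed field.
def sparse_indexed?(doc, fields)
  fields.any? { |f| doc.key?(f) }
end

# Partial filter { "_p_tenant" => { "$exists" => true } }.
def partial_indexed?(doc)
  doc.key?("_p_tenant")
end

# True when two covered documents share the same index key.
def unique_violation?(docs, fields, &indexed)
  keys = docs.select(&indexed).map { |d| index_key(d, fields) }
  keys.uniq.size < keys.size
end

fields = ["email", "_p_tenant"]
tenantless = [{ "email" => "a@example.com" }, { "email" => "a@example.com" }]

sparse_collides  = unique_violation?(tenantless, fields) { |d| sparse_indexed?(d, fields) }
partial_collides = unique_violation?(tenantless, fields) { |d| partial_indexed?(d) }
```

Here `sparse_collides` is `true` and `partial_collides` is `false`: both tenant-less rows are still covered by the sparse index and share the key `["a@example.com", nil]`, while the partial filter leaves them out of the index entirely.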
|
|
337
|
+
|
|
338
|
+
For the `first_or_create!` / `create_or_update!` dedup tuple, prefer
|
|
339
|
+
`unique_index_on` (sugar for `unique: true`, **non-sparse** by default
|
|
340
|
+
so the index key matches the query the upsert re-runs on recovery). It
|
|
341
|
+
is the durable correctness floor behind the synchronize-create lock —
|
|
342
|
+
see the MongoDB Direct guide for the full rationale.
|
|
322
343
|
|
|
323
344
|
### Geo without proper coordinate order
|
|
324
345
|
|
data/docs/usage_guide.md
CHANGED
|
@@ -133,6 +133,21 @@ song = Song.first_or_create!({ title: "My Song" }, { artist: "Unknown" })
|
|
|
133
133
|
song = Song.create_or_update!({ title: "My Song" }, { plays: 100 })
|
|
134
134
|
```
|
|
135
135
|
|
|
136
|
+
Under concurrency these have a TOCTOU window. Pass `synchronize: true` to
|
|
137
|
+
serialize the find→create→save through a Moneta-backed lock, and declare a
|
|
138
|
+
`unique_index_on` on the dedup tuple as the durable correctness floor — the lock
|
|
139
|
+
optimizes latency, the unique index guarantees a single row even if the lock is
|
|
140
|
+
bypassed:
|
|
141
|
+
|
|
142
|
+
```ruby
|
|
143
|
+
class Song < Parse::Object
|
|
144
|
+
property :title, :string
|
|
145
|
+
unique_index_on :title # provisioned via Song.apply_indexes!
|
|
146
|
+
end
|
|
147
|
+
|
|
148
|
+
Song.first_or_create!({ title: "My Song" }, { artist: "Unknown" }, synchronize: true)
|
|
149
|
+
```
|
|
150
|
+
|
|
136
151
|
## ACLs (Access Control)
|
|
137
152
|
|
|
138
153
|
```ruby
|
|
Binary file
|