RubyGems - parse-stack-next - Versions diffs - 5.3.0 → 5.4.0 - Mend

parse-stack-next 5.3.0 → 5.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (64)

checksums.yaml +4 -4
data/.gitignore +2 -0
data/CHANGELOG.md +461 -0
data/Gemfile +7 -0
data/Gemfile.lock +12 -4
data/README.md +160 -3
data/Rakefile +52 -3
data/docs/atlas_vector_search_guide.md +86 -2
data/docs/client_sdk_guide.md +5 -0
data/docs/mcp_guide.md +59 -4
data/docs/mongodb_direct_guide.md +93 -1
data/docs/usage_guide.md +11 -1
data/docs/webhooks_guide.md +418 -0
data/examples/README.md +46 -0
data/examples/basic_client.rb +93 -0
data/examples/basic_server.rb +109 -0
data/examples/live_query_listener.rb +98 -0
data/examples/rag_chatbot.rb +221 -0
data/examples/webhook_server.rb +111 -0
data/lib/parse/agent/mcp_rack_app.rb +285 -62
data/lib/parse/agent/tools.rb +45 -5
data/lib/parse/api/aggregate.rb +7 -1
data/lib/parse/api/cloud_functions.rb +12 -4
data/lib/parse/api/hooks.rb +46 -9
data/lib/parse/api/objects.rb +16 -2
data/lib/parse/api/path_segment.rb +33 -0
data/lib/parse/api/server.rb +94 -0
data/lib/parse/api/users.rb +58 -2
data/lib/parse/atlas_search.rb +7 -7
data/lib/parse/client/body_builder.rb +5 -0
data/lib/parse/client/protocol.rb +4 -0
data/lib/parse/client.rb +55 -2
data/lib/parse/embeddings/spend_cap.rb +255 -0
data/lib/parse/embeddings.rb +1 -0
data/lib/parse/live_query/client.rb +3 -1
data/lib/parse/live_query/subscription.rb +32 -5
data/lib/parse/model/acl.rb +4 -2
data/lib/parse/model/classes/audience.rb +52 -4
data/lib/parse/model/classes/user.rb +180 -3
data/lib/parse/model/core/embed_managed.rb +113 -0
data/lib/parse/model/core/querying.rb +3 -1
data/lib/parse/model/core/vector_searchable.rb +161 -0
data/lib/parse/model/object.rb +28 -5
data/lib/parse/mongodb.rb +7 -1
data/lib/parse/pipeline_security.rb +5 -3
data/lib/parse/query/constraints.rb +29 -0
data/lib/parse/query.rb +265 -27
data/lib/parse/retrieval/agent_tool.rb +49 -0
data/lib/parse/retrieval/reranker/cohere.rb +218 -0
data/lib/parse/retrieval/reranker.rb +157 -0
data/lib/parse/retrieval/retriever.rb +110 -23
data/lib/parse/stack/version.rb +1 -1
data/lib/parse/stack.rb +17 -0
data/lib/parse/two_factor_auth/user_extension.rb +123 -31
data/lib/parse/vector_search/hybrid.rb +578 -0
data/lib/parse/webhooks/payload.rb +252 -7
data/lib/parse/webhooks/trigger_audit.rb +502 -0
data/lib/parse/webhooks.rb +215 -3
data/scripts/docker/Dockerfile.parse +5 -1
data/scripts/docker/docker-compose.test.yml +31 -0
data/scripts/docker/docker-compose.verifyemail.yml +4 -0
data/scripts/docker/preflight.sh +76 -0
data/scripts/start-parse.sh +52 -4
metadata +15 -1

data/README.md CHANGED

@@ -4,6 +4,15 @@
 A full-featured Ruby client SDK for [Parse Server](http://parseplatform.org/). [parse-stack-next](https://github.com/neurosynq/parse-stack-next) is a Ruby client SDK, REST client, and Active Model ORM for [Parse Server](http://parseplatform.org/), combining a low-level API client, a query engine, an object-relational mapper (ORM), and a Cloud Code Webhooks rack application in a single gem.
+### What's new in 5.4
+- **5.4.0 — Hybrid search + reranking for RAG** — `Class.hybrid_search(text:, lexical:, vector:, k:, fusion:)` fuses a lexical Atlas Search branch with a `$vectorSearch` branch using reciprocal-rank fusion (RRF): lexical search nails exact tokens (codes, proper nouns), vector search nails paraphrase, and fusing the two beats either alone. Each branch enforces ACL/CLP independently before fusion (no separate hydration fetch to secure); results carry `#hybrid_score` / `#hybrid_ranks`. `Parse::VectorSearch::Hybrid.rank_fusion_supported?` detects Atlas 8.0+ native `$rankFusion` by a cached behavioural probe (native execution is opt-in; client-side RRF is the always-enforced default). `Parse::Retrieval::Reranker` adds cross-encoder reranking (`Reranker::Cohere` over `/v2/rerank`, plus a deterministic `Reranker::Fixture`), wired into `Parse::Retrieval.retrieve(hybrid:, rerank:)`. `Parse::Embeddings::SpendCap` adds an opt-in per-tenant embedding token cap (hard-refuse) at the `semantic_search` agent-tool boundary. See [CHANGELOG.md](./CHANGELOG.md) and [`docs/atlas_vector_search_guide.md`](./docs/atlas_vector_search_guide.md)
+- **5.4.0 — Vector backfill, visibility, and webhook redaction** — `Class.embed_pending!` backfills embeddings for records whose managed `:vector` field is null (objectId-cursor pagination); `Parse::Object#compute_embedding!` forces an in-place recompute without a save; `vector_visibility :owner_only | :public` controls whether a class's vectors appear in `as_json` by default; and webhook trigger payloads now strip declared `:vector` columns by default (a `:public` class keeps them). See [CHANGELOG.md](./CHANGELOG.md)
+- **5.4.0 — TOTP multi-factor auth works end to end** — the `Parse::User` MFA lifecycle is now fully functional and exercised against a real MFA-enabled Parse Server. `setup_mfa!(secret:, token:)` enrolls TOTP and returns recovery codes; `Parse::User.login_with_mfa(user, pass, code)` completes a second-factor login; `mfa_enabled?` / `mfa_status` report enrollment after an ordinary fetch — the SDK strips the raw TOTP secret and recovery codes that Parse Server returns in `authData` but preserves a leak-safe `{status: "enabled"}` projection so the status reads correctly without exposing the secret; `disable_mfa!(current_token:)` turns MFA off after re-validating the current code (a wrong code raises `Parse::MFA::VerificationError`), and `disable_mfa_master_key!(authorized_by:)` is the operator override. MFA writes also no longer raise an internal argument error before reaching the server. Interactively, `rake client:console` now prompts for a TOTP / recovery code (or reads `PARSE_LOGIN_MFA`) when logging into an enrolled account. See [CHANGELOG.md](./CHANGELOG.md)
+- **5.4.0 — Request email-address verification** — `Parse::User.request_email_verification(email)` and the instance `Parse::User#request_email_verification` ask Parse Server to (re)send the verification email for a registered, not-yet-verified user, mirroring `request_password_reset` (per-email rate limiting, Boolean return). Requires a server email adapter with `verifyUserEmails` enabled. See [CHANGELOG.md](./CHANGELOG.md)
+- **5.4.0 — Audience hash queries persist correctly** — `Parse::Audience#query` is now stored as a JSON string on the wire to match Parse Server's `_Audience.query` column type, so saving an audience with a `Hash` query no longer fails the server schema check. The public API is unchanged — assign a `Hash`, read a `Hash` back. See [CHANGELOG.md](./CHANGELOG.md)
+- **5.4.0 — Faster AtlasSearch role-cache expiry** — `Parse::AtlasSearch` `role_cache_ttl` now defaults to 30 seconds (was 120) so a role grant or revoke is reflected in `$search` ACL decisions sooner, at the cost of slightly more frequent role lookups. See [CHANGELOG.md](./CHANGELOG.md)
 ### What's new in 5.3
 - **5.3.0 — Run webhook handlers (and clients) as the calling user** — Parse Server embeds the caller's live session token in every trigger webhook fired by a logged-in user. A handler can now opt in to acting on the server *as that user* — full ACL/CLP/`protectedFields` enforcement, no master key. `payload.session_token` exposes the captured token (`nil` for master-key requests; still scrubbed from `payload.user`/`payload.object`/`as_json`/logs); `payload.user_agent` returns a client-mode `Parse::Agent`, and `payload.user_client` a non-master `Parse::Client` with the token **bound** so even raw REST calls authorize as the user. The same user-scoped client is available client-side via `Parse::User#session_client` and the `Parse::Client#become(token)` primitive, with `Parse::Client#with_session { … }` for block scoping. Backed by a new `Parse::Client.new(session_token:)` option. See [Acting as the calling user](#acting-as-the-calling-user)
@@ -209,9 +218,20 @@ result = Parse.call_function :myFunctionName, {param: value}
 ```
+## Examples
+Runnable, self-contained scripts live in [`examples/`](examples/) — see
+[`examples/README.md`](examples/README.md) for the full index. Highlights:
+- [`basic_server.rb`](examples/basic_server.rb) — master-key setup: models, schema, CRUD + queries.
+- [`basic_client.rb`](examples/basic_client.rb) — unprivileged client with row-level ACL enforcement.
+- [`live_query_listener.rb`](examples/live_query_listener.rb) — interactive LiveQuery console scoped to a user's session.
+- [`rag_chatbot.rb`](examples/rag_chatbot.rb) — retrieval-augmented generation with `semantic_search` + an OpenAI/Anthropic add-in.
+- [`transaction_example.rb`](examples/transaction_example.rb) — atomic multi-object transactions.
 ## Release History
-**Current version: 5.0.1** | **Ruby 3.2+ required**
+**Current version: 5.4.0** | **Ruby 3.2+ required**
 The 5.0 highlights (vector search / RAG, pooled Redis cache, AS::N instrumentation, MCP transport hardening, GraphQL type generation) are summarized in the [What's new in 5.0](#whats-new-in-50) section above. Earlier releases are recorded below.
@@ -1586,8 +1606,11 @@ user.mfa_status    # => :enabled, :disabled, or :unknown
 # Disable MFA (requires current token)
 user.disable_mfa!(current_token: "123456")
-# Admin reset (master key) — authorized_by must be a Parse::User
-user.disable_mfa_master_key!(authorized_by: admin_user)
+# Admin reset (master key) — fails closed: pass either an admin_role:
+# for the library to verify, or allow_unverified: true to assert that
+# you have already authorized the operator out-of-band.
+user.disable_mfa_master_key!(authorized_by: admin_user, admin_role: "Admin")
+# or: user.disable_mfa_master_key!(authorized_by: admin_user, allow_unverified: true)
 ```
 **SMS MFA (requires Parse Server SMS callback):**
@@ -4917,6 +4940,32 @@ The `parse_object` handed to your handler is the **full object as Parse Server s
 For any `after_*` hook, return values are not needed since Parse does not utilize them. You may also register as many `after_save` or `after_delete` handlers as you prefer; all of them will be called.
+For `before_save` (and functions), the handler's value **is** the response Parse Server acts on — return the (possibly mutated) `parse_object` to allow the write, or `false` / `error!` to reject it. You can set that value with an explicit `return` or as the block's last expression; both work, as do the proc idioms `next value` / `break value`:
+```ruby
+Parse::Webhooks.route :before_save, :Artist do
+  artist = parse_object
+  return artist if artist.name.present?   # explicit early return
+  error! "name is required"               # raise to reject the save
+end
+```
+`self` inside the block is the `Parse::Webhooks::Payload`, so `parse_object`, `params`, and `error!` are available directly. As anywhere in Ruby, `return` ends the handler immediately — to run work *after* the response is sent, use `after_response` (below) rather than code after the `return`.
+#### Deferring work until after the response
+`payload.after_response { … }` (alias `defer`) registers work to run **after** the webhook response has been sent, off the critical path of the save or function the client is waiting on. The handler returns its value synchronously; the deferred block runs afterward — ideal for search indexing, cache warming, or fan-out that should not add latency.
+```ruby
+Parse::Webhooks.route :after_save, :Post do
+  post = parse_object
+  after_response { SearchIndex.reindex(post.id) }   # runs after the reply is sent
+  post
+end
+```
+Under Puma or Unicorn the block runs via `rack.after_reply` once the response is flushed (same worker thread, zero added round-trip latency); on a server without it (e.g. WEBrick) it falls back to a detached thread. Multiple blocks run in order and are isolated — one raising affects neither the response nor the others. Notes: deferred blocks run **only on the success path** (a rejected `before_save` runs none), "after the response" is **not** "after the row commits" (don't rely on the persisted row inside the block), and the work is **in-process and best-effort** — it dies with the worker, so for anything that *must* happen use a durable job queue (Sidekiq / ActiveJob). Blocks are drained only when the payload runs through the mounted `Parse::Webhooks` Rack app (a no-op under direct `run_function` / `call_route`). See the [Cloud Code Webhooks Guide](docs/webhooks_guide.md#deferring-work-until-after-the-response).
 > **Your model's `after_save` callbacks run here too.** When an `after_save` /
 > `after_create` trigger fires, the webhook rebuilds the `Parse::Object` from the
 > payload and runs that model's ActiveModel `after_save` / `after_create`
@@ -4928,6 +4977,57 @@ For any `after_*` hook, return values are not needed since Parse does not utiliz
 > for saves from other clients (JS / iOS / REST), the webhook runs them, since
 > the SDK never had the chance.
+#### ActiveModel callbacks vs. Parse Server triggers
+The SDK exposes the full ActiveModel lifecycle on every model
+(`before_validation`, `before_save`/`after_save`, `before_create`/`after_create`,
+`before_update`/`after_update`, `before_destroy`/`after_destroy`). Parse Server,
+separately, exposes a fixed set of **webhook trigger types**. They are not
+one-to-one — the SDK maps between them, and a webhook must be **registered** for
+your ActiveModel logic to run server-side for non-Ruby clients (JS / iOS / REST /
+Dashboard). Without a registered webhook, that logic runs only in the Ruby
+process that initiated the save.
+Supported Parse Server trigger types: `beforeSave`/`afterSave`,
+`beforeDelete`/`afterDelete`, `beforeFind`/`afterFind`, `beforeLogin`/`afterLogin`,
+`afterLogout`, `beforePasswordResetRequest`, `beforeConnect`,
+`beforeSubscribe`/`afterEvent`, and file triggers on the `@File` pseudo-class.
+The **authentication** triggers (`beforeLogin`/`afterLogin`/`afterLogout`/
+`beforePasswordResetRequest`) and **LiveQuery** triggers (`beforeConnect`/
+`beforeSubscribe`/`afterEvent`) route as first-class shapes — predicates
+(`before_login?` … `after_event?`, `auth_trigger?`/`live_query_trigger?`), an
+`event` accessor, and top-level `sessionToken` capture into `payload.session_token`.
+None of them run ActiveModel `save`/`create`/`destroy` callbacks, even though the
+auth triggers carry a `_User`/`_Session`. Parse Server **ignores the response body**
+for all of them, so the only signal that affects the operation is rejection, and
+only on the `before*` variants: returning `false` (or calling `error!`) from a
+`before_login`/`before_connect`/`before_subscribe`/`before_password_reset_request`
+handler denies the operation, while anything else is a success no-op. (LiveQuery
+triggers are delivered over HTTP only in a co-located single-process setup;
+`beforeConnect` is effectively in-process-only.)
+Key relationship — **`beforeSave`/`afterSave` carry the create variants**. Parse
+Server has **no `beforeCreate`/`afterCreate` trigger** (it rejects them). The SDK
+runs your `before_create`/`after_create` callbacks *inside* the
+`beforeSave`/`afterSave` handler for new objects, in ActiveModel order
+(`before_save → before_create`, `after_create → after_save`). So **registering a
+`beforeSave` webhook enables both `before_save` and `before_create`**, and
+`afterSave` enables both `after_save` and `after_create`. Requesting a create
+webhook raises with guidance pointing you at the save trigger.
+> **`after_save` is synchronous and on the critical path.** Parse Server waits
+> for the webhook to return before completing the client's write — even on
+> `afterSave`, whose return value is a no-op. Treat `after_save` as a place to
+> **enqueue** background work, not to run long logic inline, and avoid saving
+> other objects inside it (each cascading save fires more webhooks). `beforeSave`
+> can mutate or reject the write, so it is necessarily inline — keep it lean.
+For the full picture — trigger types, registration, the synchronous-latency
+model, the Ruby-initiated dedup, and inbound replay/freshness protection — see
+the [Cloud Code Webhooks Guide](docs/webhooks_guide.md) and
+[`examples/webhook_server.rb`](examples/webhook_server.rb).
 #### Trigger object state
 Because the trigger payload is server-authoritative, the `parse_object` your
@@ -5764,6 +5864,13 @@ The integration tests use Docker Compose to spin up a Parse Server instance with
 - Docker and Docker Compose installed
 - Ruby environment with bundler
+> **Always run the suite with `bundle exec`.** Newer `minitest` (6.0+) moved
+> `minitest/mock` out into a separate gem, so a bare `ruby`/`rake` invocation
+> activates minitest 6 and then fails to load `minitest/mock`, aborting every
+> test at load time with `cannot load such file -- minitest/mock (LoadError)`.
+> Running through bundler pins the locked versions and avoids this. If you hit
+> that LoadError, prefix the command with `bundle exec`.
 #### Setup and Running Tests
 1. **Enable Docker Tests**: Set the environment variable to enable Docker-based tests:
@@ -5848,6 +5955,56 @@ docker compose -f scripts/docker/docker-compose.test.yml up -d
 curl -s http://localhost:29337/parse/health   # -> {"status":"ok"}
 ```
+#### Network Exposure and the Preflight Guard
+Every service binds to loopback (`127.0.0.1`) by default, and the default
+credentials above are committed to this repository — safe in combination,
+since nothing off the host can reach them. Each bind is overridable
+(`PARSE_BIND`, `MONGO_BIND`, `REDIS_BIND`, `DASHBOARD_BIND`) for the
+occasional need to attach a remote client while debugging.
+That override is a footgun: pointing a bind at `0.0.0.0` while the default
+credentials are still in force would publish an admin-credentialed stack
+(Mongo `admin:password`, master key `psnextItMasterKey`) onto your LAN. A
+`preflight` service runs before anything else and **refuses to start the
+stack** in exactly that case. To proceed, do one of:
+```bash
+# 1. Keep it loopback (the default) — just omit the *_BIND override.
+# 2. Supply real credentials instead of the committed test defaults.
+PARSE_MASTER_KEY="$(openssl rand -hex 24)" \
+MONGO_ROOT_PASSWORD="$(openssl rand -hex 24)" \
+MONGO_BIND=0.0.0.0 \
+  docker compose -f scripts/docker/docker-compose.test.yml up -d
+# 3. Acknowledge the exposure on a trusted, isolated network.
+ALLOW_INSECURE_BIND=1 MONGO_BIND=0.0.0.0 \
+  docker compose -f scripts/docker/docker-compose.test.yml up -d
+```
+#### Secret Injection (real credentials)
+The committed defaults are deliberately non-secret, so the loopback stack
+needs no secrets manager. If you point the stack at *real or shared*
+credentials (option 2 above, or a staging Mongo), keep them out of your
+shell history and the compose file by injecting them at launch. The stack
+reads plain environment variables, so any injector works:
+```bash
+# 1Password CLI — secrets resolved from an op:// .env reference file.
+op run --env-file=.env.secrets -- \
+  docker compose -f scripts/docker/docker-compose.test.yml up -d
+# Doppler — secrets pulled from a configured project/config.
+doppler run -- \
+  docker compose -f scripts/docker/docker-compose.test.yml up -d
+```
+Use the committed `.env.sample` as the reference for which variables each
+side expects; copy it to a gitignored `.env` (or an `op://`-referenced
+`.env.secrets`) and fill in real values there.
 #### Environment Variables
 The defaults above are baked into the Compose file and the test helpers, so the
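
The deferral rules stated in the README hunk above (blocks run in registration order, each block is isolated from the others, and nothing runs on a rejected request) can be illustrated with a small plain-Ruby sketch. `DeferredWork` and its `drain!` method are hypothetical names used only for illustration; they are not the gem's implementation, and the real `Parse::Webhooks::Payload` drains through `rack.after_reply` or a detached thread rather than an explicit call.

```ruby
# Hypothetical illustration only; not part of parse-stack-next.
class DeferredWork
  attr_reader :errors

  def initialize
    @blocks = []
    @errors = []
  end

  # Register a block to run once the response has been produced.
  def after_response(&block)
    raise ArgumentError, "block required" unless block
    @blocks << block
    self
  end
  alias defer after_response

  # Run the registered blocks in registration order and return how many
  # completed. A raising block is recorded and skipped, so it affects
  # neither the caller nor the remaining blocks.
  def drain!(success: true)
    blocks, @blocks = @blocks, []
    return 0 unless success # a rejected request runs no deferred work

    completed = 0
    blocks.each do |blk|
      begin
        blk.call
        completed += 1
      rescue StandardError => e
        @errors << e
      end
    end
    completed
  end
end

log = []
work = DeferredWork.new
work.after_response { log << :reindex }
work.defer { raise "cache backend down" }
work.after_response { log << :fan_out }
completed = work.drain!
```

Because the queue lives in process memory, this sketch shares the limitation the README calls out: anything that must happen belongs in a durable job queue instead.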

data/Rakefile CHANGED

@@ -77,12 +77,57 @@ def client_console_token!
       pwd = $stdin.gets.to_s
     end
   end
-  u = Parse::User.login(user, pwd.chomp)
+  u = console_login_with_optional_mfa(user, pwd.chomp)
   abort "[client:console] login failed for #{user.inspect}" if u.nil? || u.session_token.to_s.empty?
   puts "Logged in as #{u.username} (#{u.id})."
   u.session_token
 end
+# Log `user` in, transparently handling an MFA-enrolled account. If the server
+# reports that additional MFA auth is required, prompt for a TOTP / recovery
+# code (or read +PARSE_LOGIN_MFA+ for non-interactive use) and retry via
+# {Parse::User.login_with_mfa}. Returns a logged-in {Parse::User}, or nil when
+# the credentials themselves are rejected (so the caller's "login failed" abort
+# still fires for a bad password).
+def console_login_with_optional_mfa(user, pwd)
+  # Parse Server signals "this account needs an MFA token" two ways depending on
+  # the error code path: a returned error response ("Missing additional
+  # authData ...") or a raised Parse::Error for the OTHER_CAUSE (code <= 100)
+  # variant. Treat both as "prompt for MFA"; anything else is a real credential
+  # failure and must NOT trigger an MFA prompt.
+  mfa_indicator = /additional\s+authData|missing.*mfa|\bMFA\b/i
+  begin
+    response = Parse.client.login(user, pwd)
+    if response.success?
+      return Parse::User.with_authdata_trust { Parse::User.build(response.result) }
+    end
+    return nil unless response.error.to_s.match?(mfa_indicator)
+  rescue Parse::Error, Parse::Client::ResponseError => e
+    raise unless e.message.to_s.match?(mfa_indicator)
+  end
+  token = ENV["PARSE_LOGIN_MFA"].to_s.strip
+  if token.empty?
+    print "MFA token (authenticator code or recovery code): "
+    token = $stdin.gets.to_s.strip
+  end
+  abort "[client:console] MFA token required for #{user.inspect}" if token.empty?
+  # A wrong/expired token can surface either as Parse::MFA::VerificationError or,
+  # depending on the server error code path, as a generic Parse::Error (e.g.
+  # ServiceUnavailableError for the OTHER_CAUSE code) or a nil return. Since a
+  # token was supplied here, treat any failure as an MFA verification failure
+  # and abort cleanly rather than letting an unhandled exception escape.
+  result =
+    begin
+      Parse::User.login_with_mfa(user, pwd, token)
+    rescue Parse::MFA::VerificationError, Parse::Error => e
+      abort "[client:console] MFA verification failed for #{user.inspect}: #{e.message}"
+    end
+  abort "[client:console] MFA verification failed for #{user.inspect}" if result.nil?
+  result
+end
 # Default test task runs all tests with Docker enabled.
 #
 # `*disruptive*` tests are EXCLUDED here: they stop/restart the shared
@@ -131,7 +176,11 @@ def run_test_files!(label, files, log:)
     puts "[#{n}/#{total}] #{file}"
     puts "=" * 80
     t0 = Time.now
-    ok = system("PARSE_TEST_USE_DOCKER=true ruby -Ilib:test #{file}")
+    # Always go through `bundle exec` so the locked gem versions win. With a
+    # bare `ruby`, RubyGems activates the newest installed minitest (6.0.x),
+    # which dropped the bundled `minitest/mock`; the standalone `minitest-mock`
+    # gem then can't co-activate and `test_helper.rb` fails to load every file.
+    ok = system("PARSE_TEST_USE_DOCKER=true bundle exec ruby -Ilib:test #{file}")
     dt = Time.now - t0
     results << [file, ok, dt]
     summary = format("[%d/%d] %-4s %7.1fs  %s", n, total, ok ? "PASS" : "FAIL", dt, file)
@@ -203,7 +252,7 @@ namespace :test do
         puts "=" * 80
         # Each file runs in its own process so a server outage in one cannot
         # bleed into the next.
-        system("PARSE_TEST_USE_DOCKER=true ruby -Ilib:test #{file}") || begin
+        system("PARSE_TEST_USE_DOCKER=true bundle exec ruby -Ilib:test #{file}") || begin
           # A disruptive test may have left the server down on failure; bring
           # it back so a follow-up run / other tasks start from a clean state.
           system("docker start #{ENV["PSNEXT_PREFIX"] || "psnext-it"}-server", out: IO::NULL, err: IO::NULL)
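
The `mfa_indicator` pattern added in `console_login_with_optional_mfa` decides whether a failed login should lead to an MFA prompt or be treated as a plain credential failure. The sketch below exercises that same regular expression in isolation. The helper name `mfa_prompt_needed?` and the sample messages are assumptions made for illustration; the exact strings Parse Server returns may differ.

```ruby
# Same pattern as the Rakefile hunk above; helper and samples are hypothetical.
MFA_INDICATOR = /additional\s+authData|missing.*mfa|\bMFA\b/i

# True when an error message suggests the account needs a second factor.
def mfa_prompt_needed?(message)
  message.to_s.match?(MFA_INDICATOR)
end

samples = [
  "Missing additional authData mfa", # matches the first alternative
  "Invalid MFA token",               # matches the word-boundary alternative
  "Invalid username/password.",      # ordinary credential failure
  nil,                               # no message at all
]

decisions = samples.map { |msg| mfa_prompt_needed?(msg) }
```

Note that the word-boundary alternative is broad: any message that mentions MFA, including a rejection of a token that was already supplied, also matches.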

data/docs/atlas_vector_search_guide.md CHANGED

@@ -372,6 +372,10 @@ embed-time chunking), use one of these patterns:
 ## Retrieval (RAG)
+> For an end-to-end runnable script — managed `embed`, `agent_searchable`,
+> `semantic_search`, and an OpenAI/Anthropic generation add-in — see
+> [`examples/rag_chatbot.rb`](../examples/rag_chatbot.rb).
 `Parse::Retrieval` (`Parse::RAG` is an alias) sits on top of
 `find_similar`. `Parse::Retrieval.retrieve` embeds a natural-language
 query, runs Atlas `$vectorSearch` through `find_similar` (so ACL/CLP are
@@ -395,8 +399,88 @@ chunks = Parse::Retrieval.retrieve(
 # => Array<Parse::Retrieval::Chunk> — { id, score, content, source, metadata }
 ```
-`rerank:` and `hybrid:` are reserved on the signature and raise
-`NotImplementedError` if supplied.
+`retrieve` also accepts `hybrid:` (fuse a lexical branch with the vector
+branch — see [Hybrid search](#hybrid-search-vector--lexical) below) and
+`rerank:` (reorder retrieved documents with a cross-encoder before
+chunking — see [Reranking](#reranking)). Both were reserved in earlier
+releases and now ship in 5.4.0.
+### Hybrid search (vector + lexical)
+`Class.hybrid_search` runs a lexical Atlas Search (`$search`) branch and a
+`$vectorSearch` branch as **two independent aggregations**, then fuses
+their ranked results with reciprocal-rank fusion (RRF). Running two aggregations
+(rather than a single `$facet`) is mandatory: `$vectorSearch` is prohibited inside
+`$facet` / `$lookup` / `$unionWith` and must be stage 0 of its pipeline.
+Each branch enforces ACL/CLP/`protectedFields` independently before
+fusion (via `Parse::AtlasSearch.search` and `Parse::VectorSearch.search`),
+so the fused rows are already access-filtered — there is no separate
+hydration fetch.
+```ruby
+hits = Article.hybrid_search(
+  text:    "how do I reset my password",   # embedded for the vector branch;
+                                            # also the default lexical query
+  lexical: { index: "article_search", fields: %w[title body] },
+  vector:  { index: "article_embedding_idx", num_candidates: 200 },
+  k:       20,
+  fusion:  { k_constant: 60, weights: { lexical: 0.4, vector: 0.6 } },
+  session_token: user.session_token,        # ACL scope, applied to BOTH branches
+)
+# => Array<Parse::Object>; each carries #hybrid_score, #hybrid_ranks,
+#    and #vector_score / #search_score when that branch contributed.
+```
+**RRF math.** `fused_score(d) = Σ_b weight_b / (k_constant + rank_b(d))`,
+where `rank_b(d)` is the document's 1-based rank in branch `b`. A larger
+`k_constant` (default 60) flattens the contribution curve. `weights`
+defaults to 1.0 per branch. `Parse::VectorSearch::Hybrid.rrf` exposes the
+pure fusion if you want to fuse pre-fetched ranked lists yourself.
+**Native `$rankFusion` (Atlas 8.0+).**
+`Parse::VectorSearch::Hybrid.rank_fusion_supported?(collection)` detects
+the native server-side fusion stage via a cached behavioural probe (1-hour
+TTL — not version-string parsing). Native execution is **opt-in**
+(`fusion: { method: :rrf_native }`) and falls back to the client-side path
+when the cluster does not support it; the default `:rrf` always fuses
+client-side, which is the fully-enforced, deterministic path. `$rankFusion`
+is admitted to `PipelineSecurity::ALLOWED_STAGES` for the native path.
+`Parse::Retrieval.retrieve(hybrid: true, ...)` routes through
+`hybrid_search` and chunks the fused results; pass `hybrid: { lexical:,
+vector:, fusion: }` to configure the branches. Tenant scope is folded into
+**both** branches (the vector Atlas pre-filter and the lexical
+post-`$search` `$match`) so neither leaks cross-tenant document existence.
+### Reranking
+A reranker reorders retrieved documents by a cross-encoder relevance score
+**before** chunking. Pass any object answering
+`#rerank(query:, documents:, top_n:)` — typically a
+`Parse::Retrieval::Reranker::Base` subclass:
+```ruby
+reranker = Parse::Retrieval::Reranker::Cohere.new(
+  api_key: ENV.fetch("COHERE_API_KEY"), model: "rerank-v3.5",
+)
+chunks = Parse::Retrieval.retrieve(
+  query: "reset my password", klass: Article, k: 30,
+  rerank: reranker, rerank_top_n: 5,    # keep the 5 most relevant docs
+)
+# Reranked chunks' score is the cross-encoder relevance_score.
+```
+`Reranker::Fixture` is a deterministic, zero-network reranker (lexical
+token overlap) for tests. The `Reranker::Base` protocol validates inputs,
+bounds `top_n`, rejects out-of-range indices, and sorts descending —
+adapters implement only the network call (`#rerank_scores`).
+> **Spend cap.** The `semantic_search` agent tool charges the estimated
+> query-embedding tokens against the caller's tenant budget via
+> `Parse::Embeddings::SpendCap` (opt-in; `configure(limit_tokens:,
+> window:)`). A breach hard-refuses (surfaced to the agent as a
+> rate-limited tool error). Admin agents are exempt; direct
+> `find_similar` / `retrieve` callers are not metered.
 ### Chunkers
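
The RRF formula in the hunk above, `fused_score(d) = Σ_b weight_b / (k_constant + rank_b(d))`, can be worked through with a short standalone sketch. `rrf_fuse` is a hypothetical helper written for this illustration; the gem exposes its own fusion as `Parse::VectorSearch::Hybrid.rrf`, whose signature is not shown in this diff and is not assumed here. Ties are broken by id purely to keep the output deterministic.

```ruby
# Hypothetical helper illustrating reciprocal-rank fusion; not the gem's API.
# branches: { branch_name => [ids ordered best-first] }
def rrf_fuse(branches, k_constant: 60, weights: {})
  scores = Hash.new(0.0)
  branches.each do |name, ranked_ids|
    weight = weights.fetch(name, 1.0).to_f # weights default to 1.0 per branch
    ranked_ids.each_with_index do |id, index|
      rank = index + 1 # ranks are 1-based
      scores[id] += weight / (k_constant + rank)
    end
  end
  # Highest fused score first; ties broken by id for determinism.
  scores.sort_by { |id, score| [-score, id] }
end

fused = rrf_fuse(
  { lexical: %w[a b c], vector: %w[b d a] },
  k_constant: 60,
  weights: { lexical: 0.4, vector: 0.6 },
)
fused.map(&:first) # => ["b", "a", "d", "c"]
```

Document `b` wins because it is ranked first in the more heavily weighted vector branch and second in the lexical branch; `a` is first lexically but only third by vector similarity.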

data/docs/client_sdk_guide.md CHANGED

@@ -11,6 +11,11 @@ go over REST, and authorization is carried by the user's `sessionToken`.
 Every claim below is locked in by the integration tests under
 `test/lib/parse/client_*_integration_test.rb`.
+For a runnable starting point, see
+[`examples/basic_client.rb`](../examples/basic_client.rb) (a no-master client
+with a row-level ACL-enforcement demo) and its master-key counterpart
+[`examples/basic_server.rb`](../examples/basic_server.rb).
 ---
 ## Why a separate guide?

data/docs/mcp_guide.md CHANGED

@@ -7,7 +7,7 @@ The Model Context Protocol (MCP) is a standardized JSON-RPC 2.0-based interface
 Three deployment modes are available:
 - **Standalone HTTP server (`MCPServer`)** — a WEBrick process for dedicated MCP deployments.
-- **Rack-mountable adapter (`MCPRackApp`)** — embeds inside an existing Sinatra or Rails application.
+- **Rack-mountable adapter (`MCPRackApp`)** — embeds inside an existing Sinatra or Rails application. This is the primary deployment for the MCP 2025-06-18 Streamable HTTP transport; enable it with `transport: :streamable_http` (see [Streamable HTTP transport](#streamable-http-transport-primary)).
 - **Direct in-process dispatcher (`MCPDispatcher`)** — a pure function for in-process usage, custom transports, and testing.
 ---
@@ -191,6 +191,42 @@ map("/mcp") { run mcp_app }
 map("/")    { run ->(env) { [200, {"Content-Type" => "text/plain"}, ["ok"]] } }
 ```
+#### Streamable HTTP transport (primary)
+The MCP 2025-06-18 **Streamable HTTP** transport is the recommended transport for `MCPRackApp`. It is a single connection model in which the client `POST`s JSON-RPC requests (receiving either a buffered JSON reply or, with `Accept: text/event-stream`, a streamed SSE reply) and holds open a long-lived `GET` request to receive server-initiated notifications. Session termination is signalled with `DELETE` carrying the `Mcp-Session-Id`.
+Enable the whole transport with one switch:
+```ruby
+mcp_app = Parse::Agent.rack_app(transport: :streamable_http) do |env|
+  # ... auth factory ...
+end
+```
+`transport: :streamable_http` is exactly equivalent to `streaming: true, notifications: true` — it turns on POST→SSE streaming and the server→client `GET /` notification stream together. Add `resource_subscriptions: true` alongside it to upgrade the server→client bus from the plain notification posture to the LiveQuery-backed resource-subscription posture:
+```ruby
+mcp_app = Parse::Agent.rack_app(
+  transport: :streamable_http,
+  resource_subscriptions: true,   # optional: bridge LiveQuery resource updates
+) do |env|
+  # ...
+end
+```
+`transport:` is a closed enum:
+| Value | Effect |
+|-------|--------|
+| `:streamable_http` | Full Streamable HTTP transport (`streaming: true` + `notifications: true`). |
+| `:legacy` / `nil` (default) | Historical behavior: buffered JSON responses, no server→client stream. The standalone SSE/JSON path below remains a supported fallback. |
+Passing `transport: :streamable_http` together with an explicit `streaming:` or `notifications:` raises `ArgumentError` (the switch already owns those toggles); any value other than those in the table above also raises. The default is unchanged, so an existing `Parse::Agent.rack_app { ... }` keeps its non-streaming JSON behavior until you opt in.
+**WEBrick cannot deliver Streamable HTTP.** The switch — like `streaming:` — has no effect under the WEBrick-backed standalone `MCPServer`, which buffers responses and cannot hold the `GET` stream open. Use Puma, Falcon, or Unicorn for a real Streamable HTTP deployment.
+The remaining subsections document the individual toggles `transport: :streamable_http` consolidates, for operators who need finer control or are reading older configurations.
 #### MCP progress notifications via SSE (opt-in)
 **WEBrick cannot stream.** The standalone `MCPServer` is WEBrick-based and buffers the full response before sending. Setting `streaming: true` on an `MCPRackApp` mounted under WEBrick silently degrades to a single buffered response with concatenated SSE events. SSE streaming requires a Rack server that supports streaming response bodies — **Puma, Falcon, or Unicorn**. Verify your deployment uses one of these before relying on `streaming: true`.
@@ -537,10 +573,29 @@ Parse Server version and its `masterKeyIps` configuration.)
   soft cap *equal to* `max_concurrent_dispatchers`. So the effective steady-state
   ceiling across both surfaces is up to **2× `max_concurrent_dispatchers`** (up
   to N request-scoped SSE dispatchers plus N listening streams). Size the value
-  with that 2× factor in mind (e.g. relative to your Puma `max_threads`). Leaving
-  it unset (the default `nil`) leaves both surfaces uncapped; the app logs a
+  with that 2× factor in mind (e.g. relative to your Puma `max_threads`).
+  `max_concurrent_dispatchers:` defaults to a finite **100**
+  (`Parse::Agent::MCPRackApp::DEFAULT_MAX_CONCURRENT_DISPATCHERS`), so a
+  streaming surface is bounded out of the box — once the cap is reached a new
+  SSE request or listening stream is refused with a `503` JSON-RPC `-32000`
+  ("server busy"). Pass an explicit positive integer to resize it, or
+  `max_concurrent_dispatchers: nil` to knowingly run uncapped (the app logs a
   one-time warning at construction when a streaming or subscription/notification
-  surface is enabled without a cap.
+  surface is enabled with `nil`). A non-positive or non-integer value raises
+  `ArgumentError`.
+- **Client disconnect mid-tool-call.** When a client drops the connection while
+  a tool is still running, the SSE worker is torn down and the dispatcher's
+  cancellation token is tripped, so a cooperative tool (one that checks
+  `agent.cancelled?` at a checkpoint) exits promptly. A tool blocked inside a
+  Mongo/REST roundtrip cannot observe the token, but its slot is reclaimed when
+  the per-tool `Timeout`, the MongoDB `socket_timeout` (10s), or the REST
+  `timeout` (30s) deadline fires and the call unwinds through the driver's clean error path. The
+  orphaned dispatcher is **intentionally not force-killed**: a `Thread#kill`
+  would bypass the driver's connection-invalidation and could return a half-used
+  pooled connection to a later request. To observe how often disconnects abandon
+  in-flight work, watch the cumulative
+  `Parse::Agent::MCPRackApp.abandoned_dispatcher_count` or subscribe to the
+  `parse.agent.mcp_dispatcher_abandoned` `ActiveSupport::Notifications` event.
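The cap semantics described in the list above (finite default, `nil` for uncapped, `ArgumentError` on invalid values, `503` with JSON-RPC `-32000` on refusal) can be sketched as follows. `DispatcherSlots` and `server_busy_response` are hypothetical names used only for illustration; the gem's internal bookkeeping may differ.

```ruby
require "json"

# Hypothetical stand-in for the cap bookkeeping; not the gem's implementation.
class DispatcherSlots
  DEFAULT_MAX = 100 # mirrors the documented default

  def initialize(max = DEFAULT_MAX)
    # nil means "knowingly uncapped"; anything else must be a positive Integer.
    unless max.nil? || (max.is_a?(Integer) && max.positive?)
      raise ArgumentError, "max_concurrent_dispatchers must be a positive Integer or nil"
    end
    @max = max
    @in_use = 0
    @lock = Mutex.new
  end

  # Returns true when a slot was taken, false when the cap is reached.
  def try_acquire
    @lock.synchronize do
      return false if @max && @in_use >= @max
      @in_use += 1
      true
    end
  end

  # Gives a slot back, e.g. when a stream closes or a tool call finishes.
  def release
    @lock.synchronize { @in_use -= 1 if @in_use.positive? }
  end
end

# Rack-style triple for a refused request: HTTP 503 carrying JSON-RPC error -32000.
def server_busy_response(request_id)
  body = { jsonrpc: "2.0", id: request_id,
           error: { code: -32000, message: "server busy" } }
  [503, { "content-type" => "application/json" }, [JSON.generate(body)]]
end
```

A caller would invoke `try_acquire` before starting an SSE dispatcher or listening stream, return `server_busy_response(id)` when it yields `false`, and call `release` in an `ensure` block so the slot is returned on every exit path.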
 ### Listening-stream ownership

data/docs/mongodb_direct_guide.md CHANGED

@@ -173,6 +173,58 @@ set the same kwargs on the query for chainable composition.
 Related: `first_direct(n)` for the first N rows, `count_direct` for a
 count-only query. Both accept the same auth kwargs.
+#### Field projection: `keys` and `exclude_keys`
+The two field-selection options behave differently on the direct path
+because the direct pipeline emits `$project` only as an allowlist, never as a denylist:
+- **`keys` (allowlist)** compiles to a `$project` stage in the direct
+  pipeline, so the projection runs server-side in MongoDB — only the
+  named fields (plus the reserved envelope: `_id`, `_created_at`,
+  `_updated_at`, `_acl`) leave the database.
+- **`exclude_keys` (denylist)** has no `$project` equivalent, so Parse
+  Stack honors it as a **post-fetch sanitize**: the pipeline is
+  unchanged, and the SDK recursively strips every key with a matching
+  name from the decoded results in Ruby. The fields still travel from
+  MongoDB to the client — this is a result-shaping convenience, not a
+  data-minimization or access-control boundary.
+```ruby
+# Allowlist — projected server-side via $project
+Song.query.keys(:title, :artist).results_direct
+# Denylist — stripped client-side after fetch
+Song.query.exclude_keys(:internal_notes).results_direct
+```
+Two consequences specific to the direct path:
+1. **Recursive by name.** `exclude_keys(:name)` removes `name` at every
+   depth, including inside included/nested objects — so a query that
+   includes a pointer also strips the pointed-to object's `name`. This
+   is broader than Parse Server's REST `excludeKeys`, which is
+   path-scoped (top-level or dotted) and would leave the nested field
+   intact. The same query can therefore return different shapes on the
+   REST and direct paths.
+2. **Reserved fields are never stripped.** `objectId`, `className`,
+   `__type`, `createdAt`, `updatedAt`, `ACL`, and their Mongo
+   storage-form names (`_id`, `_created_at`, `_updated_at`, `_acl`) are
+   always retained, so excluding one of them is a no-op rather than a
+   way to break object reconstruction.
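The two consequences above can be illustrated with a small recursive strip in plain Ruby. This is a sketch under stated assumptions, not the SDK's code: `strip_keys` and `RESERVED_KEYS` are hypothetical names, and the reserved list is copied from the item above.

```ruby
require "set"

# Reserved names from the list above; they survive any exclusion.
RESERVED_KEYS = Set.new(%w[
  objectId className __type createdAt updatedAt ACL
  _id _created_at _updated_at _acl
]).freeze

# Hypothetical helper, not the gem's API: drop excluded keys by name at every depth.
def strip_keys(value, excluded)
  # Excluding a reserved name is a no-op, so remove those from the drop set first.
  drop = excluded.map(&:to_s).to_set - RESERVED_KEYS
  walk = lambda do |node|
    case node
    when Hash
      node.each_with_object({}) do |(key, child), out|
        out[key] = walk.call(child) unless drop.include?(key.to_s)
      end
    when Array
      node.map { |item| walk.call(item) }
    else
      node
    end
  end
  walk.call(value)
end

row = {
  "objectId" => "s1",
  "name" => "Song",
  "album" => { "objectId" => "a1", "name" => "LP" },
}
stripped = strip_keys(row, [:name, :objectId])
```

Here `stripped` keeps `objectId` on both the row and the nested album but has lost `name` at both levels, which is the recursive-by-name behavior that differs from the path-scoped REST `excludeKeys`.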
+The sanitize applies to the object/decoded result paths
+(`results_direct`, `first_direct`, and the auto-promoted
+`$inQuery`/`$notInQuery` aggregation). The raw aggregation accessor
+(`aggregate(...).raw`) returns documents untouched.
+Because `exclude_keys` here is a projection convenience and not an
+ACL/CLP/`protectedFields` boundary, the security contract in
+[Security](#security) is unaffected — to keep a field from leaving the
+database, use `keys` (allowlist) or `protectedFields`, not
+`exclude_keys`.
 ### `Query#aggregate(pipeline, mongo_direct: true)`
 ```ruby
@@ -233,9 +285,35 @@ raw = Parse::MongoDB.find(
 ```
 Convenience wrapper around `db.find`. Accepts `limit:`, `skip:`, `sort:`,
-`projection:`, `max_time_ms:`. When `:limit` is omitted the call applies
+`projection:`, `hint:`, `max_time_ms:`. When `:limit` is omitted the call applies
 `DEFAULT_FIND_LIMIT = 1000` and warns; pass `limit: 0` to opt out.
+### Forcing an index with `hint`
+When the query planner picks a sub-optimal index on a large collection,
+`Query#hint` forces a specific one. It applies on **both** paths — the REST body
+(`hint` parameter, Parse Server 7.4.0+) and the mongo-direct path — so a plan you
+diagnosed with `Query#explain` can be corrected without dropping to `mongosh`.
+```ruby
+# Diagnose, then force the index, on the mongo-direct path:
+Post.query(:status => "published").order(:created_at.desc).hint("status_1_created_at_-1")
+    .results_direct
+# A key pattern works too:
+Post.query(:status => "published").hint({ "status" => 1, "createdAt" => -1 }).count_direct
+```
+On the mongo-direct path the hint is forwarded to the driver as the Mongo `hint`
+option: `results_direct` / `count_direct` / `distinct_direct` pass it to
+`Parse::MongoDB.aggregate` (`hint:` → the aggregation `hint` option), and the
+primitives `Parse::MongoDB.aggregate(..., hint:)` and
+`Parse::MongoDB.find(..., hint:)` accept it directly. Either an index name (a `String`)
+or a key pattern (a `Hash`) is accepted; an unknown index name is rejected by
+MongoDB, which is the intended fail-fast signal that the hint is stale.
+`hint` is unset by default (the planner chooses); it is purely an override.
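As a rough sketch of the "purely an override" behavior, the helper below builds a driver options hash that carries `:hint` only when one was set. `aggregate_options` is a hypothetical name, and the type check raising `ArgumentError` is an assumption for illustration; the text above only states that a `String` or `Hash` is accepted.

```ruby
# Hypothetical helper: build driver options, adding :hint only when one is set.
def aggregate_options(hint: nil, max_time_ms: nil)
  # Assumed validation: an index name (String) or a key pattern (Hash).
  unless hint.nil? || hint.is_a?(String) || hint.is_a?(Hash)
    raise ArgumentError, "hint must be an index name (String) or key pattern (Hash)"
  end
  opts = {}
  opts[:hint] = hint unless hint.nil?                  # unset: the planner chooses
  opts[:max_time_ms] = max_time_ms unless max_time_ms.nil?
  opts
end

by_name    = aggregate_options(hint: "status_1_created_at_-1")
by_pattern = aggregate_options(hint: { "status" => 1, "createdAt" => -1 }, max_time_ms: 500)
no_hint    = aggregate_options
```

With no hint the options hash stays empty, so the planner's choice is left untouched.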
 ### Geo queries
 Three geo query constraints land in v4.4.0 alongside a direct
@@ -620,6 +698,20 @@ ACL/CLP enforcement if the SDK applies it.
 As of **v4.4.0**, the SDK applies that enforcement on the mongo-direct
 path when the caller supplies a scope. Five layers compose:
+> **Atlas index entry points share this enforcement.** The Atlas-index
+> stages (`$vectorSearch`, `$search`, `$rankFusion`) must be stage 0 of
+> their pipeline, so they cannot route through `Parse::MongoDB.aggregate`
+> (which prepends an ACL `$match` at stage 0). `Parse::VectorSearch.search`
+> (`find_similar`), `Parse::AtlasSearch.search`, and
+> `Parse::VectorSearch::Hybrid` (`Class.hybrid_search`, v5.4.0) therefore
+> reproduce the same enforcement chain **inline** — the ACL `_rperm`
+> `$match` is appended AFTER the index stage, and CLP / `protectedFields` /
+> the internal-fields denylist run post-fetch — so the same scope kwargs
+> (`session_token:` / `acl_user:` / `acl_role:` / `master:`) and the same
+> contract apply. Hybrid search fuses two independently-enforced branches,
+> so fused rows are already access-filtered. `$rankFusion` was added to the
+> strict-mode allowlist (Layer 1) in v5.4.0 for the opt-in native path.
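The stage ordering described in the note above can be sketched as a plain pipeline builder. `atlas_pipeline` is a hypothetical name, the `$vectorSearch` arguments are placeholders, and the exact `_rperm` filter shape and the master bypass are assumptions rather than the SDK's actual output.

```ruby
# Hypothetical builder; stage shapes are illustrative, not the gem's exact output.
def atlas_pipeline(index_stage, readers:, master: false)
  pipeline = [index_stage] # the Atlas index stage must stay at position 0
  unless master
    # Assumed read filter: public ("*") plus the caller's user id and role names.
    pipeline << { "$match" => { "_rperm" => { "$in" => ["*", *readers] } } }
  end
  pipeline
end

# Placeholder stage; index name and field path are made up for the example.
vector_stage = {
  "$vectorSearch" => { "index" => "example_index", "path" => "embedding",
                       "queryVector" => [0.1, 0.2], "numCandidates" => 50, "limit" => 5 },
}

scoped   = atlas_pipeline(vector_stage, readers: ["u1", "role:Admin"])
unscoped = atlas_pipeline(vector_stage, readers: [], master: true)
```

In `scoped` the ACL `$match` sits at index 1, after the index stage, which is the inverse of the prepend-at-stage-0 behavior described for `Parse::MongoDB.aggregate`.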
 ### Layer 1: Pipeline-security denylist (always on)
 `Parse::PipelineSecurity` refuses dangerous operators at any depth in

data/docs/usage_guide.md CHANGED

@@ -83,10 +83,20 @@ Song.query.order(:plays.desc).skip(10).limit(20).results
 # Include related objects
 Song.all(includes: [:album, :comments])
-# Select specific fields
+# Select specific fields (allowlist)
 Song.all(keys: [:title, :artist])
+# Omit specific fields (denylist)
+Song.query.exclude_keys(:internal_notes).results
 ```
+> On the mongo-direct read path, `keys` is projected server-side while
+> `exclude_keys` is applied as a recursive post-fetch sanitize (it strips
+> matching field names at every depth and never removes reserved fields
+> such as `objectId`). See the
+> [Direct MongoDB Integration Guide](mongodb_direct_guide.md) for the
+> exact semantics and how it differs from the REST path.
 ## Aggregation
 ```ruby