parse-stack-next 5.1.1 → 5.2.0
- checksums.yaml +4 -4
- data/.env.sample +12 -0
- data/.env.test +4 -4
- data/CHANGELOG.md +545 -0
- data/Gemfile +3 -0
- data/Gemfile.lock +6 -1
- data/README.md +167 -38
- data/Rakefile +56 -10
- data/docs/atlas_vector_search_guide.md +110 -9
- data/docs/mcp_guide.md +433 -0
- data/docs/mongodb_direct_guide.md +66 -1
- data/docs/mongodb_index_optimization_guide.md +22 -1
- data/docs/usage_guide.md +15 -0
- data/lib/parse/agent/approval_gate.rb +0 -0
- data/lib/parse/agent/constraint_translator.rb +90 -19
- data/lib/parse/agent/describe.rb +1 -0
- data/lib/parse/agent/errors.rb +16 -0
- data/lib/parse/agent/mcp_client.rb +9 -0
- data/lib/parse/agent/mcp_dispatcher.rb +139 -7
- data/lib/parse/agent/mcp_rack_app.rb +621 -17
- data/lib/parse/agent/mcp_subscriptions.rb +607 -0
- data/lib/parse/agent/metadata_dsl.rb +58 -0
- data/lib/parse/agent/metadata_registry.rb +141 -1
- data/lib/parse/agent/prompt_hardening.rb +213 -0
- data/lib/parse/agent/result_formatter.rb +18 -3
- data/lib/parse/agent/tools.rb +167 -24
- data/lib/parse/agent.rb +692 -21
- data/lib/parse/client/request.rb +55 -4
- data/lib/parse/client/response.rb +4 -0
- data/lib/parse/client.rb +205 -7
- data/lib/parse/model/classes/installation.rb +27 -10
- data/lib/parse/model/classes/user.rb +8 -0
- data/lib/parse/model/core/actions.rb +58 -4
- data/lib/parse/model/core/embed_managed.rb +19 -14
- data/lib/parse/model/core/indexing.rb +108 -16
- data/lib/parse/model/core/querying.rb +29 -0
- data/lib/parse/model/model.rb +34 -3
- data/lib/parse/model/object.rb +1 -0
- data/lib/parse/query.rb +90 -24
- data/lib/parse/retrieval/agent_tool.rb +369 -0
- data/lib/parse/retrieval/chunk.rb +74 -0
- data/lib/parse/retrieval/chunker.rb +208 -0
- data/lib/parse/retrieval/retriever.rb +274 -0
- data/lib/parse/retrieval.rb +10 -0
- data/lib/parse/schema.rb +69 -20
- data/lib/parse/stack/version.rb +2 -2
- data/parse-stack-next.gemspec +1 -1
- data/scripts/docker/docker-compose.atlas.yml +14 -10
- data/scripts/docker/docker-compose.test.yml +24 -20
- data/scripts/docker/mongo-init.js +3 -3
- data/scripts/start-parse.sh +10 -0
- data/scripts/start_mcp_server.rb +1 -1
- data/scripts/test_server_connection.rb +1 -1
- data/scripts/vector_prototype/create_vector_index.js +1 -1
- data/scripts/vector_prototype/fetch_embeddings.py +2 -2
- data/scripts/vector_prototype/query_prototype.rb +1 -1
- data/scripts/vector_prototype/run.sh +4 -4
- metadata +10 -2
data/README.md
CHANGED
|
@@ -4,6 +4,19 @@
|
|
|
4
4
|
|
|
5
5
|
A full-featured Ruby client SDK for [Parse Server](http://parseplatform.org/). [parse-stack-next](https://github.com/neurosynq/parse-stack-next) is a Ruby client SDK, REST client, and Active Model ORM for [Parse Server](http://parseplatform.org/), combining a low-level API client, a query engine, an object-relational mapper (ORM), and a Cloud Code Webhooks rack application in a single gem.
|
|
6
6
|
|
|
7
|
+
### What's new in 5.2
|
|
8
|
+
|
|
9
|
+
- **Retrieval layer — `Parse::Retrieval` (`Parse::RAG`)** — `Parse::Retrieval.retrieve(query:, klass:, k:, filter:, tenant_scope:, …)` embeds a natural-language query, runs Atlas `$vectorSearch` through the existing ACL-enforcing `find_similar`, and splits each retrieved document's text field into scored `Parse::Retrieval::Chunk`s. Chunking is presentation-only (embedding stays one-vector-per-record), via `Parse::Retrieval::Chunker::FixedSizeOverlap(size:, overlap:, by:, max_chunks_per_document:)` (subclass `Chunker::Base` for custom strategies). ACL is mongo-direct (no REST two-stage); tenant scope folds into the Atlas pre-filter
|
|
10
|
+
- **`semantic_search` agent tool + `agent_searchable`** — declare `agent_searchable field:, filter_fields:` on a model to expose it to the readonly, client-safe `semantic_search` tool. The handler enforces the full agent envelope: searchable-class allowlist, recursive underscore-key refusal + filter-field allowlist on input, `field_allowlist` projection plus tenant-scope re-assertion on output, and score quantization in non-admin contexts
|
|
11
|
+
- **MCP elicitation — human-in-the-loop approval** — opt in with `Parse::Agent.require_approval_for = [:write, :admin]` to require spec-native `elicitation/create` approval before destructive tool calls. A pluggable `agent.approval_gate` (reachable on the non-MCP path too) shows the dry-run diff and blocks on the client's reply; `call_method` resolves the *effective* tier from the target `agent_method`. Fails closed (no capability / no listening stream / non-streaming transport / timeout → refuse); replies are session-bound
|
|
12
|
+
- **Agent impersonation** — `Parse::Agent.new(impersonate_user:, impersonate_mint:, impersonation_label:)` / `agent.impersonate(user)` resolve a real session token for a `_User` (reuse an active `_Session`, or mint a restricted one) and bind it as if `session_token:` had been passed. Master-key-required, fail-closed, with an audit label on `parse.agent.tool_call`
|
|
13
|
+
- **`Parse::Agent::PromptHardening`** — schema-string sanitization (drops non-identifier field names, strips control/zero-width chars, marker-wraps descriptions) on `get_schema`/`get_all_schemas`; embedded-marker scrubbing of untrusted tool content (`prompt_marker_strict` to refuse); operator canary phrases (`prompt_injection_canaries` + `parse.agent.prompt_injection_detected`, `canary_action = :refuse`); `Parse::Agent::PROMPT_VERSION` via `agent.describe[:prompt][:version]`; and a one-time warning when `allowed_llm_endpoints` is unrestricted
|
|
14
|
+
- **Agent telemetry + provenance** — embedding cost on `parse.agent.tool_call` (`embed_calls` / `embed_tokens` / `embed_cost_usd` via `Parse::Agent.embed_cost_per_million_tokens`); optional per-row `_source` citations (`{ class, tool, object_id }`) on read-tool results via `Parse::Agent.include_source_provenance`
|
|
15
|
+
- **General-purpose server-initiated notifications** — `Parse::Agent::MCPRackApp.new(notifications: true)` opens the GET listening-stream bus without LiveQuery resource subscriptions; `MCPRackApp#notify(session_id, method:, params:)` pushes arbitrary `notifications/*` to a session
|
|
16
|
+
- **Token economy** — `Parse::Agent.new(tools: :lean)` narrows the readonly surface to six core tools (~7.9K → ~2.6K `tools/list` tokens); read tools strip the raw `ACL` map and `get_objects`/Atlas tools share `query_class`'s compact normalization; `semantic_search` hoists each chunk's parent into a `documents` map (sent once, not per chunk) and enforces a `max_total_tokens:` budget (default 20K) with a `budget_truncated` signal; a failing `tools/call` forwards `error_code` / `retry_after` / `details` under MCP `_meta`; `get_schema` suggests near-match class names on a typo; `Parse::Agent.measure_embeddings { … }` scopes ingestion embedding cost. See [`docs/mcp_guide.md`](./docs/mcp_guide.md#token-economy)
|
|
17
|
+
|
|
18
|
+
See [CHANGELOG.md](./CHANGELOG.md) for the full 5.2 entry.
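The fixed-size, overlapping chunking named in the retrieval entry of the 5.2 list can be pictured with a minimal sketch. The helper `fixed_size_overlap` is hypothetical and counts characters only; it is not the gem's `Chunker::FixedSizeOverlap`, and the `max_chunks:` argument merely approximates what `max_chunks_per_document:` might do.

```ruby
# Hypothetical stand-in for a fixed-size/overlap chunker (not the gem's code).
# Splits +text+ into windows of +size+ characters; consecutive windows share
# +overlap+ characters. The final window may be shorter than +size+.
def fixed_size_overlap(text, size:, overlap:, max_chunks: nil)
  raise ArgumentError, "size must be positive" unless size.positive?
  raise ArgumentError, "overlap must be in 0...size" unless (0...size).cover?(overlap)

  step = size - overlap
  chunks = []
  start = 0
  while start < text.length
    chunks << text[start, size]
    break if start + size >= text.length              # last window reached the end
    break if max_chunks && chunks.length >= max_chunks # per-document cap
    start += step
  end
  chunks
end

fixed_size_overlap("abcdefghij", size: 4, overlap: 2)
# => ["abcd", "cdef", "efgh", "ghij"]
```

Because chunking is described as presentation-only, a sketch like this would run over text that has already been retrieved; it does not change how many vectors are stored per record.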
|
|
19
|
+
|
|
7
20
|
### What's new in 5.1
|
|
8
21
|
|
|
9
22
|
- **`Parse::File` URL normalization + presigned-URL stash** — `Parse::File#url=` and `attributes=` now strip signed-URL query parameters (`X-Amz-Signature`, `AWSAccessKeyId`, `Key-Pair-Id`, etc.) before storage; the bare canonical URL lands in `@url`, and the original signed URL is stashed in `file.presigned_url` with a data-driven expiry in `file.presigned_url_expires_at`. New `file.presigned_url_valid?(buffer: 60)` predicate, configurable `Parse::File.signed_url_policy = :strip | :raise`, and `Parse::File.log_filter` / `log_filter_strict` regexes for `lograge` / Sentry / Honeybadger scrubbers. `Parse::File#inspect` no longer emits the URL — see CHANGELOG for the error-reporter payload migration callout
|
|
@@ -15,6 +28,7 @@ A full-featured Ruby client SDK for [Parse Server](http://parseplatform.org/). [
|
|
|
15
28
|
- **`Parse::Installation` `belongs_to :user`** — read `installation.user` to find which user a device is currently signed in as. Symmetric `Parse::User#has_many :installations` for targeted-push grouping (master-key-only by Parse Server design; see the YARD for the owner-identity caveat)
|
|
16
29
|
- **`Parse.setup` / `live_query_url:` fixes** — `Parse.setup` is no longer a silent no-op on re-invocation; `Parse.setup(live_query_url: …)` and `live_query: { … }` options no longer raise `ArgumentError`; `ws://` against non-loopback hosts is refused unless `live_query: { allow_insecure: true }` is also passed
|
|
17
30
|
- **MCP `structuredContent` for 5 more tools** — `aggregate`, `export_data`, `atlas_text_search`, `atlas_autocomplete`, `atlas_faceted_search` now emit `structuredContent` with declared `outputSchema`s (sixteen of the built-in catalog now structured)
|
|
31
|
+
- **MCP resource subscriptions (LiveQuery bridge)** — opt-in `Parse::Agent::MCPRackApp.new(resource_subscriptions: true)` serves `resources/subscribe` and pushes `notifications/resources/updated` over a long-lived `GET` listening stream, backed by Parse LiveQuery. Subscribing to a class's `count` / `samples` resource opens a debounced LiveQuery subscription; the `resources.subscribe` capability is advertised only when LiveQuery is enabled and available. Credential-scoped per agent — session-token agents see only readable rows, master-key agents use a dedicated admin connection, and `acl_user:` / `acl_role:` agents are refused (no LiveQuery equivalent). See [`docs/mcp_guide.md`](./docs/mcp_guide.md#resource-subscriptions-livequery-bridge)
|
|
18
32
|
- **New ACL / CLP / `protectedFields` guide** — [`docs/acl_clp_guide.md`](./docs/acl_clp_guide.md) is the canonical reference for the five enforcement layers, the system-class CLP matrix (including the hardcoded master-key-only classes), the `_User` field-visibility recipe, role hierarchy direction, and the REST-aggregate vs `Parse::MongoDB.aggregate` enforcement asymmetry
|
|
19
33
|
|
|
20
34
|
See [CHANGELOG.md](./CHANGELOG.md) for the full 5.1 entry, including breaking changes, migration callouts, and the round-by-round security review notes.
|
|
@@ -1563,8 +1577,8 @@ user.mfa_status # => :enabled, :disabled, or :unknown
|
|
|
1563
1577
|
# Disable MFA (requires current token)
|
|
1564
1578
|
user.disable_mfa!(current_token: "123456")
|
|
1565
1579
|
|
|
1566
|
-
# Admin reset (
|
|
1567
|
-
user.
|
|
1580
|
+
# Admin reset (master key) — authorized_by must be a Parse::User
|
|
1581
|
+
user.disable_mfa_master_key!(authorized_by: admin_user)
|
|
1568
1582
|
```
|
|
1569
1583
|
|
|
1570
1584
|
**SMS MFA (requires Parse Server SMS callback):**
|
|
@@ -1873,6 +1887,17 @@ band.drummer # Artist object
|
|
|
1873
1887
|
###### `:field`
|
|
1874
1888
|
This option allows you to set the name of the remote Parse column for this property. Using this will explicitly set the remote property name to the value of this option. The value provided for this option will affect the name of the alias method that is generated when `alias` option is used. **By default, the name of the remote column is the lower-first camel case version of the property name. As an example, for a property with key `:my_property_name`, the framework will implicitly assume that the remote column is `myPropertyName`.**
|
|
1875
1889
|
|
|
1890
|
+
> **Pairing `belongs_to`/`has_many` when you override `:as` or `:field`.** A
|
|
1891
|
+
> `belongs_to`'s storage column comes from its **key** (or its explicit
|
|
1892
|
+
> `:field`), *not* from the class chosen by `:as`. A `has_many` on the inverse
|
|
1893
|
+
> side independently derives the column it queries from the **owning class
|
|
1894
|
+
> name**. These two defaults only line up automatically when you don't override
|
|
1895
|
+
> them — so if you customize one side, set `has_many ..., field:` to the exact
|
|
1896
|
+
> column the `belongs_to` writes, or the `has_many` query silently returns zero
|
|
1897
|
+
> results (it queries a column that does not exist, with no error). For example,
|
|
1898
|
+
> if `Post belongs_to :author, as: :workspace` (stored in column `author`), the
|
|
1899
|
+
> inverse must be `Workspace has_many :posts, as: :post, field: :author`.
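The mismatch described in the note can be reproduced with a small sketch of the two naming defaults. The helper names (`lower_camel`, `belongs_to_column`, `has_many_default_column`) are hypothetical and only model the rules as the README states them; they are not the gem's internals.

```ruby
# Lower-first camel case, as the README describes for default column names.
def lower_camel(name)
  head, *tail = name.to_s.split("_")
  head + tail.map(&:capitalize).join
end

# Lower-case only the first character of a class name ("BlogPost" -> "blogPost").
def lower_first(str)
  str[0].downcase + str[1..-1]
end

# belongs_to: the column comes from the key (or an explicit :field), never from :as.
def belongs_to_column(key, field: nil)
  (field || lower_camel(key)).to_s
end

# has_many: by default the queried column is derived from the owning class name.
def has_many_default_column(owner_class_name)
  lower_first(owner_class_name)
end

lower_camel(:my_property_name)        # => "myPropertyName"
belongs_to_column(:author)            # => "author"    (what Post writes)
has_many_default_column("Workspace")  # => "workspace" (what Workspace would query)
```

Under these assumptions the two defaults disagree (`author` vs `workspace`), which is why the inverse side needs an explicit `field: :author`.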
|
|
1900
|
+
|
|
1876
1901
|
#### [Has One](https://neurosynq.github.io/parse-stack-next/Parse/Associations/HasOne.html)
|
|
1877
1902
|
The `has_one` creates a one-to-one association with another Parse class. This association says that the other class in the association contains a foreign pointer column which references instances of this class. If your model contains a column that is a Parse pointer to another class, you should use `belongs_to` for that association instead.
|
|
1878
1903
|
|
|
@@ -2307,7 +2332,16 @@ User.first_or_create!({ email: e }, {}, synchronize: false)
|
|
|
2307
2332
|
Parse.synchronize_classes = [User, Device, Subscription]
|
|
2308
2333
|
```
|
|
2309
2334
|
|
|
2310
|
-
The lock is a *latency optimization*; the durable correctness floor is a MongoDB unique index on the dedup tuple
|
|
2335
|
+
The lock is a *latency optimization*; the durable correctness floor is a MongoDB unique index on the dedup tuple, declared on the model with `unique_index_on`:
|
|
2336
|
+
|
|
2337
|
+
```ruby
|
|
2338
|
+
class User < Parse::Object
|
|
2339
|
+
property :email, :string
|
|
2340
|
+
unique_index_on :email # provisioned via User.apply_indexes!
|
|
2341
|
+
end
|
|
2342
|
+
```
|
|
2343
|
+
|
|
2344
|
+
When such an index exists, the synchronize wrapper rescues Parse code 137 (DuplicateValue) and re-queries inside the held lock to return the winner. On a process-local Moneta store (no Redis), the lock degrades to a per-key `Mutex` and emits a `[Parse::CreateLock]` warning. Configure `Parse.synchronize_create_secret` (or `ENV["PARSE_STACK_LOCK_SECRET"]`) to HMAC the lock keys against `query_attrs` content exposure via Redis MONITOR / snapshots.
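To illustrate why keying the lock with an HMAC hides `query_attrs` from anyone watching the lock store, here is a minimal sketch. The function `create_lock_key` and the `create_lock:` key layout are assumptions made for illustration; the gem's actual key format is not shown in this diff.

```ruby
require "openssl"
require "json"

# Hypothetical derivation of an opaque lock key from the dedup attributes.
# Only the HMAC digest reaches the lock store, so the raw attribute values
# (for example an email address) never appear in key names.
def create_lock_key(class_name, query_attrs, secret:)
  # Canonicalise so that attribute order does not change the key.
  canonical = JSON.generate(query_attrs.map { |k, v| [k.to_s, v] }.sort_by(&:first).to_h)
  digest = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, "#{class_name}:#{canonical}")
  "create_lock:#{class_name}:#{digest}"
end

key = create_lock_key("User", { email: "a@example.com" }, secret: "example-secret")
key.include?("a@example.com") # => false
```

The same attributes and secret always yield the same key, so concurrent creators still contend on one lock, while a different secret yields unrelated keys.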
|
|
2311
2345
|
|
|
2312
2346
|
### Saving
|
|
2313
2347
|
To commit a new record or changes to an existing record to Parse, use the `#save` method. The method will automatically detect whether it is a new object or an existing one and call the appropriate workflow. The use of ActiveModel dirty tracking allows us to send only the changes that were made to the object when saving. **Saving a record will take care of both saving all the changed properties, and associations. However, any modified linked objects (ex. belongs_to) need to be saved independently.**
|
|
@@ -2338,6 +2372,8 @@ To commit a new record or changes to an existing record to Parse, use the `#save
|
|
|
2338
2372
|
|
|
2339
2373
|
The save operation can handle both creating and updating existing objects. If you do not want to update the association data of a changed object, you may use the `#update` method to only save the changed property values. In the case where you want to force update an object even though it has not changed, to possibly trigger your `before_save` hooks, you can use the `#update!` method. In addition, just like with other ActiveModel objects, you may call `reload!` to fetch the current record again from the data store.
|
|
2340
2374
|
|
|
2375
|
+
> **Note:** because of dirty tracking, `#save` is a no-op when the object has no changed fields — it returns `true` **without** issuing a request. A `true` return therefore does not guarantee a server write occurred (assigning a property its current value leaves the object unchanged). To force callbacks and a write even when nothing changed, pass `save(force: true)` or use `#update!`.
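The no-op behaviour in the note can be modelled with a toy class. `SketchRecord` is hypothetical and tracks changes with a plain hash; it only mirrors the described contract (`save` returns `true` without a request when nothing changed, `force: true` writes anyway) and is not how the gem or ActiveModel implements dirty tracking.

```ruby
# Toy model of dirty tracking: counts how many "requests" a save would issue.
class SketchRecord
  attr_reader :requests

  def initialize(attrs = {})
    @attrs = attrs.dup
    @changes = {}
    @requests = 0
  end

  def set(name, value)
    return if @attrs[name] == value # same value: the object stays clean
    @changes[name] = value
    @attrs[name] = value
  end

  def changed?
    !@changes.empty?
  end

  def save(force: false)
    return true unless changed? || force # no-op, yet still reports true
    @requests += 1                       # stands in for the REST write
    @changes.clear
    true
  end
end

record = SketchRecord.new(title: "Hello")
record.set(:title, "Hello") # assigning the current value changes nothing
record.save                 # => true
record.requests             # => 0
```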
|
|
2376
|
+
|
|
2341
2377
|
### Saving applying User ACLs
|
|
2342
2378
|
You may save and delete objects from Parse on behalf of a logged in user by passing the session token to the call to `save` or `destroy`. Doing so will allow Parse to apply the ACLs of this user against the record to see if the user is authorized to read or write the record. See [Parse::Actions](https://neurosynq.github.io/parse-stack-next/Parse/Core/Actions.html).
|
|
2343
2379
|
|
|
@@ -4194,6 +4230,40 @@ You may change your local Parse ruby classes by adding new properties. To easily
|
|
|
4194
4230
|
|
|
4195
4231
|
```
|
|
4196
4232
|
|
|
4233
|
+
### Inspecting Schema Differences
|
|
4234
|
+
|
|
4235
|
+
`Parse::Schema.diff(Klass)` returns a `SchemaDiff` describing how your local
|
|
4236
|
+
model and the server schema differ:
|
|
4237
|
+
|
|
4238
|
+
- `#missing_on_server` — fields declared locally but absent on the server (what `auto_upgrade!` would add).
|
|
4239
|
+
- `#missing_locally` — columns present on the server but not declared in your model (e.g. dashboard-added fields). Informational only; never removed.
|
|
4240
|
+
- `#type_mismatches` — fields whose local type differs from the server's.
|
|
4241
|
+
- `#in_sync?` — `true` only when all three are empty (strict, **bidirectional** equality).
|
|
4242
|
+
- `#server_covers_local?` — `true` when every field your model declares is present on the server (`missing_on_server.empty? && type_mismatches.empty?`). One-way: server-only columns are ignored.
|
|
4243
|
+
- `#summary` — a human-readable report of the above.
|
|
4244
|
+
|
|
4245
|
+
```ruby
|
|
4246
|
+
diff = Parse::Schema.diff(Post)
|
|
4247
|
+
puts diff.summary
|
|
4248
|
+
diff.missing_on_server # => { published: :boolean }
|
|
4249
|
+
diff.missing_locally # => { "legacyFlag" => :boolean }
|
|
4250
|
+
```
|
|
4251
|
+
|
|
4252
|
+
**CI convergence check.** Do **not** gate CI on `in_sync?` — it is
|
|
4253
|
+
bidirectional and returns `false` whenever the server has extra columns (a
|
|
4254
|
+
dashboard-added field, or a column owned by another service), even right after
|
|
4255
|
+
a successful `auto_upgrade!`. Gate on the one-way check instead:
|
|
4256
|
+
|
|
4257
|
+
```ruby
|
|
4258
|
+
diff = Parse::Schema.diff(Post)
|
|
4259
|
+
unless diff.server_covers_local?
|
|
4260
|
+
abort "Post schema not converged:\n#{diff.summary}"
|
|
4261
|
+
end
|
|
4262
|
+
```
|
|
4263
|
+
|
|
4264
|
+
Server-only columns (`missing_locally`) are expected and safe — `auto_upgrade!`
|
|
4265
|
+
is purely additive and never drops them.
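The difference between the two predicates is easiest to see on a diff that contains only server-side extras. The `SketchDiff` struct is a hypothetical stand-in that copies the definitions given in the list above; it is not the gem's `SchemaDiff` class.

```ruby
# Stand-in for SchemaDiff using the definitions quoted in the README.
SketchDiff = Struct.new(:missing_on_server, :missing_locally, :type_mismatches, keyword_init: true) do
  # Strict, bidirectional equality: all three collections must be empty.
  def in_sync?
    missing_on_server.empty? && missing_locally.empty? && type_mismatches.empty?
  end

  # One-way check: server-only columns are ignored.
  def server_covers_local?
    missing_on_server.empty? && type_mismatches.empty?
  end
end

diff = SketchDiff.new(
  missing_on_server: {},
  missing_locally: { "legacyFlag" => :boolean }, # e.g. a dashboard-added column
  type_mismatches: {}
)
diff.in_sync?             # => false
diff.server_covers_local? # => true
```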
|
|
4266
|
+
|
|
4197
4267
|
## Push Notifications
|
|
4198
4268
|
Push notifications are implemented through the `Parse::Push` class. To send push notifications through the REST API, you must enable `REST push enabled?` option in the `Push Notification Settings` section of the `Settings` page in your Parse application. Push notifications targeting uses the Installation Parse class to determine which devices receive the notification. You can provide any query constraint, similar to using `Parse::Query`, in order to target the specific set of devices you want given the columns you have configured in your `Installation` class.
|
|
4199
4269
|
|
|
@@ -4500,7 +4570,16 @@ export PARSE_MCP_ENABLED=true
|
|
|
4500
4570
|
```
|
|
4501
4571
|
|
|
4502
4572
|
```ruby
|
|
4503
|
-
# Step 2:
|
|
4573
|
+
# Step 2: Connect to your Parse Server FIRST — the agent's tools query it,
|
|
4574
|
+
# so without an active client every tool call raises a connection error.
|
|
4575
|
+
Parse.setup(
|
|
4576
|
+
server_url: ENV["PARSE_SERVER_URL"], # e.g. "https://api.example.com/parse"
|
|
4577
|
+
application_id: ENV["PARSE_APP_ID"],
|
|
4578
|
+
api_key: ENV["PARSE_REST_API_KEY"],
|
|
4579
|
+
master_key: ENV["PARSE_MASTER_KEY"], # master-key agent (full read access)
|
|
4580
|
+
)
|
|
4581
|
+
|
|
4582
|
+
# Then enable and start the MCP server.
|
|
4504
4583
|
Parse.mcp_server_enabled = true
|
|
4505
4584
|
Parse::Agent.enable_mcp!(port: 3001)
|
|
4506
4585
|
Parse::Agent::MCPServer.run(api_key: ENV["MCP_API_KEY"])
|
|
@@ -4772,6 +4851,27 @@ You can register webhooks to handle the different object triggers: `:before_save
|
|
|
4772
4851
|
|
|
4773
4852
|
For any `after_*` hook, return values are not needed since Parse does not utilize them. You may also register as many `after_save` or `after_delete` handlers as you prefer, all of them will be called.
|
|
4774
4853
|
|
|
4854
|
+
> **Your model's `after_save` callbacks run here too.** When an `after_save` /
|
|
4855
|
+
> `after_create` trigger fires, the webhook rebuilds the `Parse::Object` from the
|
|
4856
|
+
> payload and runs that model's ActiveModel `after_save` / `after_create`
|
|
4857
|
+
> callbacks — so a `webhook :after_save` block and a model `after_save :method`
|
|
4858
|
+
> callback are part of the same flow. They fire **exactly once** per save: for
|
|
4859
|
+
> saves initiated by this Ruby SDK (recognized by the `_RB_` request-id prefix
|
|
4860
|
+
> together with the master key), Parse Stack already ran them locally after the
|
|
4861
|
+
> REST response, so the webhook skips them to avoid double-firing side effects;
|
|
4862
|
+
> for saves from other clients (JS / iOS / REST), the webhook runs them, since
|
|
4863
|
+
> the SDK never had the chance.
|
|
4864
|
+
|
|
4865
|
+
> **Keep `after_save` handlers fast.** Parse Server **waits** for the `after_save`
|
|
4866
|
+
> webhook response before returning to the saving client (only LiveQuery events
|
|
4867
|
+
> are truly fire-and-forget), so a slow handler adds latency to that client's
|
|
4868
|
+
> save. And because Parse Server swallows afterSave errors and never retries the
|
|
4869
|
+
> trigger, blocking on slow work buys you no durability. Do trivial work inline
|
|
4870
|
+
> and hand anything slow, external, or must-not-be-lost (notifications,
|
|
4871
|
+
> downstream writes) to a background job/worker, returning quickly. This matters
|
|
4872
|
+
> most for client-initiated saves, where the callback runs inside the webhook —
|
|
4873
|
+
> Ruby-SDK saves run it in-process after their own REST response instead.
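The exactly-once rule can be summarised as a single predicate. The method name `run_model_callbacks_in_webhook?` and its keyword arguments are hypothetical; only the `_RB_` request-id prefix and the master-key condition come from the note above.

```ruby
RUBY_SDK_REQUEST_PREFIX = "_RB_"

# Hypothetical decision helper: should the webhook run the model's
# after_save / after_create callbacks for this trigger?
def run_model_callbacks_in_webhook?(request_id:, master:)
  # Saves made by the Ruby SDK already ran the callbacks locally after the
  # REST response, so the webhook skips them to avoid double-firing.
  from_ruby_sdk = master && request_id.to_s.start_with?(RUBY_SDK_REQUEST_PREFIX)
  !from_ruby_sdk
end

run_model_callbacks_in_webhook?(request_id: "_RB_3f9a", master: true)  # => false
run_model_callbacks_in_webhook?(request_id: "js-client-1", master: false) # => true
```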
|
|
4874
|
+
|
|
4775
4875
|
`before_save` and `before_delete` hooks have special functionality and multiple ways to halt operations:
|
|
4776
4876
|
|
|
4777
4877
|
1. **Using `error!` method**: Calling `error!` will return an error response to Parse Server
|
|
@@ -5520,46 +5620,71 @@ The integration tests use Docker Compose to spin up a Parse Server instance with
|
|
|
5520
5620
|
|
|
5521
5621
|
#### Docker Configuration
|
|
5522
5622
|
|
|
5523
|
-
The
|
|
5524
|
-
|
|
5525
|
-
|
|
5526
|
-
|
|
5527
|
-
|
|
5528
|
-
|
|
5529
|
-
|
|
5530
|
-
|
|
5531
|
-
|
|
5532
|
-
|
|
5533
|
-
|
|
5534
|
-
|
|
5535
|
-
|
|
5536
|
-
|
|
5537
|
-
|
|
5538
|
-
|
|
5539
|
-
|
|
5540
|
-
|
|
5541
|
-
|
|
5542
|
-
|
|
5543
|
-
|
|
5544
|
-
|
|
5623
|
+
The integration stack is defined in `scripts/docker/docker-compose.test.yml`
|
|
5624
|
+
(Parse Server, MongoDB, Redis, and the Parse Dashboard); the Atlas Search stack
|
|
5625
|
+
is in `scripts/docker/docker-compose.atlas.yml`. It is deliberately isolated
|
|
5626
|
+
from any other Parse test system on the same host — a dedicated Compose project,
|
|
5627
|
+
a private port block, and a dedicated database name — so two Parse stacks can
|
|
5628
|
+
run side by side without colliding.
|
|
5629
|
+
|
|
5630
|
+
Default host ports (each overridable via the env var shown):
|
|
5631
|
+
|
|
5632
|
+
| Service | Host port | Override env var |
|
|
5633
|
+
|----------------------|-----------|-----------------------|
|
|
5634
|
+
| Parse Server | 29337 | `PARSE_HOST_PORT` |
|
|
5635
|
+
| MongoDB (test) | 29017 | `MONGO_HOST_PORT` |
|
|
5636
|
+
| Redis | 29379 | `REDIS_HOST_PORT` |
|
|
5637
|
+
| Parse Dashboard | 29040 | `DASHBOARD_HOST_PORT` |
|
|
5638
|
+
| MongoDB Atlas Local | 29020 | `ATLAS_HOST_PORT` |
|
|
5639
|
+
|
|
5640
|
+
Identity and naming:
|
|
5641
|
+
|
|
5642
|
+
- Containers, network, and volumes are namespaced by the Compose project
|
|
5643
|
+
`psnext-it`. Override the prefix with `PSNEXT_PREFIX` (e.g.
|
|
5644
|
+
`PSNEXT_PREFIX=psnext-ci`) to run a second, fully separate copy of the stack.
|
|
5645
|
+
- Parse database name: `parse_stack_next_it`. Atlas database: `parse_atlas_test`.
|
|
5646
|
+
- Default credentials: app id `psnextItAppId`, master key `psnextItMasterKey`,
|
|
5647
|
+
REST key `psnext-it-rest-key` (override with `PARSE_APP_ID`,
|
|
5648
|
+
`PARSE_MASTER_KEY`, `PARSE_API_KEY`).
|
|
5649
|
+
|
|
5650
|
+
Bring the stack up and verify:
|
|
5651
|
+
|
|
5652
|
+
```bash
|
|
5653
|
+
docker compose -f scripts/docker/docker-compose.test.yml up -d
|
|
5654
|
+
curl -s http://localhost:29337/parse/health # -> {"status":"ok"}
|
|
5545
5655
|
```
|
|
5546
5656
|
|
|
5547
5657
|
#### Environment Variables
|
|
5548
5658
|
|
|
5549
|
-
|
|
5659
|
+
The defaults above are baked into the Compose file and the test helpers, so the
|
|
5660
|
+
suite is isolated out of the box. To re-point anything, export the variables in
|
|
5661
|
+
your shell before running (nothing auto-loads `.env.test` — it is a committed
|
|
5662
|
+
reference of the full set; `set -a; source .env.test; set +a` loads them all at
|
|
5663
|
+
once). There are two sides — the containers and the Ruby client — and when you
|
|
5664
|
+
move a port you set both so they agree:
|
|
5550
5665
|
|
|
5551
5666
|
```bash
|
|
5552
|
-
# Required
|
|
5667
|
+
# Required to route the suite at the Docker stack
|
|
5553
5668
|
export PARSE_TEST_USE_DOCKER=true
|
|
5554
5669
|
|
|
5555
|
-
#
|
|
5556
|
-
export
|
|
5557
|
-
export
|
|
5558
|
-
export
|
|
5559
|
-
export
|
|
5560
|
-
|
|
5561
|
-
|
|
5562
|
-
export
|
|
5670
|
+
# Compose side — what the containers publish / use
|
|
5671
|
+
export PSNEXT_PREFIX=psnext-it
|
|
5672
|
+
export PARSE_HOST_PORT=29337
|
|
5673
|
+
export MONGO_HOST_PORT=29017
|
|
5674
|
+
export REDIS_HOST_PORT=29379
|
|
5675
|
+
export PARSE_APP_ID=psnextItAppId
|
|
5676
|
+
export PARSE_MASTER_KEY=psnextItMasterKey
|
|
5677
|
+
export PARSE_API_KEY=psnext-it-rest-key
|
|
5678
|
+
|
|
5679
|
+
# Client side — what the Ruby test suite connects to
|
|
5680
|
+
export PARSE_TEST_SERVER_URL=http://localhost:29337/parse
|
|
5681
|
+
export PARSE_TEST_APP_ID=psnextItAppId
|
|
5682
|
+
export PARSE_TEST_API_KEY=psnext-it-rest-key
|
|
5683
|
+
export PARSE_TEST_MASTER_KEY=psnextItMasterKey
|
|
5684
|
+
export PARSE_TEST_MONGO_URI="mongodb://admin:password@localhost:29017/parse_stack_next_it?authSource=admin"
|
|
5685
|
+
export PARSE_TEST_REDIS_URL=redis://localhost:29379/0
|
|
5686
|
+
export PARSE_TEST_LIVE_QUERY_URL=ws://localhost:29337
|
|
5687
|
+
export ATLAS_URI="mongodb://localhost:29020/parse_atlas_test?directConnection=true"
|
|
5563
5688
|
```
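Since the Compose side and the client side are configured separately, a quick consistency check can catch a port that was moved on one side only. The helper `port_mismatches` is a hypothetical sketch, not part of the test helpers; it only compares the variable pairs shown in the block above.

```ruby
require "uri"

# Hypothetical check: does each client-side URL use the port published by
# the matching Compose-side variable? Returns a list of human-readable problems.
def port_mismatches(env)
  pairs = {
    "PARSE_TEST_SERVER_URL" => "PARSE_HOST_PORT",
    "PARSE_TEST_MONGO_URI"  => "MONGO_HOST_PORT",
    "PARSE_TEST_REDIS_URL"  => "REDIS_HOST_PORT",
  }
  pairs.each_with_object([]) do |(url_var, port_var), problems|
    next unless env[url_var] && env[port_var] # unset means the default applies

    actual = URI.parse(env[url_var]).port
    next if actual == env[port_var].to_i

    problems << "#{url_var} uses port #{actual}, but #{port_var}=#{env[port_var]}"
  end
end

puts port_mismatches(ENV) # prints nothing when both sides agree
```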
|
|
5564
5689
|
|
|
5565
5690
|
#### Troubleshooting
|
|
@@ -5572,9 +5697,13 @@ export REDIS_URL=redis://localhost:6379
|
|
|
5572
5697
|
docker-compose --version
|
|
5573
5698
|
```
|
|
5574
5699
|
|
|
5575
|
-
2. **Port conflicts**:
|
|
5700
|
+
2. **Port conflicts**: The stack uses a dedicated `29xxx` block (29337 / 29017 /
|
|
5701
|
+
29379 / 29040 / 29020) specifically to avoid colliding with a default Parse
|
|
5702
|
+
setup (1337 / 27017 / 6379 / 4040). If something still holds one of those
|
|
5703
|
+
ports, override it (for example `PARSE_HOST_PORT=29338`) or stop the
|
|
5704
|
+
conflicting stack:
|
|
5576
5705
|
```bash
|
|
5577
|
-
docker
|
|
5706
|
+
docker compose -f scripts/docker/docker-compose.test.yml down
|
|
5578
5707
|
```
|
|
5579
5708
|
|
|
5580
5709
|
3. **Permission errors**: Ensure Docker has proper permissions
|
data/Rakefile
CHANGED
|
@@ -14,7 +14,7 @@ require "rake/testtask"
|
|
|
14
14
|
# @return [Array(String, String, String, String)]
|
|
15
15
|
# server_url, application_id, api_key, master_key
|
|
16
16
|
def mcp_credentials_or_abort!
|
|
17
|
-
server_url = ENV["PARSE_SERVER_URL"] || "http://localhost:
|
|
17
|
+
server_url = ENV["PARSE_SERVER_URL"] || "http://localhost:29337/parse"
|
|
18
18
|
app_id = ENV["PARSE_APP_ID"]
|
|
19
19
|
rest_api_key = ENV["PARSE_API_KEY"]
|
|
20
20
|
master_key = ENV["PARSE_MASTER_KEY"]
|
|
@@ -23,9 +23,9 @@ def mcp_credentials_or_abort!
|
|
|
23
23
|
|
|
24
24
|
if app_id.to_s.empty? || master_key.to_s.empty?
|
|
25
25
|
if is_local
|
|
26
|
-
app_id = (app_id.to_s.empty? ? "
|
|
26
|
+
app_id = (app_id.to_s.empty? ? "psnextItAppId" : app_id)
|
|
27
27
|
rest_api_key = (rest_api_key.to_s.empty? ? "myApiKey" : rest_api_key)
|
|
28
|
-
master_key = (master_key.to_s.empty? ? "
|
|
28
|
+
master_key = (master_key.to_s.empty? ? "psnextItMasterKey" : master_key)
|
|
29
29
|
else
|
|
30
30
|
abort "[Rakefile] PARSE_SERVER_URL=#{server_url} is not local; refusing to fall back to " \
|
|
31
31
|
"placeholder credentials. Set PARSE_APP_ID and PARSE_MASTER_KEY explicitly."
|
|
@@ -35,11 +35,15 @@ def mcp_credentials_or_abort!
|
|
|
35
35
|
[server_url, app_id, rest_api_key, master_key]
|
|
36
36
|
end
|
|
37
37
|
|
|
38
|
-
# Default test task runs all tests with Docker enabled
|
|
38
|
+
# Default test task runs all tests with Docker enabled.
|
|
39
|
+
#
|
|
40
|
+
# `*disruptive*` tests are EXCLUDED here: they stop/restart the shared
|
|
41
|
+
# Parse Server container, which would flake any other test loaded into the
|
|
42
|
+
# same process. Run them on their own via `rake test:integration:disruptive`.
|
|
39
43
|
Rake::TestTask.new do |t|
|
|
40
44
|
ENV['PARSE_TEST_USE_DOCKER'] = 'true'
|
|
41
45
|
t.libs << "lib/parse/stack"
|
|
42
|
-
t.test_files = FileList["test/lib/**/*_test.rb"]
|
|
46
|
+
t.test_files = FileList["test/lib/**/*_test.rb"].exclude("test/lib/**/*disruptive*")
|
|
43
47
|
t.warning = false
|
|
44
48
|
t.verbose = true
|
|
45
49
|
end
|
|
@@ -48,8 +52,12 @@ end
|
|
|
48
52
|
namespace :test do
|
|
49
53
|
desc "Run all integration tests (requires Docker)"
|
|
50
54
|
task :integration do
|
|
55
|
+
# Disruptive tests (server stop/restart) are run separately via
|
|
56
|
+
# `test:integration:disruptive` so they never interleave with — and
|
|
57
|
+
# flake — the rest of the integration suite against the shared server.
|
|
51
58
|
integration_files = FileList["test/lib/**/*integration_test.rb"]
|
|
52
|
-
|
|
59
|
+
.exclude("test/lib/**/*disruptive*")
|
|
60
|
+
|
|
53
61
|
puts "Running #{integration_files.length} integration test files..."
|
|
54
62
|
integration_files.each_with_index do |file, index|
|
|
55
63
|
puts "Running integration test #{index + 1}/#{integration_files.length}: #{file}"
|
|
@@ -71,8 +79,10 @@ namespace :test do
|
|
|
71
79
|
|
|
72
80
|
desc "Run unit tests only (no Docker required)"
|
|
73
81
|
task :unit do
|
|
74
|
-
unit_files = FileList["test/lib/**/*_test.rb"]
|
|
75
|
-
|
|
82
|
+
unit_files = FileList["test/lib/**/*_test.rb"]
|
|
83
|
+
.exclude("test/lib/**/*integration_test.rb")
|
|
84
|
+
.exclude("test/lib/**/*disruptive*")
|
|
85
|
+
|
|
76
86
|
puts "Running #{unit_files.length} unit test files (no Docker)..."
|
|
77
87
|
unit_files.each_with_index do |file, index|
|
|
78
88
|
puts "Running unit test #{index + 1}/#{unit_files.length}: #{file}"
|
|
@@ -89,13 +99,49 @@ namespace :test do
|
|
|
89
99
|
puts "\n✅ All unit tests completed successfully!"
|
|
90
100
|
end
|
|
91
101
|
|
|
102
|
+
namespace :integration do
|
|
103
|
+
desc "Run DISRUPTIVE integration tests (stop/restart the Parse Server " \
|
|
104
|
+
"container). Run in isolation — these are excluded from the normal " \
|
|
105
|
+
"test / test:integration / test:unit runs."
|
|
106
|
+
task :disruptive do
|
|
107
|
+
disruptive_files = FileList["test/lib/**/*disruptive*_test.rb"]
|
|
108
|
+
|
|
109
|
+
if disruptive_files.empty?
|
|
110
|
+
puts "No disruptive test files found."
|
|
111
|
+
next
|
|
112
|
+
end
|
|
113
|
+
|
|
114
|
+
puts "Running #{disruptive_files.length} disruptive test file(s)..."
|
|
115
|
+
disruptive_files.each_with_index do |file, index|
|
|
116
|
+
puts "\n" + "=" * 80
|
|
117
|
+
puts "Running disruptive test #{index + 1}/#{disruptive_files.length}: #{file}"
|
|
118
|
+
puts "=" * 80
|
|
119
|
+
# Each file runs in its own process so a server outage in one cannot
|
|
120
|
+
# bleed into the next.
|
|
121
|
+
system("PARSE_TEST_USE_DOCKER=true ruby -Ilib:test #{file}") || begin
|
|
122
|
+
# A disruptive test may have left the server down on failure; bring
|
|
123
|
+
# it back so a follow-up run / other tasks start from a clean state.
|
|
124
|
+
system("docker start #{ENV["PSNEXT_PREFIX"] || "psnext-it"}-server", out: IO::NULL, err: IO::NULL)
|
|
125
|
+
exit(1)
|
|
126
|
+
end
|
|
127
|
+
end
|
|
128
|
+
puts "\n✅ All disruptive tests completed successfully!"
|
|
129
|
+
end
|
|
130
|
+
end
|
|
131
|
+
|
|
92
132
|
desc "List all available test files"
|
|
93
133
|
task :list do
|
|
94
134
|
puts "\nIntegration Tests:"
|
|
95
|
-
FileList["test/lib/**/*integration_test.rb"].each { |f| puts " #{f}" }
|
|
135
|
+
FileList["test/lib/**/*integration_test.rb"].exclude("test/lib/**/*disruptive*").each { |f| puts " #{f}" }
|
|
136
|
+
|
|
137
|
+
puts "\nDisruptive Integration Tests (run via test:integration:disruptive):"
|
|
138
|
+
FileList["test/lib/**/*disruptive*_test.rb"].each { |f| puts " #{f}" }
|
|
96
139
|
|
|
97
140
|
puts "\nUnit Tests:"
|
|
98
|
-
FileList["test/lib/**/*_test.rb"]
|
|
141
|
+
FileList["test/lib/**/*_test.rb"]
|
|
142
|
+
.exclude("test/lib/**/*integration_test.rb")
|
|
143
|
+
.exclude("test/lib/**/*disruptive*")
|
|
144
|
+
.each { |f| puts " #{f}" }
|
|
99
145
|
end
|
|
100
146
|
|
|
101
147
|
# ---------------------------------------------------------------------------
|
|
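The Rakefile changes above split the test files into three groups using include and exclude globs. The sketch below illustrates how those patterns partition a set of paths. It is an approximation, not the Rakefile's own code: the file names are hypothetical, and `File.fnmatch?` stands in for Rake's `FileList` matching, which resolves includes against the filesystem.

```ruby
# Approximates glob matching on plain strings (assumption: Rake's FileList
# behaves equivalently for these patterns).
def glob?(pattern, path)
  File.fnmatch?(pattern, path, File::FNM_PATHNAME)
end

# Keeps paths that match `matching` and none of the `excluding` globs.
def select_files(files, matching:, excluding: [])
  files.select do |f|
    glob?(matching, f) && excluding.none? { |pat| glob?(pat, f) }
  end
end

# Hypothetical file names; only the glob patterns come from the diff.
files = [
  "test/lib/parse/query_test.rb",
  "test/lib/parse/query_integration_test.rb",
  "test/lib/parse/server_disruptive_integration_test.rb",
]

disruptive_glob  = "test/lib/**/*disruptive*"
integration_glob = "test/lib/**/*integration_test.rb"

unit = select_files(files, matching: "test/lib/**/*_test.rb",
                           excluding: [integration_glob, disruptive_glob])
integration = select_files(files, matching: integration_glob,
                                  excluding: [disruptive_glob])
disruptive = select_files(files, matching: "test/lib/**/*disruptive*_test.rb")

puts "unit:        #{unit.inspect}"
puts "integration: #{integration.inspect}"
puts "disruptive:  #{disruptive.inspect}"
```

Under these patterns each file lands in exactly one group, so a disruptive test cannot be picked up by the normal unit or integration runs.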
@@ -351,23 +351,124 @@ Mechanics:
|
|
|
351
351
|
|
|
352
352
|
### Single vector per record
|
|
353
353
|
|
|
354
|
-
`embed` produces exactly one vector per record.
|
|
355
|
-
|
|
356
|
-
|
|
357
|
-
|
|
358
|
-
|
|
354
|
+
`embed` produces exactly one vector per record. Long source text whose
|
|
355
|
+
concatenation exceeds the provider's per-call token budget is truncated
|
|
356
|
+
provider-side, and the stored vector represents only the leading portion
|
|
357
|
+
of the document. **Chunking happens at retrieval time, not embed time**
|
|
358
|
+
(see [Retrieval (RAG)](#retrieval-rag) below): the embedding stays
|
|
359
|
+
one-vector-per-record by design.
|
|
359
360
|
|
|
360
|
-
|
|
361
|
+
If you instead want each passage to have its OWN embedding (true
|
|
362
|
+
embed-time chunking), use one of these patterns:
|
|
361
363
|
|
|
362
364
|
1. **Pre-chunk client-side** and write each chunk as its own
|
|
363
365
|
`Parse::Object` record with its own `embed` declaration.
|
|
364
|
-
2. **Dedicated
|
|
366
|
+
2. **Dedicated chunk subclass** that `belongs_to` the parent, with
|
|
365
367
|
`embed :content, into: :embedding` on the chunk class itself. Run
|
|
366
368
|
similarity search against the chunk collection, then hydrate
|
|
367
369
|
parents as needed.
|
|
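The second pattern above (search the chunk collection, then hydrate parents) can be sketched in plain Ruby. This is a toy illustration under stated assumptions: the records are hashes rather than `Parse::Object` instances, the two-dimensional embeddings are made up, and the in-memory cosine ranking stands in for a real vector search. The helper names `cosine`, `nearest_chunks`, and `hydrate_parents` are hypothetical.

```ruby
# Cosine similarity between two equal-length numeric vectors.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Scores every chunk against the query vector and keeps the top k,
# highest score first.
def nearest_chunks(chunks, query_vector, k)
  chunks
    .map { |c| c.merge(score: cosine(c[:embedding], query_vector)) }
    .max_by(k) { |c| c[:score] }
end

# Resolves each hit's parent once, preserving hit order.
def hydrate_parents(hits, parents)
  hits.map { |c| c[:parent_id] }.uniq.map { |id| parents.fetch(id) }
end

# Made-up chunk records: each has its own embedding and a parent pointer.
chunks = [
  { id: "c1", parent_id: "a1", content: "Reset your password from Settings.", embedding: [1.0, 0.0] },
  { id: "c2", parent_id: "a1", content: "Billing runs monthly.",              embedding: [0.0, 1.0] },
  { id: "c3", parent_id: "a2", content: "Password rules: 12+ characters.",    embedding: [0.8, 0.6] },
]
parents = {
  "a1" => { title: "Account basics" },
  "a2" => { title: "Security policy" },
}

hits = nearest_chunks(chunks, [1.0, 0.0], 2)
articles = hydrate_parents(hits, parents)

puts hits.map { |h| h[:id] }.inspect
puts articles.map { |a| a[:title] }.inspect
```

Unlike retrieval-time chunking, each passage here is ranked on its own vector, so two chunks of the same parent can score very differently.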
368
370
|
|
|
369
|
-
|
|
370
|
-
|
|
371
|
+
---
|
|
372
|
+
|
|
373
|
+
## Retrieval (RAG)
|
|
374
|
+
|
|
375
|
+
`Parse::Retrieval` (`Parse::RAG` is an alias) sits on top of
|
|
376
|
+
`find_similar`. `Parse::Retrieval.retrieve` embeds a natural-language
|
|
377
|
+
query, runs Atlas `$vectorSearch` through `find_similar` (so ACL/CLP are
|
|
378
|
+
enforced mongo-direct — there is no REST two-stage re-query), and splits
|
|
379
|
+
each retrieved document's text field into scored, citable chunks.
|
|
380
|
+
Chunking here is **presentation-only**: every chunk inherits its parent
|
|
381
|
+
document's single `$vectorSearch` score.
|
|
382
|
+
|
|
383
|
+
```ruby
|
|
384
|
+
chunks = Parse::Retrieval.retrieve(
|
|
385
|
+
query: "how do I reset my password?",
|
|
386
|
+
klass: KnowledgeArticle, # or "KnowledgeArticle"
|
|
387
|
+
field: :embedding, # optional; auto-resolves a single :vector field
|
|
388
|
+
k: 5,
|
|
389
|
+
filter: { published: true }, # post-$vectorSearch $match
|
|
390
|
+
vector_filter: nil, # Atlas-native pre-filter (fields must be type:"filter")
|
|
391
|
+
tenant_scope: nil, # { field:, value: } merged into vector_filter
|
|
392
|
+
score_quantize: false,
|
|
393
|
+
session_token: user.session_token, # ACL scope kwargs pass through to find_similar
|
|
394
|
+
)
|
|
395
|
+
# => Array<Parse::Retrieval::Chunk> — { id, score, content, source, metadata }
|
|
396
|
+
```
|
|
397
|
+
|
|
398
|
+
`rerank:` and `hybrid:` are reserved on the signature and raise
|
|
399
|
+
`NotImplementedError` if supplied.
|
|
400
|
+
|
|
401
|
+
### Chunkers
|
|
402
|
+
|
|
403
|
+
The default is a fixed-size sliding window with overlap. Subclass
|
|
404
|
+
`Parse::Retrieval::Chunker::Base` (implement `#chunk(text) -> Array<String>`)
|
|
405
|
+
for semantic / sentence-aware strategies.
|
|
406
|
+
|
|
407
|
+
```ruby
|
|
408
|
+
Parse::Retrieval::Chunker::FixedSizeOverlap.new(
|
|
409
|
+
size: 800, # window width
|
|
410
|
+
overlap: 100, # units shared between consecutive windows (must be < size)
|
|
411
|
+
by: :chars, # :chars (default) or :tokens (whitespace tokens)
|
|
412
|
+
max_chunks_per_document: 200, # amplification cap — TRUNCATES with a signal, never raises
|
|
413
|
+
)
|
|
414
|
+
```
|
|
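To make the window arithmetic concrete, here is a minimal plain-Ruby sketch of a fixed-size sliding window with overlap. It is not the gem's `FixedSizeOverlap` implementation: the method name `sliding_window`, its `max_chunks:` keyword, and the boolean it returns as a truncation signal are all assumptions chosen for illustration.

```ruby
# Splits text into windows of `size` units that share `overlap` units with
# their predecessor. Returns [chunks, truncated]; the boolean is a
# hypothetical stand-in for whatever truncation signal the gem emits.
def sliding_window(text, size:, overlap:, by: :chars, max_chunks: 200)
  raise ArgumentError, "size must be positive" unless size.positive?
  raise ArgumentError, "overlap must be < size" unless overlap < size

  units  = by == :tokens ? text.split : text.chars
  joiner = by == :tokens ? " " : ""
  step   = size - overlap # how far each window advances

  windows = []
  start = 0
  while start < units.length
    windows << units[start, size].join(joiner)
    break if start + size >= units.length # this window reached the end
    start += step
  end

  truncated = windows.length > max_chunks
  [windows.first(max_chunks), truncated]
end

chunks, truncated = sliding_window("abcdefghij", size: 4, overlap: 1)
puts chunks.inspect
puts truncated

capped, was_capped = sliding_window("abcdefghij", size: 4, overlap: 1, max_chunks: 2)
puts capped.inspect
puts was_capped
```

With `size: 4` and `overlap: 1` the window advances three units at a time, so consecutive chunks share exactly one character. The cap trims the tail of the list instead of raising, mirroring the amplification-cap behavior described in the comment above.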
415
|
+
|
|
416
|
+
### `agent_searchable` + the `semantic_search` agent tool
|
|
417
|
+
|
|
418
|
+
Opt a model in to agentic retrieval, declaring the vector field and the
|
|
419
|
+
fields an agent may filter on:
|
|
420
|
+
|
|
421
|
+
```ruby
|
|
422
|
+
class KnowledgeArticle < Parse::Object
|
|
423
|
+
property :title, :string
|
|
424
|
+
property :body, :string
|
|
425
|
+
property :embedding, :vector, dimensions: 1536, provider: :openai
|
|
426
|
+
embed :title, :body, into: :embedding
|
|
427
|
+
agent_searchable field: :embedding, filter_fields: %i[published category]
|
|
428
|
+
end
|
|
429
|
+
```
|
|
430
|
+
|
|
431
|
+
Every property referenced by `embed` must be declared — omitting
|
|
432
|
+
`property :title` here raises `InvalidEmbedDeclaration` at class load.
|
|
433
|
+
|
|
434
|
+
Because this model embeds **two** text sources (`:title` and `:body`),
|
|
435
|
+
`semantic_search` cannot guess which one to chunk and return as the
|
|
436
|
+
result `content`. Pass `text_field:` to choose (it must name one of the
|
|
437
|
+
embedded sources); a single-source model infers it automatically and the
|
|
438
|
+
parameter is optional:
|
|
439
|
+
|
|
440
|
+
```ruby
|
|
441
|
+
# via the agent tool (LLM-facing parameter)
|
|
442
|
+
semantic_search(class_name: "KnowledgeArticle", query: "vector indexes",
|
|
443
|
+
text_field: "body")
|
|
444
|
+
|
|
445
|
+
# or directly
|
|
446
|
+
Parse::Retrieval.retrieve(query: "vector indexes", klass: KnowledgeArticle,
|
|
447
|
+
text_field: :body)
|
|
448
|
+
```
|
|
449
|
+
|
|
450
|
+
The readonly, `client_safe` `semantic_search` tool then routes through
|
|
451
|
+
`Parse::Retrieval.retrieve` with the full agent security envelope:
|
|
452
|
+
searchable-class allowlist (`MetadataRegistry.resolve_searchable!`),
|
|
453
|
+
recursive underscore-key refusal + filter-field allowlist on caller
|
|
454
|
+
input, tenant scope folded into the Atlas pre-filter AND re-asserted on
|
|
455
|
+
every returned record, `field_allowlist` projection of each source, and
|
|
456
|
+
score quantization in non-admin contexts. In a tenant-aware deployment
|
|
457
|
+
(any class declares `agent_tenant_scope`), a searchable class without its
|
|
458
|
+
own tenant scope is refused at dispatch. See the
|
|
459
|
+
[MCP guide](./mcp_guide.md) for the agent-side wiring.
|
|
460
|
+
|
|
461
|
+
**Result shape (token-economy).** The tool returns
|
|
462
|
+
`{ chunks:, documents:, count: }`. Each chunk's parent record is hoisted
|
|
463
|
+
**once** into `documents` (keyed by `objectId`) rather than duplicated on
|
|
464
|
+
every chunk — map a chunk to its source via `metadata.object_id`. A
|
|
465
|
+
`max_total_tokens:` budget (default 20,000; estimated chars/4) trims the
|
|
466
|
+
lowest-ranked chunks so a few long documents can't silently blow the
|
|
467
|
+
context window, adding `budget_truncated: true` / `budget_dropped: <n>`
|
|
468
|
+
when it trims (pass `0` to disable). The library-level
|
|
469
|
+
`Parse::Retrieval.retrieve` still returns the flat `Array<Chunk>` with
|
|
470
|
+
`source` on each chunk — the dedup and budget live in the agent tool's
|
|
471
|
+
envelope. See the [MCP guide's Token Economy section](./mcp_guide.md#token-economy).
|
|
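The result shape described above can be sketched as a small transformation from a flat chunk list to a deduplicated, budgeted envelope. This is an illustration only: the key names (`chunks`, `documents`, `count`, `budget_truncated`, `budget_dropped`, `metadata.object_id`) and the chars/4 estimate come from the guide, while the helper names, the rounding, and the exact trimming strategy are assumptions.

```ruby
# chars/4 token estimate; rounding up is an assumption.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Builds { chunks:, documents:, count: } from flat chunks that each carry
# their full :source. Lowest-ranked chunks are dropped once the running
# token estimate would exceed the budget; a budget of 0 disables trimming.
def build_envelope(flat_chunks, max_total_tokens: 20_000)
  # Stable sort: highest score first, original order breaks ties.
  ranked = flat_chunks.sort_by.with_index { |c, i| [-c[:score], i] }

  kept = ranked
  if max_total_tokens.positive?
    used = 0
    kept = ranked.take_while do |c|
      used += estimate_tokens(c[:content])
      used <= max_total_tokens
    end
  end

  documents = {}
  chunks = kept.map do |c|
    id = c[:source]["objectId"]
    documents[id] ||= c[:source] # hoist each parent record once
    { content: c[:content], score: c[:score], metadata: { object_id: id } }
  end

  envelope = { chunks: chunks, documents: documents, count: chunks.length }
  dropped = ranked.length - kept.length
  envelope.merge!(budget_truncated: true, budget_dropped: dropped) if dropped.positive?
  envelope
end

# Made-up data: two chunks from one article, one from another.
doc_a = { "objectId" => "x1", "title" => "Article A" }
doc_b = { "objectId" => "x2", "title" => "Article B" }
flat = [
  { content: "a" * 40, score: 0.9, source: doc_a },
  { content: "b" * 40, score: 0.9, source: doc_a },
  { content: "c" * 40, score: 0.5, source: doc_b },
]

trimmed   = build_envelope(flat, max_total_tokens: 25)
unbounded = build_envelope(flat, max_total_tokens: 0)

puts trimmed[:count]
puts trimmed[:documents].keys.inspect
puts unbounded[:count]
```

Each 40-character chunk is estimated at 10 tokens, so a budget of 25 keeps the two higher-ranked chunks and drops the third, and the parent of the kept chunks appears once in `documents` even though two chunks reference it.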
371
472
|
|
|
372
473
|
---
|
|
373
474
|
|