parse-stack-next 5.4.1 → 5.5.1

Files changed (41)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +489 -0
  3. data/Gemfile.lock +1 -1
  4. data/README.md +61 -9
  5. data/docs/atlas_vector_search_guide.md +318 -19
  6. data/lib/parse/acl_scope.rb +11 -0
  7. data/lib/parse/agent/mcp_rack_app.rb +53 -14
  8. data/lib/parse/agent/mcp_server.rb +19 -0
  9. data/lib/parse/api/path_segment.rb +31 -0
  10. data/lib/parse/api/users.rb +13 -0
  11. data/lib/parse/cache/redis.rb +55 -11
  12. data/lib/parse/client/caching.rb +12 -3
  13. data/lib/parse/client/logging.rb +9 -0
  14. data/lib/parse/client.rb +37 -3
  15. data/lib/parse/embeddings/batch_embedder.rb +188 -0
  16. data/lib/parse/embeddings/cache.rb +374 -0
  17. data/lib/parse/embeddings/cohere.rb +31 -18
  18. data/lib/parse/embeddings/image_fetch.rb +347 -0
  19. data/lib/parse/embeddings/provider.rb +17 -11
  20. data/lib/parse/embeddings/spend_cap.rb +117 -3
  21. data/lib/parse/embeddings/voyage.rb +34 -25
  22. data/lib/parse/embeddings.rb +40 -3
  23. data/lib/parse/model/acl.rb +15 -11
  24. data/lib/parse/model/core/embed_managed.rb +243 -14
  25. data/lib/parse/model/core/properties.rb +42 -5
  26. data/lib/parse/model/core/vector_searchable.rb +157 -8
  27. data/lib/parse/mongodb.rb +12 -0
  28. data/lib/parse/pipeline_security.rb +81 -15
  29. data/lib/parse/query/constraint.rb +22 -0
  30. data/lib/parse/query/constraints.rb +271 -250
  31. data/lib/parse/query.rb +284 -43
  32. data/lib/parse/retrieval/agent_tool.rb +21 -14
  33. data/lib/parse/retrieval/retriever.rb +84 -0
  34. data/lib/parse/schema/search_index_migrator.rb +48 -1
  35. data/lib/parse/stack/version.rb +1 -1
  36. data/lib/parse/stack.rb +12 -1
  37. data/lib/parse/vector_search/hybrid.rb +39 -1
  38. data/lib/parse/vector_search.rb +34 -0
  39. data/lib/parse/webhooks/payload.rb +7 -1
  40. data/lib/parse/webhooks.rb +107 -21
  41. metadata +4 -1
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 842a9a2d8d24afbb8e0444d995ea6c1d8707ec5fc2c9a40405d40f71b46495b8
- data.tar.gz: 8e54c583a6bf251818144b2ae6e1589197b4b0153093e0dd36048798a696346a
+ metadata.gz: 8fed615f71ab3b45bd9f10e2947c50ebbcba075b15b0a8786f0a269d4e59ebd6
+ data.tar.gz: e9528b3f4bc811cef21494089f6e5eb5aaed43732ddf926a64ccc2a050d8742d
  SHA512:
- metadata.gz: 264a5574513616b8cb9ebe662c9e1109746d688c191affe3faefd11f20b0911039f09e895e535874e8715ddf5f53b30e863f6cdc5fa54f4cafabe2f5044398db
- data.tar.gz: d61b36d12ef78eb05b701ff1e45858c7dc2f911bd548d912db9e061fe9027fed86c3b2b2808f2a89cbeb5115c603fd279425ccec2d184f567b9ef00f8eb901af
+ metadata.gz: 0d1d0ee29e3787585f246e8b006f81aa514bc85732fe32c1c175f5734d60b8fadbf5f2d127f7d7fa6df38e49303e334ad2c31a10d85cb3b7e7f8eba1a1bf836d
+ data.tar.gz: 8c032babfcc42f16327a874d4cbf358ed7aade09157da7e29dd879578454b23cff7e7cbc3e860fec7891c8e43c12b1362778039550bf3371c63786d122fb9ae7
data/CHANGELOG.md CHANGED
@@ -1,5 +1,494 @@
  ## parse-stack-next Changelog
  
+ ### 5.5.1
+
+ #### Mongo-direct reads inside `Parse.with_session` are now scoped, not master
+
+ - **FIXED**: A query that auto-routes to the mongo-direct path because of a
+ direct-only constraint (for example a geo `$near` / `$geoIntersects` query)
+ now honors the ambient session token set by `Parse.with_session(token)`.
+ Previously the mongo-direct auth resolver consulted only the query's own
+ `session_token=` / `scope_to_user` / `scope_to_role` and ignored the
+ fiber-local ambient session, so in server mode it fell through to a
+ master-key read with no ACL/CLP enforcement — returning rows the session was
+ not permitted to see, even though every REST query in the same
+ `with_session` block was correctly scoped. The resolver now mirrors
+ `Parse::Client#request` precedence: an explicit per-query token wins, then
+ the ambient session, then the master-key fallback; an explicit
+ `use_master_key: true` is a deliberate admin call and still skips the
+ ambient. Routing also accepts the ambient on non-master clients
+ (`Parse.client_mode` or a user-scoped client), so such a query runs scoped
+ rather than raising.
+
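The precedence described in this entry can be sketched as a small pure function. This is an illustrative stand-in, not the gem's resolver: the method name `resolve_read_auth` and its keyword arguments are hypothetical, and the relative order of an explicit per-query token versus `use_master_key: true` is an assumption the changelog does not spell out.

```ruby
# Hypothetical sketch of the credential precedence; not the gem's code.
def resolve_read_auth(query_token: nil, ambient_token: nil, use_master_key: false)
  # 1. An explicit per-query token wins (assumed to outrank use_master_key).
  return [:session, query_token] if query_token
  # 2. A deliberate admin call skips the ambient session.
  return [:master, nil] if use_master_key
  # 3. Otherwise the ambient session (e.g. set by Parse.with_session) applies.
  return [:session, ambient_token] if ambient_token
  # 4. Server-mode fallback when nothing narrower was supplied.
  [:master, nil]
end

p resolve_read_auth(ambient_token: "r:ambient")
p resolve_read_auth(ambient_token: "r:ambient", use_master_key: true)
```

The point of the fix is step 3: before 5.5.1 the direct path behaved as if that branch did not exist.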
23
+ #### Boolean property coercion no longer treats the string "false" as true
24
+
25
+ - **FIXED**: A `:boolean` property assigned a string now coerces via
26
+ ActiveModel's boolean caster instead of raw Ruby truthiness. Previously the
27
+ coercion was `val ? true : false`, so the strings `"false"`, `"0"`, and
28
+ `"off"` — exactly what arrives on a Rails-form or query-string ingestion
29
+ path — all coerced to `true`, silently flipping a boolean the wrong way (for
30
+ example an `archived` flag or an application-defined access gate). String
31
+ forms now map correctly (`"false"`/`"0"`/`"off"` to `false`), a blank string
32
+ is treated as unset (`nil`), and native booleans from Parse wire JSON pass
33
+ through unchanged.
34
+
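To make the before/after concrete, here is a dependency-free approximation of the two coercions. It does not call ActiveModel; `FALSE_STRINGS` is an assumed subset of the values ActiveModel's boolean caster treats as false, and `legacy_coerce` / `coerce_boolean` are hypothetical names.

```ruby
# Assumption: a literal subset of the strings ActiveModel treats as false.
FALSE_STRINGS = %w[0 f F false FALSE off OFF].freeze

# The old behavior: any non-nil, non-false value is truthy in Ruby.
def legacy_coerce(val)
  val ? true : false
end

# Sketch of the new behavior described above.
def coerce_boolean(val)
  return val if val == true || val == false # native booleans pass through
  return nil if val.nil?
  if val.is_a?(String)
    return nil if val.strip.empty?          # blank string means "unset"
    return !FALSE_STRINGS.include?(val)
  end
  val ? true : false                        # other types: plain truthiness (assumption)
end

%w[false 0 off true].each do |s|
  puts "#{s.inspect}: legacy=#{legacy_coerce(s)} new=#{coerce_boolean(s)}"
end
```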
35
+ #### Deprecation warning for setting ACL via mass-assignment
36
+
37
+ - **DEPRECATED**: Setting `acl`/`ACL` through mass-assignment
38
+ (`Parse::Object#attributes=`) now emits a one-time security warning. Mass-
39
+ assigning an ACL from a caller-supplied hash — for example a controller doing
40
+ `record.attributes = params` without StrongParameters — lets an attacker
41
+ grant unintended access by sending an `ACL` key
42
+ (`{"ACL" => {"*" => {"write" => true}}}`). The behavior is unchanged this
43
+ release (the ACL is still applied), but the supported path is the explicit
44
+ `record.acl = ...` setter, and a future release may block ACL mass-assignment.
45
+ The constructor form `Klass.new(acl: ...)` is unaffected and does not warn.
46
+
47
+ #### Redis cache values serialized as JSON instead of Marshal
48
+
49
+ - **FIXED**: `Parse::Cache::Redis` now serializes cached HTTP responses as
50
+ JSON rather than Marshal. The Moneta-Redis store Marshals values by default,
51
+ so every cache hit ran `Marshal.load` on the bytes returned by Redis. Against
52
+ a shared, unauthenticated, or plaintext-`redis://` cache, an attacker able to
53
+ write the cache could plant a crafted Marshal payload that executed code on
54
+ deserialization. The wrapper now disables Moneta's value serializer
55
+ (`value_serializer: nil`) and JSON-encodes/decodes values itself; an
56
+ undecodable value (including any legacy Marshal entry) is treated as a cache
57
+ miss rather than deserialized. Cache keys are unchanged. No application code
58
+ changes are required; existing cached entries are transparently refetched and
59
+ re-stored in the new format on first access.
60
+ - **FIXED**: The `cache: "redis://..."` shorthand on `Parse::Client.new` /
61
+ `Parse.setup` now builds a `Parse::Cache::Redis` store instead of a bare
62
+ `Moneta.new(:Redis, ...)`, so it gets the same JSON value serialization and
63
+ is not subject to the Marshal deserialization issue above.
64
+ - **CHANGED**: The caching middleware stores response entries with string keys
65
+ so they round-trip losslessly through the JSON serialization. Reads accept
66
+ both string and legacy symbol keys.
67
+ - **FIXED**: `Parse::Embeddings::Cache::MonetaStore` now JSON-encodes cached
68
+ embedding vectors instead of relying on the Moneta store's default Marshal
69
+ value serializer, closing the same `Marshal.load`-on-read deserialization
70
+ vector for the embedding cache (whose key is derived from often-user-supplied
71
+ text). It also emits a one-time warning when handed a Marshal-serializing
72
+ store and recommends `value_serializer: nil`.
73
+ - **CHANGED**: Documentation for Redis-backed caches, the embedding cache, and
74
+ the synchronize-create lock store (`Parse.synchronize_create_store`) now
75
+ builds the Redis store via `Parse::Cache::Redis` or `value_serializer: nil`
76
+ so a raw `Moneta.new(:Redis, ...)` no longer leaves Marshal on the read path.
77
+
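The "undecodable value is a cache miss" rule can be illustrated with a tiny wrapper around a plain Hash. This is a sketch under stated assumptions: `JsonCacheSketch` is a hypothetical class, the Hash stands in for Redis, and nothing here reproduces the real `Parse::Cache::Redis` internals.

```ruby
require "json"

# Hypothetical stand-in for a JSON-serializing cache wrapper.
class JsonCacheSketch
  def initialize(backend = {})
    @backend = backend # a Hash playing the role of the Redis store
  end

  def write(key, value)
    @backend[key] = JSON.generate(value)
  end

  # Returns the decoded value, or nil (a miss) when the stored bytes are
  # absent or are not valid JSON, e.g. a legacy Marshal entry.
  def read(key)
    raw = @backend[key]
    return nil if raw.nil?
    JSON.parse(raw)
  rescue JSON::JSONError, EncodingError
    nil
  end
end

store = { "legacy" => Marshal.dump({ status: 200 }) }
cache = JsonCacheSketch.new(store)
cache.write("fresh", { "status" => 200, "body" => "ok" })
p cache.read("fresh")
p cache.read("legacy")
```

Because the read path never calls `Marshal.load`, a planted Marshal payload is only ever handed to the JSON parser, which rejects it.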
78
+ #### Internal columns stripped from joined documents on mongo-direct reads
79
+
80
+ - **FIXED**: `Parse::MongoDB.aggregate` now recursively strips Parse-internal
81
+ credential columns (`_hashed_password`, `_session_token`, `_auth_data_*`,
82
+ `_rperm`/`_wperm`, ...) from every result row **and every embedded
83
+ sub-document** for scoped (non-master) callers. Previously a scoped caller
84
+ could embed a foreign class (e.g. `_User` or `_Session`) into an arbitrary
85
+ alias via `$lookup` / `$graphLookup` / `$unionWith` and read back password
86
+ hashes, OAuth tokens, and session tokens: the per-class `protectedFields`
87
+ strip is keyed on the outer class, and the ACL sub-document walk only drops
88
+ ACL-failing sub-documents, so neither covered the aliased foreign document.
89
+ A new `Parse::PipelineSecurity.redact_internal_fields_deep!` runs as the final
90
+ redaction step. Structural columns (`_id`, `_p_*`, `_acl`, timestamps) are
91
+ preserved, so object and ACL reconstruction are unaffected; master-key reads
92
+ are unchanged.
93
+
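A recursive strip of this kind might look like the following. It is a minimal sketch, not the body of `redact_internal_fields_deep!`: the method name `redact_internal_deep!` and the two constant lists are illustrative and cover only the columns named in the entry above.

```ruby
# Illustrative lists; the gem's actual denylist is longer.
INTERNAL_EXACT    = %w[_hashed_password _session_token _rperm _wperm].freeze
INTERNAL_PREFIXES = %w[_auth_data_].freeze

def internal_key?(key)
  INTERNAL_EXACT.include?(key) || INTERNAL_PREFIXES.any? { |p| key.start_with?(p) }
end

# Walks hashes and arrays in place, dropping credential columns at every depth.
def redact_internal_deep!(node)
  case node
  when Hash
    node.delete_if { |k, _| internal_key?(k.to_s) }
    node.each_value { |v| redact_internal_deep!(v) }
  when Array
    node.each { |v| redact_internal_deep!(v) }
  end
  node
end

row = {
  "_id" => "p1",
  "_p_owner" => 'User$u1',
  "owner_doc" => { "_id" => "u1", "username" => "amy",
                   "_hashed_password" => "secret-hash",
                   "_auth_data_github" => { "id" => 1 } },
  "sessions" => [{ "_id" => "s1", "_session_token" => "r:abc" }]
}
p redact_internal_deep!(row)
```

The aliased `owner_doc` sub-document is the case the per-class strip missed: it sits under an arbitrary key, so only a walk over the whole row reaches it.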
94
+ #### Hardened developer-facing mongo-direct aggregation terminals
95
+
96
+ - **FIXED**: Credential columns (`_hashed_password`, `_session_token`,
97
+ `_auth_data_*`, `_email_verify_token`, `_perishable_token`, ...) used as a
98
+ `$match` field name are now refused **unconditionally** on the mongo-direct
99
+ path — even on a pipeline running with `allow_internal_fields: true` (the flag
100
+ that lets SDK-emitted `_rperm`/`_wperm` references through for
101
+ `readable_by_role` / `publicly_readable`). Previously the `*_direct` terminals
102
+ (`count_direct`, `results_direct`, `distinct_direct`, the direct group-by
103
+ helpers) passed `allow_internal_fields: true` unconditionally, so a query
104
+ whose `where` referenced a credential column compiled into a `$match` key that
105
+ bypassed the internal-field screen — a count/match oracle that could bisect a
106
+ bcrypt hash or session token. The ACL columns (`_rperm`/`_wperm`/`_tombstone`)
107
+ remain gated by `allow_internal_fields`, so `readable_by_role` still works.
108
+ - **FIXED**: `Parse::Query#aggregate` and `#aggregate_from_query` now treat a
109
+ scoped query (`session_token` / `scope_to_user` / `scope_to_role`) as
110
+ authoritative over an explicit `mongo_direct: false`. Previously passing
111
+ `mongo_direct: false` on a scoped aggregation skipped the fail-closed guard
112
+ and routed to Parse Server's master-key-only REST `/aggregate` endpoint,
113
+ running the aggregation unscoped (no ACL, CLP, or `protectedFields`). A scoped
114
+ aggregation now promotes to mongo-direct, or fails closed with
115
+ `Parse::Query::MongoDirectRequired` when direct Mongo is unavailable; unscoped
116
+ callers can still opt out to REST with `mongo_direct: false`.
117
+
118
+ #### Additional hardening
119
+
120
+ - **FIXED**: Request/response body logging now redacts credentials. At `:debug`
121
+ level the logging middleware emitted login/signup request bodies (cleartext
122
+ `password`) and auth response bodies (`sessionToken`, `authData`, MFA
123
+ secrets); the body path now runs through the same `BodyBuilder.redact`
124
+ scrubber the header path already used, before truncation.
125
+ - **FIXED**: The `_User` REST endpoints (`fetch_user` / `update_user` /
126
+ `delete_user`) now validate the `objectId` against
127
+ `Parse::API::PathSegment.object_id!` before interpolating it into the path,
128
+ matching the object endpoints. A crafted objectId (e.g. from a compromised
129
+ server response) can no longer traverse to a different endpoint on a
130
+ subsequent request.
131
+ - **CHANGED**: `$sessionToken` / `$session_token` (the camelCase forms of the
132
+ session-token column) are now in `DENIED_FIELD_REFS`, so they cannot be
133
+ laundered through a `$`-field reference in a pipeline.
134
+ - **IMPROVED**: The internal-collection floor (`_SCHEMA` / `_Hooks` /
135
+ `_GlobalConfig` / `_Audit` / ...) is now enforced unconditionally on every
136
+ `$lookup` / `$graphLookup` / `$unionWith` join target in
137
+ `Parse::ACLScope`, not only when lookup-rewriting runs. This closes a
138
+ defense-in-depth gap where an internal class whose CLP lookup returned no
139
+ policy could otherwise have been joinable on the direct path.
140
+ - **IMPROVED**: When the MCP agent server is started on an unauthenticated
141
+ loopback bind with no Origin/custom-header gate configured, it now defaults
142
+ to a loopback-only Origin policy. A browser DNS-rebinding attack against
143
+ `127.0.0.1` carries a non-loopback `Origin` and is refused; native clients
144
+ (which send no `Origin`) and local browser UIs are unaffected. A one-time
145
+ warning points operators at `MCP_API_KEY` / `allowed_origins:` /
146
+ `require_custom_header:` for routable deployments.
147
+
148
+ ### 5.5.0
149
+
150
+ #### Multimodal bytes-fetch path with magic-byte MIME verification
151
+
152
+ - **NEW**: `Parse::Embeddings::ImageFetch` — the SDK-side image download
153
+ layer for image embeddings. Downloads through the existing
154
+ `Parse::File.safe_open_url` SSRF primitive (CIDR blocks, port allowlist,
155
+ DNS-rebinding re-check, size caps, timeouts — no parallel fetch mechanism),
156
+ determines the MIME type **exclusively by magic-byte sniffing** of the
157
+ leading bytes (JPEG / PNG / GIF / WebP), cross-checks the URL extension
158
+ against the sniffed type, and enforces a configurable
159
+ `Parse::Embeddings.allowed_image_types` allowlist. The HTTP `Content-Type`
160
+ header is never consulted, closing the file MIME-laundering gap: a `.jpg`
161
+ URL serving HTML (or PNG bytes behind a JPEG extension) is refused outright.
162
+ - **NEW**: `embed_image ..., source: :bytes` declaration mode. Where the
163
+ default `source: :url` forwards a validated URL for the provider to fetch
164
+ itself (and therefore requires the `trust_provider_url_fetch` sentinel),
165
+ `:bytes` mode has the SDK download, verify, and metadata-strip the image,
166
+ then forward it to the provider as a base64 data URI. No third-party URL
167
+ egress occurs, so the sentinel is not required — but the file's host must
168
+ still be in `Parse::Embeddings.allowed_image_hosts` (deny-all when empty).
169
+
170
+ ```ruby
171
+ class Post < Parse::Object
172
+ property :cover_image, :file
173
+ property :cover_embedding, :vector, dimensions: 1024, provider: :voyage
174
+ embed_image :cover_image, into: :cover_embedding, source: :bytes
175
+ end
176
+ ```
177
+ - **NEW**: EXIF/XMP metadata stripping, **default ON** for the bytes path.
178
+ User-uploaded photos commonly carry GPS coordinates and device serial
179
+ numbers; forwarding them to an embedding provider is a PII egress. JPEG
180
+ APP1 segments (Exif and XMP), PNG `eXIf` chunks, and WebP `EXIF`/`XMP `
181
+ RIFF chunks (with the VP8X flag bits cleared) are removed before the bytes
182
+ leave the process. Opt out per declaration with `exif_strip: false` when
183
+ orientation metadata must be preserved.
184
+ - **NEW**: `Voyage#embed_image` and `Cohere#embed_image` accept
185
+ `Parse::Embeddings::ImageFetch::FetchedImage` sources alongside URL
186
+ Strings (forms may be mixed in one batch). Fetched bytes ride Voyage's
187
+ `image_base64` content row and Cohere's `image_url` data-URI form.
188
+ - **NEW**: `Parse::Embeddings.allowed_image_types=` — MIME allowlist for the
189
+ bytes path (default JPEG/PNG/GIF/WebP; SVG deliberately excluded as
190
+ script-capable active content).
191
+ - **ENHANCED**: `Parse::Embeddings.validate_image_url!` accepts
192
+ `mode: :fetch` for SDK-side downloads — same host allowlist,
193
+ obfuscated-IP screen, port and CIDR checks as the default `:forward`
194
+ mode, minus the provider-egress sentinel that doesn't apply when no URL
195
+ is forwarded.
196
+
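Magic-byte sniffing for the four listed formats needs only the first twelve bytes of the download. The sketch below uses the standard file signatures (JPEG `FF D8 FF`, the 8-byte PNG signature, `GIF87a`/`GIF89a`, and `RIFF....WEBP`); the method name `sniff_image_mime` is hypothetical and this is not the `ImageFetch` implementation.

```ruby
# Returns a MIME type derived only from the leading bytes, or nil if the
# content is not one of the four recognized raster formats.
def sniff_image_mime(bytes)
  head = bytes.byteslice(0, 12).to_s.b # work on raw bytes, ignoring encoding
  return "image/jpeg" if head.start_with?("\xFF\xD8\xFF".b)
  return "image/png"  if head.start_with?("\x89PNG\r\n\x1A\n".b)
  return "image/gif"  if head.start_with?("GIF87a", "GIF89a")
  return "image/webp" if head.byteslice(0, 4) == "RIFF" && head.byteslice(8, 4) == "WEBP"
  nil
end

png_bytes  = "\x89PNG\r\n\x1A\n".b + "rest-of-file"
html_bytes = "<html><body>not an image</body></html>"

p sniff_image_mime(png_bytes)
p sniff_image_mime(html_bytes)
```

A caller would then compare the sniffed type with the type implied by the URL extension and refuse on mismatch, which is how PNG bytes behind a `.jpg` URL get rejected.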
197
+ #### Embedding-model migration tooling
198
+
199
+ - **NEW**: `Class.reembed!(field:, batch_size:, limit:, where:, only_stale:,
200
+ save_opts:)` — bulk re-embed for provider/model migrations. Unlike
201
+ `embed_pending!` (which only fills null vectors), `reembed!` walks every
202
+ row with objectId-cursor pagination, clears the digest sibling so the
203
+ save-path recompute cannot elide the provider call, and saves. With
204
+ `only_stale: true` the walk skips rows whose recorded provenance already
205
+ matches the current provider, model, and dimensions — making a partially
206
+ failed migration resumable.
207
+ - **NEW**: `embed` / `embed_image` auto-declare an `<into>_meta` `:object`
208
+ sibling property recording `{ provider, model, dimensions, modality,
209
+ embedded_at }` on every recompute (cleared when the source clears).
210
+ This is the provenance record `reembed!(only_stale: true)` reads, and it
211
+ tells operational tooling which model produced any stored vector.
212
+ Override the name with `meta_field:`.
213
+
214
+ #### Bulk embedding and query-embed caching
215
+
216
+ - **NEW**: `Parse::Embeddings::BatchEmbedder` — batch-level orchestration
217
+ for bulk embedding jobs. Wraps any registered provider with batch slicing
218
+ (defaulting to the provider's own batch-size hint), requests-per-minute
219
+ pacing between calls, and batch-level exponential backoff with jitter on
220
+ rate-limit / transient errors (previously backoff lived only inside each
221
+ provider's single HTTP call). A batch that exhausts its attempts raises
222
+ `BatchEmbedder::BatchFailed` carrying `batch_index` and `completed_count`
223
+ so a resumable job knows where to pick up. Supports `retry_on:` exception
224
+ overrides and an `on_progress:` callback.
225
+ - **NEW**: `Parse::Embeddings::Cache` — process-local embedding cache keyed
226
+ by `(provider, model, dimensions, input_type, SHA-256(input))`, disabled by
227
+ default. Dimensions participate in the key so two registrations of the
228
+ same Matryoshka-capable model at different output widths never serve each
229
+ other's vectors.
230
+ `Parse::Embeddings::Cache.enable!(max_entries:, ttl:)` activates an LRU +
231
+ TTL store (or pass `store:` for a custom backend); repeated identical
232
+ query embeds through `find_similar(text:)`, `hybrid_search(text:)`, and
233
+ `Parse::Retrieval.retrieve` then skip the provider round-trip. Cache hits
234
+ emit the standard `parse.embeddings.embed` notification with
235
+ `cached: true`, so existing spend subscribers see hits and misses on one
236
+ stream. The input text is hashed before keying — plaintext queries never
237
+ land in a shared store.
238
+
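The cache-key tuple above can be illustrated with the standard library alone. The helper name `embedding_cache_key`, the `:` separator, and the sample provider and model strings are assumptions for illustration; only the five key components and the use of SHA-256 come from the changelog.

```ruby
require "digest"

# Builds a key from the five components; the input text is hashed so the
# plaintext never appears in the key.
def embedding_cache_key(provider:, model:, dimensions:, input_type:, input:)
  [provider, model, dimensions, input_type, Digest::SHA256.hexdigest(input)].join(":")
end

query = "where is my refund"
wide   = embedding_cache_key(provider: "voyage", model: "example-model",
                             dimensions: 1024, input_type: "query", input: query)
narrow = embedding_cache_key(provider: "voyage", model: "example-model",
                             dimensions: 256, input_type: "query", input: query)

puts wide
puts narrow
```

The two keys differ only in the dimensions component, which is what keeps a 256-wide registration from being served a 1024-wide vector.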
239
+ #### Vector index drift detection
240
+
241
+ - **NEW**: first-query verification of deployed Atlas vectorSearch indexes.
242
+ When `find_similar` / `hybrid_search` auto-discovers an index, the SDK now
243
+ compares the index's `numDimensions` and `similarity` against the
244
+ `:vector` property declaration, and — when the class registers an
245
+ `agent_tenant_scope` — confirms the scope field is declared as a
246
+ `type: "filter"` path (without it, every tenant-scoped
247
+ `$vectorSearch.filter` fails Atlas-side). Findings are computed once per
248
+ (class, field, index) per process and governed by
249
+ `Parse::VectorSearch.index_drift_policy`: `:warn` (default) emits a
250
+ `[Parse::VectorSearch:DRIFT]` warning on the first check; `:raise` raises
251
+ `Parse::Core::VectorSearchable::IndexDriftError` on **every** query
252
+ against the drifted index, so strict deployments never serve degraded
253
+ results after the first failure; `:ignore` skips verification. An
254
+ explicit `index:` kwarg is verified best-effort when the catalog's
255
+ covering index carries the same name (lookup failures never fail the
256
+ query).
257
+
258
+ #### Hybrid search hardening
259
+
260
+ - **FIXED**: on the opt-in native `$rankFusion` path, a scoped (non-master)
261
+ caller's `_hybrid_score` is now recomputed from the post-ACL visible
262
+ ordering instead of surfacing the raw fused score. The raw score is
263
+ materialized before the ACL `$match`, so it encoded a surviving row's
264
+ rank among rows the caller cannot read — a cross-tenant/cross-ACL
265
+ inference channel for callers probing with crafted queries. The
266
+ recomputed score is monotone with the true fused order but is a function
267
+ of visible rows only. Master-key results and the default client-side RRF
268
+ path (which ranks from already-filtered rows) are unchanged.
269
+ - **FIXED**: the `$rankFusion` support probe no longer classifies MongoDB
270
+ authorization errors as "stage unsupported". The probe's
271
+ unrecognized-stage matching included the broad phrase "is not allowed",
272
+ which also appears in auth failures ("not allowed to execute command
273
+ aggregate") and could cache the wrong verdict for the probe TTL. Matching
274
+ is narrowed to unambiguous unknown-stage phrases; any other failure is
275
+ treated as supported and the real query surfaces the real error, with
276
+ the client-side path as the standing fallback.
277
+
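One way to obtain a score that is "monotone with the true fused order but a function of visible rows only" is to re-rank the surviving rows by position. The changelog does not give the formula, so the reciprocal-rank form `1 / (k + rank)` with `k = 60` used below is an assumption, as is the helper name `rescore_visible`.

```ruby
# rows_in_fused_order: rows that survived the ACL filter, still in fused order.
# Assumption: a reciprocal-rank score over visible positions only.
def rescore_visible(rows_in_fused_order, k: 60)
  rows_in_fused_order.each_with_index.map do |row, index|
    row.merge("_hybrid_score" => 1.0 / (k + index + 1))
  end
end

visible = [{ "objectId" => "a" }, { "objectId" => "b" }, { "objectId" => "c" }]
rescored = rescore_visible(visible)
rescored.each { |r| puts "#{r['objectId']}: #{r['_hybrid_score']}" }
```

Since the score depends only on a row's position among visible rows, it reveals nothing about how many unreadable rows ranked above it.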
278
+ #### Retrieval spend-cap and filter hardening
279
+
280
+ - **NEW**: `Parse::Embeddings::SpendCap.configure(..., warn_at: 0.8)` —
281
+ soft-cap alerting. When a charge pushes a tenant's in-window usage across
282
+ the given fraction of its hard limit, a
283
+ `parse.embeddings.spend_cap_warning` ActiveSupport::Notifications event
284
+ is emitted (`tenant_id`, `used`, `limit`, `window`, `warn_at`,
285
+ `threshold`), once per crossing and re-arming as the window rolls off —
286
+ an operator alerting hook that fires BEFORE the hard refuse trips.
287
+ Disabled unless configured. Note the cap deliberately charges before the
288
+ query-embed cache lookup, so cache hits bill at full price: it bounds
289
+ query volume (an abuse control), not just provider spend.
290
+ - **NEW**: `Parse::Embeddings::Cache::MonetaStore` — persistent-L2 adapter
291
+ for the embedding cache. Wraps any Moneta-compatible store (`[]`/`[]=`,
292
+ optional `store(key, value, expires:)`) behind the cache's `get`/`set`
293
+ duck, with key namespacing and TTL forwarding, so
294
+ `Cache.enable!(store: MonetaStore.new(moneta, ttl: 30 * 24 * 3600))`
295
+ shares query-embed entries across processes and restarts. Fail-open: a
296
+ backend error degrades to a cache miss / dropped write, never a failed
297
+ embed. Cache keys are input hashes — plaintext queries never land in the
298
+ shared store.
299
+ - **NEW**: embedding spend-cap coverage on every query-embed path. The
300
+ per-tenant `Parse::Embeddings::SpendCap` was previously charged only at
301
+ the `semantic_search` agent-tool boundary; direct `find_similar(text:)`,
302
+ `hybrid_search(text:)`, and `Parse::Retrieval.retrieve` callers bypassed
303
+ it. The shared query-embed path now charges via
304
+ `SpendCap.charge_query!` — tenant identity resolves to the ambient
305
+ `Parse.with_cache_tenant` scope when set, else the shared default bucket.
306
+ The agent tool wraps its retrieval in the new `SpendCap.with_precharged`
307
+ block so a query it already charged with per-tenant identity is not
308
+ double-billed (and admin-exempt queries are not billed to the shared
309
+ bucket). As before, the cap is a no-op until configured.
310
+ - **NEW**: pointer-value translation for caller-supplied retrieval filters.
311
+ `Parse::Retrieval.retrieve` (and through it the `semantic_search` agent
312
+ tool) now rewrites Parse pointer values — `Parse::Pointer` /
313
+ `Parse::Object` instances and wire-form `{"__type": "Pointer"}` hashes,
314
+ including inside `$in` / `$eq` / `$ne` operator hashes — into their
315
+ MongoDB storage form, so `{ owner: some_user }` becomes
316
+ `{ "_p_owner" => "_User$abc123" }` and actually matches rows. Previously
317
+ a pointer-valued filter silently matched nothing. Translation runs after
318
+ the underscore-key gate and filter-field allowlist (callers still cannot
319
+ name `_p_*` columns directly) and before the tenant-scope fold. The
320
+ standalone helper is `Parse::Retrieval.translate_pointer_filter_values`.
321
+ - **IMPROVED**: `Parse::Schema::SearchIndexMigrator` auto-includes the
322
+ model's registered `agent_tenant_scope` field as a `type: "filter"` path
323
+ when planning or applying `vectorSearch` index declarations. Newly created
324
+ indexes support tenant-scoped pre-filtering out of the box; existing
325
+ indexes missing the path surface as `drifted:` in the plan instead of
326
+ failing at query time.
327
+
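The pointer rewrite in this section can be sketched for wire-form hashes only. This is not `Parse::Retrieval.translate_pointer_filter_values`: the helper names are hypothetical, `Parse::Pointer` / `Parse::Object` instances are not handled, and only string-keyed pointer hashes are recognized.

```ruby
def pointer?(value)
  value.is_a?(Hash) && value["__type"] == "Pointer"
end

# "_User" + "abc123" => the "<className>$<objectId>" storage string.
def storage_form(pointer)
  [pointer["className"], pointer["objectId"]].join("$")
end

def translate_value(value)
  return storage_form(value) if pointer?(value)
  return value.map { |v| translate_value(v) } if value.is_a?(Array)
  value
end

def contains_pointer?(value)
  case value
  when Hash  then pointer?(value) || value.values.any? { |v| contains_pointer?(v) }
  when Array then value.any? { |v| contains_pointer?(v) }
  else false
  end
end

# Rewrites pointer-valued fields to their _p_<field> storage column.
def translate_pointer_filter(filter)
  filter.each_with_object({}) do |(field, value), out|
    if pointer?(value)
      out["_p_#{field}"] = storage_form(value)
    elsif value.is_a?(Hash) && contains_pointer?(value)
      out["_p_#{field}"] = value.transform_values { |v| translate_value(v) }
    else
      out[field.to_s] = value
    end
  end
end

user = { "__type" => "Pointer", "className" => "_User", "objectId" => "abc123" }
p translate_pointer_filter(owner: user, status: "open")
p translate_pointer_filter(owner: { "$in" => [user] })
```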
328
+ #### Opt-in Unicode regex matching for text constraints
329
+
330
+ - **NEW**: `starts_with`, `contains`, `ends_with`, and `like`/`regex` now accept
331
+ an opt-in `{ value:, unicode: true }` form that appends the `u` (Unicode) flag
332
+ to the compiled `$options`, enabling correct multibyte case-insensitive
333
+ matching for accented and non-Latin text (for example `café` matching
334
+ `CAFÉ`, or CJK characters).
335
+
336
+ ```ruby
337
+ Post.where(:title.starts_with => { value: "café", unicode: true })
338
+ # => "title": { "$regex": "^café", "$options": "iu" }
339
+
340
+ Post.where(:title.like => { value: /café/i, unicode: true })
341
+ # => "title": { "$regex": "café", "$options": "iu" }
342
+ ```
343
+
344
+ The flag is strictly opt-in: the bare-value forms
345
+ (`:title.starts_with => "café"`) compile exactly as before with `$options: "i"`,
346
+ so existing queries are unchanged. The `u` flag is honored by Parse Server
347
+ 8.3.0+ over the REST query interface and by MongoDB 6.1+ on the mongo-direct
348
+ query path; older Parse Servers reject it, which is why it is never emitted
349
+ unless requested.
350
+
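The opt-in rule (hash form with `unicode: true` yields `"iu"`, everything else stays `"i"`) reduces to a few lines. `compile_starts_with` is a hypothetical stand-in for the constraint compiler, and escaping the value with `Regexp.escape` is an assumption about how the pattern is built.

```ruby
# Accepts either a bare String or the { value:, unicode: } hash form.
def compile_starts_with(arg)
  if arg.is_a?(Hash)
    value = arg.fetch(:value)
    unicode = arg[:unicode] == true
  else
    value = arg
    unicode = false
  end
  { "$regex" => "^#{Regexp.escape(value)}", "$options" => unicode ? "iu" : "i" }
end

p compile_starts_with("café")
p compile_starts_with({ value: "café", unicode: true })
```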
351
+ #### ACL permission query hardening
352
+
353
+ - **FIXED**: `readable_by`, `writable_by`, `readable_by_role`,
354
+ `writable_by_role`, `publicly_readable`, and `publicly_writable` no longer
355
+ raise a pipeline-security error when they auto-route through the direct
356
+ MongoDB path. These constraints compile to an aggregation `$match` on the
357
+ internal `_rperm` / `_wperm` permission columns, and the internal-fields
358
+ denylist that protects user-supplied pipelines from referencing
359
+ server-internal columns was also rejecting these SDK-generated references.
360
+ The aggregation runner now forwards the `allow_internal_fields` sanction for
361
+ pipelines built entirely from SDK constraint translation — matching the
362
+ parity already held by the `results_direct` / `count_direct` /
363
+ `distinct_direct` helpers — so public-read detection (`publicly_readable`,
364
+ `readable_by("*")`) and role/user permission filtering work again. The
365
+ sanction is scoped to SDK-built ACL pipelines only; caller-supplied
366
+ aggregation pipelines remain subject to the full denylist, so they still
367
+ cannot reference password hashes, session tokens, or other internal columns.
368
+ - **FIXED**: `Query#count` now routes ACL permission filters
369
+ (`publicly_readable.count`, `readable_by(...).count`, and friends) through
370
+ the direct MongoDB path, mirroring `Query#results`. Previously `count` only
371
+ switched to the direct path for subquery `$lookup` stages, so an ACL count
372
+ was sent to Parse Server's REST aggregate endpoint, which cannot express a
373
+ `$match` on `_rperm` / `_wperm`.
374
+ - **FIXED**: the scalar aggregation terminals — `Query#sum`, `#average`,
375
+ `#min`, `#max`, `#distinct`, and `#count_distinct` — now honor ACL
376
+ permission filters and scoped queries. They funnel through `Query#aggregate`,
377
+ which previously only switched to the direct MongoDB path for subquery
378
+ `$lookup` stages. An ACL filter (`readable_by(...).sum(:plays)`) was sent to
379
+ Parse Server's REST aggregate endpoint, which cannot express a `$match` on
380
+ `_rperm` / `_wperm`. More seriously, a **scoped** terminal
381
+ (`scope_to_user(u).sum(:plays)`, `scope_to_role`, or a `session_token`)
382
+ reached the same REST endpoint, which is master-key-only and enforces
383
+ neither ACL nor CLP — so the aggregate ran unscoped as the master key,
384
+ computing the result over rows the caller cannot read. `Query#aggregate` now
385
+ routes to mongo-direct whenever the query is scoped or the pipeline
386
+ references the ACL columns, and **fails closed** (raises
387
+ `Parse::Query::MongoDirectRequired`) for a scoped terminal when mongo-direct
388
+ is unavailable, rather than silently bypassing enforcement. The same
389
+ contract covers the inline-pipeline terminals: a scoped `Query#count` or
390
+ `Query#results` whose constraints compile to an aggregation pipeline
391
+ (e.g. `:field.size`) promotes to mongo-direct and fails closed identically
392
+ instead of falling back to REST `/aggregate`.
393
+ - **FIXED**: `not_publicly_readable` / `not_publicly_writable` (and the
394
+ `:ACL.not_readable_by` / `:ACL.not_writable_by` constraints) no longer return
395
+ the rows they are meant to exclude. They compiled to `{ _rperm: { $nin:
396
+ [...] } }`, and MongoDB's `$nin` matches documents where the field is
397
+ **absent** — and a missing `_rperm` is treated by Parse Server as public.
398
+ A security audit using `not_publicly_writable` to find safe objects silently
399
+ excluded write-exposed (public-by-absence) objects. The constraints now carry
400
+ an `$exists: true` guard. "Not readable by X" additionally expands the
401
+ principal's roles and excludes publicly-readable rows (a public row is
402
+ readable by everyone, so it cannot be "not readable by X").
403
+ - **FIXED**: `readable_by([])` / `writable_by([])` and the `:none` / `nil`
404
+ forms no longer raise `ArgumentError`; they now compile to the documented
405
+ "no permissions" match (an explicit empty `_rperm` / `_wperm`). Symbol
406
+ principals (`:public`, `:everyone`, `:world`) are accepted and map to the
407
+ public wildcard, matching the String forms.
408
+ - **FIXED**: `PrivateAclConstraint` (`:ACL.private_acl` / `master_key_only`)
409
+ no longer classifies public-by-absence rows as private. A truly master-key-
410
+ only object has an explicit empty `_rperm` **and** `_wperm`; a missing
411
+ column is public, the opposite of private, so the missing-field branch was
412
+ removed. `private_acl => false` is now the exact complement.
413
+ - **FIXED**: role expansion for `readable_by` / `writable_by` /
414
+ `readable_by_role` / `writable_by_role` now always includes the role's own
415
+ name in the permission set. The upward-inheritance walk yields nothing for
416
+ an unpersisted role (objectId still nil), which previously dropped the role
417
+ entirely and raised "no valid permissions"; the role's own `role:<name>`
418
+ entry is now appended idempotently, so persisted roles compile unchanged.
419
+ - **CHANGED**: a mistyped ACL permission no longer vanishes silently. An
420
+ unrecognized element in a `readable_by` / `writable_by` array (or an
421
+ unsupported Symbol) now raises `ArgumentError` instead of being dropped from
422
+ the permission set, which would silently weaken the intended filter.
423
+ - **NEW**: `strict:` option on `readable_by` / `writable_by` /
424
+ `readable_by_role` / `writable_by_role` (and the `:ACL.readable_by_exact` /
425
+ `writable_by_exact` / `*_by_role_exact` operators) for an **exact** match —
426
+ only rows whose `_rperm` / `_wperm` literally contains one of the resolved
427
+ permissions, with no implicit public `"*"` and no missing-field rows. The
428
+ default remains inclusive (access-simulation) semantics; `strict: true` is
429
+ the right choice for ownership and security audits.
430
+ - **NEW**: `Query#not_readable_by` / `#not_writable_by` chained methods, the
431
+ fluent counterparts to the existing `:ACL.not_readable_by` symbol operators.
+ - **BREAKING**: the British-spelled `:ACL.writeable_by` operator now resolves
+ to the same public-inclusive, role-expanding implementation as
+ `:ACL.writable_by`. Previously the one-letter spelling difference selected a
+ separate, strict, non-role-expanding constraint, so `writeable_by` and
+ `writable_by` silently produced different result sets. Code that relied on
+ the old strict behavior of `writeable_by` should pass `strict: true` (or use
+ the `:writable_by_exact` operator).
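
The difference between the default inclusive semantics and `strict: true` described in the ACL entries above can be modeled in plain Ruby. This is only an illustrative sketch: the rows are ordinary hashes, and `matches_acl?` is a hypothetical helper written for this example, not a method of the gem. It assumes, as the entries state, that a missing `_rperm` column counts as public and that `"*"` is the public wildcard.

```ruby
# Illustrative model only: plain hashes stand in for stored rows, and
# `matches_acl?` is a hypothetical helper, not part of parse-stack-next.
def matches_acl?(row, permissions, strict: false)
  rperm = row["_rperm"]
  if strict
    # Exact match: the column must exist and literally contain one of the
    # resolved permissions. No public wildcard, no missing-field rows.
    return !rperm.nil? && (rperm & permissions).any?
  end
  # Inclusive (access simulation): a missing column is public, and "*"
  # grants read access to everyone.
  rperm.nil? || rperm.include?("*") || (rperm & permissions).any?
end

rows = [
  { "id" => "owned",   "_rperm" => ["role:Admin"] },
  { "id" => "public",  "_rperm" => ["*"] },
  { "id" => "missing" },                 # no _rperm column at all
  { "id" => "private", "_rperm" => [] }, # explicit empty: master-key-only
]
perms = ["role:Admin"]

inclusive = rows.select { |r| matches_acl?(r, perms) }.map { |r| r["id"] }
strict    = rows.select { |r| matches_acl?(r, perms, strict: true) }.map { |r| r["id"] }
```

In this model the inclusive filter answers "what could this principal read?", while the strict filter answers "which rows name this principal explicitly?", which is why the latter suits ownership and security audits.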
+
+ #### Webhook after_save callback hardening
+
+ - **FIXED**: the model's chained `after_save` / `after_create` callbacks now
+ fire exactly once per `afterSave` delivery, even when an app registers both a
+ class-specific handler (`webhook :after_save, MyClass`) and a catch-all
+ handler (`webhook :after_save, "*"`). The webhook endpoint dispatches every
+ trigger to both the class route and the `"*"` route, and the callback chain
+ previously ran inside each route, so an app with both handlers fired its
+ model `after_save` twice (e.g. two emails per save). The chain now runs once,
+ after both routes are dispatched. The existing behavior is otherwise
+ preserved: an `afterSave` for a class with no registered handler never fires
+ model callbacks, and trusted Ruby-initiated saves still skip the webhook-side
+ callbacks so the local `run_callbacks :save` is the single fire.
+ - **FIXED**: a chained `after_save` or `after_create` callback that raises
+ during an `afterSave` webhook no longer crashes the webhook endpoint or
+ suppresses the other phase's side effects. Because `afterSave` fires after the
+ object is already persisted and Parse Server discards the response body, the
+ `after_create` and `after_save` phases now run independently and any
+ `StandardError` they raise is logged and swallowed (mirroring Parse Server's
+ own afterSave semantics). A raising `after_create :send_welcome_email` no
+ longer silently skips an unrelated `after_save :reindex`, and an uncaught
+ callback error can no longer return a 500 to Parse Server.
+ - **FIXED**: `Parse::Webhooks::Payload#ruby_initiated?` now memoizes a `false`
+ result stably instead of re-deriving it on every call. The prior `||=`
+ memoization recomputed whenever the cached value was `false`, so a stamped
+ `false` could be re-derived inconsistently; the detection result is now cached
+ exactly once.
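
The `||=` pitfall mentioned in the last entry is a general Ruby behavior and can be reproduced in isolation. The sketch below uses two hypothetical classes (`NaiveMemo` and `StableMemo`), and `detect` is a stand-in for whatever derivation the real method performs. It is not the gem's implementation, and the `defined?` guard is only one common way to cache a falsy value.

```ruby
# Hypothetical classes for illustration; `detect` stands in for the real
# detection logic and simply counts how often it is invoked.
class NaiveMemo
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def ruby_initiated?
    # `||=` re-evaluates the right-hand side whenever the cached value is
    # false or nil, so a false result is never actually cached.
    @ruby_initiated ||= detect
  end

  private

  def detect
    @calls += 1
    false
  end
end

class StableMemo
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def ruby_initiated?
    # `defined?` is truthy once the ivar has been assigned, even to false.
    return @ruby_initiated if defined?(@ruby_initiated)
    @ruby_initiated = detect
  end

  private

  def detect
    @calls += 1
    false
  end
end

naive = NaiveMemo.new
3.times { naive.ruby_initiated? }

stable = StableMemo.new
3.times { stable.ruby_initiated? }
```

After three calls each, the naive version has run `detect` three times and the stable version only once.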
+
+ #### `verify_password` client-side rate-limit parity
+
+ - **CHANGED**: `verify_password` now participates in the same client-side login
+ rate limit as `login`. It calls the rate-limit guard before issuing the
+ request and records the result afterward, keyed on the bare username so
+ failures share a bucket with `login`: an attacker cannot sidestep a `login`
+ lockout by pivoting to the `verify_password` credential oracle. Because the
+ bucket is shared, a run of failed step-up / re-authentication calls counts
+ toward (and can trigger) the primary login lockout for that username. As with
+ `login`, this is a convenience guard, not a security boundary; server-side
+ rate limiting remains the real control.
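
The guard-then-record pattern with a shared bucket can be sketched as follows. Everything here is an assumption made for illustration: `LoginRateLimiter`, its `check!` / `record` methods, the threshold of three failures, and the choice to clear the bucket on success are hypothetical and do not describe the gem's internal limiter.

```ruby
# Hypothetical in-memory limiter; class name, method names, and the
# threshold are assumptions for this example only.
class LoginRateLimiter
  class LockedOut < StandardError; end

  def initialize(max_failures: 3)
    @max_failures = max_failures
    @failures = Hash.new(0)
  end

  # Guard: refuse before any request is issued once the bucket is full.
  def check!(username)
    return if @failures[username] < @max_failures
    raise LockedOut, "too many failed attempts for #{username}"
  end

  # Record: failures accumulate; a success clears the bucket (an assumption).
  def record(username, success:)
    if success
      @failures.delete(username)
    else
      @failures[username] += 1
    end
  end
end

limiter = LoginRateLimiter.new(max_failures: 3)

# Both entry points key on the bare username, so they share one bucket.
attempt = lambda do |username, ok|
  limiter.check!(username)
  limiter.record(username, success: ok)
  ok
end

2.times { attempt.call("alice", false) } # two failed `login` attempts
attempt.call("alice", false)             # one failed `verify_password` attempt

locked = begin
  attempt.call("alice", true)            # refused even with a correct password
  false
rescue LoginRateLimiter::LockedOut
  true
end
```

Because the third failure arrives through the second entry point yet lands in the same bucket, the next attempt for that username is refused regardless of which call makes it.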
+
+ #### Cloud function results are server-authoritative
+
+ - **IMPROVED**: Documented that decoded cloud function results are treated as
+ server-authoritative. A cloud function that returns a Parse object decodes
+ through the same trusted path as every query and `fetch` result, so
+ server-set fields on the returned object (including `sessionToken` on a
+ returned user) are preserved rather than stripped, consistent with how the
+ rest of the SDK hydrates server responses. If a cloud function is expected to
+ echo back third-party-influenced data that you want to sanitize yourself,
+ call it with `raw: true` (`Parse.call_function(name, body, raw: true)`) to
+ receive the undecoded response before any object is built.
+
  ### 5.4.1
 
  #### Webhook after_save callback fix
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- parse-stack-next (5.4.1)
+ parse-stack-next (5.5.1)
  activemodel (>= 6.1, < 9)
  activesupport (>= 6.1, < 9)
  connection_pool (>= 2.2, < 4)