parse-stack-next 5.4.1 → 5.5.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +489 -0
- data/Gemfile.lock +1 -1
- data/README.md +61 -9
- data/docs/atlas_vector_search_guide.md +318 -19
- data/lib/parse/acl_scope.rb +11 -0
- data/lib/parse/agent/mcp_rack_app.rb +53 -14
- data/lib/parse/agent/mcp_server.rb +19 -0
- data/lib/parse/api/path_segment.rb +31 -0
- data/lib/parse/api/users.rb +13 -0
- data/lib/parse/cache/redis.rb +55 -11
- data/lib/parse/client/caching.rb +12 -3
- data/lib/parse/client/logging.rb +9 -0
- data/lib/parse/client.rb +37 -3
- data/lib/parse/embeddings/batch_embedder.rb +188 -0
- data/lib/parse/embeddings/cache.rb +374 -0
- data/lib/parse/embeddings/cohere.rb +31 -18
- data/lib/parse/embeddings/image_fetch.rb +347 -0
- data/lib/parse/embeddings/provider.rb +17 -11
- data/lib/parse/embeddings/spend_cap.rb +117 -3
- data/lib/parse/embeddings/voyage.rb +34 -25
- data/lib/parse/embeddings.rb +40 -3
- data/lib/parse/model/acl.rb +15 -11
- data/lib/parse/model/core/embed_managed.rb +243 -14
- data/lib/parse/model/core/properties.rb +42 -5
- data/lib/parse/model/core/vector_searchable.rb +157 -8
- data/lib/parse/mongodb.rb +12 -0
- data/lib/parse/pipeline_security.rb +81 -15
- data/lib/parse/query/constraint.rb +22 -0
- data/lib/parse/query/constraints.rb +271 -250
- data/lib/parse/query.rb +284 -43
- data/lib/parse/retrieval/agent_tool.rb +21 -14
- data/lib/parse/retrieval/retriever.rb +84 -0
- data/lib/parse/schema/search_index_migrator.rb +48 -1
- data/lib/parse/stack/version.rb +1 -1
- data/lib/parse/stack.rb +12 -1
- data/lib/parse/vector_search/hybrid.rb +39 -1
- data/lib/parse/vector_search.rb +34 -0
- data/lib/parse/webhooks/payload.rb +7 -1
- data/lib/parse/webhooks.rb +107 -21
- metadata +4 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 8fed615f71ab3b45bd9f10e2947c50ebbcba075b15b0a8786f0a269d4e59ebd6
|
|
4
|
+
data.tar.gz: e9528b3f4bc811cef21494089f6e5eb5aaed43732ddf926a64ccc2a050d8742d
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 0d1d0ee29e3787585f246e8b006f81aa514bc85732fe32c1c175f5734d60b8fadbf5f2d127f7d7fa6df38e49303e334ad2c31a10d85cb3b7e7f8eba1a1bf836d
|
|
7
|
+
data.tar.gz: 8c032babfcc42f16327a874d4cbf358ed7aade09157da7e29dd879578454b23cff7e7cbc3e860fec7891c8e43c12b1362778039550bf3371c63786d122fb9ae7
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,494 @@
|
|
|
1
1
|
## parse-stack-next Changelog
|
|
2
2
|
|
|
3
|
+
### 5.5.1
|
|
4
|
+
|
|
5
|
+
#### Mongo-direct reads inside `Parse.with_session` are now scoped, not master
|
|
6
|
+
|
|
7
|
+
- **FIXED**: A query that auto-routes to the mongo-direct path because of a
|
|
8
|
+
direct-only constraint (for example a geo `$near` / `$geoIntersects` query)
|
|
9
|
+
now honors the ambient session token set by `Parse.with_session(token)`.
|
|
10
|
+
Previously the mongo-direct auth resolver consulted only the query's own
|
|
11
|
+
`session_token=` / `scope_to_user` / `scope_to_role` and ignored the
|
|
12
|
+
fiber-local ambient session, so in server mode it fell through to a
|
|
13
|
+
master-key read with no ACL/CLP enforcement — returning rows the session was
|
|
14
|
+
not permitted to see, even though every REST query in the same
|
|
15
|
+
`with_session` block was correctly scoped. The resolver now mirrors
|
|
16
|
+
`Parse::Client#request` precedence: an explicit per-query token wins, then
|
|
17
|
+
the ambient session, then the master-key fallback; an explicit
|
|
18
|
+
`use_master_key: true` is a deliberate admin call and still skips the
|
|
19
|
+
ambient. Routing also accepts the ambient on non-master clients
|
|
20
|
+
(`Parse.client_mode` or a user-scoped client), so such a query runs scoped
|
|
21
|
+
rather than raising.
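The precedence described in this entry can be sketched in plain Ruby. This is an illustration only, not the gem's resolver: `resolve_read_auth` and its keyword arguments are hypothetical names, and the way an explicit token combines with `use_master_key: true` is an assumption the changelog does not spell out.

```ruby
# Hypothetical sketch of the read-auth precedence; not the gem's actual resolver.
def resolve_read_auth(query_token: nil, ambient_token: nil, use_master_key: nil)
  # 1. An explicit per-query token wins (assumed to win over everything else).
  return [:session, query_token] if query_token
  # 2. An explicit use_master_key: true is a deliberate admin call and skips the ambient session.
  return [:master, nil] if use_master_key == true
  # 3. Otherwise the ambient session (set by an enclosing with_session block) scopes the read.
  return [:session, ambient_token] if ambient_token
  # 4. Nothing scoped the query: master-key fallback.
  [:master, nil]
end

resolve_read_auth(ambient_token: "r:ambient")
resolve_read_auth(query_token: "r:explicit", ambient_token: "r:ambient")
resolve_read_auth(ambient_token: "r:ambient", use_master_key: true)
```

Before the fix, the mongo-direct path behaved as if step 3 were missing, so a query inside a session block with no per-query token fell straight through to step 4.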
|
|
22
|
+
|
|
23
|
+
#### Boolean property coercion no longer treats the string "false" as true
|
|
24
|
+
|
|
25
|
+
- **FIXED**: A `:boolean` property assigned a string now coerces via
|
|
26
|
+
ActiveModel's boolean caster instead of raw Ruby truthiness. Previously the
|
|
27
|
+
coercion was `val ? true : false`, so the strings `"false"`, `"0"`, and
|
|
28
|
+
`"off"` — exactly what arrives on a Rails-form or query-string ingestion
|
|
29
|
+
path — all coerced to `true`, silently flipping a boolean the wrong way (for
|
|
30
|
+
example an `archived` flag or an application-defined access gate). String
|
|
31
|
+
forms now map correctly (`"false"`/`"0"`/`"off"` to `false`), a blank string
|
|
32
|
+
is treated as unset (`nil`), and native booleans from Parse wire JSON pass
|
|
33
|
+
through unchanged.
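A minimal plain-Ruby sketch of the difference between truthiness coercion and string-aware casting. The gem delegates to ActiveModel's boolean caster; `cast_boolean` and `FALSE_STRINGS` below are hypothetical stand-ins that only approximate that behavior.

```ruby
# Illustrative stand-in for a boolean caster; not the gem's or ActiveModel's code.
FALSE_STRINGS = %w[0 f F false FALSE off OFF].freeze

def cast_boolean(val)
  return val if val == true || val == false # native booleans pass through unchanged
  return nil if val.nil? || (val.is_a?(String) && val.strip.empty?) # blank means unset
  !FALSE_STRINGS.include?(val.to_s)
end

# The old coercion: every non-nil, non-false object is truthy, including "false".
old_style = ->(val) { val ? true : false }

old_style.call("false") # => true
cast_boolean("false")   # => false
cast_boolean("")        # => nil
```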
|
|
34
|
+
|
|
35
|
+
#### Deprecation warning for setting ACL via mass-assignment
|
|
36
|
+
|
|
37
|
+
- **DEPRECATED**: Setting `acl`/`ACL` through mass-assignment
|
|
38
|
+
(`Parse::Object#attributes=`) now emits a one-time security warning. Mass-
|
|
39
|
+
assigning an ACL from a caller-supplied hash — for example a controller doing
|
|
40
|
+
`record.attributes = params` without StrongParameters — lets an attacker
|
|
41
|
+
grant unintended access by sending an `ACL` key
|
|
42
|
+
(`{"ACL" => {"*" => {"write" => true}}}`). The behavior is unchanged this
|
|
43
|
+
release (the ACL is still applied), but the supported path is the explicit
|
|
44
|
+
`record.acl = ...` setter, and a future release may block ACL mass-assignment.
|
|
45
|
+
The constructor form `Klass.new(acl: ...)` is unaffected and does not warn.
|
|
46
|
+
|
|
47
|
+
#### Redis cache values serialized as JSON instead of Marshal
|
|
48
|
+
|
|
49
|
+
- **FIXED**: `Parse::Cache::Redis` now serializes cached HTTP responses as
|
|
50
|
+
JSON rather than Marshal. The Moneta-Redis store Marshals values by default,
|
|
51
|
+
so every cache hit ran `Marshal.load` on the bytes returned by Redis. Against
|
|
52
|
+
a shared, unauthenticated, or plaintext-`redis://` cache, an attacker able to
|
|
53
|
+
write the cache could plant a crafted Marshal payload that executed code on
|
|
54
|
+
deserialization. The wrapper now disables Moneta's value serializer
|
|
55
|
+
(`value_serializer: nil`) and JSON-encodes/decodes values itself; an
|
|
56
|
+
undecodable value (including any legacy Marshal entry) is treated as a cache
|
|
57
|
+
miss rather than deserialized. Cache keys are unchanged. No application code
|
|
58
|
+
changes are required; existing cached entries are transparently refetched and
|
|
59
|
+
re-stored in the new format on first access.
|
|
60
|
+
- **FIXED**: The `cache: "redis://..."` shorthand on `Parse::Client.new` /
|
|
61
|
+
`Parse.setup` now builds a `Parse::Cache::Redis` store instead of a bare
|
|
62
|
+
`Moneta.new(:Redis, ...)`, so it gets the same JSON value serialization and
|
|
63
|
+
is not subject to the Marshal deserialization issue above.
|
|
64
|
+
- **CHANGED**: The caching middleware stores response entries with string keys
|
|
65
|
+
so they round-trip losslessly through the JSON serialization. Reads accept
|
|
66
|
+
both string and legacy symbol keys.
|
|
67
|
+
- **FIXED**: `Parse::Embeddings::Cache::MonetaStore` now JSON-encodes cached
|
|
68
|
+
embedding vectors instead of relying on the Moneta store's default Marshal
|
|
69
|
+
value serializer, closing the same `Marshal.load`-on-read deserialization
|
|
70
|
+
vector for the embedding cache (whose key is derived from often-user-supplied
|
|
71
|
+
text). It also emits a one-time warning when handed a Marshal-serializing
|
|
72
|
+
store and recommends `value_serializer: nil`.
|
|
73
|
+
- **CHANGED**: Documentation for Redis-backed caches, the embedding cache, and
|
|
74
|
+
the synchronize-create lock store (`Parse.synchronize_create_store`) now
|
|
75
|
+
builds the Redis store via `Parse::Cache::Redis` or `value_serializer: nil`
|
|
76
|
+
so a raw `Moneta.new(:Redis, ...)` no longer leaves Marshal on the read path.
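The "undecodable value is a cache miss" rule can be illustrated with the standard library alone. `decode_cache_value` is a hypothetical helper written for this sketch, not the gem's method; the point is that bytes read from the cache are only ever passed to a JSON parser.

```ruby
require "json"

# Hypothetical helper: decode a raw cache value, treating anything undecodable as a miss.
def decode_cache_value(raw)
  return nil if raw.nil?
  JSON.parse(raw)
rescue StandardError
  nil # never fall back to Marshal.load on bytes read from the cache
end

fresh  = JSON.generate({ "status" => 200, "body" => "{}" })
legacy = Marshal.dump({ status: 200, body: "{}" }) # what an older entry would look like

decode_cache_value(fresh)  # string-keyed Hash
decode_cache_value(legacy) # nil, so the caller refetches and re-stores as JSON
```

JSON round-trips only strings, numbers, booleans, nil, arrays, and string-keyed hashes, which is consistent with the middleware switching to string keys.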
|
|
77
|
+
|
|
78
|
+
#### Internal columns stripped from joined documents on mongo-direct reads
|
|
79
|
+
|
|
80
|
+
- **FIXED**: `Parse::MongoDB.aggregate` now recursively strips Parse-internal
|
|
81
|
+
credential columns (`_hashed_password`, `_session_token`, `_auth_data_*`,
|
|
82
|
+
`_rperm`/`_wperm`, ...) from every result row **and every embedded
|
|
83
|
+
sub-document** for scoped (non-master) callers. Previously a scoped caller
|
|
84
|
+
could embed a foreign class (e.g. `_User` or `_Session`) into an arbitrary
|
|
85
|
+
alias via `$lookup` / `$graphLookup` / `$unionWith` and read back password
|
|
86
|
+
hashes, OAuth tokens, and session tokens: the per-class `protectedFields`
|
|
87
|
+
strip is keyed on the outer class, and the ACL sub-document walk only drops
|
|
88
|
+
ACL-failing sub-documents, so neither covered the aliased foreign document.
|
|
89
|
+
A new `Parse::PipelineSecurity.redact_internal_fields_deep!` runs as the final
|
|
90
|
+
redaction step. Structural columns (`_id`, `_p_*`, `_acl`, timestamps) are
|
|
91
|
+
preserved, so object and ACL reconstruction are unaffected; master-key reads
|
|
92
|
+
are unchanged.
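A rough sketch of a recursive strip like the one this entry describes. It is not the implementation of `redact_internal_fields_deep!`: the key lists and the `redact_deep!` name are hypothetical and much shorter than a real denylist would be.

```ruby
# Hypothetical key lists for illustration; a real denylist would be longer.
INTERNAL_KEYS = %w[_hashed_password _session_token _rperm _wperm].freeze
INTERNAL_PREFIXES = %w[_auth_data_].freeze

def internal_key?(key)
  name = key.to_s
  INTERNAL_KEYS.include?(name) || INTERNAL_PREFIXES.any? { |prefix| name.start_with?(prefix) }
end

# Walks hashes and arrays in place, dropping internal keys at every depth.
def redact_deep!(node)
  case node
  when Hash
    node.delete_if { |key, _| internal_key?(key) }
    node.each_value { |child| redact_deep!(child) }
  when Array
    node.each { |child| redact_deep!(child) }
  end
  node
end

row = {
  "_id" => "abc",
  "title" => "hello",
  "author" => { "_id" => "u1", "username" => "ann", "_hashed_password" => "hash", "_auth_data_github" => { "id" => 1 } },
  "sessions" => [{ "_id" => "s1", "_session_token" => "r:deadbeef" }],
}
redact_deep!(row)
```

Because the walk does not care which alias a joined document sits under, a `_User` row embedded under an arbitrary key is cleaned the same way as a top-level row, while structural keys such as `_id` survive.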
|
|
93
|
+
|
|
94
|
+
#### Hardened developer-facing mongo-direct aggregation terminals
|
|
95
|
+
|
|
96
|
+
- **FIXED**: Credential columns (`_hashed_password`, `_session_token`,
|
|
97
|
+
`_auth_data_*`, `_email_verify_token`, `_perishable_token`, ...) used as a
|
|
98
|
+
`$match` field name are now refused **unconditionally** on the mongo-direct
|
|
99
|
+
path — even on a pipeline running with `allow_internal_fields: true` (the flag
|
|
100
|
+
that lets SDK-emitted `_rperm`/`_wperm` references through for
|
|
101
|
+
`readable_by_role` / `publicly_readable`). Previously the `*_direct` terminals
|
|
102
|
+
(`count_direct`, `results_direct`, `distinct_direct`, the direct group-by
|
|
103
|
+
helpers) passed `allow_internal_fields: true` unconditionally, so a query
|
|
104
|
+
whose `where` referenced a credential column compiled into a `$match` key that
|
|
105
|
+
bypassed the internal-field screen — a count/match oracle that could bisect a
|
|
106
|
+
bcrypt hash or session token. The ACL columns (`_rperm`/`_wperm`/`_tombstone`)
|
|
107
|
+
remain gated by `allow_internal_fields`, so `readable_by_role` still works.
|
|
108
|
+
- **FIXED**: `Parse::Query#aggregate` and `#aggregate_from_query` now treat a
|
|
109
|
+
scoped query (`session_token` / `scope_to_user` / `scope_to_role`) as
|
|
110
|
+
authoritative over an explicit `mongo_direct: false`. Previously passing
|
|
111
|
+
`mongo_direct: false` on a scoped aggregation skipped the fail-closed guard
|
|
112
|
+
and routed to Parse Server's master-key-only REST `/aggregate` endpoint,
|
|
113
|
+
running the aggregation unscoped (no ACL, CLP, or `protectedFields`). A scoped
|
|
114
|
+
aggregation now promotes to mongo-direct, or fails closed with
|
|
115
|
+
`Parse::Query::MongoDirectRequired` when direct Mongo is unavailable; unscoped
|
|
116
|
+
callers can still opt out to REST with `mongo_direct: false`.
|
|
117
|
+
|
|
118
|
+
#### Additional hardening
|
|
119
|
+
|
|
120
|
+
- **FIXED**: Request/response body logging now redacts credentials. At `:debug`
|
|
121
|
+
level the logging middleware emitted login/signup request bodies (cleartext
|
|
122
|
+
`password`) and auth response bodies (`sessionToken`, `authData`, MFA
|
|
123
|
+
secrets); the body path now runs through the same `BodyBuilder.redact`
|
|
124
|
+
scrubber the header path already used, before truncation.
|
|
125
|
+
- **FIXED**: The `_User` REST endpoints (`fetch_user` / `update_user` /
|
|
126
|
+
`delete_user`) now validate the `objectId` against
|
|
127
|
+
`Parse::API::PathSegment.object_id!` before interpolating it into the path,
|
|
128
|
+
matching the object endpoints. A crafted objectId (e.g. from a compromised
|
|
129
|
+
server response) can no longer traverse to a different endpoint on a
|
|
130
|
+
subsequent request.
|
|
131
|
+
- **CHANGED**: `$sessionToken` / `$session_token` (camelCase and snake_case forms of the
|
|
132
|
+
session-token column) are now in `DENIED_FIELD_REFS`, so they cannot be
|
|
133
|
+
laundered through a `$`-field reference in a pipeline.
|
|
134
|
+
- **IMPROVED**: The internal-collection floor (`_SCHEMA` / `_Hooks` /
|
|
135
|
+
`_GlobalConfig` / `_Audit` / ...) is now enforced unconditionally on every
|
|
136
|
+
`$lookup` / `$graphLookup` / `$unionWith` join target in
|
|
137
|
+
`Parse::ACLScope`, not only when lookup-rewriting runs. This closes a
|
|
138
|
+
defense-in-depth gap where an internal class whose CLP lookup returned no
|
|
139
|
+
policy could otherwise have been joinable on the direct path.
|
|
140
|
+
- **IMPROVED**: When the MCP agent server is started on an unauthenticated
|
|
141
|
+
loopback bind with no Origin/custom-header gate configured, it now defaults
|
|
142
|
+
to a loopback-only Origin policy. A browser DNS-rebinding attack against
|
|
143
|
+
`127.0.0.1` carries a non-loopback `Origin` and is refused; native clients
|
|
144
|
+
(which send no `Origin`) and local browser UIs are unaffected. A one-time
|
|
145
|
+
warning points operators at `MCP_API_KEY` / `allowed_origins:` /
|
|
146
|
+
`require_custom_header:` for routable deployments.
|
|
147
|
+
|
|
148
|
+
### 5.5.0
|
|
149
|
+
|
|
150
|
+
#### Multimodal bytes-fetch path with magic-byte MIME verification
|
|
151
|
+
|
|
152
|
+
- **NEW**: `Parse::Embeddings::ImageFetch` — the SDK-side image download
|
|
153
|
+
layer for image embeddings. Downloads through the existing
|
|
154
|
+
`Parse::File.safe_open_url` SSRF primitive (CIDR blocks, port allowlist,
|
|
155
|
+
DNS-rebinding re-check, size caps, timeouts — no parallel fetch mechanism),
|
|
156
|
+
determines the MIME type **exclusively by magic-byte sniffing** of the
|
|
157
|
+
leading bytes (JPEG / PNG / GIF / WebP), cross-checks the URL extension
|
|
158
|
+
against the sniffed type, and enforces a configurable
|
|
159
|
+
`Parse::Embeddings.allowed_image_types` allowlist. The HTTP `Content-Type`
|
|
160
|
+
header is never consulted, closing the file MIME-laundering gap: a `.jpg`
|
|
161
|
+
URL serving HTML (or PNG bytes behind a JPEG extension) is refused outright.
|
|
162
|
+
- **NEW**: `embed_image ..., source: :bytes` declaration mode. Where the
|
|
163
|
+
default `source: :url` forwards a validated URL for the provider to fetch
|
|
164
|
+
itself (and therefore requires the `trust_provider_url_fetch` sentinel),
|
|
165
|
+
`:bytes` mode has the SDK download, verify, and metadata-strip the image,
|
|
166
|
+
then forward it to the provider as a base64 data URI. No third-party URL
|
|
167
|
+
egress occurs, so the sentinel is not required — but the file's host must
|
|
168
|
+
still be in `Parse::Embeddings.allowed_image_hosts` (deny-all when empty).
|
|
169
|
+
|
|
170
|
+
```ruby
|
|
171
|
+
class Post < Parse::Object
|
|
172
|
+
property :cover_image, :file
|
|
173
|
+
property :cover_embedding, :vector, dimensions: 1024, provider: :voyage
|
|
174
|
+
embed_image :cover_image, into: :cover_embedding, source: :bytes
|
|
175
|
+
end
|
|
176
|
+
```
|
|
177
|
+
- **NEW**: EXIF/XMP metadata stripping, **default ON** for the bytes path.
|
|
178
|
+
User-uploaded photos commonly carry GPS coordinates and device serial
|
|
179
|
+
numbers; forwarding them to an embedding provider is a PII egress. JPEG
|
|
180
|
+
APP1 segments (Exif and XMP), PNG `eXIf` chunks, and WebP `EXIF`/`XMP `
|
|
181
|
+
RIFF chunks (with the VP8X flag bits cleared) are removed before the bytes
|
|
182
|
+
leave the process. Opt out per declaration with `exif_strip: false` when
|
|
183
|
+
orientation metadata must be preserved.
|
|
184
|
+
- **NEW**: `Voyage#embed_image` and `Cohere#embed_image` accept
|
|
185
|
+
`Parse::Embeddings::ImageFetch::FetchedImage` sources alongside URL
|
|
186
|
+
Strings (forms may be mixed in one batch). Fetched bytes ride Voyage's
|
|
187
|
+
`image_base64` content row and Cohere's `image_url` data-URI form.
|
|
188
|
+
- **NEW**: `Parse::Embeddings.allowed_image_types=` — MIME allowlist for the
|
|
189
|
+
bytes path (default JPEG/PNG/GIF/WebP; SVG deliberately excluded as
|
|
190
|
+
script-capable active content).
|
|
191
|
+
- **ENHANCED**: `Parse::Embeddings.validate_image_url!` accepts
|
|
192
|
+
`mode: :fetch` for SDK-side downloads — same host allowlist,
|
|
193
|
+
obfuscated-IP screen, port and CIDR checks as the default `:forward`
|
|
194
|
+
mode, minus the provider-egress sentinel that doesn't apply when no URL
|
|
195
|
+
is forwarded.
|
|
196
|
+
|
|
197
|
+
#### Embedding-model migration tooling
|
|
198
|
+
|
|
199
|
+
- **NEW**: `Class.reembed!(field:, batch_size:, limit:, where:, only_stale:,
|
|
200
|
+
save_opts:)` — bulk re-embed for provider/model migrations. Unlike
|
|
201
|
+
`embed_pending!` (which only fills null vectors), `reembed!` walks every
|
|
202
|
+
row with objectId-cursor pagination, clears the digest sibling so the
|
|
203
|
+
save-path recompute cannot elide the provider call, and saves. With
|
|
204
|
+
`only_stale: true` the walk skips rows whose recorded provenance already
|
|
205
|
+
matches the current provider, model, and dimensions — making a partially
|
|
206
|
+
failed migration resumable.
|
|
207
|
+
- **NEW**: `embed` / `embed_image` auto-declare an `<into>_meta` `:object`
|
|
208
|
+
sibling property recording `{ provider, model, dimensions, modality,
|
|
209
|
+
embedded_at }` on every recompute (cleared when the source clears).
|
|
210
|
+
This is the provenance record `reembed!(only_stale: true)` reads, and it
|
|
211
|
+
tells operational tooling which model produced any stored vector.
|
|
212
|
+
Override the name with `meta_field:`.
|
|
213
|
+
|
|
214
|
+
#### Bulk embedding and query-embed caching
|
|
215
|
+
|
|
216
|
+
- **NEW**: `Parse::Embeddings::BatchEmbedder` — batch-level orchestration
|
|
217
|
+
for bulk embedding jobs. Wraps any registered provider with batch slicing
|
|
218
|
+
(defaulting to the provider's own batch-size hint), requests-per-minute
|
|
219
|
+
pacing between calls, and batch-level exponential backoff with jitter on
|
|
220
|
+
rate-limit / transient errors (previously backoff lived only inside each
|
|
221
|
+
provider's single HTTP call). A batch that exhausts its attempts raises
|
|
222
|
+
`BatchEmbedder::BatchFailed` carrying `batch_index` and `completed_count`
|
|
223
|
+
so a resumable job knows where to pick up. Supports `retry_on:` exception
|
|
224
|
+
overrides and an `on_progress:` callback.
|
|
225
|
+
- **NEW**: `Parse::Embeddings::Cache` — process-local embedding cache keyed
|
|
226
|
+
by `(provider, model, dimensions, input_type, SHA-256(input))`, disabled by
|
|
227
|
+
default. Dimensions participate in the key so two registrations of the
|
|
228
|
+
same Matryoshka-capable model at different output widths never serve each
|
|
229
|
+
other's vectors.
|
|
230
|
+
`Parse::Embeddings::Cache.enable!(max_entries:, ttl:)` activates an LRU +
|
|
231
|
+
TTL store (or pass `store:` for a custom backend); repeated identical
|
|
232
|
+
query embeds through `find_similar(text:)`, `hybrid_search(text:)`, and
|
|
233
|
+
`Parse::Retrieval.retrieve` then skip the provider round-trip. Cache hits
|
|
234
|
+
emit the standard `parse.embeddings.embed` notification with
|
|
235
|
+
`cached: true`, so existing spend subscribers see hits and misses on one
|
|
236
|
+
stream. The input text is hashed before keying — plaintext queries never
|
|
237
|
+
land in a shared store.
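One way to picture the cache key, assuming a simple colon-joined layout. The tuple fields come from the changelog entry; `embedding_cache_key` and the exact string format are hypothetical.

```ruby
require "digest"

# Hypothetical key builder: the tuple fields are from the changelog, the layout is not.
def embedding_cache_key(provider:, model:, dimensions:, input_type:, input:)
  [provider, model, dimensions, input_type, Digest::SHA256.hexdigest(input)].join(":")
end

narrow = embedding_cache_key(provider: :voyage, model: "example-model", dimensions: 512,
                             input_type: :query, input: "red running shoes")
wide   = embedding_cache_key(provider: :voyage, model: "example-model", dimensions: 1024,
                             input_type: :query, input: "red running shoes")
```

The two keys differ only in the dimensions component, so the same model registered at two output widths never shares entries, and the query text appears only as a digest.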
|
|
238
|
+
|
|
239
|
+
#### Vector index drift detection
|
|
240
|
+
|
|
241
|
+
- **NEW**: first-query verification of deployed Atlas vectorSearch indexes.
|
|
242
|
+
When `find_similar` / `hybrid_search` auto-discovers an index, the SDK now
|
|
243
|
+
compares the index's `numDimensions` and `similarity` against the
|
|
244
|
+
`:vector` property declaration, and — when the class registers an
|
|
245
|
+
`agent_tenant_scope` — confirms the scope field is declared as a
|
|
246
|
+
`type: "filter"` path (without it, every tenant-scoped
|
|
247
|
+
`$vectorSearch.filter` fails Atlas-side). Findings are computed once per
|
|
248
|
+
(class, field, index) per process and governed by
|
|
249
|
+
`Parse::VectorSearch.index_drift_policy`: `:warn` (default) emits a
|
|
250
|
+
`[Parse::VectorSearch:DRIFT]` warning on the first check; `:raise` raises
|
|
251
|
+
`Parse::Core::VectorSearchable::IndexDriftError` on **every** query
|
|
252
|
+
against the drifted index, so strict deployments never serve degraded
|
|
253
|
+
results after the first failure; `:ignore` skips verification. An
|
|
254
|
+
explicit `index:` kwarg is verified best-effort when the catalog's
|
|
255
|
+
covering index carries the same name (lookup failures never fail the
|
|
256
|
+
query).
|
|
257
|
+
|
|
258
|
+
#### Hybrid search hardening
|
|
259
|
+
|
|
260
|
+
- **FIXED**: on the opt-in native `$rankFusion` path, a scoped (non-master)
|
|
261
|
+
caller's `_hybrid_score` is now recomputed from the post-ACL visible
|
|
262
|
+
ordering instead of surfacing the raw fused score. The raw score is
|
|
263
|
+
materialized before the ACL `$match`, so it encoded a surviving row's
|
|
264
|
+
rank among rows the caller cannot read — a cross-tenant/cross-ACL
|
|
265
|
+
inference channel for callers probing with crafted queries. The
|
|
266
|
+
recomputed score is monotone with the true fused order but is a function
|
|
267
|
+
of visible rows only. Master-key results and the default client-side RRF
|
|
268
|
+
path (which ranks from already-filtered rows) are unchanged.
|
|
269
|
+
- **FIXED**: the `$rankFusion` support probe no longer classifies MongoDB
|
|
270
|
+
authorization errors as "stage unsupported". The probe's
|
|
271
|
+
unrecognized-stage matching included the broad phrase "is not allowed",
|
|
272
|
+
which also appears in auth failures ("not allowed to execute command
|
|
273
|
+
aggregate") and could cache the wrong verdict for the probe TTL. Matching
|
|
274
|
+
is narrowed to unambiguous unknown-stage phrases; any other failure is
|
|
275
|
+
treated as supported and the real query surfaces the real error, with
|
|
276
|
+
the client-side path as the standing fallback.
|
|
277
|
+
|
|
278
|
+
#### Retrieval spend-cap and filter hardening
|
|
279
|
+
|
|
280
|
+
- **NEW**: `Parse::Embeddings::SpendCap.configure(..., warn_at: 0.8)` —
|
|
281
|
+
soft-cap alerting. When a charge pushes a tenant's in-window usage across
|
|
282
|
+
the given fraction of its hard limit, a
|
|
283
|
+
`parse.embeddings.spend_cap_warning` ActiveSupport::Notifications event
|
|
284
|
+
is emitted (`tenant_id`, `used`, `limit`, `window`, `warn_at`,
|
|
285
|
+
`threshold`), once per crossing and re-arming as the window rolls off —
|
|
286
|
+
an operator alerting hook that fires BEFORE the hard refuse trips.
|
|
287
|
+
Disabled unless configured. Note the cap deliberately charges before the
|
|
288
|
+
query-embed cache lookup, so cache hits bill at full price: it bounds
|
|
289
|
+
query volume (an abuse control), not just provider spend.
|
|
290
|
+
- **NEW**: `Parse::Embeddings::Cache::MonetaStore` — persistent-L2 adapter
|
|
291
|
+
for the embedding cache. Wraps any Moneta-compatible store (`[]`/`[]=`,
|
|
292
|
+
optional `store(key, value, expires:)`) behind the cache's `get`/`set`
|
|
293
|
+
duck-typed interface, with key namespacing and TTL forwarding, so
|
|
294
|
+
`Cache.enable!(store: MonetaStore.new(moneta, ttl: 30 * 24 * 3600))`
|
|
295
|
+
shares query-embed entries across processes and restarts. Fail-open: a
|
|
296
|
+
backend error degrades to a cache miss / dropped write, never a failed
|
|
297
|
+
embed. Cache keys are input hashes — plaintext queries never land in the
|
|
298
|
+
shared store.
|
|
299
|
+
- **NEW**: embedding spend-cap coverage on every query-embed path. The
|
|
300
|
+
per-tenant `Parse::Embeddings::SpendCap` was previously charged only at
|
|
301
|
+
the `semantic_search` agent-tool boundary; direct `find_similar(text:)`,
|
|
302
|
+
`hybrid_search(text:)`, and `Parse::Retrieval.retrieve` callers bypassed
|
|
303
|
+
it. The shared query-embed path now charges via
|
|
304
|
+
`SpendCap.charge_query!` — tenant identity resolves to the ambient
|
|
305
|
+
`Parse.with_cache_tenant` scope when set, else the shared default bucket.
|
|
306
|
+
The agent tool wraps its retrieval in the new `SpendCap.with_precharged`
|
|
307
|
+
block so a query it already charged with per-tenant identity is not
|
|
308
|
+
double-billed (and admin-exempt queries are not billed to the shared
|
|
309
|
+
bucket). As before, the cap is a no-op until configured.
|
|
310
|
+
- **NEW**: pointer-value translation for caller-supplied retrieval filters.
|
|
311
|
+
`Parse::Retrieval.retrieve` (and through it the `semantic_search` agent
|
|
312
|
+
tool) now rewrites Parse pointer values — `Parse::Pointer` /
|
|
313
|
+
`Parse::Object` instances and wire-form `{"__type": "Pointer"}` hashes,
|
|
314
|
+
including inside `$in` / `$eq` / `$ne` operator hashes — into their
|
|
315
|
+
MongoDB storage form, so `{ owner: some_user }` becomes
|
|
316
|
+
`{ "_p_owner" => "_User$abc123" }` and actually matches rows. Previously
|
|
317
|
+
a pointer-valued filter silently matched nothing. Translation runs after
|
|
318
|
+
the underscore-key gate and filter-field allowlist (callers still cannot
|
|
319
|
+
name `_p_*` columns directly) and before the tenant-scope fold. The
|
|
320
|
+
standalone helper is `Parse::Retrieval.translate_pointer_filter_values`.
|
|
321
|
+
- **IMPROVED**: `Parse::Schema::SearchIndexMigrator` auto-includes the
|
|
322
|
+
model's registered `agent_tenant_scope` field as a `type: "filter"` path
|
|
323
|
+
when planning or applying `vectorSearch` index declarations. Newly created
|
|
324
|
+
indexes support tenant-scoped pre-filtering out of the box; existing
|
|
325
|
+
indexes missing the path surface as `drifted:` in the plan instead of
|
|
326
|
+
failing at query time.
|
|
327
|
+
|
|
328
|
+
#### Opt-in Unicode regex matching for text constraints
|
|
329
|
+
|
|
330
|
+
- **NEW**: `starts_with`, `contains`, `ends_with`, and `like`/`regex` now accept
|
|
331
|
+
an opt-in `{ value:, unicode: true }` form that appends the `u` (Unicode) flag
|
|
332
|
+
to the compiled `$options`, enabling correct multibyte case-insensitive
|
|
333
|
+
matching for accented and non-Latin text (for example `café` matching
|
|
334
|
+
`CAFÉ`, or CJK characters).
|
|
335
|
+
|
|
336
|
+
```ruby
|
|
337
|
+
Post.where(:title.starts_with => { value: "café", unicode: true })
|
|
338
|
+
# => "title": { "$regex": "^café", "$options": "iu" }
|
|
339
|
+
|
|
340
|
+
Post.where(:title.like => { value: /café/i, unicode: true })
|
|
341
|
+
# => "title": { "$regex": "café", "$options": "iu" }
|
|
342
|
+
```
|
|
343
|
+
|
|
344
|
+
The flag is strictly opt-in: the bare-value forms
|
|
345
|
+
(`:title.starts_with => "café"`) compile exactly as before with `$options: "i"`,
|
|
346
|
+
so existing queries are unchanged. The `u` flag is honored by Parse Server
|
|
347
|
+
8.3.0+ over the REST query interface and by MongoDB 6.1+ on the mongo-direct
|
|
348
|
+
query path; older Parse Servers reject it, which is why it is never emitted
|
|
349
|
+
unless requested.
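The opt-in behavior amounts to a small branch when the constraint is compiled. The sketch below is a hypothetical compiler for `starts_with` only; `compile_starts_with` is not a gem method, and the use of `Regexp.escape` is an assumption about how the value is quoted.

```ruby
# Hypothetical compiler for a starts_with constraint; the gem's own translation may differ.
def compile_starts_with(arg)
  # Accept either a bare value or the opt-in { value:, unicode: } form.
  value, unicode = arg.is_a?(Hash) ? [arg.fetch(:value), arg[:unicode] == true] : [arg, false]
  { "$regex" => "^#{Regexp.escape(value)}", "$options" => unicode ? "iu" : "i" }
end

compile_starts_with("café")
# => {"$regex"=>"^café", "$options"=>"i"}
compile_starts_with({ value: "café", unicode: true })
# => {"$regex"=>"^café", "$options"=>"iu"}
```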
|
|
350
|
+
|
|
351
|
+
#### ACL permission query hardening
|
|
352
|
+
|
|
353
|
+
- **FIXED**: `readable_by`, `writable_by`, `readable_by_role`,
|
|
354
|
+
`writable_by_role`, `publicly_readable`, and `publicly_writable` no longer
|
|
355
|
+
raise a pipeline-security error when they auto-route through the direct
|
|
356
|
+
MongoDB path. These constraints compile to an aggregation `$match` on the
|
|
357
|
+
internal `_rperm` / `_wperm` permission columns, and the internal-fields
|
|
358
|
+
denylist that protects user-supplied pipelines from referencing
|
|
359
|
+
server-internal columns was also rejecting these SDK-generated references.
|
|
360
|
+
The aggregation runner now forwards the `allow_internal_fields` sanction for
|
|
361
|
+
pipelines built entirely from SDK constraint translation — matching the
|
|
362
|
+
parity already held by the `results_direct` / `count_direct` /
|
|
363
|
+
`distinct_direct` helpers — so public-read detection (`publicly_readable`,
|
|
364
|
+
`readable_by("*")`) and role/user permission filtering work again. The
|
|
365
|
+
sanction is scoped to SDK-built ACL pipelines only; caller-supplied
|
|
366
|
+
aggregation pipelines remain subject to the full denylist, so they still
|
|
367
|
+
cannot reference password hashes, session tokens, or other internal columns.
|
|
368
|
+
- **FIXED**: `Query#count` now routes ACL permission filters
|
|
369
|
+
(`publicly_readable.count`, `readable_by(...).count`, and friends) through
|
|
370
|
+
the direct MongoDB path, mirroring `Query#results`. Previously `count` only
|
|
371
|
+
switched to the direct path for subquery `$lookup` stages, so an ACL count
|
|
372
|
+
was sent to Parse Server's REST aggregate endpoint, which cannot express a
|
|
373
|
+
`$match` on `_rperm` / `_wperm`.
|
|
374
|
+
- **FIXED**: the scalar aggregation terminals — `Query#sum`, `#average`,
|
|
375
|
+
`#min`, `#max`, `#distinct`, and `#count_distinct` — now honor ACL
|
|
376
|
+
permission filters and scoped queries. They funnel through `Query#aggregate`,
|
|
377
|
+
which previously only switched to the direct MongoDB path for subquery
|
|
378
|
+
`$lookup` stages. An ACL filter (`readable_by(...).sum(:plays)`) was sent to
|
|
379
|
+
Parse Server's REST aggregate endpoint, which cannot express a `$match` on
|
|
380
|
+
`_rperm` / `_wperm`. More seriously, a **scoped** terminal
|
|
381
|
+
(`scope_to_user(u).sum(:plays)`, `scope_to_role`, or a `session_token`)
|
|
382
|
+
reached the same REST endpoint, which is master-key-only and enforces
|
|
383
|
+
neither ACL nor CLP — so the aggregate ran unscoped as the master key,
|
|
384
|
+
computing the result over rows the caller cannot read. `Query#aggregate` now
|
|
385
|
+
routes to mongo-direct whenever the query is scoped or the pipeline
|
|
386
|
+
references the ACL columns, and **fails closed** (raises
|
|
387
|
+
`Parse::Query::MongoDirectRequired`) for a scoped terminal when mongo-direct
|
|
388
|
+
is unavailable, rather than silently bypassing enforcement. The same
|
|
389
|
+
contract covers the inline-pipeline terminals: a scoped `Query#count` or
|
|
390
|
+
`Query#results` whose constraints compile to an aggregation pipeline
|
|
391
|
+
(e.g. `:field.size`) promotes to mongo-direct and fails closed identically
|
|
392
|
+
instead of falling back to REST `/aggregate`.
|
|
393
|
+
- **FIXED**: `not_publicly_readable` / `not_publicly_writable` (and the
|
|
394
|
+
`:ACL.not_readable_by` / `:ACL.not_writable_by` constraints) no longer return
|
|
395
|
+
the rows they are meant to exclude. They compiled to `{ _rperm: { $nin:
|
|
396
|
+
[...] } }`, and MongoDB's `$nin` matches documents where the field is
|
|
397
|
+
**absent** — and a missing `_rperm` is treated by Parse Server as public.
|
|
398
|
+
A security audit using `not_publicly_writable` to find safe objects silently
|
|
399
|
+
included write-exposed (public-by-absence) objects. The constraints now carry
|
|
400
|
+
an `$exists: true` guard. "Not readable by X" additionally expands the
|
|
401
|
+
principal's roles and excludes publicly-readable rows (a public row is
|
|
402
|
+
readable by everyone, so it cannot be "not readable by X").
|
|
403
|
+
- **FIXED**: `readable_by([])` / `writable_by([])` and the `:none` / `nil`
|
|
404
|
+
forms no longer raise `ArgumentError`; they now compile to the documented
|
|
405
|
+
"no permissions" match (an explicit empty `_rperm` / `_wperm`). Symbol
|
|
406
|
+
principals (`:public`, `:everyone`, `:world`) are accepted and map to the
|
|
407
|
+
public wildcard, matching the String forms.
|
|
408
|
+
- **FIXED**: `PrivateAclConstraint` (`:ACL.private_acl` / `master_key_only`)
|
|
409
|
+
no longer classifies public-by-absence rows as private. A truly master-key-
|
|
410
|
+
only object has an explicit empty `_rperm` **and** `_wperm`; a missing
|
|
411
|
+
column is public, the opposite of private, so the missing-field branch was
|
|
412
|
+
removed. `private_acl => false` is now the exact complement.
|
|
413
|
+
- **FIXED**: role expansion for `readable_by` / `writable_by` /
|
|
414
|
+
`readable_by_role` / `writable_by_role` now always includes the role's own
|
|
415
|
+
name in the permission set. The upward-inheritance walk yields nothing for
|
|
416
|
+
an unpersisted role (objectId still nil), which previously dropped the role
|
|
417
|
+
entirely and raised "no valid permissions"; the role's own `role:<name>`
|
|
418
|
+
entry is now appended idempotently, so persisted roles compile unchanged.
|
|
419
|
+
- **CHANGED**: a mistyped ACL permission no longer vanishes silently. An
|
|
420
|
+
unrecognized element in a `readable_by` / `writable_by` array (or an
|
|
421
|
+
unsupported Symbol) now raises `ArgumentError` instead of being dropped from
|
|
422
|
+
the permission set, which would silently weaken the intended filter.
|
|
423
|
+
- **NEW**: `strict:` option on `readable_by` / `writable_by` /
|
|
424
|
+
`readable_by_role` / `writable_by_role` (and the `:ACL.readable_by_exact` /
|
|
425
|
+
`writable_by_exact` / `*_by_role_exact` operators) for an **exact** match —
|
|
426
|
+
only rows whose `_rperm` / `_wperm` literally contains one of the resolved
|
|
427
|
+
permissions, with no implicit public `"*"` and no missing-field rows. The
|
|
428
|
+
default remains inclusive (access-simulation) semantics; `strict: true` is
|
|
429
|
+
the right choice for ownership and security audits.
|
|
430
|
+
- **NEW**: `Query#not_readable_by` / `#not_writable_by` chained methods, the
|
|
431
|
+
fluent counterparts to the existing `:ACL.not_readable_by` symbol operators.
|
|
432
|
+
- **BREAKING**: the British-spelled `:ACL.writeable_by` operator now resolves
|
|
433
|
+
to the same public-inclusive, role-expanding implementation as
|
|
434
|
+
`:ACL.writable_by`. Previously the one-letter spelling difference selected a
|
|
435
|
+
separate, strict, non-role-expanding constraint, so `writeable_by` and
|
|
436
|
+
`writable_by` silently produced different result sets. Code that relied on
|
|
437
|
+
the old strict behavior of `writeable_by` should pass `strict: true` (or use
|
|
438
|
+
the `:writable_by_exact` operator).
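The difference between the default inclusive semantics and `strict: true` can be illustrated with a plain-Ruby simulation over in-memory rows. This is only a sketch under stated assumptions: `inclusive_match?` and `strict_match?` are hypothetical helpers, not part of the gem, and each row is a plain Hash standing in for a stored document.

```ruby
# Hypothetical helpers for illustration; not parse-stack-next code.
# A row is a Hash; a missing "_wperm" key models a public-by-absence object.

PUBLIC = "*"

# Inclusive (access-simulation): could a caller holding these permissions write?
# Public rows and rows with no _wperm column both count as writable.
def inclusive_match?(row, perms)
  return true unless row.key?("_wperm")
  wperm = row["_wperm"]
  wperm.include?(PUBLIC) || !(wperm & perms).empty?
end

# Strict (exact): the stored array must literally contain one of the permissions.
def strict_match?(row, perms)
  row.key?("_wperm") && !(row["_wperm"] & perms).empty?
end

rows = [
  { "id" => "a", "_wperm" => ["role:Admin"] },
  { "id" => "b", "_wperm" => ["*"] },
  { "id" => "c" },                  # no _wperm: public by absence
  { "id" => "d", "_wperm" => [] },  # explicit empty array: no write permissions
]

perms = ["role:Admin"]
inclusive_ids = rows.select { |r| inclusive_match?(r, perms) }.map { |r| r["id"] }
strict_ids    = rows.select { |r| strict_match?(r, perms) }.map { |r| r["id"] }
```

In this model the inclusive match selects rows `a`, `b`, and `c`, while the strict match selects only `a`, which is why the exact form suits ownership and security audits.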
|
|
439
|
+
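The `$nin` pitfall behind the `not_publicly_readable` / `not_publicly_writable` fix can be reproduced without a database. The sketch below models, in plain Ruby, the MongoDB behavior that `$nin` also selects documents in which the field does not exist, and then the same filter with an existence guard. `nin_match?` and `guarded_nin_match?` are hypothetical names for illustration and do not reflect how the gem compiles its constraints.

```ruby
# Hypothetical in-memory model of MongoDB's $nin; not parse-stack-next code.
# $nin matches when the field holds none of the values, or when it is absent.
def nin_match?(doc, field, values)
  return true unless doc.key?(field)
  (Array(doc[field]) & values).empty?
end

# The same filter combined with an $exists: true style guard.
def guarded_nin_match?(doc, field, values)
  doc.key?(field) && nin_match?(doc, field, values)
end

docs = [
  { "id" => "locked", "_wperm" => ["user123"] },
  { "id" => "open",   "_wperm" => ["*"] },
  { "id" => "no_acl" }, # missing _wperm, which Parse Server treats as public
]

unguarded = docs.select { |d| nin_match?(d, "_wperm", ["*"]) }.map { |d| d["id"] }
guarded   = docs.select { |d| guarded_nin_match?(d, "_wperm", ["*"]) }.map { |d| d["id"] }
```

Without the guard, the publicly writable `no_acl` document is reported as not publicly writable; with the guard, only `locked` remains.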
|
|
440
|
+
#### Webhook after_save callback hardening
|
|
441
|
+
|
|
442
|
+
- **FIXED**: the model's chained `after_save` / `after_create` callbacks now
|
|
443
|
+
fire exactly once per `afterSave` delivery, even when an app registers both a
|
|
444
|
+
class-specific handler (`webhook :after_save, MyClass`) and a catch-all
|
|
445
|
+
handler (`webhook :after_save, "*"`). The webhook endpoint dispatches every
|
|
446
|
+
trigger to both the class route and the `"*"` route, and the callback chain
|
|
447
|
+
previously ran inside each route — so an app with both handlers fired its
|
|
448
|
+
model `after_save` twice (e.g. two emails per save). The chain now runs once,
|
|
449
|
+
after both routes are dispatched. The existing behavior is otherwise
|
|
450
|
+
preserved: an `afterSave` for a class with no registered handler never fires
|
|
451
|
+
model callbacks, and trusted Ruby-initiated saves still skip the webhook-side
|
|
452
|
+
callbacks so the local `run_callbacks :save` is the single fire.
|
|
453
|
+
- **FIXED**: a chained `after_save` or `after_create` callback that raises
|
|
454
|
+
during an `afterSave` webhook no longer crashes the webhook endpoint or
|
|
455
|
+
suppresses the other phase's side effects. Because `afterSave` fires after the
|
|
456
|
+
object is already persisted and Parse Server discards the response body, the
|
|
457
|
+
`after_create` and `after_save` phases now run independently and any
|
|
458
|
+
`StandardError` they raise is logged and swallowed (mirroring Parse Server's
|
|
459
|
+
own afterSave semantics). A raising `after_create :send_welcome_email` no
|
|
460
|
+
longer silently skips an unrelated `after_save :reindex`, and an uncaught
|
|
461
|
+
callback error can no longer return a 500 to Parse Server.
|
|
462
|
+
- **FIXED**: `Parse::Webhooks::Payload#ruby_initiated?` now memoizes a `false`
|
|
463
|
+
result stably instead of re-deriving it on every call. The prior `||=`
|
|
464
|
+
memoization recomputed whenever the cached value was `false`, so a stamped
|
|
465
|
+
`false` could be re-derived inconsistently; the detection result is now cached
|
|
466
|
+
exactly once.
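The `||=` behavior described for `ruby_initiated?` is a general Ruby idiom issue and can be shown in isolation. The `Detector` class below is hypothetical, and the `defined?` guard is one common way to cache a `false` result; the changelog does not say which technique the gem uses, so treat this as a sketch of the idea only.

```ruby
# Hypothetical class; illustrates the Ruby idiom, not the gem's source.
class Detector
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Stand-in for a detection step whose result happens to be false.
  def detect
    @calls += 1
    false
  end

  # ||= re-runs detect whenever the cached value is false or nil.
  def flag_with_or_equals
    @flag_a ||= detect
  end

  # defined? distinguishes "never computed" from "computed as false".
  def flag_memoized
    return @flag_b if defined?(@flag_b)
    @flag_b = detect
  end
end

a = Detector.new
3.times { a.flag_with_or_equals }

b = Detector.new
3.times { b.flag_memoized }
```

After three calls each, `a.calls` is 3 because every call recomputed, while `b.calls` is 1 because the `false` result was cached on the first call.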
|
|
467
|
+
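The independent-phase behavior described above (each phase runs on its own, and a `StandardError` is logged and swallowed) follows a familiar isolation pattern. The sketch below is a minimal stand-alone version; `run_phases` and the lambdas are hypothetical and only mirror the described behavior, not the gem's internals.

```ruby
require "logger"
require "stringio"

# Hypothetical runner; a sketch of the isolation pattern, not the gem's code.
def run_phases(phases, logger)
  phases.each do |name, callback|
    begin
      callback.call
    rescue StandardError => e
      # Log and continue so one failing phase cannot block the next.
      logger.error("#{name} callback failed: #{e.class}: #{e.message}")
    end
  end
  true
end

log_output = StringIO.new
logger = Logger.new(log_output)
reindexed = false

phases = {
  after_create: -> { raise "SMTP unavailable" }, # e.g. a welcome email failing
  after_save:   -> { reindexed = true },         # e.g. an unrelated reindex
}

completed = run_phases(phases, logger)
```

Here the failing `after_create` lambda is logged, the `after_save` lambda still runs, and `run_phases` returns normally instead of propagating the error to the caller.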
|
|
468
|
+
#### `verify_password` client-side rate-limit parity
|
|
469
|
+
|
|
470
|
+
- **CHANGED**: `verify_password` now participates in the same client-side login
|
|
471
|
+
rate-limit as `login`. It calls the rate-limit guard before issuing the
|
|
472
|
+
request and records the result afterward, keyed on the bare username so
|
|
473
|
+
failures share a bucket with `login` — an attacker cannot sidestep a `login`
|
|
474
|
+
lockout by pivoting to the `verify_password` credential oracle. Because the
|
|
475
|
+
bucket is shared, a run of failed step-up / re-authentication calls counts
|
|
476
|
+
toward (and can trigger) the primary login lockout for that username. As with
|
|
477
|
+
`login`, this is a convenience guard, not a security boundary — server-side
|
|
478
|
+
rate limiting remains the real control.
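The shared-bucket behavior can be sketched with a small in-memory limiter. Everything here is a hypothetical illustration: the class name, the threshold of three failures, and the reset-on-success rule are assumptions, not the gem's actual guard.

```ruby
# Hypothetical limiter; names, threshold, and reset rule are assumptions.
class LoginRateLimiter
  class LockedOut < StandardError; end

  def initialize(max_failures: 3)
    @max_failures = max_failures
    @failures = Hash.new(0) # keyed on the bare username
  end

  # Called before a request is issued.
  def check!(username)
    return if @failures[username] < @max_failures
    raise LockedOut, "too many failed attempts for #{username}"
  end

  # Called after the request completes.
  def record(username, success:)
    if success
      @failures.delete(username)
    else
      @failures[username] += 1
    end
  end
end

limiter = LoginRateLimiter.new(max_failures: 3)

# Both operations use the same key, so their failures accumulate together.
attempt = lambda do |username, success|
  limiter.check!(username)
  limiter.record(username, success: success)
end

attempt.call("alice", false) # a failed login
attempt.call("alice", false) # a failed verify_password
attempt.call("alice", false) # another failed verify_password

locked = begin
  limiter.check!("alice")
  false
rescue LoginRateLimiter::LockedOut
  true
end
```

Because the key is the username rather than the operation, one failed `login` plus two failed `verify_password` calls reach the threshold together, and the next attempt of either kind is refused.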
|
|
479
|
+
|
|
480
|
+
#### Cloud function results are server-authoritative
|
|
481
|
+
|
|
482
|
+
- **IMPROVED**: Documented that decoded cloud function results are treated as
|
|
483
|
+
server-authoritative. A cloud function that returns a Parse object decodes
|
|
484
|
+
through the same trusted path as every query and `fetch` result, so
|
|
485
|
+
server-set fields on the returned object (including `sessionToken` on a
|
|
486
|
+
returned user) are preserved rather than stripped — consistent with how the
|
|
487
|
+
rest of the SDK hydrates server responses. If a cloud function is expected to
|
|
488
|
+
echo back third-party-influenced data that you want to sanitize yourself,
|
|
489
|
+
call it with `raw: true` (`Parse.call_function(name, body, raw: true)`) to
|
|
490
|
+
receive the undecoded response before any object is built.
|
|
491
|
+
|
|
3
492
|
### 5.4.1
|
|
4
493
|
|
|
5
494
|
#### Webhook after_save callback fix
|