parse-stack-next 4.5.0
- checksums.yaml +7 -0
- data/.bundle/config +2 -0
- data/.env.sample +112 -0
- data/.env.test +10 -0
- data/.github/workflows/ruby.yml +36 -0
- data/.gitignore +49 -0
- data/.ruby-version +1 -0
- data/.solargraph.yml +22 -0
- data/CHANGELOG.md +5816 -0
- data/Gemfile +30 -0
- data/Gemfile.lock +175 -0
- data/LICENSE.txt +23 -0
- data/Makefile +63 -0
- data/README.md +5655 -0
- data/Rakefile +573 -0
- data/bin/console +38 -0
- data/bin/parse-console +136 -0
- data/bin/server +17 -0
- data/bin/setup +7 -0
- data/config/parse-config.json +12 -0
- data/docs/TEST_SERVER.md +271 -0
- data/docs/_config.yml +1 -0
- data/docs/mcp_guide.md +3484 -0
- data/docs/mongodb_direct_guide.md +1348 -0
- data/docs/mongodb_index_optimization_guide.md +631 -0
- data/examples/transaction_example.rb +219 -0
- data/lib/parse/acl_scope.rb +728 -0
- data/lib/parse/agent/cancellation_token.rb +80 -0
- data/lib/parse/agent/constraint_translator.rb +480 -0
- data/lib/parse/agent/describe.rb +420 -0
- data/lib/parse/agent/errors.rb +133 -0
- data/lib/parse/agent/mcp_client.rb +557 -0
- data/lib/parse/agent/mcp_dispatcher.rb +1023 -0
- data/lib/parse/agent/mcp_rack_app.rb +1143 -0
- data/lib/parse/agent/mcp_server.rb +376 -0
- data/lib/parse/agent/metadata_audit.rb +259 -0
- data/lib/parse/agent/metadata_dsl.rb +733 -0
- data/lib/parse/agent/metadata_registry.rb +794 -0
- data/lib/parse/agent/pipeline_validator.rb +82 -0
- data/lib/parse/agent/prompts.rb +351 -0
- data/lib/parse/agent/rate_limiter.rb +158 -0
- data/lib/parse/agent/relation_graph.rb +162 -0
- data/lib/parse/agent/result_formatter.rb +453 -0
- data/lib/parse/agent/tools.rb +5489 -0
- data/lib/parse/agent.rb +3249 -0
- data/lib/parse/api/aggregate.rb +79 -0
- data/lib/parse/api/all.rb +26 -0
- data/lib/parse/api/analytics.rb +18 -0
- data/lib/parse/api/batch.rb +33 -0
- data/lib/parse/api/cloud_functions.rb +58 -0
- data/lib/parse/api/config.rb +125 -0
- data/lib/parse/api/files.rb +29 -0
- data/lib/parse/api/hooks.rb +117 -0
- data/lib/parse/api/objects.rb +146 -0
- data/lib/parse/api/path_segment.rb +75 -0
- data/lib/parse/api/push.rb +20 -0
- data/lib/parse/api/schema.rb +49 -0
- data/lib/parse/api/server.rb +50 -0
- data/lib/parse/api/sessions.rb +24 -0
- data/lib/parse/api/users.rb +250 -0
- data/lib/parse/atlas_search/index_manager.rb +353 -0
- data/lib/parse/atlas_search/result.rb +204 -0
- data/lib/parse/atlas_search/search_builder.rb +604 -0
- data/lib/parse/atlas_search/session.rb +253 -0
- data/lib/parse/atlas_search.rb +995 -0
- data/lib/parse/client/authentication.rb +97 -0
- data/lib/parse/client/batch.rb +234 -0
- data/lib/parse/client/body_builder.rb +240 -0
- data/lib/parse/client/caching.rb +203 -0
- data/lib/parse/client/logging.rb +293 -0
- data/lib/parse/client/profiling.rb +181 -0
- data/lib/parse/client/protocol.rb +91 -0
- data/lib/parse/client/request.rb +233 -0
- data/lib/parse/client/response.rb +208 -0
- data/lib/parse/client.rb +1104 -0
- data/lib/parse/clp_scope.rb +361 -0
- data/lib/parse/live_query/circuit_breaker.rb +256 -0
- data/lib/parse/live_query/client.rb +1001 -0
- data/lib/parse/live_query/configuration.rb +224 -0
- data/lib/parse/live_query/event.rb +115 -0
- data/lib/parse/live_query/event_queue.rb +272 -0
- data/lib/parse/live_query/health_monitor.rb +214 -0
- data/lib/parse/live_query/logging.rb +149 -0
- data/lib/parse/live_query/subscription.rb +294 -0
- data/lib/parse/live_query.rb +163 -0
- data/lib/parse/lookup_rewriter.rb +445 -0
- data/lib/parse/model/acl.rb +968 -0
- data/lib/parse/model/associations/belongs_to.rb +275 -0
- data/lib/parse/model/associations/collection_proxy.rb +435 -0
- data/lib/parse/model/associations/has_many.rb +597 -0
- data/lib/parse/model/associations/has_one.rb +158 -0
- data/lib/parse/model/associations/pointer_collection_proxy.rb +134 -0
- data/lib/parse/model/associations/relation_collection_proxy.rb +177 -0
- data/lib/parse/model/bytes.rb +62 -0
- data/lib/parse/model/classes/audience.rb +262 -0
- data/lib/parse/model/classes/installation.rb +363 -0
- data/lib/parse/model/classes/job_schedule.rb +153 -0
- data/lib/parse/model/classes/job_status.rb +264 -0
- data/lib/parse/model/classes/product.rb +75 -0
- data/lib/parse/model/classes/push_status.rb +263 -0
- data/lib/parse/model/classes/role.rb +751 -0
- data/lib/parse/model/classes/session.rb +201 -0
- data/lib/parse/model/classes/user.rb +943 -0
- data/lib/parse/model/clp.rb +544 -0
- data/lib/parse/model/core/actions.rb +1268 -0
- data/lib/parse/model/core/builder.rb +139 -0
- data/lib/parse/model/core/create_lock.rb +386 -0
- data/lib/parse/model/core/describe.rb +382 -0
- data/lib/parse/model/core/enhanced_change_tracking.rb +159 -0
- data/lib/parse/model/core/errors.rb +38 -0
- data/lib/parse/model/core/fetching.rb +566 -0
- data/lib/parse/model/core/field_guards.rb +220 -0
- data/lib/parse/model/core/indexing.rb +382 -0
- data/lib/parse/model/core/parse_reference.rb +407 -0
- data/lib/parse/model/core/properties.rb +809 -0
- data/lib/parse/model/core/querying.rb +491 -0
- data/lib/parse/model/core/schema.rb +202 -0
- data/lib/parse/model/core/search_indexing.rb +174 -0
- data/lib/parse/model/date.rb +88 -0
- data/lib/parse/model/email.rb +213 -0
- data/lib/parse/model/file.rb +527 -0
- data/lib/parse/model/geojson.rb +271 -0
- data/lib/parse/model/geopoint.rb +261 -0
- data/lib/parse/model/model.rb +260 -0
- data/lib/parse/model/object.rb +2068 -0
- data/lib/parse/model/phone.rb +520 -0
- data/lib/parse/model/pointer.rb +443 -0
- data/lib/parse/model/polygon.rb +406 -0
- data/lib/parse/model/push.rb +975 -0
- data/lib/parse/model/shortnames.rb +8 -0
- data/lib/parse/model/time_zone.rb +141 -0
- data/lib/parse/model/validations/uniqueness_validator.rb +97 -0
- data/lib/parse/model/validations.rb +96 -0
- data/lib/parse/mongodb.rb +2300 -0
- data/lib/parse/pipeline_security.rb +554 -0
- data/lib/parse/query/constraint.rb +198 -0
- data/lib/parse/query/constraints.rb +3279 -0
- data/lib/parse/query/cursor.rb +434 -0
- data/lib/parse/query/n_plus_one_detector.rb +445 -0
- data/lib/parse/query/operation.rb +104 -0
- data/lib/parse/query/ordering.rb +66 -0
- data/lib/parse/query.rb +7028 -0
- data/lib/parse/schema/index_migrator.rb +291 -0
- data/lib/parse/schema/search_index_migrator.rb +289 -0
- data/lib/parse/schema.rb +494 -0
- data/lib/parse/stack/generators/rails.rb +40 -0
- data/lib/parse/stack/generators/templates/model.erb +51 -0
- data/lib/parse/stack/generators/templates/model_installation.rb +4 -0
- data/lib/parse/stack/generators/templates/model_role.rb +4 -0
- data/lib/parse/stack/generators/templates/model_session.rb +4 -0
- data/lib/parse/stack/generators/templates/model_user.rb +11 -0
- data/lib/parse/stack/generators/templates/parse.rb +12 -0
- data/lib/parse/stack/generators/templates/webhooks.rb +10 -0
- data/lib/parse/stack/railtie.rb +18 -0
- data/lib/parse/stack/tasks.rb +563 -0
- data/lib/parse/stack/version.rb +11 -0
- data/lib/parse/stack.rb +455 -0
- data/lib/parse/two_factor_auth/user_extension.rb +449 -0
- data/lib/parse/two_factor_auth.rb +310 -0
- data/lib/parse/webhooks/payload.rb +360 -0
- data/lib/parse/webhooks/registration.rb +199 -0
- data/lib/parse/webhooks/replay_protection.rb +189 -0
- data/lib/parse/webhooks.rb +510 -0
- data/lib/parse-stack-next.rb +5 -0
- data/lib/parse-stack.rb +5 -0
- data/parse-stack-next.gemspec +82 -0
- data/parse-stack.png +0 -0
- data/scripts/debug-ips.js +35 -0
- data/scripts/docker/Dockerfile.parse +13 -0
- data/scripts/docker/atlas-init.js +284 -0
- data/scripts/docker/docker-compose.atlas.yml +76 -0
- data/scripts/docker/docker-compose.test.yml +106 -0
- data/scripts/docker/mongo-init.js +21 -0
- data/scripts/eval_mcp_with_lm_studio.rb +274 -0
- data/scripts/start-parse.sh +90 -0
- data/scripts/start_mcp_server.rb +78 -0
- data/scripts/test_server_connection.rb +82 -0
- metadata +377 -0
@@ -0,0 +1,631 @@

# MongoDB Index Optimization Guide

How to think about MongoDB indexes when running Parse Server on top of
Mongo, and how to wield the `mongo_index` / `mongo_relation_index` DSL
in `parse-stack` to land the indexes you actually need without exhausting
the 64-per-collection budget.

This guide assumes familiarity with the API surface; see
[mongodb_direct_guide.md](./mongodb_direct_guide.md) for the
declaration / migration / writer-URI mechanics. This document is about
**WHEN to add an index, WHICH shape to use, and WHEN to drop one**.

---

## TL;DR

- Add an index for every read pattern you run more than ~once per
  second per host. Below that rate, a collection scan is usually fine.
- For compound indexes, order fields by **ESR**: Equality, Sort, Range.
- For Parse Relations, the reverse-direction index (`relatedId`) is
  often heavier-used than the forward one; declare `bidirectional: true`.
- Drop indexes whose `$indexStats` ops counter stays at 0 across a few
  weeks of normal traffic.
- The 64-index-per-collection cap exists for a reason: every write
  pays the cost of every index. Don't index fields you only return and
  never filter or sort on.

---

## When to add an index

The right question is **"is this query slow at production scale?"**,
not "should I index this column?". An unindexed column is fine when:

- The collection has few documents (< 10k), so the scan is cheap.
- The query runs rarely (background jobs, admin tools, debugging).
- The query is selective enough that other indexes already prune the
  candidate set down to a small handful before the unindexed predicate runs.

An index is needed when:

- The query runs in a hot path (request response, agent tool, public
  API) where the latency budget matters.
- The collection is large (> 100k documents).
- The query predicate selectivity is high: a small fraction of rows match.
- You're sorting on the column for paged results (`order(:field.desc)`).

### Read/write tradeoff

Every index costs **write amplification**: each insert and delete
touches every index on the collection, and each update rewrites the
entry in every index that covers a changed field. As a rough model, a
collection with 8 indexes pays on the order of 8× the index-maintenance
cost of a collection with one. For high-throughput write paths, fewer
indexes win. For read-heavy paths, more indexes win.

Parse-stack collections are usually read-heavy (Parse apps tend to
read 10–100× more than they write), so the budget skews toward
indexing. But monitor it: `$indexStats` will tell you if you're
paying for an index nobody uses.

---

## Index types in parse-on-Mongo

| Type | DSL spelling | When to use |
|---|---|---|
| Regular B-tree | `mongo_index :field` | Equality, range, sort on a scalar field |
| Compound | `mongo_index :a, :b, :c` | Multi-field queries with a common prefix |
| Unique | `mongo_index :field, unique: true` | Enforce uniqueness at the DB layer |
| Sparse | `mongo_index :field, sparse: true` | Field present on only some documents |
| Partial | `mongo_index :field, partial: { … }` | Index only documents matching a filter |
| TTL | `mongo_index :field, expire_after: N` | Auto-delete documents N seconds after the timestamp |
| 2dsphere (geo) | `mongo_geo_index :location` | Geographic queries on `geopoint` columns |
| Relation | `mongo_relation_index :field, bidirectional: true` | Indexes on `_Join:*` collections |

Hashed and text indexes are intentionally not exposed via the DSL yet.
If you need them, declare them via `Parse::MongoDB.create_index`
directly. Atlas Search indexes use a different mechanism (the
`createSearchIndexes` / `dropSearchIndex` / `updateSearchIndex`
commands) and are managed separately from the `mongo_index` DSL; see
the "Atlas Search indexes" section below.

---

## Compound indexes: the ESR rule

The single most important compound-index rule: **Equality, Sort,
Range**, in that order, left to right.

Given a query like:

```ruby
Song.query(:artist => "ArtistName",          # equality
           :released.between(2020, 2024))    # range
    .order(:plays.desc)                      # sort
```

The optimal index is:

```ruby
mongo_index :artist,    # E: equality narrows fastest
            :plays,     # S: sort piggybacks on index order
            :released   # R: range further narrows the sorted set
```
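
The ESR ordering above can be sketched as a tiny helper. This is only an illustration: `esr_order` is a hypothetical name and is not part of parse-stack. It takes the fields of a query grouped by role and returns them in the order they would be passed to `mongo_index`.

```ruby
# Hypothetical helper (not a parse-stack API): orders index fields by ESR.
def esr_order(equality: [], sort: [], range: [])
  # uniq keeps the first occurrence, so a field that appears in more than
  # one role stays in its earliest (leftmost) position.
  (equality + sort + range).uniq
end

fields = esr_order(equality: [:artist], sort: [:plays], range: [:released])
puts "mongo_index #{fields.map(&:inspect).join(', ')}"
```

Writing the query roles down first and deriving the field order from them keeps the index tied to a real query shape rather than to a guess.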

The MongoDB query planner uses the leftmost prefix of a compound
index. So `{artist:1, plays:-1, released:1}` can serve:

- Queries on `artist` alone
- Queries on `artist` + `plays`
- Queries on `artist` + `plays` + `released`
- Sorts on `plays` after filtering by `artist`

It **cannot** efficiently serve:

- Queries on `plays` alone (no leftmost `artist`)
- Queries on `released` alone

When in doubt, run `Query#explain` and inspect the `winningPlan`:
look for `IXSCAN` (index scan) vs `COLLSCAN` (full collection scan).
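
Checking a plan for `COLLSCAN` by eye gets tedious when stages are nested. The sketch below walks an explain document and reports whether any stage is a collection scan. It assumes the explain output has already been obtained as a plain Ruby Hash with string keys in the classic `queryPlanner.winningPlan` shape; `plan_stages` and `collection_scan?` are hypothetical helper names, and the sample hashes are made up for illustration.

```ruby
# Collects every stage name in a plan by following the
# "inputStage" / "inputStages" links that explain output uses for nesting.
def plan_stages(plan)
  return [] unless plan.is_a?(Hash)
  children = [plan["inputStage"], *Array(plan["inputStages"])].compact
  [plan["stage"], *children.flat_map { |child| plan_stages(child) }].compact
end

# True when the winning plan contains a full collection scan.
def collection_scan?(explain)
  winning = explain.dig("queryPlanner", "winningPlan")
  plan_stages(winning).include?("COLLSCAN")
end

# Illustrative documents only; real explain output carries many more keys.
indexed = { "queryPlanner" => { "winningPlan" => {
  "stage" => "FETCH",
  "inputStage" => { "stage" => "IXSCAN", "indexName" => "artist_1_plays_1_released_1" },
} } }
unindexed = { "queryPlanner" => { "winningPlan" => { "stage" => "COLLSCAN" } } }

puts collection_scan?(indexed)    # false
puts collection_scan?(unindexed)  # true
```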

### Order matters: getting it wrong costs an index

Putting Range before Sort is the most common mistake:

```ruby
# WRONG ORDER for the query above:
mongo_index :artist, :released, :plays
# Mongo scans the `released` range first, then has to sort by `plays`
# in memory, so the index no longer provides the sort order.
```

The compound forms ONE index, so picking the wrong order means
declaring a SECOND index to fix it later, eating another slot from
your 64-per-collection budget.

### Sort direction matters less than you'd think

`{plays:-1}` and `{plays:1}` can both serve `.order(:plays)` AND
`.order(:plays.desc)`, because MongoDB walks the index in either direction.
The direction only matters when the SORT crosses fields:

```ruby
# These two key specs are NOT interchangeable for serving
# .order(:released.asc, :plays.desc):
mongo_index :released, :plays   # {released: 1, plays: 1}: serves released ASC, plays ASC
# mongo_index :released, ..., name: "rel_p_neg"
#   must be declared explicitly as {released: 1, plays: -1}
#   to serve released ASC, plays DESC.
```

For most Parse models, single-direction indexes are fine. Worry about
multi-direction only when you have multi-field `order` clauses.
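
The direction rule can be stated compactly: an index provides a sort if the sort matches a leftmost prefix of the index keys either exactly or with every direction flipped. The sketch below encodes just that rule. `index_serves_sort?` is a hypothetical helper, and it deliberately ignores equality prefixes (which MongoDB can skip over), so treat it as a simplified model rather than the planner's real logic.

```ruby
# index and sort are ordered arrays of [field, direction] pairs,
# where direction is 1 (ascending) or -1 (descending).
def index_serves_sort?(index, sort)
  prefix = index.first(sort.length)
  return false if prefix.length < sort.length
  # Walking the index backwards flips every direction at once.
  reversed = prefix.map { |field, dir| [field, -dir] }
  sort == prefix || sort == reversed
end

asc_asc  = [[:released, 1], [:plays, 1]]
asc_desc = [[:released, 1], [:plays, -1]]
wanted   = [[:released, 1], [:plays, -1]]   # .order(:released.asc, :plays.desc)

puts index_serves_sort?(asc_asc, wanted)                  # false
puts index_serves_sort?(asc_desc, wanted)                 # true
puts index_serves_sort?([[:plays, -1]], [[:plays, 1]])    # true
```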

---

## Parse-Stack-specific patterns

### `belongs_to` → `_p_` pointer columns

Parse stores `belongs_to :owner` as the column `_p_owner`, a string of
the form `"ClassName$objectId"` (for example `"_User$<objectId>"`).
The DSL auto-rewrites:

```ruby
class Post < Parse::Object
  belongs_to :owner, as: :user
  mongo_index :owner   # → declared as _p_owner under the hood
end
```

**Every `belongs_to` that you filter on regularly should be indexed.**
This is the single highest-payoff index pattern in Parse-on-Mongo
schemas: without it, "fetch all posts by this user" is a full scan
of `Post` for every request.

If you also sort, make it a compound:

```ruby
mongo_index :owner, :created_at   # belongs_to + chronological
```

### `parse_reference` uniqueness

The `parse_reference` declaration auto-registers its index as
`unique: true, sparse: true`. The `synchronize_create` correctness
floor depends on this index existing.

Opt out only when you're certain duplicates are intentional:

```ruby
parse_reference unique_index: false   # index without unique constraint
parse_reference index: false          # no index at all
```

### `_rperm` / `_wperm` ACL filtering

Parse stores per-row ACL as arrays in `_rperm` / `_wperm`. When the
SDK runs scoped queries (under `session_token:`, `acl_user:`, or
`acl_role:`), it injects a `$match` on `_rperm` that includes the
caller's claim set. Without an index on `_rperm`, every ACL-scoped
query is a collection scan with row-level filtering.

```ruby
# For any class with significant per-row ACLs:
mongo_index :_rperm   # ACL read predicate scan
# Don't compound with another array: Mongo's parallel-array rule
# applies. The DSL catches this at registration time.
```

For very heavy multi-tenant patterns, partial indexes on `_rperm`
serving specific role claim shapes can help, but that's a tuning
problem, not a default. Add `_rperm` indexes only where ACL queries
show up in `$indexStats`-derived hot lists.

### Relation join collections

Parse Relations store one document per (owner, related) edge in
`_Join:<field>:<ParentClass>`. The two columns are `owningId` (the
parent's objectId) and `relatedId` (the related's objectId). Both
are plain string objectIds, not BSON ObjectIds.

Two access patterns matter:

- **Forward**: "what's related to this owner?" needs `{owningId: 1}`
- **Reverse**: "which owners contain this related object?" needs `{relatedId: 1}`

For `Parse::Role.users`, the reverse direction is canonically the
heavier-used one (every auth call needs "which roles is this user
in?"). For most other relations, forward dominates.

```ruby
class Parse::Role < Parse::Object
  has_many :users, through: :relation
  mongo_relation_index :users, bidirectional: true
end
```

If only one direction is hot, drop `bidirectional:` and pay for just
one index from the budget.
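
To make the join-collection naming concrete, here is a small sketch that derives the collection name and the key specs for a relation. `relation_join_indexes` is a hypothetical helper, not the DSL's implementation, and it assumes that a non-bidirectional declaration indexes the forward (`owningId`) direction, which the text above does not state.

```ruby
# Hypothetical helper: where the relation's edges live and which key specs
# the forward / reverse access patterns need.
def relation_join_indexes(parent_class, field, bidirectional: false)
  collection = "_Join:#{field}:#{parent_class}"
  keys = [{ "owningId" => 1 }]                    # forward: owner -> related (assumed default)
  keys << { "relatedId" => 1 } if bidirectional   # reverse: related -> owners
  { collection: collection, keys: keys }
end

plan = relation_join_indexes("_Role", "users", bidirectional: true)
puts plan[:collection]   # _Join:users:_Role
puts plan[:keys].length  # 2
```

Each key spec in the result costs one slot of the join collection's own 64-index budget, not the parent collection's.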

---

## The 64-index-per-collection cap

MongoDB hard-caps indexes at 64 per collection. The migrator enforces
this at plan time: if `existing + to_create > 64`, `apply!` returns
`{capacity_blocked: true, ...}` without issuing any creates.

MongoDB and Parse Server auto-create several indexes you don't see in
declarations (`_id_`, `_username_unique`, `_email_unique`,
`_session_token_*`, etc.). They count against your 64.
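
The plan-time check is simple arithmetic, and it is worth seeing that exactly 64 is still allowed. The sketch below mirrors the `existing + to_create > 64` rule described above; `capacity_check` and its return keys other than `capacity_blocked` are hypothetical, not the migrator's actual output.

```ruby
MAX_INDEXES_PER_COLLECTION = 64

# existing_names: indexes already on the collection (including auto-created ones).
# to_create_names: declared indexes that do not exist yet.
def capacity_check(existing_names, to_create_names, cap: MAX_INDEXES_PER_COLLECTION)
  total = existing_names.size + to_create_names.size
  { capacity_blocked: total > cap, total: total, headroom: cap - total }
end

existing = (1..62).map { |i| "idx_#{i}" }
p capacity_check(existing, %w[a_1 b_1])       # total 64, not blocked
p capacity_check(existing, %w[a_1 b_1 c_1])   # total 65, blocked
```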

### Budget per collection size

Rough guidance for healthy budgets:

| Collection size | Reasonable index count |
|---|---|
| < 10k documents | 1–3 (just `_id_` plus the obvious `belongs_to`) |
| 10k – 1M | 5–12 |
| 1M – 100M | 10–25 |
| > 100M | 15–40, but tune aggressively |

If you're approaching 50 indexes on one collection, you've probably
duplicated work: multiple compounds that subsume each other. Audit
with `$indexStats`.

### How to choose what to drop

Use `Model.describe(:indexes, network: true, usage: true)`:

```ruby
Song.describe(:indexes, network: true, usage: true)
# Each index entry includes a :usage sub-hash with `ops` (count since
# last Mongo restart) and `since` (the restart timestamp).
```

Heuristics for dropping:

- **`ops == 0` and the Mongo restart was > 14 days ago** → almost
  certainly unused. The `since` field tells you the counting window;
  if it's recent, wait longer.
- **One compound subsumes another** → keep only the most specific. A
  `{a:1, b:1, c:1}` index serves all queries on `{a:1}` alone and on
  `{a:1, b:1}`, so dropping those shorter compounds is safe IF the
  query planner picks the long one (verify with `explain`).
- **`ops` is < 1% of `_id_`'s ops** → the index is rarely useful;
  consider whether the queries that use it can be served by another
  index.

`$indexStats` resets on Mongo restart. Don't drop based on the first
day's data; sample a few weeks.
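
The first two heuristics are mechanical enough to script. The sketch below applies them to a list of plain hashes; the input shape (`name`, `keys`, `ops`, `since`, `unique`) is an assumption made for this example and is not the structure `describe` returns. Unique indexes are never flagged as subsumed, because a longer compound does not enforce the shorter index's uniqueness.

```ruby
DAY_SECONDS = 24 * 60 * 60

# True when `short` is a strict leftmost prefix of `long`.
def key_prefix?(short, long)
  short.length < long.length && long.first(short.length) == short
end

# Returns { index_name => :unused | :subsumed } for drop candidates.
def drop_candidates(stats, now: Time.now, min_window_days: 14)
  stats.each_with_object({}) do |index, reasons|
    next if index[:name] == "_id_"
    window_days = (now - index[:since]) / DAY_SECONDS
    if index[:ops].zero? && window_days > min_window_days
      reasons[index[:name]] = :unused
    elsif !index[:unique] && stats.any? { |other| key_prefix?(index[:keys], other[:keys]) }
      reasons[index[:name]] = :subsumed
    end
  end
end

since = Time.utc(2025, 1, 1)
stats = [
  { name: "_id_",             keys: [[:_id, 1]],                 ops: 90_000, since: since },
  { name: "artist_1",         keys: [[:artist, 1]],              ops: 1_200,  since: since },
  { name: "artist_1_plays_1", keys: [[:artist, 1], [:plays, 1]], ops: 4_000,  since: since },
  { name: "genre_1",          keys: [[:genre, 1]],               ops: 0,      since: since },
]
# Flags artist_1 as :subsumed and genre_1 as :unused.
p drop_candidates(stats, now: Time.utc(2025, 1, 31))
```

A `:subsumed` result is only a candidate: as noted above, confirm with `explain` that the planner actually picks the longer index before dropping anything.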

---

## Common mistakes

### Indexing every field

The reflex to "just add an index" creates a different problem: every
write hits every index. Saving a `Song` with 10 indexes is roughly 10×
the index work of saving one with 1.

**Better:** start with the indexes that the obvious `belongs_to`
columns and `parse_reference` need, then add more as you find slow queries.

### Wrong compound order

Putting Range or Sort before Equality means the index doesn't help
the predominant query, and now you've spent one of your 64 slots on
something useless.

**Better:** write the actual query first, then derive the index
ordering via ESR.

### Unique on null-heavy fields without sparse

A plain `unique: true` index treats `null` (and missing) as a value.
You can have ONE document with `field: null` before the constraint
fails.

**Better:** `unique: true, sparse: true` for "unique when present".
A sparse index skips only documents where the field is absent; an
explicit `null` is still indexed, so leave the field unset rather than
storing `null`. This is exactly what `parse_reference` auto-registers,
and it's the right pattern for any optional uniqueness constraint.

### Geo without proper coordinate order

GeoJSON `Point` coordinates are `[longitude, latitude]`, in that
order. Latitude-first will index but return wrong results for
proximity queries.

**Better:** `mongo_geo_index :location` and let parse-stack's
`Parse::GeoPoint` serializer handle the order. Avoid hand-crafted
`{type: "Point", coordinates: [...]}` documents.
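
If a script does have to build a GeoJSON point by hand, keyword arguments make the swap hard to commit silently. The helper below is a hypothetical sketch, not parse-stack's serializer: it takes named latitude and longitude, range-checks them, and emits the coordinates longitude-first.

```ruby
# Builds a GeoJSON Point with coordinates in [longitude, latitude] order.
def geojson_point(latitude:, longitude:)
  raise ArgumentError, "latitude out of range" unless (-90..90).cover?(latitude)
  raise ArgumentError, "longitude out of range" unless (-180..180).cover?(longitude)
  { type: "Point", coordinates: [longitude, latitude] }
end

point = geojson_point(latitude: 32.7157, longitude: -117.1611)
p point[:coordinates]   # [-117.1611, 32.7157]
```

The range check only catches swaps where the longitude falls outside ±90, so it is a safety net rather than a guarantee.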

### Parallel arrays in a compound

`mongo_index :tags, :categories`, where both fields hold arrays, is
rejected by MongoDB with "cannot index parallel arrays". The DSL catches
this at registration, but the equivalent raw `Parse::MongoDB.create_index`
call bypasses the guard and fails at apply time.

**Better:** declare two separate single-field indexes. If you call
`Parse::MongoDB.create_index` directly, apply the same parallel-array
check to the keys hash yourself.
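
A guard of this kind is short to write for raw calls. The sketch below is a hypothetical stand-in, not the DSL's own check: it assumes the caller already knows which fields are array-typed and refuses any compound that contains more than one of them.

```ruby
# index_fields: fields of the compound index, in order.
# array_fields: fields known to hold arrays on this model (caller-supplied).
def check_parallel_arrays!(index_fields, array_fields)
  arrays = index_fields & array_fields
  return index_fields if arrays.length <= 1
  raise ArgumentError, "cannot index parallel arrays: #{arrays.join(', ')}"
end

array_fields = [:tags, :categories, :_rperm]
p check_parallel_arrays!([:artist, :tags], array_fields)   # [:artist, :tags]

begin
  check_parallel_arrays!([:tags, :categories], array_fields)
rescue ArgumentError => e
  puts e.message   # cannot index parallel arrays: tags, categories
end
```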
|
|
343
|
+
|
|
344
|
+
### Indexing `_id` explicitly
|
|
345
|
+
|
|
346
|
+
MongoDB auto-creates `_id_` (the primary key). Declaring `mongo_index :_id`
|
|
347
|
+
either creates a redundant index or triggers an `IndexOptionsConflict`.
|
|
348
|
+
The DSL rejects this at registration.
|
|
349
|
+
|
|
350
|
+
**Better:** trust the implicit `_id_`. Don't try to control it.
|
|
351
|
+
|
|
352
|
+
---
|
|
353
|
+
|
|
354
|
+
## Workflow: discover → plan → apply
|
|
355
|
+
|
|
356
|
+
Typical lifecycle for an index addition:
|
|
357
|
+
|
|
358
|
+
1. **Discover the slow query.** Use `Parse::Query#explain` (mongo-direct
|
|
359
|
+
path) or `db.collection.explain()` to confirm a `COLLSCAN`. Look at
|
|
360
|
+
`executionStats.totalDocsExamined` vs the actual result size — if
|
|
361
|
+
they diverge, an index would help.
|
|
362
|
+
|
|
363
|
+
2. **Plan the index.** Apply ESR to the query shape. Pick the field
|
|
364
|
+
order. Decide unique/sparse/partial.
|
|
365
|
+
|
|
366
|
+
3. **Declare in the model.** Add `mongo_index :a, :b, ...` to the
|
|
367
|
+
model file. The class loads — validation runs.
|
|
368
|
+
|
|
369
|
+
4. **Plan in dry-run.** Run `Model.indexes_plan` (or the rake task)
|
|
370
|
+
to confirm the migrator sees the declaration and classifies it as
|
|
371
|
+
`to_create`. Verify capacity headroom.
|
|
372
|
+
|
|
373
|
+
5. **Apply.** With the writer URI configured and the triple-gate flipped:
|
|
374
|
+
```bash
|
|
375
|
+
PARSE_MONGO_INDEX_MUTATIONS=1 rake parse:mongo:indexes:apply CLASS=Song
|
|
376
|
+
```
|
|
377
|
+
The migrator is additive — never drops without `DROP=true`.
|
|
378
|
+
|
|
379
|
+
6. **Verify.** Re-run the slow query, check `explain` shows `IXSCAN`.
|
|
380
|
+
Check `Model.describe(:indexes, network: true, usage: true)` shows
|
|
381
|
+
ops counting up.
|
|
382
|
+
|
|
383
|
+
7. **Monitor.** Periodic `$indexStats` audits catch indexes that
|
|
384
|
+
stopped being useful when query patterns shifted.
|
|
385
|
+
|
|
386
|
+
---
|
|
387
|
+
|
|
388
|
+
## Atlas Search indexes
|
|
389
|
+
|
|
390
|
+
Atlas Search indexes are a different beast from regular MongoDB indexes
|
|
391
|
+
and live on a different infrastructure path. They are NOT covered by
|
|
392
|
+
the `mongo_index` DSL or `parse:mongo:indexes:apply`. They are not
|
|
393
|
+
counted against the 64-index-per-collection cap (separate budget,
|
|
394
|
+
separate node). Use them for **full-text search, autocomplete, faceted
|
|
395
|
+
search, and vector similarity** — workloads a B-tree can't satisfy.
|
|
396
|
+
|
|
397
|
+
### When to reach for an Atlas Search index instead of a regular one
|
|
398
|
+
|
|
399
|
+
| Workload | Right tool |
|
|
400
|
+
|---|---|
|
|
401
|
+
| `find_by_title("exact match")` | regular index on `title` |
|
|
402
|
+
| `find_by_title_prefix("hel")` | regular index on `title` (uses `^hel` regex anchored) |
|
|
403
|
+
| Substring match: `title CONTAINS "ello"` | **Atlas Search** (`text` analyzer) |
|
|
404
|
+
| Misspelling tolerance: `helo` matches `hello` | **Atlas Search** (`text` + fuzzy) |
|
|
405
|
+
| Typeahead / autocomplete | **Atlas Search** (`autocomplete` field type) |
|
|
406
|
+
| Multi-field ranked search ("title OR body OR tags") | **Atlas Search** (compound query, BM25 scoring) |
|
|
407
|
+
| Facet counts (genre histogram) | **Atlas Search** (`$searchMeta`, `facet` operator) |
|
|
408
|
+
| Vector similarity (embeddings) | **Atlas Search** (`vectorSearch` index type) |
|
|
409
|
+
|
|
410
|
+
If the query plan compiles to a `$text` stage or a `^anchored` regex,
|
|
411
|
+
a regular index is enough. If the query needs ranking, fuzziness, or
|
|
412
|
+
analyzer-driven tokenization, you want Atlas Search.
|
|
413
|
+
|
|
414
|
+
### Declaring vs. managing
|
|
415
|
+
|
|
416
|
+
Regular indexes are **declared** on the model (`mongo_index :title`)
|
|
417
|
+
and reconciled by `parse:mongo:indexes:apply`. Atlas Search indexes
|
|
418
|
+
follow the same pattern with `mongo_search_index` + a parallel rake
|
|
419
|
+
task, but with looser semantics — definitions are opaque (the DSL
|
|
420
|
+
doesn't introspect field references; Atlas owns the mapping shape),
|
|
421
|
+
drift is reported-and-refused rather than auto-applied, and builds
|
|
422
|
+
run asynchronously so the rake task is fire-and-forget by default.
|
|
423
|
+
|
|
424
|
+
```ruby
|
|
425
|
+
class Song < Parse::Object
|
|
426
|
+
property :title, :string
|
|
427
|
+
property :artist, :string
|
|
428
|
+
|
|
429
|
+
mongo_search_index "song_search", {
|
|
430
|
+
mappings: { dynamic: false, fields: {
|
|
431
|
+
title: { type: "string", analyzer: "lucene.standard" },
|
|
432
|
+
artist: { type: "string" },
|
|
433
|
+
} },
|
|
434
|
+
}
|
|
435
|
+
mongo_search_index "song_autocomplete", {
|
|
436
|
+
mappings: { fields: {
|
|
437
|
+
title: { type: "autocomplete", tokenization: "edgeGram" },
|
|
438
|
+
} },
|
|
439
|
+
}
|
|
440
|
+
end
|
|
441
|
+
|
|
442
|
+
Song.search_indexes_plan # dry-run
|
|
443
|
+
Song.apply_search_indexes! # additive — only creates to_create
|
|
444
|
+
Song.apply_search_indexes!(update: true, wait: true) # rebuild drifted, block until READY
|
|
445
|
+
```
|
|
446
|
+
|
|
447
|
+
If you don't want the DSL — for one-off scripts, a model that needs
|
|
448
|
+
analyzers that don't round-trip cleanly, or a vector-search index
|
|
449
|
+
whose definition lives in a separate JSON file — call the raw
|
|
450
|
+
`Parse::AtlasSearch::IndexManager` / `Parse::MongoDB` primitives
|
|
451
|
+
directly. Both routes use the same writer connection and the same
|
|
452
|
+
triple-gate.
|
|
453
|
+
|
|
454
|
+
The triple-gate (writer URI + `index_mutations_enabled` +
|
|
455
|
+
`ENV["PARSE_MONGO_INDEX_MUTATIONS"]`) applies the same way it does
|
|
456
|
+
for regular index mutations. The writer role additionally needs the
|
|
457
|
+
`createSearchIndexes` / `dropSearchIndex` / `updateSearchIndex` /
|
|
458
|
+
`listSearchIndexes` Mongo actions granted by your operator.
|
|
459
|
+
|
|
460
|
+
### Creating an Atlas Search index
|
|
461
|
+
|
|
462
|
+
```ruby
|
|
463
|
+
Parse::AtlasSearch::IndexManager.create_index(
|
|
464
|
+
"Song",
|
|
465
|
+
"song_search",
|
|
466
|
+
{
|
|
467
|
+
mappings: {
|
|
468
|
+
dynamic: false,
|
|
469
|
+
fields: {
|
|
470
|
+
title: { type: "string", analyzer: "lucene.standard" },
|
|
471
|
+
artist: { type: "string", analyzer: "lucene.standard" },
|
|
472
|
+
tags: { type: "string" },
|
|
473
|
+
},
|
|
474
|
+
},
|
|
475
|
+
},
|
|
476
|
+
)
|
|
477
|
+
# => :created (build is async)
|
|
478
|
+
```
|
|
479
|
+
|
|
480
|
+
Return values mirror the regular-index primitives: `:created` on
|
|
481
|
+
submission, `:exists` when an index with that name is already present.
|
|
482
|
+
The wrapper (`Parse::AtlasSearch::IndexManager.create_index`) clears
|
|
483
|
+
the IndexManager's process-local index cache after a successful
|
|
484
|
+
submission so subsequent introspection sees the new index. The
|
|
485
|
+
underlying primitive (`Parse::MongoDB.create_search_index`) does NOT
|
|
486
|
+
touch the cache — callers using it directly must invalidate manually
|
|
487
|
+
via `IndexManager.clear_cache(collection_name)`.
|
|
488
|
+
|
|
489
|
+
**Idempotency is name-based, not definition-based.** If you re-run
|
|
490
|
+
`create_index` with a different `definition:` against an existing
|
|
491
|
+
name, the call returns `:exists` and silently does nothing. To change
|
|
492
|
+
a definition, call `update_index` explicitly.

### Dropping an Atlas Search index

```ruby
Parse::AtlasSearch::IndexManager.drop_index(
  "Song",
  "song_search",
  confirm: "drop_search:Song:song_search",
)
# => :dropped
```

The confirm-token prefix is `drop_search:` (not `drop:`) so a token
prepared for a regular `Parse::MongoDB.drop_index` call cannot be
replayed against a search index that happens to share its name, and
vice versa.
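
The prefix separation can be sketched as a simple string comparison. The helpers below are hypothetical and only mirror the token shape shown in the example (`drop_search:<collection>:<index name>`); the gem's actual validation may differ.

```ruby
# Hypothetical helpers illustrating why distinct prefixes prevent replay.
SEARCH_DROP_PREFIX = "drop_search"

# Builds the token a caller is expected to pass as confirm:.
def search_drop_token(collection, index_name)
  "#{SEARCH_DROP_PREFIX}:#{collection}:#{index_name}"
end

# Accepts only an exact match, so a "drop:Song:song_search" token
# prepared for a regular index is rejected.
def confirm_search_drop!(collection, index_name, confirm)
  expected = search_drop_token(collection, index_name)
  return true if confirm == expected

  raise ArgumentError, "confirm must equal #{expected.inspect}"
end

token = search_drop_token("Song", "song_search")
accepted = confirm_search_drop!("Song", "song_search", token)
```

Generating the token from the collection and index name at the call site, rather than hard-coding it, keeps the confirmation tied to the exact target.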

### Replacing an Atlas Search index definition

```ruby
Parse::AtlasSearch::IndexManager.update_index(
  "Song",
  "song_search",
  { mappings: { dynamic: true } },
)
# => :updated
```

`update_index` requires the named index to already exist (raises
`ArgumentError` otherwise; use `create_index` for new indexes). The
rebuild runs asynchronously; the new mapping is not live until the
index returns to `READY` status.
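
Because `create_index` returns `:exists` for a known name and `update_index` raises for an unknown one, a deploy script that wants "create or replace" semantics has to combine the two. The helper below is a hypothetical sketch of that combination. It takes the manager as a parameter so it can be exercised against a stub; passing `Parse::AtlasSearch::IndexManager` itself is an assumption based on the method signatures shown above.

```ruby
# Hypothetical helper, not part of the gem: create the index if the name is
# new, otherwise replace its definition.
def ensure_search_index(manager, collection, name, definition)
  result = manager.create_index(collection, name, definition)
  return result unless result == :exists

  manager.update_index(collection, name, definition)
end

# Minimal stub that mimics the documented return values, used only to
# demonstrate the control flow without a database.
class StubManager
  attr_reader :calls

  def initialize(existing: [])
    @existing = existing.dup
    @calls = []
  end

  def create_index(collection, name, _definition)
    @calls << :create_index
    return :exists if @existing.include?([collection, name])

    @existing << [collection, name]
    :created
  end

  def update_index(collection, name, _definition)
    @calls << :update_index
    raise ArgumentError, "index does not exist" unless @existing.include?([collection, name])

    :updated
  end
end

fresh = StubManager.new
fresh_result = ensure_search_index(fresh, "Song", "song_search", { mappings: { dynamic: false } })

present = StubManager.new(existing: [["Song", "song_search"]])
present_result = ensure_search_index(present, "Song", "song_search", { mappings: { dynamic: true } })
```

Note that this sketch issues an update whenever the name exists, even if the definition is unchanged, which triggers a rebuild each time. Compare definitions first if that matters for your deploys.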

### Waiting for an async build (and a footgun)

Atlas Search builds are not synchronous. `create_index` and
`update_index` return as soon as the command is accepted; the index
transitions through `BUILDING` to `READY` over seconds to minutes
depending on collection size and definition complexity.

The naive polling pattern has a sharp edge: the IndexManager's
default cache TTL is 300 seconds, and a poll loop that hits
`index_ready?` immediately after a mutation will cache the
`queryable: false` BUILDING state for up to five minutes:

```ruby
# ANTI-PATTERN: caches the BUILDING state
Parse::AtlasSearch::IndexManager.create_index("Song", "song_search", definition)
until Parse::AtlasSearch::IndexManager.index_ready?("Song", "song_search")
  sleep 2
end
# Loops for the full TTL even after the index goes READY.
```

Use `wait_for_ready` instead. It polls `list_indexes` with
`force_refresh: true` on every iteration so the cache cannot lock in
the BUILDING state, and surfaces `:failed` and `:timeout` outcomes
explicitly:

```ruby
Parse::AtlasSearch::IndexManager.create_index("Song", "song_search", definition)

case Parse::AtlasSearch::IndexManager.wait_for_ready(
  "Song", "song_search", timeout: 600, interval: 5,
)
when :ready then # index is queryable
when :failed then raise "search index build failed"
when :timeout then raise "search index did not become ready within 600s"
end
```

If you have a reason to roll your own loop (custom timeout strategy,
sidecar process polling, etc.), pass `force_refresh: true` to
`list_indexes` on every iteration, or lower the cache TTL globally:

```ruby
Parse::AtlasSearch::IndexManager.cache_ttl = 30 # or 0 to disable
```
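
A hand-rolled loop along those lines might look like the sketch below. `poll_until_ready` is a hypothetical helper. It assumes each index entry is a hash with string keys `"name"` and `"status"` (the shape Atlas reports for search indexes), which may not match what `list_indexes` returns in this gem. The fetch step is injected as a lambda so the loop can be tested without a cluster.

```ruby
# Hypothetical polling helper. fetch_indexes must return a fresh (uncached)
# list of index entries on every call.
def poll_until_ready(name, fetch_indexes:, timeout: 600, interval: 5, sleeper: ->(s) { sleep(s) })
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    index = fetch_indexes.call.find { |entry| entry["name"] == name }
    status = index && index["status"] # assumed key names
    return :ready if status == "READY"
    return :failed if status == "FAILED"
    return :timeout if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleeper.call(interval)
  end
end

# Demonstration with a fake fetcher that reports BUILDING twice, then READY.
states = %w[BUILDING BUILDING READY]
fetches = 0
fake_fetch = lambda do
  fetches += 1
  [{ "name" => "song_search", "status" => states.shift || "READY" }]
end

outcome = poll_until_ready("song_search", fetch_indexes: fake_fetch, interval: 0, sleeper: ->(_s) {})
```

Against a real cluster the fetcher would presumably wrap something like `Parse::AtlasSearch::IndexManager.list_indexes("Song", force_refresh: true)`; the positional collection argument there is an assumption, so check the method's actual signature.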

### Budget and write cost

Atlas Search indexes have a separate per-cluster limit set by Atlas
(typically generous: dozens per collection). They DO carry an
ongoing cost: every write to an indexed field triggers a search-side
update. The same "don't index what you don't search" discipline
applies: a `mappings.dynamic: true` index over a write-heavy
collection will silently double or triple your storage and update
load.

If you're paying for Atlas Search, prefer **explicit field mappings**
(`mappings.dynamic: false` with an enumerated `fields:` map) over
`dynamic: true` for any collection above ~10k docs or above modest
write throughput. Dynamic mappings are convenient for prototyping;
explicit mappings are correct for production.
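
One way to keep explicit mappings from becoming tedious is to generate the definition from the list of fields you actually search. The helper below is a hypothetical sketch; it defaults every field to `type: "string"`, which is an assumption that suits the `Song` example above but not numeric, date, or autocomplete fields.

```ruby
# Hypothetical helper: builds a dynamic: false definition from a hash of
# field name => per-field options.
def explicit_search_definition(fields)
  {
    mappings: {
      dynamic: false,
      fields: fields.to_h { |name, opts| [name.to_sym, { type: "string" }.merge(opts)] },
    },
  }
end

definition = explicit_search_definition(
  {
    title: { analyzer: "lucene.standard" },
    artist: { analyzer: "lucene.standard" },
    tags: {},
  },
)
```

The result has the same shape as the definition passed to `create_index` in the earlier example, so only the enumerated fields incur search-side update cost.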

### What lives where

| Concern | Path |
|---|---|
| `mongo_index :foo` declarations + migrator | `Parse::Core::Indexing`, `Parse::Schema::IndexMigrator` |
| `mongo_search_index "name", { mappings: { … } }` declarations + migrator | `Parse::Core::SearchIndexing`, `Parse::Schema::SearchIndexMigrator` |
| `Parse::MongoDB.create_index` / `drop_index` (regular indexes) | `lib/parse/mongodb.rb` |
| `Parse::MongoDB.create_search_index` / `drop_search_index` / `update_search_index` (Atlas) | `lib/parse/mongodb.rb` |
| `Parse::AtlasSearch::IndexManager.create_index` / `drop_index` / `update_index` (cache-invalidating wrappers) | `lib/parse/atlas_search/index_manager.rb` |
| `rake parse:mongo:search_indexes:plan` / `:apply` | `lib/parse/stack/tasks.rb` |
| Search query execution | `Parse::AtlasSearch.search` / `.autocomplete` / `.faceted_search` |

---

## When NOT to add an index

- **Low-cardinality columns.** Indexing a boolean `is_active` is
  almost never useful: the index points to ~half the collection.
  Better to filter with another index that already narrows the set.
- **Write-only / append-only collections.** Audit logs, event
  streams, telemetry data. Reads are rare; indexes pay the write
  cost without recouping it.
- **Columns you only access in `$lookup` from one side.** The
  foreign-side join column needs an index (the side you're looking
  INTO), but the local side doesn't need a duplicate.
- **Columns Parse Server already manages.** Don't shadow Parse's
  auto-managed indexes on `_User.username`, `_User.email`, etc.
  Parse maintains them; the migrator excludes them from drift
  analysis but won't stop you from creating a competing one.
- **As a "just in case".** Empty `ops` after a few weeks of
  production traffic is your answer.
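
Two of the checks in this list are easy to script. The sketch below assumes that `ops` refers to the `accesses.ops` counter MongoDB's `$indexStats` stage reports per index; the sample data is hand-written for illustration, not real output, and both helpers are hypothetical.

```ruby
# Hypothetical helper: names of indexes with zero recorded accesses,
# skipping indexes you never intend to drop (such as _id_).
def unused_indexes(index_stats, keep: ["_id_"])
  index_stats
    .select { |stat| stat.dig("accesses", "ops").to_i.zero? }
    .map { |stat| stat["name"] }
    .reject { |name| keep.include?(name) }
end

# Hypothetical helper: rough selectivity of a field, as distinct values per
# document. Values near 0 indicate a low-cardinality field.
def selectivity(distinct_values, total_docs)
  return 0.0 if total_docs.zero?

  distinct_values.to_f / total_docs
end

# Illustrative sample in the shape of $indexStats documents.
sample_stats = [
  { "name" => "_id_", "accesses" => { "ops" => 0 } },
  { "name" => "artist_1", "accesses" => { "ops" => 18_204 } },
  { "name" => "is_active_1", "accesses" => { "ops" => 0 } },
]

candidates = unused_indexes(sample_stats)
bool_selectivity = selectivity(2, 1_000_000)
```

Bear in mind that `$indexStats` counters reset when the server restarts and are tracked per node, so collect them over a representative window before dropping anything.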

---

## See also

- [mongodb_direct_guide.md](./mongodb_direct_guide.md): the full
  direct-Mongo / index-management API reference (DSL spelling,
  writer URI, triple-gate, rake tasks)
- [SECURITY_GUIDE.md](../SECURITY_GUIDE.md): security posture around
  the writer URI, role validation, audit trail
- MongoDB official: <https://www.mongodb.com/docs/manual/indexes/>
- Parse Server source for auto-managed indexes:
  <https://github.com/parse-community/parse-server/blob/master/src/Adapters/Storage/Mongo/MongoStorageAdapter.js>