@gagik.co/snippet-agent 0.1.0

Files changed (61)
  1. package/.eslintrc.js +13 -0
  2. package/.prettierrc.json +1 -0
  3. package/README.md +23 -0
  4. package/dist/agent-class.d.ts +47 -0
  5. package/dist/agent-class.js +314 -0
  6. package/dist/agent.d.ts +1 -0
  7. package/dist/agent.js +392 -0
  8. package/dist/banner.d.ts +1 -0
  9. package/dist/banner.js +23 -0
  10. package/dist/confirmation-extension.d.ts +10 -0
  11. package/dist/confirmation-extension.js +213 -0
  12. package/dist/index.d.ts +3 -0
  13. package/dist/index.js +141 -0
  14. package/dist/mongosh-interactive-mode.d.ts +33 -0
  15. package/dist/mongosh-interactive-mode.js +244 -0
  16. package/dist/project-agent.d.ts +1 -0
  17. package/dist/project-agent.js +36 -0
  18. package/dist/shell-context.d.ts +17 -0
  19. package/dist/shell-context.js +75 -0
  20. package/dist/skills-loader.d.ts +2 -0
  21. package/dist/skills-loader.js +69 -0
  22. package/dist/src/index.d.ts +1 -0
  23. package/dist/src/index.js +8 -0
  24. package/dist/src/project-agent.d.ts +1 -0
  25. package/dist/src/project-agent.js +36 -0
  26. package/dist/stdout-patcher.d.ts +5 -0
  27. package/dist/stdout-patcher.js +41 -0
  28. package/dist/tools/index.d.ts +4 -0
  29. package/dist/tools/index.js +7 -0
  30. package/dist/tools/mongosh-eval.d.ts +7 -0
  31. package/dist/tools/mongosh-eval.js +84 -0
  32. package/dist/tools/search-docs.d.ts +2 -0
  33. package/dist/tools/search-docs.js +106 -0
  34. package/dist/tools/types.d.ts +12 -0
  35. package/dist/tools/types.js +2 -0
  36. package/dist/tools.d.ts +7 -0
  37. package/dist/tools.js +189 -0
  38. package/dist/types.d.ts +21 -0
  39. package/dist/types.js +2 -0
  40. package/package.json +38 -0
  41. package/skills/mongodb-connection.md +208 -0
  42. package/skills/mongodb-natural-language-querying.md +202 -0
  43. package/skills/mongodb-query-optimizer.md +265 -0
  44. package/skills/mongodb-schema-design.md +455 -0
  45. package/skills/mongodb-search-and-ai.md +357 -0
  46. package/skills/mongosh-shell.md +227 -0
  47. package/src/agent-class.ts +393 -0
  48. package/src/banner.ts +36 -0
  49. package/src/confirmation-extension.ts +297 -0
  50. package/src/index.ts +137 -0
  51. package/src/mongosh-interactive-mode.ts +420 -0
  52. package/src/shell-context.ts +97 -0
  53. package/src/skills-loader.ts +37 -0
  54. package/src/stdout-patcher.ts +48 -0
  55. package/src/tools/index.ts +4 -0
  56. package/src/tools/mongosh-eval.ts +115 -0
  57. package/src/tools/search-docs.ts +115 -0
  58. package/src/tools/types.ts +15 -0
  59. package/src/types.ts +23 -0
  60. package/tsconfig-lint.json +4 -0
  61. package/tsconfig.json +20 -0
@@ -0,0 +1,455 @@
1
+ ---
2
+ name: mongodb-schema-design
3
+ description: MongoDB schema design patterns and anti-patterns. Use when designing data models, reviewing schemas, migrating from SQL, or troubleshooting performance issues caused by schema problems. Triggers on "design schema", "embed vs reference", "MongoDB data model", "schema review", "unbounded arrays", "one-to-many", "tree structure", "16MB limit", "schema validation", "JSON Schema", "time series", "schema migration", "polymorphic", "TTL", "data lifecycle", "archive", "index explosion", "unnecessary indexes", "approximation pattern", "document versioning".
4
+ disable-model-invocation: false
5
+ ---
6
+
7
+ # MongoDB Schema Design
8
+
9
+ Data modeling patterns and anti-patterns for MongoDB. A poor schema is a common root cause of MongoDB performance and cost issues; queries and indexes cannot fix a fundamentally wrong model.
10
+
11
+ ## When to Apply
12
+
13
+ Reference these guidelines when:
14
+ - Designing a new MongoDB schema from scratch
15
+ - Migrating from SQL/relational databases to MongoDB
16
+ - Reviewing existing data models for performance issues
17
+ - Troubleshooting slow queries or growing document sizes
18
+ - Deciding between embedding and referencing
19
+ - Modeling relationships (one-to-one, one-to-many, many-to-many)
20
+ - Implementing tree/hierarchical structures
21
+ - Hitting the 16MB document limit
22
+ - Adding schema validation to existing collections
23
+
24
+ ## Key Principle
25
+
26
+ > **"Data that is accessed together should be stored together."**
27
+
28
+ This is MongoDB's core philosophy. Embedding related data eliminates joins, reduces round trips, and enables atomic updates. Reference only when you must.
29
+
30
+ ## Schema Anti-Patterns
31
+
32
+ ### 1. Unnecessary Collections
33
+
34
+ **Problem:** Splitting homogeneous data into multiple collections by type, date, or tenant.
35
+
36
+ **Why it's bad:**
37
+ - More complex application logic
38
+ - Harder to query across collections
39
+ - Schema changes must be applied to multiple places
40
+
41
+ **Better approach:**
42
+ - Use a single collection with discriminator fields
43
+ - Use time-series collections for time-based data
44
+ - Use sharding for tenant isolation if needed
45
+
46
+ ```javascript
47
+ // BAD: Separate collections per year
48
+ db.orders_2023, db.orders_2024
49
+
50
+ // GOOD: Single collection with date field
51
+ db.orders.createIndex({ createdAt: 1 })
52
+
53
+ // Query with date range
54
+ db.orders.find({
55
+ createdAt: {
56
+ $gte: ISODate("2024-01-01"),
57
+ $lt: ISODate("2025-01-01")
58
+ }
59
+ })
60
+ ```
61
+
62
+ ### 2. Excessive Lookups
63
+
64
+ **Problem:** Overly normalized collections that reference each other, requiring frequent `$lookup` operations.
65
+
66
+ **Why it's bad:**
67
+ - `$lookup` is expensive, especially on sharded clusters where the join can span shards
68
+ - Increases query complexity and latency
69
+ - Defeats the purpose of document model
70
+
71
+ **Better approach:**
72
+ - Embed data that is accessed together
73
+ - Use extended reference pattern to cache frequently accessed fields
74
+
75
+ ```javascript
76
+ // BAD: Normalized like SQL
77
+ db.orders.find().forEach(order => {
78
+ const customer = db.customers.findOne({ _id: order.customerId });
79
+ // Combine data in application
80
+ })
81
+
82
+ // GOOD: Embedded customer info
83
+ db.orders.find().forEach(order => {
84
+ // customer name and email already embedded
85
+ print(order.customer.name);
86
+ })
87
+
88
+ // Extended reference pattern (balance)
89
+ db.orders.find().forEach(order => {
90
+ // Essential customer info embedded
91
+ print(order.customer.name);
92
+ // Full customer data fetched only when needed
93
+ if (needFullProfile) {
94
+ const customer = db.customers.findOne({ _id: order.customerId });
95
+ }
96
+ })
97
+ ```
98
+
99
+ ### 3. Unnecessary Indexes
100
+
101
+ **Problem:** Creating indexes that overlap or are never used by queries.
102
+
103
+ **Why it's bad:**
104
+ - Slows down writes (each index must be updated)
105
+ - Consumes RAM and storage
106
+ - Can cause write performance issues
107
+
108
+ **Verification via mongosh_eval:**
109
+ ```javascript
110
+ // Check index usage
111
+ db.collection.aggregate([{ $indexStats: {} }])
112
+
113
+ // Check for overlapping indexes
114
+ // If you have {a:1,b:1}, you don't need {a:1}
115
+ ```
116
+
117
+ **Better approach:**
118
+ - Drop unused indexes (but verify first)
119
+ - Design compound indexes to cover multiple queries
120
+ - Use partial indexes for specific query patterns
121
+
122
+ ```javascript
123
+ // Remove unused index (after verifying with $indexStats)
124
+ db.collection.dropIndex("unused_index_name")
125
+ ```
126
+
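The overlap rule above (an index on `{a:1,b:1}` makes `{a:1}` unnecessary) can be checked mechanically. The following is a minimal sketch, not part of the package: `isPrefixOf`, `findRedundantIndexes`, and `sampleIndexes` are hypothetical names, and the input is assumed to have the shape returned by `db.collection.getIndexes()`. It only compares key patterns, so unique, sparse, partial, or TTL indexes still need a manual review before dropping.

```javascript
// Returns true when key pattern `shorter` is a strict left prefix of `longer`
// (same fields, same order, same direction). This is a conservative check.
function isPrefixOf(shorter, longer) {
  const a = Object.entries(shorter);
  const b = Object.entries(longer);
  if (a.length >= b.length) return false;
  return a.every(([field, dir], i) => b[i][0] === field && b[i][1] === dir);
}

// Lists the names of indexes whose key pattern is a prefix of another index.
function findRedundantIndexes(indexes) {
  return indexes
    .filter((idx) => idx.name !== "_id_") // the _id index can never be dropped
    .filter((idx) => indexes.some((other) => isPrefixOf(idx.key, other.key)))
    .map((idx) => idx.name);
}

// Hypothetical input, shaped like the output of db.collection.getIndexes()
const sampleIndexes = [
  { name: "_id_", key: { _id: 1 } },
  { name: "a_1", key: { a: 1 } },
  { name: "a_1_b_1", key: { a: 1, b: 1 } },
  { name: "b_1", key: { b: 1 } },
];

console.log(findRedundantIndexes(sampleIndexes)); // [ 'a_1' ]
```

In mongosh, the same function could be fed with `db.collection.getIndexes()` and the result cross-checked against `$indexStats` before any index is dropped.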
127
+ ## Schema Fundamentals
128
+
129
+ ### Embed vs Reference Decision Framework
130
+
131
+ | Relationship | Cardinality | Access Pattern | Recommendation |
132
+ |-------------|-------------|----------------|----------------|
133
+ | One-to-One | 1:1 | Always together | **Embed** |
134
+ | One-to-Few | 1:N (N < 100) | Usually together | **Embed array** |
135
+ | One-to-Many | 1:N (N > 100) | Often separate | **Reference** |
136
+ | Many-to-Many | M:N | Varies | **Two-way reference** |
137
+
138
+ This is a **rough** guideline. Always verify with your actual workload.
139
+
140
+ **Embed when:**
141
+ - Data is always accessed together (1:1, 1:few)
142
+ - Atomic updates are needed across the data
143
+ - Arrays are bounded and small
144
+
145
+ **Reference when:**
146
+ - Data is accessed independently
147
+ - Relationships are many-to-many
148
+ - Arrays can grow without bound
149
+
150
+ ```javascript
151
+ // Embed: User with few addresses
152
+ db.users.insertOne({
153
+ name: "John",
154
+ addresses: [
155
+ { street: "123 Main St", city: "NYC", primary: true },
156
+ { street: "456 Oak Ave", city: "LA", primary: false }
157
+ ]
158
+ })
159
+
160
+ // Reference: User with many (unbounded) orders
161
+ const user = db.users.insertOne({
162
+ name: "John",
163
+ // Don't embed orders here - there can be thousands
164
+ })
165
+ // Store orders in a separate collection with a userId reference
166
+ db.orders.insertOne({
167
+ userId: user.insertedId, // insertOne returns { acknowledged, insertedId }
168
+ items: [...],
169
+ total: 100.00
170
+ })
171
+ ```
172
+
173
+ ### Document Size Management
174
+
175
+ **The 16MB BSON document size limit is a hard limit.** Common causes of oversized documents:
176
+ - Unbounded arrays
177
+ - Large embedded binaries
178
+ - Deeply nested objects
179
+
180
+ **Verification via mongosh_eval:**
181
+ ```javascript
182
+ // Check document sizes in collection
183
+ db.collection.aggregate([
184
+ { $project: { size: { $bsonSize: "$$ROOT" } } },
185
+ { $sort: { size: -1 } },
186
+ { $limit: 10 }
187
+ ])
188
+
189
+ // Check for large arrays
190
+ db.collection.find({
191
+ $expr: { $gt: [{ $size: { $ifNull: ["$arrayField", []] } }, 100] } // $ifNull: $size errors on a missing field
192
+ })
193
+ ```
194
+
195
+ **Mitigation strategies:**
196
+ - Move unbounded data to separate collections
197
+ - Use bucketing pattern for time-series data
198
+ - Use referencing for large subdocuments
199
+
200
+ ### Schema Validation
201
+
202
+ Use MongoDB's built-in `$jsonSchema` validator to catch invalid data at the database level.
203
+
204
+ ```javascript
205
+ // Add validation to an existing collection (createCollection fails if "users" already exists)
206
+ db.runCommand({ collMod: "users",
207
+ validator: {
208
+ $jsonSchema: {
209
+ bsonType: "object",
210
+ required: ["name", "email"],
211
+ properties: {
212
+ name: {
213
+ bsonType: "string",
214
+ description: "must be a string and is required"
215
+ },
216
+ email: {
217
+ bsonType: "string",
218
+ pattern: "^.+@.+$",
219
+ description: "must be a valid email"
220
+ },
221
+ age: {
222
+ bsonType: "int",
223
+ minimum: 0,
224
+ maximum: 150
225
+ }
226
+ }
227
+ }
228
+ },
229
+ validationLevel: "moderate", // Allow existing docs, validate new/modified
230
+ validationAction: "warn" // Log violations without rejecting
231
+ })
232
+
233
+ // Later tighten to strict/error
234
+ ```
235
+
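The "later tighten to strict/error" step can be expressed as a `collMod` command. This is a sketch under the assumption that the collection is the `users` collection from the example and that existing documents have already been brought into line with the schema; `tightenValidation` is a hypothetical variable name. The guard lets the snippet run outside mongosh, where it only builds the command document.

```javascript
// Command document that switches an existing validator to strict enforcement.
const tightenValidation = {
  collMod: "users",
  validationLevel: "strict", // validate all inserts and updates
  validationAction: "error", // reject documents that fail validation
};

// In mongosh, `db` is defined and the command is sent to the server.
// Elsewhere (for example plain Node.js) nothing is executed.
if (typeof db !== "undefined") {
  printjson(db.runCommand(tightenValidation));
}
```

Because this changes how writes are accepted, it falls under the approval rules in the Action Policy section.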
236
+ ## Design Patterns
237
+
238
+ ### 1. Approximation Pattern
239
+
240
+ Use approximate values for high-frequency counters instead of exact counts.
241
+
242
+ ```javascript
243
+ // Instead of incrementing on every view (expensive)
244
+ db.articles.updateOne(
245
+ { _id: articleId },
246
+ { $inc: { views: 1 } }
247
+ )
248
+
249
+ // Use approximation: write on roughly 1 in 10 views, adding 10 each time
250
+ if (Math.random() < 0.1) db.articles.updateOne(
251
+ { _id: articleId },
252
+ { $inc: { views: 10 } }
253
+ )
254
+ ```
255
+
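One common way to realize the approximation pattern is sampling: write on roughly one in N events and add N each time, so the expected counter value matches the true count while the number of writes drops by a factor of about N. The sketch below simulates this in plain JavaScript; `approximateIncrement` and the 1-in-10 rate are illustrative assumptions, and the database call is shown only as a comment.

```javascript
// Decide whether this event should be written, and by how much.
// `rand` is injectable so the behaviour can be checked deterministically.
function approximateIncrement(sampleRate, rand = Math.random) {
  // Write on roughly 1 in `sampleRate` events, adding `sampleRate` each time.
  return rand() < 1 / sampleRate ? sampleRate : 0;
}

// Simulate 100,000 views with a 1-in-10 sample rate.
let writes = 0;
let approxViews = 0;
for (let i = 0; i < 100000; i++) {
  const inc = approximateIncrement(10);
  if (inc > 0) {
    // In mongosh: db.articles.updateOne({ _id: articleId }, { $inc: { views: inc } })
    writes += 1;
    approxViews += inc;
  }
}

// `writes` should be close to 10,000 and `approxViews` close to 100,000.
console.log({ writes, approxViews });
```

The trade-off is precision: the stored value is an estimate, which is usually acceptable for view counts but not for balances or inventory.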
256
+ ### 2. Bucket Pattern
257
+
258
+ Group time-series or IoT data into buckets to reduce document count and improve query efficiency.
259
+
260
+ ```javascript
261
+ // Bucket by hour
262
+ db.sensorData.insertOne({
263
+ sensorId: "sensor123",
264
+ date: ISODate("2024-01-01"),
265
+ hour: 14,
266
+ measurements: [
267
+ { minute: 0, value: 23.5 },
268
+ { minute: 1, value: 23.6 },
269
+ // ... up to 60 minutes
270
+ ],
271
+ avgValue: 23.55,
272
+ minValue: 23.1,
273
+ maxValue: 24.2
274
+ })
275
+ ```
276
+
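To show how raw readings could be turned into bucket documents like the one above, here is a small sketch in plain JavaScript. The reading shape `{ sensorId, ts, value }`, the `hourStart` field, and the function name `bucketByHour` are assumptions made for illustration; the source example stores `date` and `hour` separately instead. On MongoDB 5.0 and later, native time series collections perform this kind of bucketing automatically and may be the simpler option.

```javascript
// Group raw readings into one document per sensor per hour (UTC).
// Each reading is assumed to look like { sensorId, ts: Date, value: Number }.
function bucketByHour(readings) {
  const buckets = new Map();
  for (const { sensorId, ts, value } of readings) {
    const hourStart = new Date(ts);
    hourStart.setUTCMinutes(0, 0, 0); // truncate to the start of the hour
    const key = `${sensorId}|${hourStart.toISOString()}`;
    let bucket = buckets.get(key);
    if (!bucket) {
      bucket = { sensorId, hourStart, measurements: [], minValue: value, maxValue: value, sum: 0 };
      buckets.set(key, bucket);
    }
    bucket.measurements.push({ minute: ts.getUTCMinutes(), value });
    bucket.minValue = Math.min(bucket.minValue, value);
    bucket.maxValue = Math.max(bucket.maxValue, value);
    bucket.sum += value;
  }
  // Replace the running sum with the pre-computed average.
  return [...buckets.values()].map(({ sum, ...doc }) => ({
    ...doc,
    avgValue: sum / doc.measurements.length,
  }));
}

const bucketDocs = bucketByHour([
  { sensorId: "sensor123", ts: new Date("2024-01-01T14:00:00Z"), value: 23.5 },
  { sensorId: "sensor123", ts: new Date("2024-01-01T14:01:00Z"), value: 24.5 },
  { sensorId: "sensor123", ts: new Date("2024-01-01T15:00:00Z"), value: 22.0 },
]);
console.log(bucketDocs.length); // 2
```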
277
+ ### 3. Computed Pattern
278
+
279
+ Pre-calculate expensive aggregations and store them.
280
+
281
+ ```javascript
282
+ // Store computed totals with order
283
+ db.orders.insertOne({
284
+ items: [
285
+ { product: "A", qty: 2, price: 10 },
286
+ { product: "B", qty: 1, price: 20 }
287
+ ],
288
+ computed: {
289
+ subtotal: 40,
290
+ tax: 4,
291
+ total: 44
292
+ }
293
+ })
294
+ ```
295
+
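The computed values are typically derived once at write time by the application. A minimal sketch, assuming a flat 10% tax rate (inferred from the numbers in the example, not stated in the source) and a hypothetical helper named `withComputedTotals`:

```javascript
// Returns a copy of the order with a `computed` sub-document attached.
function withComputedTotals(order, taxRate) {
  const subtotal = order.items.reduce((sum, item) => sum + item.qty * item.price, 0);
  const tax = Math.round(subtotal * taxRate * 100) / 100; // round to cents
  return { ...order, computed: { subtotal, tax, total: subtotal + tax } };
}

const orderWithTotals = withComputedTotals(
  {
    items: [
      { product: "A", qty: 2, price: 10 },
      { product: "B", qty: 1, price: 20 },
    ],
  },
  0.1 // assumed tax rate
);

console.log(orderWithTotals.computed); // { subtotal: 40, tax: 4, total: 44 }
```

The stored totals must be recomputed whenever `items` changes, otherwise reads return stale values.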
296
+ ### 4. Extended Reference Pattern
297
+
298
+ Cache frequently-accessed data from related entities.
299
+
300
+ ```javascript
301
+ // Order embeds essential customer info
302
+ db.orders.insertOne({
303
+ customer: {
304
+ _id: customerId,
305
+ name: "John Doe",
306
+ email: "john@example.com" // Frequently needed
307
+ },
308
+ customerId: customerId // Reference for full profile if needed
309
+ // ... order details
310
+ })
311
+ ```
312
+
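As a sketch of how the embedded subset might be built, the helper below copies a fixed list of fields from the full customer document into each order. The names `CUSTOMER_REF_FIELDS`, `toCustomerRef`, and `buildOrder`, as well as the extra customer fields, are hypothetical.

```javascript
// Fields copied into each order; ideally ones that are read often and change rarely.
const CUSTOMER_REF_FIELDS = ["_id", "name", "email"];

// Picks only the cached fields from the full customer document.
function toCustomerRef(customer) {
  return Object.fromEntries(CUSTOMER_REF_FIELDS.map((field) => [field, customer[field]]));
}

function buildOrder(customer, items) {
  return { customer: toCustomerRef(customer), items };
}

// Hypothetical full customer document
const fullCustomer = {
  _id: "c1",
  name: "John Doe",
  email: "john@example.com",
  address: "123 Main St",
  loyaltyPoints: 420,
};

const orderDoc = buildOrder(fullCustomer, [{ product: "A", qty: 2, price: 10 }]);
console.log(orderDoc.customer); // { _id: 'c1', name: 'John Doe', email: 'john@example.com' }
```

The cost of this pattern is duplication: when a cached field changes, the copies need an update such as `db.orders.updateMany({ "customer._id": id }, { $set: { "customer.email": newEmail } })`, or the application must tolerate stale values in historical orders.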
313
+ ### 5. Outlier Pattern
314
+
315
+ Handle collections where a small subset of documents is much larger than the rest.
316
+
317
+ ```javascript
318
+ // Most products have few reviews
319
+ db.products.insertOne({
320
+ name: "Widget",
321
+ reviews: [review1, review2] // Embedded for most
322
+ })
323
+
324
+ // Popular product with thousands of reviews
325
+ db.products.insertOne({
326
+ name: "iPhone",
327
+ reviewCount: 50000,
328
+ recentReviews: [review1, review2, review3], // Last 3 only
329
+ hasExtraReviews: true // Remaining reviews live in a shared reviews collection, keyed by product _id
330
+ })
331
+ ```
332
+
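The split between typical and outlier documents can be sketched as a pure function. Everything here is illustrative: the threshold `MAX_EMBEDDED_REVIEWS`, the `hasExtraReviews` flag, and the `splitReviews` helper are assumptions, and the overflow documents are meant for a single shared collection keyed by `productId` rather than one collection per product.

```javascript
// Hypothetical threshold: how many reviews stay embedded in the product document.
const MAX_EMBEDDED_REVIEWS = 3;

// Splits a product's reviews into the embedded part and the overflow part.
// Reviews are assumed to be ordered oldest to newest.
function splitReviews(product, reviews) {
  if (reviews.length <= MAX_EMBEDDED_REVIEWS) {
    return { productDoc: { ...product, reviews }, overflowDocs: [] };
  }
  const cut = reviews.length - MAX_EMBEDDED_REVIEWS;
  return {
    productDoc: {
      ...product,
      reviewCount: reviews.length,
      recentReviews: reviews.slice(cut), // newest N stay embedded
      hasExtraReviews: true, // tells the application to query the overflow collection
    },
    overflowDocs: reviews.slice(0, cut).map((review) => ({ productId: product._id, ...review })),
  };
}

const { productDoc, overflowDocs } = splitReviews(
  { _id: "p1", name: "Widget" },
  [{ text: "r1" }, { text: "r2" }, { text: "r3" }, { text: "r4" }, { text: "r5" }]
);
console.log(productDoc.recentReviews.length, overflowDocs.length); // 3 2
```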
333
+ ### 6. Polymorphic Pattern
334
+
335
+ Store different types of entities in the same collection.
336
+
337
+ ```javascript
338
+ // Different product types in same collection
339
+ db.products.insertMany([
340
+ {
341
+ type: "electronics",
342
+ name: "Laptop",
343
+ specs: { cpu: "i7", ram: "16GB" }
344
+ },
345
+ {
346
+ type: "clothing",
347
+ name: "T-Shirt",
348
+ specs: { size: "M", color: "blue" }
349
+ }
350
+ ])
351
+
352
+ // Query by type
353
+ db.products.find({ type: "electronics" })
354
+ ```
355
+
356
+ ### 7. Schema Versioning
357
+
358
+ Handle schema evolution gracefully.
359
+
360
+ ```javascript
361
+ // Documents include schema version
362
+ db.users.insertOne({
363
+ schemaVersion: 2,
364
+ name: "John",
365
+ email: "john@example.com",
366
+ // New field added in v2
367
+ preferences: { theme: "dark" }
368
+ })
369
+
370
+ // Application handles migration
371
+ if (doc.schemaVersion === 1) {
372
+ // Migrate to v2
373
+ doc.preferences = { theme: "light" };
374
+ doc.schemaVersion = 2;
375
+ }
376
+ ```
377
+
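When more than two versions exist, the inline `if` can be generalized into a chain of per-version steps. A minimal sketch, assuming that documents without a `schemaVersion` field are version 1; the `migrations` table, `CURRENT_VERSION`, and `upgrade` are hypothetical names.

```javascript
// One step per version: each takes a document at version N and returns version N + 1.
const migrations = {
  1: (doc) => ({ ...doc, preferences: { theme: "light" }, schemaVersion: 2 }),
};
const CURRENT_VERSION = 2;

// Applies migration steps until the document reaches the current version.
function upgrade(doc) {
  // Assumption: a missing schemaVersion means the document predates versioning.
  let current = { ...doc, schemaVersion: doc.schemaVersion ?? 1 };
  while (current.schemaVersion < CURRENT_VERSION) {
    current = migrations[current.schemaVersion](current);
  }
  return current;
}

const upgraded = upgrade({ schemaVersion: 1, name: "John", email: "john@example.com" });
console.log(upgraded.schemaVersion, upgraded.preferences.theme); // 2 light
```

Documents can be upgraded lazily on read and written back, or in a background batch; either way, any write-back is subject to the approval rules in the Action Policy section.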
378
+ ### 8. Document Versioning
379
+
380
+ Track document changes for audit trails.
381
+
382
+ ```javascript
383
+ db.articles.insertOne({
384
+ title: "My Article",
385
+ content: "Current content",
386
+ version: 5,
387
+ history: [
388
+ { version: 1, content: "Original", modifiedAt: ISODate(...) },
389
+ { version: 2, content: "First edit", modifiedAt: ISODate(...) },
390
+ // ... keep last N versions
391
+ ]
392
+ })
393
+ ```
394
+
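The "keep last N versions" comment can be sketched as follows. The cap `MAX_HISTORY`, the top-level `modifiedAt` field, and the `applyEdit` helper are assumptions added for illustration.

```javascript
// Hypothetical cap on how many previous versions are kept inline.
const MAX_HISTORY = 2;

// Returns a new article with the previous state pushed onto a capped history.
function applyEdit(article, newContent, now = new Date()) {
  const history = [
    ...article.history,
    { version: article.version, content: article.content, modifiedAt: article.modifiedAt },
  ].slice(-MAX_HISTORY); // keep only the newest MAX_HISTORY entries
  return { ...article, content: newContent, version: article.version + 1, modifiedAt: now, history };
}

let article = {
  title: "My Article",
  content: "Original",
  version: 1,
  modifiedAt: new Date("2024-01-01"),
  history: [],
};
for (const text of ["First edit", "Second edit", "Third edit"]) {
  article = applyEdit(article, text);
}
console.log(article.version, article.history.map((h) => h.version)); // 4 [ 2, 3 ]
```

On the server side, the same cap can be enforced in a single update with `$push` combined with `$each` and `$slice: -N`, which avoids reading the document first.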
395
+ ### 9. Archive Pattern
396
+
397
+ Move historical data to separate/cold storage.
398
+
399
+ ```javascript
400
+ // Active orders in hot collection
401
+ db.orders.insertOne({
402
+ _id: orderId,
403
+ status: "pending",
404
+ createdAt: ISODate("2024-01-01")
405
+ })
406
+
407
+ // Archive completed old orders (via scheduled job)
408
+ db.orders.find({
409
+ status: "completed",
410
+ createdAt: { $lt: ISODate("2023-01-01") }
411
+ }).forEach(doc => {
412
+ db.ordersArchive.insertOne(doc);
413
+ db.orders.deleteOne({ _id: doc._id });
414
+ })
415
+ ```
416
+
417
+ ## Verification Commands
418
+
419
+ Use these mongosh_eval commands to analyze your schema:
420
+
421
+ ```javascript
422
+ // Check document sizes
423
+ db.collection.aggregate([
424
+ { $project: { size: { $bsonSize: "$$ROOT" } } },
425
+ { $group: { _id: null, avgSize: { $avg: "$size" }, maxSize: { $max: "$size" } } }
426
+ ])
427
+
428
+ // Find documents with large arrays
429
+ db.collection.find({
430
+ $expr: { $gt: [{ $size: { $ifNull: ["$arrayField", []] } }, 100] } // $ifNull: $size errors on a missing field
431
+ }).limit(5)
432
+
433
+ // Check for field consistency across documents
434
+ db.collection.aggregate([
435
+ { $project: { fields: { $objectToArray: "$$ROOT" } } },
436
+ { $unwind: "$fields" },
437
+ { $group: { _id: "$fields.k", count: { $sum: 1 } } },
438
+ { $sort: { count: -1 } }
439
+ ])
440
+
441
+ // Collection stats
442
+ db.collection.stats()
443
+ ```
444
+
445
+ ## Action Policy
446
+
447
+ **I will NEVER execute write operations without your explicit approval.**
448
+
449
+ Before any schema change:
450
+ 1. I'll explain **what** I want to do and **why**
451
+ 2. I'll show you the **exact command**
452
+ 3. I'll **wait for your approval** before executing
453
+ 4. If you say "go ahead" or "yes", only then will I run it
454
+
455
+ **Your database, your decision.**