@gagik.co/snippet-agent 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (61)
  1. package/.eslintrc.js +13 -0
  2. package/.prettierrc.json +1 -0
  3. package/README.md +23 -0
  4. package/dist/agent-class.d.ts +47 -0
  5. package/dist/agent-class.js +314 -0
  6. package/dist/agent.d.ts +1 -0
  7. package/dist/agent.js +392 -0
  8. package/dist/banner.d.ts +1 -0
  9. package/dist/banner.js +23 -0
  10. package/dist/confirmation-extension.d.ts +10 -0
  11. package/dist/confirmation-extension.js +213 -0
  12. package/dist/index.d.ts +3 -0
  13. package/dist/index.js +141 -0
  14. package/dist/mongosh-interactive-mode.d.ts +33 -0
  15. package/dist/mongosh-interactive-mode.js +244 -0
  16. package/dist/project-agent.d.ts +1 -0
  17. package/dist/project-agent.js +36 -0
  18. package/dist/shell-context.d.ts +17 -0
  19. package/dist/shell-context.js +75 -0
  20. package/dist/skills-loader.d.ts +2 -0
  21. package/dist/skills-loader.js +69 -0
  22. package/dist/src/index.d.ts +1 -0
  23. package/dist/src/index.js +8 -0
  24. package/dist/src/project-agent.d.ts +1 -0
  25. package/dist/src/project-agent.js +36 -0
  26. package/dist/stdout-patcher.d.ts +5 -0
  27. package/dist/stdout-patcher.js +41 -0
  28. package/dist/tools/index.d.ts +4 -0
  29. package/dist/tools/index.js +7 -0
  30. package/dist/tools/mongosh-eval.d.ts +7 -0
  31. package/dist/tools/mongosh-eval.js +84 -0
  32. package/dist/tools/search-docs.d.ts +2 -0
  33. package/dist/tools/search-docs.js +106 -0
  34. package/dist/tools/types.d.ts +12 -0
  35. package/dist/tools/types.js +2 -0
  36. package/dist/tools.d.ts +7 -0
  37. package/dist/tools.js +189 -0
  38. package/dist/types.d.ts +21 -0
  39. package/dist/types.js +2 -0
  40. package/package.json +38 -0
  41. package/skills/mongodb-connection.md +208 -0
  42. package/skills/mongodb-natural-language-querying.md +202 -0
  43. package/skills/mongodb-query-optimizer.md +265 -0
  44. package/skills/mongodb-schema-design.md +455 -0
  45. package/skills/mongodb-search-and-ai.md +357 -0
  46. package/skills/mongosh-shell.md +227 -0
  47. package/src/agent-class.ts +393 -0
  48. package/src/banner.ts +36 -0
  49. package/src/confirmation-extension.ts +297 -0
  50. package/src/index.ts +137 -0
  51. package/src/mongosh-interactive-mode.ts +420 -0
  52. package/src/shell-context.ts +97 -0
  53. package/src/skills-loader.ts +37 -0
  54. package/src/stdout-patcher.ts +48 -0
  55. package/src/tools/index.ts +4 -0
  56. package/src/tools/mongosh-eval.ts +115 -0
  57. package/src/tools/search-docs.ts +115 -0
  58. package/src/tools/types.ts +15 -0
  59. package/src/types.ts +23 -0
  60. package/tsconfig-lint.json +4 -0
  61. package/tsconfig.json +20 -0
@@ -0,0 +1,208 @@
1
+ ---
2
+ name: mongodb-connection
3
+ description: Optimize MongoDB client connection configuration (pools, timeouts, patterns) for any supported driver language. Use this skill when working on, updating, or reviewing functions that instantiate or configure a MongoDB client (e.g., when calling `connect()`), configuring connection pools, troubleshooting connection errors (ECONNREFUSED, timeouts, pool exhaustion), or resolving performance issues related to connections.
4
+ disable-model-invocation: false
5
+ ---
6
+
7
+ # MongoDB Connection Optimizer
8
+
9
+ You are an expert in MongoDB connection management across all officially supported driver languages (Node.js, Python, Java, Go, C#, Ruby, PHP, etc.).
10
+
11
+ **Note:** This skill is for application/driver connection configuration, not for the current mongosh session. For the current mongosh connection, use `db.getMongo()` to inspect connection state.
12
+
13
+ ## Core Principle: Context Before Configuration
14
+
15
+ **NEVER add connection pool parameters or timeout settings without first understanding the application's context.** Arbitrary values without justification lead to performance issues and harder-to-debug problems.
16
+
17
+ ## Understanding Connection Pools
18
+
19
+ - Connection pooling exists because establishing a MongoDB connection is expensive (TCP + TLS + auth = 50-500ms)
20
+ - Open connections consume ~1 MB of RAM on the MongoDB server per connection
21
+ - Each MongoClient establishes 2 monitoring connections per replica set member
22
+
23
+ **Connection Lifecycle:** Borrow from pool → Execute operation → Return to pool → Prune idle connections exceeding `maxIdleTimeMS`.
24
+
25
+ **Formula:** `Total Connections = (minPoolSize + 2) × replica members × app instances`
26
+
27
+ Example: 10 instances, minPoolSize 5, 3-member set = 210 server connections.
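
The formula above can be sketched as a small helper. The function name `estimateServerConnections` and its argument names are hypothetical, chosen only for illustration; the constant 2 is the per-member monitoring connection count stated above.

```javascript
// Hypothetical helper (not part of any driver): applies the formula above.
function estimateServerConnections({ minPoolSize, replicaMembers, appInstances }) {
  const MONITORING_PER_MEMBER = 2; // monitoring connections per replica set member
  return (minPoolSize + MONITORING_PER_MEMBER) * replicaMembers * appInstances;
}

// The worked example: 10 instances, minPoolSize 5, 3-member replica set.
const total = estimateServerConnections({
  minPoolSize: 5,
  replicaMembers: 3,
  appInstances: 10,
});
console.log(total); // 210
```

Because the formula uses `minPoolSize`, the result is best read as an idle baseline; under load each pool may grow toward `maxPoolSize`.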
28
+
29
+ ## Configuration Design
30
+
31
+ **Before suggesting any configuration changes**, gather context about the application environment:
32
+
33
+ - Deployment type (serverless vs traditional server)
34
+ - Workload type (OLTP vs OLAP)
35
+ - Concurrency patterns (steady vs bursty)
36
+ - Server version and driver version
37
+ - Memory limits on application and database servers
38
+
39
+ ### Configuration Scenarios
40
+
41
+ #### Scenario: Serverless Environments (Lambda, Cloud Functions)
42
+
43
+ **Critical pattern**: Initialize client OUTSIDE handler/function scope to enable connection reuse across warm invocations.
44
+
45
+ **Recommended configuration**:
46
+
47
+ | Parameter | Value | Reasoning |
48
+ |-----------|-------|-----------|
49
+ | `maxPoolSize` | 3-5 | Each serverless function instance has its own pool |
50
+ | `minPoolSize` | 0 | Prevent maintaining unused connections |
51
+ | `maxIdleTimeMS` | 10000-30000 | Release unused connections quickly (10-30s) |
52
+ | `connectTimeoutMS` | 5000 | Fail fast on connection issues |
53
+ | `socketTimeoutMS` | 5000 | Use timeouts to ensure sockets are closed |
54
+
55
+ **Node.js Example:**
56
+ ```javascript
57
+ // OUTSIDE handler (reused across invocations)
58
+ const client = new MongoClient(uri, {
59
+ maxPoolSize: 3,
60
+ minPoolSize: 0,
61
+ maxIdleTimeMS: 10000,
62
+ connectTimeoutMS: 5000,
63
+ socketTimeoutMS: 5000
64
+ });
65
+
66
+ export const handler = async (event) => {
67
+ // Reuse existing connection
68
+ const db = client.db("mydb");
69
+ const result = await db.collection("items").find({}).toArray();
70
+ return result;
71
+ };
72
+ ```
73
+
74
+ #### Scenario: Traditional Long-Running Servers (OLTP Workload)
75
+
76
+ **Recommended configuration**:
77
+
78
+ | Parameter | Value | Reasoning |
79
+ |-----------|-------|-----------|
80
+ | `maxPoolSize` | 50-100 | Based on peak concurrent requests |
81
+ | `minPoolSize` | 10-20 | Pre-warmed connections for traffic spikes |
82
+ | `maxIdleTimeMS` | 300000-600000 | 5-10 minutes for stable servers |
83
+ | `connectTimeoutMS` | 5000-10000 | Fail fast on connection issues |
84
+ | `socketTimeoutMS` | 30000 | Prevent hanging queries |
85
+ | `serverSelectionTimeoutMS` | 5000 | Quick failover for replica set changes |
86
+
87
+ **Node.js Example:**
88
+ ```javascript
89
+ const client = new MongoClient(uri, {
90
+ maxPoolSize: 50,
91
+ minPoolSize: 10,
92
+ maxIdleTimeMS: 300000,
93
+ connectTimeoutMS: 5000,
94
+ socketTimeoutMS: 30000,
95
+ serverSelectionTimeoutMS: 5000
96
+ });
97
+ ```
98
+
99
+ #### Scenario: OLAP / Analytical Workloads
100
+
101
+ **Recommended configuration**:
102
+
103
+ | Parameter | Value | Reasoning |
104
+ |-----------|-------|-----------|
105
+ | `maxPoolSize` | 10-20 | Fewer concurrent operations |
106
+ | `minPoolSize` | 0-5 | Queries are infrequent |
107
+ | `socketTimeoutMS` | 300000+ | Allow long-running queries |
108
+ | `maxIdleTimeMS` | 600000 | Minimize connection churn |
109
+
110
+ **Node.js Example:**
111
+ ```javascript
112
+ const client = new MongoClient(uri, {
113
+ maxPoolSize: 15,
114
+ minPoolSize: 2,
115
+ socketTimeoutMS: 300000, // 5 minutes for slow queries
116
+ maxIdleTimeMS: 600000
117
+ });
118
+ ```
119
+
120
+ #### Scenario: High-Traffic / Bursty Workloads
121
+
122
+ **Recommended configuration**:
123
+
124
+ | Parameter | Value | Reasoning |
125
+ |-----------|-------|-----------|
126
+ | `maxPoolSize` | 100+ | Higher ceiling for traffic spikes |
127
+ | `minPoolSize` | 20-30 | More pre-warmed connections |
128
+ | `maxConnecting` | 2 | Prevent thundering herd |
129
+ | `waitQueueTimeoutMS` | 2000-5000 | Fail fast when pool exhausted |
130
+ | `maxIdleTimeMS` | 300000 | Balance reuse and cleanup |
131
+
132
+ ## Troubleshooting Connection Issues
133
+
134
+ ### Pool Exhaustion
135
+
136
+ **Symptoms:** `MongoWaitQueueTimeoutError`, increased latency, operations waiting.
137
+
138
+ **Diagnosis via mongosh (server-side):**
139
+ ```javascript
140
+ // Check current connections
141
+ db.serverStatus().connections
142
+
143
+ // Check active operations
144
+ db.currentOp({ active: true })
145
+ ```
146
+
147
+ **Solutions:**
148
+ - **Increase `maxPoolSize`** when: Server shows low utilization but clients are waiting
149
+ - **Don't increase** when: Server is at capacity (suggest query optimization instead)
150
+
151
+ ### Connection Timeouts
152
+
153
+ **Symptoms:** `ECONNREFUSED`, `SocketTimeoutError`
154
+
155
+ **Check:**
156
+ - Network connectivity: Can you connect via mongosh from the same host?
157
+ - Firewall/VPC settings
158
+ - DNS resolution for SRV connections
159
+ - TLS certificate validity
160
+
161
+ ### Connection Churn
162
+
163
+ **Symptoms:** Rapidly increasing `connections.totalCreated` server metric
164
+
165
+ **Causes:**
166
+ - Not reusing clients (creating new MongoClient per request)
167
+ - Not caching in serverless
168
+ - `maxIdleTimeMS` too low
169
+
170
+ **Solution:** Ensure single MongoClient instance reused across application lifecycle
171
+
172
+ ## Monitoring Connections
173
+
174
+ ```javascript
175
+ // In mongosh - check server-side connection metrics
176
+ db.serverStatus().connections
177
+ // Returns:
178
+ // {
179
+ // current: 42, // Current open connections
180
+ // available: 838858, // Available connection slots
181
+ // totalCreated: 1523 // Total connections created since startup
182
+ // }
183
+
184
+ // Check active operations
185
+ db.currentOp({ active: true }).inprog.length
186
+
187
+ // Check slow operations (if profiling enabled)
188
+ db.system.profile.find().sort({ ts: -1 }).limit(5)
189
+ ```
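
To turn the `connections` counters shown above into a single number, one option is a small utilization helper. This is a sketch only: `connectionUtilization` is a hypothetical name, and it assumes that `current + available` approximates the server's incoming connection limit.

```javascript
// Hypothetical helper: takes the object returned by db.serverStatus().connections.
function connectionUtilization({ current, available }) {
  // Assumption: used slots plus free slots approximates the server's connection limit.
  const capacity = current + available;
  return capacity === 0 ? 0 : current / capacity;
}

// Sample values copied from the commented output above.
const sample = { current: 42, available: 838858, totalCreated: 1523 };
const pct = (connectionUtilization(sample) * 100).toFixed(3);
console.log(`${pct}% of connection slots in use`);
```

A low percentage alongside client-side wait-queue errors would point at client pool sizing rather than server capacity, which matches the pool exhaustion guidance above.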
190
+
191
+ ## Best Practices Summary
192
+
193
+ 1. **Create client once, reuse everywhere** - Never create new MongoClient per request
194
+ 2. **Initialize outside serverless handlers** - Enable warm-start connection reuse
195
+ 3. **Size pools based on concurrency** - Monitor and adjust based on actual load
196
+ 4. **Use appropriate timeouts** - Match socketTimeoutMS to expected query duration
197
+ 5. **Don't manually close connections** - Let the driver manage connection lifecycle
198
+ 6. **Monitor connection metrics** - Watch `connections.current` and creation rate
199
+
200
+ ## Action Policy
201
+
202
+ **I will NEVER suggest configuration changes without understanding your context first.**
203
+
204
+ Before recommending connection settings:
205
+ 1. I'll ask about your deployment type and workload
206
+ 2. I'll inquire about current issues you're experiencing
207
+ 3. I'll suggest specific values **with explanations** of why they fit your scenario
208
+ 4. You can then apply the configuration and test
@@ -0,0 +1,202 @@
1
+ ---
2
+ name: mongodb-natural-language-querying
3
+ description: Generate read-only MongoDB queries (find) or aggregation pipelines using natural language, with collection schema context and sample documents. Use this skill whenever the user asks to write, create, or generate MongoDB queries, wants to filter/query/aggregate data in MongoDB, asks "how do I query...", needs help with query syntax, or discusses finding/filtering/grouping MongoDB documents. Also use for translating SQL-like requests to MongoDB syntax. Does NOT handle Atlas Search ($search operator), vector/semantic search ($vectorSearch operator), fuzzy matching, autocomplete indexes, or relevance scoring - use mongodb-search-and-ai for those. Does NOT analyze or optimize existing queries - use mongodb-query-optimizer for that. Does NOT handle aggregation pipelines that involve write operations.
4
+ disable-model-invocation: false
5
+ ---
6
+
7
+ # MongoDB Natural Language Querying
8
+
9
+ You are an expert MongoDB read-only query and aggregation pipeline generator. You have access to the `mongosh_eval` tool to execute shell commands and inspect the database.
10
+
11
+ ## Query Generation Process
12
+
13
+ ### 1. Gather Context Using mongosh_eval
14
+
15
+ **Required Information:**
16
+ - Database name and collection name (use `show dbs`, `show collections`, or ask user)
17
+ - User's natural language description of the query
18
+
19
+ **Fetch in this order using mongosh_eval:**
20
+
21
+ 1. **Indexes** (for query optimization):
22
+ ```javascript
23
+ db.collection.getIndexes()
24
+ ```
25
+
26
+ 2. **Sample documents** (for understanding data patterns):
27
+ ```javascript
28
+ db.collection.find().limit(4)
29
+ ```
30
+ - Shows actual data values and formats
31
+ - Reveals common patterns (enums, ranges, etc.)
32
+
33
+ 3. **Collection stats** (optional, for context):
34
+ ```javascript
35
+ db.collection.stats()
36
+ db.collection.countDocuments()
37
+ ```
38
+
39
+ ### 2. Analyze Context and Validate Fields
40
+
41
+ Before generating a query, always validate field names against the sample documents you fetched. MongoDB won't error on nonexistent field names - it will simply return no results or behave unexpectedly, making bugs hard to diagnose. By checking the sample documents first, you catch these issues before the user tries to run the query.
42
+
43
+ Also review the available indexes to understand which query patterns will perform best.
44
+
45
+ ### 3. Choose Query Type: Find vs Aggregation
46
+
47
+ Prefer find queries over aggregation pipelines because find queries are simpler and easier for other developers to understand.
48
+
49
+ **Use Find Query when:**
50
+ - Simple filtering on one or more fields
51
+ - Basic sorting, limiting, or projecting specific fields
52
+ - No need for grouping, complex transformations, or multi-stage processing
53
+
54
+ **Use Aggregation Pipeline when the request requires:**
55
+ - Grouping or aggregation functions (sum, count, average, etc.)
56
+ - Multiple transformation stages
57
+ - Joins with other collections ($lookup)
58
+ - Array unwinding or complex array operations
59
+
60
+ ### 4. Format Your Response
61
+
62
+ Output queries using mongosh shell syntax for readability and compatibility with the current mongosh session.
63
+
64
+ **Find Query Response:**
65
+ ```javascript
66
+ // Filter: users aged 25+
67
+ // Projection: name and age only
68
+ // Sort: by age descending, limit 10
69
+ db.users.find(
70
+ { age: { $gte: 25 } },
71
+ { name: 1, age: 1, _id: 0 }
72
+ ).sort({ age: -1 }).limit(10)
73
+ ```
74
+
75
+ **Aggregation Pipeline Response:**
76
+ ```javascript
77
+ db.orders.aggregate([
78
+ { $match: { status: 'active' } },
79
+ { $group: { _id: '$category', total: { $sum: '$amount' } } },
80
+ { $sort: { total: -1 } }
81
+ ])
82
+ ```
83
+
84
+ ## Best Practices
85
+
86
+ ### Query Quality
87
+ 1. **Generate correct queries** - Build queries that match user requirements, then check index coverage:
88
+ - Generate the query to correctly satisfy all user requirements
89
+ - After generating the query, check if existing indexes can support it
90
+ - If no appropriate index exists, mention this in your response (user may want to create one)
91
+ - Never use `$where` because it prevents index usage
92
+ - Do not use `$text` without a text index
93
+ - `$expr` should only be used when necessary (use sparingly)
94
+
95
+ 2. **Avoid redundant operators** - Never add operators that are already implied by other conditions:
96
+ - Don't add `$exists` when you already have an equality or inequality check (e.g., `status: "active"` or `age: { $gt: 25 }` already implies the field exists)
97
+ - Don't add overlapping range conditions (e.g., don't use both `$gte: 0` and `$gt: -1`)
98
+ - Each condition should add meaningful filtering that isn't already covered
99
+
100
+ 3. **Project only needed fields** - Reduce data transfer with projections
101
+ - Add `_id: 0` to the projection when `_id` field is not needed
102
+
103
+ 4. **Validate field names** against the sample documents before using them
104
+
105
+ 5. **Use appropriate operators** - Choose the right MongoDB operator for the task:
106
+ - `$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte` for comparisons
107
+ - `$in`, `$nin` for matching against a list of possible values (equivalent to multiple $eq/$ne conditions OR'ed together)
108
+ - `$and`, `$or`, `$not`, `$nor` for logical operations
109
+ - `$regex` for case-sensitive text pattern matching (prefer left-anchored patterns like `/^prefix/` when possible, as they can use indexes efficiently)
110
+ - `$exists` for field existence checks (prefer `a: {$ne: null}` to `a: {$exists: true}` to leverage available indexes)
111
+ - `$type` for type matching
112
+
113
+ 6. **Optimize array field checks** - Use efficient patterns for array operations:
114
+ - To check if an array is non-empty: use `"arrayField.0": {$exists: true}` instead of `arrayField: {$exists: true, $type: "array", $ne: []}`
115
+ - Checking for the first element's existence is simpler, more readable, and more efficient than combining existence, type, and inequality checks
116
+ - For matching array elements with multiple conditions, use `$elemMatch`
117
+ - For array length checks, use `$size` when you need an exact count
118
+
119
+ ### Aggregation Pipeline Quality
120
+ 1. **Filter early** - Use `$match` as early as possible to reduce documents
121
+ 2. **Project at the end** - Use `$project` at the end to correctly shape returned documents to the client
122
+ 3. **Limit when possible** - Add `$limit` after `$sort` when appropriate
123
+ 4. **Use indexes** - Ensure `$match` and `$sort` stages can use indexes:
124
+ - Place `$match` stages at the beginning of the pipeline
125
+ - Initial `$match` and `$sort` stages can use indexes if they precede any stage that modifies documents
126
+ - After generating `$match` filters, check if indexes can support them
127
+ - Minimize stages that transform documents before first `$match`
128
+ 5. **Optimize `$lookup`** - Consider denormalization for frequently joined data
129
+
130
+ ### Error Prevention
131
+ 1. **Validate all field references** against the sample documents
132
+ 2. **Quote field names correctly** - Use dot notation for nested fields
133
+ 3. **Escape special characters** in regex patterns
134
+ 4. **Check data types** - Ensure field values match field types from sample documents
135
+ 5. **Geospatial coordinates** - MongoDB's GeoJSON format requires longitude first, then latitude (e.g., `[longitude, latitude]` or `{type: "Point", coordinates: [lng, lat]}`). This is opposite to how coordinates are often written in plain English, so double-check this when generating geo queries.
136
+
137
+ ## Schema Analysis
138
+
139
+ When provided with sample documents, analyze:
140
+ 1. **Field types** - String, Number, Boolean, Date, ObjectId, Array, Object
141
+ 2. **Field patterns** - Required vs optional fields (check multiple samples)
142
+ 3. **Nested structures** - Objects within objects, arrays of objects
143
+ 4. **Array elements** - Homogeneous vs heterogeneous arrays
144
+ 5. **Special types** - Dates, ObjectIds, Binary data, GeoJSON
145
+
146
+ ## Sample Document Usage
147
+
148
+ Use sample documents to:
149
+ - Understand actual data values and ranges
150
+ - Identify field naming conventions (camelCase, snake_case, etc.)
151
+ - Detect common patterns (e.g., status enums, category values)
152
+ - Estimate cardinality for grouping operations
153
+ - Validate that your query will work with real data
154
+
155
+ ## Error Handling
156
+
157
+ If you cannot generate a query:
158
+ 1. **Explain why** - Missing schema, ambiguous request, impossible query
159
+ 2. **Ask for clarification** - Request more details about requirements
160
+ 3. **Suggest alternatives** - Propose different approaches if available
161
+ 4. **Provide examples** - Show similar queries that could work
162
+
163
+ ## Example Workflow
164
+
165
+ **User Input:** "Find all active users over 25 years old, sorted by registration date"
166
+
167
+ **Your Process:**
168
+ 1. Use mongosh_eval to check schema: `db.users.find().limit(3)`
169
+ 2. Use mongosh_eval to check indexes: `db.users.getIndexes()`
170
+ 3. Verify field names: `status`, `age`, `registrationDate` or similar
171
+ 4. Verify field types match the query requirements
172
+ 5. Generate query based on user requirements
173
+ 6. Check if available indexes can support the query
174
+ 7. Suggest creating an index if no appropriate index exists for the query filters
175
+
176
+ **Generated Query:**
177
+ ```javascript
178
+ db.users.find(
179
+ { status: 'active', age: { $gt: 25 } }
180
+ ).sort({ registrationDate: -1 })
181
+ ```
182
+
183
+ ## Managing Context Size
184
+
185
+ Fetching large or numerous sample documents wastes context and can degrade query quality.
186
+
187
+ **Adjust sample count by schema width:**
188
+ - < 30 fields: `limit: 4` (default)
189
+ - 30-80 fields: `limit: 2`
190
+ - 80-150 fields: `limit: 1`
191
+ - 150+ fields: `limit: 1` with a projection of only the fields relevant to the user's query
192
+
193
+ **Preview large array fields and strings:**
194
+ - If the sample documents contain arrays, use `$slice: 3` in the sample projection to cap array size. Limit string fields to 100 characters with `$substr` in the sample projection so that long values do not consume context.
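
The sample-count tiers above can be expressed as a lookup. This is a minimal sketch: `samplePlanForFieldCount` and the `projectRelevantOnly` flag are hypothetical names, and the handling of exactly 80 or 150 fields is an assumption because the listed tiers overlap at those values.

```javascript
// Hypothetical helper: maps a collection's field count to a sampling plan.
// Assumption: a count of exactly 80 or 150 falls into the stricter (higher) tier.
function samplePlanForFieldCount(fieldCount) {
  if (fieldCount < 30) return { limit: 4, projectRelevantOnly: false };
  if (fieldCount < 80) return { limit: 2, projectRelevantOnly: false };
  if (fieldCount < 150) return { limit: 1, projectRelevantOnly: false };
  // Very wide schemas: one document, projected down to the relevant fields.
  return { limit: 1, projectRelevantOnly: true };
}

console.log(samplePlanForFieldCount(12));
console.log(samplePlanForFieldCount(200));
```

The field count itself could come from inspecting a single sampled document, for example with `Object.keys(doc).length` for top-level fields.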
195
+
196
+ ## Executing Queries
197
+
198
+ When executing queries via mongosh_eval:
199
+ 1. Start with a small limit to preview results: `.limit(5)`
200
+ 2. Check that field names and values match expectations
201
+ 3. For aggregation pipelines, test incrementally by adding one stage at a time
202
+ 4. Use `.explain("executionStats")` to verify index usage for slow queries
@@ -0,0 +1,265 @@
1
+ ---
2
+ name: mongodb-query-optimizer
3
+ description: Help with MongoDB query optimization and indexing. Use only when the user asks for optimization or performance - "How do I optimize this query?", "How do I index this?", "Why is this query slow?", "Can you fix my slow queries?", "What are the slow queries on my cluster?", etc. Do not invoke for general MongoDB query writing unless user asks for performance or index help. Prefer indexing as optimization strategy.
4
+ disable-model-invocation: false
5
+ ---
6
+
7
+ # MongoDB Query Optimizer
8
+
9
+ ## When this skill is invoked
10
+
11
+ Invoke **only** when the user wants:
12
+
13
+ - Query/index **optimization** or **performance** help
14
+ - **Why** a query is slow or **how to speed it up**
15
+ - **How to index** a specific query
16
+ - **Slow queries** on their cluster and/or **how to optimize them**
17
+
18
+ Do **not** invoke for routine query authoring unless the user has requested help with optimization, slow queries, or indexing.
19
+
20
+ ## High Level Workflow
21
+
22
+ ### General Performance Help
23
+
24
+ If the user wants to examine slow queries, or is looking for general performance suggestions (not regarding any particular query):
25
+
26
+ 1. Enable slow-operation profiling if it is not already on, then review the slow query log and server metrics using mongosh_eval:
27
+ ```javascript
28
+ // Enable profiling level 1 (slow ops only, >100ms)
29
+ db.setProfilingLevel(1, { slowms: 100 })
30
+
31
+ // View recent slow queries
32
+ db.system.profile.find().sort({ ts: -1 }).limit(10)
33
+
34
+ // Check server status for metrics
35
+ db.serverStatus()
36
+ ```
37
+
38
+ 2. Check index usage stats across collections:
39
+ ```javascript
40
+ // For a specific collection
41
+ db.collection.aggregate([{ $indexStats: {} }])
42
+ ```
43
+
44
+ ### Help with a Specific Query
45
+
46
+ If the user is asking about a particular query:
47
+
48
+ 1. **Get existing indexes** using mongosh_eval:
49
+ ```javascript
50
+ db.collection.getIndexes()
51
+ ```
52
+
53
+ 2. **Run explain** to analyze the query plan:
54
+ ```javascript
55
+ // Basic explain
56
+ db.collection.find({...}).explain()
57
+
58
+ // Execution stats for detailed analysis
59
+ db.collection.find({...}).explain("executionStats")
60
+
61
+ // All plans execution to compare different approaches
62
+ db.collection.find({...}).explain("allPlansExecution")
63
+ ```
64
+
65
+ 3. **Get a sample document** to understand the schema:
66
+ ```javascript
67
+ db.collection.find().limit(1)
68
+ ```
69
+
70
+ ## Using Explain Output
71
+
72
+ ### Key Fields in explain("executionStats")
73
+
74
+ ```javascript
75
+ {
76
+ executionStats: {
77
+ nReturned: 5, // Documents returned
78
+ totalDocsExamined: 5, // Documents scanned
79
+ totalKeysExamined: 5, // Index keys examined
80
+ executionTimeMillis: 2, // Time in milliseconds
81
+ stage: "IXSCAN", // COLLSCAN, IXSCAN, FETCH, etc.
82
+ inputStage: {
83
+ stage: "IXSCAN",
84
+ indexName: "field_1",
85
+ keyPattern: { field: 1 }
86
+ }
87
+ }
88
+ }
89
+ ```
90
+
91
+ ### What to Look For
92
+
93
+ **Good signs:**
94
+ - `stage: "IXSCAN"` with `totalKeysExamined` close to `nReturned`
95
+ - `totalDocsExamined` close to `nReturned` (selective index), or `0` for a covered query
96
+ - `executionTimeMillis` is low
97
+
98
+ **Bad signs:**
99
+ - `stage: "COLLSCAN"` (collection scan)
100
+ - `totalDocsExamined` much higher than `nReturned` (inefficient index)
101
+ - High `executionTimeMillis`
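
The good and bad signs above mostly come down to how many keys and documents are examined per returned document. A minimal sketch of that check follows; `summarizeExecutionStats` is a hypothetical helper, and the input values are made up for illustration.

```javascript
// Hypothetical helper: condenses the executionStats counters into per-result ratios.
function summarizeExecutionStats({ nReturned, totalKeysExamined, totalDocsExamined }) {
  // When nothing is returned, report the raw examined count instead of dividing by zero.
  const perReturned = (examined) => (nReturned > 0 ? examined / nReturned : examined);
  return {
    keysPerReturned: perReturned(totalKeysExamined),
    docsPerReturned: perReturned(totalDocsExamined),
    // A covered query is answered from the index alone, so no documents are fetched.
    covered: totalKeysExamined > 0 && totalDocsExamined === 0,
  };
}

// Illustrative numbers only: 50,000 keys and documents examined for 100 results.
const summary = summarizeExecutionStats({
  nReturned: 100,
  totalKeysExamined: 50000,
  totalDocsExamined: 50000,
});
console.log(summary);
```

Ratios near 1 suggest a selective index; ratios in the hundreds suggest the index does not match the filter well.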
102
+
103
+ ## Example Workflow 1 (help with specific query)
104
+
105
+ **User:** "Why is this query slow? `db.orders.find({status: 'shipped', region: 'US'}).sort({date: -1})`"
106
+
107
+ 1. **Check existing collection indexes:**
108
+ ```javascript
109
+ db.orders.getIndexes()
110
+ ```
111
+ - Result shows: `{_id: 1}`, `{status: 1}`, `{date: -1}`
112
+
113
+ 2. **Run explain:**
114
+ ```javascript
115
+ db.orders.find(
116
+ {status: 'shipped', region: 'US'}
117
+ ).sort({date: -1}).explain("executionStats")
118
+ ```
119
+ - Result: Uses `{status: 1}` index, then in-memory SORT
120
+ - `totalKeysExamined: 50000`, `nReturned: 100`
121
+
122
+ 3. **Diagnose:** The query returns 100 docs but scans 50K index entries, and the in-memory sort adds overhead. The `{status: 1}` index covers only one of the two filter fields and cannot provide the sort order.
123
+
124
+ 4. **Recommend:** Create compound index `{status: 1, region: 1, date: -1}` following ESR (two equality fields, then sort).
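
The ESR ordering used in the recommendation above can be sketched as a key builder. `buildEsrIndexKeys` is a hypothetical helper, not a mongosh or driver API; it only arranges field names in Equality, Sort, Range order and leaves index creation to the user.

```javascript
// Hypothetical helper: orders index keys Equality -> Sort -> Range.
// Note: relies on JS objects preserving insertion order for non-numeric field names.
function buildEsrIndexKeys({ equality = [], sort = {}, range = [] }) {
  const keys = {};
  for (const field of equality) keys[field] = 1;
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in keys)) keys[field] = direction;
  }
  for (const field of range) {
    if (!(field in keys)) keys[field] = 1;
  }
  return keys;
}

// The query from this workflow: two equality fields, then a descending sort.
const keys = buildEsrIndexKeys({ equality: ["status", "region"], sort: { date: -1 } });
console.log(JSON.stringify(keys)); // {"status":1,"region":1,"date":-1}
```

The resulting object could be shown to the user as the argument of a `createIndex` command, pending their approval.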
125
+
126
+ ## Indexing Best Practices
127
+
128
+ ### Creating Indexes Safely
129
+
130
+ ```javascript
131
+ // Check index doesn't exist first
132
+ db.collection.getIndexes()
133
+
134
+ // The `background` option is ignored since MongoDB 4.2; it only matters on older servers
135
+ db.collection.createIndex(
136
+ { field: 1 },
137
+ { background: true }
138
+ )
139
+
140
+ // Compound index following ESR rule
141
+ // Equality → Sort → Range
142
+ db.orders.createIndex({
143
+ status: 1, // Equality
144
+ createdAt: -1, // Sort
145
+ age: 1 // Range
146
+ })
147
+ ```
148
+
149
+ ### Common Index Candidates
150
+ - Fields in `find()` queries (especially equality matches)
151
+ - Fields in `sort()` operations
152
+ - Fields in aggregation `$match` stages
153
+ - Foreign key-like fields used in `$lookup`
154
+
155
+ ### When Queries Use Indexes
156
+ - Equality matches on index prefix
157
+ - Range queries on index fields (use bounded ranges)
158
+ - Sorting on indexed fields
159
+ - Covered queries (all fields in index)
160
+
161
+ ### When Indexes Are Ignored
162
+ - `$nin`, `$ne`, `$not` often can't use indexes effectively
163
+ - Regex without prefix anchor `/^pattern/`
164
+ - `$where` clauses
165
+ - Large `$in` arrays (threshold varies)
166
+
167
+ ### Specialized Index Types
168
+
169
+ ```javascript
170
+ // Partial index (only index active users)
171
+ db.users.createIndex(
172
+ { email: 1 },
173
+ { partialFilterExpression: { status: "active" } }
174
+ )
175
+
176
+ // Sparse index (only index documents where field exists)
177
+ db.collection.createIndex(
178
+ { optionalField: 1 },
179
+ { sparse: true }
180
+ )
181
+
182
+ // TTL index (auto-delete old documents)
183
+ db.logs.createIndex(
184
+ { createdAt: 1 },
185
+ { expireAfterSeconds: 2592000 } // 30 days
186
+ )
187
+ ```
188
+
189
+ ## Query Patterns to Avoid
190
+
191
+ 1. **Unbounded queries**: Always use limits
192
+ ```javascript
193
+ // BAD
194
+ db.logs.find({ level: "error" })
195
+ // GOOD
196
+ db.logs.find({ level: "error" }).limit(100)
197
+ ```
198
+
199
+ 2. **Large skip values**: Use cursor-based pagination
200
+ ```javascript
201
+ // BAD: skip 1000000 is slow
202
+ db.collection.find().skip(1000000).limit(10)
203
+ // GOOD: cursor-based
204
+ db.collection.find({ _id: { $gt: lastId } }).sort({ _id: 1 }).limit(10)
205
+ ```
206
+
207
+ 3. **$lookup without index on foreign field**
208
+
209
+ 4. **Updating large arrays**: Consider separate collection
210
+
211
+ 5. **Returning unneeded fields**: Project only needed fields
212
+
213
+ ## Server Metrics to Monitor
214
+
215
+ ```javascript
216
+ // Key metrics from serverStatus
217
+ db.serverStatus().opcounters // CRUD operation counts
218
+ db.serverStatus().connections // Current connections
219
+ db.serverStatus().mem // Memory usage
220
+ db.serverStatus().globalLock // Lock contention
221
+
222
+ // WiredTiger cache metrics
223
+ db.serverStatus().wiredTiger.cache
224
+ // Look for:
225
+ // - "bytes currently in the cache" vs available RAM
226
+ // - "pages evicted by application threads" (high = pressure)
227
+ ```
228
+
229
+ ## Aggregation Pipeline Optimization
230
+
231
+ ### Stage Ordering
232
+ ```javascript
233
+ // GOOD: Filter early
234
+ db.orders.aggregate([
235
+ { $match: { status: "shipped", date: { $gte: startDate } } },
236
+ { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
237
+ { $sort: { total: -1 } },
238
+ { $limit: 10 }
239
+ ])
240
+
241
+ // BAD: Sort before filtering
242
+ db.orders.aggregate([
243
+ { $sort: { date: -1 } },
244
+ { $match: { status: "shipped" } }
245
+ ])
246
+ ```
247
+
248
+ ### Memory Optimization
249
+ ```javascript
250
+ // Allow disk use for large aggregations
251
+ db.collection.aggregate([...], { allowDiskUse: true })
252
+
253
+ // Use $project to reduce document size early
254
+ { $project: { neededField: 1, computed: { $add: ["$a", "$b"] } } }
255
+ ```
256
+
257
+ ## Output Guidelines
258
+
259
+ - Keep answers short and clear: a few sentences on index and optimization suggestions
260
+ - Focus on highest impact indexes or optimizations
261
+ - Do not use strong language like "this will definitely improve performance"
262
+ - Explain they are suggestions and give the reasoning behind them
263
+ - Consider how many indexes already exist on the collection (shouldn't generally be more than 20)
264
+ - Suggest removing indexes only if they are clearly unused (check `$indexStats`)
265
+ - Never create indexes without user approval - show the command and wait for confirmation