purecontext-mcp 1.1.0 → 1.1.2

Files changed (45)
  1. package/AGENT_INSTRUCTIONS.md +509 -0
  2. package/AGENT_INSTRUCTIONS_SHORT.md +97 -0
  3. package/CHANGELOG.md +212 -0
  4. package/docs/01-introduction.md +69 -0
  5. package/docs/02-installation.md +267 -0
  6. package/docs/03-quick-start.md +135 -0
  7. package/docs/04-configuration.md +214 -0
  8. package/docs/05-cli-reference.md +130 -0
  9. package/docs/06-tools-reference.md +499 -0
  10. package/docs/07-language-support.md +88 -0
  11. package/docs/08-framework-adapters.md +324 -0
  12. package/docs/09-dependency-graph.md +182 -0
  13. package/docs/10-semantic-search.md +153 -0
  14. package/docs/11-search-quality.md +110 -0
  15. package/docs/12-ai-summarization.md +106 -0
  16. package/docs/13-token-savings.md +110 -0
  17. package/docs/14-transport-modes.md +167 -0
  18. package/docs/15-team-setup.md +251 -0
  19. package/docs/16-docker.md +186 -0
  20. package/docs/17-web-ui.md +157 -0
  21. package/docs/18-git-history.md +157 -0
  22. package/docs/19-cross-repo.md +177 -0
  23. package/docs/20-architecture-analysis.md +228 -0
  24. package/docs/21-ecosystem-tools.md +189 -0
  25. package/docs/22-distribution.md +240 -0
  26. package/docs/23-performance.md +121 -0
  27. package/docs/24-security.md +144 -0
  28. package/docs/25-architecture-overview.md +240 -0
  29. package/docs/26-troubleshooting.md +234 -0
  30. package/docs/27-api-stability.md +114 -0
  31. package/docs/README.md +71 -0
  32. package/guide/README.md +57 -0
  33. package/guide/ai-summaries.md +127 -0
  34. package/guide/code-health.md +190 -0
  35. package/guide/code-history.md +149 -0
  36. package/guide/finding-code.md +157 -0
  37. package/guide/navigating-new-code.md +121 -0
  38. package/guide/safe-changes.md +156 -0
  39. package/guide/team-setup.md +191 -0
  40. package/guide/web-ui.md +154 -0
  41. package/guide/why-purecontext.md +73 -0
  42. package/guide/workflow-onboarding.md +114 -0
  43. package/guide/workflow-pr-review.md +199 -0
  44. package/guide/workflow-refactoring.md +172 -0
  45. package/package.json +9 -2
@@ -0,0 +1,499 @@
1
+ # MCP Tools Reference
2
+
3
+
4
+ All tools return JSON. Most responses include a `_tokenEstimate` field so agents can gauge context size before loading full source. Every retrieval tool also includes a `_meta` envelope with timing and token savings.
5
+
6
+ ```json
7
+ "_meta": {
8
+ "timing_ms": 3,
9
+ "tokens_saved": 1842,
10
+ "total_tokens_saved": 45231,
11
+ "cost_avoided": { "claude_opus_4": 0.028 },
12
+ "powered_by": "PureContext MCP"
13
+ }
14
+ ```
15
+
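As a rough client-side sketch, the `_meta` envelope shown above could be typed as follows. The names `MetaEnvelope` and `sumTokensSaved` are hypothetical; only the field names come from the example.

```typescript
// Hypothetical client-side type for the `_meta` envelope. Field names are
// copied from the JSON example; the type and helper names are assumptions.
interface MetaEnvelope {
  timing_ms: number;
  tokens_saved: number;
  total_tokens_saved: number;
  cost_avoided: Record<string, number>;
  powered_by: string;
}

// Sum the per-call savings reported across several tool responses.
// Responses without a `_meta` envelope contribute zero.
function sumTokensSaved(responses: Array<{ _meta?: MetaEnvelope }>): number {
  return responses.reduce((sum, r) => sum + (r._meta?.tokens_saved ?? 0), 0);
}
```

Treating `_meta` as optional keeps the helper usable for tools that do not attach the envelope.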
16
+ ---
17
+
18
+ ## Indexing
19
+
20
+ ### `index_folder`
21
+
22
+ Index a local project directory.
23
+
24
+ **Parameters:**
25
+
26
+ | Parameter | Type | Required | Description |
27
+ |-----------|------|----------|-------------|
28
+ | `path` | `string` | yes | Absolute path to the project root |
29
+ | `fileLimit` | `number` | no | Override config `fileLimit` for this run |
30
+ | `force` | `boolean` | no | Re-index even unchanged files (default: `false`) |
31
+
32
+ **Returns:** `{ repoId, filesIndexed, symbolsExtracted, durationMs, languages, adapters }`
33
+
34
+ Subsequent calls are incremental — only changed files (by content hash) are re-parsed. The file watcher also triggers incremental re-indexing automatically on file changes.
35
+
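The incremental behaviour described above can be pictured as a comparison of two hash maps. This is a minimal sketch, not the package's implementation: `HashIndex`, `ChangeSet`, and `diffByContentHash` are hypothetical names, and how hashes are computed or stored is not specified in this document.

```typescript
// Maps a relative file path to the content hash recorded for it (assumed shape).
type HashIndex = Record<string, string>;

interface ChangeSet {
  changed: string[]; // new files, or files whose hash differs
  removed: string[]; // files present in the previous index but missing now
}

// Decide which files need re-parsing. With `force` set, every current file
// is treated as changed, mirroring the `force` parameter of `index_folder`.
function diffByContentHash(previous: HashIndex, current: HashIndex, force = false): ChangeSet {
  const changed = Object.keys(current)
    .filter((path) => force || previous[path] !== current[path])
    .sort();
  const removed = Object.keys(previous)
    .filter((path) => !(path in current))
    .sort();
  return { changed, removed };
}
```

Comparing hashes rather than modification times means a file that is touched but not edited is not re-parsed.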
36
+ ---
37
+
38
+ ### `index_repo`
39
+
40
+ Clone and index a remote Git repository.
41
+
42
+ **Parameters:**
43
+
44
+ | Parameter | Type | Required | Description |
45
+ |-----------|------|----------|-------------|
46
+ | `url` | `string` | yes | Repository URL (`https://`, `http://`, or `git@`) |
47
+ | `branch` | `string` | no | Branch to clone (default: default branch) |
48
+ | `token` | `string` | no | Personal access token for private repos |
49
+ | `fileLimit` | `number` | no | Max files to index |
50
+
51
+ **Returns:** Same as `index_folder`. Clones are stored at `~/.purecontext/clones/`.
52
+
53
+ ---
54
+
55
+ ### `resolve_repo`
56
+
57
+ Resolve a local path to its `repoId` and report whether it is indexed.
58
+
59
+ **Parameters:** `{ path: string }`
60
+
61
+ **Returns:** `{ repoId, indexed, lastIndexed, filesIndexed, symbolCount }`
62
+
63
+ ---
64
+
65
+ ### `list_repos`
66
+
67
+ List all indexed repositories.
68
+
69
+ **Parameters:** `{}`
70
+
71
+ **Returns:** `{ repos: [{ repoId, path, filesIndexed, lastIndexed, languages }] }`
72
+
73
+ ---
74
+
75
+ ### `invalidate_cache`
76
+
77
+ Force a full re-index of a repo or a single file, clearing all content hashes.
78
+
79
+ **Parameters:**
80
+
81
+ | Parameter | Type | Required | Description |
82
+ |-----------|------|----------|-------------|
83
+ | `repoId` | `string` | yes | Repository to invalidate |
84
+ | `filePath` | `string` | no | If given, invalidate only this file (relative path) |
85
+
86
+ **Returns:** `{ invalidated: number }` — number of files whose cache was cleared.
87
+
88
+ ---
89
+
90
+ ## Symbol Search & Retrieval
91
+
92
+ ### `search_symbols`
93
+
94
+ Search symbols by name fragment. The primary navigation tool.
95
+
96
+ **Parameters:**
97
+
98
+ | Parameter | Type | Required | Description |
99
+ |-----------|------|----------|-------------|
100
+ | `repoId` | `string` | yes | Target repository |
101
+ | `query` | `string` | yes | Name fragment or FTS5 query |
102
+ | `kind` | `string` | no | Filter by symbol kind (see [Symbol Kinds](#symbol-kinds)) |
103
+ | `filePath` | `string` | no | Filter to a specific file |
104
+ | `limit` | `number` | no | Max results (default: 20) |
105
+ | `mode` | `string` | no | `"keyword"` (default), `"semantic"`, or `"hybrid"` |
106
+ | `debug` | `boolean` | no | Include relevance scoring breakdown in response |
107
+
108
+ **Returns:** `{ symbols: SymbolSummary[], _tokenEstimate }`
109
+
110
+ Does **not** return source code — use `get_symbol_source` for that.
111
+
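Because `search_symbols` returns summaries only, a typical lookup is a two-step call. The sketch below assumes a hypothetical `CallTool` function standing in for whatever MCP client you use, and assumes each `SymbolSummary` exposes its ID as `id`; neither is confirmed by this reference.

```typescript
// Hypothetical transport: invokes a tool by name and resolves to its JSON result.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<any>;

// Search by name fragment, then fetch the source of the first hit.
// Assumption: the symbol ID is available as `id` on each summary.
async function findAndFetchSource(
  callTool: CallTool,
  repoId: string,
  query: string,
): Promise<string | null> {
  const found = await callTool("search_symbols", { repoId, query, limit: 5 });
  const first = found.symbols?.[0];
  if (!first) return null; // nothing matched, so there is no source to load
  const result = await callTool("get_symbol_source", { repoId, symbolId: first.id });
  return result.source;
}
```

Keeping the search step free of source code lets an agent check `_tokenEstimate` before deciding which symbols are worth loading.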
112
+ ---
113
+
114
+ ### `search_text`
115
+
116
+ Full-text search across cached file content (grep-like).
117
+
118
+ **Parameters:**
119
+
120
+ | Parameter | Type | Required | Description |
121
+ |-----------|------|----------|-------------|
122
+ | `repoId` | `string` | yes | Target repository |
123
+ | `query` | `string` | yes | Search term or regex |
124
+ | `is_regex` | `boolean` | no | Treat query as a regular expression (default: `false`) |
125
+ | `file_pattern` | `string` | no | Glob pattern to restrict to specific files |
126
+ | `context_lines` | `number` | no | Lines of context around each match (default: 2) |
127
+ | `max_results` | `number` | no | Max matches (default: 50) |
128
+ | `debug` | `boolean` | no | Include relevance scoring breakdown |
129
+
130
+ **Returns:** `{ matches: [{ file, line, column, match, context }], truncated }`
131
+
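To make the `is_regex`, `context_lines`, and `max_results` parameters concrete, here is a simplified single-file sketch of a grep-like search. It is an illustration under assumptions (1-based line and column numbers, at most one match per line, context returned as an array of lines), not the server's actual matching logic; `searchLines` and `TextMatch` are hypothetical names.

```typescript
interface TextMatch {
  line: number;      // 1-based line number (assumed convention)
  column: number;    // 1-based column of the match start (assumed convention)
  match: string;
  context: string[]; // the matching line plus surrounding lines
}

function searchLines(
  content: string,
  query: string,
  isRegex = false,
  contextLines = 2,
  maxResults = 50,
): { matches: TextMatch[]; truncated: boolean } {
  // Escape regex metacharacters when the query is a plain search term.
  const source = isRegex ? query : query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(source);
  const lines = content.split("\n");
  const matches: TextMatch[] = [];
  let truncated = false;
  for (let i = 0; i < lines.length; i++) {
    const hit = pattern.exec(lines[i]);
    if (!hit) continue;
    if (matches.length >= maxResults) {
      truncated = true; // more matches exist than were returned
      break;
    }
    matches.push({
      line: i + 1,
      column: hit.index + 1,
      match: hit[0],
      context: lines.slice(Math.max(0, i - contextLines), i + contextLines + 1),
    });
  }
  return { matches, truncated };
}
```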
132
+ ---
133
+
134
+ ### `search_semantic`
135
+
136
+ Semantic (meaning-based) search using an HNSW vector index.
137
+
138
+ **Parameters:**
139
+
140
+ | Parameter | Type | Required | Description |
141
+ |-----------|------|----------|-------------|
142
+ | `repoId` | `string` | yes | Target repository |
143
+ | `query` | `string` | yes | Natural language description |
144
+ | `mode` | `string` | no | `"semantic"` (default) or `"hybrid"` |
145
+ | `semantic_weight` | `number` | no | Weight for semantic score in hybrid mode (default: 0.6) |
146
+ | `keyword_weight` | `number` | no | Weight for keyword score in hybrid mode (default: 0.4) |
147
+ | `max_results` | `number` | no | Max results (default: 10) |
148
+ | `kind` | `string` | no | Filter by symbol kind |
149
+
150
+ **Returns:** `{ results: [{ ...symbol, scores: { keyword, semantic, combined } }] }`
151
+
152
+ Requires semantic search to be enabled and an embedding provider to be configured. Falls back to FTS5 when no HNSW index exists.
153
+
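One plausible reading of the hybrid weights is a plain weighted sum of the two scores. The sketch below assumes exactly that; the reference does not state the formula, and `combineScores` and `rankHybrid` are hypothetical names.

```typescript
interface Scores {
  keyword: number;
  semantic: number;
  combined: number;
}

// Assumption: combined = semantic_weight * semantic + keyword_weight * keyword.
// The defaults mirror the documented parameter defaults (0.6 and 0.4).
function combineScores(keyword: number, semantic: number, semanticWeight = 0.6, keywordWeight = 0.4): Scores {
  return { keyword, semantic, combined: semanticWeight * semantic + keywordWeight * keyword };
}

// Rank candidates by combined score, highest first, and keep `maxResults`.
function rankHybrid(
  candidates: Array<{ id: string; keyword: number; semantic: number }>,
  maxResults = 10,
): Array<{ id: string; scores: Scores }> {
  return candidates
    .map((c) => ({ id: c.id, scores: combineScores(c.keyword, c.semantic) }))
    .sort((a, b) => b.scores.combined - a.scores.combined)
    .slice(0, maxResults);
}
```

Under this reading, raising `semantic_weight` favours symbols that match the meaning of the query even when their names share no words with it.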
154
+ ---
155
+
156
+ ### `get_symbol_source`
157
+
158
+ Retrieve the raw source code of a symbol.
159
+
160
+ **Parameters:**
161
+
162
+ | Parameter | Type | Required | Description |
163
+ |-----------|------|----------|-------------|
164
+ | `repoId` | `string` | yes | Target repository |
165
+ | `symbolId` | `string` | yes | Symbol ID from `search_symbols` |
166
+ | `context_lines` | `number` | no | Extra lines of context above/below (default: 0) |
167
+ | `verify` | `boolean` | no | Re-read file from disk to verify source hasn't changed |
168
+
169
+ **Returns:** `{ source, filePath, startByte, endByte, startLine, endLine, _tokenEstimate }`
170
+
171
+ ---
172
+
173
+ ### `get_symbols`
174
+
175
+ Batch-fetch multiple symbols by ID.
176
+
177
+ **Parameters:**
178
+
179
+ | Parameter | Type | Required | Description |
180
+ |-----------|------|----------|-------------|
181
+ | `repoId` | `string` | yes | Target repository |
182
+ | `symbolIds` | `string[]` | yes | Array of symbol IDs |
183
+
184
+ **Returns:** `{ symbols: [{ ...symbolSummary, source }] }`
185
+
186
+ ---
187
+
188
+ ### `get_file_content`
189
+
190
+ Retrieve raw cached file content, with optional line range.
191
+
192
+ **Parameters:**
193
+
194
+ | Parameter | Type | Required | Description |
195
+ |-----------|------|----------|-------------|
196
+ | `repoId` | `string` | yes | Target repository |
197
+ | `filePath` | `string` | yes | Relative file path |
198
+ | `startLine` | `number` | no | Start line (1-based, default: 1) |
199
+ | `endLine` | `number` | no | End line (inclusive, default: end of file) |
200
+
201
+ **Returns:** `{ content, filePath, startLine, endLine, totalLines, _tokenEstimate }`
202
+
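The `startLine` and `endLine` semantics (1-based, inclusive) can be sketched as below. Clamping out-of-range values to the file bounds is an assumption of this sketch, and `sliceLines` is a hypothetical helper.

```typescript
interface LineSlice {
  content: string;
  startLine: number;
  endLine: number;
  totalLines: number;
}

// Return lines startLine..endLine, both 1-based and inclusive.
// Out-of-range values are clamped to the file (assumed behaviour).
function sliceLines(content: string, startLine = 1, endLine?: number): LineSlice {
  const lines = content.split("\n");
  const totalLines = lines.length;
  const start = Math.max(1, startLine);
  const end = Math.min(endLine ?? totalLines, totalLines);
  return {
    content: lines.slice(start - 1, end).join("\n"),
    startLine: start,
    endLine: end,
    totalLines,
  };
}
```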
203
+ ---
204
+
205
+ ### `get_file_outline`
206
+
207
+ All symbols in a file with signatures and summaries.
208
+
209
+ **Parameters:** `{ repoId, filePath }`
210
+
211
+ **Returns:** `{ filePath, symbols: SymbolSummary[], _tokenEstimate }`
212
+
213
+ ---
214
+
215
+ ### `get_repo_outline`
216
+
217
+ All files with their top-level symbols — a project map.
218
+
219
+ **Parameters:** `{ repoId, limit? }` (default limit: 100 files)
220
+
221
+ **Returns:** `{ files: [{ filePath, symbols: SymbolSummary[] }], _tokenEstimate }`
222
+
223
+ ---
224
+
225
+ ### `get_file_tree`
226
+
227
+ Directory tree with file counts per directory.
228
+
229
+ **Parameters:** `{ repoId, maxDepth? }` (default maxDepth: 5)
230
+
231
+ **Returns:** Nested `{ name, type: 'dir'|'file', children?, symbolCount? }`
232
+
233
+ ---
234
+
235
+ ### `find_references`
236
+
237
+ Find all usage sites (call sites, references) for a symbol across the repo.
238
+
239
+ **Parameters:**
240
+
241
+ | Parameter | Type | Required | Description |
242
+ |-----------|------|----------|-------------|
243
+ | `repoId` | `string` | yes | Target repository |
244
+ | `symbolId` | `string` | yes | Symbol to find references for |
245
+ | `limit` | `number` | no | Max results (default: 50) |
246
+
247
+ **Returns:** `{ references: [{ filePath, line, column, snippet }], count }`
248
+
249
+ ---
250
+
251
+ ## Dependency Graph
252
+
253
+ ### `get_context_bundle`
254
+
255
+ Forward-walk from a symbol — returns everything needed to understand it (transitive imports).
256
+
257
+ **Parameters:**
258
+
259
+ | Parameter | Type | Required | Description |
260
+ |-----------|------|----------|-------------|
261
+ | `repoId` | `string` | yes | Target repository |
262
+ | `symbolId` | `string` | yes | Starting symbol |
263
+ | `maxDepth` | `number` | no | Max traversal depth (default: 3) |
264
+ | `maxTokens` | `number` | no | Stop collecting when estimate exceeds this |
265
+
266
+ **Returns:** `{ symbols: SymbolSummary[], files: string[], _tokenEstimate }`
267
+
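The forward walk can be pictured as a breadth-first traversal that stops at `maxDepth` or once the running token estimate exceeds `maxTokens`. This is a sketch over a hypothetical in-memory graph (`Graph`, `GraphNode`, and `collectBundle` are illustrative names), not the tool's actual traversal.

```typescript
interface GraphNode {
  deps: string[]; // symbol IDs this symbol depends on
  tokens: number; // estimated token cost of this symbol
}
type Graph = Record<string, GraphNode>;

function collectBundle(
  graph: Graph,
  start: string,
  maxDepth = 3,
  maxTokens = Infinity,
): { symbols: string[]; tokenEstimate: number } {
  const visited = new Set<string>([start]);
  const symbols: string[] = [];
  let tokenEstimate = 0;
  let frontier = [start];
  // Depth 0 is the starting symbol itself; each pass goes one hop further.
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      const node = graph[id];
      if (!node) continue;
      symbols.push(id);
      tokenEstimate += node.tokens;
      // Stop as soon as the estimate exceeds the budget.
      if (tokenEstimate > maxTokens) return { symbols, tokenEstimate };
      for (const dep of node.deps) {
        if (!visited.has(dep)) {
          visited.add(dep);
          next.push(dep);
        }
      }
    }
    frontier = next;
  }
  return { symbols, tokenEstimate };
}
```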
268
+ ---
269
+
270
+ ### `get_blast_radius`
271
+
272
+ Reverse-walk — all files that (transitively) import a symbol. Use before modifying or deleting a symbol.
273
+
274
+ **Parameters:**
275
+
276
+ | Parameter | Type | Required | Description |
277
+ |-----------|------|----------|-------------|
278
+ | `repoId` | `string` | yes | Target repository |
279
+ | `symbolId` | `string` | yes | Starting symbol |
280
+ | `maxDepth` | `number` | no | Max traversal depth (default: 5) |
281
+
282
+ **Returns:** `{ importers: string[], count, _tokenEstimate }`
283
+
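A reverse walk can be sketched by inverting the import graph and traversing it breadth-first. The example simplifies the starting point to a file rather than a symbol and uses a hypothetical `imports` map (file path to the files it imports); `blastRadius` is an illustrative name.

```typescript
function blastRadius(
  imports: Record<string, string[]>,
  targetFile: string,
  maxDepth = 5,
): { importers: string[]; count: number } {
  // Invert the graph: for each file, which files import it?
  const importedBy: Record<string, string[]> = {};
  for (const [file, deps] of Object.entries(imports)) {
    for (const dep of deps) {
      (importedBy[dep] = importedBy[dep] ?? []).push(file);
    }
  }
  const seen = new Set<string>([targetFile]);
  let frontier = [targetFile];
  // Each pass moves one hop further away from the target.
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const importer of importedBy[file] ?? []) {
        if (!seen.has(importer)) {
          seen.add(importer);
          next.push(importer);
        }
      }
    }
    frontier = next;
  }
  seen.delete(targetFile);
  const importers = [...seen].sort();
  return { importers, count: importers.length };
}
```

With `maxDepth` set to 1 the result reduces to direct importers, which is what `find_importers` reports.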
284
+ ---
285
+
286
+ ### `find_importers`
287
+
288
+ Direct (one-hop) importers of a file.
289
+
290
+ **Parameters:** `{ repoId, filePath }`
291
+
292
+ **Returns:** `{ importers: [{ filePath, importedNames: string[] }], _tokenEstimate }`
293
+
294
+ ---
295
+
296
+ ### `find_dead_code`
297
+
298
+ Exported symbols in files that nothing else imports.
299
+
300
+ **Parameters:**
301
+
302
+ | Parameter | Type | Required | Description |
303
+ |-----------|------|----------|-------------|
304
+ | `repoId` | `string` | yes | Target repository |
305
+ | `limit` | `number` | no | Max results (default: 50) |
306
+
307
+ **Returns:** `{ symbols: SymbolSummary[], _tokenEstimate }`
308
+
309
+ **Note:** May produce false positives for: dynamic imports, side-effect imports, symbols used by external packages (npm consumers).
310
+
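The rule described above, and the reason for its false positives, can be illustrated with a small sketch over a hypothetical file table (`FileInfo`, `DeadExport`, and `findDeadExports` are illustrative names, not part of the tool).

```typescript
interface FileInfo {
  exports: string[]; // names exported by the file
  imports: string[]; // paths of files it imports statically
}
interface DeadExport {
  filePath: string;
  name: string;
}

// Report every export of every file that no other file imports.
function findDeadExports(files: Record<string, FileInfo>): DeadExport[] {
  const imported = new Set<string>();
  for (const info of Object.values(files)) {
    for (const dep of info.imports) imported.add(dep);
  }
  const dead: DeadExport[] = [];
  for (const filePath of Object.keys(files).sort()) {
    if (imported.has(filePath)) continue;
    for (const name of files[filePath].exports) dead.push({ filePath, name });
  }
  return dead;
}
```

An entry-point file is flagged by this rule because nothing imports it, which is the same kind of false positive the note warns about for dynamic imports and external consumers.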
311
+ ---
312
+
313
+ ### `get_layer_violations`
314
+
315
+ Detect architectural import boundary violations.
316
+
317
+ **Parameters:**
318
+
319
+ | Parameter | Type | Required | Description |
320
+ |-----------|------|----------|-------------|
321
+ | `repoId` | `string` | yes | Target repository |
322
+ | `layers` | `LayerDef[]` | no | Layer definitions (reads from config if omitted) |
323
+
324
+ **Returns:** `{ violations: [{ from_layer, to_layer, from_file, to_file, import_spec }], summary }`
325
+
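The shape of `LayerDef` is not documented here, so the following sketch invents one for illustration: each layer has a `name`, a `pathPrefix`, and a `mayImport` allow-list. All three fields, and the helper names, are assumptions; only the violation field names come from the return type above.

```typescript
// Hypothetical layer definition; the real `LayerDef` may look different.
interface LayerDef {
  name: string;
  pathPrefix: string;  // files under this prefix belong to the layer
  mayImport: string[]; // names of layers this layer is allowed to import
}
interface ImportEdge {
  from_file: string;
  to_file: string;
  import_spec: string;
}
interface Violation extends ImportEdge {
  from_layer: string;
  to_layer: string;
}

function layerOf(layers: LayerDef[], filePath: string): LayerDef | undefined {
  return layers.find((l) => filePath.startsWith(l.pathPrefix));
}

// An edge is a violation when it crosses into a layer that the source
// layer does not list in `mayImport`.
function findLayerViolations(layers: LayerDef[], edges: ImportEdge[]): Violation[] {
  const violations: Violation[] = [];
  for (const edge of edges) {
    const from = layerOf(layers, edge.from_file);
    const to = layerOf(layers, edge.to_file);
    if (!from || !to || from.name === to.name) continue;
    if (!from.mayImport.includes(to.name)) {
      violations.push({ ...edge, from_layer: from.name, to_layer: to.name });
    }
  }
  return violations;
}
```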
326
+ ---
327
+
328
+ ## Token Savings
329
+
330
+ ### `get_savings_stats`
331
+
332
+ View cumulative token savings across all PureContext tool calls.
333
+
334
+ **Parameters:** `{ reset?: boolean }` — set `reset: true` to clear counters.
335
+
336
+ **Returns:**
337
+
338
+ ```json
339
+ {
340
+ "total_tokens_saved": 1234567,
341
+ "equivalent_context_windows": {
342
+ "claude_200k": 6.17,
343
+ "gpt4_128k": 9.64
344
+ },
345
+ "total_cost_avoided": {
346
+ "claude_opus_4": 18.52,
347
+ "claude_sonnet_4": 3.70,
348
+ "claude_haiku_4": 0.99,
349
+ "gpt4o": 3.09,
350
+ "gpt4o_mini": 0.19
351
+ }
352
+ }
353
+ ```
354
+
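The cost figures in the example are consistent with a simple tokens-times-price calculation. The sketch below uses per-million-token rates back-calculated from the example values; they are assumptions for illustration, not a published price list, and `costAvoided` is a hypothetical helper.

```typescript
// Assumed USD price per million tokens, inferred from the example figures.
const PRICE_PER_MILLION_USD: Record<string, number> = {
  claude_opus_4: 15,
  claude_sonnet_4: 3,
  claude_haiku_4: 0.8,
  gpt4o: 2.5,
  gpt4o_mini: 0.15,
};

// Estimate the cost avoided per model, rounded to `decimals` places.
function costAvoided(tokensSaved: number, decimals = 2): Record<string, number> {
  const scale = Math.pow(10, decimals);
  const result: Record<string, number> = {};
  for (const [model, price] of Object.entries(PRICE_PER_MILLION_USD)) {
    result[model] = Math.round((tokensSaved / 1e6) * price * scale) / scale;
  }
  return result;
}
```

With these rates, `costAvoided(1234567)` gives 18.52 for `claude_opus_4`, matching the example above, and `costAvoided(1842, 3)` gives 0.028, matching the `_meta` example at the top of this page.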
355
+ ---
356
+
357
+ ## Cross-Repo Tools
358
+
359
+ ### `search_cross_repo`
360
+
361
+ Search symbols across multiple indexed repositories simultaneously.
362
+
363
+ **Parameters:**
364
+
365
+ | Parameter | Type | Required | Description |
366
+ |-----------|------|----------|-------------|
367
+ | `query` | `string` | yes | Name fragment |
368
+ | `repoIds` | `string[]` | no | Repos to search (default: all in workspace) |
369
+ | `kind` | `string` | no | Symbol kind filter |
370
+ | `limit` | `number` | no | Max results (default: 20) |
371
+
372
+ **Returns:** `{ symbols: [{ ...symbolSummary, repoId }] }`
373
+
374
+ ---
375
+
376
+ ### `find_similar`
377
+
378
+ Find semantically similar code across repos.
379
+
380
+ **Parameters:**
381
+
382
+ | Parameter | Type | Required | Description |
383
+ |-----------|------|----------|-------------|
384
+ | `symbolId` | `string` | yes | Reference symbol |
385
+ | `repoId` | `string` | yes | Repo of the reference symbol |
386
+ | `searchRepoIds` | `string[]` | no | Repos to search (default: all) |
387
+ | `minSimilarity` | `number` | no | Minimum cosine similarity 0–1 (default: 0.8) |
388
+ | `limit` | `number` | no | Max results (default: 10) |
389
+
390
+ **Returns:** `{ similar: [{ ...symbolSummary, repoId, similarity }] }`
391
+
392
+ Requires semantic search to be enabled.
393
+
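For readers unfamiliar with the `minSimilarity` threshold, here is a minimal sketch of cosine similarity and a threshold filter over in-memory embeddings. The embedding vectors and the names `cosineSimilarity` and `filterSimilar` are illustrative; the tool's real search goes through its vector index rather than a linear scan.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Keep candidates at or above the threshold, most similar first.
function filterSimilar(
  reference: number[],
  candidates: Array<{ id: string; embedding: number[] }>,
  minSimilarity = 0.8,
  limit = 10,
): Array<{ id: string; similarity: number }> {
  return candidates
    .map((c) => ({ id: c.id, similarity: cosineSimilarity(reference, c.embedding) }))
    .filter((c) => c.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```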
394
+ ---
395
+
396
+ ## Git & History Tools
397
+
398
+ ### `get_symbol_history`
399
+
400
+ Symbol-level git commit history.
401
+
402
+ **Parameters:**
403
+
404
+ | Parameter | Type | Required | Description |
405
+ |-----------|------|----------|-------------|
406
+ | `repoId` | `string` | yes | Target repository |
407
+ | `symbolId` | `string` | yes | Symbol to get history for |
408
+ | `limit` | `number` | no | Max commits (default: 20) |
409
+
410
+ **Returns:** `{ history: [{ hash, author, date, message, diff }] }`
411
+
412
+ ---
413
+
414
+ ### `get_churn_metrics`
415
+
416
+ File or symbol churn metrics.
417
+
418
+ **Parameters:**
419
+
420
+ | Parameter | Type | Required | Description |
421
+ |-----------|------|----------|-------------|
422
+ | `repoId` | `string` | yes | Target repository |
423
+ | `filePath` | `string` | no | Scope to a single file |
424
+ | `since` | `string` | no | ISO 8601 date; only history from this date onward is considered |
425
+
426
+ **Returns:** `{ files: [{ filePath, commits, linesChanged, authors, churnScore }] }`
427
+
428
+ ---
429
+
430
+ ## Architecture Analysis Tools
431
+
432
+ ### `get_quality_metrics`
433
+
434
+ Per-file and per-symbol quality scores.
435
+
436
+ **Parameters:** `{ repoId, filePath? }`
437
+
438
+ **Returns:** `{ files: [{ filePath, complexity, coupling, cohesion, docCoverage, score }] }`
439
+
440
+ ---
441
+
442
+ ### `detect_antipatterns`
443
+
444
+ Detect common architectural anti-patterns.
445
+
446
+ **Parameters:** `{ repoId, patterns?: string[] }` — omit `patterns` for all checks.
447
+
448
+ **Returns:** `{ issues: [{ pattern, filePath, symbolId, severity, description }] }`
449
+
450
+ ---
451
+
452
+ ### `get_architecture_doc`
453
+
454
+ Auto-generate an architecture summary.
455
+
456
+ **Parameters:** `{ repoId, format?: 'markdown' | 'mermaid' }` (default: `'markdown'`)
457
+
458
+ **Returns:** `{ doc: string }`
459
+
460
+ ---
461
+
462
+ ## Ecosystem & Data Tools
463
+
464
+ ### `search_columns`
465
+
466
+ Search dbt/SQL column definitions.
467
+
468
+ **Parameters:** `{ repoId, query, modelName? }`
469
+
470
+ **Returns:** `{ columns: [{ name, model, dataType, description, lineage }] }`
471
+
472
+ ---
473
+
474
+ ## Symbol Kinds
475
+
476
+ The `kind` parameter accepts any of:
477
+
478
+ | Kind | Description |
479
+ |------|-------------|
480
+ | `function` | Standalone function / top-level def |
481
+ | `class` | Class, struct (Go/Rust), or OOP type |
482
+ | `method` | Method inside a class/struct/impl |
483
+ | `const` | Constant, exported variable, field |
484
+ | `type` | Type alias, typedef, newtype |
485
+ | `interface` | Interface or protocol |
486
+ | `enum` | Enumeration |
487
+ | `component` | UI component (Vue, React, Angular) |
488
+ | `composable` | Vue composable (`useXxx`) |
489
+ | `hook` | React hook (`useXxx`) |
490
+ | `route` | HTTP route (any framework) |
491
+ | `middleware` | Middleware or guard |
492
+ | `decorator` | Decorator / annotation |
493
+ | `model` | ORM model |
494
+ | `view` | Request handler / controller action |
495
+ | `struct` | C/C++ struct |
496
+ | `macro` | C/C++ `#define` macro |
497
+ | `signal` | Django signal receiver |
498
+ | `namespace` | C++ namespace |
499
+ | `widget` | Flutter widget |
@@ -0,0 +1,88 @@
1
+ # Language Support
2
+
3
+
4
+ PureContext supports **34 languages** via tree-sitter WASM grammars. All grammars are bundled in the `grammars/` directory — no separate install needed.
5
+
6
+ ---
7
+
8
+ ## Supported languages
9
+
10
+ | Language | Extensions | Symbol Types | Doc Comments |
11
+ |----------|-----------|--------------|--------------|
12
+ | TypeScript | `.ts`, `.tsx`, `.mts`, `.cts` | function, class, method, const, type, interface, enum | JSDoc `/** */` |
13
+ | JavaScript | `.js`, `.jsx`, `.mjs`, `.cjs` | function, class, method, const | JSDoc `/** */` |
14
+ | Python | `.py` | function, class, method, const | Docstrings `"""` |
15
+ | Go | `.go` | function, method, class (struct), interface, const, type | `//` preceding comments |
16
+ | Rust | `.rs` | function, method, class (struct), enum, interface (trait), const, type | `///` doc comments |
17
+ | Java | `.java` | class, interface, enum, method, const | Javadoc `/** */` |
18
+ | C# | `.cs` | class, interface, enum, struct, record, method, const, property | XML docs `/// <summary>` |
19
+ | PHP | `.php` | function, class, interface, trait, enum, method, const | PHPDoc `/** */` |
20
+ | Ruby | `.rb` | function, class, method, module, const | `#` comments |
21
+ | Kotlin | `.kt`, `.kts` | function, class, interface, enum, method, typealias, object | KDoc `/**` |
22
+ | C | `.c`, `.h` | function, struct, enum, macro, type | `//` and `/* */` |
23
+ | C++ | `.cpp`, `.cxx`, `.cc`, `.hpp`, `.hxx`, `.hh` | All C types + namespace, template | `///` Doxygen |
24
+ | Lua | `.lua` | function, method, const | `--` comments |
25
+ | Dart | `.dart` | class, mixin, extension, enum, function, method, const, type | `///` doc comments |
26
+ | Swift | `.swift` | class, struct, protocol, actor, extension, method, enum, type | `///` DocC |
27
+ | Elixir | `.ex`, `.exs` | module (class), function, macro, struct, protocol | `@doc` attribute |
28
+ | Haskell | `.hs`, `.lhs` | function, data (class), typeclass (interface), instance, type, newtype | Haddock `-- \|` |
29
+ | Scala | `.scala`, `.sc` | class, trait, object, case class, function, method, type, enum | Scaladoc `/** */` |
30
+ | R | `.r`, `.R`, `.Rmd` | function, const, S3/S4/R6 class | Roxygen2 `#'` |
31
+ | Bash | `.sh`, `.bash` | function |
32
+ | Perl | `.pl`, `.pm` | function, package |
33
+ | Terraform / HCL | `.tf`, `.hcl` | resource, module, variable, output |
34
+ | Nix | `.nix` | function, attribute |
35
+ | Protobuf | `.proto` | message, service, enum, rpc |
36
+ | GraphQL | `.graphql`, `.gql` | type, query, mutation, subscription, fragment |
37
+ | Groovy | `.groovy` | function, class, method |
38
+ | Erlang | `.erl`, `.hrl` | function, module |
39
+ | Gleam | `.gleam` | function, type |
40
+ | GDScript | `.gd` | function, class, signal |
41
+ | XML | `.xml` | element (configurable patterns) |
42
+ | Objective-C | `.m`, `.h` | function, class, method |
43
+ | Fortran | `.f90`, `.f95`, `.for`, `.f` | function, subroutine, module |
44
+ | SQL | `.sql` | table, view, function, procedure |
45
+ | OpenAPI / YAML | `.yaml`, `.yml` (OpenAPI detected by content) | endpoint, schema |
46
+
47
+ ---
48
+
49
+ ## What gets indexed
50
+
51
+ For all languages, the indexer extracts:
52
+
53
+ - **Symbol name** — the identifier as it appears in source
54
+ - **Symbol kind** — function, class, method, route, component, etc.
55
+ - **Byte offsets** (`startByte`, `endByte`) — for precise source retrieval without reading the whole file
56
+ - **Signature** — a one-line declaration (TypeScript shows full type annotations, Python shows type hints if present)
57
+ - **Summary** — sourced from docstring, framework inference, AI, or signature fallback
58
+ - **Import/dependency edges** — for the dependency graph
59
+
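Byte offsets differ from character offsets as soon as a file contains multi-byte UTF-8 characters, which is why retrieval by `startByte` and `endByte` works on raw bytes. The sketch below assumes `endByte` is exclusive; that convention is not stated in this document, and `sliceByBytes` is a hypothetical helper.

```typescript
// Decode the byte range [startByte, endByte) of a UTF-8 encoded file.
// Assumption: `endByte` is exclusive.
function sliceByBytes(fileBytes: Uint8Array, startByte: number, endByte: number): string {
  return new TextDecoder("utf-8").decode(fileBytes.subarray(startByte, endByte));
}

// The comment line contains "é", which takes two bytes in UTF-8, so the
// function starts at byte 9 but at character index 8.
const exampleSource = "// café\nfunction f() {}\n";
const exampleBytes = new TextEncoder().encode(exampleSource);
const byByte = sliceByBytes(exampleBytes, 9, 24);
const byChar = exampleSource.slice(9, 24); // wrong: treats byte offsets as character indices
```

Here `byByte` is `function f() {}`, while `byChar` drops the leading `f` and picks up the trailing newline. With stored offsets, a reader only needs that byte range rather than the whole file.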
60
+ ---
61
+
62
+ ## What is excluded automatically
63
+
64
+ The indexer skips these automatically:
65
+
66
+ - `node_modules/`, `.git/`, `dist/`, `build/`, `.claude/`, `target/`, `.next/`, `.nuxt/`
67
+ - `*.lock` files, `.env*` files
68
+ - Binary files (detected by null-byte scanning of the first 8 KB)
69
+ - Files > 1 MB (configurable via `maxFileSizeBytes`)
70
+ - Secret files: `*.pem`, `*.key`, `id_rsa`, `credentials.json`, `serviceAccountKey*.json`, etc.
71
+ - Language-specific private symbols:
72
+   - Go: unexported names (lowercase)
73
+   - C: `static` functions (translation-unit internal)
74
+   - Java/C#/PHP: `private` members
75
+   - Dart: `_`-prefixed names
76
+
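The binary and size checks in the list above can be sketched as follows. This is an approximation under assumptions: 1 MB is taken as 1024 × 1024 bytes, and the helper names (`looksBinary`, `shouldSkip`) are hypothetical.

```typescript
const BINARY_SCAN_BYTES = 8 * 1024;              // scan only the first 8 KB
const DEFAULT_MAX_FILE_SIZE_BYTES = 1024 * 1024; // assumed meaning of "1 MB"

// A file is treated as binary if a null byte appears in the scanned prefix.
function looksBinary(bytes: Uint8Array): boolean {
  const end = Math.min(bytes.length, BINARY_SCAN_BYTES);
  for (let i = 0; i < end; i++) {
    if (bytes[i] === 0) return true;
  }
  return false;
}

// Skip files that exceed the size limit or look binary.
function shouldSkip(bytes: Uint8Array, maxFileSizeBytes = DEFAULT_MAX_FILE_SIZE_BYTES): boolean {
  return bytes.length > maxFileSizeBytes || looksBinary(bytes);
}
```

Scanning only a fixed prefix keeps the check cheap, at the cost of missing null bytes that first appear later in a file.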
77
+ ---
78
+
79
+ ## Grammar notes
80
+
81
+ Grammars are bundled as `.wasm` files in the `grammars/` directory. They are loaded once per worker thread at startup. Grammar versions are pinned in `package.json` and tested against the test fixtures in `test/handlers/`.
82
+
83
+ Known limitations:
84
+ - **TypeScript JSX** (`.tsx`): the `tree-sitter-tsx` grammar is separate from `tree-sitter-typescript` and is used for all `.tsx` files.
85
+ - **Python**: type hints in stubs (`.pyi`) are not indexed — only `.py` files.
86
+ - **Terraform**: complex `dynamic` blocks may not be fully extracted.
87
+ - **XML**: element extraction uses configurable patterns — not all XML files are indexed by default.
88
+