nano-brain 2026.3.2 → 2026.3.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (62)
  1. package/ai/test-case/rri-t/nano-brain-gitnexus-enhancements/01-prepare.md +172 -0
  2. package/ai/test-case/rri-t/nano-brain-gitnexus-enhancements/02-discover.md +195 -0
  3. package/ai/test-case/rri-t/nano-brain-gitnexus-enhancements/03-structure.md +558 -0
  4. package/ai/test-case/rri-t/nano-brain-gitnexus-enhancements/04-execute.md +194 -0
  5. package/ai/test-case/rri-t/nano-brain-gitnexus-enhancements/05-analyze.md +224 -0
  6. package/ai/test-case/rri-t/nano-brain-gitnexus-enhancements/summary.md +119 -0
  7. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/.openspec.yaml +2 -0
  8. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/design.md +88 -0
  9. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/proposal.md +31 -0
  10. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/specs/context-tool/spec.md +27 -0
  11. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/specs/flow-detection/spec.md +49 -0
  12. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/specs/impact-analysis/spec.md +38 -0
  13. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/specs/search-pipeline/spec.md +16 -0
  14. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/specs/symbol-graph/spec.md +72 -0
  15. package/openspec/changes/archive/2026-03-06-nano-brain-gitnexus-enhancements/tasks.md +79 -0
  16. package/openspec/changes/code-intelligence-accuracy-eval/.openspec.yaml +2 -0
  17. package/openspec/changes/code-intelligence-accuracy-eval/design.md +164 -0
  18. package/openspec/changes/code-intelligence-accuracy-eval/proposal.md +29 -0
  19. package/openspec/changes/code-intelligence-accuracy-eval/specs/accuracy-reporting/spec.md +100 -0
  20. package/openspec/changes/code-intelligence-accuracy-eval/specs/eval-harness/spec.md +101 -0
  21. package/openspec/changes/code-intelligence-accuracy-eval/specs/golden-fixtures/spec.md +90 -0
  22. package/openspec/changes/code-intelligence-accuracy-eval/tasks.md +83 -0
  23. package/package.json +6 -1
  24. package/src/codebase.ts +200 -3
  25. package/src/eval/calibration.ts +71 -0
  26. package/src/eval/harness.ts +493 -0
  27. package/src/eval/loader.ts +162 -0
  28. package/src/eval/regression.ts +100 -0
  29. package/src/eval/report.ts +59 -0
  30. package/src/eval/types.ts +73 -0
  31. package/src/flow-detection.ts +249 -0
  32. package/src/graph.ts +206 -0
  33. package/src/index.ts +156 -42
  34. package/src/search.ts +72 -1
  35. package/src/server.ts +239 -6
  36. package/src/store.ts +53 -0
  37. package/src/symbol-graph.ts +615 -0
  38. package/src/treesitter.ts +651 -0
  39. package/src/types.ts +3 -0
  40. package/test/eval/accuracy.test.ts +126 -0
  41. package/test/eval/fixtures/py-mixed/fixture.json +4 -0
  42. package/test/eval/fixtures/py-mixed/ground-truth.json +257 -0
  43. package/test/eval/fixtures/py-mixed/src/main.py +28 -0
  44. package/test/eval/fixtures/py-mixed/src/service.py +28 -0
  45. package/test/eval/fixtures/py-mixed/src/utils.py +12 -0
  46. package/test/eval/fixtures/ts-complex/fixture.json +4 -0
  47. package/test/eval/fixtures/ts-complex/ground-truth.json +510 -0
  48. package/test/eval/fixtures/ts-complex/src/api-handler.ts +54 -0
  49. package/test/eval/fixtures/ts-complex/src/base-service.ts +36 -0
  50. package/test/eval/fixtures/ts-complex/src/types.ts +20 -0
  51. package/test/eval/fixtures/ts-complex/src/user-service.ts +68 -0
  52. package/test/eval/fixtures/ts-complex/src/utils.ts +25 -0
  53. package/test/eval/fixtures/ts-simple/fixture.json +4 -0
  54. package/test/eval/fixtures/ts-simple/ground-truth.json +176 -0
  55. package/test/eval/fixtures/ts-simple/src/index.ts +105 -0
  56. package/test/flow-detection.test.ts +469 -0
  57. package/test/mcp-tools-symbol.test.ts +477 -0
  58. package/test/search-enrichment.test.ts +395 -0
  59. package/test/search.test.ts +2 -1
  60. package/test/symbol-clustering.test.ts +337 -0
  61. package/test/symbol-graph.test.ts +690 -0
  62. package/test/treesitter.test.ts +615 -0
@@ -0,0 +1,172 @@
+ # RRI-T Phase 1: PREPARE
+
+ **Feature:** nano-brain-gitnexus-enhancements
+ **Date:** 2026-03-06
+ **Type:** Backend TypeScript Library (MCP Server + CLI)
+
+ ---
+
+ ## 1. Feature Overview
+
+ The nano-brain-gitnexus-enhancements feature adds symbol-level code intelligence capabilities to nano-brain, inspired by GitNexus. It transforms nano-brain from a file-level indexer into a symbol-aware knowledge graph system.
+
+ ---
+
+ ## 2. Five Core Capabilities
+
+ ### C1: Tree-sitter AST Parsing
+ - Uses Tree-sitter native bindings to parse TypeScript, JavaScript, and Python files
+ - Extracts code symbols: functions, classes, methods, interfaces
+ - Captures metadata: name, kind, file path, start/end lines, export status
+ - Graceful fallback to regex-only parsing if Tree-sitter fails to load
+
+ ### C2: Symbol-level Knowledge Graph
+ - Stores symbols in `code_symbols` table, edges in `symbol_edges` table
+ - Typed edges: CALLS, IMPORTS, EXTENDS, IMPLEMENTS
+ - Confidence scoring (0.5-1.0) based on resolution certainty:
+   - Direct AST-resolved: 1.0
+   - Import-resolved: 0.9
+   - Same-file unresolved: 0.8
+   - Cross-file heuristic: 0.7
+   - Dynamic/computed: 0.5
+ - Coexists with existing infrastructure symbols (Redis, MySQL, etc.)
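
The confidence tiers listed under C2 can be pictured as a lookup table applied when an edge is created. The following is only a minimal sketch: the names `ResolutionKind`, `CONFIDENCE`, `SymbolEdge`, `makeEdge`, and `filterByConfidence` are hypothetical and are not taken from `src/symbol-graph.ts`; only the numeric values and the edge types come from the list above.

```typescript
// Hypothetical types for illustration; not the package's actual definitions.
type EdgeType = "CALLS" | "IMPORTS" | "EXTENDS" | "IMPLEMENTS";

type ResolutionKind =
  | "ast_direct"
  | "import_resolved"
  | "same_file_unresolved"
  | "cross_file_heuristic"
  | "dynamic";

// Confidence values copied from the C2 list.
const CONFIDENCE: Record<ResolutionKind, number> = {
  ast_direct: 1.0,
  import_resolved: 0.9,
  same_file_unresolved: 0.8,
  cross_file_heuristic: 0.7,
  dynamic: 0.5,
};

interface SymbolEdge {
  source: string;
  target: string;
  type: EdgeType;
  confidence: number;
}

// Build an edge whose confidence reflects how the target was resolved.
function makeEdge(
  source: string,
  target: string,
  type: EdgeType,
  resolution: ResolutionKind,
): SymbolEdge {
  return { source, target, type, confidence: CONFIDENCE[resolution] };
}

// Keep only edges at or above a caller-supplied threshold.
function filterByConfidence(edges: SymbolEdge[], minConfidence: number): SymbolEdge[] {
  return edges.filter((e) => e.confidence >= minConfidence);
}

const edges: SymbolEdge[] = [
  makeEdge("handleLogin", "createSession", "CALLS", "import_resolved"),
  makeEdge("UserService", "BaseService", "EXTENDS", "ast_direct"),
  makeEdge("run", "plugin", "CALLS", "dynamic"),
];

console.log(filterByConfidence(edges, 0.8).map((e) => e.target));
```

Keeping the tiers in one table means a threshold such as `minConfidence` can be applied uniformly to every edge type.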
+
+ ### C3: Impact Analysis (`impact` MCP tool)
+ - Computes blast radius for a symbol (upstream/downstream)
+ - Returns affected symbols grouped by traversal depth
+ - Includes risk assessment: LOW, MEDIUM, HIGH, CRITICAL
+ - Supports maxDepth and minConfidence filters
+ - Lists affected execution flows
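
A depth-grouped traversal with `maxDepth` and `minConfidence` filters, as described under C3, might look roughly like the sketch below. It is an assumption-laden illustration rather than the code in `src/symbol-graph.ts`: the `Edge` shape and the `impactByDepth` function are hypothetical, and risk assessment is left out because the thresholds are not stated here.

```typescript
// Hypothetical edge shape: `source` calls `target`.
interface Edge {
  source: string;
  target: string;
  confidence: number;
}

type Direction = "upstream" | "downstream";

// Breadth-first traversal that groups reached symbols by depth.
// "upstream" walks from a symbol to its callers, "downstream" to its callees.
function impactByDepth(
  edges: Edge[],
  start: string,
  direction: Direction,
  maxDepth: number,
  minConfidence: number,
): Map<number, string[]> {
  const byDepth = new Map<number, string[]>();
  const visited = new Set<string>([start]); // guards against call cycles
  let frontier: string[] = [start];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const e of edges) {
        if (e.confidence < minConfidence) continue; // drop low-confidence edges
        const from = direction === "upstream" ? e.target : e.source;
        const to = direction === "upstream" ? e.source : e.target;
        if (from !== node || visited.has(to)) continue;
        visited.add(to);
        next.push(to);
      }
    }
    if (next.length > 0) byDepth.set(depth, next);
    frontier = next;
  }
  return byDepth;
}

// a -> b -> c -> a forms a cycle; d -> c is a low-confidence edge.
const graph: Edge[] = [
  { source: "a", target: "b", confidence: 1.0 },
  { source: "b", target: "c", confidence: 0.9 },
  { source: "c", target: "a", confidence: 0.7 },
  { source: "d", target: "c", confidence: 0.5 },
];

const upstream = impactByDepth(graph, "c", "upstream", 3, 0.7);
console.log(Array.from(upstream.entries()));
```

The visited set is what keeps the traversal finite on circular call chains, independent of how large `maxDepth` is.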
+
+ ### C4: Context Tool (`context` MCP tool)
+ - Provides 360-degree symbol view
+ - Returns: metadata, incoming refs (callers), outgoing refs (callees), cluster membership, flow participation
+ - Handles ambiguous symbol names with disambiguation list
+ - Supports file_path parameter for disambiguation
+ - Shows connected infrastructure symbols
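
The disambiguation behaviour described under C4 can be sketched as a lookup that returns one of three outcomes. This is illustrative only: `CodeSymbol`, `Lookup`, and `lookupSymbol` are hypothetical names, and the actual response format of the `context` tool is not shown here.

```typescript
// Hypothetical symbol record; field names are assumptions.
interface CodeSymbol {
  name: string;
  kind: "function" | "class" | "method" | "interface";
  filePath: string;
  startLine: number;
}

type Lookup =
  | { status: "found"; symbol: CodeSymbol }
  | { status: "ambiguous"; candidates: CodeSymbol[] }
  | { status: "not_found"; message: string };

// Resolve a name to one symbol, optionally narrowing by file path.
function lookupSymbol(symbols: CodeSymbol[], name: string, filePath?: string): Lookup {
  let matches = symbols.filter((s) => s.name === name);
  if (filePath !== undefined) {
    matches = matches.filter((s) => s.filePath === filePath);
  }
  const [first, ...rest] = matches;
  if (first === undefined) {
    return { status: "not_found", message: `No symbol named "${name}" was found.` };
  }
  if (rest.length === 0) {
    return { status: "found", symbol: first };
  }
  // Several matches: return the list so the caller can pick one by file.
  return { status: "ambiguous", candidates: matches };
}

const index: CodeSymbol[] = [
  { name: "handle", kind: "function", filePath: "src/api.ts", startLine: 10 },
  { name: "handle", kind: "method", filePath: "src/worker.ts", startLine: 42 },
  { name: "validate", kind: "function", filePath: "src/utils.ts", startLine: 3 },
];

console.log(lookupSymbol(index, "handle").status);
console.log(lookupSymbol(index, "handle", "src/worker.ts").status);
```

Returning candidates with their file path and kind gives an agent enough information to retry the call with `file_path` set.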
+
+ ### C5: Flow Detection & Change Detection
+ - Detects execution flows from entry points via BFS
+ - Entry points: exported functions with no internal callers, route handlers
+ - Configurable max depth (default: 10) and branching limit (default: 4)
+ - Labels flows heuristically from entry/terminal names
+ - Classifies flows: intra_community vs cross_community
+ - `detect_changes` MCP tool maps git diff to affected symbols and flows
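
The BFS-based flow tracing described under C5 can be approximated as follows. This is a sketch under stated assumptions, not the contents of `src/flow-detection.ts`: `FnSymbol`, `CallEdge`, `findEntryPoints`, and `traceFlow` are hypothetical, route-handler detection is omitted, and ranking callees by confidence when applying the branching limit is an assumption (the source only says the top 4 are followed).

```typescript
// Hypothetical shapes for illustration.
interface FnSymbol {
  name: string;
  exported: boolean;
}

interface CallEdge {
  caller: string;
  callee: string;
  confidence: number;
}

// Entry points: exported functions that nothing in the codebase calls.
function findEntryPoints(symbols: FnSymbol[], edges: CallEdge[]): string[] {
  const called = new Set(edges.map((e) => e.callee));
  return symbols.filter((s) => s.exported && !called.has(s.name)).map((s) => s.name);
}

// Forward BFS from an entry point, bounded by depth and per-node branching.
function traceFlow(
  entry: string,
  edges: CallEdge[],
  maxDepth = 10,
  maxBranching = 4,
): string[] {
  const visited = new Set<string>([entry]);
  const order: string[] = [entry];
  let frontier: string[] = [entry];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      // Assumption: follow the highest-confidence callees first.
      const callees = edges
        .filter((e) => e.caller === node)
        .sort((a, b) => b.confidence - a.confidence)
        .slice(0, maxBranching);
      for (const e of callees) {
        if (visited.has(e.callee)) continue; // skip cycles and repeats
        visited.add(e.callee);
        order.push(e.callee);
        next.push(e.callee);
      }
    }
    frontier = next;
  }
  return order;
}

const fns: FnSymbol[] = [
  { name: "handleLogin", exported: true },
  { name: "validate", exported: false },
  { name: "createSession", exported: true },
  { name: "saveSession", exported: false },
];

const calls: CallEdge[] = [
  { caller: "handleLogin", callee: "validate", confidence: 1.0 },
  { caller: "handleLogin", callee: "createSession", confidence: 0.9 },
  { caller: "createSession", callee: "saveSession", confidence: 1.0 },
  { caller: "saveSession", callee: "createSession", confidence: 0.7 },
];

for (const entry of findEntryPoints(fns, calls)) {
  console.log(traceFlow(entry, calls).join(" -> "));
}
```

Both limits bound the work per entry point: depth caps how far a flow extends, and the branching limit caps how many callees are followed from any one symbol.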
+
+ ---
+
+ ## 3. Key Requirements from Specs
+
+ ### Symbol Graph (symbol-graph/spec.md)
+ | ID | Requirement | Testable Scenario |
+ |----|-------------|-------------------|
+ | SG-1 | Tree-sitter extracts symbols from TS/JS/Python | Index a TS file, verify symbols extracted |
+ | SG-2 | Unsupported languages fall back to regex | Index a .go file, verify no crash |
+ | SG-3 | Tree-sitter failure degrades gracefully | Simulate load failure, verify regex works |
+ | SG-4 | CALLS edges created with confidence >= 0.7 | Function A calls B, verify edge exists |
+ | SG-5 | EXTENDS edges have confidence 1.0 | Class A extends B, verify edge |
+ | SG-6 | IMPLEMENTS edges have confidence 1.0 | Class implements interface, verify edge |
+ | SG-7 | Incremental indexing skips unchanged files | Re-index unchanged file, verify skip |
+ | SG-8 | Changed files are re-parsed | Modify file, verify symbols updated |
+ | SG-9 | Deleted files have symbols removed | Delete file, verify cleanup |
+ | SG-10 | Code symbols coexist with infra symbols | File with both, verify both extracted |
+
+ ### Impact Analysis (impact-analysis/spec.md)
+ | ID | Requirement | Testable Scenario |
+ |----|-------------|-------------------|
+ | IA-1 | Upstream impact returns callers by depth | Query upstream, verify depth grouping |
+ | IA-2 | Downstream impact returns callees | Query downstream, verify results |
+ | IA-3 | maxDepth limits traversal | Set maxDepth=2, verify limit |
+ | IA-4 | minConfidence filters edges | Set minConfidence=0.8, verify filter |
+ | IA-5 | Risk assessment included | Verify LOW/MEDIUM/HIGH/CRITICAL |
+ | IA-6 | Affected flows listed | Symbol in flow, verify flow listed |
+
+ ### Context Tool (context-tool/spec.md)
+ | ID | Requirement | Testable Scenario |
+ |----|-------------|-------------------|
+ | CT-1 | 360-degree view returned | Query symbol, verify all sections |
+ | CT-2 | Ambiguous names return disambiguation | Query "handle", verify list |
+ | CT-3 | Infrastructure connections shown | Symbol uses redis, verify shown |
+ | CT-4 | Not found returns clear message | Query nonexistent, verify message |
+ | CT-5 | file_path disambiguates | Same name in 2 files, use file_path |
+
+ ### Flow Detection (flow-detection/spec.md)
+ | ID | Requirement | Testable Scenario |
+ |----|-------------|-------------------|
+ | FD-1 | Entry points detected | Exported function, verify entry |
+ | FD-2 | BFS traces forward | Entry to terminal, verify path |
+ | FD-3 | Max depth limits trace | Set depth=10, verify limit |
+ | FD-4 | Branching limit applied | >4 callees, verify top 4 followed |
+ | FD-5 | Flows labeled heuristically | handleLogin->createSession, verify label |
+ | FD-6 | Flows classified by community | Same cluster = intra, verify |
+ | FD-7 | detect_changes maps git diff | Modified function, verify affected flow |
+ | FD-8 | No changes returns empty | Clean repo, verify empty result |
+ | FD-9 | Non-symbol files listed | Config changed, verify listed |
+
+ ### Search Pipeline (search-pipeline/spec.md)
+ | ID | Requirement | Testable Scenario |
+ |----|-------------|-------------------|
+ | SP-1 | Search enriched with symbol metadata | Search file with symbols, verify enrichment |
+ | SP-2 | Files without symbols not enriched | Search markdown, verify no enrichment |
+ | SP-3 | No symbol graph = backward compatible | Tree-sitter disabled, verify search works |
+
+ ---
+
+ ## 4. Source Files Involved
+
+ | File | Purpose |
+ |------|---------|
+ | `src/treesitter.ts` | Tree-sitter AST parsing, symbol extraction |
+ | `src/symbol-graph.ts` | Symbol graph queries, impact/context tools |
+ | `src/flow-detection.ts` | BFS flow detection, entry point identification |
+ | `src/graph.ts` | Graph utilities, clusterSymbols function |
+ | `src/codebase.ts` | Codebase indexing, incremental updates |
+ | `src/server.ts` | MCP tool registration (context, impact, detect_changes) |
+ | `src/search.ts` | Search pipeline with symbol enrichment |
+ | `src/types.ts` | Type definitions for symbols, edges, flows |
+ | `src/store.ts` | SQLite storage for code_symbols, symbol_edges |
+
+ ---
+
+ ## 5. Test Files Involved
+
+ | File | Coverage Area |
+ |------|---------------|
+ | `test/treesitter.test.ts` | Tree-sitter parsing, symbol extraction |
+ | `test/symbol-graph.test.ts` | Symbol graph queries, edge creation |
+ | `test/symbol-clustering.test.ts` | Louvain clustering on symbol graph |
+ | `test/flow-detection.test.ts` | BFS flow detection, entry points |
+ | `test/mcp-tools-symbol.test.ts` | MCP tools: context, impact, detect_changes |
+ | `test/search-enrichment.test.ts` | Search result enrichment |
+
+ ---
+
+ ## 6. Test Environment
+
+ - **Runtime:** Node.js (ESM)
+ - **Test Framework:** Vitest
+ - **Database:** better-sqlite3 (WAL mode)
+ - **Parser:** Tree-sitter native bindings
+ - **Languages:** TypeScript, JavaScript, Python
+
+ ---
+
+ ## 7. Pre-existing Test Status
+
+ - **Total tests:** 726
+ - **Passing:** 725
+ - **Failing:** 1 (pre-existing in watcher.test.ts, unrelated to this feature)
+
+ ---
+
+ ## 8. Output Directory
+
+ ```
+ /Users/tamlh/workspaces/self/AI/Tools/nano-brain/ai/test-case/rri-t/nano-brain-gitnexus-enhancements/
+ ├── 01-prepare.md (this file)
+ ├── 02-discover.md (persona interviews)
+ ├── 03-structure.md (Q-A-R-P-T test cases)
+ ├── 04-execute.md (test execution results)
+ ├── 05-analyze.md (coverage analysis)
+ └── summary.md (final verdict)
+ ```
@@ -0,0 +1,195 @@
+ # RRI-T Phase 2: DISCOVER
+
+ **Feature:** nano-brain-gitnexus-enhancements
+ **Date:** 2026-03-06
+ **Methodology:** Reverse Requirements Interview - Testing
+
+ ---
+
+ ## Persona Adaptation for Backend Library
+
+ Since nano-brain is a backend TypeScript library (MCP server + CLI), not a web app, the personas are adapted:
+
+ | Standard Persona | Adapted For Backend Library |
+ |------------------|----------------------------|
+ | End User | AI Agent (Cursor/Claude Code/OpenCode) using MCP tools |
+ | Business Analyst | Developer integrating nano-brain into their workflow |
+ | QA Destroyer | Adversarial tester finding edge cases and failure modes |
+ | DevOps Tester | Ops engineer concerned with performance and reliability |
+ | Security Auditor | Security reviewer checking for vulnerabilities |
+
+ ---
+
+ ## Persona 1: AI Agent (End User)
+
+ *"I'm an AI coding assistant using nano-brain's MCP tools to understand codebases and help developers."*
+
+ ### Questions (17)
+
+ 1. When I call the `context` tool with a function name, do I get a clear, structured response I can parse?
+ 2. If multiple symbols match my query (e.g., "validate"), do I get a disambiguation list with enough info to choose?
+ 3. What happens if I query a symbol that doesn't exist? Do I get a helpful error message?
+ 4. When I call `impact` with direction "upstream", do I understand which symbols will break if I change the target?
+ 5. Does the impact response clearly show the risk level (LOW/MEDIUM/HIGH/CRITICAL)?
+ 6. Can I filter impact results by confidence to avoid false positives?
+ 7. When I call `detect_changes`, do I get a clear mapping of git changes to affected flows?
+ 8. If there are no uncommitted changes, does `detect_changes` return a clear "no changes" message?
+ 9. Does the `context` response include infrastructure connections (Redis, MySQL) so I understand side effects?
+ 10. Can I use `file_path` to disambiguate when I know which file I'm asking about?
+ 11. Are execution flows labeled in a human-readable way I can present to the developer?
+ 12. Does the search enrichment help me prioritize results by showing flow participation?
+ 13. If Tree-sitter fails, do the tools still work with degraded functionality?
+ 14. Are confidence scores included in edge data so I can assess reliability?
+ 15. Does the `impact` tool show which execution flows are affected by a change?
+ 16. Can I query symbols by kind (function, class, method) to narrow results?
+ 17. Are error messages actionable, telling me what to do next?
+
+ ---
+
+ ## Persona 2: Developer Integrating nano-brain (Business Analyst)
+
+ *"I'm integrating nano-brain into my development workflow and need reliable, consistent APIs."*
+
+ ### Questions (16)
+
+ 1. Are the MCP tool input/output schemas documented and stable?
+ 2. Does incremental indexing correctly update only changed files?
+ 3. If I delete a file, are its symbols and edges properly cleaned up?
+ 4. Can I rely on the symbol graph being consistent after partial indexing?
+ 5. Are CALLS edges created with appropriate confidence scores?
+ 6. Do EXTENDS and IMPLEMENTS edges have confidence 1.0 as specified?
+ 7. Is the `code_symbols` table schema backward compatible with future updates?
+ 8. Can I query the symbol graph directly via SQL if needed?
+ 9. Does the search API remain backward compatible when symbol graph is unavailable?
+ 10. Are flow labels deterministic (same input = same label)?
+ 11. Does the `detect_changes` tool work correctly with staged vs unstaged changes?
+ 12. Can I configure max depth and branching limits for flow detection?
+ 13. Are symbol IDs stable across re-indexing (for caching purposes)?
+ 14. Does the system handle circular dependencies without infinite loops?
+ 15. Are edge types (CALLS, IMPORTS, EXTENDS, IMPLEMENTS) correctly distinguished?
+ 16. Can I trust the risk assessment algorithm for production use?
+
+ ---
+
+ ## Persona 3: QA Destroyer (Adversarial Tester)
+
+ *"I break things. I find the edge cases that crash systems and corrupt data."*
+
+ ### Questions (18)
+
+ 1. What happens if I index a file with syntax errors?
+ 2. What if a function name contains SQL injection characters like `'; DROP TABLE`?
+ 3. What if a file path contains path traversal sequences like `../../../etc/passwd`?
+ 4. What happens with a 100MB TypeScript file with 10,000 functions?
+ 5. What if I have circular call dependencies (A calls B calls C calls A)?
+ 6. What if I index an empty repository with no source files?
+ 7. What if Tree-sitter native bindings fail to load at runtime?
+ 8. What happens if SQLite database is corrupted mid-indexing?
+ 9. What if I run concurrent indexing operations on the same workspace?
+ 10. What if a file has no exports but only internal functions?
+ 11. What if a class extends a class from node_modules (external)?
+ 12. What if I query `impact` with maxDepth=1000?
+ 13. What if I query `context` with an empty string as the symbol name?
+ 14. What happens if git is not installed when calling `detect_changes`?
+ 15. What if the workspace has 50,000 files?
+ 16. What if a Python file uses dynamic imports (`__import__`)?
+ 17. What if a TypeScript file uses `eval()` to call functions?
+ 18. What if I delete the SQLite database while the server is running?
+
+ ---
+
+ ## Persona 4: DevOps Tester (Ops Engineer)
+
+ *"I care about performance, reliability, and operational characteristics."*
+
+ ### Questions (15)
+
+ 1. How much memory does indexing a 10,000-file codebase consume?
+ 2. What's the indexing speed (files/second) for TypeScript files?
+ 3. Does SQLite WAL mode handle concurrent reads during indexing?
+ 4. What happens if disk space runs out during indexing?
+ 5. How long does server startup take with a large symbol graph?
+ 6. Are Tree-sitter native bindings compatible with all Node.js versions?
+ 7. What's the query latency for `impact` on a graph with 50,000 symbols?
+ 8. Does the system recover gracefully from SQLite lock timeouts?
+ 9. Can I monitor indexing progress programmatically?
+ 10. What's the database size growth rate per 1,000 symbols?
+ 11. Does incremental indexing actually skip unchanged files (verified by timing)?
+ 12. Are there any memory leaks during long-running indexing sessions?
+ 13. What's the CPU usage pattern during BFS flow detection?
+ 14. Can the system handle being killed and restarted mid-indexing?
+ 15. Are there any file descriptor leaks with Tree-sitter parsing?
+
+ ---
+
+ ## Persona 5: Security Auditor
+
+ *"I look for vulnerabilities that could be exploited in production."*
+
+ ### Questions (16)
+
+ 1. Can SQL injection occur via symbol names in queries?
+ 2. Can path traversal occur via `file_path` parameter in `context` tool?
+ 3. Can command injection occur in `detect_changes` git operations?
+ 4. Is data properly isolated between different project workspaces?
+ 5. Are there any secrets or credentials logged during indexing?
+ 6. Can a malicious file cause arbitrary code execution via Tree-sitter?
+ 7. Are SQLite queries parameterized to prevent injection?
+ 8. Can an attacker craft a file that causes denial of service?
+ 9. Is the MCP transport (stdio) secure against injection?
+ 10. Are file paths validated before being used in queries?
+ 11. Can symbol names be used to inject malicious content into responses?
+ 12. Is there any risk of information leakage across workspaces?
+ 13. Are git commands executed with proper escaping?
+ 14. Can a crafted Python file exploit the Tree-sitter parser?
+ 15. Are there any TOCTOU (time-of-check-time-of-use) vulnerabilities?
+ 16. Is the confidence scoring algorithm resistant to manipulation?
+
+ ---
+
+ ## Question Summary
+
+ | Persona | Question Count |
+ |---------|----------------|
+ | AI Agent (End User) | 17 |
+ | Developer (Business Analyst) | 16 |
+ | QA Destroyer | 18 |
+ | DevOps Tester | 15 |
+ | Security Auditor | 16 |
+ | **Total** | **82** |
+
+ ---
+
+ ## Key Themes Identified
+
+ ### Theme 1: Tool Response Quality
+ - Clear, structured responses
+ - Actionable error messages
+ - Disambiguation UX
+ - Human-readable labels
+
+ ### Theme 2: Data Integrity
+ - Incremental indexing correctness
+ - Edge confidence accuracy
+ - Flow consistency
+ - Cleanup on file deletion
+
+ ### Theme 3: Edge Cases
+ - Malformed inputs
+ - Huge codebases
+ - Circular dependencies
+ - Empty repositories
+ - Concurrent operations
+
+ ### Theme 4: Performance
+ - Memory usage
+ - Indexing speed
+ - Query latency
+ - Startup time
+
+ ### Theme 5: Security
+ - SQL injection
+ - Path traversal
+ - Command injection
+ - Project isolation
+ - Data leakage