@codragraph/cli 1.6.3 → 2.0.0

Files changed (89)
  1. package/README.md +50 -16
  2. package/dist/cli/ai-context.js +2 -2
  3. package/dist/cli/analyze.d.ts +22 -0
  4. package/dist/cli/analyze.js +111 -8
  5. package/dist/cli/compress-stats.d.ts +29 -0
  6. package/dist/cli/compress-stats.js +97 -0
  7. package/dist/cli/graphstore.d.ts +6 -2
  8. package/dist/cli/graphstore.js +24 -2
  9. package/dist/cli/index.js +17 -6
  10. package/dist/cli/profile-heap.d.ts +35 -0
  11. package/dist/cli/profile-heap.js +126 -0
  12. package/dist/cli/setup.d.ts +13 -0
  13. package/dist/cli/setup.js +75 -29
  14. package/dist/cli/skill-gen.d.ts +14 -2
  15. package/dist/cli/skill-gen.js +53 -20
  16. package/dist/cli/tool.js +4 -0
  17. package/dist/config/ignore-service.js +1 -1
  18. package/dist/core/embeddings/embedding-pipeline.js +24 -7
  19. package/dist/core/group/bridge-db.js +111 -24
  20. package/dist/core/group/extractors/grpc-patterns/proto.js +1 -12
  21. package/dist/core/ingestion/call-processor.js +2 -2
  22. package/dist/core/ingestion/cobol/cobol-preprocessor.js +1 -1
  23. package/dist/core/ingestion/cobol/jcl-parser.d.ts +1 -1
  24. package/dist/core/ingestion/cobol/jcl-parser.js +1 -1
  25. package/dist/core/ingestion/cobol-processor.d.ts +1 -1
  26. package/dist/core/ingestion/cobol-processor.js +1 -1
  27. package/dist/core/ingestion/heritage-extractors/generic.js +1 -1
  28. package/dist/core/ingestion/heritage-processor.js +1 -1
  29. package/dist/core/ingestion/import-processor.js +1 -1
  30. package/dist/core/ingestion/mro-processor.js +1 -1
  31. package/dist/core/ingestion/parsing-processor.js +1 -1
  32. package/dist/core/ingestion/type-extractors/c-cpp.js +1 -1
  33. package/dist/core/ingestion/type-extractors/python.js +1 -1
  34. package/dist/core/ingestion/type-extractors/shared.js +0 -3
  35. package/dist/core/lbug/content-read.d.ts +46 -0
  36. package/dist/core/lbug/content-read.js +64 -0
  37. package/dist/core/lbug/csv-generator.d.ts +2 -6
  38. package/dist/core/lbug/csv-generator.js +45 -12
  39. package/dist/core/lbug/lbug-adapter.d.ts +4 -1
  40. package/dist/core/lbug/lbug-adapter.js +157 -25
  41. package/dist/core/lbug/pool-adapter.js +51 -44
  42. package/dist/core/lbug/schema.d.ts +7 -7
  43. package/dist/core/lbug/schema.js +18 -0
  44. package/dist/core/run-analyze.d.ts +13 -0
  45. package/dist/core/run-analyze.js +91 -4
  46. package/dist/core/search/bm25-index.js +153 -12
  47. package/dist/core/wiki/generator.js +4 -4
  48. package/dist/mcp/local/local-backend.js +22 -5
  49. package/dist/mcp/resources.js +2 -3
  50. package/dist/server/api.js +4 -3
  51. package/dist/storage/repo-manager.d.ts +39 -0
  52. package/dist/storage/repo-manager.js +19 -0
  53. package/hooks/claude/codragraph-hook.cjs +108 -5
  54. package/hooks/claude/pre-tool-use.sh +6 -1
  55. package/package.json +4 -4
  56. package/scripts/build-tree-sitter-proto.cjs +15 -3
  57. package/scripts/patch-tree-sitter-swift.cjs +17 -4
  58. package/skills/codragraph-api-surface.md +110 -0
  59. package/skills/codragraph-cli.md +5 -5
  60. package/skills/codragraph-config-audit.md +146 -0
  61. package/skills/codragraph-cross-repo-impact.md +135 -0
  62. package/skills/codragraph-data-lineage.md +137 -0
  63. package/skills/codragraph-dead-code.md +119 -0
  64. package/skills/codragraph-debugging.md +1 -1
  65. package/skills/codragraph-exploring.md +1 -1
  66. package/skills/codragraph-gh-actions-debug.md +162 -0
  67. package/skills/codragraph-gh-issue-workflow.md +178 -0
  68. package/skills/codragraph-gh-pr-workflow.md +176 -0
  69. package/skills/codragraph-gh-release-workflow.md +187 -0
  70. package/skills/codragraph-git-bisect.md +176 -0
  71. package/skills/codragraph-git-force-push.md +147 -0
  72. package/skills/codragraph-git-history-rewrite.md +174 -0
  73. package/skills/codragraph-git-rebase-vs-merge.md +138 -0
  74. package/skills/codragraph-git-recovery.md +181 -0
  75. package/skills/codragraph-git-worktree.md +145 -0
  76. package/skills/codragraph-guide.md +1 -1
  77. package/skills/codragraph-impact-analysis.md +1 -1
  78. package/skills/codragraph-migration-tracking.md +130 -0
  79. package/skills/codragraph-notebook-context.md +136 -0
  80. package/skills/codragraph-observability-coverage.md +125 -0
  81. package/skills/codragraph-onboarding.md +129 -0
  82. package/skills/codragraph-perf-hotspots.md +132 -0
  83. package/skills/codragraph-pr-review.md +1 -1
  84. package/skills/codragraph-project-switcher.md +116 -0
  85. package/skills/codragraph-refactoring.md +1 -1
  86. package/skills/codragraph-security-audit.md +144 -0
  87. package/skills/codragraph-sql-tracing.md +122 -0
  88. package/skills/codragraph-supply-chain-audit.md +153 -0
  89. package/skills/codragraph-test-coverage.md +97 -0
@@ -0,0 +1,122 @@
1
+ ---
2
+ name: codragraph-sql-tracing
3
+ description: "Use when finding where SQL queries are constructed in code, tracing which functions execute a given query, auditing query patterns, or finding the call sites of a stored procedure. Examples: \"where is this SELECT defined\", \"who calls this query\", \"find all SQL in the auth module\", \"trace this stored procedure call\""
4
+ ---
5
+
6
+ # SQL Query Tracing with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "Where is the query for `<table>` constructed?"
11
+ - "Which functions execute `<sql snippet>`?"
12
+ - "Find all SQL string literals in `<area>`."
13
+ - Auditing query patterns (N+1, missing indexes, etc.) before optimization
14
+ - Tracing a stored-procedure call from production logs back to the caller
15
+
16
+ ## Why CodraGraph helps here
17
+
18
+ SQL queries are usually plain string literals — the language-server knows
19
+ nothing about them, but `query` over the index plus `cypher` against the
20
+ graph can find them, and `context` / `impact` can trace who calls the
21
+ enclosing function. The same approach applies to raw SQL strings, query
22
+ builders (Knex, SQLAlchemy, Diesel), and ORM-generated queries, as long as
23
+ the source contains a recognizable identifier to search for.
24
+
25
+ ## Workflow
26
+
27
+ ```
28
+ 1. codragraph_query({query: "SELECT FROM <table>"})
29
+ OR
30
+ codragraph_query({query: "<unique substring of the SQL>"})
31
+ → symbols whose body contains the SQL fragment
32
+
33
+ 2. For each candidate function:
34
+ codragraph_context({name: "<function>"})
35
+ → see who calls it (the actual query-execution site)
36
+
37
+ 3. codragraph_impact({target: "<function>", direction: "upstream"})
38
+ → blast radius: every caller of the SQL-executing function
39
+
40
+ 4. For ORM / query-builder users:
41
+ codragraph_query({query: "<table>.find OR <table>.where"})
42
+ → find ORM calls that compile to SQL touching the table
43
+
44
+ 5. Categorize: reads vs writes, hot paths vs cold paths
45
+ ```
46
+
47
+ > CodraGraph indexes the *source* of the query, not the *executed* SQL.
48
+ > Dynamically built queries (`f"SELECT * FROM {table}"`) require both a
49
+ > string-literal search AND a check on the variable's binding via `context`.
50
+
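To illustrate why a dynamically built query slips past a plain string-literal search, here is a minimal sketch using Python's standard `ast` module. It is not how CodraGraph indexes code; `find_sql_strings` and `SAMPLE` are hypothetical names introduced only for this illustration. The f-string is reported with its interpolation replaced by `{?}`, so the table name never appears in the literal text and has to be recovered from the variable's binding.

```python
import ast

# Statement prefixes treated as "looks like SQL" (illustrative, not exhaustive).
SQL_PREFIXES = ("SELECT ", "INSERT ", "UPDATE ", "DELETE ")


def find_sql_strings(source):
    """Return sorted (lineno, text, is_dynamic) for SQL-looking string literals."""
    nodes = list(ast.walk(ast.parse(source)))
    # Literal pieces of an f-string are reported through their JoinedStr parent.
    inside_fstring = {
        id(part)
        for node in nodes
        if isinstance(node, ast.JoinedStr)
        for part in node.values
    }
    hits = []
    for node in nodes:
        if isinstance(node, ast.JoinedStr):
            # Keep the literal pieces; mark every interpolation with a placeholder.
            text = "".join(
                part.value if isinstance(part, ast.Constant) else "{?}"
                for part in node.values
            )
            dynamic = any(isinstance(p, ast.FormattedValue) for p in node.values)
        elif (
            isinstance(node, ast.Constant)
            and isinstance(node.value, str)
            and id(node) not in inside_fstring
        ):
            text, dynamic = node.value, False
        else:
            continue
        if text.lstrip().upper().startswith(SQL_PREFIXES):
            hits.append((node.lineno, text, dynamic))
    return sorted(hits)


SAMPLE = '''
def by_user(db, uid):
    return db.query("SELECT * FROM audit_log WHERE user_id = ?", uid)

def dump(db, table):
    return db.query(f"SELECT * FROM {table}")
'''

hits = find_sql_strings(SAMPLE)
for lineno, text, dynamic in hits:
    print(lineno, "dynamic" if dynamic else "static", text)
```

A search for `FROM audit_log` would match only the first function here; the second needs the follow-up check on what `table` is bound to at each call site.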
51
+ ## Checklist
52
+
53
+ ```
54
+ - [ ] query for the SQL substring or table name
55
+ - [ ] context on each candidate function
56
+ - [ ] impact upstream on the executor → who calls it from the application
57
+ - [ ] Filter for ORM call patterns separately if relevant
58
+ - [ ] Group results: read paths vs write paths, hot vs cold paths
59
+ - [ ] Flag any query with no test reach (cross-ref with codragraph-test-coverage)
60
+ ```
61
+
62
+ ## SQL Patterns to Search
63
+
64
+ | Pattern | Search query |
65
+ | --- | --- |
66
+ | Raw string SELECT | `"SELECT FROM users"` (with table name) |
67
+ | Query builder (Knex) | `.from('users').where` |
68
+ | ORM (SQLAlchemy) | `session.query(User)` |
69
+ | Stored procedure call | `CALL sp_name` or `EXEC sp_name` |
70
+ | Migration | `CREATE TABLE` / `ALTER TABLE` |
71
+
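When grouping search results, it can help to label each hit with the row of the table above that it matches. The following is a rough sketch under stated assumptions: the regular expressions are illustrative approximations of the table's patterns, and `classify` is a hypothetical helper, not part of CodraGraph.

```python
import re

# First match wins, so the more specific patterns come first.
# These regexes are illustrative only and will miss many real-world variants.
PATTERNS = [
    ("migration", re.compile(r"\b(CREATE|ALTER)\s+TABLE\b", re.I)),
    ("stored procedure call", re.compile(r"\b(CALL|EXEC)\s+\w+", re.I)),
    ("query builder (Knex)", re.compile(r"\.from\(\s*['\"]\w+['\"]\s*\)")),
    ("ORM (SQLAlchemy)", re.compile(r"\bsession\.query\(")),
    ("raw string SELECT", re.compile(r"\bSELECT\b.*\bFROM\s+\w+", re.I | re.S)),
]


def classify(snippet):
    """Return the label of the first matching pattern, or None if nothing matches."""
    for label, pattern in PATTERNS:
        if pattern.search(snippet):
            return label
    return None


examples = [
    'db.query("SELECT id FROM users")',
    "knex.select('*').from('users').where({ id })",
    "session.query(User).filter_by(id=1)",
    "CALL sp_name()",
    "ALTER TABLE users ADD COLUMN age INT",
]
for snippet in examples:
    print(classify(snippet), "<-", snippet)
```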
72
+ ## Example: "Find every place we read from the `audit_log` table"
73
+
74
+ ```
75
+ 1. codragraph_query({query: "FROM audit_log"})
76
+ → 5 symbols:
77
+ - getAuditByUser (src/admin/audit.ts)
78
+ - getAuditByAction (src/admin/audit.ts)
79
+ - exportAuditCSV (src/admin/audit.ts)
80
+ - countRecentAuditEntries (src/dashboard/health.ts)
81
+ - debugAuditDump (src/scripts/debug.ts)
82
+
83
+ 2. codragraph_query({query: "auditLog.find OR auditLog.where"})
84
+ → 0 (we're using raw SQL, not an ORM)
85
+
86
+ 3. codragraph_context({name: "getAuditByUser"})
87
+ → callers: AuditController.show, AuditController.export
88
+ → callees: db.query, parseAuditRow
89
+
90
+ 4. codragraph_impact({target: "getAuditByUser", direction: "upstream"})
91
+ → d=1: AuditController.show (admin UI), AuditController.export (CSV download)
92
+ → d=2: AdminRouter (HTTP layer)
93
+
94
+ Findings: 5 read sites in 3 files. 3 of the 5 go through AuditController. The
95
+ debug script reads with no auth, so flag it for review.
96
+
97
+ 5. Cross-reference with codragraph-test-coverage:
98
+ - getAuditByUser: covered by AuditController.test
99
+ - debugAuditDump: NO TESTS, NO AUTH ⚠
100
+ ```
101
+
102
+ ## Output Format
103
+
104
+ ```markdown
105
+ ## SQL Trace: `audit_log` (reads)
106
+
107
+ ### Read sites
108
+ | Function | File | Caller chain | Test reach |
109
+ |----------|------|--------------|------------|
110
+ | getAuditByUser | src/admin/audit.ts | Controller → Router | ✓ |
111
+ | getAuditByAction | src/admin/audit.ts | Controller → Router | ✓ |
112
+ | exportAuditCSV | src/admin/audit.ts | Controller → Router | ✗ |
113
+ | countRecentAuditEntries | src/dashboard/health.ts | HealthCheck → cron | ✗ |
114
+ | debugAuditDump | src/scripts/debug.ts | (no auth?) ⚠ | ✗ |
115
+
116
+ ### Hot path
117
+ `AuditController` is the gateway for 3 of 5 read sites. Optimizations
118
+ that route through it benefit the most.
119
+
120
+ ### Risks
121
+ - `debugAuditDump` has no auth and no tests. Investigate.
122
+ ```
@@ -0,0 +1,153 @@
1
+ ---
2
+ name: codragraph-supply-chain-audit
3
+ description: "Use to audit external dependency risk — which packages does the codebase actually use, where are the deepest integration points (a single dep used across N modules is high-blast-radius), what would break if a dep was removed. Examples: \"audit dependencies\", \"supply chain risk\", \"what would break if I drop X\", \"deep dep usage\", \"vendor in or replace\""
4
+ ---
5
+
6
+ # Supply Chain / Dependency Audit with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "Which deps are used the most?"
11
+ - "What would actually break if I removed `<package>`?"
12
+ - "Find deps imported in only 1-2 places (cheap to replace)."
13
+ - "Which deps are deeply integrated and risky to change?"
14
+ - "Pre-vendor audit: should I vendor `<dep>` to lock the version?"
15
+ - "Post-CVE: which of our code paths reach this vulnerable function?"
16
+
17
+ ## Why CodraGraph helps here
18
+
19
+ `npm ls` / `pip list` / `go.sum` tell you which packages are *installed*.
20
+ CodraGraph tells you where they're *imported and called* — which is the
21
+ real measure of how integrated a dep is. Pair with `impact` for "what
22
+ breaks if this dep changes" and you have a much sharper risk picture
23
+ than pure dependency-tree analysis.
24
+
25
+ ## Workflow
26
+
27
+ ```
28
+ 1. List external dependency import sites:
29
+ codragraph_cypher({query: `
30
+ MATCH (n)-[:IMPORTS]->(dep)
31
+ WHERE dep.isExternal = true OR dep.id STARTS WITH 'package:'
32
+ RETURN dep.id, dep.name, count(DISTINCT n) AS importers
33
+ ORDER BY importers DESC
34
+ `})
35
+ → per-package import counts
36
+
37
+ 2. For each high-import package, find which symbols call into it:
38
+ codragraph_cypher({query: `
39
+ MATCH (caller)-[:CALLS]->(target)
40
+ WHERE target.filePath CONTAINS 'node_modules/<pkg>'
41
+ OR target.id STARTS WITH 'package:<pkg>:'
42
+ RETURN caller.name, caller.filePath, count(*) AS calls
43
+ ORDER BY calls DESC
44
+ `})
45
+ → per-package call sites; high-call-site = deeply integrated
46
+
47
+ 3. Identify shallow deps (cheap-to-replace):
48
+ - 1-2 importers → easy to swap out
49
+ - usage limited to one cluster → bounded blast radius
50
+
51
+ 4. Identify deep deps (high replacement cost):
52
+ - imported across many clusters → cross-cutting
53
+ - referenced in critical processes → request-path criticality
54
+
55
+ 5. CVE-specific: given a vulnerable function name from the advisory,
56
+ find which of YOUR symbols reach it:
57
+ codragraph_impact({target: "<vulnerableFn>", direction: "upstream"})
58
+ → only paths through this function are actually exposed
59
+ ```
60
+
61
+ ## Risk categorization
62
+
63
+ | Category | Signal | Action |
64
+ |---|---|---|
65
+ | **Trivial** | 1-2 importers, one cluster | Easy to replace; consider native impl |
66
+ | **Local** | Many importers in 1-2 clusters | Wrap behind a façade for future swap |
67
+ | **Cross-cutting** | Importers spread across most clusters | Treat as core infra; vendor if licensing allows |
68
+ | **Critical** | In every request-path process | Pin version, monitor CVEs, plan migration before EOL |
69
+ | **Vulnerable now** | Reachable code path to a known-CVE function | Patch / replace ASAP |
70
+
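The categories above can be expressed as a small decision function. This is only a sketch: the table does not define "many" or "most", so the thresholds below (at most 2 importers in one cluster for trivial, more than half of all clusters for cross-cutting) are assumptions, and `DepUsage` and `risk_tier` are hypothetical names. The two boolean flags stand in for the process and CVE reachability checks, which would come from `impact` rather than from import counts.

```python
from dataclasses import dataclass


@dataclass
class DepUsage:
    name: str
    importers: int
    clusters_touched: int
    in_every_request_path: bool = False  # assumed to come from a process check
    reaches_known_cve: bool = False      # assumed to come from an impact check


def risk_tier(dep, total_clusters):
    """Map usage signals to a tier; thresholds are assumptions, not tool output."""
    if dep.reaches_known_cve:
        return "vulnerable now"
    if dep.in_every_request_path:
        return "critical"
    if dep.importers <= 2 and dep.clusters_touched <= 1:
        return "trivial"
    if dep.clusters_touched * 2 > total_clusters:  # "most clusters" read as > 50%
        return "cross-cutting"
    return "local"


TOTAL_CLUSTERS = 8
deps = [
    DepUsage("react", 142, 4, in_every_request_path=True),
    DepUsage("lodash", 47, 8),
    DepUsage("date-fns", 12, 3),
    DepUsage("md5", 1, 1),
]
tiers = {dep.name: risk_tier(dep, TOTAL_CLUSTERS) for dep in deps}
print(tiers)
```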
71
+ ## CVE response workflow
72
+
73
+ ```
74
+ 1. CVE published: "<package> <vulnerable-fn> allows X"
75
+
76
+ 2. Quick check: do we even reach the vulnerable function?
77
+ codragraph_query({query: "<vulnerable-fn>"})
78
+ → list of call sites in YOUR code
79
+
80
+ 3. For each call site, walk upstream:
81
+ codragraph_impact({target: "<our-caller>", direction: "upstream"})
82
+ → which entry points / processes reach the vulnerable code
83
+
84
+ 4. If 0 reachable paths → not exposed. Patch when convenient.
85
+ 5. If reachable from request-path → patch ASAP, communicate scope.
86
+ 6. If reachable from internal-only paths → patch in the next maintenance window.
87
+ ```
88
+
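The triage in steps 3 to 6 amounts to an upstream walk over the call graph followed by a check against known entry points. The sketch below runs on a toy in-memory graph rather than on CodraGraph's index; `upstream`, `triage`, and all node names are hypothetical and chosen only for illustration.

```python
from collections import deque


def upstream(calls, target):
    """Return every transitive caller of `target`.

    `calls` maps each caller to the list of functions it calls.
    """
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


def triage(calls, vulnerable_fn, request_entry_points):
    """Apply the three outcomes of the workflow to one vulnerable function."""
    reached = upstream(calls, vulnerable_fn)
    if not reached:
        return "not exposed: patch when convenient"
    if reached & request_entry_points:
        return "request-path: patch ASAP"
    return "internal-only: patch in the next maintenance window"


# Toy call graph (hypothetical names).
CALLS = {
    "AdminRouter": ["renderReport"],
    "renderReport": ["yaml.load"],
    "nightlyJob": ["parseConfig"],
    "parseConfig": ["toml.parse"],
}
ENTRY_POINTS = {"AdminRouter"}

for fn in ("yaml.load", "toml.parse", "xml.parse"):
    print(fn, "->", triage(CALLS, fn, ENTRY_POINTS))
```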
89
+ ## Checklist
90
+
91
+ ```
92
+ - [ ] Cypher: per-package importer + caller counts
93
+ - [ ] Categorize each top-N package: trivial / local / cross-cutting / critical
94
+ - [ ] For deep deps: identify a façade boundary if one exists / propose one
95
+ - [ ] CVE list cross-check: any current advisories against our deps?
96
+ - [ ] For each open advisory: codragraph_impact on the vulnerable function
97
+ - [ ] Output: ranked deps with risk tier + replaceability cost
98
+ ```
99
+
100
+ ## Example: "Should I replace lodash with native?"
101
+
102
+ ```
103
+ 1. codragraph_cypher for lodash imports:
104
+ → 47 importers across all 8 clusters
105
+ → Cross-cutting category.
106
+
107
+ 2. codragraph_cypher for lodash calls:
108
+ → top-called: _.get (78), _.isEmpty (54), _.cloneDeep (32),
109
+ _.debounce (12), 25 other functions ≤ 5 calls each
110
+
111
+ 3. Replacement cost analysis:
112
+ - _.get → optional chaining `?.` (47 sites)
113
+ - _.isEmpty → custom helper (3 lines)
114
+ - _.cloneDeep → structuredClone() (Node 17+)
115
+ - _.debounce → keep (lodash version is well-tuned, native lacks)
116
+ - 25 long-tail functions → ~75 individual replacement decisions
117
+
118
+ 4. Decision matrix:
119
+ - High-frequency simple ones: easy native swap (saves 70%% of bundle hit)
120
+ - _.debounce: keep lodash for this one (or use a 50-line single-purpose dep)
121
+ - Long-tail: case-by-case during routine refactors
122
+
123
+ 5. Migration plan:
124
+ - Phase 1: replace _.get / _.isEmpty / _.cloneDeep (top 3 = ~200 call sites)
125
+ - Phase 2: revisit long-tail in next major refactor
126
+ - Phase 3: keep lodash only if _.debounce's replacement isn't ready
127
+ ```
128
+
129
+ ## Output Format
130
+
131
+ ```markdown
132
+ ## Supply Chain Audit: <scope>
133
+
134
+ ### Top deps by integration depth
135
+ | Package | Importers | Call sites | Clusters touched | Tier |
136
+ |---|--:|--:|--:|---|
137
+ | react | 142 | 380 | 4 | critical |
138
+ | lodash | 47 | 220 | 8 | cross-cutting |
139
+ | date-fns | 12 | 45 | 3 | local |
140
+ | classnames | 4 | 9 | 2 | local |
141
+ | md5 | 1 | 1 | 1 | trivial |
142
+
143
+ ### Replacement candidates
144
+ - `md5` — 1 call site, ~5 lines of native crypto. Trivial removal.
145
+ - `lodash` — replace top 3 functions for 70%% of usage; keep for `_.debounce`.
146
+
147
+ ### CVE exposure
148
+ - 0 active advisories matching code paths reachable from request handlers.
149
+
150
+ ### Recommended next step
151
+ 1. Drop `md5` (5-line PR).
152
+ 2. Phase-1 lodash slim-down (~200 sites; can be incremental).
153
+ ```
@@ -0,0 +1,97 @@
1
+ ---
2
+ name: codragraph-test-coverage
3
+ description: "Use when the user wants to find untested code paths, audit test coverage gaps, identify functions or execution flows that have no test reach, or assess whether a refactor needs new tests. Examples: \"what isn't tested\", \"test coverage gaps\", \"which flows have no tests\", \"do I need a test for X\""
4
+ ---
5
+
6
+ # Test Coverage Audit with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "What's not tested in this codebase?"
11
+ - "Which execution flows have no test coverage?"
12
+ - "Are there tests that cover X?"
13
+ - "Do I need to add a test for this function?"
14
+ - Auditing coverage before a release / freeze
15
+ - Justifying a "needs more tests" review comment with evidence
16
+
17
+ ## Why CodraGraph helps here
18
+
19
+ Line-coverage tools (jest --coverage, c8, pytest-cov) tell you *which lines
20
+ ran*. They don't tell you *which call paths an agent / engineer should be
21
+ worried about*. CodraGraph's `impact({includeTests: true})` walks the call
22
+ graph and lists every test that transitively reaches a symbol — direct or
23
+ indirect — so you can prove a flow is exercised, or prove it isn't.
24
+
25
+ ## Workflow
26
+
27
+ ```
28
+ 1. codragraph_query({query: "<area you care about>"}) → find candidate symbols
29
+ 2. For each non-trivial symbol:
30
+ codragraph_impact({target: "<symbol>", direction: "upstream", includeTests: true})
31
+ → returns: callers + tests that transitively reach this symbol
32
+ 3. READ codragraph://repo/{name}/processes
33
+ → list every execution flow
34
+ 4. For each flow, codragraph_impact on the flow's entry point with includeTests: true
35
+ → flows with 0 tests = real gaps
36
+ 5. Summarize: which symbols and flows have no test reach
37
+ ```
38
+
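Asking "which tests reach this symbol" for every symbol is equivalent to walking forward from the tests and seeing what is never reached. The sketch below shows that forward view on a toy call graph. It assumes a plain caller-to-callees mapping and is not CodraGraph's implementation; `untested_symbols` and the edge list are hypothetical, loosely modeled on an auth module.

```python
def untested_symbols(calls, tests):
    """Return sorted symbols that no test reaches, directly or transitively.

    `calls` maps each caller to the list of functions it calls;
    `tests` is the set of nodes that count as tests.
    """
    covered = set()
    stack = list(tests)
    while stack:
        node = stack.pop()
        for callee in calls.get(node, ()):
            if callee not in covered:
                covered.add(callee)
                stack.append(callee)
    symbols = set(calls) | {c for callees in calls.values() for c in callees}
    return sorted(symbols - covered - set(tests))


# Toy call graph (hypothetical edges).
CALLS = {
    "hashPassword.test.ts": ["hashPassword"],
    "auth.integration.test.ts": ["signup"],
    "signup": ["hashPassword"],
    "requireAuth": ["validateSession"],
    "refreshToken": ["validateSession"],
    "resetPassword": ["validateSession"],
}
TESTS = {"hashPassword.test.ts", "auth.integration.test.ts"}

gaps_before = untested_symbols(CALLS, TESTS)

# Adding one integration test at the flow's entry point closes two gaps at once.
with_new_test = dict(CALLS, **{"reset.integration.test.ts": ["resetPassword"]})
gaps_after = untested_symbols(with_new_test, TESTS | {"reset.integration.test.ts"})

print(gaps_before)
print(gaps_after)
```

The second call shows why an entry-point test tends to be the cheapest fix: everything downstream of the entry point becomes covered in one step.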
39
+ > If "Index is stale" → run `npx @codragraph/cli analyze` first.
40
+
41
+ ## Checklist
42
+
43
+ ```
44
+ - [ ] List candidate symbols (query) or take from a recent diff (detect_changes)
45
+ - [ ] Run impact({includeTests: true}) on each
46
+ - [ ] Note symbols where the test list is empty
47
+ - [ ] Cross-reference with processes — flows with no test coverage are the real risk
48
+ - [ ] Report: gaps + the cheapest test that would close each gap (entry point)
49
+ - [ ] If reviewing a PR: limit to symbols changed in the PR
50
+ ```
51
+
52
+ ## Example: "What's not tested in the auth area?"
53
+
54
+ ```
55
+ 1. codragraph_query({query: "auth validation login session"})
56
+ → 14 symbols across 6 files
57
+
58
+ 2. codragraph_impact({target: "validateSession", direction: "upstream", includeTests: true})
59
+ → callers: requireAuth, refreshToken
60
+ → tests: 0 (no test reaches validateSession)
61
+ ⚠ GAP
62
+
63
+ 3. codragraph_impact({target: "hashPassword", direction: "upstream", includeTests: true})
64
+ → tests: hashPassword.test.ts [direct], auth.integration.test.ts [via signup]
65
+ ✓ covered
66
+
67
+ 4. READ codragraph://repo/CodraGraph/processes
68
+ → 3 auth flows: SignupFlow, LoginFlow, PasswordResetFlow
69
+
70
+ 5. codragraph_impact for each flow's entry point with includeTests: true
71
+ → SignupFlow: covered (3 tests)
72
+ → LoginFlow: covered (2 tests)
73
+ → PasswordResetFlow: NO TESTS ⚠
74
+
75
+ Findings:
76
+ - 2 untested gaps: validateSession (symbol), PasswordResetFlow (entire flow)
77
+ - Cheapest fix: one integration test calling resetPassword end-to-end would
78
+ close both gaps simultaneously (it's the entry point for the flow that
79
+ also calls validateSession).
80
+ ```
81
+
82
+ ## Output Format
83
+
84
+ ```markdown
85
+ ## Test Coverage Audit: <scope>
86
+
87
+ ### Gaps (no test reach)
88
+ - **[symbol]** `validateSession` — called by 2 functions, no transitive test
89
+ - **[flow]** `PasswordResetFlow` — entire flow untested
90
+
91
+ ### Covered (for reference)
92
+ - `hashPassword` — direct + integration test
93
+
94
+ ### Suggested fixes
95
+ 1. Add integration test for `resetPassword` → covers PasswordResetFlow + validateSession
96
+ 2. ...
97
+ ```