@codragraph/cli 1.6.4 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (61)
  1. package/README.md +34 -0
  2. package/dist/cli/analyze.d.ts +22 -0
  3. package/dist/cli/analyze.js +107 -4
  4. package/dist/cli/compress-stats.d.ts +29 -0
  5. package/dist/cli/compress-stats.js +97 -0
  6. package/dist/cli/graphstore.d.ts +6 -2
  7. package/dist/cli/graphstore.js +24 -2
  8. package/dist/cli/index.js +16 -2
  9. package/dist/cli/profile-heap.d.ts +35 -0
  10. package/dist/cli/profile-heap.js +126 -0
  11. package/dist/cli/setup.d.ts +13 -0
  12. package/dist/cli/setup.js +22 -11
  13. package/dist/cli/skill-gen.d.ts +14 -2
  14. package/dist/cli/skill-gen.js +52 -19
  15. package/dist/cli/tool.js +4 -0
  16. package/dist/core/embeddings/embedding-pipeline.js +24 -7
  17. package/dist/core/group/bridge-db.js +111 -24
  18. package/dist/core/lbug/content-read.d.ts +46 -0
  19. package/dist/core/lbug/content-read.js +64 -0
  20. package/dist/core/lbug/csv-generator.d.ts +2 -6
  21. package/dist/core/lbug/csv-generator.js +45 -12
  22. package/dist/core/lbug/lbug-adapter.d.ts +4 -1
  23. package/dist/core/lbug/lbug-adapter.js +153 -21
  24. package/dist/core/lbug/schema.d.ts +7 -7
  25. package/dist/core/lbug/schema.js +18 -0
  26. package/dist/core/run-analyze.d.ts +13 -0
  27. package/dist/core/run-analyze.js +91 -4
  28. package/dist/core/search/bm25-index.js +67 -15
  29. package/dist/mcp/local/local-backend.js +22 -5
  30. package/dist/server/api.js +4 -3
  31. package/dist/storage/repo-manager.d.ts +39 -0
  32. package/dist/storage/repo-manager.js +19 -0
  33. package/hooks/claude/codragraph-hook.cjs +95 -2
  34. package/package.json +4 -4
  35. package/scripts/build-tree-sitter-proto.cjs +15 -3
  36. package/scripts/patch-tree-sitter-swift.cjs +17 -4
  37. package/skills/codragraph-api-surface.md +110 -0
  38. package/skills/codragraph-config-audit.md +146 -0
  39. package/skills/codragraph-cross-repo-impact.md +135 -0
  40. package/skills/codragraph-data-lineage.md +137 -0
  41. package/skills/codragraph-dead-code.md +119 -0
  42. package/skills/codragraph-gh-actions-debug.md +162 -0
  43. package/skills/codragraph-gh-issue-workflow.md +178 -0
  44. package/skills/codragraph-gh-pr-workflow.md +176 -0
  45. package/skills/codragraph-gh-release-workflow.md +187 -0
  46. package/skills/codragraph-git-bisect.md +176 -0
  47. package/skills/codragraph-git-force-push.md +147 -0
  48. package/skills/codragraph-git-history-rewrite.md +174 -0
  49. package/skills/codragraph-git-rebase-vs-merge.md +138 -0
  50. package/skills/codragraph-git-recovery.md +181 -0
  51. package/skills/codragraph-git-worktree.md +145 -0
  52. package/skills/codragraph-migration-tracking.md +130 -0
  53. package/skills/codragraph-notebook-context.md +136 -0
  54. package/skills/codragraph-observability-coverage.md +125 -0
  55. package/skills/codragraph-onboarding.md +129 -0
  56. package/skills/codragraph-perf-hotspots.md +132 -0
  57. package/skills/codragraph-project-switcher.md +116 -0
  58. package/skills/codragraph-security-audit.md +144 -0
  59. package/skills/codragraph-sql-tracing.md +122 -0
  60. package/skills/codragraph-supply-chain-audit.md +153 -0
  61. package/skills/codragraph-test-coverage.md +97 -0
@@ -0,0 +1,110 @@
1
+ ---
2
+ name: codragraph-api-surface
3
+ description: "Use when the user wants to enumerate the public API of a package or codebase, understand what's exported, audit breaking change risk, or compare API shapes across versions. Examples: \"what's our public API\", \"list exports\", \"API surface\", \"what would break if I remove X\", \"document the public interface\""
4
+ ---
5
+
6
+ # API Surface Audit with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "What's the public API of this package?"
11
+ - "List every exported function / class / type"
12
+ - "What would break if I remove or rename `<symbol>`?"
13
+ - Pre-release API freeze audit
14
+ - Generating API documentation from the graph
15
+ - Comparing API surface across versions (with `codragraph diff --semantic`)
16
+
17
+ ## Why CodraGraph helps here
18
+
19
+ Reading every `index.ts` / `__init__.py` / `mod.rs` by hand misses re-exports
20
+ and framework-magic exports (Next.js page routes, decorators, registered
21
+ plugins). CodraGraph's `isExported` property is computed by language-aware
22
+ export detection — covers default exports, named re-exports, `__all__`,
23
+ `pub use`, etc., consistently across all 16 supported languages.
24
+
25
+ ## Workflow
26
+
27
+ ```
28
+ 1. codragraph_cypher({query: `
29
+ MATCH (n) WHERE n.isExported = true
30
+ RETURN labels(n)[0] AS table, n.name, n.filePath, n.id
31
+ ORDER BY table, n.filePath, n.name
32
+ `})
33
+ → every exported symbol, grouped by table
34
+
35
+ 2. For each high-traffic export:
36
+ codragraph_impact({target: "<name>", direction: "upstream"})
37
+ → who depends on it (within this repo)
38
+
39
+ 3. For cross-repo audits (multi-repo group):
40
+ codragraph_impact({repo: "@<group>", target: "<name>", direction: "upstream"})
41
+ → blast radius across every group member
42
+
43
+ 4. Compare across versions:
44
+ codragraph diff <baseline> <head> --semantic --json
45
+ → addedAPIs / removedAPIs / classifiedModifications
46
+ → produces a versioned changelog of what your public surface gained / lost
47
+ ```
48
+
49
+ > Pair with the `codragraph-pr-review` skill when reviewing a PR that touches
50
+ > exported symbols — the impact-across-group check is the difference between
51
+ > "breaks our consumers" and "internal refactor."
52
+
53
+ ## Checklist
54
+
55
+ ```
56
+ - [ ] Cypher query for n.isExported = true
57
+ - [ ] Group by file or by community (Leiden cluster)
58
+ - [ ] For each non-trivial export, run impact upstream
59
+ - [ ] If the package is in a group, run impact with repo: "@group" too
60
+ - [ ] Compare with previous release: codragraph diff <prev-tag> HEAD --semantic
61
+ - [ ] Flag exports with no documented consumers — candidates for visibility
62
+ reduction (export → internal)
63
+ ```
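The last checklist item (flagging exports with no documented consumers) could be sketched as follows. This is a minimal illustration, not CodraGraph's implementation: the `ExportRow` record and its fields are hypothetical stand-ins for data you would assemble by hand from the `isExported` query and the upstream `impact` results, and the caller counts are made-up sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportRow:
    """Hypothetical record built from the cypher + impact results."""
    name: str
    file_path: str
    internal_callers: int  # callers inside the same package
    external_callers: int  # callers in other packages or group members


def over_exported(rows: list[ExportRow]) -> list[ExportRow]:
    """Exports with no external consumers are candidates for visibility reduction."""
    return [row for row in rows if row.external_callers == 0]


# Illustrative sample values only.
rows = [
    ExportRow("createClient", "src/index.ts", internal_callers=10, external_callers=4),
    ExportRow("validate", "src/utils.ts", internal_callers=1, external_callers=0),
]
candidates = over_exported(rows)
```

An export that lands in `candidates` still deserves a manual look, since framework conventions and dynamic lookups may not show up as callers.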
64
+
65
+ ## Example: "What's our public API?"
66
+
67
+ ```
68
+ 1. codragraph_cypher({
69
+ query: `MATCH (n) WHERE n.isExported = true
70
+ RETURN labels(n)[0] AS table, n.name, n.filePath`
71
+ })
72
+ → 47 exports: 22 Function, 12 Class, 8 Interface, 5 Constant
73
+
74
+ 2. Top-level functions:
75
+ - createClient (src/index.ts) ← 14 callers
76
+ - fetchUser (src/api.ts) ← 6 callers
77
+ - validate (src/utils.ts) ← 1 internal caller only ⚠ over-exported
78
+
79
+ 3. codragraph_impact({target: "validate", direction: "upstream"})
80
+ → d=1: only formatPayload (same package). No external consumers.
81
+ → Recommend: drop the `export` keyword. Internal-only.
82
+
83
+ 4. Compare with v1.5.3 release:
84
+ codragraph diff v1.5.3 HEAD --semantic
85
+ → +3 added APIs, -1 removed API (mappings.toCamelCase), ~2 modified
86
+ → A removed API requires a SemVer major bump.
87
+ ```
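The step from a semantic diff to a version decision can be sketched in Python. The keys `addedAPIs`, `removedAPIs`, and `classifiedModifications` are named in the workflow, but their exact shape is not shown in this diff. The sketch assumes each is a list and that each modification carries a hypothetical `breaking` boolean. The sample JSON is hand-written, not real CLI output.

```python
import json


def suggest_bump(diff: dict) -> str:
    """Map a semantic-diff summary to a SemVer bump (assumed JSON shape)."""
    removed = diff.get("removedAPIs", [])
    modified = diff.get("classifiedModifications", [])
    added = diff.get("addedAPIs", [])
    # Any removal, or any modification flagged as breaking, forces a major bump.
    if removed or any(m.get("breaking", False) for m in modified):
        return "major"
    # Purely additive changes are a minor bump.
    if added:
        return "minor"
    return "patch"


# Hand-written sample using the symbol names from this skill's example.
sample = json.loads("""
{
  "addedAPIs": ["subscribe", "unsubscribe", "EventBus"],
  "removedAPIs": ["toCamelCase"],
  "classifiedModifications": [
    {"name": "createClient", "breaking": true},
    {"name": "fetchUser", "breaking": true}
  ]
}
""")
bump = suggest_bump(sample)
```

Treating every removal as breaking follows the SemVer rule for public APIs; a project with a narrower definition of "public" may want a less strict mapping.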
88
+
89
+ ## Output Format
90
+
91
+ ```markdown
92
+ ## API Surface: <package>
93
+
94
+ ### Exports (47 total)
95
+ | Symbol | Table | File | Callers (internal) | Notes |
96
+ |--------|-------|------|-------------------:|-------|
97
+ | createClient | Function | src/index.ts | 14 | core entry |
98
+ | validate | Function | src/utils.ts | 1 | over-exported, suggest internal |
99
+ | ...
100
+
101
+ ### Diff vs <previous-tag>
102
+ - **Added (3):** `subscribe`, `unsubscribe`, `EventBus`
103
+ - **Removed (1):** `toCamelCase` ⚠ SemVer major
104
+ - **Modified (2):** `createClient` (param 3→4), `fetchUser` (return type)
105
+
106
+ ### Recommendations
107
+ - Reduce visibility on 4 over-exported internals
108
+ - Document the 3 new APIs in the release notes
109
+ - The removed `toCamelCase` requires a major version bump
110
+ ```
@@ -0,0 +1,146 @@
1
+ ---
2
+ name: codragraph-config-audit
3
+ description: "Use to audit how environment variables, config files, and feature flags are read and used across the codebase — find unused config, missing defaults, undocumented env vars, secrets read into logs. Examples: \"audit env vars\", \"unused config\", \"who reads FOO_BAR env\", \"feature flag usage\", \"config sprawl\""
4
+ ---
5
+
6
+ # Configuration Audit with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "Which env vars do we actually read?"
11
+ - "Which env vars are read but never set in deploy configs?"
12
+ - "Find the unused feature flags I can delete."
13
+ - "Who reads `STRIPE_SECRET_KEY`?"
14
+ - "Is `<config>` ever logged or sent to telemetry?"
15
+ - "Audit config sprawl before consolidating."
16
+
17
+ ## Why CodraGraph helps here
18
+
19
+ Configuration enters your code through a small set of helpers:
20
+ `process.env.X`, `os.getenv("X")`, `config.get("foo.bar")`,
21
+ `featureFlags.isEnabled("flag")`. CodraGraph indexes the calls to those
22
+ helpers and the literal arguments — so a `query` for the helper plus a
23
+ `context` of each call site produces a complete picture of which keys
24
+ are read where.
25
+
26
+ ## Workflow
27
+
28
+ ```
29
+ 1. Identify the config helpers (per-language patterns):
30
+ codragraph_query({query: "process.env getenv ConfigService featureFlags"})
31
+ → list of config-read helpers
32
+
33
+ 2. For each helper, find every call site and its key argument:
34
+ codragraph_cypher({query: `
35
+ MATCH (caller)-[:CALLS]->(helper {name: 'getenv'})
36
+ RETURN caller.name, caller.filePath
37
+ `})
38
+ → For richer key-extraction, read the bodies via context:
39
+ codragraph_context({name: "<caller>", content: true})
40
+ → look for the literal string passed to getenv()
41
+
42
+ 3. Cross-check with deploy configs:
43
+ - Read .env / .env.example / docker-compose.yml / k8s ConfigMaps
44
+ - Build the SET of keys actually defined
45
+ - For each key your code reads that isn't defined: undocumented env var
46
+ - For each key that is defined but never read: dead config; delete it
47
+
48
+ 4. Feature-flag specific audit:
49
+ codragraph_query({query: "featureFlags.isEnabled flag.evaluate"})
50
+ → For each flag-read site: codragraph_impact upstream
51
+ → Flags with no callers can be removed
52
+ → Flags with one branch always returning true / false are stale
53
+
54
+ 5. Secret-leakage check:
55
+ codragraph_query({query: "STRIPE_SECRET DATABASE_URL API_KEY"})
56
+ → For each match: codragraph_context to confirm the value is not
57
+ piped to logger / tracer / metrics
58
+ ```
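Steps 2 and 3 amount to extracting literal keys and comparing two sets. A minimal Python sketch, assuming you already have the caller source text (for example from `codragraph_context`) and a `.env`-style file as strings. The regexes cover only two of the helpers named above, and the sample inputs are invented for illustration.

```python
import re

# Patterns for two of the helpers named above; extend per language as needed.
ENV_READ_PATTERNS = [
    re.compile(r"process\.env\.([A-Z_][A-Z0-9_]*)"),
    re.compile(r"os\.getenv\(\s*['\"]([A-Z_][A-Z0-9_]*)['\"]"),
]


def keys_read(source: str) -> set[str]:
    """Collect env keys that appear as literals in a caller's source text."""
    found: set[str] = set()
    for pattern in ENV_READ_PATTERNS:
        found.update(pattern.findall(source))
    return found


def keys_defined(env_file_text: str) -> set[str]:
    """Parse KEY=value lines from a .env-style file, skipping comments."""
    keys: set[str] = set()
    for line in env_file_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].removeprefix("export ").strip())
    return keys


# Invented sample inputs.
code = '''
const url = process.env.DATABASE_URL;
flag = os.getenv("EXPERIMENTAL_FOO", "false")
'''
env_example = """
# sample .env.example
DATABASE_URL=postgres://localhost/dev
LEGACY_TOKEN=changeme
"""

read_set = keys_read(code)
defined_set = keys_defined(env_example)
undocumented = read_set - defined_set  # read in code, missing from config
dead = defined_set - read_set          # present in config, never read
```

Keys built dynamically (string concatenation, computed property access) will not match these patterns, so treat an empty `undocumented` set as a lower bound rather than proof.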
59
+
60
+ ## Audit dimensions
61
+
62
+ | Dimension | Question | CodraGraph approach |
63
+ |---|---|---|
64
+ | **Used** | Is this env var read anywhere? | `query` for the literal key |
65
+ | **Documented** | Is the key in `.env.example` / docs? | grep deploy files; subtract from used set |
66
+ | **Defaulted** | Does the read have a default? | `context` shows the surrounding code |
67
+ | **Validated** | Is the value parsed / type-checked? | `context` for `parseInt` / `URL` / Zod schema in the caller |
68
+ | **Logged** | Does the value flow to telemetry? | `impact` downstream from the read site → check telemetry helpers |
69
+ | **Stale flag** | Is the flag still toggled in production? | combine with deploy-config check |
70
+
71
+ ## Feature flag lifecycle audit
72
+
73
+ ```
74
+ codragraph_cypher({query: `
75
+ MATCH (caller)-[:CALLS]->(ff {name: 'isEnabled'})
76
+ RETURN caller.name, caller.filePath, count(*) AS uses
77
+ ORDER BY uses DESC
78
+ `})
79
+ → for each call site, codragraph_context to extract the flag NAME literal
80
+
81
+ # Then:
82
+ - Flag name read by 0 callers → remove
83
+ - Flag name with both branches identical → stale (always-true or always-false)
84
+ - Flag still wired in code, but config has it pinned `true` for >90 days → graduate
85
+ ```
86
+
87
+ ## Checklist
88
+
89
+ ```
90
+ - [ ] Listed config helpers (env / config / featureFlag readers)
91
+ - [ ] Built the read-set: { key: [ call sites ] }
92
+ - [ ] Built the defined-set from deploy configs
93
+ - [ ] Diff: undocumented (in code, not in config) + dead (in config, not in code)
94
+ - [ ] Spot-check defaults / validation / secret leakage on critical keys
95
+ - [ ] Feature-flag staleness check
96
+ - [ ] Output: read map + recommended deletions / required deploy changes
97
+ ```
98
+
99
+ ## Example: "Audit our feature flags"
100
+
101
+ ```
102
+ 1. codragraph_query({query: "featureFlags.isEnabled"})
103
+ → 47 call sites in 23 files
104
+
105
+ 2. For each call site, extract the flag string (codragraph_context):
106
+ - 'new_checkout' (12 sites)
107
+ - 'experimental_search' (4 sites)
108
+ - 'use_new_pricing' (8 sites)
109
+ - 'kill_legacy_admin' (1 site)
110
+ - 'canary_v3' (0 sites; defined in code but dead)
111
+
112
+ 3. Cross-check deploys:
113
+ - 'new_checkout' set to TRUE for 100% prod since 2026-01 (graduate it)
114
+ - 'experimental_search' set to TRUE for 5% prod (active experiment, keep)
115
+ - 'use_new_pricing' set to TRUE for 100% prod since 2026-03 (graduate)
116
+ - 'kill_legacy_admin' set to TRUE for 100% prod since 2026-02 (graduate)
117
+ - 'canary_v3' not configured anywhere (truly dead)
118
+
119
+ 4. Findings:
120
+ - DELETE: 'canary_v3' (dead code, no callers, no config)
121
+ - GRADUATE: 'new_checkout', 'use_new_pricing', 'kill_legacy_admin' →
122
+ remove the flag check; keep the new behavior unconditionally
123
+ - KEEP: 'experimental_search'
124
+ - Codebase loses: 21 call sites, 1 unused flag definition
125
+ ```
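The DELETE / GRADUATE / KEEP decision in this example can be written down as a small rule. The sketch below is one possible encoding, not part of CodraGraph: `FlagState` is a hypothetical record you would fill in from the call-site counts and your deploy config, the day-of-month values and `today` are placeholders (the example gives months only), and the 90-day threshold comes from the lifecycle rules above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FlagState:
    """Hypothetical per-flag summary: code reads plus deploy config."""
    name: str
    call_sites: int
    rollout_percent: Optional[int]  # None means not configured anywhere
    pinned_since: Optional[date]    # when the current rollout value was set


def classify(flag: FlagState, today: date, graduate_after_days: int = 90) -> str:
    # No readers in code: the flag definition itself is dead.
    if flag.call_sites == 0:
        return "DELETE"
    # Fully rolled out for longer than the threshold: remove the check.
    if (
        flag.rollout_percent == 100
        and flag.pinned_since is not None
        and (today - flag.pinned_since).days > graduate_after_days
    ):
        return "GRADUATE"
    return "KEEP"


today = date(2026, 7, 1)  # placeholder "now"
flags = [
    FlagState("new_checkout", 12, 100, date(2026, 1, 1)),
    FlagState("experimental_search", 4, 5, None),
    FlagState("use_new_pricing", 8, 100, date(2026, 3, 1)),
    FlagState("kill_legacy_admin", 1, 100, date(2026, 2, 1)),
    FlagState("canary_v3", 0, None, None),
]
results = {flag.name: classify(flag, today) for flag in flags}
```

The rule deliberately ignores flags pinned to 0%; those usually need a human decision about whether the guarded code path should be deleted instead.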
126
+
127
+ ## Output Format
128
+
129
+ ```markdown
130
+ ## Config Audit: <scope>
131
+
132
+ ### Env vars / config keys
133
+ | Key | Read sites | Defined? | Default? | Validated? | Notes |
134
+ |---|--:|---|---|---|---|
135
+ | DATABASE_URL | 4 | ✓ | ✗ | ✗ | add Zod parse |
136
+ | EXPERIMENTAL_FOO | 1 | ✗ | ✓ ('false') | ✓ | undocumented; either document or delete |
137
+ | ... | ... | ... | ... | ... | ... |
138
+
139
+ ### Feature flags
140
+ - DELETE (no callers): canary_v3, legacy_dashboard_b
141
+ - GRADUATE (100% production for >90 days): new_checkout, kill_legacy_admin
142
+ - KEEP (active experiment): experimental_search, ai_summarize_v2
143
+
144
+ ### Secret-leak check
145
+ - 0 paths from secret reads to logger/metrics/tracer found ✓
146
+ ```
@@ -0,0 +1,135 @@
1
+ ---
2
+ name: codragraph-cross-repo-impact
3
+ description: "Use when assessing the blast radius of a change that crosses repository boundaries — a shared library used by multiple services, a contract / protobuf / OpenAPI schema consumed by N consumers, a microservices change. Examples: \"what services consume X\", \"cross-repo blast radius\", \"will this break the consumers\", \"who depends on this contract\""
4
+ ---
5
+
6
+ # Cross-Repo Impact Analysis with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "What other repos consume `<symbol>` from this one?"
11
+ - "If I change this gRPC method / OpenAPI route / protobuf message, what breaks?"
12
+ - "Cross-repo blast radius for `<change>`"
13
+ - Microservices architecture: assessing a contract change
14
+ - Shared-library author: deciding if a function is safe to remove
15
+
16
+ ## Why CodraGraph helps here
17
+
18
+ CodraGraph's **groups** (sets of related repos sharing a `group.yaml`)
19
+ maintain a Contract Registry — provider/consumer rows for every cross-repo
20
+ reference (gRPC service / method, OpenAPI route, protobuf message). The
21
+ group-mode `impact` walks both the local call graph AND the contract
22
+ bridges, so a single call returns the blast radius across every member
23
+ repo. You don't need to re-run impact in each consumer separately.
24
+
25
+ ## Workflow
26
+
27
+ ```
28
+ 1. Identify the group:
29
+ codragraph_group_list({})
30
+ → list of groups + their member repos
31
+
32
+ 2. Confirm the symbol exists in the producer repo:
33
+ codragraph_context({repo: "<producerRepo>", name: "<symbol>"})
34
+
35
+ 3. Run group-mode impact:
36
+ codragraph_impact({repo: "@<group>", target: "<symbol>", direction: "upstream"})
37
+ → d=1 callers spanning every group member
38
+ (in-repo callers + contract-bridge consumers)
39
+
40
+ 4. Inspect the Contract Registry to see provider/consumer rows directly:
41
+ READ codragraph://group/<groupName>/contracts
42
+ → list of contracts touching the symbol or its API
43
+
44
+ 5. Check group-status / staleness:
45
+ READ codragraph://group/<groupName>/status
46
+ → which member repos haven't been re-indexed recently
47
+ (stale members produce stale impact results)
48
+
49
+ 6. Surface the worst-case consumer:
50
+ any consumer not updated since the schema change = potential breakage
51
+ ```
52
+
53
+ > If any member repo's index is stale, group-mode impact may underreport.
54
+ > Re-analyze stale members before relying on the results.
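A staleness gate like the one described in this note could be sketched in Python. The format of the `status` resource is not shown in this diff, so the sketch assumes you have already extracted a last-indexed timestamp per member repo. The one-day threshold and the sample timestamps are arbitrary illustrations.

```python
from datetime import datetime, timedelta


def stale_members(
    last_indexed: dict[str, datetime],
    now: datetime,
    max_age: timedelta = timedelta(days=1),  # arbitrary threshold; tune per team
) -> list[str]:
    """Return member repos whose index is older than max_age, oldest first."""
    stale = [
        (now - indexed_at, repo)
        for repo, indexed_at in last_indexed.items()
        if now - indexed_at > max_age
    ]
    return [repo for _, repo in sorted(stale, reverse=True)]


now = datetime(2026, 1, 31, 12, 0)  # placeholder "now"
last_indexed = {
    "web-app": now - timedelta(hours=2),
    "mobile-bff": now - timedelta(days=3),
    "admin-portal": now - timedelta(days=30),
}
to_reanalyze = stale_members(last_indexed, now)
```

Running group-mode `impact` only after `to_reanalyze` is empty keeps the d=1 list from silently missing recent callers.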
55
+
56
+ ## Checklist
57
+
58
+ ```
59
+ - [ ] group_list to confirm the group exists and the producer is a member
60
+ - [ ] context on the symbol in the producer repo
61
+ - [ ] Group-mode impact upstream
62
+ - [ ] Inspect Contract Registry for provider/consumer rows
63
+ - [ ] Check group/status for stale members; re-analyze if needed
64
+ - [ ] List affected consumer repos by impact depth
65
+ - [ ] Recommend coordinated PRs across consumers (if breaking)
66
+ ```
67
+
68
+ ## When to Use Which Tool
69
+
70
+ | Question | Tool |
71
+ | --- | --- |
72
+ | "Which repos are in my group?" | `group_list` |
73
+ | "What contracts cross between member A and member B?" | Contract Registry resource |
74
+ | "If I change this provider method, what breaks?" | Group-mode `impact` |
75
+ | "Are all consumers up to date with the latest provider commit?" | `group/<name>/status` resource |
76
+ | "What's the structural diff between last release and now in repo X?" | Per-repo `diff --semantic` |
77
+
78
+ ## Example: "Will renaming `getUserProfile` break my microservices?"
79
+
80
+ ```
81
+ 1. codragraph_group_list({})
82
+ → group "platform": [user-service, web-app, mobile-bff, admin-portal]
83
+
84
+ 2. codragraph_context({repo: "user-service", name: "getUserProfile"})
85
+ → exported gRPC method in user.proto, defined in user-service
86
+
87
+ 3. codragraph_impact({repo: "@platform", target: "getUserProfile", direction: "upstream"})
88
+ → d=1 callers (across the group):
89
+ - web-app/src/api/userClient.ts (CALLS via grpc-web)
90
+ - mobile-bff/internal/user.go (CALLS via grpc.NewClient)
91
+ - admin-portal/src/services/users.tsx (CALLS via grpc-web)
92
+ → 3 consumer repos depend on this method by exact name.
93
+
94
+ 4. READ codragraph://group/platform/contracts
95
+ → user.UserService.getUserProfile: provider=user-service,
96
+ consumers=[web-app, mobile-bff, admin-portal]
97
+
98
+ 5. READ codragraph://group/platform/status
99
+ → web-app last indexed 2 hours ago ✓
100
+ → mobile-bff last indexed 3 days ago ⚠ (might miss recent callers)
101
+ → admin-portal last indexed 1 month ago ⚠⚠ (re-analyze first!)
102
+
103
+ 6. Recommendation:
104
+ - HIGH-RISK rename. 3 consumer repos must change in lockstep.
105
+ - Re-index admin-portal before trusting the d=1 list.
106
+ - Coordinated PR sequence:
107
+ 1. Add new method (getUserProfileV2) in user-service
108
+ 2. Migrate web-app, mobile-bff, admin-portal to V2
109
+ 3. Remove getUserProfile in user-service after all consumers ship
110
+ - Alternative: keep both, deprecate old, drop in next major.
111
+ ```
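To turn the d=1 list into a per-repo PR plan, the callers can be grouped by consumer repo. The sketch below assumes caller paths are prefixed with the repo name, as they are written in this example; the actual `impact` output format is not shown in this diff, so the parsing is a guess.

```python
from collections import defaultdict


def callers_by_repo(caller_paths: list[str], producer: str) -> dict[str, list[str]]:
    """Group '<repo>/<path>' caller strings by repo, skipping the producer itself."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for path in caller_paths:
        repo, _, rest = path.partition("/")
        if repo != producer:
            grouped[repo].append(rest)
    return dict(grouped)


# Paths copied from the example above.
d1_callers = [
    "web-app/src/api/userClient.ts",
    "mobile-bff/internal/user.go",
    "admin-portal/src/services/users.tsx",
]
plan = callers_by_repo(d1_callers, producer="user-service")
```

Each key in `plan` corresponds to one migration PR in step 2 of the coordinated sequence.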
112
+
113
+ ## Output Format
114
+
115
+ ```markdown
116
+ ## Cross-Repo Impact: `<symbol>` in `<producer-repo>` (group `@<group>`)
117
+
118
+ ### Consumers (d=1)
119
+ | Repo | Caller | Path | Notes |
120
+ | --- | --- | --- | --- |
121
+ | web-app | userClient.ts | grpc-web | active |
122
+ | mobile-bff | internal/user.go | grpc native | active |
123
+ | admin-portal | services/users.tsx | grpc-web | last indexed 1mo ago ⚠ |
124
+
125
+ ### Contracts touching this symbol
126
+ - `user.UserService.getUserProfile` (provider: user-service)
127
+
128
+ ### Staleness
129
+ Re-analyze `admin-portal` before trusting these results.
130
+
131
+ ### Recommended migration sequence
132
+ 1. Add `getUserProfileV2` alongside the old method
133
+ 2. Migrate consumers to V2 (separate PRs per repo)
134
+ 3. Remove old method after all consumers ship
135
+ ```
@@ -0,0 +1,137 @@
1
+ ---
2
+ name: codragraph-data-lineage
3
+ description: "Use when tracing data flow through an ETL pipeline, finding where a column or table is read/written, mapping data dependencies in a notebook-heavy or data-engineering project. Examples: \"where does this data come from\", \"trace this column\", \"data lineage for X\", \"who reads from this table\", \"what's downstream of this query\""
4
+ ---
5
+
6
+ # Data Lineage with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "Where does the `user_events` table get written?"
11
+ - "Trace the data flow that produces `daily_revenue.csv`"
12
+ - "What's downstream of this Snowflake query?"
13
+ - "Which notebook cells transform this DataFrame?"
14
+ - Auditing a data pipeline before changing a schema
15
+ - Understanding an unfamiliar ETL project
16
+
17
+ ## Why CodraGraph helps here
18
+
19
+ Data pipelines often look like a graph of small functions: `extract_*`,
20
+ `transform_*`, `load_*`, `enrich_*`. Their connections are *function calls*
21
+ plus *string-literal table names* and *file paths*. CodraGraph already
22
+ captures the call graph; combining it with `query` over identifier strings
23
+ gives you data lineage at the *symbol* level, regardless of whether your
24
+ pipeline is in vanilla Python, Airflow, dbt, Pandas, PySpark, or Notebooks.
25
+
26
+ ## Workflow
27
+
28
+ ```
29
+ 1. codragraph_query({query: "<table_or_column_name>"})
30
+ → find every symbol that mentions the data identifier
31
+
32
+ 2. For each candidate symbol:
33
+ codragraph_context({name: "<symbol>"})
34
+ → see callers (who triggers this read/write) and callees (what it depends on)
35
+
36
+ 3. Walk the producer side (downstream → upstream):
37
+ codragraph_impact({target: "<load function>", direction: "upstream"})
38
+ → trace back to where the data originates
39
+
40
+ 4. Walk the consumer side (upstream → downstream):
41
+ codragraph_impact({target: "<extract function>", direction: "downstream"})
42
+ → trace forward to every transform / sink that depends on it
43
+
44
+ 5. READ codragraph://repo/{name}/process/<pipeline-flow>
45
+ → the canonical step-by-step flow if CodraGraph detected this as a process
46
+
47
+ 6. Build the lineage diagram: source → transform stages → sink
48
+ ```
49
+
50
+ > CodraGraph is graph-aware, not SQL-aware: it sees a string literal that
51
+ > *looks* like a table name, but doesn't parse SQL semantics. For mature SQL
52
+ > lineage tooling (column-level resolution), pair with the `codragraph-sql-tracing`
53
+ > skill and a SQL parser like sqlglot.
54
+
55
+ ## Checklist
56
+
57
+ ```
58
+ - [ ] query for the table/column/file identifier
59
+ - [ ] context on each candidate to map producers vs consumers
60
+ - [ ] impact upstream on the load/sink function
61
+ - [ ] impact downstream on the extract/source function
62
+ - [ ] Cross-reference with processes for canonical pipeline flows
63
+ - [ ] Render the lineage as: source → transform_1 → transform_2 → sink
64
+ ```
65
+
66
+ ## Identifier Patterns to Search
67
+
68
+ | Layer | Search hints |
69
+ | --- | --- |
70
+ | File-based source | filename, path fragments (`raw_events.parquet`) |
71
+ | Database table | bare table name + `FROM table_name` |
72
+ | Column / field | column name in conjunction with the table name |
73
+ | API endpoint | URL path or function name (`fetchUserEvents`) |
74
+ | Event topic / queue | topic name (`user.signup.v2`) |
75
+
76
+ ## Example: "Where does daily_revenue.csv come from?"
77
+
78
+ ```
79
+ 1. codragraph_query({query: "daily_revenue"})
80
+ → 4 symbols:
81
+ - write_daily_revenue (src/etl/daily.py)
82
+ - read_daily_revenue (src/dashboards/finance.py)
83
+ - DailyRevenueRow (src/schemas/types.py)
84
+ - daily_revenue_dag (airflow/dags/finance_etl.py)
85
+
86
+ 2. codragraph_context({name: "write_daily_revenue"})
87
+ → callers: daily_revenue_dag (Airflow task)
88
+ → callees: aggregate_orders, attach_currency_rates, format_csv_row
89
+
90
+ 3. codragraph_impact({target: "aggregate_orders", direction: "upstream"})
91
+ → reads from: orders_raw, returns_raw (both tables)
92
+
93
+ 4. codragraph_impact({target: "read_daily_revenue", direction: "downstream"})
94
+ → consumed by: finance_dashboard.render(), revenue_alerts.check()
95
+
96
+ 5. READ codragraph://repo/CodraGraph/process/DailyRevenueETL
97
+ → 6 steps:
98
+ fetch_orders → fetch_returns → aggregate_orders →
99
+ attach_currency_rates → format_csv_row → write_daily_revenue
100
+
101
+ Lineage:
102
+ orders_raw, returns_raw (DB tables)
103
+ ↓
104
+ fetch_orders + fetch_returns (extract)
105
+ ↓
106
+ aggregate_orders → attach_currency_rates → format_csv_row (transform)
107
+ ↓
108
+ daily_revenue.csv (sink)
109
+ ↓
110
+ finance_dashboard, revenue_alerts (consumers)
111
+ ```
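The walk in both directions is an ordinary breadth-first traversal over call edges. The Python sketch below is independent of CodraGraph: the `CALLS` map is hand-copied from the caller/callee lists in step 2 of this example, and a real run would populate it from `codragraph_context` results instead.

```python
from collections import deque

# caller -> callees, hand-copied from step 2 of the example (illustrative only).
CALLS = {
    "daily_revenue_dag": ["write_daily_revenue"],
    "write_daily_revenue": [
        "aggregate_orders",
        "attach_currency_rates",
        "format_csv_row",
    ],
}


def invert(graph: dict[str, list[str]]) -> dict[str, list[str]]:
    """Flip caller -> callees into callee -> callers."""
    inverted: dict[str, list[str]] = {}
    for caller, callees in graph.items():
        for callee in callees:
            inverted.setdefault(callee, []).append(caller)
    return inverted


def reachable(graph: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first list of nodes reachable from start (start excluded)."""
    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


transitive_callees = reachable(CALLS, "daily_revenue_dag")
transitive_callers = reachable(invert(CALLS), "format_csv_row")
```

Table and file names are not nodes in this map; they still have to be attached to the functions by searching for the string literals, as the workflow describes.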
112
+
113
+ ## Output Format
114
+
115
+ ```markdown
116
+ ## Data Lineage: <data-asset>
117
+
118
+ ### Sources
119
+ - `orders_raw` (DB)
120
+ - `returns_raw` (DB)
121
+
122
+ ### Pipeline (DailyRevenueETL flow, 6 steps)
123
+ 1. `fetch_orders` — reads `orders_raw`
124
+ 2. `fetch_returns` — reads `returns_raw`
125
+ 3. `aggregate_orders` — joins, sums by day
126
+ 4. `attach_currency_rates` — enriches with FX
127
+ 5. `format_csv_row` — schema-conforming serialization
128
+ 6. `write_daily_revenue` — writes `daily_revenue.csv`
129
+
130
+ ### Consumers
131
+ - `finance_dashboard` (renders chart)
132
+ - `revenue_alerts` (threshold checks)
133
+
134
+ ### Risk if `<schema/source>` changes
135
+ - 6 transform stages depend on it
136
+ - 2 consumers depend on the output schema
137
+ ```
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: codragraph-dead-code
3
+ description: "Use when the user wants to find unused code, orphan functions/classes, dead code, or symbols safe to delete. Examples: \"what's unused\", \"find dead code\", \"can I delete this function\", \"orphan symbols\", \"clean up unused exports\""
4
+ ---
5
+
6
+ # Dead Code Detection with CodraGraph
7
+
8
+ ## When to Use
9
+
10
+ - "What's unused in this codebase?"
11
+ - "Find dead code / orphan functions / unused exports"
12
+ - "Can I safely delete `<symbol>`?"
13
+ - Periodic cleanup before a release
14
+ - Identifying candidates for the next refactor
15
+
16
+ ## Why CodraGraph helps here
17
+
18
+ Linters can find unreachable code in *one file*. CodraGraph walks the
19
+ *whole-repo* call graph, including dynamic dispatch, framework entry
20
+ points, and exports — so it can tell you a symbol is genuinely unreachable
21
+ rather than just "the linter couldn't see the caller."
22
+
23
+ The trick is to combine three signals:
24
+
25
+ 1. **No incoming references** — `cypher` query for symbols with zero
26
+ `<-[:CALLS|REFERENCES]-` edges.
27
+ 2. **Not exported** — internal-only symbols are stronger candidates than
28
+ `isExported = true` ones (which may be public API).
29
+ 3. **Not in any process** — execution flows are the canonical "this is
30
+ used" signal; symbols outside every process are extra suspect.
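How the three signals combine can be sketched as a small scoring function in Python. The `Symbol` record and the confidence labels are hypothetical; they are not CodraGraph output, and the sample rows are illustrative (`legacyHelper` is an invented name).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical summary of one graph node."""
    name: str
    incoming_refs: int  # count of incoming CALLS / REFERENCES edges
    is_exported: bool
    in_process: bool    # appears in at least one execution flow


def dead_code_confidence(symbol: Symbol) -> str:
    # Any caller or any process membership means the symbol is in use.
    if symbol.incoming_refs > 0 or symbol.in_process:
        return "live"
    # Exported orphans may be public API, so they need a cross-repo check.
    return "review" if symbol.is_exported else "high"


symbols = [
    Symbol("formatLegacyDate", incoming_refs=0, is_exported=False, in_process=False),
    Symbol("createClient", incoming_refs=14, is_exported=True, in_process=True),
    Symbol("legacyHelper", incoming_refs=0, is_exported=True, in_process=False),
]
verdicts = {s.name: dead_code_confidence(s) for s in symbols}
```

A "high" verdict still only reflects what the graph can see; symbols reached through dynamic dispatch or environment-driven hooks need the manual checks listed under Pitfalls.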
31
+
32
+ ## Workflow
33
+
34
+ ```
35
+ 1. codragraph_cypher({query: `
36
+ MATCH (n)
37
+ WHERE NOT (n)<-[:CALLS|REFERENCES]-()
38
+ AND NOT (n.isExported = true)
39
+ RETURN n.id, n.name, labels(n)[0] AS label, n.filePath
40
+ LIMIT 200
41
+ `})
42
+ → list of orphan candidates
43
+
44
+ 2. For each candidate:
45
+ codragraph_context({name: "<candidate>"})
46
+ → confirm: 0 callers, 0 callees that matter, not in any process
47
+
48
+ 3. Cross-check against processes:
49
+ READ codragraph://repo/{name}/processes
50
+ → if the symbol appears in ANY process, it's not actually dead
51
+
52
+ 4. For exported orphans (potentially public API):
53
+ codragraph_impact({target: "<symbol>", direction: "upstream"})
54
+ → if d=1 has external callers (in another indexed repo group), keep it
55
+
56
+ 5. Group by file/cluster; prioritize the files with the most dead code
57
+ ```
58
+
59
+ > If "Index is stale" → run `npx @codragraph/cli analyze` first. Stale
60
+ > indexes produce false-positive dead-code reports because new callers
61
+ > aren't visible.
62
+
63
+ ## Checklist
64
+
65
+ ```
66
+ - [ ] Cypher query for orphan symbols (no incoming edges, not exported)
67
+ - [ ] context check on each candidate (confirm 0 callers)
68
+ - [ ] Cross-reference with processes (symbols in flows are not dead)
69
+ - [ ] For exported orphans, impact across groups (cross-repo callers?)
70
+ - [ ] Group findings by file → suggest which files can lose the most code
71
+ - [ ] Flag any candidate that's a framework convention (e.g., default export
72
+ of a Next.js page route) — those LOOK orphan but aren't
73
+ ```
74
+
75
+ ## Pitfalls
76
+
77
+ | Pitfall | What to do |
78
+ | --- | --- |
79
+ | Framework conventions (Next.js pages, Astro routes, Django URLs) | Check `isEntryPoint` on the node — these often score high |
80
+ | Test-only symbols | Filter `filePath CONTAINS '/test'` separately |
81
+ | Re-exported symbols | A re-export creates `REFERENCES` edges; a true orphan has none |
82
+ | Dynamic dispatch (factories, plugin systems) | Cross-check with `query` for the registration string |
83
+
84
+ ## Example: "Clean up unused code in src/utils/"
85
+
86
+ ```
87
+ 1. codragraph_cypher({
88
+ query: `MATCH (n) WHERE n.filePath STARTS WITH 'src/utils/'
89
+ AND NOT (n)<-[:CALLS|REFERENCES]-()
90
+ AND NOT (n.isExported = true)
91
+ RETURN n.id, n.name, n.filePath`
92
+ })
93
+ → 7 candidates
94
+
95
+ 2. codragraph_context({name: "formatLegacyDate"})
96
+ → 0 callers, 0 callees, not in any process. Truly dead.
97
+
98
+ 3. codragraph_context({name: "DEBUG_TIMER"})
99
+ → 0 callers but called dynamically via process.env injection.
100
+ → Keep it.
101
+
102
+ 4. Final: 6 of 7 candidates safe to delete. Total: 142 LoC across 4 files.
103
+ ```
104
+
105
+ ## Output Format
106
+
107
+ ```markdown
108
+ ## Dead Code Audit: <scope>
109
+
110
+ ### High-confidence (0 callers, 0 callees, not in any process)
111
+ - `formatLegacyDate` — `src/utils/date.ts:42` (12 LoC)
112
+ - ...
113
+
114
+ ### Possibly dead (verify dynamic dispatch first)
115
+ - `DEBUG_TIMER` — used via env-driven hook?
116
+
117
+ ### Total cleanup potential
118
+ N functions, M LoC, X files.
119
+ ```