npm - @codragraph/cli - Versions diffs - 1.6.4 → 2.1.0 - Mend

@codragraph/cli 1.6.4 → 2.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (105)

package/README.md +34 -0
package/dist/_shared/cgdb/schema-constants.d.ts +16 -0
package/dist/_shared/cgdb/schema-constants.d.ts.map +1 -0
package/dist/_shared/cgdb/schema-constants.js +67 -0
package/dist/_shared/cgdb/schema-constants.js.map +1 -0
package/dist/_shared/index.d.ts +2 -2
package/dist/_shared/index.js +1 -1
package/dist/cli/analyze.d.ts +22 -0
package/dist/cli/analyze.js +109 -6
package/dist/cli/compress-stats.d.ts +29 -0
package/dist/cli/compress-stats.js +97 -0
package/dist/cli/graphstore.d.ts +6 -2
package/dist/cli/graphstore.js +45 -23
package/dist/cli/index-repo.js +3 -3
package/dist/cli/index.js +16 -2
package/dist/cli/profile-heap.d.ts +35 -0
package/dist/cli/profile-heap.js +126 -0
package/dist/cli/setup.d.ts +13 -0
package/dist/cli/setup.js +22 -11
package/dist/cli/skill-gen.d.ts +14 -2
package/dist/cli/skill-gen.js +52 -19
package/dist/cli/tool.js +4 -0
package/dist/cli/wiki.js +3 -3
package/dist/core/augmentation/engine.js +7 -7
package/dist/core/cgdb/cgdb-adapter.d.ts +176 -0
package/dist/core/cgdb/cgdb-adapter.js +1320 -0
package/dist/core/cgdb/content-read.d.ts +46 -0
package/dist/core/cgdb/content-read.js +64 -0
package/dist/core/cgdb/csv-generator.d.ts +29 -0
package/dist/core/cgdb/csv-generator.js +492 -0
package/dist/core/cgdb/pool-adapter.d.ts +93 -0
package/dist/core/cgdb/pool-adapter.js +550 -0
package/dist/core/cgdb/schema.d.ts +62 -0
package/dist/core/cgdb/schema.js +502 -0
package/dist/core/embeddings/embedding-pipeline.js +27 -10
package/dist/core/graphstore/cgdb-row-source.d.ts +19 -0
package/dist/core/graphstore/cgdb-row-source.js +141 -0
package/dist/core/graphstore/index.d.ts +1 -1
package/dist/core/graphstore/index.js +3 -3
package/dist/core/group/bridge-db.d.ts +2 -2
package/dist/core/group/bridge-db.js +123 -36
package/dist/core/group/bridge-schema.d.ts +4 -4
package/dist/core/group/bridge-schema.js +4 -4
package/dist/core/group/cross-impact.js +3 -3
package/dist/core/group/sync.js +4 -4
package/dist/core/lbug/content-read.d.ts +46 -0
package/dist/core/lbug/content-read.js +64 -0
package/dist/core/lbug/csv-generator.d.ts +2 -6
package/dist/core/lbug/csv-generator.js +45 -12
package/dist/core/lbug/lbug-adapter.d.ts +4 -1
package/dist/core/lbug/lbug-adapter.js +153 -21
package/dist/core/lbug/schema.d.ts +7 -7
package/dist/core/lbug/schema.js +18 -0
package/dist/core/run-analyze.d.ts +13 -0
package/dist/core/run-analyze.js +114 -27
package/dist/core/search/bm25-index.d.ts +3 -3
package/dist/core/search/bm25-index.js +75 -23
package/dist/core/search/hybrid-search.js +2 -2
package/dist/core/wiki/generator.d.ts +2 -2
package/dist/core/wiki/generator.js +4 -4
package/dist/core/wiki/graph-queries.d.ts +2 -2
package/dist/core/wiki/graph-queries.js +5 -5
package/dist/mcp/core/cgdb-adapter.d.ts +5 -0
package/dist/mcp/core/cgdb-adapter.js +5 -0
package/dist/mcp/core/embedder.js +1 -1
package/dist/mcp/local/local-backend.d.ts +2 -2
package/dist/mcp/local/local-backend.js +36 -19
package/dist/mcp/server.js +3 -3
package/dist/mcp/tools.js +1 -1
package/dist/server/analyze-worker.js +2 -2
package/dist/server/api.js +34 -33
package/dist/storage/repo-manager.d.ts +42 -3
package/dist/storage/repo-manager.js +23 -4
package/hooks/claude/codragraph-hook.cjs +98 -5
package/package.json +4 -4
package/scripts/build-tree-sitter-proto.cjs +15 -3
package/scripts/build.js +8 -9
package/scripts/patch-tree-sitter-swift.cjs +17 -4
package/skills/codragraph-api-surface.md +110 -0
package/skills/codragraph-config-audit.md +146 -0
package/skills/codragraph-cross-repo-impact.md +135 -0
package/skills/codragraph-data-lineage.md +137 -0
package/skills/codragraph-dead-code.md +119 -0
package/skills/codragraph-gh-actions-debug.md +162 -0
package/skills/codragraph-gh-issue-workflow.md +178 -0
package/skills/codragraph-gh-pr-workflow.md +176 -0
package/skills/codragraph-gh-release-workflow.md +187 -0
package/skills/codragraph-git-bisect.md +176 -0
package/skills/codragraph-git-force-push.md +147 -0
package/skills/codragraph-git-history-rewrite.md +174 -0
package/skills/codragraph-git-rebase-vs-merge.md +138 -0
package/skills/codragraph-git-recovery.md +181 -0
package/skills/codragraph-git-worktree.md +145 -0
package/skills/codragraph-migration-tracking.md +130 -0
package/skills/codragraph-notebook-context.md +136 -0
package/skills/codragraph-observability-coverage.md +125 -0
package/skills/codragraph-onboarding.md +129 -0
package/skills/codragraph-perf-hotspots.md +132 -0
package/skills/codragraph-project-switcher.md +116 -0
package/skills/codragraph-security-audit.md +144 -0
package/skills/codragraph-sql-tracing.md +122 -0
package/skills/codragraph-supply-chain-audit.md +153 -0
package/skills/codragraph-test-coverage.md +97 -0
package/vendor/tree-sitter-proto/bindings/node/index.js +3 -3
package/vendor/tree-sitter-proto/src/node-types.json +1 -1

package/skills/codragraph-migration-tracking.md ADDED

@@ -0,0 +1,130 @@
+---
+name: codragraph-migration-tracking
+description: "Use when tracking the progress of a phased refactor or migration (renaming an API, swapping a library, moving from class- to functional-components, deprecating a flag). Examples: \"how far is the migration\", \"what's left to migrate\", \"track this refactor\", \"are we done with the move from X to Y\", \"is the migration done\""
+---
+# Migration Progress Tracking with CodraGraph
+## When to Use
+- "How far along is the migration from `<old>` to `<new>`?"
+- "What's left to migrate / refactor / deprecate?"
+- "Are we done with `<old API>`?"
+- Coordinating a phased refactor across many PRs
+- Reporting migration status to stakeholders
+## Why CodraGraph helps here
+Migrations span dozens of PRs and weeks. Without a structural index, the
+question "are we done?" reduces to grep-and-eyeball. CodraGraph's versioned
+graphstore lets you snapshot the codebase at the start of the migration,
+then diff against today to see exactly what's converted and what isn't.
+Pair with `cypher` to count remaining instances of the old pattern.
+## Workflow
+```
+1. Establish baseline (once, at migration start):
+   codragraph commit -m "migration baseline: pre-X-removal"
+   codragraph branch create migration-baseline
+   → captures the structural state for later comparison
+2. Count remaining old-API call sites today:
+   codragraph_cypher({query: `
+     MATCH (n)-[:CALLS]->(target)
+     WHERE target.name = '<oldFunction>'
+     RETURN n.filePath, count(n) AS callers
+     ORDER BY callers DESC
+   `})
+   → "27 callers in 14 files still using <oldFunction>"
+3. Diff against the baseline to see structural progress:
+   codragraph diff migration-baseline HEAD --semantic --json
+   → look at removedAPIs (old surface gone), addedAPIs (new surface added),
+     classifiedModifications (signatures swapped)
+4. Assess flows still touching the old API:
+   codragraph_impact({target: "<oldFunction>", direction: "upstream"})
+   → list of remaining callers grouped by depth
+5. Suggest the next batch of files to migrate (highest caller-count first)
+```
+> If `migration-baseline` doesn't exist, you skipped step 1 — fall back to
+> the earliest commit in `codragraph log` as a baseline (less precise but
+> usable).
+## Checklist
+```
+- [ ] Establish a baseline (branch / tagged commit) at migration start
+- [ ] Cypher count of remaining old-API references
+- [ ] codragraph diff baseline HEAD --semantic for structural progress
+- [ ] impact upstream on the old API → list of remaining callers
+- [ ] Group remaining work by file → suggest next batch
+- [ ] Report: "<N>% of <total> call sites converted. <K> files remaining."
+```
+## Migration Patterns This Catches
+| Pattern | Cypher hint |
+| --- | --- |
+| API rename (foo → bar) | `MATCH (c)-[:CALLS]->(n) WHERE n.name = 'foo' RETURN c.filePath, count(*)` |
+| Library swap (lodash → native) | Filter on `filePath` for files still importing the old library |
+| Class → functional component | Match by `n.label = 'Class'` in the relevant directory |
+| Feature flag removal | Cypher for string literals matching the flag name |
+| Type-system migration (any → typed) | `MATCH (n) WHERE n.returnType = 'any' OR n.returnType IS NULL` |
+## Example: "Track our migration from `validatePaymentV1` to `validatePaymentV2`"
+```
+1. (Baseline established 3 months ago: codragraph branch create migration-v2-start)
+2. codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(target)
+     WHERE target.name STARTS WITH 'validatePayment'
+     RETURN target.name, count(caller) AS callers
+   `})
+   → validatePaymentV1: 8 callers
+   → validatePaymentV2: 31 callers
+3. codragraph diff migration-v2-start HEAD --semantic
+   → addedAPIs: validatePaymentV2 (and 4 helpers)
+   → classifiedModifications: 23 functions migrated from V1 to V2
+   → removedAPIs: 0 (V1 still exported)
+4. codragraph_impact({target: "validatePaymentV1", direction: "upstream"})
+   → d=1 callers (still on V1):
+       - legacyCheckout (src/legacy/checkout.ts)
+       - webhookV1 (src/webhooks/v1.ts)
+       - … 6 more
+Report: 79% migrated (31 / 39 callers). 8 callers in 3 files remaining.
+Next batch: src/legacy/checkout.ts (5 callers in one file).
+```
+## Output Format
+```markdown
+## Migration Progress: <name>
+### Baseline
+`migration-baseline` (3 months ago, before refactor started)
+### Current state
+- **Converted:** 31 / 39 call sites (79%)
+- **Remaining:** 8 callers in 3 files
+- **Old API surface:** still exported (cannot remove yet)
+- **New API surface:** stable (4 helpers added)
+### Remaining work
+| File | Old-API callers | Notes |
+| --- | --- | --- |
+| `src/legacy/checkout.ts` | 5 | one batch |
+| `src/webhooks/v1.ts` | 2 | tied to legacy webhook contract |
+| ... | ... | ... |
+### Done criteria
+- 0 remaining callers
+- removedAPIs in `codragraph diff` includes `validatePaymentV1`
+```
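
Reviewer sketch (not part of the package diff): the report line in the example above is plain arithmetic over caller counts. The following minimal Python sketch assumes the remaining old-API callers have already been fetched as `(caller, file)` pairs. The function name `migration_report`, the caller names, and the file paths are hypothetical placeholders, not CodraGraph output.

```python
from collections import Counter


def migration_report(old_callers, new_caller_count):
    """Summarize migration progress.

    old_callers: list of (caller_name, file_path) pairs still on the old API.
    new_caller_count: number of call sites already on the new API.
    """
    remaining = len(old_callers)
    total = remaining + new_caller_count
    # With nothing to migrate, treat the migration as complete.
    percent = round(100 * new_caller_count / total) if total else 100
    by_file = Counter(path for _, path in old_callers)
    # Highest caller-count file first, mirroring "suggest the next batch".
    next_batch = by_file.most_common(1)[0] if by_file else None
    return {
        "percent": percent,
        "converted": new_caller_count,
        "total": total,
        "remaining": remaining,
        "files_remaining": len(by_file),
        "next_batch": next_batch,
    }


# Hypothetical data shaped like the example: 8 old-API callers in 3 files.
old_callers = (
    [(f"caller_a{i}", "file_a.ts") for i in range(5)]
    + [(f"caller_b{i}", "file_b.ts") for i in range(2)]
    + [("caller_c0", "file_c.ts")]
)
report = migration_report(old_callers, new_caller_count=31)
print(
    f"{report['percent']}% migrated "
    f"({report['converted']} / {report['total']} callers). "
    f"{report['remaining']} callers in {report['files_remaining']} files remaining."
)
```

With these inputs the percentage is 31 / 39 rounded to the nearest whole number, and `next_batch` is the file with the most remaining callers.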

package/skills/codragraph-notebook-context.md ADDED

@@ -0,0 +1,136 @@
+---
+name: codragraph-notebook-context
+description: "Use when working with notebook-heavy projects (Jupyter, Databricks, Colab, Marimo) where each notebook contains long pipelines of cells, and the user needs to navigate, summarize, or refactor across them. Examples: \"what do these notebooks do\", \"summarize this analysis pipeline\", \"refactor cells from this notebook into modules\", \"data analysis project tour\""
+---
+# Notebook-Heavy Project Navigation with CodraGraph
+## When to Use
+- "What's in these notebooks?"
+- "Summarize the analysis pipeline across `<notebook>.ipynb`"
+- "Help me refactor a notebook into a proper module"
+- "What functions defined in notebooks does the production code call?"
+- "Audit the data analyses in this project"
+## Why CodraGraph helps here
+Data-science / analytics projects often have 80% of the logic inside
+`.ipynb` files: top-level imports, helper functions, ad-hoc transforms.
+CodraGraph indexes Python (the most common notebook language) and treats
+notebook-derived code as first-class graph content. That means `query`,
+`context`, and `impact` work on notebook-defined symbols just like
+production-code symbols, so you can navigate a 30-cell notebook the same
+way you'd navigate a typical module.
+## Workflow
+```
+1. List notebook-derived symbols:
+   codragraph_cypher({query: `
+     MATCH (n)
+     WHERE n.filePath ENDS WITH '.ipynb'
+     RETURN n.filePath, n.name, labels(n)[0] AS label
+     ORDER BY n.filePath, n.startLine
+   `})
+   → every function/class defined inside any notebook
+2. For each notebook of interest:
+   codragraph_query({query: "<notebook concept, e.g. 'monthly retention'>"})
+   → top-ranked symbols across notebooks (process-grouped)
+3. codragraph_context({name: "<notebook function>"})
+   → callers (other notebooks? production?) and callees (libraries used)
+4. Cross-notebook reuse check:
+   codragraph_impact({target: "<helper>", direction: "upstream"})
+   → if multiple notebooks call the same helper, that's a refactor candidate
+     (extract into a shared module)
+5. Production / notebook bridge:
+   codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(target)
+     WHERE NOT caller.filePath ENDS WITH '.ipynb'
+       AND target.filePath ENDS WITH '.ipynb'
+     RETURN caller.filePath, target.filePath, target.name
+   `})
+   → production code calling into notebooks (usually a bug — flag it)
+```
+> Notebooks export when wrapped in nbconvert / papermill / databricks-cli.
+> CodraGraph parses the `.ipynb` JSON and treats each code cell as part of
+> the file's symbol space.
+## Checklist
+```
+- [ ] Cypher query for all symbols with .ipynb file paths
+- [ ] Group by notebook → see total symbol count per notebook
+- [ ] query for the analysis topic to find top-ranked symbols
+- [ ] context on key notebook helpers
+- [ ] impact upstream on cross-notebook helpers → refactor candidates
+- [ ] Cypher for production-code → notebook calls (bridge audit)
+```
+## Refactor Signals
+| Signal | What to do |
+| --- | --- |
+| Same helper defined in 3+ notebooks | Extract into a shared `.py` module |
+| Notebook function called from production code | Move to production module; notebook should re-import |
+| Notebook with > 30 distinct symbols | Likely needs to be split (or graduated to a module) |
+| Notebook calling another notebook | Strong refactor signal — extract the shared part |
+## Example: "Summarize the customer-churn analyses in notebooks/"
+```
+1. codragraph_cypher({
+     query: `MATCH (n) WHERE n.filePath STARTS WITH 'notebooks/'
+              AND n.filePath ENDS WITH '.ipynb'
+              RETURN n.filePath, count(n) AS symbols
+              ORDER BY symbols DESC`
+   })
+   → 4 notebooks: churn_baseline.ipynb (28 symbols),
+                  churn_features.ipynb (35 symbols),
+                  churn_model.ipynb (22 symbols),
+                  churn_eval.ipynb (12 symbols)
+2. codragraph_query({query: "customer churn cohort"})
+   → top symbols across the 4 notebooks, grouped by detected processes
+3. codragraph_context({name: "compute_cohort_retention"})
+   → defined in: churn_features.ipynb
+   → called by: churn_model.ipynb, churn_eval.ipynb (TWO notebooks)
+   → REFACTOR CANDIDATE — extract into src/churn/cohort.py
+4. codragraph_cypher for production → notebook calls
+   → 1 hit: scripts/daily_churn_report.py imports from churn_eval.ipynb ⚠
+   → Production should not depend on a notebook. Extract.
+Findings:
+- 97 total symbols across 4 notebooks
+- 1 multi-notebook helper (compute_cohort_retention) → extract to module
+- 1 production → notebook bridge (daily_churn_report) → flag as tech debt
+```
+## Output Format
+```markdown
+## Notebook Tour: `notebooks/`
+### Notebooks
+| Notebook | Symbols | Purpose (top-3 symbols) |
+| --- | --- | --- |
+| churn_baseline.ipynb | 28 | baseline_churn_rate, … |
+| churn_features.ipynb | 35 | compute_cohort_retention, … |
+| churn_model.ipynb | 22 | train_churn_classifier, … |
+| churn_eval.ipynb | 12 | evaluate_churn_model, … |
+### Refactor candidates
+1. `compute_cohort_retention` — used in 2 notebooks → extract to `src/churn/cohort.py`
+2. ...
+### Bridge audits
+- ⚠ `scripts/daily_churn_report.py` imports from `churn_eval.ipynb` —
+  production should not depend on a notebook.
+```
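
Reviewer sketch (not part of the package diff): the skill says the `.ipynb` JSON is parsed and each code cell contributes to the file's symbol space. The sketch below illustrates that idea with only the Python standard library, assuming the nbformat v4 layout (a top-level `cells` list whose entries carry `cell_type` and `source`). It is not CodraGraph's parser; `notebook_symbols` and the demo notebook are hypothetical.

```python
import ast
import json


def notebook_symbols(notebook_json):
    """Return (label, name) pairs for top-level defs in a notebook's code cells."""
    nb = json.loads(notebook_json)
    symbols = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        try:
            tree = ast.parse(source)
        except SyntaxError:
            # Cells with IPython magics (e.g. %matplotlib) are not plain Python.
            continue
        for node in tree.body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                symbols.append(("Function", node.name))
            elif isinstance(node, ast.ClassDef):
                symbols.append(("Class", node.name))
    return symbols


# Hypothetical in-memory notebook; real notebooks would be read from disk.
demo = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Churn features\n"]},
        {"cell_type": "code", "source": ["%matplotlib inline\n", "import math\n"]},
        {"cell_type": "code",
         "source": ["def compute_cohort_retention(df):\n", "    return df\n"]},
        {"cell_type": "code", "source": ["class ChurnModel:\n", "    pass\n"]},
    ],
})
found = notebook_symbols(demo)
print(found)
```

Skipping cells that fail to parse is a deliberate simplification; a real indexer would more likely strip magics and keep the rest of the cell.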

package/skills/codragraph-observability-coverage.md ADDED

@@ -0,0 +1,125 @@
+---
+name: codragraph-observability-coverage
+description: "Use to audit observability coverage — which functions / processes have logs, metrics, or distributed-trace spans, and which don't. Find the dark corners where you're flying blind. Examples: \"observability coverage\", \"unlogged code\", \"missing traces\", \"telemetry audit\", \"where are we flying blind\""
+---
+# Observability Coverage Audit with CodraGraph
+## When to Use
+- "Which functions have NO logs / metrics / traces?"
+- "Audit telemetry coverage on the request path."
+- "Find dark spots in my observability."
+- "Are all my critical processes instrumented?"
+- Post-incident review: "did we have visibility into X?"
+## Why CodraGraph helps here
+Telemetry calls are just function calls — `logger.info(...)`,
+`tracer.startSpan(...)`, `metrics.histogram(...)`. CodraGraph's call
+graph shows you exactly which symbols invoke them. Subtract those from
+your full symbol set: the difference is your dark zone.
+## Workflow
+```
+1. Identify your telemetry surface area:
+   codragraph_query({query: "logger trace span metric histogram counter"})
+   → list of telemetry-emitting helpers (logger.info, span.end, metrics.timing, ...)
+2. For each telemetry helper, find its callers:
+   codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(t {name: '<telemetry-fn>'})
+     RETURN DISTINCT caller.id, caller.name
+   `})
+   → "instrumented" set: every function that emits telemetry
+3. Map your critical surface (processes / entry points):
+   READ codragraph://repo/{name}/processes
+   → request-path / job-path flows
+4. Subtract: which symbols in critical processes are NOT in the
+   instrumented set?
+   codragraph_cypher({query: `
+     MATCH (n {label: 'Function'})
+     WHERE n.isEntryPoint = true
+       AND NOT EXISTS {
+         MATCH (n)-[:CALLS*1..3]->(t)
+         WHERE t.name STARTS WITH 'logger.'
+            OR t.name STARTS WITH 'tracer.'
+            OR t.name STARTS WITH 'metrics.'
+       }
+     RETURN n.name, n.filePath
+   `})
+   → entry points with NO telemetry within 3 hops = dark zones
+5. For each dark zone, codragraph_context to confirm and propose
+   minimum-viable instrumentation (one log line + one span)
+```
+## Coverage tiers
+| Tier | What's covered | What it tells you |
+|---|---|---|
+| **None** | No telemetry within 3 hops of entry point | Flying blind under load |
+| **Logs only** | `logger.*` reachable but no `tracer.*` / `metrics.*` | Can debug post-hoc, can't query prod |
+| **Logs + metrics** | Counters / histograms emitted | Dashboards possible |
+| **Logs + metrics + traces** | Spans tied to request flow | Full observability |
+| **Structured + correlated** | All three with a request_id propagated | Best — can chase one user through everything |
+## Checklist
+```
+- [ ] Listed telemetry-emitting helpers (logger / tracer / metrics)
+- [ ] Resolved their direct callers (instrumented set)
+- [ ] Listed critical processes / entry points
+- [ ] Subtracted: which entry points have no telemetry within 3 hops?
+- [ ] For each gap, propose minimum-viable instrumentation
+- [ ] Tier-rate each critical flow (None / Logs / Metrics / Traces / Correlated)
+```
+## Example: "Audit observability on our checkout flow"
+```
+1. codragraph_query({query: "checkout payment process"})
+   → CheckoutFlow process: 7 steps (validateCart → reservePayment →
+     captureFunds → createOrder → notifyShip → emitReceipt → done)
+2. Telemetry helpers:
+   - logger.info, logger.warn, logger.error
+   - tracer.startSpan, span.end
+   - metrics.histogram, metrics.counter
+3. For each step in CheckoutFlow:
+   codragraph_context({name: "<step>"})
+   → check callees include any telemetry helper
+   - validateCart    → logger.info ✓, tracer ✓, metrics ✗
+   - reservePayment  → logger.info ✓, tracer ✓, metrics ✗
+   - captureFunds    → logger.info ✓, tracer ✗, metrics ✗ ⚠
+   - createOrder     → logger.info ✓, tracer ✓, metrics ✓
+   - notifyShip      → ⚠ NOTHING (dark zone)
+   - emitReceipt     → logger.info ✓
+   - done            → logger.info ✓
+4. Gaps:
+   - notifyShip: entirely unobserved. Add tracer.startSpan + counter on
+     success/failure. Cheapest fix to close the gap.
+   - captureFunds: missing tracer span around the actual capture call.
+     Add for distributed-trace correlation with payment provider.
+   - validateCart, reservePayment, captureFunds: missing latency histograms.
+     Add metrics.histogram for each.
+Tier rating: Logs partial ⚠ (notifyShip has none), Metrics partial ⚠, Traces partial ⚠, Correlated ✓
+   (request_id is propagated end-to-end where instrumentation exists).
+```
+## Pitfalls
+| Pitfall | Symptom | Fix |
+|---|---|---|
+| Telemetry behind a façade | Direct caller is your `obs.log()` wrapper, not `logger.info` | Search for the wrapper too |
+| Conditional logging only on errors | "Looks instrumented" but emits nothing on the happy path | Audit success paths separately |
+| Telemetry in middleware, missing in handler | Edge instrumentation doesn't show handler-internal state | Check both layers |
+| Excessive logging in hot loops | Coverage looks great, dashboards drown in noise | Pair with codragraph-perf-hotspots; sample logs in hot paths |
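
Reviewer sketch (not part of the package diff): step 4 of the workflow subtracts entry points that can reach telemetry within 3 hops from the full entry-point set. The same subtraction can be sketched as a bounded breadth-first search over an in-memory call graph. The graph below, the helper names in it, and `dark_entry_points` are hypothetical; only the `logger.` / `tracer.` / `metrics.` prefixes and the 3-hop bound come from the skill.

```python
from collections import deque

TELEMETRY_PREFIXES = ("logger.", "tracer.", "metrics.")


def dark_entry_points(calls, entry_points, max_hops=3):
    """Return entry points with no telemetry call within max_hops CALLS edges.

    calls: dict mapping a caller name to the list of names it calls.
    """
    dark = []
    for entry in entry_points:
        seen = {entry}
        queue = deque([(entry, 0)])
        found = False
        while queue and not found:
            node, depth = queue.popleft()
            if depth == max_hops:
                continue  # callees of this node would be beyond the hop limit
            for callee in calls.get(node, ()):
                if callee.startswith(TELEMETRY_PREFIXES):
                    found = True
                    break
                if callee not in seen:
                    seen.add(callee)
                    queue.append((callee, depth + 1))
        if not found:
            dark.append(entry)
    return dark


# Hypothetical call graph: notifyShip only reaches a logger at hop 4.
calls = {
    "validateCart": ["logger.info", "loadCart"],
    "captureFunds": ["chargeCard"],
    "chargeCard": ["logger.info"],
    "notifyShip": ["buildPayload"],
    "buildPayload": ["sendHttp"],
    "sendHttp": ["retry"],
    "retry": ["logger.warn"],
}
dark = dark_entry_points(calls, ["validateCart", "captureFunds", "notifyShip"])
print(dark)
```

Raising `max_hops` to 4 would clear `notifyShip` in this toy graph, which shows how sensitive the "dark zone" label is to the hop bound.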

package/skills/codragraph-onboarding.md ADDED

@@ -0,0 +1,129 @@
+---
+name: codragraph-onboarding
+description: "Use when a developer is new to a codebase and needs a guided walkthrough — entry points, functional areas, key flows, where to start contributing. Examples: \"I'm new to this repo\", \"where do I start\", \"give me a tour\", \"onboard me to this codebase\", \"what does this project do\""
+---
+# Codebase Onboarding with CodraGraph
+## When to Use
+- "I'm new to this codebase. Where do I start?"
+- "Give me a tour of this project."
+- "What does each part of this repo do?"
+- "I want to fix a bug in `<area>` — what should I read first?"
+- "I'm picking this project back up after 6 months."
+## Why CodraGraph helps here
+A README tells you what the project *does*. Reading source top-down tells
+you nothing for the first hour. CodraGraph already grouped your code into
+**Leiden communities** during analyze — these are the natural functional
+areas of the codebase, derived from the call graph rather than directory
+structure. Pair them with the detected execution flows (Processes) and you
+get a guided tour: each cluster is a "module", each process is a "story
+running through it."
+## Workflow
+```
+1. READ codragraph://repo/{name}/context
+   → repo-level overview: file count, language mix, last index time
+2. READ codragraph://repo/{name}/clusters
+   → all functional areas (auth, payments, ingestion, …) with cohesion %
+     and dominant directories
+3. For each top-3 cluster (by symbol count):
+   READ .claude/skills/generated/<cluster-kebab-name>/SKILL.md (if --skills was run)
+   OR
+   codragraph_query({query: "<cluster label>"})
+   → entry points, key files, member symbols
+4. READ codragraph://repo/{name}/processes
+   → all detected execution flows (named processes that span the graph)
+5. For each process:
+   READ codragraph://repo/{name}/process/<processName>
+   → step-by-step trace: which symbol calls which next
+6. Pick a cluster the user wants to dig into:
+   codragraph_context({name: "<entry point of that cluster>"})
+   → callers + callees, full picture of the entry point
+```
+> If `.claude/skills/generated/` is empty, run `codragraph analyze --skills`
+> first to materialize per-community guides. They make onboarding
+> dramatically faster.
+## Checklist
+```
+- [ ] Repo overview (context resource)
+- [ ] List clusters (clusters resource)
+- [ ] Read top 3-5 cluster skills or query each cluster label
+- [ ] List processes
+- [ ] Walk top 2-3 processes step-by-step
+- [ ] Pick one entry point and run context for the deep dive
+- [ ] Summarize: "Here's the map. Start at <X> for <task>."
+```
+## Tour Structure
+| Stage | Tool | Output |
+| --- | --- | --- |
+| Map | `clusters` resource | "10 functional areas, dominant: auth, ingestion, web" |
+| Themes | per-community SKILL.md | Each area's purpose, key files, entry points |
+| Stories | `processes` resource | "5 flows: SignupFlow, IngestPipeline, …" |
+| Trace | `process/{name}` resource | Step-by-step call sequence |
+| Deep dive | `context` | Pick one symbol, see all sides |
+## Example: "I'm new to CodraGraph itself, where do I start?"
+```
+1. READ codragraph://repo/CodraGraph/context
+   → 4325 symbols, 10556 relationships, 300 flows. TypeScript primary.
+2. READ codragraph://repo/CodraGraph/clusters
+   → Top clusters: ingestion (1240 symbols), graphstore (340), cli (290),
+     mcp (220), languages (180)
+3. READ .claude/skills/generated/ingestion/SKILL.md
+   → Entry points: runFullAnalysis, IngestionPipeline.run
+   → Key files: codragraph/src/core/ingestion/
+4. READ codragraph://repo/CodraGraph/processes
+   → Top flows: AnalyzeFlow, McpQueryFlow, GraphstoreCommitFlow
+5. READ codragraph://repo/CodraGraph/process/AnalyzeFlow
+   → 12 steps from CLI invocation through Phase 4 snapshot
+6. codragraph_context({name: "runFullAnalysis"})
+   → orchestrator that takes (repoPath, options, hooks) and runs the pipeline.
+   → Called by: analyzeCommand (CLI), eval-server, augment hook
+Tour result: "Start at runFullAnalysis (codragraph/src/core/run-analyze.ts).
+That's the orchestrator. The 12-stage pipeline lives under
+src/core/ingestion/. Phase 4 graphstore is in src/core/graphstore/."
+```
+## Output Format
+```markdown
+## Codebase Tour: <repo>
+### Project shape
+- N symbols across M files. Primary languages: …
+- N functional areas (clusters), M execution flows.
+### Functional areas
+1. **<cluster>** — <symbolCount> symbols, dominant `src/<dir>/`. Purpose: …
+2. ...
+### Key flows
+- **AnalyzeFlow** — 12 steps. Entry: `<symbol>`.
+- ...
+### Recommended starting point for "<task>"
+Read `<file>:<line>` (`<symbol>`). It's the orchestrator for <area>.
+Once you understand it, walk the <flow> to see the whole story.
+```

package/skills/codragraph-perf-hotspots.md ADDED

@@ -0,0 +1,132 @@
+---
+name: codragraph-perf-hotspots
+description: "Use to identify likely performance hot paths from the call graph — top-N callees of entry points, fan-out functions, recursive cycles, and where to focus profiling effort. NOT a profiler — a structural pre-screen that narrows where to actually run a profiler. Examples: \"find perf hotspots\", \"hot paths\", \"top callees\", \"where should I profile\", \"functions called by every handler\""
+---
+# Performance Hotspot Pre-Screen with CodraGraph
+## When to Use
+- "Where should I focus my profiler?"
+- "Which functions are on every request path?"
+- "Find fan-out points — functions called from many places."
+- "Are there recursive call cycles?"
+- "List top N callees of the API request handler."
+## What this skill IS and ISN'T
+CodraGraph builds a **static call graph** — it knows who *can* call
+whom, not who *did* call whom in production. So this skill identifies
+**structural hot path candidates**, not measured hotspots.
+Use it as a **pre-screen** for actual profiling: "given my structural
+hot path candidates, the profiler should focus here first." If you have
+profiler data (pprof, flamegraphs, OpenTelemetry traces), CodraGraph
+turns the names from that data into actionable callgraph context.
+## Workflow
+```
+1. Identify entry points:
+   codragraph_query({query: "request handler endpoint route main"})
+   → top entry-point candidates
+2. Rank callees by depth-1 fan-in (distinct direct callers across the whole graph):
+   codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(callee)
+     WHERE callee.label = 'Function'
+     RETURN callee.name, count(DISTINCT caller) AS in_degree
+     ORDER BY in_degree DESC
+     LIMIT 20
+   `})
+   → callees called from many places = on many paths = candidate hot spots
+3. Cross-cut with processes:
+   READ codragraph://repo/{name}/processes
+   → functions appearing in MANY processes are strong hot-path candidates
+4. Recursive cycle detection:
+   codragraph_cypher({query: `
+     MATCH path = (n)-[:CALLS*2..6]->(n)
+     RETURN n.name, length(path) AS cycle_len
+     ORDER BY cycle_len ASC
+     LIMIT 10
+   `})
+   → call cycles = possible unbounded recursion = potential perf cliff under specific inputs
+5. With profiler output, translate names back to context:
+   For each top-N name from your flamegraph/pprof:
+     codragraph_context({name: "<sym>"})
+     → "this function is on N execution flows; called by M sites"
+```
+## Hot-path heuristics
+| Signal | Meaning |
+|---|---|
+| Function called from > 10 distinct callers | High fan-in → optimize once, win everywhere |
+| Function appearing in > 5 processes | On many request paths → request-time critical |
+| Cycle of length 2-3 in CALLS edges | Mutual recursion — may overflow under depth |
+| `await` chain of length > 8 in one process | Sequential I/O — candidate for parallelism |
+| Function under cluster `database` / `network` | I/O-bound; profile network and DB calls separately |
+## Checklist
+```
+- [ ] List entry points (codragraph_query for handlers/routes/main)
+- [ ] Top-N callees by in-degree (Cypher)
+- [ ] Cross-reference with processes (functions in many flows = hot)
+- [ ] Cycle detection
+- [ ] If profiler data exists, codragraph_context for each top hot symbol
+- [ ] Prioritize: request-path + high in-degree + I/O-bound = first to optimize
+```
+## Example: "Find the hot paths in our HTTP handler chain"
+```
+1. codragraph_query({query: "express router handler"})
+   → 28 handler functions
+2. codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(callee)
+     WHERE callee.label = 'Function'
+     RETURN callee.name, count(DISTINCT caller) AS in_degree
+     ORDER BY in_degree DESC LIMIT 10
+   `})
+   → top in-degree callees:
+     - logRequest (28)         ← every handler calls this
+     - serializeJson (28)      ← every handler calls this
+     - getCurrentUser (22)
+     - db.query (18)           ← I/O bound
+     - cache.get (15)
+3. READ codragraph://repo/CodraGraph/processes
+   → 5 processes; getCurrentUser appears in 4 of 5
+4. codragraph_context({name: "getCurrentUser"})
+   → 22 callers, calls db.query (cache miss path) and cache.get
+   → STRONGLY recommend: profile getCurrentUser first.
+   → Expected win per optimization: 22 callers × cache miss rate × DB latency.
+```
+## Output Format
+```markdown
+## Perf Pre-Screen: <scope>
+### Top hot-path candidates (structural)
+| Function | In-degree | Processes | I/O type | Note |
+|---|--:|--:|---|---|
+| getCurrentUser | 22 | 4 | DB + cache | profile first |
+| db.query | 18 | 5 | DB | hot for write paths |
+| serializeJson | 28 | 5 | CPU | every handler — micro-opt territory |
+### Cycles detected
+- `processStep ↔ enqueueRetry` — depth 2 cycle, unbounded under failure conditions
+### Profiler integration plan
+1. Collect pprof / flamegraph / OTEL trace under representative load
+2. Top-N hottest functions from profile → run codragraph_context on each
+3. Cross-reference with this static pre-screen
+4. Optimize where structural & measured hot paths overlap
+```
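
Reviewer sketch (not part of the package diff): the fan-in ranking and the shortest call cycles from the workflow can be reproduced on an exported edge list. This minimal Python sketch assumes the CALLS edges are available as `(caller, callee)` pairs. The function names and the sample edges are hypothetical, and the cycle check only covers length-2 cycles (mutual recursion), not the full `*2..6` range of the Cypher query.

```python
from collections import defaultdict


def rank_by_in_degree(edges, limit=20):
    """Rank callees by number of DISTINCT direct callers, highest first."""
    callers = defaultdict(set)
    for caller, callee in edges:
        callers[callee].add(caller)
    # Sort by in-degree descending, then by name for a stable order on ties.
    ranked = sorted(callers.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(callee, len(srcs)) for callee, srcs in ranked[:limit]]


def mutual_recursion(edges):
    """Return sorted pairs (a, b) where a calls b and b calls a."""
    edge_set = set(edges)
    return sorted(
        {tuple(sorted((a, b))) for a, b in edge_set if a != b and (b, a) in edge_set}
    )


# Hypothetical edge list; the duplicate (h1, logRequest) edge counts once.
edges = [
    ("h1", "logRequest"), ("h1", "logRequest"),
    ("h2", "logRequest"), ("h3", "logRequest"),
    ("h1", "getCurrentUser"), ("h2", "getCurrentUser"),
    ("processStep", "enqueueRetry"), ("enqueueRetry", "processStep"),
]
top = rank_by_in_degree(edges, limit=2)
cycles = mutual_recursion(edges)
print(top, cycles)
```

Counting distinct callers rather than raw edges matches the `count(DISTINCT caller)` in the Cypher query, so a function called many times from one site does not look like a high fan-in helper.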