npm - @codragraph/cli - Versions diffs - 1.6.4 → 2.0.0 - Mend

@codragraph/cli 1.6.4 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (61)

package/README.md +34 -0
package/dist/cli/analyze.d.ts +22 -0
package/dist/cli/analyze.js +107 -4
package/dist/cli/compress-stats.d.ts +29 -0
package/dist/cli/compress-stats.js +97 -0
package/dist/cli/graphstore.d.ts +6 -2
package/dist/cli/graphstore.js +24 -2
package/dist/cli/index.js +16 -2
package/dist/cli/profile-heap.d.ts +35 -0
package/dist/cli/profile-heap.js +126 -0
package/dist/cli/setup.d.ts +13 -0
package/dist/cli/setup.js +22 -11
package/dist/cli/skill-gen.d.ts +14 -2
package/dist/cli/skill-gen.js +52 -19
package/dist/cli/tool.js +4 -0
package/dist/core/embeddings/embedding-pipeline.js +24 -7
package/dist/core/group/bridge-db.js +111 -24
package/dist/core/lbug/content-read.d.ts +46 -0
package/dist/core/lbug/content-read.js +64 -0
package/dist/core/lbug/csv-generator.d.ts +2 -6
package/dist/core/lbug/csv-generator.js +45 -12
package/dist/core/lbug/lbug-adapter.d.ts +4 -1
package/dist/core/lbug/lbug-adapter.js +153 -21
package/dist/core/lbug/schema.d.ts +7 -7
package/dist/core/lbug/schema.js +18 -0
package/dist/core/run-analyze.d.ts +13 -0
package/dist/core/run-analyze.js +91 -4
package/dist/core/search/bm25-index.js +67 -15
package/dist/mcp/local/local-backend.js +22 -5
package/dist/server/api.js +4 -3
package/dist/storage/repo-manager.d.ts +39 -0
package/dist/storage/repo-manager.js +19 -0
package/hooks/claude/codragraph-hook.cjs +95 -2
package/package.json +4 -4
package/scripts/build-tree-sitter-proto.cjs +15 -3
package/scripts/patch-tree-sitter-swift.cjs +17 -4
package/skills/codragraph-api-surface.md +110 -0
package/skills/codragraph-config-audit.md +146 -0
package/skills/codragraph-cross-repo-impact.md +135 -0
package/skills/codragraph-data-lineage.md +137 -0
package/skills/codragraph-dead-code.md +119 -0
package/skills/codragraph-gh-actions-debug.md +162 -0
package/skills/codragraph-gh-issue-workflow.md +178 -0
package/skills/codragraph-gh-pr-workflow.md +176 -0
package/skills/codragraph-gh-release-workflow.md +187 -0
package/skills/codragraph-git-bisect.md +176 -0
package/skills/codragraph-git-force-push.md +147 -0
package/skills/codragraph-git-history-rewrite.md +174 -0
package/skills/codragraph-git-rebase-vs-merge.md +138 -0
package/skills/codragraph-git-recovery.md +181 -0
package/skills/codragraph-git-worktree.md +145 -0
package/skills/codragraph-migration-tracking.md +130 -0
package/skills/codragraph-notebook-context.md +136 -0
package/skills/codragraph-observability-coverage.md +125 -0
package/skills/codragraph-onboarding.md +129 -0
package/skills/codragraph-perf-hotspots.md +132 -0
package/skills/codragraph-project-switcher.md +116 -0
package/skills/codragraph-security-audit.md +144 -0
package/skills/codragraph-sql-tracing.md +122 -0
package/skills/codragraph-supply-chain-audit.md +153 -0
package/skills/codragraph-test-coverage.md +97 -0

package/skills/codragraph-observability-coverage.md ADDED

@@ -0,0 +1,125 @@
+---
+name: codragraph-observability-coverage
+description: "Use to audit observability coverage — which functions / processes have logs, metrics, or distributed-trace spans, and which don't. Find the dark corners where you're flying blind. Examples: \"observability coverage\", \"unlogged code\", \"missing traces\", \"telemetry audit\", \"where are we flying blind\""
+---
+# Observability Coverage Audit with CodraGraph
+## When to Use
+- "Which functions have NO logs / metrics / traces?"
+- "Audit telemetry coverage on the request path."
+- "Find dark spots in my observability."
+- "Are all my critical processes instrumented?"
+- Post-incident review: "did we have visibility into X?"
+## Why CodraGraph helps here
+Telemetry calls are just function calls — `logger.info(...)`,
+`tracer.startSpan(...)`, `metrics.histogram(...)`. CodraGraph's call
+graph shows you exactly which symbols invoke them. Subtract those from
+your full symbol set: the difference is your dark zone.
+## Workflow
+```
+1. Identify your telemetry surface area:
+   codragraph_query({query: "logger trace span metric histogram counter"})
+   → list of telemetry-emitting helpers (logger.info, span.end, metrics.timing, ...)
+2. For each telemetry helper, find its callers:
+   codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(t {name: '<telemetry-fn>'})
+     RETURN DISTINCT caller.id, caller.name
+   `})
+   → "instrumented" set: every function that emits telemetry
+3. Map your critical surface (processes / entry points):
+   READ codragraph://repo/{name}/processes
+   → request-path / job-path flows
+4. Subtract: which entry points of critical processes reach NO
+   telemetry helper within 3 hops?
+   codragraph_cypher({query: `
+     MATCH (n {label: 'Function'})
+     WHERE n.isEntryPoint = true
+       AND NOT EXISTS {
+         MATCH (n)-[:CALLS*1..3]->(t)
+         WHERE t.name STARTS WITH 'logger.'
+            OR t.name STARTS WITH 'tracer.'
+            OR t.name STARTS WITH 'metrics.'
+       }
+     RETURN n.name, n.filePath
+   `})
+   → entry points with NO telemetry within 3 hops = dark zones
+5. For each dark zone, codragraph_context to confirm and propose
+   minimum-viable instrumentation (one log line + one span)
+```
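The subtraction in step 4 can be sketched outside the graph database as a bounded breadth-first search. This is a minimal TypeScript sketch, assuming a hypothetical in-memory call graph (caller name mapped to direct callees) and the `logger.` / `tracer.` / `metrics.` naming convention from the query above. `CallGraph`, `findDarkZones`, and the sample data are illustrative names, not part of the CodraGraph API.

```typescript
// Hypothetical in-memory call graph: caller name -> names it calls directly.
type CallGraph = Map<string, string[]>;

// Assumed naming convention for telemetry helpers (mirrors the Cypher filter above).
const TELEMETRY_PREFIXES = ["logger.", "tracer.", "metrics."];

function isTelemetry(name: string): boolean {
  return TELEMETRY_PREFIXES.some((prefix) => name.startsWith(prefix));
}

// Breadth-first search: does `start` reach a telemetry helper within maxHops CALLS edges?
function reachesTelemetry(graph: CallGraph, start: string, maxHops: number): boolean {
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let hop = 1; hop <= maxHops; hop++) {
    const next: string[] = [];
    for (const fn of frontier) {
      for (const callee of graph.get(fn) ?? []) {
        if (isTelemetry(callee)) return true;
        if (!seen.has(callee)) {
          seen.add(callee);
          next.push(callee);
        }
      }
    }
    frontier = next;
  }
  return false;
}

// Entry points with no telemetry within maxHops are the "dark zones".
function findDarkZones(graph: CallGraph, entryPoints: string[], maxHops = 3): string[] {
  return entryPoints.filter((entry) => !reachesTelemetry(graph, entry, maxHops));
}

// Illustrative data only.
const sampleGraph: CallGraph = new Map<string, string[]>([
  ["createOrder", ["logger.info", "saveOrder"]],
  ["notifyShip", ["buildPayload"]],
  ["buildPayload", ["sendHttp"]],
  ["captureFunds", ["chargeCard"]],
  ["chargeCard", ["logger.info"]],
]);

const darkZones = findDarkZones(sampleGraph, ["createOrder", "notifyShip", "captureFunds"]);
```

With this sample data, `notifyShip` is the only entry point flagged. Like the Cypher version, the sketch only sees helpers whose names match the prefixes, so a wrapper such as `obs.log()` would need to be added to the prefix list.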
+## Coverage tiers
+| Tier | What's covered | What it tells you |
+|---|---|---|
+| **None** | No telemetry within 3 hops of entry point | Flying blind under load |
+| **Logs only** | `logger.*` reachable but no `tracer.*` / `metrics.*` | Can debug post-hoc, can't query prod |
+| **Logs + metrics** | Counters / histograms emitted | Dashboards possible |
+| **Logs + metrics + traces** | Spans tied to request flow | Full observability |
+| **Structured + correlated** | All three with a request_id propagated | Best — can chase one user through everything |
+## Checklist
+```
+- [ ] Listed telemetry-emitting helpers (logger / tracer / metrics)
+- [ ] Resolved their direct callers (instrumented set)
+- [ ] Listed critical processes / entry points
+- [ ] Subtracted: which entry points have no telemetry within 3 hops?
+- [ ] For each gap, propose minimum-viable instrumentation
+- [ ] Tier-rate each critical flow (None / Logs / Metrics / Traces / Correlated)
+```
+## Example: "Audit observability on our checkout flow"
+```
+1. codragraph_query({query: "checkout payment process"})
+   → CheckoutFlow process: 7 steps (validateCart → reservePayment →
+     captureFunds → createOrder → notifyShip → emitReceipt → done)
+2. Telemetry helpers:
+   - logger.info, logger.warn, logger.error
+   - tracer.startSpan, span.end
+   - metrics.histogram, metrics.counter
+3. For each step in CheckoutFlow:
+   codragraph_context({name: "<step>"})
+   → check callees include any telemetry helper
+   - validateCart    → logger.info ✓, tracer ✓, metrics ✗
+   - reservePayment  → logger.info ✓, tracer ✓, metrics ✗
+   - captureFunds    → logger.info ✓, tracer ✗, metrics ✗ ⚠
+   - createOrder     → logger.info ✓, tracer ✓, metrics ✓
+   - notifyShip      → ⚠ NOTHING (dark zone)
+   - emitReceipt     → logger.info ✓
+   - done            → logger.info ✓
+4. Gaps:
+   - notifyShip: entirely unobserved. Add tracer.startSpan + counter on
+     success/failure. Cheapest fix to close the gap.
+   - captureFunds: missing tracer span around the actual capture call.
+     Add for distributed-trace correlation with payment provider.
+   - validateCart, reservePayment, captureFunds: missing latency histograms.
+     Add metrics.histogram for each.
+Tier rating: Logs partial ⚠ (notifyShip emits none), Metrics partial ⚠, Traces partial ⚠, Correlated ✓
+   (request_id is propagated end-to-end where instrumentation exists).
+```
+## Pitfalls
+| Pitfall | Symptom | Fix |
+|---|---|---|
+| Telemetry behind a façade | Direct caller is your `obs.log()` wrapper, not `logger.info` | Search for the wrapper too |
+| Conditional logging only on errors | "Looks instrumented" but emits nothing on the happy path | Audit success paths separately |
+| Telemetry in middleware, missing in handler | Edge instrumentation doesn't show handler-internal state | Check both layers |
+| Excessive logging in hot loops | Coverage looks great, dashboards drown in noise | Pair with codragraph-perf-hotspots; sample logs in hot paths |

package/skills/codragraph-onboarding.md ADDED

@@ -0,0 +1,129 @@
+---
+name: codragraph-onboarding
+description: "Use when a developer is new to a codebase and needs a guided walkthrough — entry points, functional areas, key flows, where to start contributing. Examples: \"I'm new to this repo\", \"where do I start\", \"give me a tour\", \"onboard me to this codebase\", \"what does this project do\""
+---
+# Codebase Onboarding with CodraGraph
+## When to Use
+- "I'm new to this codebase. Where do I start?"
+- "Give me a tour of this project."
+- "What does each part of this repo do?"
+- "I want to fix a bug in `<area>` — what should I read first?"
+- "I'm picking this project back up after 6 months."
+## Why CodraGraph helps here
+A README tells you what the project *does*. Reading source top-down tells
+you nothing for the first hour. CodraGraph already grouped your code into
+**Leiden communities** during analyze — these are the natural functional
+areas of the codebase, derived from the call graph rather than directory
+structure. Pair them with the detected execution flows (Processes) and you
+get a guided tour: each cluster is a "module", each process is a "story
+running through it."
+## Workflow
+```
+1. READ codragraph://repo/{name}/context
+   → repo-level overview: file count, language mix, last index time
+2. READ codragraph://repo/{name}/clusters
+   → all functional areas (auth, payments, ingestion, …) with cohesion %
+     and dominant directories
+3. For each of the top 3 clusters (by symbol count):
+   READ .claude/skills/generated/<cluster-kebab-name>/SKILL.md (if --skills was run)
+   OR
+   codragraph_query({query: "<cluster label>"})
+   → entry points, key files, member symbols
+4. READ codragraph://repo/{name}/processes
+   → all detected execution flows (named processes that span the graph)
+5. For each process:
+   READ codragraph://repo/{name}/process/<processName>
+   → step-by-step trace: which symbol calls which next
+6. Pick a cluster the user wants to dig into:
+   codragraph_context({name: "<entry point of that cluster>"})
+   → callers + callees, full picture of the entry point
+```
+> If `.claude/skills/generated/` is empty, run `codragraph analyze --skills`
+> first to materialize per-community guides. They make onboarding
+> dramatically faster.
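Step 3 of the workflow (pick the top clusters by symbol count, then look up each generated skill file) can be sketched as below. This is a hedged illustration: the `Cluster` shape and the kebab-case rule are assumptions, since the diff does not show how CodraGraph derives `<cluster-kebab-name>`. Only the `.claude/skills/generated/.../SKILL.md` layout comes from the text above.

```typescript
// Assumed shape of one entry from the clusters resource (illustrative only).
interface Cluster {
  label: string;
  symbolCount: number;
}

// Assumed kebab-case rule: split camelCase, collapse non-alphanumerics to "-", lowercase.
function toKebabCase(label: string): string {
  return label
    .trim()
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2")
    .replace(/[^A-Za-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .toLowerCase();
}

// Largest clusters first, mapped to the generated skill path for each.
function topClusterSkillPaths(clusters: Cluster[], limit = 3): string[] {
  return [...clusters]
    .sort((a, b) => b.symbolCount - a.symbolCount)
    .slice(0, limit)
    .map((c) => `.claude/skills/generated/${toKebabCase(c.label)}/SKILL.md`);
}

// Cluster sizes taken from the example later in this skill file.
const exampleClusters: Cluster[] = [
  { label: "cli", symbolCount: 290 },
  { label: "ingestion", symbolCount: 1240 },
  { label: "languages", symbolCount: 180 },
  { label: "graphstore", symbolCount: 340 },
  { label: "mcp", symbolCount: 220 },
];

const skillPaths = topClusterSkillPaths(exampleClusters);
```

If a computed path does not exist on disk, the workflow's fallback applies: query the cluster label instead.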
+## Checklist
+```
+- [ ] Repo overview (context resource)
+- [ ] List clusters (clusters resource)
+- [ ] Read top 3-5 cluster skills or query each cluster label
+- [ ] List processes
+- [ ] Walk top 2-3 processes step-by-step
+- [ ] Pick one entry point and run context for the deep dive
+- [ ] Summarize: "Here's the map. Start at <X> for <task>."
+```
+## Tour Structure
+| Stage | Tool | Output |
+| --- | --- | --- |
+| Map | `clusters` resource | "10 functional areas, dominant: auth, ingestion, web" |
+| Themes | per-community SKILL.md | Each area's purpose, key files, entry points |
+| Stories | `processes` resource | "5 flows: SignupFlow, IngestPipeline, …" |
+| Trace | `process/{name}` resource | Step-by-step call sequence |
+| Deep dive | `context` | Pick one symbol, see all sides |
+## Example: "I'm new to CodraGraph itself, where do I start?"
+```
+1. READ codragraph://repo/CodraGraph/context
+   → 4325 symbols, 10556 relationships, 300 flows. TypeScript primary.
+2. READ codragraph://repo/CodraGraph/clusters
+   → Top clusters: ingestion (1240 symbols), graphstore (340), cli (290),
+     mcp (220), languages (180)
+3. READ .claude/skills/generated/ingestion/SKILL.md
+   → Entry points: runFullAnalysis, IngestionPipeline.run
+   → Key files: codragraph/src/core/ingestion/
+4. READ codragraph://repo/CodraGraph/processes
+   → Top flows: AnalyzeFlow, McpQueryFlow, GraphstoreCommitFlow
+5. READ codragraph://repo/CodraGraph/process/AnalyzeFlow
+   → 12 steps from CLI invocation through Phase 4 snapshot
+6. codragraph_context({name: "runFullAnalysis"})
+   → orchestrator that takes (repoPath, options, hooks) and runs the pipeline.
+   → Called by: analyzeCommand (CLI), eval-server, augment hook
+Tour result: "Start at runFullAnalysis (codragraph/src/core/run-analyze.ts).
+That's the orchestrator. The 12-stage pipeline lives under
+src/core/ingestion/. Phase 4 graphstore is in src/core/graphstore/."
+```
+## Output Format
+```markdown
+## Codebase Tour: <repo>
+### Project shape
+- N symbols across M files. Primary languages: …
+- K functional areas (clusters), F execution flows.
+### Functional areas
+1. **<cluster>** — <symbolCount> symbols, dominant `src/<dir>/`. Purpose: …
+2. ...
+### Key flows
+- **AnalyzeFlow** — 12 steps. Entry: `<symbol>`.
+- ...
+### Recommended starting point for "<task>"
+Read `<file>:<line>` (`<symbol>`). It's the orchestrator for <area>.
+Once you understand it, walk the <flow> to see the whole story.
+```

package/skills/codragraph-perf-hotspots.md ADDED

@@ -0,0 +1,132 @@
+---
+name: codragraph-perf-hotspots
+description: "Use to identify likely performance hot paths from the call graph — top-N callees of entry points, fan-out functions, recursive cycles, and where to focus profiling effort. NOT a profiler — a structural pre-screen that narrows where to actually run a profiler. Examples: \"find perf hotspots\", \"hot paths\", \"top callees\", \"where should I profile\", \"functions called by every handler\""
+---
+# Performance Hotspot Pre-Screen with CodraGraph
+## When to Use
+- "Where should I focus my profiler?"
+- "Which functions are on every request path?"
+- "Find fan-out points — functions called from many places."
+- "Are there recursive call cycles?"
+- "List top N callees of the API request handler."
+## What this skill IS and ISN'T
+CodraGraph builds a **static call graph** — it knows who *can* call
+whom, not who *did* call whom in production. So this skill identifies
+**structural hot path candidates**, not measured hotspots.
+Use it as a **pre-screen** for actual profiling: "given my structural
+hot path candidates, the profiler should focus here first." If you have
+profiler data (pprof, flamegraphs, OpenTelemetry traces), CodraGraph
+turns the names from that data into actionable callgraph context.
+## Workflow
+```
+1. Identify entry points:
+   codragraph_query({query: "request handler endpoint route main"})
+   → top entry-point candidates
+2. Rank callees graph-wide by depth-1 fan-in (distinct direct callers), top N:
+   codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(callee)
+     WHERE callee.label = 'Function'
+     RETURN callee.name, count(DISTINCT caller) AS in_degree
+     ORDER BY in_degree DESC
+     LIMIT 20
+   `})
+   → callees called from many places = on many paths = candidate hot spots
+3. Cross-cut with processes:
+   READ codragraph://repo/{name}/processes
+   → functions appearing in MANY processes are the strongest structural hot-path candidates
+4. Recursive cycle detection:
+   codragraph_cypher({query: `
+     MATCH path = (n)-[:CALLS*2..6]->(n)
+     RETURN n.name, length(path) AS cycle_len
+     ORDER BY cycle_len ASC
+     LIMIT 10
+   `})
+   → call cycles = possibly unbounded recursion = potential perf cliff under specific inputs
+5. With profiler output, translate names back to context:
+   For each top-N name from your flamegraph/pprof:
+     codragraph_context({name: "<sym>"})
+     → "this function is on N execution flows; called by M sites"
+```
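The fan-in ranking in step 2 amounts to counting distinct callers per callee. Below is a minimal TypeScript sketch of that aggregation, assuming the CALLS edges have already been exported as plain `{caller, callee}` pairs. `CallEdge` and `rankByFanIn` are hypothetical names for illustration, not CodraGraph functions.

```typescript
// One CALLS edge, assumed to be exported from the graph as plain data.
type CallEdge = { caller: string; callee: string };

// Count DISTINCT callers per callee, then sort descending, like the Cypher query above.
function rankByFanIn(edges: CallEdge[], limit = 20): Array<{ name: string; inDegree: number }> {
  const callersByCallee = new Map<string, Set<string>>();
  for (const { caller, callee } of edges) {
    const callers = callersByCallee.get(callee) ?? new Set<string>();
    callers.add(caller); // a Set ignores repeated calls from the same caller
    callersByCallee.set(callee, callers);
  }
  return Array.from(callersByCallee, ([name, callers]) => ({ name, inDegree: callers.size }))
    .sort((a, b) => b.inDegree - a.inDegree || a.name.localeCompare(b.name))
    .slice(0, limit);
}

// Illustrative data only.
const sampleEdges: CallEdge[] = [
  { caller: "listOrders", callee: "logRequest" },
  { caller: "getOrder", callee: "logRequest" },
  { caller: "createOrder", callee: "logRequest" },
  { caller: "listOrders", callee: "db.query" },
  { caller: "listOrders", callee: "db.query" }, // second call site, same caller
  { caller: "getOrder", callee: "db.query" },
  { caller: "createOrder", callee: "cache.get" },
];

const ranking = rankByFanIn(sampleEdges, 10);
```

Ties are broken by name here so the output is deterministic. The Cypher query does not specify a tie-break, so its order for equal in-degrees may differ.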
+## Hot-path heuristics
+| Signal | Meaning |
+|---|---|
+| Function called from > 10 distinct callers | High fan-in → optimize once, win everywhere |
+| Function appearing in > 5 processes | On many request paths → request-time critical |
+| Cycle of length 2-3 in CALLS edges | Mutual recursion — may overflow under depth |
+| `await` chain of length > 8 in one process | Sequential I/O — candidate for parallelism |
+| Function under cluster `database` / `network` | I/O-bound; profile network and DB calls separately |
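The cycle signal in the table can be checked on exported graph data with a bounded breadth-first search per function. This sketch assumes a hypothetical in-memory `CallGraph` and, like the `CALLS*2..6` pattern, ignores direct self-recursion (length 1). It reports one shortest cycle length per function, whereas the Cypher query returns one row per matching path.

```typescript
// Hypothetical in-memory call graph: caller name -> names it calls directly.
type CallGraph = Map<string, string[]>;

// Length of the shortest cycle (2..maxLen edges) that leaves `start` and returns to it, or null.
function shortestCycleThrough(graph: CallGraph, start: string, maxLen = 6): number | null {
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let hops = 1; hops <= maxLen; hops++) {
    const next: string[] = [];
    for (const fn of frontier) {
      for (const callee of graph.get(fn) ?? []) {
        if (callee === start) {
          // hops === 1 is direct self-recursion, which this pre-screen skips.
          if (hops >= 2) return hops;
          continue;
        }
        if (!seen.has(callee)) {
          seen.add(callee);
          next.push(callee);
        }
      }
    }
    frontier = next;
  }
  return null;
}

// All functions that sit on a cycle, shortest cycles first.
function findCycles(graph: CallGraph, maxLen = 6): Array<{ name: string; cycleLen: number }> {
  const found: Array<{ name: string; cycleLen: number }> = [];
  for (const name of Array.from(graph.keys())) {
    const cycleLen = shortestCycleThrough(graph, name, maxLen);
    if (cycleLen !== null) found.push({ name, cycleLen });
  }
  return found.sort((a, b) => a.cycleLen - b.cycleLen || a.name.localeCompare(b.name));
}

// Illustrative data only.
const cycleGraph: CallGraph = new Map<string, string[]>([
  ["processStep", ["enqueueRetry"]],
  ["enqueueRetry", ["processStep"]],
  ["parse", ["expand"]],
  ["expand", ["resolve"]],
  ["resolve", ["parse"]],
  ["walkTree", ["walkTree"]], // direct self-recursion, not reported
  ["main", ["processStep", "parse"]],
]);

const cycles = findCycles(cycleGraph);
```

A static cycle only shows that recursion is possible. Whether it is bounded depends on runtime conditions the call graph cannot see.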
+## Checklist
+```
+- [ ] List entry points (codragraph_query for handlers/routes/main)
+- [ ] Top-N callees by in-degree (Cypher)
+- [ ] Cross-reference with processes (functions in many flows = hot)
+- [ ] Cycle detection
+- [ ] If profiler data exists, codragraph_context for each top hot symbol
+- [ ] Prioritize: request-path + high in-degree + I/O-bound = first to optimize
+```
+## Example: "Find the hot paths in our HTTP handler chain"
+```
+1. codragraph_query({query: "express router handler"})
+   → 28 handler functions
+2. codragraph_cypher({query: `
+     MATCH (caller)-[:CALLS]->(callee)
+     WHERE callee.label = 'Function'
+     RETURN callee.name, count(DISTINCT caller) AS in_degree
+     ORDER BY in_degree DESC LIMIT 10
+   `})
+   → top in-degree callees:
+     - logRequest (28)         ← every handler calls this
+     - serializeJson (28)      ← every handler calls this
+     - getCurrentUser (22)
+     - db.query (18)           ← I/O bound
+     - cache.get (15)
+3. READ codragraph://repo/{name}/processes
+   → 5 processes; getCurrentUser appears in 4 of 5
+4. codragraph_context({name: "getCurrentUser"})
+   → 22 callers, calls db.query (cache miss path) and cache.get
+   → STRONGLY recommend: profile getCurrentUser first.
+   → Expected payoff per optimization ≈ 22 callers × cache-miss rate × DB latency.
+```
+## Output Format
+```markdown
+## Perf Pre-Screen: <scope>
+### Top hot-path candidates (structural)
+| Function | In-degree | Processes | I/O type | Note |
+|---|--:|--:|---|---|
+| getCurrentUser | 22 | 4 | DB + cache | profile first |
+| db.query | 18 | 5 | DB | hot for write paths |
+| serializeJson | 28 | 5 | CPU | every handler — micro-opt territory |
+### Cycles detected
+- `processStep ↔ enqueueRetry` — depth 2 cycle, unbounded under failure conditions
+### Profiler integration plan
+1. Collect pprof / flamegraph / OTEL trace under representative load
+2. Top-N hottest functions from profile → run codragraph_context on each
+3. Cross-reference with this static pre-screen
+4. Optimize where structural & measured hot paths overlap
+```

package/skills/codragraph-project-switcher.md ADDED

@@ -0,0 +1,116 @@
+---
+name: codragraph-project-switcher
+description: "Use when the user works across many parallel projects and needs to switch context, find which repo a symbol is in, list all indexed projects, or run a query against a specific repo without ambiguity. Examples: \"what projects am I working on\", \"switch to repo X\", \"which of my repos has function Y\", \"list my repositories\""
+---
+# Multi-Project / Vibecoding Context Switcher
+## When to Use
+- "What projects do I have indexed?"
+- "Switch to my `<project>` repo for this query."
+- "Which of my repos has the `<symbol>` function?"
+- "Run this query across all my repos."
+- Solo dev juggling 4+ side projects with different agents
+- Picking up a project after weeks away
+## Why CodraGraph helps here
+CodraGraph maintains a global registry of every indexed repo at
+`~/.codragraph/registry.json`. Every MCP tool accepts a `repo` parameter
+to disambiguate. So switching context isn't "open a new editor / cd into
+the project / re-orient your agent" — it's just passing `repo: "<name>"`
+to the next call. Combine with **groups** (multiple repos that share
+contracts) and you get cross-repo queries with one call.
+## Workflow
+```
+1. List every indexed repo:
+   codragraph_list_repos({})
+   → name, path, file count, last analyze time
+2. (Optional) List groups (sets of related repos):
+   codragraph_group_list({})
+   → group name + member repos
+3. Query against a specific repo:
+   codragraph_query({repo: "<name>", query: "<concept>"})
+   → answers come from that repo only
+4. Find which repo has a symbol you remember:
+   For each repo from list_repos:
+     codragraph_query({repo: "<name>", query: "<remembered name>"})
+   → first hit identifies the repo
+5. For cross-repo questions (group-mode):
+   codragraph_query({repo: "@<group>", query: "<concept>"})
+   → fans out across every group member, RRF-merges results
+```
+> If `list_repos` returns nothing, the user has no indexed projects yet.
+> Run `codragraph analyze` in each project once to register them.
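Step 5 says group-mode results are "RRF-merged". Reciprocal rank fusion scores each hit as the sum of `1 / (k + rank)` over the ranked lists it appears in. The sketch below is a generic illustration of that technique, not CodraGraph's implementation: the `Hit` shape, the `repo:symbol` key, and `k = 60` (a commonly used default) are all assumptions.

```typescript
// Assumed shape of one search hit (illustrative only).
interface Hit {
  repo: string;
  symbol: string;
}

interface ScoredHit extends Hit {
  score: number;
}

// Reciprocal rank fusion: each list contributes 1 / (k + rank) for every hit it contains.
function rrfMerge(rankedLists: Hit[][], k = 60): ScoredHit[] {
  const scores = new Map<string, ScoredHit>();
  for (const list of rankedLists) {
    list.forEach((hit, index) => {
      const key = `${hit.repo}:${hit.symbol}`;
      const entry = scores.get(key) ?? { ...hit, score: 0 };
      entry.score += 1 / (k + index + 1); // index is 0-based, rank is 1-based
      scores.set(key, entry);
    });
  }
  return Array.from(scores.values()).sort((a, b) => b.score - a.score);
}

// Illustrative per-repo result lists, each already ranked by its own repo.
const fromSaas: Hit[] = [
  { repo: "my-saas", symbol: "validateSession" },
  { repo: "my-saas", symbol: "refreshToken" },
];
const fromPortfolio: Hit[] = [{ repo: "portfolio", symbol: "loginForm" }];

const merged = rrfMerge([fromSaas, fromPortfolio]);
```

Because hits from different repos never share a key in this sketch, the merge effectively interleaves repos by rank: every repo's top hit outranks any repo's second hit.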
+## Checklist
+```
+- [ ] list_repos to see what's indexed
+- [ ] group_list to see related-repo groups
+- [ ] Pick the right repo (or group) for the question
+- [ ] Pass repo: "<name>" or repo: "@<group>" to subsequent calls
+- [ ] If a project isn't indexed yet, suggest the user run analyze in it
+- [ ] Mention staleness ("repo X last indexed 12 days ago — re-analyze?")
+```
+## Multi-Project Patterns
+| Situation | What to do |
+| --- | --- |
+| Switching from one solo project to another | `list_repos` → pick → all subsequent tools take `repo: "<name>"` |
+| "Did I solve this in another repo?" | `query` over each repo, look for matching symbols |
+| Shared library used by multiple repos | Define a group (group.yaml); use `repo: "@<group>"` |
+| Resuming a project after weeks | Check staleness (`list_repos` last-indexed timestamps); re-analyze if old |
+## Example: "Switch me to my SaaS side project and find the auth code"
+```
+1. codragraph_list_repos({})
+   → 4 indexed repos:
+     - codragraph (~/code/codragraph, 4325 symbols, indexed 2 hours ago)
+     - my-saas (~/projects/my-saas, 1180 symbols, indexed 3 days ago)
+     - portfolio (~/code/portfolio, 240 symbols, indexed 2 weeks ago)
+     - data-eda (~/notebooks/data-eda, 95 symbols, indexed 1 month ago)
+2. codragraph_query({repo: "my-saas", query: "authentication login session"})
+   → top 5 symbols, all from my-saas (the query is scoped to that repo)
+3. codragraph_context({repo: "my-saas", name: "validateSession"})
+   → callers: requireAuth, refreshToken (both in my-saas)
+4. (Optional) Reminder: my-saas was indexed 3 days ago — fine for navigation
+   but if I just made commits, run `cd ~/projects/my-saas && codragraph analyze`
+   to refresh.
+Switched. Subsequent calls pass repo: "my-saas".
+```
+## Output Format
+```markdown
+## Project Switch: → `<repo>`
+### Available projects
+1. **<repo-1>** — N symbols, last indexed Xh ago
+2. **<repo-2>** — M symbols, last indexed Yd ago (stale?)
+3. ...
+### Active for this conversation
+`<chosen-repo>` (path: `<path>`).
+### Staleness note
+Last indexed `<duration>` ago. Re-analyze if recent commits matter.
+### Quick links
+- `query` already scoped to this repo
+- For cross-repo: pass `repo: "@<group>"` instead
+```

package/skills/codragraph-security-audit.md ADDED

@@ -0,0 +1,144 @@
+---
+name: codragraph-security-audit
+description: "Use for security-focused codebase audits — finding auth bypass paths, missing input validation, secrets in code, untrusted-input flow, and routes that skip auth middleware. Examples: \"security audit\", \"find auth bypass\", \"unvalidated input\", \"untrusted data flow\", \"missing authentication\""
+---
+# Security Audit with CodraGraph
+## When to Use
+- "Audit auth coverage — which routes skip the auth middleware?"
+- "Find input that flows from request to SQL/template/exec without validation."
+- "Find hardcoded secrets / credentials in source."
+- "Trace untrusted-input flow for `<endpoint>`."
+- Pre-release security pass on a PR-heavy week.
+## Why CodraGraph helps here
+Static-analysis tools find pattern matches; CodraGraph adds the
+**call-graph**, so you can answer "*which* request handlers reach this
+unsafe sink?" rather than just "this sink is unsafe somewhere." Combined
+with `query` for sensitive identifier strings and `cypher` for structural
+filters, you can build an audit that's far more targeted than grep.
+## Workflow
+```
+1. Identify the boundary symbols:
+   codragraph_query({query: "request handler route controller"})
+   → all entry points where untrusted input arrives
+2. Identify the dangerous sinks:
+   codragraph_query({query: "exec spawn eval query raw_sql innerHTML"})
+   → places where untrusted input becomes harmful
+3. For each sink, walk the call graph upstream to build (handler → sink) pairs:
+   codragraph_impact({target: "<sink>", direction: "upstream"})
+   → which handlers REACH this sink?
+4. For each path, look for a validator on the way:
+   codragraph_context({name: "<handler>"})
+   → is `validate / sanitize / escape / parse` called between handler and sink?
+   → if not: candidate vulnerability
+5. Cross-check against secrets:
+   codragraph_cypher({query: "MATCH (n) WHERE n.body =~ '.*(?i)(api[_-]?key|secret|password|token)[ ]*=[ ]*[\\'\"][a-zA-Z0-9]{16,}.*' RETURN n"})
+   → suspicious literals that look like real keys
+```
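The secret scan in step 5 can also be run over plain source text. The sketch below is a rough TypeScript equivalent, assuming symbol bodies are available as strings. The inline `(?i)` flag is replaced by the `i` flag on the regex, and `SourceUnit` and `findSecretCandidates` are hypothetical names for illustration.

```typescript
// Assumed shape: one symbol or file with its source text (illustrative only).
interface SourceUnit {
  id: string;
  body: string;
}

// Same idea as the Cypher regex: a sensitive-looking name assigned a quoted
// literal of 16 or more alphanumeric characters. Case-insensitive via the `i` flag.
const SECRET_ASSIGNMENT = /(api[_-]?key|secret|password|token)[ ]*=[ ]*['"][a-zA-Z0-9]{16,}/i;

// Returns the ids of units whose body contains a suspicious literal assignment.
function findSecretCandidates(units: SourceUnit[]): string[] {
  return units.filter((unit) => SECRET_ASSIGNMENT.test(unit.body)).map((unit) => unit.id);
}

// Illustrative data only; the literal below is a made-up placeholder.
const units: SourceUnit[] = [
  { id: "config.ts", body: 'const apiKey = "abcd1234efgh5678ijkl";' },
  { id: "env.ts", body: "const token = process.env.TOKEN;" },
  { id: "fixture.ts", body: 'const password = "hunter2";' },
];

const candidates = findSecretCandidates(units);
```

Only `config.ts` matches here: the environment lookup has no quoted literal, and the short password falls under the 16-character threshold. Matches are candidates to review, since test fixtures and examples trip the same pattern.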
+## Audit Patterns
+| Pattern | What to look for | CodraGraph approach |
+|---|---|---|
+| Auth bypass | Routes not wrapped by `requireAuth` middleware | `query` for routes; check `context` for middleware in callers |
+| SQL injection | Raw query string built from request input | `query` SQL literals → `impact` upstream → flag handlers |
+| XSS | Untrusted input rendered without escape | `query` `innerHTML` / `dangerouslySetInnerHTML` → impact upstream |
+| Command injection | `exec` / `spawn` with concatenated input | `query` exec/spawn → impact upstream → check for shell escape |
+| Open redirect | Redirect URL from request | `query` `redirect` / `Location:` → trace input source |
+| Hardcoded secrets | API keys in source | `cypher` regex over `n.body` |
+| Missing CSRF | State-changing routes without CSRF middleware | `query` POST / PUT / DELETE handlers → check middleware chain |
+## Why "missing validator" is a great query
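+The graph cleanly shows the call path: `handler → … → sink`. Validators
+appear in that chain or they don't. If `validateInput` / `sanitize` /
+`escape` is NOT in the path between a handler and a sink, that's a
+concrete, reviewable candidate gap, not a guess. Confirm each hit with
+`context`: a validator the handler calls as a sibling of the sink call
+sits off the path and will not be seen by this filter.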
+```
+codragraph_cypher({query: `
+  // Find handler→sink paths that don't pass through ANY validator
+  MATCH path = (handler {label: 'Function'})-[:CALLS*1..6]->(sink {label: 'Function'})
+  WHERE handler.isEntryPoint = true
+    AND sink.name IN ['exec', 'query', 'innerHTML', 'eval']
+    AND NONE(n IN nodes(path) WHERE n.name STARTS WITH 'validate'
+                                  OR n.name STARTS WITH 'sanitize'
+                                  OR n.name STARTS WITH 'escape')
+  RETURN handler.name, sink.name, length(path) AS hops
+  ORDER BY hops ASC
+`})
+```
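The path filter above can be sketched as a depth-bounded walk that refuses to step through any validator-named function. This is a minimal TypeScript illustration over a hypothetical in-memory `CallGraph`; the prefixes and the 6-hop bound mirror the query, everything else (`unvalidatedPaths`, the sample graph) is assumed for the example.

```typescript
// Hypothetical in-memory call graph: caller name -> names it calls directly.
type CallGraph = Map<string, string[]>;

// Name prefixes treated as validators, mirroring the Cypher filter above.
const VALIDATOR_PREFIXES = ["validate", "sanitize", "escape"];
const isValidator = (name: string): boolean =>
  VALIDATOR_PREFIXES.some((prefix) => name.startsWith(prefix));

// All simple handler -> sink call paths (at most maxHops edges) with no validator ON the path.
function unvalidatedPaths(
  graph: CallGraph,
  handler: string,
  sinks: Set<string>,
  maxHops = 6,
): string[][] {
  const results: string[][] = [];
  const walk = (current: string, path: string[]): void => {
    for (const callee of graph.get(current) ?? []) {
      // Skip validators (path is "covered") and nodes already on the path (no loops).
      if (isValidator(callee) || path.includes(callee)) continue;
      const extended = [...path, callee];
      if (sinks.has(callee)) results.push(extended);
      if (extended.length - 1 < maxHops) walk(callee, extended);
    }
  };
  if (!isValidator(handler)) walk(handler, [handler]);
  return results.sort((a, b) => a.length - b.length);
}

// Illustrative data only.
const routes: CallGraph = new Map<string, string[]>([
  ["debugDumpUser", ["loadUser"]],
  ["loadUser", ["query"]],
  ["createUser", ["validateAndSave"]],
  ["validateAndSave", ["query"]],
  ["updateUser", ["validateInput", "query"]], // validator is a sibling, not on the path
]);
const dangerousSinks = new Set<string>(["exec", "query", "innerHTML", "eval"]);

const flagged = unvalidatedPaths(routes, "debugDumpUser", dangerousSinks);
```

In the sample, `createUser` is not flagged because its only route to `query` passes through `validateAndSave`. `updateUser` is flagged even though it calls `validateInput`, because that call is a sibling of the sink call rather than a node on the path, so flagged handlers still need a manual look.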
+## Checklist
+```
+- [ ] Listed entry-point handlers (query for "handler"/"route"/"controller")
+- [ ] Listed dangerous sinks (exec/query/eval/innerHTML/raw)
+- [ ] Built handler→sink table; flagged paths missing validators
+- [ ] Cypher scan for hardcoded-secret literal patterns
+- [ ] codragraph_context on each flagged handler — confirm coverage / propose fix
+- [ ] Document findings as severity-tagged report
+```
+## Example: "Audit which routes skip our requireAuth middleware"
+```
+1. codragraph_query({query: "Express router get post put delete handler"})
+   → 28 route handlers across 6 router files
+2. For each handler, codragraph_context({name: "<handler>"}):
+   → check that callers include `requireAuth` (a known middleware function)
+3. codragraph_cypher({
+     query: `MATCH (handler {isEntryPoint: true})-[:CALLS]->()
+              WHERE NOT EXISTS {
+                MATCH (handler)<-[:CALLS]-(mw {name: 'requireAuth'})
+              }
+              RETURN DISTINCT handler.name, handler.filePath`
+   })
+   → 4 handlers have no requireAuth caller:
+     - publicHealthCheck (intentional ✓)
+     - signup, login (intentional ✓ — these CREATE auth)
+     - debugDumpUser ⚠ (NOT intentional — leaks user data)
+4. Findings report:
+   - HIGH: debugDumpUser at src/routes/debug.ts:42
+     reaches db.query → returns full user record. No auth.
+     Fix: wrap with requireAuth or remove from production builds.
+```
+## Output Format
+```markdown
+## Security Audit: <scope>
+### Summary
+- N handlers audited
+- M handler→sink paths inspected
+- K validator-missing paths flagged
+- Severity: 0 CRITICAL, 1 HIGH, 2 MEDIUM
+### Findings
+#### HIGH — debugDumpUser at src/routes/debug.ts:42
+- Reachable without `requireAuth`
+- Calls `db.query` with no input validation
+- Returns full user record
+- **Fix:** wrap with auth middleware OR strip from production builds
+#### MEDIUM — …
+### Hardcoded-secret scan
+- 0 high-confidence matches
+- 1 false-positive: example token in test fixture
+```