@graphpilot-oss/graphpilot 0.0.1 → 1.0.0

Files changed (123)
  1. package/CHANGELOG.md +72 -126
  2. package/README.md +290 -102
  3. package/dist/cli.js +41 -1
  4. package/dist/cli.js.map +1 -1
  5. package/dist/edges.js +22 -11
  6. package/dist/edges.js.map +1 -1
  7. package/dist/indexer.js +3 -3
  8. package/dist/indexer.js.map +1 -1
  9. package/dist/init.d.ts +28 -0
  10. package/dist/init.js +112 -0
  11. package/dist/init.js.map +1 -0
  12. package/dist/interactions.d.ts +5 -4
  13. package/dist/interactions.js +0 -0
  14. package/dist/interactions.js.map +1 -1
  15. package/dist/mcp.js +119 -90
  16. package/dist/mcp.js.map +1 -1
  17. package/dist/repo-resolve.d.ts +47 -0
  18. package/dist/repo-resolve.js +195 -0
  19. package/dist/repo-resolve.js.map +1 -0
  20. package/dist/storage.js +10 -1
  21. package/dist/storage.js.map +1 -1
  22. package/dist/symbols.js +26 -2
  23. package/dist/symbols.js.map +1 -1
  24. package/dist/validation.js +30 -4
  25. package/dist/validation.js.map +1 -1
  26. package/dist/validators.d.ts +1 -5
  27. package/dist/validators.js +0 -11
  28. package/dist/validators.js.map +1 -1
  29. package/dist/watcher.d.ts +10 -0
  30. package/dist/watcher.js +70 -7
  31. package/dist/watcher.js.map +1 -1
  32. package/examples/README.md +105 -0
  33. package/examples/claude-code/README.md +125 -0
  34. package/examples/claude-code/claude-routing.md +102 -0
  35. package/examples/claude-code/claude_config.json +8 -0
  36. package/examples/cline/.clinerules +39 -0
  37. package/examples/cline/README.md +104 -0
  38. package/examples/cline/cline_mcp_settings.json +10 -0
  39. package/examples/continue/.continuerules +39 -0
  40. package/examples/continue/README.md +98 -0
  41. package/examples/continue/config.json +13 -0
  42. package/examples/cursor/.cursorrules +39 -0
  43. package/examples/cursor/README.md +98 -0
  44. package/examples/cursor/mcp.json +11 -0
  45. package/examples/windsurf/.windsurfrules +39 -0
  46. package/examples/windsurf/README.md +85 -0
  47. package/examples/windsurf/mcp_config.json +8 -0
  48. package/package.json +14 -4
  49. package/.editorconfig +0 -15
  50. package/.github/CODEOWNERS +0 -22
  51. package/.github/FUNDING.yml +0 -1
  52. package/.github/ISSUE_TEMPLATE/bug_report.md +0 -33
  53. package/.github/ISSUE_TEMPLATE/config.yml +0 -5
  54. package/.github/ISSUE_TEMPLATE/feature_request.md +0 -23
  55. package/.github/PULL_REQUEST_TEMPLATE.md +0 -19
  56. package/.github/dependabot.yml +0 -15
  57. package/.github/workflows/ci.yml +0 -62
  58. package/.github/workflows/release.yml +0 -50
  59. package/.prettierignore +0 -19
  60. package/.prettierrc.json +0 -20
  61. package/CODE_OF_CONDUCT.md +0 -83
  62. package/CONTRIBUTING.md +0 -111
  63. package/bench/README.md +0 -544
  64. package/bench/results/agent-tier-2026-05-22.md +0 -28
  65. package/bench/results/agent-tier-summary.md +0 -44
  66. package/bench/results/baseline-tier-2026-05-22.md +0 -23
  67. package/bench/results/baseline.json +0 -810
  68. package/bench/results/baseline.md +0 -28
  69. package/bench/run-agent-tier-automated.ts +0 -234
  70. package/bench/run-agent-tier.md +0 -125
  71. package/bench/run-baseline-tier.ts +0 -200
  72. package/bench/run.ts +0 -210
  73. package/bench/runner-baseline.ts +0 -177
  74. package/bench/runner-graphpilot.ts +0 -131
  75. package/bench/score-agent-tier.ts +0 -191
  76. package/bench/score.ts +0 -59
  77. package/bench/tasks.ts +0 -236
  78. package/dist/provenance.d.ts +0 -74
  79. package/dist/provenance.js +0 -95
  80. package/dist/provenance.js.map +0 -1
  81. package/docs/architecture.md +0 -311
  82. package/docs/limitations.md +0 -156
  83. package/docs/mcp-setup.md +0 -231
  84. package/docs/quickstart.md +0 -202
  85. package/eslint.config.js +0 -148
  86. package/lefthook.yml +0 -81
  87. package/pnpm-workspace.yaml +0 -6
  88. package/scripts/smoke-stdio.mjs +0 -97
  89. package/src/cli.ts +0 -171
  90. package/src/edges.ts +0 -202
  91. package/src/git.ts +0 -255
  92. package/src/graph-schema.ts +0 -229
  93. package/src/impact.ts +0 -218
  94. package/src/indexer.ts +0 -152
  95. package/src/interactions.ts +0 -0
  96. package/src/mcp.ts +0 -652
  97. package/src/parser.ts +0 -138
  98. package/src/provenance.ts +0 -115
  99. package/src/query.ts +0 -148
  100. package/src/redact.ts +0 -122
  101. package/src/storage.ts +0 -115
  102. package/src/symbols.ts +0 -173
  103. package/src/validation.ts +0 -69
  104. package/src/validators.ts +0 -253
  105. package/src/watcher.ts +0 -383
  106. package/tests/edges.test.ts +0 -175
  107. package/tests/fixtures/sample.ts +0 -32
  108. package/tests/git.test.ts +0 -303
  109. package/tests/graph-schema.test.ts +0 -321
  110. package/tests/impact.test.ts +0 -454
  111. package/tests/interactions.test.ts +0 -180
  112. package/tests/lint-policy.test.ts +0 -106
  113. package/tests/mcp-stdio.test.ts +0 -171
  114. package/tests/mcp.test.ts +0 -335
  115. package/tests/parser.test.ts +0 -31
  116. package/tests/provenance.test.ts +0 -132
  117. package/tests/query.test.ts +0 -160
  118. package/tests/redact.test.ts +0 -167
  119. package/tests/security.test.ts +0 -144
  120. package/tests/symbols.test.ts +0 -78
  121. package/tests/validators.test.ts +0 -193
  122. package/tests/watcher.test.ts +0 -250
  123. package/tsconfig.json +0 -18
package/CHANGELOG.md CHANGED
@@ -2,137 +2,83 @@
2
2
 
3
3
  All notable changes to this project will be documented in this file.
4
4
 
5
- The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
- and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
6
 
8
7
  ## [Unreleased]
9
8
 
10
9
  ### Added
11
10
 
12
- - **Evidence anchors** on every MCP tool response. Symbol records and call
13
- edges now carry a `file:line @ <short-sha>` provenance tag inline so the
14
- agent can quote a verifiable reference. Backed by `src/provenance.ts` and
15
- pure-fs git helpers in `src/git.ts` (no `child_process` — T6-safe). The
16
- graph schema gained two optional fields, `indexedSha` and `indexedBranch`,
17
- populated at index time when the root is inside a git worktree. Old graphs
18
- load unchanged.
19
- - **Differential impact:** `gp_impact` now accepts `since: <commit|tag|branch>`.
20
- When set, the returned callers (direct + transitive) are filtered to files
21
- changed between that ref and HEAD — ideal for PR-scoped refactor review
22
- ("of every caller of X, which ones does my branch actually touch?"). Diff
23
- computation runs through `isomorphic-git`; pure JS, no shell-out.
24
- - **Worktree-scoped indexing.** `graphpilot index ./src/feature` and the
25
- `gp_index` MCP tool auto-resolve to the git worktree top when called from
26
- a subdirectory, so two `git worktree add`-ed branches naturally produce
27
- two separate indexes. Pass `--no-worktree` to opt out and index a subdir
28
- directly.
29
- - Repositioned the README around the "refactor-safe code graph" framing:
30
- evidence-backed, branch-aware, worktree-native.
31
- - Initial project scaffold (Node.js + TypeScript)
32
- - Tree-sitter-based parser for TS/TSX/JS/JSX
33
- - Symbol extraction for functions, classes, methods, interfaces, type aliases, enums
34
- - Directory indexer with sensible default ignores (node_modules, dist, .d.ts, etc.)
35
- - JSON storage at `~/.graphpilot/<repo-id>/graph.json`
36
- - CLI: `graphpilot index <path>`, `graphpilot status <path>`, `graphpilot mcp`
37
- - Call-edge extraction (`gp_callers` precursor): captures every call/new
38
- expression inside a function body, attributes it to the immediate enclosing
39
- function, and resolves the target across the indexed symbol table.
40
- - Outputs include both resolved (`toId` set) and unresolved (`toName` only) edges
41
- so the agent can still see stdlib/external calls.
42
- - Query layer (`GraphIndex`): pre-computed lookup tables for findByName,
43
- findById, callers, callees. Sub-millisecond lookups on indexed repos.
44
- - MCP server over stdio (`@modelcontextprotocol/sdk`). Tool surface:
45
- - `gp_stats` - index health probe
46
- - `gp_index` — re-index a repo from the agent
47
- - `gp_recall` — find symbols by name (exact CI by default, substring opt-in)
48
- - `gp_callers` — list callers or callees (with direction param)
49
- - `gp_impact` - blast-radius analysis: direct callers, transitive
50
- callers (BFS, depth 1–5), tests likely affected (heuristic on file
51
- paths), and a public-API flag derived from `exported`. Answers "what
52
- breaks if I rename X?" in a single tool call. Pure-function core in
53
- `src/impact.ts`; cycle-safe; per-level cap with `truncated` flag.
54
- - Watch mode: `graphpilot watch <path>` keeps the index fresh as you
55
- edit. Uses `chokidar` (fsevents/inotify/RDCW) with editor-save
56
- debouncing. Each file save triggers an incremental update: re-parse
57
- one file, drop its old contribution, re-resolve edges across the
58
- whole symbol table, save atomically. Real-world 3–5 ms per save on
59
- small repos. Updates serialize through an internal chain so chokidar
60
- bursts can't race into a torn graph. Storage writes are atomic
61
- (`.tmp` + rename) so a crash never leaves a half-written graph.json.
62
- CLI runs until SIGINT.
63
- - Reproducible benchmark (Tier A): `pnpm bench` runs 10 hand-curated
64
- structural tasks against GraphPilot's own codebase (the corpus) and
65
- scores precision/recall/F1 + bytes processed vs a grep-simulator
66
- baseline. Anyone with `pnpm install` can reproduce. First run:
67
- **F1 0.89 vs grep 0.42, 99.9 % byte reduction (721 B vs 528 KB),
68
- 7 wins / 2 ties / 1 expected loss** (the string-literal task,
69
- deliberately included as the honest "grep wins" case). Spec for the
70
- agent-eval Tier B is in `bench/run-agent-tier.md`.
71
- - Contributor Covenant 2.1 code of conduct (closes GitHub Community
72
- Standards check). Reporting email is `codewithakki@gmail.com`;
73
-
74
- ### Dev workflow
75
-
76
- - Pre-commit hooks via `lefthook` (added 2026-05-20):
77
- - `pre-commit`: `pnpm typecheck` + ESLint + `prettier --check` on
78
- staged source files (parallel). Hits sub-second on small changes.
79
- - `commit-msg`: Conventional Commits regex enforcement. Bad messages
80
- get a friendly error pointing at the format spec. Allows
81
- `Merge`/`Revert`/`fixup!`/`squash!` for ergonomics.
82
- - `pre-push`: full `pnpm test`. Stops broken builds from reaching
83
- the remote.
84
- - Bypass for emergencies: `LEFTHOOK=0 git commit` or
85
- `LEFTHOOK_EXCLUDE=<jobname> git commit`. Installed automatically by
86
- `pnpm install`; no manual `lefthook install` required.
87
- - Prettier configured (added 2026-05-20): `.prettierrc.json` + scripts
88
- `pnpm format` / `pnpm format:check`. Single quotes, trailing commas,
89
- 100-col print width, LF endings. Normalized 31 files in one mechanical
90
- pass; wired into `pnpm check` and the lefthook pre-commit so future
91
- drift gets blocked.
92
- - Hand-rolled input validation for every MCP tool (no deps). Rejects unknown
93
- fields, type errors, out-of-range numbers, oversize strings.
94
- - Interaction log (`~/.graphpilot/<repo-id>/interactions.jsonl`): every tool
95
- call recorded locally with sanitized inputs. Enables future ranking /
96
- personalization. Disabled via `GRAPHPILOT_NO_LOG=1`. Mode 0600.
11
+ - **MCP workspace roots** - on connect, GraphPilot calls `roots/list` (when the client supports it) and uses workspace folders as the default repo path for tool calls.
12
+ - **Default path discovery** — when `path` is omitted, resolution tries `GRAPHPILOT_ROOT`, MCP roots, walking parents of `cwd`, a unique index under `~/.graphpilot`, then `cwd`. Errors list known indexes on the machine.
13
+ - Project-level **`.cursor/mcp.json`** template with `${workspaceFolder}` for local development.
14
+
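The default-path resolution order described in the entry above can be sketched as a pure function. This is only an illustration under stated assumptions: the names `ResolveInputs`, `resolveDefaultPath`, and `ancestors` are hypothetical, the parent walk is assumed to look for a directory that already has an index, and the first MCP root is assumed to win when several are reported. None of these details is confirmed by the diff.

```typescript
// Hypothetical input shape; the real resolver's types are not shown in this diff.
interface ResolveInputs {
  envRoot?: string;       // value of GRAPHPILOT_ROOT, if set
  mcpRoots: string[];     // workspace folders reported by roots/list
  cwd: string;            // current working directory (POSIX-style path)
  indexedRoots: string[]; // repo roots that already have an index under ~/.graphpilot
}

interface Resolved {
  path: string;
  source: "env" | "mcp-roots" | "cwd-parent" | "unique-index" | "cwd";
}

// Return dir and each of its parents, nearest first, ending at "/".
function ancestors(dir: string): string[] {
  const parts = dir.split("/").filter(Boolean);
  const out: string[] = [];
  for (let i = parts.length; i > 0; i--) {
    out.push("/" + parts.slice(0, i).join("/"));
  }
  out.push("/");
  return out;
}

function resolveDefaultPath(input: ResolveInputs): Resolved {
  // 1. Explicit environment override.
  if (input.envRoot) return { path: input.envRoot, source: "env" };

  // 2. MCP workspace roots (assumption: the first root is used).
  const [firstRoot] = input.mcpRoots;
  if (firstRoot !== undefined) return { path: firstRoot, source: "mcp-roots" };

  // 3. Walk cwd and its parents (assumption: looking for an already-indexed root).
  const known = new Set(input.indexedRoots);
  const hit = ancestors(input.cwd).find((d) => known.has(d));
  if (hit !== undefined) return { path: hit, source: "cwd-parent" };

  // 4. Exactly one index on the machine: use it.
  const [onlyIndex] = input.indexedRoots;
  if (input.indexedRoots.length === 1 && onlyIndex !== undefined) {
    return { path: onlyIndex, source: "unique-index" };
  }

  // 5. Fall back to cwd.
  return { path: input.cwd, source: "cwd" };
}
```

Keeping the function free of filesystem and environment access makes the precedence easy to test; a caller would gather the inputs and pass them in.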
15
+ ### Changed
16
+
17
+ - MCP tool schemas document the new default path behaviour instead of “Default: cwd”.
18
+
19
+ ## [0.1.0] - 2026-05-23
20
+
21
+ Initial public release. GraphPilot is a local-first, refactor-safe code graph for coding agents over the Model Context Protocol.
22
+
23
+ ### Added
24
+
25
+ #### Core engine
26
+
27
+ - Tree-sitter parser for **TypeScript, TSX, JavaScript, JSX** with a 5 MB per-file size cap and an iterative AST walk (safe on deeply-nested generated code).
28
+ - Symbol extraction for functions, classes, methods, interfaces, type aliases, and enums. Stable symbol ids of the form `<file>#<parent>.<name>@<line>`.
29
+ - Call-edge extraction with a two-pass name resolver — same-file first, then first global match. Unresolved external calls (`JSON.parse`, stdlib, third-party) keep their `toName` so the agent still sees the call site.
30
+ - Directory indexer with sensible default ignores (`node_modules`, `dist`, `build`, `.git`, `coverage`, `.next`, `.nuxt`, `.cache`, `out`, `*.d.ts`), a 50 000-file hard cap, and symlink-escape protection.
31
+ - Query layer (`GraphIndex`) with four pre-computed maps (`byName`, `byId`, `callers`, `callees`). Sub-millisecond lookups on indexed repos.
32
+ - JSON storage at `~/.graphpilot/<repo-id>/graph.json` (mode `0600`) with versioned schema and atomic writes.
33
+ - Worktree-aware indexing: subdirectory invocations auto-resolve to the git worktree top, so two `git worktree add`-ed branches naturally produce two separate indexes. Opt out with `--no-worktree`.
34
+
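To make the symbol-id format and the two-pass resolver from the list above concrete, here is a minimal sketch. The type and function names (`SymbolRecord`, `symbolId`, `resolveCall`) are hypothetical, and the rendering of a top-level symbol with no parent is an assumption; only the `<file>#<parent>.<name>@<line>` shape and the "same-file first, then first global match" order come from the changelog.

```typescript
// Hypothetical record shape for illustration only.
interface SymbolRecord {
  file: string;
  parent?: string; // enclosing class or namespace, if any
  name: string;
  line: number;
}

// Build an id of the form <file>#<parent>.<name>@<line>.
// Assumption: symbols without a parent omit the "<parent>." part.
function symbolId(s: SymbolRecord): string {
  const qualified = s.parent ? `${s.parent}.${s.name}` : s.name;
  return `${s.file}#${qualified}@${s.line}`;
}

interface CallEdge {
  fromId: string;
  toName: string;
  toId?: string; // absent when the callee could not be resolved
}

// Two-pass resolution: prefer a symbol in the caller's file,
// otherwise take the first match anywhere in the symbol table.
function resolveCall(
  fromId: string,
  callerFile: string,
  toName: string,
  symbols: SymbolRecord[],
): CallEdge {
  const sameFile = symbols.find((s) => s.file === callerFile && s.name === toName);
  const target = sameFile ?? symbols.find((s) => s.name === toName);
  // Unresolved calls keep toName so the call site stays visible.
  return target ? { fromId, toName, toId: symbolId(target) } : { fromId, toName };
}
```

A "first global match" rule is cheap but can pick the wrong target when two files define the same name, which is presumably why the same-file pass runs first.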
35
+ #### MCP server (four tools)
36
+
37
+ - `gp_index` - re-index a repo from inside the agent.
38
+ - `gp_recall` - find symbols by name (exact case-insensitive by default, substring opt-in).
39
+ - `gp_callers` - list callers or callees of a symbol, with a `direction` parameter.
40
+ - `gp_impact` - blast-radius analysis: direct + transitive callers (BFS, depth 1–5), tests likely affected, public-API flag, summary stats. Accepts `since: <commit|tag|branch>` for PR-scoped impact via `isomorphic-git` (pure JS, no shell-out).
41
+
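The transitive-caller search that `gp_impact` describes (BFS with a depth limit) might look roughly like the sketch below. `CallersMap` and `transitiveCallers` are hypothetical names, and the sketch omits the per-level cap, the test heuristic, and the `since` filter; it only shows a depth-bounded, cycle-safe breadth-first walk over a callers map.

```typescript
// symbol id -> ids of its direct callers (hypothetical shape)
type CallersMap = Map<string, string[]>;

// Return every transitive caller of `target` with the depth at which it was
// first reached. Depth 1 means a direct caller.
function transitiveCallers(
  target: string,
  callers: CallersMap,
  maxDepth: number,
): Map<string, number> {
  const depthOf = new Map<string, number>();
  const seen = new Set<string>([target]); // guards against cycles
  let frontier: string[] = [target];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const caller of callers.get(id) ?? []) {
        if (seen.has(caller)) continue; // already visited at a shallower depth
        seen.add(caller);
        depthOf.set(caller, depth);
        next.push(caller);
      }
    }
    frontier = next;
  }
  return depthOf;
}
```

Because each symbol is recorded the first time it is seen, the reported depth is the shortest call distance to the target, and recursive call chains terminate.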
42
+ #### Evidence anchors
43
+
44
+ Every MCP tool response includes `file:line @ <short-sha>` provenance on each symbol and call edge, so the agent can quote a verifiable reference. Old graphs without `indexedSha` continue to load (the field is optional in the schema).
45
+
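As an illustration of the `file:line @ <short-sha>` anchor format described above, a formatter could be as small as the following. The `Evidence` type, the `formatAnchor` name, the 7-character short SHA, and the behaviour when `indexedSha` is missing are all assumptions made for this sketch; the changelog only states the output format and that `indexedSha` is optional.

```typescript
// Hypothetical shape; only the optional indexedSha field is named in the changelog.
interface Evidence {
  file: string;
  line: number;
  indexedSha?: string; // full commit SHA recorded at index time, if any
}

// Render "file:line @ <short-sha>".
// Assumption: graphs indexed outside git fall back to plain "file:line".
function formatAnchor(e: Evidence, shortLength = 7): string {
  const location = `${e.file}:${e.line}`;
  return e.indexedSha
    ? `${location} @ ${e.indexedSha.slice(0, shortLength)}`
    : location;
}
```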
46
+ #### Watch mode
47
+
48
+ `graphpilot watch <path>` keeps the index fresh as you edit. Uses `chokidar` with editor-save debouncing; each save triggers an incremental update (re-parse one file, re-resolve edges, atomic save) at 3–10 ms per save on small repos. Updates serialize through an internal chain to prevent torn graphs during chokidar bursts.
49
+
50
+ #### CLI
51
+
52
+ - `graphpilot index <path>` - index a repo
53
+ - `graphpilot status <path>` - show what's indexed
54
+ - `graphpilot watch <path>` — keep the index fresh
55
+ - `graphpilot mcp` - start the MCP server over stdio
97
56
 
98
57
  ### Security
99
58
 
100
- - 5 MB per-file size cap (`MAX_FILE_BYTES`)
101
- - Iterative `walk()` (no stack overflow on deep ASTs)
102
- - Symlink-escape protection: `followSymbolicLinks: false` + realpath bounds check
103
- - 50,000 file hard cap per index (`MAX_FILES_PER_INDEX`)
104
- - Refuses to index `/`, `/etc`, `~`, `/Users`, Windows system paths, and macOS
105
- resolved aliases (`/private/etc`, etc.)
106
- - Graph dir/file written with mode `0o700` / `0o600`
107
- - Pattern-based secret redaction at signature-extraction time
108
- (`src/redact.ts`): OpenAI/Anthropic `sk-`, GitHub `ghp_`/`ghs_`, AWS
109
- `AKIA`, JWTs, PEM private-key headers, Slack tokens, Stripe live keys,
110
- plus a defensive long-token catch-all.
111
- - Schema validation on graph.json load (`src/graph-schema.ts`): strict
112
- shape check, version enforcement, per-entry sanitization (control chars
113
- stripped, length-capped), and recomputed counts (attacker-supplied
114
- symbol/edge counts are ignored). Defends against tampered or corrupt
115
- files; falls back to "no index" on rejection.
116
- - ESLint policy enforcing "no network in `src/`" at the build gate
117
- (`eslint.config.js`): bans `http`, `https`, `undici`, `axios`,
118
- `node-fetch`, `cross-fetch`, `got`, `request`, `superagent`, plus
119
- `child_process`. Looser rules for `tests/` and `scripts/`. CI runs
120
- `pnpm lint` as a gating job; meta-tests in `tests/lint-policy.test.ts`
121
- prove the rule fires on every banned import (catches rule-rot in
122
- future PRs).
123
-
124
- ### Pending (for v0.1.0 launch)
125
-
126
- - Tier-B agent benchmark — Claude-Code-in-the-loop scoring on real
127
- refactor tasks; produces the "X/10 vs Y/10" headline. Spec in
128
- [`bench/run-agent-tier.md`](bench/run-agent-tier.md).
129
- - `release.yml` workflow + npm publish on tag.
130
- - CodeQL + dependency-review workflows.
131
- - Branch protection on `main` (GitHub web UI step).
132
- - Hero GIF + 30-second demo video.
133
- - Domain + landing page.
134
- - MCP registry submissions (Glama, PulseMCP, mcpservers.org,
135
- `awesome-mcp-servers`).
136
- - `examples/<each-client>/` scaffolds.
137
-
138
- [Unreleased]: https://github.com/graphpilot-oss/graphpilot/commits/main
59
+ - Hand-rolled input validators on every MCP tool (zero deps): reject unknown fields, type-check every field, range-check numbers, length-cap strings, strict enums.
60
+ - Refuses to index dangerous roots: `/`, `/etc`, `/var`, `~`, `/Users`, `/home`, Windows system paths, macOS-resolved aliases (`/private/etc`, etc.).
61
+ - Symlink-escape protection: `followSymbolicLinks: false` + per-file realpath bounds check.
62
+ - File-size cap (5 MB) and file-count cap (50 000) per index.
63
+ - Storage permissions: directories `0o700`, files `0o600`.
64
+ - Pattern-based secret redaction at signature-extraction time (`src/redact.ts`): OpenAI/Anthropic `sk-`, GitHub `ghp_`/`ghs_`, AWS `AKIA`, JWTs, PEM private-key headers, Slack tokens, Stripe live keys, plus a defensive long-token catch-all.
65
+ - Schema validation on `graph.json` load: strict shape check, version enforcement, per-entry sanitization, recomputed counts (attacker-supplied counts are ignored). Falls back to "no index" on rejection.
66
+ - ESLint policy enforcing **no network in `src/`** at the build gate. Bans `http`, `https`, `undici`, `axios`, `node-fetch`, `cross-fetch`, `got`, `request`, `superagent`, plus `child_process`. Meta-tests in `tests/lint-policy.test.ts` prove the rule fires on every banned import.
67
+
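The hand-rolled validator rules in the first Security item (reject unknown fields, type-check, range-check, length-cap, strict enums) can be sketched without dependencies as below. The field names `name`, `depth`, and `direction`, their defaults, and the 256-character cap are hypothetical; only the 1 to 5 depth range and the existence of a `direction` parameter are taken from the changelog.

```typescript
// Hypothetical tool input, loosely modelled on a callers/impact query.
interface QueryInput {
  name: string;
  depth: number;
  direction: "callers" | "callees";
}

type Validation =
  | { ok: true; value: QueryInput }
  | { ok: false; error: string };

const ALLOWED_FIELDS = new Set(["name", "depth", "direction"]);
const MAX_NAME_LENGTH = 256; // assumed cap, not taken from the source

function validateInput(raw: unknown): Validation {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return { ok: false, error: "input must be an object" };
  }
  const obj = raw as Record<string, unknown>;

  // Reject unknown fields rather than silently ignoring them.
  for (const key of Object.keys(obj)) {
    if (!ALLOWED_FIELDS.has(key)) return { ok: false, error: `unknown field: ${key}` };
  }

  // Type check and length cap on the string field.
  const name = obj["name"];
  if (typeof name !== "string" || name.length === 0) {
    return { ok: false, error: "name must be a non-empty string" };
  }
  if (name.length > MAX_NAME_LENGTH) return { ok: false, error: "name too long" };

  // Range check on the numeric field (default 1 is an assumption).
  const depth = obj["depth"] === undefined ? 1 : obj["depth"];
  if (typeof depth !== "number" || !Number.isInteger(depth) || depth < 1 || depth > 5) {
    return { ok: false, error: "depth must be an integer from 1 to 5" };
  }

  // Strict enum (default "callers" is an assumption).
  const direction = obj["direction"] === undefined ? "callers" : obj["direction"];
  if (direction !== "callers" && direction !== "callees") {
    return { ok: false, error: "direction must be 'callers' or 'callees'" };
  }

  return { ok: true, value: { name, depth, direction } };
}
```

Returning a tagged result instead of throwing keeps the error message available for the tool response and for the interaction log.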
68
+ ### Observability
69
+
70
+ - Interaction log at `~/.graphpilot/<repo-id>/interactions.jsonl` (mode `0600`): every tool call records sanitized inputs, result count, duration, and any error. Disable with `GRAPHPILOT_NO_LOG=1`. v0.1 does not read this log — it exists so future ranking and personalization have local-only training data.
71
+
72
+ ### Benchmarks
73
+
74
+ Reproducible Tier-A and Tier-B benchmarks (see [`bench/README.md`](bench/README.md)). On GraphPilot's own codebase:
75
+
76
+ - Tier A: F1 **0.89** vs grep **0.42**, 99.9 % fewer bytes read (721 B vs 528 KB), 7 wins / 2 ties / 1 deliberate loss.
77
+ - Tier B (simulated): 7/13 tasks vs grep 4/13 (+75 %), mean F1 0.70 vs 0.33, 6 hallucinations vs 480, 100 % evidence-anchor citation rate.
78
+
79
+ ### Documentation
80
+
81
+ - README, [quickstart](docs/quickstart.md), [MCP setup](docs/mcp-setup.md), [architecture](docs/architecture.md), [limitations](docs/limitations.md), CONTRIBUTING, SECURITY, CODE_OF_CONDUCT.
82
+
83
+ [Unreleased]: https://github.com/graphpilot-oss/graphpilot/compare/v0.1.0...HEAD
84
+ [0.1.0]: https://github.com/graphpilot-oss/graphpilot/releases/tag/v0.1.0