@graphpilot-oss/graphpilot 0.0.1
- package/.editorconfig +15 -0
- package/.github/CODEOWNERS +22 -0
- package/.github/FUNDING.yml +1 -0
- package/.github/ISSUE_TEMPLATE/bug_report.md +33 -0
- package/.github/ISSUE_TEMPLATE/config.yml +5 -0
- package/.github/ISSUE_TEMPLATE/feature_request.md +23 -0
- package/.github/PULL_REQUEST_TEMPLATE.md +19 -0
- package/.github/dependabot.yml +15 -0
- package/.github/workflows/ci.yml +62 -0
- package/.github/workflows/release.yml +50 -0
- package/.prettierignore +19 -0
- package/.prettierrc.json +20 -0
- package/CHANGELOG.md +138 -0
- package/CODE_OF_CONDUCT.md +83 -0
- package/CONTRIBUTING.md +111 -0
- package/LICENSE +201 -0
- package/README.md +132 -0
- package/SECURITY.md +44 -0
- package/assets/logo.png +0 -0
- package/assets/logo.svg +1 -0
- package/bench/README.md +544 -0
- package/bench/results/agent-tier-2026-05-22.md +28 -0
- package/bench/results/agent-tier-summary.md +44 -0
- package/bench/results/baseline-tier-2026-05-22.md +23 -0
- package/bench/results/baseline.json +810 -0
- package/bench/results/baseline.md +28 -0
- package/bench/run-agent-tier-automated.ts +234 -0
- package/bench/run-agent-tier.md +125 -0
- package/bench/run-baseline-tier.ts +200 -0
- package/bench/run.ts +210 -0
- package/bench/runner-baseline.ts +177 -0
- package/bench/runner-graphpilot.ts +131 -0
- package/bench/score-agent-tier.ts +191 -0
- package/bench/score.ts +59 -0
- package/bench/tasks.ts +236 -0
- package/dist/cli.d.ts +2 -0
- package/dist/cli.js +162 -0
- package/dist/cli.js.map +1 -0
- package/dist/edges.d.ts +57 -0
- package/dist/edges.js +170 -0
- package/dist/edges.js.map +1 -0
- package/dist/git.d.ts +95 -0
- package/dist/git.js +247 -0
- package/dist/git.js.map +1 -0
- package/dist/graph-schema.d.ts +36 -0
- package/dist/graph-schema.js +208 -0
- package/dist/graph-schema.js.map +1 -0
- package/dist/impact.d.ts +99 -0
- package/dist/impact.js +123 -0
- package/dist/impact.js.map +1 -0
- package/dist/indexer.d.ts +28 -0
- package/dist/indexer.js +111 -0
- package/dist/indexer.js.map +1 -0
- package/dist/interactions.d.ts +46 -0
- package/dist/interactions.js +0 -0
- package/dist/interactions.js.map +1 -0
- package/dist/mcp.d.ts +3 -0
- package/dist/mcp.js +567 -0
- package/dist/mcp.js.map +1 -0
- package/dist/parser.d.ts +24 -0
- package/dist/parser.js +128 -0
- package/dist/parser.js.map +1 -0
- package/dist/provenance.d.ts +74 -0
- package/dist/provenance.js +95 -0
- package/dist/provenance.js.map +1 -0
- package/dist/query.d.ts +68 -0
- package/dist/query.js +127 -0
- package/dist/query.js.map +1 -0
- package/dist/redact.d.ts +30 -0
- package/dist/redact.js +117 -0
- package/dist/redact.js.map +1 -0
- package/dist/storage.d.ts +42 -0
- package/dist/storage.js +85 -0
- package/dist/storage.js.map +1 -0
- package/dist/symbols.d.ts +20 -0
- package/dist/symbols.js +140 -0
- package/dist/symbols.js.map +1 -0
- package/dist/validation.d.ts +9 -0
- package/dist/validation.js +65 -0
- package/dist/validation.js.map +1 -0
- package/dist/validators.d.ts +55 -0
- package/dist/validators.js +205 -0
- package/dist/validators.js.map +1 -0
- package/dist/watcher.d.ts +86 -0
- package/dist/watcher.js +310 -0
- package/dist/watcher.js.map +1 -0
- package/docs/architecture.md +311 -0
- package/docs/limitations.md +156 -0
- package/docs/mcp-setup.md +231 -0
- package/docs/quickstart.md +202 -0
- package/eslint.config.js +148 -0
- package/lefthook.yml +81 -0
- package/package.json +56 -0
- package/pnpm-workspace.yaml +6 -0
- package/scripts/smoke-stdio.mjs +97 -0
- package/src/cli.ts +171 -0
- package/src/edges.ts +202 -0
- package/src/git.ts +255 -0
- package/src/graph-schema.ts +229 -0
- package/src/impact.ts +218 -0
- package/src/indexer.ts +152 -0
- package/src/interactions.ts +0 -0
- package/src/mcp.ts +652 -0
- package/src/parser.ts +138 -0
- package/src/provenance.ts +115 -0
- package/src/query.ts +148 -0
- package/src/redact.ts +122 -0
- package/src/storage.ts +115 -0
- package/src/symbols.ts +173 -0
- package/src/validation.ts +69 -0
- package/src/validators.ts +253 -0
- package/src/watcher.ts +383 -0
- package/tests/edges.test.ts +175 -0
- package/tests/fixtures/sample.ts +32 -0
- package/tests/git.test.ts +303 -0
- package/tests/graph-schema.test.ts +321 -0
- package/tests/impact.test.ts +454 -0
- package/tests/interactions.test.ts +180 -0
- package/tests/lint-policy.test.ts +106 -0
- package/tests/mcp-stdio.test.ts +171 -0
- package/tests/mcp.test.ts +335 -0
- package/tests/parser.test.ts +31 -0
- package/tests/provenance.test.ts +132 -0
- package/tests/query.test.ts +160 -0
- package/tests/redact.test.ts +167 -0
- package/tests/security.test.ts +144 -0
- package/tests/symbols.test.ts +78 -0
- package/tests/validators.test.ts +193 -0
- package/tests/watcher.test.ts +250 -0
- package/tsconfig.json +18 -0
@@ -0,0 +1,311 @@
# Architecture

How GraphPilot turns a folder of source files into a refactor-safe,
branch-aware code graph an agent can query in milliseconds, with every
answer carrying a `file:line @ sha` evidence anchor.

This doc is for contributors and evaluators. If you just want to use it,
see [quickstart.md](quickstart.md).

## Three load-bearing properties

1. **Evidence anchors:** `src/provenance.ts` attaches
   `{file, line, sha, excerpt}` to every symbol and call edge in tool
   output. The git SHA is captured at index time via the pure-fs helpers
   in `src/git.ts` (no `child_process`; we read `.git/HEAD`,
   `refs/heads/*`, and `packed-refs` directly). Old graphs without
   `indexedSha` still load because the field is optional in the schema.
2. **Differential impact:** `gp_impact` takes an optional
   `since: <commit|tag|branch>`. When set, `getChangedFiles()` (in
   `src/git.ts`, backed by `isomorphic-git`, which is pure JS with no
   shell-out) computes the diff between that ref and HEAD;
   `analyzeImpact` filters callers to that file set. Scope a refactor to
   your branch in one tool call.
3. **Worktree-aware roots:** `resolveIndexRoot()` walks up to the git
   worktree top by default, so two `git worktree add`'d branches produce
   two separate indexes (since `repoIdFor` hashes the absolute root).
   Both the CLI and the MCP layer route through it; opt out with
   `--no-worktree`.
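
To make the first property concrete, here is a minimal sketch of an evidence anchor and the `file:line @ sha` rendering mentioned in the introduction. The `EvidenceAnchor` interface and the `formatAnchor` helper are hypothetical names for illustration. Treating `sha` as optional on the anchor and shortening it to seven characters are both assumptions, not something this doc confirms.

```typescript
// Hypothetical shape mirroring the `{file, line, sha, excerpt}` fields above.
interface EvidenceAnchor {
  file: string;    // path of the file the symbol or call edge lives in
  line: number;    // line number of the anchored code
  sha?: string;    // assumed optional here, since old graphs may lack a SHA
  excerpt: string; // short excerpt of the anchored source line
}

// Render an anchor as `file:line @ sha`; drop the SHA part when it is missing.
function formatAnchor(anchor: EvidenceAnchor): string {
  const location = `${anchor.file}:${anchor.line}`;
  if (anchor.sha === undefined) return location;
  return `${location} @ ${anchor.sha.slice(0, 7)}`; // 7-char short SHA is an assumption
}
```

An agent that receives such an anchor can check the SHA against the current HEAD before trusting the line number.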

## Top-level view

```
┌─────────────────────────────────────────────────────────────────────┐
│                      Your TypeScript / JS repo                      │
└───────────────────────────────────┬─────────────────────────────────┘
                                    │
                  ┌─────────────────▼──────────────────┐
                  │  indexer.ts                        │
                  │  walk dir, ignore node_modules,    │
                  │  followSymbolicLinks: false        │
                  └─────────────────┬──────────────────┘
                                    │
                  ┌─────────────────▼──────────────────┐
                  │  parser.ts                         │
                  │  tree-sitter → AST                 │
                  │  (5 MB file cap, iterative walk)   │
                  └─────────────────┬──────────────────┘
                                    │
              ┌─────────────────────┴──────────────────────┐
              │                                            │
   ┌──────────▼──────────┐                      ┌──────────▼──────────┐
   │  symbols.ts         │                      │  edges.ts           │
   │  extract            │                      │  call sites +       │
   │  func/class/method/ │                      │  same-file > global │
   │  iface/type/enum    │                      │  resolver           │
   └──────────┬──────────┘                      └──────────┬──────────┘
              │                                            │
              └─────────────────────┬──────────────────────┘
                                    │
                  ┌─────────────────▼──────────────────┐
                  │  storage.ts                        │
                  │  ~/.graphpilot/<repo-id>/          │
                  │  graph.json         (mode 0600)    │
                  │  interactions.jsonl (mode 0600)    │
                  └─────────────────┬──────────────────┘
                                    │
                  ┌─────────────────▼──────────────────┐
                  │  query.ts (GraphIndex)             │
                  │  4 pre-computed maps:              │
                  │  byName, byId, callers, callees    │
                  └─────────────────┬──────────────────┘
                                    │
                  ┌─────────────────▼──────────────────┐
                  │  mcp.ts                            │
                  │  5 tools over stdio JSON-RPC       │
                  │  validators.ts + interactions log  │
                  └─────────────────┬──────────────────┘
                                    │
                              [MCP protocol]
                                    │
                  ┌─────────────────▼──────────────────┐
                  │  Claude Code / Cursor / Cline /    │
                  │  Windsurf / Continue / ...         │
                  └────────────────────────────────────┘
```

Data flow is one-way: source → tree → symbols + edges → JSON → query →
agent. Nothing flows back. GraphPilot never modifies your code.

## The five-stage pipeline

### Stage 1: Walk the directory

File: [`src/indexer.ts`](../src/indexer.ts)

- Uses `fast-glob` over `**/*.{ts,tsx,js,jsx,mjs,cjs}`
- Skips `node_modules/`, `dist/`, `build/`, `.git/`, `coverage/`,
  `.next/`, `.nuxt/`, `.cache/`, `out/`, and `*.d.ts`
- `followSymbolicLinks: false` plus a per-file realpath check: files
  whose realpath escapes the indexed root are skipped (defends against
  symlink-escape attacks)
- Hard cap of `MAX_FILES_PER_INDEX = 50,000`. Indexing throws above that.
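
The realpath bounds check above can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/indexer.ts`: the helper name `isInsideRoot` is hypothetical, and it expects both arguments to be absolute paths that the caller has already resolved (for example with `fs.realpath`).

```typescript
import * as path from "node:path";

// Returns true when `realPath` is the root itself or lies underneath it.
// Both paths are assumed to be absolute and already symlink-resolved.
function isInsideRoot(root: string, realPath: string): boolean {
  const rel = path.relative(root, realPath);
  if (rel === "") return true; // same directory as the root
  // A relative path that climbs out of the root starts with a ".." segment.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  // On Windows, a path on another drive comes back absolute.
  return !escapes && !path.isAbsolute(rel);
}
```

Comparing on path segments rather than on a raw string prefix avoids accepting a sibling directory such as `repo-evil` when the root is `repo`.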

### Stage 2: Parse each file

File: [`src/parser.ts`](../src/parser.ts)

- `tree-sitter` + `tree-sitter-typescript` (covers TS, TSX, JSX, and JS)
- Pre-read stat check: files over `MAX_FILE_BYTES = 5 MB` are skipped
- `walk()` is **iterative** (stack-based), not recursive, which protects
  against stack overflow on pathologically deep generated code
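
As a rough illustration of the stack-based traversal described above, the sketch below walks a tree in pre-order without recursion. `AstNode` is a simplified stand-in for a tree-sitter node, and this `walk` signature is an assumption rather than the one in `src/parser.ts`.

```typescript
// Minimal stand-in for a syntax node; real tree-sitter nodes expose more.
interface AstNode {
  type: string;
  children: AstNode[];
}

// Pre-order traversal with an explicit stack, so nesting depth is bounded
// by heap memory rather than by the JavaScript call stack.
function walk(root: AstNode, visit: (node: AstNode) => void): void {
  const stack: AstNode[] = [root];
  for (let node = stack.pop(); node !== undefined; node = stack.pop()) {
    visit(node);
    // Push children in reverse so the leftmost child is visited first.
    for (const child of [...node.children].reverse()) {
      stack.push(child);
    }
  }
}
```

A recursive version of the same traversal would use one call-stack frame per nesting level, which is what deeply nested generated code can exhaust.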

### Stage 3: Extract symbols + raw calls (per file)

Files: [`src/symbols.ts`](../src/symbols.ts) and
[`src/edges.ts`](../src/edges.ts)

Symbols extracted:

- `function_declaration`
- arrow / function expressions assigned to consts → `variable` kind
- `class_declaration` and its `method_definition` children
- `interface_declaration` (TS)
- `type_alias_declaration` (TS)
- `enum_declaration` (TS)

Each gets a stable id of the form:

```
<file>#<parent>.<name>@<line>
```

`<parent>` is the enclosing class name for methods; empty otherwise.
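
A minimal sketch of building such an id is shown below. `makeSymbolId` is a hypothetical helper name. Whether the real implementation keeps the `.` separator when `<parent>` is empty is not stated here, so this sketch simply follows the template literally and keeps it.

```typescript
// Build an id of the form `<file>#<parent>.<name>@<line>`.
// `parent` is the enclosing class name for methods and "" otherwise.
// Assumption: the "." separator is kept even when `parent` is empty.
function makeSymbolId(file: string, parent: string, name: string, line: number): string {
  return `${file}#${parent}.${name}@${line}`;
}
```

Because the line number is part of the id, moving a declaration within its file produces a new id on the next index.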

Call extraction:

- For every function-like symbol, walk its body subtree
- **Stop at nested function boundaries:** calls inside an inline arrow
  are attributed to the arrow, not the outer function
- Emit a `RawCall` for every `call_expression` and `new_expression`
- Callee name is the identifier or the `.property` of a member-expression

### Stage 4: Resolve + save

Files: [`src/edges.ts`](../src/edges.ts) (resolver) and
[`src/storage.ts`](../src/storage.ts) (persistence)

After all files are parsed, a second pass resolves each `RawCall`:

1. Prefer a symbol with the matching name in the **same file**
2. Otherwise pick the **first** global match
3. Otherwise leave `toId: null`; preserve `toName` so the agent still
   sees the call happened
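
The three resolution rules above can be sketched as a single pass over the raw calls. The record shapes here are simplified assumptions: only `toId` and `toName` come from this doc, while `fromId`, `fromFile`, and the function name `resolveCalls` are hypothetical.

```typescript
interface SymbolRecord { id: string; name: string; file: string; }
interface RawCall { fromId: string; fromFile: string; toName: string; }
interface CallEdge { fromId: string; toId: string | null; toName: string; }

function resolveCalls(symbols: SymbolRecord[], calls: RawCall[]): CallEdge[] {
  // Group symbols by name, preserving the order in which they were seen.
  const byName = new Map<string, SymbolRecord[]>();
  for (const sym of symbols) {
    const bucket = byName.get(sym.name);
    if (bucket) bucket.push(sym);
    else byName.set(sym.name, [sym]);
  }
  return calls.map((call) => {
    const candidates = byName.get(call.toName) ?? [];
    // Rule 1: same-file match wins. Rule 2: otherwise the first global match.
    const sameFile = candidates.find((s) => s.file === call.fromFile);
    const target = sameFile ?? candidates[0];
    // Rule 3: no match leaves toId null but keeps the callee name.
    return { fromId: call.fromId, toId: target?.id ?? null, toName: call.toName };
  });
}
```

Note that rule 2 depends on the order in which files were walked, which is why two unrelated functions with the same name can be confused.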

Save location:

```
~/.graphpilot/<repo-id>/graph.json
```

Where `<repo-id>` is the first 16 hex chars of
`sha256(absolute_repo_path)`. File permissions: `0o600`. Directory:
`0o700`.
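
The repo id derivation can be sketched with Node's built-in `crypto` module. The function name `repoIdFor` appears earlier in this doc, but its exact signature and any path normalization it applies are assumptions here.

```typescript
import { createHash } from "node:crypto";

// First 16 hex characters of sha256(absolute_repo_path).
// Assumption: the path is hashed as-is, with no extra normalization.
function repoIdFor(absoluteRepoPath: string): string {
  return createHash("sha256").update(absoluteRepoPath).digest("hex").slice(0, 16);
}
```

Since the absolute path is the hash input, two worktrees of the same repository at different locations get different ids and therefore separate index directories.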

Schema is versioned (`version: 1`) so future migrations are clean.

### Stage 5: Serve queries

Files: [`src/query.ts`](../src/query.ts) and [`src/mcp.ts`](../src/mcp.ts)

When the MCP server starts, it lazy-loads `graph.json` for whichever
repo path is being queried and builds a `GraphIndex`:

- `byNameLower`: lowercase name → SymbolRecord[]
- `byId`: full id → SymbolRecord
- `callersOf`: target id → CallEdge[] (answers "who calls X")
- `calleesOf`: source id → CallEdge[] (answers "what does X call")

The index is cached per absolute path inside the process so repeated
tool calls don't re-parse the JSON.
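
A minimal sketch of building the four maps is shown below. The map names come from the list above; the record fields (`fromId`, `toId`, `toName`) and the `buildIndex` and `pushTo` helpers are assumptions for illustration, not the actual code in `src/query.ts`.

```typescript
interface SymbolRecord { id: string; name: string; }
interface CallEdge { fromId: string; toId: string | null; toName: string; }

interface GraphIndex {
  byNameLower: Map<string, SymbolRecord[]>;
  byId: Map<string, SymbolRecord>;
  callersOf: Map<string, CallEdge[]>;
  calleesOf: Map<string, CallEdge[]>;
}

// Append `value` to the list stored under `key`, creating the list if needed.
function pushTo<K, V>(map: Map<K, V[]>, key: K, value: V): void {
  const list = map.get(key);
  if (list) list.push(value);
  else map.set(key, [value]);
}

function buildIndex(symbols: SymbolRecord[], edges: CallEdge[]): GraphIndex {
  const index: GraphIndex = {
    byNameLower: new Map(),
    byId: new Map(),
    callersOf: new Map(),
    calleesOf: new Map(),
  };
  for (const sym of symbols) {
    pushTo(index.byNameLower, sym.name.toLowerCase(), sym);
    index.byId.set(sym.id, sym);
  }
  for (const edge of edges) {
    pushTo(index.calleesOf, edge.fromId, edge);
    // Unresolved edges have no target id, so they cannot appear in callersOf.
    if (edge.toId !== null) pushTo(index.callersOf, edge.toId, edge);
  }
  return index;
}
```

Building the maps once at load time trades a little memory for constant-time lookups on every later tool call.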

Every tool call flows through:

```
MCP request → validator → tool handler → response
                               ↓
                        interaction log
```

## The five MCP tools

| Tool         | Input                                                                               | Output                                                                                                                              |
| ------------ | ----------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| `gp_index`   | `{ path? }`                                                                         | Triggers re-indexing + saves graph; invalidates per-path cache                                                                      |
| `gp_recall`  | `{ query, limit?, substring?, path? }`                                              | Symbols matching name (exact case-insensitive by default; `substring: true` opt-in)                                                 |
| `gp_callers` | `{ symbol, direction?: 'callers' \| 'callees', limit?, includeUnresolved?, path? }` | Edges where the symbol is target (callers) or source (callees)                                                                      |
| `gp_impact`  | `{ symbol, depth? (1–5, default 3), since?, path? }`                                | Blast-radius report: direct callers, transitive callers grouped by BFS depth, tests likely affected, public-API flag, summary stats |
| `gp_stats`   | `{ path? }`                                                                         | Health check: repo id, indexedAt, file/symbol/edge counts                                                                           |
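
The "transitive callers grouped by BFS depth" part of `gp_impact` can be illustrated with a small breadth-first walk over a `callersOf` map. This is a sketch under assumptions, not the code in `src/impact.ts`: the function name `callersByDepth` and the edge shape are hypothetical, and it leaves out the `since` filter, test detection, and the public-API flag.

```typescript
interface CallEdge { fromId: string; toId: string; }

// Breadth-first walk over caller edges, returning caller ids grouped by depth.
// result[0] holds direct callers, result[1] callers of those callers, and so on.
function callersByDepth(
  callersOf: Map<string, CallEdge[]>,
  targetId: string,
  maxDepth = 3,
): string[][] {
  const seen = new Set<string>([targetId]);
  const levels: string[][] = [];
  let frontier: string[] = [targetId];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const edge of callersOf.get(id) ?? []) {
        if (seen.has(edge.fromId)) continue; // guards against cycles and duplicates
        seen.add(edge.fromId);
        next.push(edge.fromId);
      }
    }
    if (next.length > 0) levels.push(next);
    frontier = next;
  }
  return levels;
}
```

The `seen` set means each caller is reported once, at the shallowest depth where it appears, and recursive functions do not loop forever.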

Every input is validated by hand-rolled validators in
[`src/validators.ts`](../src/validators.ts):

- Reject unknown fields (`additionalProperties: false` defence in depth)
- Type-check every field
- Range-check numbers (`limit` capped at 50–100 depending on tool)
- Length-cap strings (no 2 MB symbol names)
- Strict enums for `direction`

If any check fails, the tool returns `{ isError: true, content: ... }`
with a clear message, and the request never reaches the handler.
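
A hand-rolled validator in this style might look like the sketch below, shown for the `gp_recall` input. The field names come from the tool table; the `Validated` result type, the function name, and the concrete caps (`MAX_QUERY_LENGTH`, `MAX_LIMIT`) are assumptions chosen for illustration, not the values used in `src/validators.ts`.

```typescript
type Validated<T> = { ok: true; value: T } | { ok: false; error: string };

interface RecallInput { query: string; limit?: number; substring?: boolean; path?: string; }

const ALLOWED_FIELDS = new Set(["query", "limit", "substring", "path"]);
const MAX_QUERY_LENGTH = 256; // hypothetical length cap
const MAX_LIMIT = 50;         // hypothetical range cap

function validateRecallInput(raw: unknown): Validated<RecallInput> {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return { ok: false, error: "input must be an object" };
  }
  const obj = raw as Record<string, unknown>;
  // Reject unknown fields before looking at any values.
  for (const key of Object.keys(obj)) {
    if (!ALLOWED_FIELDS.has(key)) return { ok: false, error: `unknown field: ${key}` };
  }
  const { query, limit, substring, path } = obj;
  if (typeof query !== "string" || query.length === 0 || query.length > MAX_QUERY_LENGTH) {
    return { ok: false, error: `query must be a string of 1-${MAX_QUERY_LENGTH} chars` };
  }
  const value: RecallInput = { query };
  if (limit !== undefined) {
    if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1 || limit > MAX_LIMIT) {
      return { ok: false, error: `limit must be an integer in 1-${MAX_LIMIT}` };
    }
    value.limit = limit;
  }
  if (substring !== undefined) {
    if (typeof substring !== "boolean") return { ok: false, error: "substring must be a boolean" };
    value.substring = substring;
  }
  if (path !== undefined) {
    if (typeof path !== "string") return { ok: false, error: "path must be a string" };
    value.path = path;
  }
  return { ok: true, value };
}
```

Returning a tagged result instead of throwing keeps the error path explicit, so the caller can map a failure straight to an `isError` response.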

## What lives where on disk

```
~/.graphpilot/
  <repo-id-1>/
    graph.json            ← structural index (mode 0600)
    interactions.jsonl    ← append-only tool-call log (mode 0600)
  <repo-id-2>/
    graph.json
    interactions.jsonl
```

Nothing else is written. Nothing leaves your machine.

## The interaction log

File: [`src/interactions.ts`](../src/interactions.ts)

Every tool call appends one line to `interactions.jsonl`:

```json
{
  "ts": "2026-05-18T20:45:00Z",
  "tool": "gp_recall",
  "input": { "query": "parseToken" },
  "results": 1,
  "durationMs": 3
}
```

**What is logged:** tool name, sanitized input args, result count,
duration, error (if any).

**What is NOT logged:** source code, file contents, user prompts.

**Sanitization** (defends against log-line forgery via crafted symbol
names):

- Strip control characters (e.g. newlines become spaces)
- Cap strings at 500 chars
- Cap whole-line size at 8 KB; oversize entries fall back to a marker
- Logging can be disabled entirely with `GRAPHPILOT_NO_LOG=1`
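
The sanitization rules above can be sketched as two small helpers. This is an illustration, not the code in `src/interactions.ts`: the function names and the shape of the oversize marker are hypothetical, and the 8 KB cap is approximated by string length rather than by encoded byte count.

```typescript
const MAX_STRING_CHARS = 500;
const MAX_LINE_CHARS = 8 * 1024; // approximates the 8 KB cap by string length

// Replace control characters (including newlines) with spaces, then cap length.
function sanitizeString(value: string): string {
  return value.replace(/[\u0000-\u001f\u007f]/g, " ").slice(0, MAX_STRING_CHARS);
}

// Serialize one log entry as a single JSONL line, falling back to a small
// marker entry when the serialized line would be too large.
function toLogLine(tool: string, input: Record<string, unknown>): string {
  const clean: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    clean[sanitizeString(key)] = typeof value === "string" ? sanitizeString(value) : value;
  }
  const line = JSON.stringify({ tool: sanitizeString(tool), input: clean });
  if (line.length > MAX_LINE_CHARS) {
    // Hypothetical marker shape; the real fallback format is not documented here.
    return JSON.stringify({ tool: sanitizeString(tool), oversize: true });
  }
  return line;
}
```

Keeping every entry on one physical line matters because a JSONL reader treats each newline as a record boundary.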

v0.1 doesn't _read_ this log. It exists from day one so future ranking
and personalization have data to train on. Local-only, your data.

## Process model

- One process per MCP session
- stdio transport: reads JSON-RPC from stdin, writes responses to stdout
- Diagnostics go to **stderr** (stdout is reserved for the protocol)
- The process stays alive as long as stdin is open; exits cleanly on
  client disconnect (see the Day-10 stdio fix)
- No daemon mode in v0.1. One process per `~/.claude.json` entry.

## Security model

See [SECURITY.md](../SECURITY.md) for the user-facing policy and how to
report a vulnerability. Active defences in code:

- `validateRootPath` refuses `/`, `/etc`, `/var`, `~`, `/Users`,
  `/home`, Windows system paths, and macOS-resolved aliases like
  `/private/etc`
- File size cap (5 MB) and file count cap (50k)
- Symlink protection (fast-glob `followSymbolicLinks: false` + realpath
  bounds check per file)
- Storage perms (`0o700` dir, `0o600` files)
- No `child_process`, no `exec`, no network code anywhere in `src/`
- Hand-rolled validators on every MCP tool input (zero deps)
- `additionalProperties: false` on every tool's input schema

## Testing strategy

| Test file                    | What it covers                                            | Tests  |
| ---------------------------- | --------------------------------------------------------- | ------ |
| `tests/parser.test.ts`       | Tree-sitter wiring + function detection                   | 3      |
| `tests/symbols.test.ts`      | Per-kind symbol extraction + id format                    | 9      |
| `tests/edges.test.ts`        | Raw call extraction + resolution + nested fns             | 10     |
| `tests/security.test.ts`     | T1/T2/T7/T10 defences                                     | 10     |
| `tests/query.test.ts`        | GraphIndex maps + edge cases                              | 18     |
| `tests/validators.test.ts`   | Per-tool input validators                                 | 20     |
| `tests/interactions.test.ts` | Sanitization + log file + env-var disable                 | 11     |
| `tests/mcp.test.ts`          | Tools through InMemoryTransport                           | 14     |
| `tests/mcp-stdio.test.ts`    | Real subprocess over stdio (catches the Day-10 bug class) | 3      |
| **Total**                    |                                                           | **98** |

`InMemoryTransport` is fast and covers tool logic. `mcp-stdio.test.ts`
spawns the real binary and drives it over stdin/stdout; it is slower but
catches the "server starts but never responds" regression class.

## Extension points (where v0.2+ work plugs in)

| Feature               | Where it would live                                | Effort |
| --------------------- | -------------------------------------------------- | ------ |
| ~~Watch mode~~        | Shipped: `src/watcher.ts` + `graphpilot watch` CLI | n/a    |
| ~~`gp_impact` tool~~  | Shipped: `src/impact.ts` + handler in `mcp.ts`     | n/a    |
| `.graphpilotignore`   | extend `DEFAULT_IGNORE` in indexer + watcher       | small  |
| Cross-repo workspace  | new `src/workspace.ts` + workspace yaml loader     | medium |
| Semantic search       | embedding pipeline + vector index                  | medium |
| Stack-Graphs resolver | replace `resolveCallEdges` algorithm               | large  |
| Python support        | new tree-sitter grammar wired through `parser.ts`  | medium |

@@ -0,0 +1,156 @@
# Limitations of v0.1

GraphPilot v0.1 makes deliberate trade-offs to ship a small, sharp tool
fast. Knowing what it doesn't do matters as much as knowing what it does.

The list below is exhaustive as of v0.1.0. Items with a milestone have
a planned fix; unmarked items are out of scope for v1.x.

## Language coverage

- **TypeScript, TSX, JavaScript, JSX only.**
  - Python: deferred to **v0.2 / v0.3** (demand-gated)
  - Rust / Go / Java: deferred to **v1.x**
  - All other languages: not planned for v1
- **`.d.ts` declaration files are skipped.** They mostly express types
  that don't add structural information to the call graph.
- **JSON, YAML, Markdown, configs:** not indexed (we are a _code_
  index, not a project index).

## Resolver accuracy

GraphPilot uses a deliberately simple name-based resolver. The
trade-offs:

- **No import-path resolution.** `import { foo } from "./bar"` does
  not get followed. If `foo` appears in multiple files, the resolver
  picks **same-file first**, then the **first global match**, which
  may be wrong.
- **Re-exports may pick the wrong source.** A chain like
  `index.ts → utils/index.ts → utils/string.ts` resolves to whichever
  file the walker saw first.
- **No type-based method dispatch.** `userRepo.save()` and
  `productRepo.save()` both resolve to whichever `save` we saw first.
- **No `super()` or constructor inheritance tracking** beyond name
  match.
- **Standard library calls show as unresolved.** `JSON.parse`,
  `Date.now`, `console.log`, `Array.from`, and fs/path/process calls all
  have `toId: null`. The agent still sees the call happened; it just
  doesn't get a jump-to-definition pointer.
- **Expected resolution rate:** roughly **25–35% of edges resolve** to
  an in-repo symbol id; the rest are external. On GraphPilot's own code
  it's 42/155 (27%). That's enough to materially reduce hallucinations
  because the questions agents actually ask (_"who calls X in my
  repo"_) are the ones the dumb resolver answers correctly.

**Planned in v0.2:** import-path tracking, re-export resolution.

## Indexing model

- **Watch mode is per-file incremental.** `graphpilot watch` re-parses
  only the file that changed and re-resolves edges across the symbol
  table in ~3–10 ms per save. Full re-index is only needed on first run
  or after a `pnpm install` / branch switch that changes many files.
- **Single-process.** No CPU parallelism in v0.1.
- **No `.graphpilotignore`.** Defaults skip `node_modules`, `dist`,
  `build`, `.git`, `coverage`, `.next`, `.nuxt`, `.cache`, `out`,
  `*.d.ts`. To customize, hand-edit `src/indexer.ts` (and
  `src/watcher.ts` for watch mode).
- **Max 50,000 files per index** (`MAX_FILES_PER_INDEX`). Larger repos
  error out; narrow the path or wait for v0.4 workspaces.
- **Max 5 MB per file** (`MAX_FILE_BYTES`). Larger files (minified
  bundles, generated code) are silently skipped.

**Planned in v0.3+:** `.graphpilotignore`. Watch mode with per-file
incremental updates, described in the first bullet, has already shipped.

## What we don't index (deliberate)

- **Comments and docstrings.** Use `rg TODO` etc.
- **String literals.** `process.env.FOO`, route paths, embedded SQL.
- **Configuration files.** package.json, tsconfig.json, .env.
- **Git history.** No blame, no diff awareness.

## Scope

- **Single repo per query.** Each `gp_*` tool call operates on one
  indexed repo. For microservices, index each repo separately; the
  agent must coordinate lookups.
- **No workspace abstraction.** Cross-repo namespace resolution (e.g.
  `@org/auth` imported in `@org/payments`) is not native in v0.1. See
  the manual workaround in `quickstart.md`.

**Planned in v0.4 / v1.x:** workspace.yaml-driven cross-repo.

## Agent capabilities

- **No impact analysis tool.** "What breaks if I change X" must be
  composed from `gp_callers` results. A dedicated `gp_impact` tool is
  planned for v0.3.
- **No route detection.** Express, Fastify, NestJS, Hono handlers are
  not recognized as routes in v0.1.
- **No test-to-unit mapping.** `tests/auth.spec.ts` does _not_ link to
  the symbols it tests. Planned for v0.3.
- **No semantic search.** `gp_recall` is name-only (exact case-insensitive
  or substring). "Find code similar to this snippet" is not supported.
  Deferred until 30+ users request it.
- **No public-API extraction.** Inferable from `exported: true` symbols
  but not a first-class tool.

## Privacy / data handling

- **No telemetry, no remote calls.** Verifiable: `src/` has zero `http`,
  `fetch`, `axios`, or analytics imports. A CI lint rule will enforce
  this from v0.2.
- **Source code never leaves your machine.** Only the structured graph
  (names, locations, signatures, call relationships) lives in
  `~/.graphpilot/`.
- **Signatures may contain secrets if your code does.** If you have
  `const API_KEY = "sk-..."` literally in source, that line ends up in
  `graph.json`. We don't redact in v0.1. **Planned in v0.2:**
  secret-pattern detection (matches against known formats like `sk-`,
  `ghp_`, AWS keys, JWTs, PEM headers).
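
To give a feel for what pattern-based secret detection involves, here is a minimal sketch covering the formats named above. It is not GraphPilot's planned implementation: the regular expressions, the `[REDACTED]` placeholder, and the `redactSecrets` name are all illustrative assumptions, and real detectors need broader, tested rule sets.

```typescript
// Illustrative patterns only; each one is a simplified approximation.
const SECRET_PATTERNS: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "sk- style API key", pattern: /\bsk-[A-Za-z0-9_-]{16,}/ },
  { label: "GitHub personal access token", pattern: /\bghp_[A-Za-z0-9]{36}\b/ },
  { label: "AWS access key id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { label: "JWT", pattern: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/ },
  { label: "PEM private key header", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
];

// Replace every match of every pattern with a fixed placeholder.
function redactSecrets(text: string): string {
  let out = text;
  for (const { pattern } of SECRET_PATTERNS) {
    out = out.replace(new RegExp(pattern.source, "g"), "[REDACTED]");
  }
  return out;
}
```

Pattern matching of this kind only catches secrets with a recognizable prefix or shape; an arbitrary password in a string literal would pass through untouched.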

## Platform support

- **Linux:** tested, CI green
- **macOS** (Intel + Apple Silicon): tested, CI green
- **Windows:** experimental in v0.1. CI green but real-world testing is
  light. The subprocess MCP test is currently skipped on Windows. File
  bugs eagerly.

## Performance ceiling

Rough numbers on Apple Silicon (M1 Pro, 16 GB):

| Repo size | Index time | Resident memory | graph.json |
| --------- | ---------- | --------------- | ---------- |
| 100 files | 80 ms      | ~30 MB          | ~100 KB    |
| 1k files  | 800 ms     | ~80 MB          | ~1 MB      |
| 10k files | 8 s        | ~300 MB         | ~10 MB     |
| 50k files | 40 s       | ~1.2 GB         | ~50 MB     |

Query latency on the pre-computed indexes: sub-millisecond even at 50k
symbols.

## What we deliberately don't build

Not a value judgement, just clarity on scope:

- **Not a coding agent.** Claude Code / Cursor / Aider generate code.
  We provide them context.
- **Not a security scanner.** CodeQL owns taint analysis.
- **Not a build system.** We don't compile.
- **Not a Sourcegraph clone.** No web UI for human browsing; the agent
  is the user.
- **Not a SaaS.** No accounts, no cloud, no enterprise tier in v1.

If your use case needs one of those, GraphPilot is the wrong tool.

## Reporting limits we missed

The list above is intentionally exhaustive. If you hit a real limit
that's not documented here,
[open an issue](https://github.com/graphpilot-oss/graphpilot/issues).
"Undocumented limitation" is a valid issue type and helps us keep this
list honest.