@pmaddire/gcie 0.1.4 → 0.1.5

package/AGENT.md CHANGED
@@ -1,10 +1,14 @@
  # AGENT.md
 
- Agent Operating Instructions for GraphCode Intelligence Engine (GCIE)
+ Agent Operating Instructions for GraphCode Intelligence Engine (GCIE)
 
  This file provides persistent architectural context for coding agents working on this repository.
 
- Agents must read this file before performing any development tasks.
+ Agents must read this file before performing any development tasks.
+
+ Agent trigger:
+ Use GCIE for context lookup before reading files or making edits.
+ Follow `GCIE_USAGE.md` for the adaptive retrieval workflow and mode-routing rules.
 
  ---
 
package/GCIE_USAGE.md CHANGED
@@ -1,231 +1,290 @@
- # GCIE Agent Usage (Portable Default)
+ # GCIE Agent Usage (Portable, Accuracy-First)
 
- This file is designed to be dropped into any repository and used immediately.
+ Trigger line for agent instructions:
+ `Use GCIE for context lookup before reading files or making edits. Follow GCIE_USAGE.md.`
 
  ## Goal
 
- Retrieve the smallest useful context while preserving edit safety.
+ Retrieve the smallest useful context without sacrificing edit safety.
 
  Priority order:
  1. accuracy (must-have coverage)
  2. full-hit reliability
  3. token efficiency
 
- ## Quick Start (Any Repo)
+ ## Core Rules
 
- 1. Identify must-have context categories for the task:
- - implementation file(s)
- - wiring/orchestration file(s)
- - validation surface when risk is non-trivial
- - this may be a test, spec, schema, contract, migration, config, or CLI surface depending on the repo
+ 1. Do not trade recall for token savings.
+ 2. Stop retrieval as soon as must-have categories are covered.
+ 3. Adapt per task family, not per one-off query.
+ 4. Keep defaults portable; keep repo-specific learning in `.gcie` state.
+
+ ## Commands (Tool-Synced)
+
+ Primary retrieval:
+ ```powershell
+ gcie.cmd context <path> "<query>" --intent <edit|debug|refactor|explore> --budget <auto|int> --mode <basic|adaptive>
+ ```
+
+ Sliced retrieval:
+ ```powershell
+ gcie.cmd context-slices <path> "<query>" --intent <edit|debug|refactor|explore> --profile <low|recall|adaptive>
+ ```
+
+ Adaptive profile state:
+ ```powershell
+ gcie.cmd adaptive-profile .
+ gcie.cmd adaptive-profile . --clear
+ ```
+
+ Post-init adaptation pipeline:
+ ```powershell
+ gcie.cmd adapt . --benchmark-size 10 --efficiency-iterations 5 --clear-profile
+ ```
 
- 2. Run one primary retrieval with a file-first, symbol-heavy query:
+ One-shot setup + adaptation:
  ```powershell
- gcie.cmd context <path> "<file-first symbol-heavy query>" --intent <edit|debug|refactor|explore> --budget <shape budget>
+ gcie.cmd setup . --adapt --adapt-benchmark-size 10 --adapt-efficiency-iterations 5
  ```
 
- 3. Check must-have coverage.
+ Setup and index:
+ ```powershell
+ gcie.cmd setup .
+ gcie.cmd index .
+ ```
+
+ Direct verification:
+ ```powershell
+ rg -n "symbol1|symbol2|symbol3" <likely files or subtree>
+ ```
+
+ ## Must-Have Coverage Gate (Required)
+
+ Context is sufficient only when all needed categories are present:
+ - implementation file(s)
+ - wiring/orchestration/caller file(s)
+ - validation surface when risk is non-trivial (test/spec/schema/config/contract/CLI surface)
+
+ If any must-have file is missing, retrieval is incomplete.
+
+ If a must-have file appears only as compact/skeleton context, re-query that file explicitly (pin or targeted query) before editing.
+
+ Note: tests/spec files are often excluded by default. Add `--include-tests` only when test context is required.
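Editorial sketch (not part of the package): one way the coverage gate above could be expressed in code. The `RetrievedFile` record and `coverage_gaps` helper are hypothetical names, and the check is simplified to one file per category rather than per must-have file.

```python
from dataclasses import dataclass


@dataclass
class RetrievedFile:
    """Hypothetical record for one file returned by a retrieval pass."""
    path: str
    category: str           # "implementation", "wiring", or "validation"
    full_body: bool = True   # False when only compact/skeleton context came back


def coverage_gaps(files, needs_validation=False):
    """Return the must-have categories that are missing or only skeletal."""
    required = {"implementation", "wiring"}
    if needs_validation:
        required.add("validation")
    # A skeleton-only file does not count as covered; it must be re-queried.
    covered = {f.category for f in files if f.full_body}
    return sorted(required - covered)


retrieved = [
    RetrievedFile("context/context_router.py", "implementation"),
    RetrievedFile("cli/main.py", "wiring", full_body=False),
]
print(coverage_gaps(retrieved))
```

An empty result means the gate passes and retrieval can stop; any returned category is a candidate for a pinned or targeted re-query.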
+
+ ## Query Construction (Portable, High-Signal)
+
+ Preferred pattern:
+ `<file-a> <file-b> <function/component> <route/flag> <state/config-key>`
 
- 4. If one must-have file is missing, run targeted gap-fill for only that file.
+ Rules:
+ 1. Use file-first, symbol-heavy phrasing.
+ 2. Include explicit file paths when known.
+ 3. Include 2-6 distinctive symbols.
+ 4. Add caller/entry anchor when target is indirect.
+ 5. Avoid natural-language question phrasing.
 
- 5. Stop immediately when must-have coverage is complete.
+ Example:
+ - Bad: `How does architecture routing decide when to fall back?`
+ - Good: `context/context_router.py context/fallback_evaluator.py architecture routing fallback confidence`
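Editorial sketch (not part of the package): the query rules above could be enforced by a small helper. `build_query` is a hypothetical name, and the question check is a deliberately crude assumption that only looks for a `?`.

```python
def build_query(files, symbols):
    """Join explicit file paths first, then 2-6 distinctive symbols."""
    if not 2 <= len(symbols) <= 6:
        raise ValueError("use 2-6 distinctive symbols")
    terms = list(files) + list(symbols)
    # Crude stand-in for "avoid natural-language question phrasing".
    if any("?" in term for term in terms):
        raise ValueError("avoid natural-language question phrasing")
    return " ".join(terms)


query = build_query(
    ["context/context_router.py", "context/fallback_evaluator.py"],
    ["architecture", "routing", "fallback", "confidence"],
)
print(query)
```

With these inputs the helper reproduces the "Good" query from the example.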
 
  ## Retrieval Modes (Adaptive Router)
 
- Use three modes and choose by task family:
+ Use three modes and route by observed outcomes:
 
- 1. `plain-context-first` (default for most tasks)
- 2. `slicer-first` (for hard routed architecture or multi-hop families)
- 3. `direct-file-check` (verification and fast gap closure)
+ 1. `slicer-first`
+ 2. `plain-context-first`
+ 3. `direct-file-check`
 
- Plain-context command:
+ Slicer-first:
  ```powershell
- gcie.cmd context <path> "<query>" --intent <edit|debug|refactor|explore> --budget <shape budget>
+ gcie.cmd context-slices <path> "<query>" --profile low --intent <intent>
  ```
 
- Slicer-first command:
+ Plain-context-first:
  ```powershell
- gcie.cmd context-slices <path> "<query>" --intent <edit|debug|refactor|explore>
+ gcie.cmd context <path> "<query>" --mode basic --intent <intent> --budget auto
  ```
 
- Direct-file-check command:
+ Direct-file-check:
  ```powershell
- rg -n "<symbol1|symbol2|symbol3>" <likely files or subtree>
+ rg -n "<symbols>" <files-or-subtree>
  ```
 
- Mode-switch rule:
- - start with `plain-context-first` unless setup calibration proved another mode is better for that family
- - use `slicer-first` only for families where routing/architecture slices repeatedly outperform plain context
- - use `direct-file-check` whenever must-have coverage is uncertain or one file remains missing
- - do not keep retrying the same mode indefinitely; switch after one weak result
+ Routing policy:
+ 1. Start new repos in `slicer-first` bootstrap mode.
+ 2. If must-have coverage is incomplete after one slicer pass, switch that task to `plain-context-first`.
+ 3. If a task family misses with slicer 2+ times in calibration, set that family default to `plain-context-first`.
+ 4. Keep slicer for families where it is both accurate and cheaper.
+ 5. If two GCIE attempts still miss required files, use `direct-file-check` and mark family `manual-verify-required` until recalibrated.
 
- Portable starter policy:
- - default all families to `plain-context-first`
- - after first 10-20 tasks, promote individual families to `slicer-first` only if benchmarked better
- - keep a family on plain-context if slicer is more expensive with no accuracy gain
+ ## Scope and Budget Baseline (Portable)
 
- ## Architecture Tracking (Portable, In-Repo)
+ Scope rule:
+ 1. Use the smallest path scope that still contains expected files.
+ 2. Use repo root `.` only for true cross-layer recovery.
+ 3. If explicit targets cluster in one subtree, subtree scope is usually better than root.
 
- To make slicer mode adapt as the repo changes, keep architecture tracking inside the repo where GCIE runs.
+ Profile ladder (concrete, portable):
+ 1. `context-slices --profile low`
+ 2. if miss: `context-slices --profile recall`
+ 3. if miss: `context-slices --profile recall --pin <missing-file>`
+ 4. if miss: `rg` direct check and targeted file retrieval
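Editorial sketch (not part of the package): the profile ladder above, written as an ordered list of command strings plus a loop that stops at the first rung that covers the must-haves. The helper names and the `covers` callback are hypothetical; the flags are the ones shown in this diff.

```python
def ladder_commands(path, query, intent, missing_file):
    """Build the four rungs of the profile ladder as command strings."""
    base = f'gcie.cmd context-slices {path} "{query}" --intent {intent}'
    return [
        f"{base} --profile low",
        f"{base} --profile recall",
        f"{base} --profile recall --pin {missing_file}",
        f'rg -n "<symbols>" {missing_file}',
    ]


def first_covering_rung(commands, covers):
    """Return (index, command) for the first rung that covers, else None.

    `covers` is a hypothetical callback that runs the command and reports
    whether must-have coverage is complete.
    """
    for index, command in enumerate(commands):
        if covers(command):
            return index, command
    return None


rungs = ladder_commands("src", "router.py fallback confidence", "edit", "src/router.py")
# Stand-in runner: pretend only the pinned rung recovers the missing file.
result = first_covering_rung(rungs, lambda command: "--pin" in command)
print(result)
```

Because the loop returns on the first hit, later and more expensive rungs are never run once coverage is reached.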
 
- Track these files under `.gcie/`:
- - `.gcie/architecture.md`
- - `.gcie/architecture_index.json`
- - `.gcie/context_config.json`
+ Plain-context budget baseline:
+ - `auto`: simple same-layer or strong single-file lookup
+ - `900`: same-family two-file lookup
+ - `1100`: backend/config pair or same-layer backend pair
+ - `1150`: cross-layer UI/API flow
+ - `1300-1400`: explicit multi-hop chain
 
- How to keep it adaptive:
- 1. Bootstrap from user docs once (read-only):
- - `ARCHITECTURE.md`, `README.md`, `PROJECT.md`, `docs/architecture.md`, `docs/system_design.md`
- 2. Use `.gcie/architecture.md` as GCIE-owned working architecture map.
- 3. Refresh `.gcie/architecture.md` and `.gcie/architecture_index.json` when structural changes happen:
- - new subsystem
- - major module split/merge
- - interface/boundary change
- - dependency-direction change
- - active work-area shift
- 4. Do not overwrite user-owned docs unless explicitly asked.
+ Gap-fill baseline:
+ - general implementation/wiring file: `900`
+ - small entry/orchestrator file: `500`
 
- Architecture confidence rule:
- - if architecture slice confidence is low or required mappings are stale/missing, fallback to plain `context` automatically
- - record fallback reason in `.gcie/context_config.json` when bypassing slicer mode
+ ## Adaptive Recovery Order (One Change At A Time)
 
- ## Portable Defaults (Task-Shape Based)
+ When retrieval is weak, apply in this exact order:
 
- Use these as a starting point in new repos.
+ 1. Query upgrade: add explicit files, symbols, caller/entry anchor
+ 2. Scope correction: subtree vs root
+ 3. One profile/budget escalation
+ 4. Targeted gap-fill for only missing must-have file(s)
+ 5. Multi-hop decomposition only if still incomplete
 
- Primary pass budgets:
- - `auto`: simple same-layer or strong single-file lookup
- - `900`: same-family two-file lookup, frontend-local component lookup
- - `1100`: backend/config pair, same-layer backend pair
- - `1150`: cross-layer UI/API flow
- - `1300-1400`: explicit multi-hop chain (3+ linked files)
+ Stop condition:
+ - If a required file is still missing after two GCIE attempts (with query+scope corrected), stop GCIE retries and use `rg`.
 
- Gap-fill budgets:
- - missing general implementation/wiring file: `900`
- - missing small orchestration or entry file: `500`
+ ## Architecture Tracking (Portable, In-Repo)
 
- Scope rule:
- - use the smallest path scope that still contains the expected files
- - use repo root (`.`) only for true cross-layer or backend orchestration recovery
- - if explicit targets cluster in one subtree, broad repo-root retrieval is often worse than subtree retrieval
+ Track these under `.gcie/`:
+ - `.gcie/architecture.md`
+ - `.gcie/architecture_index.json`
+ - `.gcie/context_config.json`
 
- ## Query Construction (Portable)
+ Keep adaptive:
+ 1. Bootstrap from user docs once (`ARCHITECTURE.md`, `README.md`, `PROJECT.md`, `docs/*architecture*`).
+ 2. Treat `.gcie/architecture.md` as GCIE-owned working map.
+ 3. Refresh architecture files when boundaries/subsystems/interfaces change.
+ 4. Do not overwrite user-owned docs unless explicitly asked.
 
- Use this pattern:
+ Fallback confidence rule:
+ - If architecture confidence is low or mappings are stale/missing, fallback to plain context and record reason in `.gcie/context_config.json`.
 
- `<file-a> <file-b> <function/component> <state-or-arg> <route/flag> <config-key>`
+ ## Pre-Calibration Readiness Gate (Required)
 
- Guidelines:
- - include explicit file paths when known
- - include 2 to 6 distinctive symbols
- - include a caller or entry anchor when the target is indirect
- - avoid vague summaries and long laundry-list queries
+ Run before full adaptation:
 
- ## Adaptive Loop (When Retrieval Is Weak)
+ 1. Index + architecture refresh:
+ ```powershell
+ gcie.cmd index .
+ ```
 
- Treat retrieval as weak if any are true:
- - missing implementation or wiring category
- - generic entry/support files dominate
- - only tiny snippets from the target file appear, with no useful implementation body
- - expected cross-layer endpoint is missing
+ 2. Readiness probe:
+ - Run 10-20 stratified queries across major families using `context-slices` and `context`.
 
- Adapt in this order, one change at a time:
+ 3. Readiness adaptation loop:
+ - Apply recovery order (query -> scope -> escalation -> targeted gap-fill -> decomposition).
 
- 1. Query upgrade:
- - add explicit file paths
- - add missing symbols such as functions, props, routes, flags, or keys
- - add caller or entry anchor
+ 4. Gate decision:
+ - Proceed to calibration only after coverage is reachable with stable behavior.
+ - If not reachable, keep safer fallback mode for affected families and continue tracking.
 
- 2. Scope correction:
- - noisy root results: move to subtree scope
- - missing cross-layer or backend anchor: use a targeted root query for that file
+ ## Automatic Post-Trigger Adaptation (Required)
 
- 3. Budget bump:
- - raise one rung only, roughly `+100` to `+250`
+ After trigger detection in a repo session:
 
- 4. Targeted gap-fill:
- - fetch only the missing must-have file(s)
+ 1. `checkpoint: trigger_detected`
+ 2. Run `gcie.cmd index .` -> `checkpoint: index_complete`
+ 3. Run readiness probe -> `checkpoint: readiness_probe_complete`
+ 4. Run accuracy calibration to 100% must-have hit -> `checkpoint: accuracy_lock_complete`
+ 5. Run efficiency iterations only under hard accuracy gate -> `checkpoint: efficiency_complete`
+ 6. Run final stress validation (recommended 50-query) -> `checkpoint: stress_validation_complete`
+ 7. Write back results (`.planning`, `.gcie/context_config.json`, learned overrides section) -> `checkpoint: write_back_complete`
 
- 5. Decompose chain, only if needed:
- - for 4+ hops, split into adjacent 2-3 file hops
+ If any checkpoint fails, mark run `incomplete`, record failure artifact in `.planning/`, and continue recovery/fallback flow.
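Editorial sketch (not part of the package): the checkpoint sequence above can be modeled as an ordered list that is walked until the first failure. `run_checkpoints` and the `steps` mapping are hypothetical; writing the failure artifact to `.planning/` is left to the caller.

```python
CHECKPOINTS = [
    "trigger_detected",
    "index_complete",
    "readiness_probe_complete",
    "accuracy_lock_complete",
    "efficiency_complete",
    "stress_validation_complete",
    "write_back_complete",
]


def run_checkpoints(steps):
    """Run steps in checkpoint order and stop at the first failure.

    `steps` maps a checkpoint name to a zero-argument callable that returns
    True on success. A checkpoint without a step counts as failed.
    """
    reached = []
    for name in CHECKPOINTS:
        step = steps.get(name)
        try:
            ok = bool(step()) if step is not None else False
        except Exception:
            ok = False
        if not ok:
            # Caller records the failure artifact and continues with fallback.
            return {"status": "incomplete", "reached": reached, "failed_at": name}
        reached.append(name)
    return {"status": "complete", "reached": reached, "failed_at": None}


steps = {name: (lambda: True) for name in CHECKPOINTS}
steps["readiness_probe_complete"] = lambda: False
outcome = run_checkpoints(steps)
print(outcome["status"], outcome["failed_at"])
```

Returning the list of reached checkpoints makes it possible to resume or report without re-running earlier stages.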
 
- ## Safe Efficiency Mode
+ ## Mandatory Bootstrap Calibration Sequence
 
- Use only after stable coverage is achieved.
+ 1. Recall calibration stage (required):
+ - Tune mode/scope/query/profile until overall and per-family hit rates are 100%.
 
- Rules:
- - do not lower primary budgets for known hard shapes
- - for a single missing file, try `800` before `900` only if the first pass already found same-family context
- - if `800` misses, immediately retry the stable default
- - if any miss persists, revert that task family to stable settings
+ 2. Recall lock verification (required):
+ - Require 2 consecutive 100% lock runs.
 
- Note:
- - `800` is an experimental efficiency step-down, not a portable default truth
- - keep it only if it preserves full must-have coverage in the current repo
+ 3. Efficiency stage (optional, only after lock):
+ - Test controlled reductions one change at a time.
+ - Immediately rollback any hit-rate regression.
 
- ## Verification Rule
+ 4. Activation rule (required):
+ - Activate only if lock/stress pass.
+ - If stress fails, rollback to last known 100%-hit config.
 
- Always verify with a quick local symbol check before editing:
+ ## Metrics and Decision Rules
 
- ```powershell
- rg -n "symbol1|symbol2|symbol3" <likely files>
- ```
+ Per query, record:
+ - must-have hit (true/false)
+ - tokens used
+ - retrieved files
+ - escalations performed
 
- GCIE is a context compressor, not the final truth gate.
+ Track overall and by family:
+ - hit rate
+ - average and median tokens
+ - tokens-per-hit (`total_tokens / hit_count`)
 
- If one required file is still missing after retrieval, do direct-file-check first, then run one targeted GCIE call only for that file.
+ Selection rule per family:
+ 1. highest hit rate
+ 2. if tie: lowest tokens-per-hit
+ 3. if tie: lowest median tokens
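Editorial sketch (not part of the package): the metrics and the three-step selection rule above map onto a lexicographic sort key. `summarize` and `pick_config` are hypothetical names, and the `(hit, tokens)` record shape is an assumption for illustration.

```python
from statistics import median


def summarize(records):
    """Compute family metrics from a non-empty list of (hit, tokens) pairs."""
    hits = sum(1 for hit, _ in records if hit)
    tokens = [t for _, t in records]
    return {
        "hit_rate": hits / len(records),
        # tokens-per-hit = total_tokens / hit_count, infinite when nothing hit
        "tokens_per_hit": sum(tokens) / hits if hits else float("inf"),
        "median_tokens": median(tokens),
    }


def pick_config(candidates):
    """Pick by highest hit rate, then lowest tokens-per-hit, then lowest median."""
    stats = {name: summarize(records) for name, records in candidates.items()}
    return min(
        stats,
        key=lambda name: (
            -stats[name]["hit_rate"],
            stats[name]["tokens_per_hit"],
            stats[name]["median_tokens"],
        ),
    )


candidates = {
    "slicer-first": [(True, 400), (False, 600)],
    "plain-context-first": [(True, 900), (True, 1100)],
}
print(pick_config(candidates))
```

In this made-up data the slicer is cheaper but misses once, so the hit-rate criterion selects `plain-context-first` before token cost is considered.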
 
- ## Portable Stop Rule
+ Demotion rules:
+ - If slicer miss-rate > 0% during recall calibration, do not keep slicer as default for that family.
+ - If both slicer and plain fail, route family to manual-verify until recalibration.
 
- Stop retrieval when all must-have categories are covered:
- - implementation
- - wiring/orchestration
- - validation surface, when risk justifies it
+ Promotion rules:
+ - Promote only configurations that preserve 100% hit.
+ - Efficiency changes must improve tokens without reducing hit rate.
 
- Do not continue increasing budgets after sufficiency is reached.
+ ## Continuous Adaptation Over Time
 
- ## First 5 Tasks Calibration (Minimal)
+ Trigger recalibration when any are true:
+ 1. major repo-change signal (large refactor/churn)
+ 2. savings decay (rolling savings drops materially vs active baseline)
+ 3. repeated family misses (2+ in recent window)
 
- For a new repo, track these fields for the first 5 tasks:
- - task shape
- - primary budget
- - gap-fill used (Y/N)
- - must-have full-hit (Y/N)
- - total tokens
+ Guardrails:
+ 1. Use a minimum evidence window (recommended: 20 retrieval events).
+ 2. Run in quiet/background mode when possible.
+ 3. Cap adaptation budget per cycle.
+ 4. Early-stop efficiency loop after 2 non-improving iterations.
+ 5. Prefer family-scoped recalibration before full recalibration.
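Editorial sketch (not part of the package): guardrail 4 combined with the hard accuracy gate could look like the loop below. `efficiency_loop` is a hypothetical name, and reading "2 non-improving iterations" as two consecutive iterations is an assumption.

```python
def efficiency_loop(trials, baseline_tokens, patience=2):
    """Return the best average token count that kept a 100% hit rate.

    `trials` yields (hit_rate, avg_tokens) for successive candidate
    reductions. A candidate is kept only if the hit rate stays at 1.0 and
    tokens improve; anything else is rolled back and counts as
    non-improving. The loop stops after `patience` consecutive
    non-improving iterations.
    """
    best = baseline_tokens
    stale = 0
    for hit_rate, tokens in trials:
        if hit_rate == 1.0 and tokens < best:
            best, stale = tokens, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best


trials = [(1.0, 950), (0.9, 800), (1.0, 960), (1.0, 700)]
print(efficiency_loop(trials, baseline_tokens=1000))
```

In this made-up run the second trial regresses accuracy and the third does not improve tokens, so the loop stops before evaluating the fourth trial.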
 
- If a miss pattern repeats 2+ times in one task family:
- - add one local override for that family only
- - keep all other families on portable defaults
+ ## Persistence
 
- Update necessity rule:
- - explicit workflow updates are optional, not required for baseline operation
- - if results are stable, keep using portable defaults without changes
- - add or update a local override only when the same miss pattern repeats 2-3 times
+ Persist learned defaults in `.gcie/context_config.json` and `.gcie/retrieval_profile.json` with:
+ - family
+ - default mode/profile
+ - last benchmark date
+ - hit/token metrics
 
- ## Optional Appendix: Repo-Specific Overrides (Example)
+ Write repo-local learned routing here:
 
- These are examples from one mixed-layer repo and are not universal defaults.
+ ## Learned Routing Overrides (Repo-Local, Mutable)
 
- 1. `cross_layer_ui_api` override:
- ```powershell
- gcie.cmd context frontend "src/App.jsx src/main.jsx <symbols>" --intent edit --budget 900
- gcie.cmd context . "app.py start_convert selected_theme selectedTheme no_ai" --intent edit --budget 900
- ```
+ No active learned overrides yet.
+ Populate after first full adaptation cycle.
 
- 2. Stage 3/4 planner-builder pair override (`Plan_slides.py` + `Build_pptx.py`):
- ```powershell
- gcie.cmd context . "Plan_slides.py content_slides section_divider figure_slides table_slide" --intent <intent> --budget 900
- gcie.cmd context . "Build_pptx.py build_pptx render_eq_png apply_theme THEME_CHOICES" --intent <intent> --budget 900
- ```
+ ## Agent Instructions Snippet (Copy/Paste)
 
- 3. Stage 1/2 with `main.py` override:
- ```powershell
- gcie.cmd context . "Analyze_pdf_structure.py Extract_pdf_content.py extract_pages split_into_sections extract_images enrich_with_ai" --intent explore --budget 1100
- gcie.cmd context . "main.py Stage 1 Stage 2 extract_pages enrich_with_ai" --intent explore --budget 500
+ ```text
+ Use GCIE for context lookup before reading files or making edits. Follow GCIE_USAGE.md.
+ Prioritize must-have coverage over token savings.
+ Start with context-slices --profile low, then adapt using recovery order:
+ query -> scope -> profile/budget escalation -> targeted gap-fill -> rg fallback.
  ```
 
- 4. Guardrail example:
- - keep the stable workflow for families that regress under split retrieval
- - example: `llm_client.py + Analyze_pdf_structure.py + Extract_pdf_content.py` in one benchmarked repo
+ ## Notes
 
- If this appendix does not match your repo, ignore it and use only the portable sections above.
+ 1. This file is intentionally generalized and adaptive for any repo.
+ 2. Keep repo-specific tuning in learned overrides and `.gcie` state, not in global defaults.
+ 3. If in doubt, choose the higher-accuracy path first, then optimize tokens after lock.
package/README.md CHANGED
@@ -58,7 +58,7 @@ Use this when you want a fast drop-in setup for coding agents.
  No heavy upfront tuning is required. The workflow starts portable-first and only adds local overrides after repeated miss patterns.
 
  One-command repo bootstrap:
- - `gcie.cmd setup .`
+ - `npx -y @pmaddire/gcie@latest setup .`
 
  This creates `.gcie` architecture tracking files, copies portable agent workflow docs, and runs an initial index pass.
 
@@ -101,7 +101,7 @@ Net: adapted protocol preserved full accuracy while reducing average tokens by ~
  After publishing to npm, users can set up any repo with one command:
 
  ```powershell
- npx gcie@latest
+ npx -y @pmaddire/gcie@latest setup .
  ```
 
  This runs `gcie setup .` in the current repo by default.
@@ -110,10 +110,32 @@ If Python deps are missing, GCIE now bootstraps a local package venv and install
  Optional setup flags are passed through:
 
  ```powershell
- npx gcie@latest --no-index
- npx gcie@latest --force
+ npx -y @pmaddire/gcie@latest setup . --no-index
+ npx -y @pmaddire/gcie@latest setup . --force
  ```
 
+ For command-only usage without setup:
+
+ ```powershell
+ npx -y @pmaddire/gcie@latest --help
+ ```
+
+ ## Agent Integration
+
+ To make your coding agent use GCIE automatically, add this trigger line to your
+ agent instructions (system prompt / repo instruction file):
+
+ `Use GCIE for context lookup before reading files or making edits. Follow GCIE_USAGE.md.`
+
+ Required file:
+ - keep `GCIE_USAGE.md` in the target repo root
+
+ Recommended setup:
+ 1. Run one-command setup:
+ - `npx -y @pmaddire/gcie@latest setup .`
+ 2. Add the trigger line above to your agent instruction file.
+ 3. Start normal coding tasks; the agent should use GCIE-first retrieval workflow.
+
  ## One-Command GitHub Bootstrap
 
  Run this from the target repo to download GCIE from GitHub and set it up automatically:
@@ -146,7 +168,7 @@ What it does:
  1. In the GCIE repo:
  - `npm link`
  2. In your target repo:
- - `npm link gcie`
+ - `npm link @pmaddire/gcie`
  3. Verify:
  - `gcie --help`
 
@@ -162,7 +184,7 @@ This repo includes a lightweight npm wrapper so you can run `gcie` like other np
  2. In target repo: `gcie --help`
 
  Local option:
- - `npm install` then `npx gcie --help`
+ - `npm install` then `npx @pmaddire/gcie@latest --help`
 
  The wrapper prefers `.venv` in the GCIE repo and falls back to system Python.
 
@@ -216,7 +238,7 @@ Important note:
  - `gcie index <path>`
  - `gcie query <file.py> "<question>"`
  - `gcie debug <file.py> "<question>"`
- - `gcie context <repo|file> "<task>" --budget auto --intent <edit|debug|refactor|explore>`
+ - `gcie context <repo|file> "<task>" --budget auto --intent <edit|debug|refactor|explore> --mode basic`
  - `gcie context-slices <repo> "<task>" --intent <edit|debug|refactor|explore> [--profile recall|low] [--stage-a 400] [--stage-b 800] [--max-total 1200] [--pin frontend/src/App.jsx] [--pin-budget 300] [--include-tests]`
 
  ## How To Use It
@@ -367,6 +389,6 @@ npm publish --access public
  Then users can run:
 
  ```powershell
- npx gcie@latest
+ npx -y @pmaddire/gcie@latest setup .
  ```
 
@@ -0,0 +1,69 @@
+ import os
+ import pathlib
+ from parser.ast_parser import parse_python_file
+ from graphs.call_graph import build_call_graph
+ from graphs.variable_graph import build_variable_graph
+ from retrieval.hybrid_retriever import hybrid_retrieve
+ from llm_context.snippet_selector import RankedSnippet, estimate_tokens
+ from llm_context.context_builder import build_context
+
+ ROOT=pathlib.Path('.')
+ EXCLUDE={'__pycache__','.venv','venv'}
+ py_files=[]
+ for path in ROOT.rglob('*.py'):
+     if any(part in EXCLUDE for part in path.parts):
+         continue
+     py_files.append(path)
+
+ snippets_by_node={}
+ modules=[]
+ for path in py_files:
+     try:
+         module=parse_python_file(path)
+     except Exception:
+         continue
+     modules.append(module)
+     text=path.read_text()
+     lines=text.splitlines()
+     for fn in module.functions:
+         start=max(0,fn.start_line-1)
+         end=min(len(lines),fn.end_line)
+         snippet='\n'.join(lines[start:end])
+         node=f"function:{path.as_posix()}::{fn.name}"
+         snippets_by_node[node]=snippet
+
+ call_graph=build_call_graph(modules)
+ var_graph=build_variable_graph(modules)
+ graph=call_graph
+ for node,attrs in var_graph.nodes(data=True):
+     if not graph.has_node(node):
+         graph.add_node(node,**attrs)
+ for u,v,data in var_graph.edges(data=True):
+     graph.add_edge(u,v,**data)
+
+ prompts=[
+     "Why is variable diff exploding?",
+     "How does git history mining handle empty repositories?",
+     "How do CLI index/query/debug commands work?",
+ ]
+
+ def naive_tokens():
+     total=0
+     for path in py_files:
+         total+=estimate_tokens(path.read_text())
+     return total
+
+ naive=naive_tokens()
+ print('Prompt|GCIE tokens|Naive tokens|Reduction%|Selected snippets|Notes')
+ for prompt in prompts:
+     hybrid=hybrid_retrieve(graph,prompt,top_k=10,git_recency_by_node={},coverage_risk_by_node={},max_hops=2)
+     ranked=[]
+     for cand in hybrid:
+         text=snippets_by_node.get(cand.node_id)
+         if not text:
+             continue
+         ranked.append(RankedSnippet(cand.node_id,text,cand.score))
+     context=build_context(prompt,ranked,token_budget=300)
+     reduction=(1-context.total_tokens_estimate/naive)*100 if naive else 0
+     note='good' if ranked else 'empty'
+     print(f"{prompt}|{context.total_tokens_estimate:.1f}|{naive:.1f}|{reduction:.1f}%|{len(context.snippets)}|{note}")