@ictechgy/context-guard 0.4.8 → 0.4.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54)
  1. package/CHANGELOG.md +29 -0
  2. package/README.ko.md +92 -37
  3. package/README.md +111 -37
  4. package/docs/benchmark-fixtures/token-savings-12task-baseline.prompt.example.md +7 -0
  5. package/docs/benchmark-fixtures/token-savings-12task-contextguard.prompt.example.md +7 -0
  6. package/docs/benchmark-fixtures/token-savings-12task.tasks.example.json +182 -0
  7. package/docs/benchmark-fixtures/token-savings-12task.variants.example.json +10 -0
  8. package/docs/distribution.md +10 -7
  9. package/docs/experimental-benchmark-fixtures.md +8 -1
  10. package/package.json +3 -6
  11. package/packaging/homebrew/context-guard.rb.template +1 -1
  12. package/plugins/context-guard/.claude-plugin/plugin.json +1 -1
  13. package/plugins/context-guard/README.ko.md +9 -6
  14. package/plugins/context-guard/README.md +27 -12
  15. package/plugins/context-guard/bin/context-guard +113 -26
  16. package/plugins/context-guard/bin/context-guard-artifact +542 -46
  17. package/plugins/context-guard/bin/context-guard-cache-score +380 -0
  18. package/plugins/context-guard/bin/context-guard-compress +146 -1
  19. package/plugins/context-guard/bin/context-guard-cost +783 -4
  20. package/plugins/context-guard/bin/context-guard-experiments +2211 -121
  21. package/plugins/context-guard/bin/context-guard-failed-nudge +3 -0
  22. package/plugins/context-guard/bin/context-guard-filter +163 -7
  23. package/plugins/context-guard/bin/context-guard-guard-read +3 -0
  24. package/plugins/context-guard/bin/context-guard-pack +602 -43
  25. package/plugins/context-guard/bin/context-guard-rewrite-bash +3 -0
  26. package/plugins/context-guard/bin/context-guard-setup +165 -31
  27. package/plugins/context-guard/bin/context-guard-statusline +490 -283
  28. package/plugins/context-guard/bin/context-guard-statusline-merged +5 -0
  29. package/plugins/context-guard/bin/context-guard-tool-prune +241 -1
  30. package/plugins/context-guard/lib/context_guard_commands.py +206 -0
  31. package/plugins/context-guard/skills/setup/SKILL.md +1 -0
  32. package/context-guard-kit/README.md +0 -91
  33. package/context-guard-kit/benchmark_runner.py +0 -2401
  34. package/context-guard-kit/claude_transcript_cost_audit.py +0 -2346
  35. package/context-guard-kit/context_compress.py +0 -695
  36. package/context-guard-kit/context_escrow.py +0 -935
  37. package/context-guard-kit/context_filter.py +0 -637
  38. package/context-guard-kit/context_guard_cli.py +0 -325
  39. package/context-guard-kit/context_guard_diet.py +0 -1711
  40. package/context-guard-kit/context_pack.py +0 -2713
  41. package/context-guard-kit/cost_guard.py +0 -2349
  42. package/context-guard-kit/experimental_registry.py +0 -2339
  43. package/context-guard-kit/failed_attempt_nudge.py +0 -567
  44. package/context-guard-kit/guard_large_read.py +0 -690
  45. package/context-guard-kit/hook_secret_patterns.py +0 -43
  46. package/context-guard-kit/read_symbol.py +0 -483
  47. package/context-guard-kit/rewrite_bash_for_token_budget.py +0 -501
  48. package/context-guard-kit/sanitize_output.py +0 -725
  49. package/context-guard-kit/settings.example.json +0 -67
  50. package/context-guard-kit/setup_wizard.py +0 -2515
  51. package/context-guard-kit/statusline.sh +0 -362
  52. package/context-guard-kit/statusline_merged.sh +0 -157
  53. package/context-guard-kit/tool_schema_pruner.py +0 -837
  54. package/context-guard-kit/trim_command_output.py +0 -1449
package/README.md CHANGED
@@ -1,19 +1,21 @@
1
1
  # ContextGuard
2
2
 
3
- ContextGuard is a local-first context management toolkit for AI coding and tool agents. It ships as a Claude Code plugin first install once, apply per project, reverse if needed. Guardrails cover trimmed output, symbol-level reads, repeated-failure nudges, secret-pattern redaction, and usage measurement, then extend to other agents through local helper commands and advisory brief-mode rule snippets.
3
+ ContextGuard is a local-first context management toolkit for AI coding and tool agents. It ships first as a Claude Code plugin: install it once, enable it per project, and roll it back when needed.
4
+
5
+ It trims noisy output, steers agents toward symbol-level reads, nudges repeated failures, redacts secret-like patterns, and measures usage. The same guardrails extend to other agents through local helper commands and advisory brief-mode rule snippets.
4
6
 
5
7
  - Korean documentation: [`README.ko.md`](README.ko.md)
6
8
  - Static landing page: [GitHub Pages](https://ictechgy.github.io/context-guard/) ([source](docs/index.html))
7
9
 
8
10
  ## TL;DR
9
11
 
10
- Install and activation are separate. Installing ContextGuard only puts local helpers or Claude plugin skills in reach; configuration changes happen later through an explicit setup command.
12
+ Installation and activation are deliberately separate. Installing ContextGuard only makes local helpers or Claude plugin skills available. Configuration changes happen only when you run an explicit setup command.
11
13
 
12
14
  | If you use... | Install | Activate |
13
15
  | --- | --- | --- |
14
16
  | Claude Code | `/plugin marketplace add ictechgy/context-guard` then `/plugin install context-guard@context-guard` | Run `/context-guard:setup` inside the project. |
15
17
  | Codex CLI or any terminal-first agent | `npm install -g @ictechgy/context-guard` or one-shot `npx @ictechgy/context-guard ...` | `context-guard setup --agent codex --scope project --with-init --with-skill --plan`, then rerun with `--yes`. |
16
- | Other rule-file agents | npm/npx install above | `context-guard setup --agent gemini,cursor,windsurf,cline,copilot --scope project --with-init --plan`, then apply only the agents you want. |
18
+ | Other rule-file agents | Use the npm/npx install path above. | `context-guard setup --agent gemini,cursor,windsurf,cline,copilot --scope project --with-init --plan`, then apply only the agents you want. |
17
19
  | macOS/Homebrew users | release path: `brew install ictechgy/tap/context-guard` | Same `context-guard setup ...` commands after install. |
18
20
 
19
21
  Common commands:
@@ -27,16 +29,16 @@ context-guard setup --agent claude --scope user --verify --json # read-only use
27
29
  context-guard setup --agent claude --scope user --plan
28
30
  ```
29
31
 
30
- Project scope is the default. User-level setup is opt-in, requires an explicit agent for writes, records backups/rollback metadata, and never runs during package installation. Use `context-guard doctor` or `context-guard setup --verify` for a read-only health check before applying setup; doctor only reports next commands and makes no changes.
32
+ Project scope is the default. User-level setup is opt-in, requires an explicit agent for writes, records backups and rollback metadata, and never runs during package installation. Use `context-guard doctor` or `context-guard setup --verify` for a read-only health check before applying setup. `doctor` reports next commands and makes no changes. Setup resolves bundled or checkout-local helpers first; it does not trust arbitrary `PATH` helpers unless you explicitly pass `--allow-path-helper-fallback` for a known-good install.
31
33
 
32
- ContextGuard is intentionally conservative about savings claims. It reduces common sources of context bloat and provides benchmark tooling so you can measure real before/after results on your own tasks. It does **not** promise a fixed token or cost reduction for every repository.
34
+ ContextGuard is intentionally conservative about savings claims. It reduces common sources of context bloat and provides benchmark tooling so you can measure before/after results on your own tasks. It does **not** promise a fixed token or cost reduction for every repository.
33
35
 
34
36
  ## Claude Code first, other agents too
35
37
 
36
- ContextGuard ships as a Claude Code plugin, and that is still the fastest path to value. Once installed, the same local-first guardrails can be reused by other AI coding and tool agents through:
38
+ ContextGuard ships first as a Claude Code plugin, which is still the fastest path to value for Claude users. After installation, the same local-first guardrails can be reused by other AI coding and tool agents through:
37
39
 
38
40
  - **Local helper commands** (`context-guard-*`) that run as plain shell commands, independent of any specific agent.
39
- - **Advisory brief-mode rule snippets** you install into an agent's own instruction file (`AGENTS.md`, `GEMINI.md`, `.cursorrules`, Copilot instructions, and similar rule files) and remove by deleting the marker-delimited block.
41
+ - **Advisory brief-mode rule snippets** that you install into an agent's own instruction file (`AGENTS.md`, `GEMINI.md`, `.cursorrules`, Copilot instructions, and similar rule files) and remove by deleting the marker-delimited block.
40
42
  - **Dry-run cross-agent setup** that writes only local files, backs up before changing anything, and applies only with explicit approval.
41
43
 
42
44
  Current setup surfaces:
@@ -54,14 +56,14 @@ Current setup surfaces:
54
56
 
55
57
  ## How ContextGuard reduces token waste
56
58
 
57
- ContextGuard does not make the model cheaper by itself. It reduces avoidable context before it reaches an AI coding agent, then gives you signals to measure whether that helped.
59
+ ContextGuard does not make the model cheaper by itself. It reduces avoidable context before it reaches an AI coding agent, then gives you signals to measure whether the change helped.
58
60
 
59
61
  | Waste path | ContextGuard guardrail |
60
62
  | --- | --- |
61
63
  | Whole-file reads for one function | Suggest search, symbol slices, bounded outlines, and small line ranges before a full read. |
62
64
  | Long test, build, search, or diff output | Trim output, emit structured digests, or store large logs locally and return compact receipts. |
63
65
  | Repeated failing commands | Warn after repeated Bash failures so the agent changes strategy before more stale logs enter context. |
64
- | Secret-like or noisy terminal output | Apply best-effort, pattern-based redaction for common credential patterns and sensitive-looking paths before output is copied into context. |
66
+ | Secret-like or noisy terminal output | Apply best-effort pattern-based redaction for common credential patterns and sensitive-looking paths before output is copied into context. |
65
67
  | Unknown token/cost hotspots | Surface statusline signals, transcript audits, and matched benchmark reports for before/after evidence. |
66
68
  | Anthropic API requests that may miss prompt cache | `context-guard cost preflight` estimates input size, breakpoint-level cache risk, and low/mid/high cost ranges before a call; default mode warns only. |
67
69
  | Volatile context before stable prompt prefixes | Audit bounded redacted prompt-segment hashes and flag likely cache-unfriendly prompt layouts without exposing raw prompt text. |
@@ -69,21 +71,21 @@ ContextGuard does not make the model cheaper by itself. It reduces avoidable con
69
71
 
70
72
  ## How it fits with caching and compression tools
71
73
 
72
- ContextGuard is complementary to provider and semantic caches, and adjacent to prompt compression. It focuses on **not sending unnecessary files, logs, or output in the first place**.
74
+ ContextGuard complements provider and semantic caches, and sits next to prompt compression. Its main job is simpler: **do not send unnecessary files, logs, or output in the first place**.
73
75
 
74
76
  | Tool category | Saves by | ContextGuard relationship |
75
77
  | --- | --- | --- |
76
78
  | Provider prompt/context caching | Reusing stable prompt prefixes. | Complementary; ContextGuard helps keep the changing tail of context smaller and cleaner, `context-guard-audit` can flag likely volatile prefix layouts, and `context-guard cost` can warn when an Anthropic request is likely to create/cache-write instead of read. |
77
79
  | Semantic response cache | Reusing answers to identical or similar requests. | Complementary; ContextGuard does not serve cached AI answers. |
78
80
  | Prompt/context compression | Shortening text that is already selected for the model. | Adjacent; ContextGuard trims and summarizes local output, but does not promise lossless semantic compression. |
79
- | Experimental dry-run planners | Reviewing learned-compression, visual crop/OCR, self-hosted metrics, context-diff compaction, and local-proxy ideas before any runtime path ships. | Default-off and advisory-only; no runtime compression, OCR, forwarding, or hosted-savings claim without separate evidence and future PR gates. |
81
+ | Experimental planners and local runtimes | Default-off and explicit-command-only; covers local-proxy plans and gate records plus narrow local runtimes for caller-supplied context-diff, visual evidence-pack, learned-compression, and self-hosted metrics evidence. | The local proxy `record` command starts no listener and forwards no traffic; `serve local-proxy` binds and forwards only literal loopback IPs for one bounded request. Compressor/model execution, OCR/crop services, external forwarding, credential persistence, and hosted-savings claims stay out of scope until a separate evidence gate and future PR allow them. |
80
82
  | ContextGuard | Avoiding unnecessary files, logs, repeated failures, and noisy output before they enter agent context. | Local guardrails, reversible artifacts, and measurement. |
81
83
 
82
84
  Related patterns that informed the design:
83
85
 
84
86
  | Approach | What it emphasizes | ContextGuard relationship |
85
87
  | --- | --- | --- |
86
- | Compression-first | Shortening text already selected for the model, often with lossy transforms. | ContextGuard prefers local artifact storage with exact slice retrieval over lossy one-way compression; you can get the original back. |
88
+ | Compression-first | Shortening text already selected for the model, often with lossy transforms. | ContextGuard prefers local artifact storage with exact slice retrieval over lossy one-way compression, so you can get the original back. |
87
89
  | Terse-output rulesets across agents | Installing brief-mode output rules into many agents at once. | ContextGuard offers advisory brief-mode snippets and dry-run cross-agent setup — opt-in per project, no guaranteed savings claimed. |
88
90
  | ContextGuard | Avoiding unnecessary files, logs, and output before they enter context, with conservative measurement. | Local guardrails, reversible artifacts and retrieval, and benchmark evidence you measure yourself. |
89
91
 
@@ -102,6 +104,7 @@ When you need a savings claim, measure it on your own tasks:
102
104
  - transcript hotspots reported by `context-guard-audit`, including `cache_friendliness` prompt-layout signals and `cache_layout_advice` experiment priorities
103
105
  - statusline `cache` / `reuse` as observed transcript/provider-cache signals, not savings caused by ContextGuard
104
106
  - `context-guard cost preflight` estimates for Anthropic request JSON, followed by `context-guard cost observe` using provider usage fields (`cache_creation_input_tokens`, `cache_read_input_tokens`) after the call
107
+ - static prompt/request cache layout checks from `context-guard-cache-score`; its char/4 token estimates and warnings are advisory only until provider usage fields confirm real cache hits
105
108
  - matched successful baseline/variant runs from `context-guard-bench`
106
109
  - large tool/MCP catalogs versus `context-guard-tool-prune` top-k reports plus receipt retrieval
107
110
  - optional experimental lanes in [`research/experimental-token-reduction-radar.md`](research/experimental-token-reduction-radar.md); fixture-only starters in [`docs/experimental-benchmark-fixtures.md`](docs/experimental-benchmark-fixtures.md) use the same matched-task benchmark gates before any savings claim
@@ -113,7 +116,8 @@ When you need a savings claim, measure it on your own tasks:
113
116
  - It does not mutate global Claude settings during install.
114
117
  - It does not replace real before/after measurement when you need a savings claim.
115
118
  - Local RAM/disk receipts can reduce what you send next, but they do **not** replace Anthropic's provider prompt cache or guarantee cache hits. Recheck Anthropic prompt-caching and pricing docs before release or billing claims: https://docs.anthropic.com/en/build-with-claude/prompt-caching and https://platform.claude.com/docs/en/about-claude/pricing.
116
- - Experimental helpers are dry-run checker/planner surfaces only. ContextGuard can review learned-compression policy, visual crop/OCR metadata, self-hosted metrics ledger previews, context-diff compaction plans, and local-proxy constraints, but it does not ship learned/synthetic compressor runtime, embeddings, rerankers, model calls, replacement generation, OCR/crop runtime, self-hosted KV/latent inference optimization, or actual proxy forwarding runtime.
119
+ - Experimental helpers are mostly dry-run checker/planner surfaces, including a design-only external-forwarding opt-in gate. Explicit local runtimes exist only for caller-supplied context-diff replacement payloads, caller-supplied visual crop/OCR evidence packs, caller-supplied learned-compression prose candidates, self-hosted metrics JSONL sidecar records, local-proxy runtime-gate JSONL records, and one-shot `serve local-proxy` loopback forwarding with a private ready-file nonce plus optional shifted-cost diagnostic JSONL rows for successful forwarded requests.
120
+ - ContextGuard does not ship learned/synthetic compressor execution, embeddings, rerankers, model calls, generated replacement text, screenshot capture, image cropping, OCR execution, image parsing, external OCR/image services, self-hosted KV/latent inference optimization beyond explicit local metrics recording, or broader proxy forwarding beyond literal-loopback, one-request HTTP forwarding with credential material blocked.
117
121
  - It does not alias the old `/claude-token-optimizer:*` Claude Code slash-command namespace. Use `/context-guard:*` after installing this plugin.
118
122
 
119
123
  Legacy local CLI wrappers (`claude-token-*`, `claude-read-symbol`, `claude-trim-output`, and `claude-sanitize-output`) still ship in `bin/` so existing automation can migrate gradually.
@@ -124,16 +128,16 @@ Legacy local CLI wrappers (`claude-token-*`, `claude-read-symbol`, `claude-trim-
124
128
  | --- | --- |
125
129
  | Claude Code plugin skills | Guided setup, optimization, and transcript usage audits. |
126
130
  | Project-local setup wizard | Applies recommended `.claude/settings.json` options without touching global settings. |
127
- | Context hygiene scanner | Finds missing guardrails, noisy hooks, broad reads, large context files, secret-like files, excessive MCP servers, and expensive defaults. |
131
+ | Context management scanner | Finds missing guardrails, noisy hooks, broad reads, large context files, secret-like files, excessive MCP servers, and expensive defaults. |
128
132
  | Structural-waste doctor | Opt-in local diagnostics for duplicate rules, stale imports, unused skill candidates, oversized tool schemas, and repeated read/tool-call loops. |
129
133
  | Large-read guard and symbol reader | Nudges the agent toward `rg`, symbol reads, and small line ranges instead of full-file reads. |
130
134
  | Output trimming and sanitizing | Keeps test, build, search, and diff output compact while redacting likely secrets before they enter agent context. |
131
135
  | Declarative output filter | Opt-in JSON DSL for user-owned command filters with protected failure passthrough and validation before use. |
132
136
  | Local artifact store | Saves large sanitized logs outside the conversation and returns compact receipts or exact requested slices. |
133
- | Anthropic cost guard | `context-guard cost preflight/observe/ledger/compile` estimates cache-risk and cost ranges, stores only keyed HMAC fingerprints, and stays passive unless `--enforce` is explicit. |
134
- | Budgeted context packer | Assembles prioritized local file evidence into a byte-budgeted Markdown pack, can suggest a build-compatible manifest from local signals, and adds `--explain` for compact local selection reasons plus bounded repo-map metadata. |
137
+ | Anthropic cost guard | `context-guard cost preflight/observe/ledger/compile` estimates cache risk and cost ranges. `context-guard route-advisor` summarizes local total-cost and batchability route candidates, stores only keyed HMAC fingerprints where a ledger is used, and stays passive unless `--enforce` is explicit. |
138
+ | Budgeted context packer | Assembles prioritized local file evidence into a byte-budgeted Markdown pack, can suggest a build-compatible manifest from local signals, adds `--explain` for compact local selection reasons plus bounded repo-map metadata, and adds opt-in `--adaptive-k` / `--symbol-memory` advisory metadata. |
135
139
  | Tool/MCP schema pruner | Emits bounded top-k tool/schema advisory reports from local catalogs with compact receipts and full sanitized payload retrieval. |
136
- | Conservative stdin compressor | Shrinks selected JSON, diffs, logs, search output, code, and prose with observed byte evidence and estimated token proxies. |
140
+ | Conservative stdin compressor | Shrinks selected JSON, diffs, logs, search output, code, and prose with observed byte evidence and estimated token proxies; `--mode readable` adds an opt-in readable prose preview with exact fallback guidance. |
137
141
  | Protected-zone policy receipts | Opt-in `context-guard-compress --protected-policy` and `context-guard cost compile` metadata mark code/diff/path/hash/JSON/literal zones as structural-only with exact retrieval guidance. |
138
142
  | Repeated-failure nudge | Warns after repeated Bash failures so the agent changes strategy before stale logs fill the context. |
139
143
  | Statusline, audit, and benchmarks | Shows context/cache/cost signals, finds usage and cache-friendliness hotspots, and records conservative before/after evidence. |
@@ -169,7 +173,7 @@ Setup is explicit, project-local, and reversible. The plugin does not configure
169
173
 
170
174
  ## Install with npm/npx
171
175
 
172
- The npm package exposes a canonical `context-guard` command plus the backwards-compatible `context-guard-*` helper commands. Package installation is passive: there is no `postinstall` setup hook and no config write until you run `context-guard setup` yourself.
176
+ The npm package exposes a canonical `context-guard` command plus backward-compatible `context-guard-*` helper commands. Package installation is passive: there is no `postinstall` setup hook and no config write until you run `context-guard setup` yourself. If setup cannot find bundled or checkout-local helpers, `PATH` fallback remains disabled by default; use `--allow-path-helper-fallback` only for trusted helper directories after `context-guard doctor` or `setup --verify` confirms the plan.
173
177
 
174
178
  ```bash
175
179
  npm install -g @ictechgy/context-guard
@@ -213,7 +217,7 @@ context-guard setup --agent claude --scope user --verify --json
213
217
 
214
218
  Both modes are read-only configuration checks. `doctor` reports recommended next commands, and `setup --verify` checks whether setup is complete without applying changes. With `--json`, the report is written to stdout.
215
219
 
216
- ### Scan context hygiene
220
+ ### Scan context management
217
221
 
218
222
  ```bash
219
223
  ./plugins/context-guard/bin/context-guard-diet scan .
@@ -244,10 +248,11 @@ The optional Read guard uses a progressive path for oversized files: search firs
244
248
 
245
249
  ```bash
246
250
  long-command 2>&1 | ./plugins/context-guard/bin/context-guard-artifact store --command "long-command" --json
251
+ ./plugins/context-guard/bin/context-guard-artifact search "ERROR" --json
247
252
  ./plugins/context-guard/bin/context-guard-artifact get <artifact_id> --lines 1:80
248
253
  ```
249
254
 
250
- Artifact mode is for capture and retrieval. It stores sanitized output under `.context-guard/artifacts` by default and can still read legacy `.claude-token-optimizer/artifacts` receipts from before the rebrand. JSON receipts include line-numbered top-error receipts, duplicate-line groups, and sanitized bounded `suggested_queries` so an agent can fetch the smallest useful exact slice instead of replaying the full log. When `--max-lines` accompanies a `--lines START:END` selector, it caps lines returned within that range; it does not expand the selector. Preserve the producer command's exit code yourself when using shell pipelines in release checks, or use `context-guard-trim-output -- ...` when exit-code preservation is the primary requirement.
255
+ Artifact mode is for capture, sandbox search, and retrieval. It stores sanitized output under `.context-guard/artifacts` by default and can still read legacy `.claude-token-optimizer/artifacts` receipts from before the rebrand. JSON receipts include line-numbered top-error receipts, duplicate-line groups, and sanitized bounded `suggested_queries` so an agent can fetch the smallest useful exact slice instead of replaying the full log. `search` scans the local sanitized artifact sandbox by literal substring, returns capped match/context records, and includes `context-guard-artifact get ... --lines START:END` rehydration commands for omitted detail. For custom `--dir` values, raw private paths stay redacted by default; rerun with the same `--dir`, or pass `search --show-paths` when you explicitly want a directly executable local command. The search report is local-only and does not make hosted token/cost savings claims. When `--max-lines` accompanies a `--lines START:END` selector, it caps lines returned within that range; it does not expand the selector. Preserve the producer command's exit code yourself when using shell pipelines in release checks, or use `context-guard-trim-output -- ...` when exit-code preservation is the primary requirement.
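Reviewer note: the `--lines START:END` plus `--max-lines` rule described above can be illustrated with a small sketch. This is not the package's implementation. The function name `select_lines` is hypothetical, and the 1-based inclusive reading of `START:END` is an assumption; the only behavior taken from the README text is that the cap limits lines returned within the range and never expands the selector.

```python
from typing import List, Optional, Tuple


def select_lines(
    text: str, start: int, end: int, max_lines: Optional[int] = None
) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for the range start..end.

    Assumption: the range is 1-based and inclusive. max_lines only caps how
    many lines of that range are returned; it never widens the range.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    if max_lines is not None and max_lines < 0:
        raise ValueError("max_lines must be non-negative")

    lines = text.splitlines()
    # Slice the requested range first, keeping original line numbers.
    selected = list(enumerate(lines[start - 1:end], start=start))
    # Then apply the cap inside that range only.
    if max_lines is not None:
        selected = selected[:max_lines]
    return selected


if __name__ == "__main__":
    log = "\n".join(f"line {i}" for i in range(1, 101))
    for number, line in select_lines(log, 10, 20, max_lines=3):
        print(number, line)
```

Under these assumptions, a cap larger than the range has no effect: asking for lines 10 to 12 with a cap of 50 still yields three lines.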
251
256
 
252
257
  ### Build a budgeted context pack
253
258
 
@@ -258,17 +263,31 @@ Artifact mode is for capture and retrieval. It stores sanitized output under `.c
258
263
  --diff HEAD \
259
264
  --manifest-out suggested-pack.json \
260
265
  --pack-out context-pack.md \
261
- --budget-bytes 12000 --json --explain
266
+ --budget-bytes 12000 --json --explain --adaptive-k --symbol-memory
262
267
  # Or run the two explicit steps:
263
268
  ./plugins/context-guard/bin/context-guard-pack suggest \
264
269
  --root . --query "review failing tests" --diff HEAD \
265
- --manifest-out suggested-pack.json --budget-bytes 12000 --json
270
+ --manifest-out suggested-pack.json --budget-bytes 12000 --json --adaptive-k
266
271
  ./plugins/context-guard/bin/context-guard-pack build \
267
272
  --root . --manifest suggested-pack.json --budget-bytes 12000 --json
268
273
  ./plugins/context-guard/bin/context-guard-pack slice --root . --path README.md --lines 1:40 --json
269
274
  ```
270
275
 
271
- `context-guard-pack auto` is the one-command local-only path that runs the same suggestion step and immediately builds the budgeted Markdown pack; add `--explain` when you want compact deterministic local selection/build reasons in JSON or text output. `--explain` also includes bounded `repo_map` metadata: sampled byte/token-proxy tree entries, category-only secret-risk counts, signature-first file hints, explain-only graph ranks, and exact `slice`/symbol retrieval hints. This metadata does not change the manifest, pack body, receipt, or byte budget, does not use network/model/embedding calls, and treats token values as local `chars_div_4` proxies rather than provider-token or savings claims. `--manifest-out` remains a build-compatible manifest, while `--pack-out` can save the rendered pack. `context-guard-pack suggest` is the lower-level additive local-only planning step. It ranks candidate files and line ranges from `--query`, `--diff`, repeated `--files`, and optional `--output` / `--test-output` text files under `--root` after sanitizing those output signals, then writes a manifest that `build --manifest` can consume. It uses deterministic standard-library heuristics only: no network, model calls, embeddings, or provider-cost estimate. `context-guard-pack build` assembles prioritized local file evidence into a Markdown body whose rendered UTF-8 bytes stay within `--budget-bytes`. JSON output records included, partial, duplicate, unsafe, missing, and budget-omitted sources, writes a bounded local receipt under `.context-guard/packs`, and includes copy-pasteable `slice` commands for exact sanitized retrieval when the path/root are safe to display. If retrieval is unsafe, the pack and JSON metadata include `retrieval_omitted_reason` instead of a command. Byte counts are observed; token counts remain estimated `chars_div_4` proxies, not measured provider-token savings.
276
+ `context-guard-pack auto` is the one-command, local-only path: it runs the suggestion step and immediately builds the budgeted Markdown pack.
277
+
278
+ A few boundaries are intentional:
279
+
280
+ - Add `--explain` for compact deterministic local selection/build reasons in JSON or text output.
281
+ - `--explain` may include bounded `repo_map` metadata: sampled byte/token-proxy tree entries, category-only secret-risk counts, signature-first file hints, explain-only graph ranks, and exact `slice`/symbol retrieval hints.
282
+ - Explain metadata does not change the manifest, pack body, receipt, or byte budget. It does not use network/model/embedding calls, and token values remain local `chars_div_4` proxies rather than provider-token or savings claims.
283
+ - Add `--adaptive-k` to `suggest` or `auto` for advisory-only shrink/expand top-k metadata derived from local score distribution, byte-budget fit, and score-mass recall/precision proxies. It never applies the recommendation automatically and does not change the manifest, pack body, receipt, or byte budget.
284
+ - Add `--symbol-memory` to `auto` for repo-map-derived symbol/graph advisory metadata with exact `slice` / `read-symbol` verification hints. It is source-verification guidance only and does not change the manifest, pack body, receipt, or byte budget.
285
+ - `--manifest-out` writes a build-compatible manifest; `--pack-out` saves the rendered pack.
286
+ - `context-guard-pack suggest` is the lower-level additive local-only planning step. It ranks candidate files and line ranges from `--query`, `--diff`, repeated `--files`, and optional sanitized `--output` / `--test-output` files under `--root`, then writes a manifest that `build --manifest` can consume.
287
+ - `context-guard-pack build` assembles prioritized local file evidence into a Markdown body whose rendered UTF-8 bytes stay within `--budget-bytes`. JSON output records included, partial, duplicate, unsafe, missing, and budget-omitted sources.
288
+ - Bounded receipts are stored under `.context-guard/packs`. When path/root display is safe, JSON output includes copy-pasteable `slice` commands for exact sanitized retrieval; otherwise it records `retrieval_omitted_reason`.
289
+
290
+ The packer uses deterministic standard-library heuristics only: no network, model calls, embeddings, or provider-cost estimate. Byte counts are observed; token counts remain estimated `chars_div_4` proxies, not measured provider-token savings.
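Reviewer note: the byte-budget boundary described above (rendered UTF-8 bytes stay within `--budget-bytes`, and omitted sources are recorded) can be sketched as follows. This is a simplified illustration, not the `context-guard-pack` code. The names `PackResult` and `build_pack` are hypothetical, the section layout is invented for the example, and the sketch skips whole sources that do not fit rather than emitting partial ones as the README says the real tool can.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PackResult:
    body: str = ""
    included: List[str] = field(default_factory=list)
    budget_omitted: List[str] = field(default_factory=list)


def build_pack(sources: List[Tuple[str, str]], budget_bytes: int) -> PackResult:
    """Greedily add (path, text) sources in priority order.

    The budget is checked against encoded UTF-8 bytes, not characters, so
    multi-byte text counts at its real size.
    """
    result = PackResult()
    used = 0
    for path, text in sources:
        section = f"## {path}\n\n{text}\n\n"
        size = len(section.encode("utf-8"))
        if used + size <= budget_bytes:
            result.body += section
            result.included.append(path)
            used += size
        else:
            # Record the omission so a caller can fetch an exact slice later.
            result.budget_omitted.append(path)
    return result


if __name__ == "__main__":
    demo = [("a.py", "x" * 10), ("b.py", "é" * 100), ("c.py", "y" * 5)]
    pack = build_pack(demo, budget_bytes=60)
    print(pack.included, pack.budget_omitted, len(pack.body.encode("utf-8")))
```

Counting encoded bytes rather than characters is the point of the example: the 100-character `b.py` source costs 200 bytes of content and is omitted, while the lower-priority `c.py` still fits.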
272
291
 
273
292
  ### Prune a tool/MCP catalog for a task
274
293
 
@@ -277,10 +296,32 @@ Artifact mode is for capture and retrieval. It stores sanitized output under `.c
277
296
  --catalog tools.json \
278
297
  --query "review failing tests" \
279
298
  --top 5 --budget-bytes 12000 --json
299
+ ./plugins/context-guard/bin/context-guard-tool-prune defer-report \
300
+ --catalog tools.json \
301
+ --query "review failing tests" \
302
+ --core-top 3 --deferred-top 20 --json
280
303
  ./plugins/context-guard/bin/context-guard-tool-prune get <receipt_id> --tool read_file --json
281
304
  ```
282
305
 
283
- `context-guard-tool-prune` ranks a local tool or MCP catalog with deterministic lexical heuristics and emits a bounded top-k advisory report. Inline selected schemas respect an observed UTF-8 byte budget, and omitted or budget-skipped schemas remain recoverable from a compact local receipt plus a separate sanitized payload under `.context-guard/tool-prune`. This is advisory only: it does not mutate MCP configuration, and token counts remain estimated proxies rather than measured provider savings.
306
+ `context-guard-tool-prune` ranks a local tool or MCP catalog with deterministic lexical heuristics and emits a bounded top-k advisory report. Inline selected schemas respect an observed UTF-8 byte budget, and omitted or budget-skipped schemas remain recoverable from a compact local receipt plus a separate sanitized payload under `.context-guard/tool-prune`. `defer-report` uses the same receipt path to split a catalog into core inline tools plus deferred tool stubs and namespace summaries. This is advisory only: it does not mutate MCP configuration, does not configure native provider tool search, and token counts remain estimated proxies rather than measured provider savings.
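A deterministic lexical ranking of this kind might look like the sketch below. The catalog shape (`name` and `description` keys), the token-overlap score, and the name-based tie-break are assumptions for illustration; the shipped helper's scoring is not documented here.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric word set; underscores act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_tools(catalog: list[dict], query: str, top: int) -> list[str]:
    """Return up to `top` tool names ordered by query-word overlap.

    Ties break on the tool name so the same input always yields the same order.
    Tools with no overlapping words are left out.
    """
    query_words = tokenize(query)
    scored = []
    for tool in catalog:
        words = tokenize(tool["name"] + " " + tool.get("description", ""))
        scored.append((len(query_words & words), tool["name"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top] if score > 0]


if __name__ == "__main__":
    catalog = [
        {"name": "read_file", "description": "Read a file"},
        {"name": "run_tests", "description": "Run failing tests and review output"},
        {"name": "deploy", "description": "Ship to prod"},
    ]
    print(rank_tools(catalog, "review failing tests", top=5))
```

Because nothing here depends on a model or network call, rerunning the ranking on the same catalog and query reproduces the same report.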
307
+
308
+ ### Score static prompt cacheability
309
+
310
+ ```bash
311
+ ./plugins/context-guard/bin/context-guard-cache-score --input prompt.json --provider openai --json
312
+ ./plugins/context-guard/bin/context-guard cache-score --input prompt.txt --provider anthropic --json
313
+ ```
314
+
315
+ `context-guard-cache-score` is a local static lint for prompt/request layout. It estimates total and cacheable-prefix size with a tokenizer-free char/4 proxy, warns about dynamic-looking values near the prefix, and records provider caveats for OpenAI, Anthropic, Gemini, or a generic threshold. It does not call providers, store raw prompts, estimate prices, observe cache hits, or prove token/cost savings; verify real cache behavior with provider usage telemetry.
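The idea of a static prefix lint can be sketched in a few lines. The patterns, the `prefix_chars` parameter, and the output keys below are hypothetical; they only illustrate estimating size with a char/4 proxy and flagging dynamic-looking values early in a prompt.

```python
import re

# Assumed examples of "dynamic-looking" values; the real lint may use other rules.
DYNAMIC_PATTERNS = {
    "iso_timestamp": re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"),
    "uuid": re.compile(
        r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", re.I
    ),
}


def lint_prefix(prompt: str, prefix_chars: int) -> dict:
    """Estimate sizes with a char/4 proxy and flag dynamic values in the prefix."""
    prefix = prompt[:prefix_chars]
    warnings = sorted(
        name for name, pattern in DYNAMIC_PATTERNS.items() if pattern.search(prefix)
    )
    return {
        "total_tokens_est": len(prompt) // 4,
        "prefix_tokens_est": len(prefix) // 4,
        "dynamic_in_prefix": warnings,
    }


if __name__ == "__main__":
    stable = "You are a helper. " * 10
    print(lint_prefix("Now: 2024-05-01T12:00:00Z\n" + stable, prefix_chars=40))
    print(lint_prefix(stable + "2024-05-01T12:00:00Z", prefix_chars=40))
```

Moving the timestamp from the front of the prompt to the end clears the warning in this sketch, which mirrors the usual advice to keep volatile values out of the cacheable prefix. It says nothing about whether a provider actually served a cache hit.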
316
+
317
+ ### Advise on total cost, batchability, and routing
318
+
319
+ ```bash
320
+ ./plugins/context-guard/bin/context-guard route-advisor --workload workload.json --json
321
+ ./plugins/context-guard/bin/context-guard-cost route-advisor --feature batch_api=true --feature structured_outputs=true --json < workload.json
322
+ ```
323
+
324
+ `context-guard route-advisor` is a local, passive advisor. It reads caller-supplied workload JSON, provider feature declarations, usage telemetry, and shifted external/local costs, then emits total-cost accounting, batchability blockers, and candidate routes such as batch API, prompt-cache prefix preservation, structured outputs, or cheaper-model evaluation. It does not start a queue, call providers, refresh pricing docs, or treat bundled provider feature knowledge as authoritative; unknown or caller-supplied features are marked recheck-required. Treat recommendations as candidates only. Hosted token or cost savings claims require matched successful tasks, non-inferior quality, and shifted-cost evidence.
284
325
 
285
326
  ### Compress selected local text conservatively
286
327
 
@@ -288,12 +329,15 @@ Artifact mode is for capture and retrieval. It stores sanitized output under `.c
288
329
  git diff | ./plugins/context-guard/bin/context-guard-compress --json
289
330
  pytest -q 2>&1 | ./plugins/context-guard/bin/context-guard-compress --type log
290
331
  cat evidence.txt | ./plugins/context-guard/bin/context-guard-compress --json --protected-policy
332
+ cat sanitized-prose.txt | ./plugins/context-guard/bin/context-guard-compress --json --type prose --mode readable
291
333
  ```
292
334
 
293
335
  `context-guard-compress` classifies sanitized stdin as JSON, diff, log, search output, code, or prose, then applies deterministic reductions such as JSON compaction, diff context folding, duplicate log/search line collapse, and whitespace normalization. It never claims observed model-token savings; byte counts are observed, token counts are labeled as estimates, and lossy receipts point you back to `context-guard-artifact store` for exact retrieval.
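One of the reductions named above, duplicate log line collapse, can be illustrated with a short sketch. The function name and the `[xN]` suffix format are assumptions; the compressor's real output format may differ.

```python
from itertools import groupby


def collapse_duplicates(lines: list[str]) -> list[str]:
    """Collapse runs of identical consecutive lines into one line with a count.

    Only adjacent repeats are merged, so the order of distinct lines is kept.
    """
    collapsed = []
    for line, run in groupby(lines):
        count = sum(1 for _ in run)
        collapsed.append(line if count == 1 else f"{line}  [x{count}]")
    return collapsed


if __name__ == "__main__":
    log = ["retrying...", "retrying...", "retrying...", "connected", "retrying..."]
    for line in collapse_duplicates(log):
        print(line)
```

The transform is lossy only in that repeated lines are summarized by a count, which is why exact retrieval from a stored artifact remains the fallback for the original text.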
294
336
 
295
337
  Add `--protected-policy` when the input may contain semantic-sensitive zones such as code fences, diffs, identifiers, numeric constants, hashes, paths, stack frames, quoted strings, or JSON keys. The flag does not change default compressor behavior; it adds `protected_zone_policy` and `transform_policy` metadata that denies semantic/paraphrase rewrites, allows only structural transforms plus artifact retrieval, and stores only class/count policy metadata rather than raw protected spans.
296
338
 
339
+ Add `--mode readable` only for sanitized prose previews. It uses a deterministic sentence window, blocks prompt-like or high-risk protected signals, stores no raw protected spans, and marks exact fallback retrieval as required before edits or claims. It does not run learned compressors, models, embeddings, or rerankers.
340
+
297
341
  ### Trim or summarize command output
298
342
 
299
343
  ```bash
@@ -339,7 +383,15 @@ JSON
339
383
  ./plugins/context-guard/bin/context-guard-audit ~/.claude/projects --top 20 --recommend
340
384
  ```
341
385
 
342
- The audit command skips oversized transcript files and JSONL records by default (`--max-file-bytes`, `--max-line-bytes`) and reports skipped counts, so a corrupt trace cannot dominate memory or hide scan gaps. JSON output also includes `cache_friendliness` and [`cache_diagnostics`](docs/cache-diagnostics-schema.md): heuristic prompt-layout/cache-read diagnostics built from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes. The sibling `cache_layout_advice` field turns those signals into ranked **checks/experiments** such as splitting long sessions or stabilizing early prompt prefixes, while keeping observed issues separate from hypothesized or corroborated causes. `--feasibility-json` also includes a [`mac_visibility`](docs/mac-visibility-feasibility-schema.md) contract that local macOS-visible consumers can bind against; only stable top-level fields are designated binding targets, and `summary` is not a primary UI binding source. These fields can flag likely volatile content near the prompt prefix, stable-prefix candidates, cache-miss hypotheses, and TTL/headroom evidence gaps, but they do not print raw prompt text, do not prove provider cache hits, and may be `missing`, `partial`, `hypothesis`, or `unavailable` when transcript schemas do not expose enough evidence.
386
+ The audit command skips oversized transcript files and JSONL records by default (`--max-file-bytes`, `--max-line-bytes`) and reports skipped counts. That keeps a corrupt trace from dominating memory or hiding scan gaps.
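A bounded scan with explicit skip accounting could be sketched like this. The function name, return shape, and counter names are hypothetical and only illustrate the skip-and-report behavior described above.

```python
import json
import os


def scan_jsonl(path: str, max_file_bytes: int, max_line_bytes: int):
    """Parse a JSONL file while skipping oversized files and records.

    Returns (records, stats). Skips are counted rather than silently dropped,
    so a reader can see where the scan has gaps.
    """
    stats = {"records": 0, "skipped_files": 0, "skipped_lines": 0, "invalid_lines": 0}
    if os.path.getsize(path) > max_file_bytes:
        stats["skipped_files"] = 1
        return [], stats
    records = []
    with open(path, "rb") as handle:
        # The file-size cap above also bounds how large any single line can be.
        for raw in handle:
            if not raw.strip():
                continue
            if len(raw) > max_line_bytes:
                stats["skipped_lines"] += 1
                continue
            try:
                records.append(json.loads(raw))
            except ValueError:  # covers malformed JSON and invalid UTF-8
                stats["invalid_lines"] += 1
    stats["records"] = len(records)
    return records, stats
```

Reporting `skipped_lines` and `invalid_lines` separately keeps a corrupt record distinguishable from one that was merely too large to read.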
387
+
388
+ JSON output can include several evidence surfaces:
389
+
390
+ - `cache_friendliness` and [`cache_diagnostics`](docs/cache-diagnostics-schema.md): heuristic prompt-layout/cache-read diagnostics built from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes.
391
+ - `cache_layout_advice`: ranked **checks/experiments** such as splitting long sessions or stabilizing early prompt prefixes, with observed issues kept separate from hypothesized or corroborated causes.
392
+ - `--feasibility-json` / [`mac_visibility`](docs/mac-visibility-feasibility-schema.md): a contract for local macOS-visible consumers. Only stable top-level fields are binding targets; `summary` is not a primary UI binding source.
393
+
394
+ These fields can flag likely volatile content near the prompt prefix, stable-prefix candidates, cache-miss hypotheses, and TTL/headroom evidence gaps. They do not print raw prompt text, do not prove provider cache hits, and may be `missing`, `partial`, `hypothesis`, or `unavailable` when transcript schemas do not expose enough evidence.
343
395
 
344
396
  ### Watch context and cache health in the statusline
345
397
 
@@ -377,39 +429,59 @@ Experimental lanes are **default off**. The registry records project-local inten
377
429
  context-guard experiments list
378
430
  context-guard experiments status --json
379
431
  context-guard experiments plan context-diff-compaction --json < change.diff
432
+ context-guard experiments emit context-diff-compaction --receipt-id <artifact-id> --reexpand-command "context-guard-artifact get <artifact-id> --full" --replacement-file compact-diff.txt --json < change.diff
380
433
  context-guard experiments plan visual-crop-ocr --json --full-evidence-receipt <id> --crop-label <label> --crop-bounds 0,0,100,100 --image-size 800,600 --missed-context-note "outside crop omitted"
434
+ context-guard experiments emit visual-crop-ocr --json --full-evidence-receipt <id> --crop-label <label> --crop-bounds 0,0,100,100 --image-size 800,600 --ocr-text "visible text" --ocr-confidence 0.9 --ocr-error-note "glyph may be uncertain" --missed-context-note "outside crop omitted"
381
435
  context-guard experiments plan learned-compression --json --sanitized --trusted-source --exact-fallback-receipt <id> --reexpand-command "context-guard-artifact get <id> --full" < sanitized-prose.txt
436
+ context-guard experiments emit learned-compression --json --sanitized --trusted-source --exact-fallback-receipt <id> --reexpand-command "context-guard-artifact get <id> --full" --replacement-file compact-prose.txt < sanitized-prose.txt
382
437
  context-guard experiments plan self-hosted-metrics-ledger --json --latency-ms 123.5 --peak-memory-mb 2048 --quality-score 0.98
438
+ context-guard experiments record self-hosted-metrics-ledger --ledger-jsonl .context-guard/self-hosted-metrics.jsonl --latency-ms 123.5 --peak-memory-mb 2048 --quality-score 0.98 --json
383
439
  context-guard experiments plan local-proxy --json --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack
440
+ context-guard experiments plan local-proxy-external-forwarding --external-forwarding-intent --external-forwarding-design-ack --allow-host api.example.com --allow-scheme https --credential-redaction-policy strip-sensitive-headers --provider-evidence-boundary diagnostic-only-provider-measured-required --threat-model-note "Only user-owned HTTPS endpoint; sensitive headers are stripped before any future forwarding." --json
441
+ context-guard experiments record local-proxy-runtime-gate --ledger-jsonl .context-guard/local-proxy-gates.jsonl --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack --json
442
+ context-guard experiments serve local-proxy --bind-host 127.0.0.1 --bind-port 18080 --target-host 127.0.0.1 --target-port 18081 --runtime-gate-ack --forwarding-gate-ack --once --ready-file .context-guard/local-proxy-ready.json --diagnostic-ledger-jsonl .context-guard/local-proxy-diagnostics.jsonl --json
384
443
  context-guard experiments enable output-receipt-trim --root .
385
444
  context-guard experiments disable output-receipt-trim --root .
386
445
  ```
387
446
 
388
- The `--runtime-gate-ack` example still produces advisory metadata only; it does not enable forwarding.
447
+ The local-proxy examples are intentionally split by side effect:
448
+
449
+ - `plan local-proxy` produces advisory metadata only; it does not enable forwarding.
450
+ - `record local-proxy-runtime-gate` appends one localhost-only gate row and still starts no listener, forwards no traffic, persists no API keys, and makes no hosted-savings claim.
451
+ - `serve local-proxy` is the separate MVP. It requires both runtime and forwarding acknowledgements plus `--once` and a private `--ready-file` nonce handoff for the forwarding client. It binds only to a literal loopback IP and forwards only to a literal loopback IP target (no hostname or DNS targets), blocks credential-bearing requests, and enforces byte/time limits. It does not persist API keys and does not support external forwarding, CONNECT/TLS proxying, or hosted-savings claims.
452
+ - With `--diagnostic-ledger-jsonl`, `serve` appends one shifted-cost diagnostic row only after a successful forwarded request. The row stores hashes/metadata rather than raw headers, request bodies, response bodies, or hosted-savings evidence.
453
+ - `plan local-proxy-external-forwarding` is a dry-run design gate only. It requires explicit external intent, design acknowledgement, HTTPS host allowlist, threat model notes, credential redaction policy, and provider-evidence boundary, but starts no listener, performs no DNS lookup, calls no external service, forwards no traffic, persists no credentials, and does not ship an external proxy forwarding runtime.
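The "literal loopback IP" requirement in the list above can be illustrated with the standard-library `ipaddress` module. This is a sketch under assumptions: the function names are hypothetical, and the shipped gate may apply additional checks.

```python
import ipaddress


def is_literal_loopback(host: str) -> bool:
    """True only for a literal loopback IP such as 127.0.0.1 or ::1.

    Hostnames, including "localhost", are rejected because resolving them
    would require a lookup; only literal addresses are parsed.
    """
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def endpoint_allowed(host: str, port: int) -> bool:
    """Accept a bind or target endpoint only if it is loopback with a nonzero port."""
    return is_literal_loopback(host) and 0 < port < 65536


if __name__ == "__main__":
    for host, port in [("127.0.0.1", 18080), ("localhost", 18080), ("10.0.0.1", 18081)]:
        print(host, port, endpoint_allowed(host, port))
```

Parsing the value as an address instead of resolving it means a name like `localhost` fails the check without any DNS or hosts-file lookup.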
389
454
 
390
455
  By default, project settings are stored in `.context-guard/experiments.json`. Use `--config <path>` only for an explicit project-local override. Experiment metadata includes risk level, gate requirements, explicit command/flag surfaces, and claim boundaries so hosted API token/cost savings are not claimed without provider-measured matched-task evidence. `experiments enable` records intent only; it does not run helpers, remove the need for their explicit flags, or permit replacing content without exact receipt/re-expand evidence.
391
456
 
392
- Shipped experimental dry-run checker/planner surfaces are intentionally narrow:
457
+ Shipped experimental checker/planner surfaces, plus explicit local context-diff, visual evidence, learned-candidate, metrics, and proxy-gate record runtimes, are intentionally narrow:
393
458
 
394
- | Planner/checker | What it emits | Hard boundary |
459
+ | Planner/checker/runtime | What it emits | Hard boundary |
395
460
  | --- | --- | --- |
396
- | `context-diff-compaction` | Reviewable compaction advice for diffs. | Does not emit replacement text; exact receipt/re-expand handles are recorded only for human or future gated review. |
397
- | `visual-crop-ocr` | Metadata for full visual evidence, crop bounds, OCR notes, confidence, and missed context. | No screenshot capture, crop/OCR runtime, image parsing, or external OCR/image service. |
398
- | `learned-compression` | Deny-by-default policy checks for sanitized trusted prose with exact fallback receipts. | No embeddings, rerankers, model calls, learned/synthetic compressor runtime, or replacement generation. |
399
- | `self-hosted-metrics-ledger` | Read-only ledger-compatible previews for local/model-server latency, memory, quality, energy, throughput, and local-cost metrics. | Does not write a ledger and does not support hosted API token/cost savings claims. |
400
- | `local-proxy` | Localhost-only advisory metadata for a possible future local proxy. | Starts no listener, forwards no traffic, persists no API keys, writes no ledger, blocks non-local bind/target/upstream values, and requires a separate future runtime gate before any forwarding implementation. |
461
+ | `context-diff-compaction` | Dry-run diff advice plus an explicit `emit ... --receipt-id ... --reexpand-command ...` runtime for caller-supplied compact replacements. | `plan` emits no replacement. `emit` requires reviewable hunks, exact local artifact re-expand metadata whose stored content matches the input diff, and a smaller caller-supplied replacement; ContextGuard does not generate semantic compression or support hosted token/cost savings claims. |
462
+ | `visual-crop-ocr` | Dry-run visual evidence advice plus an explicit `emit visual-crop-ocr` runtime for caller-supplied evidence packs. | `emit` requires a full visual evidence receipt, missed-context note, and complete user-supplied crop and/or OCR evidence; ContextGuard does not capture screenshots, crop images, run OCR, parse images, call external services, write files, or support hosted token/cost savings claims. |
463
+ | `learned-compression` | Deny-by-default policy checks plus an explicit `emit learned-compression` runtime for caller-supplied compact prose candidates with verified exact fallback content. | `emit` requires sanitized trusted prose, protected-signal denial, a verified local fallback artifact matching the input, and a smaller caller-supplied prose candidate; ContextGuard does not run compressors, embeddings, rerankers, or subprocesses, does not call models or external services, does not generate replacement text, and does not support hosted savings claims. |
464
+ | `self-hosted-metrics-ledger` | Dry-run preview plus an explicit `record ... --ledger-jsonl` runtime for local/model-server latency, memory, quality, energy, throughput, and local-cost metrics. | The dry-run preview does not write a ledger; the explicit record command writes only local JSONL sidecars and still does not support hosted API token/cost savings claims. |
465
+ | `local-proxy` | Localhost-only advisory metadata, design-only `plan local-proxy-external-forwarding` review for future external forwarding, an explicit `record local-proxy-runtime-gate --ledger-jsonl` runtime for one local gate row, an explicit one-shot `serve local-proxy` loopback forwarding MVP, and optional `--diagnostic-ledger-jsonl` shifted-cost diagnostics for successful forwarded requests. | `plan` writes no ledger. `record` writes only after localhost-only metadata and `--runtime-gate-ack`; it starts no listener, forwards no traffic, and performs no DNS lookup. `serve` additionally requires `--forwarding-gate-ack --once`, a private `--ready-file` nonce handoff, literal loopback bind/target IPs, nonzero ports, bounded bytes/timeouts, and credential-free requests; it performs no external forwarding, no CONNECT/TLS proxying, no API-key persistence, and no hosted-savings claim. `--diagnostic-ledger-jsonl` writes only successful-forward diagnostics with no raw headers/bodies and no hosted-savings claim. `plan local-proxy-external-forwarding` emits threat-model/allowlist/redaction/provider-evidence design metadata only and still performs no DNS lookup, external service call, traffic forwarding, credential persistence, or hosted-savings claim. |
401
466
 
402
467
  ## What is not yet shipped
403
468
 
404
- These are directions the project has noted, not committed features. Nothing here ships unless documented elsewhere in the repository.
469
+ These are directions the project has tracked, not committed features. Nothing here ships unless documented elsewhere in the repository.
405
470
 
406
- - Learned/synthetic compressor runtime, multimodal crop/OCR or visual-token pruning runtime, self-hosted KV/latent inference optimization, and actual proxy forwarding runtime. See the [experimental token-reduction radar](research/experimental-token-reduction-radar.md) and [fixture-only experimental benchmark starters](docs/experimental-benchmark-fixtures.md); those lanes remain experimental/non-shipped under the later-roadmap gate until matched successful tasks, failure-rate guardrails, human-correction tracking, shifted-cost accounting, provider-measured token/cost evidence, and separate future PR gates justify any hosted API savings claim or runtime feature claim.
471
+ ContextGuard does not yet ship:
472
+
473
+ - learned/synthetic compressor execution or generated replacement text beyond the caller-supplied learned candidate emitter
474
+ - generated crop/OCR or visual-token pruning runtime beyond the caller-supplied visual evidence-pack emitter
475
+ - self-hosted KV/latent optimization beyond explicit local metrics recording
476
+ - external, daemon, or credential-bearing proxy forwarding beyond the one-shot literal-loopback local proxy MVP
477
+
478
+ See the [experimental token-reduction radar](research/experimental-token-reduction-radar.md) and [fixture-only experimental benchmark starters](docs/experimental-benchmark-fixtures.md). Those lanes remain experimental/non-shipped under the later-roadmap gate until matched successful tasks, failure-rate guardrails, human-correction tracking, shifted-cost accounting, provider-measured token/cost evidence, and separate future PR gates justify any hosted API savings claim or broader runtime feature claim.
407
479
 
408
480
  ## Repository layout
409
481
 
410
482
  - `.claude-plugin/marketplace.json` — Claude Code marketplace manifest.
411
483
  - `plugins/context-guard/` — installable Claude Code plugin package.
412
- - `context-guard-kit/` — underlying Python/Bash helper tools.
484
+ - `context-guard-kit/` — checkout-local Python/Bash helper sources. npm packages ship synchronized `plugins/context-guard/bin` and `plugins/context-guard/lib` copies instead of duplicating this source tree.
413
485
  - `docs/index.html` — static landing page for the project.
414
486
  - `tests/` — regression tests for helper behavior.
415
487
 
@@ -443,6 +515,8 @@ export PATH="$PWD/plugins/context-guard/bin:$PATH"
443
515
  context-guard-setup --plan
444
516
  ```
445
517
 
518
+ Do not rely on `PATH` lookup for generated hooks by default. The setup wizard records explicit bundled or checkout-local helper paths; `--allow-path-helper-fallback` is only for trusted external installs and validates the resolved helper before writing commands.
519
+
446
520
  ## Release checks
447
521
 
448
522
  Before publishing or merging release-sensitive changes, run the copy check and both gates:
@@ -453,7 +527,7 @@ python3 scripts/prepublish_check.py
453
527
  python3 scripts/release_smoke.py
454
528
  ```
455
529
 
456
- When a helper under `context-guard-kit/` changes, run `python3 scripts/sync_plugin_copies.py --write` before the gates. `sync_plugin_copies.py --check` verifies the exact-copy contract up front; `prepublish_check.py` verifies package invariants, synchronized plugin binaries, manifests, diagnostic redaction, and the regression suite. `release_smoke.py` executes representative packaged entrypoints from `plugins/context-guard/bin` in a temporary project so broken CLI wiring is caught before publish. See [docs/release-runbook.md](docs/release-runbook.md) for the full release workflow, evidence checklist, quad-review requirement, and rollback checklist.
530
+ When a helper under `context-guard-kit/` changes, run `python3 scripts/sync_plugin_copies.py --write` before the gates. `sync_plugin_copies.py --check` verifies the maintainer-facing exact-copy contract up front. npm packages intentionally ship only the synchronized plugin-local `plugins/context-guard/bin` entrypoints and `plugins/context-guard/lib` helpers to avoid duplicate implementation payloads. `prepublish_check.py` verifies package invariants, synchronized plugin binaries, manifests, diagnostic redaction, and the regression suite. `release_smoke.py` executes representative packaged entrypoints from `plugins/context-guard/bin` in a temporary project so broken CLI wiring is caught before publish. See [docs/release-runbook.md](docs/release-runbook.md) for the full release workflow, evidence checklist, quad-review requirement, and rollback checklist.
457
531
 
458
532
  Versioned release notes live in [CHANGELOG.md](CHANGELOG.md); the prepublish gate requires an entry matching the plugin manifest version before publishing.
459
533
 
@@ -0,0 +1,7 @@
1
+ # Fixture-only baseline full context prompt
2
+
3
+ This synthetic baseline prompt represents the unoptimized full-context side of a future matched benchmark. It is dry-run-only and not a hosted API token/cost savings claim. A real run must replace this prose with sanitized project evidence, a real `success_command`, provider-measured primary token/cost fields, matched successful tasks, a 10-percentage-point failure-rate guardrail, human corrections, and shifted-cost accounting.
4
+
5
+ Include exact protected evidence such as paths, IDs, stack frames, JSON keys, and artifact receipt fallback handles without semantic rewriting.
6
+
7
+ Metric checklist for real replacements: tokens_per_successful_task, total_cost_with_shift_usd, external_cost_usd, matched_successful_task evidence, 10-percentage-point failure-rate guardrail, and proxy-byte caveat.
@@ -0,0 +1,7 @@
1
+ # Fixture-only ContextGuard advisory-foundations prompt
2
+
3
+ This synthetic candidate prompt represents a future ContextGuard-assisted side using cache layout lint, core-vs-deferred tool schemas, artifact receipts, and claim-safe telemetry. It is dry-run-only and not a hosted API token/cost savings claim. A real run must preserve exact retrieval or re-expand paths for omitted context and compare only provider-measured matched successful tasks.
4
+
5
+ Byte reductions, cache-hit predictions, local latency, char/4 token proxies, and receipt counts are diagnostic only until primary provider token/cost telemetry and shifted external work are measured.
6
+
7
+ Metric checklist for real replacements: tokens_per_successful_task, total_cost_with_shift_usd, external_cost_usd, matched_successful_task evidence, 10-percentage-point failure-rate guardrail, and proxy-byte caveat.
@@ -0,0 +1,182 @@
1
+ [
2
+ {
3
+ "id": "token_savings_01_bugfix",
4
+ "prompt": "Fixture-only synthetic token-savings roadmap task (bugfix). Fix a null-check regression in a sanitized request parser while preserving exact stack-frame evidence. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
5
+ "model": "sonnet",
6
+ "effort": "medium",
7
+ "max_turns": 3,
8
+ "max_budget_usd": 1.0,
9
+ "allowed_tools": [],
10
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
11
+ "success_cwd": ".",
12
+ "variant_prompt_files": {
13
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
14
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
15
+ }
16
+ },
17
+ {
18
+ "id": "token_savings_02_exploration",
19
+ "prompt": "Fixture-only synthetic token-savings roadmap task (exploration). Explore a small sanitized repository and identify the next file to inspect without loading unrelated logs. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
20
+ "model": "sonnet",
21
+ "effort": "medium",
22
+ "max_turns": 3,
23
+ "max_budget_usd": 1.0,
24
+ "allowed_tools": [],
25
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
26
+ "success_cwd": ".",
27
+ "variant_prompt_files": {
28
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
29
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
30
+ }
31
+ },
32
+ {
33
+ "id": "token_savings_03_code_review",
34
+ "prompt": "Fixture-only synthetic token-savings roadmap task (code_review). Review a focused diff and identify one correctness risk plus one test gap. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
35
+ "model": "sonnet",
36
+ "effort": "medium",
37
+ "max_turns": 3,
38
+ "max_budget_usd": 1.0,
39
+ "allowed_tools": [],
40
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
41
+ "success_cwd": ".",
42
+ "variant_prompt_files": {
43
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
44
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
45
+ }
46
+ },
47
+ {
48
+ "id": "token_savings_04_long_log_analysis",
49
+ "prompt": "Fixture-only synthetic token-savings roadmap task (long_log_analysis). Analyze a long sanitized CI log and cite the failing command, preserving artifact receipt fallback. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
50
+ "model": "sonnet",
51
+ "effort": "medium",
52
+ "max_turns": 3,
53
+ "max_budget_usd": 1.0,
54
+ "allowed_tools": [],
55
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
56
+ "success_cwd": ".",
57
+ "variant_prompt_files": {
58
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
59
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
60
+ }
61
+ },
62
+ {
63
+ "id": "token_savings_05_migration",
64
+ "prompt": "Fixture-only synthetic token-savings roadmap task (migration). Plan a safe migration of a deprecated CLI flag to a new option while keeping backwards compatibility. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
65
+ "model": "sonnet",
66
+ "effort": "medium",
67
+ "max_turns": 3,
68
+ "max_budget_usd": 1.0,
69
+ "allowed_tools": [],
70
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
71
+ "success_cwd": ".",
72
+ "variant_prompt_files": {
73
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
74
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
75
+ }
76
+ },
77
+ {
78
+ "id": "token_savings_06_docs",
79
+ "prompt": "Fixture-only synthetic token-savings roadmap task (docs). Update user-facing docs to clarify provider-measured matched successful task requirements. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
80
+ "model": "sonnet",
81
+ "effort": "medium",
82
+ "max_turns": 3,
83
+ "max_budget_usd": 1.0,
84
+ "allowed_tools": [],
85
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
86
+ "success_cwd": ".",
87
+ "variant_prompt_files": {
88
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
89
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
90
+ }
91
+ },
92
+ {
93
+ "id": "token_savings_07_refactor",
94
+ "prompt": "Fixture-only synthetic token-savings roadmap task (refactor). Refactor duplicated helper parsing into a shared function without changing public output schema. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
95
+ "model": "sonnet",
96
+ "effort": "medium",
97
+ "max_turns": 3,
98
+ "max_budget_usd": 1.0,
99
+ "allowed_tools": [],
100
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
101
+ "success_cwd": ".",
102
+ "variant_prompt_files": {
103
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
104
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
105
+ }
106
+ },
+ {
+ "id": "token_savings_08_performance",
+ "prompt": "Fixture-only synthetic token-savings roadmap task (performance). Find a deterministic hot path in a local-only helper and propose a bounded optimization. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
+ "model": "sonnet",
+ "effort": "medium",
+ "max_turns": 3,
+ "max_budget_usd": 1.0,
+ "allowed_tools": [],
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
+ "success_cwd": ".",
+ "variant_prompt_files": {
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
+ }
+ },
+ {
+ "id": "token_savings_09_telemetry",
+ "prompt": "Fixture-only synthetic token-savings roadmap task (telemetry). Add claim-safe telemetry fields for shifted local work without hosted cost-savings claims. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
+ "model": "sonnet",
+ "effort": "medium",
+ "max_turns": 3,
+ "max_budget_usd": 1.0,
+ "allowed_tools": [],
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
+ "success_cwd": ".",
+ "variant_prompt_files": {
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
+ }
+ },
+ {
+ "id": "token_savings_10_cache_layout",
+ "prompt": "Fixture-only synthetic token-savings roadmap task (cache_layout). Inspect a prompt layout and identify stable prefix versus dynamic suffix placement. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
+ "model": "sonnet",
+ "effort": "medium",
+ "max_turns": 3,
+ "max_budget_usd": 1.0,
+ "allowed_tools": [],
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
+ "success_cwd": ".",
+ "variant_prompt_files": {
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
+ }
+ },
+ {
+ "id": "token_savings_11_tool_schema",
+ "prompt": "Fixture-only synthetic token-savings roadmap task (tool_schema). Select a small core tool set from a sanitized MCP catalog and defer the rest by receipt. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
+ "model": "sonnet",
+ "effort": "medium",
+ "max_turns": 3,
+ "max_budget_usd": 1.0,
+ "allowed_tools": [],
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
+ "success_cwd": ".",
+ "variant_prompt_files": {
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
+ }
+ },
+ {
+ "id": "token_savings_12_artifact_receipt",
+ "prompt": "Fixture-only synthetic token-savings roadmap task (artifact_receipt). Verify that a digest plus receipt can re-expand omitted sanitized output exactly when needed. This validates benchmark shape only; real claims require provider-measured tokens/costs for matched successful tasks, failure-rate guardrail, human corrections, and shifted-cost accounting.",
+ "model": "sonnet",
+ "effort": "medium",
+ "max_turns": 3,
+ "max_budget_usd": 1.0,
+ "allowed_tools": [],
+ "success_command": "python3 -c \"raise SystemExit('fixture-only placeholder: replace success_command before real benchmark runs')\"",
+ "success_cwd": ".",
+ "variant_prompt_files": {
+ "baseline_full_context_fixture": "token-savings-12task-baseline.prompt.example.md",
+ "fixture_only_contextguard_advisory_foundations": "token-savings-12task-contextguard.prompt.example.md"
+ }
+ }
+ ]
@@ -0,0 +1,10 @@
+ [
+ {
+ "name": "baseline_full_context_fixture",
+ "extra_args": []
+ },
+ {
+ "name": "fixture_only_contextguard_advisory_foundations",
+ "extra_args": []
+ }
+ ]