npm - @ictechgy/context-guard - Versions diffs - 0.4.10 → 0.4.11 - Mend

@ictechgy/context-guard 0.4.10 → 0.4.11


Files changed (27)

package/CHANGELOG.md +13 -1
package/README.ko.md +32 -21
package/README.md +38 -29
package/docs/benchmark-fixtures/token-savings-12task.evidence.example.jsonl +24 -0
package/docs/benchmark-workflow-examples.md +3 -0
package/docs/benchmark-workflows/context-pack-byte-proxy.example.json +278 -137
package/docs/benchmark-workflows/measured-token-workflow.example.json +279 -138
package/docs/benchmark-workflows/provider-cache-telemetry.example.json +279 -138
package/docs/experimental-benchmark-fixtures.md +24 -7
package/package.json +2 -1
package/plugins/context-guard/.claude-plugin/plugin.json +1 -1
package/plugins/context-guard/README.ko.md +14 -11
package/plugins/context-guard/README.md +15 -14
package/plugins/context-guard/bin/context-guard +46 -11
package/plugins/context-guard/bin/context-guard-artifact +342 -33
package/plugins/context-guard/bin/context-guard-audit +33 -2
package/plugins/context-guard/bin/context-guard-bench +1542 -31
package/plugins/context-guard/bin/context-guard-cache-score +318 -33
package/plugins/context-guard/bin/context-guard-cost +7 -2
package/plugins/context-guard/bin/context-guard-experiments +364 -8
package/plugins/context-guard/bin/context-guard-failed-nudge +6 -2
package/plugins/context-guard/bin/context-guard-pack +301 -17
package/plugins/context-guard/bin/context-guard-sanitize-output +76 -12
package/plugins/context-guard/bin/context-guard-tool-prune +241 -54
package/plugins/context-guard/bin/context-guard-trim-output +288 -41
package/plugins/context-guard/brief/README.md +5 -5
package/plugins/context-guard/lib/context_guard_commands.py +214 -190

package/CHANGELOG.md CHANGED

@@ -4,10 +4,22 @@ All notable changes for the ContextGuard plugin are documented here.
 ## [Unreleased]
+## [0.4.11] - 2026-06-21
+- Hardened token-savings advisory surfaces with cache-score amortization risk accounting, tool-prune deferred-schema proxy accounting, and benchmark measurement-baseline contracts while preserving claim-safe boundaries.
+- Added benchmark evidence replay dashboards, default matrix reporting, and public claim readiness gates so public savings claims remain blocked unless matched successful tasks, provider-measured tokens/cost, quality non-inferiority, shifted-cost accounting, confidence notes, and complete provider-export provenance all pass.
+- Added output artifact sandbox receipts and local artifact search with stable `contextguard-artifact:<id>` handles, compact summaries, exact rehydration commands, custom-dir path redaction, and no hosted savings claims.
+- Added local-proxy response sandbox envelopes for safe UTF-8 loopback responses, plus docs and safety tests that keep proxy behavior one-shot, literal-loopback, credential-free, and non-claimable for hosted token/cost savings.
+- Productionized adaptive context packing as explicit `--adaptive-k` policies and `--symbol-memory` source-verification metadata without automatically changing manifests, packs, receipts, or provider-savings claims.
+- Hardened large-input processing bounds, private helper IO, adjacent helper loading, release smoke execution, and symlink/no-follow handling for artifact, tool-prune, benchmark, setup, and related helper paths.
+- Constrained release and runtime command manifests to literal-only data, kept legacy `claude-*` wrappers packaged but out of npm `.bin` aliases, and locked npm/package smoke checks to canonical `context-guard`/`context-guard-*` entrypoints.
+- Constrained macOS visibility helper discovery to bundled/resource/executable-relative paths or absolute explicit overrides, removed launch-CWD trust, rejected relative overrides, and launched the helper with a minimal allowlisted child environment.
+- Polished README, Korean README, and GitHub Pages copy after Claude review so setup, packaging, helper trust, and conservative savings-claim boundaries match the shipped product.
 ## [0.4.10] - 2026-06-14
 - Added `context-guard-artifact search`, a local sanitized artifact sandbox search that returns capped literal matches with exact `get --lines` rehydration commands and no hosted savings claims.
-- Added `context-guard-pack suggest/auto --adaptive-k`, an opt-in local shrink/expand top-k advisory that reports score-distribution, byte-budget fit, and score-mass recall/precision proxies without changing manifests, packs, receipts, or claiming provider-token savings.
+- Added `context-guard-pack suggest/auto --adaptive-k` selectable local policies (`--adaptive-k-policy balanced|recall|precision`), metadata-only recall/precision proxy gates, capped selected/omitted evidence, and structured source-verification hints without changing manifests, packs, receipts, or claiming provider-token savings.
 - Added `context-guard-pack auto --symbol-memory`, an opt-in repo-map-derived symbol/graph advisory with exact source verification hints that does not change manifests, packs, receipts, or provider-savings claims.
 - Added `context-guard-compress --mode readable`, an opt-in sanitized-prose readable preview mode with high-risk protected/prompt-like signal blocking, exact fallback guidance, and no learned compressor/model/embedding/reranker execution.
 - Added `context-guard-cache-score`, a static local prompt cacheability lint with char/4 proxy labeling, provider caveats, dynamic-prefix warnings, and no provider calls, ledger writes, or savings claims.
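
The helper-discovery hardening listed in the 0.4.11 entry (absolute explicit overrides only, relative overrides rejected, a minimal allowlisted child environment) might look roughly like the sketch below. The function names, the symlink check, and the allowlist contents are assumptions for illustration; none of them are taken from the package's source.

```python
import os

# Hypothetical allowlist; the variables the real launcher passes through are not listed in this diff.
ALLOWED_ENV_KEYS = ("HOME", "LANG", "TMPDIR")


def validate_helper_override(path: str) -> str:
    """Accept only an absolute, non-symlink, existing file path (illustrative policy)."""
    if not os.path.isabs(path):
        raise ValueError("relative override rejected")
    if os.path.islink(path):
        raise ValueError("symlinked helper rejected")
    if not os.path.isfile(path):
        raise ValueError("override does not point to a file")
    return path


def minimal_child_env(parent_env: dict) -> dict:
    """Copy only allowlisted variables instead of inheriting the whole parent environment."""
    return {key: parent_env[key] for key in ALLOWED_ENV_KEYS if key in parent_env}
```

The point of the sketch is that nothing depends on the launch working directory: a relative path fails before any filesystem lookup happens.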

package/README.ko.md CHANGED

@@ -1,6 +1,6 @@
 # ContextGuard
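-ContextGuard is a local-first collection of context-management tools for AI coding and tool agents. It ships first as a Claude Code plugin; install it once, then explicitly enable it per project and revert it if needed. Output trimming, symbol-level reads, repeated-failure nudges, sensitive-pattern redaction, and usage-measurement guardrails can also be reused from other agents through local helper commands and brief-mode guidance rule snippets.
+ContextGuard is a local-first collection of context-management tools for AI coding and tool agents. You can start with it as a Claude Code plugin; install it once, then explicitly enable it per project and revert it if needed. Output trimming, symbol-level reads, repeated-failure nudges, sensitive-pattern redaction, and usage-measurement guardrails can also be reused from other agents through local helper commands and brief-mode guidance rule snippets.
 - English documentation: [`README.md`](README.md)
 - HTML landing page: [GitHub Pages](https://ictechgy.github.io/context-guard/) ([source](docs/index.html))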
@@ -27,13 +27,15 @@ context-guard setup --agent claude --scope user --verify --json  # 읽기 전용
 context-guard setup --agent claude --scope user --plan
 ```
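-The default is project-scoped setup. User-scoped setup must be chosen explicitly, and applying real changes requires `--yes` and an explicit `--agent`. Supported user-scoped changes leave a backup and a rollback record, and they never run during package installation. setup looks for helpers inside the package/checkout first. Allow the `PATH` helper fallback with `--allow-path-helper-fallback` only after confirming that the installation is trusted.
+The default is project-scoped setup. User-scoped setup must be chosen explicitly, and applying real changes requires `--yes` and an explicit `--agent`. Supported user-scoped changes leave a backup and a rollback record, and they never run during package installation. `setup` looks for helpers inside the package/checkout first. Allow the `PATH` helper fallback path with `--allow-path-helper-fallback` only after confirming that the installation is trusted.
+Distribution and helper trust boundaries are conservative as well. npm exposes only the canonical `context-guard`/`context-guard-*` bin links, and the legacy `claude-*` wrappers remain only as package files for path-based migration. Command manifests are read only as literal data, not as executable Python, and the macOS visibility helper uses only bundled/resource/executable-relative paths or an absolute explicit override and runs with a minimal environment. The current working directory, relative overrides, symlinked helpers, arbitrary `PATH`, and the parent shell environment are not trusted by default.
 ContextGuard does not overstate savings figures. It reduces common causes of unnecessary context growth and provides benchmark tooling so that you can measure actual before/after results on your own work. Results vary by repository, and no fixed token or cost savings rate is guaranteed.
 ## Claude Code first, other agents too
-For Claude users, starting with the Claude Code plugin is the fastest path. After installation, the same local-first guardrails can be reused from other AI coding and tool agents in the following ways.
+For Claude Code users, starting with the plugin is the fastest path. After installation, the same local-first guardrails can be reused from other AI coding and tool agents in the following ways.
 - **Local helper commands** (`context-guard-*`) run as ordinary shell commands that are not tied to any particular agent.
 - **Brief-mode snippets** are installed as marker blocks in an agent's instruction file (`AGENTS.md`, `GEMINI.md`, `.cursorrules`, Copilot instruction files, and so on) and are removed by deleting the block.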
@@ -54,7 +56,7 @@ Claude 사용자는 Claude Code 플러그인으로 시작하는 것이 가장
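 ## How ContextGuard reduces token waste
-ContextGuard is not a tool that lowers model prices themselves. It reduces unnecessary input before it enters an AI coding agent's context and provides signals that let you check for yourself whether the change helped.
+ContextGuard is not a tool that lowers model unit prices themselves. It reduces unnecessary input before it enters an AI coding agent's context and provides signals that let you check for yourself whether the change helped.
 | Waste path | ContextGuard guardrail |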
 | --- | --- |
@@ -63,20 +65,20 @@ ContextGuard는 모델 가격 자체를 낮추는 도구가 아닙니다. AI 코
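 | Repeating the same failing command | When Bash failures repeat, it nudges you to change strategy before more unnecessary failure logs pile up. |
 | Sensitive or excessive terminal output | Masks values that look like credentials and paths that look sensitive on a best-effort, pattern-matching basis. |
 | Not knowing where tokens and cost are growing | Leaves before/after evidence through the status line, transcript audits, and benchmark reports that compare paired baseline and variant runs. |
-| An Anthropic API request may miss a provider prompt cache hit | `context-guard cost preflight` estimates the pre-call input size, per-cache-breakpoint risk, and low/medium/high cost ranges. By default it only warns. |
+| An Anthropic API request may fail to hit the provider prompt cache | `context-guard cost preflight` estimates the pre-call input size, per-cache-breakpoint risk, and low/medium/high cost ranges. By default it only warns. |
 | Frequently changing context comes before the stable prompt prefix | Audits prompt layout with bounded, redacted segment hashes and flags potentially cache-unfriendly layouts without exposing the raw prompt. |
 | A large tool/MCP catalog is loaded for a narrow task | Ranks the local tool catalog into a bounded top-k schema report and lets the full redacted schema be retrieved again from a local summary record. |
 ## How it differs from caching and compression tools
-ContextGuard does not replace provider caches, semantic caches, or prompt compression tools. Its core role is simpler: **keeping unnecessary files, logs, and output from entering the agent context in the first place**.
+ContextGuard does not replace provider caches, semantic caches, or prompt compression tools. Its core role is simpler: **reducing unnecessary files, logs, and output before they enter the agent context**.
 | Tool type | How it reduces | Relationship to ContextGuard |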
 | --- | --- | --- |
 | Provider prompt/context caching | Reuses the stable prompt prefix. | Complementary. ContextGuard helps keep the frequently changing tail of the context smaller and cleaner, checks prompt layout with `context-guard-audit`, and can warn in advance with `context-guard cost` when an Anthropic request is likely to be a cache write instead of a cache read. |
 | Semantic response cache | Reuses previous answers for identical or similar requests. | Complementary. ContextGuard does not provide an AI answer cache. |
 | Prompt/context compression | Shortens text that has already been selected. | Adjacent role. ContextGuard provides local output trimming and digests but does not guarantee lossless semantic compression. |
-| Experimental planner/runtime | The local proxy is considered only as a dry-run plan, an external-forwarding design plan, a gate record, and a one-shot loopback forwarding MVP. context-diff, visual evidence-pack, learned-compression, and self-hosted metrics likewise support only explicit local runtimes. | All are disabled by default and require explicit commands. `record` does not start a listener, traffic forwarding, or DNS lookups, and `serve local-proxy` binds/forwards only a single request restricted to literal loopback IPs. Without a separate evidence gate and a future PR gate, none of this is treated as model/compressor execution, an OCR/crop service, external forwarding, credential persistence, or a hosted API savings claim. See the “Managing experimental opt-in features” section for details. |
+| Experimental planner/runtime | The local proxy is considered only as a dry-run plan, an external-forwarding design plan, a gate record, and a one-shot loopback forwarding MVP. context-diff, visual evidence-pack, learned-compression, and self-hosted metrics likewise support only explicit local runtimes. | All are disabled by default and require explicit commands. `record` does not start a listener, traffic forwarding, or DNS lookups, `serve local-proxy` binds/forwards only a single request restricted to literal loopback IPs, and `--response-sandbox` can replace a safe UTF-8 upstream body with a compact local-artifact rehydration envelope. Without a separate evidence gate and a future PR gate, none of this is treated as model/compressor execution, an OCR/crop service, external forwarding, credential persistence, or a hosted API savings claim. See the “Managing experimental opt-in features” section for details. |
 | ContextGuard | Helps reduce unnecessary files, logs, repeated failures, and excessive output before they enter the agent context. | Local guardrails, reversible local archives, and measurement tools. |
 The related patterns that informed the design are as follows.
@@ -102,6 +104,7 @@ brief 모드는 코딩 에이전트가 군더더기를 줄이도록 요청하되
 - Transcript usage hot spots, `cache_friendliness` prompt-layout signals, and `cache_layout_advice` experiment priorities reported by `context-guard-audit`
 - The status line's `cache` / `reuse` values: these are observed transcript and provider cache signals, not savings produced by ContextGuard itself.
 - Use `context-guard cost preflight` to see the estimated cost of an Anthropic request JSON, then after the call use `context-guard cost observe` to reconcile the provider usage fields (`cache_creation_input_tokens`, `cache_read_input_tokens`).
+- `context-guard-cache-score` gives guidance on static cache layout and on amortization risk based on cache write/read multipliers that you supply yourself. char/4 token values are estimated proxies, not provider-measured savings.
 - Results from `context-guard-bench` comparing paired successful baseline/variant runs
 - The difference between a large tool/MCP catalog and the `context-guard-tool-prune` top-k report with summary-record retrieval
 - As with the optional experimental lanes in [`research/experimental-token-reduction-radar.md`](research/experimental-token-reduction-radar.md), the fixture-only starter examples in [`docs/experimental-benchmark-fixtures.md`](docs/experimental-benchmark-fixtures.md) must first pass the same matched-task benchmark gate before any savings claim is made.
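
The amortization risk mentioned for `context-guard-cache-score` can be made concrete with a little arithmetic. The sketch below is one plausible cost model, not necessarily the formula the tool uses: it assumes costs are expressed relative to one uncached pass over the same prefix, and that both multipliers are supplied by the user. The function names are hypothetical.

```python
import math
from typing import Optional


def break_even_reads(write_multiplier: float, read_multiplier: float) -> Optional[int]:
    """Smallest number of cache reads after which caching is strictly cheaper.

    Relative costs for one write followed by n reads of the same prefix:
      cached   = write_multiplier + n * read_multiplier
      uncached = 1 + n
    """
    if read_multiplier >= 1.0:
        return None  # reads are no cheaper than uncached input, so the write never amortizes
    threshold = (write_multiplier - 1.0) / (1.0 - read_multiplier)
    return max(0, math.floor(threshold) + 1)


def chars_div_4(text: str) -> int:
    """char/4 token proxy: an estimate only, not a provider-measured token count."""
    return len(text) // 4
```

If a prefix is unlikely to be reused at least `break_even_reads(...)` times before it changes or expires, the cache write is a net cost under this model, which is the risk the advisory is meant to surface.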
@@ -111,13 +114,14 @@ brief 모드는 코딩 에이전트가 군더더기를 줄이도록 요청하되
 - Does not guarantee a fixed token or cost savings rate.
 - Does not send your work to external AI services in order to reduce model tokens.
 - Does not change global Claude settings merely by being installed.
+- Does not execute command manifests as code or trust arbitrary `PATH`/current-working-directory helpers during setup or packaging smoke checks.
 - Does not substitute for your own before/after measurement when savings figures are needed.
 - Local RAM/disk archives can help reduce the context sent next, but they do not replace the Anthropic provider prompt cache or guarantee cache hits. Re-check the Anthropic prompt caching/pricing documentation before deployment or billing explanations: https://docs.anthropic.com/en/build-with-claude/prompt-caching and https://platform.claude.com/docs/en/about-claude/pricing.
-- Experimental helpers are mostly dry-run safety checkers/planners and include a design-only external-forwarding opt-in gate. The explicit local runtimes provide only caller-supplied context-diff replacement payloads, caller-supplied visual crop/OCR evidence packs, caller-supplied learned-compression prose candidates, self-hosted metrics JSONL sidecar records, local-proxy runtime-gate JSONL records, one-shot `serve local-proxy` loopback forwarding that requires a private ready-file nonce, and optional shifted-cost diagnostic JSONL rows for successful forwarded requests.
+- Experimental helpers are mostly dry-run safety checkers/planners and include a design-only external-forwarding opt-in gate. The explicit local runtimes provide only caller-supplied context-diff replacement payloads, caller-supplied visual crop/OCR evidence packs, caller-supplied learned-compression prose candidates, self-hosted metrics JSONL sidecar records, local-proxy runtime-gate JSONL records, one-shot `serve local-proxy` loopback forwarding that requires a private ready-file nonce, an optional `--response-sandbox` that turns safe UTF-8 responses into compact artifact envelopes, and optional shifted-cost diagnostic JSONL rows for successful forwarded requests.
 - ContextGuard does not provide learned/synthetic compressor execution, embeddings, rerankers, model calls, or generative replacement; screenshot capture, image cropping, OCR execution, image parsing, or external OCR/image services; a self-hosted KV/latent inference optimization runtime beyond explicit local metrics recording; or proxy forwarding beyond literal-loopback one-shot HTTP forwarding with credential blocking.
 - Does not provide the old `/claude-token-optimizer:*` Claude Code slash commands as aliases. After installation, use `/context-guard:*`.
-So that existing automation does not break immediately, the local CLI compatibility wrappers (`claude-token-*`, `claude-read-symbol`, `claude-trim-output`, `claude-sanitize-output`) continue to be provided in `bin/`.
+So that existing automation does not break immediately, the local CLI compatibility wrappers (`claude-token-*`, `claude-read-symbol`, `claude-trim-output`, `claude-sanitize-output`) continue to be included as package files under `plugins/context-guard/bin/`. The npm global/`npx` bin links intentionally expose only the canonical `context-guard`/`context-guard-*` commands, so if you need the legacy wrappers, invoke them by package/plugin path.
 ## Features provided
@@ -170,7 +174,7 @@ brief 모드는 코딩 에이전트가 군더더기를 줄이도록 요청하되
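 ## Install with npm/npx
-The npm package provides a single `context-guard` command together with the existing `context-guard-*` helper commands. Installation is passive: it writes no settings via `postinstall`, and project or user settings change only when you run `context-guard setup` yourself. Even if setup cannot find helpers inside the package/checkout, the `PATH` fallback is off by default. Review the plan with `context-guard doctor` or `setup --verify`, then use `--allow-path-helper-fallback` only for helper directories you trust.
+The npm package provides a single `context-guard` command together with the `context-guard-*` helper commands. Installation is passive: it writes no settings via `postinstall`, and project or user settings change only when you run `context-guard setup` yourself. The npm global/`npx` bin links intentionally expose only the canonical `context-guard`/`context-guard-*` commands. The legacy `claude-*` wrapper files remain in the package for explicit path-based migration but are not advertised as executable bin aliases. Even if setup cannot find helpers inside the package/checkout, the `PATH` fallback is off by default. Review the plan with `context-guard doctor` or `setup --verify`, then use `--allow-path-helper-fallback` only for helper directories you trust.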
 ```bash
 npm install -g @ictechgy/context-guard
@@ -235,10 +239,11 @@ context-guard setup --agent claude --scope user --verify --json
 ```bash
 long-command 2>&1 | ./plugins/context-guard/bin/context-guard-artifact store --command "long-command" --json
 ./plugins/context-guard/bin/context-guard-artifact search "ERROR" --json
+./plugins/context-guard/bin/context-guard-artifact receipt <artifact_id> --json
 ./plugins/context-guard/bin/context-guard-artifact get <artifact_id> --lines 1:80
 ```
-Local archive mode is for capture, sandbox search, and retrieval. The default storage location is `.context-guard/artifacts`, and pre-rebrand `.claude-token-optimizer/artifacts` summary records remain readable. JSON summary records include line-numbered top-error summaries, duplicate-line groups, and redacted, range-limited `suggested_queries`, so an agent can retrieve exactly the minimum range it needs without re-injecting the full log. `search` searches the local sanitized artifact sandbox by literal substring and returns bounded match/context records together with `context-guard-artifact get ... --lines START:END` rehydration commands. Raw private paths from custom `--dir` values are redacted by default, so rerun with the same `--dir`, or pass `search --show-paths` only when you really need a directly runnable local command. This search report is local-only and must not be interpreted as a hosted token/cost savings claim. In pipelines where the exit code matters, such as release checks, preserve the original command's exit code yourself. If exit-code preservation is essential, `context-guard-trim-output -- ...` is the safer choice.
+Local archive mode is for capture, sandbox search, and retrieval. The default storage location is `.context-guard/artifacts`, and pre-rebrand `.claude-token-optimizer/artifacts` summary records remain readable. JSON summary records include line-numbered top-error summaries, duplicate-line groups, redacted, range-limited `suggested_queries`, and an `output_sandbox` envelope with a stable `contextguard-artifact:<id>` handle. Use `context-guard-artifact receipt <artifact_id> --json` to fetch only the metadata/rehydration handle again without the body, then retrieve exactly the minimum range you need without re-injecting the full log. `search` searches the local sanitized artifact sandbox by literal substring and returns bounded match/context records together with `context-guard-artifact get ... --lines START:END` rehydration commands. Raw private paths from custom `--dir` values are redacted by default, so rerun with the same `--dir`, or pass `search --show-paths` only when you really need a directly runnable local command. This search report is local-only and must not be interpreted as a hosted token/cost savings claim. In pipelines where the exit code matters, such as release checks, preserve the original command's exit code yourself. If exit-code preservation is essential, `context-guard-trim-output -- ...` is the safer choice.
 ### Build a budget-based context pack
@@ -253,7 +258,7 @@ long-command 2>&1 | ./plugins/context-guard/bin/context-guard-artifact store --c
 # Or run as two explicit steps:
 ./plugins/context-guard/bin/context-guard-pack suggest \
   --root . --query "failing tests review" --diff HEAD \
-  --manifest-out suggested-pack.json --budget-bytes 12000 --json --adaptive-k
+  --manifest-out suggested-pack.json --budget-bytes 12000 --json --adaptive-k --adaptive-k-policy recall
 ./plugins/context-guard/bin/context-guard-pack build \
   --root . --manifest suggested-pack.json --budget-bytes 12000 --json
 ./plugins/context-guard/bin/context-guard-pack slice --root . --path README.md --lines 1:40 --json
@@ -266,7 +271,7 @@ long-command 2>&1 | ./plugins/context-guard/bin/context-guard-artifact store --c
 - Adding `--explain` includes brief deterministic local selection/build reasons in the JSON or text output.
 - JSON explain may include a bounded `repo_map`. Examples include a sampled byte/token-proxy tree, category-only secret risk counts, signature-first hints, explain-only graph rank, and existing `slice`/symbol retrieval hints.
 - repo-map does not change the manifest, pack body, receipt, or byte budget, and it uses no network, model calls, or embeddings. Token values are estimated `chars_div_4` proxies, not provider tokens or savings claims.
-- Adding `--adaptive-k` to `suggest` or `auto` includes advisory-only top-k shrink/expand metadata derived from the local score distribution, byte-budget fit, and score-mass-based recall/precision proxies. The recommendation is not applied automatically and does not change the manifest, pack body, receipt, or byte budget.
+- Adding `--adaptive-k` to `suggest` or `auto` includes advisory-only top-k shrink/expand metadata derived from the local score distribution, byte-budget fit, and clamped score-mass-based recall/precision proxies. You can choose a local recommendation policy with `--adaptive-k-policy balanced|recall|precision` and the optional `--adaptive-k-min-recall-proxy` / `--adaptive-k-min-precision-proxy` gates; gate failures are metadata-only (`pass|failed`). The adaptive block includes capped selected/omitted evidence and structured source-verification hints, but it does not apply the recommendation automatically and does not change the manifest, pack body, receipt, or byte budget.
 - Adding `--symbol-memory` to `auto` includes repo-map-based symbol/graph advisory metadata and exact `slice` / `read-symbol` verification hints. This is source-verification guidance only and does not change the manifest, pack body, receipt, or byte budget.
 - `--manifest-out` saves a manifest that `build` can read, and `--pack-out` saves the rendered pack body.
 - `context-guard-pack suggest` is a lower-level, local-only preparation step. It ranks candidate files and line ranges from redacted signals drawn from `--query`, `--diff`, repeated `--files`, and optional `--output` / `--test-output` text files under `--root`, then writes a manifest that `build --manifest` can read directly.
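
To illustrate what an advisory, metadata-only top-k recommendation with a recall-proxy gate could look like, here is a minimal sketch. It assumes that the "score-mass recall proxy" is the share of total candidate score covered by the top k items; that reading, the function name, and the returned keys are all assumptions, and the real `--adaptive-k` policies may compute something different.

```python
from typing import Dict, List


def adaptive_k_advice(scores: List[float], min_recall_proxy: float = 0.8) -> Dict[str, object]:
    """Recommend the smallest k whose share of total score mass meets the gate."""
    ranked = sorted((max(0.0, s) for s in scores), reverse=True)  # clamp negatives to zero
    total = sum(ranked)
    if total <= 0.0:
        return {"recommended_k": 0, "recall_proxy": 0.0, "gate": "failed"}
    mass = 0.0
    for k, score in enumerate(ranked, start=1):
        mass += score
        recall = min(1.0, mass / total)
        if recall >= min_recall_proxy:
            return {"recommended_k": k, "recall_proxy": recall, "gate": "pass"}
    # Reached only when the gate exceeds what the full list can cover (or on rounding).
    return {"recommended_k": len(ranked), "recall_proxy": min(1.0, mass / total), "gate": "failed"}
```

The function only returns metadata. Consistent with the advisory-only behavior described above, a caller would report the result rather than rewrite a manifest or pack with it.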
@@ -282,10 +287,14 @@ long-command 2>&1 | ./plugins/context-guard/bin/context-guard-artifact store --c
   --catalog tools.json \
   --query "review failing tests" \
   --top 5 --budget-bytes 12000 --json
+./plugins/context-guard/bin/context-guard-tool-prune defer-report \
+  --catalog tools.json \
+  --query "review failing tests" \
+  --core-top 3 --deferred-top 20 --json
 ./plugins/context-guard/bin/context-guard-tool-prune get <receipt_id> --tool read_file --json
 ```
-`context-guard-tool-prune` ranks a local tool or MCP catalog with a deterministic lexical heuristic to produce a bounded top-k advisory report. Inline schemas respect the observed UTF-8 byte budget, and schemas that are missing or omitted because of the budget can be retrieved again from the compact summary records in `.context-guard/tool-prune` and a separate redacted payload. This feature is advisory and does not change MCP settings. Token values are estimated proxies, not provider-measured savings.
+`context-guard-tool-prune` ranks a local tool or MCP catalog with a deterministic lexical heuristic to produce a bounded top-k advisory report. Inline schemas respect the observed UTF-8 byte budget, and schemas that are missing or omitted because of the budget can be retrieved again from the compact summary records in `.context-guard/tool-prune` and a separate redacted payload. `defer-report` separates core inline tools from deferred tool stub/namespace summaries and also shows gross/net char/4 proxy accounting for the schemas left out of the first prompt. This feature is advisory and does not change MCP settings or native provider tool search. Token values are estimated proxies, not provider-measured savings.
 ### Total cost, batchability, and routing candidate advice
@@ -317,7 +326,7 @@ cat sanitized-prose.txt | ./plugins/context-guard/bin/context-guard-compress --j
 ./plugins/context-guard/bin/context-guard-trim-output --max-lines 120 -- npm test
 ```
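-If you need a semantic digest instead of head/tail logs, use `--digest markdown` or `--digest json`. Digest mode preserves the original exit code while recording status, exit code, truncated line count, runner failure information, redacted failure signatures, duplicate-line groups, representative lines, redaction counts, and next-query suggestions. To store the full redacted output in the local `context-guard-artifact` archive in digest mode, also pass `--artifact-receipt`. Before relying on omitted details, retrieve the full content again with the printed `context-guard-artifact get ...` command. The wrapped command is terminated after a default of 600 seconds, adjustable with `--timeout-seconds`.
+If you need a semantic digest instead of head/tail logs, use `--digest markdown` or `--digest json`. Digest mode preserves the original exit code while recording status, exit code, truncated line count, runner failure information, redacted failure signatures, duplicate-line groups, representative lines, redaction counts, and next-query suggestions. To store the full redacted output in the local `context-guard-artifact` archive in digest mode, also pass `--artifact-receipt`. Keep the printed `contextguard-artifact:<id>` handle in the agent context, and before relying on omitted details, retrieve exactly the part you need with the `context-guard-artifact receipt/get/search ...` commands. The wrapped command is terminated after a default of 600 seconds, adjustable with `--timeout-seconds`.
 ### Redacting sensitive data in search and diff output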
@@ -365,12 +374,14 @@ JSON 출력에는 여러 증거 surface가 포함될 수 있습니다.
 - Successful baseline/variant runs are compared on actual tokens and `cost_usd + external_cost_usd`; byte reductions are recorded only as indirect evidence.
 - Token savings claims are computed only when both sides of a matched task have `primary_tokens_measured`.
 - `matched_pair_evidence` links successful task buckets to the transform, measurability, quality gate, and claim boundary, so check it first before writing any savings wording.
+- `default_matrix` classifies trimming, artifact escrow, tool pruning, cache advice, adaptive-k, and optional compression as `default-on`, `advisory`, `experimental`, or `reject/rework` based on the same matched evidence. This matrix is report-only and does not change runtime defaults or permit hosted token/cost savings claims.
+- `public_claim_readiness` is the final gate for release/public claims. `claim_allowed=true` only when matched successful tasks, provider-measured primary tokens/cost, quality non-inferiority, shifted-cost accounting, explicit confidence/failure notes, and complete provider-export provenance all pass; otherwise hosted savings claims are prohibited.
 - `wall_time_seconds`, `provider_cached_tokens`, and `provider_cached_tokens_measured` are diagnostic telemetry and are not treated as evidence of token or cost savings produced by ContextGuard itself.
 - Optional `self_hosted_metrics` are recorded only as a per-run JSONL ledger sidecar, are excluded from CSV/report summaries, and must not be included as a basis for hosted API token/cost savings claims. `context-guard experiments plan self-hosted-metrics-ledger` produces only a dry-run preview of such a sidecar and writes no ledger file.
 - If cost fields are zero or missing, only token savings are shown and no actual cost savings are claimed.
 - The CSV schema is checked strictly. After upgrading the benchmark helper, start a new `--csv` file or migrate to the header reported by the mismatch error.
-For a minimal report shape, see [`docs/benchmark-report.example.json`](docs/benchmark-report.example.json); for synthetic per-task-type examples and safe interpretation boundaries, see [`docs/benchmark-workflow-examples.md`](docs/benchmark-workflow-examples.md); for fixture-only experiment starter examples, see [`docs/experimental-benchmark-fixtures.md`](docs/experimental-benchmark-fixtures.md).
+For a minimal report shape, see [`docs/benchmark-report.example.json`](docs/benchmark-report.example.json); for synthetic per-task-type examples and safe interpretation boundaries, see [`docs/benchmark-workflow-examples.md`](docs/benchmark-workflow-examples.md); for fixture-only experiment starter examples, see [`docs/experimental-benchmark-fixtures.md`](docs/experimental-benchmark-fixtures.md). If you need a deterministic local replay before a live provider run, use `--evidence-jsonl docs/benchmark-fixtures/token-savings-12task.evidence.example.jsonl --dashboard-md ... --baseline-variant baseline_full_context_fixture`. Replay mode produces the CSV/report/dashboard without running the provider or `success_command`, but it marks synthetic/manual evidence as ineligible for public hosted-savings claims.
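
The all-gates-must-pass rule behind `public_claim_readiness` can be sketched in a few lines. The gate names below paraphrase the bullet above and are hypothetical; the real report's field names and structure may differ.

```python
from typing import Dict

# Gate names paraphrase the README bullet; they are assumptions, not the report's real keys.
REQUIRED_GATES = (
    "matched_successful_tasks",
    "provider_measured_tokens_and_cost",
    "quality_non_inferiority",
    "shifted_cost_accounting",
    "confidence_and_failure_notes",
    "complete_provider_export_provenance",
)


def public_claim_readiness(gates: Dict[str, bool]) -> Dict[str, object]:
    """A claim is allowed only when every required gate is explicitly True."""
    failed = [name for name in REQUIRED_GATES if gates.get(name) is not True]
    return {"claim_allowed": not failed, "failed_gates": failed}
```

Treating a missing gate the same as a failed one keeps the default conservative: a report that never measured provider tokens cannot accidentally allow a claim.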
 ### Managing experimental opt-in features
@@ -390,7 +401,7 @@ context-guard experiments record self-hosted-metrics-ledger --ledger-jsonl .cont
 context-guard experiments plan local-proxy --json --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack
 context-guard experiments plan local-proxy-external-forwarding --external-forwarding-intent --external-forwarding-design-ack --allow-host api.example.com --allow-scheme https --credential-redaction-policy strip-sensitive-headers --provider-evidence-boundary diagnostic-only-provider-measured-required --threat-model-note "Only user-owned HTTPS endpoint; sensitive headers are stripped before any future forwarding." --json
 context-guard experiments record local-proxy-runtime-gate --ledger-jsonl .context-guard/local-proxy-gates.jsonl --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack --json
-context-guard experiments serve local-proxy --bind-host 127.0.0.1 --bind-port 18080 --target-host 127.0.0.1 --target-port 18081 --runtime-gate-ack --forwarding-gate-ack --once --ready-file .context-guard/local-proxy-ready.json --diagnostic-ledger-jsonl .context-guard/local-proxy-diagnostics.jsonl --json
+context-guard experiments serve local-proxy --bind-host 127.0.0.1 --bind-port 18080 --target-host 127.0.0.1 --target-port 18081 --runtime-gate-ack --forwarding-gate-ack --once --ready-file .context-guard/local-proxy-ready.json --response-sandbox --response-artifact-dir .context-guard/artifacts --diagnostic-ledger-jsonl .context-guard/local-proxy-diagnostics.jsonl --json
 context-guard experiments enable output-receipt-trim --root .
 context-guard experiments disable output-receipt-trim --root .
 ```
@@ -399,7 +410,7 @@ local-proxy 예시는 side effect 기준으로 나뉩니다.
 - `plan local-proxy` produces only advisory metadata and does not turn on forwarding.
 - `record local-proxy-runtime-gate` appends only a single localhost-only gate row and does not start a listener, forward traffic, store API keys, or make hosted API savings claims.
-- `serve local-proxy` is a separate MVP. It requires all of `--runtime-gate-ack --forwarding-gate-ack --once` and a private `--ready-file` nonce handoff, binds/forwards only to literal loopback IPs, applies byte/time limits, and blocks credential-bearing requests, hostname DNS targets, external forwarding, CONNECT/TLS proxying, API-key persistence, and hosted savings claims.
+- `serve local-proxy` is a separate MVP. It requires all of `--runtime-gate-ack --forwarding-gate-ack --once` and a private `--ready-file` nonce handoff, binds/forwards only to literal loopback IPs, applies byte/time limits, and blocks credential-bearing requests, hostname DNS targets, external forwarding, CONNECT/TLS proxying, API-key persistence, and hosted savings claims. Optional `--response-sandbox` is a mediated response mode rather than transparent forwarding: it stores only safe UTF-8 upstream response text as a sanitized local artifact receipt and returns a compact JSON envelope containing `contextguard-artifact:<id>` and a rehydration command. Binary, sensitive, oversized, or blocked responses are not stored as artifacts.
 - With `--diagnostic-ledger-jsonl`, a shifted-cost diagnostic row is appended only after a successful forwarded request, and no raw headers, request bodies, response bodies, or hosted-savings evidence are stored.
 - `plan local-proxy-external-forwarding` is only a dry-run design gate. It requires explicit external intent, a design ack, an HTTPS host allowlist, a threat model note, a credential redaction policy, and a provider-evidence boundary, but it does not start a listener, perform DNS lookups, call external services, forward traffic, persist credentials, provide an external proxy forwarding runtime, or make hosted savings claims.
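
The `--response-sandbox` behavior described above (store only safe UTF-8 text, return a compact envelope, skip binary or oversized bodies) could be approximated as follows. This is a sketch under stated assumptions: the size cap, the hash-based id scheme, and the envelope keys are hypothetical, and sensitive-content detection is left out entirely.

```python
import hashlib
from typing import Optional

MAX_BODY_BYTES = 64 * 1024  # hypothetical cap; the real limit is not stated in this README


def sandbox_envelope(body: bytes) -> Optional[dict]:
    """Return a compact envelope for a safe UTF-8 body, or None when it must not be stored."""
    if len(body) > MAX_BODY_BYTES:
        return None  # oversized
    try:
        text = body.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None  # binary or otherwise not valid UTF-8
    if "\x00" in text:
        return None  # NUL characters suggest binary content
    artifact_id = hashlib.sha256(body).hexdigest()[:16]  # illustrative id scheme only
    return {
        "handle": f"contextguard-artifact:{artifact_id}",
        "bytes": len(body),
        "rehydrate": f"context-guard-artifact get {artifact_id} --lines 1:80",
    }
```

The envelope carries a handle and a rehydration command instead of the body, so the agent context receives a few dozen bytes while the full text stays in the local artifact store.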
@@ -411,7 +422,7 @@ local-proxy 예시는 side effect 기준으로 나뉩니다.
 | `visual-crop-ocr` | Dry-run visual evidence advice and an explicit `emit visual-crop-ocr` runtime that outputs a caller-supplied evidence pack. | `emit` requires a full visual evidence receipt, a missed-context note, and complete user-supplied crop and/or OCR evidence. ContextGuard does not capture screenshots, crop images, run OCR, parse images, call external services, write files, or make hosted token/cost savings claims. |
 | `learned-compression` | Deny-by-default policy checks and an explicit `emit learned-compression` runtime that outputs a caller-supplied compact prose candidate with verified exact fallback content. | `emit` requires sanitized trusted prose, protected-signal denial, a verified local fallback artifact matching the input, and a smaller caller-supplied prose candidate. ContextGuard does not run or produce compressors, embeddings, rerankers, model calls, subprocesses, external services, generative replacement, or hosted savings claims. |
 | `self-hosted-metrics-ledger` | Dry-run preview and an explicit `record ... --ledger-jsonl` runtime that records local/model-server latency, memory, quality, energy, throughput, and local-cost metrics. | The dry-run preview writes no ledger file. Only the explicit record command writes a local JSONL sidecar, and it is not used as a basis for hosted API token/cost savings claims. |
-| `local-proxy` | Localhost-only advisory metadata for a future local proxy candidate, a design-only `plan local-proxy-external-forwarding` review for future external forwarding, an explicit `record local-proxy-runtime-gate --ledger-jsonl` gate-row runtime, an explicit one-shot `serve local-proxy` loopback forwarding MVP, and optional `--diagnostic-ledger-jsonl` shifted-cost diagnostics for successful forwarded requests. | `plan` writes no ledger. `record` writes a local JSONL row only with localhost-only metadata and `--runtime-gate-ack`, and it does not start a listener, forward traffic, or perform DNS lookups. `serve` requires `--forwarding-gate-ack --once`, a private `--ready-file` nonce handoff, literal loopback bind/target IPs, a nonzero port, byte/time limits, and credential-free requests, and it does not do external forwarding, CONNECT/TLS proxying, API-key persistence, or hosted API savings claims. `--diagnostic-ledger-jsonl` writes only successful-forward diagnostic rows and stores no raw headers/bodies or hosted-savings claims. `plan local-proxy-external-forwarding` outputs only threat model/allowlist/redaction/provider-evidence design metadata and does not perform DNS lookups, external service calls, traffic forwarding, credential persistence, or hosted savings claims. |
+| `local-proxy` | Localhost-only advisory metadata for a future local proxy candidate, a design-only `plan local-proxy-external-forwarding` review for future external forwarding, an explicit `record local-proxy-runtime-gate --ledger-jsonl` gate-row runtime, an explicit one-shot `serve local-proxy` loopback forwarding MVP, an optional `--response-sandbox` that turns safe UTF-8 responses into compact artifact envelopes, and optional `--diagnostic-ledger-jsonl` shifted-cost diagnostics for successful forwarded requests. | `plan` writes no ledger. `record` writes a local JSONL row only with localhost-only metadata and `--runtime-gate-ack`, and it does not start a listener, forward traffic, or perform DNS lookups. `serve` requires `--forwarding-gate-ack --once`, a private `--ready-file` nonce handoff, literal loopback bind/target IPs, a nonzero port, byte/time limits, and credential-free requests, and it does not do external forwarding, CONNECT/TLS proxying, API-key persistence, or hosted API savings claims. `--response-sandbox` stores only safe UTF-8 response text as a sanitized local artifact receipt and returns a compact envelope containing a redacted rehydration command template instead of the raw body; it is not a hosted token/cost savings claim. `--diagnostic-ledger-jsonl` writes only successful-forward diagnostic rows and stores no raw headers/bodies or hosted-savings claims. `plan local-proxy-external-forwarding` outputs only threat model/allowlist/redaction/provider-evidence design metadata and does not perform DNS lookups, external service calls, traffic forwarding, credential persistence, or hosted savings claims. |
 ## Features not yet provided
@@ -457,7 +468,7 @@ export PATH="$PWD/plugins/context-guard/bin:$PATH"
 context-guard-setup --plan
 ```
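-Generated hook commands do not rely on `PATH` lookup by default. The setup wizard records explicit package/checkout helper paths, and `--allow-path-helper-fallback` is permitted only for a trusted external install, after canonical-path, no-symlink, and bounded identity-probe verification.
+Generated hook commands do not rely on `PATH` lookup by default. The setup wizard records explicit package/checkout helper paths, and `--allow-path-helper-fallback` is permitted only for a trusted external install, after canonical-path, no-symlink, and bounded identity-probe verification. The macOS app helper follows the same trust model: it does not use launch-CWD discovery, relative override paths, or inheritance of the parent shell environment beyond the required allowlist values.
 ## Release checks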
@@ -469,7 +480,7 @@ python3 scripts/prepublish_check.py
 python3 scripts/release_smoke.py
 ```
-If a helper changed under `context-guard-kit/`, run `python3 scripts/sync_plugin_copies.py --write` before the gates. `sync_plugin_copies.py --check` verifies the maintainer exact-copy contract first. To avoid duplicating the implementation payload, the npm package ships only the synchronized plugin-local `plugins/context-guard/bin` entrypoints and `plugins/context-guard/lib` helpers. `prepublish_check.py` checks package invariants, synchronized plugin binaries, manifests, diagnostic message redaction, and regression tests. `release_smoke.py` actually runs representative packaged entrypoints from `plugins/context-guard/bin` in a temporary project to catch broken CLI wiring before release. For the full release procedure, evidence checklist, quad-review requirements, and rollback checklist, see [docs/release-runbook.md](docs/release-runbook.md).
+If a helper changed under `context-guard-kit/`, run `python3 scripts/sync_plugin_copies.py --write` before the gates. `sync_plugin_copies.py --check` verifies the maintainer exact-copy contract first. To avoid duplicating the implementation payload, the npm package ships only the synchronized plugin-local `plugins/context-guard/bin` entrypoints and `plugins/context-guard/lib` helpers, and the npm bin map intentionally excludes legacy `claude-*` wrapper aliases. The command manifest is read only as literal assignments during release/runtime checks; executable Python, imports, functions, and shadow manifests are rejected. `prepublish_check.py` checks package invariants, synchronized plugin binaries, manifests, diagnostic message redaction, and regression tests. `release_smoke.py` actually runs representative packaged entrypoints from `plugins/context-guard/bin` in a temporary project to catch broken CLI wiring before release. For the full release procedure, evidence checklist, quad-review requirements, and rollback checklist, see [docs/release-runbook.md](docs/release-runbook.md).
 Per-version release notes are recorded in [CHANGELOG.md](CHANGELOG.md), and the pre-publish gate checks that an entry matching the plugin manifest version exists.

package/README.md CHANGED

@@ -1,22 +1,22 @@
 # ContextGuard
-ContextGuard is a local-first context management toolkit for AI coding and tool agents. It ships first as a Claude Code plugin: install it once, enable it per project, and roll it back when needed.
+ContextGuard is a local-first context management toolkit for AI coding and tool-using agents. It ships as a Claude Code plugin first: install it once, enable it per project, and roll it back when needed.
-It trims noisy output, steers agents toward symbol-level reads, nudges repeated failures, redacts secret-like patterns, and measures usage. The same guardrails extend to other agents through local helper commands and advisory brief-mode rule snippets.
+It helps trim noisy output, steer agents toward symbol-level reads, nudge repeated failures, redact secret-like patterns, and measure usage. The same guardrails extend to other agents through local helper commands and advisory brief-mode rule snippets.
 - Korean documentation: [`README.ko.md`](README.ko.md)
 - Static landing page: [GitHub Pages](https://ictechgy.github.io/context-guard/) ([source](docs/index.html))
 ## TL;DR
-Installation and activation are deliberately separate. Installing ContextGuard only makes local helpers or Claude plugin skills available. Configuration changes happen only when you run an explicit setup command.
+Installation and activation are deliberately separate. Installing ContextGuard only makes local helpers or Claude plugin skills available; configuration changes happen only when you run an explicit setup command.
 | If you use... | Install | Activate |
 | --- | --- | --- |
 | Claude Code | `/plugin marketplace add ictechgy/context-guard` then `/plugin install context-guard@context-guard` | Run `/context-guard:setup` inside the project. |
 | Codex CLI or any terminal-first agent | `npm install -g @ictechgy/context-guard` or one-shot `npx @ictechgy/context-guard ...` | `context-guard setup --agent codex --scope project --with-init --with-skill --plan`, then rerun with `--yes`. |
 | Other rule-file agents | Use the npm/npx install path above. | `context-guard setup --agent gemini,cursor,windsurf,cline,copilot --scope project --with-init --plan`, then apply only the agents you want. |
-| macOS/Homebrew users | release path: `brew install ictechgy/tap/context-guard` | Same `context-guard setup ...` commands after install. |
+| macOS/Homebrew users | Release path: `brew install ictechgy/tap/context-guard` | Same `context-guard setup ...` commands after install. |
 Common commands:
@@ -29,13 +29,15 @@ context-guard setup --agent claude --scope user --verify --json  # read-only use
 context-guard setup --agent claude --scope user --plan
 ```
-Project scope is the default. User-level setup is opt-in, requires an explicit agent for writes, records backups and rollback metadata, and never runs during package installation. Use `context-guard doctor` or `context-guard setup --verify` for a read-only health check before applying setup. `doctor` reports next commands and makes no changes. Setup resolves bundled or checkout-local helpers first; it does not trust arbitrary `PATH` helpers unless you explicitly pass `--allow-path-helper-fallback` for a known-good install.
+Project scope is the default. User-level setup is opt-in, requires an explicit agent for writes, records backups and rollback metadata, and never runs during package installation. Use `context-guard doctor` or `context-guard setup --verify` for a read-only health check before applying setup. `doctor` reports next commands and makes no changes. Setup looks for bundled or checkout-local helpers first; it does not trust arbitrary `PATH` helpers unless you explicitly pass `--allow-path-helper-fallback` for a known-good install.
-ContextGuard is intentionally conservative about savings claims. It reduces common sources of context bloat and provides benchmark tooling so you can measure before/after results on your own tasks. It does **not** promise a fixed token or cost reduction for every repository.
+Distribution and helper trust boundaries are conservative too: npm exposes only canonical `context-guard`/`context-guard-*` bin links, legacy `claude-*` wrappers remain package files for path-based migration, command manifests are treated as literal data instead of executable Python, and the macOS visibility helper is discovered from bundled/resource/executable-relative paths or an absolute explicit override with a minimal child environment. Current working directories, relative overrides, symlinked helpers, arbitrary `PATH`, and ambient shell environment are not trusted by default.
+ContextGuard is intentionally conservative about savings claims. It reduces common sources of context bloat and provides benchmark tooling so you can measure before-and-after results on your own tasks. It does **not** promise a fixed token or cost reduction for every repository.
 ## Claude Code first, other agents too
-ContextGuard ships first as a Claude Code plugin, which is still the fastest path to value for Claude users. After installation, the same local-first guardrails can be reused by other AI coding and tool agents through:
+ContextGuard ships as a Claude Code plugin first, which is still the fastest path to value for Claude users. After installation, the same local-first guardrails can be reused by other AI coding and tool-using agents through:
 - **Local helper commands** (`context-guard-*`) that run as plain shell commands, independent of any specific agent.
 - **Advisory brief-mode rule snippets** that you install into an agent's own instruction file (`AGENTS.md`, `GEMINI.md`, `.cursorrules`, Copilot instructions, and similar rule files) and remove by deleting the marker-delimited block.
@@ -56,7 +58,7 @@ Current setup surfaces:
 ## How ContextGuard reduces token waste
-ContextGuard does not make the model cheaper by itself. It reduces avoidable context before it reaches an AI coding agent, then gives you signals to measure whether the change helped.
+ContextGuard does not lower model prices by itself. It reduces avoidable context before it reaches an AI coding agent, then gives you signals to measure whether the change helped.
 | Waste path | ContextGuard guardrail |
 | --- | --- |
@@ -75,10 +77,10 @@ ContextGuard complements provider and semantic caches, and sits next to prompt c
 | Tool category | Saves by | ContextGuard relationship |
 | --- | --- | --- |
-| Provider prompt/context caching | Reusing stable prompt prefixes. | Complementary; ContextGuard helps keep the changing tail of context smaller and cleaner, `context-guard-audit` can flag likely volatile prefix layouts, and `context-guard cost` can warn when an Anthropic request is likely to create/cache-write instead of read. |
+| Provider prompt/context caching | Reusing stable prompt prefixes. | Complementary; ContextGuard helps keep the changing tail of context smaller and cleaner, `context-guard-audit` can flag likely volatile prefix layouts, and `context-guard cost` can warn when an Anthropic request is likely to cache-write instead of cache-read. |
 | Semantic response cache | Reusing answers to identical or similar requests. | Complementary; ContextGuard does not serve cached AI answers. |
 | Prompt/context compression | Shortening text that is already selected for the model. | Adjacent; ContextGuard trims and summarizes local output, but does not promise lossless semantic compression. |
-| Experimental planners and local runtimes | Default-off and explicit-command-only; covers local-proxy plans and gate records plus narrow local runtimes for caller-supplied context-diff, visual evidence-pack, learned-compression, and self-hosted metrics evidence. | The local proxy `record` command starts no listener and forwards no traffic; `serve local-proxy` binds and forwards only literal loopback IPs for one bounded request. Compressor/model execution, OCR/crop services, external forwarding, credential persistence, and hosted-savings claims stay out of scope until a separate evidence gate and future PR allow them. |
+| Experimental planners and local runtimes | Default-off and explicit-command-only; covers local-proxy plans and gate records plus narrow local runtimes for caller-supplied context-diff, visual evidence-pack, learned-compression, and self-hosted metrics evidence. | The local proxy `record` command starts no listener and forwards no traffic; `serve local-proxy` binds and forwards only literal loopback IPs for one bounded request; `--response-sandbox` can replace a safe UTF-8 upstream body with a compact local artifact rehydration envelope. Compressor/model execution, OCR/crop services, external forwarding, credential persistence, and hosted-savings claims stay out of scope until a separate evidence gate and future PR allow them. |
 | ContextGuard | Avoiding unnecessary files, logs, repeated failures, and noisy output before they enter agent context. | Local guardrails, reversible artifacts, and measurement. |
 Related patterns that informed the design:
@@ -97,14 +99,14 @@ Three deterministic levels ship under [`plugins/context-guard/brief/`](plugins/c
 ## What to measure
-When you need a savings claim, measure it on your own tasks:
+If you need a savings claim, measure it on your own tasks:
 - full-file reads versus symbol or line-range reads
 - raw logs versus digest output or artifact receipts
 - transcript hotspots reported by `context-guard-audit`, including `cache_friendliness` prompt-layout signals and `cache_layout_advice` experiment priorities
 - statusline `cache` / `reuse` as observed transcript/provider-cache signals, not savings caused by ContextGuard
 - `context-guard cost preflight` estimates for Anthropic request JSON, followed by `context-guard cost observe` using provider usage fields (`cache_creation_input_tokens`, `cache_read_input_tokens`) after the call
-- static prompt/request cache layout checks from `context-guard-cache-score`; its char/4 token estimates and warnings are advisory only until provider usage fields confirm real cache hits
+- static prompt/request cache layout checks from `context-guard-cache-score`, including optional user-supplied cache write/read multiplier amortization risk; its char/4 token estimates and warnings are advisory only until provider usage fields confirm real cache hits
 - matched successful baseline/variant runs from `context-guard-bench`
 - large tool/MCP catalogs versus `context-guard-tool-prune` top-k reports plus receipt retrieval
 - optional experimental lanes in [`research/experimental-token-reduction-radar.md`](research/experimental-token-reduction-radar.md); fixture-only starters in [`docs/experimental-benchmark-fixtures.md`](docs/experimental-benchmark-fixtures.md) use the same matched-task benchmark gates before any savings claim
@@ -114,13 +116,14 @@ When you need a savings claim, measure it on your own tasks:
 - It does not guarantee a fixed token or cost reduction.
 - It does not send work to external AI providers to save model tokens.
 - It does not mutate global Claude settings during install.
+- It does not execute command manifests as code or trust arbitrary `PATH`/current-working-directory helpers during setup or packaged smoke checks.
 - It does not replace real before/after measurement when you need a savings claim.
-- Local RAM/disk receipts can reduce what you send next, but they do **not** replace Anthropic's provider prompt cache or guarantee cache hits. Recheck Anthropic prompt-caching and pricing docs before release or billing claims: https://docs.anthropic.com/en/build-with-claude/prompt-caching and https://platform.claude.com/docs/en/about-claude/pricing.
-- Experimental helpers are mostly dry-run checker/planner surfaces, including a design-only external-forwarding opt-in gate. Explicit local runtimes exist only for caller-supplied context-diff replacement payloads, caller-supplied visual crop/OCR evidence packs, caller-supplied learned-compression prose candidates, self-hosted metrics JSONL sidecar records, local-proxy runtime-gate JSONL records, and one-shot `serve local-proxy` loopback forwarding with a private ready-file nonce plus optional shifted-cost diagnostic JSONL rows for successful forwarded requests.
+- Local RAM/disk receipts can help reduce what you send next, but they do **not** replace Anthropic's provider prompt cache or guarantee cache hits. Recheck Anthropic prompt-caching and pricing docs before release or billing claims: https://docs.anthropic.com/en/build-with-claude/prompt-caching and https://platform.claude.com/docs/en/about-claude/pricing.
+- Experimental helpers are mostly dry-run checker/planner surfaces, including a design-only external-forwarding opt-in gate. Explicit local runtimes exist only for caller-supplied context-diff replacement payloads, caller-supplied visual crop/OCR evidence packs, caller-supplied learned-compression prose candidates, self-hosted metrics JSONL sidecar records, local-proxy runtime-gate JSONL records, and one-shot `serve local-proxy` loopback forwarding with a private ready-file nonce, optional `--response-sandbox` compact artifact envelopes for safe UTF-8 responses, plus optional shifted-cost diagnostic JSONL rows for successful forwarded requests.
 - ContextGuard does not ship learned/synthetic compressor execution, embeddings, rerankers, model calls, generated replacement text, screenshot capture, image cropping, OCR execution, image parsing, external OCR/image services, self-hosted KV/latent inference optimization beyond explicit local metrics recording, or broader proxy forwarding beyond literal-loopback, one-request HTTP forwarding with credential material blocked.
 - It does not alias the old `/claude-token-optimizer:*` Claude Code slash-command namespace. Use `/context-guard:*` after installing this plugin.
-Legacy local CLI wrappers (`claude-token-*`, `claude-read-symbol`, `claude-trim-output`, and `claude-sanitize-output`) still ship in `bin/` so existing automation can migrate gradually.
+Legacy local CLI wrappers (`claude-token-*`, `claude-read-symbol`, `claude-trim-output`, and `claude-sanitize-output`) still ship as package files under `plugins/context-guard/bin/` so existing plugin-path automation can migrate gradually. npm global/`npx` bin links intentionally expose only the canonical `context-guard`/`context-guard-*` commands; call the legacy wrappers by package/plugin path if you still need them.
 ## Features
@@ -173,7 +176,7 @@ Setup is explicit, project-local, and reversible. The plugin does not configure
 ## Install with npm/npx
-The npm package exposes a canonical `context-guard` command plus backward-compatible `context-guard-*` helper commands. Package installation is passive: there is no `postinstall` setup hook and no config write until you run `context-guard setup` yourself. If setup cannot find bundled or checkout-local helpers, `PATH` fallback remains disabled by default; use `--allow-path-helper-fallback` only for trusted helper directories after `context-guard doctor` or `setup --verify` confirms the plan.
+The npm package exposes a canonical `context-guard` command plus `context-guard-*` helper commands. Package installation is passive: there is no `postinstall` setup hook and no config write until you run `context-guard setup` yourself. npm global/`npx` bin links intentionally expose only canonical `context-guard`/`context-guard-*` commands; legacy `claude-*` wrapper files remain packaged for explicit path-based migration but are not advertised as executable bin aliases. If setup cannot find bundled or checkout-local helpers, `PATH` fallback remains disabled by default; use `--allow-path-helper-fallback` only for trusted helper directories after `context-guard doctor` or `setup --verify` confirms the plan.
 ```bash
 npm install -g @ictechgy/context-guard
@@ -249,10 +252,11 @@ The optional Read guard uses a progressive path for oversized files: search firs
 ```bash
 long-command 2>&1 | ./plugins/context-guard/bin/context-guard-artifact store --command "long-command" --json
 ./plugins/context-guard/bin/context-guard-artifact search "ERROR" --json
+./plugins/context-guard/bin/context-guard-artifact receipt <artifact_id> --json
 ./plugins/context-guard/bin/context-guard-artifact get <artifact_id> --lines 1:80
 ```
-Artifact mode is for capture, sandbox search, and retrieval. It stores sanitized output under `.context-guard/artifacts` by default and can still read legacy `.claude-token-optimizer/artifacts` receipts from before the rebrand. JSON receipts include line-numbered top-error receipts, duplicate-line groups, and sanitized bounded `suggested_queries` so an agent can fetch the smallest useful exact slice instead of replaying the full log. `search` scans the local sanitized artifact sandbox by literal substring, returns capped match/context records, and includes `context-guard-artifact get ... --lines START:END` rehydration commands for omitted detail. For custom `--dir` values, raw private paths stay redacted by default; rerun with the same `--dir`, or pass `search --show-paths` when you explicitly want a directly executable local command. The search report is local-only and does not make hosted token/cost savings claims. When `--max-lines` accompanies a `--lines START:END` selector, it caps lines returned within that range; it does not expand the selector. Preserve the producer command's exit code yourself when using shell pipelines in release checks, or use `context-guard-trim-output -- ...` when exit-code preservation is the primary requirement.
+Artifact mode is for capture, sandbox search, and retrieval. It stores sanitized output under `.context-guard/artifacts` by default and can still read legacy `.claude-token-optimizer/artifacts` receipts from before the rebrand. JSON receipts include line-numbered top-error receipts, duplicate-line groups, sanitized bounded `suggested_queries`, and an `output_sandbox` envelope with a stable `contextguard-artifact:<id>` handle. Use `context-guard-artifact receipt <artifact_id> --json` to rehydrate metadata-only handles without returning content, then fetch the smallest useful exact slice instead of replaying the full log. `search` scans the local sanitized artifact sandbox by literal substring, returns capped match/context records, and includes `context-guard-artifact get ... --lines START:END` rehydration commands for omitted detail. For custom `--dir` values, raw private paths stay redacted by default; rerun with the same `--dir`, or pass `search --show-paths` when you explicitly want a directly executable local command. The search report is local-only and does not make hosted token/cost savings claims. When `--max-lines` accompanies a `--lines START:END` selector, it caps lines returned within that range; it does not expand the selector. Preserve the producer command's exit code yourself when using shell pipelines in release checks, or use `context-guard-trim-output -- ...` when exit-code preservation is the primary requirement.
 ### Build a budgeted context pack
@@ -267,7 +271,7 @@ Artifact mode is for capture, sandbox search, and retrieval. It stores sanitized
 # Or run the two explicit steps:
 ./plugins/context-guard/bin/context-guard-pack suggest \
   --root . --query "review failing tests" --diff HEAD \
-  --manifest-out suggested-pack.json --budget-bytes 12000 --json --adaptive-k
+  --manifest-out suggested-pack.json --budget-bytes 12000 --json --adaptive-k --adaptive-k-policy recall
 ./plugins/context-guard/bin/context-guard-pack build \
   --root . --manifest suggested-pack.json --budget-bytes 12000 --json
 ./plugins/context-guard/bin/context-guard-pack slice --root . --path README.md --lines 1:40 --json
@@ -280,7 +284,7 @@ A few boundaries are intentional:
 - Add `--explain` for compact deterministic local selection/build reasons in JSON or text output.
 - `--explain` may include bounded `repo_map` metadata: sampled byte/token-proxy tree entries, category-only secret-risk counts, signature-first file hints, explain-only graph ranks, and exact `slice`/symbol retrieval hints.
 - Explain metadata does not change the manifest, pack body, receipt, or byte budget. It does not use network/model/embedding calls, and token values remain local `chars_div_4` proxies rather than provider-token or savings claims.
-- Add `--adaptive-k` to `suggest` or `auto` for advisory-only shrink/expand top-k metadata derived from local score distribution, byte-budget fit, and score-mass recall/precision proxies. It never applies the recommendation automatically and does not change the manifest, pack body, receipt, or byte budget.
+- Add `--adaptive-k` to `suggest` or `auto` for advisory-only shrink/expand top-k metadata derived from local score distribution, byte-budget fit, and clamped score-mass recall/precision proxies. Use `--adaptive-k-policy balanced|recall|precision` plus optional `--adaptive-k-min-recall-proxy` / `--adaptive-k-min-precision-proxy` gates to choose a local recommendation policy; gate failures are metadata-only (`pass|failed`). The adaptive block includes capped selected/omitted evidence and structured source-verification hints, never applies the recommendation automatically, and does not change the manifest, pack body, receipt, or byte budget.
 - Add `--symbol-memory` to `auto` for repo-map-derived symbol/graph advisory metadata with exact `slice` / `read-symbol` verification hints. It is source-verification guidance only and does not change the manifest, pack body, receipt, or byte budget.
 - `--manifest-out` writes a build-compatible manifest; `--pack-out` saves the rendered pack.
 - `context-guard-pack suggest` is the lower-level additive local-only planning step. It ranks candidate files and line ranges from `--query`, `--diff`, repeated `--files`, and optional sanitized `--output` / `--test-output` files under `--root`, then writes a manifest that `build --manifest` can consume.
@@ -303,7 +307,7 @@ The packer uses deterministic standard-library heuristics only: no network, mode
 ./plugins/context-guard/bin/context-guard-tool-prune get <receipt_id> --tool read_file --json
 ```
-`context-guard-tool-prune` ranks a local tool or MCP catalog with deterministic lexical heuristics and emits a bounded top-k advisory report. Inline selected schemas respect an observed UTF-8 byte budget, and omitted or budget-skipped schemas remain recoverable from a compact local receipt plus a separate sanitized payload under `.context-guard/tool-prune`. `defer-report` uses the same receipt path to split a catalog into core inline tools plus deferred tool stubs and namespace summaries. This is advisory only: it does not mutate MCP configuration, does not configure native provider tool search, and token counts remain estimated proxies rather than measured provider savings.
+`context-guard-tool-prune` ranks a local tool or MCP catalog with deterministic lexical heuristics and emits a bounded top-k advisory report. Inline selected schemas respect an observed UTF-8 byte budget, and omitted or budget-skipped schemas remain recoverable from a compact local receipt plus a separate sanitized payload under `.context-guard/tool-prune`. `defer-report` uses the same receipt path to split a catalog into core inline tools plus deferred tool stubs and namespace summaries, and reports gross deferred-schema plus net initial-report char/4 proxy accounting so you can see what moved out of the first prompt. This is advisory only: it does not mutate MCP configuration, does not configure native provider tool search, and token counts remain estimated proxies rather than measured provider savings.
 ### Score static prompt cacheability
@@ -312,7 +316,7 @@ The packer uses deterministic standard-library heuristics only: no network, mode
 ./plugins/context-guard/bin/context-guard cache-score --input prompt.txt --provider anthropic --json
 ```
-`context-guard-cache-score` is a local static lint for prompt/request layout. It estimates total and cacheable-prefix size with a tokenizer-free char/4 proxy, warns about dynamic-looking values near the prefix, and records provider caveats for OpenAI, Anthropic, Gemini, or a generic threshold. It does not call providers, store raw prompts, estimate prices, observe cache hits, or prove token/cost savings; verify real cache behavior with provider usage telemetry.
+`context-guard-cache-score` is a local static lint for prompt/request layout. It estimates total and cacheable-prefix size with a tokenizer-free char/4 proxy, warns about dynamic-looking values near the prefix, and records provider caveats for OpenAI, Anthropic, Gemini, or a generic threshold. Optional `--expected-reuses`, `--cache-write-multiplier`, and `--cache-read-multiplier` inputs add an advisory amortization-risk section using user-supplied economics only. It does not call providers, store raw prompts, estimate prices from bundled defaults, observe cache hits, or prove token/cost savings; verify real cache behavior with provider usage telemetry.
 ### Advise on total cost, batchability, and routing
@@ -344,7 +348,7 @@ Add `--mode readable` only for sanitized prose previews. It uses a deterministic
 ./plugins/context-guard/bin/context-guard-trim-output --max-lines 120 -- npm test
 ```
-Use `--digest markdown` or `--digest json` for a compact semantic digest instead of head/tail logs. Digest mode keeps status, exit code, truncation counts, runner failure facts, a sanitized failure signature, duplicate-line groups, representative lines, redaction counts, and suggested next queries while preserving the wrapped command exit code. Add `--artifact-receipt` with digest mode when you want the exact sanitized full output stored locally as a `context-guard-artifact` receipt; re-expand with the emitted `context-guard-artifact get ...` command before relying on omitted details. Wrapped commands time out after 600 seconds by default; tune this with `--timeout-seconds`.
+Use `--digest markdown` or `--digest json` for a compact semantic digest instead of head/tail logs. Digest mode keeps status, exit code, truncation counts, runner failure facts, a sanitized failure signature, duplicate-line groups, representative lines, redaction counts, and suggested next queries while preserving the wrapped command exit code. Add `--artifact-receipt` with digest mode when you want the exact sanitized full output stored locally as a `context-guard-artifact` receipt; keep the emitted `contextguard-artifact:<id>` handle in agent context and re-expand with the emitted `context-guard-artifact receipt/get/search ...` commands before relying on omitted details. Wrapped commands time out after 600 seconds by default; tune this with `--timeout-seconds`.
 ### Sanitize search and diff output
@@ -406,20 +410,25 @@ These fields can flag likely volatile content near the prompt prefix, stable-pre
 ```bash
 ./plugins/context-guard/bin/context-guard-bench \
   --tasks bench/tasks.json --variants bench/variants.json --csv bench/results.csv \
-  --ledger-jsonl bench/cost-shift.jsonl --report-json bench/report.json
+  --ledger-jsonl bench/cost-shift.jsonl --report-json bench/report.json \
+  --dashboard-md bench/dashboard.md
 ```
+For deterministic local replay before a live provider run, add `--evidence-jsonl docs/benchmark-fixtures/token-savings-12task.evidence.example.jsonl` and, for the 12-task fixture, `--baseline-variant baseline_full_context_fixture`. Replay mode skips provider and `success_command` execution, writes the same CSV/report/dashboard surfaces, and marks synthetic/manual evidence as non-public-claim-eligible.
 Read the report through its claim boundaries before writing any savings statement:
 - Successful baseline/variant runs are compared by real tokens and `cost_usd + external_cost_usd`; byte reductions stay proxy evidence.
 - Token-savings claims require `primary_tokens_measured` on both sides of a matched task.
 - `matched_pair_evidence` links each successful task bucket to the transform, measurement availability, quality gate, and claim boundary.
+- `default_matrix` classifies trimming, artifact escrow, tool pruning, cache advice, adaptive-k, and optional compression as `default-on`, `advisory`, `experimental`, or `reject/rework` from the same matched evidence. The matrix is report-only: it does not change runtime defaults or authorize hosted token/cost savings claims.
+- `public_claim_readiness` is the authoritative release/public-claim gate. It remains false unless matched successful tasks, provider-measured primary tokens/cost, quality non-inferiority, shifted-cost accounting, explicit confidence/failure notes, and complete provider-export provenance all pass; unsupported hosted savings claims are forbidden when `claim_allowed` is false.
 - `wall_time_seconds`, `provider_cached_tokens`, and `provider_cached_tokens_measured` are diagnostic telemetry, not proof of ContextGuard-caused token or cost savings.
 - Optional `self_hosted_metrics` from provider payloads are stored as per-row JSONL ledger sidecars, kept out of CSV/report summaries, and must not be folded into hosted API token/cost savings claims.
 - If cost fields are zero or unavailable, the report can still mark token savings but will not claim shifted-cost savings.
 - CSV schemas are strict; after upgrading the benchmark helper, start a new `--csv` file or migrate the header named in the mismatch error.
-See [`docs/benchmark-report.example.json`](docs/benchmark-report.example.json) for a minimal report-shape example, [`docs/benchmark-workflow-examples.md`](docs/benchmark-workflow-examples.md) for workflow-specific synthetic examples, and [`docs/experimental-benchmark-fixtures.md`](docs/experimental-benchmark-fixtures.md) for fixture-only experimental task/variant starters.
+See [`docs/benchmark-report.example.json`](docs/benchmark-report.example.json) for a minimal report-shape example, [`docs/benchmark-workflow-examples.md`](docs/benchmark-workflow-examples.md) for workflow-specific synthetic examples, and [`docs/experimental-benchmark-fixtures.md`](docs/experimental-benchmark-fixtures.md) for fixture-only experimental task/variant starters plus synthetic evidence replay.
 ### Manage experimental opt-ins
@@ -439,7 +448,7 @@ context-guard experiments record self-hosted-metrics-ledger --ledger-jsonl .cont
 context-guard experiments plan local-proxy --json --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack
 context-guard experiments plan local-proxy-external-forwarding --external-forwarding-intent --external-forwarding-design-ack --allow-host api.example.com --allow-scheme https --credential-redaction-policy strip-sensitive-headers --provider-evidence-boundary diagnostic-only-provider-measured-required --threat-model-note "Only user-owned HTTPS endpoint; sensitive headers are stripped before any future forwarding." --json
 context-guard experiments record local-proxy-runtime-gate --ledger-jsonl .context-guard/local-proxy-gates.jsonl --bind-host 127.0.0.1 --target-host 127.0.0.1 --runtime-gate-ack --json
-context-guard experiments serve local-proxy --bind-host 127.0.0.1 --bind-port 18080 --target-host 127.0.0.1 --target-port 18081 --runtime-gate-ack --forwarding-gate-ack --once --ready-file .context-guard/local-proxy-ready.json --diagnostic-ledger-jsonl .context-guard/local-proxy-diagnostics.jsonl --json
+context-guard experiments serve local-proxy --bind-host 127.0.0.1 --bind-port 18080 --target-host 127.0.0.1 --target-port 18081 --runtime-gate-ack --forwarding-gate-ack --once --ready-file .context-guard/local-proxy-ready.json --response-sandbox --response-artifact-dir .context-guard/artifacts --diagnostic-ledger-jsonl .context-guard/local-proxy-diagnostics.jsonl --json
 context-guard experiments enable output-receipt-trim --root .
 context-guard experiments disable output-receipt-trim --root .
 ```
@@ -448,7 +457,7 @@ The local-proxy examples are intentionally split by side effect:
 - `plan local-proxy` produces advisory metadata only; it does not enable forwarding.
 - `record local-proxy-runtime-gate` appends one localhost-only gate row and still starts no listener, forwards no traffic, persists no API keys, and makes no hosted-savings claim.
-- `serve local-proxy` is the separate MVP. It requires both runtime and forwarding acknowledgements plus `--once`, a private `--ready-file` nonce handoff for the forwarding client, binds only a literal loopback IP, forwards only to a literal loopback IP target, blocks credential-bearing requests, uses byte/time limits, uses literal IPs instead of hostname DNS targets, does not persist API keys, and does not support external forwarding, CONNECT/TLS proxying, or hosted-savings claims.
+- `serve local-proxy` is the separate MVP. It requires both runtime and forwarding acknowledgements plus `--once`, a private `--ready-file` nonce handoff for the forwarding client, binds only a literal loopback IP, forwards only to a literal loopback IP target, blocks credential-bearing requests, uses byte/time limits, uses literal IPs instead of hostname DNS targets, does not persist API keys, and does not support external forwarding, CONNECT/TLS proxying, or hosted-savings claims. Optional `--response-sandbox` is a mediated response mode, not transparent forwarding: it stores only safe UTF-8 upstream response text as an artifact and returns a compact JSON envelope containing a `contextguard-artifact:<id>` reference and rehydration commands; binary, sensitive, oversized, or blocked responses are not stored as artifacts.
 - With `--diagnostic-ledger-jsonl`, `serve` appends one shifted-cost diagnostic row only after a successful forwarded request. The row stores hashes/metadata rather than raw headers, request bodies, response bodies, or hosted-savings evidence.
 - `plan local-proxy-external-forwarding` is a dry-run design gate only. It requires explicit external intent, design acknowledgement, HTTPS host allowlist, threat model notes, credential redaction policy, and provider-evidence boundary, but starts no listener, performs no DNS lookup, calls no external service, forwards no traffic, persists no credentials, and does not ship an external proxy forwarding runtime.
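
The "literal loopback IP, no hostname DNS targets" constraint in the list above can be illustrated with Python's standard `ipaddress` module. This is a sketch of the general technique, not the helper's actual validation code; the function name `is_literal_loopback` is hypothetical.

```python
import ipaddress


def is_literal_loopback(host: str) -> bool:
    """True only for a literal loopback IP; hostnames are rejected outright."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (for example "localhost"), so it is refused
        # without any DNS lookup being attempted.
        return False
    return address.is_loopback


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "::1", "localhost", "api.example.com"):
        print(candidate, is_literal_loopback(candidate))
```

Parsing the value as an address first means a name such as `localhost` is refused even though it usually resolves to loopback, which keeps name resolution out of the trust decision.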
@@ -462,7 +471,7 @@ Shipped experimental checker/planner surfaces, plus explicit local context-diff,
 | `visual-crop-ocr` | Dry-run visual evidence advice plus an explicit `emit visual-crop-ocr` runtime for caller-supplied evidence packs. | `emit` requires a full visual evidence receipt, missed-context note, and complete user-supplied crop and/or OCR evidence; ContextGuard does not capture screenshots, crop images, run OCR, parse images, call external services, write files, or support hosted token/cost savings claims. |
 | `learned-compression` | Deny-by-default policy checks plus an explicit `emit learned-compression` runtime for caller-supplied compact prose candidates with verified exact fallback content. | `emit` requires sanitized trusted prose, protected-signal denial, a verified local fallback artifact matching the input, and a smaller caller-supplied prose candidate; ContextGuard does not run compressors, embeddings, rerankers, model calls, subprocesses, external services, generated replacement text, or hosted savings claims. |
 | `self-hosted-metrics-ledger` | Dry-run preview plus an explicit `record ... --ledger-jsonl` runtime for local/model-server latency, memory, quality, energy, throughput, and local-cost metrics. | The dry-run preview does not write a ledger; the explicit record command writes only local JSONL sidecars and still does not support hosted API token/cost savings claims. |
-| `local-proxy` | Localhost-only advisory metadata, design-only `plan local-proxy-external-forwarding` review for future external forwarding, an explicit `record local-proxy-runtime-gate --ledger-jsonl` runtime for one local gate row, an explicit one-shot `serve local-proxy` loopback forwarding MVP, and optional `--diagnostic-ledger-jsonl` shifted-cost diagnostics for successful forwarded requests. | `plan` writes no ledger. `record` writes only after localhost-only metadata and `--runtime-gate-ack`; it starts no listener, forwards no traffic, and performs no DNS lookup. `serve` additionally requires `--forwarding-gate-ack --once`, a private `--ready-file` nonce handoff, literal loopback bind/target IPs, nonzero ports, bounded bytes/timeouts, and credential-free requests; it performs no external forwarding, no CONNECT/TLS proxying, no API-key persistence, and no hosted-savings claim. `--diagnostic-ledger-jsonl` writes only successful-forward diagnostics with no raw headers/bodies and no hosted-savings claim. `plan local-proxy-external-forwarding` emits threat-model/allowlist/redaction/provider-evidence design metadata only and still performs no DNS lookup, external service call, traffic forwarding, credential persistence, or hosted-savings claim. |
+| `local-proxy` | Localhost-only advisory metadata, design-only `plan local-proxy-external-forwarding` review for future external forwarding, an explicit `record local-proxy-runtime-gate --ledger-jsonl` runtime for one local gate row, an explicit one-shot `serve local-proxy` loopback forwarding MVP, optional `--response-sandbox` compact artifact envelopes, and optional `--diagnostic-ledger-jsonl` shifted-cost diagnostics for successful forwarded requests. | `plan` writes no ledger. `record` writes only after localhost-only metadata and `--runtime-gate-ack`; it starts no listener, forwards no traffic, and performs no DNS lookup. `serve` additionally requires `--forwarding-gate-ack --once`, a private `--ready-file` nonce handoff, literal loopback bind/target IPs, nonzero ports, bounded bytes/timeouts, and credential-free requests; it performs no external forwarding, no CONNECT/TLS proxying, no API-key persistence, and no hosted-savings claim. `--response-sandbox` can store safe UTF-8 response text as a sanitized local artifact receipt and return a compact envelope with redacted rehydration command templates; it does not claim hosted token/cost savings. `--diagnostic-ledger-jsonl` writes only successful-forward diagnostics with no raw headers/bodies and no hosted-savings claim. `plan local-proxy-external-forwarding` emits threat-model/allowlist/redaction/provider-evidence design metadata only and still performs no DNS lookup, external service call, traffic forwarding, credential persistence, or hosted-savings claim. |
 ## What is not yet shipped
@@ -515,7 +524,7 @@ export PATH="$PWD/plugins/context-guard/bin:$PATH"
 context-guard-setup --plan
 ```
-Do not rely on `PATH` lookup for generated hooks by default. The setup wizard records explicit bundled or checkout-local helper paths; `--allow-path-helper-fallback` is only for trusted external installs and validates the resolved helper before writing commands.
+Do not rely on `PATH` lookup for generated hooks by default. The setup wizard records explicit bundled or checkout-local helper paths; `--allow-path-helper-fallback` is only for trusted external installs and checks the resolved helper path and symlink state and runs a bounded identity probe before writing commands. The macOS app helper follows the same trust model: no launch-CWD discovery, no relative override paths, and no inherited ambient shell environment beyond the allowlisted values it needs to start.
 ## Release checks
@@ -527,7 +536,7 @@ python3 scripts/prepublish_check.py
 python3 scripts/release_smoke.py
 ```
-When a helper under `context-guard-kit/` changes, run `python3 scripts/sync_plugin_copies.py --write` before the gates. `sync_plugin_copies.py --check` verifies the maintainer-facing exact-copy contract up front. npm packages intentionally ship only the synchronized plugin-local `plugins/context-guard/bin` entrypoints and `plugins/context-guard/lib` helpers to avoid duplicate implementation payloads. `prepublish_check.py` verifies package invariants, synchronized plugin binaries, manifests, diagnostic redaction, and the regression suite. `release_smoke.py` executes representative packaged entrypoints from `plugins/context-guard/bin` in a temporary project so broken CLI wiring is caught before publish. See [docs/release-runbook.md](docs/release-runbook.md) for the full release workflow, evidence checklist, quad-review requirement, and rollback checklist.
+When a helper under `context-guard-kit/` changes, run `python3 scripts/sync_plugin_copies.py --write` before the gates. `sync_plugin_copies.py --check` verifies the maintainer-facing exact-copy contract up front. npm packages intentionally ship only the synchronized plugin-local `plugins/context-guard/bin` entrypoints and `plugins/context-guard/lib` helpers to avoid duplicate implementation payloads, and the npm bin map intentionally omits legacy `claude-*` wrapper aliases. Command manifests are loaded as literal assignments for release and runtime checks; executable Python, imports, functions, or shadow manifests are rejected. `prepublish_check.py` verifies package invariants, synchronized plugin binaries, manifests, diagnostic redaction, and the regression suite. `release_smoke.py` executes representative packaged entrypoints from `plugins/context-guard/bin` in a temporary project so broken CLI wiring is caught before publish. See [docs/release-runbook.md](docs/release-runbook.md) for the full release workflow, evidence checklist, quad-review requirement, and rollback checklist.
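
One way to load a manifest as literal assignments without executing it is to parse the source with Python's `ast` module and accept only `NAME = <literal>` statements. The sketch below shows that general technique under stated assumptions: `load_literal_manifest` and the sample variable names are hypothetical, it does not model the shadow-manifest check, and it is not the code in `lib/context_guard_commands.py` or the release scripts.

```python
import ast


def load_literal_manifest(source: str) -> dict:
    """Return the NAME = <literal> assignments found in manifest source.

    Imports, function definitions, calls, and any other non-literal
    statement raise ValueError. The source is parsed, never executed.
    """
    tree = ast.parse(source)
    manifest = {}
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            raise ValueError(f"line {node.lineno}: only assignments are allowed")
        if len(node.targets) != 1 or not isinstance(node.targets[0], ast.Name):
            raise ValueError(f"line {node.lineno}: target must be a single name")
        try:
            # literal_eval accepts only constants and containers of constants.
            value = ast.literal_eval(node.value)
        except (ValueError, TypeError) as exc:
            raise ValueError(f"line {node.lineno}: value is not a literal") from exc
        manifest[node.targets[0].id] = value
    return manifest
```

For example, `load_literal_manifest('COMMANDS = ["context-guard"]\n')` returns a dictionary with one `COMMANDS` entry, while a source containing `import os` raises `ValueError`.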
 Versioned release notes live in [CHANGELOG.md](CHANGELOG.md); the prepublish gate requires an entry matching the plugin manifest version before publishing.