@ictechgy/context-guard 0.4.3 → 0.4.4
- package/CHANGELOG.md +6 -0
- package/README.ko.md +2 -2
- package/README.md +2 -2
- package/context-guard-kit/README.md +1 -1
- package/context-guard-kit/claude_transcript_cost_audit.py +272 -0
- package/docs/cache-diagnostics-schema.md +25 -4
- package/package.json +1 -1
- package/packaging/homebrew/context-guard.rb.template +1 -1
- package/plugins/context-guard/.claude-plugin/plugin.json +1 -1
- package/plugins/context-guard/README.ko.md +2 -2
- package/plugins/context-guard/README.md +2 -2
- package/plugins/context-guard/bin/context-guard-audit +272 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,12 @@

 All notable changes for the ContextGuard plugin are documented here.

+## [0.4.4] - 2026-06-08
+
+- Added top-level `cache_layout_advice` to transcript audit JSON and feasibility output so cache-prefix instability can be prioritized without mixing advice into evidence-only diagnostics.
+- Documented the `cache_layout_advice` consumer contract and conservative cause boundaries for volatile-prefix findings.
+- Refined cache-prefix recommendation wording after quad-review so advice does not overclaim cache reads or session-splitting evidence.
+
 ## [0.4.3] - 2026-06-08

 - Fixed the Homebrew formula template so packaged helper paths are handled as Pathname objects during install.
package/README.ko.md
CHANGED
@@ -99,7 +99,7 @@ brief mode asks the coding agent to cut filler, but

 - full-file reads versus symbol or line-range reads
 - raw logs versus digest output or locally stored digest records
-- transcript usage concentration reported by `context-guard-audit`
+- transcript usage hotspots reported by `context-guard-audit`, `cache_friendliness` prompt-layout signals, and `cache_layout_advice` experiment priorities
 - statusline `cache` / `reuse` values: these are observed transcript/provider-cache signals, not savings produced directly by ContextGuard.
 - Use `context-guard cost preflight` to see the estimated cost of an Anthropic request JSON, then after the call use `context-guard cost observe` to cross-check the provider usage fields (`cache_creation_input_tokens`, `cache_read_input_tokens`).
 - results from comparing paired successful baseline/variant runs with `context-guard-bench`
@@ -300,7 +300,7 @@ If you need a semantic summary instead of head/tail logs, `--digest markdown` or
 ./plugins/context-guard/bin/context-guard-audit ~/.claude/projects --top 20 --recommend
 ```

-The audit command skips oversized transcript files and JSONL records by default (`--max-file-bytes`, `--max-line-bytes`) and reports the skipped counts alongside. This is a defense so that a corrupt trace cannot monopolize memory or hide scan gaps. JSON output also includes `cache_friendliness` and [`cache_diagnostics`](docs/cache-diagnostics-schema.md). Built from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes, these are heuristic prompt-layout/cache-read
+The audit command skips oversized transcript files and JSONL records by default (`--max-file-bytes`, `--max-line-bytes`) and reports the skipped counts alongside. This is a defense so that a corrupt trace cannot monopolize memory or hide scan gaps. JSON output also includes `cache_friendliness` and [`cache_diagnostics`](docs/cache-diagnostics-schema.md). These are heuristic prompt-layout/cache-read diagnostics built from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes. The sibling `cache_layout_advice` turns these signals into ranked **checks/experiments** such as splitting long sessions or stabilizing the prefix, while keeping observed issues separate from hypothesized/corroborated causes. It does not print raw prompt text and does not prove provider cache hits, and may be `missing`, `partial`, `hypothesis`, or `unavailable` when the transcript schema does not expose enough evidence.

 ### Check context and cache status in the statusline

package/README.md
CHANGED
@@ -99,7 +99,7 @@ When you need a savings claim, measure it on your own tasks:

 - full-file reads versus symbol or line-range reads
 - raw logs versus digest output or artifact receipts
-- transcript hotspots reported by `context-guard-audit`, including `cache_friendliness` prompt-layout signals
+- transcript hotspots reported by `context-guard-audit`, including `cache_friendliness` prompt-layout signals and `cache_layout_advice` experiment priorities
 - statusline `cache` / `reuse` as observed transcript/provider-cache signals, not savings caused by ContextGuard
 - `context-guard cost preflight` estimates for Anthropic request JSON, followed by `context-guard cost observe` using provider usage fields (`cache_creation_input_tokens`, `cache_read_input_tokens`) after the call
 - matched successful baseline/variant runs from `context-guard-bench`
@@ -339,7 +339,7 @@ JSON
 ./plugins/context-guard/bin/context-guard-audit ~/.claude/projects --top 20 --recommend
 ```

-The audit command skips oversized transcript files and JSONL records by default (`--max-file-bytes`, `--max-line-bytes`) and reports skipped counts, so a corrupt trace cannot dominate memory or hide scan gaps. JSON output also includes `cache_friendliness` and [`cache_diagnostics`](docs/cache-diagnostics-schema.md): heuristic prompt-layout/cache-read diagnostics built from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes.
+The audit command skips oversized transcript files and JSONL records by default (`--max-file-bytes`, `--max-line-bytes`) and reports skipped counts, so a corrupt trace cannot dominate memory or hide scan gaps. JSON output also includes `cache_friendliness` and [`cache_diagnostics`](docs/cache-diagnostics-schema.md): heuristic prompt-layout/cache-read diagnostics built from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes. The sibling `cache_layout_advice` field turns those signals into ranked **checks/experiments** such as splitting long sessions or stabilizing early prompt prefixes, while keeping observed issues separate from hypothesized or corroborated causes. These fields can flag likely volatile content near the prompt prefix, stable-prefix candidates, cache-miss hypotheses, and TTL/headroom evidence gaps, but they do not print raw prompt text, do not prove provider cache hits, and may be `missing`, `partial`, `hypothesis`, or `unavailable` when transcript schemas do not expose enough evidence.

 ### Watch context and cache health in the statusline

package/context-guard-kit/README.md
CHANGED
@@ -65,7 +65,7 @@ python3 context-guard-kit/sanitize_output.py -- git diff
 `../research/experimental-token-reduction-radar.md` is a gate that documents optional future experiments such as learned compression, multimodal crop/OCR/visual-token pruning, and self-hosted KV/latent inference optimization. `../docs/experimental-benchmark-fixtures.md` contains fixture-only task/variant starter examples. The radar and fixtures are not runtime helpers that ship today, and they do not guarantee hosted API token/cost savings. Hosted API token/cost savings claims are allowed only with provider-measured matched-task evidence. The radar's later-roadmap gate keeps neural/semantic compression, trust-tiered injection-aware compression, context-diff compaction, and local proxy constraints experimental/non-shipped until a separate future PR passes the gate.

 The default output of `claude_transcript_cost_audit.py --recommend` anonymizes transcript paths as `basename#hash` and commands as `command#hash` so it is safe to share. Add `--show-paths` or `--show-commands` only when you really need the local raw identifiers.
-To guard against oversized/corrupt transcripts, the per-file `--max-file-bytes` and per-JSONL-record `--max-line-bytes` limits also apply by default, and skipped items are reported as skip counts and warnings. `cache_friendliness` in the JSON summary/feasibility output is a heuristic that compares stable-prefix signals with volatile prefix/tail signals using bounded, sanitized segment hashes. It does not print raw prompt text; interpret provider cache token fields as separate diagnostic telemetry, not as evidence of token savings produced by ContextGuard.
+To guard against oversized/corrupt transcripts, the per-file `--max-file-bytes` and per-JSONL-record `--max-line-bytes` limits also apply by default, and skipped items are reported as skip counts and warnings. `cache_friendliness` in the JSON summary/feasibility output is a heuristic that compares stable-prefix signals with volatile prefix/tail signals using bounded, sanitized segment hashes. `cache_layout_advice` connects those signals to ranked checks/experiments such as splitting long sessions, stabilizing the prefix, and running diet checks, but keeps observed issues separate from hypothesized/corroborated causes. It does not print raw prompt text; interpret provider cache token fields as separate diagnostic telemetry, not as evidence of token savings produced by ContextGuard.

 `context_guard_diet.py scan` is a read-only scanner that always reads locally only. The default output anonymizes the project root and reports mainly relative paths. `--top` applies to both the report's context-like file list and its context-exclusion recommendation list. Use `--show-paths` only for local/private debugging.

package/context-guard-kit/claude_transcript_cost_audit.py
CHANGED
@@ -49,6 +49,7 @@ TIMESTAMP_KEYS = ("timestamp", "created_at", "createdAt", "time", "ts")
 FEASIBILITY_SCHEMA_VERSION = "contextguard.metric-feasibility.v1.2"
 FEASIBILITY_PRODUCER = "context-guard-audit"
 CACHE_DIAGNOSTICS_SCHEMA_VERSION = "contextguard.cache-diagnostics.v1"
+CACHE_LAYOUT_ADVICE_SCHEMA_VERSION = "contextguard.cache-layout-advice.v1"
 MAX_ERROR_EXAMPLES = 20
 JSON_PARSE_RECURSION_LIMIT = 10_000
 READ_CHUNK_BYTES = 64 * 1024
@@ -184,6 +185,7 @@ class UsageSummary:
     prompt_cache_audit: PromptCacheAudit = field(default_factory=PromptCacheAudit)
     cache_friendliness_cache: dict[str, Any] | None = field(default=None, init=False, repr=False)
     cache_diagnostics_cache: dict[str, Any] | None = field(default=None, init=False, repr=False)
+    cache_layout_advice_cache: dict[str, Any] | None = field(default=None, init=False, repr=False)

     @property
     def total_tokens(self) -> int:
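The `cache_layout_advice_cache` field above follows the same memoization pattern as its two siblings: a dataclass field excluded from `__init__` and `repr`, filled on first use by the builder. A minimal sketch of that pattern, assuming a hypothetical `SummarySketch` class and `build_advice` function that stand in for the real `UsageSummary` and `build_cache_layout_advice`:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SummarySketch:
    """Hypothetical stand-in for UsageSummary; only the fields needed here."""

    tokens: dict[str, int] = field(default_factory=dict)
    # init=False keeps the cache out of the constructor; repr=False keeps it out of logs.
    advice_cache: dict[str, Any] | None = field(default=None, init=False, repr=False)


def build_advice(summary: SummarySketch) -> dict[str, Any]:
    # Return the memoized object when a previous call already built it.
    if summary.advice_cache is not None:
        return summary.advice_cache
    advice = {"cache_creation_tokens": summary.tokens.get("cache_creation", 0)}
    summary.advice_cache = advice
    return advice


summary = SummarySketch(tokens={"cache_creation": 1200})
first = build_advice(summary)
second = build_advice(summary)
print(first is second)
```

Because the builder returns the cached dict itself, the JSON, feasibility, and text outputs can all call it without repeating the work, but callers should treat the returned object as read-only.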
@@ -1398,6 +1400,222 @@ def cache_diagnostics_for_summary(summary: UsageSummary) -> dict[str, Any]:
     return build_cache_diagnostics(summary)


+def _dominant_transcript(summary: UsageSummary) -> dict[str, Any] | None:
+    if summary.total_tokens <= 0 or not summary.by_file:
+        return None
+    _label, tokens = summary.by_file.most_common(1)[0]
+    share = tokens / summary.total_tokens if summary.total_tokens else 0.0
+    return {
+        "tokens": tokens,
+        "share": round(share, 4),
+        "dominates": share >= 0.20 and tokens >= 1_000,
+    }
+
+
+def _first_dynamic_breaker(cache_diagnostics: dict[str, Any]) -> dict[str, Any] | None:
+    breakers = cache_diagnostics.get("dynamic_prefix_breakers") or []
+    if not breakers:
+        return None
+    first = breakers[0]
+    return first if isinstance(first, dict) else None
+
+
+def build_cache_layout_advice(summary: UsageSummary) -> dict[str, Any]:
+    if summary.cache_layout_advice_cache is not None:
+        return summary.cache_layout_advice_cache
+
+    cache_friendliness = cache_friendliness_for_summary(summary)
+    cache_diagnostics = cache_diagnostics_for_summary(summary)
+    signals = cache_friendliness.get("signals") if isinstance(cache_friendliness.get("signals"), dict) else {}
+    dynamic_breaker = _first_dynamic_breaker(cache_diagnostics)
+    dominant = _dominant_transcript(summary)
+    cache_creation = summary.tokens.get("cache_creation", 0)
+    cache_read = summary.tokens.get("cache_read", 0)
+    cache_fields = cache_diagnostics.get("observations", {}).get("cache_fields", {}) if isinstance(cache_diagnostics.get("observations"), dict) else {}
+    cache_status = cache_fields.get("status") if isinstance(cache_fields, dict) else None
+    stable_prefix_share = signals.get("stable_prefix_share")
+    volatile_prefix_share = signals.get("volatile_prefix_share")
+    volatile_tail_share = signals.get("volatile_tail_share")
+    max_prefix_position = dynamic_breaker.get("position") if dynamic_breaker else None
+    max_prefix_position_volatile_share = dynamic_breaker.get("volatile_share") if dynamic_breaker else signals.get("max_prefix_position_volatile_share")
+
+    status = "missing"
+    confidence = "unavailable"
+    observed_issue = "unknown"
+    priority = "P2"
+    hypothesized_causes: list[dict[str, Any]] = []
+    corroborated_causes: list[dict[str, Any]] = []
+    next_checks: list[dict[str, Any]] = []
+    recommended_experiments: list[dict[str, Any]] = []
+
+    has_cache_any = bool(
+        summary.token_field_presence.get("cache_read", 0)
+        or summary.token_field_presence.get("cache_creation", 0)
+    )
+    has_prompt_samples = bool(summary.prompt_cache_audit.samples)
+    if has_cache_any or has_prompt_samples:
+        status = "partial" if (
+            not has_prompt_samples
+            or cache_friendliness.get("status") == "partial"
+            or cache_diagnostics.get("status") == "partial"
+            or summary.skipped_files
+            or summary.skipped_records
+            or summary.parse_errors
+        ) else "available"
+        confidence = "partial" if status == "partial" else "hypothesis"
+
+    volatile_prefix_breaker = bool(
+        dynamic_breaker
+        and cache_creation > 0
+        and (max_prefix_position in {0, 1} or (max_prefix_position_volatile_share or 0) >= PROMPT_PREFIX_VOLATILE_THRESHOLD)
+    )
+    long_session_dominates = bool(dominant and dominant.get("dominates"))
+
+    if volatile_prefix_breaker:
+        observed_issue = "volatile_prefix_breaker"
+        priority = "P0" if cache_creation >= 50_000 and max_prefix_position in {0, 1} else "P1"
+        hypothesized_causes.append({
+            "id": "prefix-position-churn",
+            "confidence": confidence,
+            "evidence": EVIDENCE_INFERRED,
+            "reason": (
+                "A highly volatile redacted prompt segment appears in the early prefix window; "
+                "this identifies a layout issue, not a confirmed source."
+            ),
+            "next_check": "Check whether startup context, generated evidence, or tool/MCP catalog changes are moving before stable policy.",
+        })
+        if cache_diagnostics.get("stable_prefix_candidates"):
+            hypothesized_causes.append({
+                "id": "evidence-before-policy",
+                "confidence": confidence,
+                "evidence": EVIDENCE_INFERRED,
+                "reason": (
+                    "Stable reusable segments appear elsewhere while the early prefix churns; "
+                    "check whether logs, diffs, timestamps, or file evidence precede stable instructions."
+                ),
+                "next_check": "Keep stable policy/instructions first and move generated run evidence later.",
+            })
+        next_checks.append({
+            "id": "inspect-startup-context-size",
+            "confidence": "hypothesis",
+            "command_templates": [
+                "context-guard-diet scan <repo>",
+                "context-guard-diet structural-waste <repo>",
+            ],
+            "evidence_required_for_corroboration": (
+                "Large or duplicate CLAUDE.md/AGENTS.md/GEMINI.md findings from diet output."
+            ),
+        })
+    elif long_session_dominates:
+        observed_issue = "long_session_accumulation"
+        priority = "P1"
+    elif cache_creation >= 10_000 and cache_read > 0 and summary.cache_amortization < 0.5:
+        observed_issue = "low_cache_reuse"
+        priority = "P1"
+    elif cache_status == "missing" or not has_cache_any:
+        observed_issue = "missing_cache_fields"
+        priority = "P2"
+
+    if long_session_dominates:
+        recommended_experiments.append({
+            "id": "split-long-sessions",
+            "order": len(recommended_experiments) + 1,
+            "priority": "P1",
+            "effort": "low",
+            "action": "Use /clear between unrelated tasks and /compact focus on changed files, failing tests, and remaining TODO during long work.",
+            "expected_signal": "Cache creation per comparable task decreases and one transcript no longer dominates observed tokens.",
+            "verification": "Re-run context-guard-audit on a comparable window and compare cache_creation, cache_amortization, and top transcript share.",
+            "evidence": dominant or {},
+        })
+    if volatile_prefix_breaker:
+        recommended_experiments.append({
+            "id": "stabilize-cache-prefix",
+            "order": len(recommended_experiments) + 1,
+            "priority": priority,
+            "effort": "medium",
+            "action": "Keep stable reusable instructions/policy before volatile logs, diffs, timestamps, and generated file evidence.",
+            "expected_signal": "Stable prefix share rises and volatile prefix share falls on matched audit windows.",
+            "verification": "Re-run context-guard-audit --json --recommend and compare cache_layout_advice plus cache_friendliness signals.",
+            "evidence": {
+                "dynamic_prefix_breaker_position": max_prefix_position,
+                "dynamic_prefix_breaker_volatile_share": max_prefix_position_volatile_share,
+            },
+        })
+        recommended_experiments.append({
+            "id": "run-context-diet-checks",
+            "order": len(recommended_experiments) + 1,
+            "priority": "P1",
+            "effort": "low",
+            "action": "Run the generated diet command templates and treat any large/duplicate context-file findings as corroborating evidence before editing instructions.",
+            "expected_signal": "Diet output identifies or rules out oversized/duplicated startup context as a contributor.",
+            "verification": "Record diet JSON separately; do not convert prefix-position evidence alone into a confirmed startup-context cause.",
+            "command_templates": [
+                "context-guard-diet scan <repo> --json > diet.json",
+                "context-guard-diet structural-waste <repo> --json > structural-waste.json",
+            ],
+        })
+    if cache_creation >= 50_000 and summary.cache_amortization_defined and 1.0 <= summary.cache_amortization < 5.0:
+        recommended_experiments.append({
+            "id": "defer-longer-ttl-until-prefix-stable" if volatile_prefix_breaker else "evaluate-longer-ttl-after-stability-check",
+            "order": len(recommended_experiments) + 1,
+            "priority": "P2",
+            "effort": "medium",
+            "action": "Treat longer TTL as secondary; first corroborate stable prefix reuse and current provider TTL/pricing behavior.",
+            "expected_signal": "TTL evaluation happens only after prefix volatility is reduced or ruled out.",
+            "verification": "Use timestamped cache telemetry and provider-measured billing/cost evidence; historical token totals alone are insufficient.",
+        })
+    if not recommended_experiments and status == "partial":
+        next_checks.append({
+            "id": "rerun-narrower-audit",
+            "confidence": "partial",
+            "command_templates": ["context-guard-audit <transcript-or-project-dir> --json --recommend"],
+            "evidence_required_for_corroboration": "Enough uncapped prompt/cache records to classify prefix layout.",
+        })
+    if not recommended_experiments and observed_issue == "missing_cache_fields":
+        next_checks.append({
+            "id": "collect-cache-telemetry",
+            "confidence": "unavailable",
+            "command_templates": ["context-guard-audit ~/.claude/projects --json --recommend"],
+            "evidence_required_for_corroboration": "Transcript records with cache_read/cache_creation fields.",
+        })
+
+    advice = {
+        "schema_version": CACHE_LAYOUT_ADVICE_SCHEMA_VERSION,
+        "status": status,
+        "confidence": confidence,
+        "heuristic": True,
+        "observed_issue": observed_issue,
+        "priority": priority,
+        "observed_summary": {
+            "cache_creation_tokens": cache_creation,
+            "cache_read_tokens": cache_read,
+            "cache_amortization": round(summary.cache_amortization, 4) if summary.cache_amortization_defined else None,
+            "stable_prefix_share": stable_prefix_share,
+            "volatile_prefix_share": volatile_prefix_share,
+            "volatile_tail_share": volatile_tail_share,
+            "max_prefix_position": max_prefix_position,
+            "max_prefix_position_volatile_share": max_prefix_position_volatile_share,
+            "dominant_transcript_share": dominant.get("share") if dominant else None,
+        },
+        "hypothesized_causes": hypothesized_causes,
+        "corroborated_causes": corroborated_causes,
+        "next_checks": next_checks,
+        "recommended_experiments": recommended_experiments,
+        "caveats": [
+            "Cache layout advice is a local transcript heuristic, not billing authority or provider-cache proof.",
+            "Observed issues come from cache fields and redacted segment statistics; causes remain hypotheses until corroborated by diet/structural evidence.",
+            "Generated command templates use placeholders and must not be treated as observed user commands or paths.",
+            "Use matched before/after audits before making token or cost savings claims.",
+        ],
+    }
+    summary.cache_layout_advice_cache = advice
+    return advice
+
+
+def cache_layout_advice_for_summary(summary: UsageSummary) -> dict[str, Any]:
+    return build_cache_layout_advice(summary)
+
+
 def build_metric_caveats(summary: UsageSummary) -> list[str]:
     caveats = [
         "Values are observed from local Claude Code transcript JSON/JSONL fields and are not official billing records.",
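The issue and priority rules in `build_cache_layout_advice` can be isolated into a small pure function for reasoning about the thresholds. The sketch below is a simplified restatement, not the shipped code: `classify` and `dominant_share` are hypothetical names, the `missing_cache_fields` branch is reduced to a single boolean, and `ASSUMED_VOLATILE_THRESHOLD` is a placeholder because the real `PROMPT_PREFIX_VOLATILE_THRESHOLD` value is defined outside this diff.

```python
from __future__ import annotations

from collections import Counter

# Assumption: placeholder value; the real PROMPT_PREFIX_VOLATILE_THRESHOLD is not shown in the diff.
ASSUMED_VOLATILE_THRESHOLD = 0.5


def dominant_share(by_file: Counter, total_tokens: int) -> tuple[float, bool]:
    """Share of the largest transcript and whether it dominates (>= 20% and >= 1,000 tokens)."""
    if total_tokens <= 0 or not by_file:
        return 0.0, False
    _label, tokens = by_file.most_common(1)[0]
    share = tokens / total_tokens
    return round(share, 4), share >= 0.20 and tokens >= 1_000


def classify(
    cache_creation: int,
    cache_read: int,
    amortization: float,
    breaker: dict | None,
    dominates: bool,
    has_cache_fields: bool,
) -> tuple[str, str]:
    """Return (observed_issue, priority) using the same ordering as the diff's if/elif chain."""
    position = breaker.get("position") if breaker else None
    volatile_share = (breaker.get("volatile_share") if breaker else None) or 0
    volatile_prefix = bool(
        breaker
        and cache_creation > 0
        and (position in {0, 1} or volatile_share >= ASSUMED_VOLATILE_THRESHOLD)
    )
    if volatile_prefix:
        # P0 only when cache creation is large AND the breaker sits at position 0 or 1.
        priority = "P0" if cache_creation >= 50_000 and position in {0, 1} else "P1"
        return "volatile_prefix_breaker", priority
    if dominates:
        return "long_session_accumulation", "P1"
    if cache_creation >= 10_000 and cache_read > 0 and amortization < 0.5:
        return "low_cache_reuse", "P1"
    if not has_cache_fields:
        return "missing_cache_fields", "P2"
    return "unknown", "P2"


early = classify(60_000, 5_000, 0.08, {"position": 0, "volatile_share": 0.9}, False, True)
late = classify(60_000, 5_000, 0.08, {"position": 3, "volatile_share": 0.9}, False, True)
print(early, late)
```

Note the ordering: a volatile prefix breaker takes precedence over a dominant transcript when choosing `observed_issue`, while the experiment list in the diff still places `split-long-sessions` first when both conditions hold.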
@@ -1433,6 +1651,7 @@ def feasibility_json(
     stable_total_tokens = sum(stable_tokens.values())
     cache_friendliness = cache_friendliness_for_summary(summary)
     cache_diagnostics = cache_diagnostics_for_summary(summary)
+    cache_layout_advice = cache_layout_advice_for_summary(summary)
     return {
         "schema_version": FEASIBILITY_SCHEMA_VERSION,
         "producer": FEASIBILITY_PRODUCER,
@@ -1452,6 +1671,7 @@ def feasibility_json(
                 "headroom_availability",
                 "cache_friendliness",
                 "cache_diagnostics",
+                "cache_layout_advice",
                 "totals",
             ],
             "diagnostic_fields": ["summary"],
@@ -1480,6 +1700,7 @@ def feasibility_json(
         "headroom_availability": availability["headroom"],
         "cache_friendliness": cache_friendliness,
         "cache_diagnostics": cache_diagnostics,
+        "cache_layout_advice": cache_layout_advice,
         "totals": {
             "total_tokens": stable_total_tokens,
             "tokens": stable_tokens,
@@ -1531,6 +1752,36 @@ def build_recommendations(summary: UsageSummary, top: int) -> list[dict[str, Any
     input_ratio = input_tokens / total
     cache_friendliness = cache_friendliness_for_summary(summary)
     cache_diagnostics = cache_diagnostics_for_summary(summary)
+    cache_layout_advice = cache_layout_advice_for_summary(summary)
+    if cache_layout_advice.get("observed_issue") == "volatile_prefix_breaker":
+        evidence = {
+            "observed_issue": cache_layout_advice.get("observed_issue"),
+            "priority": cache_layout_advice.get("priority"),
+            "confidence": cache_layout_advice.get("confidence"),
+            "cache_creation_tokens": cache_creation,
+            "cache_read_tokens": cache_read,
+        }
+        observed_summary = cache_layout_advice.get("observed_summary")
+        if isinstance(observed_summary, dict):
+            for key in ("max_prefix_position", "max_prefix_position_volatile_share", "stable_prefix_share", "volatile_prefix_share"):
+                evidence[key] = observed_summary.get(key)
+        rec = recommendation(
+            "prioritize-cache-prefix-stabilization",
+            "Prioritize cache-prefix stabilization before TTL or output trimming",
+            (
+                "Cache creation remains material and redacted segment statistics show a volatile early prefix; "
+                "this is an experiment-prioritization signal, not a confirmed root cause."
+            ),
+            (
+                "If one transcript dominates, split unrelated work into shorter sessions; then check startup/context "
+                "size and keep stable policy before volatile logs, diffs, timestamps, and generated evidence."
+            ),
+            str(cache_layout_advice.get("priority") or "P1"),
+            evidence,
+        )
+        rec["heuristic"] = True
+        rec["confidence"] = cache_layout_advice.get("confidence")
+        recs.append(rec)
     for finding in cache_friendliness.get("findings", []):
         if isinstance(finding, dict) and finding.get("id") == "volatile-content-near-prefix":
             evidence = dict(finding.get("evidence") or {})
@@ -1754,6 +2005,7 @@ def summary_json(
         "top_tools": counter_json(summary.by_tool, top),
         "cache_friendliness": cache_friendliness_for_summary(summary),
         "cache_diagnostics": cache_diagnostics_for_summary(summary),
+        "cache_layout_advice": cache_layout_advice_for_summary(summary),
     }
     if include_recommendations:
         data["recommendations"] = build_recommendations(summary, top)
@@ -1887,6 +2139,26 @@ def main() -> int:
     headroom = cache_diagnostics.get("headroom_diagnostics") or {}
     print(f" headroom_status {headroom.get('status')} ({headroom.get('evidence')})")

+    cache_layout_advice = cache_layout_advice_for_summary(summary)
+    if cache_layout_advice.get("status") != "missing" or cache_layout_advice.get("observed_issue") != "unknown":
+        print("\nCache layout advice")
+        print(f" status {cache_layout_advice.get('status')}")
+        print(f" confidence {cache_layout_advice.get('confidence')}")
+        print(f" observed_issue {cache_layout_advice.get('observed_issue')}")
+        print(f" priority {cache_layout_advice.get('priority')}")
+        experiments = cache_layout_advice.get("recommended_experiments") or []
+        if experiments:
+            first = experiments[0]
+            print(f" first_experiment {first.get('id')} ({first.get('priority')})")
+            print(f" experiment_action {first.get('action')}")
+        checks = cache_layout_advice.get("next_checks") or []
+        if checks:
+            first = checks[0]
+            print(f" next_check {first.get('id')}")
+            templates = first.get("command_templates") or []
+            if templates:
+                print(f" command_template {templates[0]}")
+
     model_totals = Counter({model: sum(tokens.values()) for model, tokens in summary.by_model.items()})
     print_counter("By model", model_totals, args.top)

package/docs/cache-diagnostics-schema.md
CHANGED
@@ -2,7 +2,9 @@

 `cache_diagnostics` is the nested diagnostic object emitted by `context-guard-audit --json` and by top-level `cache_diagnostics` in `context-guard-audit --feasibility-json`. The committed schema file, [`cache-diagnostics.schema.json`](cache-diagnostics.schema.json), describes that nested object only; it is not the full CLI response envelope.

-The object is for GUI and external consumers that need stable cache-read, prefix-layout, TTL-evidence, and headroom-boundary fields without scraping prose. It is a local transcript diagnostic contract, not a billing source, not provider telemetry verification, and not a token or cost savings promise.
+The object is for GUI and external consumers that need stable cache-read, prefix-layout, TTL-evidence, and headroom-boundary fields without scraping prose. It is a local transcript diagnostic contract, not a billing source, not provider telemetry verification, and not a token or cost savings promise. It does not guarantee savings, does not prove provider cache hits, and does not infer live headroom.
+
+`context-guard-audit` also emits a top-level sibling `cache_layout_advice` object. That sibling is intentionally separate from `cache_diagnostics`: diagnostics stay evidence-oriented, while advice ranks checks and experiments such as session splitting, prefix stabilization, and context-diet scans. Advice distinguishes an `observed_issue` from `hypothesized_causes`, `corroborated_causes`, and `next_checks`; without diet or structural evidence, volatile prefix positions should be presented as hypotheses to check, not confirmed root causes.

 ## Files

|
@@ -13,11 +15,30 @@ The object is for GUI and external consumers that need stable cache-read, prefix
|
|
|
13
15
|
|
|
14
16
|
### `context-guard-audit --json`
|
|
15
17
|
|
|
16
|
-
The legacy audit JSON includes top-level `cache_diagnostics` beside `cache_metrics` and `
|
|
18
|
+
The legacy audit JSON includes top-level `cache_diagnostics` beside `cache_metrics`, `cache_friendliness`, and the separate `cache_layout_advice` advice object.
|
|
17
19
|
|
|
18
20
|
### `context-guard-audit --feasibility-json`
|
|
19
21
|
|
|
20
|
-
The feasibility JSON includes top-level `cache_diagnostics` and lists
|
|
22
|
+
The feasibility JSON includes top-level `cache_diagnostics` and `cache_layout_advice`, and lists both in `consumer_contract.stable_top_level_fields`. GUI consumers should prefer the top-level feasibility field when available and use `summary.cache_diagnostics` only for legacy compatibility.
|
|
23
|
+
|
|
24
|
+
## Sibling `cache_layout_advice` fields
|
|
25
|
+
|
|
26
|
+
`cache_layout_advice` is a stable top-level sibling of `cache_diagnostics`, but it is deliberately not part of `cache-diagnostics.schema.json`. It is an advice contract over local transcript heuristics.
|
|
27
|
+
|
|
28
|
+
| Field | Meaning | Consumer note |
|
|
29
|
+
| --- | --- | --- |
|
|
30
|
+
| `schema_version` | Stable version string, currently `contextguard.cache-layout-advice.v1`. | Treat unknown versions conservatively. |
|
|
31
|
+
| `status` | Advice availability: `available`, `partial`, or `missing`. | `partial` means prompt/cache evidence was capped, skipped, or incomplete. |
|
|
32
|
+
| `confidence` | Overall advice confidence: `hypothesis`, `partial`, or `unavailable`. | Never present as provider truth or billing proof. |
|
|
33
|
+
| `heuristic` | Always `true` for v1. | UI should label advice as heuristic. |
|
|
34
|
+
| `observed_issue` | Primary observed layout issue: `volatile_prefix_breaker`, `long_session_accumulation`, `low_cache_reuse`, `missing_cache_fields`, or `unknown`. | This is an observed/audited symptom, not a confirmed cause. |
|
|
35
|
+
| `priority` | Suggested priority bucket (`P0`, `P1`, or `P2`). | Use for ordering checks, not for savings claims. |
|
|
36
|
+
| `observed_summary` | Sanitized numeric summary such as cache creation/read tokens, prefix shares, breaker position, and dominant transcript share. | Contains aggregate counts/shares only, not raw prompt text. |
|
|
37
|
+
| `hypothesized_causes` | Candidate causes to investigate, each with `id`, `confidence`, `evidence`, `reason`, and `next_check`. | Keep separate from confirmed causes. |
|
|
38
|
+
| `corroborated_causes` | Causes supported by independent evidence beyond prefix-position heuristics. | Empty means no cause has been confirmed. |
|
|
39
|
+
| `next_checks` | Evidence-gathering checks with `id`, `confidence`, `command_templates`, and `evidence_required_for_corroboration`. | Templates use placeholders such as `<repo>` and must not embed observed local paths. |
|
|
40
|
+
| `recommended_experiments` | Ordered experiments with `id`, `order`, `priority`, `effort`, `action`, `expected_signal`, and `verification`. | Run in `order`; compare matched audit windows before claiming improvement. |
|
|
41
|
+
| `caveats` | User-facing boundaries for claims and evidence limits. | Preserve these in GUI summaries and reports. |
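
The consumer notes in the table above could be applied roughly as follows. This is a sketch under assumptions, not the package's own consumer: `summarize_advice` and the shape of its return value are hypothetical, and it assumes each experiment's `order` is an integer. Only the field names and the version string come from the table.

```python
from typing import Any

KNOWN_ADVICE_VERSION = "contextguard.cache-layout-advice.v1"


def summarize_advice(advice: dict[str, Any]) -> dict[str, Any]:
    """Reduce a cache_layout_advice object to display-safe fields (hypothetical helper)."""
    if advice.get("schema_version") != KNOWN_ADVICE_VERSION:
        # Unknown versions are treated conservatively: nothing derived is shown.
        return {"label": "heuristic", "usable": False, "experiments": [], "caveats": []}
    experiments = [e for e in advice.get("recommended_experiments") or [] if isinstance(e, dict)]
    experiments.sort(key=lambda e: e.get("order", 0))  # run in `order`
    return {
        "label": "heuristic",  # v1 advice is always heuristic
        "usable": advice.get("status") in {"available", "partial"},
        "observed_issue": advice.get("observed_issue", "unknown"),
        "priority": advice.get("priority"),  # ordering hint only, not a savings claim
        "confirmed": bool(advice.get("corroborated_causes")),  # empty means unconfirmed
        "experiments": [e.get("id") for e in experiments],
        "caveats": list(advice.get("caveats") or []),  # preserved for reports
    }
```

Keeping `confirmed` separate from `observed_issue` mirrors the split between `hypothesized_causes` and `corroborated_causes`.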
|
|
21
42
|
|
|
22
43
|
## Top-level fields
|
|
23
44
|
|
|
@@ -72,4 +93,4 @@ Historical transcript scans do not carry live context-window state. `headroom_di
|
|
|
72
93
|
|
|
73
94
|
## Claim boundaries
|
|
74
95
|
|
|
75
|
-
`cache_diagnostics` can help users reorganize prompts, find volatile prefix segments, and identify missing evidence.
|
|
96
|
+
`cache_diagnostics` and the sibling `cache_layout_advice` can help users reorganize prompts, find volatile prefix segments, and identify missing evidence or next checks. They do not guarantee savings, do not verify provider cache state, are not billing authority, do not prove provider cache hits, and do not infer live headroom from historical token totals.
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@ictechgy/context-guard",
|
|
3
|
-
"version": "0.4.
|
|
3
|
+
"version": "0.4.4",
|
|
4
4
|
"description": "ContextGuard CLI helpers for keeping AI coding agent context focused and local-first.",
|
|
5
5
|
"license": "Apache-2.0",
|
|
6
6
|
"homepage": "https://github.com/ictechgy/context-guard#readme",
|
package/packaging/homebrew/context-guard.rb.template
CHANGED
|
@@ -5,7 +5,7 @@ class ContextGuard < Formula
|
|
|
5
5
|
|
|
6
6
|
desc "Local-first context guardrails for AI coding agents"
|
|
7
7
|
homepage "https://github.com/ictechgy/context-guard"
|
|
8
|
-
url "https://github.com/ictechgy/context-guard/archive/refs/tags/v0.4.
|
|
8
|
+
url "https://github.com/ictechgy/context-guard/archive/refs/tags/v0.4.4.tar.gz"
|
|
9
9
|
sha256 "REPLACE_WITH_RELEASE_TARBALL_SHA256"
|
|
10
10
|
license "Apache-2.0"
|
|
11
11
|
|
|
@@ -95,7 +95,7 @@ context-guard-statusline-merged
|
|
|
95
95
|
- **출력 축약기**는 감싼 명령의 종료 코드를 보존하면서 긴 로그를 줄이고, `--digest markdown` 또는 `--digest json`으로 실행기 실패 정보, 가림 처리된 failure signature, 중복 라인 그룹, 다음 조회 제안이 담긴 요약을 만들 수 있습니다.
|
|
96
96
|
- **민감정보 가림 도구**는 검색, diff, 로그 출력에서 자격 증명 패턴, 비공개 키 블록, 인증 헤더, 자격 증명이 포함된 URL, 민감해 보이는 경로를 가립니다.
|
|
97
97
|
- **상태표시줄**은 모델, 컨텍스트, 비용 신호를 짧게 보여주고, 대화 기록 데이터가 있으면 캐시 읽기와 캐시 재사용 신호도 함께 표시합니다.
|
|
98
|
-
- **대화 기록 감사**는 usage/cost/cache bucket을 집계하고, 토큰 집중
|
|
98
|
+
- **대화 기록 감사**는 usage/cost/cache bucket을 집계하고, 토큰 집중 지점, `cache_friendliness` 프롬프트 배치 신호, `cache_layout_advice` 확인/실험 우선순위를 제한된 가림 처리된 segment hash로 보고합니다. 원문 프롬프트는 출력하지 않습니다.
|
|
99
99
|
- **반복 실패 알림**은 Bash 실패가 반복될 때 같은 경로를 계속 재시도하지 않고 전략을 바꾸도록 안내합니다.
|
|
100
100
|
- **벤치마크 헬퍼**는 기준/변형 실행을 대응해 실제 토큰·비용 필드, 별도의 바이트 감소 간접 증거, 진단용 `wall_time_seconds`, `provider_cached_tokens`, provider-cache 사용 가능성 텔레메트리로 기록합니다.
|
|
101
101
|
|
|
@@ -109,7 +109,7 @@ brief 모드는 코딩 에이전트가 군더더기를 줄이도록 요청하되
|
|
|
109
109
|
|
|
110
110
|
## 절감 수치를 과장하지 않습니다
|
|
111
111
|
|
|
112
|
-
이 헬퍼들은 흔히 컨텍스트를 불필요하게 키우는 원인을 줄이지만, 고정된 절감률을 보장하지 않습니다. 실제 전후 비교 증거가 필요하면 `context-guard-bench --ledger-jsonl ... --report-json ...`로 본인 작업에서 측정하세요. 토큰 절감 주장은 대응 태스크 양쪽 모두에 `primary_tokens_measured`가 있을 때만 계산하며, report의 `matched_pair_evidence`가 성공한 baseline/variant task bucket을 transform, quality gate, 측정 가능 여부, claim boundary와 연결합니다. wall-time과 provider-cache 필드는 진단용 텔레메트리이지 단독 절감 증거가 아닙니다. 감사의 `cache_friendliness
|
|
112
|
+
이 헬퍼들은 흔히 컨텍스트를 불필요하게 키우는 원인을 줄이지만, 고정된 절감률을 보장하지 않습니다. 실제 전후 비교 증거가 필요하면 `context-guard-bench --ledger-jsonl ... --report-json ...`로 본인 작업에서 측정하세요. 토큰 절감 주장은 대응 태스크 양쪽 모두에 `primary_tokens_measured`가 있을 때만 계산하며, report의 `matched_pair_evidence`가 성공한 baseline/variant task bucket을 transform, quality gate, 측정 가능 여부, claim boundary와 연결합니다. wall-time과 provider-cache 필드는 진단용 텔레메트리이지 단독 절감 증거가 아닙니다. 감사의 `cache_friendliness`, [`cache_diagnostics`](https://github.com/ictechgy/context-guard/blob/main/docs/cache-diagnostics-schema.md), `cache_layout_advice`는 관측/추론/가설/불가 경계를 둔 휴리스틱 배치·cache-read 신호와 순위화된 확인/실험이며 청구 기준이나 provider-cache 증명이 아닙니다. 벤치마크 CSV 스키마는 엄격하므로 헬퍼 업그레이드 후에는 새 CSV를 시작하거나 헤더를 마이그레이션하세요. 작업 유형별 합성 예시는 [`docs/benchmark-workflow-examples.md`](https://github.com/ictechgy/context-guard/blob/main/docs/benchmark-workflow-examples.md)에 있고, fixture-only 실험 시작 예시는 [`docs/experimental-benchmark-fixtures.md`](https://github.com/ictechgy/context-guard/blob/main/docs/experimental-benchmark-fixtures.md)에 있습니다.
|
|
113
113
|
|
|
114
114
|
ContextGuard는 모델 토큰을 줄이기 위해 작업을 외부 AI 서비스로 전송하지 않습니다. 모든 헬퍼 명령은 로컬에서 동작합니다. 로컬 RAM/디스크 보관본은 다음에 보낼 컨텍스트를 줄이는 데 도움될 수 있지만 provider prompt cache를 대체하지 않습니다. Anthropic 배포나 청구 설명 전에는 공식 prompt caching/pricing 문서를 다시 확인하세요: https://docs.anthropic.com/en/build-with-claude/prompt-caching 및 https://platform.claude.com/docs/en/about-claude/pricing.
|
|
115
115
|
|
|
@@ -101,7 +101,7 @@ context-guard-statusline-merged
|
|
|
101
101
|
- **Output trimmer** preserves the wrapped command exit code, trims long logs, and can emit `--digest markdown` or `--digest json` summaries with runner failure facts, sanitized failure signatures, duplicate-line groups, and suggested next queries. Add `--artifact-receipt` with digest mode to store the exact sanitized full output as a local artifact receipt and re-expand omitted slices with the emitted `context-guard-artifact get ...` command.
|
|
102
102
|
- **Sanitizer** redacts common credential patterns, private key blocks, auth headers, credential URLs, and sensitive-looking paths from search, diff, and log output.
|
|
103
103
|
- **Statusline** displays compact model/context/cost signals and, when transcript data is available, cache-read and cache-reuse signals.
|
|
104
|
-
- **Transcript audit** aggregates usage/cost/cache buckets, flags likely token hotspots, and exposes `cache_friendliness
|
|
104
|
+
- **Transcript audit** aggregates usage/cost/cache buckets, flags likely token hotspots, and exposes `cache_friendliness`, additive [`cache_diagnostics`](https://github.com/ictechgy/context-guard/blob/main/docs/cache-diagnostics-schema.md), and `cache_layout_advice` experiment priorities from bounded usage fields, timestamped cache telemetry records, and redacted segment hashes without printing raw prompt text or claiming provider-cache savings.
|
|
105
105
|
- **Repeated-failure nudge** warns after repeated Bash failures so the agent switches strategy instead of retrying the same context-heavy path.
|
|
106
106
|
- **Benchmark helper** records matched baseline/variant runs with real token and cost fields, separate byte-reduction proxy evidence, diagnostic `wall_time_seconds`, `provider_cached_tokens`, and provider-cache availability telemetry.
|
|
107
107
|
|
|
@@ -115,7 +115,7 @@ Three deterministic levels — `lite`, `standard`, `ultra` — live under [`brie
|
|
|
115
115
|
|
|
116
116
|
## Conservative claims
|
|
117
117
|
|
|
118
|
-
These helpers reduce common sources of context bloat, but they do not guarantee a fixed percentage savings. Use `context-guard-bench --ledger-jsonl ... --report-json ...` when you need measured before/after evidence for your own tasks; token-savings claims require `primary_tokens_measured` on both matched sides, and the report's `matched_pair_evidence` links each successful baseline/variant task bucket to the transform, quality gate, measurement availability, and claim boundary. Wall-time/provider-cache fields are diagnostic telemetry, not standalone savings proof. Audit `cache_friendliness
|
|
118
|
+
These helpers reduce common sources of context bloat, but they do not guarantee a fixed percentage savings. Use `context-guard-bench --ledger-jsonl ... --report-json ...` when you need measured before/after evidence for your own tasks; token-savings claims require `primary_tokens_measured` on both matched sides, and the report's `matched_pair_evidence` links each successful baseline/variant task bucket to the transform, quality gate, measurement availability, and claim boundary. Wall-time/provider-cache fields are diagnostic telemetry, not standalone savings proof. Audit `cache_friendliness`, [`cache_diagnostics`](https://github.com/ictechgy/context-guard/blob/main/docs/cache-diagnostics-schema.md), and `cache_layout_advice` findings are heuristic layout/cache-read signals and ranked checks/experiments with observed/inferred/hypothesis/unavailable boundaries, not billing authority or provider-cache proof. Benchmark CSV schemas are strict, so start a new CSV or migrate the header after helper upgrades. Workflow-specific synthetic examples live in [`docs/benchmark-workflow-examples.md`](https://github.com/ictechgy/context-guard/blob/main/docs/benchmark-workflow-examples.md), and fixture-only experimental task/variant starters live in [`docs/experimental-benchmark-fixtures.md`](https://github.com/ictechgy/context-guard/blob/main/docs/experimental-benchmark-fixtures.md).
|
|
119
119
|
|
|
120
120
|
ContextGuard also does not send work to external AI providers to save model tokens. All helper commands run locally. Local RAM/disk receipts can reduce what you choose to send, but they do not replace a provider prompt cache. Before release or billing claims for Anthropic, recheck the official prompt-caching and pricing docs: https://docs.anthropic.com/en/build-with-claude/prompt-caching and https://platform.claude.com/docs/en/about-claude/pricing.
|
|
121
121
|
|
|
@@ -49,6 +49,7 @@ TIMESTAMP_KEYS = ("timestamp", "created_at", "createdAt", "time", "ts")
|
|
|
49
49
|
FEASIBILITY_SCHEMA_VERSION = "contextguard.metric-feasibility.v1.2"
|
|
50
50
|
FEASIBILITY_PRODUCER = "context-guard-audit"
|
|
51
51
|
CACHE_DIAGNOSTICS_SCHEMA_VERSION = "contextguard.cache-diagnostics.v1"
|
|
52
|
+
CACHE_LAYOUT_ADVICE_SCHEMA_VERSION = "contextguard.cache-layout-advice.v1"
|
|
52
53
|
MAX_ERROR_EXAMPLES = 20
|
|
53
54
|
JSON_PARSE_RECURSION_LIMIT = 10_000
|
|
54
55
|
READ_CHUNK_BYTES = 64 * 1024
|
|
@@ -184,6 +185,7 @@ class UsageSummary:
|
|
|
184
185
|
prompt_cache_audit: PromptCacheAudit = field(default_factory=PromptCacheAudit)
|
|
185
186
|
cache_friendliness_cache: dict[str, Any] | None = field(default=None, init=False, repr=False)
|
|
186
187
|
cache_diagnostics_cache: dict[str, Any] | None = field(default=None, init=False, repr=False)
|
|
188
|
+
cache_layout_advice_cache: dict[str, Any] | None = field(default=None, init=False, repr=False)
|
|
187
189
|
|
|
188
190
|
@property
|
|
189
191
|
def total_tokens(self) -> int:
|
|
@@ -1398,6 +1400,222 @@ def cache_diagnostics_for_summary(summary: UsageSummary) -> dict[str, Any]:
|
|
|
1398
1400
|
return build_cache_diagnostics(summary)
|
|
1399
1401
|
|
|
1400
1402
|
|
|
1403
|
+
def _dominant_transcript(summary: UsageSummary) -> dict[str, Any] | None:
|
|
1404
|
+
if summary.total_tokens <= 0 or not summary.by_file:
|
|
1405
|
+
return None
|
|
1406
|
+
_label, tokens = summary.by_file.most_common(1)[0]
|
|
1407
|
+
share = tokens / summary.total_tokens if summary.total_tokens else 0.0
|
|
1408
|
+
return {
|
|
1409
|
+
"tokens": tokens,
|
|
1410
|
+
"share": round(share, 4),
|
|
1411
|
+
"dominates": share >= 0.20 and tokens >= 1_000,
|
|
1412
|
+
}
|
|
1413
|
+
|
|
1414
|
+
|
|
1415
|
+
def _first_dynamic_breaker(cache_diagnostics: dict[str, Any]) -> dict[str, Any] | None:
|
|
1416
|
+
breakers = cache_diagnostics.get("dynamic_prefix_breakers") or []
|
|
1417
|
+
if not breakers:
|
|
1418
|
+
return None
|
|
1419
|
+
first = breakers[0]
|
|
1420
|
+
return first if isinstance(first, dict) else None
|
|
1421
|
+
|
|
1422
|
+
|
|
1423
|
+
def build_cache_layout_advice(summary: UsageSummary) -> dict[str, Any]:
|
|
1424
|
+
if summary.cache_layout_advice_cache is not None:
|
|
1425
|
+
return summary.cache_layout_advice_cache
|
|
1426
|
+
|
|
1427
|
+
cache_friendliness = cache_friendliness_for_summary(summary)
|
|
1428
|
+
cache_diagnostics = cache_diagnostics_for_summary(summary)
|
|
1429
|
+
signals = cache_friendliness.get("signals") if isinstance(cache_friendliness.get("signals"), dict) else {}
|
|
1430
|
+
dynamic_breaker = _first_dynamic_breaker(cache_diagnostics)
|
|
1431
|
+
dominant = _dominant_transcript(summary)
|
|
1432
|
+
cache_creation = summary.tokens.get("cache_creation", 0)
|
|
1433
|
+
cache_read = summary.tokens.get("cache_read", 0)
|
|
1434
|
+
cache_fields = cache_diagnostics.get("observations", {}).get("cache_fields", {}) if isinstance(cache_diagnostics.get("observations"), dict) else {}
|
|
1435
|
+
cache_status = cache_fields.get("status") if isinstance(cache_fields, dict) else None
|
|
1436
|
+
stable_prefix_share = signals.get("stable_prefix_share")
|
|
1437
|
+
volatile_prefix_share = signals.get("volatile_prefix_share")
|
|
1438
|
+
volatile_tail_share = signals.get("volatile_tail_share")
|
|
1439
|
+
max_prefix_position = dynamic_breaker.get("position") if dynamic_breaker else None
|
|
1440
|
+
max_prefix_position_volatile_share = dynamic_breaker.get("volatile_share") if dynamic_breaker else signals.get("max_prefix_position_volatile_share")
|
|
1441
|
+
|
|
1442
|
+
status = "missing"
|
|
1443
|
+
confidence = "unavailable"
|
|
1444
|
+
observed_issue = "unknown"
|
|
1445
|
+
priority = "P2"
|
|
1446
|
+
hypothesized_causes: list[dict[str, Any]] = []
|
|
1447
|
+
corroborated_causes: list[dict[str, Any]] = []
|
|
1448
|
+
next_checks: list[dict[str, Any]] = []
|
|
1449
|
+
recommended_experiments: list[dict[str, Any]] = []
|
|
1450
|
+
|
|
1451
|
+
has_cache_any = bool(
|
|
1452
|
+
summary.token_field_presence.get("cache_read", 0)
|
|
1453
|
+
or summary.token_field_presence.get("cache_creation", 0)
|
|
1454
|
+
)
|
|
1455
|
+
has_prompt_samples = bool(summary.prompt_cache_audit.samples)
|
|
1456
|
+
if has_cache_any or has_prompt_samples:
|
|
1457
|
+
status = "partial" if (
|
|
1458
|
+
not has_prompt_samples
|
|
1459
|
+
or cache_friendliness.get("status") == "partial"
|
|
1460
|
+
or cache_diagnostics.get("status") == "partial"
|
|
1461
|
+
or summary.skipped_files
|
|
1462
|
+
or summary.skipped_records
|
|
1463
|
+
or summary.parse_errors
|
|
1464
|
+
) else "available"
|
|
1465
|
+
confidence = "partial" if status == "partial" else "hypothesis"
|
|
1466
|
+
|
|
1467
|
+
volatile_prefix_breaker = bool(
|
|
1468
|
+
dynamic_breaker
|
|
1469
|
+
and cache_creation > 0
|
|
1470
|
+
and (max_prefix_position in {0, 1} or (max_prefix_position_volatile_share or 0) >= PROMPT_PREFIX_VOLATILE_THRESHOLD)
|
|
1471
|
+
)
|
|
1472
|
+
long_session_dominates = bool(dominant and dominant.get("dominates"))
|
|
1473
|
+
|
|
1474
|
+
if volatile_prefix_breaker:
|
|
1475
|
+
observed_issue = "volatile_prefix_breaker"
|
|
1476
|
+
priority = "P0" if cache_creation >= 50_000 and max_prefix_position in {0, 1} else "P1"
|
|
1477
|
+
hypothesized_causes.append({
|
|
1478
|
+
"id": "prefix-position-churn",
|
|
1479
|
+
"confidence": confidence,
|
|
1480
|
+
"evidence": EVIDENCE_INFERRED,
|
|
1481
|
+
"reason": (
|
|
1482
|
+
"A highly volatile redacted prompt segment appears in the early prefix window; "
|
|
1483
|
+
"this identifies a layout issue, not a confirmed source."
|
|
1484
|
+
),
|
|
1485
|
+
"next_check": "Check whether startup context, generated evidence, or tool/MCP catalog changes are moving before stable policy.",
|
|
1486
|
+
})
|
|
1487
|
+
if cache_diagnostics.get("stable_prefix_candidates"):
|
|
1488
|
+
hypothesized_causes.append({
|
|
1489
|
+
"id": "evidence-before-policy",
|
|
1490
|
+
"confidence": confidence,
|
|
1491
|
+
"evidence": EVIDENCE_INFERRED,
|
|
1492
|
+
"reason": (
|
|
1493
|
+
"Stable reusable segments appear elsewhere while the early prefix churns; "
|
|
1494
|
+
"check whether logs, diffs, timestamps, or file evidence precede stable instructions."
|
|
1495
|
+
),
|
|
1496
|
+
"next_check": "Keep stable policy/instructions first and move generated run evidence later.",
|
|
1497
|
+
})
|
|
1498
|
+
next_checks.append({
|
|
1499
|
+
"id": "inspect-startup-context-size",
|
|
1500
|
+
"confidence": "hypothesis",
|
|
1501
|
+
"command_templates": [
|
|
1502
|
+
"context-guard-diet scan <repo>",
|
|
1503
|
+
"context-guard-diet structural-waste <repo>",
|
|
1504
|
+
],
|
|
1505
|
+
"evidence_required_for_corroboration": (
|
|
1506
|
+
"Large or duplicate CLAUDE.md/AGENTS.md/GEMINI.md findings from diet output."
|
|
1507
|
+
),
|
|
1508
|
+
})
|
|
1509
|
+
elif long_session_dominates:
|
|
1510
|
+
observed_issue = "long_session_accumulation"
|
|
1511
|
+
priority = "P1"
|
|
1512
|
+
elif cache_creation >= 10_000 and cache_read > 0 and summary.cache_amortization < 0.5:
|
|
1513
|
+
observed_issue = "low_cache_reuse"
|
|
1514
|
+
priority = "P1"
|
|
1515
|
+
elif cache_status == "missing" or not has_cache_any:
|
|
1516
|
+
observed_issue = "missing_cache_fields"
|
|
1517
|
+
priority = "P2"
|
|
1518
|
+
|
|
1519
|
+
if long_session_dominates:
|
|
1520
|
+
recommended_experiments.append({
|
|
1521
|
+
"id": "split-long-sessions",
|
|
1522
|
+
"order": len(recommended_experiments) + 1,
|
|
1523
|
+
"priority": "P1",
|
|
1524
|
+
"effort": "low",
|
|
1525
|
+
"action": "Use /clear between unrelated tasks and /compact focus on changed files, failing tests, and remaining TODO during long work.",
|
|
1526
|
+
"expected_signal": "Cache creation per comparable task decreases and one transcript no longer dominates observed tokens.",
|
|
1527
|
+
"verification": "Re-run context-guard-audit on a comparable window and compare cache_creation, cache_amortization, and top transcript share.",
|
|
1528
|
+
"evidence": dominant or {},
|
|
1529
|
+
})
|
|
1530
|
+
if volatile_prefix_breaker:
|
|
1531
|
+
recommended_experiments.append({
|
|
1532
|
+
"id": "stabilize-cache-prefix",
|
|
1533
|
+
"order": len(recommended_experiments) + 1,
|
|
1534
|
+
"priority": priority,
|
|
1535
|
+
"effort": "medium",
|
|
1536
|
+
"action": "Keep stable reusable instructions/policy before volatile logs, diffs, timestamps, and generated file evidence.",
|
|
1537
|
+
"expected_signal": "Stable prefix share rises and volatile prefix share falls on matched audit windows.",
|
|
1538
|
+
"verification": "Re-run context-guard-audit --json --recommend and compare cache_layout_advice plus cache_friendliness signals.",
|
|
1539
|
+
"evidence": {
|
|
1540
|
+
"dynamic_prefix_breaker_position": max_prefix_position,
|
|
1541
|
+
"dynamic_prefix_breaker_volatile_share": max_prefix_position_volatile_share,
|
|
1542
|
+
},
|
|
1543
|
+
})
|
|
1544
|
+
recommended_experiments.append({
|
|
1545
|
+
"id": "run-context-diet-checks",
|
|
1546
|
+
"order": len(recommended_experiments) + 1,
|
|
1547
|
+
"priority": "P1",
|
|
1548
|
+
"effort": "low",
|
|
1549
|
+
"action": "Run the generated diet command templates and treat any large/duplicate context-file findings as corroborating evidence before editing instructions.",
|
|
1550
|
+
"expected_signal": "Diet output identifies or rules out oversized/duplicated startup context as a contributor.",
|
|
1551
|
+
"verification": "Record diet JSON separately; do not convert prefix-position evidence alone into a confirmed startup-context cause.",
|
|
1552
|
+
"command_templates": [
|
|
1553
|
+
"context-guard-diet scan <repo> --json > diet.json",
|
|
1554
|
+
"context-guard-diet structural-waste <repo> --json > structural-waste.json",
|
|
1555
|
+
],
|
|
1556
|
+
})
|
|
1557
|
+
if cache_creation >= 50_000 and summary.cache_amortization_defined and 1.0 <= summary.cache_amortization < 5.0:
|
|
1558
|
+
recommended_experiments.append({
|
|
1559
|
+
"id": "defer-longer-ttl-until-prefix-stable" if volatile_prefix_breaker else "evaluate-longer-ttl-after-stability-check",
|
|
1560
|
+
"order": len(recommended_experiments) + 1,
|
|
1561
|
+
"priority": "P2",
|
|
1562
|
+
"effort": "medium",
|
|
1563
|
+
"action": "Treat longer TTL as secondary; first corroborate stable prefix reuse and current provider TTL/pricing behavior.",
|
|
1564
|
+
"expected_signal": "TTL evaluation happens only after prefix volatility is reduced or ruled out.",
|
|
1565
|
+
"verification": "Use timestamped cache telemetry and provider-measured billing/cost evidence; historical token totals alone are insufficient.",
|
|
1566
|
+
})
|
|
1567
|
+
if not recommended_experiments and status == "partial":
|
|
1568
|
+
next_checks.append({
|
|
1569
|
+
"id": "rerun-narrower-audit",
|
|
1570
|
+
"confidence": "partial",
|
|
1571
|
+
"command_templates": ["context-guard-audit <transcript-or-project-dir> --json --recommend"],
|
|
1572
|
+
"evidence_required_for_corroboration": "Enough uncapped prompt/cache records to classify prefix layout.",
|
|
1573
|
+
})
|
|
1574
|
+
if not recommended_experiments and observed_issue == "missing_cache_fields":
|
|
1575
|
+
next_checks.append({
|
|
1576
|
+
"id": "collect-cache-telemetry",
|
|
1577
|
+
"confidence": "unavailable",
|
|
1578
|
+
"command_templates": ["context-guard-audit ~/.claude/projects --json --recommend"],
|
|
1579
|
+
"evidence_required_for_corroboration": "Transcript records with cache_read/cache_creation fields.",
|
|
1580
|
+
})
|
|
1581
|
+
|
|
1582
|
+
advice = {
|
|
1583
|
+
"schema_version": CACHE_LAYOUT_ADVICE_SCHEMA_VERSION,
|
|
1584
|
+
"status": status,
|
|
1585
|
+
"confidence": confidence,
|
|
1586
|
+
"heuristic": True,
|
|
1587
|
+
"observed_issue": observed_issue,
|
|
1588
|
+
"priority": priority,
|
|
1589
|
+
"observed_summary": {
|
|
1590
|
+
"cache_creation_tokens": cache_creation,
|
|
1591
|
+
"cache_read_tokens": cache_read,
|
|
1592
|
+
"cache_amortization": round(summary.cache_amortization, 4) if summary.cache_amortization_defined else None,
|
|
1593
|
+
"stable_prefix_share": stable_prefix_share,
|
|
1594
|
+
"volatile_prefix_share": volatile_prefix_share,
|
|
1595
|
+
"volatile_tail_share": volatile_tail_share,
|
|
1596
|
+
"max_prefix_position": max_prefix_position,
|
|
1597
|
+
"max_prefix_position_volatile_share": max_prefix_position_volatile_share,
|
|
1598
|
+
"dominant_transcript_share": dominant.get("share") if dominant else None,
|
|
1599
|
+
},
|
|
1600
|
+
"hypothesized_causes": hypothesized_causes,
|
|
1601
|
+
"corroborated_causes": corroborated_causes,
|
|
1602
|
+
"next_checks": next_checks,
|
|
1603
|
+
"recommended_experiments": recommended_experiments,
|
|
1604
|
+
"caveats": [
|
|
1605
|
+
"Cache layout advice is a local transcript heuristic, not billing authority or provider-cache proof.",
|
|
1606
|
+
"Observed issues come from cache fields and redacted segment statistics; causes remain hypotheses until corroborated by diet/structural evidence.",
|
|
1607
|
+
"Generated command templates use placeholders and must not be treated as observed user commands or paths.",
|
|
1608
|
+
"Use matched before/after audits before making token or cost savings claims.",
|
|
1609
|
+
],
|
|
1610
|
+
}
|
|
1611
|
+
summary.cache_layout_advice_cache = advice
|
|
1612
|
+
return advice
|
|
1613
|
+
|
|
1614
|
+
|
|
1615
|
+
def cache_layout_advice_for_summary(summary: UsageSummary) -> dict[str, Any]:
|
|
1616
|
+
return build_cache_layout_advice(summary)
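
The volatile-prefix priority rule inside `build_cache_layout_advice` can be restated in isolation as the sketch below. It is a simplified, standalone reading of the diff, not the shipped function: `classify_volatile_prefix` is a hypothetical name, and `volatile_threshold` stands in for `PROMPT_PREFIX_VOLATILE_THRESHOLD`, whose value does not appear in this diff and is therefore passed in by the caller.

```python
from typing import Optional


def classify_volatile_prefix(
    has_breaker: bool,
    cache_creation: int,
    position: Optional[int],
    volatile_share: Optional[float],
    volatile_threshold: float,  # assumption: supplied by the caller, value not shown in the diff
) -> Optional[str]:
    """Return "P0" or "P1" when the volatile-prefix rule fires, else None."""
    early = position in {0, 1}  # breaker sits in the first two prefix positions
    fires = has_breaker and cache_creation > 0 and (
        early or (volatile_share or 0) >= volatile_threshold
    )
    if not fires:
        return None
    # P0 needs both material cache creation and an early breaker position.
    return "P0" if cache_creation >= 50_000 and early else "P1"
```

A late breaker never reaches `P0`, however many cache-creation tokens were observed.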
|
|
1617
|
+
|
|
1618
|
+
|
|
1401
1619
|
def build_metric_caveats(summary: UsageSummary) -> list[str]:
|
|
1402
1620
|
caveats = [
|
|
1403
1621
|
"Values are observed from local Claude Code transcript JSON/JSONL fields and are not official billing records.",
|
|
@@ -1433,6 +1651,7 @@ def feasibility_json(
|
|
|
1433
1651
|
stable_total_tokens = sum(stable_tokens.values())
|
|
1434
1652
|
cache_friendliness = cache_friendliness_for_summary(summary)
|
|
1435
1653
|
cache_diagnostics = cache_diagnostics_for_summary(summary)
|
|
1654
|
+
cache_layout_advice = cache_layout_advice_for_summary(summary)
|
|
1436
1655
|
return {
|
|
1437
1656
|
"schema_version": FEASIBILITY_SCHEMA_VERSION,
|
|
1438
1657
|
"producer": FEASIBILITY_PRODUCER,
|
|
@@ -1452,6 +1671,7 @@ def feasibility_json(
|
|
|
1452
1671
|
"headroom_availability",
|
|
1453
1672
|
"cache_friendliness",
|
|
1454
1673
|
"cache_diagnostics",
|
|
1674
|
+
"cache_layout_advice",
|
|
1455
1675
|
"totals",
|
|
1456
1676
|
],
|
|
1457
1677
|
"diagnostic_fields": ["summary"],
|
|
@@ -1480,6 +1700,7 @@ def feasibility_json(
|
|
|
1480
1700
|
"headroom_availability": availability["headroom"],
|
|
1481
1701
|
"cache_friendliness": cache_friendliness,
|
|
1482
1702
|
"cache_diagnostics": cache_diagnostics,
|
|
1703
|
+
"cache_layout_advice": cache_layout_advice,
|
|
1483
1704
|
"totals": {
|
|
1484
1705
|
"total_tokens": stable_total_tokens,
|
|
1485
1706
|
"tokens": stable_tokens,
|
|
@@ -1531,6 +1752,36 @@ def build_recommendations(summary: UsageSummary, top: int) -> list[dict[str, Any
|
|
|
1531
1752
|
input_ratio = input_tokens / total
|
|
1532
1753
|
cache_friendliness = cache_friendliness_for_summary(summary)
|
|
1533
1754
|
cache_diagnostics = cache_diagnostics_for_summary(summary)
|
|
1755
|
+
cache_layout_advice = cache_layout_advice_for_summary(summary)
|
|
1756
|
+
if cache_layout_advice.get("observed_issue") == "volatile_prefix_breaker":
|
|
1757
|
+
evidence = {
|
|
1758
|
+
"observed_issue": cache_layout_advice.get("observed_issue"),
|
|
1759
|
+
"priority": cache_layout_advice.get("priority"),
|
|
1760
|
+
"confidence": cache_layout_advice.get("confidence"),
|
|
1761
|
+
"cache_creation_tokens": cache_creation,
|
|
1762
|
+
"cache_read_tokens": cache_read,
|
|
1763
|
+
}
|
|
1764
|
+
observed_summary = cache_layout_advice.get("observed_summary")
|
|
1765
|
+
if isinstance(observed_summary, dict):
|
|
1766
|
+
for key in ("max_prefix_position", "max_prefix_position_volatile_share", "stable_prefix_share", "volatile_prefix_share"):
|
|
1767
|
+
evidence[key] = observed_summary.get(key)
|
|
1768
|
+
rec = recommendation(
|
|
1769
|
+
"prioritize-cache-prefix-stabilization",
|
|
1770
|
+
"Prioritize cache-prefix stabilization before TTL or output trimming",
|
|
1771
|
+
(
|
|
1772
|
+
"Cache creation remains material and redacted segment statistics show a volatile early prefix; "
|
|
1773
|
+
"this is an experiment-prioritization signal, not a confirmed root cause."
|
|
1774
|
+
),
|
|
1775
|
+
(
|
|
1776
|
+
"If one transcript dominates, split unrelated work into shorter sessions; then check startup/context "
|
|
1777
|
+
"size and keep stable policy before volatile logs, diffs, timestamps, and generated evidence."
|
|
1778
|
+
),
|
|
1779
|
+
str(cache_layout_advice.get("priority") or "P1"),
|
|
1780
|
+
evidence,
|
|
1781
|
+
)
|
|
1782
|
+
rec["heuristic"] = True
|
|
1783
|
+
rec["confidence"] = cache_layout_advice.get("confidence")
|
|
1784
|
+
recs.append(rec)
|
|
1534
1785
|
for finding in cache_friendliness.get("findings", []):
|
|
1535
1786
|
if isinstance(finding, dict) and finding.get("id") == "volatile-content-near-prefix":
|
|
1536
1787
|
evidence = dict(finding.get("evidence") or {})
|
|
@@ -1754,6 +2005,7 @@ def summary_json(
|
|
|
1754
2005
|
"top_tools": counter_json(summary.by_tool, top),
|
|
1755
2006
|
"cache_friendliness": cache_friendliness_for_summary(summary),
|
|
1756
2007
|
"cache_diagnostics": cache_diagnostics_for_summary(summary),
|
|
2008
|
+
"cache_layout_advice": cache_layout_advice_for_summary(summary),
|
|
1757
2009
|
}
|
|
1758
2010
|
if include_recommendations:
|
|
1759
2011
|
data["recommendations"] = build_recommendations(summary, top)
|
|
@@ -1887,6 +2139,26 @@ def main() -> int:
|
|
|
1887
2139
|
headroom = cache_diagnostics.get("headroom_diagnostics") or {}
|
|
1888
2140
|
print(f" headroom_status {headroom.get('status')} ({headroom.get('evidence')})")
|
|
1889
2141
|
|
|
2142
|
+
cache_layout_advice = cache_layout_advice_for_summary(summary)
|
|
2143
|
+
if cache_layout_advice.get("status") != "missing" or cache_layout_advice.get("observed_issue") != "unknown":
|
|
2144
|
+
print("\nCache layout advice")
|
|
2145
|
+
print(f" status {cache_layout_advice.get('status')}")
|
|
2146
|
+
print(f" confidence {cache_layout_advice.get('confidence')}")
|
|
2147
|
+
print(f" observed_issue {cache_layout_advice.get('observed_issue')}")
|
|
2148
|
+
print(f" priority {cache_layout_advice.get('priority')}")
|
|
2149
|
+
experiments = cache_layout_advice.get("recommended_experiments") or []
|
|
2150
|
+
if experiments:
|
|
2151
|
+
first = experiments[0]
|
|
2152
|
+
print(f" first_experiment {first.get('id')} ({first.get('priority')})")
|
|
2153
|
+
print(f" experiment_action {first.get('action')}")
|
|
2154
|
+
checks = cache_layout_advice.get("next_checks") or []
|
|
2155
|
+
if checks:
|
|
2156
|
+
first = checks[0]
|
|
2157
|
+
print(f" next_check {first.get('id')}")
|
|
2158
|
+
templates = first.get("command_templates") or []
|
|
2159
|
+
if templates:
|
|
2160
|
+
print(f" command_template {templates[0]}")
|
|
2161
|
+
|
|
1890
2162
|
model_totals = Counter({model: sum(tokens.values()) for model, tokens in summary.by_model.items()})
|
|
1891
2163
|
print_counter("By model", model_totals, args.top)
|
|
1892
2164
|
|