pluribus-context 0.3.33 → 0.3.35

Files changed (43)
  1. package/CHANGELOG.md +19 -0
  2. package/README.md +7 -6
  3. package/docs/ai-pr-review-receipts.md +153 -0
  4. package/docs/canonical-output-receipts.md +107 -0
  5. package/docs/community-review-packet.md +11 -11
  6. package/docs/context-budget-receipts.md +22 -0
  7. package/docs/context-input-evidence.md +15 -0
  8. package/docs/dynamic-workflow-run-receipts.md +158 -0
  9. package/docs/install-plan-receipts.md +77 -0
  10. package/docs/mcp-tool-visibility-receipts.md +67 -0
  11. package/docs/review-primitive-gate.md +107 -0
  12. package/docs/skill-policy-receipts.md +87 -0
  13. package/docs/subagent-role-receipts.md +95 -0
  14. package/docs/temporal-context-receipts.md +123 -0
  15. package/examples/agent-skills/context-receipts/SKILL.md +21 -0
  16. package/examples/agent-skills/skill-policy-receipts/README.md +22 -0
  17. package/examples/agent-skills/skill-policy-receipts/SKILL.md +77 -0
  18. package/examples/ai-pr-review-receipts/.github/pull_request_template.md +31 -0
  19. package/examples/ai-pr-review-receipts/README.md +5 -0
  20. package/examples/canonical-output-receipts/canonical-output-receipt.json +55 -0
  21. package/examples/claude-code-review-hook/README.md +74 -0
  22. package/examples/claude-code-review-hook/check-review-receipt-hook.mjs +80 -0
  23. package/examples/claude-code-review-hook/sample-task-completed-event.json +6 -0
  24. package/examples/context-input-evidence/code-search-retrieval-otel-trace.json +879 -0
  25. package/examples/context-input-evidence/code-search-retrieval-receipt.ndjson +8 -0
  26. package/examples/context-input-evidence/convert-code-search-retrieval-log.mjs +280 -0
  27. package/examples/context-input-evidence/sample-code-search-retrieval-log.jsonl +5 -0
  28. package/examples/dynamic-workflow-run-receipts/README.md +18 -0
  29. package/examples/dynamic-workflow-run-receipts/workflow-run-receipt.json +112 -0
  30. package/examples/install-plan-receipts/README.md +34 -0
  31. package/examples/install-plan-receipts/agent-install-plan-receipt.json +56 -0
  32. package/examples/review-primitive-gate/README.md +19 -0
  33. package/examples/review-primitive-gate/check-review-receipt.mjs +100 -0
  34. package/examples/review-primitive-gate/fail-review-receipt.json +42 -0
  35. package/examples/review-primitive-gate/pass-review-receipt.json +54 -0
  36. package/examples/subagent-role-receipts/README.md +15 -0
  37. package/examples/subagent-role-receipts/agents.toml +36 -0
  38. package/examples/temporal-context-receipts/CURRENT_STATE.md +13 -0
  39. package/examples/temporal-context-receipts/specs/2025-checkout-rewrite.md +10 -0
  40. package/examples/temporal-context-receipts/specs/2026-checkout-risk-notes.md +10 -0
  41. package/examples/temporal-context-receipts/temporal-authority-receipt.json +27 -0
  42. package/package.json +1 -1
  43. package/src/utils/version.js +1 -1
@@ -0,0 +1,77 @@
1
+ # Install-plan receipts
2
+
3
+ Use this when an MCP server, Skill bundle, plugin, starter kit, or setup script says it can configure many AI coding tools for you.
4
+
5
+ The risk is not only whether a hook later runs safely. The earlier boundary is the installer itself: it may detect agents, write MCP config, add instruction files, install Skills, register hooks, or create backups before the user understands what changed.
6
+
7
+ The goal is a tiny, privacy-safe pre-mutation receipt that proves what the setup step intends to touch **before the first write starts**. Do not log prompts, source code, secrets, raw environment dumps, transcripts, raw command output, customer data, or private absolute paths.
8
+
9
+ ## Boundary to prove
10
+
11
+ For every setup/install run, capture a plan like this before applying changes:
12
+
13
+ ```json
14
+ {
15
+ "receipt_type": "agent.install.plan.v1",
16
+ "run_id": "local-install-2026-05-29T16:00Z",
17
+ "installer": "code-memory-mcp",
18
+ "mode_requested": "plan",
19
+ "mode_effective": "plan",
20
+ "agents_detected": ["claude-code", "cursor", "codex", "openclaw"],
21
+ "agents_selected": ["claude-code", "openclaw"],
22
+ "planned_writes": [
23
+ {
24
+ "kind": "mcp_config",
25
+ "target": "claude-code project config",
26
+ "operation": "add_server",
27
+ "backup_planned": true
28
+ },
29
+ {
30
+ "kind": "instruction_file",
31
+ "target": "AGENTS.md",
32
+ "operation": "append_usage_notes",
33
+ "backup_planned": true
34
+ },
35
+ {
36
+ "kind": "hook",
37
+ "target": "pre-tool hook config",
38
+ "operation": "register_command",
39
+ "backup_planned": true
40
+ }
41
+ ],
42
+ "external_commands_planned": [
43
+ { "phase": "apply", "command_class": "package_manager_install" }
44
+ ],
45
+ "network_after_install": "mcp_server_localhost_only",
46
+ "writes_started": false,
47
+ "next_safe_command": "installer apply --from-plan install-plan.json"
48
+ }
49
+ ```
50
+
51
+ Keep `target` values coarse enough for review. Prefer `claude-code project config` over a full local path, and `package_manager_install` over raw shell output.
52
+
53
+ ## Acceptance checks
54
+
55
+ A safe installer should make these claims inspectable:
56
+
57
+ 1. **Plan mode exists** — `install --plan`, `install --dry-run`, or equivalent emits the receipt without writing files.
58
+ 2. **Effective mode is explicit** — if the user requested `apply` but policy downgraded to `plan`, the receipt says so.
59
+ 3. **Agent detection is separated from selection** — finding Cursor/Codex/Claude/OpenClaw does not imply every detected tool will be changed.
60
+ 4. **Every planned write has a kind and backup decision** — config, instruction file, Skill, hook, shell profile, lockfile, cache, or generated artifact.
61
+ 5. **Writes are still false at receipt time** — `writes_started=false` is the key trust boundary.
62
+ 6. **Apply can be repeated from the plan** — the user can review one artifact, then run a concrete next command.
63
+ 7. **No private payloads leak** — no raw source, prompts, env dumps, secrets, token values, transcripts, stack traces, or raw tool output.
64
+
65
+ ## Why this matters for hooks and MCP
66
+
67
+ Hooks, Skills, and MCP configs are often discussed as runtime supply-chain surfaces. That is true, but it is downstream. A one-command installer can create the hook or MCP entry first.
68
+
69
+ A hook receipt answers: “what executed?”
70
+
71
+ An install-plan receipt answers the earlier question: **“what is about to be installed, written, and trusted?”**
72
+
73
+ If an installer cannot answer that before mutation, treat it like running CI from an untrusted fork: useful, but not automatically safe.
74
+
75
+ ## Try the copyable example
76
+
77
+ See [`examples/install-plan-receipts/`](../examples/install-plan-receipts/) for a small review checklist and sample receipt you can copy into setup scripts, README install sections, or agent-managed onboarding workflows.
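The acceptance checks in the install-plan guide above can be approximated mechanically. The following is a minimal sketch, not the package's own checker: `checkInstallPlan` and its messages are hypothetical, the field names are taken from the sample receipt, and treating `agents_selected` as a subset of `agents_detected` is an assumption about how "detection is separated from selection" would be verified.

```javascript
// Hypothetical checker for an agent.install.plan.v1 receipt.
// Field names follow the sample receipt; the function name and messages are illustrative.
function checkInstallPlan(receipt) {
  const problems = [];

  if (receipt.receipt_type !== "agent.install.plan.v1") {
    problems.push("unexpected receipt_type");
  }
  // Requested and effective mode must both be stated so a downgrade is visible.
  if (!receipt.mode_requested || !receipt.mode_effective) {
    problems.push("mode_requested and mode_effective must both be present");
  }
  // The key trust boundary: nothing has been written at receipt time.
  if (receipt.writes_started !== false) {
    problems.push("writes_started must be false");
  }
  // Assumption: every selected agent must also appear in the detected list.
  const detected = new Set(receipt.agents_detected || []);
  for (const agent of receipt.agents_selected || []) {
    if (!detected.has(agent)) {
      problems.push(`selected agent was not detected: ${agent}`);
    }
  }
  // Every planned write needs a kind and an explicit backup decision.
  const writes = Array.isArray(receipt.planned_writes) ? receipt.planned_writes : [];
  writes.forEach((write, index) => {
    if (!write.kind) {
      problems.push(`planned_writes[${index}] is missing kind`);
    }
    if (typeof write.backup_planned !== "boolean") {
      problems.push(`planned_writes[${index}] is missing a backup decision`);
    }
  });
  // The reviewer should be handed one concrete command to run next.
  if (!receipt.next_safe_command) {
    problems.push("next_safe_command is missing");
  }
  return { ok: problems.length === 0, problems };
}
```

A check like this only covers structure. It cannot prove that the installer really wrote nothing, so it complements a dry-run mode and does not replace one.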
@@ -0,0 +1,67 @@
1
+ # MCP tool visibility receipts
2
+
3
+ MCP memory, Git, GitLab, code-search, and knowledge-graph servers can be healthy while the agent still cannot see their tools.
4
+
5
+ A useful debug artifact should prove each boundary separately:
6
+
7
+ 1. **Server launched** — the configured command starts without leaking env/secrets.
8
+ 2. **Handshake completed** — client and server agreed on a protocol version and capabilities.
9
+ 3. **Proxy catalog returned** — a direct `tools/list` call returns the expected tool count and names.
10
+ 4. **Client catalog visible** — the actual agent UI/runtime exposes the same tools under the expected names.
11
+ 5. **Invocation allowed or refused** — the first tool call either runs, or returns an explicit permission/config/schema reason.
12
+
13
+ `server healthy` is not enough. `tools/list` is not enough. The receipt needs to say where the chain stopped.
14
+
15
+ ## 60-second probe for any stdio MCP server
16
+
17
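+ Replace the command after the pipe with the server command you already configured in Claude Code, Cursor, Codex, OpenClaw, or another MCP client. The MCP lifecycle also expects a `notifications/initialized` message after `initialize`, so a strict server may reject the `tools/list` request in this two-message probe; add that notification between the two requests if it does.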
18
+
19
+ ```bash
20
+ (
21
+ printf '%s\n' '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"receipt-probe","version":"0.1.0"}},"id":1}'
22
+ printf '%s\n' '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":2}'
23
+ ) | your-mcp-server-command
24
+ ```
25
+
26
+ Record only metadata, not raw prompt/source/tool output:
27
+
28
+ ```json
29
+ {
30
+ "kind": "mcp.tool_visibility.receipt",
31
+ "server": "gitlab",
32
+ "server_command_hash": "sha256:...",
33
+ "protocol_version_requested": "2024-11-05",
34
+ "handshake": "ok",
35
+ "proxy_tools_count": 172,
36
+ "proxy_tool_names_sample": ["glab_issue_list", "glab_mr_view"],
37
+ "client": "Claude Code",
38
+ "client_catalog_visible": false,
39
+ "client_tools_count": 0,
40
+ "stopped_at": "client_catalog_visible",
41
+ "privacy": "names/counts only; no args, outputs, tokens, paths, or source snippets"
42
+ }
43
+ ```
44
+
45
+ ## Acceptance check
46
+
47
+ For a release or bug report, ask for one small matrix:
48
+
49
+ | Boundary | Evidence | Pass condition |
50
+ | --- | --- | --- |
51
+ | Launch | server command hash + exit/live status | command starts and stays alive long enough for handshake |
52
+ | Handshake | protocol version + capabilities summary | initialized without version/schema mismatch |
53
+ | Proxy catalog | `tools/list` count + stable tool-name sample | expected tools returned directly |
54
+ | Client catalog | client-visible count + naming prefix | same class of tools visible to the agent |
55
+ | First invocation | allowed/refused reason | failure explains permission/config/schema, not silent absence |
56
+
57
+ This shape is intentionally compatible with GitHub/GitLab issue reports and OpenTelemetry-style events. It helps maintainers separate server bugs from client catalog, protocol-version, schema, timeout, and permission bugs without asking users to paste private output.
58
+
59
+ ## Why this belongs near Pluribus
60
+
61
+ Pluribus should not become an MCP gateway or memory database. The narrow value is evidence for context boundaries:
62
+
63
+ - generated instruction files prove what static rules were written;
64
+ - memory/search receipts prove what retrieved context was delivered;
65
+ - tool visibility receipts prove whether a configured MCP capability actually crossed into the agent's usable catalog.
66
+
67
+ If a tool is not visible to the agent, the project has no reliable context handoff no matter how healthy the server looks.
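The "where did the chain stop" idea in the MCP tool visibility guide above can be sketched as a walk over the five boundaries in order. This is an illustrative sketch only: `firstStoppedBoundary` and the evidence keys are hypothetical names, chosen so that the fourth one matches the `stopped_at` value in the sample receipt.

```javascript
// Boundary order follows the five-step list in the guide.
// The key names are hypothetical; only "client_catalog_visible" appears in the sample receipt.
const MCP_BOUNDARIES = [
  "server_launched",
  "handshake_completed",
  "proxy_catalog_returned",
  "client_catalog_visible",
  "invocation_resolved",
];

// Returns the first boundary without positive evidence, or null if all five were proven.
function firstStoppedBoundary(evidence) {
  for (const boundary of MCP_BOUNDARIES) {
    if (evidence[boundary] !== true) {
      return boundary;
    }
  }
  return null;
}
```

Treating a missing key the same as `false` is deliberate here: an unproven boundary should stop the chain rather than be assumed healthy.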
@@ -0,0 +1,107 @@
1
+ # Review primitive gate for agent handoffs
2
+
3
+ Use this when a parallel-agent run, Claude Code hook/workflow, Codex/OpenClaw handoff, or local control-plane wrapper needs to prove more than "the agent said it was done".
4
+
5
+ The market question is not just what to log after a run. It is whether a reviewer or CI job can make a decision:
6
+
7
+ - **continue** because the assignment stayed inside approved scope and required checks passed;
8
+ - **review first** because the run is partial or has explicit unverified assumptions;
9
+ - **reject / stop** because scope changed without approval, required checks were skipped or failed, or the run is unsafe to resume.
10
+
11
+ Pluribus should not be the execution control plane. Worktrees, VMs, hooks, masks, and vendor guardrails can enforce parts of the run. The useful Pluribus layer is a small, privacy-safe receipt that turns those controls into reviewable evidence across tools.
12
+
13
+ ## Receipt shape
14
+
15
+ Attach this receipt to a PR body, CI artifact, run summary, or handoff packet.
16
+
17
+ ```json
18
+ {
19
+ "type": "agent.review_primitive_receipt.v1",
20
+ "assignment_id": "agent-auth-audit-42",
21
+ "run_id": "run-2026-05-31T17-00Z",
22
+ "agent": {
23
+ "tool": "claude-code",
24
+ "role": "auth-reviewer"
25
+ },
26
+ "approved_boundaries": {
27
+ "read": ["src/auth/**", "tests/auth/**"],
28
+ "write": ["tests/auth/**"],
29
+ "network": false
30
+ },
31
+ "scope_access_changes": [
32
+ {
33
+ "change": "read docs/security/**",
34
+ "reason": "needed policy wording for test fixture",
35
+ "approved": true,
36
+ "approved_by": "human-reviewer"
37
+ }
38
+ ],
39
+ "commands_and_checks": [
40
+ {
41
+ "name": "npm test -- tests/auth",
42
+ "kind": "required_test",
43
+ "status": "passed",
44
+ "evidence": "ci://job/123#auth-tests"
45
+ },
46
+ {
47
+ "name": "npm run lint",
48
+ "kind": "required_check",
49
+ "status": "passed",
50
+ "evidence": "ci://job/123#lint"
51
+ }
52
+ ],
53
+ "refused_operations": [
54
+ {
55
+ "operation": "write src/auth/session.ts",
56
+ "reason": "outside approved write boundary"
57
+ }
58
+ ],
59
+ "handoff": {
60
+ "changed_files_bucket": "under_5",
61
+ "evidence_path": "artifacts/agent-auth-audit-42.json",
62
+ "next_safe_action": "review tests/auth/session.test.ts before merge"
63
+ },
64
+ "resume_state": "complete",
65
+ "privacy": {
66
+ "raw_prompts_logged": false,
67
+ "raw_tool_output_logged": false,
68
+ "source_code_logged": false,
69
+ "secrets_logged": false
70
+ }
71
+ }
72
+ ```
73
+
74
+ ## Minimal gate
75
+
76
+ The copyable demo in [`examples/review-primitive-gate/`](../examples/review-primitive-gate/) turns the receipt into a CI/reviewer decision.
77
+
78
+ If you use Claude Code hooks, the [`examples/claude-code-review-hook/`](../examples/claude-code-review-hook/) bridge shows how to run the same gate from `TaskCompleted`, `PostCompact`, or `SessionEnd` without logging raw prompts, transcripts, tool output, source code, or secrets.
79
+
80
+ ```bash
81
+ node examples/review-primitive-gate/check-review-receipt.mjs \
82
+ examples/review-primitive-gate/pass-review-receipt.json
83
+
84
+ node examples/review-primitive-gate/check-review-receipt.mjs \
85
+ examples/review-primitive-gate/fail-review-receipt.json
86
+ ```
87
+
88
+ The gate passes only when:
89
+
90
+ - `type` is `agent.review_primitive_receipt.v1`;
91
+ - `assignment_id` and `run_id` exist;
92
+ - approved read/write boundaries are present;
93
+ - every scope/access change is explicitly approved;
94
+ - every required check/test passed;
95
+ - `resume_state` is `complete`.
96
+
97
+ The gate fails when a run is `partial` or `unsafe-to-resume`, when a required check is skipped/failed, or when scope changed without approval. That is intentional: partial work can be valuable, but it should not silently pass a merge gate.
98
+
99
+ ## What to keep out
100
+
101
+ Do not put raw prompts, full transcripts, source code, exact proprietary paths, secrets, customer data, or raw tool output in the receipt. Use coarse globs, hashes, CI URLs, artifact IDs, pass/fail states, and human-readable next safe actions.
102
+
103
+ ## Why this is different from a receipt field list
104
+
105
+ A field list says what happened. A review primitive says what the next system is allowed to do with that evidence.
106
+
107
+ If the artifact cannot reject a PR, pause a handoff, or force review when the run became partial/unsafe, it is probably just a nicer `plan.md`.
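The pass conditions listed under "Minimal gate" in the review primitive guide above can be sketched as a single function. This is not the shipped `check-review-receipt.mjs`, whose contents are not shown here; `reviewGate` is a hypothetical name, and treating any `kind` that starts with `required_` as a required check is an assumption drawn from the sample values `required_test` and `required_check`.

```javascript
// Hypothetical gate over an agent.review_primitive_receipt.v1 receipt.
function reviewGate(receipt) {
  const failures = [];

  if (receipt.type !== "agent.review_primitive_receipt.v1") {
    failures.push("wrong type");
  }
  if (!receipt.assignment_id || !receipt.run_id) {
    failures.push("missing assignment_id or run_id");
  }
  const bounds = receipt.approved_boundaries || {};
  if (!Array.isArray(bounds.read) || !Array.isArray(bounds.write)) {
    failures.push("missing approved read/write boundaries");
  }
  // Every scope or access change must carry an explicit approval.
  for (const change of receipt.scope_access_changes || []) {
    if (change.approved !== true) {
      failures.push(`unapproved scope change: ${change.change}`);
    }
  }
  // Assumption: kinds beginning with "required_" mark required checks.
  for (const check of receipt.commands_and_checks || []) {
    const required = typeof check.kind === "string" && check.kind.startsWith("required_");
    if (required && check.status !== "passed") {
      failures.push(`required check not passed: ${check.name}`);
    }
  }
  // Partial or unsafe-to-resume runs never pass the gate.
  if (receipt.resume_state !== "complete") {
    failures.push(`resume_state is ${receipt.resume_state}`);
  }
  return { decision: failures.length === 0 ? "pass" : "fail", failures };
}
```

Returning the list of failures alongside the decision lets a CI job print why a handoff was stopped without reading anything beyond the receipt itself.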
@@ -0,0 +1,87 @@
1
+ # Skill policy receipts
2
+
3
+ Use this when an Agent Skill, `CLAUDE.md`, hook, or project rule says "do not touch X" but the agent can still drift into the forbidden path.
4
+
5
+ The goal is not to log prompts or source code. The goal is a tiny, privacy-safe receipt that proves the run checked the policy boundary before writing code and again after writing code.
6
+
7
+ This was prompted by a live `r/ClaudeCode` thread where a Skill told Claude Code not to create unit tests for internal services, but the run still generated one. Natural-language policy alone was too soft; the missing piece was an inspectable guard.
8
+
9
+ ## Boundary to prove
10
+
11
+ For every requested change, capture:
12
+
13
+ ```json
14
+ {
15
+ "receipt_type": "skill.policy.v1",
16
+ "skill": "unit-test-boundary",
17
+ "request_id": "local-run-2026-05-28T12:00Z",
18
+ "policy_scope": "unit-test targets",
19
+ "targets": [
20
+ {
21
+ "target": "src/public-api/client.test.ts",
22
+ "decision": "allowed",
23
+ "reason": "public API surface"
24
+ },
25
+ {
26
+ "target": "src/internal/billing/reconciler.test.ts",
27
+ "decision": "refused",
28
+ "reason": "internal service tests are out of scope for this Skill"
29
+ }
30
+ ],
31
+ "write_started": false,
32
+ "post_write_guard": "not_run",
33
+ "stopped_at": "policy_refused"
34
+ }
35
+ ```
36
+
37
+ Keep values coarse. Do not include code, secrets, customer names, stack traces, raw tool output, or full transcripts.
38
+
39
+ ## Minimal Skill guard
40
+
41
+ Add a short preflight before the Skill writes files:
42
+
43
+ ```markdown
44
+ ## Policy preflight
45
+
46
+ Before writing tests:
47
+
48
+ 1. List the intended test targets.
49
+ 2. Mark each target as `allowed` or `refused`.
50
+ 3. Refuse before writing if any target imports or exercises internal services.
51
+ 4. Emit a `skill.policy.v1` receipt with target names or coarse globs, decision, reason, and `write_started=false` when refused.
52
+ 5. Only after every target is allowed, write files.
53
+ 6. After writing, run the post-write guard and emit whether it passed.
54
+ ```
55
+
56
+ Then add a post-write check that is simple enough for an agent to run reliably:
57
+
58
+ ```bash
59
+ # Example: exit non-zero if generated unit tests import internal services.
60
+ ! grep -R --include='*test.*' --include='*spec.*' \
61
+ "from ['\"]\.\./\.\./internal\|from ['\"]@/internal\|require(['\"]@/internal" .
62
+ ```
63
+
64
+ Adjust the grep for your repo. The important part is the receipt shape:
65
+
66
+ - `policy_target_listed`
67
+ - `policy_decision_allowed` / `policy_decision_refused`
68
+ - `refusal_reason`
69
+ - `write_started`
70
+ - `post_write_guard_passed` / `post_write_guard_failed`
71
+ - `stopped_at`
72
+
73
+ ## Why this belongs next to context receipts
74
+
75
+ A Skill can be loaded and still fail to obey the boundary. That is the same class of problem as a healthy MCP server with tools invisible in the client, or a context file generated but not actually selected by the agent.
76
+
77
+ The useful question is: **where did the boundary proof stop?**
78
+
79
+ - Skill loaded, but no target list: policy was never made operational.
80
+ - Target list exists, but no decisions: policy was considered but not enforced.
81
+ - Refused target exists, but `write_started=true`: refusal came too late.
82
+ - Post-write guard failed: generated code crossed the forbidden boundary.
83
+ - Guard passed: the run has a small, reviewable receipt instead of only a confident claim.
84
+
85
+ ## Try the copyable Skill recipe
86
+
87
+ See [`examples/agent-skills/skill-policy-receipts/`](../examples/agent-skills/skill-policy-receipts/) for a small `SKILL.md` recipe you can copy into Claude Code/OpenClaw-style Skill workflows.
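The "where did the boundary proof stop?" list in the skill policy guide above maps naturally onto a small classifier. The sketch below is illustrative: `skillBoundaryStop` and its return labels are hypothetical, and the `post_write_guard` values `passed` and `failed` are assumed, since the sample receipt only shows `not_run`.

```javascript
// Hypothetical classifier for a skill.policy.v1 receipt.
function skillBoundaryStop(receipt) {
  const targets = Array.isArray(receipt.targets) ? receipt.targets : [];

  // Skill loaded, but no target list: policy was never made operational.
  if (targets.length === 0) {
    return "no_target_list";
  }
  // Target list exists, but some target has no decision.
  const decided = targets.every(
    (t) => t.decision === "allowed" || t.decision === "refused"
  );
  if (!decided) {
    return "no_decisions";
  }
  // A refusal only counts if it happened before any write.
  const anyRefused = targets.some((t) => t.decision === "refused");
  if (anyRefused) {
    return receipt.write_started === true ? "refusal_too_late" : "policy_refused";
  }
  // Assumed guard values: "passed" and "failed".
  if (receipt.post_write_guard === "failed") {
    return "post_write_guard_failed";
  }
  if (receipt.post_write_guard === "passed") {
    return "guard_passed";
  }
  return "post_write_guard_not_run";
}
```

Applied to the sample receipt in the guide, which has one refused target and `write_started` set to `false`, this returns `policy_refused`, matching the sample's `stopped_at`.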
@@ -0,0 +1,95 @@
1
+ # Subagent role receipts
2
+
3
+ Custom subagents are useful only if the caller can tell which role instructions actually governed the delegated work.
4
+
5
+ Use this recipe when a project defines Codex/Claude Code/Cursor/OpenClaw-style subagents and wants a privacy-safe receipt for the role boundary: which role was requested, which instruction source was loaded, which tools/capabilities were allowed or deferred, and where the subagent stopped before crossing an unsafe boundary.
6
+
7
+ This is not a claim that every agent runner uses the same file format. Treat `agents.toml` as a portable **example** for role definitions, and treat the receipt as the stable artifact: evidence about the role boundary without logging raw prompts, source code, transcripts, tool output, secrets, or customer data.
8
+
9
+ ## When this helps
10
+
11
+ Use a subagent role receipt when:
12
+
13
+ - a manager agent delegates work to a specialist reviewer, tester, security checker, migration planner, or docs writer;
14
+ - the role has a narrower policy than the main agent;
15
+ - the subagent has restricted tools, MCP servers, or write permissions;
16
+ - the role should refuse mutation and only report findings;
17
+ - a human reviewer needs to know which role instructions were loaded before trusting the result.
18
+
19
+ ## Example role definition
20
+
21
+ The example in [`examples/subagent-role-receipts/agents.toml`](../examples/subagent-role-receipts/agents.toml) defines two project-local roles:
22
+
23
+ - `blast-radius-reviewer` — reviews AI-generated PRs by operational boundaries before merge;
24
+ - `temporal-authority-checker` — checks whether docs/specs are current or superseded before an agent writes code.
25
+
26
+ The file is intentionally small so it can be adapted to the runner you use.
27
+
28
+ ## Receipt shape
29
+
30
+ Attach this to a PR body, task handoff, review-bot comment, or run summary.
31
+
32
+ ```json
33
+ {
34
+ "type": "subagent.role_boundary.v1",
35
+ "delegation": {
36
+ "requested_role": "blast-radius-reviewer",
37
+ "effective_role": "blast-radius-reviewer",
38
+ "role_source": "agents.toml",
39
+ "role_source_hash": "sha256:example-only",
40
+ "caller": "manager-agent"
41
+ },
42
+ "instructions": {
43
+ "loaded": true,
44
+ "source_kind": "project-local-role-definition",
45
+ "raw_instruction_logged": false,
46
+ "policy_summary": [
47
+ "review by blast radius, not diff size",
48
+ "do not approve merge when boundary evidence is ambiguous"
49
+ ]
50
+ },
51
+ "capabilities": {
52
+ "writes_allowed": false,
53
+ "tools_allowed": ["read", "grep", "test-summary"],
54
+ "tools_deferred_or_unavailable": ["shell-write", "deploy", "migration-apply"],
55
+ "mcp_servers_allowed": []
56
+ },
57
+ "boundary_decisions": [
58
+ {
59
+ "boundary": "schema_or_data_contract",
60
+ "status": "ambiguous",
61
+ "decision": "blocks_merge",
62
+ "reason": "migration rollback evidence missing"
63
+ }
64
+ ],
65
+ "handoff": {
66
+ "result_kind": "review_receipt",
67
+ "stopped_at": "ambiguous boundary before merge approval",
68
+ "next_safe_action": "ask backend owner to confirm rollback and reader compatibility"
69
+ },
70
+ "privacy": {
71
+ "raw_prompt_logged": false,
72
+ "raw_source_logged": false,
73
+ "raw_tool_output_logged": false,
74
+ "transcript_logged": false,
75
+ "secrets_logged": false,
76
+ "customer_data_logged": false
77
+ }
78
+ }
79
+ ```
80
+
81
+ ## Minimal checklist
82
+
83
+ Before trusting a delegated subagent result, ask for:
84
+
85
+ - requested role and effective role match;
86
+ - role definition source and coarse hash/version;
87
+ - whether role instructions loaded through the intended path;
88
+ - allowed/refused tool and write capabilities;
89
+ - boundary decisions made by the role;
90
+ - where the role stopped and the next safe action;
91
+ - explicit privacy flags showing raw prompts/source/tool output were not logged.
92
+
93
+ ## What not to log
94
+
95
+ Do not include raw prompts, full instructions, transcripts, source code, file paths that expose private structure, tool output, secrets, credentials, customer data, stack traces, or proprietary diffs. Prefer coarse names, hashes, counts, decision states, and review-owner labels.
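The "Minimal checklist" in the subagent role guide above can be turned into a structural check on the receipt. This is a rough sketch under stated assumptions: `checkRoleReceipt` is a hypothetical name, the field names come from the sample receipt, and requiring every privacy flag to be exactly `false` is one possible reading of "explicit privacy flags".

```javascript
// Hypothetical checklist for a subagent.role_boundary.v1 receipt.
// Returns the checklist items that the receipt does not satisfy.
function checkRoleReceipt(receipt) {
  const missing = [];
  const delegation = receipt.delegation || {};
  const instructions = receipt.instructions || {};
  const capabilities = receipt.capabilities || {};
  const handoff = receipt.handoff || {};
  const privacyFlags = Object.values(receipt.privacy || {});

  if (!delegation.requested_role || delegation.requested_role !== delegation.effective_role) {
    missing.push("requested and effective role match");
  }
  if (!delegation.role_source || !delegation.role_source_hash) {
    missing.push("role source and hash");
  }
  if (instructions.loaded !== true) {
    missing.push("role instructions loaded");
  }
  if (typeof capabilities.writes_allowed !== "boolean" || !Array.isArray(capabilities.tools_allowed)) {
    missing.push("tool and write capabilities");
  }
  if (!Array.isArray(receipt.boundary_decisions)) {
    missing.push("boundary decisions");
  }
  if (!handoff.stopped_at || !handoff.next_safe_action) {
    missing.push("stop point and next safe action");
  }
  // Assumption: at least one privacy flag exists and all of them are false.
  if (privacyFlags.length === 0 || privacyFlags.some((flag) => flag !== false)) {
    missing.push("privacy flags");
  }
  return missing;
}
```

An empty result means the receipt is complete enough to review. It does not mean the subagent's findings are correct.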
@@ -0,0 +1,123 @@
1
+ # Temporal context receipts
2
+
3
+ Use this when a long-lived AI coding project has old specs, ADRs, plans, or TODOs that still match grep but are no longer the current authority.
4
+
5
+ The goal is not to delete history or log raw project content. The goal is a tiny, privacy-safe receipt that proves the agent separated **current authority** from **historical citation** before it edits code.
6
+
7
+ This was prompted by a live `r/ClaudeCode` thread about the temporal problem in long-running projects: Claude Code can find every old plan, but grep is blind to time. If old docs do not carry status, date, and supersession metadata, the agent can treat a stale architecture note as current truth.
8
+
9
+ ## Boundary to prove
10
+
11
+ For every coding run that reads design/context docs, capture a coarse receipt like this:
12
+
13
+ ```json
14
+ {
15
+ "receipt_type": "context.temporal_authority.v1",
16
+ "request_id": "local-run-2026-05-28T16:00Z",
17
+ "current_authority": {
18
+ "file": "CURRENT_STATE.md",
19
+ "status": "current",
20
+ "as_of": "2026-05-28",
21
+ "scope": "checkout-flow"
22
+ },
23
+ "sources_considered": [
24
+ {
25
+ "file": "specs/2025-checkout-rewrite.md",
26
+ "status": "superseded",
27
+ "superseded_by": "CURRENT_STATE.md#checkout-flow",
28
+ "decision": "historical_citation_only"
29
+ },
30
+ {
31
+ "file": "specs/2026-checkout-risk-notes.md",
32
+ "status": "current",
33
+ "scope": "checkout-flow",
34
+ "decision": "allowed_as_supporting_context"
35
+ }
36
+ ],
37
+ "ambiguous_sources": [],
38
+ "write_started": true,
39
+ "stopped_at": "temporal_authority_resolved"
40
+ }
41
+ ```
42
+
43
+ Keep values coarse. Do not include source code, raw plans, prompts, transcripts, secrets, customer names, stack traces, private paths, or raw tool output.
44
+
45
+ ## Minimal doc convention
46
+
47
+ Give every long-lived context file a small frontmatter header:
48
+
49
+ ```markdown
50
+ ---
51
+ status: current # current | superseded | archived
52
+ scope: checkout-flow
53
+ date: 2026-05-28
54
+ superseded_by: null
55
+ ---
56
+ ```
57
+
58
+ For old specs:
59
+
60
+ ```markdown
61
+ ---
62
+ status: superseded
63
+ scope: checkout-flow
64
+ date: 2025-11-10
65
+ superseded_by: ../CURRENT_STATE.md#checkout-flow
66
+ ---
67
+ ```
68
+
69
+ Then make `CURRENT_STATE.md` the short authority file an agent must read first:
70
+
71
+ ```markdown
72
+ # Current state
73
+
74
+ ## checkout-flow
75
+
76
+ - status: current
77
+ - as_of: 2026-05-28
78
+ - current authority: this section
79
+ - related historical specs:
80
+ - specs/2025-checkout-rewrite.md (superseded)
81
+ - specs/2026-checkout-risk-notes.md (current supporting context)
82
+
83
+ Agents may cite superseded specs for rationale, but must not implement from them unless the current authority explicitly reactivates that behavior.
84
+ ```
85
+
86
+ ## Agent preflight
87
+
88
+ Before editing code in a long-lived project, ask the agent to do this:
89
+
90
+ ```markdown
91
+ ## Temporal authority preflight
92
+
93
+ Before writing code:
94
+
95
+ 1. Read `CURRENT_STATE.md` or the repo's current-state equivalent.
96
+ 2. List design/spec/TODO/context files found for the requested scope.
97
+ 3. Mark each source as `current`, `superseded`, `archived`, or `ambiguous`.
98
+ 4. If any relevant source is `ambiguous` or lacks `superseded_by` while contradicting current authority, stop before writing.
99
+ 5. Emit a `context.temporal_authority.v1` receipt with coarse file names/globs, status, decision, `write_started`, and `stopped_at`.
100
+ 6. Only use superseded docs as historical citations, not as implementation authority.
101
+ ```
102
+
103
+ Useful receipt markers:
104
+
105
+ - `context_current_authority`
106
+ - `historical_spec_citation`
107
+ - `status_superseded`
108
+ - `superseded_by_resolved`
109
+ - `ambiguous_temporal_source`
110
+ - `stale_source_ignored`
111
+ - `write_refused_until_authority_resolved`
112
+ - `preflight_temporal_decision`
113
+
114
+ ## Where this catches failures
115
+
116
+ - Old spec matches grep, but has `status: superseded`: agent can cite it but should not implement from it.
117
+ - Old spec conflicts with `CURRENT_STATE.md` and has no `superseded_by`: agent should stop and ask for authority resolution.
118
+ - Multiple current files claim the same scope: agent should stop before writing.
119
+ - Current authority exists, but the run never read it: the receipt should show `stopped_at=current_authority_missing` and `write_started=false`.
120
+
121
+ ## Try the copyable example
122
+
123
+ See [`examples/temporal-context-receipts/`](../examples/temporal-context-receipts/) for a minimal `CURRENT_STATE.md`, superseded spec, current supporting note, and receipt example.
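The preflight steps in the temporal context guide above reduce to a stop-or-continue decision. The sketch below assumes each source has already been labeled: `temporalPreflight` is a hypothetical name, and `contradicts_current` is a hypothetical flag that whoever classifies the sources would have to set. The `stopped_at` values reuse strings that appear in the guide.

```javascript
// Hypothetical preflight decision for a context.temporal_authority.v1 receipt.
// sources: [{ file, status, superseded_by, contradicts_current }]
function temporalPreflight(currentAuthorityRead, sources) {
  // The current-state file must be read before anything else is trusted.
  if (!currentAuthorityRead) {
    return { write_allowed: false, stopped_at: "current_authority_missing", blocking: [] };
  }
  // Stop on ambiguous sources, and on contradicting sources with no superseded_by link.
  const blocking = sources
    .filter(
      (source) =>
        source.status === "ambiguous" ||
        (source.contradicts_current === true && !source.superseded_by)
    )
    .map((source) => source.file);

  if (blocking.length > 0) {
    return { write_allowed: false, stopped_at: "write_refused_until_authority_resolved", blocking };
  }
  return { write_allowed: true, stopped_at: "temporal_authority_resolved", blocking: [] };
}
```

Superseded sources with a resolved `superseded_by` pass through here, because the guide allows them as historical citations. Keeping them out of implementation decisions remains the agent's job.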
@@ -100,6 +100,27 @@ Minimal JSONL event names:
100
100
  {"event":"subagent.toolsearch.matrix.completed","tested_axis":"tools_frontmatter_shape","audit_gap":"proves ToolSearch exposure, not semantic tool relevance or runtime call success"}
101
101
  ```
102
102
 
103
+ ## Retrieval / code-search smoke
104
+
105
+ For semantic code search, repo RAG, or MCP tools such as Claude Context, separate "search returned" from "agent context loaded":
106
+
107
+ - which index snapshot/version was used, without raw local codebase paths;
108
+ - what query/category/filter identity selected the candidates, without raw query text;
109
+ - which result ids/chunk hashes were returned, with rank, score bucket, stale flag, duplicate marker, path hash/extension, and range bucket;
110
+ - which returned chunks were actually loaded into the agent context;
111
+ - which chunks were suppressed as duplicate, stale, clipped, policy-blocked, or over budget;
112
+ - whether raw code, raw prompts, raw paths, customer names, URLs, secrets, and ticket text stayed out of the receipt;
113
+ - the audit gap: this proves retrieval/loading boundaries, not semantic answer quality.
114
+
115
+ Minimal JSONL event names:
116
+
117
+ ```jsonl
118
+ {"event":"code.index.snapshot.used","snapshot_id_hash":"sha256:...","codebase_path_hash":"sha256:...","indexed_chunk_count_bucket":"over_1k","raw_codebase_path_copied":false}
119
+ {"event":"code.search.performed","query_hash":"sha256:...","query_category":"auth_debug","candidate_count_bucket":"over_1k","raw_query_copied":false}
120
+ {"event":"code.search.result.returned","rank":1,"chunk_id_hash":"sha256:...","chunk_text_hash":"sha256:...","path_hash":"sha256:...","score_bucket":"high","stale":false,"raw_code_copied":false}
121
+ {"event":"context.input.loaded","kind":"retrieved_code_chunks","loaded_chunk_count":3,"suppressed_chunk_count":2,"suppression_reasons":["duplicate","stale_snapshot_chunk"],"raw_code_copied":false}
122
+ ```
123
+
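The split between "search returned" and "agent context loaded" in the retrieval smoke section above can be sketched as a summary step that emits only counts and reasons. This is an illustrative sketch: `summarizeLoadedChunks` and the per-result `suppressed_reason` field are hypothetical, while the output keys mirror the `context.input.loaded` sample event.

```javascript
// Hypothetical summary step: turns per-result records into one context.input.loaded event.
// results: [{ chunk_id_hash, suppressed_reason }] where suppressed_reason is absent for loaded chunks.
function summarizeLoadedChunks(results) {
  const suppressed = results.filter((result) => Boolean(result.suppressed_reason));
  const loadedCount = results.length - suppressed.length;
  // Deduplicate and sort the reasons so the event is stable across runs.
  const reasons = [...new Set(suppressed.map((result) => result.suppressed_reason))].sort();

  return {
    event: "context.input.loaded",
    kind: "retrieved_code_chunks",
    loaded_chunk_count: loadedCount,
    suppressed_chunk_count: suppressed.length,
    suppression_reasons: reasons,
    // Only counts and reason labels are emitted, so no chunk text reaches the event.
    raw_code_copied: false,
  };
}
```

Because the function never reads chunk text, the `raw_code_copied: false` claim holds by construction for this step; whether earlier steps copied code is outside what it can prove.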
103
124
  ## Usage attribution smoke
104
125
 
105
126
  For `/usage`, `/context`, `/doctor`, or other context-budget breakdowns, map each displayed category to evidence that can be reviewed without exposing private content:
@@ -0,0 +1,22 @@
1
+ # Skill policy receipts recipe
2
+
3
+ This is a copyable Agent Skill recipe for cases where a natural-language rule needs an inspectable guard.
4
+
5
+ Example use cases:
6
+
7
+ - a Skill must not generate tests for internal services;
8
+ - an agent must not edit generated files;
9
+ - a hook must not call production APIs;
10
+ - a migration helper must default to preview/dry-run unless `--apply` is explicit.
11
+
12
+ Copy `SKILL.md` into your Skill registry, adjust the policy and post-write guard, then ask the agent to emit `skill.policy.v1` receipts before writes and after guard checks.
13
+
14
+ The receipt should prove:
15
+
16
+ - intended targets were listed;
17
+ - each target was allowed or refused;
18
+ - refusal happened before writes;
19
+ - post-write guard passed or failed;
20
+ - no raw prompt, code, secret, customer data, stack trace, or full transcript was logged.
21
+
22
+ Related guide: [`docs/skill-policy-receipts.md`](../../../docs/skill-policy-receipts.md).