pluribus-context 0.3.33 → 0.3.35
- package/CHANGELOG.md +19 -0
- package/README.md +7 -6
- package/docs/ai-pr-review-receipts.md +153 -0
- package/docs/canonical-output-receipts.md +107 -0
- package/docs/community-review-packet.md +11 -11
- package/docs/context-budget-receipts.md +22 -0
- package/docs/context-input-evidence.md +15 -0
- package/docs/dynamic-workflow-run-receipts.md +158 -0
- package/docs/install-plan-receipts.md +77 -0
- package/docs/mcp-tool-visibility-receipts.md +67 -0
- package/docs/review-primitive-gate.md +107 -0
- package/docs/skill-policy-receipts.md +87 -0
- package/docs/subagent-role-receipts.md +95 -0
- package/docs/temporal-context-receipts.md +123 -0
- package/examples/agent-skills/context-receipts/SKILL.md +21 -0
- package/examples/agent-skills/skill-policy-receipts/README.md +22 -0
- package/examples/agent-skills/skill-policy-receipts/SKILL.md +77 -0
- package/examples/ai-pr-review-receipts/.github/pull_request_template.md +31 -0
- package/examples/ai-pr-review-receipts/README.md +5 -0
- package/examples/canonical-output-receipts/canonical-output-receipt.json +55 -0
- package/examples/claude-code-review-hook/README.md +74 -0
- package/examples/claude-code-review-hook/check-review-receipt-hook.mjs +80 -0
- package/examples/claude-code-review-hook/sample-task-completed-event.json +6 -0
- package/examples/context-input-evidence/code-search-retrieval-otel-trace.json +879 -0
- package/examples/context-input-evidence/code-search-retrieval-receipt.ndjson +8 -0
- package/examples/context-input-evidence/convert-code-search-retrieval-log.mjs +280 -0
- package/examples/context-input-evidence/sample-code-search-retrieval-log.jsonl +5 -0
- package/examples/dynamic-workflow-run-receipts/README.md +18 -0
- package/examples/dynamic-workflow-run-receipts/workflow-run-receipt.json +112 -0
- package/examples/install-plan-receipts/README.md +34 -0
- package/examples/install-plan-receipts/agent-install-plan-receipt.json +56 -0
- package/examples/review-primitive-gate/README.md +19 -0
- package/examples/review-primitive-gate/check-review-receipt.mjs +100 -0
- package/examples/review-primitive-gate/fail-review-receipt.json +42 -0
- package/examples/review-primitive-gate/pass-review-receipt.json +54 -0
- package/examples/subagent-role-receipts/README.md +15 -0
- package/examples/subagent-role-receipts/agents.toml +36 -0
- package/examples/temporal-context-receipts/CURRENT_STATE.md +13 -0
- package/examples/temporal-context-receipts/specs/2025-checkout-rewrite.md +10 -0
- package/examples/temporal-context-receipts/specs/2026-checkout-risk-notes.md +10 -0
- package/examples/temporal-context-receipts/temporal-authority-receipt.json +27 -0
- package/package.json +1 -1
- package/src/utils/version.js +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,25 @@
 
 All notable changes to Pluribus are documented here.
 
+## 0.3.35 - 2026-05-31
+
+- Added canonical-output receipts for preserving the last clean version of an artifact as versioned evidence instead of treating old chats as source of truth.
+- Added a review primitive gate example that fails unsafe agent handoffs on unapproved scope changes, skipped/failed required checks, missing evidence, privacy leaks, or `partial` / `unsafe-to-resume` state.
+- Added a Claude Code hook bridge that runs the review primitive gate from lifecycle hook JSON, making handoff proof enforceable at `TaskCompleted` / `PostCompact` style boundaries.
+- Tracked fresh market signals around dynamic workflow cost accounting, post-compact recovery, active idea indexes, Skill policy alternatives, and control-plane buying thresholds.
+
+- Add Skill policy receipts guide and copyable Agent Skill recipe for hard-rule enforcement, target allow/refuse decisions, and post-write guards.
+- Add temporal context receipts guide and copyable current-state/spec example for long-lived projects where old docs still match search but are no longer implementation authority.
+- Add AI PR review receipts guide and copyable PR template for reviewing agent-generated changes by blast radius instead of diff size alone.
+- Add subagent role receipts guide and copyable `agents.toml` example for proving requested role, effective role, loaded instruction source, allowed/refused capabilities, stop point, and next safe action.
+- Add install-plan receipts guide and copyable example for setup scripts that configure MCP servers, Skills, instruction files, hooks, or plugins across multiple AI coding tools before writes begin.
+- Added an MCP tool visibility receipts checklist for debugging servers that launch and return `tools/list` directly but do not surface tools in the actual agent client catalog.
+- Tracked GBrain `unify-types` mutation-mode feedback as an external receipt signal for protected migrations that may default from dry-run into writes.
+
+## 0.3.34 - 2026-05-26
+
+- Repositioned README and the community review packet around privacy-safe agent context receipts first, with instruction-file audit/sync as the supporting workflow, so directory reviewers do not mistake Pluribus for another generic ContextOps, memory, RAG, or rules-sync tool.
+
 ## 0.3.33 - 2026-05-26
 
 - Added Agent Skill metadata frontmatter and a `/usage` attribution smoke to the context receipts skill so directory reviewers can evaluate it as a standard SKILL.md and connect receipts to component-level usage breakdowns.
package/README.md
CHANGED
@@ -6,15 +6,15 @@
 [](https://x.com/RibeiroCaioCLW)
 [](LICENSE)
 
->
+> Privacy-safe context receipts for AI coding agents — plus audits/sync for the instruction files they actually load.
 
-Pluribus (`pluribus-context` on npm, `pluribus` on the command line) is
+Pluribus (`pluribus-context` on npm, `pluribus` on the command line) is a CLI for **agent context evidence**. It helps teams answer: what instruction file, skill, MCP/tool schema, memory/RAG result, compaction, pruning step, or generated rule actually crossed an agent boundary — without logging raw prompts, source code, tool output, paths, transcripts, secrets, or customer data.
 
-
+The original sync workflow is still useful: Pluribus can keep project instructions, conventions, constraints, and team context in one versioned `pluribus.md` source of truth, then generate native files for Claude Code, Cursor, GitHub Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob. The sharper wedge is evidence: read-only audits and receipts show where context keeps fidelity, downgrades to a generic fallback, duplicates, stays deferred, hydrates, gets pruned, or rolls back after failed compaction.
 
-It is **not** a persistent memory layer, retrieval system, agent orchestrator, or agent-merging framework. Think `CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`, `AGENTS.md
+It is **not** a persistent memory layer, retrieval system, agent orchestrator, enterprise ContextOps platform, or agent-merging framework. Think evidence for context boundaries: `CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`, `AGENTS.md`, MCP Tool Search, Agent Skills, RAG/code-search, pruning, and compaction — with privacy-safe receipts instead of raw content dumps.
 
-**Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and
+**Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and disposable 60-second smoke tests. If you only run one command for the cross-tool audit, try `npx --yes pluribus-context@latest audit --json --fidelity-report` to see native discovery surfaces, generic fallbacks, load evidence, duplicate-load selection evidence, manual activation requirements, effective context scope, and semantic differences. For the agent-observability wedge, start with [context-budget receipts](docs/context-budget-receipts.md): privacy-safe evidence for what MCP schemas, skills, memory, subagents, CLI help, retrieval chunks, pruning runs, or compaction summaries crossed an agent boundary. If you want the same idea as a copyable skill, use the [context-receipts Agent Skill recipe](examples/agent-skills/context-receipts/). npm `latest` is currently aligned with the GitHub release; the review packet also documents a GitHub-release smoke fallback for future release-lag windows.
 
 ---
 
@@ -161,7 +161,7 @@ npx --yes pluribus-context@latest sync --dry-run
 
 If the preview looks right, run `npx --yes pluribus-context@latest sync` to write the tool-specific files.
 
-For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If you are reviewing Pluribus for a list, newsletter, or tool directory, use the [Community Review Packet](docs/community-review-packet.md) for directory submission fields, a one-line description, safety notes, and a disposable 60-second smoke test. Maintainers can track package/repo discovery with the [Discovery Smoke Checks](docs/discovery-smoke.md).
+For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If an MCP server is healthy but tools are missing in Claude Code/Cursor/Codex, use the [MCP tool visibility receipts](docs/mcp-tool-visibility-receipts.md) checklist to separate launch, handshake, `tools/list`, client catalog, and first invocation failures. If a Claude Code/OpenClaw-style Skill states a hard rule but the run still violates it, use the [Skill policy receipts](docs/skill-policy-receipts.md) guide and [copyable Skill recipe](examples/agent-skills/skill-policy-receipts/) to turn target decisions, refusals, and post-write guards into privacy-safe evidence. If long-lived projects keep old specs/TODOs that still match grep but are no longer authoritative, use [Temporal context receipts](docs/temporal-context-receipts.md) and the [copyable current-state example](examples/temporal-context-receipts/) to separate current authority from historical citations before an agent writes code. If AI-generated pull requests are hard to review because diff size hides operational risk, use [AI PR review receipts](docs/ai-pr-review-receipts.md) and the [copyable PR template](examples/ai-pr-review-receipts/) to review by blast radius: schema/data contracts, async paths, rollout gates, side effects, and ambiguous boundaries. If you delegate work to Codex/Claude Code/Cursor/OpenClaw-style specialist subagents, use [Subagent role receipts](docs/subagent-role-receipts.md) and the [example role definitions](examples/subagent-role-receipts/) to prove the requested role, effective role, loaded instruction source, allowed/refused capabilities, stop point, and next safe action. If you run Claude Code-style dynamic workflows, ultracode, or local LLM gateway orchestration that spawns many agents, use [Dynamic workflow run receipts](docs/dynamic-workflow-run-receipts.md) and the [copyable workflow example](examples/dynamic-workflow-run-receipts/) to prove phases, per-agent roles/models, context loaded/skipped, tool grants, token spend buckets, per-agent fuses, heartbeat, stop reasons, and known gaps. If you need CI/reviewers to decide whether an agent handoff can continue, must be reviewed, or should be rejected, use the [Review primitive gate](docs/review-primitive-gate.md), its [copyable gate example](examples/review-primitive-gate/), and the [Claude Code review hook bridge](examples/claude-code-review-hook/) to validate assignment boundaries, approved scope/access changes, required checks, privacy flags, and `complete / partial / unsafe-to-resume` state from CI or Claude Code `TaskCompleted` / `PostCompact` hooks. If Claude Projects, long chats, or compaction make the last clean artifact hard to recover, use [Canonical output receipts](docs/canonical-output-receipts.md) and the [copyable index example](examples/canonical-output-receipts/) to track stable IDs, paths, versions, exact grep phrases, decisions, rejected options, and next actions. If a setup script installs MCP servers, Skills, instruction files, hooks, or plugins across multiple agents, use [Install-plan receipts](docs/install-plan-receipts.md) and the [copyable example](examples/install-plan-receipts/) to prove planned writes, backups, network behavior, and `writes_started=false` before mutation. If you are reviewing Pluribus for a list, newsletter, or tool directory, use the [Community Review Packet](docs/community-review-packet.md) for directory submission fields, a one-line description, safety notes, and a disposable 60-second smoke test. Maintainers can track package/repo discovery with the [Discovery Smoke Checks](docs/discovery-smoke.md).
 
 ### Usage
 
@@ -407,6 +407,7 @@ If you've felt this pain, tell me about your setup. What tools do you use? How d
 - [OpenClaw Integration](docs/openclaw-integration.md) — how Pluribus generates `AGENTS.md` for OpenClaw
 - [Composable Contexts](docs/composable-contexts.md) — local/remote imports, merge behavior, and safety rules
 - [MCP Memory Handoff](docs/memory-mcp-handoff.md) — demo for keeping memory recall/store protocols aligned across tool-specific instruction files
+- [MCP Tool Visibility Receipts](docs/mcp-tool-visibility-receipts.md) — checklist for debugging healthy MCP servers whose tools do not appear in the agent client catalog
 - [Remote Composable Context Imports](docs/remote-composable-context-imports.md) — design notes for lockfile/cache/auth hardening
 - [Context Format Spec](spec/context-format.md) — the `pluribus.md` format reference
 - [Skills Format Spec](spec/skills-format.md) — how adapters work and how to write custom skills
@@ -0,0 +1,153 @@
+# AI PR review receipts
+
+AI-generated PRs are not risky because they are large or small. They are risky when the reviewer cannot tell which operational boundaries the agent touched.
+
+Use this recipe when Claude Code, Cursor, Codex, Copilot agents, OpenClaw, or another coding agent opens a PR and the team needs a compact review artifact before merge.
+
+The goal is not to log prompts, transcripts, source code, stack traces, secrets, customer data, or raw tool output. The goal is a privacy-safe receipt that proves the review unit: blast radius.
+
+## When this helps
+
+Use an AI PR review receipt when a change may affect:
+
+- database schema, migrations, backfills, or persisted data contracts;
+- readers/writers that run while a migration or rollout is in progress;
+- async jobs, queues, cron tasks, webhooks, retries, or background workers;
+- feature flags, rollout gates, kill switches, or compatibility shims;
+- external side effects such as payments, email, auth, billing, search indexes, analytics, or third-party APIs;
+- generated files, public APIs, plugin manifests, MCP/Skill/hook configuration, or security-sensitive project config.
+
+If none of these apply, the receipt can say so. That negative claim is still useful because it tells the human reviewer what the agent believes it did **not** touch.
+
+## Receipt shape
+
+Attach this as a PR body section, `REVIEW.md` note, check-run summary, or bot comment.
+
+```json
+{
+  "type": "review.blast_radius.v1",
+  "pr": {
+    "source": "agent-pr",
+    "review_requested": true,
+    "human_review_required": true
+  },
+  "boundaries": [
+    {
+      "name": "schema_or_data_contract",
+      "status": "touched",
+      "evidence": "migration file added; live reader compatibility checked",
+      "risk_tier": "high",
+      "review_owner": "backend"
+    },
+    {
+      "name": "async_or_background_path",
+      "status": "not_touched",
+      "evidence": "no queue/cron/webhook paths changed in diff summary",
+      "risk_tier": "low"
+    },
+    {
+      "name": "rollout_gate",
+      "status": "present",
+      "evidence": "feature flag path exists before new behavior is enabled",
+      "risk_tier": "medium"
+    },
+    {
+      "name": "external_side_effect",
+      "status": "ambiguous",
+      "evidence": "email sender import changed; no dry-run evidence found",
+      "risk_tier": "high",
+      "blocked_until": "reviewer confirms side-effect behavior"
+    }
+  ],
+  "tests_and_checks": [
+    {
+      "name": "unit_or_integration_tests",
+      "status": "passed",
+      "scope": "changed package only"
+    },
+    {
+      "name": "migration_or_rollback_check",
+      "status": "missing",
+      "blocks_merge": true
+    }
+  ],
+  "decision": {
+    "merge_ready": false,
+    "reason": "external side effect and rollback evidence are ambiguous",
+    "next_safe_action": "ask backend owner to review email behavior and migration rollback before merge"
+  },
+  "privacy": {
+    "raw_prompt_logged": false,
+    "raw_source_logged": false,
+    "raw_tool_output_logged": false,
+    "secrets_logged": false,
+    "customer_data_logged": false
+  }
+}
+```
+
+## Minimal PR template
+
+Copy this into `.github/pull_request_template.md` or a review-bot comment.
+
+```markdown
+## AI PR review receipt
+
+This PR was prepared or modified by an AI coding agent. Review by blast radius, not by diff size alone.
+
+### Boundary receipt
+
+| Boundary | Status | Evidence | Risk tier | Owner / blocker |
+| --- | --- | --- | --- | --- |
+| Schema / persisted data contract | `touched / not_touched / ambiguous` | | | |
+| Live reader/writer compatibility | `checked / missing / n/a` | | | |
+| Async jobs / queues / cron / webhooks | `touched / not_touched / ambiguous` | | | |
+| Rollout gate / feature flag / kill switch | `present / missing / n/a` | | | |
+| External side effects | `declared / not_touched / ambiguous` | | | |
+| Generated files / public API / plugin config | `touched / not_touched / ambiguous` | | | |
+
+### Checks
+
+- [ ] Tests relevant to touched boundaries passed.
+- [ ] Migration/backfill/rollback behavior is explicit, or not applicable.
+- [ ] External side effects are declared, or not touched.
+- [ ] Any `ambiguous` boundary has an owner before merge.
+
+### Privacy
+
+This receipt does not include raw prompts, transcripts, source code, secrets, customer data, stack traces, or raw tool output.
+
+### Decision
+
+`merge_ready: yes/no`
+
+`next_safe_action:`
+```
+
+## How to use with Pluribus
+
+Pluribus does not need to own your PR workflow. Use it as the neutral language for evidence that crossed an agent boundary:
+
+- `review_boundary_schema_data`
+- `live_reader_writer_compatibility`
+- `review_boundary_async_path`
+- `rollout_gate_present`
+- `external_side_effects_declared`
+- `not_touched_boundary_claim`
+- `ambiguous_boundary_blocks_merge`
+- `risk_tier_evidence`
+- `next_safe_action`
+
+The same terms can appear in a GitHub PR template, a Claude Code `/code-review` note, an OpenClaw task receipt, a CI check summary, or a release checklist.
+
+## Bad receipts
+
+Avoid receipts that say only:
+
+- “tests passed”;
+- “Claude reviewed it”;
+- “small PR”;
+- “no issues found”;
+- “looks safe.”
+
+Those are conclusions. A useful receipt names the boundary, the evidence, the risk tier, and the next safe action when something is ambiguous.
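The blocking rules implied by the AI PR review receipt (an `ambiguous` boundary, or a non-passing check marked `blocks_merge`, keeps `merge_ready` false) could be sketched as a small JavaScript helper. This is an illustration only and not code shipped in the package: the function name `evaluateBlastRadiusReceipt`, the blocker label strings, and the choice to treat any privacy flag other than `false` as a blocker are assumptions. Only the field names are taken from the example receipt.

```javascript
// Hypothetical helper (not part of pluribus-context): derives a merge
// decision from a review.blast_radius.v1-style receipt object.
function evaluateBlastRadiusReceipt(receipt) {
  const blockers = [];

  // An ambiguous boundary blocks merge until a reviewer resolves it.
  for (const boundary of receipt.boundaries ?? []) {
    if (boundary.status === "ambiguous") {
      blockers.push(`boundary:${boundary.name}`);
    }
  }

  // A check blocks merge only when it is flagged blocks_merge and did not pass.
  for (const check of receipt.tests_and_checks ?? []) {
    if (check.blocks_merge === true && check.status !== "passed") {
      blockers.push(`check:${check.name}`);
    }
  }

  // Assumption: every privacy flag must be explicitly false.
  for (const [flag, value] of Object.entries(receipt.privacy ?? {})) {
    if (value !== false) {
      blockers.push(`privacy:${flag}`);
    }
  }

  return { merge_ready: blockers.length === 0, blockers };
}

// Trimmed version of the example receipt from the guide.
const exampleReceipt = {
  type: "review.blast_radius.v1",
  boundaries: [
    { name: "schema_or_data_contract", status: "touched" },
    { name: "external_side_effect", status: "ambiguous" },
  ],
  tests_and_checks: [
    { name: "unit_or_integration_tests", status: "passed" },
    { name: "migration_or_rollback_check", status: "missing", blocks_merge: true },
  ],
  privacy: { raw_prompt_logged: false, secrets_logged: false },
};

console.log(evaluateBlastRadiusReceipt(exampleReceipt));
```

For the trimmed example, the helper reports the ambiguous external side effect and the missing rollback check as blockers, which lines up with the `decision.reason` given in the guide's receipt.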
@@ -0,0 +1,107 @@
|
|
|
1
|
+
# Canonical output receipts
|
|
2
|
+
|
|
3
|
+
Claude Projects, long Claude Code sessions, and other agent workspaces are useful archives, but search is a weak source of truth. Exact phrases can be hard to recover, project-scoped search may miss the last clean version, and a later chat can overwrite the user's memory of which artifact is authoritative.
|
|
4
|
+
|
|
5
|
+
Use a canonical output receipt when a session produces something that should be found and reused later: a master prompt, escalation memo, architecture decision, migration plan, test matrix, runbook, product brief, or reviewed context file.
|
|
6
|
+
|
|
7
|
+
The receipt is not persistent memory and not a transcript dump. It is a small index card for the last clean artifact: stable id/path, version, exact grep phrases, decisions, rejected alternatives, open questions, and next action — without logging raw private content, secrets, customer data, or full chat history.
|
|
8
|
+
|
|
9
|
+
## When this helps
|
|
10
|
+
|
|
11
|
+
Use this receipt when:
|
|
12
|
+
|
|
13
|
+
- a Claude Project or long chat produces a canonical artifact that must survive fuzzy chat search;
|
|
14
|
+
- several sessions produce competing versions and a reviewer needs the current one;
|
|
15
|
+
- the source chat may be compacted, archived, deleted, exported, or hard to search;
|
|
16
|
+
- a team needs to know which artifact should be copied into repo docs, `pluribus.md`, `CLAUDE.md`, `AGENTS.md`, a prompt library, or a ticket;
|
|
17
|
+
- exact phrases, dates, decisions, and rejected options matter more than the full conversation.
|
|
18
|
+
|
|
19
|
+
## Receipt shape
|
|
20
|
+
|
|
21
|
+
Attach this to the artifact, repo issue, PR body, project notes, or a `canonical_outputs.md` index.
|
|
22
|
+
|
|
23
|
+
```json
|
|
24
|
+
{
|
|
25
|
+
"type": "canonical.output.receipt.v1",
|
|
26
|
+
"artifact": {
|
|
27
|
+
"stable_id": "project-alpha-master-prompt-2026-05-30",
|
|
28
|
+
"name": "Project Alpha master prompt",
|
|
29
|
+
"kind": "master_prompt",
|
|
30
|
+
"canonical_path": "docs/prompts/project-alpha-master-prompt.md",
|
|
31
|
+
"current_version": "2026-05-30.1",
|
|
32
|
+
"content_hash": "sha256:example-only",
|
|
33
|
+
"status": "current",
|
|
34
|
+
"owner_label": "product-ops",
|
|
35
|
+
"created_at": "2026-05-30T21:40:00Z",
|
|
36
|
+
"last_reviewed_at": "2026-05-30T21:58:00Z"
|
|
37
|
+
},
|
|
38
|
+
"source": {
|
|
39
|
+
"workspace": "claude-project-alpha",
|
|
40
|
+
"source_session_id": "session-redacted-2026-05-30",
|
|
41
|
+
"source_tool": "claude-projects",
|
|
42
|
+
"source_chat_title": "Master prompt rebuild",
|
|
43
|
+
"source_url_or_path_redacted": true,
|
|
44
|
+
"raw_transcript_logged": false
|
|
45
|
+
},
|
|
46
|
+
"index": {
|
|
47
|
+
"exact_phrases_worth_grepping": [
|
|
48
|
+
"do not collapse escalation paths into summaries",
|
|
49
|
+
"billing exports are evidence, not source of truth",
|
|
50
|
+
"final prompt contract v3"
|
|
51
|
+
],
|
|
52
|
+
"tags": ["master-prompt", "billing", "escalation", "current-state"],
|
|
53
|
+
"related_artifacts": ["billing-escalation-runbook-2026-05-28"]
|
|
54
|
+
},
|
|
55
|
+
"decisions": {
|
|
56
|
+
"accepted": [
|
|
57
|
+
"Use repo-owned markdown as the canonical copy, not old chats",
|
|
58
|
+
"Keep escalation criteria in the prompt body and test cases in a separate appendix"
|
|
59
|
+
],
|
|
60
|
+
"rejected": [
|
|
61
|
+
{
|
|
62
|
+
"option": "Rely on Claude Project conversation search for recovery",
|
|
63
|
+
"reason": "exact phrase and project-scoped search were unreliable during rebuild"
|
|
64
|
+
}
|
|
65
|
+
],
|
|
66
|
+
"open_questions": [
|
|
67
|
+
"Does support need a shorter handoff summary for weekend rotations?"
|
|
68
|
+
],
|
|
69
|
+
"next_action": "Open a PR that adds the canonical prompt and this receipt to docs/prompts/"
|
|
70
|
+
},
|
|
71
|
+
"privacy": {
|
|
72
|
+
"raw_prompt_logged": false,
|
|
73
|
+
"raw_chat_logged": false,
|
|
74
|
+
"customer_data_logged": false,
|
|
75
|
+
"secrets_logged": false,
|
|
76
|
+
"proprietary_paths_logged": false
|
|
77
|
+
}
|
|
78
|
+
}
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
## Minimal checklist
|
|
82
|
+
|
|
83
|
+
Before treating an artifact as recoverable, capture:
|
|
84
|
+
|
|
85
|
+
- stable id, human name, artifact kind, canonical path, version/date, owner label, status, and content hash;
|
|
86
|
+
- source workspace/tool/session label, with private URLs or IDs redacted when needed;
|
|
87
|
+
- exact phrases worth grepping, tags, and related artifacts;
|
|
88
|
+
- decisions accepted, options rejected with reasons, open questions, and next action;
|
|
89
|
+
- privacy flags proving raw chats, raw prompts, customer data, secrets, and private paths were not logged.
|
|
90
|
+
|
|
91
|
+
## `canonical_outputs.md` sketch
|
|
92
|
+
|
|
93
|
+
For small teams, a plain markdown index is enough:
|
|
94
|
+
|
|
95
|
+
```markdown
|
|
96
|
+
# Canonical outputs
|
|
97
|
+
|
|
98
|
+
| Stable ID | Current path | Version | Status | Exact phrase to grep | Next action |
|
|
99
|
+
| --- | --- | --- | --- | --- | --- |
|
|
100
|
+
| project-alpha-master-prompt-2026-05-30 | docs/prompts/project-alpha-master-prompt.md | 2026-05-30.1 | current | final prompt contract v3 | PR canonical copy |
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
Old chats should be evidence. The source of truth should be the artifact plus the receipt.
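As a sketch of keeping that index in step with receipts, the function below renders one table row in the column order shown in the sketch. The property names on `entry` (`stableId`, `currentPath`, and so on) are assumptions chosen for illustration; the receipt keys for these values are not defined here.

```javascript
// Hypothetical helper; the property names on `entry` are assumptions.
function escapeCell(value) {
  // A raw pipe would split the Markdown cell; a newline would split the row.
  return String(value).replace(/\|/g, "\\|").replace(/\r?\n/g, " ");
}

// Column order: Stable ID, Current path, Version, Status,
// Exact phrase to grep, Next action.
function toIndexRow(entry) {
  const cells = [
    entry.stableId,
    entry.currentPath,
    entry.version,
    entry.status,
    entry.grepPhrase,
    entry.nextAction,
  ];
  return `| ${cells.map(escapeCell).join(" | ")} |`;
}

console.log(
  toIndexRow({
    stableId: "project-alpha-master-prompt-2026-05-30",
    currentPath: "docs/prompts/project-alpha-master-prompt.md",
    version: "2026-05-30.1",
    status: "current",
    grepPhrase: "final prompt contract v3",
    nextAction: "PR canonical copy",
  })
);
```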
|
|
104
|
+
|
|
105
|
+
## What not to log
|
|
106
|
+
|
|
107
|
+
Do not include raw chat transcripts, full prompts that contain private context, customer data, secrets, credentials, exact private paths, proprietary document bodies, or unredacted project URLs. Prefer hashes, stable ids, coarse tags, short grep phrases, version dates, and decision states.
|
|
@@ -4,11 +4,11 @@ Use this when reviewing Pluribus for a list, newsletter, package roundup, or too
|
|
|
4
4
|
|
|
5
5
|
## One-line description
|
|
6
6
|
|
|
7
|
-
Pluribus
|
|
7
|
+
Pluribus provides privacy-safe context receipts for AI coding agents, plus audits/sync for the instruction files used by Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob.
|
|
8
8
|
|
|
9
9
|
## Short listing copy
|
|
10
10
|
|
|
11
|
-
Pluribus is an open-source CLI for teams and solo developers who
|
|
11
|
+
Pluribus is an open-source CLI for teams and solo developers who need evidence about agent context boundaries. It emits privacy-safe receipts for what crossed, stayed deferred, duplicated, got pruned, or rolled back across MCP tools, Agent Skills, memory/RAG retrieval, subagents, compaction, and generated instruction files — without logging raw prompts, code, schemas, tool outputs, transcripts, paths, secrets, or customer data. It also treats project instructions, conventions, constraints, and shared team context as versioned Markdown, then generates each tool's expected context file (`CLAUDE.md`, `.cursorrules`, Copilot instructions, `AGENTS.md`, Windsurf/Continue rules, Zed rules, and Bob rules). The safest first command is a read-only audit:
|
|
12
12
|
|
|
13
13
|
```bash
|
|
14
14
|
npx --yes pluribus-context@latest audit
|
|
@@ -25,10 +25,10 @@ Use these fields for directories, awesome lists, or review forms that ask for a
|
|
|
25
25
|
| npm | https://www.npmjs.com/package/pluribus-context |
|
|
26
26
|
| License | MIT |
|
|
27
27
|
| Install / run | `npx --yes pluribus-context@latest audit` or `npm install -g pluribus-context@latest` |
|
|
28
|
-
| Category | AI coding tools / context management |
|
|
29
|
-
| Tags | `claude-code`, `cursor`, `copilot`, `openclaw`, `windsurf`, `continue`, `zed`, `bob`, `context-drift` |
|
|
30
|
-
| One sentence |
|
|
31
|
-
| 280-char blurb | Pluribus is an open-source CLI for
|
|
28
|
+
| Category | AI coding tools / agent observability / context management |
|
|
29
|
+
| Tags | `claude-code`, `cursor`, `copilot`, `openclaw`, `windsurf`, `continue`, `zed`, `bob`, `context-receipts`, `context-drift`, `mcp`, `agent-skills`, `opentelemetry` |
|
|
30
|
+
| One sentence | Emit privacy-safe receipts for what context crossed agent boundaries, and audit or sync the generated instruction files used by Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob. |
|
|
31
|
+
| 280-char blurb | Pluribus is an open-source CLI for agent context evidence. It emits privacy-safe receipts for MCP/tools, skills, memory/RAG, pruning and compaction boundaries, then audits or syncs AI instruction files like `CLAUDE.md`, Cursor rules, Copilot instructions, and `AGENTS.md`. |
|
|
32
32
|
| Safe first command | `npx --yes pluribus-context@latest audit` |
|
|
33
33
|
|
|
34
34
|
### Awesome-list Markdown entry
|
|
@@ -36,7 +36,7 @@ Use these fields for directories, awesome lists, or review forms that ask for a
|
|
|
36
36
|
Use this exact line when a curated list accepts one Markdown bullet per tool:
|
|
37
37
|
|
|
38
38
|
```markdown
|
|
39
|
-
- [Pluribus](https://github.com/caioribeiroclw-pixel/pluribus) - Open-source CLI
|
|
39
|
+
- [Pluribus](https://github.com/caioribeiroclw-pixel/pluribus) - Open-source CLI for privacy-safe agent context receipts, plus audits/sync for AI instruction files across Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob.
|
|
40
40
|
```
|
|
41
41
|
|
|
42
42
|
## Why it may be useful
|
|
@@ -52,12 +52,12 @@ Use this section when a directory, list maintainer, or reviewer asks how Pluribu
|
|
|
52
52
|
|
|
53
53
|
| Question | Short answer |
|
|
54
54
|
| --- | --- |
|
|
55
|
-
| What category is it? | AI coding context management / rules sync CLI. |
|
|
56
|
-
| What is the source of truth? | `pluribus.md`, reviewed in git. |
|
|
57
|
-
| What does it generate? | Tool-native context files for Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob. |
|
|
55
|
+
| What category is it? | Agent context evidence / AI coding context management / rules sync CLI. |
|
|
56
|
+
| What is the source of truth? | For sync: `pluribus.md`, reviewed in git. For receipts: counts, hashes, buckets, lifecycle states, and privacy flags generated from the tool/harness boundary being audited. |
|
|
57
|
+
| What does it generate? | Tool-native context files for Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob; receipt fixtures/trace shapes for context-budget, retrieval, pruning, compaction, Tool Search, subagent, and skill boundaries. |
|
|
58
58
|
| What is the safe first step? | Run `npx --yes pluribus-context@latest audit` to inspect existing context files without writing. |
|
|
59
59
|
| When is another tool enough? | If you only need one tool's native rules format or a one-time converter, a smaller rules manager/converter may be enough. |
|
|
60
|
-
| What is Pluribus not? | Not chat memory, retrieval, vector search, agent orchestration, or agent merging. |
|
|
60
|
+
| What is Pluribus not? | Not chat memory, retrieval, vector search, agent orchestration, enterprise ContextOps, or agent merging. |
|
|
61
61
|
|
|
62
62
|
## Safety and removability
|
|
63
63
|
|
|
@@ -43,6 +43,28 @@ A useful receipt starts small:
|
|
|
43
43
|
|
|
44
44
|
Keep exact counts when they are not sensitive. Bucket token counts and sizes when exact values could reveal private workload shape.
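One way to bucket a token count, as a minimal sketch. The thresholds, the label names, and the `tokenBucket` helper are illustrative assumptions, not a scheme defined by Pluribus.

```javascript
// Hypothetical bucketing scheme; thresholds and labels are assumptions.
const TOKEN_BUCKETS = [
  [10_000, "under_10k"],
  [25_000, "under_25k"],
  [50_000, "under_50k"],
];

// Maps an exact token count to a coarse label so the receipt
// does not reveal the precise size of the workload.
function tokenBucket(count) {
  if (!Number.isFinite(count) || count < 0) {
    throw new RangeError("token count must be a non-negative number");
  }
  const match = TOKEN_BUCKETS.find(([limit]) => count < limit);
  return match ? match[1] : "50k_or_more";
}

console.log(tokenBucket(7312)); // prints: under_10k
```

Wider buckets leak less about workload shape but also make budget regressions harder to spot, so the bucket edges are a per-team judgment call.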
|
|
45
45
|
|
|
46
|
+
## Code-search / retrieval receipts
|
|
47
|
+
|
|
48
|
+
Semantic code-search MCPs and RAG-over-repo tools can reduce context bloat by returning only relevant chunks. The observability gap is that retrieval and agent-loading are two different boundaries: a tool may return five chunks, a client may dedupe or stale-filter two of them, and only three may actually enter the agent context.
|
|
49
|
+
|
|
50
|
+
The receipt should prove:
|
|
51
|
+
|
|
52
|
+
- the indexed snapshot/version used, without raw local paths or embedding secrets;
|
|
53
|
+
- the search request identity/category, without raw query text or filters;
|
|
54
|
+
- returned result identities, ranks, score buckets, stale/duplicate markers, and path hashes/extensions/range buckets;
|
|
55
|
+
- which returned chunks were loaded into agent context versus suppressed by the client/harness; and
|
|
56
|
+
- raw code, private paths, prompts, customer names, URLs, tokens, and ticket text stayed out of the receipt.
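The list above can be sketched as a small transform from raw search results to receipt-safe records. Everything here is an assumption for illustration: the input shape, the output field names, the score thresholds, and the salted SHA-256 path hash are not taken from the Pluribus fixture.

```javascript
// Hypothetical transform; not the fixture's implementation.
// CommonJS require; use `import` in .mjs files.
const { createHash } = require("node:crypto");
const path = require("node:path");

// Assumed thresholds; pick buckets that suit the retriever's score scale.
function scoreBucket(score) {
  if (score >= 0.8) return "high";
  if (score >= 0.5) return "medium";
  return "low";
}

// A salt makes it harder to recover paths by hashing guessed candidates.
function pathHash(filePath, salt) {
  const hex = createHash("sha256").update(salt).update(filePath).digest("hex");
  return `sha256:${hex}`;
}

// `result.loaded` is what the client/harness reported, not a decision made here.
function toResultRecord(result, salt) {
  return {
    rank: result.rank,
    score_bucket: scoreBucket(result.score),
    path_hash: pathHash(result.path, salt),
    extension: path.extname(result.path),
    stale: result.stale,
    duplicate: result.duplicate,
    loaded_into_context: result.loaded,
  };
}

const returned = [
  { rank: 1, score: 0.91, path: "src/billing/refund.ts", stale: false, duplicate: false, loaded: true },
  { rank: 2, score: 0.84, path: "src/billing/invoice.ts", stale: false, duplicate: false, loaded: true },
  { rank: 3, score: 0.71, path: "src/billing/legacy.ts", stale: true, duplicate: false, loaded: false },
  { rank: 4, score: 0.62, path: "src/auth/session.ts", stale: false, duplicate: false, loaded: true },
  { rank: 5, score: 0.44, path: "src/billing/refund.ts", stale: false, duplicate: true, loaded: false },
];

const records = returned.map((r) => toResultRecord(r, "example-salt"));
const loaded = records.filter((r) => r.loaded_into_context).length;
const summary = { returned: records.length, loaded, suppressed: records.length - loaded };

console.log(summary); // { returned: 5, loaded: 3, suppressed: 2 }
```

The sample mirrors the case described earlier: five chunks returned, two suppressed as stale or duplicate, three loaded.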
|
|
57
|
+
|
|
58
|
+
Runnable fixture:
|
|
59
|
+
|
|
60
|
+
```bash
|
|
61
|
+
node examples/context-input-evidence/convert-code-search-retrieval-log.mjs
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
Public trace:
|
|
65
|
+
|
|
66
|
+
- `examples/context-input-evidence/code-search-retrieval-otel-trace.json`
|
|
67
|
+
|
|
46
68
|
## Post-hoc pruning / context cleaning
|
|
47
69
|
|
|
48
70
|
Context-cleaning tools can reduce a bloated session after context has already entered the transcript. That creates a separate proof boundary from lazy loading: what was pruned, minified, stubbed, deduped, protected, and backed up?
|
|
@@ -197,6 +197,21 @@ It reads `sample-mcp-tool-search-log.jsonl` and writes `mcp-tool-search-receipt.
|
|
|
197
197
|
|
|
198
198
|
This is for Claude Code/MCP context-budget work where Tool Search reduces context bloat but still needs verifiable boundaries. The receipt should prove “only indexes were loaded up front; this one definition was loaded when needed; private query/arguments/results stayed out of the trace.”
|
|
199
199
|
|
|
200
|
+
To test semantic code-search retrieval, where a code-search MCP returns multiple ranked chunks but the client/harness may drop stale or duplicate results and load only a subset into the agent context, run:
|
|
201
|
+
|
|
202
|
+
```bash
|
|
203
|
+
node examples/context-input-evidence/convert-code-search-retrieval-log.mjs
|
|
204
|
+
```
|
|
205
|
+
|
|
206
|
+
It reads `sample-code-search-retrieval-log.jsonl` and writes `code-search-retrieval-receipt.ndjson` plus `code-search-retrieval-otel-trace.json`. The sample emits:
|
|
207
|
+
|
|
208
|
+
- `code.index.snapshot.used` — snapshot, codebase path hash, git commit hash, indexed file/chunk buckets, and privacy flags.
|
|
209
|
+
- `code.search.performed` — query hash/category, filter hash, top-k, and candidate-count bucket.
|
|
210
|
+
- `code.search.result.returned` — rank, score bucket, chunk hash, path hash/extension, line-range bucket, stale/duplicate flags, and whether the result was loaded into agent context.
|
|
211
|
+
- `context.input.loaded` — loaded versus suppressed chunk counts, suppression reason hashes/categories, token bucket, and explicit audit gap.
|
|
212
|
+
|
|
213
|
+
The fixture intentionally includes raw private code snippets, local paths, URLs, tokens, customer names, emails, and ticket ids in the synthetic input, then verifies those strings do not appear in the receipt or trace. This is for Claude Context / code-search MCP / RAG-over-repo workflows where “search returned” and “agent loaded” need separate evidence.
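The kind of leak check described here can be sketched as a plain substring scan over the serialized output. This is an illustrative sketch, not the fixture's own code; `findLeaks` and the sample strings are hypothetical.

```javascript
// Hypothetical leak check; not the code used by the Pluribus fixture.
// Returns every forbidden string that appears in the serialized output.
function findLeaks(output, forbidden) {
  const text = typeof output === "string" ? output : JSON.stringify(output);
  return forbidden.filter((needle) => needle.length > 0 && text.includes(needle));
}

// Synthetic private strings that must never reach a receipt or trace.
const forbidden = ["/Users/alice/acme/billing.ts", "alice@example.com"];

const cleanReceipt = { path_hash: "sha256:example-only", extension: ".ts" };
const leakyReceipt = { ...cleanReceipt, note: "contact alice@example.com" };

console.log(findLeaks(cleanReceipt, forbidden)); // []
console.log(findLeaks(leakyReceipt, forbidden)); // [ 'alice@example.com' ]
```

Because `JSON.stringify` escapes quotes and backslashes, a forbidden string containing those characters would need to be compared in its escaped form. A substring scan also cannot catch reversible encodings such as base64.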
|
|
214
|
+
|
|
200
215
|
To test CLI progressive disclosure — where an agent receives a tiny CLI prompt first, loads specific command help only when needed, and executes the CLI instead of loading a full OpenAPI spec or MCP schema set — run:
|
|
201
216
|
|
|
202
217
|
```bash
|
|
@@ -0,0 +1,158 @@
|
|
|
1
|
+
# Dynamic workflow run receipts
|
|
2
|
+
|
|
3
|
+
Claude Code-style dynamic workflows move orchestration into a script that can spawn many subagents, keep intermediate results outside the parent conversation, and show progress by phase, agent count, token total, and elapsed time.
|
|
4
|
+
|
|
5
|
+
That is useful when a codebase audit, migration, research task, or verification pass needs more parallelism than one conversation can coordinate. It also creates a new failure mode: one child agent can loop, burn tokens, or drift while the parent workflow only shows a high-level progress line.
|
|
6
|
+
|
|
7
|
+
Use a dynamic workflow run receipt when a workflow, ultracode run, local LLM gateway, or multi-agent script delegates work across several agents/models and a human needs a privacy-safe summary of what actually happened.
|
|
8
|
+
|
|
9
|
+
The first thing to look for is a **per-agent fuse**: budget, heartbeat, partial progress, stop reason, and kill-switch state for every spawned agent. After that, inspect whether the expensive path bought better verification or just more context drift.
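A per-agent fuse can be sketched as a check that runs on every progress update. This is a minimal sketch under stated assumptions: `evaluateFuse`, the heartbeat timeout parameter, and the two stop-reason labels are hypothetical, while `max_iterations`, `iterations_used`, and `heartbeat_seen_at` follow the receipt fields used in this document.

```javascript
// Hypothetical fuse check; not part of the pluribus-context CLI.
function evaluateFuse(agent, nowMs, heartbeatTimeoutMs) {
  // Iteration budget: trip as soon as the budget is fully used.
  if (agent.iterations_used >= agent.max_iterations) {
    return { fuse_triggered: true, stop_reason: "iteration_budget_reached" };
  }
  // Heartbeat: a missing, unparseable, or old timestamp counts as a stall.
  const lastBeat = Date.parse(agent.heartbeat_seen_at);
  if (Number.isNaN(lastBeat) || nowMs - lastBeat > heartbeatTimeoutMs) {
    return { fuse_triggered: true, stop_reason: "heartbeat_timeout" };
  }
  return { fuse_triggered: false, stop_reason: null };
}

const now = Date.parse("2026-05-30T15:26:00Z");
const fiveMinutes = 5 * 60 * 1000;

const auditor = { max_iterations: 8, iterations_used: 3, heartbeat_seen_at: "2026-05-30T15:25:00Z" };
const reviewer = { max_iterations: 5, iterations_used: 5, heartbeat_seen_at: "2026-05-30T15:25:30Z" };

console.log(evaluateFuse(auditor, now, fiveMinutes).fuse_triggered); // false
console.log(evaluateFuse(reviewer, now, fiveMinutes).stop_reason); // iteration_budget_reached
```

A token-budget check would follow the same pattern, using exact counts at run time and writing only the bucketed value to the receipt.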
|
|
10
|
+
|
|
11
|
+
This is not an orchestration framework. The receipt is the stable artifact: compact evidence for each phase and spawned agent without logging raw prompts, source code, transcripts, tool output, secrets, customer data, or proprietary file paths.
|
|
12
|
+
|
|
13
|
+
## When this helps
|
|
14
|
+
|
|
15
|
+
Use this receipt when:
|
|
16
|
+
|
|
17
|
+
- a workflow spawns several agents to audit, migrate, research, or verify a codebase;
|
|
18
|
+
- agents may run different roles, models, or local/remote providers;
|
|
19
|
+
- the run has a token/cost budget that needs to be explained after the fact;
|
|
20
|
+
- a child agent could loop, stall, or keep spending while the parent workflow stays mostly blind;
|
|
21
|
+
- the parent session sees only the final report, not every intermediate result;
|
|
22
|
+
- a reviewer needs to know what context was loaded, skipped, or suppressed for each agent;
|
|
23
|
+
- the run stops, pauses, resumes, or rejects a result and the stop point matters.
|
|
24
|
+
|
|
25
|
+
## Receipt shape
|
|
26
|
+
|
|
27
|
+
Attach this to a workflow report, PR body, task handoff, run summary, or CI artifact.
|
|
28
|
+
|
|
29
|
+
```json
|
|
30
|
+
{
|
|
31
|
+
"type": "dynamic.workflow.run_receipt.v1",
|
|
32
|
+
"workflow": {
|
|
33
|
+
"workflow_id": "wf_checkout_auth_audit_2026_05_30",
|
|
34
|
+
"runner": "claude-code-dynamic-workflow",
|
|
35
|
+
"script_source": "generated-then-reviewed-command",
|
|
36
|
+
"script_hash": "sha256:example-only",
|
|
37
|
+
"task_kind": "codebase_auth_audit",
|
|
38
|
+
"plan_approved_before_run": true,
|
|
39
|
+
"resumable": true,
|
|
40
|
+
"max_wall_clock_bucket": "under_15m",
|
|
41
|
+
"kill_switch_available": true,
|
|
42
|
+
"started_at": "2026-05-30T15:20:00Z",
|
|
43
|
+
"completed_at": "2026-05-30T15:31:42Z"
|
|
44
|
+
},
|
|
45
|
+
"permissions": {
|
|
46
|
+
"tool_allowlist_inherited": true,
|
|
47
|
+
"writes_allowed": false,
|
|
48
|
+
"network_allowed": false,
|
|
49
|
+
"external_commands_allowed": ["grep", "test --dry-run"],
|
|
50
|
+
"permission_profile": "review-only"
|
|
51
|
+
},
|
|
52
|
+
"phases": [
|
|
53
|
+
{
|
|
54
|
+
"phase_id": "route-inventory",
|
|
55
|
+
"purpose": "find candidate auth-sensitive routes",
|
|
56
|
+
"agent_count": 3,
|
|
57
|
+
"token_spend_bucket": "under_50k",
|
|
58
|
+
"elapsed_ms_bucket": "under_2m",
|
|
59
|
+
"result": "completed"
|
|
60
|
+
},
|
|
61
|
+
{
|
|
62
|
+
"phase_id": "adversarial-review",
|
|
63
|
+
"purpose": "cross-check candidate misses",
|
|
64
|
+
"agent_count": 2,
|
|
65
|
+
"token_spend_bucket": "under_25k",
|
|
66
|
+
"elapsed_ms_bucket": "under_2m",
|
|
67
|
+
"result": "completed_with_gaps"
|
|
68
|
+
}
|
|
69
|
+
],
|
|
70
|
+
"agents": [
|
|
71
|
+
{
|
|
72
|
+
"agent_id": "agent-route-auditor-1",
|
|
73
|
+
"phase_id": "route-inventory",
|
|
74
|
+
"role": "route-auth-auditor",
|
|
75
|
+
"model": "claude-sonnet",
|
|
76
|
+
"provider": "anthropic",
|
|
77
|
+
"context_loaded": ["repo-policy", "auth-boundary-rules", "route-index-summary"],
|
|
78
|
+
"context_skipped_or_suppressed": [
|
|
79
|
+
{
|
|
80
|
+
"source": "customer-fixture-dump",
|
|
81
|
+
"reason": "contains raw customer data; summary hash only"
|
|
82
|
+
}
|
|
83
|
+
],
|
|
84
|
+
"tools_granted": ["read", "grep"],
|
|
85
|
+
"tools_used": ["grep"],
|
|
86
|
+
"feature_areas_checked": ["checkout routes", "admin routes"],
|
|
87
|
+
"token_budget_bucket": "under_25k",
|
|
88
|
+
"token_spend_bucket": "under_10k",
|
|
89
|
+
"max_iterations": 8,
|
|
90
|
+
"iterations_used": 3,
|
|
91
|
+
"heartbeat_seen_at": "2026-05-30T15:25:00Z",
|
|
92
|
+
"partial_progress_reported": true,
|
|
93
|
+
"fuse_triggered": false,
|
|
94
|
+
"stop_reason": "completed_assigned_partition",
|
|
95
|
+
"confidence": "medium",
|
|
96
|
+
"known_gaps": ["did not execute integration tests"],
|
|
97
|
+
"raw_prompt_logged": false,
|
|
98
|
+
"raw_tool_output_logged": false,
|
|
99
|
+
"raw_paths_logged": false
|
|
100
|
+
},
|
|
101
|
+
{
|
|
102
|
+
"agent_id": "agent-reviewer-1",
|
|
103
|
+
"phase_id": "adversarial-review",
|
|
104
|
+
"role": "adversarial-auth-reviewer",
|
|
105
|
+
"model": "local-codex-compatible",
|
|
106
|
+
"provider": "local-llm-gateway",
|
|
107
|
+
"context_loaded": ["candidate-findings-summary", "public-api-contract-summary"],
|
|
108
|
+
"context_skipped_or_suppressed": [],
|
|
109
|
+
"tools_granted": ["read"],
|
|
110
|
+
"tools_used": ["read"],
|
|
111
|
+
"feature_areas_checked": ["route findings cross-check"],
|
|
112
|
+
"token_budget_bucket": "under_10k",
|
|
113
|
+
"token_spend_bucket": "under_10k",
|
|
114
|
+
"max_iterations": 5,
|
|
115
|
+
"iterations_used": 5,
|
|
116
|
+
"heartbeat_seen_at": "2026-05-30T15:30:00Z",
|
|
117
|
+
"partial_progress_reported": true,
|
|
118
|
+
"fuse_triggered": true,
|
|
119
|
+
"stop_reason": "iteration_budget_reached_before_claim_verified",
|
|
120
|
+
"confidence": "low",
|
|
121
|
+
"known_gaps": ["one route requires owner confirmation before merge"],
|
|
122
|
+
"raw_prompt_logged": false,
|
|
123
|
+
"raw_tool_output_logged": false,
|
|
124
|
+
"raw_paths_logged": false
|
|
125
|
+
}
|
|
126
|
+
],
|
|
127
|
+
"handoff": {
|
|
128
|
+
"final_result_kind": "workflow_review_receipt",
|
|
129
|
+
"claims_rejected_or_deferred": 1,
|
|
130
|
+
"next_safe_action": "ask route owner to confirm checkout callback auth before writing fix",
|
|
131
|
+
"where_it_stopped": "ambiguous auth boundary before mutation"
|
|
132
|
+
},
|
|
133
|
+
"privacy": {
|
|
134
|
+
"raw_prompts_logged": false,
|
|
135
|
+
"raw_source_logged": false,
|
|
136
|
+
"raw_tool_output_logged": false,
|
|
137
|
+
"transcripts_logged": false,
|
|
138
|
+
"secrets_logged": false,
|
|
139
|
+
"customer_data_logged": false
|
|
140
|
+
}
|
|
141
|
+
}
|
|
142
|
+
```
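A reviewer can run a few mechanical consistency checks on a receipt of this shape before reading it closely. The sketch below is illustrative: `checkReceipt` and the specific rules are assumptions, not a validator shipped with Pluribus. Field names follow the example receipt.

```javascript
// Hypothetical consistency checks; not part of the pluribus-context CLI.
// Returns a list of human-readable problems; an empty list means none found.
function checkReceipt(receipt) {
  const problems = [];
  const phaseIds = new Set((receipt.phases || []).map((p) => p.phase_id));
  for (const agent of receipt.agents || []) {
    const id = agent.agent_id;
    // Every agent should belong to a declared phase.
    if (!phaseIds.has(agent.phase_id)) {
      problems.push(`${id}: unknown phase_id ${agent.phase_id}`);
    }
    // Tools used should be a subset of tools granted.
    const granted = new Set(agent.tools_granted || []);
    for (const tool of agent.tools_used || []) {
      if (!granted.has(tool)) problems.push(`${id}: used ungranted tool ${tool}`);
    }
    if (agent.iterations_used > agent.max_iterations) {
      problems.push(`${id}: iterations_used exceeds max_iterations`);
    }
    // A tripped fuse without a stop reason is not reviewable.
    if (agent.fuse_triggered && !agent.stop_reason) {
      problems.push(`${id}: fuse triggered without a stop_reason`);
    }
  }
  return problems;
}

const receipt = {
  phases: [{ phase_id: "route-inventory" }, { phase_id: "adversarial-review" }],
  agents: [
    {
      agent_id: "agent-route-auditor-1", phase_id: "route-inventory",
      tools_granted: ["read", "grep"], tools_used: ["grep"],
      max_iterations: 8, iterations_used: 3,
      fuse_triggered: false, stop_reason: "completed_assigned_partition",
    },
    {
      agent_id: "agent-reviewer-1", phase_id: "adversarial-review",
      tools_granted: ["read"], tools_used: ["read"],
      max_iterations: 5, iterations_used: 5,
      fuse_triggered: true, stop_reason: "iteration_budget_reached_before_claim_verified",
    },
  ],
};

console.log(checkReceipt(receipt)); // []
```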
|
|
143
|
+
|
|
144
|
+
## Minimal checklist
|
|
145
|
+
|
|
146
|
+
Before trusting the result of a dynamic workflow, ask for:
|
|
147
|
+
|
|
148
|
+
- workflow/run id, runner, script source, script hash, and whether the plan was approved before execution;
|
|
149
|
+
- workflow-level wall-clock budget, whether a kill switch exists, and whether the run can be paused/resumed safely;
|
|
150
|
+
- permission profile, inherited tool allowlist, write/network/command capability, and whether the run was review-only or mutating;
|
|
151
|
+
- phases, agent counts, token spend buckets, elapsed-time buckets, and phase result states;
|
|
152
|
+
- per-agent role, model/provider actually used, context loaded, context skipped/suppressed, tools granted/used, token budget/spend, iteration budget, heartbeat, partial progress, fuse state, stop reason, confidence, and known gaps;
|
|
153
|
+
- explicit privacy flags proving raw prompts, source, transcripts, tool output, paths, secrets, and customer data were not logged;
|
|
154
|
+
- a handoff that says what was accepted, rejected/deferred, where the workflow stopped, and the next safe action.
|
|
155
|
+
|
|
156
|
+
## What not to log
|
|
157
|
+
|
|
158
|
+
Do not include raw prompts, full workflow scripts when they reveal private structure, full transcripts, source code, exact proprietary paths, tool output, secrets, credentials, customer data, stack traces, or raw LLM gateway logs. Prefer coarse names, hashes, buckets, counts, role labels, decision states, stop reasons, and owner labels.
|