pluribus-context 0.3.35 → 0.3.36
- package/CHANGELOG.md +8 -0
- package/README.md +1 -1
- package/bin/pluribus.js +12 -0
- package/docs/agent-firewall-denial-audit.md +95 -0
- package/docs/ai-pr-review-receipts.md +20 -0
- package/docs/compaction-resume-receipts.md +43 -0
- package/docs/controlled-learning-queue.md +48 -0
- package/docs/install-plan-receipts.md +2 -0
- package/docs/loaded-resource-boundary.md +97 -0
- package/docs/memory-write-policy-receipts.md +41 -0
- package/docs/parallel-session-review-ledger.md +103 -0
- package/docs/phase-boundary-contracts.md +87 -0
- package/docs/review-primitive-gate.md +2 -0
- package/docs/skill-install-receipts.md +102 -0
- package/docs/skill-use-rate-receipts.md +104 -0
- package/examples/agent-firewall-denial-audit/README.md +14 -0
- package/examples/agent-firewall-denial-audit/check-denial-audit.mjs +116 -0
- package/examples/agent-firewall-denial-audit/denial-envelope.json +9 -0
- package/examples/agent-firewall-denial-audit/operator-audit-record.json +20 -0
- package/examples/ai-pr-review-receipts/.github/workflows/ai-pr-review-receipt.yml +25 -0
- package/examples/ai-pr-review-receipts/README.md +51 -1
- package/examples/ai-pr-review-receipts/incomplete-review-primitive-receipt.json +43 -0
- package/examples/ai-pr-review-receipts/review-primitive-receipt.json +60 -0
- package/examples/compaction-resume-receipts/README.md +12 -0
- package/examples/compaction-resume-receipts/check-resume-receipt.mjs +116 -0
- package/examples/compaction-resume-receipts/safe-resume-receipt.json +52 -0
- package/examples/compaction-resume-receipts/unsafe-resume-receipt.json +41 -0
- package/examples/controlled-learning-queue/README.md +26 -0
- package/examples/controlled-learning-queue/check-learning-queue.mjs +44 -0
- package/examples/controlled-learning-queue/leads/acme-job-card.md +12 -0
- package/examples/controlled-learning-queue/learning_queue.md +27 -0
- package/examples/controlled-learning-queue/memory/durable.md +10 -0
- package/examples/controlled-learning-queue/memory/working-notes.md +5 -0
- package/examples/controlled-learning-queue/role/job-contract.md +18 -0
- package/examples/controlled-learning-queue/skills/qualify-lead.md +17 -0
- package/examples/loaded-resource-boundary/README.md +22 -0
- package/examples/loaded-resource-boundary/check-loaded-resource-boundary.mjs +65 -0
- package/examples/loaded-resource-boundary/loaded-resource-boundary.json +69 -0
- package/examples/memory-write-policy/README.md +28 -0
- package/examples/memory-write-policy/approved-memory-update.json +48 -0
- package/examples/memory-write-policy/check-memory-update.mjs +120 -0
- package/examples/memory-write-policy/quarantined-memory-update.json +43 -0
- package/examples/parallel-session-review-ledger/README.md +13 -0
- package/examples/parallel-session-review-ledger/check-parallel-session-review-ledger.mjs +69 -0
- package/examples/parallel-session-review-ledger/parallel-session-review-ledger.json +72 -0
- package/examples/phase-boundary-contract/README.md +23 -0
- package/examples/phase-boundary-contract/check-phase-boundary.mjs +73 -0
- package/examples/phase-boundary-contract/phase-boundary-contract.json +68 -0
- package/examples/skill-install-receipts/README.md +31 -0
- package/examples/skill-install-receipts/check-skill-install-receipt.mjs +75 -0
- package/examples/skill-install-receipts/skill-install-receipt.json +79 -0
- package/examples/skill-use-rate-receipts/README.md +16 -0
- package/examples/skill-use-rate-receipts/check-skill-use-rate.mjs +89 -0
- package/examples/skill-use-rate-receipts/skill-use-rate-receipt.json +79 -0
- package/package.json +1 -1
- package/src/commands/demo.js +155 -0
- package/src/index.js +1 -0
- package/src/utils/version.js +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,14 @@
 
 All notable changes to Pluribus are documented here.
 
+## 0.3.36 - 2026-06-05
+
+- Added `pluribus demo skill-use-rate`, a tiny npm-runnable demo that validates the packaged Skill use-rate receipt and warns when installed/attached Skills have no observed invocations.
+
+
+- Added a GitHub Actions AI PR review receipt gate example that validates `agent.review_primitive_receipt.v1` evidence for AI-authored pull requests.
+- Added a memory write policy receipt guide and executable gate for approving or quarantining shared-memory updates before they become durable context across agents.
+
 ## 0.3.35 - 2026-05-31
 
 - Added canonical-output receipts for preserving the last clean version of an artifact as versioned evidence instead of treating old chats as source of truth.
package/README.md
CHANGED
@@ -161,7 +161,7 @@ npx --yes pluribus-context@latest sync --dry-run
 
 If the preview looks right, run `npx --yes pluribus-context@latest sync` to write the tool-specific files.
 
-
For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. 
If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If an MCP server is healthy but tools are missing in Claude Code/Cursor/Codex, use the [MCP tool visibility receipts](docs/mcp-tool-visibility-receipts.md) checklist to separate launch, handshake, `tools/list`, client catalog, and first invocation failures. If a Claude Code/OpenClaw-style Skill states a hard rule but the run still violates it, use the [Skill policy receipts](docs/skill-policy-receipts.md) guide and [copyable Skill recipe](examples/agent-skills/skill-policy-receipts/) to turn target decisions, refusals, and post-write guards into privacy-safe evidence. If long-lived projects keep old specs/TODOs that still match grep but are no longer authoritative, use [Temporal context receipts](docs/temporal-context-receipts.md) and the [copyable current-state example](examples/temporal-context-receipts/) to separate current authority from historical citations before an agent writes code. 
If AI-generated pull requests are hard to review because diff size hides operational risk, use [AI PR review receipts](docs/ai-pr-review-receipts.md)
+
For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. 
If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If your shared-memory or knowledge-graph setup lets agents write durable facts, use [Memory write policy receipts](docs/memory-write-policy-receipts.md) and the [copyable gate](examples/memory-write-policy/) to require proposed diffs, scope, lifecycle, visibility, approval, and privacy checks before one run can teach every harness. If hooks, local gateways, or agent firewalls block risky tool calls, use [Agent firewall denial/audit receipts](docs/agent-firewall-denial-audit.md) and the [copyable checker](examples/agent-firewall-denial-audit/) to split model-visible denial from private operator audit evidence. If you are turning Claude Code/OpenClaw/Cursor into role-based “AI employee” agents with Skills and memory folders, use the [Controlled learning queue](docs/controlled-learning-queue.md) and [copyable example](examples/controlled-learning-queue/) to let agents propose durable memory changes without silently rewriting shared ICP, pricing, compliance, or process assumptions. 
If `PreCompact` / `PostCompact` or `SessionStart(compact)` workflows decide whether an agent may continue after summarization, use [Compaction resume receipts](docs/compaction-resume-receipts.md) and the [copyable gate](examples/compaction-resume-receipts/) to prove what was summarized, which instruction sources reloaded, what state was lost/kept, and whether `safe_to_resume` is actually true. If an MCP server is healthy but tools are missing in Claude Code/Cursor/Codex, use the [MCP tool visibility receipts](docs/mcp-tool-visibility-receipts.md) checklist to separate launch, handshake, `tools/list`, client catalog, and first invocation failures. If a Claude Code/OpenClaw-style Skill states a hard rule but the run still violates it, use the [Skill policy receipts](docs/skill-policy-receipts.md) guide and [copyable Skill recipe](examples/agent-skills/skill-policy-receipts/) to turn target decisions, refusals, and post-write guards into privacy-safe evidence. If a Skill, plugin resource, MCP instruction, or custom-agent file exists but disappears in ACP/Zed/CLI/chat parity tests, use [Loaded-resource boundary receipts](docs/loaded-resource-boundary.md) and the [copyable checker](examples/loaded-resource-boundary/) to prove discovered, attached, injected, readable, and skipped-resource stages. If long-lived projects keep old specs/TODOs that still match grep but are no longer authoritative, use [Temporal context receipts](docs/temporal-context-receipts.md) and the [copyable current-state example](examples/temporal-context-receipts/) to separate current authority from historical citations before an agent writes code. 
If AI-generated pull requests are hard to review because diff size hides operational risk, use [AI PR review receipts](docs/ai-pr-review-receipts.md), the [copyable PR template](examples/ai-pr-review-receipts/), and the [GitHub Actions receipt gate](examples/ai-pr-review-receipts/.github/workflows/ai-pr-review-receipt.yml) to review by blast radius: schema/data contracts, async paths, rollout gates, side effects, and ambiguous boundaries. If you delegate work to Codex/Claude Code/Cursor/OpenClaw-style specialist subagents, use [Subagent role receipts](docs/subagent-role-receipts.md) and the [example role definitions](examples/subagent-role-receipts/) to prove the requested role, effective role, loaded instruction source, allowed/refused capabilities, stop point, and next safe action. If you run Claude Code-style dynamic workflows, ultracode, or local LLM gateway orchestration that spawns many agents, use [Dynamic workflow run receipts](docs/dynamic-workflow-run-receipts.md) and the [copyable workflow example](examples/dynamic-workflow-run-receipts/) to prove phases, per-agent roles/models, context loaded/skipped, tool grants, token spend buckets, per-agent fuses, heartbeat, stop reasons, and known gaps. If your workflow routes Explore/Propose/Spec/Design/Tasks/Apply/Verify across OpenCode, Claude Code, Cursor, Codex, or different models, use [Phase-boundary contracts](docs/phase-boundary-contracts.md) and the [copyable Apply→Verify gate](examples/phase-boundary-contract/) to prove allowed input context, output artifact, evidence required before the next phase, dropped context, and stop conditions. 
If you need CI/reviewers to decide whether an agent handoff can continue, must be reviewed, or should be rejected, use the [Review primitive gate](docs/review-primitive-gate.md), its [copyable gate example](examples/review-primitive-gate/), and the [Claude Code review hook bridge](examples/claude-code-review-hook/) to validate assignment boundaries, approved scope/access changes, required checks, privacy flags, and `complete / partial / unsafe-to-resume` state from CI or Claude Code `TaskCompleted` / `PostCompact` hooks. If Claude Projects, long chats, or compaction make the last clean artifact hard to recover, use [Canonical output receipts](docs/canonical-output-receipts.md) and the [copyable index example](examples/canonical-output-receipts/) to track stable IDs, paths, versions, exact grep phrases, decisions, rejected options, and next actions. If a setup script installs MCP servers, Skills, instruction files, hooks, or plugins across multiple agents, use [Install-plan receipts](docs/install-plan-receipts.md) and the [copyable example](examples/install-plan-receipts/) to prove planned writes, backups, network behavior, and `writes_started=false` before mutation. After a Skill installer runs, use [Skill install/load receipts](docs/skill-install-receipts.md) and the [copyable checker](examples/skill-install-receipts/) to prove source ref, target agents/scopes, discovery/load status, context-cost bucket, and `safe_to_start_session` without logging raw Skill bodies. If you are pruning Skill sprawl after real sessions, use [Skill use-rate receipts](docs/skill-use-rate-receipts.md) and the [copyable checker](examples/skill-use-rate-receipts/) to separate discovered/installed/attached from invoked/acted-on and catch "installed but unused" resources. 
If you supervise multiple Claude Code/Cursor/Codex/OpenClaw sessions in parallel, use the [Parallel session review ledger](docs/parallel-session-review-ledger.md) and [copyable checker](examples/parallel-session-review-ledger/) to decide which sessions are complete, partial, blocked, or unsafe to resume without trusting an agent summary. If you are reviewing Pluribus for a list, newsletter, or tool directory, use the [Community Review Packet](docs/community-review-packet.md) for directory submission fields, a one-line description, safety notes, and a disposable 60-second smoke test. Maintainers can track package/repo discovery with the [Discovery Smoke Checks](docs/discovery-smoke.md).
 
 ### Usage
 
package/bin/pluribus.js
CHANGED
@@ -10,6 +10,7 @@ import { runSync } from '../src/commands/sync.js'
 import { runValidate } from '../src/commands/validate.js'
 import { runWatch } from '../src/commands/watch.js'
 import { runAudit } from '../src/commands/audit.js'
+import { runDemo } from '../src/commands/demo.js'
 import { parseArgs } from '../src/utils/args.js'
 import { SUPPORTED_TOOLS } from '../src/skills/built-in.js'
 import { VERSION } from '../src/utils/version.js'
@@ -28,6 +29,7 @@ COMMANDS
 validate Validate pluribus.md before syncing
 audit Compare generated tool files with pluribus.md without writing
 watch Watch pluribus.md and auto-sync after changes
+demo Run tiny packaged demos from npm without cloning the repo
 help Show this help message
 
 OPTIONS (init)
@@ -64,6 +66,10 @@ OPTIONS (watch)
 --once Exit after the first change-triggered sync
 --debounce Debounce delay in ms (minimum 300, default 400)
 
+OPTIONS (demo)
+--receipt Validate a custom skill use-rate receipt JSON file
+--json Print machine-readable demo results
+
 EXAMPLES
 pluribus init
 pluribus init --dry-run
@@ -81,6 +87,8 @@ EXAMPLES
 pluribus audit --strict --github-annotations
 pluribus audit --json --fidelity-report
 pluribus watch --tools claude,cursor
+pluribus demo skill-use-rate
+pluribus demo skill-use-rate --json
 
 DOCS
 https://github.com/caioribeiroclw-pixel/pluribus
@@ -92,6 +100,7 @@ const COMMAND_FLAGS = {
   validate: new Set(['source', 'update-imports']),
   audit: new Set(['source', 'tools', 'update-imports', 'strict', 'ci', 'json', 'output', 'github-annotations', 'fidelity-report']),
   watch: new Set(['source', 'tools', 'update-imports', 'dry-run', 'once', 'debounce']),
+  demo: new Set(['receipt', 'json']),
 }
 
 function getFlagNames(argv) {
@@ -152,6 +161,9 @@ async function main() {
     case 'audit':
       await runAudit(parsedArgs)
       break
+    case 'demo':
+      await runDemo(parsedArgs, commandArgs.filter((arg) => !arg.startsWith('--') && !Object.values(parsedArgs).includes(arg)))
+      break
     default:
       console.error(`❌ Unknown command: "${command}"`)
       console.log(`Run \`pluribus help\` for usage.`)
package/docs/agent-firewall-denial-audit.md
@@ -0,0 +1,95 @@
+# Agent firewall denial/audit receipts
+
+Claude Code hooks, OpenClaw policies, local MCP gateways, and agent firewalls can block destructive commands, outbound calls, or risky writes before an agent executes them.
+
+The hard part is not only blocking. If the model sees a vague failure, it may keep trying variants. If the model sees too much detail, the denial can leak secrets, raw policy logic, or bypass hints.
+
+Use a split receipt:
+
+1. **Model-visible denial envelope** — minimal structured feedback the agent can act on safely.
+2. **Operator audit record** — privacy-safe evidence for the human/operator, CI, or local dashboard.
+
+## Model-visible denial envelope
+
+The model should receive enough information to stop, ask, or choose a safe alternative, without exposing raw secrets, raw commands, or sensitive policy internals.
+
+```json
+{
+  "type": "agent_firewall_denial.v1",
+  "decision": "blocked",
+  "reasonClass": "destructive_git",
+  "requiresApproval": true,
+  "safeAlternative": "Explain the planned git operation and wait for explicit approval.",
+  "retrySafety": "unsafe_until_approved",
+  "correlationId": "deny_2026_06_02_2200_7f3a"
+}
+```
+
+Good `reasonClass` values are coarse and non-secret:
+
+- `destructive_git`
+- `filesystem_write_out_of_scope`
+- `outbound_after_secret_read`
+- `credential_exposure_risk`
+- `package_publish_requires_approval`
+- `unknown_policy_boundary`
+
+The denial should avoid:
+
+- raw shell commands;
+- raw file contents;
+- secret values or secret-looking substrings;
+- full policy source;
+- exact bypass instructions;
+- absolute private paths when a path class or hash is enough.
+
+## Operator audit record
+
+The operator needs more detail, but still not raw prompts, code, or secrets. Prefer hashes, policy ids, classes, and booleans.
+
+```json
+{
+  "type": "agent_firewall_operator_audit.v1",
+  "decision": "blocked",
+  "correlationId": "deny_2026_06_02_2200_7f3a",
+  "tool": "Bash",
+  "commandHash": "sha256:0e5751c026e543b2a6f2b4d7a7c8d8e5b81b69c5b9f7db2a5b94f31f987e7f44",
+  "cwdHash": "sha256:dcdb704109a454784b81229d2b05f368692e758bfa33cb61d04c1b93791b0273",
+  "matchedPolicyIds": ["git.destructive.requires_approval"],
+  "sessionTaint": {
+    "secretRead": false,
+    "privateFileRead": true,
+    "networkAccessed": false
+  },
+  "approval": {
+    "state": "missing",
+    "requiredFrom": "operator"
+  },
+  "retrySafety": "unsafe_until_approved",
+  "modelEnvelopeHash": "sha256:a1bcaa1cb2572ab0e735c30062a268391d0a9d1b3dd7ff4b14065d8b29513b2a"
+}
+```
+
+## Invariant
+
+A blocked tool call should never disappear into the middle ground of “the command just failed.”
+
+- The **model** gets a safe reason class and next action.
+- The **operator** gets policy evidence and retry safety.
+- The shared identifier is a correlation id plus hashes, not raw private payloads.
+
+That makes enforcement auditable without turning policy internals into model-visible bypass material.
+
+## Try the copyable example
+
+See [`examples/agent-firewall-denial-audit/`](../examples/agent-firewall-denial-audit/) for a tiny denial envelope, operator audit record, and local checker:
+
+```bash
+node examples/agent-firewall-denial-audit/check-denial-audit.mjs examples/agent-firewall-denial-audit
+```
+
+The checker is intentionally small. It fails if the model-visible envelope leaks command/path/policy/secret-looking fields, if the audit record lacks policy ids or hash evidence, or if the envelope/audit correlation id does not match.
+
+## How this fits Pluribus
+
+Pluribus is not an agent firewall. This recipe is for teams already using hooks, policy engines, or local gateways and needing privacy-safe evidence at the enforcement boundary: what was denied, what the model was safely told, and what the operator can audit later.
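
The three checks that the agent firewall denial/audit doc attributes to its checker (no leak-prone fields in the envelope, policy ids and hash evidence in the audit record, matching correlation ids) could be sketched roughly as below. This is not the packaged `check-denial-audit.mjs`, whose source does not appear in this part of the diff. The function name `checkDenialAudit`, the `LEAKY_KEY` pattern, and the choice of which hash fields to require are assumptions for illustration; only the receipt field names are taken from the JSON examples in the doc.

```javascript
// Hypothetical sketch of a denial/audit pair check; not the packaged checker.
// Field names follow the JSON examples in the doc; everything else is assumed.

// Assumed heuristic: envelope keys that hint at raw commands, paths, policy, or secrets.
const LEAKY_KEY = /command|path|cwd|policy|secret|token|password/i;
const SHA256 = /^sha256:[0-9a-f]{64}$/;

function checkDenialAudit(envelope, audit) {
  const errors = [];

  // 1. The model-visible envelope must not carry leak-prone fields.
  for (const key of Object.keys(envelope)) {
    if (LEAKY_KEY.test(key)) errors.push(`envelope leaks field: ${key}`);
  }

  // 2. The operator audit record needs policy ids and hash evidence.
  if (!Array.isArray(audit.matchedPolicyIds) || audit.matchedPolicyIds.length === 0) {
    errors.push('audit record has no matchedPolicyIds');
  }
  for (const field of ['commandHash', 'modelEnvelopeHash']) {
    if (!SHA256.test(String(audit[field]))) {
      errors.push(`audit ${field} is not a sha256 hash`);
    }
  }

  // 3. Both halves must share one correlation id.
  if (envelope.correlationId !== audit.correlationId) {
    errors.push('correlationId mismatch between envelope and audit record');
  }
  return errors;
}

const envelope = {
  type: 'agent_firewall_denial.v1',
  decision: 'blocked',
  reasonClass: 'destructive_git',
  correlationId: 'deny_2026_06_02_2200_7f3a',
};
const audit = {
  type: 'agent_firewall_operator_audit.v1',
  correlationId: 'deny_2026_06_02_2200_7f3a',
  matchedPolicyIds: ['git.destructive.requires_approval'],
  commandHash: 'sha256:' + 'a'.repeat(64), // placeholder hash
  modelEnvelopeHash: 'sha256:' + 'b'.repeat(64), // placeholder hash
};

console.log(checkDenialAudit(envelope, audit)); // empty array: no violations
```

Returning a list of violations rather than throwing on the first one lets a CI step or local dashboard report every problem with a receipt pair at once.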
package/docs/ai-pr-review-receipts.md
CHANGED
@@ -124,6 +124,26 @@ This receipt does not include raw prompts, transcripts, source code, secrets, cu
 `next_safe_action:`
 ```
 
+## CI gate example
+
+The copyable example in [`examples/ai-pr-review-receipts/`](../examples/ai-pr-review-receipts/) includes:
+
+- a PR template for human-readable blast-radius review;
+- a GitHub Actions workflow that validates a machine-readable `agent.review_primitive_receipt.v1` receipt;
+- a passing fixture and an intentionally failing fixture.
+
+Run the smoke locally from the repository root:
+
+```bash
+node examples/review-primitive-gate/check-review-receipt.mjs \
+  examples/ai-pr-review-receipts/review-primitive-receipt.json
+
+node examples/review-primitive-gate/check-review-receipt.mjs \
+  examples/ai-pr-review-receipts/incomplete-review-primitive-receipt.json
+```
+
+The first command should pass. The second should fail because partial/unsafe or under-evidenced agent work should not silently pass a merge gate.
+
 ## How to use with Pluribus
 
 Pluribus does not need to own your PR workflow. Use it as the neutral language for evidence that crossed an agent boundary:
package/docs/compaction-resume-receipts.md
@@ -0,0 +1,43 @@
+# Compaction resume receipts
+
+Claude Code and Codex users are asking for `PreCompact` / `PostCompact` hooks because long sessions lose thread detail, reload rules inconsistently, or continue after a summary that no one can audit.
+
+The risky moment is not compaction itself. The risky moment is **resuming as if nothing changed**.
+
+A compaction resume receipt is a small, privacy-safe handoff object emitted after a compaction/restore flow. It proves what was summarized, what instruction sources were reloaded, what state was lost or kept, and whether the next agent turn is safe to continue.
+
+## Receipt boundary
+
+A resume receipt should prove:
+
+- **event identity** — stable `compaction_event_id`, `session_id`, and trigger so a restore can be correlated with the hook that caused it;
+- **transcript boundary** — the compacted transcript range and hash, without logging raw transcript content;
+- **summary evidence** — summary hash and token count so downstream tools can detect stale or changed summaries;
+- **instruction reloads** — `AGENTS.md`, `CLAUDE.md`, skills, MCP/tool manifests, workflow plans, or project rules reloaded with hashes/mtimes;
+- **kept/lost fields** — explicit lists for active plan, open diffs, pending tests, rejected decisions, tool grants, or unresolved blockers;
+- **resume verdict** — `safe_to_resume: true | false | unknown` plus reasons;
+- **privacy flags** — no raw transcript, raw prompts, raw tool outputs, secrets, or full instruction bodies in the receipt.
+
+## 60-second gate
+
+The copyable example is in [`examples/compaction-resume-receipts/`](../examples/compaction-resume-receipts/):
+
+```bash
+node examples/compaction-resume-receipts/check-resume-receipt.mjs \
+  examples/compaction-resume-receipts/safe-resume-receipt.json
+
+node examples/compaction-resume-receipts/check-resume-receipt.mjs \
+  examples/compaction-resume-receipts/unsafe-resume-receipt.json
+```
+
+The first passes because the restore has a stable event id, compacted transcript hash, summary hash, reloaded instruction sources, explicit kept/lost state, privacy flags, and `safe_to_resume: true` with no blocking lost fields.
+
+The second fails because it tries to continue after missing `AGENTS.md`, hidden lost decisions, raw transcript logging, and an `unknown` verdict.
+
+## Positioning
+
+Memory systems remember. Hooks fire lifecycle events. This receipt answers the operational review question:
+
+> After compaction, do we have enough verified context to continue safely?
+
+That makes `PostCompact` / `SessionStart(compact)` restore flows auditable without turning Pluribus into a memory store or requiring private transcript capture.
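
The pass/fail reasoning in the compaction resume doc (only an explicit `safe_to_resume: true` with no blocking lost state and no raw transcript logging may continue) might look something like the sketch below. It is not the packaged `check-resume-receipt.mjs`, and the fixture schema is not shown in this part of the diff. `compaction_event_id`, `session_id`, and `safe_to_resume` come from the doc; `lost_fields`, `blocking`, `name`, and `privacy.raw_transcript_logged` are hypothetical names chosen for illustration.

```javascript
// Hypothetical sketch of a resume gate; not the packaged check-resume-receipt.mjs.
// `compaction_event_id`, `session_id`, and `safe_to_resume` come from the doc;
// `lost_fields`, `blocking`, and `privacy.raw_transcript_logged` are assumed names.

function checkResumeReceipt(receipt) {
  const reasons = [];

  // Event identity: a restore must be traceable to the compaction that caused it.
  for (const field of ['compaction_event_id', 'session_id']) {
    if (!receipt[field]) reasons.push(`missing ${field}`);
  }

  // Only an explicit `true` allows the next turn; `false` and `unknown` both block.
  if (receipt.safe_to_resume !== true) {
    reasons.push(`safe_to_resume is ${JSON.stringify(receipt.safe_to_resume)}`);
  }

  // A `true` verdict is not trusted if blocking state was lost.
  const lost = Array.isArray(receipt.lost_fields) ? receipt.lost_fields : [];
  for (const item of lost) {
    if (item.blocking) reasons.push(`blocking state lost: ${item.name}`);
  }

  // Privacy flag: the receipt itself must not carry raw transcript content.
  if (receipt.privacy && receipt.privacy.raw_transcript_logged) {
    reasons.push('receipt logs raw transcript content');
  }

  return { ok: reasons.length === 0, reasons };
}

const safe = {
  compaction_event_id: 'cmp_001',
  session_id: 'sess_001',
  safe_to_resume: true,
  lost_fields: [],
  privacy: { raw_transcript_logged: false },
};
const unsafe = {
  ...safe,
  safe_to_resume: 'unknown',
  lost_fields: [{ name: 'rejected_decisions', blocking: true }],
  privacy: { raw_transcript_logged: true },
};

console.log(checkResumeReceipt(safe).ok); // true
console.log(checkResumeReceipt(unsafe).reasons); // three blocking reasons
```

Treating `unknown` the same as `false` keeps the gate fail-closed: an agent that cannot show it is safe to continue does not continue.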
@@ -0,0 +1,48 @@
|
|
|
1
|
+
# Controlled learning queue for AI employee-style agents
|
|
2
|
+
|
|
3
|
+
Claude Code, OpenClaw, Cursor, and MCP tools make it easy to turn a repository into a role-based worker: `CLAUDE.md` as the job description, Skills as procedures, and a `memory/` folder as durable knowledge.
|
|
4
|
+
|
|
5
|
+
That pattern compounds quickly, but it has a failure mode: the agent can overlearn from one weird lead, support ticket, or edge case and rewrite shared memory for every future run.
|
|
6
|
+
|
|
7
|
+
Use a controlled learning queue when an agent is allowed to **propose** durable memory changes but not silently promote them.
|
|
8
|
+
|
|
9
|
+
## Split the folders
|
|
10
|
+
|
|
11
|
+
```text
|
|
12
|
+
role/ # job contract: responsibilities, boundaries, escalation
|
|
13
|
+
skills/ # callable procedures with inputs, outputs, stop conditions
|
|
14
|
+
memory/durable.md # approved facts only; small enough to review
|
|
15
|
+
memory/working-notes.md# scratch observations; allowed to be messy/temporary
|
|
16
|
+
learning_queue.md # proposed durable changes awaiting promote/reject
|
|
17
|
+
leads/ # tiny job cards for active work
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
The key rule: `memory/durable.md` changes only through `learning_queue.md` proposals with source, reason, scope, expiry, and reviewer decision.
|
|
21
|
+
|
|
22
|
+
## Proposal shape
|
|
23
|
+
|
|
24
|
+
Each proposed learning should answer:
|
|
25
|
+
|
|
26
|
+
- **Source:** what run, lead, issue, or transcript produced the observation?
|
|
27
|
+
- **Observed:** what happened, without storing raw private text?
|
|
28
|
+
- **Proposed durable change:** the exact fact/rule to add, edit, or remove.
|
|
29
|
+
- **Reason:** why this should affect future runs, not just the current case.
|
|
30
|
+
- **Scope:** global, client-specific, project-specific, channel-specific, or temporary.
|
|
31
|
+
- **Expiry / review date:** when this fact should be rechecked.
|
|
32
|
+
- **Status:** proposed, promoted, rejected, or expired.
|
|
33
|
+
|
|
34
|
+
That is enough to preserve learning while keeping an agent from slowly corrupting ICP, pricing assumptions, escalation rules, or compliance boundaries.
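The proposal checklist above can be sketched as a small validation function. This is a minimal illustration, not the shipped checker: the field names (`source`, `observed`, `proposed_change`, `reason`, `scope`, `expiry`, `status`, `reviewer`) are assumptions chosen to mirror the bullets, and the real `learning_queue.md` format may differ.

```javascript
// Hypothetical proposal shape: keys mirror the checklist bullets and are
// assumptions, not the format parsed by the packaged checker.
const REQUIRED_FIELDS = [
  "source", "observed", "proposed_change", "reason", "scope", "expiry", "status",
];
const STATUSES = new Set(["proposed", "promoted", "rejected", "expired"]);

function checkProposal(proposal) {
  const errors = [];
  // Every checklist question needs a non-empty answer.
  for (const field of REQUIRED_FIELDS) {
    const value = proposal[field];
    if (typeof value !== "string" || value.trim() === "") {
      errors.push(`missing ${field}`);
    }
  }
  // Status must be one of the four values listed in the checklist.
  const status = proposal.status;
  if (typeof status === "string" && status.trim() !== "" && !STATUSES.has(status)) {
    errors.push(`unknown status: ${status}`);
  }
  // Promotion without a recorded reviewer is treated as a silent auto-promote.
  if (status === "promoted" && !proposal.reviewer) {
    errors.push("promoted without reviewer decision");
  }
  return errors;
}

// Usage: a complete proposal that was promoted with no reviewer on record.
const draft = {
  source: "lead-042",
  observed: "prospect asked for annual billing",
  proposed_change: "mention annual billing in first reply",
  reason: "seen across several leads this quarter",
  scope: "project-specific",
  expiry: "2026-09-01",
  status: "promoted",
};
console.log(checkProposal(draft)); // → [ 'promoted without reviewer decision' ]
```

Returning a list of errors rather than throwing keeps the sketch usable both as a local gate and as a review aid that reports every gap at once.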
|
|
35
|
+
|
|
36
|
+
## Try the copyable example
|
|
37
|
+
|
|
38
|
+
See [`examples/controlled-learning-queue/`](../examples/controlled-learning-queue/) for a tiny AI sales/ops worker layout and a local checker:
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
node examples/controlled-learning-queue/check-learning-queue.mjs examples/controlled-learning-queue/learning_queue.md
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
The checker is intentionally small. It fails proposals that are missing source/reason/scope/expiry/status, that try to auto-promote without review, or that paste raw secrets/private payloads into the learning queue.
|
|
45
|
+
|
|
46
|
+
## How this fits Pluribus
|
|
47
|
+
|
|
48
|
+
Pluribus is not trying to be the agent's brain. This pattern keeps intentional context reviewable: durable memory is a small versioned source of truth, while working notes and proposed learnings remain visibly provisional until promoted.
|
|
@@ -75,3 +75,5 @@ If an installer cannot answer that before mutation, treat it like running CI fro
|
|
|
75
75
|
## Try the copyable example
|
|
76
76
|
|
|
77
77
|
See [`examples/install-plan-receipts/`](../examples/install-plan-receipts/) for a small review checklist and sample receipt you can copy into setup scripts, README install sections, or agent-managed onboarding workflows.
|
|
78
|
+
|
|
79
|
+
After the installer has run, use [Skill install/load receipts](skill-install-receipts.md) to check whether each target agent can discover and load the installed Skill, and whether the install made the first session unsafe by adding too much always-loaded context.
|
|
@@ -0,0 +1,97 @@
|
|
|
1
|
+
# Loaded-resource boundary receipts
|
|
2
|
+
|
|
3
|
+
Use this when a Skill, plugin resource, MCP-provided instruction, or custom-agent file appears to be configured correctly but does not actually reach the agent runtime.
|
|
4
|
+
|
|
5
|
+
This is the failure mode behind reports like:
|
|
6
|
+
|
|
7
|
+
- "the Skill works in chat but not ACP/Zed/CLI";
|
|
8
|
+
- "`/skills` or the skill list is unavailable in this client";
|
|
9
|
+
- "the agent followed generic instructions because the real resource was never injected";
|
|
10
|
+
- "a prompt workaround says resources are preloaded, but there is no proof they were readable by the runtime".
|
|
11
|
+
|
|
12
|
+
Pluribus should not become a Skill manager. The useful boundary is a small receipt that proves what crossed from configuration into the run.
|
|
13
|
+
|
|
14
|
+
## Receipt shape
|
|
15
|
+
|
|
16
|
+
A loaded-resource receipt separates the stages that are often collapsed into "the skill exists":
|
|
17
|
+
|
|
18
|
+
| Stage | Question |
|
|
19
|
+
| --- | --- |
|
|
20
|
+
| `expected` | Which resources did the user/config expect for this agent and task? |
|
|
21
|
+
| `discovered` | Did the host find the resource on disk, in a plugin, registry, or MCP response? |
|
|
22
|
+
| `attached` | Was the resource attached to the selected agent/profile/workspace? |
|
|
23
|
+
| `injected` | Did the runtime put the resource into the model/tool context for this session? |
|
|
24
|
+
| `readable` | Could the agent actually read the resource bytes or resolved prompt? |
|
|
25
|
+
| `skipped` | If not, what precise stage and reason explain the gap? |
|
|
26
|
+
|
|
27
|
+
Recommended privacy-safe fields:
|
|
28
|
+
|
|
29
|
+
```json
|
|
30
|
+
{
|
|
31
|
+
"receipt_type": "pluribus.loaded_resource_boundary.v1",
|
|
32
|
+
"scenario": "custom-agent skill parity across chat and ACP",
|
|
33
|
+
"expected_resources": [
|
|
34
|
+
{
|
|
35
|
+
"id": "skill:pr-review",
|
|
36
|
+
"kind": "skill",
|
|
37
|
+
"scope": "project",
|
|
38
|
+
"source_ref": ".kiro/skills/pr-review/SKILL.md",
|
|
39
|
+
"source_hash": "sha256:...",
|
|
40
|
+
"required": true
|
|
41
|
+
}
|
|
42
|
+
],
|
|
43
|
+
"sessions": [
|
|
44
|
+
{
|
|
45
|
+
"runtime": "chat",
|
|
46
|
+
"client": "kiro-desktop",
|
|
47
|
+
"agent": "reviewer",
|
|
48
|
+
"discovered_resources": ["skill:pr-review"],
|
|
49
|
+
"attached_resources": ["skill:pr-review"],
|
|
50
|
+
"injected_resources": ["skill:pr-review"],
|
|
51
|
+
"readable_resources": ["skill:pr-review"],
|
|
52
|
+
"skipped_resources": []
|
|
53
|
+
},
|
|
54
|
+
{
|
|
55
|
+
"runtime": "acp",
|
|
56
|
+
"client": "zed",
|
|
57
|
+
"agent": "reviewer",
|
|
58
|
+
"discovered_resources": ["skill:pr-review"],
|
|
59
|
+
"attached_resources": ["skill:pr-review"],
|
|
60
|
+
"injected_resources": [],
|
|
61
|
+
"readable_resources": [],
|
|
62
|
+
"skipped_resources": [
|
|
63
|
+
{
|
|
64
|
+
"id": "skill:pr-review",
|
|
65
|
+
"stage": "injected",
|
|
66
|
+
"reason": "runtime_does_not_inject_resources"
|
|
67
|
+
}
|
|
68
|
+
]
|
|
69
|
+
}
|
|
70
|
+
]
|
|
71
|
+
}
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
Do not include raw skill text, private prompts, credentials, or full project memory. Hashes, refs, stage names, and skip reasons are enough for a maintainer to reproduce the boundary.
|
|
75
|
+
|
|
76
|
+
## Acceptance test
|
|
77
|
+
|
|
78
|
+
For the same custom agent and the same attached Skill/resource, compare chat vs ACP/CLI/IDE sessions:
|
|
79
|
+
|
|
80
|
+
1. The resource should be `discovered` in each runtime that claims to support it.
|
|
81
|
+
2. If it is attached in chat but not in ACP/Zed/CLI, record `not_attached_to_agent`.
|
|
82
|
+
3. If it is attached but absent from the model context, record `runtime_does_not_inject_resources`.
|
|
83
|
+
4. If it was injected but the bytes cannot be resolved, record `resource_read_failed`.
|
|
84
|
+
5. If trigger logic prevented loading, record `trigger_not_matched` and include the task label or hash that was evaluated against the trigger, not the full prompt.
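The stage comparison in steps 1 to 4 can be sketched as a function that walks the stages in order and reports the first one a resource failed to reach. This is an illustrative sketch, not the packaged checker. The `*_resources` field names and three of the reason codes come from the receipt above; the `not_discovered` reason is an assumption, since no reason code is named for that stage. `trigger_not_matched` is left out because the stage arrays alone cannot show it.

```javascript
// Stage order follows the receipt table. Reason codes are taken from the
// acceptance steps, except "not_discovered", which is a hypothetical label.
const STAGES = [
  { field: "discovered_resources", stage: "discovered", reason: "not_discovered" },
  { field: "attached_resources", stage: "attached", reason: "not_attached_to_agent" },
  { field: "injected_resources", stage: "injected", reason: "runtime_does_not_inject_resources" },
  { field: "readable_resources", stage: "readable", reason: "resource_read_failed" },
];

// Returns the first stage the resource did not reach, or null if it was readable.
function firstGap(session, resourceId) {
  for (const { field, stage, reason } of STAGES) {
    const ids = session[field] ?? [];
    if (!ids.includes(resourceId)) {
      return { id: resourceId, stage, reason };
    }
  }
  return null;
}

// Usage: the ACP/Zed session from the receipt above.
const acpSession = {
  runtime: "acp",
  client: "zed",
  agent: "reviewer",
  discovered_resources: ["skill:pr-review"],
  attached_resources: ["skill:pr-review"],
  injected_resources: [],
  readable_resources: [],
};
console.log(firstGap(acpSession, "skill:pr-review"));
// → { id: 'skill:pr-review', stage: 'injected', reason: 'runtime_does_not_inject_resources' }
```

Reporting only the first gap keeps the finding precise: a resource that was never injected is trivially unreadable, so later stages add no information.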
|
|
85
|
+
|
|
86
|
+
A useful bug report is not "Skills are broken". It is:
|
|
87
|
+
|
|
88
|
+
> For agent `reviewer`, `skill:pr-review` is discovered and attached in both chat and ACP. Chat injects and reads it; ACP/Zed does not inject it and reports `runtime_does_not_inject_resources`.
|
|
89
|
+
|
|
90
|
+
## Try the example
|
|
91
|
+
|
|
92
|
+
```bash
|
|
93
|
+
node examples/loaded-resource-boundary/check-loaded-resource-boundary.mjs \
|
|
94
|
+
examples/loaded-resource-boundary/loaded-resource-boundary.json
|
|
95
|
+
```
|
|
96
|
+
|
|
97
|
+
The sample intentionally includes a chat-vs-ACP mismatch and treats that mismatch as the useful finding.
|
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
# Memory write policy receipts
|
|
2
|
+
|
|
3
|
+
Cross-agent memory tools usually optimize recall: make Claude Code, Codex, Cursor, OpenClaw, ChatGPT, or MCP clients find the same facts later.
|
|
4
|
+
|
|
5
|
+
The adoption risk is different: **who is allowed to write durable memory, under what scope, and with what rollback or review path?**
|
|
6
|
+
|
|
7
|
+
Pluribus should not become another memory server. This receipt is a small governance layer for shared memory systems: every durable memory update is treated like a proposed diff before it becomes trusted context for future agents.
|
|
8
|
+
|
|
9
|
+
## Receipt boundary
|
|
10
|
+
|
|
11
|
+
A memory write receipt should prove:
|
|
12
|
+
|
|
13
|
+
- **source** — where the proposed memory came from, with a hash/ref instead of raw transcript or raw memory body;
|
|
14
|
+
- **scope** — whether the write is repo, project, org, or user scoped;
|
|
15
|
+
- **proposed diff** — adds/updates/supersedes/expires by stable refs and hashes;
|
|
16
|
+
- **write policy** — proposed, approved, rejected, or quarantined; who/what approved it;
|
|
17
|
+
- **lifecycle** — expiry or review date so stale facts do not become immortal;
|
|
18
|
+
- **injection visibility** — future sessions can see which memory was injected;
|
|
19
|
+
- **privacy flags** — no raw prompts, raw tool output, raw memory text, or secrets in the receipt.
|
|
20
|
+
|
|
21
|
+
## 60-second gate
|
|
22
|
+
|
|
23
|
+
The copyable example is in [`examples/memory-write-policy/`](../examples/memory-write-policy/):
|
|
24
|
+
|
|
25
|
+
```bash
|
|
26
|
+
node examples/memory-write-policy/check-memory-update.mjs \
|
|
27
|
+
examples/memory-write-policy/approved-memory-update.json
|
|
28
|
+
|
|
29
|
+
node examples/memory-write-policy/check-memory-update.mjs \
|
|
30
|
+
examples/memory-write-policy/quarantined-memory-update.json
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
The first passes because the write is approved, scoped, hashed, visible to future sessions, and has a review lifecycle. The second fails because it tries to turn a quarantined, broad user-scoped, private/sensitive update into durable shared memory and includes raw text.
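As a rough sketch of what such a gate might check, the function below tests a receipt against the boundary list above. Every field name here (`source.hash`, `source.ref`, `scope`, `write_policy.status`, `lifecycle.review_by`, `injection_visible`, `privacy_flags`) is hypothetical; the packaged `check-memory-update.mjs` and its JSON files may use a different shape and stricter rules.

```javascript
// Hypothetical receipt shape; none of these keys are confirmed by the
// packaged example files.
const SCOPES = new Set(["repo", "project", "org", "user"]);

function checkMemoryUpdate(receipt) {
  const errors = [];
  // Source must be identified by hash or ref, never by raw content.
  if (!receipt.source || !(receipt.source.hash || receipt.source.ref)) {
    errors.push("source needs a hash or ref");
  }
  if (!SCOPES.has(receipt.scope)) {
    errors.push("scope must be repo, project, org, or user");
  }
  // Only an approved write may become durable context.
  if (receipt.write_policy?.status !== "approved") {
    errors.push("write policy is not approved");
  }
  // A review or expiry date keeps stale facts from becoming permanent.
  if (!receipt.lifecycle?.review_by) {
    errors.push("missing review or expiry date");
  }
  if (receipt.injection_visible !== true) {
    errors.push("injection is not visible to future sessions");
  }
  // Any raised privacy flag (raw text, secrets, ...) blocks the write.
  if ((receipt.privacy_flags ?? []).length > 0) {
    errors.push("privacy flags raised");
  }
  return errors;
}

// Usage: a quarantined update that also carries raw text.
const quarantined = {
  source: { hash: "sha256:..." },
  scope: "user",
  write_policy: { status: "quarantined" },
  lifecycle: { review_by: "2026-09-01" },
  injection_visible: true,
  privacy_flags: ["raw_memory_text"],
};
console.log(checkMemoryUpdate(quarantined));
// → [ 'write policy is not approved', 'privacy flags raised' ]
```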
|
|
34
|
+
|
|
35
|
+
## Positioning
|
|
36
|
+
|
|
37
|
+
Memory systems remember. Hooks and workflow engines execute. This receipt answers a narrower review question:
|
|
38
|
+
|
|
39
|
+
> Is this memory update allowed to become durable context for other agents?
|
|
40
|
+
|
|
41
|
+
That makes shared memory safer without requiring the memory provider to expose private content or the agent transcript.
|
|
@@ -0,0 +1,103 @@
|
|
|
1
|
+
# Parallel session review ledger
|
|
2
|
+
|
|
3
|
+
Use this when you run multiple Claude Code, Cursor, Codex, OpenClaw, or terminal-agent sessions in parallel and the bottleneck is no longer starting work — it is deciding whether each result can be trusted, rejected, or safely resumed.
|
|
4
|
+
|
|
5
|
+
The point is not orchestration. A review ledger is a small privacy-safe receipt that lets a human or a follow-up agent answer:
|
|
6
|
+
|
|
7
|
+
- what was this session assigned to do?
|
|
8
|
+
- which files, commands, and external systems was it allowed to touch?
|
|
9
|
+
- what does the agent claim changed?
|
|
10
|
+
- what evidence exists outside the agent summary?
|
|
11
|
+
- which checks are still missing?
|
|
12
|
+
- is the next reviewer allowed to continue, or should they stop and inspect?
|
|
13
|
+
|
|
14
|
+
Do **not** paste raw prompts, source code, secrets, customer data, transcripts, or full tool output into the ledger. Store stable references, hashes, check names, commit ids, redacted paths, and short evidence labels instead.
|
|
15
|
+
|
|
16
|
+
## Minimal receipt
|
|
17
|
+
|
|
18
|
+
```json
|
|
19
|
+
{
|
|
20
|
+
"schema": "pluribus.parallel_session_review_ledger.v1",
|
|
21
|
+
"generated_at": "2026-06-04T19:00:00Z",
|
|
22
|
+
"run": {
|
|
23
|
+
"orchestrator": "human",
|
|
24
|
+
"repo": "redacted-service",
|
|
25
|
+
"coordination_mode": "parallel_sessions"
|
|
26
|
+
},
|
|
27
|
+
"sessions": [
|
|
28
|
+
{
|
|
29
|
+
"id": "session-a",
|
|
30
|
+
"agent": "claude-code",
|
|
31
|
+
"assignment": "update validation for billing webhook retries",
|
|
32
|
+
"branch": "agent/billing-webhook-retry-validation",
|
|
33
|
+
"allowed_scope": {
|
|
34
|
+
"files": ["src/billing/**", "test/billing/**"],
|
|
35
|
+
"commands": ["npm test -- --test-name-pattern=billing"],
|
|
36
|
+
"network": "none"
|
|
37
|
+
},
|
|
38
|
+
"agent_claim": "added retry validation and regression coverage",
|
|
39
|
+
"evidence": [
|
|
40
|
+
{
|
|
41
|
+
"type": "commit",
|
|
42
|
+
"ref": "abc1234"
|
|
43
|
+
},
|
|
44
|
+
{
|
|
45
|
+
"type": "test",
|
|
46
|
+
"name": "billing retry validation",
|
|
47
|
+
"status": "passed"
|
|
48
|
+
}
|
|
49
|
+
],
|
|
50
|
+
"missing_checks": [],
|
|
51
|
+
"privacy_flags": [],
|
|
52
|
+
"state": "complete",
|
|
53
|
+
"safe_next_action": "review_diff"
|
|
54
|
+
}
|
|
55
|
+
]
|
|
56
|
+
}
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
## Review states
|
|
60
|
+
|
|
61
|
+
| State | Meaning | Required next action |
|
|
62
|
+
| --- | --- | --- |
|
|
63
|
+
| `complete` | Assignment is done and required evidence exists. | Review the diff or merge path normally. |
|
|
64
|
+
| `partial` | Some useful work exists but a required check or boundary is missing. | Continue only after the missing check/scope is resolved. |
|
|
65
|
+
| `blocked` | The agent stopped before a useful handoff. | Reassign or inspect the blocker before continuing. |
|
|
66
|
+
| `unsafe_to_resume` | Scope, privacy, command, or evidence boundaries were violated. | Stop; inspect manually before any follow-up agent uses the result. |
|
|
67
|
+
|
|
68
|
+
## Safe next actions
|
|
69
|
+
|
|
70
|
+
Use constrained verbs so another session does not turn a vague summary into authority:
|
|
71
|
+
|
|
72
|
+
- `review_diff`
|
|
73
|
+
- `run_missing_check`
|
|
74
|
+
- `continue_same_scope`
|
|
75
|
+
- `ask_human`
|
|
76
|
+
- `stop_manual_review`
|
|
77
|
+
|
|
78
|
+
## Copyable checker
|
|
79
|
+
|
|
80
|
+
The example in [`examples/parallel-session-review-ledger/`](../examples/parallel-session-review-ledger/) validates the useful minimum:
|
|
81
|
+
|
|
82
|
+
- every session has an assignment, branch, allowed scope, state, and safe next action;
|
|
83
|
+
- `complete` sessions have evidence and no missing checks;
|
|
84
|
+
- `partial` sessions name missing checks;
|
|
85
|
+
- `unsafe_to_resume` sessions use `stop_manual_review`;
|
|
86
|
+
- sessions with privacy flags cannot be marked complete;
|
|
87
|
+
- no session writes outside its declared file scope.
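The first five rules above can be sketched directly against the field names in the minimal receipt. This is an illustration under assumptions, not the packaged checker. The file-scope rule is omitted because the minimal receipt does not record which files a session changed, so checking it would need an extra field that this document does not define.

```javascript
// Field names follow the minimal receipt above; the error wording is illustrative.
const REQUIRED_KEYS = ["assignment", "branch", "allowed_scope", "state", "safe_next_action"];

function checkSession(session) {
  const errors = [];
  for (const key of REQUIRED_KEYS) {
    if (!session[key]) errors.push(`${session.id}: missing ${key}`);
  }
  const evidence = session.evidence ?? [];
  const missingChecks = session.missing_checks ?? [];
  const privacyFlags = session.privacy_flags ?? [];

  if (session.state === "complete") {
    if (evidence.length === 0) errors.push(`${session.id}: complete without evidence`);
    if (missingChecks.length > 0) errors.push(`${session.id}: complete with missing checks`);
    if (privacyFlags.length > 0) errors.push(`${session.id}: complete with privacy flags`);
  }
  if (session.state === "partial" && missingChecks.length === 0) {
    errors.push(`${session.id}: partial without named missing checks`);
  }
  if (session.state === "unsafe_to_resume" && session.safe_next_action !== "stop_manual_review") {
    errors.push(`${session.id}: unsafe_to_resume must use stop_manual_review`);
  }
  return errors;
}

// Usage: a partial session that names no missing check.
const sessionB = {
  id: "session-b",
  assignment: "add retry metrics",
  branch: "agent/retry-metrics",
  allowed_scope: { files: ["src/metrics/**"], commands: [], network: "none" },
  evidence: [],
  missing_checks: [],
  privacy_flags: [],
  state: "partial",
  safe_next_action: "run_missing_check",
};
console.log(checkSession(sessionB));
// → [ 'session-b: partial without named missing checks' ]
```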
|
|
88
|
+
|
|
89
|
+
Run it from the repo root:
|
|
90
|
+
|
|
91
|
+
```bash
|
|
92
|
+
node examples/parallel-session-review-ledger/check-parallel-session-review-ledger.mjs examples/parallel-session-review-ledger/parallel-session-review-ledger.json
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
Expected output:
|
|
96
|
+
|
|
97
|
+
```text
|
|
98
|
+
parallel session review ledger ok: 3 sessions checked
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
## Positioning
|
|
102
|
+
|
|
103
|
+
Parallel agents do not fail only because models are weak. They also fail because the human reviewer loses track of each run's boundary. A ledger turns "the agent says it is done" into a small resume/reject object: assignment, scope, evidence, missing checks, and safe next action.
|