pluribus-context 0.3.34 → 0.3.36
- package/CHANGELOG.md +23 -0
- package/README.md +2 -1
- package/bin/pluribus.js +12 -0
- package/docs/agent-firewall-denial-audit.md +95 -0
- package/docs/ai-pr-review-receipts.md +173 -0
- package/docs/canonical-output-receipts.md +107 -0
- package/docs/compaction-resume-receipts.md +43 -0
- package/docs/controlled-learning-queue.md +48 -0
- package/docs/dynamic-workflow-run-receipts.md +158 -0
- package/docs/install-plan-receipts.md +79 -0
- package/docs/loaded-resource-boundary.md +97 -0
- package/docs/mcp-tool-visibility-receipts.md +67 -0
- package/docs/memory-write-policy-receipts.md +41 -0
- package/docs/parallel-session-review-ledger.md +103 -0
- package/docs/phase-boundary-contracts.md +87 -0
- package/docs/review-primitive-gate.md +109 -0
- package/docs/skill-install-receipts.md +102 -0
- package/docs/skill-policy-receipts.md +87 -0
- package/docs/skill-use-rate-receipts.md +104 -0
- package/docs/subagent-role-receipts.md +95 -0
- package/docs/temporal-context-receipts.md +123 -0
- package/examples/agent-firewall-denial-audit/README.md +14 -0
- package/examples/agent-firewall-denial-audit/check-denial-audit.mjs +116 -0
- package/examples/agent-firewall-denial-audit/denial-envelope.json +9 -0
- package/examples/agent-firewall-denial-audit/operator-audit-record.json +20 -0
- package/examples/agent-skills/skill-policy-receipts/README.md +22 -0
- package/examples/agent-skills/skill-policy-receipts/SKILL.md +77 -0
- package/examples/ai-pr-review-receipts/.github/pull_request_template.md +31 -0
- package/examples/ai-pr-review-receipts/.github/workflows/ai-pr-review-receipt.yml +25 -0
- package/examples/ai-pr-review-receipts/README.md +55 -0
- package/examples/ai-pr-review-receipts/incomplete-review-primitive-receipt.json +43 -0
- package/examples/ai-pr-review-receipts/review-primitive-receipt.json +60 -0
- package/examples/canonical-output-receipts/canonical-output-receipt.json +55 -0
- package/examples/claude-code-review-hook/README.md +74 -0
- package/examples/claude-code-review-hook/check-review-receipt-hook.mjs +80 -0
- package/examples/claude-code-review-hook/sample-task-completed-event.json +6 -0
- package/examples/compaction-resume-receipts/README.md +12 -0
- package/examples/compaction-resume-receipts/check-resume-receipt.mjs +116 -0
- package/examples/compaction-resume-receipts/safe-resume-receipt.json +52 -0
- package/examples/compaction-resume-receipts/unsafe-resume-receipt.json +41 -0
- package/examples/controlled-learning-queue/README.md +26 -0
- package/examples/controlled-learning-queue/check-learning-queue.mjs +44 -0
- package/examples/controlled-learning-queue/leads/acme-job-card.md +12 -0
- package/examples/controlled-learning-queue/learning_queue.md +27 -0
- package/examples/controlled-learning-queue/memory/durable.md +10 -0
- package/examples/controlled-learning-queue/memory/working-notes.md +5 -0
- package/examples/controlled-learning-queue/role/job-contract.md +18 -0
- package/examples/controlled-learning-queue/skills/qualify-lead.md +17 -0
- package/examples/dynamic-workflow-run-receipts/README.md +18 -0
- package/examples/dynamic-workflow-run-receipts/workflow-run-receipt.json +112 -0
- package/examples/install-plan-receipts/README.md +34 -0
- package/examples/install-plan-receipts/agent-install-plan-receipt.json +56 -0
- package/examples/loaded-resource-boundary/README.md +22 -0
- package/examples/loaded-resource-boundary/check-loaded-resource-boundary.mjs +65 -0
- package/examples/loaded-resource-boundary/loaded-resource-boundary.json +69 -0
- package/examples/memory-write-policy/README.md +28 -0
- package/examples/memory-write-policy/approved-memory-update.json +48 -0
- package/examples/memory-write-policy/check-memory-update.mjs +120 -0
- package/examples/memory-write-policy/quarantined-memory-update.json +43 -0
- package/examples/parallel-session-review-ledger/README.md +13 -0
- package/examples/parallel-session-review-ledger/check-parallel-session-review-ledger.mjs +69 -0
- package/examples/parallel-session-review-ledger/parallel-session-review-ledger.json +72 -0
- package/examples/phase-boundary-contract/README.md +23 -0
- package/examples/phase-boundary-contract/check-phase-boundary.mjs +73 -0
- package/examples/phase-boundary-contract/phase-boundary-contract.json +68 -0
- package/examples/review-primitive-gate/README.md +19 -0
- package/examples/review-primitive-gate/check-review-receipt.mjs +100 -0
- package/examples/review-primitive-gate/fail-review-receipt.json +42 -0
- package/examples/review-primitive-gate/pass-review-receipt.json +54 -0
- package/examples/skill-install-receipts/README.md +31 -0
- package/examples/skill-install-receipts/check-skill-install-receipt.mjs +75 -0
- package/examples/skill-install-receipts/skill-install-receipt.json +79 -0
- package/examples/skill-use-rate-receipts/README.md +16 -0
- package/examples/skill-use-rate-receipts/check-skill-use-rate.mjs +89 -0
- package/examples/skill-use-rate-receipts/skill-use-rate-receipt.json +79 -0
- package/examples/subagent-role-receipts/README.md +15 -0
- package/examples/subagent-role-receipts/agents.toml +36 -0
- package/examples/temporal-context-receipts/CURRENT_STATE.md +13 -0
- package/examples/temporal-context-receipts/specs/2025-checkout-rewrite.md +10 -0
- package/examples/temporal-context-receipts/specs/2026-checkout-risk-notes.md +10 -0
- package/examples/temporal-context-receipts/temporal-authority-receipt.json +27 -0
- package/package.json +1 -1
- package/src/commands/demo.js +155 -0
- package/src/index.js +1 -0
- package/src/utils/version.js +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,29 @@
 
 All notable changes to Pluribus are documented here.
 
+## 0.3.36 - 2026-06-05
+
+- Added `pluribus demo skill-use-rate`, a tiny npm-runnable demo that validates the packaged Skill use-rate receipt and warns when installed/attached Skills have no observed invocations.
+
+
+- Added a GitHub Actions AI PR review receipt gate example that validates `agent.review_primitive_receipt.v1` evidence for AI-authored pull requests.
+- Added a memory write policy receipt guide and executable gate for approving or quarantining shared-memory updates before they become durable context across agents.
+
+## 0.3.35 - 2026-05-31
+
+- Added canonical-output receipts for preserving the last clean version of an artifact as versioned evidence instead of treating old chats as source of truth.
+- Added a review primitive gate example that fails unsafe agent handoffs on unapproved scope changes, skipped/failed required checks, missing evidence, privacy leaks, or `partial` / `unsafe-to-resume` state.
+- Added a Claude Code hook bridge that runs the review primitive gate from lifecycle hook JSON, making handoff proof enforceable at `TaskCompleted` / `PostCompact` style boundaries.
+- Tracked fresh market signals around dynamic workflow cost accounting, post-compact recovery, active idea indexes, Skill policy alternatives, and control-plane buying thresholds.
+
+- Add Skill policy receipts guide and copyable Agent Skill recipe for hard-rule enforcement, target allow/refuse decisions, and post-write guards.
+- Add temporal context receipts guide and copyable current-state/spec example for long-lived projects where old docs still match search but are no longer implementation authority.
+- Add AI PR review receipts guide and copyable PR template for reviewing agent-generated changes by blast radius instead of diff size alone.
+- Add subagent role receipts guide and copyable `agents.toml` example for proving requested role, effective role, loaded instruction source, allowed/refused capabilities, stop point, and next safe action.
+- Add install-plan receipts guide and copyable example for setup scripts that configure MCP servers, Skills, instruction files, hooks, or plugins across multiple AI coding tools before writes begin.
+- Added an MCP tool visibility receipts checklist for debugging servers that launch and return `tools/list` directly but do not surface tools in the actual agent client catalog.
+- Tracked GBrain `unify-types` mutation-mode feedback as an external receipt signal for protected migrations that may default from dry-run into writes.
+
 ## 0.3.34 - 2026-05-26
 
 - Repositioned README and the community review packet around privacy-safe agent context receipts first, with instruction-file audit/sync as the supporting workflow, so directory reviewers do not mistake Pluribus for another generic ContextOps, memory, RAG, or rules-sync tool.
package/README.md
CHANGED
@@ -161,7 +161,7 @@ npx --yes pluribus-context@latest sync --dry-run
 
 If the preview looks right, run `npx --yes pluribus-context@latest sync` to write the tool-specific files.
 
-For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If you are reviewing Pluribus for a list, newsletter, or tool directory, use the [Community Review Packet](docs/community-review-packet.md) for directory submission fields, a one-line description, safety notes, and a disposable 60-second smoke test. Maintainers can track package/repo discovery with the [Discovery Smoke Checks](docs/discovery-smoke.md).
+For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If your shared-memory or knowledge-graph setup lets agents write durable facts, use [Memory write policy receipts](docs/memory-write-policy-receipts.md) and the [copyable gate](examples/memory-write-policy/) to require proposed diffs, scope, lifecycle, visibility, approval, and privacy checks before one run can teach every harness. If hooks, local gateways, or agent firewalls block risky tool calls, use [Agent firewall denial/audit receipts](docs/agent-firewall-denial-audit.md) and the [copyable checker](examples/agent-firewall-denial-audit/) to split model-visible denial from private operator audit evidence. If you are turning Claude Code/OpenClaw/Cursor into role-based “AI employee” agents with Skills and memory folders, use the [Controlled learning queue](docs/controlled-learning-queue.md) and [copyable example](examples/controlled-learning-queue/) to let agents propose durable memory changes without silently rewriting shared ICP, pricing, compliance, or process assumptions. If `PreCompact` / `PostCompact` or `SessionStart(compact)` workflows decide whether an agent may continue after summarization, use [Compaction resume receipts](docs/compaction-resume-receipts.md) and the [copyable gate](examples/compaction-resume-receipts/) to prove what was summarized, which instruction sources reloaded, what state was lost/kept, and whether `safe_to_resume` is actually true. If an MCP server is healthy but tools are missing in Claude Code/Cursor/Codex, use the [MCP tool visibility receipts](docs/mcp-tool-visibility-receipts.md) checklist to separate launch, handshake, `tools/list`, client catalog, and first invocation failures. If a Claude Code/OpenClaw-style Skill states a hard rule but the run still violates it, use the [Skill policy receipts](docs/skill-policy-receipts.md) guide and [copyable Skill recipe](examples/agent-skills/skill-policy-receipts/) to turn target decisions, refusals, and post-write guards into privacy-safe evidence. If a Skill, plugin resource, MCP instruction, or custom-agent file exists but disappears in ACP/Zed/CLI/chat parity tests, use [Loaded-resource boundary receipts](docs/loaded-resource-boundary.md) and the [copyable checker](examples/loaded-resource-boundary/) to prove discovered, attached, injected, readable, and skipped-resource stages. If long-lived projects keep old specs/TODOs that still match grep but are no longer authoritative, use [Temporal context receipts](docs/temporal-context-receipts.md) and the [copyable current-state example](examples/temporal-context-receipts/) to separate current authority from historical citations before an agent writes code. If AI-generated pull requests are hard to review because diff size hides operational risk, use [AI PR review receipts](docs/ai-pr-review-receipts.md), the [copyable PR template](examples/ai-pr-review-receipts/), and the [GitHub Actions receipt gate](examples/ai-pr-review-receipts/.github/workflows/ai-pr-review-receipt.yml) to review by blast radius: schema/data contracts, async paths, rollout gates, side effects, and ambiguous boundaries. If you delegate work to Codex/Claude Code/Cursor/OpenClaw-style specialist subagents, use [Subagent role receipts](docs/subagent-role-receipts.md) and the [example role definitions](examples/subagent-role-receipts/) to prove the requested role, effective role, loaded instruction source, allowed/refused capabilities, stop point, and next safe action. If you run Claude Code-style dynamic workflows, ultracode, or local LLM gateway orchestration that spawns many agents, use [Dynamic workflow run receipts](docs/dynamic-workflow-run-receipts.md) and the [copyable workflow example](examples/dynamic-workflow-run-receipts/) to prove phases, per-agent roles/models, context loaded/skipped, tool grants, token spend buckets, per-agent fuses, heartbeat, stop reasons, and known gaps. If your workflow routes Explore/Propose/Spec/Design/Tasks/Apply/Verify across OpenCode, Claude Code, Cursor, Codex, or different models, use [Phase-boundary contracts](docs/phase-boundary-contracts.md) and the [copyable Apply→Verify gate](examples/phase-boundary-contract/) to prove allowed input context, output artifact, evidence required before the next phase, dropped context, and stop conditions. If you need CI/reviewers to decide whether an agent handoff can continue, must be reviewed, or should be rejected, use the [Review primitive gate](docs/review-primitive-gate.md), its [copyable gate example](examples/review-primitive-gate/), and the [Claude Code review hook bridge](examples/claude-code-review-hook/) to validate assignment boundaries, approved scope/access changes, required checks, privacy flags, and `complete / partial / unsafe-to-resume` state from CI or Claude Code `TaskCompleted` / `PostCompact` hooks. If Claude Projects, long chats, or compaction make the last clean artifact hard to recover, use [Canonical output receipts](docs/canonical-output-receipts.md) and the [copyable index example](examples/canonical-output-receipts/) to track stable IDs, paths, versions, exact grep phrases, decisions, rejected options, and next actions. If a setup script installs MCP servers, Skills, instruction files, hooks, or plugins across multiple agents, use [Install-plan receipts](docs/install-plan-receipts.md) and the [copyable example](examples/install-plan-receipts/) to prove planned writes, backups, network behavior, and `writes_started=false` before mutation. After a Skill installer runs, use [Skill install/load receipts](docs/skill-install-receipts.md) and the [copyable checker](examples/skill-install-receipts/) to prove source ref, target agents/scopes, discovery/load status, context-cost bucket, and `safe_to_start_session` without logging raw Skill bodies. If you are pruning Skill sprawl after real sessions, use [Skill use-rate receipts](docs/skill-use-rate-receipts.md) and the [copyable checker](examples/skill-use-rate-receipts/) to separate discovered/installed/attached from invoked/acted-on and catch "installed but unused" resources. If you supervise multiple Claude Code/Cursor/Codex/OpenClaw sessions in parallel, use the [Parallel session review ledger](docs/parallel-session-review-ledger.md) and [copyable checker](examples/parallel-session-review-ledger/) to decide which sessions are complete, partial, blocked, or unsafe to resume without trusting an agent summary. If you are reviewing Pluribus for a list, newsletter, or tool directory, use the [Community Review Packet](docs/community-review-packet.md) for directory submission fields, a one-line description, safety notes, and a disposable 60-second smoke test. Maintainers can track package/repo discovery with the [Discovery Smoke Checks](docs/discovery-smoke.md).
 
 ### Usage
 
@@ -407,6 +407,7 @@ If you've felt this pain, tell me about your setup. What tools do you use? How d
 - [OpenClaw Integration](docs/openclaw-integration.md) — how Pluribus generates `AGENTS.md` for OpenClaw
 - [Composable Contexts](docs/composable-contexts.md) — local/remote imports, merge behavior, and safety rules
 - [MCP Memory Handoff](docs/memory-mcp-handoff.md) — demo for keeping memory recall/store protocols aligned across tool-specific instruction files
+- [MCP Tool Visibility Receipts](docs/mcp-tool-visibility-receipts.md) — checklist for debugging healthy MCP servers whose tools do not appear in the agent client catalog
 - [Remote Composable Context Imports](docs/remote-composable-context-imports.md) — design notes for lockfile/cache/auth hardening
 - [Context Format Spec](spec/context-format.md) — the `pluribus.md` format reference
 - [Skills Format Spec](spec/skills-format.md) — how adapters work and how to write custom skills
package/bin/pluribus.js
CHANGED
@@ -10,6 +10,7 @@ import { runSync } from '../src/commands/sync.js'
 import { runValidate } from '../src/commands/validate.js'
 import { runWatch } from '../src/commands/watch.js'
 import { runAudit } from '../src/commands/audit.js'
+import { runDemo } from '../src/commands/demo.js'
 import { parseArgs } from '../src/utils/args.js'
 import { SUPPORTED_TOOLS } from '../src/skills/built-in.js'
 import { VERSION } from '../src/utils/version.js'
@@ -28,6 +29,7 @@ COMMANDS
   validate   Validate pluribus.md before syncing
   audit      Compare generated tool files with pluribus.md without writing
   watch      Watch pluribus.md and auto-sync after changes
+  demo       Run tiny packaged demos from npm without cloning the repo
   help       Show this help message
 
 OPTIONS (init)
@@ -64,6 +66,10 @@ OPTIONS (watch)
   --once       Exit after the first change-triggered sync
   --debounce   Debounce delay in ms (minimum 300, default 400)
 
+OPTIONS (demo)
+  --receipt    Validate a custom skill use-rate receipt JSON file
+  --json       Print machine-readable demo results
+
 EXAMPLES
   pluribus init
   pluribus init --dry-run
@@ -81,6 +87,8 @@ EXAMPLES
   pluribus audit --strict --github-annotations
   pluribus audit --json --fidelity-report
   pluribus watch --tools claude,cursor
+  pluribus demo skill-use-rate
+  pluribus demo skill-use-rate --json
 
 DOCS
   https://github.com/caioribeiroclw-pixel/pluribus
@@ -92,6 +100,7 @@ const COMMAND_FLAGS = {
   validate: new Set(['source', 'update-imports']),
   audit: new Set(['source', 'tools', 'update-imports', 'strict', 'ci', 'json', 'output', 'github-annotations', 'fidelity-report']),
   watch: new Set(['source', 'tools', 'update-imports', 'dry-run', 'once', 'debounce']),
+  demo: new Set(['receipt', 'json']),
 }
 
 function getFlagNames(argv) {
@@ -152,6 +161,9 @@ async function main() {
     case 'audit':
       await runAudit(parsedArgs)
       break
+    case 'demo':
+      await runDemo(parsedArgs, commandArgs.filter((arg) => !arg.startsWith('--') && !Object.values(parsedArgs).includes(arg)))
+      break
     default:
       console.error(`❌ Unknown command: "${command}"`)
       console.log(`Run \`pluribus help\` for usage.`)
@@ -0,0 +1,95 @@
+# Agent firewall denial/audit receipts
+
+Claude Code hooks, OpenClaw policies, local MCP gateways, and agent firewalls can block destructive commands, outbound calls, or risky writes before an agent executes them.
+
+The hard part is not only blocking. If the model sees a vague failure, it may keep trying variants. If the model sees too much detail, the denial can leak secrets, raw policy logic, or bypass hints.
+
+Use a split receipt:
+
+1. **Model-visible denial envelope** — minimal structured feedback the agent can act on safely.
+2. **Operator audit record** — privacy-safe evidence for the human/operator, CI, or local dashboard.
+
+## Model-visible denial envelope
+
+The model should receive enough information to stop, ask, or choose a safe alternative, without exposing raw secrets, raw commands, or sensitive policy internals.
+
+```json
+{
+  "type": "agent_firewall_denial.v1",
+  "decision": "blocked",
+  "reasonClass": "destructive_git",
+  "requiresApproval": true,
+  "safeAlternative": "Explain the planned git operation and wait for explicit approval.",
+  "retrySafety": "unsafe_until_approved",
+  "correlationId": "deny_2026_06_02_2200_7f3a"
+}
+```
+
+Good `reasonClass` values are coarse and non-secret:
+
+- `destructive_git`
+- `filesystem_write_out_of_scope`
+- `outbound_after_secret_read`
+- `credential_exposure_risk`
+- `package_publish_requires_approval`
+- `unknown_policy_boundary`
+
+The denial should avoid:
+
+- raw shell commands;
+- raw file contents;
+- secret values or secret-looking substrings;
+- full policy source;
+- exact bypass instructions;
+- absolute private paths when a path class or hash is enough.
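
One way to honor the avoid-list above is to build the model-visible envelope from an explicit allowlist, so anything not named is dropped by construction. The sketch below is illustrative only: `toModelEnvelope`, `ENVELOPE_FIELDS`, and the shape of the internal `internalDecision` object (including `rawCommand` and `policySource`) are assumptions, not part of the packaged example. The allowlisted field names are taken from the denial envelope shown in this guide.

```javascript
// Hypothetical helper; not part of the pluribus-context package.
// Field names copied from the agent_firewall_denial.v1 example in this guide.
const ENVELOPE_FIELDS = [
  'decision',
  'reasonClass',
  'requiresApproval',
  'safeAlternative',
  'retrySafety',
  'correlationId',
]

// Build the model-visible envelope from an internal decision object.
// Only allowlisted keys are copied, so extra keys such as raw commands
// or policy source on the internal object are never forwarded.
function toModelEnvelope(decision) {
  const envelope = { type: 'agent_firewall_denial.v1' }
  for (const field of ENVELOPE_FIELDS) {
    if (field in decision) envelope[field] = decision[field]
  }
  return envelope
}

// Assumed internal decision shape carrying private details.
const internalDecision = {
  decision: 'blocked',
  reasonClass: 'destructive_git',
  requiresApproval: true,
  safeAlternative: 'Explain the planned git operation and wait for explicit approval.',
  retrySafety: 'unsafe_until_approved',
  correlationId: 'deny_2026_06_02_2200_7f3a',
  rawCommand: 'git push --force origin main', // private: must not be forwarded
  policySource: 'deny if branch == main', // private: must not be forwarded
}

const modelEnvelope = toModelEnvelope(internalDecision)
console.log(JSON.stringify(modelEnvelope, null, 2))
```

An allowlist only filters keys. Free-text values such as `safeAlternative` still need to be written so they do not echo commands, paths, or secrets.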
+
+## Operator audit record
+
+The operator needs more detail, but still not raw prompts, code, or secrets. Prefer hashes, policy ids, classes, and booleans.
+
+```json
+{
+  "type": "agent_firewall_operator_audit.v1",
+  "decision": "blocked",
+  "correlationId": "deny_2026_06_02_2200_7f3a",
+  "tool": "Bash",
+  "commandHash": "sha256:0e5751c026e543b2a6f2b4d7a7c8d8e5b81b69c5b9f7db2a5b94f31f987e7f44",
+  "cwdHash": "sha256:dcdb704109a454784b81229d2b05f368692e758bfa33cb61d04c1b93791b0273",
+  "matchedPolicyIds": ["git.destructive.requires_approval"],
+  "sessionTaint": {
+    "secretRead": false,
+    "privateFileRead": true,
+    "networkAccessed": false
+  },
+  "approval": {
+    "state": "missing",
+    "requiredFrom": "operator"
+  },
+  "retrySafety": "unsafe_until_approved",
+  "modelEnvelopeHash": "sha256:a1bcaa1cb2572ab0e735c30062a268391d0a9d1b3dd7ff4b14065d8b29513b2a"
+}
+```
+
+## Invariant
+
+A blocked tool call should never disappear into the middle ground of “the command just failed.”
+
+- The **model** gets a safe reason class and next action.
+- The **operator** gets policy evidence and retry safety.
+- The shared identifier is a correlation id plus hashes, not raw private payloads.
+
+That makes enforcement auditable without turning policy internals into model-visible bypass material.
+
+## Try the copyable example
+
+See [`examples/agent-firewall-denial-audit/`](../examples/agent-firewall-denial-audit/) for a tiny denial envelope, operator audit record, and local checker:
+
+```bash
+node examples/agent-firewall-denial-audit/check-denial-audit.mjs examples/agent-firewall-denial-audit
+```
+
+The checker is intentionally small. It fails if the model-visible envelope leaks command/path/policy/secret-looking fields, if the audit record lacks policy ids or hash evidence, or if the envelope/audit correlation id does not match.
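
The three failure conditions described above can be sketched as one small pure function. This is not the packaged `check-denial-audit.mjs`, whose contents are not shown here; the function name `checkDenialAudit`, the forbidden-key pattern, the required hash fields, and the error messages are all assumptions chosen for illustration, using field names from the two JSON examples in this guide.

```javascript
// Illustrative sketch only; the packaged checker may work differently.
// Assumed pattern for envelope keys that would expose private detail.
const FORBIDDEN_KEY = /command|path|cwd|policy|secret|token|password/i

function checkDenialAudit(envelope, audit) {
  const errors = []

  // 1. The model-visible envelope must not carry leak-prone fields.
  for (const key of Object.keys(envelope)) {
    if (FORBIDDEN_KEY.test(key)) errors.push(`envelope leaks field: ${key}`)
  }

  // 2. The audit record needs policy ids and hash evidence.
  if (!Array.isArray(audit.matchedPolicyIds) || audit.matchedPolicyIds.length === 0) {
    errors.push('audit record has no matchedPolicyIds')
  }
  for (const field of ['commandHash', 'cwdHash', 'modelEnvelopeHash']) {
    if (!/^sha256:[0-9a-f]{64}$/.test(String(audit[field]))) {
      errors.push(`audit record lacks sha256 evidence: ${field}`)
    }
  }

  // 3. Both halves must share one correlation id.
  if (!envelope.correlationId || envelope.correlationId !== audit.correlationId) {
    errors.push('correlationId mismatch between envelope and audit record')
  }
  return errors
}

// Sample inputs with placeholder hashes (64 hex characters each).
const sampleEnvelope = {
  type: 'agent_firewall_denial.v1',
  decision: 'blocked',
  reasonClass: 'destructive_git',
  correlationId: 'deny_2026_06_02_2200_7f3a',
}
const sampleAudit = {
  type: 'agent_firewall_operator_audit.v1',
  correlationId: 'deny_2026_06_02_2200_7f3a',
  matchedPolicyIds: ['git.destructive.requires_approval'],
  commandHash: 'sha256:' + 'a'.repeat(64),
  cwdHash: 'sha256:' + 'b'.repeat(64),
  modelEnvelopeHash: 'sha256:' + 'c'.repeat(64),
}

const okErrors = checkDenialAudit(sampleEnvelope, sampleAudit)
const badErrors = checkDenialAudit(
  { ...sampleEnvelope, rawCommand: 'git reset --hard' },
  { ...sampleAudit, correlationId: 'deny_other' },
)
console.log(okErrors.length, badErrors)
```

Returning a list of errors rather than throwing on the first one lets a CI wrapper print every problem in one run and set a non-zero exit code when the list is non-empty.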
+
+## How this fits Pluribus
+
+Pluribus is not an agent firewall. This recipe is for teams already using hooks, policy engines, or local gateways and needing privacy-safe evidence at the enforcement boundary: what was denied, what the model was safely told, and what the operator can audit later.
@@ -0,0 +1,173 @@
+# AI PR review receipts
+
+AI-generated PRs are not risky because they are large or small. They are risky when the reviewer cannot tell which operational boundaries the agent touched.
+
+Use this recipe when Claude Code, Cursor, Codex, Copilot agents, OpenClaw, or another coding agent opens a PR and the team needs a compact review artifact before merge.
+
+The goal is not to log prompts, transcripts, source code, stack traces, secrets, customer data, or raw tool output. The goal is a privacy-safe receipt that proves the review unit: blast radius.
+
+## When this helps
+
+Use an AI PR review receipt when a change may affect:
+
+- database schema, migrations, backfills, or persisted data contracts;
+- readers/writers that run while a migration or rollout is in progress;
+- async jobs, queues, cron tasks, webhooks, retries, or background workers;
+- feature flags, rollout gates, kill switches, or compatibility shims;
+- external side effects such as payments, email, auth, billing, search indexes, analytics, or third-party APIs;
+- generated files, public APIs, plugin manifests, MCP/Skill/hook configuration, or security-sensitive project config.
+
+If none of these apply, the receipt can say so. That negative claim is still useful because it tells the human reviewer what the agent believes it did **not** touch.
|
|
21
|
+
|
|
22
|
+
## Receipt shape
|
|
23
|
+
|
|
24
|
+
Attach this as a PR body section, `REVIEW.md` note, check-run summary, or bot comment.
|
|
25
|
+
|
|
26
|
+
```json
|
|
27
|
+
{
|
|
28
|
+
"type": "review.blast_radius.v1",
|
|
29
|
+
"pr": {
|
|
30
|
+
"source": "agent-pr",
|
|
31
|
+
"review_requested": true,
|
|
32
|
+
"human_review_required": true
|
|
33
|
+
},
|
|
34
|
+
"boundaries": [
|
|
35
|
+
{
|
|
36
|
+
"name": "schema_or_data_contract",
|
|
37
|
+
"status": "touched",
|
|
38
|
+
"evidence": "migration file added; live reader compatibility checked",
|
|
39
|
+
"risk_tier": "high",
|
|
40
|
+
"review_owner": "backend"
|
|
41
|
+
},
|
|
42
|
+
{
|
|
43
|
+
"name": "async_or_background_path",
|
|
44
|
+
"status": "not_touched",
|
|
45
|
+
"evidence": "no queue/cron/webhook paths changed in diff summary",
|
|
46
|
+
"risk_tier": "low"
|
|
47
|
+
},
|
|
48
|
+
{
|
|
49
|
+
"name": "rollout_gate",
|
|
50
|
+
"status": "present",
|
|
51
|
+
"evidence": "feature flag path exists before new behavior is enabled",
|
|
52
|
+
"risk_tier": "medium"
|
|
53
|
+
},
|
|
54
|
+
{
|
|
55
|
+
"name": "external_side_effect",
|
|
56
|
+
"status": "ambiguous",
|
|
57
|
+
"evidence": "email sender import changed; no dry-run evidence found",
|
|
58
|
+
"risk_tier": "high",
|
|
59
|
+
"blocked_until": "reviewer confirms side-effect behavior"
|
|
60
|
+
}
|
|
61
|
+
],
|
|
62
|
+
"tests_and_checks": [
|
|
63
|
+
{
|
|
64
|
+
"name": "unit_or_integration_tests",
|
|
65
|
+
"status": "passed",
|
|
66
|
+
"scope": "changed package only"
|
|
67
|
+
},
|
|
68
|
+
{
|
|
69
|
+
"name": "migration_or_rollback_check",
|
|
70
|
+
"status": "missing",
|
|
71
|
+
"blocks_merge": true
|
|
72
|
+
}
|
|
73
|
+
],
|
|
74
|
+
"decision": {
|
|
75
|
+
"merge_ready": false,
|
|
76
|
+
"reason": "external side effect and rollback evidence are ambiguous",
|
|
77
|
+
"next_safe_action": "ask backend owner to review email behavior and migration rollback before merge"
|
|
78
|
+
},
|
|
79
|
+
"privacy": {
|
|
80
|
+
"raw_prompt_logged": false,
|
|
81
|
+
"raw_source_logged": false,
|
|
82
|
+
"raw_tool_output_logged": false,
|
|
83
|
+
"secrets_logged": false,
|
|
84
|
+
"customer_data_logged": false
|
|
85
|
+
}
|
|
86
|
+
}
|
|
87
|
+
```
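One way to read the receipt above is that the `decision` block should follow from the boundary and check entries rather than be asserted separately. The sketch below assumes a simple rule that the source does not spell out: any `ambiguous` boundary, or any check marked `blocks_merge` that has not passed, makes the PR not merge-ready. The function name is hypothetical.

```javascript
// Assumed rule: ambiguous boundaries and unpassed blocking checks prevent merge.
function deriveDecision(receipt) {
  const reasons = [];

  for (const boundary of receipt.boundaries ?? []) {
    if (boundary.status === "ambiguous") {
      reasons.push(`boundary ${boundary.name} is ambiguous`);
    }
  }

  for (const check of receipt.tests_and_checks ?? []) {
    if (check.blocks_merge === true && check.status !== "passed") {
      reasons.push(`check ${check.name} is ${check.status}`);
    }
  }

  return { merge_ready: reasons.length === 0, reasons };
}
```

Applied to the example receipt, this rule yields `merge_ready: false` with two reasons (the ambiguous `external_side_effect` boundary and the missing `migration_or_rollback_check`), which agrees with the `decision` block shown there.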
|
|
88
|
+
|
|
89
|
+
## Minimal PR template
|
|
90
|
+
|
|
91
|
+
Copy this into `.github/pull_request_template.md` or a review-bot comment.
|
|
92
|
+
|
|
93
|
+
```markdown
|
|
94
|
+
## AI PR review receipt
|
|
95
|
+
|
|
96
|
+
This PR was prepared or modified by an AI coding agent. Review by blast radius, not by diff size alone.
|
|
97
|
+
|
|
98
|
+
### Boundary receipt
|
|
99
|
+
|
|
100
|
+
| Boundary | Status | Evidence | Risk tier | Owner / blocker |
|
|
101
|
+
| --- | --- | --- | --- | --- |
|
|
102
|
+
| Schema / persisted data contract | `touched / not_touched / ambiguous` | | | |
|
|
103
|
+
| Live reader/writer compatibility | `checked / missing / n/a` | | | |
|
|
104
|
+
| Async jobs / queues / cron / webhooks | `touched / not_touched / ambiguous` | | | |
|
|
105
|
+
| Rollout gate / feature flag / kill switch | `present / missing / n/a` | | | |
|
|
106
|
+
| External side effects | `declared / not_touched / ambiguous` | | | |
|
|
107
|
+
| Generated files / public API / plugin config | `touched / not_touched / ambiguous` | | | |
|
|
108
|
+
|
|
109
|
+
### Checks
|
|
110
|
+
|
|
111
|
+
- [ ] Tests relevant to touched boundaries passed.
|
|
112
|
+
- [ ] Migration/backfill/rollback behavior is explicit, or not applicable.
|
|
113
|
+
- [ ] External side effects are declared, or not touched.
|
|
114
|
+
- [ ] Any `ambiguous` boundary has an owner before merge.
|
|
115
|
+
|
|
116
|
+
### Privacy
|
|
117
|
+
|
|
118
|
+
This receipt does not include raw prompts, transcripts, source code, secrets, customer data, stack traces, or raw tool output.
|
|
119
|
+
|
|
120
|
+
### Decision
|
|
121
|
+
|
|
122
|
+
`merge_ready: yes/no`
|
|
123
|
+
|
|
124
|
+
`next_safe_action:`
|
|
125
|
+
```
|
|
126
|
+
|
|
127
|
+
## CI gate example
|
|
128
|
+
|
|
129
|
+
The copyable example in [`examples/ai-pr-review-receipts/`](../examples/ai-pr-review-receipts/) includes:
|
|
130
|
+
|
|
131
|
+
- a PR template for human-readable blast-radius review;
|
|
132
|
+
- a GitHub Actions workflow that validates a machine-readable `agent.review_primitive_receipt.v1` receipt;
|
|
133
|
+
- a passing fixture and an intentionally failing fixture.
|
|
134
|
+
|
|
135
|
+
Run the smoke test locally from the repository root:
|
|
136
|
+
|
|
137
|
+
```bash
|
|
138
|
+
node examples/review-primitive-gate/check-review-receipt.mjs \
|
|
139
|
+
examples/ai-pr-review-receipts/review-primitive-receipt.json
|
|
140
|
+
|
|
141
|
+
node examples/review-primitive-gate/check-review-receipt.mjs \
|
|
142
|
+
examples/ai-pr-review-receipts/incomplete-review-primitive-receipt.json
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
The first command should pass. The second should fail, because partial, unsafe, or under-evidenced agent work should not silently pass a merge gate.
|
|
146
|
+
|
|
147
|
+
## How to use with Pluribus
|
|
148
|
+
|
|
149
|
+
Pluribus does not need to own your PR workflow. Use it as the neutral language for evidence that crossed an agent boundary:
|
|
150
|
+
|
|
151
|
+
- `review_boundary_schema_data`
|
|
152
|
+
- `live_reader_writer_compatibility`
|
|
153
|
+
- `review_boundary_async_path`
|
|
154
|
+
- `rollout_gate_present`
|
|
155
|
+
- `external_side_effects_declared`
|
|
156
|
+
- `not_touched_boundary_claim`
|
|
157
|
+
- `ambiguous_boundary_blocks_merge`
|
|
158
|
+
- `risk_tier_evidence`
|
|
159
|
+
- `next_safe_action`
|
|
160
|
+
|
|
161
|
+
The same terms can appear in a GitHub PR template, a Claude Code `/code-review` note, an OpenClaw task receipt, a CI check summary, or a release checklist.
|
|
162
|
+
|
|
163
|
+
## Bad receipts
|
|
164
|
+
|
|
165
|
+
Avoid receipts that say only:
|
|
166
|
+
|
|
167
|
+
- “tests passed”;
|
|
168
|
+
- “Claude reviewed it”;
|
|
169
|
+
- “small PR”;
|
|
170
|
+
- “no issues found”;
|
|
171
|
+
- “looks safe.”
|
|
172
|
+
|
|
173
|
+
Those are conclusions. A useful receipt names the boundary, the evidence, the risk tier, and the next safe action when something is ambiguous.
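To make that distinction mechanical, a gate can reject receipts that carry only a conclusion. The following is a minimal sketch, assuming the `review.blast_radius.v1` field names from the receipt shape earlier in this document and a three-value risk tier; the function name is hypothetical and the rules are illustrative, not the shipped checker.

```javascript
const RISK_TIERS = ["low", "medium", "high"]; // assumed allowed values

function findWeakBoundaries(receipt) {
  const problems = [];
  const boundaries = receipt.boundaries ?? [];

  // "tests passed" or "looks safe" with no boundaries is a conclusion, not a receipt.
  if (boundaries.length === 0) problems.push("receipt names no boundaries");

  for (const boundary of boundaries) {
    const label = boundary.name || "(unnamed boundary)";
    if (!boundary.name) problems.push("boundary without a name");
    if (typeof boundary.evidence !== "string" || boundary.evidence.trim() === "") {
      problems.push(`${label}: no evidence`);
    }
    if (!RISK_TIERS.includes(boundary.risk_tier)) {
      problems.push(`${label}: no risk tier`);
    }
  }

  // An ambiguous boundary needs a stated next safe action.
  const hasAmbiguous = boundaries.some((boundary) => boundary.status === "ambiguous");
  if (hasAmbiguous && !receipt.decision?.next_safe_action) {
    problems.push("ambiguous boundary without next_safe_action");
  }

  return problems;
}
```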
|
|
@@ -0,0 +1,107 @@
|
|
|
1
|
+
# Canonical output receipts
|
|
2
|
+
|
|
3
|
+
Claude Projects, long Claude Code sessions, and other agent workspaces are useful archives, but search is a weak source of truth. Exact phrases can be hard to recover, project-scoped search may miss the last clean version, and a later chat can overwrite the user's memory of which artifact is authoritative.
|
|
4
|
+
|
|
5
|
+
Use a canonical output receipt when a session produces something that should be found and reused later: a master prompt, escalation memo, architecture decision, migration plan, test matrix, runbook, product brief, or reviewed context file.
|
|
6
|
+
|
|
7
|
+
The receipt is not persistent memory and not a transcript dump. It is a small index card for the last clean artifact: stable id/path, version, exact grep phrases, decisions, rejected alternatives, open questions, and next action — without logging raw private content, secrets, customer data, or full chat history.
|
|
8
|
+
|
|
9
|
+
## When this helps
|
|
10
|
+
|
|
11
|
+
Use this receipt when:
|
|
12
|
+
|
|
13
|
+
- a Claude Project or long chat produces a canonical artifact that must survive fuzzy chat search;
|
|
14
|
+
- several sessions produce competing versions and a reviewer needs the current one;
|
|
15
|
+
- the source chat may be compacted, archived, deleted, exported, or hard to search;
|
|
16
|
+
- a team needs to know which artifact should be copied into repo docs, `pluribus.md`, `CLAUDE.md`, `AGENTS.md`, a prompt library, or a ticket;
|
|
17
|
+
- exact phrases, dates, decisions, and rejected options matter more than the full conversation.
|
|
18
|
+
|
|
19
|
+
## Receipt shape
|
|
20
|
+
|
|
21
|
+
Attach this to the artifact, repo issue, PR body, project notes, or a `canonical_outputs.md` index.
|
|
22
|
+
|
|
23
|
+
```json
|
|
24
|
+
{
|
|
25
|
+
"type": "canonical.output.receipt.v1",
|
|
26
|
+
"artifact": {
|
|
27
|
+
"stable_id": "project-alpha-master-prompt-2026-05-30",
|
|
28
|
+
"name": "Project Alpha master prompt",
|
|
29
|
+
"kind": "master_prompt",
|
|
30
|
+
"canonical_path": "docs/prompts/project-alpha-master-prompt.md",
|
|
31
|
+
"current_version": "2026-05-30.1",
|
|
32
|
+
"content_hash": "sha256:example-only",
|
|
33
|
+
"status": "current",
|
|
34
|
+
"owner_label": "product-ops",
|
|
35
|
+
"created_at": "2026-05-30T21:40:00Z",
|
|
36
|
+
"last_reviewed_at": "2026-05-30T21:58:00Z"
|
|
37
|
+
},
|
|
38
|
+
"source": {
|
|
39
|
+
"workspace": "claude-project-alpha",
|
|
40
|
+
"source_session_id": "session-redacted-2026-05-30",
|
|
41
|
+
"source_tool": "claude-projects",
|
|
42
|
+
"source_chat_title": "Master prompt rebuild",
|
|
43
|
+
"source_url_or_path_redacted": true,
|
|
44
|
+
"raw_transcript_logged": false
|
|
45
|
+
},
|
|
46
|
+
"index": {
|
|
47
|
+
"exact_phrases_worth_grepping": [
|
|
48
|
+
"do not collapse escalation paths into summaries",
|
|
49
|
+
"billing exports are evidence, not source of truth",
|
|
50
|
+
"final prompt contract v3"
|
|
51
|
+
],
|
|
52
|
+
"tags": ["master-prompt", "billing", "escalation", "current-state"],
|
|
53
|
+
"related_artifacts": ["billing-escalation-runbook-2026-05-28"]
|
|
54
|
+
},
|
|
55
|
+
"decisions": {
|
|
56
|
+
"accepted": [
|
|
57
|
+
"Use repo-owned markdown as the canonical copy, not old chats",
|
|
58
|
+
"Keep escalation criteria in the prompt body and test cases in a separate appendix"
|
|
59
|
+
],
|
|
60
|
+
"rejected": [
|
|
61
|
+
{
|
|
62
|
+
"option": "Rely on Claude Project conversation search for recovery",
|
|
63
|
+
"reason": "exact phrase and project-scoped search were unreliable during rebuild"
|
|
64
|
+
}
|
|
65
|
+
],
|
|
66
|
+
"open_questions": [
|
|
67
|
+
"Does support need a shorter handoff summary for weekend rotations?"
|
|
68
|
+
],
|
|
69
|
+
"next_action": "Open a PR that adds the canonical prompt and this receipt to docs/prompts/"
|
|
70
|
+
},
|
|
71
|
+
"privacy": {
|
|
72
|
+
"raw_prompt_logged": false,
|
|
73
|
+
"raw_chat_logged": false,
|
|
74
|
+
"customer_data_logged": false,
|
|
75
|
+
"secrets_logged": false,
|
|
76
|
+
"proprietary_paths_logged": false
|
|
77
|
+
}
|
|
78
|
+
}
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
## Minimal checklist
|
|
82
|
+
|
|
83
|
+
Before treating an artifact as recoverable, capture:
|
|
84
|
+
|
|
85
|
+
- stable id, human name, artifact kind, canonical path, version/date, owner label, status, and content hash;
|
|
86
|
+
- source workspace/tool/session label, with private URLs or IDs redacted when needed;
|
|
87
|
+
- exact phrases worth grepping, tags, and related artifacts;
|
|
88
|
+
- decisions accepted, options rejected with reasons, open questions, and next action;
|
|
89
|
+
- privacy flags proving raw chats, raw prompts, customer data, secrets, and private paths were not logged.
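The checklist can be approximated as a field-presence check over the receipt. This sketch takes its field names from the example receipt above; which fields count as required is an assumption, and the function name is hypothetical.

```javascript
// Required fields per section, taken from the example receipt (assumed to be the minimum).
const REQUIRED_FIELDS = {
  artifact: ["stable_id", "name", "kind", "canonical_path", "current_version", "content_hash", "status", "owner_label"],
  source: ["workspace", "source_tool", "source_session_id"],
  index: ["exact_phrases_worth_grepping", "tags", "related_artifacts"],
  decisions: ["accepted", "rejected", "open_questions", "next_action"],
};

const PRIVACY_FLAGS = [
  "raw_prompt_logged",
  "raw_chat_logged",
  "customer_data_logged",
  "secrets_logged",
  "proprietary_paths_logged",
];

function checkCanonicalReceipt(receipt) {
  const problems = [];

  for (const [section, fields] of Object.entries(REQUIRED_FIELDS)) {
    for (const field of fields) {
      const value = receipt[section]?.[field];
      if (value === undefined || value === null || value === "") {
        problems.push(`missing ${section}.${field}`);
      }
    }
  }

  // Every privacy flag must be present and explicitly false.
  for (const flag of PRIVACY_FLAGS) {
    if (receipt.privacy?.[flag] !== false) {
      problems.push(`privacy.${flag} must be false`);
    }
  }

  return problems;
}
```

Requiring the privacy flags to be explicitly `false`, rather than merely absent, means a receipt cannot pass by omitting the section.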
|
|
90
|
+
|
|
91
|
+
## `canonical_outputs.md` sketch
|
|
92
|
+
|
|
93
|
+
For small teams, a plain markdown index is enough:
|
|
94
|
+
|
|
95
|
+
```markdown
|
|
96
|
+
# Canonical outputs
|
|
97
|
+
|
|
98
|
+
| Stable ID | Current path | Version | Status | Exact phrase to grep | Next action |
|
|
99
|
+
| --- | --- | --- | --- | --- | --- |
|
|
100
|
+
| project-alpha-master-prompt-2026-05-30 | docs/prompts/project-alpha-master-prompt.md | 2026-05-30.1 | current | final prompt contract v3 | PR canonical copy |
|
|
101
|
+
```
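If receipts are stored as JSON, the index row can be generated instead of typed by hand. A small sketch follows, assuming the receipt fields shown earlier; `toIndexRow` and its `options` argument are hypothetical names introduced for this example.

```javascript
// Render one row of canonical_outputs.md from a canonical.output.receipt.v1 object.
function toIndexRow(receipt, options = {}) {
  const artifact = receipt.artifact;
  // Default to the first grep phrase and the receipt's own next action.
  const phrase = options.phrase ?? receipt.index.exact_phrases_worth_grepping[0] ?? "";
  const nextAction = options.nextAction ?? receipt.decisions.next_action;

  const cells = [
    artifact.stable_id,
    artifact.canonical_path,
    artifact.current_version,
    artifact.status,
    phrase,
    nextAction,
  ];

  // Escape pipes so a cell value cannot break the table layout.
  const escaped = cells.map((cell) => String(cell).replace(/\|/g, "\\|"));
  return `| ${escaped.join(" | ")} |`;
}
```

Calling it with `{ phrase: "final prompt contract v3", nextAction: "PR canonical copy" }` on the example receipt reproduces the row in the table above.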
|
|
102
|
+
|
|
103
|
+
Old chats should be evidence. The source of truth should be the artifact plus the receipt.
|
|
104
|
+
|
|
105
|
+
## What not to log
|
|
106
|
+
|
|
107
|
+
Do not include raw chat transcripts, full prompts that contain private context, customer data, secrets, credentials, exact private paths, proprietary document bodies, or unredacted project URLs. Prefer hashes, stable ids, coarse tags, short grep phrases, version dates, and decision states.
|
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
# Compaction resume receipts
|
|
2
|
+
|
|
3
|
+
Claude Code and Codex users are asking for `PreCompact` / `PostCompact` hooks because long sessions lose thread detail, reload rules inconsistently, or continue after a summary that no one can audit.
|
|
4
|
+
|
|
5
|
+
The risky moment is not compaction itself. The risky moment is **resuming as if nothing changed**.
|
|
6
|
+
|
|
7
|
+
A compaction resume receipt is a small, privacy-safe handoff object emitted after a compaction/restore flow. It proves what was summarized, what instruction sources were reloaded, what state was lost or kept, and whether the next agent turn is safe to continue.
|
|
8
|
+
|
|
9
|
+
## Receipt boundary
|
|
10
|
+
|
|
11
|
+
A resume receipt should prove:
|
|
12
|
+
|
|
13
|
+
- **event identity** — stable `compaction_event_id`, `session_id`, and trigger so a restore can be correlated with the hook that caused it;
|
|
14
|
+
- **transcript boundary** — the compacted transcript range and hash, without logging raw transcript content;
|
|
15
|
+
- **summary evidence** — summary hash and token count so downstream tools can detect stale or changed summaries;
|
|
16
|
+
- **instruction reloads** — `AGENTS.md`, `CLAUDE.md`, skills, MCP/tool manifests, workflow plans, or project rules reloaded with hashes/mtimes;
|
|
17
|
+
- **kept/lost fields** — explicit lists for active plan, open diffs, pending tests, rejected decisions, tool grants, or unresolved blockers;
|
|
18
|
+
- **resume verdict** — `safe_to_resume: true | false | unknown` plus reasons;
|
|
19
|
+
- **privacy flags** — no raw transcript, raw prompts, raw tool outputs, secrets, or full instruction bodies in the receipt.
|
|
20
|
+
|
|
21
|
+
## 60-second gate
|
|
22
|
+
|
|
23
|
+
The copyable example is in [`examples/compaction-resume-receipts/`](../examples/compaction-resume-receipts/):
|
|
24
|
+
|
|
25
|
+
```bash
|
|
26
|
+
node examples/compaction-resume-receipts/check-resume-receipt.mjs \
|
|
27
|
+
examples/compaction-resume-receipts/safe-resume-receipt.json
|
|
28
|
+
|
|
29
|
+
node examples/compaction-resume-receipts/check-resume-receipt.mjs \
|
|
30
|
+
examples/compaction-resume-receipts/unsafe-resume-receipt.json
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
The first passes because the restore has a stable event id, compacted transcript hash, summary hash, reloaded instruction sources, explicit kept/lost state, privacy flags, and `safe_to_resume: true` with no blocking lost fields.
|
|
34
|
+
|
|
35
|
+
The second fails because it tries to continue despite a missing `AGENTS.md` reload, hidden lost decisions, raw transcript logging, and an `unknown` verdict.
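The pass and fail conditions described here can be sketched as a single gate function. Only `compaction_event_id`, `session_id`, and `safe_to_resume` are named in this document; every other field name below (`transcript.hash`, `summary.hash`, `instruction_reloads`, `kept`, `lost`, `blocking`, `privacy.raw_transcript_logged`) is an assumption made for illustration, and this is not the shipped `check-resume-receipt.mjs`.

```javascript
// Hypothetical receipt layout; adjust the field names to the real fixture files.
function checkResumeReceipt(receipt) {
  const errors = [];

  if (!receipt.compaction_event_id || !receipt.session_id) {
    errors.push("missing event identity");
  }
  if (!receipt.transcript?.hash) errors.push("missing compacted transcript hash");
  if (!receipt.summary?.hash) errors.push("missing summary hash");

  // Every reloaded instruction source should carry a hash.
  const reloads = Array.isArray(receipt.instruction_reloads) ? receipt.instruction_reloads : [];
  if (reloads.length === 0 || reloads.some((source) => !source.hash)) {
    errors.push("instruction sources not reloaded with hashes");
  }

  // Kept and lost state must be explicit lists, and nothing blocking may be lost.
  if (!Array.isArray(receipt.kept) || !Array.isArray(receipt.lost)) {
    errors.push("kept/lost state is not explicit");
  } else if (receipt.lost.some((field) => field.blocking === true)) {
    errors.push("blocking field was lost");
  }

  if (receipt.privacy?.raw_transcript_logged !== false) {
    errors.push("raw transcript logging not ruled out");
  }

  // Only an explicit true verdict allows the next turn; false and "unknown" both fail.
  if (receipt.safe_to_resume !== true) {
    errors.push(`verdict is ${String(receipt.safe_to_resume)}`);
  }

  return errors;
}
```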
|
|
36
|
+
|
|
37
|
+
## Positioning
|
|
38
|
+
|
|
39
|
+
Memory systems remember. Hooks fire lifecycle events. This receipt answers the operational review question:
|
|
40
|
+
|
|
41
|
+
> After compaction, do we have enough verified context to continue safely?
|
|
42
|
+
|
|
43
|
+
That makes `PostCompact` / `SessionStart(compact)` restore flows auditable without turning Pluribus into a memory store or requiring private transcript capture.
|
|
@@ -0,0 +1,48 @@
|
|
|
1
|
+
# Controlled learning queue for AI employee-style agents
|
|
2
|
+
|
|
3
|
+
Claude Code, OpenClaw, Cursor, and MCP tools make it easy to turn a repository into a role-based worker: `CLAUDE.md` as the job description, Skills as procedures, and a `memory/` folder as durable knowledge.
|
|
4
|
+
|
|
5
|
+
That pattern compounds quickly, but it has a failure mode: the agent can overlearn from one weird lead, support ticket, or edge case and rewrite shared memory for every future run.
|
|
6
|
+
|
|
7
|
+
Use a controlled learning queue when an agent is allowed to **propose** durable memory changes but not silently promote them.
|
|
8
|
+
|
|
9
|
+
## Split the folders
|
|
10
|
+
|
|
11
|
+
```text
|
|
12
|
+
role/ # job contract: responsibilities, boundaries, escalation
|
|
13
|
+
skills/ # callable procedures with inputs, outputs, stop conditions
|
|
14
|
+
memory/durable.md # approved facts only; small enough to review
|
|
15
|
+
memory/working-notes.md # scratch observations; allowed to be messy/temporary
|
|
16
|
+
learning_queue.md # proposed durable changes awaiting promote/reject
|
|
17
|
+
leads/ # tiny job cards for active work
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
The key rule: `memory/durable.md` changes only through `learning_queue.md` proposals with source, reason, scope, expiry, and reviewer decision.
|
|
21
|
+
|
|
22
|
+
## Proposal shape
|
|
23
|
+
|
|
24
|
+
Each proposed learning should answer:
|
|
25
|
+
|
|
26
|
+
- **Source:** what run, lead, issue, or transcript produced the observation?
|
|
27
|
+
- **Observed:** what happened, without storing raw private text?
|
|
28
|
+
- **Proposed durable change:** the exact fact/rule to add, edit, or remove.
|
|
29
|
+
- **Reason:** why this should affect future runs, not just the current case.
|
|
30
|
+
- **Scope:** global, client-specific, project-specific, channel-specific, or temporary.
|
|
31
|
+
- **Expiry / review date:** when this fact should be rechecked.
|
|
32
|
+
- **Status:** proposed, promoted, rejected, or expired.
|
|
33
|
+
|
|
34
|
+
That is enough to preserve learning while keeping an agent from slowly corrupting the ideal customer profile (ICP), pricing assumptions, escalation rules, or compliance boundaries.
|
|
35
|
+
|
|
36
|
+
## Try the copyable example
|
|
37
|
+
|
|
38
|
+
See [`examples/controlled-learning-queue/`](../examples/controlled-learning-queue/) for a tiny AI sales/ops worker layout and a local checker:
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
node examples/controlled-learning-queue/check-learning-queue.mjs examples/controlled-learning-queue/learning_queue.md
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
The checker is intentionally small. It fails proposals that are missing source/reason/scope/expiry/status, that try to auto-promote without review, or that paste raw secrets/private payloads into the learning queue.
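The three failure conditions can be sketched against a single proposal that has already been parsed into an object. Parsing `learning_queue.md` is left out; the property names (`source`, `reason`, `scope`, `expiry`, `status`, `reviewer`) and the secret-detection pattern are assumptions for this example, not the behavior of the shipped `check-learning-queue.mjs`.

```javascript
const REQUIRED_FIELDS = ["source", "reason", "scope", "expiry", "status"];
const STATUSES = ["proposed", "promoted", "rejected", "expired"];
// Deliberately crude pattern for secret-looking text; a real checker would need more.
const SECRET_LIKE = /(api[_-]?key|password|BEGIN [A-Z ]*PRIVATE KEY|sk-[A-Za-z0-9]{16,})/i;

function checkProposal(proposal) {
  const errors = [];

  for (const field of REQUIRED_FIELDS) {
    if (!proposal[field]) errors.push(`missing ${field}`);
  }
  if (proposal.status && !STATUSES.includes(proposal.status)) {
    errors.push(`unknown status: ${proposal.status}`);
  }

  // Promotion requires a recorded reviewer decision (assumed `reviewer` field).
  if (proposal.status === "promoted" && !proposal.reviewer) {
    errors.push("promoted without reviewer decision");
  }

  // Scan every value for pasted secrets or private payloads.
  const text = Object.values(proposal).map(String).join("\n");
  if (SECRET_LIKE.test(text)) errors.push("secret-looking text in proposal");

  return errors;
}
```

Checking the `reviewer` field only on `promoted` entries keeps the queue cheap to write to: an agent can file a `proposed` entry freely, but nothing reaches `memory/durable.md` without a named decision.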
|
|
45
|
+
|
|
46
|
+
## How this fits Pluribus
|
|
47
|
+
|
|
48
|
+
Pluribus is not trying to be the agent's brain. This pattern keeps intentional context reviewable: durable memory is a small versioned source of truth, while working notes and proposed learnings remain visibly provisional until promoted.
|