npm - pluribus-context - Versions diffs - 0.3.34 → 0.3.35 - Mend

pluribus-context 0.3.34 → 0.3.35

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (35)

package/CHANGELOG.md CHANGED

@@ -4,6 +4,21 @@
 All notable changes to Pluribus are documented here.
+## 0.3.35 - 2026-05-31
+- Added canonical-output receipts for preserving the last clean version of an artifact as versioned evidence instead of treating old chats as source of truth.
+- Added a review primitive gate example that fails unsafe agent handoffs on unapproved scope changes, skipped/failed required checks, missing evidence, privacy leaks, or `partial` / `unsafe-to-resume` state.
+- Added a Claude Code hook bridge that runs the review primitive gate from lifecycle hook JSON, making handoff proof enforceable at `TaskCompleted` / `PostCompact` style boundaries.
+- Tracked fresh market signals around dynamic workflow cost accounting, post-compact recovery, active idea indexes, Skill policy alternatives, and control-plane buying thresholds.
+- Add Skill policy receipts guide and copyable Agent Skill recipe for hard-rule enforcement, target allow/refuse decisions, and post-write guards.
+- Add temporal context receipts guide and copyable current-state/spec example for long-lived projects where old docs still match search but are no longer implementation authority.
+- Add AI PR review receipts guide and copyable PR template for reviewing agent-generated changes by blast radius instead of diff size alone.
+- Add subagent role receipts guide and copyable `agents.toml` example for proving requested role, effective role, loaded instruction source, allowed/refused capabilities, stop point, and next safe action.
+- Add install-plan receipts guide and copyable example for setup scripts that configure MCP servers, Skills, instruction files, hooks, or plugins across multiple AI coding tools before writes begin.
+- Added an MCP tool visibility receipts checklist for debugging servers that launch and return `tools/list` directly but do not surface tools in the actual agent client catalog.
+- Tracked GBrain `unify-types` mutation-mode feedback as an external receipt signal for protected migrations that may default from dry-run into writes.
 ## 0.3.34 - 2026-05-26
 - Repositioned README and the community review packet around privacy-safe agent context receipts first, with instruction-file audit/sync as the supporting workflow, so directory reviewers do not mistake Pluribus for another generic ContextOps, memory, RAG, or rules-sync tool.

package/README.md CHANGED

@@ -161,7 +161,7 @@ npx --yes pluribus-context@latest sync --dry-run
 If the preview looks right, run `npx --yes pluribus-context@latest sync` to write the tool-specific files.
-For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. 
If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If you are reviewing Pluribus for a list, newsletter, or tool directory, use the [Community Review Packet](docs/community-review-packet.md) for directory submission fields, a one-line description, safety notes, and a disposable 60-second smoke test. Maintainers can track package/repo discovery with the [Discovery Smoke Checks](docs/discovery-smoke.md).
+For a fuller walkthrough, see the [Quickstart](docs/quickstart.md). To enforce generated context files in pull requests, use the [CI audit example](docs/ci-audit-example.md); to catch drift before commits leave your machine, use the [Pre-commit Audit Hook](docs/pre-commit-audit.md). If your repo already has `CLAUDE.md`, `.cursorrules`, Copilot instructions, or `AGENTS.md`, run a [Context Drift Audit](docs/context-drift-audit.md) first, try the intentionally drifted [audit example](examples/context-drift-audit/), then follow [Migrate Existing AI Context Files](docs/migrate-existing-context.md). If you switch between Cursor, Claude Code, Copilot, and terminal agents, try the [Cursor ↔ Claude Code context handoff guide](docs/cursor-claude-context-handoff.md) and its [example source file](examples/context-handoff/pluribus.md). If you run multiple AI sessions on the same project, try the [Coordination Contract guide](docs/coordination-contract.md) and its [example source file](examples/coordination-contract/pluribus.md) to keep event-log/scratchpad protocol rules aligned without turning Pluribus into an orchestrator. If you evaluate code-search, MCP retrieval, RAG-over-notes, or agent memory tools, use the [Orchestration-layer Search Receipts](docs/orchestration-search-receipts.md) sketch to measure retrieved context from the harness layer without asking retrieval tools to inspect whole transcripts. If you are adding agent observability, traces, or OpenTelemetry-style events, start with [Context Receipts for Agent Observability](docs/context-receipts-for-agent-observability.md), then use the [Context Input Evidence](docs/context-input-evidence.md) sketch and its [executable demos](examples/context-input-evidence/) to separate source bytes, canonical text, delivered hashes, post-hoc session-log receipts, skill/plugin invocation receipts, shared-memory retrieval receipts, self-remediating brain/doctor receipts, and OpenTelemetry-style SpanEvents. 
If you publish AI rules, skills, or instruction bundles as "portable", use the [Portability Fidelity Report](docs/portability-fidelity-report.md) and its [example source file](examples/portability-fidelity/pluribus.md) to make compatibility claims evidence-based instead of self-attested. Before committing shared or generated AI instructions, use the [Context File Review Checklist](docs/context-file-review.md). If you're deciding between Pluribus and a one-way rules converter, see [When to use Pluribus](docs/when-to-use-pluribus.md). If you are debugging "context drift" after compaction or long sessions, start with the [Context Drift Taxonomy](docs/context-drift-taxonomy.md) to separate file drift from runtime precedence drift. If you use MCP memory or knowledge-graph tools, try the [MCP memory handoff demo](docs/memory-mcp-handoff.md) to keep recall/store protocols aligned across AI coding tools without turning Pluribus into a memory server. If an MCP server is healthy but tools are missing in Claude Code/Cursor/Codex, use the [MCP tool visibility receipts](docs/mcp-tool-visibility-receipts.md) checklist to separate launch, handshake, `tools/list`, client catalog, and first invocation failures. If a Claude Code/OpenClaw-style Skill states a hard rule but the run still violates it, use the [Skill policy receipts](docs/skill-policy-receipts.md) guide and [copyable Skill recipe](examples/agent-skills/skill-policy-receipts/) to turn target decisions, refusals, and post-write guards into privacy-safe evidence. If long-lived projects keep old specs/TODOs that still match grep but are no longer authoritative, use [Temporal context receipts](docs/temporal-context-receipts.md) and the [copyable current-state example](examples/temporal-context-receipts/) to separate current authority from historical citations before an agent writes code. 
If AI-generated pull requests are hard to review because diff size hides operational risk, use [AI PR review receipts](docs/ai-pr-review-receipts.md) and the [copyable PR template](examples/ai-pr-review-receipts/) to review by blast radius: schema/data contracts, async paths, rollout gates, side effects, and ambiguous boundaries. If you delegate work to Codex/Claude Code/Cursor/OpenClaw-style specialist subagents, use [Subagent role receipts](docs/subagent-role-receipts.md) and the [example role definitions](examples/subagent-role-receipts/) to prove the requested role, effective role, loaded instruction source, allowed/refused capabilities, stop point, and next safe action. If you run Claude Code-style dynamic workflows, ultracode, or local LLM gateway orchestration that spawns many agents, use [Dynamic workflow run receipts](docs/dynamic-workflow-run-receipts.md) and the [copyable workflow example](examples/dynamic-workflow-run-receipts/) to prove phases, per-agent roles/models, context loaded/skipped, tool grants, token spend buckets, per-agent fuses, heartbeat, stop reasons, and known gaps. If you need CI/reviewers to decide whether an agent handoff can continue, must be reviewed, or should be rejected, use the [Review primitive gate](docs/review-primitive-gate.md), its [copyable gate example](examples/review-primitive-gate/), and the [Claude Code review hook bridge](examples/claude-code-review-hook/) to validate assignment boundaries, approved scope/access changes, required checks, privacy flags, and `complete / partial / unsafe-to-resume` state from CI or Claude Code `TaskCompleted` / `PostCompact` hooks. If Claude Projects, long chats, or compaction make the last clean artifact hard to recover, use [Canonical output receipts](docs/canonical-output-receipts.md) and the [copyable index example](examples/canonical-output-receipts/) to track stable IDs, paths, versions, exact grep phrases, decisions, rejected options, and next actions. 
If a setup script installs MCP servers, Skills, instruction files, hooks, or plugins across multiple agents, use [Install-plan receipts](docs/install-plan-receipts.md) and the [copyable example](examples/install-plan-receipts/) to prove planned writes, backups, network behavior, and `writes_started=false` before mutation. If you are reviewing Pluribus for a list, newsletter, or tool directory, use the [Community Review Packet](docs/community-review-packet.md) for directory submission fields, a one-line description, safety notes, and a disposable 60-second smoke test. Maintainers can track package/repo discovery with the [Discovery Smoke Checks](docs/discovery-smoke.md).
 ### Usage
@@ -407,6 +407,7 @@ If you've felt this pain, tell me about your setup. What tools do you use? How d
 - [OpenClaw Integration](docs/openclaw-integration.md) — how Pluribus generates `AGENTS.md` for OpenClaw
 - [Composable Contexts](docs/composable-contexts.md) — local/remote imports, merge behavior, and safety rules
 - [MCP Memory Handoff](docs/memory-mcp-handoff.md) — demo for keeping memory recall/store protocols aligned across tool-specific instruction files
+- [MCP Tool Visibility Receipts](docs/mcp-tool-visibility-receipts.md) — checklist for debugging healthy MCP servers whose tools do not appear in the agent client catalog
 - [Remote Composable Context Imports](docs/remote-composable-context-imports.md) — design notes for lockfile/cache/auth hardening
 - [Context Format Spec](spec/context-format.md) — the `pluribus.md` format reference
 - [Skills Format Spec](spec/skills-format.md) — how adapters work and how to write custom skills

package/docs/ai-pr-review-receipts.md ADDED

@@ -0,0 +1,153 @@
+# AI PR review receipts
+AI-generated PRs are not risky because they are large or small. They are risky when the reviewer cannot tell which operational boundaries the agent touched.
+Use this recipe when Claude Code, Cursor, Codex, Copilot agents, OpenClaw, or another coding agent opens a PR and the team needs a compact review artifact before merge.
+The goal is not to log prompts, transcripts, source code, stack traces, secrets, customer data, or raw tool output. The goal is a privacy-safe receipt that proves the review unit: blast radius.
+## When this helps
+Use an AI PR review receipt when a change may affect:
+- database schema, migrations, backfills, or persisted data contracts;
+- readers/writers that run while a migration or rollout is in progress;
+- async jobs, queues, cron tasks, webhooks, retries, or background workers;
+- feature flags, rollout gates, kill switches, or compatibility shims;
+- external side effects such as payments, email, auth, billing, search indexes, analytics, or third-party APIs;
+- generated files, public APIs, plugin manifests, MCP/Skill/hook configuration, or security-sensitive project config.
+If none of these apply, the receipt can say so. That negative claim is still useful because it tells the human reviewer what the agent believes it did **not** touch.
+## Receipt shape
+Attach this as a PR body section, `REVIEW.md` note, check-run summary, or bot comment.
+```json
+{
+  "type": "review.blast_radius.v1",
+  "pr": {
+    "source": "agent-pr",
+    "review_requested": true,
+    "human_review_required": true
+  },
+  "boundaries": [
+    {
+      "name": "schema_or_data_contract",
+      "status": "touched",
+      "evidence": "migration file added; live reader compatibility checked",
+      "risk_tier": "high",
+      "review_owner": "backend"
+    },
+    {
+      "name": "async_or_background_path",
+      "status": "not_touched",
+      "evidence": "no queue/cron/webhook paths changed in diff summary",
+      "risk_tier": "low"
+    },
+    {
+      "name": "rollout_gate",
+      "status": "present",
+      "evidence": "feature flag path exists before new behavior is enabled",
+      "risk_tier": "medium"
+    },
+    {
+      "name": "external_side_effect",
+      "status": "ambiguous",
+      "evidence": "email sender import changed; no dry-run evidence found",
+      "risk_tier": "high",
+      "blocked_until": "reviewer confirms side-effect behavior"
+    }
+  ],
+  "tests_and_checks": [
+    {
+      "name": "unit_or_integration_tests",
+      "status": "passed",
+      "scope": "changed package only"
+    },
+    {
+      "name": "migration_or_rollback_check",
+      "status": "missing",
+      "blocks_merge": true
+    }
+  ],
+  "decision": {
+    "merge_ready": false,
+    "reason": "external side effect and rollback evidence are ambiguous",
+    "next_safe_action": "ask backend owner to review email behavior and migration rollback before merge"
+  },
+  "privacy": {
+    "raw_prompt_logged": false,
+    "raw_source_logged": false,
+    "raw_tool_output_logged": false,
+    "secrets_logged": false,
+    "customer_data_logged": false
+  }
+}
+```
+## Minimal PR template
+Copy this into `.github/pull_request_template.md` or a review-bot comment.
+```markdown
+## AI PR review receipt
+This PR was prepared or modified by an AI coding agent. Review by blast radius, not by diff size alone.
+### Boundary receipt
+| Boundary | Status | Evidence | Risk tier | Owner / blocker |
+| --- | --- | --- | --- | --- |
+| Schema / persisted data contract | `touched / not_touched / ambiguous` |  |  |  |
+| Live reader/writer compatibility | `checked / missing / n/a` |  |  |  |
+| Async jobs / queues / cron / webhooks | `touched / not_touched / ambiguous` |  |  |  |
+| Rollout gate / feature flag / kill switch | `present / missing / n/a` |  |  |  |
+| External side effects | `declared / not_touched / ambiguous` |  |  |  |
+| Generated files / public API / plugin config | `touched / not_touched / ambiguous` |  |  |  |
+### Checks
+- [ ] Tests relevant to touched boundaries passed.
+- [ ] Migration/backfill/rollback behavior is explicit, or not applicable.
+- [ ] External side effects are declared, or not touched.
+- [ ] Any `ambiguous` boundary has an owner before merge.
+### Privacy
+This receipt does not include raw prompts, transcripts, source code, secrets, customer data, stack traces, or raw tool output.
+### Decision
+`merge_ready: yes/no`
+`next_safe_action:`
+```
+## How to use with Pluribus
+Pluribus does not need to own your PR workflow. Use it as the neutral language for evidence that crossed an agent boundary:
+- `review_boundary_schema_data`
+- `live_reader_writer_compatibility`
+- `review_boundary_async_path`
+- `rollout_gate_present`
+- `external_side_effects_declared`
+- `not_touched_boundary_claim`
+- `ambiguous_boundary_blocks_merge`
+- `risk_tier_evidence`
+- `next_safe_action`
+The same terms can appear in a GitHub PR template, a Claude Code `/code-review` note, an OpenClaw task receipt, a CI check summary, or a release checklist.
+## Bad receipts
+Avoid receipts that say only:
+- “tests passed”;
+- “Claude reviewed it”;
+- “small PR”;
+- “no issues found”;
+- “looks safe.”
+Those are conclusions. A useful receipt names the boundary, the evidence, the risk tier, and the next safe action when something is ambiguous.
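
Reviewer sketch (not part of the package): the rule that an `ambiguous` boundary or a missing blocking check should stop a merge could be checked mechanically. The TypeScript below is a minimal illustration, assuming the field names from the example receipt in this file. The type names, `evaluateBlastRadius`, and the exact blocking rule (ambiguous boundary without a `review_owner`, or a `blocks_merge` check that has not passed) are hypothetical and are not an API that Pluribus ships.

```typescript
// Hypothetical types; field names mirror the example receipt in this doc.
type Boundary = {
  name: string;
  status: string;
  evidence?: string;
  risk_tier?: string;
  review_owner?: string;
  blocked_until?: string;
};

type Check = { name: string; status: string; blocks_merge?: boolean };

type BlastRadiusReceipt = {
  type: string;
  boundaries: Boundary[];
  tests_and_checks: Check[];
};

type GateResult = { mergeReady: boolean; blockers: string[] };

// Collect every reason the receipt should block a merge.
// Assumed rule: an ambiguous boundary needs an owner, and any check
// marked blocks_merge must have status "passed".
function evaluateBlastRadius(receipt: BlastRadiusReceipt): GateResult {
  const blockers: string[] = [];
  for (const boundary of receipt.boundaries) {
    if (boundary.status === "ambiguous" && !boundary.review_owner) {
      blockers.push(`boundary ${boundary.name} is ambiguous and has no owner`);
    }
  }
  for (const check of receipt.tests_and_checks) {
    if (check.blocks_merge === true && check.status !== "passed") {
      blockers.push(`check ${check.name} is ${check.status}`);
    }
  }
  return { mergeReady: blockers.length === 0, blockers };
}
```

Applied to the example receipt in this file, a gate like this would report two blockers (the ambiguous `external_side_effect` boundary and the missing `migration_or_rollback_check`), which agrees with the receipt's own `merge_ready: false` decision.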

package/docs/canonical-output-receipts.md ADDED

@@ -0,0 +1,107 @@
+# Canonical output receipts
+Claude Projects, long Claude Code sessions, and other agent workspaces are useful archives, but search is a weak source of truth. Exact phrases can be hard to recover, project-scoped search may miss the last clean version, and a later chat can overwrite the user's memory of which artifact is authoritative.
+Use a canonical output receipt when a session produces something that should be found and reused later: a master prompt, escalation memo, architecture decision, migration plan, test matrix, runbook, product brief, or reviewed context file.
+The receipt is not persistent memory and not a transcript dump. It is a small index card for the last clean artifact: stable id/path, version, exact grep phrases, decisions, rejected alternatives, open questions, and next action — without logging raw private content, secrets, customer data, or full chat history.
+## When this helps
+Use this receipt when:
+- a Claude Project or long chat produces a canonical artifact that must survive fuzzy chat search;
+- several sessions produce competing versions and a reviewer needs the current one;
+- the source chat may be compacted, archived, deleted, exported, or hard to search;
+- a team needs to know which artifact should be copied into repo docs, `pluribus.md`, `CLAUDE.md`, `AGENTS.md`, a prompt library, or a ticket;
+- exact phrases, dates, decisions, and rejected options matter more than the full conversation.
+## Receipt shape
+Attach this to the artifact, repo issue, PR body, project notes, or a `canonical_outputs.md` index.
+```json
+{
+  "type": "canonical.output.receipt.v1",
+  "artifact": {
+    "stable_id": "project-alpha-master-prompt-2026-05-30",
+    "name": "Project Alpha master prompt",
+    "kind": "master_prompt",
+    "canonical_path": "docs/prompts/project-alpha-master-prompt.md",
+    "current_version": "2026-05-30.1",
+    "content_hash": "sha256:example-only",
+    "status": "current",
+    "owner_label": "product-ops",
+    "created_at": "2026-05-30T21:40:00Z",
+    "last_reviewed_at": "2026-05-30T21:58:00Z"
+  },
+  "source": {
+    "workspace": "claude-project-alpha",
+    "source_session_id": "session-redacted-2026-05-30",
+    "source_tool": "claude-projects",
+    "source_chat_title": "Master prompt rebuild",
+    "source_url_or_path_redacted": true,
+    "raw_transcript_logged": false
+  },
+  "index": {
+    "exact_phrases_worth_grepping": [
+      "do not collapse escalation paths into summaries",
+      "billing exports are evidence, not source of truth",
+      "final prompt contract v3"
+    ],
+    "tags": ["master-prompt", "billing", "escalation", "current-state"],
+    "related_artifacts": ["billing-escalation-runbook-2026-05-28"]
+  },
+  "decisions": {
+    "accepted": [
+      "Use repo-owned markdown as the canonical copy, not old chats",
+      "Keep escalation criteria in the prompt body and test cases in a separate appendix"
+    ],
+    "rejected": [
+      {
+        "option": "Rely on Claude Project conversation search for recovery",
+        "reason": "exact phrase and project-scoped search were unreliable during rebuild"
+      }
+    ],
+    "open_questions": [
+      "Does support need a shorter handoff summary for weekend rotations?"
+    ],
+    "next_action": "Open a PR that adds the canonical prompt and this receipt to docs/prompts/"
+  },
+  "privacy": {
+    "raw_prompt_logged": false,
+    "raw_chat_logged": false,
+    "customer_data_logged": false,
+    "secrets_logged": false,
+    "proprietary_paths_logged": false
+  }
+}
+```
+## Minimal checklist
+Before treating an artifact as recoverable, capture:
+- stable id, human name, artifact kind, canonical path, version/date, owner label, status, and content hash;
+- source workspace/tool/session label, with private URLs or IDs redacted when needed;
+- exact phrases worth grepping, tags, and related artifacts;
+- decisions accepted, options rejected with reasons, open questions, and next action;
+- privacy flags proving raw chats, raw prompts, customer data, secrets, and private paths were not logged.
+## `canonical_outputs.md` sketch
+For small teams, a plain markdown index is enough:
+```markdown
+# Canonical outputs
+| Stable ID | Current path | Version | Status | Exact phrase to grep | Next action |
+| --- | --- | --- | --- | --- | --- |
+| project-alpha-master-prompt-2026-05-30 | docs/prompts/project-alpha-master-prompt.md | 2026-05-30.1 | current | final prompt contract v3 | PR canonical copy |
+```
+Old chats should be evidence. The source of truth should be the artifact plus the receipt.
+## What not to log
+Do not include raw chat transcripts, full prompts that contain private context, customer data, secrets, credentials, exact private paths, proprietary document bodies, or unredacted project URLs. Prefer hashes, stable ids, coarse tags, short grep phrases, version dates, and decision states.
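
Reviewer sketch (not part of the package): the `canonical_outputs.md` index row could be derived from the JSON receipt instead of being typed by hand. The TypeScript below is a minimal illustration, assuming the receipt fields shown in this file. `toIndexRow`, `cell`, and the choice of the first listed grep phrase are hypothetical; the doc's own sample row picks a different phrase and a shortened next action.

```typescript
// Hypothetical subset of the receipt; only the fields the index row needs.
type CanonicalReceipt = {
  artifact: {
    stable_id: string;
    canonical_path: string;
    current_version: string;
    status: string;
  };
  index: { exact_phrases_worth_grepping: string[] };
  decisions: { next_action: string };
};

// Escape pipes so a value cannot break the markdown table row.
function cell(value: string): string {
  return value.replace(/\|/g, "\\|").trim();
}

// Build one row in the column order used by the sketch table:
// Stable ID | Current path | Version | Status | Exact phrase to grep | Next action
function toIndexRow(receipt: CanonicalReceipt): string {
  const { artifact, index, decisions } = receipt;
  // Assumption: the first listed phrase is the preferred grep handle.
  const phrase = index.exact_phrases_worth_grepping[0] ?? "";
  const cells = [
    artifact.stable_id,
    artifact.canonical_path,
    artifact.current_version,
    artifact.status,
    phrase,
    decisions.next_action,
  ];
  return `| ${cells.map(cell).join(" | ")} |`;
}
```

Generating the row from the receipt keeps the index and the receipt from drifting apart, at the cost of needing a convention for which grep phrase to surface.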

package/docs/dynamic-workflow-run-receipts.md ADDED

@@ -0,0 +1,158 @@
+# Dynamic workflow run receipts
+Claude Code-style dynamic workflows move orchestration into a script that can spawn many subagents, keep intermediate results outside the parent conversation, and show progress by phase, agent count, token total, and elapsed time.
+That is useful when a codebase audit, migration, research task, or verification pass needs more parallelism than one conversation can coordinate. It also creates a new failure mode: one child agent can loop, burn tokens, or drift while the parent workflow only shows a high-level progress line.
+Use a dynamic workflow run receipt when a workflow, ultracode run, local LLM gateway, or multi-agent script delegates work across several agents/models and a human needs a privacy-safe summary of what actually happened.
+The first thing to look for is a **per-agent fuse**: budget, heartbeat, partial progress, stop reason, and kill-switch state for every spawned agent. After that, inspect whether the expensive path bought better verification or just more context drift.
+This is not an orchestration framework. The receipt is the stable artifact: compact evidence for each phase and spawned agent without logging raw prompts, source code, transcripts, tool output, secrets, customer data, or proprietary file paths.
+## When this helps
+Use this receipt when:
+- a workflow spawns several agents to audit, migrate, research, or verify a codebase;
+- agents may run different roles, models, or local/remote providers;
+- the run has a token/cost budget that needs to be explained after the fact;
+- a child agent could loop, stall, or keep spending while the parent workflow stays mostly blind;
+- the parent session sees only the final report, not every intermediate result;
+- a reviewer needs to know what context was loaded, skipped, or suppressed for each agent;
+- the run stops, pauses, resumes, or rejects a result and the stop point matters.
+## Receipt shape
+Attach this to a workflow report, PR body, task handoff, run summary, or CI artifact.
+```json
+{
+  "type": "dynamic.workflow.run_receipt.v1",
+  "workflow": {
+    "workflow_id": "wf_checkout_auth_audit_2026_05_30",
+    "runner": "claude-code-dynamic-workflow",
+    "script_source": "generated-then-reviewed-command",
+    "script_hash": "sha256:example-only",
+    "task_kind": "codebase_auth_audit",
+    "plan_approved_before_run": true,
+    "resumable": true,
+    "max_wall_clock_bucket": "under_15m",
+    "kill_switch_available": true,
+    "started_at": "2026-05-30T15:20:00Z",
+    "completed_at": "2026-05-30T15:31:42Z"
+  },
+  "permissions": {
+    "tool_allowlist_inherited": true,
+    "writes_allowed": false,
+    "network_allowed": false,
+    "external_commands_allowed": ["grep", "test --dry-run"],
+    "permission_profile": "review-only"
+  },
+  "phases": [
+    {
+      "phase_id": "route-inventory",
+      "purpose": "find candidate auth-sensitive routes",
+      "agent_count": 3,
+      "token_spend_bucket": "under_50k",
+      "elapsed_ms_bucket": "under_2m",
+      "result": "completed"
+    },
+    {
+      "phase_id": "adversarial-review",
+      "purpose": "cross-check candidate misses",
+      "agent_count": 2,
+      "token_spend_bucket": "under_25k",
+      "elapsed_ms_bucket": "under_2m",
+      "result": "completed_with_gaps"
+    }
+  ],
+  "agents": [
+    {
+      "agent_id": "agent-route-auditor-1",
+      "phase_id": "route-inventory",
+      "role": "route-auth-auditor",
+      "model": "claude-sonnet",
+      "provider": "anthropic",
+      "context_loaded": ["repo-policy", "auth-boundary-rules", "route-index-summary"],
+      "context_skipped_or_suppressed": [
+        {
+          "source": "customer-fixture-dump",
+          "reason": "contains raw customer data; summary hash only"
+        }
+      ],
+      "tools_granted": ["read", "grep"],
+      "tools_used": ["grep"],
+      "feature_areas_checked": ["checkout routes", "admin routes"],
+      "token_budget_bucket": "under_25k",
+      "token_spend_bucket": "under_10k",
+      "max_iterations": 8,
+      "iterations_used": 3,
+      "heartbeat_seen_at": "2026-05-30T15:25:00Z",
+      "partial_progress_reported": true,
+      "fuse_triggered": false,
+      "stop_reason": "completed_assigned_partition",
+      "confidence": "medium",
+      "known_gaps": ["did not execute integration tests"],
+      "raw_prompt_logged": false,
+      "raw_tool_output_logged": false,
+      "raw_paths_logged": false
+    },
+    {
+      "agent_id": "agent-reviewer-1",
+      "phase_id": "adversarial-review",
+      "role": "adversarial-auth-reviewer",
+      "model": "local-codex-compatible",
+      "provider": "local-llm-gateway",
+      "context_loaded": ["candidate-findings-summary", "public-api-contract-summary"],
+      "context_skipped_or_suppressed": [],
+      "tools_granted": ["read"],
+      "tools_used": ["read"],
+      "feature_areas_checked": ["route findings cross-check"],
+      "token_budget_bucket": "under_10k",
+      "token_spend_bucket": "under_10k",
+      "max_iterations": 5,
+      "iterations_used": 5,
+      "heartbeat_seen_at": "2026-05-30T15:30:00Z",
+      "partial_progress_reported": true,
+      "fuse_triggered": true,
+      "stop_reason": "iteration_budget_reached_before_claim_verified",
+      "confidence": "low",
+      "known_gaps": ["one route requires owner confirmation before merge"],
+      "raw_prompt_logged": false,
+      "raw_tool_output_logged": false,
+      "raw_paths_logged": false
+    }
+  ],
+  "handoff": {
+    "final_result_kind": "workflow_review_receipt",
+    "claims_rejected_or_deferred": 1,
+    "next_safe_action": "ask route owner to confirm checkout callback auth before writing fix",
+    "where_it_stopped": "ambiguous auth boundary before mutation"
+  },
+  "privacy": {
+    "raw_prompts_logged": false,
+    "raw_source_logged": false,
+    "raw_tool_output_logged": false,
+    "transcripts_logged": false,
+    "secrets_logged": false,
+    "customer_data_logged": false
+  }
+}
+```
+## Minimal checklist
+Before trusting the result of a dynamic workflow, ask for:
+- workflow/run id, runner, script source, script hash, and whether the plan was approved before execution;
+- workflow-level wall-clock budget, whether a kill switch exists, and whether the run can be paused/resumed safely;
+- permission profile, inherited tool allowlist, write/network/command capability, and whether the run was review-only or mutating;
+- phases, agent counts, token spend buckets, elapsed-time buckets, and phase result states;
+- per-agent role, model/provider actually used, context loaded, context skipped/suppressed, tools granted/used, token budget/spend, iteration budget, heartbeat, partial progress, fuse state, stop reason, confidence, and known gaps;
+- explicit privacy flags proving raw prompts, source, transcripts, tool output, paths, secrets, and customer data were not logged;
+- a handoff that says what was accepted, rejected/deferred, where the workflow stopped, and the next safe action.
+## What not to log
+Do not include raw prompts, full workflow scripts when they reveal private structure, full transcripts, source code, exact proprietary paths, tool output, secrets, credentials, customer data, stack traces, or raw LLM gateway logs. Prefer coarse names, hashes, buckets, counts, role labels, decision states, stop reasons, and owner labels.
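
Reviewer sketch (not part of the package): the per-agent fuse fields in the receipt lend themselves to a quick scan for agents whose results need a second look. The TypeScript below is a minimal illustration, assuming the `agents` entries carry the fields shown in the example receipt. `findAgentsNeedingReview` and its two rules (fuse triggered, or iteration budget fully used) are hypothetical, not behavior defined by Pluribus.

```typescript
// Hypothetical subset of one entry in the receipt's "agents" array.
type AgentEntry = {
  agent_id: string;
  max_iterations: number;
  iterations_used: number;
  fuse_triggered: boolean;
  stop_reason: string;
};

type FuseFinding = { agent_id: string; reasons: string[] };

// Flag agents whose fuse fired or whose iteration budget was fully spent.
function findAgentsNeedingReview(agents: AgentEntry[]): FuseFinding[] {
  const findings: FuseFinding[] = [];
  for (const agent of agents) {
    const reasons: string[] = [];
    if (agent.fuse_triggered) {
      reasons.push(`fuse triggered: ${agent.stop_reason}`);
    }
    if (agent.iterations_used >= agent.max_iterations) {
      reasons.push("iteration budget exhausted");
    }
    if (reasons.length > 0) {
      findings.push({ agent_id: agent.agent_id, reasons });
    }
  }
  return findings;
}
```

For the two agents in the example receipt, a scan like this would flag only `agent-reviewer-1`, which used 5 of 5 iterations and stopped with `iteration_budget_reached_before_claim_verified`.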

package/docs/install-plan-receipts.md ADDED

@@ -0,0 +1,77 @@
+# Install-plan receipts
+Use this when an MCP server, Skill bundle, plugin, starter kit, or setup script says it can configure many AI coding tools for you.
+The risk is not only whether a hook later runs safely. The earlier boundary is the installer itself: it may detect agents, write MCP config, add instruction files, install Skills, register hooks, or create backups before the user understands what changed.
+The goal is a tiny, privacy-safe pre-mutation receipt that proves what the setup step intends to touch **before the first write starts**. Do not log prompts, source code, secrets, raw environment dumps, transcripts, raw command output, customer data, or private absolute paths.
+## Boundary to prove
+For every setup/install run, capture a plan like this before applying changes:
+```json
+{
+  "receipt_type": "agent.install.plan.v1",
+  "run_id": "local-install-2026-05-29T16:00Z",
+  "installer": "code-memory-mcp",
+  "mode_requested": "plan",
+  "mode_effective": "plan",
+  "agents_detected": ["claude-code", "cursor", "codex", "openclaw"],
+  "agents_selected": ["claude-code", "openclaw"],
+  "planned_writes": [
+    {
+      "kind": "mcp_config",
+      "target": "claude-code project config",
+      "operation": "add_server",
+      "backup_planned": true
+    },
+    {
+      "kind": "instruction_file",
+      "target": "AGENTS.md",
+      "operation": "append_usage_notes",
+      "backup_planned": true
+    },
+    {
+      "kind": "hook",
+      "target": "pre-tool hook config",
+      "operation": "register_command",
+      "backup_planned": true
+    }
+  ],
+  "external_commands_planned": [
+    { "phase": "apply", "command_class": "package_manager_install" }
+  ],
+  "network_after_install": "mcp_server_localhost_only",
+  "writes_started": false,
+  "next_safe_command": "installer apply --from-plan install-plan.json"
+}
+```
+Keep `target` values coarse enough for review. Prefer `claude-code project config` over a full local path, and `package_manager_install` over raw shell output.
+## Acceptance checks
+A safe installer should make these claims inspectable:
+1. **Plan mode exists** — `install --plan`, `install --dry-run`, or equivalent emits the receipt without writing files.
+2. **Effective mode is explicit** — if the user requested `apply` but policy downgraded to `plan`, the receipt says so.
+3. **Agent detection is separated from selection** — finding Cursor/Codex/Claude/OpenClaw does not imply every detected tool will be changed.
+4. **Every planned write has a kind and backup decision** — config, instruction file, Skill, hook, shell profile, lockfile, cache, or generated artifact.
+5. **Writes are still false at receipt time** — `writes_started=false` is the key trust boundary.
+6. **Apply can be repeated from the plan** — the user can review one artifact, then run a concrete next command.
+7. **No private payloads leak** — no raw source, prompts, env dumps, secrets, token values, transcripts, stack traces, or raw tool output.
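Several of the acceptance checks above can be approximated in code. The following is a rough Python sketch, assuming the field names from the sample `agent.install.plan.v1` receipt. The function `check_install_plan` and the `PRIVATE_PATH_HINTS` heuristic are hypothetical and only illustrate the idea; a real reviewer would want stricter leak detection.

```python
# Crude substrings that suggest a private absolute path (illustrative heuristic only).
PRIVATE_PATH_HINTS = ("/home/", "/Users/", "C:\\")


def check_install_plan(receipt: dict) -> list:
    """Return human-readable failures; an empty list means the plan passes."""
    failures = []
    # Effective mode is explicit: both requested and effective mode are stated.
    for key in ("mode_requested", "mode_effective"):
        if not receipt.get(key):
            failures.append(f"missing {key}")
    # The key trust boundary: no writes have started at receipt time.
    if receipt.get("writes_started") is not False:
        failures.append("writes_started must be false")
    # Detection and selection are reported as separate lists.
    for key in ("agents_detected", "agents_selected"):
        if not isinstance(receipt.get(key), list):
            failures.append(f"missing {key}")
    # Every planned write has a kind, a backup decision, and a coarse target.
    for i, write in enumerate(receipt.get("planned_writes", [])):
        if not write.get("kind"):
            failures.append(f"planned_writes[{i}] has no kind")
        if not isinstance(write.get("backup_planned"), bool):
            failures.append(f"planned_writes[{i}] has no backup decision")
        target = str(write.get("target", ""))
        if any(hint in target for hint in PRIVATE_PATH_HINTS):
            failures.append(f"planned_writes[{i}] target looks like a private path")
    # Apply can be repeated from the plan: a concrete next command is given.
    if not receipt.get("next_safe_command"):
        failures.append("missing next_safe_command")
    return failures


receipt = {
    "mode_requested": "plan",
    "mode_effective": "plan",
    "agents_detected": ["claude-code", "cursor"],
    "agents_selected": ["claude-code"],
    "planned_writes": [
        {"kind": "instruction_file", "target": "AGENTS.md", "backup_planned": True}
    ],
    "writes_started": False,
    "next_safe_command": "installer apply --from-plan install-plan.json",
}
print(check_install_plan(receipt))
```

Returning a list of failures rather than a single boolean keeps the result reviewable: the user sees every claim the installer failed to make inspectable, not just the first.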
+## Why this matters for hooks and MCP
+Hooks, Skills, and MCP configs are often discussed as runtime supply-chain surfaces. That is true, but the runtime risk is downstream of installation: a one-command installer can create the hook or MCP entry before any of them ever runs.
+A hook receipt answers: “what executed?”
+An install-plan receipt answers the earlier question: **“what is about to be installed, written, and trusted?”**
+If an installer cannot answer that before mutation, treat it like running CI from an untrusted fork: useful, but not automatically safe.
+## Try the copyable example
+See [`examples/install-plan-receipts/`](../examples/install-plan-receipts/) for a small review checklist and sample receipt you can copy into setup scripts, README install sections, or agent-managed onboarding workflows.

package/docs/mcp-tool-visibility-receipts.md ADDED

@@ -0,0 +1,67 @@
+# MCP tool visibility receipts
+MCP memory, Git, GitLab, code-search, and knowledge-graph servers can be healthy while the agent still cannot see their tools.
+A useful debug artifact should prove each boundary separately:
+1. **Server launched** — the configured command starts without leaking env/secrets.
+2. **Handshake completed** — client and server agreed on a protocol version and capabilities.
+3. **Proxy catalog returned** — a direct `tools/list` call returns the expected tool count and names.
+4. **Client catalog visible** — the actual agent UI/runtime exposes the same tools under the expected names.
+5. **Invocation allowed or refused** — the first tool call either runs, or returns an explicit permission/config/schema reason.
+`server healthy` is not enough. `tools/list` is not enough. The receipt needs to say where the chain stopped.
+## 60-second probe for any stdio MCP server
+Replace `your-mcp-server-command` (the command after the pipe) with the server command you already configured in Claude Code, Cursor, Codex, OpenClaw, or another MCP client.
+```bash
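+(
+  printf '%s\n' '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"receipt-probe","version":"0.1.0"}},"id":1}'
+  # MCP lifecycle: tell the server initialization is complete before sending other requests.
+  printf '%s\n' '{"jsonrpc":"2.0","method":"notifications/initialized"}'
+  printf '%s\n' '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":2}'
+) | your-mcp-server-command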
+```
+Record only metadata, not raw prompt/source/tool output:
+```json
+{
+  "kind": "mcp.tool_visibility.receipt",
+  "server": "gitlab",
+  "server_command_hash": "sha256:...",
+  "protocol_version_requested": "2024-11-05",
+  "handshake": "ok",
+  "proxy_tools_count": 172,
+  "proxy_tool_names_sample": ["glab_issue_list", "glab_mr_view"],
+  "client": "Claude Code",
+  "client_catalog_visible": false,
+  "client_tools_count": 0,
+  "stopped_at": "client_catalog_visible",
+  "privacy": "names/counts only; no args, outputs, tokens, paths, or source snippets"
+}
+```
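To turn the probe's raw stdout into a receipt like the one above without copying tool output, something along these lines may help. This is a minimal Python sketch, assuming the server prints one JSON-RPC message per line and that the probe used `id` 1 for `initialize` and `id` 2 for `tools/list`. The function `summarize_probe` and the `protocol_version_returned` field are hypothetical additions, not part of the sample receipt.

```python
import hashlib
import json


def summarize_probe(command: str, stdout_lines, sample_size: int = 2) -> dict:
    """Reduce raw probe output to names and counts only."""
    responses = {}
    for line in stdout_lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON noise such as log lines
        if isinstance(message, dict) and "id" in message:
            responses[message["id"]] = message

    init = responses.get(1, {}).get("result")
    tools = responses.get(2, {}).get("result", {}).get("tools")
    names = sorted(tool["name"] for tool in tools) if tools is not None else []
    return {
        "kind": "mcp.tool_visibility.receipt",
        # Hash the command so the receipt never carries args, env, or paths.
        "server_command_hash": "sha256:" + hashlib.sha256(command.encode()).hexdigest(),
        "handshake": "ok" if init is not None else "failed",
        "protocol_version_returned": init.get("protocolVersion") if init else None,
        "proxy_tools_count": len(names),
        "proxy_tool_names_sample": names[:sample_size],
    }


lines = [
    '{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2024-11-05","capabilities":{}}}',
    '{"jsonrpc":"2.0","id":2,"result":{"tools":[{"name":"glab_mr_view"},{"name":"glab_issue_list"}]}}',
]
print(json.dumps(summarize_probe("your-mcp-server-command", lines), indent=2))
```

The summary covers only the server-side boundaries. Client-side fields such as `client_catalog_visible` still have to come from the agent UI or runtime, since the probe bypasses the client entirely.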
+## Acceptance check
+For a release or bug report, ask for one small matrix:
+| Boundary | Evidence | Pass condition |
+| --- | --- | --- |
+| Launch | server command hash + exit/live status | command starts and stays alive long enough for handshake |
+| Handshake | protocol version + capabilities summary | initialized without version/schema mismatch |
+| Proxy catalog | `tools/list` count + stable tool-name sample | expected tools returned directly |
+| Client catalog | client-visible count + naming prefix | same class of tools visible to the agent |
+| First invocation | allowed/refused reason | failure explains permission/config/schema, not silent absence |
+This shape is intentionally compatible with GitHub/GitLab issue reports and OpenTelemetry-style events. It helps maintainers separate server bugs from client catalog, protocol-version, schema, timeout, and permission bugs without asking users to paste private output.
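The matrix above implies an ordering: each boundary only matters once the previous one has passed. A small Python sketch of deriving `stopped_at` from per-boundary results follows. Only `client_catalog_visible` appears as a `stopped_at` value in the sample receipt; the other boundary keys and the `stopped_at` function itself are hypothetical names chosen for illustration.

```python
# Boundaries in the order the chain is crossed. Only "client_catalog_visible"
# is taken from the sample receipt; the other keys are assumed names.
BOUNDARIES = (
    "launch",
    "handshake",
    "proxy_catalog",
    "client_catalog_visible",
    "first_invocation",
)


def stopped_at(results: dict):
    """Return the first boundary that did not pass, or None if all passed.

    A boundary with no recorded result counts as not passed, so a receipt
    cannot claim success by omission.
    """
    for boundary in BOUNDARIES:
        if results.get(boundary) is not True:
            return boundary
    return None


results = {
    "launch": True,
    "handshake": True,
    "proxy_catalog": True,
    "client_catalog_visible": False,
}
print(stopped_at(results))
```

Reporting the first failed boundary, rather than a single healthy/unhealthy flag, is what lets a maintainer tell a server bug apart from a client catalog or permission problem.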
+## Why this belongs near Pluribus
+Pluribus should not become an MCP gateway or memory database. The narrow value is evidence for context boundaries:
+- generated instruction files prove what static rules were written;
+- memory/search receipts prove what retrieved context was delivered;
+- tool visibility receipts prove whether a configured MCP capability actually crossed into the agent's usable catalog.
+If a tool is not visible to the agent, the project has no reliable context handoff no matter how healthy the server looks.