npm - pluribus-context - Versions diffs - 0.3.32 → 0.3.34 - Mend

pluribus-context 0.3.32 → 0.3.34


Files changed (12)

package/CHANGELOG.md +8 -0
package/README.md +5 -5
package/docs/community-review-packet.md +11 -11
package/docs/context-budget-receipts.md +22 -0
package/docs/context-input-evidence.md +15 -0
package/examples/agent-skills/context-receipts/SKILL.md +46 -1
package/examples/context-input-evidence/code-search-retrieval-otel-trace.json +879 -0
package/examples/context-input-evidence/code-search-retrieval-receipt.ndjson +8 -0
package/examples/context-input-evidence/convert-code-search-retrieval-log.mjs +280 -0
package/examples/context-input-evidence/sample-code-search-retrieval-log.jsonl +5 -0
package/package.json +1 -1
package/src/utils/version.js +1 -1

package/CHANGELOG.md CHANGED

@@ -4,6 +4,14 @@
 All notable changes to Pluribus are documented here.
+## 0.3.34 - 2026-05-26
+- Repositioned README and the community review packet around privacy-safe agent context receipts first, with instruction-file audit/sync as the supporting workflow, so directory reviewers do not mistake Pluribus for another generic ContextOps, memory, RAG, or rules-sync tool.
+## 0.3.33 - 2026-05-26
+- Added Agent Skill metadata frontmatter and a `/usage` attribution smoke to the context receipts skill so directory reviewers can evaluate it as a standard SKILL.md and connect receipts to component-level usage breakdowns.
 ## 0.3.32 - 2026-05-26
 - Added an executable compaction transaction/rollback receipt fixture for failed `/compact` runs, proving summary failure, `swap_committed=false`, original-context preservation, restored deferred-tool registry/system-reminder queue, and no stale reminder replay without logging raw transcript/tool output.

package/README.md CHANGED

@@ -6,15 +6,15 @@
 [![Building in Public](https://img.shields.io/badge/building-in%20public-orange?style=flat-square)](https://x.com/RibeiroCaioCLW)
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue?style=flat-square)](LICENSE)
-> Detect where AI-agent context loses fidelity across tools — then sync the parts that can be safely shared.
+> Privacy-safe context receipts for AI coding agents — plus audits/sync for the instruction files they actually load.
-Pluribus (`pluribus-context` on npm, `pluribus` on the command line) is an AI context sync CLI with AI-agent context fidelity audit for teams and projects that use Claude Code, Cursor, GitHub Copilot, OpenClaw, Windsurf, Continue, Zed, or Bob.
+Pluribus (`pluribus-context` on npm, `pluribus` on the command line) is a CLI for **agent context evidence**. It helps teams answer: what instruction file, skill, MCP/tool schema, memory/RAG result, compaction, pruning step, or generated rule actually crossed an agent boundary — without logging raw prompts, source code, tool output, paths, transcripts, secrets, or customer data.
-It shows where instructions keep their semantics, where they are downgraded to a generic fallback, and where manual activation or native discovery matters — then keeps project instructions, conventions, constraints, and team context in one versioned source of truth.
+The original sync workflow is still useful: Pluribus can keep project instructions, conventions, constraints, and team context in one versioned `pluribus.md` source of truth, then generate native files for Claude Code, Cursor, GitHub Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob. The sharper wedge is evidence: read-only audits and receipts show where context keeps fidelity, downgrades to a generic fallback, duplicates, stays deferred, hydrates, gets pruned, or rolls back after failed compaction.
-It is **not** a persistent memory layer, retrieval system, agent orchestrator, or agent-merging framework. Think `CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`, `AGENTS.md` — one intentional context, multiple generated outputs.
+It is **not** a persistent memory layer, retrieval system, agent orchestrator, enterprise ContextOps platform, or agent-merging framework. Think evidence for context boundaries: `CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`, `AGENTS.md`, MCP Tool Search, Agent Skills, RAG/code-search, pruning, and compaction — with privacy-safe receipts instead of raw content dumps.
-**Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and a disposable 60-second smoke test. If you only run one command, try `npx --yes pluribus-context@latest audit --json --fidelity-report` to see native discovery surfaces, generic fallbacks, load evidence, duplicate-load selection evidence, manual activation requirements, effective context scope, and semantic differences. For the newer agent-observability wedge, start with [context-budget receipts](docs/context-budget-receipts.md): privacy-safe evidence for what MCP schemas, skills, memory, subagents, CLI help, or summaries actually crossed an agent boundary. If you want the same idea as a copyable skill, use the [context-receipts Agent Skill recipe](examples/agent-skills/context-receipts/). npm `latest` is currently aligned with the GitHub release; the review packet also documents a GitHub-release smoke fallback for future release-lag windows.
+**Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and disposable 60-second smoke tests. If you only run one command for the cross-tool audit, try `npx --yes pluribus-context@latest audit --json --fidelity-report` to see native discovery surfaces, generic fallbacks, load evidence, duplicate-load selection evidence, manual activation requirements, effective context scope, and semantic differences. For the agent-observability wedge, start with [context-budget receipts](docs/context-budget-receipts.md): privacy-safe evidence for what MCP schemas, skills, memory, subagents, CLI help, retrieval chunks, pruning runs, or compaction summaries crossed an agent boundary. If you want the same idea as a copyable skill, use the [context-receipts Agent Skill recipe](examples/agent-skills/context-receipts/). npm `latest` is currently aligned with the GitHub release; the review packet also documents a GitHub-release smoke fallback for future release-lag windows.
 ---

package/docs/community-review-packet.md CHANGED

@@ -4,11 +4,11 @@ Use this when reviewing Pluribus for a list, newsletter, package roundup, or too
 ## One-line description
-Pluribus keeps intentional AI coding context in one `pluribus.md` source of truth, then syncs or audits the tool-specific files used by Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob.
+Pluribus provides privacy-safe context receipts for AI coding agents, plus audits/sync for the instruction files used by Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob.
 ## Short listing copy
-Pluribus is an open-source CLI for teams and solo developers who use multiple AI coding tools. It treats project instructions, conventions, constraints, and shared team context as versioned Markdown, then generates each tool's expected context file (`CLAUDE.md`, `.cursorrules`, Copilot instructions, `AGENTS.md`, Windsurf/Continue rules, Zed rules, and Bob rules). The safest first command is a read-only audit:
+Pluribus is an open-source CLI for teams and solo developers who need evidence about agent context boundaries. It emits privacy-safe receipts for what crossed, stayed deferred, duplicated, got pruned, or rolled back across MCP tools, Agent Skills, memory/RAG retrieval, subagents, compaction, and generated instruction files — without logging raw prompts, code, schemas, tool outputs, transcripts, paths, secrets, or customer data. It also treats project instructions, conventions, constraints, and shared team context as versioned Markdown, then generates each tool's expected context file (`CLAUDE.md`, `.cursorrules`, Copilot instructions, `AGENTS.md`, Windsurf/Continue rules, Zed rules, and Bob rules). The safest first command is a read-only audit:
 ```bash
 npx --yes pluribus-context@latest audit
@@ -25,10 +25,10 @@ Use these fields for directories, awesome lists, or review forms that ask for a
 | npm | https://www.npmjs.com/package/pluribus-context |
 | License | MIT |
 | Install / run | `npx --yes pluribus-context@latest audit` or `npm install -g pluribus-context@latest` |
-| Category | AI coding tools / context management |
-| Tags | `claude-code`, `cursor`, `copilot`, `openclaw`, `windsurf`, `continue`, `zed`, `bob`, `context-drift` |
-| One sentence | Keep one versioned AI coding context in `pluribus.md`, then audit or sync the generated files used by Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob. |
-| 280-char blurb | Pluribus is an open-source CLI for intentional AI coding context. It keeps project guidance in one `pluribus.md`, then audits or syncs `CLAUDE.md`, Cursor rules, Copilot instructions, `AGENTS.md`, Windsurf/Continue rules, Zed rules, and Bob rules. |
+| Category | AI coding tools / agent observability / context management |
+| Tags | `claude-code`, `cursor`, `copilot`, `openclaw`, `windsurf`, `continue`, `zed`, `bob`, `context-receipts`, `context-drift`, `mcp`, `agent-skills`, `opentelemetry` |
+| One sentence | Emit privacy-safe receipts for what context crossed agent boundaries, and audit or sync the generated instruction files used by Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob. |
+| 280-char blurb | Pluribus is an open-source CLI for agent context evidence. It emits privacy-safe receipts for MCP/tools, skills, memory/RAG, pruning and compaction boundaries, then audits or syncs AI instruction files like `CLAUDE.md`, Cursor rules, Copilot instructions, and `AGENTS.md`. |
 | Safe first command | `npx --yes pluribus-context@latest audit` |
 ### Awesome-list Markdown entry
@@ -36,7 +36,7 @@ Use these fields for directories, awesome lists, or review forms that ask for a
 Use this exact line when a curated list accepts one Markdown bullet per tool:
 ```markdown
-- [Pluribus](https://github.com/caioribeiroclw-pixel/pluribus) - Open-source CLI that keeps one versioned AI coding context in sync across Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob.
+- [Pluribus](https://github.com/caioribeiroclw-pixel/pluribus) - Open-source CLI for privacy-safe agent context receipts, plus audits/sync for AI instruction files across Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob.
 ```
 ## Why it may be useful
@@ -52,12 +52,12 @@ Use this section when a directory, list maintainer, or reviewer asks how Pluribu
 | Question | Short answer |
 | --- | --- |
-| What category is it? | AI coding context management / rules sync CLI. |
-| What is the source of truth? | `pluribus.md`, reviewed in git. |
-| What does it generate? | Tool-native context files for Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob. |
+| What category is it? | Agent context evidence / AI coding context management / rules sync CLI. |
+| What is the source of truth? | For sync: `pluribus.md`, reviewed in git. For receipts: counts, hashes, buckets, lifecycle states, and privacy flags generated from the tool/harness boundary being audited. |
+| What does it generate? | Tool-native context files for Claude Code, Cursor, Copilot, OpenClaw, Windsurf, Continue, Zed, and Bob; receipt fixtures/trace shapes for context-budget, retrieval, pruning, compaction, Tool Search, subagent, and skill boundaries. |
 | What is the safe first step? | Run `npx --yes pluribus-context@latest audit` to inspect existing context files without writing. |
 | When is another tool enough? | If you only need one tool's native rules format or a one-time converter, a smaller rules manager/converter may be enough. |
-| What is Pluribus not? | Not chat memory, retrieval, vector search, agent orchestration, or agent merging. |
+| What is Pluribus not? | Not chat memory, retrieval, vector search, agent orchestration, enterprise ContextOps, or agent merging. |
 ## Safety and removability

package/docs/context-budget-receipts.md CHANGED

@@ -43,6 +43,28 @@ A useful receipt starts small:
 Keep exact counts when they are not sensitive. Bucket token counts and sizes when exact values could reveal private workload shape.
+## Code-search / retrieval receipts
+Semantic code-search MCPs and RAG-over-repo tools can reduce context bloat by returning only relevant chunks. The observability gap is that retrieval and agent-loading are two different boundaries: a tool may return five chunks, a client may dedupe or stale-filter two of them, and only three may actually enter the agent context.
+The receipt should prove:
+- the indexed snapshot/version used, without raw local paths or embedding secrets;
+- the search request identity/category, without raw query text or filters;
+- returned result identities, ranks, score buckets, stale/duplicate markers, and path hashes/extensions/range buckets;
+- which returned chunks were loaded into agent context versus suppressed by the client/harness; and
+- raw code, private paths, prompts, customer names, URLs, tokens, and ticket text stayed out of the receipt.
+Runnable fixture:
+```bash
+node examples/context-input-evidence/convert-code-search-retrieval-log.mjs
+```
+Public trace:
+- `examples/context-input-evidence/code-search-retrieval-otel-trace.json`
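
The "returned five, suppressed two, loaded three" case described in this section can be sketched in a few lines of Node.js. This is a minimal illustration, not the code in `convert-code-search-retrieval-log.mjs` (its source is not shown in this diff). The `buildLoadReceipt` helper and the input record shape with a `suppressedBy` field are hypothetical; the event name and output fields follow the `context.input.loaded` example in the context-receipts SKILL.md.

```javascript
// Hypothetical input shape: one record per chunk the search tool returned.
// `suppressedBy` is null when the chunk actually entered the agent context.
const returned = [
  { rank: 1, suppressedBy: null },
  { rank: 2, suppressedBy: "duplicate" },
  { rank: 3, suppressedBy: null },
  { rank: 4, suppressedBy: "stale_snapshot_chunk" },
  { rank: 5, suppressedBy: null },
];

// Build a context.input.loaded receipt from counts and reason categories only.
function buildLoadReceipt(results) {
  const suppressed = results.filter((r) => r.suppressedBy !== null);
  return {
    event: "context.input.loaded",
    kind: "retrieved_code_chunks",
    loaded_chunk_count: results.length - suppressed.length,
    suppressed_chunk_count: suppressed.length,
    // Deduplicate and sort the reason categories for stable output.
    suppression_reasons: [...new Set(suppressed.map((r) => r.suppressedBy))].sort(),
    raw_code_copied: false, // holds by construction: no chunk text is read here
  };
}

const loadReceipt = buildLoadReceipt(returned);
console.log(JSON.stringify(loadReceipt));
```

Because the helper only ever sees ranks and suppression categories, the receipt cannot carry chunk text or paths even by accident, which is the property the list above asks the receipt to prove.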
 ## Post-hoc pruning / context cleaning
 Context-cleaning tools can reduce a bloated session after context has already entered the transcript. That creates a separate proof boundary from lazy loading: what was pruned, minified, stubbed, deduped, protected, and backed up?

package/docs/context-input-evidence.md CHANGED

@@ -197,6 +197,21 @@ It reads `sample-mcp-tool-search-log.jsonl` and writes `mcp-tool-search-receipt.
 This is for Claude Code/MCP context-budget work where Tool Search reduces context bloat but still needs verifiable boundaries. The receipt should prove “only indexes were loaded up front; this one definition was loaded when needed; private query/arguments/results stayed out of the trace.”
+To test semantic code-search retrieval — where a code-search MCP returns multiple ranked chunks but the client/harness may dedupe stale or duplicate results before loading only a subset into the agent context — run:
+```bash
+node examples/context-input-evidence/convert-code-search-retrieval-log.mjs
+```
+It reads `sample-code-search-retrieval-log.jsonl` and writes `code-search-retrieval-receipt.ndjson` plus `code-search-retrieval-otel-trace.json`. The sample emits:
+- `code.index.snapshot.used` — snapshot, codebase path hash, git commit hash, indexed file/chunk buckets, and privacy flags.
+- `code.search.performed` — query hash/category, filter hash, top-k, and candidate-count bucket.
+- `code.search.result.returned` — rank, score bucket, chunk hash, path hash/extension, line-range bucket, stale/duplicate flags, and whether the result was loaded into agent context.
+- `context.input.loaded` — loaded versus suppressed chunk counts, suppression reason hashes/categories, token bucket, and explicit audit gap.
+The fixture intentionally includes raw private code snippets, local paths, URLs, tokens, customer names, emails, and ticket ids in the synthetic input, then verifies those strings do not appear in the receipt or trace. This is for Claude Context / code-search MCP / RAG-over-repo workflows where “search returned” and “agent loaded” need separate evidence.
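
The verification idea described above (plant private strings in the synthetic input, then confirm none of them reach the receipt or trace) could look roughly like the following. This is a hedged sketch, not the fixture's actual check: `findLeaks`, the planted strings, and the sample receipt line are all hypothetical.

```javascript
// Hypothetical helper: return every forbidden string that appears verbatim in the output.
function findLeaks(serializedOutput, forbiddenStrings) {
  return forbiddenStrings.filter((s) => serializedOutput.includes(s));
}

// Synthetic private values that the input log is assumed to contain.
const forbidden = [
  "/home/alice/acme-billing/src/auth.js",
  "alice@example.com",
  "TICKET-1234",
];

// A receipt line carrying only hashes, categories, and flags.
const receiptLine = JSON.stringify({
  event: "code.search.performed",
  query_hash: "sha256:...",
  query_category: "auth_debug",
  raw_query_copied: false,
});

const leaks = findLeaks(receiptLine, forbidden);
if (leaks.length > 0) {
  throw new Error("receipt leaked private strings: " + leaks.join(", "));
}
console.log("no planted strings found in receipt");
```

A verbatim substring check only catches direct copies. It would miss values that were escaped, truncated, or re-encoded, so it complements the hashing and bucketing rather than replacing them.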
 To test CLI progressive disclosure — where an agent receives a tiny CLI prompt first, loads specific command help only when needed, and executes the CLI instead of loading a full OpenAPI spec or MCP schema set — run:
 ```bash

package/examples/agent-skills/context-receipts/SKILL.md CHANGED

@@ -1,6 +1,11 @@
+---
+name: context-receipts
+description: Emit privacy-safe receipts for context selection, deferral, hydration, compaction, pruning, delegation, usage attribution, and boundary handoffs.
+---
 # Context Receipts
-Use this skill when an agent workflow claims to save context by selecting, deferring, hydrating, summarizing, compacting, pruning, delegating, or isolating context.
+Use this skill when an agent workflow claims to save context by selecting, deferring, hydrating, summarizing, compacting, pruning, delegating, attributing usage, or isolating context.
 The job is not to log the private content. The job is to emit a small receipt that lets a reviewer answer:
@@ -95,6 +100,46 @@ Minimal JSONL event names:
 {"event":"subagent.toolsearch.matrix.completed","tested_axis":"tools_frontmatter_shape","audit_gap":"proves ToolSearch exposure, not semantic tool relevance or runtime call success"}
 ```
+## Retrieval / code-search smoke
+For semantic code search, repo RAG, or MCP tools such as Claude Context, separate "search returned" from "agent context loaded":
+- which index snapshot/version was used, without raw local codebase paths;
+- what query/category/filter identity selected the candidates, without raw query text;
+- which result ids/chunk hashes were returned, with rank, score bucket, stale flag, duplicate marker, path hash/extension, and range bucket;
+- which returned chunks were actually loaded into the agent context;
+- which chunks were suppressed as duplicate, stale, clipped, policy-blocked, or over budget;
+- whether raw code, raw prompts, raw paths, customer names, URLs, secrets, and ticket text stayed out of the receipt;
+- the audit gap: this proves retrieval/loading boundaries, not semantic answer quality.
+Minimal JSONL event names:
+```jsonl
+{"event":"code.index.snapshot.used","snapshot_id_hash":"sha256:...","codebase_path_hash":"sha256:...","indexed_chunk_count_bucket":"over_1k","raw_codebase_path_copied":false}
+{"event":"code.search.performed","query_hash":"sha256:...","query_category":"auth_debug","candidate_count_bucket":"over_1k","raw_query_copied":false}
+{"event":"code.search.result.returned","rank":1,"chunk_id_hash":"sha256:...","chunk_text_hash":"sha256:...","path_hash":"sha256:...","score_bucket":"high","stale":false,"raw_code_copied":false}
+{"event":"context.input.loaded","kind":"retrieved_code_chunks","loaded_chunk_count":3,"suppressed_chunk_count":2,"suppression_reasons":["duplicate","stale_snapshot_chunk"],"raw_code_copied":false}
+```
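
One way a reviewer might sanity-check a receipt file shaped like the events above is to scan it for privacy flags that are not `false`. The sketch below assumes the flags follow the `raw_*_copied` naming seen in the examples; `privacyFlagViolations` and the sample lines are hypothetical and are not part of Pluribus.

```javascript
// Hypothetical check: every field named raw_*_copied must be exactly false.
function privacyFlagViolations(jsonl) {
  const violations = [];
  const lines = jsonl.split("\n").filter((line) => line.trim() !== "");
  lines.forEach((line, index) => {
    const event = JSON.parse(line);
    for (const [key, value] of Object.entries(event)) {
      if (/^raw_.*_copied$/.test(key) && value !== false) {
        // `line` is the 1-based position among non-empty lines.
        violations.push({ line: index + 1, event: event.event, flag: key });
      }
    }
  });
  return violations;
}

const sampleJsonl = [
  '{"event":"code.search.performed","query_hash":"sha256:...","raw_query_copied":false}',
  '{"event":"code.search.result.returned","rank":1,"raw_code_copied":true}',
].join("\n");

console.log(privacyFlagViolations(sampleJsonl));
```

The flags are self-reported by whatever emitted the receipt, so a clean result here shows the receipt is internally consistent, not that raw content was never copied.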
+## Usage attribution smoke
+For `/usage`, `/context`, `/doctor`, or other context-budget breakdowns, map each displayed category to evidence that can be reviewed without exposing private content:
+- what measurement window was used;
+- which categories were attributed, such as skills, subagents, plugins, MCP servers, rules, memory, or project files;
+- which components were loaded, deferred, hydrated, suppressed, pruned, or rolled back;
+- before/after or current token/cost buckets by category;
+- whether raw skill bodies, prompts, MCP schemas, tool outputs, and file paths were excluded;
+- the remaining audit gap, such as not proving semantic usefulness of a high-cost component.
+Minimal JSONL event names:
+```jsonl
+{"event":"context.usage.window.measured","window":"current_session","total_token_bucket":"100k_150k","raw_prompts_copied":false}
+{"event":"context.usage.category.attributed","category":"mcp_server","component_hash":"sha256:...","loaded_token_bucket":"10k_25k","deferred_definition_count":42,"hydrated_definition_count":3,"raw_schema_copied":false}
+{"event":"context.usage.breakdown.completed","categories":["skills","subagents","plugins","mcp_server"],"audit_gap":"proves attribution buckets, not whether each component was necessary"}
+```
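
Labels such as `100k_150k` and `10k_25k` imply a bucketing step that turns exact token counts into coarse ranges before they reach the receipt. A minimal sketch follows; the bucket edges and the `tokenBucket` helper are assumptions chosen to reproduce the labels in the example events, not Pluribus's actual boundaries.

```javascript
// Hypothetical bucket edges (upper bounds are exclusive), chosen to match the example labels.
const TOKEN_BUCKET_EDGES = [0, 1_000, 10_000, 25_000, 50_000, 100_000, 150_000, 200_000];

// Format an edge as a short label: 10000 -> "10k", 0 -> "0".
function edgeLabel(n) {
  return n >= 1000 ? `${n / 1000}k` : String(n);
}

// Map an exact token count to a coarse range label.
function tokenBucket(count) {
  if (!Number.isInteger(count) || count < 0) {
    throw new RangeError("token count must be a non-negative integer");
  }
  for (let i = 0; i < TOKEN_BUCKET_EDGES.length - 1; i++) {
    if (count < TOKEN_BUCKET_EDGES[i + 1]) {
      return `${edgeLabel(TOKEN_BUCKET_EDGES[i])}_${edgeLabel(TOKEN_BUCKET_EDGES[i + 1])}`;
    }
  }
  // Anything at or above the last edge falls into an open-ended bucket.
  return `over_${edgeLabel(TOKEN_BUCKET_EDGES[TOKEN_BUCKET_EDGES.length - 1])}`;
}

console.log(tokenBucket(123456)); // 100k_150k
console.log(tokenBucket(12000)); // 10k_25k
```

Wider buckets reveal less about workload shape but make before/after comparisons coarser, so the edges are a per-team trade-off rather than a fixed standard.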
 ## Pruning / compaction smoke
 For context-cleaning, pruning, compaction, or doctor/guard tools, answer: