pluribus-context 0.3.26 → 0.3.28

package/CHANGELOG.md CHANGED
@@ -4,7 +4,15 @@
4
4
 
5
5
  All notable changes to Pluribus are documented here.
6
6
 
7
- - Added an executable subagent delegation receipt demo proving that large child command/tool output stayed isolated and only a bounded summary crossed back into the parent context.
7
+ ## 0.3.27 - 2026-05-25
8
+
9
+ - Added a copyable Agent Skill recipe for privacy-safe context receipts, with 60-second smokes for Tool Search/lazy MCP, skill/prompt context, and subagent/manager boundaries.
10
+ - Linked the recipe from the README and context-budget receipts guide so reviewers can install or inspect a concrete `SKILL.md` instead of reverse-engineering the receipt fixtures.
11
+
12
+ ## 0.3.26 - 2026-05-24
13
+
14
+ - Added the public context-budget receipts guide, connecting subagent boot budget, delegation boundaries, MCP manager isolation, MCP gateway progressive disclosure, and CLI progressive disclosure under one diagnostic frame: what ate the agent's context before or after the task.
15
+ - Published the GitHub release and npm catch-up so `pluribus-context@latest` includes the context-budget receipts docs and fixtures.
8
16
 
9
17
  ## 0.3.25 - 2026-05-23
10
18
 
package/README.md CHANGED
@@ -14,7 +14,7 @@ It shows where instructions keep their semantics, where they are downgraded to a
14
14
 
15
15
  It is **not** a persistent memory layer, retrieval system, agent orchestrator, or agent-merging framework. Think `CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`, `AGENTS.md` — one intentional context, multiple generated outputs.
16
16
 
17
- **Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and a disposable 60-second smoke test. If you only run one command, try `npx --yes pluribus-context@latest audit --json --fidelity-report` to see native discovery surfaces, generic fallbacks, load evidence, duplicate-load selection evidence, manual activation requirements, effective context scope, and semantic differences. For the newer agent-observability wedge, start with [context-budget receipts](docs/context-budget-receipts.md): privacy-safe evidence for what MCP schemas, skills, memory, subagents, CLI help, or summaries actually crossed an agent boundary.
17
+ **Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and a disposable 60-second smoke test. If you only run one command, try `npx --yes pluribus-context@latest audit --json --fidelity-report` to see native discovery surfaces, generic fallbacks, load evidence, duplicate-load selection evidence, manual activation requirements, effective context scope, and semantic differences. For the newer agent-observability wedge, start with [context-budget receipts](docs/context-budget-receipts.md): privacy-safe evidence for what MCP schemas, skills, memory, subagents, CLI help, or summaries actually crossed an agent boundary. If you want the same idea as a copyable skill, use the [context-receipts Agent Skill recipe](examples/agent-skills/context-receipts/). npm `latest` is currently aligned with the GitHub release; the review packet also documents a GitHub-release smoke fallback for future release-lag windows.
18
18
 
19
19
  ---
20
20
 
@@ -122,6 +122,13 @@ pluribus --help
122
122
  npm uninstall -g pluribus-context
123
123
  ```
124
124
 
125
+ npm `latest` is currently aligned with the latest GitHub release. If you are reviewing a future GitHub release before npm `latest` catches up, run that release directly without a global install:
126
+
127
+ ```bash
128
+ npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus --version
129
+ npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus help
130
+ ```
131
+
125
132
  For local development:
126
133
 
127
134
  ```bash
@@ -82,9 +82,16 @@ npx --yes pluribus-context@latest sync --dry-run
82
82
  npx --yes pluribus-context@latest audit --ci --json --output pluribus-audit.json || test $? -eq 1
83
83
  ```
84
84
 
85
+ npm `latest` is currently aligned with the latest GitHub release. If a future GitHub release is newer than npm `latest`, review that release directly instead of waiting for the npm dist-tag:
86
+
87
+ ```bash
88
+ npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus --version
89
+ npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus help
90
+ ```
91
+
85
92
  Expected result:
86
93
 
87
- - `--version` prints the current npm release.
94
+ - `--version` prints the current npm release. During a future release-lag window, it prints the GitHub release version when using the `npm exec --package github:...` fallback.
88
95
  - `init --dry-run` previews `pluribus.md` without writing.
89
96
  - `init` writes `pluribus.md`.
90
97
  - `validate` succeeds.
@@ -6,6 +6,8 @@ Privacy-safe receipts for answering a narrow operational question:
6
6
 
7
7
  This is different from generic token accounting. A context-budget receipt should prove which context surfaces were available, which ones crossed the boundary, which ones stayed deferred or suppressed, and how much budget remained — without exporting raw prompts, tool schemas, tool outputs, memory bodies, file paths, ticket text, secrets, or customer data.
8
8
 
9
+ If you want a copyable Agent Skill recipe instead of a spec-style guide, see [`examples/agent-skills/context-receipts/`](../examples/agent-skills/context-receipts/). It turns the receipt pattern into a 60-second smoke checklist for Tool Search, skills, and subagent boundaries.
10
+
9
11
  ## When to use this receipt
10
12
 
11
13
  Use a context-budget receipt when a coding agent looks lazy, fails with `prompt is too long`, or returns a tiny summary after a subagent/tool-heavy step and you need to distinguish:
@@ -61,6 +63,22 @@ Public trace:
61
63
 
62
64
  - `examples/context-input-evidence/subagent-context-budget-otel-trace.json`
63
65
 
66
+ ## Per-agent MCP injection
67
+
68
+ Role-specific subagents may need different MCP surfaces: a testing agent might need `testing` and `github`, while deployment, analytics, email, or browser servers should stay outside that boot context. The receipt should prove the policy boundary before the first task:
69
+
70
+ - role/session id for the subagent without raw instructions;
71
+ - available server count/hash for the role;
72
+ - excluded server count/hash before boot;
73
+ - loaded vs deferred tool-definition counts;
74
+ - startup token bucket after the policy was applied; and
75
+ - an explicit audit gap stating that this proves injection scope, not semantic tool quality.
76
+
77
+ Minimal events:
78
+
79
+ - `subagent.mcp_policy.applied`
80
+ - `subagent.context_boot.evaluated`
81
+
64
82
  ## Delegation boundary
65
83
 
66
84
  A subagent can save parent context at boot and still lose the benefit if raw child output is pasted back into the parent. The receipt should prove:
@@ -144,7 +162,8 @@ Instead of “why is my subagent bad?”, ask for a receipt or debug JSON that c
144
162
  2. How many were loaded into the parent?
145
163
  3. How many were loaded into the subagent?
146
164
  4. How many were suppressed/deferred?
147
- 5. What token bucket remained before the first tool call?
148
- 6. Did raw child output return to the parent, or only a bounded summary?
165
+ 5. For a subagent, which MCP servers were allowed and which were excluded before boot?
166
+ 6. What token bucket remained before the first tool call?
167
+ 7. Did raw child output return to the parent, or only a bounded summary?
149
168
 
150
169
  That is the narrow wedge for Pluribus: context-budget evidence across agent boundaries, not another memory store or tool router.
@@ -0,0 +1,22 @@
1
+ # Context receipts Agent Skill recipe
2
+
3
+ This is a small, copyable Agent Skill recipe for context-engineering users who are adopting Tool Search, lazy MCP loading, skills, memory, compaction, or subagents and need to verify what actually crossed the context boundary.
4
+
5
+ It is intentionally markdown-only so it can be copied into a local skills directory such as:
6
+
7
+ - `.claude/skills/context-receipts/SKILL.md`
8
+ - `.opencode/skills/context-receipts/SKILL.md`
9
+ - `.agents/skills/context-receipts/SKILL.md`
10
+
11
+ ## Quick smoke
12
+
13
+ Ask an agent or harness that uses the skill to emit a receipt for one workflow as `receipt.jsonl`, then verify these constraints:
14
+
15
+ ```bash
16
+ grep -E 'mcp\.tool_index\.loaded|context\.skill\.registry\.index\.loaded|subagent\.mcp_policy\.applied|subagent\.delegation\.requested' receipt.jsonl
17
+ grep -E 'raw_(schema|query|args|result|output)_copied":false|raw.*CopiedToReceipt":false' receipt.jsonl
18
+ ```
19
+
20
+ Then manually check that the receipt contains counts, hashes, ids, buckets, and `audit_gap`, but does **not** contain private prompts, raw schemas, tool args/results, skill bodies, memory bodies, customer names, secrets, or transcript text.
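The two `grep` checks and part of the manual review above could also be run in one pass. The following is a minimal JavaScript sketch under stated assumptions, not part of the Pluribus CLI: `checkReceipt` and `FORBIDDEN_KEYS` are hypothetical names, and it assumes the receipt is JSONL text whose redaction flags are top-level keys shaped like `raw_..._copied` or `raw...CopiedToReceipt`.

```javascript
// Hypothetical helper, not part of pluribus-context.
// Assumed key names that should never appear in a receipt; adjust to your schema.
const FORBIDDEN_KEYS = ["prompt", "schema", "args", "result", "transcript", "body"];

function checkReceipt(jsonlText) {
  const problems = [];
  // One JSON object per non-empty line.
  const events = jsonlText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));

  for (const [index, event] of events.entries()) {
    for (const [key, value] of Object.entries(event)) {
      // Redaction flags must be explicitly false, mirroring the second grep.
      const isCopyFlag = /^raw_.*_copied$/.test(key) || /^raw.*CopiedToReceipt$/.test(key);
      if (isCopyFlag && value !== false) {
        problems.push(`event ${index}: ${key} is not false`);
      }
      // Raw-content keys are rejected outright.
      if (FORBIDDEN_KEYS.includes(key)) {
        problems.push(`event ${index}: forbidden key ${key}`);
      }
    }
  }
  return { eventCount: events.length, problems };
}
```

Unlike the `grep` pair, this sketch fails on any flag that is not `false`, rather than only confirming that at least one `false` flag exists. It does not replace the manual check for private text hidden in otherwise allowed fields.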
21
+
22
+ For executable fixture examples, see [`../../context-input-evidence/`](../../context-input-evidence/).
@@ -0,0 +1,107 @@
1
+ # Context Receipts
2
+
3
+ Use this skill when an agent workflow claims to save context by selecting, deferring, hydrating, summarizing, compacting, delegating, or isolating context.
4
+
5
+ The job is not to log the private content. The job is to emit a small receipt that lets a reviewer answer:
6
+
7
+ > what crossed the context boundary, what stayed out, and what audit gap remains?
8
+
9
+ ## Privacy defaults
10
+
11
+ Never include raw prompts, raw tool schemas, raw tool arguments, raw tool results, raw skill bodies, memory bodies, secrets, customer names, or full transcripts in the receipt.
12
+
13
+ Prefer:
14
+
15
+ - stable ids or hashed ids;
16
+ - counts and token/line buckets;
17
+ - categorical reasons;
18
+ - explicit booleans for raw content copied/not copied;
19
+ - before/after context budget buckets;
20
+ - an `audit_gap` field when the receipt proves routing but not semantic correctness.
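One way to produce the token buckets preferred above is to map an exact count to a coarse label before it is written to the receipt. This is a sketch only: the label style (`lt_1k`, `1k_2k`, `2k_4k`, `50k_75k`) follows the sample events in this skill, while `tokenBucket`, the `BUCKET_EDGES` values, and the `gte_100k` label are assumptions made for illustration.

```javascript
// Hypothetical bucket edges in tokens; only the label style comes from the sample events.
const BUCKET_EDGES = [1000, 2000, 4000, 8000, 16000, 32000, 50000, 75000, 100000];

function tokenBucket(tokenCount) {
  if (!Number.isInteger(tokenCount) || tokenCount < 0) {
    throw new RangeError("tokenCount must be a non-negative integer");
  }
  // Everything below the first edge shares one label.
  if (tokenCount < BUCKET_EDGES[0]) return "lt_1k";
  for (let i = 1; i < BUCKET_EDGES.length; i += 1) {
    if (tokenCount < BUCKET_EDGES[i]) {
      // Label is "<lower>k_<upper>k", lower bound inclusive, upper bound exclusive.
      return `${BUCKET_EDGES[i - 1] / 1000}k_${BUCKET_EDGES[i] / 1000}k`;
    }
  }
  // Assumed catch-all label for counts at or above the last edge.
  return "gte_100k";
}
```

Emitting only the label keeps the exact count, which can hint at prompt or schema size, out of the receipt.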
21
+
22
+ ## 60-second Tool Search smoke
23
+
24
+ For MCP Tool Search, lazy tool loading, or progressive disclosure, emit enough evidence to answer these seven checks:
25
+
26
+ 1. **Index-only startup:** did the session load a compact tool/server index instead of all full schemas?
27
+ 2. **Search/routing:** what hashed query/category or routing reason selected candidate tools?
28
+ 3. **Hydration:** which full tool definition was loaded, why, and how many definitions stayed suppressed?
29
+ 4. **Call:** which server/tool id was invoked, with argument/result redaction status and success/error status?
30
+ 5. **Boundary:** if a manager subagent or child agent was used, did raw child output return to the parent?
31
+ 6. **Budget:** what were the startup and post-hydration context-token buckets?
32
+ 7. **Audit gap:** what is not proven, such as whether the selected tool was semantically optimal?
33
+
34
+ Minimal JSONL events:
35
+
36
+ ```jsonl
37
+ {"event":"mcp.tool_index.loaded","loaded_server_count":12,"loaded_tool_index_count":84,"full_schema_count":0,"suppressed_tool_count":84,"raw_schema_copied":false,"startup_token_bucket":"lt_1k"}
38
+ {"event":"mcp.tool_search.performed","query_hash":"sha256:...","query_category":"repo_search","candidate_tool_count":5,"selected_tool_id":"github.search_code","raw_query_copied":false}
39
+ {"event":"mcp.tool_definition.loaded","tool_id":"github.search_code","hydrate_reason":"selected_after_tool_search","suppressed_tool_count":83,"definition_token_bucket":"1k_2k","raw_schema_copied":false}
40
+ {"event":"mcp.tool_call.completed","tool_id":"github.search_code","args_hash":"sha256:...","result_token_bucket":"2k_4k","raw_args_copied":false,"raw_result_copied":false,"status":"ok"}
41
+ ```
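A reviewer could check a trace like the one above for ordering problems without reading any raw content. The sketch below assumes a well-formed trace searches before it hydrates and hydrates before it calls. That ordering rule is an interpretation of the checks in this section, and `checkToolSearchTrace` is a hypothetical name. Field names follow the sample events.

```javascript
// Hypothetical checker; field names follow the sample events above.
function checkToolSearchTrace(events) {
  const findings = [];
  const selected = new Set(); // tool ids chosen by a search
  const hydrated = new Set(); // tool ids whose full definition was loaded

  for (const e of events) {
    switch (e.event) {
      case "mcp.tool_index.loaded":
        // Index-only startup: no full schemas at boot.
        if (e.full_schema_count !== 0) findings.push("startup loaded full schemas");
        break;
      case "mcp.tool_search.performed":
        selected.add(e.selected_tool_id);
        break;
      case "mcp.tool_definition.loaded":
        if (!selected.has(e.tool_id)) findings.push(`${e.tool_id} hydrated without a search`);
        hydrated.add(e.tool_id);
        break;
      case "mcp.tool_call.completed":
        if (!hydrated.has(e.tool_id)) findings.push(`${e.tool_id} called before hydration`);
        break;
      default:
        break; // Unknown events are ignored in this sketch.
    }
  }
  return findings;
}
```

An empty result only shows that the routing order held. It says nothing about whether the selected tool was the right one, which is the audit gap named in check 7.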
42
+
43
+ ## Skill / prompt context smoke
44
+
45
+ For skills, rules, AGENTS.md overlays, or instruction files, answer:
46
+
47
+ - which index/listing entered the session;
48
+ - which full skill/rule/instruction body was selected;
49
+ - which candidates were suppressed and why;
50
+ - whether the body was loaded at session start, after a search, or after an explicit command;
51
+ - source hash, delivered hash, and canonical form when available;
52
+ - whether the skill/instruction text was copied into the receipt.
53
+
54
+ Minimal event names:
55
+
56
+ - `context.skill.registry.index.loaded`
57
+ - `context.skill.registry.skill.read`
58
+ - `context.skill.registry.skill.injected`
59
+ - `context.input.loaded`
60
+ - `context.input.candidate_suppressed`
61
+
62
+ ## Per-agent MCP injection smoke
63
+
64
+ For role-specific subagents or per-agent MCP configs, prove the policy boundary before debugging model quality:
65
+
66
+ - which subagent role/session requested tools;
67
+ - which MCP servers were available to that role;
68
+ - which servers were explicitly excluded before boot;
69
+ - whether startup loaded full schemas or only a compact index;
70
+ - how many tool definitions stayed deferred/suppressed; and
71
+ - the startup token bucket after policy was applied.
72
+
73
+ Minimal JSONL events:
74
+
75
+ ```jsonl
76
+ {"event":"subagent.mcp_policy.applied","subagent_role":"testing","available_server_count":2,"available_servers_hash":"sha256:...","excluded_server_count":5,"excluded_servers_hash":"sha256:...","policy_source":"role_config","raw_server_names_copied":false}
77
+ {"event":"subagent.context_boot.evaluated","subagent_role":"testing","loaded_tool_definition_count":0,"deferred_tool_definition_count":48,"startup_token_bucket":"50k_75k","raw_schema_copied":false,"audit_gap":"proves injection boundary, not tool relevance"}
78
+ ```
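The policy event above can be shape-checked before anyone debugs model quality. This is a minimal sketch, assuming the field names of the sample `subagent.mcp_policy.applied` event and a `sha256:` prefix on hashes. `checkPolicyEvent` is a hypothetical helper, not a Pluribus API.

```javascript
// Hypothetical validator for one `subagent.mcp_policy.applied` event.
function checkPolicyEvent(event) {
  const errors = [];
  if (event.event !== "subagent.mcp_policy.applied") {
    errors.push("unexpected event name");
  }
  // Counts must be present and non-negative.
  for (const key of ["available_server_count", "excluded_server_count"]) {
    if (!Number.isInteger(event[key]) || event[key] < 0) {
      errors.push(`${key} must be a non-negative integer`);
    }
  }
  // Server sets are reported as hashes, never as raw names.
  for (const key of ["available_servers_hash", "excluded_servers_hash"]) {
    if (typeof event[key] !== "string" || !event[key].startsWith("sha256:")) {
      errors.push(`${key} must be a sha256 hash`);
    }
  }
  if (event.raw_server_names_copied !== false) {
    errors.push("raw_server_names_copied must be false");
  }
  return errors;
}
```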
79
+
80
+ ## Subagent / manager boundary smoke
81
+
82
+ For subagents, manager agents, or child workers, answer:
83
+
84
+ - what task was delegated, by category and hashed objective;
85
+ - what large output was captured by the child, as line/token buckets;
86
+ - what bounded summary returned to the parent;
87
+ - whether raw child output, tool results, or MCP schemas entered the parent context;
88
+ - the remaining audit gap.
89
+
90
+ Minimal event names:
91
+
92
+ - `subagent.delegation.requested`
93
+ - `subagent.tool_output.captured`
94
+ - `subagent.summary.returned`
95
+ - `parent.context_budget.evaluated`
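As a first pass on the boundary events listed above, a harness could confirm that all four event names appear in a trace. The sketch below uses only those names. `missingBoundaryEvents` is a hypothetical helper, and it deliberately does not inspect event fields, since this recipe does not specify them.

```javascript
// Event names taken from the list above; the helper itself is hypothetical.
const REQUIRED_BOUNDARY_EVENTS = [
  "subagent.delegation.requested",
  "subagent.tool_output.captured",
  "subagent.summary.returned",
  "parent.context_budget.evaluated",
];

function missingBoundaryEvents(events) {
  // Collect every event name that appears in the trace.
  const seen = new Set(events.map((e) => e.event));
  // Report required names that never appeared, in list order.
  return REQUIRED_BOUNDARY_EVENTS.filter((name) => !seen.has(name));
}
```

A trace that passes still needs the raw-output question answered separately. Presence of `subagent.summary.returned` alone does not prove that raw child output stayed out of the parent.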
96
+
97
+ ## Good receipt test
98
+
99
+ A receipt is useful if a maintainer can debug one of these failures without seeing private content:
100
+
101
+ - the agent never found the right tool/skill;
102
+ - the full definition loaded too early;
103
+ - too many definitions stayed in context;
104
+ - a child/subagent saved no budget because raw output returned to the parent;
105
+ - compaction happened but no one can prove what was preserved, summarized, or dropped.
106
+
107
+ A receipt is not enough if it only says “Tool Search enabled” or “used subagent”. It must prove the boundary behavior.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pluribus-context",
3
- "version": "0.3.26",
3
+ "version": "0.3.28",
4
4
  "description": "AI context and rules sync CLI for Claude.md, Claude Code, Cursor, and Copilot instructions, with privacy-safe context receipts that prove what memory, tools, skills, compactions, and security findings crossed agent boundaries without logging raw content.",
5
5
  "type": "module",
6
6
  "homepage": "https://github.com/caioribeiroclw-pixel/pluribus#readme",
@@ -1 +1 @@
1
- export const VERSION = '0.3.26'
1
+ export const VERSION = '0.3.28'