pluribus-context 0.3.26 → 0.3.28
- package/CHANGELOG.md +9 -1
- package/README.md +8 -1
- package/docs/community-review-packet.md +8 -1
- package/docs/context-budget-receipts.md +21 -2
- package/examples/agent-skills/context-receipts/README.md +22 -0
- package/examples/agent-skills/context-receipts/SKILL.md +107 -0
- package/package.json +1 -1
- package/src/utils/version.js +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,7 +4,15 @@

 All notable changes to Pluribus are documented here.

-
+## 0.3.27 - 2026-05-25
+
+- Added a copyable Agent Skill recipe for privacy-safe context receipts, with 60-second smokes for Tool Search/lazy MCP, skill/prompt context, and subagent/manager boundaries.
+- Linked the recipe from the README and context-budget receipts guide so reviewers can install or inspect a concrete `SKILL.md` instead of reverse-engineering the receipt fixtures.
+
+## 0.3.26 - 2026-05-24
+
+- Added the public context-budget receipts guide, connecting subagent boot budget, delegation boundaries, MCP manager isolation, MCP gateway progressive disclosure, and CLI progressive disclosure under one diagnostic frame: what ate the agent's context before or after the task.
+- Published the GitHub release and npm catch-up so `pluribus-context@latest` includes the context-budget receipts docs and fixtures.

 ## 0.3.25 - 2026-05-23

package/README.md
CHANGED
@@ -14,7 +14,7 @@ It shows where instructions keep their semantics, where they are downgraded to a

 It is **not** a persistent memory layer, retrieval system, agent orchestrator, or agent-merging framework. Think `CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`, `AGENTS.md` — one intentional context, multiple generated outputs.

-**Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and a disposable 60-second smoke test. If you only run one command, try `npx --yes pluribus-context@latest audit --json --fidelity-report` to see native discovery surfaces, generic fallbacks, load evidence, duplicate-load selection evidence, manual activation requirements, effective context scope, and semantic differences. For the newer agent-observability wedge, start with [context-budget receipts](docs/context-budget-receipts.md): privacy-safe evidence for what MCP schemas, skills, memory, subagents, CLI help, or summaries actually crossed an agent boundary.
+**Reviewer shortcut:** evaluating Pluribus for a list, newsletter, package roundup, or tool directory? Use the [Community Review Packet](docs/community-review-packet.md) for copy-paste directory submission fields, safety/removability notes, feedback links, and a disposable 60-second smoke test. If you only run one command, try `npx --yes pluribus-context@latest audit --json --fidelity-report` to see native discovery surfaces, generic fallbacks, load evidence, duplicate-load selection evidence, manual activation requirements, effective context scope, and semantic differences. For the newer agent-observability wedge, start with [context-budget receipts](docs/context-budget-receipts.md): privacy-safe evidence for what MCP schemas, skills, memory, subagents, CLI help, or summaries actually crossed an agent boundary. If you want the same idea as a copyable skill, use the [context-receipts Agent Skill recipe](examples/agent-skills/context-receipts/). npm `latest` is currently aligned with the GitHub release; the review packet also documents a GitHub-release smoke fallback for future release-lag windows.

 ---

@@ -122,6 +122,13 @@ pluribus --help
 npm uninstall -g pluribus-context
 ```

+npm `latest` is currently aligned with the latest GitHub release. If you are reviewing a future GitHub release before npm `latest` catches up, run that release directly without a global install:
+
+```bash
+npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus --version
+npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus help
+```
+
 For local development:

 ```bash
package/docs/community-review-packet.md
CHANGED
@@ -82,9 +82,16 @@ npx --yes pluribus-context@latest sync --dry-run
 npx --yes pluribus-context@latest audit --ci --json --output pluribus-audit.json || test $? -eq 1
 ```

+npm `latest` is currently aligned with the latest GitHub release. If a future GitHub release is newer than npm `latest`, review that release directly instead of waiting for the npm dist-tag:
+
+```bash
+npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus --version
+npm exec --yes --package github:caioribeiroclw-pixel/pluribus#v0.3.26 -- pluribus help
+```
+
 Expected result:

-- `--version` prints the current npm release.
+- `--version` prints the current npm release. During a future release-lag window, it prints the GitHub release version when using the `npm exec --package github:...` fallback.
 - `init --dry-run` previews `pluribus.md` without writing.
 - `init` writes `pluribus.md`.
 - `validate` succeeds.
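
The release-lag window described in this hunk comes down to comparing two version strings. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` versions with no pre-release suffix; `parseVersion`, `compareVersions`, and `npmIsLagging` are hypothetical helper names and are not part of the Pluribus CLI:

```javascript
// Hypothetical helpers, not part of pluribus-context.
// Assumes plain MAJOR.MINOR.PATCH strings; a leading "v" (as in git tags) is tolerated.
function parseVersion(text) {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(text.trim());
  if (!match) throw new Error(`unsupported version string: ${text}`);
  return match.slice(1).map(Number);
}

// Returns a negative number, zero, or a positive number, like a sort comparator.
function compareVersions(a, b) {
  const left = parseVersion(a);
  const right = parseVersion(b);
  for (let i = 0; i < 3; i += 1) {
    if (left[i] !== right[i]) return left[i] - right[i];
  }
  return 0;
}

// True when the GitHub release tag is newer than the npm `latest` version.
function npmIsLagging(npmLatest, githubTag) {
  return compareVersions(npmLatest, githubTag) < 0;
}

console.log(npmIsLagging('0.3.26', 'v0.3.26')); // false
console.log(npmIsLagging('0.3.25', 'v0.3.26')); // true
```

Comparing the parts numerically rather than as strings matters once a component reaches two digits: `0.10.0` sorts after `0.9.9` here, which a plain string comparison would get wrong.
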
package/docs/context-budget-receipts.md
CHANGED
@@ -6,6 +6,8 @@ Privacy-safe receipts for answering a narrow operational question:

 This is different from generic token accounting. A context-budget receipt should prove which context surfaces were available, which ones crossed the boundary, which ones stayed deferred or suppressed, and how much budget remained — without exporting raw prompts, tool schemas, tool outputs, memory bodies, file paths, ticket text, secrets, or customer data.

+If you want a copyable Agent Skill recipe instead of a spec-style guide, see [`examples/agent-skills/context-receipts/`](../examples/agent-skills/context-receipts/). It turns the receipt pattern into a 60-second smoke checklist for Tool Search, skills, and subagent boundaries.
+
 ## When to use this receipt

 Use a context-budget receipt when a coding agent looks lazy, fails with `prompt is too long`, or returns a tiny summary after a subagent/tool-heavy step and you need to distinguish:
@@ -61,6 +63,22 @@ Public trace:

 - `examples/context-input-evidence/subagent-context-budget-otel-trace.json`

+## Per-agent MCP injection
+
+Role-specific subagents may need different MCP surfaces: a testing agent might need `testing` and `github`, while deployment, analytics, email, or browser servers should stay outside that boot context. The receipt should prove the policy boundary before the first task:
+
+- role/session id for the subagent without raw instructions;
+- available server count/hash for the role;
+- excluded server count/hash before boot;
+- loaded vs deferred tool-definition counts;
+- startup token bucket after the policy was applied; and
+- an explicit audit gap that this proves injection scope, not semantic tool quality.
+
+Minimal events:
+
+- `subagent.mcp_policy.applied`
+- `subagent.context_boot.evaluated`
+
 ## Delegation boundary

 A subagent can save parent context at boot and still lose the benefit if raw child output is pasted back into the parent. The receipt should prove:
@@ -144,7 +162,8 @@ Instead of “why is my subagent bad?”, ask for a receipt or debug JSON that c
 2. How many were loaded into the parent?
 3. How many were loaded into the subagent?
 4. How many were suppressed/deferred?
-5.
-6.
+5. For a subagent, which MCP servers were allowed and which were excluded before boot?
+6. What token bucket remained before the first tool call?
+7. Did raw child output return to the parent, or only a bounded summary?

 That is the narrow wedge for Pluribus: context-budget evidence across agent boundaries, not another memory store or tool router.
package/examples/agent-skills/context-receipts/README.md
ADDED
@@ -0,0 +1,22 @@
+# Context receipts Agent Skill recipe
+
+This is a small, copyable Agent Skill recipe for context-engineering users who are adopting Tool Search, lazy MCP loading, skills, memory, compaction, or subagents and need to verify what actually crossed the context boundary.
+
+It is intentionally markdown-only so it can be copied into a local skills directory such as:
+
+- `.claude/skills/context-receipts/SKILL.md`
+- `.opencode/skills/context-receipts/SKILL.md`
+- `.agents/skills/context-receipts/SKILL.md`
+
+## Quick smoke
+
+Ask an agent or harness using the skill to emit a receipt for one workflow and verify these constraints:
+
+```bash
+grep -E 'mcp\.tool_index\.loaded|context\.skill\.registry\.index\.loaded|subagent\.mcp_policy\.applied|subagent\.delegation\.requested' receipt.jsonl
+grep -E 'raw_(schema|query|args|result|output)_copied":false|raw.*CopiedToReceipt":false' receipt.jsonl
+```
+
+Then manually check that the receipt contains counts, hashes, ids, buckets, and `audit_gap`, but does **not** contain private prompts, raw schemas, tool args/results, skill bodies, memory bodies, customer names, secrets, or transcript text.
+
+For executable fixture examples, see [`../../context-input-evidence/`](../../context-input-evidence/).
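
The two `grep` checks and the manual review in the quick smoke above can be combined in one small script. A minimal sketch, assuming the receipt is JSONL text with one event object per line; `scanReceipt` is a hypothetical helper name, and treating every `raw_*_copied` or `raw*CopiedToReceipt` key as a privacy flag is an assumption drawn from the grep patterns, not a documented schema:

```javascript
// Hypothetical helper, not shipped with pluribus-context.
// The expected event names mirror the first grep pattern in the quick smoke.
const EXPECTED_EVENTS = [
  'mcp.tool_index.loaded',
  'context.skill.registry.index.loaded',
  'subagent.mcp_policy.applied',
  'subagent.delegation.requested',
];

// Assumption: keys shaped like raw_*_copied or raw*CopiedToReceipt are privacy flags.
const RAW_FLAG = /^raw_\w+_copied$|^raw\w*CopiedToReceipt$/;

function scanReceipt(jsonlText) {
  const events = jsonlText
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));

  // Which of the expected boundary events appear at least once.
  const seen = EXPECTED_EVENTS.filter((name) =>
    events.some((event) => event.event === name));

  // Any raw-content flag that is not explicitly false counts as a violation.
  const violations = [];
  for (const event of events) {
    for (const [key, value] of Object.entries(event)) {
      if (RAW_FLAG.test(key) && value !== false) {
        violations.push(`${event.event}: ${key}`);
      }
    }
  }

  const hasAuditGap = events.some((event) => 'audit_gap' in event);
  return { seen, violations, hasAuditGap };
}

const sample = [
  '{"event":"mcp.tool_index.loaded","raw_schema_copied":false}',
  '{"event":"mcp.tool_call.completed","raw_result_copied":true}',
].join('\n');
console.log(scanReceipt(sample));
```

A script like this only checks the declared flags. It cannot detect private text that was pasted into some other field, so the manual review still applies.
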
package/examples/agent-skills/context-receipts/SKILL.md
ADDED
@@ -0,0 +1,107 @@
+# Context Receipts
+
+Use this skill when an agent workflow claims to save context by selecting, deferring, hydrating, summarizing, compacting, delegating, or isolating context.
+
+The job is not to log the private content. The job is to emit a small receipt that lets a reviewer answer:
+
+> what crossed the context boundary, what stayed out, and what audit gap remains?
+
+## Privacy defaults
+
+Never include raw prompts, raw tool schemas, raw tool arguments, raw tool results, raw skill bodies, memory bodies, secrets, customer names, or full transcripts in the receipt.
+
+Prefer:
+
+- stable ids or hashed ids;
+- counts and token/line buckets;
+- categorical reasons;
+- explicit booleans for raw content copied/not copied;
+- before/after context budget buckets;
+- an `audit_gap` field when the receipt proves routing but not semantic correctness.
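
Bucketing is what keeps exact sizes out of the receipt. A minimal sketch of a bucket function; the label style follows the `lt_1k`, `1k_2k`, `2k_4k`, and `50k_75k` values used in this skill's examples, but the full boundary list and the `gte_100k` overflow label are assumptions, and `tokenBucket` is a hypothetical name:

```javascript
// Upper bounds in thousands of tokens.
// Assumed boundaries: only some of the resulting labels appear in the skill's examples.
const BOUNDS_K = [1, 2, 4, 8, 16, 32, 50, 75, 100];

// Maps an exact token count to a coarse label so the receipt never stores the exact number.
function tokenBucket(tokenCount) {
  if (!Number.isInteger(tokenCount) || tokenCount < 0) {
    throw new RangeError('tokenCount must be a non-negative integer');
  }
  let lowerK = 0;
  for (const upperK of BOUNDS_K) {
    if (tokenCount < upperK * 1000) {
      return lowerK === 0 ? `lt_${upperK}k` : `${lowerK}k_${upperK}k`;
    }
    lowerK = upperK;
  }
  return `gte_${lowerK}k`; // hypothetical overflow label
}

console.log(tokenBucket(640));   // lt_1k
console.log(tokenBucket(1500));  // 1k_2k
console.log(tokenBucket(61000)); // 50k_75k
```
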
+
+## 60-second Tool Search smoke
+
+For MCP Tool Search, lazy tool loading, or progressive disclosure, emit enough evidence to answer these seven checks:
+
+1. **Index-only startup:** did the session load a compact tool/server index instead of all full schemas?
+2. **Search/routing:** what hashed query/category or routing reason selected candidate tools?
+3. **Hydration:** which full tool definition was loaded, why, and how many definitions stayed suppressed?
+4. **Call:** which server/tool id was invoked, with argument/result redaction status and success/error status?
+5. **Boundary:** if a manager subagent or child agent was used, did raw child output return to the parent?
+6. **Budget:** what were the startup and post-hydration context-token buckets?
+7. **Audit gap:** what is not proven, such as whether the selected tool was semantically optimal?
+
+Minimal JSONL event names:
+
+```jsonl
+{"event":"mcp.tool_index.loaded","loaded_server_count":12,"loaded_tool_index_count":84,"full_schema_count":0,"suppressed_tool_count":84,"raw_schema_copied":false,"startup_token_bucket":"lt_1k"}
+{"event":"mcp.tool_search.performed","query_hash":"sha256:...","query_category":"repo_search","candidate_tool_count":5,"selected_tool_id":"github.search_code","raw_query_copied":false}
+{"event":"mcp.tool_definition.loaded","tool_id":"github.search_code","hydrate_reason":"selected_after_tool_search","suppressed_tool_count":83,"definition_token_bucket":"1k_2k","raw_schema_copied":false}
+{"event":"mcp.tool_call.completed","tool_id":"github.search_code","args_hash":"sha256:...","result_token_bucket":"2k_4k","raw_args_copied":false,"raw_result_copied":false,"status":"ok"}
+```
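
The four example events above are internally consistent: the index loads 84 entries with no full schemas, and after one hydration 83 definitions remain suppressed. A minimal sketch of a check for that arithmetic, using only field names from the example; `checkToolSearchReceipt` is a hypothetical helper, and the rules it enforces (each definition hydrated at most once, hydration before call) are an assumed reading of the seven checks, not a Pluribus validator:

```javascript
// Hypothetical consistency check over already-parsed receipt events.
function checkToolSearchReceipt(events) {
  const problems = [];
  const index = events.find((e) => e.event === 'mcp.tool_index.loaded');
  if (!index) return ['missing mcp.tool_index.loaded'];

  // Index-only startup: no full schemas loaded at boot.
  if (index.full_schema_count !== 0) {
    problems.push('startup loaded full schemas');
  }

  let hydrated = 0;
  const hydratedIds = new Set();
  for (const e of events) {
    // Hydration: each loaded definition should shrink the suppressed count by one.
    if (e.event === 'mcp.tool_definition.loaded') {
      hydrated += 1;
      hydratedIds.add(e.tool_id);
      const expected = index.loaded_tool_index_count - hydrated;
      if (e.suppressed_tool_count !== expected) {
        problems.push(`suppressed_tool_count for ${e.tool_id}: expected ${expected}`);
      }
    }
    // Call: a tool should be hydrated before it is invoked.
    if (e.event === 'mcp.tool_call.completed' && !hydratedIds.has(e.tool_id)) {
      problems.push(`${e.tool_id} called before its definition was loaded`);
    }
  }
  return problems;
}

const example = [
  { event: 'mcp.tool_index.loaded', loaded_tool_index_count: 84, full_schema_count: 0 },
  { event: 'mcp.tool_search.performed', selected_tool_id: 'github.search_code' },
  { event: 'mcp.tool_definition.loaded', tool_id: 'github.search_code', suppressed_tool_count: 83 },
  { event: 'mcp.tool_call.completed', tool_id: 'github.search_code', status: 'ok' },
];
console.log(checkToolSearchReceipt(example)); // []
```

An empty result says the counts line up. It says nothing about whether `github.search_code` was the right tool, which is the audit gap named in check 7.
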
+
+## Skill / prompt context smoke
+
+For skills, rules, AGENTS.md overlays, or instruction files, answer:
+
+- which index/listing entered the session;
+- which full skill/rule/instruction body was selected;
+- which candidates were suppressed and why;
+- whether the body was loaded at session start, after a search, or after an explicit command;
+- source hash, delivered hash, and canonical form when available;
+- whether the skill/instruction text was copied into the receipt.
+
+Minimal event names:
+
+- `context.skill.registry.index.loaded`
+- `context.skill.registry.skill.read`
+- `context.skill.registry.skill.injected`
+- `context.input.loaded`
+- `context.input.candidate_suppressed`
+
+## Per-agent MCP injection smoke
+
+For role-specific subagents or per-agent MCP configs, prove the policy boundary before debugging model quality:
+
+- which subagent role/session requested tools;
+- which MCP servers were available to that role;
+- which servers were explicitly excluded before boot;
+- whether startup loaded full schemas or only a compact index;
+- how many tool definitions stayed deferred/suppressed; and
+- the startup token bucket after policy was applied.
+
+Minimal JSONL event names:
+
+```jsonl
+{"event":"subagent.mcp_policy.applied","subagent_role":"testing","available_server_count":2,"available_servers_hash":"sha256:...","excluded_server_count":5,"excluded_servers_hash":"sha256:...","policy_source":"role_config","raw_server_names_copied":false}
+{"event":"subagent.context_boot.evaluated","subagent_role":"testing","loaded_tool_definition_count":0,"deferred_tool_definition_count":48,"startup_token_bucket":"50k_75k","raw_schema_copied":false,"audit_gap":"proves injection boundary, not tool relevance"}
+```
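
A reviewer can check the two events above mechanically before looking at model quality. A minimal sketch, assuming events arrive in emission order and carry the field names from the example; `checkMcpPolicyReceipt` is a hypothetical helper and is not part of the Pluribus CLI:

```javascript
// Hypothetical check over already-parsed receipt events, in emission order.
function checkMcpPolicyReceipt(events) {
  const problems = [];
  const policyAt = events.findIndex((e) => e.event === 'subagent.mcp_policy.applied');
  const bootAt = events.findIndex((e) => e.event === 'subagent.context_boot.evaluated');

  if (policyAt === -1) problems.push('missing subagent.mcp_policy.applied');
  if (bootAt === -1) problems.push('missing subagent.context_boot.evaluated');
  if (problems.length > 0) return problems;

  const policy = events[policyAt];
  const boot = events[bootAt];

  // The policy boundary must be recorded before the boot context is evaluated.
  if (policyAt > bootAt) problems.push('policy applied after boot evaluation');
  if (policy.subagent_role !== boot.subagent_role) problems.push('role mismatch between events');

  // Counts and hashes only: server names and schemas must stay out of the receipt.
  if (policy.raw_server_names_copied !== false) problems.push('raw server names copied');
  if (boot.raw_schema_copied !== false) problems.push('raw schema copied');

  // The receipt should state what it does not prove.
  if (typeof boot.audit_gap !== 'string' || boot.audit_gap === '') {
    problems.push('missing audit_gap');
  }
  return problems;
}

const receipt = [
  { event: 'subagent.mcp_policy.applied', subagent_role: 'testing', raw_server_names_copied: false },
  {
    event: 'subagent.context_boot.evaluated',
    subagent_role: 'testing',
    raw_schema_copied: false,
    audit_gap: 'proves injection boundary, not tool relevance',
  },
];
console.log(checkMcpPolicyReceipt(receipt)); // []
```
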
+
+## Subagent / manager boundary smoke
+
+For subagents, manager agents, or child workers, answer:
+
+- what task was delegated, by category and hashed objective;
+- what large output was captured by the child, as line/token buckets;
+- what bounded summary returned to the parent;
+- whether raw child output, tool results, or MCP schemas entered the parent context;
+- the remaining audit gap.
+
+Minimal event names:
+
+- `subagent.delegation.requested`
+- `subagent.tool_output.captured`
+- `subagent.summary.returned`
+- `parent.context_budget.evaluated`
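
The boundary questions above reduce to two facts: the event chain is complete, and no event reports raw child output crossing into the parent. A minimal sketch, assuming already-parsed events; the event names come from the list above, the `raw_output_copied` key appears in the recipe README's grep pattern, but which event carries it is an assumption, and `childOutputStayedOut` is a hypothetical helper name:

```javascript
// Hypothetical check; the skill does not specify which event carries raw_output_copied.
const BOUNDARY_EVENTS = [
  'subagent.delegation.requested',
  'subagent.tool_output.captured',
  'subagent.summary.returned',
  'parent.context_budget.evaluated',
];

function childOutputStayedOut(events) {
  const names = new Set(events.map((e) => e.event));
  const missing = BOUNDARY_EVENTS.filter((name) => !names.has(name));

  // Every event that declares the flag must declare it false.
  const flagged = events.filter((e) => 'raw_output_copied' in e);
  const rawReturned = flagged.some((e) => e.raw_output_copied !== false);

  return {
    missing,
    rawReturned,
    // Proven only when the chain is complete and at least one event asserts the flag.
    boundaryProven: missing.length === 0 && flagged.length > 0 && !rawReturned,
  };
}

const delegation = [
  { event: 'subagent.delegation.requested' },
  { event: 'subagent.tool_output.captured' },
  { event: 'subagent.summary.returned', raw_output_copied: false },
  { event: 'parent.context_budget.evaluated' },
];
console.log(childOutputStayedOut(delegation).boundaryProven); // true
```

Requiring at least one explicit `false` is deliberate: a receipt that never mentions the flag has not proven anything about the boundary.
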
+
+## Good receipt test
+
+A receipt is useful if a maintainer can debug one of these failures without seeing private content:
+
+- the agent never found the right tool/skill;
+- the full definition loaded too early;
+- too many definitions stayed in context;
+- a child/subagent saved no budget because raw output returned to the parent;
+- compaction happened but no one can prove what was preserved, summarized, or dropped.
+
+A receipt is not enough if it only says “Tool Search enabled” or “used subagent”. It must prove the boundary behavior.
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "pluribus-context",
-"version": "0.3.
+"version": "0.3.28",
 "description": "AI context and rules sync CLI for Claude.md, Claude Code, Cursor, and Copilot instructions, with privacy-safe context receipts that prove what memory, tools, skills, compactions, and security findings crossed agent boundaries without logging raw content.",
 "type": "module",
 "homepage": "https://github.com/caioribeiroclw-pixel/pluribus#readme",
package/src/utils/version.js
CHANGED
@@ -1 +1 @@
-export const VERSION = '0.3.
+export const VERSION = '0.3.28'