pluribus-context 0.3.28 → 0.3.30
- package/docs/context-budget-receipts.md +51 -0
- package/docs/context-input-evidence.md +14 -0
- package/examples/agent-skills/context-receipts/README.md +6 -2
- package/examples/agent-skills/context-receipts/SKILL.md +18 -0
- package/examples/context-input-evidence/convert-pruning-log.mjs +275 -0
- package/examples/context-input-evidence/convert-subagent-toolsearch-propagation-log.mjs +171 -0
- package/examples/context-input-evidence/pruning-otel-trace.json +885 -0
- package/examples/context-input-evidence/pruning-receipt.ndjson +7 -0
- package/examples/context-input-evidence/sample-pruning-log.jsonl +8 -0
- package/examples/context-input-evidence/sample-subagent-toolsearch-propagation-log.jsonl +5 -0
- package/examples/context-input-evidence/subagent-toolsearch-propagation-otel-trace.json +522 -0
- package/examples/context-input-evidence/subagent-toolsearch-propagation-receipt.ndjson +4 -0
- package/package.json +1 -1
- package/src/utils/version.js +1 -1
@@ -43,6 +43,29 @@ A useful receipt starts small:
 
 Keep exact counts when they are not sensitive. Bucket token counts and sizes when exact values could reveal private workload shape.
 
+## Post-hoc pruning / context cleaning
+
+Context-cleaning tools can reduce a bloated session after context has already entered the transcript. That creates a separate proof boundary from lazy loading: what was pruned, minified, stubbed, deduped, protected, and backed up?
+
+The receipt should prove:
+
+- prescription/mode/trigger without raw session JSONL;
+- before/after token and byte buckets;
+- per-strategy candidate, changed, removed, and protected buckets;
+- compact summaries, behavioral digests, active task state, and other protected items were not removed;
+- a backup was created/verified for executed runs; and
+- raw tool output, file contents, session text, emails, secrets, paths, and customer data stayed out of the receipt.
+
+Runnable fixture:
+
+```bash
+node examples/context-input-evidence/convert-pruning-log.mjs
+```
+
+Public trace:
+
+- `examples/context-input-evidence/pruning-otel-trace.json`
+
 ## Subagent boot budget
 
 Subagents can fail before task #1 if they inherit every MCP schema, skill listing, rule, or memory index from the parent. The receipt should separate:
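The "receipt should prove" list for post-hoc pruning in the hunk above can be turned into a mechanical check. The sketch below is only an illustration: `checkPruningReceipt` is a hypothetical helper, not part of pluribus-context, and it assumes receipt events shaped like the ones the pruning fixture writes (`name` plus an `attributes` object, with `privacy.*` flags and `context.prune.backup.verified`).

```javascript
// Hypothetical checker (not part of the package). Event and attribute names
// are the ones used by the pruning fixture in this diff.
function checkPruningReceipt(events) {
  const problems = [];
  const names = events.map((event) => event.name);

  // A complete receipt has a start, at least one strategy evaluation, and a completion.
  for (const required of [
    'context.prune.started',
    'context.prune.strategy.evaluated',
    'context.prune.completed'
  ]) {
    if (!names.includes(required)) problems.push(`missing event: ${required}`);
  }

  // Every privacy.* flag asserts that raw material was NOT recorded.
  for (const event of events) {
    for (const [key, value] of Object.entries(event.attributes ?? {})) {
      if (key.startsWith('privacy.') && value !== false) {
        problems.push(`${event.name}: ${key} should be false`);
      }
    }
  }

  // Executed runs should carry a verified backup.
  const completed = events.find((event) => event.name === 'context.prune.completed');
  if (completed && completed.attributes?.['context.prune.backup.verified'] !== true) {
    problems.push('backup was not verified');
  }
  return problems;
}

const sample = [
  { name: 'context.prune.started', attributes: { 'privacy.raw_session_recorded': false } },
  { name: 'context.prune.strategy.evaluated', attributes: { 'privacy.raw_tool_output_recorded': false } },
  { name: 'context.prune.completed', attributes: { 'context.prune.backup.verified': false } }
];
console.log(checkPruningReceipt(sample)); // → [ 'backup was not verified' ]
```

A check like this covers structure and flags only. It cannot show that protected items were semantically preserved, which matches the audit-gap framing used elsewhere in the docs.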
@@ -79,6 +102,34 @@ Minimal events:
 - `subagent.mcp_policy.applied`
 - `subagent.context_boot.evaluated`
 
+## ToolSearch propagation into subagents
+
+When MCP tools are deferred behind `ToolSearch`, subagent bugs can hide in three different layers:
+
+- the parent/orchestrator policy intended to expose or exclude MCP/ToolSearch;
+- the subagent `tools:` declaration made `ToolSearch` available, stripped it, or froze an older registry; and
+- the runtime filter actually exposed the deferred-tools channel after spawn.
+
+The receipt should make those layers distinguishable without raw tool schemas, prompts, agent files, or private paths. It should include:
+
+- spawn path and whether skill context was active;
+- coarse bucket for parent intermediate tool calls before spawn;
+- `tools:` declaration shape such as wildcard, explicit include, or exclusion style;
+- whether `ToolSearch` was declared and actually exposed to the subagent;
+- parent vs subagent MCP server count buckets;
+- loaded vs deferred tool-definition buckets; and
+- a `filtered_by` or `filter_reason` category.
+
+Runnable fixture:
+
+```bash
+node examples/context-input-evidence/convert-subagent-toolsearch-propagation-log.mjs
+```
+
+Public trace:
+
+- `examples/context-input-evidence/subagent-toolsearch-propagation-otel-trace.json`
+
 ## Delegation boundary
 
 A subagent can save parent context at boot and still lose the benefit if raw child output is pasted back into the parent. The receipt should prove:
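The three layers named in the ToolSearch section above (policy, declaration, runtime filter) can be told apart from a handful of booleans. The following is a minimal sketch under stated assumptions: `classifyToolSearchLayer` and its return labels are hypothetical, and `policy_intends_toolsearch` is an assumed input field that does not appear in the fixture. Only `toolsearch_declared` and `toolsearch_exposed` come from the diff.

```javascript
// Hypothetical classifier (not part of the package).
// `policy_intends_toolsearch` is an assumed field; the other two are fixture fields.
function classifyToolSearchLayer(probe) {
  // Layer 1: the parent/orchestrator never meant to expose ToolSearch.
  if (!probe.policy_intends_toolsearch) return 'policy_excluded';
  // Layer 2: the subagent `tools:` declaration did not include it.
  if (!probe.toolsearch_declared) return 'declaration_stripped';
  // Layer 3: declared, but the runtime did not expose it after spawn.
  if (!probe.toolsearch_exposed) return 'runtime_filtered';
  return 'exposed';
}

const probes = [
  { policy_intends_toolsearch: false, toolsearch_declared: false, toolsearch_exposed: false },
  { policy_intends_toolsearch: true, toolsearch_declared: false, toolsearch_exposed: false },
  { policy_intends_toolsearch: true, toolsearch_declared: true, toolsearch_exposed: false },
  { policy_intends_toolsearch: true, toolsearch_declared: true, toolsearch_exposed: true }
];
console.log(probes.map(classifyToolSearchLayer));
// → [ 'policy_excluded', 'declaration_stripped', 'runtime_filtered', 'exposed' ]
```

The sketch deliberately ignores wildcard declarations, where `ToolSearch` may be available without being named. A real check would need to decide how to treat that shape.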
@@ -330,6 +330,20 @@ It reads `sample-compaction-log.jsonl` and writes `compaction-receipt.ndjson` pl
 
 This is for reliability/auditability work where users need to know whether the original engineering objective survived compaction. The receipt should prove the compaction boundary and item decisions without exposing raw prompts, private instructions, tool outputs, memory bodies, summaries, customer data, or transcripts.
 
+To test post-hoc pruning / context-cleaning receipts — where a Claude Code-style session cleaner trims, minifies, stubs, or dedupes context after it entered the transcript — run:
+
+```bash
+node examples/context-input-evidence/convert-pruning-log.mjs
+```
+
+It reads `sample-pruning-log.jsonl` and writes `pruning-receipt.ndjson` plus `pruning-otel-trace.json`. The sample emits three event types:
+
+- `context.prune.started` — prescription, mode, trigger, before token/byte buckets, backup identity hash, plan hash, and privacy flags.
+- `context.prune.strategy.evaluated` — each pruning strategy, action (`trimmed`, `minified`, `deduped`, or `protected`), candidate/changed/protected buckets, removed token/byte buckets, reason hash, and sample hash.
+- `context.prune.completed` — after token/byte buckets, changed/protected item buckets, backup verification, summary hash, and the audit gap.
+
+This is for tools that promise to clean bloated sessions while preserving active task state. The receipt should prove what was pruned and what was protected without exporting raw session JSONL, tool outputs, file contents, paths, secrets, emails, screenshots, customer data, or summaries.
+
 To test incremental memory consolidation — where a shared-memory server runs a hook-safe pass after a session and turns several recent memories into one consolidated memory with lineage — run:
 
 ```bash
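To make the three event types described above concrete, here is a small sketch that reads receipt text in NDJSON form, counts events by name, and checks that the run opens with `context.prune.started` and closes with `context.prune.completed`. `summarizeReceipt` is a hypothetical helper and the inline sample is hand-written for illustration; it is not the content of `pruning-receipt.ndjson`.

```javascript
// Hypothetical helper (not part of the package): summarize an NDJSON pruning receipt.
function summarizeReceipt(ndjsonText) {
  const events = ndjsonText
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));

  // Count events per name.
  const counts = {};
  for (const event of events) {
    counts[event.name] = (counts[event.name] ?? 0) + 1;
  }

  // The fixture sorts events by time, so start should come first and completion last.
  const first = events[0]?.name;
  const last = events[events.length - 1]?.name;
  return {
    counts,
    wellOrdered: first === 'context.prune.started' && last === 'context.prune.completed'
  };
}

// Hand-written sample lines, not real fixture output.
const receiptText = [
  '{"name":"context.prune.started","attributes":{}}',
  '{"name":"context.prune.strategy.evaluated","attributes":{}}',
  '{"name":"context.prune.strategy.evaluated","attributes":{}}',
  '{"name":"context.prune.completed","attributes":{}}'
].join('\n') + '\n';

console.log(summarizeReceipt(receiptText));
```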
@@ -13,10 +13,14 @@ It is intentionally markdown-only so it can be copied into a local skills direct
 Ask an agent or harness using the skill to emit a receipt for one workflow and verify these constraints:
 
 ```bash
-grep -E 'mcp\.tool_index\.loaded|context\.skill\.registry\.index\.loaded|subagent\.mcp_policy\.applied|subagent\.delegation\.requested' receipt.jsonl
+grep -E 'mcp\.tool_index\.loaded|context\.skill\.registry\.index\.loaded|subagent\.mcp_policy\.applied|subagent\.toolsearch\.propagation\.evaluated|subagent\.delegation\.requested' receipt.jsonl
 grep -E 'raw_(schema|query|args|result|output)_copied":false|raw.*CopiedToReceipt":false' receipt.jsonl
 ```
 
 Then manually check that the receipt contains counts, hashes, ids, buckets, and `audit_gap`, but does **not** contain private prompts, raw schemas, tool args/results, skill bodies, memory bodies, customer names, secrets, or transcript text.
 
-For executable fixture examples, see [`../../context-input-evidence/`](../../context-input-evidence/)
+For executable fixture examples, see [`../../context-input-evidence/`](../../context-input-evidence/), including the ToolSearch propagation smoke:
+
+```bash
+node ../../context-input-evidence/convert-subagent-toolsearch-propagation-log.mjs
+```
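The two `grep -E` checks in the hunk above translate directly into JavaScript regular expressions, which makes it easy to try them against sample lines. The sketch below copies both patterns from the README. `grepReceipt` and the two sample lines are hypothetical, although the flag name `raw_tool_schemas_copied` is the one used in the SKILL.md sample event elsewhere in this diff.

```javascript
// Patterns copied from the README's grep commands.
const eventPattern = /mcp\.tool_index\.loaded|context\.skill\.registry\.index\.loaded|subagent\.mcp_policy\.applied|subagent\.toolsearch\.propagation\.evaluated|subagent\.delegation\.requested/;
const privacyPattern = /raw_(schema|query|args|result|output)_copied":false|raw.*CopiedToReceipt":false/;

// Hypothetical helper: return the lines a grep with this pattern would print.
function grepReceipt(text, pattern) {
  return text.split('\n').filter((line) => pattern.test(line));
}

// Hand-written sample lines for illustration.
const receipt = [
  '{"event":"subagent.context_boot.evaluated","raw_schema_copied":false}',
  '{"event":"subagent.toolsearch.propagation.evaluated","raw_tool_schemas_copied":false}'
].join('\n');

console.log(grepReceipt(receipt, eventPattern).length);   // → 1 (the propagation event)
console.log(grepReceipt(receipt, privacyPattern).length); // → 1 (only raw_schema_copied)
```

On this sample, the privacy pattern matches `raw_schema_copied":false` but not `raw_tool_schemas_copied":false`, so the second grep may not cover the new propagation event's flag as written. Whether that matters depends on which flag names a real receipt emits.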
@@ -77,6 +77,24 @@ Minimal JSONL event names:
 {"event":"subagent.context_boot.evaluated","subagent_role":"testing","loaded_tool_definition_count":0,"deferred_tool_definition_count":48,"startup_token_bucket":"50k_75k","raw_schema_copied":false,"audit_gap":"proves injection boundary, not tool relevance"}
 ```
 
+## ToolSearch propagation smoke
+
+For subagents that should inherit MCP through `ToolSearch`, distinguish policy, declaration, and runtime filtering:
+
+- did the parent/orchestrator intend to expose MCP or exclude it for this subagent?
+- was the subagent spawned immediately or after parent tool calls/orchestration work?
+- was the `tools:` declaration wildcard, explicit include, or exclusion style?
+- was `ToolSearch` declared and was it actually exposed in the subagent tool surface?
+- did MCP servers/tool definitions stay deferred, or did the channel collapse to zero?
+- was the agent registry loaded at session boot, making newly added agent files invisible until restart?
+
+Minimal JSONL event names:
+
+```jsonl
+{"event":"subagent.toolsearch.propagation.evaluated","spawn_path":"Task","tools_declaration_shape":"enumerated_include","toolsearch_declared":false,"toolsearch_exposed":false,"mcp_servers_available_bucket":"0","deferred_tool_definitions_bucket":"0","filtered_by":"frontmatter_tools_policy_or_runtime_filter","raw_tool_schemas_copied":false}
+{"event":"subagent.toolsearch.matrix.completed","tested_axis":"tools_frontmatter_shape","audit_gap":"proves ToolSearch exposure, not semantic tool relevance or runtime call success"}
+```
+
 ## Subagent / manager boundary smoke
 
 For subagents, manager agents, or child workers, answer:
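One of the smoke questions above asks whether MCP tool definitions stayed deferred or the channel collapsed to zero. A rough way to answer it from a probe record is sketched below. The input field names are the ones the propagation converter in this diff reads from its log. `deferredChannelState` and its return labels are hypothetical, as is the sample probe.

```javascript
// Hypothetical helper (not part of the package). Field names follow the
// converter's input records; the returned labels are illustrative only.
function deferredChannelState(probe) {
  // The parent had MCP servers but the subagent sees none: the channel collapsed.
  if (probe.mcp_server_count_parent > 0 && probe.mcp_server_count_subagent === 0) {
    return 'collapsed_to_zero';
  }
  // Tool definitions exist but are still behind the deferred channel.
  if (probe.deferred_tool_definition_count > 0) return 'deferred';
  // Definitions were injected at boot instead of staying deferred.
  if (probe.loaded_tool_definition_count > 0) return 'eagerly_loaded';
  return 'no_tools';
}

// Hand-written probe for illustration.
const collapsedProbe = {
  mcp_server_count_parent: 3,
  mcp_server_count_subagent: 0,
  loaded_tool_definition_count: 0,
  deferred_tool_definition_count: 0
};
console.log(deferredChannelState(collapsedProbe)); // → collapsed_to_zero
```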
@@ -0,0 +1,275 @@
+#!/usr/bin/env node
+import { createHash } from 'node:crypto';
+import { readFileSync, writeFileSync } from 'node:fs';
+import { dirname, join, resolve } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+const here = dirname(fileURLToPath(import.meta.url));
+const inputPath = process.argv[2] ? resolve(process.argv[2]) : join(here, 'sample-pruning-log.jsonl');
+const receiptPath = process.argv[3] ? resolve(process.argv[3]) : join(here, 'pruning-receipt.ndjson');
+const tracePath = process.argv[4] ? resolve(process.argv[4]) : join(here, 'pruning-otel-trace.json');
+
+function sha256(value) {
+  return `sha256:${createHash('sha256').update(value ?? '').digest('hex')}`;
+}
+
+function hashRef(value) {
+  return sha256(value ?? '').slice(0, 19);
+}
+
+function readJsonl(path) {
+  return readFileSync(path, 'utf8')
+    .trim()
+    .split('\n')
+    .filter(Boolean)
+    .map((line, index) => {
+      try {
+        return JSON.parse(line);
+      } catch (error) {
+        throw new Error(`Invalid JSONL at ${path}:${index + 1}: ${error.message}`);
+      }
+    });
+}
+
+function unixNano(isoTimestamp) {
+  return `${BigInt(Date.parse(isoTimestamp)) * 1_000_000n}`;
+}
+
+function otelValue(value) {
+  if (typeof value === 'boolean') return { boolValue: value };
+  if (typeof value === 'number' && Number.isInteger(value)) return { intValue: String(value) };
+  if (typeof value === 'number') return { doubleValue: value };
+  if (value == null) return { stringValue: '' };
+  return { stringValue: String(value) };
+}
+
+function attributesToOtel(attributes) {
+  return Object.entries(attributes).map(([key, value]) => ({ key, value: otelValue(value) }));
+}
+
+function tokenBucket(value) {
+  if (value < 1_000) return 'under_1k';
+  if (value < 10_000) return 'under_10k';
+  if (value < 50_000) return 'under_50k';
+  if (value < 100_000) return 'under_100k';
+  return 'over_100k';
+}
+
+function bytesBucket(value) {
+  if (value < 1024) return 'under_1kb';
+  if (value < 1024 * 1024) return 'under_1mb';
+  if (value < 10 * 1024 * 1024) return 'under_10mb';
+  if (value < 50 * 1024 * 1024) return 'under_50mb';
+  return 'over_50mb';
+}
+
+function countBucket(value) {
+  if (value === 0) return 'zero';
+  if (value <= 5) return 'under_5';
+  if (value <= 25) return 'under_25';
+  if (value <= 100) return 'under_100';
+  if (value <= 500) return 'under_500';
+  return 'over_500';
+}
+
+function ratioBucket(numerator, denominator) {
+  const ratio = denominator > 0 ? numerator / denominator : 0;
+  if (ratio < 0.25) return 'under_25_percent';
+  if (ratio < 0.5) return 'under_50_percent';
+  if (ratio < 0.75) return 'under_75_percent';
+  if (ratio < 0.9) return 'under_90_percent';
+  return 'over_90_percent';
+}
+
+const records = readJsonl(inputPath);
+const session = records.find((record) => record.type === 'session.start');
+const start = records.find((record) => record.type === 'context.prune.start');
+const strategies = records.filter((record) => record.type === 'context.prune.strategy.evaluated');
+const completed = records.find((record) => record.type === 'context.prune.completed');
+
+if (!session || !start || strategies.length === 0 || !completed) {
+  throw new Error(`Expected session.start, context.prune.start, strategy evaluations, and context.prune.completed records in ${inputPath}`);
+}
+
+const traceSeed = `${session.session_id}:${start.run_id}:context-pruning`;
+const traceId = sha256(traceSeed).replace('sha256:', '').slice(0, 32);
+const spanId = sha256(`${traceSeed}:span`).replace('sha256:', '').slice(0, 16);
+const runIdHash = hashRef(start.run_id);
+
+const startedEvent = {
+  trace_id: traceId,
+  span_id: spanId,
+  name: 'context.prune.started',
+  time: start.time,
+  attributes: {
+    'session.id': session.session_id,
+    'gen_ai.conversation.id': session.conversation_id,
+    'agent.name': session.agent,
+    'context.prune.run_id_hash': runIdHash,
+    'context.prune.tool': start.tool,
+    'context.prune.command_hash': sha256(start.command),
+    'context.prune.prescription': start.prescription,
+    'context.prune.mode': start.mode,
+    'context.prune.trigger': start.trigger,
+    'context.prune.context_window_bucket': tokenBucket(start.context_window_tokens),
+    'context.prune.token_count.before_bucket': tokenBucket(start.token_count_before),
+    'context.prune.byte_count.before_bucket': bytesBucket(start.byte_count_before),
+    'context.prune.start_ratio_bucket': ratioBucket(start.token_count_before, start.context_window_tokens),
+    'context.prune.backup_id_hash': hashRef(start.backup_id),
+    'context.prune.backup.created': true,
+    'context.prune.plan.hash': sha256(start.raw_plan_notes),
+    'privacy.raw_session_recorded': false,
+    'privacy.raw_plan_recorded': false,
+    'privacy.raw_prompt_recorded': false,
+    'privacy.raw_tool_output_recorded': false
+  }
+};
+
+const strategyEvents = strategies.map((strategy) => ({
+  trace_id: traceId,
+  span_id: spanId,
+  name: 'context.prune.strategy.evaluated',
+  time: strategy.time,
+  attributes: {
+    'session.id': session.session_id,
+    'gen_ai.conversation.id': session.conversation_id,
+    'context.prune.run_id_hash': runIdHash,
+    'context.prune.strategy': strategy.strategy,
+    'context.prune.strategy.action': strategy.action,
+    'context.prune.strategy.candidate_count_bucket': countBucket(strategy.candidate_count),
+    'context.prune.strategy.changed_count_bucket': countBucket(strategy.changed_count),
+    'context.prune.strategy.protected_count_bucket': countBucket(strategy.protected_count),
+    'context.prune.strategy.token_count.before_bucket': tokenBucket(strategy.token_count_before),
+    'context.prune.strategy.token_count.removed_bucket': tokenBucket(strategy.token_count_removed),
+    'context.prune.strategy.byte_count.removed_bucket': bytesBucket(strategy.byte_count_removed),
+    'context.prune.strategy.reason_hash': sha256(strategy.reason),
+    'context.prune.strategy.sample_hash': sha256(strategy.raw_sample),
+    'context.prune.strategy.raw_text_recorded': false,
+    'privacy.raw_session_recorded': false,
+    'privacy.raw_tool_output_recorded': false,
+    'privacy.raw_file_content_recorded': false
+  }
+}));
+
+const completedEvent = {
+  trace_id: traceId,
+  span_id: spanId,
+  name: 'context.prune.completed',
+  time: completed.time,
+  attributes: {
+    'session.id': session.session_id,
+    'gen_ai.conversation.id': session.conversation_id,
+    'context.prune.run_id_hash': runIdHash,
+    'context.prune.status': completed.status,
+    'context.prune.token_count.after_bucket': tokenBucket(completed.token_count_after),
+    'context.prune.byte_count.after_bucket': bytesBucket(completed.byte_count_after),
+    'context.prune.token_count.removed_bucket': tokenBucket(completed.total_token_count_removed),
+    'context.prune.byte_count.removed_bucket': bytesBucket(completed.total_byte_count_removed),
+    'context.prune.end_ratio_bucket': ratioBucket(completed.token_count_after, start.context_window_tokens),
+    'context.prune.changed_item_count_bucket': countBucket(completed.changed_item_count),
+    'context.prune.protected_item_count_bucket': countBucket(completed.protected_item_count),
+    'context.prune.backup.verified': completed.backup_verified,
+    'context.prune.summary.hash': sha256(completed.raw_summary),
+    'context.prune.audit_gap': completed.audit_gap,
+    'privacy.raw_session_recorded': false,
+    'privacy.raw_summary_recorded': false,
+    'privacy.raw_tool_output_recorded': false,
+    'privacy.raw_file_content_recorded': false
+  }
+};
+
+const events = [startedEvent, ...strategyEvents, completedEvent]
+  .sort((left, right) => Date.parse(left.time) - Date.parse(right.time));
+
+writeFileSync(receiptPath, `${events.map((event) => JSON.stringify(event)).join('\n')}\n`);
+
+const trace = {
+  resourceSpans: [
+    {
+      resource: {
+        attributes: attributesToOtel({
+          'service.name': 'pluribus-context-pruning-receipt-demo',
+          'service.version': '0.0.0-fixture',
+          'deployment.environment.name': 'local-fixture'
+        })
+      },
+      scopeSpans: [
+        {
+          scope: {
+            name: 'pluribus.context_input_evidence.pruning_demo',
+            version: '0.0.0-fixture'
+          },
+          spans: [
+            {
+              traceId,
+              spanId,
+              parentSpanId: '',
+              name: 'agent.session.context_prune',
+              kind: 1,
+              startTimeUnixNano: unixNano(start.time),
+              endTimeUnixNano: unixNano(completed.time),
+              attributes: attributesToOtel({
+                'session.id': session.session_id,
+                'gen_ai.conversation.id': session.conversation_id,
+                'agent.name': session.agent,
+                'workspace.name': session.workspace,
+                'gen_ai.request.model': session.model,
+                'context.prune.run_id_hash': runIdHash,
+                'context.prune.prescription': start.prescription,
+                'context.prune.mode': start.mode,
+                'context.prune.trigger': start.trigger
+              }),
+              events: events.map((event) => ({
+                name: event.name,
+                timeUnixNano: unixNano(event.time),
+                attributes: attributesToOtel(event.attributes)
+              }))
+            }
+          ]
+        }
+      ]
+    }
+  ]
+};
+
+writeFileSync(tracePath, `${JSON.stringify(trace, null, 2)}\n`);
+
+const forbiddenRawStrings = [
+  'Acme-Co',
+  'sk_live_private_fixture',
+  'finance@acme.example',
+  'Bearer private_fixture',
+  'PAY-1234',
+  '/workspace/acme',
+  '/private/work/acme',
+  'private support transcript',
+  'internal deployment host',
+  'private order',
+  'auth header'
+];
+const exportedText = `${events.map((event) => JSON.stringify(event)).join('\n')}\n${JSON.stringify(trace)}`;
+const rawTextCopiedToReceipt = forbiddenRawStrings.some((value) => exportedText.includes(value));
+const strategyActionCounts = strategies.reduce((counts, strategy) => {
+  counts[strategy.action] = (counts[strategy.action] ?? 0) + 1;
+  return counts;
+}, {});
+
+const summary = {
+  schema: 'pluribus.contextPruningReceipt.demo.v0',
+  eventCount: events.length,
+  strategyEvents: strategyEvents.length,
+  strategyActionCounts,
+  tokenBeforeBucket: startedEvent.attributes['context.prune.token_count.before_bucket'],
+  tokenAfterBucket: completedEvent.attributes['context.prune.token_count.after_bucket'],
+  tokenRemovedBucket: completedEvent.attributes['context.prune.token_count.removed_bucket'],
+  changedItemCountBucket: completedEvent.attributes['context.prune.changed_item_count_bucket'],
+  protectedItemCountBucket: completedEvent.attributes['context.prune.protected_item_count_bucket'],
+  backupVerified: completedEvent.attributes['context.prune.backup.verified'],
+  includesAuditGap: completedEvent.attributes['context.prune.audit_gap'],
+  rawTextCopiedToReceipt,
+  receiptPath: 'examples/context-input-evidence/pruning-receipt.ndjson',
+  tracePath: 'examples/context-input-evidence/pruning-otel-trace.json',
+  lesson: 'Post-hoc context cleaning needs receipts for what was pruned, minified, stubbed, protected, backed up, and kept private; token savings alone do not prove safe context preservation.'
+};
+
+console.log(JSON.stringify(summary, null, 2));
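In the script above, the forbidden-string scan only sets `rawTextCopiedToReceipt` in the printed summary, so a leak would not by itself make the run fail. A stricter variant could throw instead. The sketch below is one possible shape, not the package's behavior: `findLeaks` and `assertNoLeaks` are hypothetical names, and the sample strings are placeholders.

```javascript
// Hypothetical stricter leak check (not part of the package).
function findLeaks(exportedText, forbiddenRawStrings) {
  return forbiddenRawStrings.filter((value) => exportedText.includes(value));
}

function assertNoLeaks(exportedText, forbiddenRawStrings) {
  const leaks = findLeaks(exportedText, forbiddenRawStrings);
  if (leaks.length > 0) {
    // Report only the count so the error message does not repeat the leaked text.
    throw new Error(`Receipt leaked ${leaks.length} forbidden raw string(s)`);
  }
}

// Placeholder data for illustration.
const forbidden = ['sk_live_private_fixture', '/workspace/acme'];
const cleanExport = '{"name":"context.prune.completed","attributes":{"privacy.raw_session_recorded":false}}';
const leakyExport = '{"attributes":{"workspace.name":"/workspace/acme"}}';

assertNoLeaks(cleanExport, forbidden); // passes silently
console.log(findLeaks(leakyExport, forbidden)); // → [ '/workspace/acme' ]
```

A substring scan like this only catches the strings it is given. It is also worth noting that this converter exports `workspace.name` as a raw span attribute, while the ToolSearch converter in the same release hashes it as `workspace.name_hash`.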
@@ -0,0 +1,171 @@
+#!/usr/bin/env node
+import { createHash } from 'node:crypto';
+import { readFileSync, writeFileSync } from 'node:fs';
+import { dirname, join, resolve } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+const here = dirname(fileURLToPath(import.meta.url));
+const inputPath = process.argv[2] ? resolve(process.argv[2]) : join(here, 'sample-subagent-toolsearch-propagation-log.jsonl');
+const receiptPath = process.argv[3] ? resolve(process.argv[3]) : join(here, 'subagent-toolsearch-propagation-receipt.ndjson');
+const tracePath = process.argv[4] ? resolve(process.argv[4]) : join(here, 'subagent-toolsearch-propagation-otel-trace.json');
+
+function sha256(value) {
+  return `sha256:${createHash('sha256').update(value ?? '').digest('hex')}`;
+}
+
+function hashRef(value) {
+  return sha256(value ?? '').slice(0, 19);
+}
+
+function readJsonl(path) {
+  return readFileSync(path, 'utf8')
+    .trim()
+    .split('\n')
+    .filter(Boolean)
+    .map((line, index) => {
+      try {
+        return JSON.parse(line);
+      } catch (error) {
+        throw new Error(`Invalid JSONL at ${path}:${index + 1}: ${error.message}`);
+      }
+    });
+}
+
+function unixNano(isoTimestamp) {
+  return `${BigInt(Date.parse(isoTimestamp)) * 1_000_000n}`;
+}
+
+function otelValue(value) {
+  if (typeof value === 'boolean') return { boolValue: value };
+  if (typeof value === 'number' && Number.isInteger(value)) return { intValue: String(value) };
+  if (typeof value === 'number') return { doubleValue: value };
+  if (value == null) return { stringValue: '' };
+  return { stringValue: String(value) };
+}
+
+function attributesToOtel(attributes) {
+  return Object.entries(attributes).map(([key, value]) => ({ key, value: otelValue(value) }));
+}
+
+function countBucket(value) {
+  if (value === 0) return 'zero';
+  if (value <= 5) return 'under_5';
+  if (value <= 25) return 'under_25';
+  if (value <= 100) return 'under_100';
+  if (value <= 500) return 'under_500';
+  return 'over_500';
+}
+
+const records = readJsonl(inputPath);
+const session = records.find((record) => record.type === 'session.start');
+const probes = records.filter((record) => record.type === 'subagent.spawn.probe');
+const completed = records.find((record) => record.type === 'subagent.toolsearch.matrix.completed');
+
+if (!session || probes.length === 0 || !completed) {
+  throw new Error(`Expected session.start, subagent.spawn.probe records, and subagent.toolsearch.matrix.completed in ${inputPath}`);
+}
+
+const traceSeed = `${session.session_id}:${session.conversation_id}:subagent-toolsearch-propagation`;
+const traceId = sha256(traceSeed).replace('sha256:', '').slice(0, 32);
+const spanId = sha256(`${traceSeed}:span`).replace('sha256:', '').slice(0, 16);
+
+const events = [
+  ...probes.map((record) => ({
+    trace_id: traceId,
+    span_id: spanId,
+    name: 'subagent.toolsearch.propagation.evaluated',
+    time: record.time,
+    attributes: {
+      'session.id': session.session_id,
+      'gen_ai.conversation.id': session.conversation_id,
+      'agent.name': session.agent,
+      'subagent.type_hash': hashRef(record.subagent_type),
+      'subagent.spawn.path': record.spawn_path,
+      'subagent.skill_context_active': record.skill_context_active,
+      'subagent.parent_intermediate_tool_call_count_bucket': countBucket(record.parent_intermediate_tool_call_count),
+      'subagent.tools.declaration_shape': record.tools_declaration_shape,
+      'subagent.tools.toolsearch_declared': record.toolsearch_declared,
+      'subagent.tools.toolsearch_exposed': record.toolsearch_exposed,
+      'subagent.mcp.parent_server_count_bucket': countBucket(record.mcp_server_count_parent),
+      'subagent.mcp.available_server_count_bucket': countBucket(record.mcp_server_count_subagent),
+      'subagent.mcp.loaded_tool_definition_count_bucket': countBucket(record.loaded_tool_definition_count),
+      'subagent.mcp.deferred_tool_definition_count_bucket': countBucket(record.deferred_tool_definition_count),
+      'subagent.tools.filter_reason': record.filter_reason,
+      'subagent.tools.declaration_hash': sha256(record.raw_tools_declaration),
+      'privacy.raw_tools_declaration_recorded': false,
+      'privacy.raw_tool_schemas_recorded': false,
+      'privacy.raw_prompts_recorded': false,
+      'privacy.raw_paths_recorded': false
+    }
+  })),
+  {
+    trace_id: traceId,
+    span_id: spanId,
+    name: 'subagent.toolsearch.matrix.completed',
+    time: completed.time,
+    attributes: {
+      'session.id': session.session_id,
+      'gen_ai.conversation.id': session.conversation_id,
+      'subagent.toolsearch.tested_axis': completed.tested_axis,
+      'subagent.toolsearch.probe_count_bucket': countBucket(completed.probe_count),
+      'subagent.toolsearch.passing_probe_count_bucket': countBucket(completed.passing_probe_count),
+      'subagent.toolsearch.failing_probe_count_bucket': countBucket(completed.failing_probe_count),
+      'subagent.toolsearch.recommended_next_probe_hash': hashRef(completed.recommended_next_probe),
+      'subagent.toolsearch.audit_gap': completed.audit_gap
+    }
+  }
+];
+
+writeFileSync(receiptPath, `${events.map((event) => JSON.stringify(event)).join('\n')}\n`);
+
+const trace = {
+  resourceSpans: [
+    {
+      resource: {
+        attributes: attributesToOtel({
+          'service.name': 'pluribus-subagent-toolsearch-propagation-receipt-demo',
+          'service.version': '0.0.0-fixture',
+          'deployment.environment.name': 'local-fixture'
+        })
+      },
+      scopeSpans: [
+        {
+          scope: {
+            name: 'pluribus.context_input_evidence.subagent_toolsearch_propagation_demo',
+            version: '0.0.0-fixture'
+          },
+          spans: [
+            {
+              traceId,
+              spanId,
+              parentSpanId: '',
+              name: 'agent.subagent.toolsearch.propagation',
+              kind: 1,
+              startTimeUnixNano: unixNano(probes[0].time),
+              endTimeUnixNano: unixNano(completed.time),
+              attributes: attributesToOtel({
+                'session.id': session.session_id,
+                'gen_ai.conversation.id': session.conversation_id,
+                'agent.name': session.agent,
+                'workspace.name_hash': hashRef(session.workspace),
+                'gen_ai.request.model': session.model,
+                'subagent.context.receipt.scope': 'toolsearch_propagation_matrix'
+              }),
+              events: events.map((event) => ({
+                name: event.name,
+                timeUnixNano: unixNano(event.time),
+                attributes: attributesToOtel(event.attributes)
+              })),
+              status: { code: 1 }
+            }
+          ]
+        }
+      ]
+    }
+  ]
+};
+
+writeFileSync(tracePath, `${JSON.stringify(trace, null, 2)}\n`);
+
+console.log(`Wrote ${receiptPath}`);
+console.log(`Wrote ${tracePath}`);
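Two details of the script above are worth illustrating. `unixNano` passes `Date.parse(...)` straight to `BigInt`, and `BigInt(NaN)` throws a `RangeError`, so a malformed `time` value fails with a message that does not name the timestamp. The span start also uses `probes[0].time`, which assumes the first probe in the log is the earliest. The sketch below shows one way to make both explicit. `unixNanoStrict` and `earliestTime` are hypothetical helpers, not part of the package.

```javascript
// Hypothetical variants of the converter's helpers (not part of the package).
function unixNanoStrict(isoTimestamp) {
  const millis = Date.parse(isoTimestamp);
  // Date.parse returns NaN for unparseable input; fail with a readable message.
  if (Number.isNaN(millis)) {
    throw new Error(`Invalid timestamp: ${isoTimestamp}`);
  }
  // Milliseconds to nanoseconds, returned as a decimal string like the original.
  return `${BigInt(millis) * 1_000_000n}`;
}

function earliestTime(records) {
  // Pick the smallest parsed time instead of trusting log order.
  return records
    .map((record) => record.time)
    .reduce((earliest, time) => (Date.parse(time) < Date.parse(earliest) ? time : earliest));
}

// Hand-written probe times for illustration.
const outOfOrderProbes = [
  { time: '2026-01-01T00:00:02.000Z' },
  { time: '2026-01-01T00:00:01.000Z' }
];
console.log(earliestTime(outOfOrderProbes)); // → 2026-01-01T00:00:01.000Z
console.log(unixNanoStrict('1970-01-01T00:00:01.000Z')); // → 1000000000
```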