dcp-wrap 0.2.0

@@ -0,0 +1,548 @@
1
+ # DCP Integration with PicoClaw
2
+
3
+ Reduce LLM token consumption by 40-60% on structured tool output — without modifying PicoClaw's core.
4
+
5
+ PicoClaw's [hook system](https://github.com/sipeed/picoclaw) provides `after_tool` interception points. dcp-wrap runs as an out-of-process hook that intercepts JSON tool results and converts them to DCP positional arrays before they reach the LLM.
6
+
7
+ ```
8
+ Tool execution → JSON result
9
+ → after_tool hook (dcp-wrap, Node.js)
10
+ → DCP encode: {"id":"abc","score":0.9,"tags":["fix"]} → ["abc",0.9,"fix"]
11
+ → LLM receives compact DCP instead of verbose JSON
12
+ ```
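The encoding step can be sketched in a few lines of TypeScript. This is an illustrative re-implementation, not dcp-wrap's actual API: it emits a `$S` schema header followed by one positional row per record.

```typescript
// Illustrative sketch of DCP positional encoding (not dcp-wrap's real API).
type Schema = { id: string; fields: string[] };

function dcpEncode(records: Record<string, unknown>[], schema: Schema): string {
  // Header row: ["$S", schemaId, ...fieldNames]
  const header = JSON.stringify(["$S", schema.id, ...schema.fields]);
  // One positional row per record; missing fields become null.
  const rows = records.map((r) =>
    JSON.stringify(schema.fields.map((f) => r[f] ?? null))
  );
  return [header, ...rows].join("\n");
}

const encoded = dcpEncode(
  [{ id: "abc", score: 0.9, tags: "fix" }],
  { id: "demo:v1", fields: ["id", "score", "tags"] }
);
// encoded === '["$S","demo:v1","id","score","tags"]\n["abc",0.9,"fix"]'
```

The repeated field names move into the single header row, which is where the savings come from.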
13
+
14
+ ## Prerequisites
15
+
16
+ - PicoClaw v0.2.4+
17
+ - Node.js 18+ (installed in PicoClaw's environment)
18
+ - dcp-wrap (`npm install dcp-wrap`)
19
+
20
+ ## Quick Start
21
+
22
+ ### 1. Install dcp-wrap
23
+
24
+ ```bash
25
+ npm install dcp-wrap
26
+ ```
27
+
28
+ ### 2. Add the hook to PicoClaw config
29
+
30
+ In your `config.json`, add the `hooks` section. The hook runs as an external process communicating via JSON-RPC over stdio.
31
+
32
+ ```json
33
+ {
34
+ "version": 1,
35
+ "hooks": {
36
+ "enabled": true,
37
+ "processes": {
38
+ "dcp_encoder": {
39
+ "enabled": true,
40
+ "priority": 50,
41
+ "transport": "stdio",
42
+ "command": ["node", "./node_modules/dcp-wrap/dist/picoclaw-hook.js"],
43
+ "intercept": ["after_tool"],
44
+ "env": {
45
+ "PICOCLAW_DCP_TOOLS": "{\"my_api_tool\":{\"id\":\"api-response:v1\",\"fields\":[\"endpoint\",\"method\",\"status\",\"latency_ms\"]}}"
46
+ }
47
+ }
48
+ }
49
+ }
50
+ }
51
+ ```
52
+
53
+ ### 3. Restart PicoClaw
54
+
55
+ The hook starts automatically on the first user message (hooks are lazily initialized).
56
+
57
+ Check the logs for:
58
+ ```
59
+ Process hook stderr | hook=dcp_encoder | stderr="[dcp-hook] Started. Tools configured: my_api_tool"
60
+ ```
61
+
62
+ ## Configuring Tools
63
+
64
+ The `PICOCLAW_DCP_TOOLS` environment variable maps tool names to DCP schemas. Two modes:
65
+
66
+ ### Explicit schema (recommended for known tools)
67
+
68
+ Define the schema ID and field list. Fields are extracted by name from the JSON output.
69
+
70
+ ```json
71
+ {
72
+ "mcp_engram_pull": {
73
+ "id": "engram-recall:v1",
74
+ "fields": ["id", "relevance", "summary", "tags", "hitCount", "weight", "status"]
75
+ }
76
+ }
77
+ ```
78
+
79
+ ### Auto schema (for unknown/varying tools)
80
+
81
+ Set `"auto"` and dcp-wrap will infer the schema from the first batch of results.
82
+
83
+ ```json
84
+ {
85
+ "web_fetch": "auto"
86
+ }
87
+ ```
88
+
89
+ Auto-generated schemas are cached for the hook process lifetime. Good for exploration; switch to explicit schemas once you know the output shape.
90
+
91
+ ### Mixed configuration
92
+
93
+ ```json
94
+ {
95
+ "mcp_engram_pull": {
96
+ "id": "engram-recall:v1",
97
+ "fields": ["id", "relevance", "summary", "tags", "hitCount", "weight", "status"]
98
+ },
99
+ "mcp_engram_ls": {
100
+ "id": "engram-scan:v1",
101
+ "fields": ["id", "summary", "tags", "hitCount", "weight", "status"]
102
+ },
103
+ "web_fetch": "auto"
104
+ }
105
+ ```
106
+
107
+ Unlisted tools pass through unchanged.
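A hypothetical sketch of the lookup logic (dcp-wrap's internals may differ): parse `PICOCLAW_DCP_TOOLS` once, then look each tool name up, treating `"auto"` and explicit schemas differently, and passing unlisted tools through.

```typescript
// Hypothetical config lookup for PICOCLAW_DCP_TOOLS (illustrative only).
type Schema = { id: string; fields: string[] };
type ToolConfig = Schema | "auto";

function parseToolConfig(env: string | undefined): Map<string, ToolConfig> {
  if (!env) return new Map(); // no env var → nothing is encoded
  return new Map(Object.entries(JSON.parse(env) as Record<string, ToolConfig>));
}

const config = parseToolConfig(
  '{"mcp_engram_pull":{"id":"engram-recall:v1","fields":["id","relevance"]},"web_fetch":"auto"}'
);

config.get("mcp_engram_pull"); // explicit schema object
config.get("web_fetch");       // "auto" → infer schema from first results
config.get("exec");            // undefined → pass through unchanged
```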
108
+
109
+ ## What gets encoded
110
+
111
+ The hook intercepts `result.for_llm` — the string that PicoClaw sends to the LLM as tool output. Encoding happens only when:
112
+
113
+ 1. The tool is listed in `PICOCLAW_DCP_TOOLS`
114
+ 2. The result is not an error (`is_error: false`)
115
+ 3. The result parses as JSON (object or array of objects)
116
+
117
+ If any condition fails, the result passes through unchanged. The hook never breaks tool output.
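The three conditions can be sketched as a guard (illustrative, not dcp-wrap's actual code):

```typescript
// Illustrative guard: encode only configured, non-error, JSON tool results.
function shouldEncode(
  tool: string,
  isError: boolean,
  forLlm: string,
  configured: Set<string>
): boolean {
  if (!configured.has(tool)) return false; // 1. tool not listed in PICOCLAW_DCP_TOOLS
  if (isError) return false;               // 2. error results pass through untouched
  try {
    const parsed = JSON.parse(forLlm);     // 3. must parse as an object or array
    return typeof parsed === "object" && parsed !== null;
  } catch {
    return false;                          // plain text → passthrough
  }
}
```

For example, `shouldEncode("web_search", false, "Results for: ...", configured)` is false twice over: the tool is unlisted and the output is not JSON.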
118
+
119
+ ## Where DCP helps most
120
+
121
+ DCP reduces tokens on **structured, multi-record data**:
122
+
123
+ | Tool output type | DCP effect | Why |
124
+ |---|---|---|
125
+ | Array of JSON objects (API results, search results, database rows) | 40-60% reduction | Repeated keys eliminated, positional encoding |
126
+ | Single JSON object with large text field | ~0% reduction | Text dominates, schema overhead > savings |
127
+ | Plain text | Passthrough | Not JSON, nothing to encode |
128
+
129
+ Best candidates: MCP tool results, API responses, database queries, structured search results.
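To see why arrays of objects compress well, compare character counts directly (characters approximate tokens; the exact reduction depends on field-name length relative to values):

```typescript
// Compare a verbose JSON array against equivalent DCP positional rows.
const records: Record<string, unknown>[] = Array.from({ length: 10 }, (_, i) => ({
  id: `rec-${i}`,
  relevance: 0.9,
  summary: "short summary",
  tags: "a,b",
}));

const verbose = JSON.stringify(records);

const fields = ["id", "relevance", "summary", "tags"];
const dcp = [
  JSON.stringify(["$S", "demo:v1", ...fields]),
  ...records.map((r) => JSON.stringify(fields.map((f) => r[f]))),
].join("\n");

// Keys appear once in the header instead of once per record.
console.log(`verbose=${verbose.length} dcp=${dcp.length}`);
```

With ten records, the four repeated key names account for roughly 40% of the verbose form, which is exactly the overhead the positional encoding removes.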
130
+
131
+ ## Docker Setup
132
+
133
+ PicoClaw's official Docker image (`sipeed/picoclaw:latest`) is Alpine-based with no Node.js. Add it:
134
+
135
+ ```dockerfile
136
+ FROM docker.io/sipeed/picoclaw:latest
137
+
138
+ USER root
139
+ RUN apk add --no-cache nodejs npm
140
+
141
+ # Option A: npm install (when dcp-wrap is published)
142
+ # WORKDIR /opt/dcp-hook
143
+ # RUN npm install dcp-wrap
144
+
145
+ # Option B: copy built dist (local development)
146
+ WORKDIR /opt/dcp-hook/node_modules/dcp-wrap
147
+ COPY dcp-wrap-dist/ ./dist/
148
+ COPY dcp-wrap-package.json ./package.json
149
+
150
+ WORKDIR /root
151
+ ENTRYPOINT ["picoclaw"]
152
+ CMD ["gateway"]
153
+ ```
154
+
155
+ Update the hook command path in config:
156
+ ```json
157
+ "command": ["node", "/opt/dcp-hook/node_modules/dcp-wrap/dist/picoclaw-hook.js"]
158
+ ```
159
+
160
+ Mount config via volume:
161
+ ```yaml
162
+ services:
163
+ picoclaw-gateway:
164
+ build: .
165
+ volumes:
166
+ - ./data:/root/.picoclaw
167
+ extra_hosts:
168
+ - "host.docker.internal:host-gateway"
169
+ ports:
170
+ - "127.0.0.1:18800:18790"
171
+ ```
172
+
173
+ ## Gotchas
174
+
175
+ ### Config must have `"version": 1`
176
+
177
+ PicoClaw's config migration runs when `version` is missing (treated as v0). The v0-to-v1 migration re-serializes the Go struct with `omitempty`, which silently drops `hooks.processes` if it wasn't recognized during migration.
178
+
179
+ Always start your config with:
180
+ ```json
181
+ {
182
+ "version": 1,
183
+ ...
184
+ }
185
+ ```
186
+
187
+ ### Hooks initialize lazily
188
+
189
+ Don't expect hook logs at gateway startup. The `hook.hello` handshake happens on the first user message that triggers a turn. If you only see startup logs with no hook activity, send a message first.
190
+
191
+ ### `intercept: ["after_tool"]` also sends `before_tool`
192
+
193
+ PicoClaw maps both `before_tool` and `after_tool` to a single `InterceptTool` flag, so the hook receives both RPCs even if it only cares about one. dcp-wrap handles this correctly: for `before_tool` it either injects parameters (for tools configured that way) or returns `{"action": "continue"}`.
194
+
195
+ ### Plain text tool output
196
+
197
+ Some built-in tools (e.g., DuckDuckGo `web_search`) return plain text, not JSON. The hook passes these through unchanged. This is correct behavior — DCP encodes structure, not prose.
198
+
199
+ ## How it works internally
200
+
201
+ The hook process communicates with PicoClaw via [JSON-RPC over stdio](https://github.com/sipeed/picoclaw):
202
+
203
+ ```
204
+ PicoClaw dcp-wrap hook (Node.js)
205
+ │ │
206
+ ├──hook.hello──────────────────────▶│
207
+ │◀─────────────{ok:true}────────────┤
208
+ │ │
209
+ │ (user sends message, LLM calls tool)
210
+ │ │
211
+ ├──hook.before_tool────────────────▶│
212
+ │◀─────────{action:"continue"}──────┤
213
+ │ │
214
+ │ (tool executes, produces result) │
215
+ │ │
216
+ ├──hook.after_tool─────────────────▶│
217
+ │ {tool:"mcp_query", │
218
+ │ result:{for_llm:"[{...},...]"}} │
219
+ │ │
220
+ │ (hook encodes for_llm via DCP) │
221
+ │ │
222
+ │◀──{action:"modify",───────────────┤
223
+ │ result:{for_llm:"[$S,...]\n..."}}│
224
+ │ │
225
+ │ (LLM receives DCP-encoded output) │
226
+ ```
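A minimal hook process following this flow might look like the sketch below (illustrative; the method names follow the diagram, error handling and the actual DCP encoding are omitted). Dispatch is kept pure so the framing logic is easy to test; the stdio wiring is shown in a comment.

```typescript
// Minimal stdio hook skeleton (illustrative; a real hook adds DCP encoding).
type RpcRequest = { jsonrpc: "2.0"; id: number; method: string; params?: unknown };

// Pure dispatch: method name → result payload.
function handle(req: RpcRequest): unknown {
  switch (req.method) {
    case "hook.hello":
      return { ok: true };
    case "hook.before_tool":
      return { action: "continue" };
    case "hook.after_tool":
      return { action: "continue" }; // a real hook would encode result.for_llm here
    default:
      return {};
  }
}

// One JSON-RPC response line per request line.
function handleLine(line: string): string {
  const req = JSON.parse(line) as RpcRequest;
  return JSON.stringify({ jsonrpc: "2.0", id: req.id, result: handle(req) });
}

// Wiring to stdio in the actual hook process would look like:
// import * as readline from "node:readline";
// readline.createInterface({ input: process.stdin })
//   .on("line", (l) => process.stdout.write(handleLine(l) + "\n"));
```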
227
+
228
+ ### RPC payloads
229
+
230
+ **after_tool request** (PicoClaw sends):
231
+ ```json
232
+ {
233
+ "jsonrpc": "2.0",
234
+ "id": 7,
235
+ "method": "hook.after_tool",
236
+ "params": {
237
+ "meta": {"session_key": "..."},
238
+ "tool": "mcp_engram_pull",
239
+ "arguments": {"query": "docker port conflict"},
240
+ "result": {
241
+ "for_llm": "[{\"id\":\"abc\",\"relevance\":0.95,...}]",
242
+ "is_error": false
243
+ },
244
+ "duration": 234000000
245
+ }
246
+ }
247
+ ```
248
+
249
+ **after_tool response** (hook returns, when encoding):
250
+ ```json
251
+ {
252
+ "jsonrpc": "2.0",
253
+ "id": 7,
254
+ "result": {
255
+ "action": "modify",
256
+ "result": {
257
+ "meta": {"session_key": "..."},
258
+ "tool": "mcp_engram_pull",
259
+ "result": {
260
+ "for_llm": "[\"$S\",\"engram-recall:v1\",\"id\",\"relevance\",\"summary\",\"tags\"]\n[\"abc\",0.95,\"docker port fix\",\"docker,gotcha\"]"
261
+ }
262
+ }
263
+ }
264
+ }
265
+ ```
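Going the other way, an encoded `for_llm` string can be decoded back into objects by reading the `$S` header. This is a sketch for illustration; dcp-wrap ships its own decoder, whose API may differ.

```typescript
// Illustrative DCP decoder: $S header row, then positional data rows.
function dcpDecode(text: string): { schemaId: string; records: Record<string, unknown>[] } {
  const [headerLine, ...rowLines] = text.split("\n");
  const header = JSON.parse(headerLine) as unknown[];
  if (header[0] !== "$S") throw new Error("missing $S header");
  const schemaId = header[1] as string;
  const fields = header.slice(2) as string[];
  const records = rowLines.map((line) => {
    const values = JSON.parse(line) as unknown[];
    // Zip field names from the header with positional values.
    return Object.fromEntries(fields.map((f, i) => [f, values[i]]));
  });
  return { schemaId, records };
}

const { schemaId, records } = dcpDecode(
  '["$S","engram-recall:v1","id","relevance"]\n["abc",0.95]'
);
// schemaId === "engram-recall:v1"; records[0] is { id: "abc", relevance: 0.95 }
```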
266
+
267
+ ## Why PicoClaw, not OpenClaw
268
+
269
+ We evaluated both frameworks for DCP integration. PicoClaw is the clear choice.
270
+
271
+ ### OpenClaw: skill/prompt-level DCP constraints don't work
272
+
273
+ OpenClaw's architecture makes hook-level interception impractical:
274
+
275
+ - **Built-in prompts override skill prompts.** DCP formatting instructions added via skills or custom prompts are silently overridden by OpenClaw's internal system prompt. The LLM never sees the DCP constraint.
276
+ - **No output hooks.** OpenClaw has no `after_tool` equivalent. There is no way to intercept tool results before they reach the LLM. This is tracked as [OpenClaw #12914](https://github.com/openclaw/openclaw) but unimplemented as of March 2026.
277
+ - **Plugin architecture is input-only.** OpenClaw plugins can modify the system prompt but cannot intercept or transform tool output.
278
+
279
+ Bottom line: without output hooks, DCP encoding at the tool result boundary is impossible in OpenClaw.
280
+
281
+ ### PicoClaw: 4 modifiable hooks = complete DCP pipeline
282
+
283
+ PicoClaw's hook system was designed for exactly this kind of interception:
284
+
285
+ | Hook | DCP role | Status |
286
+ |---|---|---|
287
+ | `after_tool` | Encode tool JSON → DCP positional arrays | **Implemented** |
288
+ | `before_llm` | Inject output controller ("respond as DCP") | Planned |
289
+ | `after_llm` | Cap non-conforming output + decode for messaging | Planned |
290
+ | `before_tool` | (Optional) Tool argument optimization | Not needed |
291
+
292
+ Key advantages:
293
+ - **Out-of-process hooks** via JSON-RPC over stdio — any language works, no Go port needed
294
+ - **Modifiable responses** — hooks can rewrite tool results, not just observe them
295
+ - **Per-tool selectivity** — DCP encode only configured tools, everything else passes through
296
+ - **Edge device focus** — PicoClaw targets Raspberry Pi and resource-constrained hardware where token cost is a real constraint, not theoretical
297
+
298
+ ### Practical comparison
299
+
300
+ | | OpenClaw | PicoClaw |
301
+ |---|---|---|
302
+ | Hook system | No output hooks | 4 modifiable hooks |
303
+ | Tool result interception | Impossible | `after_tool` with modify |
304
+ | DCP encoding | Not feasible | Working (dcp-wrap hook) |
305
+ | LLM output control | Prompt-level only (overridden) | `before_llm` injection |
306
+ | External process hooks | Not supported | JSON-RPC over stdio |
307
+ | Token cost sensitivity | Cloud-focused | Edge-device-focused |
308
+
309
+ ## Rate limits and tool count
310
+
311
+ PicoClaw sends all tool definitions to the LLM on every turn. With MCP servers, tool count can grow quickly:
312
+
313
+ | Configuration | Tool count | ~Input tokens per turn |
314
+ |---|---|---|
315
+ | Default (built-in only) | 14 | ~8K |
316
+ | + 1 MCP server (6 tools) | 20 | ~15K |
317
+ | + skills, multiple MCP servers | 30+ | ~25K+ |
318
+
319
+ On Anthropic's free/low-tier plans (50K input tokens/min), a single multi-iteration turn with 20+ tools can hit rate limits. Mitigations:
320
+
321
+ 1. **Disable unused tools** — Set `"enabled": false` for tools you don't need (exec, read_file, write_file, spawn, subagent, skills)
322
+ 2. **Clear session history** — Delete `data/sessions/` to reset accumulated context
323
+ 3. **Limit iterations** — Set `max_tool_iterations` lower (e.g., 5)
324
+ 4. **Use a higher-tier API plan** — More headroom for agentic loops
325
+
326
+ This is actually where `before_llm` DCP encoding of ToolDefinition[] would have the highest impact — compressing 20+ tool schemas that ship on every single turn.
327
+
328
+ ## Real-world results: engram MCP integration
329
+
330
+ ### The problem with naive after_tool encoding
331
+
332
+ The initial approach — intercept JSON tool results in `after_tool` and DCP-encode them — hit a fundamental issue: **most tool output is not JSON**.
333
+
334
+ | Tool | Output format | DCP encodable? |
335
+ |---|---|---|
336
+ | web_search (DuckDuckGo) | Plain text (`"Results for: ..."`) | No |
337
+ | web_fetch | Single JSON object, `text` field dominates | ~0% reduction |
338
+ | read_file, exec, list_dir | Plain text | No |
339
+ | cron (list) | Plain text (`"Scheduled jobs:\n- ..."`) | No |
340
+ | **MCP tools (engram_pull)** | **Depends on output mode** | **Yes, with the right approach** |
341
+
342
+ Even engram's MCP server returns human-readable text by default:
343
+ ```
344
+ Found 10 results for "DCP" (cross-project):
345
+ [1] DCP formatter placement: OUT-side...
346
+ hits=2 weight=-2.58 status=recent relevance=0.345
347
+ tags: why, dcp, formatter, architecture
348
+ id: a7b5dce9-...
349
+ ```
350
+
351
+ This is 5649 chars for 10 results. Not JSON, so the after_tool hook passes it through unchanged.
352
+
353
+ ### The solution: before_tool parameter injection
354
+
355
+ engram's MCP server already supports a `queryType` parameter:
356
+ - `queryType: "human"` (default) — verbose natural language
357
+ - `queryType: "agent"` — DCP positional arrays with `$S` header
358
+
359
+ The problem: PicoClaw's LLM doesn't know to pass `queryType: "agent"`. It uses whatever parameters it decides on.
360
+
361
+ The fix: **use `before_tool` to inject `queryType: "agent"` before the MCP call executes**.
362
+
363
+ ```typescript
364
+ // In picoclaw-hook.ts
365
+ const AGENT_QUERY_TOOLS = new Set(["mcp_engram_engram_pull", "mcp_engram_engram_ls"]);
366
+
367
+ function handleBeforeTool(params: unknown): unknown {
368
+ const payload = params as ToolCallPayload;
369
+ if (AGENT_QUERY_TOOLS.has(payload.tool)) {
370
+ return {
371
+ action: "modify",
372
+ call: {
373
+ ...payload,
374
+ arguments: { ...payload.arguments, queryType: "agent" },
375
+ },
376
+ };
377
+ }
378
+ return { action: "continue" };
379
+ }
380
+ ```
381
+
382
+ This is transparent to the LLM — it calls `engram_pull` normally, the hook injects the parameter, engram returns DCP, and the LLM reads compact positional arrays.
383
+
384
+ ### Measured results
385
+
386
+ ```
387
+ before_tool: injecting queryType=agent for mcp_engram_engram_pull
388
+ Tool call: mcp_engram_engram_pull({"crossProject":true,"limit":10,"query":"DCP","queryType":"agent"})
389
+ Tool execution completed | result_length=1697 | tool=mcp_engram_engram_pull
390
+ ```
391
+
392
+ | | Human format | DCP format | Reduction |
393
+ |---|---|---|---|
394
+ | engram_pull (10 results) | 5649 chars | 1697 chars | **70%** |
395
+ | LLM iterations to answer | 3 | 2 | **-33%** |
396
+
397
+ The LLM correctly interprets the DCP `$S` header and positional rows, extracting the same information from 70% fewer tokens.
398
+
399
+ ### Key insight: two-hook pattern
400
+
401
+ The effective pattern is not `after_tool` alone, but **`before_tool` + `after_tool` working together**:
402
+
403
+ 1. **`before_tool`**: Inject parameters that tell the MCP server to return compact format
404
+ 2. **`after_tool`**: Available as fallback for tools that don't have a compact mode (auto-encode JSON via SchemaGenerator)
405
+
406
+ This avoids the fundamental problem of trying to parse and re-encode text that was never JSON in the first place.
407
+
408
+ ### Difficulties encountered during integration
409
+
410
+ **engram MCP dist was stale.** The `dcp-format.ts` source existed but `dist/dcp-format.js` did not — the MCP server had never been rebuilt after adding DCP output support. The hook injected `queryType: "agent"` correctly, the MCP server received it, but the import of `formatRecallDcp` failed silently and fell through to the human format codepath. Always rebuild MCP servers before copying dist into Docker.
411
+
412
+ **MCP tool naming convention.** PicoClaw prefixes MCP tools with `mcp_{serverName}_{toolName}`. For server `engram` and tool `engram_pull`, the full name is `mcp_engram_engram_pull` — not `mcp_engram_pull`. This affects both the DCP_TOOLS config and the AGENT_QUERY_TOOLS set.
413
+
414
+ **Environment variable naming.** engram's MCP server uses `GATEWAY_URL`, not `ENGRAM_GATEWAY_URL`. The first attempt used the wrong name, causing `Cannot reach http://localhost:3100` errors inside the container (the default fallback).
415
+
416
+ **Rate limits with many tools.** PicoClaw sends all tool definitions on every LLM turn. With 6 MCP tools + 14 built-in tools = 20 tools, a multi-iteration turn can exceed Anthropic's 50K input tokens/min limit. Disabling unused tools (exec, file I/O, skills, spawn) reduced the count from 21 to 11 and resolved the issue.
417
+
418
+ ## For MCP server authors: why your server should speak DCP
419
+
420
+ This integration proved one thing clearly: **DCP cannot be bolted on from outside.** A hook sitting between the tool and the LLM can only work with what the tool gives it. If the tool returns plain text, there is nothing to compress.
421
+
422
+ ### The plain text problem
423
+
424
+ Most tools — web_search, exec, read_file, cron, list_dir — return plain text for a good reason: maximum compatibility. Any LLM can read text. No schema knowledge required.
425
+
426
+ But this "compatibility" has a cost. When an MCP tool returns 10 structured records as formatted text:
427
+
428
+ ```
429
+ [1] DCP formatter placement: OUT-side...
430
+ hits=2 weight=-2.58 status=recent relevance=0.345
431
+ tags: why, dcp, formatter, architecture
432
+ id: a7b5dce9-...
433
+ ```
434
+
435
+ Every field label (`hits=`, `weight=`, `status=`, `tags:`, `id:`) is repeated per record. For 10 records, that's 10x the overhead. The LLM reads all of it, pays for all of it, and extracts the same information that a positional array conveys in a fraction of the tokens.
436
+
437
+ ### What MCP servers should do
438
+
439
+ Add a `queryType` parameter (or equivalent) to your tool schema:
440
+
441
+ ```typescript
442
+ server.tool("my_tool", {
443
+ query: z.string(),
444
+ queryType: z.enum(["human", "agent"]).optional()
445
+ .describe("'agent' returns DCP compact format. Default: 'human'."),
446
+ }, async ({ query, queryType }) => {
447
+ const results = await fetchResults(query);
448
+
449
+ if (queryType === "agent") {
450
+ // DCP positional arrays — 70% fewer tokens
451
+ return { content: [{ type: "text", text: dcpEncode(results, schema) }] };
452
+ }
453
+
454
+ // Human-readable text — default, compatible with everything
455
+ return { content: [{ type: "text", text: formatHuman(results) }] };
456
+ });
457
+ ```
458
+
459
+ This is the contract. The MCP tool declares it can speak DCP. The consumer (LLM, hook, agent framework) decides whether to ask for it.
460
+
461
+ ### Why this matters
462
+
463
+ Without this contract:
464
+ - Hooks can only observe text, not compress it
465
+ - Every agent framework must implement its own parsing for every tool's text format
466
+ - Token costs scale linearly with verbosity
467
+
468
+ With this contract:
469
+ - A `before_tool` hook injects `queryType: "agent"` once
470
+ - The MCP server returns DCP natively — no parsing, no re-encoding
471
+ - 70% token reduction, measured and proven
472
+ - The human format remains the default — nothing breaks for consumers that don't know DCP
473
+
474
+ The MCP protocol already provides the mechanism: typed parameters via tool schemas. DCP doesn't require a new protocol — it requires MCP servers to **offer a compact output mode** and consumers to **ask for it**.
475
+
476
+ ### The pattern for any MCP server
477
+
478
+ 1. **Keep human format as default.** Backward compatible. Text works everywhere.
479
+ 2. **Add `queryType: "agent"` parameter.** Declare the compact mode exists.
480
+ 3. **Return DCP when asked.** Use [dcp-wrap](https://github.com/hiatamaworkshop/dcp-wrap) or format positional arrays directly.
481
+ 4. **Let hooks handle the switching.** Agent frameworks inject the parameter automatically — the LLM never needs to learn about it.
482
+
483
+ ## Scheduled tasks and DCP: where it applies
484
+
485
+ PicoClaw has two schedulers: **cron** (user-defined jobs) and **heartbeat** (periodic `HEARTBEAT.md` check). Whether DCP applies depends on the execution path.
486
+
487
+ ### Execution paths
488
+
489
+ | Scheduler | Mode | Flow | DCP applies? |
490
+ |---|---|---|---|
491
+ | Cron | `deliver=false` (default) | Message → LLM turn → LLM calls tools → response | **Yes** — tools go through hooks |
492
+ | Cron | `deliver=true` | Message → direct to Telegram | No — LLM not involved |
493
+ | Cron | `command` set | Shell exec → output to Telegram | No — LLM not involved |
494
+ | Heartbeat | — | HEARTBEAT.md → LLM turn → LLM calls tools → response | **Yes** — same as cron deliver=false |
495
+
496
+ The key: cron `deliver=false` and heartbeat both start an LLM turn via `ProcessDirectWithChannel`. The LLM decides which tools to call. Those tool calls go through `before_tool` (parameter injection) and `after_tool` (encoding fallback) — the same DCP pipeline as interactive messages.
497
+
498
+ ### What this means in practice
499
+
500
+ A heartbeat task like "Check engram for new knowledge about DCP" triggers:
501
+
502
+ ```
503
+ Heartbeat tick (every 30 min)
504
+ → LLM reads HEARTBEAT.md task list
505
+ → LLM calls mcp_engram_engram_pull(query="DCP")
506
+ → before_tool injects queryType=agent ← DCP kicks in here
507
+ → engram returns 1697 chars (not 5649) ← 70% saved
508
+ → LLM summarizes and sends to Telegram
509
+ ```
510
+
511
+ Every 30 minutes, 70% fewer tokens per engram query. Over a day, that compounds.
512
+
513
+ ### Input-side DCP: the before_llm frontier
514
+
515
+ The current implementation handles **tool output** (after_tool) and **tool parameters** (before_tool). But the largest token cost is on the **input side** — what gets sent to the LLM on every single turn:
516
+
517
+ | Input component | Sent every turn | Approximate tokens | DCP potential |
518
+ |---|---|---|---|
519
+ | Tool definitions (12+ tools) | Yes | ~4K-8K | **High** — repeated structured schemas |
520
+ | Conversation history | Yes | Grows over time | **High** — message[] with repeated structure |
521
+ | System prompt | Yes (cached by LLM) | ~2K | Low — already prefix-cached |
522
+ | User message | Yes | Small | None — natural language |
523
+
524
+ The `before_llm` hook can intercept the full `LLMHookRequest` including `messages[]` and `tools[]`. Compressing tool definitions from verbose JSON Schema to DCP positional format could save 50%+ on every turn — but this requires the LLM to understand DCP tool schemas, which is unverified territory.
525
+
526
+ This is the next frontier. The tool output side is solved. The input side is where the remaining cost lives.
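As a thought experiment (nothing here is implemented, and whether an LLM can reliably call tools described this way is untested), a `before_llm` pass might flatten each verbose tool definition into one positional row:

```typescript
// Hypothetical sketch: flatten verbose tool definitions into DCP rows.
// Whether an LLM can use tools described this way is an open question.
type ToolDef = {
  name: string;
  description: string;
  parameters: Record<string, { type: string; description?: string }>;
};

function encodeToolDefs(tools: ToolDef[]): string {
  const header = JSON.stringify(["$S", "tooldef:v0", "name", "description", "params"]);
  const rows = tools.map((t) =>
    JSON.stringify([
      t.name,
      t.description,
      // "name:type" pairs instead of a nested JSON Schema object
      Object.entries(t.parameters)
        .map(([k, v]) => `${k}:${v.type}`)
        .join(","),
    ])
  );
  return [header, ...rows].join("\n");
}
```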
527
+
528
+ ### Guidance for task authors
529
+
530
+ When writing `HEARTBEAT.md` tasks or cron job messages, structure them so the LLM calls MCP tools (which benefit from DCP) rather than built-in text tools:
531
+
532
+ ```markdown
533
+ ## Good — triggers MCP tool with DCP benefit
534
+ - Check engram for recent knowledge about deployment issues
535
+ - Search engram for error patterns from this week
536
+
537
+ ## Less effective — triggers text-output tools
538
+ - Run `df -h` and report disk usage
539
+ - Fetch https://status.example.com and summarize
540
+ ```
541
+
542
+ The former triggers `engram_pull` → DCP encoding → 70% savings. The latter triggers `exec` or `web_fetch` → plain text → no DCP benefit.
543
+
544
+ ## Next steps
545
+
546
+ - [DCP Specification](https://dcp-docs.pages.dev/dcp/specification) — full protocol design
547
+ - [dcp-wrap README](../README.md) — CLI and programmatic API
548
+ - [Schema-Driven Encoder](https://dcp-docs.pages.dev/dcp/schema-driven-encoder) — how encoding works
package/package.json ADDED
@@ -0,0 +1,50 @@
1
+ {
2
+ "name": "dcp-wrap",
3
+ "version": "0.2.0",
4
+ "description": "Convert JSON to DCP positional-array format for AI agents — 40-70% token reduction",
5
+ "type": "module",
6
+ "main": "dist/index.js",
7
+ "types": "dist/index.d.ts",
8
+ "bin": {
9
+ "dcp-wrap": "dist/cli.js",
10
+ "dcp-picoclaw-hook": "dist/picoclaw-hook.js"
11
+ },
12
+ "exports": {
13
+ ".": {
14
+     "types": "./dist/index.d.ts",
15
+     "import": "./dist/index.js"
16
+ }
17
+ },
18
+ "scripts": {
19
+ "build": "tsc",
20
+ "dev": "tsc --watch",
21
+ "test": "tsc && node --test dist/decoder.test.js dist/picoclaw-hook.test.js",
22
+ "prepublishOnly": "npm run test"
23
+ },
24
+ "files": [
25
+ "dist",
26
+ "README.md",
27
+ "docs"
28
+ ],
29
+ "repository": {
30
+ "type": "git",
31
+ "url": "git+https://github.com/hiatamaworkshop/dcp-wrap.git"
32
+ },
33
+ "homepage": "https://dcp-docs.pages.dev",
34
+ "dependencies": {},
35
+ "devDependencies": {
36
+ "@types/node": "^20.0.0",
37
+ "typescript": "^5.3.0"
38
+ },
39
+ "keywords": [
40
+ "dcp",
41
+ "ai",
42
+ "llm",
43
+ "tokens",
44
+ "mcp",
45
+ "positional-array",
46
+ "picoclaw",
47
+ "hook"
48
+ ],
49
+ "license": "Apache-2.0"
50
+ }