simplicio-prompt 0.1.0
- package/LICENSE +15 -0
- package/README.md +173 -0
- package/YOOL_TUPLE_HAMT.md +1149 -0
- package/adopters.md +24 -0
- package/benchmarks/generate_prompt_benchmark_pdf.py +355 -0
- package/benchmarks/generate_v2_benchmark_pdf.py +302 -0
- package/benchmarks/prompt_vs_normal.py +431 -0
- package/benchmarks/prompt_vs_normal_benchmark.pdf +124 -0
- package/benchmarks/prompt_vs_normal_results.md +148 -0
- package/benchmarks/v2_safe_speed_benchmark.pdf +118 -0
- package/benchmarks/v2_safe_speed_benchmark.py +626 -0
- package/benchmarks/v2_safe_speed_results.json +446 -0
- package/benchmarks/v2_safe_speed_results.md +96 -0
- package/docs/assets/simplicio-prompt-hero.png +0 -0
- package/docs/assets/yool-v2-safe-speed-infographic-en.png +0 -0
- package/docs/assets/yool-v2-safe-speed-infographic-pt.png +0 -0
- package/examples/node/build-catalog.mjs +70 -0
- package/examples/python/minimal_bus.py +134 -0
- package/examples/python/receipts.py +152 -0
- package/guardrails/cpu_throttle.py +119 -0
- package/guardrails/disk_gc.py +212 -0
- package/kernel/README.md +82 -0
- package/kernel/yool_tuple_kernel.py +1109 -0
- package/kernel-implementation-request.md +38 -0
- package/package.json +40 -0
- package/prompts/agent-runtime-execution-prompt.md +119 -0
- package/prompts/legacy-tuple-space-engine-prompt.md +36 -0
|
@@ -0,0 +1,1149 @@
|
|
|
1
|
+
# yool · tuple · HAMT — capability addressing for agent systems
|
|
2
|
+
|
|
3
|
+
> Canonical specification of the **yool / tuple / HAMT** pattern.
|
|
4
|
+
> Cross-project pattern doc. Source of truth lives here:
|
|
5
|
+
> https://github.com/wesleysimplicio/simplicio-prompt (private).
|
|
6
|
+
> Vendored into [SendSprint](https://github.com/wesleysimplicio/SendSprint),
|
|
7
|
+
> [llm-project-mapper](https://github.com/wesleysimplicio/llm-project-mapper),
|
|
8
|
+
> and any future agent-orchestration project.
|
|
9
|
+
|
|
10
|
+
Status: **draft v0.2** · Maintainer: @wesleysimplicio · Last updated: 2026-05-19
|
|
11
|
+
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
## 0. TL;DR
|
|
15
|
+
|
|
16
|
+
- **yool** — smallest callable capability atom. An opcode: `agent.dev.python`, `ide.cursor.send`, `fs.read_sector`.
|
|
17
|
+
- **tuple** — addressable envelope binding yools to map position, authority, lane, budget, source pointers, receipts. Unit of work.
|
|
18
|
+
- **HAMT** — Hash Array Mapped Trie cataloging every yool/tuple/agent/operator. O(log32 n) lookup, immutable structural sharing.
|
|
19
|
+
- **tuple-space** (Linda) — producers `out` tuples; workers `in`/`rd` by pattern. No imperative orchestrator.
|
|
20
|
+
- **receipts** — content-addressable execution records. Same input → same hash → cache hit, no recompute.
|
|
21
|
+
- **MCP edge** — read-mostly snapshot/dispatch surface. NOT the inner loop.
|
|
22
|
+
- **guardrails** — CPU throttle + disk GC are mandatory, not optional (see §11).
|
|
23
|
+
|
|
24
|
+
---
|
|
25
|
+
|
|
26
|
+
## 1. The Problem (Why We Built This)
|
|
27
|
+
|
|
28
|
+
Every agent system we built had the same five rots:
|
|
29
|
+
|
|
30
|
+
1. **Orchestrator accretion** — `flow.py` / `pipeline.py` grow with every new step. Imperative coupling. New step = patching 5 files.
|
|
31
|
+
2. **Registry sprawl** — agents/IDEs/operators live in hand-maintained dicts. Adding 1 IDE touches the dict, the dispatcher, the CLI, the docs, the tests.
|
|
32
|
+
3. **No cross-run cache** — same input rebuilds because intermediate results have no addressable identity.
|
|
33
|
+
4. **Resume/replay is bespoke** — every project serializes run state differently. Crash recovery is "best effort", usually a fresh restart.
|
|
34
|
+
5. **Audit is post-hoc** — "what ran, with whose authority, against which input, costing how much" — answered by `grep` over logs.
|
|
35
|
+
|
|
36
|
+
Symptom that triggered this spec: the **MCP exposure drift** — exposing every internal call as an MCP tool makes the inner loop slow (latency per call), uncacheable (no addressable identity), and unbounded (no budget). The fix isn't "better MCP" — it's keeping MCP at the edge and rebuilding the inner loop on **capability addressing**.
|
|
37
|
+
|
|
38
|
+
---
|
|
39
|
+
|
|
40
|
+
## 2. Vocabulary
|
|
41
|
+
|
|
42
|
+
### 2.1 yool
|
|
43
|
+
|
|
44
|
+
The atomic, callable capability. **One yool = one opcode = one side-effect or pure computation**.
|
|
45
|
+
|
|
46
|
+
#### Examples by category
|
|
47
|
+
|
|
48
|
+
```
|
|
49
|
+
# Code-acting agents
|
|
50
|
+
agent.dev.python
|
|
51
|
+
agent.dev.dotnet
|
|
52
|
+
agent.dev.typescript
|
|
53
|
+
agent.lint.ruff
|
|
54
|
+
agent.lint.eslint
|
|
55
|
+
agent.lint.dotnet
|
|
56
|
+
agent.test.pytest
|
|
57
|
+
agent.test.jest
|
|
58
|
+
agent.test.e2e.playwright
|
|
59
|
+
agent.security.scan
|
|
60
|
+
agent.security.dependabot
|
|
61
|
+
agent.pr.create
|
|
62
|
+
agent.pr.review
|
|
63
|
+
|
|
64
|
+
# IDE bridges
|
|
65
|
+
ide.cursor.send
|
|
66
|
+
ide.zed.send
|
|
67
|
+
ide.vscode.send
|
|
68
|
+
ide.jetbrains.send
|
|
69
|
+
|
|
70
|
+
# Project operators
|
|
71
|
+
op.jira.fetch_sprint
|
|
72
|
+
op.azure.fetch_iteration
|
|
73
|
+
op.linear.fetch_cycle
|
|
74
|
+
op.github.fetch_issues
|
|
75
|
+
|
|
76
|
+
# Filesystem & net (primitives)
|
|
77
|
+
fs.read_sector
|
|
78
|
+
fs.write_receipt
|
|
79
|
+
net.fetch
|
|
80
|
+
net.post
|
|
81
|
+
|
|
82
|
+
# Catalog itself (introspection)
|
|
83
|
+
catalog.lookup
|
|
84
|
+
catalog.list_by_lane
|
|
85
|
+
catalog.diff
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
#### Naming rules
|
|
89
|
+
|
|
90
|
+
| Rule | Example |
|
|
91
|
+
|---|---|
|
|
92
|
+
| One verb, one direct object | `agent.lint.ruff` (lint is the verb, ruff is the implementation) |
|
|
93
|
+
| Stable identifier (rename = breaking) | `agent.dev.python.v1` then bump to `.v2` instead of renaming |
|
|
94
|
+
| Namespace by domain.action.tool | `domain` in {agent, ide, op, fs, net, catalog} |
|
|
95
|
+
| Pure or single-effect | Yool either reads OR writes, not both implicitly |
|
|
96
|
+
| Returns a receipt | Every yool execution emits exactly one receipt |
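
The naming rules in this table could be checked mechanically when the catalog is built. The sketch below is one possible validator and is not part of the package: `validate_yool_name`, the segment grammar, and the `_and_` heuristic for multi-verb names are assumptions made for illustration.

```python
import re

# Domains listed in the naming-rules table.
ALLOWED_DOMAINS = {"agent", "ide", "op", "fs", "net", "catalog"}

# Assumed segment grammar: lowercase letter, then letters, digits, underscores.
_SEGMENT = re.compile(r"^[a-z][a-z0-9_]*$")


def validate_yool_name(name: str) -> list:
    """Return a list of rule violations; an empty list means the name passes."""
    segments = name.split(".")
    if len(segments) < 2:
        return ["expected at least domain.action"]
    problems = []
    if segments[0] not in ALLOWED_DOMAINS:
        problems.append(f"unknown domain: {segments[0]}")
    for seg in segments[1:]:
        if not _SEGMENT.match(seg):
            problems.append(f"bad segment: {seg!r}")
        # Heuristic for the "multiple verbs" anti-pattern.
        if "_and_" in seg:
            problems.append(f"multiple verbs in segment: {seg!r}")
    return problems
```

A check like this only catches syntactic problems. Opaque names such as `agent.do.thing` or implicit fan-out such as `agent.deploy.all_envs` pass it and still need human review.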
|
|
97
|
+
|
|
98
|
+
#### Anti-patterns
|
|
99
|
+
|
|
100
|
+
```
|
|
101
|
+
# NO - multiple verbs
|
|
102
|
+
agent.lint_and_test.python
|
|
103
|
+
|
|
104
|
+
# NO - implicit fan-out
|
|
105
|
+
agent.deploy.all_envs
|
|
106
|
+
|
|
107
|
+
# NO - opaque action
|
|
108
|
+
agent.do.thing
|
|
109
|
+
|
|
110
|
+
# YES - split into independent yools
|
|
111
|
+
agent.lint.python
|
|
112
|
+
agent.test.python
|
|
113
|
+
agent.deploy.staging
|
|
114
|
+
agent.deploy.prod
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
A yool is **not** a function. A function might back a yool, but the yool itself is the **addressable symbol + contract**, decoupled from implementation.
|
|
118
|
+
|
|
119
|
+
### 2.2 tuple
|
|
120
|
+
|
|
121
|
+
The envelope. Wraps a payload of yools with everything needed to route, authorize, budget, audit, and replay.
|
|
122
|
+
|
|
123
|
+
#### Canonical schema (v1)
|
|
124
|
+
|
|
125
|
+
```jsonc
|
|
126
|
+
{
|
|
127
|
+
"id": "sha256:abc123...",
|
|
128
|
+
"schema": "yool-tuple/v1",
|
|
129
|
+
"map_pos": {
|
|
130
|
+
"repo": "EVT",
|
|
131
|
+
"branch": "feat/JIRA-456",
|
|
132
|
+
"sprint_id": "JIRA-456",
|
|
133
|
+
"stack": "dotnet"
|
|
134
|
+
},
|
|
135
|
+
"authority": {
|
|
136
|
+
"user": "wes",
|
|
137
|
+
"agent": "sendsprint",
|
|
138
|
+
"ci": false
|
|
139
|
+
},
|
|
140
|
+
"lane": "build",
|
|
141
|
+
"agent_terms": {
|
|
142
|
+
"budget_usd": 0.50,
|
|
143
|
+
"budget_tokens": 50000,
|
|
144
|
+
"deadline_iso": "2026-05-19T20:00:00Z",
|
|
145
|
+
"max_retries": 2,
|
|
146
|
+
"cpu_quota_pct": 60,
|
|
147
|
+
"disk_quota_mb": 100
|
|
148
|
+
},
|
|
149
|
+
"src_ptr": [
|
|
150
|
+
"jira://EVT/JIRA-456",
|
|
151
|
+
"commit://abc123"
|
|
152
|
+
],
|
|
153
|
+
"payload": [
|
|
154
|
+
{"yool": "agent.dev.dotnet", "args": {}},
|
|
155
|
+
{"yool": "agent.lint.dotnet", "args": {}},
|
|
156
|
+
{"yool": "agent.test.dotnet", "args": {"filter": "Unit"}}
|
|
157
|
+
],
|
|
158
|
+
"receipts": [],
|
|
159
|
+
"parent_id": null,
|
|
160
|
+
"created_at": "2026-05-19T17:30:00Z"
|
|
161
|
+
}
|
|
162
|
+
```
|
|
163
|
+
|
|
164
|
+
#### Field semantics
|
|
165
|
+
|
|
166
|
+
| Field | Purpose | Mutability |
|
|
167
|
+
|---|---|---|
|
|
168
|
+
| `id` | Content hash of canonical form (excluding `id` itself) | immutable post-creation |
|
|
169
|
+
| `schema` | Version of this schema | immutable |
|
|
170
|
+
| `map_pos` | Where in the project graph this tuple lives | immutable |
|
|
171
|
+
| `authority` | Who/what authorized this work | immutable |
|
|
172
|
+
| `lane` | Routing key for workers | immutable (re-emit to change) |
|
|
173
|
+
| `agent_terms` | Budget envelope + guardrails | immutable |
|
|
174
|
+
| `src_ptr` | Provenance pointers (issue, commit, doc) | immutable |
|
|
175
|
+
| `payload` | Ordered yool program | immutable |
|
|
176
|
+
| `receipts` | Receipt ids appended as yools complete | append-only |
|
|
177
|
+
| `parent_id` | Parent tuple in DAG | immutable |
|
|
178
|
+
| `created_at` | Wall clock at creation | immutable |
|
|
179
|
+
|
|
180
|
+
#### Properties
|
|
181
|
+
|
|
182
|
+
- **Content-addressable**: `id = sha256(canonical(tuple_without_id))`. Same input -> same id -> free dedupe.
|
|
183
|
+
- **Self-describing**: `schema` field allows v1 and v2 to coexist on the bus.
|
|
184
|
+
- **Replayable**: every field needed to re-execute is on the envelope.
|
|
185
|
+
- **Auditable**: `authority` + `src_ptr` make every action traceable.
|
|
186
|
+
|
|
187
|
+
### 2.3 HAMT (Hash Array Mapped Trie)
|
|
188
|
+
|
|
189
|
+
A persistent dictionary structure. Used here to **catalog** all yools, agents, IDEs, operators under a single addressable namespace.
|
|
190
|
+
|
|
191
|
+
#### Why HAMT vs flat dict
|
|
192
|
+
|
|
193
|
+
| Concern | Flat dict | HAMT |
|
|
194
|
+
|---|---|---|
|
|
195
|
+
| Lookup | O(1) avg, O(n) worst | O(log32 n) bounded |
|
|
196
|
+
| Memory on update | Rewrite-heavy | Structural sharing (only touched path copied) |
|
|
197
|
+
| Concurrency | Mutex required | Lock-free reads (immutable nodes) |
|
|
198
|
+
| Distribution | Hard to shard | Trivially shardable by top-level slot |
|
|
199
|
+
| Audit history | None | Hashes form Merkle chain |
|
|
200
|
+
| Size at scale | Memory bloat at >100k | Bounded depth = bounded memory walk |
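
The "structural sharing" row can be made concrete with path copying: an update returns a new root and copies only the nodes on the path to the changed slot. The sketch below is a minimal illustration under assumptions, not the kernel's implementation. `PNode`, `PLeaf`, `assoc`, and `get` are hypothetical names, the hash parameters (BLAKE2b, 5 bits per level, 6 levels) follow this document, and full 30-bit collisions are deliberately left out.

```python
import hashlib
from dataclasses import dataclass

BITS, LEVELS = 5, 6
MASK = (1 << BITS) - 1


@dataclass(frozen=True)
class PLeaf:
    key: str
    value: object


@dataclass(frozen=True)
class PNode:
    children: dict  # slot -> PLeaf | PNode; treated as read-only once built


def _hash30(key: str) -> int:
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") & ((1 << 30) - 1)


def _slot(h: int, level: int) -> int:
    return (h >> ((LEVELS - 1 - level) * BITS)) & MASK


def assoc(node: PNode, key: str, value: object, level: int = 0) -> PNode:
    """Return a new root containing key; `node` itself is never modified."""
    if level >= LEVELS:
        raise NotImplementedError("full 30-bit collisions are out of scope here")
    slot = _slot(_hash30(key), level)
    child = node.children.get(slot)
    if child is None or (isinstance(child, PLeaf) and child.key == key):
        new_child = PLeaf(key, value)
    elif isinstance(child, PLeaf):
        # Two keys share this slot: push both one level down.
        sub = assoc(PNode({}), child.key, child.value, level + 1)
        new_child = assoc(sub, key, value, level + 1)
    else:
        new_child = assoc(child, key, value, level + 1)
    # Copy only this node's slot table; untouched children are shared.
    return PNode({**node.children, slot: new_child})


def get(node: PNode, key: str):
    """Walk the trie by slot; return the stored value or None."""
    h = _hash30(key)
    for level in range(LEVELS):
        child = node.children.get(_slot(h, level))
        if child is None:
            return None
        if isinstance(child, PLeaf):
            return child.value if child.key == key else None
        node = child
    return None
```

Because the old root is left intact, a reader holding it keeps a consistent snapshot while a writer publishes the new root, which is what makes lock-free reads possible.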
|
|
201
|
+
|
|
202
|
+
#### Parameters used
|
|
203
|
+
|
|
204
|
+
- Hash: BLAKE2b-64 truncated to 30 bits
|
|
205
|
+
- Bits per level: 5 (branching factor = 32)
|
|
206
|
+
- Max levels: 6
|
|
207
|
+
- Address space pre-collision: 2^30 ~= 1.07 billion
|
|
208
|
+
|
|
209
|
+
A yool name like `agent.dev.python` is hashed; the 30-bit hash decomposes into 6 slot indices `[s0..s5]` that walk the trie. Collisions beyond level 6 collapse to a `collision` leaf list.
|
|
210
|
+
|
|
211
|
+
Reference: Phil Bagwell, *Ideal Hash Trees* — https://lampwww.epfl.ch/papers/idealhashtrees.pdf
|
|
212
|
+
|
|
213
|
+
### 2.4 tuple-space
|
|
214
|
+
|
|
215
|
+
The coordination substrate. Producers `out` tuples; consumers `in`/`rd` tuples matching a pattern.
|
|
216
|
+
|
|
217
|
+
**Linda primitives** (Gelernter 1985):
|
|
218
|
+
|
|
219
|
+
| Primitive | Semantics |
|
|
220
|
+
|---|---|
|
|
221
|
+
| `out(tuple)` | Publish to space. Non-blocking. |
|
|
222
|
+
| `in(pattern)` | Remove and return one matching tuple. Blocks until match. |
|
|
223
|
+
| `rd(pattern)` | Read (don't remove) one matching tuple. Blocks until match. |
|
|
224
|
+
| `eval(template)` | Spawn an active tuple that computes itself, becoming a passive tuple. |
|
|
225
|
+
|
|
226
|
+
A worker handling `lane=build`:
|
|
227
|
+
|
|
228
|
+
```python
|
|
229
|
+
while True:
    t = bus.in_({"lane": "build"})
    receipt = run(t)
    bus.out_(t.with_receipt(receipt))
|
|
233
|
+
```
|
|
234
|
+
|
|
235
|
+
That's the entire orchestrator.
|
|
236
|
+
|
|
237
|
+
Reference: David Gelernter, *Generative Communication in Linda*, ACM TOPLAS 1985.
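
For readers who want to try the primitives, here is a minimal in-memory sketch of a tuple space. It is an illustration under assumptions, not the bus shipped with the package. `TupleSpace` is a hypothetical class, tuples are plain dicts, a pattern matches when all of its key/value pairs are present, and the `timeout` parameter is an addition that classic Linda does not have. `eval` is omitted.

```python
import threading


class TupleSpace:
    """In-memory Linda-style space; tuples are plain dicts."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(t: dict, pattern: dict) -> bool:
        # A pattern matches when every key/value pair it names is present.
        return all(t.get(k) == v for k, v in pattern.items())

    def out_(self, t: dict) -> None:
        """Publish a tuple and wake any blocked readers."""
        with self._cond:
            self._tuples.append(t)
            self._cond.notify_all()

    def _take(self, pattern: dict, remove: bool, timeout):
        with self._cond:
            while True:
                for i, t in enumerate(self._tuples):
                    if self._matches(t, pattern):
                        return self._tuples.pop(i) if remove else t
                # Sleep until a new tuple arrives; None signals a timeout.
                if not self._cond.wait(timeout):
                    return None

    def in_(self, pattern: dict, timeout=None):
        """Remove and return one matching tuple; blocks until a match."""
        return self._take(pattern, remove=True, timeout=timeout)

    def rd(self, pattern: dict, timeout=None):
        """Return one matching tuple without removing it."""
        return self._take(pattern, remove=False, timeout=timeout)
```

With this class, the worker loop above needs only a `run` function; several workers can call `in_` on the same lane, and each tuple is delivered to exactly one of them.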
|
|
238
|
+
|
|
239
|
+
### 2.5 receipt
|
|
240
|
+
|
|
241
|
+
The output artifact of a yool execution.
|
|
242
|
+
|
|
243
|
+
```jsonc
|
|
244
|
+
{
|
|
245
|
+
"id": "sha256:def456...",
|
|
246
|
+
"yool": "agent.test.pytest",
|
|
247
|
+
"tuple_id": "sha256:abc123...",
|
|
248
|
+
"started_at": "2026-05-19T17:30:00Z",
|
|
249
|
+
"ended_at": "2026-05-19T17:31:12Z",
|
|
250
|
+
"exit": 0,
|
|
251
|
+
"stdout_sha": "sha256:111...",
|
|
252
|
+
"stderr_sha": "sha256:222...",
|
|
253
|
+
"artifacts": [
|
|
254
|
+
{"kind": "junit", "path": "evidence/sha:333.../junit.xml"},
|
|
255
|
+
{"kind": "coverage", "path": "evidence/sha:444.../coverage.json"}
|
|
256
|
+
],
|
|
257
|
+
"cost": {"usd": 0.012, "tokens_in": 1200, "tokens_out": 800, "wall_ms": 72000, "disk_mb": 4.2}
|
|
258
|
+
}
|
|
259
|
+
```
|
|
260
|
+
|
|
261
|
+
Receipts are **immutable** and **content-addressable**. Two yool runs with the same input hash MAY share a receipt — that's the cache key.
|
|
262
|
+
|
|
263
|
+
### 2.6 MCP edge
|
|
264
|
+
|
|
265
|
+
Model Context Protocol surfaces are **read-mostly snapshots** of the tuple-space, not the bus itself.
|
|
266
|
+
|
|
267
|
+
| Allowed | Disallowed |
|
|
268
|
+
|---|---|
|
|
269
|
+
| `catalog.lookup(name)` | `tuple.in()` / `tuple.out()` (inner loop) |
|
|
270
|
+
| `catalog.list_by_lane(lane)` | per-yool dispatch |
|
|
271
|
+
| `tuple.dispatch(tuple)` (returns receipt id) | streaming raw tuple events without ETag |
|
|
272
|
+
| `tuple.observe(id)` (SSE, cacheable) | mutation of catalog/receipts |
|
|
273
|
+
| `receipt.get(id)` | |
|
|
274
|
+
|
|
275
|
+
Why: latency, cost, cacheability. MCP is fine for snapshot dashboards and external agent observation. Putting inner-loop semantics behind MCP turns every tuple emit into a network call.
|
|
276
|
+
|
|
277
|
+
Reference: MCP tools spec — https://modelcontextprotocol.io/specification/draft/server/tools
|
|
278
|
+
|
|
279
|
+
---
|
|
280
|
+
|
|
281
|
+
## 3. Architecture
|
|
282
|
+
|
|
283
|
+
### 3.1 Static layout
|
|
284
|
+
|
|
285
|
+
```
|
|
286
|
+
+------------------------------------------------------------+
|                     Capability Catalog                     |
|                    (HAMT, addressable)                     |
|                                                            |
|  agent.dev.*       agent.lint.*     agent.test.*           |
|  agent.security.*  agent.pr.*       agent.deploy.*         |
|  ide.*             op.*             fs.*      net.*        |
|                                                            |
|  storage: .catalog/capabilities.json (versioned in repo)   |
+------------------------------------------------------------+
                              ^
                              | resolve(name) -> impl
                              |
+------------------------------------------------------------+
|                        Tuple Space                         |
|                        (Linda bus)                         |
|                                                            |
|  .catalog/tuples.jsonl (append-only, content-addressable)  |
|                                                            |
|  producers --out--> [tuple] --in--> subscribers            |
|      ^                                   |                 |
|      |                                   v                 |
|  .catalog/receipts/ (content-addressable)                  |
+------------------------------------------------------------+
                              ^
                              | snapshot
                              |
+------------------------------------------------------------+
|                          MCP Edge                          |
|  catalog.lookup   catalog.list   tuple.dispatch            |
|  tuple.observe (SSE)   receipt.get                         |
+------------------------------------------------------------+
                              ^
                              |
             Claude / Codex / Copilot / Dashboard
|
|
321
|
+
```
|
|
322
|
+
|
|
323
|
+
### 3.2 Dynamic flow (one item)
|
|
324
|
+
|
|
325
|
+
```
|
|
326
|
+
1. user/CI/agent emits tuple T0
   { lane: "build", map_pos: {...}, payload: [yool_a, yool_b, yool_c] }

2. catalog resolves each yool to an impl
   yool_a -> agent.dev.python.v1
   yool_b -> agent.lint.ruff.v3
   yool_c -> agent.test.pytest.v2

3. worker pool subscribes lane="build"
   worker pulls T0, executes payload sequentially or in parallel per dep graph

4. each yool emits a receipt R_a, R_b, R_c
   each receipt is content-addressed; cache check before recompute

5. final receipt R_T0 = aggregate(R_a, R_b, R_c)
   tuple log appends T0 + R_T0

6. MCP snapshot reflects new state
   dashboard / Claude observe via SSE
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
### 3.3 Failure & resume
|
|
348
|
+
|
|
349
|
+
```
|
|
350
|
+
crash mid-flight
|
|
351
|
+
|
|
|
352
|
+
v
|
|
353
|
+
restart reads tuples.jsonl
|
|
354
|
+
|
|
|
355
|
+
v
|
|
356
|
+
filter tuples with no terminal receipt
|
|
357
|
+
|
|
|
358
|
+
v
|
|
359
|
+
re-emit on bus (idempotent: id collision = skip)
|
|
360
|
+
|
|
|
361
|
+
v
|
|
362
|
+
workers reprocess; cached receipts short-circuit
|
|
363
|
+
```
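
The "filter tuples with no terminal receipt" step might look like the sketch below. The on-disk record format is not specified in this section, so the format here is an assumption: each log line is a JSON object with a `kind` of `"tuple"` or `"receipt"`, and a receipt carries `tuple_id` plus a `terminal` flag. `incomplete_tuples` is a hypothetical helper name.

```python
import json
from typing import Iterable


def incomplete_tuples(lines: Iterable[str]) -> list:
    """Return logged tuples that have no terminal receipt, in log order."""
    tuples = {}        # tuple id -> tuple record (first occurrence)
    finished = set()   # ids of tuples with a terminal receipt
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            # A crash can leave a torn final line; skip it.
            continue
        if rec.get("kind") == "tuple":
            # Re-emitted tuples share an id, so keep only the first copy.
            tuples.setdefault(rec["id"], rec)
        elif rec.get("kind") == "receipt" and rec.get("terminal"):
            finished.add(rec["tuple_id"])
    return [t for tid, t in tuples.items() if tid not in finished]
```

On restart, the result can be re-emitted on the bus. Because ids are content hashes, a tuple that was already re-emitted is recognized and skipped.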
|
|
364
|
+
|
|
365
|
+
### 3.4 HAMT trie example
|
|
366
|
+
|
|
367
|
+
Sample insertion of 3 yools - trie state after each step.
|
|
368
|
+
|
|
369
|
+
```
|
|
370
|
+
Initial: empty Node { bitmap=0, children={} }

insert(agent.dev.python)  hash=011010...  slots=[13, 4, 22, 9, 1, 7]

  Node { bitmap=...10000000000000, children={13: Leaf(agent.dev.python)} }

insert(agent.lint.ruff)   hash=000111...  slots=[3, 21, 0, 18, 30, 12]

  Node { bitmap=...10000000001000, children={3: Leaf(agent.lint.ruff),
                                             13: Leaf(agent.dev.python)} }

insert(agent.dev.dotnet)  hash=011010...  slots=[13, 7, 1, 28, 4, 19]
  collides with agent.dev.python at level 0 (slot 13)

  Node {
    bitmap=...,
    children={
      3:  Leaf(agent.lint.ruff),
      13: Node {                        # subnode created
        bitmap=...,
        children={
          4: Leaf(agent.dev.python),    # at level 1, slot 4
          7: Leaf(agent.dev.dotnet)     # at level 1, slot 7
        }
      }
    }
  }
|
|
397
|
+
```
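
The `bitmap` field in the example records which of the 32 slots are occupied. In Bagwell's original layout the children live in a dense array, and the bitmap plus a population count gives the array position. The sketch below shows that calculation; `child_index` is a hypothetical helper, and this document's own nodes use a slot-keyed `children` mapping rather than a dense array.

```python
from typing import Optional


def child_index(bitmap: int, slot: int) -> Optional[int]:
    """Position of `slot` in a dense child array, or None if the slot is empty."""
    bit = 1 << slot
    if not bitmap & bit:
        return None
    # Count the occupied slots below this one.
    return bin(bitmap & (bit - 1)).count("1")


# Root bitmap after the second insert in the example: slots 3 and 13 are set.
bitmap = (1 << 3) | (1 << 13)
print(child_index(bitmap, 3))    # 0
print(child_index(bitmap, 13))   # 1
print(child_index(bitmap, 4))    # None
```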
|
|
398
|
+
|
|
399
|
+
### 3.5 Tuple-space example (Linda flow)
|
|
400
|
+
|
|
401
|
+
```
|
|
402
|
+
Time  Producer         Bus                                Worker(build)
----  ---------------  ---------------------------------  ----------------------
t0    emit(T0) -->     [T0{lane:build}]
                                                          in({lane:build})
t1                                                        <-- T0
t2                                                        run(agent.dev.dotnet)
                                                            cache miss; exec
                                                            emit receipt R_a
t3                     [T0{...receipts:[R_a]}]            run(agent.lint.dotnet)
                                                            cache HIT; reuse
t4                     [T0{...receipts:[R_a,R_b]}]        run(agent.test.dotnet)
                                                            cache miss; exec
                                                            emit receipt R_c
t5                     [T0{...receipts:[R_a,R_b,R_c]}]    out(T0_done)
                                                            aggregate R_T0
t6    observe(T0) <--  [T0_done]
|
|
418
|
+
```
|
|
419
|
+
|
|
420
|
+
---
|
|
421
|
+
|
|
422
|
+
## 4. Algorithms
|
|
423
|
+
|
|
424
|
+
### 4.1 yool name hashing
|
|
425
|
+
|
|
426
|
+
```python
|
|
427
|
+
import hashlib

def yool_hash(name: str) -> int:
    # BLAKE2b with an 8-byte (64-bit) digest, keeping the low 30 bits.
    h = hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(h, "big") & ((1 << 30) - 1)
|
|
432
|
+
```
|
|
433
|
+
|
|
434
|
+
### 4.2 HAMT slot decomposition
|
|
435
|
+
|
|
436
|
+
```python
|
|
437
|
+
def slots(h: int, levels: int = 6, bits: int = 5) -> list[int]:
    # Most significant 5 bits first: index 0 is the root-level slot.
    mask = (1 << bits) - 1
    return [(h >> ((levels - 1 - lvl) * bits)) & mask for lvl in range(levels)]
|
|
440
|
+
```
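
The insert and lookup routines in this section call a per-level helper named `slot_at` that is not spelled out. A definition consistent with the slot decomposition would plausibly be the following; the default arguments are assumptions taken from the parameters in §2.3. The round trip uses a hand-built hash value so the result can be checked by eye.

```python
BITS = 5
MAX_LEVELS = 6


def slot_at(h: int, level: int, levels: int = MAX_LEVELS, bits: int = BITS) -> int:
    """Slot index for one level; level 0 reads the most significant 5 bits."""
    return (h >> ((levels - 1 - level) * bits)) & ((1 << bits) - 1)


# Pack six slot indices into a 30-bit value, most significant slot first.
h = 0
for s in [13, 4, 22, 9, 1, 7]:
    h = (h << BITS) | s

print([slot_at(h, lvl) for lvl in range(MAX_LEVELS)])  # [13, 4, 22, 9, 1, 7]
```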
|
|
441
|
+
|
|
442
|
+
### 4.3 HAMT insert (full)
|
|
443
|
+
|
|
444
|
+
```python
|
|
445
|
+
# Assumes Node, Leaf, Collision, slot_at, MAX_LEVELS and BRANCH are defined elsewhere.
# Note: this version mutates nodes in place; a persistent variant copies the touched path.
def insert(root: Node, leaf: Leaf, level: int = 0) -> None:
    if level >= MAX_LEVELS:
        # All 30 hash bits are consumed: keep colliding keys in one list.
        slot = leaf.hash & (BRANCH - 1)
        existing = root.children.get(slot)
        if existing is None:
            root.bitmap |= 1 << slot
            root.children[slot] = Collision(hash_prefix=leaf.hash, leaves=[leaf])
        elif isinstance(existing, Collision):
            # Replace an entry with the same key, otherwise append.
            existing.leaves[:] = [l for l in existing.leaves if l.key != leaf.key]
            existing.leaves.append(leaf)
        else:
            raise RuntimeError("unexpected node at collision depth")
        return

    slot = slot_at(leaf.hash, level)
    existing = root.children.get(slot)

    if existing is None:
        root.bitmap |= 1 << slot
        root.children[slot] = leaf
        return

    if isinstance(existing, Leaf):
        if existing.hash == leaf.hash and existing.key == leaf.key:
            # Same key: update the stored entry.
            existing.tuple = leaf.tuple
            return
        # Different key in the same slot: push both one level down.
        sub = Node()
        insert(sub, existing, level + 1)
        insert(sub, leaf, level + 1)
        root.children[slot] = sub
        return

    if isinstance(existing, Node):
        insert(existing, leaf, level + 1)
        return
|
|
479
|
+
```
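
The routine above refers to `Node`, `Leaf`, `Collision`, `BRANCH`, and `MAX_LEVELS` without defining them. One possible set of definitions, inferred from how the fields are used (`bitmap`, `children`, `hash`, `key`, `tuple`, `hash_prefix`, `leaves`), is sketched below. The kernel's actual types may differ, so treat these as assumptions.

```python
from dataclasses import dataclass, field
from typing import Any

BITS = 5
BRANCH = 1 << BITS      # 32 slots per node
MAX_LEVELS = 6          # 6 levels x 5 bits = 30 hash bits


@dataclass
class Leaf:
    hash: int           # 30-bit hash of `key`
    key: str            # yool name, e.g. "agent.dev.python"
    tuple: Any = None   # catalog entry stored under the key


@dataclass
class Collision:
    hash_prefix: int    # hash shared by every leaf in the list
    leaves: list = field(default_factory=list)


@dataclass
class Node:
    bitmap: int = 0     # bit i is set when slot i is occupied
    children: dict = field(default_factory=dict)  # slot -> Leaf | Node | Collision
```

`default_factory` matters here: each `Node` gets its own `children` dict, so sibling nodes never share state by accident.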
|
|
480
|
+
|
|
481
|
+
### 4.4 HAMT lookup
|
|
482
|
+
|
|
483
|
+
```python
|
|
484
|
+
def lookup(root: Node, key: str) -> Leaf | None:
    h = yool_hash(key)
    node = root
    for lvl in range(MAX_LEVELS):
        slot = slot_at(h, lvl)
        child = node.children.get(slot)
        if child is None:
            return None
        if isinstance(child, Leaf):
            return child if child.key == key else None
        if isinstance(child, Collision):
            for leaf in child.leaves:
                if leaf.key == key:
                    return leaf
            return None
        node = child
    # insert() stores full-hash collisions one level below the last trie
    # level, keyed by the low bits of the hash, so check that node too.
    child = node.children.get(h & (BRANCH - 1))
    if isinstance(child, Collision):
        for leaf in child.leaves:
            if leaf.key == key:
                return leaf
    return None
|
|
501
|
+
```
|
|
502
|
+
|
|
503
|
+
### 4.5 Tuple id
|
|
504
|
+
|
|
505
|
+
```python
|
|
506
|
+
import json, hashlib

def tuple_id(t: dict) -> str:
    # Computed once at creation; `receipts` is append-only afterwards, so
    # recomputing after an append would yield a different hash.
    t_no_id = {k: v for k, v in t.items() if k != "id"}
    canonical = json.dumps(t_no_id, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
|
|
512
|
+
```
|
|
513
|
+
|
|
514
|
+
### 4.6 Receipt content addressing
|
|
515
|
+
|
|
516
|
+
```python
|
|
517
|
+
import hashlib

def receipt_id(receipt: dict) -> str:
    # Timestamps and cost are left out, so identical outcomes share an id.
    h = hashlib.sha256()
    h.update(receipt["yool"].encode())
    h.update(str(receipt["exit"]).encode())
    h.update(receipt["stdout_sha"].encode())
    h.update(receipt["stderr_sha"].encode())
    for a in receipt.get("artifacts", []):
        h.update(a["path"].encode())
    return "sha256:" + h.hexdigest()
|
|
526
|
+
```
|
|
527
|
+
|
|
528
|
+
### 4.7 Cache check
|
|
529
|
+
|
|
530
|
+
```python
|
|
531
|
+
import hashlib, json

def input_hash(yool: str, args: dict, file_shas: list[str], env_whitelist: dict) -> str:
    h = hashlib.sha256()
    h.update(yool.encode())
    h.update(json.dumps(args, sort_keys=True).encode())
    # Sort so the hash does not depend on argument order.
    for sha in sorted(file_shas):
        h.update(sha.encode())
    for k, v in sorted(env_whitelist.items()):
        h.update(f"{k}={v}".encode())
    return h.hexdigest()

def cached_receipt(yool: str, ih: str):
    # `receipt_store` is assumed to be a key-value store defined elsewhere.
    return receipt_store.get(f"{yool}@{ih}")
|
|
543
|
+
```
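
To show how the cache key is meant to be used, the sketch below pairs a dict-backed stand-in for `receipt_store` with a small wrapper that runs a yool only on a miss. `ReceiptStore` and `run_cached` are hypothetical names, and caching only receipts whose `exit` is 0 is an assumption of this sketch rather than a rule stated here.

```python
from typing import Callable, Optional


class ReceiptStore:
    """Dict-backed stand-in for `receipt_store`; keys look like 'yool@input_hash'."""

    def __init__(self):
        self._by_key = {}

    def get(self, key: str) -> Optional[dict]:
        return self._by_key.get(key)

    def put(self, key: str, receipt: dict) -> None:
        # Receipts are immutable: the first receipt stored under a key wins.
        self._by_key.setdefault(key, receipt)


def run_cached(store: ReceiptStore, yool: str, ih: str, execute: Callable[[], dict]):
    """Return (receipt, hit); `execute` runs only on a cache miss."""
    key = f"{yool}@{ih}"
    cached = store.get(key)
    if cached is not None:
        return cached, True
    receipt = execute()
    if receipt.get("exit") == 0:
        # Failed runs are not cached, so the next attempt executes again.
        store.put(key, receipt)
    return receipt, False
```

The quality of the cache depends entirely on what goes into the input hash. Leaving out a file or environment variable that affects the result produces false hits.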
|
|
544
|
+
|
|
545
|
+
---
|
|
546
|
+
|
|
547
|
+
## 5. End-to-End Example: SendSprint Adoption
|
|
548
|
+
|
|
549
|
+
### 5.1 Before - imperative pipeline
|
|
550
|
+
|
|
551
|
+
```python
|
|
552
|
+
class SprintFlow:
    def __init__(self, sprint_id, stack):
        self.sprint_id = sprint_id
        self.stack = stack

    def run(self):
        items = jira.fetch_sprint(self.sprint_id)
        for item in items:
            code = self._dev(item)
            self._lint(code, item)
            self._test(code, item)
            self._security_scan(code, item)
            self._create_pr(code, item)

    def _dev(self, item):
        if self.stack == "python":
            return PythonDevAgent().run(item)
        elif self.stack == "dotnet":
            return DotnetDevAgent().run(item)
        # ... 5 more elifs
|
|
572
|
+
```
|
|
573
|
+
|
|
574
|
+
Problems: adding a stack patches 5 methods; no caching; crash mid-sprint restarts from scratch; no audit beyond logs.
|
|
575
|
+
|
|
576
|
+
### 5.2 After - yool/tuple/HAMT
|
|
577
|
+
|
|
578
|
+
```python
|
|
579
|
+
class SprintFlow:
    def __init__(self, sprint_id, stack, bus, catalog, receipts):
        self.sprint_id = sprint_id
        self.stack = stack
        self.bus = bus
        self.catalog = catalog
        self.receipts = receipts

    def run(self):
        items = self._emit(yool="op.jira.fetch_sprint", args={"sprint_id": self.sprint_id})
        for item in items:
            tuple_ = Tuple(
                map_pos={"sprint_id": self.sprint_id, "stack": self.stack, "item": item.id},
                lane="build",
                agent_terms={"budget_usd": 0.50, "cpu_quota_pct": 60, "disk_quota_mb": 100},
                payload=[
                    {"yool": f"agent.dev.{self.stack}", "args": {"item": item}},
                    {"yool": f"agent.lint.{self.stack}", "args": {}},
                    {"yool": f"agent.test.{self.stack}", "args": {}},
                    {"yool": "agent.security.scan", "args": {}},
                    {"yool": "agent.pr.create", "args": {}},
                ],
                src_ptr=[f"jira://{item.id}"],
            )
            self.bus.out_(tuple_)
|
|
604
|
+
```
|
|
605
|
+
|
|
606
|
+
Adding a new stack = add `agent.dev.<newstack>` to the catalog. Zero touches to `SprintFlow`.
|
|
607
|
+
|
|
608
|
+
### 5.3 Worker
|
|
609
|
+
|
|
610
|
+
```python
|
|
611
|
+
async def worker(lane: str, bus, catalog, receipts):
    async for t in bus.subscribe(lane):
        for step in t.payload:
            yool_name = step["yool"]
            args = step["args"]

            # Cache check first: a usable receipt short-circuits the step.
            ih = input_hash(yool_name, args, file_shas=[], env_whitelist={})
            cached = receipts.find(yool_name, ih)
            if cached and cached.status == "ok" and not t.flags.get("no_cache"):
                t.receipts.append(cached.id)
                continue

            impl = catalog.lookup(yool_name)
            if impl is None:
                raise UnknownYool(yool_name)

            # Guardrails wrap every executed step.
            with cpu_throttle(t.agent_terms["cpu_quota_pct"]):
                with disk_quota(t.agent_terms["disk_quota_mb"]):
                    receipt = await impl.run(args)

            receipts.put(receipt)
            t.receipts.append(receipt.id)

        # Publish the tuple once every step has a receipt.
        bus.out_(t)
|
|
635
|
+
```
|
|
636
|
+
|
|
637
|
+
### 5.4 Cache hit example
|
|
638
|
+
|
|
639
|
+
```
|
|
640
|
+
# First run
|
|
641
|
+
$ sprint run --sprint-id JIRA-456
|
|
642
|
+
[t=0] op.jira.fetch_sprint MISS exec cost=$0.001
|
|
643
|
+
[t=4s] agent.dev.dotnet MISS exec cost=$0.12
|
|
644
|
+
[t=18s] agent.lint.dotnet MISS exec cost=$0.01
|
|
645
|
+
[t=19s] agent.test.dotnet MISS exec cost=$0.04
|
|
646
|
+
[t=42s] agent.security.scan MISS exec cost=$0.02
|
|
647
|
+
[t=44s] agent.pr.create MISS exec cost=$0.01
|
|
648
|
+
TOTAL: $0.201
|
|
649
|
+
|
|
650
|
+
# Re-run, no source change
|
|
651
|
+
$ sprint run --sprint-id JIRA-456
|
|
652
|
+
[t=0] op.jira.fetch_sprint MISS exec cost=$0.001
|
|
653
|
+
[t=4s] agent.dev.dotnet HIT skip cost=$0
|
|
654
|
+
[t=4s] agent.lint.dotnet HIT skip cost=$0
|
|
655
|
+
[t=4s] agent.test.dotnet HIT skip cost=$0
|
|
656
|
+
[t=4s] agent.security.scan HIT skip cost=$0
|
|
657
|
+
[t=4s] agent.pr.create HIT skip cost=$0
|
|
658
|
+
TOTAL: $0.001
|
|
659
|
+
```
|
|
660
|
+
|
|
661
|
+
### 5.5 Crash + resume example
|
|
662
|
+
|
|
663
|
+
```
|
|
664
|
+
$ sprint run --sprint-id JIRA-456
|
|
665
|
+
[t=0] op.jira.fetch_sprint MISS exec
|
|
666
|
+
[t=4s] agent.dev.dotnet MISS exec
|
|
667
|
+
[t=18s] agent.lint.dotnet MISS exec
|
|
668
|
+
[t=19s] agent.test.dotnet MISS exec
|
|
669
|
+
^C (kill -9)
|
|
670
|
+
|
|
671
|
+
$ sprint resume --run-id $(sprint runs list | head -1)
|
|
672
|
+
[resume] reading .catalog/tuples.jsonl
|
|
673
|
+
[resume] T0 has receipts for [op.jira.fetch_sprint, agent.dev.dotnet, agent.lint.dotnet]
|
|
674
|
+
[resume] re-emitting T0 from step 4 (agent.test.dotnet)
|
|
675
|
+
[t=0] agent.test.dotnet MISS exec cost=$0.04
|
|
676
|
+
[t=23s] agent.security.scan MISS exec cost=$0.02
|
|
677
|
+
[t=25s] agent.pr.create MISS exec cost=$0.01
|
|
678
|
+
TOTAL: $0.07
|
|
679
|
+
```
|
|
680
|
+
|
|
681
|
+
---
|
|
682
|
+
|
|
683
|
+
## 6. End-to-End Example: llm-project-mapper Adoption
|
|
684
|
+
|
|
685
|
+
### 6.1 Catalog from `AGENTS.md`
|
|
686
|
+
|
|
687
|
+
llm-project-mapper already defines agents declaratively in `AGENTS.md`. The pattern extends each entry with `yool_id`, `authority`, `lane`, `agent_terms` defaults.
|
|
688
|
+
|
|
689
|
+
#### Before
|
|
690
|
+
|
|
691
|
+
```markdown
|
|
692
|
+
## Agents
|
|
693
|
+
|
|
694
|
+
### dev-agent
|
|
695
|
+
- Role: implements code from spec
|
|
696
|
+
- Triggers: new task in .specs/sprints/
|
|
697
|
+
- Stack: auto-detect
|
|
698
|
+
|
|
699
|
+
### lint-agent
|
|
700
|
+
- Role: enforces style
|
|
701
|
+
- Triggers: post-edit
|
|
702
|
+
```
|
|
703
|
+
|
|
704
|
+
#### After
|
|
705
|
+
|
|
706
|
+
```markdown
|
|
707
|
+
## Agents
|
|
708
|
+
|
|
709
|
+
### dev-agent
|
|
710
|
+
- yool_id: agent.dev.${stack}.v1
|
|
711
|
+
- authority: [user, ci]
|
|
712
|
+
- lane: build
|
|
713
|
+
- agent_terms:
|
|
714
|
+
  budget_usd: 0.50
  cpu_quota_pct: 60
  disk_quota_mb: 100
|
|
717
|
+
- Role: implements code from spec
|
|
718
|
+
- Triggers: new task in .specs/sprints/
|
|
719
|
+
|
|
720
|
+
### lint-agent
|
|
721
|
+
- yool_id: agent.lint.${stack}.v1
|
|
722
|
+
- authority: [user, ci]
|
|
723
|
+
- lane: build
|
|
724
|
+
- agent_terms:
|
|
725
|
+
  budget_usd: 0.05
  cpu_quota_pct: 30
  disk_quota_mb: 10
|
|
728
|
+
- Role: enforces style
|
|
729
|
+
- Triggers: post-edit
|
|
730
|
+
```
|
|
731
|
+
|
|
732
|
+
### 6.2 `bin/build-hamt-catalog`
|
|
733
|
+
|
|
734
|
+
```bash
|
|
735
|
+
#!/usr/bin/env bash
# Wrapper: node -> python core.

set -euo pipefail

# The error goes to stderr because stdout is captured by the command substitution.
PY=$(command -v python3 || command -v py || { echo "python3 required" >&2; exit 1; })

ROOT="${1:-.}"
"$PY" "$(dirname "$0")/../scripts/build_hamt.py" \
  --source "$ROOT/AGENTS.md" \
  --output "$ROOT/.catalog/capabilities.json"
|
|
746
|
+
```
|
|
747
|
+
|
|
748
|
+
### 6.3 npx flow
|
|
749
|
+
|
|
750
|
+
```
|
|
751
|
+
$ npx @wesleysimplicio/llm-project-mapper my-new-project
|
|
752
|
+
[scaffold] writing AGENTS.md, .specs/, .skills/, .catalog/.gitkeep ...
|
|
753
|
+
[scaffold] creating .catalog/capabilities.json (stub)
|
|
754
|
+
[scaffold] adding .receipts/ to .gitignore
|
|
755
|
+
[scaffold] writing bin/build-hamt-catalog
|
|
756
|
+
|
|
757
|
+
$ npx llm-project-mapper build-hamt-catalog
|
|
758
|
+
[build] parsed 7 agents from AGENTS.md
|
|
759
|
+
[build] hashing yools ... 7/7
|
|
760
|
+
[build] inserting into HAMT ... done
|
|
761
|
+
[build] wrote .catalog/capabilities.json (4.2 KB)
|
|
762
|
+
[build] popcount root: 7/32
|
|
763
|
+
```
|
|
764
|
+
|
|
765
|
+
---
|
|
766
|
+
|
|
767
|
+
## 7. Implementation Checklist
|
|
768
|
+
|
|
769
|
+
### CP1 · Capability catalog (HAMT)
|
|
770
|
+
|
|
771
|
+
- [ ] Pick storage location: `<project>/.catalog/capabilities.json`.
|
|
772
|
+
- [ ] Enumerate existing capabilities.
|
|
773
|
+
- [ ] Build catalog generator.
|
|
774
|
+
- [ ] Replace existing registry lookups with `catalog.lookup(name)`.
|
|
775
|
+
- [ ] Test: lookup unknown name returns explicit error.
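The last two items of CP1 can be sketched as follows. This is a minimal illustration, not the builder's actual output format: it assumes the catalog can be viewed as a flat `{name: entry}` mapping, and `Catalog`, `UnknownCapability`, and the sample entry are hypothetical names introduced here.

```python
import json
import pathlib


class UnknownCapability(KeyError):
    """Raised when a yool name is not in the catalog (hypothetical name)."""


class Catalog:
    """Read-only view over a capability mapping.

    Assumption: entries form a flat {name: entry} mapping. A HAMT-backed
    catalog would walk the trie instead of indexing a dict.
    """

    def __init__(self, entries: dict):
        self._entries = dict(entries)

    @classmethod
    def load(cls, path: pathlib.Path) -> "Catalog":
        return cls(json.loads(path.read_text()))

    def lookup(self, name: str) -> dict:
        try:
            return self._entries[name]
        except KeyError:
            # Fail loudly instead of returning None, so a typo in a yool name
            # surfaces at dispatch time rather than as a silent no-op.
            raise UnknownCapability(f"unknown capability: {name!r}") from None


# Illustrative entry only; real entries come from the catalog generator.
catalog = Catalog({"agent.lint.python.v1": {"lane": "build"}})
```

Raising a dedicated exception keeps the "unknown name" case distinguishable from other failures in the dual-read phase of the migration.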
|
|
776
|
+
|
|
777
|
+
### CP2 · Receipt store (content-addressable)
|
|
778
|
+
|
|
779
|
+
- [ ] Directory layout: `<project>/.catalog/receipts/<sha-prefix-2>/<sha-rest>.json`.
|
|
780
|
+
- [ ] `receipt_id(receipt)` helper.
|
|
781
|
+
- [ ] Re-key artifacts by SHA.
|
|
782
|
+
- [ ] Run-scoped index.
|
|
783
|
+
- [ ] Garbage collection policy (see §11.2).
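A possible shape for the `receipt_id(receipt)` helper and the directory layout is sketched below. It assumes SHA-256 over a canonical JSON encoding (sorted keys, compact separators); the canonicalization rule, `receipt_path`, and `store_receipt` are assumptions made for illustration, and `root` stands for `<project>/.catalog`.

```python
import hashlib
import json
import pathlib


def receipt_id(receipt: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def receipt_path(root: pathlib.Path, receipt: dict) -> pathlib.Path:
    """Map a receipt to <root>/receipts/<sha-prefix-2>/<sha-rest>.json."""
    digest = receipt_id(receipt)
    return root / "receipts" / digest[:2] / f"{digest[2:]}.json"


def store_receipt(root: pathlib.Path, receipt: dict) -> pathlib.Path:
    """Write the receipt once; identical content always maps to the same path."""
    path = receipt_path(root, receipt)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(receipt, sort_keys=True, indent=2))
    return path
```

Because key order does not affect the digest, two workers that produce the same receipt independently write to the same file, which is what makes the store usable as a cache.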
|
|
784
|
+
|
|
785
|
+
### CP3 · Tuple log
|
|
786
|
+
|
|
787
|
+
- [ ] Append-only `<project>/.catalog/tuples.jsonl`.
|
|
788
|
+
- [ ] Line per emitted tuple + per receipt.
|
|
789
|
+
- [ ] Fsync on terminal receipts.
|
|
790
|
+
- [ ] Recovery script reads log + filters incomplete tuples.
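The CP3 items could look roughly like this. The record fields (`kind`, `tuple_id`, `status`) and the set of terminal statuses are assumptions for the sketch, and "filters incomplete tuples" is read here as "returns the tuples that still need work".

```python
import json
import os
import pathlib

TERMINAL = {"done", "failed"}  # assumed terminal receipt statuses


def append_record(log: pathlib.Path, record: dict) -> None:
    """Append one JSON line; fsync only when the record is a terminal receipt."""
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
        if record.get("kind") == "receipt" and record.get("status") in TERMINAL:
            fh.flush()
            os.fsync(fh.fileno())


def incomplete_tuples(log: pathlib.Path) -> list:
    """Return ids of tuples that were emitted but have no terminal receipt."""
    emitted = {}      # dict keeps emission order
    finished = set()
    with log.open(encoding="utf-8") as fh:
        for line in fh:
            try:
                rec = json.loads(line)
            except json.JSONDecodeError:
                continue  # torn final line after a crash
            if rec.get("kind") == "tuple":
                emitted[rec["tuple_id"]] = None
            elif rec.get("kind") == "receipt" and rec.get("status") in TERMINAL:
                finished.add(rec["tuple_id"])
    return [tid for tid in emitted if tid not in finished]
```

Skipping an unparsable line matters because a crash can leave the last line half-written; everything before it is still valid.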
|
|
791
|
+
|
|
792
|
+
### CP4 · Worker pool & lanes
|
|
793
|
+
|
|
794
|
+
- [ ] Define lanes.
|
|
795
|
+
- [ ] Worker = `subscribe(lane)` loop with bounded concurrency.
|
|
796
|
+
- [ ] Same-lane work can fan out with `LaneWorkerPool`; cap with
|
|
797
|
+
`YOOL_TUPLE_MAX_LANE_CONCURRENCY` / `YOOL_MAX_LANE_CONCURRENCY` (default 64).
|
|
798
|
+
Preferred workers per lane default to `YOOL_TUPLE_LANE_CONCURRENCY=32`.
|
|
799
|
+
- [ ] Massive batches use `batch_spawn(depth, branching, compression_threshold)`
|
|
800
|
+
and virtual-agent accounting, not flat subagent lists.
|
|
801
|
+
- [ ] Idle materialized leaves use `compress_token` and `prune_idle`.
|
|
802
|
+
- [ ] Replace imperative orchestrator with emitter.
|
|
803
|
+
- [ ] Backpressure: queue depth per lane.
|
|
804
|
+
- [ ] Guardrails applied per-step (§11).
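For the "bounded concurrency" and "queue depth per lane" items, a standard-library sketch is shown below. It does not reproduce the kernel's `LaneWorkerPool`; the `Lane` class, its parameters, and its method names are hypothetical and only illustrate the mechanism.

```python
import queue
import threading


class Lane:
    """One lane: a bounded queue drained by a fixed number of worker threads."""

    def __init__(self, handler, concurrency: int = 4, max_depth: int = 64):
        self._queue: queue.Queue = queue.Queue(maxsize=max_depth)
        self._handler = handler
        self._threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(concurrency)
        ]
        for t in self._threads:
            t.start()

    def emit(self, item, timeout: float = 1.0) -> bool:
        """Enqueue work; return False instead of blocking forever when full."""
        try:
            self._queue.put(item, timeout=timeout)
            return True
        except queue.Full:
            return False  # backpressure signal for the emitter

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            try:
                self._handler(item)
            except Exception:
                # A real worker would emit a failure receipt here.
                pass
            finally:
                self._queue.task_done()

    def drain(self) -> None:
        """Block until every queued item has been handled."""
        self._queue.join()
```

The emitter sees `False` from `emit` when the lane is saturated and can slow down, retry, or route elsewhere, rather than growing an unbounded in-memory backlog.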
|
|
805
|
+
|
|
806
|
+
### CP5 · MCP edge
|
|
807
|
+
|
|
808
|
+
- [ ] MCP server exposes: `catalog.lookup`, `catalog.list_by_lane`, `tuple.dispatch`, `tuple.observe`, `receipt.get`.
|
|
809
|
+
- [ ] No write semantics besides `dispatch`.
|
|
810
|
+
- [ ] Snapshot endpoint cacheable.
|
|
811
|
+
|
|
812
|
+
### CP6 · Budget enforcement
|
|
813
|
+
|
|
814
|
+
- [ ] `agent_terms` on every tuple.
|
|
815
|
+
- [ ] Worker computes projected cost; reject above remaining budget.
|
|
816
|
+
- [ ] Receipt records actual cost.
|
|
817
|
+
- [ ] Aggregator emits alarm tuple on threshold.
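The admit/record/alarm cycle in CP6 might be sketched like this. `Budget`, `BudgetExceeded`, and the `alarm_ratio` threshold are hypothetical; only `budget_usd` comes from `agent_terms` as described in this document.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a step's projected cost does not fit the remaining budget."""


@dataclass
class Budget:
    budget_usd: float           # from agent_terms.budget_usd
    spent_usd: float = 0.0      # sum of actual costs recorded so far
    alarm_ratio: float = 0.8    # assumed alarm threshold (fraction of budget)

    @property
    def remaining_usd(self) -> float:
        return self.budget_usd - self.spent_usd

    def admit(self, projected_usd: float) -> None:
        """Reject before running if the projection exceeds what is left."""
        if projected_usd > self.remaining_usd:
            raise BudgetExceeded(
                f"projected {projected_usd:.4f} > remaining {self.remaining_usd:.4f}"
            )

    def record(self, actual_usd: float) -> bool:
        """Record actual cost; return True once the alarm threshold is reached."""
        self.spent_usd += actual_usd
        return self.spent_usd >= self.alarm_ratio * self.budget_usd
```

A worker would call `admit` before invoking the yool, `record` when writing the receipt, and emit an alarm tuple whenever `record` returns `True`.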
|
|
818
|
+
|
|
819
|
+
---
|
|
820
|
+
|
|
821
|
+
## 8. Migration Playbook
|
|
822
|
+
|
|
823
|
+
| Step | Mechanism | Risk |
|
|
824
|
+
|---|---|---|
|
|
825
|
+
| 1. Generate catalog | Run builder against current registry. Commit JSON. | none |
|
|
826
|
+
| 2. Dual-read | Existing code uses old dict; new code reads catalog. Diff on mismatch. | low |
|
|
827
|
+
| 3. Receipt shim | Wrap existing artifact writes to emit content-addressed copy. | low |
|
|
828
|
+
| 4. Tuple log shim | Write tuple lines alongside current run state. | low |
|
|
829
|
+
| 5. Cut orchestrator | Refactor flow to emit/await. Old path under feature flag. | medium |
|
|
830
|
+
| 6. Workers replace direct calls | Subscribe-based execution. | medium |
|
|
831
|
+
| 7. Cache lookup | Before running yool, check receipt store. | medium |
|
|
832
|
+
| 8. MCP server | Expose snapshot. | low |
|
|
833
|
+
| 9. Budget enforcement | Add `agent_terms` to all tuples, enforce in workers. | low |
|
|
834
|
+
| 10. Remove old orchestrator | Once stable for N sprints, delete dual paths. | low |
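Step 2 (dual-read) is the part of the playbook most easily shown in code. The sketch below assumes the legacy registry is a plain dict and that the catalog lookup raises `KeyError` (or a subclass) for unknown names; `dual_read` and its parameters are hypothetical.

```python
import logging

log = logging.getLogger("catalog.dual_read")


def dual_read(name, legacy: dict, catalog_lookup, mismatches: list):
    """Serve from the legacy registry; compare against the catalog on the side."""
    old = legacy.get(name)
    try:
        new = catalog_lookup(name)
    except KeyError:
        new = None
    if old != new:
        mismatches.append(name)
        log.warning("catalog mismatch for %s: legacy=%r catalog=%r", name, old, new)
    return old  # the old path stays authoritative until the diffs reach zero
```

Returning the legacy value keeps behavior unchanged during the comparison window; the `mismatches` list (or a metric in its place) is what decides when to cut over.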
|
|
835
|
+
|
|
836
|
+
---
|
|
837
|
+
|
|
838
|
+
## 9. Reference Implementations
|
|
839
|
+
|
|
840
|
+
This repo:
|
|
841
|
+
|
|
842
|
+
- `scripts/build_hamt.py` — Python HAMT builder.
|
|
843
|
+
- `kernel/yool_tuple_kernel.py` — reference Tuple-Space kernel with
|
|
844
|
+
`batch_spawn`, `compress_token`, hookwall, indexed scans, and lane fan-out.
|
|
845
|
+
- `examples/python/minimal_bus.py` — minimal Linda-style tuple-space.
|
|
846
|
+
- `examples/python/receipts.py` — content-addressable receipt store.
|
|
847
|
+
- `examples/node/build-catalog.mjs` — Node wrapper invoking Python core.
|
|
848
|
+
- `guardrails/cpu_throttle.py` — CPU quota enforcement (§11.1).
|
|
849
|
+
- `guardrails/disk_gc.py` — receipt store GC (§11.2).
|
|
850
|
+
- `prompts/agent-runtime-execution-prompt.md` — prompt for Claude, Codex,
|
|
851
|
+
Hermes, and other coding agents to consume the runtime consistently.
|
|
852
|
+
|
|
853
|
+
Adopters:
|
|
854
|
+
|
|
855
|
+
- SendSprint — Python: `scripts/build_agent_catalog.py`, `src/sendsprint/bus/`, `src/sendsprint/receipts/`.
|
|
856
|
+
- llm-project-mapper — Node + Python: `bin/build-hamt-catalog`, `.catalog/capabilities.json`.
|
|
857
|
+
|
|
858
|
+
---
|
|
859
|
+
|
|
860
|
+
## 10. Foundational Literature
|
|
861
|
+
|
|
862
|
+
| Concept | Reference |
|
|
863
|
+
|---|---|
|
|
864
|
+
| Tuple spaces / coordination | Gelernter, *Generative Communication in Linda*, ACM TOPLAS 1985 |
|
|
865
|
+
| HAMT / persistent hash trie | Bagwell, *Ideal Hash Trees*, EPFL 2001 |
|
|
866
|
+
| Locality-preserving multi-attr indexing | Jagadish, *Linear Clustering of Objects with Multiple Attributes*, ACM SIGMOD 1990 |
|
|
867
|
+
| Hilbert clustering analysis | Moon, Jagadish, Faloutsos, Saltz, *Analysis of the Clustering Properties of the Hilbert Space-Filling Curve*, IEEE TKDE 2001 |
|
|
868
|
+
| Information theory base | Shannon, *A Mathematical Theory of Communication*, Bell System Technical Journal 1948 |
|
|
869
|
+
| Model Context Protocol tools | MCP spec — https://modelcontextprotocol.io/specification/draft/server/tools |
|
|
870
|
+
| Content-addressable storage | Merkle, *Protocols for Public Key Cryptosystems*, IEEE S&P 1980 |
|
|
871
|
+
| Persistent data structures | Okasaki, *Purely Functional Data Structures*, 1998 |
|
|
872
|
+
|
|
873
|
+
---
|
|
874
|
+
|
|
875
|
+
## 11. Guardrails (MANDATORY)
|
|
876
|
+
|
|
877
|
+
> Origin of this section: field observation from Victor "Dev Hermes" Genaro (2026-05-19):
|
|
878
|
+
> *"You need a guardrail so you don't fry the processor. You also need a garbage collector so you don't fill the disk to 100%."*
|
|
879
|
+
>
|
|
880
|
+
> Any adopter MUST implement both before going past CP4 (worker pool). Without them, a runaway agent or receipt-store explosion can take down the host.
|
|
881
|
+
|
|
882
|
+
### 11.1 CPU throttle (don't fry the CPU)
|
|
883
|
+
|
|
884
|
+
#### Problem
|
|
885
|
+
|
|
886
|
+
A worker that pulls tuples as fast as it can will pin every core it can reach, and multiple workers compound the problem. A local dev box becomes unusable; a cloud VM hits CPU throttling and may get killed.
|
|
887
|
+
|
|
888
|
+
#### Policy
|
|
889
|
+
|
|
890
|
+
Every tuple carries `agent_terms.cpu_quota_pct` (0-100). Worker MUST enforce this before invoking the yool's implementation.
|
|
891
|
+
|
|
892
|
+
Reference runtime defaults are intentionally faster than the first minimal
|
|
893
|
+
examples: `YOOL_TUPLE_LANE_CONCURRENCY` defaults to `32`,
|
|
894
|
+
`YOOL_TUPLE_MAX_LANE_CONCURRENCY` defaults to `64`,
|
|
895
|
+
`YOOL_TUPLE_CPU_QUOTA_PCT` defaults to `95`,
|
|
896
|
+
`YOOL_TUPLE_QUEUE_MAXSIZE` defaults to `8192`, and
|
|
897
|
+
`YOOL_TUPLE_COMPRESSION_THRESHOLD` defaults to `1024`. Adopters may lower these
|
|
898
|
+
values for laptops or CI runners, but must never raise per-yool CPU above 100.
|
|
899
|
+
|
|
900
|
+
#### Reference implementation (Python, POSIX)
|
|
901
|
+
|
|
902
|
+
```python
# guardrails/cpu_throttle.py
import contextlib
import os


@contextlib.contextmanager
def cpu_throttle(quota_pct: int):
    """Soft CPU throttle via niceness. For hard throttle, use cgroups (Linux)."""
    if quota_pct >= 100:
        yield
        return

    # Niceness mapping: 70% -> nice 5, 40% -> nice 10, 10% -> nice 15
    nice_delta = max(0, int(round((100 - quota_pct) / 6)))
    try:
        os.nice(nice_delta)
    except OSError:
        pass
    try:
        yield
    finally:
        # Lowering niceness again needs privileges (CAP_SYS_NICE on Linux);
        # an unprivileged worker stays at the raised value after the block.
        try:
            os.nice(-nice_delta)
        except OSError:
            pass
```
|
|
928
|
+
|
|
929
|
+
#### Stricter alternative (cgroups, Linux only)
|
|
930
|
+
|
|
931
|
+
```bash
|
|
932
|
+
cgcreate -g cpu:/yool-worker-${WORKER_ID}
|
|
933
|
+
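# cgroup v2 cpu.max takes "<quota_us> <period_us>": quota_pct% of one CPU per 100 ms period
echo "$((quota_pct * 1000)) 100000" > /sys/fs/cgroup/yool-worker-${WORKER_ID}/cpu.max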
|
|
934
|
+
cgexec -g cpu:/yool-worker-${WORKER_ID} python -m sendsprint.worker
|
|
935
|
+
```
|
|
936
|
+
|
|
937
|
+
#### macOS alternative
|
|
938
|
+
|
|
939
|
+
```bash
|
|
940
|
+
taskpolicy -c utility python -m sendsprint.worker
|
|
941
|
+
```
|
|
942
|
+
|
|
943
|
+
#### Enforcement points
|
|
944
|
+
|
|
945
|
+
1. **Worker startup**: read default `cpu_quota_pct` from project config (`.catalog/policy.yaml`).
|
|
946
|
+
2. **Per-tuple**: override with `agent_terms.cpu_quota_pct`.
|
|
947
|
+
3. **Per-yool**: implementation MAY further reduce (never raise).
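The three enforcement points reduce to a small resolution rule. The sketch below is one reading of them: the tuple value replaces the project default, the yool may only lower the result, and the final value is clamped to 0-100. The function name and parameters are hypothetical.

```python
from typing import Optional


def effective_cpu_quota(
    policy_default: int,
    tuple_quota: Optional[int] = None,
    yool_cap: Optional[int] = None,
) -> int:
    """Resolve the CPU quota a worker should enforce for one step."""
    # 1. Worker startup: project default from policy.
    # 2. Per-tuple: agent_terms.cpu_quota_pct overrides the default.
    quota = policy_default if tuple_quota is None else tuple_quota
    # 3. Per-yool: the implementation may reduce, never raise.
    if yool_cap is not None:
        quota = min(quota, yool_cap)
    # Never go outside 0-100, whatever the inputs say.
    return max(0, min(100, quota))
```

The resolved value is what gets passed to `cpu_throttle` before the yool's implementation runs.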
|
|
948
|
+
|
|
949
|
+
#### Test
|
|
950
|
+
|
|
951
|
+
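```python
def test_cpu_throttle_under_quota():
    # burn_cpu and measured_cpu_time are test helpers not shown here.
    # Niceness is only a scheduling hint, so this bound holds reliably only
    # under contention or with a hard cap such as the cgroup variant above.
    with cpu_throttle(50):
        burn_cpu(seconds=2)
    assert measured_cpu_time() < 1.5
```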
|
|
957
|
+
|
|
958
|
+
### 11.2 Disk GC (don't fill 100%)
|
|
959
|
+
|
|
960
|
+
#### Problem
|
|
961
|
+
|
|
962
|
+
Receipts, tuple logs, and cached artifacts grow without bound. A daily sprint with 100 items × 5 yools × 50 KB of artifacts writes 25 MB/day, or about 9 GB/year. Multiply by N projects.
|
|
963
|
+
|
|
964
|
+
#### Policy — three retention tiers
|
|
965
|
+
|
|
966
|
+
| Tier | What | Retention | Why |
|
|
967
|
+
|---|---|---|---|
|
|
968
|
+
| **hot** | Last N runs of receipts, tuple log, artifacts | default 30 days | active debugging, cache hits |
|
|
969
|
+
| **warm** | Receipts only (not artifacts), pointer index | default 365 days | replay + audit |
|
|
970
|
+
| **cold** | Hash + pointer record only (artifacts purged) | forever | provenance trail |
|
|
971
|
+
|
|
972
|
+
Receipts themselves are **never deleted**, only their **artifact bodies**. Preserves the immutable Merkle chain.
|
|
973
|
+
|
|
974
|
+
#### Reference implementation
|
|
975
|
+
|
|
976
|
+
```python
# guardrails/disk_gc.py
import json, os, pathlib
from datetime import datetime, timedelta, timezone

# _du_mb, _find_oldest_artifact, and _rotate_daily are helpers not shown here.


def gc_run(catalog_dir: pathlib.Path, hot_days: int = 30, warm_days: int = 365, max_total_mb: int = 5000):
    """
    Phase 1: artifact body purge for receipts older than hot_days.
    Phase 2: hard size cap (max_total_mb): purge oldest until under cap.
    Phase 3: rotate tuples.jsonl (daily file, gzip yesterday's).
    warm_days is accepted but not used by these three phases.
    """
    now = datetime.now(timezone.utc)
    hot_cutoff = now - timedelta(days=hot_days)

    receipts_dir = catalog_dir / "receipts"
    artifacts_dir = catalog_dir / "artifacts"

    purged_artifacts = 0
    purged_bytes = 0

    for receipt_file in receipts_dir.rglob("*.json"):
        r = json.loads(receipt_file.read_text())
        ts = datetime.fromisoformat(r["ended_at"])
        if ts.tzinfo is None:
            # Assumption: naive timestamps are UTC. Comparing a naive value
            # with the aware cutoff would raise TypeError.
            ts = ts.replace(tzinfo=timezone.utc)
        if ts < hot_cutoff:
            for art in r.get("artifacts", []):
                p = artifacts_dir / art["path"]
                if p.exists():
                    purged_bytes += p.stat().st_size
                    p.unlink()
                    purged_artifacts += 1
            # Keep the first purge time on later runs. Note: this rewrites a
            # content-addressed file, so the stored hash only stays valid if
            # receipt_id excludes this field.
            r.setdefault("artifacts_purged_at", now.isoformat())
            receipt_file.write_text(json.dumps(r, indent=2))

    total_mb = _du_mb(catalog_dir)
    while total_mb > max_total_mb:
        oldest = _find_oldest_artifact(artifacts_dir)
        if oldest is None:
            break
        purged_bytes += oldest.stat().st_size
        oldest.unlink()
        purged_artifacts += 1
        total_mb = _du_mb(catalog_dir)

    _rotate_daily(catalog_dir / "tuples.jsonl")

    return {
        "artifacts_purged": purged_artifacts,
        "bytes_freed": purged_bytes,
        "total_mb_after": _du_mb(catalog_dir),
    }
```
|
|
1027
|
+
|
|
1028
|
+
#### Schedule
|
|
1029
|
+
|
|
1030
|
+
```cron
|
|
1031
|
+
# Cron: nightly at 03:00
|
|
1032
|
+
0 3 * * * cd ~/Projetos/SendSprint && python -m sendsprint.gc --hot-days 30 --warm-days 365 --max-mb 5000
|
|
1033
|
+
```
|
|
1034
|
+
|
|
1035
|
+
#### Disk pressure circuit breaker
|
|
1036
|
+
|
|
1037
|
+
```python
# `bus`, `Tuple`, and `DiskPressure` come from the surrounding runtime and are
# not shown here. os.statvfs is available on Unix only.
def check_disk_pressure(catalog_dir: pathlib.Path, free_mb_floor: int = 1000):
    stat = os.statvfs(catalog_dir)
    free_mb = (stat.f_bavail * stat.f_frsize) / (1024 * 1024)
    if free_mb < free_mb_floor:
        bus.out_(Tuple(lane="gc.urgent", payload=[{"yool": "fs.gc.run", "args": {}}]))
        raise DiskPressure(f"free={free_mb:.0f}MB below floor={free_mb_floor}MB")
```
|
|
1045
|
+
|
|
1046
|
+
#### Test
|
|
1047
|
+
|
|
1048
|
+
```python
def test_gc_purges_warm_tier_artifacts(tmp_path):
    # setup_receipts is a fixture helper not shown here. The assertions below
    # imply one artifact per receipt: 50 inside hot_days and 50 older.
    setup_receipts(tmp_path, recent=50, old=50)
    result = gc_run(tmp_path, hot_days=30)
    assert result["artifacts_purged"] == 50
    # Receipts are never deleted, only their artifact bodies.
    assert len(list((tmp_path / "receipts").rglob("*.json"))) == 100
```
|
|
1055
|
+
|
|
1056
|
+
### 11.3 Memory & token guardrails
|
|
1057
|
+
|
|
1058
|
+
Every yool implementation MUST:
|
|
1059
|
+
|
|
1060
|
+
- Stream large inputs/outputs to disk rather than buffer.
|
|
1061
|
+
- Respect `agent_terms.budget_tokens` (LLM calls).
|
|
1062
|
+
- Emit incremental cost into receipt as work progresses.
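The first rule (stream rather than buffer) can be illustrated with a chunked writer that hashes as it goes, so the artifact never has to fit in memory. `stream_to_artifact` and its return shape are hypothetical names for this sketch.

```python
import hashlib
import pathlib
from typing import Iterable, Tuple


def stream_to_artifact(chunks: Iterable[bytes], dest: pathlib.Path) -> Tuple[str, int]:
    """Write chunks to dest as they arrive; return (sha256 hex, byte count)."""
    digest = hashlib.sha256()
    size = 0
    dest.parent.mkdir(parents=True, exist_ok=True)
    with dest.open("wb") as fh:
        for chunk in chunks:
            fh.write(chunk)
            digest.update(chunk)  # hash incrementally, no full-body buffer
            size += len(chunk)
    return digest.hexdigest(), size
```

The returned digest and size are the kind of values a receipt can record for the artifact without re-reading the file.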
|
|
1063
|
+
|
|
1064
|
+
---
|
|
1065
|
+
|
|
1066
|
+
## 12. Glossary
|
|
1067
|
+
|
|
1068
|
+
- **address space** — set of distinct identifiers a system can refer to.
|
|
1069
|
+
- **bitmap** — per-HAMT-node bitfield indicating populated child slots.
|
|
1070
|
+
- **collision** — two keys hashing to the same path beyond max trie depth.
|
|
1071
|
+
- **lane** — coarse-grained routing key on a tuple.
|
|
1072
|
+
- **leaf** — terminal HAMT node holding a key/value pair.
|
|
1073
|
+
- **map_pos** — semantic coordinates inside a tuple.
|
|
1074
|
+
- **opcode** — synonym for yool name.
|
|
1075
|
+
- **payload** — ordered list of yool invocations inside a tuple.
|
|
1076
|
+
- **popcount** — number of bits set in a HAMT node's bitmap.
|
|
1077
|
+
- **receipt** — immutable, content-addressed record of one yool execution.
|
|
1078
|
+
- **slot** — index into a HAMT node's children array.
|
|
1079
|
+
- **structural sharing** — persistent-data-structure update strategy.
|
|
1080
|
+
- **tuple space** — Linda-style coordination substrate.
|
|
1081
|
+
|
|
1082
|
+
---
|
|
1083
|
+
|
|
1084
|
+
## 13. Versioning
|
|
1085
|
+
|
|
1086
|
+
| Version | Date | Changes |
|
|
1087
|
+
|---|---|---|
|
|
1088
|
+
| v0.1 | 2026-05-19 | Initial draft |
|
|
1089
|
+
| v0.2 | 2026-05-19 | Expanded examples (SendSprint, llm-project-mapper); guardrails (§11); end-to-end cache/resume flows; HAMT lookup algorithm. |
|
|
1090
|
+
|
|
1091
|
+
---
|
|
1092
|
+
|
|
1093
|
+
## Appendix A — FAQ
|
|
1094
|
+
|
|
1095
|
+
**Q: Isn't this just Kafka + a registry?**
|
|
1096
|
+
A: Kafka is the bus, fine. The pattern adds: (1) HAMT-addressed catalog, (2) content-addressable receipts as cache keys, (3) tuple as canonical unit of work with budget/authority. Kafka alone gives transport, not addressing.
|
|
1097
|
+
|
|
1098
|
+
**Q: Why blake2b for hashing instead of sha256?**
|
|
1099
|
+
A: HAMT addressing benefits from speed over cryptographic strength. blake2b-64 is faster than sha256 and 30 bits suffices for catalog sizes under ~1M. Receipts use sha256 because content-addressing needs collision resistance.
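How a 64-bit blake2b digest turns into trie slots can be sketched as follows. The 32-way branching matches the `7/32` root popcount shown in the build log; which 30 bits are kept and the order in which levels consume them are assumptions of this sketch, not something this section specifies.

```python
import hashlib

BITS_PER_LEVEL = 5   # 32-way branching
LEVELS = 6           # 6 levels * 5 bits = 30 bits of hash consumed


def hash30(key: str) -> int:
    """Low 30 bits of a 64-bit blake2b digest (bit selection is an assumption)."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") & ((1 << 30) - 1)


def slots(key: str) -> list:
    """Slot index (0..31) at each trie level, root first."""
    h = hash30(key)
    return [(h >> (BITS_PER_LEVEL * level)) & 0x1F for level in range(LEVELS)]


def child_index(bitmap: int, slot: int) -> int:
    """Position in the compact children array: popcount of the bits below slot."""
    return bin(bitmap & ((1 << slot) - 1)).count("1")
```

`child_index` is what lets a node store only its populated children: the bitmap records which of the 32 slots exist, and the popcount of the lower bits gives the array offset.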
|
|
1100
|
+
|
|
1101
|
+
**Q: How does this interact with my existing MCP server?**
|
|
1102
|
+
A: Your MCP server becomes the **edge** in §3.1. Expose `catalog.lookup`, `tuple.dispatch`, `tuple.observe`. Don't expose `tuple.in/out`.
|
|
1103
|
+
|
|
1104
|
+
**Q: How do I migrate without breaking prod?**
|
|
1105
|
+
A: Follow the §8 playbook. The dual-read step is key: the catalog runs alongside the old dict and every lookup is diffed. Once the diff count has stayed at zero for N days, flip the switch.
|
|
1106
|
+
|
|
1107
|
+
**Q: What about distributed workers?**
|
|
1108
|
+
A: The tuple space scales horizontally: workers on different hosts share the bus (Kafka, Redis Streams, NATS). The HAMT itself is immutable, so replicas can be read without coordination.
|
|
1109
|
+
|
|
1110
|
+
**Q: Why force guardrails (§11) if my project is small?**
|
|
1111
|
+
A: Small projects grow. Guardrail cost is low (<=200 LOC for both). Cost of retrofit after a runaway agent fries a laptop is high.
|
|
1112
|
+
|
|
1113
|
+
---
|
|
1114
|
+
|
|
1115
|
+
## Appendix B — Diagram Index
|
|
1116
|
+
|
|
1117
|
+
- §3.1 — Static layout (3 layers)
|
|
1118
|
+
- §3.2 — Dynamic flow (one item end-to-end)
|
|
1119
|
+
- §3.3 — Failure & resume
|
|
1120
|
+
- §3.4 — HAMT trie insertion
|
|
1121
|
+
- §3.5 — Linda tuple-space timeline
|
|
1122
|
+
|
|
1123
|
+
---
|
|
1124
|
+
|
|
1125
|
+
## Appendix C — Quick Vendor Instructions
|
|
1126
|
+
|
|
1127
|
+
```bash
|
|
1128
|
+
# Vendor the spec
|
|
1129
|
+
curl -L https://raw.githubusercontent.com/wesleysimplicio/simplicio-prompt/main/YOOL_TUPLE_HAMT.md \
|
|
1130
|
+
-o YOOL_TUPLE_HAMT.md
|
|
1131
|
+
|
|
1132
|
+
# Or as a submodule (read-only consumer)
|
|
1133
|
+
git submodule add https://github.com/wesleysimplicio/simplicio-prompt vendor/simplicio-prompt
|
|
1134
|
+
|
|
1135
|
+
# Copy reference impls
|
|
1136
|
+
cp vendor/simplicio-prompt/kernel/yool_tuple_kernel.py src/kernel/
|
|
1137
|
+
cp vendor/simplicio-prompt/scripts/build_hamt.py scripts/
|
|
1138
|
+
cp vendor/simplicio-prompt/guardrails/cpu_throttle.py src/guardrails/
|
|
1139
|
+
cp vendor/simplicio-prompt/guardrails/disk_gc.py src/guardrails/
|
|
1140
|
+
cp vendor/simplicio-prompt/prompts/agent-runtime-execution-prompt.md prompts/
|
|
1141
|
+
|
|
1142
|
+
# Generate catalog
|
|
1143
|
+
python scripts/build_hamt.py --source AGENTS.md --output .catalog/capabilities.json
|
|
1144
|
+
|
|
1145
|
+
# Wire workers to use catalog.lookup + receipts + guardrails
|
|
1146
|
+
python vendor/simplicio-prompt/kernel/yool_tuple_kernel.py
|
|
1147
|
+
|
|
1148
|
+
# Add GC schedule (cron / launchd / systemd)
|
|
1149
|
+
```
|