clearctx 3.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +71 -0
- package/LICENSE +21 -0
- package/README.md +1006 -0
- package/STRATEGY.md +485 -0
- package/bin/cli.js +1756 -0
- package/bin/continuity-hook.js +118 -0
- package/bin/mcp.js +27 -0
- package/bin/setup.js +929 -0
- package/package.json +56 -0
- package/src/artifact-store.js +710 -0
- package/src/atomic-io.js +99 -0
- package/src/briefing-generator.js +451 -0
- package/src/continuity-hooks.js +253 -0
- package/src/contract-store.js +525 -0
- package/src/decision-journal.js +229 -0
- package/src/delegate.js +348 -0
- package/src/dependency-resolver.js +453 -0
- package/src/diff-engine.js +473 -0
- package/src/file-lock.js +161 -0
- package/src/index.js +61 -0
- package/src/lineage-graph.js +402 -0
- package/src/manager.js +510 -0
- package/src/mcp-server.js +3501 -0
- package/src/pattern-registry.js +221 -0
- package/src/pipeline-engine.js +618 -0
- package/src/prompts.js +1217 -0
- package/src/safety-net.js +170 -0
- package/src/session-snapshot.js +508 -0
- package/src/snapshot-engine.js +490 -0
- package/src/stale-detector.js +169 -0
- package/src/store.js +131 -0
- package/src/stream-session.js +463 -0
- package/src/team-hub.js +615 -0
package/STRATEGY.md
ADDED

# Multi-Session Strategy Guide

This guide teaches Claude Code WHEN and HOW to use the multi-session tools effectively.
Place this content in your project's `CLAUDE.md` or reference it as a system prompt.

---

## When to Use Multi-Session

### DO delegate when:
- **Task has 3+ independent parts** — build login, signup, and reset in parallel
- **Task is too large for one context** — full feature implementation with tests
- **You need different models for different subtasks** — haiku for simple lookups, opus for architecture
- **You want to try multiple approaches** — fork and compare
- **Task involves risky operations** — delegate with safety limits instead of doing it directly

### DON'T delegate when:
- **Task is small** — a few lines of code, a quick fix
- **Task needs tight coordination** — changes that must happen atomically across files
- **You're just reading/exploring** — use Read, Grep, Glob directly

---

## Decision Framework

```
Is the task simple (< 5 minutes, < 3 files)?
  → YES: Do it yourself. No delegation needed.
  → NO: Continue...

Can the task be split into independent parts?
  → YES: Delegate each part in parallel.
  → NO: Delegate as one task, use continue_task for corrections.

Does the task need special safety limits?
  → YES: Use delegate_task with max_cost and max_turns.
  → NO: Use spawn_session + send_message for direct control.
```
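
The same framework condensed into code, as a minimal sketch: `estimate` and the returned labels are hypothetical names for illustration, not part of any API.

```javascript
// Minimal sketch of the decision framework above. `estimate` is a
// hypothetical summary of the task you form yourself; the labels map
// to the strategies described in this guide, not to real tool names.
function chooseStrategy(estimate) {
  if (estimate.minutes < 5 && estimate.files < 3) {
    return "do-it-yourself";       // simple: no delegation needed
  }
  if (estimate.independentParts >= 2) {
    return "parallel-delegation";  // one delegate_task per part
  }
  if (estimate.needsSafetyLimits) {
    return "delegate-with-limits"; // delegate_task + max_cost/max_turns
  }
  return "spawn-and-steer";        // spawn_session + send_message
}
```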

---

## Task Decomposition

When breaking a large task into subtasks:

1. **Identify independent pieces** — parts that don't depend on each other
2. **Identify dependencies** — part B needs part A's output
3. **Assign models** — haiku for simple, sonnet for standard, opus for complex
4. **Set budgets** — split total budget proportionally

**Example: "Build the authentication system"**
```
Independent (run in parallel):
  - child-1 (sonnet): "Build login endpoint with JWT"
  - child-2 (sonnet): "Build signup endpoint with validation"
  - child-3 (sonnet): "Build password reset with email tokens"

Dependent (run after all above complete):
  - child-4 (sonnet): "Write integration tests for auth system"
```
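
As a sketch in code, assuming `delegate_task` is exposed as an async function taking an options object (the call shape used informally throughout this guide; the real signature may differ):

```javascript
// Fan out the three independent parts, then run the dependent step.
const parts = [
  { name: "child-1", task: "Build login endpoint with JWT" },
  { name: "child-2", task: "Build signup endpoint with validation" },
  { name: "child-3", task: "Build password reset with email tokens" },
];

// Independent pieces run in parallel, all on sonnet.
const results = await Promise.all(
  parts.map((part) => delegate_task({ ...part, model: "sonnet" }))
);

// The dependent piece starts only after every independent part completes.
const tests = await delegate_task({
  name: "child-4",
  model: "sonnet",
  task: "Write integration tests for auth system",
});
```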

---

## Model Selection Guide

| Task Type | Model | Why |
|-----------|-------|-----|
| Simple lookups, counting, listing | haiku | Fast and cheap |
| Standard code edits, bug fixes | sonnet | Good balance |
| Architecture decisions, complex refactoring | opus | Best reasoning |
| Code review, analysis | sonnet | Good enough |
| Writing tests | sonnet | Reliable |
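
If you script your delegations, the table collapses to a lookup; the category names here are this guide's shorthand, not an API:

```javascript
// Default model per task category, mirroring the table above.
const MODEL_FOR = {
  lookup: "haiku",       // simple lookups, counting, listing
  edit: "sonnet",        // standard code edits, bug fixes
  architecture: "opus",  // architecture decisions, complex refactoring
  review: "sonnet",      // code review, analysis
  tests: "sonnet",       // writing tests
};
```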

---

## Budget Allocation

When delegating a complex task with a total budget:

| Allocation | Percentage | Purpose |
|------------|------------|---------|
| Main work | 60% | The core implementation |
| Tests | 20% | Writing and running tests |
| Reserve | 20% | Corrections, follow-ups, retries |

**Example:** $5 total budget for 3 parallel tasks
- Each task gets $1.00 (60% = $3.00 total)
- Tests task gets $1.00 (20%)
- Reserve $1.00 for corrections (20%)
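
The arithmetic as a helper, a minimal sketch of the 60/20/20 split:

```javascript
// Split a total budget 60/20/20 across main work, tests, and reserve.
function splitBudget(total, mainTaskCount = 1) {
  return {
    perMainTask: (total * 0.6) / mainTaskCount,
    tests: total * 0.2,
    reserve: total * 0.2,
  };
}

splitBudget(5, 3); // → { perMainTask: 1, tests: 1, reserve: 1 }
```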

---

## The Correction Loop

After EVERY delegation, evaluate the result:

```
1. delegate_task → get result
2. Read result.status:
   - "completed" → Read result.response, evaluate quality
   - "failed" → Read result.error, decide: retry or different approach
   - "cost_exceeded" → Budget was too low. Increase or reduce scope.
   - "turns_exceeded" → Task too complex. Break it down further.
3. Is the result good?
   - YES → finish_task
   - NO → continue_task with specific corrections
   - HOPELESS → abort_task, try a different approach
```
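
A sketch of that loop, assuming the tools are callable as async functions and that `evaluate` and `buildCorrection` stand in for your own judgment (both hypothetical):

```javascript
// Correction loop sketch: delegate, then evaluate and correct until done.
let result = await delegate_task({ name: "auth-api", task: "Build auth endpoints" });

for (let attempt = 0; attempt < 3; attempt += 1) {
  if (result.status === "cost_exceeded" || result.status === "turns_exceeded") {
    await abort_task({ name: "auth-api" }); // rescope or break the task down
    break;
  }
  if (result.status === "completed" && evaluate(result.response).good) {
    await finish_task({ name: "auth-api" });
    break;
  }
  // "failed" or low quality: send a specific correction and re-check.
  result = await continue_task({ name: "auth-api", message: buildCorrection(result) });
}
```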

### Writing Good Corrections

**Bad correction:** "Fix it"
**Good correction:** "The login function is missing password hashing. Add bcrypt hashing before saving to the database. Use 12 salt rounds."

Be specific about:
- WHAT is wrong
- WHERE in the code
- HOW to fix it
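
For instance, the good correction above, sent as a `continue_task` message (session name hypothetical):

```javascript
continue_task({
  name: "auth-api", // hypothetical session name
  message:
    "WHAT: the login function is missing password hashing. " +
    "WHERE: the /login handler, before the user record is saved. " +
    "HOW: hash with bcrypt using 12 salt rounds.",
});
```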

---

## Parallel Coordination

When running multiple child sessions that work on related code:

1. **Spawn all independent tasks** in parallel
2. **Wait for all results**
3. **Evaluate each result** independently
4. **Send corrections** to any that need fixing (in parallel)
5. **If task B needs output from task A**, read A's result and include relevant details in B's context

**Example:**
```
# Step 1: Parallel delegation
delegate_task(name="auth-api", task="Build auth endpoints")
delegate_task(name="auth-ui", task="Build login screen")

# Step 2: Read results
# auth-api completed: "Created /api/login and /api/register"
# auth-ui completed but used the wrong endpoint path

# Step 3: Correct auth-ui with info from auth-api
continue_task(name="auth-ui", message="The API endpoint is /api/login, not /login. Update the fetch URL.")
```

---

## Session Recovery

If a conversation ends and you start a new one:

1. **Always check for existing sessions first:** `list_sessions`
2. **Resume important sessions:** `resume_session`
3. **Clean up finished work:** `delete_session` for completed tasks
4. **Sessions persist across conversations** — the data is saved to disk
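
Sketched in code, assuming each listed session reports a `name` and `status` (field names assumed, not documented):

```javascript
// Recovery sketch: re-attach to whatever survived the last conversation.
const sessions = await list_sessions();

for (const session of sessions) {
  if (session.status === "completed") {
    await delete_session({ name: session.name }); // clean up finished work
  } else {
    await resume_session({ name: session.name }); // pick up where it left off
  }
}
```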

---

## Safety Presets Quick Reference

| Preset | Use When |
|--------|----------|
| `read-only` | Analyzing code, gathering information, code review |
| `review` | Same as `read-only` |
| `edit` | Normal development work (DEFAULT — use this most of the time) |
| `full` | Trusted tasks that need unrestricted access (use sparingly) |
| `plan` | Exploring architecture, planning (no file modifications) |
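
A sketch of applying a preset when delegating; the `preset` parameter name is an assumption for illustration, not a documented option:

```javascript
// Read-only analysis: the child session can inspect but not modify files.
delegate_task({
  name: "security-review",
  task: "Review the auth module for injection risks",
  preset: "read-only", // parameter name assumed
});
```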

---

## Anti-Patterns (What NOT to do)

1. **Don't delegate trivial tasks** — overhead of spawning a session isn't worth it for simple fixes
2. **Don't use `full` preset by default** — use `edit` and escalate only if needed
3. **Don't ignore failed results** — always check status and handle errors
4. **Don't set budgets too low** — a realistic task needs $0.50-2.00, complex tasks need more
5. **Don't send vague corrections** — be specific about what's wrong and how to fix it
6. **Don't forget to finish_task** — sessions left unfinished keep consuming stored data
7. **Don't let parallel sessions edit the same file** — they'll conflict

---

# Team Hub v2 Strategy — From Conversations to Transactions

## The Positioning Statement

**claude-multi-session is the only multi-agent coordination system for Claude Code that replaces conversation with transactions.**

Agents exchange **versioned artifacts**, not messages. Any agent can assign work to any other agent — the team **self-organizes** without a central bottleneck. Workflows **auto-heal** through reactive pipelines. Data lineage tracks exactly how every output was derived. And the entire coordination state can be **rolled back and replayed** — something no conversational system can ever do.

Every competitor is racing to make agents talk better. We're building a system where agents don't need to talk at all — and don't need a boss to tell them what to do next.

---

## The Flywheel: How All Three Layers Reinforce Each Other

```
┌──────────────────────────────────────────────────────────────┐
│  Layer 2: Artifacts (versioned, immutable)                   │
│      │                                                       │
│      ├──► Layer 3a: Lineage Graph tracks how artifacts       │
│      │    relate to each other                               │
│      │    │                                                  │
│      │    └──► Knows which artifacts are STALE when          │
│      │         sources change                                │
│      │         │                                             │
│      │         └──► Layer 3b: Reactive Pipelines             │
│      │              auto-trigger regeneration                │
│      │              │                                        │
│      │              └──► Self-healing CI loops               │
│      │                   without chat                        │
│      │                                                       │
│      └──► Layer 3c: Snapshots capture the full state at      │
│           any point in time                                  │
│           │                                                  │
│           └──► Replay re-executes from snapshots with        │
│                overrides                                     │
│                │                                             │
│                └──► Lineage shows what changed between runs  │
│                     │                                        │
│                     └──► Artifacts store both runs' outputs  │
│                          for comparison                      │
│                                                              │
│  The cycle:                                                  │
│  Build → Track lineage → React to changes → Snapshot         │
│  → Replay if needed → Lineage shows delta                    │
│                                                              │
│  Competitors can't replicate this because step 1 (versioned  │
│  artifacts) doesn't exist in their systems. Every Layer 3    │
│  feature depends on Layer 2 existing first.                  │
└──────────────────────────────────────────────────────────────┘
```

---

## Why Competitors Can't Replicate This

### Conversational Systems (Agent Teams, claude-flow, CrewAI, AutoGen)

**What they do:**
- Agents talk in natural language
- Messages pile up in shared context
- Orchestrator summarizes and routes
- Quality degrades over time

**Their limitations:**

1. **No data provenance:** Can't answer "how was this output derived?" without reading conversation history
2. **No impact analysis:** Can't answer "what breaks if I change X?" without asking every agent
3. **No snapshot/replay:** Can't roll back to a previous state — conversation is linear, irreversible
4. **No self-healing:** When tests fail, someone has to notice and manually retry
5. **No peer-to-peer:** Only the orchestrator can assign work (hub-and-spoke bottleneck)

**Why they can't bolt this on:**

To add lineage tracking, they'd need versioned artifacts. To add impact analysis, they'd need a dependency graph. To add snapshot/replay, they'd need a state machine. But all of these require **structured, transactional coordination** — which means rebuilding their entire coordination layer.

---

### The Moat: Each Layer 3 Feature Requires Layer 2

| Layer 3 Feature | Why it depends on Layer 2 |
|-----------------|---------------------------|
| **Lineage Graph** | Needs immutable artifact versions and explicit `derivedFrom` relationships |
| **Impact Analysis** | Needs a dependency graph built from artifact relationships |
| **Staleness Detection** | Needs version numbers to compare (derived from v1 but v2 exists) |
| **Reactive Pipelines** | Needs structured events (artifact_published, contract_completed) with typed data |
| **Self-Healing CI** | Needs a contract state machine (reopen → retry) and artifact validation (test-results schema) |
| **Snapshots** | Needs serializable state (contracts + artifacts + pipelines) |
| **Rollback** | Needs immutable version files (can restore without losing data) |
| **Replay with Overrides** | Needs contract inputs to be structured and modifiable |

Competitors would need to:
1. Add versioned artifacts (breaking change to their data model)
2. Add contract state machines (breaking change to their coordination model)
3. Add structured schema validation (breaking change to their output format)
4. Add lineage tracking (requires rewriting artifact publish logic)
5. Add reactive pipelines (requires event system + rule engine)
6. Add snapshot/replay (requires full state serialization)

This is not "add a feature" — this is "rebuild the entire system."

---

## When to Use Team Hub v2

### Use Team Hub when:

1. **You have a multi-step workflow with dependencies**
   - Example: Schema → API → Tests
   - Why: Contracts auto-resolve as artifacts are published

2. **You want self-healing behavior**
   - Example: When tests fail, auto-reopen the API contract
   - Why: Reactive pipelines handle this without chat

3. **You need data provenance**
   - Example: "How was this test result derived?"
   - Why: Lineage graph tracks the full chain

4. **You want to try multiple approaches**
   - Example: Build with REST, then replay with GraphQL
   - Why: Snapshots + replay let you compare both runs

5. **You have peer-to-peer coordination needs**
   - Example: QA finds a bug and assigns it directly to the backend dev
   - Why: Any session can create contracts for any other session

### Don't use Team Hub when:

1. **You have a single, linear task** — just use delegate_task
2. **You're just exploring/reading code** — use Grep/Glob directly
3. **The task is too small** — overhead of contracts isn't worth it

---

## Team Strategy Patterns

### Pattern 1: Linear Chain (Schema → API → Tests)

**Setup:**
```javascript
contract_create("setup-schema", {
  assignee: "db-dev",
  expectedOutputs: [{ artifactType: "schema-change" }],
});

contract_create("build-api", {
  assignee: "api-dev",
  dependencies: [{ type: "contract", contractId: "setup-schema" }],
  expectedOutputs: [{ artifactType: "api-contract" }],
});

contract_create("write-tests", {
  assignee: "qa-dev",
  dependencies: [{ type: "contract", contractId: "build-api" }],
  expectedOutputs: [{ artifactType: "test-results" }],
});
```

**What happens:**
- setup-schema is `ready` immediately
- build-api is `pending` until setup-schema completes
- write-tests is `pending` until build-api completes
- Each session auto-starts when its contract becomes `ready`
- No orchestrator involvement after initial setup

### Pattern 2: Self-Healing CI Loop

**Setup:**
```javascript
pipeline_create("ci-loop", {
  rules: [
    {
      trigger: { type: "artifact_published", artifactType: "api-contract" },
      action: {
        type: "notify_session",
        target: "qa-dev",
        message: "API contract updated — re-run tests",
      },
    },
    {
      trigger: { type: "artifact_published", artifactType: "test-results" },
      condition: "data.failed > 0",
      action: {
        type: "reopen_contract",
        contractId: "build-api",
        reason: "Tests failing: ${data.failed} failures",
      },
    },
  ],
});
```

**What happens:**
- API dev publishes api-contract → QA gets notified → re-runs tests
- If tests fail → contract auto-reopens → API dev gets notification → fixes → publishes new version
- If tests pass → contract auto-completes → done
- Zero human intervention, zero orchestrator messages
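
For concreteness, a failing test run published by QA might look like this; `artifact_publish` and its fields are assumptions mirroring the `artifact_get` CLI and the `data.failed` condition above:

```javascript
// Hypothetical publish call: with failed > 0, the second pipeline rule
// matches and auto-reopens the build-api contract.
artifact_publish({
  type: "test-results",
  name: "test-results-auth",
  data: { passed: 14, failed: 2 },
});
```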

### Pattern 3: Peer-to-Peer Bug Assignment

**Scenario:** QA discovers a bug during testing

**Traditional approach (hub-and-spoke):**
```
QA → tells orchestrator → orchestrator assigns to api-dev → api-dev works on it
```

**Team Hub v2 approach (peer-to-peer):**
```javascript
// QA creates a contract directly for api-dev
contract_create("fix-sql-injection", {
  assigner: "qa-dev",
  assignee: "api-dev",
  title: "Fix SQL injection in login endpoint",
  inputs: { context: "Parameterized queries missing in /login handler" },
  expectedOutputs: [{ artifactType: "api-contract", required: true }],
});
// → api-dev gets inbox: "contract_ready" from qa-dev (not the orchestrator)
// → Orchestrator is also notified (broadcast on contract creation) but not involved
```

**What happens:**
- QA assigns work directly to api-dev without going through the orchestrator
- api-dev starts work immediately
- When api-dev publishes the fix, the CI pipeline auto-triggers QA's tests
- The team self-organizes without a bottleneck

### Pattern 4: Snapshot → Try Different Approach → Compare

**Scenario:** Auth feature is done with REST; you want to try GraphQL

```bash
# Take a snapshot of the completed work
team_snapshot "auth-rest-complete" --label "Auth feature done with REST"

# Replay from the beginning with a GraphQL override
team_replay "pre-work" --overrides '{
  "build-api": {
    "inputs": { "context": "Use GraphQL instead of REST" }
  }
}'

# Both runs' artifacts are preserved:
# - api-contract-user-auth@v2 (REST approach)
# - api-contract-user-auth@v3 (GraphQL approach)
# - test-results-auth@v1 (REST tests)
# - test-results-auth@v2 (GraphQL tests)

# Compare test results
artifact_get api-contract-user-auth --version 2   # REST
artifact_get api-contract-user-auth --version 3   # GraphQL
artifact_get test-results-auth --version 1        # REST tests
artifact_get test-results-auth --version 2        # GraphQL tests
```

**What happens:**
- Original work is preserved
- The entire workflow re-executes with different parameters
- Both approaches' outputs are stored for comparison
- Lineage shows how each output was derived

---

## Budget Strategy for Teams

When using Team Hub, your budget is split across:
1. **Contract execution** (sessions doing the work)
2. **Orchestrator monitoring** (checking contract status, handling failures)
3. **Reserve** (corrections, retries)

**Example:** $10 total budget for a 3-agent team building an auth feature

| Allocation | Amount | Purpose |
|------------|--------|---------|
| db-dev contract | $2.00 | Schema design + migrations |
| api-dev contract | $4.00 | API implementation (most complex) |
| qa-dev contract | $2.00 | Test suite |
| Orchestrator | $1.00 | Set up contracts, monitor, handle failures |
| Reserve | $1.00 | Retries when tests fail |

**Key insight:** With reactive pipelines, the orchestrator's budget is tiny — it only intervenes on failures. Most work happens autonomously.

---

## Quality Metrics: Why Transactional Beats Conversational

| Metric | Conversational | Transactional (Team Hub v2) |
|--------|----------------|-----------------------------|
| **Precision** | Degrades (summaries lose detail) | Stays consistent (artifacts are exact) |
| **Traceability** | Poor (read conversation history) | Perfect (lineage graph) |
| **Repeatability** | Impossible (can't replay) | Easy (snapshots + replay) |
| **Self-healing** | Manual (human notices failures) | Automatic (reactive pipelines) |
| **Context bloat** | Grows (messages pile up) | Constant (artifacts are versioned, not accumulated) |
| **Coordination overhead** | High (orchestrator routes everything) | Low (peer-to-peer contracts) |

**Example:**

A conversational system after 10 iterations:
```
Orchestrator context: 50K tokens
- Messages from all agents
- Summaries of what was done
- Repeated explanations
- Lost precision from compression
```

Team Hub v2 after 10 iterations:
```
Orchestrator context: 5K tokens
- Contract statuses
- Artifact IDs
- Pipeline logs
- Full precision preserved in artifacts
```

---

## The Pitch: Why This Wins

> "Every multi-agent system relies on conversation. Agents talk, orchestrators summarize, quality degrades. Team Hub v2 replaces conversation with transactions. Agents publish versioned artifacts, create contracts for each other, and auto-resolve dependencies. The system heals itself when tests fail. You can roll back to any point and replay with different parameters. And data lineage tracks exactly how every output was derived. This is not incremental — this is a different paradigm. And because every Layer 3 feature requires Layer 2 to exist first, competitors can't bolt this on without rebuilding their entire coordination layer. That's the moat."