clearctx 3.0.0

package/STRATEGY.md ADDED
@@ -0,0 +1,485 @@
+ # Multi-Session Strategy Guide
+
+ This guide teaches Claude Code WHEN and HOW to use the multi-session tools effectively.
+ Place this content in your project's `CLAUDE.md` or reference it as a system prompt.
+
+ ---
+
+ ## When to Use Multi-Session
+
+ ### DO delegate when:
+ - **Task has 3+ independent parts** — build login, signup, and reset in parallel
+ - **Task is too large for one context** — full feature implementation with tests
+ - **You need different models for different subtasks** — haiku for simple lookups, opus for architecture
+ - **You want to try multiple approaches** — fork and compare
+ - **Task involves risky operations** — delegate with safety limits instead of doing it directly
+
+ ### DON'T delegate when:
+ - **Task is small** — a few lines of code, a quick fix
+ - **Task needs tight coordination** — changes that must happen atomically across files
+ - **You're just reading/exploring** — use Read, Grep, Glob directly
+
+ ---
+
+ ## Decision Framework
+
+ ```
+ Is the task simple (< 5 minutes, < 3 files)?
+   → YES: Do it yourself. No delegation needed.
+   → NO: Continue...
+
+ Can the task be split into independent parts?
+   → YES: Delegate each part in parallel.
+   → NO: Delegate as one task, use continue_task for corrections.
+
+ Does the task need special safety limits?
+   → YES: Use delegate_task with max_cost and max_turns.
+   → NO: Use spawn_session + send_message for direct control.
+ ```
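+
+ To make the two delegation styles concrete, here is a minimal sketch (the tool names and the `max_cost`/`max_turns` limits come from this guide; the object-style argument shape is an assumption):
+
+ ```javascript
+ // Path 1: delegate with safety limits; the session stops on its own
+ // when it hits either cap.
+ delegate_task({
+   name: "risky-migration",
+   task: "Rewrite the DB migration scripts for the v2 schema",
+   max_cost: 1.50,  // dollars
+   max_turns: 30
+ })
+
+ // Path 2: direct control. Spawn a session, then drive it message by message.
+ spawn_session({ name: "helper" })
+ send_message({ name: "helper", message: "Summarize every TODO in src/" })
+ ```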
+
+ ---
+
+ ## Task Decomposition
+
+ When breaking a large task into subtasks:
+
+ 1. **Identify independent pieces** — parts that don't depend on each other
+ 2. **Identify dependencies** — part B needs part A's output
+ 3. **Assign models** — haiku for simple, sonnet for standard, opus for complex
+ 4. **Set budgets** — split total budget proportionally
+
+ **Example: "Build the authentication system"**
+ ```
+ Independent (run in parallel):
+ - child-1 (sonnet): "Build login endpoint with JWT"
+ - child-2 (sonnet): "Build signup endpoint with validation"
+ - child-3 (sonnet): "Build password reset with email tokens"
+
+ Dependent (run after all above complete):
+ - child-4 (sonnet): "Write integration tests for auth system"
+ ```
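+
+ Issued as actual delegations, the decomposition above might look like this (a sketch: the `model` parameter name and the object-argument shape are assumptions; the budgets follow the allocation guidance below):
+
+ ```javascript
+ // Independent pieces: fire all three in parallel.
+ delegate_task({ name: "child-1", task: "Build login endpoint with JWT", model: "sonnet", max_cost: 1.00 })
+ delegate_task({ name: "child-2", task: "Build signup endpoint with validation", model: "sonnet", max_cost: 1.00 })
+ delegate_task({ name: "child-3", task: "Build password reset with email tokens", model: "sonnet", max_cost: 1.00 })
+
+ // Dependent piece: run only after all three report completed,
+ // and pass their results into its context.
+ delegate_task({ name: "child-4", task: "Write integration tests for the auth system", model: "sonnet", max_cost: 1.00 })
+ ```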
+
+ ---
+
+ ## Model Selection Guide
+
+ | Task Type | Model | Why |
+ |-----------|-------|-----|
+ | Simple lookups, counting, listing | haiku | Fast and cheap |
+ | Standard code edits, bug fixes | sonnet | Good balance |
+ | Architecture decisions, complex refactoring | opus | Best reasoning |
+ | Code review, analysis | sonnet | Good enough |
+ | Writing tests | sonnet | Reliable |
+
+ ---
+
+ ## Budget Allocation
+
+ When delegating a complex task with a total budget:
+
+ | Allocation | Percentage | Purpose |
+ |------------|-----------|---------|
+ | Main work | 60% | The core implementation |
+ | Tests | 20% | Writing and running tests |
+ | Reserve | 20% | Corrections, follow-ups, retries |
+
+ **Example:** $5 total budget for 3 parallel tasks
+ - Each of the 3 parallel tasks gets $1.00 (60% = $3.00 total)
+ - The tests task gets $1.00 (20%)
+ - Reserve $1.00 for corrections (20%)
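+
+ The split is mechanical enough to compute directly. A minimal sketch that reproduces the $5 example:
+
+ ```javascript
+ // 60/20/20 split from the table above.
+ function splitBudget(total, parallelTasks) {
+   return {
+     perTask: (total * 0.6) / parallelTasks,  // each parallel worker's cap
+     tests: total * 0.2,
+     reserve: total * 0.2
+   };
+ }
+ // splitBudget(5, 3) → { perTask: 1, tests: 1, reserve: 1 }
+ ```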
+
+ ---
+
+ ## The Correction Loop
+
+ After EVERY delegation, evaluate the result:
+
+ ```
+ 1. delegate_task → get result
+ 2. Read result.status:
+    - "completed" → Read result.response, evaluate quality
+    - "failed" → Read result.error, decide: retry or different approach
+    - "cost_exceeded" → Budget was too low. Increase or reduce scope.
+    - "turns_exceeded" → Task too complex. Break it down further.
+ 3. Is the result good?
+    - YES → finish_task
+    - NO → continue_task with specific corrections
+    - HOPELESS → abort_task, try a different approach
+ ```
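+
+ Written out as code, the loop might look like this (the statuses and fields match the list above; `meetsRequirements` stands in for your own quality check):
+
+ ```javascript
+ const result = delegate_task({ name: "auth-api", task: "Build auth endpoints", max_cost: 2.00 });
+
+ if (result.status === "completed") {
+   if (meetsRequirements(result.response)) {  // placeholder for your evaluation
+     finish_task({ name: "auth-api" });
+   } else {
+     continue_task({
+       name: "auth-api",
+       message: "Login endpoint returns 200 on bad passwords; return 401 instead."
+     });
+   }
+ } else if (result.status === "cost_exceeded") {
+   // Budget was too low: raise max_cost, or reduce scope and retry.
+ } else if (result.status === "turns_exceeded") {
+   // Too complex for one session: break the task down further.
+ } else {
+   // "failed": read result.error, then retry or abort_task and change approach.
+ }
+ ```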
+
+ ### Writing Good Corrections
+
+ **Bad correction:** "Fix it"
+ **Good correction:** "The login function is missing password hashing. Add bcrypt hashing before saving to the database. Use 12 salt rounds."
+
+ Be specific about:
+ - WHAT is wrong
+ - WHERE in the code
+ - HOW to fix it
+
+ ---
+
+ ## Parallel Coordination
+
+ When running multiple child sessions that work on related code:
+
+ 1. **Spawn all independent tasks** in parallel
+ 2. **Wait for all results**
+ 3. **Evaluate each result** independently
+ 4. **Send corrections** to any that need fixing (in parallel)
+ 5. **If task B needs output from task A**, read A's result and include relevant details in B's context
+
+ **Example:**
+ ```
+ # Step 1: Parallel delegation
+ delegate_task(name="auth-api", task="Build auth endpoints")
+ delegate_task(name="auth-ui", task="Build login screen")
+
+ # Step 2: Read results
+ # auth-api completed: "Created /api/login and /api/register"
+ # auth-ui completed but used wrong endpoint path
+
+ # Step 3: Correct auth-ui with info from auth-api
+ continue_task(name="auth-ui", message="The API endpoint is /api/login not /login. Update the fetch URL.")
+ ```
+
+ ---
+
+ ## Session Recovery
+
+ If a conversation ends and you start a new one:
+
+ 1. **Always check for existing sessions first:** `list_sessions`
+ 2. **Resume important sessions:** `resume_session`
+ 3. **Clean up finished work:** `delete_session` for completed tasks
+ 4. **Sessions persist across conversations** — the data is saved to disk
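+
+ A minimal sketch of that recovery pass (assuming `list_sessions` returns name/status records; the exact shape is an assumption):
+
+ ```javascript
+ const sessions = list_sessions();
+ for (const s of sessions) {
+   if (s.status === "completed") {
+     delete_session({ name: s.name });   // clean up finished work
+   } else {
+     resume_session({ name: s.name });   // pick up where it left off
+   }
+ }
+ ```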
+
+ ---
+
+ ## Safety Presets Quick Reference
+
+ | Preset | Use When |
+ |--------|----------|
+ | `read-only` | Analyzing code, gathering information, code review |
+ | `review` | Same as read-only |
+ | `edit` | Normal development work (DEFAULT — use this most of the time) |
+ | `full` | Trusted tasks that need unrestricted access (use sparingly) |
+ | `plan` | Exploring architecture, planning (no file modifications) |
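+
+ A preset would typically be chosen at delegation time. A sketch, assuming the parameter is literally named `preset`:
+
+ ```javascript
+ // Analysis only: the session cannot modify files.
+ delegate_task({ name: "reviewer", task: "Review src/auth for security issues", preset: "read-only" })
+
+ // Normal development work: the default.
+ delegate_task({ name: "builder", task: "Implement the signup endpoint", preset: "edit" })
+ ```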
+
+ ---
+
+ ## Anti-Patterns (What NOT to do)
+
+ 1. **Don't delegate trivial tasks** — the overhead of spawning a session isn't worth it for simple fixes
+ 2. **Don't use the `full` preset by default** — use `edit` and escalate only if needed
+ 3. **Don't ignore failed results** — always check status and handle errors
+ 4. **Don't set budgets too low** — a realistic task needs $0.50–$2.00; complex tasks need more
+ 5. **Don't send vague corrections** — be specific about what's wrong and how to fix it
+ 6. **Don't forget to finish_task** — sessions left merely stopped keep consuming storage
+ 7. **Don't let parallel sessions edit the same file** — they'll conflict
+
+ ---
+
+ # Team Hub v2 Strategy — From Conversations to Transactions
+
+ ## The Positioning Statement
+
+ **claude-multi-session is the only multi-agent coordination system for Claude Code that replaces conversation with transactions.**
+
+ Agents exchange **versioned artifacts**, not messages. Any agent can assign work to any other agent — the team **self-organizes** without a central bottleneck. Workflows **auto-heal** through reactive pipelines. Data lineage tracks exactly how every output was derived. And the entire coordination state can be **rolled back and replayed** — something no conversational system can ever do.
+
+ Every competitor is racing to make agents talk better. We're building a system where agents don't need to talk at all — and don't need a boss to tell them what to do next.
+
+ ---
+
+ ## The Flywheel: How All Three Layers Reinforce Each Other
+
+ ```
+ ┌────────────────────────────────────────────────────────────────┐
+ │ Layer 2: Artifacts (versioned, immutable)                      │
+ │   │                                                            │
+ │   ├──► Layer 3a: Lineage Graph tracks how artifacts            │
+ │   │    relate to each other                                    │
+ │   │      │                                                     │
+ │   │      └──► Knows which artifacts are STALE when             │
+ │   │           sources change                                   │
+ │   │             │                                              │
+ │   │             └──► Layer 3b: Reactive Pipelines              │
+ │   │                  auto-trigger regeneration                 │
+ │   │                    │                                       │
+ │   │                    └──► Self-healing CI loops              │
+ │   │                         without chat                       │
+ │   │                                                            │
+ │   └──► Layer 3c: Snapshots capture the full state at           │
+ │        any point in time                                       │
+ │          │                                                     │
+ │          └──► Replay re-executes from snapshots with           │
+ │               overrides                                        │
+ │                 │                                              │
+ │                 └──► Lineage shows what changed between runs   │
+ │                        │                                       │
+ │                        └──► Artifacts store both runs'         │
+ │                             outputs for comparison             │
+ │                                                                │
+ │ The cycle:                                                     │
+ │   Build → Track lineage → React to changes → Snapshot          │
+ │   → Replay if needed → Lineage shows delta                     │
+ │                                                                │
+ │ Competitors can't replicate this because step 1 (versioned     │
+ │ artifacts) doesn't exist in their systems. Every Layer 3       │
+ │ feature depends on Layer 2 existing first.                     │
+ └────────────────────────────────────────────────────────────────┘
+ ```
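+
+ To ground the diagram: everything in the flywheel starts from versioned publishing. A hypothetical publish call might look like this (`artifact_publish` is an illustrative name; the `derivedFrom` relationship and the artifact IDs match those used later in this document):
+
+ ```javascript
+ // Publishing a new immutable version with explicit provenance.
+ artifact_publish({
+   id: "test-results-auth",        // same id ⇒ new version, old versions kept
+   type: "test-results",
+   data: { passed: 41, failed: 2 },
+   derivedFrom: ["api-contract-user-auth@v2"]
+ })
+ // Because provenance is explicit, the lineage graph can mark
+ // test-results-auth as STALE as soon as api-contract-user-auth@v3 appears,
+ // which is exactly what reactive pipelines trigger on.
+ ```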
+
+ ---
+
+ ## Why Competitors Can't Replicate This
+
+ ### Conversational Systems (Agent Teams, claude-flow, CrewAI, AutoGen)
+
+ **What they do:**
+ - Agents talk in natural language
+ - Messages pile up in shared context
+ - Orchestrator summarizes and routes
+ - Quality degrades over time
+
+ **Their limitations:**
+
+ 1. **No data provenance:** Can't answer "how was this output derived?" without reading conversation history
+ 2. **No impact analysis:** Can't answer "what breaks if I change X?" without asking every agent
+ 3. **No snapshot/replay:** Can't roll back to a previous state — conversation is linear, irreversible
+ 4. **No self-healing:** When tests fail, someone has to notice and manually retry
+ 5. **No peer-to-peer:** Only the orchestrator can assign work (hub-and-spoke bottleneck)
+
+ **Why they can't bolt this on:**
+
+ To add lineage tracking, they'd need versioned artifacts. To add impact analysis, they'd need a dependency graph. To add snapshot/replay, they'd need a state machine. But all of these require **structured, transactional coordination** — which means rebuilding their entire coordination layer.
+
+ ---
+
+ ### The Moat: Each Layer 3 Feature Requires Layer 2
+
+ | Layer 3 Feature | Why it depends on Layer 2 |
+ |----------------|---------------------------|
+ | **Lineage Graph** | Needs immutable artifact versions and explicit `derivedFrom` relationships |
+ | **Impact Analysis** | Needs a dependency graph built from artifact relationships |
+ | **Staleness Detection** | Needs version numbers to compare (derived from v1, but v2 exists) |
+ | **Reactive Pipelines** | Needs structured events (artifact_published, contract_completed) with typed data |
+ | **Self-Healing CI** | Needs the contract state machine (reopen → retry) and artifact validation (test-results schema) |
+ | **Snapshots** | Needs serializable state (contracts + artifacts + pipelines) |
+ | **Rollback** | Needs immutable version files (can restore without losing data) |
+ | **Replay with Overrides** | Needs contract inputs to be structured and modifiable |
+
+ Competitors would need to:
+ 1. Add versioned artifacts (breaking change to their data model)
+ 2. Add contract state machines (breaking change to their coordination model)
+ 3. Add structured schema validation (breaking change to their output format)
+ 4. Add lineage tracking (requires rewriting artifact publish logic)
+ 5. Add reactive pipelines (requires event system + rule engine)
+ 6. Add snapshot/replay (requires full state serialization)
+
+ This is not "add a feature" — this is "rebuild the entire system."
+
+ ---
+
+ ## When to Use Team Hub v2
+
+ ### Use Team Hub when:
+
+ 1. **You have a multi-step workflow with dependencies**
+    - Example: Schema → API → Tests
+    - Why: Contracts auto-resolve as artifacts are published
+
+ 2. **You want self-healing behavior**
+    - Example: When tests fail, auto-reopen the API contract
+    - Why: Reactive pipelines handle this without chat
+
+ 3. **You need data provenance**
+    - Example: "How was this test result derived?"
+    - Why: Lineage graph tracks the full chain
+
+ 4. **You want to try multiple approaches**
+    - Example: Build with REST, then replay with GraphQL
+    - Why: Snapshots + replay let you compare both runs
+
+ 5. **You have peer-to-peer coordination needs**
+    - Example: QA finds a bug and assigns it directly to the backend dev
+    - Why: Any session can create contracts for any other session
+
+ ### Don't use Team Hub when:
+
+ 1. **You have a single, linear task** — just use delegate_task
+ 2. **You're just exploring/reading code** — use Grep/Glob directly
+ 3. **The task is too small** — overhead of contracts isn't worth it
+
+ ---
+
+ ## Team Strategy Patterns
+
+ ### Pattern 1: Linear Chain (Schema → API → Tests)
+
+ **Setup:**
+ ```javascript
+ contract_create("setup-schema", {
+   assignee: "db-dev",
+   expectedOutputs: [{ artifactType: "schema-change" }]
+ })
+
+ contract_create("build-api", {
+   assignee: "api-dev",
+   dependencies: [{ type: "contract", contractId: "setup-schema" }],
+   expectedOutputs: [{ artifactType: "api-contract" }]
+ })
+
+ contract_create("write-tests", {
+   assignee: "qa-dev",
+   dependencies: [{ type: "contract", contractId: "build-api" }],
+   expectedOutputs: [{ artifactType: "test-results" }]
+ })
+ ```
+
+ **What happens:**
+ - setup-schema is `ready` immediately
+ - build-api is `pending` until setup-schema completes
+ - write-tests is `pending` until build-api completes
+ - Each session auto-starts when its contract becomes `ready`
+ - No orchestrator involvement after initial setup
+
+ ### Pattern 2: Self-Healing CI Loop
+
+ **Setup:**
+ ```javascript
+ pipeline_create("ci-loop", {
+   rules: [
+     {
+       trigger: { type: "artifact_published", artifactType: "api-contract" },
+       action: {
+         type: "notify_session",
+         target: "qa-dev",
+         message: "API contract updated — re-run tests"
+       }
+     },
+     {
+       trigger: { type: "artifact_published", artifactType: "test-results" },
+       condition: "data.failed > 0",
+       action: {
+         type: "reopen_contract",
+         contractId: "build-api",
+         reason: "Tests failing: ${data.failed} failures"
+       }
+     }
+   ]
+ })
+ ```
+
+ **What happens:**
+ - API dev publishes api-contract → QA gets notified → re-runs tests
+ - If tests fail → contract auto-reopens → API dev gets notification → fixes → publishes new version
+ - If tests pass → contract auto-completes → done
+ - Zero human intervention, zero orchestrator messages
+
+ ### Pattern 3: Peer-to-Peer Bug Assignment
+
+ **Scenario:** QA discovers a bug during testing
+
+ **Traditional approach (hub-and-spoke):**
+ ```
+ QA → tells orchestrator → orchestrator assigns to api-dev → api-dev works on it
+ ```
+
+ **Team Hub v2 approach (peer-to-peer):**
+ ```javascript
+ // QA creates a contract directly for api-dev
+ contract_create("fix-sql-injection", {
+   assigner: "qa-dev",
+   assignee: "api-dev",
+   title: "Fix SQL injection in login endpoint",
+   inputs: { context: "Parameterized queries missing in /login handler" },
+   expectedOutputs: [{ artifactType: "api-contract", required: true }]
+ })
+ // → api-dev gets inbox: "contract_ready" from qa-dev (not the orchestrator)
+ // → Orchestrator is also notified (broadcast on contract creation) but not involved
+ ```
+
+ **What happens:**
+ - QA assigns work directly to api-dev without going through the orchestrator
+ - api-dev starts work immediately
+ - When api-dev publishes the fix, the CI pipeline auto-triggers QA's tests
+ - Team self-organizes without a bottleneck
+
+ ### Pattern 4: Snapshot → Try Different Approach → Compare
+
+ **Scenario:** Auth feature is done with REST, want to try GraphQL
+
+ ```bash
+ # Take snapshot of completed work
+ team_snapshot "auth-rest-complete" --label "Auth feature done with REST"
+
+ # Replay from the earlier "pre-work" snapshot with a GraphQL override
+ team_replay "pre-work" --overrides '{
+   "build-api": {
+     "inputs": { "context": "Use GraphQL instead of REST" }
+   }
+ }'
+
+ # Both runs' artifacts are preserved:
+ # - api-contract-user-auth@v2 (REST approach)
+ # - api-contract-user-auth@v3 (GraphQL approach)
+ # - test-results-auth@v1 (REST tests)
+ # - test-results-auth@v2 (GraphQL tests)
+
+ # Compare test results
+ artifact_get api-contract-user-auth --version 2   # REST
+ artifact_get api-contract-user-auth --version 3   # GraphQL
+ artifact_get test-results-auth --version 1        # REST tests
+ artifact_get test-results-auth --version 2        # GraphQL tests
+ ```
+
+ **What happens:**
+ - Original work is preserved
+ - Entire workflow re-executes with different parameters
+ - Both approaches' outputs are stored for comparison
+ - Lineage shows how each output was derived
+
+ ---
+
+ ## Budget Strategy for Teams
+
+ When using Team Hub, your budget is split across:
+ 1. **Contract execution** (sessions doing the work)
+ 2. **Orchestrator monitoring** (checking contract status, handling failures)
+ 3. **Reserve** (corrections, retries)
+
+ **Example:** $10 total budget for a 3-agent team building an auth feature
+
+ | Allocation | Amount | Purpose |
+ |------------|--------|---------|
+ | db-dev contract | $2.00 | Schema design + migrations |
+ | api-dev contract | $4.00 | API implementation (most complex) |
+ | qa-dev contract | $2.00 | Test suite |
+ | Orchestrator | $1.00 | Set up contracts, monitor, handle failures |
+ | Reserve | $1.00 | Retries when tests fail |
+
+ **Key insight:** With reactive pipelines, the orchestrator's budget is tiny — it only intervenes on failures. Most work happens autonomously.
+
+ ---
+
+ ## Quality Metrics: Why Transactional Beats Conversational
+
+ | Metric | Conversational | Transactional (Team Hub v2) |
+ |--------|---------------|---------------------------|
+ | **Precision** | Degrades (summaries lose detail) | Stays consistent (artifacts are exact) |
+ | **Traceability** | Poor (read conversation history) | Perfect (lineage graph) |
+ | **Repeatability** | Impossible (can't replay) | Easy (snapshots + replay) |
+ | **Self-healing** | Manual (human notices failures) | Automatic (reactive pipelines) |
+ | **Context bloat** | Grows (messages pile up) | Constant (artifacts are versioned, not accumulated) |
+ | **Coordination overhead** | High (orchestrator routes everything) | Low (peer-to-peer contracts) |
+
+ **Example:**
+
+ Conversational system after 10 iterations:
+ ```
+ Orchestrator context: 50K tokens
+ - Messages from all agents
+ - Summaries of what was done
+ - Repeated explanations
+ - Lost precision from compression
+ ```
+
+ Team Hub v2 after 10 iterations:
+ ```
+ Orchestrator context: 5K tokens
+ - Contract statuses
+ - Artifact IDs
+ - Pipeline logs
+ - Full precision preserved in artifacts
+ ```
+
+ ---
+
+ ## The Pitch: Why This Wins
+
+ > "Every multi-agent system relies on conversation. Agents talk, orchestrators summarize, quality degrades. Team Hub v2 replaces conversation with transactions. Agents publish versioned artifacts, create contracts for each other, and auto-resolve dependencies. The system heals itself when tests fail. You can roll back to any point and replay with different parameters. And data lineage tracks exactly how every output was derived. This is not incremental — this is a different paradigm. And because every Layer 3 feature requires Layer 2 to exist first, competitors can't bolt this on without rebuilding their entire coordination layer. That's the moat."