@covibes/zeroshot 3.0.0 → 4.0.0
- package/CHANGELOG.md +76 -0
- package/README.md +154 -108
- package/cli/index.js +97 -50
- package/cli/lib/update-checker.js +50 -4
- package/cluster-templates/base-templates/debug-workflow.json +232 -61
- package/cluster-templates/base-templates/full-workflow.json +387 -92
- package/cluster-templates/base-templates/single-worker.json +2 -1
- package/cluster-templates/base-templates/worker-validator.json +2 -1
- package/lib/docker-config.js +207 -0
- package/lib/settings.js +37 -0
- package/package.json +1 -1
- package/src/agent/agent-context-builder.js +37 -14
- package/src/agent/agent-lifecycle.js +85 -19
- package/src/agent/agent-task-executor.js +13 -12
- package/src/agent-wrapper.js +3 -0
- package/src/agents/git-pusher-agent.json +2 -2
- package/src/attach/socket-discovery.js +33 -8
- package/src/config-validator.js +643 -13
- package/src/isolation-manager.js +72 -89
- package/src/ledger.js +14 -0
- package/src/message-bus.js +5 -0
- package/src/orchestrator.js +78 -3
- package/src/status-footer.js +30 -5
- package/task-lib/attachable-watcher.js +69 -6
- package/task-lib/watcher.js +1 -2
package/CHANGELOG.md
CHANGED
|
@@ -1,3 +1,79 @@
|
|
|
1
|
+
# [4.0.0](https://github.com/covibes/zeroshot/compare/v3.1.0...v4.0.0) (2026-01-04)
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
### Bug Fixes
|
|
5
|
+
|
|
6
|
+
* adversarial tester condition and README accuracy ([c12109b](https://github.com/covibes/zeroshot/commit/c12109b5ee574301e472bd09ec7495f3a578dc36))
|
|
7
|
+
* **ci:** use correct agent state in status-footer test ([c6f54a8](https://github.com/covibes/zeroshot/commit/c6f54a89d91a621a8d92c1a21dfa796743e38cd2))
|
|
8
|
+
* **cli:** ensure PROCESS_SPAWNED sets EXECUTING_TASK state ([4c3cc9c](https://github.com/covibes/zeroshot/commit/4c3cc9c82b67513cf6ab5e5eca9de1b6d259a9d1))
|
|
9
|
+
* **ledger:** prevent write-after-close race condition ([6b64fcf](https://github.com/covibes/zeroshot/commit/6b64fcfa37a4396591599774c788a022cdbfb1e9))
|
|
10
|
+
* **release:** allow semantic-release to query remote tags ([0be475b](https://github.com/covibes/zeroshot/commit/0be475b264d400c6b504306e7c535b2736dfaaa1))
|
|
11
|
+
* **release:** explicitly fetch tags for semantic-release ([cecf735](https://github.com/covibes/zeroshot/commit/cecf7358d9091992d4c7a1191f874588ba7a592d))
|
|
12
|
+
* **tests:** ensure first-run tests are isolated from module cache ([e55dbe7](https://github.com/covibes/zeroshot/commit/e55dbe7255bab7cf3ec4ddefcc897ec71296a74a))
|
|
13
|
+
* **tests:** move env var and module setup to before() hook ([cf787ff](https://github.com/covibes/zeroshot/commit/cf787ff7453d1a65cbaaf98655606ccb38dea967))
|
|
14
|
+
* **tests:** use validateConfig for modelRules catch-all validation ([4092d78](https://github.com/covibes/zeroshot/commit/4092d78be5739f6a3ca4bc80b3dc25ea7c41f74d))
|
|
15
|
+
|
|
16
|
+
|
|
17
|
+
### chore
|
|
18
|
+
|
|
19
|
+
* bump version to 4.0.0 ([95844e8](https://github.com/covibes/zeroshot/commit/95844e8ffeee4d24dde56b084053d0cdcd30d3e9))
|
|
20
|
+
|
|
21
|
+
|
|
22
|
+
### Features
|
|
23
|
+
|
|
24
|
+
* **context:** enforce maximum informativeness, minimum verbosity ([f99a7b7](https://github.com/covibes/zeroshot/commit/f99a7b738214863744119a9b96a50590034299aa))
|
|
25
|
+
* **prompts:** add universal language/task support with LLM antipattern detection ([906102b](https://github.com/covibes/zeroshot/commit/906102b654914ccd73ebb8abaa121304ee4f347e))
|
|
26
|
+
|
|
27
|
+
|
|
28
|
+
### Performance Improvements
|
|
29
|
+
|
|
30
|
+
* **ci:** reduce matrix from 6 jobs to 1 (save ~90% minutes) ([cad652d](https://github.com/covibes/zeroshot/commit/cad652d22fdc24cf10efabf04e13902529c05b98))
|
|
31
|
+
* **validators:** remove relevance/notes fields to save tokens ([b775e5a](https://github.com/covibes/zeroshot/commit/b775e5a028475f2f11d3b87ec0202c4398100c1d))
|
|
32
|
+
|
|
33
|
+
|
|
34
|
+
### BREAKING CHANGES
|
|
35
|
+
|
|
36
|
+
* CREW_* env vars renamed to ZEROSHOT_*
|
|
37
|
+
|
|
38
|
+
🤖 Generated with [Claude Code](https://claude.com/claude-code)
|
|
39
|
+
|
|
40
|
+
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
41
|
+
* **prompts:** Validator prompts no longer include language-specific examples
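
The `CREW_*` to `ZEROSHOT_*` rename means variables set under the old prefix in a shell profile or CI config are presumably no longer read. A minimal sketch of a compatibility shim, assuming the rename is a plain prefix swap. The changelog does not list the individual variables, so `CREW_EXAMPLE` and the helper name `migrateEnv` are hypothetical:

```javascript
// Hypothetical migration shim: copy CREW_* variables to ZEROSHOT_* names.
// Assumes a plain prefix swap; verify each variable against the 4.0.0 docs.
function migrateEnv(env) {
  const migrated = { ...env };
  for (const [key, value] of Object.entries(env)) {
    if (!key.startsWith("CREW_")) continue;
    const renamed = "ZEROSHOT_" + key.slice("CREW_".length);
    // Never overwrite a value already set under the new name.
    if (!(renamed in migrated)) migrated[renamed] = value;
  }
  return migrated;
}

// CREW_EXAMPLE is a made-up name used only for illustration.
console.log(migrateEnv({ CREW_EXAMPLE: "1", PATH: "/usr/bin" }));
```

The old keys are left in place so that tooling still pinned to 3.x keeps working during the transition.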
|
|
42
|
+
|
|
43
|
+
# [3.1.0](https://github.com/covibes/zeroshot/compare/v3.0.0...v3.1.0) (2026-01-03)
|
|
44
|
+
|
|
45
|
+
|
|
46
|
+
### Bug Fixes
|
|
47
|
+
|
|
48
|
+
* **attach:** detect cluster IDs without prefix by checking clusters.json ([a3f3b3a](https://github.com/covibes/zeroshot/commit/a3f3b3a1c3de47333297b98327f36aefb36cb958))
|
|
49
|
+
* **cli:** use canonical AGENT_STATE constants for status footer ([ac53f83](https://github.com/covibes/zeroshot/commit/ac53f83b0af9a7f2de8264ca791457e4e0afca9a))
|
|
50
|
+
* **footer:** show agents during evaluating_logic, building_context, executing_task ([f3c3484](https://github.com/covibes/zeroshot/commit/f3c348400d4b2e960410121cef0614dc583e7528))
|
|
51
|
+
* handle Claude CLI lock contention in parallel validators ([b88d502](https://github.com/covibes/zeroshot/commit/b88d502699c8c0e628310a9b998c4a1f4cb26d1a))
|
|
52
|
+
* **orchestrator:** add missing close() method for test cleanup ([a642886](https://github.com/covibes/zeroshot/commit/a6428867647de4d5b28c61163be87d711e001c7c))
|
|
53
|
+
* **output:** broadcast text output, not just JSON ([adc8556](https://github.com/covibes/zeroshot/commit/adc8556a47e3d02ed7189d8290e9cf81a07c909c))
|
|
54
|
+
* **output:** change from MINIMAL to INFORMATIVE output style ([3b87466](https://github.com/covibes/zeroshot/commit/3b87466eb0012f089edac2fb66a4d118a39e92e0))
|
|
55
|
+
* **planner:** explicitly forbid Deferred and Why defer patterns ([0504b0a](https://github.com/covibes/zeroshot/commit/0504b0a5e4e4cd0c902989e29a3759fbc46aa534))
|
|
56
|
+
* **planner:** forbid scope reduction in planner prompt ([a9dbfb2](https://github.com/covibes/zeroshot/commit/a9dbfb2bef67a14289af91b0cea77880fe3eff3f))
|
|
57
|
+
* **planner:** prevent silent phase omission in scope reduction checks ([7e99787](https://github.com/covibes/zeroshot/commit/7e99787593cd970b45901f7cf1bf641bf4e5f772))
|
|
58
|
+
* **status-footer:** cleanup footer on stop regardless of hidden state ([52fe9e9](https://github.com/covibes/zeroshot/commit/52fe9e9efeae5c768291a9a0810399bfdae03934))
|
|
59
|
+
* **templates:** hardcode completion-detector model to haiku ([78b917e](https://github.com/covibes/zeroshot/commit/78b917e1c97697bf64d7a0897c2b68d8bb0bbaa3))
|
|
60
|
+
* **tests:** set ZEROSHOT_WORKTREE env in git-safety-hook tests ([7399cfc](https://github.com/covibes/zeroshot/commit/7399cfca1d42f748314db355eb247e426fad97a2))
|
|
61
|
+
* **tests:** skip isolation tests when Docker image unavailable ([142f43c](https://github.com/covibes/zeroshot/commit/142f43c6af6e209a95b286556d13bb594985e850))
|
|
62
|
+
* **tests:** update settings test for maxModel rename and fix git hook case sensitivity ([6cbb654](https://github.com/covibes/zeroshot/commit/6cbb654fd2ef15fde9f1454d63cd6aae6807404b))
|
|
63
|
+
* **tests:** update tests for maxModel cost ceiling rename ([45b4ac8](https://github.com/covibes/zeroshot/commit/45b4ac809c480205345be96249608ea2b284f50e))
|
|
64
|
+
* **update-checker:** check npm write permissions before auto-update ([dd9efa8](https://github.com/covibes/zeroshot/commit/dd9efa83edeef812f6d0ad6142a8e8c7ec4006e6))
|
|
65
|
+
* **watcher:** add global error handlers to prevent silent crashes ([cea4b57](https://github.com/covibes/zeroshot/commit/cea4b57fe7cfea899bf8981c2b0d200d1c0a9050))
|
|
66
|
+
* **worker:** forbid scope reduction excuses in worker prompt ([c666847](https://github.com/covibes/zeroshot/commit/c6668473c7f2882482b0593950db780088721925))
|
|
67
|
+
* **worktree:** inject cwd into dynamically spawned template agents ([4c3b916](https://github.com/covibes/zeroshot/commit/4c3b9162e5656133b01ccbf58c91782855669e33))
|
|
68
|
+
|
|
69
|
+
|
|
70
|
+
### Features
|
|
71
|
+
|
|
72
|
+
* **agents:** conditional git restriction based on isolation mode ([70eb368](https://github.com/covibes/zeroshot/commit/70eb3681c3d55747d72b491a4e85279b0e215ab5))
|
|
73
|
+
* **orchestrator:** persist agent runtime states for accurate status display ([4205c7d](https://github.com/covibes/zeroshot/commit/4205c7d0234d3e34e0000ed15ac218c9edb7d048))
|
|
74
|
+
* **validation:** enforce E2E verification with technical constraints ([f2a680a](https://github.com/covibes/zeroshot/commit/f2a680ada66e1485d174084d346c0ae9932ce2c9))
|
|
75
|
+
* **worker:** add aggressive COMPLETION MINDSET to worker prompts ([0c6e37b](https://github.com/covibes/zeroshot/commit/0c6e37b4c0c58cab8b77b7ed1ba23ebb73f55d29))
|
|
76
|
+
|
|
1
77
|
# [3.0.0](https://github.com/covibes/zeroshot/compare/v2.1.0...v3.0.0) (2025-12-29)
|
|
2
78
|
|
|
3
79
|
|
package/README.md
CHANGED
|
@@ -1,5 +1,7 @@
|
|
|
1
1
|
# zeroshot CLI
|
|
2
2
|
|
|
3
|
+
[CI](https://github.com/covibes/zeroshot/actions/workflows/ci.yml)
|
|
4
|
+
[npm](https://www.npmjs.com/package/@covibes/zeroshot)
|
|
3
5
|
[](LICENSE)
|
|
4
6
|
[](https://nodejs.org/)
|
|
5
7
|
[]()
|
|
@@ -32,19 +34,54 @@ Point at a GitHub issue, walk away, come back to working code.
|
|
|
32
34
|
### Demo
|
|
33
35
|
|
|
34
36
|
```bash
|
|
35
|
-
zeroshot "Add
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
37
|
+
zeroshot "Add optimistic locking with automatic retry: when updating a user,
|
|
38
|
+
detect if another request modified it first using version numbers,
|
|
39
|
+
automatically retry with exponential backoff up to 3 times,
|
|
40
|
+
merge non-conflicting field changes, surface true conflicts to the caller
|
|
41
|
+
with details of what conflicted. Handle the ABA problem where version goes A->B->A."
|
|
40
42
|
```
|
|
41
43
|
|
|
42
44
|
<p align="center">
|
|
43
45
|
<img src="./docs/assets/zeroshot-demo.gif" alt="Demo" width="700">
|
|
44
46
|
<br>
|
|
45
|
-
<em>Sped up —
|
|
47
|
+
<em>Sped up 100x — 90 minutes, $16, 5 iterations until validators approved</em>
|
|
46
48
|
</p>
|
|
47
49
|
|
|
50
|
+
**The full fix cycle.** Initial implementation passed basic tests but validators caught edge cases: race conditions in concurrent updates, ABA problem not fully handled, retry backoff timing issues. Each rejection triggered fixes until all 48 tests passed with 91%+ coverage.
|
|
51
|
+
|
|
52
|
+
A single agent would say "done!" after the first implementation. Here, the adversarial tester actually *runs* concurrent requests, times the retry backoff, and verifies conflict detection works under load.
|
|
53
|
+
|
|
54
|
+
**This is what production-grade looks like.** Not "tests pass" — validators reject until it actually works. 5 iterations, each one fixing real bugs the previous attempt missed.
|
|
55
|
+
|
|
56
|
+
---
|
|
57
|
+
|
|
58
|
+
## When to Use Zeroshot
|
|
59
|
+
|
|
60
|
+
**Zeroshot requires well-defined tasks with clear acceptance criteria.**
|
|
61
|
+
|
|
62
|
+
| Scenario | Zeroshot? | Why |
|
|
63
|
+
|----------|-----------|-----|
|
|
64
|
+
| "Add rate limiting with sliding window, per-IP, 429 responses" | ✅ Yes | Clear requirements, validators can verify each one |
|
|
65
|
+
| "Refactor auth to use JWT instead of sessions" | ✅ Yes | Known complexity, defined end state |
|
|
66
|
+
| "Fix the bug where users can't log in" | ✅ Yes | Known unknown - need to find cause, but success is clear |
|
|
67
|
+
| "Fix all 2410 linting violations" | ✅ Yes | Long-running batch task, clear completion (0 violations) |
|
|
68
|
+
| "Make the app faster" | ❌ No | Unknown unknowns - need exploration first |
|
|
69
|
+
| "Improve the codebase" | ❌ No | No acceptance criteria to validate |
|
|
70
|
+
| "Figure out why tests are flaky" | ❌ No | Exploratory - use single-agent Claude Code |
|
|
71
|
+
|
|
72
|
+
**Known unknowns** (implementation details unclear) → Zeroshot handles this. The planner figures it out.
|
|
73
|
+
|
|
74
|
+
**Unknown unknowns** (don't know what you'll discover) → Use single-agent Claude Code for exploration first, then come back with a well-defined task.
|
|
75
|
+
|
|
76
|
+
**Long-running batch tasks** → Zeroshot excels here. Run overnight with `-d` (daemon mode):
|
|
77
|
+
- "Fix all 2410 semantic linting violations"
|
|
78
|
+
- "Add TypeScript types to all 47 untyped files"
|
|
79
|
+
- "Migrate all API calls from v1 to v2"
|
|
80
|
+
|
|
81
|
+
Crash recovery (`zeroshot resume`) means multi-hour tasks survive interruptions.
|
|
82
|
+
|
|
83
|
+
**Rule of thumb:** If you can't describe what "done" looks like, zeroshot's validators can't verify it.
|
|
84
|
+
|
|
48
85
|
---
|
|
49
86
|
|
|
50
87
|
## Install
|
|
@@ -99,7 +136,8 @@ zeroshot purge # NUCLEAR: kill all + delete all
|
|
|
99
136
|
|
|
100
137
|
---
|
|
101
138
|
|
|
102
|
-
|
|
139
|
+
<details>
|
|
140
|
+
<summary><strong>FAQ</strong></summary>
|
|
103
141
|
|
|
104
142
|
**Q: Why Claude-only?**
|
|
105
143
|
|
|
@@ -119,12 +157,6 @@ Zeroshot fixes this with **isolated agents** where validators check work they di
|
|
|
119
157
|
|
|
120
158
|
Yes, see CLAUDE.md. But most people never need to.
|
|
121
159
|
|
|
122
|
-
**Q: Why does the CLI appear frozen?**
|
|
123
|
-
|
|
124
|
-
Zeroshot agents use strict JSON schema outputs to ensure reliable parsing and hook execution. This is incompatible with live streaming - agents can't stream partial JSON.
|
|
125
|
-
|
|
126
|
-
During heavy tasks (large refactors, complex analysis), the CLI may appear frozen for several minutes while the agent works. This is normal - the agent is actively running, just not streaming output.
|
|
127
|
-
|
|
128
160
|
**Q: Why is it called "zeroshot"?**
|
|
129
161
|
|
|
130
162
|
In machine learning, "zero-shot" means solving tasks the model has never seen before - using only the task description, no prior examples needed.
|
|
@@ -133,6 +165,8 @@ Same idea here: give zeroshot a well-defined task, get back a result. No example
|
|
|
133
165
|
|
|
134
166
|
The multi-agent architecture handles planning, implementation, and validation internally. You provide a clear problem statement. Zeroshot handles the rest.
|
|
135
167
|
|
|
168
|
+
</details>
|
|
169
|
+
|
|
136
170
|
---
|
|
137
171
|
|
|
138
172
|
## How It Works
|
|
@@ -145,9 +179,7 @@ Zeroshot is a **multi-agent coordination framework** with smart defaults.
|
|
|
145
179
|
zeroshot 123 # Analyzes task → picks team → done
|
|
146
180
|
```
|
|
147
181
|
|
|
148
|
-
The conductor classifies your task (complexity × type) and
|
|
149
|
-
|
|
150
|
-
### Default Workflows (Out of the Box)
|
|
182
|
+
The conductor classifies your task (complexity × type) and picks the right workflow:
|
|
151
183
|
|
|
152
184
|
```
|
|
153
185
|
┌─────────────────┐
|
|
@@ -187,7 +219,7 @@ The conductor classifies your task (complexity × type) and routes to a pre-buil
|
|
|
187
219
|
│ │ │ ✓ security (CRIT) │
|
|
188
220
|
│ │ │ ✓ tester (CRIT) │
|
|
189
221
|
│ │ │ ✓ adversarial │
|
|
190
|
-
│ │ │ (
|
|
222
|
+
│ │ │ (real execution) │
|
|
191
223
|
│ │ └──────────┬───────────┘
|
|
192
224
|
│ │ REJECT │ ALL OK
|
|
193
225
|
│ └──────────────┘ │
|
|
@@ -197,109 +229,28 @@ The conductor classifies your task (complexity × type) and routes to a pre-buil
|
|
|
197
229
|
└─────────────────────────────────────────────────────────────────────────────┘
|
|
198
230
|
```
|
|
199
231
|
|
|
200
|
-
These are **templates**. The conductor picks based on what you're building.
|
|
201
|
-
|
|
202
232
|
| Task | Complexity | Agents | Validators |
|
|
203
233
|
| ---------------------- | ---------- | ------ | ------------------------------------------------- |
|
|
204
234
|
| Fix typo in README | TRIVIAL | 1 | None |
|
|
205
235
|
| Add dark mode toggle | SIMPLE | 2 | generic validator |
|
|
206
|
-
| Refactor auth system | STANDARD |
|
|
236
|
+
| Refactor auth system | STANDARD | 4 | requirements, code |
|
|
207
237
|
| Implement payment flow | CRITICAL | 7 | requirements, code, security, tester, adversarial |
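
The routing step can be sketched as a small lookup. The template names match the files under `cluster-templates/base-templates/`, and the mapping follows the router diagram in the 3.0.0 README. The branch order, and the helper name `pickTemplate`, are assumptions rather than the conductor's actual code:

```javascript
// Sketch of the conductor's routing step (assumed branch order).
// complexity: TRIVIAL | SIMPLE | STANDARD | CRITICAL
// taskType:   INQUIRY | TASK | DEBUG
function pickTemplate(complexity, taskType) {
  if (complexity === "TRIVIAL") return "single-worker";
  // Non-trivial debugging gets its own workflow.
  if (taskType === "DEBUG") return "debug-workflow";
  if (complexity === "SIMPLE") return "worker-validator";
  // STANDARD and CRITICAL both use the full workflow.
  return "full-workflow";
}

console.log(pickTemplate("STANDARD", "TASK"));
```

How `INQUIRY` tasks are routed is not stated in the README, so this sketch treats them like `TASK`.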
|
|
208
238
|
|
|
209
|
-
## End-to-End Flow
|
|
210
|
-
|
|
211
|
-
**This is how zeroshot processes a task from start to finish:**
|
|
212
|
-
|
|
213
|
-
```
|
|
214
|
-
╔═════════════════════════════════════════════════════╗
|
|
215
|
-
║ ZEROSHOT ORCHESTRATION ENGINE ║
|
|
216
|
-
╚═════════════════════════════════════════════════════╝
|
|
217
|
-
|
|
218
|
-
┌─────────────────┐
|
|
219
|
-
│ "Add auth │
|
|
220
|
-
│ to the API" │
|
|
221
|
-
└────────┬────────┘
|
|
222
|
-
│
|
|
223
|
-
▼
|
|
224
|
-
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
|
|
225
|
-
│ CONDUCTOR (2D Classification) │
|
|
226
|
-
│ ┌─────────────────────────────────────────────────────────────────────────────────────────┐ │
|
|
227
|
-
│ │ Junior (Haiku) Senior (Sonnet) │ │
|
|
228
|
-
│ │ ───────────── ─────────────── │ │
|
|
229
|
-
│ │ Fast classification on 2 dimensions: ───▶ Handles UNCERTAIN cases │ │
|
|
230
|
-
│ │ • Complexity: TRIVIAL | SIMPLE | STANDARD (if with deeper analysis │ │
|
|
231
|
-
│ │ • TaskType: INQUIRY | TASK | DEBUG Junior │ │
|
|
232
|
-
│ │ unsure) │ │
|
|
233
|
-
│ └─────────────────────────────────────────────────────────────────────────────────────────┘ │
|
|
234
|
-
└──────────────────────────────────────────────────────────────────────────────────────────────┘
|
|
235
|
-
│
|
|
236
|
-
│ Classification: STANDARD × TASK
|
|
237
|
-
▼
|
|
238
|
-
┌─────────────────────────────────────────┐
|
|
239
|
-
│ CONFIG ROUTER │
|
|
240
|
-
│ ───────────────────────────────────── │
|
|
241
|
-
│ TRIVIAL → single-worker │
|
|
242
|
-
│ SIMPLE → worker-validator │
|
|
243
|
-
│ DEBUG (non-trivial) → debug-workflow │
|
|
244
|
-
│ STANDARD/CRITICAL → full-workflow ◀──│
|
|
245
|
-
└─────────────────────────────────────────┘
|
|
246
|
-
│
|
|
247
|
-
│ Spawns full-workflow agents
|
|
248
|
-
▼
|
|
249
|
-
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
|
|
250
|
-
│ FULL WORKFLOW │
|
|
251
|
-
│ ┌─────────────────────────────────────────────────────────────────────────────────────────┐ │
|
|
252
|
-
│ │ │ │
|
|
253
|
-
│ │ ┌──────────────┐ │ │
|
|
254
|
-
│ │ │ PLANNER │ Creates implementation plan │ │
|
|
255
|
-
│ │ │ (opus/sonnet)│ • Analyzes requirements │ │
|
|
256
|
-
│ │ └──────┬───────┘ • Identifies files to change │ │
|
|
257
|
-
│ │ │ • Breaks into actionable steps │ │
|
|
258
|
-
│ │ │ PLAN_READY │ │
|
|
259
|
-
│ │ ▼ │ │
|
|
260
|
-
│ │ ┌──────────────┐ │ │
|
|
261
|
-
│ │ │ WORKER │◀─────────────────────────────────────────────┐ │ │
|
|
262
|
-
│ │ │ (sonnet) │ Implements the plan │ │ │
|
|
263
|
-
│ │ └──────┬───────┘ • Writes/modifies code │ │ │
|
|
264
|
-
│ │ │ • Handles rejections │ │ │
|
|
265
|
-
│ │ │ IMPLEMENTATION_READY │ │ │
|
|
266
|
-
│ │ ▼ │ │ │
|
|
267
|
-
│ │ ┌─────────────────────────────────────────────────────┐ │ │ │
|
|
268
|
-
│ │ │ VALIDATORS (parallel) │ │ │ │
|
|
269
|
-
│ │ │ │ │ │ │
|
|
270
|
-
│ │ │ ┌────────────┐ ┌────────────┐ ┌─────────────────┐ │ │ REJECTED │ │
|
|
271
|
-
│ │ │ │Requirements│ │Code Review │ │ Adversarial │ │ │ │ │
|
|
272
|
-
│ │ │ │ Validator │ │ (reviewer)│ │ Tester │ │───────┘ │ │
|
|
273
|
-
│ │ │ │ (validator)│ │ │ │ EXECUTES tests │ │ │ │
|
|
274
|
-
│ │ │ └────────────┘ └────────────┘ └─────────────────┘ │ │ │
|
|
275
|
-
│ │ │ │ │ │
|
|
276
|
-
│ │ └──────────────────────┬──────────────────────────────┘ │ │
|
|
277
|
-
│ │ │ │ │
|
|
278
|
-
│ │ │ ALL APPROVED │ │
|
|
279
|
-
│ │ ▼ │ │
|
|
280
|
-
│ │ ┌──────────────┐ │ │
|
|
281
|
-
│ │ │ COMPLETE │ │ │
|
|
282
|
-
│ │ │ ────────── │ │ │
|
|
283
|
-
│ │ │ PR Created │ (with --pr flag) │ │
|
|
284
|
-
│ │ │ Auto-merged │ (with --merge flag) │ │
|
|
285
|
-
│ │ └──────────────┘ │ │
|
|
286
|
-
│ │ │ │
|
|
287
|
-
│ └─────────────────────────────────────────────────────────────────────────────────────────┘ │
|
|
288
|
-
└──────────────────────────────────────────────────────────────────────────────────────────────┘
|
|
289
|
-
```
|
|
290
|
-
|
|
291
239
|
### Model Selection by Complexity
|
|
292
240
|
|
|
293
241
|
| Complexity | Planner | Worker | Validators |
|
|
294
242
|
| ---------- | ------- | ------ | ---------- |
|
|
295
243
|
| TRIVIAL | - | haiku | 0 |
|
|
296
244
|
| SIMPLE | - | sonnet | 1 (sonnet) |
|
|
297
|
-
| STANDARD | sonnet | sonnet |
|
|
245
|
+
| STANDARD | sonnet | sonnet | 2 (sonnet) |
|
|
298
246
|
| CRITICAL | opus | sonnet | 5 (sonnet) |
|
|
299
247
|
|
|
248
|
+
Set a model ceiling with `zeroshot settings set maxModel sonnet` (prevents opus from being selected).
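
A ceiling like `maxModel` can be pictured as a clamp over an ordered list of models. This is a minimal sketch, not zeroshot's implementation: the helper name `applyMaxModel` is hypothetical, and the README does not say whether zeroshot downgrades a request or rejects it. The sketch downgrades:

```javascript
// Models ordered from cheapest to most capable.
const MODEL_ORDER = ["haiku", "sonnet", "opus"];

// Hypothetical helper: clamp a requested model to the configured ceiling.
function applyMaxModel(requested, maxModel) {
  const req = MODEL_ORDER.indexOf(requested);
  const cap = MODEL_ORDER.indexOf(maxModel);
  if (req === -1 || cap === -1) {
    throw new Error("unknown model: " + (req === -1 ? requested : maxModel));
  }
  // Pick whichever of the two sits lower in the ordering.
  return MODEL_ORDER[Math.min(req, cap)];
}

console.log(applyMaxModel("opus", "sonnet"));  // sonnet
console.log(applyMaxModel("haiku", "sonnet")); // haiku
```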
|
|
249
|
+
|
|
300
250
|
---
|
|
301
251
|
|
|
302
|
-
|
|
252
|
+
<details>
|
|
253
|
+
<summary><strong>Custom Workflows (Framework Mode)</strong></summary>
|
|
303
254
|
|
|
304
255
|
Zeroshot is **message-driven** - define any agent topology:
|
|
305
256
|
|
|
@@ -315,9 +266,46 @@ Zeroshot is **message-driven** - define any agent topology:
|
|
|
315
266
|
- Ledger (SQLite, crash recovery)
|
|
316
267
|
- Dynamic spawning (CLUSTER_OPERATIONS)
|
|
317
268
|
|
|
318
|
-
|
|
269
|
+
#### Creating Custom Clusters with Claude Code
|
|
270
|
+
|
|
271
|
+
**The easiest way to create a custom cluster: just ask Claude Code.**
|
|
272
|
+
|
|
273
|
+
```bash
|
|
274
|
+
# In your zeroshot repo
|
|
275
|
+
claude
|
|
276
|
+
```
|
|
277
|
+
|
|
278
|
+
**Example prompt:**
|
|
279
|
+
```
|
|
280
|
+
Create a zeroshot cluster config for security-critical features:
|
|
319
281
|
|
|
320
|
-
|
|
282
|
+
1. Implementation agent (sonnet) implements the feature
|
|
283
|
+
2. FOUR parallel validators:
|
|
284
|
+
- Security validator: OWASP checks, SQL injection, XSS, CSRF
|
|
285
|
+
- Performance validator: No N+1 queries, proper indexing
|
|
286
|
+
- Privacy validator: GDPR compliance, data minimization
|
|
287
|
+
- Code reviewer: General code quality
|
|
288
|
+
|
|
289
|
+
3. ALL validators must approve before merge
|
|
290
|
+
4. If ANY validator rejects, implementation agent fixes and resubmits
|
|
291
|
+
5. Use opus for security validator (highest stakes)
|
|
292
|
+
|
|
293
|
+
Look at cluster-templates/base-templates/full-workflow.json
|
|
294
|
+
and create a similar cluster. Save to cluster-templates/security-review.json
|
|
295
|
+
```
|
|
296
|
+
|
|
297
|
+
Claude Code will read the existing templates, create a valid JSON config, and iterate until it works.
|
|
298
|
+
|
|
299
|
+
**Built-in validation catches failures before running:**
|
|
300
|
+
- Never start (no bootstrap trigger)
|
|
301
|
+
- Never complete (no path to completion)
|
|
302
|
+
- Loop infinitely (circular dependencies)
|
|
303
|
+
- Deadlock (impossible consensus)
|
|
304
|
+
- Type mismatches (boolean → string in JSON)
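
The first two checks in this list can be illustrated with a reachability search over message topics. This is a sketch under a simplified, hypothetical config shape (each agent lists `triggers` and `publishes`). The real schema is described in CLAUDE.md, and the topic names `ISSUE_OPENED` and `CLUSTER_COMPLETE` are placeholders:

```javascript
// Hypothetical, simplified config shape: each agent lists the topics that
// trigger it and the topics it may publish. Not the real zeroshot schema.
function validateCluster(config, bootstrapTopic, completionTopic) {
  const errors = [];
  const agents = config.agents || [];

  // Check 1: some agent must react to the bootstrap topic.
  if (!agents.some((a) => a.triggers.includes(bootstrapTopic))) {
    errors.push("never starts: no agent triggers on " + bootstrapTopic);
  }

  // Check 2: breadth-first search over topics to see whether the
  // completion topic can ever be published.
  const seen = new Set([bootstrapTopic]);
  const queue = [bootstrapTopic];
  while (queue.length > 0) {
    const topic = queue.shift();
    for (const agent of agents) {
      if (!agent.triggers.includes(topic)) continue;
      for (const next of agent.publishes) {
        if (!seen.has(next)) {
          seen.add(next);
          queue.push(next);
        }
      }
    }
  }
  if (!seen.has(completionTopic)) {
    errors.push("never completes: " + completionTopic + " is unreachable");
  }
  return errors;
}

const config = {
  agents: [
    { id: "worker", triggers: ["ISSUE_OPENED"], publishes: ["IMPLEMENTATION_READY"] },
    { id: "validator", triggers: ["IMPLEMENTATION_READY"], publishes: ["CLUSTER_COMPLETE"] },
  ],
};
console.log(validateCluster(config, "ISSUE_OPENED", "CLUSTER_COMPLETE")); // []
```

Detecting infinite loops and deadlocks needs more than reachability (for example, cycle detection and consensus rules), which this sketch leaves out.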
|
|
305
|
+
|
|
306
|
+
See [CLAUDE.md](./CLAUDE.md) for cluster config schema and examples.
|
|
307
|
+
|
|
308
|
+
</details>
|
|
321
309
|
|
|
322
310
|
---
|
|
323
311
|
|
|
@@ -350,6 +338,62 @@ zeroshot 123 --docker
|
|
|
350
338
|
|
|
351
339
|
Full isolation in a fresh container. Your workspace stays untouched. Good for risky experiments or parallel agents.
|
|
352
340
|
|
|
341
|
+
### When to Use Which
|
|
342
|
+
|
|
343
|
+
| Scenario | Recommended |
|
|
344
|
+
| -------- | ----------- |
|
|
345
|
+
| Quick trusted task | No isolation (default) |
|
|
346
|
+
| PR workflow, code review | `--worktree` or `--pr` |
|
|
347
|
+
| Risky experiment, might break things | `--docker` |
|
|
348
|
+
| Running multiple tasks in parallel | `--docker` |
|
|
349
|
+
| Full automation, no review needed | `--ship` |
|
|
350
|
+
|
|
351
|
+
### Docker Credential Mounts
|
|
352
|
+
|
|
353
|
+
When using `--docker`, zeroshot mounts credential directories so Claude can access tools like AWS, Azure, kubectl.
|
|
354
|
+
|
|
355
|
+
**Default mounts**: `gh`, `git`, `ssh` (GitHub CLI, git config, SSH keys)
|
|
356
|
+
|
|
357
|
+
**Available presets**: `gh`, `git`, `ssh`, `aws`, `azure`, `kube`, `terraform`, `gcloud`
|
|
358
|
+
|
|
359
|
+
```bash
|
|
360
|
+
# Configure via settings (persistent)
|
|
361
|
+
zeroshot settings set dockerMounts '["gh", "git", "ssh", "aws", "azure"]'
|
|
362
|
+
|
|
363
|
+
# View current config
|
|
364
|
+
zeroshot settings get dockerMounts
|
|
365
|
+
|
|
366
|
+
# Per-run override
|
|
367
|
+
zeroshot run 123 --docker --mount ~/.aws:/root/.aws:ro
|
|
368
|
+
|
|
369
|
+
# Disable all mounts
|
|
370
|
+
zeroshot run 123 --docker --no-mounts
|
|
371
|
+
|
|
372
|
+
# CI: env var override
|
|
373
|
+
ZEROSHOT_DOCKER_MOUNTS='["aws","azure"]' zeroshot run 123 --docker
|
|
374
|
+
```
|
|
375
|
+
|
|
376
|
+
**Custom mounts** (mix presets with explicit paths):
|
|
377
|
+
```bash
|
|
378
|
+
zeroshot settings set dockerMounts '[
|
|
379
|
+
"gh",
|
|
380
|
+
"git",
|
|
381
|
+
{"host": "~/.myconfig", "container": "$HOME/.myconfig", "readonly": true}
|
|
382
|
+
]'
|
|
383
|
+
```
|
|
384
|
+
|
|
385
|
+
**Container home**: Presets use a `$HOME` placeholder, which resolves to `/root` by default. Override it with:
|
|
386
|
+
```bash
|
|
387
|
+
zeroshot settings set dockerContainerHome '/home/node'
|
|
388
|
+
# Or per-run:
|
|
389
|
+
zeroshot run 123 --docker --container-home /home/node
|
|
390
|
+
```
|
|
391
|
+
|
|
392
|
+
**Env var passthrough**: Presets auto-pass related env vars (e.g., `aws` → `AWS_REGION`, `AWS_PROFILE`). To pass additional variables:
|
|
393
|
+
```bash
|
|
394
|
+
zeroshot settings set dockerEnvPassthrough '["MY_API_KEY", "TF_VAR_*"]'
|
|
395
|
+
```
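
To make the mount settings concrete, here is a rough sketch of how a `dockerMounts` list could expand into `docker run -v` arguments, including `$HOME` substitution. It is not the code in `lib/docker-config.js`. The preset table is illustrative (only the `aws` paths are implied by the `--mount ~/.aws:/root/.aws:ro` example above), and `toVolumeArgs` is a hypothetical name:

```javascript
// Illustrative preset table: only the aws paths are implied by the README;
// the ssh entry is a guess, and the real presets may differ.
const PRESETS = {
  aws: { host: "~/.aws", container: "$HOME/.aws", readonly: true },
  ssh: { host: "~/.ssh", container: "$HOME/.ssh", readonly: true },
};

// Expand a dockerMounts list (preset names or explicit objects) into
// `docker run -v host:container[:ro]` arguments.
function toVolumeArgs(mounts, hostHome, containerHome = "/root") {
  return mounts.flatMap((entry) => {
    const spec = typeof entry === "string" ? PRESETS[entry] : entry;
    if (!spec) throw new Error("unknown mount preset: " + entry);
    // Expand a leading "~" on the host side.
    const host = spec.host.replace(/^~(?=\/|$)/, () => hostHome);
    // Substitute the $HOME placeholder on the container side.
    const container = spec.container.replace("$HOME", () => containerHome);
    return ["-v", host + ":" + container + (spec.readonly ? ":ro" : "")];
  });
}

const mounts = ["aws", { host: "~/.myconfig", container: "$HOME/.myconfig", readonly: true }];
console.log(toVolumeArgs(mounts, "/home/me", "/home/node"));
```

Passing the host home directory as a parameter keeps the function pure, which makes the placeholder handling easy to test without touching the real filesystem.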
|
|
396
|
+
|
|
353
397
|
---
|
|
354
398
|
|
|
355
399
|
## More
|
|
@@ -360,18 +404,20 @@ Full isolation in a fresh container. Your workspace stays untouched. Good for ri
|
|
|
360
404
|
|
|
361
405
|
---
|
|
362
406
|
|
|
363
|
-
|
|
407
|
+
<details>
|
|
408
|
+
<summary><strong>Troubleshooting</strong></summary>
|
|
364
409
|
|
|
365
410
|
| Issue | Fix |
|
|
366
411
|
| ----------------------------- | -------------------------------------------------------------------- |
|
|
367
412
|
| `claude: command not found` | `npm i -g @anthropic-ai/claude-code && claude auth login` |
|
|
368
413
|
| `gh: command not found` | [Install GitHub CLI](https://cli.github.com/) |
|
|
369
|
-
| CLI frozen for minutes | Normal - agents use JSON schema output, can't stream partial results |
|
|
370
414
|
| `--docker` fails | Docker must be running: `docker ps` to verify |
|
|
371
415
|
| Cluster stuck | `zeroshot resume <id>` to continue with guidance |
|
|
372
416
|
| Agent keeps failing | Check `zeroshot logs <id>` for actual error |
|
|
373
417
|
| `zeroshot: command not found` | `npm install -g @covibes/zeroshot` |
|
|
374
418
|
|
|
419
|
+
</details>
|
|
420
|
+
|
|
375
421
|
---
|
|
376
422
|
|
|
377
423
|
## Contributing
|