sisyphi 1.1.18 → 1.1.19

Files changed (231)
  1. package/README.md +195 -75
  2. package/dist/chunk-36VJ7ZBD.js +1898 -0
  3. package/dist/chunk-36VJ7ZBD.js.map +1 -0
  4. package/dist/{chunk-C2XKXERJ.js → chunk-M6Z3KHOH.js} +159 -46
  5. package/dist/chunk-M6Z3KHOH.js.map +1 -0
  6. package/dist/chunk-O4ZHSQ5R.js +544 -0
  7. package/dist/chunk-O4ZHSQ5R.js.map +1 -0
  8. package/dist/chunk-P2HHTIPM.js +478 -0
  9. package/dist/chunk-P2HHTIPM.js.map +1 -0
  10. package/dist/{chunk-TMBAVPHH.js → chunk-PNDCVKBN.js} +73 -1
  11. package/dist/chunk-PNDCVKBN.js.map +1 -0
  12. package/dist/chunk-SVGIQ2G4.js +1076 -0
  13. package/dist/chunk-SVGIQ2G4.js.map +1 -0
  14. package/dist/cli.js +4405 -892
  15. package/dist/cli.js.map +1 -1
  16. package/dist/daemon.js +4340 -1990
  17. package/dist/daemon.js.map +1 -1
  18. package/dist/{paths-XRDEEJ5R.js → paths-JXFLR5BN.js} +38 -2
  19. package/dist/single-ask-6G4BIVY2.js +132 -0
  20. package/dist/single-ask-6G4BIVY2.js.map +1 -0
  21. package/dist/templates/CLAUDE.md +1 -56
  22. package/dist/templates/agent-plugin/agents/CLAUDE.md +2 -65
  23. package/dist/templates/agent-plugin/agents/debug.md +43 -6
  24. package/dist/templates/agent-plugin/agents/debug.settings.json +57 -0
  25. package/dist/templates/agent-plugin/agents/explore.md +28 -1
  26. package/dist/templates/agent-plugin/agents/explore.settings.json +57 -0
  27. package/dist/templates/agent-plugin/agents/implementor.md +94 -0
  28. package/dist/templates/agent-plugin/agents/implementor.settings.json +57 -0
  29. package/dist/templates/agent-plugin/agents/operator.md +43 -1
  30. package/dist/templates/agent-plugin/agents/operator.settings.json +57 -0
  31. package/dist/templates/agent-plugin/agents/plan/sub-planner.md +75 -0
  32. package/dist/templates/agent-plugin/agents/plan.md +176 -86
  33. package/dist/templates/agent-plugin/agents/plan.settings.json +57 -0
  34. package/dist/templates/agent-plugin/agents/problem/adversarial.md +26 -0
  35. package/dist/templates/agent-plugin/agents/problem/contrarian.md +26 -0
  36. package/dist/templates/agent-plugin/agents/problem/first-principles.md +26 -0
  37. package/dist/templates/agent-plugin/agents/problem/precedent.md +25 -0
  38. package/dist/templates/agent-plugin/agents/problem/simplifier.md +26 -0
  39. package/dist/templates/agent-plugin/agents/problem/systems-thinker.md +26 -0
  40. package/dist/templates/agent-plugin/agents/problem/time-traveler.md +26 -0
  41. package/dist/templates/agent-plugin/agents/problem/user-empathy.md +26 -0
  42. package/dist/templates/agent-plugin/agents/problem.md +334 -79
  43. package/dist/templates/agent-plugin/agents/problem.settings.json +57 -0
  44. package/dist/templates/agent-plugin/agents/research-lead/CLAUDE.md +26 -0
  45. package/dist/templates/agent-plugin/agents/research-lead/critic.md +61 -0
  46. package/dist/templates/agent-plugin/agents/research-lead/researcher.md +60 -0
  47. package/dist/templates/agent-plugin/agents/research-lead.md +184 -0
  48. package/dist/templates/agent-plugin/agents/research-lead.settings.json +57 -0
  49. package/dist/templates/agent-plugin/agents/review/CLAUDE.md +3 -29
  50. package/dist/templates/agent-plugin/agents/review/compliance.md +14 -3
  51. package/dist/templates/agent-plugin/agents/review/efficiency.md +15 -4
  52. package/dist/templates/agent-plugin/agents/review/quality.md +20 -6
  53. package/dist/templates/agent-plugin/agents/review/reuse.md +17 -5
  54. package/dist/templates/agent-plugin/agents/review/security.md +10 -3
  55. package/dist/templates/agent-plugin/agents/review/tests.md +58 -0
  56. package/dist/templates/agent-plugin/agents/review-plan/CLAUDE.md +28 -0
  57. package/dist/templates/agent-plugin/agents/review-plan/code-smells.md +4 -2
  58. package/dist/templates/agent-plugin/agents/review-plan/pattern-consistency.md +4 -2
  59. package/dist/templates/agent-plugin/agents/review-plan/requirements-coverage.md +3 -1
  60. package/dist/templates/agent-plugin/agents/review-plan/security.md +5 -2
  61. package/dist/templates/agent-plugin/agents/review-plan.md +52 -5
  62. package/dist/templates/agent-plugin/agents/review-plan.settings.json +57 -0
  63. package/dist/templates/agent-plugin/agents/review.md +89 -16
  64. package/dist/templates/agent-plugin/agents/review.settings.json +57 -0
  65. package/dist/templates/agent-plugin/agents/spec/engineer.md +175 -0
  66. package/dist/templates/agent-plugin/agents/spec/requirements-writer.md +149 -0
  67. package/dist/templates/agent-plugin/agents/spec.md +444 -0
  68. package/dist/templates/agent-plugin/agents/spec.settings.json +57 -0
  69. package/dist/templates/agent-plugin/agents/test-spec.md +58 -2
  70. package/dist/templates/agent-plugin/agents/test-spec.settings.json +57 -0
  71. package/dist/templates/agent-plugin/hooks/CLAUDE.md +9 -57
  72. package/dist/templates/agent-plugin/hooks/ask-background-guard.sh +57 -0
  73. package/dist/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  74. package/dist/templates/agent-plugin/hooks/plan-user-prompt.sh +8 -7
  75. package/dist/templates/agent-plugin/hooks/plan-validate.sh +97 -0
  76. package/dist/templates/agent-plugin/hooks/plan-write-path.sh +55 -0
  77. package/dist/templates/agent-plugin/hooks/problem-user-prompt.sh +26 -0
  78. package/dist/templates/agent-plugin/hooks/register-bg-task.sh +37 -0
  79. package/dist/templates/agent-plugin/hooks/require-submit.sh +51 -42
  80. package/dist/templates/agent-plugin/hooks/review-user-prompt.sh +6 -2
  81. package/dist/templates/agent-plugin/hooks/spec-user-prompt.sh +43 -0
  82. package/dist/templates/agent-plugin/skills/humanloop/SKILL.md +147 -0
  83. package/dist/templates/agent-plugin/skills/perspective-fanout/SKILL.md +115 -0
  84. package/dist/templates/agent-plugin/skills/problem-document/SKILL.md +105 -0
  85. package/dist/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +83 -0
  86. package/dist/templates/agent-suffix.md +7 -4
  87. package/dist/templates/baleia.lua +42 -0
  88. package/dist/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  89. package/dist/templates/dashboard-claude.md +7 -3
  90. package/dist/templates/orchestrator-base.md +89 -52
  91. package/dist/templates/orchestrator-completion.md +47 -24
  92. package/dist/templates/orchestrator-discovery.md +183 -0
  93. package/dist/templates/orchestrator-impl.md +47 -18
  94. package/dist/templates/orchestrator-planning.md +109 -20
  95. package/dist/templates/orchestrator-plugin/commands/sisyphus/scratch.md +19 -0
  96. package/dist/templates/orchestrator-plugin/commands/sisyphus/spec.md +11 -0
  97. package/dist/templates/orchestrator-plugin/commands/sisyphus/strategize.md +5 -5
  98. package/dist/templates/orchestrator-plugin/hooks/hooks.json +0 -10
  99. package/dist/templates/orchestrator-plugin/skills/humanloop/SKILL.md +149 -0
  100. package/dist/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +1 -0
  101. package/dist/templates/orchestrator-plugin/skills/orchestration/SKILL.md +2 -1
  102. package/dist/templates/orchestrator-plugin/skills/orchestration/strategy.md +160 -0
  103. package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +26 -28
  104. package/dist/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +133 -25
  105. package/dist/templates/orchestrator-settings.json +55 -0
  106. package/dist/templates/orchestrator-validation.md +17 -14
  107. package/dist/templates/sisyphus-init.lua +30 -0
  108. package/dist/templates/sisyphus-tmux-plugin/hooks/hooks.json +54 -0
  109. package/dist/templates/sisyphus-tmux-plugin/hooks/tmux-state.sh +19 -0
  110. package/dist/templates/termrender-haiku-system.md +82 -0
  111. package/dist/templates/whip-animation.sh +345 -0
  112. package/dist/tui.js +3242 -2189
  113. package/dist/tui.js.map +1 -1
  114. package/native/SisyphusNotify/main.swift +15 -5
  115. package/package.json +8 -6
  116. package/templates/CLAUDE.md +1 -56
  117. package/templates/agent-plugin/agents/CLAUDE.md +2 -65
  118. package/templates/agent-plugin/agents/debug.md +43 -6
  119. package/templates/agent-plugin/agents/debug.settings.json +57 -0
  120. package/templates/agent-plugin/agents/explore.md +28 -1
  121. package/templates/agent-plugin/agents/explore.settings.json +57 -0
  122. package/templates/agent-plugin/agents/implementor.md +94 -0
  123. package/templates/agent-plugin/agents/implementor.settings.json +57 -0
  124. package/templates/agent-plugin/agents/operator.md +43 -1
  125. package/templates/agent-plugin/agents/operator.settings.json +57 -0
  126. package/templates/agent-plugin/agents/plan/sub-planner.md +75 -0
  127. package/templates/agent-plugin/agents/plan.md +176 -86
  128. package/templates/agent-plugin/agents/plan.settings.json +57 -0
  129. package/templates/agent-plugin/agents/problem/adversarial.md +26 -0
  130. package/templates/agent-plugin/agents/problem/contrarian.md +26 -0
  131. package/templates/agent-plugin/agents/problem/first-principles.md +26 -0
  132. package/templates/agent-plugin/agents/problem/precedent.md +25 -0
  133. package/templates/agent-plugin/agents/problem/simplifier.md +26 -0
  134. package/templates/agent-plugin/agents/problem/systems-thinker.md +26 -0
  135. package/templates/agent-plugin/agents/problem/time-traveler.md +26 -0
  136. package/templates/agent-plugin/agents/problem/user-empathy.md +26 -0
  137. package/templates/agent-plugin/agents/problem.md +334 -79
  138. package/templates/agent-plugin/agents/problem.settings.json +57 -0
  139. package/templates/agent-plugin/agents/research-lead/CLAUDE.md +26 -0
  140. package/templates/agent-plugin/agents/research-lead/critic.md +61 -0
  141. package/templates/agent-plugin/agents/research-lead/researcher.md +60 -0
  142. package/templates/agent-plugin/agents/research-lead.md +184 -0
  143. package/templates/agent-plugin/agents/research-lead.settings.json +57 -0
  144. package/templates/agent-plugin/agents/review/CLAUDE.md +3 -29
  145. package/templates/agent-plugin/agents/review/compliance.md +14 -3
  146. package/templates/agent-plugin/agents/review/efficiency.md +15 -4
  147. package/templates/agent-plugin/agents/review/quality.md +20 -6
  148. package/templates/agent-plugin/agents/review/reuse.md +17 -5
  149. package/templates/agent-plugin/agents/review/security.md +10 -3
  150. package/templates/agent-plugin/agents/review/tests.md +58 -0
  151. package/templates/agent-plugin/agents/review-plan/CLAUDE.md +28 -0
  152. package/templates/agent-plugin/agents/review-plan/code-smells.md +4 -2
  153. package/templates/agent-plugin/agents/review-plan/pattern-consistency.md +4 -2
  154. package/templates/agent-plugin/agents/review-plan/requirements-coverage.md +3 -1
  155. package/templates/agent-plugin/agents/review-plan/security.md +5 -2
  156. package/templates/agent-plugin/agents/review-plan.md +52 -5
  157. package/templates/agent-plugin/agents/review-plan.settings.json +57 -0
  158. package/templates/agent-plugin/agents/review.md +89 -16
  159. package/templates/agent-plugin/agents/review.settings.json +57 -0
  160. package/templates/agent-plugin/agents/spec/engineer.md +175 -0
  161. package/templates/agent-plugin/agents/spec/requirements-writer.md +149 -0
  162. package/templates/agent-plugin/agents/spec.md +444 -0
  163. package/templates/agent-plugin/agents/spec.settings.json +57 -0
  164. package/templates/agent-plugin/agents/test-spec.md +58 -2
  165. package/templates/agent-plugin/agents/test-spec.settings.json +57 -0
  166. package/templates/agent-plugin/hooks/CLAUDE.md +9 -57
  167. package/templates/agent-plugin/hooks/ask-background-guard.sh +57 -0
  168. package/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  169. package/templates/agent-plugin/hooks/plan-user-prompt.sh +8 -7
  170. package/templates/agent-plugin/hooks/plan-validate.sh +97 -0
  171. package/templates/agent-plugin/hooks/plan-write-path.sh +55 -0
  172. package/templates/agent-plugin/hooks/problem-user-prompt.sh +26 -0
  173. package/templates/agent-plugin/hooks/register-bg-task.sh +37 -0
  174. package/templates/agent-plugin/hooks/require-submit.sh +51 -42
  175. package/templates/agent-plugin/hooks/review-user-prompt.sh +6 -2
  176. package/templates/agent-plugin/hooks/spec-user-prompt.sh +43 -0
  177. package/templates/agent-plugin/skills/humanloop/SKILL.md +147 -0
  178. package/templates/agent-plugin/skills/perspective-fanout/SKILL.md +115 -0
  179. package/templates/agent-plugin/skills/problem-document/SKILL.md +105 -0
  180. package/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +83 -0
  181. package/templates/agent-suffix.md +7 -4
  182. package/templates/baleia.lua +42 -0
  183. package/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  184. package/templates/dashboard-claude.md +7 -3
  185. package/templates/orchestrator-base.md +89 -52
  186. package/templates/orchestrator-completion.md +47 -24
  187. package/templates/orchestrator-discovery.md +183 -0
  188. package/templates/orchestrator-impl.md +47 -18
  189. package/templates/orchestrator-planning.md +109 -20
  190. package/templates/orchestrator-plugin/commands/sisyphus/scratch.md +19 -0
  191. package/templates/orchestrator-plugin/commands/sisyphus/spec.md +11 -0
  192. package/templates/orchestrator-plugin/commands/sisyphus/strategize.md +5 -5
  193. package/templates/orchestrator-plugin/hooks/hooks.json +0 -10
  194. package/templates/orchestrator-plugin/skills/humanloop/SKILL.md +149 -0
  195. package/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +1 -0
  196. package/templates/orchestrator-plugin/skills/orchestration/SKILL.md +2 -1
  197. package/templates/orchestrator-plugin/skills/orchestration/strategy.md +160 -0
  198. package/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +26 -28
  199. package/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +133 -25
  200. package/templates/orchestrator-settings.json +55 -0
  201. package/templates/orchestrator-validation.md +17 -14
  202. package/templates/sisyphus-init.lua +30 -0
  203. package/templates/sisyphus-tmux-plugin/hooks/hooks.json +54 -0
  204. package/templates/sisyphus-tmux-plugin/hooks/tmux-state.sh +19 -0
  205. package/templates/termrender-haiku-system.md +82 -0
  206. package/templates/whip-animation.sh +345 -0
  207. package/dist/chunk-22ZGZTGY.js +0 -67
  208. package/dist/chunk-22ZGZTGY.js.map +0 -1
  209. package/dist/chunk-6PJVJEYQ.js +0 -46
  210. package/dist/chunk-6PJVJEYQ.js.map +0 -1
  211. package/dist/chunk-C2XKXERJ.js.map +0 -1
  212. package/dist/chunk-TMBAVPHH.js.map +0 -1
  213. package/dist/chunk-V36NXMHP.js +0 -299
  214. package/dist/chunk-V36NXMHP.js.map +0 -1
  215. package/dist/templates/agent-plugin/agents/design.md +0 -134
  216. package/dist/templates/agent-plugin/agents/requirements.md +0 -138
  217. package/dist/templates/begin.md +0 -22
  218. package/dist/templates/nvim-tutorial.txt +0 -68
  219. package/dist/templates/orchestrator-plugin/commands/sisyphus/design.md +0 -13
  220. package/dist/templates/orchestrator-plugin/commands/sisyphus/requirements.md +0 -13
  221. package/dist/templates/orchestrator-plugin/hooks/idle-notify.sh +0 -71
  222. package/dist/templates/orchestrator-strategy.md +0 -238
  223. package/templates/agent-plugin/agents/design.md +0 -134
  224. package/templates/agent-plugin/agents/requirements.md +0 -138
  225. package/templates/begin.md +0 -22
  226. package/templates/nvim-tutorial.txt +0 -68
  227. package/templates/orchestrator-plugin/commands/sisyphus/design.md +0 -13
  228. package/templates/orchestrator-plugin/commands/sisyphus/requirements.md +0 -13
  229. package/templates/orchestrator-plugin/hooks/idle-notify.sh +0 -71
  230. package/templates/orchestrator-strategy.md +0 -238
  231. /package/dist/{paths-XRDEEJ5R.js.map → paths-JXFLR5BN.js.map} +0 -0
@@ -0,0 +1,160 @@
+ # Strategy Reference
+
+ Reference material for writing and updating strategy.md — the document that maps the shape of the work across stages.
+
+ ## strategy.md Format
+
+ ```markdown
+ ## Completed
+ [Compressed summaries of finished stages — delete detail, keep outcomes]
+
+ ## Current Stage: [name]
+ [Detailed process flow with exit criteria and backtrack triggers]
+
+ ## Ahead
+ [Sketched future stages — one line each: name + what it covers]
+ [Only as far as you can currently see — it's OK if this is vague]
+ ```
+
+ **Principles:**
+ - **Detail the current stage** — concrete enough that the orchestrator can execute without re-reading this skill
+ - **Sketch what's ahead** — enough continuity that future updates don't lose the thread, not so much that you're committing to unknowns
+ - **Every detailed stage gets exit criteria** — concrete enough to evaluate, not so rigid they become checkboxes
+ - **Include user gates** — where does this stage need the user? What decision or approval?
+
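Reviewer sketch (not part of the package): the three-section layout above is easy to check mechanically. The function below is a minimal illustration, assuming the headings appear exactly as in the format block; `check_strategy_sections` and `EXPECTED` are hypothetical names, and nothing in the diff suggests sisyphus ships such a check.

```python
import re

# Section headings taken from the strategy.md format above, in the order shown.
EXPECTED = ["Completed", "Current Stage", "Ahead"]


def check_strategy_sections(text):
    """Return a list of problems found in the level-2 heading layout."""
    headings = re.findall(r"^## (.+)$", text, flags=re.MULTILINE)
    # "Current Stage: planning" counts as a "Current Stage" section.
    kinds = [h.partition(":")[0].strip() for h in headings]

    problems = []
    for name in EXPECTED:
        if name not in kinds:
            problems.append(f"missing section: {name}")

    # Sections that are present should follow the documented order, once each.
    present = [k for k in kinds if k in EXPECTED]
    if present != sorted(set(present), key=EXPECTED.index):
        problems.append("sections out of order or repeated")

    # The current stage heading is expected to carry a stage name.
    for h in headings:
        kind, _, name = h.partition(":")
        if kind.strip() == "Current Stage" and not name.strip():
            problems.append("current stage has no name")
    return problems
```

An empty list means the skeleton matches; the check says nothing about whether the exit criteria or user gates inside each section are any good.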
+ ## Stages name kinds of work, not areas of code
+
+ A strategy stage is a **process phase** — `discovery`, `planning`, `implementation`, `validation`, `spike`. It describes the *kind* of thinking happening that stage. It is **not** a work-area label like `auth-refactor`, `tui-panel`, `migration-script`, or `foundations`.
+
+ Work areas are the plan agent's job. They live in `context/{plan-lead-agent-id}/plan-stage-N-*.md` and structure the implementation phase from the inside. Keep them out of `strategy.md`.
+
+ <example>
+ ✓ Correct — process phases:
+ ```
+ ## Ahead
+ - **implementation** — phased build per the plan outline (5 sub-stages: foundations → ask-cli → tui → orphan-handling → migration). Critique + validate per stage.
+ - **validation** — run e2e recipe end-to-end, capture evidence, user gate.
+ ```
+
+ ✗ Wrong — work areas masquerading as stages:
+ ```
+ ## Ahead
+ - **foundations** — humanloop refactor + ask-store helpers
+ - **ask-cli + haiku + template** — CLI command and tool-use loop
+ - **tui-integration** — inbox panel and key routing
+ - **orphan-handling** — kill/complete paths
+ - **migration + e2e validation** — drop old command, run recipe
+ ```
+ The second list is a roadmap of code work. Strategy.md collapses into a task list and the process shape (when do we critique? when do we validate? what's the user gate?) disappears.
+ </example>
+
+ When you're tempted to name a stage after a code area, that signals you're sketching the plan, not the strategy. Push that detail down into the plan agent's output and keep `strategy.md` at the process-shape layer.
+
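Reviewer sketch (not part of the package): the stage-versus-work-area distinction can be approximated with a vocabulary check. The set below collects the process-phase names used in this reference. `PROCESS_PHASES` and `suspicious_stage_names` are hypothetical, and because the reference also allows new stage types, a flagged name is a prompt to reconsider rather than an error.

```python
import re

# Process-phase names that appear in this reference (stage patterns section included).
PROCESS_PHASES = {
    "discovery", "exploration", "spike", "spec",
    "planning", "implementation", "validation",
}


def suspicious_stage_names(ahead_section):
    """Return bold bullet names that are not known process phases."""
    names = re.findall(r"^- \*\*(.+?)\*\*", ahead_section, flags=re.MULTILINE)
    return [name for name in names if name not in PROCESS_PHASES]
```

Run against the two example lists above, the first yields nothing and the second flags every bullet, which matches the ✓/✗ labels in the template.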
+ ## Default Pipeline Shape
+
+ The session's effort tier dictates the default pipeline. **Use this shape unless the problem explicitly demands more or less.** The user can change tiers via `sisyphus session effort <low|medium|high|xhigh>`.
+
+ <!--EFFORT:LOW-->
+ **Pipeline:** `plan → implement → validate`
+
+ A single plan agent, a single implement agent, a single validate agent. No spec, problem, test-spec, or review-plan stages — the user's request is the requirement; ask in-band if anything's ambiguous. If the work is wrapper-shaped (every change backs onto an existing CLI/API/handler), move directly from discovery into implementation mode without a planning-mode cycle at all.
+ <!--/EFFORT-->
+
+ <!--EFFORT:MEDIUM-->
+ **Pipeline:** `(spec, if behavior changes) → plan → implement → validate`
+
+ Add `sisyphus:review-plan` only when the plan covers multi-domain integration. Add `sisyphus:test-spec` **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"). Silence is a "no" — do not proactively ask, do not infer from feature risk. Spawn `sisyphus:spec` and `sisyphus:problem` only when the goal has multiple valid framings or the design space is genuinely open.
+ <!--/EFFORT-->
+
+ <!--EFFORT:HIGH,XHIGH-->
+ **Pipeline:** `discovery → spec → planning (with parallel review-plan) → phased implementation with critique/validate checkpoints → validation`
+
+ `sisyphus:review-plan` runs after the plan is drafted. `sisyphus:spec` spawns whenever a feature adds user-visible behavior. `sisyphus:problem` spawns when the goal is nebulous. Append `+ test-spec` to the planning stage **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"); silence is a "no." When justified, `sisyphus:test-spec` spawns in parallel with the high-level plan at Cycle 2, not after implementation — post-implementation test-spec silently describes what the code does rather than what it should do.
+ <!--/EFFORT-->
+
+ **Re-evaluate the tier when scope shifts mid-session.** A MEDIUM feature that uncovers a new subsystem may have crossed into HIGH; a HIGH feature whose scope was narrowed may have dropped to MEDIUM. Re-run `sisyphus session effort` and re-invoke this skill rather than continuing under the old tier's pipeline.
+
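Reviewer sketch (not part of the package): the `<!--EFFORT:...-->` markers look like tier-conditional blocks, where only the block matching the session's effort tier reaches the orchestrator. That reading is an assumption; the diff shown here does not include the code that processes the markers. Under that assumption, a filter could look like this, with `select_tier` as a hypothetical name.

```python
import re

# Matches <!--EFFORT:LOW--> ... <!--/EFFORT--> blocks in the form they take above.
# Group 1 is the comma-separated tier list, group 2 is the block body.
EFFORT_BLOCK = re.compile(
    r"<!--EFFORT:([A-Z,]+)-->\n(.*?)<!--/EFFORT-->\n?",
    re.DOTALL,
)


def select_tier(template, tier):
    """Keep EFFORT blocks whose tier list includes `tier`; drop the rest.

    ASSUMPTION: this mirrors what the markers appear to be for; the real
    sisyphus renderer may behave differently.
    """
    def keep_or_drop(match):
        tiers = match.group(1).split(",")
        return match.group(2) if tier.upper() in tiers else ""

    return EFFORT_BLOCK.sub(keep_or_drop, template)
```

Text outside the markers passes through unchanged, so the shared paragraphs before and after the tier blocks would appear for every tier.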
+ ## Choosing a Different Shape
+
+ If the default doesn't match the problem, these canonical progressions are the next-best starting points — pick the closest one and prune what's already clear, rather than inventing custom shapes:
+
+ ```
+ discovery → spec → planning → implementation → validation
+ exploration → spike → design → implementation → validation
+ investigation → recommendation → (user decides) → implementation
+ analysis → phased-transformation → verification
+ discovery → product-design → technical-investigation → architecture → implementation → validation
+ ```
+
+ Add a new stage *type* only when the problem demands a kind of work the patterns don't cover — for example a `spike` to prove feasibility, a `compatibility-check` before a migration, or a `prototype` before committing. The test for "is this a real new stage?" is whether it names a different kind of thinking, not a different slice of code.
+
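Reviewer sketch (not part of the package): "pick the closest one and prune what's already clear" amounts to splitting a progression on its arrows and dropping the stages that need no further work. Choosing the closest progression stays a judgment call; only the pruning step is shown, and `prune` is a hypothetical helper.

```python
# The five canonical progressions, copied from the list above.
PROGRESSIONS = [
    "discovery → spec → planning → implementation → validation",
    "exploration → spike → design → implementation → validation",
    "investigation → recommendation → (user decides) → implementation",
    "analysis → phased-transformation → verification",
    "discovery → product-design → technical-investigation → architecture → implementation → validation",
]


def prune(progression, already_clear):
    """Split a progression on the arrow and drop stages that need no work."""
    stages = [stage.strip() for stage in progression.split("→")]
    return [stage for stage in stages if stage not in already_clear]


# Example: the goal is already confirmed, so discovery can be skipped.
remaining = prune(PROGRESSIONS[0], {"discovery"})
print(" → ".join(remaining))  # spec → planning → implementation → validation
```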
+ ## Stage Patterns
+
+ Use these as starting points. Invent new stage types when the problem demands it. Add backtrack edges where you can foresee things going wrong.
+
+ ### discovery
+ **Use when:** Goal is undefined, ambiguous, or has shifted — need to clarify what "done" looks like before any other stage runs. Also re-entered mid-session when a pivot invalidates the current goal.
+ - Process: read prior context (goal.md, prior strategy if any) → if the goal is provably clear, write goal.md and run the clarity-confirmation deck → otherwise spawn `sisyphus:problem` for interactive exploration → user iterates → fold result into goal.md → set effort tier → write or revise strategy.md
+ - Exit: goal.md is current and confirmed; effort tier is set; strategy.md exists for this iteration
+ - Produces: goal.md, strategy.md, optionally context/problem.md or context/problem-bifurcation.md
+ - Backtrack: if scope reveals multiple independent projects, issue a decomposition deck and let the user pick a lead — record the others under "Known follow-ups" in goal.md
+
+ ### exploration
+ **Use when:** Need to understand the technical landscape before committing to an approach.
+ - Process: spawn explore agents (each producing a focused context doc) → review findings → identify gaps → re-explore or converge
+ - Exit: enough understanding to make decisions — key questions answered, relevant patterns documented
+ - Produces: context documents (one per investigation angle, not one sprawling doc)
+
+ ### spike
+ **Use when:** Feasibility is uncertain — need to prove an approach works before investing in full design.
+ - Process: identify the riskiest assumption → build a minimal prototype that tests it → evaluate results → present findings to user if the spike changes the approach
+ - Exit: feasibility confirmed or denied with evidence, decision on path forward
+ - Produces: spike findings in context/, prototype code (may be throwaway)
+ - Backtrack: if spike fails → re-explore alternatives
+
+ ### spec
+ **Use when:** Need to define what to build and how, in a single interactive session.
+ - Process: spawn sisyphus:spec → lead explores codebase, asks user questions, dispatches engineer for design and a single writer for requirements → user reviews via TUI → lead deepens design with findings
+ - Exit: user-approved design + requirements with testable acceptance criteria
+ - Produces: context/design.md + context/design.json + context/requirements.json + context/requirements.md
+ - Backtrack: if problem was misframed → re-explore or re-discover
+
+ ### planning
+ **Use when:** Design approved, need an executable breakdown.
+ - Process: spawn plan lead with spec outputs (requirements + design) as inputs → adversarial review of plan → create e2e verification recipe
+ - Exit: reviewed plan + executable e2e-recipe.md that defines how to prove the feature works
+ - Produces: phased implementation plan + e2e recipe in context/
+ - Backtrack: if plan reveals design infeasibility → revisit spec
+
+ ### implementation
+ **Use when:** Plan exists, time to build.
+ - Process: for each phase → detail-plan → spawn implement agents → single critique pass → refine → validate phase
+ - Exit: all phases validated with evidence, no critical review findings remain
+ - Loops: none within a phase — review runs once, fixes land, then validation. If review surfaces architectural issues, backtrack to plan; otherwise advance.
+ - Backtrack: if 2+ agents hit same unexpected complexity → revisit plan or spec; if review finds architectural issues → revisit plan
+
+ ### validation
+ **Use when:** Implementation complete, need to prove it works end-to-end.
+ - Process: run full e2e recipe → collect evidence (command output, screenshots, responses) → assess against success criteria → step back and check if the goal is actually met
+ - Exit: all recipe steps pass with concrete evidence, original goal satisfied
+ - Produces: validation report with evidence
+ - Backtrack: if bugs found → implementation; if architectural issues → spec
+
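Reviewer sketch (not part of the package): the "Backtrack:" bullets above form a small directed graph between stages. The table below transcribes them. The trigger strings are paraphrased labels chosen for illustration, not identifiers from sisyphus, and the discovery backtrack is left out because it leads to a decomposition deck rather than to another stage.

```python
# (stage, trigger) -> stages to revisit, transcribed from the bullets above.
BACKTRACK = {
    ("spike", "spike fails"): ["exploration"],
    ("spec", "problem misframed"): ["exploration", "discovery"],
    ("planning", "design infeasible"): ["spec"],
    ("implementation", "repeated unexpected complexity"): ["planning", "spec"],
    ("implementation", "architectural issues in review"): ["planning"],
    ("validation", "bugs found"): ["implementation"],
    ("validation", "architectural issues"): ["spec"],
}


def backtrack_targets(stage, trigger):
    """Stages to revisit when `trigger` fires during `stage` (empty if none)."""
    return BACKTRACK.get((stage, trigger), [])


def stages_that_return_to(target):
    """Stages with at least one backtrack edge pointing at `target`."""
    return sorted({stage for (stage, _), targets in BACKTRACK.items()
                   if target in targets})
```

Laid out this way, `spec` stands out as the most common backtrack destination: planning, implementation, and validation can all send work back to it.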
+ ## Mid-session shape revisions
+
+ When the work in flight reveals the strategy itself is off, escalate up this ladder — reach for the lowest-cost move that fits.
+
+ 1. **Revise in place.** Stage detail evolved but the pipeline shape holds. Edit `strategy.md` and `roadmap.md`; continue.
+ 2. **`sisyphus:strategize`.** Approach is wrong but artifacts (specs, explorations, reports) still apply. Annotates the pivot into `strategy.md` and yields `--mode discovery` with a fresh orchestrator.
+ 3. **`sisyphus session clone <goal>`.** The session is actually two (or more) independent projects. Forks scope into a new top-level session; update `goal.md`/`roadmap.md` here to drop what was cloned.
+ 4. **`sisyphus session rollback <sessionId> <cycle>`.** A specific cycle introduced state to discard. Rewinds and pauses the session — cycles after the target are lost. Last resort; the others preserve history.
+
+ When the user is the source of the change, update `goal.md` first — strategy revision is downstream of goal.
+
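Reviewer sketch (not part of the package): "reach for the lowest-cost move that fits" reads as a first-match scan over an ordered ladder. The observation keys below are hypothetical labels for the four conditions. The move strings are copied from the list above and are returned as text only; nothing here invokes the CLI.

```python
# Ordered cheapest-first, following the numbered ladder above.
LADDER = [
    ("shape_holds", "revise strategy.md and roadmap.md in place"),
    ("approach_wrong_artifacts_apply", "sisyphus:strategize"),
    ("independent_projects", "sisyphus session clone <goal>"),
    ("cycle_state_to_discard", "sisyphus session rollback <sessionId> <cycle>"),
]


def lowest_cost_move(observations):
    """Return the first (cheapest) move whose condition was observed, else None."""
    for condition, move in LADDER:
        if condition in observations:
            return move
    return None
```

When several conditions hold at once, the scan prefers the rung that preserves the most history, which is the ordering the ladder itself argues for.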
+ ## Design Philosophy
+
+ Frameworks to inform process shape selection — use them to *choose the right shape*, not to follow mechanically:
+
+ - **Double Diamond** — Diverge to explore, converge on a definition; diverge on solutions, converge on implementation. Use when requirements are unclear or the problem needs defining.
+ - **OODA (Observe–Orient–Decide–Act)** — Tight sensing/reacting loops. Use when the situation is fluid and the cost of wrong moves is low (debugging, spikes, incident response).
+ - **Cynefin** — Match approach to domain. Clear → best practice. Complicated → analyze then execute. Complex → probe, sense, respond. Chaotic → act to stabilize.
@@ -71,7 +71,7 @@ Feature with moderate complexity. Requirements may need clarification. Multiple
  ## Feature: [description]
 
  ### Requirements & Design
- - [ ] Problem exploration — understand goals, constraints, assumptions
+ - [ ] (conditional) Problem exploration — if goal is nebulous, explore before spec
  - [ ] Requirements — define acceptance criteria
  - [ ] Design — architecture, component boundaries, data models
  - [ ] Create implementation plan from requirements + design
@@ -89,20 +89,19 @@ Feature with moderate complexity. Requirements may need clarification. Multiple
  Note: critique and validation are embedded between implementation phases, not deferred to the end. Phase 1 (types) is low-risk and doesn't need its own review, but critique catches issues before Phase 3 builds on them. Validation happens after integration, when all the pieces come together.
 
  ### Cycle plan
- - **Cycle 1**: Spawn `sisyphus:problem` for problem exploration. Yield. (Human iterates between cycles.)
- - **Cycle 2**: Spawn `sisyphus:requirements` for requirements analysis. Yield. (Human reviews/iterates.)
- - **Cycle 3**: Spawn `sisyphus:design` for technical design. Yield. (Human reviews/iterates.)
- - **Cycle 4**: Spawn `sisyphus:plan` for plan. Yield.
- - **Cycle 5**: Spawn `sisyphus:review-plan` for review. If fail, respawn plan with issues. Yield.
- - **Cycle 6**: Spawn `sisyphus:implement` for Phase 1. Yield.
- - **Cycle 7**: Spawn `sisyphus:implement` for Phase 2. Phase 1 is types low risk, doesn't need its own validation. Yield.
- - **Cycle 8**: Spawn `sisyphus:review` for critique of phases 1-2. This is the checkpoint before integration builds on top. Yield.
- - **Cycle 9**: Address critique findings + spawn `sisyphus:implement` for Phase 3. Yield.
- - **Cycle 10**: `sisyphus yield --mode validation` for e2e smoketest. Validation mode proves the feature works — operator for UI, evidence for every claim.
- - **Cycle 11**: Address validation failures (back to `--mode implementation`) or complete.
+ - **Cycle 0** (conditional): If the problem is nebulous — multiple valid framings, unclear what "done" looks like — spawn `sisyphus:problem` for interactive exploration. Yield `--mode discovery`. Skip if goal is clear and acceptance criteria are obvious.
+ - **Cycle 1**: Spawn `sisyphus:spec` for combined design + requirements. Yield. (Human iterates inside the spec session.)
+ - **Cycle 2**: Spawn `sisyphus:plan` for plan. Yield.
+ - **Cycle 3**: Spawn `sisyphus:review-plan` for review. If fail, respawn plan with issues. Yield.
+ - **Cycle 4**: Spawn `sisyphus:implement` for Phase 1. Yield.
+ - **Cycle 5**: Spawn `sisyphus:implement` for Phase 2. Phase 1 is types — low risk, doesn't need its own validation. Yield.
+ - **Cycle 6**: Spawn `sisyphus:review` for critique of phases 1-2. This is the checkpoint before integration builds on top. Yield.
+ - **Cycle 7**: Address critique findings + spawn `sisyphus:implement` for Phase 3. Yield.
+ - **Cycle 8**: `sisyphus orch yield --mode validation` for e2e smoketest. Validation mode proves the feature works — operator for UI, evidence for every claim.
+ - **Cycle 9**: Address validation failures (back to `--mode implementation`) or complete.
 
  ### Failure modes
- - **Requirements/design needs human input**: Mark session as needing human review. Orchestrator notes open questions.
+ - **Spec needs human input**: Mark session as needing human review. Orchestrator notes open questions.
  - **Plan fails review**: Feed review issues back, respawn planner.
  - **Critique finds issues in foundation**: Fix before starting integration — don't build on shaky ground.
  - **Validation fails**: Feed specifics back to implement agent for the failing area.
@@ -122,7 +121,7 @@ Cross-cutting feature, multiple domains, needs team coordination. Uses **progres
122
121
  ## Feature: [description]
123
122
 
124
123
  ### Requirements & Design
125
- - [ ] Problem exploration
124
+ - [ ] (conditional) Problem exploration — if goal is nebulous
126
125
  - [ ] Requirements
127
126
  - [ ] Design
128
127
 
@@ -138,24 +137,23 @@ Cross-cutting feature, multiple domains, needs team coordination. Uses **progres
138
137
  6. [final review] — depends on all
139
138
 
140
139
  ### Current Stage: [whichever is active]
141
- See context/plan-stage-N-{name}.md for detail plan.
140
+ See context/{plan-lead-agent-id}/plan-stage-N-{name}.md for detail plan. (Path comes from the plan lead's submission report.)
142
141
  - [ ] [task-level items from detail plan]
143
142
  ```
144
143
 
145
144
  Note: verification checkpoints are embedded in the stage outline, not deferred to a final phase. The level of rigor varies — foundation stages get a light critique, core logic gets critique + validation, integration gets full e2e validation. This is judgment, not formula.
146
145
 
147
146
  ### Cycle plan
148
- - **Cycle 1**: Spawn `sisyphus:problem` for problem exploration. Yield.
149
- - **Cycle 2**: Spawn `sisyphus:requirements` for requirements. Yield.
150
- - **Cycle 3**: Spawn `sisyphus:design` for design. Yield.
151
- - **Cycle 4**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Include verification checkpoints between stages based on risk." Spawn `sisyphus:test-spec` for test properties (parallel). Yield.
152
- - **Cycle 5**: Review outline. Spawn `sisyphus:plan` to **detail-plan stage 1 only** (provide outline as context). Output to `context/plan-stage-1-{name}.md`. Yield.
153
- - **Cycle 6**: Spawn `sisyphus:implement` for stage 1. If stage 2 is independent, spawn `sisyphus:plan` to detail-plan stage 2 in parallel. Yield.
154
- - **Cycle 7**: Spawn `sisyphus:implement` for stage 2 (if detail-planned). Spawn `sisyphus:review` to critique stages 1-2 in parallel — foundation review before core logic builds on it. Detail-plan stage 3 in parallel. Yield.
155
- - **Cycle 8**: Address critique findings. Spawn `sisyphus:implement` for stage 3. Yield.
156
- - **Cycle 9**: Spawn `sisyphus:implement` for stage 4. Spawn `sisyphus:review` to critique stage 3 in parallel. Yield.
157
- - **Cycle 10**: Spawn `sisyphus:validate` for stages 3-4 core logic checkpoint before integration. Address stage 3 critique. Yield.
158
- - **Cycle 11+**: Implement integration stage. Final review. Then `sisyphus yield --mode validation` for comprehensive e2e proof.
147
+ - **Cycle 0** (conditional): If the problem is nebulous, spawn explore agents for technical landscape (yield `--mode discovery`), then spawn `sisyphus:problem` for interactive problem exploration (yield `--mode discovery`). May take 1-3 discovery cycles. Skip if the goal and scope are already clear.
148
+ - **Cycle 1**: Spawn `sisyphus:spec` for combined design + requirements. Yield. (Human iterates inside the spec session.)
149
+ - **Cycle 2**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Include verification checkpoints between stages based on risk." If the user's initial prompt or goal.md explicitly requested tests, also spawn `sisyphus:test-spec` for test properties in parallel; otherwise skip. Yield.
150
+ - **Cycle 4**: Review outline. Spawn `sisyphus:plan` to **detail-plan stage 1 only** (provide outline as context). The plan agent saves under its own subdir and reports the full path; carry that path forward for the implement cycle. Yield.
151
+ - **Cycle 5**: Spawn `sisyphus:implement` for stage 1. If stage 2 is independent, spawn `sisyphus:plan` to detail-plan stage 2 in parallel. Yield.
152
+ - **Cycle 6**: Spawn `sisyphus:implement` for stage 2 (if detail-planned). Spawn `sisyphus:review` to critique stages 1-2 in parallel — foundation review before core logic builds on it. Detail-plan stage 3 in parallel. Yield.
153
+ - **Cycle 7**: Address critique findings. Spawn `sisyphus:implement` for stage 3. Yield.
154
+ - **Cycle 8**: Spawn `sisyphus:implement` for stage 4. Spawn `sisyphus:review` to critique stage 3 in parallel. Yield.
155
+ - **Cycle 9**: Spawn `sisyphus:validate` for stages 3-4 core logic checkpoint before integration. Address stage 3 critique. Yield.
156
+ - **Cycle 10+**: Implement integration stage. Final review. Then `sisyphus orch yield --mode validation` for comprehensive e2e proof.
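The conditional `sisyphus:test-spec` spawn in the outline cycle can be sketched as a small decision function. This is a minimal Python sketch, not the tool's implementation: the function name and the naive keyword check are hypothetical, and only the agent names `sisyphus:plan` and `sisyphus:test-spec` come from the cycle plan above. Deciding whether tests were "explicitly requested" is ultimately the orchestrator's judgment.

```python
import re

# Hypothetical heuristic: a literal mention of "test"/"tests" counts as an explicit request.
TEST_REQUEST = re.compile(r"\btests?\b", re.IGNORECASE)


def agents_for_outline_cycle(initial_prompt: str, goal_md: str = "") -> list[str]:
    """Return the agents to spawn for the high-level stage outline cycle."""
    agents = ["sisyphus:plan"]  # the outline is always planned
    if TEST_REQUEST.search(initial_prompt) or TEST_REQUEST.search(goal_md):
        # Only spawned in parallel when the prompt or goal.md asked for tests.
        agents.append("sisyphus:test-spec")
    return agents
```

With a prompt that is silent on tests, the function returns only the plan agent, which matches the "otherwise skip" branch.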
159
157
 
160
158
  ### Failure modes
161
159
  - **Detail-plan agent can't produce quality output**: The stage is still too large. Break it into sub-stages in the outline and detail-plan each sub-stage individually.
@@ -211,13 +209,13 @@ PR review, pre-merge check, or periodic quality audit.
211
209
 
212
210
  - [ ] Review [scope] for issues
213
211
  - [ ] (conditional) Fix critical/high issues found
214
- - [ ] (conditional) Re-review fixes
212
+ - [ ] Verify fixes landed (type-check, tests pass)
215
213
  ```
216
214
 
217
215
  ### Cycle plan
218
216
  - **Cycle 1**: Spawn `sisyphus:review` for review. Yield.
219
217
  - **Cycle 2**: If critical/high issues, spawn `sisyphus:implement` for fixes. If clean, complete.
220
- - **Cycle 3**: Spawn `sisyphus:review` for re-review (targeted at fixes only). Complete.
218
+ - **Cycle 3**: Verify fixes landed by reading fix-agent reports + running type-check/tests. Complete. Do **not** spawn a second review pass — review runs once, validation catches regressions.
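The three-cycle review flow above can be sketched as two decision functions. A minimal Python sketch under stated assumptions: the `Finding` type, the severity labels other than critical/high, and the returned action strings are hypothetical. The source does not say what happens when verification fails, so that branch is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str  # assumed labels: "critical", "high", "medium", "low"
    summary: str


def next_action_after_review(findings: list[Finding]) -> str:
    """Cycle 2: fix only critical/high issues; a clean review completes."""
    blocking = [f for f in findings if f.severity in ("critical", "high")]
    return "spawn sisyphus:implement for fixes" if blocking else "complete"


def next_action_after_fixes(type_check_ok: bool, tests_ok: bool) -> str:
    """Cycle 3: verify by type-check and tests; never spawn a second review."""
    if type_check_ok and tests_ok:
        return "complete"
    # Assumption: the source leaves this branch unspecified.
    return "fixes not verified"
```

Note that neither function can return a second review: the review runs once and regressions are left to validation.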
221
219
 
222
220
  ### Parallelization
223
221
  Review itself parallelizes internally (subagents per concern). Fix cycle is usually serial.
@@ -2,6 +2,122 @@
2
2
 
3
3
  End-to-end examples showing how the orchestrator structures cycles for real scenarios.
4
4
 
5
+ ### Path conventions in these examples
6
+
7
+ Plan files live under per-plan-lead subdirectories: `context/{plan-lead-agent-id}/plan-*.md`. These examples elide the subdir (showing `context/plan-rate-limiting.md`) for readability. In a real cycle, the orchestrator reads the exact path from the plan lead's submission report and carries it verbatim into downstream implement, review-plan, and validate agent prompts.
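Carrying the path verbatim can be sketched as extracting it from the report text instead of rebuilding it. A minimal Python sketch, assuming the report mentions the path as plain text; the report wording and the agent id `agent-007` are hypothetical, and only the `context/{plan-lead-agent-id}/plan-*.md` shape comes from the convention above.

```python
import re

# Matches context/<agent-id>/plan-<name>.md; the elided form without a subdir does not match.
PLAN_PATH = re.compile(r"context/[^/\s]+/plan-[^\s\"']+\.md")


def plan_path_from_report(report: str) -> str:
    """Return the first full plan path found in a submission report."""
    match = PLAN_PATH.search(report)
    if match is None:
        raise ValueError("report does not contain a full plan path")
    return match.group(0)


# Hypothetical report text for illustration.
report = "Plan saved to context/agent-007/plan-rate-limiting.md."
path = plan_path_from_report(report)
```

The elided form used in the examples (`context/plan-rate-limiting.md`) is rejected on purpose, so a downstream prompt never receives a path that does not exist on disk.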
8
+
9
+ ---
10
+
11
+ ## Example 4: Wrapper-Shaped Config Migration (LOW effort — 5 files, mechanical)
12
+
13
+ **Starting task**: "All config access goes through `process.env` directly — migrate to a `getConfig()` wrapper already defined in `src/config.ts`"
14
+
15
+ **Effort tier**: LOW. Every change is a call-site swap onto an existing handler. No new behavior.
16
+
17
+ ### Cycle 1 — Plan
18
+ ```
19
+ roadmap.md:
20
+ ## Refactor: Migrate env access to getConfig()
21
+
22
+ - [ ] Plan migration — enumerate all process.env call sites
23
+ - [ ] Update call sites to use getConfig()
24
+ - [ ] Validate — no direct process.env access remains; tests pass
25
+
26
+ Agents spawned:
27
+ plan agent → "Enumerate every direct process.env access in src/. Map each call site
28
+ to the matching getConfig() key. Output a migration checklist. Files expected:
29
+ src/api/server.ts, src/db/connection.ts, src/queue/worker.ts,
30
+ src/cli/commands/start.ts, src/config.ts (source of truth — do not modify)."
31
+ ```
32
+
33
+ ### Cycle 2 — Implement
34
+ ```
35
+ Plan complete. 23 call sites across 4 files.
36
+
37
+ Agents spawned:
38
+ implement agent → "Execute migration plan at context/{plan-agent-id}/plan-config-migration.md.
39
+ Replace every process.env.X access with getConfig('X'). Do not modify src/config.ts.
40
+ Do not add error handling — getConfig() already throws on missing keys."
41
+ ```
42
+
43
+ ### Cycle 3 — Validate + complete
44
+ ```
45
+ Implementation complete.
46
+
47
+ Agents spawned:
48
+ validate agent → "Verify migration: grep for remaining process.env access in src/ (excluding
49
+ src/config.ts). Run existing tests. Confirm zero direct env reads outside config.ts."
50
+
51
+ Validation: PASS. Complete — "All env access routed through getConfig()."
52
+ ```
53
+
54
+ **Pipeline shape**: `plan → implement → validate`. 3 cycles. No `sisyphus:spec`, no `sisyphus:test-spec`, no `sisyphus:review-plan`.
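The validate agent's check ("zero direct env reads outside config.ts") can be sketched as a scan over file contents. A minimal Python sketch, not the agent's actual method: the function name is hypothetical, and it works on an in-memory mapping of path to text so it stays independent of the file system. It is a plain text match, so a `process.env` mention inside a comment is also flagged.

```python
import re

ENV_ACCESS = re.compile(r"\bprocess\.env\b")


def remaining_env_reads(
    files: dict[str, str],
    allowed: tuple[str, ...] = ("src/config.ts",),
) -> dict[str, list[int]]:
    """Map each offending file to the 1-based line numbers that read process.env."""
    hits: dict[str, list[int]] = {}
    for path, text in files.items():
        if path in allowed:
            continue  # config.ts is the source of truth and may read env directly
        lines = [
            number
            for number, line in enumerate(text.splitlines(), start=1)
            if ENV_ACCESS.search(line)
        ]
        if lines:
            hits[path] = lines
    return hits
```

An empty result corresponds to the PASS outcome in Cycle 3. To run it against a checkout, the mapping could be built with `pathlib`, for example from `Path("src").rglob("*.ts")`.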
55
+
56
+ ---
57
+
58
+ ## Example 5: New Subsystem — Distributed Task Queue (HIGH effort)
59
+
60
+ **Starting task**: "Add a persistent task queue so long-running jobs survive server restarts. Include test coverage of the survival, retry, and concurrency invariants."
61
+
62
+ **Effort tier**: HIGH. New subsystem, new protocol (worker ↔ queue contract), cross-domain orchestration (API + storage + worker process). The prompt explicitly asks for test coverage — `sisyphus:test-spec` is justified at Cycle 2.
63
+
64
+ ### Cycle 0 — Problem exploration
65
+ ```
66
+ roadmap.md:
67
+ ## Feature: Persistent Task Queue
68
+
69
+ - [ ] Explore current job execution patterns and constraints
70
+ - [ ] Spec — requirements + architecture
71
+ - [ ] Plan implementation (staged outline)
72
+ - [ ] Spec behavioral properties (test-spec) — user asked for tests in the prompt
73
+ ...
74
+
75
+ Agents spawned:
76
+ explore agent → "Map current job execution in src/jobs/. Identify what needs to survive
77
+ restarts, current storage backends, worker process lifecycle."
78
+ problem agent → "Explore design space for persistent task queue. Questions: push vs pull
79
+ worker model, at-least-once vs exactly-once semantics, failure/retry policy, storage
80
+ backend options (Redis, Postgres, SQLite)."
81
+ ```
82
+
83
+ ### Cycle 1 — Spec (human iterates)
84
+ ```
85
+ Agents spawned:
86
+ sisyphus:spec → "Run spec session for persistent task queue.
87
+ Context in context/problem-task-queue.md and context/explore-task-queue.md."
88
+
89
+ Human iterates. Spec outputs:
90
+ context/requirements-task-queue.md — acceptance criteria, failure semantics
91
+ context/design-task-queue.md — Redis-backed queue, pull workers, at-least-once delivery
92
+ ```
93
+
94
+ ### Cycle 2 — High-level plan + test-spec (parallel)
95
+ ```
96
+ Agents spawned (parallel):
97
+ plan agent → "Create high-level stage outline from context/requirements-task-queue.md
98
+ and context/design-task-queue.md. Stages: (1) queue storage layer, (2) producer API,
99
+ (3) worker consumer, (4) integration + retry logic. Cycle estimates per stage."
100
+ test-spec agent → "Define behavioral properties: job survives server restart, failed
101
+ jobs retry up to N times, concurrent workers don't double-execute the same job."
102
+ ```
103
+
104
+ If the original prompt had been silent on tests, the test-spec spawn would be omitted and Cycle 2 would be plan-only — Cycle 3 would then proceed straight to detail-planning stage 1.
105
+
106
+ ### Cycles 3–9 — Staged implementation with critique + validation checkpoints
107
+ ```
108
+ Follows Feature Build Large pattern:
109
+ Cycle 3: detail-plan stage 1 + implement stage 1
110
+ Cycle 4: implement stage 2; detail-plan stage 3 in parallel
111
+ Cycle 5: critique stages 1-2 (foundation review before worker builds on it)
112
+ Cycle 6: address critique + implement stage 3
113
+ Cycle 7: implement stage 4 (integration + retry); validate stages 3-4
114
+ Cycle 8: sisyphus orch yield --mode validation — e2e: enqueue job, kill server, restart,
115
+ confirm job ran exactly once
116
+ Cycle 9: final review agent; complete
117
+ ```
118
+
119
+ **Pipeline shape**: Full HIGH pipeline — `problem → spec → plan (+ test-spec because the prompt asked for tests) → staged implement → critique → validate → review`. 9+ cycles. Without an explicit test request in the prompt, the parallel `test-spec` would be omitted and Cycle 2 would be plan-only.
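Two of the behavioral properties handed to the test-spec agent (bounded retries, no double execution) can be stated as executable checks against a toy model. This is a minimal in-memory Python sketch, not the Redis-backed design from the example: the class, method names, and state labels are hypothetical, and it models neither persistence across restarts nor real concurrency.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    job_id: str
    attempts: int = 0
    state: str = "pending"  # pending | claimed | done | dead


class InMemoryQueue:
    """Toy pull-model queue: workers claim jobs, then ack or fail them."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        self.jobs: dict[str, Job] = {}

    def enqueue(self, job_id: str) -> None:
        self.jobs[job_id] = Job(job_id)

    def claim(self) -> Optional[Job]:
        # A claimed job is invisible to other workers until it is failed back.
        for job in self.jobs.values():
            if job.state == "pending":
                job.state = "claimed"
                job.attempts += 1
                return job
        return None

    def ack(self, job_id: str) -> None:
        self.jobs[job_id].state = "done"

    def fail(self, job_id: str) -> None:
        # Retry until max_attempts is reached, then park the job as dead.
        job = self.jobs[job_id]
        job.state = "pending" if job.attempts < self.max_attempts else "dead"
```

A property such as "failed jobs retry up to N times" then becomes a short sequence of `claim` and `fail` calls ending with the job in the `dead` state.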
120
+
5
121
  ---
6
122
 
7
123
  ## Example 1: Fix a Race Condition in WebSocket Reconnection
@@ -92,7 +208,7 @@ Action: complete — "Fixed WebSocket message loss during reconnection. Messages
92
208
 
93
209
  ## Example 2: Add API Rate Limiting
94
210
 
95
- **Starting task**: "Add rate limiting to the REST API — per-user, configurable limits"
211
+ **Starting task**: "Add rate limiting to the REST API — per-user, configurable limits, with tests for the limit-enforcement and 429 response behavior"
96
212
 
97
213
  ### Cycle 1 — Problem exploration
98
214
  ```
@@ -122,7 +238,7 @@ Agents spawned:
122
238
  what storage backends are available (Redis?), user identification mechanisms."
123
239
  ```
124
240
 
125
- ### Cycle 2 — Requirements (after human reviews problem doc)
241
+ ### Cycle 2 — Spec (after human iterates on problem)
126
242
  ```
127
243
  Agent report: "Problem document saved to context/problem-rate-limiting.md.
128
244
  Current middleware chain uses Express middleware pattern. Redis is already in stack.
@@ -130,35 +246,27 @@ Agent report: "Problem document saved to context/problem-rate-limiting.md.
130
246
 
131
247
  roadmap.md updated:
132
248
  - [x] ~~Problem exploration~~
133
- - [ ] Requirements — define acceptance criteria
249
+ - [ ] Spec — define acceptance criteria and architecture
134
250
  ...
135
251
 
136
252
  Agents spawned:
137
- requirements agent → "Draft acceptance criteria for per-user API rate limiting.
138
- Read context/problem-rate-limiting.md for context.
139
- Questions to resolve: limit granularity (per-endpoint vs global),
140
- response format for rate-limited requests, override mechanisms."
141
- ```
253
+ sisyphus:spec → "Run a spec session for per-user API rate limiting. Read context/problem-rate-limiting.md for context."
142
254
 
143
- ### Cycle 3 — Design (after human reviews requirements)
144
- ```
145
- Agent report: "Requirements saved to context/requirements-rate-limiting.md.
255
+ Later report: "Spec completed.
256
+ Requirements saved to context/requirements-rate-limiting.md.
257
+ Design saved to context/design-rate-limiting.md.
146
258
  Covers: per-user limits, endpoint-specific overrides, 429 response format,
147
- Retry-After headers. User confirmed Redis-backed approach."
148
-
149
- Agents spawned:
150
- design agent → "Create technical design for rate limiting based on
151
- context/requirements-rate-limiting.md and context/problem-rate-limiting.md."
259
+ Retry-After headers, and a Redis-backed sliding window approach."
152
260
  ```
153
261
 
154
- ### Cycle 4 — Plan (after human reviews design)
262
+ ### Cycle 3 — Plan (after human reviews spec)
155
263
  ```
156
- Agent report: "Design saved to context/design-rate-limiting.md.
264
+ Agent report: "Spec outputs approved.
157
265
  Approach: Redis-backed sliding window middleware. Per-user with endpoint-specific
158
266
  overrides. Standard 429 response with Retry-After header. Config via environment variables."
159
267
 
160
268
  roadmap.md updated:
161
- - [x] ~~Problem exploration~~, [x] ~~Requirements~~, [x] ~~Design~~
269
+ - [x] ~~Problem exploration~~, [x] ~~Spec~~
162
270
  - [ ] Plan implementation
163
271
  ...
164
272
 
@@ -169,7 +277,7 @@ Agents spawned:
169
277
  context/requirements-rate-limiting.md"
170
278
  ```
171
279
 
172
- ### Cycle 5 — Review plan
280
+ ### Cycle 4 — Review plan
173
281
  ```
174
282
  Both agents complete. Plan at context/plan-rate-limiting.md.
175
283
  Plan has 3 phases: middleware, config, response format.
@@ -179,12 +287,12 @@ Agents spawned:
179
287
  against context/requirements-rate-limiting.md and context/design-rate-limiting.md"
180
288
  ```
181
289
 
182
- ### Cycle 6 — Implement phases 1+2 (parallel, low-risk foundation)
290
+ ### Cycle 5 — Implement phases 1+2 (parallel, low-risk foundation)
183
291
  ```
184
292
  Plan review: PASS.
185
293
 
186
294
  roadmap.md updated (plan review done, starting implementation):
187
- - [x] ~~Requirements~~, [x] ~~Design~~, [x] ~~Plan~~, [x] ~~Review plan~~
295
+ - [x] ~~Spec~~, [x] ~~Plan~~, [x] ~~Review plan~~
188
296
  - [ ] Implement rate limiting middleware
189
297
  - [ ] Implement rate limit configuration
190
298
  - [ ] Critique phases 1-2 — review before integration phase
@@ -199,7 +307,7 @@ Agents spawned (parallel — phases touch different files):
199
307
  rate limit configuration in src/config/rate-limits.ts"
200
308
  ```
201
309
 
202
- ### Cycle 7 — Critique before integration builds on top
310
+ ### Cycle 6 — Critique before integration builds on top
203
311
  ```
204
312
  Both implementation agents complete.
205
313
 
@@ -217,7 +325,7 @@ Agents spawned:
217
325
  config schema matches what middleware expects."
218
326
  ```
219
327
 
220
- ### Cycle 8 — Implement phase 3 + address critique
328
+ ### Cycle 7 — Implement phase 3 + address critique
221
329
  ```
222
330
  Review: 2 findings — middleware doesn't handle Redis connection failure gracefully,
223
331
  config schema allows negative rate limits.
@@ -229,7 +337,7 @@ Agents spawned (parallel):
229
337
  rate limit headers and 429 error responses in src/api/middleware/rate-limit.ts"
230
338
  ```
231
339
 
232
- ### Cycle 9 — Validate end-to-end
340
+ ### Cycle 8 — Validate end-to-end
233
341
  ```
234
342
  Phase 3 and fixes complete.
235
343
 
@@ -1,2 +1,57 @@
1
1
  {
2
+ "spinnerVerbs": {
3
+ "mode": "replace",
4
+ "verbs": [
5
+ "Pushing the boulder",
6
+ "Delegating the boulder",
7
+ "Outsourcing the futility",
8
+ "Splitting the stone",
9
+ "Spawning underlings",
10
+ "Herding agents",
11
+ "Fanning out, praying",
12
+ "Dispatching, then worrying",
13
+ "Auditioning agents for misery",
14
+ "Allocating despair evenly",
15
+ "Watching panes bloom",
16
+ "Counting heartbeats",
17
+ "Tallying reports",
18
+ "Reaping the finished",
19
+ "Reconciling outputs",
20
+ "Synthesizing the damage",
21
+ "Rotating the cycle",
22
+ "Rolling cycle N+1",
23
+ "Pretending this is under control",
24
+ "Maintaining plausible command",
25
+ "Second-guessing the split",
26
+ "Revising optimistic estimates",
27
+ "Redrafting the plan quietly",
28
+ "Re-reading the roadmap",
29
+ "Updating strategy.md",
30
+ "Pondering whether to yield",
31
+ "Yielding gracefully",
32
+ "Yielding reluctantly",
33
+ "Quoting Camus under breath",
34
+ "Imagining agents happy",
35
+ "Embracing the absurd",
36
+ "Accepting the backlog",
37
+ "Forgiving a timeout",
38
+ "Nudging a stuck agent",
39
+ "Absorbing a crash",
40
+ "Retrying with dignity",
41
+ "Delegating harder",
42
+ "Elevating a blocker",
43
+ "Squinting at pane count",
44
+ "Holding the thread",
45
+ "Holding the line",
46
+ "Blessing the fleet",
47
+ "Releasing the session",
48
+ "Letting agents cook",
49
+ "Letting go",
50
+ "Staring into the session",
51
+ "Weighing respawn",
52
+ "Shouldering the next cycle",
53
+ "Believing in the climb",
54
+ "Contemplating cycle N+1"
55
+ ]
56
+ }
2
57
  }
@@ -21,7 +21,7 @@ If the recipe doesn't exist or doesn't cover what was implemented:
21
21
  If you genuinely cannot determine how to verify the feature — transition back to planning:
22
22
 
23
23
  ```bash
24
- sisyphus yield --mode planning --prompt "Cannot determine verification method for [feature] — need to establish e2e recipe"
24
+ sisyphus orch yield --mode planning --prompt "Cannot determine verification method for [feature] — need to establish e2e recipe"
25
25
  ```
26
26
 
27
27
  ## The Operator Is Not Optional
@@ -63,6 +63,8 @@ Spawn validation agents with clear, specific instructions:
63
63
 
64
64
  For broad features, parallelize: spawn multiple agents each covering a distinct area. An operator for the UI flows, a CLI agent for backend verification, etc.
65
65
 
66
+ When spawning an operator, tell it explicitly what to target — the browser URL, the Electron app name, or whichever surface applies. The operator should not have to guess whether the product is a web app or a desktop app.
67
+
66
68
  ### Review the evidence yourself
67
69
 
68
70
  When validation reports come back, **read them critically.** Check that the evidence actually supports the claims. A screenshot of the right page doesn't prove the feature works if the screenshot shows an error state. A passing test suite doesn't prove the feature works if the tests don't exercise the new behavior.
@@ -74,32 +76,33 @@ If a report says "all checks pass" but the evidence is thin or missing — that'
74
76
  When validation surfaces real bugs:
75
77
 
76
78
  ```bash
77
- sisyphus yield --mode implementation --prompt "Validation failed — [specific failures]. See reports/agent-XXX-final.md for details."
79
+ sisyphus orch yield --mode implementation --prompt "Validation failed — [specific failures]. See reports/agent-XXX-final.md for details."
78
80
  ```
79
81
 
80
- Log what failed and why in the cycle log before yielding. The implementation cycle needs clear context on what to fix.
82
+ Log what failed and why before yielding. The implementation cycle needs clear context on what to fix.
81
83
 
82
84
  When validation reveals that the approach itself is flawed — not bugs, but architectural issues or fundamental misunderstandings:
83
85
 
84
86
  ```bash
85
- sisyphus yield --mode planning --prompt "Validation revealed [architectural issue] — approach needs rethinking. See cycle log."
87
+ sisyphus orch yield --mode planning --prompt "Validation revealed [architectural issue] — approach needs rethinking. See cycle log."
86
88
  ```
87
89
 
88
90
  **Do not attempt fixes in validation mode** beyond trivial issues (a missed import, a config typo). If the fix requires design decisions or touches multiple files, transition to implementation mode where the orchestrator has the right guidance for managing that work.
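The mode transitions described in this file can be sketched as a lookup from validation outcome to the yield command. A minimal Python sketch: the outcome labels and function name are hypothetical, while the command shape (`sisyphus orch yield --mode ... --prompt ...`) and the mode names are taken from the commands shown in this file.

```python
from typing import Optional

# Hypothetical outcome labels mapped to the modes named in this document.
MODE_FOR_OUTCOME = {
    "bugs": "implementation",      # real bugs surfaced by validation
    "flawed_approach": "planning", # architectural issue or misunderstanding
    "passed": "completion",        # every recipe step verified
}


def yield_command(outcome: str, prompt: str) -> Optional[list[str]]:
    """Return the argv for the yield, or None when no transition is needed."""
    if outcome == "trivial_fix":
        # A missed import or config typo is fixed in place, in validation mode.
        return None
    mode = MODE_FOR_OUTCOME[outcome]
    return ["sisyphus", "orch", "yield", "--mode", mode, "--prompt", prompt]
```

Returning an argv list keeps the prompt as a single argument regardless of the quotes or brackets it contains.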
89
91
 
90
- ## Completion Gate
91
-
92
- When all validation passes, **do not call `sisyphus complete` directly.** Yield to completion mode for user sign-off:
92
+ ## Validation CLI
93
93
 
94
94
  ```bash
95
- sisyphus yield --mode completion --prompt "Validation passed — all recipe steps verified. Ready for user review."
95
+ sisyphus agent restart <agentId> # respawn a failed/killed validation agent
96
96
  ```
97
97
 
98
- Only yield to completion when:
99
- - Every recipe step has been executed (not skipped, not assumed)
100
- - Every step has evidence of success in the validation report
101
- - The evidence actually matches the success criteria from the recipe
98
+ ## Transition to Completion
99
+
100
+ When all validation passes, yield to completion mode for user sign-off:
101
+
102
+ ```bash
103
+ sisyphus orch yield --mode completion --prompt "Validation passed — all recipe steps verified. Ready for user review."
104
+ ```
102
105
 
103
- If the recipe was updated during validation, re-validate against the updated version. Completion means the current recipe passes, not that an earlier draft would have.
106
+ Only yield when every recipe step has been executed with evidence of success. If the recipe was updated during validation, re-validate against the updated version.
104
107
 
105
- Before transitioning, step back: does the validated behavior actually satisfy the original goal? It's possible to pass every recipe step and still miss the point. The recipe is a tool, not a substitute for judgment.
108
+ Before yielding, re-read goal.md and check recipe coverage against it, not against itself. For each clause that names a user-visible behavior or capability, find the recipe step that exercised it. If a clause has no matching step, the recipe is incomplete: extend it, re-validate, and only then yield. A passing recipe proves the recipe's steps work; it does not prove the goal was met.
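The coverage check against goal.md can be sketched as a set difference between goal clauses and the clauses each recipe step exercised. A minimal Python sketch, assuming the orchestrator has already broken goal.md into clauses and noted which clauses each step covers; that mapping is a judgment call, and the function name and sample data are hypothetical.

```python
def uncovered_clauses(
    goal_clauses: list[str],
    recipe_steps: dict[str, list[str]],
) -> list[str]:
    """Return goal clauses that no recipe step exercised, in goal order.

    recipe_steps maps a step name to the goal clauses that step covered.
    """
    covered = {clause for clauses in recipe_steps.values() for clause in clauses}
    return [clause for clause in goal_clauses if clause not in covered]


# Hypothetical data for illustration.
goal = ["per-user limits", "configurable limits", "429 response behavior"]
steps = {"exceed the limit as one user": ["per-user limits", "429 response behavior"]}
missing = uncovered_clauses(goal, steps)
```

A non-empty result means the recipe is incomplete: extend it, re-validate, and only then yield to completion.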