sisyphi 1.2.2 → 1.2.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (85)
  1. package/README.md +20 -20
  2. package/dist/cli.js +12450 -11255
  3. package/dist/cli.js.map +1 -1
  4. package/dist/daemon.js +1113 -565
  5. package/dist/daemon.js.map +1 -1
  6. package/dist/templates/agent-plugin/agents/CLAUDE.md +2 -2
  7. package/dist/templates/agent-plugin/agents/operator.md +3 -4
  8. package/dist/templates/agent-plugin/agents/plan.md +1 -1
  9. package/dist/templates/agent-plugin/agents/problem.md +20 -20
  10. package/dist/templates/agent-plugin/agents/research-lead.md +1 -1
  11. package/dist/templates/agent-plugin/agents/spec/engineer.md +9 -7
  12. package/dist/templates/agent-plugin/agents/spec/requirements-writer.md +1 -1
  13. package/dist/templates/agent-plugin/agents/spec.md +31 -25
  14. package/dist/templates/agent-plugin/hooks/CLAUDE.md +0 -1
  15. package/dist/templates/agent-plugin/hooks/ask-background-guard.sh +11 -11
  16. package/dist/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  17. package/dist/templates/agent-plugin/hooks/operator-user-prompt.sh +2 -2
  18. package/dist/templates/agent-plugin/hooks/plan-validate.sh +3 -3
  19. package/dist/templates/agent-plugin/hooks/require-submit.sh +1 -1
  20. package/dist/templates/agent-plugin/skills/operator/SKILL.md +1 -1
  21. package/dist/templates/agent-suffix.md +4 -18
  22. package/dist/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  23. package/dist/templates/dashboard-claude.md +15 -13
  24. package/dist/templates/orchestrator-base.md +44 -78
  25. package/dist/templates/orchestrator-completion.md +9 -11
  26. package/dist/templates/orchestrator-discovery.md +8 -8
  27. package/dist/templates/orchestrator-impl.md +6 -7
  28. package/dist/templates/orchestrator-planning.md +2 -2
  29. package/dist/templates/orchestrator-plugin/commands/sisyphus/scratch.md +1 -1
  30. package/dist/templates/orchestrator-plugin/commands/sisyphus/strategize.md +2 -2
  31. package/dist/templates/orchestrator-validation.md +1 -3
  32. package/dist/templates/termrender-haiku-system.md +5 -3
  33. package/dist/tui.js +1817 -1400
  34. package/dist/tui.js.map +1 -1
  35. package/native/build-notify.sh +2 -2
  36. package/package.json +3 -3
  37. package/templates/agent-plugin/agents/CLAUDE.md +2 -2
  38. package/templates/agent-plugin/agents/operator.md +3 -4
  39. package/templates/agent-plugin/agents/plan.md +1 -1
  40. package/templates/agent-plugin/agents/problem.md +20 -20
  41. package/templates/agent-plugin/agents/research-lead.md +1 -1
  42. package/templates/agent-plugin/agents/spec/engineer.md +9 -7
  43. package/templates/agent-plugin/agents/spec/requirements-writer.md +1 -1
  44. package/templates/agent-plugin/agents/spec.md +31 -25
  45. package/templates/agent-plugin/hooks/CLAUDE.md +0 -1
  46. package/templates/agent-plugin/hooks/ask-background-guard.sh +11 -11
  47. package/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  48. package/templates/agent-plugin/hooks/operator-user-prompt.sh +2 -2
  49. package/templates/agent-plugin/hooks/plan-validate.sh +3 -3
  50. package/templates/agent-plugin/hooks/require-submit.sh +1 -1
  51. package/templates/agent-plugin/skills/operator/SKILL.md +1 -1
  52. package/templates/agent-suffix.md +4 -18
  53. package/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  54. package/templates/dashboard-claude.md +15 -13
  55. package/templates/orchestrator-base.md +44 -78
  56. package/templates/orchestrator-completion.md +9 -11
  57. package/templates/orchestrator-discovery.md +8 -8
  58. package/templates/orchestrator-impl.md +6 -7
  59. package/templates/orchestrator-planning.md +2 -2
  60. package/templates/orchestrator-plugin/commands/sisyphus/scratch.md +1 -1
  61. package/templates/orchestrator-plugin/commands/sisyphus/strategize.md +2 -2
  62. package/templates/orchestrator-validation.md +1 -3
  63. package/templates/termrender-haiku-system.md +5 -3
  64. package/dist/templates/agent-plugin/skills/humanloop/SKILL.md +0 -148
  65. package/dist/templates/agent-plugin/skills/operator-memory/SKILL.md +0 -64
  66. package/dist/templates/agent-plugin/skills/perspective-fanout/SKILL.md +0 -115
  67. package/dist/templates/agent-plugin/skills/problem-document/SKILL.md +0 -105
  68. package/dist/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +0 -83
  69. package/dist/templates/orchestrator-plugin/skills/humanloop/SKILL.md +0 -150
  70. package/dist/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +0 -1
  71. package/dist/templates/orchestrator-plugin/skills/orchestration/SKILL.md +0 -29
  72. package/dist/templates/orchestrator-plugin/skills/orchestration/strategy.md +0 -160
  73. package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +0 -266
  74. package/dist/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +0 -428
  75. package/templates/agent-plugin/skills/humanloop/SKILL.md +0 -148
  76. package/templates/agent-plugin/skills/operator-memory/SKILL.md +0 -64
  77. package/templates/agent-plugin/skills/perspective-fanout/SKILL.md +0 -115
  78. package/templates/agent-plugin/skills/problem-document/SKILL.md +0 -105
  79. package/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +0 -83
  80. package/templates/orchestrator-plugin/skills/humanloop/SKILL.md +0 -150
  81. package/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +0 -1
  82. package/templates/orchestrator-plugin/skills/orchestration/SKILL.md +0 -29
  83. package/templates/orchestrator-plugin/skills/orchestration/strategy.md +0 -160
  84. package/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +0 -266
  85. package/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +0 -428
@@ -1,150 +0,0 @@
- ---
- name: humanloop
- description: >
- Read before calling `sis ask`. Triggers when surfacing multiple questions or decisions to the user, presenting work for review/sign-off, or proposing concrete alternatives. Covers when a deck beats chat, how to design options as real forks the user can pick between, how to bundle related questions into one deck, and how to invoke synchronously so the orchestrator's process blocks until the user answers.
- ---
-
- # Talking to the user via decks
-
- `sis ask` posts a structured deck of questions to the user's dashboard inbox. They walk through it on their own time and you read structured JSON back. Use it instead of dumping a wall of questions into chat.
-
- This skill covers **what to put in a deck** and **how to invoke it**. Run `sis ask -h` for the CLI shape (file path, `--session`, the `poll` and `peek` subcommands).
-
- ## Reach for a deck when
-
- - You have **2+ questions** to surface in one beat (bundle them into one deck).
- - You're presenting **work for review or sign-off** (a design, a plan, a completion summary).
- - You're choosing between **concrete alternatives** the user must pick.
- - The work will sit while the user thinks. Decks survive across cycles; chat does not.
-
- ## Skip the deck when
-
- - It's a single, low-stakes question whose answer barely changes downstream work — just ask in chat.
- - You can settle the question yourself by reading code or running a tool. **Default to investigating before asking.**
- - The user is actively conversing with you — converting a live exchange into a deck adds friction.
-
- ## How to invoke
-
- **Run `sis ask` in the foreground — let the Bash tool block.** The CLI waits internally for the user to resolve the deck (potentially 10+ minutes). Your pane stays alive in tmux for the duration; the daemon will not respawn you while a tool call is in flight. When the user answers, the bash returns stdout and you parse it inline.
-
- ```bash
- result=$(sis ask "$deck")
- choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId')
- notes=$(echo "$result" | jq -r '.responses[0].freetext // ""')
- ```
-
- **Do not `run_in_background` and yield** — yielding kills your pane and any backgrounded bash with it; the next cycle's fresh orchestrator can only peek the on-disk deck (`sis ask peek`) and yield again, producing a polling loop. The daemon now refuses `sis orch yield` while a deck owned by orchestrator is pending; the supported pattern is foreground.
-
- Stdout on completion is one line of JSON: `{responses: [{id, selectedOptionId?, freetext?}, ...], completedAt}`. Branch on each response by its interaction `id`.
-
- If you respawn mid-wait and find a pending deck on disk (e.g. after a daemon restart that orphaned the prior bash), block on it with `sis ask poll <askId>` to re-attach. `sis ask peek <askId>` is non-blocking and reserved for respawn-recovery diagnostics. See `sis ask -h`.
-
- ## Designing interactions
-
- ### Each option is a concrete path forward
-
- The user picks an option to commit to a direction. Each option should name a real path with its tradeoffs spelled out, grounded in *this* codebase. Sign-off decks branch differently per option ("looks good", "minor fixes", "moderate fixes", "scope rework" each route the orchestrator somewhere different). Decision decks present mutually exclusive directions with named consequences.
-
- <example type="good">
- ```
- title: "Session store backend?"
- subtitle: "Auth needs persistent sessions across restarts"
- kind: decision
- options:
- in-memory: "In-memory map — simplest. Loses sessions on restart; single-process only."
- redis: "Redis — survives restart, supports horizontal scale. New ops dependency."
- postgres: "Reuse existing Postgres — no new infra; ~10ms read latency vs Redis ~1ms."
- defer: "Ship in-memory now, migrate later if scale becomes real."
- allowFreetext: true
- freetextLabel: "Different framing — describe it"
- ```
- </example>
-
- <example type="bad">
- ```
- title: "Happy with this design?"
- options:
- 1. Yes
- 2. No, start over
- 3. Maybe, with comments
- 4. (no option, just freetext)
- ```
- "Happy?" names a feeling, not a fork. Options 3 and 4 both collapse to freetext, forcing the user to invent the actual decision. Rewrite as specific decisions about specific elements of the design.
- </example>
-
- ### Use `allowFreetext: true` as a safety valve, not the primary input
-
- Freetext catches "anything else?" — opinions or context the options didn't anticipate. When freetext IS the answer you want, write a chat message instead.
-
- <example type="bad">
- ```
- title: "Approve?"
- options:
- 1. Approve
- 2. Reject
- 3. Comment
- allowFreetext: true
- ```
- A freetext form wearing option clothing. Either name what "reject" actually routes to (back to design? abandon? try a different framing?), or drop the deck and ask in chat.
- </example>
-
- ### Bound option count to 2–4
-
- Above four, options become too granular for the user to weigh; below two, you've collapsed into a yes/no that's faster to ask in chat.
-
- ### Ground options in what you've already gathered
-
- Each option label should reference specifics from the codebase, plan, or exploration you just did — file names, framework constraints, prior decisions. When you can't fill in specifics, investigate before asking.
-
- ### One concern per interaction
-
- When two questions interact, give them separate `id` / `title` / `options` inside the same deck (see Bundling below). One interaction asks one thing.
-
- ## `kind` — display hint
-
- | kind | use for |
- |---|---|
- | `decision` | fork in the road; user picks a path forward |
- | `validation` | sign-off on completed work |
- | `notify` | FYI; user acknowledges |
- | `context` | surfacing background that needs a response |
- | `error` | something went wrong; user picks a recovery |
-
- The dashboard uses `kind` for inbox icons and sort weight. Mis-tagging trains the user to ignore the icons. Pick the closest fit.
-
- ## Bundling
-
- If you'd otherwise submit two decks in the same beat, merge them. One deck with multiple `interactions` is one context switch for the user; two decks is two.
-
- ```bash
- deck="$SISYPHUS_SESSION_DIR/context/.ask-$(date +%s).json"
- cat > "$deck" <<'EOF'
- {
- "title": "Phase 2 sign-off + follow-on decisions",
- "interactions": [
- {
- "id": "approve-phase-2",
- "title": "Phase 2 looks good?",
- "kind": "validation",
- "options": [...]
- },
- {
- "id": "phase-3-scope",
- "title": "Phase 3 scope?",
- "kind": "decision",
- "options": [...]
- }
- ]
- }
- EOF
- # Then invoke `sis ask "$deck"` synchronously (foreground bash) — blocks until answered.
- # Each interaction returns its own selectedOptionId / freetext in output.responses[], indexed by id.
- ```
-
- ## Submission notes
-
- - The deck is validated at submit (precise errors — trust them).
- - `kind` is an enum: `notify` | `validation` | `decision` | `context` | `error`. No other values accepted (see the table above for which to pick).
- - `bodyPath` points at a markdown file instead of inlining the body in JSON. The path is resolved **relative to the deck JSON's directory** and must stay inside it (no `..`, no symlinks out, no absolute paths pointing elsewhere). Practical pattern: write the deck JSON next to its body file — e.g. both inside `$SISYPHUS_SESSION_DIR/context/` — and use a basename like `"completion-summary.md"`. Mutually exclusive with `body`.
- - On completion, stdout is one line of JSON: `{responses, completedAt}`. Parse `responses[]` and dispatch on each interaction's `id`.
- - See `sis ask -h` for the full CLI surface.
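
Reviewer sketch (not part of the removed file): the hunk above says to dispatch on each response's interaction `id`, while its own snippet indexes `responses[0]`. A minimal illustration of lookup by `id` might look like the following. It assumes `jq` is installed, uses the interaction ids from the bundling example, and invents the option ids, the freetext, and the `completedAt` placeholder purely for demonstration. The helper name `response_field` is hypothetical, and no `sis` command is called here.

```bash
#!/usr/bin/env bash
# Sample payload in the documented shape {responses: [...], completedAt}.
# In real use this would be the stdout of a foreground `sis ask "$deck"` call.
# Option ids, freetext, and the completedAt value are hypothetical placeholders.
result='{"responses":[{"id":"approve-phase-2","selectedOptionId":"looks-good"},{"id":"phase-3-scope","selectedOptionId":"narrow","freetext":"skip migration"}],"completedAt":"placeholder"}'

# response_field <interaction-id> <field>
# Prints one field of the response whose id matches, or an empty string
# when that field is absent (selectedOptionId and freetext are both optional).
response_field() {
  printf '%s' "$result" | jq -r --arg id "$1" --arg f "$2" \
    '.responses[] | select(.id == $id) | .[$f] // ""'
}

approval=$(response_field approve-phase-2 selectedOptionId)
scope=$(response_field phase-3-scope selectedOptionId)
scope_notes=$(response_field phase-3-scope freetext)
```

Selecting by `id` rather than by array position keeps the parsing stable if interactions are reordered inside the deck.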
@@ -1 +0,0 @@
- - `sis orch yield --mode <mode>` is required on every yield. Pass the current mode to stay in it; pass a different mode to transition. There is no implicit "keep current mode" fallback — the CLI rejects yields without `--mode`.
@@ -1,29 +0,0 @@
- ---
- name: orchestration
- description: >
- Task breakdown patterns for sisyphus orchestrator sessions. How to structure tasks, sequence agents, and manage cycles for debugging, feature builds, refactors, and other common workflows. Use when planning orchestration strategy or structuring a multi-agent session.
- ---
-
- # Orchestration Patterns
-
- How to structure sisyphus sessions for common task types. This skill helps the orchestrator break work into tasks, choose agent types, sequence cycles, and handle failures.
-
- ## Core Principles
-
- 1. **roadmap.md is the orchestrator's memory.** roadmap.md and agent reports persist across cycles — they're all you have. Keep roadmap.md current and specific enough that a fresh orchestrator can pick up where you left off.
-
- 2. **Agents are disposable.** Each agent gets one focused instruction. If it fails or the scope changes, spawn a new one — don't try to redirect a running agent.
-
- 3. **Parallelize when independent.** If two tasks don't share files or depend on each other's output, spawn agents for both in the same cycle.
-
- 4. **Interleave verification.** Don't batch all implementation and defer review to the end. Embed critique and validation checkpoints between stages based on risk — the more subsequent work depends on a stage being correct, the more it needs verification before you build on it.
-
- 5. **Reports are handoffs.** Agent reports should contain everything the next cycle's orchestrator needs — what was done, what was found, what's unresolved, where artifacts were saved.
-
- ## Agent Types
-
- Available agent types are listed under **Available Agent Types** in your prompt. Use `--agent-type` with `sis agent spawn`.
-
- For task breakdown patterns per workflow type, see [task-patterns.md](task-patterns.md).
- For end-to-end workflow examples, see [workflow-examples.md](workflow-examples.md).
- For strategy.md authoring — stage patterns, process shapes, format — see [strategy.md](strategy.md).
@@ -1,160 +0,0 @@
- # Strategy Reference
-
- Reference material for writing and updating strategy.md — the document that maps the shape of the work across stages.
-
- ## strategy.md Format
-
- ```markdown
- ## Completed
- [Compressed summaries of finished stages — delete detail, keep outcomes]
-
- ## Current Stage: [name]
- [Detailed process flow with exit criteria and backtrack triggers]
-
- ## Ahead
- [Sketched future stages — one line each: name + what it covers]
- [Only as far as you can currently see — it's OK if this is vague]
- ```
-
- **Principles:**
- - **Detail the current stage** — concrete enough that the orchestrator can execute without re-reading this skill
- - **Sketch what's ahead** — enough continuity that future updates don't lose the thread, not so much that you're committing to unknowns
- - **Every detailed stage gets exit criteria** — concrete enough to evaluate, not so rigid they become checkboxes
- - **Include user gates** — where does this stage need the user? What decision or approval?
-
- ## Stages name kinds of work, not areas of code
-
- A strategy stage is a **process phase** — `discovery`, `planning`, `implementation`, `validation`, `spike`. It describes the *kind* of thinking happening that stage. It is **not** a work-area label like `auth-refactor`, `tui-panel`, `migration-script`, or `foundations`.
-
- Work areas are the plan agent's job. They live in `context/{plan-lead-agent-id}/plan-stage-N-*.md` and structure the implementation phase from the inside. Keep them out of `strategy.md`.
-
- <example>
- ✓ Correct — process phases:
- ```
- ## Ahead
- - **implementation** — phased build per the plan outline (5 sub-stages: foundations → ask-cli → tui → orphan-handling → migration). Critique + validate per stage.
- - **validation** — run e2e recipe end-to-end, capture evidence, user gate.
- ```
-
- ✗ Wrong — work areas masquerading as stages:
- ```
- ## Ahead
- - **foundations** — humanloop refactor + ask-store helpers
- - **ask-cli + haiku + template** — CLI command and tool-use loop
- - **tui-integration** — inbox panel and key routing
- - **orphan-handling** — kill/complete paths
- - **migration + e2e validation** — drop old command, run recipe
- ```
- The second list is a roadmap of code work. Strategy.md collapses into a task list and the process shape (when do we critique? when do we validate? what's the user gate?) disappears.
- </example>
-
- When you're tempted to name a stage after a code area, that signals you're sketching the plan, not the strategy. Push that detail down into the plan agent's output and keep `strategy.md` at the process-shape layer.
-
- ## Default Pipeline Shape
-
- The session's effort tier dictates the default pipeline. **Use this shape unless the problem explicitly demands more or less.** The user can change tiers via `sis session effort <low|medium|high|xhigh>`.
-
- <!--EFFORT:LOW-->
- **Pipeline:** `plan → implement → validate`
-
- A single plan agent, a single implement agent, a single validate agent. No spec, problem, test-spec, or review-plan stages — the user's request is the requirement; ask in-band if anything's ambiguous. If the work is wrapper-shaped (every change backs onto an existing CLI/API/handler), move directly from discovery into implementation mode without a planning-mode cycle at all.
- <!--/EFFORT-->
-
- <!--EFFORT:MEDIUM-->
- **Pipeline:** `(spec, if behavior changes) → plan → implement → validate`
-
- Add `sisyphus:review-plan` only when the plan covers multi-domain integration. Add `sisyphus:test-spec` **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"). Silence is a "no" — do not proactively ask, do not infer from feature risk. Spawn `sisyphus:spec` and `sisyphus:problem` only when the goal has multiple valid framings or the design space is genuinely open.
- <!--/EFFORT-->
-
- <!--EFFORT:HIGH,XHIGH-->
- **Pipeline:** `discovery → spec → planning (with parallel review-plan) → phased implementation with critique/validate checkpoints → validation`
-
- `sisyphus:review-plan` runs after the plan is drafted. `sisyphus:spec` spawns whenever a feature adds user-visible behavior. `sisyphus:problem` spawns when the goal is nebulous. Append `+ test-spec` to the planning stage **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"); silence is a "no." When justified, `sisyphus:test-spec` spawns in parallel with the high-level plan at Cycle 2, not after implementation — post-implementation test-spec silently describes what the code does rather than what it should do.
- <!--/EFFORT-->
-
- **Re-evaluate the tier when scope shifts mid-session.** A MEDIUM feature that uncovers a new subsystem may have crossed into HIGH; a HIGH feature whose scope was narrowed may have dropped to MEDIUM. Re-run `sis session effort` and re-invoke this skill rather than continuing under the old tier's pipeline.
-
- ## Choosing a Different Shape
-
- If the default doesn't match the problem, these canonical progressions are the next-best starting points — pick the closest one and prune what's already clear, rather than inventing custom shapes:
-
- ```
- discovery → spec → planning → implementation → validation
- exploration → spike → design → implementation → validation
- investigation → recommendation → (user decides) → implementation
- analysis → phased-transformation → verification
- discovery → product-design → technical-investigation → architecture → implementation → validation
- ```
-
- Add a new stage *type* only when the problem demands a kind of work the patterns don't cover — for example a `spike` to prove feasibility, a `compatibility-check` before a migration, or a `prototype` before committing. The test for "is this a real new stage?" is whether it names a different kind of thinking, not a different slice of code.
-
- ## Stage Patterns
-
- Use these as starting points. Invent new stage types when the problem demands it. Add backtrack edges where you can foresee things going wrong.
-
- ### discovery
- **Use when:** Goal is undefined, ambiguous, or has shifted — need to clarify what "done" looks like before any other stage runs. Also re-entered mid-session when a pivot invalidates the current goal.
- - Process: read prior context (goal.md, prior strategy if any) → if the goal is provably clear, write goal.md and run the clarity-confirmation deck → otherwise spawn `sisyphus:problem` for interactive exploration → user iterates → fold result into goal.md → set effort tier → write or revise strategy.md
- - Exit: goal.md is current and confirmed; effort tier is set; strategy.md exists for this iteration
- - Produces: goal.md, strategy.md, optionally context/problem.md or context/problem-bifurcation.md
- - Backtrack: if scope reveals multiple independent projects, issue a decomposition deck and let the user pick a lead — record the others under "Known follow-ups" in goal.md
-
- ### exploration
- **Use when:** Need to understand the technical landscape before committing to an approach.
- - Process: spawn explore agents (each producing a focused context doc) → review findings → identify gaps → re-explore or converge
- - Exit: enough understanding to make decisions — key questions answered, relevant patterns documented
- - Produces: context documents (one per investigation angle, not one sprawling doc)
-
- ### spike
- **Use when:** Feasibility is uncertain — need to prove an approach works before investing in full design.
- - Process: identify the riskiest assumption → build a minimal prototype that tests it → evaluate results → present findings to user if the spike changes the approach
- - Exit: feasibility confirmed or denied with evidence, decision on path forward
- - Produces: spike findings in context/, prototype code (may be throwaway)
- - Backtrack: if spike fails → re-explore alternatives
-
- ### spec
- **Use when:** Need to define what to build and how, in a single interactive session.
- - Process: spawn sisyphus:spec → lead explores codebase, asks user questions, dispatches engineer for design and a single writer for requirements → user reviews via TUI → lead deepens design with findings
- - Exit: user-approved design + requirements with testable acceptance criteria
- - Produces: context/design.md + context/design.json + context/requirements.json + context/requirements.md
- - Backtrack: if problem was misframed → re-explore or re-discover
-
- ### planning
- **Use when:** Design approved, need an executable breakdown.
- - Process: spawn plan lead with spec outputs (requirements + design) as inputs → adversarial review of plan → create e2e verification recipe
- - Exit: reviewed plan + executable e2e-recipe.md that defines how to prove the feature works
- - Produces: phased implementation plan + e2e recipe in context/
- - Backtrack: if plan reveals design infeasibility → revisit spec
-
- ### implementation
- **Use when:** Plan exists, time to build.
- - Process: for each phase → detail-plan → spawn implement agents → single critique pass → refine → validate phase
- - Exit: all phases validated with evidence, no critical review findings remain
- - Loops: none within a phase — review runs once, fixes land, then validation. If review surfaces architectural issues, backtrack to plan; otherwise advance.
- - Backtrack: if 2+ agents hit same unexpected complexity → revisit plan or spec; if review finds architectural issues → revisit plan
-
- ### validation
- **Use when:** Implementation complete, need to prove it works end-to-end.
- - Process: run full e2e recipe → collect evidence (command output, screenshots, responses) → assess against success criteria → step back and check if the goal is actually met
- - Exit: all recipe steps pass with concrete evidence, original goal satisfied
- - Produces: validation report with evidence
- - Backtrack: if bugs found → implementation; if architectural issues → spec
-
- ## Mid-session shape revisions
-
- When the work in flight reveals the strategy itself is off, escalate up this ladder — reach for the lowest-cost move that fits.
-
- 1. **Revise in place.** Stage detail evolved but the pipeline shape holds. Edit `strategy.md` and `roadmap.md`; continue.
- 2. **`sisyphus:strategize`.** Approach is wrong but artifacts (specs, explorations, reports) still apply. Annotates the pivot into `strategy.md` and yields `--mode discovery` with a fresh orchestrator.
- 3. **`sis session clone <goal>`.** The session is actually two (or more) independent projects. Forks scope into a new top-level session; update `goal.md`/`roadmap.md` here to drop what was cloned.
- 4. **`sis session rollback <sessionId> <cycle>`.** A specific cycle introduced state to discard. Rewinds and pauses the session — cycles after the target are lost. Last resort; the others preserve history.
-
- When the user is the source of the change, update `goal.md` first — strategy revision is downstream of goal.
-
- ## Design Philosophy
-
- Frameworks to inform process shape selection — use them to *choose the right shape*, not to follow mechanically:
-
- - **Double Diamond** — Diverge to explore, converge on a definition; diverge on solutions, converge on implementation. Use when requirements are unclear or the problem needs defining.
- - **OODA (Observe–Orient–Decide–Act)** — Tight sensing/reacting loops. Use when the situation is fluid and the cost of wrong moves is low (debugging, spikes, incident response).
- - **Cynefin** — Match approach to domain. Clear → best practice. Complicated → analyze then execute. Complex → probe, sense, respond. Chaotic → act to stabilize.
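
Reviewer sketch (not part of the removed file): the "Default Pipeline Shape" section in the hunk above maps each effort tier to a default pipeline. That mapping can be written as a small lookup. The function name `default_pipeline` is hypothetical and the package may encode the mapping quite differently; the pipeline strings are copied from the removed text.

```bash
#!/usr/bin/env bash
# default_pipeline <low|medium|high|xhigh>
# Prints the default pipeline string the removed strategy.md lists for a tier.
# Returns non-zero for any tier name the document does not mention.
default_pipeline() {
  case "$1" in
    low)
      echo 'plan → implement → validate' ;;
    medium)
      echo '(spec, if behavior changes) → plan → implement → validate' ;;
    high|xhigh)
      echo 'discovery → spec → planning (with parallel review-plan) → phased implementation with critique/validate checkpoints → validation' ;;
    *)
      echo "unknown effort tier: $1" >&2
      return 1 ;;
  esac
}

default_pipeline medium
```

HIGH and XHIGH share one branch here because the removed text groups them under a single `EFFORT:HIGH,XHIGH` marker.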
@@ -1,266 +0,0 @@
1
- # Work Breakdown Patterns
2
-
3
- Patterns for how the orchestrator should structure roadmap.md for common workflow types. Each pattern shows the plan structure, agent assignments, cycle sequencing, and failure handling.
4
-
5
- ---
6
-
7
- ## Bug Fix
8
-
9
- ### When to use
10
- Something is broken. User reports a bug, test is failing, behavior is wrong.
11
-
12
- ### Plan structure
13
- ```
14
- ## Bug Fix: [description]
15
-
16
- - [ ] Diagnose root cause of [bug description]
17
- - [ ] Implement fix for [root cause]
18
- - [ ] Validate fix — regression tests pass, bug is resolved
19
- - [ ] Review fix for unintended side effects
20
- ```
21
-
22
- ### Cycle plan
23
- - **Cycle 1**: Spawn `sisyphus:debug` for diagnosis. Yield.
24
- - **Cycle 2**: Read diagnosis report. If confident root cause found, spawn `sisyphus:implement` for fix with the diagnosis as context. Yield.
25
- - **Cycle 3**: Spawn `sisyphus:validate` for validation. Yield.
26
- - **Cycle 4**: If validation passes, spawn `sisyphus:review` for review. If fails, update plan with failure context and respawn implement. Yield.
27
- - **Cycle 5**: Review results. Complete or address review findings.
28
-
29
- ### Failure modes
30
- - **Debug inconclusive**: Add more context to plan, respawn debug with narrower scope or different focus areas.
31
- - **Fix breaks other things**: Validation catches this. Feed validation failures back into a new implement cycle.
32
- - **Root cause was wrong**: Update plan with what was learned, respawn debug.
33
-
34
- ### Parallelization
35
- Usually serial — diagnosis must complete before fix, fix before validation. Exception: if the bug affects multiple independent areas, spawn multiple debug agents in parallel.
36
-
37
- ---
38
-
39
- ## Feature Build (Small — 1-3 files)
40
-
41
- ### When to use
42
- Clear requirements, small scope, no formal requirements document needed.
43
-
44
- ### Plan structure
45
- ```
46
- ## Feature: [description]
47
-
48
- - [ ] Plan implementation for [feature]
49
- - [ ] Implement [feature]
50
- - [ ] Validate implementation
51
- ```
52
-
53
- ### Cycle plan
54
- - **Cycle 1**: Spawn `sisyphus:plan` for planning. Yield.
55
- - **Cycle 2**: Spawn `sisyphus:implement` with plan path. Yield.
56
- - **Cycle 3**: Spawn `sisyphus:validate` for validation. Yield.
57
- - **Cycle 4**: Complete or fix issues.
58
-
59
- ### Parallelization
60
- Serial. Too small to benefit from parallel agents.
61
-
62
- ---
63
-
64
- ## Feature Build (Medium — 4-10 files)
65
-
66
- ### When to use
67
- Feature with moderate complexity. Requirements may need clarification. Multiple files across a few modules.
68
-
69
- ### Plan structure
70
- ```
71
- ## Feature: [description]
72
-
73
- ### Requirements & Design
74
- - [ ] (conditional) Problem exploration — if goal is nebulous, explore before spec
75
- - [ ] Requirements — define acceptance criteria
76
- - [ ] Design — architecture, component boundaries, data models
77
- - [ ] Create implementation plan from requirements + design
78
- - [ ] Review plan against requirements + design
79
-
80
- ### Implementation
- - [ ] Phase 1 — [foundation/types/interfaces]
- - [ ] Phase 2 — [core logic]
- - [ ] Critique phases 1-2
- - [ ] Phase 3 — [integration/wiring]
- - [ ] Validate — smoketest full feature e2e
- - [ ] Review implementation
- ```
-
- Note: critique and validation are embedded between implementation phases, not deferred to the end. Phase 1 (types) is low-risk and doesn't need its own review, but the critique after Phase 2 catches issues in phases 1-2 before Phase 3 builds on them. Validation happens after integration, when all the pieces come together.
-
- ### Cycle plan
- - **Cycle 0** (conditional): If the problem is nebulous — multiple valid framings, unclear what "done" looks like — spawn `sisyphus:problem` for interactive exploration. Yield `--mode discovery`. Skip if goal is clear and acceptance criteria are obvious.
- - **Cycle 1**: Spawn `sisyphus:spec` for combined design + requirements. Yield. (Human iterates inside the spec session.)
- - **Cycle 2**: Spawn `sisyphus:plan` for plan. Yield.
- - **Cycle 3**: Spawn `sisyphus:review-plan` for review. If fail, respawn plan with issues. Yield.
- - **Cycle 4**: Spawn `sisyphus:implement` for Phase 1. Yield.
- - **Cycle 5**: Spawn `sisyphus:implement` for Phase 2. Phase 1 is types — low risk, doesn't need its own validation. Yield.
- - **Cycle 6**: Spawn `sisyphus:review` for critique of phases 1-2. This is the checkpoint before integration builds on top. Yield.
- - **Cycle 7**: Address critique findings + spawn `sisyphus:implement` for Phase 3. Yield.
- - **Cycle 8**: `sis orch yield --mode validation` for e2e smoketest. Validation mode proves the feature works — operator for UI, evidence for every claim.
- - **Cycle 9**: Address validation failures (back to `--mode implementation`) or complete.
-
- ### Failure modes
- - **Spec needs human input**: Mark session as needing human review. Orchestrator notes open questions.
- - **Plan fails review**: Feed review issues back, respawn planner.
- - **Critique finds issues in foundation**: Fix before starting integration — don't build on shaky ground.
- - **Validation fails**: Feed specifics back to implement agent for the failing area.
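The "Plan fails review" failure mode (Cycles 2 and 3) amounts to a feedback loop. The following is only an illustrative sketch, not how the package implements it: `make_plan`, `review_plan`, and the `max_rounds` cap are hypothetical stand-ins for spawning `sisyphus:plan` and `sisyphus:review-plan`, and the stub agents exist purely to make the example runnable.

```python
def plan_until_approved(make_plan, review_plan, max_rounds=3):
    """Respawn the planner with review issues until the review passes.

    make_plan(issues) -> plan text; review_plan(plan) -> list of issues.
    Both callables and max_rounds are hypothetical, not part of the package.
    """
    issues = []
    for round_no in range(1, max_rounds + 1):
        plan = make_plan(issues)      # Cycle 2: plan, fed any prior review issues
        issues = review_plan(plan)    # Cycle 3: review the plan
        if not issues:
            return plan, round_no
    raise RuntimeError(
        f"plan still failing review after {max_rounds} rounds: {issues}"
    )


# Stub agents for illustration only.
def stub_planner(issues):
    return "plan v2 (with rollback step)" if issues else "plan v1"


def stub_reviewer(plan):
    return [] if "rollback" in plan else ["missing rollback step"]


print(plan_until_approved(stub_planner, stub_reviewer))
# ('plan v2 (with rollback step)', 2)
```

The cap on rounds is an added assumption; the original text does not say how many times the planner may be respawned.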
-
- ### Parallelization
- Phases without dependencies can run in parallel. Types/interfaces (Phase 1) must complete before implementation phases that consume them. Critique can run alongside detail-planning for the next phase.
-
- ---
-
- ## Feature Build (Large — 10+ files)
-
- ### When to use
- Cross-cutting feature, multiple domains, needs team coordination. Uses **progressive planning** — high-level outline first, then detail-plan each stage as it's reached.
-
- ### Plan structure
- ```
- ## Feature: [description]
-
- ### Requirements & Design
- - [ ] (conditional) Problem exploration — if goal is nebulous
- - [ ] Requirements
- - [ ] Design
-
- ### Stage Outline (high-level only — no file-level detail yet)
- 1. [domain A foundation] — no deps — ~N cycles
- 2. [domain B foundation] — no deps — ~N cycles
- → critique stages 1-2 (foundation is low-risk individually, but review before building on it)
- 3. [domain A implementation] — depends on 1 — ~N cycles
- 4. [domain B implementation] — depends on 2 — ~N cycles
- → critique + validate stages 3-4 (core logic, high risk — verify before integration)
- 5. [integration layer] — depends on 3, 4 — ~N cycles
- → validate end-to-end (integration is where accumulated assumptions break)
- 6. [final review] — depends on all
-
- ### Current Stage: [whichever is active]
- See context/{plan-lead-agent-id}/plan-stage-N-{name}.md for detail plan. (Path comes from the plan lead's submission report.)
- - [ ] [task-level items from detail plan]
- ```
-
- Note: verification checkpoints are embedded in the stage outline, not deferred to a final phase. The level of rigor varies — foundation stages get a light critique, core logic gets critique + validation, integration gets full e2e validation. This is judgment, not formula.
-
- ### Cycle plan
- - **Cycle 0** (conditional): If the problem is nebulous, spawn explore agents for technical landscape (yield `--mode discovery`), then spawn `sisyphus:problem` for interactive problem exploration (yield `--mode discovery`). May take 1-3 discovery cycles. Skip if the goal and scope are already clear.
- - **Cycle 1**: Spawn `sisyphus:spec` for combined design + requirements. Yield. (Human iterates inside the spec session.)
- - **Cycle 2**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Include verification checkpoints between stages based on risk." If the user's initial prompt or goal.md explicitly requested tests, also spawn `sisyphus:test-spec` for test properties in parallel; otherwise skip. Yield.
- - **Cycle 4**: Review outline. Spawn `sisyphus:plan` to **detail-plan stage 1 only** (provide outline as context). The plan agent saves under its own subdir and reports the full path — carry that path forward for the implement cycle. Yield.
- - **Cycle 5**: Spawn `sisyphus:implement` for stage 1. If stage 2 is independent, spawn `sisyphus:plan` to detail-plan stage 2 in parallel. Yield.
- - **Cycle 6**: Spawn `sisyphus:implement` for stage 2 (if detail-planned). Spawn `sisyphus:review` to critique stages 1-2 in parallel — foundation review before core logic builds on it. Detail-plan stage 3 in parallel. Yield.
- - **Cycle 7**: Address critique findings. Spawn `sisyphus:implement` for stage 3. Yield.
- - **Cycle 8**: Spawn `sisyphus:implement` for stage 4. Spawn `sisyphus:review` to critique stage 3 in parallel. Yield.
- - **Cycle 9**: Spawn `sisyphus:validate` for stages 3-4 — core logic checkpoint before integration. Address stage 3 critique. Yield.
- - **Cycle 10+**: Implement integration stage. Final review. Then `sis orch yield --mode validation` for comprehensive e2e proof.
-
- ### Failure modes
- - **Detail-plan agent can't produce quality output**: The stage is still too large. Break it into sub-stages in the outline and detail-plan each sub-stage individually.
- - **Integration failures**: Often means contracts between domains don't match. Spawn debug agent targeting the integration seam.
- - **Stage N implementation invalidates stage N+1 outline**: Update the high-level outline. This is expected — it's why you don't detail-plan everything upfront.
- - **Critique finds issues after multiple stages built on top**: This is the scenario verification checkpoints exist to prevent. If it happens, you waited too long to review — add earlier checkpoints to the roadmap going forward.
-
- ### Parallelization
- Maximize within the progressive pattern. Independent stages run in parallel. Detail-planning the next stage runs alongside implementing the current one. Critique and validation agents run alongside the next stage's planning or implementation. Foundation stages complete before dependent stages. Integration waits for all domain implementations.
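The dependency rules above ("independent stages run in parallel", "integration waits for all domain implementations") can be sketched as a grouping of stages into waves. This is a minimal illustration under stated assumptions: the `STAGE_DEPS` map simply transcribes the example stage outline, and `parallel_waves` is a hypothetical helper, not something the package provides.

```python
def parallel_waves(deps):
    """Group stages into waves; each wave depends only on earlier waves.

    deps maps a stage id to the stage ids it depends on.
    """
    remaining = {stage: set(d) for stage, d in deps.items()}
    done = set()
    waves = []
    while remaining:
        # A stage is ready once every one of its dependencies has completed.
        ready = sorted(s for s, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies in outline")
        waves.append(ready)
        done.update(ready)
        for stage in ready:
            del remaining[stage]
    return waves


# Transcribed from the example stage outline (illustrative only).
STAGE_DEPS = {
    1: [],               # domain A foundation
    2: [],               # domain B foundation
    3: [1],              # domain A implementation
    4: [2],              # domain B implementation
    5: [3, 4],           # integration layer
    6: [1, 2, 3, 4, 5],  # final review, depends on all
}

print(parallel_waves(STAGE_DEPS))  # [[1, 2], [3, 4], [5], [6]]
```

The waves show only what the dependencies allow. The cycle plan above staggers some of these stages anyway, because each stage is detail-planned just before it is implemented.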
-
- ---
-
- ## Refactor
-
- ### When to use
- Restructure code without changing behavior. Move files, rename abstractions, consolidate patterns.
-
- ### Plan structure
- ```
- ## Refactor: [description]
-
- - [ ] Analyze current structure and plan refactor
- - [ ] Capture behavioral snapshot (existing tests + manual checks)
- - [ ] Execute refactor phase 1 — [structural changes]
- - [ ] Execute refactor phase 2 — [update consumers]
- - [ ] Validate behavior preserved — all original tests pass
- - [ ] Review for missed references, dead code, broken imports
- ```
-
- ### Cycle plan
- - **Cycle 1**: Spawn `sisyphus:plan` for analysis + `sisyphus:validate` to capture baseline (parallel). Yield.
- - **Cycle 2**: Spawn `sisyphus:implement` for phase 1. Yield.
- - **Cycle 3**: Spawn `sisyphus:implement` for phase 2 + `sisyphus:validate` for phase 1 (parallel). Yield.
- - **Cycle 4**: Spawn `sisyphus:validate` for full validation. Yield.
- - **Cycle 5**: Spawn `sisyphus:review` for final review. Complete.
-
- ### Key principle
- **Behavior preservation is the only metric.** The refactor is correct if and only if all existing tests pass and externally observable behavior is unchanged.
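One way to picture the "behavioral snapshot" check is as a comparison of a baseline captured before the refactor against the same checks re-run afterwards. The sketch below is an assumption-laden illustration: the snapshot format (check name mapped to observed result) and the check names are invented for the example, and `find_regressions` is a hypothetical helper.

```python
_MISSING = object()  # sentinel: the check was not observed after the refactor


def find_regressions(baseline, current):
    """Return baseline checks whose result changed or disappeared.

    Both arguments map a check name to its observed result. Checks that
    exist only in `current` are ignored: only pre-existing behavior counts.
    """
    return {
        name: (expected, current.get(name, "<missing>"))
        for name, expected in baseline.items()
        if current.get(name, _MISSING) != expected
    }


# Hypothetical snapshots for illustration.
baseline = {"test_login": "pass", "test_export": "pass", "GET /health": 200}
after = {"test_login": "pass", "test_export": "fail", "GET /health": 200,
         "test_new_helper": "pass"}

print(find_regressions(baseline, after))  # {'test_export': ('pass', 'fail')}
```

An empty result corresponds to the principle above: every original check still produces the same outcome.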
-
- ### Parallelization
- Limited. Refactor phases are often sequential (move code before updating its consumers). Validation can run in parallel with the next phase if they touch different files.
-
- ---
-
- ## Code Review
-
- ### When to use
- PR review, pre-merge check, or periodic quality audit.
-
- ### Plan structure
- ```
- ## Review: [scope]
-
- - [ ] Review [scope] for issues
- - [ ] (conditional) Fix critical/high issues found
- - [ ] Verify fixes landed (type-check, tests pass)
- ```
-
- ### Cycle plan
- - **Cycle 1**: Spawn `sisyphus:review` for review. Yield.
- - **Cycle 2**: If critical/high issues, spawn `sisyphus:implement` for fixes. If clean, complete.
- - **Cycle 3**: Verify fixes landed by reading fix-agent reports + running type-check/tests. Complete. Do **not** spawn a second review pass — review runs once, validation catches regressions.
-
- ### Parallelization
- Review itself parallelizes internally (subagents per concern). Fix cycle is usually serial.
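The Cycle 2 decision in this pattern (fix only critical/high issues, otherwise complete) can be written as a small gate. This is a hedged sketch: the shape of a finding (a dict with `severity` and `summary` keys) and the `medium` label are assumptions made for illustration. Only the `critical` and `high` labels come from the text above.

```python
BLOCKING_SEVERITIES = {"critical", "high"}


def next_action(findings):
    """Decide the Cycle 2 step from review findings (hypothetical format)."""
    blocking = [f for f in findings if f["severity"] in BLOCKING_SEVERITIES]
    if blocking:
        # Hand only the blocking findings to the fix agent.
        return "spawn sisyphus:implement", blocking
    return "complete", []


findings = [
    {"severity": "high", "summary": "unchecked null in parser"},
    {"severity": "medium", "summary": "inconsistent naming"},
]
action, to_fix = next_action(findings)
print(action, len(to_fix))  # spawn sisyphus:implement 1
```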
-
- ---
-
- ## Investigation / Spike
-
- ### When to use
- Need to understand something before committing to an approach. Prototype, explore, or answer a technical question.
-
- ### Plan structure
- ```
- ## Investigation: [question/area]
-
- - [ ] Investigate [question/area]
- - [ ] Summarize findings and recommendation
- ```
-
- ### Cycle plan
- - **Cycle 1**: Spawn `sisyphus:debug` (for code investigation) or `sisyphus:general` (for broader research). Yield.
- - **Cycle 2**: Spawn `sisyphus:general` to synthesize findings. Complete.
-
- ### Parallelization
- If investigating multiple independent areas, spawn parallel agents each exploring a different angle.
-
- ---
-
- ## Tactician-Driven Implementation
-
- ### When to use
- The plan exists and you want automated cycle-by-cycle execution without manual orchestrator decisions. The tactician reads the plan, dispatches one phase at a time, and tracks progress.
-
- ### Plan structure
- ```
- ## Tactician Execution
-
- - [ ] Execute implementation plan at [path] using tactician loop
- ```
-
- ### Cycle plan
- This is a single-item pattern. The orchestrator spawns the tactician once:
- - **Cycle 1**: Spawn `sisyphus:tactician` with plan path. The tactician internally dispatches implement/validate agents via submit tool actions. The orchestrator's role is minimal — just monitor the tactician's completion report.
-
- ### When NOT to use
- - When you need human checkpoints between phases
- - When phases have external dependencies (waiting on API access, design review, etc.)
- - When the task requires creative decisions the tactician shouldn't make alone