sisyphi 1.0.13 → 1.1.0
- package/dist/{chunk-T7ETTIQK.js → chunk-M7LZ2ZHD.js} +3 -27
- package/dist/chunk-M7LZ2ZHD.js.map +1 -0
- package/dist/{chunk-JXKUI4P6.js → chunk-REUQ4B45.js} +7 -38
- package/dist/chunk-REUQ4B45.js.map +1 -0
- package/dist/{chunk-LWWRGQWM.js → chunk-Z32YVDMY.js} +2 -2
- package/dist/chunk-Z32YVDMY.js.map +1 -0
- package/dist/cli.js +75 -56
- package/dist/cli.js.map +1 -1
- package/dist/daemon.js +776 -629
- package/dist/daemon.js.map +1 -1
- package/dist/{paths-NUUALUVP.js → paths-IJXOAN4E.js} +4 -6
- package/dist/templates/CLAUDE.md +16 -14
- package/dist/templates/agent-plugin/agents/CLAUDE.md +17 -6
- package/dist/templates/agent-plugin/agents/design.md +134 -0
- package/dist/templates/agent-plugin/agents/explore.md +39 -0
- package/dist/templates/agent-plugin/agents/operator.md +24 -0
- package/dist/templates/agent-plugin/agents/plan.md +15 -20
- package/dist/templates/agent-plugin/agents/problem.md +119 -0
- package/dist/templates/agent-plugin/agents/requirements.md +138 -0
- package/dist/templates/agent-plugin/agents/review/CLAUDE.md +29 -0
- package/dist/templates/agent-plugin/agents/review/compliance.md +6 -6
- package/dist/templates/agent-plugin/agents/review-plan/code-smells.md +4 -4
- package/dist/templates/agent-plugin/agents/review-plan/requirements-coverage.md +62 -0
- package/dist/templates/agent-plugin/agents/review-plan/security.md +1 -1
- package/dist/templates/agent-plugin/agents/review-plan.md +9 -8
- package/dist/templates/agent-plugin/agents/review.md +1 -1
- package/dist/templates/agent-plugin/agents/test-spec.md +3 -3
- package/dist/templates/agent-plugin/hooks/CLAUDE.md +2 -2
- package/dist/templates/agent-plugin/hooks/explore-user-prompt.sh +13 -0
- package/dist/templates/agent-plugin/hooks/plan-user-prompt.sh +1 -1
- package/dist/templates/agent-plugin/hooks/require-submit.sh +70 -3
- package/dist/templates/agent-plugin/hooks/review-plan-user-prompt.sh +4 -4
- package/dist/templates/agent-plugin/hooks/review-user-prompt.sh +1 -1
- package/dist/templates/agent-suffix.md +0 -2
- package/dist/templates/orchestrator-base.md +169 -145
- package/dist/templates/orchestrator-impl.md +92 -57
- package/dist/templates/orchestrator-planning.md +46 -56
- package/dist/templates/orchestrator-plugin/commands/sisyphus/design.md +13 -0
- package/dist/templates/orchestrator-plugin/commands/sisyphus/problem.md +13 -0
- package/dist/templates/orchestrator-plugin/commands/sisyphus/requirements.md +13 -0
- package/dist/templates/orchestrator-plugin/commands/sisyphus/strategize.md +19 -0
- package/dist/templates/orchestrator-plugin/hooks/explore-gate.sh +15 -0
- package/dist/templates/orchestrator-plugin/hooks/hooks.json +14 -1
- package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +34 -27
- package/dist/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +56 -24
- package/dist/templates/orchestrator-strategy.md +233 -0
- package/dist/templates/orchestrator-validation.md +94 -0
- package/dist/tui.js +2730 -2924
- package/dist/tui.js.map +1 -1
- package/package.json +2 -4
- package/templates/CLAUDE.md +16 -14
- package/templates/agent-plugin/agents/CLAUDE.md +17 -6
- package/templates/agent-plugin/agents/design.md +134 -0
- package/templates/agent-plugin/agents/explore.md +39 -0
- package/templates/agent-plugin/agents/operator.md +24 -0
- package/templates/agent-plugin/agents/plan.md +15 -20
- package/templates/agent-plugin/agents/problem.md +119 -0
- package/templates/agent-plugin/agents/requirements.md +138 -0
- package/templates/agent-plugin/agents/review/CLAUDE.md +29 -0
- package/templates/agent-plugin/agents/review/compliance.md +6 -6
- package/templates/agent-plugin/agents/review-plan/code-smells.md +4 -4
- package/templates/agent-plugin/agents/review-plan/requirements-coverage.md +62 -0
- package/templates/agent-plugin/agents/review-plan/security.md +1 -1
- package/templates/agent-plugin/agents/review-plan.md +9 -8
- package/templates/agent-plugin/agents/review.md +1 -1
- package/templates/agent-plugin/agents/test-spec.md +3 -3
- package/templates/agent-plugin/hooks/CLAUDE.md +2 -2
- package/templates/agent-plugin/hooks/explore-user-prompt.sh +13 -0
- package/templates/agent-plugin/hooks/plan-user-prompt.sh +1 -1
- package/templates/agent-plugin/hooks/require-submit.sh +70 -3
- package/templates/agent-plugin/hooks/review-plan-user-prompt.sh +4 -4
- package/templates/agent-plugin/hooks/review-user-prompt.sh +1 -1
- package/templates/agent-suffix.md +0 -2
- package/templates/orchestrator-base.md +169 -145
- package/templates/orchestrator-impl.md +92 -57
- package/templates/orchestrator-planning.md +46 -56
- package/templates/orchestrator-plugin/commands/sisyphus/design.md +13 -0
- package/templates/orchestrator-plugin/commands/sisyphus/problem.md +13 -0
- package/templates/orchestrator-plugin/commands/sisyphus/requirements.md +13 -0
- package/templates/orchestrator-plugin/commands/sisyphus/strategize.md +19 -0
- package/templates/orchestrator-plugin/hooks/explore-gate.sh +15 -0
- package/templates/orchestrator-plugin/hooks/hooks.json +14 -1
- package/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +34 -27
- package/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +56 -24
- package/templates/orchestrator-strategy.md +233 -0
- package/templates/orchestrator-validation.md +94 -0
- package/dist/chunk-JXKUI4P6.js.map +0 -1
- package/dist/chunk-LWWRGQWM.js.map +0 -1
- package/dist/chunk-T7ETTIQK.js.map +0 -1
- package/dist/templates/agent-plugin/agents/review-plan/spec-coverage.md +0 -44
- package/dist/templates/agent-plugin/agents/spec-draft.md +0 -78
- package/dist/templates/agent-plugin/hooks/hooks.json +0 -25
- package/dist/templates/agent-plugin/hooks/spec-user-prompt.sh +0 -19
- package/dist/templates/orchestrator-plugin/skills/git-management/SKILL.md +0 -111
- package/templates/agent-plugin/agents/review-plan/spec-coverage.md +0 -44
- package/templates/agent-plugin/agents/spec-draft.md +0 -78
- package/templates/agent-plugin/hooks/hooks.json +0 -25
- package/templates/agent-plugin/hooks/spec-user-prompt.sh +0 -19
- package/templates/orchestrator-plugin/skills/git-management/SKILL.md +0 -111
- package/dist/{paths-NUUALUVP.js.map → paths-IJXOAN4E.js.map} +0 -0
@@ -1,122 +1,157 @@
 # Implementation Phase
 
-
+<stage-execution>
 
-
+## Maximize Parallelism
 
-Before
+Before each cycle, ask: **which stages or tasks are independent right now?** If two stages touch different subsystems, spawn them concurrently.
 
-Maximize parallelism **within your development cycle, not by skipping parts of it.** Running a review alongside the next stage's implementation is good parallelism. Skipping review because the next stage is ready is
+Maximize parallelism **within your development cycle, not by skipping parts of it.** Running a review alongside the next stage's implementation is good parallelism. Skipping review because the next stage is ready is cutting corners.
 
-If the plan has stages that share no file dependencies,
+If the plan has stages that share no file dependencies, run them in parallel from the start. The development cycle for each stage:
 
-1. **Detail-plan it** — expand the
-2. **Implement it** — spawn agents with self-contained instructions
-3. **Critique and refine it** — spawn review agents, fix what they find
-4. **Validate it** —
+1. **Detail-plan it** — expand the outline into specific file changes. If complex, spawn a requirements or design agent first.
+2. **Implement it** — spawn agents with self-contained instructions.
+3. **Critique and refine it** — spawn review agents, fix what they find.
+4. **Validate it** — verify the stage actually works end-to-end.
 
-Not every stage needs every step
--
--
--
+Not every stage needs every step:
+- Types/interfaces → implementation only (consumers surface type errors)
+- Core business logic → implementation + critique minimum
+- Integration/critical path → full loop including validation
 
-
+**When multiple stages have completed without any critique or validation, stop implementing and catch up on verification.** Don't let unverified work compound.
 
 Don't detail-plan all stages up front. What you learn implementing earlier stages should inform later ones.
 
-
+</stage-execution>
 
-
+<agent-instructions>
 
--
+Implementation agent prompts must be **fully self-contained** — include everything the agent needs so it doesn't have to re-explore or guess:
+
+- The overall session goal (one sentence)
 - This agent's specific task (files to create/modify, what the change does, done condition)
 - References to relevant context files (`conventions.md`, `explore-architecture.md`, etc.)
-- The e2e recipe reference (`context/e2e-recipe.md`)
+- The e2e recipe reference (`context/e2e-recipe.md`) for self-verification
+
+Tell every implementation agent to report clearly when done: what they built, what files they changed, and any issues or uncertainties.
 
-
+<delegate-outcomes>
 
 ### Delegate outcomes, not implementations
 
-
+Define **what needs to happen and why**, not the code to write. If you're writing exact code snippets or line-by-line fix instructions in agent prompts, you're doing the agent's job.
+
+<example>
+<bad>
+"Change line 45 from `x === y` to `crypto.timingSafeEqual(Buffer.from(x), Buffer.from(y))`, handle length mismatch..."
+</bad>
+<good>
+"Fix the timing-safe comparison issue in authMiddleware.ts — see report at reports/agent-002-final.md, Major #3"
+</good>
+</example>
+
+For fix agents: **pass the review report path and tell the agent to action the items.** The agent reads the report, understands the codebase, and figures out the right fix. Writing the code for them defeats the purpose of delegation.
 
-
-**Good**: "Fix the timing-safe comparison issue in authMiddleware.ts — see report at reports/agent-002-final.md, Major #3"
+The exception is architectural constraints the agent wouldn't know: "use the existing `personRepository.findOrCreateOwner` method" or "the Supabase client is at `supabaseService.getClient()`". Give agents the **what** and the **landmarks**, not the **how**.
 
-
+</delegate-outcomes>
 
-
+<context-propagation>
 
 ### Context propagation
 
-The planning phase produced context files — conventions, e2e recipe, architectural findings. Be selective — give each agent the context relevant to their task
+The planning phase produced context files — conventions, e2e recipe, architectural findings. Be selective — give each agent the context relevant to their task.
 
-
+<example>
+<bad>
+"Implement the auth middleware. Look at how the existing middleware works."
+</bad>
+<rationale>Vague. The agent must re-explore the codebase to find conventions and patterns.</rationale>
+<good>
+"Implement auth middleware per context/requirements-auth.md and context/design-auth.md. Reference context/conventions.md for middleware patterns. E2E recipe at context/e2e-recipe.md."
+</good>
+</example>
 
-
+</context-propagation>
 
-
+</agent-instructions>
 
-
+<code-smell-escalation>
 
-
+Instruct agents to flag problems early rather than working around them. When an agent encounters unexpected complexity, unclear architecture, or code that fights back — the right move is to stop and report clearly. A clear problem description is more valuable than a brittle implementation.
 
-
+When you see these reports, investigate before pushing forward. If the smell suggests a design issue, involve the user.
 
-
+</code-smell-escalation>
 
-
+<critique-refinement>
 
-
+## Critique Cycle
 
-
+After implementation agents report, assess whether the stage needs critique before advancing. The failure mode is not "sometimes skipping review" — it's implementing six stages in a row without any.
 
-
+When a stage warrants critique, spawn review agents in parallel, each attacking a different dimension:
+- **Code reuse** — existing utilities, helpers, patterns the new code duplicates
+- **Code quality** — hacky patterns, redundant state, parameter sprawl, copy-paste, leaky abstractions
+- **Efficiency** — redundant computations, N+1 patterns, missed concurrency, unbounded data structures
+
+Give each reviewer the full diff and relevant context files. They report problems — they don't fix.
 
-
+## Refine Cycle
 
-Aggregate
+Aggregate reviewer findings. Spawn fix agents and **point them at the review report** — don't rewrite findings as line-by-line instructions. You triage (skip false positives, note architectural constraints) — they implement.
 
 ```bash
 sisyphus spawn --name "fix-review-issues" --agent-type sisyphus:implement \
 "Fix the issues in reports/agent-003-final.md. Skip item #5 (false positive). Run type-check after."
 ```
 
-
+Fix agents should use `/simplify` to review their own changes before reporting.
 
-
+Re-review after fixes. Stop when reviewers return only stylistic nits. If 3+ rounds are needed, the approach — not the patches — needs rethinking.
 
-
+</critique-refinement>
 
-
+<e2e-validation>
 
-E2E validation confirms the implementation actually works — not just
+E2E validation confirms the implementation actually works — not just compiles or passes unit tests. Reserve full validation for stages where you're building on accumulated work or where failure would be expensive to debug later. Don't let more than 2-3 stages accumulate without one.
 
 Spawn a validation agent with the e2e recipe from `context/e2e-recipe.md`. The agent should:
-- Follow
-- Run every verification step
-- Report exactly what passed and what failed
+- Follow setup steps exactly (build, start servers, seed data)
+- Run every verification step
+- Report exactly what passed and what failed
 
-If the recipe involves UI,
+If the recipe involves UI, use `capture` to screenshot the running app. If API, curl the endpoints. If CLI, exercise it in the terminal.
 
-If the project lacks validation tooling, **create it
+If the project lacks validation tooling, **create it** — a smoke-test script, seed command, or health-check endpoint pays for itself immediately.
 
-
+**Don't advance past a validated stage until validation passes.** If it fails, log failures, spawn fix agents, re-validate.
 
-
-
-When spawning two or more implementation agents in the same cycle, prefer `--worktree` for each. Worktree isolation eliminates file conflict risk — agents can't clobber each other's changes, each gets a clean branch, and they can commit incrementally. The daemon merges branches back when agents complete and surfaces conflicts in your next cycle's state.
+When all implementation stages are complete, transition to validation mode for the comprehensive final pass:
 
 ```bash
-sisyphus
-sisyphus spawn --name "impl-routes" --agent-type sisyphus:implement --worktree "Add login routes — see context/conventions.md and context/explore-architecture.md"
+sisyphus yield --mode validation --prompt "All stages implemented — validate against context/e2e-recipe.md"
 ```
 
-
+Validation mode shifts the orchestrator's entire focus to proving the feature works. Stage-level validation during implementation catches issues early; the final validation pass proves the whole thing holds together.
+
+</e2e-validation>
 
-
+<returning-to-planning>
+
+If the approach is wrong mid-implementation, don't keep pushing. Return to planning:
 
 ```bash
 sisyphus yield --mode planning --prompt "Re-evaluate: discovered X changes the approach — write cycle log"
 ```
 
-
+Concrete triggers:
+- 2+ agents report same unexpected complexity in the same subsystem
+- An agent discovers a dependency that changes the approach
+- Fix agents keep patching the same area across cycles
+
+Document what you found in the cycle log before yielding. Update roadmap.md to reflect you're back in an earlier phase.
+
+</returning-to-planning>
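The "Maximize Parallelism" guidance in the hunk above says stages that share no file dependencies can run concurrently. One way to make that check mechanical is sketched below, assuming each stage plan has been reduced to a plain list of paths. The `stages_independent` helper and the file names are hypothetical and not part of the sisyphus CLI.

```bash
#!/bin/bash
# Hypothetical helper (not a sisyphus command).
# Each argument is a text file listing one path per line that a stage will touch.
# Returns 0 when the lists share no path, 1 when they overlap, 2 on unreadable input.
stages_independent() {
  [ -r "$1" ] && [ -r "$2" ] || return 2
  # -F fixed strings, -x whole-line match, -q quiet, -f read patterns from a file
  if grep -Fxq -f "$1" "$2"; then
    return 1
  fi
  return 0
}

# Throwaway file lists for illustration only
tmp=$(mktemp -d)
printf '%s\n' src/types.ts src/models/user.ts > "$tmp/stage-1.txt"
printf '%s\n' src/routes/login.ts src/middleware/auth.ts > "$tmp/stage-2.txt"

if stages_independent "$tmp/stage-1.txt" "$tmp/stage-2.txt"; then
  echo "no shared files: candidates for a concurrent spawn"
else
  echo "shared files: run serially or re-split the work"
fi
```

A file-list comparison only catches direct overlap. Two stages can still conflict through a shared interface, so treat a passing check as a hint rather than proof.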
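The `<agent-instructions>` section in the hunk above lists what a self-contained implementation prompt should carry: the session goal, the specific task, context file references, the e2e recipe, and a reporting request. A minimal sketch of assembling such a prompt follows. `build_agent_prompt`, the agent name `impl-auth`, and the goal and task text are illustrative assumptions; only the `--name` and `--agent-type` flags come from the template itself.

```bash
#!/bin/bash
# Hypothetical helper: assemble a self-contained prompt from its parts.
# Usage: build_agent_prompt "<session goal>" "<task>" <context file>...
build_agent_prompt() {
  local goal=$1 task=$2 f
  shift 2
  printf 'Session goal: %s\n' "$goal"
  printf 'Your task: %s\n' "$task"
  printf 'Context files:\n'
  for f in "$@"; do
    printf '  - %s\n' "$f"
  done
  printf 'Self-verify with: context/e2e-recipe.md\n'
  printf 'When done, report what you built, which files changed, and any issues or uncertainties.\n'
}

prompt=$(build_agent_prompt \
  "Add session-based login" \
  "Implement the auth middleware; done when protected routes reject anonymous requests" \
  context/conventions.md context/explore-architecture.md)

# Print the spawn command instead of running it
printf 'sisyphus spawn --name "impl-auth" --agent-type sisyphus:implement "%s"\n' "$prompt"
```

The task line states an outcome and a done condition, not the code to write, which keeps the prompt in line with "delegate outcomes, not implementations".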
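The "Critique Cycle" text in the hunk above calls for parallel reviewers, one per dimension (code reuse, code quality, efficiency), each given the full diff. The dry-run sketch below prints one spawn command per dimension without executing anything. The `sisyphus:review` agent type and the `reports/diff-stage-2.patch` path are assumptions made for illustration; check the installed agent types before using them.

```bash
#!/bin/bash
# Dry run: prints one spawn command per review dimension instead of executing it.
# Assumed, not confirmed by the diff: the sisyphus:review agent type and the
# patch path. --name and --agent-type are the flags the template uses elsewhere.
print_review_spawns() {
  local diff_path=$1 dim
  for dim in code-reuse code-quality efficiency; do
    printf 'sisyphus spawn --name "review-%s" --agent-type sisyphus:review "Review %s for %s issues. Reference context/conventions.md. Report problems only; do not fix."\n' \
      "$dim" "$diff_path" "$dim"
  done
}

print_review_spawns reports/diff-stage-2.patch
```

Each printed command names a single dimension, so reviewers do not duplicate each other's findings.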
@@ -1,90 +1,72 @@
 # Planning Phase
 
-
+<planning-workflow>
 
-The natural sequence: **context →
+The natural sequence: **context → requirements → design → roadmap refinement → detailed planning.** Context documents come first because they feed everything downstream — requirements analysts, designers, planners, and implementers all benefit from not having to re-explore the codebase. After the requirements and design are aligned, revisit the roadmap — that's when you actually understand scope well enough to flesh out phases honestly.
 
-
+</planning-workflow>
 
-
+<exploration>
 
-
+Use explore agents to build understanding before making decisions. Each agent saves a focused context document to `$SISYPHUS_SESSION_DIR/context/`.
 
-- **Each agent produces a focused artifact** — not one sprawling document. Focused documents can be selectively passed to downstream agents.
-- **Conventions and patterns are high-value** to capture. Implementation agents that receive convention context write consistent code.
-- **Exploration serves different purposes at different stages.** Early exploration is architectural
-- **Delegate understanding of unfamiliar territory.** If the task touches
+- **Each agent produces a focused artifact** — not one sprawling document. Focused documents can be selectively passed to downstream agents.
+- **Conventions and patterns are high-value** to capture. Implementation agents that receive convention context write consistent code.
+- **Exploration serves different purposes at different stages.** Early exploration is architectural. Later exploration before a specific stage is tactical — files, patterns, utilities to reuse.
+- **Delegate understanding of unfamiliar territory.** If the task touches an unfamiliar library or subsystem, spawn an agent to investigate and report.
 
-
+</exploration>
 
-
+<requirements-alignment>
 
-
-- Draft the spec based on exploration findings
-- Have agents review for feasibility and code smells (can this actually work given the codebase?)
-- Seek user alignment on the high-level approach and any decisions that set direction
-- **Apply corrections back to the spec itself** — the spec is the single source of truth. Don't create a separate corrections file and pass both downstream; update the spec and delete the corrections. Plan agents should read one authoritative document, not reconcile two contradictory ones.
+Before investing in detailed requirements, make sure the goal is well-defined. If you're making assumptions about scope, requirements, or constraints — surface them to the user.
 
-
+For significant features, requirements refinement is iterative:
+- Draft requirements based on exploration findings
+- Have agents review for feasibility (can this actually work given the codebase?)
+- Seek user alignment on the high-level approach
+- **Fold new knowledge into authoritative documents.** When reviews, exploration, or user feedback change the understanding, update the requirements and design documents directly — they are the single source of truth. Don't create correction files, addendum files, or decision logs alongside them. Remove superseded material rather than annotating it. Plan agents should read clean, current documents — not reconcile contradictions or skip over resolved questions.
 
-
+Not every stage needs standalone requirements — a well-defined stage might just be a detailed section in the implementation plan.
 
-
+</requirements-alignment>
 
-
+<plan-delegation>
 
-
+Once you have context docs and aligned requirements/design, revisit the roadmap — this is the first point where you understand real scope. Roadmap refinement means updating the four canonical sections: current stage, exit criteria, active context references, and next steps. Decisions from exploration, requirements, and design fold into context documents — not the roadmap.
 
-**
+Spawn **one plan lead** per feature. Point it at inputs (requirements, design, context docs) — not a pre-made structure. The plan lead handles its own decomposition: it assesses scope, delegates sub-plans if needed, runs adversarial reviews, and delivers a synthesized master plan. **Delegate outcomes, not implementations** — tell the plan lead what needs planning and why, not how to structure the plan.
 
-**
+**Don't split planning yourself.** If the orchestrator pre-splits into "backend plan agent" and "frontend plan agent," the plan lead's synthesis step — resolving cross-domain conflicts, finding gaps, stress-testing edge cases — never happens.
 
-
+**When to spawn multiple plan leads:** Only for genuinely independent features with no shared files or integration points.
 
-
+</plan-delegation>
 
-
+<progressive-development>
 
-
-- **Large task** (3+ stages, multiple domains or repos): Full phased development. The roadmap tracks development phases, and each phase produces artifacts in `context/`.
+Not all tasks need the same process depth.
 
-
+- **Small task** (1-3 files, single domain): Skip phases — roadmap is a short checklist (diagnose, fix, validate). Single plan agent, single implement agent.
+- **Large task** (3+ stages, multiple domains): Full phased development. The roadmap tracks phases, each producing artifacts in `context/`.
 
-
+Signs you need phased development: multiple unfamiliar subsystems, the task spans different concerns (backend, frontend, IPC), or the requirements have more than 3 distinct work areas.
 
-
-- `context/plan-implementation.md` — overall stage outline with dependencies
-- `context/plan-stage-1-types.md` — detailed plan for stage 1
-- `context/plan-stage-2-service.md` — detailed plan for stage 2 (written when stage 1 is underway)
+Implementation stages are context artifacts — saved to `context/plan-stage-N-*.md`. Detail-plan one stage at a time; what you learn implementing stage N informs stage N+1.
 
-
+</progressive-development>
 
-
-
-Detailed plans for stages 4-7 written before stage 1 is implemented are fiction. Defer detail until you're about to execute.
-
-## E2E Verification Recipe
+<verification-planning>
 
 Before implementation begins, determine how to concretely verify the change works end-to-end. This is the single most common failure mode: agents report success but nothing actually works.
 
-
-
-- **Browser automation**: `capture` CLI for UI changes — click through affected flows, screenshot results
-- **CLI verification**: exercise changed behavior interactively in tmux
-- **API testing**: dev server + curl/httpie for endpoint changes
-- **Integration tests**: existing e2e or integration test suite
-- **Smoke script**: create one if nothing else exists
+If you cannot determine a concrete verification method, **ask the user**. Do not proceed to implementation without a verification plan.
 
-
+Write the recipe to `context/e2e-recipe.md` with setup steps, exact commands or interactions to verify, and what success looks like. Make it executable, not aspirational. Implementation agents and validation agents both reference this file.
 
-
-- Setup steps (start dev server, build, seed data, etc.)
-- Exact commands or interactions to verify
-- What success looks like (expected output, visual state, response codes)
+</verification-planning>
 
-
-
-## Transitioning to Implementation
+<transition>
 
 When you have enough understanding, a reviewed plan, and a verification recipe — transition explicitly:
 
@@ -92,4 +74,12 @@ When you have enough understanding, a reviewed plan, and a verification recipe
 sisyphus yield --mode implementation --prompt "Begin implementation — see roadmap.md and context/plan-implementation.md"
 ```
 
-The `--mode implementation` flag loads implementation-phase guidance for the next cycle.
+The `--mode implementation` flag loads implementation-phase guidance for the next cycle.
+
+After implementation is complete, transition to validation mode to prove the feature works:
+
+```bash
+sisyphus yield --mode validation --prompt "Implementation complete — validate against context/e2e-recipe.md"
+```
+
+</transition>
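The `<verification-planning>` section in the planning hunks above asks for `context/e2e-recipe.md` to contain setup steps, exact verification commands, and a description of success. A possible scaffold is sketched below. The `write_recipe_skeleton` helper, the section headings, and the TODO placeholders are assumptions, not a format the package prescribes.

```bash
#!/bin/bash
# Hypothetical scaffold: writes a recipe skeleton with the three parts the
# template names. Headings and placeholder text are illustrative only.
write_recipe_skeleton() {
  local dir=$1
  mkdir -p "$dir"
  cat > "$dir/e2e-recipe.md" <<'RECIPE'
# E2E Recipe

## Setup
- TODO: build, start servers, seed data

## Verify
- TODO: exact commands or interactions to run

## Success looks like
- TODO: expected output, visual state, or response codes
RECIPE
}

# Demo against a throwaway directory; in a session the target would be
# "$SISYPHUS_SESSION_DIR/context".
demo_dir=$(mktemp -d)
write_recipe_skeleton "$demo_dir/context"
cat "$demo_dir/context/e2e-recipe.md"
```

The skeleton is only a starting point. The recipe becomes "executable, not aspirational" once each TODO is replaced with a command that can be pasted into a terminal.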
@@ -0,0 +1,13 @@
+---
+description: Create technical design from requirements through investigation and user iteration
+argument-hint: <topic or description>
+---
+# Technical Design
+
+**Input:** $ARGUMENTS
+
+The user wants a technical design before implementation begins.
+
+Spawn a `sisyphus:design` agent to lead this — it's interactive, investigates the codebase, proposes architecture, and iterates with the user. Output goes to `context/design.md`. It expects `context/requirements.md` to exist; if it doesn't, flag that to the user or run requirements first.
+
+If the current strategy doesn't include a design stage, update it before spawning. Don't do the design work yourself.
@@ -0,0 +1,13 @@
+---
+description: Explore the problem space collaboratively before committing to a solution
+argument-hint: <topic or description>
+---
+# Problem Exploration
+
+**Input:** $ARGUMENTS
+
+The user wants to step back and explore the problem space before committing to a direction. This is a signal to prioritize understanding over progress.
+
+Spawn a `sisyphus:problem` agent to lead this — it's interactive, collaborates with the user, and saves findings to `context/problem.md`. If the current strategy doesn't account for a problem exploration stage, update it before spawning.
+
+Don't do the exploration yourself. The `sisyphus:problem` agent is purpose-built for divergent thinking and user collaboration.
@@ -0,0 +1,13 @@
+---
+description: Define behavioral requirements with EARS acceptance criteria
+argument-hint: <topic or description>
+---
+# Requirements
+
+**Input:** $ARGUMENTS
+
+The user wants formal requirements defined before design or implementation proceeds.
+
+Spawn a `sisyphus:requirements` agent to lead this — it's interactive, drafts EARS-format requirements, and iterates with the user until approved. Output goes to `context/requirements.md`. If the current strategy doesn't include a requirements stage, update it before spawning.
+
+Don't draft requirements yourself. The `sisyphus:requirements` agent handles the full process: codebase investigation, drafting, and user iteration.
@@ -0,0 +1,19 @@
+---
+description: Redirect session strategy — reactivate if completed, then respawn in strategy mode
+argument-hint: <new direction or focus>
+---
+# Strategize
+
+**Input:** $ARGUMENTS
+
+The user wants to redirect this session's strategy.
+
+## Steps
+
+1. If the session is completed (`sisyphus status`), reactivate it with `sisyphus continue`.
+2. Annotate `strategy.md` with the pivot — what changed, new focus, which existing artifacts still apply. Don't rewrite the whole strategy.
+3. Yield to strategy mode:
+```bash
+sisyphus yield --mode strategy --prompt "<concise description of the new direction>"
+```
+This respawns a fresh orchestrator that will re-evaluate the goal, stages, and approach.
@@ -0,0 +1,15 @@
+#!/bin/bash
+if [ -z "$SISYPHUS_SESSION_DIR" ]; then exit 0; fi
+
+CONTEXT_DIR="${SISYPHUS_SESSION_DIR}/context"
+
+# Gate passes if any explore context file exists
+if ls "${CONTEXT_DIR}"/explore-*.md 1>/dev/null 2>&1; then
+exit 0
+fi
+
+cat <<'GATE'
+<explore-gate>
+No exploration context exists yet. Before planning or delegating work, spawn explore agents to build codebase understanding.
+</explore-gate>
+GATE
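The gate hook added above stays silent once any `explore-*.md` file exists under the session's `context/` directory and prints a reminder otherwise. To see both outcomes without a live session, the same check can be wrapped in a function and pointed at a throwaway directory. `explore_gate` is a hypothetical name, and the reminder text is shortened here; the real hook reads `SISYPHUS_SESSION_DIR` from the environment.

```bash
#!/bin/bash
# Sketch of the gate logic as a function taking the session directory as $1.
# Hypothetical wrapper for experimentation; not the shipped hook.
explore_gate() {
  local context_dir="$1/context"
  # Any explore-*.md file satisfies the gate
  if ls "$context_dir"/explore-*.md >/dev/null 2>&1; then
    return 0
  fi
  printf '%s\n' '<explore-gate>' \
    'No exploration context exists yet.' \
    '</explore-gate>'
}

session=$(mktemp -d)
mkdir -p "$session/context"
explore_gate "$session"                              # prints the reminder block
touch "$session/context/explore-architecture.md"
explore_gate "$session"                              # prints nothing
```

Like the hook, the function succeeds in both cases. The gate communicates through its output rather than its exit status.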
@@ -39,7 +39,7 @@ Usually serial — diagnosis must complete before fix, fix before validation. Ex
 ## Feature Build (Small — 1-3 files)
 
 ### When to use
-Clear requirements, small scope, no
+Clear requirements, small scope, no formal requirements document needed.
 
 ### Plan structure
 ```
@@ -70,10 +70,12 @@ Feature with moderate complexity. Requirements may need clarification. Multiple
 ```
 ## Feature: [description]
 
-###
-- [ ]
-- [ ]
-- [ ]
+### Requirements & Design
+- [ ] Problem exploration — understand goals, constraints, assumptions
+- [ ] Requirements — define acceptance criteria
+- [ ] Design — architecture, component boundaries, data models
+- [ ] Create implementation plan from requirements + design
+- [ ] Review plan against requirements + design
 
 ### Implementation
 - [ ] Phase 1 — [foundation/types/interfaces]
@@ -87,18 +89,20 @@ Feature with moderate complexity. Requirements may need clarification. Multiple
 Note: critique and validation are embedded between implementation phases, not deferred to the end. Phase 1 (types) is low-risk and doesn't need its own review, but critique catches issues before Phase 3 builds on them. Validation happens after integration, when all the pieces come together.
 
 ### Cycle plan
-- **Cycle 1**: Spawn `sisyphus:
-- **Cycle 2**: Spawn `sisyphus:
-- **Cycle 3**: Spawn `sisyphus:
-- **Cycle 4**: Spawn `sisyphus:
-- **Cycle 5**: Spawn `sisyphus:
-- **Cycle 6**: Spawn `sisyphus:
-- **Cycle 7**:
-- **Cycle 8**: Spawn `sisyphus:
-- **Cycle 9**: Address
+- **Cycle 1**: Spawn `sisyphus:problem` for problem exploration. Yield. (Human iterates between cycles.)
+- **Cycle 2**: Spawn `sisyphus:requirements` for requirements analysis. Yield. (Human reviews/iterates.)
+- **Cycle 3**: Spawn `sisyphus:design` for technical design. Yield. (Human reviews/iterates.)
+- **Cycle 4**: Spawn `sisyphus:plan` for plan. Yield.
+- **Cycle 5**: Spawn `sisyphus:review-plan` for review. If fail, respawn plan with issues. Yield.
+- **Cycle 6**: Spawn `sisyphus:implement` for Phase 1. Yield.
+- **Cycle 7**: Spawn `sisyphus:implement` for Phase 2. Phase 1 is types — low risk, doesn't need its own validation. Yield.
+- **Cycle 8**: Spawn `sisyphus:review` for critique of phases 1-2. This is the checkpoint before integration builds on top. Yield.
+- **Cycle 9**: Address critique findings + spawn `sisyphus:implement` for Phase 3. Yield.
+- **Cycle 10**: `sisyphus yield --mode validation` for e2e smoketest. Validation mode proves the feature works — operator for UI, evidence for every claim.
+- **Cycle 11**: Address validation failures (back to `--mode implementation`) or complete.
 
 ### Failure modes
-- **
+- **Requirements/design needs human input**: Mark session as needing human review. Orchestrator notes open questions.
 - **Plan fails review**: Feed review issues back, respawn planner.
 - **Critique finds issues in foundation**: Fix before starting integration — don't build on shaky ground.
 - **Validation fails**: Feed specifics back to implement agent for the failing area.
@@ -117,9 +121,10 @@ Cross-cutting feature, multiple domains, needs team coordination. Uses **progres
 ```
 ## Feature: [description]
 
-###
-- [ ]
-- [ ]
+### Requirements & Design
+- [ ] Problem exploration
+- [ ] Requirements
+- [ ] Design
 
 ### Stage Outline (high-level only — no file-level detail yet)
 1. [domain A foundation] — no deps — ~N cycles
@@ -140,15 +145,17 @@ See context/plan-stage-N-{name}.md for detail plan.
 Note: verification checkpoints are embedded in the stage outline, not deferred to a final phase. The level of rigor varies — foundation stages get a light critique, core logic gets critique + validation, integration gets full e2e validation. This is judgment, not formula.
 
 ### Cycle plan
-- **Cycle 1**: Spawn `sisyphus:
-- **Cycle 2**: Spawn `sisyphus:
-- **Cycle 3**:
-- **Cycle 4**: Spawn `sisyphus:
-- **Cycle 5**:
-- **Cycle 6**:
-- **Cycle 7**: Spawn `sisyphus:implement` for stage
-- **Cycle 8**: Spawn `sisyphus:
-- **Cycle 9
+- **Cycle 1**: Spawn `sisyphus:problem` for problem exploration. Yield.
+- **Cycle 2**: Spawn `sisyphus:requirements` for requirements. Yield.
+- **Cycle 3**: Spawn `sisyphus:design` for design. Yield.
+- **Cycle 4**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Include verification checkpoints between stages based on risk." Spawn `sisyphus:test-spec` for test properties (parallel). Yield.
+- **Cycle 5**: Review outline. Spawn `sisyphus:plan` to **detail-plan stage 1 only** (provide outline as context). Output to `context/plan-stage-1-{name}.md`. Yield.
+- **Cycle 6**: Spawn `sisyphus:implement` for stage 1. If stage 2 is independent, spawn `sisyphus:plan` to detail-plan stage 2 in parallel. Yield.
+- **Cycle 7**: Spawn `sisyphus:implement` for stage 2 (if detail-planned). Spawn `sisyphus:review` to critique stages 1-2 in parallel — foundation review before core logic builds on it. Detail-plan stage 3 in parallel. Yield.
+- **Cycle 8**: Address critique findings. Spawn `sisyphus:implement` for stage 3. Yield.
+- **Cycle 9**: Spawn `sisyphus:implement` for stage 4. Spawn `sisyphus:review` to critique stage 3 in parallel. Yield.
+- **Cycle 10**: Spawn `sisyphus:validate` for stages 3-4 — core logic checkpoint before integration. Address stage 3 critique. Yield.
+- **Cycle 11+**: Implement integration stage. Final review. Then `sisyphus yield --mode validation` for comprehensive e2e proof.
 
 ### Failure modes
 - **Detail-plan agent can't produce quality output**: The stage is still too large. Break it into sub-stages in the outline and detail-plan each sub-stage individually.
|