sisyphi 0.1.21 → 0.1.23

Files changed (60)
  1. package/dist/chunk-KQBSC5KY.js +31 -0
  2. package/dist/chunk-KQBSC5KY.js.map +1 -0
  3. package/dist/{chunk-LTAW6OWS.js → chunk-YGBGKMTF.js} +31 -6
  4. package/dist/chunk-YGBGKMTF.js.map +1 -0
  5. package/dist/chunk-ZE2SKB4B.js +35 -0
  6. package/dist/chunk-ZE2SKB4B.js.map +1 -0
  7. package/dist/cli.js +638 -51
  8. package/dist/cli.js.map +1 -1
  9. package/dist/daemon.js +915 -289
  10. package/dist/daemon.js.map +1 -1
  11. package/dist/paths-FYYSBD27.js +58 -0
  12. package/dist/paths-FYYSBD27.js.map +1 -0
  13. package/dist/templates/CLAUDE.md +21 -20
  14. package/dist/templates/agent-plugin/agents/CLAUDE.md +2 -0
  15. package/dist/templates/agent-plugin/agents/debug.md +1 -0
  16. package/dist/templates/agent-plugin/agents/operator.md +1 -2
  17. package/dist/templates/agent-plugin/agents/plan.md +86 -55
  18. package/dist/templates/agent-plugin/agents/review-plan.md +1 -0
  19. package/dist/templates/agent-plugin/agents/spec-draft.md +1 -0
  20. package/dist/templates/agent-plugin/hooks/hooks.json +19 -1
  21. package/dist/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  22. package/dist/templates/agent-plugin/hooks/require-submit.sh +24 -0
  23. package/dist/templates/agent-suffix.md +18 -0
  24. package/dist/templates/dashboard-claude.md +38 -0
  25. package/dist/templates/orchestrator-base.md +270 -0
  26. package/dist/templates/orchestrator-impl.md +116 -0
  27. package/dist/templates/orchestrator-planning.md +131 -0
  28. package/dist/templates/orchestrator-plugin/hooks/hooks.json +1 -15
  29. package/dist/templates/orchestrator-plugin/skills/git-management/SKILL.md +1 -1
  30. package/dist/templates/orchestrator-plugin/skills/orchestration/SKILL.md +4 -16
  31. package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +22 -23
  32. package/dist/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +11 -11
  33. package/dist/tui.js +3236 -0
  34. package/dist/tui.js.map +1 -0
  35. package/package.json +5 -1
  36. package/templates/CLAUDE.md +21 -20
  37. package/templates/agent-plugin/agents/CLAUDE.md +2 -0
  38. package/templates/agent-plugin/agents/debug.md +1 -0
  39. package/templates/agent-plugin/agents/operator.md +1 -2
  40. package/templates/agent-plugin/agents/plan.md +86 -55
  41. package/templates/agent-plugin/agents/review-plan.md +1 -0
  42. package/templates/agent-plugin/agents/spec-draft.md +1 -0
  43. package/templates/agent-plugin/hooks/hooks.json +19 -1
  44. package/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  45. package/templates/agent-plugin/hooks/require-submit.sh +24 -0
  46. package/templates/agent-suffix.md +18 -0
  47. package/templates/dashboard-claude.md +38 -0
  48. package/templates/orchestrator-base.md +270 -0
  49. package/templates/orchestrator-impl.md +116 -0
  50. package/templates/orchestrator-planning.md +131 -0
  51. package/templates/orchestrator-plugin/hooks/hooks.json +1 -15
  52. package/templates/orchestrator-plugin/skills/git-management/SKILL.md +1 -1
  53. package/templates/orchestrator-plugin/skills/orchestration/SKILL.md +4 -16
  54. package/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +22 -23
  55. package/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +11 -11
  56. package/dist/chunk-LTAW6OWS.js.map +0 -1
  57. package/dist/templates/orchestrator-plugin/scripts/block-task.sh +0 -11
  58. package/dist/templates/orchestrator.md +0 -173
  59. package/templates/orchestrator-plugin/scripts/block-task.sh +0 -11
  60. package/templates/orchestrator.md +0 -173
@@ -0,0 +1,116 @@
+ # Implementation Phase
+
+ ## Stage-by-Stage Execution
+
+ ### Maximize parallelism
+
+ Before starting each cycle, ask: **which stages or tasks are independent right now?** If two stages touch different subsystems (e.g., backend vs frontend, separate services, unrelated modules), spawn them concurrently — don't serialize work that doesn't need to be serialized. Use `--worktree` when parallel agents might touch overlapping files.
+
+ Sequential execution is the default trap. Fight it actively. At every yield, look for work that can run alongside the next stage — review agents while the next implementation starts, frontend and backend stages in parallel, independent fix agents concurrently. A cycle with one agent running is a wasted cycle if other work was ready.
+
+ If the plan has stages that share no file dependencies, **run them in parallel from the start.** Each stage is multiple cycles:
+
+ 1. **Detail-plan it** — expand the high-level outline into specific file changes, informed by previous stages. If complex enough, spawn a spec agent first.
+ 2. **Implement it** — spawn agents with self-contained instructions (see Agent Instructions below). May itself take multiple cycles if the stage has enough work.
+ 3. **Critique and refine it** — spawn parallel review agents, fix what they find, repeat until clean (see below).
+ 4. **Validate it end-to-end** — spawn a validation agent with the e2e recipe. Don't advance until it passes.
+ 5. **Update roadmap.md** — mark the stage done in the implementation phase, refine future stage outlines if what you learned changes the approach.
+
+ Don't detail-plan all stages up front. What you learn implementing earlier stages should inform later ones.
+
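Editorial sketch (not part of the package): the template says to use `--worktree` when parallel agents might touch overlapping files. One way to reason about that is to compare the file lists of the tasks planned for a cycle. The `PlannedTask` shape, the `findOverlaps` helper, and the sample file paths below are all hypothetical and only illustrate the check; nothing here reflects how sisyphus itself decides.

```typescript
// Hypothetical shape: a task planned for this cycle and the files it expects to touch.
interface PlannedTask {
  name: string;
  files: string[];
}

// Returns every pair of task names whose file lists share at least one path.
function findOverlaps(tasks: PlannedTask[]): Array<[string, string]> {
  const overlaps: Array<[string, string]> = [];
  tasks.forEach((a, i) => {
    const seen = new Set<string>(a.files);
    for (const b of tasks.slice(i + 1)) {
      if (b.files.some((f) => seen.has(f))) {
        overlaps.push([a.name, b.name]);
      }
    }
  });
  return overlaps;
}

// Illustrative data only: two tasks both edit src/server.ts.
const tasks: PlannedTask[] = [
  { name: "impl-auth", files: ["src/auth/session.ts", "src/server.ts"] },
  { name: "impl-routes", files: ["src/routes/login.ts", "src/server.ts"] },
  { name: "impl-docs", files: ["docs/auth.md"] },
];

const overlapping = findOverlaps(tasks);
```

Any pair reported here is a candidate for worktree isolation or for serializing the two tasks; tasks that appear in no pair can simply run side by side.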
+ ## Agent Instructions
+
+ Implementation agent prompts must be **fully self-contained** — include everything the agent needs so it doesn't have to re-explore or guess. Each spawn instruction should include:
+
+ - The overall goal of the session (one sentence)
+ - This agent's specific task (files to create/modify, what the change does, done condition)
+ - References to relevant context files (`conventions.md`, `explore-architecture.md`, etc.)
+ - The e2e recipe reference (`context/e2e-recipe.md`) so the agent can self-verify
+
+ **Tell every implementation agent to report clearly when done:** what they built, what files they changed, and any issues or uncertainties they encountered. Testing and validation happens at the orchestrator level (see Critique and Refinement below), not inside each agent.
+
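Editorial sketch (not part of the package): the four bullets and the reporting requirement above can be read as a checklist for every spawn instruction. The `SpawnInstruction` type, the `buildPrompt` helper, and the sample values are hypothetical names used only to show the idea of assembling a self-contained prompt; the real prompt text is whatever the orchestrator writes.

```typescript
// Hypothetical structure mirroring the checklist in the template text.
interface SpawnInstruction {
  sessionGoal: string;    // overall goal of the session, one sentence
  task: string;           // files to create or modify and what the change does
  doneCondition: string;  // how the agent knows it is finished
  contextFiles: string[]; // only the context relevant to this task
  e2eRecipe: string;      // path to the recipe used for self-verification
}

// Joins the fields into a single prompt string, one labelled line per field.
function buildPrompt(spec: SpawnInstruction): string {
  const lines: string[] = [
    "Goal: " + spec.sessionGoal,
    "Task: " + spec.task,
    "Done when: " + spec.doneCondition,
    "Context: " + spec.contextFiles.join(", "),
    "Self-verify with: " + spec.e2eRecipe,
    "When done, report what you built, which files changed, and any open issues.",
  ];
  return lines.join("\n");
}

// Illustrative values only.
const prompt = buildPrompt({
  sessionGoal: "Add session-based login to the API.",
  task: "Create the session middleware and wire it into the server entry point.",
  doneCondition: "Type-check passes and the login flow in the recipe succeeds.",
  contextFiles: ["context/conventions.md", "context/explore-architecture.md"],
  e2eRecipe: "context/e2e-recipe.md",
});
```

A typed structure like this makes a missing field a compile error, which is one cheap way to keep prompts from silently omitting the done condition or the recipe reference.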
+ ### Delegate outcomes, not implementations
+
+ Your job is to define **what needs to happen and why**, not to write the code yourself. If you find yourself writing exact code snippets, function signatures, or line-by-line fix instructions in agent prompts — you're doing the agent's job.
+
+ **Bad**: "Change line 45 from `x === y` to `crypto.timingSafeEqual(Buffer.from(x), Buffer.from(y))`, handle length mismatch..."
+ **Good**: "Fix the timing-safe comparison issue in authMiddleware.ts — see report at reports/agent-002-final.md, Major #3"
+
+ For fix agents specifically: **pass the review report path and tell the agent to action the items.** The agent reads the report, understands the codebase, and figures out the right fix. This is why you have agents — they're capable of solving problems, not just transcribing solutions. Writing the code for them defeats the purpose of delegation and wastes your context on implementation details you shouldn't be tracking.
+
+ The exception is architectural constraints the agent wouldn't know: "use the existing `personRepository.findOrCreateOwner` method for Neo4j sync" or "the Supabase client is at `supabaseService.getClient()`". Give agents the **what** and the **landmarks**, not the **how**.
+
+ ### Context propagation
+
+ The planning phase produced context files — conventions, e2e recipe, architectural findings. Be selective — give each agent the context relevant to their task, not everything. An agent that gets `conventions.md` writes consistent code. An agent that gets `explore-architecture.md` understands where their change fits.
+
+ ## Code Smell Escalation
+
+ Instruct agents to flag problems early rather than working around them. When an agent encounters unexpected complexity, unclear architecture, or code that fights back — the right move is to stop and report clearly. A clear description of the problem is more valuable than a brittle implementation built on a bad foundation.
+
+ When you see these reports, investigate before pushing forward. If the smell suggests a design issue, involve the user.
+
+ ## Critique and Refinement
+
+ After implementation agents report, **do not advance to the next stage.** The code needs to be reviewed and refined first. This is not optional.
+
+ ### Critique cycle
+
+ Spawn three review agents in parallel, each attacking a different dimension:
+
+ 1. **Code reuse reviewer** — searches the codebase for existing utilities, helpers, and patterns that the new code duplicates. Flags any new function that reimplements existing functionality, any inline logic that could use an existing utility.
+
+ 2. **Code quality reviewer** — looks for hacky patterns: redundant state, parameter sprawl, copy-paste with slight variation, leaky abstractions, stringly-typed code where constants or enums exist, unnecessary nesting or wrapping.
+
+ 3. **Efficiency reviewer** — looks for unnecessary work (redundant computations, duplicate API calls, N+1 patterns), missed concurrency (independent operations run sequentially), hot-path bloat, unbounded data structures, overly broad operations.
+
+ Give each reviewer the full diff and relevant context files. They report problems — they don't fix them.
+
+ ### Refine cycle
+
+ Aggregate the reviewer findings. Spawn fix agents and **point them at the review report** — don't rewrite the findings as line-by-line instructions. The fix agent reads the report, reads the code, and figures out the right solution. You triage (skip false positives, note any architectural constraints) — they implement.
+
+ ```bash
+ sisyphus spawn --name "fix-review-issues" --agent-type sisyphus:implement \
+ "Fix the issues in reports/agent-003-final.md. Skip item #5 (false positive). Run type-check after."
+ ```
+
+ The fix agents should use `/simplify` to systematically review their own changes before reporting.
+
+ ### Repeat until clean
+
+ Spawn reviewers again on the refined code. If they come back with new issues, fix those too. Genuinely nitpicky findings — stylistic preferences, irrelevant edge cases — can be skipped. But if a finding is actually correct, it gets done. **"I don't want to" is not a reason to skip a valid finding.** The distinction is between false positives and laziness. In practice this is usually 1-2 rounds. If it's taking more, the implementation was shaky and you should consider whether the approach needs rethinking rather than patching.
+
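Editorial sketch (not part of the package): the critique, refine, and repeat steps above form a bounded loop. The sketch below models it with synchronous callbacks, which is a simplification, since real reviewers and fixers are spawned agents that run across cycles. The `Finding` type, the callback signatures, and the `maxRounds` default of 3 are assumptions chosen for illustration; the template only says the loop usually takes 1-2 rounds and that more suggests rethinking the approach.

```typescript
// Hypothetical finding: `valid` is false for false positives and pure nitpicks.
interface Finding {
  id: string;
  valid: boolean;
}

// Hypothetical callbacks standing in for spawned review and fix agents.
type Review = (round: number) => Finding[];
type Fix = (findings: Finding[]) => void;

interface LoopResult {
  rounds: number;  // number of review rounds that were run
  clean: boolean;  // true if a review round produced no valid findings
}

// Reviews, fixes the valid findings, and repeats until a round comes back clean
// or the round cap is reached (a signal to reconsider the approach).
function critiqueAndRefine(review: Review, fix: Fix, maxRounds: number = 3): LoopResult {
  for (let round = 1; round <= maxRounds; round++) {
    const actionable = review(round).filter((f) => f.valid);
    if (actionable.length === 0) {
      return { rounds: round, clean: true };
    }
    fix(actionable);
  }
  return { rounds: maxRounds, clean: false };
}

// Simulated run: round 1 has two valid findings and one nitpick, round 2 only a nitpick.
const fixedIds: string[] = [];
const loopResult = critiqueAndRefine(
  (round) =>
    round === 1
      ? [
          { id: "dup-helper", valid: true },
          { id: "n-plus-one", valid: true },
          { id: "naming-style", valid: false },
        ]
      : [{ id: "naming-style", valid: false }],
  (findings) => findings.forEach((f) => fixedIds.push(f.id)),
);
```

In the simulated run the loop ends after the second review round, with only the two valid findings passed to the fixer. A result with `clean: false` corresponds to the case the template warns about, where patching should give way to rethinking.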
+ ## E2E Validation
+
+ After the critique/refine loop produces clean code, **validate end-to-end before advancing.** This is also not optional. The implementing agent is the worst validator of its own work — same blind spots, same assumptions.
+
+ Spawn a validation agent with the e2e recipe from `context/e2e-recipe.md`. The agent should:
+ - Follow the setup steps exactly (build, start servers, seed data)
+ - Run every verification step in the recipe
+ - Report exactly what passed and what failed — not "it looks good"
+
+ If the recipe involves UI, the validation agent should use `capture` to screenshot and interact with the actual running app. If it involves an API, it should curl the actual endpoints. If it involves CLI behavior, it should exercise it in the terminal.
+
+ If the project lacks validation tooling, **create it**. A smoke-test script, a seed command, a health-check endpoint — these pay for themselves immediately and every future validation agent reuses them.
+
+ **Only advance to the next stage when validation passes.** If it fails, log the failures, spawn fix agents, and re-validate.
+
+ ## Worktree Preference
+
+ When spawning two or more implementation agents in the same cycle, prefer `--worktree` for each. Worktree isolation eliminates file conflict risk — agents can't clobber each other's changes, each gets a clean branch, and they can commit incrementally. The daemon merges branches back when agents complete and surfaces conflicts in your next cycle's state.
+
+ ```bash
+ sisyphus spawn --name "impl-auth" --agent-type sisyphus:implement --worktree "Add session middleware — see context/conventions.md"
+ sisyphus spawn --name "impl-routes" --agent-type sisyphus:implement --worktree "Add login routes — see context/conventions.md and context/explore-architecture.md"
+ ```
+
+ ## Returning to Planning
+
+ If you discover mid-implementation that the approach is wrong — the architecture is different than expected, a dependency changes the approach, or agents keep hitting the same wall — don't keep pushing. Return to planning:
+
+ ```bash
+ sisyphus yield --mode planning --prompt "Re-evaluate: discovered X changes the approach — write cycle log"
+ ```
+
+ Document what you found in the cycle log before yielding so the planning cycle starts informed. Update roadmap.md to reflect that you're back in an earlier phase.
@@ -0,0 +1,131 @@
+ # Planning Phase
+
+ ## Exploration
+
+ Use explore agents to build understanding before making decisions. Each agent should save a focused context document to `.sisyphus/sessions/$SISYPHUS_SESSION_ID/context/` — these artifacts get passed to downstream agents so they don't have to re-explore the codebase themselves.
+
+ Adapt the number and focus of explore agents to the task. Key principles:
+
+ - **Each agent produces a focused artifact** — not one sprawling document. Focused documents can be selectively passed to downstream agents. An agent implementing auth gets `conventions.md` + `architecture.md`, not a 500-line dump.
+ - **Conventions and patterns are high-value** to capture. Implementation agents that receive convention context write consistent code. Ones that don't produce code you'll have to fix.
+ - **Exploration serves different purposes at different stages.** Early exploration is architectural — understanding the system and what needs to change. Later exploration before a specific stage is tactical — identifying files, patterns to follow, utilities to reuse. Both are valuable.
+ - **Delegate understanding of unfamiliar territory.** If the task touches a library or subsystem you don't know, spawn an agent to investigate and report.
+
+ ## Spec Alignment
+
+ Before investing in a detailed spec, make sure the goal itself is well-defined. If you're making assumptions about scope, requirements, or constraints — surface them to the user. A spec built on wrong assumptions wastes every cycle downstream.
+
+ For significant features, spec refinement is iterative:
+ - Draft the spec based on exploration findings
+ - Have agents review for feasibility and code smells (can this actually work given the codebase?)
+ - Seek user alignment on the high-level approach and any decisions that set direction
+ - **Apply corrections back to the spec itself** — the spec is the single source of truth. Don't create a separate corrections file and pass both downstream; update the spec and delete the corrections. Plan agents should read one authoritative document, not reconcile two contradictory ones.
+
+ Not every stage needs a standalone spec document — a well-defined stage might just be a detailed section in the implementation plan. Use judgment about how much formality each stage warrants.
+
+ ## Delegating to Plan Agents
+
+ Point plan agents at **inputs** (spec, context docs, corrections) — not a pre-made structure. Don't pre-decide staging, ordering, or design decisions. The plan agent has `effort: max` reasoning and will produce a better plan when given room to think through the structure itself.
+
+ For cross-domain tasks, consider spawning parallel plan agents scoped to independent domains (e.g., one for backend, one for frontend, one for IPC). Each produces a focused sub-plan. This is faster and produces better domain-specific plans than one agent trying to plan everything.
+
+ ## Progressive Development
+
+ Not all tasks need the same process depth. A 2-file bug fix can go straight to implementation. A cross-repo feature with multiple domains needs full phased development.
+
+ ### Decision heuristic
+
+ - **Small task** (1-3 files, single domain): Skip phases — roadmap is just a short task checklist (diagnose, fix, validate). Single plan agent, single implement agent.
+ - **Large task** (3+ stages, multiple domains or repos): Full phased development. The roadmap tracks development phases, and each phase produces artifacts in `context/`.
+
+ Signs you need phased development: the task touches multiple unfamiliar subsystems, the task description spans different concerns (backend, frontend, IPC, etc.), or a spec exists with more than 3 distinct work areas.
+
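Editorial sketch (not part of the package): the decision heuristic above can be written down as a small classifier. The `TaskShape` fields and the `chooseProcess` function are hypothetical, and two choices here are assumptions rather than anything the template states: any single "large" sign is treated as sufficient, and tasks that match neither description fall through to a `"judgment"` result instead of being forced into a bucket.

```typescript
// Hypothetical summary of a task, estimated before planning starts.
interface TaskShape {
  files: number;          // files expected to change
  domains: number;        // distinct domains or repos involved
  stages: number;         // implementation stages expected
  specWorkAreas: number;  // distinct work areas named in the spec, 0 if no spec
}

type Process = "checklist" | "phased" | "judgment";

// Maps the template's thresholds onto a process choice.
function chooseProcess(t: TaskShape): Process {
  // Large-task signs: 3+ stages, multiple domains or repos, or a spec with more than 3 work areas.
  if (t.stages >= 3 || t.domains > 1 || t.specWorkAreas > 3) {
    return "phased";
  }
  // Small task: 1-3 files in a single domain.
  if (t.files <= 3 && t.domains === 1) {
    return "checklist";
  }
  // The template does not cover the middle ground, so leave it to the orchestrator.
  return "judgment";
}

const bugFix = chooseProcess({ files: 2, domains: 1, stages: 1, specWorkAreas: 0 });
const crossRepoFeature = chooseProcess({ files: 20, domains: 3, stages: 5, specWorkAreas: 4 });
const midSizeChange = chooseProcess({ files: 6, domains: 1, stages: 2, specWorkAreas: 2 });
```

The third case shows why the fall-through exists: a six-file, single-domain change is neither a quick checklist item nor an obvious candidate for full phased development.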
+ ### How phased development works
+
+ The roadmap tracks **development phases**, not implementation stages. A large feature's roadmap looks like:
+
+ ```markdown
+ ## Goal: Implement Worker System
+
+ ### Phases
+ 1. Research — explore architecture, conventions, constraints [current]
+ 2. Spec — validate/refine spec, align with user [outlined]
+ 3. Plan — break into implementation stages [outlined]
+ 4. Implement — execute stage-by-stage with review cycles [outlined]
+ 5. Validate — e2e verification [outlined]
+ ```
+
+ Each phase expands when you enter it. Implementation stages only appear once Phase 3 (Plan) produces them — and they live in `context/`, not the roadmap itself.
+
+ ### Phase expansion
+
+ When entering a new phase, expand it in the roadmap with concrete items:
+
+ ```markdown
+ ### Phase 1: Research (current)
+ - [x] Core architecture exploration (scheduler, presets, routing)
+ - [x] Agent IPC + runtime patterns
+ - [ ] Gateway patterns (RTK Query, components)
+
+ ### Phase 3: Plan (current)
+ - Implementation plan: see context/plan-implementation.md
+ - [x] High-level stage outline
+ - [ ] Detail-plan stage 1 (types + migration)
+ - [ ] Review plan against spec
+ ```
+
+ Future phases stay as one-liners until reached. What you learn in earlier phases informs how later phases get expanded.
+
+ ### Implementation stages are context artifacts
+
+ When Phase 3 (Plan) runs, it produces implementation stage breakdowns saved to `context/`:
+ - `context/plan-implementation.md` — overall stage outline with dependencies
+ - `context/plan-stage-1-types.md` — detailed plan for stage 1
+ - `context/plan-stage-2-service.md` — detailed plan for stage 2 (written when stage 1 is underway)
+
+ The roadmap references these but doesn't contain them. During Phase 4 (Implement), the roadmap tracks which stages are done:
+
+ ```markdown
+ ### Phase 4: Implement (current)
+ See context/plan-implementation.md for stage breakdown.
+ - [x] Stage 1: Types + migration — verified
+ - [ ] Stage 2: Worker service — in progress (see context/plan-stage-2-service.md)
+ - [ ] Stage 3: Gateway UI — outlined
+ ```
+
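Editorial sketch (not part of the package): because the roadmap examples above use plain Markdown checkboxes, stage progress can be summarized with a few lines of parsing. The `checklistProgress` helper and the `Progress` type are hypothetical and purely illustrative; nothing in this diff indicates that the sisyphus daemon or orchestrator parses roadmap.md this way. The sample text is a shortened version of the Phase 4 example.

```typescript
interface Progress {
  done: number;         // checked items
  total: number;        // all checkbox items
  next: string | null;  // text of the first unchecked item, if any
}

// Counts "- [x]" and "- [ ]" items and records the first unchecked one.
function checklistProgress(markdown: string): Progress {
  const item = /^- \[( |x)\] (.*)$/;
  let done = 0;
  let total = 0;
  let next: string | null = null;
  for (const line of markdown.split("\n")) {
    const m = item.exec(line.trim());
    if (m === null) {
      continue; // headings and prose lines are ignored
    }
    total++;
    if (m[1] === "x") {
      done++;
    } else if (next === null) {
      next = m[2] ?? null;
    }
  }
  return { done, total, next };
}

const roadmapSection = [
  "### Phase 4: Implement (current)",
  "See context/plan-implementation.md for stage breakdown.",
  "- [x] Stage 1: Types + migration",
  "- [ ] Stage 2: Worker service",
  "- [ ] Stage 3: Gateway UI",
].join("\n");

const progress = checklistProgress(roadmapSection);
```

For the sample section this reports one of three stages done, with "Stage 2: Worker service" as the next item to pick up.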
+ ### Don't front-load phases
+
+ Detail-plan one stage at a time. What you learn implementing stage N informs stage N+1's detail plan. The stage outline evolves — stages get added, removed, reordered, or split as understanding grows. That's the system working correctly.
+
+ Detailed plans for stages 4-7 written before stage 1 is implemented are fiction. Defer detail until you're about to execute.
+
+ ## E2E Verification Recipe
+
+ Before implementation begins, determine how to concretely verify the change works end-to-end. This is the single most common failure mode: agents report success but nothing actually works.
+
+ The tooling explorer should have mapped the available infrastructure. Common patterns:
+
+ - **Browser automation**: `capture` CLI for UI changes — click through affected flows, screenshot results
+ - **CLI verification**: exercise changed behavior interactively in tmux
+ - **API testing**: dev server + curl/httpie for endpoint changes
+ - **Integration tests**: existing e2e or integration test suite
+ - **Smoke script**: create one if nothing else exists
+
+ If you cannot determine a concrete verification method, **ask the user**. Offer 2-3 specific options. Do not proceed to implementation without a verification plan.
+
+ Write the recipe to `context/e2e-recipe.md` with:
+ - Setup steps (start dev server, build, seed data, etc.)
+ - Exact commands or interactions to verify
+ - What success looks like (expected output, visual state, response codes)
+
+ Implementation agents and validation agents both reference this file. Write it to be executable, not aspirational.
+
+ ## Transitioning to Implementation
+
+ When you have enough understanding, a reviewed plan, and a verification recipe — transition explicitly:
+
+ ```bash
+ sisyphus yield --mode implementation --prompt "Begin implementation — see roadmap.md and context/plan-implementation.md"
+ ```
+
+ The `--mode implementation` flag loads implementation-phase guidance for the next cycle. Pass a prompt that orients the next cycle to where things stand.
@@ -1,15 +1 @@
- {
- "hooks": {
- "PreToolUse": [
- {
- "matcher": "Task",
- "hooks": [
- {
- "type": "command",
- "command": "\"${CLAUDE_PLUGIN_ROOT}/scripts/block-task.sh\""
- }
- ]
- }
- ]
- }
- }
+ {"hooks":{}}
@@ -85,7 +85,7 @@ Scan the project root for gitignored files that agents will need:
 
  ## Handling Merge Conflicts
 
- When the daemon merges agent branches back, conflicts appear in the `## Worktrees` section of your state block. For each conflicting agent you'll see:
+ When the daemon merges agent branches back, conflicts appear in the `## Worktrees` section of your prompt. For each conflicting agent you'll see:
  - The branch name (still exists, unmerged)
  - The worktree path (still exists on disk)
  - The conflict details (git merge stderr output)
@@ -10,7 +10,7 @@ How to structure sisyphus sessions for common task types. This skill helps the o
 
  ## Core Principles
 
- 1. **plan.md is the orchestrator's memory.** plan.md and agent reports persist across cycles — they're all you have. Keep plan.md current and specific enough that a fresh orchestrator can pick up where you left off.
+ 1. **roadmap.md is the orchestrator's memory.** roadmap.md and agent reports persist across cycles — they're all you have. Keep roadmap.md current and specific enough that a fresh orchestrator can pick up where you left off.
 
  2. **Agents are disposable.** Each agent gets one focused instruction. If it fails or the scope changes, spawn a new one — don't try to redirect a running agent.
 
@@ -20,21 +20,9 @@ How to structure sisyphus sessions for common task types. This skill helps the o
 
  5. **Reports are handoffs.** Agent reports should contain everything the next cycle's orchestrator needs — what was done, what was found, what's unresolved, where artifacts were saved.
 
- ## Agent Types Quick Reference
-
- | Agent | Model | Use For |
- |-------|-------|---------|
- | `sisyphus:general` | sonnet | Ad-hoc tasks, summarization, simple questions |
- | `sisyphus:debug` | opus | Bug diagnosis and root cause analysis |
- | `sisyphus:spec-draft` | opus | Feature investigation and spec drafting |
- | `sisyphus:plan` | opus | Implementation planning from spec |
- | `sisyphus:review-plan` | opus | Validate plan covers spec completely |
- | `sisyphus:test-spec` | opus | Define behavioral properties to verify |
- | `sisyphus:implement` | sonnet | Execute plan phases, write code |
- | `sisyphus:validate` | opus | Verify implementation matches plan |
- | `sisyphus:review` | opus | Code review with parallel concern subagents |
- | `sisyphus:tactician` | opus | Track plan progress, dispatch next task |
- | `sisyphus:triage` | sonnet | Classify tickets by type/size |
+ ## Agent Types
+
+ Available agent types are listed under **Available Agent Types** in your prompt. Use `--agent-type` with `sisyphus spawn`.
 
  For task breakdown patterns per workflow type, see [task-patterns.md](task-patterns.md).
  For end-to-end workflow examples, see [workflow-examples.md](workflow-examples.md).
@@ -1,6 +1,6 @@
  # Work Breakdown Patterns
 
- Patterns for how the orchestrator should structure plan.md for common workflow types. Each pattern shows the plan structure, agent assignments, cycle sequencing, and failure handling.
+ Patterns for how the orchestrator should structure roadmap.md for common workflow types. Each pattern shows the plan structure, agent assignments, cycle sequencing, and failure handling.
 
  ---
 
@@ -106,45 +106,44 @@ Phases without dependencies can run in parallel. Types/interfaces (Phase 1) must
  ## Feature Build (Large — 10+ files)
 
  ### When to use
- Cross-cutting feature, multiple domains, needs team coordination.
+ Cross-cutting feature, multiple domains, needs team coordination. Uses **progressive planning** — high-level outline first, then detail-plan each stage as it's reached.
 
  ### Plan structure
  ```
  ## Feature: [description]
 
- ### Spec & Planning
+ ### Spec
  - [ ] Draft spec
- - [ ] Create master implementation plan
- - [ ] Review plan against spec
- - [ ] Define behavioral test properties
+ - [ ] Review spec
 
- ### Implementation
- - [ ] Phase 1[domain A foundation]
- - [ ] Phase 2[domain B foundation]
- - [ ] Phase 3[domain A implementation]
- - [ ] Phase 4[domain B implementation]
- - [ ] Phase 5[integration layer]
+ ### Stage Outline (high-level only — no file-level detail yet)
+ 1. [domain A foundation] no deps ~N cycles
+ 2. [domain B foundation] no deps ~N cycles
+ 3. [domain A implementation] depends on 1 ~N cycles
+ 4. [domain B implementation] depends on 2 ~N cycles
+ 5. [integration layer] depends on 3, 4 ~N cycles
+ 6. [integration tests] — depends on all — ~N cycles
 
- ### Validation
- - [ ] Validate full implementation
- - [ ] Review implementation
- - [ ] Adversarial validation against test spec
+ ### Current Stage: [whichever is active]
+ See context/plan-stage-N-{name}.md for detail plan.
+ - [ ] [task-level items from detail plan]
  ```
 
  ### Cycle plan
  - **Cycle 1**: Spawn `sisyphus:spec-draft` for spec. Yield.
- - **Cycle 2**: Spawn `sisyphus:plan` for plan + `sisyphus:test-spec` for test properties (parallel). Yield.
- - **Cycle 3**: Spawn `sisyphus:review-plan` for review. Yield.
- - **Cycle 4**: Spawn `sisyphus:implement` for Phase 1 + Phase 2 (parallel independent domains). Yield.
- - **Cycle 5**: Validate Phase 1 + Phase 2, then spawn Phase 3 + Phase 4 (parallel). Yield.
- - **Cycle 6+**: Integration, validation, review.
+ - **Cycle 2**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Do not detail any stage — no file-level specifics." Spawn `sisyphus:test-spec` for test properties (parallel). Yield.
+ - **Cycle 3**: Review outline. Spawn `sisyphus:plan` to **detail-plan stage 1 only** (provide outline as context). Output to `context/plan-stage-1-{name}.md`. Yield.
+ - **Cycle 4**: Spawn `sisyphus:implement` for stage 1. If stage 2 is independent, spawn `sisyphus:plan` to detail-plan stage 2 in parallel. Yield.
+ - **Cycle 5**: Validate stage 1. Spawn `sisyphus:implement` for stage 2 (if detail-planned). Detail-plan stage 3 in parallel if independent. Yield.
+ - **Cycle 6+**: Continue pattern — implement current stage, validate previous, detail-plan next. Each stage follows implement → critique → refine → validate.
 
  ### Failure modes
+ - **Detail-plan agent can't produce quality output**: The stage is still too large. Break it into sub-stages in the outline and detail-plan each sub-stage individually.
  - **Integration failures**: Often means contracts between domains don't match. Spawn debug agent targeting the integration seam.
- - **Test spec violations**: Feed specific property failures back to implement.
+ - **Stage N implementation invalidates stage N+1 outline**: Update the high-level outline. This is expected — it's why you don't detail-plan everything upfront.
 
  ### Parallelization
- Maximize. Independent domains run in parallel. Foundation phases complete before implementation phases in the same domain. Integration waits for all domain implementations.
+ Maximize within the progressive pattern. Independent stages run in parallel. Detail-planning the next stage runs alongside implementing the current one. Foundation stages complete before dependent stages. Integration waits for all domain implementations.
 
  ---
 
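Editorial sketch (not part of the package): the stage outline in the hunk above lists dependencies per stage, and the Parallelization note says independent stages run in parallel while dependent ones wait. Grouping stages into waves by resolved dependencies is one way to see which stages could share a cycle. The `Stage` type and `parallelBatches` function are hypothetical, and "depends on all" for stage 6 is encoded here as depending on stages 1 through 5, which is an assumption.

```typescript
// Hypothetical encoding of one line of the stage outline.
interface Stage {
  id: number;
  name: string;
  dependsOn: number[];
}

// Groups stage ids into waves: every stage in a wave has all of its
// dependencies satisfied by earlier waves, so the wave can run in parallel.
function parallelBatches(stages: Stage[]): number[][] {
  const done = new Set<number>();
  const batches: number[][] = [];
  let remaining = stages.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in stage outline");
    }
    batches.push(ready.map((s) => s.id));
    ready.forEach((s) => done.add(s.id));
    remaining = remaining.filter((s) => !done.has(s.id));
  }
  return batches;
}

// The six stages from the outline, with placeholder names kept as in the template.
const outline: Stage[] = [
  { id: 1, name: "domain A foundation", dependsOn: [] },
  { id: 2, name: "domain B foundation", dependsOn: [] },
  { id: 3, name: "domain A implementation", dependsOn: [1] },
  { id: 4, name: "domain B implementation", dependsOn: [2] },
  { id: 5, name: "integration layer", dependsOn: [3, 4] },
  { id: 6, name: "integration tests", dependsOn: [1, 2, 3, 4, 5] },
];

const batches = parallelBatches(outline);
```

For this outline the waves are stages 1 and 2, then 3 and 4, then 5, then 6. The waves are slightly stricter than the dependency graph requires (stage 3 could start as soon as stage 1 finishes, without waiting for stage 2), so treat them as an upper bound on serialization rather than a schedule.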
@@ -10,7 +10,7 @@ End-to-end examples showing how the orchestrator structures cycles for real scen
10
10
 
11
11
  ### Cycle 1 — Diagnosis
12
12
  ```
13
- plan.md:
13
+ roadmap.md:
14
14
  ## Bug Fix: WebSocket message loss during reconnection
15
15
 
16
16
  - [ ] Diagnose message loss during WebSocket reconnection
@@ -33,7 +33,7 @@ Agent report: "Root cause: reconnect() clears the message queue before the new s
33
33
  but should be deferred until onReconnect confirms the new socket is live.
34
34
  Confidence: High."
35
35
 
36
- plan.md updated:
36
+ roadmap.md updated:
37
37
  - [x] ~~Diagnose message loss during WebSocket reconnection~~
38
38
  - [ ] Fix root cause — queue.clear() at L47 of reconnect.ts runs too early
39
39
  - [ ] Validate fix under concurrent load
@@ -51,7 +51,7 @@ Agents spawned:
51
51
  Agent report: "Fixed. Moved queue.clear() to onReconnect callback after socket.readyState
52
52
  confirms OPEN. Added message buffer during reconnection window. 2 files changed."
53
53
 
54
- plan.md updated:
54
+ roadmap.md updated:
55
55
  - [x] ~~Diagnose message loss~~
56
56
  - [x] ~~Fix root cause~~
57
57
  - [ ] Validate fix under concurrent load
@@ -67,7 +67,7 @@ Agents spawned:
67
67
  ```
68
68
  Validation: PASS. Existing tests pass, reconnection preserves messages.
69
69
 
70
- plan.md updated:
70
+ roadmap.md updated:
71
71
  - [x] ~~Diagnose~~, [x] ~~Fix~~, [x] ~~Validate~~
72
72
  - [ ] Review fix
73
73
 
@@ -82,7 +82,7 @@ Agents spawned:
82
82
  Review: Clean. One medium finding — buffer should have a max size to prevent
83
83
  memory issues if reconnect fails indefinitely. Not blocking.
84
84
 
85
- All plan.md items complete.
85
+ All roadmap.md items complete.
86
86
 
87
87
  Action: complete — "Fixed WebSocket message loss during reconnection. Messages now
88
88
  buffered during reconnection window and drained on successful reconnect."
@@ -96,7 +96,7 @@ Action: complete — "Fixed WebSocket message loss during reconnection. Messages
 
  ### Cycle 1 — Spec
  ```
- plan.md:
+ roadmap.md:
  ## Feature: API Rate Limiting
 
  ### Spec & Planning
@@ -127,7 +127,7 @@ Agent report: "Spec saved to context/spec-rate-limiting.md.
  Standard 429 response with Retry-After header. Config via environment variables.
  Open questions resolved with user: Redis is already in stack, use it."
 
- plan.md updated (spec item marked done):
+ roadmap.md updated (spec item marked done):
  - [x] ~~Draft spec for API rate limiting~~
  - [ ] Plan implementation
  ...
@@ -151,7 +151,7 @@ Agents spawned:
  ```
  Plan review: PASS.
 
- plan.md updated (plan review done, starting implementation):
+ roadmap.md updated (plan review done, starting implementation):
  - [x] ~~Draft spec~~, [x] ~~Plan~~, [x] ~~Review plan~~
  - [ ] Implement rate limiting middleware
  - [ ] Implement rate limit configuration
@@ -174,7 +174,7 @@ Agents spawned:
 
  ### Cycle 1 — Plan + baseline
  ```
- plan.md:
+ roadmap.md:
  ## Refactor: Extract Token Service
 
  - [ ] Plan auth refactor — extract token service
@@ -197,7 +197,7 @@ Agents spawned (parallel):
  ```
  Plan complete, baseline captured (47 tests passing).
 
- plan.md updated:
+ roadmap.md updated:
  - [x] ~~Plan auth refactor~~
  - [x] ~~Capture behavioral baseline~~ (47 tests passing)
  - [ ] Create TokenService class with extracted logic
@@ -232,6 +232,6 @@ Agents spawned (parallel):
  ### Cycle 5 — Complete
  ```
  All 47 tests passing. Review clean.
- All plan.md items complete.
+ All roadmap.md items complete.
  Complete — "Extracted token logic into TokenService. All existing tests pass."
  ```
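
The examples in this file end a session once "All roadmap.md items complete." A minimal sketch of that check follows, assuming open items are lines that begin with `- [ ]` as in the hunks above. The helper name `count_open_items` and the temporary file are hypothetical and are not part of sisyphi.

```shell
#!/bin/bash
# Hypothetical helper (not part of sisyphi): count unchecked roadmap items.
# Assumes an open item is a line starting with "- [ ]", as in the examples.
count_open_items() {
  # grep -c prints the number of matching lines; it exits 1 when that
  # number is 0, so "|| true" keeps the function from reporting failure.
  grep -c '^[[:space:]]*- \[ \]' "$1" || true
}

# Build a throwaway roadmap that mirrors the refactor example.
roadmap="$(mktemp)"
cat > "$roadmap" <<'EOF'
## Refactor: Extract Token Service

- [x] ~~Plan auth refactor~~
- [x] ~~Capture behavioral baseline~~ (47 tests passing)
- [ ] Create TokenService class with extracted logic
EOF

open_items="$(count_open_items "$roadmap")"
echo "open items: $open_items"   # prints "open items: 1"
```

A count of 0 would correspond to the "all items complete" state shown in the final cycles.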
@@ -1 +0,0 @@
- {"version":3,"sources":["../src/shared/paths.ts"],"sourcesContent":["import { homedir } from 'node:os';\nimport { basename, join } from 'node:path';\n\nexport function globalDir(): string {\n return join(homedir(), '.sisyphus');\n}\n\nexport function socketPath(): string {\n return join(globalDir(), 'daemon.sock');\n}\n\nexport function globalConfigPath(): string {\n return join(globalDir(), 'config.json');\n}\n\nexport function daemonLogPath(): string {\n return join(globalDir(), 'daemon.log');\n}\n\nexport function daemonPidPath(): string {\n return join(globalDir(), 'daemon.pid');\n}\n\nexport function daemonUpdatingPath(): string {\n return join(globalDir(), 'updating');\n}\n\nexport function projectDir(cwd: string): string {\n return join(cwd, '.sisyphus');\n}\n\nexport function projectConfigPath(cwd: string): string {\n return join(projectDir(cwd), 'config.json');\n}\n\nexport function projectOrchestratorPromptPath(cwd: string): string {\n return join(projectDir(cwd), 'orchestrator.md');\n}\n\nexport function sessionsDir(cwd: string): string {\n return join(projectDir(cwd), 'sessions');\n}\n\nexport function sessionDir(cwd: string, sessionId: string): string {\n return join(sessionsDir(cwd), sessionId);\n}\n\nexport function statePath(cwd: string, sessionId: string): string {\n return join(sessionDir(cwd, sessionId), 'state.json');\n}\n\nexport function reportsDir(cwd: string, sessionId: string): string {\n return join(sessionDir(cwd, sessionId), 'reports');\n}\n\nexport function reportFilePath(cwd: string, sessionId: string, agentId: string, suffix: string): string {\n return join(reportsDir(cwd, sessionId), `${agentId}-${suffix}.md`);\n}\n\nexport function promptsDir(cwd: string, sessionId: string): string {\n return join(sessionDir(cwd, sessionId), 'prompts');\n}\n\nexport function contextDir(cwd: string, sessionId: string): string {\n return join(sessionDir(cwd, sessionId), 'context');\n}\n\nexport function planPath(cwd: string, sessionId: string): 
string {\n return join(sessionDir(cwd, sessionId), 'plan.md');\n}\n\nexport function logsPath(cwd: string, sessionId: string): string {\n return join(sessionDir(cwd, sessionId), 'logs.md');\n}\n\nexport function worktreeConfigPath(cwd: string): string {\n return join(projectDir(cwd), 'worktree.json');\n}\n\nexport function worktreeBaseDir(cwd: string): string {\n return join(cwd, '..', `${basename(cwd)}-sisyphus-wt`);\n}\n"],"mappings":";;;AAAA,SAAS,eAAe;AACxB,SAAS,UAAU,YAAY;AAExB,SAAS,YAAoB;AAClC,SAAO,KAAK,QAAQ,GAAG,WAAW;AACpC;AAEO,SAAS,aAAqB;AACnC,SAAO,KAAK,UAAU,GAAG,aAAa;AACxC;AAEO,SAAS,mBAA2B;AACzC,SAAO,KAAK,UAAU,GAAG,aAAa;AACxC;AAEO,SAAS,gBAAwB;AACtC,SAAO,KAAK,UAAU,GAAG,YAAY;AACvC;AAEO,SAAS,gBAAwB;AACtC,SAAO,KAAK,UAAU,GAAG,YAAY;AACvC;AAEO,SAAS,qBAA6B;AAC3C,SAAO,KAAK,UAAU,GAAG,UAAU;AACrC;AAEO,SAAS,WAAW,KAAqB;AAC9C,SAAO,KAAK,KAAK,WAAW;AAC9B;AAEO,SAAS,kBAAkB,KAAqB;AACrD,SAAO,KAAK,WAAW,GAAG,GAAG,aAAa;AAC5C;AAEO,SAAS,8BAA8B,KAAqB;AACjE,SAAO,KAAK,WAAW,GAAG,GAAG,iBAAiB;AAChD;AAEO,SAAS,YAAY,KAAqB;AAC/C,SAAO,KAAK,WAAW,GAAG,GAAG,UAAU;AACzC;AAEO,SAAS,WAAW,KAAa,WAA2B;AACjE,SAAO,KAAK,YAAY,GAAG,GAAG,SAAS;AACzC;AAEO,SAAS,UAAU,KAAa,WAA2B;AAChE,SAAO,KAAK,WAAW,KAAK,SAAS,GAAG,YAAY;AACtD;AAEO,SAAS,WAAW,KAAa,WAA2B;AACjE,SAAO,KAAK,WAAW,KAAK,SAAS,GAAG,SAAS;AACnD;AAEO,SAAS,eAAe,KAAa,WAAmB,SAAiB,QAAwB;AACtG,SAAO,KAAK,WAAW,KAAK,SAAS,GAAG,GAAG,OAAO,IAAI,MAAM,KAAK;AACnE;AAEO,SAAS,WAAW,KAAa,WAA2B;AACjE,SAAO,KAAK,WAAW,KAAK,SAAS,GAAG,SAAS;AACnD;AAEO,SAAS,WAAW,KAAa,WAA2B;AACjE,SAAO,KAAK,WAAW,KAAK,SAAS,GAAG,SAAS;AACnD;AAEO,SAAS,SAAS,KAAa,WAA2B;AAC/D,SAAO,KAAK,WAAW,KAAK,SAAS,GAAG,SAAS;AACnD;AAEO,SAAS,SAAS,KAAa,WAA2B;AAC/D,SAAO,KAAK,WAAW,KAAK,SAAS,GAAG,SAAS;AACnD;AAEO,SAAS,mBAAmB,KAAqB;AACtD,SAAO,KAAK,WAAW,GAAG,GAAG,eAAe;AAC9C;AAEO,SAAS,gBAAgB,KAAqB;AACnD,SAAO,KAAK,KAAK,MAAM,GAAG,SAAS,GAAG,CAAC,cAAc;AACvD;","names":[]}
@@ -1,11 +0,0 @@
- #!/bin/bash
- # Block Task tool — orchestrator should use sisyphus spawn CLI directly.
- # Passthrough (exit 0) if not in a sisyphus session.
-
- if [ -z "$SISYPHUS_SESSION_ID" ]; then
- exit 0
- fi
-
- cat <<'EOF'
- {"decision":"block","reason":"Do not use the Task tool. Use the sisyphus CLI to spawn agents:\n- sisyphus spawn --name \"agent-name\" --agent-type sisyphus:implement \"instruction\"\n- echo \"instruction\" | sisyphus spawn --name \"agent-name\"\nThen call sisyphus yield when done spawning."}
- EOF
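
The control flow of the removed hook can be sketched as below. It is wrapped in a function (the name `block_task_hook` is hypothetical) so that both branches can be exercised in one script, and the `reason` text is abbreviated for illustration. How the caller interprets the JSON is not shown in this diff, so the sketch only demonstrates what the script prints.

```shell
#!/bin/bash
# Sketch of the removed hook's logic; the function name and the shortened
# "reason" string are illustrative, not the package's actual script.
block_task_hook() {
  # Outside a sisyphus session: print nothing and succeed (passthrough).
  if [ -z "$SISYPHUS_SESSION_ID" ]; then
    return 0
  fi
  # Inside a session: print a JSON object carrying a "block" decision.
  cat <<'EOF'
{"decision":"block","reason":"Do not use the Task tool."}
EOF
}

# Exercise both branches by setting the variable for a single call.
outside="$(SISYPHUS_SESSION_ID="" block_task_hook)"
inside="$(SISYPHUS_SESSION_ID="demo-session" block_task_hook)"
echo "outside session: '${outside}'"
echo "inside session:  ${inside}"
```

The quoted heredoc delimiter (`'EOF'`) matters here: it prevents the shell from expanding anything inside the JSON payload.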