npm - sisyphi - Versions diffs - 1.2.2 → 1.2.12 - Mend

sisyphi 1.2.2 → 1.2.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (85)

package/dist/templates/orchestrator-plugin/skills/humanloop/SKILL.md DELETED

@@ -1,150 +0,0 @@
----
-name: humanloop
-description: >
-  Read before calling `sis ask`. Triggers when surfacing multiple questions or decisions to the user, presenting work for review/sign-off, or proposing concrete alternatives. Covers when a deck beats chat, how to design options as real forks the user can pick between, how to bundle related questions into one deck, and how to invoke synchronously so the orchestrator's process blocks until the user answers.
----
-# Talking to the user via decks
-`sis ask` posts a structured deck of questions to the user's dashboard inbox. They walk through it on their own time and you read structured JSON back. Use it instead of dumping a wall of questions into chat.
-This skill covers **what to put in a deck** and **how to invoke it**. Run `sis ask -h` for the CLI shape (file path, `--session`, the `poll` and `peek` subcommands).
-## Reach for a deck when
-- You have **2+ questions** to surface in one beat (bundle them into one deck).
-- You're presenting **work for review or sign-off** (a design, a plan, a completion summary).
-- You're choosing between **concrete alternatives** the user must pick.
-- The work will sit while the user thinks. Decks survive across cycles; chat does not.
-## Skip the deck when
-- It's a single, low-stakes question whose answer barely changes downstream work — just ask in chat.
-- You can settle the question yourself by reading code or running a tool. **Default to investigating before asking.**
-- The user is actively conversing with you — converting a live exchange into a deck adds friction.
-## How to invoke
-**Run `sis ask` in the foreground — let the Bash tool block.** The CLI waits internally for the user to resolve the deck (potentially 10+ minutes). Your pane stays alive in tmux for the duration; the daemon will not respawn you while a tool call is in flight. When the user answers, the bash returns stdout and you parse it inline.
-```bash
-result=$(sis ask "$deck")
-choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId')
-notes=$(echo "$result"  | jq -r '.responses[0].freetext // ""')
-```
-**Do not `run_in_background` and yield** — yielding kills your pane and any backgrounded bash with it; the next cycle's fresh orchestrator can only peek the on-disk deck (`sis ask peek`) and yield again, producing a polling loop. The daemon now refuses `sis orch yield` while a deck owned by orchestrator is pending; the supported pattern is foreground.
-Stdout on completion is one line of JSON: `{responses: [{id, selectedOptionId?, freetext?}, ...], completedAt}`. Branch on each response by its interaction `id`.
-If you respawn mid-wait and find a pending deck on disk (e.g. after a daemon restart that orphaned the prior bash), block on it with `sis ask poll <askId>` to re-attach. `sis ask peek <askId>` is non-blocking and reserved for respawn-recovery diagnostics. See `sis ask -h`.
-## Designing interactions
-### Each option is a concrete path forward
-The user picks an option to commit to a direction. Each option should name a real path with its tradeoffs spelled out, grounded in *this* codebase. Sign-off decks branch differently per option ("looks good", "minor fixes", "moderate fixes", "scope rework" each route the orchestrator somewhere different). Decision decks present mutually exclusive directions with named consequences.
-<example type="good">
-```
-title: "Session store backend?"
-subtitle: "Auth needs persistent sessions across restarts"
-kind: decision
-options:
-  in-memory:  "In-memory map — simplest. Loses sessions on restart; single-process only."
-  redis:      "Redis — survives restart, supports horizontal scale. New ops dependency."
-  postgres:   "Reuse existing Postgres — no new infra; ~10ms read latency vs Redis ~1ms."
-  defer:      "Ship in-memory now, migrate later if scale becomes real."
-allowFreetext: true
-freetextLabel: "Different framing — describe it"
-```
-</example>
-<example type="bad">
-```
-title: "Happy with this design?"
-options:
-  1. Yes
-  2. No, start over
-  3. Maybe, with comments
-  4. (no option, just freetext)
-```
-"Happy?" names a feeling, not a fork. Options 3 and 4 both collapse to freetext, forcing the user to invent the actual decision. Rewrite as specific decisions about specific elements of the design.
-</example>
-### Use `allowFreetext: true` as a safety valve, not the primary input
-Freetext catches "anything else?" — opinions or context the options didn't anticipate. When freetext IS the answer you want, write a chat message instead.
-<example type="bad">
-```
-title: "Approve?"
-options:
-  1. Approve
-  2. Reject
-  3. Comment
-allowFreetext: true
-```
-A freetext form wearing option clothing. Either name what "reject" actually routes to (back to design? abandon? try a different framing?), or drop the deck and ask in chat.
-</example>
-### Bound option count to 2–4
-Above four, options become too granular for the user to weigh; below two, you've collapsed into a yes/no that's faster to ask in chat.
-### Ground options in what you've already gathered
-Each option label should reference specifics from the codebase, plan, or exploration you just did — file names, framework constraints, prior decisions. When you can't fill in specifics, investigate before asking.
-### One concern per interaction
-When two questions interact, give them separate `id` / `title` / `options` inside the same deck (see Bundling below). One interaction asks one thing.
-## `kind` — display hint
-| kind | use for |
-|---|---|
-| `decision` | fork in the road; user picks a path forward |
-| `validation` | sign-off on completed work |
-| `notify` | FYI; user acknowledges |
-| `context` | surfacing background that needs a response |
-| `error` | something went wrong; user picks a recovery |
-The dashboard uses `kind` for inbox icons and sort weight. Mis-tagging trains the user to ignore the icons. Pick the closest fit.
-## Bundling
-If you'd otherwise submit two decks in the same beat, merge them. One deck with multiple `interactions` is one context switch for the user; two decks is two.
-```bash
-deck="$SISYPHUS_SESSION_DIR/context/.ask-$(date +%s).json"
-cat > "$deck" <<'EOF'
-{
-  "title": "Phase 2 sign-off + follow-on decisions",
-  "interactions": [
-    {
-      "id": "approve-phase-2",
-      "title": "Phase 2 looks good?",
-      "kind": "validation",
-      "options": [...]
-    },
-    {
-      "id": "phase-3-scope",
-      "title": "Phase 3 scope?",
-      "kind": "decision",
-      "options": [...]
-    }
-  ]
-}
-EOF
-# Then invoke `sis ask "$deck"` synchronously (foreground bash) — blocks until answered.
-# Each interaction returns its own selectedOptionId / freetext in output.responses[], indexed by id.
-```
-## Submission notes
-- The deck is validated at submit (precise errors — trust them).
-- `kind` is an enum: `notify` | `validation` | `decision` | `context` | `error`. No other values accepted (see the table above for which to pick).
-- `bodyPath` points at a markdown file instead of inlining the body in JSON. The path is resolved **relative to the deck JSON's directory** and must stay inside it (no `..`, no symlinks out, no absolute paths pointing elsewhere). Practical pattern: write the deck JSON next to its body file — e.g. both inside `$SISYPHUS_SESSION_DIR/context/` — and use a basename like `"completion-summary.md"`. Mutually exclusive with `body`.
-- On completion, stdout is one line of JSON: `{responses, completedAt}`. Parse `responses[]` and dispatch on each interaction's `id`.
-- See `sis ask -h` for the full CLI surface.
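
Reviewer sketch (not part of the deleted file): the removed skill says `sis ask` prints one line of JSON shaped like `{responses: [{id, selectedOptionId?, freetext?}, ...], completedAt}` and that callers should branch on each response's `id`. A minimal Python illustration of that dispatch follows. The function name `dispatch_responses`, the handler signature, the option id `looks-good`, and the timestamp format are all assumptions made for the example; only the field names come from the text above.

```python
from __future__ import annotations

import json
from typing import Callable, Dict, List, Optional

# Handler receives (selectedOptionId or None, freetext or ""). Hypothetical shape.
Handler = Callable[[Optional[str], str], None]


def dispatch_responses(stdout_line: str, handlers: Dict[str, Handler]) -> List[str]:
    """Route each response to the handler registered for its interaction id.

    Returns the ids that had no handler so the caller can decide what to do.
    """
    payload = json.loads(stdout_line)
    unhandled: List[str] = []
    for response in payload["responses"]:
        interaction_id = response["id"]
        option = response.get("selectedOptionId")  # optional per the documented shape
        notes = response.get("freetext") or ""     # optional; normalise to ""
        handler = handlers.get(interaction_id)
        if handler is None:
            unhandled.append(interaction_id)
            continue
        handler(option, notes)
    return unhandled


# Made-up sample output; the ids mirror the bundled deck in the deleted file.
SAMPLE = json.dumps({
    "responses": [
        {"id": "approve-phase-2", "selectedOptionId": "looks-good"},
        {"id": "phase-3-scope", "freetext": "defer it"},
    ],
    "completedAt": "2025-01-01T00:00:00Z",
})
```

Returning the unhandled ids, rather than raising, keeps the sketch tolerant of decks that gain interactions later; a stricter caller could treat a non-empty list as an error.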

package/dist/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md DELETED

@@ -1 +0,0 @@
-- `sis orch yield --mode <mode>` is required on every yield. Pass the current mode to stay in it; pass a different mode to transition. There is no implicit "keep current mode" fallback — the CLI rejects yields without `--mode`.

package/dist/templates/orchestrator-plugin/skills/orchestration/SKILL.md DELETED

@@ -1,29 +0,0 @@
----
-name: orchestration
-description: >
-  Task breakdown patterns for sisyphus orchestrator sessions. How to structure tasks, sequence agents, and manage cycles for debugging, feature builds, refactors, and other common workflows. Use when planning orchestration strategy or structuring a multi-agent session.
----
-# Orchestration Patterns
-How to structure sisyphus sessions for common task types. This skill helps the orchestrator break work into tasks, choose agent types, sequence cycles, and handle failures.
-## Core Principles
-1. **roadmap.md is the orchestrator's memory.** roadmap.md and agent reports persist across cycles — they're all you have. Keep roadmap.md current and specific enough that a fresh orchestrator can pick up where you left off.
-2. **Agents are disposable.** Each agent gets one focused instruction. If it fails or the scope changes, spawn a new one — don't try to redirect a running agent.
-3. **Parallelize when independent.** If two tasks don't share files or depend on each other's output, spawn agents for both in the same cycle.
-4. **Interleave verification.** Don't batch all implementation and defer review to the end. Embed critique and validation checkpoints between stages based on risk — the more subsequent work depends on a stage being correct, the more it needs verification before you build on it.
-5. **Reports are handoffs.** Agent reports should contain everything the next cycle's orchestrator needs — what was done, what was found, what's unresolved, where artifacts were saved.
-## Agent Types
-Available agent types are listed under **Available Agent Types** in your prompt. Use `--agent-type` with `sis agent spawn`.
-For task breakdown patterns per workflow type, see [task-patterns.md](task-patterns.md).
-For end-to-end workflow examples, see [workflow-examples.md](workflow-examples.md).
-For strategy.md authoring — stage patterns, process shapes, format — see [strategy.md](strategy.md).
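
Reviewer sketch (not part of the deleted file): core principle 3 above says two tasks may be spawned in the same cycle when they neither share files nor depend on each other's output. One way to express that test is shown below, assuming a hypothetical `Task` record with a file set and a set of direct dependencies. None of these names come from the package, and only direct dependencies are checked.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Task:
    """Hypothetical task record: a name, the files it touches, its direct deps."""
    name: str
    files: FrozenSet[str] = frozenset()
    depends_on: FrozenSet[str] = frozenset()


def independent(a: Task, b: Task) -> bool:
    """True when the tasks share no files and neither directly depends on the other."""
    shares_files = bool(a.files & b.files)
    depends = a.name in b.depends_on or b.name in a.depends_on
    return not shares_files and not depends


def same_cycle_batch(candidates: List[Task]) -> List[Task]:
    """Greedily pick tasks that are pairwise independent, in the given order."""
    batch: List[Task] = []
    for task in candidates:
        if all(independent(task, chosen) for chosen in batch):
            batch.append(task)
    return batch
```

The greedy pass is order-sensitive and does not search for the largest possible batch; it only guarantees that whatever it returns is safe to run together under the stated rule.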

package/dist/templates/orchestrator-plugin/skills/orchestration/strategy.md DELETED

@@ -1,160 +0,0 @@
-# Strategy Reference
-Reference material for writing and updating strategy.md — the document that maps the shape of the work across stages.
-## strategy.md Format
-```markdown
-## Completed
-[Compressed summaries of finished stages — delete detail, keep outcomes]
-## Current Stage: [name]
-[Detailed process flow with exit criteria and backtrack triggers]
-## Ahead
-[Sketched future stages — one line each: name + what it covers]
-[Only as far as you can currently see — it's OK if this is vague]
-```
-**Principles:**
-- **Detail the current stage** — concrete enough that the orchestrator can execute without re-reading this skill
-- **Sketch what's ahead** — enough continuity that future updates don't lose the thread, not so much that you're committing to unknowns
-- **Every detailed stage gets exit criteria** — concrete enough to evaluate, not so rigid they become checkboxes
-- **Include user gates** — where does this stage need the user? What decision or approval?
-## Stages name kinds of work, not areas of code
-A strategy stage is a **process phase** — `discovery`, `planning`, `implementation`, `validation`, `spike`. It describes the *kind* of thinking happening that stage. It is **not** a work-area label like `auth-refactor`, `tui-panel`, `migration-script`, or `foundations`.
-Work areas are the plan agent's job. They live in `context/{plan-lead-agent-id}/plan-stage-N-*.md` and structure the implementation phase from the inside. Keep them out of `strategy.md`.
-<example>
-✓ Correct — process phases:
-```
-## Ahead
-- **implementation** — phased build per the plan outline (5 sub-stages: foundations → ask-cli → tui → orphan-handling → migration). Critique + validate per stage.
-- **validation** — run e2e recipe end-to-end, capture evidence, user gate.
-```
-✗ Wrong — work areas masquerading as stages:
-```
-## Ahead
-- **foundations** — humanloop refactor + ask-store helpers
-- **ask-cli + haiku + template** — CLI command and tool-use loop
-- **tui-integration** — inbox panel and key routing
-- **orphan-handling** — kill/complete paths
-- **migration + e2e validation** — drop old command, run recipe
-```
-The second list is a roadmap of code work. Strategy.md collapses into a task list and the process shape (when do we critique? when do we validate? what's the user gate?) disappears.
-</example>
-When you're tempted to name a stage after a code area, that signals you're sketching the plan, not the strategy. Push that detail down into the plan agent's output and keep `strategy.md` at the process-shape layer.
-## Default Pipeline Shape
-The session's effort tier dictates the default pipeline. **Use this shape unless the problem explicitly demands more or less.** The user can change tiers via `sis session effort <low|medium|high|xhigh>`.
-<!--EFFORT:LOW-->
-**Pipeline:** `plan → implement → validate`
-A single plan agent, a single implement agent, a single validate agent. No spec, problem, test-spec, or review-plan stages — the user's request is the requirement; ask in-band if anything's ambiguous. If the work is wrapper-shaped (every change backs onto an existing CLI/API/handler), move directly from discovery into implementation mode without a planning-mode cycle at all.
-<!--/EFFORT-->
-<!--EFFORT:MEDIUM-->
-**Pipeline:** `(spec, if behavior changes) → plan → implement → validate`
-Add `sisyphus:review-plan` only when the plan covers multi-domain integration. Add `sisyphus:test-spec` **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"). Silence is a "no" — do not proactively ask, do not infer from feature risk. Spawn `sisyphus:spec` and `sisyphus:problem` only when the goal has multiple valid framings or the design space is genuinely open.
-<!--/EFFORT-->
-<!--EFFORT:HIGH,XHIGH-->
-**Pipeline:** `discovery → spec → planning (with parallel review-plan) → phased implementation with critique/validate checkpoints → validation`
-`sisyphus:review-plan` runs after the plan is drafted. `sisyphus:spec` spawns whenever a feature adds user-visible behavior. `sisyphus:problem` spawns when the goal is nebulous. Append `+ test-spec` to the planning stage **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"); silence is a "no." When justified, `sisyphus:test-spec` spawns in parallel with the high-level plan at Cycle 2, not after implementation — post-implementation test-spec silently describes what the code does rather than what it should do.
-<!--/EFFORT-->
-**Re-evaluate the tier when scope shifts mid-session.** A MEDIUM feature that uncovers a new subsystem may have crossed into HIGH; a HIGH feature whose scope was narrowed may have dropped to MEDIUM. Re-run `sis session effort` and re-invoke this skill rather than continuing under the old tier's pipeline.
-## Choosing a Different Shape
-If the default doesn't match the problem, these canonical progressions are the next-best starting points — pick the closest one and prune what's already clear, rather than inventing custom shapes:
-```
-discovery → spec → planning → implementation → validation
-exploration → spike → design → implementation → validation
-investigation → recommendation → (user decides) → implementation
-analysis → phased-transformation → verification
-discovery → product-design → technical-investigation → architecture → implementation → validation
-```
-Add a new stage *type* only when the problem demands a kind of work the patterns don't cover — for example a `spike` to prove feasibility, a `compatibility-check` before a migration, or a `prototype` before committing. The test for "is this a real new stage?" is whether it names a different kind of thinking, not a different slice of code.
-## Stage Patterns
-Use these as starting points. Invent new stage types when the problem demands it. Add backtrack edges where you can foresee things going wrong.
-### discovery
-**Use when:** Goal is undefined, ambiguous, or has shifted — need to clarify what "done" looks like before any other stage runs. Also re-entered mid-session when a pivot invalidates the current goal.
-- Process: read prior context (goal.md, prior strategy if any) → if the goal is provably clear, write goal.md and run the clarity-confirmation deck → otherwise spawn `sisyphus:problem` for interactive exploration → user iterates → fold result into goal.md → set effort tier → write or revise strategy.md
-- Exit: goal.md is current and confirmed; effort tier is set; strategy.md exists for this iteration
-- Produces: goal.md, strategy.md, optionally context/problem.md or context/problem-bifurcation.md
-- Backtrack: if scope reveals multiple independent projects, issue a decomposition deck and let the user pick a lead — record the others under "Known follow-ups" in goal.md
-### exploration
-**Use when:** Need to understand the technical landscape before committing to an approach.
-- Process: spawn explore agents (each producing a focused context doc) → review findings → identify gaps → re-explore or converge
-- Exit: enough understanding to make decisions — key questions answered, relevant patterns documented
-- Produces: context documents (one per investigation angle, not one sprawling doc)
-### spike
-**Use when:** Feasibility is uncertain — need to prove an approach works before investing in full design.
-- Process: identify the riskiest assumption → build a minimal prototype that tests it → evaluate results → present findings to user if the spike changes the approach
-- Exit: feasibility confirmed or denied with evidence, decision on path forward
-- Produces: spike findings in context/, prototype code (may be throwaway)
-- Backtrack: if spike fails → re-explore alternatives
-### spec
-**Use when:** Need to define what to build and how, in a single interactive session.
-- Process: spawn sisyphus:spec → lead explores codebase, asks user questions, dispatches engineer for design and a single writer for requirements → user reviews via TUI → lead deepens design with findings
-- Exit: user-approved design + requirements with testable acceptance criteria
-- Produces: context/design.md + context/design.json + context/requirements.json + context/requirements.md
-- Backtrack: if problem was misframed → re-explore or re-discover
-### planning
-**Use when:** Design approved, need an executable breakdown.
-- Process: spawn plan lead with spec outputs (requirements + design) as inputs → adversarial review of plan → create e2e verification recipe
-- Exit: reviewed plan + executable e2e-recipe.md that defines how to prove the feature works
-- Produces: phased implementation plan + e2e recipe in context/
-- Backtrack: if plan reveals design infeasibility → revisit spec
-### implementation
-**Use when:** Plan exists, time to build.
-- Process: for each phase → detail-plan → spawn implement agents → single critique pass → refine → validate phase
-- Exit: all phases validated with evidence, no critical review findings remain
-- Loops: none within a phase — review runs once, fixes land, then validation. If review surfaces architectural issues, backtrack to plan; otherwise advance.
-- Backtrack: if 2+ agents hit same unexpected complexity → revisit plan or spec; if review finds architectural issues → revisit plan
-### validation
-**Use when:** Implementation complete, need to prove it works end-to-end.
-- Process: run full e2e recipe → collect evidence (command output, screenshots, responses) → assess against success criteria → step back and check if the goal is actually met
-- Exit: all recipe steps pass with concrete evidence, original goal satisfied
-- Produces: validation report with evidence
-- Backtrack: if bugs found → implementation; if architectural issues → spec
-## Mid-session shape revisions
-When the work in flight reveals the strategy itself is off, escalate up this ladder — reach for the lowest-cost move that fits.
-1. **Revise in place.** Stage detail evolved but the pipeline shape holds. Edit `strategy.md` and `roadmap.md`; continue.
-2. **`sisyphus:strategize`.** Approach is wrong but artifacts (specs, explorations, reports) still apply. Annotates the pivot into `strategy.md` and yields `--mode discovery` with a fresh orchestrator.
-3. **`sis session clone <goal>`.** The session is actually two (or more) independent projects. Forks scope into a new top-level session; update `goal.md`/`roadmap.md` here to drop what was cloned.
-4. **`sis session rollback <sessionId> <cycle>`.** A specific cycle introduced state to discard. Rewinds and pauses the session — cycles after the target are lost. Last resort; the others preserve history.
-When the user is the source of the change, update `goal.md` first — strategy revision is downstream of goal.
-## Design Philosophy
-Frameworks to inform process shape selection — use them to *choose the right shape*, not to follow mechanically:
-- **Double Diamond** — Diverge to explore, converge on a definition; diverge on solutions, converge on implementation. Use when requirements are unclear or the problem needs defining.
-- **OODA (Observe–Orient–Decide–Act)** — Tight sensing/reacting loops. Use when the situation is fluid and the cost of wrong moves is low (debugging, spikes, incident response).
-- **Cynefin** — Match approach to domain. Clear → best practice. Complicated → analyze then execute. Complex → probe, sense, respond. Chaotic → act to stabilize.
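
Reviewer sketch (not part of the deleted file): the "Default Pipeline Shape" section above maps each effort tier to a default stage sequence. The lookup below restates that mapping in Python under simplifying assumptions: the function name `default_pipeline` and the `behavior_changes` flag are invented for the example, and the HIGH/XHIGH stage labels are shortened (the parallel `review-plan` and the critique/validate checkpoints named in the source are not modelled). The LOW-tier shortcut for wrapper-shaped work is also omitted.

```python
from __future__ import annotations

from typing import List


def default_pipeline(tier: str, behavior_changes: bool = False) -> List[str]:
    """Return the default stage sequence for an effort tier, per the text above."""
    tier = tier.lower()
    if tier == "low":
        return ["plan", "implement", "validate"]
    if tier == "medium":
        stages = ["plan", "implement", "validate"]
        # The source adds spec to MEDIUM only "if behavior changes".
        return ["spec"] + stages if behavior_changes else stages
    if tier in ("high", "xhigh"):
        return ["discovery", "spec", "planning", "implementation", "validation"]
    raise ValueError(f"unknown effort tier: {tier!r}")
```

Raising on an unknown tier mirrors the four values the source lists for `sis session effort` (`low`, `medium`, `high`, `xhigh`); anything else is treated as a caller error in this sketch.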

package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md DELETED

@@ -1,266 +0,0 @@
-# Work Breakdown Patterns
-Patterns for how the orchestrator should structure roadmap.md for common workflow types. Each pattern shows the plan structure, agent assignments, cycle sequencing, and failure handling.
----
-## Bug Fix
-### When to use
-Something is broken. User reports a bug, test is failing, behavior is wrong.
-### Plan structure
-```
-## Bug Fix: [description]
-- [ ] Diagnose root cause of [bug description]
-- [ ] Implement fix for [root cause]
-- [ ] Validate fix — regression tests pass, bug is resolved
-- [ ] Review fix for unintended side effects
-```
-### Cycle plan
-- **Cycle 1**: Spawn `sisyphus:debug` for diagnosis. Yield.
-- **Cycle 2**: Read diagnosis report. If confident root cause found, spawn `sisyphus:implement` for fix with the diagnosis as context. Yield.
-- **Cycle 3**: Spawn `sisyphus:validate` for validation. Yield.
-- **Cycle 4**: If validation passes, spawn `sisyphus:review` for review. If fails, update plan with failure context and respawn implement. Yield.
-- **Cycle 5**: Review results. Complete or address review findings.
-### Failure modes
-- **Debug inconclusive**: Add more context to plan, respawn debug with narrower scope or different focus areas.
-- **Fix breaks other things**: Validation catches this. Feed validation failures back into a new implement cycle.
-- **Root cause was wrong**: Update plan with what was learned, respawn debug.
-### Parallelization
-Usually serial — diagnosis must complete before fix, fix before validation. Exception: if the bug affects multiple independent areas, spawn multiple debug agents in parallel.
----
-## Feature Build (Small — 1-3 files)
-### When to use
-Clear requirements, small scope, no formal requirements document needed.
-### Plan structure
-```
-## Feature: [description]
-- [ ] Plan implementation for [feature]
-- [ ] Implement [feature]
-- [ ] Validate implementation
-```
-### Cycle plan
-- **Cycle 1**: Spawn `sisyphus:plan` for planning. Yield.
-- **Cycle 2**: Spawn `sisyphus:implement` with plan path. Yield.
-- **Cycle 3**: Spawn `sisyphus:validate` for validation. Yield.
-- **Cycle 4**: Complete or fix issues.
-### Parallelization
-Serial. Too small to benefit from parallel agents.
----
-## Feature Build (Medium — 4-10 files)
-### When to use
-Feature with moderate complexity. Requirements may need clarification. Multiple files across a few modules.
-### Plan structure
-```
-## Feature: [description]
-### Requirements & Design
-- [ ] (conditional) Problem exploration — if goal is nebulous, explore before spec
-- [ ] Requirements — define acceptance criteria
-- [ ] Design — architecture, component boundaries, data models
-- [ ] Create implementation plan from requirements + design
-- [ ] Review plan against requirements + design
-### Implementation
-- [ ] Phase 1 — [foundation/types/interfaces]
-- [ ] Phase 2 — [core logic]
-- [ ] Critique phases 1-2
-- [ ] Phase 3 — [integration/wiring]
-- [ ] Validate — smoketest full feature e2e
-- [ ] Review implementation
-```
-Note: critique and validation are embedded between implementation phases, not deferred to the end. Phase 1 (types) is low-risk and doesn't need its own review, but critique catches issues before Phase 3 builds on them. Validation happens after integration, when all the pieces come together.
-### Cycle plan
-- **Cycle 0** (conditional): If the problem is nebulous — multiple valid framings, unclear what "done" looks like — spawn `sisyphus:problem` for interactive exploration. Yield `--mode discovery`. Skip if goal is clear and acceptance criteria are obvious.
-- **Cycle 1**: Spawn `sisyphus:spec` for combined design + requirements. Yield. (Human iterates inside the spec session.)
-- **Cycle 2**: Spawn `sisyphus:plan` for plan. Yield.
-- **Cycle 3**: Spawn `sisyphus:review-plan` for review. If fail, respawn plan with issues. Yield.
-- **Cycle 4**: Spawn `sisyphus:implement` for Phase 1. Yield.
-- **Cycle 5**: Spawn `sisyphus:implement` for Phase 2. Phase 1 is types — low risk, doesn't need its own validation. Yield.
-- **Cycle 6**: Spawn `sisyphus:review` for critique of phases 1-2. This is the checkpoint before integration builds on top. Yield.
-- **Cycle 7**: Address critique findings + spawn `sisyphus:implement` for Phase 3. Yield.
-- **Cycle 8**: `sis orch yield --mode validation` for e2e smoketest. Validation mode proves the feature works — operator for UI, evidence for every claim.
-- **Cycle 9**: Address validation failures (back to `--mode implementation`) or complete.
-### Failure modes
-- **Spec needs human input**: Mark session as needing human review. Orchestrator notes open questions.
-- **Plan fails review**: Feed review issues back, respawn planner.
-- **Critique finds issues in foundation**: Fix before starting integration — don't build on shaky ground.
-- **Validation fails**: Feed specifics back to implement agent for the failing area.
-### Parallelization
-Phases without dependencies can run in parallel. Types/interfaces (Phase 1) must complete before implementation phases that consume them. Critique can run alongside detail-planning for the next phase.
----
-## Feature Build (Large — 10+ files)
-### When to use
-Cross-cutting feature, multiple domains, needs team coordination. Uses **progressive planning** — high-level outline first, then detail-plan each stage as it's reached.
-### Plan structure
-```
-## Feature: [description]
-### Requirements & Design
-- [ ] (conditional) Problem exploration — if goal is nebulous
-- [ ] Requirements
-- [ ] Design
-### Stage Outline (high-level only — no file-level detail yet)
-1. [domain A foundation] — no deps — ~N cycles
-2. [domain B foundation] — no deps — ~N cycles
-   → critique stages 1-2 (foundation is low-risk individually, but review before building on it)
-3. [domain A implementation] — depends on 1 — ~N cycles
-4. [domain B implementation] — depends on 2 — ~N cycles
-   → critique + validate stages 3-4 (core logic, high risk — verify before integration)
-5. [integration layer] — depends on 3, 4 — ~N cycles
-   → validate end-to-end (integration is where accumulated assumptions break)
-6. [final review] — depends on all
-### Current Stage: [whichever is active]
-See context/{plan-lead-agent-id}/plan-stage-N-{name}.md for detail plan. (Path comes from the plan lead's submission report.)
-- [ ] [task-level items from detail plan]
-```
-Note: verification checkpoints are embedded in the stage outline, not deferred to a final phase. The level of rigor varies — foundation stages get a light critique, core logic gets critique + validation, integration gets full e2e validation. This is judgment, not formula.
-### Cycle plan
-- **Cycle 0** (conditional): If the problem is nebulous, spawn explore agents for technical landscape (yield `--mode discovery`), then spawn `sisyphus:problem` for interactive problem exploration (yield `--mode discovery`). May take 1-3 discovery cycles. Skip if the goal and scope are already clear.
-- **Cycle 1**: Spawn `sisyphus:spec` for combined design + requirements. Yield. (Human iterates inside the spec session.)
-- **Cycle 2**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Include verification checkpoints between stages based on risk." If the user's initial prompt or goal.md explicitly requested tests, also spawn `sisyphus:test-spec` for test properties in parallel; otherwise skip. Yield.
-- **Cycle 4**: Review outline. Spawn `sisyphus:plan` to **detail-plan stage 1 only** (provide outline as context). The plan agent saves under its own subdir and reports the full path — carry that path forward for the implement cycle. Yield.
-- **Cycle 5**: Spawn `sisyphus:implement` for stage 1. If stage 2 is independent, spawn `sisyphus:plan` to detail-plan stage 2 in parallel. Yield.
-- **Cycle 6**: Spawn `sisyphus:implement` for stage 2 (if detail-planned). Spawn `sisyphus:review` to critique stages 1-2 in parallel — foundation review before core logic builds on it. Detail-plan stage 3 in parallel. Yield.
-- **Cycle 7**: Address critique findings. Spawn `sisyphus:implement` for stage 3. Yield.
-- **Cycle 8**: Spawn `sisyphus:implement` for stage 4. Spawn `sisyphus:review` to critique stage 3 in parallel. Yield.
-- **Cycle 9**: Spawn `sisyphus:validate` for stages 3-4 — core logic checkpoint before integration. Address stage 3 critique. Yield.
-- **Cycle 10+**: Implement integration stage. Final review. Then `sis orch yield --mode validation` for comprehensive e2e proof.
-### Failure modes
-- **Detail-plan agent can't produce quality output**: The stage is still too large. Break it into sub-stages in the outline and detail-plan each sub-stage individually.
-- **Integration failures**: Often means contracts between domains don't match. Spawn debug agent targeting the integration seam.
-- **Stage N implementation invalidates stage N+1 outline**: Update the high-level outline. This is expected — it's why you don't detail-plan everything upfront.
-- **Critique finds issues after multiple stages built on top**: This is the scenario verification checkpoints exist to prevent. If it happens, you waited too long to review — add earlier checkpoints to the roadmap going forward.
-### Parallelization
-Maximize within the progressive pattern. Independent stages run in parallel. Detail-planning the next stage runs alongside implementing the current one. Critique and validation agents run alongside the next stage's planning or implementation. Foundation stages complete before dependent stages. Integration waits for all domain implementations.
----
-## Refactor
-### When to use
-Restructure code without changing behavior. Move files, rename abstractions, consolidate patterns.
-### Plan structure
-```
-## Refactor: [description]
-- [ ] Analyze current structure and plan refactor
-- [ ] Capture behavioral snapshot (existing tests + manual checks)
-- [ ] Execute refactor phase 1 — [structural changes]
-- [ ] Execute refactor phase 2 — [update consumers]
-- [ ] Validate behavior preserved — all original tests pass
-- [ ] Review for missed references, dead code, broken imports
-```
-### Cycle plan
-- **Cycle 1**: Spawn `sisyphus:plan` for analysis + `sisyphus:validate` to capture baseline (parallel). Yield.
-- **Cycle 2**: Spawn `sisyphus:implement` for phase 1. Yield.
-- **Cycle 3**: Spawn `sisyphus:implement` for phase 2 + `sisyphus:validate` for phase 1 (parallel). Yield.
-- **Cycle 4**: Spawn `sisyphus:validate` for full validation. Yield.
-- **Cycle 5**: Spawn `sisyphus:review` for final review. Complete.
-### Key principle
-**Behavior preservation is the only metric.** The refactor is correct if and only if all existing tests pass and externally observable behavior is unchanged.
-### Parallelization
-Limited. Refactor phases are often sequential (move before update consumers). Validation can run in parallel with the next phase if they touch different files.
----
-## Code Review
-### When to use
-PR review, pre-merge check, or periodic quality audit.
-### Plan structure
-```
-## Review: [scope]
-- [ ] Review [scope] for issues
-- [ ] (conditional) Fix critical/high issues found
-- [ ] Verify fixes landed (type-check, tests pass)
-```
-### Cycle plan
-- **Cycle 1**: Spawn `sisyphus:review` for review. Yield.
-- **Cycle 2**: If critical/high issues, spawn `sisyphus:implement` for fixes. If clean, complete.
-- **Cycle 3**: Verify fixes landed by reading fix-agent reports + running type-check/tests. Complete. Do **not** spawn a second review pass — review runs once, validation catches regressions.
-### Parallelization
-Review itself parallelizes internally (subagents per concern). Fix cycle is usually serial.
----
-## Investigation / Spike
-### When to use
-Need to understand something before committing to an approach. Prototype, explore, or answer a technical question.
-### Plan structure
-```
-## Investigation: [question/area]
-- [ ] Investigate [question/area]
-- [ ] Summarize findings and recommendation
-```
-### Cycle plan
-- **Cycle 1**: Spawn `sisyphus:debug` (for code investigation) or `sisyphus:general` (for broader research). Yield.
-- **Cycle 2**: Spawn `sisyphus:general` to synthesize findings. Complete.
-### Parallelization
-If investigating multiple independent areas, spawn parallel agents each exploring a different angle.
----
-## Tactician-Driven Implementation
-### When to use
-The plan exists and you want automated cycle-by-cycle execution without manual orchestrator decisions. The tactician reads the plan, dispatches one phase at a time, and tracks progress.
-### Plan structure
-```
-## Tactician Execution
-- [ ] Execute implementation plan at [path] using tactician loop
-```
-### Cycle plan
-This is a single-item pattern. The orchestrator spawns the tactician once:
-- **Cycle 1**: Spawn `sisyphus:tactician` with plan path. The tactician internally dispatches implement/validate agents via submit tool actions. The orchestrator's role is minimal — just monitor the tactician's completion report.
-### When NOT to use
-- When you need human checkpoints between phases
-- When phases have external dependencies (waiting on API access, design review, etc.)
-- When the task requires creative decisions the tactician shouldn't make alone
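
Reviewer sketch (not part of the deleted file): the large-feature stage outline above lists six stages with explicit dependencies (1 and 2 have none, 3 depends on 1, 4 on 2, 5 on 3 and 4, 6 on all). The groups of stages that could run in parallel can be derived from those dependencies with the standard-library `graphlib` module, as sketched below. The stage keys are shortened labels chosen for the example, and the critique and validation checkpoints that the source interleaves between stages are not modelled.

```python
from __future__ import annotations

from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each stage maps to the stages it depends on, following the outline above.
STAGE_DEPS: Dict[str, Set[str]] = {
    "stage-1": set(),
    "stage-2": set(),
    "stage-3": {"stage-1"},
    "stage-4": {"stage-2"},
    "stage-5": {"stage-3", "stage-4"},
    "stage-6": {"stage-1", "stage-2", "stage-3", "stage-4", "stage-5"},
}


def stage_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group stages into waves; every stage in a wave has all its deps finished."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the outline has a cycle
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave complete before asking again
    return waves
```

For the outline above this yields four waves: stages 1 and 2, then 3 and 4, then 5, then 6. That matches the source's note that foundation stages complete before dependent stages and integration waits for all domain implementations.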