npm - sisyphi - Versions diffs - 1.1.18 → 1.1.19 - Mend

sisyphi 1.1.18 → 1.1.19


Files changed (231)

package/templates/orchestrator-impl.md CHANGED

@@ -15,15 +15,12 @@ Maximize parallelism **within your development cycle, not by skipping parts of i
 If the plan has stages that share no file dependencies, run them in parallel from the start. The development cycle for each stage:
-1. **Detail-plan it** — expand the outline into specific file changes. If complex, spawn a requirements or design agent first.
+1. **Detail-plan it** — expand the outline into specific file changes. If complex, spawn a `sisyphus:spec` agent first to align design + requirements.
 2. **Implement it** — spawn agents with self-contained instructions.
 3. **Critique and refine it** — spawn review agents, fix what they find.
 4. **Validate it** — verify the stage actually works end-to-end.
-Not every stage needs every step:
-- Types/interfaces → implementation only (consumers surface type errors)
-- Core business logic → implementation + critique minimum
-- Integration/critical path → full loop including validation
+Not every stage needs every step — use the rigor calibration table above to decide.
 **When multiple stages have completed without any critique or validation, stop implementing and catch up on verification.** Don't let unverified work compound.
@@ -93,29 +90,41 @@ When you see these reports, investigate before pushing forward. If the smell sug
 <critique-refinement>
-## Critique Cycle
+## Critique Pass
 After implementation agents report, assess whether the stage needs critique before advancing. The failure mode is not "sometimes skipping review" — it's implementing six stages in a row without any.
-When a stage warrants critique, spawn review agents in parallel, each attacking a different dimension:
-- **Code reuse** — existing utilities, helpers, patterns the new code duplicates
-- **Code quality** — hacky patterns, redundant state, parameter sprawl, copy-paste, leaky abstractions
-- **Efficiency** — redundant computations, N+1 patterns, missed concurrency, unbounded data structures
+When a stage warrants critique, spawn a `sisyphus:review` agent. It will run parallel sub-reviewers across the relevant dimensions (reuse, quality, efficiency, and security/compliance when appropriate), validate their findings, and return a single consolidated report. Give it the full diff and relevant context files. It reports problems — it does not fix.
-Give each reviewer the full diff and relevant context files. They report problems — they don't fix.
+A clean report ("No concerns") is a valid and common outcome. When you get one, advance. Do not spawn another reviewer to double-check — one careful pass is the contract.
-## Refine Cycle
+## Refine Pass
 Aggregate reviewer findings. Spawn fix agents and **point them at the review report** — don't rewrite findings as line-by-line instructions. You triage (skip false positives, note architectural constraints) — they implement.
 ```bash
-sisyphus spawn --name "fix-review-issues" --agent-type sisyphus:implement \
+sisyphus agent spawn --name "fix-review-issues" --agent-type sisyphus:implement \
   "Fix the issues in reports/agent-003-final.md. Skip item #5 (false positive). Run type-check after."
 ```
 Fix agents should use `/simplify` to review their own changes before reporting.
-Re-review after fixes. Stop when reviewers return only stylistic nits. If 3+ rounds are needed, the approach — not the patches — needs rethinking.
+## One Review Pass Per Stage
+**Do not spawn a second review after fix agents land.** The review pass runs once per stage. After fixes, verify they landed by reading the fix agents' reports and checking that type-check / tests pass — not by spawning another reviewer to re-scan the same surface.
+This is a deliberate choice, not an oversight. Re-reviewing has two failure modes that compound:
+1. A fresh reviewer scanning edited code will anchor on the new code and produce fresh findings, most of which are noise — the tier structure has no "nit" category and the model feels implicit pressure to return something.
+2. When fix agents do introduce real regressions, they typically show up in validation (type-check failures, test failures, e2e failures) rather than in static review. Validation catches the real problems; re-review mostly catches phantoms.
+If the fix agent's own report flags that it hit unexpected complexity or introduced something it wasn't comfortable with, address that specifically — read the code, decide, don't spawn another reviewer. If the single review pass surfaces findings that suggest an architectural problem rather than code-level issues, backtrack to planning instead of patching:
+```bash
+sisyphus orch yield --mode planning --prompt "Review surfaced architectural issue: [summary]. Needs replan, not fixes."
+```
+Real regressions from fix agents are caught by e2e validation (next step), not by a second review pass.
 </critique-refinement>
@@ -134,10 +143,18 @@ If the project lacks validation tooling, **create it** — a smoke-test script,
 **Don't advance past a validated stage until validation passes.** If it fails, log failures, spawn fix agents, re-validate.
-When all implementation stages are complete, transition to validation mode for the comprehensive final pass:
+**Phase-scoped plans:** if the current plan only covers one phase of a multi-phase feature (the plan-lead convention when `strategy.md` has multiple phases), yield back to planning after this phase's validation passes — not to validation mode. Plan files live under `context/{plan-lead-agent-id}/`; use the paths the plan lead reported when dispatching implement agents.
 ```bash
-sisyphus yield --mode validation --prompt "All stages implemented — validate against context/e2e-recipe.md"
+sisyphus orch yield --mode planning --prompt "Phase N validated. Plan phase N+1 per strategy.md."
+```
+The next cycle's plan lead incorporates what you learned here before committing phase N+1 to paper.
+When all implementation phases are complete (the final phase has been planned, implemented, and stage-validated), transition to validation mode for the comprehensive final pass:
+```bash
+sisyphus orch yield --mode validation --prompt "All stages implemented — validate against context/e2e-recipe.md"
 ```
 Validation mode shifts the orchestrator's entire focus to proving the feature works. Stage-level validation during implementation catches issues early; the final validation pass proves the whole thing holds together.
@@ -149,7 +166,7 @@ Validation mode shifts the orchestrator's entire focus to proving the feature wo
 If the approach is wrong mid-implementation, don't keep pushing. Return to planning:
 ```bash
-sisyphus yield --mode planning --prompt "Re-evaluate: discovered X changes the approach — write cycle log"
+sisyphus orch yield --mode planning --prompt "Discovered X mid-implementation — approach needs rework. See cycle log and roadmap.md."
 ```
 Concrete triggers:
@@ -157,6 +174,18 @@ Concrete triggers:
 - An agent discovers a dependency that changes the approach
 - Fix agents keep patching the same area across cycles
-Document what you found in the cycle log before yielding. Update roadmap.md to reflect you're back in an earlier phase.
+Update roadmap.md to reflect you're back in an earlier phase. Log the discovery before yielding.
 </returning-to-planning>
+<impl-cli>
+## Implementation CLI
+```bash
+sisyphus session task "revised goal"                      # update the session goal mid-flight
+sisyphus agent restart <agentId>                         # respawn a failed/killed agent in a new pane
+sisyphus session rollback <sessionId> <cycle>            # rewind state to a prior cycle boundary
+```
+</impl-cli>
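
Reviewer note: the phase-scoped yield rule added in this file (yield to planning after every validated phase except the last, then yield to validation) can be sketched as below. This is a minimal illustration, not part of the package: `next_yield_command` is a hypothetical helper, the phase count is assumed to come from reading `strategy.md`, and the final prompt text is paraphrased.

```python
def next_yield_command(validated_phase: int, total_phases: int) -> str:
    """Return the yield command to run once a phase has passed validation.

    Hypothetical helper: the real orchestrator makes this choice from prose
    guidance, and total_phases is assumed to be counted from strategy.md.
    """
    if validated_phase < 1 or validated_phase > total_phases:
        raise ValueError("validated_phase must be between 1 and total_phases")
    if validated_phase < total_phases:
        # More phases remain: return to planning for the next phase only.
        prompt = (
            f"Phase {validated_phase} validated. "
            f"Plan phase {validated_phase + 1} per strategy.md."
        )
        return f'sisyphus orch yield --mode planning --prompt "{prompt}"'
    # The final phase is validated: move to the comprehensive validation pass.
    prompt = "All stages implemented: validate against context/e2e-recipe.md"
    return f'sisyphus orch yield --mode validation --prompt "{prompt}"'


if __name__ == "__main__":
    print(next_yield_command(1, 3))
    print(next_yield_command(3, 3))
```

A single-phase plan falls straight through to the validation branch, which matches the "one phase" case in the planning template.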

package/templates/orchestrator-planning.md CHANGED

@@ -1,13 +1,13 @@
 ---
 name: planning
-description: Deep exploration, requirements gathering, design, and detailed roadmap creation. Use after strategy is established and before implementation begins.
+description: Deep exploration, spec alignment and detailed roadmap creation. Use after discovery is complete and before implementation begins.
 ---
 # Planning Phase
 <planning-workflow>
-The natural sequence: **context → requirements → design → roadmap refinement → detailed planning.** Context documents come first because they feed everything downstream — requirements analysts, designers, planners, and implementers all benefit from not having to re-explore the codebase. After the requirements and design are aligned, revisit the roadmap — that's when you actually understand scope well enough to flesh out phases honestly.
+The natural sequence: **context → spec → roadmap refinement → detailed planning.** Context documents come first because they feed everything downstream — spec leads, planners, and implementers all benefit from not having to re-explore the codebase. After the spec is aligned, revisit the roadmap — that's when you actually understand scope well enough to flesh out phases honestly.
 </planning-workflow>
@@ -22,25 +22,49 @@ Use explore agents to build understanding before making decisions. Each agent sa
 </exploration>
-<requirements-alignment>
+<spec-alignment>
-Before investing in detailed requirements, make sure the goal is well-defined. If you're making assumptions about scope, requirements, or constraints — surface them to the user.
+<!--EFFORT:LOW-->
+**Skip spec.** Treat the user's request as the requirements. If something's ambiguous, ask the user in-band — don't spawn `sisyphus:spec` or `sisyphus:problem`. Move directly into plan delegation below.
+<!--/EFFORT-->
-For significant features, requirements refinement is iterative:
-- Draft requirements based on exploration findings
-- Have agents review for feasibility (can this actually work given the codebase?)
-- Seek user alignment on the high-level approach
-- **Fold new knowledge into authoritative documents.** When reviews, exploration, or user feedback resolve questions or change the understanding, update the requirements and design documents directly — they are the single source of truth. Delete resolved questions from their listing sections, then update the topical sections where those answers belong so the document reads as settled fact. Don't create correction files, addendum files, or decision logs alongside them. Don't annotate questions with answers — remove the questions entirely and weave the answers into the body. Plan agents should read clean, current documents — not reconcile contradictions or skip over resolved questions.
+<!--EFFORT:MEDIUM-->
+Spec is the combined product discovery + technical design stage. Spawning a spec agent hands off both to a specialist that collaborates with the user directly: exploring the codebase, asking informed questions, drafting a design, writing EARS requirements with TUI review, and deepening the design with what was learned.
-Not every stage needs standalone requirements — a well-defined stage might just be a detailed section in the implementation plan.
+**Spawn `sisyphus:spec` only when the goal has multiple valid framings or the design space is genuinely open.** Single-feature work within a known subsystem rarely needs a spec session — the implementation plan and TUI review cover the design questions. If you're unsure, ask the user in-band before spawning.
-</requirements-alignment>
+**Spec refinement is iterative.** When a spec is spawned, the process doesn't end when documents are saved:
+- Have agents review requirements for feasibility (can this actually work given the codebase?)
+- **Fold new knowledge into authoritative documents.** When reviews, exploration, or user feedback resolve questions or change the understanding, update requirements and design documents directly — they are the single source of truth. Delete resolved questions from their listing sections, then update the topical sections where those answers belong so the document reads as settled fact. Don't create correction files, addendum files, or decision logs alongside them.
+<!--/EFFORT-->
+<!--EFFORT:HIGH,XHIGH-->
+Spec is the combined product discovery + technical design stage. Spawning a spec agent hands off both to a specialist that collaborates with the user directly: exploring the codebase, asking informed questions, drafting a design, writing EARS requirements with TUI review, and deepening the design with what was learned.
+**When to spawn a spec agent:**
+- Any feature that adds or changes user-visible behavior
+- Any task where you're making assumptions about what "done" looks like
+- When exploration revealed ambiguity, trade-offs, or multiple valid interpretations
+**When you can skip spec:**
+- Pure bug fixes with clear reproduction steps
+- Mechanical refactors with no behavioral change (rename, extract, move)
+- Tasks where the user has already provided explicit, detailed acceptance criteria in their starting prompt
+If you're unsure, spawn the spec agent. The cost of a short spec conversation is low. The cost of building the wrong thing is an entire wasted implementation cycle.
+**Spec refinement is iterative.** The spec agent works with the user, but the process doesn't end when documents are saved:
+- Have agents review requirements for feasibility (can this actually work given the codebase?)
+- **Fold new knowledge into authoritative documents.** When reviews, exploration, or user feedback resolve questions or change the understanding, update requirements and design documents directly — they are the single source of truth. Delete resolved questions from their listing sections, then update the topical sections where those answers belong so the document reads as settled fact. Don't create correction files, addendum files, or decision logs alongside them.
+<!--/EFFORT-->
+</spec-alignment>
 <plan-delegation>
-Once you have context docs and aligned requirements/design, revisit the roadmap — this is the first point where you understand real scope. Roadmap refinement means updating the four canonical sections: current stage, exit criteria, active context references, and next steps. Decisions from exploration, requirements, and design fold into context documents — not the roadmap.
+Once you have context docs and aligned spec outputs (requirements + design), revisit the roadmap — this is the first point where you understand real scope. Roadmap refinement means updating the four canonical sections: current stage, exit criteria, active context references, and next steps. Decisions from exploration and spec work fold into context documents — not the roadmap.
-Spawn **one plan lead** per feature. Point it at inputs (requirements, design, context docs) — not a pre-made structure. The plan lead handles its own decomposition: it assesses scope, delegates sub-plans if needed, runs adversarial reviews, and delivers a synthesized master plan. **Delegate outcomes, not implementations** — tell the plan lead what needs planning and why, not how to structure the plan.
+Spawn **one plan lead** per feature (or per phase — see phase-scoped planning below). Point it at inputs (requirements, design, context docs) — not a pre-made structure. The plan lead handles its own decomposition: it assesses scope, delegates sub-plans if needed, runs adversarial reviews, and delivers a synthesized master plan. **Delegate outcomes, not implementations** — tell the plan lead what needs planning and why, not how to structure the plan.
 **Don't split planning yourself.** If the orchestrator pre-splits into "backend plan agent" and "frontend plan agent," the plan lead's synthesis step — resolving cross-domain conflicts, finding gaps, stress-testing edge cases — never happens.
@@ -48,16 +72,67 @@ Spawn **one plan lead** per feature. Point it at inputs (requirements, design, c
 </plan-delegation>
+<plan-review-and-test-spec>
+<!--EFFORT:LOW-->
+**Skip plan review and test-spec.** The plan agent's output is taken at face value — implementation flows directly from plan to implement. If the plan turns out to be wrong, the implement-cycle's review catches it.
+<!--/EFFORT-->
+<!--EFFORT:MEDIUM-->
+After the plan lead delivers:
+- Spawn `sisyphus:review-plan` only when the plan covers multi-domain integration. For single-domain plans, the implementation cycle's review catches issues without a dedicated review pass.
+- Spawn `sisyphus:test-spec` **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"). Silence is a "no" — do not proactively ask, do not infer from feature risk. Reviews and validation cover correctness without a test-spec.
+If neither applies, transition straight to implementation.
+<!--/EFFORT-->
+<!--EFFORT:HIGH,XHIGH-->
+After the plan lead delivers, `sisyphus:review-plan` runs alongside the planning cycle as an adversarial review of the plan against the requirements and design. Spawn it after the plan is drafted; feed findings back to the plan lead. Address review findings before transitioning to implementation.
+Spawn `sisyphus:test-spec` **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"). Silence is a "no" — do not proactively ask, do not infer from feature risk. When test-spec is justified, spawn it **in parallel with the high-level plan**, not after implementation — post-implementation test-spec silently describes what the code does rather than what it should do. Its output then feeds the implementation phase as a verification target.
+<!--/EFFORT-->
+</plan-review-and-test-spec>
+<phase-scoped-planning>
+## Plan One Phase at a Time for Multi-Phase Features
+Count the implementation phases in `strategy.md`.
+- **One phase:** spawn the plan lead with the full feature scope.
+- **More than one phase:** spawn the plan lead for the next phase only. What you learn implementing Phase N informs Phase N+1 before it's committed to paper.
+The cycle shape:
+```
+plan phase 1 → implement phase 1 → validate phase 1 → plan phase 2 → implement phase 2 → validate phase 2 → ...
+```
+After a phase's implementation passes e2e validation, yield back to planning mode for the next phase:
+```bash
+sisyphus orch yield --mode planning --prompt "Phase N validated. Plan phase N+1 per strategy.md."
+```
+When spawning the phase-scoped plan lead, name in the prompt:
+- Which phase from `strategy.md` is in scope
+- Which design document or phase-section applies
+- That later phases are out of scope
+Plans save under the plan lead's own subdirectory: `context/{plan-lead-agent-id}/plan-{topic}.md` (or `plan-phase-N-{topic}.md` when the phase identifier helps discoverability). Sub-plans share the same subdir. The plan lead reports the exact paths in its submission — use those verbatim; don't reconstruct them.
+</phase-scoped-planning>
 <progressive-development>
 Not all tasks need the same process depth.
 - **Small task** (1-3 files, single domain): Skip phases — roadmap is a short checklist (diagnose, fix, validate). Single plan agent, single implement agent.
-- **Large task** (3+ stages, multiple domains): Full phased development. The roadmap tracks phases, each producing artifacts in `context/`.
+- **Large task** (3+ stages, multiple domains): Full phased development. The roadmap tracks phases; each phase is planned, implemented, and validated before the next is planned (see phase-scoped planning above).
-Signs you need phased development: multiple unfamiliar subsystems, the task spans different concerns (backend, frontend, IPC), or the requirements have more than 3 distinct work areas.
-Implementation stages are context artifacts — saved to `context/plan-stage-N-*.md`. Detail-plan one stage at a time; what you learn implementing stage N informs stage N+1.
+Signs you need phased development: multiple unfamiliar subsystems, the task spans different concerns (backend, frontend, IPC), or the spec has more than 3 distinct work areas.
 </progressive-development>
@@ -65,18 +140,32 @@ Implementation stages are context artifacts — saved to `context/plan-stage-N-*
 Before implementation begins, determine how to concretely verify the change works end-to-end. This is the single most common failure mode: agents report success but nothing actually works.
-If you cannot determine a concrete verification method, **ask the user**. Do not proceed to implementation without a verification plan.
+If you cannot determine a concrete verification method, **ask the user via `sisyphus ask`**. Propose 2-4 candidate verification approaches as options (not an open-ended question). Do not proceed to implementation without a verification plan.
+Before authoring the deck, **read the `humanloop` skill** for option-design guidance and submission flow. Ground options in this feature's actual surface (manual UI? integration test? log inspection? metric delta?) — not generic placeholders. `sisyphus ask -h` covers CLI syntax.
 Write the recipe to `context/e2e-recipe.md` with setup steps, exact commands or interactions to verify, and what success looks like. Make it executable, not aspirational. Implementation agents and validation agents both reference this file.
 </verification-planning>
+<planning-cli>
+## Planning CLI
+```bash
+sisyphus admin requirements --export --session-id <id>  # render requirements.json → requirements.md (no LLM tokens)
+```
+The requirements export renders a `requirements.json` to markdown without consuming LLM tokens.
+</planning-cli>
 <transition>
 When you have enough understanding, a reviewed plan, and a verification recipe — transition explicitly:
 ```bash
-sisyphus yield --mode implementation --prompt "Begin implementation — see roadmap.md and context/plan-implementation.md"
+sisyphus orch yield --mode implementation --prompt "Begin implementation — see roadmap.md and the plan file path the plan lead reported (under context/{plan-lead-agent-id}/)."
 ```
 The `--mode implementation` flag loads implementation-phase guidance for the next cycle.
@@ -84,7 +173,7 @@ The `--mode implementation` flag loads implementation-phase guidance for the nex
 After implementation is complete, transition to validation mode to prove the feature works:
 ```bash
-sisyphus yield --mode validation --prompt "Implementation complete — validate against context/e2e-recipe.md"
+sisyphus orch yield --mode validation --prompt "Implementation complete — validate against context/e2e-recipe.md"
 ```
 </transition>
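
Reviewer note: the effort-gated rules in the new `<plan-review-and-test-spec>` block reduce to a small decision table. The sketch below encodes only which follow-up agents are spawned after the plan lead delivers; it does not model ordering (at HIGH/XHIGH, test-spec runs in parallel with the high-level plan). `plan_followup_agents` and its two boolean inputs are hypothetical names; in the template the orchestrator judges "multi-domain" and "tests explicitly requested" itself.

```python
def plan_followup_agents(effort: str, multi_domain: bool, tests_requested: bool) -> list[str]:
    """Return the agent types to spawn after the plan lead delivers.

    Hypothetical helper mirroring the LOW / MEDIUM / HIGH,XHIGH effort blocks.
    """
    level = effort.upper()
    if level not in {"LOW", "MEDIUM", "HIGH", "XHIGH"}:
        raise ValueError(f"unknown effort level: {effort!r}")
    if level == "LOW":
        # LOW skips both plan review and test-spec.
        return []
    agents = []
    # MEDIUM reviews only multi-domain plans; HIGH and XHIGH always review.
    if level in {"HIGH", "XHIGH"} or multi_domain:
        agents.append("sisyphus:review-plan")
    # test-spec needs an explicit request from the user at every level.
    if tests_requested:
        agents.append("sisyphus:test-spec")
    return agents


if __name__ == "__main__":
    print(plan_followup_agents("MEDIUM", multi_domain=True, tests_requested=False))
```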

package/templates/orchestrator-plugin/commands/sisyphus/scratch.md ADDED

@@ -0,0 +1,19 @@
+---
+description: Open a standalone Claude Code session outside sisyphus for ad-hoc work
+argument-hint: <prompt for the scratch session>
+---
+# Scratch Session
+**Input:** $ARGUMENTS
+The user wants to spin up a standalone Claude Code session — outside sisyphus orchestration — for something that came up during this session. This is not an agent; it's an independent session the user controls.
+Run the following in bash:
+```bash
+sisyphus admin scratch "$ARGUMENTS"
+```
+This opens a new tmux window in the home session with `claude --dangerously-skip-permissions`. Do not track it, wait for it, or reference it in the roadmap. It's fire-and-forget.
+Pass the session a reference to relevant context files and any other additional context that would be helpful for it to complete its task.

package/templates/orchestrator-plugin/commands/sisyphus/spec.md ADDED

@@ -0,0 +1,11 @@
+---
+description: Run a full spec session — interactive product/engineering conversation that produces an aligned design and EARS requirements
+argument-hint: <topic or description>
+---
+# Spec
+**Input:** $ARGUMENTS
+The user wants a full spec — design + EARS requirements — produced through a single interactive session.
+Spawn a `sisyphus:spec` agent to lead this. It is interactive and runs a three-stage flow: shape (engineer drafts a high-level design), requirements (single req-writer dispatch for the full design with TUI review), deepen (engineer refines design with what was learned).
+Output: `context/design.md` + `context/design.json` + `context/requirements.json` + `context/requirements.md`. The lead generates `requirements.md` via a pure-code script — no LLM tokens for formatting.
+The `sisyphus:spec` agent fully replaces the old `sisyphus:requirements` and `sisyphus:design` commands. Do not spawn either of those (they no longer exist). If the strategy currently lists separate requirements/design stages, collapse them into a single spec stage before spawning.

package/templates/orchestrator-plugin/commands/sisyphus/strategize.md CHANGED

@@ -1,5 +1,5 @@
 ---
-description: Redirect session strategy — reactivate if completed, then respawn in strategy mode
+description: Redirect session strategy — reactivate if completed, then respawn in discovery mode
 argument-hint: <new direction or focus>
 ---
 # Strategize
@@ -10,10 +10,10 @@ The user wants to redirect this session's strategy.
 ## Steps
-1. If the session is completed (`sisyphus status`), reactivate it with `sisyphus continue`.
-2. Annotate `strategy.md` with the pivot — what changed, new focus, which existing artifacts still apply. Don't rewrite the whole strategy.
-3. Yield to strategy mode:
+1. If the session is completed (`sisyphus status`), reactivate it with `sisyphus session continue`.
+2. Invoke the **strategy skill** to annotate `strategy.md` with the pivot — what changed, new focus, which existing artifacts still apply. Don't rewrite the whole strategy.
+3. Yield to discovery mode:
    ```bash
-   sisyphus yield --mode strategy --prompt "<concise description of the new direction>"
+   sisyphus orch yield --mode discovery --prompt "<concise description of the new direction>"
    ```
    This respawns a fresh orchestrator that will re-evaluate the goal, stages, and approach.

package/templates/orchestrator-plugin/hooks/hooks.json CHANGED

@@ -9,16 +9,6 @@
           }
         ]
       }
-    ],
-    "Stop": [
-      {
-        "hooks": [
-          {
-            "type": "command",
-            "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/idle-notify.sh"
-          }
-        ]
-      }
     ]
   }
 }

package/templates/orchestrator-plugin/skills/humanloop/SKILL.md ADDED

@@ -0,0 +1,149 @@
+---
+name: humanloop
+description: >
+  Read before calling `sisyphus ask`. Triggers when surfacing multiple questions or decisions to the user, presenting work for review/sign-off, or proposing concrete alternatives. Covers when a deck beats chat, how to design options as real forks the user can pick between, how to bundle related questions into one deck, and how to invoke synchronously so the orchestrator's process blocks until the user answers.
+---
+# Talking to the user via decks
+`sisyphus ask` posts a structured deck of questions to the user's dashboard inbox. They walk through it on their own time and you read structured JSON back. Use it instead of dumping a wall of questions into chat.
+This skill covers **what to put in a deck** and **how to invoke it**. Run `sisyphus ask -h` for the CLI shape (file path, `--session`, the `poll` and `peek` subcommands).
+## Reach for a deck when
+- You have **2+ questions** to surface in one beat (bundle them into one deck).
+- You're presenting **work for review or sign-off** (a design, a plan, a completion summary).
+- You're choosing between **concrete alternatives** the user must pick.
+- The work will sit while the user thinks. Decks survive across cycles; chat does not.
+## Skip the deck when
+- It's a single, low-stakes question whose answer barely changes downstream work — just ask in chat.
+- You can settle the question yourself by reading code or running a tool. **Default to investigating before asking.**
+- The user is actively conversing with you — converting a live exchange into a deck adds friction.
+## How to invoke
+**Run `sisyphus ask` in the foreground — let the Bash tool block.** The CLI waits internally for the user to resolve the deck (potentially 10+ minutes). Your pane stays alive in tmux for the duration; the daemon will not respawn you while a tool call is in flight. When the user answers, the bash returns stdout and you parse it inline.
+```bash
+result=$(sisyphus ask "$deck")
+choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId')
+notes=$(echo "$result"  | jq -r '.responses[0].freetext // ""')
+```
+**Do not `run_in_background` and yield** — yielding kills your pane and any backgrounded bash with it; the next cycle's fresh orchestrator can only peek the on-disk deck (`sisyphus ask peek`) and yield again, producing a polling loop. The daemon now refuses `sisyphus orch yield` while a deck owned by orchestrator is pending; the supported pattern is foreground.
+Stdout on completion is one line of JSON: `{responses: [{id, selectedOptionId?, freetext?}, ...], completedAt}`. Branch on each response by its interaction `id`.
+If you respawn mid-wait and find a pending deck on disk (e.g. after a daemon restart that orphaned the prior bash), block on it with `sisyphus ask poll <askId>` to re-attach. `sisyphus ask peek <askId>` is non-blocking and reserved for respawn-recovery diagnostics. See `sisyphus ask -h`.
+## Designing interactions
+### Each option is a concrete path forward
+The user picks an option to commit to a direction. Each option should name a real path with its tradeoffs spelled out, grounded in *this* codebase. Sign-off decks branch differently per option ("looks good", "minor fixes", "moderate fixes", "scope rework" each route the orchestrator somewhere different). Decision decks present mutually exclusive directions with named consequences.
+<example type="good">
+```
+title: "Session store backend?"
+subtitle: "Auth needs persistent sessions across restarts"
+kind: decision
+options:
+  in-memory:  "In-memory map — simplest. Loses sessions on restart; single-process only."
+  redis:      "Redis — survives restart, supports horizontal scale. New ops dependency."
+  postgres:   "Reuse existing Postgres — no new infra; ~10ms read latency vs Redis ~1ms."
+  defer:      "Ship in-memory now, migrate later if scale becomes real."
+allowFreetext: true
+freetextLabel: "Different framing — describe it"
+```
+</example>
+<example type="bad">
+```
+title: "Happy with this design?"
+options:
+  1. Yes
+  2. No, start over
+  3. Maybe, with comments
+  4. (no option, just freetext)
+```
+"Happy?" names a feeling, not a fork. Options 3 and 4 both collapse to freetext, forcing the user to invent the actual decision. Rewrite as specific decisions about specific elements of the design.
+</example>
+### Use `allowFreetext: true` as a safety valve, not the primary input
+Freetext catches "anything else?" — opinions or context the options didn't anticipate. When freetext IS the answer you want, write a chat message instead.
+<example type="bad">
+```
+title: "Approve?"
+options:
+  1. Approve
+  2. Reject
+  3. Comment
+allowFreetext: true
+```
+A freetext form wearing option clothing. Either name what "reject" actually routes to (back to design? abandon? try a different framing?), or drop the deck and ask in chat.
+</example>
+### Bound option count to 2–4
+Above four, options become too granular for the user to weigh; below two, you've collapsed into a yes/no that's faster to ask in chat.
+### Ground options in what you've already gathered
+Each option label should reference specifics from the codebase, plan, or exploration you just did — file names, framework constraints, prior decisions. When you can't fill in specifics, investigate before asking.
+### One concern per interaction
+When two questions interact, give them separate `id` / `title` / `options` inside the same deck (see Bundling below). One interaction asks one thing.
+## `kind` — display hint
+| kind | use for |
+|---|---|
+| `decision` | fork in the road; user picks a path forward |
+| `validation` | sign-off on completed work |
+| `notify` | FYI; user acknowledges |
+| `context` | surfacing background that needs a response |
+| `error` | something went wrong; user picks a recovery |
+The dashboard uses `kind` for inbox icons and sort weight. Mis-tagging trains the user to ignore the icons. Pick the closest fit.
+## Bundling
+If you'd otherwise submit two decks in the same beat, merge them. One deck with multiple `interactions` is one context switch for the user; two decks is two.
+```bash
+deck="$SISYPHUS_SESSION_DIR/context/.ask-$(date +%s).json"
+cat > "$deck" <<'EOF'
+{
+  "title": "Phase 2 sign-off + follow-on decisions",
+  "interactions": [
+    {
+      "id": "approve-phase-2",
+      "title": "Phase 2 looks good?",
+      "kind": "validation",
+      "options": [...]
+    },
+    {
+      "id": "phase-3-scope",
+      "title": "Phase 3 scope?",
+      "kind": "decision",
+      "options": [...]
+    }
+  ]
+}
+EOF
+# Then invoke `sisyphus ask "$deck"` synchronously (foreground bash) — blocks until answered.
+# Each interaction returns its own selectedOptionId / freetext in output.responses[], indexed by id.
+```
+## Submission notes
+- The deck is validated at submit (precise errors — trust them).
+- `bodyPath` lets an interaction point at a markdown file (e.g. a completion summary) instead of inlining the markdown in JSON.
+- On completion, stdout is one line of JSON: `{responses, completedAt}`. Parse `responses[]` and dispatch on each interaction's `id`.
+- See `sisyphus ask -h` for the full CLI surface.
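
Reviewer note: the skill says to parse `responses[]` and dispatch on each interaction's `id`. A minimal sketch of that step follows, for cases where the one-line stdout JSON is handled outside `jq`. The sample payload is invented for illustration: the option id `looks-good`, the freetext, and the `completedAt` timestamp format are assumptions, and only the field names come from the skill text.

```python
import json


def index_responses(stdout_line: str) -> dict:
    """Parse the one-line JSON result and key each response by interaction id."""
    payload = json.loads(stdout_line)
    return {response["id"]: response for response in payload["responses"]}


# Hypothetical output for the bundled deck shown in the skill.
sample = json.dumps({
    "responses": [
        {"id": "approve-phase-2", "selectedOptionId": "looks-good"},
        {"id": "phase-3-scope", "freetext": "Keep phase 3 small"},
    ],
    "completedAt": "2024-01-01T00:00:00Z",
})

by_id = index_responses(sample)
# selectedOptionId and freetext are both optional, so read them with .get().
approval = by_id["approve-phase-2"].get("selectedOptionId")
scope_notes = by_id["phase-3-scope"].get("freetext", "")

if __name__ == "__main__":
    print(approval, "|", scope_notes)
```

Keying by `id` rather than by list position keeps the dispatch stable if the deck's interactions are reordered.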

package/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md ADDED

@@ -0,0 +1 @@
+- `sisyphus orch yield --mode discovery`, `--mode validation`, etc. must be explicit at mode-transition cycles. Omitting silently defaults to `implementation` mode, skipping guardrails the subsequent cycle expects.

package/templates/orchestrator-plugin/skills/orchestration/SKILL.md CHANGED

@@ -22,7 +22,8 @@ How to structure sisyphus sessions for common task types. This skill helps the o
 ## Agent Types
-Available agent types are listed under **Available Agent Types** in your prompt. Use `--agent-type` with `sisyphus spawn`.
+Available agent types are listed under **Available Agent Types** in your prompt. Use `--agent-type` with `sisyphus agent spawn`.
 For task breakdown patterns per workflow type, see [task-patterns.md](task-patterns.md).
 For end-to-end workflow examples, see [workflow-examples.md](workflow-examples.md).
+For strategy.md authoring — stage patterns, process shapes, format — see [strategy.md](strategy.md).
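
Reviewer note: three command renames recur across the hunks shown here (`sisyphus spawn` to `sisyphus agent spawn`, `sisyphus yield` to `sisyphus orch yield`, `sisyphus continue` to `sisyphus session continue`). For local prompts or scripts that still use the old forms, a rewrite along these lines may help. It is a sketch that covers only the renames visible in this excerpt; the other changed files in the release may rename more commands, and `migrate` is a hypothetical helper.

```python
import re

# Old subcommand -> new grouped subcommand, as seen in the hunks in this excerpt.
RENAMES = {
    "spawn": "agent spawn",
    "yield": "orch yield",
    "continue": "session continue",
}

# Match only the old top-level form, e.g. "sisyphus yield".
# "sisyphus orch yield" does not match, so rewriting is idempotent.
_OLD_COMMAND = re.compile(r"\bsisyphus (spawn|yield|continue)\b")


def migrate(text: str) -> str:
    """Rewrite 1.1.18-style invocations to the 1.1.19 grouped form."""
    return _OLD_COMMAND.sub(lambda m: f"sisyphus {RENAMES[m.group(1)]}", text)


if __name__ == "__main__":
    print(migrate('sisyphus yield --mode planning --prompt "Re-evaluate"'))
```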