npm - sisyphi - Versions diffs - 1.2.1 → 1.2.11 - Mend

sisyphi 1.2.1 → 1.2.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (87)

package/templates/agent-plugin/skills/perspective-fanout/SKILL.md DELETED

@@ -1,115 +0,0 @@
----
-name: perspective-fanout
-description: >
-  Load when the problem-agent dialogue has produced enough substance to react to but conclusions haven't hardened — typically four or more turns in, with a framing solidifying. Provides the protocol for spawning eight perspective sub-agents in parallel, synthesizing their outputs, and presenting the synthesis back to the user via a render+deck pair. Available only at MEDIUM, HIGH, or XHIGH effort.
----
-# Perspective fanout
-Spawn the eight perspective lenses as parallel sub-agents to challenge convergence before the framing locks in. The agents operate from a shared problem statement so their outputs are directly comparable. After they return, synthesize and surface to the user — convergence, surprises, insights — as the seed for the next dialogue turn.
-## When to spawn
-- The conversation has substance to react to (typically four or more turns in)
-- A framing is starting to solidify
-- You want to challenge convergence, not rescue a stalled discussion
-- You have already formed your own take
-If the conversation is stalled, use a plateau-breaker instead — perspective fanout needs material to push against.
-## Before spawning: write the shared problem statement
-Two or three sentences, given verbatim to all eight agents:
-- What's happening (or not happening)
-- What's been considered so far (from your exploration and the user input)
-- What a good outcome looks like
-This shared framing is what makes the eight outputs comparable. Different framings produce different conversations and the synthesis collapses.
-## The eight lenses
-Spawn one sub-agent per lens, all in the background, in parallel:
-| Lens | Brief |
-|---|---|
-| First Principles | Strip away assumptions. What is the actual problem at its most fundamental level? |
-| User Empathy | Forget the code. What does the person using this actually need? |
-| Simplifier | What can be deleted, removed, or skipped? The best solution might be no solution. |
-| Systems Thinker | Zoom out. What are the second-order effects? What breaks downstream? |
-| Contrarian | Take the opposite position of whatever seems obvious. |
-| Time Traveler | Six months from now, what will we wish we had done? |
-| Adversarial | Assume the current approach is wrong. Find the flaw, the hidden assumption that breaks under stress. |
-| Precedent | Has this been solved before? In this codebase, in open source, in a different domain entirely? |
-Continue the conversation with the user while the agents run. Do not block.
-## Synthesis
-When the eight return, write to `$SISYPHUS_SESSION_DIR/context/perspective-synthesis.md` covering:
-- **Convergence** — where multiple lenses pointed the same direction (signal worth trusting)
-- **Surprises** — which perspective said something nobody else did (potential breakthroughs)
-- **Insights** — name each key finding in a memorable sentence the user can carry forward
-Then render in the side pane:
-```bash
-termrender --tmux "$SISYPHUS_SESSION_DIR/context/perspective-synthesis.md"
-```
-Bail on non-zero exit with the file path and exit code.
-## Surface to the user
-Issue the synthesis deck. No `${var}` shell assignments needed; angle-bracket placeholders are pre-substituted:
-- `<one-line convergence>` — where multiple lenses pointed the same direction
-- `<one-line surprise>` — what a single lens said that nobody else did
-```bash
-synth_deck="$SISYPHUS_SESSION_DIR/context/.ask-problem-synth-$(date +%s)-$$.json"
-cat > "$synth_deck" <<EOF
-{
-  "interactions": [{
-    "id": "problem-perspective-synth",
-    "title": "Lens synthesis",
-    "subtitle": "After 8 perspective agents",
-    "body": "## In the side pane\n\n- Synthesis rendered via termrender — scroll and react below.\n\n## What I'm hearing\n\n- <one-line convergence>\n- <one-line surprise>",
-    "kind": "decision",
-    "options": [
-      {"id": "breakthrough",    "label": "Breakthrough — this lens reframes it"},
-      {"id": "useful",          "label": "Useful but not load-bearing"},
-      {"id": "wrong-direction", "label": "Wrong direction — discard"},
-      {"id": "mixed",           "label": "Mixed — see freetext"}
-    ],
-    "allowFreetext": true,
-    "freetextLabel": "Which lens, what landed, what's still missing"
-  }]
-}
-EOF
-result=$(sis ask "$synth_deck") || { sis agent submit "Synthesis deck failed — deck: $synth_deck"; exit 1; }
-[ -n "$result" ] || { sis agent submit "Synthesis deck: empty result — deck: $synth_deck"; exit 1; }
-choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId // empty')
-notes=$(echo  "$result" | jq -r '.responses[0].freetext // ""')
-```
-## Routing after synthesis
-All four option ids return to the dialogue loop's turn-deck flow.
-- `breakthrough`, `useful`, `mixed` — carry the synthesis forward into the next turn's framing (the next turn deck body should reference what landed)
-- `wrong-direction` — discards the synthesis but does not exit the loop
-- `notes` flows into the next turn's framing regardless of `choice`
-**Increment the turn counter `N`** before issuing the next turn deck. Skipping the increment produces two consecutive `Turn N — <lens>` subtitles with the same N, breaking inbox scannability.
-## Failure handling
-- If more than four of eight agents return errors, surface partial results if any returned cleanly, otherwise bail
-- If `termrender --tmux` fails on the synthesis render, bail with file path and exit code
-- If the synthesis deck fails or returns empty, bail with the deck path
-## Body content rules
-The deck `body` field uses `##` headings, bullet lists, and bold only — no tables, no code fences, no termrender directives.
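The failure-handling rule in the deleted skill above can be sketched as a small shell function. This is one reading of that rule, not code from the package: `fanout_outcome` and its three result words are hypothetical names, and treating four or fewer errors as a normal synthesis is an assumption the skill text implies but does not state.

```bash
# Hypothetical helper, not part of sisyphi.
# Usage: fanout_outcome <error_count> [total_agents]
fanout_outcome() {
  local errors="$1" total="${2:-8}"
  if [ "$errors" -le 4 ]; then
    # Four or fewer failures: assumed to proceed with a normal synthesis.
    echo synthesize
  elif [ "$errors" -lt "$total" ]; then
    # More than four failed, but at least one lens returned cleanly.
    echo partial
  else
    # Every agent failed: nothing to surface.
    echo bail
  fi
}
```

Under this reading, five errors out of eight lands in the `partial` branch and eight out of eight lands in `bail`.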

package/templates/agent-plugin/skills/problem-document/SKILL.md DELETED

@@ -1,105 +0,0 @@
----
-name: problem-document
-description: >
-  Load when ready to draft `context/problem.md` — the thinking artifact that orients downstream agents (spec, plan, implement) to why the work exists. Provides design principles, the section vocabulary to pick from, and an anchor example showing the target style. Use this before writing the draft, not after.
----
-# Designing the problem document
-The problem document is a **thinking artifact**, not a spec. Its job is to orient downstream agents (spec, plan, implement) to *why* the work exists — what hurts, what's the non-obvious trick, what matters, what's risky — tightly enough that they can read the whole thing in under thirty seconds.
-## Design principles
-- **Scannable, not exhaustive.** A downstream agent reads this once before doing real work. It needs to walk away with the right mental model, not every detail of the conversation that produced it.
-- **Sections are a vocabulary, not a checklist.** Use the sections that earn their place for *this* problem. Skip ones that don't. Add ones that do. Different problems need different shapes.
-- **Each section answers a question a downstream agent would ask:** "What hurts? What's the trick? What are we building? Why is it tricky? What does done look like? What can't we do? What's still up in the air?" If a section doesn't answer one of those, cut it.
-- **Tables and bullets do the structural work; prose fills gaps where tables would feel forced.** A central decision shown as a 2-row table is worth ten sentences of paragraph.
-- **No alternatives section.** The forks you considered and rejected lived in the conversation — they don't need to live in the artifact. Downstream agents care about the path forward, not the paths not taken.
-- **Length follows from clarity, not from rules.** When the thinking is crisp, the document is short on its own. If a section feels like it wants more words, the answer is usually to tighten the thinking, not expand the section.
-## Section vocabulary
-Pick what earns its place; rename freely.
-- **The pain / what's wrong** — what hurts and why now
-- **Key insight** — the non-obvious understanding that reframes the problem
-- **What we're building** — the artifact(s) or change(s) the work produces
-- **Why it's tricky** — failure modes, mental traps, things that defeat the obvious approach
-- **What success looks like** — concrete outcomes, not metrics theater
-- **Constraints** — what bounds the solution (not assumptions, not anti-goals — actual bounds)
-- **Open questions** — unresolved choices the next phase needs to make
-## Anchor example
-This is the target style — terse, scannable, structured by what serves the content rather than by template:
-<example>
-# Session debugging is too expensive to do
-## The pain
-When a sisyphus session produces unexpected output, the maintainer can't
-cheaply learn from it. The choice is between re-teaching Claude the
-architecture every conversation, or doing manual archaeology across raw
-JSONL files. Both are expensive enough that the learning loop gets skipped
-entirely.
-## Key insight
-The data is already on disk — sisyphus just doesn't read it. Every agent's
-full transcript lives at `~/.claude/projects/<cwd>/<sessionId>.jsonl` with
-file touches, tokens, subagent spawns, and timing. The fix is a reader, not
-new instrumentation.
-## The two artifacts
-| What | Why it's needed |
-|---|---|
-| **Debugging toolkit** (CLI verbs) | Cheap "what happened in session X" lookups Claude can compose with grep/jq |
-| **Architecture skill** (SKILL.md) | A mental model Claude can pull when reasoning about sisyphus runtime — the novel multi-agent design defeats its priors |
-Useless apart, powerful together. The toolkit answers *what*; the skill
-answers *how to make sense of what*.
-## Why the skill matters
-Claude's failure modes when reasoning about sisyphus are predictable:
-- Treats the orchestrator as a long-running process with memory (it's
-  stateless, fork-per-cycle)
-- Conflates sisyphus-managed agents with Claude-Code-managed Task-tool
-  subagents
-- Misses that "completed" means three different things at three levels
-- Loses track of which channel agents communicate over
-These aren't undocumented — they're scattered across CLAUDE.md files framed
-as traps, not mental models. The skill is synthesis with decision heuristics,
-not new philosophy.
-## What success looks like
-- Maintainer says "investigate session X", Claude pulls the skill, runs a
-  couple of CLI queries, gives a grounded diagnosis citing real file paths
-  and JSONL evidence — no re-teaching
-- Same skill loads automatically for high-level architecture discussions,
-  not just debugging
-- Zero new instrumentation — derived from data already on disk plus a
-  one-line fix to complete an existing index
-## Constraints
-- Claude Code JSONL format isn't a stable contract — reader must degrade
-  gracefully if Anthropic changes it
-- Codex/OpenAI agents have no equivalent transcript — known blind spot,
-  not in scope
-## Open questions
-- Skill scope: one broad "sisyphus" skill (architecture + debugging) or
-  split into two?
-- Pre-fix sessions: accept they're harder to debug, or add an mtime-proximity
-  fallback in the reader?
-</example>
-Notice what this example *doesn't* have: no "Alternatives Considered," no "Assumptions" section, no "User Experience" header (folded into success), no "Anti-Goals." Each section earned its place because the content needed it. A different problem would skip "Why the skill matters" and add "Migration path" or "User flows" — whatever the content demands.
-## Bifurcation case
-If the conversation revealed that the scope contains **independent sub-problems** rather than one problem with sub-parts, do not write a unified `problem.md`. Instead, use the bifurcation-exit pattern from the agent prompt — the orchestrator handles re-entering discovery for each sub-problem.

package/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md DELETED

@@ -1,83 +0,0 @@
----
-name: problem-plateau-breakers
-description: >
-  Load when the problem-agent dialogue loop signals the conversation has stalled — repeated circling, user freetext like "different angle" / "going nowhere" / "feels stuck", or the agent senses it has been chasing the same framing for several turns without traction. Provides four breaker-deck shapes (flip, zoom-out, zoom-in, name-tension) and the routing for each. Increments the turn counter and returns control to the dialogue loop.
----
-# Plateau-breaker decks
-When the conversation circles, the user wants a *different shape of question*, not another variation of the same one. Pick the breaker whose move matches the stall pattern, issue the deck, then resume the turn loop.
-## Pick the breaker type
-| Type | Use when | Move |
-|---|---|---|
-| `flip` | The conversation keeps assuming a position is correct | Embrace the opposite — what changes if we believed the inverse? |
-| `zoom-out` | The conversation is litigating details before establishing whether they matter | Step back — does this distinction even change the outcome? |
-| `zoom-in` | The conversation is trading abstractions without testing them against a real case | Pick a concrete scenario and see if the framing survives |
-| `name-tension` | Two values are being held in tension without naming the trade-off | Surface the tension itself as the question |
-Choose one per stall. Do not chain breakers — if a breaker doesn't unstick the conversation, the next one is the *next* stall, counted toward the repeated-stuck guard.
-## Issue the deck
-Required prior assignments before the heredoc:
-- `type` — one of `flip` / `zoom-out` / `zoom-in` / `name-tension`
-Angle-bracket placeholders (substitute literally before writing the heredoc):
-- `<observation>` — what the conversation has been circling
-- `<reframe>` — provisional alternative tied to the breaker type
-```bash
-type=flip  # or zoom-out / zoom-in / name-tension
-deck="$SISYPHUS_SESSION_DIR/context/.ask-problem-plateau-${type}-$(date +%s)-$$.json"
-cat > "$deck" <<EOF
-{
-  "interactions": [{
-    "id": "problem-plateau-${type}",
-    "title": "Plateau breaker",
-    "subtitle": "Plateau breaker — ${type}",
-    "body": "## Stalled\n\n- <observation>\n\n## Reframe\n\n- <reframe>",
-    "kind": "decision",
-    "options": [
-      <options for this type — see table below>
-    ],
-    "allowFreetext": true,
-    "freetextLabel": "Or describe the angle differently"
-  }]
-}
-EOF
-result=$(sis ask "$deck") || { sis agent submit "Plateau-breaker deck failed — type: $type — deck: $deck"; exit 1; }
-[ -n "$result" ] || { sis agent submit "Plateau-breaker deck: empty result — type: $type — deck: $deck"; exit 1; }
-choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId // empty')
-notes=$(echo  "$result" | jq -r '.responses[0].freetext // ""')
-```
-## Per-breaker options
-Pre-substitute the matching row before writing the heredoc:
-| `type` | Options (id / label) |
-|---|---|
-| `flip` | `embrace-flipped` / "Embrace the flipped position" · `stick-original` / "Stick with original" · `merge-both` / "Merge both" |
-| `zoom-out` | `drop-doesnt-matter` / "Doesn't matter — drop" · `smaller-scope` / "Matters but smaller" · `matters-as-is` / "Matters as is" |
-| `zoom-in` | `scenario-breaks-it` / "This scenario breaks it" · `scenario-holds` / "Scenario holds" · `different-scenario` / "Different scenario" |
-| `name-tension` | `pick-side-A` / "Pick A" · `pick-side-B` / "Pick B" · `tension-itself` / "The tension itself is the problem" |
-## After the response
-Increment the turn counter `N` and return to the dialogue loop's turn-deck flow. The user's `choice` and `notes` flow into the next turn's framing.
-## Body content rules
-The deck `body` field uses `##` headings, bullet lists, and bold only — no tables, no code fences, no termrender directives. Violations fail `termrender --check` inside `parseDeck`.
-## Sanitize freetext on bail
-If you bail with the user's freetext in the message, sanitize it first:
-```bash
-safe_notes=$(printf '%s' "$notes" | tr -d '`$"\\')
-```
-Raw `"$notes"` in a shell-interpolated bail message is a defect.
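To make the effect of the sanitizing step in the deleted skill above concrete, here is a minimal sketch that wraps the same `tr -d` call in a function and applies it to sample freetext. The function name `sanitize_notes` and the sample string are illustrative assumptions; only the `tr` invocation comes from the skill.

```bash
# Hypothetical wrapper around the tr call shown in the skill.
sanitize_notes() {
  # Delete backticks, dollar signs, double quotes, and backslashes so the
  # text cannot trigger expansion or break quoting when interpolated later.
  printf '%s' "$1" | tr -d '`$"\\'
}

# Sample freetext containing every character class the filter removes.
notes='run `rm -rf` $HOME "now"'
safe_notes=$(sanitize_notes "$notes")
printf 'bail message: %s\n' "$safe_notes"
```

The filter deletes characters rather than escaping them, so the sanitized text loses some punctuation but stays readable in a bail message.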

package/templates/orchestrator-plugin/skills/humanloop/SKILL.md DELETED

@@ -1,150 +0,0 @@
----
-name: humanloop
-description: >
-  Read before calling `sis ask`. Triggers when surfacing multiple questions or decisions to the user, presenting work for review/sign-off, or proposing concrete alternatives. Covers when a deck beats chat, how to design options as real forks the user can pick between, how to bundle related questions into one deck, and how to invoke synchronously so the orchestrator's process blocks until the user answers.
----
-# Talking to the user via decks
-`sis ask` posts a structured deck of questions to the user's dashboard inbox. They walk through it on their own time and you read structured JSON back. Use it instead of dumping a wall of questions into chat.
-This skill covers **what to put in a deck** and **how to invoke it**. Run `sis ask -h` for the CLI shape (file path, `--session`, the `poll` and `peek` subcommands).
-## Reach for a deck when
-- You have **2+ questions** to surface in one beat (bundle them into one deck).
-- You're presenting **work for review or sign-off** (a design, a plan, a completion summary).
-- You're choosing between **concrete alternatives** the user must pick.
-- The work will sit while the user thinks. Decks survive across cycles; chat does not.
-## Skip the deck when
-- It's a single, low-stakes question whose answer barely changes downstream work — just ask in chat.
-- You can settle the question yourself by reading code or running a tool. **Default to investigating before asking.**
-- The user is actively conversing with you — converting a live exchange into a deck adds friction.
-## How to invoke
-**Run `sis ask` in the foreground — let the Bash tool block.** The CLI waits internally for the user to resolve the deck (potentially 10+ minutes). Your pane stays alive in tmux for the duration; the daemon will not respawn you while a tool call is in flight. When the user answers, the bash returns stdout and you parse it inline.
-```bash
-result=$(sis ask "$deck")
-choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId')
-notes=$(echo "$result"  | jq -r '.responses[0].freetext // ""')
-```
-**Do not `run_in_background` and yield** — yielding kills your pane and any backgrounded bash with it; the next cycle's fresh orchestrator can only peek the on-disk deck (`sis ask peek`) and yield again, producing a polling loop. The daemon now refuses `sis orch yield` while a deck owned by orchestrator is pending; the supported pattern is foreground.
-Stdout on completion is one line of JSON: `{responses: [{id, selectedOptionId?, freetext?}, ...], completedAt}`. Branch on each response by its interaction `id`.
-If you respawn mid-wait and find a pending deck on disk (e.g. after a daemon restart that orphaned the prior bash), block on it with `sis ask poll <askId>` to re-attach. `sis ask peek <askId>` is non-blocking and reserved for respawn-recovery diagnostics. See `sis ask -h`.
-## Designing interactions
-### Each option is a concrete path forward
-The user picks an option to commit to a direction. Each option should name a real path with its tradeoffs spelled out, grounded in *this* codebase. Sign-off decks branch differently per option ("looks good", "minor fixes", "moderate fixes", "scope rework" each route the orchestrator somewhere different). Decision decks present mutually exclusive directions with named consequences.
-<example type="good">
-```
-title: "Session store backend?"
-subtitle: "Auth needs persistent sessions across restarts"
-kind: decision
-options:
-  in-memory:  "In-memory map — simplest. Loses sessions on restart; single-process only."
-  redis:      "Redis — survives restart, supports horizontal scale. New ops dependency."
-  postgres:   "Reuse existing Postgres — no new infra; ~10ms read latency vs Redis ~1ms."
-  defer:      "Ship in-memory now, migrate later if scale becomes real."
-allowFreetext: true
-freetextLabel: "Different framing — describe it"
-```
-</example>
-<example type="bad">
-```
-title: "Happy with this design?"
-options:
-  1. Yes
-  2. No, start over
-  3. Maybe, with comments
-  4. (no option, just freetext)
-```
-"Happy?" names a feeling, not a fork. Options 3 and 4 both collapse to freetext, forcing the user to invent the actual decision. Rewrite as specific decisions about specific elements of the design.
-</example>
-### Use `allowFreetext: true` as a safety valve, not the primary input
-Freetext catches "anything else?" — opinions or context the options didn't anticipate. When freetext IS the answer you want, write a chat message instead.
-<example type="bad">
-```
-title: "Approve?"
-options:
-  1. Approve
-  2. Reject
-  3. Comment
-allowFreetext: true
-```
-A freetext form wearing option clothing. Either name what "reject" actually routes to (back to design? abandon? try a different framing?), or drop the deck and ask in chat.
-</example>
-### Bound option count to 2–4
-Above four, options become too granular for the user to weigh; below two, you've collapsed into a yes/no that's faster to ask in chat.
-### Ground options in what you've already gathered
-Each option label should reference specifics from the codebase, plan, or exploration you just did — file names, framework constraints, prior decisions. When you can't fill in specifics, investigate before asking.
-### One concern per interaction
-When two questions interact, give them separate `id` / `title` / `options` inside the same deck (see Bundling below). One interaction asks one thing.
-## `kind` — display hint
-| kind | use for |
-|---|---|
-| `decision` | fork in the road; user picks a path forward |
-| `validation` | sign-off on completed work |
-| `notify` | FYI; user acknowledges |
-| `context` | surfacing background that needs a response |
-| `error` | something went wrong; user picks a recovery |
-The dashboard uses `kind` for inbox icons and sort weight. Mis-tagging trains the user to ignore the icons. Pick the closest fit.
-## Bundling
-If you'd otherwise submit two decks in the same beat, merge them. One deck with multiple `interactions` is one context switch for the user; two decks is two.
-```bash
-deck="$SISYPHUS_SESSION_DIR/context/.ask-$(date +%s).json"
-cat > "$deck" <<'EOF'
-{
-  "title": "Phase 2 sign-off + follow-on decisions",
-  "interactions": [
-    {
-      "id": "approve-phase-2",
-      "title": "Phase 2 looks good?",
-      "kind": "validation",
-      "options": [...]
-    },
-    {
-      "id": "phase-3-scope",
-      "title": "Phase 3 scope?",
-      "kind": "decision",
-      "options": [...]
-    }
-  ]
-}
-EOF
-# Then invoke `sis ask "$deck"` synchronously (foreground bash) — blocks until answered.
-# Each interaction returns its own selectedOptionId / freetext in output.responses[], indexed by id.
-```
-## Submission notes
-- The deck is validated at submit (precise errors — trust them).
-- `kind` is an enum: `notify` | `validation` | `decision` | `context` | `error`. No other values accepted (see the table above for which to pick).
-- `bodyPath` points at a markdown file instead of inlining the body in JSON. The path is resolved **relative to the deck JSON's directory** and must stay inside it (no `..`, no symlinks out, no absolute paths pointing elsewhere). Practical pattern: write the deck JSON next to its body file — e.g. both inside `$SISYPHUS_SESSION_DIR/context/` — and use a basename like `"completion-summary.md"`. Mutually exclusive with `body`.
-- On completion, stdout is one line of JSON: `{responses, completedAt}`. Parse `responses[]` and dispatch on each interaction's `id`.
-- See `sis ask -h` for the full CLI surface.
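The `bodyPath` containment rule in the submission notes above can be illustrated with a pre-flight check that a script might run before submitting a deck. This is a sketch under assumptions, not the validator that `sis` uses: `body_path_ok` is a hypothetical name, and the check resolves directory symlinks only, so a body file that is itself a symlink pointing outside the directory would still pass.

```bash
# Hypothetical helper, not the validator that sis ships.
# Usage: body_path_ok <deck.json> <bodyPath>
# Succeeds only if bodyPath, resolved against the deck's directory,
# stays inside that directory.
body_path_ok() {
  local deck="$1" body="$2" deck_dir target target_dir
  # Reject any ".." path component outright.
  case "/$body/" in
    */../*) return 1 ;;
  esac
  # Physical (symlink-resolved) directory that holds the deck JSON.
  deck_dir=$(CDPATH= cd -- "$(dirname "$deck")" 2>/dev/null && pwd -P) || return 1
  # Relative paths resolve against the deck directory; absolute paths stand alone.
  case "$body" in
    /*) target="$body" ;;
    *)  target="$deck_dir/$body" ;;
  esac
  target_dir=$(CDPATH= cd -- "$(dirname "$target")" 2>/dev/null && pwd -P) || return 1
  # The resolved directory must be the deck directory or a descendant of it.
  case "$target_dir/" in
    "$deck_dir"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With the deck and its body file written side by side, a basename such as `completion-summary.md` passes, while `../outside.md` or an absolute path into another directory is rejected.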

package/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md DELETED

@@ -1 +0,0 @@
-- `sis orch yield --mode <mode>` is required on every yield. Pass the current mode to stay in it; pass a different mode to transition. There is no implicit "keep current mode" fallback — the CLI rejects yields without `--mode`.

package/templates/orchestrator-plugin/skills/orchestration/SKILL.md DELETED

@@ -1,29 +0,0 @@
----
-name: orchestration
-description: >
-  Task breakdown patterns for sisyphus orchestrator sessions. How to structure tasks, sequence agents, and manage cycles for debugging, feature builds, refactors, and other common workflows. Use when planning orchestration strategy or structuring a multi-agent session.
----
-# Orchestration Patterns
-How to structure sisyphus sessions for common task types. This skill helps the orchestrator break work into tasks, choose agent types, sequence cycles, and handle failures.
-## Core Principles
-1. **roadmap.md is the orchestrator's memory.** roadmap.md and agent reports persist across cycles — they're all you have. Keep roadmap.md current and specific enough that a fresh orchestrator can pick up where you left off.
-2. **Agents are disposable.** Each agent gets one focused instruction. If it fails or the scope changes, spawn a new one — don't try to redirect a running agent.
-3. **Parallelize when independent.** If two tasks don't share files or depend on each other's output, spawn agents for both in the same cycle.
-4. **Interleave verification.** Don't batch all implementation and defer review to the end. Embed critique and validation checkpoints between stages based on risk — the more subsequent work depends on a stage being correct, the more it needs verification before you build on it.
-5. **Reports are handoffs.** Agent reports should contain everything the next cycle's orchestrator needs — what was done, what was found, what's unresolved, where artifacts were saved.
-## Agent Types
-Available agent types are listed under **Available Agent Types** in your prompt. Use `--agent-type` with `sis agent spawn`.
-For task breakdown patterns per workflow type, see [task-patterns.md](task-patterns.md).
-For end-to-end workflow examples, see [workflow-examples.md](workflow-examples.md).
-For strategy.md authoring — stage patterns, process shapes, format — see [strategy.md](strategy.md).

package/templates/orchestrator-plugin/skills/orchestration/strategy.md DELETED

@@ -1,160 +0,0 @@
-# Strategy Reference
-Reference material for writing and updating strategy.md — the document that maps the shape of the work across stages.
-## strategy.md Format
-```markdown
-## Completed
-[Compressed summaries of finished stages — delete detail, keep outcomes]
-## Current Stage: [name]
-[Detailed process flow with exit criteria and backtrack triggers]
-## Ahead
-[Sketched future stages — one line each: name + what it covers]
-[Only as far as you can currently see — it's OK if this is vague]
-```
-**Principles:**
-- **Detail the current stage** — concrete enough that the orchestrator can execute without re-reading this skill
-- **Sketch what's ahead** — enough continuity that future updates don't lose the thread, not so much that you're committing to unknowns
-- **Every detailed stage gets exit criteria** — concrete enough to evaluate, not so rigid they become checkboxes
-- **Include user gates** — where does this stage need the user? What decision or approval?
-## Stages name kinds of work, not areas of code
-A strategy stage is a **process phase** — `discovery`, `planning`, `implementation`, `validation`, `spike`. It describes the *kind* of thinking happening that stage. It is **not** a work-area label like `auth-refactor`, `tui-panel`, `migration-script`, or `foundations`.
-Work areas are the plan agent's job. They live in `context/{plan-lead-agent-id}/plan-stage-N-*.md` and structure the implementation phase from the inside. Keep them out of `strategy.md`.
-<example>
-✓ Correct — process phases:
-```
-## Ahead
-- **implementation** — phased build per the plan outline (5 sub-stages: foundations → ask-cli → tui → orphan-handling → migration). Critique + validate per stage.
-- **validation** — run e2e recipe end-to-end, capture evidence, user gate.
-```
-✗ Wrong — work areas masquerading as stages:
-```
-## Ahead
-- **foundations** — humanloop refactor + ask-store helpers
-- **ask-cli + haiku + template** — CLI command and tool-use loop
-- **tui-integration** — inbox panel and key routing
-- **orphan-handling** — kill/complete paths
-- **migration + e2e validation** — drop old command, run recipe
-```
-The second list is a roadmap of code work. Strategy.md collapses into a task list and the process shape (when do we critique? when do we validate? what's the user gate?) disappears.
-</example>
-When you're tempted to name a stage after a code area, that signals you're sketching the plan, not the strategy. Push that detail down into the plan agent's output and keep `strategy.md` at the process-shape layer.
-## Default Pipeline Shape
-The session's effort tier dictates the default pipeline. **Use this shape unless the problem explicitly demands more or less.** The user can change tiers via `sis session effort <low|medium|high|xhigh>`.
-<!--EFFORT:LOW-->
-**Pipeline:** `plan → implement → validate`
-A single plan agent, a single implement agent, a single validate agent. No spec, problem, test-spec, or review-plan stages — the user's request is the requirement; ask in-band if anything's ambiguous. If the work is wrapper-shaped (every change backs onto an existing CLI/API/handler), move directly from discovery into implementation mode without a planning-mode cycle at all.
-<!--/EFFORT-->
-<!--EFFORT:MEDIUM-->
-**Pipeline:** `(spec, if behavior changes) → plan → implement → validate`
-Add `sisyphus:review-plan` only when the plan covers multi-domain integration. Add `sisyphus:test-spec` **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"). Silence is a "no" — do not proactively ask, do not infer from feature risk. Spawn `sisyphus:spec` and `sisyphus:problem` only when the goal has multiple valid framings or the design space is genuinely open.
-<!--/EFFORT-->
-<!--EFFORT:HIGH,XHIGH-->
-**Pipeline:** `discovery → spec → planning (with parallel review-plan) → phased implementation with critique/validate checkpoints → validation`
-`sisyphus:review-plan` runs after the plan is drafted. `sisyphus:spec` spawns whenever a feature adds user-visible behavior. `sisyphus:problem` spawns when the goal is nebulous. Append `+ test-spec` to the planning stage **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"); silence is a "no." When justified, `sisyphus:test-spec` spawns in parallel with the high-level plan at Cycle 2, not after implementation — post-implementation test-spec silently describes what the code does rather than what it should do.
-<!--/EFFORT-->
-**Re-evaluate the tier when scope shifts mid-session.** A MEDIUM feature that uncovers a new subsystem may have crossed into HIGH; a HIGH feature whose scope was narrowed may have dropped to MEDIUM. Re-run `sis session effort` and re-invoke this skill rather than continuing under the old tier's pipeline.
-## Choosing a Different Shape
-If the default doesn't match the problem, these canonical progressions are the next-best starting points — pick the closest one and prune what's already clear, rather than inventing custom shapes:
-```
-discovery → spec → planning → implementation → validation
-exploration → spike → design → implementation → validation
-investigation → recommendation → (user decides) → implementation
-analysis → phased-transformation → verification
-discovery → product-design → technical-investigation → architecture → implementation → validation
-```
-Add a new stage *type* only when the problem demands a kind of work the patterns don't cover — for example a `spike` to prove feasibility, a `compatibility-check` before a migration, or a `prototype` before committing. The test for "is this a real new stage?" is whether it names a different kind of thinking, not a different slice of code.
-## Stage Patterns
-Use these as starting points. Invent new stage types when the problem demands it. Add backtrack edges where you can foresee things going wrong.
-### discovery
-**Use when:** Goal is undefined, ambiguous, or has shifted — need to clarify what "done" looks like before any other stage runs. Also re-entered mid-session when a pivot invalidates the current goal.
-- Process: read prior context (goal.md, prior strategy if any) → if the goal is provably clear, write goal.md and run the clarity-confirmation deck → otherwise spawn `sisyphus:problem` for interactive exploration → user iterates → fold result into goal.md → set effort tier → write or revise strategy.md
-- Exit: goal.md is current and confirmed; effort tier is set; strategy.md exists for this iteration
-- Produces: goal.md, strategy.md, optionally context/problem.md or context/problem-bifurcation.md
-- Backtrack: if scope reveals multiple independent projects, issue a decomposition deck and let the user pick a lead — record the others under "Known follow-ups" in goal.md
-### exploration
-**Use when:** Need to understand the technical landscape before committing to an approach.
-- Process: spawn explore agents (each producing a focused context doc) → review findings → identify gaps → re-explore or converge
-- Exit: enough understanding to make decisions — key questions answered, relevant patterns documented
-- Produces: context documents (one per investigation angle, not one sprawling doc)
-### spike
-**Use when:** Feasibility is uncertain — need to prove an approach works before investing in full design.
-- Process: identify the riskiest assumption → build a minimal prototype that tests it → evaluate results → present findings to user if the spike changes the approach
-- Exit: feasibility confirmed or denied with evidence, decision on path forward
-- Produces: spike findings in context/, prototype code (may be throwaway)
-- Backtrack: if spike fails → re-explore alternatives
-### spec
-**Use when:** Need to define what to build and how, in a single interactive session.
-- Process: spawn `sisyphus:spec` → lead explores codebase, asks user questions, dispatches engineer for design and a single writer for requirements → user reviews via TUI → lead deepens design with findings
-- Exit: user-approved design + requirements with testable acceptance criteria
-- Produces: context/design.md + context/design.json + context/requirements.json + context/requirements.md
-- Backtrack: if problem was misframed → re-explore or re-discover
-### planning
-**Use when:** Design approved, need an executable breakdown.
-- Process: spawn plan lead with spec outputs (requirements + design) as inputs → adversarial review of plan → create e2e verification recipe
-- Exit: reviewed plan + executable e2e-recipe.md that defines how to prove the feature works
-- Produces: phased implementation plan + e2e recipe in context/
-- Backtrack: if plan reveals design infeasibility → revisit spec
-### implementation
-**Use when:** Plan exists, time to build.
-- Process: for each phase → detail-plan → spawn implement agents → single critique pass → refine → validate phase
-- Exit: all phases validated with evidence, no critical review findings remain
-- Loops: none within a phase — review runs once, fixes land, then validation. If review surfaces architectural issues, backtrack to plan; otherwise advance.
-- Backtrack: if 2+ agents hit the same unexpected complexity → revisit plan or spec; if review finds architectural issues → revisit plan
-### validation
-**Use when:** Implementation complete, need to prove it works end-to-end.
-- Process: run full e2e recipe → collect evidence (command output, screenshots, responses) → assess against success criteria → step back and check if the goal is actually met
-- Exit: all recipe steps pass with concrete evidence, original goal satisfied
-- Produces: validation report with evidence
-- Backtrack: if bugs found → implementation; if architectural issues → spec
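The backtrack rules in the stage patterns above can be read as edges in a small graph. The following is a minimal sketch of that reading, not part of the deleted skill file: `BACKTRACK_EDGES` and `backtrack_targets` are hypothetical names, and mapping "revisit plan" to the `planning` stage and "re-explore" to `exploration` is an assumption.

```python
# Hypothetical encoding of the backtrack rules listed under each stage.
# Keys are stage names; each inner dict maps a trigger to the stages
# the text says to revisit. "planning" stands in for "revisit plan".
BACKTRACK_EDGES = {
    "spike": {"spike fails": ["exploration"]},
    "spec": {"problem was misframed": ["exploration", "discovery"]},
    "planning": {"plan reveals design infeasibility": ["spec"]},
    "implementation": {
        "2+ agents hit the same unexpected complexity": ["planning", "spec"],
        "review finds architectural issues": ["planning"],
    },
    "validation": {
        "bugs found": ["implementation"],
        "architectural issues": ["spec"],
    },
    # discovery backtracks via a decomposition deck rather than to a stage,
    # and exploration lists no backtrack, so neither has an entry here.
}


def backtrack_targets(stage):
    """Return the set of stages that `stage` may backtrack to."""
    targets = set()
    for stages in BACKTRACK_EDGES.get(stage, {}).values():
        targets.update(stages)
    return targets


if __name__ == "__main__":
    for name in BACKTRACK_EDGES:
        print(name, "->", sorted(backtrack_targets(name)))
```

Keeping the trigger text next to each edge makes it easy to check the table against the stage descriptions when either changes.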
-## Mid-session shape revisions
-When the work in flight reveals the strategy itself is off, escalate up this ladder — reach for the lowest-cost move that fits.
-1. **Revise in place.** Stage detail evolved but the pipeline shape holds. Edit `strategy.md` and `roadmap.md`; continue.
-2. **`sisyphus:strategize`.** Approach is wrong but artifacts (specs, explorations, reports) still apply. Annotates the pivot into `strategy.md` and yields `--mode discovery` with a fresh orchestrator.
-3. **`sis session clone <goal>`.** The session is actually two (or more) independent projects. Forks scope into a new top-level session; update `goal.md`/`roadmap.md` here to drop what was cloned.
-4. **`sis session rollback <sessionId> <cycle>`.** A specific cycle introduced state to discard. Rewinds and pauses the session — cycles after the target are lost. Last resort; the others preserve history.
-When the user is the source of the change, update `goal.md` first — strategy revision is downstream of goal.
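The ladder above amounts to "try the cheapest move first and stop at the first one that fits". The following is a rough sketch of that selection rule, not part of the deleted skill file: the condition labels such as `shape_holds` are hypothetical stand-ins for the judgment the text describes, and only the command strings are taken from the list.

```python
from typing import NamedTuple, Optional


class Move(NamedTuple):
    action: str     # command or edit named in the ladder
    fits_when: str  # hypothetical label for the condition in the text


# Ordered from lowest to highest cost, mirroring the numbered list.
LADDER = [
    Move("edit strategy.md and roadmap.md in place", "shape_holds"),
    Move("sisyphus:strategize", "approach_wrong_artifacts_apply"),
    Move("sis session clone <goal>", "independent_projects"),
    Move("sis session rollback <sessionId> <cycle>", "cycle_state_to_discard"),
]


def lowest_cost_move(observed) -> Optional[str]:
    """Return the first (cheapest) move whose condition was observed."""
    for move in LADDER:
        if move.fits_when in observed:
            return move.action
    return None


if __name__ == "__main__":
    print(lowest_cost_move({"independent_projects", "cycle_state_to_discard"}))
```

Because the list is scanned in order, rollback is only chosen when none of the history-preserving moves fit, which matches its "last resort" status.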
-## Design Philosophy
-Frameworks to inform process shape selection — use them to *choose the right shape*, not to follow mechanically:
-- **Double Diamond** — Diverge to explore, converge on a definition; diverge on solutions, converge on implementation. Use when requirements are unclear or the problem needs defining.
-- **OODA (Observe–Orient–Decide–Act)** — Tight sensing/reacting loops. Use when the situation is fluid and the cost of wrong moves is low (debugging, spikes, incident response).
-- **Cynefin** — Match approach to domain. Clear → best practice. Complicated → analyze then execute. Complex → probe, sense, respond. Chaotic → act to stabilize.