sisyphi 1.2.1 → 1.2.11

Files changed (87)
  1. package/README.md +20 -20
  2. package/dist/cli.js +12461 -11237
  3. package/dist/cli.js.map +1 -1
  4. package/dist/daemon.js +1112 -564
  5. package/dist/daemon.js.map +1 -1
  6. package/dist/templates/agent-plugin/agents/CLAUDE.md +2 -2
  7. package/dist/templates/agent-plugin/agents/implementor.md +3 -2
  8. package/dist/templates/agent-plugin/agents/operator.md +3 -4
  9. package/dist/templates/agent-plugin/agents/plan.md +1 -1
  10. package/dist/templates/agent-plugin/agents/problem.md +20 -20
  11. package/dist/templates/agent-plugin/agents/research-lead.md +1 -1
  12. package/dist/templates/agent-plugin/agents/spec/engineer.md +9 -7
  13. package/dist/templates/agent-plugin/agents/spec/requirements-writer.md +1 -1
  14. package/dist/templates/agent-plugin/agents/spec.md +31 -25
  15. package/dist/templates/agent-plugin/hooks/CLAUDE.md +0 -1
  16. package/dist/templates/agent-plugin/hooks/ask-background-guard.sh +11 -11
  17. package/dist/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  18. package/dist/templates/agent-plugin/hooks/operator-user-prompt.sh +2 -2
  19. package/dist/templates/agent-plugin/hooks/plan-validate.sh +3 -3
  20. package/dist/templates/agent-plugin/hooks/require-submit.sh +1 -1
  21. package/dist/templates/agent-plugin/skills/operator/SKILL.md +1 -1
  22. package/dist/templates/agent-suffix.md +4 -18
  23. package/dist/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  24. package/dist/templates/dashboard-claude.md +15 -13
  25. package/dist/templates/orchestrator-base.md +44 -78
  26. package/dist/templates/orchestrator-completion.md +9 -11
  27. package/dist/templates/orchestrator-discovery.md +8 -8
  28. package/dist/templates/orchestrator-impl.md +6 -7
  29. package/dist/templates/orchestrator-planning.md +2 -2
  30. package/dist/templates/orchestrator-plugin/commands/sisyphus/scratch.md +1 -1
  31. package/dist/templates/orchestrator-plugin/commands/sisyphus/strategize.md +2 -2
  32. package/dist/templates/orchestrator-validation.md +1 -3
  33. package/dist/templates/termrender-haiku-system.md +5 -3
  34. package/dist/tui.js +1817 -1400
  35. package/dist/tui.js.map +1 -1
  36. package/native/build-notify.sh +2 -2
  37. package/package.json +3 -3
  38. package/templates/agent-plugin/agents/CLAUDE.md +2 -2
  39. package/templates/agent-plugin/agents/implementor.md +3 -2
  40. package/templates/agent-plugin/agents/operator.md +3 -4
  41. package/templates/agent-plugin/agents/plan.md +1 -1
  42. package/templates/agent-plugin/agents/problem.md +20 -20
  43. package/templates/agent-plugin/agents/research-lead.md +1 -1
  44. package/templates/agent-plugin/agents/spec/engineer.md +9 -7
  45. package/templates/agent-plugin/agents/spec/requirements-writer.md +1 -1
  46. package/templates/agent-plugin/agents/spec.md +31 -25
  47. package/templates/agent-plugin/hooks/CLAUDE.md +0 -1
  48. package/templates/agent-plugin/hooks/ask-background-guard.sh +11 -11
  49. package/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  50. package/templates/agent-plugin/hooks/operator-user-prompt.sh +2 -2
  51. package/templates/agent-plugin/hooks/plan-validate.sh +3 -3
  52. package/templates/agent-plugin/hooks/require-submit.sh +1 -1
  53. package/templates/agent-plugin/skills/operator/SKILL.md +1 -1
  54. package/templates/agent-suffix.md +4 -18
  55. package/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  56. package/templates/dashboard-claude.md +15 -13
  57. package/templates/orchestrator-base.md +44 -78
  58. package/templates/orchestrator-completion.md +9 -11
  59. package/templates/orchestrator-discovery.md +8 -8
  60. package/templates/orchestrator-impl.md +6 -7
  61. package/templates/orchestrator-planning.md +2 -2
  62. package/templates/orchestrator-plugin/commands/sisyphus/scratch.md +1 -1
  63. package/templates/orchestrator-plugin/commands/sisyphus/strategize.md +2 -2
  64. package/templates/orchestrator-validation.md +1 -3
  65. package/templates/termrender-haiku-system.md +5 -3
  66. package/dist/templates/agent-plugin/skills/humanloop/SKILL.md +0 -148
  67. package/dist/templates/agent-plugin/skills/operator-memory/SKILL.md +0 -64
  68. package/dist/templates/agent-plugin/skills/perspective-fanout/SKILL.md +0 -115
  69. package/dist/templates/agent-plugin/skills/problem-document/SKILL.md +0 -105
  70. package/dist/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +0 -83
  71. package/dist/templates/orchestrator-plugin/skills/humanloop/SKILL.md +0 -150
  72. package/dist/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +0 -1
  73. package/dist/templates/orchestrator-plugin/skills/orchestration/SKILL.md +0 -29
  74. package/dist/templates/orchestrator-plugin/skills/orchestration/strategy.md +0 -160
  75. package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +0 -266
  76. package/dist/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +0 -428
  77. package/templates/agent-plugin/skills/humanloop/SKILL.md +0 -148
  78. package/templates/agent-plugin/skills/operator-memory/SKILL.md +0 -64
  79. package/templates/agent-plugin/skills/perspective-fanout/SKILL.md +0 -115
  80. package/templates/agent-plugin/skills/problem-document/SKILL.md +0 -105
  81. package/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +0 -83
  82. package/templates/orchestrator-plugin/skills/humanloop/SKILL.md +0 -150
  83. package/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +0 -1
  84. package/templates/orchestrator-plugin/skills/orchestration/SKILL.md +0 -29
  85. package/templates/orchestrator-plugin/skills/orchestration/strategy.md +0 -160
  86. package/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +0 -266
  87. package/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +0 -428
@@ -1,115 +0,0 @@
1
- ---
2
- name: perspective-fanout
3
- description: >
4
- Load when the problem-agent dialogue has produced enough substance to react to but conclusions haven't hardened — typically four or more turns in, with a framing solidifying. Provides the protocol for spawning eight perspective sub-agents in parallel, synthesizing their outputs, and presenting the synthesis back to the user via a render+deck pair. Available only at MEDIUM, HIGH, or XHIGH effort.
5
- ---
6
-
7
- # Perspective fanout
8
-
9
- Spawn the eight perspective lenses as parallel sub-agents to challenge convergence before the framing locks in. The agents operate from a shared problem statement so their outputs are directly comparable. After they return, synthesize and surface to the user — convergence, surprises, insights — as the seed for the next dialogue turn.
10
-
11
- ## When to spawn
12
-
13
- - The conversation has substance to react to (typically four or more turns in)
14
- - A framing is starting to solidify
15
- - You want to challenge convergence, not rescue a stalled discussion
16
- - You have already formed your own take
17
-
18
- If the conversation is stalled, use a plateau-breaker instead — perspective fanout needs material to push against.
19
-
20
- ## Before spawning: write the shared problem statement
21
-
22
- Two or three sentences, given verbatim to all eight agents:
23
-
24
- - What's happening (or not happening)
25
- - What's been considered so far (from your exploration and the user input)
26
- - What a good outcome looks like
27
-
28
- This shared framing is what makes the eight outputs comparable. Different framings produce different conversations and the synthesis collapses.
29
-
30
- ## The eight lenses
31
-
32
- Spawn one sub-agent per lens, all in the background, in parallel:
33
-
34
- | Lens | Brief |
35
- |---|---|
36
- | First Principles | Strip away assumptions. What is the actual problem at its most fundamental level? |
37
- | User Empathy | Forget the code. What does the person using this actually need? |
38
- | Simplifier | What can be deleted, removed, or skipped? The best solution might be no solution. |
39
- | Systems Thinker | Zoom out. What are the second-order effects? What breaks downstream? |
40
- | Contrarian | Take the opposite position of whatever seems obvious. |
41
- | Time Traveler | Six months from now, what will we wish we had done? |
42
- | Adversarial | Assume the current approach is wrong. Find the flaw, the hidden assumption that breaks under stress. |
43
- | Precedent | Has this been solved before? In this codebase, in open source, in a different domain entirely? |
44
-
45
- Continue the conversation with the user while the agents run. Do not block.
46
-
47
- ## Synthesis
48
-
49
- When the eight return, write to `$SISYPHUS_SESSION_DIR/context/perspective-synthesis.md` covering:
50
-
51
- - **Convergence** — where multiple lenses pointed the same direction (signal worth trusting)
52
- - **Surprises** — which perspective said something nobody else did (potential breakthroughs)
53
- - **Insights** — name each key finding in a memorable sentence the user can carry forward
54
-
55
- Then render in the side pane:
56
-
57
- ```bash
58
- termrender --tmux "$SISYPHUS_SESSION_DIR/context/perspective-synthesis.md"
59
- ```
60
-
61
- Bail on non-zero exit with the file path and exit code.
62
-
63
- ## Surface to the user
64
-
65
- Issue the synthesis deck. No `${var}` shell assignments needed; angle-bracket placeholders are pre-substituted:
66
-
67
- - `<one-line convergence>` — where multiple lenses pointed the same direction
68
- - `<one-line surprise>` — what a single lens said that nobody else did
69
-
70
- ```bash
71
- synth_deck="$SISYPHUS_SESSION_DIR/context/.ask-problem-synth-$(date +%s)-$$.json"
72
- cat > "$synth_deck" <<EOF
73
- {
74
- "interactions": [{
75
- "id": "problem-perspective-synth",
76
- "title": "Lens synthesis",
77
- "subtitle": "After 8 perspective agents",
78
- "body": "## In the side pane\n\n- Synthesis rendered via termrender — scroll and react below.\n\n## What I'm hearing\n\n- <one-line convergence>\n- <one-line surprise>",
79
- "kind": "decision",
80
- "options": [
81
- {"id": "breakthrough", "label": "Breakthrough — this lens reframes it"},
82
- {"id": "useful", "label": "Useful but not load-bearing"},
83
- {"id": "wrong-direction", "label": "Wrong direction — discard"},
84
- {"id": "mixed", "label": "Mixed — see freetext"}
85
- ],
86
- "allowFreetext": true,
87
- "freetextLabel": "Which lens, what landed, what's still missing"
88
- }]
89
- }
90
- EOF
91
- result=$(sis ask "$synth_deck") || { sis agent submit "Synthesis deck failed — deck: $synth_deck"; exit 1; }
92
- [ -n "$result" ] || { sis agent submit "Synthesis deck: empty result — deck: $synth_deck"; exit 1; }
93
- choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId // empty')
94
- notes=$(echo "$result" | jq -r '.responses[0].freetext // ""')
95
- ```
96
-
97
- ## Routing after synthesis
98
-
99
- All four option ids return to the dialogue loop's turn-deck flow.
100
-
101
- - `breakthrough`, `useful`, `mixed` — carry the synthesis forward into the next turn's framing (the next turn deck body should reference what landed)
102
- - `wrong-direction` — discards the synthesis but does not exit the loop
103
- - `notes` flows into the next turn's framing regardless of `choice`
104
-
105
- **Increment the turn counter `N`** before issuing the next turn deck. Skipping the increment produces two consecutive `Turn N — <lens>` subtitles with the same N, breaking inbox scannability.
106
-
107
- ## Failure handling
108
-
109
- - If more than four of eight agents return errors, surface partial results if any returned cleanly, otherwise bail
110
- - If `termrender --tmux` fails on the synthesis render, bail with file path and exit code
111
- - If the synthesis deck fails or returns empty, bail with the deck path
112
-
113
- ## Body content rules
114
-
115
- The deck `body` field uses `##` headings, bullet lists, and bold only — no tables, no code fences, no termrender directives.
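The failure-handling threshold in the removed skill above can be sketched as a small decision helper. This is one reading of the rule, not code from the deleted file: the function name `fanout_outcome`, its arguments, and the three outcome labels are hypothetical, and it assumes that four or fewer failures means synthesis proceeds normally.

```bash
# Hypothetical helper: decide what to do after the perspective agents return.
# $1 = number of agents that returned errors, $2 = total agents spawned (default 8).
fanout_outcome() {
  failed=$1
  total=${2:-8}
  if [ "$failed" -le 4 ]; then
    echo "synthesize"   # assumption: four or fewer failures proceeds normally
  elif [ "$failed" -lt "$total" ]; then
    echo "partial"      # more than four failed, but some returned cleanly
  else
    echo "bail"         # nothing returned cleanly
  fi
}

# Illustrative calls with made-up failure counts.
fanout_outcome 3 8
fanout_outcome 6 8
fanout_outcome 8 8
```

Keeping the decision in one function makes the "more than four" boundary easy to check in isolation before wiring it to real agent results.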
@@ -1,105 +0,0 @@
1
- ---
2
- name: problem-document
3
- description: >
4
- Load when ready to draft `context/problem.md` — the thinking artifact that orients downstream agents (spec, plan, implement) to why the work exists. Provides design principles, the section vocabulary to pick from, and an anchor example showing the target style. Use this before writing the draft, not after.
5
- ---
6
-
7
- # Designing the problem document
8
-
9
- The problem document is a **thinking artifact**, not a spec. Its job is to orient downstream agents (spec, plan, implement) to *why* the work exists — what hurts, what's the non-obvious trick, what matters, what's risky — tightly enough that they can read the whole thing in under thirty seconds.
10
-
11
- ## Design principles
12
-
13
- - **Scannable, not exhaustive.** A downstream agent reads this once before doing real work. It needs to walk away with the right mental model, not every detail of the conversation that produced it.
14
- - **Sections are a vocabulary, not a checklist.** Use the sections that earn their place for *this* problem. Skip ones that don't. Add ones that do. Different problems need different shapes.
15
- - **Each section answers a question a downstream agent would ask:** "What hurts? What's the trick? What are we building? Why is it tricky? What does done look like? What can't we do? What's still up in the air?" If a section doesn't answer one of those, cut it.
16
- - **Tables and bullets do the structural work; prose fills gaps where tables would feel forced.** A central decision shown as a 2-row table is worth ten sentences of paragraph.
17
- - **No alternatives section.** The forks you considered and rejected lived in the conversation — they don't need to live in the artifact. Downstream agents care about the path forward, not the paths not taken.
18
- - **Length follows from clarity, not from rules.** When the thinking is crisp, the document is short on its own. If a section feels like it wants more words, the answer is usually to tighten the thinking, not expand the section.
19
-
20
- ## Section vocabulary
21
-
22
- Pick what earns its place; rename freely.
23
-
24
- - **The pain / what's wrong** — what hurts and why now
25
- - **Key insight** — the non-obvious understanding that reframes the problem
26
- - **What we're building** — the artifact(s) or change(s) the work produces
27
- - **Why it's tricky** — failure modes, mental traps, things that defeat the obvious approach
28
- - **What success looks like** — concrete outcomes, not metrics theater
29
- - **Constraints** — what bounds the solution (not assumptions, not anti-goals — actual bounds)
30
- - **Open questions** — unresolved choices the next phase needs to make
31
-
32
- ## Anchor example
33
-
34
- This is the target style — terse, scannable, structured by what serves the content rather than by template:
35
-
36
- <example>
37
- # Session debugging is too expensive to do
38
-
39
- ## The pain
40
- When a sisyphus session produces unexpected output, the maintainer can't
41
- cheaply learn from it. The choice is between re-teaching Claude the
42
- architecture every conversation, or doing manual archaeology across raw
43
- JSONL files. Both are expensive enough that the learning loop gets skipped
44
- entirely.
45
-
46
- ## Key insight
47
- The data is already on disk — sisyphus just doesn't read it. Every agent's
48
- full transcript lives at `~/.claude/projects/<cwd>/<sessionId>.jsonl` with
49
- file touches, tokens, subagent spawns, and timing. The fix is a reader, not
50
- new instrumentation.
51
-
52
- ## The two artifacts
53
-
54
- | What | Why it's needed |
55
- |---|---|
56
- | **Debugging toolkit** (CLI verbs) | Cheap "what happened in session X" lookups Claude can compose with grep/jq |
57
- | **Architecture skill** (SKILL.md) | A mental model Claude can pull when reasoning about sisyphus runtime — the novel multi-agent design defeats its priors |
58
-
59
- Useless apart, powerful together. The toolkit answers *what*; the skill
60
- answers *how to make sense of what*.
61
-
62
- ## Why the skill matters
63
-
64
- Claude's failure modes when reasoning about sisyphus are predictable:
65
- - Treats the orchestrator as a long-running process with memory (it's
66
- stateless, fork-per-cycle)
67
- - Conflates sisyphus-managed agents with Claude-Code-managed Task-tool
68
- subagents
69
- - Misses that "completed" means three different things at three levels
70
- - Loses track of which channel agents communicate over
71
-
72
- These aren't undocumented — they're scattered across CLAUDE.md files framed
73
- as traps, not mental models. The skill is synthesis with decision heuristics,
74
- not new philosophy.
75
-
76
- ## What success looks like
77
-
78
- - Maintainer says "investigate session X", Claude pulls the skill, runs a
79
- couple of CLI queries, gives a grounded diagnosis citing real file paths
80
- and JSONL evidence — no re-teaching
81
- - Same skill loads automatically for high-level architecture discussions,
82
- not just debugging
83
- - Zero new instrumentation — derived from data already on disk plus a
84
- one-line fix to complete an existing index
85
-
86
- ## Constraints
87
-
88
- - Claude Code JSONL format isn't a stable contract — reader must degrade
89
- gracefully if Anthropic changes it
90
- - Codex/OpenAI agents have no equivalent transcript — known blind spot,
91
- not in scope
92
-
93
- ## Open questions
94
-
95
- - Skill scope: one broad "sisyphus" skill (architecture + debugging) or
96
- split into two?
97
- - Pre-fix sessions: accept they're harder to debug, or add an mtime-proximity
98
- fallback in the reader?
99
- </example>
100
-
101
- Notice what this example *doesn't* have: no "Alternatives Considered," no "Assumptions" section, no "User Experience" header (folded into success), no "Anti-Goals." Each section earned its place because the content needed it. A different problem would skip "Why the skill matters" and add "Migration path" or "User flows" — whatever the content demands.
102
-
103
- ## Bifurcation case
104
-
105
- If the conversation revealed that the scope contains **independent sub-problems** rather than one problem with sub-parts, do not write a unified `problem.md`. Instead, use the bifurcation-exit pattern from the agent prompt — the orchestrator handles re-entering discovery for each sub-problem.
@@ -1,83 +0,0 @@
1
- ---
2
- name: problem-plateau-breakers
3
- description: >
4
- Load when the problem-agent dialogue loop signals the conversation has stalled — repeated circling, user freetext like "different angle" / "going nowhere" / "feels stuck", or the agent senses it has been chasing the same framing for several turns without traction. Provides four breaker-deck shapes (flip, zoom-out, zoom-in, name-tension) and the routing for each. Increments the turn counter and returns control to the dialogue loop.
5
- ---
6
-
7
- # Plateau-breaker decks
8
-
9
- When the conversation circles, the user wants a *different shape of question*, not another variation of the same one. Pick the breaker whose move matches the stall pattern, issue the deck, then resume the turn loop.
10
-
11
- ## Pick the breaker type
12
-
13
- | Type | Use when | Move |
14
- |---|---|---|
15
- | `flip` | The conversation keeps assuming a position is correct | Embrace the opposite — what changes if we believed the inverse? |
16
- | `zoom-out` | The conversation is litigating details before establishing whether they matter | Step back — does this distinction even change the outcome? |
17
- | `zoom-in` | The conversation is trading abstractions without testing them against a real case | Pick a concrete scenario and see if the framing survives |
18
- | `name-tension` | Two values are being held in tension without naming the trade-off | Surface the tension itself as the question |
19
-
20
- Choose one per stall. Do not chain breakers — if a breaker doesn't unstick the conversation, the next one is the *next* stall, counted toward the repeated-stuck guard.
21
-
22
- ## Issue the deck
23
-
24
- Required prior assignments before the heredoc:
25
- - `type` — one of `flip` / `zoom-out` / `zoom-in` / `name-tension`
26
-
27
- Angle-bracket placeholders (substitute literally before writing the heredoc):
28
- - `<observation>` — what the conversation has been circling
29
- - `<reframe>` — provisional alternative tied to the breaker type
30
-
31
- ```bash
32
- type=flip # or zoom-out / zoom-in / name-tension
33
- deck="$SISYPHUS_SESSION_DIR/context/.ask-problem-plateau-${type}-$(date +%s)-$$.json"
34
- cat > "$deck" <<EOF
35
- {
36
- "interactions": [{
37
- "id": "problem-plateau-${type}",
38
- "title": "Plateau breaker",
39
- "subtitle": "Plateau breaker — ${type}",
40
- "body": "## Stalled\n\n- <observation>\n\n## Reframe\n\n- <reframe>",
41
- "kind": "decision",
42
- "options": [
43
- <options for this type — see table below>
44
- ],
45
- "allowFreetext": true,
46
- "freetextLabel": "Or describe the angle differently"
47
- }]
48
- }
49
- EOF
50
- result=$(sis ask "$deck") || { sis agent submit "Plateau-breaker deck failed — type: $type — deck: $deck"; exit 1; }
51
- [ -n "$result" ] || { sis agent submit "Plateau-breaker deck: empty result — type: $type — deck: $deck"; exit 1; }
52
- choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId // empty')
53
- notes=$(echo "$result" | jq -r '.responses[0].freetext // ""')
54
- ```
55
-
56
- ## Per-breaker options
57
-
58
- Pre-substitute the matching row before writing the heredoc:
59
-
60
- | `type` | Options (id / label) |
61
- |---|---|
62
- | `flip` | `embrace-flipped` / "Embrace the flipped position" · `stick-original` / "Stick with original" · `merge-both` / "Merge both" |
63
- | `zoom-out` | `drop-doesnt-matter` / "Doesn't matter — drop" · `smaller-scope` / "Matters but smaller" · `matters-as-is` / "Matters as is" |
64
- | `zoom-in` | `scenario-breaks-it` / "This scenario breaks it" · `scenario-holds` / "Scenario holds" · `different-scenario` / "Different scenario" |
65
- | `name-tension` | `pick-side-A` / "Pick A" · `pick-side-B` / "Pick B" · `tension-itself` / "The tension itself is the problem" |
66
-
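The option ids in the table above lend themselves to a quick check on the parsed `choice` before routing. The following is a minimal sketch under stated assumptions: the helper name `valid_choice` is hypothetical and does not appear in the deleted skill, and it treats an empty `choice` (a freetext-only answer) as not matching any id.

```bash
# Hypothetical helper: succeed if $2 is a valid option id for breaker type $1.
# The ids are copied from the per-breaker options table; nothing else is assumed.
valid_choice() {
  case "$1:$2" in
    flip:embrace-flipped|flip:stick-original|flip:merge-both) return 0 ;;
    zoom-out:drop-doesnt-matter|zoom-out:smaller-scope|zoom-out:matters-as-is) return 0 ;;
    zoom-in:scenario-breaks-it|zoom-in:scenario-holds|zoom-in:different-scenario) return 0 ;;
    name-tension:pick-side-A|name-tension:pick-side-B|name-tension:tension-itself) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative values; in the skill these come from the deck and the user's answer.
type=zoom-in
choice=scenario-holds
if valid_choice "$type" "$choice"; then
  echo "routing on $choice"
else
  echo "unexpected choice for $type: $choice" >&2
fi
```

Pairing the type with the id in a single `case` pattern keeps an id from one breaker from being accepted for another.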
67
- ## After the response
68
-
69
- Increment the turn counter `N` and return to the dialogue loop's turn-deck flow. The user's `choice` and `notes` flow into the next turn's framing.
70
-
71
- ## Body content rules
72
-
73
- The deck `body` field uses `##` headings, bullet lists, and bold only — no tables, no code fences, no termrender directives. Violations fail `termrender --check` inside `parseDeck`.
74
-
75
- ## Sanitize freetext on bail
76
-
77
- If you bail with the user's freetext in the message, sanitize it first:
78
-
79
- ```bash
80
- safe_notes=$(printf '%s' "$notes" | tr -d '`$"\\')
81
- ```
82
-
83
- Raw `"$notes"` in a shell-interpolated bail message is a defect.
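The sanitize step above can be wrapped in a small helper so that every bail path uses the same filter. This is a sketch rather than part of the deleted file: the function name `sanitize_notes` and the sample input are hypothetical, and only the `tr -d` call comes from the original.

```bash
# Hypothetical helper: strip characters that are risky inside a
# shell-interpolated message (backtick, dollar sign, double quote, backslash).
sanitize_notes() {
  printf '%s' "$1" | tr -d '`$"\\'
}

# Sample freetext (illustrative only) containing all four characters.
notes='try `ls` with $HOME and "quotes" \ done'
safe_notes=$(sanitize_notes "$notes")
printf '%s\n' "$safe_notes"
```

Using `printf '%s'` instead of `echo` avoids any interpretation of backslashes or leading dashes in the user's text before `tr` sees it.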
@@ -1,150 +0,0 @@
1
- ---
2
- name: humanloop
3
- description: >
4
- Read before calling `sis ask`. Triggers when surfacing multiple questions or decisions to the user, presenting work for review/sign-off, or proposing concrete alternatives. Covers when a deck beats chat, how to design options as real forks the user can pick between, how to bundle related questions into one deck, and how to invoke synchronously so the orchestrator's process blocks until the user answers.
5
- ---
6
-
7
- # Talking to the user via decks
8
-
9
- `sis ask` posts a structured deck of questions to the user's dashboard inbox. They walk through it on their own time and you read structured JSON back. Use it instead of dumping a wall of questions into chat.
10
-
11
- This skill covers **what to put in a deck** and **how to invoke it**. Run `sis ask -h` for the CLI shape (file path, `--session`, the `poll` and `peek` subcommands).
12
-
13
- ## Reach for a deck when
14
-
15
- - You have **2+ questions** to surface in one beat (bundle them into one deck).
16
- - You're presenting **work for review or sign-off** (a design, a plan, a completion summary).
17
- - You're choosing between **concrete alternatives** the user must pick.
18
- - The work will sit while the user thinks. Decks survive across cycles; chat does not.
19
-
20
- ## Skip the deck when
21
-
22
- - It's a single, low-stakes question whose answer barely changes downstream work — just ask in chat.
23
- - You can settle the question yourself by reading code or running a tool. **Default to investigating before asking.**
24
- - The user is actively conversing with you — converting a live exchange into a deck adds friction.
25
-
26
- ## How to invoke
27
-
28
- **Run `sis ask` in the foreground — let the Bash tool block.** The CLI waits internally for the user to resolve the deck (potentially 10+ minutes). Your pane stays alive in tmux for the duration; the daemon will not respawn you while a tool call is in flight. When the user answers, the bash returns stdout and you parse it inline.
29
-
30
- ```bash
31
- result=$(sis ask "$deck")
32
- choice=$(echo "$result" | jq -r '.responses[0].selectedOptionId')
33
- notes=$(echo "$result" | jq -r '.responses[0].freetext // ""')
34
- ```
35
-
36
- **Do not `run_in_background` and yield** — yielding kills your pane and any backgrounded bash with it; the next cycle's fresh orchestrator can only peek the on-disk deck (`sis ask peek`) and yield again, producing a polling loop. The daemon now refuses `sis orch yield` while a deck owned by orchestrator is pending; the supported pattern is foreground.
37
-
38
- Stdout on completion is one line of JSON: `{responses: [{id, selectedOptionId?, freetext?}, ...], completedAt}`. Branch on each response by its interaction `id`.
39
-
40
- If you respawn mid-wait and find a pending deck on disk (e.g. after a daemon restart that orphaned the prior bash), block on it with `sis ask poll <askId>` to re-attach. `sis ask peek <askId>` is non-blocking and reserved for respawn-recovery diagnostics. See `sis ask -h`.
41
-
42
- ## Designing interactions
43
-
44
- ### Each option is a concrete path forward
45
-
46
- The user picks an option to commit to a direction. Each option should name a real path with its tradeoffs spelled out, grounded in *this* codebase. Sign-off decks branch differently per option ("looks good", "minor fixes", "moderate fixes", "scope rework" each route the orchestrator somewhere different). Decision decks present mutually exclusive directions with named consequences.
47
-
48
- <example type="good">
49
- ```
50
- title: "Session store backend?"
51
- subtitle: "Auth needs persistent sessions across restarts"
52
- kind: decision
53
- options:
54
- in-memory: "In-memory map — simplest. Loses sessions on restart; single-process only."
55
- redis: "Redis — survives restart, supports horizontal scale. New ops dependency."
56
- postgres: "Reuse existing Postgres — no new infra; ~10ms read latency vs Redis ~1ms."
57
- defer: "Ship in-memory now, migrate later if scale becomes real."
58
- allowFreetext: true
59
- freetextLabel: "Different framing — describe it"
60
- ```
61
- </example>
62
-
63
- <example type="bad">
64
- ```
65
- title: "Happy with this design?"
66
- options:
67
- 1. Yes
68
- 2. No, start over
69
- 3. Maybe, with comments
70
- 4. (no option, just freetext)
71
- ```
72
- "Happy?" names a feeling, not a fork. Options 3 and 4 both collapse to freetext, forcing the user to invent the actual decision. Rewrite as specific decisions about specific elements of the design.
73
- </example>
74
-
75
- ### Use `allowFreetext: true` as a safety valve, not the primary input
76
-
77
- Freetext catches "anything else?" — opinions or context the options didn't anticipate. When freetext IS the answer you want, write a chat message instead.
78
-
79
- <example type="bad">
80
- ```
81
- title: "Approve?"
82
- options:
83
- 1. Approve
84
- 2. Reject
85
- 3. Comment
86
- allowFreetext: true
87
- ```
88
- A freetext form wearing option clothing. Either name what "reject" actually routes to (back to design? abandon? try a different framing?), or drop the deck and ask in chat.
89
- </example>
90
-
91
- ### Bound option count to 2–4
92
-
93
- Above four, options become too granular for the user to weigh; below two, you've collapsed into a yes/no that's faster to ask in chat.
94
-
95
- ### Ground options in what you've already gathered
96
-
97
- Each option label should reference specifics from the codebase, plan, or exploration you just did — file names, framework constraints, prior decisions. When you can't fill in specifics, investigate before asking.
98
-
99
- ### One concern per interaction
100
-
101
- When two questions interact, give them separate `id` / `title` / `options` inside the same deck (see Bundling below). One interaction asks one thing.
102
-
103
- ## `kind` — display hint
104
-
105
- | kind | use for |
106
- |---|---|
107
- | `decision` | fork in the road; user picks a path forward |
108
- | `validation` | sign-off on completed work |
109
- | `notify` | FYI; user acknowledges |
110
- | `context` | surfacing background that needs a response |
111
- | `error` | something went wrong; user picks a recovery |
112
-
113
- The dashboard uses `kind` for inbox icons and sort weight. Mis-tagging trains the user to ignore the icons. Pick the closest fit.
114
-
115
- ## Bundling
116
-
117
- If you'd otherwise submit two decks in the same beat, merge them. One deck with multiple `interactions` is one context switch for the user; two decks is two.
118
-
119
- ```bash
120
- deck="$SISYPHUS_SESSION_DIR/context/.ask-$(date +%s).json"
121
- cat > "$deck" <<'EOF'
122
- {
123
- "title": "Phase 2 sign-off + follow-on decisions",
124
- "interactions": [
125
- {
126
- "id": "approve-phase-2",
127
- "title": "Phase 2 looks good?",
128
- "kind": "validation",
129
- "options": [...]
130
- },
131
- {
132
- "id": "phase-3-scope",
133
- "title": "Phase 3 scope?",
134
- "kind": "decision",
135
- "options": [...]
136
- }
137
- ]
138
- }
139
- EOF
140
- # Then invoke `sis ask "$deck"` synchronously (foreground bash) — blocks until answered.
141
- # Each interaction returns its own selectedOptionId / freetext in output.responses[], indexed by id.
142
- ```
143
-
144
- ## Submission notes
145
-
146
- - The deck is validated at submit (precise errors — trust them).
147
- - `kind` is an enum: `notify` | `validation` | `decision` | `context` | `error`. No other values accepted (see the table above for which to pick).
148
- - `bodyPath` points at a markdown file instead of inlining the body in JSON. The path is resolved **relative to the deck JSON's directory** and must stay inside it (no `..`, no symlinks out, no absolute paths pointing elsewhere). Practical pattern: write the deck JSON next to its body file — e.g. both inside `$SISYPHUS_SESSION_DIR/context/` — and use a basename like `"completion-summary.md"`. Mutually exclusive with `body`.
149
- - On completion, stdout is one line of JSON: `{responses, completedAt}`. Parse `responses[]` and dispatch on each interaction's `id`.
150
- - See `sis ask -h` for the full CLI surface.
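The `bodyPath` containment rule in the submission notes above can be approximated with a pre-flight check before the deck is written. This is a hedged sketch, not the validator the CLI uses: the function name `body_path_ok` is hypothetical, it is stricter than the stated rule in that it rejects every absolute path, and it does not detect symlinks that point outside the directory.

```bash
# Hypothetical pre-flight check for a deck's bodyPath value.
# Accepts only a relative path with no ".." component. Symlink escapes
# are not detected here and would need a separate check.
body_path_ok() {
  case "$1" in
    "")                  return 1 ;;  # empty value
    /*)                  return 1 ;;  # absolute path (stricter than the stated rule)
    ..|../*|*/..|*/../*) return 1 ;;  # parent-directory component
    *)                   return 0 ;;
  esac
}

# Illustrative candidates; only the first matches the recommended basename pattern.
for candidate in "completion-summary.md" "../secrets.md" "/etc/passwd"; do
  if body_path_ok "$candidate"; then
    echo "ok: $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```

A check like this only catches obvious mistakes early; the deck validation at submit remains the authority on what is accepted.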
@@ -1 +0,0 @@
1
- - `sis orch yield --mode <mode>` is required on every yield. Pass the current mode to stay in it; pass a different mode to transition. There is no implicit "keep current mode" fallback — the CLI rejects yields without `--mode`.
@@ -1,29 +0,0 @@
1
- ---
2
- name: orchestration
3
- description: >
4
- Task breakdown patterns for sisyphus orchestrator sessions. How to structure tasks, sequence agents, and manage cycles for debugging, feature builds, refactors, and other common workflows. Use when planning orchestration strategy or structuring a multi-agent session.
5
- ---
6
-
7
- # Orchestration Patterns
8
-
9
- How to structure sisyphus sessions for common task types. This skill helps the orchestrator break work into tasks, choose agent types, sequence cycles, and handle failures.
10
-
11
- ## Core Principles
12
-
13
- 1. **roadmap.md is the orchestrator's memory.** roadmap.md and agent reports persist across cycles — they're all you have. Keep roadmap.md current and specific enough that a fresh orchestrator can pick up where you left off.
14
-
15
- 2. **Agents are disposable.** Each agent gets one focused instruction. If it fails or the scope changes, spawn a new one — don't try to redirect a running agent.
16
-
17
- 3. **Parallelize when independent.** If two tasks don't share files or depend on each other's output, spawn agents for both in the same cycle.
18
-
19
- 4. **Interleave verification.** Don't batch all implementation and defer review to the end. Embed critique and validation checkpoints between stages based on risk — the more subsequent work depends on a stage being correct, the more it needs verification before you build on it.
20
-
21
- 5. **Reports are handoffs.** Agent reports should contain everything the next cycle's orchestrator needs — what was done, what was found, what's unresolved, where artifacts were saved.
22
-
23
- ## Agent Types
24
-
25
- Available agent types are listed under **Available Agent Types** in your prompt. Use `--agent-type` with `sis agent spawn`.
26
-
27
- For task breakdown patterns per workflow type, see [task-patterns.md](task-patterns.md).
28
- For end-to-end workflow examples, see [workflow-examples.md](workflow-examples.md).
29
- For strategy.md authoring — stage patterns, process shapes, format — see [strategy.md](strategy.md).
@@ -1,160 +0,0 @@
1
- # Strategy Reference
2
-
3
- Reference material for writing and updating strategy.md — the document that maps the shape of the work across stages.
4
-
5
- ## strategy.md Format
6
-
7
- ```markdown
8
- ## Completed
9
- [Compressed summaries of finished stages — delete detail, keep outcomes]
10
-
11
- ## Current Stage: [name]
12
- [Detailed process flow with exit criteria and backtrack triggers]
13
-
14
- ## Ahead
15
- [Sketched future stages — one line each: name + what it covers]
16
- [Only as far as you can currently see — it's OK if this is vague]
17
- ```
18
-
19
- **Principles:**
20
- - **Detail the current stage** — concrete enough that the orchestrator can execute without re-reading this skill
21
- - **Sketch what's ahead** — enough continuity that future updates don't lose the thread, not so much that you're committing to unknowns
22
- - **Every detailed stage gets exit criteria** — concrete enough to evaluate, not so rigid they become checkboxes
23
- - **Include user gates** — where does this stage need the user? What decision or approval?
24
-
25
- ## Stages name kinds of work, not areas of code
26
-
27
- A strategy stage is a **process phase**: `discovery`, `planning`, `implementation`, `validation`, `spike`. It describes the *kind* of thinking happening in that stage. It is **not** a work-area label like `auth-refactor`, `tui-panel`, `migration-script`, or `foundations`.
28
-
29
- Work areas are the plan agent's job. They live in `context/{plan-lead-agent-id}/plan-stage-N-*.md` and structure the implementation phase from the inside. Keep them out of `strategy.md`.
30
-
31
- <example>
32
- ✓ Correct — process phases:
33
- ```
34
- ## Ahead
35
- - **implementation** — phased build per the plan outline (5 sub-stages: foundations → ask-cli → tui → orphan-handling → migration). Critique + validate per stage.
36
- - **validation** — run e2e recipe end-to-end, capture evidence, user gate.
37
- ```
38
-
39
- ✗ Wrong — work areas masquerading as stages:
40
- ```
41
- ## Ahead
42
- - **foundations** — humanloop refactor + ask-store helpers
43
- - **ask-cli + haiku + template** — CLI command and tool-use loop
44
- - **tui-integration** — inbox panel and key routing
45
- - **orphan-handling** — kill/complete paths
46
- - **migration + e2e validation** — drop old command, run recipe
47
- ```
48
- The second list is a roadmap of code work: `strategy.md` collapses into a task list, and the process shape (when do we critique? when do we validate? what's the user gate?) disappears.
49
- </example>
50
-
51
- When you're tempted to name a stage after a code area, that signals you're sketching the plan, not the strategy. Push that detail down into the plan agent's output and keep `strategy.md` at the process-shape layer.
52
-
53
- ## Default Pipeline Shape
54
-
55
- The session's effort tier dictates the default pipeline. **Use this shape unless the problem explicitly demands more or less.** The user can change tiers via `sis session effort <low|medium|high|xhigh>`.
56
-
57
- <!--EFFORT:LOW-->
58
- **Pipeline:** `plan → implement → validate`
59
-
60
- A single plan agent, a single implement agent, a single validate agent. No spec, problem, test-spec, or review-plan stages — the user's request is the requirement; ask in-band if anything's ambiguous. If the work is wrapper-shaped (every change backs onto an existing CLI/API/handler), move directly from discovery into implementation mode without a planning-mode cycle at all.
61
- <!--/EFFORT-->
62
-
63
- <!--EFFORT:MEDIUM-->
64
- **Pipeline:** `(spec, if behavior changes) → plan → implement → validate`
65
-
66
- Add `sisyphus:review-plan` only when the plan covers multi-domain integration. Add `sisyphus:test-spec` **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"). Silence is a "no" — do not proactively ask, do not infer from feature risk. Spawn `sisyphus:spec` and `sisyphus:problem` only when the goal has multiple valid framings or the design space is genuinely open.
67
- <!--/EFFORT-->
68
-
69
- <!--EFFORT:HIGH,XHIGH-->
70
- **Pipeline:** `discovery → spec → planning (with parallel review-plan) → phased implementation with critique/validate checkpoints → validation`
71
-
72
- `sisyphus:review-plan` runs after the plan is drafted. `sisyphus:spec` spawns whenever a feature adds user-visible behavior. `sisyphus:problem` spawns when the goal is nebulous. Append `+ test-spec` to the planning stage **only when the user's initial prompt or goal.md explicitly requested tests** (e.g. "with tests", "TDD", "include unit tests", "test coverage"); silence is a "no." When justified, `sisyphus:test-spec` spawns in parallel with the high-level plan at Cycle 2, not after implementation — post-implementation test-spec silently describes what the code does rather than what it should do.
73
- <!--/EFFORT-->
74
-
75
- **Re-evaluate the tier when scope shifts mid-session.** A MEDIUM feature that uncovers a new subsystem may have crossed into HIGH; a HIGH feature whose scope was narrowed may have dropped to MEDIUM. Re-run `sis session effort` and re-invoke this skill rather than continuing under the old tier's pipeline.
76
-
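The tier-to-pipeline defaults described above can be sketched as a small lookup. This is an illustrative Python model only: the `default_pipeline` function and its two flags are hypothetical, the stage labels are simplified strings, and the sketch models `test-spec` only for the HIGH/XHIGH tier, where the text says where it attaches.

```python
def default_pipeline(tier, behavior_changes=False, tests_requested=False):
    """Return the default stage list for an effort tier (simplified labels)."""
    tier = tier.lower()
    if tier == "low":
        # LOW: no spec, problem, test-spec, or review-plan stages.
        return ["plan", "implement", "validate"]
    if tier == "medium":
        stages = ["plan", "implement", "validate"]
        # MEDIUM: spec is added only if behavior changes.
        return ["spec"] + stages if behavior_changes else stages
    if tier in ("high", "xhigh"):
        planning = "planning + review-plan"
        if tests_requested:
            # test-spec joins planning only when tests were explicitly requested.
            planning += " + test-spec"
        return ["discovery", "spec", planning, "implementation", "validation"]
    raise ValueError(f"unknown effort tier: {tier!r}")
```

For example, `default_pipeline("medium", behavior_changes=True)` yields the four-stage list starting with `spec`, while an unrecognized tier raises `ValueError` rather than silently falling back.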
77
- ## Choosing a Different Shape
78
-
79
- If the default doesn't match the problem, these canonical progressions are the next-best starting points — pick the closest one and prune what's already clear, rather than inventing custom shapes:
80
-
81
- ```
82
- discovery → spec → planning → implementation → validation
83
- exploration → spike → design → implementation → validation
84
- investigation → recommendation → (user decides) → implementation
85
- analysis → phased-transformation → verification
86
- discovery → product-design → technical-investigation → architecture → implementation → validation
87
- ```
88
-
89
- Add a new stage *type* only when the problem demands a kind of work the patterns don't cover — for example a `spike` to prove feasibility, a `compatibility-check` before a migration, or a `prototype` before committing. The test for "is this a real new stage?" is whether it names a different kind of thinking, not a different slice of code.
90
-
91
- ## Stage Patterns
92
-
93
- Use these as starting points. Invent new stage types when the problem demands it. Add backtrack edges where you can foresee things going wrong.
94
-
95
- ### discovery
96
- **Use when:** Goal is undefined, ambiguous, or has shifted — need to clarify what "done" looks like before any other stage runs. Also re-entered mid-session when a pivot invalidates the current goal.
97
- - Process: read prior context (goal.md, prior strategy if any) → if the goal is provably clear, write goal.md and run the clarity-confirmation deck → otherwise spawn `sisyphus:problem` for interactive exploration → user iterates → fold result into goal.md → set effort tier → write or revise strategy.md
98
- - Exit: goal.md is current and confirmed; effort tier is set; strategy.md exists for this iteration
99
- - Produces: goal.md, strategy.md, optionally context/problem.md or context/problem-bifurcation.md
100
- - Backtrack: if scope reveals multiple independent projects, issue a decomposition deck and let the user pick a lead — record the others under "Known follow-ups" in goal.md
101
-
102
- ### exploration
103
- **Use when:** Need to understand the technical landscape before committing to an approach.
104
- - Process: spawn explore agents (each producing a focused context doc) → review findings → identify gaps → re-explore or converge
105
- - Exit: enough understanding to make decisions — key questions answered, relevant patterns documented
106
- - Produces: context documents (one per investigation angle, not one sprawling doc)
107
-
108
- ### spike
109
- **Use when:** Feasibility is uncertain — need to prove an approach works before investing in full design.
110
- - Process: identify the riskiest assumption → build a minimal prototype that tests it → evaluate results → present findings to user if the spike changes the approach
111
- - Exit: feasibility confirmed or denied with evidence, decision on path forward
112
- - Produces: spike findings in context/, prototype code (may be throwaway)
113
- - Backtrack: if spike fails → re-explore alternatives
114
-
115
- ### spec
116
- **Use when:** Need to define what to build and how, in a single interactive session.
117
- - Process: spawn `sisyphus:spec` → lead explores codebase, asks user questions, dispatches engineer for design and a single writer for requirements → user reviews via TUI → lead deepens design with findings
118
- - Exit: user-approved design + requirements with testable acceptance criteria
119
- - Produces: context/design.md + context/design.json + context/requirements.json + context/requirements.md
120
- - Backtrack: if problem was misframed → re-explore or re-discover
121
-
122
- ### planning
123
- **Use when:** Design approved, need an executable breakdown.
124
- - Process: spawn plan lead with spec outputs (requirements + design) as inputs → adversarial review of plan → create e2e verification recipe
125
- - Exit: reviewed plan + executable e2e-recipe.md that defines how to prove the feature works
126
- - Produces: phased implementation plan + e2e recipe in context/
127
- - Backtrack: if plan reveals design infeasibility → revisit spec
128
-
129
- ### implementation
130
- **Use when:** Plan exists, time to build.
131
- - Process: for each phase → detail-plan → spawn implement agents → single critique pass → refine → validate phase
132
- - Exit: all phases validated with evidence, no critical review findings remain
133
- - Loops: none within a phase — review runs once, fixes land, then validation. If review surfaces architectural issues, backtrack to plan; otherwise advance.
134
- - Backtrack: if 2+ agents hit same unexpected complexity → revisit plan or spec; if review finds architectural issues → revisit plan
135
-
136
- ### validation
137
- **Use when:** Implementation complete, need to prove it works end-to-end.
138
- - Process: run full e2e recipe → collect evidence (command output, screenshots, responses) → assess against success criteria → step back and check if the goal is actually met
139
- - Exit: all recipe steps pass with concrete evidence, original goal satisfied
140
- - Produces: validation report with evidence
141
- - Backtrack: if bugs found → implementation; if architectural issues → spec
142
-
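The backtrack edges listed in the stage patterns above form a small graph, which can be written down as data. The Python sketch below is one possible encoding, not something the package ships: `BACKTRACK_EDGES` and `backtrack_targets` are hypothetical names, and mapping "re-explore" to `exploration`, "re-discover" to `discovery`, and "revisit plan" to `planning` is an interpretation of the text.

```python
# Each stage maps to (trigger, target stage) pairs taken from the Backtrack bullets.
BACKTRACK_EDGES = {
    "spike": [("spike fails", "exploration")],
    "spec": [
        ("problem was misframed", "exploration"),
        ("problem was misframed", "discovery"),
    ],
    "planning": [("plan reveals design infeasibility", "spec")],
    "implementation": [
        ("2+ agents hit same unexpected complexity", "planning"),
        ("2+ agents hit same unexpected complexity", "spec"),
        ("review finds architectural issues", "planning"),
    ],
    "validation": [
        ("bugs found", "implementation"),
        ("architectural issues", "spec"),
    ],
}


def backtrack_targets(stage):
    """Return the distinct stages a given stage can backtrack to, in listed order."""
    targets = []
    for _trigger, target in BACKTRACK_EDGES.get(stage, []):
        if target not in targets:
            targets.append(target)
    return targets
```

Stages without a Backtrack bullet that names another stage, such as `exploration`, simply return an empty list.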
143
- ## Mid-session shape revisions
144
-
145
- When the work in flight reveals the strategy itself is off, escalate up this ladder — reach for the lowest-cost move that fits.
146
-
147
- 1. **Revise in place.** Stage detail evolved but the pipeline shape holds. Edit `strategy.md` and `roadmap.md`; continue.
148
- 2. **`sisyphus:strategize`.** Approach is wrong but artifacts (specs, explorations, reports) still apply. Annotates the pivot into `strategy.md` and yields `--mode discovery` with a fresh orchestrator.
149
- 3. **`sis session clone <goal>`.** The session is actually two (or more) independent projects. Forks scope into a new top-level session; update `goal.md`/`roadmap.md` here to drop what was cloned.
150
- 4. **`sis session rollback <sessionId> <cycle>`.** A specific cycle introduced state to discard. Rewinds and pauses the session — cycles after the target are lost. Last resort; the others preserve history.
151
-
152
- When the user is the source of the change, update `goal.md` first — strategy revision is downstream of goal.
153
-
154
- ## Design Philosophy
155
-
156
- Frameworks to inform process shape selection — use them to *choose the right shape*, not to follow mechanically:
157
-
158
- - **Double Diamond** — Diverge to explore, converge on a definition; diverge on solutions, converge on implementation. Use when requirements are unclear or the problem needs defining.
159
- - **OODA (Observe–Orient–Decide–Act)** — Tight sensing/reacting loops. Use when the situation is fluid and the cost of wrong moves is low (debugging, spikes, incident response).
160
- - **Cynefin** — Match approach to domain. Clear → best practice. Complicated → analyze then execute. Complex → probe, sense, respond. Chaotic → act to stabilize.