sisyphi 1.0.7 → 1.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (29)
  1. package/dist/daemon.js +22 -14
  2. package/dist/daemon.js.map +1 -1
  3. package/dist/templates/agent-plugin/hooks/CLAUDE.md +57 -0
  4. package/dist/templates/agent-plugin/hooks/debug-user-prompt.sh +15 -0
  5. package/dist/templates/agent-plugin/hooks/hooks.json +12 -8
  6. package/dist/templates/agent-plugin/hooks/operator-user-prompt.sh +14 -0
  7. package/dist/templates/agent-plugin/hooks/plan-user-prompt.sh +1 -4
  8. package/dist/templates/agent-plugin/hooks/review-plan-user-prompt.sh +16 -0
  9. package/dist/templates/agent-plugin/hooks/review-user-prompt.sh +16 -0
  10. package/dist/templates/agent-plugin/hooks/spec-user-prompt.sh +1 -4
  11. package/dist/templates/agent-plugin/hooks/test-spec-user-prompt.sh +14 -0
  12. package/dist/templates/companion-plugin/hooks/hooks.json +6 -4
  13. package/dist/templates/orchestrator-base.md +9 -9
  14. package/dist/templates/orchestrator-impl.md +14 -8
  15. package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +26 -12
  16. package/package.json +1 -1
  17. package/templates/agent-plugin/hooks/CLAUDE.md +57 -0
  18. package/templates/agent-plugin/hooks/debug-user-prompt.sh +15 -0
  19. package/templates/agent-plugin/hooks/hooks.json +12 -8
  20. package/templates/agent-plugin/hooks/operator-user-prompt.sh +14 -0
  21. package/templates/agent-plugin/hooks/plan-user-prompt.sh +1 -4
  22. package/templates/agent-plugin/hooks/review-plan-user-prompt.sh +16 -0
  23. package/templates/agent-plugin/hooks/review-user-prompt.sh +16 -0
  24. package/templates/agent-plugin/hooks/spec-user-prompt.sh +1 -4
  25. package/templates/agent-plugin/hooks/test-spec-user-prompt.sh +14 -0
  26. package/templates/companion-plugin/hooks/hooks.json +6 -4
  27. package/templates/orchestrator-base.md +9 -9
  28. package/templates/orchestrator-impl.md +14 -8
  29. package/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +26 -12
@@ -0,0 +1,57 @@
+ # templates/agent-plugin/hooks/
+
+ Lifecycle hooks for agent plugin workflows. Enable specialized prompt generation and context handling during agent spawning.
+
+ ## hooks.json
+
+ Schema: `{ "phaseKey": { "hookName": "script-name.sh" } }`
+
+ Example:
+ ```json
+ {
+   "plan": {
+     "userPrompt": "plan-user-prompt.sh",
+     "systemPrompt": "plan-system-prompt.sh"
+   }
+ }
+ ```
+
+ - **Keys**: Phase names (e.g., `plan`, `spec`, `implement`) — must correspond to phase modes in agent spawn workflow
+ - **Values**: Object mapping hook types to shell script names
+ - **Hook types**: `userPrompt`, `systemPrompt` (extensible for future hooks)
+
+ ## Shell Scripts
+
+ Each script receives environment variables and outputs text to stdout.
+
+ ```bash
+ # Receives: $SISYPHUS_SESSION_ID, $SISYPHUS_AGENT_ID, $INSTRUCTION, $AGENT_TYPE, context files
+ # Outputs: Full user or system prompt text
+ ```
+
+ **Convention**: `{phase}-{hook-type}.sh`
+
+ **Inputs**:
+ - `$SISYPHUS_SESSION_ID` — Session UUID
+ - `$SISYPHUS_AGENT_ID` — Agent ID (e.g., `agent-001`)
+ - `$INSTRUCTION` — Task instruction from spawn command
+ - `$AGENT_TYPE` — Agent type (e.g., `plan`, `spec`, `implement`)
+ - Context files at `.sisyphus/sessions/$SISYPHUS_SESSION_ID/context/`
+
+ **Output**: Must write complete prompt text to stdout (no errors to stderr)
+
+ ## Invocation
+
+ Hooks are executed during agent spawn when:
+ 1. Agent type matches a plugin agent type (e.g., `--agent-type sisyphus:plan`)
+ 2. Phase has hooks configured in hooks.json
+ 3. Daemon renders prompts before passing to Claude
+
+ Output becomes the `--append-system-prompt` or user message content.
+
+ ## Key Patterns
+
+ - **No placeholders in shell scripts** — unlike `.md` templates, scripts perform logic and generate final text
+ - **Context access**: Scripts can read session state from `$SISYPHUS_SESSION_ID` directory
+ - **Error handling**: Exit non-zero to fail agent spawn; errors logged to daemon.log
+ - **Stdout only**: Scripts must output complete prompt to stdout; nothing to stderr
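The script contract documented above can be sketched as a minimal hook. This is a hypothetical illustration, not a script shipped in the package; the variable names come from the Inputs list, and the demo values are invented so the sketch runs standalone:

```shell
#!/bin/sh
# Sketch of the documented hook contract: build the complete prompt from
# the daemon-provided values and write it to stdout, nothing to stderr.
build_prompt() {
  # $1 = session UUID, $2 = agent ID, $3 = task instruction
  printf 'Session: %s\nAgent: %s\nTask: %s\n' "$1" "$2" "$3"
}

# A real hook would read $SISYPHUS_SESSION_ID, $SISYPHUS_AGENT_ID, and
# $INSTRUCTION; demo values stand in here.
build_prompt "demo-session" "agent-001" "Implement the login flow"
```

A non-zero exit from such a script would fail the agent spawn, per the Key Patterns above.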
@@ -0,0 +1,15 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce systematic methodology for debug agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <debug-reminder>
+ Systematic debugging — don't skip the fundamentals:
+
+ - Check git log/blame near the failure — recent changes are the highest-signal evidence
+ - For medium+ difficulty (crosses 2+ modules, unclear cause), spawn parallel subagents: data flow tracer, assumption auditor, change investigator
+ - Your report must include: exact failing line(s), concrete evidence (code snippets, data flow), confidence level (high/medium/low), and recommended fix
+
+ Investigate only — no code changes except reproduction tests.
+ </debug-reminder>
+ HINT
@@ -3,18 +3,22 @@
    "PreToolUse": [
      {
        "matcher": "SendMessage",
-       "hook": {
-         "type": "command",
-         "command": "bash hooks/intercept-send-message.sh"
-       }
+       "hooks": [
+         {
+           "type": "command",
+           "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/intercept-send-message.sh"
+         }
+       ]
      }
    ],
    "Stop": [
      {
-       "hook": {
-         "type": "command",
-         "command": "bash hooks/require-submit.sh"
-       }
+       "hooks": [
+         {
+           "type": "command",
+           "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/require-submit.sh"
+         }
+       ]
      }
    ]
  }
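For reference, the shape this hunk converges on is the plural `hooks` array keyed by event, with commands anchored at `${CLAUDE_PLUGIN_ROOT}`. The following is a reconstruction from the hunk above, assuming the same top-level `hooks` wrapper visible in the companion-plugin file; only the two events shown in the diff are included:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "SendMessage",
        "hooks": [
          { "type": "command", "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/intercept-send-message.sh" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/require-submit.sh" }
        ]
      }
    ]
  }
}
```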
@@ -0,0 +1,14 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce paranoid testing for operator agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <operator-reminder>
+ Click EVERYTHING — assume something is broken and prove it:
+
+ - Every link, button, nav item, dropdown, toggle, accordion, interactive element on the page
+ - Edge cases: empty forms, duplicate submissions, back-button mid-flow, double-clicks, rapid navigation, browser refresh mid-action
+ - Check ALL sources: DOM, console errors, network failures, logs — not just what's visually obvious
+ - Spawn subagents to parallelize when scope is broad (one per page/flow/feature area) — the cost of missing a broken button is higher than an extra agent
+ </operator-reminder>
+ HINT
@@ -2,10 +2,7 @@
  # UserPromptSubmit hook: remind plan agent to delegate for large tasks.
  if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi

- python3 -c "
- import json, sys
- print(json.dumps({'additionalContext': sys.stdin.read()}))
- " <<'HINT'
+ cat <<'HINT'
  <planning-reminder>
  For particularly large or multi-domain tasks, delegate sub-plans to specialist agents rather than planning everything solo:

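The deleted `python3` one-liner in this hunk serialized the heredoc into a JSON envelope before emitting it; the 1.0.9 script just `cat`s the hint, whose plain stdout a UserPromptSubmit hook also injects as context. A standalone equivalent of what the removed wrapper did:

```python
import json

def wrap_additional_context(text: str) -> str:
    # Equivalent of the removed python3 one-liner: wrap the raw hint text
    # in the {"additionalContext": ...} JSON envelope the hook used to emit.
    return json.dumps({"additionalContext": text})

# The old script piped the <<'HINT' heredoc through this wrapper.
envelope = wrap_additional_context("<planning-reminder>delegate sub-plans</planning-reminder>")
```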
@@ -0,0 +1,16 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce cross-plan interface focus for plan review agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <review-plan-reminder>
+ The primary source of bugs is the interfaces between plans:
+
+ - Confirm critical/high findings by cross-referencing spec and code yourself — don't rubber-stamp subagent opinions
+ - Flag file ownership conflicts: any file touched by 2+ plans or agents needs explicit coordination
+ - Read actual source files for pattern consistency — don't review the plan in isolation
+ - Type definitions must have exactly one owner; flag divergent names/shapes for the same concept
+
+ You are read-only. Synthesize and report — never edit plan or code files yourself.
+ </review-plan-reminder>
+ HINT
@@ -0,0 +1,16 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce validation discipline for review agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <review-reminder>
+ Only report confirmed findings — spawn validation subagents (~1 per 3 issues) before finalizing:
+
+ - Bugs/Security: opus validates exploitable/broken
+ - Everything else: sonnet confirms significant (not nitpick)
+ - Drop anything subjective, pre-existing, or linter-catchable
+ - Every finding needs `file:line` + concrete evidence — no "this could be a problem"
+
+ You are read-only. Investigate and direct fixes through implementers — never edit code yourself.
+ </review-reminder>
+ HINT
@@ -2,10 +2,7 @@
  # UserPromptSubmit hook: remind spec agent to iterate with the user.
  if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi

- python3 -c "
- import json, sys
- print(json.dumps({'additionalContext': sys.stdin.read()}))
- " <<'HINT'
+ cat <<'HINT'
  <spec-reminder>
  Iterate with the user — include them in the process before writing anything to disk:

@@ -0,0 +1,14 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce behavioral invariants for test-spec agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <test-spec-reminder>
+ Behavioral properties, not test code:
+
+ - State behaviors as invariants: "Users can log in with email/password" — not "loginHandler calls bcrypt.compare"
+ - Each property must be independently verifiable
+ - Include negative properties — what must NOT happen is as important as what must
+ - If the change is purely mechanical with nothing to verify, submit { "testsNeeded": false }
+ </test-spec-reminder>
+ HINT
@@ -2,10 +2,12 @@
    "hooks": {
      "UserPromptSubmit": [
        {
-         "hook": {
-           "type": "command",
-           "command": "bash hooks/user-prompt-context.sh"
-         }
+         "hooks": [
+           {
+             "type": "command",
+             "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/user-prompt-context.sh"
+           }
+         ]
        }
      ]
    }
@@ -160,9 +160,9 @@ This means the roadmap evolves. Outlined phases get refined (or reworked) as you

  This applies at every level of the hierarchy. Don't produce a detailed implementation plan before you've researched and specified — detailed plans based on assumptions will change. Defer detail until you're about to execute.

- ### Validate before advancing
+ ### Validate before unverified work compounds

- Each completed phase or stage gets verified before the next one starts. Don't build on unverified work. Validation means a separate agent (not the one that did the work) confirms the change actually works — running tests, exercising behavior, reviewing code.
+ Don't let unverified work accumulate unchecked. The more stages you implement without any critique or validation, the harder it becomes to identify where things went wrong. Interleave verification cycles between implementation stages — how often depends on risk. High-risk stages (core logic, integration points) should be verified before you build on them. Low-risk stages (types, config) can be batched into a broader validation later. The failure mode to avoid is implementing everything and only validating at the end — by then, bugs are buried under layers of dependent code and the feedback is useless.

  ### Every change deserves rigor

@@ -174,15 +174,15 @@ For multi-file changes or design decisions, invest fully in the earlier phases:

  The system gives you unlimited cycles for a reason: so you never have to cut corners. Failed implementations, deferred issues, and skipped reviews are far more expensive than extra cycles. Use cycles to be thorough, not to be fast.

- **Each feature is multiple cycles, not one.** A typical feature like "auth system" is not a single implementation cycle. It's a sequence:
+ **Each feature is multiple cycles, not one.** You have three tools for ensuring quality, and your job is to apply them with judgment:

- 1. **Implement** — one or more cycles of agents writing code (sometimes the implementation itself needs multiple cycles if it's complex enough)
- 2. **Critique** — spawn review agents to find flaws, code smells, overengineering, missed edge cases. They report problems, not fixes.
- 3. **Refine** — spawn agents to fix what the reviewers found, simplify, refactor. Agents can use `/simplify` to systematically look for reuse, quality, and efficiency issues.
- 4. **Repeat 2-3** until reviewers come back clean — no feedback means you're done, not "good enough." Every issue found gets addressed. Nothing is deferred.
- 5. **Validate** — e2e verification by a separate agent that the feature actually works end-to-end
+ - **Critique** — spawn review agents to find flaws, code smells, overengineering, missed edge cases. They report problems, not fixes.
+ - **Refine** — spawn agents to fix what the reviewers found, simplify, refactor. Agents can use `/simplify` to systematically look for reuse, quality, and efficiency issues.
+ - **Validate** — e2e verification by a separate agent that the feature actually works end-to-end.

- This implement → critique → refine loop is how quality happens. Skipping it produces code that passes tests but is brittle, overengineered, or subtly wrong. Budget for it in your roadmap. Never compress it.
+ Not every stage needs every tool. A types-only stage might need none — the consumers will surface type errors. A core logic stage needs critique at minimum. An integration stage needs critique and validation. The judgment call is yours, based on risk: how much subsequent work depends on this stage being correct? How costly would a bug here be to find later?
+
+ What you must avoid is the **batch-everything-then-review-at-the-end** pattern. If you implement five stages before any critique or validation, you've turned a series of small, localizable problems into one massive, entangled debugging session. Interleave verification between implementation stages — not necessarily after every one, but often enough that you're catching problems close to where they were introduced.

  A phase like "Implement auth system" is realistically 4-6 cycles. A phase like "Frontend shell" is 8+. Be honest about scope — underestimating just means you'll lose track of where you are.

@@ -6,15 +6,21 @@

  Before starting each cycle, ask: **which stages or tasks are independent right now?** If two stages touch different subsystems (e.g., backend vs frontend, separate services, unrelated modules), spawn them concurrently — don't serialize work that doesn't need to be serialized. Use `--worktree` when parallel agents might touch overlapping files.

- Sequential execution is the default trap. Fight it actively. At every yield, look for work that can run alongside the next stage — review agents while the next implementation starts, frontend and backend stages in parallel, independent fix agents concurrently. A cycle with one agent running is a wasted cycle if other work was ready.
+ Maximize parallelism **within your development cycle, not by skipping parts of it.** Running a review alongside the next stage's implementation is good parallelism. Skipping review because the next stage is ready is not — that's cutting corners faster, not working faster. A cycle with one agent running is a wasted cycle if other work was ready, but "other work" includes critique and validation agents, not just the next implementation stage.

- If the plan has stages that share no file dependencies, **run them in parallel from the start.** Each stage is multiple cycles:
+ If the plan has stages that share no file dependencies, **run them in parallel from the start.** The development cycle for each stage involves some combination of:

  1. **Detail-plan it** — expand the high-level outline into specific file changes, informed by previous stages. If complex enough, spawn a spec agent first.
  2. **Implement it** — spawn agents with self-contained instructions (see Agent Instructions below). May itself take multiple cycles if the stage has enough work.
- 3. **Critique and refine it** — spawn parallel review agents, fix what they find, repeat until clean (see below).
- 4. **Validate it end-to-end** — spawn a validation agent with the e2e recipe. Don't advance until it passes.
- 5. **Update roadmap.md** — mark the stage done in the implementation phase, refine future stage outlines if what you learned changes the approach.
+ 3. **Critique and refine it** — spawn review agents, fix what they find (see Critique and Refinement below).
+ 4. **Validate it** — spawn a validation agent to verify the stage actually works (see E2E Validation below).
+
+ Not every stage needs every step. Use your judgment about what level of rigor each stage deserves:
+ - A types/interfaces stage might just need implementation — the next stage that consumes the types will surface any problems.
+ - A core business logic stage needs implementation + critique at minimum — subtle bugs here cascade everywhere.
+ - An integration stage or anything touching critical paths needs the full loop including validation — you're building on accumulated assumptions and need to verify they hold.
+
+ The key question each cycle: **what's the riskiest unverified work right now?** If you just finished a foundation stage and are about to build on it, validate the foundation. If you just implemented a low-risk config change, move on and batch it into a broader review later. When multiple stages have completed without any critique or validation, you've lost the feedback loop — stop implementing and catch up on verification before problems compound.

  Don't detail-plan all stages up front. What you learn implementing earlier stages should inform later ones.

@@ -52,11 +58,11 @@ When you see these reports, investigate before pushing forward. If the smell sug

  ## Critique and Refinement

- After implementation agents report, **do not advance to the next stage.** The code needs to be reviewed and refined first. This is not optional.
+ After implementation agents report, assess whether the stage needs critique before advancing. For stages that touch core logic, integration points, or critical paths — review before building on top. For low-risk stages (types, config, boilerplate), you can defer review and batch it with a later critique cycle. The failure mode is not "sometimes skipping review" — it's implementing six stages in a row without any review at all.

  ### Critique cycle

- Spawn three review agents in parallel, each attacking a different dimension:
+ When a stage warrants critique, spawn review agents in parallel, each attacking a different dimension:

  1. **Code reuse reviewer** — searches the codebase for existing utilities, helpers, and patterns that the new code duplicates. Flags any new function that reimplements existing functionality, any inline logic that could use an existing utility.

@@ -83,7 +89,7 @@ Spawn reviewers again on the refined code. If they come back with new issues, fi

  ## E2E Validation

- After the critique/refine loop produces clean code, **validate end-to-end before advancing.** This is also not optional. The implementing agent is the worst validator of its own work — same blind spots, same assumptions.
+ E2E validation confirms the implementation actually works — not just that it compiles or passes unit tests, but that the feature behaves correctly when exercised. Reserve full e2e validation for stages where you're about to build on accumulated work (integration stages, milestones where multiple stages come together) or where failure would be expensive to debug later. Not every stage needs its own e2e pass — but don't let more than 2-3 stages accumulate without one.

  Spawn a validation agent with the e2e recipe from `context/e2e-recipe.md`. The agent should:
  - Follow the setup steps exactly (build, start servers, seed data)
@@ -78,28 +78,33 @@ Feature with moderate complexity. Requirements may need clarification. Multiple
  ### Implementation
  - [ ] Phase 1 — [foundation/types/interfaces]
  - [ ] Phase 2 — [core logic]
+ - [ ] Critique phases 1-2
  - [ ] Phase 3 — [integration/wiring]
-
- ### Validation
- - [ ] Validate full implementation
+ - [ ] Validate — smoketest full feature e2e
  - [ ] Review implementation
  ```

+ Note: critique and validation are embedded between implementation phases, not deferred to the end. Phase 1 (types) is low-risk and doesn't need its own review, but critique catches issues before Phase 3 builds on them. Validation happens after integration, when all the pieces come together.
+
  ### Cycle plan
  - **Cycle 1**: Spawn `sisyphus:spec-draft` for spec. Yield. (Human iterates on spec between cycles.)
  - **Cycle 2**: Spawn `sisyphus:plan` for plan. Yield.
  - **Cycle 3**: Spawn `sisyphus:review-plan` for review. If fail, respawn plan with issues. Yield.
  - **Cycle 4**: Spawn `sisyphus:implement` for Phase 1. Yield.
- - **Cycle 5**: Spawn `sisyphus:implement` for Phase 2 + `sisyphus:validate` for Phase 1 (parallel if independent). Yield.
- - **Cycle 6-8**: Continue phases, validate, review.
+ - **Cycle 5**: Spawn `sisyphus:implement` for Phase 2. Phase 1 is types — low risk, doesn't need its own validation. Yield.
+ - **Cycle 6**: Spawn `sisyphus:review` for critique of phases 1-2. This is the checkpoint before integration builds on top. Yield.
+ - **Cycle 7**: Address critique findings + spawn `sisyphus:implement` for Phase 3. Yield.
+ - **Cycle 8**: Spawn `sisyphus:validate` for e2e smoketest. Yield.
+ - **Cycle 9**: Address validation failures or complete.

  ### Failure modes
  - **Spec needs human input**: Mark session as needing human review. Orchestrator notes open questions.
  - **Plan fails review**: Feed review issues back, respawn planner.
- - **Phase fails validation**: Feed specifics back to implement agent for that phase only.
+ - **Critique finds issues in foundation**: Fix before starting integration — don't build on shaky ground.
+ - **Validation fails**: Feed specifics back to implement agent for the failing area.

  ### Parallelization
- Phases without dependencies can run in parallel. Types/interfaces (Phase 1) must complete before implementation phases that consume them.
+ Phases without dependencies can run in parallel. Types/interfaces (Phase 1) must complete before implementation phases that consume them. Critique can run alongside detail-planning for the next phase.

  ---

@@ -119,31 +124,40 @@ Cross-cutting feature, multiple domains, needs team coordination. Uses **progres
  ### Stage Outline (high-level only — no file-level detail yet)
  1. [domain A foundation] — no deps — ~N cycles
  2. [domain B foundation] — no deps — ~N cycles
+ → critique stages 1-2 (foundation is low-risk individually, but review before building on it)
  3. [domain A implementation] — depends on 1 — ~N cycles
  4. [domain B implementation] — depends on 2 — ~N cycles
+ → critique + validate stages 3-4 (core logic, high risk — verify before integration)
  5. [integration layer] — depends on 3, 4 — ~N cycles
- 6. [integration tests] — depends on all — ~N cycles
+ → validate end-to-end (integration is where accumulated assumptions break)
+ 6. [final review] — depends on all

  ### Current Stage: [whichever is active]
  See context/plan-stage-N-{name}.md for detail plan.
  - [ ] [task-level items from detail plan]
  ```

+ Note: verification checkpoints are embedded in the stage outline, not deferred to a final phase. The level of rigor varies — foundation stages get a light critique, core logic gets critique + validation, integration gets full e2e validation. This is judgment, not formula.
+
  ### Cycle plan
  - **Cycle 1**: Spawn `sisyphus:spec-draft` for spec. Yield.
- - **Cycle 2**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Do not detail any stage — no file-level specifics." Spawn `sisyphus:test-spec` for test properties (parallel). Yield.
+ - **Cycle 2**: Spawn `sisyphus:plan` for **high-level stage outline only**. Instruction: "Outline stages, dependencies, one-sentence descriptions, cycle estimates. Include verification checkpoints between stages based on risk." Spawn `sisyphus:test-spec` for test properties (parallel). Yield.
  - **Cycle 3**: Review outline. Spawn `sisyphus:plan` to **detail-plan stage 1 only** (provide outline as context). Output to `context/plan-stage-1-{name}.md`. Yield.
  - **Cycle 4**: Spawn `sisyphus:implement` for stage 1. If stage 2 is independent, spawn `sisyphus:plan` to detail-plan stage 2 in parallel. Yield.
- - **Cycle 5**: Validate stage 1. Spawn `sisyphus:implement` for stage 2 (if detail-planned). Detail-plan stage 3 in parallel if independent. Yield.
- - **Cycle 6+**: Continue pattern — implement current stage, validate previous, detail-plan next. Each stage follows implement → critique → refine → validate.
+ - **Cycle 5**: Spawn `sisyphus:implement` for stage 2 (if detail-planned). Spawn `sisyphus:review` to critique stages 1-2 in parallel — foundation review before core logic builds on it. Detail-plan stage 3 in parallel. Yield.
+ - **Cycle 6**: Address critique findings. Spawn `sisyphus:implement` for stage 3. Yield.
+ - **Cycle 7**: Spawn `sisyphus:implement` for stage 4. Spawn `sisyphus:review` to critique stage 3 in parallel. Yield.
+ - **Cycle 8**: Spawn `sisyphus:validate` for stages 3-4 — core logic checkpoint before integration. Address stage 3 critique. Yield.
+ - **Cycle 9+**: Implement integration stage. Validate e2e. Final review.

  ### Failure modes
  - **Detail-plan agent can't produce quality output**: The stage is still too large. Break it into sub-stages in the outline and detail-plan each sub-stage individually.
  - **Integration failures**: Often means contracts between domains don't match. Spawn debug agent targeting the integration seam.
  - **Stage N implementation invalidates stage N+1 outline**: Update the high-level outline. This is expected — it's why you don't detail-plan everything upfront.
+ - **Critique finds issues after multiple stages built on top**: This is the scenario verification checkpoints exist to prevent. If it happens, you waited too long to review — add earlier checkpoints to the roadmap going forward.

  ### Parallelization
- Maximize within the progressive pattern. Independent stages run in parallel. Detail-planning the next stage runs alongside implementing the current one. Foundation stages complete before dependent stages. Integration waits for all domain implementations.
+ Maximize within the progressive pattern. Independent stages run in parallel. Detail-planning the next stage runs alongside implementing the current one. Critique and validation agents run alongside the next stage's planning or implementation. Foundation stages complete before dependent stages. Integration waits for all domain implementations.

  ---

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "sisyphi",
-   "version": "1.0.7",
+   "version": "1.0.9",
    "description": "tmux-integrated orchestration daemon for Claude Code multi-agent workflows",
    "license": "MIT",
    "repository": {
@@ -0,0 +1,57 @@
+ # templates/agent-plugin/hooks/
+
+ Lifecycle hooks for agent plugin workflows. Enable specialized prompt generation and context handling during agent spawning.
+
+ ## hooks.json
+
+ Schema: `{ "phaseKey": { "hookName": "script-name.sh" } }`
+
+ Example:
+ ```json
+ {
+   "plan": {
+     "userPrompt": "plan-user-prompt.sh",
+     "systemPrompt": "plan-system-prompt.sh"
+   }
+ }
+ ```
+
+ - **Keys**: Phase names (e.g., `plan`, `spec`, `implement`) — must correspond to phase modes in agent spawn workflow
+ - **Values**: Object mapping hook types to shell script names
+ - **Hook types**: `userPrompt`, `systemPrompt` (extensible for future hooks)
+
+ ## Shell Scripts
+
+ Each script receives environment variables and outputs text to stdout.
+
+ ```bash
+ # Receives: $SISYPHUS_SESSION_ID, $SISYPHUS_AGENT_ID, $INSTRUCTION, $AGENT_TYPE, context files
+ # Outputs: Full user or system prompt text
+ ```
+
+ **Convention**: `{phase}-{hook-type}.sh`
+
+ **Inputs**:
+ - `$SISYPHUS_SESSION_ID` — Session UUID
+ - `$SISYPHUS_AGENT_ID` — Agent ID (e.g., `agent-001`)
+ - `$INSTRUCTION` — Task instruction from spawn command
+ - `$AGENT_TYPE` — Agent type (e.g., `plan`, `spec`, `implement`)
+ - Context files at `.sisyphus/sessions/$SISYPHUS_SESSION_ID/context/`
+
+ **Output**: Must write complete prompt text to stdout (no errors to stderr)
+
+ ## Invocation
+
+ Hooks are executed during agent spawn when:
+ 1. Agent type matches a plugin agent type (e.g., `--agent-type sisyphus:plan`)
+ 2. Phase has hooks configured in hooks.json
+ 3. Daemon renders prompts before passing to Claude
+
+ Output becomes the `--append-system-prompt` or user message content.
+
+ ## Key Patterns
+
+ - **No placeholders in shell scripts** — unlike `.md` templates, scripts perform logic and generate final text
+ - **Context access**: Scripts can read session state from `$SISYPHUS_SESSION_ID` directory
+ - **Error handling**: Exit non-zero to fail agent spawn; errors logged to daemon.log
+ - **Stdout only**: Scripts must output complete prompt to stdout; nothing to stderr
@@ -0,0 +1,15 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce systematic methodology for debug agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <debug-reminder>
+ Systematic debugging — don't skip the fundamentals:
+
+ - Check git log/blame near the failure — recent changes are the highest-signal evidence
+ - For medium+ difficulty (crosses 2+ modules, unclear cause), spawn parallel subagents: data flow tracer, assumption auditor, change investigator
+ - Your report must include: exact failing line(s), concrete evidence (code snippets, data flow), confidence level (high/medium/low), and recommended fix
+
+ Investigate only — no code changes except reproduction tests.
+ </debug-reminder>
+ HINT
@@ -3,18 +3,22 @@
    "PreToolUse": [
      {
        "matcher": "SendMessage",
-       "hook": {
-         "type": "command",
-         "command": "bash hooks/intercept-send-message.sh"
-       }
+       "hooks": [
+         {
+           "type": "command",
+           "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/intercept-send-message.sh"
+         }
+       ]
      }
    ],
    "Stop": [
      {
-       "hook": {
-         "type": "command",
-         "command": "bash hooks/require-submit.sh"
-       }
+       "hooks": [
+         {
+           "type": "command",
+           "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/require-submit.sh"
+         }
+       ]
      }
    ]
  }
@@ -0,0 +1,14 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce paranoid testing for operator agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <operator-reminder>
+ Click EVERYTHING — assume something is broken and prove it:
+
+ - Every link, button, nav item, dropdown, toggle, accordion, interactive element on the page
+ - Edge cases: empty forms, duplicate submissions, back-button mid-flow, double-clicks, rapid navigation, browser refresh mid-action
+ - Check ALL sources: DOM, console errors, network failures, logs — not just what's visually obvious
+ - Spawn subagents to parallelize when scope is broad (one per page/flow/feature area) — the cost of missing a broken button is higher than an extra agent
+ </operator-reminder>
+ HINT
@@ -2,10 +2,7 @@
  # UserPromptSubmit hook: remind plan agent to delegate for large tasks.
  if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi

- python3 -c "
- import json, sys
- print(json.dumps({'additionalContext': sys.stdin.read()}))
- " <<'HINT'
+ cat <<'HINT'
  <planning-reminder>
  For particularly large or multi-domain tasks, delegate sub-plans to specialist agents rather than planning everything solo:

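For context on the removed wrapper: the previous version JSON-encoded the heredoc into an `additionalContext` object before emitting it, which the new version replaces with plain `cat`. A standalone sketch of the old behavior (the reminder body below is a placeholder, not the real hook text):

```shell
#!/bin/bash
# Reproduces the removed python3 wrapper: stdin becomes the value of an
# "additionalContext" JSON field printed on stdout.
out="$(python3 -c "
import json, sys
print(json.dumps({'additionalContext': sys.stdin.read()}))
" <<'HINT'
<planning-reminder>
example reminder body
</planning-reminder>
HINT
)"
printf '%s\n' "$out"
```

The wrapper's only effect was the JSON envelope around the raw heredoc text, which is why swapping it for `cat` leaves the reminder content itself unchanged.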
@@ -0,0 +1,16 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce cross-plan interface focus for plan review agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <review-plan-reminder>
+ The primary source of bugs is the interfaces between plans:
+
+ - Confirm critical/high findings by cross-referencing spec and code yourself — don't rubber-stamp subagent opinions
+ - Flag file ownership conflicts: any file touched by 2+ plans or agents needs explicit coordination
+ - Read actual source files for pattern consistency — don't review the plan in isolation
+ - Type definitions must have exactly one owner; flag divergent names/shapes for the same concept
+
+ You are read-only. Synthesize and report — never edit plan or code files yourself.
+ </review-plan-reminder>
+ HINT
@@ -0,0 +1,16 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce validation discipline for review agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <review-reminder>
+ Only report confirmed findings — spawn validation subagents (~1 per 3 issues) before finalizing:
+
+ - Bugs/Security: opus validates exploitable/broken
+ - Everything else: sonnet confirms significant (not nitpick)
+ - Drop anything subjective, pre-existing, or linter-catchable
+ - Every finding needs `file:line` + concrete evidence — no "this could be a problem"
+
+ You are read-only. Investigate and direct fixes through implementers — never edit code yourself.
+ </review-reminder>
+ HINT
@@ -2,10 +2,7 @@
  # UserPromptSubmit hook: remind spec agent to iterate with the user.
  if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi

- python3 -c "
- import json, sys
- print(json.dumps({'additionalContext': sys.stdin.read()}))
- " <<'HINT'
+ cat <<'HINT'
  <spec-reminder>
  Iterate with the user — include them in the process before writing anything to disk:

@@ -0,0 +1,14 @@
+ #!/bin/bash
+ # UserPromptSubmit hook: reinforce behavioral invariants for test-spec agents.
+ if [ -z "$SISYPHUS_SESSION_ID" ]; then exit 0; fi
+
+ cat <<'HINT'
+ <test-spec-reminder>
+ Behavioral properties, not test code:
+
+ - State behaviors as invariants: "Users can log in with email/password" — not "loginHandler calls bcrypt.compare"
+ - Each property must be independently verifiable
+ - Include negative properties — what must NOT happen is as important as what must
+ - If the change is purely mechanical with nothing to verify, submit { "testsNeeded": false }
+ </test-spec-reminder>
+ HINT
@@ -2,10 +2,12 @@
   "hooks": {
     "UserPromptSubmit": [
       {
-        "hook": {
-          "type": "command",
-          "command": "bash hooks/user-prompt-context.sh"
-        }
+        "hooks": [
+          {
+            "type": "command",
+            "command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/user-prompt-context.sh"
+          }
+        ]
       }
     ]
   }