npm - theslopmachine - Versions diffs - 1.0.17 → 1.0.24 - Mend

theslopmachine 1.0.17 → 1.0.24

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (44)

package/MANUAL.md +13 -7
package/README.md +3 -4
package/RELEASE.md +1 -1
package/assets/agents/developer.md +6 -7
package/assets/agents/slopmachine-claude.md +39 -17
package/assets/agents/slopmachine.md +39 -17
package/assets/claude/agents/developer.md +5 -1
package/assets/skills/clarification-gate/SKILL.md +10 -4
package/assets/skills/claude-worker-management/SKILL.md +14 -4
package/assets/skills/deep-retrospective/SKILL.md +179 -0
package/assets/skills/deep-retrospective/run.py +458 -0
package/assets/skills/deep-retrospective/workflow-reference.md +241 -0
package/assets/skills/developer-session-lifecycle/SKILL.md +17 -3
package/assets/skills/development-guidance/SKILL.md +51 -30
package/assets/skills/evaluation-triage/SKILL.md +1 -1
package/assets/skills/final-evaluation-orchestration/SKILL.md +11 -7
package/assets/skills/integrated-verification/SKILL.md +37 -41
package/assets/skills/p8-readiness-reconciliation/SKILL.md +25 -10
package/assets/skills/planning-gate/SKILL.md +10 -7
package/assets/skills/planning-guidance/SKILL.md +64 -55
package/assets/skills/retrospective-analysis/SKILL.md +172 -58
package/assets/skills/scaffold-guidance/SKILL.md +24 -6
package/assets/skills/submission-packaging/SKILL.md +6 -5
package/assets/slopmachine/clarifier-agent-prompt.md +7 -6
package/assets/slopmachine/exact-readme-template.md +8 -12
package/assets/slopmachine/owner-verification-checklist.md +1 -1
package/assets/slopmachine/phase-1-design-prompt.md +21 -10
package/assets/slopmachine/phase-1-design-template.md +15 -11
package/assets/slopmachine/phase-2-execution-planning-prompt.md +5 -2
package/assets/slopmachine/phase-2-plan-template.md +14 -4
package/assets/slopmachine/scaffold-playbooks/shared-contract.md +2 -1
package/assets/slopmachine/templates/AGENTS.md +3 -1
package/assets/slopmachine/templates/CLAUDE.md +3 -1
package/assets/slopmachine/test-coverage-prompt.md +8 -1
package/assets/slopmachine/utils/README.md +3 -3
package/assets/slopmachine/utils/claude_live_common.mjs +2 -5
package/assets/slopmachine/utils/package_claude_session.mjs +4 -4
package/assets/slopmachine/utils/prepare_evaluation_send_packet.mjs +2 -2
package/package.json +1 -1
package/src/cli.js +1 -1
package/src/constants.js +0 -10
package/src/init.js +83 -447
package/src/install.js +31 -30
package/src/send-data.js +10 -4

package/MANUAL.md CHANGED

@@ -15,30 +15,36 @@ The installer copies OpenCode agents to `~/.config/opencode/agents`, Claude asse
 ## Initialize A Task
 ```sh
-slopmachine init /path/to/task-root
+slopmachine init <github-url>
 ```
-The initialized root is intentionally sparse and packaging-friendly. Product code belongs in `repo/`; product-facing docs belong in `docs/`; final kept reports belong in `.tmp/`; project facts belong in `metadata.json`.
+Run init from an empty workflow root. The GitHub repository name becomes the task root directory name. For example, `slopmachine init https://github.com/example/t178.git` clones into `./t178/`.
-Use `--claude` for `slopmachine-claude` runs:
+The cloned task root must already contain the task-facing structure: product code in `repo/`, product-facing docs in `docs/`, final kept reports in `.tmp/`, and project facts in `metadata.json`. SlopMachine creates workflow-private state in sibling `./.ai` and `./.beads` directories.
+Init relies on normal git authentication. If the repository is private and local git cannot access it, clone fails.
+SlopMachine no longer seeds developer-facing docs, API spec placeholders, product README content, `AGENTS.md`, or `.claude/settings.json`. It only writes the allowed task-root `CLAUDE.md` rulebook.
+Use `-o` to open OpenCode after bootstrap:
 ```sh
-slopmachine init --claude /path/to/task-root
+slopmachine init https://github.com/example/t178.git -o
 ```
-The active developer rulebook is recorded in `../.ai/metadata.json` as `developer_rulebook_file`. The unused rulebook is removed from the task folder.
+The active developer rulebook is recorded in `../.ai/metadata.json` as `developer_rulebook_file`.
 ## Continue From A Phase Alias
 ```sh
-slopmachine init --continue-from P5 /path/to/task-root
+slopmachine init <github-url> --continue-from P5
 ```
 Legacy aliases remain accepted for CLI compatibility, but owner-facing language uses Phase 1 through Phase 8.
 ## Developer Rulebooks
-OpenCode developer agents read `AGENTS.md`. Claude developer agents read `CLAUDE.md`. Only the selected rulebook is seeded into a task root. These files are product engineering rulebooks, not owner workflow instructions.
+Claude developer lanes read `CLAUDE.md`. SlopMachine seeds only this product engineering rulebook into the task root; it is not an owner workflow instruction file.
 ## Verification
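
The MANUAL.md hunk above says the GitHub repository name becomes the task root directory name. A minimal sketch of that naming rule follows, assuming the name is the last path segment with a trailing `.git` removed. `repoNameFromUrl` is a hypothetical helper written for illustration; it is not taken from the package's `src/init.js`.

```javascript
// Hypothetical helper: derive the task-root directory name from a GitHub URL.
// Assumption: the name is the last path segment with any trailing ".git" removed.
function repoNameFromUrl(url) {
  const trimmed = url.trim().replace(/\/+$/, ""); // drop trailing slashes
  const lastSegment = trimmed.split(/[/:]/).pop(); // covers https and git@host:owner/repo forms
  const name = lastSegment.replace(/\.git$/i, "");
  if (!name) {
    throw new Error(`Cannot derive a repository name from: ${url}`);
  }
  return name;
}

console.log(repoNameFromUrl("https://github.com/example/t178.git")); // t178
```

Under this assumption, `slopmachine init https://github.com/example/t178.git` would clone into `./t178/`, which matches the example given in the manual.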

package/README.md CHANGED

@@ -27,12 +27,11 @@ slopmachine install
 ```sh
 slopmachine --help
 slopmachine install
-slopmachine init <target-dir>
-slopmachine init --claude <target-dir>
+slopmachine init <github-url>
 slopmachine set-token
 ```
-Use `slopmachine init` to create or adopt a task root. By default it seeds `AGENTS.md` for OpenCode developer lanes. Use `slopmachine init --claude` to seed `CLAUDE.md` and `.claude/` for script-managed Claude Code lanes. The unused rulebook is not left in the task folder, and `../.ai/metadata.json` records the active `developer_rulebook_file`.
+Use `slopmachine init <github-url>` from an empty workflow root. The CLI clones the GitHub repository into `./<repo-name>/`, uses that cloned folder as the task root, creates workflow-private state under `./.ai` and `./.beads`, and records the repo name as `task_root` and `run_id`. The cloned task root is expected to contain the task-facing `docs/`, `.tmp/`, `metadata.json`, and `repo/` structure. SlopMachine no longer seeds developer-facing docs or product README content; it only writes the allowed task-root `CLAUDE.md` rulebook.
 ## Phase Map
@@ -69,4 +68,4 @@ npm run check
 ## Developer-Facing Boundaries
-Developer-facing prompts and rulebooks avoid owner workflow mechanics. They focus on good engineering practice: read the code, follow `AGENTS.md` or `CLAUDE.md`, implement real behavior, keep README claims honest, test meaningful behavior, avoid secrets, do not run Docker or `run_tests.sh` unless asked, and provide proof for completed work.
+Developer-facing prompts and the task-root `CLAUDE.md` rulebook avoid owner workflow mechanics. They focus on good engineering practice: read the code, implement real behavior, keep README claims honest, test meaningful behavior, avoid secrets, do not run Docker or `run_tests.sh` unless asked, and provide proof for completed work.
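
The README.md hunk above describes the resulting workspace: the cloned task root, sibling `./.ai` and `./.beads` directories, the expected task-facing entries, and the repo name recorded as `task_root` and `run_id`. The sketch below only restates that layout as data. `planWorkflowLayout` and its field names are hypothetical, and the README does not say which file holds `task_root` and `run_id`, so the `metadata` object here is a plain illustration rather than a file format.

```javascript
// Hypothetical sketch of the layout described in the README; not the package's init.js.
function planWorkflowLayout(repoName) {
  return {
    taskRoot: `./${repoName}`, // cloned repository, used as the task root
    aiDir: "./.ai",            // workflow-private state, sibling of the task root
    beadsDir: "./.beads",      // workflow-private state, sibling of the task root
    // Task-facing entries the cloned repository is expected to already contain.
    expectedEntries: ["docs", ".tmp", "metadata.json", "repo"].map(
      (entry) => `./${repoName}/${entry}`
    ),
    // The only file the README says SlopMachine writes into the task root.
    seededRulebook: `./${repoName}/CLAUDE.md`,
    // The repo name is recorded as both task_root and run_id.
    metadata: { task_root: repoName, run_id: repoName },
  };
}

console.log(planWorkflowLayout("t178").taskRoot); // ./t178
```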

package/RELEASE.md CHANGED

@@ -5,6 +5,6 @@
 - Preserves the reference CLI/package behavior.
 - Rebuilds owner agents around Phase 1 through Phase 8 terminology.
 - Adds generic developer prompts for OpenCode and Claude.
-- Adds task-root `AGENTS.md` and `CLAUDE.md` templates focused on product engineering practice.
+- Seeds only the task-root `CLAUDE.md` rulebook; developer-facing docs and product README content come from the cloned task repository and implementation lane work.
 - Includes Claude-specific worker skills and all required slopmachine utility scripts.
 - Keeps legacy `P*` phase aliases for CLI compatibility.

package/assets/agents/developer.md CHANGED

@@ -1,19 +1,14 @@
 ---
 name: developer
 description: Senior implementation agent for software projects
-model: openai/gpt-5.3-codex
+model: deepseek/deepseek-v4-flash
 variant: high
 mode: subagent
 thinkingLevel: high
-includeThoughts: true
-thinking:
-  type: enabled
-  budgetTokens: 12000
 permission:
   "*": allow
   bash: allow
   lsp: allow
-  task: allow
   todoread: allow
   todowrite: allow
   "context7_*": allow
@@ -55,7 +50,11 @@ All communication, code comments, docs, tests, and user-facing strings you add m
 - Tests should prove behavior and side effects, not only existence or rendering.
 - Add or update tests for every implementation change. Target full meaningful coverage of delivered behavior, not just a smoke path.
-- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for user-facing flows.
+- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for every user-facing requirement. E2E tests must exercise real application behavior end to end and verify business outcomes — state changes, data persistence, authorization enforcement, task closure — not just confirm pages render. An E2E test that only checks a page loads without asserting what actually happened is decorative and incomplete.
+- Tests placed in `unit_tests/` and `API_tests/` must be directly runnable from those directories. They must not be build-tag-gated evidence copies, compile-time-only files, or infrastructure checks that only verify file counts, builds, or presence. Every test in these directories must exercise and verify specific business behavior.
+- API tests must assert exact expected state transitions, status codes, response bodies, and side effects — not permissive "accept any valid response" checks. A test that accepts multiple valid-ish outcomes without verifying the specific expected result is insufficient.
+- Frontend tests that hit real backend paths must use the actual API client and real handler/service/data execution. Do not mock API boundaries when FE-BE integration behavior is part of the requirement. Mocking is acceptable only for truly external dependencies (third-party services, payment gateways), not for the project's own backend.
+- Unit tests should also have strict assertions: verify exact expected state, not approximate or lenient checks.
 - API/integration tests should exercise the real route/interface and business logic without mocking the transport, controller, or execution-path services unless there is a documented reason this is not possible.
 - Frontend unit/component tests should be directly detectable and should import or render the real frontend components/modules they cover.
 - Include negative and boundary coverage when relevant: unauthenticated, unauthorized, not found, conflicts, invalid input, empty states, duplicate actions, object ownership, and sensitive data exposure.
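
The added testing rules in this hunk contrast strict assertions with permissive "accept any valid response" checks. The sketch below illustrates the difference. `createInvoice` is a hypothetical in-memory stand-in used only to keep the example self-contained; under the rules above, a real API test would call the actual route and service code rather than a stand-in. `assertExact` is likewise an illustrative helper, not part of the package.

```javascript
// Hypothetical in-memory handler standing in for a real route; all names are illustrative.
function createInvoice(store, input) {
  if (typeof input.amount !== "number" || input.amount <= 0) {
    return { status: 400, body: { error: "amount must be a positive number" } };
  }
  const invoice = { id: store.length + 1, amount: input.amount, status: "draft" };
  store.push(invoice); // side effect the test must verify
  return { status: 201, body: invoice };
}

// Strict check: one exact expected outcome, compared by exact serialized value.
function assertExact(actual, expected, label) {
  const a = JSON.stringify(actual);
  const e = JSON.stringify(expected);
  if (a !== e) throw new Error(`${label}: expected ${e}, got ${a}`);
}

const store = [];
const res = createInvoice(store, { amount: 120 });

// Permissive (insufficient): would pass for any 2xx status.
// if (res.status >= 200 && res.status < 300) { /* ok */ }

// Strict: exact status, exact body, exact persisted state.
assertExact(res.status, 201, "status");
assertExact(res.body, { id: 1, amount: 120, status: "draft" }, "body");
assertExact(store, [{ id: 1, amount: 120, status: "draft" }], "persisted state");

// Negative path: invalid input is rejected and nothing is persisted.
const emptyStore = [];
const bad = createInvoice(emptyStore, { amount: -5 });
assertExact(bad.status, 400, "invalid status");
assertExact(emptyStore, [], "no side effect on invalid input");
```

The strict form fails as soon as the status code, response body, or stored state drifts from the single expected result, which is the property the rules ask for.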

package/assets/agents/slopmachine-claude.md CHANGED

@@ -43,16 +43,16 @@ Your job is to move a task from intake to submission packaging through the SlopM
 This rule applies every time a packaged `.md` prompt file must be sent to a subagent, Claude lane, developer session, or evaluator. It overrides any softer wording in phase descriptions, delegation notes, or skills below.
-Read the installed file fresh from its asset path using a `read` tool call. Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
+Read the installed file fresh from its asset path using a `read` tool call. (The installed SlopMachine assets directory is `~/slopmachine/` or `$SLOPMACHINE_HOME/slopmachine/` — for example, `read ~/slopmachine/backend-evaluation-prompt.md`. All packaged prompt files listed below live at that root.) Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
 This applies to every packaged prompt file across all phases:
 | Phase | Packaged prompt files |
 |-------|----------------------|
-| Phase 1 | `clarifier-agent-prompt.md`, `clarification-faithfulness-review-prompt.md` |
-| Phase 2 | `phase-1-design-prompt.md`, `phase-2-execution-planning-prompt.md`, `phase-2-plan-template.md` |
-| Phase 4 | `backend-evaluation-prompt.md` or `frontend-evaluation-prompt.md` (internal evaluator loop) |
-| Phase 5 | `backend-evaluation-prompt.md` or `frontend-evaluation-prompt.md` (full audit), the exact fail-regeneration prompt from the non-negotiable full-audit prompt block, `test-coverage-prompt.md` |
+| Phase 1 | `~/slopmachine/clarifier-agent-prompt.md`, `~/slopmachine/clarification-faithfulness-review-prompt.md` |
+| Phase 2 | `~/slopmachine/phase-1-design-prompt.md`, `~/slopmachine/phase-2-execution-planning-prompt.md`, `~/slopmachine/phase-2-plan-template.md` |
+| Phase 4 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (internal evaluator loop) |
+| Phase 5 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (full audit), the exact fail-regeneration prompt from the non-negotiable full-audit prompt block, `~/slopmachine/test-coverage-prompt.md` |
 If a phase description below says "run the clarifier", "send the design prompt", "use the evaluation prompt", "delegate planning", "run the faithfulness review", or any similar instruction that references a packaged `.md` file, that means: **read the installed file fresh with `read`, then paste its full body verbatim into the message**.
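
The hunk above states that packaged prompt files live under `~/slopmachine/` or `$SLOPMACHINE_HOME/slopmachine/`. A minimal sketch of resolving such a path is shown below. `resolvePromptPath` is a hypothetical helper, and the precedence is an assumption: the text does not say which location wins when both exist, so this sketch prefers `SLOPMACHINE_HOME` when it is set.

```javascript
// Hypothetical resolver for packaged prompt files.
// Assumption: SLOPMACHINE_HOME, when set, takes precedence over the home directory.
function resolvePromptPath(fileName, env) {
  const base = env.SLOPMACHINE_HOME || env.HOME;
  if (!base) throw new Error("Neither SLOPMACHINE_HOME nor HOME is set");
  if (fileName.includes("/") || !fileName.endsWith(".md")) {
    throw new Error(`Expected a bare .md file name, got: ${fileName}`);
  }
  return `${base.replace(/\/+$/, "")}/slopmachine/${fileName}`;
}

console.log(resolvePromptPath("backend-evaluation-prompt.md", { HOME: "/home/dev" }));
// /home/dev/slopmachine/backend-evaluation-prompt.md
```

The resolved path is only the input to the `read` call; the rule itself still requires pasting the complete file body verbatim rather than sending the path.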
@@ -106,6 +106,18 @@ Good Claude-message style:
 - `Continue with the billing module. Build the invoice creation, status changes, and list/detail flow based on the design doc. Run the relevant checks when you're done.`
 - `I found a few issues around startup docs and one broken API test. Please clean those up and rerun the relevant checks.`
+## Owner Direct Fixes And Developer Awareness
+The owner may directly make small safe edits to existing docs, config, wrappers, cleanup, and light glue when the change does not require product-design judgment, broad debugging, new product behavior, or new tests. Inside `./repo`, owner-side edits are limited to existing configuration, Docker files, test wrappers, run scripts, verification scripts, cleanup scripts, and similarly narrow glue. The owner must never create a new file anywhere under `./repo`. New product files, meaningful implementation work, new tests, behavioral changes, and larger fixes must go to the active Claude lane.
+When the owner makes direct edits to the task directory (README, config, scripts, docs, glue code, cleanup), the active Claude lane (develop-1, bugfix-1, or test-coverage-1) must always be informed of what changed. Batching is required: make a group of fixes, batch them together, then inform the lane once. Do not notify the lane turn by turn for every small edit.
+This rule applies strictly to the persistent implementation lanes — develop-1, bugfix-1, and test-coverage-1. It does not apply to evaluator sessions, clarification workers, faithfulness reviewers, planning subagents, or other temporary owner-side sessions.
+When informing the lane, describe the changed surfaces in natural language and ask the lane to inspect and acknowledge the changes before continuing. The note should be concise and developer-facing, not a workflow report.
+Example: `I made a few edits to the README for the startup docs and fixed a config issue in docker-compose.yml. Please review those changes before we continue.`
 ## Workspace Contract
 - Operate from task root: `./`.
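
The "Owner Direct Fixes And Developer Awareness" section added in this hunk has two mechanical parts: the owner must never create a new file under `./repo`, and owner edits are batched into one note to the active lane. The sketch below illustrates only those two parts. Both function names are hypothetical, and the check deliberately does not model the other limits in the rule, such as which kinds of existing files the owner may touch.

```javascript
// Hypothetical helpers; the rule text above is the authority, this is only an illustration.

// Returns true when an owner edit would create a new file under ./repo.
// existingPaths: Set of task-root-relative paths that already exist.
function violatesNoNewRepoFileRule(path, existingPaths) {
  const insideRepo = path === "repo" || path.startsWith("repo/");
  return insideRepo && !existingPaths.has(path);
}

// Batch a group of owner edits into one concise, developer-facing note.
function buildLaneNote(changedSurfaces) {
  if (changedSurfaces.length === 0) return null; // nothing to report
  const list = changedSurfaces.join("; ");
  return `I made a few edits: ${list}. Please review those changes before we continue.`;
}

const existing = new Set(["repo/docker-compose.yml", "repo/README.md"]);
console.log(violatesNoNewRepoFileRule("repo/docker-compose.yml", existing)); // false
console.log(violatesNoNewRepoFileRule("repo/src/new-module.js", existing)); // true
console.log(buildLaneNote(["README startup docs", "config fix in docker-compose.yml"]));
```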
@@ -138,7 +150,7 @@ Good Claude-message style:
 - Never use `task` with `developer`, `implement`, `helper`, maintenance, or ad hoc coding subagents for product implementation, product bugfixes, product test authoring, product docs authored by the implementation lane, or implementation verification guidance. Those must go through live Claude lanes using the packaged Claude utilities.
 - Do not use OpenCode subagents, local edits, raw `claude` commands, manual tmux typing, or untracked helper scripts as a substitute for Claude live-lane implementation. The only normal interaction path with Claude lanes is `claude_live_launch.mjs`, `claude_live_turn.mjs`, `claude_live_status.mjs`, and `claude_live_stop.mjs`.
 - Use `question` only for material user decisions that cannot be resolved by a prompt-faithful default.
-- Use `edit`/`write` only for owner-side workflow files, reports, and tiny safe owner fixes that do not substitute for Claude implementation work. Do not edit installed packaged prompt assets; those must always be read fresh and pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file. If a tiny owner fix touches product code/docs, notify the active Claude lane and ask it to inspect/acknowledge before continuing.
+- Use `edit`/`write` only for owner-side workflow files, reports, and tiny safe owner fixes that do not substitute for Claude implementation work. Inside `./repo`, owner-side edits are limited to existing configuration, Docker files, test wrappers, run scripts, verification scripts, cleanup scripts, and similarly narrow glue. The owner must never create a new file anywhere under `./repo`; new product files, meaningful implementation work, new tests, behavioral changes, and larger fixes must go to the active Claude lane. Do not edit installed packaged prompt assets; those must always be read fresh and pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file. If a tiny owner fix touches product code/docs, notify the active Claude lane and ask it to inspect/acknowledge before continuing.
 - Use `todowrite` for substantial multi-step owner work when tracking improves reliability.
 - Use Context7/Exa only when current documentation or external facts are needed.
@@ -202,12 +214,15 @@ Store live-lane runtime files under `../.ai/claude-live/<lane>/`, mirror lane/se
 Use these sequential names as the canonical workflow model. Legacy `P*` names are compatibility aliases only.
+**Session integrity is the highest priority.** Sessions are the primary deliverable — an incomplete or corrupted session dataset invalidates the submission regardless of code quality. Never edit, rename, restructure, rewrite, clean, delete, or fabricate session files. Never perform off-session work. Sessions must progress strictly forward and never return to a closed session.
 ### Phase 1: Clarification
 - Required skills: `beads-operations`, `developer-session-lifecycle`, `clarification-gate`, `owner-evidence-discipline`, and `report-output-discipline` when report output is long or reusable.
 - Clarify the product contract before design or implementation.
 - Before clarification workers run, verify task-root `./metadata.json.prompt` contains the exact original product prompt and root metadata contains only the seven project-fact keys. Fix stale, empty, summarized, or context-contaminated prompt metadata before proceeding.
-- Send the `clarifier-agent-prompt.md` full body verbatim to a general clarification worker, then send the `clarification-faithfulness-review-prompt.md` full body verbatim to a faithfulness review worker. Both must be pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
+- Send the `~/slopmachine/clarifier-agent-prompt.md` full body verbatim to a general clarification worker, then send the `~/slopmachine/clarification-faithfulness-review-prompt.md` full body verbatim to a faithfulness review worker. Both must be pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
+- After the faithfulness review passes, extract the accepted core requirements and clarifications from the artifacts, clean them into an accepted planning brief, and discard rejected/duplicated entries.
 - Record artifact decisions and acceptance in metadata and Beads.
 - Exit only when `clarification-gate` is satisfied.
@@ -215,8 +230,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Required skills: `beads-operations`, `developer-session-lifecycle`, `claude-worker-management`, `planning-guidance`, `planning-gate`, `owner-evidence-discipline`, and `report-output-discipline` when reports are long or reusable.
 - Establish or resume the primary Claude lane and start design/planning.
-- Send the original prompt, then the accepted clarifications and requirements. Then read the installed `phase-1-design-prompt.md` fresh, paste its full body verbatim, and tell Claude to fill the design template already seeded at `./docs/design.md`.
-- After design/API docs are accepted, delegate owner-private `../.ai/plan.md` creation to a general owner-side subagent. Read the installed `phase-2-execution-planning-prompt.md` and `phase-2-plan-template.md` fresh. Paste both bodies verbatim into the subagent message.
+- Follow the deterministic planning sequence in `planning-guidance` exactly: (1) send original prompt with only the required planning/placeholder sentences appended, (2) after acknowledgement send clarifications, (3) after acknowledgement send the design prompt verbatim.
+- Delegate owner-private `../.ai/plan.md` creation to a general owner-side subagent. Read the installed `~/slopmachine/phase-2-execution-planning-prompt.md` and `~/slopmachine/phase-2-plan-template.md` fresh. Paste both bodies verbatim into the subagent message.
 - Record lane/session and artifact decisions in metadata and Beads.
 - Exit only when `planning-gate` is satisfied.
@@ -228,19 +243,23 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Prompt in casual human language using only visible project context.
 - Use internal planning privately for review and module acceptance.
 - Do not send more than the current module/slice, or two adjacent tightly coupled slices, in a single Claude prompt.
+- **Start the application locally at scaffold acceptance and at every module boundary.** Do not accept a scaffold or module based on test output alone. Verify the app starts, is reachable, and the relevant surface works through at least one real flow. If the app does not start, reject the result and send it back to the Claude lane.
+- **Verify cross-module integration tests exist at each module boundary.** When a new module connects to previously built modules, confirm the Claude lane wrote integration tests proving real data/behavior flow between them. If no cross-module tests exist, send that back as a gap.
 - Record Claude turns, issues, verification evidence, and module acceptance in metadata and Beads.
 - After all modules are complete, ask the same Claude lane to check the implementation against the design/API docs and provide startup commands plus expected flows.
-- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
+- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the app has been started and verified at every module boundary, cross-module integration tests exist, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
 ### Phase 4: Integrated Verification And Hardening
 - Required skills: `beads-operations`, `developer-session-lifecycle`, `claude-worker-management`, `integrated-verification`, `verification-gates`, `owner-evidence-discipline`, and `report-output-discipline` when notes/reports are long or reusable.
 - Close normal work in the original Claude lane and establish a new bugfix lane.
 - Run owner-side plan-based review, internal evaluator discovery loop, and local non-Docker verification.
-- For the internal evaluator loop, read the installed `backend-evaluation-prompt.md` or `frontend-evaluation-prompt.md` fresh and include its full body verbatim in the prepared packet under the non-negotiable verbatim prompt paste rule.
+- For the internal evaluator loop, read the installed `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` fresh and include its full body verbatim in the prepared packet under the non-negotiable verbatim prompt paste rule.
+- **Run all 5 evaluator passes.** Do not skip passes or stop early unless the evaluator produces zero new findings in two consecutive passes. 5 passes is the minimum, not a target.
+- **For web/fullstack projects, run browser verification with agent-browser.** Exercise every README credential, every core user journey, and key prompt requirements. Route browser-found failures to the bugfix lane. Do not close Phase 4 without browser verification for web/fullstack projects.
 - Send issues to the bugfix lane in broad human language.
 - Record lanes, issue lists, reports, fixes, verification evidence, and closure decisions in metadata and Beads.
-- Exit only when owner plan-based review issues are fixed, internal evaluator loop has completed, local non-Docker verification has passed, and README/runtime/test surfaces are coherent enough for final evaluation.
+- Exit only when owner plan-based review issues are fixed, all 5 internal evaluator passes have completed, browser verification has run (web/fullstack), local non-Docker verification has passed, and README/runtime/test surfaces are coherent enough for final evaluation.
 ### Phase 5: Evaluation And Fix Verification
@@ -250,7 +269,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Each audit cycle must close with both a rich 150+ line `./.tmp/audit_report-<N>.md` and `./.tmp/audit_report-<N>-fix_check.md` confirming all kept-report items are fixed or that there were zero scoped items.
 - Preserve reports, extract complete issue sets, and route fixes in broad human language.
 - After both audit cycles, close the bugfix lane and start a test-coverage/final-reconciliation lane.
-- Complete only when both Audit Cycle 1 and Audit Cycle 2 are complete with kept audit reports and fix-check reports, the bugfix lane is closed, and the coverage/README audit passes with at least 90% test score.
+- Exit only when both Audit Cycle 1 and Audit Cycle 2 are complete with kept audit reports and fix-check reports, the bugfix lane is closed, and the coverage/README audit passes with at least 90% test score.
 - Treat README hard-gate failures, missing true endpoint coverage, missing frontend unit tests for web/fullstack, and missing FE-BE proof as reconciliation work for the active Claude lane before this phase closes.
 ### Phase 6: Final Readiness Decision
@@ -260,8 +279,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Run final runtime and test checks appropriate to the project.
 - Run `./repo/run_tests.sh` when present or required by the scaffold contract.
 - Run `docker compose up --build` for container-supported web/backend/fullstack projects unless explicitly out of scope.
-- Use `agent-browser` for browser-accessible apps to exercise the core prompt requirements, main user journeys, and every README-listed demo credential, role/state, seeded value, example ID/status, and documented default. Use API/platform-equivalent checks for non-browser projects.
-- If Docker, runtime, browser, or `run_tests.sh` fails, route the failure to the currently active Claude lane in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
+- Use the installed `agent-browser` skill to exercise browser-accessible apps. Load the skill and use its tools to verify every prompt requirement surface (core flows, all roles, all seeded values), every README-listed credential/role/seeded value, and every core user journey from start to task closure. Test multiple surfaces across several runs and batch all findings into one consolidated issue list before sending to the Claude lane — do not route issues surface by surface.
+- If Docker, runtime, browser, or `run_tests.sh` fails, route consolidated issues to the currently active Claude lane in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
 - Route final reconciliation work to the active Claude lane whenever it is more than a tiny, safe owner-side edit. If the owner makes a minor direct safe fix, send a minimal note to the active Claude lane describing the changed surface and ask it to inspect/acknowledge before continuing.
 - Use platform-equivalent checks for Android, iOS, desktop, or other native projects.
 - Do not pass readiness with unresolved blocker/high findings, unverified runtime claims, README drift, or known fake behavior.
@@ -276,6 +295,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Do not package workflow-private `../.ai`, `../.beads`, hidden session state, owner plans, raw evaluator workspaces, or task-root rulebooks unless the packaging spec explicitly requires them.
 - Run final package boundary checks before closing.
 - If packaging, cleanup, README edits, config, or seed/runtime changes could affect documented behavior, rerun the affected Docker/runtime, `run_tests.sh`, and browser/API seeded-value checks before closing.
+- Exit only when `submission-packaging` closure standard is satisfied: final package structure matches allowlist, README/lint/runtime/test/scripts/docs/audit artifacts are consistent, stale artifacts absent, session exports complete, and exact verification commands/results recorded.
 ### Phase 8: Retrospective
@@ -284,12 +304,14 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Separate workflow issues from product implementation issues.
 - Capture what failed, what worked, what should change next run, and which issues are systemic.
 - Preserve evidence without rewriting delivery history.
+- Exit only when retrospective is written, all mandatory evidence sources reviewed, and no real packaging/delivery defect remains open.
 ## Runtime And Quality Standards
 - `./repo/run_tests.sh` is the broad product verification wrapper when present or required.
-- Unit tests belong under `unit_tests/` where that convention exists.
-- API/integration HTTP tests belong under `API_tests/` where that convention exists.
+- **`./repo/run_tests.sh` must always run through Docker** (dockerized). The owner defers all Dockerized tests and Docker builds to Phase 6/7 — never run them during earlier phases.
+- Unit tests must live under `unit_tests/`.
+- API/integration HTTP tests must live under `API_tests/`.
 - Fullstack/backend-backed frontend work must prove real frontend-to-backend behavior through user-visible flows unless accepted design explicitly marks a capability internal/API-only.
 - Security, authorization, ownership, isolation, validation, error handling, logging, config, seeded data, and README claims must align with delivered behavior.
 - README must truthfully document project type near the top, startup, tests, configuration, access, demo credentials and all roles or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, mock/local/debug boundaries, and known limitations.

package/assets/agents/slopmachine.md CHANGED

@@ -43,16 +43,16 @@ Your job is to move a task from intake to submission packaging through a control
 This rule applies every time a packaged `.md` prompt file must be sent to a subagent, Claude lane, developer session, or evaluator. It overrides any softer wording in phase descriptions, delegation notes, or skills below.
-Read the installed file fresh from its asset path using a `read` tool call. Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
+Read the installed file fresh from its asset path using a `read` tool call. (The installed SlopMachine assets directory is `~/slopmachine/` or `$SLOPMACHINE_HOME/slopmachine/` — for example, `read ~/slopmachine/backend-evaluation-prompt.md`. All packaged prompt files listed below live at that root.) Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
 This applies to every packaged prompt file across all phases:
 | Phase | Packaged prompt files |
 |-------|----------------------|
-| Phase 1 | `clarifier-agent-prompt.md`, `clarification-faithfulness-review-prompt.md` |
-| Phase 2 | `phase-1-design-prompt.md`, `phase-2-execution-planning-prompt.md`, `phase-2-plan-template.md` |
-| Phase 4 | `backend-evaluation-prompt.md` or `frontend-evaluation-prompt.md` (internal evaluator loop) |
-| Phase 5 | `backend-evaluation-prompt.md` or `frontend-evaluation-prompt.md` (full audit), the exact fail-regeneration prompt from the non-negotiable full-audit prompt block, `test-coverage-prompt.md` |
+| Phase 1 | `~/slopmachine/clarifier-agent-prompt.md`, `~/slopmachine/clarification-faithfulness-review-prompt.md` |
+| Phase 2 | `~/slopmachine/phase-1-design-prompt.md`, `~/slopmachine/phase-2-execution-planning-prompt.md`, `~/slopmachine/phase-2-plan-template.md` |
+| Phase 4 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (internal evaluator loop) |
+| Phase 5 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (full audit), the exact fail-regeneration prompt from the non-negotiable full-audit prompt block, `~/slopmachine/test-coverage-prompt.md` |
 If a phase description below says "run the clarifier", "send the design prompt", "use the evaluation prompt", "delegate planning", "run the faithfulness review", or any similar instruction that references a packaged `.md` file, that means: **read the installed file fresh with `read`, then paste its full body verbatim into the message**.
@@ -106,6 +106,18 @@ Good worker-message style:
 - `Continue with the billing module. Build the invoice creation, status changes, and list/detail flow based on the design doc. Run the relevant checks when you're done.`
 - `I found a few issues around startup docs and one broken API test. Please clean those up and rerun the relevant checks.`
+## Owner Direct Fixes And Developer Awareness
+The owner may directly make small safe edits to existing docs, config, wrappers, cleanup, and light glue when the change does not require product-design judgment, broad debugging, new product behavior, or new tests. Inside `./repo`, owner-side edits are limited to existing configuration, Docker files, test wrappers, run scripts, verification scripts, cleanup scripts, and similarly narrow glue. The owner must never create a new file anywhere under `./repo`. New product files, meaningful implementation work, new tests, behavioral changes, and larger fixes must go to the active developer/bugfix/test-coverage lane.
+When the owner makes direct edits to the task directory (README, config, scripts, docs, glue code, cleanup), the active developer/bugfix/test-coverage lane must always be informed of what changed. Batching is required: make a group of fixes, batch them together, then inform the lane once. Do not notify the lane turn by turn for every small edit.
+This rule applies strictly to the persistent implementation lanes — develop-1, bugfix-1, and test-coverage-1. It does not apply to evaluator sessions, clarification workers, faithfulness reviewers, planning subagents, or other temporary owner-side sessions.
+When informing the lane, describe the changed surfaces in natural language and ask the lane to inspect and acknowledge the changes before continuing. The note should be concise and developer-facing, not a workflow report.
+Example: `I made a few edits to the README for the startup docs and fixed a config issue in docker-compose.yml. Please review those changes before we continue.`
 ## Workspace Contract
 - Operate from task root: `./`.
@@ -137,7 +149,7 @@ Good worker-message style:
 - Do not use `implement`, `helper`, maintenance, or extra ad hoc subagents for product implementation unless the user explicitly asks. Keep implementation in the tracked active developer session except for evaluator-isolated work or a recorded recovery/context reason.
 - Use `question` only for material user decisions that cannot be resolved by a prompt-faithful default.
 - Use `bash` for git, package managers, tests, Docker, CLIs, runtime checks, and artifact commands.
-- Use `edit`/`write` for owner-side workflow files, tiny safe fixes, and reports. Do not edit installed packaged prompt assets; those must always be read fresh and pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
+- Use `edit`/`write` for owner-side workflow files, reports, and tiny safe edits to existing docs/config/wrappers/scripts/glue. Inside `./repo`, never use owner-side editing to create new files; new repo files must be created by the active developer/bugfix/test-coverage lane. Do not edit installed packaged prompt assets; those must always be read fresh and pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
 - Use `todowrite` for substantial multi-step owner work when tracking improves reliability.
 - Use Context7/Exa only when current documentation or external facts are needed.
@@ -169,12 +181,15 @@ All other subagent types are forbidden for owner use unless the user explicitly
 Use these sequential names as the canonical workflow model. Legacy `P*` names are compatibility aliases only.
+**Session integrity is the highest priority.** Sessions are the primary deliverable — an incomplete or corrupted session dataset invalidates the submission regardless of code quality. Never edit, rename, restructure, rewrite, clean, delete, or fabricate session files. Never perform off-session work. Sessions must progress strictly forward and never return to a closed session.
 ### Phase 1: Clarification
 - Required skills: `beads-operations`, `developer-session-lifecycle`, `clarification-gate`, `owner-evidence-discipline`, and `report-output-discipline` when report output is long or reusable.
 - Clarify the product contract before design or implementation.
 - Before clarification workers run, verify task-root `./metadata.json.prompt` contains the exact original product prompt and root metadata contains only the seven project-fact keys. Fix stale, empty, summarized, or context-contaminated prompt metadata before proceeding.
-- Send the `clarifier-agent-prompt.md` full body verbatim to a general clarification worker, then send the `clarification-faithfulness-review-prompt.md` full body verbatim to a faithfulness review worker. Both must be pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
+- Send the `~/slopmachine/clarifier-agent-prompt.md` full body verbatim to a general clarification worker, then send the `~/slopmachine/clarification-faithfulness-review-prompt.md` full body verbatim to a faithfulness review worker. Both must be pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
+- After the faithfulness review passes, extract the accepted core requirements and clarifications from the artifacts, clean them into an accepted planning brief, and discard rejected/duplicated entries.
 - Record artifact decisions and acceptance in metadata and Beads.
 - Exit only when `clarification-gate` is satisfied.
@@ -182,8 +197,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Required skills: `beads-operations`, `developer-session-lifecycle`, `planning-guidance`, `planning-gate`, `owner-evidence-discipline`, and `report-output-discipline` when reports are long or reusable.
 - Establish or resume the primary developer session and start design/planning.
-- Send the original prompt, then the accepted clarifications and requirements. Then read the installed `phase-1-design-prompt.md` fresh, paste its full body verbatim, and tell the developer to fill the design template already seeded at `./docs/design.md`.
-- After design/API docs are accepted, delegate owner-private `../.ai/plan.md` creation to a general owner-side subagent. Read the installed `phase-2-execution-planning-prompt.md` and `phase-2-plan-template.md` fresh. Paste both bodies verbatim into the subagent message.
+- Follow the deterministic planning sequence in `planning-guidance` exactly: (1) send original prompt with only the required planning/placeholder sentences appended, (2) after acknowledgement send clarifications, (3) after acknowledgement send the design prompt verbatim.
+- Delegate owner-private `../.ai/plan.md` creation to a general owner-side subagent. Read the installed `~/slopmachine/phase-2-execution-planning-prompt.md` and `~/slopmachine/phase-2-plan-template.md` fresh. Paste both bodies verbatim into the subagent message.
 - Record session and artifact decisions in metadata and Beads.
 - Exit only when `planning-gate` is satisfied.
@@ -195,19 +210,23 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Prompt in casual human language using only visible project context.
 - Use internal planning privately for review and module acceptance.
 - Do not send more than the current module/slice, or two adjacent tightly coupled slices, in a single developer prompt.
+- **Start the application locally at scaffold acceptance and at every module boundary.** Do not accept a scaffold or module based on test output alone. Verify the app starts, is reachable, and the relevant surface works through at least one real flow. If the app does not start, reject the result and send it back to the developer.
+- **Verify cross-module integration tests exist at each module boundary.** When a new module connects to previously built modules, confirm the developer wrote integration tests proving real data/behavior flow between them. If no cross-module tests exist, send that back as a gap.
 - Record session turns, issues, verification evidence, and module acceptance in metadata and Beads.
 - After all modules are complete, ask the same session to check the implementation against the design/API docs and provide startup commands plus expected flows.
-- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
+- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the app has been started and verified at every module boundary, cross-module integration tests exist, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
 ### Phase 4: Integrated Verification And Hardening
 - Required skills: `beads-operations`, `developer-session-lifecycle`, `integrated-verification`, `verification-gates`, `owner-evidence-discipline`, and `report-output-discipline` when notes/reports are long or reusable.
 - Close normal work in the original development session and establish a new bugfix session.
 - Run owner-side plan-based review, internal evaluator discovery loop, and local non-Docker verification.
-- For the internal evaluator loop, read the installed `backend-evaluation-prompt.md` or `frontend-evaluation-prompt.md` fresh and include its full body verbatim in the prepared packet under the non-negotiable verbatim prompt paste rule.
+- For the internal evaluator loop, read the installed `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` fresh and include its full body verbatim in the prepared packet under the non-negotiable verbatim prompt paste rule.
+- **Run all 5 evaluator passes.** Do not skip passes or stop early unless the evaluator produces zero new findings in two consecutive passes. 5 passes is the minimum, not a target.
+- **For web/fullstack projects, run browser verification with agent-browser.** Exercise every README credential, every core user journey, and key prompt requirements. Route browser-found failures to the bugfix lane. Do not close Phase 4 without browser verification for web/fullstack projects.
 - Send issues to the bugfix session in broad human language.
 - Record sessions, issue lists, reports, fixes, verification evidence, and closure decisions in metadata and Beads.
-- Exit only when owner plan-based review issues are fixed, internal evaluator loop has completed, local non-Docker verification has passed, and README/runtime/test surfaces are coherent enough for final evaluation.
+- Exit only when owner plan-based review issues are fixed, all 5 internal evaluator passes have completed, browser verification has run (web/fullstack), local non-Docker verification has passed, and README/runtime/test surfaces are coherent enough for final evaluation.
 ### Phase 5: Evaluation And Fix Verification
@@ -217,7 +236,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Each audit cycle must close with both a rich 150+ line `./.tmp/audit_report-<N>.md` and `./.tmp/audit_report-<N>-fix_check.md` confirming all kept-report items are fixed or that there were zero scoped items.
 - Preserve reports, extract complete issue sets, and route fixes in broad human language.
 - After both audit cycles, close the bugfix lane and start a test-coverage/final-reconciliation lane.
-- Complete only when both Audit Cycle 1 and Audit Cycle 2 are complete with kept audit reports and fix-check reports, the bugfix lane is closed, and the coverage/README audit passes with at least 90% test score.
+- Exit only when both Audit Cycle 1 and Audit Cycle 2 are complete with kept audit reports and fix-check reports, the bugfix lane is closed, and the coverage/README audit passes with at least 90% test score.
 - Treat README hard-gate failures, missing true endpoint coverage, missing frontend unit tests for web/fullstack, and missing FE-BE proof as reconciliation work for the active lane before this phase closes.
 ### Phase 6: Final Readiness Decision
@@ -227,8 +246,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Run final runtime and test checks appropriate to the project.
 - Run `./repo/run_tests.sh` when present or required by the scaffold contract.
 - Run `docker compose up --build` for container-supported web/backend/fullstack projects unless explicitly out of scope.
-- Use `agent-browser` for browser-accessible apps to exercise the core prompt requirements, main user journeys, and every README-listed demo credential, role/state, seeded value, example ID/status, and documented default. Use API/platform-equivalent checks for non-browser projects.
-- If Docker, runtime, browser, or `run_tests.sh` fails, route the failure to the currently active developer session in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
+- Use the installed `agent-browser` skill to exercise browser-accessible apps. Load the skill and use its tools to verify every prompt requirement surface (core flows, all roles, all seeded values), every README-listed credential/role/seeded value, and every core user journey from start to task closure. Test multiple surfaces across several runs and batch all findings into one consolidated issue list before sending to the developer lane — do not route issues surface by surface.
+- If Docker, runtime, browser, or `run_tests.sh` fails, route consolidated issues to the currently active developer session in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
 - Route final reconciliation work to the active developer session whenever it is more than a tiny, safe owner-side edit. If the owner makes a minor direct safe fix, send a minimal note to the active developer session describing the changed surface and ask it to inspect/acknowledge before continuing.
 - Use platform-equivalent checks for Android, iOS, desktop, or other native projects.
 - Do not pass readiness with unresolved blocker/high findings, unverified runtime claims, README drift, or known fake behavior.
@@ -243,6 +262,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Do not package workflow-private `../.ai`, `../.beads`, hidden session state, owner plans, raw evaluator workspaces, or task-root rulebooks unless the packaging spec explicitly requires them.
 - Run final package boundary checks before closing.
 - If packaging, cleanup, README edits, config, or seed/runtime changes could affect documented behavior, rerun the affected Docker/runtime, `run_tests.sh`, and browser/API seeded-value checks before closing.
+- Exit only when `submission-packaging` closure standard is satisfied: final package structure matches allowlist, README/lint/runtime/test/scripts/docs/audit artifacts are consistent, stale artifacts absent, session exports complete, and exact verification commands/results recorded.
 ### Phase 8: Retrospective
@@ -251,12 +271,14 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Separate workflow issues from product implementation issues.
 - Capture what failed, what worked, what should change next run, and which issues are systemic.
 - Preserve evidence without rewriting delivery history.
+- Exit only when retrospective is written, all mandatory evidence sources reviewed, and no real packaging/delivery defect remains open.
 ## Runtime And Quality Standards
 - `./repo/run_tests.sh` is the broad product verification wrapper when present or required.
-- Unit tests belong under `unit_tests/` where that convention exists.
-- API/integration HTTP tests belong under `API_tests/` where that convention exists.
+- **`./repo/run_tests.sh` must always run through Docker** (dockerized). The owner defers all Dockerized tests and Docker builds to Phase 6/7 — never run them during earlier phases.
+- Unit tests must live under `unit_tests/`.
+- API/integration HTTP tests must live under `API_tests/`.
 - Fullstack/backend-backed frontend work must prove real frontend-to-backend behavior through user-visible flows unless accepted design explicitly marks a capability internal/API-only.
 - Security, authorization, ownership, isolation, validation, error handling, logging, config, seeded data, and README claims must align with delivered behavior.
 - README must truthfully document project type near the top, startup, tests, configuration, access, demo credentials and all roles or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, mock/local/debug boundaries, and known limitations.
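
The asset-path rule in the verbatim prompt paste section of this file could be sketched roughly as follows. This is a minimal illustration, not the package's own code: the function names `assets_root` and `read_prompt_verbatim` are hypothetical, and the precedence of `$SLOPMACHINE_HOME` over `~/slopmachine/` is an assumption, since the text only says the directory is one "or" the other.

```python
import os
from pathlib import Path


def assets_root(env=None):
    """Return the assumed SlopMachine assets directory.

    Assumption: $SLOPMACHINE_HOME, when set, takes precedence over ~/slopmachine/.
    The source text lists both locations without stating an order.
    """
    env = os.environ if env is None else env
    home_override = env.get("SLOPMACHINE_HOME")
    if home_override:
        return Path(home_override) / "slopmachine"
    return Path.home() / "slopmachine"


def read_prompt_verbatim(name, env=None):
    """Read a packaged prompt file fresh and return its body unchanged."""
    path = assets_root(env) / name
    # No stripping, summarizing, or prefacing: the body is returned as stored.
    return path.read_text(encoding="utf-8")
```

A caller would pass the returned string straight into the worker message, for example `read_prompt_verbatim("clarifier-agent-prompt.md")`, rather than sending the path or a summary.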

package/assets/claude/agents/developer.md CHANGED

@@ -41,7 +41,11 @@ All communication, code comments, docs, tests, and user-facing strings you add m
 - Tests must prove behavior and side effects, not only existence or rendering.
 - Add or update tests for every implementation change. Target full meaningful coverage of delivered behavior, not just a smoke path.
-- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for user-facing flows.
+- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for every user-facing requirement. E2E tests must exercise real application behavior end to end and verify business outcomes — state changes, data persistence, authorization enforcement, task closure — not just confirm pages render. An E2E test that only checks a page loads without asserting what actually happened is decorative and incomplete.
+- Tests placed in `unit_tests/` and `API_tests/` must be directly runnable from those directories. They must not be build-tag-gated evidence copies, compile-time-only files, or infrastructure checks that only verify file counts, builds, or presence. Every test in these directories must exercise and verify specific business behavior.
+- API tests must assert exact expected state transitions, status codes, response bodies, and side effects — not permissive "accept any valid response" checks. A test that accepts multiple valid-ish outcomes without verifying the specific expected result is insufficient.
+- Frontend tests that hit real backend paths must use the actual API client and real handler/service/data execution. Do not mock API boundaries when FE-BE integration behavior is part of the requirement. Mocking is acceptable only for truly external dependencies, not for the project's own backend.
+- Unit tests should also have strict assertions: verify exact expected state, not approximate or lenient checks.
 - API/integration tests should exercise the real route/interface and business logic without mocking the transport, controller, or execution-path services unless there is a documented reason this is not possible.
 - Frontend unit/component tests should be directly detectable and should import or render the real frontend components/modules they cover.
 - Cover negative and boundary paths when relevant: unauthenticated, unauthorized, not found, conflicts, invalid input, empty states, duplicate actions, object ownership, and sensitive data exposure.
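
The difference between strict and permissive assertions described in this hunk might look like the sketch below. Everything here is hypothetical: `InvoiceStore` is an in-memory stand-in used only to keep the example self-contained, and the status codes and field names are illustrative. Under the rules above, a real API test would call the actual HTTP route and handler instead of a stand-in.

```python
class InvoiceStore:
    """Hypothetical in-memory stand-in for a real route, handler, and data layer."""

    def __init__(self):
        self.invoices = {}

    def create(self, invoice_id, amount):
        # Duplicate creation is a conflict, not a silent overwrite.
        if invoice_id in self.invoices:
            return 409, {"error": "invoice already exists"}
        self.invoices[invoice_id] = {"id": invoice_id, "amount": amount, "status": "draft"}
        return 201, dict(self.invoices[invoice_id])


def test_create_invoice_strict():
    store = InvoiceStore()

    # Exact status code and exact response body, not "any 2xx".
    status, body = store.create("inv-1", 1200)
    assert status == 201
    assert body == {"id": "inv-1", "amount": 1200, "status": "draft"}

    # Side effect: the record was actually persisted in the expected state.
    assert store.invoices["inv-1"]["status"] == "draft"

    # Negative path: a duplicate action yields one specific outcome.
    status, body = store.create("inv-1", 1200)
    assert status == 409
    assert body == {"error": "invoice already exists"}
    assert len(store.invoices) == 1
```

A permissive variant such as `assert status in (200, 201, 409)` would pass on both calls and therefore prove nothing about which transition happened, which is the pattern the hunk rules out.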

package/assets/skills/clarification-gate/SKILL.md CHANGED

@@ -42,8 +42,8 @@ Do not pad `./docs/questions.md` with AI-inferred missing requirements, speculat
 Phase 1 must follow the owner-level non-negotiable verbatim prompt paste rule defined in the owner agent (`slopmachine.md` or `slopmachine-claude.md`). That rule requires: read the installed `.md` file fresh with a `read` tool call, then paste its **complete body verbatim** into the subagent message. Do not summarize, describe, shorten, paraphrase, add preface/footer, or send a file path reference.
 The packaged prompt files for Phase 1 are:
-- `clarifier-agent-prompt.md` — first worker
-- `clarification-faithfulness-review-prompt.md` — faithfulness review worker
+- `~/slopmachine/clarifier-agent-prompt.md` — first worker
+- `~/slopmachine/clarification-faithfulness-review-prompt.md` — faithfulness review worker
 ## Root Metadata Gate
@@ -72,13 +72,19 @@ Phase 1 cannot close if root `./metadata.json.prompt` is missing, stale, or cont
    - Record any metadata correction in `../.ai/metadata.json` and Beads without exposing workflow metadata to implementation sessions.
  2. **Run the general clarification worker.**
-   - Read the installed `clarifier-agent-prompt.md` file fresh from its asset path using a `read` tool call.
+   - Read the installed `~/slopmachine/clarifier-agent-prompt.md` file fresh from its asset path using a `read` tool call.
    - Paste that file's **complete body verbatim** into the sent worker message under the non-negotiable verbatim paste rule.
    - After the packaged prompt body, inject only the original prompt and supporting stack/context notes; do not prepend or append a second owner-written clarification contract, and do not tell the worker to read the packaged file itself.
    - Require both `./docs/questions.md` and `../.ai/requirements-breakdown.md` as output.
    - After the worker returns, record both artifact paths in `../.ai/metadata.json` and add a Beads `ARTIFACT:` comment.
  3. **Review `questions.md` and `../.ai/requirements-breakdown.md` critically.**
+   - `./docs/questions.md` must use the exact format defined in `clarifier-agent-prompt.md`:
+     - Level-1 heading `# Questions`
+     - Each entry starts with `### <number>. <title>` (e.g. `### 1. User roles`)
+     - Each entry has exactly three fields: `- Question:`, `- My Understanding:`, `- Solution:`
+     - No requirement IDs, traceability fields, priority fields, or evaluator-risk metadata in `questions.md`
+   - Reject `questions.md` if the format deviates. Patch only trivial formatting issues.
    - It must extract the core requirements from the prompt explicitly.
    - It must use evaluation-grade extraction depth: business goal, main flows, actors, required surfaces, modules, APIs/jobs/data, security boundaries, mock/fake boundaries, documentation/static-verifiability expectations, test/coverage expectations, frontend state obligations, and FE-BE wiring expectations when applicable.
    - Those requirements must be defined in enough depth that design and planning can rely on them directly.
@@ -97,7 +103,7 @@ Phase 1 cannot close if root `./metadata.json.prompt` is missing, stale, or cont
  4. **Run prompt-faithfulness review.**
    - Launch one short-lived faithfulness review worker.
    - Send the original prompt, the supporting stack/context notes, `../.ai/requirements-breakdown.md`, and `./docs/questions.md` together.
-   - Read the installed `clarification-faithfulness-review-prompt.md` file fresh from its asset path.
+   - Read the installed `~/slopmachine/clarification-faithfulness-review-prompt.md` file fresh from its asset path.
    - Paste that file's **complete body verbatim** as the review instruction under the non-negotiable verbatim paste rule.
    - Require it to write `../.ai/clarification-faithfulness-review.md`.
    - After the review returns, record the review path and verdict in `../.ai/metadata.json` and add a Beads `ARTIFACT:` or `VERIFY:` comment.
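
The `questions.md` format check added in step 3 of this skill could be approximated with a small validator like the one below. It is a rough sketch under stated assumptions: the function name `check_questions_format` is hypothetical, the three fields are assumed to appear in the listed order, and only top-level `- ` bullets are treated as fields.

```python
import re

# Entry headings look like "### 1. User roles".
ENTRY_HEADING = re.compile(r"^### (\d+)\. \S.*$")
# Assumption: the three fields must appear in exactly this order.
REQUIRED_FIELDS = ["- Question:", "- My Understanding:", "- Solution:"]


def check_questions_format(text):
    """Return a list of format problems; an empty list means the checks pass."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    problems = []

    has_title = bool(lines) and lines[0] == "# Questions"
    if not has_title:
        problems.append("first non-blank line must be '# Questions'")
    body = lines[1:] if has_title else lines

    entries = []  # each item is (heading line, list of field prefixes seen)
    for line in body:
        if ENTRY_HEADING.match(line):
            entries.append((line, []))
        elif line.startswith("- ") and entries:
            # Keep only the "- Name:" prefix of a top-level bullet.
            entries[-1][1].append(line.split(":", 1)[0] + ":")

    if not entries:
        problems.append("no '### <number>. <title>' entries found")
    for heading, fields in entries:
        if fields != REQUIRED_FIELDS:
            problems.append(f"{heading}: expected {REQUIRED_FIELDS}, found {fields}")
    return problems
```

An extra field such as `- Priority:` shows up as a mismatch for that entry, which matches the rule that priority and traceability metadata do not belong in `questions.md`.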

package/assets/skills/claude-worker-management/SKILL.md CHANGED

@@ -15,6 +15,10 @@ The owner must use Claude only through the packaged live scripts for product imp
 ## Lane Policy
+- Sessions are the primary deliverable. An incomplete or corrupted Claude session dataset invalidates the submission. Preserve every session file intact — never edit, rename, restructure, clean, delete, or fabricate them.
+- Sessions must progress strictly forward. The lifecycle is: `develop-1` → close → `bugfix-1` → close → `test-coverage-1` → close. Never return to a closed session.
+- If a lane's session becomes genuinely unrecoverable (crash with no salvageable `sid` — even after attempting tmux relaunch with the known `sid` — and transcript/session lookup also fails), start a new session in the same lane with a sequential number (`develop-2`). Sessions remain sequential and a clear timeline can be established. This is the only exception to one-session-per-lane. Paused, rate-limited, or waiting states are not unrecoverable — stay in the same session.
+- A paused session is not an invitation to launch a new one. Rate limits, slow turns, shell timeouts, tmux interruptions, and recovery conditions always stay in the same lane. Only launch a new session if recovery is absolutely impossible.
 - Exactly one Claude implementation lane is active at a time. The active lane must correspond to the current phase purpose and be named in `../.ai/metadata.json` before any launch, resume, status check, or turn.
 - Every Claude session ever used must be registered in `../.ai/metadata.json` and Beads with lane name, `sid`, runtime directory, state/result files, current status, and purpose. Unregistered Claude turns are not allowed.
 - Default development lane: `develop-1`.
@@ -39,17 +43,23 @@ Claude-facing messages should be short and natural. Write like a friendly lead e
 Use wording like:
 ```text
-Here is the product brief. We're planning first, so don't write any code yet. Read this and be ready to help turn it into the design doc.
+<original product prompt from metadata.json>
-<original prompt verbatim>
+Don't write code yet — we'll plan this first.
 ```
-Then later:
+That is the entire first message. No introduction, no context, no clarifications. Then wait for acknowledgement.
+After acknowledgement, send:
 ```text
-Use the accepted clarifications below to create docs/design.md from the design template. Keep this as a design document, not an implementation checklist. If an API contract is needed, note that so we can fill docs/api-spec.md next.
+Here are some clarifications I made:
+<accepted clarifications and requirements>
 ```
+Wait for acknowledgement before sending the design prompt in the next step.
+Then send the design prompt with its opening adjusted (see `planning-guidance` Step 3) to reference the already-provided prompt.
 When the work has independent parts, include a natural reminder such as:
 ```text