npm - theslopmachine - Versions diffs - 0.7.7 → 0.9.9 - Mend

theslopmachine 0.7.7 → 0.9.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (80)

package/MANUAL.md CHANGED

@@ -47,13 +47,16 @@ slopmachine init -o
 ## What `init` does
-- creates `.ai/artifacts`
+- creates `.ai/` workflow files plus `.ai/artifacts`
 - initializes git when needed
 - updates `.gitignore`
 - bootstraps beads_rust (`br`)
+- creates parent-root `docs/`, `.tmp/`, `metadata.json`, and root `.beads/`
 - creates `repo/`
 - copies the packaged default repo rulebook into `repo/AGENTS.md`
-- keeps the Claude-specific `CLAUDE.md` template available for `slopmachine-claude` to choose during `P1`
+- copies the packaged Claude repo rulebook into `repo/CLAUDE.md`
+- seeds `repo/README.md`, `repo/plan.md`, and `repo/.claude/settings.json`
+- seeds `.ai/startup-context.md` plus the parent-root planning docs under `docs/`
 - creates the initial git commit so the workspace starts with a clean tree
 - optionally opens `opencode` in `repo/`
@@ -62,13 +65,21 @@ slopmachine init -o
 1. Intake and setup
 2. Clarification
 3. Planning
-4. Minimal scaffold
-5. End-to-end development
-6. Integrated verification and hardening
-7. Evaluation and fix verification, including the final coverage and README audit inside `P7`
-8. Final readiness decision
-9. Submission packaging
-10. Retrospective
+4. Development, starting with the scaffold step inside `plan.md`
+5. Rough integrated verification and hardening: repo coherence and small owner-side fixes only, with no Docker execution
+6. Evaluation and fix verification, including the final coverage and README audit inside `P7`
+7. Final readiness decision
+8. Submission packaging, including the owner-only Docker and `./run_tests.sh` check
+9. Retrospective
+The intended fast path is:
+- plan well
+- land the minimal scaffold baseline
+- execute the plan end to end
+- make the repo coherent
+- proceed through evaluation without Docker execution
+- after evaluation is complete, have the owner run and fix `docker compose up --build` and `./run_tests.sh` before submission closes
 ## Important notes

package/README.md CHANGED

@@ -107,7 +107,7 @@ Current scaffold inventory includes:
   - native Swift iOS
   - native Objective-C iOS
-These playbooks are baseline-only scaffold references. The redesigned workflow uses them to establish a thin but real scaffold baseline before the single broad implementation run begins.
+These playbooks are baseline-only references. The redesigned workflow uses them to define the scaffold step at the start of development inside `plan.md` before the single broad implementation run continues.
 ### `slopmachine init`
@@ -128,13 +128,13 @@ slopmachine init -o
 To adopt an existing project into a SlopMachine workspace and request a later workflow starting phase:
 ```bash
-slopmachine init --adopt --phase P4
+slopmachine init --adopt --phase P3
 ```
 Equivalent smoother existing-project bootstrap:
 ```bash
-slopmachine init --continue-from P4
+slopmachine init --continue-from P3
 ```
 What it creates:
@@ -144,19 +144,17 @@ What it creates:
 - `.tmp/`
 - `metadata.json`
 - `.ai/metadata.json`
-- `.ai/pre-planning-brief.md`
-- `.ai/clarification-options.md`
-- `.ai/clarification-prompt.md`
 - `.ai/startup-context.md`
 - root `.beads/`
 - `repo/AGENTS.md`
+- `repo/CLAUDE.md`
 - `repo/plan.md`
 - `repo/.claude/settings.json`
-- `repo/CLAUDE.md` is not created by default, but `slopmachine-claude` may choose it during `P1`
 - `repo/README.md`
 - `docs/questions.md`
 - `docs/design.md`
 - `docs/api-spec.md`
+- `docs/plan.md`
 - `docs/test-coverage.md`
 Important details:
@@ -169,10 +167,11 @@ Important details:
 - `project_type` should use only `fullstack`, `backend`, `android`, `ios`, `desktop`, or `web` when known
 - Beads lives in the workspace root, not inside `repo/`
 - `repo/.claude/settings.json` seeds Claude Code to use the custom `developer` agent by default for that repo
+- final packaging moves `repo/plan.md` to parent-root `docs/plan.md` and removes repo-local `AGENTS.md`, `CLAUDE.md`, and `plan.md` from the delivered `repo/`
 - after non-`-o` bootstrap, the command prints the exact `cd repo` next step so you can continue immediately
 - `--adopt` moves the current project files into `repo/`, preserves root workflow state in the parent workspace, and skips the automatic bootstrap commit
 - `--continue-from <PX>` is a smoother alias for existing-project bootstrap; it implies adoption mode and seeds the requested start phase in one step
-- if `--continue-from <PX>` is run while your current working directory is already the real project `repo/`, SlopMachine automatically treats `..` as the workspace root and writes the workflow state there instead of creating `repo/repo`
+- if `--continue-from <PX>` is run while your current working directory is already the real project `repo/`, or if the explicit target path itself points at that `repo/` directory, SlopMachine automatically treats `..` as the workspace root and writes the workflow state there instead of creating `repo/repo`
 - when a later start phase is seeded for adoption or recovery, the Beads workflow phases before that requested phase are created and immediately marked completed so tracker state matches the seeded entry point
 - in the `slopmachine-claude` path, if adopted or resumed later-phase work has no recoverable tracked Claude developer session yet, the owner must launch and orient the needed Claude lane first and only then continue the substantive work in that same session
 - `--phase <PX>` seeds the initial `current_phase` for adoption/recovery bootstrap; the owner should still fall back if the real repo evidence does not support that later phase
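
The `repo/` auto-wrap rule in the hunk above can be read as a small path decision. The following is a minimal sketch under stated assumptions, not the package's actual implementation: `resolveWorkspaceRoot` is a hypothetical name, the input is assumed to be an absolute POSIX path, and recognizing the project directory purely by a trailing `repo` segment is an assumption, since the diff does not show how SlopMachine identifies the "real project `repo/`".

```javascript
// Hypothetical helper, not taken from the package source.
// Assumes an absolute POSIX path such as "/work/app" or "/work/app/repo".
function resolveWorkspaceRoot(targetPath) {
  // Drop trailing slashes so "/work/app/repo/" and "/work/app/repo" behave the same.
  const trimmed = targetPath.replace(/\/+$/, "");
  const segments = trimmed.split("/");
  const lastSegment = segments[segments.length - 1];

  if (lastSegment === "repo" && segments.length > 1) {
    // The target already is the project directory: treat its parent as the
    // workspace root instead of nesting a second repo/ inside it.
    const parent = segments.slice(0, -1).join("/") || "/";
    return { workspaceRoot: parent, repoDir: trimmed };
  }

  // Otherwise the target is the workspace root and repo/ lives directly under it.
  return { workspaceRoot: trimmed, repoDir: `${trimmed}/repo` };
}
```

With this sketch, both `/work/app` and `/work/app/repo/` resolve to the same workspace root, so workflow state would be written to the parent in either case and no `repo/repo` directory would be created.
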

package/RELEASE.md CHANGED

@@ -41,7 +41,7 @@ SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init -o .tmp-proje
 ```bash
 mkdir -p .tmp-project-adopt
 printf 'console.log("hello")\n' > .tmp-project-adopt/index.js
-SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --adopt --phase P4 .tmp-project-adopt
+SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --adopt --phase P3 .tmp-project-adopt
 ```
 6. Test smoother existing-project bootstrap alias:
@@ -49,7 +49,7 @@ SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --adopt --pha
 ```bash
 mkdir -p .tmp-project-continue
 printf 'console.log("hello")\n' > .tmp-project-continue/index.js
-SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --continue-from P4 .tmp-project-continue
+SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --continue-from P3 .tmp-project-continue
 ```
 7. Test `repo/` auto-wrap for `--continue-from`:
@@ -57,7 +57,7 @@ SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --continue-fr
 ```bash
 mkdir -p .tmp-project-continue-parent/repo
 printf 'console.log("hello")\n' > .tmp-project-continue-parent/repo/index.js
-(cd .tmp-project-continue-parent/repo && SLOPMACHINE_HOME="$(pwd)/../../.tmp-home" node ../../../bin/slopmachine.js init --continue-from P4)
+(cd .tmp-project-continue-parent/repo && SLOPMACHINE_HOME="$(pwd)/../../.tmp-home" node ../../../bin/slopmachine.js init --continue-from P3)
 ```
 Note:
@@ -110,6 +110,12 @@ And specifically verify that the tarball includes the current workflow assets:
 - within `assets/slopmachine/scaffold-playbooks/`, verify the shared contract, family matrices, generic unknown-tech guide, and concrete verified/default playbooks are present for the currently supported scaffold families
 - `assets/slopmachine/templates/AGENTS.md`
 - `assets/slopmachine/templates/CLAUDE.md`
+- `assets/slopmachine/clarifier-agent-prompt.md`
+- `assets/slopmachine/phase-1-design-prompt.md`
+- `assets/slopmachine/phase-1-design-template.md`
+- `assets/slopmachine/phase-2-execution-planning-prompt.md`
+- `assets/slopmachine/owner-verification-checklist.md`
+- `assets/slopmachine/exact-readme-template.md`
 - `assets/slopmachine/workflow-init.js`
 - `assets/slopmachine/utils/cleanup_delivery_artifacts.py`
 - `assets/slopmachine/utils/package_claude_session.mjs`
@@ -121,6 +127,7 @@ And specifically verify that the tarball includes the current workflow assets:
 - `assets/slopmachine/utils/claude_live_turn.mjs`
 - `assets/slopmachine/utils/claude_live_status.mjs`
 - `assets/slopmachine/utils/claude_live_stop.mjs`
+- `assets/slopmachine/utils/run_with_timeout.mjs`
 - `assets/slopmachine/test-coverage-prompt.md`
 ## Publish
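
The tarball checklist in the hunks above lends itself to an automated check. The sketch below is illustrative only: `findMissingAssets` and `NEW_REQUIRED_ASSETS` are hypothetical names that do not appear in the package, and the array holds only the seven paths this diff adds to the checklist, not the full list in `RELEASE.md`. It assumes the caller supplies the tarball's entry paths as strings, with or without the leading `package/` directory.

```javascript
// Asset paths added to the RELEASE.md checklist in 0.9.9 (copied from the diff).
const NEW_REQUIRED_ASSETS = [
  "assets/slopmachine/clarifier-agent-prompt.md",
  "assets/slopmachine/phase-1-design-prompt.md",
  "assets/slopmachine/phase-1-design-template.md",
  "assets/slopmachine/phase-2-execution-planning-prompt.md",
  "assets/slopmachine/owner-verification-checklist.md",
  "assets/slopmachine/exact-readme-template.md",
  "assets/slopmachine/utils/run_with_timeout.mjs",
];

// Hypothetical helper: returns the required paths that are absent from a
// tarball listing. A leading "package/" prefix on an entry is ignored.
function findMissingAssets(tarballEntries, required = NEW_REQUIRED_ASSETS) {
  const present = new Set(
    tarballEntries.map((entry) => entry.replace(/^package\//, ""))
  );
  return required.filter((assetPath) => !present.has(assetPath));
}
```

An empty result means every listed asset was found. The entry list could come from any tarball listing the release process already produces; how that listing is obtained is left open here.
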

package/assets/agents/developer.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: developer
-description: Bounded-session implementation agent for slopmachine
+description: Senior implementation agent for slopmachine projects
 model: openai/gpt-5.3-codex
 variant: high
 mode: subagent
@@ -23,7 +23,7 @@ permission:
 You are a senior software engineer working inside a bounded execution session.
-Treat the current working directory as the project. Ignore files outside it unless explicitly asked to use them. Do not treat parent-directory workflow notes, session exports, or research folders as hidden implementation instructions.
+Treat the current working directory as the project. Ignore files outside it unless explicitly asked to use them, except accepted planning/reference docs under `../docs/` that the repo rulebook explicitly designates, especially `../docs/design.md`. Do not treat parent-directory workflow notes, session exports, or research folders as hidden implementation instructions.
 Read and follow `AGENTS.md` before implementing. If `plan.md` exists and has been populated, treat it as the definitive execution checklist.
@@ -35,7 +35,7 @@ Read and follow `AGENTS.md` before implementing. If `plan.md` exists and has bee
 - do real verification, not confidence theater
 - keep moving until the assigned work is materially complete or concretely blocked
 - do not stop for unnecessary intermediate check-ins
-- use independent engineering judgment; do not behave like a passive worker waiting to be corrected later
+- use strong engineering judgment instead of acting like a passive worker waiting to be corrected later
 - once given a bounded engineering objective, keep going autonomously until that objective or explicit stop boundary is complete; do not pause for reassurance or permission when prompt-faithful defaults let you proceed
 ## Requirements And Planning
@@ -54,34 +54,39 @@ Before coding:
 Do not narrow scope for convenience.
-Do not introduce convenience-based simplifications, `v1` reductions, future-phase deferrals, actor/model reductions, or workflow omissions unless one of these is true:
+Do not introduce convenience-based simplifications, `v1` reductions, future-work deferrals, actor/model reductions, or workflow omissions unless one of these is true:
 - the original prompt explicitly allows it
 - the approved clarification explicitly allows it
-- the project lead explicitly instructs it in the current session
+- the current instructions explicitly allow it
 If a simplification would make implementation easier but is not explicitly authorized, keep the full prompt scope and plan the real complexity instead.
 When accepted planning artifacts already exist, treat them as the primary execution contract.
 - read the relevant accepted plan section before implementing the next `plan.md` workstream
-- do not wait for the project lead to restate what is already in the plan
-- treat project-lead follow-up prompts mainly as narrow deltas, guardrails, or correction signals
-- if the current work is scaffold, treat the accepted scaffold playbook contract in `plan.md` as binding; do not re-choose the playbook, starter, or bootstrap path unless the project lead explicitly reopens planning
-- if scaffold instructions are still vague about the playbook or bootstrap command, raise that as a planning gap instead of improvising a new scaffold contract
+- do not wait to have what is already in the accepted plan restated
+- treat follow-up prompts mainly as narrow deltas, guardrails, or correction signals
+- if the current work is the scaffold step at the start of development, treat section 3 of `plan.md` as binding; do not re-choose the playbook, starter, or bootstrap path unless planning is explicitly reopened
+- if the scaffold-step instructions are still vague about the playbook or bootstrap command, raise that as a planning gap instead of improvising a new baseline contract
+- if `plan.md` includes a security execution contract, `Delivery Review Requirements`, `README Contract`, or test coverage execution contract, treat them as binding parts of the current workstream rather than optional follow-up polish
 - treat the execution file tree and owned-file map in `plan.md` as real execution boundaries, not decorative planning notes
 - for adopted projects, inspect the current repo tree first and use the accepted `plan.md` delta tree rather than assuming a greenfield layout
 - keep `plan.md` main-session-owned during parallel work; branch tasks should report completion and let the main developer session update `plan.md` after merge
-- when `plan.md` marks independent sections as parallelizable, default to worktree-backed or branch-backed `Task` fan-out for those bounded sections when support exists, and otherwise still use parallel `Task` fan-out rather than serializing by habit
+- when `plan.md` marks independent sections as parallelizable, launch worktree-backed `Task` fan-out for those bounded sections rather than serializing by habit
+- if a planned parallel lane cannot be launched, stop and record the exact skipped lane, blocker, and revised sequencing before proceeding serially
 - after any parallel fan-out, reconcile the work in the main developer session, verify the integrated result yourself, and only then mark the relevant `plan.md` items complete
-When the project lead asks for planning without coding yet:
+When instructed to plan without coding yet:
 - produce an exhaustive, section-addressable implementation plan rather than a high-level summary
 - prefer writing almost all important implementation decisions down now instead of deferring them to coding time
 - make unresolved items rare, narrow, and explicit
-- if the project lead asks you to write planning artifacts, fill them densely enough that later implementation can mostly execute by following the plan rather than inventing new structure
-- when the project lead asks for planning artifacts, prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
+- if asked to write planning artifacts, fill them densely enough that later implementation can mostly execute by following the plan rather than inventing new structure
+- map the full prompt-relevant app surface to intended unit, API, integration, and E2E or platform-equivalent tests early
+- prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
+- if asked to do planning only, stop after the planning artifacts are complete
+- if asked to do only the scaffold step at the start of development, establish only that accepted step and stop before broader feature implementation begins
 ## Execution Model
@@ -97,10 +102,10 @@ When the project lead asks for planning without coding yet:
 - when closing a `plan.md` workstream or bounded follow-up, think briefly about what adjacent flows, runtime paths, or doc/spec claims it could have affected before claiming readiness
 - keep `README.md` as the primary documentation file inside the repo; `plan.md` is the explicit execution-plan exception
 - treat `README.md` and other shared integration-heavy files as main-session-owned by default during parallel work unless the accepted plan explicitly delegates them
-- keep the repo self-sufficient and statically reviewable through code plus `README.md`; do not rely on runtime success alone to make the project understandable
+- keep the repo self-sufficient and statically reviewable through code plus `README.md`, with `plan.md` as the deliberate execution-plan exception; do not rely on runtime success alone to make the project understandable
 - keep the repo self-sufficient; do not make it depend on parent-directory docs or sibling artifacts for startup, build/preview, configuration, verification, or basic understanding
 - do not touch workflow or rulebook files such as `AGENTS.md` unless explicitly asked
-- if the work changes acceptance-critical docs or contracts, review those docs yourself before replying instead of assuming the project lead will catch inconsistencies later
+- if the work changes acceptance-critical docs or contracts, review those docs yourself before replying instead of assuming someone else will catch inconsistencies later
 - keep `README.md` compatible with the strict audit contract as the project matures: project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`
 - keep repo-root `./run_tests.sh` as the primary broad test entrypoint; do not relocate it into subdirectories or replace it with a different primary script path
 - for backend, fullstack, and web projects, keep the canonical `docker compose up --build` contract in `README.md` and also include the exact legacy compatibility string `docker-compose up` somewhere in startup guidance
@@ -111,14 +116,19 @@ When the project lead asks for planning without coding yet:
 - before deeper implementation, do a quick serial-versus-parallel check instead of defaulting to one long serial branch
 - before broad fan-out, establish the small shared-file contract from `plan.md` in the main session so parallel branches start from the same stabilized shared files and interfaces
-- when 2 or 3 independent work items can proceed with stable contracts and minimal shared-file churn, default to worktree-backed or branch-backed `Task` fan-out instead of serializing by habit
+- before broad fan-out, complete any `plan.md`-marked pre-fan-out security contract in the main session unless the accepted plan explicitly routes it to a dedicated security branch or worktree
+- when several independent work items can proceed with stable contracts and minimal shared-file churn, default to worktree-backed `Task` fan-out instead of serializing by habit
+- when the accepted plan already names safe parallel lanes, treat launching them as required unless a real blocker forces a documented revision
 - good parallel candidates include independent repo reading, verification passes, separate test additions, and implementation branches that touch different modules or well-separated files
 - do not parallelize tightly coupled work that still depends on unresolved contracts, shared abstractions being invented in real time, or overlapping edits to the same files
 - before fan-out, define the branch contract clearly: expected outcome, owned files, boundaries, important shared constraints, support check, and merge condition
+- a branch that owns implementation for a surface should also own the matching tests and coverage work for that surface unless the accepted plan explicitly centralizes shared test harness work first
+- every planned parallel lane must have its own git worktree, and the assigned subagent should stay in that worktree until the lane is complete or explicitly rerouted
 - respect the owned-files map from the accepted plan and do not casually cross into another branch's files
 - after fan-in, reconcile the branches yourself, resolve any overlap cleanly, and run final targeted verification on the integrated result before reporting completion
-- prefer a small number of meaningful branches over spawning many tiny sub-tasks; 2 or 3 good parallel branches are usually enough
+- prefer as many meaningful branches or worktrees as the directory tree safely allows; target at least 5 parallel lanes when the codebase exposes that many low-overlap modules or directories
 - use the main developer session as the final integration authority; subagents may accelerate bounded sections, but coherence, correctness, and final merge discipline stay with the main session
+- do not silently collapse a plan-marked parallel execution shape into one serial run without an exact blocker and revised lane map
 ## Git Discipline
@@ -144,10 +154,12 @@ Broad commands you are not allowed to run during ordinary work:
 - never run `./run_tests.sh`
 - never run `docker compose up --build`
-- never run browser E2E or Playwright during ordinary `P4` implementation work
-- never run full test suites during ordinary `P4` implementation work unless the user explicitly asks for that exact command
-- do not use those commands even if they are documented in the repo or look convenient for debugging
+- never run any other Docker runtime, Compose, or containerized broad-verification command that stands in for those documented final commands
+- never run browser E2E or Playwright during ordinary implementation work
+- never run full test suites during ordinary implementation work unless explicitly instructed to run that exact command
+- do not use those commands even if they are documented in the repo, requested by the owner, suggested by a playbook, implied by `plan.md`, or look convenient for debugging
 - if your work would normally call for one of those commands, stop at targeted local verification and report that the change is ready for broader verification
+- do not run Docker-based runtime/test commands under any circumstances before `P9`, including when explicitly asked during planning, development, `P5`, or `P7`; the owner handles final Docker and `./run_tests.sh` verification after evaluation is complete
 Your job is to make the broader verification likely to pass without running it yourself.
@@ -191,14 +203,14 @@ Before reporting work as ready, run this preflight yourself:
 - flow completeness: are the user-facing and operator-facing flows touched by this work actually covered end to end?
 - security and permissions: are auth, RBAC, object-level checks, sensitive actions, and audit implications handled where relevant?
 - verification: did you run the strongest targeted checks that are appropriate without using lead-only broad gates?
-- reviewability: can the project lead review this work by reading the changed files and a small number of directly related files?
-- test-coverage specificity: if the project lead asked you to help shape coverage evidence, does it map concrete requirement/risk points to planned test files, key assertions, coverage status, and real remaining gaps rather than generic categories?
+- reviewability: can the change be reviewed by reading the changed files and a small number of directly related files?
+- test-coverage specificity: if asked to help shape coverage evidence, does it map concrete requirement/risk points to planned test files, key assertions, coverage status, and real remaining gaps rather than generic categories?
 If any answer is no, fix it before replying or call out the blocker explicitly.
 When you make an assumption, keep it prompt-preserving by default. If an assumption would reduce scope, mark it as unresolved instead of silently locking it in.
-If the project lead asks you to help shape test-coverage evidence, make it acceptance-grade on first pass:
+If asked to help shape test-coverage evidence, make it acceptance-grade on first pass:
 - one explicit row or subsection per requirement/risk cluster
 - planned test file or test layer named concretely
@@ -220,16 +232,17 @@ If the project lead asks you to help shape test-coverage evidence, make it accep
 - if you ran no verification command for part of the work, say that explicitly instead of implying broader proof than you have
 - if a problem needs a real fix, fix it instead of explaining around it
-Default reply shape for ordinary development follow-up, fused `P5` correction, and fix responses:
+Default reply shape for ordinary development follow-up, final release-readiness correction, and fix responses:
 1. short summary
 2. exact changed files
 3. exact verification commands and results
-4. real unresolved issues only
+4. launched parallel lanes plus any skipped planned lanes with exact reasons when parallel fan-out was part of the work
+5. real unresolved issues only
-Keep the reply compact. Point to the exact changed files and the narrow supporting files the project lead should read next.
+Keep the reply compact. Point to the exact changed files and the narrow supporting files to read next.
-Use the larger reply shape only when the project lead explicitly asks for a deeper mapping or when you are delivering a first-pass planning/scaffold artifact that genuinely needs it:
+Use the larger reply shape only when explicitly asked for a deeper mapping or when you are delivering a first-pass planning/baseline artifact that genuinely needs it:
 1. `Changed files` — exact files changed
 2. `What changed` — the concrete behavior/contract updates in those files