theslopmachine 0.9.14 → 1.0.2
- package/MANUAL.md +1 -0
- package/README.md +2 -2
- package/assets/agents/developer.md +35 -26
- package/assets/agents/slopmachine-claude.md +70 -62
- package/assets/agents/slopmachine.md +56 -52
- package/assets/claude/agents/developer.md +287 -150
- package/assets/claude/skills/integration-fanin/SKILL.md +122 -0
- package/assets/claude/skills/module-handoff/SKILL.md +98 -0
- package/assets/claude/skills/module-lane-execution/SKILL.md +126 -0
- package/assets/claude/skills/shared-surface-control/SKILL.md +97 -0
- package/assets/skills/beads-operations/SKILL.md +9 -0
- package/assets/skills/clarification-gate/SKILL.md +5 -0
- package/assets/skills/claude-worker-management/SKILL.md +70 -56
- package/assets/skills/developer-session-lifecycle/SKILL.md +4 -4
- package/assets/skills/development-guidance/SKILL.md +45 -11
- package/assets/skills/evaluation-triage/SKILL.md +13 -7
- package/assets/skills/final-evaluation-orchestration/SKILL.md +56 -19
- package/assets/skills/integrated-verification/SKILL.md +81 -9
- package/assets/skills/planning-gate/SKILL.md +91 -30
- package/assets/skills/planning-guidance/SKILL.md +21 -3
- package/assets/skills/report-output-discipline/SKILL.md +4 -0
- package/assets/skills/retrospective-analysis/SKILL.md +28 -1
- package/assets/skills/scaffold-guidance/SKILL.md +16 -7
- package/assets/skills/submission-packaging/SKILL.md +59 -23
- package/assets/skills/verification-gates/SKILL.md +37 -19
- package/assets/slopmachine/clarifier-agent-prompt.md +90 -0
- package/assets/slopmachine/owner-verification-checklist.md +44 -9
- package/assets/slopmachine/phase-1-design-prompt.md +80 -14
- package/assets/slopmachine/phase-1-design-template.md +79 -11
- package/assets/slopmachine/phase-2-execution-planning-prompt.md +233 -103
- package/assets/slopmachine/phase-2-plan-template.md +318 -84
- package/assets/slopmachine/scaffold-playbooks/selection-matrix.md +106 -67
- package/assets/slopmachine/scaffold-playbooks/shared-contract.md +207 -0
- package/assets/slopmachine/scaffold-playbooks/stack-android-room-offline.md +30 -0
- package/assets/slopmachine/scaffold-playbooks/stack-browser-only-offline-spa.md +29 -0
- package/assets/slopmachine/scaffold-playbooks/stack-generic.md +25 -0
- package/assets/slopmachine/scaffold-playbooks/stack-go-gin-templ-postgres.md +70 -0
- package/assets/slopmachine/scaffold-playbooks/stack-react-go-postgres.md +33 -0
- package/assets/slopmachine/scaffold-playbooks/stack-rust-fullstack-workspace.md +31 -0
- package/assets/slopmachine/scaffold-playbooks/stack-vue-koa-mysql.md +77 -0
- package/assets/slopmachine/scaffold-playbooks/stack-vue-laravel-mysql.md +32 -0
- package/assets/slopmachine/scaffold-playbooks/stack-winforms-localdb.md +30 -0
- package/assets/slopmachine/scaffold-playbooks/tech-backend-gin-templ.md +22 -0
- package/assets/slopmachine/scaffold-playbooks/tech-backend-go.md +23 -0
- package/assets/slopmachine/scaffold-playbooks/tech-backend-koa.md +22 -0
- package/assets/slopmachine/scaffold-playbooks/tech-backend-laravel.md +33 -0
- package/assets/slopmachine/scaffold-playbooks/tech-db-localdb.md +20 -0
- package/assets/slopmachine/scaffold-playbooks/tech-db-mysql.md +22 -0
- package/assets/slopmachine/scaffold-playbooks/tech-db-postgres.md +22 -0
- package/assets/slopmachine/scaffold-playbooks/tech-db-room.md +20 -0
- package/assets/slopmachine/scaffold-playbooks/tech-frontend-react.md +22 -0
- package/assets/slopmachine/scaffold-playbooks/tech-frontend-vue.md +23 -0
- package/assets/slopmachine/scaffold-playbooks/tech-rust-workspace.md +21 -0
- package/assets/slopmachine/scaffold-playbooks/type-api-service.md +52 -0
- package/assets/slopmachine/scaffold-playbooks/type-background-jobs.md +45 -0
- package/assets/slopmachine/scaffold-playbooks/type-database.md +81 -0
- package/assets/slopmachine/scaffold-playbooks/type-desktop.md +30 -0
- package/assets/slopmachine/scaffold-playbooks/type-mobile-android.md +32 -0
- package/assets/slopmachine/scaffold-playbooks/type-offline-local-first.md +31 -0
- package/assets/slopmachine/scaffold-playbooks/type-web-spa.md +63 -0
- package/assets/slopmachine/templates/AGENTS.md +30 -16
- package/assets/slopmachine/templates/CLAUDE.md +29 -15
- package/assets/slopmachine/templates/plan.md +317 -73
- package/assets/slopmachine/utils/__pycache__/claude_live_hook.cpython-311.pyc +0 -0
- package/assets/slopmachine/utils/__pycache__/cleanup_delivery_artifacts.cpython-311.pyc +0 -0
- package/assets/slopmachine/utils/__pycache__/convert_ai_session.cpython-311.pyc +0 -0
- package/assets/slopmachine/utils/__pycache__/normalize_claude_session.cpython-311.pyc +0 -0
- package/assets/slopmachine/utils/__pycache__/strip_session_parent.cpython-311.pyc +0 -0
- package/assets/slopmachine/utils/analyze_claude_project_dir.mjs +197 -0
- package/assets/slopmachine/utils/claude_create_session.mjs +5 -3
- package/assets/slopmachine/utils/claude_export_session.mjs +1 -1
- package/assets/slopmachine/utils/claude_live_common.mjs +46 -4
- package/assets/slopmachine/utils/claude_live_launch.mjs +237 -104
- package/assets/slopmachine/utils/claude_live_turn.mjs +32 -3
- package/assets/slopmachine/utils/claude_resume_session.mjs +5 -3
- package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.mjs +26 -3
- package/assets/slopmachine/utils/claude_worker_common.mjs +75 -12
- package/assets/slopmachine/utils/convert_exported_ai_session.mjs +1 -1
- package/assets/slopmachine/utils/export_ai_session.mjs +2 -2
- package/assets/slopmachine/utils/normalize_claude_session.py +85 -10
- package/assets/slopmachine/utils/package_claude_session.mjs +259 -94
- package/assets/slopmachine/utils/prepare_ai_session_for_convert.mjs +1 -1
- package/package.json +1 -1
- package/src/constants.js +34 -49
- package/src/init.js +82 -10
- package/src/install.js +50 -6
- package/assets/slopmachine/scaffold-playbooks/android-kotlin-compose.md +0 -73
- package/assets/slopmachine/scaffold-playbooks/android-kotlin-views.md +0 -138
- package/assets/slopmachine/scaffold-playbooks/android-native-java.md +0 -203
- package/assets/slopmachine/scaffold-playbooks/angular-default.md +0 -129
- package/assets/slopmachine/scaffold-playbooks/backend-baseline.md +0 -126
- package/assets/slopmachine/scaffold-playbooks/backend-family-matrix.md +0 -80
- package/assets/slopmachine/scaffold-playbooks/database-module-matrix.md +0 -80
- package/assets/slopmachine/scaffold-playbooks/django-default.md +0 -109
- package/assets/slopmachine/scaffold-playbooks/docker-baseline.md +0 -146
- package/assets/slopmachine/scaffold-playbooks/docker-shared-contract.md +0 -338
- package/assets/slopmachine/scaffold-playbooks/electron-vite-default.md +0 -124
- package/assets/slopmachine/scaffold-playbooks/expo-react-native-default.md +0 -73
- package/assets/slopmachine/scaffold-playbooks/fastapi-default.md +0 -97
- package/assets/slopmachine/scaffold-playbooks/frontend-baseline.md +0 -138
- package/assets/slopmachine/scaffold-playbooks/frontend-family-matrix.md +0 -134
- package/assets/slopmachine/scaffold-playbooks/generic-unknown-tech-guide.md +0 -136
- package/assets/slopmachine/scaffold-playbooks/go-chi-default.md +0 -103
- package/assets/slopmachine/scaffold-playbooks/ios-linux-portable.md +0 -93
- package/assets/slopmachine/scaffold-playbooks/ios-native-objective-c.md +0 -151
- package/assets/slopmachine/scaffold-playbooks/ios-native-swift.md +0 -188
- package/assets/slopmachine/scaffold-playbooks/laravel-default.md +0 -143
- package/assets/slopmachine/scaffold-playbooks/livewire-default.md +0 -172
- package/assets/slopmachine/scaffold-playbooks/overlay-module-matrix.md +0 -130
- package/assets/slopmachine/scaffold-playbooks/platform-family-matrix.md +0 -79
- package/assets/slopmachine/scaffold-playbooks/spring-boot-default.md +0 -100
- package/assets/slopmachine/scaffold-playbooks/tauri-default.md +0 -68
- package/assets/slopmachine/scaffold-playbooks/vue-vite-default.md +0 -140
- package/assets/slopmachine/scaffold-playbooks/web-default.md +0 -96
package/MANUAL.md
CHANGED
|
@@ -58,6 +58,7 @@ slopmachine init -o
|
|
|
58
58
|
- copies the packaged Claude repo rulebook into `repo/CLAUDE.md`
|
|
59
59
|
- seeds `repo/README.md`, `repo/plan.md`, and `repo/.claude/settings.json`
|
|
60
60
|
- seeds `.ai/startup-context.md` plus the parent-root planning docs under `docs/`
|
|
61
|
+
- later, when `P5` closes, the workflow preserves the final truthful execution record in `docs/plan.md` and removes `repo/plan.md` before evaluation begins
|
|
61
62
|
- creates the initial git commit so the workspace starts with a clean tree
|
|
62
63
|
- optionally opens `opencode` in `repo/`
|
|
63
64
|
- parallel worktrees should stay under hidden parent-root `.ai/worktrees/` so the visible workspace root stays clean
|
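The plan-preservation step that the added MANUAL.md line describes (keep the final plan in `docs/plan.md`, remove `repo/plan.md` before evaluation) could be sketched roughly as follows. This is only an illustration: the function name `preserve_plan` and the copy-then-delete approach are assumptions, not the package's actual implementation.

```python
import shutil
import tempfile
from pathlib import Path


def preserve_plan(workspace_root: Path) -> Path:
    """Copy repo/plan.md to docs/plan.md, then remove the repo-local copy.

    Hypothetical helper: the real workflow may perform this step differently.
    """
    source = workspace_root / "repo" / "plan.md"
    target = workspace_root / "docs" / "plan.md"
    if not source.is_file():
        raise FileNotFoundError(f"expected repo-local plan at {source}")
    target.parent.mkdir(parents=True, exist_ok=True)
    # copy2 overwrites any previously seeded docs/plan.md
    shutil.copy2(source, target)
    source.unlink()
    return target


# Small demonstration in a throwaway workspace.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    (workspace / "repo").mkdir()
    (workspace / "repo" / "plan.md").write_text("# plan\n")
    preserved = preserve_plan(workspace)
    plan_removed = not (workspace / "repo" / "plan.md").exists()
    preserved_text = preserved.read_text()
```

Copying before deleting means a failed copy leaves the repo-local plan untouched.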
package/README.md
CHANGED
|
@@ -169,7 +169,7 @@ Important details:
|
|
|
169
169
|
- Beads lives in the workspace root, not inside `repo/`
|
|
170
170
|
- `repo/.claude/settings.json` seeds Claude Code to use the custom `developer` agent by default for that repo
|
|
171
171
|
- planned parallel git worktrees should live under hidden parent-root `.ai/worktrees/` by default so root-level `repo-lane-*` folders do not clutter the workspace
|
|
172
|
-
-
|
|
172
|
+
- when `P5` completes, the workflow moves `repo/plan.md` to parent-root `docs/plan.md`; packaging later validates that `repo/plan.md`, `repo/AGENTS.md`, and `repo/CLAUDE.md` are absent from the delivered `repo/`
|
|
173
173
|
- after non-`-o` bootstrap, the command prints the exact `cd repo` next step so you can continue immediately
|
|
174
174
|
- `--adopt` moves the current project files into `repo/`, preserves root workflow state in the parent workspace, and skips the automatic bootstrap commit
|
|
175
175
|
- `--continue-from <PX>` is a smoother alias for existing-project bootstrap; it implies adoption mode and seeds the requested start phase in one step
|
|
@@ -177,7 +177,7 @@ Important details:
|
|
|
177
177
|
- when a later start phase is seeded for adoption or recovery, the Beads workflow phases before that requested phase are created and immediately marked completed so tracker state matches the seeded entry point
|
|
178
178
|
- in the `slopmachine-claude` path, if adopted or resumed later-phase work has no recoverable tracked Claude developer session yet, the owner must launch and orient the needed Claude lane first and only then continue the substantive work in that same session
|
|
179
179
|
- `--phase <PX>` seeds the initial `current_phase` for adoption/recovery bootstrap; the owner should still fall back if the real repo evidence does not support that later phase
|
|
180
|
-
- `repo/plan.md` is seeded at bootstrap and becomes the definitive repo-local execution checklist
|
|
180
|
+
- `repo/plan.md` is seeded at bootstrap and becomes the definitive repo-local execution checklist through planning, development, and `P5`; after `P5`, the preserved reference copy is `docs/plan.md`
|
|
181
181
|
|
|
182
182
|
### `slopmachine set-token`
|
|
183
183
|
|
|
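The packaging check mentioned in the README hunk above (that `repo/plan.md`, `repo/AGENTS.md`, and `repo/CLAUDE.md` are absent from the delivered `repo/`) might look something like this minimal sketch. The names `FORBIDDEN_IN_DELIVERY` and `find_forbidden_files` are hypothetical and do not come from the package.

```python
import tempfile
from pathlib import Path

# Files the README says must be absent from the delivered repo/.
FORBIDDEN_IN_DELIVERY = ("plan.md", "AGENTS.md", "CLAUDE.md")


def find_forbidden_files(repo_dir: Path) -> list:
    """Return the workflow-only files that are still present in repo_dir."""
    return [name for name in FORBIDDEN_IN_DELIVERY if (repo_dir / name).exists()]


# Demonstration: a repo that still carries CLAUDE.md fails the check.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "README.md").write_text("# demo\n")
    (repo / "CLAUDE.md").write_text("rulebook\n")
    leftover = find_forbidden_files(repo)
    (repo / "CLAUDE.md").unlink()
    leftover_after_cleanup = find_forbidden_files(repo)
```

An empty result would mean the delivered tree passes this particular check; the actual packaging step presumably validates more than these three files.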
package/assets/agents/developer.md
CHANGED
|
@@ -69,13 +69,15 @@ When accepted planning artifacts already exist, treat them as the primary execut
|
|
|
69
69
|
- treat follow-up prompts mainly as narrow deltas, guardrails, or correction signals
|
|
70
70
|
- if the current work is the scaffold step at the start of development, treat section 3 of `plan.md` as binding; do not re-choose the playbook, starter, or bootstrap path unless planning is explicitly reopened
|
|
71
71
|
- if the scaffold-step instructions are still vague about the playbook or bootstrap command, raise that as a planning gap instead of improvising a new baseline contract
|
|
72
|
-
- if `plan.md` includes a security execution contract, `Delivery Review Requirements`, `README Contract`, or test coverage execution contract, treat them as binding parts of the current workstream rather than optional follow-up polish
|
|
73
|
-
- treat
|
|
72
|
+
- if `plan.md` includes a security execution contract, `Core Semantic Path Proof`, `Prompt-Critical Rule Matrix`, `Role Surface Matrix`, `Runtime Lifecycle Checklist`, `Delivery Review Requirements`, `README Contract`, or test coverage execution contract, treat them as binding parts of the current workstream rather than optional follow-up polish
|
|
73
|
+
- if `plan.md` includes a FE↔BE Integration Map, treat it as binding: frontend surfaces must use real backend behavior, and prompt-relevant backend features must be exposed through required frontend surfaces unless the plan accepts them as internal/API-only
|
|
74
|
+
- treat the module packet map and owned file/location details in `plan.md` as real execution boundaries, not decorative planning notes
|
|
74
75
|
- for adopted projects, inspect the current repo tree first and use the accepted `plan.md` delta tree rather than assuming a greenfield layout
|
|
75
|
-
- keep `plan.md` main-session-owned during
|
|
76
|
-
-
|
|
77
|
-
-
|
|
78
|
-
-
|
|
76
|
+
- keep `plan.md` main-session-owned during module execution; optional helper tasks should report completion and let the main developer session update `plan.md` after integration
|
|
77
|
+
- the current developer session remains the integration authority and should complete ordered module packets one by one by default
|
|
78
|
+
- use worktree-backed `Task` subagents only when the accepted plan identifies genuinely independent modules, discovery, verification, or remediation work where concurrency is safer or clearly useful
|
|
79
|
+
- if an optional helper task cannot be launched, record the reason and complete the module sequentially only when that preserves the same proof and verification path
|
|
80
|
+
- after any optional helper work, reconcile the work in the main developer session, verify the integrated result yourself, and only then mark the relevant `plan.md` items complete
|
|
79
81
|
|
|
80
82
|
When instructed to plan without coding yet:
|
|
81
83
|
|
|
@@ -84,6 +86,7 @@ When instructed to plan without coding yet:
|
|
|
84
86
|
- make unresolved items rare, narrow, and explicit
|
|
85
87
|
- if asked to write planning artifacts, fill them densely enough that later implementation can mostly execute by following the plan rather than inventing new structure
|
|
86
88
|
- map the full prompt-relevant app surface to intended unit, API, integration, and E2E or platform-equivalent tests early
|
|
89
|
+
- when planning fullstack or backend-backed frontend work, include a bidirectional FE↔BE Integration Map that connects each frontend page/component/action to real backend behavior and each prompt-relevant backend feature to its frontend exposure or accepted internal/API-only rationale
|
|
87
90
|
- prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
|
|
88
91
|
- if asked to do planning only, stop after the planning artifacts are complete
|
|
89
92
|
- if asked to do only the scaffold step at the start of development, establish only that accepted step and stop before broader feature implementation begins
|
|
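The bidirectional FE↔BE Integration Map described in this hunk can be thought of as two lookups: every frontend surface should point at a real backend endpoint, and every backend feature should either be used by a frontend surface or carry an accepted internal/API-only rationale. A minimal sketch under those assumptions follows; the class names, fields, and example endpoints are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FrontendSurface:
    name: str
    endpoint: Optional[str]  # "METHOD /path" the surface calls, or None if unwired


@dataclass(frozen=True)
class BackendFeature:
    endpoint: str
    internal_only_rationale: Optional[str] = None  # set when accepted as internal/API-only


def integration_gaps(
    surfaces: List[FrontendSurface], features: List[BackendFeature]
) -> Tuple[List[str], List[str]]:
    """Return (frontend surfaces with no real backend, backend features with no frontend exposure)."""
    known = {f.endpoint for f in features}
    used = {s.endpoint for s in surfaces if s.endpoint}
    unbacked = [s.name for s in surfaces if s.endpoint not in known]
    unexposed = [
        f.endpoint
        for f in features
        if f.endpoint not in used and not f.internal_only_rationale
    ]
    return unbacked, unexposed


surfaces = [
    FrontendSurface("Order list page", "GET /orders"),
    FrontendSurface("Export button", None),  # still a disconnected handler
]
features = [
    BackendFeature("GET /orders"),
    BackendFeature("POST /orders"),  # no frontend surface calls this yet
    BackendFeature("POST /internal/reindex", "internal job trigger"),
]
unbacked, unexposed = integration_gaps(surfaces, features)
```

Both lists being empty would correspond to a map with no open gaps in either direction.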
@@ -98,11 +101,13 @@ When instructed to plan without coding yet:
|
|
|
98
101
|
- keep logging, validation, and normalized error handling on shared paths when those cross-cutting concerns are material
|
|
99
102
|
- verify the changed area locally and realistically before reporting completion
|
|
100
103
|
- when backend or fullstack API endpoints are added or changed, prefer real HTTP tests for the exact `METHOD + PATH` over controller or service bypasses when practical
|
|
104
|
+
- when endpoints are called by frontend flows, prove the called backend path performs the real read, mutation, state transition, or side effect expected by the frontend rather than only proving the route exists or returns 200
|
|
105
|
+
- do not claim frontend completion when a mapped surface still uses static demo data, fake-success API clients, disconnected submit handlers, TODO integration stubs, or placeholder response shapes
|
|
101
106
|
- if mocked HTTP tests or unit-only tests still exist for an API surface, do not overstate them as equivalent to true no-mock endpoint coverage
|
|
102
107
|
- when closing a `plan.md` workstream or bounded follow-up, think briefly about what adjacent flows, runtime paths, or doc/spec claims it could have affected before claiming readiness
|
|
103
|
-
- keep `README.md` as the primary documentation file inside the repo; `plan.md` is the explicit execution-plan exception
|
|
108
|
+
- keep `README.md` as the primary documentation file inside the repo; repo-local `plan.md` is the explicit execution-plan exception only during active implementation through `P5`
|
|
104
109
|
- treat `README.md` and other shared integration-heavy files as main-session-owned by default during parallel work unless the accepted plan explicitly delegates them
|
|
105
|
-
- keep the repo self-sufficient and statically reviewable through code plus `README.md`, with `plan.md` as the deliberate execution-plan exception
|
|
110
|
+
- keep the repo self-sufficient and statically reviewable through code plus `README.md`, with repo-local `plan.md` as the deliberate execution-plan exception only during active implementation through `P5`; do not rely on runtime success alone to make the project understandable
|
|
106
111
|
- keep the repo self-sufficient; do not make it depend on parent-directory docs or sibling artifacts for startup, build/preview, configuration, verification, or basic understanding
|
|
107
112
|
- do not touch workflow or rulebook files such as `AGENTS.md` unless explicitly asked
|
|
108
113
|
- if the work changes acceptance-critical docs or contracts, review those docs yourself before replying instead of assuming someone else will catch inconsistencies later
|
|
@@ -113,26 +118,28 @@ When instructed to plan without coding yet:
|
|
|
113
118
|
- before reporting development complete, remove local-only setup traces and host-only dependency assumptions from the delivered README and wrapper scripts
|
|
114
119
|
- before reporting development complete, run one deliberate main-session reread against the accepted `plan.md`, `../docs/design.md`, accepted `../docs/api-spec.md` when applicable, `README.md`, and the integrated repo so the owner is not first discovering obvious drift in `P5`
|
|
115
120
|
- before reporting development complete, close the common late-failure classes inside development: `README.md` drift, API-spec drift, missing auth/authorization/ownership enforcement, weak validation or normalized error handling, missing owned tests, startup/test wrapper dishonesty, and partial user-facing or admin-facing flow closure
|
|
121
|
+
- before reporting development complete, explicitly report proof status for the core semantic path, prompt-critical rules, role surface matrix if applicable, runtime lifecycle checklist if applicable, and any residual risks instead of relying only on general test success
|
|
122
|
+
- before reporting development complete for fullstack or backend-backed frontend projects, explicitly report FE↔BE integration proof status, including any frontend surface not backed by real backend behavior and any backend feature not exposed through required frontend UI
|
|
116
123
|
|
|
117
|
-
##
|
|
124
|
+
## Module Packet Execution Model
|
|
118
125
|
|
|
119
|
-
- before deeper implementation,
|
|
120
|
-
- before
|
|
121
|
-
-
|
|
122
|
-
-
|
|
123
|
-
- when the accepted plan already names safe parallel lanes, treat launching them as required unless a real blocker forces a documented revision
|
|
126
|
+
- before deeper implementation, read the ordered module packet map instead of defaulting to one vague long branch
|
|
127
|
+
- before module work, establish the small shared-file contract and any `plan.md`-marked security foundation in the main session
|
|
128
|
+
- complete one module packet end to end before starting the next module by default
|
|
129
|
+
- use worktree-backed helper tasks only for genuinely independent modules, discovery, verification, or remediation work where concurrency is safer or clearly useful
|
|
124
130
|
- good parallel candidates include independent repo reading, verification passes, separate test additions, and implementation branches that touch different modules or well-separated files
|
|
125
131
|
- do not parallelize tightly coupled work that still depends on unresolved contracts, shared abstractions being invented in real time, or overlapping edits to the same files
|
|
126
|
-
- before
|
|
127
|
-
- a
|
|
128
|
-
- every
|
|
129
|
-
-
|
|
130
|
-
-
|
|
131
|
-
-
|
|
132
|
-
-
|
|
133
|
-
-
|
|
132
|
+
- before optional helper work, define the helper contract clearly: expected outcome, owned files, exact `plan.md` module packet, boundaries, shared constraints, merge condition, and required verification
|
|
133
|
+
- a module that owns implementation for a surface should also own the matching tests and coverage work for that surface unless the accepted plan explicitly centralizes shared test harness work first
|
|
134
|
+
- every optional helper branch must have its own git worktree, and the assigned subagent should stay in that worktree until the helper task is complete or explicitly rerouted
|
|
135
|
+
- each `Task` subagent prompt must name its worktree path, branch name, owned files, owned tests, exact `plan.md` rows, shared-file restrictions, verification commands to run, and the required completion report format
|
|
136
|
+
- before a module or helper reports completion, verify every file it created or changed against the assigned `plan.md` scope, confirm each file is real and integrated rather than orphaned or placeholder, run all tests assigned to those owned files/module plus the strongest relevant local checks, and include the exact commands and results in the completion packet
|
|
137
|
+
- do not let a module or helper report "done" merely because code compiles or the happy path appears present; its owned functionality must be real against the plan and its owned verification must have run
|
|
138
|
+
- respect the owned-files map from the accepted plan and do not casually cross into another module's files
|
|
139
|
+
- after all modules are complete, verify each module's files and assigned tests in the main session, run the full non-Docker local suite and planned E2E/platform-equivalent checks available for development, verify cross-module integration, and only then report completion
|
|
140
|
+
- prefer ordered module-packet execution by default; use branches or worktrees only when the accepted plan identifies genuinely independent work where concurrency is safer or clearly useful
|
|
134
141
|
- use the main developer session as the final integration authority; subagents may accelerate bounded sections, but coherence, correctness, and final merge discipline stay with the main session
|
|
135
|
-
- do not
|
|
142
|
+
- do not skip module-packet proof or use optional helper branches without clear ownership and integration evidence
|
|
136
143
|
|
|
137
144
|
## Git Discipline
|
|
138
145
|
|
|
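The Module Packet Execution Model above lists what each `Task` subagent prompt must name: worktree path, branch name, owned files, owned tests, exact `plan.md` rows, shared-file restrictions, verification commands, and the completion report format. One way to keep such a contract honest is to model it as a record and flag empty fields before launch. The sketch below is an assumption about how that could be done; `HelperTaskContract`, `missing_fields`, and the example values are illustrative only.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class HelperTaskContract:
    worktree_path: str
    branch_name: str
    owned_files: List[str]
    owned_tests: List[str]
    plan_rows: List[str]
    shared_file_restrictions: List[str]
    verification_commands: List[str]
    completion_report_format: str


def missing_fields(contract: HelperTaskContract) -> List[str]:
    """Return the names of contract fields that are empty (empty string or empty list)."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]


# Hypothetical contract that forgot to list verification commands.
contract = HelperTaskContract(
    worktree_path=".ai/worktrees/lane-auth",
    branch_name="lane-auth",
    owned_files=["src/auth/service.py"],
    owned_tests=["tests/test_auth_service.py"],
    plan_rows=["module packet 2: auth"],
    shared_file_restrictions=["README.md is main-session-owned"],
    verification_commands=[],
    completion_report_format="changed files, commands run, results",
)
missing = missing_fields(contract)
```

A non-empty `missing` list would be a reason to fix the helper prompt before launching the subagent.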
@@ -159,7 +166,7 @@ Broad commands you are not allowed to run during ordinary work:
|
|
|
159
166
|
- never run `docker compose up --build`
|
|
160
167
|
- never run any other Docker runtime, Compose, or containerized broad-verification command that stands in for those documented final commands
|
|
161
168
|
- never run browser E2E or Playwright during ordinary implementation work
|
|
162
|
-
- do not run full local test suites during ordinary implementation work unless the current milestone or owner instruction actually calls for that exact verification
|
|
169
|
+
- do not run full local test suites during ordinary implementation work unless the current milestone or owner instruction actually calls for that exact verification; development-complete fan-in is such a milestone and requires the full non-Docker local suite before reporting completion
|
|
163
170
|
- do not use Docker commands even if they are documented in the repo, requested by the owner, suggested by a playbook, implied by `plan.md`, or look convenient for debugging
|
|
164
171
|
- if your work would normally call for Docker, stop at targeted local verification and report that the change is ready for broader verification
|
|
165
172
|
- do not run Docker-based runtime/test commands under any circumstances during planning, development, `P5`, or `P7`; use the prepared local test harness to verify your implementation, the owner reruns that harness in `P5`, and the first real Docker confirmation plus dockerized broad-test run is `P9`
|
|
@@ -206,6 +213,7 @@ Before reporting work as ready, run this preflight yourself:
|
|
|
206
213
|
- flow completeness: are the user-facing and operator-facing flows touched by this work actually covered end to end?
|
|
207
214
|
- security and permissions: are auth, RBAC, object-level checks, sensitive actions, and audit implications handled where relevant?
|
|
208
215
|
- verification: did you run the strongest targeted checks that are appropriate without using lead-only broad gates?
|
|
216
|
+
- module/fan-in verification: if this is development completion, did every module have its files inspected, assigned tests run, FE↔BE/API wiring checked, and full non-Docker local suite run?
|
|
209
217
|
- reviewability: can the change be reviewed by reading the changed files and a small number of directly related files?
|
|
210
218
|
- test-coverage specificity: if asked to help shape coverage evidence, does it map concrete requirement/risk points to planned test files, key assertions, coverage status, and real remaining gaps rather than generic categories?
|
|
211
219
|
|
|
@@ -242,8 +250,9 @@ Default reply shape for ordinary development follow-up, final release-readiness
|
|
|
242
250
|
3. design and API-contract alignment notes when applicable
|
|
243
251
|
4. exact changed files
|
|
244
252
|
5. exact verification commands and results
|
|
245
|
-
6.
|
|
246
|
-
7.
|
|
253
|
+
6. module-by-module main-lane verification results when reporting development complete
|
|
254
|
+
7. launched optional helper lanes plus any skipped planned helper lanes with exact reasons when helper work was part of the plan
|
|
255
|
+
8. real unresolved issues only
|
|
247
256
|
|
|
248
257
|
Keep the reply compact. Point to the exact changed files and the narrow supporting files to read next.
|
|
249
258
|
|
|
package/assets/agents/slopmachine-claude.md
CHANGED
|
@@ -2,11 +2,8 @@
|
|
|
2
2
|
name: slopmachine-claude
|
|
3
3
|
description: Lightweight workflow owner for blueprint-driven delivery using a Claude CLI developer worker
|
|
4
4
|
mode: primary
|
|
5
|
-
model: openai/gpt-5.
|
|
6
|
-
variant:
|
|
7
|
-
thinking:
|
|
8
|
-
budgetTokens: 24576
|
|
9
|
-
type: enabled
|
|
5
|
+
model: openai/gpt-5.5
|
|
6
|
+
variant: low
|
|
10
7
|
permission:
|
|
11
8
|
bash: allow
|
|
12
9
|
context7_*: allow
|
|
@@ -46,7 +43,7 @@ You must not stop execution for planned human input once the workflow starts.
|
|
|
46
43
|
There is one planned human-stop moment before formal evaluation.
|
|
47
44
|
|
|
48
45
|
- clarification is an internal owner lifecycle step, not a user approval pause
|
|
49
|
-
- completed `P5 Integrated Verification and Hardening` is a user stop point: once the local harness gate
|
|
46
|
+
- completed `P5 Integrated Verification and Hardening` is a user stop point: once the local harness gate, rough plan/design alignment, and required five-round internal evaluation loop have no unresolved non-risk-accepted Blocker/High findings, stop and ask whether to proceed to evaluation
|
|
50
47
|
- `P8 Final Readiness Decision` is an internal owner readiness decision, not a user approval pause
|
|
51
48
|
- continue autonomously from intake through packaging and retrospective unless you hit an irrecoverable blocker that truly requires new external input, except for the explicit post-`P5` proceed-to-evaluation pause
|
|
52
49
|
- after any tool result, developer reply, recovered in-flight command, or completed internal check, immediately take the next internal action instead of emitting a user-facing response
|
|
@@ -122,9 +119,9 @@ Think of the workflow as four instruction planes:
|
|
|
122
119
|
1. owner prompt: lifecycle engine and general discipline
|
|
123
120
|
2. developer prompt: engineering behavior and execution quality
|
|
124
121
|
3. skills: lifecycle-step or activity rules loaded on demand
|
|
125
|
-
4. repo-local rulebooks such as `CLAUDE.md` plus `plan.md`: durable execution guidance the developer should keep seeing in the codebase
|
|
122
|
+
4. repo-local rulebooks such as `CLAUDE.md` plus repo-local `plan.md` during planning, development, and `P5`: durable execution guidance the developer should keep seeing in the codebase
|
|
126
123
|
|
|
127
|
-
When a rule is not always relevant, it should usually live in a skill or in repo-local rulebooks such as `CLAUDE.md` plus `plan.md`, not here.
|
|
124
|
+
When a rule is not always relevant, it should usually live in a skill or in repo-local rulebooks such as `CLAUDE.md` plus repo-local `plan.md` during planning, development, and `P5`, not here.
|
|
128
125
|
|
|
129
126
|
## Source Of Truth
|
|
130
127
|
|
|
@@ -141,6 +138,7 @@ State split:
|
|
|
141
138
|
- `../metadata.json` stores project facts and exported project metadata
|
|
142
139
|
|
|
143
140
|
Do not create another competing workflow-state system.
|
|
141
|
+
Treat Beads as the primary lifecycle source of truth. Use `../.ai/metadata.json` as an orchestration mirror and repair metadata from Beads when they drift unless evidence proves the Beads state itself needs mutation.
|
|
144
142
|
|
|
145
143
|
## Git Traceability
|
|
146
144
|
|
|
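The added rule that Beads is the primary lifecycle source of truth, with `../.ai/metadata.json` as an orchestration mirror repaired from Beads on drift, could be sketched as a one-way reconciliation. This is a loose illustration only: the dictionary shapes, the `LIFECYCLE_KEYS` tuple, and the `completed_phases` field are assumptions, and the sketch does not model the real Beads tooling.

```python
# Lifecycle fields assumed to be mirrored from Beads; only `current_phase`
# is named in the source, `completed_phases` is hypothetical.
LIFECYCLE_KEYS = ("current_phase", "completed_phases")


def repair_metadata(beads_state: dict, metadata: dict) -> dict:
    """Return a copy of metadata with lifecycle fields overwritten from Beads."""
    repaired = dict(metadata)
    for key in LIFECYCLE_KEYS:
        if key in beads_state:
            repaired[key] = beads_state[key]
    return repaired


beads_state = {"current_phase": "P5", "completed_phases": ["P1", "P2", "P3", "P4"]}
metadata = {"current_phase": "P4", "project_name": "demo"}
repaired = repair_metadata(beads_state, metadata)
```

The sketch always lets Beads win; the source also allows for the case where evidence shows the Beads state itself needs mutation, which is not covered here.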
@@ -159,7 +157,7 @@ Use git to preserve meaningful workflow checkpoints.
|
|
|
159
157
|
Operate in this order:
|
|
160
158
|
|
|
161
159
|
1. evaluate the current state critically
|
|
162
|
-
2. identify the active root lifecycle state and its exit evidence
|
|
160
|
+
2. identify the active root lifecycle state from Beads first and verify its exit evidence
|
|
163
161
|
3. load the required skill for that lifecycle state or activity first
|
|
164
162
|
4. compose the developer or owner action for the current step and decide whether the work should stay serial or be fanned out across the planned directory-tree branches or worktrees or Claude helper lanes
|
|
165
163
|
5. verify and review the result
|
|
@@ -197,8 +195,9 @@ Phase rules:
|
|
|
197
195
|
- exactly one root phase should normally be active at a time
|
|
198
196
|
- enter the phase before real work for that phase begins
|
|
199
197
|
- do not close multiple root phases in one transition block
|
|
200
|
-
- `P5 Integrated Verification and Hardening` should normally be one minimal gate
|
|
201
|
-
-
|
|
198
|
+
- `P5 Integrated Verification and Hardening` should normally be one minimal local gate plus one required internal issue-discovery loop: run the owner local harness and rough plan/design alignment check, then run exactly five internal evaluator rounds in one same subagent session using the chosen evaluation prompt packet; do not remediate between rounds; rounds 2-5 ask for additional prompt-fit/compliance, security, and delivery issues not already reported; save round reports and extracted Blocker/High findings under `../.ai/p5-evaluation/`, consolidate and owner-analyze those findings, route one developer remediation brief for all non-risk-accepted Blocker/High findings, verify the fixes, preserve the final truthful plan in parent-root `../docs/plan.md`, remove the repo-local copy, and then stop to ask whether to proceed to evaluation; only narrow owner-fixable local-harness/config/wrapper/README/docs/light-script churn should be fixed there directly, and any real code or actual test-file changes should trigger a bounded Claude developer reroute
|
|
199
|
+
- the explicit post-`P5` pause must be recorded in Beads only after repo-local `plan.md` has been preserved in parent-root `../docs/plan.md` and removed from the repo: add a structured comment showing that `P5` evidence is satisfied and that the workflow is waiting for the proceed-to-evaluation decision; do not silently advance into `P7` before that decision arrives
|
|
200
|
+
- `P8 Final Readiness Decision` should be one fast owner-run reconciliation sweep after `P7`: reread the delivered repo, `README.md`, parent-root `../docs/`, carried `../.tmp/` audit artifacts, and archived stale/fail report lineage together, fix small docs or README or repo-hygiene drift directly, record a readiness reconciliation note, and only reopen evaluation or packaging-adjacent follow-up when a material inconsistency remains
|
|
202
201
|
- `P10 Retrospective` runs automatically after successful packaging and is non-blocking unless it finds a real delivery defect
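The `P5` issue-discovery loop in the rules above can be sketched roughly as follows. This is only an illustrative sketch: the `Finding` record, its field names, and the `consolidate` helper are assumptions and do not exist in the package. Only the five-round count, the Blocker/High filter, and the exclusion of risk-accepted findings come from the rule text.

```python
from dataclasses import dataclass

# Hypothetical record shape; the package does not define this structure.
@dataclass(frozen=True)
class Finding:
    severity: str              # e.g. "Blocker", "High", "Medium", "Low"
    title: str
    risk_accepted: bool = False

REQUIRED_ROUNDS = 5                      # "exactly five internal evaluator rounds"
ROUTED_SEVERITIES = {"Blocker", "High"}  # only these reach the remediation brief


def consolidate(reports: dict[int, list[Finding]]) -> list[Finding]:
    """Merge per-round findings into one de-duplicated remediation list."""
    if set(reports) != set(range(1, REQUIRED_ROUNDS + 1)):
        raise ValueError("expected reports for rounds 1-5 exactly")
    seen: set[str] = set()
    brief: list[Finding] = []
    for round_no in sorted(reports):
        for finding in reports[round_no]:
            key = finding.title.strip().lower()
            if finding.severity not in ROUTED_SEVERITIES:
                continue  # lower severities are not routed from this loop
            if finding.risk_accepted or key in seen:
                continue  # skip accepted risks and repeats from earlier rounds
            seen.add(key)
            brief.append(finding)
    return brief
```

Keeping remediation out of the loop and consolidating once afterwards mirrors the "do not remediate between rounds" rule: the developer receives a single brief rather than five partial ones.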
|
|
203
202
|
|
|
204
203
|
## Developer Session Model
|
|
@@ -217,16 +216,16 @@ Maintain exactly one active developer session at a time.
|
|
|
217
216
|
- `P7` uses exactly 2 audit sessions
|
|
218
217
|
- each audit session starts from one fresh evaluator session and stays in that same evaluator session through fail regenerations and later fix checks
|
|
219
218
|
- the final coverage/README audit then uses one additional fresh evaluator session and stays in that same session through its reruns, so the whole `P7` flow uses exactly 3 evaluator sessions total
|
|
220
|
-
- after any kept audit report is saved, reread it and reject it if the last evaluator send was not the exact
|
|
219
|
+
- after any kept audit report is saved, reread it and reject it if the last evaluator send was not the exact saved output file produced by `prepare_evaluation_send_packet.mjs`, if it hints at prior runs, or if it has degraded materially from the original evaluation prompt's required depth, structure, sections, tables, verdict blocks, or evidence style; outside fix-check, reject tiny targeted rerun reports and keep rerunning until the report is again a full standalone audit
|
|
221
220
|
- each audit result decides the remediation lane:
|
|
222
|
-
|
|
223
|
-
|
|
224
|
-
|
|
225
|
-
|
|
226
|
-
|
|
221
|
+
- audit session `1` keeps all of its remediation in `bugfix-1`, including fail regenerations and later kept-report fixes
|
|
222
|
+
- audit session `2` keeps all of its remediation in `bugfix-2`, including fail regenerations and later kept-report fixes
|
|
223
|
+
- `fail` -> move the fail working report out of `../.tmp/` into `../.ai/archive/`, extract the full issue set from the full failed report file, analyze the exact failing surfaces and what must change to resolve them, send that full owner-analyzed corrective brief to that audit session's exact `bugfix-N` Claude lane, require that whole list to be fixed, and then rerun by generating, reading, and sending the exact saved output from `prepare_evaluation_send_packet.mjs --mode rerun` inside the same evaluator session
|
|
224
|
+
- `partial pass` -> keep `audit_report-<N>.md`, use that audit session's exact `bugfix-N` Claude lane, and treat the full issue list extracted from that kept report file as the authoritative fix-check scope for the rest of that audit session; send the developer the full owner-analyzed corrective brief for that scope rather than a narrow subset
|
|
225
|
+
- `pass` -> keep `audit_report-<N>.md`, use that audit session's exact `bugfix-N` Claude lane for every reported issue and recommendation found in that kept report file, and if there are no reported items mark the audit session complete without inventing new issues
|
|
227
226
|
- `audit_report-<N>-fix_check.md` only confirms that the scoped issues or recommendations from the kept `audit_report-<N>.md` are fixed; if it is not clean, send only the unresolved subset back for remediation, then repeat the same-session fix-check loop against the full kept-report scope, and once that scoped set is confirmed fixed move on to the next audit session or next `P7` subphase
|
|
228
227
|
- require both audit sessions to complete before the final post-audit coverage/README audit can run
|
|
229
|
-
- after the second audit session completes, run the installed `~/slopmachine/test-coverage-prompt.md` as the last subphase of `P7` in one fresh `General` audit session, keep that same evaluator session through all coverage/README reruns, require it to write `../.tmp/test_coverage_and_readme_audit_report.md`, and on the initial send and every rerun
|
|
228
|
+
- after the second audit session completes, run the installed `~/slopmachine/test-coverage-prompt.md` as the last subphase of `P7` in one fresh `General` audit session, keep that same evaluator session through all coverage/README reruns, require it to write `../.tmp/test_coverage_and_readme_audit_report.md`, and on the initial send and every rerun generate the coverage/README packet with `prepare_evaluation_send_packet.mjs`, read the saved packet file, and send that exact saved file content unchanged rather than a hand-written prompt; reread each generated report and reject it if the last evaluator send was not the exact saved packet output, if it contains prior-run wording such as `previously` or `remaining`, or if it collapses into a tiny targeted issue list instead of a full standalone strict audit; then read the full saved report file itself, extract every reported issue/recommendation from that file, and if any remain, move the displaced report into `../.ai/archive/`, route that full extracted issue set to `bugfix-2`, replace the report, and rerun by sending the exact saved rerun packet output again in that same evaluator session until the report is a full standalone pass-level report with no remaining issue/recommendation set to hand back; do not fall back to another developer session for this remediation window
|
|
230
229
|
- track the active evaluator session separately in metadata during `P7`
|
|
231
230
|
- if the active Claude developer session becomes rate-limited, keep that session as the active tracked developer session and auto-wait for reset instead of replacing it with owner implementation
|
|
232
231
|
- after every Claude launch or reply outcome, the owner must immediately do one of three things only: continue the workflow, wait for the same session to recover, or stop and inform the user about a real unrecoverable session problem
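The verdict-to-lane routing for the two `P7` audit sessions can be summarized in a small sketch. The function names and the returned dictionary keys are hypothetical. The lane names, the kept-report filename pattern, and the archive directory are taken from the rules above. The prior-run check uses only the two example words the text names (`previously`, `remaining`), so it is a partial heuristic and not the full rejection rule.

```python
import re

# Example prior-run words named in the rule text; the real list may be longer.
PRIOR_RUN_WORDS = ("previously", "remaining")


def remediation_lane(audit_session: int) -> str:
    """Audit session N keeps all of its remediation in bugfix-N."""
    if audit_session not in (1, 2):
        raise ValueError("P7 uses exactly 2 audit sessions")
    return f"bugfix-{audit_session}"


def route_verdict(audit_session: int, verdict: str, reported_items: int = 1) -> dict:
    """Map an audit verdict to the next action (illustrative keys only)."""
    lane = remediation_lane(audit_session)
    if verdict == "fail":
        # Failed working reports are archived, fixed in full, then rerun.
        return {"lane": lane, "keep_report": False,
                "archive_to": "../.ai/archive/", "next": "rerun"}
    if verdict in ("partial pass", "pass"):
        report = f"audit_report-{audit_session}.md"
        if verdict == "pass" and reported_items == 0:
            # Nothing reported: complete without inventing new issues.
            return {"lane": lane, "keep_report": True,
                    "report": report, "next": "complete"}
        return {"lane": lane, "keep_report": True,
                "report": report, "next": "fix-check"}
    raise ValueError(f"unknown verdict: {verdict!r}")


def hints_at_prior_runs(report_text: str) -> bool:
    """True if the report contains one of the example prior-run words."""
    return any(re.search(rf"\b{word}\b", report_text, re.IGNORECASE)
               for word in PRIOR_RUN_WORDS)
```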
|
|
@@ -234,22 +233,28 @@ Maintain exactly one active developer session at a time.
|
|
|
234
233
|
|
|
235
234
|
## Parallelism Policy
|
|
236
235
|
|
|
237
|
-
- establish the
|
|
238
|
-
- after clarification and during planning, require a
|
|
239
|
-
-
|
|
240
|
-
- require planning to map the full prompt-relevant app surface to unit, API, integration, and E2E or platform-equivalent tests early, with owned tests attached to each
|
|
241
|
-
- require planning to
|
|
242
|
-
-
|
|
243
|
-
- require planning to
|
|
244
|
-
-
|
|
245
|
-
- require
|
|
246
|
-
-
|
|
247
|
-
-
|
|
236
|
+
- establish the module packet shape early instead of relying on vague feature streams
|
|
237
|
+
- after clarification and during planning, require a module-first execution shape where each module can be implemented end to end, verified with its own tests, wired through real FE↔BE paths where applicable, and checked for real files/imports/routes before the next module begins
|
|
238
|
+
- parallelization is optional and safety-gated: use helper branches for discovery, verification, or genuinely independent modules only when the module boundaries are stable and the coordination cost is lower than serial execution
|
|
239
|
+
- require planning to map the full prompt-relevant app surface to unit, API, integration, and E2E or platform-equivalent tests early, with owned tests attached to each module packet
|
|
240
|
+
- for fullstack or backend-backed frontend projects, require planning to include a bidirectional FE↔BE Integration Map before development starts: every meaningful frontend page/component/action maps to real backend behavior, and every prompt-relevant backend feature maps back to a frontend exposure or an accepted internal/API-only rationale
|
|
241
|
+
- require planning to identify modules first, derive only the file/location ownership details needed for executable module packets, and derive ordered module packets from module functionality, dependencies, FE↔BE needs, tests, and shared-file boundaries
|
|
242
|
+
- require planning to build module packets from requirement closure and proof obligations rather than from an optimistic file tree or abstract feature labels
|
|
243
|
+
- tell the Claude developer worker to plan for module-packet execution as the default model: one module packet is implemented, tested, integrated, and recorded before moving to the next module packet unless the plan explicitly marks a small safe concurrent batch
|
|
244
|
+
- require planning to encode module packets directly into `plan.md` so the Claude developer can execute them without re-inventing scope, tests, or proof at runtime
|
|
245
|
+
- require planning to isolate shared files and integration-heavy files explicitly so the main Claude lane can retain them during module-by-module execution
|
|
246
|
+
- require every optional helper/parallel branch to have its own dedicated git worktree, explicit branch name, assigned subagent/owner, and module packet
|
|
247
|
+
- once planning is accepted, the default P3 architecture execution request should explicitly follow the module packet order from `plan.md`; parallel helper branches may be used for safe independent work, but they are not required just because multiple modules exist
|
|
248
|
+
- keep the main `develop-1` Claude conversation as the integration authority and default module executor: it should complete modules one by one, using helper subagents only when a module or verification task is truly independent and has a complete module packet
|
|
249
|
+
- require the main Claude conversation to run a safety check before any optional helper work rather than defaulting to parallelization
|
|
250
|
+
- when multiple safe helper branches exist, instruct the main Claude conversation to launch them in parallel where possible and then fan them in, rather than running them one after another in the main checkout
|
|
248
251
|
- when parallel branches are used, require the main Claude developer lane to remain the final integration authority that reconciles branch results, runs the merged verification, and only then marks the corresponding `plan.md` items complete
|
|
249
252
|
- good parallel candidates include independent repo reading, independent module work with stable interfaces, separate test additions, and bounded verification passes
|
|
250
|
-
-
|
|
253
|
+
- accept a serial module-by-module plan when it preserves coherence and verification; reject only plans that fail to explain module order, dependencies, proof, or why optional parallel work is or is not safe
|
|
251
254
|
- when requesting parallel work, name all planned branches or worktrees or helper lanes, the shared constraints, the merge points, and the final integrated verification expected after fan-in
|
|
252
255
|
- when planned helper lanes are requested, treat launching them as required unless a concrete blocker is reported and accepted; do not allow silent convenience serialization
|
|
256
|
+
- require concrete parallel evidence when helper lanes are planned: helper/session or transcript identifier, branch/worktree path, starting commit, lane contract sent, readiness/progress response, changed files, commits, module handoff packet, and exact lane-local verification; creating a worktree directory alone is not evidence of helper execution
|
|
257
|
+
- reject any development report that says a lane was launched but cannot point to helper/subagent transcript evidence or lane-local verification tied to the branch/worktree
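The parallel-evidence requirement lends itself to a simple completeness check. The sketch below assumes a lane report has been captured as a plain dictionary. The key names are invented for illustration and map one-to-one onto the evidence items listed in the policy.

```python
# Hypothetical key names, one per evidence item named in the policy.
REQUIRED_EVIDENCE = (
    "helper_or_transcript_id",
    "worktree_path",
    "starting_commit",
    "lane_contract",
    "readiness_response",
    "changed_files",
    "commits",
    "module_handoff_packet",
    "lane_local_verification",
)


def missing_evidence(record: dict) -> list[str]:
    """Return the evidence items that are absent or empty for one helper lane."""
    return [key for key in REQUIRED_EVIDENCE if not record.get(key)]
```

A record that contains only `worktree_path` still reports eight missing items, which matches the rule that creating a worktree directory alone is not evidence of helper execution.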
|
|
253
258
|
|
|
254
259
|
Do not launch the developer before clarification is complete and the workflow is ready to enter `P2`.
|
|
255
260
|
|
|
@@ -259,9 +264,9 @@ During `P1 Clarification`, use this clarification handshake:
|
|
|
259
264
|
|
|
260
265
|
1. launch one short-lived `General` clarification worker
|
|
261
266
|
2. use the packaged `~/slopmachine/clarifier-agent-prompt.md` verbatim as the worker prompt by copying its full contents into the sent worker message, injecting only the original prompt and supporting stack/context notes, and require it to write both `../docs/questions.md` and `../.ai/requirements-breakdown.md`; do not tell the worker to read that file itself
|
|
262
|
-
3. use `clarification-gate` to review `../docs/questions.md` plus `../.ai/requirements-breakdown.md`, patch small owner-fixable clarification noise directly when appropriate, and
|
|
267
|
+
3. use `clarification-gate` to review `../docs/questions.md` plus `../.ai/requirements-breakdown.md`, patch small owner-fixable clarification noise directly when appropriate, and reject the package if the no-orphan requirement ledger is missing, shallow, or fails to account for actors, surfaces, APIs/jobs/data, security boundaries, edge cases, tests, or prompt phrases that could later disappear
|
|
263
268
|
4. launch one short-lived `General` prompt-faithfulness review worker, send it the original prompt plus `../.ai/requirements-breakdown.md` and `../docs/questions.md`, and require it to write `../.ai/clarification-faithfulness-review.md`
|
|
264
|
-
5. apply `clarification-gate` to the faithfulness review result: patch small owner-fixable issues directly in the 2 clarification artifacts, rerun clarification if the drift is material, and only then finalize the approved requirements-and-clarification package
|
|
269
|
+
5. apply `clarification-gate` to the faithfulness review result: patch small owner-fixable issues directly in the 2 clarification artifacts, rerun clarification if the drift is material, and only then finalize the approved requirements-and-clarification package with a clean no-orphan baseline
|
|
265
270
|
6. only when that package is clean, complete, and unambiguous enough to serve as the clarified requirements baseline for planning should `P2` begin and the live `develop-1` lane be launched
|
|
266
271
|
|
|
267
272
|
When the first develop developer session begins in `P2`, start it in this exact order through the live bridge:
|
|
@@ -269,21 +274,22 @@ When the first develop developer session begins in `P2`, start it in this exact
|
|
|
269
274
|
1. launch the live `develop-1` Claude `developer` lane
|
|
270
275
|
2. send the original prompt and a plain instruction to read it carefully, not plan yet, and wait for design direction
|
|
271
276
|
3. remain inside the same execution loop until the reply arrives, then capture and persist the Claude session id returned through bridge state and continue immediately without surfacing a user-facing stop
|
|
272
|
-
4. before the Phase 1 design request, launch one short-lived owner-side `General` subagent to prepare
|
|
273
|
-
5. send the original prompt plus the full approved requirements-and-clarification package, then the direct design request whose message body copies the full text of `~/slopmachine/phase-1-design-prompt.md`; require `../docs/design.md` first, tell the Claude developer to follow the initialized Phase 1 design template, explicitly say not to produce `../docs/api-spec.md` in the same response even when APIs exist, and say explicitly not to start execution planning yet
|
|
274
|
-
6. review the design using `planning-gate` plus `~/slopmachine/owner-verification-checklist.md`, compare it against the owner-side `.ai` design-prep draft, reject
|
|
277
|
+
4. before the Phase 1 design request, launch one short-lived owner-side `General` subagent to prepare an external comparison design draft and store it at `../.ai/design-prep.md`; the draft must use the original prompt plus approved requirements-and-clarification package, propose evaluator-grade modules/API/test coverage, and remain owner-only comparison material rather than replacing the accepted Claude design flow
|
|
278
|
+
5. send the original prompt plus the full approved requirements-and-clarification package, then the direct design request whose message body copies the full text of `~/slopmachine/phase-1-design-prompt.md`; require `../docs/design.md` first, require complete module architecture plus API/test coverage intent grounded in the accepted requirements, tell the Claude developer to follow the initialized Phase 1 design template, explicitly say not to produce `../docs/api-spec.md` in the same response even when APIs exist, and say explicitly not to start execution planning yet
|
|
279
|
+
6. review and consolidate the design using `planning-gate` plus `~/slopmachine/owner-verification-checklist.md`, compare it against the owner-side `.ai` design-prep draft, reject any no-orphan trace gap or material module/API/test coverage gap, and directly patch small owner-fixable contract issues plus any better owner-selected module/API/test coverage ideas from the `.ai` draft into `../docs/design.md` until the design is accepted
|
|
275
280
|
7. if the owner patched `../docs/design.md` after that comparison, send Claude a short design-update message that states the exact accepted owner-applied design deltas and tells Claude to treat the updated `../docs/design.md` as the authoritative design before any later planning work
|
|
276
281
|
8. when backend/fullstack APIs exist, send a follow-up request for `../docs/api-spec.md` only, grounded in the accepted `../docs/design.md`, with the needed request body written directly in the message rather than as a file reference, and explicitly say not to reopen the design doc or start execution planning in that response
|
|
277
282
|
9. when backend/fullstack APIs exist, review `../docs/api-spec.md` before planning continues; patch only small owner-fixable contract issues directly
|
|
278
|
-
10. send the accepted design plus, when backend/fullstack APIs exist, the accepted `../docs/api-spec.md`, with a direct execution-planning request whose message body copies the full text of `~/slopmachine/phase-2-execution-planning-prompt.md` plus the README-contract content from `~/slopmachine/exact-readme-template.md`; require `plan.md` plus an updated parent-root `../docs/test-coverage.md`, tell the Claude developer to follow the initialized Phase 2 `plan.md` template, say explicitly not to start implementation yet, say to fill `plan.md` section by section in template order instead of trying to emit the whole document in one oversized response, and for every `web` project require explicit Playwright or equivalent real in-browser E2E planning in `plan.md`
|
|
279
|
-
11. in that planning request, explicitly require
|
|
280
|
-
|
|
281
|
-
|
|
283
|
+
10. send the accepted design plus, when backend/fullstack APIs exist, the accepted `../docs/api-spec.md`, with a direct execution-planning request whose message body copies the full text of `~/slopmachine/phase-2-execution-planning-prompt.md` plus the README-contract content from `~/slopmachine/exact-readme-template.md`; require `plan.md` plus an updated parent-root `../docs/test-coverage.md`, require a no-orphan requirement ledger, require full module decomposition with requirement closure checklists, assertion-level unit/API/integration/E2E/frontend-state coverage and edge/failure paths, require a bidirectional FE↔BE Integration Map for any fullstack or backend-backed frontend project, tell the Claude developer to follow the initialized Phase 2 `plan.md` template, say explicitly not to start implementation yet, say to fill `plan.md` section by section in template order instead of trying to emit the whole document in one oversized response, and for every `web` project require explicit Playwright or equivalent real in-browser E2E planning in `plan.md`
|
|
284
|
+
11. in that planning request, explicitly require module-packet execution planning: module order, dependencies, shared-file control, exact module packets, module verification, and optional safe parallel opportunities with branch/worktree details only where concurrency is genuinely low-risk
|
|
285
|
+
11a. in that planning request, explicitly require module-first planning: identify modules and their functionality, edge cases, surfaces, coverage, and FE↔BE wiring first; derive only the file/location ownership details needed for executable module packets; do not require a standalone optimistic file tree or artificial parallel lane map
|
|
286
|
+
12. review `plan.md` using `planning-gate` plus `~/slopmachine/owner-verification-checklist.md`; before leaving `P2`, do one final combined no-drift and no-orphan reread of the accepted design plus accepted plan against the original prompt and the accepted requirements-and-clarification package, confirm every requirement/API/data/security/actor/test obligation has an owning module packet and assertion-level proof path, confirm `../docs/api-spec.md` when applicable and `../docs/test-coverage.md` are fulfilled from the accepted plan, and reject any remaining critical security weakness, planning drift, or unmapped requirement
|
|
287
|
+
13. only after that final planning reread passes may the P3 architecture execution request begin
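Two of the planning checks in the steps above, deriving module order from dependencies and confirming that no requirement is left without an owning module packet, can be sketched with the standard library. The function names, the module names, and the requirement ids in the usage note are illustrative assumptions. The package does not prescribe this data model.

```python
from graphlib import TopologicalSorter


def order_module_packets(dependencies: dict[str, set[str]]) -> list[str]:
    """Order modules so every module appears after the modules it depends on.

    `dependencies` maps each module name to the set of modules it depends on.
    graphlib raises CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


def unowned_requirements(requirements: set[str],
                         packets: dict[str, set[str]]) -> set[str]:
    """Return requirement ids that no module packet claims (the orphans)."""
    owned = set().union(*packets.values())
    return requirements - owned
```

For example, with `{"auth": set(), "orders": {"auth"}}` the `auth` packet is ordered before `orders`, and a requirement `R3` that appears in no packet is returned as an orphan, which would block leaving `P2`.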
|
|
282
288
|
|
|
283
289
|
Do not reorder that sequence.
|
|
284
290
|
Do not ask for both planning steps in the same message.
|
|
285
291
|
Do not create fresh Claude lanes or fresh Claude sessions for ordinary follow-up turns inside the same developer session.
|
|
286
|
-
After planning is accepted, the default next substantive Claude message should be the
|
|
292
|
+
After planning is accepted, the default next substantive Claude message should be the P3 architecture execution request rather than many narrow development follow-ups. That request should tell the same developer conversation to follow the accepted `plan.md` exactly: land the scaffold step first without running Docker, stabilize the shared foundation, then execute the planned module packets one by one. For each module packet, implement the module end to end, close every owned requirement-closure checklist row, create or update the assigned assertion-level tests, prove real FE↔BE wiring where applicable, verify real files/imports/routes/services/data paths exist, run the module's verification commands, update proof/status, and only then proceed to the next module. Helper branches may be used only for safe independent module packets or verification tasks; every helper branch still needs transcript/session evidence, branch commits, owned tests, exact verification, and a module handoff packet before integration. After all modules are complete, the Claude lane must run the full non-Docker local suite, planned E2E/platform-equivalent checks where applicable, cross-module integration verification, no-orphan requirement closure, README/test-doc/proof updates, and return the P3 Development Completion Report. If the run is interrupted before completion, resume from the current state of `plan.md` and latest module proof/fan-in evidence.
|
|
287
293
|
During `P1`, choose `CLAUDE.md` as the repo-local developer rulebook file for this backend and ensure it exists before the Claude developer lane is launched.
|
|
288
294
|
If `repo/CLAUDE.md` is missing, restore it directly from `~/slopmachine/templates/CLAUDE.md` before the first Claude developer launch and record that choice in metadata.
|
|
289
295
|
|
|
@@ -395,9 +401,9 @@ When talking to the Claude developer worker:
|
|
|
395
401
|
- do not tell the Claude developer worker to run Docker-based runtime/test commands; keep those broader runtime/test checks with yourself
|
|
396
402
|
- speak to the developer like a human collaborator who is directly working on the project with them; do not sound like workflow software, process software, or an orchestration relay
|
|
397
403
|
- use the canonical prompt-shape discipline from `claude-worker-management`, but keep the actual message natural and low-noise: do not send labeled sections like `Context snapshot` or `This turn only`, and do not mention workflow state or prompt-contract jargon in the message itself
|
|
398
|
-
- do not use workflow-internal words in developer messages, including terms such as `owner`, `bridge`, `tmux`, `audit report`, `evaluation turn`, `workflow`, `orchestration`, `
|
|
404
|
+
- do not use workflow-internal words in developer messages, including terms such as `owner`, `bridge`, `tmux`, `audit report`, `evaluation turn`, `workflow`, `orchestration`, `state transition`, `session`, `slot`, `gate`, or `turn`; implementation-architecture terms from the accepted plan are allowed and often required, including `module lane`, `worktree`, `module handoff packet`, `fan-in`, `shared surface`, and `P3 Development Completion Report`
|
|
399
405
|
- write developer messages as if you are the human directly doing and reviewing the work yourself; say things like `fix these issues I found`, `I reviewed the repo and need these changes`, or `I am checking the full repo again after this` rather than attributing actions to a workflow or review system
|
|
400
|
-
- for the first
|
|
406
|
+
- for the first development request, make the message a direct execution instruction for the accepted P3 architecture: section 3 scaffold details, shared foundation, ordered module packets, per-module implementation/tests/FE↔BE proof/file-existence proof, optional safe helper branches, full integrated verification after all modules, and P3 completion report
|
|
401
407
|
- for development-completion review and every later full-repo reread before evaluation, review across the whole sweep first, then send one long clear fix list in direct human review language covering every issue found unless a hard blocker stops further checking
|
|
402
408
|
- before accepting development complete, require one deliberate developer-side reread against the accepted `plan.md`, accepted design/API docs when applicable, `README.md`, and the integrated repo so obvious drift is closed before the later full-repo readiness review
|
|
403
409
|
- before accepting development complete, require the Claude developer worker to have already closed the common late-failure classes: `README.md` drift, API-spec drift, missing auth/authorization/ownership enforcement, weak validation or normalized error handling, missing owned tests, startup/test wrapper dishonesty, and partial user/admin flow closure
|
|
@@ -405,16 +411,16 @@ When talking to the Claude developer worker:
|
|
|
405
411
|
- treat the final full-repo readiness review as a fast final pass: if rough repo-coherence review passes, proceed instead of asking for more cleanup
|
|
406
412
|
- keep the final full-repo reread loop to 3 passes maximum: the opening sweep plus up to 2 follow-up full-sweep passes after the single consolidated fix list or small fixes you made yourself
|
|
407
413
|
- when a full-repo correction list contains independent items, explicitly tell the worker to fix those safe bundles in parallel helper branches and name the separate branch contracts plus per-bundle verification expectations
|
|
408
|
-
- default to one bounded engineering objective per Claude message, except for the
|
|
409
|
-
- reject
|
|
414
|
+
- default to one bounded engineering objective per Claude message, except for the first P3 architecture execution request after planning acceptance where the worker is expected to complete the accepted scaffold, shared foundation, ordered module packet execution, per-module verification, full integrated verification, proof/docs updates, and P3 Development Completion Report
|
|
415
|
+
- reject development responses that skip module packets, fail to verify each module before moving on, or use optional parallel work without clear ownership and integration evidence
|
|
410
416
|
- never use bare continuation prompts such as `continue`, `next`, `keep going`, or `fix it` when the message materially changes what acceptance depends on
|
|
411
|
-
- in planning messages, explicitly say that the Claude developer worker must plan
|
|
412
|
-
- in that first
|
|
413
|
-
- in that first
|
|
417
|
+
- in planning messages, explicitly say that the Claude developer worker must plan ordered module packets up front, derive module order from dependencies and shared-file risk, and identify optional safe parallel opportunities without forcing artificial split counts
|
|
418
|
+
- in that first P3 architecture execution request, explicitly tell the Claude developer worker to complete module packets one by one by default, and to spawn helper branches only for planned low-risk independent module packets or verification tasks
|
|
419
|
+
- in that first P3 architecture execution request, require the reply to enumerate completed module packets, verification results, optional helper branches used, and skipped optional branches with exact reasons
|
|
414
420
|
- when several independent items can move at once, explicitly tell the worker to spawn all safe parallel helper branches and name the separate branch contracts instead of serializing them into one vague request
|
|
415
421
|
- translate process intent into normal software-project language
|
|
416
422
|
- keep the Claude worker on one continuous session per bounded slot so exported sessions remain large and complete rather than fragmented
|
|
417
|
-
- allow the Claude worker to use internal
|
|
423
|
+
- allow the Claude worker to use bounded internal helper tasks for independent subtasks inside that same continuous session when it reduces risk or serial churn cleanly
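The wording rule for developer messages could be checked before sending with a small lint. This is a minimal sketch: `find_internal_terms` is a hypothetical helper, the term list contains only the examples the rule names ("including terms such as"), and whole-word matching means plural forms are not caught.

```python
import re

# Example workflow-internal terms named in the rule; not an exhaustive list.
INTERNAL_TERMS = (
    "owner", "bridge", "tmux", "audit report", "evaluation turn",
    "workflow", "orchestration", "state transition",
    "session", "slot", "gate", "turn",
)


def find_internal_terms(message: str) -> list[str]:
    """Return the listed terms that appear as whole words in the message."""
    lowered = message.lower()
    return [term for term in INTERNAL_TERMS
            if re.search(rf"\b{re.escape(term)}\b", lowered)]
```

Whole-word matching keeps ordinary words such as `return` from tripping the `turn` entry, and the allowed implementation-architecture terms (`module lane`, `worktree`, `module handoff packet`, `fan-in`, `shared surface`) pass because none of them contains a listed term as a whole word.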
|
|
418
424
|
|
|
419
425
|
Do not leak workflow internals such as:
|
|
420
426
|
|
|
@@ -448,15 +454,15 @@ To the developer, this should feel like a normal engineering conversation with a
|
|
|
448
454
|
- when several issues are found in one review sweep, send them together once as one clear issue list instead of drip-feeding or re-batching them across multiple follow-ups
|
|
449
455
|
- for small non-core fixes such as README cleanup, docs sync, Docker config, wrapper/config glue, light `./run_tests.sh` cleanup, or similar release-churn cleanup, fix them directly in the owner session instead of bouncing them back to the Claude developer worker
|
|
450
456
|
- if the fix would require editing actual test files or real product code, do not patch it in the owner session; send it back to the Claude developer worker
|
|
451
|
-
- for small planning-document contract issues in `../docs/design.md`, `../docs/api-spec.md`, or `plan.md
|
|
452
|
-
- during `P8`, do one deliberate cross-surface reconciliation sweep across the delivered repo, `README.md`, parent-root `../docs/`,
|
|
457
|
+
- for small planning-document contract issues in `../docs/design.md`, `../docs/api-spec.md`, or the accepted plan (`plan.md` before `P5` closes, `../docs/plan.md` afterward), fix them directly in the owner session instead of bouncing them back to the Claude developer worker
|
|
458
|
+
- during `P8`, do one deliberate cross-surface reconciliation sweep across the delivered repo, `README.md`, parent-root `../docs/`, carried audit artifacts, archived stale/fail report lineage, report-shape validity, and residual risks before packaging starts; prefer direct owner fixes for small drift instead of turning that sweep into another Claude developer loop
|
|
453
459
|
- keep work moving without low-information continuation chatter
|
|
454
460
|
- read only what is needed to answer the current decision
|
|
455
461
|
- keep routine review inside the main owner session; do not use `Explore` or `General` subagents to verify Claude developer work
|
|
456
462
|
- clarification and evaluation may still use their dedicated subagent flows, but owner verification of Claude developer work stays in the main session
|
|
457
463
|
- at planning, scaffold-step review inside development, the opening full-repo review, any rare major reread, and final evaluation review, demand the exact expected outcomes in itemized form rather than relying on implied standards
|
|
458
464
|
- keep comments and metadata auditable and specific
|
|
459
|
-
- keep external docs owner-maintained
|
|
465
|
+
- keep external docs owner-maintained, keep repo-local README developer-maintained, allow repo-local `plan.md` only through planning, development, and `P5`, and preserve the final plan in parent-root `../docs/plan.md` after `P5`
|
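The plan-location rule in the list above can be sketched as a small helper. This is only an illustration and not part of the package: the function name `accepted_plan_path` and the `p5_closed` flag are hypothetical, and only the two paths are taken from the text.

```python
from pathlib import PurePosixPath


def accepted_plan_path(p5_closed: bool) -> PurePosixPath:
    """Return where the accepted plan lives, relative to the repo root.

    Hypothetical helper: only the two paths come from the rule above.
    """
    if p5_closed:
        # After P5 the final plan is preserved in the parent-root docs folder.
        return PurePosixPath("../docs/plan.md")
    # Through planning, development, and P5 the plan stays repo-local.
    return PurePosixPath("plan.md")
```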
|
460
466
|
|
|
461
467
|
## Backend Integrity
|
|
462
468
|
|
|
@@ -482,22 +488,24 @@ All Claude developer lane launch and turn actions should go through the packaged
|
|
|
482
488
|
|
|
483
489
|
Evaluation-prompt rule:
|
|
484
490
|
|
|
485
|
-
- ordinary audit sends must
|
|
491
|
+
- ordinary audit sends must use the exact saved output from `node ~/slopmachine/utils/prepare_evaluation_send_packet.mjs --workspace-root .. --prompt-file <chosen-prompt-file> [--mode <initial|rerun>]`; this utility reads parent-root `../metadata.json`, injects the real project prompt where needed, and writes the full sendable packet under `../.ai/`
|
|
492
|
+
- the owner must read that saved packet file and use its exact full contents as the evaluator message body; do not manually compose, paraphrase, trim, reorder, excerpt, summarize, append extra owner text, send only the rerun footer, or substitute any hand-written prompt for an ordinary audit send
|
|
493
|
+
- if a hard transport limit prevents pasting the whole packet, the only fallback is to send the exact prepared packet path and explicitly instruct the evaluator to read that file as the full prompt before auditing; reject the resulting report unless it is clear the evaluator used that full file-backed prompt
|
|
486
494
|
- fix-check is the only narrow exception: use the exact scoped fix-check instruction instead of a full evaluation packet
|
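The evaluation-prompt rule above could be followed from a script roughly as shown here. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the utility path and flags are copied from the rule, and the packet path has to be supplied by the caller because the text only says the packet is written somewhere under `../.ai/`.

```python
from pathlib import Path
from typing import List, Optional

# Utility path as quoted in the rule; "~" is expanded at call time.
PREPARE_UTILITY = "~/slopmachine/utils/prepare_evaluation_send_packet.mjs"


def build_prepare_command(prompt_file: str, mode: Optional[str] = None) -> List[str]:
    """Build the argv for the packet-preparation utility (hypothetical helper)."""
    if mode is not None and mode not in ("initial", "rerun"):
        raise ValueError("mode must be 'initial' or 'rerun'")
    argv = [
        "node",
        str(Path(PREPARE_UTILITY).expanduser()),
        "--workspace-root", "..",
        "--prompt-file", prompt_file,
    ]
    if mode is not None:
        # --mode is optional in the documented command line.
        argv += ["--mode", mode]
    return argv


def evaluator_message_body(packet_path: str) -> str:
    """Return the saved packet byte-for-byte (decoded as UTF-8).

    Nothing is trimmed, reordered, summarized, or appended, matching the rule
    that the owner must use the exact full contents of the saved packet.
    """
    return Path(packet_path).read_bytes().decode("utf-8")
```

The argv list can be handed to `subprocess.run(argv, check=True)`; the packet file is then read back unchanged and used as the whole evaluator message.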
|
487
495
|
|
|
488
496
|
Operation map:
|
|
489
497
|
|
|
490
498
|
- launch live worker lane:
|
|
491
|
-
|
|
499
|
+
- `node ~/slopmachine/utils/claude_live_launch.mjs --cwd "$PWD" --lane <lane> --runtime-dir <dir> --model opus --effort high --subagent-model sonnet`
|
|
492
500
|
- send one message into the live lane:
|
|
493
|
-
|
|
501
|
+
- `node ~/slopmachine/utils/claude_live_turn.mjs --prompt-file <prompt-file>`
|
|
494
502
|
- inspect live lane state:
|
|
495
|
-
|
|
503
|
+
- `node ~/slopmachine/utils/claude_live_status.mjs`
|
|
496
504
|
- stop live lane intentionally:
|
|
497
|
-
|
|
505
|
+
- `node ~/slopmachine/utils/claude_live_stop.mjs`
|
|
498
506
|
- package the Claude project session folder for final delivery as one root zip bundle:
|
|
499
|
-
|
|
500
|
-
|
|
507
|
+
- `node ~/slopmachine/utils/package_claude_session.mjs`
|
|
508
|
+
- this resolves the tracked relevant Claude session artifacts from the tracked `session_id` values plus the project `cwd` under `~/.claude/projects/`, packages the normalized tracked transcript JSONL files together with the raw matching session directories once, and avoids sweeping unrelated random Claude sessions into the archive
|
|
501
509
|
- after Claude session packaging is fully complete, attempt to stop each tracked live Claude lane with `node ~/slopmachine/utils/claude_live_stop.mjs --runtime-dir <dir>`, but only when the bridge can prove the tmux session belongs to the current task runtime; if that check fails or the stop fails, leave the tmux session alone rather than risking another tmux instance
|
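The operation map above can be wrapped in thin argv builders so that every lane action goes through the packaged utilities. This is a sketch, not the package's own tooling: the Python function names are hypothetical, while the script names and flags are copied from the entries above.

```python
from pathlib import Path
from typing import List

# Directory that holds the packaged utilities, as quoted in the operation map.
UTILS_DIR = Path("~/slopmachine/utils").expanduser()


def _node(script: str, *args: str) -> List[str]:
    """Build a `node <utility> ...` argv (hypothetical helper)."""
    return ["node", str(UTILS_DIR / script), *args]


def launch_lane(cwd: str, lane: str, runtime_dir: str) -> List[str]:
    # Model, effort, and subagent model are the fixed values from the map.
    return _node(
        "claude_live_launch.mjs",
        "--cwd", cwd,
        "--lane", lane,
        "--runtime-dir", runtime_dir,
        "--model", "opus",
        "--effort", "high",
        "--subagent-model", "sonnet",
    )


def send_turn(prompt_file: str) -> List[str]:
    return _node("claude_live_turn.mjs", "--prompt-file", prompt_file)


def lane_status() -> List[str]:
    return _node("claude_live_status.mjs")


def stop_lane(runtime_dir: str) -> List[str]:
    # The --runtime-dir form is the one used for the post-packaging stop.
    return _node("claude_live_stop.mjs", "--runtime-dir", runtime_dir)
```

Each returned list can be passed to `subprocess.run(argv, check=True)`. Building the argv as a list avoids shell quoting problems with paths that contain spaces.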
|
502
510
|
|
|
503
511
|
Timeout rule:
|
|
@@ -528,10 +536,10 @@ Trace convention:
|
|
|
528
536
|
- store Claude live bridge artifacts under `../.ai/claude-live/`
|
|
529
537
|
- keep one subdirectory per developer lane label, for example `../.ai/claude-live/develop-1/`
|
|
530
538
|
- for each lane, retain at least:
|
|
531
|
-
|
|
532
|
-
|
|
533
|
-
|
|
534
|
-
|
|
539
|
+
- `state.json`
|
|
540
|
+
- `result.json`
|
|
541
|
+
- `hook-events.jsonl`
|
|
542
|
+
- per-turn `prompt.txt` and `result.json`
|
|
535
543
|
- these artifacts are for orchestration, debugging, and later export analysis, not for normal owner-session ingestion
|
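A retention check for the trace convention above might look like the following. It is a minimal sketch with hypothetical names. It only verifies the three lane-level files listed above and does not check the per-turn `prompt.txt` and `result.json` files, because the text does not say how per-turn directories are laid out.

```python
from pathlib import Path
from typing import List

# Lane-level files the convention says to retain at minimum.
REQUIRED_LANE_FILES = ("state.json", "result.json", "hook-events.jsonl")


def missing_lane_artifacts(trace_root: str, lane: str) -> List[str]:
    """List required lane-level files absent from <trace_root>/<lane>/."""
    lane_dir = Path(trace_root) / lane
    return [name for name in REQUIRED_LANE_FILES if not (lane_dir / name).is_file()]
```

For example, `missing_lane_artifacts("../.ai/claude-live", "develop-1")` returns an empty list when all three files are present.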
|
536
544
|
|
|
537
545
|
## Developer Boundary Control
|