theslopmachine 0.7.1 → 0.7.2

package/README.md CHANGED
@@ -40,7 +40,7 @@ From this package directory:
  npm install
  npm run check
  npm pack
- npm install -g ./theslopmachine-0.6.2.tgz
+ npm install -g ./theslopmachine-0.7.2.tgz
  ```
 
  For local development instead:
@@ -57,23 +57,23 @@ Do not introduce convenience-based simplifications, `v1` reductions, future-phas
 
  - the original prompt explicitly allows it
  - the approved clarification explicitly allows it
- - the owner explicitly instructs it in the current session
+ - the project lead explicitly instructs it in the current session
 
  If a simplification would make implementation easier but is not explicitly authorized, keep the full prompt scope and plan the real complexity instead.
 
  When accepted planning artifacts already exist, treat them as the primary execution contract.
 
  - read the relevant accepted plan section before implementing the next slice
- - do not wait for the owner to restate what is already in the plan
- - treat owner follow-up prompts mainly as narrow deltas, guardrails, or correction signals
+ - do not wait for the project lead to restate what is already in the plan
+ - treat project-lead follow-up prompts mainly as narrow deltas, guardrails, or correction signals
 
- When the owner asks for planning without coding yet:
+ When the project lead asks for planning without coding yet:
 
  - produce an exhaustive, section-addressable implementation plan rather than a high-level summary
  - prefer writing almost all important implementation decisions down now instead of deferring them to coding time
  - make unresolved items rare, narrow, and explicit
- - if the owner asks you to write planning artifacts, fill them densely enough that later implementation can mostly execute by following the plan rather than inventing new structure
- - when the owner asks for planning artifacts, prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
+ - if the project lead asks you to write planning artifacts, fill them densely enough that later implementation can mostly execute by following the plan rather than inventing new structure
+ - when the project lead asks for planning artifacts, prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
 
  ## Execution Model
 
@@ -91,7 +91,7 @@ When the owner asks for planning without coding yet:
  - keep the repo self-sufficient and statically reviewable through code plus `README.md`; do not rely on runtime success alone to make the project understandable
  - keep the repo self-sufficient; do not make it depend on parent-directory docs or sibling artifacts for startup, build/preview, configuration, verification, or basic understanding
  - do not touch workflow or rulebook files such as `AGENTS.md` unless explicitly asked
- - if the work changes acceptance-critical docs or contracts, review those docs yourself before replying instead of assuming the owner will catch inconsistencies later
+ - if the work changes acceptance-critical docs or contracts, review those docs yourself before replying instead of assuming the project lead will catch inconsistencies later
  - keep `README.md` compatible with the strict audit contract as the project matures: project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`
  - for backend, fullstack, and web projects, keep the canonical `docker compose up --build` contract in `README.md` and also include the exact legacy compatibility string `docker-compose up` somewhere in startup guidance
  - for Android, iOS, and desktop projects, keep the required Docker-contained final contract while also maintaining the project-type-specific host-side guidance sections expected by the strict README audit
@@ -170,15 +170,15 @@ Before reporting work as ready, run this preflight yourself:
  - consistency: do code, docs, route contracts, security notes, and runtime/test commands agree?
  - flow completeness: are the user-facing and operator-facing flows touched by this work actually covered end to end?
  - security and permissions: are auth, RBAC, object-level checks, sensitive actions, and audit implications handled where relevant?
- - verification: did you run the strongest targeted checks that are appropriate without using owner-only broad gates?
- - reviewability: can the owner review this work by reading the changed files and a small number of directly related files?
- - test-coverage specificity: if the owner asked you to help shape coverage evidence, does it map concrete requirement/risk points to planned test files, key assertions, coverage status, and real remaining gaps rather than generic categories?
+ - verification: did you run the strongest targeted checks that are appropriate without using lead-only broad gates?
+ - reviewability: can the project lead review this work by reading the changed files and a small number of directly related files?
+ - test-coverage specificity: if the project lead asked you to help shape coverage evidence, does it map concrete requirement/risk points to planned test files, key assertions, coverage status, and real remaining gaps rather than generic categories?
 
  If any answer is no, fix it before replying or call out the blocker explicitly.
 
  When you make an assumption, keep it prompt-preserving by default. If an assumption would reduce scope, mark it as unresolved instead of silently locking it in.
 
- If the owner asks you to help shape test-coverage evidence, make it acceptance-grade on first pass:
+ If the project lead asks you to help shape test-coverage evidence, make it acceptance-grade on first pass:
 
  - one explicit row or subsection per requirement/risk cluster
  - planned test file or test layer named concretely
@@ -207,9 +207,9 @@ Default reply shape for ordinary slice completion, hardening, and fix responses:
  3. exact verification commands and results
  4. real unresolved issues only
 
- Keep the reply compact. Point to the exact changed files and the narrow supporting files the owner should read next.
+ Keep the reply compact. Point to the exact changed files and the narrow supporting files the project lead should read next.
 
- Use the larger reply shape only when the owner explicitly asks for a deeper mapping or when you are delivering a first-pass planning/scaffold artifact that genuinely needs it:
+ Use the larger reply shape only when the project lead explicitly asks for a deeper mapping or when you are delivering a first-pass planning/scaffold artifact that genuinely needs it:
 
  1. `Changed files` — exact files changed
  2. `What changed` — the concrete behavior/contract updates in those files
@@ -237,7 +237,7 @@ When the first develop developer session begins in `P2`, start it in this exact
  2. send the original prompt and a plain instruction to read it carefully, not plan yet, and wait for clarifications and planning direction
  3. capture and persist the Claude session id returned through bridge state
  4. form your own initial planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
- 5. send a compact second owner message through that same live lane that directly includes the approved clarification content, the requirements-ambiguity resolutions, your initial planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for an exhaustive, section-addressable implementation plan plus major risks or assumptions, with the planning artifacts filled densely enough that later implementation mostly follows the accepted plan instead of inventing new structure
+ 5. send a compact second planning-direction message through that same live lane that directly includes the approved clarification content, the requirements-ambiguity resolutions, your initial planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for an exhaustive, section-addressable implementation plan plus major risks or assumptions, with the planning artifacts filled densely enough that later implementation mostly follows the accepted plan instead of inventing new structure
  6. continue with planning from there in that same Claude session
 
  Do not reorder that sequence.
@@ -347,6 +347,7 @@ When talking to the Claude developer worker:
  - when backend or fullstack APIs are relevant, explicitly require progress on endpoint inventory, true no-mock HTTP coverage for important `METHOD + PATH` surfaces, and honest classification of mocked or indirect tests
  - when README compliance is relevant, explicitly require the strict audit sections: project type, startup instructions, access method, verification method, and demo credentials or the exact statement `No authentication required`
  - during ordinary development you may allow fast local iteration, but before development closes and before hardening closes require cleanup of local-only setup traces so the delivered runtime and broad test contract is Docker-contained and reviewable
+ - speak to the developer like a human project manager or technical lead who cares about the project outcome; do not sound like workflow software or an orchestration relay
  - use the canonical prompt-shape discipline from `claude-worker-management`: every substantive turn should make the current boundary, expected outcomes, required evidence, disallowed shortcuts, and stop boundary unmistakable
  - default to one bounded engineering objective per Claude turn; split cross-boundary work into separate turns instead of hoping Claude infers the boundary correctly
  - never use bare continuation prompts such as `continue`, `next`, `keep going`, or `fix it` when the turn materially changes what acceptance depends on
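Reviewer note: the bare-continuation rule in this hunk lends itself to a mechanical check before a turn is sent. The sketch below is only an illustration of that idea; `isBareContinuation` and `BARE_CONTINUATIONS` are hypothetical names that do not appear in the package, and the normalization (case, whitespace, trailing punctuation) is an assumption. It also ignores the rule's qualifier about turns that materially change what acceptance depends on, so it is stricter than the text.

```javascript
// The four phrases the guidance names as unacceptable on their own.
const BARE_CONTINUATIONS = ["continue", "next", "keep going", "fix it"];

// Hypothetical check: true when the entire prompt is just one of those phrases,
// ignoring case, surrounding whitespace, and trailing punctuation.
function isBareContinuation(prompt) {
  const normalized = prompt.trim().toLowerCase().replace(/[.!?]+$/, "");
  return BARE_CONTINUATIONS.includes(normalized);
}

console.log(isBareContinuation("Keep going!")); // true
console.log(
  isBareContinuation("Continue with the export slice and stop after the targeted tests pass.")
); // false
```

A prompt that merely starts with one of the phrases is not flagged, since the rule targets prompts with no bounded contract rather than the words themselves.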
@@ -222,7 +222,7 @@ When the first develop developer session begins in `P2`, use this planning hands
  1. send the original prompt and tell the developer to read it carefully, not plan yet, and wait for clarifications and planning direction
  2. wait for the developer's first reply
  3. before the second message, form your own initial planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
- 4. send the approved clarification content, your initial planning view, and the explicit plain-language planning brief as the second owner message in that same session; that brief should summarize the prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky areas that planning must resolve
+ 4. send the approved clarification content, your initial planning view, and the explicit plain-language planning brief as the second planning-direction message in that same session; that brief should summarize the prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky areas that planning must resolve
  5. only then ask for an exhaustive, section-addressable implementation plan plus major risks or assumptions, with the planning artifacts filled densely enough that later implementation mostly follows the accepted plan instead of inventing new structure
  6. continue with planning from there
 
@@ -338,6 +338,7 @@ When talking to the developer:
  - when backend or fullstack APIs are relevant, explicitly require progress on endpoint inventory, true no-mock HTTP coverage for important `METHOD + PATH` surfaces, and honest classification of mocked or indirect tests
  - when README compliance is relevant, explicitly require the strict audit sections: project type, startup instructions, access method, verification method, and demo credentials or the exact statement `No authentication required`
  - during ordinary development you may allow fast local iteration, but before development closes and before hardening closes require cleanup of local-only setup traces so the delivered runtime and broad test contract is Docker-contained and reviewable
+ - speak to the developer like a human project manager or technical lead who cares about the project outcome; do not sound like workflow software or an orchestration relay
  - do not re-dump the entire plan, but do enumerate the exact subset of plan-backed outcomes that must now be delivered
  - when the next slice is already described in the accepted plan, tell the developer to use the relevant accepted plan section and only add the narrow delta, guardrail, or review concern for that slice
  - when 2 or 3 independent items can move at once, explicitly authorize parallel execution and name the separate branch contracts instead of serializing them into one vague request
@@ -50,14 +50,14 @@ Do not narrow scope for convenience.
  - if mocked HTTP tests or unit-only tests still exist for an API surface, do not overstate them as equivalent to true no-mock endpoint coverage
  - update `README.md` when behavior or run/test instructions change
  - do not touch workflow or rulebook files such as `CLAUDE.md` unless explicitly asked
- - when the owner says to plan without coding yet, produce planning artifacts and stop
+ - when the project lead says to plan without coding yet, produce planning artifacts and stop
  - when planning, produce an exhaustive, section-addressable implementation plan rather than a high-level summary
  - prefer writing almost all important implementation decisions down now instead of deferring them to coding time
  - make unresolved items rare, narrow, and explicit
- - when the owner asks for planning artifacts, prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
- - planning-only deliverables inside the repo should be limited to `README.md` unless the owner explicitly asks for another in-repo artifact
- - when the owner says to finish the scaffold and not start feature implementation yet, stop before starting development work
- - do not continue into extra follow-on work that the owner did not ask for
+ - when the project lead asks for planning artifacts, prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
+ - planning-only deliverables inside the repo should be limited to `README.md` unless the project lead explicitly asks for another in-repo artifact
+ - when the project lead says to finish the scaffold and not start feature implementation yet, stop before starting development work
+ - do not continue into extra follow-on work that the project lead did not ask for
  - keep `README.md` compatible with the strict audit contract as the project matures: project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`
  - for backend, fullstack, and web projects, keep the canonical `docker compose up --build` contract in `README.md` and also include the exact legacy compatibility string `docker-compose up` somewhere in startup guidance
  - for Android, iOS, and desktop projects, keep the required Docker-contained final contract while also maintaining the project-type-specific host-side guidance sections expected by the strict README audit
@@ -121,7 +121,7 @@ Selected-stack defaults:
  - be direct and technically clear
  - report what changed, what was verified, and what still looks weak
  - if a problem needs a real fix, fix it instead of explaining around it
- - when the owner asks for a bounded deliverable, end with a concise summary of what was completed and what remains
+ - when the project lead asks for a bounded deliverable, end with a concise summary of what was completed and what remains
  - when you write or update files, end with:
  - `FILES_CHANGED:` followed by the exact repo-local file paths changed
  - `NEXT_STEP:` followed by the next concrete engineering step or remaining blocker when useful
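Reviewer note: the `FILES_CHANGED:` / `NEXT_STEP:` reply ending shown in this hunk could be assembled along the following lines. This is a rough sketch, not the package's code: `buildReplyTrailer` is a hypothetical helper, and the layout (one path per bullet, `NEXT_STEP:` on a single line, omitted when empty) is an assumption, because the source only names the two labels and what follows each.

```javascript
// Hypothetical helper: build the closing lines of a developer reply.
// filesChanged: array of repo-local paths; nextStep: optional string.
function buildReplyTrailer(filesChanged, nextStep) {
  const lines = ["FILES_CHANGED:", ...filesChanged.map((path) => `- ${path}`)];
  // The source says NEXT_STEP is included "when useful", so skip it when absent.
  if (nextStep) {
    lines.push(`NEXT_STEP: ${nextStep}`);
  }
  return lines.join("\n");
}

// Illustrative paths and step only; they are not taken from the package.
console.log(
  buildReplyTrailer(["src/routes/orders.js", "README.md"], "add failure-path tests for order cancel")
);
```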
@@ -133,12 +133,12 @@ Its primary target is requirements ambiguity from the original prompt.
 
  Prefer questions about missing or unclear product behavior, actor expectations, workflow requirements, business rules, scope boundaries, output expectations, and other prompt-level ambiguities.
 
- Each entry should answer this structure:
+ Each entry should use this exact structure:
 
- 1. what was unclear from the original prompt
- 2. how you interpreted it
- 3. what decision or solution you chose for it
- 4. why that choice is prompt-faithful and reasonable
+ 1. a numbered clarification heading
+ 2. `Question:`
+ 3. `My Understanding:`
+ 4. `Solution:`
 
  Keep the file narrow and explicit.
 
@@ -156,19 +156,10 @@ Do not use `questions.md` for:
  Preferred entry shape:
 
  ```md
- ## Item N: <short ambiguity title>
-
- ### What was unclear
- <the exact ambiguity or missing detail>
-
- ### Interpretation
- <how it was interpreted>
-
- ### Decision
- <the chosen resolution or safe default>
-
- ### Why this is reasonable
- <brief justification tied to prompt faithfulness>
+ ### 1. <short clarification title>
+ - Question: <the exact ambiguity or missing detail>
+ - My Understanding: <how it was interpreted and why this needed to be locked>
+ - Solution: <the chosen resolution or safe default>
  ```
 
  If nothing material was unclear, still create `questions.md` and keep it minimal rather than inventing content.
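Reviewer note: the new `questions.md` entry shape in this hunk (a numbered `###` heading followed by `Question:`, `My Understanding:`, and `Solution:` bullets) can be illustrated with a small renderer. The sketch below is an assumption-laden example, not part of the package: `formatClarification` and its field names are hypothetical, and the sample entry is invented purely to show the layout.

```javascript
// Hypothetical helper: render one clarification entry in the 0.7.2 shape.
function formatClarification(index, { title, question, understanding, solution }) {
  return [
    `### ${index}. ${title}`,
    `- Question: ${question}`,
    `- My Understanding: ${understanding}`,
    `- Solution: ${solution}`,
  ].join("\n");
}

// Invented sample content, for illustration only.
const entry = formatClarification(1, {
  title: "Password reset delivery",
  question: "The prompt does not say how reset links reach the user.",
  understanding: "Email is the only contact channel the prompt mentions, so delivery has to be locked before planning.",
  solution: "Send reset links by email.",
});
console.log(entry);
```

Compared with the 0.7.1 template, the separate justification section is gone; the new `My Understanding:` bullet carries both the interpretation and the reason it had to be locked.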
@@ -20,7 +20,7 @@ Use this skill whenever `slopmachine-claude` needs to launch, inspect, or messag
  - do not use the OpenCode `developer` subagent for implementation work in the `slopmachine-claude` path
  - do not read Claude transcript files as the normal communication channel
  - communicate with the Claude worker through the packaged live bridge scripts in `~/slopmachine/utils/`
- - use `claude_live_launch.mjs` once per lane and `claude_live_turn.mjs` for each owner message into that lane
+ - use `claude_live_launch.mjs` once per lane and `claude_live_turn.mjs` for each message into that lane
  - set the Claude live runtime settings default `agent` to `developer` so the lane stays on the intended system prompt even if the session is resumed or inspected through Claude-native controls
  - treat bridge `state.json` as the durable control-plane truth for lane status, routing, and Claude session identity
  - treat bridge `result.json` as the semantic source of truth after each completed turn
@@ -32,9 +32,9 @@ Use this skill whenever `slopmachine-claude` needs to launch, inspect, or messag
  - launch the live lane with `--dangerously-skip-permissions` so the worker does not stall on routine file-edit permission prompts inside the bounded repo
  - when Claude uses internal task fan-out and the environment allows explicit agent selection, prefer the installed `developer` agent for implementation-capable branches so the same engineering standard applies across those branches
  - there is no repo-controlled guarantee that every Claude helper subagent globally reuses the `developer` prompt, so keep critical implementation in the main developer lane or in explicitly developer-scoped helper branches rather than relying on unspecified built-in helper behavior
- - make every owner-to-Claude turn boundary-controlled, reviewable, and explicit about what must happen now versus later
- - do not send vague owner prompts such as `continue`, `keep going`, `handle the rest`, or `fix it` without a precise bounded contract
- - each substantive owner message should state the current engineering boundary, exact expected outcomes for that turn, the evidence required back, the important shortcuts that are not acceptable, and the stopping point
+ - make every project-manager-to-Claude turn boundary-controlled, reviewable, and explicit about what must happen now versus later
+ - do not send vague prompts such as `continue`, `keep going`, `handle the rest`, or `fix it` without a precise bounded contract
+ - each substantive message should state the current engineering boundary, exact expected outcomes for that turn, the evidence required back, the important shortcuts that are not acceptable, and the stopping point
  - default to one bounded engineering objective per owner turn; if a request would naturally cross planning, scaffold, development, or gate-review boundaries, split it into separate turns
 
  ## Lane launch rule
@@ -82,7 +82,7 @@ For all later turns in the same bounded developer slot:
  printf '%s' "$PROMPT" | node ~/slopmachine/utils/claude_live_turn.mjs --runtime-dir <dir> --timeout-ms <turn-timeout>
  ```
 
- - inject exactly one owner message at a time into the idle live lane
+ - inject exactly one message at a time into the idle live lane
  - pass the prompt directly to the wrapper through stdin as the primary input path instead of requiring an owner-side prompt file
  - wait for `Stop` or `StopFailure` before sending the next message
  - do not bypass the bridge by calling the channel HTTP endpoint directly from owner logic
@@ -90,7 +90,7 @@ printf '%s' "$PROMPT" | node ~/slopmachine/utils/claude_live_turn.mjs --runtime-
 
  ## Turn-preflight checklist
 
- Before sending any owner message into the live lane:
+ Before sending any message into the live lane:
 
  1. read bridge `state.json` and confirm the lane is the intended lane and currently `idle`
  2. read the latest bridge `result.json` when it exists and review the last normalized Claude answer before composing the next turn
@@ -99,12 +99,12 @@ Before sending any owner message into the live lane:
  5. define the turn contract before writing the prompt: what Claude must produce now, what evidence it must return now, and exactly where it must stop
 
  If the stop boundary is fuzzy, the turn is too broad.
- If the owner prompt would span multiple major boundaries, split it.
+ If the message would span multiple major boundaries, split it.
  Do not send the next turn until the prior turn has been reviewed and either accepted, corrected, or explicitly rerouted.
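Reviewer note: the first step of the turn-preflight checklist (confirm from bridge `state.json` that the lane is the intended one and currently `idle`) might look roughly like the guard below. The actual `state.json` schema is not shown in this diff, so the `lane` and `status` field names are assumptions, and `checkLaneReady` is a hypothetical function that works on an already-parsed object rather than reading the file.

```javascript
// Hypothetical guard over a parsed bridge state object.
// ASSUMPTION: the state exposes `lane` and `status` fields; the real schema may differ.
function checkLaneReady(state, expectedLane) {
  if (!state || typeof state !== "object") {
    return { ok: false, reason: "state missing or unreadable" };
  }
  if (state.lane !== expectedLane) {
    return { ok: false, reason: `unexpected lane: ${state.lane}` };
  }
  if (state.status !== "idle") {
    return { ok: false, reason: `lane is ${state.status}, not idle` };
  }
  return { ok: true, reason: null };
}

console.log(checkLaneReady({ lane: "develop", status: "idle" }, "develop").ok); // true
console.log(checkLaneReady({ lane: "develop", status: "running" }, "develop").reason);
```

Returning a reason instead of throwing keeps the decision to wait, correct, or reroute with the caller, which matches the instruction to review the prior turn before sending the next one.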
 
- ## Canonical owner-message contract
+ ## Canonical lead-message contract
 
- For substantive live-lane turns, write the owner message in natural engineering language but make sure it includes all of these ingredients:
+ For substantive live-lane turns, write the message in natural engineering language but make sure it includes all of these ingredients:
 
  - `Context snapshot`: the current accepted state and only the fresh deltas that matter now
  - `Contract anchor`: the relevant accepted plan sections, clarified decisions, or concrete evaluator findings that define the work
@@ -122,16 +122,18 @@ When the turn intentionally uses internal parallel fan-out, also include:
  - `Fan-in rule`: how Claude should merge the branch results and what integrated verification must run before stopping
 
  Keep the wording natural. Do not turn every prompt into a rigid template dump.
+ The actual message should read like it came from a human project manager or technical lead who is invested in the project, not from workflow software.
+ Do not use obvious automation phrasing such as `owner`, `workflow`, `phase`, `session slot`, `contract anchor`, or `reply contract` in the message sent to Claude unless the user explicitly wants that style.
  But do make the contract mechanically obvious enough that Claude cannot plausibly misunderstand what acceptance depends on.
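Reviewer note: the two lines added in this hunk name six phrases to keep out of messages sent to Claude. A draft message could be screened for them with something like the sketch below. `findAutomationPhrases` is a hypothetical name, the whole-word, case-insensitive matching is an assumption, and the source's exception for users who explicitly want that style is not modeled.

```javascript
// The phrases the 0.7.2 text lists as obvious automation phrasing.
const AUTOMATION_PHRASES = [
  "owner",
  "workflow",
  "phase",
  "session slot",
  "contract anchor",
  "reply contract",
];

// Hypothetical lint: return the listed phrases that occur as whole words
// (case-insensitive) in a draft message. Plurals such as "phases" are not matched.
function findAutomationPhrases(message) {
  const lower = message.toLowerCase();
  return AUTOMATION_PHRASES.filter((phrase) => new RegExp(`\\b${phrase}\\b`).test(lower));
}

console.log(findAutomationPhrases("As the owner of this workflow, finish the phase."));
console.log(findAutomationPhrases("Please finish the invoice export and stop once the tests pass."));
```

The first call reports `owner`, `workflow`, and `phase`; the second returns an empty array.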
 
  ## Canonical prompt shapes
 
  ### Planning-start shape
 
- For the second owner message in the first `develop` lane and for other explicit planning-entry turns:
+ For the second planning-direction message in the first `develop` lane and for other explicit planning-entry turns:
 
  - inline the approved clarification content and requirements-ambiguity resolutions directly in the message
- - include the owner's initial planning view so Claude refines a direction instead of inventing one from zero
+ - include the initial planning view so Claude refines a direction instead of inventing one from zero
  - restate prompt-critical requirements, actors, required surfaces, locked defaults, explicit non-goals, and risky areas in plain engineering language
  - say clearly that the worker should produce an exhaustive, section-addressable implementation plan and must not start coding yet
  - require dense planning artifacts, especially `../docs/design.md`, with explicit treatment of modules, business rules, state machines, permissions, validation, verification strategy, checkpoints, and definition of done when applicable
@@ -164,7 +166,7 @@ For ordinary implementation turns:
  - name the exact slice, user/admin actor path, modules, or surfaces to complete now
  - itemize the expected outcomes for happy path, failure path, and auth/ownership/validation behavior when those dimensions matter
  - require targeted local verification tied back to those expected outcomes
- - explicitly prohibit owner-only broad verification commands and unrelated follow-on work
+ - explicitly prohibit broad verification commands that are reserved for later gate checks and unrelated follow-on work
  - when the slice can truly be parallelized, name the separate branch contracts explicitly instead of asking Claude to infer them
  - say to stop after this slice and report the exact changed files plus exact verification results
 
@@ -199,7 +201,7 @@ For evaluator-driven remediation inside a `bugfix-N` session opened by a `partia
 
  Do not do these:
 
- - send `continue`, `next`, or `keep going` as a substantive owner prompt
+ - send `continue`, `next`, or `keep going` as a substantive prompt
  - ask for planning and implementation in the same turn unless that mixed boundary is intentional and explicitly stated
  - ask for multiple gate exits in one turn
  - let Claude decide its own stopping point implicitly
@@ -262,17 +264,17 @@ When the first `develop` slot begins in planning:
  1. launch the live `develop` lane if it is not already running
  2. send the original prompt plus a plain instruction to read it carefully, not plan yet, and wait for clarifications and planning direction through the bridge
  3. store the Claude session id from bridge `state.json`
- 4. form an initial owner planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
- 5. send a compact second owner message through the same live lane that directly includes the approved clarification content, the requirements-ambiguity resolutions, that initial owner planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for the implementation plan plus major risks or assumptions
+ 4. form an initial planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
+ 5. send a compact second message through the same live lane that directly includes the approved clarification content, the requirements-ambiguity resolutions, that initial planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for the implementation plan plus major risks or assumptions
  6. continue the planning conversation in that same Claude session
 
  Do not merge those two first messages.
  Do not ask for a plan in the first message.
 
- Preferred second owner message shape:
+ Preferred second planning-direction message shape:
 
- - inline the approved clarification content and the requirements-ambiguity resolutions directly in the owner message
- - include the owner's initial planning view so planning is refined collaboratively rather than invented from zero
+ - inline the approved clarification content and the requirements-ambiguity resolutions directly in the message
+ - include the initial planning view so planning is refined collaboratively rather than invented from zero
  - add any short delta notes that are not already captured in that inlined summary
  - express the current boundary in plain engineering language and then ask for an exhaustive, section-addressable implementation plan plus major risks or assumptions
  - require the plan to fill the planning artifacts densely, especially `../docs/design.md`, with explicit sections for actors, success paths, modules, business rules, state machines, permissions, validation, test strategy, checkpoints, and definition of done when those dimensions matter
@@ -280,7 +282,7 @@ Preferred second owner message shape:
  - say explicitly that coding must not start yet and that the response should stop after the planning artifacts and summary are complete
 
  Do not tell the developer worker to read files outside `repo/`.
- If owner-side artifacts outside `repo/` matter, restate their content directly in the owner message instead of passing file paths.
+ If project-lead artifacts outside `repo/` matter, restate their content directly in the message instead of passing file paths.
  Do not mention session names, slot labels, or workflow phase labels to the developer worker.
 
  ### `bugfix-N` orientation handshake
@@ -288,7 +290,7 @@ Do not mention session names, slot labels, or workflow phase labels to the devel
  When a fresh `partial pass` evaluation result opens the next remediation lane:
 
  1. launch a fresh live Claude developer lane for the next `bugfix-N` label
291
- 2. use the first owner message only to orient that session to the repo and the current delivered state
293
+ 2. use the first message only to orient that session to the repo and the current delivered state
292
294
  3. make clear in plain engineering language that follow-up work will be focused remediation against evaluator findings
293
295
  4. wait for the first response and store the Claude session id from bridge `state.json`
294
296
  5. only after that orientation exchange, continue the same `bugfix-N` live lane with the first evaluator-driven issue list
@@ -400,7 +402,7 @@ Do not advance the workflow based only on Bash success if bridge files and metad
400
402
  - if the bridge reports `blocked` because of `claude_usage_limit`, treat that as an automatic wait-and-resume path rather than a handoff-stop condition unless the wait or resume path itself fails
401
403
  - if the saved live lane cannot continue, do not silently create a replacement session unless the workflow explicitly chooses a controlled replacement
402
404
  - if a replacement session is required, record the handoff clearly in metadata and tracker comments
403
- - keep hook logs and transcript pointers for debugging, but do not surface raw bridge artifacts back into normal owner prompts unless debugging is explicitly needed
405
+ - keep hook logs and transcript pointers for debugging, but do not surface raw bridge artifacts back into normal developer-facing prompts unless debugging is explicitly needed
404
406
 
405
407
  ## Rate-limit handling
406
408
 
@@ -65,6 +65,7 @@ Use this skill during `P4 Development` before prompting the developer.
65
65
  - do not let implementation depend on parent-root docs or sibling artifacts for normal repo understanding
66
66
  - explain behavior changes clearly enough that the owner can keep parent-root `../docs/design.md`, `../docs/api-spec.md`, and `../docs/test-coverage.md` accurate when they apply
67
67
  - before reporting development complete, remove or correct local-only setup instructions, host-only dependency assumptions, and other fast-iteration traces that should not survive into the final Docker-contained delivery
68
+ - before reporting development complete, make sure the delivered repo is converging on exactly what `README.md` promises; if the README documents a final runtime command or broad test command, treat that as the required final output format rather than a loose note
68
69
  - verify the module against its planned behavior before trying to move on
69
70
  - do not move on while the module is still obviously weak or half-finished
70
71
  - do not spread broad partial logic across many modules; bias toward completed trustworthy slices before opening the next major chunk
@@ -80,8 +81,10 @@ Use this skill during `P4 Development` before prompting the developer.
80
81
  - if the local toolchain is missing, install or enable the local targeted test tooling; do not fall back to Docker, `./run_tests.sh`, Playwright, or other broad-gate tooling during ordinary slice work
81
82
  - fast local iteration is allowed during development even when the final delivered runtime and broad verification contract must be Docker-contained
82
83
  - do not let temporary local tooling or host-only setup assumptions leak into the final README, wrapper scripts, or declared delivery contract
84
+ - local verification is for speed during development; the README-documented runtime and broad test commands are the final contract that must pass at the later gate when they are part of the README promise
83
85
  - do not run browser E2E, Playwright, full test suites, `./run_tests.sh`, or Docker runtime commands during ordinary development slices
84
86
  - for frontend-bearing projects, rely on targeted local tests such as unit, component, route, page, or state-focused tests instead of browser E2E during ordinary slice work
87
+ - for `fullstack` and `web` projects, treat frontend unit tests as a real expected deliverable rather than optional polish; do not rely on package manifests or tooling presence as a substitute for real test files
85
88
  - for mobile and desktop projects, rely on targeted local non-E2E verification during ordinary slice work rather than broad checkpoint commands
86
89
  - when the slice materially changes frontend code, frontend tooling, or release-facing build behavior, include production build health in meaningful local verification when practical
87
90
  - for non-trivial frontend stateful work, do not rely only on runtime or E2E checks; add component, page, route, or state-focused tests when that is the credible way to prove the behavior statically
@@ -138,6 +138,7 @@ Inside a `partial pass` audit's bugfix loop:
138
138
  - if the report finds any issue, treat that as blocking `P7` completion
139
139
  - route those issues to the currently active recoverable developer session; prefer the most recently used developer session, which will usually be `bugfix-2`
140
140
  - require fixes plus concrete verification evidence from that developer session
141
+ - after the fixes land, if `README.md` documents `docker compose up --build` and/or `./run_tests.sh` as part of the delivered contract, run those exact commands before the next static coverage/README rerun and treat failures as unresolved issues
141
142
  - after the fixes land, run a fresh new coverage/README audit again and replace the old report
142
143
  - allow at most 3 remediation attempts for this final coverage/README audit
143
144
  - if the report is still not clean after the third remediation attempt, stop the retry loop, preserve the latest `../.tmp/test_coverage_and_readme_audit_report.md`, and treat that as the final evidence carried forward
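
The bounded remediation loop described in this hunk (fresh audit, fix, fresh audit again, stop after 3 attempts and keep the latest report) could be sketched roughly as follows. This is only an illustration: `runAudit`, `applyFixes`, the `issues` field, and `MAX_REMEDIATION_ATTEMPTS` are hypothetical names and are not taken from the package.

```javascript
// Hypothetical sketch of the final coverage/README audit loop.
// runAudit and applyFixes are caller-supplied stand-ins, not package APIs.
const MAX_REMEDIATION_ATTEMPTS = 3;

function runFinalAuditLoop(runAudit, applyFixes) {
  // The first audit is a fresh run; its report is the current evidence.
  let report = runAudit();
  let attempts = 0;

  // Keep remediating while issues remain and the attempt budget is not spent.
  while (report.issues.length > 0 && attempts < MAX_REMEDIATION_ATTEMPTS) {
    applyFixes(report.issues);
    attempts += 1;
    // A fresh audit replaces the old report after every remediation attempt.
    report = runAudit();
  }

  // The latest report is carried forward whether or not it is clean.
  return { report, attempts, clean: report.issues.length === 0 };
}
```

Under these assumptions, a report that never becomes clean yields `attempts === 3` and `clean === false`, and the last report produced is the one carried forward.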
@@ -51,6 +51,7 @@ Hardening should treat these as the main review buckets before final evaluation
51
51
  - audit whether feature flags, debug/demo surfaces, default-enabled config states, and mock/interception defaults are disclosed accurately in `README.md` and reflected in external docs when they exist
52
52
  - audit frontend flow readiness: major pages and interactions should have a traceable state model covering loading, empty, submitting, disabled, success, error, and duplicate-action protection where relevant
53
53
  - audit whether frontend-bearing projects have the right mix of component, page/route, and E2E evidence for their complexity rather than only one thin layer
54
+ - for `fullstack` and `web` projects, explicitly determine whether frontend unit tests are PRESENT or MISSING under the strict audit criteria, and treat missing or insufficient frontend unit tests as a critical gap before `P7`
54
55
  - audit whether logging categories, redaction expectations, and validation/error-normalization paths are concrete enough for static review
55
56
  - verify that missing failure handling is not being hidden behind fake-success behavior
56
57
  - run exploratory testing around awkward states, repeated actions, and realistic edge behavior
@@ -58,6 +59,7 @@ Hardening should treat these as the main review buckets before final evaluation
58
59
  - run a prototype-residue sweep for hardcoded preview values, placeholder text, seeded defaults, hidden fallbacks, and computed-but-unrendered behavior
59
60
  - enforce env-file discipline during hardening
60
61
  - run documentation verification against the real codebase and runtime behavior, not just document existence
62
+ - if `README.md` declares containerized runtime or broad test commands, verify that the final delivered output really supports those exact commands and that the docs do not overpromise beyond what the repo actually does
61
63
  - audit README compliance against the strict post-bugfix README review shape:
62
64
  - project type near the top
63
65
  - startup instructions
@@ -67,6 +69,7 @@ Hardening should treat these as the main review buckets before final evaluation
67
69
  - architecture and workflow clarity
68
70
  - for backend, fullstack, and web projects, verify the README still documents the canonical `docker compose up --build` contract while also containing the exact legacy compatibility string `docker-compose up` for the strict README audit
69
71
  - verify that fast local-iteration traces have been cleaned up before hardening closes: no lingering README dependence on `npm install`, `pip install`, `apt-get`, host-only runtime setup, or manual DB setup for the final delivered flow
72
+ - before hardening closes, if the README-documented final contract includes `docker compose up --build` and/or `./run_tests.sh`, require those exact commands to pass or explicitly fail the phase
70
73
  - re-check prompt-critical operational obligations such as scheduled jobs, retention, backups, worker behavior, privacy/accountability logging, and admin controls
71
74
  - enter release-candidate mode: stop feature work and focus only on fixes, verification, docs, and packaging preparation
72
75
  - make sure the system is genuinely reviewable and reproducible
@@ -33,6 +33,7 @@ Once a failure class is known:
33
33
  - for applicable UI-bearing work, this owner-run phase may use the selected stack's platform-appropriate UI/E2E tool for the affected flows, capture screenshots or equivalent artifacts, and verify the UI behavior and quality directly
34
34
  - verify requirement closure, not just feature existence
35
35
  - verify behavior against the current plan, the actual requirements, and any settled project decisions that affect the change
36
+ - verify the delivered runtime and broad-test behavior against `README.md`; if the README says a command is how the project should be run or verified, treat that command as part of the real external contract
36
37
  - verify end-to-end flow behavior where the change affects real workflows
37
38
  - verify that tests are real and effective checks of actual code logic rather than bypass-style or fake-confidence test paths
38
39
  - for web fullstack work, run Playwright coverage for major flows and review screenshots for real UI behavior and regressions
@@ -51,6 +52,7 @@ Once a failure class is known:
51
52
  - trace the changed tests and verification back to the prompt-critical risks, not just the easiest happy paths
52
53
  - when integrated verification repeatedly finds the same avoidable failure class, treat that as evidence that earlier slice execution or slice-close acceptance must become more system-aware in future runs
53
54
  - before closing the phase, verify the delivered startup path is genuinely runnable, the documented tests really execute, frontend behavior is usable when applicable, UI quality is acceptable, core running logic is complete, and Docker startup works when Docker is the runtime contract
55
+ - before closing the phase, if `README.md` documents `docker compose up --build` and/or `./run_tests.sh` as part of the delivered contract, run those exact commands here as part of the final integrated proof for the phase
54
56
  - tighten parent-root `../docs/test-coverage.md` during or immediately after integrated verification so major requirement and risk points, mapped tests, coverage status, and remaining gaps match the actual verification evidence
55
57
  - when security-bearing behavior changes, tighten parent-root `../docs/design.md` and `../docs/api-spec.md` as needed so enforcement points and mapped tests stay accurate
56
58
  - when frontend-bearing behavior changes, tighten `README.md` plus parent-root `../docs/design.md` as needed so key pages, interactions, and required UI states stay accurate
@@ -210,6 +210,7 @@ Selected-stack defaults:
210
210
  - for backend or fullstack projects, explicitly plan coverage for 401, 403, 404, conflicts or duplicate submission when relevant, object-level authorization, tenant or user isolation, sensitive-log exposure, and pagination/filter/sort when those behaviors exist
211
211
  - for frontend-bearing projects, explicitly plan a layered frontend test story when UI state or routing is material: unit, component, page or route integration, and E2E where applicable
212
212
  - for non-trivial frontend projects, explicitly plan a frontend test layer beyond runtime-only confidence: component, page, route, or state-focused tests when UI state complexity is meaningful
213
+ - for `fullstack` and `web` projects, explicitly plan real frontend unit tests and make it possible for later audit output to state `Frontend unit tests: PRESENT` with direct file-level evidence rather than inference
213
214
  - for web fullstack work, explicitly plan Playwright coverage for the synchronized frontend/backend flows when end-to-end testing is applicable, but treat Playwright as a real verified dependency rather than a decorative default
214
215
  - for mobile work, plan Jest plus React Native Testing Library as the local default test layer and add a platform-appropriate mobile UI/E2E tool when real device-flow proof is needed
215
216
  - for desktop work, plan a local desktop test runner plus Playwright Electron support or another platform-appropriate desktop UI/E2E tool when real window-flow proof is needed
@@ -64,6 +64,7 @@ No screenshots are required as packaging artifacts.
64
64
  - ensure `README.md` matches the delivered codebase, functionality, runtime steps, test steps, main repo contents, and important new-developer information, and stays friendly to a junior developer
65
65
  - ensure `README.md` also describes the delivered architecture at an implementation-review level rather than only listing commands
66
66
  - ensure `README.md` remains the primary in-repo documentation surface
67
+ - treat `README.md` as the final public output format for runtime and broad test expectations: the packaged repo must comply exactly with the commands and constraints it documents
67
68
  - verify no repo-local file depends on parent-root docs or sibling workflow artifacts for startup, build/preview, configuration, static review, or basic project understanding
68
69
  - if the project uses mock, stub, fake, interception, or local-data behavior, ensure `README.md` discloses that scope accurately and does not imply undisclosed real integration
69
70
  - if mock or interception behavior is enabled by default, ensure `README.md` says so clearly
@@ -141,6 +142,7 @@ After those steps:
141
142
  - do one final package review before declaring packaging complete
142
143
  - confirm the package is coherent as a delivered project, not just a working repo snapshot
143
144
  - confirm the delivered project is actually runnable in the promised startup model, the documented tests are runnable, frontend behavior is usable when applicable, UI quality is acceptable, core logic is complete, and Docker startup works when Docker is the runtime contract
145
+ - if `README.md` documents `docker compose up --build` and/or `./run_tests.sh` as part of the final contract, make sure the final package review uses those exact commands rather than a substitute path
144
146
  - confirm the final git checkpoint can be created cleanly for the packaged state when a checkpoint is needed
145
147
  - if packaging reveals a real defect or missing artifact, fix it before closing the phase
146
148
  - do not close packaging until all required docs, session exports, audit/fix-check files, cleanup conditions, and final structure checks are satisfied
@@ -26,6 +26,7 @@ Use this skill after development begins whenever you are reviewing work, decidin
26
26
  - require the README to show the correct primary runtime command and `./run_tests.sh` as the primary broad test command
27
27
  - do not require the README to carry a full API catalog
28
28
  - require the README to include the strict audit sections when they are relevant to the project shape: project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`
29
+ - treat the README as the final public contract for runtime and broad-test behavior: if it documents a runtime command or a broad test command, the delivered output must satisfy that exact contract
29
30
  - do not allow the repo to depend on parent-root docs or sibling artifacts for startup, build/preview, configuration, evaluator traceability, or basic project understanding
30
31
  - require the delivered repo to be statically reviewable: README, scripts, entry points, routes, config, and test commands must be traceably consistent
31
32
  - if the project uses mock, stub, fake, interception, or local-data behavior, require the README and visible code boundaries to disclose that scope accurately
@@ -188,11 +189,13 @@ Use evidence such as internal metadata files, structured Beads comments, verific
188
189
  - module implementation acceptance should use a narrow slice-close checklist: required behavior present, adjacent high-risk seams checked, docs or contract honesty preserved, exact verification evidence supplied, and no known release-facing regression left behind
189
190
  - when backend or fullstack APIs are touched, module implementation acceptance should also check that endpoint-oriented coverage notes and true no-mock HTTP tests are moving with the code instead of being deferred indefinitely
190
191
  - integrated verification entry requires one of the limited owner-run broad gate moments once development is complete; this is the normal next place where `docker compose up --build` and `./run_tests.sh` are expected after scaffold acceptance
192
+ - integrated verification entry requires one of the limited owner-run broad gate moments once development is complete; when `README.md` documents `docker compose up --build` and/or `./run_tests.sh`, those exact commands are expected here as part of the final external-contract proof
191
193
  - module implementation acceptance should also challenge whether the slice is advancing toward the planned module contract and the hard minimum 90 percent coverage threshold instead of accumulating test debt
192
194
  - before leaving development, require explicit proof that the planned development outcomes for the relevant modules or slices are actually closed, not merely started, and that the targeted verification evidence covers the important happy path, failure path, and security or ownership path where relevant
193
195
  - before leaving development, require cleanup of local-iteration residue from the delivered contract: final README, wrapper scripts, and declared run/test flows should no longer depend on host-only setup conveniences
194
196
  - integrated verification completion requires explicit full-system evidence before the phase can close
195
197
  - integrated verification completion also requires explicit evidence that the delivered startup path is runnable, the documented tests are real and runnable, frontend behavior is usable when applicable, UI quality is acceptable, core logic is complete, and Docker startup works when Docker is the runtime contract
198
+ - before leaving development, hardening, or packaging, if `README.md` documents a containerized final runtime or broad test command, require those exact commands to be run at the appropriate final gate and verify that the README still matches the real output
196
199
  - web fullstack integrated verification must include owner-run Playwright coverage for every major flow, plus screenshots used to evaluate frontend behavior and UI quality along the flow using `frontend-design`
197
200
  - mobile and desktop integrated verification must include the selected stack's platform-appropriate UI/E2E coverage for every major user flow when UI-bearing flows are material
198
201
  - for Electron or other Linux-targetable desktop projects, integrated verification should use the Dockerized desktop build/test path plus headless UI/runtime verification artifacts
@@ -207,9 +210,11 @@ Use evidence such as internal metadata files, structured Beads comments, verific
207
210
  - before `P7`, require that parent-root `../docs/test-coverage.md` is detailed enough for the owner to map major requirement and risk points to tests and gaps without inference work
208
211
  - before `P7`, require that security-bearing projects present traceable static evidence for auth entry points, route authorization, object authorization, function-level authorization, admin/internal/debug protection, and tenant or user isolation when those dimensions apply
209
212
  - before `P7`, for non-trivial frontend work, require meaningful static frontend test evidence for major state transitions or failure paths rather than relying only on runtime screenshots or E2E confidence
213
+ - before `P7`, for `fullstack` and `web` projects, require an explicit frontend unit-test verdict backed by direct file-level evidence; if frontend unit tests are missing or insufficient, treat that as a critical gap
210
214
  - before `P7`, require repo-local build/preview/config traceability plus disclosure in `README.md` of feature flags, debug/demo surfaces, and mock defaults when those surfaces exist
211
215
  - before `P7`, require logging and validation contracts to be statically traceable enough that the owner can review them from the repo plus external references when needed
212
216
  - final evaluation readiness requires the audit-numbered `P7` model under `../.tmp/`; only `partial pass` fresh evaluations leave persisted `audit_report-<N>.md` files, `fail` audits route back to the latest `develop-N` session and discard their working report after triage, `pass` audits discard their working report and rerun fresh evaluation, `partial pass` audits open scoped `bugfix-N` sessions whose fix checks are stored as `audit_report-<N>-fix_check-<M>.md`, and the last subphase of `P7` runs `test_coverage_and_readme_audit_report.md` with up to 3 remediation attempts before carrying the latest report forward
217
+ - before leaving `P7`, if `README.md` documents `docker compose up --build` and/or `./run_tests.sh` as part of the delivered external contract, run those exact commands on the final state and require them to pass before moving to `P8`
213
218
  - if the `P7` issue-fix loop materially reopens the integrated verification boundary, route it back through integrated verification before continuing with follow-up fix verification
214
219
  - before leaving `P7`, require the parent-root `../.tmp/test_coverage_and_readme_audit_report.md` to exist from the last `P7` subphase; if it finds issues, route the fixes to the currently active recoverable developer session, replace the report, and rerun the audit, but stop after 3 remediation attempts and keep the latest report as the final carried-forward evidence
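
Several of the added lines above repeat the same rule: when `README.md` documents `docker compose up --build` and/or `./run_tests.sh`, those exact commands must be run and must pass at the gate. A minimal sketch of that check, assuming the README text and per-command exit codes are already available, might look like this. The helper names `findContractCommands` and `evaluateGate` and the `exitCodes` shape are hypothetical and do not appear in the package.

```javascript
// Hypothetical helpers; names and data shapes are illustrative only.
const CONTRACT_COMMANDS = ['docker compose up --build', './run_tests.sh'];

// Return the contract commands that the README text documents verbatim.
function findContractCommands(readmeText) {
  return CONTRACT_COMMANDS.filter((command) => readmeText.includes(command));
}

// Decide the gate outcome from a map of command -> exit code.
// A documented command with no recorded exit code counts as failed,
// because the rule requires the exact command to have been run.
function evaluateGate(readmeText, exitCodes) {
  const required = findContractCommands(readmeText);
  const failed = required.filter((command) => exitCodes[command] !== 0);
  return { required, failed, passed: failed.length === 0 };
}
```

A README that documents neither command produces an empty `required` list, so the gate passes trivially. That matches the conditional wording of the rule ("if `README.md` documents ...").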
215
220
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "theslopmachine",
3
- "version": "0.7.1",
3
+ "version": "0.7.2",
4
4
  "description": "SlopMachine installer and project bootstrap CLI",
5
5
  "license": "MIT",
6
6
  "type": "module",
package/src/init.js CHANGED
@@ -288,7 +288,13 @@ async function createInitialPhaseArtifacts(targetPath, options) {
288
288
  `## Bootstrap Status\n\n` +
289
289
  `- Workspace initialized by slopmachine.\n` +
290
290
  `${options.adoptExisting ? '- Existing project adoption mode is active.\n' : ''}` +
291
- `${options.requestedStartPhase ? `- Requested start phase: ${options.requestedStartPhase}.\n` : ''}`
291
+ `${options.requestedStartPhase ? `- Requested start phase: ${options.requestedStartPhase}.\n` : ''}` +
292
+ `\n## Entry Template\n\n` +
293
+ `Copy this exact structure for each clarification item:\n\n` +
294
+ `### 1. Clarification Defaults for Planning\n` +
295
+ `- Question: Can the drafted clarification defaults be used for planning?\n` +
296
+ `- My Understanding: The prompt was large enough that planning needed explicit confirmation that the clarification package was acceptable. We needed to lock this in rather than carrying uncertainty forward into the planning phase.\n` +
297
+ `- Solution: Yes. Proceed with the drafted defaults, allowing planning to start from the approved clarification brief instead of an uncertain baseline.\n`
292
298
 
293
299
  const prePlanningBriefContent = `# Pre-Planning Brief\n\n` +
294
300
  `Capture the planning-critical project shape here before real planning begins.\n\n` +