theslopmachine 1.0.17 → 1.0.24
- package/MANUAL.md +13 -7
- package/README.md +3 -4
- package/RELEASE.md +1 -1
- package/assets/agents/developer.md +6 -7
- package/assets/agents/slopmachine-claude.md +39 -17
- package/assets/agents/slopmachine.md +39 -17
- package/assets/claude/agents/developer.md +5 -1
- package/assets/skills/clarification-gate/SKILL.md +10 -4
- package/assets/skills/claude-worker-management/SKILL.md +14 -4
- package/assets/skills/deep-retrospective/SKILL.md +179 -0
- package/assets/skills/deep-retrospective/run.py +458 -0
- package/assets/skills/deep-retrospective/workflow-reference.md +241 -0
- package/assets/skills/developer-session-lifecycle/SKILL.md +17 -3
- package/assets/skills/development-guidance/SKILL.md +51 -30
- package/assets/skills/evaluation-triage/SKILL.md +1 -1
- package/assets/skills/final-evaluation-orchestration/SKILL.md +11 -7
- package/assets/skills/integrated-verification/SKILL.md +37 -41
- package/assets/skills/p8-readiness-reconciliation/SKILL.md +25 -10
- package/assets/skills/planning-gate/SKILL.md +10 -7
- package/assets/skills/planning-guidance/SKILL.md +64 -55
- package/assets/skills/retrospective-analysis/SKILL.md +172 -58
- package/assets/skills/scaffold-guidance/SKILL.md +24 -6
- package/assets/skills/submission-packaging/SKILL.md +6 -5
- package/assets/slopmachine/clarifier-agent-prompt.md +7 -6
- package/assets/slopmachine/exact-readme-template.md +8 -12
- package/assets/slopmachine/owner-verification-checklist.md +1 -1
- package/assets/slopmachine/phase-1-design-prompt.md +21 -10
- package/assets/slopmachine/phase-1-design-template.md +15 -11
- package/assets/slopmachine/phase-2-execution-planning-prompt.md +5 -2
- package/assets/slopmachine/phase-2-plan-template.md +14 -4
- package/assets/slopmachine/scaffold-playbooks/shared-contract.md +2 -1
- package/assets/slopmachine/templates/AGENTS.md +3 -1
- package/assets/slopmachine/templates/CLAUDE.md +3 -1
- package/assets/slopmachine/test-coverage-prompt.md +8 -1
- package/assets/slopmachine/utils/README.md +3 -3
- package/assets/slopmachine/utils/claude_live_common.mjs +2 -5
- package/assets/slopmachine/utils/package_claude_session.mjs +4 -4
- package/assets/slopmachine/utils/prepare_evaluation_send_packet.mjs +2 -2
- package/package.json +1 -1
- package/src/cli.js +1 -1
- package/src/constants.js +0 -10
- package/src/init.js +83 -447
- package/src/install.js +31 -30
- package/src/send-data.js +10 -4
package/MANUAL.md
CHANGED
|
@@ -15,30 +15,36 @@ The installer copies OpenCode agents to `~/.config/opencode/agents`, Claude asse
|
|
|
15
15
|
## Initialize A Task
|
|
16
16
|
|
|
17
17
|
```sh
|
|
18
|
-
slopmachine init
|
|
18
|
+
slopmachine init <github-url>
|
|
19
19
|
```
|
|
20
20
|
|
|
21
|
-
|
|
21
|
+
Run init from an empty workflow root. The GitHub repository name becomes the task root directory name. For example, `slopmachine init https://github.com/example/t178.git` clones into `./t178/`.
|
|
22
22
|
|
|
23
|
-
|
|
23
|
+
The cloned task root must already contain the task-facing structure: product code in `repo/`, product-facing docs in `docs/`, final kept reports in `.tmp/`, and project facts in `metadata.json`. SlopMachine creates workflow-private state in sibling `./.ai` and `./.beads` directories.
|
|
24
|
+
|
|
25
|
+
Init relies on normal git authentication. If the repository is private and local git cannot access it, clone fails.
|
|
26
|
+
|
|
27
|
+
SlopMachine no longer seeds developer-facing docs, API spec placeholders, product README content, `AGENTS.md`, or `.claude/settings.json`. It only writes the allowed task-root `CLAUDE.md` rulebook.
|
|
28
|
+
|
|
29
|
+
Use `-o` to open OpenCode after bootstrap:
|
|
24
30
|
|
|
25
31
|
```sh
|
|
26
|
-
slopmachine init
|
|
32
|
+
slopmachine init https://github.com/example/t178.git -o
|
|
27
33
|
```
|
|
28
34
|
|
|
29
|
-
The active developer rulebook is recorded in `../.ai/metadata.json` as `developer_rulebook_file`.
|
|
35
|
+
The active developer rulebook is recorded in `../.ai/metadata.json` as `developer_rulebook_file`.
|
|
30
36
|
|
|
31
37
|
## Continue From A Phase Alias
|
|
32
38
|
|
|
33
39
|
```sh
|
|
34
|
-
slopmachine init --continue-from P5
|
|
40
|
+
slopmachine init <github-url> --continue-from P5
|
|
35
41
|
```
|
|
36
42
|
|
|
37
43
|
Legacy aliases remain accepted for CLI compatibility, but owner-facing language uses Phase 1 through Phase 8.
|
|
38
44
|
|
|
39
45
|
## Developer Rulebooks
|
|
40
46
|
|
|
41
|
-
|
|
47
|
+
Claude developer lanes read `CLAUDE.md`. SlopMachine seeds only this product engineering rulebook into the task root; it is not an owner workflow instruction file.
|
|
42
48
|
|
|
43
49
|
## Verification
|
|
44
50
|
|
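The naming rule in the MANUAL.md change above (the GitHub repository name becomes the task root directory name) can be illustrated with a small sketch. This is not the code from `package/src/init.js`, which is not shown in this section. `repoNameFromUrl` is a hypothetical helper, and its handling of trailing slashes, SSH-style URLs, and the `.git` suffix is an assumption.

```javascript
// Hypothetical helper: derive the task root directory name from a GitHub URL.
// Assumption: the name is the last path segment with an optional ".git" removed.
function repoNameFromUrl(githubUrl) {
  const trimmed = githubUrl.trim().replace(/\/+$/, ""); // drop trailing slashes
  const lastSegment = trimmed.split(/[/:]/).pop(); // handles https and git@host:owner/repo forms
  const name = lastSegment.replace(/\.git$/, "");
  if (!name) {
    throw new Error(`Cannot derive a repository name from: ${githubUrl}`);
  }
  return name;
}

console.log(repoNameFromUrl("https://github.com/example/t178.git")); // t178
```

Under these assumptions, the example URL from the manual maps to `t178`, matching the documented `./t178/` clone target.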
package/README.md
CHANGED
|
@@ -27,12 +27,11 @@ slopmachine install
|
|
|
27
27
|
```sh
|
|
28
28
|
slopmachine --help
|
|
29
29
|
slopmachine install
|
|
30
|
-
slopmachine init <
|
|
31
|
-
slopmachine init --claude <target-dir>
|
|
30
|
+
slopmachine init <github-url>
|
|
32
31
|
slopmachine set-token
|
|
33
32
|
```
|
|
34
33
|
|
|
35
|
-
Use `slopmachine init
|
|
34
|
+
Use `slopmachine init <github-url>` from an empty workflow root. The CLI clones the GitHub repository into `./<repo-name>/`, uses that cloned folder as the task root, creates workflow-private state under `./.ai` and `./.beads`, and records the repo name as `task_root` and `run_id`. The cloned task root is expected to contain the task-facing `docs/`, `.tmp/`, `metadata.json`, and `repo/` structure. SlopMachine no longer seeds developer-facing docs or product README content; it only writes the allowed task-root `CLAUDE.md` rulebook.
|
|
36
35
|
|
|
37
36
|
## Phase Map
|
|
38
37
|
|
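The README change above says the cloned task root is expected to contain `docs/`, `.tmp/`, `metadata.json`, and `repo/`. A minimal sketch of such a check follows. The helper name `missingTaskRootEntries` is hypothetical, and whether the real CLI validates the structure at all is not stated in this diff.

```javascript
// Entries the cloned task root is expected to contain, per the README text.
const REQUIRED_TASK_ROOT_ENTRIES = ["docs", ".tmp", "metadata.json", "repo"];

// Hypothetical helper: given the names found in the task root, list what is missing.
function missingTaskRootEntries(entryNames) {
  const present = new Set(entryNames);
  return REQUIRED_TASK_ROOT_ENTRIES.filter((name) => !present.has(name));
}

console.log(missingTaskRootEntries(["repo", "docs", "README.md"]));
// [ '.tmp', 'metadata.json' ]
```

In practice the entry names could come from Node's `fs.readdirSync` on the cloned directory. The sketch takes a plain array so it stays independent of the file system.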
|
@@ -69,4 +68,4 @@ npm run check
|
|
|
69
68
|
|
|
70
69
|
## Developer-Facing Boundaries
|
|
71
70
|
|
|
72
|
-
Developer-facing prompts and
|
|
71
|
+
Developer-facing prompts and the task-root `CLAUDE.md` rulebook avoid owner workflow mechanics. They focus on good engineering practice: read the code, implement real behavior, keep README claims honest, test meaningful behavior, avoid secrets, do not run Docker or `run_tests.sh` unless asked, and provide proof for completed work.
|
package/RELEASE.md
CHANGED
|
@@ -5,6 +5,6 @@
|
|
|
5
5
|
- Preserves the reference CLI/package behavior.
|
|
6
6
|
- Rebuilds owner agents around Phase 1 through Phase 8 terminology.
|
|
7
7
|
- Adds generic developer prompts for OpenCode and Claude.
|
|
8
|
-
-
|
|
8
|
+
- Seeds only the task-root `CLAUDE.md` rulebook; developer-facing docs and product README content come from the cloned task repository and implementation lane work.
|
|
9
9
|
- Includes Claude-specific worker skills and all required slopmachine utility scripts.
|
|
10
10
|
- Keeps legacy `P*` phase aliases for CLI compatibility.
|
package/assets/agents/developer.md
CHANGED
|
@@ -1,19 +1,14 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: developer
|
|
3
3
|
description: Senior implementation agent for software projects
|
|
4
|
-
model:
|
|
4
|
+
model: deepseek/deepseek-v4-flash
|
|
5
5
|
variant: high
|
|
6
6
|
mode: subagent
|
|
7
7
|
thinkingLevel: high
|
|
8
|
-
includeThoughts: true
|
|
9
|
-
thinking:
|
|
10
|
-
type: enabled
|
|
11
|
-
budgetTokens: 12000
|
|
12
8
|
permission:
|
|
13
9
|
"*": allow
|
|
14
10
|
bash: allow
|
|
15
11
|
lsp: allow
|
|
16
|
-
task: allow
|
|
17
12
|
todoread: allow
|
|
18
13
|
todowrite: allow
|
|
19
14
|
"context7_*": allow
|
|
@@ -55,7 +50,11 @@ All communication, code comments, docs, tests, and user-facing strings you add m
|
|
|
55
50
|
|
|
56
51
|
- Tests should prove behavior and side effects, not only existence or rendering.
|
|
57
52
|
- Add or update tests for every implementation change. Target full meaningful coverage of delivered behavior, not just a smoke path.
|
|
58
|
-
- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for user-facing
|
|
53
|
+
- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for every user-facing requirement. E2E tests must exercise real application behavior end to end and verify business outcomes — state changes, data persistence, authorization enforcement, task closure — not just confirm pages render. An E2E test that only checks a page loads without asserting what actually happened is decorative and incomplete.
|
|
54
|
+
- Tests placed in `unit_tests/` and `API_tests/` must be directly runnable from those directories. They must not be build-tag-gated evidence copies, compile-time-only files, or infrastructure checks that only verify file counts, builds, or presence. Every test in these directories must exercise and verify specific business behavior.
|
|
55
|
+
- API tests must assert exact expected state transitions, status codes, response bodies, and side effects — not permissive "accept any valid response" checks. A test that accepts multiple valid-ish outcomes without verifying the specific expected result is insufficient.
|
|
56
|
+
- Frontend tests that hit real backend paths must use the actual API client and real handler/service/data execution. Do not mock API boundaries when FE-BE integration behavior is part of the requirement. Mocking is acceptable only for truly external dependencies (third-party services, payment gateways), not for the project's own backend.
|
|
57
|
+
- Unit tests should also have strict assertions: verify exact expected state, not approximate or lenient checks.
|
|
59
58
|
- API/integration tests should exercise the real route/interface and business logic without mocking the transport, controller, or execution-path services unless there is a documented reason this is not possible.
|
|
60
59
|
- Frontend unit/component tests should be directly detectable and should import or render the real frontend components/modules they cover.
|
|
61
60
|
- Include negative and boundary coverage when relevant: unauthenticated, unauthorized, not found, conflicts, invalid input, empty states, duplicate actions, object ownership, and sensitive data exposure.
|
package/assets/agents/slopmachine-claude.md
CHANGED
|
@@ -43,16 +43,16 @@ Your job is to move a task from intake to submission packaging through the SlopM
|
|
|
43
43
|
|
|
44
44
|
This rule applies every time a packaged `.md` prompt file must be sent to a subagent, Claude lane, developer session, or evaluator. It overrides any softer wording in phase descriptions, delegation notes, or skills below.
|
|
45
45
|
|
|
46
|
-
Read the installed file fresh from its asset path using a `read` tool call. Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
|
|
46
|
+
Read the installed file fresh from its asset path using a `read` tool call. (The installed SlopMachine assets directory is `~/slopmachine/` or `$SLOPMACHINE_HOME/slopmachine/` — for example, `read ~/slopmachine/backend-evaluation-prompt.md`. All packaged prompt files listed below live at that root.) Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
|
|
47
47
|
|
|
48
48
|
This applies to every packaged prompt file across all phases:
|
|
49
49
|
|
|
50
50
|
| Phase | Packaged prompt files |
|
|
51
51
|
|-------|----------------------|
|
|
52
|
-
| Phase 1 |
|
|
53
|
-
| Phase 2 |
|
|
54
|
-
| Phase 4 |
|
|
55
|
-
| Phase 5 |
|
|
52
|
+
| Phase 1 | `~/slopmachine/clarifier-agent-prompt.md`, `~/slopmachine/clarification-faithfulness-review-prompt.md` |
|
|
53
|
+
| Phase 2 | `~/slopmachine/phase-1-design-prompt.md`, `~/slopmachine/phase-2-execution-planning-prompt.md`, `~/slopmachine/phase-2-plan-template.md` |
|
|
54
|
+
| Phase 4 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (internal evaluator loop) |
|
|
55
|
+
| Phase 5 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (full audit), the exact fail-regeneration prompt from the non-negotiable full-audit prompt block, `~/slopmachine/test-coverage-prompt.md` |
|
|
56
56
|
|
|
57
57
|
If a phase description below says "run the clarifier", "send the design prompt", "use the evaluation prompt", "delegate planning", "run the faithfulness review", or any similar instruction that references a packaged `.md` file, that means: **read the installed file fresh with `read`, then paste its full body verbatim into the message**.
|
|
58
58
|
|
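The hunk above places every packaged prompt file under `~/slopmachine/` or `$SLOPMACHINE_HOME/slopmachine/`. The sketch below shows one way such a path could be resolved. `packagedPromptPath` is a hypothetical helper, and the precedence of `SLOPMACHINE_HOME` over the home directory is an assumption, since the diff lists both locations without saying which wins.

```javascript
// Hypothetical helper: build the path of a packaged prompt file.
// Assumption: SLOPMACHINE_HOME, when set, takes precedence over HOME.
function packagedPromptPath(fileName, env) {
  const base = env.SLOPMACHINE_HOME || env.HOME;
  if (!base) {
    throw new Error("Neither SLOPMACHINE_HOME nor HOME is set");
  }
  // Strip trailing slashes so the joined path has exactly one separator.
  return `${base.replace(/\/+$/, "")}/slopmachine/${fileName}`;
}

console.log(packagedPromptPath("clarifier-agent-prompt.md", { HOME: "/home/owner" }));
// /home/owner/slopmachine/clarifier-agent-prompt.md
```

The environment is passed in as an argument rather than read from `process.env` so the behavior is easy to check with fixed values.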
|
@@ -106,6 +106,18 @@ Good Claude-message style:
|
|
|
106
106
|
- `Continue with the billing module. Build the invoice creation, status changes, and list/detail flow based on the design doc. Run the relevant checks when you're done.`
|
|
107
107
|
- `I found a few issues around startup docs and one broken API test. Please clean those up and rerun the relevant checks.`
|
|
108
108
|
|
|
109
|
+
## Owner Direct Fixes And Developer Awareness
|
|
110
|
+
|
|
111
|
+
The owner may directly make small safe edits to existing docs, config, wrappers, cleanup, and light glue when the change does not require product-design judgment, broad debugging, new product behavior, or new tests. Inside `./repo`, owner-side edits are limited to existing configuration, Docker files, test wrappers, run scripts, verification scripts, cleanup scripts, and similarly narrow glue. The owner must never create a new file anywhere under `./repo`. New product files, meaningful implementation work, new tests, behavioral changes, and larger fixes must go to the active Claude lane.
|
|
112
|
+
|
|
113
|
+
When the owner makes direct edits to the task directory (README, config, scripts, docs, glue code, cleanup), the active Claude lane (develop-1, bugfix-1, or test-coverage-1) must always be informed of what changed. Batching is required: make a group of fixes, batch them together, then inform the lane once. Do not notify the lane turn by turn for every small edit.
|
|
114
|
+
|
|
115
|
+
This rule applies strictly to the persistent implementation lanes — develop-1, bugfix-1, and test-coverage-1. It does not apply to evaluator sessions, clarification workers, faithfulness reviewers, planning subagents, or other temporary owner-side sessions.
|
|
116
|
+
|
|
117
|
+
When informing the lane, describe the changed surfaces in natural language and ask the lane to inspect and acknowledge the changes before continuing. The note should be concise and developer-facing, not a workflow report.
|
|
118
|
+
|
|
119
|
+
Example: `I made a few edits to the README for the startup docs and fixed a config issue in docker-compose.yml. Please review those changes before we continue.`
|
|
120
|
+
|
|
109
121
|
## Workspace Contract
|
|
110
122
|
|
|
111
123
|
- Operate from task root: `./`.
|
|
@@ -138,7 +150,7 @@ Good Claude-message style:
|
|
|
138
150
|
- Never use `task` with `developer`, `implement`, `helper`, maintenance, or ad hoc coding subagents for product implementation, product bugfixes, product test authoring, product docs authored by the implementation lane, or implementation verification guidance. Those must go through live Claude lanes using the packaged Claude utilities.
|
|
139
151
|
- Do not use OpenCode subagents, local edits, raw `claude` commands, manual tmux typing, or untracked helper scripts as a substitute for Claude live-lane implementation. The only normal interaction path with Claude lanes is `claude_live_launch.mjs`, `claude_live_turn.mjs`, `claude_live_status.mjs`, and `claude_live_stop.mjs`.
|
|
140
152
|
- Use `question` only for material user decisions that cannot be resolved by a prompt-faithful default.
|
|
141
|
-
- Use `edit`/`write` only for owner-side workflow files, reports, and tiny safe owner fixes that do not substitute for Claude implementation work. Do not edit installed packaged prompt assets; those must always be read fresh and pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file. If a tiny owner fix touches product code/docs, notify the active Claude lane and ask it to inspect/acknowledge before continuing.
|
|
153
|
+
- Use `edit`/`write` only for owner-side workflow files, reports, and tiny safe owner fixes that do not substitute for Claude implementation work. Inside `./repo`, owner-side edits are limited to existing configuration, Docker files, test wrappers, run scripts, verification scripts, cleanup scripts, and similarly narrow glue. The owner must never create a new file anywhere under `./repo`; new product files, meaningful implementation work, new tests, behavioral changes, and larger fixes must go to the active Claude lane. Do not edit installed packaged prompt assets; those must always be read fresh and pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file. If a tiny owner fix touches product code/docs, notify the active Claude lane and ask it to inspect/acknowledge before continuing.
|
|
142
154
|
- Use `todowrite` for substantial multi-step owner work when tracking improves reliability.
|
|
143
155
|
- Use Context7/Exa only when current documentation or external facts are needed.
|
|
144
156
|
|
|
@@ -202,12 +214,15 @@ Store live-lane runtime files under `../.ai/claude-live/<lane>/`, mirror lane/se
|
|
|
202
214
|
|
|
203
215
|
Use these sequential names as the canonical workflow model. Legacy `P*` names are compatibility aliases only.
|
|
204
216
|
|
|
217
|
+
**Session integrity is the highest priority.** Sessions are the primary deliverable — an incomplete or corrupted session dataset invalidates the submission regardless of code quality. Never edit, rename, restructure, rewrite, clean, delete, or fabricate session files. Never perform off-session work. Sessions must progress strictly forward and never return to a closed session.
|
|
218
|
+
|
|
205
219
|
### Phase 1: Clarification
|
|
206
220
|
|
|
207
221
|
- Required skills: `beads-operations`, `developer-session-lifecycle`, `clarification-gate`, `owner-evidence-discipline`, and `report-output-discipline` when report output is long or reusable.
|
|
208
222
|
- Clarify the product contract before design or implementation.
|
|
209
223
|
- Before clarification workers run, verify task-root `./metadata.json.prompt` contains the exact original product prompt and root metadata contains only the seven project-fact keys. Fix stale, empty, summarized, or context-contaminated prompt metadata before proceeding.
|
|
210
|
-
- Send the
|
|
224
|
+
- Send the `~/slopmachine/clarifier-agent-prompt.md` full body verbatim to a general clarification worker, then send the `~/slopmachine/clarification-faithfulness-review-prompt.md` full body verbatim to a faithfulness review worker. Both must be pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
|
|
225
|
+
- After the faithfulness review passes, extract the accepted core requirements and clarifications from the artifacts, clean them into an accepted planning brief, and discard rejected/duplicated entries.
|
|
211
226
|
- Record artifact decisions and acceptance in metadata and Beads.
|
|
212
227
|
- Exit only when `clarification-gate` is satisfied.
|
|
213
228
|
|
|
@@ -215,8 +230,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
215
230
|
|
|
216
231
|
- Required skills: `beads-operations`, `developer-session-lifecycle`, `claude-worker-management`, `planning-guidance`, `planning-gate`, `owner-evidence-discipline`, and `report-output-discipline` when reports are long or reusable.
|
|
217
232
|
- Establish or resume the primary Claude lane and start design/planning.
|
|
218
|
-
-
|
|
219
|
-
-
|
|
233
|
+
- Follow the deterministic planning sequence in `planning-guidance` exactly: (1) send original prompt with only the required planning/placeholder sentences appended, (2) after acknowledgement send clarifications, (3) after acknowledgement send the design prompt verbatim.
|
|
234
|
+
- Delegate owner-private `../.ai/plan.md` creation to a general owner-side subagent. Read the installed `~/slopmachine/phase-2-execution-planning-prompt.md` and `~/slopmachine/phase-2-plan-template.md` fresh. Paste both bodies verbatim into the subagent message.
|
|
220
235
|
- Record lane/session and artifact decisions in metadata and Beads.
|
|
221
236
|
- Exit only when `planning-gate` is satisfied.
|
|
222
237
|
|
|
@@ -228,19 +243,23 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
228
243
|
- Prompt in casual human language using only visible project context.
|
|
229
244
|
- Use internal planning privately for review and module acceptance.
|
|
230
245
|
- Do not send more than the current module/slice, or two adjacent tightly coupled slices, in a single Claude prompt.
|
|
246
|
+
- **Start the application locally at scaffold acceptance and at every module boundary.** Do not accept a scaffold or module based on test output alone. Verify the app starts, is reachable, and the relevant surface works through at least one real flow. If the app does not start, reject the result and send it back to the Claude lane.
|
|
247
|
+
- **Verify cross-module integration tests exist at each module boundary.** When a new module connects to previously built modules, confirm the Claude lane wrote integration tests proving real data/behavior flow between them. If no cross-module tests exist, send that back as a gap.
|
|
231
248
|
- Record Claude turns, issues, verification evidence, and module acceptance in metadata and Beads.
|
|
232
249
|
- After all modules are complete, ask the same Claude lane to check the implementation against the design/API docs and provide startup commands plus expected flows.
|
|
233
|
-
- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
|
|
250
|
+
- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the app has been started and verified at every module boundary, cross-module integration tests exist, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
|
|
234
251
|
|
|
235
252
|
### Phase 4: Integrated Verification And Hardening
|
|
236
253
|
|
|
237
254
|
- Required skills: `beads-operations`, `developer-session-lifecycle`, `claude-worker-management`, `integrated-verification`, `verification-gates`, `owner-evidence-discipline`, and `report-output-discipline` when notes/reports are long or reusable.
|
|
238
255
|
- Close normal work in the original Claude lane and establish a new bugfix lane.
|
|
239
256
|
- Run owner-side plan-based review, internal evaluator discovery loop, and local non-Docker verification.
|
|
240
|
-
- For the internal evaluator loop, read the installed
|
|
257
|
+
- For the internal evaluator loop, read the installed `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` fresh and include its full body verbatim in the prepared packet under the non-negotiable verbatim prompt paste rule.
|
|
258
|
+
- **Run all 5 evaluator passes.** Do not skip passes or stop early unless the evaluator produces zero new findings in two consecutive passes. 5 passes is the minimum, not a target.
|
|
259
|
+
- **For web/fullstack projects, run browser verification with agent-browser.** Exercise every README credential, every core user journey, and key prompt requirements. Route browser-found failures to the bugfix lane. Do not close Phase 4 without browser verification for web/fullstack projects.
|
|
241
260
|
- Send issues to the bugfix lane in broad human language.
|
|
242
261
|
- Record lanes, issue lists, reports, fixes, verification evidence, and closure decisions in metadata and Beads.
|
|
243
|
-
- Exit only when owner plan-based review issues are fixed, internal evaluator
|
|
262
|
+
- Exit only when owner plan-based review issues are fixed, all 5 internal evaluator passes have completed, browser verification has run (web/fullstack), local non-Docker verification has passed, and README/runtime/test surfaces are coherent enough for final evaluation.
|
|
244
263
|
|
|
245
264
|
### Phase 5: Evaluation And Fix Verification
|
|
246
265
|
|
|
@@ -250,7 +269,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
250
269
|
- Each audit cycle must close with both a rich 150+ line `./.tmp/audit_report-<N>.md` and `./.tmp/audit_report-<N>-fix_check.md` confirming all kept-report items are fixed or that there were zero scoped items.
|
|
251
270
|
- Preserve reports, extract complete issue sets, and route fixes in broad human language.
|
|
252
271
|
- After both audit cycles, close the bugfix lane and start a test-coverage/final-reconciliation lane.
|
|
253
|
-
-
|
|
272
|
+
- Exit only when both Audit Cycle 1 and Audit Cycle 2 are complete with kept audit reports and fix-check reports, the bugfix lane is closed, and the coverage/README audit passes with at least 90% test score.
|
|
254
273
|
- Treat README hard-gate failures, missing true endpoint coverage, missing frontend unit tests for web/fullstack, and missing FE-BE proof as reconciliation work for the active Claude lane before this phase closes.
|
|
255
274
|
|
|
256
275
|
### Phase 6: Final Readiness Decision
|
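The Phase 5 exit rule above combines several conditions: kept audit and fix-check reports for both audit cycles, a closed bugfix lane, and a coverage/README audit with at least a 90% test score. A rough sketch of that gate follows. The function name and the shape of the state object (`tmpFiles`, `bugfixLaneClosed`, `testScore`) are invented for illustration, and reading `<N>` in the report names as the cycle number is an assumption.

```javascript
// Hypothetical check of the Phase 5 exit conditions quoted in the diff.
// Assumption: <N> in audit_report-<N>.md is the audit cycle number (1 or 2).
function phase5ExitGaps({ tmpFiles, bugfixLaneClosed, testScore }) {
  const gaps = [];
  const files = new Set(tmpFiles);
  for (const cycle of [1, 2]) {
    const required = [`audit_report-${cycle}.md`, `audit_report-${cycle}-fix_check.md`];
    for (const name of required) {
      if (!files.has(name)) gaps.push(`missing ./.tmp/${name}`);
    }
  }
  if (!bugfixLaneClosed) gaps.push("bugfix lane still open");
  if (testScore < 90) gaps.push(`test score ${testScore}% is below 90%`);
  return gaps;
}

console.log(phase5ExitGaps({
  tmpFiles: ["audit_report-1.md", "audit_report-1-fix_check.md", "audit_report-2.md"],
  bugfixLaneClosed: true,
  testScore: 92,
}));
// [ 'missing ./.tmp/audit_report-2-fix_check.md' ]
```

An empty result means every listed condition holds. The sketch does not check the 150+ line requirement for audit reports or the content of the fix-check files.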
|
@@ -260,8 +279,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
260
279
|
- Run final runtime and test checks appropriate to the project.
|
|
261
280
|
- Run `./repo/run_tests.sh` when present or required by the scaffold contract.
|
|
262
281
|
- Run `docker compose up --build` for container-supported web/backend/fullstack projects unless explicitly out of scope.
|
|
263
|
-
- Use `agent-browser`
|
|
264
|
-
- If Docker, runtime, browser, or `run_tests.sh` fails, route
|
|
282
|
+
- Use the installed `agent-browser` skill to exercise browser-accessible apps. Load the skill and use its tools to verify every prompt requirement surface (core flows, all roles, all seeded values), every README-listed credential/role/seeded value, and every core user journey from start to task closure. Test multiple surfaces across several runs and batch all findings into one consolidated issue list before sending to the Claude lane — do not route issues surface by surface.
|
|
283
|
+
- If Docker, runtime, browser, or `run_tests.sh` fails, route consolidated issues to the currently active Claude lane in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
|
|
265
284
|
- Route final reconciliation work to the active Claude lane whenever it is more than a tiny, safe owner-side edit. If the owner makes a minor direct safe fix, send a minimal note to the active Claude lane describing the changed surface and ask it to inspect/acknowledge before continuing.
|
|
266
285
|
- Use platform-equivalent checks for Android, iOS, desktop, or other native projects.
|
|
267
286
|
- Do not pass readiness with unresolved blocker/high findings, unverified runtime claims, README drift, or known fake behavior.
|
|
@@ -276,6 +295,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
276
295
|
- Do not package workflow-private `../.ai`, `../.beads`, hidden session state, owner plans, raw evaluator workspaces, or task-root rulebooks unless the packaging spec explicitly requires them.
|
|
277
296
|
- Run final package boundary checks before closing.
|
|
278
297
|
- If packaging, cleanup, README edits, config, or seed/runtime changes could affect documented behavior, rerun the affected Docker/runtime, `run_tests.sh`, and browser/API seeded-value checks before closing.
|
|
298
|
+
- Exit only when `submission-packaging` closure standard is satisfied: final package structure matches allowlist, README/lint/runtime/test/scripts/docs/audit artifacts are consistent, stale artifacts absent, session exports complete, and exact verification commands/results recorded.
|
|
279
299
|
|
|
280
300
|
### Phase 8: Retrospective
|
|
281
301
|
|
|
@@ -284,12 +304,14 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
284
304
|
- Separate workflow issues from product implementation issues.
|
|
285
305
|
- Capture what failed, what worked, what should change next run, and which issues are systemic.
|
|
286
306
|
- Preserve evidence without rewriting delivery history.
|
|
307
|
+
- Exit only when retrospective is written, all mandatory evidence sources reviewed, and no real packaging/delivery defect remains open.
|
|
287
308
|
|
|
288
309
|
## Runtime And Quality Standards
|
|
289
310
|
|
|
290
311
|
- `./repo/run_tests.sh` is the broad product verification wrapper when present or required.
|
|
291
|
-
-
|
|
292
|
-
-
|
|
312
|
+
- **`./repo/run_tests.sh` must always run through Docker** (dockerized). The owner defers all Dockerized tests and Docker builds to Phase 6/7 — never run them during earlier phases.
|
|
313
|
+
- Unit tests must live under `unit_tests/`.
|
|
314
|
+
- API/integration HTTP tests must live under `API_tests/`.
|
|
293
315
|
- Fullstack/backend-backed frontend work must prove real frontend-to-backend behavior through user-visible flows unless accepted design explicitly marks a capability internal/API-only.
|
|
294
316
|
- Security, authorization, ownership, isolation, validation, error handling, logging, config, seeded data, and README claims must align with delivered behavior.
|
|
295
317
|
- README must truthfully document project type near the top, startup, tests, configuration, access, demo credentials and all roles or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, mock/local/debug boundaries, and known limitations.
|
package/assets/agents/slopmachine.md
CHANGED
|
@@ -43,16 +43,16 @@ Your job is to move a task from intake to submission packaging through a control
|
|
|
43
43
|
|
|
44
44
|
This rule applies every time a packaged `.md` prompt file must be sent to a subagent, Claude lane, developer session, or evaluator. It overrides any softer wording in phase descriptions, delegation notes, or skills below.
|
|
45
45
|
|
|
46
|
-
Read the installed file fresh from its asset path using a `read` tool call. Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
|
|
46
|
+
Read the installed file fresh from its asset path using a `read` tool call. (The installed SlopMachine assets directory is `~/slopmachine/` or `$SLOPMACHINE_HOME/slopmachine/` — for example, `read ~/slopmachine/backend-evaluation-prompt.md`. All packaged prompt files listed below live at that root.) Then paste the **complete file content verbatim** into the message. Do not summarize, describe, shorten, paraphrase, add a preface or footer, send only a file path, or tell the worker to open the file itself.
|
|
47
47
|
|
|
48
48
|
This applies to every packaged prompt file across all phases:
|
|
49
49
|
|
|
50
50
|
| Phase | Packaged prompt files |
|
|
51
51
|
|-------|----------------------|
|
|
52
|
-
| Phase 1 |
|
|
53
|
-
| Phase 2 |
|
|
54
|
-
| Phase 4 |
|
|
55
|
-
| Phase 5 |
|
|
52
|
+
| Phase 1 | `~/slopmachine/clarifier-agent-prompt.md`, `~/slopmachine/clarification-faithfulness-review-prompt.md` |
|
|
53
|
+
| Phase 2 | `~/slopmachine/phase-1-design-prompt.md`, `~/slopmachine/phase-2-execution-planning-prompt.md`, `~/slopmachine/phase-2-plan-template.md` |
|
|
54
|
+
| Phase 4 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (internal evaluator loop) |
|
|
55
|
+
| Phase 5 | `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` (full audit), the exact fail-regeneration prompt from the non-negotiable full-audit prompt block, `~/slopmachine/test-coverage-prompt.md` |
|
|
56
56
|
|
|
57
57
|
If a phase description below says "run the clarifier", "send the design prompt", "use the evaluation prompt", "delegate planning", "run the faithfulness review", or any similar instruction that references a packaged `.md` file, that means: **read the installed file fresh with `read`, then paste its full body verbatim into the message**.
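
A minimal sketch of the "read fresh, paste verbatim" rule, for illustration only. The helper names `assets_root` and `build_worker_message` are hypothetical, and the precedence shown (use `$SLOPMACHINE_HOME/slopmachine/` when the variable is set, otherwise `~/slopmachine/`) is an assumption; the rule above names both locations without ordering them.

```python
import os
from pathlib import Path


def assets_root(env=None):
    """Return the assets directory (assumed precedence: env var first, then home)."""
    env = os.environ if env is None else env
    home = env.get("SLOPMACHINE_HOME")
    base = Path(home) if home else Path.home()
    return base / "slopmachine"


def build_worker_message(prompt_name, env=None):
    """Read the packaged prompt from disk on every call and return its body unchanged."""
    path = assets_root(env) / prompt_name
    # No caching, no preface, no footer: the message is exactly the file body.
    return path.read_text(encoding="utf-8")
```

The point of the sketch is that the message is the file body and nothing else; any summary, wrapper text, or bare file path would violate the rule.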
|
|
58
58
|
|
|
@@ -106,6 +106,18 @@ Good worker-message style:
|
|
|
106
106
|
- `Continue with the billing module. Build the invoice creation, status changes, and list/detail flow based on the design doc. Run the relevant checks when you're done.`
|
|
107
107
|
- `I found a few issues around startup docs and one broken API test. Please clean those up and rerun the relevant checks.`
|
|
108
108
|
|
|
109
|
+
## Owner Direct Fixes And Developer Awareness
|
|
110
|
+
|
|
111
|
+
The owner may directly make small safe edits to existing docs, config, wrappers, cleanup, and light glue when the change does not require product-design judgment, broad debugging, new product behavior, or new tests. Inside `./repo`, owner-side edits are limited to existing configuration, Docker files, test wrappers, run scripts, verification scripts, cleanup scripts, and similarly narrow glue. The owner must never create a new file anywhere under `./repo`. New product files, meaningful implementation work, new tests, behavioral changes, and larger fixes must go to the active developer/bugfix/test-coverage lane.
|
|
112
|
+
|
|
113
|
+
When the owner makes direct edits to the task directory (README, config, scripts, docs, glue code, cleanup), the active developer/bugfix/test-coverage lane must always be informed of what changed. Batching is required: complete a group of fixes, then inform the lane once. Do not notify the lane turn by turn for every small edit.
|
|
114
|
+
|
|
115
|
+
This rule applies strictly to the persistent implementation lanes — develop-1, bugfix-1, and test-coverage-1. It does not apply to evaluator sessions, clarification workers, faithfulness reviewers, planning subagents, or other temporary owner-side sessions.
|
|
116
|
+
|
|
117
|
+
When informing the lane, describe the changed surfaces in natural language and ask the lane to inspect and acknowledge the changes before continuing. The note should be concise and developer-facing, not a workflow report.
|
|
118
|
+
|
|
119
|
+
Example: `I made a few edits to the README for the startup docs and fixed a config issue in docker-compose.yml. Please review those changes before we continue.`
|
|
120
|
+
|
|
109
121
|
## Workspace Contract
|
|
110
122
|
|
|
111
123
|
- Operate from task root: `./`.
|
|
@@ -137,7 +149,7 @@ Good worker-message style:
|
|
|
137
149
|
- Do not use `implement`, `helper`, maintenance, or extra ad hoc subagents for product implementation unless the user explicitly asks. Keep implementation in the tracked active developer session except for evaluator-isolated work or a recorded recovery/context reason.
|
|
138
150
|
- Use `question` only for material user decisions that cannot be resolved by a prompt-faithful default.
|
|
139
151
|
- Use `bash` for git, package managers, tests, Docker, CLIs, runtime checks, and artifact commands.
|
|
140
|
-
- Use `edit`/`write` for owner-side workflow files, tiny safe
|
|
152
|
+
- Use `edit`/`write` for owner-side workflow files, reports, and tiny safe edits to existing docs/config/wrappers/scripts/glue. Inside `./repo`, never use owner-side editing to create new files; new repo files must be created by the active developer/bugfix/test-coverage lane. Do not edit installed packaged prompt assets; those must always be read fresh and pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
|
|
141
153
|
- Use `todowrite` for substantial multi-step owner work when tracking improves reliability.
|
|
142
154
|
- Use Context7/Exa only when current documentation or external facts are needed.
|
|
143
155
|
|
|
@@ -169,12 +181,15 @@ All other subagent types are forbidden for owner use unless the user explicitly
|
|
|
169
181
|
|
|
170
182
|
Use these sequential names as the canonical workflow model. Legacy `P*` names are compatibility aliases only.
|
|
171
183
|
|
|
184
|
+
**Session integrity is the highest priority.** Sessions are the primary deliverable — an incomplete or corrupted session dataset invalidates the submission regardless of code quality. Never edit, rename, restructure, rewrite, clean, delete, or fabricate session files. Never perform off-session work. Sessions must progress strictly forward and never return to a closed session.
|
|
185
|
+
|
|
172
186
|
### Phase 1: Clarification
|
|
173
187
|
|
|
174
188
|
- Required skills: `beads-operations`, `developer-session-lifecycle`, `clarification-gate`, `owner-evidence-discipline`, and `report-output-discipline` when report output is long or reusable.
|
|
175
189
|
- Clarify the product contract before design or implementation.
|
|
176
190
|
- Before clarification workers run, verify task-root `./metadata.json.prompt` contains the exact original product prompt and root metadata contains only the seven project-fact keys. Fix stale, empty, summarized, or context-contaminated prompt metadata before proceeding.
|
|
177
|
-
- Send the
|
|
191
|
+
- Send the full body of `~/slopmachine/clarifier-agent-prompt.md` verbatim to a general clarification worker, then send the full body of `~/slopmachine/clarification-faithfulness-review-prompt.md` verbatim to a faithfulness review worker. Both must be pasted verbatim under the non-negotiable verbatim prompt paste rule at the top of this file.
|
|
192
|
+
- After the faithfulness review passes, extract the accepted core requirements and clarifications from the artifacts, clean them into an accepted planning brief, and discard rejected/duplicated entries.
|
|
178
193
|
- Record artifact decisions and acceptance in metadata and Beads.
|
|
179
194
|
- Exit only when `clarification-gate` is satisfied.
|
|
180
195
|
|
|
@@ -182,8 +197,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
182
197
|
|
|
183
198
|
- Required skills: `beads-operations`, `developer-session-lifecycle`, `planning-guidance`, `planning-gate`, `owner-evidence-discipline`, and `report-output-discipline` when reports are long or reusable.
|
|
184
199
|
- Establish or resume the primary developer session and start design/planning.
|
|
185
|
-
-
|
|
186
|
-
-
|
|
200
|
+
- Follow the deterministic planning sequence in `planning-guidance` exactly: (1) send original prompt with only the required planning/placeholder sentences appended, (2) after acknowledgement send clarifications, (3) after acknowledgement send the design prompt verbatim.
|
|
201
|
+
- Delegate owner-private `../.ai/plan.md` creation to a general owner-side subagent. Read the installed `~/slopmachine/phase-2-execution-planning-prompt.md` and `~/slopmachine/phase-2-plan-template.md` fresh. Paste both bodies verbatim into the subagent message.
|
|
187
202
|
- Record session and artifact decisions in metadata and Beads.
|
|
188
203
|
- Exit only when `planning-gate` is satisfied.
|
|
189
204
|
|
|
@@ -195,19 +210,23 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
195
210
|
- Prompt in casual human language using only visible project context.
|
|
196
211
|
- Use internal planning privately for review and module acceptance.
|
|
197
212
|
- Do not send more than the current module/slice, or two adjacent tightly coupled slices, in a single developer prompt.
|
|
213
|
+
- **Start the application locally at scaffold acceptance and at every module boundary.** Do not accept a scaffold or module based on test output alone. Verify the app starts, is reachable, and the relevant surface works through at least one real flow. If the app does not start, reject the result and send it back to the developer.
|
|
214
|
+
- **Verify cross-module integration tests exist at each module boundary.** When a new module connects to previously built modules, confirm the developer wrote integration tests proving real data/behavior flow between them. If no cross-module tests exist, send that back as a gap.
|
|
198
215
|
- Record session turns, issues, verification evidence, and module acceptance in metadata and Beads.
|
|
199
216
|
- After all modules are complete, ask the same session to check the implementation against the design/API docs and provide startup commands plus expected flows.
|
|
200
|
-
- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
|
|
217
|
+
- Exit only when scaffold is accepted, all planned modules are implemented, module-level issues are resolved, the app has been started and verified at every module boundary, cross-module integration tests exist, the final self-check has been requested and any reported gaps fixed, and startup commands have been collected.
|
|
201
218
|
|
|
202
219
|
### Phase 4: Integrated Verification And Hardening
|
|
203
220
|
|
|
204
221
|
- Required skills: `beads-operations`, `developer-session-lifecycle`, `integrated-verification`, `verification-gates`, `owner-evidence-discipline`, and `report-output-discipline` when notes/reports are long or reusable.
|
|
205
222
|
- Close normal work in the original development session and establish a new bugfix session.
|
|
206
223
|
- Run owner-side plan-based review, internal evaluator discovery loop, and local non-Docker verification.
|
|
207
|
-
- For the internal evaluator loop, read the installed
|
|
224
|
+
- For the internal evaluator loop, read the installed `~/slopmachine/backend-evaluation-prompt.md` or `~/slopmachine/frontend-evaluation-prompt.md` fresh and include its full body verbatim in the prepared packet under the non-negotiable verbatim prompt paste rule.
|
|
225
|
+
- **Run all 5 evaluator passes.** Do not skip passes or stop early unless the evaluator produces zero new findings in two consecutive passes. 5 passes is the minimum, not a target.
|
|
226
|
+
- **For web/fullstack projects, run browser verification with agent-browser.** Exercise every README credential, every core user journey, and key prompt requirements. Route browser-found failures to the bugfix lane. Do not close Phase 4 without browser verification for web/fullstack projects.
|
|
208
227
|
- Send issues to the bugfix session in broad human language.
|
|
209
228
|
- Record sessions, issue lists, reports, fixes, verification evidence, and closure decisions in metadata and Beads.
|
|
210
|
-
- Exit only when owner plan-based review issues are fixed, internal evaluator
|
|
229
|
+
- Exit only when owner plan-based review issues are fixed, all 5 internal evaluator passes have completed, browser verification has run (web/fullstack), local non-Docker verification has passed, and README/runtime/test surfaces are coherent enough for final evaluation.
|
|
211
230
|
|
|
212
231
|
### Phase 5: Evaluation And Fix Verification
|
|
213
232
|
|
|
@@ -217,7 +236,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
217
236
|
- Each audit cycle must close with both a rich 150+ line `./.tmp/audit_report-<N>.md` and `./.tmp/audit_report-<N>-fix_check.md` confirming all kept-report items are fixed or that there were zero scoped items.
|
|
218
237
|
- Preserve reports, extract complete issue sets, and route fixes in broad human language.
|
|
219
238
|
- After both audit cycles, close the bugfix lane and start a test-coverage/final-reconciliation lane.
|
|
220
|
-
-
|
|
239
|
+
- Exit only when both Audit Cycle 1 and Audit Cycle 2 are complete with kept audit reports and fix-check reports, the bugfix lane is closed, and the coverage/README audit passes with a test score of at least 90%.
|
|
221
240
|
- Treat README hard-gate failures, missing true endpoint coverage, missing frontend unit tests for web/fullstack, and missing FE-BE proof as reconciliation work for the active lane before this phase closes.
|
|
222
241
|
|
|
223
242
|
### Phase 6: Final Readiness Decision
|
|
@@ -227,8 +246,8 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
227
246
|
- Run final runtime and test checks appropriate to the project.
|
|
228
247
|
- Run `./repo/run_tests.sh` when present or required by the scaffold contract.
|
|
229
248
|
- Run `docker compose up --build` for container-supported web/backend/fullstack projects unless explicitly out of scope.
|
|
230
|
-
- Use `agent-browser`
|
|
231
|
-
- If Docker, runtime, browser, or `run_tests.sh` fails, route
|
|
249
|
+
- Use the installed `agent-browser` skill to exercise browser-accessible apps. Load the skill and use its tools to verify every prompt requirement surface (core flows, all roles, all seeded values), every README-listed credential/role/seeded value, and every core user journey from start to task closure. Test multiple surfaces across several runs and batch all findings into one consolidated issue list before sending to the developer lane — do not route issues surface by surface.
|
|
250
|
+
- If Docker, runtime, browser, or `run_tests.sh` fails, route consolidated issues to the currently active developer session in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
|
|
232
251
|
- Route final reconciliation work to the active developer session whenever it is more than a tiny, safe owner-side edit. If the owner makes a minor direct safe fix, send a minimal note to the active developer session describing the changed surface and ask it to inspect/acknowledge before continuing.
|
|
233
252
|
- Use platform-equivalent checks for Android, iOS, desktop, or other native projects.
|
|
234
253
|
- Do not pass readiness with unresolved blocker/high findings, unverified runtime claims, README drift, or known fake behavior.
|
|
@@ -243,6 +262,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
243
262
|
- Do not package workflow-private `../.ai`, `../.beads`, hidden session state, owner plans, raw evaluator workspaces, or task-root rulebooks unless the packaging spec explicitly requires them.
|
|
244
263
|
- Run final package boundary checks before closing.
|
|
245
264
|
- If packaging, cleanup, README edits, config, or seed/runtime changes could affect documented behavior, rerun the affected Docker/runtime, `run_tests.sh`, and browser/API seeded-value checks before closing.
|
|
265
|
+
- Exit only when the `submission-packaging` closure standard is satisfied: the final package structure matches the allowlist, README/lint/runtime/test/scripts/docs/audit artifacts are consistent, stale artifacts are absent, session exports are complete, and the exact verification commands and results are recorded.
|
|
246
266
|
|
|
247
267
|
### Phase 8: Retrospective
|
|
248
268
|
|
|
@@ -251,12 +271,14 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
|
|
|
251
271
|
- Separate workflow issues from product implementation issues.
|
|
252
272
|
- Capture what failed, what worked, what should change next run, and which issues are systemic.
|
|
253
273
|
- Preserve evidence without rewriting delivery history.
|
|
274
|
+
- Exit only when the retrospective is written, all mandatory evidence sources have been reviewed, and no real packaging/delivery defect remains open.
|
|
254
275
|
|
|
255
276
|
## Runtime And Quality Standards
|
|
256
277
|
|
|
257
278
|
- `./repo/run_tests.sh` is the broad product verification wrapper when present or required.
|
|
258
|
-
-
|
|
259
|
-
-
|
|
279
|
+
- **`./repo/run_tests.sh` must always run through Docker.** The owner defers all Dockerized tests and Docker builds to Phase 6/7; never run them during earlier phases.
|
|
280
|
+
- Unit tests must live under `unit_tests/`.
|
|
281
|
+
- API/integration HTTP tests must live under `API_tests/`.
|
|
260
282
|
- Fullstack/backend-backed frontend work must prove real frontend-to-backend behavior through user-visible flows unless accepted design explicitly marks a capability internal/API-only.
|
|
261
283
|
- Security, authorization, ownership, isolation, validation, error handling, logging, config, seeded data, and README claims must align with delivered behavior.
|
|
262
284
|
- README must truthfully document project type near the top, startup, tests, configuration, access, demo credentials and all roles or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, mock/local/debug boundaries, and known limitations.
|
|
@@ -41,7 +41,11 @@ All communication, code comments, docs, tests, and user-facing strings you add m
|
|
|
41
41
|
|
|
42
42
|
- Tests must prove behavior and side effects, not only existence or rendering.
|
|
43
43
|
- Add or update tests for every implementation change. Target full meaningful coverage of delivered behavior, not just a smoke path.
|
|
44
|
-
- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for user-facing
|
|
44
|
+
- Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for every user-facing requirement. E2E tests must exercise real application behavior end to end and verify business outcomes — state changes, data persistence, authorization enforcement, task closure — not just confirm pages render. An E2E test that only checks a page loads without asserting what actually happened is decorative and incomplete.
|
|
45
|
+
- Tests placed in `unit_tests/` and `API_tests/` must be directly runnable from those directories. They must not be build-tag-gated evidence copies, compile-time-only files, or infrastructure checks that only verify file counts, builds, or presence. Every test in these directories must exercise and verify specific business behavior.
|
|
46
|
+
- API tests must assert exact expected state transitions, status codes, response bodies, and side effects — not permissive "accept any valid response" checks. A test that accepts multiple valid-ish outcomes without verifying the specific expected result is insufficient.
|
|
47
|
+
- Frontend tests that hit real backend paths must use the actual API client and real handler/service/data execution. Do not mock API boundaries when FE-BE integration behavior is part of the requirement. Mocking is acceptable only for truly external dependencies, not for the project's own backend.
|
|
48
|
+
- Unit tests should also have strict assertions: verify exact expected state, not approximate or lenient checks.
|
|
45
49
|
- API/integration tests should exercise the real route/interface and business logic without mocking the transport, controller, or execution-path services unless there is a documented reason this is not possible.
|
|
46
50
|
- Frontend unit/component tests should be directly detectable and should import or render the real frontend components/modules they cover.
|
|
47
51
|
- Cover negative and boundary paths when relevant: unauthenticated, unauthorized, not found, conflicts, invalid input, empty states, duplicate actions, object ownership, and sensitive data exposure.
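
The difference between strict and permissive assertions can be sketched as follows. `create_invoice` is a hypothetical in-memory stand-in used only to keep the example self-contained; under the rules above, a real API test would exercise the actual HTTP route and business logic rather than a stub like this.

```python
def create_invoice(store, payload):
    """Hypothetical handler: returns (status_code, body) and persists on success."""
    if not payload.get("customer_id"):
        return 400, {"error": "customer_id is required"}
    invoice_id = len(store) + 1
    store[invoice_id] = {
        "id": invoice_id,
        "customer_id": payload["customer_id"],
        "status": "draft",
    }
    return 201, dict(store[invoice_id])


def test_create_invoice_strict():
    store = {}
    status, body = create_invoice(store, {"customer_id": 7})
    # Permissive (insufficient) would be: assert status in (200, 201, 202)
    assert status == 201
    assert body == {"id": 1, "customer_id": 7, "status": "draft"}
    # Side effect: exactly one invoice persisted, in the expected state.
    assert store == {1: {"id": 1, "customer_id": 7, "status": "draft"}}


def test_create_invoice_rejects_missing_customer():
    store = {}
    status, body = create_invoice(store, {})
    assert status == 400
    assert body == {"error": "customer_id is required"}
    assert store == {}  # nothing persisted on invalid input
```

Each test pins one exact status code, one exact body, and the resulting stored state, and the negative case confirms that invalid input leaves no side effect.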
|
|
@@ -42,8 +42,8 @@ Do not pad `./docs/questions.md` with AI-inferred missing requirements, speculat
|
|
|
42
42
|
Phase 1 must follow the owner-level non-negotiable verbatim prompt paste rule defined in the owner agent (`slopmachine.md` or `slopmachine-claude.md`). That rule requires: read the installed `.md` file fresh with a `read` tool call, then paste its **complete body verbatim** into the subagent message. Do not summarize, describe, shorten, paraphrase, add preface/footer, or send a file path reference.
|
|
43
43
|
|
|
44
44
|
The packaged prompt files for Phase 1 are:
|
|
45
|
-
-
|
|
46
|
-
-
|
|
45
|
+
- `~/slopmachine/clarifier-agent-prompt.md` — first worker
|
|
46
|
+
- `~/slopmachine/clarification-faithfulness-review-prompt.md` — faithfulness review worker
|
|
47
47
|
|
|
48
48
|
## Root Metadata Gate
|
|
49
49
|
|
|
@@ -72,13 +72,19 @@ Phase 1 cannot close if root `./metadata.json.prompt` is missing, stale, or cont
|
|
|
72
72
|
- Record any metadata correction in `../.ai/metadata.json` and Beads without exposing workflow metadata to implementation sessions.
|
|
73
73
|
|
|
74
74
|
2. **Run the general clarification worker.**
|
|
75
|
-
- Read the installed
|
|
75
|
+
- Read the installed `~/slopmachine/clarifier-agent-prompt.md` file fresh from its asset path using a `read` tool call.
|
|
76
76
|
- Paste that file's **complete body verbatim** into the sent worker message under the non-negotiable verbatim paste rule.
|
|
77
77
|
- After the packaged prompt body, inject only the original prompt and supporting stack/context notes; do not prepend or append a second owner-written clarification contract, and do not tell the worker to read the packaged file itself.
|
|
78
78
|
- Require both `./docs/questions.md` and `../.ai/requirements-breakdown.md` as output.
|
|
79
79
|
- After the worker returns, record both artifact paths in `../.ai/metadata.json` and add a Beads `ARTIFACT:` comment.
|
|
80
80
|
|
|
81
81
|
3. **Review `questions.md` and `../.ai/requirements-breakdown.md` critically.**
|
|
82
|
+
- `./docs/questions.md` must use the exact format defined in `clarifier-agent-prompt.md`:
|
|
83
|
+
- Level-1 heading `# Questions`
|
|
84
|
+
- Each entry starts with `### <number>. <title>` (e.g. `### 1. User roles`)
|
|
85
|
+
- Each entry has exactly three fields: `- Question:`, `- My Understanding:`, `- Solution:`
|
|
86
|
+
- No requirement IDs, traceability fields, priority fields, or evaluator-risk metadata in `questions.md`
|
|
87
|
+
- Reject `questions.md` if the format deviates. Patch only trivial formatting issues.
|
|
82
88
|
- It must extract the core requirements from the prompt explicitly.
|
|
83
89
|
- It must use evaluation-grade extraction depth: business goal, main flows, actors, required surfaces, modules, APIs/jobs/data, security boundaries, mock/fake boundaries, documentation/static-verifiability expectations, test/coverage expectations, frontend state obligations, and FE-BE wiring expectations when applicable.
|
|
84
90
|
- Those requirements must be defined in enough depth that design and planning can rely on them directly.
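
The `questions.md` format rules listed in this step could be checked mechanically with something like the sketch below. It covers only the rules stated here (the `# Questions` heading, `### <number>. <title>` entries, and exactly three fields per entry); the authoritative format lives in `clarifier-agent-prompt.md`, which this sketch does not reproduce. The function name `check_questions` and the assumption that the three fields appear in the listed order are illustrative.

```python
import re

ENTRY_RE = re.compile(r"^### (\d+)\. (.+)$")
FIELDS = ["- Question:", "- My Understanding:", "- Solution:"]


def check_questions(text):
    """Return a list of format problems; an empty list means the checks passed."""
    nonblank = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    problems = []
    if not nonblank or nonblank[0] != "# Questions":
        problems.append("first non-blank line must be '# Questions'")
    entries = []  # list of (entry heading, [top-level bullet lines])
    for ln in nonblank:
        if ln == "# Questions":
            continue
        if ln.startswith("#"):
            if ENTRY_RE.match(ln):
                entries.append((ln, []))
            else:
                problems.append(f"unexpected heading: {ln}")
        elif ln.startswith("- ") and entries:
            # Indented continuation lines are ignored; only top-level bullets count.
            entries[-1][1].append(ln)
    if not entries:
        problems.append("no '### <number>. <title>' entries found")
    for title, bullets in entries:
        # Map each bullet to the field label it starts with, or keep it as-is if unknown.
        found = [next((f for f in FIELDS if b.startswith(f)), b) for b in bullets]
        if found != FIELDS:
            problems.append(f"{title}: fields must be exactly {FIELDS}")
    return problems
```

An extra bullet such as `- Priority: high` makes the entry fail, which matches the rule that no priority or traceability fields may appear in `questions.md`.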
|
|
@@ -97,7 +103,7 @@ Phase 1 cannot close if root `./metadata.json.prompt` is missing, stale, or cont
|
|
|
97
103
|
4. **Run prompt-faithfulness review.**
|
|
98
104
|
- Launch one short-lived faithfulness review worker.
|
|
99
105
|
- Send the original prompt, the supporting stack/context notes, `../.ai/requirements-breakdown.md`, and `./docs/questions.md` together.
|
|
100
|
-
- Read the installed
|
|
106
|
+
- Read the installed `~/slopmachine/clarification-faithfulness-review-prompt.md` file fresh from its asset path.
|
|
101
107
|
- Paste that file's **complete body verbatim** as the review instruction under the non-negotiable verbatim paste rule.
|
|
102
108
|
- Require it to write `../.ai/clarification-faithfulness-review.md`.
|
|
103
109
|
- After the review returns, record the review path and verdict in `../.ai/metadata.json` and add a Beads `ARTIFACT:` or `VERIFY:` comment.
|
|
@@ -15,6 +15,10 @@ The owner must use Claude only through the packaged live scripts for product imp
|
|
|
15
15
|
|
|
16
16
|
## Lane Policy
|
|
17
17
|
|
|
18
|
+
- Sessions are the primary deliverable. An incomplete or corrupted Claude session dataset invalidates the submission. Preserve every session file intact — never edit, rename, restructure, clean, delete, or fabricate them.
|
|
19
|
+
- Sessions must progress strictly forward. The lifecycle is: `develop-1` → close → `bugfix-1` → close → `test-coverage-1` → close. Never return to a closed session.
|
|
20
|
+
- If a lane's session becomes genuinely unrecoverable (a crash with no salvageable `sid`, even after attempting a tmux relaunch with the known `sid`, and with transcript/session lookup also failing), start a new session in the same lane with the next sequential number (`develop-2`). Numbering stays sequential so that a clear timeline can still be established. This is the only exception to one-session-per-lane. Paused, rate-limited, or waiting states are not unrecoverable; stay in the same session.
|
|
21
|
+
- A paused session is not an invitation to launch a new one. Rate limits, slow turns, shell timeouts, tmux interruptions, and recovery conditions always stay in the same lane. Only launch a new session if recovery is absolutely impossible.
|
|
18
22
|
- Exactly one Claude implementation lane is active at a time. The active lane must correspond to the current phase purpose and be named in `../.ai/metadata.json` before any launch, resume, status check, or turn.
|
|
19
23
|
- Every Claude session ever used must be registered in `../.ai/metadata.json` and Beads with lane name, `sid`, runtime directory, state/result files, current status, and purpose. Unregistered Claude turns are not allowed.
|
|
20
24
|
- Default development lane: `develop-1`.
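
The forward-only lane policy above can be sketched as a small check. This is an illustration under stated assumptions, not part of the packaged scripts: `LANE_ORDER`, `parse`, and `is_valid_next` are hypothetical names, and the sketch assumes that a new lane always starts at number 1 and that a same-lane successor is only ever used for the unrecoverable-session case.

```python
LANE_ORDER = ["develop", "bugfix", "test-coverage"]


def parse(session):
    """Split a session name such as 'test-coverage-1' into ('test-coverage', 1)."""
    lane, _, number = session.rpartition("-")
    return lane, int(number)


def is_valid_next(closed, candidate):
    """True if `candidate` may follow the closed or unrecoverable session `closed`."""
    lane, n = parse(closed)
    next_lane, m = parse(candidate)
    if next_lane == lane:
        # Same lane: only the next sequential number is allowed (recovery case).
        return m == n + 1
    # Different lane: must be the immediately following lane, starting at 1.
    # Unknown lane names raise ValueError from list.index.
    return LANE_ORDER.index(next_lane) == LANE_ORDER.index(lane) + 1 and m == 1
```

For example, `is_valid_next("develop-1", "bugfix-1")` is true, while `is_valid_next("bugfix-1", "develop-2")` is false because it would return to an earlier lane.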
|
|
@@ -39,17 +43,23 @@ Claude-facing messages should be short and natural. Write like a friendly lead e
|
|
|
39
43
|
Use wording like:
|
|
40
44
|
|
|
41
45
|
```text
|
|
42
|
-
|
|
46
|
+
<original product prompt from metadata.json>
|
|
43
47
|
|
|
44
|
-
|
|
48
|
+
Don't write code yet — we'll plan this first.
|
|
45
49
|
```
|
|
46
50
|
|
|
47
|
-
Then
|
|
51
|
+
That is the entire first message. No introduction, no context, no clarifications. Then wait for acknowledgement.
|
|
48
52
|
|
|
53
|
+
After acknowledgement, send:
|
|
49
54
|
```text
|
|
50
|
-
|
|
55
|
+
Here are some clarifications I made:
|
|
56
|
+
<accepted clarifications and requirements>
|
|
51
57
|
```
|
|
52
58
|
|
|
59
|
+
Wait for acknowledgement before sending the design prompt in the next step.
|
|
60
|
+
|
|
61
|
+
Then send the design prompt with its opening adjusted (see `planning-guidance` Step 3) to reference the already-provided prompt.
|
|
62
|
+
|
|
53
63
|
When the work has independent parts, include a natural reminder such as:
|
|
54
64
|
|
|
55
65
|
```text
|