theslopmachine 1.0.11 → 1.0.13
- package/assets/agents/developer.md +2 -0
- package/assets/agents/slopmachine-claude.md +9 -2
- package/assets/agents/slopmachine.md +9 -2
- package/assets/claude/agents/developer.md +2 -0
- package/assets/skills/development-guidance/SKILL.md +10 -5
- package/assets/skills/final-evaluation-orchestration/SKILL.md +2 -2
- package/assets/skills/integrated-verification/SKILL.md +2 -0
- package/assets/skills/p8-readiness-reconciliation/SKILL.md +7 -4
- package/assets/skills/planning-gate/SKILL.md +5 -2
- package/assets/skills/planning-guidance/SKILL.md +3 -2
- package/assets/skills/scaffold-guidance/SKILL.md +1 -1
- package/assets/skills/submission-packaging/SKILL.md +5 -1
- package/assets/slopmachine/exact-readme-template.md +2 -2
- package/assets/slopmachine/owner-verification-checklist.md +5 -4
- package/assets/slopmachine/phase-1-design-prompt.md +2 -1
- package/assets/slopmachine/phase-1-design-template.md +20 -2
- package/assets/slopmachine/phase-2-execution-planning-prompt.md +5 -3
- package/assets/slopmachine/phase-2-plan-template.md +18 -9
- package/assets/slopmachine/templates/AGENTS.md +1 -0
- package/assets/slopmachine/templates/CLAUDE.md +1 -0
- package/package.json +1 -1
@@ -56,6 +56,8 @@ All communication, code comments, docs, tests, and user-facing strings you add m
 - Tests should prove behavior and side effects, not only existence or rendering.
 - Add or update tests for every implementation change. Target full meaningful coverage of delivered behavior, not just a smoke path.
 - Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for user-facing flows.
+- API/integration tests should exercise the real route/interface and business logic without mocking the transport, controller, or execution-path services unless there is a documented reason this is not possible.
+- Frontend unit/component tests should be directly detectable and should import or render the real frontend components/modules they cover.
 - Include negative and boundary coverage when relevant: unauthenticated, unauthorized, not found, conflicts, invalid input, empty states, duplicate actions, object ownership, and sensitive data exposure.
 - For frontend work, test loading, empty, submitting, disabled, success, error, and re-entry states when those states are relevant.
 - For backend-backed frontend work, verify the frontend uses the real client/API path and the backend performs real handler/service/data work.
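
The added rule about exercising the real route without mocking the transport can be illustrated with a minimal sketch. It is not taken from the package: the `/invoices/<id>` route, the `INVOICES` data, and the helper names are hypothetical, and the standard-library server stands in for whatever framework a delivered project actually uses. The point is only that the request crosses a real socket and reaches the real handler, with one positive and one negative case.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical in-memory data standing in for a seeded record.
INVOICES = {"inv-1": {"id": "inv-1", "status": "open"}}

# Opener with proxies disabled so local requests always hit the test server.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class InvoiceHandler(BaseHTTPRequestHandler):
    """Hypothetical route: GET /invoices/<id> returns JSON, or 404 if unknown."""

    def do_GET(self):
        invoice = INVOICES.get(self.path.removeprefix("/invoices/"))
        status = 200 if invoice else 404
        body = json.dumps(invoice or {"error": "not found"}).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging during the test run.
        pass


def fetch(base_url, path):
    """Issue a real HTTP GET and return (status, parsed JSON body)."""
    try:
        with _OPENER.open(base_url + path, timeout=5) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, json.load(err)


def run_checks():
    """Start the server on a free port and hit it over a real socket."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), InvoiceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_address[1]}"
    try:
        found = fetch(base_url, "/invoices/inv-1")
        missing = fetch(base_url, "/invoices/nope")
    finally:
        server.shutdown()
        server.server_close()
    return found, missing
```

Nothing in the request path is patched or stubbed here, which is the property the rule asks for; a test that replaced `fetch` or the handler with a mock would not count as true endpoint coverage under this wording.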
|
@@ -148,11 +148,14 @@ Do not interact with Claude through raw `claude` commands, manual tmux typing, u
 - Claude messages must read like a lead engineer talking to another engineer.
 - Use private planning only to decide the next normal Claude instruction; do not mention private planning or its existence.
 - Include what to build or fix, why it matters, the broad affected area, expected behavior, and useful verification.
+- Prompt Claude phase-by-phase and slice-by-slice. Prefer one phase, scaffold, module, or fix batch per prompt; at most combine two adjacent tightly coupled slices when separating them would create needless churn.
+- Never give Claude the whole workflow, all phases, or a full end-to-end delivery packet at once.
 - For substantial Claude turns, you may include a normal human reminder that Claude can use its own built-in subagents for bounded investigation, implementation support, or verification inside the same Claude lane. Do not frame Claude subagents as separate workflow lanes, and do not create OpenCode subagents to help Claude implement.
 - Keep ordinary issue prompts at module/product level. Avoid file/line details unless the user explicitly asks you to pass exact references.
 - Do not paste, summarize, cite, name, or mention hidden plans.
 - Do not combine original-prompt orientation, design, implementation, verification, and bugfix work into one large prompt.
 - Do not send workflow mechanics, evaluator internals, Beads state, hidden-file paths, owner-state reasoning, or negative instructions about nonexistent artifacts to Claude.
+- After each Claude completion, verify the result against the original product prompt in `./metadata.json`, `./docs/design.md`, `./docs/api-spec.md` when applicable, and owner-private `../.ai/plan.md`. If there are issues, correct through the same active Claude lane before proceeding to the next slice.
 - If you make a direct owner-side code or docs change that affects the product repo, tell the active Claude lane exactly what changed and what remains to verify.

 ## Claude Utility Map
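
The slice-by-slice rule added above (one slice per prompt, at most two adjacent tightly coupled slices) can be sketched as a small batching function. This is an illustration only: the `Slice` type and its `coupled_with_next` flag are hypothetical, and the package itself expresses the rule as prose guidance rather than code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    name: str
    # Hypothetical flag: the owner judged this slice tightly coupled to the next one.
    coupled_with_next: bool = False


def batch_slices(slices):
    """Group slices into prompts of one slice, or two adjacent coupled slices."""
    batches, i = [], 0
    while i < len(slices):
        if slices[i].coupled_with_next and i + 1 < len(slices):
            # Pair with the next slice, then skip past both so a batch never exceeds two.
            batches.append([slices[i].name, slices[i + 1].name])
            i += 2
        else:
            batches.append([slices[i].name])
            i += 1
    return batches
```

Because the index advances past both members of a pair, a chain of three coupled slices still yields a pair followed by a separate prompt, which matches the "at most two" limit in the added text.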
|
@@ -193,6 +196,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Scaffold first, then proceed module by module.
 - Prompt in casual human language using only visible project context.
 - Use internal planning privately for review and module acceptance.
+- Do not send more than the current module/slice, or two adjacent tightly coupled slices, in a single Claude prompt.
 - Record Claude turns, issues, verification evidence, and module acceptance in metadata and Beads.
 - After all modules are complete, ask the same Claude lane to check the implementation against the design/API docs and provide startup commands plus expected flows.
|
|
@@ -211,6 +215,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Preserve reports, extract complete issue sets, and route fixes in broad human language.
 - After both audit cycles, close the bugfix lane and start a test-coverage/final-reconciliation lane.
 - Complete only when the coverage/README audit passes with at least 90% test score.
+- Treat README hard-gate failures, missing true endpoint coverage, missing frontend unit tests for web/fullstack, and missing FE-BE proof as reconciliation work for the active Claude lane before this phase closes.

 ### Phase 6: Final Readiness Decision
|
|
@@ -219,8 +224,9 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Run final runtime and test checks appropriate to the project.
 - Run `./repo/run_tests.sh` when present or required by the scaffold contract.
 - Run `docker compose up --build` for container-supported web/backend/fullstack projects unless explicitly out of scope.
+- Use `agent-browser` for browser-accessible apps to exercise the core prompt requirements, main user journeys, and every README-listed demo credential, role/state, seeded value, example ID/status, and documented default. Use API/platform-equivalent checks for non-browser projects.
 - If Docker, runtime, browser, or `run_tests.sh` fails, route the failure to the currently active Claude lane in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
-- If the owner makes a direct safe fix, send a minimal note to the active Claude lane describing the changed surface and ask it to inspect/acknowledge before continuing.
+- Route final reconciliation work to the active Claude lane whenever it is more than a tiny, safe owner-side edit. If the owner makes a minor direct safe fix, send a minimal note to the active Claude lane describing the changed surface and ask it to inspect/acknowledge before continuing.
 - Use platform-equivalent checks for Android, iOS, desktop, or other native projects.
 - Do not pass readiness with unresolved blocker/high findings, unverified runtime claims, README drift, or known fake behavior.
|
|
@@ -232,6 +238,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Include only package docs: `docs/questions.md`, `docs/design.md`, and `docs/api-spec.md` when applicable.
 - Do not package workflow-private `../.ai`, `../.beads`, hidden session state, owner plans, raw evaluator workspaces, or task-root rulebooks unless the packaging spec explicitly requires them.
 - Run final package boundary checks before closing.
+- If packaging, cleanup, README edits, config, or seed/runtime changes could affect documented behavior, rerun the affected Docker/runtime, `run_tests.sh`, and browser/API seeded-value checks before closing.

 ### Phase 8: Retrospective
|
|
@@ -248,7 +255,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - API/integration HTTP tests belong under `API_tests/` where that convention exists.
 - Fullstack/backend-backed frontend work must prove real frontend-to-backend behavior through user-visible flows unless accepted design explicitly marks a capability internal/API-only.
 - Security, authorization, ownership, isolation, validation, error handling, logging, config, seeded data, and README claims must align with delivered behavior.
-- README must truthfully document startup, tests, configuration, access, demo credentials or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, and known limitations.
+- README must truthfully document project type near the top, startup, tests, configuration, access, demo credentials and all roles or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, mock/local/debug boundaries, and known limitations.

 ## Evidence Discipline
|
|
@@ -127,10 +127,13 @@ All other subagent types are forbidden for owner use unless the user explicitly
 - Developer messages must read like a lead engineer talking to another engineer.
 - Use private planning only to decide the next normal implementation instruction; do not mention private planning or its existence.
 - Include what to build or fix, why it matters, the broad affected area, expected behavior, and useful verification.
+- Prompt developers phase-by-phase and slice-by-slice. Prefer one phase, scaffold, module, or fix batch per prompt; at most combine two adjacent tightly coupled slices when separating them would create needless churn.
+- Never give the developer the whole workflow, all phases, or a full end-to-end delivery packet at once.
 - Keep ordinary issue prompts at module/product level. Avoid file/line details unless the user explicitly asks you to pass exact references.
 - Do not paste, summarize, cite, name, or mention hidden plans.
 - Do not combine original-prompt orientation, design, implementation, verification, and bugfix work into one large prompt.
 - Do not send workflow mechanics, evaluator internals, Beads state, hidden-file paths, owner-state reasoning, or negative instructions about nonexistent artifacts to developers.
+- After each developer completion, verify the result against the original product prompt in `./metadata.json`, `./docs/design.md`, `./docs/api-spec.md` when applicable, and owner-private `../.ai/plan.md`. If there are issues, correct through the same active developer session before proceeding to the next slice.
 - If you make a direct owner-side code or docs change that affects the product repo, tell the active developer session exactly what changed and what remains to verify.

 ## Phase Model
|
@@ -160,6 +163,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Scaffold first, then proceed module by module.
 - Prompt in casual human language using only visible project context.
 - Use internal planning privately for review and module acceptance.
+- Do not send more than the current module/slice, or two adjacent tightly coupled slices, in a single developer prompt.
 - Record session turns, issues, verification evidence, and module acceptance in metadata and Beads.
 - After all modules are complete, ask the same session to check the implementation against the design/API docs and provide startup commands plus expected flows.
|
|
@@ -178,6 +182,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Preserve reports, extract complete issue sets, and route fixes in broad human language.
 - After both audit cycles, close the bugfix lane and start a test-coverage/final-reconciliation lane.
 - Complete only when the coverage/README audit passes with at least 90% test score.
+- Treat README hard-gate failures, missing true endpoint coverage, missing frontend unit tests for web/fullstack, and missing FE-BE proof as reconciliation work for the active lane before this phase closes.

 ### Phase 6: Final Readiness Decision
|
|
@@ -186,8 +191,9 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Run final runtime and test checks appropriate to the project.
 - Run `./repo/run_tests.sh` when present or required by the scaffold contract.
 - Run `docker compose up --build` for container-supported web/backend/fullstack projects unless explicitly out of scope.
+- Use `agent-browser` for browser-accessible apps to exercise the core prompt requirements, main user journeys, and every README-listed demo credential, role/state, seeded value, example ID/status, and documented default. Use API/platform-equivalent checks for non-browser projects.
 - If Docker, runtime, browser, or `run_tests.sh` fails, route the failure to the currently active developer session in broad human language, verify the fix, rerun the failed check, and repeat until green or explicitly risk-accepted by the user.
-- If the owner makes a direct safe fix, send a minimal note to the active developer session describing the changed surface and ask it to inspect/acknowledge before continuing.
+- Route final reconciliation work to the active developer session whenever it is more than a tiny, safe owner-side edit. If the owner makes a minor direct safe fix, send a minimal note to the active developer session describing the changed surface and ask it to inspect/acknowledge before continuing.
 - Use platform-equivalent checks for Android, iOS, desktop, or other native projects.
 - Do not pass readiness with unresolved blocker/high findings, unverified runtime claims, README drift, or known fake behavior.
|
|
@@ -199,6 +205,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - Include only package docs: `docs/questions.md`, `docs/design.md`, and `docs/api-spec.md` when applicable.
 - Do not package workflow-private `../.ai`, `../.beads`, hidden session state, owner plans, raw evaluator workspaces, or task-root rulebooks unless the packaging spec explicitly requires them.
 - Run final package boundary checks before closing.
+- If packaging, cleanup, README edits, config, or seed/runtime changes could affect documented behavior, rerun the affected Docker/runtime, `run_tests.sh`, and browser/API seeded-value checks before closing.

 ### Phase 8: Retrospective
|
|
@@ -215,7 +222,7 @@ Use these sequential names as the canonical workflow model. Legacy `P*` names ar
 - API/integration HTTP tests belong under `API_tests/` where that convention exists.
 - Fullstack/backend-backed frontend work must prove real frontend-to-backend behavior through user-visible flows unless accepted design explicitly marks a capability internal/API-only.
 - Security, authorization, ownership, isolation, validation, error handling, logging, config, seeded data, and README claims must align with delivered behavior.
-- README must truthfully document startup, tests, configuration, access, demo credentials or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, and known limitations.
+- README must truthfully document project type near the top, startup, tests, configuration, access, demo credentials and all roles or `No authentication required`, seeded data or `No seeded data required; the app is useful from an empty state.`, mock/local/debug boundaries, and known limitations.

 ## Evidence Discipline
|
|
@@ -42,6 +42,8 @@ All communication, code comments, docs, tests, and user-facing strings you add m
 - Tests must prove behavior and side effects, not only existence or rendering.
 - Add or update tests for every implementation change. Target full meaningful coverage of delivered behavior, not just a smoke path.
 - Cover implementation at the strongest relevant layers: unit tests for business logic, API/integration HTTP tests for every endpoint or interface, and E2E/platform tests for user-facing flows.
+- API/integration tests should exercise the real route/interface and business logic without mocking the transport, controller, or execution-path services unless there is a documented reason this is not possible.
+- Frontend unit/component tests should be directly detectable and should import or render the real frontend components/modules they cover.
 - Cover negative and boundary paths when relevant: unauthenticated, unauthorized, not found, conflicts, invalid input, empty states, duplicate actions, object ownership, and sensitive data exposure.
 - For frontend work, test loading, empty, submitting, disabled, success, error, and re-entry states when those states are relevant.
 - For backend-backed frontend work, verify the frontend uses the real client/API path and the backend performs real handler/service/data work.
|
@@ -20,6 +20,8 @@ Use this skill during `Phase 3: Development` before prompting the active develop

 Prompt like a human developer working with an AI coding assistant.

+Prompt one bounded slice at a time. The preferred unit is one phase-purpose, scaffold, module, work package, or fix batch. At most combine two adjacent tightly coupled slices in one prompt, and only when splitting them would make the work less coherent. Never send all phases, the full private plan, or a start-to-finish workflow packet to the developer/Claude lane.
+
 Use direct wording such as:
 - `I checked the user module and found a missing authorization test. Please add that and rerun the relevant tests.`
 - `Continue with the invoice module. Build the create/list/detail flow against the existing product contract and cover the main success and validation paths.`
|
@@ -29,7 +31,7 @@ Do not send robotic process language. Do not require a specific response format.

 Do not keep restating visible doc paths in routine follow-up prompts when the same session already knows the project contract. It is fine to say `existing product contract`, `accepted docs`, or simply name the module. Mention exact doc paths only when orienting a new session, resolving confusion, or asking for a final contract check.

-For larger module slices, group expectations by user/business behavior instead of turning every endpoint, field, and negative case into a long checklist. Ask for real backend-backed behavior, visible UI states, and meaningful success/failure tests, but keep the wording natural.
+For larger module slices, group expectations by user/business behavior instead of turning every endpoint, field, and negative case into a long checklist. Ask for real backend-backed behavior, visible UI states, and meaningful success/failure tests, but keep the wording natural. If a module is too large to explain without becoming a checklist packet, split it into smaller sequential prompts.

 Example of a good larger module prompt:
|
|
@@ -50,7 +52,7 @@ Do not say `the review found`, `the evaluation found`, or `the audit found`. The
 ## Development Sequence

 1. **Scaffold first.**
-- Establish the framework/runtime/test/README baseline.
+- Establish the framework/runtime/test/README baseline, including the strict README gates that final review will expect.
 - Keep it free of project-specific business logic except the minimum proof surface needed to verify the stack is wired.
 - Use `scaffold-guidance` and scaffold playbooks privately to shape the prompt.
|
|
@@ -61,11 +63,13 @@ Do not say `the review found`, `the evaluation found`, or `the audit found`. The

 3. **Proceed module by module.**
 - Select the next section/module from `./docs/design.md` and the private plan.
-- Prompt the developer using the docs only.
+- Prompt the developer using the docs only, one module/work package at a time by default.
 - Ask for the implementation and the relevant tests/checks for that module.
+- Combine two adjacent modules/work packages only when they share the same user flow or data contract and are easier to verify together.

 4. **Owner checks after each module.**
 - Inspect changed files manually.
+- Compare behavior against the original product prompt in `./metadata.json`.
 - Compare behavior against `./docs/design.md` and `./docs/api-spec.md`.
 - Privately compare against `../.ai/plan.md` for tests, coverage, discoverability, functionality, and module completeness.
 - Run targeted checks when practical.
|
@@ -100,11 +104,12 @@ For each scaffold/module, check:
 - no-orphan ledger items assigned to the module are closed
 - project-specific behavior is real, not placeholder/shell/demo-only behavior
 - tests exist for the implemented behavior or a concrete exception is recorded
-- planned API/interface proof is present when the module owns endpoints/interfaces
+- planned API/interface proof is present when the module owns endpoints/interfaces, with true no-mock HTTP/API endpoint tests where applicable
+- frontend unit tests are directly detectable and import/render real frontend components/modules when the module owns frontend behavior
 - planned FE-BE proof is present when the module crosses frontend/backend boundaries
 - failure, validation, authorization, ownership, empty, loading, error, and duplicate/re-entry cases are covered where relevant
 - frontend/backend wiring is real where applicable
-- README changes match delivered runtime, commands, auth/no-auth, seed/demo data, and
+- README changes match delivered runtime, commands, auth/no-auth, seed/demo data, verification behavior, mock/local/debug boundaries, and strict startup/access gates
 - targeted checks ran or were clearly blocked

 ## Internal Plan Alignment
|
@@ -154,9 +154,9 @@ After the new reconciliation lane is established:
 2. Send `test-coverage-prompt.md` verbatim.
 3. Require `./.tmp/test_coverage_and_readme_audit_report.md`.
 4. Read the generated report.
-5. Require an overall Pass. Pass with caveats is acceptable.
+5. Require an overall Pass. Pass with caveats is acceptable only when caveats are explicit, bounded, non-blocking, and do not contradict README hard gates or required coverage surfaces.
 6. Require at least 90% test score.
-7. If verdict is not Pass or test score is below 90%, extract all missing items and send them to the reconciliation lane in broad human language.
+7. If verdict is not Pass or Pass with acceptable caveats, if test score is below 90%, or if the report identifies README hard-gate failures, mocked endpoint coverage presented as true API coverage, missing frontend unit tests for web/fullstack, or missing FE-BE proof, extract all missing items and send them to the reconciliation lane in broad human language.
 8. After fixes, start a new evaluator session and send the same full verbatim test coverage/README prompt again.
 9. Repeat until verdict is Pass or Pass with caveats and test score is at least 90%.
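
The tightened steps 5 to 7 combine three conditions: an acceptable verdict, a test score of at least 90%, and no hard-gate findings. A minimal sketch of that decision follows. The report is a markdown file in the package, so the dictionary shape and every field name here (`verdict`, `caveats_acceptable`, `test_score`, and the flag names) are hypothetical stand-ins for whatever the owner extracts when reading it.

```python
# Hypothetical flag names for the four hard-gate findings listed in step 7.
HARD_GATE_FLAGS = (
    "readme_hard_gate_failure",
    "mocked_endpoint_coverage",
    "missing_frontend_unit_tests",
    "missing_fe_be_proof",
)


def needs_reconciliation(report):
    """Return True when the findings must go back to the reconciliation lane."""
    verdict = report["verdict"]
    # "Pass with caveats" only counts when the caveats were judged acceptable.
    verdict_ok = verdict == "Pass" or (
        verdict == "Pass with caveats" and report.get("caveats_acceptable", False)
    )
    score_ok = report["test_score"] >= 90
    hard_gates_ok = not any(report.get(flag, False) for flag in HARD_GATE_FLAGS)
    return not (verdict_ok and score_ok and hard_gates_ok)
```

Under this reading, a report with a passing verdict and a high score still routes back to the lane when any hard-gate flag is set, which is the main behavioral change in the new step 7.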
|
|
@@ -52,6 +52,7 @@ Check:
 - README gate matrix
 - risk/negative coverage matrix
 - runtime/test/config consistency
+- strict README gates: project type near the top, Docker/startup/access/verification commands, auth/no-auth, every documented demo credential/role and seeded value, mock/local/debug disclosure, and no hidden manual setup
 - security, authorization, ownership, validation, and data integrity
 - placeholder, shell, fake-success, disconnected UI, or static-demo behavior
|
|
@@ -117,6 +118,7 @@ Rules:
 - Use the startup commands and expected flows supplied at the end of development.
 - Verify the app starts locally when feasible.
 - Verify key expected flows manually/API-wise/platform-wise as appropriate.
+- For browser-accessible apps, manually exercise representative core prompt requirements and every README-listed seeded/demo account, role, and seeded value where feasible. Record any unverified surface and route failures to the bugfix lane.
 - Run relevant unit/API/integration/E2E/platform checks locally when available.
 - If a command or local runtime check cannot run, record the exact blocker and risk.
 - Any issue found goes back to the bugfix lane in human language.
|
@@ -36,14 +36,16 @@ Use these D1-D9 buckets for major issue classification:

 - Run the broad product test wrapper `./repo/run_tests.sh` when it exists and is applicable.
 - Run final runtime verification before packaging: `docker compose up --build` for web/backend/fullstack/container-supported projects, native/platform-equivalent startup for mobile/desktop projects, or a recorded not-applicable reason.
-- Use `agent-browser` for manual functionality verification where browser-accessible UI exists.
-- Exercise every relevant seeded/demo account
+- Use `agent-browser` for manual functionality verification where browser-accessible UI exists. The browser pass must walk the core prompt requirements and main user journeys, not just confirm that the app loads.
+- Exercise every relevant seeded/demo account, role/state, and README-listed seeded value. Confirm that documented credentials, seeded records, examples, IDs, statuses, roles, permissions, and expected default states are present, usable, and consistent with README claims.
+- For backend/API-only projects, replace browser checks with equivalent API/manual checks for every README-listed credential, seeded value, role/state, and core requirement.
 - If any final runtime, test, browser, account, or platform check cannot run, readiness cannot be `Pass` unless the user explicitly risk-accepts the unverified surface.

 ## Failure Routing Loop

 - Phase 6 is the primary green gate for broad Docker/runtime and `./repo/run_tests.sh` verification.
--
+- Final reconciliation work belongs in the currently active developer/Claude implementation lane whenever it is more than a tiny, safe owner-side edit. Route product behavior, tests, README/runtime drift, Docker/runtime failures, browser/account issues, and coverage gaps to that lane in broad human language.
+- If `docker compose up --build`, native/platform startup, browser/API manual checks, account/seeded-value checks, or `./repo/run_tests.sh` fails, do not move to packaging.
 - Route the failure to the currently active developer/Claude implementation lane in broad human language: describe the failing behavior, command, and user-visible/runtime impact without exposing evaluator or owner-private mechanics.
 - After the lane reports a fix, the owner verifies the changed surface and reruns the failed check.
 - Repeat fix, verify, and rerun until the check is green, not applicable for a documented reason, or explicitly risk-accepted by the user.
|
@@ -51,7 +53,8 @@ Use these D1-D9 buckets for major issue classification:

 ## Owner Direct Fixes

-- The owner may directly fix
+- The owner may directly fix only minor, safe docs, wrapper, config, cleanup, or light glue issues when the change does not require product-design judgment, new tests, behavioral changes, or non-trivial debugging.
+- If the reconciliation issue is large enough to need real implementation work, meaningful test updates, runtime debugging, README/runtime restructuring, or product judgment, do not fix it owner-side. Send it to the currently active developer/Claude lane.
 - After any direct owner fix, send a minimal note to the currently active developer/Claude lane describing the changed surface and ask it to inspect/acknowledge the change before readiness continues.
 - The note should be concise and developer-facing, not a workflow report.
 - Still rerun the affected command or check after acknowledgement.
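
The new owner-side boundary reads as a two-part test: the issue must be one of the listed minor kinds, and it must not trip any of the listed escalation conditions. A rough sketch of that triage, assuming a hypothetical issue dictionary with a `kind` field and boolean flags (none of these names come from the package):

```python
# Issue kinds the added text allows the owner to fix directly.
OWNER_SAFE_KINDS = {"docs", "wrapper", "config", "cleanup", "light glue"}

# Hypothetical flags for the conditions that force routing to the active lane.
ESCALATION_FLAGS = (
    "needs_product_judgment",
    "needs_new_tests",
    "changes_behavior",
    "needs_nontrivial_debugging",
)


def route_fix(issue):
    """Return "owner" for minor safe edits, otherwise "active lane"."""
    if issue.get("kind") not in OWNER_SAFE_KINDS:
        return "active lane"
    if any(issue.get(flag, False) for flag in ESCALATION_FLAGS):
        return "active lane"
    return "owner"
```

Even when the result is `"owner"`, the surrounding rules still require a short note to the active lane and a rerun of the affected check afterwards; the sketch covers only the routing decision.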
|
@@ -25,7 +25,8 @@ Accept `./docs/design.md` only if it:
 - defines modules as product/system responsibilities, not file-by-file work packets
 - handles auth, authorization, ownership/isolation, validation, logging/redaction, admin/debug boundaries, and sensitive data where relevant
 - defines frontend states and FE-BE expectations where relevant
-- visibly defines the testing contract in the design itself: 90%+ unit coverage target for meaningful business logic, API
+- visibly defines the testing contract in the design itself: 90%+ unit coverage target for meaningful business logic, true HTTP/API tests for every runtime endpoint with positive and negative cases, identifiable frontend unit tests that import/render real components/modules where a frontend exists, fullstack FE-BE proof, and full E2E/platform coverage for main user journeys in user-facing apps
+- defines strict README/runtime obligations: project type near the top, primary `docker compose up --build` for container-supported deliveries, legacy compatibility string `docker-compose up` without making it primary, access and verification method, all auth/demo credentials and roles or exact `No authentication required`, seeded data values or empty-state statement, no manual runtime installs/manual DB setup/hidden `.env` dependency, mock/local/debug disclosures, and known limitations
 - gives explicit not-applicable reasons and replacement proof layers for any missing unit/API/E2E coverage surface
 - avoids vague placeholders such as `TBD`, `later`, `standard CRUD`, `normal auth`, or `basic tests` for correctness-critical behavior
|
|
@@ -55,11 +56,13 @@ Accept `../.ai/plan.md` only if it is strong enough for the owner to drive devel
|
|
|
55
56
|
- security execution obligations
|
|
56
57
|
- API/interface implementation and proof obligations
|
|
57
58
|
- API coverage matrix when APIs exist, including true HTTP/API proof and exception rationale
|
|
59
|
+
- frontend unit-test detectability when frontend exists: direct test files, framework evidence, and imports/renders of real frontend components/modules
|
|
58
60
|
- frontend state and integration obligations where applicable
|
|
59
61
|
- FE-BE integration matrix and backend-to-frontend exposure check where applicable
|
|
60
62
|
- README/runtime/test obligations
|
|
61
63
|
- README gate matrix covering startup, access, verification, auth/no-auth, seeded/empty-state, config/no-secret handling, mock/local-data disclosure, and known limitations
|
|
62
|
-
-
|
|
64
|
+
- README gate matrix covering strict audit requirements: project type near top, `docker compose up --build`, legacy `docker-compose up` string, startup, access, verification, auth/no-auth, seeded values/empty-state, config/no-secret/no-hidden-env/no-manual-install handling, mock/local/debug disclosure, and known limitations
|
|
65
|
+
- test coverage map with unit/API/integration/E2E or platform-equivalent expectations, including true no-mock HTTP/API endpoint proof where applicable
|
|
63
66
|
- risk/negative coverage matrix for validation, authorization, ownership/isolation, empty/not-found, duplicate/conflict, re-entry, and sensitive-data leakage where relevant
|
|
64
67
|
- final integrated verification and readiness preparation
|
|
65
68
|
|
|
@@ -70,8 +70,9 @@ Phase 2 establishes the primary developer session and produces the accepted plan
|
|
|
70
70
|
- Provide original prompt, stack/context, accepted questions, requirements breakdown, design, and API spec.
|
|
71
71
|
- The general subagent must use the packaged `phase-2-execution-planning-prompt.md` as its instruction prompt.
|
|
72
72
|
- The general subagent must use the packaged `phase-2-plan-template.md` as the required structure for `../.ai/plan.md`.
|
|
73
|
-
|
|
74
|
-
|
|
73
|
+
- Require output to `../.ai/plan.md` and `../.ai/test-coverage.md` when useful.
|
|
74
|
+
- Record private plan and coverage artifact paths in metadata and Beads after the subagent returns.
|
|
75
|
+
- Ensure the private plan can be executed as small sequential developer prompts. Reject plans that require dumping multiple phases or the whole delivery contract into a single developer/Claude prompt.
|
|
75
76
|
|
|
76
77
|
8. Owner accepts or rejects the planning package.
|
|
77
78
|
- Use `planning-gate`.
|
|
@@ -45,7 +45,7 @@ Adjust the exact wording to the project. Do not over-format the message.
|
|
|
45
45
|
- product repo root `./repo/run_tests.sh` when required by the project contract
|
|
46
46
|
- runtime/Docker files when relevant, wired honestly for later verification
|
|
47
47
|
- database/bootstrap/seed path when the product will require seeded data or persistent storage
|
|
48
|
-
- README baseline with project type, stack, startup/access, verification, auth/no-auth, seeded/empty-state note, and repo layout
|
|
48
|
+
- README baseline with project type near the top, stack, primary startup/access command, legacy `docker-compose up` compatibility string where applicable, verification method, auth/no-auth, seeded/empty-state note, mock/local/debug disclosures, known limitations, and repo layout
|
|
49
49
|
- no committed secrets, `.env`, `.env.example`, hidden host setup, no-op tests, or fake-success integration paths
|
|
50
50
|
|
|
51
51
|
## Scaffold Should Not Deliver
|
|
@@ -39,7 +39,8 @@ Packaging must reject or remove stale workflow notes and scratch execution artif
|
|
|
39
39
|
|
|
40
40
|
## Packaging Checks
|
|
41
41
|
|
|
42
|
-
- Confirm README, scripts, config, routes, docs, tests, and runtime instructions agree.
|
|
42
|
+
- Confirm README, scripts, config, routes, docs, tests, browser/API manual evidence, and runtime instructions agree.
|
|
43
|
+
- Confirm every README-listed demo credential, role, seeded value, documented example, and expected default state was verified in the final runtime/browser/API pass or explicitly risk-accepted by the user.
|
|
43
44
|
- Confirm kept evaluation reports remain immutable evidence under `.tmp`, and that failed/stale/superseded reports are archived unchanged outside final `.tmp`.
|
|
44
45
|
- Confirm Claude/session handoff artifacts are outside the product package path.
|
|
45
46
|
- Confirm task-root rulebooks/settings are stripped from the final submission package when the packaging flow requires a product-only handoff.
|
|
@@ -55,7 +56,9 @@ Packaging must reject or remove stale workflow notes and scratch execution artif
|
|
|
55
56
|
## Final Runtime And Test Confirmation
|
|
56
57
|
|
|
57
58
|
- Phase 7 owns the final Docker/runtime confirmation and dockerized broad `./repo/run_tests.sh` confirmation when those commands are part of the delivered contract or when late fixes/packaging changes could affect runtime/test behavior.
|
|
59
|
+
- Phase 7 also owns final browser/API manual confirmation when late fixes, README edits, cleanup, package boundary changes, or seed/config changes could affect user-visible behavior or documented seeded values.
|
|
58
60
|
- If `./repo/README.md` documents `docker compose up --build` or `./repo/run_tests.sh`, treat those as package contract commands, not aspirational notes.
|
|
61
|
+
- If `./repo/README.md` documents demo accounts, roles, seeded data, example IDs, default statuses, or verification flows, treat those as package contract values that must be exercised through `agent-browser` or API/platform-equivalent checks before closure.
|
|
59
62
|
- Fix owner-side Docker/config/wrapper/README/docs/light-script glue directly when safe; route real product-code or test-file defects back through the appropriate developer fix lane before packaging closes.
|
|
60
63
|
- Never imply unrun Docker, runtime, browser, native/platform, or broad test commands passed.
|
|
61
64
|
- End Docker verification with project-specific cleanup unless the user explicitly wants containers left running.
|
|
@@ -99,6 +102,7 @@ Phase 7 can close only when:
|
|
|
99
102
|
- final package structure satisfies the allowlist;
|
|
100
103
|
- stale visible execution artifacts are absent;
|
|
101
104
|
- README, docs, scripts, config, routes, tests, audit artifacts, and repo behavior no longer contradict one another;
|
|
105
|
+
- README-listed credentials, roles, seeded values, examples, default states, and verification flows have been exercised or explicitly risk-accepted;
|
|
102
106
|
- runtime/test/package commands that were required have run or are explicitly risk-accepted by the user;
|
|
103
107
|
- `.tmp` contains the final kept report set and no stale superseded reports;
|
|
104
108
|
- session exports are complete and outside the package root;
|
|
@@ -215,10 +215,10 @@ Expected result:
|
|
|
215
215
|
If `init_db.sh` is part of the standard test bootstrap, document that relationship clearly.
|
|
216
216
|
|
|
217
217
|
### Local verification harness
|
|
218
|
-
- Document the separate local verification command(s) used for ordinary development and readiness checks.
|
|
218
|
+
- Document the separate local verification command(s) used for ordinary development and readiness checks only if they do not become required reviewer setup.
|
|
219
219
|
- Make clear that these local verification commands are distinct from the dockerized `./repo/run_tests.sh` broad test path.
|
|
220
220
|
- Use the real stack-native local suite for the chosen language/framework where applicable, for example Vitest, Jest, PHPUnit, pytest, go test, cargo test, or another framework-native equivalent.
|
|
221
|
-
-
|
|
221
|
+
- Do not require reviewers to run manual installs or machine-level setup for the standard packaged verification path.
|
|
222
222
|
|
|
223
223
|
### Test entry points
|
|
224
224
|
- Unit tests: `[command/path]`
|
|
@@ -18,7 +18,7 @@ Reject only for material defects that would mislead development, evaluation, or
|
|
|
18
18
|
- [ ] `../.ai/plan.md` captures owner-private workstreams, module slices, tests, runtime rules, security obligations, and packaging checks.
|
|
19
19
|
- [ ] `../.ai/plan.md` contains a no-orphan ledger mapping every accepted requirement, clarification, design trace row, API route, actor path, data object, security boundary, report/export/notification, and documentation obligation to a module/workstream and proof path.
|
|
20
20
|
- [ ] `../.ai/plan.md` defines scaffold first, ordered module packets, owned files/tests, shared-file boundaries, FE<->BE/API proof, verification commands, completion checklist, and development-exit proof.
|
|
21
|
-
- [ ] `../.ai/test-coverage.md` exists when meaningful coverage mapping is applicable and maps requirements/risks/API endpoints/frontend flows to planned tests, assertions, current status, and gaps.
|
|
21
|
+
- [ ] `../.ai/test-coverage.md` exists when meaningful coverage mapping is applicable and maps requirements/risks/API endpoints/frontend flows to planned tests, assertions, current status, and gaps, including true no-mock HTTP/API classification and frontend unit-test detectability where applicable.
|
|
22
22
|
- [ ] Private plan slices can be translated into normal developer prompts.
|
|
23
23
|
- [ ] Developer prompts do not ask workers to read private workflow files.
|
|
24
24
|
|
|
@@ -32,6 +32,7 @@ Reject only for material defects that would mislead development, evaluation, or
|
|
|
32
32
|
- [ ] A separate stack-native local harness exists for development/Phase 4, or the missing harness is explicitly user risk-accepted.
|
|
33
33
|
- [ ] Tests prove behavior and side effects, not only route existence, component existence, mocked client returns, or status codes detached from state/artifact effects.
|
|
34
34
|
- [ ] Fullstack/backend-backed frontend flows have real FE<->BE proof, not only separate backend and frontend tests.
|
|
35
|
+
- [ ] Web/fullstack frontend unit tests are directly detectable and import/render real frontend components/modules.
|
|
35
36
|
|
|
36
37
|
## Development Completion
|
|
37
38
|
|
|
@@ -43,7 +44,7 @@ Reject only for material defects that would mislead development, evaluation, or
|
|
|
43
44
|
## Phase 4 And Phase 5
|
|
44
45
|
|
|
45
46
|
- [ ] Phase 4 runs all available relevant tests except broad commands that require explicit user approval, asks the bugfix lane for verification guidance where useful, manually exercises relevant runtime/account surfaces, runs internal owner self-test cycles for issue discovery, and routes issues back to the bugfix lane.
|
|
46
|
-
- [ ] Every provided seeded/demo account and every relevant role/state has been exercised or the unverified surface is explicitly user risk-accepted.
|
|
47
|
+
- [ ] Every provided seeded/demo account, README-listed seeded value, example ID/status, and every relevant role/state has been exercised or the unverified surface is explicitly user risk-accepted.
|
|
47
48
|
- [ ] Phase 5 uses fresh evaluator sessions for full self-test audits, evaluator subagent only, full prompt packets verbatim, no rerun footer, immutable reports, one bugfix/fix-check lane for issues from both final self-test audits, and same-evaluator scoped fix-check only for kept Partial Pass reports.
|
|
48
49
|
- [ ] Every evaluator finding and recommendation is fixed and verified or explicitly risk-accepted by the user before Phase 5 closes.
|
|
49
50
|
- [ ] Coverage/README/final reconciliation uses a dedicated developer session after Phase 5 findings.
|
|
@@ -52,7 +53,7 @@ Reject only for material defects that would mislead development, evaluation, or
|
|
|
52
53
|
|
|
53
54
|
- [ ] Final runtime verification has run before packaging, or the unrun surface is explicitly risk-accepted.
|
|
54
55
|
- [ ] `./repo/run_tests.sh` has run where applicable before packaging, or the unrun surface is explicitly risk-accepted.
|
|
55
|
-
- [ ] `agent-browser` manual functionality verification has run for browser-accessible UI before packaging, or the unrun surface is explicitly risk-accepted.
|
|
56
|
+
- [ ] `agent-browser` manual functionality verification has run through core prompt requirements, main user journeys, README-listed seeded values, demo credentials, and role/state behavior for browser-accessible UI before packaging, or the unrun surface is explicitly risk-accepted.
|
|
56
57
|
- [ ] D1-D9 readiness categories are pass, not applicable, or explicitly risk-accepted.
|
|
57
58
|
- [ ] Final docs, reports, repo state, and package-root expectations agree.
|
|
58
59
|
|
|
@@ -60,7 +61,7 @@ Reject only for material defects that would mislead development, evaluation, or
|
|
|
60
61
|
|
|
61
62
|
- [ ] Final package root and docs allowlist are correct.
|
|
62
63
|
- [ ] Workflow-private artifacts stay outside the product package.
|
|
63
|
-
- [ ] README, scripts, config, tests,
|
|
64
|
+
- [ ] README, scripts, config, tests, runtime instructions, browser/API manual evidence, and seeded/demo values agree.
|
|
64
65
|
- [ ] Stale workflow notes and scratch execution artifacts are absent from final package.
|
|
65
66
|
- [ ] `.tmp` contains final kept audit/fix-check/coverage reports only, with stale failed/superseded reports archived outside the package.
|
|
66
67
|
- [ ] `repo/docker-compose.yml`, `repo/run_tests.sh`, and `repo/init_db.sh` where applicable match README claims.
|
|
@@ -23,7 +23,8 @@ The design must:
|
|
|
23
23
|
- preserve the original business goal and required user outcomes
|
|
24
24
|
- incorporate accepted clarifications and requirements without narrowing them
|
|
25
25
|
- identify the project type, stack, actors, roles, main flows, modules, data, UI/API surfaces, security boundaries, assumptions, and verification strategy
|
|
26
|
-
- define the testing contract as part of the visible design: every API/interface endpoint must have positive and negative tests, unit coverage must target 90%+ for meaningful business logic, and user-facing applications must include full E2E/platform coverage for the main user journeys unless a surface is genuinely not applicable
|
|
26
|
+
- define the testing contract as part of the visible design: every API/interface endpoint must have positive and negative true HTTP/API tests where a runtime endpoint exists, unit coverage must target 90%+ for meaningful business logic, frontend unit tests must be identifiable and must import/render real frontend components where a frontend exists, fullstack/web apps must prove frontend-to-backend behavior, and user-facing applications must include full E2E/platform coverage for the main user journeys unless a surface is genuinely not applicable
|
|
27
|
+
- define README/runtime obligations that satisfy strict review: project type near the top, `docker compose up --build` as the primary startup command for container-supported deliveries, the legacy compatibility string `docker-compose up` without making it primary, access URL/port or platform launch method, verification method, auth/demo credentials for every role or the exact statement `No authentication required`, seeded data or empty-state statement, no manual runtime installs, no hidden `.env` dependency, mock/local/debug disclosures, and known limitations
|
|
27
28
|
- make meaningful assumptions explicit
|
|
28
29
|
- mark unresolved items only when a real decision is still needed
|
|
29
30
|
- identify API/interface surfaces that should be captured in `./docs/api-spec.md`
|
|
@@ -88,6 +88,8 @@ Cover where relevant: authentication, route authorization, object authorization,
|
|
|
88
88
|
- Configuration model:
|
|
89
89
|
- Persistent storage:
|
|
90
90
|
- Seed/demo data need:
|
|
91
|
+
- README startup/access expectation:
|
|
92
|
+
- README auth/seed expectation:
|
|
91
93
|
- Background jobs or scheduled work:
|
|
92
94
|
- External integrations:
|
|
93
95
|
|
|
@@ -96,8 +98,10 @@ Cover where relevant: authentication, route authorization, object authorization,
|
|
|
96
98
|
This is a design-level strategy, not an execution checklist.
|
|
97
99
|
|
|
98
100
|
Required testing contract:
|
|
99
|
-
- All API/interface endpoints must have test coverage for successful behavior and important negative/error cases. If there is no API/interface surface, state `Not Applicable` with the reason.
|
|
101
|
+
- All API/interface endpoints must have true HTTP/API test coverage for successful behavior and important negative/error cases where a runtime endpoint exists. If a non-HTTP interface or accepted exception requires another proof layer, state the exception and replacement proof. If there is no API/interface surface, state `Not Applicable` with the reason.
|
|
100
102
|
- Meaningful business logic must target 90%+ unit coverage. If a component cannot be unit-tested meaningfully, state the exception and the replacement proof layer.
|
|
103
|
+
- Frontend unit tests must be identifiable by file pattern/framework evidence and must import or render real frontend components/modules when a frontend exists.
|
|
104
|
+
- Fullstack or backend-backed frontend work must include proof that real frontend actions reach the intended backend/service behavior.
|
|
101
105
|
- User-facing applications must have full E2E/platform coverage for the main user journeys, including success, validation/failure, and recovery states. If E2E/platform testing is not applicable, state why and what proof replaces it.
|
|
102
106
|
|
|
103
107
|
| Surface / risk | Expected proof layer | Notes |
|
|
@@ -107,10 +111,24 @@ Required testing contract:
|
|
|
107
111
|
| security boundaries | | |
|
|
108
112
|
| API/interface behavior | endpoint tests for every endpoint, including positive and negative cases | |
|
|
109
113
|
| UI states / interactions | | |
|
|
110
|
-
| integration paths | | |
|
|
114
|
+
| frontend-to-backend integration paths | | |
|
|
111
115
|
| unit coverage | 90%+ meaningful business-logic coverage | |
|
|
116
|
+
| frontend unit/component tests | identifiable tests importing/rendering real components/modules | |
|
|
112
117
|
| E2E/platform journeys | full main-journey coverage for user-facing apps | |
|
|
113
118
|
|
|
119
|
+
## 11.1 README / Runtime Gate Strategy
|
|
120
|
+
|
|
121
|
+
| README/runtime gate | Required design outcome |
|
|
122
|
+
|---|---|
|
|
123
|
+
| project type | `backend`, `fullstack`, `web`, `android`, `ios`, or `desktop` near the top of README |
|
|
124
|
+
| startup | primary `docker compose up --build` for container-supported deliveries; include legacy compatibility string `docker-compose up` without making it primary |
|
|
125
|
+
| access | URL + port, emulator/device steps, or desktop launch steps |
|
|
126
|
+
| verification | concrete API/UI/mobile/desktop verification method |
|
|
127
|
+
| environment | Docker-contained or platform-contained setup; no manual runtime installs, manual DB setup, hidden `.env`, or secret-bearing examples |
|
|
128
|
+
| auth | all demo credentials and roles, or exact `No authentication required` statement |
|
|
129
|
+
| seed/demo data | seeded values and how to exercise them, or an empty-state statement |
|
|
130
|
+
| mock/local/debug | truthful disclosure of mock, stub, local-data, or debug boundaries |
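
A few of the gates in the table above are plain string checks, so they can be sketched as a small README linter. This is only an illustration: `check_readme`, the `top_lines` window used to approximate "near the top", and the rule that the legacy string must not appear before the primary command are all assumptions, not part of the package.

```python
import re

# Values taken from the gate table; the checking rules themselves are assumptions.
PROJECT_TYPES = ("backend", "fullstack", "web", "android", "ios", "desktop")
PRIMARY = "docker compose up --build"
LEGACY = "docker-compose up"
NO_AUTH = "No authentication required"


def check_readme(text, top_lines=10, has_auth=False):
    """Return the names of the gates that the README text fails."""
    failures = []

    # "Near the top" is approximated as the first `top_lines` lines.
    head = "\n".join(text.splitlines()[:top_lines]).lower()
    if not any(re.search(rf"\b{t}\b", head) for t in PROJECT_TYPES):
        failures.append("project type")

    primary_at = text.find(PRIMARY)
    legacy_at = text.find(LEGACY)
    if primary_at == -1:
        failures.append("startup: primary command")
    if legacy_at == -1:
        failures.append("startup: legacy string")
    elif primary_at != -1 and legacy_at < primary_at:
        # Assumed reading of "without making it primary".
        failures.append("startup: legacy string listed first")

    # Credentials cannot be verified by string match, so only the
    # no-auth statement is checked when the project has no auth.
    if not has_auth and NO_AUTH not in text:
        failures.append("auth")
    return failures
```

A check like this can only cover the literal-string gates; access, verification, seeded values, and disclosures still need a human or runtime pass.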
|
|
131
|
+
|
|
114
132
|
## 12. API Spec Handoff
|
|
115
133
|
|
|
116
134
|
- API spec required: [yes/no]
|
|
@@ -37,11 +37,11 @@ Create a practical implementation plan that can be translated into concise imple
|
|
|
37
37
|
- Security work needs negative proof where relevant.
|
|
38
38
|
- README/runtime/test obligations must be assigned.
|
|
39
39
|
- Maintain a no-orphan ledger so no accepted requirement, clarification, API/interface, data object, actor path, security boundary, report/export/notification, README obligation, or test obligation disappears between design and implementation.
|
|
40
|
-
- Include an API coverage matrix when APIs exist:
|
|
40
|
+
- Include an API coverage matrix when APIs exist: exact `METHOD + PATH` or interface, expected implementation owner, true no-mock HTTP/API proof where applicable, mocked or unit-only exceptions, negative cases, and exact proof expectation.
|
|
41
41
|
- Include a FE-BE integration matrix for fullstack/backend-backed frontend work: frontend action, backend endpoint/service/job, payload/state input, response/side effect, UI states, and proof path.
|
|
42
42
|
- Include a backend-to-frontend exposure check when backend capabilities exist: every prompt-relevant backend capability must have visible exposure or a specific accepted internal/API-only reason.
|
|
43
|
-
- Include README/runtime gates: project type, startup/access, verification, auth/no-auth, seeded
|
|
44
|
-
- Include coverage rigor: unit, API/
|
|
43
|
+
- Include README/runtime gates that match the strict README audit: project type near the top, primary `docker compose up --build` for container-supported deliveries, legacy compatibility string `docker-compose up` without making it primary, startup/access, verification, auth/no-auth, all demo credentials/roles when auth exists, seeded values or empty-state statement, configuration/no-secret handling, no manual runtime installs or manual DB setup, test commands, known limitations, and mock/local-data/debug disclosures.
|
|
44
|
+
- Include coverage rigor: 90%+ unit target for meaningful business logic, exact true no-mock HTTP/API endpoint tests, identifiable frontend unit tests that import/render real components/modules, fullstack FE-BE proof, E2E/platform proof, security/negative cases, and final local verification expectations.
|
|
45
45
|
- Include module acceptance checks that prevent shell/demo completion: observable behavior, persisted state/artifact or UI/API outcome, relevant negative paths, tests, README impact, and integration evidence.
|
|
46
46
|
|
|
47
47
|
## Output Requirements
|
|
@@ -50,6 +50,8 @@ Use the provided plan template.
|
|
|
50
50
|
|
|
51
51
|
`../.ai/plan.md` must be actionable enough to support one clean human implementation prompt at a time without exposing this file.
|
|
52
52
|
|
|
53
|
+
The plan must be sequenced so the owner can prompt the developer/Claude lane one bounded slice at a time. Prefer one work package per prompt. Mark any pair that should be sent together only when two adjacent slices are tightly coupled and easier to verify together. Do not create a plan that requires sending all phases, all modules, or the full delivery workflow to the implementation lane at once.
|
|
54
|
+
|
|
53
55
|
`../.ai/test-coverage.md`, when written, should summarize planned coverage by module, API/interface, UI flow, risk, and final verification need.
|
|
54
56
|
|
|
55
57
|
If the design or API spec has a material contradiction, record it as a planning exception instead of silently rewriting the contract.
|
|
@@ -68,6 +68,8 @@ If APIs exist, every accepted endpoint/interface must have a row.
|
|
|
68
68
|
|---|---|---|---|---|---|---|
|
|
69
69
|
| | | | yes/no | | | |
|
|
70
70
|
|
|
71
|
+
True HTTP/API proof means a test sends a request to the exact runtime route/interface and reaches the real handler/business logic without mocking transport, controllers, services, or providers used in the execution path. If this is not possible or not applicable, record the accepted exception and replacement proof.
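
As a rough illustration of that definition, the sketch below starts a real HTTP server on an ephemeral port and sends real requests to it, so the transport, handler, and business logic all execute unmocked. The `/items` route, `create_item`, and the in-memory `ITEMS` store are hypothetical stand-ins chosen for the example; only the standard library is used.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ITEMS = []  # hypothetical in-memory store standing in for real persistence


def create_item(payload):
    """Hypothetical business logic: validate and store an item."""
    name = payload.get("name")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("name is required")
    item = {"id": len(ITEMS) + 1, "name": name.strip()}
    ITEMS.append(item)
    return item


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/items":
            self._send(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            self._send(201, create_item(payload))
        except ValueError as exc:  # also covers malformed JSON
            self._send(400, {"error": str(exc)})

    def _send(self, status, body):
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep test output quiet


def post(port, path, payload):
    """Send a real HTTP POST and return (status, decoded JSON body)."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read())


def run_checks():
    """Exercise one positive and one negative case over real HTTP."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = ephemeral port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        ok = post(port, "/items", {"name": "widget"})
        bad = post(port, "/items", {"name": ""})
    finally:
        server.shutdown()
        server.server_close()
    return ok, bad
```

A test built this way can assert both the response and the resulting state (here, the contents of `ITEMS`), which is the behavior-plus-side-effect proof the plan asks for.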
|
|
72
|
+
|
|
71
73
|
## 7. Frontend / Interaction Execution Plan
|
|
72
74
|
|
|
73
75
|
If not applicable, state `Not Applicable` with the accepted reason.
|
|
@@ -76,6 +78,8 @@ If not applicable, state `Not Applicable` with the accepted reason.
|
|
|
76
78
|
|---|---|---|---|---|
|
|
77
79
|
| | | loading / empty / submitting / disabled / success / error | | |
|
|
78
80
|
|
|
81
|
+
Frontend unit tests must be directly detectable by the final audit: test files must use the project test framework and import or render actual frontend components/modules, not only backend utilities or package scripts.
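
One way to picture "directly detectable" is a scan that finds test files by name and keeps only those importing from a frontend component directory. The sketch below is an assumption-heavy illustration: the `.test`/`.spec` file pattern, the `components` and `pages` directory names, and the `detectable_frontend_tests` helper are hypothetical, and a real audit may use different evidence.

```python
import re

# Assumed conventions: JS/TS test files named *.test.* or *.spec.*,
# and ES-module style `from '...'` imports.
TEST_FILE = re.compile(r"\.(test|spec)\.[jt]sx?$")
IMPORT_FROM = re.compile(r"""from\s+['"]([^'"]+)['"]""")


def detectable_frontend_tests(files, component_dirs=("components", "pages")):
    """Return test file paths that import from a frontend component directory.

    `files` maps a file path to its source text.
    """
    found = []
    for path, source in sorted(files.items()):
        if not TEST_FILE.search(path):
            continue
        targets = IMPORT_FROM.findall(source)
        # A test counts only if some import path passes through a component dir.
        if any(part in component_dirs for t in targets for part in t.split("/")):
            found.append(path)
    return found
```

Under these assumptions, a test file that imports only backend utilities or helper modules is filtered out, which matches the exclusion stated above.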
|
|
82
|
+
|
|
79
83
|
## 7.1 FE-BE Integration Matrix
|
|
80
84
|
|
|
81
85
|
Required for fullstack or backend-backed frontend work. If not applicable, state `Not Applicable` with the accepted reason.
|
|
@@ -101,11 +105,12 @@ Required when backend capabilities exist. If not applicable, state `Not Applicab
|
|
|
101
105
|
## 9. README / Runtime / Configuration Plan
|
|
102
106
|
|
|
103
107
|
- README obligations:
|
|
104
|
-
- Startup/access documentation:
|
|
108
|
+
- Startup/access documentation, including primary `docker compose up --build` where container-supported and legacy compatibility string `docker-compose up` without making it primary:
|
|
105
109
|
- Test documentation:
|
|
106
|
-
- Auth/no-auth documentation:
|
|
107
|
-
- Seed/demo data documentation:
|
|
108
|
-
- Config and no-secret handling:
|
|
110
|
+
- Auth/no-auth documentation, including all demo credentials and roles when auth exists or exact `No authentication required` statement:
|
|
111
|
+
- Seed/demo data documentation, including every seeded value the reviewer should be able to exercise or an empty-state statement:
|
|
112
|
+
- Config and no-secret handling, including no hidden `.env` dependency and no secret-bearing examples:
|
|
113
|
+
- Environment constraints, including no manual runtime installs or manual DB setup for packaged verification:
|
|
109
114
|
- Known limitations documentation:
|
|
110
115
|
|
|
111
116
|
## 9.1 README Gate Matrix
|
|
@@ -113,13 +118,13 @@ Required when backend capabilities exist. If not applicable, state `Not Applicab
|
|
|
113
118
|
| README requirement | Expected content | Owning work package | Proof / review point |
|
|
114
119
|
|---|---|---|---|
|
|
115
120
|
| project type near top | | | |
|
|
116
|
-
| startup command |
|
|
121
|
+
| startup command | primary `docker compose up --build` where applicable, plus legacy compatibility string `docker-compose up` not presented as primary | | |
|
|
117
122
|
| access method | | | |
|
|
118
123
|
| verification method | | | |
|
|
119
124
|
| broad test command | | | |
|
|
120
125
|
| auth credentials or no-auth statement | | | |
|
|
121
126
|
| seeded data or empty-state statement | | | |
|
|
122
|
-
| configuration / no secrets / no env-file dependency | | | |
|
|
127
|
+
| configuration / no secrets / no env-file dependency / no manual installs | | | |
|
|
123
128
|
| mock/local-data/debug disclosure | | | |
|
|
124
129
|
| known limitations | | | |
|
|
125
130
|
|
|
@@ -147,6 +152,8 @@ Required when backend capabilities exist. If not applicable, state `Not Applicab
|
|
|
147
152
|
- Integration coverage expectation:
|
|
148
153
|
- E2E/platform coverage expectation:
|
|
149
154
|
- Frontend state/component coverage expectation:
|
|
155
|
+
- Final browser/manual core-flow expectation:
|
|
156
|
+
- README seeded-value/account verification expectation:
|
|
150
157
|
- Known accepted exceptions:
|
|
151
158
|
|
|
152
159
|
## 11. Integration And Hardening Plan
|
|
@@ -174,9 +181,11 @@ For each module, acceptance requires:
|
|
|
174
181
|
|
|
175
182
|
Translate these into human prompts; do not paste this section verbatim.
|
|
176
183
|
|
|
177
|
-
|
|
178
|
-
|
|
179
|
-
|
|
|
184
|
+
Each row should represent one bounded prompt by default. Mark a combined prompt only when two adjacent rows are tightly coupled and should be implemented/verified together. Never combine all phases, the full plan, or the whole delivery into one developer/Claude prompt.
|
|
185
|
+
|
|
186
|
+
| Sequence | Human prompt intent | Visible context to mention | Expected completion report | Combine with adjacent row? |
|
|
187
|
+
|---|---|---|---|---|
|
|
188
|
+
| 1 | | | | no |
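
The sequencing rule for this table can be sketched as a small grouping function: one row per prompt by default, at most two adjacent rows when a row is marked for combination. The `build_prompts` helper and the reading of "adjacent" as "the next row" are assumptions made for illustration.

```python
def build_prompts(rows):
    """Group plan rows into bounded developer prompts.

    `rows` is a list of (sequence, intent, combine_with_next) tuples.
    Returns a list of prompts, each a list of sequence numbers.
    A prompt never holds more than two rows, so the whole plan
    cannot collapse into a single prompt once it has three or more rows.
    """
    prompts = []
    i = 0
    while i < len(rows):
        seq, _intent, combine = rows[i]
        if combine and i + 1 < len(rows):
            # Tightly coupled pair: send this row together with the next one.
            prompts.append([seq, rows[i + 1][0]])
            i += 2
        else:
            prompts.append([seq])
            i += 1
    return prompts
```

Capping a prompt at two rows is a deliberately strict choice; it keeps the "never combine all phases" rule true by construction rather than by review.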
|
|
180
189
|
|
|
181
190
|
## 13. Plan Closure Checklist
|
|
182
191
|
|
|
@@ -42,6 +42,7 @@ This file contains product engineering rules for the current project.
|
|
|
42
42
|
- Use `unit_tests/` for unit tests and `API_tests/` for API/integration HTTP tests when those surfaces exist.
|
|
43
43
|
- Every implementation change should include tests for the behavior it owns. Target full meaningful coverage across unit, API/integration, and E2E/platform layers where those surfaces exist.
|
|
44
44
|
- API/interface endpoints should have real positive and negative tests for exact behavior. User-facing flows should have E2E/platform coverage for the main journeys and important failure/recovery states.
|
|
45
|
+
- API/interface tests should hit the real route/interface and real business logic without mocking transport/controllers/execution-path services unless there is a documented exception. Frontend unit tests should import or render real components/modules so coverage is directly reviewable.
|
|
45
46
|
- Prefer the fastest meaningful targeted checks during ordinary implementation.
|
|
46
47
|
- Never claim a command passed unless you actually ran it and saw the result.
|
|
47
48
|
- If required verification cannot run in the current environment, report it as unverified with the exact risk.
|
|
@@ -42,6 +42,7 @@ This file contains product engineering rules for the current project.
|
|
|
42
42
|
- Use `unit_tests/` for unit tests and `API_tests/` for API/integration HTTP tests when those surfaces exist.
|
|
43
43
|
- Every implementation change should include tests for the behavior it owns. Target full meaningful coverage across unit, API/integration, and E2E/platform layers where those surfaces exist.
|
|
44
44
|
- API/interface endpoints should have real positive and negative tests for exact behavior. User-facing flows should have E2E/platform coverage for the main journeys and important failure/recovery states.
|
|
45
|
+
- API/interface tests should hit the real route/interface and real business logic without mocking transport/controllers/execution-path services unless there is a documented exception. Frontend unit tests should import or render real components/modules so coverage is directly reviewable.
|
|
45
46
|
- Prefer the fastest meaningful targeted checks during ordinary implementation.
|
|
46
47
|
- Never claim a command passed unless you actually ran it and saw the result.
|
|
47
48
|
- If required verification cannot run in the current environment, report it as unverified with the exact risk.
|