theslopmachine 1.0.2 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. package/assets/agents/developer.md +38 -32
  2. package/assets/agents/slopmachine-claude.md +36 -25
  3. package/assets/agents/slopmachine.md +61 -45
  4. package/assets/claude/agents/developer.md +27 -10
  5. package/assets/skills/claude-worker-management/SKILL.md +4 -4
  6. package/assets/skills/developer-session-lifecycle/SKILL.md +13 -3
  7. package/assets/skills/development-guidance/SKILL.md +24 -5
  8. package/assets/skills/evaluation-triage/SKILL.md +4 -4
  9. package/assets/skills/final-evaluation-orchestration/SKILL.md +29 -3
  10. package/assets/skills/integrated-verification/SKILL.md +24 -23
  11. package/assets/skills/p8-readiness-reconciliation/SKILL.md +98 -0
  12. package/assets/skills/planning-gate/SKILL.md +2 -2
  13. package/assets/skills/planning-guidance/SKILL.md +7 -4
  14. package/assets/skills/scaffold-guidance/SKILL.md +2 -0
  15. package/assets/skills/submission-packaging/SKILL.md +30 -3
  16. package/assets/skills/verification-gates/SKILL.md +11 -7
  17. package/assets/slopmachine/clarification-faithfulness-review-prompt.md +69 -45
  18. package/assets/slopmachine/clarifier-agent-prompt.md +46 -40
  19. package/assets/slopmachine/exact-readme-template.md +38 -11
  20. package/assets/slopmachine/owner-verification-checklist.md +2 -2
  21. package/assets/slopmachine/phase-1-design-prompt.md +94 -17
  22. package/assets/slopmachine/phase-1-design-template.md +124 -21
  23. package/assets/slopmachine/phase-2-execution-planning-prompt.md +155 -87
  24. package/assets/slopmachine/phase-2-plan-template.md +169 -81
  25. package/assets/slopmachine/scaffold-playbooks/selection-matrix.md +8 -1
  26. package/assets/slopmachine/scaffold-playbooks/tech-frontend-vue.md +2 -0
  27. package/assets/slopmachine/scaffold-playbooks/type-web-spa.md +1 -0
  28. package/assets/slopmachine/templates/AGENTS.md +18 -17
  29. package/assets/slopmachine/templates/CLAUDE.md +18 -17
  30. package/assets/slopmachine/templates/plan.md +115 -36
  31. package/package.json +9 -2
  32. package/src/constants.js +1 -0
  33. package/src/init.js +8 -0
  34. package/src/install.js +130 -0
  35. package/assets/slopmachine/utils/__pycache__/claude_live_hook.cpython-311.pyc +0 -0
  36. package/assets/slopmachine/utils/__pycache__/cleanup_delivery_artifacts.cpython-311.pyc +0 -0
  37. package/assets/slopmachine/utils/__pycache__/convert_ai_session.cpython-311.pyc +0 -0
  38. package/assets/slopmachine/utils/__pycache__/normalize_claude_session.cpython-311.pyc +0 -0
  39. package/assets/slopmachine/utils/__pycache__/strip_session_parent.cpython-311.pyc +0 -0
@@ -1,15 +1,15 @@
- # Phase 2 Execution Planning Prompt
+ # Execution Planning Prompt
 
- You are the execution planning lead for a software delivery that already has an accepted Phase 1 design document.
+ You are the execution planning lead for a software delivery that already has an accepted design document.
 
  Your task is to convert the accepted design into the authoritative execution plan.
 
- This phase produces:
+ This planning pass produces:
  `plan.md`
 
- When applicable, this phase must also write or update parent-root `../docs/test-coverage.md` so the external coverage reference already reflects the accepted planning-time coverage contract rather than remaining a blank seed or stale placeholder.
+ When applicable, this planning pass must also write or update parent-root `../docs/test-coverage.md` so the external coverage reference already reflects the accepted planning-time coverage contract rather than remaining a blank seed or stale placeholder.
 
- This phase is the critical execution contract.
+ This plan is the critical execution contract.
  It must operationalize the accepted design without reinterpreting it.
 
  ---
@@ -18,13 +18,13 @@ It must operationalize the accepted design without reinterpreting it.
 
  ### Design Contract Obedience
 
- The accepted Phase 1 design is binding.
+ The accepted design is binding.
 
  You must:
  - treat the accepted design as the locked product contract
  - preserve the original prompt plus the accepted requirements-and-clarification baseline as already resolved by the design
  - convert the design into a concrete execution plan
- - make the plan actionable, test-complete, and merge-safe
+ - make the plan actionable, test-complete, and integration-safe
  - carry forward all design-level runtime, security, coverage, README, and review obligations into execution form
 
  You must not:
@@ -34,7 +34,7 @@ You must not:
  - defer critical delivery obligations to “cleanup later”
  - separate implementation work from its required tests
  - allow shared files to become uncontrolled churn
- - leave the final verification/hardening phase vague
+ - leave final verification and hardening vague
 
  If you discover a contradiction, gap, or impossible dependency in the design:
  - flag it explicitly as a handoff exception
@@ -42,13 +42,13 @@ If you discover a contradiction, gap, or impossible dependency in the design:
  - propose the narrowest correction
  - do not silently rewrite the contract
 
- When an execution-planning decision depends on framework, library, API, platform, or tool specifics, verify the decision against authoritative documentation first and targeted current research second. Do not guess about important stack behavior.
+ When an execution-planning decision depends on framework, library, API, platform, or tool specifics, verify the decision with the Context7 CLI/skill first and targeted current research second. Use `npx ctx7@latest library <name> "<question>"` before `npx ctx7@latest docs <libraryId> "<question>"` unless a valid Context7 library id is already known. Do not guess about important stack behavior.
 
  ---
 
- ## Phase Boundary
+ ## Planning Boundary
 
- This is an execution-planning phase.
+ This is an execution-planning task.
 
  Do not:
  - write implementation code
@@ -59,7 +59,7 @@ Do not:
  - invent new product meaning
  - bypass the design by making convenience decisions that should have been locked earlier
 
- This phase must decide exactly how the system will be built and verified.
+ This plan must decide exactly how the system will be built and verified.
 
  ---
 
@@ -74,36 +74,41 @@ Produce `plan.md` as an ordered, section-addressable, resume-friendly execution
  - collision-safe Compose/runtime assumptions where relevant
  - explicit security execution contract
  - explicit prompt-critical rule matrix derived from exact prompt language and accepted clarifications
+ - explicit prompt verb acceptance matrix derived from action verbs in the prompt and accepted clarifications
  - explicit role surface matrix for any role-sensitive project
  - explicit test coverage execution contract
- - explicit core semantic path proof target and expected `P5` evidence
+ - explicit core semantic path proof target and expected local readiness evidence
  - explicit delivery-review requirement matrices
- - explicit module-first decomposition before any file/path ownership details
- - a minimal file/location ownership map only where it is needed to make module packets executable
+ - explicit module-first decomposition before any file/path responsibility details
+ - a minimal file/location responsibility map only where it is needed to make module packets executable
  - shared-file contract that lands before module execution
- - ordered module packets derived from the module ownership map rather than only abstract feature buckets
+ - canonical cross-surface contracts for shared schemas, payloads, imports, exports, notifications, reports, clipboard content, templates, and adapter boundaries where applicable
+ - ordered module packets derived from the module responsibility map rather than only abstract feature buckets
  - optional independent helper work only when it is clearly safer than normal sequential execution
- - file/location ownership by module wherever realistically useful
+ - file/location responsibility by module wherever realistically useful
  - tests traveling with implementation
  - per-module integration targets and final cross-module verification
- - final integrated verification and hardening plan that gets the workflow to evaluation quickly rather than creating a long pre-evaluation fix loop
- - cleanup before final delivery review
+ - final integrated verification and hardening plan that gets the repo to truthful readiness quickly rather than creating a long fix loop
+ - plan-section closure evidence that the developer must cite in major completion claims
+ - cleanup before final delivery
 
  ---
 
  ## Non-Negotiable Rules
 
- ### 0. Plan must operationalize the final evaluator prompts exactly
- The plan must turn the accepted design into concrete repo evidence for the final static evaluation prompts and the unified test-coverage/README prompt. Keep this compact and implementation-facing; do not add process-only bloat.
+ ### 0. Plan must operationalize the delivery contract exactly
+ The plan must turn the accepted design into concrete repo evidence for strict product, security, test, and README verification. Keep this compact and implementation-facing; do not add process-only bloat.
 
- The plan must explicitly require these evaluator-judged deliverables when applicable:
+ The plan must explicitly require these delivery-critical artifacts when applicable:
  - repo-root `README.md` with project type near the top using exactly one of `backend`, `fullstack`, `web`, `android`, `ios`, or `desktop`
+ - static consistency across README/docs/scripts/routes/config/examples/manifests/env examples, with entry points and project structure traceable without running the app
  - README startup instructions containing `docker compose up --build` as the primary runtime contract and the exact legacy compatibility string `docker-compose up` without making it primary
  - README access method: URL/port for backend/web/fullstack, emulator/device steps for mobile, launch steps for desktop
  - README verification method: API curl/Postman, UI flow, mobile screen usage, or desktop interaction as applicable
  - README auth disclosure: credentials for every role if auth exists, otherwise exact statement `No authentication required`
+ - README quick-start seeded data disclosure: for any app that is not useful from an empty state, the delivered runtime must create deterministic local demo/test data through the normal bootstrap path and the README must list the seeded accounts, sample records, URLs, IDs, or steps needed to exercise the main flows quickly
  - no README/runtime dependence on `npm install`, `pip install`, `apt-get`, manual DB setup, runtime installs, `.env`, or `.env.example`
- - repo-root dockerized `./run_tests.sh`; local dependency based broad-test scripts may exist only as developer convenience, not the final audit contract
+ - repo-root dockerized `./run_tests.sh`; local dependency based broad-test scripts may exist only as developer convenience, not the broad delivery contract
  - API endpoint inventory for every backend/fullstack route as exact `METHOD + fully resolved PATH`, including prefixes/versioning/nested routers
  - API test plan per endpoint: covered yes/no target, test type (`true no-mock HTTP`, `HTTP with mocking`, or `unit-only / indirect`), planned files, request input, response content/side effect, and key assertion
  - API coverage summary must be computable from the plan when APIs exist: total endpoints, endpoints with HTTP tests, endpoints with true no-mock HTTP tests, HTTP coverage percentage, true API coverage percentage, and mocked/indirect exceptions
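Reviewer note on the hunk above: the "API coverage summary must be computable from the plan" rule is easier to picture with a small worked example. The following is a minimal Python sketch, not part of the package. The `EndpointRow` shape, the function name, and the sample routes are hypothetical; only the three test-type labels come from the prompt text. It also assumes that "endpoints with HTTP tests" counts both true no-mock and mocked HTTP tests, which the prompt does not state outright.

```python
from dataclasses import dataclass
from typing import List, Optional

# Test-type labels quoted from the prompt's API test plan wording.
TRUE_HTTP = "true no-mock HTTP"
MOCKED_HTTP = "HTTP with mocking"
INDIRECT = "unit-only / indirect"


@dataclass(frozen=True)
class EndpointRow:
    """One endpoint inventory row (hypothetical shape, for illustration only)."""
    method: str
    path: str
    test_type: Optional[str]  # None means no test is planned for this endpoint


def api_coverage_summary(rows: List[EndpointRow]) -> dict:
    """Derive the summary figures the prompt lists from the inventory rows."""
    total = len(rows)
    # Assumption: an "HTTP test" is either a true no-mock or a mocked HTTP test.
    http_tested = sum(1 for r in rows if r.test_type in (TRUE_HTTP, MOCKED_HTTP))
    true_tested = sum(1 for r in rows if r.test_type == TRUE_HTTP)

    def pct(count: int) -> float:
        return round(100.0 * count / total, 1) if total else 0.0

    return {
        "total_endpoints": total,
        "http_tested": http_tested,
        "true_no_mock_tested": true_tested,
        "http_coverage_pct": pct(http_tested),
        "true_api_coverage_pct": pct(true_tested),
        "mocked_or_indirect": [
            f"{r.method} {r.path}"
            for r in rows
            if r.test_type in (MOCKED_HTTP, INDIRECT)
        ],
    }


# Hypothetical inventory used only to exercise the function.
example_rows = [
    EndpointRow("GET", "/api/v1/items", TRUE_HTTP),
    EndpointRow("POST", "/api/v1/items", MOCKED_HTTP),
    EndpointRow("DELETE", "/api/v1/items/{id}", INDIRECT),
    EndpointRow("GET", "/api/v1/health", None),
]

if __name__ == "__main__":
    print(api_coverage_summary(example_rows))
```

Keeping the summary as a pure function of the inventory rows means the percentages cannot drift from the per-endpoint table, which appears to be the intent of "computable from the plan".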
@@ -112,20 +117,22 @@ The plan must explicitly require these evaluator-judged deliverables when applic
  - frontend unit/component/state tests for `fullstack` and `web` projects that import/render actual frontend modules; do not let package.json alone count as evidence
  - frontend state coverage for loading, empty, error, submitting, disabled, success/failure, validation, duplicate/re-entry, and prompt-critical search/filter/sort/pagination
  - frontend static quality coverage for web/fullstack/frontend work: layout hierarchy, spacing/alignment consistency, theme/content fit, font/icon consistency, and static support for hover/click/disabled/current-state styling where applicable
+ - frontend structure credibility for web/fullstack/frontend work: connected pages/routes/app shell, organized state/data flow, credible service/adaptor/mock/storage boundaries, and no excessive single-file or fragmented-snippet delivery
  - fullstack major-flow FE↔BE tests or platform-equivalent proof; if missing, the plan must name the compensating API/unit proof and residual risk
  - security test/proof rows for authentication, route authorization, object-level authorization, tenant/user isolation, and admin/internal/debug protection
  - logging/observability and sensitive-data leakage checks for logs, responses, console, analytics, storage, and visible UI where applicable
  - mock/local/demo/debug disclosure and fake-success prevention in README, UI, and tests
+ - pure-frontend mock/local-data boundary honesty: for `web` projects with no backend service, mock/local data is allowed when disclosed, but docs/UI must not imply real backend integration and fake-success paths must not hide missing failure handling
 
- Every item above must map to a module packet, README task, API coverage row, frontend coverage row, or final verification row. If not applicable, state `Not Applicable` with a short reason. Do not leave evaluator obligations only in broad prose.
+ Every item above must map to a module packet, README task, API coverage row, frontend coverage row, or final verification row. If not applicable, state `Not Applicable` with a short reason. Do not leave delivery obligations only in broad prose.
 
  ### 1. Every planned element must travel with tests
  For every meaningful planned element, the plan must state:
  - what is being implemented
  - where it will live
  - which tests must exist
- - which module or owner owns those tests
- - what proof is expected before merge
+ - which module or responsible implementer owns those tests
+ - what proof is expected before integration
 
  Plan the whole prompt-relevant app surface early.
  Do not stop at only major modules or a sample of happy paths.
@@ -137,7 +144,7 @@ When frontend surfaces exist, plan explicit flow-by-flow state coverage for load
  No orphan requirements are allowed in `plan.md`. Every requirement ID from the accepted requirements breakdown, every accepted clarification that affects delivery, every design no-orphan row, and every API spec route must map to at least one module packet and one proof path. If a requirement is intentionally not implemented, the plan must name the exact accepted out-of-scope or non-applicability rationale; vague `not needed`, `later`, `covered generally`, or `internal` language is a planning defect.
 
  ### 1.5 Module-first planning before files and execution order
- Before writing module execution order or any file/location ownership details, identify the real modules/domains/product units from the accepted design, API spec, prompt requirements, and current repo tree.
+ Before writing module execution order or any file/location responsibility details, identify the real modules/domains/product units from the accepted design, API spec, prompt requirements, and current repo tree.
 
  The module map is not a naming exercise. It is the acceptance skeleton for implementation. Each module row must describe the behavior that will make the prompt-required product real, the surfaces that expose that behavior, the state/artifact/config changes that prove it happened, and the tests that would fail if the implementation only preserved current placeholder behavior.
 
@@ -150,7 +157,7 @@ For each module, the plan must name:
  - shared contracts and dependencies
  - required API, unit, integration, E2E/platform, and frontend component/state tests
  - required FE↔BE wiring proof when UI and backend both exist
- - required main-lane module verification after fan-in before development exit
+ - required integrated module verification before development exit
  - failure paths, security boundaries, and lazy/shell-only partial-completion risks
  - exact acceptance proof: the observable state change, persisted row, artifact, byte provenance, UI task closure, or external effect that proves the module is not a shell
  - current-behavior trap: any tempting test that would merely prove the incomplete implementation rather than the prompt-required behavior, with the corrected assertion named
@@ -158,15 +165,29 @@ For each module, the plan must name:
  - adjacent requirements explicitly not owned by this module, with the module that does own them, so no requirement disappears between modules
 
  Only after this module map exists should the plan derive:
- - the minimal file/location ownership needed for executable module packets
+ - the minimal file/location responsibility details needed for executable module packets
  - shared-file stabilization order
  - ordered module packets
 
  If the plan starts with files before modules, it is incomplete. A full optimistic file tree is no longer required as a standalone planning artifact; the plan should name files, directories, routes, tests, and shared surfaces only to the extent needed to make module packets concrete and verifiable.
 
  Coverage target:
- - plan meaningful tests for every evaluator-relevant endpoint, core flow, security boundary, and major failure path
+ - plan meaningful tests for every delivery-relevant endpoint, core flow, security boundary, and major failure path
  - do not use an abstract percentage target as a substitute for endpoint-by-endpoint, flow-by-flow, and risk-by-risk coverage mapping
+ - when the stack has measurable coverage tooling, set the minimum clean-development target to at least `90%` line/branch coverage for unit-testable product code and at least `90%` completion of planned E2E/platform-critical flow coverage; if the stack cannot measure one of these honestly, the plan must name the exact reason and substitute a concrete file/flow/assertion coverage ledger
+ - when backend/fullstack APIs exist, require `100%` of documented prompt-relevant `METHOD + PATH` surfaces to have true no-mock HTTP tests unless a narrow accepted exception is recorded per endpoint with compensating proof
+
+ ### 1.6 Vertical implementation requirement
+ The plan must enforce vertical implementation over breadth-first coverage:
+ - build one complete user/operator flow end-to-end before starting the next
+ - for every form, implement template + route + handler + service + persistence + response together
+ - for every page link, register and render the target page before claiming the source page complete
+ - for every background job, wire it from startup and verify it is reachable before claiming it complete
+ - for every security control, enforce it at the correct layer (service, middleware, DB, template, runtime) before claiming it complete
+ - do not allow module completion claims based on file counts, route counts, template counts, or test counts alone
+ - a feature is complete only when the intended actor can perform the task end-to-end through the real app path, or it is explicitly marked incomplete with a named residual risk
+ - when planning module execution, order modules so that core vertical flows are completed before peripheral surfaces are added
+ - prohibit completion language such as "all X endpoints implemented", "Y modules complete", or "Z tests passing" unless each surface has been verified as a working vertical flow
 
  For all `web` projects:
  - require explicit Playwright or equivalent real in-browser E2E coverage planning in `plan.md`
@@ -182,32 +203,34 @@ For all `fullstack` projects:
  - reject generic internal/API-only reasons; each accepted internal-only backend capability must name its real actor/consumer, invocation path, authorization boundary, evidence path, and why frontend exposure would be contrary to or unnecessary for the prompt
  - do not allow placeholders to satisfy the map: no fake-success API clients, static demo data pretending to be backend data, disconnected forms, TODO integration stubs, or shell handlers returning hardcoded success
  - require backend verification to prove frontend-called backend operations are real: they reach real handlers/services, perform the planned read/mutation/side effect, and are not merely registered routes or 200 responses
+ - require exact form field name contracts: for every form-backed endpoint, the plan must document the exact frontend form field names and the exact backend handler field names, and require a test that submits the rendered form through the real handler path to prove the contract works
+ - a form-to-backend contract is broken if the template uses `name="price"` and the handler reads `price_value`; the plan must prevent this through shared constants, typed request structs, or explicit field-name mapping
 
  Core semantic path:
  - identify the single task centerpiece path that would prove the product is real rather than a shell
  - define its exact user/API route, persisted input fixture or realistic data setup, required config/rule path, expected state transition or artifact, expected failure behavior, and test/manual proof path
- - make this proof mandatory before evaluation unless a named residual risk is explicitly accepted
+ - make this proof mandatory before readiness handoff unless a named residual risk is explicitly accepted
 
  Prompt-critical rules:
  - extract exact rules from the original prompt and accepted clarifications into a matrix
  - include dates, thresholds, limits, state names, actor identity, retries, timing/scheduling, public/private boundaries, security constraints, and operator promises
- - assign each rule an implementation owner, test owner, README/doc implication, and `P5` proof expectation
+ - assign each rule an implementation responsibility, test responsibility, README/doc implication, and local proof expectation
 
  Prompt-critical interaction defaults:
  - for every prompt-critical form, modal, wizard, approval/share flow, import/export flow, or configuration screen, define initial visible state, default field values, untouched submit behavior, first normal action behavior, duplicate/re-entry behavior, and the planned proof path
  - do not let default-state or untouched-submit semantics remain implied for critical user-facing flows
 
  Role/permission surfaces:
- - for any auth, role, ownership, public/private route, admin, viewer/read-only, audit, export, or notification behavior, build a role surface matrix
+ - for any auth, role, ownership, public/private route, admin, viewer/read-only, accountability logging, export, or notification behavior, build a role surface matrix
  - cover route/page/API/action/data object/notification/export/report/admin surfaces
- - include role access, read/write scope, ownership, admin override, viewer/read-only behavior, audit/logging, frontend route/nav expectation, and required test evidence
+ - include role access, read/write scope, ownership, admin override, viewer/read-only behavior, accountability/logging, frontend route/nav expectation, and required test evidence
  - do not allow design, API spec, router, backend, frontend, README, or tests to contradict the matrix
 
  The plan must not allow:
  - a surface to be implemented now and tested “later"
  - central test cleanup as the default for normal feature work
  - workstreams that claim completeness without their matching tests
- - large prompt-relevant surfaces with no planned test ownership
+ - large prompt-relevant surfaces with no planned test responsibility
  - shell/demo/placeholder workstreams being presented as full implementation
  - route/page/module shells being treated as equivalent to closed user-facing or operator-facing behavior
  - risk-path coverage to remain implied when the prompt or implementation clearly suggests validation failures, edge cases, authorization failures, concurrency-sensitive behavior, or partial-failure behavior that later planning could miss
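Reviewer note on the hunk above: the new form field name contract rule (the `name="price"` versus `price_value` case) can be illustrated with a small static check. This is a minimal Python sketch under stated assumptions, not the package's method. The template string, the `title` field, and all helper names are hypothetical. It only compares rendered field names against the names a handler expects; the prompt additionally asks for a test that submits the rendered form through the real handler path, which this sketch does not replace.

```python
from html.parser import HTMLParser
from typing import Dict, List, Set

# Hypothetical shared constants: one source of truth for both template and handler.
PRICE_FIELD = "price"
TITLE_FIELD = "title"

# Hypothetical template; the field names are filled in from the shared constants.
FORM_TEMPLATE = (
    '<form method="post" action="/items">'
    '<input type="text" name="{title}">'
    '<input type="number" name="{price}">'
    '<button type="submit">Save</button>'
    "</form>"
)


class FieldNameCollector(HTMLParser):
    """Collects the name attribute of every input, select, and textarea tag."""

    def __init__(self) -> None:
        super().__init__()
        self.names: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                self.names.add(name)


def rendered_field_names(html: str) -> Set[str]:
    collector = FieldNameCollector()
    collector.feed(html)
    return collector.names


def contract_mismatches(html: str, handler_fields: Set[str]) -> Dict[str, List[str]]:
    """Report names the handler reads but the form lacks, and the reverse."""
    rendered = rendered_field_names(html)
    return {
        "missing_in_form": sorted(handler_fields - rendered),
        "unread_by_handler": sorted(rendered - handler_fields),
    }


rendered_form = FORM_TEMPLATE.format(title=TITLE_FIELD, price=PRICE_FIELD)

# Handler that shares the constants: the contract holds.
good = contract_mismatches(rendered_form, {TITLE_FIELD, PRICE_FIELD})

# Handler that reads `price_value` instead: the broken case named in the prompt.
broken = contract_mismatches(rendered_form, {TITLE_FIELD, "price_value"})

if __name__ == "__main__":
    print(good)
    print(broken)
```

The shared-constant approach is one of the three options the prompt names (shared constants, typed request structs, or explicit field-name mapping); the other two would need a different check.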
@@ -223,22 +246,24 @@ The plan must enforce:
 
  ### 3. The plan must support module-packet execution with optional safe parallelism
  Default to:
- - a small shared foundation in the main lane first
+ - a small shared foundation first
  - then ordered module packets executed one by one by default
- - optional independent helper work only when a module, discovery, or verification task is clearly safer outside the main sequence
+ - optional independent helper work only when a module, discovery, test-coverage task, or verification task is clearly safer outside the main sequence
  - explicit per-module integration points
  - explicit post-module and final cross-module verification
  - explicit justification for any optional concurrent module batch
 
  Planning must make module execution launch-ready, not merely theoretically ordered. The plan must include a concrete module packet sequence: which module starts first, which modules depend on it, which files/tests each module owns, which shared files it must not edit, what FE↔BE and file-existence proof is required, what verification commands must run before moving to the next module, and what completion report is expected.
 
+ The plan must also include a plan-row execution ledger that turns every actionable `plan.md` row into a completion item with responsible module/work package, repo evidence, verification evidence, status, and blocker/risk field. This ledger is the development execution scoreboard: no actionable row may be left `planned`, vague, or unassigned before clean development completion.
+
  Each module packet must be complete enough to execute without invention. A module packet must name the module capsule, requirement IDs, owned files/tests, shared-file restrictions, frontend/backend/data/security/failure-state scope, real FE↔BE wiring proof where applicable, real file/import/route/service/data proof, verification commands, and the standard module completion checklist.
 
  Each module packet must include an explicit `Requirement Closure Checklist` with one row per requirement or sub-requirement it owns. Each row must include implementation surface, test/assertion surface, negative/edge proof if applicable, documentation/README implication if applicable, and completion status. A module packet cannot be marked complete while any owned row is unchecked, delegated without a target module, or proven only by generic smoke coverage.
 
- The primary developer/integration lane remains responsible for sequencing, shared-file foundation, module implementation, integration decisions, and final cross-module verification. It may use helper work only when the plan names a safe independent packet or verification task.
+ The primary implementation sequence remains responsible for sequencing, shared-file foundation, module implementation, integration decisions, and final cross-module verification. Helper work may be used only when the plan names a safe independent packet or verification task.
 
- The primary developer/integration lane must consume module completion evidence after each module: inspect ownership, reject incomplete module packets, wire shared surfaces, rerun targeted module tests in the integrated checkout, and record integrated evidence before moving to the next module.
+ The primary implementation sequence must consume module completion evidence after each module: inspect file responsibility, reject incomplete module packets, wire shared surfaces, rerun targeted module tests in the integrated checkout, and record integrated evidence before moving to the next module.
 
  The execution order must be explicit and module-first:
  - scaffold step first
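Reviewer note on the hunk above: the plan-row execution ledger added here lists its fields (responsible module/work package, repo evidence, verification evidence, status, blocker/risk) and the rule that no actionable row may be left `planned`, vague, or unassigned. A minimal Python sketch of that check follows. The `LedgerRow` shape, the `done` status value, the sample rows, and the choice to treat missing evidence as "vague" are all assumptions for illustration; only the field list and the `planned` status come from the prompt text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LedgerRow:
    """One actionable plan.md row (hypothetical shape; fields follow the prompt's list)."""
    plan_row: str
    responsible: str            # responsible module or work package
    repo_evidence: str          # for example a file path or route
    verification_evidence: str  # for example a test name or command
    status: str                 # `planned` is the only status the prompt names
    blocker_or_risk: str = ""


def open_rows(ledger: List[LedgerRow]) -> List[Tuple[str, List[str]]]:
    """Return rows that would block clean development completion, with reasons."""
    problems = []
    for row in ledger:
        reasons = []
        if row.status.strip().lower() == "planned":
            reasons.append("still planned")
        if not row.responsible.strip():
            reasons.append("unassigned")
        # Assumption: a row without both kinds of evidence counts as "vague".
        if not row.repo_evidence.strip() or not row.verification_evidence.strip():
            reasons.append("missing evidence")
        if reasons:
            problems.append((row.plan_row, reasons))
    return problems


# Hypothetical ledger used only to exercise the check.
example_ledger = [
    LedgerRow("R1 login form", "auth", "src/auth/login.py", "tests/test_login.py", "done"),
    LedgerRow("R2 export report", "", "", "", "planned"),
    LedgerRow("R3 audit trail", "logging", "src/logging/trail.py", "", "done"),
]

if __name__ == "__main__":
    for plan_row, reasons in open_rows(example_ledger):
        print(plan_row, "->", ", ".join(reasons))
```

A check of this kind could run at the end of each module packet so that the "scoreboard" reading of the ledger stays honest rather than being reconciled once at the end.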
@@ -258,8 +283,9 @@ Each planned module packet must specify:
  - real file/import/route/service/data existence proof
  - verification commands that must pass before moving to the next module
  - required standard module completion checklist
+ - coverage contribution: endpoints, unit-testable product-code areas, and E2E/platform-critical flows this module must close
 
- If helper work is planned, the plan must state only: task/module, why it is safer outside the main sequence, owned files/tests, shared-file restrictions, verification commands, and integration rule. Do not add branch/worktree launch mechanics to the general plan unless the owner explicitly asks.
+ If helper work is planned, the plan must state only: task/module, why it is safer outside the main sequence, owned files/tests, shared-file restrictions, verification commands, and integration rule. Do not add branch/worktree launch mechanics unless explicitly required for implementation.
 
  Do not leave module packet creation as optional follow-up.
  If the plan cannot produce a launch-ready ordered module packet sequence after the shared foundation, it is not ready for development.
@@ -270,12 +296,12 @@ If the plan cannot produce a launch-ready ordered module packet sequence after t
  - ordered
  - resume-friendly
  - section-addressable
- - explicit about prerequisites, owners, and merge targets
- - explicit about what is main-sequence work vs independently safe helper work
+ - explicit about prerequisites, responsible modules, and integration targets
+ - explicit about what is primary-sequence work vs independently safe helper work
 
  ### 5. Shared files must be isolated early
  The plan must stabilize a small shared-file contract before broader module implementation.
- Typical main-lane-owned files include:
+ Typical primary-sequence shared files include:
  - `plan.md`
  - `README.md`
  - root config files
@@ -288,16 +314,20 @@ Typical main-lane-owned files include:
  Independent helper work must not touch shared files casually.
 
  ### 6. Final convergence must be planned explicitly
- The plan must define `P5` as a minimal gate, not a perfection phase.
+ The plan must define integrated readiness hardening as a minimal delivery-readiness condition, not a perfection pass.
 
  It must define:
- - the owner-run `P5` local-harness check
+ - the local-harness readiness check
  - that Docker runtime and dockerized `./run_tests.sh` are prepared but deferred to final packaging confirmation
  - the rough repo-to-`plan.md` coherence check
- - that the owner should stop and ask whether to proceed to evaluation once those are satisfied
- - that only narrow config/wrapper/README/docs/light-script glue may be fixed directly in `P5`
- - that real code or actual test-file changes go back to the developer
- - that final packaging confirmation is the first real Docker/runtime confirmation point
+ - the readiness handoff once those checks are satisfied
+ - that only narrow config/wrapper/README/docs/light-script glue may be fixed directly by whoever is assigned that cleanup
+ - that real code or actual test-file changes are handled as implementation corrections after initial implementation is accepted
+ - that final runtime confirmation in packaging is the broad Docker/runtime confirmation point, while P8 may launch the app only for `agent-browser` functional verification when no equivalent local runtime is available
+
+ This stop boundary applies only when the active instruction asks for an implementation-readiness handoff.
+
+ If implementation is explicitly scoped end to end in one autonomous execution run, it must not stop at rough local readiness. It must continue through all implementation, assigned tests, README/runtime/script reconciliation, integrated local verification, and final truthful handoff unless the active instruction explicitly says to stop earlier.
 
  ---
 
@@ -330,40 +360,69 @@ State the rules for:
330
360
  - test-with-implementation
331
361
  - no `.env` files
332
362
  - Docker/runtime/test honesty
333
- - merge discipline
334
- - final rough coherence-to-evaluation handoff discipline
363
+ - integration discipline
364
+ - final rough coherence-to-readiness handoff discipline
335
365
  - no Docker execution before final packaging confirmation
336
366
 
367
+ External planning documents under `../docs/` are planning/reference inputs only.
368
+
369
+ Delivery evidence must exist inside the delivered repo through:
370
+ - `README.md`
371
+ - code structure
372
+ - route/app/server registration
373
+ - tests
374
+ - scripts
375
+ - config/manifests
376
+ - Docker/runtime files
377
+ - seeded/bootstrap paths where applicable
378
+
379
+ Do not rely on parent-directory docs as the only evidence for any final delivery, runtime, README, API, frontend, security, or test obligation.
380
+
381
+ The README contract must include a Configuration and Environment Model section or equivalent content explaining:
382
+ - whether any configuration is required for local use
383
+ - where runtime defaults come from
384
+ - how Docker/Compose receives local-development configuration
385
+ - whether seeded data is created automatically
386
+ - whether auth exists and what credentials are available
387
+ - that no committed `.env` or `.env.example` is required
388
+ - that no manual package/runtime/database setup is required beyond documented host prerequisites
389
+ - how config-sensitive behavior can be verified
390
+
391
+ If no configuration is required, require the README to state: `No manual environment configuration is required for local Docker startup. The delivered local runtime uses repo-controlled Docker/Compose defaults and bootstrap paths.`
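As an illustration only, and not part of the diffed package content, the required README statement above could be checked mechanically. This is a minimal sketch: the helper names `normalize` and `readme_has_no_config_statement` are hypothetical, and it assumes the statement must appear verbatim apart from whitespace and line wrapping.

```python
# Hypothetical check: does a README contain the required no-configuration statement?
REQUIRED_STATEMENT = (
    "No manual environment configuration is required for local Docker startup. "
    "The delivered local runtime uses repo-controlled Docker/Compose defaults "
    "and bootstrap paths."
)


def normalize(text: str) -> str:
    """Collapse all whitespace runs so README line wrapping does not matter."""
    return " ".join(text.split())


def readme_has_no_config_statement(readme_text: str) -> bool:
    """Return True when the required statement appears in the README text."""
    return normalize(REQUIRED_STATEMENT) in normalize(readme_text)
```

A caller would read `README.md` from the delivered repo and pass its text in; whether a looser "equivalent content" match is acceptable is left to the reviewer.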
392
+
337
393
  ### C. Scaffold Step At The Start Of Development
338
394
  Lock:
339
395
  - exact starter/playbook/baseline
340
396
  - exact bootstrap command
341
397
  - baseline files and directories expected after the scaffold step lands
342
398
  - language/framework bootstrap notes that must be preserved
399
+ - frontend defaults if prompt/repo are silent: Vue 3 + Vite + TypeScript, Tailwind CSS, and shadcn/ui when compatible
343
400
  - minimal proof surface that should exist immediately after bootstrap and before real feature work
344
401
  - exact Docker/runtime setup to land first, with `docker compose up --build` honestly wired for later final confirmation without local dependency guesswork
402
+ - exact seeded-data strategy to land early when the product needs non-empty data: what is seeded, where the seed lives, how it runs through Docker/init paths, how it stays deterministic/idempotent, and what README quick-start information exposes it
345
403
  - exact repo-root `./run_tests.sh` setup to land first as the dockerized full-app test-suite path reserved for final confirmation, and it must not be planned as a smoke-only, no-op, or shortcut command
346
- - exact separate local test harness setup to land first for development and owner-side `P5`, using the real stack-native local suite for the chosen language/framework such as Vitest, Jest, PHPUnit, pytest, go test, cargo test, or the closest framework-native equivalent, with clear local prerequisites, deterministic execution, and useful failure output
404
+ - exact separate local test harness setup to land first for development and later readiness checks, using the real stack-native local suite for the chosen language/framework such as Vitest, Jest, PHPUnit, pytest, go test, cargo test, or the closest framework-native equivalent, with clear local prerequisites, deterministic execution, and useful failure output
347
405
  - exact requirement that missing local-suite tooling be installed and configured on the local machine before local verification is treated as available
348
406
  - exact reliability contract those commands must satisfy later: for the local harness, no hidden setup steps, no hidden shell state, clear local prerequisites, deterministic execution, and useful failure output; for Docker/runtime and dockerized `./run_tests.sh`, no manual exports, no hidden prep, real readiness gating and healthchecks instead of arbitrary sleeps where practical, and useful failure output
349
407
  - exact expectation that Docker/runtime files and dockerized `./run_tests.sh` are set up but not executed before final packaging confirmation, while the local harness is prepared for development use from scaffold onward
408
+ - exact expectation that seeded quick-start data is not fake completion: seeded data may provide fixtures/accounts/sample records, but product flows must still use real code paths, persistence, validation, authorization, and tests
350
409
  - local development environment/tooling and testing harness contract if available
351
410
  - exact README structure/template that must land early
352
411
  - stop boundary if the scaffold step is intentionally isolated
353
412
  - completion evidence for the scaffold step
354
- - explicit statement that development continues immediately after this step unless the owner says otherwise
413
+ - explicit statement that development continues immediately after this step unless the active instruction says otherwise
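As an illustration only, and not part of the diffed package content, the "deterministic/idempotent" seeded-data expectation in the list above can be sketched with the standard-library `sqlite3` module. The table name, seed rows, and `seed` function are all hypothetical; a real project would use its own database and run the seed through its Docker/init path.

```python
import sqlite3

# Hypothetical fixed seed set: the same rows on every run (deterministic).
SEED_USERS = [
    ("admin@example.test", "admin"),
    ("viewer@example.test", "viewer"),
]


def seed(conn: sqlite3.Connection) -> None:
    """Create the table if needed and write the seed rows.

    Running this any number of times leaves exactly the same rows
    (idempotent), because rows are keyed on email and replaced in place.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "email TEXT PRIMARY KEY, role TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO users (email, role) VALUES (?, ?)",
        SEED_USERS,
    )
    conn.commit()
```

Seeding this way only provides accounts and sample records; product flows would still need to go through real code paths, validation, and authorization.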
355
414
 
356
415
  The scaffold step must stay strict and minimal:
357
416
  - no prompt-specific business features
358
417
  - no deep domain implementation
359
418
  - no release polish beyond the startup/test/documentation baseline
360
419
  - no host-side dependency setup beyond the documented local development prerequisites plus the deferred Docker/runtime contract
361
- - no Docker execution or dockerized `./run_tests.sh` execution before final packaging confirmation, while the separate local harness is available for development use and later `P5` verification
420
+ - no Docker execution or dockerized `./run_tests.sh` execution before explicit final runtime confirmation, while the separate local harness is available for development use and later readiness checks
362
421
  - scaffold placeholders must be explicitly labelled as scaffold-only and must not be counted as product surface completion
363
- - scaffold scripts may not use no-op, echo-only, or pass-with-no-tests behavior for a surface that the final product claims as implemented; if a temporary scaffold wrapper exists, the plan must name the later replacement owner and proof before development completion
422
+ - scaffold scripts may not use no-op, echo-only, or pass-with-no-tests behavior for a surface that the final product claims as implemented; if a temporary scaffold wrapper exists, the plan must name the later replacement responsibility and proof before development completion
364
423
  - product-facing pages, routes, services, jobs, and handlers are incomplete until they close the actor task through real code paths and owned tests, not merely because the repo can build
365
424
 
366
- For those deferred Docker/runtime/test commands, the plan must still demand an implementation shape that is highly likely to pass on the first owner-side execution later:
425
+ For those deferred Docker/runtime/test commands, the plan must still demand an implementation shape that is highly likely to pass on the first explicit final runtime check later:
367
426
  - container build inputs and startup paths must be fully repo-controlled and reviewable
368
427
  - Compose services must use real healthchecks and readiness-gated startup rather than sleep-based sequencing where practical
369
428
  - runtime and broad-test paths must be non-interactive and must not depend on hidden shell state
@@ -374,7 +433,7 @@ Use matrix form and map each applicable requirement to:
374
433
  - source requirement ID(s)
375
434
  - planned repo evidence
376
435
  - planned verification evidence
377
- - owning module/main-sequence section
436
+ - responsible module/section
378
437
 
379
438
  Required matrices:
380
439
  0. core semantic path proof and prompt-critical proof ledger
@@ -391,13 +450,13 @@ Required matrices:
391
450
  Define:
392
451
  - security-sensitive surfaces
393
452
  - shared security foundations that must land before broader module implementation
394
- - any dedicated security lane if needed
453
+ - any dedicated security work package if needed
395
454
  - required negative-path tests
396
- - required merge gates
455
+ - required integration checks
397
456
  - sensitive-data and logging rules
398
457
  - auth/authorization/isolation enforcement map
399
- - role surface matrix when auth, roles, ownership, public/private routes, admin, viewer/read-only behavior, audit, exports, or notifications are relevant
400
- - exact fail-closed expectations for privileged actions, signed links/tokens, audit durability, encryption, object ownership, and public-route boundaries where applicable
458
+ - role surface matrix when auth, roles, ownership, public/private routes, admin, viewer/read-only behavior, accountability logging, exports, or notifications are relevant
459
+ - exact fail-closed expectations for privileged actions, signed links/tokens, durable activity logs, encryption, object ownership, and public-route boundaries where applicable
401
460
 
402
461
  ### F. Test Coverage Execution Contract
403
462
  Map every meaningful planned surface/work package to:
@@ -406,10 +465,10 @@ Map every meaningful planned surface/work package to:
406
465
  - exact planned test files, suites, or test modules when they can already be named honestly
407
466
  - expected coverage evidence
408
467
  - owning module/main-sequence section
409
- - merge proof
468
+ - integration proof
410
469
 
411
- This section must be broad enough that a reviewer can trace how the full prompt-relevant app surface is expected to approach near-complete coverage before implementation starts.
412
- It must also be concrete enough that a reviewer can tell which specific tests are intended to be written, not just which abstract test layer labels apply.
470
+ This section must be broad enough that the full prompt-relevant app surface can be traced to coverage before implementation starts.
471
+ It must also be concrete enough to show which specific tests are intended to be written, not just which abstract test layer labels apply.
413
472
 
414
473
  Must include, where applicable:
415
474
  - backend unit tests
@@ -417,7 +476,7 @@ Must include, where applicable:
417
476
  - frontend unit/component/state tests
418
477
  - frontend unit/component/state tests planned early by real routes/screens/flows
419
478
  - prompt-critical interaction-default tests for untouched/default-path semantics where forms, modals, wizards, approval/share flows, imports/exports, or configuration screens matter
420
- - frontend unit-test audit readability for `web` / `fullstack`, including a plan strong enough that later review can state `Frontend unit tests: PRESENT` or `Frontend unit tests: MISSING` from file-level evidence
479
+ - frontend unit-test readability for `web` / `fullstack`, including a plan strong enough that later review can state `Frontend unit tests: PRESENT` or `Frontend unit tests: MISSING` from file-level evidence
421
480
  - API tests
422
481
  - API tests planned directly from accepted `../docs/api-spec.md` when applicable
423
482
  - integration tests
@@ -433,20 +492,20 @@ Must include, where applicable:
433
492
  - endpoint evidence categories: handler-reaching success, handler-reaching error, authorization short-circuit, mocked client evidence, unit-only evidence, documentation-only claim
434
493
  - behavioral proof categories: core semantic path proven, shallow status/enqueue only, artifact/state side effect proven, failure path proven, frontend-to-backend/browser proven, manually verified only
435
494
 
436
- During this phase, also write or update parent-root `../docs/test-coverage.md` from the accepted coverage contract so it already captures the planned requirement/risk mapping, API coverage strategy when applicable, and the honest current status as a planning-time baseline rather than waiting until packaging.
495
+ During planning, also write or update parent-root `../docs/test-coverage.md` from the accepted coverage contract so it already captures the planned requirement/risk mapping, API coverage strategy when applicable, and the honest current status as a planning-time baseline rather than waiting until final delivery.
437
496
 
438
497
  ### G. README Execution Contract
439
- The plan must include the exact README structure to be implemented and the lane/main-lane ownership model for keeping it current.
498
+ The plan must include the exact README structure to be implemented and the responsible module/primary-sequence model for keeping it current.
440
499
 
441
500
  ### H. Shared-File Contract
442
501
  Define:
443
502
  - which files are shared
444
- - which module or main lane owns each shared file
503
+ - which module or primary sequence owns each shared file
445
504
  - which files must be stabilized before module execution
446
505
  - which later workstreams may reference but not edit them without explicit reason
447
506
 
448
507
  ### I. Ordered Module Packet Map
449
- Build this from the module map plus concrete file/location ownership details where needed. Do not invent workstreams as abstract feature buckets first.
508
+ Build this from the module map plus concrete file/location responsibility details where needed. Do not invent workstreams as abstract feature buckets first.
450
509
 
451
510
  For each module packet define:
452
511
  - module order
@@ -474,7 +533,7 @@ For each major module or domain define:
474
533
  - source requirement IDs
475
534
  - exact frontend pages/components/actions owned
476
535
  - exact backend endpoints/services/jobs owned
477
- - exact planned files/directories before lane assignment
536
+ - exact planned files/directories before work assignment
478
537
  - supporting contracts
479
538
  - dependencies
480
539
  - required unit, API, integration, E2E/platform, and frontend component/state test files/layers
@@ -482,7 +541,7 @@ For each major module or domain define:
482
541
  - proof that backend capabilities are exposed in the frontend when required
483
542
  - failure paths that must be covered
484
543
  - what would count as a lazy or shell-only partial implementation that must not be accepted
485
- - merge criteria
544
+ - integration criteria
486
545
 
487
546
  Module planning rules:
488
547
  - identify modules before the file tree
@@ -491,11 +550,12 @@ Module planning rules:
491
550
  - every prompt-relevant module must have full API coverage when it owns APIs, meaningful unit coverage, integration proof where it crosses boundaries, E2E/platform proof for major actor paths, and frontend wiring proof where UI exists
492
551
  - every module must include frontend loading, empty, submitting, disabled, success, error, and duplicate-action states where those states matter
493
552
  - every module must name the exact backend operations its frontend calls and the exact frontend surfaces that expose its backend features
494
- - every module must include a development-exit verification row requiring the main lane to reread its planned files, confirm those files are real and integrated, run the module's assigned tests, confirm FE↔BE/API wiring, and record the result before development is declared complete
553
+ - every module must include a development-exit verification row requiring integrated reread of its planned files, confirmation that those files are real and wired, the module's assigned tests, FE↔BE/API wiring confirmation, and recorded results before development is declared complete
495
554
  - do not accept module plans that leave API coverage, frontend wiring, or E2E proof to generic final cleanup
496
555
 
497
556
  ### J1. Development Exit Verification Plan
498
- Before development can be reported complete, the primary integration lane must:
557
+ Before development can be reported complete, the primary implementation sequence must:
558
+ - update the plan-row execution ledger so every actionable plan row is `complete`, `not applicable`, or explicitly `risk accepted`; any `planned`, `in progress`, `delegated without receiver`, or unverified row blocks a clean completion claim
499
559
  - collect each module's standard completion checklist
500
560
  - compare planned module packets against completed module packets and record any skipped or blocked module with a blocker and revised sequencing
501
561
  - reject broad summary claims as sufficient evidence when module files, tests, FE↔BE proof, or file-existence proof are missing
@@ -503,7 +563,9 @@ Before development can be reported complete, the primary integration lane must:
503
563
  - run each module's assigned tests after module integration using the non-Docker local harness or targeted stack-native commands
504
564
  - run the full non-Docker local test suite available for development before exit, while still deferring Docker and dockerized `./run_tests.sh` to final packaging confirmation
505
565
  - confirm every module's FE↔BE surface, API coverage, unit coverage, integration coverage, and E2E/platform proof status against the plan
506
- - update the module verification matrix in `plan.md` before handing the repo to owner-side integrated verification
566
+ - confirm API coverage targets: every documented prompt-relevant `METHOD + PATH` has true no-mock HTTP proof or a per-endpoint accepted exception with compensating evidence
567
+ - confirm coverage floors: measured unit-testable product-code coverage is at least `90%` where tooling supports it, and planned E2E/platform-critical flow coverage is at least `90%` complete; if measurement is unavailable, the plan's substitute coverage ledger must be fully closed
568
+ - update the module verification matrix in `plan.md` before claiming integrated readiness
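As an illustration only, and not part of the diffed package content, the ledger and coverage-floor rules in the list above could be expressed as a small exit check. This is a sketch under assumptions: the `ExitInputs` fields, the `exit_blockers` function, and the way coverage is represented (as fractions and flow counts) are hypothetical and are not defined by the plan template.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Statuses the text treats as closed; any other status blocks completion.
CLOSED_STATUSES = {"complete", "not applicable", "risk accepted"}
COVERAGE_FLOOR = 0.90  # the 90% floor named in the text


@dataclass
class ExitInputs:
    ledger: Dict[str, str]            # plan row id -> status
    unit_coverage: Optional[float]    # measured fraction, None if no tooling
    e2e_flows_planned: int
    e2e_flows_proven: int
    substitute_ledger_closed: bool = False


def exit_blockers(inputs: ExitInputs) -> List[str]:
    """Return reasons a clean development-completion claim is blocked."""
    blockers: List[str] = []
    for row, status in sorted(inputs.ledger.items()):
        if status not in CLOSED_STATUSES:
            blockers.append(f"plan row {row} is '{status}'")
    if inputs.unit_coverage is None:
        # No measurement available: fall back to the substitute ledger.
        if not inputs.substitute_ledger_closed:
            blockers.append("coverage unmeasured and substitute ledger is open")
    elif inputs.unit_coverage < COVERAGE_FLOOR:
        blockers.append(f"unit coverage {inputs.unit_coverage:.0%} is below floor")
    if inputs.e2e_flows_planned > 0:
        ratio = inputs.e2e_flows_proven / inputs.e2e_flows_planned
        if ratio < COVERAGE_FLOOR:
            blockers.append(f"E2E flow coverage {ratio:.0%} is below floor")
    return blockers
```

An empty list means none of these particular conditions block the claim; the other exit rows (module checklists, FE↔BE proof, per-endpoint HTTP proof) are outside this sketch.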
507
569
 
508
570
  ### K. API Inventory and Coverage Plan (if applicable)
509
571
  Build this section directly from the accepted `../docs/api-spec.md` and classify the expected tests as:
@@ -544,22 +606,22 @@ Define:
544
606
  - platform proof caveats honestly where required
545
607
 
546
608
  ### N. Integrated Verification and Hardening Plan
547
- This must be a real final phase in the plan.
609
+ This must be a real final readiness section in the plan.
548
610
 
549
611
  It must define:
550
- - the minimal `P5` gate: the local test harness passes and the repo is roughly coherent with `plan.md`
551
- - the required internal `P5` issue-discovery loop: one fresh evaluator subagent session, exact prepared evaluation packet sent once, four same-session additional scan requests explicitly asking for additional prompt-fit/compliance, security, and delivery issues not already reported, no inter-round remediation, exactly five total reports saved under `../.ai/p5-evaluation/`, Blocker/High findings extracted after each round, consolidated, owner-analyzed, routed to the developer in one remediation brief, fixed, and verified before `P5` can close
552
- - the default proceed rule: once that minimal gate is satisfied, move to evaluation
553
- - every owner `P5` pass must reread the accepted design, accepted `../docs/api-spec.md` when applicable, `plan.md`, `README.md`, repo state, and current evidence in one full sweep rather than reducing follow-up passes to targeted sections only
554
- - normal `P5` should cap at 3 owner sweeps total: the opening sweep plus up to two follow-up full sweeps when needed
612
+ - the minimal readiness condition: the local test harness passes and the repo is roughly coherent with `plan.md`
613
+ - independent readiness expectation: the repo must withstand strict prompt-fit/compliance, security, delivery, test-coverage, README/static-verifiability, mock/demo/fake-success, and frontend state/interaction verification before final handoff
614
+ - the default proceed rule: once that minimal condition is satisfied, produce a truthful readiness handoff
615
+ - every integrated readiness pass must reread the accepted design, accepted `../docs/api-spec.md` when applicable, `plan.md`, `README.md`, repo state, and current evidence in one full sweep rather than reducing follow-up passes to targeted sections only
616
+ - normal readiness hardening should cap at 3 full sweeps total: the opening sweep plus up to two follow-up full sweeps when needed
555
617
  - narrow rerun strategy after known failures
556
- - issue-batching rule for integrated verification: collect all issues that directly block the minimal gate, then either fix the small owner-side churn directly or route one consolidated major fix list back to implementation
557
- - when a consolidated fix list contains independent items, the default is ordered remediation in the main developer lane; use helper work only where genuinely safer, and require per-bundle verification plus integrated proof before accepting completion
558
- - explicit statement that `P5` should not become an iterative development-completion mini-loop
618
+ - issue-batching rule for integrated verification: collect all issues that directly block readiness, then handle small docs/config/script churn directly if assigned or return one consolidated major fix list to implementation
619
+ - when a consolidated fix list contains independent items, the default is ordered remediation in the active correction workstream; use helper work only where genuinely safer, and require per-bundle verification plus integrated proof before accepting completion
620
+ - explicit statement that readiness hardening should not become an iterative development-completion mini-loop
559
621
  - doc tightening only when needed to keep the delivered state roughly coherent with `plan.md`
560
622
  - what must be true before completion is reported
561
623
 
562
- The integrated verification plan must explicitly say that shell/demo/placeholder implementations are reopened as incomplete work rather than accepted as partial success, but it should still be a rough release-alignment pass rather than an exhaustive final perfection gate. It must also explicitly say that a separate local test harness is prepared in scaffold and used during development plus owner-side `P5`, while Docker execution and dockerized `./run_tests.sh` remain deferred until final packaging confirmation.
624
+ The integrated verification plan must explicitly say that shell/demo/placeholder implementations are reopened as incomplete work rather than accepted as partial success, but it should still be a rough release-alignment pass rather than exhaustive final perfection. It must also explicitly say that a separate local test harness is prepared in scaffold and used during development plus later readiness checks, while Docker execution and dockerized `./run_tests.sh` remain deferred until explicit final runtime confirmation.
563
625
 
564
626
  ### O. Cleanup Before Final Review
565
627
  Define the cleanup step that removes:
@@ -609,18 +671,24 @@ Produce `plan.md` in this order:
609
671
 
610
672
  ---
611
673
 
674
+ ## Section-by-Section Delivery Rule
675
+
676
+ Write `plan.md` one template section at a time, appending each completed section to the file on disk. Do not paste full section text in chat and do not dump the accumulated document text in the final response.
677
+
678
+ ---
679
+
612
680
  ## Planning Quality Bar
613
681
 
614
682
  The execution plan is acceptable only if:
615
683
  - it obeys the accepted design
616
684
  - it does not reinterpret product meaning
617
- - it names only the files, routes, tests, and shared surfaces needed for evaluator-relevant implementation and proof
685
+ - it names only the files, routes, tests, and shared surfaces needed for delivery-relevant implementation and proof
618
686
  - it makes shared files explicit
619
687
  - it avoids helper structure unless it is clearly safer for an independent task
620
688
  - it attaches tests to implementation work
621
689
  - it defines exact runtime/test/README obligations
622
- - it defines exact merge and fan-in verification
623
- - it defines the real final convergence/hardening gate
690
+ - it defines exact integration and fan-in verification
691
+ - it defines the real final convergence/hardening condition
624
692
  - it is specific enough that implementation can follow it with minimal reinvention
625
693
 
626
694
  If a section is not applicable, mark it `Not Applicable` and explain why.