theslopmachine 1.0.2 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. package/assets/agents/developer.md +38 -32
  2. package/assets/agents/slopmachine-claude.md +36 -25
  3. package/assets/agents/slopmachine.md +61 -45
  4. package/assets/claude/agents/developer.md +27 -10
  5. package/assets/skills/claude-worker-management/SKILL.md +4 -4
  6. package/assets/skills/developer-session-lifecycle/SKILL.md +13 -3
  7. package/assets/skills/development-guidance/SKILL.md +24 -5
  8. package/assets/skills/evaluation-triage/SKILL.md +4 -4
  9. package/assets/skills/final-evaluation-orchestration/SKILL.md +29 -3
  10. package/assets/skills/integrated-verification/SKILL.md +24 -23
  11. package/assets/skills/p8-readiness-reconciliation/SKILL.md +98 -0
  12. package/assets/skills/planning-gate/SKILL.md +2 -2
  13. package/assets/skills/planning-guidance/SKILL.md +7 -4
  14. package/assets/skills/scaffold-guidance/SKILL.md +2 -0
  15. package/assets/skills/submission-packaging/SKILL.md +30 -3
  16. package/assets/skills/verification-gates/SKILL.md +11 -7
  17. package/assets/slopmachine/clarification-faithfulness-review-prompt.md +69 -45
  18. package/assets/slopmachine/clarifier-agent-prompt.md +46 -40
  19. package/assets/slopmachine/exact-readme-template.md +38 -11
  20. package/assets/slopmachine/owner-verification-checklist.md +2 -2
  21. package/assets/slopmachine/phase-1-design-prompt.md +94 -17
  22. package/assets/slopmachine/phase-1-design-template.md +124 -21
  23. package/assets/slopmachine/phase-2-execution-planning-prompt.md +155 -87
  24. package/assets/slopmachine/phase-2-plan-template.md +169 -81
  25. package/assets/slopmachine/scaffold-playbooks/selection-matrix.md +8 -1
  26. package/assets/slopmachine/scaffold-playbooks/tech-frontend-vue.md +2 -0
  27. package/assets/slopmachine/scaffold-playbooks/type-web-spa.md +1 -0
  28. package/assets/slopmachine/templates/AGENTS.md +18 -17
  29. package/assets/slopmachine/templates/CLAUDE.md +18 -17
  30. package/assets/slopmachine/templates/plan.md +115 -36
  31. package/package.json +9 -2
  32. package/src/constants.js +1 -0
  33. package/src/init.js +8 -0
  34. package/src/install.js +130 -0
  35. package/assets/slopmachine/utils/__pycache__/claude_live_hook.cpython-311.pyc +0 -0
  36. package/assets/slopmachine/utils/__pycache__/cleanup_delivery_artifacts.cpython-311.pyc +0 -0
  37. package/assets/slopmachine/utils/__pycache__/convert_ai_session.cpython-311.pyc +0 -0
  38. package/assets/slopmachine/utils/__pycache__/normalize_claude_session.cpython-311.pyc +0 -0
  39. package/assets/slopmachine/utils/__pycache__/strip_session_parent.cpython-311.pyc +0 -0
@@ -14,8 +14,9 @@ Use this skill after development begins whenever you are reviewing work, decidin
14
14
  - use this skill as the source of truth for owner-side verification, review pressure, and gate interpretation
15
15
  - do not pause execution for human approval while using this skill; continue reviewing, fixing, rerouting, and rerunning only until the material blocker is cleared
16
16
  - clarification completion and `P8 Final Readiness Decision` are internal workflow transitions, not user-stop gates; do not pause execution just to summarize progress or ask the user whether to continue
17
- - `P8 Final Readiness Decision` is the fast post-`P7` cross-surface reconciliation sweep: compare the delivered repo, `README.md`, parent-root `../docs/`, and carried `../.tmp/` audit artifacts, fix small owner-side drift directly, validate report shape and lineage, confirm final residual risks, and only hold back packaging when a material inconsistency remains
18
- - `P8` must emit or record a readiness reconciliation note covering docs checked, kept reports checked, archived/stale report lineage reviewed, package-root expectations, and any final residual gaps
17
+ - `P8 Final Readiness Decision` is the fast post-`P7` cross-surface reconciliation sweep; load `p8-readiness-reconciliation` and follow it as the source of truth for the final readiness note, readiness-category sweep, and required `agent-browser` functional verification
18
+ - `P8` must emit or record the readiness reconciliation note required by `p8-readiness-reconciliation`
19
+ - the `P8` readiness note should include a residual-risk reconciliation table with rows for each reported issue, recommendation, stale artifact, doc drift, or final gap; use statuses such as `resolved`, `accepted residual risk`, `stale/doc-only cleanup`, `artifact-lineage issue`, or `material blocker`
19
20
 
20
21
  ## Documentation and repo hygiene
21
22
 
@@ -31,6 +32,8 @@ Use this skill after development begins whenever you are reviewing work, decidin
31
32
  - require `./run_tests.sh` to run the full test suite of the delivered app rather than a smoke subset, no-op placeholder, or shortcut path
32
33
  - do not require the README to carry a full API catalog
33
34
  - require the README to include the strict audit sections when they are relevant to the project shape: project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`
35
+ - require the README to include quick-start seeded data for any app that needs non-empty data to exercise main flows, or the exact statement `No seeded data required; the app is useful from an empty state.`
36
+ - reject seeded data that is hidden, non-idempotent, disconnected from the normal runtime/bootstrap/database path, or used as static fake-success product behavior instead of real implementation
34
37
  - treat the README as the final public contract for runtime and broad-test behavior: if it documents a runtime command or a broad test command, the delivered output must satisfy that exact contract
35
38
  - do not allow the repo to depend on parent-root docs or sibling artifacts for startup, build/preview, configuration, evaluator traceability, or basic project understanding
36
39
  - require the delivered repo to be statically reviewable: README, scripts, entry points, routes, config, and test commands must be traceably consistent
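Reviewer note: the README rules in this hunk depend on two exact fallback statements, so a literal string check is enough to catch paraphrases. The sketch below is illustrative only and is not part of the package. The function name and the two boolean flags (a reviewer's judgment on whether the README already documents demo credentials and quick-start seeded data) are assumptions.

```python
# Exact statements quoted from the README rules in this hunk.
NO_AUTH_STATEMENT = "No authentication required"
NO_SEED_STATEMENT = "No seeded data required; the app is useful from an empty state."


def readme_disclosure_gaps(readme_text, documents_credentials, documents_seed_data):
    """Return the disclosures the README still lacks.

    Both flags are hypothetical reviewer inputs: whether the README already
    lists demo credentials for every role, and whether it already documents
    quick-start seeded data.
    """
    gaps = []
    # Without documented credentials, the exact no-auth statement is required.
    if not documents_credentials and NO_AUTH_STATEMENT not in readme_text:
        gaps.append("auth disclosure")
    # Without documented seed data, the exact empty-state statement is required.
    if not documents_seed_data and NO_SEED_STATEMENT not in readme_text:
        gaps.append("seeded data disclosure")
    return gaps
```

Matching the literal string rather than a looser pattern mirrors the rule's wording: a paraphrased statement counts as a gap.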
@@ -81,6 +84,7 @@ Use this skill after development begins whenever you are reviewing work, decidin
81
84
  - when backend or fullstack APIs exist, do not accept missing endpoint inventory or missing API-test mapping for the important `METHOD + PATH` surfaces
82
85
  - when backend or fullstack APIs exist, do not accept mocked or indirect tests being presented as equivalent to true no-mock HTTP endpoint coverage
83
86
  - do not accept a README that is missing project type, startup instructions, access method, verification method, or auth disclosure when the strict README audit would expect them
87
+ - do not accept a README that omits seeded quick-start data or the exact empty-state rationale when a user would otherwise start into a blank unusable app
84
88
  - do not accept final delivered docs or wrapper flows that still depend on `npm install`, `pip install`, `apt-get`, manual DB setup, or other host-only setup assumptions after development is complete
85
89
  - do not accept a repo that only becomes understandable by reading parent-root docs or sibling workflow artifacts
86
90
  - do not accept frontend-bearing work that lacks repo-local build/preview/config guidance when those commands or surfaces are material to the product
@@ -111,7 +115,7 @@ Use this skill after development begins whenever you are reviewing work, decidin
111
115
  ## Cadence rule
112
116
 
113
117
  - use targeted local verification as the default during early scaffold-step corrections inside development and in-development follow-up work
114
- - reserve owner-run local verification for the integrated verification gate in `P5`, and reserve owner-run `docker compose up --build` plus dockerized `./run_tests.sh` for the final runtime and broad-test confirmation in `P9`
118
+ - reserve owner-run local verification for the integrated verification gate in `P5`, reserve the narrow `P8` app launch for `agent-browser` functional verification, and reserve dockerized `./run_tests.sh` plus final broad Docker/runtime confirmation for `P9`
115
119
  - do not turn ordinary acceptance into repeated integrated-style gate runs
116
120
  - do not run `docker compose up --build` anywhere from planning through the end of `P7`
117
121
  - ordinary development and evaluation should rely on local verification, static review, evaluator sessions, and owner-side coherence checks only
@@ -140,7 +144,7 @@ Use this skill after development begins whenever you are reviewing work, decidin
140
144
  - do not ask the developer to run Docker runtime commands or dockerized `./run_tests.sh` during ordinary in-development follow-up work; do require the prepared local test harness, including its full readiness pass before major readiness claims, when that is the right correctness check
141
145
  - if the developer already ran the relevant targeted local test command and reported it clearly, do not rerun the same command on the owner side unless the evidence is weak, contradictory, flaky, high-risk, or needed to answer a new question
142
146
  - when the remaining gap is a small non-core issue such as docs cleanup, README sync, Docker config, wrapper/config glue, or light `./run_tests.sh` cleanup, the owner may fix it directly instead of bouncing it back to the developer
143
- - if the remaining gap requires editing actual test files or suites, or real product code outside narrow config or wrapper glue, route it back to the developer instead of fixing it in-owner
147
+ - if the remaining gap requires editing actual test files or suites, or real product code outside narrow config or wrapper glue, route it to the current developer lane instead of fixing it in-owner; in `P5`, that means the active P5 bugfix lane, not the completed `develop-*` lane
144
148
  - during planning review, if the remaining problem is a small contract, wording, structure, or owner-maintained-document issue in `../docs/design.md`, `../docs/api-spec.md`, or `plan.md`, fix it directly in the owner session instead of reopening planning
145
149
  - for ordinary in-development follow-up acceptance, default review scope to the changed files and the narrow supporting files named by the developer; expand only when a concrete inconsistency, missing dependency, or suspicious claim forces wider review
146
150
  - for ordinary in-development follow-up acceptance, prefer a narrow acceptance checklist over broad exploratory rereads
@@ -172,7 +176,7 @@ Use this skill after development begins whenever you are reviewing work, decidin
172
176
  - the workflow target is at most 2 broad owner-run verification moments across the whole cycle
173
177
  - ordinary planning, ordinary in-development follow-up acceptance, and routine in-development verification are not broad gates by default and should rely on targeted local verification unless the risk profile says otherwise
174
178
 
175
- From planning through the end of `P7`, do not run Docker-based verification.
179
+ From planning through the end of `P7`, do not run Docker-based verification. In `P8`, run Docker only when `p8-readiness-reconciliation` requires it to launch the app for `agent-browser` functional verification and no equivalent local runtime is available.
176
180
  Do not run Docker-based verification inside `P7`.
177
181
  The ordinary cadence is one owner-side local-harness gate in `P5`, plus the first real owner-side Docker/runtime and dockerized broad-test confirmation in `P9`.
178
182
 
@@ -189,7 +193,7 @@ Use evidence such as internal metadata files, structured Beads comments, verific
189
193
  - planning exit also requires parent-root `../docs/test-coverage.md` to be updated from the accepted planning contract enough that a reviewer can see the planned requirement/risk mapping there rather than only inside `plan.md`
190
194
  - planning exit also requires that the accepted plan covers the final README hard-gate shape and, when backend or fullstack APIs exist, the endpoint-inventory and API-test mapping strategy needed for the strict coverage audit
191
195
  - planning exit also requires an accepted scaffold step in `plan.md` that makes the initial bootstrap promptable without re-selecting the playbook at runtime and that locks Docker/runtime, `./run_tests.sh`, local testing harness and development tooling when applicable, and the early README structure
192
- - planning exit also requires security and test coverage execution contracts in `plan.md` that define the shared security foundation, per-surface test ownership, a confident roughly `90%` overall real-test coverage target, the applicable frontend/backend/API-surface/E2E obligations, and strong real-HTTP coverage expectations for resolved backend or fullstack API surfaces when they exist
196
+ - planning exit also requires security and test coverage execution contracts in `plan.md` that define the shared security foundation, per-surface test responsibility, at least `90%` unit-testable product-code coverage where measurable, at least `90%` closure of planned E2E/platform-critical flows, the applicable frontend/backend/API-surface/E2E obligations, and `100%` true no-mock HTTP coverage for documented prompt-relevant backend or fullstack API surfaces unless endpoint-level exceptions are recorded
193
197
  - planning exit also requires that the full prompt-relevant app surface is mapped to planned unit, API, integration, and E2E or platform-equivalent test ownership early enough that major surfaces are not left for later discovery
194
198
  - planning exit also requires an exact README contract in `plan.md` that locks the required README section structure, command strings, disclosures, and platform-specific guidance expected by the strict audits
195
199
  - planning exit also requires a `Delivery Review Requirements` section in `plan.md` that directly captures the applicable prompt-fit, static reviewability, runtime and broad-test, logging/validation/error-handling, backend/API, frontend/UX, end-to-end/platform, README, and coverage obligations, each mapped to planned repo evidence, planned verification evidence, and an owning main-lane or branch-worktree section
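Reviewer note: the revised coverage contract in this hunk replaces "roughly `90%`" with three separate thresholds. A minimal sketch of how they might be checked together follows. The `CoverageEvidence` record and its field names are hypothetical. Only the `90%` / `90%` / `100%` thresholds and the endpoint-level exception rule come from the changed line.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class CoverageEvidence:
    """Hypothetical evidence record; field names are illustrative."""
    unit_coverage_pct: Optional[float]   # None when coverage is not measurable
    e2e_flows_planned: int
    e2e_flows_closed: int
    api_surfaces: Set[str]               # documented "METHOD /path" surfaces
    no_mock_http_tested: Set[str]        # surfaces with true no-mock HTTP tests
    recorded_exceptions: Set[str] = field(default_factory=set)


def coverage_gaps(ev: CoverageEvidence) -> List[str]:
    gaps = []
    # At least 90% unit-testable product-code coverage, where measurable.
    if ev.unit_coverage_pct is not None and ev.unit_coverage_pct < 90.0:
        gaps.append("unit coverage below 90%")
    # At least 90% closure of planned E2E flows (integer math avoids float drift).
    if ev.e2e_flows_closed * 10 < ev.e2e_flows_planned * 9:
        gaps.append("E2E flow closure below 90%")
    # 100% no-mock HTTP coverage unless an endpoint-level exception is recorded.
    uncovered = ev.api_surfaces - ev.no_mock_http_tested - ev.recorded_exceptions
    if uncovered:
        gaps.append("no-mock HTTP coverage missing for: " + ", ".join(sorted(uncovered)))
    return gaps
```

Keeping the three checks separate reflects the change itself: a single blended percentage can hide an untested endpoint, while the per-surface set difference cannot.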
@@ -246,7 +250,7 @@ Use evidence such as internal metadata files, structured Beads comments, verific
246
250
  - if a required flow cannot be exercised through the intended UI surface, treat that as incomplete implementation rather than acceptable E2E coverage
247
251
  - the fused `P5` phase should not chase perfection; once local verification is green, the repo is roughly coherent and broadly correct against `plan.md` plus accepted `../docs/design.md`, and the required internal evaluation loop is resolved, stop and ask whether to proceed to evaluation
248
252
  - the fused `P5` phase may still fix small owner-fixable docs/config churn directly, but should not hold evaluation for nitpicks
249
- - during `P5`, treat owner-side direct edits as limited to docs, `README.md`, parent-root reference docs, Docker/runtime config, wrapper/config glue, and light `./run_tests.sh` script cleanup; actual test files and core code changes belong to the developer lane
253
+ - during `P5`, treat owner-side direct edits as limited to docs, `README.md`, parent-root reference docs, Docker/runtime config, wrapper/config glue, and light `./run_tests.sh` script cleanup; actual test files and core code changes belong to the active P5 bugfix lane because `develop-*` is done after accepted `P3`
250
254
  - before `P7`, do not hold back evaluation over documentation, polish, or extra owner-side analysis once the minimal `P5` test/coherence gate and required internal evaluation loop are satisfied
251
255
  - before `P7`, prefer traceable static evidence for security-bearing projects covering auth entry points, route authorization, object authorization, function-level authorization, admin/internal/debug protection, and tenant or user isolation when those dimensions apply, but let evaluation surface the remaining strict gaps when the repo is otherwise coherent
252
256
  - before `P7`, for non-trivial frontend work, prefer meaningful static frontend test evidence for major state transitions or failure paths rather than relying only on runtime screenshots or E2E confidence, but do not turn this into a pre-evaluation perfection gate
@@ -1,67 +1,91 @@
1
- # Clarification Prompt-Faithfulness Review
1
+ # Clarification Faithfulness Review Prompt
2
2
 
3
- You are a strict prompt-faithfulness reviewer.
3
+ You are a strict faithfulness reviewer.
4
4
 
5
- Your job is to compare:
5
+ Your job is to compare the clarification artifacts against the original product prompt and the final evaluation expectations, then report any drift, narrowing, or missing coverage.
6
6
 
7
- 1. the original prompt
8
- 2. `../.ai/requirements-breakdown.md`
9
- 3. `../docs/questions.md`
7
+ ## Inputs
10
8
 
11
- and determine whether the requirements breakdown plus clarifications are truly representative of the prompt.
9
+ You will receive:
10
+ 1. The original product prompt
11
+ 2. `../docs/questions.md`
12
+ 3. `../.ai/requirements-breakdown.md`
12
13
 
13
- You must:
14
- - identify every missing core requirement
15
- - identify every missing implied but binding requirement
16
- - identify every weakened or narrowed requirement
17
- - identify every clarification decision that drifts from the prompt
18
- - identify any over-interpretation that broadens the prompt beyond a slight prompt-faithful upgrade
19
- - identify any requirement that is present but still too shallow, under-explained, or missing the details that planning could easily underbuild later
20
- - identify any missing success-closure, failure/negative-condition, actor-boundary, or hidden-constraint detail that should have been extracted from the prompt
21
- - identify whether the planning-miss checklist is too weak to protect later design/planning from underbuilding subtle prompt details
22
- - identify any sections that are strong and should be preserved
14
+ ## Review Scope
23
15
 
24
- You must be strict.
25
- Do not be optimistic.
26
- Do not treat “roughly similar” as good enough.
16
+ Check the following dimensions systematically:
27
17
 
28
- Output only markdown for:
18
+ ### 1. Prompt Faithfulness
19
+ - Has any actor's actions been narrowed, removed, or reassigned?
20
+ - Has any required flow been shortened, made optional, or delegated to a different actor?
21
+ - Has any explicit constraint been weakened or reinterpreted?
22
+ - Are all prompt-required surfaces (pages, routes, APIs, jobs, reports, exports) still present?
29
23
 
30
- `../.ai/clarification-faithfulness-review.md`
24
+ ### 2. Requirement Registry Completeness
25
+ - Does `../.ai/requirements-breakdown.md` contain a requirement entry for every explicit prompt requirement?
26
+ - Does it contain entries for implied but binding requirements?
27
+ - Are locked defaults labeled with their implementation risk tier (`evaluation-critical`, `design-stabilizing`, `scope-expanding`)?
28
+ - Are there any orphan requirements: entries in the prompt with no corresponding requirement ID?
31
29
 
32
- Use this exact structure:
30
+ ### 3. Actor-Action Integrity
31
+ - For every actor in the prompt, list their granted actions
32
+ - Verify each action appears in the requirements breakdown or has an explicit non-applicability reason
33
+ - Flag any actor whose actions were reduced without explicit justification
34
+
35
+ ### 4. Evaluation Crosswalk
36
+ - Does the requirements breakdown cover all dimensions that the final evaluation will check?
37
+ - prompt alignment and delivery completeness
38
+ - static verifiability (README, config, routes, structure)
39
+ - API endpoint inventory and true no-mock HTTP strategy
40
+ - frontend pages, states, and interactions
41
+ - security boundaries (auth, authorization, isolation, admin protection)
42
+ - validation, error handling, logging, sensitive-data handling
43
+ - test coverage expectations (unit, API, integration, E2E)
44
+ - mock/demo/fake-success prevention
45
+ - Are there evaluation-critical requirements that are only vaguely represented?
46
+
47
+ ### 5. `questions.md` Format Compliance
48
+ - Does `../docs/questions.md` contain requirement IDs, traceability fields, priority fields, or evaluator-risk metadata?
49
+ - If yes, flag as format violation. Requirement IDs belong in `.ai/` artifacts only.
50
+ - Is the file clean, narrow, and decisive?
51
+
52
+ ## Output
53
+
54
+ Produce a review report with this exact structure:
33
55
 
34
56
  ```md
35
57
  # Clarification Faithfulness Review
36
58
 
37
59
  ## Verdict
38
- - [PASS | FAIL]
60
+ [PASS | PASS WITH MINOR NOTES | DRIFT DETECTED]
39
61
 
40
- ## Core Requirement Coverage Gaps
41
- - [gap]
62
+ ## Findings
42
63
 
43
- ## Missing Implied Requirements
44
- - [gap]
64
+ ### Finding N: [short title]
65
+ - Dimension: [prompt-faithfulness | registry-completeness | actor-action-integrity | evaluation-crosswalk | format-compliance]
66
+ - Severity: [blocker | high | medium | low]
67
+ - Evidence: [exact quote or reference from the artifact]
68
+ - Impact: [what would go wrong in later phases if this is not fixed]
69
+ - Suggested Fix: [concrete correction]
45
70
 
46
- ## Under-Specified Requirement Details
47
- - [detail]
71
+ ## No-Drift Confirmations
72
+ - [List of important prompt requirements that were correctly preserved]
48
73
 
49
- ## Drift / Narrowing Findings
50
- - [finding]
74
+ ## Residual Risks
75
+ - [Any ambiguous areas that are acceptable but should be watched in P2]
76
+ ```
51
77
 
52
- ## Acceptable Prompt-Faithful Upgrades
53
- - [upgrade]
78
+ ## Rules
54
79
 
55
- ## Strong Sections To Preserve
56
- - [section]
80
+ - Be strict. A minor drift in P1 becomes a major gap in P3.
81
+ - Do not approve requirements breakdowns that mix traceability metadata into `questions.md`.
82
+ - Do not approve actor-action narrowing unless the prompt explicitly supports it.
83
+ - Flag scope-expanding locked defaults that could strain implementation without adding evaluation value.
84
+ - Every finding must cite exact evidence from the artifacts.
85
+ - If no drift is found, state that explicitly and list the key no-drift confirmations.
57
86
 
58
- ## Exact Corrections Required
59
- - [correction]
60
- ```
87
+ ## Final Instruction
88
+
89
+ Produce the strongest possible faithfulness review.
61
90
 
62
- Rules:
63
- - every finding must be tied back to the original prompt
64
- - do not suggest implementation structure
65
- - do not suggest stack choices unless the prompt itself forces them
66
- - prefer exact corrections over broad commentary
67
- - if the documents are faithful, say so clearly
91
+ Your goal is to catch prompt drift before design begins, not to add process bloat.
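Reviewer note: dimension 5 of the new review prompt treats requirement IDs inside `../docs/questions.md` as a format violation. A rough sketch of that check follows. It assumes IDs follow the `REQ-###` shape that the clarifier prompt assigns elsewhere in this diff. The function name is hypothetical, and a real review would also look for traceability, priority, and evaluator-risk fields, which this sketch does not attempt.

```python
import re

# Assumed ID shape: "REQ-" followed by exactly three digits.
REQ_ID_PATTERN = re.compile(r"\bREQ-\d{3}\b")


def requirement_id_leaks(questions_text):
    """Return (line_number, requirement_id) pairs found in questions.md text."""
    leaks = []
    for line_number, line in enumerate(questions_text.splitlines(), start=1):
        for match in REQ_ID_PATTERN.finditer(line):
            leaks.append((line_number, match.group(0)))
    return leaks
```

Returning line numbers alongside the matched ID supports the prompt's rule that every finding must cite exact evidence from the artifact.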
@@ -13,7 +13,7 @@ You do not choose the stack unless the prompt itself contains a material contrad
13
13
  You do not write a design doc.
14
14
  You do not write `plan.md`.
15
15
 
16
- Your job is to extract the core requirements from the prompt, define them deeply enough for later design and planning, and then find every meaningful prompt ambiguity, missing rule, vague boundary, unclear workflow, incomplete actor expectation, hidden dependency, or unclear success condition that could cause the product objective to be misunderstood or built incorrectly later.
16
+ Your job is to extract the core requirements from the prompt, define them deeply enough for later design and planning, and then find every meaningful prompt ambiguity, missing rule, vague boundary, unclear lifecycle, incomplete actor expectation, hidden dependency, or unclear success condition that could cause the product objective to be misunderstood or built incorrectly later.
17
17
 
18
18
  You must resolve those ambiguities with the safest prompt-faithful decisions and write them into `questions.md`, while writing the deeper prompt-faithful requirements analysis into `../.ai/requirements-breakdown.md`.
19
19
 
@@ -35,12 +35,12 @@ The output must:
35
35
  ### 1. Preserve prompt faithfulness
36
36
  - Do not weaken, narrow, simplify, or reinterpret the prompt for convenience.
37
37
  - Do not introduce unauthorized `v1` reductions.
38
- - Do not silently drop implied actors, workflows, enforcement points, admin/operator behavior, or reporting expectations.
38
+ - Do not silently drop implied actors, lifecycle flows, enforcement points, admin/operator behavior, or reporting expectations.
39
39
  - When two readings are possible, choose the stricter prompt-faithful one.
40
40
 
41
41
  ### 2. Focus on material ambiguity only
42
42
  - Include only ambiguities that would materially improve later design or planning.
43
- - Focus on product behavior, actor behavior, workflow closure, lifecycle/state rules, business rules, security/privacy boundaries, data boundaries, offline/network assumptions, reporting/export meaning, and operational expectations when they affect product meaning.
43
+ - Focus on product behavior, actor behavior, lifecycle closure, lifecycle/state rules, business rules, security/privacy boundaries, data boundaries, offline/network assumptions, reporting/export meaning, and operational expectations when they affect product meaning.
44
44
  - Do not turn this into planning, stack selection, or implementation structure.
45
45
 
46
46
  ### 2.5 Extract the core requirements first
@@ -51,8 +51,8 @@ The output must:
51
51
  - Do not weaken or summarize them into vague labels.
52
52
  - After extracting and defining them, check them back against the original prompt and remove anything that narrows, broadens, or drifts from the prompt.
53
53
 
54
- ### 2.6 Use evaluation-grade requirement extraction
55
- Requirement extraction must be as strict and tedious as the final static evaluation prompt.
54
+ ### 2.6 Use strict requirement extraction
55
+ Requirement extraction must be strict and tedious enough for a critical static review.
56
56
 
57
57
  Before writing either artifact, extract and preserve:
58
58
  - core business goal and usage scenario
@@ -60,14 +60,15 @@ Before writing either artifact, extract and preserve:
60
60
  - required pages, screens, routes, APIs, jobs, modules, data objects, reports, exports, notifications, and integrations
61
61
  - main happy paths and task-closure conditions
62
62
  - required failure paths, validation failures, empty states, duplicate/re-entry behavior, cancellation, retry, rollback, and approval paths where relevant
63
- - security boundaries: authentication, route authorization, object authorization, function authorization, tenant/user isolation, admin/internal/debug protection, sensitive data, and audit/logging requirements
63
+ - security boundaries: authentication, route authorization, object authorization, function authorization, tenant/user isolation, admin/internal/debug protection, sensitive data, and accountability/logging requirements
64
64
  - engineering credibility requirements: coherent project shape, module decomposition, central config, logging, validation, error handling, maintainable service/adaptor boundaries, and no demo-only delivery
65
+ - static architecture credibility requirements: pages/routes/app shell/data flow must be connected where applicable, excessive single-file implementations and redundant/unnecessary files must be avoided, and pure frontend projects must keep mock/local-data boundaries disclosed without pretending they are backend integrations
65
66
  - documentation and static verifiability requirements: README clarity, startup/build/test/config guidance, entry points, scripts, routes, and repo structure being statically traceable
66
67
  - test and coverage expectations: unit, API, integration, frontend component/state, E2E/platform-equivalent, true no-mock HTTP where applicable, and coverage for core happy paths plus high-risk failure paths
67
68
  - mock/stub/fake/local-data boundaries: when they are allowed, when they must be disclosed, and when fake success would violate the product contract
68
69
  - frontend state and interaction obligations: loading, empty, submitting, disabled, success, error, validation, hover/click/current-state feedback, and task closure where applicable
69
70
  - FE↔BE integration expectations for fullstack/backend-backed frontend projects: every meaningful frontend action needs real backend support, and prompt-relevant backend features need frontend exposure unless truly internal/API-only
70
- - hidden delivery risks that final evaluation would catch: prompt drift, shell routes/pages/handlers, hardcoded fake behavior, missing owned tests, README drift, static entry-point inconsistency, weak security, and untraceable module boundaries
71
+ - hidden delivery risks that strict review would catch: prompt drift, shell routes/pages/handlers, hardcoded fake behavior, missing owned tests, README drift, static entry-point inconsistency, weak security, and untraceable module boundaries
71
72
 
72
73
  Every extracted requirement must be atomic enough to survive planning. Do not combine multiple product promises into one broad bullet if they could be implemented, tested, authorized, or documented separately. If a prompt phrase implies a user-visible behavior, a backend capability, a data lifecycle rule, a security boundary, a delivery/documentation obligation, or a test obligation, give it a traceable requirement entry or explicitly mark why it is not applicable.
73
74
 
@@ -80,7 +81,15 @@ Before finalizing, run a no-orphan requirement sweep:
80
81
  - every mock/fake/local-data allowance maps to a disclosure and a forbidden fake-success boundary
81
82
  - every high-risk unknown is either resolved by a prompt-faithful default or listed as a clarification item with a decisive `Solution`
82
83
 
83
- Do not treat this as a short summary. The requirements breakdown should be strong enough that a later evaluator's prompt-to-code review has little new product meaning to discover.
84
+ Additionally, run an explicit actor-action sweep:
85
+ - for every actor mentioned in the prompt, list every verb/action attached to that actor in the original prompt text
86
+ - compare that list against every locked default and clarification decision
87
+ - verify that no clarification has removed, narrowed, or reassigned any action that the prompt explicitly grants to an actor
88
+ - specifically guard against: removing public/consumer actions, narrowing creator rights to admin-only, or reassigning ownership tasks to operators
89
+ - if any actor-action pair from the prompt is missing from the requirement registry, add it or explicitly justify why it is not applicable
90
+ - this sweep prevents prompt drift where a clarifier accidentally narrows the product contract before design begins
91
+
92
+ Do not treat this as a short summary. The requirements breakdown should be strong enough that a later prompt-to-code review has little new product meaning to discover.
84
93
 
85
94
  ### 3. Resolve instead of punting
86
95
  - Every entry must end with a decisive `Solution`.
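Reviewer note: the actor-action sweep added in this hunk amounts to a set difference between the actor-action pairs in the prompt and the pairs covered by the requirement registry or explicitly justified as not applicable. A minimal sketch, assuming the pairs have already been extracted by hand; the function and parameter names are hypothetical.

```python
def missing_actor_actions(prompt_pairs, registry_pairs, justified_pairs=frozenset()):
    """Return prompt (actor, action) pairs that the registry neither covers nor justifies.

    Each argument is a set of (actor, action) tuples. Extracting them from
    the prompt text is a manual step that this sketch does not automate.
    """
    return sorted(set(prompt_pairs) - set(registry_pairs) - set(justified_pairs))


# Hypothetical example data, not taken from any real prompt.
prompt = {("creator", "publish listing"), ("visitor", "browse listings")}
registry = {("creator", "publish listing")}
print(missing_actor_actions(prompt, registry))
```

Any pair returned here would need either a new registry entry or an explicit non-applicability reason before design begins.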
@@ -90,10 +99,10 @@ Do not treat this as a short summary. The requirements breakdown should be stron
90
99
  ### 4. Run one real ambiguity sweep
91
100
  Before finalizing, explicitly check for ambiguity or hidden scope cuts around:
92
101
  - actors and role boundaries
93
- - workflow start, completion, failure, retry, cancellation, and approval paths
102
+ - lifecycle start, completion, failure, retry, cancellation, and approval paths
94
103
  - business rules, limits, uniqueness, precedence, ownership, and conflict handling
95
104
  - lifecycle/state transitions
96
- - security, permissions, isolation, masking, retention, and auditability
105
+ - security, permissions, isolation, masking, retention, and accountability
97
106
  - data visibility, history, edit authority, and cross-surface dependencies
98
107
  - reporting, export, reconciliation, or financial semantics when relevant
99
108
  - hidden environment and trust-boundary assumptions, especially on-prem, intranet, offline, LAN, browser access, auth cookies/tokens, local storage, self-contained deployment, external reachability, or secure/insecure transport when those can change product behavior
@@ -108,12 +117,13 @@ You must output only markdown into these 2 files:
108
117
 
109
118
  Do not include any preface, explanation, summary, commentary, or planning notes outside the file content.
110
119
 
111
- `../.ai/requirements-breakdown.md` must contain a deep prompt-faithful requirements analysis using this exact structure:
120
+ `../.ai/requirements-breakdown.md` must contain a deep prompt-faithful requirements analysis using this exact structure. Assign stable `REQ-###` IDs before design begins. Every requirement that later design or planning may need to map must appear in the registry, and every later requirement entry must carry its `Requirement ID` field.
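The registry rule in this added line can be checked mechanically. The following is a rough sketch, not something the package ships: `check_requirement_ids` is a hypothetical helper, and it assumes IDs are exactly `REQ-` plus three digits and that registry rows carry the ID in their first table cell, as the template's `REQ-001` row suggests.

```python
import re
from typing import List, Set, Tuple

# Assumption: IDs look like REQ-001, i.e. exactly three digits.
REQ_ID = re.compile(r"REQ-\d{3}")


def check_requirement_ids(markdown: str) -> Tuple[Set[str], List[str]]:
    """Collect registry IDs, then report entries whose ID is absent or unregistered."""
    lines = markdown.splitlines()

    # Pass 1: registry rows are table lines whose first cell is a requirement ID.
    registry: Set[str] = set()
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("|"):
            first_cell = stripped.strip("|").split("|")[0].strip()
            if REQ_ID.fullmatch(first_cell):
                registry.add(first_cell)

    # Pass 2: every "- Requirement ID:" field must name a registered ID.
    problems: List[str] = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith("- Requirement ID:"):
            value = stripped.split(":", 1)[1].strip()
            if not REQ_ID.fullmatch(value):
                problems.append(f"line {number}: missing or malformed ID")
            elif value not in registry:
                problems.append(f"line {number}: {value} not in registry")
    return registry, problems


sample = """\
## Core Business Goal
- Requirement ID: REQ-001

## Requirement ID Registry
| Requirement ID | Requirement title |
|---|---|
| REQ-001 | Core goal |

## Explicit Prompt Requirements
### Export
- Requirement ID: REQ-002
### Import
- Requirement ID:
"""

registry, problems = check_requirement_ids(sample)
for problem in problems:
    print(problem)
```

Two passes are used because the template places the registry table after the first section that already carries a `Requirement ID` field.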
112
121
 
113
122
  ```md
114
123
  # Requirements Breakdown
115
124
 
116
125
  ## Core Business Goal
126
+ - Requirement ID:
117
127
  - Requirement:
118
128
  - Definition:
119
129
  - Prompt Basis:
@@ -122,6 +132,12 @@ Do not include any preface, explanation, summary, commentary, or planning notes
122
132
  - Failure / Negative Conditions:
123
133
  - Hidden Planning Risk:
124
134
 
135
+ ## Requirement ID Registry
136
+
137
+ | Requirement ID | Requirement title | Type: explicit / implied / safe default | Prompt basis | Actor / surface | Success closure | Failure / negative conditions | Hidden planning risk |
138
+ |---|---|---|---|---|---|---|---|
139
+ | REQ-001 | | explicit | | | | | |
140
+
125
141
  ## Evaluation-Grade Requirement Inventory
126
142
  ### Business / Prompt Fit
127
143
  - Core business objective:
@@ -139,6 +155,7 @@ Do not include any preface, explanation, summary, commentary, or planning notes
139
155
  - Required module decomposition:
140
156
  - Required service/adaptor/data boundaries:
141
157
  - Config/logging/validation/error-handling expectations:
158
+ - Static structure expectations, including connected pages/routes/state/data flow, avoiding excessive single-file implementation, avoiding redundant/unnecessary files, and separating pure-frontend mock/local data from real backend claims:
142
159
  - Static reviewability expectations:
143
160
 
144
161
  ### Security / Privacy / Authorization
@@ -146,7 +163,7 @@ Do not include any preface, explanation, summary, commentary, or planning notes
146
163
  - Route/object/function authorization expectations:
147
164
  - Tenant/user isolation expectations:
148
165
  - Admin/internal/debug protection expectations:
149
- - Sensitive data, logging, audit, retention, or masking expectations:
166
+ - Sensitive data, logging, accountability, retention, or masking expectations:
150
167
 
151
168
  ### Test / Coverage Expectations
152
169
  - Core happy path proof required:
@@ -181,6 +198,7 @@ Do not include any preface, explanation, summary, commentary, or planning notes
181
198
 
182
199
  ## Explicit Prompt Requirements
183
200
  ### <short requirement title>
201
+ - Requirement ID:
184
202
  - Requirement:
185
203
  - Definition:
186
204
  - Prompt Basis:
@@ -191,6 +209,7 @@ Do not include any preface, explanation, summary, commentary, or planning notes
191
209
 
192
210
  ## Implied But Binding Requirements
193
211
  ### <short requirement title>
212
+ - Requirement ID:
194
213
  - Requirement:
195
214
  - Definition:
196
215
  - Prompt Basis:
@@ -201,6 +220,7 @@ Do not include any preface, explanation, summary, commentary, or planning notes
201
220
 
202
221
  ## Locked Safe Defaults And Assumptions
203
222
  ### <short item title>
223
+ - Requirement ID:
204
224
  - Requirement:
205
225
  - Definition:
206
226
  - Prompt Basis:
@@ -208,6 +228,10 @@ Do not include any preface, explanation, summary, commentary, or planning notes
208
228
  - Success Closure:
209
229
  - Failure / Negative Conditions:
210
230
  - Why This Default Is Safe:
231
+ - Implementation Risk Tier: [evaluation-critical / design-stabilizing / scope-expanding]
232
+ - `evaluation-critical`: the evaluator will likely check this; missing it risks audit failure
233
+ - `design-stabilizing`: reduces later ambiguity but is not directly scored
234
+ - `scope-expanding`: adds implementation work beyond the minimum prompt requirement; flag these so P2/P3 can prioritize or challenge them
211
235
  - Hidden Planning Risk:
212
236
 
213
237
  ## Planning-Miss Checklist
@@ -223,19 +247,9 @@ Do not include any preface, explanation, summary, commentary, or planning notes
223
247
  - hidden constraints or implied non-goals that planning could miss:
224
248
  ```
225
249
 
226
- `../docs/questions.md` must start with this exact section:
250
+ `../docs/questions.md` must contain only clarification entries in the exact format below. Do not add requirement IDs, traceability fields, priority fields, evaluator-risk fields, core requirements baseline sections, or any other extra metadata to `../docs/questions.md`. Core requirements belong in `../.ai/requirements-breakdown.md` only. `../docs/questions.md` must stay a clean clarification artifact.
227
251
 
228
- ```md
229
- ## Core Requirements Baseline
230
-
231
- ### <short requirement title>
232
- - Requirement: <the core requirement stated directly>
233
- - Definition: <what this requirement means in depth, including the important boundaries, actors, behaviors, constraints, and success conditions that must stay true later>
234
- - Prompt Basis: <the exact prompt-grounded reason this requirement is part of the contract>
235
- - Hidden Planning Risk: <what later design/planning could miss or weaken if this requirement is not carried forward explicitly>
236
- ```
237
-
238
- After that section, use this exact structure for every clarification entry in `../docs/questions.md`:
252
+ Use this exact structure for every entry in `../docs/questions.md`:
239
253
 
240
254
  ```md
241
255
  ### <number>. <short clarification title>
@@ -249,29 +263,21 @@ After that section, use this exact structure for every clarification entry in `.
249
263
  Use this exact style:
250
264
 
251
265
  ```md
252
- ## Core Requirements Baseline
253
-
254
- ### Clarified Prompt Contract Baseline
255
- - Requirement: The accepted core requirements and clarification decisions in this file define the product contract for later design and execution planning.
256
- - Definition: Design and execution planning must preserve the core requirements and the accepted clarification decisions captured here. They may operationalize them, but they may not silently narrow, soften, or replace them.
257
- - Prompt Basis: The original prompt is the primary source of truth, and this file exists to extract its core requirements and resolve the ambiguities that would otherwise force later guesswork.
258
- - Hidden Planning Risk: If this baseline is not carried forward explicitly, later design or planning can quietly weaken the prompt, drop implied constraints, or underbuild important task-closure behavior.
259
-
260
- ### 1. Clarification Baseline for Design and Planning
261
- - Question: Can the clarification decisions captured in this file be treated as the baseline for design and execution planning?
262
- - My Understanding: The prompt was large enough that design and execution planning needed one accepted clarification record. We needed to lock that baseline before planning rather than carrying ambiguity forward.
263
- - Solution: Yes. Treat the clarification decisions in this file as the accepted baseline for Phase 1 design and Phase 2 execution planning.
264
-
265
- ### 2. <short clarification title>
266
+ ### 1. <short clarification title>
266
267
  - Question: <the exact ambiguity or contradiction that needed to be resolved>
267
268
  - My Understanding: <how the prompt was interpreted, why the ambiguity mattered, and what risk it created for design and planning>
268
269
  - Solution: <the chosen prompt-faithful resolution or safe default>
270
+
271
+ ### 2. <short clarification title>
272
+ - Question: <the exact ambiguity or missing detail that needed to be locked>
273
+ - My Understanding: <how the prompt was interpreted, why this was ambiguous, and why it matters for later design and planning>
274
+ - Solution: <the chosen prompt-faithful resolution or safe default, written decisively>
269
275
  ```
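The "clarification entries only" rule for `../docs/questions.md` lends itself to a small lint pass. This is a hedged sketch rather than tooling from the package: `lint_questions` is a hypothetical name, and it assumes each entry consists of exactly the numbered heading plus the `Question`, `My Understanding`, and `Solution` fields shown in the style example above.

```python
import re
from typing import List

HEADING = re.compile(r"^### (\d+)\. \S.*$")
# Assumption: these three fields, in this order, are the whole entry.
FIELDS = ["Question", "My Understanding", "Solution"]


def lint_questions(markdown: str) -> List[str]:
    """Flag anything that is not a numbered clarification entry with its three fields."""
    problems: List[str] = []
    expected_number = 1
    pending: List[str] = []  # fields the current entry still owes
    for line_no, raw in enumerate(markdown.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            if pending:
                problems.append(f"line {line_no}: previous entry is missing {pending}")
            if int(match.group(1)) != expected_number:
                problems.append(f"line {line_no}: expected entry {expected_number}")
            expected_number += 1
            pending = list(FIELDS)
            continue
        if pending and line.startswith(f"- {pending[0]}:"):
            pending.pop(0)
            continue
        problems.append(f"line {line_no}: unexpected content: {line}")
    if pending:
        problems.append(f"end of file: last entry is missing {pending}")
    return problems


clean = """\
### 1. Session timeout
- Question: How long may an idle session live?
- My Understanding: The prompt names no limit.
- Solution: Expire idle sessions after 30 minutes.
"""

polluted = """\
## Core Requirements Baseline

### 1. Session timeout
- Requirement ID: REQ-001
- Question: How long may an idle session live?
- My Understanding: The prompt names no limit.
- Solution: Expire idle sessions after 30 minutes.
"""

print(lint_questions(clean))     # []
print(len(lint_questions(polluted)))  # 2
```

The polluted sample trips on both the removed baseline heading and the extra `Requirement ID` field, which are the two kinds of metadata the new wording forbids in this file.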
270
276
 
271
277
  ## Output discipline
272
278
 
273
279
  - Cover every material ambiguity you can justify.
274
- - Cover the core requirements explicitly before the clarification entries.
280
+ - Put all core requirements in `../.ai/requirements-breakdown.md`, not in `../docs/questions.md`.
275
281
  - Extract prompt details strongly enough that later planning is not likely to miss edge conditions, operator/admin expectations, failure behavior, or implicit constraints hiding inside broad prompt wording.
276
282
  - Explicitly separate what the prompt states directly from what is implied but still binding.
277
283
  - Finish with a planning-miss checklist strong enough that later design and planning are less likely to miss subtle prompt details.
@@ -279,7 +285,7 @@ Use this exact style:
279
285
  - Every entry must be planning-relevant.
280
286
  - Every `Solution` must be decisive.
281
287
  - Large prompts will often need many entries, but unusually explicit prompts may need fewer.
282
- - Keep the file narrow and explicit; this is not a general project summary, but it must contain a strong core-requirements baseline plus the necessary clarifications.
288
+ - Keep the file narrow and explicit; this is not a general project summary, and it must not contain a core-requirements baseline. Core requirements belong in `../.ai/requirements-breakdown.md` only. `../docs/questions.md` contains only the necessary clarifications.
283
289
  - The separate `../.ai/requirements-breakdown.md` file should be the deeper analysis artifact: in-depth, requirement-focused, and as prompt-faithful as possible.
284
290
 
285
291
  ## Inputs you will receive
@@ -217,7 +217,7 @@ Expected result:
217
217
  If `init_db.sh` is part of the standard test bootstrap, document that relationship clearly.
218
218
 
219
219
  ### Local verification harness
220
- - Document the separate local verification command(s) used for ordinary development and owner-side pre-evaluation checks.
220
+ - Document the separate local verification command(s) used for ordinary development and readiness checks.
221
221
  - Make clear that these local verification commands are distinct from the dockerized `./run_tests.sh` broad test path.
222
222
  - Use the real stack-native local suite for the chosen language/framework where applicable, for example Vitest, Jest, PHPUnit, pytest, go test, cargo test, or another framework-native equivalent.
223
223
  - If that local suite needs machine-level installation or setup, document that clearly in the local verification notes.
@@ -236,7 +236,7 @@ If `init_db.sh` is part of the standard test bootstrap, document that relationsh
236
236
 
237
237
  ### Test notes
238
238
  - `./run_tests.sh` is the dockerized broad test path reserved for the final containerized confirmation flow.
239
- - Local verification commands are used for ordinary development iteration and pre-evaluation owner checks.
239
+ - Local verification commands are used for ordinary development iteration and readiness checks.
240
240
  - [Docker-contained notes if applicable]
241
241
  - [seed/fixture notes if applicable]
242
242
  - [known test constraints if any]
@@ -270,12 +270,38 @@ Use that exact line if the project truly has no authentication requirement.
270
270
 
271
271
  ---
272
272
 
273
- ## 11. Workflow / Operational Notes
273
+ ## 11. Quick-Start Seeded Data
274
274
 
275
- ### Main workflows
276
- - [workflow 1]
277
- - [workflow 2]
278
- - [workflow 3]
275
+ Choose exactly one of the two sections below.
276
+
277
+ ### Option A — Seeded data exists
278
+
279
+ The local runtime creates deterministic demo/test data through the normal bootstrap path.
280
+
281
+ | Data type | Value / identifier | Purpose | How to use it |
282
+ |---|---|---|---|
283
+ | Account / role | [email or username] | [role / flow] | [login or action steps] |
284
+ | Sample record | [record ID/name/URL] | [flow it unlocks] | [where to open/click/call] |
285
+
286
+ Important:
287
+ - Seeded data must be idempotent and recreated by the documented startup/database path.
288
+ - Seeded data is for local demonstration and testing only.
289
+ - Seeded data must not replace real product behavior with static fake-success paths.
290
+
291
+ ### Option B — No seeded data required
292
+
293
+ No seeded data required; the app is useful from an empty state.
294
+
295
+ Use that exact line only if a new user can exercise the main flows without preloaded records or accounts.
296
+
297
+ ---
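The idempotency rule for seeded data in Option A can be illustrated with a bootstrap step that is safe to run on every startup. This is a minimal sketch under assumptions: it uses SQLite through Python's standard `sqlite3` module, and the `accounts` table, the e-mail addresses, and `seed_demo_data` are hypothetical names, not part of the template.

```python
import sqlite3

# Hypothetical demo accounts; a real project would list its own seeded rows.
DEMO_ACCOUNTS = [
    ("admin@example.test", "admin"),
    ("member@example.test", "member"),
]


def seed_demo_data(conn: sqlite3.Connection) -> None:
    """Create the demo rows if absent; running this twice changes nothing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts ("
        "email TEXT PRIMARY KEY, role TEXT NOT NULL)"
    )
    # INSERT OR IGNORE skips rows whose primary key already exists.
    conn.executemany(
        "INSERT OR IGNORE INTO accounts (email, role) VALUES (?, ?)",
        DEMO_ACCOUNTS,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
seed_demo_data(conn)
seed_demo_data(conn)  # second run simulates a restart of the bootstrap path
count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(count)  # 2
```

Keying the rows on a stable identifier is what makes the repeat run a no-op. The seeded rows still go through the same tables as real data, so they do not stand in for product behavior.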
298
+
299
+ ## 12. Workflow / Operational Notes
300
+
301
+ ### Main lifecycle flows
302
+ - [lifecycle flow 1]
303
+ - [lifecycle flow 2]
304
+ - [lifecycle flow 3]
279
305
 
280
306
  ### Security / access notes
281
307
  - [public pages/endpoints]
@@ -289,12 +315,12 @@ Use that exact line if the project truly has no authentication requirement.
289
315
  - [jobs / queues / workers if applicable]
290
316
 
291
317
  ### Operational notes
292
- - [backup / retention / auditability / support notes if applicable]
293
- - [important admin/operator workflow notes if applicable]
318
+ - [backup / retention / accountability / support notes if applicable]
319
+ - [important admin/operator lifecycle notes if applicable]
294
320
 
295
321
  ---
296
322
 
297
- ## 12. Feature-Flag / Mock / Debug / Demo Disclosure
323
+ ## 13. Feature-Flag / Mock / Debug / Demo Disclosure
298
324
 
299
325
  This section is mandatory whenever feature flags, mock data, local JSON, interception, fake/demo behavior, or debug surfaces exist.
300
326
 
@@ -309,7 +335,7 @@ If none of the above exist, say so explicitly.
309
335
 
310
336
  ---
311
337
 
312
- ## 13. Important Limitations / Non-Goals
338
+ ## 14. Important Limitations / Non-Goals
313
339
 
314
340
  - [known limitation or boundary 1]
315
341
  - [known limitation or boundary 2]
@@ -330,6 +356,7 @@ Use this section for transparent disclosure, not for hiding missing core require
330
356
  - [ ] Docker-contained environment rules are clear
331
357
  - [ ] `.env` and `.env.example` are not required or referenced as committed repo artifacts
332
358
  - [ ] Auth credentials for all roles are present, or exact line `No authentication required` is present
359
+ - [ ] Seeded quick-start data is documented, or exact line `No seeded data required; the app is useful from an empty state.` is present
333
360
  - [ ] Architecture summary is present
334
361
  - [ ] Workflow / operational notes are present when relevant
335
362
  - [ ] Feature-flag/mock/debug/demo disclosure is present if applicable
@@ -57,7 +57,7 @@ Do not accept “close enough” on prompt-faithfulness, security, runtime hones
57
57
  - [ ] Exact runtime commands, broad test commands, wrapper-script mechanics, and README section contracts are not being pushed into `../docs/design.md`.
58
58
 
59
59
  ### A5. Test Strategy and Coverage Contract
60
- - [ ] The plan expresses a confident roughly `90%` overall real-test coverage target.
60
+ - [ ] The plan requires at least `90%` unit-testable product-code coverage where measurable and at least `90%` closure of planned E2E/platform-critical flow rows.
61
61
  - [ ] The measurement path and confidence expectations are named.
62
62
  - [ ] Every meaningful surface family has required test layers.
63
63
  - [ ] Frontend unit/component/state testing is explicit when the project is `web` or `fullstack`.
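The revised A5 item replaces one rough target with two separate thresholds, which can be read as two independent ratios. The sketch below is one possible reading, not the package's measurement path: `coverage_gate` and its parameters are hypothetical, and treating an unmeasurable ratio (zero total) as "not applicable" rather than as a failure is an assumption drawn from the phrase "where measurable".

```python
from typing import Dict, Optional


def meets(done: int, total: int, threshold_percent: int) -> Optional[bool]:
    """Return None when nothing is measurable, else whether the ratio meets the bar."""
    if total == 0:
        return None
    # Integer comparison avoids floating-point edge cases at exactly 90%.
    return done * 100 >= threshold_percent * total


def coverage_gate(covered_lines: int, measurable_lines: int,
                  closed_flow_rows: int, planned_flow_rows: int,
                  threshold_percent: int = 90) -> Dict[str, Optional[bool]]:
    """Evaluate unit coverage and E2E/platform flow-row closure separately."""
    unit_ok = meets(covered_lines, measurable_lines, threshold_percent)
    flow_ok = meets(closed_flow_rows, planned_flow_rows, threshold_percent)
    return {
        "unit_ok": unit_ok,
        "flow_ok": flow_ok,
        # The gate fails only on a measured ratio that falls short.
        "passed": unit_ok is not False and flow_ok is not False,
    }


# Hypothetical numbers: 92% unit coverage, but only 17 of 20 flow rows closed.
result = coverage_gate(1840, 2000, 17, 20)
print(result)  # {'unit_ok': True, 'flow_ok': False, 'passed': False}
```

Keeping the two checks separate means strong unit coverage cannot mask unclosed E2E or platform-critical flow rows, which appears to be the point of splitting the old single target.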
@@ -148,7 +148,7 @@ Do not accept “close enough” on prompt-faithfulness, security, runtime hones
148
148
  - [ ] Signed links/tokens, public routes, audit durability, encryption, and privileged action re-checks are planned fail-closed where relevant.
149
149
 
150
150
  ### B5. Test Coverage Execution Contract
151
- - [ ] The plan keeps a confident roughly `90%` overall real-test coverage target.
151
+ - [ ] The plan keeps at least `90%` unit-testable product-code coverage where measurable and at least `90%` closure of planned E2E/platform-critical flow rows.
152
152
  - [ ] Every meaningful planned surface/work package is mapped to tests.
153
153
  - [ ] Every prompt-relevant module owns or explicitly inherits unit/API/integration/E2E/frontend-state proof as applicable.
154
154
  - [ ] Modules that own APIs have concrete full API coverage expectations, preferably true no-mock HTTP by exact `METHOD + PATH` unless a narrow exception is accepted.