theslopmachine 0.6.2 → 0.7.0
- package/MANUAL.md +21 -6
- package/README.md +55 -7
- package/RELEASE.md +15 -0
- package/assets/agents/developer.md +41 -1
- package/assets/agents/slopmachine-claude.md +100 -60
- package/assets/agents/slopmachine.md +40 -17
- package/assets/claude/agents/developer.md +42 -5
- package/assets/skills/clarification-gate/SKILL.md +25 -5
- package/assets/skills/claude-worker-management/SKILL.md +280 -57
- package/assets/skills/developer-session-lifecycle/SKILL.md +81 -37
- package/assets/skills/development-guidance/SKILL.md +21 -1
- package/assets/skills/evaluation-triage/SKILL.md +32 -23
- package/assets/skills/final-evaluation-orchestration/SKILL.md +86 -50
- package/assets/skills/hardening-gate/SKILL.md +17 -3
- package/assets/skills/integrated-verification/SKILL.md +3 -3
- package/assets/skills/planning-gate/SKILL.md +32 -3
- package/assets/skills/planning-guidance/SKILL.md +72 -13
- package/assets/skills/retrospective-analysis/SKILL.md +2 -2
- package/assets/skills/scaffold-guidance/SKILL.md +129 -124
- package/assets/skills/submission-packaging/SKILL.md +33 -27
- package/assets/skills/verification-gates/SKILL.md +44 -14
- package/assets/slopmachine/backend-evaluation-prompt.md +1 -1
- package/assets/slopmachine/frontend-evaluation-prompt.md +5 -5
- package/assets/slopmachine/scaffold-playbooks/android-kotlin-compose.md +81 -0
- package/assets/slopmachine/scaffold-playbooks/android-kotlin-views.md +191 -0
- package/assets/slopmachine/scaffold-playbooks/android-native-java.md +203 -0
- package/assets/slopmachine/scaffold-playbooks/angular-default.md +181 -0
- package/assets/slopmachine/scaffold-playbooks/backend-baseline.md +142 -0
- package/assets/slopmachine/scaffold-playbooks/backend-family-matrix.md +80 -0
- package/assets/slopmachine/scaffold-playbooks/database-module-matrix.md +80 -0
- package/assets/slopmachine/scaffold-playbooks/django-default.md +166 -0
- package/assets/slopmachine/scaffold-playbooks/docker-baseline.md +189 -0
- package/assets/slopmachine/scaffold-playbooks/docker-shared-contract.md +334 -0
- package/assets/slopmachine/scaffold-playbooks/electron-vite-default.md +124 -0
- package/assets/slopmachine/scaffold-playbooks/expo-react-native-default.md +73 -0
- package/assets/slopmachine/scaffold-playbooks/fastapi-default.md +134 -0
- package/assets/slopmachine/scaffold-playbooks/frontend-baseline.md +160 -0
- package/assets/slopmachine/scaffold-playbooks/frontend-family-matrix.md +134 -0
- package/assets/slopmachine/scaffold-playbooks/generic-unknown-tech-guide.md +136 -0
- package/assets/slopmachine/scaffold-playbooks/go-chi-default.md +160 -0
- package/assets/slopmachine/scaffold-playbooks/ios-linux-portable.md +93 -0
- package/assets/slopmachine/scaffold-playbooks/ios-native-objective-c.md +151 -0
- package/assets/slopmachine/scaffold-playbooks/ios-native-swift.md +188 -0
- package/assets/slopmachine/scaffold-playbooks/laravel-default.md +216 -0
- package/assets/slopmachine/scaffold-playbooks/livewire-default.md +265 -0
- package/assets/slopmachine/scaffold-playbooks/overlay-module-matrix.md +130 -0
- package/assets/slopmachine/scaffold-playbooks/platform-family-matrix.md +79 -0
- package/assets/slopmachine/scaffold-playbooks/selection-matrix.md +72 -0
- package/assets/slopmachine/scaffold-playbooks/spring-boot-default.md +182 -0
- package/assets/slopmachine/scaffold-playbooks/tauri-default.md +80 -0
- package/assets/slopmachine/scaffold-playbooks/vue-vite-default.md +162 -0
- package/assets/slopmachine/scaffold-playbooks/web-default.md +96 -0
- package/assets/slopmachine/templates/AGENTS.md +41 -3
- package/assets/slopmachine/templates/CLAUDE.md +111 -0
- package/assets/slopmachine/utils/claude_create_session.mjs +1 -0
- package/assets/slopmachine/utils/claude_live_channel.mjs +188 -0
- package/assets/slopmachine/utils/claude_live_common.mjs +406 -0
- package/assets/slopmachine/utils/claude_live_hook.py +47 -0
- package/assets/slopmachine/utils/claude_live_launch.mjs +181 -0
- package/assets/slopmachine/utils/claude_live_status.mjs +25 -0
- package/assets/slopmachine/utils/claude_live_stop.mjs +45 -0
- package/assets/slopmachine/utils/claude_live_turn.mjs +250 -0
- package/assets/slopmachine/utils/claude_resume_session.mjs +1 -0
- package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.mjs +23 -0
- package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.sh +5 -0
- package/assets/slopmachine/utils/claude_worker_common.mjs +224 -4
- package/assets/slopmachine/utils/cleanup_delivery_artifacts.py +4 -0
- package/assets/slopmachine/utils/export_ai_session.mjs +1 -1
- package/assets/slopmachine/utils/normalize_claude_session.py +153 -0
- package/assets/slopmachine/utils/package_claude_session.mjs +96 -0
- package/assets/slopmachine/utils/prepare_strict_audit_workspace.mjs +65 -0
- package/package.json +1 -1
- package/src/constants.js +42 -3
- package/src/init.js +173 -28
- package/src/install.js +75 -0
- package/src/send-data.js +56 -57
@@ -37,9 +37,9 @@ Once a failure class is known:
 - verify that tests are real and effective checks of actual code logic rather than bypass-style or fake-confidence test paths
 - for web fullstack work, run Playwright coverage for major flows and review screenshots for real UI behavior and regressions
 - for mobile and desktop work, run the selected stack's platform-appropriate UI/E2E coverage for major flows and review screenshots or equivalent artifacts for real UI behavior and regressions
-- for Electron or other Linux-targetable desktop work, use the Dockerized desktop build/test path
-- for Android work, use the Dockerized Android build/test path without requiring an emulator
-- for iOS-targeted work on Linux,
+- for Electron or other Linux-targetable desktop work, use `docker compose up --build` plus the Dockerized desktop build/test path and headless UI/runtime verification through Xvfb or an equivalent Linux-capable harness
+- for Android work, use `docker compose up --build` plus the Dockerized Android build/test path without requiring an emulator
+- for iOS-targeted work on Linux, use `docker compose up --build` plus `./run_tests.sh`, portable test evidence, and static review evidence honestly; do not claim native iOS runtime verification unless a real macOS/Xcode checkpoint exists
 - end-to-end coverage must use the real intended user-facing or admin-facing surfaces for the flow; if the flow cannot be exercised that way, treat the missing surface as incomplete work
 - verify important failure, conflict, stale-state, negative-auth, and cross-user-isolation paths where relevant
 - verify 401, 403, 404, conflict or duplicate-submission, object-authorization, tenant or user-isolation, and sensitive-log-exposure paths where those risks exist
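The negative-path bullets in the hunk above (401, 403, 404, object authorization) lend themselves to a table-driven check. The following is only a minimal sketch: the ownership table, the `read_document_status` function, and the decision order (401, then 404, then 403) are hypothetical choices made for illustration and do not come from the package.

```python
from typing import Optional

# Hypothetical ownership table standing in for real persistence.
DOCUMENT_OWNERS = {"doc-1": "alice", "doc-2": "bob"}


def read_document_status(user: Optional[str], doc_id: str) -> int:
    """Return the HTTP status a read request should produce in this sketch."""
    if user is None:
        return 401  # unauthenticated caller
    owner = DOCUMENT_OWNERS.get(doc_id)
    if owner is None:
        return 404  # object does not exist
    if owner != user:
        return 403  # authenticated, but not the owner (object authorization)
    return 200


# Each case is (user, document id, expected status).
NEGATIVE_CASES = [
    (None, "doc-1", 401),
    ("alice", "missing", 404),
    ("alice", "doc-2", 403),
    ("alice", "doc-1", 200),
]

# Collect every case whose actual status differs from the expectation.
failures = [
    case for case in NEGATIVE_CASES
    if read_document_status(case[0], case[1]) != case[2]
]
```

In a real project the same table would drive requests through the actual HTTP layer rather than a local function, which is what the hunk asks for.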
@@ -37,11 +37,19 @@ If the owner notices a concrete role, contract, or scope mismatch, planning does
 
 - the developer should produce the first in-depth technical plan
 - once accepted, the plan should be detailed and section-addressable enough that later owner prompts can stay short and point the developer back to the relevant accepted section instead of re-dumping the implementation contract
+- reject planning that stays at a high level when the project obviously needs deeper domain rules, lifecycle rules, permission rules, or verification detail to avoid later ambiguity
 - do not create deep execution sub-items before the technical plan is accepted
 - do not accept planning that reduces, weakens, narrows, or silently reinterprets the original prompt
 - do not accept convenience-based narrowing, including unauthorized `v1` simplifications, deferred workflows, reduced actor/role models, weaker enforcement, or omitted operator/admin surfaces
 - declare prompt-critical planning acceptance criteria before accepting the first planning pass when those criteria are already visible from the prompt
 - require relevant cross-cutting system contracts to be explicitly planned rather than left to per-module invention
+- when the project has multiple major modules or workstreams, require the plan to distinguish shared prerequisites from the 2 or 3 main work packages that can later proceed in parallel, including where the merge points and no-parallel boundaries really are
+
+## Owner planning-demand rule
+
+- when correcting or evaluating planning, reference the specific plan sections that matter and state an explicit planning-exit checklist rather than a generic request for `more detail`
+- the planning-exit checklist should say exactly which sections must be deeper, what kind of specificity is missing, and what later-phase risk that missing detail would create
+- demand enough planning detail that scaffold and development can mostly execute by following the accepted plan instead of inventing critical structure later
 
 ## Unauthorized narrowing rule
 
@@ -69,10 +77,19 @@ Examples of rejection-worthy narrowing include:
 - when `../docs/test-coverage.md` is relevant, require it to be structured as explicit requirement or risk mappings rather than generic narrative
 - review `../docs/test-coverage.md` only after `README.md` and `../docs/design.md` establish the claimed scope; then use the coverage doc to verify prompt-critical risks concretely instead of rereading unrelated planning docs
 - require the accepted plan to cover system overview, architecture reasoning, major modules or chunks, domain model, data model where relevant, interface contracts, failure paths, state transitions, logging strategy, testing strategy, README implications, and Docker execution assumptions when those dimensions apply
+- require the accepted plan to have section-addressable coverage for authoritative tech stack summary, product overview, explicit out-of-scope items, actors and roles, actor success paths, authoritative business rules, permissions, validation, security/compliance expectations, non-functional requirements, implementation checkpoints, definition of done, and deliverables when those dimensions matter
+- require the accepted plan to define the final README contract when the post-bugfix README audit will apply, including project-type declaration, startup instructions, access method, verification method, and demo-credentials or explicit no-auth disclosure
+- require the accepted plan to explain how fast local iteration may be used during development without leaking local-only setup assumptions into the final delivered Docker-contained runtime and test contract
+- reject planning that uses placeholder language such as `TBD`, `later`, `as needed`, `standard CRUD`, `normal auth`, or similarly vague stand-ins where concrete implementation guidance is expected
+- reject planning that leaves large sections effectively empty or only restates the prompt without making forward-looking engineering decisions
 
 ## Cross-cutting planning requirements
 
 - require shared lifecycle and state models to be aligned across planning artifacts when the product has meaningful workflow state
+- require actor-to-surface coverage to be explicit enough that each important persona has a real path to success and is not accidentally dropped in implementation
+- require authoritative business-rule coverage when the prompt implies formulas, thresholds, limits, conflict rules, uniqueness, retries, reversals, or ownership constraints
+- for backend or fullstack APIs, require an endpoint-inventory and API-test strategy that distinguishes true no-mock HTTP coverage from HTTP-with-mocking and unit-only or indirect coverage
+- for backend or fullstack APIs, require the plan to call out how important endpoints will be exercised through the real HTTP layer and how important modules not yet tested will be made visible in `../docs/test-coverage.md`
 - require explicit cross-cutting system contracts when relevant, especially:
 - error normalization and user-visible error behavior
 - audit/logging and redaction patterns
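The placeholder-language rule added in the hunk above could be approximated by a simple text scan. This is a rough sketch, not part of the package: the phrase list is taken from the added line, while the function name and the case-insensitive word-boundary matching are assumptions. A real gate would still need human judgment, since a word like `later` also appears in legitimate sentences.

```python
import re

# Phrases quoted in the added planning-gate rule; the matching strategy is an assumption.
PLACEHOLDER_PHRASES = ["TBD", "later", "as needed", "standard CRUD", "normal auth"]


def find_placeholders(plan_text: str) -> list[tuple[int, str]]:
    """Return (line_number, phrase) pairs for every placeholder phrase found."""
    hits = []
    for lineno, line in enumerate(plan_text.splitlines(), start=1):
        for phrase in PLACEHOLDER_PHRASES:
            # Word boundaries avoid matching inside longer words.
            if re.search(rf"\b{re.escape(phrase)}\b", line, flags=re.IGNORECASE):
                hits.append((lineno, phrase))
    return hits


sample_plan = "## Auth\nUse normal auth for now.\n## Billing\nRefund rules: TBD\n"
sample_hits = find_placeholders(sample_plan)
```

Each hit gives a reviewer a line number to cite, which fits the owner rule of referencing specific plan sections instead of asking generically for more detail.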
@@ -81,7 +98,7 @@ Examples of rejection-worthy narrowing include:
 - auth/session edge cases such as expiry, refresh, or clock skew tolerance
 - when the prompt says behavior is configurable, require the real configuration surface, permissions, operator flow, and backend support to be planned explicitly
 - when a feature must be admin-manageable or operator-manageable, require a real usable UI surface for that management flow, not just API endpoints or data-model notes
-- for web projects, require
+- for web projects, require `docker compose up --build` as the runtime contract
 - for Dockerized web projects, require a concrete dev-only runtime bootstrap script plan so `docker compose up --build` works without user exports or `.env`
 - do not accept Dockerized web planning that depends on manual `export ...` steps, checked-in env files, or hardcoded runtime values for startup
 - do not accept Dockerized web planning where `./run_tests.sh` uses a different secret/bootstrap model than `docker compose up --build`
@@ -95,6 +112,7 @@ Examples of rejection-worthy narrowing include:
 - do not accept planning that lets a mock-only or local-data-only project look like undisclosed real integration delivery
 - do not accept planning that hides missing failure handling behind fake-success branches
 - when the project has meaningful auth or access control, require a static security-boundary inventory in planning artifacts covering auth entry points, route authorization, object authorization, function-level authorization, admin/internal/debug surfaces, and tenant or user isolation rules when applicable
+- for Android, mobile, desktop, and iOS-targeted projects, require planning for a meaningful `docker compose up --build` command plus containerized `./run_tests.sh` even when platform-specific runtime proof differs from web semantics
 - require README disclosure planning for feature flags, debug or demo surfaces, default enabled states, and mock or interception defaults whenever they exist
 - require traceability planning for build, preview, configuration, app entry points, route registration, module boundaries, and test entry points through `README.md` plus external references rather than additional in-repo docs
 - require logging and validation contracts to be planned concretely enough for static review through code, `README.md`, and external docs when needed
@@ -111,7 +129,12 @@ Examples of rejection-worthy narrowing include:
 - scope is still prompt-faithful
 - the plan has explicitly addressed prompt-fit risks and requirement drift
 - no unauthorized convenience-based narrowing or `v1` simplification has been introduced
+- explicit out-of-scope items are documented tightly enough to prevent speculative overbuilding without shrinking the prompt
+- actor and role coverage is explicit enough that each main persona has a real path to success
 - major user-facing flows are mapped to backend support and verification targets
+- authoritative business rules, defaults, limits, transitions, and ownership rules are explicit enough that implementation will not invent them ad hoc later
+- unresolved items are narrow, explicit, and few enough that they will not force broad replanning during scaffold or early implementation
+- security, compliance, reliability, reporting, and non-functional requirements are explicit where the prompt or product shape makes them material
 - security-critical areas are planned early enough that they will not be left to accidental late cleanup
 - test sufficiency has been considered at the level of core happy path, major failure paths, security-critical paths, and obvious high-risk boundaries
 - the plan explicitly defines module-level responsibilities, flows, boundaries, and completion tests before implementation
@@ -122,7 +145,10 @@ Examples of rejection-worthy narrowing include:
 - backend or fullstack plans explicitly cover 401, 403, 404, conflict or duplicate submission when relevant, object-level authorization, tenant or user isolation, and sensitive-log exposure in the coverage plan
 - frontend-bearing plans explicitly cover the required state model for major flows, including loading, empty, submitting, disabled, success, error, and duplicate-action protection where relevant
 - frontend-bearing plans explicitly include component, page or route integration, and E2E coverage where applicable; non-trivial frontend plans explicitly include component, page, route, or state-focused test coverage where UI state complexity is meaningful rather than relying only on E2E or runtime confidence
-- the coverage plan is strong enough to
+- the coverage plan names a concrete measurement path and is strong enough to achieve and prove a minimum 90 percent test coverage threshold for the delivered behavior surface
+- when backend or fullstack APIs exist, the coverage plan includes a resolved endpoint inventory, API test mapping strategy, and explicit mock-classification strategy rather than only high-level risk prose
+- the README plan is specific enough to satisfy the strict README audit without contradicting the real canonical runtime and test contract
+- the plan includes a cleanup step that removes local-iteration dependency traces before development closes and before hardening judges the final Docker-contained delivery
 - major engineering quality has been addressed through maintainable boundaries, clear decomposition, and shared contracts
 - frontend route, page, component, and state boundaries are planned when the UI is material
 - configurable behaviors are concretely planned where the prompt requires configurability
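The endpoint inventory and mock-classification requirement added in the hunk above could be modelled roughly as follows. All names here (`CoverageKind`, `EndpointEntry`, `endpoints_without_real_http_coverage`, and the sample routes) are hypothetical; the package only describes the three coverage classes in prose, and the "not yet tested" class is added in this sketch so gaps stay visible.

```python
from dataclasses import dataclass
from enum import Enum


class CoverageKind(Enum):
    """Coverage classes named in the diff, plus an explicit untested marker (assumption)."""
    HTTP_NO_MOCK = "true no-mock HTTP"
    HTTP_WITH_MOCKS = "HTTP with mocking"
    UNIT_OR_INDIRECT = "unit-only or indirect"
    UNTESTED = "not yet tested"


@dataclass(frozen=True)
class EndpointEntry:
    method: str
    path: str
    coverage: CoverageKind


def endpoints_without_real_http_coverage(inventory: list[EndpointEntry]) -> list[str]:
    """List endpoints that are not exercised through the real HTTP layer without mocks."""
    return [
        f"{entry.method} {entry.path} ({entry.coverage.value})"
        for entry in inventory
        if entry.coverage is not CoverageKind.HTTP_NO_MOCK
    ]


# Hypothetical inventory for illustration only.
inventory = [
    EndpointEntry("POST", "/login", CoverageKind.HTTP_NO_MOCK),
    EndpointEntry("GET", "/orders", CoverageKind.HTTP_WITH_MOCKS),
    EndpointEntry("DELETE", "/orders/{id}", CoverageKind.UNTESTED),
]
gaps = endpoints_without_real_http_coverage(inventory)
```

A list like `gaps` is the kind of content that could be surfaced in `../docs/test-coverage.md`. It is separate from the 90 percent threshold, which the diff applies to the delivered behavior surface and not to endpoint counts.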
@@ -133,9 +159,12 @@ Examples of rejection-worthy narrowing include:
 - static review readiness is explicitly planned, including how a fresh reviewer can trace entry points, routes, config, test commands, and any mock or local-data boundaries from `README.md` plus the code inside the repo, while the owner maintains fuller external references under `../docs/`
 - static security-boundary readiness is explicitly planned in docs or code structure where applicable
 - the repo remains self-sufficient with `README.md` as its only documentation file; external docs under `../docs/` may exist as references, but the repo must not depend on them
-- web projects
+- web projects require Docker runtime planning
 - relevant cross-cutting system contracts are explicitly defined rather than left to per-module invention
+- implementation checkpoints and a hard definition of done are explicit enough to block fake-complete or scaffold-only acceptance
+- the plan has done a real look-ahead sweep across scaffold, implementation, integrated verification, hardening, evaluation, and packaging concerns instead of treating those as future rediscovery work
 - each major module has a clear integration contract with existing modules and shared patterns
+- when the project is large enough to benefit from it, the plan makes dependency order and safe parallel work packages explicit enough that execution can parallelize independent items without colliding on unstable shared foundations
 - verification plans include cross-module seam checks, not just isolated feature tests
 - visible mismatches are corrected or explicitly dispositioned
 - planning comments and artifacts reflect current policy truth
@@ -34,6 +34,7 @@ The goal is to reduce late audit failures by designing for these concerns up fro
 - planning should make the delivered repo statically reviewable by a fresh reviewer through `README.md`, entry points, config shape, tests, and visible module boundaries rather than depending on runtime tribal knowledge
 - keep `README.md` as the only normal documentation file inside the repo
 - do not create or rely on additional documentation files inside `repo/` beyond `README.md` unless the user explicitly asks for them
+- if explicit assumptions or dispositions must be recorded, keep them in the owner-maintained external planning docs rather than creating a repo-local `ASSUMPTIONS.md` by default
 - keep the owner-maintained external doc set under `../docs/` current when relevant, especially:
 - `../docs/design.md`
 - `../docs/api-spec.md`
@@ -61,6 +62,8 @@ Selected-stack defaults:
 - start from the actual project prompt and build the plan from there
 - carry the settled project requirements forward consistently as you plan
 - make the accepted plan durable enough to serve as the primary execution contract for later scaffold and development prompts instead of forcing the owner to restate the same implementation context repeatedly
+- prefer over-specifying important implementation details in planning rather than deferring them to later invention during coding
+- treat the planning package as an execution spec, not a sketch; almost every later-critical decision should be made now unless there is a strong reason it truly cannot be
 - identify the hard non-negotiable requirements early and do not quietly trade them away for implementation convenience
 - explicitly check that the plan still fits the business goal, main flows, and implicit constraints from the prompt
 - when planning technical items that depend on a library, framework, API, or tool, check Context7 documentation first for authoritative usage details
@@ -71,7 +74,34 @@ Selected-stack defaults:
 - make the planning explicit enough that the owner can maintain external design notes and API/spec docs accurately when relevant
 - keep the spec focused on required behavior rather than turning it into a progress or completion narrative
 - make the plan include system overview, architecture choice and reasoning, major modules or chunks, domain model, data model where relevant, interface contracts, failure paths, state transitions, logging strategy, testing strategy, README implications, and Docker execution assumptions when those dimensions apply
+- make the plan explicitly account for the final post-bugfix coverage and README audit contract so hardening is not surprised later
+- identify shared prerequisites and the 2 or 3 biggest work packages that could later proceed in parallel once those prerequisites are settled when the project is large enough for that distinction to matter
+- define which planned work must stay serial because of shared contracts or overlapping files, and which work can safely branch in parallel with a clear merge point
+- make parallel work packages explicit enough that later owner prompts can ask for parallel execution without re-inventing the branch boundaries
+- make the accepted planning package explicitly section-addressable and execution-grade, with clear headings for at least:
+- authoritative tech stack summary
+- product overview
+- in-scope domains or modules
+- explicit out-of-scope items
+- actors and roles
+- actor-specific path-to-success summaries for the main workflows
+- authoritative business rules
+- state machines or lifecycle rules when workflow state matters
+- permissions and authorization model
+- validation rules
+- security, compliance, and data-governance requirements
+- offline, queueing, reliability, and background-job behavior when relevant
+- reporting, analytics, search, indexing, import, or export behavior when relevant
+- non-functional requirements
+- implementation phases and checkpoints
+- definition of done
+- concrete deliverables
+- explicit assumptions or dispositions when safe defaults had to be locked
+- keep unresolved items rare; if something really cannot be decided yet, isolate it in a small explicit unresolved-items section with the reason it is still open and what evidence or decision is needed to close it
+- do not leave major module boundaries, API shapes, business rules, state transitions, security boundaries, or verification criteria as vague future implementation work
+- use tables, bullet lists, and explicit subsections so the plan is dense, skimmable, and hard to misread
 - keep the primary planning package concentrated in parent-root `../docs/design.md`
+- make `../docs/design.md` the authoritative detailed plan, not a high-level narrative summary
 - organize the accepted plan so later slices can reference concrete sections cleanly instead of requiring the owner to rewrite the plan in follow-up prompts
 - put the risk-to-test matrix in parent-root `../docs/test-coverage.md`
 - when prompt-critical API/interface details need a dedicated document, keep them in parent-root `../docs/api-spec.md`
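The heading list added in the hunk above suggests a mechanical completeness check on the design document. The sketch below assumes the plan is Markdown and that each heading contains the section name as a substring; the subset of required sections, the function name, and the matching rule are all illustrative assumptions and not something the package ships.

```python
import re

# A subset of the section names listed in the added lines; exact heading text is assumed.
REQUIRED_SECTIONS = [
    "authoritative tech stack summary",
    "product overview",
    "explicit out-of-scope items",
    "actors and roles",
    "authoritative business rules",
    "definition of done",
]

# Matches ATX headings such as "## Product overview" and captures the title text.
HEADING_PATTERN = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_sections(markdown_text: str) -> list[str]:
    """Return the required section names that no heading in the plan mentions."""
    headings = [m.group(1).lower() for m in HEADING_PATTERN.finditer(markdown_text)]
    return [
        section for section in REQUIRED_SECTIONS
        if not any(section in heading for heading in headings)
    ]


sample_design = "# Design\n## Product overview\n## Actors and roles\nBody text.\n"
sample_missing = missing_sections(sample_design)
```

A check like this only proves that headings exist. The gate's other rules, such as rejecting sections that merely restate the prompt, still need a reviewer.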
@@ -92,18 +122,24 @@ Selected-stack defaults:
 - plan disclosure of feature flags, debug or demo surfaces, default enabled states, and mock or interception defaults in `README.md` and owner-maintained external docs whenever they exist
 - do not plan fake-success paths that hide missing failure handling
 - define failure paths, permissions, validation, logging, runtime assumptions, and test strategy before coding
+- define authoritative business rules before coding, including defaults, limits, conflicts, uniqueness, reversal or cancellation behavior, retry rules, and ownership rules when they matter
 - for frontend-bearing work, plan each prompt-critical flow with an explicit state model covering loading, empty, submitting, disabled, success, error, and duplicate-action or re-entry protection states where relevant
 - define logging contracts early, including categories, levels, redaction expectations, and what must never be logged
+- for backend or fullstack work, define a central config module or equivalent single source of truth for runtime configuration instead of scattering direct environment reads through business logic
+- for backend or fullstack work, define centralized logging expectations strongly enough that route or request outcomes, exceptions, and background failures can be understood without leaking sensitive data
 - define validation contracts early, including request validation, form validation, boundary validation, and normalized user-facing error behavior when relevant
 - for complex security, offline, sync, authorization, or data-governance features, define what `done` means across all prompt-promised dimensions rather than stopping at a partial foundation or hook layer
 - define shared lifecycle and state models when the product has meaningful workflow state, and keep those models aligned across design notes and API/spec notes
 - require cross-document consistency so design, API/spec, and test-planning artifacts do not drift on lifecycle/state models, flow coverage, permissions, or operational behavior
+- define implementation dependency and parallelism expectations early enough that scaffold and development do not accidentally serialize independent work or parallelize unstable shared foundations
 - define logging and observability expectations for both frontend and backend
 - define operator visibility and operator workflow expectations when the prompt implies admin, operational, audit, backup, or support responsibilities
 - when the system has meaningful cross-cutting behavior, define shared implementation contracts early rather than leaving each module to invent its own pattern
 - define error-handling contracts when relevant, including normalization patterns for user-visible errors and backend error-shape expectations
 - define audit contracts when relevant, including centralized helper or service expectations and redaction rules
+- when third-party services are mentioned but real live integration is not clearly required for delivery proof, define explicit adaptor or stub boundaries rather than leaving the integration strategy ambiguous
 - define permission contracts when relevant so navigation visibility, route guards, and API enforcement stay aligned
+- define actor and role contracts explicitly, including which personas exist, which ones need real surfaces, and what successful end-to-end completion looks like for each main actor path
 - define state-lifecycle contracts when relevant, including context-switch or tenant-switch cleanup expectations
 - define auth edge-case expectations when relevant, such as token refresh, session expiry, or clock-skew tolerance
 - call out operational obligations early when they are prompt-critical, such as scheduling, retention, backups, workers, auditability, or offline behavior
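The central config module requested in the hunk above might look something like the following. This is a sketch under stated assumptions: the setting names (`DATABASE_URL`, `LOG_LEVEL`, `SESSION_TTL_SECONDS`), the `AppConfig` type, and the defaults are invented for illustration. The only point taken from the diff is that environment reads happen in one place and not inside business logic.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    """Single source of truth for runtime configuration (hypothetical fields)."""
    database_url: str
    log_level: str
    session_ttl_seconds: int


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Read every setting once, failing fast when a required value is absent."""
    required = ("DATABASE_URL",)
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return AppConfig(
        database_url=env["DATABASE_URL"],
        log_level=env.get("LOG_LEVEL", "INFO"),
        session_ttl_seconds=int(env.get("SESSION_TTL_SECONDS", "3600")),
    )


# Passing an explicit mapping keeps the example independent of the host environment.
example_config = load_config({"DATABASE_URL": "postgres://db/app"})
```

Business logic would then receive an `AppConfig` instance as a parameter, which also makes tests independent of the process environment.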
@@ -114,35 +150,37 @@ Selected-stack defaults:
 - start `./init_db.sh` during scaffold with the real database setup already known, then keep expanding it as migrations, schema setup, bootstrap data, and other database dependencies become real through implementation
 - when the project has database dependencies, plan to inject database setup through initialization scripts rather than packaging local database dependency artifacts or environment-specific database state
 - define the project-standard runtime contract and the universal broad test entrypoint `./run_tests.sh` early, and keep both compatible with the selected stack
-- for web projects,
-- for web projects, default the primary runtime command to `docker compose up --build`
+- for web projects, require `docker compose up --build` as the runtime contract
 - for Dockerized web projects, plan a dev-only runtime bootstrap script that is invoked by the Docker startup path so `docker compose up --build` works without user-side exports or `.env` files
 - for Dockerized web projects, plan runtime value generation or injection through that dev-only bootstrap path instead of hardcoded repo values
 - for Dockerized web projects, require `./run_tests.sh` to use the same bootstrap path or an equivalent path with the same generated-value rules rather than a separate pre-seeded secret model
 - for Dockerized web projects, do not allow pre-seeded secret literals in Compose files, config files, Dockerfiles, or startup scripts even if comments describe them as local-only, test-only, or non-production
 - for Dockerized web projects, if runtime values must persist across restarts, plan Docker-managed runtime state rather than committed repo files
 - for Dockerized web projects, plan README disclosure that the bootstrap path is local-development-only behavior and not the production secret-management path
--
-- for mobile, desktop,
+- for Android, mobile, desktop, and iOS-targeted projects, also require a meaningful `docker compose up --build` command that starts a containerized build, artifact, preview, or support environment even when native runtime proof differs from web semantics
+- for Android, mobile, desktop, and iOS-targeted projects, keep `./run_tests.sh` containerized as the broad verification path
+- for non-web projects, `./run_app.sh` may still exist as a platform helper, but it does not replace the required Docker contract
 - `./run_tests.sh` must exist for every project as the platform-independent broad test wrapper
 - `./run_tests.sh` must be able to run on a clean Linux VM that only has Docker and curl available by default
 - do not require host package managers, host language runtimes, or host test tooling for the broad test path unless the stack absolutely forces it and the exception is explicitly justified
 - `./run_tests.sh` must prepare or install anything required inside its own controlled execution path when that setup is needed for a clean environment
-- for web projects
+- for web projects, `./run_tests.sh` must run the full test path through Docker rather than a purely local test invocation
 - when host-level setup would otherwise be required, prefer a Dockerized `./run_tests.sh` path even outside traditional web stacks so the broad verification remains portable
-- for non-web
+- for non-web projects, `./run_tests.sh` must call the selected stack's Dockerized or platform-equivalent full test path while keeping the same single-command interface
|
|
134
170
|
- local tests should still exist for ordinary developer iteration, but `./run_tests.sh` is the broad final test path for the project
|
|
135
171
|
- for Electron or other Linux-targetable desktop projects, plan a Dockerized broad path that covers build, tests, packaging smoke checks, and headless UI/runtime verification through Xvfb or an equivalent Linux-capable desktop harness
|
|
136
172
|
- for Android projects, plan a Dockerized broad path that covers Gradle build, lint, unit tests, and local Android JVM-side tests such as Robolectric without depending on an emulator
|
|
137
|
-
- for iOS-targeted projects, keep `./run_tests.sh` as the portable Linux verification wrapper for lint, typecheck, shared logic tests, JS/UI-level tests when applicable, and static config/build-shape validation, but do not pretend it is native iOS runtime proof
|
|
173
|
+
- for iOS-targeted projects, keep `./run_tests.sh` as the containerized portable Linux verification wrapper for lint, typecheck, shared logic tests, JS/UI-level tests when applicable, and static config/build-shape validation, but do not pretend it is native iOS runtime proof
|
|
138
174
|
- if true native iOS build or runtime evidence is prompt-critical, call out that it requires a separate macOS/Xcode owner checkpoint rather than trying to fake equivalence on Linux
|
|
139
|
-
- for web projects
|
|
140
|
-
- for web projects
|
|
175
|
+
- for web projects, plan collision-resistant Compose defaults from the start: unique `COMPOSE_PROJECT_NAME`, no unnecessary `container_name`, only the app-facing port exposed to host by default, and internal services kept off host ports unless required
|
|
176
|
+
- for web projects, prefer random host-port binding on `127.0.0.1` for the default runtime so parallel projects can start cleanly; if a fixed host port is genuinely required, plan an override plus a free-port fallback in the runtime or test wrapper
|
|
141
177
|
- define frontend validation and accessibility expectations when the product surface materially depends on them, including keyboard, focus, feedback, and other user-interaction quality requirements where relevant
|
|
142
178
|
- if backup or recovery behavior is prompt-critical, plan the designated media, operator drill flow, visibility, and verification expectations explicitly
|
|
143
179
|
- if the prompt names literal storage, indexing, partitioning, retention, or performance dimensions, represent them literally in the planning artifacts rather than abstracting them away
|
|
144
|
-
- for web frontend work, unless the prompt, existing repository, or established stack clearly dictates otherwise, default to Tailwind CSS for styling
|
|
145
|
-
-
|
|
180
|
+
- for web frontend work, unless the prompt, existing repository, or established stack clearly dictates otherwise, default to Tailwind CSS for styling
|
|
181
|
+
- when the selected frontend ecosystem supports `shadcn/ui` or an equivalent well-documented port cleanly, prefer that for component primitives
|
|
182
|
+
- otherwise use a mainstream documented component system appropriate to the chosen stack, such as Material UI, Ant Design, Ant Design Vue, or Angular Material
|
|
183
|
+
- if the existing project already uses a different UI system, preserve and extend that system instead of forcing the default CSS/component choices into it
|
|
146
184
|
- when the prompt leaves the stack or starter open, explicitly choose the default stack, starter, and bootstrap command during planning instead of leaving scaffold to improvise them ad hoc
|
|
147
185
|
- prefer official or clearly de facto standard starters and bootstrap commands when they fit the prompt, because they usually reduce setup waste and improve baseline quality
|
|
148
186
|
- when multiple credible defaults exist, prefer the one with the strongest ecosystem support, best current maintenance posture, easiest Docker/test/E2E integration, and least friction for prompt-faithful delivery
|
|
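The hunk above asks for a fixed-port override plus a free-port fallback in the runtime or test wrapper when a fixed host port is genuinely required. One way that fallback could be sketched is shown below; `pick_host_port` is a hypothetical helper name, and the package itself does not prescribe this implementation.

```python
import socket
from typing import Optional


def pick_host_port(preferred: Optional[int] = None, host: str = "127.0.0.1") -> int:
    """Return `preferred` if it can be bound on `host`, else an OS-assigned free port."""
    if preferred is not None:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, preferred))
                return preferred
            except OSError:
                pass  # port is taken or not permitted; fall through to the fallback
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))  # port 0 asks the OS for any free port
        return probe.getsockname()[1]


if __name__ == "__main__":
    # 8080 is an arbitrary example of a preferred port.
    print(pick_host_port(8080))
```

The probe socket is closed before the port is handed on, so another process can still claim the port in between. A wrapper built on this idea would need to tolerate that race, for example by retrying.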
@@ -151,24 +189,45 @@ Selected-stack defaults:
|
|
|
151
189
|
- for mobile work, unless the prompt or existing repository clearly dictates otherwise, default to Expo + React Native + TypeScript
|
|
152
190
|
- for desktop work, unless the prompt or existing repository clearly dictates otherwise, default to Electron + Vite + TypeScript
|
|
153
191
|
- define end-to-end coverage for major user flows before coding
|
|
192
|
+
- define phase checkpoints and definition-of-done gates strongly enough that a coding model cannot confuse partial infrastructure with completed product behavior
|
|
193
|
+
- do an explicit look-ahead sweep across scaffold, implementation, integrated verification, hardening, evaluation, and packaging so later-phase needs are not rediscovered too late
|
|
154
194
|
- define enough test coverage up front to catch major issues later, especially core happy path, important failure paths, security-critical paths, and obvious high-risk boundaries
|
|
155
|
-
- enforce a plan to
|
|
195
|
+
- enforce a concrete plan to achieve a minimum 90 percent test coverage threshold, including the exact measurement path, reporting command, and failing threshold for the selected stack when practical
|
|
196
|
+
- do not leave coverage as a qualitative aspiration; planning must state how the project will prove and maintain the minimum 90 percent threshold
|
|
197
|
+
- for backend or fullstack projects, plan an endpoint-by-endpoint API audit story: resolved `METHOD + PATH` inventory, expected HTTP coverage, true no-mock HTTP coverage, and which tests are only mocked or indirect
|
|
198
|
+
- for backend or fullstack projects, plan core API tests so the important endpoints are exercised through the real HTTP layer rather than controller or service bypasses
|
|
199
|
+
- when mocked HTTP tests or unit-only coverage still exist, plan to classify them explicitly instead of overstating them as equivalent to true no-mock API coverage
|
|
200
|
+
- plan audit-readable API test evidence: the test suite and `../docs/test-coverage.md` should make the endpoint, request input, and response assertions easy to trace statically
|
|
201
|
+
- plan a module-family test summary that can call out important modules not yet tested, especially controllers, services, repositories, auth, guards, and middleware when they exist
|
|
156
202
|
- require API tests to exercise real API endpoints and real call flows rather than bypassing the endpoint layer with internal helper-only checks
|
|
157
203
|
- when API tests are material, plan for them to print simple useful response evidence such as status codes and message/body summaries so verification output is easy to inspect
|
|
158
204
|
- plan endpoint coverage so prompt-required functions and dependent multi-step API flows are actually exercised, not just isolated happy-path fragments
|
|
159
205
|
- plan `../docs/test-coverage.md` in evaluator-facing shape rather than loose prose: requirement or risk point, mapped test file(s), key assertion(s) or fixtures, coverage status, major gap, and minimum test addition
|
|
160
206
|
- do not satisfy `../docs/test-coverage.md` with generic test categories alone; make the matrix concrete enough that the owner can review prompt-critical risks without reconstructing the test story manually
|
|
207
|
+
- when backend or fullstack APIs exist, make `../docs/test-coverage.md` carry both the requirement/risk matrix and an endpoint inventory plus API test mapping table
|
|
208
|
+
- when backend or fullstack APIs exist, make `../docs/test-coverage.md` distinguish true no-mock HTTP tests from HTTP-with-mocking and unit-only or indirect coverage
|
|
161
209
|
- when multiple prompt-critical domains exist, group the matrix by domain or risk cluster so each section names the requirement, planned test location, key assertions, current status, and remaining gap explicitly
|
|
162
210
|
- for backend or fullstack projects, explicitly plan coverage for 401, 403, 404, conflicts or duplicate submission when relevant, object-level authorization, tenant or user isolation, sensitive-log exposure, and pagination/filter/sort when those behaviors exist
|
|
163
211
|
- for frontend-bearing projects, explicitly plan a layered frontend test story when UI state or routing is material: unit, component, page or route integration, and E2E where applicable
|
|
164
212
|
- for non-trivial frontend projects, explicitly plan a frontend test layer beyond runtime-only confidence: component, page, route, or state-focused tests when UI state complexity is meaningful
|
|
165
|
-
- for web fullstack work, explicitly plan Playwright coverage for the synchronized frontend/backend flows when end-to-end testing is applicable
|
|
213
|
+
- for web fullstack work, explicitly plan Playwright coverage for the synchronized frontend/backend flows when end-to-end testing is applicable, but treat Playwright as a real verified dependency rather than a decorative default
|
|
166
214
|
- for mobile work, plan Jest plus React Native Testing Library as the local default test layer and add a platform-appropriate mobile UI/E2E tool when real device-flow proof is needed
|
|
167
215
|
- for desktop work, plan a local desktop test runner plus Playwright Electron support or another platform-appropriate desktop UI/E2E tool when real window-flow proof is needed
|
|
168
216
|
- for Android work, do not rely on an emulator as the default broad verification contract
|
|
169
217
|
- for iOS work on Linux, plan code and portable-test evaluation honestly and treat native simulator/runtime proof as out-of-band unless a macOS checkpoint is explicitly available
|
|
170
218
|
- when UI-bearing flows are material, explicitly plan screenshot review or equivalent platform artifacts as part of UI verification so correctness is checked, not just command success
|
|
171
219
|
- define verification strategy, selected-stack runtime expectations, and documentation implications before coding
|
|
220
|
+
- plan the final README contract explicitly enough to satisfy both the normal delivery docs and the strict post-bugfix README audit:
|
|
221
|
+
- project type declared near the top using one of `backend`, `fullstack`, `web`, `android`, `ios`, or `desktop`
|
|
222
|
+
- startup instructions
|
|
223
|
+
- access method
|
|
224
|
+
- verification method
|
|
225
|
+
- demo credentials for every role when auth exists, or the exact statement `No authentication required`
|
|
226
|
+
- tech stack clarity and architecture explanation
|
|
227
|
+
- workflow and security or role notes when relevant
|
|
228
|
+
- for backend, fullstack, and web projects, plan README startup instructions so they include the canonical `docker compose up --build` contract and also the exact legacy compatibility string `docker-compose up` to satisfy the strict README audit without weakening the real runtime contract
|
|
229
|
+
- for Android, iOS, and desktop projects, plan README platform-specific host-side build or launch guidance in addition to the required Docker-contained runtime and test contract so the strict README audit has the expected section shape
|
|
230
|
+
- plan a deliberate cleanup step between fast local iteration and final hardening so local-only setup traces, host-only dependency assumptions, and misleading README instructions are removed before integrated verification and hardening close
|
|
172
231
|
- define the static review story before coding: a fresh reviewer should be able to trace startup, test entry points, main routes or entry modules, core data flow, and any mock or local-data boundaries from repo artifacts without rewriting the project
|
|
173
232
|
- define the static audit story for security and tests before coding: a fresh reviewer should be able to trace security boundaries and requirement-to-test coverage from repository artifacts and docs without reconstructing the design mentally
|
|
174
233
|
- define repo traceability before coding through `README.md` plus the code structure inside the repo; keep fuller external references in `../docs/design.md`, `../docs/api-spec.md`, and `../docs/test-coverage.md`
|
|
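The hunk above describes `../docs/test-coverage.md` as a matrix with a requirement or risk point, mapped test file(s), key assertion(s) or fixtures, coverage status, major gap, and minimum test addition. A minimal sketch of rendering rows in that shape follows; the `CoverageRow` class, the column headings, and the sample file path are illustrative assumptions rather than the package's own format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CoverageRow:
    requirement: str        # requirement or risk point
    test_files: List[str]   # mapped test file(s)
    key_assertions: str     # key assertion(s) or fixtures
    status: str             # coverage status
    major_gap: str          # major gap, if any
    minimum_addition: str   # minimum test addition


# Assumed column headings; the source lists the fields but not exact titles.
HEADER = ("| Requirement or risk | Test file(s) | Key assertions | "
          "Status | Major gap | Minimum test addition |")
DIVIDER = "|---|---|---|---|---|---|"


def render_matrix(rows: List[CoverageRow]) -> str:
    """Render the rows as a markdown table, one line per requirement or risk point."""
    lines = [HEADER, DIVIDER]
    for r in rows:
        cells = [r.requirement, ", ".join(r.test_files), r.key_assertions,
                 r.status, r.major_gap, r.minimum_addition]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical row; the test path is made up for illustration.
    sample = CoverageRow("401 on missing token", ["tests/api/test_auth.py"],
                         "status == 401", "covered", "none", "none")
    print(render_matrix([sample]))
```

Keeping each row tied to a named test file is what lets a reviewer trace requirement-to-test coverage statically, which is the property the planning items above ask for.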
@@ -49,7 +49,7 @@ Prefer existing workflow artifacts first:
|
|
|
49
49
|
- developer-session handoffs
|
|
50
50
|
- review and rejection history
|
|
51
51
|
- verification gate notes
|
|
52
|
-
-
|
|
52
|
+
- `../.tmp/` audit and fix-check reports
|
|
53
53
|
- packaging checks
|
|
54
54
|
|
|
55
55
|
Do not reread the entire codebase unless a real inconsistency requires it.
|
|
@@ -66,7 +66,7 @@ Do not rerun broad Docker or full-suite verification just for retrospective anal
|
|
|
66
66
|
- owner shell
|
|
67
67
|
- developer prompt
|
|
68
68
|
- skills
|
|
69
|
-
- `AGENTS.md`
|
|
69
|
+
- repo-local rulebook file such as `AGENTS.md` or `CLAUDE.md`
|
|
70
70
|
7. actionable improvements
|
|
71
71
|
|
|
72
72
|
## Audit buckets
|