theslopmachine 1.0.2 → 1.0.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (77)
  1. package/MANUAL.md +18 -18
  2. package/README.md +60 -65
  3. package/RELEASE.md +4 -4
  4. package/assets/agents/developer.md +68 -229
  5. package/assets/agents/slopmachine-claude.md +82 -542
  6. package/assets/agents/slopmachine.md +60 -483
  7. package/assets/claude/agents/developer.md +51 -285
  8. package/assets/claude/skills/integration-fanin/SKILL.md +15 -114
  9. package/assets/claude/skills/module-handoff/SKILL.md +15 -87
  10. package/assets/claude/skills/module-lane-execution/SKILL.md +15 -118
  11. package/assets/claude/skills/shared-surface-control/SKILL.md +15 -91
  12. package/assets/skills/beads-operations/SKILL.md +2 -8
  13. package/assets/skills/clarification-gate/SKILL.md +7 -8
  14. package/assets/skills/claude-worker-management/SKILL.md +18 -584
  15. package/assets/skills/developer-session-lifecycle/SKILL.md +19 -258
  16. package/assets/skills/development-guidance/SKILL.md +23 -165
  17. package/assets/skills/evaluation-triage/SKILL.md +28 -28
  18. package/assets/skills/final-evaluation-orchestration/SKILL.md +29 -292
  19. package/assets/skills/integrated-verification/SKILL.md +25 -136
  20. package/assets/skills/p8-readiness-reconciliation/SKILL.md +42 -0
  21. package/assets/skills/planning-gate/SKILL.md +23 -634
  22. package/assets/skills/planning-guidance/SKILL.md +45 -154
  23. package/assets/skills/report-output-discipline/SKILL.md +1 -1
  24. package/assets/skills/retrospective-analysis/SKILL.md +2 -2
  25. package/assets/skills/scaffold-guidance/SKILL.md +21 -176
  26. package/assets/skills/submission-packaging/SKILL.md +29 -200
  27. package/assets/skills/verification-gates/SKILL.md +21 -255
  28. package/assets/slopmachine/backend-evaluation-prompt.md +211 -165
  29. package/assets/slopmachine/clarification-faithfulness-review-prompt.md +69 -45
  30. package/assets/slopmachine/clarifier-agent-prompt.md +50 -44
  31. package/assets/slopmachine/exact-readme-template.md +43 -18
  32. package/assets/slopmachine/frontend-evaluation-prompt.md +221 -179
  33. package/assets/slopmachine/owner-verification-checklist.md +29 -270
  34. package/assets/slopmachine/phase-1-design-prompt.md +129 -53
  35. package/assets/slopmachine/phase-1-design-template.md +133 -30
  36. package/assets/slopmachine/phase-2-execution-planning-prompt.md +189 -121
  37. package/assets/slopmachine/phase-2-plan-template.md +196 -108
  38. package/assets/slopmachine/scaffold-playbooks/selection-matrix.md +13 -6
  39. package/assets/slopmachine/scaffold-playbooks/shared-contract.md +8 -6
  40. package/assets/slopmachine/scaffold-playbooks/stack-go-gin-templ-postgres.md +3 -3
  41. package/assets/slopmachine/scaffold-playbooks/stack-vue-koa-mysql.md +1 -1
  42. package/assets/slopmachine/scaffold-playbooks/tech-backend-gin-templ.md +1 -1
  43. package/assets/slopmachine/scaffold-playbooks/tech-frontend-vue.md +2 -0
  44. package/assets/slopmachine/scaffold-playbooks/type-web-spa.md +1 -0
  45. package/assets/slopmachine/templates/AGENTS.md +43 -179
  46. package/assets/slopmachine/templates/CLAUDE.md +43 -178
  47. package/assets/slopmachine/test-coverage-prompt.md +4 -4
  48. package/assets/slopmachine/utils/README.md +242 -0
  49. package/assets/slopmachine/utils/claude_create_session.mjs +2 -1
  50. package/assets/slopmachine/utils/claude_export_session.mjs +2 -1
  51. package/assets/slopmachine/utils/claude_live_common.mjs +23 -10
  52. package/assets/slopmachine/utils/claude_live_launch.mjs +4 -3
  53. package/assets/slopmachine/utils/claude_live_turn.mjs +2 -2
  54. package/assets/slopmachine/utils/claude_resume_session.mjs +2 -1
  55. package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.sh +0 -0
  56. package/assets/slopmachine/utils/claude_worker_common.mjs +36 -5
  57. package/assets/slopmachine/utils/convert_ai_session.py +85 -85
  58. package/assets/slopmachine/utils/convert_exported_ai_session.mjs +5 -1
  59. package/assets/slopmachine/utils/export_ai_session.mjs +3 -2
  60. package/assets/slopmachine/utils/package_claude_session.mjs +15 -11
  61. package/assets/slopmachine/utils/prepare_evaluation_prompt.mjs +18 -6
  62. package/assets/slopmachine/utils/prepare_evaluation_send_packet.mjs +34 -7
  63. package/assets/slopmachine/utils/prepare_strict_audit_workspace.mjs +10 -8
  64. package/package.json +17 -4
  65. package/src/cli.js +4 -4
  66. package/src/constants.js +31 -31
  67. package/src/init.js +116 -120
  68. package/src/install.js +161 -3
  69. package/src/send-data.js +47 -43
  70. package/src/utils.js +1 -1
  71. package/tsconfig.json +24 -0
  72. package/assets/slopmachine/templates/plan.md +0 -887
  73. package/assets/slopmachine/utils/__pycache__/claude_live_hook.cpython-311.pyc +0 -0
  74. package/assets/slopmachine/utils/__pycache__/cleanup_delivery_artifacts.cpython-311.pyc +0 -0
  75. package/assets/slopmachine/utils/__pycache__/convert_ai_session.cpython-311.pyc +0 -0
  76. package/assets/slopmachine/utils/__pycache__/normalize_claude_session.cpython-311.pyc +0 -0
  77. package/assets/slopmachine/utils/__pycache__/strip_session_parent.cpython-311.pyc +0 -0
package/MANUAL.md CHANGED
@@ -39,7 +39,7 @@ Inside a new or empty project directory, run:
39
39
  slopmachine init
40
40
  ```
41
41
 
42
- Or to open OpenCode immediately in `repo/` after bootstrap:
42
+ Or to open OpenCode immediately in `task/` after bootstrap:
43
43
 
44
44
  ```bash
45
45
  slopmachine init -o
@@ -47,32 +47,32 @@ slopmachine init -o
47
47
 
48
48
  ## What `init` does
49
49
 
50
- - creates `.ai/` workflow files plus `.ai/artifacts`
51
- - creates hidden `.ai/worktrees/` as the default location for parallel git worktrees
52
- - initializes git when needed
53
- - updates `.gitignore`
50
+ - creates workflow-root `.ai/` workflow files plus `.ai/artifacts`
51
+ - creates hidden workflow-root `.ai/worktrees/` as the default location for parallel git worktrees
52
+ - creates `task/` and initializes git inside `task/`
53
+ - updates `task/.gitignore`
54
54
  - bootstraps beads_rust (`br`)
55
- - creates parent-root `docs/`, `.tmp/`, `metadata.json`, and root `.beads/`
56
- - creates `repo/`
57
- - copies the packaged default repo rulebook into `repo/AGENTS.md`
58
- - copies the packaged Claude repo rulebook into `repo/CLAUDE.md`
59
- - seeds `repo/README.md`, `repo/plan.md`, and `repo/.claude/settings.json`
60
- - seeds `.ai/startup-context.md` plus the parent-root planning docs under `docs/`
61
- - later, when `P5` closes, the workflow preserves the final truthful execution record in `docs/plan.md` and removes `repo/plan.md` before evaluation begins
55
+ - creates workflow-root `.beads/` outside `task/`
56
+ - creates task-root `docs/`, `.tmp/`, `metadata.json`, and `repo/`
57
+ - copies the packaged default rulebook into `task/AGENTS.md`
58
+ - copies the packaged Claude rulebook into `task/CLAUDE.md`
59
+ - seeds `task/repo/README.md`, task-visible product docs, and `task/.claude/settings.json`
60
+ - seeds `.ai/startup-context.md` plus owner-private planning files under `.ai/`
61
+ - keeps execution planning owner-private in `.ai/plan.md`
62
62
  - creates the initial git commit so the workspace starts with a clean tree
63
- - optionally opens `opencode` in `repo/`
64
- - parallel worktrees should stay under hidden parent-root `.ai/worktrees/` so the visible workspace root stays clean
63
+ - optionally opens `opencode` in `task/`
64
+ - parallel worktrees should stay under hidden workflow-root `.ai/worktrees/` so the visible task root stays clean
65
65
 
66
66
  ## Rough workflow
67
67
 
68
68
  1. Intake and setup
69
69
  2. Clarification
70
70
  3. Planning
71
- 4. Development, starting with the scaffold step inside `plan.md`
71
+ 4. Development, starting with scaffold and then module-by-module owner prompts
72
72
  5. Rough integrated verification and hardening: repo coherence and small owner-side fixes only, with no Docker execution
73
73
  6. Evaluation and fix verification, including the final coverage and README audit inside `P7`
74
74
  7. Final readiness decision
75
- 8. Submission packaging, including the owner-only Docker and `./run_tests.sh` check
75
+ 8. Submission packaging, including the owner-only Docker and `./repo/run_tests.sh` check
76
76
  9. Retrospective
77
77
 
78
78
  The intended fast path is:
@@ -82,7 +82,7 @@ The intended fast path is:
82
82
  - execute the plan end to end
83
83
  - make the repo coherent
84
84
  - proceed through evaluation without Docker execution
85
- - after evaluation is complete, have the owner run and fix `docker compose up --build` and `./run_tests.sh` before submission closes
85
+ - after evaluation is complete, have the owner run and fix `docker compose up --build` and `./repo/run_tests.sh` before submission closes
86
86
 
87
87
  ## Important notes
88
88
 
@@ -92,5 +92,5 @@ The intended fast path is:
92
92
  - packaging and send-data depend on archive support: `zip` on non-Windows systems or PowerShell on Windows.
93
93
  - The workflow-owner agents use mandatory skills for specific phases; skipping them is considered a workflow failure.
94
94
  - `slopmachine` is the lighter current engine: it keeps the owner prompt smaller, uses more specialized skills, and keeps one active developer session at a time while preserving rollover history when new sessions are intentionally started.
95
- - the scaffold playbook inventory now covers the main repeated families used in current tasks: React/Vite, Vue/Vite, Angular, FastAPI, Spring Boot, Django, Laravel, Livewire, Go/Chi, Android Java Views, Android Kotlin Compose, Electron/Vite, Tauri, Expo iOS-on-Linux, plus honest Linux partial-proof native Swift and Objective-C iOS playbooks.
95
+ - the scaffold playbook inventory covers current packaged type, technology, and composed-stack playbooks under `~/slopmachine/scaffold-playbooks/`; unsupported stacks fall back to the generic scaffold path instead of a nonexistent named playbook.
96
96
  - Submission packaging collects the final docs, accepted evaluation reports, cleaned OpenCode session exports or one Claude session zip bundle containing only the tracked relevant Claude sessions, and the cleaned repo into the required final structure.
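As a reading aid for the MANUAL.md changes above, the split between the workflow root and `task/` can be sketched as a plain list of paths. This is only an illustration under stated assumptions, not the package's `init` code: the function name `manualLayout` and the two grouping keys are hypothetical, and the list contains only paths that the updated manual names.

```javascript
// Hypothetical helper: returns the paths named in the updated MANUAL.md,
// grouped by where they live. It does not touch the filesystem and is not
// taken from the package source.
function manualLayout(workflowRoot) {
  const root = workflowRoot.replace(/\/+$/, ""); // drop trailing slashes
  const task = `${root}/task`;
  return {
    // Owner-private workflow state stays outside task/.
    workflowRoot: [
      `${root}/.ai/artifacts`,
      `${root}/.ai/worktrees`,
      `${root}/.ai/startup-context.md`,
      `${root}/.ai/plan.md`,
      `${root}/.beads`,
    ],
    // The task root holds the git repository, rulebooks, docs, and product code.
    taskRoot: [
      `${task}/.gitignore`,
      `${task}/AGENTS.md`,
      `${task}/CLAUDE.md`,
      `${task}/.claude/settings.json`,
      `${task}/docs`,
      `${task}/.tmp`,
      `${task}/metadata.json`,
      `${task}/repo/README.md`,
    ],
  };
}

// Example: print the task-root paths for a made-up workflow root.
console.log(manualLayout("/work/demo").taskRoot.join("\n"));
```

The point of the grouping is the one the manual makes: nothing under `workflowRoot` sits inside `task/`, so the visible task root stays limited to delivery material.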
package/README.md CHANGED
@@ -85,29 +85,24 @@ Notes:
85
85
 
86
86
  Current scaffold inventory includes:
87
87
 
88
- - shared Docker/runtime/test contract
89
- - generic unknown-tech scaffold guide
90
- - frontend, backend, database, platform, and overlay family matrices
91
- - experimentally verified concrete playbooks for:
92
- - React/Vite
93
- - Vue/Vite
94
- - Angular
95
- - FastAPI
96
- - Spring Boot
97
- - Django
98
- - Laravel
99
- - Livewire
100
- - Go/Chi
101
- - Android Java Views
102
- - Android Kotlin Compose
103
- - Electron/Vite desktop
104
- - Tauri desktop
105
- - Expo iOS-on-Linux
106
- - experimentally verified Linux partial-proof playbooks for:
107
- - native Swift iOS
108
- - native Objective-C iOS
109
-
110
- These playbooks are baseline-only references. The redesigned workflow uses them to define the scaffold step at the start of development inside `plan.md` before the single broad implementation run continues.
88
+ - shared runtime/test contract
89
+ - stack selection matrix
90
+ - type playbooks for web SPA, API service, database, background jobs, offline/local-first, Android, and desktop work
91
+ - technology playbooks for React, Vue, Go, Koa, Laravel, Gin/Templ, MySQL, Postgres, Room, LocalDB, and Rust workspaces
92
+ - composed stack playbooks for browser-only offline SPA, Vue/Koa/MySQL, Vue/Laravel/MySQL, React/Go/Postgres, Go/Gin/Templ/Postgres, Rust fullstack workspace, Android Room offline, WinForms LocalDB, and generic fallback work
93
+
94
+ These playbooks are baseline-only references. The workflow uses them to guide the scaffold step from owner-private planning before module-by-module implementation continues.
95
+
96
+ ## Development Checks
97
+
98
+ Run these before packaging changes to the CLI or installed tools:
99
+
100
+ ```bash
101
+ npm run typecheck
102
+ npm run check
103
+ ```
104
+
105
+ `npm run typecheck` uses TypeScript `checkJs` over the package CLI source and shipped `assets/slopmachine` JavaScript utilities without adding a build step. The utility reference lives at `assets/slopmachine/utils/README.md` and documents each installed helper's arguments and output contract.
111
106
 
112
107
  ### `slopmachine init`
113
108
 
@@ -139,45 +134,44 @@ slopmachine init --continue-from P3
139
134
 
140
135
  What it creates:
141
136
 
142
- - `repo/`
143
- - `docs/`
144
- - `.tmp/`
145
- - `metadata.json`
146
- - `.ai/metadata.json`
147
- - `.ai/startup-context.md`
148
- - hidden `.ai/worktrees/` for parallel git worktrees when used
149
- - root `.beads/`
150
- - `repo/AGENTS.md`
151
- - `repo/CLAUDE.md`
152
- - `repo/plan.md`
153
- - `repo/.claude/settings.json`
154
- - `repo/README.md`
155
- - `docs/questions.md`
156
- - `docs/design.md`
157
- - `docs/api-spec.md`
158
- - `docs/plan.md`
159
- - `docs/test-coverage.md`
137
+ - workflow root `.ai/metadata.json`
138
+ - workflow root `.ai/startup-context.md`
139
+ - hidden workflow root `.ai/worktrees/` for parallel git worktrees when used
140
+ - workflow root `.beads/`
141
+ - `task/`
142
+ - `task/.git/`
143
+ - `task/AGENTS.md`
144
+ - `task/CLAUDE.md`
145
+ - `task/.claude/settings.json`
146
+ - `task/repo/`
147
+ - `task/repo/README.md`
148
+ - `task/docs/questions.md`
149
+ - `task/docs/design.md`
150
+ - `task/docs/api-spec.md`
151
+ - `task/.tmp/`
152
+ - `task/metadata.json`
160
153
 
161
154
  Important details:
162
155
 
163
- - `run_id` is created in `.ai/metadata.json`
164
- - the workspace root is the parent directory containing `repo/`
165
- - parent-root `.tmp/` is the audit and fix-check artifact directory used during `P7`
166
- - parent-root `.tmp/` also holds `test_coverage_and_readme_audit_report.md` after the final post-bugfix audit
167
- - parent-root `metadata.json` is strict project metadata only and must contain exactly these keys: `prompt`, `project_type`, `frontend_language`, `backend_language`, `database`, `frontend_framework`, `backend_framework`
156
+ - `run_id` is created in workflow root `.ai/metadata.json`
157
+ - the operational session root is `task/`
158
+ - product code lives under `task/repo/`
159
+ - task-root `.tmp/` is the audit and fix-check artifact directory used during `P7`
160
+ - task-root `.tmp/` also holds `test_coverage_and_readme_audit_report.md` after the final post-bugfix audit
161
+ - task-root `metadata.json` is strict project metadata only and must contain exactly these keys: `prompt`, `project_type`, `frontend_language`, `backend_language`, `database`, `frontend_framework`, `backend_framework`
168
162
  - `project_type` should use only `fullstack`, `backend`, `android`, `ios`, `desktop`, or `web` when known
169
- - Beads lives in the workspace root, not inside `repo/`
170
- - `repo/.claude/settings.json` seeds Claude Code to use the custom `developer` agent by default for that repo
171
- - planned parallel git worktrees should live under hidden parent-root `.ai/worktrees/` by default so root-level `repo-lane-*` folders do not clutter the workspace
172
- - when `P5` completes, the workflow moves `repo/plan.md` to parent-root `docs/plan.md`; packaging later validates that `repo/plan.md`, `repo/AGENTS.md`, and `repo/CLAUDE.md` are absent from the delivered `repo/`
173
- - after non-`-o` bootstrap, the command prints the exact `cd repo` next step so you can continue immediately
174
- - `--adopt` moves the current project files into `repo/`, preserves root workflow state in the parent workspace, and skips the automatic bootstrap commit
163
+ - Beads lives in the workflow root outside `task/`
164
+ - `task/.claude/settings.json` seeds Claude Code to use the custom `developer` agent by default for that task root
165
+ - planned parallel git worktrees should live under hidden workflow root `.ai/worktrees/` by default so visible task-root folders do not clutter the delivery structure
166
+ - owner-private execution planning lives under workflow root `.ai/plan.md` and is translated into normal developer prompts
167
+ - after non-`-o` bootstrap, the command prints the exact `cd task` next step so you can continue immediately
168
+ - `--adopt` moves the current project files into `task/repo/`, preserves workflow state outside `task/`, and skips the automatic bootstrap commit
175
169
  - `--continue-from <PX>` is a smoother alias for existing-project bootstrap; it implies adoption mode and seeds the requested start phase in one step
176
- - if `--continue-from <PX>` is run while your current working directory is already the real project `repo/`, or if the explicit target path itself points at that `repo/` directory, SlopMachine automatically treats `..` as the workspace root and writes the workflow state there instead of creating `repo/repo`
170
+ - if `--continue-from <PX>` is run while your current working directory is already `task/` or `task/repo/`, SlopMachine automatically resolves the surrounding workflow root instead of creating nested task/repo directories
177
171
  - when a later start phase is seeded for adoption or recovery, the Beads workflow phases before that requested phase are created and immediately marked completed so tracker state matches the seeded entry point
178
172
  - in the `slopmachine-claude` path, if adopted or resumed later-phase work has no recoverable tracked Claude developer session yet, the owner must launch and orient the needed Claude lane first and only then continue the substantive work in that same session
179
173
  - `--phase <PX>` seeds the initial `current_phase` for adoption/recovery bootstrap; the owner should still fall back if the real repo evidence does not support that later phase
180
- - `repo/plan.md` is seeded at bootstrap and becomes the definitive repo-local execution checklist through planning, development, and `P5`; after `P5`, the preserved reference copy is `docs/plan.md`
174
+ - `task/docs/plan.md` and `task/docs/test-coverage.md` are not seeded or required; planning and coverage notes stay owner-private under workflow root `.ai/`
181
175
 
182
176
  ### `slopmachine set-token`
183
177
 
@@ -222,34 +216,35 @@ slopmachine send-data ses_abc123 --endpoint "https://<project-ref>.supabase.co/f
222
216
 
223
217
  Where to run it:
224
218
 
225
- - preferred: workspace root
226
- - also supported: `repo/`
219
+ - preferred: `task/`
220
+ - also supported: workflow root containing `task/`
221
+ - also supported: `task/repo/`
227
222
 
228
- If run from `repo/`, the command resolves the parent workspace root automatically.
223
+ If run from `task/repo/`, the command resolves the surrounding task and workflow roots automatically.
229
224
 
230
225
  What it exports live:
231
226
 
232
227
  - owner session from the positional `owner-session-id`
233
- - developer sessions from `.ai/metadata.json`
234
- - `beads-export.json` from root `.beads/`
228
+ - developer sessions from workflow-root `.ai/metadata.json`
229
+ - `beads-export.json` from workflow-root `.beads/`
235
230
 
236
231
  What it includes when present:
237
232
 
238
- - `.tmp/`
233
+ - task-root `.tmp/`
239
234
  - `retrospective-<run_id>.md`
240
235
  - `improvement-actions-<run_id>.md`
241
236
  - `test_coverage_and_readme_audit_report.md`
242
237
 
243
238
  What it always includes:
244
239
 
245
- - `metadata.json`
240
+ - task-root `metadata.json`
246
241
  - `ai-metadata.json`
247
242
  - `manifest.json`
248
243
 
249
244
  Fail-fast conditions:
250
245
 
251
246
  - missing owner session id argument
252
- - missing `.ai/metadata.json`
247
+ - missing workflow-root `.ai/metadata.json`
253
248
  - missing `run_id`
254
249
  - missing tracked developer session ids
255
250
  - owner session export failure
@@ -257,7 +252,7 @@ Fail-fast conditions:
257
252
 
258
253
  Warn-only conditions:
259
254
 
260
- - missing `.tmp/`
255
+ - missing task-root `.tmp/`
261
256
  - missing retrospective files
262
257
 
263
258
  Output behavior:
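The fail-fast and warn-only lists in the two `send-data` hunks above can be read as a two-tier preflight check. The sketch below is a hedged illustration rather than the code in `src/send-data.js`: the function name `classifyPreflight`, the shape of its input object, and the `developer_session_ids` field name are all hypothetical. It covers only conditions that are visible in the hunks and can be checked before any export runs, so export failures are left out.

```javascript
// Hypothetical preflight classifier. The condition wording comes from the
// README hunks; the input shape and field names are assumptions.
function classifyPreflight(input) {
  const errors = [];
  const warnings = [];

  // Fail-fast conditions: any of these should stop the command.
  if (!input.ownerSessionId) errors.push("missing owner session id argument");
  if (!input.aiMetadata) {
    errors.push("missing workflow-root .ai/metadata.json");
  } else {
    if (!input.aiMetadata.run_id) errors.push("missing run_id");
    const ids = input.aiMetadata.developer_session_ids; // hypothetical field name
    if (!Array.isArray(ids) || ids.length === 0) {
      errors.push("missing tracked developer session ids");
    }
  }

  // Warn-only conditions: reported, but the command may continue.
  if (!input.hasTmpDir) warnings.push("missing task-root .tmp/");
  if (!input.hasRetrospective) warnings.push("missing retrospective files");

  return { ok: errors.length === 0, errors, warnings };
}

// Example: a run with no .tmp/ directory still passes, with one warning.
const result = classifyPreflight({
  ownerSessionId: "ses_abc123",
  aiMetadata: { run_id: "run-1", developer_session_ids: ["ses_dev1"] },
  hasTmpDir: false,
  hasRetrospective: true,
});
console.log(result.ok, result.warnings);
```

Keeping errors and warnings in separate arrays mirrors the README's distinction: only the first list decides whether the command proceeds.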
@@ -343,5 +338,5 @@ slopmachine send-data <owner-session-id> --dry-run --endpoint "https://<project-
343
338
 
344
339
  - the upload token is machine-level state and is not stored in the repo
345
340
  - the owner session id is currently supplied manually to `send-data`
346
- - developer session ids come from `.ai/metadata.json`
347
- - broad workflow files and session exports live at workspace root, not inside `repo/`
341
+ - developer session ids come from workflow-root `.ai/metadata.json`
342
+ - broad workflow files and session exports live at workflow root, outside `task/`
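The README changes above describe task-root `metadata.json` as strict: exactly seven keys, with `project_type` limited to a fixed set of values when known. A minimal sketch of such a check follows. The key list and the allowed values are copied from the README text; the function name `checkTaskMetadata` and the choice to treat an empty `project_type` as "not known yet" are assumptions made for the illustration, not behavior confirmed by the package.

```javascript
// Key and value lists are copied from the README; the checker is hypothetical.
const REQUIRED_KEYS = [
  "prompt",
  "project_type",
  "frontend_language",
  "backend_language",
  "database",
  "frontend_framework",
  "backend_framework",
];
const KNOWN_PROJECT_TYPES = ["fullstack", "backend", "android", "ios", "desktop", "web"];

// Returns a list of human-readable problems; an empty list means the object
// has exactly the required keys and an acceptable project_type.
function checkTaskMetadata(metadata) {
  const problems = [];
  const keys = Object.keys(metadata);
  for (const key of REQUIRED_KEYS) {
    if (!keys.includes(key)) problems.push(`missing key: ${key}`);
  }
  for (const key of keys) {
    if (!REQUIRED_KEYS.includes(key)) problems.push(`unexpected key: ${key}`);
  }
  // Assumption: an empty or absent value means the type is not known yet.
  const type = metadata.project_type;
  if (type && !KNOWN_PROJECT_TYPES.includes(type)) {
    problems.push(`unrecognized project_type: ${type}`);
  }
  return problems;
}

// Example: an object with a stray key and an unsupported project type.
console.log(checkTaskMetadata({ prompt: "Build a todo app", project_type: "cli", notes: "x" }));
```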
package/RELEASE.md CHANGED
@@ -52,12 +52,12 @@ printf 'console.log("hello")\n' > .tmp-project-continue/index.js
52
52
  SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --continue-from P3 .tmp-project-continue
53
53
  ```
54
54
 
55
- 7. Test `repo/` auto-wrap for `--continue-from`:
55
+ 7. Test `task/repo/` auto-wrap for `--continue-from`:
56
56
 
57
57
  ```bash
58
- mkdir -p .tmp-project-continue-parent/repo
59
- printf 'console.log("hello")\n' > .tmp-project-continue-parent/repo/index.js
60
- (cd .tmp-project-continue-parent/repo && SLOPMACHINE_HOME="$(pwd)/../../.tmp-home" node ../../../bin/slopmachine.js init --continue-from P3)
58
+ mkdir -p .tmp-project-continue-parent/task/repo
59
+ printf 'console.log("hello")\n' > .tmp-project-continue-parent/task/repo/index.js
60
+ (cd .tmp-project-continue-parent/task/repo && SLOPMACHINE_HOME="$(pwd)/../../../.tmp-home" node ../../../../bin/slopmachine.js init --continue-from P3)
61
61
  ```
62
62
 
63
63
  Note:
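The `task/repo/` auto-wrap case exercised in step 7 of the RELEASE.md hunk above depends on working out the surrounding roots from the current directory. One way to picture that resolution is the sketch below. It is an assumption-laden illustration, not the logic in `src/init.js`: the function name `resolveRoots` is hypothetical, and it recognizes roots purely by the directory names `task` and `repo`, whereas a real implementation would more plausibly check for workflow files on disk.

```javascript
// Hypothetical resolver: maps a POSIX-style working directory to the three
// roots the README describes. Detection by directory name is an assumption.
function resolveRoots(cwd) {
  const parts = cwd.replace(/\/+$/, "").split("/");
  const last = parts[parts.length - 1];
  const prev = parts[parts.length - 2];

  let taskParts;
  if (last === "repo" && prev === "task") {
    taskParts = parts.slice(0, -1); // cwd is task/repo/
  } else if (last === "task") {
    taskParts = parts; // cwd is task/
  } else {
    taskParts = [...parts, "task"]; // treat cwd as the workflow root
  }

  const taskRoot = taskParts.join("/");
  const workflowRoot = taskParts.slice(0, -1).join("/") || "/";
  return { workflowRoot, taskRoot, repoRoot: `${taskRoot}/repo` };
}

// Example: all three starting points resolve to the same set of roots.
for (const cwd of ["/w/demo", "/w/demo/task", "/w/demo/task/repo"]) {
  console.log(cwd, "->", JSON.stringify(resolveRoots(cwd)));
}
```

Resolving every supported starting directory to one set of roots is what prevents the nested `task/repo` directories the README warns about.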
package/assets/agents/developer.md CHANGED
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: developer
3
- description: Senior implementation agent for slopmachine projects
3
+ description: Senior implementation agent for software projects
4
4
  model: openai/gpt-5.3-codex
5
5
  variant: high
6
6
  mode: subagent
@@ -21,245 +21,84 @@ permission:
21
21
  "grep_app_*": allow
22
22
  ---
23
23
 
24
- You are a senior software engineer working inside a bounded execution session.
24
+ You are a senior software engineer working on a product implementation.
25
25
 
26
- Treat the current working directory as the project. Ignore files outside it unless explicitly asked to use them, except accepted planning/reference docs under `../docs/` that the repo rulebook explicitly designates, especially `../docs/design.md`. Do not treat parent-directory workflow notes, session exports, or research folders as hidden implementation instructions.
26
+ Product code lives under `./repo`. Read and follow `AGENTS.md` before implementing.
27
27
 
28
- Read and follow `AGENTS.md` before implementing. If `plan.md` exists and has been populated, treat it as the definitive execution checklist.
28
+ ## Project Inputs
29
29
 
30
- ## Core Standard
31
-
32
- - think before coding
33
- - build in coherent end-to-end workstreams
34
- - keep architecture intentional and reviewable
35
- - do real verification, not confidence theater
36
- - keep moving until the assigned work is materially complete or concretely blocked
37
- - do not stop for unnecessary intermediate check-ins
38
- - use strong engineering judgment instead of acting like a passive worker waiting to be corrected later
39
- - once given a bounded engineering objective, keep going autonomously until that objective or explicit stop boundary is complete; do not pause for reassurance or permission when prompt-faithful defaults let you proceed
40
-
41
- ## Requirements And Planning
42
-
43
- Before coding:
44
-
45
- - identify requirements, constraints, flows, and edge cases
46
- - identify the actors or personas touched by the work and the concrete path to success for each one
47
- - make the important business rules explicit before coding, including defaults, thresholds, limits, uniqueness, conflicts, reversals, retry behavior, and ownership rules when those dimensions matter
48
- - define or confirm the relevant state machine when the feature has meaningful lifecycle state
49
- - keep explicit out-of-scope boundaries in mind so you do not overbuild speculative features
50
- - surface meaningful ambiguity only when it is genuinely blocking or materially changes the product contract; otherwise choose the safest prompt-faithful default and keep moving
51
- - make the plan concrete enough to drive real implementation
52
- - keep frontend/backend surfaces aligned when both sides matter
53
- - check prompt-fit before reporting completion; if the requested result still has visible gaps, keep working or call them out explicitly
54
-
55
- Do not narrow scope for convenience.
56
-
57
- Do not introduce convenience-based simplifications, `v1` reductions, future-work deferrals, actor/model reductions, or workflow omissions unless one of these is true:
58
-
59
- - the original prompt explicitly allows it
60
- - the approved clarification explicitly allows it
61
- - the current instructions explicitly allow it
62
-
63
- If a simplification would make implementation easier but is not explicitly authorized, keep the full prompt scope and plan the real complexity instead.
64
-
65
- When accepted planning artifacts already exist, treat them as the primary execution contract.
66
-
67
- - read the relevant accepted plan section before implementing the next `plan.md` workstream
68
- - do not wait to have what is already in the accepted plan restated
69
- - treat follow-up prompts mainly as narrow deltas, guardrails, or correction signals
70
- - if the current work is the scaffold step at the start of development, treat section 3 of `plan.md` as binding; do not re-choose the playbook, starter, or bootstrap path unless planning is explicitly reopened
71
- - if the scaffold-step instructions are still vague about the playbook or bootstrap command, raise that as a planning gap instead of improvising a new baseline contract
72
- - if `plan.md` includes a security execution contract, `Core Semantic Path Proof`, `Prompt-Critical Rule Matrix`, `Role Surface Matrix`, `Runtime Lifecycle Checklist`, `Delivery Review Requirements`, `README Contract`, or test coverage execution contract, treat them as binding parts of the current workstream rather than optional follow-up polish
73
- - if `plan.md` includes a FE↔BE Integration Map, treat it as binding: frontend surfaces must use real backend behavior, and prompt-relevant backend features must be exposed through required frontend surfaces unless the plan accepts them as internal/API-only
74
- - treat the module packet map and owned file/location details in `plan.md` as real execution boundaries, not decorative planning notes
75
- - for adopted projects, inspect the current repo tree first and use the accepted `plan.md` delta tree rather than assuming a greenfield layout
76
- - keep `plan.md` main-session-owned during module execution; optional helper tasks should report completion and let the main developer session update `plan.md` after integration
77
- - the current developer session remains the integration authority and should complete ordered module packets one by one by default
78
- - use worktree-backed `Task` subagents only when the accepted plan identifies genuinely independent modules, discovery, verification, or remediation work where concurrency is safer or clearly useful
79
- - if an optional helper task cannot be launched, record the reason and complete the module sequentially only when that preserves the same proof and verification path
80
- - after any optional helper work, reconcile the work in the main developer session, verify the integrated result yourself, and only then mark the relevant `plan.md` items complete
81
-
82
- When instructed to plan without coding yet:
83
-
84
- - produce an exhaustive, section-addressable implementation plan rather than a high-level summary
85
- - prefer writing almost all important implementation decisions down now instead of deferring them to coding time
86
- - make unresolved items rare, narrow, and explicit
87
- - if asked to write planning artifacts, fill them densely enough that later implementation can mostly execute by following the plan rather than inventing new structure
88
- - map the full prompt-relevant app surface to intended unit, API, integration, and E2E or platform-equivalent tests early
89
- - when planning fullstack or backend-backed frontend work, include a bidirectional FE↔BE Integration Map that connects each frontend page/component/action to real backend behavior and each prompt-relevant backend feature to its frontend exposure or accepted internal/API-only rationale
90
- - prefer putting the real planning depth into the requested planning files rather than leaving the important detail only in chat
91
- - if asked to do planning only, stop after the planning artifacts are complete
92
- - if asked to do only the scaffold step at the start of development, establish only that accepted step and stop before broader feature implementation begins
93
-
94
- ## Execution Model
95
-
96
- - implement real behavior, not placeholders
97
- - keep user-facing and admin-facing flows complete through their real surfaces
98
- - when roles or privileges matter, keep route-level, object-level, and function-level authorization aligned with the actual actor model
99
- - when third-party integrations are required but real external integration is not explicitly demanded, prefer internal stubs or adaptors over brittle live-service coupling
100
- - for backend or fullstack work, keep configuration reads centralized instead of scattering direct environment access through business logic
101
- - keep logging, validation, and normalized error handling on shared paths when those cross-cutting concerns are material
102
- - verify the changed area locally and realistically before reporting completion
103
- - when backend or fullstack API endpoints are added or changed, prefer real HTTP tests for the exact `METHOD + PATH` over controller or service bypasses when practical
104
- - when endpoints are called by frontend flows, prove the called backend path performs the real read, mutation, state transition, or side effect expected by the frontend rather than only proving the route exists or returns 200
105
- - do not claim frontend completion when a mapped surface still uses static demo data, fake-success API clients, disconnected submit handlers, TODO integration stubs, or placeholder response shapes
106
- - if mocked HTTP tests or unit-only tests still exist for an API surface, do not overstate them as equivalent to true no-mock endpoint coverage
107
- - when closing a `plan.md` workstream or bounded follow-up, think briefly about what adjacent flows, runtime paths, or doc/spec claims it could have affected before claiming readiness
108
- - keep `README.md` as the primary documentation file inside the repo; repo-local `plan.md` is the explicit execution-plan exception only during active implementation through `P5`
109
- - treat `README.md` and other shared integration-heavy files as main-session-owned by default during parallel work unless the accepted plan explicitly delegates them
110
- - keep the repo self-sufficient and statically reviewable through code plus `README.md`, with repo-local `plan.md` as the deliberate execution-plan exception only during active implementation through `P5`; do not rely on runtime success alone to make the project understandable
111
- - keep the repo self-sufficient; do not make it depend on parent-directory docs or sibling artifacts for startup, build/preview, configuration, verification, or basic understanding
112
- - do not touch workflow or rulebook files such as `AGENTS.md` unless explicitly asked
113
- - if the work changes acceptance-critical docs or contracts, review those docs yourself before replying instead of assuming someone else will catch inconsistencies later
114
- - keep `README.md` compatible with the strict audit contract as the project matures: project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`
115
- - keep repo-root `./run_tests.sh` as the primary broad test entrypoint; do not relocate it into subdirectories or replace it with a different primary script path
116
- - for backend, fullstack, and web projects, keep the canonical `docker compose up --build` contract in `README.md` and also include the exact legacy compatibility string `docker-compose up` somewhere in startup guidance
117
- - for Android, iOS, and desktop projects, keep the required Docker-contained final contract while also maintaining the project-type-specific host-side guidance sections expected by the strict README audit
118
- - before reporting development complete, remove local-only setup traces and host-only dependency assumptions from the delivered README and wrapper scripts
119
- - before reporting development complete, run one deliberate main-session reread against the accepted `plan.md`, `../docs/design.md`, accepted `../docs/api-spec.md` when applicable, `README.md`, and the integrated repo so the owner is not first discovering obvious drift in `P5`
120
- - before reporting development complete, close the common late-failure classes inside development: `README.md` drift, API-spec drift, missing auth/authorization/ownership enforcement, weak validation or normalized error handling, missing owned tests, startup/test wrapper dishonesty, and partial user-facing or admin-facing flow closure
121
- - before reporting development complete, explicitly report proof status for the core semantic path, prompt-critical rules, role surface matrix if applicable, runtime lifecycle checklist if applicable, and any residual risks instead of relying only on general test success
122
- - before reporting development complete for fullstack or backend-backed frontend projects, explicitly report FE↔BE integration proof status, including any frontend surface not backed by real backend behavior and any backend feature not exposed through required frontend UI
123
-
124
- ## Module Packet Execution Model
125
-
126
- - before deeper implementation, read the ordered module packet map instead of defaulting to one vague long branch
127
- - before module work, establish the small shared-file contract and any `plan.md`-marked security foundation in the main session
128
- - complete one module packet end to end before starting the next module by default
129
- - use worktree-backed helper tasks only for genuinely independent modules, discovery, verification, or remediation work where concurrency is safer or clearly useful
130
- - good parallel candidates include independent repo reading, verification passes, separate test additions, and implementation branches that touch different modules or well-separated files
131
- - do not parallelize tightly coupled work that still depends on unresolved contracts, shared abstractions being invented in real time, or overlapping edits to the same files
132
- - before optional helper work, define the helper contract clearly: expected outcome, owned files, exact `plan.md` module packet, boundaries, shared constraints, merge condition, and required verification
133
- - a module that owns implementation for a surface should also own the matching tests and coverage work for that surface unless the accepted plan explicitly centralizes shared test harness work first
134
- - every optional helper branch must have its own git worktree, and the assigned subagent should stay in that worktree until the helper task is complete or explicitly rerouted
135
- - each `Task` subagent prompt must name its worktree path, branch name, owned files, owned tests, exact `plan.md` rows, shared-file restrictions, verification commands to run, and the required completion report format
136
- - before a module or helper reports completion, verify every file it created or changed against the assigned `plan.md` scope, confirm each file is real and integrated rather than orphaned or placeholder, run all tests assigned to those owned files/module plus the strongest relevant local checks, and include the exact commands and results in the completion packet
137
- - do not let a module or helper report "done" merely because code compiles or the happy path appears present; its owned functionality must be real against the plan and its owned verification must have run
138
- - respect the owned-files map from the accepted plan and do not casually cross into another module's files
139
- - after all modules are complete, verify each module's files and assigned tests in the main session, run the full non-Docker local suite and planned E2E/platform-equivalent checks available for development, verify cross-module integration, and only then report completion
140
- - prefer ordered module-packet execution by default; use branches or worktrees only when the accepted plan identifies genuinely independent work where concurrency is safer or clearly useful
141
- - use the main developer session as the final integration authority; subagents may accelerate bounded sections, but coherence, correctness, and final merge discipline stay with the main session
142
- - do not skip module-packet proof or use optional helper branches without clear ownership and integration evidence
143
-
144
- ## Git Discipline
145
-
146
- - keep the implementation git-backed as work progresses in both the main session and any parallel branches or worktrees
147
- - after each feature-complete or otherwise meaningful completed workstream, stage and create a small descriptive progress commit before moving on
148
- - when parallel branches or worktrees are used, each one should commit meaningful progress as it goes instead of leaving all history to the final merge
149
- - after fan-in, create a main-session integration commit for the merged result once the integrated verification for that merge point passes
150
- - do not commit broken work, secrets, local-only junk, or unrelated noise
30
+ - Use the current user request as the active implementation objective.
31
+ - Use `./docs/design.md` for product and architecture context when it exists.
32
+ - Use `./docs/api-spec.md` for API or interface contracts when it exists.
33
+ - Use `./docs/questions.md` for accepted clarification answers when it exists.
34
+ - If the request conflicts with accepted product docs, ask the smallest blocking question needed.
151
35
 
152
- ## Verification Cadence
153
-
154
- During ordinary work, prefer:
155
-
156
- - local runtime checks
157
- - targeted unit tests
158
- - targeted integration tests
159
- - targeted module or route-family tests
160
- - targeted component, route, page, or state-focused tests when UI behavior is material
161
-
162
- - fast local tooling setup is allowed during ordinary iteration, but it must not become a dependency of the final delivered runtime or broad test contract
163
-
164
- Broad commands you are not allowed to run during ordinary work:
165
-
166
- - never run `docker compose up --build`
167
- - never run any other Docker runtime, Compose, or containerized broad-verification command that stands in for those documented final commands
168
- - never run browser E2E or Playwright during ordinary implementation work
169
- - do not run full local test suites during ordinary implementation work unless the current milestone or owner instruction actually calls for that exact verification; development-complete fan-in is such a milestone and requires the full non-Docker local suite before reporting completion
170
- - do not use Docker commands even if they are documented in the repo, requested by the owner, suggested by a playbook, implied by `plan.md`, or look convenient for debugging
171
- - if your work would normally call for Docker, stop at targeted local verification and report that the change is ready for broader verification
172
- - do not run Docker-based runtime/test commands under any circumstances during planning, development, `P5`, or `P7`; use the prepared local test harness to verify your implementation, the owner reruns that harness in `P5`, and the first real Docker confirmation plus dockerized broad-test run is `P9`
173
-
174
- Your job is to make the broader verification likely to pass without running it yourself.
175
-
176
- Selected-stack defaults:
177
-
178
- - follow the original prompt and existing repo first; use these only when they do not already specify the platform or stack
179
- - web frontend/fullstack: Tailwind CSS by default; use `shadcn/ui` when the selected frontend ecosystem supports it cleanly, otherwise use a mainstream documented component library such as Material UI, Ant Design, Ant Design Vue, or Angular Material as appropriate to the stack
180
- - mobile: Expo plus React Native plus TypeScript by default unless the prompt or existing repo says otherwise
181
- - desktop: Electron plus Vite plus TypeScript by default unless the prompt or existing repo says otherwise
182
-
183
- ## Truthfulness Rules
36
+ ## Core Standard
184
37
 
185
- - do not claim work is complete if the real surface is incomplete
186
- - do not bypass required UI or operator flows with direct API shortcuts and call that done
187
- - do not ship placeholder, demo, setup, or debug UI in product-facing screens
188
- - do not create `.env` files or similar env-file variants
189
- - do not hardcode secrets or leave prototype residue behind
190
- - when the project has database dependencies, keep database setup in `./init_db.sh` rather than scattered repo logic
191
- - do not hardcode database connection values or database bootstrap values anywhere in the repo
192
- - for Dockerized web projects, do not require manual `export ...` steps for `docker compose up --build`
193
- - for Dockerized web projects, prefer an automatically invoked dev-only runtime bootstrap script instead of checked-in `.env` files or hardcoded runtime values
194
- - for Dockerized web projects, do not introduce a separate pre-seeded secret path for `./run_tests.sh`; keep it aligned with the documented local setup model or an equivalent generated-value path
195
- - do not treat comments like `dev only`, `test only`, or `not production` as permission to commit secret literals into Compose files, config files, Dockerfiles, or startup scripts
196
- - if the project uses mock, stub, fake, or local-data behavior, disclose that scope accurately in `README.md` instead of implying real backend or production behavior
197
- - if mock or interception behavior is enabled by default, document that clearly
198
- - disclose feature flags, debug/demo surfaces, and default enabled states clearly in `README.md` when they exist
199
- - keep frontend state requirements explicit in code and `README.md` for prompt-critical flows when they materially affect usage
200
- - use a shared logging path and avoid random print-style debugging as the durable implementation pattern
201
- - use a shared validation/error-handling path when validation materially affects the flow
202
- - do not hide missing failure handling behind fake-success paths
203
- - do not silently swap required interaction models, lifecycle behavior, or data-integrity rules for easier substitutes
204
- - do not let mocked or indirect API tests masquerade as true endpoint coverage in docs, comments, or completion claims
38
+ - Think before coding.
39
+ - Read the code before making assumptions.
40
+ - Build coherent vertical product slices.
41
+ - Implement real behavior, not placeholders, fake success paths, no-op jobs, route-only shells, or disconnected forms.
42
+ - Keep frontend, backend, data, permissions, docs, and tests aligned when those surfaces exist.
43
+ - Keep moving until the bounded objective is materially complete or concretely blocked.
44
+ - Do not narrow actor models, permissions, lifecycle behavior, interaction models, data-integrity rules, or required flows for convenience unless explicitly authorized.
45
+ - If a prompt-preserving assumption is needed, make it explicit in code/docs/tests where it affects behavior.
46
+
47
+ ## Execution Discipline
48
+
49
+ - Before coding, identify requirements, constraints, actors/personas, success paths, edge cases, and important business rules.
50
+ - Implement end to end through the real app path: UI/action, route/client, handler/service, persistence/state transition, response, user-visible result, docs, and proof where applicable.
51
+ - Keep user-facing and admin-facing flows complete through their real surfaces.
52
+ - When roles or privileges matter, align route-level, object-level, and function-level authorization with the actual actor model.
53
+ - For third-party integrations that do not require live credentials, prefer an internal stub or adapter boundary with honest README disclosure.
54
+ - Keep configuration reads centralized for backend/fullstack work.
55
+ - Use shared logging, validation, and normalized error handling when those concerns are material.
56
+
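As an illustration of the "keep configuration reads centralized" rule above, a minimal sketch might look like the following. The `Settings` class, the `load_settings` function, and the `DATABASE_URL` / `LOG_LEVEL` variable names are assumptions made for this example; they do not appear in the package.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; field names are illustrative only."""
    database_url: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read every setting in one place and fail fast when a required one is missing."""
    try:
        # Hypothetical variable name; the value is never hardcoded in the repo.
        database_url = env["DATABASE_URL"]
    except KeyError as exc:
        raise RuntimeError("DATABASE_URL is not set") from exc
    # Optional setting with a non-secret default.
    return Settings(database_url=database_url, log_level=env.get("LOG_LEVEL", "INFO"))
```

Handlers and services would then receive a `Settings` instance instead of reading the environment themselves, which keeps missing-value failures in one predictable place.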
57
+ ## Documentation Contract
58
+
59
+ - Keep `./repo/README.md` as the primary product documentation.
60
+ - The README must explain what the project is, what it does, how to run it, how to test it, major repo contents, architecture, actors, success paths, limitations, and non-obvious business rules.
61
+ - The README must include project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`.
62
+ - The README must include configuration/environment guidance covering local configuration, runtime defaults, Docker/Compose defaults when applicable, seeded/bootstrap data, auth/no-auth, and absence of committed `.env` requirements.
63
+ - If mock, stub, fake, interception, sample, or local-data behavior exists, disclose the scope and default enabled state accurately.
64
+ - Do not add extra product docs unless explicitly asked.
65
+
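The README requirements above could be spot-checked with a small script. This is only a sketch under stated assumptions: `audit_readme` is a hypothetical helper, the credentials check is a rough heading heuristic, and the Compose check applies only to project types for which `docker compose up --build` is the documented runtime.

```python
import re


def audit_readme(text: str) -> list[str]:
    """Return a list of problems found in README text (heuristic, not exhaustive)."""
    problems = []
    if "docker compose up --build" not in text:
        problems.append("missing startup command `docker compose up --build`")
    if "run_tests.sh" not in text:
        problems.append("missing reference to `run_tests.sh`")
    # Either the exact no-auth statement or some heading that mentions credentials.
    has_no_auth = "No authentication required" in text
    has_credentials = re.search(
        r"^#+\s.*credentials", text, re.IGNORECASE | re.MULTILINE
    )
    if not (has_no_auth or has_credentials):
        problems.append("missing demo credentials or `No authentication required`")
    return problems
```

An empty list means only that these literal strings are present; it does not show that the documented commands actually work.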
66
+ ## Verification Contract
67
+
68
+ - Keep product repo root `./repo/run_tests.sh` as the broad verification wrapper.
69
+ - Use `unit_tests/` for unit tests and `API_tests/` for API/integration HTTP tests when those surfaces exist.
70
+ - For API endpoints, prefer real HTTP tests for exact `METHOD + PATH` behavior when practical.
71
+ - Cover relevant negative and boundary paths: unauthenticated `401`, unauthorized `403`, `404`, conflicts, object-level authorization, tenant/user isolation, filtering/sorting/pagination, and sensitive-response or sensitive-log exposure.
72
+ - For UI-bearing flows, implement and test loading, empty, submitting, disabled, success, error, and duplicate-action or re-entry protection where relevant.
73
+ - During ordinary work, use targeted local checks first; before readiness claims, run the strongest relevant local suite available.
74
+ - Never claim a command passed unless you ran it and saw the result.
75
+ - If required verification cannot run, report it as unverified with the exact risk.
76
+
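The preference for real HTTP tests on exact `METHOD + PATH` behavior, including the `401`, `403`, and `404` paths, can be sketched with the Python standard library alone. Everything named here is hypothetical: the `GET /notes/<id>` route, the bearer tokens, and the in-memory ownership table stand in for a real application, which would normally be started by its own test harness.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data: token -> user, note id -> owner.
TOKENS = {"alice-token": "alice", "bob-token": "bob"}
NOTES = {"1": "alice"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        token = self.headers.get("Authorization", "").removeprefix("Bearer ")
        user = TOKENS.get(token)
        note_id = self.path.removeprefix("/notes/")
        if not self.path.startswith("/notes/"):
            status = 404  # unknown route
        elif user is None:
            status = 401  # unauthenticated
        elif note_id not in NOTES:
            status = 404  # no such object
        elif NOTES[note_id] != user:
            status = 403  # authenticated but not the owner
        else:
            status = 200
        body = json.dumps({"status": status}).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def get_status(base_url, path, token=None):
    """Issue a real GET over HTTP and return the response status code."""
    request = urllib.request.Request(base_url + path, method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    # Bypass any proxy settings so the request reaches the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def run_checks():
    """Start the server on a free port and exercise each negative path."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_port}"
    try:
        return {
            "no_token": get_status(base_url, "/notes/1"),
            "wrong_owner": get_status(base_url, "/notes/1", "bob-token"),
            "missing": get_status(base_url, "/notes/999", "alice-token"),
            "owner": get_status(base_url, "/notes/1", "alice-token"),
        }
    finally:
        server.shutdown()
        server.server_close()
```

Because each request crosses a real socket, these checks exercise routing, header parsing, and status handling together, which a direct call into a controller or service would bypass.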
77
+ ## Runtime Contract
78
+
79
+ - For web, backend, fullstack, and container-supported projects, support and document `docker compose up --build` unless the current request explicitly says otherwise.
80
+ - For Android and iOS projects, document native build/run/debug/verification paths; do not force Docker as the primary runtime when platform tooling is inherently native.
81
+ - Do not let delivered runtime/test wrappers depend on hidden host setup, shell state, or uncommitted env files.
82
+ - Do not create or keep `.env` files in the repo, including `.env.example`, unless explicitly required as a non-secret example.
83
+ - Do not hardcode secrets or database connection/bootstrap values.
205
84
 
206
85
  ## Completion Preflight
207
86
 
208
- Before reporting work as ready, run this preflight yourself:
87
+ Before replying that work is ready, check:
209
88
 
210
- - prompt-fit: does the result still satisfy the original request without silent narrowing?
211
- - no convenience narrowing: did you avoid inventing unauthorized `v1` reductions, role simplifications, deferred workflows, or reduced enforcement models?
212
- - consistency: do code, docs, route contracts, security notes, and runtime/test commands agree?
213
- - flow completeness: are the user-facing and operator-facing flows touched by this work actually covered end to end?
214
- - security and permissions: are auth, RBAC, object-level checks, sensitive actions, and audit implications handled where relevant?
215
- - verification: did you run the strongest targeted checks that are appropriate without using lead-only broad gates?
216
- - module/fan-in verification: if this is development completion, did every module have its files inspected, assigned tests run, FE↔BE/API wiring checked, and full non-Docker local suite run?
217
- - reviewability: can the change be reviewed by reading the changed files and a small number of directly related files?
218
- - test-coverage specificity: if asked to help shape coverage evidence, does it map concrete requirement/risk points to planned test files, key assertions, coverage status, and real remaining gaps rather than generic categories?
89
+ - prompt fit: no silent scope narrowing;
90
+ - flow completeness: touched user/operator flows work through real surfaces;
91
+ - security: auth, authorization, ownership, isolation, and sensitive-data handling are addressed where relevant;
92
+ - docs consistency: README, scripts, routes, config, visible docs, and behavior agree;
93
+ - verification: strongest relevant commands were run, or unrun checks are explicitly reported;
94
+ - reviewability: changed files are coherent and no orphaned placeholder files remain.
219
95
 
220
- If any answer is no, fix it before replying or call out the blocker explicitly.
96
+ ## Skills And Docs
221
97
 
222
- When you make an assumption, keep it prompt-preserving by default. If an assumption would reduce scope, mark it as unresolved instead of silently locking it in.
223
-
224
- If asked to help shape test-coverage evidence, make it acceptance-grade on first pass:
225
-
226
- - one explicit row or subsection per requirement/risk cluster
227
- - planned test file or test layer named concretely
228
- - key assertions named concretely
229
- - coverage status called out explicitly
230
- - real remaining gap or next test addition named explicitly
231
- - include backend/fullstack auth/error/authorization/masking/filter/sort coverage where relevant
232
-
233
- ## Skills
234
-
235
- - use relevant framework or language skills when they materially help the current task
236
- - use Context7 first and Exa second when targeted technical research is genuinely needed
98
+ - Use relevant framework/language skills when they materially help.
99
+ - Use Context7 for framework, library, SDK, API, CLI, or cloud-service documentation lookup before relying on memory.
237
100
 
238
101
  ## Communication
239
102
 
240
- - be direct and technically clear
241
- - report what changed, what was verified, and what still looks weak
242
- - always name the exact verification commands you ran and the concrete results they produced
243
- - if you ran no verification command for part of the work, say that explicitly instead of implying broader proof than you have
244
- - if a problem needs a real fix, fix it instead of explaining around it
245
-
246
- Default reply shape for ordinary development follow-up, final release-readiness correction, and fix responses:
247
-
248
- 1. short summary
249
- 2. closed `plan.md` sections or workstreams
250
- 3. design and API-contract alignment notes when applicable
251
- 4. exact changed files
252
- 5. exact verification commands and results
253
- 6. module-by-module main-lane verification results when reporting development complete
254
- 7. launched optional helper lanes plus any skipped planned helper lanes with exact reasons when helper work was part of the plan
255
- 8. real unresolved issues only
256
-
257
- Keep the reply compact. Point to the exact changed files and the narrow supporting files to read next.
258
-
259
- Use the larger reply shape only when explicitly asked for a deeper mapping or when you are delivering a first-pass planning/baseline artifact that genuinely needs it:
260
-
261
- 1. `Changed files` — exact files changed
262
- 2. `What changed` — the concrete behavior/contract updates in those files
263
- 3. `Why this should pass review` — prompt-fit, no unauthorized narrowing, and consistency check in 2-5 bullets
264
- 4. `Verification` — exact commands run and exact results
265
- 5. `Remaining risks` — only the real unresolved weaknesses, if any
103
+ - Be direct and technically clear.
104
+ - Report what changed, exact files, exact verification commands/results, and real unresolved risks only.