theslopmachine 0.6.2 → 0.7.0

Files changed (76)
  1. package/MANUAL.md +21 -6
  2. package/README.md +55 -7
  3. package/RELEASE.md +15 -0
  4. package/assets/agents/developer.md +41 -1
  5. package/assets/agents/slopmachine-claude.md +100 -60
  6. package/assets/agents/slopmachine.md +40 -17
  7. package/assets/claude/agents/developer.md +42 -5
  8. package/assets/skills/clarification-gate/SKILL.md +25 -5
  9. package/assets/skills/claude-worker-management/SKILL.md +280 -57
  10. package/assets/skills/developer-session-lifecycle/SKILL.md +81 -37
  11. package/assets/skills/development-guidance/SKILL.md +21 -1
  12. package/assets/skills/evaluation-triage/SKILL.md +32 -23
  13. package/assets/skills/final-evaluation-orchestration/SKILL.md +86 -50
  14. package/assets/skills/hardening-gate/SKILL.md +17 -3
  15. package/assets/skills/integrated-verification/SKILL.md +3 -3
  16. package/assets/skills/planning-gate/SKILL.md +32 -3
  17. package/assets/skills/planning-guidance/SKILL.md +72 -13
  18. package/assets/skills/retrospective-analysis/SKILL.md +2 -2
  19. package/assets/skills/scaffold-guidance/SKILL.md +129 -124
  20. package/assets/skills/submission-packaging/SKILL.md +33 -27
  21. package/assets/skills/verification-gates/SKILL.md +44 -14
  22. package/assets/slopmachine/backend-evaluation-prompt.md +1 -1
  23. package/assets/slopmachine/frontend-evaluation-prompt.md +5 -5
  24. package/assets/slopmachine/scaffold-playbooks/android-kotlin-compose.md +81 -0
  25. package/assets/slopmachine/scaffold-playbooks/android-kotlin-views.md +191 -0
  26. package/assets/slopmachine/scaffold-playbooks/android-native-java.md +203 -0
  27. package/assets/slopmachine/scaffold-playbooks/angular-default.md +181 -0
  28. package/assets/slopmachine/scaffold-playbooks/backend-baseline.md +142 -0
  29. package/assets/slopmachine/scaffold-playbooks/backend-family-matrix.md +80 -0
  30. package/assets/slopmachine/scaffold-playbooks/database-module-matrix.md +80 -0
  31. package/assets/slopmachine/scaffold-playbooks/django-default.md +166 -0
  32. package/assets/slopmachine/scaffold-playbooks/docker-baseline.md +189 -0
  33. package/assets/slopmachine/scaffold-playbooks/docker-shared-contract.md +334 -0
  34. package/assets/slopmachine/scaffold-playbooks/electron-vite-default.md +124 -0
  35. package/assets/slopmachine/scaffold-playbooks/expo-react-native-default.md +73 -0
  36. package/assets/slopmachine/scaffold-playbooks/fastapi-default.md +134 -0
  37. package/assets/slopmachine/scaffold-playbooks/frontend-baseline.md +160 -0
  38. package/assets/slopmachine/scaffold-playbooks/frontend-family-matrix.md +134 -0
  39. package/assets/slopmachine/scaffold-playbooks/generic-unknown-tech-guide.md +136 -0
  40. package/assets/slopmachine/scaffold-playbooks/go-chi-default.md +160 -0
  41. package/assets/slopmachine/scaffold-playbooks/ios-linux-portable.md +93 -0
  42. package/assets/slopmachine/scaffold-playbooks/ios-native-objective-c.md +151 -0
  43. package/assets/slopmachine/scaffold-playbooks/ios-native-swift.md +188 -0
  44. package/assets/slopmachine/scaffold-playbooks/laravel-default.md +216 -0
  45. package/assets/slopmachine/scaffold-playbooks/livewire-default.md +265 -0
  46. package/assets/slopmachine/scaffold-playbooks/overlay-module-matrix.md +130 -0
  47. package/assets/slopmachine/scaffold-playbooks/platform-family-matrix.md +79 -0
  48. package/assets/slopmachine/scaffold-playbooks/selection-matrix.md +72 -0
  49. package/assets/slopmachine/scaffold-playbooks/spring-boot-default.md +182 -0
  50. package/assets/slopmachine/scaffold-playbooks/tauri-default.md +80 -0
  51. package/assets/slopmachine/scaffold-playbooks/vue-vite-default.md +162 -0
  52. package/assets/slopmachine/scaffold-playbooks/web-default.md +96 -0
  53. package/assets/slopmachine/templates/AGENTS.md +41 -3
  54. package/assets/slopmachine/templates/CLAUDE.md +111 -0
  55. package/assets/slopmachine/utils/claude_create_session.mjs +1 -0
  56. package/assets/slopmachine/utils/claude_live_channel.mjs +188 -0
  57. package/assets/slopmachine/utils/claude_live_common.mjs +406 -0
  58. package/assets/slopmachine/utils/claude_live_hook.py +47 -0
  59. package/assets/slopmachine/utils/claude_live_launch.mjs +181 -0
  60. package/assets/slopmachine/utils/claude_live_status.mjs +25 -0
  61. package/assets/slopmachine/utils/claude_live_stop.mjs +45 -0
  62. package/assets/slopmachine/utils/claude_live_turn.mjs +250 -0
  63. package/assets/slopmachine/utils/claude_resume_session.mjs +1 -0
  64. package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.mjs +23 -0
  65. package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.sh +5 -0
  66. package/assets/slopmachine/utils/claude_worker_common.mjs +224 -4
  67. package/assets/slopmachine/utils/cleanup_delivery_artifacts.py +4 -0
  68. package/assets/slopmachine/utils/export_ai_session.mjs +1 -1
  69. package/assets/slopmachine/utils/normalize_claude_session.py +153 -0
  70. package/assets/slopmachine/utils/package_claude_session.mjs +96 -0
  71. package/assets/slopmachine/utils/prepare_strict_audit_workspace.mjs +65 -0
  72. package/package.json +1 -1
  73. package/src/constants.js +42 -3
  74. package/src/init.js +173 -28
  75. package/src/install.js +75 -0
  76. package/src/send-data.js +56 -57
@@ -1,91 +1,236 @@
  ---
  name: claude-worker-management
- description: Launch, resume, and persist the Claude CLI developer worker session used by slopmachine-claude.
+ description: Launch, persist, and message the live Claude developer lane used by slopmachine-claude.
  ---

  # Claude Worker Management

- Use this skill whenever `slopmachine-claude` needs to create, resume, or message the persistent Claude developer worker.
+ Use this skill whenever `slopmachine-claude` needs to launch, inspect, or message the persistent live Claude developer lane.

  ## Purpose

  - keep the Claude developer worker as a large complete conversation per bounded developer slot
  - avoid losing worker context by accidentally creating fresh sessions for ordinary follow-up turns
  - make session persistence and response capture deterministic
+ - make OpenCode talk to a live Claude TUI through a bridge instead of non-interactive resume calls

  ## Core rules

  - the Claude worker must be invoked by the installed Claude agent name `developer`
  - do not use the OpenCode `developer` subagent for implementation work in the `slopmachine-claude` path
  - do not read Claude transcript files as the normal communication channel
- - communicate with the Claude worker through the packaged wrapper scripts in `~/slopmachine/utils/`
- - treat raw Claude stdout and stderr as trace artifacts written to files, not as owner-session context
- - treat the wrapper `result-file` as the semantic source of truth in normal owner flow
- - treat terminal stdout from the wrapper as only a tiny pointer or status channel
- - always capture the session id and normalized result from the `result-file`
- - always re-pass `--agent developer` on every call, even when resuming an existing session
- - always constrain Claude to a single-session developer lane by limiting tools to `Read Write Edit Bash Glob Grep`
- - do not allow Claude internal agent fan-out in the normal developer path
- - use `--dangerously-skip-permissions` in the wrapper path so the worker does not stall on routine file-edit permission prompts inside the bounded repo
-
- ## Session creation rule
+ - communicate with the Claude worker through the packaged live bridge scripts in `~/slopmachine/utils/`
+ - use `claude_live_launch.mjs` once per lane and `claude_live_turn.mjs` for each owner message into that lane
+ - set the Claude live runtime settings default `agent` to `developer` so the lane stays on the intended system prompt even if the session is resumed or inspected through Claude-native controls
+ - treat bridge `state.json` as the durable control-plane truth for lane status, routing, and Claude session identity
+ - treat bridge `result.json` as the semantic source of truth after each completed turn
+ - treat terminal stdout from bridge scripts as only a tiny pointer or status channel
+ - always capture the session id from the launched bridge state and the normalized turn result from bridge `result.json`
+ - always constrain Claude to a single-session developer lane even when it uses internal Claude task fan-out
+ - allow Claude internal task fan-out inside that one continuous live session when it reduces serial churn cleanly
+ - encourage Claude to parallelize independent search, reading, verification, and bounded implementation subtasks through internal task fan-out when that reduces serial churn cleanly
+ - launch the live lane with `--dangerously-skip-permissions` so the worker does not stall on routine file-edit permission prompts inside the bounded repo
+ - when Claude uses internal task fan-out and the environment allows explicit agent selection, prefer the installed `developer` agent for implementation-capable branches so the same engineering standard applies across those branches
+ - there is no repo-controlled guarantee that every Claude helper subagent globally reuses the `developer` prompt, so keep critical implementation in the main developer lane or in explicitly developer-scoped helper branches rather than relying on unspecified built-in helper behavior
+ - make every owner-to-Claude turn boundary-controlled, reviewable, and explicit about what must happen now versus later
+ - do not send vague owner prompts such as `continue`, `keep going`, `handle the rest`, or `fix it` without a precise bounded contract
+ - each substantive owner message should state the current engineering boundary, exact expected outcomes for that turn, the evidence required back, the important shortcuts that are not acceptable, and the stopping point
+ - default to one bounded engineering objective per owner turn; if a request would naturally cross planning, scaffold, development, or gate-review boundaries, split it into separate turns
+
+ ## Lane launch rule

  For a new bounded developer session slot:

- 1. run Claude in print mode with the installed `developer` agent
- 2. capture the returned `session_id`
+ 1. launch one live Claude TUI lane inside `tmux`
+ 2. wait for bridge registration to capture the Claude `session_id`
  3. store it in `../.ai/metadata.json`
  4. mirror it in tracker comments using `SESSION:`
- 5. keep using that same session for all later turns in the same bounded slot
+ 5. keep using that same live lane for all later turns in the same bounded slot

- Preferred creation pattern:
+ Preferred launch pattern:

  ```bash
- node ~/slopmachine/utils/claude_create_session.mjs --cwd "$PWD" --prompt-file <file> --raw-output <file> --raw-error <file> --state-file <file> --result-file <file>
+ node ~/slopmachine/utils/claude_live_launch.mjs --cwd "$PWD" --lane <lane> --runtime-dir <dir>
  ```

+ The launch implementation must pass Claude `--dangerously-skip-permissions` in the live TUI command path.
+
  When the owner invokes this through the OpenCode Bash tool, use a long-running timeout suitable for real developer work.

  Default:

- - Claude create and resume worker turns should use a Bash timeout of at least `3600000` ms (1 hour)
- - do not use ordinary short Bash timeouts for Claude worker turns
+ - Claude launch and turn bridge operations should not use ordinary short Bash timeouts
+ - when automatic rate-limit waiting is enabled, prefer no outer timeout at all for live Claude worker turns; if the host wrapper forces a timeout value, it must exceed the possible reset wait plus buffer rather than using a generic 1 hour cap
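The timeout guidance in the added lines above amounts to simple arithmetic. A minimal Python sketch of it follows, assuming the owner already knows an upper bound on the rate-limit reset wait. The function name, the parameter names, and the 10 minute default buffer are hypothetical and do not come from the package.

```python
from typing import Optional


def outer_timeout_ms(
    max_reset_wait_ms: Optional[int],
    buffer_ms: int = 10 * 60 * 1000,  # hypothetical default buffer, not from the package
    host_forces_timeout: bool = True,
) -> Optional[int]:
    """Choose the outer Bash timeout for one live worker turn.

    Returns None when the host does not force a timeout, which matches
    the stated preference for no outer timeout at all.
    """
    if not host_forces_timeout:
        return None
    if max_reset_wait_ms is None:
        raise ValueError("a forced timeout needs a known upper bound on the reset wait")
    # Size the timeout from the possible reset wait plus a buffer,
    # instead of falling back to a generic one hour cap.
    return max_reset_wait_ms + buffer_ms
```

With a possible five hour reset wait, the sketch yields a timeout of five hours and ten minutes. When the host does not force a timeout, it returns `None`.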

  Do not pre-generate a UUID unless there is a strong reason to do so.
- The default pattern is to let Claude create the session and then persist the returned `session_id`.
+ The default pattern is to let the live lane start normally and then persist the `session_id` captured by bridge registration.

- ## Resume rule
+ ## Turn rule

  For all later turns in the same bounded developer slot:

  ```bash
- node ~/slopmachine/utils/claude_resume_session.mjs --cwd "$PWD" --session-id <session_id> --prompt-file <file> --raw-output <file> --raw-error <file> --state-file <file> --result-file <file>
+ node ~/slopmachine/utils/claude_live_turn.mjs --runtime-dir <dir> --prompt-file <file> --timeout-ms <turn-timeout>
  ```

- - use `--resume` inside the wrapper implementation, not `-r`
- - when calling the resume wrapper from the owner session, treat it as a long-running operation and keep the Bash timeout at or above `3600000` ms
- - do not reuse `--session-id` after creation
- - if resume fails, stop and recover explicitly instead of silently creating a new worker
+ - inject exactly one owner message at a time into the idle live lane
+ - wait for `Stop` or `StopFailure` before sending the next message
+ - do not bypass the bridge by calling the channel HTTP endpoint directly from owner logic
+ - if turn execution fails, stop and recover explicitly instead of silently creating a new worker
+
+ ## Turn-preflight checklist
+
+ Before sending any owner message into the live lane:
+
+ 1. read bridge `state.json` and confirm the lane is the intended lane and currently `idle`
+ 2. read the latest bridge `result.json` when it exists and review the last normalized Claude answer before composing the next turn
+ 3. decide the prompt kind explicitly, such as `planning-start`, `planning-revision`, `scaffold-start`, `scaffold-review`, `development-slice`, `development-correction`, `bugfix-orientation`, `bugfix-fix`, `resume`, or `recovery`
+ 4. gather only the minimum accepted-plan sections, clarified requirements, boundary summary, and fresh deltas needed for this turn
+ 5. define the turn contract before writing the prompt: what Claude must produce now, what evidence it must return now, and exactly where it must stop
+
+ If the stop boundary is fuzzy, the turn is too broad.
+ If the owner prompt would span multiple major boundaries, split it.
+ Do not send the next turn until the prior turn has been reviewed and either accepted, corrected, or explicitly rerouted.
+
+ ## Canonical owner-message contract
+
+ For substantive live-lane turns, write the owner message in natural engineering language but make sure it includes all of these ingredients:
+
+ - `Context snapshot`: the current accepted state and only the fresh deltas that matter now
+ - `Contract anchor`: the relevant accepted plan sections, clarified decisions, or concrete evaluator findings that define the work
+ - `This turn only`: the bounded deliverable for this turn and whether this is planning-only, scaffold-only, coding allowed, or correction-only
+ - `Expected outcomes now`: the exact behaviors, artifacts, or fixes that must exist before this turn can be considered successful
+ - `Evidence required now`: the exact verification, file updates, or summaries Claude must return for owner review
+ - `Disallowed shortcuts now`: future-work deferrals, placeholder implementations, bypassed auth/validation, fake verification, mixed-boundary drift, or other shortcuts that would make the result misleading
+ - `Stop boundary`: what Claude should stop after producing, and what it must not start yet
+ - `Reply contract`: request the exact changed files, exact verification commands and results, and only the real remaining risks or blockers
+
+ When the turn intentionally uses internal parallel fan-out, also include:
+
+ - `Branch map`: the 2 or 3 independent branches, their boundaries, and their expected outputs
+ - `Shared constraints`: the contracts or files that must stay aligned across branches
+ - `Fan-in rule`: how Claude should merge the branch results and what integrated verification must run before stopping
+
+ Keep the wording natural. Do not turn every prompt into a rigid template dump.
+ But do make the contract mechanically obvious enough that Claude cannot plausibly misunderstand what acceptance depends on.
+
+ ## Canonical prompt shapes
+
+ ### Planning-start shape
+
+ For the second owner message in the first `develop` lane and for other explicit planning-entry turns:
+
+ - inline the approved clarification content and requirements-ambiguity resolutions directly in the message
+ - include the owner's initial planning view so Claude refines a direction instead of inventing one from zero
+ - restate prompt-critical requirements, actors, required surfaces, locked defaults, explicit non-goals, and risky areas in plain engineering language
+ - say clearly that the worker should produce an exhaustive, section-addressable implementation plan and must not start coding yet
+ - require dense planning artifacts, especially `../docs/design.md`, with explicit treatment of modules, business rules, state machines, permissions, validation, verification strategy, checkpoints, and definition of done when applicable
+ - require a concise changed-files summary with the planning response
+
+ ### Planning-revision shape
+
+ When a planning draft is not good enough:
+
+ - point to the exact plan sections or requirement areas that are weak or incomplete
+ - state the exact missing detail or unacceptable vagueness that must be corrected now
+ - keep the turn planning-only; do not let the worker start coding as a compensation move
+ - require the revised planning artifacts plus a short summary of what changed and what is still explicitly unresolved
+
+ ### Scaffold-start shape
+
+ When entering scaffold work:
+
+ - cite the relevant accepted design sections and the intended baseline runtime/test/config contract
+ - state that the turn is scaffold-only and name the exact baseline surfaces expected now, such as app shell, routing skeleton, persistence skeleton, config wiring, logging path, validation path, auth foundation, test harness, or README baseline when they apply
+ - state explicitly which feature work must not begin yet
+ - require exact local verification evidence for the scaffold baseline and exact changed files
+ - say to stop after the scaffold baseline is complete and verified
+
+ ### Development-slice shape
+
+ For ordinary implementation turns:
+
+ - anchor the request to the relevant accepted plan sections and current boundary summary
+ - name the exact slice, user/admin actor path, modules, or surfaces to complete now
+ - itemize the expected outcomes for happy path, failure path, and auth/ownership/validation behavior when those dimensions matter
+ - require targeted local verification tied back to those expected outcomes
+ - explicitly prohibit owner-only broad verification commands and unrelated follow-on work
+ - when the slice can truly be parallelized, name the separate branch contracts explicitly instead of asking Claude to infer them
+ - say to stop after this slice and report the exact changed files plus exact verification results
+
+ ### Development-correction shape
+
+ When the worker partially missed the slice or crossed boundaries:
+
+ - quote the exact missing outcome, regression risk, or evidence gap
+ - ask for a correction-only turn focused on those gaps
+ - require fresh verification evidence for the corrected surface
+ - do not mix new feature asks into the correction turn
+
+ ### Resume shape
+
+ When resuming a long-lived lane:
+
+ - start from the stored boundary summary and the relevant accepted plan sections instead of replaying broad history
+ - include only the new delta since the last accepted state
+ - restate the current bounded task, evidence required, and stop boundary
+ - do not re-dump the entire project or workflow unless continuity is genuinely broken
+
+ ### Bugfix issue-turn shape
+
+ For evaluator-driven remediation inside a `bugfix-N` session opened by a `partial pass` audit:
+
+ - lead with the concrete evaluator finding or owner-reviewed issue statement
+ - state the expected fix and the affected non-regression surfaces
+ - require proof for the issue path plus the nearby happy path and security/ownership boundary when relevant
+ - say to stop after the named issue set rather than reopening unrelated refactors
+
+ ## Turn anti-patterns
+
+ Do not do these:
+
+ - send `continue`, `next`, or `keep going` as a substantive owner prompt
+ - ask for planning and implementation in the same turn unless that mixed boundary is intentional and explicitly stated
+ - ask for multiple gate exits in one turn
+ - let Claude decide its own stopping point implicitly
+ - pass parent-directory file paths as hidden instructions instead of restating the needed content directly
+ - paste raw bridge state, raw transcript payloads, or workflow bookkeeping into normal developer prompts
+ - respond to a weak result by broadening the next prompt instead of correcting the specific gap
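The first anti-pattern can be checked mechanically before a prompt file is handed to the bridge. The following Python sketch is one possible owner-side guard, not part of the package. The helper name `is_vague_owner_prompt` is hypothetical, and the phrase set is simply the union of the examples named in the added skill text.

```python
import re

# Phrases the skill text names as unacceptable when they make up the whole prompt.
VAGUE_PROMPTS = {"continue", "next", "keep going", "handle the rest", "fix it"}


def is_vague_owner_prompt(prompt: str) -> bool:
    """Return True when the prompt is nothing more than one of the vague phrases.

    Case, punctuation, and extra whitespace are ignored. A longer prompt
    that merely contains one of the phrases is not flagged.
    """
    letters_only = re.sub(r"[^a-z ]", "", prompt.strip().lower())
    normalized = " ".join(letters_only.split())
    return normalized in VAGUE_PROMPTS
```

The guard only catches the degenerate case. Whether a longer prompt carries a precise bounded contract still needs owner judgment.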
+
+ ## Status rule
+
+ When owner logic needs to inspect the lane without sending a new message:
+
+ ```bash
+ node ~/slopmachine/utils/claude_live_status.mjs --runtime-dir <dir>
+ ```
+
+ Use `state.json` plus `claude_live_status.mjs` to determine whether the lane is:
+
+ - `idle`
+ - `running`
+ - `blocked`
+ - `failed`
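The first preflight step (confirm the intended lane and that it is `idle`) could be scripted on the owner side roughly as below. This is a sketch under stated assumptions: the diff does not show the `state.json` schema, so the `status` and `lane` keys and the file's location directly inside the runtime directory are guesses, and `lane_ready_for_turn` is a hypothetical helper name.

```python
import json
from pathlib import Path

# The four lane statuses listed in the skill text.
LANE_STATUSES = {"idle", "running", "blocked", "failed"}


def lane_ready_for_turn(runtime_dir: str, expected_lane: str) -> bool:
    """Return True only when state.json describes the expected lane and it is idle.

    Assumes state.json sits in runtime_dir and exposes "status" and "lane"
    keys. Both are illustrative guesses, not the documented bridge schema.
    """
    state_path = Path(runtime_dir) / "state.json"
    state = json.loads(state_path.read_text(encoding="utf-8"))
    status = state.get("status")
    if status not in LANE_STATUSES:
        raise ValueError(f"unexpected lane status: {status!r}")
    return state.get("lane") == expected_lane and status == "idle"
```

Raising on an unknown status, rather than treating it as not ready, keeps a schema mismatch visible instead of silently stalling the owner loop.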

  ## Result capture rule

- The wrapper scripts should pipe the raw Claude JSON output to file, parse it after process exit, and persist a normalized `result-file` plus a live `state-file`.
+ The live bridge should persist a normalized turn `result.json` plus a durable lane `state.json`.

- Use the `result-file` fields only:
+ Use the turn result fields only:

  - `sid`
  - `res`

  Monitoring files should include at least:

- - a live `state-file` showing running/completed/failed state, pid, byte counts, timestamps, and exit code
- - a final `result-file` containing the normalized success or failure object
+ - a live `state.json` showing lane status, Claude session id, tmux session id, transcript pointer, and current turn state
+ - a final `result.json` containing the normalized success or failure object for the latest completed turn
+ - `hook-events.jsonl` as the live outward event feed

  Treat `res` as the worker's answer.
- Do not feed raw Claude JSON into the owner session.
  Do not rely on transcript scraping for normal turn-to-turn orchestration.
- Do not rely on Bash stdout alone when the wrapper state or result files provide a clearer source of truth.
- Read `result-file` after process completion before deciding the next owner turn.
+ Do not rely on Bash stdout alone when bridge state or result files provide a clearer source of truth.
+ Read bridge `result.json` after turn completion before deciding the next owner turn.
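Reading only the two documented result fields might look like the Python sketch below. The `sid` and `res` field names come from the skill text. The location of `result.json` directly inside the runtime directory, and the `TurnResult` and `read_turn_result` names, are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import NamedTuple


class TurnResult(NamedTuple):
    sid: str  # Claude session id reported for the turn
    res: str  # the worker's normalized answer


def read_turn_result(runtime_dir: str) -> TurnResult:
    """Load result.json and keep only the `sid` and `res` fields.

    Assumes the file sits directly inside runtime_dir. Any other keys in
    the file are ignored on purpose.
    """
    result_path = Path(runtime_dir) / "result.json"
    data = json.loads(result_path.read_text(encoding="utf-8"))
    missing = [key for key in ("sid", "res") if key not in data]
    if missing:
        raise KeyError(f"result.json is missing fields: {missing}")
    return TurnResult(sid=data["sid"], res=data["res"])
```

Whether a failure object always carries both fields is not stated in the diff, so the sketch raises rather than guessing a default.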

  ## Developer-slot continuity

@@ -95,7 +240,8 @@ The purpose of this backend is to preserve one large complete conversation per b
  - the `bugfix` slot should stay one continuous Claude session unless irrecoverable failure forces replacement
  - do not start a fresh Claude worker for every slice, clarification, or review loop
  - do not roll sessions casually just because the conversation is long
- - do not let the Claude worker create its own internal sub-agents for routine planning, scaffold, or implementation work
+ - internal Claude task sub-agents are allowed inside the same developer session when they help parallelize independent bounded work cleanly
+ - prefer task fan-out for parallel discovery, repo reading, comparison, or verification passes when those branches can be merged back without ambiguity

  ## First-session handshakes

@@ -103,12 +249,12 @@ The purpose of this backend is to preserve one large complete conversation per b

  When the first `develop` slot begins in planning:

- 1. create the Claude developer session with:
-   - the original prompt plus a plain instruction to read it carefully, not plan yet, and wait for clarifications and planning direction
- 2. wait for the first response and store the returned Claude session id from wrapper field `sid`
- 3. form an initial owner planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
- 4. resume the same session and send a compact second owner message that directly includes the approved clarification content, the requirements-ambiguity resolutions, that initial owner planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for the implementation plan plus major risks or assumptions
- 5. continue the planning conversation in that same Claude session
+ 1. launch the live `develop` lane if it is not already running
+ 2. send the original prompt plus a plain instruction to read it carefully, not plan yet, and wait for clarifications and planning direction through the bridge
+ 3. store the Claude session id from bridge `state.json`
+ 4. form an initial owner planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
+ 5. send a compact second owner message through the same live lane that directly includes the approved clarification content, the requirements-ambiguity resolutions, that initial owner planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for the implementation plan plus major risks or assumptions
+ 6. continue the planning conversation in that same Claude session

  Do not merge those two first messages.
  Do not ask for a plan in the first message.
@@ -118,8 +264,10 @@ Preferred second owner message shape:
  - inline the approved clarification content and the requirements-ambiguity resolutions directly in the owner message
  - include the owner's initial planning view so planning is refined collaboratively rather than invented from zero
  - add any short delta notes that are not already captured in that inlined summary
- - express the current boundary in plain engineering language and then ask for the implementation plan plus major risks or assumptions
+ - express the current boundary in plain engineering language and then ask for an exhaustive, section-addressable implementation plan plus major risks or assumptions
+ - require the plan to fill the planning artifacts densely, especially `../docs/design.md`, with explicit sections for actors, success paths, modules, business rules, state machines, permissions, validation, test strategy, checkpoints, and definition of done when those dimensions matter
  - ask for repo-local planning artifacts plus a concise changed-files summary
+ - say explicitly that coding must not start yet and that the response should stop after the planning artifacts and summary are complete

  Do not tell the developer worker to read files outside `repo/`.
  If owner-side artifacts outside `repo/` matter, restate their content directly in the owner message instead of passing file paths.
@@ -127,13 +275,13 @@ Do not mention session names, slot labels, or workflow phase labels to the devel

  ### `bugfix-N` orientation handshake

- When `P7` begins and the workflow opens the remediation lane:
+ When a fresh `partial pass` evaluation result opens the next remediation lane:

- 1. create a fresh Claude developer session for the next `bugfix-N` label
+ 1. launch a fresh live Claude developer lane for the next `bugfix-N` label
  2. use the first owner message only to orient that session to the repo and the current delivered state
  3. make clear in plain engineering language that follow-up work will be focused remediation against evaluator findings
- 4. wait for the first response and store the returned Claude session id from wrapper field `sid`
- 5. only after that orientation exchange, resume the same `bugfix-N` session with the first evaluator-driven issue list
+ 4. wait for the first response and store the Claude session id from bridge `state.json`
+ 5. only after that orientation exchange, continue the same `bugfix-N` live lane with the first evaluator-driven issue list

  The orientation message should:

@@ -142,12 +290,24 @@ The orientation message should:
  - state that incoming work will be a sequence of concrete issue-fix requests against evaluator findings
  - avoid mentioning workflow internals, phase labels, or session-lane labels

+ ## Between-turn owner review rule
+
+ After each meaningful Claude response and before the next owner turn:
+
+ 1. review the normalized bridge `result.json`
+ 2. decide whether the result was accepted, needs correction, or crossed a boundary that must be rolled back in the next prompt
+ 3. update metadata and boundary summary only after that review decision
+ 4. compose the next turn as a deliberate correction, continuation, or new bounded objective rather than a vague nudge
+
+ If Claude starts coding during a planning-only turn, treat that as a boundary violation and correct it explicitly.
+ If Claude continues into extra work beyond the requested stop boundary, do not silently accept the spillover; review the requested boundary first and then decide whether any spillover is acceptable.
+
  ## Metadata expectations

  The active developer session record should include at least:

  - `lane`
- - `backend: "claude"`
+ - `backend: "claude-live"`
  - `session_id`
  - `label`
  - `status`
@@ -157,26 +317,89 @@ Recommended additional fields when useful:
157
317
 
158
318
  - `agent_name: "developer"`
159
319
  - `created_phase`
160
- - `trace_dir`
320
+ - `runtime_dir`
321
+ - `tmux_session`
322
+ - `transcript_path`
323
+ - `opened_from_audit_number`
161
324
  - `last_result_summary`
162
- - `last_resumed_at`
325
+ - `last_turn_at`
326
+
+ ## Owner state-sync rule
+
+ Bridge lane state is the authoritative transport state for Claude-backed developer work.
+
+ After each meaningful bridge action, immediately read bridge `state.json` and mirror the important fields into `../.ai/metadata.json`, `../metadata.json`, and Beads comments before advancing workflow state.
+
+ ### After lane launch
+
+ - read bridge `state.json`
+ - set or confirm:
+   - `current_developer_lane`
+   - `active_developer_session_id`
+ - create or update the active `developer_sessions[]` record with:
+   - `lane`
+   - `sequence`
+   - `label`
+   - `backend: "claude-live"`
+   - `agent_name: "developer"`
+   - `created_phase`
+   - `session_id`
+   - `status`
+   - `runtime_dir`
+   - `tmux_session`
+   - `transcript_path`
+   - `opened_from_audit_number` when the session was opened from a `partial pass` audit
+   - `orientation_completed: false`
+ - mirror `session_id` into `../metadata.json` as `session_id`
+ - record the session in Beads using `SESSION:`
+
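As a rough illustration of the lane-launch record listed above, the sketch below assembles a `developer_sessions[]` entry from a parsed `state.json`. It assumes, without confirmation from the source, that `state.json` exposes keys named `session_id`, `status`, `runtime_dir`, `tmux_session`, and `transcript_path`. The function name `build_session_record` is hypothetical.

```python
from typing import Optional


def build_session_record(state: dict, lane: str, sequence: int, label: str,
                         created_phase: str,
                         opened_from_audit_number: Optional[int] = None) -> dict:
    """Build the active developer session record after a lane launch."""
    record = {
        "lane": lane,
        "sequence": sequence,
        "label": label,
        "backend": "claude-live",
        "agent_name": "developer",
        "created_phase": created_phase,
        # Assumed state.json key names; adjust to the real bridge schema.
        "session_id": state["session_id"],
        "status": state["status"],
        "runtime_dir": state["runtime_dir"],
        "tmux_session": state["tmux_session"],
        "transcript_path": state["transcript_path"],
        # A freshly launched lane has not completed orientation yet.
        "orientation_completed": False,
    }
    # Only recorded when the session was opened from a `partial pass` audit.
    if opened_from_audit_number is not None:
        record["opened_from_audit_number"] = opened_from_audit_number
    return record
```

Reading the values with `state[...]` rather than `state.get(...)` makes a missing pointer fail loudly, which fits the rule that workflow state should not advance until bridge files and metadata agree.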
+ ### After each successful turn
+
+ - read bridge `state.json` and bridge `result.json`
+ - update the active `developer_sessions[]` record with:
+   - `status: "idle"`
+   - `session_id`
+   - `transcript_path`
+   - `last_result_summary`
+   - `last_turn_at`
+ - if the first orientation or first planning handshake completed, set `orientation_completed: true`
+ - keep `active_developer_session_id` and `current_developer_lane` aligned with that same active session
+
+ ### After a blocked or failed turn
+
+ - read bridge `state.json` and bridge `result.json`
+ - preserve the same tracked Claude session id and runtime pointers
+ - update the active `developer_sessions[]` record status to match the real workflow meaning, such as:
+   - `rate_limited` for bridge `blocked` / `claude_usage_limit`
+   - `failed` for bridge `failed`
+ - update `last_result_summary` and `last_turn_at` when there is meaningful result text
+ - update Beads comments so the pause or failure is auditable without reading bridge artifacts directly
+
+ Do not advance the workflow based only on Bash success if bridge files and metadata are not yet aligned.
+
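The blocked-or-failed mapping above might look like the following minimal sketch. The status strings `blocked`, `failed`, `claude_usage_limit`, and `rate_limited` come from the text. The function names and the way the bridge status and reason are passed in are assumptions, since the actual `state.json` and `result.json` schemas are not shown here.

```python
from typing import Optional


def status_for_unsuccessful_turn(bridge_status: str,
                                 reason: Optional[str] = None) -> str:
    """Translate a bridge outcome into the tracked session status."""
    if bridge_status == "blocked" and reason == "claude_usage_limit":
        return "rate_limited"
    if bridge_status == "failed":
        return "failed"
    # Anything unmapped should stop the owner rather than be guessed at.
    raise ValueError(f"unmapped bridge outcome: {bridge_status!r} / {reason!r}")


def record_unsuccessful_turn(record: dict, bridge_status: str,
                             reason: Optional[str] = None,
                             summary: Optional[str] = None,
                             turn_at: Optional[str] = None) -> dict:
    """Return an updated copy of the session record after a bad turn."""
    # Copying keeps session_id and runtime pointers exactly as tracked.
    updated = dict(record)
    updated["status"] = status_for_unsuccessful_turn(bridge_status, reason)
    # Summary and timestamp change only when there is meaningful result text.
    if summary:
        updated["last_result_summary"] = summary
        if turn_at:
            updated["last_turn_at"] = turn_at
    return updated
```

Raising on an unmapped outcome is a design choice in this sketch: it keeps the owner from advancing the workflow on a status it does not understand.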
+ ## Owner-controlled lane rule
+
+ - treat a bridge-managed Claude lane as owner-controlled during ordinary operation
+ - do not manually type into the managed Claude TUI or send ad hoc prompts outside the bridge during the workflow
+ - if manual recovery or debugging ever happens in that TUI, record it clearly and resync metadata from bridge state and hook evidence before continuing normal workflow
 
  ## Failure handling
 
- - if Claude CLI returns a parseable result with a session id, persist it immediately
- - if Claude CLI returns malformed output, treat that as a worker communication failure and stop to recover it cleanly
- - if the saved session id cannot be resumed, do not silently create a replacement session unless the workflow explicitly chooses a controlled replacement
+ - if bridge launch captures a Claude session id, persist it immediately
+ - if the bridge reports `failed`, treat that as a worker communication failure and recover it cleanly
+ - if the bridge reports `blocked` because of `claude_usage_limit`, treat that as an automatic wait-and-resume path rather than a handoff-stop condition unless the wait or resume path itself fails
+ - if the saved live lane cannot continue, do not silently create a replacement session unless the workflow explicitly chooses a controlled replacement
  - if a replacement session is required, record the handoff clearly in metadata and tracker comments
- - write raw stdout and stderr to trace files for debugging, but do not surface those raw files back into normal owner prompts unless debugging is explicitly needed
+ - keep hook logs and transcript pointers for debugging, but do not surface raw bridge artifacts back into normal owner prompts unless debugging is explicitly needed
 
  ## Rate-limit handling
 
- - if Claude returns a usage-limit or capacity-exhaustion result for the active developer session, do not take over implementation work in the owner session
+ - if the bridge returns `claude_usage_limit` or the live lane becomes capacity-blocked, do not take over implementation work in the owner session
  - mark the active developer session status as `rate_limited`
  - preserve the same Claude session id as the active tracked developer session
- - update `../.ai/metadata.json` and Beads `SESSION:` or `HANDOFF:` comments to record the rate-limit pause clearly
- - set workflow state to await user resume rather than creating owner-side implementation fallback work
- - when the user later resumes the run, continue from the same Claude developer session if it is resumable
+ - use the packaged `~/slopmachine/utils/claude_wait_for_rate_limit_reset.sh` helper or the built-in turn retry path to wait until the reset time specified by Claude, then continue from the same live lane
+ - update `../.ai/metadata.json` and Beads `SESSION:` or `HANDOFF:` comments to record the blocked state, wait window, and resumed continuity clearly
+ - only surface the situation to the user if the reset time cannot be determined or the wait or resume path itself fails
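The wait-or-surface decision in the new rate-limit rules can be illustrated with a small sketch. It does not reproduce the packaged shell helper, whose interface is not shown here. `plan_rate_limit_wait` and its return values are hypothetical names, and the sketch assumes the reset time has already been parsed into a timezone-aware `datetime` (or `None` when it could not be determined).

```python
from datetime import datetime
from typing import Optional, Tuple


def plan_rate_limit_wait(reset_at: Optional[datetime],
                         now: datetime) -> Tuple[str, Optional[float]]:
    """Decide whether to wait for the reset or surface the block to the user."""
    # With no known reset time there is nothing safe to wait for.
    if reset_at is None:
        return ("surface_to_user", None)
    # Never return a negative wait if the reset time has already passed.
    seconds = max(0.0, (reset_at - now).total_seconds())
    return ("wait", seconds)
```

After the wait elapses, the owner would continue from the same live lane and record the wait window in metadata and Beads comments, as the bullets above require.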
 
  ## Worker prompt discipline