theslopmachine 0.6.2 → 0.7.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (77)
  1. package/MANUAL.md +21 -6
  2. package/README.md +55 -7
  3. package/RELEASE.md +16 -1
  4. package/assets/agents/developer.md +41 -1
  5. package/assets/agents/slopmachine-claude.md +101 -60
  6. package/assets/agents/slopmachine.md +40 -17
  7. package/assets/claude/agents/developer.md +42 -5
  8. package/assets/skills/clarification-gate/SKILL.md +25 -5
  9. package/assets/skills/claude-worker-management/SKILL.md +290 -57
  10. package/assets/skills/developer-session-lifecycle/SKILL.md +83 -38
  11. package/assets/skills/development-guidance/SKILL.md +21 -1
  12. package/assets/skills/evaluation-triage/SKILL.md +34 -23
  13. package/assets/skills/final-evaluation-orchestration/SKILL.md +88 -50
  14. package/assets/skills/hardening-gate/SKILL.md +17 -3
  15. package/assets/skills/integrated-verification/SKILL.md +3 -3
  16. package/assets/skills/planning-gate/SKILL.md +32 -3
  17. package/assets/skills/planning-guidance/SKILL.md +72 -13
  18. package/assets/skills/retrospective-analysis/SKILL.md +2 -2
  19. package/assets/skills/scaffold-guidance/SKILL.md +129 -124
  20. package/assets/skills/submission-packaging/SKILL.md +33 -27
  21. package/assets/skills/verification-gates/SKILL.md +44 -14
  22. package/assets/slopmachine/backend-evaluation-prompt.md +1 -1
  23. package/assets/slopmachine/frontend-evaluation-prompt.md +5 -5
  24. package/assets/slopmachine/scaffold-playbooks/android-kotlin-compose.md +81 -0
  25. package/assets/slopmachine/scaffold-playbooks/android-kotlin-views.md +191 -0
  26. package/assets/slopmachine/scaffold-playbooks/android-native-java.md +203 -0
  27. package/assets/slopmachine/scaffold-playbooks/angular-default.md +181 -0
  28. package/assets/slopmachine/scaffold-playbooks/backend-baseline.md +142 -0
  29. package/assets/slopmachine/scaffold-playbooks/backend-family-matrix.md +80 -0
  30. package/assets/slopmachine/scaffold-playbooks/database-module-matrix.md +80 -0
  31. package/assets/slopmachine/scaffold-playbooks/django-default.md +166 -0
  32. package/assets/slopmachine/scaffold-playbooks/docker-baseline.md +189 -0
  33. package/assets/slopmachine/scaffold-playbooks/docker-shared-contract.md +334 -0
  34. package/assets/slopmachine/scaffold-playbooks/electron-vite-default.md +124 -0
  35. package/assets/slopmachine/scaffold-playbooks/expo-react-native-default.md +73 -0
  36. package/assets/slopmachine/scaffold-playbooks/fastapi-default.md +134 -0
  37. package/assets/slopmachine/scaffold-playbooks/frontend-baseline.md +160 -0
  38. package/assets/slopmachine/scaffold-playbooks/frontend-family-matrix.md +134 -0
  39. package/assets/slopmachine/scaffold-playbooks/generic-unknown-tech-guide.md +136 -0
  40. package/assets/slopmachine/scaffold-playbooks/go-chi-default.md +160 -0
  41. package/assets/slopmachine/scaffold-playbooks/ios-linux-portable.md +93 -0
  42. package/assets/slopmachine/scaffold-playbooks/ios-native-objective-c.md +151 -0
  43. package/assets/slopmachine/scaffold-playbooks/ios-native-swift.md +188 -0
  44. package/assets/slopmachine/scaffold-playbooks/laravel-default.md +216 -0
  45. package/assets/slopmachine/scaffold-playbooks/livewire-default.md +265 -0
  46. package/assets/slopmachine/scaffold-playbooks/overlay-module-matrix.md +130 -0
  47. package/assets/slopmachine/scaffold-playbooks/platform-family-matrix.md +79 -0
  48. package/assets/slopmachine/scaffold-playbooks/selection-matrix.md +72 -0
  49. package/assets/slopmachine/scaffold-playbooks/spring-boot-default.md +182 -0
  50. package/assets/slopmachine/scaffold-playbooks/tauri-default.md +80 -0
  51. package/assets/slopmachine/scaffold-playbooks/vue-vite-default.md +162 -0
  52. package/assets/slopmachine/scaffold-playbooks/web-default.md +96 -0
  53. package/assets/slopmachine/templates/AGENTS.md +41 -3
  54. package/assets/slopmachine/templates/CLAUDE.md +111 -0
  55. package/assets/slopmachine/test-coverage-prompt.md +561 -0
  56. package/assets/slopmachine/utils/claude_create_session.mjs +3 -2
  57. package/assets/slopmachine/utils/claude_live_channel.mjs +188 -0
  58. package/assets/slopmachine/utils/claude_live_common.mjs +411 -0
  59. package/assets/slopmachine/utils/claude_live_hook.py +47 -0
  60. package/assets/slopmachine/utils/claude_live_launch.mjs +187 -0
  61. package/assets/slopmachine/utils/claude_live_status.mjs +25 -0
  62. package/assets/slopmachine/utils/claude_live_stop.mjs +46 -0
  63. package/assets/slopmachine/utils/claude_live_turn.mjs +277 -0
  64. package/assets/slopmachine/utils/claude_resume_session.mjs +3 -2
  65. package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.mjs +23 -0
  66. package/assets/slopmachine/utils/claude_wait_for_rate_limit_reset.sh +5 -0
  67. package/assets/slopmachine/utils/claude_worker_common.mjs +361 -4
  68. package/assets/slopmachine/utils/cleanup_delivery_artifacts.py +4 -0
  69. package/assets/slopmachine/utils/export_ai_session.mjs +1 -1
  70. package/assets/slopmachine/utils/normalize_claude_session.py +153 -0
  71. package/assets/slopmachine/utils/package_claude_session.mjs +123 -0
  72. package/assets/slopmachine/utils/prepare_strict_audit_workspace.mjs +65 -0
  73. package/package.json +1 -1
  74. package/src/constants.js +42 -3
  75. package/src/init.js +173 -28
  76. package/src/install.js +156 -8
  77. package/src/send-data.js +56 -57
@@ -1,91 +1,246 @@
  ---
  name: claude-worker-management
- description: Launch, resume, and persist the Claude CLI developer worker session used by slopmachine-claude.
+ description: Launch, persist, and message the live Claude developer lane used by slopmachine-claude.
  ---
 
  # Claude Worker Management
 
- Use this skill whenever `slopmachine-claude` needs to create, resume, or message the persistent Claude developer worker.
+ Use this skill whenever `slopmachine-claude` needs to launch, inspect, or message the persistent live Claude developer lane.
 
  ## Purpose
 
  - keep the Claude developer worker as a large complete conversation per bounded developer slot
  - avoid losing worker context by accidentally creating fresh sessions for ordinary follow-up turns
  - make session persistence and response capture deterministic
+ - make OpenCode talk to a live Claude TUI through a bridge instead of non-interactive resume calls
 
  ## Core rules
 
  - the Claude worker must be invoked by the installed Claude agent name `developer`
  - do not use the OpenCode `developer` subagent for implementation work in the `slopmachine-claude` path
  - do not read Claude transcript files as the normal communication channel
- - communicate with the Claude worker through the packaged wrapper scripts in `~/slopmachine/utils/`
- - treat raw Claude stdout and stderr as trace artifacts written to files, not as owner-session context
- - treat the wrapper `result-file` as the semantic source of truth in normal owner flow
- - treat terminal stdout from the wrapper as only a tiny pointer or status channel
- - always capture the session id and normalized result from the `result-file`
- - always re-pass `--agent developer` on every call, even when resuming an existing session
- - always constrain Claude to a single-session developer lane by limiting tools to `Read Write Edit Bash Glob Grep`
- - do not allow Claude internal agent fan-out in the normal developer path
- - use `--dangerously-skip-permissions` in the wrapper path so the worker does not stall on routine file-edit permission prompts inside the bounded repo
-
- ## Session creation rule
+ - communicate with the Claude worker through the packaged live bridge scripts in `~/slopmachine/utils/`
+ - use `claude_live_launch.mjs` once per lane and `claude_live_turn.mjs` for each owner message into that lane
+ - set the Claude live runtime settings default `agent` to `developer` so the lane stays on the intended system prompt even if the session is resumed or inspected through Claude-native controls
+ - treat bridge `state.json` as the durable control-plane truth for lane status, routing, and Claude session identity
+ - treat bridge `result.json` as the semantic source of truth after each completed turn
+ - treat terminal stdout from bridge scripts as only a tiny pointer or status channel
+ - always capture the session id from the launched bridge state and the normalized turn result from bridge `result.json`
+ - always constrain Claude to a single-session developer lane even when it uses internal Claude task fan-out
+ - allow Claude internal task fan-out inside that one continuous live session when it reduces serial churn cleanly
+ - encourage Claude to parallelize independent search, reading, verification, and bounded implementation subtasks through internal task fan-out when that reduces serial churn cleanly
+ - launch the live lane with `--dangerously-skip-permissions` so the worker does not stall on routine file-edit permission prompts inside the bounded repo
+ - when Claude uses internal task fan-out and the environment allows explicit agent selection, prefer the installed `developer` agent for implementation-capable branches so the same engineering standard applies across those branches
+ - there is no repo-controlled guarantee that every Claude helper subagent globally reuses the `developer` prompt, so keep critical implementation in the main developer lane or in explicitly developer-scoped helper branches rather than relying on unspecified built-in helper behavior
+ - make every owner-to-Claude turn boundary-controlled, reviewable, and explicit about what must happen now versus later
+ - do not send vague owner prompts such as `continue`, `keep going`, `handle the rest`, or `fix it` without a precise bounded contract
+ - each substantive owner message should state the current engineering boundary, exact expected outcomes for that turn, the evidence required back, the important shortcuts that are not acceptable, and the stopping point
+ - default to one bounded engineering objective per owner turn; if a request would naturally cross planning, scaffold, development, or gate-review boundaries, split it into separate turns
+
+ ## Lane launch rule
 
  For a new bounded developer session slot:
 
- 1. run Claude in print mode with the installed `developer` agent
- 2. capture the returned `session_id`
+ 1. launch one live Claude TUI lane inside `tmux`
+ 2. wait for bridge registration to capture the Claude `session_id`
  3. store it in `../.ai/metadata.json`
  4. mirror it in tracker comments using `SESSION:`
- 5. keep using that same session for all later turns in the same bounded slot
+ 5. keep using that same live lane for all later turns in the same bounded slot
 
- Preferred creation pattern:
+ Preferred launch pattern:
 
  ```bash
- node ~/slopmachine/utils/claude_create_session.mjs --cwd "$PWD" --prompt-file <file> --raw-output <file> --raw-error <file> --state-file <file> --result-file <file>
+ node ~/slopmachine/utils/claude_live_launch.mjs --cwd "$PWD" --lane <lane> --runtime-dir <dir>
  ```
 
+ ## Model selection rule
+
+ - choose the live-lane model at launch time; do not rely on an implicit Claude default when the owner can decide intentionally
+ - default to `--model sonnet` for ordinary planning, scaffold, development, and routine bugfix work
+ - escalate to `--model opus` only for genuinely difficult planning, security-critical hardening, architecturally tangled debugging, or repeated stubborn failures where the extra reasoning depth is justified
+ - keep `--subagent-model sonnet` by default unless there is a concrete reason to raise helper-branch cost as well
+ - when the task difficulty warrants it, also pass an explicit `--effort <level>` at launch time rather than hoping the default thinking level is ideal
+ - keep the chosen `model`, `effort`, and `subagent_model` recorded in bridge state so later recovery and review can see what launched the lane
+
+ The launch implementation must pass Claude `--dangerously-skip-permissions` in the live TUI command path.
+
  When the owner invokes this through the OpenCode Bash tool, use a long-running timeout suitable for real developer work.
 
  Default:
 
- - Claude create and resume worker turns should use a Bash timeout of at least `3600000` ms (1 hour)
- - do not use ordinary short Bash timeouts for Claude worker turns
+ - Claude launch and turn bridge operations should not use ordinary short Bash timeouts
+ - when automatic rate-limit waiting is enabled, prefer no outer timeout at all for live Claude worker turns; if the host wrapper forces a timeout value, it must exceed the possible reset wait plus buffer rather than using a generic 1 hour cap
 
  Do not pre-generate a UUID unless there is a strong reason to do so.
- The default pattern is to let Claude create the session and then persist the returned `session_id`.
+ The default pattern is to let the live lane start normally and then persist the `session_id` captured by bridge registration.
 
- ## Resume rule
+ ## Turn rule
 
  For all later turns in the same bounded developer slot:
 
  ```bash
- node ~/slopmachine/utils/claude_resume_session.mjs --cwd "$PWD" --session-id <session_id> --prompt-file <file> --raw-output <file> --raw-error <file> --state-file <file> --result-file <file>
+ printf '%s' "$PROMPT" | node ~/slopmachine/utils/claude_live_turn.mjs --runtime-dir <dir> --timeout-ms <turn-timeout>
  ```
 
- - use `--resume` inside the wrapper implementation, not `-r`
- - when calling the resume wrapper from the owner session, treat it as a long-running operation and keep the Bash timeout at or above `3600000` ms
- - do not reuse `--session-id` after creation
- - if resume fails, stop and recover explicitly instead of silently creating a new worker
+ - inject exactly one owner message at a time into the idle live lane
+ - pass the prompt directly to the wrapper through stdin as the primary input path instead of requiring an owner-side prompt file
+ - wait for `Stop` or `StopFailure` before sending the next message
+ - do not bypass the bridge by calling the channel HTTP endpoint directly from owner logic
+ - if turn execution fails, stop and recover explicitly instead of silently creating a new worker
+
+ ## Turn-preflight checklist
+
+ Before sending any owner message into the live lane:
+
+ 1. read bridge `state.json` and confirm the lane is the intended lane and currently `idle`
+ 2. read the latest bridge `result.json` when it exists and review the last normalized Claude answer before composing the next turn
+ 3. decide the prompt kind explicitly, such as `planning-start`, `planning-revision`, `scaffold-start`, `scaffold-review`, `development-slice`, `development-correction`, `bugfix-orientation`, `bugfix-fix`, `resume`, or `recovery`
+ 4. gather only the minimum accepted-plan sections, clarified requirements, boundary summary, and fresh deltas needed for this turn
+ 5. define the turn contract before writing the prompt: what Claude must produce now, what evidence it must return now, and exactly where it must stop
+
+ If the stop boundary is fuzzy, the turn is too broad.
+ If the owner prompt would span multiple major boundaries, split it.
+ Do not send the next turn until the prior turn has been reviewed and either accepted, corrected, or explicitly rerouted.
+
+ ## Canonical owner-message contract
+
+ For substantive live-lane turns, write the owner message in natural engineering language but make sure it includes all of these ingredients:
+
+ - `Context snapshot`: the current accepted state and only the fresh deltas that matter now
+ - `Contract anchor`: the relevant accepted plan sections, clarified decisions, or concrete evaluator findings that define the work
+ - `This turn only`: the bounded deliverable for this turn and whether this is planning-only, scaffold-only, coding allowed, or correction-only
+ - `Expected outcomes now`: the exact behaviors, artifacts, or fixes that must exist before this turn can be considered successful
+ - `Evidence required now`: the exact verification, file updates, or summaries Claude must return for owner review
+ - `Disallowed shortcuts now`: future-work deferrals, placeholder implementations, bypassed auth/validation, fake verification, mixed-boundary drift, or other shortcuts that would make the result misleading
+ - `Stop boundary`: what Claude should stop after producing, and what it must not start yet
+ - `Reply contract`: request the exact changed files, exact verification commands and results, and only the real remaining risks or blockers
+
+ When the turn intentionally uses internal parallel fan-out, also include:
+
+ - `Branch map`: the 2 or 3 independent branches, their boundaries, and their expected outputs
+ - `Shared constraints`: the contracts or files that must stay aligned across branches
+ - `Fan-in rule`: how Claude should merge the branch results and what integrated verification must run before stopping
+
+ Keep the wording natural. Do not turn every prompt into a rigid template dump.
+ But do make the contract mechanically obvious enough that Claude cannot plausibly misunderstand what acceptance depends on.
+
127
+ ## Canonical prompt shapes
128
+
129
+ ### Planning-start shape
130
+
131
+ For the second owner message in the first `develop` lane and for other explicit planning-entry turns:
132
+
133
+ - inline the approved clarification content and requirements-ambiguity resolutions directly in the message
134
+ - include the owner's initial planning view so Claude refines a direction instead of inventing one from zero
135
+ - restate prompt-critical requirements, actors, required surfaces, locked defaults, explicit non-goals, and risky areas in plain engineering language
136
+ - say clearly that the worker should produce an exhaustive, section-addressable implementation plan and must not start coding yet
137
+ - require dense planning artifacts, especially `../docs/design.md`, with explicit treatment of modules, business rules, state machines, permissions, validation, verification strategy, checkpoints, and definition of done when applicable
138
+ - require a concise changed-files summary with the planning response
139
+
140
+ ### Planning-revision shape
141
+
142
+ When a planning draft is not good enough:
143
+
144
+ - point to the exact plan sections or requirement areas that are weak or incomplete
145
+ - state the exact missing detail or unacceptable vagueness that must be corrected now
146
+ - keep the turn planning-only; do not let the worker start coding as a compensation move
147
+ - require the revised planning artifacts plus a short summary of what changed and what is still explicitly unresolved
148
+
149
+ ### Scaffold-start shape
150
+
151
+ When entering scaffold work:
152
+
153
+ - cite the relevant accepted design sections and the intended baseline runtime/test/config contract
154
+ - state that the turn is scaffold-only and name the exact baseline surfaces expected now, such as app shell, routing skeleton, persistence skeleton, config wiring, logging path, validation path, auth foundation, test harness, or README baseline when they apply
155
+ - state explicitly which feature work must not begin yet
156
+ - require exact local verification evidence for the scaffold baseline and exact changed files
157
+ - say to stop after the scaffold baseline is complete and verified
158
+
159
+ ### Development-slice shape
160
+
161
+ For ordinary implementation turns:
162
+
163
+ - anchor the request to the relevant accepted plan sections and current boundary summary
164
+ - name the exact slice, user/admin actor path, modules, or surfaces to complete now
165
+ - itemize the expected outcomes for happy path, failure path, and auth/ownership/validation behavior when those dimensions matter
166
+ - require targeted local verification tied back to those expected outcomes
167
+ - explicitly prohibit owner-only broad verification commands and unrelated follow-on work
168
+ - when the slice can truly be parallelized, name the separate branch contracts explicitly instead of asking Claude to infer them
169
+ - say to stop after this slice and report the exact changed files plus exact verification results
170
+
171
+ ### Development-correction shape
172
+
173
+ When the worker partially missed the slice or crossed boundaries:
174
+
175
+ - quote the exact missing outcome, regression risk, or evidence gap
176
+ - ask for a correction-only turn focused on those gaps
177
+ - require fresh verification evidence for the corrected surface
178
+ - do not mix new feature asks into the correction turn
179
+
180
+ ### Resume shape
181
+
182
+ When resuming a long-lived lane:
183
+
184
+ - start from the stored boundary summary and the relevant accepted plan sections instead of replaying broad history
185
+ - include only the new delta since the last accepted state
186
+ - restate the current bounded task, evidence required, and stop boundary
187
+ - do not re-dump the entire project or workflow unless continuity is genuinely broken
188
+
189
+ ### Bugfix issue-turn shape
190
+
191
+ For evaluator-driven remediation inside a `bugfix-N` session opened by a `partial pass` audit:
192
+
193
+ - lead with the concrete evaluator finding or owner-reviewed issue statement
194
+ - state the expected fix and the affected non-regression surfaces
195
+ - require proof for the issue path plus the nearby happy path and security/ownership boundary when relevant
196
+ - say to stop after the named issue set rather than reopening unrelated refactors
197
+
198
+ ## Turn anti-patterns
199
+
200
+ Do not do these:
201
+
202
+ - send `continue`, `next`, or `keep going` as a substantive owner prompt
203
+ - ask for planning and implementation in the same turn unless that mixed boundary is intentional and explicitly stated
204
+ - ask for multiple gate exits in one turn
205
+ - let Claude decide its own stopping point implicitly
206
+ - pass parent-directory file paths as hidden instructions instead of restating the needed content directly
207
+ - paste raw bridge state, raw transcript payloads, or workflow bookkeeping into normal developer prompts
208
+ - respond to a weak result by broadening the next prompt instead of correcting the specific gap
209
+
210
+ ## Status rule
211
+
212
+ When owner logic needs to inspect the lane without sending a new message:
213
+
214
+ ```bash
215
+ node ~/slopmachine/utils/claude_live_status.mjs --runtime-dir <dir>
216
+ ```
217
+
218
+ Use `state.json` plus `claude_live_status.mjs` to determine whether the lane is:
219
+
220
+ - `idle`
221
+ - `running`
222
+ - `blocked`
223
+ - `failed`
69
224
 
70
225
  ## Result capture rule
71
226
 
72
- The wrapper scripts should pipe the raw Claude JSON output to file, parse it after process exit, and persist a normalized `result-file` plus a live `state-file`.
227
+ The live bridge should persist a normalized turn `result.json` plus a durable lane `state.json`.
73
228
 
74
- Use the `result-file` fields only:
229
+ Use the turn result fields only:
75
230
 
76
231
  - `sid`
77
232
  - `res`
78
233
 
79
234
  Monitoring files should include at least:
80
235
 
81
- - a live `state-file` showing running/completed/failed state, pid, byte counts, timestamps, and exit code
82
- - a final `result-file` containing the normalized success or failure object
236
+ - a live `state.json` showing lane status, Claude session id, tmux session id, transcript pointer, and current turn state
237
+ - a final `result.json` containing the normalized success or failure object for the latest completed turn
238
+ - `hook-events.jsonl` as the live outward event feed
83
239
 
84
240
  Treat `res` as the worker's answer.
85
- Do not feed raw Claude JSON into the owner session.
86
241
  Do not rely on transcript scraping for normal turn-to-turn orchestration.
87
- Do not rely on Bash stdout alone when the wrapper state or result files provide a clearer source of truth.
88
- Read `result-file` after process completion before deciding the next owner turn.
242
+ Do not rely on Bash stdout alone when bridge state or result files provide a clearer source of truth.
243
+ Read bridge `result.json` after turn completion before deciding the next owner turn.
89
244
 
90
245
  ## Developer-slot continuity
91
246
 
@@ -95,7 +250,8 @@ The purpose of this backend is to preserve one large complete conversation per b
  - the `bugfix` slot should stay one continuous Claude session unless irrecoverable failure forces replacement
  - do not start a fresh Claude worker for every slice, clarification, or review loop
  - do not roll sessions casually just because the conversation is long
- - do not let the Claude worker create its own internal sub-agents for routine planning, scaffold, or implementation work
+ - internal Claude task sub-agents are allowed inside the same developer session when they help parallelize independent bounded work cleanly
+ - prefer task fan-out for parallel discovery, repo reading, comparison, or verification passes when those branches can be merged back without ambiguity
 
  ## First-session handshakes
 
@@ -103,12 +259,12 @@ The purpose of this backend is to preserve one large complete conversation per b
 
  When the first `develop` slot begins in planning:
 
- 1. create the Claude developer session with:
- - the original prompt plus a plain instruction to read it carefully, not plan yet, and wait for clarifications and planning direction
- 2. wait for the first response and store the returned Claude session id from wrapper field `sid`
- 3. form an initial owner planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
- 4. resume the same session and send a compact second owner message that directly includes the approved clarification content, the requirements-ambiguity resolutions, that initial owner planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for the implementation plan plus major risks or assumptions
- 5. continue the planning conversation in that same Claude session
+ 1. launch the live `develop` lane if it is not already running
+ 2. send the original prompt plus a plain instruction to read it carefully, not plan yet, and wait for clarifications and planning direction through the bridge
+ 3. store the Claude session id from bridge `state.json`
+ 4. form an initial owner planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
+ 5. send a compact second owner message through the same live lane that directly includes the approved clarification content, the requirements-ambiguity resolutions, that initial owner planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for the implementation plan plus major risks or assumptions
+ 6. continue the planning conversation in that same Claude session
 
  Do not merge those two first messages.
  Do not ask for a plan in the first message.
@@ -118,8 +274,10 @@ Preferred second owner message shape:
  - inline the approved clarification content and the requirements-ambiguity resolutions directly in the owner message
  - include the owner's initial planning view so planning is refined collaboratively rather than invented from zero
  - add any short delta notes that are not already captured in that inlined summary
- - express the current boundary in plain engineering language and then ask for the implementation plan plus major risks or assumptions
+ - express the current boundary in plain engineering language and then ask for an exhaustive, section-addressable implementation plan plus major risks or assumptions
+ - require the plan to fill the planning artifacts densely, especially `../docs/design.md`, with explicit sections for actors, success paths, modules, business rules, state machines, permissions, validation, test strategy, checkpoints, and definition of done when those dimensions matter
  - ask for repo-local planning artifacts plus a concise changed-files summary
+ - say explicitly that coding must not start yet and that the response should stop after the planning artifacts and summary are complete
 
  Do not tell the developer worker to read files outside `repo/`.
  If owner-side artifacts outside `repo/` matter, restate their content directly in the owner message instead of passing file paths.
@@ -127,13 +285,13 @@ Do not mention session names, slot labels, or workflow phase labels to the devel
 
  ### `bugfix-N` orientation handshake
 
- When `P7` begins and the workflow opens the remediation lane:
+ When a fresh `partial pass` evaluation result opens the next remediation lane:
 
- 1. create a fresh Claude developer session for the next `bugfix-N` label
+ 1. launch a fresh live Claude developer lane for the next `bugfix-N` label
  2. use the first owner message only to orient that session to the repo and the current delivered state
  3. make clear in plain engineering language that follow-up work will be focused remediation against evaluator findings
- 4. wait for the first response and store the returned Claude session id from wrapper field `sid`
- 5. only after that orientation exchange, resume the same `bugfix-N` session with the first evaluator-driven issue list
+ 4. wait for the first response and store the Claude session id from bridge `state.json`
+ 5. only after that orientation exchange, continue the same `bugfix-N` live lane with the first evaluator-driven issue list
 
  The orientation message should:
 
@@ -142,12 +300,24 @@ The orientation message should:
142
300
  - state that incoming work will be a sequence of concrete issue-fix requests against evaluator findings
143
301
  - avoid mentioning workflow internals, phase labels, or session-lane labels
144
302
 
303
+ ## Between-turn owner review rule
304
+
305
+ After each meaningful Claude response and before the next owner turn:
306
+
307
+ 1. review the normalized bridge `result.json`
308
+ 2. decide whether the result was accepted, needs correction, or crossed a boundary that must be rolled back in the next prompt
309
+ 3. update metadata and boundary summary only after that review decision
310
+ 4. compose the next turn as a deliberate correction, continuation, or new bounded objective rather than a vague nudge
311
+
312
+ If Claude starts coding during a planning-only turn, treat that as a boundary violation and correct it explicitly.
313
+ If Claude continues into extra work beyond the requested stop boundary, do not silently accept the spillover; review the requested boundary first and then decide whether any spillover is acceptable.
+
  ## Metadata expectations
 
  The active developer session record should include at least:
 
  - `lane`
- - `backend: "claude"`
+ - `backend: "claude-live"`
  - `session_id`
  - `label`
  - `status`
@@ -157,26 +327,89 @@ Recommended additional fields when useful:
 
  - `agent_name: "developer"`
  - `created_phase`
- - `trace_dir`
+ - `runtime_dir`
+ - `tmux_session`
+ - `transcript_path`
+ - `opened_from_audit_number`
  - `last_result_summary`
- - `last_resumed_at`
+ - `last_turn_at`
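
To make the record shape concrete, here is a hedged sketch of a developer session record using the 0.7.1 field names, with a check for the required fields visible in this excerpt. Every value is an illustrative placeholder (the diff does not show real lane names, session ids, or paths), `missing_required_fields` is a hypothetical helper, and the required list may be longer in the full file because the hunk boundary hides some lines.

```python
# Required fields as far as they are visible in this excerpt of the diff.
REQUIRED_FIELDS = ("lane", "backend", "session_id", "label", "status")


def missing_required_fields(record: dict) -> list[str]:
    """Return the required field names that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if record.get(name) in (None, "")]


# Placeholder values only; none of these come from the package.
example_record = {
    "lane": "bugfix",
    "backend": "claude-live",
    "session_id": "example-session-id",
    "label": "bugfix-1",
    "status": "idle",
    # Recommended additional fields.
    "agent_name": "developer",
    "runtime_dir": "/tmp/example-lane",
    "tmux_session": "example-tmux-session",
    "transcript_path": "/tmp/example-lane/transcript.jsonl",
    "last_result_summary": "orientation acknowledged",
    "last_turn_at": "2025-01-01T00:00:00+00:00",
}
```

A check like this could run before any metadata write, so that a record missing its session id is caught before the workflow advances.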
+
+ ## Owner state-sync rule
+
+ Bridge lane state is the authoritative transport state for Claude-backed developer work.
+
+ After each meaningful bridge action, immediately read bridge `state.json` and mirror the important fields into `../.ai/metadata.json`, `../metadata.json`, and Beads comments before advancing workflow state.
+
+ ### After lane launch
+
+ - read bridge `state.json`
+ - set or confirm:
+   - `current_developer_lane`
+   - `active_developer_session_id`
+ - create or update the active `developer_sessions[]` record with:
+   - `lane`
+   - `sequence`
+   - `label`
+   - `backend: "claude-live"`
+   - `agent_name: "developer"`
+   - `created_phase`
+   - `session_id`
+   - `status`
+   - `runtime_dir`
+   - `tmux_session`
+   - `transcript_path`
+   - `opened_from_audit_number` when the session was opened from a `partial pass` audit
+   - `orientation_completed: false`
+ - mirror `session_id` into `../metadata.json` as `session_id`
+ - record the session in Beads using `SESSION:`
+
+ ### After each successful turn
+
+ - read bridge `state.json` and bridge `result.json`
+ - update the active `developer_sessions[]` record with:
+   - `status: "idle"`
+   - `session_id`
+   - `transcript_path`
+   - `last_result_summary`
+   - `last_turn_at`
+ - if the first orientation or first planning handshake completed, set `orientation_completed: true`
+ - keep `active_developer_session_id` and `current_developer_lane` aligned with that same active session
+
+ ### After a blocked or failed turn
+
+ - read bridge `state.json` and bridge `result.json`
+ - preserve the same tracked Claude session id and runtime pointers
+ - update the active `developer_sessions[]` record status to match the real workflow meaning, such as:
+   - `rate_limited` for bridge `blocked` / `claude_usage_limit`
+   - `failed` for bridge `failed`
+ - update `last_result_summary` and `last_turn_at` when there is meaningful result text
+ - update Beads comments so the pause or failure is auditable without reading bridge artifacts directly
+
+ Do not advance the workflow based only on Bash success if bridge files and metadata are not yet aligned.
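
The status mapping and the successful-turn update described in this section could look roughly like the sketch below. The statuses `idle`, `rate_limited`, and `failed` and the bridge values `blocked`, `failed`, and `claude_usage_limit` come from the text; the function names and the keys read from `state.json` and `result.json` (`session_id`, `transcript_path`, `summary`) are assumptions, since the diff does not show the bridge schema.

```python
from datetime import datetime, timezone


def status_for_bridge_outcome(bridge_status: str, reason: str = "") -> str:
    """Map a blocked or failed bridge outcome to a developer session status."""
    if bridge_status == "blocked" and reason == "claude_usage_limit":
        return "rate_limited"
    if bridge_status == "failed":
        return "failed"
    # Anything else is left for the owner to classify explicitly.
    raise ValueError(f"unmapped bridge outcome: {bridge_status!r} / {reason!r}")


def record_successful_turn(record: dict, state: dict, result: dict) -> dict:
    """Return an updated copy of the session record after a successful turn.

    Assumption: state carries "session_id" and "transcript_path", and
    result carries a "summary" string. These key names are hypothetical.
    """
    updated = dict(record)
    updated.update(
        status="idle",
        session_id=state["session_id"],
        transcript_path=state["transcript_path"],
        last_result_summary=result["summary"],
        last_turn_at=datetime.now(timezone.utc).isoformat(),
    )
    return updated
```

Returning a copy rather than mutating in place makes it easier to compare the old and new record before writing `../.ai/metadata.json`.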
+
+ ## Owner-controlled lane rule
+
+ - treat a bridge-managed Claude lane as owner-controlled during ordinary operation
+ - do not manually type into the managed Claude TUI or send ad hoc prompts outside the bridge during the workflow
+ - if manual recovery or debugging ever happens in that TUI, record it clearly and resync metadata from bridge state and hook evidence before continuing normal workflow
 
  ## Failure handling
 
- - if Claude CLI returns a parseable result with a session id, persist it immediately
- - if Claude CLI returns malformed output, treat that as a worker communication failure and stop to recover it cleanly
- - if the saved session id cannot be resumed, do not silently create a replacement session unless the workflow explicitly chooses a controlled replacement
+ - if bridge launch captures a Claude session id, persist it immediately
+ - if the bridge reports `failed`, treat that as a worker communication failure and recover it cleanly
+ - if the bridge reports `blocked` because of `claude_usage_limit`, treat that as an automatic wait-and-resume path rather than a handoff-stop condition unless the wait or resume path itself fails
+ - if the saved live lane cannot continue, do not silently create a replacement session unless the workflow explicitly chooses a controlled replacement
  - if a replacement session is required, record the handoff clearly in metadata and tracker comments
- - write raw stdout and stderr to trace files for debugging, but do not surface those raw files back into normal owner prompts unless debugging is explicitly needed
+ - keep hook logs and transcript pointers for debugging, but do not surface raw bridge artifacts back into normal owner prompts unless debugging is explicitly needed
 
  ## Rate-limit handling
 
- - if Claude returns a usage-limit or capacity-exhaustion result for the active developer session, do not take over implementation work in the owner session
+ - if the bridge returns `claude_usage_limit` or the live lane becomes capacity-blocked, do not take over implementation work in the owner session
  - mark the active developer session status as `rate_limited`
  - preserve the same Claude session id as the active tracked developer session
- - update `../.ai/metadata.json` and Beads `SESSION:` or `HANDOFF:` comments to record the rate-limit pause clearly
- - set workflow state to await user resume rather than creating owner-side implementation fallback work
- - when the user later resumes the run, continue from the same Claude developer session if it is resumable
+ - use the packaged `~/slopmachine/utils/claude_wait_for_rate_limit_reset.sh` helper or the built-in turn retry path to wait until the reset time specified by Claude, then continue from the same live lane
+ - update `../.ai/metadata.json` and Beads `SESSION:` or `HANDOFF:` comments to record the blocked state, wait window, and resumed continuity clearly
+ - only surface the situation to the user if the reset time cannot be determined or the wait or resume path itself fails
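
The change from "await user resume" to an automatic wait-and-resume path can be sketched as a small decision on the reset time. This example does not call the packaged shell helper, whose interface is not shown in the diff; `seconds_until_reset`, `next_action`, and the action strings are hypothetical names used only to illustrate the branching.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_reset(reset_at: Optional[datetime], now: datetime) -> Optional[float]:
    """Return how long to wait, or None when the reset time is unknown."""
    if reset_at is None:
        return None
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, (reset_at - now).total_seconds())


def next_action(reset_at: Optional[datetime], now: datetime) -> str:
    """Pick the owner's next step for a rate-limited lane."""
    if seconds_until_reset(reset_at, now) is None:
        # The reset time could not be determined: the one case the new
        # text says to surface to the user.
        return "surface_to_user"
    return "wait_then_continue_same_lane"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
wait = seconds_until_reset(now + timedelta(minutes=30), now)
```

In either branch the session status stays `rate_limited` and the tracked Claude session id is preserved until the lane continues.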
 
  ## Worker prompt discipline