agent-threader 2.0.5

Files changed (172)
  1. package/LICENSE +21 -0
  2. package/README.md +126 -0
  3. package/compiled/claude/agent-threader/SKILL.md +361 -0
  4. package/compiled/codex/agent-threader/SKILL.md +361 -0
  5. package/compiled/cursor/rules/agent-threader.mdc +367 -0
  6. package/compiled/cursor/skills/agent-threader/SKILL.md +361 -0
  7. package/compiled/opencode/agent-threader.md +361 -0
  8. package/compiled/windsurf/rules/agent-threader.md +361 -0
  9. package/compiled/windsurf/skills/agent-threader/SKILL.md +361 -0
  10. package/dist/cli/commands/doctor.d.ts +6 -0
  11. package/dist/cli/commands/doctor.d.ts.map +1 -0
  12. package/dist/cli/commands/doctor.js +7 -0
  13. package/dist/cli/commands/doctor.js.map +1 -0
  14. package/dist/cli/commands/explain-error.d.ts +12 -0
  15. package/dist/cli/commands/explain-error.d.ts.map +1 -0
  16. package/dist/cli/commands/explain-error.js +23 -0
  17. package/dist/cli/commands/explain-error.js.map +1 -0
  18. package/dist/cli/commands/init-state.d.ts +6 -0
  19. package/dist/cli/commands/init-state.d.ts.map +1 -0
  20. package/dist/cli/commands/init-state.js +10 -0
  21. package/dist/cli/commands/init-state.js.map +1 -0
  22. package/dist/cli/commands/logs.d.ts +6 -0
  23. package/dist/cli/commands/logs.d.ts.map +1 -0
  24. package/dist/cli/commands/logs.js +9 -0
  25. package/dist/cli/commands/logs.js.map +1 -0
  26. package/dist/cli/commands/parse-heal.d.ts +6 -0
  27. package/dist/cli/commands/parse-heal.d.ts.map +1 -0
  28. package/dist/cli/commands/parse-heal.js +5 -0
  29. package/dist/cli/commands/parse-heal.js.map +1 -0
  30. package/dist/cli/commands/parse-result.d.ts +6 -0
  31. package/dist/cli/commands/parse-result.d.ts.map +1 -0
  32. package/dist/cli/commands/parse-result.js +5 -0
  33. package/dist/cli/commands/parse-result.js.map +1 -0
  34. package/dist/cli/commands/status.d.ts +6 -0
  35. package/dist/cli/commands/status.d.ts.map +1 -0
  36. package/dist/cli/commands/status.js +5 -0
  37. package/dist/cli/commands/status.js.map +1 -0
  38. package/dist/cli/commands/validate-manifest.d.ts +6 -0
  39. package/dist/cli/commands/validate-manifest.d.ts.map +1 -0
  40. package/dist/cli/commands/validate-manifest.js +5 -0
  41. package/dist/cli/commands/validate-manifest.js.map +1 -0
  42. package/dist/cli/index.d.ts +6 -0
  43. package/dist/cli/index.d.ts.map +1 -0
  44. package/dist/cli/index.js +360 -0
  45. package/dist/cli/index.js.map +1 -0
  46. package/dist/cli/output-formatter.d.ts +6 -0
  47. package/dist/cli/output-formatter.d.ts.map +1 -0
  48. package/dist/cli/output-formatter.js +19 -0
  49. package/dist/cli/output-formatter.js.map +1 -0
  50. package/dist/index.d.ts +3 -0
  51. package/dist/index.d.ts.map +1 -0
  52. package/dist/index.js +5 -0
  53. package/dist/index.js.map +1 -0
  54. package/dist/lib/adapters/types.d.ts +40 -0
  55. package/dist/lib/adapters/types.d.ts.map +1 -0
  56. package/dist/lib/adapters/types.js +3 -0
  57. package/dist/lib/adapters/types.js.map +1 -0
  58. package/dist/lib/contracts/schema-validator.d.ts +15 -0
  59. package/dist/lib/contracts/schema-validator.d.ts.map +1 -0
  60. package/dist/lib/contracts/schema-validator.js +63 -0
  61. package/dist/lib/contracts/schema-validator.js.map +1 -0
  62. package/dist/lib/contracts/types.d.ts +91 -0
  63. package/dist/lib/contracts/types.d.ts.map +1 -0
  64. package/dist/lib/contracts/types.js +15 -0
  65. package/dist/lib/contracts/types.js.map +1 -0
  66. package/dist/lib/contracts/validate-manifest.d.ts +16 -0
  67. package/dist/lib/contracts/validate-manifest.d.ts.map +1 -0
  68. package/dist/lib/contracts/validate-manifest.js +123 -0
  69. package/dist/lib/contracts/validate-manifest.js.map +1 -0
  70. package/dist/lib/diagnostics/doctor.d.ts +17 -0
  71. package/dist/lib/diagnostics/doctor.d.ts.map +1 -0
  72. package/dist/lib/diagnostics/doctor.js +131 -0
  73. package/dist/lib/diagnostics/doctor.js.map +1 -0
  74. package/dist/lib/errors/explain-error.d.ts +10 -0
  75. package/dist/lib/errors/explain-error.d.ts.map +1 -0
  76. package/dist/lib/errors/explain-error.js +73 -0
  77. package/dist/lib/errors/explain-error.js.map +1 -0
  78. package/dist/lib/errors/types.d.ts +16 -0
  79. package/dist/lib/errors/types.d.ts.map +1 -0
  80. package/dist/lib/errors/types.js +50 -0
  81. package/dist/lib/errors/types.js.map +1 -0
  82. package/dist/lib/index.d.ts +29 -0
  83. package/dist/lib/index.d.ts.map +1 -0
  84. package/dist/lib/index.js +25 -0
  85. package/dist/lib/index.js.map +1 -0
  86. package/dist/lib/orchestrator/batch-strategy.d.ts +9 -0
  87. package/dist/lib/orchestrator/batch-strategy.d.ts.map +1 -0
  88. package/dist/lib/orchestrator/batch-strategy.js +34 -0
  89. package/dist/lib/orchestrator/batch-strategy.js.map +1 -0
  90. package/dist/lib/orchestrator/healing-policy.d.ts +47 -0
  91. package/dist/lib/orchestrator/healing-policy.d.ts.map +1 -0
  92. package/dist/lib/orchestrator/healing-policy.js +104 -0
  93. package/dist/lib/orchestrator/healing-policy.js.map +1 -0
  94. package/dist/lib/orchestrator/index.d.ts +11 -0
  95. package/dist/lib/orchestrator/index.d.ts.map +1 -0
  96. package/dist/lib/orchestrator/index.js +11 -0
  97. package/dist/lib/orchestrator/index.js.map +1 -0
  98. package/dist/lib/orchestrator/patch-validation.d.ts +9 -0
  99. package/dist/lib/orchestrator/patch-validation.d.ts.map +1 -0
  100. package/dist/lib/orchestrator/patch-validation.js +58 -0
  101. package/dist/lib/orchestrator/patch-validation.js.map +1 -0
  102. package/dist/lib/orchestrator/scheduling.d.ts +12 -0
  103. package/dist/lib/orchestrator/scheduling.d.ts.map +1 -0
  104. package/dist/lib/orchestrator/scheduling.js +74 -0
  105. package/dist/lib/orchestrator/scheduling.js.map +1 -0
  106. package/dist/lib/orchestrator/write-safety.d.ts +14 -0
  107. package/dist/lib/orchestrator/write-safety.d.ts.map +1 -0
  108. package/dist/lib/orchestrator/write-safety.js +44 -0
  109. package/dist/lib/orchestrator/write-safety.js.map +1 -0
  110. package/dist/lib/parser/parse-heal.d.ts +12 -0
  111. package/dist/lib/parser/parse-heal.d.ts.map +1 -0
  112. package/dist/lib/parser/parse-heal.js +9 -0
  113. package/dist/lib/parser/parse-heal.js.map +1 -0
  114. package/dist/lib/parser/parse-result.d.ts +12 -0
  115. package/dist/lib/parser/parse-result.d.ts.map +1 -0
  116. package/dist/lib/parser/parse-result.js +9 -0
  117. package/dist/lib/parser/parse-result.js.map +1 -0
  118. package/dist/lib/parser/parser.d.ts +8 -0
  119. package/dist/lib/parser/parser.d.ts.map +1 -0
  120. package/dist/lib/parser/parser.js +167 -0
  121. package/dist/lib/parser/parser.js.map +1 -0
  122. package/dist/lib/state/init-state.d.ts +15 -0
  123. package/dist/lib/state/init-state.d.ts.map +1 -0
  124. package/dist/lib/state/init-state.js +50 -0
  125. package/dist/lib/state/init-state.js.map +1 -0
  126. package/dist/lib/state/logs.d.ts +19 -0
  127. package/dist/lib/state/logs.d.ts.map +1 -0
  128. package/dist/lib/state/logs.js +25 -0
  129. package/dist/lib/state/logs.js.map +1 -0
  130. package/dist/lib/state/state.d.ts +7 -0
  131. package/dist/lib/state/state.d.ts.map +1 -0
  132. package/dist/lib/state/state.js +72 -0
  133. package/dist/lib/state/state.js.map +1 -0
  134. package/dist/lib/state/status.d.ts +22 -0
  135. package/dist/lib/state/status.d.ts.map +1 -0
  136. package/dist/lib/state/status.js +34 -0
  137. package/dist/lib/state/status.js.map +1 -0
  138. package/dist/lib/state/types.d.ts +55 -0
  139. package/dist/lib/state/types.d.ts.map +1 -0
  140. package/dist/lib/state/types.js +14 -0
  141. package/dist/lib/state/types.js.map +1 -0
  142. package/install-local.sh +239 -0
  143. package/install.sh +36 -0
  144. package/package.json +55 -0
  145. package/site/CNAME +1 -0
  146. package/site/index.html +141 -0
  147. package/site/install.sh +36 -0
  148. package/site/style.css +319 -0
  149. package/skill/SKILL.md +127 -0
  150. package/skill/SPEC.md +1189 -0
  151. package/skill/build/compile.mjs +237 -0
  152. package/skill/build/manifest.json +21 -0
  153. package/skill/fragments/common/model-selection.md +11 -0
  154. package/skill/fragments/common/portability-rules.md +16 -0
  155. package/skill/fragments/common/workflow.md +12 -0
  156. package/skill/fragments/domain/adapter-model.md +42 -0
  157. package/skill/fragments/domain/architecture-overview.md +36 -0
  158. package/skill/fragments/domain/contracts.md +31 -0
  159. package/skill/fragments/domain/pbh-healing.md +47 -0
  160. package/skill/fragments/domain/state-resume.md +34 -0
  161. package/skill/fragments/domain/verification-safety.md +33 -0
  162. package/skill/fragments/meta/schemas-reference.md +13 -0
  163. package/skill/fragments/meta/templates-reference.md +11 -0
  164. package/skill/schemas/heal_decision.v2.json +100 -0
  165. package/skill/schemas/manifest.v2.json +91 -0
  166. package/skill/schemas/state.v2.json +183 -0
  167. package/skill/schemas/task_result.v2.json +104 -0
  168. package/skill/schemas/verify_profile.v2.json +61 -0
  169. package/skill/skills/agent-threader/agent-threader.md +85 -0
  170. package/skill/templates/orchestrator.ts +38 -0
  171. package/skill/templates/parser.ts +384 -0
  172. package/skill/templates/types.ts +282 -0
package/skill/SPEC.md ADDED
@@ -0,0 +1,1189 @@
1
+ # AgentThreader v2: Stand-Alone Architecture Specification
2
+
3
+ **Status:** Normative v2 design proposal
4
+
5
+ **Audience:** Engineers implementing or reviewing reusable runners that invoke agentic CLIs across many tasks.
6
+
7
+ **Normative keywords:** `MUST`, `SHOULD`, and `MAY` are used in the RFC 2119 sense. `MUST` is required behavior. `SHOULD` is the default or strongly recommended behavior. `MAY` is optional behavior.
8
+
9
+ ## 1. Problem Statement
10
+
11
+ This specification defines a standard architecture for building runners that repeatedly invoke agentic CLIs such as `agent`, `opencode`, and `claude` across many tasks. The purpose of the system is to make large prompt-driven workflows durable, inspectable, resumable, and safe enough to run unattended.
12
+
13
+ Prompt formatting is only one part of the difficulty in this domain. The hard problems are:
14
+
15
+ - durable state and resume behavior after interruptions
16
+ - deterministic parsing of machine-readable results from model output
17
+ - external verification owned by the runner instead of the model
18
+ - bounded recovery after fixable failures
19
+ - portability across multiple CLIs without rewriting the orchestrator
20
+
21
+ Prior v1 implementations diverged in three main areas:
22
+
23
+ - healing schedule: per-task, fixed batch, or epoch-based
24
+ - parser strategy: regex/text extraction versus structured contracts
25
+ - platform packaging: each IDE or tool surface redefining architecture in its own wrapper
26
+
27
+ This v2 specification replaces that divergence with one canonical design. The system defined here is intended for:
28
+
29
+ - batch code edits
30
+ - audit and evaluation runs
31
+ - stage-based workflows
32
+ - resumable overnight runs
33
+ - bounded self-healing after fixable failures
34
+
35
+ The following are out of scope for this v2 specification:
36
+
37
+ - general-purpose multi-agent planning frameworks
38
+ - autonomous source-code editing by the healer
39
+ - unbounded retry loops
40
+ - platform-specific UX beyond thin wrappers
41
+
42
+ ## 2. Goals and Non-Goals
43
+
44
+ ### Goals
45
+
46
+ - Define one vocabulary, one runtime model, and one contract stack for all implementations.
47
+ - Make the orchestrator the single source of truth for task status, verification, checkpointing, and healing policy.
48
+ - Standardize worker and healer output as schema-validated JSON contracts.
49
+ - Preserve CLI portability through adapters rather than forking orchestrator logic.
50
+ - Define a default healing model that starts conservative, expands when stable, and stops when automation is no longer justified.
51
+ - Define enough detail that a peer can implement the system without prior knowledge of v1 variants or this repository.
52
+
53
+ ### Non-Goals
54
+
55
+ - This document does not prescribe project-specific build, test, or browser commands.
56
+ - This document does not require a specific product repository structure beyond the files needed by the runner.
57
+ - This document does not require a specific agent model or provider.
58
+ - This document does not define a UI or dashboard for monitoring runs.
59
+
60
+ ### Assumptions
61
+
62
+ - Readers are technical peers evaluating architecture, not end users.
63
+ - The document is a normative v2 design document, not a brainstorm or rough proposal.
64
+ - Platform wrappers are packaging concerns and are not allowed to redefine architecture.
65
+ - The reference implementation is expected to run TypeScript via global `tsx`.
66
+
67
+ ## 3. Glossary
68
+
69
+ | Term | Definition |
70
+ | --- | --- |
71
+ | `Task` | The smallest unit of work the runner schedules, executes, verifies, and tracks. |
72
+ | `Manifest` | The source of truth for the set of tasks and their metadata. |
73
+ | `Shared Context` | Reusable prompt material applied to multiple tasks, such as operating constraints, style rules, or output contract reminders. |
74
+ | `Worker` | The model or CLI invocation that performs the actual task work. |
75
+ | `Healer` | The model or CLI invocation that analyzes fixable failures and emits allowed patches to prompts, shared context, or bounded runtime knobs. |
76
+ | `Orchestrator` | The deterministic runtime that owns scheduling, parsing, verification, checkpointing, healing, and retry policy. |
77
+ | `Adapter` | The CLI-specific execution layer used by the orchestrator to invoke a tool without embedding tool-specific behavior in the core runtime. |
78
+ | `Verification Gate` | Any external check run by the orchestrator after worker output is parsed, such as build, test, lint, smoke, or browser validation. |
79
+ | `Failure Class` | The normalized reason category assigned to a failed task. |
80
+ | `Failure Signature` | The stable, comparable fingerprint used to detect repeated failures across tasks or retries. |
81
+ | `Batch` | The current window of ready tasks processed before the orchestrator evaluates whether healing is needed. |
82
+ | `Epoch` | One full sweep over all currently pending tasks. |
83
+ | `PBH` | Progressive Batch Healing, the default healing strategy that adjusts batch size based on observed stability. |
84
+ | `Convergence` | Evidence that healing is reducing failures rather than repeating them. |
85
+ | `Escalation` | A terminal outcome where the system stops retrying a task or run because further automated healing is not justified. |
86
+
87
+ ## 4. System Overview
88
+
89
+ The canonical v2 system has five moving parts:
90
+
91
+ - a manifest that declares work
92
+ - an orchestrator that owns truth
93
+ - adapters that invoke specific CLIs
94
+ - a worker that proposes task results
95
+ - a healer that proposes bounded recovery patches
96
+
97
+ At a high level, the system operates like this:
98
+
99
+ ```text
100
+ Manifest
101
+ -> Orchestrator
102
+ -> Adapter
103
+ -> Worker
104
+ -> Parser and Schema Validator
105
+ -> Verification Gates
106
+ -> State Checkpoint
107
+ -> Healer Checkpoint
108
+ -> Adapter
109
+ -> Healer
110
+ -> Patch Validation and Application
111
+ -> Resume or Escalate
112
+ ```
113
+
114
+ The orchestrator owns truth at every stage. Worker and healer outputs are only candidate data until the orchestrator validates them and commits them to state.
115
+
116
+ ### Canonical Source Tree
117
+
118
+ The canonical source of truth SHOULD be organized like this:
119
+
120
+ ```text
121
+ agent-threader/
122
+   SKILL.md
122
+   SPEC.md
123
+   schemas/
124
+   templates/
125
+   platforms/
127
+ ```
128
+
129
+ The purpose of each top-level artifact is:
130
+
131
+ - `SKILL.md`: short entrypoint describing when to use the skill and where the normative specification lives
132
+ - `SPEC.md`: the normative architecture document
133
+ - `schemas/`: JSON schemas for manifest, worker result, healer decision, state, and verify profile
134
+ - `templates/`: reference runtime, parser, and adapter skeletons
135
+ - `platforms/`: thin wrappers for `cursor`, `codex`, `claude`, and `windsurf`
136
+
137
+ Platform wrappers MUST NOT define new architectural behavior. They MAY describe invocation syntax, UX wording, or tool-specific setup.
138
+
139
+ ### Default Configuration
140
+
141
+ | Setting | Default |
142
+ | --- | --- |
143
+ | Reference runtime | TypeScript via global `tsx` |
144
+ | Contract format | Fenced JSON only |
145
+ | Default healing schedule | `auto` |
146
+ | Default healing strategy | `PBH` |
147
+ | Batch growth strategy | `fibonacci` |
148
+ | Manual batch size default | `5` |
149
+ | Failure threshold | `0.2` |
150
+ | Max worker attempts per task | `2` |
151
+ | Max heal rounds per window | `2` |
152
+ | Max total heal rounds | `8` |
153
+ | Signature repeat limit | `2` |
154
+ | Verification owner | Orchestrator |
155
+ | Parser authority | Schema-validated parser modules only |
156
+
157
+ ## 5. Canonical Runtime Model
158
+
159
+ ### Runtime Choice
160
+
161
+ The reference implementation SHOULD be a typed TypeScript orchestrator executed via global `tsx`.
162
+
163
+ This choice means:
164
+
165
+ - `tsx` is assumed to be globally available in environments using the reference implementation
166
+ - shell and expect remain valid as adapter implementations, not as the canonical orchestration core
167
+ - Python is no longer the normative parser runtime
168
+ - conforming orchestrators MAY be implemented in other languages if they preserve the same contracts, state transitions, parser guarantees, and healing behavior
169
+
170
+ ### Stable Ordering Rules
171
+
172
+ The orchestrator MUST build a stable execution order:
173
+
174
+ - Tasks MUST be topologically sorted by `depends_on`.
175
+ - Lower `priority` values SHOULD run before higher `priority` values.
176
+ - If dependency depth and priority are equal, manifest order MUST be preserved.
177
+ - A task MUST NOT start until all of its dependencies are `DONE`.
178
+
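The ordering rules above can be sketched as a Kahn-style topological sort. This is a minimal illustration, not the reference implementation: the `OrderableTask` shape and the `stableOrder` name are hypothetical, a missing `priority` is assumed to mean `0`, and "dependency depth" is approximated by choosing among the tasks whose dependencies are already ordered.

```typescript
// Hypothetical minimal task shape: only the fields that affect ordering.
interface OrderableTask {
  id: string;
  depends_on: string[];
  priority?: number; // assumption: a missing priority is treated as 0
}

// Kahn-style topological sort. Among tasks whose dependencies are already
// ordered, the lowest priority value wins; ties fall back to manifest order.
function stableOrder(tasks: OrderableTask[]): string[] {
  const manifestIndex = new Map<string, number>();
  const priority = new Map<string, number>();
  tasks.forEach((task, i) => {
    manifestIndex.set(task.id, i);
    priority.set(task.id, task.priority ?? 0);
  });

  const unmet = new Map<string, number>(); // task id -> unmet dependency count
  const dependents = new Map<string, string[]>(); // task id -> tasks waiting on it
  for (const task of tasks) {
    const deps = Array.from(new Set(task.depends_on));
    unmet.set(task.id, deps.length);
    for (const dep of deps) {
      if (!manifestIndex.has(dep)) {
        throw new Error(`Task ${task.id} depends on unknown task ${dep}`);
      }
      dependents.set(dep, [...(dependents.get(dep) ?? []), task.id]);
    }
  }

  const ready = tasks.filter((t) => unmet.get(t.id) === 0).map((t) => t.id);
  const order: string[] = [];
  while (ready.length > 0) {
    ready.sort(
      (a, b) =>
        priority.get(a)! - priority.get(b)! ||
        manifestIndex.get(a)! - manifestIndex.get(b)!
    );
    const next = ready.shift()!;
    order.push(next);
    for (const waiting of dependents.get(next) ?? []) {
      const left = unmet.get(waiting)! - 1;
      unmet.set(waiting, left);
      if (left === 0) ready.push(waiting);
    }
  }
  if (order.length !== tasks.length) {
    throw new Error("Dependency cycle detected in manifest");
  }
  return order;
}
```

The returned list is only a plan. At run time the orchestrator would still have to confirm that each dependency has actually reached `DONE` before starting a task.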
179
+ ### Concurrency and Window Semantics
180
+
181
+ In this specification, `concurrency` and `parallelism` mean the same thing: two or more worker processes executing at the same time and consuming real runtime resources such as CPU threads, process slots, or provider capacity.
182
+
183
+ The orchestrator MUST support these rules:
184
+
185
+ - Default `concurrency` is `1`.
186
+ - A window is the scheduling set currently being attempted before a healing checkpoint.
187
+ - The effective attempted window size is `min(current_batch_size, count of ready tasks in the current scheduling slice)`.
188
+ - Tasks within a window MAY run sequentially or in parallel.
189
+ - If `concurrency > 1`, tasks in the same window MAY execute simultaneously, but each task MUST still respect `depends_on`.
190
+ - A window is complete only when every runnable task assigned to that window has settled into one of: `DONE`, `BLOCKED`, `FAILED`, or `ESCALATED` for that attempt.
191
+ - Healing MUST NOT trigger mid-window.
192
+ - Every concurrently executed task MUST write to its own worker-log and verify-log paths.
193
+ - State updates from concurrent completions MUST still preserve atomic checkpoint semantics.
194
+
195
+ ### End-to-End Control Flow
196
+
197
+ The orchestrator MUST implement this control flow:
198
+
199
+ 1. Load the manifest.
200
+ 2. Load or initialize state.
201
+ 3. Validate the manifest against `manifest.v2`.
202
+ 4. Build the dependency-resolved pending queue.
203
+ 5. Run worker tasks through adapters.
204
+ 6. Parse fenced result JSON from worker output.
205
+ 7. Run verification gates.
206
+ 8. Classify task outcome and generate a failure signature if needed.
207
+ 9. Checkpoint state atomically.
208
+ 10. Invoke the healer at batch checkpoints when policy says healing is needed.
209
+ 11. Apply validated allowed patches.
210
+ 12. Reset retryable tasks and continue.
211
+ 13. Stop on completion, bounded non-convergence, or unrecoverable escalation.
212
+
213
+ ### Core Runtime Rules
214
+
215
+ - The orchestrator MUST capture combined stdout and stderr for every worker and healer invocation.
216
+ - Exit code alone MUST NOT be treated as task success.
217
+ - The orchestrator MUST parse and validate worker and healer contracts before any state mutation.
218
+ - The orchestrator MUST run verification after parse succeeds and before a task can become `DONE`.
219
+ - The orchestrator MUST checkpoint state after every task attempt and every healing round.
220
+ - The orchestrator MUST treat direct model prose outside the fenced contracts as non-authoritative.
221
+ - The orchestrator MUST use non-destructive logging: write full logs first, then inspect or parse them.
222
+ - The orchestrator MUST trap shutdown signals, terminate child processes it started, and leave state in a resumable condition.
223
+
224
+ ### Shared Skill Utilities
225
+
226
+ Parser, validation, hashing, rollback, and state utility functions SHOULD exist as shared skill utilities rather than being reimplemented inside each adapter.
227
+
228
+ Adapters MAY expose convenience helpers, but contract extraction and schema validation MUST resolve to shared parser and validator utilities so all adapters produce identical acceptance and failure behavior.
229
+
230
+ ### Parser Error Handling
231
+
232
+ The parser layer MUST return deterministic error classes. At minimum, it MUST support:
233
+
234
+ - `NO_SENTINEL`
235
+ - `INVALID_JSON`
236
+ - `SCHEMA_VIOLATION`
237
+ - `MISSING_REQUIRED_FIELD`
238
+ - `UNSUPPORTED_VERSION`
239
+
240
+ Parser errors SHOULD be converted into normalized failure classes and signatures by the orchestrator.
241
+
242
+ Before returning `INVALID_JSON`, the shared parser utility SHOULD attempt a conservative repair pass limited to:
243
+
244
+ - stripping outer markdown fences
245
+ - removing trailing commas
246
+ - removing JavaScript-style comments
247
+
248
+ If repair still fails, the task MUST be treated as a contract error.
249
+
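One way to read the extraction and repair rules is sketched below. It assumes the sentinel markers this specification gives for `task_result.v2`; the function names are hypothetical, only `NO_SENTINEL` and `INVALID_JSON` are modeled, and schema validation is left out. The repair pass skips string literals so that values containing `//` or `,}` are not damaged.

```typescript
type ParseOutcome =
  | { ok: true; value: unknown }
  | { ok: false; error: "NO_SENTINEL" | "INVALID_JSON" };

const OPEN_SENTINEL = "<<<TASK_RESULT_V2>>>";
const CLOSE_SENTINEL = "<<<END_TASK_RESULT_V2>>>";

// Removes one pair of outer markdown fences, if present.
function stripOuterFence(text: string): string {
  const fence = "`".repeat(3);
  const body = text.trim();
  const firstNewline = body.indexOf("\n");
  const fenced =
    body.startsWith(fence) && body.endsWith(fence) && body.length >= fence.length * 2;
  if (fenced && firstNewline !== -1) {
    return body.slice(firstNewline + 1, body.length - fence.length).trim();
  }
  return body;
}

// Drops // and /* */ comments and trailing commas while leaving string
// literals untouched, so values such as URLs survive the repair.
function conservativeRepair(text: string): string {
  const body = stripOuterFence(text);
  let out = "";
  let inString = false;
  let i = 0;
  while (i < body.length) {
    const ch = body.charAt(i);
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < body.length) {
        out += body.charAt(i + 1); // keep the escaped character as-is
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
      continue;
    }
    if (ch === '"') {
      inString = true;
    } else if (ch === "/" && body.charAt(i + 1) === "/") {
      while (i < body.length && body.charAt(i) !== "\n") i += 1;
      continue;
    } else if (ch === "/" && body.charAt(i + 1) === "*") {
      const end = body.indexOf("*/", i + 2);
      i = end === -1 ? body.length : end + 2;
      continue;
    } else if (ch === "}" || ch === "]") {
      // A comma directly before a closing bracket is a trailing comma.
      const trimmed = out.replace(/\s+$/, "");
      if (trimmed.charAt(trimmed.length - 1) === ",") out = trimmed.slice(0, -1);
    }
    out += ch;
    i += 1;
  }
  return out;
}

// Tries the raw payload first, then the repaired payload.
function parseTaskResult(output: string): ParseOutcome {
  const start = output.indexOf(OPEN_SENTINEL);
  if (start === -1) return { ok: false, error: "NO_SENTINEL" };
  const end = output.indexOf(CLOSE_SENTINEL, start + OPEN_SENTINEL.length);
  if (end === -1) return { ok: false, error: "NO_SENTINEL" };
  const raw = output.slice(start + OPEN_SENTINEL.length, end);
  for (const candidate of [raw, conservativeRepair(raw)]) {
    try {
      return { ok: true, value: JSON.parse(candidate) };
    } catch {
      // fall through to the next candidate
    }
  }
  return { ok: false, error: "INVALID_JSON" };
}
```

The sketch does not enforce the "exactly one block" requirement. A fuller parser would also need to decide how to classify output that contains several sentinel pairs.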
250
+ ## 6. Healing Model (`PBH`)
251
+
252
+ ### Scheduling Modes
253
+
254
+ The canonical schedule enum MUST be:
255
+
256
+ - `auto`
257
+ - `off`
258
+ - `task`
259
+ - `batch`
260
+ - `epoch`
261
+
262
+ The meaning of each mode is:
263
+
264
+ | Mode | Meaning |
265
+ | --- | --- |
266
+ | `auto` | Use Progressive Batch Healing with adaptive growth and shrink behavior. |
267
+ | `off` | Disable healing entirely. |
268
+ | `task` | Heal only the single failed task being retried. Effective window size is always `1`. |
269
+ | `batch` | Heal at fixed batch checkpoints using a fixed `batch_size`. Default fixed batch size is `5`. |
270
+ | `epoch` | Attempt all currently pending tasks before healing. Effective window size is all pending tasks in the epoch. |
271
+
272
+ `auto` MUST be the default and MUST be the only mode that uses progressive growth and shrink behavior.
273
+
274
+ ### PBH Definition
275
+
276
+ Progressive Batch Healing (`PBH`) is the default healing strategy. Under PBH:
277
+
278
+ - the orchestrator starts with a small healing window
279
+ - it increases batch size only after successful or stable windows
280
+ - it reduces batch size when failures imply systemic instability
281
+ - it retries only after a validated healer patch set is applied
282
+
283
+ ### PBH Defaults
284
+
285
+ The default PBH policy MUST be:
286
+
287
+ - `heal.schedule = auto`
288
+ - `batch.strategy = fibonacci`
289
+ - fibonacci sequence = `1, 2, 3, 5, 8, 13, ...`
290
+ - `failure_threshold = 0.2`
291
+ - `max_worker_attempts_per_task = 2`
292
+ - `max_heal_rounds_per_window = 2`
293
+ - `max_total_heal_rounds = 8`
294
+ - `signature_repeat_limit = 2`
295
+
296
+ ### Failure Rate
297
+
298
+ For PBH, `failure_rate` MUST be defined as:
299
+
300
+ ```text
301
+ failure_rate = healable_failed_tasks_in_window / attempted_healable_tasks_in_window
302
+ ```
303
+
304
+ `healable_failed_tasks_in_window` includes only tasks in the current window that:
305
+
306
+ - were attempted in the current window, and
307
+ - ended the attempt in a non-`DONE` state, and
308
+ - are currently classified by the orchestrator as healable
309
+
310
+ `BLOCKED` tasks and non-healable failures MUST NOT count toward the PBH failure-rate numerator. They still count for reporting and MAY cause escalation, but they do not consume heal budget by themselves.
311
+
312
+ If `attempted_healable_tasks_in_window == 0`, then:
313
+
314
+ - `failure_rate` is treated as `0`
315
+ - the orchestrator MUST skip healer invocation for that window
316
+ - the window MAY still produce escalations for blocked or non-healable failures
317
+
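The failure-rate definition can be illustrated with a small function. This sketch rests on one interpretation that the text does not spell out: the denominator is taken to be tasks that ended `DONE` plus tasks that failed with a healable classification. The `WindowAttempt` shape and field names are hypothetical.

```typescript
type AttemptStatus = "DONE" | "BLOCKED" | "FAILED" | "ESCALATED";

// Hypothetical record of one task attempt inside the current window.
interface WindowAttempt {
  taskId: string;
  status: AttemptStatus;
  healable: boolean; // the orchestrator's classification; ignored for DONE
}

interface FailureRateResult {
  rate: number;
  healableFailed: number;
  attemptedHealable: number;
}

function windowFailureRate(attempts: WindowAttempt[]): FailureRateResult {
  let healableFailed = 0;
  let attemptedHealable = 0;
  for (const attempt of attempts) {
    if (attempt.status === "DONE") {
      attemptedHealable += 1;
    } else if (attempt.status !== "BLOCKED" && attempt.healable) {
      // Only healable, non-BLOCKED failures enter the numerator.
      attemptedHealable += 1;
      healableFailed += 1;
    }
  }
  // With an empty denominator the rate is 0 and the healer is skipped.
  const rate = attemptedHealable === 0 ? 0 : healableFailed / attemptedHealable;
  return { rate, healableFailed, attemptedHealable };
}
```

Under this reading, a window with four `DONE` tasks, one healable failure, one `BLOCKED` task, and one non-healable failure has a rate of `1 / 5`, which sits exactly on the default threshold of `0.2`.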
318
+ ### PBH Behavior
319
+
320
+ The orchestrator MUST implement the following behavior in `auto` mode:
321
+
322
+ - If a window finishes with zero failures, move to the next larger batch size in the configured sequence.
323
+ - If failure rate is greater than `0` but less than or equal to `failure_threshold`, run the healer once and retry the same window.
324
+ - If failure rate is above `failure_threshold`, shrink one batch level and isolate repeated signatures.
325
+ - If the same task repeats the same failure signature after allowed healing, escalate that task.
326
+ - If healing rounds stop reducing total failing tasks or signature diversity, abort the run and record the non-convergence reason in the run summary and state.
327
+
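The batch-size part of these rules can be sketched as a pure decision function. The sketch makes several assumptions: batch levels are indexes into the `1, 2, 3, 5, 8, 13, ...` sequence, "zero failures" is modeled as a failure rate of exactly `0`, and shrinking stops at the smallest level. Signature isolation, escalation, and non-convergence detection are not modeled, and all names are hypothetical.

```typescript
interface PbhStep {
  action: "grow" | "heal_and_retry" | "shrink";
  level: number; // index into the batch-size sequence
  batchSize: number;
}

// Batch size for a level in the sequence 1, 2, 3, 5, 8, 13, ...
function fibonacciBatchSize(level: number): number {
  let current = 1;
  let next = 2;
  for (let i = 0; i < level; i++) {
    [current, next] = [next, current + next];
  }
  return current;
}

// Decides what happens to the window after it has fully settled.
function nextPbhStep(level: number, failureRate: number, failureThreshold = 0.2): PbhStep {
  if (failureRate === 0) {
    const grown = level + 1;
    return { action: "grow", level: grown, batchSize: fibonacciBatchSize(grown) };
  }
  if (failureRate <= failureThreshold) {
    // Same window again, after one healer round.
    return { action: "heal_and_retry", level, batchSize: fibonacciBatchSize(level) };
  }
  const shrunk = Math.max(0, level - 1);
  return { action: "shrink", level: shrunk, batchSize: fibonacciBatchSize(shrunk) };
}

// Effective attempted window size as defined under window semantics.
function effectiveWindowSize(currentBatchSize: number, readyCount: number): number {
  return Math.min(currentBatchSize, readyCount);
}
```

Keeping the decision free of side effects makes it easy to unit test the growth and shrink behavior separately from healer invocation and state checkpointing.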
328
+ If PBH fails to heal a run, the run MUST be aborted rather than looping indefinitely. The abort record MUST include a human-readable reason, such as:
329
+
330
+ - repeated same failure signatures after allowed retries
331
+ - no reduction in failing task count across heal rounds
332
+ - total healing budget exhausted
333
+ - current window contains only non-healable outcomes
334
+
335
+ ### Healable Versus Non-Healable Failures
336
+
337
+ The orchestrator MUST classify each failure as `healable` or `non_healable`.
338
+
339
+ The default healable set SHOULD include:
340
+
341
+ - `prompt_gap`
342
+ - `missing_paths`
343
+ - `weak_contract`
344
+ - `contract_error`
345
+ - `output_format`
346
+ - `timeout`
347
+ - `transient_infra`
348
+
349
+ The default non-healable set SHOULD include:
350
+
351
+ - `blocked_external`
352
+ - `real_bug`
353
+
354
+ `build_error`, `test_error`, and `smoke_error` MAY be treated as healable when evidence points to prompt, context, or runtime configuration rather than a genuine product defect.
355
+
356
+ Tasks that fail with parser-layer contract errors SHOULD receive one automatic contract-format retry before consuming normal task retry or heal budget. This retry SHOULD append a strict formatting reminder to the next worker prompt and MUST NOT invoke the healer.
357
+
358
+ ### Healer Authority Under Guardrails
359
+
360
+ The healer MAY emit bounded runtime patches, but only under guardrails enforced by the orchestrator.
361
+
362
+ Allowed runtime keys are:
363
+
364
+ - `timeout_sec`
365
+ - `concurrency`
366
+ - `current_batch_size`
367
+
368
+ The healer MUST NOT modify:
369
+
370
+ - `heal.schedule`
371
+ - `batch.strategy`
372
+ - verification commands
373
+ - protected-file rules
374
+ - parser behavior
375
+ - model provider or model identity
376
+
377
+ The orchestrator MUST validate runtime patches against operator-defined limits before applying them.
378
+
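A guardrail check for runtime patches might look like the sketch below. The flat patch object, the `OperatorLimits` shape, and the all-or-nothing acceptance policy are assumptions made for illustration; the actual healer contract is defined by `heal_decision.v2`, which is not reproduced here.

```typescript
type RuntimeKey = "timeout_sec" | "concurrency" | "current_batch_size";

const ALLOWED_RUNTIME_KEYS: readonly string[] = [
  "timeout_sec",
  "concurrency",
  "current_batch_size",
];

// Hypothetical operator-defined bounds for each allowed key.
type OperatorLimits = Record<RuntimeKey, { min: number; max: number }>;

interface PatchVerdict {
  ok: boolean; // assumption: the patch is applied only if there are no errors
  errors: string[];
}

function validateRuntimePatch(
  patch: Record<string, unknown>,
  limits: OperatorLimits
): PatchVerdict {
  const errors: string[] = [];
  for (const key of Object.keys(patch)) {
    const value = patch[key];
    if (ALLOWED_RUNTIME_KEYS.indexOf(key) === -1) {
      errors.push(`key not allowed: ${key}`);
      continue;
    }
    if (typeof value !== "number" || !Number.isFinite(value)) {
      errors.push(`value for ${key} must be a finite number`);
      continue;
    }
    const limit = limits[key as RuntimeKey];
    if (value < limit.min || value > limit.max) {
      errors.push(`${key}=${value} is outside [${limit.min}, ${limit.max}]`);
    }
  }
  return { ok: errors.length === 0, errors };
}
```

Because the allow-list is checked first, a patch that tries to touch `heal.schedule` or any other protected setting is rejected before its value is even inspected.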
379
+ ## 7. Public Interfaces and Schemas
380
+
381
+ All public contracts MUST be JSON, versioned, and schema-validated.
382
+
383
+ ### `manifest.v2`
384
+
385
+ #### Required Top-Level Fields
386
+
387
+ | Field | Type | Meaning |
388
+ | --- | --- | --- |
389
+ | `manifest_version` | string | Contract version. MUST be `"2.0"`. |
390
+ | `run_id` | string | Logical run identifier. |
391
+ | `tasks` | array | Ordered list of task definitions. |
392
+
393
+ #### Required Task Fields
394
+
395
+ | Field | Type | Meaning |
396
+ | --- | --- | --- |
397
+ | `id` | string | Stable, unique task identifier. |
398
+ | `prompt_ref` | string | Relative path or logical reference to the task prompt. |
399
+ | `depends_on` | array of strings | Upstream task IDs that must be `DONE` before execution. |
400
+ | `timeout_sec` | number | Task timeout in seconds. |
401
+ | `verify_profile` | string | Name of the project-defined verification profile. |
402
+
403
+ #### Optional Task Fields
404
+
405
+ | Field | Type | Meaning |
406
+ | --- | --- | --- |
407
+ | `context_refs` | array of strings | Shared context references applied to the task. |
408
+ | `priority` | number | Lower number means earlier scheduling within the same dependency depth. |
409
+ | `retry_policy` | object | Task-specific retry constraints. |
410
+ | `metadata` | object | Arbitrary task metadata for reporting or filtering. |
411
+
412
+ `metadata` keys SHOULD remain flat unless nested structure is required for interoperability with an external system.
413
+
414
+ #### `retry_policy` Shape
415
+
416
+ | Field | Type | Meaning |
417
+ | --- | --- | --- |
418
+ | `max_attempts` | number | Maximum worker attempts for this task. Defaults to global policy if omitted. |
419
+ | `retry_on` | array of strings | Failure classes eligible for retry. |
420
+
421
+ ### `verify_profile` Registry
422
+
423
+ `verify_profile` is a manifest reference to an operator-defined verification profile. The profile registry is outside the worker contract and MUST be resolved by the orchestrator from project configuration.
424
+
425
+ The canonical schema for this operator-owned registry is `schemas/verify_profile.v2.json`.
426
+
427
+ The minimum logical shape of a profile registry is:
428
+
429
+ ```json
430
+ {
431
+   "profiles": {
432
+     "build_and_test": {
433
+       "steps": [
434
+         {
435
+           "name": "build",
436
+           "cmd": "pnpm build",
437
+           "cwd": ".",
438
+           "timeout_sec": 300
439
+         },
440
+         {
441
+           "name": "test",
442
+           "cmd": "pnpm test",
443
+           "cwd": ".",
444
+           "timeout_sec": 600
445
+         }
446
+       ],
447
+       "rollback_on_failure": true
448
+     }
449
+   }
450
+ }
451
+ ```
452
+
453
+ The orchestrator MAY load this registry from any project-defined path, but the registry format SHOULD be documented wherever the runner is packaged and SHOULD validate against `verify_profile.v2`.
454
+
455
+ #### Example
456
+
457
+ ```json
458
+ {
459
+   "manifest_version": "2.0",
460
+   "run_id": "run-20260320-001",
461
+   "tasks": [
462
+     {
463
+       "id": "WP-017",
464
+       "prompt_ref": "prompts/WP-017.md",
465
+       "context_refs": ["_shared-context.md"],
466
+       "depends_on": [],
467
+       "priority": 1,
468
+       "timeout_sec": 900,
469
+       "verify_profile": "build_and_test",
470
+       "retry_policy": {
471
+         "max_attempts": 2,
472
+         "retry_on": ["prompt_gap", "timeout", "transient_infra"]
473
+       },
474
+       "metadata": {
475
+         "component": "button"
476
+       }
477
+     }
478
+   ]
479
+ }
480
+ ```
481
+
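The required-field tables for `manifest.v2` translate into a structural check along these lines. This is a hand-written sketch with a hypothetical `checkManifest` name; it is not the packaged validator and does not replace validation against the JSON schema. It covers only required fields, unique task IDs, and `depends_on` references.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a list of human-readable problems; an empty list means the
// manifest passed these structural checks.
function checkManifest(input: unknown): string[] {
  if (!isRecord(input)) return ["manifest must be a JSON object"];
  const problems: string[] = [];
  if (input["manifest_version"] !== "2.0") {
    problems.push('manifest_version must be "2.0"');
  }
  if (typeof input["run_id"] !== "string") problems.push("run_id must be a string");
  const tasks = input["tasks"];
  if (!Array.isArray(tasks)) {
    problems.push("tasks must be an array");
    return problems;
  }

  const ids = new Set<string>();
  for (const task of tasks) {
    if (!isRecord(task)) {
      problems.push("each task must be a JSON object");
      continue;
    }
    const id = task["id"];
    if (typeof id !== "string") {
      problems.push("task is missing a string id");
      continue;
    }
    if (ids.has(id)) problems.push(`duplicate task id: ${id}`);
    ids.add(id);
    for (const field of ["prompt_ref", "verify_profile"]) {
      if (typeof task[field] !== "string") problems.push(`${id}: ${field} must be a string`);
    }
    if (typeof task["timeout_sec"] !== "number") {
      problems.push(`${id}: timeout_sec must be a number`);
    }
    if (!Array.isArray(task["depends_on"])) {
      problems.push(`${id}: depends_on must be an array`);
    }
  }

  // Second pass: every dependency must name a task declared in the manifest.
  for (const task of tasks) {
    if (!isRecord(task)) continue;
    const deps = task["depends_on"];
    if (!Array.isArray(deps)) continue;
    for (const dep of deps) {
      if (typeof dep !== "string" || !ids.has(dep)) {
        problems.push(`${String(task["id"])}: depends on unknown task ${String(dep)}`);
      }
    }
  }
  return problems;
}
```

Returning a list instead of throwing on the first problem lets a command-line validator report every issue in a single pass.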
482
+ ### `task_result.v2`
483
+
484
+ The worker MUST emit exactly one fenced JSON block:
485
+
486
+ ```text
487
+ <<<TASK_RESULT_V2>>>
488
+ { ...json... }
489
+ <<<END_TASK_RESULT_V2>>>
490
+ ```
491
+
492
+ #### Required Fields
493
+
494
+ | Field | Type | Meaning |
495
+ | --- | --- | --- |
496
+ | `contract_version` | string | MUST be `"2.0"`. |
497
+ | `task_id` | string | Task ID matching the current manifest task. |
498
+ | `status` | string | One of `DONE`, `BLOCKED`, `FAILED`, `CONTRACT_ERROR`. |
499
+ | `summary` | string | Short human-readable summary of what happened. |
500
+
501
+ #### Optional Fields
502
+
503
+ | Field | Type | Meaning |
504
+ | --- | --- | --- |
505
+ | `changed_files` | array of strings | Relative file paths changed by the proposed work. |
506
+ | `writes` | array | Proposed file write operations applied by the orchestrator. |
507
+ | `evidence` | object | Commands, log references, or notes supplied by the worker. |
508
+ | `failure_class` | string | Optional worker-supplied hint. The orchestrator still owns final classification. |
509
+
510
+ #### `writes[]` Shape
511
+
512
+ | Field | Type | Meaning |
513
+ | --- | --- | --- |
514
+ | `path` | string | Relative normalized path. MUST NOT escape the workspace root. |
515
+ | `op` | string | One of `create`, `replace`, `append`. |
516
+ | `encoding` | string | MUST be `"utf8"` for the reference implementation. |
517
+ | `content` | string | Inline file content to be applied by the orchestrator. |
518
+ | `content_ref` | string | Optional path to staged content written by the worker tooling instead of inline content. |
519
+ | `sha256_before` | string | Optional precondition hash for conflict detection. |
520
+
521
+ At least one of `content` or `content_ref` MUST be present for each write entry.
522
+
523
+ The orchestrator SHOULD prefer inline `content` for small and medium files. `content_ref` MAY be used when the worker environment can stage large content more reliably than JSON escaping.
524
+
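The per-entry rules for `writes[]` can be checked mechanically before any write is applied. The following is a minimal sketch, assuming a hypothetical `WriteEntry` type and `checkWriteEntry` helper (neither name comes from the package); only the field names, the allowed `op` values, the `"utf8"` requirement, and the `content` / `content_ref` rule are taken from the tables above. Path safety is a separate check and is not covered here.

```ts
// Hypothetical type mirroring the `writes[]` table; not the package's own declaration.
type WriteOp = "create" | "replace" | "append";

interface WriteEntry {
  path: string;
  op: WriteOp;
  encoding: string;
  content?: string;
  content_ref?: string;
  sha256_before?: string;
}

const ALLOWED_OPS: readonly string[] = ["create", "replace", "append"];

// Returns human-readable problems; an empty list means the entry passed these checks.
function checkWriteEntry(entry: WriteEntry): string[] {
  const problems: string[] = [];
  // The runtime check matters because entries arrive as parsed JSON, not typed values.
  if (!ALLOWED_OPS.includes(entry.op)) {
    problems.push(`unsupported op: ${entry.op}`);
  }
  if (entry.encoding !== "utf8") {
    problems.push(`unsupported encoding: ${entry.encoding}`);
  }
  // At least one of `content` or `content_ref` must be present.
  if (entry.content === undefined && entry.content_ref === undefined) {
    problems.push("one of content or content_ref is required");
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one is a design choice of this sketch; it lets an orchestrator log every violation in a single pass.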
525
+ #### `evidence` Shape
526
+
527
+ | Field | Type | Meaning |
528
+ | --- | --- | --- |
529
+ | `commands` | array of strings | Commands the worker claims to have run. |
530
+ | `log_refs` | array of strings | Relative log references produced by the worker. |
531
+ | `notes` | array of strings | Additional structured evidence notes. |
532
+
533
+ #### Example
534
+
535
+ ```json
536
+ {
537
+ "contract_version": "2.0",
538
+ "task_id": "WP-017",
539
+ "status": "DONE",
540
+ "summary": "Implemented focus-visible fix and updated tests.",
541
+ "changed_files": [
542
+ "packages/ui/button.tsx",
543
+ "packages/ui/button.test.ts"
544
+ ],
545
+ "writes": [
546
+ {
547
+ "path": "packages/ui/button.tsx",
548
+ "op": "replace",
549
+ "encoding": "utf8",
550
+ "content": "export function Button() {}",
551
+ "sha256_before": "sha256:example"
552
+ }
553
+ ],
554
+ "evidence": {
555
+ "commands": [
556
+ "pnpm --filter sample-site test:filter button"
557
+ ],
558
+ "log_refs": [
559
+ "logs/WP-017.verify.log"
560
+ ]
561
+ }
562
+ }
563
+ ```
564
+
565
+ ### `heal_decision.v2`
566
+
567
+ The healer MUST emit exactly one fenced JSON block:
568
+
569
+ ```text
570
+ <<<HEAL_DECISION_V2>>>
571
+ { ...json... }
572
+ <<<END_HEAL_DECISION_V2>>>
573
+ ```
574
+
575
+ #### Required Fields
576
+
577
+ | Field | Type | Meaning |
578
+ | --- | --- | --- |
579
+ | `contract_version` | string | MUST be `"2.0"`. |
580
+ | `scope` | string | Advisory healer view of the current healing level. One of `task`, `batch`, `epoch`. |
581
+ | `decision` | string | One of `RETRY`, `ESCALATE`, `NOT_FIXABLE`. |
582
+ | `failure_class` | string | Normalized failure class the healer is addressing. |
583
+ | `root_cause` | string | One-sentence diagnosis of the repeated issue. |
584
+ | `patches` | array | Allowed patch operations. |
585
+
586
+ #### Optional Fields
587
+
588
+ | Field | Type | Meaning |
589
+ | --- | --- | --- |
590
+ | `learned_rule` | string | Reusable rule recorded by the orchestrator for future runs. |
591
+ | `escalations` | array | Explicit per-task escalation records. |
592
+ | `retry_policy` | object | Optional retry/reset directives for the orchestrator. |
593
+
594
+ #### `patches[]` Shape
595
+
596
+ | Field | Type | Meaning |
597
+ | --- | --- | --- |
598
+ | `target` | string | One of `shared_context`, `task_prompt`, `runtime_patch`, `contract_hint`. |
599
+ | `operation` | string | One of `replace`, `append`, `merge`. |
600
+ | `path` | string | Relative path of the file to patch. Required when target is `shared_context` or `task_prompt`. |
601
+ | `task_id` | string | Required when target is `task_prompt`. |
602
+ | `content` | string or object | Patch payload. String for text replacements, object for runtime merge content. |
603
+
604
+ `scope` is informational and MAY be recorded for diagnostics, but the orchestrator MUST derive actual patch applicability from:
605
+
606
+ - the active healing schedule
607
+ - the current window membership
608
+ - the patch targets present in `patches[]`
609
+
610
+ The orchestrator MUST NOT grant additional authority solely because the healer labeled a decision as `epoch` or `batch`.
611
+
612
+ `contract_hint` means non-authoritative text merged into future prompt assembly. It is not a file write by itself. In the rules below, the current healing scope is the set of task IDs included in the healer input bundle for the current invocation. The orchestrator MUST apply `contract_hint` as follows:
613
+
615
+ - if `task_id` is present, append the hint to the next assembled worker prompt for that task only
616
+ - if `task_id` is absent, append the hint to the next assembled prompt for every task in the current healing scope
617
+ - `contract_hint` MUST NOT be written to disk unless another patch explicitly writes a file
618
+
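The routing rules for `contract_hint` can be sketched as a pure function that decides which tasks receive which hints. This is an illustration under assumptions: `ContractHintPatch` and `routeContractHints` are hypothetical names, the hint payload is assumed to be a string, and a `task_id` outside the current healing scope is not filtered out because the text above does not say how to treat that case.

```ts
// Hypothetical, narrowed view of a `patches[]` entry whose target is `contract_hint`.
interface ContractHintPatch {
  target: "contract_hint";
  task_id?: string;
  content: string;
}

// Maps each task ID to the hints to append to that task's next assembled prompt.
// Nothing is written to disk; the result only feeds prompt assembly.
function routeContractHints(
  patches: ContractHintPatch[],
  healingScopeTaskIds: string[],
): Map<string, string[]> {
  const hintsByTask = new Map<string, string[]>();
  for (const patch of patches) {
    // With a task_id the hint goes to that task only; without one it goes to the whole scope.
    const targets = patch.task_id !== undefined ? [patch.task_id] : healingScopeTaskIds;
    for (const taskId of targets) {
      const existing = hintsByTask.get(taskId) ?? [];
      existing.push(patch.content);
      hintsByTask.set(taskId, existing);
    }
  }
  return hintsByTask;
}
```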
619
+ #### `retry_policy` Shape
620
+
621
+ | Field | Type | Meaning |
622
+ | --- | --- | --- |
623
+ | `reset_tasks` | array of strings | Tasks to reset to pending for retry. |
624
+ | `retry_window` | string | One of `same_window`, `shrink_window`, `next_epoch`. |
625
+
626
+ #### Example
627
+
628
+ ```json
629
+ {
630
+ "contract_version": "2.0",
631
+ "scope": "batch",
632
+ "decision": "RETRY",
633
+ "failure_class": "prompt_gap",
634
+ "root_cause": "Shared context omitted the import convention needed by multiple tasks.",
635
+ "patches": [
636
+ {
637
+ "target": "shared_context",
638
+ "operation": "append",
639
+ "path": "_shared-context.md",
640
+ "content": "Always include the cn() import rule."
641
+ },
642
+ {
643
+ "target": "task_prompt",
644
+ "operation": "replace",
645
+ "task_id": "WP-017",
646
+ "path": "prompts/WP-017.md",
647
+ "content": "Use the shared import convention and emit TASK_RESULT_V2."
648
+ },
649
+ {
650
+ "target": "runtime_patch",
651
+ "operation": "merge",
652
+ "content": {
653
+ "timeout_sec": 1200,
654
+ "current_batch_size": 2
655
+ }
656
+ },
657
+ {
658
+ "target": "contract_hint",
659
+ "operation": "append",
660
+ "task_id": "WP-017",
661
+ "content": "Return exactly one TASK_RESULT_V2 block at end of output."
662
+ }
663
+ ],
664
+ "learned_rule": "When GTS tasks fail in a group, patch shared context before retrying isolated prompts.",
665
+ "retry_policy": {
666
+ "reset_tasks": ["WP-017", "WP-018"],
667
+ "retry_window": "same_window"
668
+ },
669
+ "escalations": []
670
+ }
671
+ ```
672
+
673
+ ### `state.v2`
674
+
675
+ #### Required Top-Level Fields
676
+
677
+ | Field | Type | Meaning |
678
+ | --- | --- | --- |
679
+ | `state_version` | string | MUST be `"2.0"`. |
680
+ | `run_id` | string | Current run identifier. |
681
+ | `run_status` | string | One of `RUNNING`, `COMPLETED`, `ABORTED`. |
682
+ | `abort_reason` | string or null | Human-readable abort reason when `run_status` is `ABORTED`. |
683
+ | `manifest_digest` | string | Hash of the normalized manifest used for resume validation. |
684
+ | `policy` | object | Effective runtime policy for this run. |
685
+ | `tasks` | object | Per-task state keyed by task ID. |
686
+ | `healing_rounds` | array | Ordered record of healing checkpoints. |
687
+
688
+ #### Required `policy` Fields
689
+
690
+ | Field | Type | Meaning |
691
+ | --- | --- | --- |
692
+ | `heal_schedule` | string | Effective schedule mode. |
693
+ | `batch_strategy` | string | Usually `fibonacci` or `fixed`. |
694
+ | `current_batch_size` | number | Current effective window size. |
695
+ | `failure_threshold` | number | PBH threshold for the current run. |
696
+ | `max_worker_attempts_per_task` | number | Effective retry cap. |
697
+ | `max_heal_rounds_per_window` | number | Effective heal cap per window. |
698
+ | `max_total_heal_rounds` | number | Effective total heal budget. |
699
+ | `signature_repeat_limit` | number | Repeated signature escalation cap. |
700
+
701
+ #### Required Per-Task Fields
702
+
703
+ | Field | Type | Meaning |
704
+ | --- | --- | --- |
705
+ | `status` | string | One of `PENDING`, `RUNNING`, `DONE`, `BLOCKED`, `FAILED`, `ESCALATED`. |
706
+ | `worker_attempts` | number | Current worker attempt count. |
707
+ | `healer_attempts` | number | Current healer attempt count affecting the task. |
708
+ | `last_failure_class` | string or null | Most recent normalized failure class. |
709
+ | `last_failure_signature` | string or null | Most recent normalized failure signature. |
710
+ | `applied_patch_ids` | array of strings | Patch identifiers applied to this task or its shared context. |
711
+ | `history` | array | Attempt history records. |
712
+
713
+ #### Required History Fields
714
+
715
+ | Field | Type | Meaning |
716
+ | --- | --- | --- |
717
+ | `task_id` | string | Task ID for the record. |
718
+ | `phase` | string | One of `worker`, `verify`, `healer`, `rollback`. |
719
+ | `attempt_number` | number | Monotonic attempt number within the phase. |
720
+ | `log_path` | string | Relative path to the primary log. |
721
+ | `verify_log_path` | string or null | Relative path to verification log when applicable. |
722
+ | `exit_code` | number or null | Process exit code when applicable. |
723
+ | `failure_class` | string or null | Failure class for that attempt. |
724
+ | `failure_signature` | string or null | Failure signature for that attempt. |
725
+ | `applied_patch_ids` | array of strings | Patches active for that attempt. |
726
+ | `duration_sec` | number or null | Attempt duration in seconds when measurable. |
727
+ | `timestamp` | string | ISO-8601 timestamp. |
728
+
729
+ #### Example
730
+
731
+ ```json
732
+ {
733
+ "state_version": "2.0",
734
+ "run_id": "run-20260320-001",
735
+ "run_status": "RUNNING",
736
+ "abort_reason": null,
737
+ "manifest_digest": "sha256:example",
738
+ "policy": {
739
+ "heal_schedule": "auto",
740
+ "batch_strategy": "fibonacci",
741
+ "current_batch_size": 2,
742
+ "failure_threshold": 0.2,
743
+ "max_worker_attempts_per_task": 2,
744
+ "max_heal_rounds_per_window": 2,
745
+ "max_total_heal_rounds": 8,
746
+ "signature_repeat_limit": 2
747
+ },
748
+ "tasks": {
749
+ "WP-017": {
750
+ "status": "FAILED",
751
+ "worker_attempts": 1,
752
+ "healer_attempts": 1,
753
+ "last_failure_class": "build_error",
754
+ "last_failure_signature": "build_error:missing_cn_import",
755
+ "applied_patch_ids": ["patch-001"],
756
+ "history": [
757
+ {
758
+ "task_id": "WP-017",
759
+ "phase": "worker",
760
+ "attempt_number": 1,
761
+ "log_path": "logs/WP-017.worker.1.log",
762
+ "verify_log_path": "logs/WP-017.verify.1.log",
763
+ "exit_code": 0,
764
+ "failure_class": "build_error",
765
+ "failure_signature": "build_error:missing_cn_import",
766
+ "applied_patch_ids": [],
767
+ "duration_sec": 42,
768
+ "timestamp": "2026-03-20T15:21:00Z"
769
+ }
770
+ ]
771
+ }
772
+ },
773
+ "healing_rounds": [
774
+ {
775
+ "round_number": 1,
776
+ "scope": "batch",
777
+ "window_task_ids": ["WP-017", "WP-018"],
778
+ "failed_task_ids": ["WP-017", "WP-018"],
779
+ "decision": "RETRY",
780
+ "applied_patch_ids": ["patch-001"],
781
+ "timestamp": "2026-03-20T15:25:00Z"
782
+ }
783
+ ]
784
+ }
785
+ ```
786
+
787
+ ### `adapter.v2`
788
+
789
+ The reference adapter contract SHOULD be expressed in TypeScript like this:
790
+
791
+ ```ts
792
+ export type ParserErrorCode =
793
+ | "NO_SENTINEL"
794
+ | "INVALID_JSON"
795
+ | "SCHEMA_VIOLATION"
796
+ | "MISSING_REQUIRED_FIELD"
797
+ | "UNSUPPORTED_VERSION";
798
+
799
+ export interface PreparedInvocation {
800
+ cwd: string;
801
+ argv: string[];
802
+ env?: Record<string, string>;
803
+ stdin?: string | null;
804
+ timeoutSec: number;
805
+ }
806
+
807
+ export interface ExecutionArtifact {
808
+ logPath: string;
809
+ exitCode: number | null;
810
+ startedAt: string;
811
+ finishedAt: string;
812
+ }
813
+
814
+ export interface ParserFailure {
815
+ ok: false;
816
+ code: ParserErrorCode;
817
+ message: string;
818
+ }
819
+
820
+ export interface AdapterHealth {
821
+ ready: boolean;
822
+ details: string[];
823
+ }
824
+
825
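+ // ManifestTaskV2, RunContext, and TaskResultV2 are not declared in this block; they are assumed to come from the shared schema and runtime type modules.
+ export interface CliAdapter {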
826
+ id: string;
827
+ capabilities: {
828
+ stdinPrompt: boolean;
829
+ argPrompt: boolean;
830
+ pty: boolean;
831
+ interactive: boolean;
832
+ };
833
+ prepare(task: ManifestTaskV2, ctx: RunContext): PreparedInvocation;
834
+ execute(invocation: PreparedInvocation, ctx: RunContext): Promise<ExecutionArtifact>;
835
+ extractResult(artifact: ExecutionArtifact, ctx: RunContext): Promise<TaskResultV2 | ParserFailure>;
836
+ healthcheck(ctx: RunContext): Promise<AdapterHealth>;
837
+ }
838
+ ```
839
+
840
+ `extractResult` MUST validate `task_result.v2` and MUST return deterministic parser failures on invalid output.
841
+ `extractResult` MUST use the shared parser and validator utilities provided by the skill rather than adapter-specific parsing logic.
842
+
843
+ ## 8. Verification and Safety Model
844
+
845
+ ### Verification Ownership
846
+
847
+ Verification always belongs to the orchestrator. Verification MUST run after the worker output has been parsed successfully and before the task is recorded as `DONE`.
848
+
849
+ The worker MAY report evidence, but the worker does not own final pass or fail classification.
850
+
851
+ If verification fails, the orchestrator MUST NOT record `DONE` regardless of the worker-declared `status`.
852
+
853
+ ### Verification Layers
854
+
855
+ The orchestrator MUST support three verification layers:
856
+
857
+ | Layer | Timing | Purpose |
858
+ | --- | --- | --- |
859
+ | Post-parse validation | Immediately after parsing worker output | Validate contract integrity, candidate writes, path safety, and parser consistency. |
860
+ | Post-write build or test validation | After writes are applied | Detect build, test, lint, or type failures caused by the change. |
861
+ | Final smoke or browser validation | After build and test pass | Confirm runtime behavior, UI behavior, or custom project checks when needed. |
862
+
863
+ ### Allowed Write Path
864
+
865
+ The only canonical write path is:
866
+
867
+ 1. worker emits `writes[]` in `task_result.v2`
868
+ 2. orchestrator validates those writes
869
+ 3. orchestrator applies those writes
870
+ 4. orchestrator verifies the result
871
+
872
+ Worker output MUST NOT be treated as direct authority to mutate protected files outside this path.
873
+
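The four steps above, together with backup and rollback, can be sketched against an in-memory workspace. This is a simplified illustration, not the package's implementation: `Workspace`, `SimpleWrite`, and `applyWrites` are hypothetical names, the validation and verification steps are passed in as callbacks, and `create` and `replace` are treated alike for brevity.

```ts
// Hypothetical in-memory workspace: relative path -> file content.
type Workspace = Map<string, string>;

interface SimpleWrite {
  path: string;
  op: "create" | "replace" | "append";
  content: string;
}

type WriteOutcome = "applied" | "rejected" | "rolled_back";

function applyWrites(
  workspace: Workspace,
  writes: SimpleWrite[],
  validate: (write: SimpleWrite) => boolean,
  verify: (workspace: Workspace) => boolean,
): WriteOutcome {
  // Step 2: validate every write before anything is changed.
  if (!writes.every(validate)) {
    return "rejected";
  }
  // Safeguard: take a backup before writing.
  const backup = new Map(workspace);
  // Step 3: apply the writes.
  for (const write of writes) {
    const previous = workspace.get(write.path) ?? "";
    workspace.set(write.path, write.op === "append" ? previous + write.content : write.content);
  }
  // Step 4: verify, and restore the backup if verification fails.
  if (!verify(workspace)) {
    workspace.clear();
    backup.forEach((content, filePath) => workspace.set(filePath, content));
    return "rolled_back";
  }
  return "applied";
}
```

A real orchestrator would back up and restore files on disk; the map stands in for the filesystem so the control flow stays visible.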
874
+ ### Required Write Safeguards
875
+
876
+ The orchestrator MUST enforce these safeguards:
877
+
878
+ - path normalization so writes cannot escape the workspace root
879
+ - protected-file denylist
880
+ - shrinkage detection
881
+ - optional `sha256_before` precondition validation
882
+ - backup before write
883
+ - rollback on verification failure
884
+
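Path normalization, the first safeguard in the list above, might look like the sketch below. It assumes Node's built-in `path` module and a hypothetical helper name, `resolveInsideWorkspace`. It does not resolve symlinks, so a production check would likely need an additional real-path comparison.

```ts
import * as path from "node:path";

// Returns the resolved absolute path, or null when the candidate would land
// outside the workspace root (or is the root itself).
function resolveInsideWorkspace(workspaceRoot: string, candidate: string): string | null {
  // `writes[].path` is specified as relative, so absolute inputs are rejected outright.
  if (path.isAbsolute(candidate)) {
    return null;
  }
  const root = path.resolve(workspaceRoot);
  const resolved = path.resolve(root, candidate);
  const relative = path.relative(root, resolved);
  const escapes =
    relative === "" ||
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  return escapes ? null : resolved;
}
```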
885
+ The default shrinkage rule SHOULD reject a replacement when:
886
+
887
+ - the original file is larger than `100` bytes, and
888
+ - the replacement is less than `50%` of the original size, and
889
+ - the task or operator has not explicitly allowed the shrinkage
890
+
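The default shrinkage rule is a conjunction of three conditions, which a short predicate makes explicit. The function names below are hypothetical; the `100` byte and `50%` thresholds come from the list above, and sizes are assumed to be measured in UTF-8 bytes.

```ts
// Size of a string in UTF-8 bytes.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

// True when the default rule would reject the replacement.
function violatesShrinkageRule(
  originalBytes: number,
  replacementBytes: number,
  shrinkageAllowed: boolean,
): boolean {
  return (
    originalBytes > 100 &&
    replacementBytes < originalBytes * 0.5 &&
    !shrinkageAllowed
  );
}
```

For example, replacing a 200-byte file with 99 bytes is rejected, while a 100-byte replacement is accepted because it is not less than half the original size.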
891
+ ### Healer Patch Safety
892
+
893
+ Healer patches that modify prompt or shared context files are subject to the same write validation model described above.
894
+
895
+ The healer is forbidden from:
896
+
897
+ - editing product source files directly
898
+ - disabling verification
899
+ - bypassing protected-file rules
900
+ - changing healing schedule mid-run
901
+
902
+ `runtime_patch` targets MAY adjust bounded runtime settings only when those settings are exposed by the operator configuration.
903
+
904
+ ### Parser Authority
905
+
906
+ The parser and validator modules are the only authority allowed to interpret worker and healer contracts.
907
+
908
+ The orchestrator MUST NOT:
909
+
910
+ - parse unconstrained model prose with regex as the normative path
911
+ - trust exit code alone as success
912
+ - treat an unvalidated JSON body as a valid contract
913
+
914
+ If multiple fenced blocks exist in a log, the parser MUST use the last matching fenced block for that contract type. This rule exists to defeat prompt echo contamination and duplicate draft outputs.
915
+
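The last-block rule can be illustrated with a small extractor for the `TASK_RESULT_V2` sentinels. This is a sketch under assumptions, not the shared parser shipped with the package: `extractLastTaskResult` and the `Extracted` type are hypothetical, schema validation is left out, and only the sentinel strings and the `NO_SENTINEL` and `INVALID_JSON` codes are taken from this specification.

```ts
const OPEN_SENTINEL = "<<<TASK_RESULT_V2>>>";
const CLOSE_SENTINEL = "<<<END_TASK_RESULT_V2>>>";

type Extracted =
  | { ok: true; value: unknown }
  | { ok: false; code: "NO_SENTINEL" | "INVALID_JSON"; message: string };

function extractLastTaskResult(log: string): Extracted {
  // Find the last closing sentinel, then the nearest opening sentinel before it.
  const closeAt = log.lastIndexOf(CLOSE_SENTINEL);
  const openAt = closeAt === -1 ? -1 : log.lastIndexOf(OPEN_SENTINEL, closeAt);
  if (openAt === -1) {
    return { ok: false, code: "NO_SENTINEL", message: "no complete TASK_RESULT_V2 block found" };
  }
  const body = log.slice(openAt + OPEN_SENTINEL.length, closeAt).trim();
  try {
    // Schema validation would follow here in a real parser.
    return { ok: true, value: JSON.parse(body) };
  } catch {
    return { ok: false, code: "INVALID_JSON", message: "last TASK_RESULT_V2 block is not valid JSON" };
  }
}
```

Because the search runs from the end of the log, a block echoed from the prompt earlier in the output is ignored in favor of the worker's final block.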
916
+ ## 9. Adapter Model
917
+
918
+ ### Adapter Responsibilities
919
+
920
+ Adapters are the only place where CLI-specific behavior lives. An adapter MUST:
921
+
922
+ - construct the concrete CLI invocation
923
+ - decide whether prompt delivery uses stdin, argv, or PTY interaction
924
+ - manage PTY or expect requirements for interactive CLIs
925
+ - capture combined stdout and stderr to the execution log
926
+ - return execution artifacts to the orchestrator
927
+ - delegate contract parsing and schema validation to the shared parser and validator utilities
928
+
929
+ ### Orchestrator Responsibilities
930
+
931
+ The orchestrator core MUST remain CLI-agnostic outside the adapter boundary. It MUST NOT:
931
+
932
+ - call CLIs directly except through `CliAdapter.execute`
933
+ - parse raw logs without going through the parser and validator modules
934
+ - assume that exit code alone means success
937
+
938
+ ### Initial Reference Adapters
939
+
940
+ The initial reference adapters SHOULD be:
941
+
942
+ - `agent`
943
+ - `opencode`
944
+ - `claude`
945
+
946
+ These adapters MUST share one orchestrator contract model even if their invocation mechanics differ.
947
+
948
+ ### Interactive CLIs
949
+
950
+ For interactive CLIs, prompt rescue logic and TTY heuristics are adapter-local behavior. The core runtime MUST NOT embed tool-specific rescue logic.
951
+
952
+ Examples of adapter-local behavior include:
953
+
954
+ - trust prompt handling
955
+ - permission prompt handling
956
+ - idle detection
957
+ - PTY completion heuristics
958
+
959
+ Interactive adapters SHOULD also implement:
960
+
961
+ - ANSI stripping before parser handoff
962
+ - bounded idle detection
963
+ - finite rescue attempts for blocked prompts
964
+ - explicit completion detection before declaring success
965
+
966
+ ### Design Proof From Recent Tempest Runners
967
+
968
+ Recent Tempest runners provide empirical design proof for several behaviors that this specification adopts:
969
+
970
+ - the newest gap-remediation runner demonstrated dependency-aware scheduling, bounded concurrency, backups, and rollback on verification failure
971
+ - storybook audit and fix runners demonstrated long-running batch logging and semaphore or file-lock style coordination
972
+ - component-check plus expect wrappers demonstrated server lifecycle ownership and adapter-local handling for interactive CLIs
973
+
974
+ These examples are evidence for the design. They are not normative inputs to the specification and are not required to understand or implement v2.
975
+
976
+ ## 10. State, Resume, and Convergence Rules
977
+
978
+ ### Atomic State Writes
979
+
980
+ Atomic state writes are mandatory. The orchestrator MUST write state via a temporary file followed by atomic rename on the same filesystem.
981
+
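A temporary-file-plus-rename write might look like the following sketch, which assumes Node's built-in `fs` and `path` modules. The helper name `writeStateAtomically` and the temporary file naming scheme are hypothetical. The temporary file is created in the same directory as the target so that the rename stays on one filesystem; rename atomicity is a POSIX guarantee and may behave differently on other platforms.

```ts
import * as fs from "node:fs";
import * as path from "node:path";

function writeStateAtomically(statePath: string, state: unknown): void {
  const dir = path.dirname(statePath);
  // Same directory as the target, so the rename does not cross filesystems.
  const tmpPath = path.join(dir, `.${path.basename(statePath)}.${process.pid}.tmp`);
  const fd = fs.openSync(tmpPath, "w");
  try {
    fs.writeSync(fd, JSON.stringify(state, null, 2));
    // Flush file contents to disk before the rename makes them visible.
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
  // Readers see either the old state file or the new one, never a partial write.
  fs.renameSync(tmpPath, statePath);
}
```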
982
+ ### Resume Semantics
983
+
984
+ The orchestrator MUST implement resume like this:
985
+
986
+ - `DONE` tasks are skipped on resume if `manifest_digest` still matches
987
+ - if `manifest_digest` changes, the orchestrator MUST warn or force reconciliation before reuse
988
+ - `ESCALATED` tasks are not retried automatically
989
+ - `FAILED` and `BLOCKED` tasks are eligible only if retry policy allows it
990
+
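The resume rules above can be read as a small decision function. The sketch below is illustrative only: `resumeAction` and its types are hypothetical, the retry policy is reduced to an attempt cap, a digest mismatch is assumed to block reuse of every task until reconciliation, and an interrupted `RUNNING` task is assumed to be eligible to run again. The text above does not specify those last two points.

```ts
type TaskStatus = "PENDING" | "RUNNING" | "DONE" | "BLOCKED" | "FAILED" | "ESCALATED";
type ResumeAction = "skip" | "run" | "reconcile_first";

interface ResumeTask {
  status: TaskStatus;
  workerAttempts: number;
  maxAttempts: number;
}

function resumeAction(task: ResumeTask, manifestDigestMatches: boolean): ResumeAction {
  // A changed manifest digest means prior state cannot be reused as-is.
  if (!manifestDigestMatches) {
    return "reconcile_first";
  }
  switch (task.status) {
    case "DONE":
    case "ESCALATED":
      // Finished work is skipped; escalated tasks are never retried automatically.
      return "skip";
    case "FAILED":
    case "BLOCKED":
      // Eligible only while the (simplified) retry policy still allows an attempt.
      return task.workerAttempts < task.maxAttempts ? "run" : "skip";
    case "PENDING":
    case "RUNNING":
      return "run";
  }
}
```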
991
+ The orchestrator SHOULD provide a reconciliation mode that can mark affected tasks back to `PENDING` when the manifest changes in a way that invalidates prior attempts.
992
+
993
+ At minimum, reconciliation SHOULD handle:
994
+
995
+ - tasks removed from the manifest since the last run
996
+ - tasks added since the last run
997
+ - tasks whose `prompt_ref`, `depends_on`, or `verify_profile` changed
998
+
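A reconciliation mode first needs to know which tasks were removed, added, or changed. The following sketch compares two task lists on the three fields named above. `ManifestTaskLite`, `fingerprint`, and `diffManifestTasks` are hypothetical names, and treating `depends_on` as order-insensitive is an assumption of this example.

```ts
// Hypothetical reduced view of a manifest task; only the fields relevant to reconciliation.
interface ManifestTaskLite {
  id: string;
  prompt_ref: string;
  depends_on: string[];
  verify_profile: string;
}

interface ManifestDiff {
  removed: string[];
  added: string[];
  changed: string[];
}

// Stable string covering the fields whose change invalidates prior attempts.
function fingerprint(task: ManifestTaskLite): string {
  return JSON.stringify([task.prompt_ref, [...task.depends_on].sort(), task.verify_profile]);
}

function diffManifestTasks(previous: ManifestTaskLite[], current: ManifestTaskLite[]): ManifestDiff {
  const before = new Map(previous.map((t): [string, ManifestTaskLite] => [t.id, t]));
  const after = new Map(current.map((t): [string, ManifestTaskLite] => [t.id, t]));
  const diff: ManifestDiff = { removed: [], added: [], changed: [] };
  before.forEach((task, id) => {
    const next = after.get(id);
    if (next === undefined) {
      diff.removed.push(id);
    } else if (fingerprint(next) !== fingerprint(task)) {
      diff.changed.push(id);
    }
  });
  after.forEach((_task, id) => {
    if (!before.has(id)) {
      diff.added.push(id);
    }
  });
  return diff;
}
```

Tasks listed under `changed` are the natural candidates for being marked back to `PENDING`.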
999
+ ### Failure Signature Generation
1000
+
1001
+ The orchestrator MUST generate stable failure signatures. The failure signature algorithm MUST:
1002
+
1003
+ 1. start with the normalized failure class
1004
+ 2. extract the primary stable signal from parser output, verification logs, or known error codes
1005
+ 3. remove timestamps, absolute paths, task IDs, and obviously unstable numeric fragments where possible
1006
+ 4. lowercase and collapse whitespace
1007
+ 5. truncate to a stable maximum length
1008
+
1009
+ The resulting format SHOULD be:
1010
+
1011
+ ```text
1012
+ <failure_class>:<normalized_primary_signal>
1013
+ ```
1014
+
1015
+ Examples:
1016
+
1017
+ - `contract_error:no_sentinel`
1018
+ - `build_error:missing_cn_import`
1019
+ - `timeout:worker_idle`
1020
+
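The five normalization steps can be sketched as a chain of string replacements. This is one possible reading, not the package's algorithm: `failureSignature` is a hypothetical name, the regular expressions are rough heuristics, the task ID pattern assumes IDs shaped like `WP-017`, and joining words with underscores is inferred from the examples above rather than stated by the rules.

```ts
function failureSignature(failureClass: string, rawSignal: string, maxLength = 80): string {
  const normalized = rawSignal
    // Step 3a: drop ISO-8601 style timestamps.
    .replace(/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?/g, " ")
    // Step 3b: drop absolute paths (POSIX or drive-letter style).
    .replace(/(?:[A-Za-z]:)?[\\\/][^\s:]+/g, " ")
    // Step 3c: drop task IDs (assumed format: uppercase letters, hyphen, digits).
    .replace(/\b[A-Z]+-\d+\b/g, " ")
    // Step 3d: drop remaining numeric fragments such as line numbers.
    .replace(/\d+/g, " ")
    // Step 4: lowercase and collapse every run of non-letters into one underscore.
    .toLowerCase()
    .replace(/[^a-z]+/g, "_")
    .replace(/^_+|_+$/g, "");
  // Steps 1 and 5: prefix with the failure class, then truncate.
  return `${failureClass.toLowerCase()}:${normalized}`.slice(0, maxLength);
}
```

With this sketch, two log lines that differ only in timestamp, task ID, or absolute path produce the same signature, which is the property the escalation rules rely on.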
1021
+ ### Convergence Rules
1022
+
1023
+ Healing is converging only if at least one of these conditions is true after a healing round:
1024
+
1025
+ - total failing task count drops
1026
+ - repeated signature count drops
1027
+ - a broader failure class narrows to a more local and isolated issue
1028
+
1029
+ Healing is non-convergent if any of these conditions is true:
1030
+
1031
+ - the same task repeats the same signature after allowed retries
1032
+ - the same failing set persists across rounds
1033
+ - total healing budget is exhausted without measurable improvement
1034
+
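The measurable parts of these rules can be expressed as a comparison between two round snapshots. The sketch below is an assumption-laden illustration: `RoundSnapshot` and `assessHealingRound` are hypothetical, the "failure class narrows" condition is omitted because it needs judgment that a counter cannot capture, and checking for improvement before checking for a persistent failing set is a choice of this example, since the two lists above are not mutually exclusive.

```ts
interface RoundSnapshot {
  failingTaskIds: string[];
  repeatedSignatureCount: number;
}

type Verdict = "converging" | "non_convergent" | "undetermined";

function sameSet(a: string[], b: string[]): boolean {
  const left = new Set(a);
  const right = new Set(b);
  return left.size === right.size && a.every((id) => right.has(id));
}

function assessHealingRound(
  before: RoundSnapshot,
  after: RoundSnapshot,
  healRoundsUsed: number,
  maxTotalHealRounds: number,
): Verdict {
  const improved =
    after.failingTaskIds.length < before.failingTaskIds.length ||
    after.repeatedSignatureCount < before.repeatedSignatureCount;
  if (improved) {
    return "converging";
  }
  // No measurable improvement: a persistent failing set or an exhausted budget ends healing.
  if (sameSet(before.failingTaskIds, after.failingTaskIds)) {
    return "non_convergent";
  }
  if (healRoundsUsed >= maxTotalHealRounds) {
    return "non_convergent";
  }
  return "undetermined";
}
```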
1035
+ ### Escalation Rules
1036
+
1037
+ Per-task escalation MUST happen when:
1038
+
1039
+ - a task repeats the same failure signature `signature_repeat_limit` times after healing
1040
+ - a failure is classified as non-healable and retry policy does not permit further attempts
1041
+
1042
+ Per-run escalation MUST happen when:
1043
+
1044
+ - `max_total_heal_rounds` is exhausted without convergence
1045
+ - the orchestrator determines that continuing would only repeat the same failure set
1046
+
1047
+ Escalated tasks MUST remain in state for reporting and MUST NOT be silently dropped.
1048
+
1049
+ When a run is aborted for non-convergence, the orchestrator MUST:
1050
+
1051
+ - set `run_status` to `ABORTED`
1052
+ - write a non-empty `abort_reason`
1053
+ - persist the final failing task set and last observed failure signatures
1054
+ - include the same reason in the human-readable run summary
1055
+
1056
+ ### Learned Rule Lifecycle
1057
+
1058
+ `learned_rule` entries are durable run artifacts, not automatic policy changes.
1059
+
1060
+ The orchestrator MUST:
1061
+
1062
+ - record each accepted `learned_rule` in state or a linked healing journal
1063
+ - record which healing round produced the rule
1064
+
1065
+ The orchestrator MUST NOT automatically promote a learned rule into canonical shared context unless an operator or higher-level workflow explicitly chooses to do so.
1066
+
1067
+ The orchestrator MAY expose learned rules to future runs as optional advisory input, but this behavior MUST be opt-in and clearly labeled as non-canonical.
1068
+
1069
+ ## 11. Rollout and Migration
1070
+
1071
+ ### Replace-Now Migration Strategy
1072
+
1073
+ This specification assumes a replace-now migration. The migration steps are:
1074
+
1075
+ 1. Freeze schemas and enums.
1076
+ 2. Publish canonical `SKILL.md` and `SPEC.md`.
1077
+ 3. Add the TypeScript reference orchestrator (run via `tsx`) and validator modules.
1078
+ 4. Wrap current shell and expect flows behind adapters.
1079
+ 5. Replace platform-specific authoritative docs with thin wrappers.
1080
+ 6. Mark legacy parsing and legacy contract docs as deprecated.
1081
+ 7. Run a reference validation manifest before declaring cutover complete.
1082
+
1083
+ ### Legacy Variant Behavior
1084
+
1085
+ After cutover:
1086
+
1087
+ - old docs remain historical references only
1088
+ - old docs MUST NOT define new behavior
1089
+ - legacy text or XML parsing MAY exist only as an explicit compatibility plugin
1090
+ - compatibility plugins are non-normative and MUST NOT be the default path
1091
+
1092
+ ### Wrapper Rules
1093
+
1094
+ Platform-specific wrapper files MUST:
1095
+
1096
+ - point to the canonical `SKILL.md` and `SPEC.md`
1097
+ - stay thin
1098
+ - avoid duplicating architecture
1099
+
1100
+ Platform-specific wrapper files MUST NOT:
1101
+
1102
+ - redefine healing policy
1103
+ - redefine contract formats
1104
+ - introduce parser behavior not described by the canonical spec
1105
+
1106
+ ### Recommended Packaging Outcome
1107
+
1108
+ The migration SHOULD leave one canonical source tree plus thin platform wrappers. Monolithic platform-specific architecture documents SHOULD be retired as authoritative artifacts.
1109
+
1110
+ ## 12. Test Plan and Acceptance Criteria
1111
+
1112
+ ### Architecture-Level Test Scenarios
1113
+
1114
+ The minimum test matrix MUST cover:
1115
+
1116
+ - valid worker JSON result
1117
+ - missing worker fence
1118
+ - invalid worker JSON
1119
+ - invalid healer JSON
1120
+ - schema violation on either contract
1121
+ - prompt echo contamination with multiple fenced blocks
1122
+ - automatic contract-error retry without heal-budget consumption
1123
+ - successful verification after write
1124
+ - failed verification with rollback
1125
+ - protected-file rejection
1126
+ - shrinkage rejection
1127
+ - concurrent window completion with atomic checkpoints
1128
+ - signal-triggered shutdown with resumable state
1129
+ - resume after interruption
1130
+ - manifest digest mismatch
1131
+ - PBH growth on stable windows
1132
+ - PBH retry on moderate failures
1133
+ - PBH shrink on instability
1134
+ - PBH abort with recorded non-convergence reason
1135
+ - repeated-signature escalation
1136
+ - adapter parity across `agent`, `opencode`, and `claude`
1137
+
1138
+ ### Implementation Acceptance Criteria
1139
+
1140
+ The implementation is complete only if:
1141
+
1142
+ - all platform wrappers consume the same canonical schemas
1143
+ - the reference runtime uses the same control flow and healing policy regardless of adapter
1144
+ - at least one integration test exists per adapter
1145
+ - legacy regex-only parser paths are disabled by default
1146
+ - migration behavior is documented without ambiguity
1147
+
1148
+ ### Document Acceptance Criteria
1149
+
1150
+ The document is acceptable only if:
1151
+
1152
+ - a peer unfamiliar with repository history can explain the system after reading it once
1153
+ - a second engineer can implement schemas and runtime behavior without asking what key terms mean
1154
+ - all defaults, stop conditions, and safety rules are explicitly named
1155
+ - the specification no longer depends on Tempest-specific local paths to make sense
1156
+
1157
+ ## 13. Appendix: Mapping from Legacy Variants
1158
+
1159
+ | Legacy concept | v2 mapping |
1160
+ | --- | --- |
1161
+ | Per-task healing | `heal.schedule = task` |
1162
+ | Fixed batch healing | `heal.schedule = batch` with fixed `batch_size` |
1163
+ | Epoch healing | `heal.schedule = epoch` |
1164
+ | New default healing | `heal.schedule = auto` with `PBH` |
1165
+ | Shell-first regex parser stack | Fenced JSON contracts plus schema validation |
1166
+ | Monolithic platform docs | Thin wrappers pointing to canonical `SKILL.md` and `SPEC.md` |
1167
+ | Shell or expect orchestration cores | Adapter implementations beneath the reference orchestrator or conforming alternate runtimes |
1168
+ | Tempest runner behavior | Design proof only, not normative spec text |
1169
+
1170
+ The practical effect of this mapping is:
1171
+
1172
+ - old per-task healing still exists, but it is no longer the default
1173
+ - old fixed batch healing still exists, but it is now an explicit override
1174
+ - old epoch healing still exists, but it is now an explicit override
1175
+ - the new default is adaptive `PBH`
1176
+ - the new parser path is structured JSON plus schema validation, not text scraping
1177
+
1178
+ ## Final Position
1179
+
1180
+ v2 is a spec-first, adapter-based, typed orchestration system with:
1181
+
1182
+ - one orchestrator-owned execution model
1183
+ - one strict JSON contract stack
1184
+ - one adapter boundary for multiple CLIs
1185
+ - one default healing policy: `PBH`
1186
+ - one reference runtime: TypeScript via global `tsx`
1187
+ - conforming alternate runtimes allowed if they preserve the same contracts, state transitions, parser guarantees, and healing behavior
1188
+
1189
+ This document is written so that a peer can understand the system without any other repo context and implement it without needing unwritten design assumptions.