maestro-flow 0.4.17 → 0.4.19
- package/.agents/skills/maestro/SKILL.md +1 -1
- package/.agents/skills/maestro-analyze/SKILL.md +5 -0
- package/.agents/skills/maestro-blueprint/SKILL.md +5 -0
- package/.agents/skills/maestro-brainstorm/SKILL.md +5 -0
- package/.agents/skills/maestro-next/SKILL.md +254 -0
- package/.agents/skills/team-swarm/SKILL.md +180 -0
- package/.agents/skills/team-swarm/roles/analyst/role.md +187 -0
- package/.agents/skills/team-swarm/roles/ant/role.md +169 -0
- package/.agents/skills/team-swarm/roles/coordinator/commands/converge.md +146 -0
- package/.agents/skills/team-swarm/roles/coordinator/commands/init-swarm.md +136 -0
- package/.agents/skills/team-swarm/roles/coordinator/commands/iterate.md +232 -0
- package/.agents/skills/team-swarm/roles/coordinator/role.md +211 -0
- package/.agents/skills/team-swarm/roles/scorer/role.md +157 -0
- package/.agents/skills/team-swarm/scripts/aco.py +473 -0
- package/.agents/skills/team-swarm/scripts/pheromone.py +144 -0
- package/.agents/skills/team-swarm/scripts/scoring.py +92 -0
- package/.agents/skills/team-swarm/scripts/test_aco.py +475 -0
- package/.agents/skills/team-swarm/specs/ant-output-schema.md +119 -0
- package/.agents/skills/team-swarm/specs/convergence-criteria.md +106 -0
- package/.agents/skills/team-swarm/specs/pheromone-schema.md +123 -0
- package/.agents/skills/team-swarm/specs/swarm-config-template.json +71 -0
- package/.agents/skills/team-swarm/specs/swarm-protocol.md +117 -0
- package/.agy/skills/maestro/SKILL.md +1 -1
- package/.agy/skills/maestro-analyze/SKILL.md +5 -0
- package/.agy/skills/maestro-blueprint/SKILL.md +5 -0
- package/.agy/skills/maestro-brainstorm/SKILL.md +5 -0
- package/.agy/skills/maestro-next/SKILL.md +250 -0
- package/.agy/skills/team-swarm/SKILL.md +176 -0
- package/.agy/skills/team-swarm/roles/analyst/role.md +183 -0
- package/.agy/skills/team-swarm/roles/ant/role.md +165 -0
- package/.agy/skills/team-swarm/roles/coordinator/commands/converge.md +134 -0
- package/.agy/skills/team-swarm/roles/coordinator/commands/init-swarm.md +136 -0
- package/.agy/skills/team-swarm/roles/coordinator/commands/iterate.md +202 -0
- package/.agy/skills/team-swarm/roles/coordinator/role.md +209 -0
- package/.agy/skills/team-swarm/roles/scorer/role.md +153 -0
- package/.agy/skills/team-swarm/scripts/aco.py +473 -0
- package/.agy/skills/team-swarm/scripts/pheromone.py +144 -0
- package/.agy/skills/team-swarm/scripts/scoring.py +92 -0
- package/.agy/skills/team-swarm/scripts/test_aco.py +475 -0
- package/.agy/skills/team-swarm/specs/ant-output-schema.md +119 -0
- package/.agy/skills/team-swarm/specs/convergence-criteria.md +106 -0
- package/.agy/skills/team-swarm/specs/pheromone-schema.md +123 -0
- package/.agy/skills/team-swarm/specs/swarm-config-template.json +71 -0
- package/.agy/skills/team-swarm/specs/swarm-protocol.md +117 -0
- package/.claude/commands/maestro-analyze.md +5 -0
- package/.claude/commands/maestro-blueprint.md +5 -0
- package/.claude/commands/maestro-brainstorm.md +5 -0
- package/.claude/commands/maestro-next.md +252 -0
- package/.claude/commands/maestro.md +1 -1
- package/.claude/skills/team-swarm/SKILL.md +178 -0
- package/.claude/skills/team-swarm/roles/analyst/role.md +185 -0
- package/.claude/skills/team-swarm/roles/ant/role.md +167 -0
- package/.claude/skills/team-swarm/roles/coordinator/commands/converge.md +146 -0
- package/.claude/skills/team-swarm/roles/coordinator/commands/init-swarm.md +136 -0
- package/.claude/skills/team-swarm/roles/coordinator/commands/iterate.md +232 -0
- package/.claude/skills/team-swarm/roles/coordinator/role.md +209 -0
- package/.claude/skills/team-swarm/roles/scorer/role.md +155 -0
- package/.claude/skills/team-swarm/scripts/aco.py +473 -0
- package/.claude/skills/team-swarm/scripts/pheromone.py +144 -0
- package/.claude/skills/team-swarm/scripts/scoring.py +92 -0
- package/.claude/skills/team-swarm/scripts/test_aco.py +475 -0
- package/.claude/skills/team-swarm/specs/ant-output-schema.md +119 -0
- package/.claude/skills/team-swarm/specs/convergence-criteria.md +106 -0
- package/.claude/skills/team-swarm/specs/pheromone-schema.md +123 -0
- package/.claude/skills/team-swarm/specs/swarm-config-template.json +71 -0
- package/.claude/skills/team-swarm/specs/swarm-protocol.md +117 -0
- package/.codex/skills/learn-decompose/SKILL.md +34 -3
- package/.codex/skills/learn-retro/SKILL.md +31 -1
- package/.codex/skills/learn-second-opinion/SKILL.md +34 -4
- package/.codex/skills/maestro-analyze/SKILL.md +44 -5
- package/.codex/skills/maestro-blueprint/SKILL.md +5 -0
- package/.codex/skills/maestro-brainstorm/SKILL.md +46 -0
- package/.codex/skills/maestro-execute/SKILL.md +61 -5
- package/.codex/skills/maestro-milestone-audit/SKILL.md +64 -13
- package/.codex/skills/maestro-milestone-complete/SKILL.md +12 -0
- package/.codex/skills/maestro-next/SKILL.md +297 -0
- package/.codex/skills/maestro-plan/SKILL.md +36 -1
- package/.codex/skills/maestro-player/SKILL.md +25 -6
- package/.codex/skills/maestro-ralph/SKILL.md +17 -10
- package/.codex/skills/maestro-ralph-execute/SKILL.md +2 -1
- package/.codex/skills/maestro-roadmap/SKILL.md +35 -4
- package/.codex/skills/maestro-ui-codify/SKILL.md +38 -10
- package/.codex/skills/maestro-verify/SKILL.md +40 -5
- package/.codex/skills/manage-codebase-rebuild/SKILL.md +52 -5
- package/.codex/skills/manage-issue-discover/SKILL.md +106 -15
- package/.codex/skills/quality-auto-test/SKILL.md +70 -16
- package/.codex/skills/quality-debug/SKILL.md +139 -28
- package/.codex/skills/quality-refactor/SKILL.md +61 -11
- package/.codex/skills/quality-review/SKILL.md +45 -9
- package/.codex/skills/quality-test/SKILL.md +58 -3
- package/.codex/skills/security-audit/SKILL.md +38 -0
- package/.codex/skills/spec-map/SKILL.md +65 -8
- package/.codex/skills/team-coordinate/SKILL.md +28 -11
- package/.codex/skills/team-coordinate/specs/role-catalog.md +20 -0
- package/.codex/skills/team-lifecycle-v4/SKILL.md +23 -7
- package/.codex/skills/team-lifecycle-v4/instructions/agent-instruction.md +20 -0
- package/.codex/skills/team-quality-assurance/SKILL.md +40 -2
- package/.codex/skills/team-review/SKILL.md +42 -2
- package/.codex/skills/team-tech-debt/SKILL.md +45 -2
- package/.codex/skills/team-testing/SKILL.md +42 -2
- package/dashboard/dist-server/dashboard/src/server/wiki/search.d.ts +6 -4
- package/dashboard/dist-server/dashboard/src/server/wiki/search.js +50 -8
- package/dashboard/dist-server/dashboard/src/server/wiki/search.js.map +1 -1
- package/dashboard/dist-server/dashboard/src/server/wiki/virtual-wiki-adapters.d.ts +32 -0
- package/dashboard/dist-server/dashboard/src/server/wiki/virtual-wiki-adapters.js +294 -0
- package/dashboard/dist-server/dashboard/src/server/wiki/virtual-wiki-adapters.js.map +1 -1
- package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.d.ts +1 -0
- package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.js +35 -1
- package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.js.map +1 -1
- package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.test.js +235 -0
- package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.test.js.map +1 -1
- package/dist/src/commands/install.js +5 -1
- package/dist/src/commands/install.js.map +1 -1
- package/dist/src/i18n/locales/en.d.ts.map +1 -1
- package/dist/src/i18n/locales/en.js +9 -0
- package/dist/src/i18n/locales/en.js.map +1 -1
- package/dist/src/i18n/locales/zh.d.ts.map +1 -1
- package/dist/src/i18n/locales/zh.js +9 -0
- package/dist/src/i18n/locales/zh.js.map +1 -1
- package/dist/src/i18n/types.d.ts +3 -0
- package/dist/src/i18n/types.d.ts.map +1 -1
- package/dist/src/ralph/cmd-check.js +1 -1
- package/dist/src/ralph/cmd-check.js.map +1 -1
- package/dist/src/ralph/cmd-complete.js +1 -1
- package/dist/src/ralph/cmd-complete.js.map +1 -1
- package/dist/src/ralph/cmd-next.d.ts.map +1 -1
- package/dist/src/ralph/cmd-next.js +12 -4
- package/dist/src/ralph/cmd-next.js.map +1 -1
- package/dist/src/ralph/cmd-session.js +2 -2
- package/dist/src/ralph/cmd-session.js.map +1 -1
- package/dist/src/ralph/status-store.d.ts +8 -1
- package/dist/src/ralph/status-store.d.ts.map +1 -1
- package/dist/src/ralph/status-store.js +12 -2
- package/dist/src/ralph/status-store.js.map +1 -1
- package/dist/src/tools/store-knowhow.d.ts.map +1 -1
- package/dist/src/tools/store-knowhow.js +51 -64
- package/dist/src/tools/store-knowhow.js.map +1 -1
- package/dist/src/tui/install-ui/HooksConfig.d.ts +5 -1
- package/dist/src/tui/install-ui/HooksConfig.d.ts.map +1 -1
- package/dist/src/tui/install-ui/HooksConfig.js +5 -3
- package/dist/src/tui/install-ui/HooksConfig.js.map +1 -1
- package/dist/src/tui/install-ui/InstallConfirm.d.ts +2 -0
- package/dist/src/tui/install-ui/InstallConfirm.d.ts.map +1 -1
- package/dist/src/tui/install-ui/InstallConfirm.js +1 -1
- package/dist/src/tui/install-ui/InstallConfirm.js.map +1 -1
- package/dist/src/tui/install-ui/InstallExecution.d.ts +1 -0
- package/dist/src/tui/install-ui/InstallExecution.d.ts.map +1 -1
- package/dist/src/tui/install-ui/InstallExecution.js +26 -3
- package/dist/src/tui/install-ui/InstallExecution.js.map +1 -1
- package/dist/src/tui/install-ui/InstallFlow.d.ts +1 -1
- package/dist/src/tui/install-ui/InstallFlow.d.ts.map +1 -1
- package/dist/src/tui/install-ui/InstallFlow.js +76 -16
- package/dist/src/tui/install-ui/InstallFlow.js.map +1 -1
- package/dist/src/tui/install-ui/InstallHub.d.ts +2 -0
- package/dist/src/tui/install-ui/InstallHub.d.ts.map +1 -1
- package/dist/src/tui/install-ui/InstallHub.js +8 -0
- package/dist/src/tui/install-ui/InstallHub.js.map +1 -1
- package/dist/src/tui/install-ui/InstallResult.d.ts.map +1 -1
- package/dist/src/tui/install-ui/InstallResult.js +1 -1
- package/dist/src/tui/install-ui/InstallResult.js.map +1 -1
- package/dist/src/utils/update-notices.js +23 -0
- package/dist/src/utils/update-notices.js.map +1 -1
- package/package.json +1 -1
- package/workflows/finish-work.md +119 -0
- package/workflows/milestone-complete.md +23 -1
@@ -141,6 +141,28 @@ For each layer L1->L3 (sequential, respecting --layer filter):
 6. Record per-scenario pass/fail
 7. Fail-fast: any critical-priority failed -> stop layer progression

+**Test Writer Spawn output_schema** (strict JSON Schema, used for both writer + diagnosis spawns):
+
+```json
+{
+  "type": "object",
+  "properties": {
+    "id": { "type": "string" },
+    "result_status": { "type": "string", "enum": ["completed", "failed", "blocked"] },
+    "red_result": { "type": "string", "enum": ["expected_fail", "pass", "unexpected_fail", ""] },
+    "classification": { "type": "string", "enum": ["test_defect", "code_defect", "env_issue", ""] },
+    "fix_code": { "type": "string" },
+    "evidence": { "type": "string" },
+    "findings": { "type": "string", "maxLength": 500 },
+    "files_modified": { "type": "string" },
+    "error": { "type": "string" }
+  },
+  "required": ["id", "result_status", "findings"]
+}
+```
+
+Merge: `result_status` → master `status`; copy `red_result` / `classification` / `fix_code` / `evidence` / `findings` / `files_modified` / `error`.
+
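The schema above relies on only four keywords (`type`, `required`, `enum`, `maxLength`). As a rough illustration of what an orchestrator could check before merging a worker result, the sketch below hand-rolls those four checks. `validateResult` and the sample ids `S1`/`S2` are hypothetical names used for illustration, not part of maestro-flow, and this is not a full JSON Schema validator.

```javascript
// Copy of the Test Writer output_schema shown above.
const TEST_WRITER_SCHEMA = {
  type: "object",
  properties: {
    id: { type: "string" },
    result_status: { type: "string", enum: ["completed", "failed", "blocked"] },
    red_result: { type: "string", enum: ["expected_fail", "pass", "unexpected_fail", ""] },
    classification: { type: "string", enum: ["test_defect", "code_defect", "env_issue", ""] },
    fix_code: { type: "string" },
    evidence: { type: "string" },
    findings: { type: "string", maxLength: 500 },
    files_modified: { type: "string" },
    error: { type: "string" },
  },
  required: ["id", "result_status", "findings"],
};

// Hypothetical helper: returns a list of human-readable violations (empty = valid).
function validateResult(result, schema) {
  const errors = [];
  for (const key of schema.required) {
    if (!(key in result)) errors.push(`missing required field: ${key}`);
  }
  for (const [key, value] of Object.entries(result)) {
    const rule = schema.properties[key];
    if (!rule) continue; // assumption: unknown keys are ignored in this sketch
    if (typeof value !== rule.type) {
      errors.push(`${key}: expected ${rule.type}`);
      continue;
    }
    if (rule.enum && !rule.enum.includes(value)) errors.push(`${key}: not in enum`);
    if (rule.maxLength !== undefined && value.length > rule.maxLength) {
      errors.push(`${key}: longer than ${rule.maxLength}`);
    }
  }
  return errors;
}

const ok = validateResult(
  { id: "S1", result_status: "completed", red_result: "expected_fail", findings: "covers login failure path" },
  TEST_WRITER_SCHEMA
);
const bad = validateResult({ id: "S2", result_status: "done" }, TEST_WRITER_SCHEMA);

console.log(ok);  // []
console.log(bad); // [ 'missing required field: findings', 'result_status: not in enum' ]
```

Collecting every violation instead of stopping at the first makes the resulting `error` text more useful when a row has to be marked `failed`.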
 **Test Writer Agent Instruction** (injected into spawn_agents_on_csv):
 ```
 You are a test writer. Write ONE test file for the given scenario.
@@ -154,16 +176,33 @@ You are a test writer. Write ONE test file for the given scenario.
 - Run test file once after writing

 ## RED-GREEN Rules
-- Test PASSES immediately:
-- Test FAILS as expected (tests real behavior):
-- Test FAILS unexpectedly (setup/import error): fix test setup,
+- Test PASSES immediately: red_result="pass" — may need strengthening
+- Test FAILS as expected (tests real behavior): red_result="expected_fail" — good
+- Test FAILS unexpectedly (setup/import error): fix test setup, red_result="unexpected_fail"
 - NEVER modify source code — only write/fix test files

-##
-
--
--
--
+## Termination Contract (MANDATORY)
+You MUST call report_agent_job_result EXACTLY ONCE before exiting.
+- Success → result_status=completed (test file written + run executed; red_result populated)
+- Failure → result_status=failed (cannot write test file, parse error, missing target)
+- Blocked → result_status=blocked (test framework unavailable)
+- Timeout → near max_runtime_seconds → result_status=failed with error="timeout"
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+
+## Output (must match output_schema)
+{
+  "id": "<row id>",
+  "result_status": "completed" | "failed" | "blocked",
+  "red_result": "expected_fail" | "pass" | "unexpected_fail" | "",
+  "findings": "<patterns discovered, notes for dependent scenarios, max 500 chars>",
+  "files_modified": "<test file path>",
+  "error": "<message if not completed, else empty>"
+}
+
+## Hard Constraints
+- Do NOT modify source code under test (only test files).
+- Do NOT write to scenarios.csv, layer-L*.csv, results.csv (orchestrator owns those).
+- Do NOT call spawn_agents_on_csv (no recursion).

 ## Context
 - prev_context: {prev_context} (findings from prior layer)
@@ -181,7 +220,7 @@ OUTER LOOP (max_iter iterations):
 3. Diagnosis agent (see instruction below). test_defect -> provide fix. code_defect -> document evidence.
 4. Apply test_defect fixes, re-run layer

-**Diagnosis Agent Instruction** (injected into spawn_agents_on_csv):
+**Diagnosis Agent Instruction** (injected into spawn_agents_on_csv; uses same output_schema as Test Writer):
 ```
 You are a test failure diagnostician. Classify ONE test failure.

@@ -193,16 +232,31 @@ You are a test failure diagnostician. Classify ONE test failure.
 - code_defect: Source violates business rule (actual != expected requirement)
 - env_issue: Environment problem (service down, config missing, timeout)

-##
-
--
--
--
-
-
+## Termination Contract (MANDATORY)
+You MUST call report_agent_job_result EXACTLY ONCE before exiting.
+- Success → result_status=completed with concrete classification
+- Failure → result_status=failed if you cannot read test_file or target_file
+- Blocked → result_status=blocked when env_issue prevents diagnosis
+- Timeout → near max_runtime_seconds → result_status=failed with error="timeout"
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+
+## Output (must match output_schema)
+{
+  "id": "<row id>",
+  "result_status": "completed" | "failed" | "blocked",
+  "classification": "test_defect" | "code_defect" | "env_issue",
+  "fix_code": "<old_line → new_line or full replacement (test_defect only); empty otherwise>",
+  "evidence": "<file:line refs supporting classification>",
+  "findings": "<one-sentence diagnosis summary, max 500 chars>",
+  "error": "<message if not completed>"
+}
+
+## Hard Constraints
 - NEVER suggest source code changes — only test fixes for test_defect
 - Test correctly catching a real bug = code_defect, not test_defect
 - When uncertain: prefer code_defect (conservative)
+- Do NOT write to scenarios.csv, layer-L*.csv, results.csv (orchestrator owns those).
+- Do NOT call spawn_agents_on_csv (no recursion).
 ```
 5. If no test_defects remain: break inner
 REFLECT: analyze trends, log strategy, test confidence scoring (5 dims: scenario_coverage, test_quality, diagnostic_accuracy, strategy_effectiveness, infrastructure_fitness)
@@ -91,12 +91,12 @@ When `--yes` or `-y`: Auto-confirm hypothesis selection, skip interactive sympto
 ### tasks.csv (Master State)

 ```csv
-id,title,description,hypothesis,deps,context_from,wave
-"H1","Null pointer in login handler","Investigate whether login handler crashes due to null user object after failed DB lookup","User object is null when DB returns empty result; login.ts:42 dereferences without null check","","","1"
-"H2","Missing error boundary","Investigate whether unhandled promise rejection in auth middleware propagates to 500","Auth middleware catches DB errors but not validation errors; middleware.ts:78 has no catch block","","","1"
-"H3","Stale session token","Investigate whether expired session tokens bypass refresh logic","Session refresh only triggers on 403 but server returns 401 for expired tokens; session.ts:15","","","1"
-"FIX-H1","Fix null pointer in login","Apply null check before user object dereference in login handler","","H1","H1","2"
-"FIX-H3","Fix session token refresh","Update refresh trigger to also handle 401 status codes","","H3","H3","2"
+id,title,description,hypothesis,deps,context_from,wave,status,findings,evidence_for,evidence_against,fix_applied,verified,error
+"H1","Null pointer in login handler","Investigate whether login handler crashes due to null user object after failed DB lookup","User object is null when DB returns empty result; login.ts:42 dereferences without null check","","","1","pending","","","","","",""
+"H2","Missing error boundary","Investigate whether unhandled promise rejection in auth middleware propagates to 500","Auth middleware catches DB errors but not validation errors; middleware.ts:78 has no catch block","","","1","pending","","","","","",""
+"H3","Stale session token","Investigate whether expired session tokens bypass refresh logic","Session refresh only triggers on 403 but server returns 401 for expired tokens; session.ts:15","","","1","pending","","","","","",""
+"FIX-H1","Fix null pointer in login","Apply null check before user object dereference in login handler","","H1","H1","2","pending","","","","","",""
+"FIX-H3","Fix session token refresh","Update refresh trigger to also handle 401 status codes","","H3","H3","2","pending","","","","","",""
 ```

 **Columns**:
@@ -110,15 +110,17 @@ id,title,description,hypothesis,deps,context_from,wave
 | `deps` | Input | Semicolon-separated dependency task IDs (wave 2 depends on wave 1) |
 | `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
 | `wave` | Input | Wave number (1 = investigation, 2 = fix attempt) |
-| `
-| `findings` |
-| `evidence_for` |
-| `evidence_against` |
-| `fix_applied` |
-| `verified` |
-| `error` |
+| `status` | Lifecycle | `pending` (initial) → `confirmed`/`refuted`/`inconclusive`/`fixed`/`fix_failed`/`failed`/`skipped` (set by merge step from worker's `result_status`) |
+| `findings` | Lifecycle | Key findings summary (max 500 chars; merged from worker output) |
+| `evidence_for` | Lifecycle | Evidence supporting the hypothesis (wave 1; merged) |
+| `evidence_against` | Lifecycle | Evidence refuting the hypothesis (wave 1; merged) |
+| `fix_applied` | Lifecycle | Description of fix applied (wave 2 only; merged) |
+| `verified` | Lifecycle | `true` / `false` — whether fix was verified to work (wave 2 only; merged) |
+| `error` | Lifecycle | Error message if failed (merged) |

-**Column separation rule**: Input columns
+**Column separation rule**: Wave CSV (input to `spawn_agents_on_csv`) contains Input columns + `prev_context` only. Lifecycle columns are NEVER passed to workers. Workers return Output columns exclusively via `output_schema` — those output column names MUST NOT collide with Input column names. During merge: `result_status` → master `status`; other output columns copied as-is into matching lifecycle columns.
+
+**Initial state**: All rows are written with `status="pending"` and empty lifecycle columns. Each wave selects rows where `wave == N AND status == "pending"` from the master CSV.
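The column separation rule and the per-wave selection can be sketched on in-memory rows as follows. This is only an illustration under stated assumptions: rows are plain objects keyed by column name, CSV parsing and writing are left out, and `extractWaveRows` is a hypothetical helper name rather than anything shipped in the package.

```javascript
// Input columns as listed in the table above; everything else is lifecycle state.
const INPUT_COLUMNS = ["id", "title", "description", "hypothesis", "deps", "context_from", "wave"];

// Hypothetical helper: pick the pending rows of one wave and drop lifecycle columns.
// prevContextById is optional; wave 1 has no upstream, so it is simply omitted there.
function extractWaveRows(masterRows, wave, prevContextById = {}) {
  return masterRows
    .filter((row) => Number(row.wave) === wave && row.status === "pending")
    .map((row) => {
      const out = {};
      for (const col of INPUT_COLUMNS) out[col] = row[col] ?? "";
      // prev_context is the only non-input column a worker is allowed to see.
      if (row.id in prevContextById) out.prev_context = prevContextById[row.id];
      return out;
    });
}

const master = [
  { id: "H1", title: "Null pointer in login handler", description: "", hypothesis: "", deps: "", context_from: "", wave: "1", status: "pending", findings: "" },
  { id: "H2", title: "Missing error boundary", description: "", hypothesis: "", deps: "", context_from: "", wave: "1", status: "refuted", findings: "already handled" },
  { id: "FIX-H1", title: "Fix null pointer in login", description: "", hypothesis: "", deps: "H1", context_from: "H1", wave: "2", status: "pending", findings: "" },
];

const wave1 = extractWaveRows(master, 1);
console.log(wave1.map((r) => r.id)); // [ 'H1' ]
console.log(Object.keys(wave1[0]));  // the seven input columns, no status/findings
```

Building the wave rows from an explicit allow-list, rather than deleting known lifecycle columns, keeps any lifecycle column added later from leaking to workers by accident.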

 ### Per-Wave CSV (Temporary)

@@ -234,41 +236,150 @@ mkdir -p {sessionFolder}

 #### Wave 1: Hypothesis Investigation (Parallel)

-1. Extract wave
-2. Execute
+1. **Extract wave-1 input**: filter master `tasks.csv` rows where `wave == 1 AND status == "pending"` → write `wave-1.csv` containing ONLY input columns (id, title, description, hypothesis, deps, context_from, wave). No lifecycle columns, no prev_context (wave 1 has no upstream).
+2. **Execute**:

 ```javascript
 spawn_agents_on_csv({
   csv_path: `${sessionFolder}/wave-1.csv`,
   id_column: "id",
-  instruction:
-  max_concurrency: maxConcurrency,
+  instruction: WAVE1_INVESTIGATION_INSTRUCTION, // see "Wave 1 Worker Contract" below
+  max_concurrency: maxConcurrency,
+  max_runtime_seconds: 3600,
   output_csv_path: `${sessionFolder}/wave-1-results.csv`,
-  output_schema: {
+  output_schema: {
+    type: "object",
+    properties: {
+      id: { type: "string" },
+      result_status: { type: "string", enum: ["confirmed", "refuted", "inconclusive", "failed"] },
+      findings: { type: "string", maxLength: 500 },
+      evidence_for: { type: "string" },
+      evidence_against: { type: "string" },
+      error: { type: "string" }
+    },
+    required: ["id", "result_status", "findings"]
+  }
 })
 ```

-3. Merge `wave-1-results.csv
-4. **
+3. **Merge**: for each row in `wave-1-results.csv`, look up master row by `id` and write `master.status = result_status`, then copy `findings`, `evidence_for`, `evidence_against`, `error`. Delete `wave-1.csv` and `wave-1-results.csv`.
+4. **Wave 2 gating** (read from MASTER `tasks.csv` after merge, NOT from wave-1-results.csv):
+   - For each `FIX-H{N}` row: read its `context_from` hypothesis ID (e.g., `H{N}`) from master; if master `H{N}.status != "confirmed"`, set `FIX-H{N}.status = "skipped"` (with findings = "upstream {H{N}.status}").
+   - Only rows where `status == "pending"` proceed to wave 2.
+
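The wave-2 gate described in step 4 can be sketched on in-memory master rows, assuming the wave-1 merge has already written each hypothesis row's `status`. `gateWave2` is a hypothetical helper name, and two details are assumptions of this sketch rather than statements from the spec: each `FIX-H{N}` row names a single upstream id in `context_from`, and an upstream id that cannot be found is treated as not confirmed.

```javascript
// Hypothetical helper: mark wave-2 rows whose upstream hypothesis is not confirmed
// as "skipped", then return the rows that may proceed to wave 2.
function gateWave2(masterRows) {
  const byId = new Map(masterRows.map((row) => [row.id, row]));
  for (const row of masterRows) {
    if (Number(row.wave) !== 2 || row.status !== "pending") continue;
    const upstream = byId.get(row.context_from);
    // Assumption: an unknown upstream id counts as not confirmed.
    const upstreamStatus = upstream ? upstream.status : "missing";
    if (upstreamStatus !== "confirmed") {
      row.status = "skipped";
      row.findings = `upstream ${upstreamStatus}`;
    }
  }
  return masterRows.filter((row) => Number(row.wave) === 2 && row.status === "pending");
}

// Master state after the wave-1 merge (only the relevant columns are shown).
const mergedMaster = [
  { id: "H1", wave: "1", context_from: "", status: "confirmed", findings: "null user after empty lookup" },
  { id: "H3", wave: "1", context_from: "", status: "refuted", findings: "refresh already handles 401" },
  { id: "FIX-H1", wave: "2", context_from: "H1", status: "pending", findings: "" },
  { id: "FIX-H3", wave: "2", context_from: "H3", status: "pending", findings: "" },
];

const runnable = gateWave2(mergedMaster);
console.log(runnable.map((r) => r.id)); // [ 'FIX-H1' ]
console.log(mergedMaster[3].status, "/", mergedMaster[3].findings); // skipped / upstream refuted
```

Reading the upstream status from the master rows, not from the temporary results file, matches the note in step 4 and keeps the gate correct even after the wave files are deleted.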
+#### Wave 1 Worker Contract (WAVE1_INVESTIGATION_INSTRUCTION)
+
+The literal `instruction` string passed to `spawn_agents_on_csv` MUST include the following contract (substitute `{sessionFolder}` at build time):
+
+```
+You are a hypothesis investigation worker. ONE hypothesis row from wave-1.csv is assigned to you.
+
+INPUT (from your CSV row):
+- id, title, hypothesis, description
+
+REQUIRED STEPS:
+1. Read shared discoveries: {sessionFolder}/discoveries.ndjson (may be empty)
+2. Scan codebase for evidence using Read/Grep/Glob (read-only investigation)
+3. Classify the hypothesis based on evidence collected:
+   - confirmed → strong evidence supports the hypothesis (file:line proof)
+   - refuted → strong evidence contradicts the hypothesis
+   - inconclusive → insufficient evidence within time budget; do NOT guess
+   - failed → tool error / cannot read files / blocked by environment
+4. Append discoveries to {sessionFolder}/discoveries.ndjson if reusable (root_cause / hypothesis_evidence types)
+5. Call report_agent_job_result EXACTLY ONCE with the verdict
+
+TERMINATION CONTRACT (mandatory — NO worker may end without calling report_agent_job_result):
+- Success path → result_status = confirmed | refuted, with evidence
+- Timeout path → if approaching {max_runtime_seconds}, STOP investigation and report inconclusive
+- Failure path → on any unrecoverable error, report failed with error message
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+
+OUTPUT (return via report_agent_job_result; must match output_schema):
+{
+  "id": "<your row id>",
+  "result_status": "confirmed" | "refuted" | "inconclusive" | "failed",
+  "findings": "<one-sentence summary, max 500 chars>",
+  "evidence_for": "<bullet list of file:line refs supporting, or empty>",
+  "evidence_against": "<bullet list of file:line refs refuting, or empty>",
+  "error": "<message if failed, else empty>"
+}
+
+CONSTRAINTS:
+- Do NOT modify source code. This is investigation only.
+- Do NOT write to tasks.csv, wave-*.csv, or results.csv (orchestrator owns those).
+- Do NOT call spawn_agents_on_csv (no recursion).
+```

 #### Wave 2: Fix Attempts (Parallel, Confirmed Only)

-1. If no
-2. Extract wave 2 pending
-3.
+1. If no master rows have `wave == 2 AND status == "pending"` after gating, skip wave 2 entirely.
+2. **Extract wave-2 input**: filter master `tasks.csv` where `wave == 2 AND status == "pending"`. For each row, build `prev_context` by concatenating findings/evidence_for from each ID in `context_from` (read from master). Write `wave-2.csv` with input columns + `prev_context`.
+3. **Execute**:

 ```javascript
 spawn_agents_on_csv({
   csv_path: `${sessionFolder}/wave-2.csv`,
   id_column: "id",
-  instruction:
-  max_concurrency: maxConcurrency,
+  instruction: WAVE2_FIX_INSTRUCTION, // see "Wave 2 Worker Contract" below
+  max_concurrency: maxConcurrency,
+  max_runtime_seconds: 3600,
   output_csv_path: `${sessionFolder}/wave-2-results.csv`,
-  output_schema: {
+  output_schema: {
+    type: "object",
+    properties: {
+      id: { type: "string" },
+      result_status: { type: "string", enum: ["fixed", "fix_failed", "failed"] },
+      findings: { type: "string", maxLength: 500 },
+      fix_applied: { type: "string" },
+      verified: { type: "string", enum: ["true", "false"] },
+      error: { type: "string" }
+    },
+    required: ["id", "result_status", "findings", "verified"]
+  }
 })
 ```

-4. Merge `
+4. **Merge**: write `master.status = result_status`, copy `findings`, `fix_applied`, `verified`, `error`. Delete `wave-2.csv` and `wave-2-results.csv`.
+
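Step 2 of the wave-2 procedure (building `prev_context` from the ids in `context_from`) can be sketched like this. The spec only says the upstream `findings` and `evidence_for` are concatenated; the bracketed-id layout, the newline separator, the silent skipping of unknown ids, and the helper name `buildPrevContext` are all assumptions made for illustration.

```javascript
// Hypothetical helper: concatenate upstream findings/evidence_for for one wave-2 row.
function buildPrevContext(row, masterRows) {
  const byId = new Map(masterRows.map((r) => [r.id, r]));
  return row.context_from
    .split(";") // context_from is semicolon-separated per the column table
    .map((id) => id.trim())
    .filter((id) => id.length > 0 && byId.has(id)) // assumption: unknown ids are skipped
    .map((id) => {
      const upstream = byId.get(id);
      // Assumption: this exact text layout is illustrative, not mandated by the spec.
      return `[${id}] findings: ${upstream.findings} | evidence_for: ${upstream.evidence_for}`;
    })
    .join("\n");
}

const masterAfterWave1 = [
  { id: "H1", status: "confirmed", findings: "user is null after empty lookup", evidence_for: "login.ts:42", context_from: "" },
  { id: "FIX-H1", status: "pending", findings: "", evidence_for: "", context_from: "H1" },
];

const ctx = buildPrevContext(masterAfterWave1[1], masterAfterWave1);
console.log(ctx); // [H1] findings: user is null after empty lookup | evidence_for: login.ts:42
```

Because the text is built from the master rows, the fix worker only ever sees merged, orchestrator-owned data and never another worker's raw output file.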
+#### Wave 2 Worker Contract (WAVE2_FIX_INSTRUCTION)
+
+```
+You are a fix worker. ONE confirmed hypothesis row is assigned to you.
+
+INPUT (from your CSV row):
+- id (FIX-H{N}), title, description, prev_context (confirmed evidence from H{N})
+
+REQUIRED STEPS:
+1. Read prev_context — the confirmed root cause evidence
+2. Apply the minimal fix using Edit / Write
+3. Run verification:
+   - If project has tests: run the relevant test suite via Bash
+   - If no tests: re-read the modified file and confirm the fix matches the planned change
+4. Append discoveries (type=fix_applied) to {sessionFolder}/discoveries.ndjson if reusable
+5. Call report_agent_job_result EXACTLY ONCE
+
+TERMINATION CONTRACT (mandatory):
+- Success path → fix applied AND verified → result_status=fixed, verified="true"
+- Partial path → fix applied but verification failed → result_status=fix_failed, verified="false"
+- Timeout path → approaching {max_runtime_seconds} with no fix applied → result_status=fix_failed with error="timeout"
+- Failure path → cannot apply fix (file missing, parse error, etc.) → result_status=failed
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+
+OUTPUT (return via report_agent_job_result; must match output_schema):
+{
+  "id": "<your row id>",
+  "result_status": "fixed" | "fix_failed" | "failed",
+  "findings": "<one-sentence summary of what was changed, max 500 chars>",
+  "fix_applied": "<file:line description of the change>",
+  "verified": "true" | "false",
+  "error": "<message if failed, else empty>"
+}
+
+CONSTRAINTS:
+- Modify ONLY files implicated by prev_context evidence. No drive-by refactors.
+- Do NOT write to tasks.csv, wave-*.csv, or results.csv.
+- Do NOT call spawn_agents_on_csv (no recursion).
+```

 ### Phase 3: Results Aggregation

@@ -169,24 +169,74 @@ For each wave N in ascending order:
|
|
|
169
169
|
|
|
170
170
|
```javascript
|
|
171
171
|
spawn_agents_on_csv({
|
|
172
|
-
csv_path: `${sessionFolder}/wave-${N}.csv`,
|
|
172
|
+
csv_path: `${sessionFolder}/wave-${N}.csv`, // only rows where wave==N AND status=="pending"
|
|
173
173
|
id_column: "id",
|
|
174
|
-
instruction:
|
|
175
|
-
1
|
|
176
|
-
|
|
177
|
-
3. Verify convergence_criteria via grep (all criteria must pass)
|
|
178
|
-
4. Run verification_cmd and report test result
|
|
179
|
-
5. If tests fail: revert ALL changes for this task, set result_status=failed
|
|
180
|
-
6. Append discoveries to ${sessionFolder}/discoveries.ndjson
|
|
181
|
-
Report: files_modified (semicolon-separated), tests_passed (true/false), findings (what was changed and why)`,
|
|
182
|
-
max_concurrency: 1, max_runtime_seconds: 1800,
|
|
174
|
+
instruction: REFACTOR_INSTRUCTION, // see "Refactor Worker Contract" below
|
|
175
|
+
max_concurrency: 1,
|
|
176
|
+
max_runtime_seconds: 1800,
|
|
183
177
|
output_csv_path: `${sessionFolder}/wave-${N}-results.csv`,
|
|
184
|
-
output_schema: {
|
|
178
|
+
output_schema: {
|
|
179
|
+
type: "object",
|
|
180
|
+
properties: {
|
|
181
|
+
id: { type: "string" },
|
|
182
|
+
result_status: { type: "string", enum: ["completed", "failed", "blocked"] },
|
|
183
|
+
findings: { type: "string", maxLength: 500 },
|
|
184
|
+
files_modified: { type: "string", description: "Semicolon-separated paths (empty if reverted)" },
|
|
185
|
+
tests_passed: { type: "string", enum: ["true", "false"] },
|
|
186
|
+
error: { type: "string" }
|
|
187
|
+
},
|
|
188
|
+
required: ["id", "result_status", "findings", "tests_passed"]
|
|
189
|
+
}
|
|
185
190
|
})
|
|
186
191
|
```
|
|
187
192
|
|
|
188
193
|
4. Merge results into master `tasks.csv`: map `result_status` -> master `status` column, copy `findings`, `files_modified`, `tests_passed`, `error` into master. Delete temporary `wave-{N}.csv` and `wave-{N}-results.csv`.
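The merge in step 4 can be sketched as follows. This is a minimal illustration, assuming the CSV rows have already been parsed into plain objects; `mergeWaveResults` and `COPIED_COLUMNS` are hypothetical names, not part of the package.

```javascript
// Hypothetical helper: rows are plain objects already parsed from CSV.
// Columns copied verbatim from the wave results into the master row.
const COPIED_COLUMNS = ["findings", "files_modified", "tests_passed", "error"];

function mergeWaveResults(masterRows, resultRows) {
  const resultsById = new Map(resultRows.map((row) => [row.id, row]));
  return masterRows.map((task) => {
    const result = resultsById.get(task.id);
    if (!result) return task; // task was not part of this wave; leave untouched
    // result_status maps onto the master status column.
    const merged = { ...task, status: result.result_status };
    for (const column of COPIED_COLUMNS) {
      merged[column] = result[column] ?? ""; // optional fields default to empty
    }
    return merged;
  });
}

const master = [
  { id: "1", wave: "1", status: "pending" },
  { id: "2", wave: "2", status: "pending" },
];
const waveResults = [
  {
    id: "1",
    result_status: "completed",
    findings: "Extracted helper",
    files_modified: "src/a.js",
    tests_passed: "true",
  },
];
const mergedTasks = mergeWaveResults(master, waveResults);
console.log(mergedTasks[0].status); // completed
```

The sketch returns new row objects instead of mutating the master rows, so a failed write of `tasks.csv` leaves the in-memory master unchanged.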
|
|
189
194
|
|
|
195
|
+
#### Refactor Worker Contract (REFACTOR_INSTRUCTION)
|
|
196
|
+
|
|
197
|
+
```
|
|
198
|
+
You are a refactoring executor. ONE task row is assigned to you.
|
|
199
|
+
|
|
200
|
+
INPUT (from your CSV row):
|
|
201
|
+
- id, title, description (refactoring plan)
|
|
202
|
+
- read_first (semicolon-separated paths to read for context)
|
|
203
|
+
- scope (files in refactor scope)
|
|
204
|
+
- convergence_criteria (grep patterns that must pass after refactor)
|
|
205
|
+
- verification_cmd (test command to run)
|
|
206
|
+
- prev_context (findings from upstream tasks)
|
|
207
|
+
|
|
208
|
+
REQUIRED STEPS:
|
|
209
|
+
1. Read all files in read_first to understand context
|
|
210
|
+
2. Apply refactoring per description, modifying only files in scope
|
|
211
|
+
3. Verify EVERY convergence_criterion via grep (ALL must pass; ANY miss → failure)
|
|
212
|
+
4. Run verification_cmd via Bash; capture pass/fail
|
|
213
|
+
5. If tests fail OR convergence fails → revert ALL changes for this task using git (or Edit reverse), set files_modified=""
|
|
214
|
+
6. Append discoveries (type=implementation_note / pattern) to {sessionFolder}/discoveries.ndjson
|
|
215
|
+
7. Call report_agent_job_result EXACTLY ONCE
|
|
216
|
+
|
|
217
|
+
TERMINATION CONTRACT (mandatory — NO worker may end without calling report_agent_job_result):
|
|
218
|
+
- Success path → tests pass AND convergence passes → result_status=completed, tests_passed="true"
|
|
219
|
+
- Failed path → tests fail OR convergence fails → REVERT, result_status=failed, tests_passed="false"
|
|
220
|
+
- Blocked path → cannot apply (file missing, parse error, unclear scope) → result_status=blocked
|
|
221
|
+
- Timeout path → approaching max_runtime_seconds → REVERT partial changes, result_status=failed with error="timeout"
|
|
222
|
+
- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
|
|
223
|
+
|
|
224
|
+
OUTPUT (return via report_agent_job_result; must match output_schema):
|
|
225
|
+
{
|
|
226
|
+
"id": "<your row id>",
|
|
227
|
+
"result_status": "completed" | "failed" | "blocked",
|
|
228
|
+
"findings": "<what was changed and why, max 500 chars>",
|
|
229
|
+
"files_modified": "<semicolon-separated paths or empty if reverted>",
|
|
230
|
+
"tests_passed": "true" | "false",
|
|
231
|
+
"error": "<message if not completed, else empty>"
|
|
232
|
+
}
|
|
233
|
+
|
|
234
|
+
CONSTRAINTS:
|
|
235
|
+
- Modify ONLY files in scope. Never drive-by edit unrelated files.
|
|
236
|
+
- Do NOT write to tasks.csv, wave-*.csv, results.csv, reflection-log.md (orchestrator owns those).
|
|
237
|
+
- Do NOT call spawn_agents_on_csv (no recursion).
|
|
238
|
+
```
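An orchestrator could check a reported row against the contract above before merging it. The following is a sketch under stated assumptions: `checkRefactorResult` is a hypothetical helper, and the rules it enforces are one reading of the termination contract, not logic confirmed to exist in the package.

```javascript
const REFACTOR_STATUSES = ["completed", "failed", "blocked"];

// Returns a list of contract violations; an empty list means the row looks consistent.
function checkRefactorResult(result) {
  const problems = [];
  if (!REFACTOR_STATUSES.includes(result.result_status)) {
    problems.push(`unknown result_status: ${result.result_status}`);
  }
  if ((result.findings ?? "").length > 500) {
    problems.push("findings exceeds 500 characters");
  }
  // Success path: tests pass AND convergence passes.
  if (result.result_status === "completed" && result.tests_passed !== "true") {
    problems.push('completed requires tests_passed="true"');
  }
  // Failed path (including timeout): changes must have been reverted.
  if (result.result_status === "failed" && (result.files_modified ?? "") !== "") {
    problems.push("failed tasks must be reverted (files_modified must be empty)");
  }
  // Assumption: any non-completed result should explain itself in error.
  if (result.result_status !== "completed" && !(result.error ?? "")) {
    problems.push("non-completed results should carry an error message");
  }
  return problems;
}

const timedOut = {
  id: "7",
  result_status: "failed",
  findings: "Reverted partial rename",
  files_modified: "",
  tests_passed: "false",
  error: "timeout",
};
console.log(checkRefactorResult(timedOut).length); // 0
```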
|
|
239
|
+
|
|
190
240
|
**5b. Reflect per wave:**
|
|
191
241
|
|
|
192
242
|
Append to `reflection-log.md`:
|
|
@@ -216,33 +216,69 @@ Filter master `tasks.csv` for `wave == 1 AND status == pending` → write `wave-
|
|
|
216
216
|
spawn_agents_on_csv({
|
|
217
217
|
csv_path: `${sessionFolder}/wave-1.csv`,
|
|
218
218
|
id_column: "id",
|
|
219
|
-
instruction:
|
|
219
|
+
instruction: REVIEW_DIMENSION_INSTRUCTION, // see "Dimension Worker Contract" below
|
|
220
220
|
max_concurrency: maxConcurrency,
|
|
221
221
|
max_runtime_seconds: 3600,
|
|
222
222
|
output_csv_path: `${sessionFolder}/wave-1-results.csv`,
|
|
223
223
|
output_schema: {
|
|
224
224
|
type: "object",
|
|
225
225
|
properties: {
|
|
226
|
-
id:
|
|
227
|
-
result_status:
|
|
228
|
-
findings:
|
|
229
|
-
severity_counts: { type: "string" },
|
|
230
|
-
top_issues:
|
|
231
|
-
error:
|
|
226
|
+
id: { type: "string" },
|
|
227
|
+
result_status: { type: "string", enum: ["completed", "failed"] },
|
|
228
|
+
findings: { type: "string", maxLength: 500 },
|
|
229
|
+
severity_counts: { type: "string", description: "JSON object string {critical, high, medium, low}" },
|
|
230
|
+
top_issues: { type: "string", description: "JSON array string of top issues with file:line" },
|
|
231
|
+
error: { type: "string" }
|
|
232
232
|
},
|
|
233
233
|
required: ["id", "result_status", "findings"]
|
|
234
234
|
}
|
|
235
235
|
})
|
|
236
236
|
```
|
|
237
237
|
|
|
238
|
-
Merge `wave-1-results.csv` into master `tasks.csv` (map `result_status` → master `status` column), then delete both `wave-1.csv` and `wave-1-results.csv`.
|
|
238
|
+
Merge `wave-1-results.csv` into master `tasks.csv` (map `result_status` → master `status` column; copy `findings`, `severity_counts`, `top_issues`, `error`), then delete both `wave-1.csv` and `wave-1-results.csv`.
|
|
239
|
+
|
|
240
|
+
#### Dimension Worker Contract (REVIEW_DIMENSION_INSTRUCTION)
|
|
241
|
+
|
|
242
|
+
```
|
|
243
|
+
You are a code reviewer for ONE dimension (correctness/security/performance/maintainability/...). Your dimension, scope, and standards come from your CSV row.
|
|
244
|
+
|
|
245
|
+
REQUIRED STEPS:
|
|
246
|
+
1. Read shared discoveries: {sessionFolder}/discoveries.ndjson
|
|
247
|
+
2. Read specs loaded by orchestrator (review category) for severity calibration
|
|
248
|
+
3. Scan code in scope using Read/Grep/Glob (read-only)
|
|
249
|
+
4. Classify each issue: critical / high / medium / low with file:line refs
|
|
250
|
+
5. Append cross-cutting patterns to discoveries.ndjson
|
|
251
|
+
6. Call report_agent_job_result EXACTLY ONCE
|
|
252
|
+
|
|
253
|
+
TERMINATION CONTRACT (mandatory — NO worker may end without calling report_agent_job_result):
|
|
254
|
+
- Success → result_status=completed (severity_counts may be all-zero if clean)
|
|
255
|
+
- Timeout → near max_runtime_seconds, STOP and report completed with partial findings
|
|
256
|
+
- Failure → unrecoverable read/parse error → result_status=failed
|
|
257
|
+
- NEVER skip report_agent_job_result.
|
|
258
|
+
|
|
259
|
+
OUTPUT (must match output_schema):
|
|
260
|
+
{
|
|
261
|
+
"id": "<your row id>",
|
|
262
|
+
"result_status": "completed" | "failed",
|
|
263
|
+
"findings": "<one-sentence dimension summary, max 500 chars>",
|
|
264
|
+
"severity_counts": "<JSON object string: {critical:N, high:N, medium:N, low:N}>",
|
|
265
|
+
"top_issues": "<JSON array string: [{title, severity, location, recommendation}...]>",
|
|
266
|
+
"error": "<message if failed, else empty>"
|
|
267
|
+
}
|
|
268
|
+
|
|
269
|
+
CONSTRAINTS:
|
|
270
|
+
- Every issue MUST have a concrete file:line reference. No speculation.
|
|
271
|
+
- Do NOT modify source. This is review only.
|
|
272
|
+
- Do NOT write to tasks.csv, wave-*.csv, results.csv, review.json (orchestrator owns those).
|
|
273
|
+
- Do NOT call spawn_agents_on_csv (no recursion).
|
|
274
|
+
```
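Because `severity_counts` travels as a JSON object string, the orchestrator has to parse it before totalling across dimensions. A minimal sketch, assuming rows are already parsed from the results CSV; `aggregateSeverityCounts` is a hypothetical name and the skip-on-malformed policy is an assumption.

```javascript
const SEVERITIES = ["critical", "high", "medium", "low"];

// Sums severity_counts over all completed dimension rows.
function aggregateSeverityCounts(rows) {
  const totals = { critical: 0, high: 0, medium: 0, low: 0 };
  for (const row of rows) {
    if (row.result_status !== "completed" || !row.severity_counts) continue;
    let counts;
    try {
      counts = JSON.parse(row.severity_counts);
    } catch {
      continue; // malformed JSON string: skip this row rather than abort the merge
    }
    if (counts === null || typeof counts !== "object") continue;
    for (const severity of SEVERITIES) {
      totals[severity] += Number(counts[severity]) || 0; // missing keys count as zero
    }
  }
  return totals;
}

const dimensionRows = [
  { id: "1", result_status: "completed", severity_counts: '{"critical":1,"high":0,"medium":2,"low":0}' },
  { id: "2", result_status: "completed", severity_counts: '{"critical":0,"high":3,"medium":0,"low":1}' },
  { id: "3", result_status: "failed", severity_counts: "" },
  { id: "4", result_status: "completed", severity_counts: "not json" },
];
const severityTotals = aggregateSeverityCounts(dimensionRows);
```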
|
|
239
275
|
|
|
240
276
|
#### Wave 2: Aggregation + Deep-Dive
|
|
241
277
|
|
|
242
278
|
Filter master `tasks.csv` for `wave == 2 AND status == pending`. If all wave 1 tasks failed, skip aggregation.
|
|
243
279
|
|
|
244
280
|
Build `prev_context` from wave 1 findings (format: `[Task N: Title] summary...` per task).
|
|
245
|
-
Write `wave-2.csv` with `prev_context` column → execute `spawn_agents_on_csv` → merge results into master `tasks.csv` (map `result_status` → master `status` column) → delete both `wave-2.csv` and `wave-2-results.csv`.
|
|
281
|
+
Write `wave-2.csv` with `prev_context` column → execute `spawn_agents_on_csv` with `REVIEW_AGGREGATION_INSTRUCTION` (same termination contract; output_schema returns `result_status` enum [completed|failed], findings, plus `verdict` enum [PASS|WARN|BLOCK]) → merge results into master `tasks.csv` (map `result_status` → master `status` column) → delete both `wave-2.csv` and `wave-2-results.csv`.
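The `prev_context` format described above can be sketched like this. It assumes that `N` is the task `id`, that the summary is the merged `findings` value, and that only completed wave 1 tasks contribute; `buildPrevContext` is a hypothetical helper name.

```javascript
// Builds one "[Task N: Title] summary" line per completed wave 1 task.
function buildPrevContext(waveOneRows) {
  return waveOneRows
    .filter((row) => row.status === "completed" && row.findings)
    .map((row) => `[Task ${row.id}: ${row.title}] ${row.findings}`)
    .join("\n");
}

const waveOne = [
  { id: "1", title: "Correctness", status: "completed", findings: "2 high issues in parser" },
  { id: "2", title: "Security", status: "failed", findings: "" },
  { id: "3", title: "Performance", status: "completed", findings: "No hot paths found" },
];
const prevContext = buildPrevContext(waveOne);
```

An empty string from this helper corresponds to the case where every wave 1 task failed, which is the condition for skipping aggregation.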
|
|
246
282
|
|
|
247
283
|
### Phase 3: Results Aggregation
|
|
248
284
|
|
|
@@ -183,10 +183,65 @@ On issue: auto-create in `.workflow/issues/issues.jsonl`:
|
|
|
183
183
|
### A_DIAGNOSE_GAPS
|
|
184
184
|
|
|
185
185
|
1. Cluster gaps by component/module/feature
|
|
186
|
-
2. Build diagnosis.csv
|
|
187
|
-
3. `spawn_agents_on_csv` for parallel diagnosis
|
|
186
|
+
2. Build diagnosis.csv with `status="pending"` per row, target_files, source_context
|
|
187
|
+
3. Filter `status=="pending"` -> write diagnosis-wave.csv -> `spawn_agents_on_csv` for parallel diagnosis with the contract below
|
|
188
188
|
4. **Diagnosis agent**: Find root cause (not symptom), suggest fix direction, list affected files. Do NOT modify files. Reference issue_id for traceability.
|
|
189
|
-
5. Merge results:
|
|
189
|
+
5. Merge results: map `result_status` → master `status`; copy `root_cause`, `fix_direction`, `affected_files`; update uat.md gaps
|
|
190
|
+
|
|
191
|
+
**output_schema**:
|
|
192
|
+
|
|
193
|
+
```json
|
|
194
|
+
{
|
|
195
|
+
"type": "object",
|
|
196
|
+
"properties": {
|
|
197
|
+
"id": { "type": "string" },
|
|
198
|
+
"result_status": { "type": "string", "enum": ["completed", "failed", "blocked"] },
|
|
199
|
+
"root_cause": { "type": "string", "maxLength": 500 },
|
|
200
|
+
"fix_direction": { "type": "string" },
|
|
201
|
+
"affected_files": { "type": "string", "description": "Semicolon-separated file:line refs" },
|
|
202
|
+
"findings": { "type": "string", "maxLength": 500 },
|
|
203
|
+
"error": { "type": "string" }
|
|
204
|
+
},
|
|
205
|
+
"required": ["id", "result_status", "root_cause", "findings"]
|
|
206
|
+
}
|
|
207
|
+
```
|
|
208
|
+
|
|
209
|
+
**Diagnosis Worker Instruction** (embed in spawn instruction):
|
|
210
|
+
```
|
|
211
|
+
You are a UAT gap diagnostician. ONE gap row is assigned to you.
|
|
212
|
+
|
|
213
|
+
INPUT: id, target_files, source_context, issue_id, gap description
|
|
214
|
+
|
|
215
|
+
REQUIRED STEPS:
|
|
216
|
+
1. Read target_files + source_context (read-only)
|
|
217
|
+
2. Trace symptom backward through call chain to root cause
|
|
218
|
+
3. Identify fix direction (high-level — orchestrator hands off to maestro-plan)
|
|
219
|
+
4. List affected files with file:line refs
|
|
220
|
+
5. Call report_agent_job_result EXACTLY ONCE
|
|
221
|
+
|
|
222
|
+
TERMINATION CONTRACT (mandatory):
|
|
223
|
+
- Success → result_status=completed with root_cause + fix_direction populated
|
|
224
|
+
- Failure → result_status=failed (cannot read files, parse error)
|
|
225
|
+
- Blocked → result_status=blocked (insufficient context to diagnose; orchestrator may re-cluster)
|
|
226
|
+
- Timeout → near max_runtime_seconds → result_status=blocked with error="timeout"
|
|
227
|
+
- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
|
|
228
|
+
|
|
229
|
+
OUTPUT (must match output_schema):
|
|
230
|
+
{
|
|
231
|
+
"id": "<your row id>",
|
|
232
|
+
"result_status": "completed" | "failed" | "blocked",
|
|
233
|
+
"root_cause": "<concrete root cause with file:line, max 500 chars>",
|
|
234
|
+
"fix_direction": "<high-level fix strategy, NOT code>",
|
|
235
|
+
"affected_files": "<semicolon-separated file:line refs>",
|
|
236
|
+
"findings": "<one-sentence summary>",
|
|
237
|
+
"error": "<message if not completed>"
|
|
238
|
+
}
|
|
239
|
+
|
|
240
|
+
CONSTRAINTS:
|
|
241
|
+
- Read-only. Do NOT modify source files.
|
|
242
|
+
- Do NOT write to uat.md, diagnosis.csv, issues.jsonl (orchestrator owns those).
|
|
243
|
+
- Do NOT call spawn_agents_on_csv (no recursion).
|
|
244
|
+
```
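Since `affected_files` is a semicolon-separated string of file:line refs, the orchestrator needs to split it before updating gap records. A minimal sketch; `parseAffectedFiles` is a hypothetical helper, and treating a ref without a numeric suffix as a bare path is an assumption.

```javascript
// Splits "path:line;path:line" into structured refs.
function parseAffectedFiles(affectedFiles) {
  return affectedFiles
    .split(";")
    .map((ref) => ref.trim())
    .filter((ref) => ref.length > 0)
    .map((ref) => {
      // Use the last colon so paths that contain colons still parse.
      const separator = ref.lastIndexOf(":");
      const lineText = separator === -1 ? "" : ref.slice(separator + 1);
      const hasLine = /^\d+$/.test(lineText);
      return {
        file: hasLine ? ref.slice(0, separator) : ref,
        line: hasLine ? Number(lineText) : null,
      };
    });
}

const affected = parseAffectedFiles("src/auth.js:42; src/session.js:7;README.md");
```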
|
|
190
245
|
|
|
191
246
|
### A_GAP_FIX_LOOP
|
|
192
247
|
|
|
@@ -59,6 +59,38 @@ Use `Grep` for pattern matching (e.g., `eval(`, `exec(`, `innerHTML`, `dangerous
|
|
|
59
59
|
For `standard` and `deep` tiers, use `spawn_agents_on_csv` to parallelize OWASP category scans
|
|
60
60
|
across multiple agents, one agent per 2-3 categories.
|
|
61
61
|
|
|
62
|
+
**spawn_agents_on_csv contract** (OWASP scan):
|
|
63
|
+
- CSV columns: `id, title, owasp_categories, scope_glob, deps, wave, status` (initial `status="pending"`); filter `wave==1 AND status=="pending"` before writing wave-1.csv.
|
|
64
|
+
- `output_schema`:
|
|
65
|
+
|
|
66
|
+
```json
|
|
67
|
+
{
|
|
68
|
+
"type": "object",
|
|
69
|
+
"properties": {
|
|
70
|
+
"id": { "type": "string" },
|
|
71
|
+
"result_status": { "type": "string", "enum": ["completed", "failed"] },
|
|
72
|
+
"findings": { "type": "string", "maxLength": 500 },
|
|
73
|
+
"severity_counts": { "type": "string", "description": "JSON: {critical, high, medium, low}" },
|
|
74
|
+
"top_issues": { "type": "string", "description": "JSON array: [{title, severity, file:line, cwe}]" },
|
|
75
|
+
"error": { "type": "string" }
|
|
76
|
+
},
|
|
77
|
+
"required": ["id", "result_status", "findings"]
|
|
78
|
+
}
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
- Merge: `result_status` → master `status`; copy `findings`, `severity_counts`, `top_issues`, `error`.
|
|
82
|
+
- **Termination contract** (embed in instruction):
|
|
83
|
+
```
|
|
84
|
+
You MUST call report_agent_job_result EXACTLY ONCE before exiting.
|
|
85
|
+
- Success → result_status=completed (severity_counts may be zero if clean)
|
|
86
|
+
- Failure → result_status=failed with error message
|
|
87
|
+
- Timeout → near max_runtime_seconds → result_status=completed with partial top_issues (do not fail the wave for timeout)
|
|
88
|
+
- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
|
|
89
|
+
- Every issue MUST include file:line and CWE reference. No speculation.
|
|
90
|
+
- Read-only. Do NOT modify source.
|
|
91
|
+
- Do NOT write to tasks.csv, wave-*.csv, results.csv. Do NOT call spawn_agents_on_csv (no recursion).
|
|
92
|
+
```
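The filter named in the contract above (`wave==1 AND status=="pending"`) can be sketched as below, together with a small CSV writer. Both `selectWaveRows` and `toCsv` are hypothetical helpers; the quoting rule follows common CSV practice and is an assumption about what `spawn_agents_on_csv` accepts.

```javascript
// Keeps only rows for the given wave that are still pending.
function selectWaveRows(masterRows, wave) {
  return masterRows.filter(
    (row) => String(row.wave) === String(wave) && row.status === "pending"
  );
}

// Serializes rows to CSV, quoting fields that contain commas, quotes, or newlines.
function toCsv(rows, columns) {
  const escape = (value) => {
    const text = String(value ?? "");
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const lines = [columns.join(",")];
  for (const row of rows) {
    lines.push(columns.map((column) => escape(row[column])).join(","));
  }
  return lines.join("\n");
}

const scanTasks = [
  { id: "1", title: "Injection, XSS", wave: "1", status: "pending" },
  { id: "2", title: "Auth", wave: "1", status: "completed" },
  { id: "3", title: "STRIDE core", wave: "2", status: "pending" },
];
const waveOneCsv = toCsv(selectWaveRows(scanTasks, 1), ["id", "title", "wave", "status"]);
```

Comparing `wave` as a string keeps the filter working whether the CSV parser yields `"1"` or `1`.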
|
|
93
|
+
|
|
62
94
|
**Phase 3: Dependency Audit** (all tiers)
|
|
63
95
|
|
|
64
96
|
```bash
|
|
@@ -107,6 +139,12 @@ For each critical module identified in Phase 1:
|
|
|
107
139
|
For `deep` tier, use `spawn_agents_on_csv` to parallelize STRIDE analysis across critical modules,
|
|
108
140
|
one agent per module. Use `request_user_input` to confirm critical module list before spawning.
|
|
109
141
|
|
|
142
|
+
**STRIDE spawn contract**:
|
|
143
|
+
- CSV columns: `id, module_path, threats_to_assess, deps, wave, status` (initial `status="pending"`); filter `wave==2 AND status=="pending"`.
|
|
144
|
+
- Same `output_schema` as the OWASP spawn above, but `top_issues` JSON items use shape `{stride_category, threat, severity, file:line, mitigation}`.
|
|
145
|
+
- Same termination contract as the OWASP spawn above (mandatory `report_agent_job_result`, read-only, no recursion, timeout → partial findings, etc.).
|
|
146
|
+
- Merge: `result_status` → master `status`; copy `findings`, `severity_counts`, `top_issues`, `error`.
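A check of the STRIDE `top_issues` item shape listed above might look like the following sketch. It assumes the location key is literally named `file:line` and that categories use the six standard STRIDE names matched case-insensitively; neither is confirmed by the source, and `checkStrideIssues` is a hypothetical helper.

```javascript
const STRIDE_CATEGORIES = [
  "Spoofing",
  "Tampering",
  "Repudiation",
  "Information Disclosure",
  "Denial of Service",
  "Elevation of Privilege",
];
// Assumed key names, taken from the item shape in the spawn contract.
const STRIDE_ITEM_KEYS = ["stride_category", "threat", "severity", "file:line", "mitigation"];

// Returns a list of problems found in a top_issues JSON array string.
function checkStrideIssues(topIssuesJson) {
  let items;
  try {
    items = JSON.parse(topIssuesJson);
  } catch {
    return ["top_issues is not valid JSON"];
  }
  if (!Array.isArray(items)) return ["top_issues must be a JSON array"];
  const problems = [];
  items.forEach((item, index) => {
    for (const key of STRIDE_ITEM_KEYS) {
      if (!item || typeof item[key] !== "string" || item[key] === "") {
        problems.push(`item ${index}: missing ${key}`);
      }
    }
    const category = item && typeof item.stride_category === "string" ? item.stride_category : null;
    if (category && !STRIDE_CATEGORIES.some((c) => c.toLowerCase() === category.toLowerCase())) {
      problems.push(`item ${index}: unknown stride_category ${category}`);
    }
  });
  return problems;
}

const strideIssues = JSON.stringify([
  {
    stride_category: "Tampering",
    threat: "Unsigned session cookie",
    severity: "high",
    "file:line": "src/session.js:18",
    mitigation: "Sign and verify the cookie",
  },
]);
console.log(checkStrideIssues(strideIssues).length); // 0
```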
|
|
147
|
+
|
|
110
148
|
**Phase 7: Git History Archaeology** (deep only)
|
|
111
149
|
|
|
112
150
|
```bash
|