ai-fob 1.9.8 → 1.10.0

Files changed (33)
  1. package/assets/agents/architect-agent.md +19 -0
  2. package/assets/agents/build-validator-agent.md +63 -0
  3. package/assets/agents/builder-agent.md +9 -0
  4. package/assets/agents/plan-validator-agent.md +13 -0
  5. package/assets/commands/build-feature.md +14 -14
  6. package/assets/commands/build-phase-V2.md +193 -57
  7. package/assets/commands/setup-project.md +50 -4
  8. package/assets/pi/agents/build-phase-architect.md +7 -4
  9. package/assets/pi/agents/build-phase-build-validator.md +40 -10
  10. package/assets/pi/agents/build-phase-plan-validator.md +8 -1
  11. package/assets/pi/extensions/inter-agent-team/README.md +35 -0
  12. package/assets/pi/extensions/inter-agent-team/agents.ts +191 -0
  13. package/assets/pi/extensions/inter-agent-team/index.ts +21 -0
  14. package/assets/pi/extensions/inter-agent-team/package.json +26 -0
  15. package/assets/pi/extensions/inter-agent-team/registry.ts +167 -0
  16. package/assets/pi/extensions/inter-agent-team/runtime/pi-adapter.ts +51 -0
  17. package/assets/pi/extensions/inter-agent-team/runtime/process-tree.ts +20 -0
  18. package/assets/pi/extensions/inter-agent-team/runtime/pty-session.ts +180 -0
  19. package/assets/pi/extensions/inter-agent-team/runtime/runtime-adapter.ts +27 -0
  20. package/assets/pi/extensions/inter-agent-team/runtime/session-manager.ts +171 -0
  21. package/assets/pi/extensions/inter-agent-team/schema.ts +43 -0
  22. package/assets/pi/extensions/inter-agent-team/tool.ts +181 -0
  23. package/assets/pi/extensions/inter-agent-team/types.ts +62 -0
  24. package/assets/pi/extensions/inter-agent-team/ui/teammate-overlay.ts +46 -0
  25. package/assets/pi/extensions/inter-agent-team/widget.ts +73 -0
  26. package/assets/pi/prompts/build-feature.md +28 -19
  27. package/assets/pi/prompts/build-phase.md +62 -21
  28. package/assets/pi/skills/testing-and-validation/SKILL.md +151 -24
  29. package/assets/pi/skills/testing-and-validation/auth.example.json +47 -0
  30. package/assets/skills/testing-and-validation/SKILL.md +164 -13
  31. package/assets/skills/testing-and-validation/auth.example.json +47 -0
  32. package/manifest.json +9 -3
  33. package/package.json +1 -1
@@ -20,3 +20,22 @@ You are an expert system architect. Help the user plan/design a new feature, bug
20
20
  2. Collaborate with the user to brainstorm the best approach to the problem.
21
21
  3. Finalise the plan and present it to the user for approval.
22
22
 
23
+ ## Appendix A: SUCCESS/FAILURE response contract (opt-in)
24
+
25
+ When the calling command supplies an `ARTIFACT_PATH` in your prompt, after writing the plan respond on a single final line:
26
+
27
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
28
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
29
+
30
+ Do not return the full plan body when an ARTIFACT_PATH is provided — the orchestrator reads the file from disk. Calling commands that do not supply an ARTIFACT_PATH (e.g. an interactive brainstorm session) should ignore this contract and present the plan inline as usual.
31
+
32
+ ## Appendix B: Validation metadata preservation (opt-in)
33
+
34
+ When the calling command's spawn prompt asks you to preserve HL-plan validation tags on success criteria (e.g. `[must-pass]`, `[operator-followup]`, `[unsafe-manual]`, `[pre-release]`, `[agent-executable]`, `[operator-required]`, `[external-provider]`, `[dashboard]`, `[destructive]`, `[credentialed]`):
35
+
36
+ - Carry every supplied tag through verbatim into your plan's `## Success Criteria Verification` table and `## Phase {N} Validation` checklist.
37
+ - For untagged criteria, do not invent policy tags as facts. You may add an inferred validation note when the reason is grounded in HL plan/research, but explicitly mark it as inferred.
38
+ - For operator-only, provider-dashboard, destructive, credentialed, or pre-release validation, state whether it is intended as a follow-up and what substitute evidence the builder/validator can collect.
39
+
40
+ This appendix only applies when the spawn prompt asks for it; standalone brainstorming sessions are unaffected.
41
+
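Illustration (not part of the package): the single-line response contract added above can be consumed mechanically. The following is a minimal sketch of how a caller might parse it, assuming the line formats quoted in the hunk. The function name `parse_contract_line` and the returned dictionary keys are hypothetical.

```python
import re
from typing import Optional

# Success form; the trailing key=value fields are optional because only some
# agents (for example the validators) append them.
_SUCCESS = re.compile(
    r"^SUCCESS: wrote (?P<path>.+) \((?P<lines>\d+) lines\)(?P<extra>(?: \S+=\S+)*)$"
)
_FAILURE = re.compile(r"^FAILURE: could not write (?P<path>.+?): (?P<reason>.+)$")


def parse_contract_line(response: str) -> Optional[dict]:
    """Parse the final non-empty line of a sub-agent response.

    Returns a dict for SUCCESS or FAILURE, or None when the line matches
    neither form (a legacy or malformed response).
    """
    lines = [ln.strip() for ln in response.strip().splitlines() if ln.strip()]
    if not lines:
        return None
    last = lines[-1]

    m = _SUCCESS.match(last)
    if m:
        # Hypothetical convention: extra fields are kept as raw strings.
        extra = dict(pair.split("=", 1) for pair in m.group("extra").split())
        return {
            "status": "SUCCESS",
            "path": m.group("path"),
            "lines": int(m.group("lines")),
            **extra,
        }

    m = _FAILURE.match(last)
    if m:
        return {"status": "FAILURE", "path": m.group("path"), "reason": m.group("reason")}

    return None
```

Returning `None` for anything that is not a contract line keeps the "malformed response" case distinct from an explicit `FAILURE`, which the orchestrator steps later in this diff treat differently.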
@@ -30,6 +30,18 @@ Run every validation check. Report exactly what you observe. If a check fails, d
30
30
 
31
31
  NEVER use the macOS `open` command to open URLs in a browser. ALWAYS use `agent-browser open <url>` for all browser-based checks. The `open` command launches Safari, which cannot be automated, snapshotted, or device-emulated. All browser interactions MUST go through `agent-browser` (Chromium via Playwright).
32
32
 
33
+ ## Validation Taxonomy Reference
34
+
35
+ For BLOCKED checks and aggregate-result derivation, follow the **Validation Taxonomy and Blocked Check Severity** section of the `testing-and-validation` skill. Specifically:
36
+
37
+ - Classify every BLOCKED check as `severity: terminal` or `severity: non-terminal`.
38
+ - Record `criticality`, `executor`, `risk` from the taxonomy.
39
+ - Apply the **overall-result derivation** rules to compute the aggregate result: `pass | fail | blocked | pass-with-followups`.
40
+ - Apply the **self-consistency requirement**: if no terminal blockers exist, the overall result must be `pass-with-followups`, not `blocked`.
41
+ - For auth-required browser checks, follow the **Auth and Provider Runtime Configuration** section of the same skill (look up `.claude/skills/testing-and-validation/auth.json`; never mark auth BLOCKED if valid credentials are configured; never print raw secrets).
42
+
43
+ The detailed check schema (Criticality/Executor/Risk columns, `[HL]` prefix semantics, fix-cycle interaction) is provided by the calling command's spawn prompt — this agent applies the taxonomy generically and lets the spawn prompt specify column layout.
44
+
33
45
  ## Checks
34
46
 
35
47
  The calling prompt provides a numbered list of checks to run. Execute every check listed -- no more, no fewer. For each check:
@@ -106,3 +118,54 @@ Overall result determination:
106
118
  - **PASS**: ALL checks passed (no FAIL, no BLOCKED).
107
119
  - **FAIL**: At least one check has result FAIL (regardless of BLOCKED count).
108
120
  - **BLOCKED**: No checks FAILed, but at least one check is BLOCKED. This means the build may be correct but cannot be fully verified.
121
+
122
+ ## Extended report variant (when the spawn prompt provides taxonomy parameters)
123
+
124
+ When the calling command's spawn prompt supplies `severity classification`, `Continuation Assessment`, or `pass-with-followups` as allowed outcomes, use this extended frontmatter and structure in addition to the standard template above. This variant is BACKWARD-COMPATIBLE — fields below are additive and may be omitted when the spawn prompt does not request them.
125
+
126
+ ```yaml
127
+ ---
128
+ task: {TASK_NAME}
129
+ phase: {PHASE_NUMBER}
130
+ phase-name: {PHASE_NAME}
131
+ type: build-validation-report
132
+ cycle: {CYCLE}
133
+ result: pass | fail | blocked | pass-with-followups
134
+ checks-passed: X/Y
135
+ checks-blocked: Z/Y
136
+ checks-blocked-terminal: A/Y
137
+ checks-blocked-nonterminal: B/Y
138
+ date: {current date}
139
+ ---
140
+ ```
141
+
142
+ Extra report sections (append after the standard Verified Checks section):
143
+
144
+ ```markdown
145
+ ## Blocked Checks
146
+ | Check | Severity | Criticality | Executor | Risk | Reason | Substitute Evidence | Continuation Impact | Follow-Up |
147
+ |-------|----------|-------------|----------|------|--------|---------------------|---------------------|-----------|
148
+ | {check name} | terminal | non-terminal | {must-pass | should-pass | operator-followup | unsafe-manual | pre-release | unknown} | {agent-executable | agent-with-env | operator-required | external-provider | dashboard | unknown} | {safe | state-mutating | destructive | credentialed | unknown} | {reason} | {evidence or "none"} | {blocks future development | does not block future development | unknown} | {required action} |
149
+
150
+ If none blocked, write: None.
151
+
152
+ ## Non-Terminal Follow-Ups
153
+ - {check name}: {follow-up description and substitute evidence}
154
+
155
+ If none, write: None.
156
+
157
+ ## Continuation Assessment
158
+ - Can later phases safely continue? {yes | no | unknown}
159
+ - Reason: {brief evidence-based reason}
160
+ - Blocking issues: {terminal blockers or None}
161
+ - Non-terminal follow-ups: {summary or None}
162
+ ```
163
+
164
+ ## SUCCESS/FAILURE response contract
165
+
166
+ When the calling command supplies an `ARTIFACT_PATH`, after writing the report respond on a single final line:
167
+
168
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines) result={pass|fail|blocked|pass-with-followups} checks-passed=X/Y checks-blocked=Z/Y`
169
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
170
+
171
+ Do not return the full report body in your response when an ARTIFACT_PATH is provided — the orchestrator will read the file from disk. Calling commands that do not supply an ARTIFACT_PATH should ignore this contract; the standard template above still applies.
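Illustration (not part of the package): the aggregate result described in this file can be sketched as a small function. The authoritative derivation rules live in the `testing-and-validation` skill, which is not shown in this hunk, so the ordering below is an assumption pieced together from the PASS/FAIL/BLOCKED definitions and the self-consistency requirement quoted above. The `Check` type and `derive_overall` name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Check:
    name: str
    result: str                     # "PASS", "FAIL", or "BLOCKED"
    severity: Optional[str] = None  # "terminal" or "non-terminal" for BLOCKED checks


def derive_overall(checks: Iterable[Check]) -> str:
    """Derive pass | fail | blocked | pass-with-followups from individual checks."""
    checks = list(checks)

    # Any FAIL wins, regardless of how many checks are BLOCKED.
    if any(c.result == "FAIL" for c in checks):
        return "fail"

    blocked = [c for c in checks if c.result == "BLOCKED"]

    # Assumption: a BLOCKED check with no severity recorded is treated as
    # terminal, which is the conservative reading.
    if any(c.severity != "non-terminal" for c in blocked):
        return "blocked"

    # Self-consistency: only non-terminal blockers remain, so the result
    # must be pass-with-followups rather than blocked.
    if blocked:
        return "pass-with-followups"

    return "pass"
```

For example, a run with one passing check and one non-terminal blocked check yields `pass-with-followups`, while adding a single terminal blocker changes the result to `blocked`.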
@@ -95,3 +95,12 @@ Use the format matching your assignment mode.
95
95
 
96
96
  ### Overall: {PASS | FAIL}
97
97
  ```
98
+
99
+ ## SUCCESS/FAILURE response contract (opt-in)
100
+
101
+ When the calling command supplies an `ARTIFACT_PATH` (or a domain-specific path such as `BUILD_REPORT_PATH` or `FIX_REPORT_PATH`) in your prompt, after writing the report respond on a single final line:
102
+
103
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
104
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
105
+
106
+ Do not return the full report body when an ARTIFACT_PATH is provided — the orchestrator reads the file from disk. Calling commands that do not supply an ARTIFACT_PATH should ignore this contract; the existing Build-Only and Build+Validate templates above still apply.
@@ -85,3 +85,16 @@ For each FAIL, provide:
85
85
  ```
86
86
 
87
87
  Overall is PASS only if ALL checks pass. Any single FAIL makes the overall FAIL.
88
+
89
+ ## SUCCESS/FAILURE response contract
90
+
91
+ When the calling command supplies an `ARTIFACT_PATH` in your prompt, after writing the report respond on a single final line:
92
+
93
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines) result={pass|fail} checks-passed=X/N`
94
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
95
+
96
+ Do not return the full report body when an ARTIFACT_PATH is provided — the orchestrator reads the file from disk. Calling commands that do not supply an ARTIFACT_PATH should ignore this contract; the standard report at lines 53-85 still applies.
97
+
98
+ ## Validation metadata preservation (optional, opt-in via spawn prompt)
99
+
100
+ When the calling command's spawn prompt supplies HL-plan validation tags (e.g. `[must-pass]`, `[operator-followup]`, `[unsafe-manual]`, `[pre-release]`, `[agent-executable]`, `[operator-required]`, `[external-provider]`, `[dashboard]`, `[destructive]`, `[credentialed]`), verify the plan preserves them in `## Phase {N} Validation` and `## Success Criteria Verification`. Flag any tag dropped or downgraded without justification. This check is in addition to the standard checks specified by the spawn prompt.
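Illustration (not part of the package): the tag-preservation check amounts to comparing the bracketed tags on an HL-plan criterion with those on the matching plan row. A rough sketch follows, assuming tags are lowercase, hyphenated words in square brackets as in the examples above. The `compare_tags` helper is hypothetical, and it cannot detect a "downgraded" tag, which needs a judgment about tag meaning.

```python
import re
from typing import Dict, List

# Lowercase tags of two or more characters, e.g. [must-pass] or [dashboard].
# The length requirement skips markdown checkboxes such as [x].
TAG = re.compile(r"\[([a-z][a-z-]+)\]")


def compare_tags(hl_criterion: str, plan_row: str) -> Dict[str, List[str]]:
    """Report tags dropped from, or invented in, the plan row."""
    hl_tags = set(TAG.findall(hl_criterion))
    plan_tags = set(TAG.findall(plan_row))
    return {
        "dropped": sorted(hl_tags - plan_tags),
        "invented": sorted(plan_tags - hl_tags),
    }
```

An empty `dropped` and `invented` list for every criterion would indicate the tags were carried through verbatim.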
@@ -246,7 +246,7 @@ Enter a polling loop. Each iteration:
246
246
  3. Cross-reference against HL_CRITERIA[LAST_POLL_PHASE] (the original criteria extracted in Step 0). For each original criterion:
247
247
  - If a matching row exists in the report table with result PASS: mark as **complete**.
248
248
  - If a matching row exists with result FAIL: mark as **incomplete**.
249
- - If a matching row exists with result BLOCKED: mark as **blocked**.
249
+ - If a matching row exists with result BLOCKED: read the criterion's severity if present (terminal | non-terminal). Mark **blocked-terminal** if severity is terminal (or severity is absent and BUILD_VALIDATION_RESULT is "blocked"); mark **blocked-nonterminal** if severity is non-terminal (or severity is absent and BUILD_VALIDATION_RESULT is "pass-with-followups"). If the criterion appears under the new "## Non-Terminal Follow-Ups" section of the report rather than the main results table, also treat as **blocked-nonterminal**.
250
250
  - If no matching row exists (criterion was dropped from the report): mark as **incomplete** and flag as "DROPPED -- not present in phase completion report."
251
251
  4. Check for extra rows in the report table that do not correspond to any original HL plan criterion. Flag these as "EXTRA -- not in original HL plan."
252
252
  5. Log the results to LEARNINGS_FILE:
@@ -518,8 +518,8 @@ When the build process exits (detected in Step 2b.2):
518
518
  find "{SPEC_DIR}" -maxdepth 1 -type d -name "phase{N}_*" 2>/dev/null | head -1
519
519
  ```
520
520
  - If the directory exists, read `{PHASE_DIR}/phase_completion_report.md` using the Read tool.
521
- - If the file exists: Parse its YAML frontmatter for the `status:` field.
522
- - If `status:` starts with `completed`: phase **succeeded**.
521
+ - If the file exists: Parse its YAML frontmatter for the `status:` field AND the new `build-validation-result:` field if present.
522
+ - If `status:` starts with `completed`: phase **succeeded**. Sub-classify using `build-validation-result` (if present): "pass" → clean success; "pass-with-followups" → success with N non-terminal follow-ups (read N from `checks-blocked-nonterminal`); "blocked" → success only if the `status:` value is `completed-with-blockers` (terminal blockers surfaced); otherwise treat as inconsistent and flag.
523
523
  - If `status:` is any other value: phase **completed with issues**.
524
524
  - If the file does not exist: phase **failed or was interrupted**.
525
525
 
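Illustration (not part of the package): the sub-classification rules in this hunk can be read as a decision function over the report's frontmatter. This is a sketch under stated assumptions: the frontmatter is already parsed into a dict, the returned labels are hypothetical, and the handling of an unexpected `build-validation-result` value (anything other than the three named in the hunk) is a guess.

```python
from typing import Optional


def classify_phase(frontmatter: Optional[dict]) -> str:
    """Classify a phase from phase_completion_report.md frontmatter.

    Pass None when the report file does not exist.
    """
    if frontmatter is None:
        return "failed-or-interrupted"

    status = str(frontmatter.get("status", ""))
    if not status.startswith("completed"):
        return "completed-with-issues"

    result = frontmatter.get("build-validation-result")
    if result is None or result == "pass":
        # The field is optional; older reports without it count as clean success.
        return "success"

    if result == "pass-with-followups":
        # The count may be written as "B" or "B/Y"; keep only the numerator.
        raw = str(frontmatter.get("checks-blocked-nonterminal", "0"))
        return f"success-with-{raw.split('/')[0]}-followups"

    if result == "blocked":
        # Only consistent when the status itself surfaces the terminal blockers.
        if status == "completed-with-blockers":
            return "success-with-terminal-blockers"
        return "inconsistent"

    # Assumption: any other value under a "completed" status is flagged.
    return "inconsistent"
```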
@@ -541,11 +541,11 @@ When the build process exits (detected in Step 2b.2):
541
541
  Duration: {minutes}m
542
542
 
543
543
  ### Per-Phase Results
544
- | Phase | Steps Completed | Status | Interventions |
545
- |-------|-----------------|--------|---------------|
546
- | 1 | {count}/6 | {status} | {count} |
547
- | 2 | {count}/6 | {status} | {count} |
548
- | ... | ... | ... | ... |
544
+ | Phase | Steps Completed | Status | Followups | Interventions |
545
+ |-------|-----------------|--------|-----------|---------------|
546
+ | 1 | {count}/6 | {status} | {count of non-terminal blockers, or 0} | {count} |
547
+ | 2 | {count}/6 | {status} | {count} | {count} |
548
+ | ... | ... | ... | ... | ... |
549
549
  ```
550
550
 
551
551
  6. **Comprehensive success criteria audit.** For each phase 1 through TOTAL_PHASES that was NOT already verified during a phase transition in Step 2b.4 (e.g., the final phase, or phases that completed between polls), perform the same verification procedure described in Step 2b.4 (resolve phase directory, read phase_completion_report.md, extract `## HL Plan Success Criteria Results`, cross-reference against HL_CRITERIA[N], log results). Then produce a full audit summary appended to LEARNINGS_FILE:
@@ -554,12 +554,12 @@ When the build process exits (detected in Step 2b.2):
554
554
  ## Success Criteria Audit: Full Build
555
555
  Timestamp: {ISO-8601 timestamp}
556
556
 
557
- | Phase | Total Criteria | Complete | Incomplete | Blocked | Dropped |
558
- |-------|---------------|----------|------------|---------|---------|
559
- | 1 | {count} | {count} | {count} | {count} | {count} |
560
- | 2 | {count} | {count} | {count} | {count} | {count} |
561
- | ... | ... | ... | ... | ... | ... |
562
- | TOTAL | {sum} | {sum} | {sum} | {sum} | {sum} |
557
+ | Phase | Total Criteria | Complete | Incomplete | Blocked (terminal) | Blocked (non-terminal) | Dropped |
558
+ |-------|---------------|----------|------------|--------------------|------------------------|---------|
559
+ | 1 | {count} | {count} | {count} | {count} | {count} | {count} |
560
+ | 2 | {count} | {count} | {count} | {count} | {count} | {count} |
561
+ | ... | ... | ... | ... | ... | ... | ... |
562
+ | TOTAL | {sum} | {sum} | {sum} | {sum} | {sum} | {sum} |
563
563
 
564
564
  {If all criteria across all phases are complete: "All success criteria verified complete across all phases."}
565
565
  {If any are incomplete/blocked/dropped: "WARNING: {total_incomplete + total_blocked + total_dropped} criteria not fully met. Review per-phase verification logs above for details."}
@@ -101,7 +101,7 @@ Used during resume detection to verify step outputs are genuine and complete (no
101
101
  - OR one or more `{PHASE_DIR}/build_report_*.md` files exist, each with more than 5 lines
102
102
 
103
103
  ### Step 5 (Validate Build)
104
- - `{PHASE_DIR}/build_validation_report.md`: VALID if exists AND YAML frontmatter contains `result: pass` AND has a `checks-passed:` field
104
+ - `{PHASE_DIR}/build_validation_report.md`: VALID if exists AND YAML frontmatter contains `result: pass` OR `result: pass-with-followups` AND has a `checks-passed:` field. Resume detection treats both as a completed successful Step 5; downstream consumers (Step 6) read the actual `result` value to differentiate.
105
105
 
106
106
  ### Step 6 (Report)
107
107
  - `{PHASE_DIR}/phase_completion_report.md`: VALID if exists AND has more than 10 lines AND YAML frontmatter contains `type: phase-report`
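Illustration (not part of the package): the new Step 5 rule mixes AND and OR without grouping. The sketch below assumes the intended reading is "result is `pass` or `pass-with-followups`, and a `checks-passed:` field is present", which matches the sentence about resume detection treating both results as success. The helper names and the minimal line-based frontmatter parser are hypothetical; a real implementation might use a YAML library instead.

```python
from typing import Dict, Optional


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Read simple `key: value` pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing fence never found; treat as no frontmatter


def step5_is_valid(report_text: Optional[str]) -> bool:
    """Pass None when build_validation_report.md does not exist."""
    if report_text is None:
        return False
    fm = parse_frontmatter(report_text)
    # Explicit grouping: (pass OR pass-with-followups) AND checks-passed present.
    return fm.get("result") in {"pass", "pass-with-followups"} and "checks-passed" in fm
```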
@@ -299,7 +299,13 @@ Run MARK_STEP_START(1).
299
299
  Write your complete findings to: {PHASE_DIR}/explorer_findings.md
300
300
  Structure your report with these sections: Prerequisites Status, Key Files, Existing Patterns, Integration Points, Shared Utilities, Potential Conflicts, Success Criteria Grounding, Data Flow, File Size Audit.
301
301
  Write the file using the Write tool.
302
- Return the file path in your response.
302
+ ARTIFACT_PATH: {PHASE_DIR}/explorer_findings.md
303
+
304
+ ## Response contract
305
+ After writing the artifact, respond on a single final line:
306
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
307
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
308
+ Do NOT return the full findings body in your response — the orchestrator will read the file from disk.
303
309
  ```
304
310
 
305
311
  2. **Conditionally spawn Docs Researcher** via Task tool (`subagent_type: "docs-researcher-agent"`).
@@ -334,10 +340,20 @@ Run MARK_STEP_START(1).
334
340
  Write your complete findings to: {PHASE_DIR}/docs_research.md
335
341
  Structure your report with these sections: Technologies Researched, Key API References, Configuration Requirements, Pitfalls and Gotchas, Deprecation Notices.
336
342
  Write the file using the Write tool.
337
- Return the file path in your response.
343
+ ARTIFACT_PATH: {PHASE_DIR}/docs_research.md
344
+
345
+ ## Response contract
346
+ After writing the artifact, respond on a single final line:
347
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
348
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
349
+ Do NOT return the full findings body in your response.
338
350
  ```
339
351
 
340
- 3. **Verify output files**: After both Task calls return, read `{PHASE_DIR}/explorer_findings.md` (required) and `{PHASE_DIR}/docs_research.md` (if docs researcher was spawned). Verify files exist and contain substantive content (not empty or trivial). If a file was not written by the agent (e.g., permission mode blocked writes), write the returned content to the expected path using the Write tool. If a file is missing and no content was returned, warn and note in the report.
352
+ 3. **Verify output files**: After both Task calls return:
353
+ a. Parse the final line of each sub-agent response. Expect either `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)` or `FAILURE: could not write {ARTIFACT_PATH}: {reason}`.
354
+ b. If `SUCCESS`, read `{PHASE_DIR}/explorer_findings.md` (required) and `{PHASE_DIR}/docs_research.md` (if docs researcher was spawned) using the Read tool, and verify they exist and contain substantive content (not empty or trivial). If they do, proceed.
355
+ c. If `FAILURE`, abort this step and present the failure reason to the user. Do NOT silently fabricate the file by writing the response body — a sub-agent that could not write its own artifact should not have its response treated as authoritative content.
356
+ d. If the sub-agent did not emit either `SUCCESS:` or `FAILURE:` (legacy or malformed response), fall back to the previous behavior: check whether the expected file exists; if not, treat as FAILURE and abort. Do NOT write the sub-agent's prose response to the artifact path — only treat content as canonical when emitted from the Write tool by the agent itself.
341
357
 
342
358
  Run MARK_STEP_COMPLETE(1).
343
359
 
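Illustration (not part of the package): the four-way verification procedure above (SUCCESS, FAILURE, malformed with file present, malformed with file missing) can be sketched as follows. The `verify_artifact` name, the `ArtifactError` type, and the `min_lines` threshold for "substantive content" are all hypothetical, since the prompt does not define a numeric cutoff.

```python
from pathlib import Path


class ArtifactError(RuntimeError):
    """Raised when a sub-agent artifact cannot be trusted; the step aborts."""


def verify_artifact(response: str, artifact: Path, min_lines: int = 5) -> str:
    """Return the artifact text, reading it only from disk.

    The response body is never written to the artifact path: content is
    canonical only when the agent wrote the file itself.
    """
    stripped = response.strip()
    last = stripped.splitlines()[-1].strip() if stripped else ""

    if last.startswith("FAILURE:"):
        raise ArtifactError(last)

    # SUCCESS and malformed responses both fall through to the same check:
    # the file must exist on disk, otherwise the step is treated as failed.
    if not artifact.is_file():
        raise ArtifactError(f"artifact missing: {artifact}")

    text = artifact.read_text(encoding="utf-8")
    if len(text.splitlines()) < min_lines:
        raise ArtifactError(f"artifact is empty or trivial: {artifact}")
    return text
```

The design point carried over from the diff is that there is no branch that writes `response` to `artifact`; the only recovery path for a missing file is to abort.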
@@ -440,15 +456,27 @@ After research is complete, spawn the architect agent to create the implementati
440
456
  Then follow the Single-Phase Implementation Plan format provided below.
441
457
 
442
458
  Write the file using the Write tool.
443
- Return the file path in your response.
459
+ ARTIFACT_PATH: {PHASE_DIR}/plan_V1.md
460
+
461
+ ## Validation metadata preservation
462
+ This phase's HL plan MAY include validation tags on success criteria such as `[must-pass]`, `[should-pass]`, `[operator-followup]`, `[unsafe-manual]`, `[pre-release]`, `[agent-executable]`, `[agent-with-env]`, `[operator-required]`, `[external-provider]`, `[dashboard]`, `[requires-approval]`, `[destructive]`, `[credentialed]`. Carry every supplied tag through verbatim into your plan's `## Success Criteria Verification` table and `## Phase {N} Validation` checklist. For untagged criteria, do not invent policy tags as facts; you may add an inferred validation note grounded in HL plan/research and explicitly mark it as inferred. For operator-only/provider-dashboard/destructive/credentialed/pre-release validation, state whether it is intended as a follow-up and what substitute evidence can be collected.
463
+
464
+ ## Response contract
465
+ After writing the artifact, respond on a single final line:
466
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
467
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
468
+ Do NOT return the full plan body in your response — the orchestrator will read the file from disk.
444
469
  ```
445
470
 
446
- 3. **Verify output file**: After the Task call returns, read `{PHASE_DIR}/plan_V1.md`. Verify the file exists and contains:
447
- - YAML frontmatter with correct fields (`task`, `category`, `spec`, `phase`, `phase-name`, `type`, `version`, `status`, `date`, `source`)
448
- - All required sections from the Single-Phase Implementation Plan format
449
- - Inline source citations (look for `// per:` patterns)
450
- - Docs gap flags if applicable (look for `⚠️ DOCS GAP`)
451
- If the file was not written by the agent, write the returned content to `{PHASE_DIR}/plan_V1.md` using the Write tool. If the file is missing and no content was returned, warn and note in the report.
471
+ 3. **Verify output file**: After the Task call returns:
472
+ a. Parse the final line of the sub-agent response. Expect either `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)` or `FAILURE: could not write {ARTIFACT_PATH}: {reason}`.
473
+ b. If `SUCCESS`, read `{PHASE_DIR}/plan_V1.md` and verify the file exists and contains:
474
+ - YAML frontmatter with correct fields (`task`, `category`, `spec`, `phase`, `phase-name`, `type`, `version`, `status`, `date`, `source`)
475
+ - All required sections from the Single-Phase Implementation Plan format
476
+ - Inline source citations (look for `// per:` patterns)
477
+ - Docs gap flags if applicable (look for `⚠️ DOCS GAP`)
478
+ c. If `FAILURE`, abort this step and present the failure reason. Do NOT write the sub-agent's prose response to `plan_V1.md` — only treat content as canonical when emitted from the Write tool by the agent itself.
479
+ d. If the response is malformed (no SUCCESS/FAILURE line), check whether the file exists; if not, abort. Do not synthesize the plan from the response body.
452
480
 
453
481
  Run MARK_STEP_COMPLETE(2).
454
482
 
@@ -629,12 +657,25 @@ Write your validation report to: {PHASE_DIR}/plan_validation_report.md
629
657
  Use the YAML frontmatter format from your agent instructions with the parameters above. Include `checks-passed: X/11` in the frontmatter.
630
658
 
631
659
  Write the file using the Write tool.
632
- Return the file path AND the overall result (pass or fail) in your response.
660
+ ARTIFACT_PATH: {PHASE_DIR}/plan_validation_report.md
661
+
662
+ ## Additional check: Validation tag preservation
663
+ In addition to the 11 standard checks, verify that the plan preserves every HL-plan validation tag (`[must-pass]`, `[should-pass]`, `[operator-followup]`, `[unsafe-manual]`, `[pre-release]`, `[agent-executable]`, `[agent-with-env]`, `[operator-required]`, `[external-provider]`, `[dashboard]`, `[requires-approval]`, `[destructive]`, `[credentialed]`) supplied by the HL plan's success criteria. Flag any dropped, downgraded, or invented tag. Treat this as part of check 9 (Validation step coverage) for the purposes of pass/fail counting.
664
+
665
+ ## Response contract
666
+ After writing the artifact, respond on a single final line:
667
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines) result={pass|fail} checks-passed=X/11`
668
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
669
+ Do NOT return the full report body in your response — the orchestrator will read the file from disk.
633
670
  ```
634
671
 
635
672
  #### 3b. Read Validation Result
636
673
 
637
- After the Task returns, read `{PHASE_DIR}/plan_validation_report.md`. Extract the `result` field from the YAML frontmatter.
674
+ After the Task returns:
675
+ 1. Parse the final line of the sub-agent response. Expect `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines) result={pass|fail} checks-passed=X/11` or `FAILURE: ...`.
676
+ 2. If `SUCCESS`, read `{PHASE_DIR}/plan_validation_report.md` and extract the `result` field from the YAML frontmatter. The frontmatter is authoritative; the SUCCESS line is a fast-path hint.
677
+ 3. If `FAILURE`, present the failure reason and abort this step. Do NOT write the sub-agent's prose response to the report path.
678
+ 4. If the response is malformed (no SUCCESS/FAILURE), check whether the report exists; if not, abort. Do not synthesize the report from the response body.
638
679
 
639
680
  - **If `result: pass`**: Set `VALIDATION_RESULT = "pass"`. Run MARK_STEP_COMPLETE(3). Proceed to Step 4 (Build).
640
681
  - **If `result: fail`**: Proceed to 3c (Correction).
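Illustration (not part of the package): step 3b states that the report frontmatter is authoritative and the SUCCESS line is only a fast-path hint. One way to honor that is to always return the frontmatter value and merely report a disagreement. The `resolve_result` helper below is a hypothetical sketch of that precedence, taking both values as already-extracted strings.

```python
from typing import Optional, Tuple


def resolve_result(hint: Optional[str], frontmatter_result: Optional[str]) -> Tuple[str, bool]:
    """Return (result, mismatch); the frontmatter value always wins.

    `hint` is the result= value from the SUCCESS line, or None when the
    response was malformed but the report file exists.
    """
    if frontmatter_result is None:
        raise ValueError("report frontmatter has no result field")
    mismatch = hint is not None and hint != frontmatter_result
    return frontmatter_result, mismatch
```

A caller might log the mismatch flag for later review while still branching on the returned result.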
@@ -796,7 +837,13 @@ Package Manager: {PACKAGE_MANAGER}
796
837
  ## Output
797
838
  Write your build report to: {PHASE_DIR}/build_report.md
798
839
  Write the file using the Write tool.
799
- Return the file path in your response.
840
+ ARTIFACT_PATH: {PHASE_DIR}/build_report.md
841
+
842
+ ## Response contract
843
+ After writing the artifact, respond on a single final line:
844
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
845
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
846
+ Do NOT return the full report body — the orchestrator will read the file from disk.
800
847
  ```
801
848
 
802
849
  **Parallel domain builders** (domains marked `| PARALLEL`):
@@ -845,7 +892,13 @@ Package Manager: {PACKAGE_MANAGER}
845
892
  ## Output
846
893
  Write your build report to: {PHASE_DIR}/build_report_{DOMAIN_NAME_KEBAB}.md
847
894
  Write the file using the Write tool.
848
- Return the file path in your response.
895
+ ARTIFACT_PATH: {PHASE_DIR}/build_report_{DOMAIN_NAME_KEBAB}.md
896
+
897
+ ## Response contract
898
+ After writing the artifact, respond on a single final line:
899
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
900
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
901
+ Do NOT return the full report body — the orchestrator will read the file from disk.
849
902
  ```
850
903
 
851
904
  For any domains NOT marked `| PARALLEL`, combine them into a single sequential builder using the single-builder prompt pattern above, with only those domains' content included.
@@ -854,9 +907,11 @@ For any domains NOT marked `| PARALLEL`, combine them into a single sequential b
854
907
 
855
908
  After all Task calls return:
856
909
 
857
- 1. Read each builder report file (`build_report.md` or `build_report_{domain}.md`)
858
- 2. If a report file was not written by the agent, write the returned content to the expected path using the Write tool
859
- 3. Extract `### Issues Encountered` from each report
910
+ 1. For each builder, parse the final line of its response. Expect `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)` or `FAILURE: ...`.
911
+ 2. If `SUCCESS`, read each builder report file (`build_report.md` or `build_report_{domain}.md`) using the Read tool.
912
+ 3. If `FAILURE` on any builder, present the failure reason and abort Step 4. Do NOT write the builder's prose response to the report path — only treat content as canonical when emitted from the Write tool by the agent itself.
913
+ 4. If a builder response is malformed (no SUCCESS/FAILURE line), check whether the expected file exists; if not, treat as FAILURE and abort.
914
+ 5. Extract `### Issues Encountered` from each report
860
915
  4. If ANY builder reports issues (section content is not "None"):
861
916
  - Present:
862
917
  ```
@@ -981,34 +1036,42 @@ Read the build report(s) for context on what was built:
981
1036
  "No dev server is running. There are no browser checks in this validation."}
982
1037
 
983
1038
  ## Test Credentials
984
- {Read the Test Credentials section from the testing-and-validation skill. Extract the Login URL, Username, Password, and Post-Login URL values.
1039
+ {Look up `.claude/skills/testing-and-validation/auth.json` (the gitignored local credentials file). If `auth.json` does not exist, fall back to the legacy SKILL.md fields ("Login URL / Username / Password / Post-Login URL" in the Test Credentials table) for backward compatibility with projects scaffolded before auth.json was introduced.
1040
+
1041
+ Source resolution:
1042
+ PRIMARY: `.claude/skills/testing-and-validation/auth.json` if it exists. Read `browserAuth.enabled`, `browserAuth.loginUrl`, `browserAuth.postLoginUrl`, `browserAuth.users.primary.username`, `browserAuth.users.primary.password`.
1043
+ FALLBACK: the Test Credentials section of `.claude/skills/testing-and-validation/SKILL.md` (Login URL / Username / Password / Post-Login URL).
985
1044
 
986
- If Username is NOT "NONE" and NOT "REPLACE_WITH_TEST_USERNAME":
987
- "Test credentials are configured for this project. You MUST authenticate before running browser checks that require a logged-in session.
1045
+ Then branch:
988
1046
 
989
- - Login URL: {Login URL from skill}
990
- - Username: {Username from skill}
991
- - Password: {Password from skill}
992
- - Post-Login URL: {Post-Login URL from skill}
1047
+ If browserAuth.enabled is true AND primary username is set AND is NOT "REPLACE_WITH_TEST_USERNAME" (or legacy Username is NOT "NONE" and NOT "REPLACE_WITH_TEST_USERNAME"):
1048
+ "Test credentials are configured. You MUST authenticate before running browser checks that require a logged-in session.
1049
+
1050
+ - Login URL: {browserAuth.loginUrl OR legacy Login URL}
1051
+ - Username: [REDACTED — present]
1052
+ - Password: [REDACTED — present]
1053
+ - Post-Login URL: {browserAuth.postLoginUrl OR legacy Post-Login URL}
993
1054
 
994
1055
  Authentication procedure (use the agent-browser skill workflow):
995
1056
  1. Open the Login URL: `agent-browser open {Login URL}`
996
1057
  2. Take a snapshot to identify form elements: `agent-browser snapshot -i`
997
- 3. Fill in the username field with the Username value
998
- 4. Fill in the password field with the Password value
1058
+ 3. Fill in the username field with the configured username
1059
+ 4. Fill in the password field with the configured password
999
1060
  5. Click the submit/login button
1000
1061
  6. Wait for navigation to the Post-Login URL: `agent-browser wait --url \"**{Post-Login URL path}\"`
1001
1062
  7. Take a snapshot to confirm successful login
1002
- 8. Save the authenticated state: `agent-browser state save auth.json`
1003
- 9. For subsequent browser checks, load the saved state: `agent-browser state load auth.json`
1063
+ 8. Save the authenticated browser state: `agent-browser state save browser-state.json` (NOTE: this is the agent-browser playwright state file; it is DISTINCT from the credentials file `auth.json` in the skill directory)
1064
+ 9. For subsequent browser checks, load the saved state: `agent-browser state load browser-state.json`
1065
+
1066
+ REDACTION REQUIREMENT: Never print raw usernames, passwords, API keys, webhook secrets, or full credential values in your report, evidence, or summary. Redact as `[REDACTED]` or report only presence/absence. Failure to redact must be treated as a critical reporting bug.
1004
1067
 
1005
- IMPORTANT: Because credentials are configured, you MUST NOT mark any browser check as BLOCKED due to authentication requirements. If authentication fails, mark the check as FAIL (not BLOCKED) and include the error details."
1068
+ IMPORTANT: Because credentials are configured, you MUST NOT mark any browser check as BLOCKED due to authentication requirements. If authentication fails, mark the check as FAIL (not BLOCKED) and include the error details with credentials redacted."
1006
1069
 
1007
- If Username is "NONE":
1070
+ If browserAuth.enabled is false AND no legacy credentials configured (or Username is "NONE"):
1008
1071
  "This project does not require authentication for browser tests. No login step is needed."
1009
1072
 
1010
- If Username is "REPLACE_WITH_TEST_USERNAME":
1011
- "Test credentials have NOT been configured in the testing-and-validation skill. The placeholder values have not been replaced. Browser checks that require authentication MUST be marked as BLOCKED with reason: 'Test credentials not configured in testing-and-validation skill -- user must replace placeholder values.'"}
1073
+ If neither auth.json nor legacy SKILL.md contains valid credentials (auth.json missing AND legacy Username is "REPLACE_WITH_TEST_USERNAME", or auth.json browserAuth.enabled is true but users.primary.username is "REPLACE_WITH_TEST_USERNAME"):
1074
+ "Test credentials have NOT been configured. Browser checks that require authentication MUST be marked as BLOCKED with severity classification per the Validation Taxonomy in the testing-and-validation skill. Use reason: 'Test credentials not configured (no auth.json present and SKILL.md placeholder unchanged) — user must populate .claude/skills/testing-and-validation/auth.json from auth.example.json.' Mark severity as non-terminal if substitute evidence for the underlying behavior exists (source review, build/lint/typecheck pass, adjacent non-auth flows pass); mark terminal if the entire phase depends on authenticated browser flow."}
1012
1075
 
1013
1076
  ## Mobile Device Testing
1014
1077
  {Read the Mobile Test Devices section from the testing-and-validation skill. Extract the Primary Device and Secondary Device values.
@@ -1042,10 +1105,21 @@ The checks prefixed with `[HL]` are the authoritative success criteria from the
1042
1105
  - If you cannot execute a check (e.g., requires a service that is not running, requires manual user action, depends on an external system), mark it as BLOCKED with a specific reason
1043
1106
 
1044
1107
  ## Check Result States
1045
- For each check, report one of three results:
1108
+ For each check, report one of three per-check results: PASS, FAIL, or BLOCKED.
1046
1109
  - **PASS**: The check was executed and the criterion is fully satisfied. Include concrete evidence.
1047
1110
  - **FAIL**: The check was executed and the criterion is NOT satisfied. Include what was expected vs. what was observed.
1048
- - **BLOCKED**: The check could NOT be executed due to infrastructure or environmental limitations OUTSIDE your control. Include the specific reason why (e.g., "Dev server not running", "External API unavailable", "Test credentials not configured in testing-and-validation skill"). BLOCKED is a non-pass result. IMPORTANT: If test credentials are provided in the Test Credentials section above, you MUST NOT mark browser checks as BLOCKED due to authentication -- you must authenticate using the provided credentials and report PASS or FAIL based on the result.
1111
+ - **BLOCKED**: The check could NOT be executed due to infrastructure or environmental limitations OUTSIDE your control. Include the specific reason why (e.g., "Dev server not running", "External API unavailable", "Test credentials not configured"). BLOCKED is a non-pass result. IMPORTANT: If test credentials are provided in the Test Credentials section above, you MUST NOT mark browser checks as BLOCKED due to authentication -- you must authenticate using the provided credentials and report PASS or FAIL based on the result.
1112
+
1113
+ For every BLOCKED check, follow the **Validation Taxonomy and Blocked Check Severity** section of the `testing-and-validation` skill: record `severity: terminal | non-terminal`, `criticality`, `executor`, `risk`, blocked reason, substitute evidence (or why none exists), continuation impact, and recommended follow-up. Use the extended report variant in your agent instructions for the Blocked Checks table, Non-Terminal Follow-Ups list, and Continuation Assessment section.
1114
+
1115
+ ## Overall result derivation
1116
+ Compute the aggregate `result` for the report frontmatter using these rules in order:
1117
+ 1. If any check is FAIL → `result: fail`.
1118
+ 2. Else if any BLOCKED check is **terminal** → `result: blocked`.
1119
+ 3. Else if any BLOCKED check is **non-terminal** AND substitute evidence is documented → `result: pass-with-followups`.
1120
+ 4. Else → `result: pass`.
1121
+
1122
+ Self-consistency requirement: do not write `result: blocked` when `checks-blocked-terminal == 0`; in that case write `pass-with-followups`. Do not write `result: pass` when any non-terminal blockers exist — write `pass-with-followups` (or `blocked` if substitute evidence is missing).
1049
1123
 
1050
1124
  ## Validation Parameters
1051
1125
  - task: {TASK_NAME}
@@ -1059,21 +1133,40 @@ For each check, report one of three results:
1059
1133
  ## Output
1060
1134
  Write your validation report to: {PHASE_DIR}/build_validation_report.md
1061
1135
 
1062
- Use the YAML frontmatter format from your agent instructions with the parameters above. Include `checks-passed: X/{BUILD_CHECK_COUNT}` and `checks-blocked: Y/{BUILD_CHECK_COUNT}` in the frontmatter.
1136
+ Use the **extended report variant** from your agent instructions (the variant with taxonomy-aware frontmatter and Blocked Checks/Non-Terminal Follow-Ups/Continuation Assessment sections). Frontmatter MUST include:
1137
+ - `result: pass | fail | blocked | pass-with-followups` (per the Overall result derivation rules above)
1138
+ - `checks-passed: X/{BUILD_CHECK_COUNT}`
1139
+ - `checks-blocked: Y/{BUILD_CHECK_COUNT}`
1140
+ - `checks-blocked-terminal: A/{BUILD_CHECK_COUNT}`
1141
+ - `checks-blocked-nonterminal: B/{BUILD_CHECK_COUNT}`
1063
1142
 
1064
1143
  Write the file using the Write tool.
1065
- Return the file path AND the overall result (pass, fail, or blocked) in your response.
1144
+ ARTIFACT_PATH: {PHASE_DIR}/build_validation_report.md
1145
+
1146
+ ## Response contract
1147
+ After writing the artifact, respond on a single final line:
1148
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines) result={pass|fail|blocked|pass-with-followups} checks-passed=X/Y checks-blocked=Z/Y checks-blocked-terminal=A/Y checks-blocked-nonterminal=B/Y`
1149
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
1150
+ Do NOT return the full report body — the orchestrator will read the file from disk.
1066
1151
  ```
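
The "Overall result derivation" rules in the validator prompt above can be read as a small ordered decision function. The following is only a sketch under assumptions: the `Check` dataclass and its field names are hypothetical and not part of ai-fob, and treating a non-terminal blocker that lacks substitute evidence as `blocked` is one reading of the self-consistency note, not a confirmed behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Check:
    """Hypothetical per-check record; field names are illustrative only."""
    status: str                          # "PASS", "FAIL", or "BLOCKED"
    severity: str = ""                   # "terminal" or "non-terminal" (BLOCKED only)
    has_substitute_evidence: bool = False


def derive_result(checks: List[Check]) -> str:
    """Apply the derivation rules in order; the first matching rule wins."""
    # Rule 1: any FAIL makes the whole report a fail.
    if any(c.status == "FAIL" for c in checks):
        return "fail"

    blocked = [c for c in checks if c.status == "BLOCKED"]

    # Rule 2: any terminal blocker makes the report blocked.
    if any(c.severity == "terminal" for c in blocked):
        return "blocked"

    # Rule 3: only non-terminal blockers remain.
    if blocked:
        if all(c.has_substitute_evidence for c in blocked):
            return "pass-with-followups"
        # Assumption: missing substitute evidence escalates to blocked.
        return "blocked"

    # Rule 4: nothing failed and nothing is blocked.
    return "pass"
```

For example, a report with eight PASS checks and two non-terminal BLOCKED checks that both document substitute evidence would derive `pass-with-followups` under this sketch.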
1067
1152
 
1068
1153
  #### 5d. Read Build Validation Result
1069
1154
 
1070
- After the Task returns, read `{PHASE_DIR}/build_validation_report.md`. Extract the `result` field from the YAML frontmatter.
1155
+ After the Task returns:
1156
+ 1. Parse the final line of the sub-agent response. Expect `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines) result={...} ...` or `FAILURE: ...`.
1157
+ 2. If `SUCCESS`, read `{PHASE_DIR}/build_validation_report.md` and extract the `result`, `checks-passed`, `checks-blocked`, `checks-blocked-terminal`, and `checks-blocked-nonterminal` fields from the YAML frontmatter. The frontmatter is authoritative.
1158
+ 3. If `FAILURE`, present the failure reason and treat as a cycle failure (proceed to 5e).
1159
+ 4. If the response is malformed (no SUCCESS/FAILURE line), check whether the report exists; if not, treat as FAIL and proceed to 5e. Do NOT write the sub-agent's prose response to the report path.
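
Steps 1 to 4 above amount to classifying the sub-agent's final line as success, failure, or malformed. A minimal sketch of that parsing follows, assuming the artifact path contains no spaces; the function name `parse_final_line` and the returned dictionary shape are hypothetical, and the frontmatter on disk remains the authoritative source for the field values.

```python
import re

# Patterns follow the response contract wording; they assume a path without spaces.
SUCCESS_RE = re.compile(
    r"^SUCCESS: wrote (?P<path>\S+) \((?P<lines>\d+) lines\)(?P<rest>.*)$"
)
FAILURE_RE = re.compile(
    r"^FAILURE: could not write (?P<path>\S+): (?P<reason>.+)$"
)


def parse_final_line(response: str) -> dict:
    """Classify the last non-empty line of a sub-agent response."""
    lines = [ln.strip() for ln in response.splitlines() if ln.strip()]
    if not lines:
        return {"status": "malformed"}
    last = lines[-1]

    m = SUCCESS_RE.match(last)
    if m:
        # Trailing key=value tokens (result=..., checks-passed=...) are optional.
        fields = dict(
            token.split("=", 1) for token in m.group("rest").split() if "=" in token
        )
        return {
            "status": "success",
            "path": m.group("path"),
            "line_count": int(m.group("lines")),
            "fields": fields,
        }

    m = FAILURE_RE.match(last)
    if m:
        return {"status": "failure", "path": m.group("path"), "reason": m.group("reason")}

    # Neither contract line was found; the caller checks whether the report exists.
    return {"status": "malformed"}
```

A `malformed` status only signals that the contract line is missing; per step 4, the caller still checks the report path before treating the cycle as FAIL.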
1071
1160
 
1072
- If the file was not written by the agent, write the returned content to `{PHASE_DIR}/build_validation_report.md` using the Write tool. If the file is missing and no content was returned, warn and treat as FAIL.
1161
+ Then branch on `result`:
1073
1162
 
1074
- - **If `result: pass`**: Set `BUILD_VALIDATION_RESULT = "pass"`. Extract `checks-passed` value as `BUILD_CHECKS_PASSED`. Extract `checks-blocked` value as `BUILD_CHECKS_BLOCKED`. Proceed to 5f (Cleanup).
1075
- - **If `result: fail`**: Extract `checks-passed` value as `BUILD_CHECKS_PASSED`. Extract `checks-blocked` value as `BUILD_CHECKS_BLOCKED`. Proceed to 5e (Fix and Re-validate).
1076
- - **If `result: blocked`**: Extract `checks-passed` value as `BUILD_CHECKS_PASSED`. Extract `checks-blocked` value as `BUILD_CHECKS_BLOCKED`. Review each BLOCKED check reason. If any BLOCKED check cites "authentication", "login", "credentials", or "authenticated session" AND test credentials are configured in the testing-and-validation skill (Username is not "NONE" and not "REPLACE_WITH_TEST_USERNAME"), this is a validation error -- the validator should have authenticated. Log a warning: "Validator marked auth-requiring check as BLOCKED despite credentials being configured. Re-running validation." Decrement BUILD_VALIDATION_CYCLE by 1 (to not count this as a failed cycle) and return to 5c to re-spawn the validator. Otherwise, if ALL non-pass results are genuinely BLOCKED (infrastructure/environmental), set `BUILD_VALIDATION_RESULT = "blocked"`. Do NOT enter the fix loop -- genuinely BLOCKED checks cannot be fixed by a builder agent. Proceed to 5f (Cleanup). The blocked checks will be surfaced in the phase completion report for human action.
1163
+ - **If `result: pass`**: Set `BUILD_VALIDATION_RESULT = "pass"`. Extract `checks-passed`, `checks-blocked`, `checks-blocked-terminal`, `checks-blocked-nonterminal`. Proceed to 5f (Cleanup).
1164
+ - **If `result: pass-with-followups`**: Set `BUILD_VALIDATION_RESULT = "pass-with-followups"`. Extract the same fields. The build is accepted; non-terminal follow-ups will be surfaced in the phase completion report for human action. Proceed to 5f (Cleanup). Do NOT enter the fix loop — non-terminal blockers are operator/dashboard/external work, not code defects.
1165
+ - **If `result: fail`**: Extract all five fields. Proceed to 5e (Fix and Re-validate).
1166
+ - **If `result: blocked`**: Extract all five fields. This means at least one TERMINAL blocker exists. Auth-credentials recovery check (constrained):
1167
+ - Review each terminal BLOCKED check reason. If a BLOCKED check cites "authentication", "login", "credentials", or "authenticated session" AND `auth.json` is present with `browserAuth.enabled: true` AND `users.primary.username` is set (or legacy SKILL.md Username is not "NONE" and not "REPLACE_WITH_TEST_USERNAME") AND this is the FIRST cycle (BUILD_VALIDATION_CYCLE == 1) AND the check has no documented substitute evidence: re-spawn the validator ONCE without incrementing BUILD_VALIDATION_CYCLE. Log: "Validator marked auth-requiring check as BLOCKED despite configured credentials on first cycle. Re-running once without cycle increment."
1168
+ - The no-increment recovery is allowed ONLY when ALL of the conditions above hold. On any subsequent cycle, on any non-auth-related blocker, or when credentials are unconfigured, increment the cycle normally and treat the result as terminal-blocked.
1169
+ - If recovery does not apply, set `BUILD_VALIDATION_RESULT = "blocked"`. Do NOT enter the fix loop — genuinely BLOCKED checks cannot be fixed by a builder agent. Proceed to 5f (Cleanup). The terminal blocked checks will be surfaced in the phase completion report for human action.
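
The constrained auth-credentials recovery above is a conjunction of conditions, which can be easier to audit when written as one predicate. This is a sketch only: the function name, its parameters, and the `already_retried` flag (used here to model "re-spawn ONCE") are hypothetical, and the keyword match is a plain case-insensitive substring test.

```python
# Phrases that mark a blocked reason as auth-related, as listed in step 5d.
AUTH_KEYWORDS = ("authentication", "login", "credentials", "authenticated session")


def recovery_applies(
    blocked_reason: str,
    credentials_configured: bool,
    cycle: int,
    has_substitute_evidence: bool,
    already_retried: bool = False,
) -> bool:
    """Return True only when every recovery condition holds at once."""
    cites_auth = any(k in blocked_reason.lower() for k in AUTH_KEYWORDS)
    return (
        cites_auth
        and credentials_configured       # auth.json enabled, or legacy fields set
        and cycle == 1                   # first validation cycle only
        and not has_substitute_evidence  # nothing else vouches for the check
        and not already_retried          # assumption: tracks the single re-spawn
    )
```

When the predicate is false, the orchestrator increments the cycle as usual and the result stays terminal-blocked.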
1077
1170
 
1078
1171
  #### 5e. Fix and Re-validate
1079
1172
 
@@ -1139,13 +1232,21 @@ Use this format:
1139
1232
  - ... (or "None")
1140
1233
 
1141
1234
  Write the file using the Write tool.
1142
- Return the file path in your response.
1235
+ ARTIFACT_PATH: {PHASE_DIR}/fix_report_cycle{BUILD_VALIDATION_CYCLE}.md
1236
+
1237
+ ## Response contract
1238
+ After writing the artifact, respond on a single final line:
1239
+ - On success: `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)`
1240
+ - On failure: `FAILURE: could not write {ARTIFACT_PATH}: {reason}`
1241
+ Do NOT return the full fix-report body — the orchestrator will read the file from disk.
1143
1242
  ```
1144
1243
 
1145
1244
  After the Task returns:
1146
- 1. Read `{PHASE_DIR}/fix_report_cycle{BUILD_VALIDATION_CYCLE}.md`
1147
- 2. If the file was not written by the agent, write the returned content to the expected path using the Write tool
1148
- 3. Check for "Unresolved Issues" -- if any exist (section content is not "None"), note them for the summary
1245
+ 1. Parse the final line of the sub-agent response. Expect `SUCCESS: wrote {ARTIFACT_PATH} ({line_count} lines)` or `FAILURE: ...`.
1246
+ 2. If `SUCCESS`, read `{PHASE_DIR}/fix_report_cycle{BUILD_VALIDATION_CYCLE}.md` using the Read tool.
1247
+ 3. If `FAILURE`, present the failure reason and abort the fix cycle (treat as if no fix progress was made; proceed to 5e-iv re-validation, which will likely re-FAIL and consume a cycle).
1248
+ 4. If the response is malformed (no SUCCESS/FAILURE line), check whether the fix report exists; if not, treat as no-progress and proceed to 5e-iv. Do NOT write the sub-agent's prose response to the fix report path.
1249
+ 5. Check for "Unresolved Issues" — if any exist (section content is not "None"), note them for the summary.
1149
1250
 
1150
1251
  ##### 5e-ii. Post-fix git checkpoint
1151
1252
 
@@ -1165,7 +1266,8 @@ If DEV_SERVER_STARTED == true:
1165
1266
  Repeat from 5c -- spawn the build validator again with `cycle: {BUILD_VALIDATION_CYCLE}`. The validator reads the (possibly fixed) codebase and overwrites `build_validation_report.md`.
1166
1267
 
1167
1268
  - **If `result: pass`**: Set `BUILD_VALIDATION_RESULT = "pass"`. Proceed to 5f (Cleanup).
1168
- - **If `result: blocked`**: All non-pass results are BLOCKED (no FAIL items remain). Set `BUILD_VALIDATION_RESULT = "blocked"`. Extract `checks-blocked` value as `BUILD_CHECKS_BLOCKED`. Proceed to 5f (Cleanup) -- do not re-enter the fix loop.
1269
+ - **If `result: pass-with-followups`**: Set `BUILD_VALIDATION_RESULT = "pass-with-followups"`. Extract `checks-blocked-terminal` (must be 0) and `checks-blocked-nonterminal` for the report. Proceed to 5f (Cleanup) -- do not re-enter the fix loop. Non-terminal follow-ups are operator/dashboard work, not code defects.
1270
+ - **If `result: blocked`**: At least one terminal blocker remains (no FAIL items, but core correctness cannot be established). Set `BUILD_VALIDATION_RESULT = "blocked"`. Extract `checks-blocked` and `checks-blocked-terminal`. Proceed to 5f (Cleanup) -- do not re-enter the fix loop.
1169
1271
  - **If `result: fail`**: Repeat fix (5e).
1170
1272
 
1171
1273
  ##### 5e-v. Abort on Final Failure
@@ -1204,16 +1306,19 @@ Stop execution. Do not proceed to Step 6.
1204
1306
  If DEV_SERVER_STARTED == true:
1205
1307
  1. Kill any process on port 3000: run `lsof -ti:3000 | xargs kill -9 2>/dev/null` via Bash.
1206
1308
 
1207
- **Git post-validation checkpoint** (only on success):
1208
- If GIT_AVAILABLE == true AND BUILD_VALIDATION_RESULT == "pass":
1209
- 1. Run `git add -A && git commit -m "checkpoint: phase-{PHASE_NUMBER} build validated ({PHASE_NAME_KEBAB})"` via Bash.
1210
- 2. If nothing to commit, skip silently.
1309
+ **Git post-validation checkpoint** (on success or pass-with-followups):
1310
+ If GIT_AVAILABLE == true AND (BUILD_VALIDATION_RESULT == "pass" OR BUILD_VALIDATION_RESULT == "pass-with-followups"):
1311
+ 1. Build commit message suffix: if BUILD_VALIDATION_RESULT == "pass-with-followups", use "build validated (pass-with-followups: N non-terminal blockers) ({PHASE_NAME_KEBAB})" where N is parsed from BUILD_CHECKS_BLOCKED_NONTERMINAL. Otherwise use "build validated ({PHASE_NAME_KEBAB})".
1312
+ 2. Run `git add -A && git commit -m "checkpoint: phase-{PHASE_NUMBER} {suffix}"` via Bash.
1313
+ 3. If nothing to commit, skip silently.
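
The commit-message rule above can be sketched as a small helper. This is illustrative only: `checkpoint_message` is a hypothetical name, and the sketch assumes `BUILD_CHECKS_BLOCKED_NONTERMINAL` arrives in the `"B/{BUILD_CHECK_COUNT}"` form described under the summary variables.

```python
from typing import Optional


def checkpoint_message(
    phase_number: int,
    phase_name_kebab: str,
    result: str,
    blocked_nonterminal: str = "0/0",
) -> Optional[str]:
    """Build the checkpoint commit message, or None when no commit is due."""
    if result not in ("pass", "pass-with-followups"):
        # fail and blocked results do not get a post-validation checkpoint
        return None

    if result == "pass-with-followups":
        # "2/10" -> 2; the numerator is the non-terminal blocker count N
        n = int(blocked_nonterminal.split("/", 1)[0])
        suffix = (
            f"build validated (pass-with-followups: {n} non-terminal blockers) "
            f"({phase_name_kebab})"
        )
    else:
        suffix = f"build validated ({phase_name_kebab})"

    return f"checkpoint: phase-{phase_number} {suffix}"
```

The returned string would then be passed to `git commit -m`; a `None` return corresponds to skipping the checkpoint.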
1211
1314
 
1212
1315
  **Store summary variables** for Step 6:
1213
- - BUILD_VALIDATION_RESULT: "pass", "fail", or "blocked" (should be "pass" or "blocked" if we reached here from 5d/5e-iv)
1316
+ - BUILD_VALIDATION_RESULT: "pass", "pass-with-followups", "fail", or "blocked" (should be "pass", "pass-with-followups", or "blocked" if we reached here from 5d/5e-iv)
1214
1317
  - BUILD_VALIDATION_CYCLE: final cycle number (1 if passed first try)
1215
1318
  - BUILD_CHECKS_PASSED: "{X}/{BUILD_CHECK_COUNT}" from the final validation report
1216
1319
  - BUILD_CHECKS_BLOCKED: "{Y}/{BUILD_CHECK_COUNT}" from the final validation report (0 if none blocked)
1320
+ - BUILD_CHECKS_BLOCKED_TERMINAL: "{A}/{BUILD_CHECK_COUNT}" from the final validation report (new field; default "0/{BUILD_CHECK_COUNT}" if old-format report)
1321
+ - BUILD_CHECKS_BLOCKED_NONTERMINAL: "{B}/{BUILD_CHECK_COUNT}" from the final validation report (new field; default "0/{BUILD_CHECK_COUNT}" if old-format report)
1217
1322
  - FIX_REPORTS: list of fix report file paths (empty if passed on first try)
1218
1323
 
1219
1324
  Run MARK_STEP_COMPLETE(5).
@@ -1256,7 +1361,14 @@ Cross-reference sources:
1256
1361
 
1257
1362
  For each deviation, write: area, what HL plan assumed, what actually happened, why, impact.
1258
1363
 
1259
- Set `REPORT_STATUS`: "completed-with-deviations" if any deviations were identified, otherwise "completed".
1364
+ Set `REPORT_STATUS` using this matrix (priority order):
1365
+ - If BUILD_VALIDATION_RESULT == "blocked": `REPORT_STATUS = "completed-with-blockers"` (terminal blockers exist; later phases may be unable to proceed).
1366
+ - Else if BUILD_VALIDATION_RESULT == "pass-with-followups" AND any deviations were identified: `REPORT_STATUS = "completed-with-deviations-and-followups"`.
1367
+ - Else if BUILD_VALIDATION_RESULT == "pass-with-followups": `REPORT_STATUS = "completed-with-followups"`.
1368
+ - Else if any deviations were identified: `REPORT_STATUS = "completed-with-deviations"`.
1369
+ - Else: `REPORT_STATUS = "completed"`.
1370
+
1371
+ All `REPORT_STATUS` values that start with `completed` are treated as successful builds by downstream consumers (e.g. build-feature.md). Only `failed` or absent reports should be treated as build failures. The new `completed-with-followups` and `completed-with-deviations-and-followups` values preserve backward compatibility with downstream consumers that match on the `completed` prefix.
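
The `REPORT_STATUS` matrix and the prefix-based compatibility rule can be sketched as follows. The function names are hypothetical, and `is_successful_build` only illustrates the "starts with `completed`" convention described above rather than the actual logic in build-feature.md.

```python
def report_status(build_validation_result: str, has_deviations: bool) -> str:
    """Evaluate the REPORT_STATUS matrix in priority order."""
    if build_validation_result == "blocked":
        return "completed-with-blockers"
    if build_validation_result == "pass-with-followups":
        if has_deviations:
            return "completed-with-deviations-and-followups"
        return "completed-with-followups"
    if has_deviations:
        return "completed-with-deviations"
    return "completed"


def is_successful_build(status: str) -> bool:
    """Downstream convention: any status with the 'completed' prefix counts as success."""
    return status.startswith("completed")
```

Because every value produced by the matrix begins with `completed`, a consumer that matches on that prefix accepts the new statuses without changes.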
1260
1372
 
1261
1373
  #### 6e. Identify Impacts on Future Phases
1262
1374
 
@@ -1277,6 +1389,11 @@ status: {REPORT_STATUS}
1277
1389
  date: {current date}
1278
1390
  pre-phase-sha: {PRE_PHASE_SHA | "(git unavailable)"}
1279
1391
  post-build-sha: {POST_BUILD_SHA | "(git unavailable)"}
1392
+ build-validation-result: {BUILD_VALIDATION_RESULT}
1393
+ checks-passed: {BUILD_CHECKS_PASSED}
1394
+ checks-blocked: {BUILD_CHECKS_BLOCKED}
1395
+ checks-blocked-terminal: {BUILD_CHECKS_BLOCKED_TERMINAL}
1396
+ checks-blocked-nonterminal: {BUILD_CHECKS_BLOCKED_NONTERMINAL}
1280
1397
  ---
1281
1398
 
1282
1399
  # Phase {N} Report: {PHASE_NAME}
@@ -1292,11 +1409,28 @@ post-build-sha: {POST_BUILD_SHA | "(git unavailable)"}
1292
1409
  |---|-----------|--------|----------|
1293
1410
  {rows from 6c -- Result is PASS, FAIL, or BLOCKED}
1294
1411
 
1295
- {If any BLOCKED results exist:}
1296
- ### Blocked Criteria -- Action Required
1297
- The following success criteria could not be verified automatically and require human verification:
1298
- {For each BLOCKED criterion:}
1412
+ {If any BLOCKED results exist, classified as terminal:}
1413
+ ### Terminal Blocked Criteria -- Action Required
1414
+ The following success criteria could not be verified automatically and represent terminal blockers (core correctness could not be established):
1415
+ {For each terminal-BLOCKED criterion:}
1416
+ - **{criterion text}**: {blocked reason}
1417
+ - severity: terminal
1418
+ - criticality: {must-pass | should-pass | unknown}
1419
+ - substitute evidence: {summary or "none"}
1420
+ - continuation impact: {summary}
1421
+ - recommended follow-up: {action}
1422
+
1423
+ {If any BLOCKED results exist, classified as non-terminal (new section, additive):}
1424
+ ## Non-Terminal Follow-Ups
1425
+ The following checks could not be executed safely as part of automated validation, but core functionality has been verified through substitute evidence. These items are recommended human/operator follow-ups; they do NOT block future phases:
1426
+ {For each non-terminal-BLOCKED criterion:}
1299
1427
  - **{criterion text}**: {blocked reason}
1428
+ - severity: non-terminal
1429
+ - criticality: {operator-followup | unsafe-manual | pre-release | unknown}
1430
+ - executor: {operator-required | external-provider | dashboard | unknown}
1431
+ - substitute evidence: {what was verified instead}
1432
+ - continuation impact: does not block future development
1433
+ - recommended follow-up: {action}
1300
1434
 
1301
1435
  ## Deviations from HL Plan
1302
1436
  {bullets from 6d, or "None. Implementation matched the HL plan as specified."}
@@ -1314,6 +1448,8 @@ The following success criteria could not be verified automatically and require h
1314
1448
  - Plan validation: {VALIDATION_RESULT} (cycle {VALIDATION_CYCLE})
1315
1449
  - Build validation: {BUILD_VALIDATION_RESULT} (cycle {BUILD_VALIDATION_CYCLE})
1316
1450
  - Checks passed: {BUILD_CHECKS_PASSED}
1451
+ - Checks blocked (terminal): {BUILD_CHECKS_BLOCKED_TERMINAL}
1452
+ - Checks blocked (non-terminal): {BUILD_CHECKS_BLOCKED_NONTERMINAL}
1317
1453
  - Fix cycles: {BUILD_VALIDATION_CYCLE - 1}
1318
1454
  ```
1319
1455