@chrono-meta/fh-gate 1.4.16 → 1.4.18
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CATALOG.md +6 -1
- package/CHEATSHEET.md +1 -1
- package/CLAUDE.md +15 -2
- package/package.json +1 -1
- package/plugins/fh-meta/skills/context-doctor/SKILL.md +10 -96
- package/plugins/fh-meta/skills/context-doctor/SKILL_detail.md +114 -0
- package/plugins/fh-meta/skills/field-harvest/SKILL.md +23 -163
- package/plugins/fh-meta/skills/field-harvest/SKILL_detail.md +189 -0
- package/plugins/fh-meta/skills/frontier-digest/SKILL.md +23 -167
- package/plugins/fh-meta/skills/frontier-digest/SKILL_detail.md +177 -0
- package/plugins/fh-meta/skills/goal-quench/SKILL.md +42 -263
- package/plugins/fh-meta/skills/goal-quench/SKILL_detail.md +249 -0
- package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL.md +1 -1
- package/plugins/fh-meta/skills/install-wizard/SKILL.md +35 -420
- package/plugins/fh-meta/skills/install-wizard/SKILL_detail.md +432 -0
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +19 -179
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL_detail.md +149 -0
- package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +7 -1
- package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +52 -3
|
@@ -49,11 +49,7 @@ Basis: [one-line summary of deciding factor]
|
|
|
49
49
|
| `FAIL` | Blocking issue found | Halt chain, surface to user immediately |
|
|
50
50
|
| `ESCALATE` | Ambiguous state requiring human judgment | Pause chain, present three options to user |
|
|
51
51
|
|
|
52
|
-
`CONDITIONAL_PASS` means proceed, not skip
|
|
53
|
-
|
|
54
|
-
`FAIL` halts the chain. The user decides whether to fix-and-resume or abandon the sweep.
|
|
55
|
-
|
|
56
|
-
`ESCALATE` pauses the chain. The user chooses one of three paths — the chain resumes or closes based on their decision.
|
|
52
|
+
`CONDITIONAL_PASS` means proceed, not skip — captured items accumulate into the final report's Pending section and must be addressed before the sweep is considered clean. `FAIL` halts the chain (user decides fix-and-resume or abandon). `ESCALATE` pauses the chain pending the user's choice.
|
|
57
53
|
|
|
58
54
|
### ESCALATE Handling
|
|
59
55
|
|
|
@@ -72,21 +68,17 @@ When any step returns `ESCALATE`, present:
|
|
|
72
68
|
|
|
73
69
|
## Step 0. Scope, Mode, and Translation
|
|
74
70
|
|
|
75
|
-
Determine
|
|
71
|
+
Determine before running any pipeline step:
|
|
76
72
|
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
- Full harness: all assets under `plugins/` and `templates/`
|
|
73
|
+
1. **Scope**: single skill (path to `SKILL.md`) · specific asset (file/directory) · full harness (`plugins/` + `templates/`)
|
|
74
|
+
2. **Mode**: `--quick`, `--no-sim`, or `--full` (default)
|
|
75
|
+
3. **Branch**: current git branch (used for regression guard and report header)
|
|
81
76
|
|
|
82
|
-
|
|
77
|
+
If scope is not specified, ask: *"What should I sweep? (e.g., a specific skill path, a directory, or the full harness)"* **Do not infer scope** — a wrong scope produces misleading verdicts.
|
|
83
78
|
|
|
84
|
-
|
|
79
|
+
Output the sweep plan and get Y/N confirmation before running.
|
|
85
80
|
|
|
86
|
-
|
|
87
|
-
> "What should I sweep? (e.g., a specific skill path, a directory, or the full harness)"
|
|
88
|
-
|
|
89
|
-
Do not infer scope — a wrong scope produces misleading verdicts.
|
|
81
|
+
> **Detail**: See `SKILL_detail.md §Output-Formats` — sweep plan block, per-step verdict blocks, aggregated report template — read when emitting any step output.
|
|
90
82
|
|
|
91
83
|
### Scope Translation Table
|
|
92
84
|
|
|
@@ -98,19 +90,6 @@ The four constituent skills use heterogeneous scope models. Translate the pipeli
|
|
|
98
90
|
| Specific directory | Check session findings in this domain | Attack all SKILL.md files in directory | Back-trace all claims in directory | Area A + Area D on the domain |
|
|
99
91
|
| Full harness | Check all recent session findings | Attack all harness assets under scope | Back-trace all claims across harness | Area A + Area B + Area D |
|
|
100
92
|
|
|
101
|
-
### Step 0 Output
|
|
102
|
-
|
|
103
|
-
```
|
|
104
|
-
pipeline-conductor — Sweep Plan
|
|
105
|
-
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
106
|
-
Scope: {scope}
|
|
107
|
-
Mode: {--full / --quick / --no-sim}
|
|
108
|
-
Branch: {branch name}
|
|
109
|
-
Steps: {1 → 2 → 3 → 4 / 2 → 3 / 1 → 2 → 3}
|
|
110
|
-
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
111
|
-
Proceed? (Y: run / N: cancel)
|
|
112
|
-
```
|
|
113
|
-
|
|
114
93
|
---
|
|
115
94
|
|
|
116
95
|
## Step 0.5. return-path-gate — Pre-flight Chain Audit
|
|
@@ -125,21 +104,7 @@ Run `/return-path-gate --skill [scope]`.
|
|
|
125
104
|
| MEDIUM/LOW severity OPEN chains only | `CONDITIONAL_PASS` | Proceed; capture in Pending |
|
|
126
105
|
| 1+ HIGH severity OPEN chains | `FAIL` | **Halt sweep** — fix chains before running pipeline |
|
|
127
106
|
|
|
128
|
-
**
|
|
129
|
-
|
|
130
|
-
```
|
|
131
|
-
[Step 0.5 — return-path-gate]
|
|
132
|
-
Verdict: FAIL
|
|
133
|
-
Basis: HIGH severity OPEN chain(s) — inter-step verdicts unreliable with broken chains
|
|
134
|
-
Open: [list of OPEN chains with severity]
|
|
135
|
-
|
|
136
|
-
Fix OPEN chains then re-run pipeline-conductor.
|
|
137
|
-
Skip? (S: override gate — records as degraded, disqualifies CLEAN (--full) status)
|
|
138
|
-
```
|
|
139
|
-
|
|
140
|
-
`S` (skip) is available but must be declared explicitly. Skipping records as `degraded: return-path-gate (user override)` in the final report and the sweep cannot reach `CLEAN (--full)` status regardless of subsequent step verdicts.
|
|
141
|
-
|
|
142
|
-
**On CONDITIONAL_PASS (MEDIUM/LOW only)**: Capture OPEN chains in Pending. Continue to Step 1.
|
|
107
|
+
**Skip override rule**: `S` (skip on FAIL) is available but must be declared explicitly. Skipping records as `degraded: return-path-gate (user override)` in the final report and the sweep **cannot reach `CLEAN (--full)` status** regardless of subsequent step verdicts.
|
|
143
108
|
|
|
144
109
|
---
|
|
145
110
|
|
|
@@ -147,16 +112,7 @@ Skip? (S: override gate — records as degraded, disqualifies CLEAN (--full) sta
|
|
|
147
112
|
|
|
148
113
|
> Skipped in `--quick` mode.
|
|
149
114
|
|
|
150
|
-
Check for harvest-loop signals in the current session
|
|
151
|
-
|
|
152
|
-
**What it checks**: Whether recent session learnings have been absorbed, whether duplicate skill candidates are pending, whether knowledge loop integrity issues exist in the current session.
|
|
153
|
-
|
|
154
|
-
**Invocation path**:
|
|
155
|
-
- If harvest-loop ran in this session: read its output and incorporate findings.
|
|
156
|
-
- If harvest-loop did not run in this session: output `CONDITIONAL_PASS` — knowledge loop not validated in this sweep.
|
|
157
|
-
- If harvest-loop auto-skipped (fewer than 3 patterns found): output `CONDITIONAL_PASS` — sub-threshold, knowledge loop not validated.
|
|
158
|
-
|
|
159
|
-
**Verdict criteria**:
|
|
115
|
+
Check for harvest-loop signals in the current session — read session context rather than invoking harvest-loop directly (a mid-sweep invocation would conflict with the sweep's own pattern collection).
|
|
160
116
|
|
|
161
117
|
| harvest-loop result | pipeline-conductor verdict |
|
|
162
118
|
|---|---|
|
|
@@ -166,35 +122,19 @@ Check for harvest-loop signals in the current session. pipeline-conductor reads
|
|
|
166
122
|
| Did not run this session | `CONDITIONAL_PASS` — note: knowledge loop not validated |
|
|
167
123
|
| Auto-skipped (< 3 patterns) | `CONDITIONAL_PASS` — note: sub-threshold, not validated |
|
|
168
124
|
|
|
169
|
-
**
|
|
170
|
-
> "harvest-loop reported a blocking issue. Fix and re-run Step 1, or abort the sweep?"
|
|
171
|
-
|
|
172
|
-
**On CONDITIONAL_PASS**: Log all pending patterns or unvalidated notes. Continue to Step 2.
|
|
173
|
-
|
|
174
|
-
```
|
|
175
|
-
[Step 1 — harvest-loop]
|
|
176
|
-
Verdict: {verdict}
|
|
177
|
-
Basis: {one-line}
|
|
178
|
-
Pending: {list of items if CONDITIONAL_PASS, else "none"}
|
|
179
|
-
```
|
|
125
|
+
> **Detail**: See `SKILL_detail.md §Execution-Notes` — harvest-loop invocation path, per-step FAIL prompts, Area B cadence bash, what-each-step-checks reference — read when executing Steps 1–4.
|
|
180
126
|
|
|
181
127
|
---
|
|
182
128
|
|
|
183
129
|
## Step 2. steel-quench — Adversarial Verification
|
|
184
130
|
|
|
185
|
-
Run steel-quench against the target scope.
|
|
186
|
-
|
|
187
|
-
**What it checks**: Design attack surface, trigger phrase collisions, over-engineered steps, self-declaration language, structural flaws surfaced by devil-advocate rounds.
|
|
188
|
-
|
|
189
|
-
**Invocation**: Run steel-quench in its standard mode. For `--quick`, this is the first step.
|
|
131
|
+
Run steel-quench against the target scope (for `--quick`, this is the first step).
|
|
190
132
|
|
|
191
133
|
**steel-quench severity vocabulary** (S/A/B grade — not M/S/R):
|
|
192
134
|
- **S-grade**: Immediate blocker — must fix before proceeding
|
|
193
135
|
- **A-grade**: Required before deployment — fix before external release
|
|
194
136
|
- **B-grade**: Improvement recommended — non-blocking
|
|
195
137
|
|
|
196
|
-
**Verdict criteria**:
|
|
197
|
-
|
|
198
138
|
| steel-quench result | pipeline-conductor verdict |
|
|
199
139
|
|---|---|
|
|
200
140
|
| 0 S-grade findings | `PASS` |
|
|
@@ -202,42 +142,19 @@ Run steel-quench against the target scope.
|
|
|
202
142
|
| 1+ S-grade findings | `FAIL` |
|
|
203
143
|
| Wave 4 (Meta-Aware Adversary) surfaces structural ambiguity | `ESCALATE` |
|
|
204
144
|
|
|
205
|
-
**On FAIL**: Output S-grade finding(s) from steel-quench. Ask:
|
|
206
|
-
> "steel-quench found a blocking issue. Fix and re-run Step 2, or abort the sweep?"
|
|
207
|
-
|
|
208
|
-
**On CONDITIONAL_PASS**: Capture A-grade and B-grade findings. Continue to Step 3.
|
|
209
|
-
|
|
210
|
-
```
|
|
211
|
-
[Step 2 — steel-quench]
|
|
212
|
-
Verdict: {verdict}
|
|
213
|
-
Basis: {one-line}
|
|
214
|
-
Findings:
|
|
215
|
-
S-grade: {count} — {top item or "none"}
|
|
216
|
-
A-grade: {count} — {top item or "none"}
|
|
217
|
-
B-grade: {count} — {top item or "none"}
|
|
218
|
-
```
|
|
219
|
-
|
|
220
145
|
---
|
|
221
146
|
|
|
222
147
|
## Step 3. phantom-quench — Phantom Claim Detection
|
|
223
148
|
|
|
224
|
-
Run phantom-quench
|
|
149
|
+
Run phantom-quench scoped to the same target as Steps 1 and 2.
|
|
225
150
|
|
|
226
|
-
**
|
|
227
|
-
|
|
228
|
-
**Invocation**: Run phantom-quench scoped to the same target as Steps 1 and 2.
|
|
229
|
-
|
|
230
|
-
**Load-bearing Phantom** (binary test — apply mechanically):
|
|
231
|
-
|
|
232
|
-
A Phantom is load-bearing if it appears in any of:
|
|
151
|
+
**Load-bearing Phantom** (binary test — apply mechanically). A Phantom is load-bearing if it appears in any of:
|
|
233
152
|
- `§Done When` section of the skill under audit
|
|
234
153
|
- Any numbered `§Step N` execution body
|
|
235
154
|
- `§Chains` with mandatory dispatch language (`→ Mandatory next`)
|
|
236
155
|
|
|
237
156
|
All other locations (§Triggers, advisory §Chains language, frontmatter description) are non-load-bearing by definition.
|
|
238
157
|
|
|
239
|
-
**Verdict criteria**:
|
|
240
|
-
|
|
241
158
|
| phantom-quench result | pipeline-conductor verdict |
|
|
242
159
|
|---|---|
|
|
243
160
|
| 0 Phantoms, all claims grounded | `PASS` |
|
|
@@ -245,40 +162,15 @@ All other locations (§Triggers, advisory §Chains language, frontmatter descrip
|
|
|
245
162
|
| 1+ load-bearing Phantoms | `FAIL` |
|
|
246
163
|
| Grounding ambiguous (source file exists but content unclear) | `ESCALATE` |
|
|
247
164
|
|
|
248
|
-
**On FAIL**: Output the load-bearing Phantom(s). Ask:
|
|
249
|
-
> "phantom-quench found a load-bearing Phantom claim. Fix and re-run Step 3, or abort the sweep?"
|
|
250
|
-
|
|
251
|
-
**On CONDITIONAL_PASS**: Capture non-load-bearing Phantoms. Continue to Step 4.
|
|
252
|
-
|
|
253
|
-
```
|
|
254
|
-
[Step 3 — phantom-quench]
|
|
255
|
-
Verdict: {verdict}
|
|
256
|
-
Basis: {one-line}
|
|
257
|
-
Phantoms: {count} — {load-bearing: Y/N} — {top item or "none"}
|
|
258
|
-
```
|
|
259
|
-
|
|
260
165
|
---
|
|
261
166
|
|
|
262
167
|
## Step 4. sim-conductor — Pre-Deploy Simulation
|
|
263
168
|
|
|
264
169
|
> Skipped in `--quick` and `--no-sim` modes.
|
|
265
170
|
|
|
266
|
-
Run sim-conductor against the target scope.
|
|
267
|
-
|
|
268
|
-
**What it checks**: Transferability to external users, new-user entry point friction, power-user edge cases, artifact quality from an outside perspective.
|
|
269
|
-
|
|
270
|
-
**Area B cadence check** (sim-conductor Area B has a once-per-week frequency limit):
|
|
271
|
-
|
|
272
|
-
```bash
|
|
273
|
-
find tracks/_meta/ -name "*sim_conductor*" -newer "$(date -v-7d +%Y-%m-%d 2>/dev/null || date -d '7 days ago' +%Y-%m-%d 2>/dev/null)" 2>/dev/null | head -1
|
|
274
|
-
```
|
|
275
|
-
|
|
276
|
-
- Result found (Area B ran within 7 days): skip Area B → run Area A + Area D only. Capture as `CONDITIONAL_PASS` with note: "Area B skipped — within 7-day cadence limit."
|
|
277
|
-
- No result (Area B not run within 7 days): run Area A + Area B + Area D (scope permitting).
|
|
171
|
+
Run sim-conductor against the target scope: Area A + Area B (if cadence allows) + Area D (if scope is a specific SKILL.md or document).
|
|
278
172
|
|
|
279
|
-
**
|
|
280
|
-
|
|
281
|
-
**Verdict criteria**:
|
|
173
|
+
**Area B cadence rule (behavioral)**: Area B has a once-per-week frequency limit. If Area B ran within 7 days (detection bash in §Execution-Notes) → skip Area B, run Area A + D only, capture as `CONDITIONAL_PASS` with note "Area B skipped — within 7-day cadence limit."
|
|
282
174
|
|
|
283
175
|
| sim-conductor result | pipeline-conductor verdict |
|
|
284
176
|
|---|---|
|
|
@@ -287,51 +179,11 @@ find tracks/_meta/ -name "*sim_conductor*" -newer "$(date -v-7d +%Y-%m-%d 2>/dev
|
|
|
287
179
|
| 1+ M-tier findings from any persona | `FAIL` |
|
|
288
180
|
| Persona results conflict, resolution requires human judgment | `ESCALATE` |
|
|
289
181
|
|
|
290
|
-
**On FAIL**: Output M-tier finding(s) from sim-conductor. Ask:
|
|
291
|
-
> "sim-conductor found a blocking issue. Fix and re-run Step 4, or accept findings and close with FAIL status?"
|
|
292
|
-
|
|
293
|
-
**On CONDITIONAL_PASS**: Capture S/R-tier findings. Proceed to final report.
|
|
294
|
-
|
|
295
|
-
```
|
|
296
|
-
[Step 4 — sim-conductor]
|
|
297
|
-
Verdict: {verdict}
|
|
298
|
-
Basis: {one-line}
|
|
299
|
-
Area B: {ran / skipped — cadence limit}
|
|
300
|
-
Findings:
|
|
301
|
-
M: {count} — {top item or "none"}
|
|
302
|
-
S: {count} — {top item or "none"}
|
|
303
|
-
R: {count} — {top item or "none"}
|
|
304
|
-
```
|
|
305
|
-
|
|
306
182
|
---
|
|
307
183
|
|
|
308
184
|
## Step 5. Aggregated Report
|
|
309
185
|
|
|
310
|
-
After all steps complete (or after chain halt), output the aggregated report.
|
|
311
|
-
|
|
312
|
-
```
|
|
313
|
-
pipeline-conductor — Sweep Report
|
|
314
|
-
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
315
|
-
Scope: {scope}
|
|
316
|
-
Mode: {mode}
|
|
317
|
-
Branch: {branch}
|
|
318
|
-
Date: {YYYY-MM-DD}
|
|
319
|
-
|
|
320
|
-
Step 0.5 — return-path-gate: {PASS / CONDITIONAL_PASS / FAIL / SKIPPED / degraded}
|
|
321
|
-
Step 1 — harvest-loop: {PASS / CONDITIONAL_PASS / FAIL / ESCALATE / SKIPPED}
|
|
322
|
-
Step 2 — steel-quench: {verdict}
|
|
323
|
-
Step 3 — phantom-quench: {verdict}
|
|
324
|
-
Step 4 — sim-conductor: {verdict}
|
|
325
|
-
|
|
326
|
-
Overall: {CLEAN (--full) / CLEAN (--quick) / CLEAN (--no-sim) / PENDING / BLOCKED}
|
|
327
|
-
|
|
328
|
-
── Pending items (CONDITIONAL_PASS steps + accepted ESCALATE) ──
|
|
329
|
-
{item list, or "none"}
|
|
330
|
-
|
|
331
|
-
── Blocking items (FAIL steps + unresolved ESCALATE) ──
|
|
332
|
-
{item list, or "none — sweep completed cleanly"}
|
|
333
|
-
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
334
|
-
```
|
|
186
|
+
After all steps complete (or after chain halt), output the aggregated report (template in §Output-Formats).
|
|
335
187
|
|
|
336
188
|
**Overall verdict logic**:
|
|
337
189
|
|
|
@@ -347,14 +199,7 @@ pipeline-conductor — Sweep Report
|
|
|
347
199
|
- `PENDING`: Resolve captured items before external release.
|
|
348
200
|
- `BLOCKED`: Do not proceed. Fix blocking items and re-run affected step(s).
|
|
349
201
|
|
|
350
|
-
|
|
351
|
-
|
|
352
|
-
Save the report to:
|
|
353
|
-
```
|
|
354
|
-
tracks/_meta/pipeline_conductor_{YYYY-MM-DD}_{scope-slug}.md
|
|
355
|
-
```
|
|
356
|
-
|
|
357
|
-
If `tracks/_meta/` does not exist, output a B-grade warning and skip persistence.
|
|
202
|
+
**Report persistence**: save to `tracks/_meta/pipeline_conductor_{YYYY-MM-DD}_{scope-slug}.md`. If `tracks/_meta/` does not exist, output a B-grade warning and skip persistence.
|
|
358
203
|
|
|
359
204
|
---
|
|
360
205
|
|
|
@@ -370,12 +215,7 @@ If `tracks/_meta/` does not exist, output a B-grade warning and skip persistence
|
|
|
370
215
|
|
|
371
216
|
Resume is scope-bound — the scope from Step 0 is preserved across resume calls.
|
|
372
217
|
|
|
373
|
-
**After ESCALATE**:
|
|
374
|
-
|
|
375
|
-
1. Present the three options (see ESCALATE Handling above).
|
|
376
|
-
2. Option (a) — user provides decision: incorporate as resolution note, re-evaluate the step gate with decision context, continue chain from step N or N+1.
|
|
377
|
-
3. Option (b) — accept and continue: mark ESCALATE item as PENDING, continue chain from step N+1.
|
|
378
|
-
4. Option (c) — abort: close sweep, output BLOCKED report with ESCALATE item in blocking section.
|
|
218
|
+
**After ESCALATE**: present the three options (see ESCALATE Handling above); (a) incorporate decision and continue, (b) mark PENDING and continue from N+1, (c) abort with BLOCKED report.
|
|
379
219
|
|
|
380
220
|
---
|
|
381
221
|
|
|
@@ -0,0 +1,149 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: pipeline-conductor-detail
|
|
3
|
+
description: Detail reference for pipeline-conductor — per-step output format blocks, aggregated report template, invocation notes, Area B cadence bash. Load when executing a specific step.
|
|
4
|
+
load: on-demand
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# pipeline-conductor — Detail Reference
|
|
8
|
+
|
|
9
|
+
> Load when executing a specific step. SKILL.md contains triggers, modes, the verdict vocabulary + chain behavior, scope translation, per-step gate tables, overall verdict logic, resume rules, and Done When.
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## §Output-Formats
|
|
14
|
+
|
|
15
|
+
### Step 0 — Sweep Plan
|
|
16
|
+
|
|
17
|
+
```
|
|
18
|
+
pipeline-conductor — Sweep Plan
|
|
19
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
20
|
+
Scope: {scope}
|
|
21
|
+
Mode: {--full / --quick / --no-sim}
|
|
22
|
+
Branch: {branch name}
|
|
23
|
+
Steps: {1 → 2 → 3 → 4 / 2 → 3 / 1 → 2 → 3}
|
|
24
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
25
|
+
Proceed? (Y: run / N: cancel)
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
### Step 0.5 — return-path-gate FAIL block
|
|
29
|
+
|
|
30
|
+
```
|
|
31
|
+
[Step 0.5 — return-path-gate]
|
|
32
|
+
Verdict: FAIL
|
|
33
|
+
Basis: HIGH severity OPEN chain(s) — inter-step verdicts unreliable with broken chains
|
|
34
|
+
Open: [list of OPEN chains with severity]
|
|
35
|
+
|
|
36
|
+
Fix OPEN chains then re-run pipeline-conductor.
|
|
37
|
+
Skip? (S: override gate — records as degraded, disqualifies CLEAN (--full) status)
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
### Step 1 — harvest-loop verdict block
|
|
41
|
+
|
|
42
|
+
```
|
|
43
|
+
[Step 1 — harvest-loop]
|
|
44
|
+
Verdict: {verdict}
|
|
45
|
+
Basis: {one-line}
|
|
46
|
+
Pending: {list of items if CONDITIONAL_PASS, else "none"}
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
### Step 2 — steel-quench verdict block
|
|
50
|
+
|
|
51
|
+
```
|
|
52
|
+
[Step 2 — steel-quench]
|
|
53
|
+
Verdict: {verdict}
|
|
54
|
+
Basis: {one-line}
|
|
55
|
+
Findings:
|
|
56
|
+
S-grade: {count} — {top item or "none"}
|
|
57
|
+
A-grade: {count} — {top item or "none"}
|
|
58
|
+
B-grade: {count} — {top item or "none"}
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
### Step 3 — phantom-quench verdict block
|
|
62
|
+
|
|
63
|
+
```
|
|
64
|
+
[Step 3 — phantom-quench]
|
|
65
|
+
Verdict: {verdict}
|
|
66
|
+
Basis: {one-line}
|
|
67
|
+
Phantoms: {count} — {load-bearing: Y/N} — {top item or "none"}
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
### Step 4 — sim-conductor verdict block
|
|
71
|
+
|
|
72
|
+
```
|
|
73
|
+
[Step 4 — sim-conductor]
|
|
74
|
+
Verdict: {verdict}
|
|
75
|
+
Basis: {one-line}
|
|
76
|
+
Area B: {ran / skipped — cadence limit}
|
|
77
|
+
Findings:
|
|
78
|
+
M: {count} — {top item or "none"}
|
|
79
|
+
S: {count} — {top item or "none"}
|
|
80
|
+
R: {count} — {top item or "none"}
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
### Step 5 — Aggregated Report
|
|
84
|
+
|
|
85
|
+
```
|
|
86
|
+
pipeline-conductor — Sweep Report
|
|
87
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
88
|
+
Scope: {scope}
|
|
89
|
+
Mode: {mode}
|
|
90
|
+
Branch: {branch}
|
|
91
|
+
Date: {YYYY-MM-DD}
|
|
92
|
+
|
|
93
|
+
Step 0.5 — return-path-gate: {PASS / CONDITIONAL_PASS / FAIL / SKIPPED / degraded}
|
|
94
|
+
Step 1 — harvest-loop: {PASS / CONDITIONAL_PASS / FAIL / ESCALATE / SKIPPED}
|
|
95
|
+
Step 2 — steel-quench: {verdict}
|
|
96
|
+
Step 3 — phantom-quench: {verdict}
|
|
97
|
+
Step 4 — sim-conductor: {verdict}
|
|
98
|
+
|
|
99
|
+
Overall: {CLEAN (--full) / CLEAN (--quick) / CLEAN (--no-sim) / PENDING / BLOCKED}
|
|
100
|
+
|
|
101
|
+
── Pending items (CONDITIONAL_PASS steps + accepted ESCALATE) ──
|
|
102
|
+
{item list, or "none"}
|
|
103
|
+
|
|
104
|
+
── Blocking items (FAIL steps + unresolved ESCALATE) ──
|
|
105
|
+
{item list, or "none — sweep completed cleanly"}
|
|
106
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
---
|
|
110
|
+
|
|
111
|
+
## §Execution-Notes
|
|
112
|
+
|
|
113
|
+
### Step 1 — harvest-loop invocation path
|
|
114
|
+
|
|
115
|
+
pipeline-conductor reads session context for harvest-loop findings rather than invoking harvest-loop directly — invoking harvest-loop mid-sweep would start a new harvest cycle that conflicts with the sweep's own pattern collection. If no harvest-loop run exists in this session, pipeline-conductor can invoke harvest-loop in proposal mode (`/harvest-loop`) as a pre-Step-1 option, but this is not automatic.
|
|
116
|
+
|
|
117
|
+
- If harvest-loop ran in this session: read its output and incorporate findings.
|
|
118
|
+
- If harvest-loop did not run in this session: output `CONDITIONAL_PASS` — knowledge loop not validated in this sweep.
|
|
119
|
+
- If harvest-loop auto-skipped (fewer than 3 patterns found): output `CONDITIONAL_PASS` — sub-threshold, knowledge loop not validated.
|
|
120
|
+
|
|
121
|
+
### Per-step FAIL prompts
|
|
122
|
+
|
|
123
|
+
On FAIL the chain halts and asks (insert the failing step's name and finding):
|
|
124
|
+
|
|
125
|
+
> "{step skill} found a blocking issue. Fix and re-run Step {N}, or abort the sweep?"
|
|
126
|
+
|
|
127
|
+
(Step 4 variant: "…or accept findings and close with FAIL status?")
|
|
128
|
+
|
|
129
|
+
On CONDITIONAL_PASS in any step: capture the finding list (Step 2: A/B-grade; Step 3: non-load-bearing Phantoms; Step 4: S/R-tier findings) and continue to the next step.
|
|
130
|
+
|
|
131
|
+
### Step 4 — Area B cadence check bash
|
|
132
|
+
|
|
133
|
+
sim-conductor Area B has a once-per-week frequency limit:
|
|
134
|
+
|
|
135
|
+
```bash
|
|
136
|
+
find tracks/_meta/ -name "*sim_conductor*" -newer "$(date -v-7d +%Y-%m-%d 2>/dev/null || date -d '7 days ago' +%Y-%m-%d 2>/dev/null)" 2>/dev/null | head -1
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
- Result found (Area B ran within 7 days): skip Area B → run Area A + Area D only. Capture as `CONDITIONAL_PASS` with note: "Area B skipped — within 7-day cadence limit."
|
|
140
|
+
- No result (Area B not run within 7 days): run Area A + Area B + Area D (scope permitting).
|
|
141
|
+
|
|
142
|
+
### What each step checks (reference)
|
|
143
|
+
|
|
144
|
+
| Step | Checks |
|
|
145
|
+
|---|---|
|
|
146
|
+
| 1 harvest-loop | Recent session learnings absorbed, duplicate skill candidates pending, knowledge loop integrity in current session |
|
|
147
|
+
| 2 steel-quench | Design attack surface, trigger phrase collisions, over-engineered steps, self-declaration language, devil-advocate structural flaws |
|
|
148
|
+
| 3 phantom-quench | Proper nouns, numerical values, file paths, branching conditions back-traced to declared sources; ungrounded claims = Phantom |
|
|
149
|
+
| 4 sim-conductor | Transferability to external users, new-user entry friction, power-user edge cases, artifact quality from outside perspective |
|
|
@@ -172,13 +172,19 @@ Run multi-team? (a) Full panel (b) Claude sub-agents only (c) Skip to Area B
|
|
|
172
172
|
| T2 Copilot | `gh copilot suggest` | challenger · expert | `gh copilot suggest -t shell` |
|
|
173
173
|
| T3 Ollama | `ollama run` | challenger | `ollama run llama3 PROMPT` |
|
|
174
174
|
| T4 Codex | `npx @openai/codex exec` | challenger · edge-case-hunter | `echo PROMPT \| npx @openai/codex exec -m gpt-5 -` |
|
|
175
|
+
| T5 agy | `agy -p` (gemini successor) | challenger · beginner | `agy -p "PROMPT"` — argument form only (stdin pipe prints help); timebox+retry hard rule (intermittent hang class); -p auto-approves tools → trusted artifacts only |
|
|
175
176
|
|
|
176
177
|
### CLI detection bash
|
|
177
178
|
|
|
178
179
|
```bash
|
|
179
180
|
# Detect available external CLIs
|
|
180
181
|
AVAILABLE_CLIS=""
|
|
181
|
-
|
|
182
|
+
tb() { perl -e 'alarm shift; exec @ARGV' "$@"; } # portable timebox — darwin ships no `timeout`
|
|
183
|
+
command -v agy &>/dev/null && AVAILABLE_CLIS="$AVAILABLE_CLIS agy"
|
|
184
|
+
# gemini backend EOL 2026-06-18 — binary outlives the service; liveness probe (timeboxed minimal
|
|
185
|
+
# call) replaces the bare `command -v`, which would pass while pipes silently return empty.
|
|
186
|
+
# Probe uses the same stdin-pipe form T1 dispatch uses; result valid per session (no re-probe).
|
|
187
|
+
command -v gemini &>/dev/null && [ -n "$(echo ping | tb 20 gemini 2>/dev/null)" ] && AVAILABLE_CLIS="$AVAILABLE_CLIS gemini"
|
|
182
188
|
command -v gh &>/dev/null && gh copilot --version &>/dev/null && AVAILABLE_CLIS="$AVAILABLE_CLIS copilot"
|
|
183
189
|
command -v ollama &>/dev/null && AVAILABLE_CLIS="$AVAILABLE_CLIS ollama"
|
|
184
190
|
command -v npx &>/dev/null && npx @openai/codex --version &>/dev/null 2>&1 && AVAILABLE_CLIS="$AVAILABLE_CLIS codex"
|
|
@@ -243,7 +243,7 @@ claude CLI also absent → Skip Wave 5, note in residual risk card
|
|
|
243
243
|
|
|
244
244
|
```
|
|
245
245
|
Wave 5 — Multi-Team Panel available.
|
|
246
|
-
Detected external CLIs: [gemini · gh-copilot · ollama | none]
|
|
246
|
+
Detected external CLIs: [agy · gemini · gh-copilot · ollama | none]
|
|
247
247
|
|
|
248
248
|
Estimated token cost:
|
|
249
249
|
External CLIs: each team ×2-3 personas → ~2K–5K tokens per team (billed to that CLI)
|
|
@@ -259,7 +259,15 @@ Run Wave 5?
|
|
|
259
259
|
|
|
260
260
|
```bash
|
|
261
261
|
TEAMS=()
|
|
262
|
-
|
|
262
|
+
# tb = portable timebox (darwin ships no `timeout`; perl alarm works everywhere)
|
|
263
|
+
tb() { perl -e 'alarm shift; exec @ARGV' "$@"; }
|
|
264
|
+
command -v agy &>/dev/null && TEAMS+=("agy")
|
|
265
|
+
# gemini backend EOL 2026-06-18 — the binary outlives the service, so a bare `command -v` goes
|
|
266
|
+
# silently stale (pipes return empty behind 2>/dev/null). Liveness probe (one minimal billed
|
|
267
|
+
# call, timeboxed) gates the slot instead. Probe uses the SAME stdin-pipe form the T1 dispatch
|
|
268
|
+
# uses (probing a different invocation path can pass while dispatch fails — agy proved the two
|
|
269
|
+
# forms diverge on one binary). Probe result is valid for the session — skip re-probe on re-runs.
|
|
270
|
+
command -v gemini &>/dev/null && [ -n "$(echo ping | tb 20 gemini 2>/dev/null)" ] && TEAMS+=("gemini")
|
|
263
271
|
command -v gh &>/dev/null && gh copilot --version &>/dev/null 2>&1 && TEAMS+=("gh-copilot")
|
|
264
272
|
command -v ollama &>/dev/null && TEAMS+=("ollama")
|
|
265
273
|
npx @openai/codex --version &>/dev/null 2>&1 && TEAMS+=("codex")
|
|
@@ -277,6 +285,7 @@ Default team-persona assignments:
|
|
|
277
285
|
| **T2 Copilot** | `gh copilot suggest` | devil · expert |
|
|
278
286
|
| **T3 Ollama** | `ollama run {model}` | devil |
|
|
279
287
|
| **T4 Codex** | `npx @openai/codex exec` | devil · edge-case-hunter |
|
|
288
|
+
| **T5 agy** | `agy -p "PROMPT"` (argument form only — stdin pipe prints help, measured 2026-06-13) | devil · beginner · alternatives (gemini successor) |
|
|
280
289
|
|
|
281
290
|
**Step 1 — Parallel Team Dispatch**:
|
|
282
291
|
|
|
@@ -332,6 +341,46 @@ $ARTIFACT_TAIL" 2>/dev/null) &
|
|
|
332
341
|
$C_EDGE"
|
|
333
342
|
fi
|
|
334
343
|
|
|
344
|
+
# ── T5: agy (Antigravity) — gemini successor ──────────────
|
|
345
|
+
# Constraints: argument form only (`agy -p "PROMPT"`); timebox+retry is a hard rule
|
|
346
|
+
# (intermittent hang class, ~50% observed 2026-06-11); -p auto-approves tool execution,
|
|
347
|
+
# so feed only trusted artifacts — never untrusted external content. Serial by design
|
|
348
|
+
# (worst case ~6 min if the hang class fires on every call) — do NOT copy the T1-T4
|
|
349
|
+
# `VAR=$(...) &` pattern: it assigns inside a background subshell.
|
|
350
|
+
if [[ " ${TEAMS[*]} " =~ " agy " ]]; then
|
|
351
|
+
# tb redefined here — each fenced block must be self-contained when copy-run.
|
|
352
|
+
# Limitation: alarm kills the exec'd process only; a child holding the stdout
|
|
353
|
+
# pipe can outlive it (same gap as `timeout` without process-group kill).
|
|
354
|
+
tb() { perl -e 'alarm shift; exec @ARGV' "$@"; }
|
|
355
|
+
agy_call() { # $1=prompt — 60s timebox, 1 retry on empty
|
|
356
|
+
local out; out=$(tb 60 agy -p "$1" 2>/dev/null)
|
|
357
|
+
[ -z "$out" ] && out=$(tb 60 agy -p "$1" 2>/dev/null)
|
|
358
|
+
printf '%s' "$out"
|
|
359
|
+
}
|
|
360
|
+
A_DEVIL=$(agy_call "[Devil] Adversarial reviewer, no prior context.
|
|
361
|
+
Find 3 critical structural flaws — especially whether Done When criteria are binary and achievable.
|
|
362
|
+
Format: [issue · location · severity S/A/B]
|
|
363
|
+
---
|
|
364
|
+
$ARTIFACT_TAIL")
|
|
365
|
+
A_NEW=$(agy_call "[Beginner] First-time user, zero background.
|
|
366
|
+
Find 3 unclear or jargon-heavy points.
|
|
367
|
+
Format: [issue · location · severity]
|
|
368
|
+
---
|
|
369
|
+
$ARTIFACT_TAIL")
|
|
370
|
+
A_SKEP=$(agy_call "[Alternatives — challenger U1 lens] Pragmatic outsider.
|
|
371
|
+
Find 3 \"why not just X?\" challenges.
|
|
372
|
+
Format: [issue · location · severity]
|
|
373
|
+
---
|
|
374
|
+
$ARTIFACT_TAIL")
|
|
375
|
+
if [ -n "$A_DEVIL$A_NEW$A_SKEP" ]; then
|
|
376
|
+
TEAM_RESULTS["agy"]="$A_DEVIL
|
|
377
|
+
$A_NEW
|
|
378
|
+
$A_SKEP"
|
|
379
|
+
else
|
|
380
|
+
echo "T5 agy: empty after timebox+retry — dropped (degraded coverage, do not count in synthesis)"
|
|
381
|
+
fi
|
|
382
|
+
fi
|
|
383
|
+
|
|
335
384
|
# ── Path B: Cross-session Claude fallback ─────────────────
|
|
336
385
|
if [ ${#TEAMS[@]} -eq 0 ]; then
|
|
337
386
|
TEAM_RESULTS["cross-session-claude"]=$(claude --print \
|
|
@@ -356,7 +405,7 @@ Confidence scoring:
|
|
|
356
405
|
1 team only → B-grade (single-team observation)
|
|
357
406
|
|
|
358
407
|
Claude blind spots (highest value):
|
|
359
|
-
Issues raised by T1~
|
|
408
|
+
Issues raised by T1~T5 but absent from Wave 1~4 (T0) results
|
|
360
409
|
→ flag explicitly as "cross-team delta"
|
|
361
410
|
```
|
|
362
411
|
|