@chrono-meta/fh-gate 1.2.2 → 1.4.0
- package/AGENTS.md +7 -4
- package/CATALOG.md +6 -1
- package/CHEATSHEET.md +125 -1
- package/CLAUDE.md +49 -6
- package/README.md +79 -20
- package/docs/codex-compat.md +4 -4
- package/docs/pillars.svg +26 -29
- package/knowledge/shared/harness-core/fh_integration_contract.md +1 -1
- package/package.json +1 -2
- package/plugins/fh-commons/skills/deliberation/SKILL.md +1 -1
- package/plugins/fh-meta/agents/beginner.md +104 -0
- package/{.claude → plugins/fh-meta}/agents/challenger.md +3 -1
- package/plugins/fh-meta/agents/expert.md +114 -0
- package/plugins/fh-meta/agents/main-player.md +106 -0
- package/plugins/fh-meta/skills/agent-composer/SKILL.md +2 -2
- package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +2 -2
- package/plugins/fh-meta/skills/apex-review/SKILL.md +1 -1
- package/plugins/fh-meta/skills/edit-manifest/SKILL.md +1 -1
- package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +1 -1
- package/plugins/fh-meta/skills/install-wizard/SKILL.md +54 -30
- package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +1 -1
- package/plugins/fh-meta/skills/phantom-quench/SKILL.md +248 -0
- package/plugins/fh-meta/skills/{source-grounding-audit → phantom-quench}/SKILL_detail.md +3 -3
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +10 -10
- package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +77 -1
- package/plugins/fh-meta/skills/return-path-gate/SKILL.md +2 -2
- package/plugins/fh-meta/skills/sim-conductor/SKILL.md +91 -24
- package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +18 -18
- package/plugins/fh-meta/skills/skill-splitter/SKILL.md +4 -4
- package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +2 -2
- package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +27 -215
- package/plugins/fh-meta/skills/steel-quench/SKILL.md +24 -2
- package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +8 -8
- package/scripts/fh-gate.sh +3 -9
- package/scripts/fh-run.sh +1 -1
|
@@ -24,6 +24,7 @@ breaks the "public repo = model-agnostic methodology only" invariant.
|
|
|
24
24
|
|
|
25
25
|
- `/public-surface-audit`
|
|
26
26
|
- `/public-surface-audit --target <repo path>`
|
|
27
|
+
- `/public-surface-audit --json` (machine-parseable verdict for hook-gating — see Step 5)
|
|
27
28
|
- "Did I leak anything into the public repo?", "public surface audit", "private token scan"
|
|
28
29
|
- "Check tracked files for private tokens", "is my public/private split clean?"
|
|
29
30
|
- "Did any operator-private token survive into a tracked file?", "scan before publish"
|
|
@@ -137,6 +138,38 @@ not "edit the HTML by hand". Flag them with a `(generated artifact)` note.
|
|
|
137
138
|
|
|
138
139
|
---
|
|
139
140
|
|
|
141
|
+
## Step 3b. FP Hygiene — Placeholder & Example Exclusion
|
|
142
|
+
|
|
143
|
+
A scan that flags its own placeholders erodes trust. Two **value-shape** classes are never real leaks
|
|
144
|
+
and are dropped before the report — imported from `gstack-redact`'s canonical-example allowlist, scoped
|
|
145
|
+
in PSA's direction to the *matched token* (not the whole line, so a real leak sharing a substring still
|
|
146
|
+
reports):
|
|
147
|
+
|
|
148
|
+
- **Angle-bracket placeholders** — the matched token is itself a placeholder (`<your-unix-username>`,
|
|
149
|
+
`<company-asset>`, `{project}`). PSA dogfoods these in Step 1; the scan must not report them as leaks.
|
|
150
|
+
- **Canonical dummy values** — the matched token is a documented example/dummy (`EXAMPLE`, `dummy`,
|
|
151
|
+
`changeme`, `REDACTED`, `xxxx`, AWS-doc keys like `AKIAIOSFODNN7EXAMPLE`). A high-entropy *example* is
|
|
152
|
+
not a secret.
|
|
153
|
+
|
|
154
|
+
```bash
|
|
155
|
+
# FP-hygiene tests the MATCHED TOKEN only — never the whole line. A line-level `grep -v` would
|
|
156
|
+
# suppress a real leak that merely *mentions* an example (e.g. `user=<realname> # see EXAMPLE.md`),
|
|
157
|
+
# violating PSA's "allowlist tight" rule. So extract the matched span per hit and drop it only when
|
|
158
|
+
# the span is *entirely* a placeholder/example (anchored ^…$).
|
|
159
|
+
PLACEHOLDER='^(<[a-z0-9_-]+>|\{project\}|EXAMPLE|dummy|changeme|REDACTED|xxxx)$'
|
|
160
|
+
grep -nIE "$regex" $(cat /tmp/_psa_tracked.txt) 2>/dev/null | while IFS= read -r hit; do
|
|
161
|
+
tok=$(printf '%s' "$hit" | grep -oiE "$regex" | head -1)
|
|
162
|
+
printf '%s' "$tok" | grep -qiE "$PLACEHOLDER" && continue # token IS a placeholder → drop
|
|
163
|
+
printf '%s\n' "$hit"
|
|
164
|
+
done
|
|
165
|
+
```
|
|
166
|
+
|
|
167
|
+
This differs from the Step 2 allowlist: Step 2 suppresses by **file::token legitimacy**, Step 3b by
|
|
168
|
+
**token value-shape**. Both run — Step 2 then Step 3b. Keep it tight (PSA's "allowlist tight" rule): if a
|
|
169
|
+
token only *contains* an example substring but is otherwise a real private value, it still reports.
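
The token-versus-line distinction in this step can be illustrated with a minimal sketch. The helper name `is_placeholder_token` and the sample tokens are hypothetical. The pattern is the `PLACEHOLDER` expression shown in the step, with `AKIAIOSFODNN7EXAMPLE` added as an assumption: the prose lists it as a canonical dummy value, but the anchored pattern as written would not match it.

```bash
# Hypothetical helper: succeeds only when the WHOLE token is a placeholder or a
# documented dummy value (anchored ^...$), never when it merely contains one.
# Assumption: AKIAIOSFODNN7EXAMPLE is appended to the pattern from Step 3b.
PLACEHOLDER='^(<[a-z0-9_-]+>|\{project\}|EXAMPLE|dummy|changeme|REDACTED|xxxx|AKIAIOSFODNN7EXAMPLE)$'

is_placeholder_token() {
  printf '%s\n' "$1" | grep -qiE "$PLACEHOLDER"
}

# Token-level decision: placeholders are dropped, anything else is reported.
for tok in '<your-unix-username>' 'changeme' 'jdoe-EXAMPLE'; do
  if is_placeholder_token "$tok"; then
    echo "drop: $tok"
  else
    echo "report: $tok"
  fi
done
```

Run as written, the loop drops `<your-unix-username>` and `changeme` and reports `jdoe-EXAMPLE`, because the last token only contains an example substring. A full scan would first have to isolate the matched span from each `file:line:content` hit before calling the helper; that extraction is left out of this sketch.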
|
|
170
|
+
|
|
171
|
+
---
|
|
172
|
+
|
|
140
173
|
## Step 4. Report
|
|
141
174
|
|
|
142
175
|
```
|
|
@@ -175,6 +208,32 @@ tokens in {N} tracked files (X allowlist-suppressed)." Do not print empty severi
|
|
|
175
208
|
|
|
176
209
|
---
|
|
177
210
|
|
|
211
|
+
## Step 5. Machine Output (`--json`) — Hook-Gateable Verdict
|
|
212
|
+
|
|
213
|
+
By default PSA prints the Step 4 human report. With `--json`, emit a machine-parseable verdict so a
|
|
214
|
+
**pre-publish / pre-push hook can gate on counts mechanically** — turning PSA from advisory into
|
|
215
|
+
enforceable (FH's "enforcement is a hook, not a prompt" principle). Imported from `gstack-redact --json`.
|
|
216
|
+
|
|
217
|
+
```json
|
|
218
|
+
{
|
|
219
|
+
"target": "{REPO_PATH}",
|
|
220
|
+
"tracked_files": 0,
|
|
221
|
+
"findings": [
|
|
222
|
+
{"file": "path", "line": 42, "token": "<matched>", "severity": "HIGH", "class": "username"}
|
|
223
|
+
],
|
|
224
|
+
"counts": {"HIGH": 0, "MED": 0, "LOW": 0, "suppressed": 0},
|
|
225
|
+
"verdict": "CLEAN"
|
|
226
|
+
}
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
`verdict` is one of `CLEAN | REVIEW | LEAK | NOT_CONFIGURED` (same thresholds as Step 4). **`verdict` is
|
|
230
|
+
authoritative — never gate on `counts` alone**: a counts-only check (`HIGH==0 && MED==0`) misreads
|
|
231
|
+
`NOT_CONFIGURED` (which also has zero counts) as a pass. A caller blocks when `verdict` is `LEAK` **or**
|
|
232
|
+
`NOT_CONFIGURED` — an unconfigured scan is not a pass (the same silent-failure guard as the human path:
|
|
233
|
+
absence ≠ CLEAN).
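
As a rough illustration of gating on `verdict` rather than on `counts`, a hook body might look like the sketch below. The function names `extract_verdict` and `psa_gate` are hypothetical, the JSON is a hand-written sample in the shape shown above, and the `sed` extraction assumes the verdict sits on one line as `"verdict": "VALUE"`; a real hook would more likely use a proper JSON parser. Blocking on an empty or unrecognised verdict is also an assumption, chosen to stay consistent with the "absence is not CLEAN" rule.

```bash
# Hypothetical: pull the verdict string out of PSA's JSON on stdin.
# Assumption: the field appears once, on a single line.
extract_verdict() {
  sed -n 's/.*"verdict"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Hypothetical gate: allow CLEAN and REVIEW; block LEAK, NOT_CONFIGURED,
# and anything empty or unrecognised. Counts are never consulted.
psa_gate() {
  case "$1" in
    CLEAN|REVIEW) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample with zero counts: a counts-only check would pass it, the verdict gate does not.
sample='{"counts": {"HIGH": 0, "MED": 0, "LOW": 0, "suppressed": 0}, "verdict": "NOT_CONFIGURED"}'
verdict=$(printf '%s\n' "$sample" | extract_verdict)
if psa_gate "$verdict"; then
  echo "allow ($verdict)"
else
  echo "block ($verdict)"
fi
```

The sample prints `block (NOT_CONFIGURED)` even though every count is zero. In an actual pre-push hook the `else` branch would exit non-zero instead of echoing.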
|
|
234
|
+
|
|
235
|
+
---
|
|
236
|
+
|
|
178
237
|
## Connected Skills
|
|
179
238
|
|
|
180
239
|
| Situation | Connected skill |
|
|
@@ -182,7 +241,7 @@ tokens in {N} tracked files (X allowlist-suppressed)." Do not print empty severi
|
|
|
182
241
|
| Broader pre-publish repo readiness (README, license, API keys) | `/marketplace-gate` (Check 5 Public Safety is the wide net; this skill is the private-token detail) |
|
|
183
242
|
| A leak is a recurring process gap, not a one-off | log via `field-harvest` → candidate `#rule-candidate` |
|
|
184
243
|
| Where should the leaked content actually live? | `/asset-placement-gate` (hub vs project vs CLAUDE.local.md) |
|
|
185
|
-
| Phantom refs / stale links on the same surface | `/
|
|
244
|
+
| Phantom refs / stale links on the same surface | `/phantom-quench` (forward axis — orthogonal to this leak axis) |
|
|
186
245
|
|
|
187
246
|
---
|
|
188
247
|
|
|
@@ -222,3 +281,20 @@ Verdict: **CLEAN** (0 tokens after allowlist) | **REVIEW** (LOW-only — drift,
|
|
|
222
281
|
leak even though hand-editing it is wrong — report it, prescribe "regenerate from sanitized source".
|
|
223
282
|
- **Allowlist tight, not loose**: when unsure whether a reference is legitimate, report it. A false LEAK
|
|
224
283
|
the user dismisses is cheaper than a real leak suppressed by an over-broad allowlist.
|
|
284
|
+
- **Auto-redact deliberately not imported**: `gstack-redact` offers `--auto-redact` (rewrite + diff). PSA's
|
|
285
|
+
philosophy is *report + prescribe; the human decides where the line goes* — auto-redacting a HIGH
|
|
286
|
+
(username/company) hit would pre-empt that judgment, and auto-editing a generated artifact is explicitly
|
|
287
|
+
wrong (regenerate from source). If ever imported, restrict to the MED absolute-home-path class only
|
|
288
|
+
(mechanically safe: `/Users/<user>/` → `~/` or `{project}`), never HIGH, never generated files.
|
|
289
|
+
|
|
290
|
+
---
|
|
291
|
+
|
|
292
|
+
## Sister-Asset Provenance
|
|
293
|
+
|
|
294
|
+
Step 3b (FP hygiene) and Step 5 (`--json`) were imported from **garrytan/gstack** `gstack-redact`
|
|
295
|
+
(`lib/redact-engine.ts`) during a hands-on sister-asset cross-audit (2026-06-06; see
|
|
296
|
+
`tracks/_audit/session_2026_06_06_gstack_sister_handson.md`). They are adapted to PSA's operator-IP
|
|
297
|
+
ontology — `gstack-redact`'s generic secret/PII classes (AWS / PEM / JWT / hostname) stay out of PSA's
|
|
298
|
+
scope (orthogonal coverage: PSA = operator-IP leak, redact = generic secret). The reverse direction
|
|
299
|
+
(PSA's operator private-codename + bare-username classes, which `gstack-redact` structurally cannot
|
|
300
|
+
detect) is a candidate contribution back to gstack.
|
|
@@ -132,7 +132,7 @@ Apply the following decision rules per (caller → callee) pair.
|
|
|
132
132
|
| Caller identity | Tier |
|
|
133
133
|
|---|:---:|
|
|
134
134
|
| Core pipeline skill (harvest-loop, steel-quench, apex-review, agent-composer, sim-conductor, pipeline-conductor) | HIGH |
|
|
135
|
-
| Diagnostic or gate skill (harness-doctor,
|
|
135
|
+
| Diagnostic or gate skill (harness-doctor, phantom-quench, verify-bidirectional, return-path-gate) | MEDIUM |
|
|
136
136
|
| Utility or advisory skill (context-doctor, plugin-recommender, frontier-digest, etc.) | LOW |
|
|
137
137
|
|
|
138
138
|
*Callee consequence tier*:
|
|
@@ -239,7 +239,7 @@ Verdict: PASS (0 HIGH severity OPEN chains) | CONDITIONAL_PASS (MEDIUM/LOW sever
|
|
|
239
239
|
| Situation | Connected Skill |
|
|
240
240
|
|---|---|
|
|
241
241
|
| Check harness structural completeness alongside chain closure | `/harness-doctor` |
|
|
242
|
-
| Verify phantom references in §Chains targets (callee skill actually exists) | `/
|
|
242
|
+
| Verify phantom references in §Chains targets (callee skill actually exists) | `/phantom-quench` |
|
|
243
243
|
| Run chain audit as pre-flight before parallel dispatch | `pipeline-conductor` Step 0.5 calls this skill |
|
|
244
244
|
| Prescribe verdict format for callee skills missing structured output | `/meta-prompt-builder` |
|
|
245
245
|
| Chain OPEN finding becomes improvement candidate | `/field-harvest` |
|
|
@@ -32,7 +32,7 @@ Proposal format: `"If it's related to [X], should I simulate with /sim-conductor
|
|
|
32
32
|
| "Test it with personas", "Run an external user simulation" | External user reaction | Area A |
|
|
33
33
|
| "Look at it through someone else's eyes" | External perspective | Area A |
|
|
34
34
|
| "Validate that this actually works" | Real-usage validation | Area D |
|
|
35
|
-
| "Find problems aggressively" | Adversarial validation | Area B (
|
|
35
|
+
| "Find problems aggressively" | Adversarial validation | Area B (`challenger`) |
|
|
36
36
|
|
|
37
37
|
## Triggers
|
|
38
38
|
|
|
@@ -76,10 +76,10 @@ Read target artifact(s) → classify on 5 dimensions → output recommendation
|
|
|
76
76
|
| Dimension | Signal → Weight shift |
|
|
77
77
|
|---|---|
|
|
78
78
|
| `artifact_type` | SKILL.md / design-doc → Area B + D-skill↑ · README / CHEATSHEET → Area A↑ · code / config → Area D-code↑ |
|
|
79
|
-
| `audience` | external installer / first-time user →
|
|
80
|
-
| `claim_density` | 3+ stated benefits or superlatives →
|
|
79
|
+
| `audience` | external installer / first-time user → beginner↑ · internal team only → challenger↑ |
|
|
80
|
+
| `claim_density` | 3+ stated benefits or superlatives → challenger↑ |
|
|
81
81
|
| `risk_level` | external publish / marketplace listing → steel-quench prerequisite triggered |
|
|
82
|
-
| `novelty` | first-of-its-kind / no prior session evidence →
|
|
82
|
+
| `novelty` | first-of-its-kind / no prior session evidence → phantom-quench recommended |
|
|
83
83
|
|
|
84
84
|
```
|
|
85
85
|
Target Profile output:
|
|
@@ -92,7 +92,7 @@ Recommendation:
|
|
|
92
92
|
Areas: [list + rationale]
|
|
93
93
|
Persona composition: [list + weight]
|
|
94
94
|
Scale: [Minimum 3 | Extended 4–8 | Full ≤16]
|
|
95
|
-
Prerequisites: [steel-quench /
|
|
95
|
+
Prerequisites: [steel-quench / phantom-quench / none]
|
|
96
96
|
```
|
|
97
97
|
|
|
98
98
|
#### Persona Discovery (after profile → before dispatch)
|
|
@@ -113,8 +113,8 @@ Persona Discovery output:
|
|
|
113
113
|
|
|
114
114
|
```
|
|
115
115
|
Persona Map:
|
|
116
|
-
|
|
117
|
-
|
|
116
|
+
beginner → [installed agent: beginner] OR [ad-hoc directive]
|
|
117
|
+
challenger → [installed agent: challenger] OR [ad-hoc directive]
|
|
118
118
|
[profile-specific role] → ⚠️ GAP — plugin-recommender recommends: [X] (install? y/n)
|
|
119
119
|
```
|
|
120
120
|
|
|
@@ -165,17 +165,20 @@ sim-conductor does **not** run a fixed persona set. It derives needed perspectiv
|
|
|
165
165
|
③ External fetch — chain to /plugin-recommender when ①② insufficient for high-stakes tasks
|
|
166
166
|
```
|
|
167
167
|
|
|
168
|
-
|
|
168
|
+
Shipped standpoint agents (① tier — sourced installed-first):
|
|
169
169
|
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
|
173
|
-
|
|
174
|
-
|
|
|
175
|
-
|
|
|
176
|
-
|
|
|
170
|
+
FH ships a coherent **user-mastery spectrum** as real, reusable agents (not prompt-directive shells) — found by the ① installed-first scan, reusable across skills, and each isolated-context dispatchable for a true cold read (the bias-isolation value: an evaluator outside the author's context reads cold):
|
|
171
|
+
|
|
172
|
+
| Agent | Spectrum tier | Standpoint | Type |
|
|
173
|
+
|---|---|---|---|
|
|
174
|
+
| `beginner` | entry | First-contact cold-read — onboarding friction a fluent author cannot feel | reasoning |
|
|
175
|
+
| `main-player` | core | Engaged user; intelligently scopes Light / Midcore / Heavy (Heavy = classic power-user edge/limit lens) | reasoning |
|
|
176
|
+
| `expert` | frontier | Domain authority; web-grounded accuracy + SOTA currency, citation-enforced | data (WebSearch/WebFetch) |
|
|
177
|
+
| `challenger` | adversarial axis | Frontier adversary; U1 absorbs the skeptic "why not just X?" lens | adversarial |
|
|
178
|
+
|
|
179
|
+
> **Lineage**: `beginner` / `main-player` / `expert` are the FH-native frontier successors to the field deep-insight `user` group (newcomer / power-user) — re-derived to FH grade with embedded methodology + Done-When, not name-copied. `challenger` is the advanced form of the field `devil-advocate`. The former standalone skeptic standpoint is folded into `challenger` U1.
|
|
177
180
|
|
|
178
|
-
|
|
181
|
+
**Ad-hoc roles** (② tier — prompt-directive fallback): when the profile demands a standpoint with no shipped agent (e.g. "security auditor", "non-native reader"), inject the role as a directive into a general-purpose Agent. Prefer ① shipped agents; use ② only for genuinely task-specific one-offs.
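
The installed-first preference can be sketched as a simple lookup. This is only an illustration under a loud assumption: that "installed" can be approximated by the presence of an `<name>.md` file in an agents directory such as `plugins/fh-meta/agents`. The function name `resolve_persona` and the output wording are hypothetical and do not describe how sim-conductor actually performs discovery.

```bash
# Hypothetical resolver: tier 1 if an agent file exists in the given directory,
# otherwise tier 2 (inject the role as an ad-hoc directive).
resolve_persona() {
  name=$1
  agents_dir=$2
  if [ -f "$agents_dir/$name.md" ]; then
    echo "$name -> installed agent: $name"
  else
    echo "$name -> ad-hoc directive"
  fi
}

# Demo against a throwaway directory standing in for the agents folder.
demo_dir=$(mktemp -d)
: > "$demo_dir/beginner.md"
: > "$demo_dir/challenger.md"
for p in beginner challenger security-auditor; do
  resolve_persona "$p" "$demo_dir"
done
rm -rf "$demo_dir"
```

In the demo, `beginner` and `challenger` resolve to installed agents while `security-auditor` falls back to an ad-hoc directive. The third tier (chaining to `/plugin-recommender`) is not modelled here.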
|
|
179
182
|
|
|
180
183
|
#### Scale
|
|
181
184
|
|
|
@@ -197,9 +200,9 @@ Pre-entry user confirmation required before multi-team execution.
|
|
|
197
200
|
|
|
198
201
|
> **Detail**: See `SKILL_detail.md §MultiTeam` — team formation table (T0–T4), CLI detection bash, confirmation dialog, cross-team synthesis format — read when multi-team mode activates.
|
|
199
202
|
|
|
200
|
-
**A-1** (
|
|
201
|
-
**A-2** (
|
|
202
|
-
**A-3** (challenger
|
|
203
|
+
**A-1** (`beginner`) — first-contact friendliness · onboarding friction · terminology clarity
|
|
204
|
+
**A-2** (`main-player`) — engaged-use fit; Heavy tier: install conflicts · duplication · silent overwrite
|
|
205
|
+
**A-3** (`challenger`, artifact_type="SKILL") — claim-evidence gaps · angles U3, U5, S2
|
|
203
206
|
|
|
204
207
|
> ⚠️ **Human review gate**: Area A S-tier judgments require owner review before entering AI-AI loop.
|
|
205
208
|
|
|
@@ -246,10 +249,10 @@ Persona composition adapts to `artifact_type` from Step 0.3 profile:
|
|
|
246
249
|
|
|
247
250
|
| Artifact type | Primary persona | Supporting persona | Focus |
|
|
248
251
|
|---|---|---|---|
|
|
249
|
-
| SKILL.md / design doc | challenger (artifact_type="SKILL") |
|
|
250
|
-
| Python / JS / bash code | challenger (artifact_type="Code") |
|
|
251
|
-
| Prompt / config |
|
|
252
|
-
| Auth / security-sensitive | challenger + Security-auditor† |
|
|
252
|
+
| SKILL.md / design doc | challenger (artifact_type="SKILL") | `beginner` | Governance gaps, behavioral rule coverage |
|
|
253
|
+
| Python / JS / bash code | challenger (artifact_type="Code") | `main-player` (Heavy) | Edge cases, performance, security surface |
|
|
254
|
+
| Prompt / config | `beginner` | challenger | Interpretation errors, implicit assumptions |
|
|
255
|
+
| Auth / security-sensitive | challenger + Security-auditor† | `main-player` (Heavy) | Attack surface, privilege escalation |
|
|
253
256
|
|
|
254
257
|
† Security-auditor = built-in fallback role (② tier) injected as prompt directive.
|
|
255
258
|
|
|
@@ -265,12 +268,76 @@ Consumer agent attempts actual use (not just reads and judges). Grades: F (funct
|
|
|
265
268
|
|
|
266
269
|
### Area E — Artifact Quality Review
|
|
267
270
|
|
|
268
|
-
|
|
271
|
+
`expert` objection (E-1) + Practitioner confusion (E-2) in parallel → Pattern structuring (E-3) integrates both.
|
|
269
272
|
|
|
270
273
|
> **Detail**: See `SKILL_detail.md §AreaE-Detail` — E-1/E-2/E-3 execution, finding format, pattern naming procedure — read when executing Area E.
|
|
271
274
|
|
|
272
275
|
---
|
|
273
276
|
|
|
277
|
+
## Step 1.5 — Persona Output Protocol + Neutral Synthesizer (parallax)
|
|
278
|
+
|
|
279
|
+
Generalized from the field `deep-insight` multi-persona pattern (fh-be #7), domain-stripped — the *pattern*
|
|
280
|
+
is renamed **parallax** for public FH (it is a mode of this skill, not a separate skill — see asset-placement
|
|
281
|
+
2026-06-06). It gives the persona dispatch above a shared output contract + a neutral aggregator, so
|
|
282
|
+
multi-persona findings stay comparable and the synthesis injects no bias of its own.
|
|
283
|
+
|
|
284
|
+
> **Naming provenance (precise)**: "renamed" above refers to the *pattern* (→ parallax), not the personas.
|
|
285
|
+
> The company-team-coupled field personas (fe/be/ios/pm/ux-writer/compliance/qa) were domain-stripped
|
|
286
|
+
> entirely. The generic `user` group (newcomer/power-user) was **re-derived to FH grade as the shipped
|
|
287
|
+
> `beginner` / `main-player` / `expert` mastery-spectrum agents** (embedded methodology + Done-When, not
|
|
288
|
+
> name-copied shells), and the field `devil-advocate` was **advanced into `challenger`** (sandboxed-adversary
|
|
289
|
+
> + adaptive attack matrix). Lineage is acknowledged; nothing is carried verbatim as a shell.
|
|
290
|
+
|
|
291
|
+
**Shared persona output protocol** — every dispatched persona emits the same shape, whatever its lens:
|
|
292
|
+
|
|
293
|
+
```
|
|
294
|
+
### Strengths (0–3, from this persona's viewpoint)
|
|
295
|
+
### Concerns
|
|
296
|
+
Critical — compile/runtime failure · clear logic error · data corruption · security leak
|
|
297
|
+
Important — significant user/service impact in a plausible scenario
|
|
298
|
+
Suggestion — optional improvement
|
|
299
|
+
(each item: [file:line or quoted span] one-line summary — rationale)
|
|
300
|
+
### Open questions (0–3 items needed for a decision)
|
|
301
|
+
### Absence check (outside-vantage personas — beginner/integrator: what does the artifact FAIL to
|
|
302
|
+
specify that this standpoint needs? discoverability · undocumented contract ·
|
|
303
|
+
unstated assumption. A normal, self-administrable rubric item — surfaces real gaps.)
|
|
304
|
+
```
|
|
305
|
+
|
|
306
|
+
**FP judgment discipline** — only escalate when confident. Never escalate: pre-existing issues not
|
|
307
|
+
introduced by this change · style/quality without a quotable rule · linter-catchable · speculative
|
|
308
|
+
("might break if…") · subjective preference (→ Suggestion). **If not confident, do not mark it** — false
|
|
309
|
+
positives erode reviewer trust.
|
|
310
|
+
|
|
311
|
+
**Neutral synthesizer** — the aggregator is a NON-persona; it adds no opinion of its own:
|
|
312
|
+
- No opinion injection — never a conclusion no persona stated.
|
|
313
|
+
- Preserve attribution — always traceable which persona said what.
|
|
314
|
+
- Priority labels verbatim — Critical/Important/Suggestion carried as the persona set them.
|
|
315
|
+
- No forced consensus or forced conflict — report Common opinions (2+ personas agree) and Conflicts
|
|
316
|
+
(position A vs B, each with rationale) as-is. Feeds Step 2 M/S/R triage (M ← Critical or 2+ personas).
|
|
317
|
+
|
|
318
|
+
The two severity vocabularies are layered, not redundant: a persona running **in isolation** assigns only
|
|
319
|
+
its own Critical/Important/Suggestion — it cannot assign M/S/R, since `S = found by 3+ personas` depends on
|
|
320
|
+
cross-persona agreement the isolated persona never sees. The synthesizer is the only context that can triage
|
|
321
|
+
to M/S/R. So isolation *requires* the per-persona → synthesized two-layer split.
|
|
322
|
+
|
|
323
|
+
**External-harness persona sourcing** — isomorphic to steel-quench Step 0.4 (Specialized Reviewer
|
|
324
|
+
Discovery) + Wave 5 (external CLI teams). A needed lens may be sourced from an **installed sibling
|
|
325
|
+
harness**, not only the built-in palette — e.g. gstack `/review` (staff-engineer), `/cso`
|
|
326
|
+
(security-officer), `/qa` (QA-lead) when gstack is installed. sim-conductor orchestrates; the sibling
|
|
327
|
+
supplies the specialist lens. Same ①installed → ②fallback → ③fetch priority as Persona Discovery — an
|
|
328
|
+
external harness's review-skills count as ① installed sources, widening the persona pool without FH
|
|
329
|
+
shipping every specialist.
|
|
330
|
+
|
|
331
|
+
> **Absence check — resolved (added above)**: the clean replication (fh-be RESULT9 Arm F: identical
|
|
332
|
+
> artifact, explicit omission prompt, ownership-only variable) found self ≈ isolated (~90% overlap) —
|
|
333
|
+
> **omission-detection is self-administrable when explicitly asked**, refuting the earlier "the author
|
|
334
|
+
> can't see their own omissions" (Arm E, a prompt/design-drift artifact). So the Absence check is a
|
|
335
|
+
> normal, valuable rubric item (it surfaces real undocumented-contract gaps), not a special isolation-only
|
|
336
|
+
> power. The pattern's value is rubric/standpoint *supply + routine enforcement* (copyable utility), not
|
|
337
|
+
> de-biasing — see `knowledge/shared/patterns/multi-persona-review.md`.
|
|
338
|
+
|
|
339
|
+
---
|
|
340
|
+
|
|
274
341
|
## Step 2 — Synthesis
|
|
275
342
|
|
|
276
343
|
| Tier | Criteria | Action |
|
|
@@ -64,11 +64,11 @@ GAP detected for [perspective X]:
|
|
|
64
64
|
|
|
65
65
|
| Artifact type | Optimal personas | Likely GAP (not in FH native) |
|
|
66
66
|
|---|---|---|
|
|
67
|
-
| SKILL.md / governance doc | challenger ·
|
|
68
|
-
| README / marketing copy |
|
|
69
|
-
| Python / JS code | challenger/Code ·
|
|
70
|
-
| Auth / security-sensitive code | security-auditor · challenger/Code ·
|
|
71
|
-
| Design doc + citations | challenger ·
|
|
67
|
+
| SKILL.md / governance doc | challenger · beginner · expert | deep org-specific governance role → query plugin-recommender |
|
|
68
|
+
| README / marketing copy | beginner · challenger · expert | none native (challenger U1 covers the skeptic lens); niche field depth → query |
|
|
69
|
+
| Python / JS code | challenger/Code · main-player (Heavy) · security-auditor | security-auditor → query if auth/data |
|
|
70
|
+
| Auth / security-sensitive code | security-auditor · challenger/Code · main-player (Heavy) | security-auditor → block if GAP (high-weight) |
|
|
71
|
+
| Design doc + citations | challenger · expert | expert web-grounds citations natively; deep niche subfield → query |
|
|
72
72
|
|
|
73
73
|
### Degraded coverage flag
|
|
74
74
|
|
|
@@ -96,7 +96,7 @@ Target Profile:
|
|
|
96
96
|
|
|
97
97
|
Recommendation:
|
|
98
98
|
Areas: B (internal meta audit) + D-skill (cold-start validation)
|
|
99
|
-
Persona composition: challenger (high — claim verification),
|
|
99
|
+
Persona composition: challenger (high — claim verification), beginner (medium — onboarding), expert (medium — governance accuracy)
|
|
100
100
|
Scale: Minimum (3)
|
|
101
101
|
Prerequisites: none (not yet external publish)
|
|
102
102
|
```
|
|
@@ -112,7 +112,7 @@ Target Profile:
|
|
|
112
112
|
|
|
113
113
|
Recommendation:
|
|
114
114
|
Areas: A (external user perspective — primary) + C (naming gap scan)
|
|
115
|
-
Persona composition:
|
|
115
|
+
Persona composition: beginner (high), challenger (high — claim density), main-player (medium)
|
|
116
116
|
Scale: Extended (4–5)
|
|
117
117
|
Prerequisites: steel-quench REQUIRED before Area A proceeds
|
|
118
118
|
```
|
|
@@ -128,7 +128,7 @@ Target Profile:
|
|
|
128
128
|
|
|
129
129
|
Recommendation:
|
|
130
130
|
Areas: D-code (primary)
|
|
131
|
-
Persona composition: challenger/Code (edge cases, security surface),
|
|
131
|
+
Persona composition: challenger/Code (edge cases, security surface), main-player (Heavy — performance, limits)
|
|
132
132
|
Scale: Minimum (2 — third persona adds minimal value for bash)
|
|
133
133
|
Prerequisites: none
|
|
134
134
|
```
|
|
@@ -144,10 +144,10 @@ Target Profile:
|
|
|
144
144
|
novelty: high (references recent paper)
|
|
145
145
|
|
|
146
146
|
Recommendation:
|
|
147
|
-
Areas: D-code +
|
|
148
|
-
Persona composition: challenger (claim-evidence),
|
|
147
|
+
Areas: D-code + phantom-quench (quantitative claims)
|
|
148
|
+
Persona composition: challenger (claim-evidence), expert (arXiv validity, web-grounded)
|
|
149
149
|
Scale: Minimum (3)
|
|
150
|
-
Prerequisites:
|
|
150
|
+
Prerequisites: phantom-quench recommended (novelty + citations)
|
|
151
151
|
```
|
|
152
152
|
|
|
153
153
|
---
|
|
@@ -167,11 +167,11 @@ Run multi-team? (a) Full panel (b) Claude sub-agents only (c) Skip to Area B
|
|
|
167
167
|
|
|
168
168
|
| Team | CLI | Personas | Dispatch method |
|
|
169
169
|
|---|---|---|---|
|
|
170
|
-
| T0 Claude | Agent sub-agent | hub-persona-auditor · challenger ·
|
|
171
|
-
| T1 Gemini | `gemini` pipe |
|
|
172
|
-
| T2 Copilot | `gh copilot suggest` |
|
|
173
|
-
| T3 Ollama | `ollama run` |
|
|
174
|
-
| T4 Codex | `npx @openai/codex exec` |
|
|
170
|
+
| T0 Claude | Agent sub-agent | hub-persona-auditor · challenger · expert | Agent() call |
|
|
171
|
+
| T1 Gemini | `gemini` pipe | beginner · main-player · challenger | `echo PROMPT \| gemini` |
|
|
172
|
+
| T2 Copilot | `gh copilot suggest` | challenger · expert | `gh copilot suggest -t shell` |
|
|
173
|
+
| T3 Ollama | `ollama run` | challenger | `ollama run llama3 PROMPT` |
|
|
174
|
+
| T4 Codex | `npx @openai/codex exec` | challenger · edge-case-hunter | `echo PROMPT \| npx @openai/codex exec -m gpt-5 -` |
|
|
175
175
|
|
|
176
176
|
### CLI detection bash
|
|
177
177
|
|
|
@@ -205,7 +205,7 @@ Claude blind spots (external-only findings):
|
|
|
205
205
|
|
|
206
206
|
Structural methods to reduce self-reference risk in Area B:
|
|
207
207
|
|
|
208
|
-
1. **Regular
|
|
208
|
+
1. **Regular adversarial attacks**: Area B once/month + `challenger` attack once/quarter. Route challenger → defense results directly into SKILL.md via steel-quench handoff after Area B ends.
|
|
209
209
|
2. **Direct external user validation**: Non-owner attempts install + invocation → collect reactions. (cascade β validated: first autonomous external run confirmed.)
|
|
210
210
|
3. **steel-quench integration**: After Area B ends, hand off challenger findings to `/steel-quench` for deeper adversarial review + SKILL.md inscription.
|
|
211
211
|
4. **Dual validation principle**: Internal validation (Area B) alone is insufficient — minimized only when combined with external install reaction collection or cross-model validation.
|
|
@@ -266,7 +266,7 @@ Findings format: `[judgment type · pattern · root cause · fix direction]`
|
|
|
266
266
|
|
|
267
267
|
### E-2 — Practitioner Confusion
|
|
268
268
|
|
|
269
|
-
Agent (
|
|
269
|
+
Agent (`beginner` brief): confusing items, fix suggestions more awkward than original, classification criteria consistency breaks.
|
|
270
270
|
|
|
271
271
|
Findings format: `[item · confusion cause · improvement direction]`
|
|
272
272
|
|
|
@@ -56,7 +56,7 @@ Step 3 — Draft SKILL_detail.md
|
|
|
56
56
|
Front-matter: name, description, load: on-demand
|
|
57
57
|
|
|
58
58
|
Step 4 — Verify
|
|
59
|
-
|
|
59
|
+
phantom-quench: every §pointer in SKILL.md resolves to ## §SectionName in SKILL_detail.md
|
|
60
60
|
sim-conductor Area D-skill: consumer agent with SKILL.md only → must reach grade F
|
|
61
61
|
→ Any pointer mismatch or grade P/B → fix before commit
|
|
62
62
|
```
|
|
@@ -101,9 +101,9 @@ Run on a SKILL.md when **any one** of:
|
|
|
101
101
|
| Situation | Skill |
|
|
102
102
|
|---|---|
|
|
103
103
|
| Diagnose which SKILL.md files are candidates | `/context-doctor` or `/harness-doctor` |
|
|
104
|
-
| Verify §pointer grounding after split | `/
|
|
104
|
+
| Verify §pointer grounding after split | `/phantom-quench` |
|
|
105
105
|
| Verify cold-start still works after split | `/sim-conductor D skill <name>` |
|
|
106
|
-
| Check new SKILL_detail.md for phantom claims | `/
|
|
106
|
+
| Check new SKILL_detail.md for phantom claims | `/phantom-quench` |
|
|
107
107
|
| Adversarial review of the split result | `/steel-quench` |
|
|
108
108
|
|
|
109
109
|
---
|
|
@@ -115,7 +115,7 @@ Step 1 classification table produced
|
|
|
115
115
|
+ SKILL.md trimmed: triggers · principles · step overview · decision tables · Done When retained
|
|
116
116
|
+ SKILL.md has imperative pointer for every removed section (> **Detail**: See SKILL_detail.md §X)
|
|
117
117
|
+ SKILL_detail.md created: ## §SectionName header for every pointer in SKILL.md
|
|
118
|
-
+
|
|
118
|
+
+ phantom-quench: 0 phantoms (all §pointers resolve)
|
|
119
119
|
→ Fallback (skill unavailable): run §Verification-Checklist manually from SKILL_detail.md
|
|
120
120
|
+ sim-conductor Area D-skill: grade F (consumer completes core task from SKILL.md alone)
|
|
121
121
|
→ Fallback (skill unavailable): manually confirm "trigger → step overview → key decision → Done When" all present in SKILL.md
|
|
@@ -164,7 +164,7 @@ grep "^## §" plugins/{plugin}/skills/{name}/SKILL_detail.md
|
|
|
164
164
|
```
|
|
165
165
|
|
|
166
166
|
Then run:
|
|
167
|
-
- `/
|
|
167
|
+
- `/phantom-quench` — artifact: SKILL.md, declared source: SKILL_detail.md
|
|
168
168
|
- `/sim-conductor D skill {name}` — provide SKILL.md only, attempt core task from trigger phrase
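
When `/phantom-quench` is unavailable, the pointer check can be approximated by hand. The sketch below is a hypothetical helper, not part of the skill: it assumes pointer names use only letters, digits, `-` and `_`, that every `§Name` occurrence in SKILL.md is a pointer into SKILL_detail.md, and that each target header is a line reading exactly `## §Name`.

```bash
# Hypothetical manual check: print every §pointer found in SKILL.md that has
# no matching "## §Name" header line in SKILL_detail.md.
unresolved_pointers() {
  skill=$1
  detail=$2
  grep -o '§[A-Za-z0-9_-][A-Za-z0-9_-]*' "$skill" | sort -u | while IFS= read -r ptr; do
    # -x: whole-line match, -F: fixed string, so the header must be exact.
    grep -qxF "## $ptr" "$detail" || printf '%s\n' "$ptr"
  done
}

# Demo with two pointers, only one of which has a header.
demo=$(mktemp -d)
printf '%s\n' '> **Detail**: See `SKILL_detail.md §MultiTeam`' \
              '> **Detail**: See `SKILL_detail.md §AreaE-Detail`' > "$demo/SKILL.md"
printf '%s\n' '## §MultiTeam' 'team table' > "$demo/SKILL_detail.md"
unresolved_pointers "$demo/SKILL.md" "$demo/SKILL_detail.md"   # lists §AreaE-Detail
rm -rf "$demo"
```

Empty output corresponds to the "0 phantoms" condition. The reverse direction (orphan `## §` sections with no pointer) would need the same loop with the two files swapped and the header prefix stripped.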
|
|
169
169
|
|
|
170
170
|
---
|
|
@@ -177,7 +177,7 @@ Use before committing a completed split:
|
|
|
177
177
|
|---|---|
|
|
178
178
|
| Trigger phrases ≥ 3 | SKILL.md §Trigger Phrases has 3+ entries |
|
|
179
179
|
| Done When defined | SKILL.md has Done When block with ≥1 measurable condition |
|
|
180
|
-
| All §pointers resolve |
|
|
180
|
+
| All §pointers resolve | phantom-quench: 0 phantoms |
|
|
181
181
|
| Cold-start grade F | sim-conductor Area D-skill: consumer reaches core completion |
|
|
182
182
|
| No behavioral rule in SKILL_detail only | Any rule governing "what counts as X" present in SKILL.md |
|
|
183
183
|
| No orphan §sections | Every ## §SectionName in SKILL_detail.md has a pointer from SKILL.md |
|