@chrono-meta/fh-gate 1.1.0 → 1.2.0

Files changed (69)
  1. package/.claude/agents/challenger.md +169 -0
  2. package/AGENTS.md +160 -0
  3. package/CATALOG.md +256 -0
  4. package/CHEATSHEET.md +367 -0
  5. package/CLAUDE.md +331 -0
  6. package/CONTRIBUTING.md +198 -0
  7. package/LICENSE +21 -0
  8. package/README.md +60 -7
  9. package/bin/fh-goal.js +9 -0
  10. package/bin/fh-run.js +9 -0
  11. package/docs/banner.png +0 -0
  12. package/docs/codex-compat.md +123 -0
  13. package/docs/pillars.svg +70 -0
  14. package/knowledge/shared/harness-core/fh_integration_contract.md +45 -28
  15. package/package.json +31 -6
  16. package/plugins/fh-commons/README.md +37 -0
  17. package/plugins/fh-commons/agents/quench-challenger.md +373 -0
  18. package/plugins/fh-commons/skills/convergence-loop/SKILL.md +155 -0
  19. package/plugins/fh-commons/skills/deliberation/SKILL.md +288 -0
  20. package/plugins/fh-commons/skills/mcp-circuit-breaker/SKILL.md +196 -0
  21. package/plugins/fh-commons/skills/token-budget-gate/SKILL.md +175 -0
  22. package/plugins/fh-meta/agents/fact-checker.md +121 -0
  23. package/plugins/fh-meta/agents/hub-persona-auditor.md +109 -0
  24. package/plugins/fh-meta/agents/persona-innovator.md +195 -0
  25. package/plugins/fh-meta/skills/agent-composer/SKILL.md +461 -0
  26. package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +464 -0
  27. package/plugins/fh-meta/skills/apex-review/SKILL.md +185 -0
  28. package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md +135 -0
  29. package/plugins/fh-meta/skills/contention-layer/SKILL.md +127 -0
  30. package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL.md +30 -0
  31. package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL_detail.md +144 -0
  32. package/plugins/fh-meta/skills/context-doctor/SKILL.md +341 -0
  33. package/plugins/fh-meta/skills/cross-ecosystem-synergy-detection/SKILL.md +202 -0
  34. package/plugins/fh-meta/skills/deep-clarify/SKILL.md +144 -0
  35. package/plugins/fh-meta/skills/edit-manifest/SKILL.md +210 -0
  36. package/plugins/fh-meta/skills/field-harvest/SKILL.md +384 -0
  37. package/plugins/fh-meta/skills/frontier-digest/SKILL.md +272 -0
  38. package/plugins/fh-meta/skills/goal-quench/SKILL.md +509 -0
  39. package/plugins/fh-meta/skills/harness-doctor/SKILL.md +277 -0
  40. package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +484 -0
  41. package/plugins/fh-meta/skills/harvest-loop/SKILL.md +231 -0
  42. package/plugins/fh-meta/skills/harvest-loop/SKILL_detail.md +201 -0
  43. package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL.md +129 -0
  44. package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL_detail.md +158 -0
  45. package/plugins/fh-meta/skills/install-doctor/SKILL.md +207 -0
  46. package/plugins/fh-meta/skills/install-wizard/SKILL.md +613 -0
  47. package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +193 -0
  48. package/plugins/fh-meta/skills/memory-hygiene/SKILL.md +143 -0
  49. package/plugins/fh-meta/skills/meta-prompt-builder/SKILL.md +167 -0
  50. package/plugins/fh-meta/skills/meta-prompt-builder/SKILL_detail.md +37 -0
  51. package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +430 -0
  52. package/plugins/fh-meta/skills/plugin-recommender/SKILL.md +221 -0
  53. package/plugins/fh-meta/skills/plugin-recommender/SKILL_detail.md +220 -0
  54. package/plugins/fh-meta/skills/prompt-regression/SKILL.md +178 -0
  55. package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +224 -0
  56. package/plugins/fh-meta/skills/return-path-gate/SKILL.md +257 -0
  57. package/plugins/fh-meta/skills/self-marketing-lint/SKILL.md +129 -0
  58. package/plugins/fh-meta/skills/sim-conductor/SKILL.md +364 -0
  59. package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +337 -0
  60. package/plugins/fh-meta/skills/skill-splitter/SKILL.md +126 -0
  61. package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +185 -0
  62. package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +230 -0
  63. package/plugins/fh-meta/skills/source-grounding-audit/SKILL_detail.md +182 -0
  64. package/plugins/fh-meta/skills/steel-quench/SKILL.md +226 -0
  65. package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +453 -0
  66. package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md +238 -0
  67. package/scripts/fh-gate.sh +175 -40
  68. package/scripts/fh-goal.sh +182 -0
  69. package/scripts/fh-run.sh +269 -0
@@ -0,0 +1,231 @@
1
+ ---
2
+ name: harvest-loop
3
+ description: A self-evolution pipeline that runs automatically after field sessions end. field-harvest (pattern extraction) → contention-layer (collision signals) → [Agent(subagent_type="challenger") + persona-innovator parallel] → synthesizer (challenger/innovator collision harvest) → Critic isolated Agent (SAGE automated critique) → harness-doctor (health check) → verify-bidirectional (consistency validation) → curator (skill lifecycle management) — 8 steps. Session learnings are automatically absorbed back into the FH ecosystem so the harness evolves on its own. In the main development environment, runs automatically at session end. For external FH users, proposes execution first. Triggered by "session harvest", "learning absorption", "fh evolution", or "harvest-loop". (The phrase "run the pipeline" is ceded to pipeline-conductor to avoid a trigger collision — for end-to-end verification sweeps use pipeline-conductor.)
4
+ user-invocable: true
5
+ allowed-tools: ["Read", "Write", "Bash", "Grep", "Glob", "Agent"]
6
+ model: opus
7
+ ---
8
+
9
+ # harvest-loop — Field Session → FH Self-Evolution Pipeline
10
+
11
+ > Automatically absorbs patterns/conflicts/discoveries from field sessions back into the FH ecosystem.
12
+ > Turns the return loop from field projects to the harness, previously done manually, into a pipeline.
13
+ > One of the core functions is real-time detection and blocking of **Semantic Drift** — where agent terminology gradually diverges in meaning as sessions grow longer.
14
+
15
+ ## Operation Modes
16
+
17
+ | Mode | Description | Trigger |
18
+ |---|---|---|
19
+ | **Forced mode** | Auto-runs at end of local development session. Executes without approval, only confirms final suggestions | Session wrap-up rules in hub CLAUDE.md |
20
+ | **Lightweight mode** | Immediate harvest after Wave completion. Skips Steps 3/3.5/3.75/4 to prioritize fast recording | agent-composer Step 4-c (2+ new files or 3+ existing files changed, **or M-tier resolved**) |
21
+ | **Proposal mode** | External FH users — confirms "run harvest-loop?" before executing | User utterance or `/harvest-loop` |
22
+
23
+ **Simplification guard**: Sessions that only browsed/explored (no code changes or outputs) auto-skip even in forced mode.
24
+
25
+ **Lightweight mode Done When**:
26
+ ```
27
+ Step 0 (Regression Guard) + Step 1 (field-harvest) + Step 2 (contention-layer) + Step 5 (verify-bidirectional) complete
28
+ + harvested pattern summary output (1–3 lines)
29
+ + "run full harvest-loop?" proposed (if patterns found)
30
+ + [Card update prohibited] Do NOT update reference_next_session_starter.md in lightweight mode alone
31
+ ```
32
+
33
+ **Early Trigger** (mid-session): Same pattern 3+ times · same skill fails 2+ consecutive times · session 2+ hours elapsed → "Early harvest condition detected. Run mid-session harvest?" If Y → field-harvest → contention-layer → verify-bidirectional only.
34
+
35
+ ---
36
+
37
+ ## Pipeline Structure
38
+
39
+ ```
40
+ Session end
41
+
42
+ [Step 0-a] FH asset change detection → auto-quench
43
+ │ git diff --name-only HEAD | grep -E "SKILL\.md|\.claude/rules/|templates/|CLAUDE\.md"
44
+ │ → 1+ FH assets changed: run full 3-axis gate
45
+ │ → No changes: proceed to Step 0-b immediately
46
+
47
+ [Step 0-b] Card cross-check — reconstruct completed items (no memory dependency)
48
+ │ Read reference_next_session_starter.md + fh_completed_{today}.md + git log
49
+ │ → Generate removal candidate list from 3-source cross-check
50
+
51
+ [Step 0-c] Edit Manifest Verification + Memory Hygiene
52
+ │ edit-manifest VERIFY: check pending predictions in edit_manifest.yaml
53
+ │ memory-hygiene scan: staleness check on memory/*.md entries (skip if < 7 days)
54
+
55
+ [Step 0] Regression Guard
56
+ │ Check: does anything from this session conflict with or regress a validated skill?
57
+ │ → Regression detected: flag, route to contention-layer
58
+ │ → No regression: proceed
59
+
60
+ [Step 1] field-harvest
61
+ │ Scan field git diff / outputs → extract patterns (proceed if 3+, skip if fewer)
62
+
63
+ [Step 2] contention-layer
64
+ │ Compare patterns ↔ existing FH skills → collision = new skill candidate signal
65
+
66
+ [Step 3a] challenger (Agent) [Step 3b] persona-innovator ← parallel
67
+ │ Attack existing skills Propose new skill candidates
68
+
69
+ [Step 3.5] synthesizer
70
+ │ Cross-synthesize attack ↔ proposal → readjust grades (HIGH/MED/LOW)
71
+
72
+ [Step 3.75] Critic (isolated Agent — SAGE pattern)
73
+ │ Independent critique of synthesizer proposals → PASS / CONDITIONAL PASS / FAIL
74
+
75
+ [Step 4] harness-doctor
76
+ │ Health check when adding candidates (Done When exists? ≥70% overlap?)
77
+
78
+ [Step 5] verify-bidirectional
79
+ │ Bidirectional consistency check on candidate skill
80
+
81
+ Output final proposal list → Y: PR creation / N: persist to tracks/_meta/fh_signal
82
+
83
+ [Step 6] Curator lifecycle review (auto-run after Y)
84
+ │ SKILL.md STALE/merge candidates + Memory self-correction
85
+ ```
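Step 0-a's detection and routing can be sketched as two small shell functions. This is a minimal sketch, not the shipped implementation: the function names `fh_asset_filter` and `step0a_route` are hypothetical, and only the `grep -E` pattern is taken from the diagram above.

```shell
# Hypothetical helper: keep only paths that count as FH assets
# (same pattern as the Step 0-a line in the diagram).
fh_asset_filter() {
  grep -E 'SKILL\.md|\.claude/rules/|templates/|CLAUDE\.md' || true
}

# Hypothetical helper: read changed paths on stdin and print the routing decision.
step0a_route() {
  changed=$(fh_asset_filter)
  if [ -n "$changed" ]; then
    echo "FH assets changed: run full 3-axis gate"
  else
    echo "No FH asset changes: proceed to Step 0-b"
  fi
}

# Usage (inside a git work tree):
#   git diff --name-only HEAD | step0a_route
```

Note that `git diff --name-only HEAD` lists tracked files only, so a brand-new, untracked SKILL.md would not appear in its output until it is staged.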
86
+
87
+ ---
88
+
89
+ ## Execution Instructions
90
+
91
+ ### Step 1 — field-harvest
92
+ `/field-harvest --since 1d` — Fewer than 3 patterns → auto-skip + output "no session harvest targets".
93
+
94
+ ### Step 2 — contention-layer
95
+ `/contention-layer [field-harvest output patterns]`
96
+
97
+ | Collision type | Routing |
98
+ |---|---|
99
+ | Overlaps with existing skill | Existing skill enhancement candidate |
100
+ | New area not covered | New skill candidate |
101
+ | Two skills conflict | Mediation skill candidate |
102
+
103
+ ### Step 3 — Parallel challenger + innovator
104
+ **3a challenger**: "Does this discovery overturn existing skill X?" / "Doesn't existing skill already handle this?" / "Does adding this simplify or complicate FH?"
105
+ **3b innovator**: Field pattern → abstraction → naming candidates + Done When draft required.
106
+
107
+ ### Step 3.5 — synthesizer
108
+
109
+ | devil attack | innovator proposal | synthesizer verdict |
110
+ |:---:|:---:|---|
111
+ | S-tier attack | Proposal for that area | **HIGH** — immediate reflection candidate |
112
+ | S-tier attack | No proposal | **HIGH** — fix existing skill weakness immediately |
113
+ | No attack | Proposal exists | **MED** — re-review in next wave |
114
+ | Attack overturns proposal | — | Proposal **rejected** — persist as fh_signal on hold |
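The verdict matrix above can be read as a lookup. The sketch below encodes it as a shell `case`; the labels `s-tier`, `none`, `overturns`, `yes`, and `no` are illustrative assumptions, as is the `unclassified` result for combinations the table does not list.

```shell
# Hypothetical encoding of the Step 3.5 matrix.
#   attack:   s-tier | none | overturns   (illustrative labels, not FH vocabulary)
#   proposal: yes | no
synth_verdict() {
  attack=$1
  proposal=$2
  case "$attack:$proposal" in
    overturns:*) echo "rejected" ;;      # attack overturns proposal: persist as fh_signal on hold
    s-tier:*)    echo "HIGH" ;;          # HIGH with or without a matching proposal
    none:yes)    echo "MED" ;;           # re-review in next wave
    *)           echo "unclassified" ;;  # combination not listed in the table (assumption)
  esac
}

# Usage:
#   synth_verdict s-tier no
```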
115
+
116
+ **Fallback** (deep-insight not installed): Inline synthesis. Apply same judgment matrix. If quality low → Step 3.75 Critic processes as CONDITIONAL PASS.
117
+
118
+ **Step 3.5-X** (optional): Cross-session 2nd validation when 2+ HIGH-grade items exist. External CLI (gemini/codex) or cross-session Claude. Items flagged as over-promoted → downgrade HIGH → MED.
119
+
120
+ > **Detail**: See `SKILL_detail.md §Step3-5X` — bash execution scripts for external CLI and cross-session Claude fallback — read when running Step 3.5-X validation.
121
+
122
+ ### Step 3.75 — Critic (Isolated Agent)
123
+
124
+ > Source: SAGE (arXiv 2603.15255). Isolation = Critic does not inherit synthesizer reasoning chain → resolves Cost of Consensus.
125
+
126
+ Critic evaluation: Done When logic validation · failure mode exploration (2+ edge cases) · claim vs. implementation alignment · scope appropriateness (Too Narrow / Too Broad).
127
+
128
+ FAIL routing: First FAIL → 1 re-synthesis allowed. FAIL after re-synthesis → auto-persist as `fh_signal` on hold. Maximum retries: **1**.
129
+
130
+ > **Detail**: See `SKILL_detail.md §Step3-75` — Critic isolated Agent() call format, evaluation items table, FAIL routing, Post-Core-Skill Critic connection — read when executing Step 3.75.
131
+
132
+ ### Step 4 — harness-doctor
133
+ `/harness-doctor --scope new-candidates` — Check: Done When exists · ≥70% overlap with existing skills · self-reference structure.
134
+
135
+ ### Step 5 — verify-bidirectional
136
+ `/verify-bidirectional [new skill draft]` — If A references B, does B back-reference A?
137
+
138
+ ### Step 6 — Curator Lifecycle Review
139
+
140
+ **6-1 SKILL.md Lifecycle**: 30+ day unused → [STALE] candidate. `pinned: true` → never touch. ≥70% overlap → merge candidate suggestion. **> 300 lines AND no `SKILL_detail.md`** → propose `/skill-splitter` (governance-semantic split — not compression; the grew-through-harvest pattern is a natural split trigger).
141
+
142
+ **6-1-a Archive-candidate auto-tag**: When 0 invocations in 30 days detected (cross-check `tracks/_meta/skill_usage.md`), auto-append `#archive-candidate` tag to that skill's CATALOG.md entry. No file deletion — tag only. User reviews tagged entries at next session start.
143
+
144
+ **6-2 Memory Self-Correction**: INDEX-ORPHAN (in MEMORY.md but file missing → auto-remove) · FILE-ORPHAN (file exists, not indexed → confirm with user) · MEM-STALE (30+ day unmodified → confirm with user).
145
+
146
+ **Memory curator safety**: Only INDEX-ORPHAN removal is auto-allowed. Actual file deletion absolutely prohibited without explicit approval. `type: reference` items with 🔑 keywords excluded from STALE detection.
147
+
148
+ **6-a Skill Usage Leaderboard**: Record skills called this session in `tracks/_meta/skill_usage.md`. Flag 4+ weeks no-call → deprecation candidate.
149
+
150
+ **6-b Harness Evolution Cadence** (4-week cycle): Scan skills with `complexity_routing`. Aggregate escalation records from `fh_signal_*.md`. Valid conditions = keep; never activated in 4 weeks = removal candidate; pattern in fh_signal = addition candidate. No auto-modification — output candidates then require user approval.
151
+
152
+ > **Detail**: See `SKILL_detail.md §Step6-Detail` — bash scripts for STALE detection, memory scan, skill usage leaderboard, evolution cadence aggregation — read when executing Step 6.
153
+
154
+ ---
155
+
156
+ ## Observability Hook (glass-box self-improvement)
157
+
158
+ Every evolution decision must leave a 3-part trace in `tracks/_meta/edit_manifest.yaml`:
159
+ - **(a) what changed** — file + diff summary
160
+ - **(b) predicted effect** — `predicted_impact` + `predicted_measurable_by`
161
+ - **(c) verify checkpoint** — `validation_status` flipped at next Step 0-c VERIFY
162
+
163
+ A proposal accepted without a recorded prediction is a black-box edit — flag, do not silently apply.
164
+
165
+ > **Detail**: See `SKILL_detail.md §Observability` — full observability hook spec and trace format.
166
+
167
+ ---
168
+
169
+ ## Output Format
170
+
171
+ ```
172
+ ## harvest-loop Execution Results
173
+
174
+ Session: [date] [project name]
175
+ field-harvest: [N patterns extracted]
176
+ contention-layer: [N collision signals]
177
+ synthesizer: [HIGH N / MED N / rejected N]
178
+
179
+ ### Final Proposals (sorted by synthesizer grade)
180
+ | # | Type | Target | Grade | devil | innovator | synthesizer verdict |
181
+ |:---:|---|---|:---:|---|---|---|
182
+
183
+ → Y: Create PR / draft skill file
184
+ → N: Persist to tracks/_meta/fh_signal_YYYY_MM_DD_{slug}.md
185
+
186
+ ### [Required final step] Session card update (proof gate)
187
+ Read reference_next_session_starter.md → apply Step 0-b removal list → add new priorities
188
+ → output "BEFORE N items → AFTER M items (removed: [list])" — required
189
+ → No diff (N=M) = warning + Step 0-b re-check obligation
190
+
191
+ **Natural-language close (4th source)**: Even without git log match, items with these patterns stated in session → treated as closed, remove immediately:
192
+ - "not possible / confirmed impossible" · "no response + N weeks elapsed" → abandoned
193
+ - "mutual citation confirmed" · "merged" · "cancelled" · "no longer needed"
194
+ - User says "stop monitoring" · "close this" · "remove it"
195
+ ```
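The proof gate's BEFORE/AFTER count can be sketched in shell. This assumes a hypothetical card layout with one open item per line prefixed by `- `; the real layout of `reference_next_session_starter.md` is not shown here, and `count_items` and `proof_gate` are illustrative names.

```shell
# Hypothetical card format: one open item per line, prefixed with "- ".
count_items() {
  # grep -c prints 0 (and exits non-zero) when nothing matches.
  grep -c '^- ' "$1" || true
}

# Compare a snapshot of the card taken before the update with the updated card.
proof_gate() {
  before=$(count_items "$1")
  after=$(count_items "$2")
  echo "BEFORE $before items → AFTER $after items"
  if [ "$before" -eq "$after" ]; then
    echo "WARNING: no diff (N=M), re-check Step 0-b"
  fi
}

# Usage:
#   proof_gate card.before.md reference_next_session_starter.md
```

A count-only comparison cannot tell "nothing changed" from "one item removed, one added", which is why the N=M case is a warning that triggers a re-check rather than a hard failure.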
196
+
197
+ > **Detail**: See `SKILL_detail.md §Output-Detail` — 2-source mode (when fh_completed absent), exact match criteria, natural-language close edge cases — read when reconstructing session card without fh_completed file.
198
+
199
+ ---
200
+
201
+ ## Linked Skills
202
+
203
+ | Situation | Linked skill |
204
+ |---|---|
205
+ | 3+ new skill candidates | `/agent-composer` for dispatch plan |
206
+ | Design existing skill enhancement direction | `/meta-prompt-builder` |
207
+ | Validate candidates from external user perspective | `fh-meta:hub-persona-auditor` |
208
+ | Review before sharing with team | `/apex-review` |
209
+ | Self-marketing pattern discovered as HIGH P10 | `/harness-doctor --lint` auto-propose |
210
+ | Edit predictions to verify / rejected buffer | `fh-meta:edit-manifest` (Step 0-c) |
211
+ | Stale memory entries to re-verify | `fh-meta:memory-hygiene` (Step 0-c) |
212
+
213
+ ---
214
+
215
+ ## Done When
216
+
217
+ ```
218
+ All stages Step 0-c → 0 → 1 → 2 → 3 (parallel) → 3.5 → 3.75 → 4 → 5 complete
219
+ + Step 0-c: edit-manifest pending entries verified + memory-hygiene scan run
220
+ + Step 3.75 Critic verdict received (PASS/CONDITIONAL PASS/FAIL stated) before Step 4
221
+ + synthesizer grade readjustment complete (rejected candidates separated)
222
+ + Final proposal list output (sorted by HIGH/MED)
223
+ + User Y/N approval gate complete
224
+ + (If Y) Step 6 Curator complete
225
+ → 6-1: STALE candidate list + merge candidates
226
+ → 6-2: INDEX-ORPHAN/FILE-ORPHAN/MEM-STALE detection results
227
+ + [Required] reference_next_session_starter.md delta update complete
228
+ → BEFORE N items → AFTER M items diff output required (proof gate)
229
+ → No diff (N=M) = warning + Step 0-b re-check
230
+ → Completed items remaining = bug (Done When not met)
231
+ ```
@@ -0,0 +1,201 @@
1
+ ---
2
+ name: harvest-loop-detail
3
+ description: Detail file for harvest-loop — bash scripts for Step 6 curator, observability hook spec, Step 3.5-X bash, Critic Agent call format. Load when executing a specific step.
4
+ load: on-demand
5
+ ---
6
+
7
+ # harvest-loop — Detail Reference
8
+
9
+ > Load when executing a specific step. SKILL.md contains operation modes, pipeline structure diagram, execution instructions overview, and Done When.
10
+
11
+ ---
12
+
13
+ ## §Step3-75 — Critic Isolated Agent Call
14
+
15
+ ```
16
+ Agent(
17
+ prompt="Independently evaluate the following skill proposals.\n[synthesizer output passed]",
18
+ # synthesizer reasoning chain not inherited — blind evaluation
19
+ )
20
+ ```
21
+
22
+ Meaning of isolation: The Critic reads the synthesizer conclusion but does not inherit the reasoning chain that reached it (devil attack reflection/grade adjustment process). Independent reasoning path = resolves Cost of Consensus.
23
+
24
+ **Evaluation items**:
25
+
26
+ | Item | Judgment criteria |
27
+ |---|---|
28
+ | Done When logic validation | When condition is met, is the goal actually achieved? Is it measurable? |
29
+ | Failure mode exploration | 2+ edge cases where this skill could fail |
30
+ | Claim vs. implementation alignment | Does description promise contradict execution guide? |
31
+ | Scope appropriateness | Too Narrow / Too Broad verdict |
32
+
33
+ **FAIL routing** (infinite loop prevention):
34
+ - First FAIL → pass Critic findings to the synthesizer; **1 re-synthesis** is allowed
35
+ - FAIL after re-synthesis → auto-persist as `fh_signal` on hold (no additional retries)
36
+ - Maximum retries: **1**
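The FAIL routing above amounts to a two-verdict decision. A minimal sketch, assuming the Critic verdicts arrive as plain strings; `route_critic` and its output labels are hypothetical names used only for illustration.

```shell
# Hypothetical router for Step 3.75.
#   $1 = first Critic verdict
#   $2 = verdict after the single allowed re-synthesis (omit if not run yet)
route_critic() {
  first=$1
  second=${2:-}
  case "$first" in
    FAIL)
      case "$second" in
        '')   echo "resynthesize" ;;      # first FAIL: one re-synthesis allowed
        FAIL) echo "hold:fh_signal" ;;    # second FAIL: persist on hold, no more retries
        *)    echo "proceed:$second" ;;   # re-synthesis passed
      esac
      ;;
    *) echo "proceed:$first" ;;           # PASS or CONDITIONAL PASS
  esac
}

# Usage:
#   route_critic FAIL "CONDITIONAL PASS"
```

Taking the retry verdict as an optional second argument keeps the retry cap structural: there is no third position, so a second retry cannot be expressed.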
37
+
38
+ **Post-Core-Skill Critic Verdict Connection**: The Critic can be called inline after any of the following core skills completes: harness-doctor · verify-bidirectional · hub-cc-pr-reviewer · context-doctor · sim-conductor. Trigger: immediately after the completion announcement + a "steel-quench" / "re-validate" / "run Critic" utterance.
39
+
40
+ ---
41
+
42
+ ## §Step3-5X — Cross-Session 2nd Validation Bash
43
+
44
+ ```bash
45
+ # Option A: External CLI team (if available — same detection as steel-quench Wave 5)
46
+ SYNTH_CHALLENGE=$(printf \
47
+ 'You are an adversarial reviewer with zero prior context.\nEvaluate these skill proposals and find flaws in the synthesis logic.\nFlag any HIGH-grade items that are over-promoted.\nFormat: [item · flaw · downgrade-to]\n---\n%s' \
48
+ "${SYNTHESIZER_OUTPUT}" | gemini 2>/dev/null)
49
+
50
+ # Option B: Cross-session Claude (fallback)
51
+ SYNTH_CHALLENGE=$(claude --print \
52
+ "Adversarial reviewer, zero context. Evaluate these skill proposals.
53
+ Flag over-promoted HIGH-grade items. Format: [item · flaw · downgrade-to]
54
+ ---
55
+ ${SYNTHESIZER_OUTPUT}" 2>/dev/null)
56
+ ```
57
+
58
+ **Outcome**:
59
+ - Items flagged as over-promoted → downgrade HIGH → MED, proceed with caution
60
+ - Confirmed across both → HIGH-confirmed → pass to Step 3.75 with elevated confidence
61
+ - Zero new issues → synthesis confirmed, proceed normally
62
+
63
+ Token cost: External CLI ~1K–2K tokens (billed to that CLI). Cross-session Claude ~2K–3K. Propose once — user may skip.
64
+
65
+ ---
66
+
67
+ ## §Step6-Detail — Bash Scripts for Curator + Memory + Leaderboard
68
+
69
+ ### 6-1. SKILL.md Lifecycle (bash)
70
+
71
+ ```bash
72
+ # Detect skills unused for 30+ days (based on git log)
73
+ git log --since="30 days ago" --name-only --pretty=format: \
74
+ plugins/*/skills/*/SKILL.md | sort -u > /tmp/recently_touched.txt
75
+
76
+ find plugins -name "SKILL.md" | while IFS= read -r f; do
77
+ grep -qxF "$f" /tmp/recently_touched.txt || echo "[STALE candidate] $f"
78
+ done
79
+ ```
80
+
81
+ | Status | Judgment criteria | Action |
82
+ |---|---|---|
83
+ | **STALE** | 30+ day git no-modify + no recent session mention | Confirm with user, then mark `status: stale` in frontmatter |
84
+ | **Pin protected** | `pinned: true` in frontmatter | Never touch under any circumstances |
85
+ | **Merge candidate** | Two skills with ≥70% functional overlap | Suggest merge draft (no auto-execution) |
86
+ | **Normal** | None of the above | Keep |
87
+
88
+ ### 6-2. Memory Self-Correction (bash)
89
+
90
+ **Theoretical basis — Agent Aging 4 Mechanisms** (arXiv:2605.26302, AgingBench):
91
+
92
+ | Aging type | Definition | 6-2 defense |
93
+ |---|---|---|
94
+ | **Compression aging** | Information loss during history compression | harvest-loop Step 0-a/b real-time recording obligation |
95
+ | **Interference aging** | New knowledge corrupts existing | FILE-ORPHAN detection |
96
+ | **Revision aging** | Stale facts after updates create inconsistency | MEM-STALE detection |
97
+ | **Maintenance aging** | Side effects from routine cleanup | Curator safety principles — no auto-delete |
98
+
99
+ ```bash
100
+ # MEMORY.md index vs actual files consistency check
101
+ grep -oP '\[.*?\]\(\K[^)]+' memory/MEMORY.md | while IFS= read -r f; do  # -P requires GNU grep
102
+ [ -f "memory/$f" ] || echo "[INDEX-ORPHAN] memory/$f — in index but file missing"
103
+ done
104
+
105
+ # Detect orphan files not in index
106
+ find memory -name "*.md" ! -name "MEMORY.md" | while IFS= read -r f; do
107
+ fname=$(basename "$f")
108
+ grep -qF -- "$fname" memory/MEMORY.md || echo "[FILE-ORPHAN] $f — file exists but not indexed"
109
+ done
110
+
111
+ # Detect memory files unmodified for 30+ days
112
+ git log --since="30 days ago" --name-only --pretty=format: -- memory/ \
113
+ | sort -u > /tmp/recently_touched_mem.txt
114
+
115
+ find memory -name "*.md" ! -name "MEMORY.md" | while IFS= read -r f; do
116
+ grep -qxF "$f" /tmp/recently_touched_mem.txt || echo "[MEM-STALE candidate] $f"
117
+ done
118
+ ```
119
+
120
+ | Status | Judgment | Action |
121
+ |---|---|---|
122
+ | **INDEX-ORPHAN** | In MEMORY.md but file missing | Remove from MEMORY.md immediately (auto-allowed) |
123
+ | **FILE-ORPHAN** | File exists but not indexed | Confirm: "add to index or delete?" |
124
+ | **MEM-STALE** | 30+ day git no-modify | Confirm: "archive or delete?" |
125
+ | **PROJECT type priority** | `type: project` file | Suggest moving to CLOSED section if completed |
126
+
127
+ Memory curator safety: Only INDEX-ORPHAN removal is auto-allowed. `type: reference` items with 🔑 keywords excluded from STALE detection.
128
+
129
+ ### 6-a. Skill Usage Leaderboard (bash)
130
+
131
+ ```bash
132
+ # Check whether skill_usage.md exists
133
+ ls tracks/_meta/skill_usage.md 2>/dev/null || echo "MISSING"
134
+ ```
135
+
136
+ **If absent**: `cp {FH_DIR}/templates/skill_usage_template.md tracks/_meta/skill_usage.md`
137
+ **If present**: Add row at bottom of "Recent session records" table:
138
+ ```markdown
139
+ | {YYYY-MM-DD} | {comma-separated list of skills called this session} |
140
+ ```
141
+
142
+ Update `Last used` date for called skills in Leaderboard table.
143
+ Flag skills with no recent calls as `⚠️ Under observation`; once a skill reaches 4+ weeks (28+ days) with no call, mark it `❌ Deprecation candidate`.
144
+
145
+ ### 6-b. Harness Evolution Cadence (bash — run when 4+ weeks of data accumulated)
146
+
147
+ ```bash
148
+ # 1. Scan skills with complexity_routing
149
+ grep -rl "complexity_routing" plugins/*/skills/*/SKILL.md
150
+
151
+ # 2. Aggregate escalation records from fh_signal files
152
+ grep -rh "" tracks/_meta/fh_signal_*.md 2>/dev/null | \
153
+ grep -oE "(harness-doctor|verify-bidirectional|hub-cc-pr-reviewer|context-doctor|sim-conductor|agent-composer|harvest-loop|steel-quench)" | \
154
+ sort | uniq -c | sort -rn
155
+ ```
156
+
157
+ | Status | Criteria | Action |
158
+ |---|---|---|
159
+ | **Valid** | 1+ actual activations within last cycle | Keep |
160
+ | **Removal candidate** | Never activated + 4+ weeks | Suggest removal |
161
+ | **New candidate** | Pattern appearing repeatedly in fh_signal | Suggest addition |
162
+
163
+ Output update candidate list → modify relevant SKILL.md after user approval. No auto-modification.
164
+
165
+ ---
166
+
167
+ ## §Observability — Full Observability Hook Spec
168
+
169
+ > Frontier basis: `harness_frontier_diagnosis_2026-06-02.md` §Frontier Highlights 3 (AHE — agents cannot reliably improve a black-box harness).
170
+
171
+ Every evolution decision must leave a 3-part trace in `tracks/_meta/edit_manifest.yaml`:
172
+
173
+ | Trace part | Where | When written |
174
+ |---|---|---|
175
+ | **(a) what changed** | `edit_manifest.yaml` entry (file + diff summary) | On accepting a proposal (Y gate) |
176
+ | **(b) predicted effect** | same entry's `predicted_impact` + `predicted_measurable_by` | Same moment — decision with no prediction is blind |
177
+ | **(c) verify checkpoint** | same entry's `validation_status` (accepted/rejected) | Next harvest-loop Step 0-c VERIFY |
178
+
179
+ **Decision-log obligation**: When the Y gate accepts a proposal, append the (a)+(b) pair to `edit_manifest.yaml` in the same step; do not defer. A proposal accepted without a recorded prediction is a black-box edit: flag it, do not silently apply it.
180
+
181
+ **Glass-box Done When**: harvest-loop should be able to answer "for each change, what did we predict and did it hold?" purely from `edit_manifest.yaml` — zero reliance on session memory.
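The decision-log obligation can be sketched as an append helper that refuses to write an entry with no prediction. Only the key names `predicted_impact`, `predicted_measurable_by`, and `validation_status` come from the spec above; the list layout, the `file` and `diff_summary` keys, the initial `pending` value, and the function name `append_manifest_entry` are assumptions for illustration.

```shell
# Hypothetical helper: append one (a)+(b) trace entry to the manifest.
#   $1 manifest path  $2 changed file  $3 diff summary
#   $4 predicted impact  $5 how the prediction will be measured
# Values are written as-is; quotes inside values are not escaped in this sketch.
append_manifest_entry() {
  manifest=$1; file=$2; summary=$3; impact=$4; measurable=$5
  if [ -z "$impact" ] || [ -z "$measurable" ]; then
    # No recorded prediction: flag instead of silently applying.
    echo "FLAG: black-box edit (no recorded prediction) for $file" >&2
    return 1
  fi
  {
    printf '%s\n' "- file: \"$file\""
    printf '%s\n' "  diff_summary: \"$summary\""
    printf '%s\n' "  predicted_impact: \"$impact\""
    printf '%s\n' "  predicted_measurable_by: \"$measurable\""
    printf '%s\n' "  validation_status: pending"  # assumed initial value, flipped at Step 0-c VERIFY
  } >> "$manifest"
}

# Usage:
#   append_manifest_entry tracks/_meta/edit_manifest.yaml \
#     "plugins/fh-meta/skills/x/SKILL.md" "tighten Done When" \
#     "fewer Critic FAIL verdicts" "Critic verdict count over next 3 runs"
```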
182
+
183
+ ---
184
+
185
+ ## §Output-Detail — Session Card Update (Natural Language Close Judgment)
186
+
187
+ **Natural-language close judgment (4th source — conversation context)**:
188
+
189
+ Even without git log match, items with the following patterns stated in session are treated as "natural-language closed" and removed immediately:
190
+ - "not possible / confirmed impossible" (endorsement not possible, cannot proceed)
191
+ - "no response + N weeks elapsed" → abandoned
192
+ - "mutual citation confirmed" · "merged" · "cancelled" · "no longer needed"
193
+ - User directly says "stop monitoring" · "close this" · "remove it"
194
+
195
+ Natural-language closed items → remove from card immediately (leave only "✅ closed" one-liner in reference table). Do not re-mention.
196
+
197
+ **When fh_completed file is absent (2-source mode)**:
198
+ - Source: starter card + git log only
199
+ - Items confirmed by git log alone → "confirmed removal"
200
+ - Card item name ↔ commit message mismatch → "removal candidate (uncertain)" + "real-time log missing — manual check needed: [item list]"
201
+ - Exact match required — no LLM semantic judgment for "confirmed" status
@@ -0,0 +1,129 @@
1
+ ---
2
+ name: hub-cc-pr-reviewer
3
+ description: Checks a submitted PR against the environment's baseline assets (CLAUDE.md, memory, naming, asset classification) and attaches a review comment with a merge recommendation. 5 steps — diff read, 8-area consistency check, self-catch, comment, merge recommendation.
4
+ user-invocable: true
5
+ allowed-tools: ["Bash", "Read", "Grep", "Glob"]
6
+ model: sonnet
7
+ complexity_routing:
8
+ base: sonnet
9
+ high: opus
10
+ escalate_when:
11
+ - adversarial
12
+ - cross_project
13
+ - high_stakes
14
+ ---
15
+
16
+ > **Note:** "Original developer" here means the original developer of forge-harness (development source + meta-monitoring home). In an external install environment, the installing user acts as the baseline integrity gate operator (following the path B generalization baseline / `SKILL_detail.md §External User Environment Adaptation Path`).
17
+
18
+ # hub-cc-pr-reviewer — Hub Gate Operation Rule Automation
19
+
20
+ When a PR is submitted, checks consistency against the user environment's baseline assets (CLAUDE.md · memory · naming · asset classification) and attaches a review comment. 5-step: diff read → 8-matrix check → self-catch → comment attachment → merge recommendation.
21
+
22
+ ## Activation Triggers
23
+
24
+ 1. **PR #N input**: *"Review PR #N"* / *"Check PR #N"* / *"hub review"* / *"baseline consistency check"*
25
+ 2. **Action leader cc → hub sync point**: Large decision area PR catch (following Option C Hybrid policy — memory creation / CLAUDE.md change / CATALOG round / skill v0.x evolution / policy change / asset synergy branch judgment)
26
+ 3. **Hub cc session entry**: Layer A auto-read recent external commit catch (auto-discover new PRs)
27
+
28
+ ### Natural Language Triggers (General user phrasing — activates without internal vocabulary)
29
+
30
+ | Example phrasing | Intent |
31
+ |---|---|
32
+ | "Is it okay to submit this PR?" | PR review request |
33
+ | "This change seems inconsistent with existing rules" | Baseline consistency check |
34
+ | "Please review before merging" | PR review gate |
35
+ | "Does this change affect other parts?" | Consistency check |
36
+
37
+ **Activation criteria**: "Review PR #N" / "Add review comment" / "Baseline check" → Run this skill directly
38
+ *(pr-review-watcher deprecated as of v0.2.0 — recommend using `gh pr view --json reviews` directly)*
39
+
40
+ **Exceptions** (this skill does NOT apply):
41
+ - **Small patches** (typo / 1-line cross-ref addition / sync / minor adjustments) → Follow Option C Hybrid policy (direct push allowed area / review skip)
42
+ - **Original developer simple correction commands** ("This is wrong, redo it") → Immediate correction (no review / direct handling)
43
+
44
+ ## Processing Steps (5-step)
45
+
46
+ ### Step 1. PR Diff Read
47
+
48
+ Read the PR diff + metadata. If this cc authored the change, the diff read can be skipped (directly state changed areas in PR body).
49
+
50
+ > **Detail**: See `SKILL_detail.md §Step 1 Diff Read` — `gh pr diff` + `gh pr view` commands — read when executing the diff read.
51
+
52
+ ### Step 2. Baseline Consistency Check — 8-Matrix Auto-Generation
53
+
54
+ | # | Area | Check path |
+ |:---:|---|---|
+ | 1 | CLAUDE.md (hub identity + asset ownership + sync policy) | Grep PR diff vs CLAUDE.md baseline areas |
+ | 2 | Memory accumulation (accumulated naming/decision baseline + asset synergy branch judgment + active onboarding + bidirectional self-validation, etc.) | Grep PR diff vs `memory feedback_*.md` key areas (**External environment**: skip this item if memory files are absent → see `SKILL_detail.md §External User Environment Adaptation Path`) |
+ | 3 | Naming baseline (accumulated naming baseline + new naming candidate area) | Catch new naming candidates from PR diff / check adherence to existing naming |
+ | 4 | Asset synergy branch judgment (meta/hub seed vs action leader persistent location) | Check PR changed asset location consistency |
+ | 5 | Simplification guard (P15 asymmetry catch + R7 over-engineering) | New asset creation vs existing asset reinforcement judgment / body length check |
+ | 6 | Dimension separation baseline (## Plugins / ## Skills / ## Agents) | Check dimension separation consistency on CATALOG changes |
+ | 7 | Branch criteria (large decision PR mandatory vs small patch direct push) | Check if PR is a large decision area (Option C Hybrid) |
+ | 8 | Hub gate operation consistency | Check if PR itself is a hub gate operation proof path |
+
+ Matrix result = Consistent ✅ / Partially Consistent ⚠️ / Inconsistent ❌.
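One way to combine the eight per-item results into a single matrix result is sketched below in Python. The source does not state an aggregation rule, so "worst result wins" is an assumption, as are the names `Verdict`, `SYMBOL`, and `overall_verdict`. A value of `None` stands for an item that was skipped, such as item 2 in an external environment without memory files.

```python
from enum import IntEnum
from typing import Dict, Optional


class Verdict(IntEnum):
    CONSISTENT = 0
    PARTIAL = 1
    INCONSISTENT = 2


SYMBOL = {Verdict.CONSISTENT: "✅", Verdict.PARTIAL: "⚠️", Verdict.INCONSISTENT: "❌"}


def overall_verdict(results: Dict[int, Optional[Verdict]]) -> Verdict:
    """Aggregate matrix items 1..8; None marks a skipped item."""
    if set(results) != set(range(1, 9)):
        raise ValueError("expected results for matrix items 1 through 8")
    checked = [v for v in results.values() if v is not None]
    if not checked:
        raise ValueError("every matrix item was skipped")
    # Assumed rule: the least consistent item decides the overall result.
    return max(checked)
```

Under this rule a single ❌ item makes the whole matrix ❌, which keeps the result conservative.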
+
+ ### Step 3. Layer 5 Self-Catch Matrix
+
+ Self-precision catch areas after the first cc review (following the self-catch patterns from previous PRs):
+ - Check that the frontmatter description is plain text only (project baseline)
+ - Check that areas where generalization weakens the effect are documented honestly
+ - Check that the gap between the accumulated history (original developer environment) and the external user's starting point (0 instances) is stated explicitly
+ - Check that audience-specific guides are explicitly stated as limited to the original developer environment
+ - Check that organization-specific areas are stated explicitly
+
+ If there are 0 self-catch items, skip this entire catch matrix (no token-filling, following `feedback_simplification_evidence`).
+
+ ### Step 4. Review Comment Attachment
+
+ Attach the review comment (8-matrix results + self-catch + refinement suggestions + merge recommendation) via `gh pr comment`. This step is within this skill's execution authority and runs automatically.
+
+ > **Detail**: See `SKILL_detail.md §Step 4 Comment Template` for the `gh pr comment` heredoc template; read it when attaching the comment.
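The comment assembly can be sketched in Python as follows. This is not the template from `SKILL_detail.md`: the section titles, the function names `render_review_comment` and `comment_command`, and the data shapes are hypothetical. The only `gh` behavior relied on is that `gh pr comment <number> --body-file -` reads the comment body from standard input.

```python
from typing import Dict, List


def render_review_comment(matrix: Dict[str, str], self_catch: List[str],
                          suggestions: List[str], recommendation: str) -> str:
    """Render the four parts of the review comment as one markdown string."""
    lines = ["## Hub review", "", "| Area | Result |", "|---|---|"]
    lines += [f"| {area} | {result} |" for area, result in matrix.items()]
    if self_catch:
        # Step 3 says to skip the self-catch matrix when there are 0 items.
        lines += ["", "### Self-catch"] + [f"- {item}" for item in self_catch]
    if suggestions:
        lines += ["", "### Refinement suggestions"] + [f"- {s}" for s in suggestions]
    lines += ["", f"**Merge recommendation**: {recommendation}"]
    return "\n".join(lines)


def comment_command(pr_number: int) -> List[str]:
    """Build the gh invocation; the rendered body is supplied on stdin."""
    return ["gh", "pr", "comment", str(pr_number), "--body-file", "-"]
```

A caller would pair the two, for example `subprocess.run(comment_command(n), input=body, text=True, check=True)`.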
+
+ ### Step 5. Admin Override Merge Recommendation
+
+ **User decision delegation** (this skill = review/recording automation / no merge authority):
+ - Beta stage policy (`enforce_admins: false`) adherence → admin override possible
+ - Self-approve blocked (GHE policy) → admin override path adherence
+ - When this cc authored the change, the admin override path is mandatory
+ - (N+1)th operation proof = baseline stabilization acceleration path
+
+ > **Detail**: See `SKILL_detail.md §Step 5 Merge Command` for the `gh pr merge` command (executed after the user's decision, not by this skill); read it when the user authorizes the merge.
+
+ ## User Approval Gate
+
+ | Stage | Approval |
+ |---|---|
+ | Steps 1-3 check auto-activation | **Automatic** (editable afterward) |
+ | Step 4 review comment attachment | **Automatic** (gh pr comment within this skill's execution authority) |
+ | Step 5 admin override merge execution | **User decision** (this skill = recommendation only / no merge authority) |
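The gate in the table above can be expressed as a small lookup. This is a Python sketch under the assumption that an implementation tracks steps by number; `APPROVAL_BY_STEP` and `requires_user_decision` are hypothetical names, not part of the skill.

```python
AUTOMATIC = "automatic"
USER_DECISION = "user decision"

# Steps 1-4 run automatically; merge execution in Step 5 is left to the user.
APPROVAL_BY_STEP = {
    1: AUTOMATIC,
    2: AUTOMATIC,
    3: AUTOMATIC,
    4: AUTOMATIC,
    5: USER_DECISION,
}


def requires_user_decision(step: int) -> bool:
    """Return True when the step must wait for an explicit user decision."""
    if step not in APPROVAL_BY_STEP:
        raise ValueError(f"unknown step: {step}")
    return APPROVAL_BY_STEP[step] == USER_DECISION
```

Rejecting unknown step numbers, rather than defaulting them to automatic, keeps any future step from bypassing the gate by accident.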
+
+ ## Constraints
+
+ - **This skill = review/recording automation / no merge authority**: the merge decision belongs to the user (admin override) or another reviewer
+ - **No single-person decision application**: follows the `fact-checker` rule (narrow 1 / broad N+1 / this cc's self-catch counts toward the fact-checker count)
+ - **Simplification guard consistency** (`feedback_simplification_evidence`): when creating or modifying this skill, update SKILL.md only; add no new auxiliary files
+ - **Markdown editing discipline mandatory** (`feedback_markdown_edit_discipline`): use Edit first; do not use Write
+ - **Frontmatter description plain text only baseline** (`feedback_skill_frontmatter_description_plain_text`): avoid markdown bold
+
+ > **Detail**: See `SKILL_detail.md §Sister Asset Utilization Path`, `§External User Environment Adaptation Path`, `§Disable Path`, `§Persona Synergy Catch` for cross-ecosystem utilization, external-environment fallback, own-PRS disable resolution, and deep-insight simultaneous-activation handling; read them when operating in an external user environment, resolving own-PRS conflict, or coordinating with deep-insight.
+
+ ## Done When
+
+ ```
+ All 5 Steps completed
+ + Baseline consistency check 8-matrix results output (✅/⚠️/❌ each item)
+ + Review comment attached via gh pr comment command
+ + Admin override merge recommendation output (merge execution is user's decision)
+ + External verification path: harvest-loop Step 3.75 Critic isolation Agent can independently judge based on above criteria (skill_quality_rubric.md verifiable criteria)
+ ```
+
+ **→ Mandatory when PR contains SKILL.md / rules / templates changes: `bash templates/regression_guard.sh`** — run Axis 1 (backward check) before merge recommendation is issued. If regression_guard exits with M-tier block, merge recommendation must change to ❌ regardless of other checks.
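The override rule above can be sketched in Python. Two things are assumptions here: the path patterns used to decide that a PR touches SKILL.md, rules, or templates, and the way an M-tier block is detected. The source does not say how `regression_guard.sh` signals it, so the caller passes the outcome in as a boolean. The names `needs_regression_guard` and `final_recommendation` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def needs_regression_guard(changed_paths: Iterable[str]) -> bool:
    """Assumed patterns: any SKILL.md file, or files under rules/ or templates/."""
    for path in changed_paths:
        p = PurePosixPath(path)
        if p.name == "SKILL.md" or p.parts[:1] in (("rules",), ("templates",)):
            return True
    return False


def final_recommendation(matrix_result: str, guard_required: bool,
                         guard_blocked: Optional[bool]) -> str:
    """Apply the M-tier override on top of the matrix result."""
    if guard_required and guard_blocked is None:
        # The guard is mandatory for these PRs, so refuse to recommend without it.
        raise ValueError("regression guard has not been run")
    if guard_required and guard_blocked:
        return "❌"  # an M-tier block overrides every other check
    return matrix_result
```

Keeping the guard outcome as an explicit argument avoids guessing at the script's exit codes.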
+
+ ## References
+
+ - Rule body: `memory feedback_command_tower_gate.md` (hub gate accumulated naming baseline) + `memory feedback_qasp_to_hub_sync_protocol.md` (Option C Hybrid sync policy)
+ - Consistency rules: `feedback_simplification_evidence` · `feedback_markdown_edit_discipline` · `feedback_skill_frontmatter_description_plain_text` · `feedback_bidirectional_self_validation` · `feedback_reference_own_hub_assets_first`
+ - Sister skills: `cross-ecosystem-synergy-detection` (sister asset cluster baseline) · `verify-bidirectional` (bidirectional self-validation automation / self-catch auxiliary axis) · `harvest-loop` (weekly audit automation / operation proof accumulation cross-link)
+ - Autonomous commit proposal §2.19 baseline: `memory feedback_autonomous_commit_proposal.md` (① development source automation + PR proposal under human approval)