agent-harness-kit 0.8.0 → 0.9.0

Files changed (49)
  1. package/.claude-plugin/marketplace.json +2 -2
  2. package/.claude-plugin/plugin.json +1 -1
  3. package/bin/cli.mjs +21 -0
  4. package/package.json +1 -1
  5. package/src/core/doctor.mjs +24 -0
  6. package/src/core/render-templates.mjs +29 -0
  7. package/src/core/upgrade.mjs +81 -60
  8. package/src/templates/.claude/agents/api-consistency-reviewer.md.vi +37 -0
  9. package/src/templates/.claude/agents/architecture-reviewer.md.vi.hbs +45 -0
  10. package/src/templates/.claude/agents/performance-reviewer.md.vi +39 -0
  11. package/src/templates/.claude/agents/reliability-reviewer.md.vi +42 -0
  12. package/src/templates/.claude/agents/security-reviewer.md.vi +43 -0
  13. package/src/templates/.claude/hooks/hooks.json +22 -0
  14. package/src/templates/.claude/output-styles/harness-terse.md +42 -0
  15. package/src/templates/.claude/settings.json.hbs +1 -0
  16. package/src/templates/.claude/skills/add-adr/SKILL.md.vi +64 -0
  17. package/src/templates/.claude/skills/add-feature/SKILL.md.vi.hbs +50 -0
  18. package/src/templates/.claude/skills/debug-flow/SKILL.md.vi.hbs +42 -0
  19. package/src/templates/.claude/skills/doc-drift-scan/SKILL.md.vi +52 -0
  20. package/src/templates/.claude/skills/eval-runner/SKILL.md.vi +59 -0
  21. package/src/templates/.claude/skills/garbage-collection/SKILL.md.vi.hbs +58 -0
  22. package/src/templates/.claude/skills/i18n-add-locale/SKILL.md +52 -0
  23. package/src/templates/.claude/skills/i18n-add-locale/SKILL.md.vi +56 -0
  24. package/src/templates/.claude/skills/i18n-add-locale/scripts/locale-scaffold.mjs +120 -0
  25. package/src/templates/.claude/skills/inspect-app/SKILL.md.vi +61 -0
  26. package/src/templates/.claude/skills/inspect-module/SKILL.md.vi.hbs +57 -0
  27. package/src/templates/.claude/skills/map-domain/SKILL.md +42 -0
  28. package/src/templates/.claude/skills/map-domain/SKILL.md.vi +42 -0
  29. package/src/templates/.claude/skills/map-domain/scripts/domain-map.mjs +145 -0
  30. package/src/templates/.claude/skills/propose-harness-improvement/SKILL.md.vi +49 -0
  31. package/src/templates/.claude/skills/propose-harness-improvement/scripts/improvement-bundle.mjs +172 -0
  32. package/src/templates/.claude/skills/refactor-feature/SKILL.md +60 -0
  33. package/src/templates/.claude/skills/refactor-feature/SKILL.md.vi +64 -0
  34. package/src/templates/.claude/skills/refactor-feature/scripts/feature-diff.mjs +146 -0
  35. package/src/templates/.claude/skills/review-this-pr/SKILL.md +59 -0
  36. package/src/templates/.claude/skills/review-this-pr/SKILL.md.vi +63 -0
  37. package/src/templates/.claude/skills/review-this-pr/scripts/pr-review-driver.mjs +152 -0
  38. package/src/templates/.claude/skills/structural-test-author/SKILL.md.vi.hbs +50 -0
  39. package/src/templates/.claude/skills/write-skill/SKILL.md.vi +43 -0
  40. package/src/templates/.harness/eval/rubrics/feature-step-done.mjs +148 -0
  41. package/src/templates/.harness/eval/tasks/feature-step-done.answer.md +53 -0
  42. package/src/templates/.harness/eval/tasks/feature-step-done.json +10 -0
  43. package/src/templates/.harness/eval/tasks/feature-step-done.prompt.md +43 -0
  44. package/src/templates/.mcp.json.example +35 -0
  45. package/src/templates/scripts/pretooluse-edit-guard.sh.hbs +115 -0
  46. package/src/templates/scripts/session-end.sh.hbs +6 -0
  47. package/src/templates/scripts/session-rollup.mjs +96 -0
  48. package/src/templates/scripts/session-start.sh.hbs +25 -0
  49. package/src/templates/scripts/subagent-stop.sh.hbs +76 -0
@@ -0,0 +1,57 @@
+ ---
+ name: inspect-module
+ description: Use this skill whenever the user mentions "explore", "inspect", "understand", "what does X do", "where is Y", or before adding a new feature in an unfamiliar area. Produces a structured map of one module — files, exports, dependencies, layer assignment, and recent commits — without reading the entire codebase. Always invoke this skill before editing an unfamiliar module so the agent has accurate context, not guesses.
+ allowed-tools: Read, Glob, Grep, Bash(git log:*), Bash(git ls-tree:*), Bash(tree:*), Bash(node .claude/skills/inspect-module/scripts/module-summary.mjs:*)
+ suggested-turns: 6
+ ---
+
+ ## When to use
+
+ The user asks anything of the form: "how does X work", "what is in
+ src/foo", "before adding feature Y, what is in this area?", "explore <path>",
+ "show me the shape of <module>".
+
+ ## Steps
+
+ 1. **Resolve target.** If the user gives a feature name (not a path), grep
+    `feature_list.json` to find it. If several paths match, ask the user which one.
+ 2. **One-shot summary (deterministic).** Run the side-car script, which bundles
+    exports + inbound + outbound deps + layer + recent commits into one JSON
+    blob, replacing three LLM grep turns:
+
+    ```bash
+    node .claude/skills/inspect-module/scripts/module-summary.mjs <target>
+    ```
+
+    Read the JSON. If `layer` is `null`, the file sits outside the configured
+    layer roots; flag that and ask the user whether to add it to a layer.
+ 3. **Forward-only check.** Walk `outbound[]` and verify that each outbound
+    dependency crosses layers in the forward direction (never backward). The
+    structural test already enforces this mechanically, but flagging it here
+    avoids a wasted write step.
+ 4. **Risks.** Flag any of: dynamic imports, eval, shell-out with
+    interpolation, missing tests for exported functions. (LLM judgment:
+    the side-car reports facts, not risks.)
+
+ ## Output contract
+
+ Produce a Markdown report with the following sections, in this order:
+
+ ```
+ ### Module: <path>
+ ### Layer: <layer-name>
+ ### Public surface: <list>
+ ### Inbound deps: <list of paths>
+ ### Outbound deps: <list of paths or external packages>
+ ### Recent changes: <top 3 commit messages>
+ ### Risks: <bulleted list, "none" if clean>
+ ```
+
+ End with: "I am ready to change `<module>`. The architecture-reviewer
+ subagent will be invoked on completion."
+
+ ## Anti-patterns
+
+ - Don't read every file in the module: sample the exports, then drill in
+   only where the user's task points.
+ - Don't propose changes in this skill. It is read-only context-gathering.
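The forward-only check in step 3 can be sketched as a small pure function. This is an illustration, not the kit's structural test: the JSON schema emitted by `module-summary.mjs` is not shown in this diff, so the `{ path, layer }` shape of each outbound entry, the `layers` ordering array, and the name `findBackwardDeps` are all assumptions. It also assumes that "forward" means a module may import from its own layer or from layers earlier in the configured order.

```javascript
// Hypothetical helper: names and data shapes are assumptions for illustration.
// `layers` is the configured order, e.g. ["types", "repo", "service", "ui"].
// Assumption: a module may import from its own layer or any earlier layer.
function findBackwardDeps(layers, moduleLayer, outbound) {
  const rank = new Map(layers.map((name, i) => [name, i]));
  const own = rank.get(moduleLayer);
  if (own === undefined) return []; // module is outside the configured layers
  return outbound.filter((dep) => {
    const target = rank.get(dep.layer);
    // External packages and unlayered files carry no known layer; skip them.
    return target !== undefined && target > own;
  });
}

const layers = ["types", "repo", "service", "ui"];
const outbound = [
  { path: "src/types/user.ts", layer: "types" },
  { path: "src/ui/button.tsx", layer: "ui" },
  { path: "lodash", layer: null },
];
const backward = findBackwardDeps(layers, "service", outbound);
console.log(backward.map((d) => d.path)); // [ 'src/ui/button.tsx' ]
```

If the project defines the direction the other way round, flipping the comparison to `target < own` is the only change needed.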
@@ -0,0 +1,42 @@
+ ---
+ name: map-domain
+ description: Use this skill to render the harness's domain/layer config as a mermaid diagram + check for drift between harness.config.json#domains and the actual filesystem. Surfaces "the config says layers A→B→C but the repo has folders A, B, X" — drift that silently invalidates the structural-test contract.
+ allowed-tools: Read, Bash(node .claude/skills/map-domain/scripts/domain-map.mjs:*)
+ suggested-turns: 3
+ ---
+
+ ## When to invoke
+
+ - After editing `harness.config.json#domains`.
+ - After moving files between layer directories.
+ - During onboarding — gives a one-glance view of the kit's layer rule.
+
+ ## Steps
+
+ 1. **Run the side-car.**
+    ```
+    node .claude/skills/map-domain/scripts/domain-map.mjs --out docs/architecture/domain-map.md
+    ```
+ 2. **Inspect drift.** The mermaid diagram embeds a "drift" badge per layer:
+    - `✓` — config layer name has a matching, non-empty `<root>/<layer>/` directory.
+    - `✗` — directory missing.
+    - `?` — directory exists but is empty (counted as drift; seed it or drop the layer).
+ 3. **Update the README** (optional). The generated markdown is safe to
+    commit; re-running the side-car is idempotent.
+
+ ## Output contract
+
+ ```
+ domains: <N>
+ layers: <M>
+ drift_count: <K>
+ report: docs/architecture/domain-map.md
+ ```
+
+ ## Anti-patterns
+
+ - Don't rename a layer in the config without moving the directory at the
+   same time — the structural-test will start scanning a path that no
+   longer exists.
+ - Don't add a layer to the config without seeding it with at least a
+   `README.md` so the drift check passes.
@@ -0,0 +1,42 @@
+ ---
+ name: map-domain
+ description: Use this skill to render the harness's domain/layer config as a mermaid diagram + check for drift between harness.config.json#domains and the actual filesystem. Surfaces "the config says layers A→B→C but the repo has folders A, B, X" — drift that silently invalidates the structural-test contract.
+ allowed-tools: Read, Bash(node .claude/skills/map-domain/scripts/domain-map.mjs:*)
+ suggested-turns: 3
+ ---
+
+ ## When to invoke
+
+ - After editing `harness.config.json#domains`.
+ - After moving files between layer directories.
+ - During onboarding: gives a one-glance view of the kit's layer
+   rule.
+
+ ## Steps
+
+ 1. **Run the side-car.**
+    ```
+    node .claude/skills/map-domain/scripts/domain-map.mjs --out docs/architecture/domain-map.md
+    ```
+ 2. **Inspect drift.** The mermaid diagram embeds a "drift" badge per layer:
+    - `✓` — config layer name has a matching, non-empty `<root>/<layer>/` directory.
+    - `✗` — directory missing.
+    - `?` — directory exists but is empty (counted as drift; seed it or drop the layer).
+ 3. **Update the README** (optional). The generated markdown is safe to
+    commit; re-running the side-car is idempotent.
+
+ ## Output contract
+
+ ```
+ domains: <N>
+ layers: <M>
+ drift_count: <K>
+ report: docs/architecture/domain-map.md
+ ```
+
+ ## Anti-patterns
+
+ - Don't rename a layer in the config without moving the directory at the
+   same time; the structural-test will start scanning a path that no longer exists.
+ - Don't add a layer to the config without seeding it with at least a
+   `README.md` so the drift check passes.
@@ -0,0 +1,145 @@
+ #!/usr/bin/env node
+ // domain-map.mjs — deterministic step for /map-domain.
+ // Renders harness.config.json#domains as a markdown doc with embedded
+ // mermaid graph + drift check against the filesystem.
+ //
+ // Usage:
+ //   domain-map.mjs [--out docs/architecture/domain-map.md]
+ //   domain-map.mjs --stdout
+
+ import { readFileSync, existsSync, writeFileSync, mkdirSync, statSync, readdirSync } from "node:fs";
+ import { resolve, dirname } from "node:path";
+
+ const ROOT = process.env.CLAUDE_PROJECT_DIR || process.cwd();
+
+ function parseArgs(argv) {
+   const out = { outPath: null, stdout: false };
+   for (let i = 0; i < argv.length; i++) {
+     if (argv[i] === "--out") out.outPath = argv[++i];
+     else if (argv[i] === "--stdout") out.stdout = true;
+   }
+   if (!out.outPath && !out.stdout) out.stdout = true;
+   return out;
+ }
+
+ function loadConfig() {
+   const p = resolve(ROOT, "harness.config.json");
+   if (!existsSync(p)) {
+     console.error("domain-map: harness.config.json not found at repo root");
+     process.exit(2);
+   }
+   try { return JSON.parse(readFileSync(p, "utf8")); }
+   catch (e) {
+     console.error(`domain-map: harness.config.json is not valid JSON: ${e.message}`);
+     process.exit(2);
+   }
+ }
+
+ function dirState(abs) {
+   try {
+     const st = statSync(abs);
+     if (!st.isDirectory()) return "missing";
+     const entries = readdirSync(abs);
+     if (entries.length === 0) return "empty";
+     return "present";
+   } catch {
+     return "missing";
+   }
+ }
+
+ function driftBadge(state) {
+   if (state === "present") return "✓";
+   if (state === "empty") return "?";
+   return "✗";
+ }
+
+ function safeId(s) {
+   return String(s).replace(/[^a-zA-Z0-9_]/g, "_");
+ }
+
+ function buildMermaid(domains) {
+   const lines = [];
+   lines.push("```mermaid");
+   lines.push("flowchart LR");
+   let domainIdx = 0;
+   const driftRows = [];
+   for (const d of domains) {
+     domainIdx++;
+     const dName = d.name || `domain${domainIdx}`;
+     const root = d.root || "";
+     const layers = Array.isArray(d.layers) ? d.layers : [];
+     const dId = safeId(dName);
+     lines.push(` subgraph ${dId} ["${dName} (${root}/)"]`);
+     let prev = null;
+     for (const layer of layers) {
+       const abs = resolve(ROOT, root, layer);
+       const state = dirState(abs);
+       const badge = driftBadge(state);
+       const node = `${dId}__${safeId(layer)}`;
+       lines.push(` ${node}["${layer} ${badge}"]`);
+       if (prev) lines.push(` ${prev} --> ${node}`);
+       prev = node;
+       driftRows.push({ domain: dName, layer, root, state, badge });
+     }
+     lines.push(" end");
+   }
+   lines.push("```");
+   return { mermaid: lines.join("\n"), driftRows };
+ }
+
+ function buildMarkdown(cfg) {
+   const domains = Array.isArray(cfg.domains) ? cfg.domains : [];
+   const { mermaid, driftRows } = buildMermaid(domains);
+   const drift = driftRows.filter((r) => r.state !== "present");
+   const md = [];
+   md.push(`# Domain map`);
+   md.push("");
+   md.push(`Generated by \`/map-domain\` from \`harness.config.json\`. Re-run after editing domain/layer config.`);
+   md.push("");
+   md.push(`- domains: ${domains.length}`);
+   md.push(`- layers (total): ${driftRows.length}`);
+   md.push(`- drift entries: ${drift.length}`);
+   md.push("");
+   md.push(`## Diagram`);
+   md.push("");
+   md.push(mermaid);
+   md.push("");
+   md.push(`## Drift table`);
+   md.push("");
+   md.push(`| domain | layer | root | state | badge |`);
+   md.push(`|---|---|---|---|---|`);
+   for (const r of driftRows) {
+     md.push(`| ${r.domain} | ${r.layer} | \`${r.root}/${r.layer}/\` | ${r.state} | ${r.badge} |`);
+   }
+   md.push("");
+   if (drift.length > 0) {
+     md.push(`## Resolutions`);
+     md.push("");
+     for (const d of drift) {
+       if (d.state === "missing") {
+         md.push(`- \`${d.root}/${d.layer}/\` is **missing**. Either create the directory (with a README.md) or remove "${d.layer}" from harness.config.json#domains[${d.domain}].layers.`);
+       } else if (d.state === "empty") {
+         md.push(`- \`${d.root}/${d.layer}/\` exists but is empty. Add at least a README.md or seed file, or drop the layer from config.`);
+       }
+     }
+     md.push("");
+   }
+   const tail = { domains: domains.length, layers: driftRows.length, drift_count: drift.length };
+   md.push(`<!-- machine-tail: ${JSON.stringify(tail)} -->`);
+   return { md: md.join("\n") + "\n", drift_count: drift.length };
+ }
+
+ function main() {
+   const { outPath, stdout } = parseArgs(process.argv.slice(2));
+   const cfg = loadConfig();
+   const { md, drift_count } = buildMarkdown(cfg);
+   if (outPath) {
+     const abs = resolve(ROOT, outPath);
+     mkdirSync(dirname(abs), { recursive: true });
+     writeFileSync(abs, md);
+     process.stdout.write(JSON.stringify({ written: outPath, drift_count }) + "\n");
+   }
+   if (stdout) process.stdout.write(md);
+ }
+
+ main();
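`buildMarkdown` ends the report with a `<!-- machine-tail: {...} -->` comment holding the three counts. A consumer could read those counts back along the lines below. The helper name `readMachineTail` is hypothetical and not part of the package, and the regular expression assumes the tail stays on a single line, which is how the script writes it.

```javascript
// Hypothetical consumer of the generated report; not part of the package.
function readMachineTail(markdown) {
  // `.` does not match newlines, so this only matches a single-line tail.
  const m = markdown.match(/<!-- machine-tail: (\{.*\}) -->/);
  if (!m) return null;
  try {
    return JSON.parse(m[1]);
  } catch {
    return null; // tail present but not valid JSON
  }
}

const sample = [
  "# Domain map",
  "",
  '<!-- machine-tail: {"domains":1,"layers":3,"drift_count":1} -->',
  "",
].join("\n");

const tail = readMachineTail(sample);
console.log(tail.drift_count); // 1
```

A CI step might, for example, fail the build when `drift_count` is non-zero, though the diff does not show the kit doing so.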
@@ -0,0 +1,49 @@
+ ---
+ name: propose-harness-improvement
+ description: Use this skill whenever the agent makes a mistake, the user observes an avoidable failure, a pattern recurs, or someone says "the agent keeps doing X". Files an "Engineer the Harness" entry — Mitchell Hashimoto's discipline: every failure becomes a permanent prevention mechanism. Always invoke this instead of just fixing the immediate symptom.
+ allowed-tools: Read, Edit, Write, Bash(git diff:*)
+ suggested-turns: 8
+ ---
+
+ ## Steps
+
+ 1. **Triage.** Ask: "What just went wrong? What was the desired agent
+    behavior? What was the symptom?"
+ 2. **Classify.** One of:
+    - **(a) Missing context**: the agent did not know something. Fix: add it
+      to `docs/`.
+    - **(b) Missing rule**: the agent did something an unwritten rule forbids.
+      Fix: invoke `/structural-test-author`.
+    - **(c) Missing tool/skill**: the agent reached for the wrong tool. Fix: invoke
+      `/write-skill`.
+    - **(d) Wrong layer / architecture**: the structure invited the mistake.
+      Fix: write an ADR via `/add-adr`.
+    - **(e) Wrong instruction in prompt**: the failure traces back to a
+      skill/agent prompt that was ambiguous, misleading, or under-constrained.
+      The agent followed the prompt exactly, but the prompt itself led it astray.
+      Fix: edit the offending file at `.claude/skills/<name>/SKILL.md` or
+      `.claude/agents/<name>.md`. Then re-run `/eval-runner` to confirm
+      the regression is closed.
+ 3. **Append an entry** to `docs/agent-failures.md` with: date, symptom, fix,
+    fix-type, file modified.
+ 4. **Apply the fix in the right place.** NEVER paper over it with a "be
+    careful" sentence in CLAUDE.md unless classification (a) applies, and even
+    then, only as a pointer to a longer doc.
+ 5. **Update PROGRESS.** Append `harness-improvement: <slug>` to
+    `.harness/PROGRESS.md`.
+
+ ## Output contract
+
+ ```
+ ### Failure: <one-line summary>
+ ### Classification: (a|b|c|d|e) <name>
+ ### Fix applied at: <file:line>
+ ### docs/agent-failures.md entry: §<n>
+ ```
+
+ ## Anti-patterns (block on these)
+
+ - Don't add a vague "be careful with X" sentence to CLAUDE.md.
+ - Don't add a rule whose enforcement also relies on an LLM.
+ - Don't use this skill to log unrelated cleanup ideas;
+   they go in `docs/tech-debt-tracker.md`.
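The (a)-(e) classification in step 2 is effectively a lookup from failure class to the place where the fix belongs. A minimal sketch of that table follows; `FIX_BY_CLASS` and `fixFor` are illustrative names, not something the kit ships, and the `fix` strings simply restate the skill text.

```javascript
// Illustrative lookup only; the entries restate the SKILL.md classification.
const FIX_BY_CLASS = new Map([
  ["a", { name: "missing context", fix: "add to docs/" }],
  ["b", { name: "missing rule", fix: "/structural-test-author" }],
  ["c", { name: "missing tool/skill", fix: "/write-skill" }],
  ["d", { name: "wrong layer / architecture", fix: "/add-adr" }],
  ["e", { name: "wrong instruction in prompt", fix: "edit the prompt file, then /eval-runner" }],
]);

function fixFor(letter) {
  const entry = FIX_BY_CLASS.get(String(letter).toLowerCase());
  if (!entry) throw new Error(`unknown classification: ${letter}`);
  return entry;
}

console.log(fixFor("b").fix); // /structural-test-author
```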
@@ -0,0 +1,172 @@
+ #!/usr/bin/env node
+ // improvement-bundle.mjs — deterministic step for /propose-harness-improvement.
+ // Replaces the "ask the agent to summarize recent failures" LLM turn with a
+ // mechanical sweep over telemetry + git history + bypass log.
+ //
+ // Output (JSON, stdout or --out):
+ //   {
+ //     window_days: <n>,
+ //     recent_failures: [ {ts, event, source, detail} ],
+ //     recurring_patterns: [ {pattern, count, sample_ts} ],
+ //     classification: { context, rule, tool_skill, architecture, prompt },
+ //     fix_targets: [ {file, why} ], recent_fix_commits: [ <git oneline> ]
+ //   }
+ //
+ // Classification rubric mirrors the (a)-(e) buckets in the SKILL.md:
+ //   (a) context — pretooluse denials referencing rules in docs/
+ //   (b) rule — structural-test failures / baseline drift
+ //   (c) tool/skill — bypass.log entries / missing-skill prompt-guard hits
+ //   (d) architecture — layer-violation patterns appearing >=3 times
+ //   (e) prompt — skill_invoked followed by failure within same session
+ //
+ // The buckets are heuristic (classify() below never increments "prompt");
+ // an LLM still makes the final call. The point is to hand it a dense,
+ // factual digest instead of forcing it to scan files blind.
+
+ import { readFileSync, existsSync, writeFileSync } from "node:fs";
+ import { resolve } from "node:path";
+ import { spawnSync } from "node:child_process";
+
+ const ROOT = process.env.CLAUDE_PROJECT_DIR || process.cwd();
+
+ function parseArgs(argv) {
+   const out = { window: 14, out: null };
+   for (let i = 0; i < argv.length; i++) {
+     if (argv[i] === "--window") out.window = Number(argv[++i]) || 14;
+     else if (argv[i] === "--out") out.out = argv[++i];
+   }
+   return out;
+ }
+
+ function readJsonl(path) {
+   if (!existsSync(path)) return [];
+   const body = readFileSync(path, "utf8");
+   const out = [];
+   for (const line of body.split("\n")) {
+     if (!line.trim()) continue;
+     try { out.push(JSON.parse(line)); } catch { /* skip malformed */ }
+   }
+   return out;
+ }
+
+ function isWithin(ts, days) {
+   const t = Date.parse(ts);
+   if (!Number.isFinite(t)) return false;
+   return (Date.now() - t) <= days * 24 * 3600 * 1000;
+ }
+
+ function gitLogFixes(days) {
+   const since = `${days}.days`;
+   const r = spawnSync("git", ["log", `--since=${since}`, "--oneline", "--grep=fix\\|revert\\|hotfix"], {
+     cwd: ROOT, encoding: "utf8",
+   });
+   if (r.status !== 0) return [];
+   return (r.stdout || "").split("\n").filter(Boolean).slice(0, 50);
+ }
+
+ function summariseFailures(telemetry, bypass, windowDays) {
+   const failures = [];
+   for (const rec of telemetry) {
+     if (!rec.ts || !isWithin(rec.ts, windowDays)) continue;
+     if (rec.event === "structural_test_fail" || rec.event === "precompletion_block" ||
+         rec.event === "permission_denied" || rec.event === "userprompt_block") {
+       failures.push({
+         ts: rec.ts,
+         event: rec.event,
+         source: rec.source || rec.rule || "(unspecified)",
+         detail: (rec.reason || rec.detail || rec.skill || "").slice(0, 200),
+       });
+     }
+   }
+   for (const rec of bypass) {
+     if (!rec.ts || !isWithin(rec.ts, windowDays)) continue;
+     failures.push({
+       ts: rec.ts,
+       event: "bypass",
+       source: rec.rule || rec.bypass || "(unspecified)",
+       detail: (rec.command || rec.file || "").slice(0, 200),
+     });
+   }
+   failures.sort((a, b) => a.ts.localeCompare(b.ts));
+   return failures.slice(-40);
+ }
+
+ function recurringPatterns(failures) {
+   const counts = new Map();
+   const samples = new Map();
+   for (const f of failures) {
+     const key = `${f.event}::${f.source}`;
+     counts.set(key, (counts.get(key) || 0) + 1);
+     if (!samples.has(key)) samples.set(key, f.ts);
+   }
+   const out = [];
+   for (const [key, count] of counts) {
+     if (count >= 2) out.push({ pattern: key, count, sample_ts: samples.get(key) });
+   }
+   out.sort((a, b) => b.count - a.count);
+   return out.slice(0, 20);
+ }
+
+ function classify(failures, recurring) {
+   const buckets = { context: 0, rule: 0, tool_skill: 0, architecture: 0, prompt: 0 };
+   for (const f of failures) {
+     if (f.event === "structural_test_fail") buckets.rule++;
+     else if (f.event === "precompletion_block") buckets.rule++;
+     else if (f.event === "permission_denied") buckets.context++;
+     else if (f.event === "userprompt_block") buckets.context++;
+     else if (f.event === "bypass") buckets.tool_skill++;
+   }
+   for (const r of recurring) {
+     if (r.count >= 3 && r.pattern.startsWith("structural_test_fail::")) {
+       buckets.architecture++;
+     }
+   }
+   return buckets;
+ }
+
+ function fixTargets(buckets) {
+   const out = [];
+   if (buckets.rule > 0) {
+     out.push({ file: "harness.config.json", why: "structural rule lives here; consider tightening" });
+     out.push({ file: ".harness/structural-baseline.json", why: "review whether baseline entries should drain" });
+   }
+   if (buckets.context > 0) {
+     out.push({ file: "docs/golden-principles.md", why: "context gap surfaced via permission denials" });
+     out.push({ file: "CLAUDE.md", why: "consider a pointer (not a paste) to relevant doc" });
+   }
+   if (buckets.tool_skill > 0) {
+     out.push({ file: ".claude/skills/", why: "missing skill or wrong skill chosen — write or edit one" });
+   }
+   if (buckets.architecture > 0) {
+     out.push({ file: "docs/adr/", why: "recurring violation suggests an ADR is needed" });
+   }
+   if (buckets.prompt > 0) {
+     out.push({ file: ".claude/skills/<name>/SKILL.md", why: "prompt ambiguity led the agent astray" });
+   }
+   return out;
+ }
+
+ function main() {
+   const { window: windowDays, out: outPath } = parseArgs(process.argv.slice(2));
+   const telemetry = readJsonl(resolve(ROOT, ".harness/telemetry.jsonl"));
+   const bypass = readJsonl(resolve(ROOT, ".harness/bypass.log"));
+   const recentFailures = summariseFailures(telemetry, bypass, windowDays);
+   const recurring = recurringPatterns(recentFailures);
+   const classification = classify(recentFailures, recurring);
+   const targets = fixTargets(classification);
+   const fixCommits = gitLogFixes(windowDays);
+
+   const payload = {
+     window_days: windowDays,
+     recent_failures: recentFailures,
+     recurring_patterns: recurring,
+     classification,
+     fix_targets: targets,
+     recent_fix_commits: fixCommits,
+   };
+   const text = JSON.stringify(payload, null, 2);
+   if (outPath) writeFileSync(resolve(ROOT, outPath), text + "\n");
+   else process.stdout.write(text + "\n");
+ }
+
+ main();
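The header comment describes bucket (e) as "skill_invoked followed by failure within same session", but `classify()` as shipped never increments `prompt`. One way that heuristic might be computed is sketched below. It assumes telemetry records carry a `session` field and that a `skill_invoked` record names the skill in `skill`; the `session` field in particular is not confirmed anywhere in this diff, and `countPromptSuspects` is a hypothetical name.

```javascript
// Hypothetical sketch: the `session` field and this function are assumptions.
const FAILURE_EVENTS = new Set([
  "structural_test_fail",
  "precompletion_block",
  "permission_denied",
  "userprompt_block",
]);

// Counts failures that occur after a skill invocation in the same session,
// attributed to the most recently invoked skill in that session.
function countPromptSuspects(records) {
  const sorted = [...records].sort((a, b) => String(a.ts).localeCompare(String(b.ts)));
  const lastSkill = new Map(); // session -> most recent skill name
  const suspects = new Map(); // skill -> number of failures that followed it
  for (const rec of sorted) {
    if (!rec.session) continue;
    if (rec.event === "skill_invoked" && rec.skill) {
      lastSkill.set(rec.session, rec.skill);
    } else if (FAILURE_EVENTS.has(rec.event) && lastSkill.has(rec.session)) {
      const skill = lastSkill.get(rec.session);
      suspects.set(skill, (suspects.get(skill) || 0) + 1);
    }
  }
  return suspects;
}

const suspects = countPromptSuspects([
  { ts: "2025-01-01T10:00:00Z", session: "s1", event: "skill_invoked", skill: "add-feature" },
  { ts: "2025-01-01T10:05:00Z", session: "s1", event: "structural_test_fail" },
  { ts: "2025-01-01T11:00:00Z", session: "s2", event: "permission_denied" },
]);
console.log(suspects.get("add-feature")); // 1
```

Like the other buckets, this would be only a hint: a failure that follows a skill invocation is not necessarily caused by that skill's prompt.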
@@ -0,0 +1,60 @@
+ ---
+ name: refactor-feature
+ description: Use this skill when restructuring a feature in feature_list.json — splitting steps, merging steps, renaming, or marking a previously-failing step done. The side-car diffs feature_list.json#steps before/after and rejects the edit when a step.done transition is not accompanied by a test reference. Forces "no done without proof".
+ allowed-tools: Read, Edit, Bash(git diff:*), Bash(node .claude/skills/refactor-feature/scripts/feature-diff.mjs:*)
+ suggested-turns: 6
+ isolation: worktree
+ ---
+
+ ## When to invoke
+
+ - Re-decomposing a feature (one becomes many, or vice versa).
+ - Marking `passes: false → true` for a step that was previously WIP.
+ - Renaming feature ids (this is the dangerous case — the side-car catches
+   silent renames that orphan PROGRESS.md references).
+
+ ## Pre-flight (side-car gate)
+
+ Run the diff side-car BEFORE any feature_list.json edit lands:
+
+ ```
+ node .claude/skills/refactor-feature/scripts/feature-diff.mjs \
+   --before-ref HEAD --after-file feature_list.json
+ ```
+
+ Side-car contract:
+ - Exits 0 + JSON when changes are coherent.
+ - Exits 2 + JSON with `violations: [...]` when:
+   - A step's `passes` flipped `false → true` without a test entry under
+     `step.tests` (or `step.testCommit`).
+   - A step's `id` changed without a `renamed_from` field (silent rename).
+   - A step disappeared without an entry in `step.replaced_by`.
+
+ ## Steps
+
+ 1. **Capture before-state.** `git show HEAD:feature_list.json > /tmp/before.json`
+ 2. **Edit.** Make the refactor in your working copy.
+ 3. **Run the gate.** Side-car compares HEAD vs working copy. Address any
+    violation before staging.
+ 4. **Stage + test.** If `passes` flipped true, the test must exist and be
+    referenced in `step.tests`.
+ 5. **Commit with a body explaining the refactor.** Use commit trailer
+    `Refactor-Feature: <feature_id>` so /review-this-pr can group changes.
+
+ ## Output contract
+
+ ```
+ feature_list refactor: <id>
+ steps_changed: <N>
+ done_transitions: <M> (each with a test reference)
+ renames: <list of id→id>
+ gate: passed
+ ```
+
+ ## Anti-patterns
+
+ - Don't mark `passes: true` first and "add tests later" — the side-car
+   blocks at the boundary on purpose. Flip the bit only AFTER the test
+   exists.
+ - Don't delete a step without `replaced_by` — orphaned PROGRESS.md
+   entries get out of sync with the live feature list.
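The first rule of the side-car contract ("no done without proof") can be sketched as a pure function over the before and after step lists. This is not the shipped `feature-diff.mjs`, whose source is not part of this excerpt. The function name `findUnprovenDoneTransitions`, the `kind` string, and the assumption that `step.tests` is an array are all illustrative. The rename and removal rules are left out because the skill text does not say how the side-car matches old steps to new ones.

```javascript
// Hypothetical sketch of the "no done without proof" rule only.
// Assumed step shape: { id, passes, tests?: string[], testCommit?: string }.
function findUnprovenDoneTransitions(beforeSteps, afterSteps) {
  const before = new Map(beforeSteps.map((s) => [s.id, s]));
  const violations = [];
  for (const step of afterSteps) {
    const prev = before.get(step.id);
    const flipped = Boolean(prev) && prev.passes === false && step.passes === true;
    const hasTests = Array.isArray(step.tests) && step.tests.length > 0;
    const hasProof = hasTests || Boolean(step.testCommit);
    if (flipped && !hasProof) {
      violations.push({ id: step.id, kind: "done_without_test" });
    }
  }
  return violations;
}

const beforeSteps = [
  { id: "login-1", passes: false },
  { id: "login-2", passes: false },
];
const afterSteps = [
  { id: "login-1", passes: true, tests: ["tests/login.test.js"] },
  { id: "login-2", passes: true },
];
const unproven = findUnprovenDoneTransitions(beforeSteps, afterSteps);
console.log(JSON.stringify(unproven)); // [{"id":"login-2","kind":"done_without_test"}]
```

Under the contract described above, a gate built on this check would exit with code 2 whenever the returned list is non-empty.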
@@ -0,0 +1,64 @@
+ <!-- LOCALE_TODO: translate body to vi -->
+ <!-- Source: .claude/skills/refactor-feature/SKILL.md -->
+ <!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
+
+ ---
+ name: refactor-feature
+ description: Use this skill when restructuring a feature in feature_list.json — splitting steps, merging steps, renaming, or marking a previously-failing step done. The side-car diffs feature_list.json#steps before/after and rejects the edit when a step.done transition is not accompanied by a test reference. Forces "no done without proof".
+ allowed-tools: Read, Edit, Bash(git diff:*), Bash(node .claude/skills/refactor-feature/scripts/feature-diff.mjs:*)
+ suggested-turns: 6
+ isolation: worktree
+ ---
+
+ ## When to invoke
+
+ - Re-decomposing a feature (one becomes many, or vice versa).
+ - Marking `passes: false → true` for a step that was previously WIP.
+ - Renaming feature ids (this is the dangerous case — the side-car catches
+   silent renames that orphan PROGRESS.md references).
+
+ ## Pre-flight (side-car gate)
+
+ Run the diff side-car BEFORE any feature_list.json edit lands:
+
+ ```
+ node .claude/skills/refactor-feature/scripts/feature-diff.mjs \
+   --before-ref HEAD --after-file feature_list.json
+ ```
+
+ Side-car contract:
+ - Exits 0 + JSON when changes are coherent.
+ - Exits 2 + JSON with `violations: [...]` when:
+   - A step's `passes` flipped `false → true` without a test entry under
+     `step.tests` (or `step.testCommit`).
+   - A step's `id` changed without a `renamed_from` field (silent rename).
+   - A step disappeared without an entry in `step.replaced_by`.
+
+ ## Steps
+
+ 1. **Capture before-state.** `git show HEAD:feature_list.json > /tmp/before.json`
+ 2. **Edit.** Make the refactor in your working copy.
+ 3. **Run the gate.** Side-car compares HEAD vs working copy. Address any
+    violation before staging.
+ 4. **Stage + test.** If `passes` flipped true, the test must exist and be
+    referenced in `step.tests`.
+ 5. **Commit with a body explaining the refactor.** Use commit trailer
+    `Refactor-Feature: <feature_id>` so /review-this-pr can group changes.
+
+ ## Output contract
+
+ ```
+ feature_list refactor: <id>
+ steps_changed: <N>
+ done_transitions: <M> (each with a test reference)
+ renames: <list of id→id>
+ gate: passed
+ ```
+
+ ## Anti-patterns
+
+ - Don't mark `passes: true` first and "add tests later" — the side-car
+   blocks at the boundary on purpose. Flip the bit only AFTER the test
+   exists.
+ - Don't delete a step without `replaced_by` — orphaned PROGRESS.md
+   entries get out of sync with the live feature list.