agent-harness-kit 0.8.0 → 0.9.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/marketplace.json +2 -2
- package/.claude-plugin/plugin.json +1 -1
- package/bin/cli.mjs +21 -0
- package/package.json +1 -1
- package/src/core/doctor.mjs +24 -0
- package/src/core/render-templates.mjs +29 -0
- package/src/core/upgrade.mjs +81 -60
- package/src/templates/.claude/agents/api-consistency-reviewer.md.vi +37 -0
- package/src/templates/.claude/agents/architecture-reviewer.md.vi.hbs +45 -0
- package/src/templates/.claude/agents/performance-reviewer.md.vi +39 -0
- package/src/templates/.claude/agents/reliability-reviewer.md.vi +42 -0
- package/src/templates/.claude/agents/security-reviewer.md.vi +43 -0
- package/src/templates/.claude/hooks/hooks.json +22 -0
- package/src/templates/.claude/output-styles/harness-terse.md +42 -0
- package/src/templates/.claude/settings.json.hbs +1 -0
- package/src/templates/.claude/skills/add-adr/SKILL.md.vi +64 -0
- package/src/templates/.claude/skills/add-feature/SKILL.md.vi.hbs +50 -0
- package/src/templates/.claude/skills/debug-flow/SKILL.md.vi.hbs +42 -0
- package/src/templates/.claude/skills/doc-drift-scan/SKILL.md.vi +52 -0
- package/src/templates/.claude/skills/eval-runner/SKILL.md.vi +59 -0
- package/src/templates/.claude/skills/garbage-collection/SKILL.md.vi.hbs +58 -0
- package/src/templates/.claude/skills/i18n-add-locale/SKILL.md +52 -0
- package/src/templates/.claude/skills/i18n-add-locale/SKILL.md.vi +56 -0
- package/src/templates/.claude/skills/i18n-add-locale/scripts/locale-scaffold.mjs +120 -0
- package/src/templates/.claude/skills/inspect-app/SKILL.md.vi +61 -0
- package/src/templates/.claude/skills/inspect-module/SKILL.md.vi.hbs +57 -0
- package/src/templates/.claude/skills/map-domain/SKILL.md +42 -0
- package/src/templates/.claude/skills/map-domain/SKILL.md.vi +42 -0
- package/src/templates/.claude/skills/map-domain/scripts/domain-map.mjs +145 -0
- package/src/templates/.claude/skills/propose-harness-improvement/SKILL.md.vi +49 -0
- package/src/templates/.claude/skills/propose-harness-improvement/scripts/improvement-bundle.mjs +172 -0
- package/src/templates/.claude/skills/refactor-feature/SKILL.md +60 -0
- package/src/templates/.claude/skills/refactor-feature/SKILL.md.vi +64 -0
- package/src/templates/.claude/skills/refactor-feature/scripts/feature-diff.mjs +146 -0
- package/src/templates/.claude/skills/review-this-pr/SKILL.md +59 -0
- package/src/templates/.claude/skills/review-this-pr/SKILL.md.vi +63 -0
- package/src/templates/.claude/skills/review-this-pr/scripts/pr-review-driver.mjs +152 -0
- package/src/templates/.claude/skills/structural-test-author/SKILL.md.vi.hbs +50 -0
- package/src/templates/.claude/skills/write-skill/SKILL.md.vi +43 -0
- package/src/templates/.harness/eval/rubrics/feature-step-done.mjs +148 -0
- package/src/templates/.harness/eval/tasks/feature-step-done.answer.md +53 -0
- package/src/templates/.harness/eval/tasks/feature-step-done.json +10 -0
- package/src/templates/.harness/eval/tasks/feature-step-done.prompt.md +43 -0
- package/src/templates/.mcp.json.example +35 -0
- package/src/templates/scripts/pretooluse-edit-guard.sh.hbs +115 -0
- package/src/templates/scripts/session-end.sh.hbs +6 -0
- package/src/templates/scripts/session-rollup.mjs +96 -0
- package/src/templates/scripts/session-start.sh.hbs +25 -0
- package/src/templates/scripts/subagent-stop.sh.hbs +76 -0
|
@@ -0,0 +1,64 @@
|
|
|
1
|
+
<!-- LOCALE_TODO: translate body to vi -->
|
|
2
|
+
<!-- Source: .claude/skills/add-adr/SKILL.md -->
|
|
3
|
+
<!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
name: add-adr
|
|
7
|
+
description: Use this skill whenever a decision is made about architecture, dependencies, frameworks, naming conventions, or layer order. Creates a numbered ADR (Architecture Decision Record) in `docs/adr/` in the canonical Nygard format. Always invoke this before changing layer order, adding a layer, swapping a major dependency, or introducing a new external service.
|
|
8
|
+
allowed-tools: Read, Write, Glob
|
|
9
|
+
suggested-turns: 6
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Steps
|
|
13
|
+
|
|
14
|
+
1. **Find the next number.** List `docs/adr/` and pick the highest existing
|
|
15
|
+
number + 1 (zero-padded to 4 digits).
|
|
16
|
+
2. **Generate the file.** Write `docs/adr/{NNNN}-{kebab-title}.md` with the
|
|
17
|
+
sections below.
|
|
18
|
+
3. **Update affected configs.** If the ADR changes layer order or adds a
|
|
19
|
+
layer, update `harness.config.json` AND the structural-test config in the
|
|
20
|
+
same commit as the ADR.
|
|
21
|
+
4. **Append to the index.** Add a one-line entry under "Recent decisions" in
|
|
22
|
+
`docs/architecture.md`.
|
|
23
|
+
|
|
24
|
+
## ADR template (write exactly this shape)
|
|
25
|
+
|
|
26
|
+
```markdown
|
|
27
|
+
# ADR <NNNN> — <title>
|
|
28
|
+
|
|
29
|
+
- **Status:** proposed | accepted | superseded by <link>
|
|
30
|
+
- **Date:** YYYY-MM-DD
|
|
31
|
+
- **Deciders:** <names or "project owner">
|
|
32
|
+
|
|
33
|
+
## Context
|
|
34
|
+
|
|
35
|
+
<What forces are in play? What constraints? What did we learn that triggered this?>
|
|
36
|
+
|
|
37
|
+
## Decision
|
|
38
|
+
|
|
39
|
+
<What we decided. Single sentence then a list.>
|
|
40
|
+
|
|
41
|
+
## Consequences
|
|
42
|
+
|
|
43
|
+
Positive: ...
|
|
44
|
+
Negative: ...
|
|
45
|
+
|
|
46
|
+
## Alternatives considered
|
|
47
|
+
|
|
48
|
+
- <alternative>: <why rejected>
|
|
49
|
+
- <alternative>: <why rejected>
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
## Output contract
|
|
53
|
+
|
|
54
|
+
```
|
|
55
|
+
### ADR: <NNNN>-<slug>
|
|
56
|
+
### Status: <status>
|
|
57
|
+
### Configs updated: <list or "none">
|
|
58
|
+
### docs/architecture.md updated: yes/no
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
## Anti-patterns
|
|
62
|
+
|
|
63
|
+
- Don't write an ADR for a one-line refactor — those go in commit messages.
|
|
64
|
+
- Don't change the status of an existing ADR retroactively. Supersede it.
|
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: add-feature
|
|
3
|
+
description: Use this skill whenever the user asks to add, implement, or build a new feature, capability, endpoint, page, command, or anything user-visible. Enforces the Anthropic two-fold harness pattern — read feature_list.json, pick exactly one feature, implement incrementally, run the structural test on every save, and never declare "done" without updating the JSON. Always invoke this skill instead of writing new feature code freehand.
|
|
4
|
+
allowed-tools: Read, Edit, Write, Bash(npm run:*), Bash(pytest:*), Bash(ruff:*), Bash(git:*), Glob, Grep
|
|
5
|
+
suggested-turns: 25
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Các bước
|
|
9
|
+
|
|
10
|
+
1. **Đọc `feature_list.json`.** Xác nhận feature tồn tại và `passes:
|
|
11
|
+
false`. Nếu user mô tả feature chưa có trong danh sách, **dừng lại**:
|
|
12
|
+
hỏi xem có nên thêm qua `/add-adr` trước không.
|
|
13
|
+
2. **Đọc `docs/architecture.md`** cho domain bị ảnh hưởng. Xác định
|
|
14
|
+
những layer nào sẽ thay đổi.
|
|
15
|
+
3. **Chạy `/inspect-module`** trên mỗi module bị ảnh hưởng. Làm điều này
|
|
16
|
+
ngay cả khi bạn nghĩ đã hiểu khu vực đó — phải kiểm chứng, không phỏng đoán.
|
|
17
|
+
4. **Lập kế hoạch trước.** Viết một đoạn văn ngắn vào `.harness/PLAN.md`
|
|
18
|
+
*trước khi thay đổi code*. (Pattern theo Anthropic Claude 4 prompt-guide.)
|
|
19
|
+
5. **Bắt đầu từ thay đổi nhỏ nhất.** Sửa ít nhất đủ để một `steps[]`
|
|
20
|
+
chuyển từ failing → passing.
|
|
21
|
+
6. **Chạy structural test.** {{#if isPython}}`python -m harness.structural_test`{{else}}`npm run harness:check`{{/if}}.
|
|
22
|
+
Nếu fail, sửa vi phạm trước khi tiếp tục — không bao giờ disable test.
|
|
23
|
+
7. **Smoke test.** Chạy smoke test liên quan trong `scripts/dev-up.sh`.
|
|
24
|
+
8. **Cập nhật `feature_list.json` CHỈ** bằng cách đổi field `passes` của
|
|
25
|
+
một item. Không bao giờ xóa hoặc viết lại items. (Quy tắc Anthropic
|
|
26
|
+
"JSON hơn Markdown": "model ít có khả năng thay đổi/ghi đè JSON files
|
|
27
|
+
so với Markdown.")
|
|
28
|
+
9. **Append vào PROGRESS.** Một dòng vào `.harness/PROGRESS.md`:
|
|
29
|
+
`YYYY-MM-DD HH:MM | <feature_id> | done`.
|
|
30
|
+
10. **Commit.** Message: `feat(<domain>): <feature_id> - <ngắn gọn>`.
|
|
31
|
+
|
|
32
|
+
## Các failure mode cần tránh (mỗi dòng tương ứng một lỗi đã quan sát thực tế)
|
|
33
|
+
|
|
34
|
+
- Không tuyên bố feature đã xong khi chưa chạy smoke test.
|
|
35
|
+
- Không đánh dấu `passes: true` khi structural test còn failing.
|
|
36
|
+
- Không thêm feature mới vào `feature_list.json` giữa session — đề xuất
|
|
37
|
+
cho session sau qua ADR.
|
|
38
|
+
- Không refactor code không liên quan trong cùng commit.
|
|
39
|
+
|
|
40
|
+
## Output contract
|
|
41
|
+
|
|
42
|
+
Sau khi triển khai, tóm tắt:
|
|
43
|
+
|
|
44
|
+
```
|
|
45
|
+
### Feature: <id>
|
|
46
|
+
### Files changed: <list>
|
|
47
|
+
### Structural test: PASS|FAIL
|
|
48
|
+
### Smoke test: PASS|FAIL
|
|
49
|
+
### Reviewer subagents to invoke: architecture-reviewer, security-reviewer (nếu chạm auth/IO), reliability-reviewer (nếu chạm retries/timeouts)
|
|
50
|
+
```
|
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
<!-- LOCALE_TODO: translate body to vi -->
|
|
2
|
+
<!-- Source: .claude/skills/debug-flow/SKILL.md.hbs -->
|
|
3
|
+
<!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
name: debug-flow
|
|
7
|
+
description: Use this skill whenever the user reports a bug, unexpected output, or "this doesn't work". Runs the dev server, drives the failing flow via Playwright MCP if installed (else captures stdout/stderr), and produces a minimal repro before any fix. Mirrors the OpenAI Chrome-DevTools-Protocol-into-runtime pattern at solo scale — verify the failure before you propose a fix.
|
|
8
|
+
allowed-tools: Read, Bash({{devCmd}}), Bash(curl:*), Bash(playwright:*), Bash(scripts/dev-up.sh)
|
|
9
|
+
suggested-turns: 20
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Steps
|
|
13
|
+
|
|
14
|
+
1. **Start the dev server** via `scripts/dev-up.sh`. Wait for the readiness
|
|
15
|
+
probe.
|
|
16
|
+
2. **Drive the failing flow.**
|
|
17
|
+
- If the bug is UI: use Playwright MCP (`mcp__playwright__*`) — the
|
|
18
|
+
Anthropic claude.ai-clone pattern.
|
|
19
|
+
- If MCP unavailable: fall back to `curl -i` + screenshot via
|
|
20
|
+
`scrot`/`screencapture`/`gnome-screenshot`.
|
|
21
|
+
3. **Capture context.** Request payload (if any), response status, stderr
|
|
22
|
+
tail (last 50 lines), last 3 git commits.
|
|
23
|
+
4. **Write a minimal repro** to `.harness/repros/<date>-<slug>.md` with:
|
|
24
|
+
environment, steps, expected, actual.
|
|
25
|
+
5. **Only then propose a fix.** Run the structural test and the relevant
|
|
26
|
+
smoke test after the fix. Re-run the repro to confirm.
|
|
27
|
+
|
|
28
|
+
## Output contract
|
|
29
|
+
|
|
30
|
+
```
|
|
31
|
+
### Repro saved: .harness/repros/<filename>
|
|
32
|
+
### Failure mode: <one-line summary>
|
|
33
|
+
### Smallest failing input: <code or curl command>
|
|
34
|
+
### Proposed fix location: <file:line>
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
## Anti-patterns
|
|
38
|
+
|
|
39
|
+
- Don't propose a fix before reproducing the bug locally.
|
|
40
|
+
- Don't fix more than the user reported in the same commit.
|
|
41
|
+
- Don't add a defensive try/except over the failing call without
|
|
42
|
+
understanding why it fails.
|
|
@@ -0,0 +1,52 @@
|
|
|
1
|
+
<!-- LOCALE_TODO: translate body to vi -->
|
|
2
|
+
<!-- Source: .claude/skills/doc-drift-scan/SKILL.md -->
|
|
3
|
+
<!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
name: doc-drift-scan
|
|
7
|
+
description: Use this skill weekly, before releases, or when the user mentions "stale docs", "doc drift", "docs are wrong", or "the README is out of date". Cross-checks every code path, file path, and command referenced in `docs/` and `CLAUDE.md` against the current repo state and produces a list of stale references — the doc-gardening agent pattern.
|
|
8
|
+
allowed-tools: Read, Glob, Grep, Bash(test -e:*), Bash(command -v:*), Bash(node .claude/skills/doc-drift-scan/scripts/scan-paths.mjs:*)
|
|
9
|
+
suggested-turns: 8
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Steps
|
|
13
|
+
|
|
14
|
+
1. **Extract references + validate (deterministic).** Run the side-car
|
|
15
|
+
script — walks `docs/**/*.md` + `CLAUDE.md`, extracts every backtick-path
|
|
16
|
+
containing a slash, checks `existsSync` per ref:
|
|
17
|
+
|
|
18
|
+
```bash
|
|
19
|
+
node .claude/skills/doc-drift-scan/scripts/scan-paths.mjs
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
Read the JSON: `{ stats: { docs_scanned, refs_found, refs_missing },
|
|
23
|
+
drift: [{ doc, ref }] }`. Replaces three LLM grep turns.
|
|
24
|
+
2. **Validate commands (LLM judgment, narrow).** Optional second pass for
|
|
25
|
+
backtick-commands the side-car doesn't classify (no slash → not a path).
|
|
26
|
+
Use `command -v <cmd>` and allow a small allowlist (`jq`, `gh`, `rg`).
|
|
27
|
+
3. **Group findings.**
|
|
28
|
+
- `missing-paths`: file moved or deleted.
|
|
29
|
+
- `wrong-layer-claim`: doc says module is in layer X, structural test
|
|
30
|
+
says layer Y.
|
|
31
|
+
- `outdated-commands`: command no longer exists or signature changed.
|
|
32
|
+
4. **Open ONE PR.** Label `doc-drift`. Patch all findings in one commit. Do
|
|
33
|
+
not merge.
|
|
34
|
+
|
|
35
|
+
## Output contract
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
### Doc-drift scan: <date>
|
|
39
|
+
### References checked: <N>
|
|
40
|
+
### Drifted: <count>
|
|
41
|
+
### PR opened: #<n>
|
|
42
|
+
### Top 3 drifts:
|
|
43
|
+
1. ...
|
|
44
|
+
2. ...
|
|
45
|
+
3. ...
|
|
46
|
+
```
|
|
47
|
+
|
|
48
|
+
## Anti-patterns
|
|
49
|
+
|
|
50
|
+
- Don't fix code to match docs. Fix docs to match code.
|
|
51
|
+
- Don't propose deleting a doc that drifted — drift means the doc was once
|
|
52
|
+
useful. Update or supersede it.
|
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
<!-- LOCALE_TODO: translate body to vi -->
|
|
2
|
+
<!-- Source: .claude/skills/eval-runner/SKILL.md -->
|
|
3
|
+
<!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
name: eval-runner
|
|
7
|
+
description: Use this skill whenever a skill, subagent, or hook is changed, before merging to main, or when the user mentions "eval", "regression test for the harness", or "is the harness still working". This skill is a thin wrapper — it runs a single shell command (`npm run harness:eval` or `python -m harness.eval_runner`) and summarizes the JSONL output. Do not implement the eval logic yourself; the runner is already deterministic.
|
|
8
|
+
allowed-tools: Bash(npm run harness:eval:*), Bash(python -m harness.eval_runner:*), Read
|
|
9
|
+
suggested-turns: 2
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## When to use
|
|
13
|
+
|
|
14
|
+
The user said any of: "run the evals", "regression-test the harness", "is the
|
|
15
|
+
harness still working", "eval the kit changes".
|
|
16
|
+
|
|
17
|
+
## Steps
|
|
18
|
+
|
|
19
|
+
This is a **2-turn skill**. Don't over-engineer.
|
|
20
|
+
|
|
21
|
+
1. **Run the script.** Pick the right invocation based on stack:
|
|
22
|
+
- TypeScript: `npm run harness:eval -- --quick --transport=mock`
|
|
23
|
+
- Python: `python -m harness.eval_runner --quick --transport=mock`
|
|
24
|
+
|
|
25
|
+
Use `--quick` (3 tasks, ~30s) by default. Switch to `--full` only if the
|
|
26
|
+
user asked for it. Use `--transport=claude-cli` only if the user explicitly
|
|
27
|
+
wants a real (paid) run.
|
|
28
|
+
|
|
29
|
+
2. **Summarize the JSONL output.** Read the latest file in
|
|
30
|
+
`.harness/eval/results/` and produce:
|
|
31
|
+
|
|
32
|
+
```
|
|
33
|
+
### Eval run: <sha>
|
|
34
|
+
### Set: quick | full
|
|
35
|
+
### Transport: mock | claude-cli
|
|
36
|
+
### Tasks: <pass>/<total>
|
|
37
|
+
### Failed dimensions:
|
|
38
|
+
- <task-id>.outcome: <info>
|
|
39
|
+
- <task-id>.process: missing skills [<list>]
|
|
40
|
+
### Verdict: PASS | FAIL
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
That's it. Stop after the summary.
|
|
44
|
+
|
|
45
|
+
## What NOT to do
|
|
46
|
+
|
|
47
|
+
- **Don't re-implement the eval logic.** The runner is a deterministic script.
|
|
48
|
+
If you find yourself writing tool calls one by one, you've misread this skill.
|
|
49
|
+
- **Don't run `--full --transport=claude-cli` without permission.** That's a
|
|
50
|
+
~$2 paid API run. Always confirm first.
|
|
51
|
+
- **Don't fix failing tasks.** This skill reports the result; fixing is a
|
|
52
|
+
separate task that uses `/structural-test-author` or
|
|
53
|
+
`/propose-harness-improvement`.
|
|
54
|
+
|
|
55
|
+
## Anti-pattern flag
|
|
56
|
+
|
|
57
|
+
If you (the agent reading this skill) feel like you need >3 turns to do this,
|
|
58
|
+
stop and re-read the steps. The whole skill is: spawn a subprocess, read a
|
|
59
|
+
file, format a summary.
|
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: garbage-collection
|
|
3
|
+
description: Use this skill on Fridays, before tagging a release, or when the user mentions "cleanup", "tech debt", "AI slop", "GC", or "garbage collection". Runs the deterministic linters, structural tests, and doc-drift scans, then proposes the top-3 highest-leverage cleanups (with risk/cost/benefit) — does NOT auto-merge. This is the solo-dev shrunk version of OpenAI's Friday garbage-collection ritual.
|
|
4
|
+
allowed-tools: Read, Glob, Grep, Bash(npm run:*), Bash(pytest:*), Bash(ruff:*), Bash(git:*), Bash(gh:*), Bash(node .claude/skills/garbage-collection/scripts/gc-classify.mjs:*)
|
|
5
|
+
suggested-turns: 15
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Các bước
|
|
9
|
+
|
|
10
|
+
1. **Lấy baseline.** Chạy toàn bộ test suite và lưu output:
|
|
11
|
+
- {{#if isPython}}`python -m harness.structural_test`{{else}}`npm run harness:check`{{/if}}
|
|
12
|
+
- {{#if isPython}}`ruff check . --output-format json`{{else}}`npm run lint -- --format json`{{/if}}
|
|
13
|
+
- Lưu vào `.harness/gc-<date>.json`.
|
|
14
|
+
2. **Phân loại vi phạm.** Với mỗi finding:
|
|
15
|
+
- **Vi phạm layer** → sửa.
|
|
16
|
+
- **Duplicate utility** (cùng function body ở 2+ chỗ) → đề xuất
|
|
17
|
+
trích xuất sang `src/shared/`.
|
|
18
|
+
- **Dead import** → xóa.
|
|
19
|
+
- **Doc drift** (đường dẫn trong `docs/architecture.md` không còn tồn tại) →
|
|
20
|
+
gọi skill `doc-drift-scan`.
|
|
21
|
+
- **Helper viết tay** trùng với shared utility đã có → đề xuất thay thế.
|
|
22
|
+
3. **Chấm điểm** từng candidate fix trên ba thang điểm 1–5 qua side-car
|
|
23
|
+
script (thay thế LLM-scored turn trước đây — deterministic và có thể
|
|
24
|
+
kiểm chứng):
|
|
25
|
+
|
|
26
|
+
```bash
|
|
27
|
+
node .claude/skills/garbage-collection/scripts/gc-classify.mjs \
|
|
28
|
+
--baseline .harness/gc-<date>.json \
|
|
29
|
+
--history .harness/gc-history.json
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
Script áp dụng rubric cơ học: `risk = 1 + ceil(touched/3)`,
|
|
33
|
+
`cost = 1 + ceil(lines/30)`, `benefit = recurrenceCount(class)`. Đọc
|
|
34
|
+
`candidates[]` JSON đã sort theo `(benefit desc, cost asc, risk asc)`.
|
|
35
|
+
4. **Chỉ đề xuất top 3** cleanup (mức cap cho solo-dev; OpenAI làm hàng chục,
|
|
36
|
+
bạn làm 3). Mở thành các PR riêng bằng `gh pr create --label gc --draft`.
|
|
37
|
+
5. **Append một row** vào `.harness/gc-history.json`:
|
|
38
|
+
`{ "date": "...", "violations_found": N, "fixes_opened": M, "total_tokens": K }`.
|
|
39
|
+
6. **Dừng lại.** Không merge gì cả. Phải có human review ở scale solo.
|
|
40
|
+
|
|
41
|
+
## Output contract
|
|
42
|
+
|
|
43
|
+
```
|
|
44
|
+
### GC run: <date>
|
|
45
|
+
### Violations found: <N>
|
|
46
|
+
### Top 3 fixes proposed:
|
|
47
|
+
1. <slug> — risk:<1-5> cost:<1-5> benefit:<1-5> — PR #<n>
|
|
48
|
+
2. ...
|
|
49
|
+
3. ...
|
|
50
|
+
### Fixes deferred (not in top 3): <count>
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
## Anti-patterns
|
|
54
|
+
|
|
55
|
+
- Không mở quá 3 PR trong một lần chạy. Cap đó là feature, không phải bug.
|
|
56
|
+
- Không auto-merge — toàn bộ ý nghĩa của GC ở solo scale là human review.
|
|
57
|
+
- Không bỏ qua các findings điểm thấp; record chúng vào
|
|
58
|
+
`.harness/gc-history.json` để xu hướng nổi lên theo thời gian.
|
|
@@ -0,0 +1,52 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: i18n-add-locale
|
|
3
|
+
description: Use this skill to scaffold a new human-language locale for the kit's skills/agents/CLAUDE.md. Mirrors every existing SKILL.md.hbs into a SKILL.md.<locale>.hbs stub so a translator (or LLM) can edit copy without touching machine-readable frontmatter. Default locale codes — vi, ja, fr, es, de — but accepts any 2-5 char code.
|
|
4
|
+
allowed-tools: Read, Write, Bash(node scripts/locale-scaffold.mjs:*)
|
|
5
|
+
suggested-turns: 4
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## When to invoke
|
|
9
|
+
|
|
10
|
+
- Adding a new human language to the kit (or a fork's downstream).
|
|
11
|
+
- After upstream adds a new English skill — re-running this skill scaffolds
|
|
12
|
+
the locale stubs for the new file, leaving translated files untouched.
|
|
13
|
+
|
|
14
|
+
## Steps
|
|
15
|
+
|
|
16
|
+
1. **Pick a locale.** Two-to-five char ISO code (vi, ja, fr-CA, etc.).
|
|
17
|
+
2. **Dry-run the scan.**
|
|
18
|
+
```
|
|
19
|
+
node .claude/skills/i18n-add-locale/scripts/locale-scaffold.mjs \
|
|
20
|
+
--locale <code> --dry-run
|
|
21
|
+
```
|
|
22
|
+
Lists every SKILL.md / SKILL.md.hbs that lacks a `.<locale>.hbs` sibling.
|
|
23
|
+
3. **Materialize stubs.**
|
|
24
|
+
```
|
|
25
|
+
node .claude/skills/i18n-add-locale/scripts/locale-scaffold.mjs \
|
|
26
|
+
--locale <code>
|
|
27
|
+
```
|
|
28
|
+
For each missing sibling, copies the English master and prepends a
|
|
29
|
+
`<!-- LOCALE_TODO: translate body --> ` banner so the translator can grep
|
|
30
|
+
for pending work.
|
|
31
|
+
4. **Register the locale.** Edit `src/core/render-templates.mjs` and add the
|
|
32
|
+
code to `SUPPORTED_HUMAN_LANGS`. The renderer picks the variant via
|
|
33
|
+
`--locale <code>` or `HARNESS_LOCALE` env.
|
|
34
|
+
5. **Verify rendering** by running `agent-harness-kit init --locale <code>`
|
|
35
|
+
in a scratch dir and grepping the output for `LOCALE_TODO` (must be
|
|
36
|
+
zero before publishing).
|
|
37
|
+
|
|
38
|
+
## Output contract
|
|
39
|
+
|
|
40
|
+
```
|
|
41
|
+
locale: <code>
|
|
42
|
+
scaffolded: <N>
|
|
43
|
+
already-present: <M>
|
|
44
|
+
register-in: src/core/render-templates.mjs (SUPPORTED_HUMAN_LANGS)
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
## Anti-patterns
|
|
48
|
+
|
|
49
|
+
- Don't translate the YAML frontmatter — the renderer + Claude Code parse
|
|
50
|
+
it as machine-readable. Translate only the body markdown.
|
|
51
|
+
- Don't drop the master `.hbs` file. The locale stub *augments*; the
|
|
52
|
+
renderer falls back to the master when the locale variant is missing.
|
|
@@ -0,0 +1,56 @@
|
|
|
1
|
+
<!-- LOCALE_TODO: translate body to vi -->
|
|
2
|
+
<!-- Source: .claude/skills/i18n-add-locale/SKILL.md -->
|
|
3
|
+
<!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
name: i18n-add-locale
|
|
7
|
+
description: Use this skill to scaffold a new human-language locale for the kit's skills/agents/CLAUDE.md. Mirrors every existing SKILL.md.hbs into a SKILL.md.<locale>.hbs stub so a translator (or LLM) can edit copy without touching machine-readable frontmatter. Default locale codes — vi, ja, fr, es, de — but accepts any 2-5 char code.
|
|
8
|
+
allowed-tools: Read, Write, Bash(node scripts/locale-scaffold.mjs:*)
|
|
9
|
+
suggested-turns: 4
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## When to invoke
|
|
13
|
+
|
|
14
|
+
- Adding a new human language to the kit (or a fork's downstream).
|
|
15
|
+
- After upstream adds a new English skill — re-running this skill scaffolds
|
|
16
|
+
the locale stubs for the new file, leaving translated files untouched.
|
|
17
|
+
|
|
18
|
+
## Steps
|
|
19
|
+
|
|
20
|
+
1. **Pick a locale.** Two-to-five char ISO code (vi, ja, fr-CA, etc.).
|
|
21
|
+
2. **Dry-run the scan.**
|
|
22
|
+
```
|
|
23
|
+
node .claude/skills/i18n-add-locale/scripts/locale-scaffold.mjs \
|
|
24
|
+
--locale <code> --dry-run
|
|
25
|
+
```
|
|
26
|
+
Lists every SKILL.md / SKILL.md.hbs that lacks a `.<locale>.hbs` sibling.
|
|
27
|
+
3. **Materialize stubs.**
|
|
28
|
+
```
|
|
29
|
+
node .claude/skills/i18n-add-locale/scripts/locale-scaffold.mjs \
|
|
30
|
+
--locale <code>
|
|
31
|
+
```
|
|
32
|
+
For each missing sibling, copies the English master and prepends a
|
|
33
|
+
`<!-- LOCALE_TODO: translate body --> ` banner so the translator can grep
|
|
34
|
+
for pending work.
|
|
35
|
+
4. **Register the locale.** Edit `src/core/render-templates.mjs` and add the
|
|
36
|
+
code to `SUPPORTED_HUMAN_LANGS`. The renderer picks the variant via
|
|
37
|
+
`--locale <code>` or `HARNESS_LOCALE` env.
|
|
38
|
+
5. **Verify rendering** by running `agent-harness-kit init --locale <code>`
|
|
39
|
+
in a scratch dir and grepping the output for `LOCALE_TODO` (must be
|
|
40
|
+
zero before publishing).
|
|
41
|
+
|
|
42
|
+
## Output contract
|
|
43
|
+
|
|
44
|
+
```
|
|
45
|
+
locale: <code>
|
|
46
|
+
scaffolded: <N>
|
|
47
|
+
already-present: <M>
|
|
48
|
+
register-in: src/core/render-templates.mjs (SUPPORTED_HUMAN_LANGS)
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
## Anti-patterns
|
|
52
|
+
|
|
53
|
+
- Don't translate the YAML frontmatter — the renderer + Claude Code parse
|
|
54
|
+
it as machine-readable. Translate only the body markdown.
|
|
55
|
+
- Don't drop the master `.hbs` file. The locale stub *augments*; the
|
|
56
|
+
renderer falls back to the master when the locale variant is missing.
|
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
#!/usr/bin/env node
|
|
2
|
+
// locale-scaffold.mjs — deterministic step for /i18n-add-locale.
|
|
3
|
+
// Walks .claude/skills/* (and a small whitelist of other files) and scaffolds
|
|
4
|
+
// missing `.<locale>.hbs` siblings from their English masters.
|
|
5
|
+
//
|
|
6
|
+
// Usage:
|
|
7
|
+
// locale-scaffold.mjs --locale vi [--dry-run] [--root .claude/skills]
|
|
8
|
+
//
|
|
9
|
+
// A "master" file is either:
|
|
10
|
+
// - <stem>.md.hbs → scaffold sibling <stem>.md.<locale>.hbs
|
|
11
|
+
// - <stem>.md → scaffold sibling <stem>.md.<locale>.hbs
|
|
12
|
+
// (rare; only when no .hbs counterpart exists)
|
|
13
|
+
//
|
|
14
|
+
// Idempotent: if the sibling already exists, the script leaves it alone.
|
|
15
|
+
|
|
16
|
+
import { readFileSync, existsSync, writeFileSync, readdirSync, statSync } from "node:fs";
|
|
17
|
+
import { resolve, join, dirname, basename, relative } from "node:path";
|
|
18
|
+
|
|
19
|
+
const ROOT = process.env.CLAUDE_PROJECT_DIR || process.cwd();
|
|
20
|
+
|
|
21
|
+
function parseArgs(argv) {
|
|
22
|
+
const out = { locale: null, dryRun: false, roots: [".claude/skills", ".claude/agents"] };
|
|
23
|
+
for (let i = 0; i < argv.length; i++) {
|
|
24
|
+
if (argv[i] === "--locale") out.locale = argv[++i];
|
|
25
|
+
else if (argv[i] === "--dry-run") out.dryRun = true;
|
|
26
|
+
else if (argv[i] === "--root") out.roots = [argv[++i]];
|
|
27
|
+
}
|
|
28
|
+
if (!out.locale || !/^[a-z]{2,5}(-[A-Z]{2})?$/.test(out.locale)) {
|
|
29
|
+
console.error("usage: locale-scaffold.mjs --locale <code> [--dry-run] [--root <dir>]");
|
|
30
|
+
console.error(" <code> = 2-5 lowercase letters, optional region (e.g. vi, ja, fr-CA)");
|
|
31
|
+
process.exit(2);
|
|
32
|
+
}
|
|
33
|
+
return out;
|
|
34
|
+
}
|
|
35
|
+
|
|
36
|
+
function* walk(dir) {
|
|
37
|
+
let entries;
|
|
38
|
+
try { entries = readdirSync(dir, { withFileTypes: true }); }
|
|
39
|
+
catch { return; }
|
|
40
|
+
for (const e of entries) {
|
|
41
|
+
const full = join(dir, e.name);
|
|
42
|
+
if (e.isDirectory()) yield* walk(full);
|
|
43
|
+
else yield full;
|
|
44
|
+
}
|
|
45
|
+
}
|
|
46
|
+
|
|
47
|
+
function isMaster(path, locale) {
|
|
48
|
+
// English masters: *.md or *.md.hbs (but NOT *.md.<locale>.hbs already).
|
|
49
|
+
const name = basename(path);
|
|
50
|
+
if (!/\.md(?:\.hbs)?$/.test(name)) return false;
|
|
51
|
+
// Skip locale-specific files of any code.
|
|
52
|
+
if (/\.md\.[a-z]{2,5}(?:-[A-Z]{2})?\.hbs$/.test(name)) return false;
|
|
53
|
+
return true;
|
|
54
|
+
}
|
|
55
|
+
|
|
56
|
+
function siblingPath(masterPath, locale) {
|
|
57
|
+
// Preserve the master's Handlebars-ness in the variant. Master .md.hbs
|
|
58
|
+
// → variant .md.<lang>.hbs (Handlebars-active). Master .md (plain) →
|
|
59
|
+
// variant .md.<lang> (no Handlebars). This matters because plain-md
|
|
60
|
+
// masters often contain literal `{{...}}` strings as examples (e.g.
|
|
61
|
+
// XSS demos in security-reviewer.md); promoting them to .hbs would
|
|
62
|
+
// make Handlebars choke on the example text.
|
|
63
|
+
const name = basename(masterPath);
|
|
64
|
+
if (name.endsWith(".md.hbs")) {
|
|
65
|
+
return join(dirname(masterPath), name.replace(/\.md\.hbs$/, `.md.${locale}.hbs`));
|
|
66
|
+
}
|
|
67
|
+
if (name.endsWith(".md")) {
|
|
68
|
+
return join(dirname(masterPath), name.replace(/\.md$/, `.md.${locale}`));
|
|
69
|
+
}
|
|
70
|
+
return null;
|
|
71
|
+
}
|
|
72
|
+
|
|
73
|
+
function scaffold(masterPath, siblingPathAbs, locale, dryRun) {
|
|
74
|
+
if (existsSync(siblingPathAbs)) return { status: "skip", reason: "exists" };
|
|
75
|
+
if (dryRun) return { status: "would-create" };
|
|
76
|
+
const body = readFileSync(masterPath, "utf8");
|
|
77
|
+
const banner =
|
|
78
|
+
`<!-- LOCALE_TODO: translate body to ${locale} -->
|
|
79
|
+
<!-- Source: ${relative(ROOT, masterPath)} -->
|
|
80
|
+
<!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
|
|
81
|
+
|
|
82
|
+
`;
|
|
83
|
+
writeFileSync(siblingPathAbs, banner + body);
|
|
84
|
+
return { status: "created" };
|
|
85
|
+
}
|
|
86
|
+
|
|
87
|
+
function main() {
|
|
88
|
+
const { locale, dryRun, roots } = parseArgs(process.argv.slice(2));
|
|
89
|
+
const masters = [];
|
|
90
|
+
for (const r of roots) {
|
|
91
|
+
const abs = resolve(ROOT, r);
|
|
92
|
+
if (!existsSync(abs)) continue;
|
|
93
|
+
for (const f of walk(abs)) {
|
|
94
|
+
if (isMaster(f, locale)) masters.push(f);
|
|
95
|
+
}
|
|
96
|
+
}
|
|
97
|
+
let created = 0;
|
|
98
|
+
let skipped = 0;
|
|
99
|
+
const wouldCreate = [];
|
|
100
|
+
for (const m of masters) {
|
|
101
|
+
const sib = siblingPath(m, locale);
|
|
102
|
+
if (!sib) continue;
|
|
103
|
+
const res = scaffold(m, sib, locale, dryRun);
|
|
104
|
+
if (res.status === "created") created++;
|
|
105
|
+
else if (res.status === "would-create") wouldCreate.push(relative(ROOT, sib));
|
|
106
|
+
else skipped++;
|
|
107
|
+
}
|
|
108
|
+
const payload = {
|
|
109
|
+
locale,
|
|
110
|
+
dry_run: dryRun,
|
|
111
|
+
scaffolded: dryRun ? wouldCreate.length : created,
|
|
112
|
+
already_present: skipped,
|
|
113
|
+
scanned_masters: masters.length,
|
|
114
|
+
would_create: dryRun ? wouldCreate : undefined,
|
|
115
|
+
register_in: "src/core/render-templates.mjs#SUPPORTED_HUMAN_LANGS",
|
|
116
|
+
};
|
|
117
|
+
process.stdout.write(JSON.stringify(payload, null, 2) + "\n");
|
|
118
|
+
}
|
|
119
|
+
|
|
120
|
+
main();
|
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
<!-- LOCALE_TODO: translate body to vi -->
|
|
2
|
+
<!-- Source: .claude/skills/inspect-app/SKILL.md -->
|
|
3
|
+
<!-- Edit only the markdown body — keep frontmatter verbatim so the kit's renderer + Claude Code parse it identically across locales. -->
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
name: inspect-app
|
|
7
|
+
description: Use this skill whenever the user asks to "test the UI", "check what the app looks like", "inspect the page", "verify the dev server is up", or before claiming a UI feature is done. Boots the dev server via scripts/dev-up.sh and drives the failing flow through Playwright MCP if installed (else falls back to curl + lightweight HTML capture). Mirrors the OpenAI Chrome-DevTools-Protocol-into-runtime pattern at solo scale — verify the running app, don't trust the type checker alone.
|
|
8
|
+
allowed-tools: Read, Bash(scripts/dev-up.sh), Bash(curl:*), Bash(playwright:*), Bash(node:*)
|
|
9
|
+
suggested-turns: 12
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## When to use
|
|
13
|
+
|
|
14
|
+
The user said any of: "what does the page look like", "test the UI flow",
|
|
15
|
+
"is the dev server up", "before merging the UI work", or invoked this skill
|
|
16
|
+
explicitly via `/inspect-app`. Also auto-invokes from `/debug-flow` when the
|
|
17
|
+
bug is UI-shaped.
|
|
18
|
+
|
|
19
|
+
## Steps
|
|
20
|
+
|
|
21
|
+
1. **Detect dev server.** Read `harness.config.json` for the framework.
|
|
22
|
+
- If a process is already listening on the expected port (3000 / 8000 /
|
|
23
|
+
5000 depending on framework), reuse it.
|
|
24
|
+
- Else: `bash scripts/dev-up.sh &` in the background and wait up to 30s
|
|
25
|
+
for the readiness probe.
|
|
26
|
+
2. **Capture mode — Playwright MCP (preferred).** If `mcp__playwright__*`
|
|
27
|
+
tools are available:
|
|
28
|
+
- `mcp__playwright__browser_navigate` to the target URL
|
|
29
|
+
- `mcp__playwright__browser_snapshot` for accessibility tree
|
|
30
|
+
- `mcp__playwright__browser_take_screenshot` for a visual
|
|
31
|
+
- Optionally drive a single user flow (click → fill → click → wait)
|
|
32
|
+
3. **Capture mode — curl fallback.** If MCP is unavailable:
|
|
33
|
+
- `curl -i -s -o response.body "http://localhost:$PORT$PATH"` for headers + body
|
|
34
|
+
- `wc -l response.body` to size-check
|
|
35
|
+
- `grep -E '<title>|<h1>' response.body | head -5` for sanity
|
|
36
|
+
4. **Diff against expectation.** If the user gave an expected element /
|
|
37
|
+
text, grep for it. If not, just report what's on the page.
|
|
38
|
+
5. **Cleanup.** Kill the dev server we started (don't kill ones already
|
|
39
|
+
running before this session).
|
|
40
|
+
|
|
41
|
+
## Output contract
|
|
42
|
+
|
|
43
|
+
```
|
|
44
|
+
### App inspection
|
|
45
|
+
**URL:** http://localhost:<port><path>
|
|
46
|
+
**Status:** <HTTP status>
|
|
47
|
+
**Title:** <page title or first H1>
|
|
48
|
+
**Mode:** playwright-mcp | curl-fallback
|
|
49
|
+
**Screenshot:** <path or "n/a">
|
|
50
|
+
**Findings:** <bulleted list of matches/mismatches against expectation>
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
## Anti-patterns
|
|
54
|
+
|
|
55
|
+
- Don't claim a UI feature is done without running this skill once.
|
|
56
|
+
- Don't leave a dev server running after the inspection — kill what you
|
|
57
|
+
started.
|
|
58
|
+
- Don't grep for "Error" alone as a failure signal — many pages legitimately
|
|
59
|
+
contain that word. Match against specific expected text instead.
|
|
60
|
+
- Don't take screenshots of pages with secrets or test fixtures with PII —
|
|
61
|
+
the screenshot lands on disk.
|