@htechcs/harness-kit 0.1.0 → 0.1.1
- package/README.en.md +8 -8
- package/README.md +8 -8
- package/bin/cli.js +43 -43
- package/docs/harness-engineering-tutorial.en.md +1 -1
- package/docs/harness-engineering-tutorial.md +1 -1
- package/package.json +1 -1
- package/skills/init-harness/SKILL.md +74 -74
- package/templates/agents/README.md +25 -24
- package/templates/agents/repo-explorer.md +16 -16
- package/templates/evals/README.md +39 -35
- package/templates/evals/cases/example-task.md +22 -22
- package/templates/evals/observability.md +43 -42
- package/templates/guardrails/README.md +59 -57
- package/templates/long-running/README.md +29 -28
- package/templates/long-running/TASK.md +19 -19
- package/templates/mcp-audit.md +16 -16
- package/templates/new-worktree.sh +16 -16
- package/templates/setup.sh +25 -25
- package/templates/spec/FEATURE.md +19 -19
- package/templates/spec/README.md +20 -20
package/README.en.md
CHANGED
@@ -1,6 +1,6 @@
 <div align="center">
 
-#
+# harness-kit
 
 **Set up any repo for a coding agent (Claude Code) across the 5 maturity levels of harness engineering — in one command.**
 
@@ -19,7 +19,7 @@ orchestration, measurement) that makes an agent reliable. This kit packages it i
 **artifacts** plus a **tutorial** that teaches *when* to use each. The kit ships files; the tutorial
 teaches the discipline.
 
-##
+## 1. How it works
 
 ```mermaid
 flowchart LR
@@ -36,7 +36,7 @@ flowchart LR
 
 One `npx` command drops artifacts in the right places; from there Claude Code reads them every session.
 
-##
+## 2. Quick install
 
 ```bash
 npx @htechcs/harness-kit # pick levels interactively, then install
@@ -47,7 +47,7 @@ npx @htechcs/harness-kit --levels=1,3 # specific levels only
 Requires **Node ≥18**. The command saves all docs into `docs/harness/` so your team keeps them.
 **Idempotent** — safe to re-run (`--force` to overwrite).
 
-##
+## 3. The 5 levels — each prevents a failure mode
 
 ```mermaid
 flowchart TD
@@ -66,7 +66,7 @@ flowchart TD
 | **4 — Long-running** | long tasks break mid-way, can't resume | `setup.sh`, `new-worktree.sh`, `TASK.md` |
 | **5 — Evals & Obs** | no idea whether the agent does well or badly | golden-task template + observability guide |
 
-##
+## 4. Manual install per level
 
 > The installer just automates the `cp` commands below — expand them to understand/do it by hand.
 
@@ -122,18 +122,18 @@ mkdir -p docs/specs && cp templates/spec/FEATURE.md docs/specs/<feature>.md
 ```
 </details>
 
-##
+## 5. Level dependencies
 
 - **Level 1 first** — it's the backbone; later levels reference `CLAUDE.md`.
 - **Levels 3 & 4** both "point `CLAUDE.md` to" their artifacts → require Level 1 done.
 - **Level 5** needs at least one level applied to have a change to measure (see the feedback loop above).
 
-##
+## 6. Docs
 
 `docs/harness-engineering-tutorial.en.md` ([Tiếng Việt](docs/harness-engineering-tutorial.md)) — the
 why + *when* to use each piece (after install: `docs/harness/`). Full industry-wide source catalog:
 [Awesome Harness Engineering](https://github.com/walkinglabs/awesome-harness-engineering).
 
-##
+## 7. License
 
 [MIT](LICENSE).
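The dependency rules in the "Level dependencies" section can be read as a small validation step. The sketch below is illustrative only: `checkLevelDependencies` is a hypothetical helper that does not appear in the published installer, it assumes levels are passed as string keys (as with `--levels=1,3`), and it reads "at least one level applied" as "at least one level other than 5".

```javascript
'use strict'

// Hypothetical helper (not part of the kit): list which README dependency
// rules a requested set of levels would violate.
function checkLevelDependencies(requested, alreadyApplied = []) {
  const wanted = requested.map(String)
  const have = new Set([...wanted, ...alreadyApplied.map(String)])
  const problems = []

  // Levels 3 and 4 both point CLAUDE.md at their artifacts, so Level 1 must exist.
  for (const lv of ['3', '4']) {
    if (wanted.includes(lv) && !have.has('1')) {
      problems.push(`Level ${lv} requires Level 1 (CLAUDE.md)`)
    }
  }

  // Level 5 needs some other level in place so there is a change to measure.
  if (wanted.includes('5')) {
    const others = [...have].filter((lv) => lv !== '5')
    if (others.length === 0) problems.push('Level 5 needs at least one other level applied')
  }
  return problems
}

console.log(checkLevelDependencies(['3']))
console.log(checkLevelDependencies(['1', '3', '5']))
```

An empty array means the requested combination respects the stated ordering; the second argument lets a caller account for levels installed in an earlier run.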
package/README.md
CHANGED
@@ -1,6 +1,6 @@
 <div align="center">
 
-#
+# harness-kit
 
 **Set up a repo for a coding agent (Claude Code) across the 5 maturity levels of harness engineering, with a single command.**
 
@@ -18,7 +18,7 @@
 measurement) that makes an agent work reliably. This kit packages it into installable **artifacts** plus a
 **tutorial** that teaches *when* to use each one. The kit ships files; the tutorial teaches the discipline.
 
-##
+## 1. How it works
 
 ```mermaid
 flowchart LR
@@ -35,7 +35,7 @@ flowchart LR
 
 One `npx` command drops the artifacts in the right places; from then on Claude Code reads them in every session.
 
-##
+## 2. Quick install
 
 ```bash
 npx @htechcs/harness-kit # asks which levels to pick, then installs
@@ -46,7 +46,7 @@ npx @htechcs/harness-kit --levels=1,3 # specific levels only
 Requires **Node ≥18**. The command saves all docs into `docs/harness/` so the team keeps them. **Idempotent**:
 safe to re-run (`--force` to overwrite).
 
-##
+## 3. The 5 levels: each one blocks a failure mode
 
 ```mermaid
 flowchart TD
@@ -65,7 +65,7 @@ flowchart TD
 | **4 — Long-running** | long tasks break mid-way and can't resume | `setup.sh`, `new-worktree.sh`, `TASK.md` |
 | **5 — Evals & Obs** | no way to tell whether the agent does well or badly | golden-task template + observability guide |
 
-##
+## 4. Manual install per level
 
 > The installer only automates the exact `cp` commands below; expand them if you want to understand them or do it by hand.
 
@@ -121,18 +121,18 @@ mkdir -p docs/specs && cp templates/spec/FEATURE.md docs/specs/<feature>.md
 ```
 </details>
 
-##
+## 5. Level dependencies
 
 - **Level 1 first**: it is the backbone; later levels refer to `CLAUDE.md`.
 - **Levels 3 & 4** both "point `CLAUDE.md` to" their artifacts → Level 1 must be done first.
 - **Level 5** needs at least one level already applied so there is a change to measure (see the feedback loop in the diagram above).
 
-##
+## 6. Docs
 
 `docs/harness-engineering-tutorial.md` ([English](docs/harness-engineering-tutorial.en.md)): the why +
 *when* to use each piece (after install: `docs/harness/`). Full industry-wide source catalog:
 [Awesome Harness Engineering](https://github.com/walkinglabs/awesome-harness-engineering).
 
-##
+## 7. License
 
 [MIT](LICENSE).
package/bin/cli.js
CHANGED
@@ -1,59 +1,59 @@
 #!/usr/bin/env node
 'use strict'
 
-// harness-kit — installer
-//
-//
-//
-//
+// harness-kit — installer for the Harness Engineering starter kit.
+// Run: npx @htechcs/harness-kit (pick levels)
+//      npx @htechcs/harness-kit --all (install all, no prompt)
+//      npx @htechcs/harness-kit --levels=1,3
+// Extra flags: --target=<dir> (default: current directory), --force (overwrite), --help.
 //
-//
-//
+// Principles: IDEMPOTENT (safe to re-run — existing files are skipped unless --force)
+// and FAIL-SOFT (a missing artifact is warned about, it doesn't abort the whole install).
 
 const fs = require('fs')
 const path = require('path')
 const os = require('os')
 const readline = require('readline')
 
-const KIT = path.resolve(__dirname, '..') //
+const KIT = path.resolve(__dirname, '..') // package root (holds templates/, skills/, docs/)
 const HOME = os.homedir()
-let TARGET = process.cwd() //
+let TARGET = process.cwd() // reset in main(); used to print tidy relative paths
 
-// ----
+// ---- Artifact map per level: [source-in-kit, destination]. Absolute dest = user-level install. ----
 const LEVELS = {
   '1': {
-    title: '
+    title: 'Level 1 — Foundation (CLAUDE.md)',
     copy: [['skills/init-harness', path.join(HOME, '.claude', 'skills', 'init-harness')]],
-    next: '
+    next: 'Run /init-harness in the target repo to generate CLAUDE.md.',
   },
   '2': {
-    title: '
+    title: 'Level 2 — Clean context',
     copy: [['templates/agents/repo-explorer.md', '.claude/agents/repo-explorer.md']],
-    next: '
+    next: 'Read docs/harness/agents-README.md (subagent rules) + mcp-audit.md (prune unused MCP servers).',
   },
   '3': {
-    title: '
+    title: 'Level 3 — Guardrails',
     copy: [['templates/settings.json', '.claude/settings.json']],
-    next: '
+    next: 'OPEN .claude/settings.json → add your repo\'s test/lint commands to "allow" (see docs/harness/guardrails-README.md). Do this FIRST.',
   },
   '4': {
-    title: '
+    title: 'Level 4 — Long-running',
     copy: [
       ['templates/setup.sh', 'setup.sh'],
       ['templates/new-worktree.sh', 'new-worktree.sh'],
     ],
     chmod: ['setup.sh', 'new-worktree.sh'],
-    next: '
+    next: 'Fill in setup.sh for your repo; copy docs/harness/TASK.md out when you start a long task.',
   },
   '5': {
-    title: '
+    title: 'Level 5 — Evals & Observability',
     copy: [['templates/evals/cases/example-task.md', 'evals/cases/example-task.md']],
-    next: '
+    next: 'Read docs/harness/evals-README.md (includes a no-harness baseline step) + observability.md.',
   },
 }
 
-// ----
-// (npx cache
+// ---- Guidance docs: always copied into <target>/docs/harness/ so the team keeps them ----
+// (the npx cache disappears after the run, so the guidance must land in the repo.)
 const GUIDE_DIR = path.join('docs', 'harness')
 const GUIDE = [
   ['docs/harness-engineering-tutorial.md', 'tutorial.md'],
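The `LEVELS` comment says an absolute destination means a user-level install, while the other destinations are repo-relative. The diff does not show the code that resolves them, so the following is only one plausible reading: `resolveDest` and the `my-repo` target are made-up names for illustration.

```javascript
'use strict'
const os = require('os')
const path = require('path')

// Hypothetical helper: one way the [source, destination] pairs could be
// resolved. The installer's real resolution code is not visible in this diff.
function resolveDest(target, dest) {
  // Absolute destinations (e.g. under ~/.claude) are user-level installs.
  if (path.isAbsolute(dest)) return dest
  // Relative destinations land inside the target repo.
  return path.resolve(target, dest)
}

const target = path.resolve('my-repo') // assumed target directory
const userSkill = path.join(os.homedir(), '.claude', 'skills', 'init-harness')

console.log(resolveDest(target, userSkill))               // stays under the home directory
console.log(resolveDest(target, '.claude/settings.json')) // lands inside the repo
```

Keeping the rule in the data (absolute versus relative path) rather than in a per-level flag means a new artifact only needs one more pair in the `copy` array.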
@@ -84,19 +84,19 @@ function parseArgs(argv) {
 
 function showHelp() {
   console.log(`
-harness-kit —
+harness-kit — set up a repo for a coding agent across 5 maturity levels.
 
-  npx harness-kit
-  npx harness-kit --all
-  npx harness-kit --levels=1,3
+  npx @htechcs/harness-kit               pick levels interactively, then install
+  npx @htechcs/harness-kit --all         install all 5 levels
+  npx @htechcs/harness-kit --levels=1,3  install specific levels
 
-
-  --target=<dir>
-  --force
+Flags:
+  --target=<dir>  target directory (default: current directory)
+  --force         overwrite existing files (default: skip)
   --help
 
-
-
+Levels: 1 Foundation · 2 Context · 3 Guardrails · 4 Long-running · 5 Evals.
+Docs are always written to <target>/docs/harness/.
 `)
 }
 
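The help text lists the flags, but the body of `parseArgs` falls outside the hunks shown. As a rough sketch of how such flags could be parsed, assuming the option names used in `main()` (`all`, `levels`, `target`, `force`, `help`); `parseArgsSketch` is a hypothetical stand-in, not the package's implementation:

```javascript
'use strict'
const path = require('path')

// Hypothetical re-creation of the flag parsing; the real parseArgs body is
// not part of this diff, so every detail here is an assumption.
function parseArgsSketch(argv, cwd = process.cwd()) {
  const opt = { all: false, levels: null, target: cwd, force: false, help: false }
  for (const arg of argv) {
    if (arg === '--all') opt.all = true
    else if (arg === '--force') opt.force = true
    else if (arg === '--help') opt.help = true
    else if (arg.startsWith('--levels=')) {
      // "1, 3" and "1,3" both become ['1', '3']; empty entries are dropped.
      opt.levels = arg.slice('--levels='.length).split(',').map((s) => s.trim()).filter(Boolean)
    } else if (arg.startsWith('--target=')) {
      opt.target = path.resolve(cwd, arg.slice('--target='.length))
    }
  }
  return opt
}

console.log(parseArgsSketch(['--levels=1,3', '--force']))
```

Unknown arguments are silently ignored in this sketch; whether the real CLI does the same is not visible here.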
@@ -106,27 +106,27 @@ function rel(p) {
   return p
 }
 
-//
+// Place one artifact. destAbs: an already-resolved absolute path.
 function place(srcRel, destAbs, force) {
   const src = path.join(KIT, srcRel)
   if (!fs.existsSync(src)) {
-    console.log(`
+    console.log(` missing ${srcRel} (not in kit, skipping)`)
     return false
   }
   if (fs.existsSync(destAbs) && !force) {
-    console.log(`
+    console.log(` kept ${rel(destAbs)} (use --force to overwrite)`)
     return false
   }
   fs.mkdirSync(path.dirname(destAbs), { recursive: true })
   fs.cpSync(src, destAbs, { recursive: true })
-  console.log(`
+  console.log(` added ${rel(destAbs)}`)
   return true
 }
 
 function installLevel(key, target, force) {
   const lv = LEVELS[key]
   if (!lv) {
-    console.log(`\n(
+    console.log(`\n(skipping invalid level: ${key})`)
     return
   }
   console.log(`\n${lv.title}`)
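The `place()` function is where the IDEMPOTENT and FAIL-SOFT principles from the file header become concrete. The sketch below condenses the same checks into a standalone function that can be run against throwaway directories; `placeSketch` and its status strings are illustrative names, and the kit root is passed in explicitly rather than read from `KIT`.

```javascript
'use strict'
const fs = require('fs')
const os = require('os')
const path = require('path')

// Condensed version of the place() checks, returning a status string
// instead of logging. Names and return values are made up for this sketch.
function placeSketch(kitRoot, srcRel, destAbs, force) {
  const src = path.join(kitRoot, srcRel)
  if (!fs.existsSync(src)) return 'missing'           // fail-soft: caller keeps going
  if (fs.existsSync(destAbs) && !force) return 'kept' // idempotent: never clobber
  fs.mkdirSync(path.dirname(destAbs), { recursive: true })
  fs.cpSync(src, destAbs, { recursive: true })
  return 'added'
}

// Temporary directories standing in for the package root and the target repo.
const kit = fs.mkdtempSync(path.join(os.tmpdir(), 'kit-'))
const repo = fs.mkdtempSync(path.join(os.tmpdir(), 'repo-'))
fs.writeFileSync(path.join(kit, 'setup.sh'), '#!/bin/sh\n')

const dest = path.join(repo, 'setup.sh')
const results = [
  placeSketch(kit, 'setup.sh', dest, false),                      // first run
  placeSketch(kit, 'setup.sh', dest, false),                      // re-run, file exists
  placeSketch(kit, 'setup.sh', dest, true),                       // re-run with force
  placeSketch(kit, 'nope.md', path.join(repo, 'nope.md'), false), // not in kit
]
console.log(results.join(', ')) // prints "added, kept, added, missing"
```

Because a missing source returns instead of throwing, a loop over many artifacts continues past one bad entry, which matches the fail-soft behaviour described in the header comment.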
@@ -141,7 +141,7 @@ function installLevel(key, target, force) {
 }
 
 function installGuide(target, force) {
-  console.log(`\
+  console.log(`\nGuidance docs → ${path.join(rel(path.join(target, GUIDE_DIR)))}/`)
   for (const [srcRel, name] of GUIDE) {
     place(srcRel, path.join(target, GUIDE_DIR, name), force)
   }
@@ -157,26 +157,26 @@ async function main() {
   if (opt.help) { showHelp(); return }
 
   TARGET = opt.target
-  console.log(`harness-kit →
+  console.log(`harness-kit → installing into: ${opt.target}`)
 
   let levels
   if (opt.all) levels = Object.keys(LEVELS)
   else if (opt.levels) levels = opt.levels
   else if (process.stdin.isTTY) {
-    const a = (await ask('\
+    const a = (await ask('\nWhich levels? [all] or a list e.g. 1,3,4: ')).trim()
     levels = !a || a.toLowerCase() === 'all' ? Object.keys(LEVELS) : a.split(',').map((s) => s.trim()).filter(Boolean)
   } else {
-    levels = Object.keys(LEVELS) //
+    levels = Object.keys(LEVELS) // no TTY (CI/pipe) → install everything
   }
 
   for (const k of levels) installLevel(k, opt.target, opt.force)
   installGuide(opt.target, opt.force)
 
-  console.log('\n—
+  console.log('\n— Done. Next steps —')
   for (const k of levels) {
     if (LEVELS[k]) console.log(` [${k}] ${LEVELS[k].next}`)
   }
-  console.log('\
+  console.log('\nRead first: docs/harness/tutorial.md (the why + WHEN to use each piece).')
 }
 
-main().catch((e) => { console.error('harness-kit
+main().catch((e) => { console.error('harness-kit error:', e.message); process.exit(1) })
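The interactive branch of `main()` turns a free-form answer into a list of level keys. The same expression can be lifted into a function and tried without a TTY; `selectLevels` and `ALL_LEVELS` are names invented for this sketch, with `ALL_LEVELS` standing in for `Object.keys(LEVELS)`.

```javascript
'use strict'

const ALL_LEVELS = ['1', '2', '3', '4', '5'] // stands in for Object.keys(LEVELS)

// Same expression as in main(), wrapped in a function for experimentation.
function selectLevels(answer) {
  const a = answer.trim()
  return !a || a.toLowerCase() === 'all'
    ? ALL_LEVELS
    : a.split(',').map((s) => s.trim()).filter(Boolean)
}

console.log(selectLevels(''))         // empty answer falls back to every level
console.log(selectLevels(' 1, 3,4 ')) // whitespace around entries is ignored
console.log(selectLevels('7'))        // unknown keys pass through unchanged
```

Note that the selection step does no validation: an unknown key such as `7` is only caught later, where `installLevel` prints its "skipping invalid level" message.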
package/docs/harness-engineering-tutorial.en.md
CHANGED
@@ -210,7 +210,7 @@ No measurement, no improvement. With agents, "measuring" is harder than regular
 
 This is the first exercise to do with a team — the biggest lever, doable in one afternoon. This part focuses on `CLAUDE.md` (Claude Code); `AGENTS.md` for Codex comes later (see the end of the section).
 
-###
+### Don't auto-generate with the default `/init`
 
 The canonical source [Writing a good CLAUDE.md — HumanLayer](https://www.humanlayer.dev/blog/writing-a-good-claude-md) says it plainly: ***"don't auto-generate with `/init` — carefully craft its contents"***. Why: `/init` only **describes the current state** of the repo, and still misses the two things that make a "harness" — **Guardrails** (constraints on agent behavior) and **Definition of Done** (evidence for the agent to self-verify).
 
package/docs/harness-engineering-tutorial.md
CHANGED
@@ -210,7 +210,7 @@ No measurement, no improvement. With agents, "measuring" is harder than code
 
 This is the first exercise to do with the team: the biggest lever, doable in one session. This part focuses on `CLAUDE.md` (Claude Code); `AGENTS.md` for Codex is left for a later round (see the end of the section).
 
-###
+### Don't auto-generate with the default `/init`
 
 The canonical source [Writing a good CLAUDE.md — HumanLayer](https://www.humanlayer.dev/blog/writing-a-good-claude-md) puts it bluntly: ***"don't auto-generate with `/init` — carefully craft its contents"***. The reason: `/init` only **describes the current state** of the repo and lacks the two things that make a "harness": **Guardrails** (constraints on agent behavior) and **Definition of Done** (evidence for the agent to self-verify).
 
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@htechcs/harness-kit",
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "Harness Engineering starter kit — set up a repo for a coding agent across 5 maturity levels (CLAUDE.md, context, guardrails, long-running, evals).",
   "bin": {
     "harness-kit": "bin/cli.js"
package/skills/init-harness/SKILL.md
CHANGED
@@ -1,68 +1,68 @@
 ---
 name: init-harness
-description:
-argument-hint: [
+description: Generate a "harness-engineering-grade" CLAUDE.md for a repo, replacing the default /init. Analyzes the codebase to fill in WHAT/WHY/HOW (stack, build/test/run, architecture, conventions), then asks to confirm exactly the 2 things it can't infer — Guardrails (constraints on agent behavior) and Definition of Done (when work counts as finished) — the parts /init omits. Follows HumanLayer's "Writing a good CLAUDE.md" discipline: short (<300 lines, ideally ~60), pointers instead of copies, no style/convention stuffing (leave that to the linter), hand-crafted rather than auto-generated. Use this skill when the user wants to create/standardize a CLAUDE.md, "init harness", or set up a repo for a coding agent. Currently does CLAUDE.md only (Claude Code); AGENTS.md/Codex to be added later.
+argument-hint: [repo path, defaults to the current directory]
 allowed-tools: Read, Write, Edit, Glob, Grep, Bash, AskUserQuestion
 ---
 
-# init-harness — CLAUDE.md
+# init-harness — a harness-engineering-grade CLAUDE.md
 
-
-session.
-
-
+Produce a `CLAUDE.md` that is a **durable repo-local instruction** Claude Code reads in *every*
+session. It differs from the default `/init`: `/init` only **describes the current state** of the repo
+(what it is, how to build it); this skill also **defines agent behavior** (Guardrails + Definition of
+Done) and follows the canonical source's context-management discipline.
 
-##
+## Foundational principles (never violate)
 
-
+From [Writing a good CLAUDE.md — HumanLayer](https://www.humanlayer.dev/blog/writing-a-good-claude-md):
 
-- **"LLMs are stateless functions"** — `CLAUDE.md`
-- **"Less is more"** — model
-- **"Prefer pointers to copies"** —
-- **
-- **
-- **Right altitude —
-- **Craft,
-- **
+- **"LLMs are stateless functions"** — `CLAUDE.md` is the *only* file loaded into *every* session. Every line must earn its place.
+- **"Less is more"** — a model can follow ~150–200 instructions; the system prompt already takes ~50. Keep the file **under 300 lines, aim for ~60–120**.
+- **"Prefer pointers to copies"** — point to `path/file.ts:42`, do NOT copy code snippets into the file.
+- **No style/convention stuffing** — leave that to the linter/formatter. Don't write "use 2 spaces", "name things camelCase".
+- **Only universally relevant content** — task-specific info goes elsewhere, link to it when needed (progressive disclosure).
+- **Right altitude — don't hardcode rigid rules** — the canonical source warns of TWO EQUAL extremes: too vague *and* too rigid ("overly complex hardcoded logic"). Write stable heuristics/boundaries, do NOT stuff in if-this-then-that chains for every edge case (that makes `CLAUDE.md` brittle, contradictory, hard to maintain). A rule that's only true for one task → push it to a per-task file, don't cram it into `CLAUDE.md`.
+- **Craft, don't auto-dump** — the canonical source says it plainly: *don't auto-generate*. This skill analyzes to *propose*, but must distill, not dump raw.
+- **Write the NON-obvious** — drop the directory tree (the agent discovers it), drop generic advice ("write clean code").
 
-
-- **Guardrails** — agent *
-- **Definition of Done** —
+The two parts that make a "harness" (which `/init` lacks):
+- **Guardrails** — what the agent *must not* do, what it *must* do before declaring done, which operations need confirmation.
+- **Definition of Done** — concrete evidence (tests pass, build green, lint clean) for the agent to self-verify before claiming completion.
 
-##
+## Procedure
 
-###
+### Step 1 — Identify the repo & scan the current state
 
--
--
--
+- Target repo = `$ARGUMENTS` if given, otherwise the current directory. Confirm it's a git repo (`git rev-parse --show-toplevel`).
+- If a `CLAUDE.md` already exists → read it, propose *improvements* rather than blindly overwriting. Ask before overwriting.
+- Gather existing sources to consolidate (don't leave them scattered): `README.md`, `.cursorrules` / `.cursor/rules/`, `.github/copilot-instructions.md`, `AGENTS.md` if present.
 
-###
+### Step 2 — Infer WHAT / WHY / HOW (automatic, dense)
 
-
+Analyze the codebase; do NOT ask for what you can infer:
 
-- **HOW (
-  - Build / run / test / lint / typecheck —
-  - **
-  -
-- **WHAT:** tech stack,
-- **WHY:**
+- **HOW (highest priority — the agent needs it to run the loop):**
+  - Build / run / test / lint / typecheck — read from `package.json` scripts, `Makefile`, `justfile`, `pyproject.toml`, `Cargo.toml`, `go.mod`, CI workflows…
+  - **The command to run a SINGLE test** — you must find it (e.g. `pytest path::test_x`, `vitest run -t "name"`, `go test -run`). This is the biggest feedback-loop speed-up.
+  - Specific tooling: `bun` vs `node`, `uv` vs `pip`, which package manager.
+- **WHAT:** tech stack, module boundaries, a "map" of the codebase (especially important for a monorepo). Describe the *big picture you only get from reading many files*, do NOT list every file.
+- **WHY:** the project's purpose, the role of each major part — 1–3 sentences.
 
-###
+### Step 3 — Ask to confirm EXACTLY the 2 things you can't infer (in a single round)
 
-
+Use AskUserQuestion, bundling everything into one round (respect the context budget):
 
-1. **Guardrails** —
-   -
-   -
-   -
-2. **Definition of Done** —
+1. **Guardrails** — suggest candidates from the repo, then ask to confirm:
+   - Files/dirs that must NOT be edited (generated, already-run migrations, lockfiles, `dist/`, `vendor/`).
+   - Operations that need confirmation first (running a migration, deleting data, pushing, editing CI/secrets).
+   - Mandatory rules ("always run tests before declaring done", "don't add a new dependency without asking").
+2. **Definition of Done** — when does a change count as finished? (e.g. `<test command>` green + `<lint>` clean + `<typecheck>` pass). This is what the agent uses to self-verify.
 
-
+If the repo is very large / multi-module → ask whether to **layer** it: a lean root `CLAUDE.md` + child files for complex modules (closest-file-wins). Default: root only.
 
-###
+### Step 4 — Write CLAUDE.md to the template
 
-
+Must begin with:
 
 ```md
 # CLAUDE.md
@@ -70,52 +70,52 @@ Must begin with:
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 ```
 
-
+Content template (drop any section that doesn't truly apply — don't invent):
 
 ```md
-##
-<WHY + WHAT: 1–3
+## What this repo is
+<WHY + WHAT: 1–3 sentences. The thing you can't get from reading one file.>
 
 ## Build / Test / Run
-- Build: <
--
--
-- Lint / typecheck: <
--
+- Build: <command>
+- Run all tests: <command>
+- Run a SINGLE test: <exact copy-paste command>
+- Lint / typecheck: <command>
+- Specific tooling: <bun/uv/...>
 
-##
-<Big picture
-
+## Architecture overview
+<Big picture you only get from reading many files: the main flow, module boundaries,
+where the system's "heart" is. Use pointers path/file.ts:line, don't copy code.>
 
-##
-<
+## Conventions (only the NON-obvious)
+<Mandatory patterns the agent can't guess. NO style/format — leave that to the linter.>
 
 ## Guardrails
--
--
--
+- DON'T edit: <files/dirs>.
+- ALWAYS <run tests/lint> before declaring done.
+- Require confirmation before: <dangerous operation>.
 
 ## Definition of Done
-
+A change is done when: <concrete evidence — tests green, build ok, lint clean>.
 ```
 
-###
+### Step 5 — Self-check before handing off
 
-
+Compare the file you wrote against the checklist; fix any violation:
 
-- [ ]
-- [ ]
-- [ ]
-- [ ]
-- [ ]
-- [ ] Guardrails/
-- [ ]
+- [ ] Under 300 lines (ideally <120). If long → cut, push task-specific detail to a separate file + link.
+- [ ] No directory tree, no generic advice, no surplus style/convention.
+- [ ] Uses pointers (`path:line`) instead of copied snippets.
+- [ ] Has a single-test command, copy-paste runnable.
+- [ ] Has Guardrails and Definition of Done sections (this is what makes it a "harness").
+- [ ] Guardrails/conventions are boundary-level heuristics, NOT if-this-then-that chains for each edge case; rules true for only one task are pushed to a per-task file.
+- [ ] Every command checked actually exists in the repo (don't invent scripts that aren't there).
 
-###
+### Step 6 — Report
 
-
+Summarize for the user: which files were created/updated, which sources were consolidated (Cursor/Copilot/README), the line count, and the Guardrails/DoD that were locked in. Note: **AGENTS.md for Codex comes in a later round** (then it's just `ln -s AGENTS.md CLAUDE.md` to share one source).
 
-##
+## Out of scope (for now)
 
--
--
+- Doesn't generate `AGENTS.md` / no symlink — saved for a later round on request (focus Claude Code first).
+- Doesn't set up hooks/settings.json/MCP — those are other harness pillars, done separately.
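A few items of the Step 5 checklist are mechanical enough to check with a script. The skill itself is a prompt rather than code, so the following is only a sketch under stated assumptions: `selfCheck` is a hypothetical function, the 300-line limit and section names are taken from the checklist text, and the `npm test -- -t "name"` line in the sample draft is a placeholder command rather than anything the kit prescribes.

```javascript
'use strict'

// Hypothetical checker for the mechanical parts of the Step 5 checklist.
// Judgement items (non-obvious content, right altitude) still need a reader.
function selfCheck(markdown) {
  const lines = markdown.split('\n')
  const hasHeading = (title) => lines.some((l) => l.trim() === `## ${title}`)
  const issues = []
  if (lines.length >= 300) issues.push(`too long: ${lines.length} lines (limit 300)`)
  if (!hasHeading('Guardrails')) issues.push('missing "## Guardrails" section')
  if (!hasHeading('Definition of Done')) issues.push('missing "## Definition of Done" section')
  if (!/single test/i.test(markdown)) issues.push('no single-test command found')
  return issues
}

// Sample draft that deliberately omits the Definition of Done section.
const draft = [
  '# CLAUDE.md',
  '',
  '## Build / Test / Run',
  '- Run a SINGLE test: npm test -- -t "name"',
  '',
  '## Guardrails',
  "- DON'T edit: dist/",
].join('\n')

console.log(selfCheck(draft))
```

An empty result only says the structural items pass; it does not confirm that the listed commands actually exist in the repo, which the checklist also asks for.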