@htechcs/harness-kit 0.1.0 → 0.1.1
- package/README.en.md +8 -8
- package/README.md +8 -8
- package/bin/cli.js +43 -43
- package/docs/harness-engineering-tutorial.en.md +1 -1
- package/docs/harness-engineering-tutorial.md +1 -1
- package/package.json +1 -1
- package/skills/init-harness/SKILL.md +74 -74
- package/templates/agents/README.md +25 -24
- package/templates/agents/repo-explorer.md +16 -16
- package/templates/evals/README.md +39 -35
- package/templates/evals/cases/example-task.md +22 -22
- package/templates/evals/observability.md +43 -42
- package/templates/guardrails/README.md +59 -57
- package/templates/long-running/README.md +29 -28
- package/templates/long-running/TASK.md +19 -19
- package/templates/mcp-audit.md +16 -16
- package/templates/new-worktree.sh +16 -16
- package/templates/setup.sh +25 -25
- package/templates/spec/FEATURE.md +19 -19
- package/templates/spec/README.md +20 -20
|
@@ -1,43 +1,44 @@
|
|
|
1
|
-
# Subagents —
|
|
1
|
+
# Subagents — how to write a good one (Level 2)
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
|
|
3
|
+
A subagent is **Level 2's main lever**: push heavy work (reading many files, big refactors,
|
|
4
|
+
repeated test runs) into a **separate context**, so the main context receives only *conclusions* —
|
|
5
|
+
it doesn't flood.
|
|
5
6
|
|
|
6
|
-
##
|
|
7
|
+
## Install
|
|
7
8
|
|
|
8
|
-
|
|
9
|
-
Copy
|
|
9
|
+
Each subagent is a `.md` file in `.claude/agents/` (repo-level) or `~/.claude/agents/` (user-level).
|
|
10
|
+
Copy the sample file there and Claude Code picks it up automatically:
|
|
10
11
|
|
|
11
12
|
```bash
|
|
12
13
|
mkdir -p .claude/agents
|
|
13
14
|
cp repo-explorer.md .claude/agents/
|
|
14
15
|
```
|
|
15
16
|
|
|
16
|
-
##
|
|
17
|
+
## File structure
|
|
17
18
|
|
|
18
19
|
```md
|
|
19
20
|
---
|
|
20
21
|
name: <kebab-case>
|
|
21
|
-
description: <
|
|
22
|
-
tools: Read, Grep, Glob # (
|
|
22
|
+
description: <WHEN to use it — Claude reads this line to decide whether to call this agent>
|
|
23
|
+
tools: Read, Grep, Glob # (optional) give it only the tools it NEEDS, to keep it focused
|
|
23
24
|
---
|
|
24
|
-
<system prompt:
|
|
25
|
+
<system prompt: role + how to work + ESPECIALLY how to RETURN>
|
|
25
26
|
```
|
|
26
27
|
|
|
27
|
-
## 4
|
|
28
|
+
## 4 rules you can't forget
|
|
28
29
|
|
|
29
|
-
1.
|
|
30
|
-
|
|
31
|
-
2. **
|
|
32
|
-
|
|
33
|
-
3. **`tools`
|
|
34
|
-
|
|
35
|
-
4. **
|
|
36
|
-
|
|
30
|
+
1. **Write `description` as "when to use", not "what it is".** This is what the main agent reads
|
|
31
|
+
to decide whether to delegate. Vague → it never gets called.
|
|
32
|
+
2. **A subagent must return a *distilled conclusion*, not a log.** The whole reason a subagent
|
|
33
|
+
exists is so the main context does NOT have to swallow what it read. A raw dump defeats the purpose.
|
|
34
|
+
3. **`tools` lists only what's truly needed.** This is NOT a safety guardrail (that's Level 3) —
|
|
35
|
+
it keeps the agent lean and focused. A comprehension agent only needs `Read, Grep, Glob`.
|
|
36
|
+
4. **One subagent = one clear job.** Don't merge "read + edit + test" into one agent. Different
|
|
37
|
+
heavy jobs → different separate contexts.
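Some of the four rules above can be checked mechanically before a subagent file is committed. The following is a minimal sketch in Python, assuming the frontmatter consists of simple `key: value` lines between two `---` markers, as in the file structure shown earlier. The names `parse_frontmatter`, `lint_subagent`, and `TRIGGER_WORDS` are hypothetical and the heuristics are illustrative; none of this ships with the kit or with Claude Code.

```python
import re

# Rule: `name` is kebab-case.
KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
# Hypothetical heuristic: a "when to use" description tends to contain a trigger phrase.
TRIGGER_WORDS = ("use when", "use for", "use this", "when you")


def parse_frontmatter(text: str) -> dict:
    """Parse simple `key: value` lines between the first two `---` markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def lint_subagent(text: str) -> list:
    """Return a list of warnings; an empty list means nothing was flagged."""
    fields = parse_frontmatter(text)
    warnings = []
    name = fields.get("name", "")
    if not KEBAB.match(name):
        warnings.append(f"name {name!r} is not kebab-case")
    description = fields.get("description", "").lower()
    if not any(word in description for word in TRIGGER_WORDS):
        warnings.append("description does not say WHEN to use the agent")
    # Ignore a trailing `# comment` on the tools line, as in the template.
    raw_tools = fields.get("tools", "").split("#")[0]
    tools = [t.strip() for t in raw_tools.split(",") if t.strip()]
    if not tools:
        warnings.append("no tools listed; consider naming only what is needed")
    return warnings


sample = """---
name: repo-explorer
description: Use when you need a broad sweep of the codebase and only need conclusions.
tools: Read, Grep, Glob
---
You are a codebase-comprehension agent.
"""
print(lint_subagent(sample))
```

A check like this only catches shape problems. Whether the system prompt really returns a distilled conclusion still needs a human read.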
|
|
37
38
|
|
|
38
|
-
##
|
|
39
|
+
## When you DON'T need a new subagent
|
|
39
40
|
|
|
40
|
-
Claude Code
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
41
|
+
Claude Code already ships `Explore` (read-only sweeps), `Plan` (design), and `general-purpose`.
|
|
42
|
+
They cover most context-isolation needs. **Only write your own subagent when** you have a
|
|
43
|
+
recurring, domain-specific job the built-ins don't handle — don't spawn a redundant generic copy,
|
|
44
|
+
because surplus tools are exactly the context noise Level 2 teaches you to avoid.
|
|
@@ -1,27 +1,27 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: repo-explorer
|
|
3
|
-
description:
|
|
3
|
+
description: Read many files to answer "where is X / how does this flow work" WITHOUT flooding the main context. Use when you need a broad sweep of the codebase and only need conclusions, not the contents of each file. Read-only.
|
|
4
4
|
tools: Read, Grep, Glob
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
|
|
7
|
+
You are a **codebase-comprehension** agent. Your job is to scan many files and return a
|
|
8
|
+
**distilled conclusion** to the main agent — the main agent CANNOT see what you read, only what
|
|
9
|
+
you return. Therefore:
|
|
10
10
|
|
|
11
|
-
##
|
|
11
|
+
## How to work
|
|
12
12
|
|
|
13
|
-
1.
|
|
14
|
-
2.
|
|
15
|
-
3.
|
|
13
|
+
1. Stick to the exact question you were given. Don't widen the scope.
|
|
14
|
+
2. Use `Grep`/`Glob` to narrow down first; only `Read` files that are genuinely relevant.
|
|
15
|
+
3. Read *enough to conclude*, not everything.
|
|
16
16
|
|
|
17
|
-
##
|
|
17
|
+
## How to return (the most important part)
|
|
18
18
|
|
|
19
|
-
|
|
19
|
+
Return **a conclusion, not a log**. Specifically:
|
|
20
20
|
|
|
21
|
-
-
|
|
22
|
-
-
|
|
23
|
-
-
|
|
24
|
-
|
|
25
|
-
-
|
|
21
|
+
- The direct answer to the question, up top.
|
|
22
|
+
- Pointers `path/file.ts:line` for each important spot (so the main agent can open them itself when needed).
|
|
23
|
+
- NEVER paste long file contents into your answer — that is exactly the context flooding this
|
|
24
|
+
subagent exists to prevent.
|
|
25
|
+
- If you can't find it, say plainly "couldn't find X", don't guess.
|
|
26
26
|
|
|
27
|
-
|
|
27
|
+
Goal: the main agent reads your 15 lines and understands, instead of having to read 50 files itself.
|
|
@@ -1,51 +1,55 @@
|
|
|
1
|
-
# Evals & Observability —
|
|
1
|
+
# Evals & Observability — measure whether the agent does the right thing (Level 5)
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
settings / skill
|
|
3
|
+
Levels 1–4 *build* the harness. Level 5 is the **feedback loop**: how do you know whether changing
|
|
4
|
+
CLAUDE.md / settings / a skill made the agent **better or worse**? Without Level 5, every harness
|
|
5
|
+
change is a guess.
|
|
5
6
|
|
|
6
|
-
##
|
|
7
|
+
## Two halves
|
|
7
8
|
|
|
8
|
-
|
|
|
9
|
-
|
|
10
|
-
| **Evals** | "
|
|
11
|
-
| **Observability** | "
|
|
9
|
+
| Half | Answers | File |
|
|
10
|
+
|------|---------|------|
|
|
11
|
+
| **Evals** | "Does the agent produce the *right* result?" (measurable pass/fail) | [`cases/`](./cases/) |
|
|
12
|
+
| **Observability** | "What did the agent *do*, and why did it fail?" | [`observability.md`](./observability.md) |
|
|
12
13
|
|
|
13
|
-
Evals
|
|
14
|
-
(
|
|
14
|
+
Evals tell you **right/wrong**. Observability tells you **why** — and often reveals a problem at a
|
|
15
|
+
lower level (re-reading files endlessly = dirty context → Level 2; constant permission prompts → fix
|
|
16
|
+
`allow` at Level 3).
|
|
15
17
|
|
|
16
|
-
##
|
|
18
|
+
## The loop (this is the core, not the files)
|
|
17
19
|
|
|
18
20
|
```
|
|
19
|
-
define → run → read → fix → (
|
|
20
|
-
golden
|
|
21
|
+
define  →  run    →  read     →  fix          →  (re-run)
|
|
22
|
+
golden     the       the         the
|
|
21
23
|
task       agent     trace       harness
|
|
22
|
-
|
|
23
|
-
|
|
24
|
+
                     /result     (CLAUDE.md,
|
|
25
|
+
                                 settings, skill)
|
|
24
26
|
```
|
|
25
27
|
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
28
|
+
The real lever: the golden-task set is a **regression net for the harness itself**. After you change
|
|
29
|
+
CLAUDE.md, re-run the set: a case that used to pass and now fails → almost certainly caused by your change
|
|
30
|
+
(confirm by re-running a few times to rule out noise), not a guess.
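The regression-net idea can be sketched as a comparison of two result sets. This is a minimal Python illustration, assuming each run of the golden-task set has been reduced to a mapping from case name to pass/fail. The function `diff_runs` and the case names are hypothetical, since the kit ships no runner.

```python
def diff_runs(before: dict, after: dict) -> tuple:
    """Compare two runs of the golden-task set.

    `before` and `after` map case name -> True (pass) / False (fail).
    A case missing from `after` is treated as failing.
    Returns (regressions, fixes) as sorted lists of case names.
    """
    regressions = sorted(c for c in before if before[c] and not after.get(c, False))
    fixes = sorted(c for c in after if after[c] and not before.get(c, False))
    return regressions, fixes


# Hypothetical results recorded before and after editing CLAUDE.md.
before = {"add-endpoint": True, "fix-flaky-test": False, "rename-module": True}
after = {"add-endpoint": True, "fix-flaky-test": True, "rename-module": False}

regressions, fixes = diff_runs(before, after)
print("regressions:", regressions)
print("fixes:", fixes)
```

A non-empty `regressions` list is the signal to re-run a few times and then inspect the trace before keeping the harness change.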
|
|
29
31
|
|
|
30
|
-
##
|
|
32
|
+
## The truth: Level 5 is the least file-able
|
|
31
33
|
|
|
32
|
-
|
|
33
|
-
|
|
34
|
+
A *real* eval is **domain-specific** — no kit ships "is your agent correct yet" out of the box. This
|
|
35
|
+
kit only gives you the **scaffold + discipline**:
|
|
34
36
|
|
|
35
|
-
-
|
|
36
|
-
-
|
|
37
|
-
|
|
37
|
+
- **File:** the folder structure + a golden-task template + an observability guide.
|
|
38
|
+
- **Discipline (most of it):** *define* what "correct" means for your task, build the case set, read
|
|
39
|
+
traces, **close the loop** (eval finds a regression → fix the harness → re-run).
|
|
38
40
|
|
|
39
|
-
|
|
41
|
+
The kit deliberately ships **no** runner code — a runner is repo-specific, and faking a generic one
|
|
42
|
+
would just be junk.
|
|
40
43
|
|
|
41
|
-
##
|
|
44
|
+
## Getting started
|
|
42
45
|
|
|
43
|
-
1. Copy `cases/example-task.md`
|
|
44
|
-
2.
|
|
45
|
-
3. **
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
46
|
+
1. Copy `cases/example-task.md` into a real case and fill in *objective* done-criteria.
|
|
47
|
+
2. Gather 3–5 **representative** tasks (what the agent does most often) — you don't need many, you need the right ones.
|
|
48
|
+
3. **Take a "no-harness" baseline FIRST.** Run the set once with the harness not yet applied (empty
|
|
49
|
+
CLAUDE.md / before installing Levels 1–4) and record the score. This is what **quantifies the
|
|
50
|
+
harness's ROI**: re-run after applying the harness and compare the delta — exactly the field's
|
|
51
|
+
opening thesis (changing the harness moves the score; see `docs/harness-engineering-tutorial.md`).
|
|
52
|
+
This is *different* from the regression check below.
|
|
53
|
+
4. After that, before/after **each change** to the harness, re-run the set and compare (run it a few
|
|
54
|
+
times to rule out noise before concluding cause).
|
|
55
|
+
5. When a case fails, open [`observability.md`](./observability.md) to trace *why*.
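Steps 3 and 4 above both involve comparing scores across repeated runs. The sketch below, in Python, assumes each run is recorded as a mapping from case name to pass/fail. The helpers `pass_rate` and `delta` and the sample data are hypothetical illustrations, not something the kit provides.

```python
from statistics import mean


def pass_rate(runs: list) -> dict:
    """Fraction of runs in which each case passed (a missing case counts as a fail)."""
    cases = sorted({case for run in runs for case in run})
    return {c: mean(1.0 if run.get(c, False) else 0.0 for run in runs) for c in cases}


def delta(baseline: dict, harness: dict) -> dict:
    """Per-case change in pass rate: harness minus baseline."""
    cases = sorted(set(baseline) | set(harness))
    return {c: round(harness.get(c, 0.0) - baseline.get(c, 0.0), 2) for c in cases}


# Hypothetical data: two runs without the harness, two runs with it.
baseline_runs = [
    {"add-endpoint": False, "fix-flaky-test": False},
    {"add-endpoint": True, "fix-flaky-test": False},
]
harness_runs = [
    {"add-endpoint": True, "fix-flaky-test": True},
    {"add-endpoint": True, "fix-flaky-test": False},
]

print(delta(pass_rate(baseline_runs), pass_rate(harness_runs)))
```

Averaging over several runs is what separates a real change from noise; a single flipped case in a single run is weak evidence either way.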
|
|
@@ -1,33 +1,33 @@
|
|
|
1
1
|
<!--
|
|
2
|
-
|
|
3
|
-
|
|
2
|
+
A "golden task" = one representative job + OBJECTIVE pass criteria, so it can be re-run after every
|
|
3
|
+
harness change. Copy this file per case. Name it after the job: add-endpoint.md, fix-flaky-test.md...
|
|
4
4
|
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
5
|
+
Principle: done-criteria must be MACHINE-checkable where possible (a command that returns pass/fail);
|
|
6
|
+
fall back to human grading only when unavoidable. "Looks right" is NOT a criterion.
|
|
7
|
+
Delete this comment when you use it for real.
|
|
8
8
|
-->
|
|
9
9
|
|
|
10
|
-
# Case: <
|
|
10
|
+
# Case: <the job — e.g. "add endpoint GET /users/:id">
|
|
11
11
|
|
|
12
|
-
## Task (
|
|
13
|
-
<
|
|
12
|
+
## Task (give this verbatim to the agent)
|
|
13
|
+
<The exact prompt you'd type to the agent. The closer to reality, the better.>
|
|
14
14
|
|
|
15
|
-
## Setup (
|
|
16
|
-
<
|
|
15
|
+
## Setup (what state the repo must be in before running)
|
|
16
|
+
<Base branch/commit, seed data, env vars. So the case repeats identically every time.>
|
|
17
17
|
- Base: <commit/branch>
|
|
18
18
|
-
|
|
19
19
|
|
|
20
|
-
## Done-criteria (
|
|
21
|
-
<
|
|
22
|
-
- [ ] `<
|
|
23
|
-
- [ ] `<
|
|
24
|
-
- [ ] <
|
|
25
|
-
- [ ]
|
|
20
|
+
## Done-criteria (OBJECTIVE — clear pass/fail)
|
|
21
|
+
<What MUST be true after the agent finishes. Prefer commands that return pass/fail.>
|
|
22
|
+
- [ ] `<test command>` green
|
|
23
|
+
- [ ] `<lint / typecheck command>` clean
|
|
24
|
+
- [ ] <a concrete change exists — e.g. "new route returns 200 for a valid id, 404 otherwise">
|
|
25
|
+
- [ ] Did NOT touch <out-of-scope file/path>
|
|
26
26
|
|
|
27
|
-
##
|
|
28
|
-
<
|
|
29
|
-
-
|
|
30
|
-
-
|
|
27
|
+
## How to grade
|
|
28
|
+
<If automatable, write the exact command. If it must be manual, write a short rubric; don't leave it "implied".>
|
|
29
|
+
- Automated: `<command that returns an exit code>`
|
|
30
|
+
- Manual (if needed): <1–2 sentence rubric>
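An automated grade can be as small as "run the command, check the exit code". The following Python sketch assumes each done-criterion is expressed as an argument list. The functions `grade` and `grade_case` are hypothetical, and the demo uses the Python interpreter itself as a stand-in for a real test or lint command.

```python
import subprocess
import sys
from typing import Optional


def grade(command: list, cwd: Optional[str] = None, timeout: float = 600) -> bool:
    """Run one done-criterion command; exit code 0 means pass."""
    try:
        result = subprocess.run(command, cwd=cwd, capture_output=True, timeout=timeout)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hung or missing command counts as a failure, not a crash of the grader.
        return False
    return result.returncode == 0


def grade_case(criteria: dict, cwd: Optional[str] = None) -> dict:
    """Grade every criterion of one case; returns label -> pass/fail."""
    return {label: grade(command, cwd) for label, command in criteria.items()}


# Stand-in commands; a real case would use the repo's own test and lint commands.
criteria = {
    "tests green": [sys.executable, "-c", "raise SystemExit(0)"],
    "lint clean": [sys.executable, "-c", "raise SystemExit(1)"],
}
print(grade_case(criteria))
```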
|
|
31
31
|
|
|
32
|
-
##
|
|
33
|
-
<
|
|
32
|
+
## Reference (optional)
|
|
33
|
+
<A "correct" commit/PR to compare against, if any. Helps see where the agent diverged.>
|
|
@@ -1,42 +1,43 @@
|
|
|
1
|
-
# Observability —
|
|
2
|
-
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
##
|
|
7
|
-
|
|
8
|
-
1. **
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
2. **`/cost`** —
|
|
12
|
-
|
|
13
|
-
3. **Telemetry (
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
→
|
|
17
|
-
4. **
|
|
18
|
-
[guardrails/README.md](../guardrails/README.md)
|
|
19
|
-
|
|
20
|
-
##
|
|
21
|
-
|
|
22
|
-
Observability
|
|
23
|
-
|
|
24
|
-
|
|
|
25
|
-
|
|
26
|
-
|
|
|
27
|
-
|
|
|
28
|
-
|
|
|
29
|
-
|
|
|
30
|
-
|
|
|
31
|
-
|
|
|
32
|
-
|
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
>
|
|
1
|
+
# Observability — see what the agent did (Level 5)
|
|
2
|
+
|
|
3
|
+
When an eval fails — or the agent "does something weird" — you need to *see* what it did, not guess.
|
|
4
|
+
Here's where to look, cheapest to deepest.
|
|
5
|
+
|
|
6
|
+
## Where to look
|
|
7
|
+
|
|
8
|
+
1. **The session transcript** — a record of *every* tool call the agent made (which files it read,
|
|
9
|
+
which commands it ran, what it edited). This is the number-one source of truth for "why did it do
|
|
10
|
+
that". Claude Code stores a per-session transcript under the project folder in `~/.claude/`.
|
|
11
|
+
2. **`/cost`** — the session's tokens & cost. An unusual spike = a sign of context bloat (re-reading
|
|
12
|
+
surplus files, MCP tool stuffing) → pull it back to Level 2.
|
|
13
|
+
3. **Telemetry (for a whole team / background runs)** — Claude Code can export metrics/logs via
|
|
14
|
+
OpenTelemetry: enable it with the `CLAUDE_CODE_ENABLE_TELEMETRY` env var, then point an OTLP
|
|
15
|
+
exporter at your backend (Grafana, Honeycomb, Datadog…). Use it when you need to watch many
|
|
16
|
+
sessions/agents, not just one. → Find the exact config in the Claude Code docs under *monitoring / telemetry*.
|
|
17
|
+
4. **A log hook (proactive audit)** — attach a `PostToolUse` hook that writes each tool call to a file
|
|
18
|
+
(see [guardrails/README.md](../guardrails/README.md), the Hooks section). Handy for background runs you want to review later.
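Once a log hook is writing one JSON tool call per line, the file can be summarized with a few lines of code. This is a minimal Python sketch; the field names `tool_name` and `tool_input.file_path` are assumptions about the payload shape and should be checked against a real log line, and `summarize_audit_log` is a hypothetical helper.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_audit_log(lines: Iterable) -> tuple:
    """Count calls per tool and find files that were read more than once.

    Assumes one JSON object per line with `tool_name` and, for Read calls,
    `tool_input.file_path` (assumed field names).
    """
    tools = Counter()
    reads = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            call = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the summary
        if not isinstance(call, dict):
            continue
        tool = call.get("tool_name", "unknown")
        tools[tool] += 1
        if tool == "Read":
            path = (call.get("tool_input") or {}).get("file_path")
            if path:
                reads[path] += 1
    repeated = {path: count for path, count in reads.items() if count > 1}
    return tools, repeated


# Hypothetical log content for illustration.
log_lines = [
    '{"tool_name": "Read", "tool_input": {"file_path": "src/app.ts"}}',
    '{"tool_name": "Read", "tool_input": {"file_path": "src/app.ts"}}',
    '{"tool_name": "Bash", "tool_input": {"command": "npm test"}}',
    "not json",
]
tools, repeated = summarize_audit_log(log_lines)
print(dict(tools), repeated)
```

A file that shows up in `repeated` many times is a hint of the context problem that Level 2 addresses.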
|
|
19
|
+
|
|
20
|
+
## Symptoms & which level they point to
|
|
21
|
+
|
|
22
|
+
Observability isn't just for debugging one session — it **surfaces harness gaps**:
|
|
23
|
+
|
|
24
|
+
| What you see in the trace | Problem | Fix at level |
|
|
25
|
+
|---------------------------|---------|--------------|
|
|
26
|
+
| Re-reading the same files; tokens climbing | dirty context | **Level 2** (subagent / `/clear` / prune MCP) |
|
|
27
|
+
| Constant permission prompts for safe commands | missing allowlist | **Level 3** (add to `allow`) |
|
|
28
|
+
| Nearly ran a destructive command | missing guard | **Level 3** (add to `deny`/`ask`) |
|
|
29
|
+
| Loses context mid-long-session, starts over | no checkpoint | **Level 4** (`TASK.md`) |
|
|
30
|
+
| Does it wrong and nobody notices until late | missing eval | **Level 5** (add a golden task) |
|
|
31
|
+
| Unsure how the build/test commands are run | missing guidance | **Level 1** (CLAUDE.md) |
|
|
32
|
+
| Repeats the same mistake across sessions | rule never written down | **Level 1** (add one line to CLAUDE.md) |
|
|
33
|
+
|
|
34
|
+
**Closing the loop back to Level 1 — `CLAUDE.md` is a living document.** When the trace shows the agent
|
|
35
|
+
**repeating** a mistake Z (e.g. forgetting to run a migration, editing a generated file), don't just
|
|
36
|
+
fix it by hand this time: add **one line** of guardrail/convention to `CLAUDE.md` to prevent it next
|
|
37
|
+
time, then **re-run the golden task** to confirm the regression is gone. That's how `CLAUDE.md` grows
|
|
38
|
+
*from real mistakes*, instead of bloating on speculation.
|
|
39
|
+
|
|
40
|
+
## Principle
|
|
41
|
+
|
|
42
|
+
> Don't improve the harness on vibes. **Read the trace, let it point to the exact level to fix**, fix
|
|
43
|
+
> it, then re-run the golden task to confirm it actually got better. That's the whole Level 5 loop.
|
|
@@ -1,88 +1,90 @@
|
|
|
1
|
-
# Guardrails — permission baseline (
|
|
1
|
+
# Guardrails — permission baseline (Level 3)
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
|
|
3
|
+
Level 3 controls **what the agent is allowed to do** — the safety boundary. Unlike Level 2 (a *clean*
|
|
4
|
+
context), Level 3 is about *safe* actions: block destructive commands, ask before risky ones, let
|
|
5
|
+
safe ones run straight through.
|
|
5
6
|
|
|
6
|
-
##
|
|
7
|
+
## Install
|
|
7
8
|
|
|
8
|
-
Copy `settings.json`
|
|
9
|
+
Copy `settings.json` into the repo's `.claude/`:
|
|
9
10
|
|
|
10
11
|
```bash
|
|
11
12
|
mkdir -p .claude
|
|
12
13
|
cp settings.json .claude/settings.json
|
|
13
14
|
```
|
|
14
15
|
|
|
15
|
-
**
|
|
16
|
-
|
|
17
|
-
|
|
16
|
+
**Why `.claude/settings.json` (not `settings.local.json`):** this file is **checked into the repo**,
|
|
17
|
+
so everyone who clones it **inherits the same guardrails automatically**. `settings.local.json` is a
|
|
18
|
+
personal override (gitignored) — use it for machine-specific tweaks, not to impose on the team.
|
|
18
19
|
|
|
19
|
-
>
|
|
20
|
-
>
|
|
21
|
-
>
|
|
22
|
-
>
|
|
20
|
+
> **The FIRST thing to do after copying — add test/lint/build commands to `allow`.** The baseline
|
|
21
|
+
> deliberately only allows read-only git. Without your repo's feedback loop, Claude asks permission
|
|
22
|
+
> *every time* it runs tests → you fall into the "click yes to get it over with" habit the *Insight*
|
|
23
|
+
> section below calls the real danger. Open `.claude/settings.json` and add your stack's commands to `allow`:
|
|
23
24
|
>
|
|
24
25
|
> - **Node:** `"Bash(npm run test:*)"`, `"Bash(npm run lint:*)"`, `"Bash(npm run build:*)"`
|
|
25
26
|
> - **Python:** `"Bash(pytest:*)"`, `"Bash(ruff:*)"`, `"Bash(mypy:*)"`
|
|
26
27
|
> - **Go:** `"Bash(go test:*)"`, `"Bash(go build:*)"`, `"Bash(go vet:*)"`
|
|
27
28
|
>
|
|
28
|
-
>
|
|
29
|
+
> This is **mandatory, not optional** — a fast feedback loop is the thread running through the whole kit.
|
|
29
30
|
|
|
30
|
-
##
|
|
31
|
+
## The 3-bucket model
|
|
31
32
|
|
|
32
|
-
|
|
33
|
+
Every action (run bash, read/edit a file) falls into one of 3 buckets:
|
|
33
34
|
|
|
34
|
-
|
|
|
35
|
-
|
|
36
|
-
| **deny** |
|
|
37
|
-
| **ask** |
|
|
38
|
-
| **allow** |
|
|
35
|
+
| Bucket | Meaning | Examples in the baseline |
|
|
36
|
+
|--------|---------|--------------------------|
|
|
37
|
+
| **deny** | absolutely forbidden, the agent can't call it | `rm -rf`, read `.env`/`secrets/**`, read `*.pem` keys |
|
|
38
|
+
| **ask** | stop and ask you first | `git push`, `git reset --hard`, `git clean`, `rm` |
|
|
39
|
+
| **allow** | run straight through, no prompt | `git status`, `git diff`, `git log`, `git branch` |
|
|
39
40
|
|
|
40
|
-
|
|
41
|
-
- Bash
|
|
42
|
-
-
|
|
41
|
+
Rule syntax: `Tool(specifier)`.
|
|
42
|
+
- Bash matches by **prefix**: `Bash(npm run test:*)` matches any command starting with `npm run test`.
|
|
43
|
+
- Files match **gitignore-style**: `Read(./secrets/**)`, `Edit(./dist/**)`.
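To make the prefix rule for Bash concrete, here is a toy model in Python. It only mirrors the behavior described above (a specifier ending in `:*` matches any command that starts with the text before it); it is not Claude Code's actual matcher, which may handle quoting, chaining, and other cases differently. The names `parse_rule` and `bash_rule_matches` are hypothetical.

```python
def parse_rule(rule: str) -> tuple:
    """Split `Tool(specifier)` into (tool, specifier)."""
    tool, _, rest = rule.partition("(")
    if rest.endswith(")"):
        rest = rest[:-1]
    return tool, rest


def bash_rule_matches(rule: str, command: str) -> bool:
    """Toy prefix matcher for Bash rules, as described in the text."""
    tool, specifier = parse_rule(rule)
    if tool != "Bash":
        return False
    if specifier.endswith(":*"):
        # `Bash(npm run test:*)` -> any command starting with `npm run test`
        return command.startswith(specifier[:-2])
    return command == specifier


print(bash_rule_matches("Bash(npm run test:*)", "npm run test -- --watch"))
print(bash_rule_matches("Bash(npm run test:*)", "npm run lint"))
```

Under this toy model a prefix also matches longer command names that merely begin with the same characters, which is one reason to keep `allow` prefixes specific.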
|
|
43
44
|
|
|
44
|
-
##
|
|
45
|
+
## Extend it for your repo
|
|
45
46
|
|
|
46
|
-
|
|
47
|
+
The baseline is deliberately **minimal and universal**. Add what's specific to your repo:
|
|
47
48
|
|
|
48
|
-
- **
|
|
49
|
+
- **Into `allow`** — safe, daily commands so you aren't asked constantly:
|
|
49
50
|
`Bash(npm run test:*)`, `Bash(npm run lint:*)`, `Bash(pytest:*)`, `Bash(make:*)`.
|
|
50
|
-
- **
|
|
51
|
-
|
|
52
|
-
- **
|
|
51
|
+
- **Into `ask`** — repo-specific actions with *big consequences*:
|
|
52
|
+
running a migration (`Bash(npm run migrate:*)`), deploys, `Bash(docker compose down:*)`.
|
|
53
|
+
- **Into `deny`** — paths that must never be edited/read:
|
|
53
54
|
`Edit(./dist/**)`, `Edit(./vendor/**)`, `Read(./**/*.key)`.
|
|
54
55
|
|
|
55
|
-
>
|
|
56
|
-
>
|
|
56
|
+
> Tip: don't over-stuff `allow`. Every time Claude asks is a chance for you to *review* — over-stuffing
|
|
57
|
+
> `allow` throws away your own review checkpoint.
|
|
57
58
|
|
|
58
|
-
## Insight: deny ≠
|
|
59
|
+
## Insight: deny ≠ airtight security
|
|
59
60
|
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
- `deny`/`ask`
|
|
63
|
-
- `allow`
|
|
64
|
-
|
|
61
|
+
A deny-list **doesn't** stop an adversary (an agent can route around it: write `rm` via a script,
|
|
62
|
+
base64…). It's a **safety net + friction reducer**:
|
|
63
|
+
- `deny`/`ask` block **accidents** (deleting/pushing by mistake) — they guard against *errors*, not *attacks*.
|
|
64
|
+
- `allow` lets safe commands run straight through → you avoid the "click yes to get it over with"
|
|
65
|
+
habit (that habit is the real danger).
|
|
65
66
|
|
|
66
|
-
**
|
|
67
|
-
|
|
67
|
+
**Real safety** is still **reviewing diffs + plan mode** before letting the agent act — that's runtime
|
|
68
|
+
discipline, not something you can package into a file.
|
|
68
69
|
|
|
69
|
-
##
|
|
70
|
+
## External content & prompt injection
|
|
70
71
|
|
|
71
|
-
`deny`/`ask`
|
|
72
|
-
web,
|
|
73
|
-
|
|
72
|
+
The `deny`/`ask` rules above block *accidents*; they **don't** stop prompt injection. Content the agent
|
|
73
|
+
reads from the web, issues, PRs, logs… is **data — not commands**, but an attacker can hide instructions
|
|
74
|
+
in it to steer the agent. This is **runtime discipline** (so the kit ships no pre-baked hook/CI-scan). A
|
|
75
|
+
few practical guards:
|
|
74
76
|
|
|
75
|
-
-
|
|
76
|
-
-
|
|
77
|
-
-
|
|
77
|
+
- Read external content (web/issue/PR) in **plan mode** — the agent *proposes* before it *acts*.
|
|
78
|
+
- DON'T let the agent auto-run a command / `curl` pulled out of content it just fetched.
|
|
79
|
+
- Untrusted input → split into a **separate session**; don't mix it into a high-privilege session.
|
|
78
80
|
|
|
79
|
-
|
|
80
|
-
|
|
81
|
+
Automated CI scanning and deeper background: see `docs/harness-engineering-tutorial.md` (the **Lurkr**
|
|
82
|
+
link for CI-scan, **OpenHands — mitigating prompt injection** for the background).
|
|
81
83
|
|
|
82
|
-
##
|
|
84
|
+
## Advanced (optional): Hooks
|
|
83
85
|
|
|
84
|
-
|
|
85
|
-
lint
|
|
86
|
+
When you need *logic* beyond static allow/deny — e.g. "block every edit to a protected path", "auto-run
|
|
87
|
+
lint after each edit" — use a **hook**: a script that runs *before/after* every tool call. Declare it in
|
|
86
88
|
`settings.json`:
|
|
87
89
|
|
|
88
90
|
```json
|
|
@@ -95,12 +97,12 @@ lint after each edit" — use a **hook**: a script that runs *before/af
|
|
|
95
97
|
}
|
|
96
98
|
```
|
|
97
99
|
|
|
98
|
-
|
|
99
|
-
|
|
100
|
-
|
|
100
|
+
The script reads the tool-call JSON from stdin; **exit code 2 = block**, with a message on stderr.
|
|
101
|
+
Because a blocking hook is repo-specific, the kit **ships none** — it just points the way. Write one
|
|
102
|
+
when you genuinely have a recurring rule that static allow/deny can't express.
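The contract described above (JSON on stdin, exit code 2 to block, message on stderr) can be sketched as a small script. This is a hedged Python example, not a hook the kit ships: the payload fields `tool_name` and `tool_input.file_path` are assumptions about the tool-call JSON, and `PROTECTED_DIRS`, `decide`, and `main` are hypothetical names.

```python
import json
import sys
from pathlib import PurePosixPath

# Hypothetical protected directories; adapt to the repo.
PROTECTED_DIRS = {"dist", "vendor"}


def decide(payload: dict) -> tuple:
    """Return (exit_code, message): 2 blocks the tool call, 0 lets it through."""
    if payload.get("tool_name") not in ("Edit", "Write"):
        return 0, ""
    # Assumed field: the edited file's path lives at tool_input.file_path.
    path = str((payload.get("tool_input") or {}).get("file_path", ""))
    parents = PurePosixPath(path).parts[:-1]
    hit = PROTECTED_DIRS.intersection(parents)
    if hit:
        return 2, f"Blocked: {path} is under a protected directory ({sorted(hit)[0]}/)"
    return 0, ""


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read the tool-call JSON from stdin and report the decision."""
    code, message = decide(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code


# Demo with an in-memory payload instead of real stdin.
print(decide({"tool_name": "Edit", "tool_input": {"file_path": "./dist/bundle.js"}}))
```

To use it as a real hook, the script would end with `sys.exit(main())` and be registered as a command in `settings.json`; check the exact hook configuration against the Claude Code docs.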
|
|
101
103
|
|
|
102
|
-
**Audit-log (`PostToolUse`)** —
|
|
103
|
-
`evals/observability.md` (
|
|
104
|
+
**Audit-log (`PostToolUse`)** — record *every* tool call to review later. This is what
|
|
105
|
+
`evals/observability.md` (Level 5) points to; this hook is **generic, not repo-specific**:
|
|
104
106
|
|
|
105
107
|
```json
|
|
106
108
|
{
|
|
@@ -112,7 +114,7 @@ a recurring rule that static allow/deny can't express.
|
|
|
112
114
|
}
|
|
113
115
|
```
|
|
114
116
|
|
|
115
|
-
`audit-log.sh`
|
|
117
|
+
`audit-log.sh` just appends the stdin payload to a file — one JSON tool-call per line:
|
|
116
118
|
|
|
117
119
|
```bash
|
|
118
120
|
#!/usr/bin/env bash
|