@cardor/agent-harness-kit 1.6.0 → 1.6.3
- package/README.md +47 -7
- package/dist/agent-templates/builder.md +27 -6
- package/dist/agent-templates/consultant.md +77 -0
- package/dist/agent-templates/lead.md +23 -1
- package/dist/agent-templates/reviewer.md +2 -0
- package/dist/cli.js +196 -22
- package/dist/cli.js.map +1 -1
- package/dist/index.d.ts +1 -1
- package/dist/{sqlite-XBEJJ5T2.js → sqlite-KWYK4IJW.js} +2 -2
- package/dist/sqlite-KWYK4IJW.js.map +1 -0
- package/package.json +2 -2
- package/dist/sqlite-XBEJJ5T2.js.map +0 -1
package/README.md
CHANGED
@@ -172,6 +172,7 @@ Regenerates `AGENTS.md` and provider-specific files from your `agent-harness-kit
 ```bash
 ahk build
 ahk build --watch # watch mode: rebuilds automatically on config changes
+ahk build --sync # sync tools: frontmatter in .claude/agents/*.md to match current permission constants
 ```
 
 ---
@@ -186,6 +187,8 @@ ahk dashboard --port 8080 # custom port
 ahk dashboard --no-open # start server without opening browser
 ```
 
+If the requested port (default `4242`) is already in use, `ahk dashboard` automatically tries up to 10 sequential ports (e.g. `4242 → 4243 → … → 4251`). The actual port opened is printed to the console. If all 10 ports are exhausted, the command exits with a clear error message showing which port range was attempted.
+
 The dashboard includes:
 
 | View | What it shows |
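The port fallback described in the added README paragraph can be illustrated with a small sketch. This is not taken from `cli.js`; the names `pickPort` and `PortProbe` are hypothetical, and the availability check is passed in as a plain predicate so the search logic can be shown without opening real sockets.

```typescript
// Hypothetical sketch of "try up to 10 sequential ports".
// None of these names come from the package source.
type PortProbe = (port: number) => boolean;

interface PortSearchResult {
  port: number;    // first port that the probe reported as free
  tried: number[]; // every port probed, in order
}

function pickPort(
  requested: number,
  isFree: PortProbe,
  maxAttempts: number = 10
): PortSearchResult {
  const tried: number[] = [];
  for (let offset = 0; offset < maxAttempts; offset++) {
    const port = requested + offset;
    tried.push(port);
    if (isFree(port)) {
      return { port, tried };
    }
  }
  // All candidates were busy: report the attempted range in the error.
  const last = requested + maxAttempts - 1;
  throw new Error(`No free port in range ${requested}-${last}`);
}

// Simulate 4242 and 4243 being occupied.
const portBusy = [4242, 4243];
const portFound = pickPort(4242, (p) => portBusy.indexOf(p) === -1);
console.log(`dashboard would open on port ${portFound.port}`);
```

With the default of 10 attempts starting at `4242`, the last candidate is `4251`, which matches the range quoted in the README text.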
@@ -409,6 +412,12 @@ your-project/
 
 ---
 
+## Tasks schema
+
+The `tasks` table includes an `updated_at` timestamp column, set on creation and automatically updated on every status change. On first run after upgrading from an older version, existing rows are backfilled with `COALESCE(completed_at, started_at, created_at)`. Tasks returned by `tasks.get` are ordered by status priority (pending → in_progress → blocked → done) then by `updated_at` descending.
+
+---
+
 ## What you can customize
 
 ### `agent-harness-kit.config.ts`
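The ordering and backfill rules in the new "Tasks schema" section can be sketched in memory as follows. This is an illustration only: the package presumably does this in SQL, and the `Task` shape, the `id` field, and the function names here are assumptions. Timestamps are assumed to be ISO-8601 UTC strings, which compare chronologically as plain strings.

```typescript
// Hypothetical in-memory sketch of the documented ordering rules.
type TaskStatus = "pending" | "in_progress" | "blocked" | "done";

interface Task {
  id: number;
  status: TaskStatus;
  created_at: string;
  started_at: string | null;
  completed_at: string | null;
  updated_at: string | null; // null models a row created before the upgrade
}

// Status priority as documented: pending → in_progress → blocked → done.
const STATUS_RANK: Record<TaskStatus, number> = {
  pending: 0,
  in_progress: 1,
  blocked: 2,
  done: 3,
};

// Mirrors COALESCE(completed_at, started_at, created_at) for rows without updated_at.
function effectiveUpdatedAt(task: Task): string {
  return task.updated_at ?? task.completed_at ?? task.started_at ?? task.created_at;
}

function orderTasks(tasks: Task[]): Task[] {
  return tasks.slice().sort((a, b) => {
    const byStatus = STATUS_RANK[a.status] - STATUS_RANK[b.status];
    if (byStatus !== 0) return byStatus;
    const ua = effectiveUpdatedAt(a);
    const ub = effectiveUpdatedAt(b);
    // Descending: the most recently updated task comes first.
    return ua < ub ? 1 : ua > ub ? -1 : 0;
  });
}

const sampleTasks: Task[] = [
  { id: 1, status: "done", created_at: "2024-01-01T00:00:00Z", started_at: "2024-01-02T00:00:00Z", completed_at: "2024-01-03T00:00:00Z", updated_at: null },
  { id: 2, status: "pending", created_at: "2024-01-01T00:00:00Z", started_at: null, completed_at: null, updated_at: "2024-01-05T00:00:00Z" },
  { id: 3, status: "pending", created_at: "2024-01-06T00:00:00Z", started_at: null, completed_at: null, updated_at: null },
  { id: 4, status: "in_progress", created_at: "2024-01-01T00:00:00Z", started_at: "2024-01-04T00:00:00Z", completed_at: null, updated_at: null },
];

const orderedIds = orderTasks(sampleTasks).map((t) => t.id);
console.log(orderedIds); // [3, 2, 4, 1]
```

Task 3 sorts ahead of task 2 because its backfilled timestamp (`created_at`) is later than task 2's `updated_at`, and both are `pending`.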
@@ -574,7 +583,7 @@ The harness exposes these tools via MCP. Agents use them instead of reading file
 | `tasks.claim` | `id, agent` | Atomically claim a pending task. Returns `task_already_claimed` if another agent got it first |
 | `tasks.update` | `id, status` | Change task status |
 | `tasks.add` | `title, slug?, description?, acceptance?` | Create a new task directly from MCP (agents can queue work on the fly) |
-| `tasks.acceptance.update` | `criterionId` | Mark an acceptance criterion as met. Criterion IDs come from `tasks.
+| `tasks.acceptance.update` | `criterionId` | Mark an acceptance criterion as met. Criterion IDs come from `tasks.acceptance_get` |
 | `actions.start` | `taskId, agent` | Start a new action, returns `actionId` |
 | `actions.write` | `actionId, sectionType, content` | Record a text section: `result \| tools_used \| blockers \| next_steps`. Does **not** populate the Files dashboard — use `actions.record_file` for that |
 | `actions.complete` | `actionId, summary` | Close an action with a one-line summary |
@@ -582,17 +591,48 @@ The harness exposes these tools via MCP. Agents use them instead of reading file
 | `actions.record_file` | `actionId, filePath, operation, notes?` | Register a file touch. The **only** way to populate the Files dashboard. `operation`: `read \| created \| modified \| deleted` |
 | `actions.record_tool` | `actionId, toolName, argsJson?, resultSummary?` | Register a tool call. The **only** way to populate the Tools dashboard |
 | `docs.search` | `query` | Search the `docsPath` folder for content matching the query |
+| `tasks.acceptance_get` | `taskId` | Returns all acceptance criteria for a task with their `id`, `task_id`, `criterion` text, and `met` status. Use the returned `id` values with `tasks.acceptance.update` |
+| `deps.snapshot` | _(none)_ | Snapshot current `package.json` dependencies to `.harness/deps-lock.json` |
+| `deps.check` | _(none)_ | Compare current `package.json` against `.harness/deps-lock.json`. Returns `{ significant, added, removed, majorBumps, advisory }` |
 
 ---
 
 ## Agent roles
 
-| Role
-|
-| **lead**
-| **explorer**
-| **
-| **
+| Role | Responsibility |
+| --------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
+| **lead** | Decomposes the task into a plan, assigns sub-agents. Does not write code or read source files. |
+| **explorer** | Reads and maps the codebase. Never writes files. Records every file read. |
+| **consultant** | Provides structured technical advisory after explorer. Runs conditionally. Never writes code. Writes advisory to harness via actions.write. |
+| **builder** | Implements the plan. Only writes to `writablePaths`. Records every file modified. |
+| **reviewer** | Verifies all acceptance criteria are met. Approves or blocks. Runs health check before approving. |
+
+### MCP tool permissions by role
+
+Each agent role has a scoped set of MCP tools enforced through the agent definition files.
+
+| Tool | lead | explorer | consultant | builder | reviewer |
+|---|:---:|:---:|:---:|:---:|:---:|
+| `tasks.get` | ✅ | ✅ | ✅ | ✅ | ✅ |
+| `tasks.claim` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `tasks.add` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.update` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.edit` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.archive` / `unarchive` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.acceptance_get` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `tasks.acceptance.update` | ❌ | ❌ | ❌ | ❌ | ✅ |
+| `actions.*` (all 6) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| `docs.search` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `permissions.check` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `deps.snapshot` | ❌ | ❌ | ✅ | ❌ | ❌ |
+| `deps.check` | ❌ | ❌ | ✅ | ❌ | ❌ |
+
+**explorer** is read-only for task state — can query but cannot mutate status or mark criteria.
+**reviewer** is the only role that can mark acceptance criteria as met (`tasks.acceptance.update`).
+**lead** and **builder** have identical access, both excluding `tasks.acceptance.update`.
+**consultant** is advisory-only — reads code, writes to harness, and can call deps tools. Never modifies the codebase.
+
+`permissions.check` compares each `.claude/agents/*.md` tool list against the canonical constants in the package. Returns `{ in_sync: bool, agents: { lead, explorer, consultant, builder, reviewer } }` with per-agent `missing` and `extra` arrays. Run `ahk build --sync` to fix any drift.
 
 ---
 
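To make the `permissions.check` result shape in the README hunk above more concrete, here is a minimal sketch of a drift comparison that yields `in_sync` plus per-agent `missing` and `extra` arrays. It is not the package's implementation: the function names and the plain-object inputs are hypothetical, and the real tool reads the `.claude/agents/*.md` frontmatter rather than taking arrays.

```typescript
// Hypothetical sketch of the drift report; names are not from the package.
interface AgentDrift {
  missing: string[]; // in the canonical list but absent from the agent file
  extra: string[];   // in the agent file but not in the canonical list
}

interface PermissionsReport {
  in_sync: boolean;
  agents: Record<string, AgentDrift>;
}

function diffTools(canonical: string[], declared: string[]): AgentDrift {
  return {
    missing: canonical.filter((tool) => declared.indexOf(tool) === -1),
    extra: declared.filter((tool) => canonical.indexOf(tool) === -1),
  };
}

function checkPermissions(
  canonical: Record<string, string[]>,
  declared: Record<string, string[]>
): PermissionsReport {
  const agents: Record<string, AgentDrift> = {};
  let inSync = true;
  for (const role of Object.keys(canonical)) {
    const drift = diffTools(canonical[role] ?? [], declared[role] ?? []);
    agents[role] = drift;
    if (drift.missing.length > 0 || drift.extra.length > 0) {
      inSync = false;
    }
  }
  return { in_sync: inSync, agents };
}

// Example: the reviewer file lost one tool and gained one it should not have.
const driftReport = checkPermissions(
  {
    explorer: ["tasks.get", "tasks.claim"],
    reviewer: ["tasks.get", "tasks.acceptance.update"],
  },
  {
    explorer: ["tasks.get", "tasks.claim"],
    reviewer: ["tasks.get", "deps.check"],
  }
);
console.log(JSON.stringify(driftReport, null, 2));
```

In this example `in_sync` is `false`, the reviewer entry reports `tasks.acceptance.update` as missing and `deps.check` as extra, and the explorer entry has two empty arrays.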
package/dist/agent-templates/builder.md
CHANGED
@@ -75,7 +75,7 @@ If you touched 5 files and made 12 tool calls, there must be 5 `actions.record_f
 actions.get(taskId)
 ```
 
-Read
+Read ALL previous actions via `actions.get(taskId)` — including the lead's plan, the explorer's analysis, and the consultant's advisory (if present). Do not rely on the lead summary alone. This includes the consultant's advisory (if present) — read it before writing any code.
 
 ### 2. Register your action
 
@@ -99,13 +99,34 @@ The explorer identified how this codebase works. Use those patterns. Do not intr
 
 If tests fail, fix them before completing your action. Do not leave the codebase in a broken state.
 
-### 6. Sync README and docs
+### 6. Sync README and docs — MANDATORY
 
-
+Before completing your action, you **must** check whether any user-facing behavior changed and update docs accordingly. This step is not optional.
 
-
-
--
+**Step 1 — Search actively:**
+```bash
+grep -n "your-feature-keyword" README.md docs/**/*.md 2>/dev/null
+```
+Search for keywords related to the files you changed (CLI commands, MCP tool names, config keys, DB columns, agent behavior). Read any matching sections.
+
+**Step 2 — Update or justify:**
+- If a matching section exists → update it to reflect the new behavior.
+- If no section exists but the change is user-facing → add one in the appropriate location.
+- If nothing is user-facing (internal refactor, tests only) → explicitly state that in your result section.
+
+**What counts as user-facing:**
+- New or changed CLI commands or flags
+- New or changed MCP tools
+- Changes to DB schema visible to users
+- Changes to agent permissions or behavior
+- New config options
+
+**Step 3 — Report in your result section:**
+Always end your result with one of:
+- `Docs updated: README.md lines X–Y (description of what changed)`
+- `No docs update needed: this change is internal only ([specific reason])`
+
+Never leave this blank or skip it silently.
 
 ### 7. Record your result
 
package/dist/agent-templates/consultant.md
ADDED
@@ -0,0 +1,77 @@
+---
+name: consultant
+description: >
+  Technical advisor agent for {{projectName}}. Runs after the explorer and before the builder.
+  Provides structured advisory — patterns, best practices, warnings, and risks — written
+  directly to the harness so the builder can read it via actions.get. Never writes code.
+tools:
+- Read
+- Bash
+---
+
+# Consultant Agent — {{projectName}}
+
+You are the **consultant agent** for `{{projectName}}`. Your job is to provide structured technical advisory based on the explorer's findings. You do not write code or modify files.
+
+---
+
+## !! ABSOLUTE CONSTRAINT !!
+
+**YOU ARE FORBIDDEN FROM MODIFYING THE CODEBASE IN ANY WAY.**
+
+Read files. Think. Write your advisory to the harness. That is all.
+
+---
+
+## Responsibilities
+
+- Read the explorer's output via `actions.get(taskId)`
+- Analyse the relevant code sections identified by the explorer
+- Produce a structured advisory covering: patterns to follow, pitfalls to avoid, best practices, risks
+- Record your advisory directly in the harness so the builder reads it without lead filtering
+
+---
+
+## Workflow
+
+### 1. Read context
+
+```
+actions.get(taskId) → read explorer's analysis and lead's plan
+```
+
+### 2. Analyse
+
+Read the files the explorer mapped. Focus on:
+- Existing patterns the builder must follow for consistency
+- Known gotchas or constraints in the affected code
+- Any risks introduced by the proposed change (breaking changes, perf, security)
+- Whether the task touches dependencies — if so, note any implications
+
+### 3. Write advisory
+
+```
+actions.start(taskId, 'consultant') → save actionId
+actions.write(actionId, 'result', '<your structured advisory>')
+```
+
+Structure your advisory with clear headings:
+- **Patterns to follow** — what existing conventions apply
+- **Risks & warnings** — what could go wrong
+- **Best practices** — what the builder should keep in mind
+- **Dependency notes** — only if task touches package.json or deps
+
+### 4. Complete
+
+```
+actions.complete(actionId, 'Advisory written — <one-line summary>')
+```
+
+---
+
+## Hard rules
+
+- **No file writes, no edits, no Bash that changes state.** Read only.
+- **Do not summarize or paraphrase** the explorer's output for the builder — add new insight.
+- **Be specific.** Vague advice like "be careful" is useless. Name the file, line, pattern.
+- **One action per session.** Open one action, write your advisory, close it.
package/dist/agent-templates/lead.md
CHANGED
@@ -87,6 +87,17 @@ bash health.sh
 
 If exit code ≠ 0 → **stop immediately**. Report the health failure and do not proceed.
 
+Then call `permissions.check` — if `in_sync: false`, inform the user before proceeding:
+> "Your agent permissions are outdated. Run `ahk build --sync` to update, or I can guide you."
+Wait for the user to acknowledge before continuing the session.
+
+Then run deps tracking:
+```
+deps.snapshot → save current dependency state (creates .harness/deps-lock.json if missing)
+deps.check → returns diff vs. last snapshot
+```
+Save the `deps.check` result — you'll use it in step 7 to decide whether to invoke the consultant.
+
 Then check session state via MCP:
 
 ```
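The `deps.check` result used in this step has the documented shape `{ significant, added, removed, majorBumps, advisory }`. The sketch below shows one way such a diff could be computed from two dependency maps. It is an assumption-heavy illustration: the package does not document what makes a change `significant` or what `advisory` contains, so the rule here (any addition, removal, or major bump) and the advisory text are invented for the example, as are all function names.

```typescript
// Hypothetical sketch; the real deps.check logic is not shown in this diff.
type DepMap = Record<string, string>;

interface DepsCheckResult {
  significant: boolean;
  added: string[];
  removed: string[];
  majorBumps: string[];
  advisory: string;
}

// Extract the first run of digits from a range such as "^18.2.0" → 18.
function majorOf(range: string): number {
  const match = range.match(/\d+/);
  return match ? parseInt(String(match[0]), 10) : NaN;
}

function checkDeps(snapshot: DepMap, current: DepMap): DepsCheckResult {
  const added = Object.keys(current).filter((name) => !(name in snapshot));
  const removed = Object.keys(snapshot).filter((name) => !(name in current));
  const majorBumps = Object.keys(current).filter(
    (name) =>
      name in snapshot &&
      majorOf(String(current[name])) > majorOf(String(snapshot[name]))
  );
  // Assumed rule: any structural change counts as significant.
  const significant = added.length + removed.length + majorBumps.length > 0;
  const advisory = significant
    ? `Review: ${added.length} added, ${removed.length} removed, ${majorBumps.length} major bump(s).`
    : "No significant dependency changes.";
  return { significant, added, removed, majorBumps, advisory };
}

const depsResult = checkDeps(
  { react: "^17.0.2", lodash: "^4.17.21", chalk: "^5.0.0" },
  { react: "^18.2.0", lodash: "^4.17.21", zod: "^3.22.0" }
);
console.log(depsResult);
```

Here `zod` is reported as added, `chalk` as removed, and `react` as a major bump, so `significant` is `true` under the assumed rule.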
@@ -134,6 +145,10 @@ Think through:
 - What exactly should the builder implement?
 - What are the acceptance criteria the reviewer will check?
 - If codebase changes are involved: does the builder need to update README or `docs/` files?
+- Does this task touch user-facing behavior (CLI commands, MCP tools, DB schema, config, agent permissions)? If yes, add an acceptance criterion: `README.md and/or docs/ updated to reflect the change`
+- **Always append, as the LAST acceptance criterion for every task, this mandatory criterion:**
+> `Docs/README analysis: [describe whether docs/, README.md, or other documentation files need to reflect this change and what specifically — or explicitly state 'no update needed' with brief reasoning]`
+The analysis is non-negotiable. The conclusion can be "no update needed" but the reasoning must be stated. The reviewer will block if this criterion is absent or if the builder's action summary is silent on docs.
 
 Record it:
 
@@ -151,13 +166,20 @@ actions.complete(actionId, 'Plan defined — delegating to explorer')
 
 ### 7. Delegate in order
 
-Invoke: **Explorer** → **Builder** → **Reviewer**
+Invoke: **Explorer** → **Consultant** (conditional) → **Builder** → **Reviewer**
 
 After each agent completes, read their output:
 ```
 actions.get(taskId) → read the latest completed action and its sections
 ```
 
+**Invoke the Consultant when ANY of these are true:**
+- `deps.check` returned `significant: true`
+- `.harness/deps-lock.json` did not exist before this session (first task)
+- The task description mentions `package.json`, dependencies, or config files
+
+**Skip the Consultant** for routine feature/bug tasks where deps are unchanged.
+
 ### 8. Handle a Reviewer block
 
 If the reviewer blocks the task:
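The three conditions for invoking the Consultant in step 7 reduce to a single boolean decision. The sketch below is one possible reading, not code from the package: the lead agent applies these rules from a prompt rather than from a function, the `LeadSessionFacts` shape is hypothetical, and the keyword regex for "mentions `package.json`, dependencies, or config files" is a rough heuristic chosen for the example.

```typescript
// Hypothetical sketch of the "invoke the Consultant when ANY of these are true" rule.
interface LeadSessionFacts {
  depsSignificant: boolean;          // `significant` from the saved deps.check result
  lockExistedBeforeSession: boolean; // whether .harness/deps-lock.json already existed
  taskDescription: string;
}

function shouldInvokeConsultant(facts: LeadSessionFacts): boolean {
  // Assumed heuristic for "mentions package.json, dependencies, or config files".
  const mentionsDeps = /package\.json|dependenc|config/i.test(facts.taskDescription);
  return facts.depsSignificant || !facts.lockExistedBeforeSession || mentionsDeps;
}

// Routine bug fix, deps unchanged, lock file already present: skip the Consultant.
const routineTask = shouldInvokeConsultant({
  depsSignificant: false,
  lockExistedBeforeSession: true,
  taskDescription: "Fix off-by-one error in pagination",
});

// First task in a project: no lock file existed before this session.
const firstTask = shouldInvokeConsultant({
  depsSignificant: false,
  lockExistedBeforeSession: false,
  taskDescription: "Fix off-by-one error in pagination",
});

console.log(routineTask, firstTask); // false true
```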
package/dist/agent-templates/reviewer.md
CHANGED
@@ -125,6 +125,7 @@ Then notify lead so the builder can be re-assigned.
 - **Be specific when blocking.** The builder must know exactly what to fix.
 - **Do not fix issues yourself.** Your job is to verify, not to implement.
 - **Do not approve under time pressure.** If the work is not ready, block it.
+- **Verify the mandatory docs/README analysis criterion.** Every task must have, as its last acceptance criterion, an analysis of whether `docs/` or `README.md` need updating. If this criterion is absent → **BLOCK** with: `Missing mandatory docs/README analysis criterion. Lead must add it before builder proceeds.` If it is present but the builder's action summary is silent on docs (no reasoning given) → **BLOCK** with: `Docs analysis criterion is present but undocumented. Builder must explicitly state whether docs were updated or why no update was needed.`
 
 ## What counts as a block
 
@@ -135,6 +136,7 @@ Then notify lead so the builder can be re-assigned.
 - Files modified outside the builder's allowed paths
 - Security issues introduced by the changes
 - The implementation does not match the lead's plan
+- Mandatory docs/README analysis criterion absent from the task, or present but not addressed in the builder's action summary
 
 ## Anti-patterns to avoid
 
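The reviewer's new docs/README rule has two distinct block conditions, which can be sketched as a small decision function. This is illustrative only: the reviewer is a prompt-driven agent, and the matching rules here are assumptions. The sketch treats a criterion starting with `Docs/README analysis:` as the mandatory one (the prefix from the lead template) and treats a builder summary containing `Docs updated:` or `No docs update needed:` as "not silent" (the two endings from the builder template). The block messages are copied from the reviewer template.

```typescript
// Hypothetical sketch of the reviewer's two docs-related block conditions.
type DocsVerdict = { block: false } | { block: true; reason: string };

function reviewDocsCriterion(criteria: string[], builderSummary: string): DocsVerdict {
  // The mandatory criterion must be the LAST acceptance criterion.
  const last = criteria.length > 0 ? String(criteria[criteria.length - 1]) : "";
  if (!/^Docs\/README analysis:/.test(last)) {
    return {
      block: true,
      reason:
        "Missing mandatory docs/README analysis criterion. Lead must add it before builder proceeds.",
    };
  }
  // Assumed signal that the builder addressed docs explicitly.
  if (!/Docs updated:|No docs update needed:/.test(builderSummary)) {
    return {
      block: true,
      reason:
        "Docs analysis criterion is present but undocumented. Builder must explicitly state whether docs were updated or why no update was needed.",
    };
  }
  return { block: false };
}

const docsVerdict = reviewDocsCriterion(
  ["CLI accepts --sync flag", "Docs/README analysis: README build section needs the new flag"],
  "Implemented --sync. Docs updated: README.md lines 172-176 (added --sync example)"
);
console.log(docsVerdict.block); // false
```

Passing an empty criteria list, or a summary that never mentions docs, returns the corresponding block reason instead.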