npm - @cardor/agent-harness-kit - Versions diffs - 1.6.0 → 1.6.3 - Mend

@cardor/agent-harness-kit 1.6.0 → 1.6.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/README.md +47 -7
package/dist/agent-templates/builder.md +27 -6
package/dist/agent-templates/consultant.md +77 -0
package/dist/agent-templates/lead.md +23 -1
package/dist/agent-templates/reviewer.md +2 -0
package/dist/cli.js +196 -22
package/dist/cli.js.map +1 -1
package/dist/index.d.ts +1 -1
package/dist/{sqlite-XBEJJ5T2.js → sqlite-KWYK4IJW.js} +2 -2
package/dist/sqlite-KWYK4IJW.js.map +1 -0
package/package.json +2 -2
package/dist/sqlite-XBEJJ5T2.js.map +0 -1

package/README.md CHANGED

@@ -172,6 +172,7 @@ Regenerates `AGENTS.md` and provider-specific files from your `agent-harness-kit
 ```bash
 ahk build
 ahk build --watch    # watch mode: rebuilds automatically on config changes
+ahk build --sync     # sync the `tools:` frontmatter in .claude/agents/*.md with the current permission constants
 ```
 ---
@@ -186,6 +187,8 @@ ahk dashboard --port 8080      # custom port
 ahk dashboard --no-open        # start server without opening browser
 ```
+If the requested port (default `4242`) is already in use, `ahk dashboard` automatically tries up to 10 sequential ports (e.g. `4242 → 4243 → … → 4251`). The actual port opened is printed to the console. If all 10 ports are exhausted, the command exits with a clear error message showing which port range was attempted.
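
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: `pickPort` and the `isFree` probe are hypothetical names, and a real probe would normally be asynchronous (for example, attempting to bind a socket). The probe is injected here so the selection logic stays easy to follow.

```typescript
// Hypothetical helper: tries `attempts` sequential ports starting at `requested`.
// `isFree` is an assumed stand-in for a real (normally async) port probe.
function pickPort(
  requested: number,
  isFree: (port: number) => boolean,
  attempts = 10,
): number {
  const last = requested + attempts - 1;
  for (let port = requested; port <= last; port++) {
    if (isFree(port)) return port;
  }
  // All candidates were busy: report the range that was attempted.
  throw new Error(`No free port in range ${requested}-${last}`);
}

// Example: 4242 and 4243 are busy, so the third candidate is chosen.
const busyPorts = new Set([4242, 4243]);
const chosenPort = pickPort(4242, (p) => !busyPorts.has(p));
console.log(`Dashboard would listen on port ${chosenPort}`);
```

With the default of 10 attempts and a start of `4242`, the last candidate is `4251`, which matches the range quoted in the README text.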
 The dashboard includes:
 | View            | What it shows                                                               |
@@ -409,6 +412,12 @@ your-project/
 ---
+## Tasks schema
+The `tasks` table includes an `updated_at` timestamp column, set on creation and automatically updated on every status change. On first run after upgrading from an older version, existing rows are backfilled with `COALESCE(completed_at, started_at, created_at)`. Tasks returned by `tasks.get` are ordered by status priority (pending → in_progress → blocked → done) then by `updated_at` descending.
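
As a rough illustration of the backfill and ordering rules above, the sketch below reproduces them in memory. The `Task` shape, `effectiveUpdatedAt`, and `sortTasks` are assumptions made for this example; the real logic lives in the package's SQLite layer and may differ in detail.

```typescript
type Status = "pending" | "in_progress" | "blocked" | "done";

// Assumed row shape; only the columns named in the README text are modelled.
interface Task {
  id: number;
  status: Status;
  created_at: string; // ISO-8601 strings in one format compare correctly as text
  started_at?: string | null;
  completed_at?: string | null;
  updated_at?: string | null;
}

const STATUS_PRIORITY: Record<Status, number> = {
  pending: 0,
  in_progress: 1,
  blocked: 2,
  done: 3,
};

// Keeps an existing updated_at; otherwise mirrors
// COALESCE(completed_at, started_at, created_at).
function effectiveUpdatedAt(task: Task): string {
  return task.updated_at ?? task.completed_at ?? task.started_at ?? task.created_at;
}

// Status priority first, then updated_at descending. Returns a new array.
function sortTasks(tasks: Task[]): Task[] {
  return [...tasks].sort((a, b) => {
    const byStatus = STATUS_PRIORITY[a.status] - STATUS_PRIORITY[b.status];
    if (byStatus !== 0) return byStatus;
    const ua = effectiveUpdatedAt(a);
    const ub = effectiveUpdatedAt(b);
    return ua < ub ? 1 : ua > ub ? -1 : 0;
  });
}

const sampleTasks: Task[] = [
  { id: 1, status: "done", created_at: "2024-01-01T00:00:00Z", completed_at: "2024-01-05T00:00:00Z" },
  { id: 2, status: "pending", created_at: "2024-01-02T00:00:00Z" },
  { id: 3, status: "pending", created_at: "2024-01-03T00:00:00Z" },
  { id: 4, status: "in_progress", created_at: "2024-01-01T00:00:00Z", started_at: "2024-01-04T00:00:00Z" },
];
console.log(sortTasks(sampleTasks).map((t) => t.id)); // [3, 2, 4, 1]
```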
+---
 ## What you can customize
 ### `agent-harness-kit.config.ts`
@@ -574,7 +583,7 @@ The harness exposes these tools via MCP. Agents use them instead of reading file
 | `tasks.claim`             | `id, agent`                                     | Atomically claim a pending task. Returns `task_already_claimed` if another agent got it first                                                           |
 | `tasks.update`            | `id, status`                                    | Change task status                                                                                                                                      |
 | `tasks.add`               | `title, slug?, description?, acceptance?`       | Create a new task directly from MCP (agents can queue work on the fly)                                                                                  |
-| `tasks.acceptance.update` | `criterionId`                                   | Mark an acceptance criterion as met. Criterion IDs come from `tasks.get`                                                                                |
+| `tasks.acceptance.update` | `criterionId`                                   | Mark an acceptance criterion as met. Criterion IDs come from `tasks.acceptance_get`                                                                     |
 | `actions.start`           | `taskId, agent`                                 | Start a new action, returns `actionId`                                                                                                                  |
 | `actions.write`           | `actionId, sectionType, content`                | Record a text section: `result \| tools_used \| blockers \| next_steps`. Does **not** populate the Files dashboard — use `actions.record_file` for that |
 | `actions.complete`        | `actionId, summary`                             | Close an action with a one-line summary                                                                                                                 |
@@ -582,17 +591,48 @@ The harness exposes these tools via MCP. Agents use them instead of reading file
 | `actions.record_file`     | `actionId, filePath, operation, notes?`         | Register a file touch. The **only** way to populate the Files dashboard. `operation`: `read \| created \| modified \| deleted`                          |
 | `actions.record_tool`     | `actionId, toolName, argsJson?, resultSummary?` | Register a tool call. The **only** way to populate the Tools dashboard                                                                                  |
 | `docs.search`             | `query`                                         | Search the `docsPath` folder for content matching the query                                                                                             |
+| `tasks.acceptance_get`    | `taskId`                                        | Returns all acceptance criteria for a task with their `id`, `task_id`, `criterion` text, and `met` status. Use the returned `id` values with `tasks.acceptance.update` |
+| `deps.snapshot`           | _(none)_                                        | Snapshot current `package.json` dependencies to `.harness/deps-lock.json`                                                                              |
+| `deps.check`              | _(none)_                                        | Compare current `package.json` against `.harness/deps-lock.json`. Returns `{ significant, added, removed, majorBumps, advisory }`                       |
 ---
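
To make the shape of the `deps.check` result concrete, here is a hedged sketch of how such a comparison could be computed. The README only documents the returned field names; the `diffDeps` function, the rule that any addition, removal, or major bump counts as `significant`, and the `advisory` wording are all assumptions for illustration.

```typescript
type DepMap = Record<string, string>;

interface DepsDiff {
  significant: boolean;
  added: string[];
  removed: string[];
  majorBumps: string[];
  advisory: string;
}

const hasDep = (deps: DepMap, name: string): boolean =>
  Object.prototype.hasOwnProperty.call(deps, name);

// Extracts the first number from a range such as "^1.6.3" or "~2.0.0".
function majorOf(range: string): number | null {
  const match = range.match(/(\d+)/);
  return match ? Number(match[1]) : null;
}

// Compares a stored snapshot (`locked`) with the current dependency map.
function diffDeps(locked: DepMap, current: DepMap): DepsDiff {
  const added = Object.keys(current).filter((n) => !hasDep(locked, n)).sort();
  const removed = Object.keys(locked).filter((n) => !hasDep(current, n)).sort();
  const majorBumps = Object.keys(current)
    .filter((n) => hasDep(locked, n))
    .filter((n) => {
      const before = majorOf(locked[n] ?? "");
      const after = majorOf(current[n] ?? "");
      return before !== null && after !== null && after > before;
    })
    .sort();
  // Assumption: any addition, removal, or major bump is "significant".
  const significant = added.length + removed.length + majorBumps.length > 0;
  const advisory = significant
    ? "Dependencies changed since the last snapshot"
    : "No significant dependency changes";
  return { significant, added, removed, majorBumps, advisory };
}

const lockedDeps: DepMap = { react: "^17.0.2", lodash: "^4.17.21", chalk: "^4.1.0" };
const currentDeps: DepMap = { react: "^18.2.0", lodash: "^4.17.21", zod: "^3.22.0" };
console.log(diffDeps(lockedDeps, currentDeps));
```

In this example `zod` is reported as added, `chalk` as removed, and `react` as a major bump, so `significant` is `true` under the assumed rule.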
 ## Agent roles
-| Role         | Responsibility                                                                                    |
-| ------------ | ------------------------------------------------------------------------------------------------- |
-| **lead**     | Decomposes the task into a plan, assigns sub-agents. Does not write code or read source files.    |
-| **explorer** | Reads and maps the codebase. Never writes files. Records every file read.                         |
-| **builder**  | Implements the plan. Only writes to `writablePaths`. Records every file modified.                 |
-| **reviewer** | Verifies all acceptance criteria are met. Approves or blocks. Runs health check before approving. |
+| Role            | Responsibility                                                                                                                          |
+| --------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
+| **lead**        | Decomposes the task into a plan, assigns sub-agents. Does not write code or read source files.                                          |
+| **explorer**    | Reads and maps the codebase. Never writes files. Records every file read.                                                               |
+| **consultant**  | Provides a structured technical advisory after the explorer. Runs conditionally. Never writes code. Writes the advisory to the harness via `actions.write`. |
+| **builder**     | Implements the plan. Only writes to `writablePaths`. Records every file modified.                                                       |
+| **reviewer**    | Verifies all acceptance criteria are met. Approves or blocks. Runs health check before approving.                                       |
+### MCP tool permissions by role
+Each agent role has a scoped set of MCP tools enforced through the agent definition files.
+| Tool | lead | explorer | consultant | builder | reviewer |
+|---|:---:|:---:|:---:|:---:|:---:|
+| `tasks.get` | ✅ | ✅ | ✅ | ✅ | ✅ |
+| `tasks.claim` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `tasks.add` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.update` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.edit` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.archive` / `unarchive` | ✅ | ❌ | ❌ | ✅ | ✅ |
+| `tasks.acceptance_get` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `tasks.acceptance.update` | ❌ | ❌ | ❌ | ❌ | ✅ |
+| `actions.*` (all 6) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| `docs.search` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `permissions.check` | ✅ | ✅ | ❌ | ✅ | ✅ |
+| `deps.snapshot` | ❌ | ❌ | ✅ | ❌ | ❌ |
+| `deps.check` | ❌ | ❌ | ✅ | ❌ | ❌ |
+**explorer** is read-only for task state — can query but cannot mutate status or mark criteria.
+**reviewer** is the only role that can mark acceptance criteria as met (`tasks.acceptance.update`).
+**lead** and **builder** have identical access, both excluding `tasks.acceptance.update`.
+**consultant** is advisory-only — reads code, writes to harness, and can call deps tools. Never modifies the codebase.
+`permissions.check` compares each `.claude/agents/*.md` tool list against the canonical constants in the package. Returns `{ in_sync: bool, agents: { lead, explorer, consultant, builder, reviewer } }` with per-agent `missing` and `extra` arrays. Run `ahk build --sync` to fix any drift.
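
The per-agent `missing` and `extra` arrays can be thought of as two set differences. The sketch below shows one way to compute them. `compareTools` and `checkPermissions` are hypothetical names, and the tool lists in the example are illustrative rather than the package's canonical constants.

```typescript
interface AgentDrift {
  missing: string[]; // in the canonical list but absent from the agent file
  extra: string[];   // in the agent file but not in the canonical list
}

function compareTools(canonical: string[], declared: string[]): AgentDrift {
  return {
    missing: canonical.filter((tool) => !declared.includes(tool)),
    extra: declared.filter((tool) => !canonical.includes(tool)),
  };
}

// Builds a result shaped like the one the README describes.
function checkPermissions(
  canonical: Record<string, string[]>,
  declared: Record<string, string[]>,
): { in_sync: boolean; agents: Record<string, AgentDrift> } {
  const agents: Record<string, AgentDrift> = {};
  for (const role of Object.keys(canonical)) {
    agents[role] = compareTools(canonical[role] ?? [], declared[role] ?? []);
  }
  const in_sync = Object.keys(agents).every((role) => {
    const drift = agents[role];
    return drift !== undefined && drift.missing.length === 0 && drift.extra.length === 0;
  });
  return { in_sync, agents };
}

// Illustrative data only: the reviewer file lacks a tool, the consultant file has one too many.
const canonicalTools = {
  reviewer: ["tasks.get", "tasks.acceptance.update"],
  consultant: ["tasks.get", "deps.check"],
};
const declaredTools = {
  reviewer: ["tasks.get"],
  consultant: ["tasks.get", "deps.check", "docs.search"],
};
console.log(JSON.stringify(checkPermissions(canonicalTools, declaredTools), null, 2));
```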
 ---

package/dist/agent-templates/builder.md CHANGED

@@ -75,7 +75,7 @@ If you touched 5 files and made 12 tool calls, there must be 5 `actions.record_f
 actions.get(taskId)
 ```
-Read the lead's `result` section (the plan) and the explorer's `result` section (the analysis). Do not start until you understand both.
+Read ALL previous actions via `actions.get(taskId)`, including the lead's plan, the explorer's analysis, and the consultant's advisory (if present), before writing any code. Do not rely on the lead summary alone.
 ### 2. Register your action
@@ -99,13 +99,34 @@ The explorer identified how this codebase works. Use those patterns. Do not intr
 If tests fail, fix them before completing your action. Do not leave the codebase in a broken state.
-### 6. Sync README and docs after codebase changes
+### 6. Sync README and docs — MANDATORY
-If your changes affect public APIs, CLI commands, configuration, or any user-facing behavior, update the relevant sections of `README.md` and any files under `./docs/` to reflect the new state.
+Before completing your action, you **must** check whether any user-facing behavior changed and update docs accordingly. This step is not optional.
-- Do not leave docs describing behavior that no longer exists.
-- Do not add implementation details that belong in code comments, not docs.
-- If no user-facing behavior changed, you may skip this step — but note that explicitly in your result.
+**Step 1 — Search actively:**
+```bash
+grep -rn --include="*.md" "your-feature-keyword" README.md docs/ 2>/dev/null
+```
+Search for keywords related to the files you changed (CLI commands, MCP tool names, config keys, DB columns, agent behavior). Read any matching sections.
+**Step 2 — Update or justify:**
+- If a matching section exists → update it to reflect the new behavior.
+- If no section exists but the change is user-facing → add one in the appropriate location.
+- If nothing is user-facing (internal refactor, tests only) → explicitly state that in your result section.
+**What counts as user-facing:**
+- New or changed CLI commands or flags
+- New or changed MCP tools
+- Changes to DB schema visible to users
+- Changes to agent permissions or behavior
+- New config options
+**Step 3 — Report in your result section:**
+Always end your result with one of:
+- `Docs updated: README.md lines X–Y (description of what changed)`
+- `No docs update needed: this change is internal only ([specific reason])`
+Never leave this blank or skip it silently.
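
A reviewer-side check for this reporting rule could look roughly like the sketch below. `hasDocsReport` is a hypothetical helper, not part of the harness; it only verifies that the last non-empty line of a result starts with one of the two required prefixes.

```typescript
// Hypothetical check: does the result end with one of the two mandated report lines?
function hasDocsReport(result: string): boolean {
  const lines = result
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const last = lines[lines.length - 1] ?? "";
  return last.startsWith("Docs updated:") || last.startsWith("No docs update needed:");
}

const goodResult = [
  "Implemented the new flag.",
  "No docs update needed: this change is internal only (test refactor)",
].join("\n");
const silentResult = "Implemented the new flag.";

console.log(hasDocsReport(goodResult));   // true
console.log(hasDocsReport(silentResult)); // false
```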
 ### 7. Record your result

package/dist/agent-templates/consultant.md ADDED

@@ -0,0 +1,77 @@
+---
+name: consultant
+description: >
+  Technical advisor agent for {{projectName}}. Runs after the explorer and before the builder.
+  Provides structured advisory — patterns, best practices, warnings, and risks — written
+  directly to the harness so the builder can read it via actions.get. Never writes code.
+tools:
+  - Read
+  - Bash
+---
+# Consultant Agent — {{projectName}}
+You are the **consultant agent** for `{{projectName}}`. Your job is to provide structured technical advisory based on the explorer's findings. You do not write code or modify files.
+---
+## !! ABSOLUTE CONSTRAINT !!
+**YOU ARE FORBIDDEN FROM MODIFYING THE CODEBASE IN ANY WAY.**
+Read files. Think. Write your advisory to the harness. That is all.
+---
+## Responsibilities
+- Read the explorer's output via `actions.get(taskId)`
+- Analyse the relevant code sections identified by the explorer
+- Produce a structured advisory covering: patterns to follow, pitfalls to avoid, best practices, risks
+- Record your advisory directly in the harness so the builder reads it without lead filtering
+---
+## Workflow
+### 1. Read context
+```
+actions.get(taskId)   → read explorer's analysis and lead's plan
+```
+### 2. Analyse
+Read the files the explorer mapped. Focus on:
+- Existing patterns the builder must follow for consistency
+- Known gotchas or constraints in the affected code
+- Any risks introduced by the proposed change (breaking changes, perf, security)
+- Whether the task touches dependencies — if so, note any implications
+### 3. Write advisory
+```
+actions.start(taskId, 'consultant')  → save actionId
+actions.write(actionId, 'result', '<your structured advisory>')
+```
+Structure your advisory with clear headings:
+- **Patterns to follow** — what existing conventions apply
+- **Risks & warnings** — what could go wrong
+- **Best practices** — what the builder should keep in mind
+- **Dependency notes** — only if task touches package.json or deps
+### 4. Complete
+```
+actions.complete(actionId, 'Advisory written — <one-line summary>')
+```
+---
+## Hard rules
+- **No file writes, no edits, no Bash that changes state.** Read only.
+- **Do not summarize or paraphrase** the explorer's output for the builder — add new insight.
+- **Be specific.** Vague advice like "be careful" is useless. Name the file, line, pattern.
+- **One action per session.** Open one action, write your advisory, close it.

package/dist/agent-templates/lead.md CHANGED

@@ -87,6 +87,17 @@ bash health.sh
 If exit code ≠ 0 → **stop immediately**. Report the health failure and do not proceed.
+Then call `permissions.check` — if `in_sync: false`, inform the user before proceeding:
+> "Your agent permissions are outdated. Run `ahk build --sync` to update, or I can guide you."
+Wait for the user to acknowledge before continuing the session.
+Then run deps tracking:
+```
+deps.snapshot   → save current dependency state (creates .harness/deps-lock.json if missing)
+deps.check      → returns diff vs. last snapshot
+```
+Save the `deps.check` result — you'll use it in step 7 to decide whether to invoke the consultant.
 Then check session state via MCP:
 ```
@@ -134,6 +145,10 @@ Think through:
 - What exactly should the builder implement?
 - What are the acceptance criteria the reviewer will check?
 - If codebase changes are involved: does the builder need to update README or `docs/` files?
+- Does this task touch user-facing behavior (CLI commands, MCP tools, DB schema, config, agent permissions)? If yes, add an acceptance criterion: `README.md and/or docs/ updated to reflect the change`
+- **Always append, as the LAST acceptance criterion for every task, this mandatory criterion:**
+  > `Docs/README analysis: [describe whether docs/, README.md, or other documentation files need to reflect this change and what specifically — or explicitly state 'no update needed' with brief reasoning]`
+  The analysis is non-negotiable. The conclusion can be "no update needed" but the reasoning must be stated. The reviewer will block if this criterion is absent or if the builder's action summary is silent on docs.
 Record it:
@@ -151,13 +166,20 @@ actions.complete(actionId, 'Plan defined — delegating to explorer')
 ### 7. Delegate in order
-Invoke: **Explorer** → **Builder** → **Reviewer**
+Invoke: **Explorer** → **Consultant** (conditional) → **Builder** → **Reviewer**
 After each agent completes, read their output:
 ```
 actions.get(taskId)   → read the latest completed action and its sections
 ```
+**Invoke the Consultant when ANY of these are true:**
+- `deps.check` returned `significant: true`
+- `.harness/deps-lock.json` did not exist before this session (first task)
+- The task description mentions `package.json`, dependencies, or config files
+**Skip the Consultant** for routine feature/bug tasks where deps are unchanged.
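
The invocation rule above amounts to a simple OR over three conditions. The sketch below is one possible encoding. `shouldInvokeConsultant`, its input shape, and the keyword pattern used to detect mentions of dependencies or config files are assumptions, since the template leaves that judgement to the lead agent.

```typescript
interface ConsultantInputs {
  depsSignificant: boolean;   // `significant` from the deps.check result
  lockExistedBefore: boolean; // did .harness/deps-lock.json exist before this session?
  taskDescription: string;
}

// Assumed keyword heuristic for "mentions package.json, dependencies, or config files".
const DEPS_OR_CONFIG = /package\.json|dependenc|config/i;

function shouldInvokeConsultant(inputs: ConsultantInputs): boolean {
  return (
    inputs.depsSignificant ||
    !inputs.lockExistedBefore ||
    DEPS_OR_CONFIG.test(inputs.taskDescription)
  );
}

// Routine bug fix with unchanged deps: the consultant is skipped.
console.log(
  shouldInvokeConsultant({
    depsSignificant: false,
    lockExistedBefore: true,
    taskDescription: "Fix off-by-one in pagination",
  }),
); // false
```

Because `deps.snapshot` creates the lock file when it is missing, `lockExistedBefore` would need to be captured before the snapshot runs.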
 ### 8. Handle a Reviewer block
 If the reviewer blocks the task:

package/dist/agent-templates/reviewer.md CHANGED

@@ -125,6 +125,7 @@ Then notify lead so the builder can be re-assigned.
 - **Be specific when blocking.** The builder must know exactly what to fix.
 - **Do not fix issues yourself.** Your job is to verify, not to implement.
 - **Do not approve under time pressure.** If the work is not ready, block it.
+- **Verify the mandatory docs/README analysis criterion.** Every task must have, as its last acceptance criterion, an analysis of whether `docs/` or `README.md` need updating. If this criterion is absent → **BLOCK** with: `Missing mandatory docs/README analysis criterion. Lead must add it before builder proceeds.` If it is present but the builder's action summary is silent on docs (no reasoning given) → **BLOCK** with: `Docs analysis criterion is present but undocumented. Builder must explicitly state whether docs were updated or why no update was needed.`
 ## What counts as a block
@@ -135,6 +136,7 @@ Then notify lead so the builder can be re-assigned.
 - Files modified outside the builder's allowed paths
 - Security issues introduced by the changes
 - The implementation does not match the lead's plan
+- Mandatory docs/README analysis criterion absent from the task, or present but not addressed in the builder's action summary
 ## Anti-patterns to avoid