@open-agent-toolkit/cli 0.1.2 → 0.1.4
- package/assets/public-package-versions.json +4 -4
- package/assets/skills/oat-agent-instructions-analyze/SKILL.md +2 -2
- package/assets/skills/oat-agent-instructions-analyze/references/directory-assessment-criteria.md +33 -16
- package/assets/skills/oat-worktree-bootstrap-auto/SKILL.md +4 -2
- package/package.json +2 -2
package/assets/skills/oat-agent-instructions-analyze/SKILL.md
CHANGED
@@ -1,6 +1,6 @@
 ---
 name: oat-agent-instructions-analyze
-version: 1.
+version: 1.10.0
 description: Run when you need to evaluate agent instruction file coverage, quality, and drift. Produces a severity-rated analysis artifact. Run before oat-agent-instructions-apply to identify what needs improvement.
 disable-model-invocation: true
 user-invocable: true
@@ -294,7 +294,7 @@ Any discrepancy that would cause agents to follow the wrong pattern should be fl
 
 ### Step 4: Assess Coverage Gaps
 
-Walk the directory tree and evaluate each directory against `references/directory-assessment-criteria.md
+Walk the directory tree and evaluate **each directory against `references/directory-assessment-criteria.md`, per-directory and at every depth**. The walk descends into subdirectories recursively — it is not limited to top-level apps and packages. A nested domain subdirectory such as `packages/<pkg>/src/<domain>/` is in scope and is assessed with the same primary indicators as a top-level package; the size of a directory's parent never gates whether that directory is evaluated.
 
 Before general coverage-gap analysis, assess **provider baseline gaps** for every active provider.
 These checks are mandatory even when the missing file does not appear in the discovered inventory.
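Reviewer sketch for the Step 4 change above: one way to picture "per-directory and at every depth" is a recursive walk that yields every directory, however small its parent. This is only an illustration, not the skill's implementation. `walk_candidates`, `EXCLUDED`, and `demo` are hypothetical names, and the exclusion set is borrowed from the Exclusions list in `directory-assessment-criteria.md`.

```python
import os
import tempfile

# Hypothetical exclusion set, borrowed from the criteria file's Exclusions list.
EXCLUDED = {"node_modules", "dist", "build", ".git"}


def walk_candidates(root):
    """Yield every directory under root, at every depth, relative to root."""
    for current, dirnames, _files in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED)
        rel = os.path.relpath(current, root)
        if rel != ".":
            yield rel.replace(os.sep, "/")


def demo():
    """Build a throwaway tree with a nested domain directory and walk it."""
    with tempfile.TemporaryDirectory() as root:
        for path in ("packages/pkg/src/bigquery-sync", "packages/pkg/node_modules/x"):
            os.makedirs(os.path.join(root, path))
        return list(walk_candidates(root))
```

In this sketch `packages/pkg/src/bigquery-sync` is yielded like any top-level package, while `node_modules` is never entered. Each yielded path would then be checked against the primary indicators.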
package/assets/skills/oat-agent-instructions-analyze/references/directory-assessment-criteria.md
CHANGED
@@ -2,6 +2,8 @@
 
 When does a directory need its own instruction file? Use these criteria to identify coverage gaps during analysis. The full guidance lives in `docs/agent-instruction.md` — this is a distilled, actionable checklist.
 
+Apply these criteria **per directory, at every depth** — not just to top-level apps and packages. A nested subdirectory such as `packages/<pkg>/src/<domain>/` is assessed with exactly the same indicators as a top-level package. The size of a directory's parent never gates whether that directory is evaluated.
+
 ## Primary Indicators (any one = likely needs instructions)
 
 ### 1. Has Own Build Configuration
@@ -26,32 +28,45 @@ When does a directory need its own instruction file? Use these criteria to ident
 
 - Represents a bounded context or module with domain-specific business logic
 - Has its own data models, terminology, or invariants
--
-- **
+- Has non-obvious conventions an agent would otherwise miss — patterns that diverge from the parent's defaults
+- **Applies at any depth.** A top-level package and a nested domain subdirectory are assessed the same way: `packages/billing/`, `services/auth/`, `lib/search-engine/`, **and** `packages/<pkg>/src/<domain>/` (for example, a `bigquery-sync/` or `payment-reconciliation/` subdirectory inside an otherwise modest package).
+- **Signal strength:** Strong — this is the primary trigger for a nested instruction file. A bounded domain with its own models, terminology, invariants, and non-obvious conventions warrants a file **at any depth, regardless of how large or small its parent is**. Domain specificity, not parent size, decides this.
 
-### 5. Significant Codebase
+### 5. Significant Codebase
 
-- Contains
+- Contains a substantial body of code (loosely, 10+ source files) with specialized conventions
 - Has patterns or conventions that differ from the rest of the repo
-- **Signal strength:** Moderate — larger
+- **Signal strength:** Moderate — a larger directory benefits more from explicit guidance, but only when it has conventions of its own
+- **File count is never sufficient on its own.** A directory with many files that all mirror the parent's conventions is not a trigger — there is nothing distinct to capture. This indicator only fires alongside genuinely divergent, non-obvious conventions. Treat the "10+" figure as a loose illustration of "non-trivial", not a precise threshold.
+
+## Nested Instruction Files (Progressive Specificity)
+
+Instruction files form a hierarchy. A root `AGENTS.md` carries repo-wide conventions; deeper files carry progressively more specific guidance for the subtree they scope. **Deeper = more specific, never broader.**
+
+A nested instruction file **inherits everything from its ancestors** and contains **only the domain-specific delta** — the conventions, models, terminology, and invariants that are true for that subtree and are not already captured (or are contradicted) above it. A child file must **not repeat** the parent: no copied conventions, no restated repo-wide rules. See `references/docs/agent-instruction.md` §13 ("Scoped Files (When and How)") for the full progressive-disclosure model.
+
+Because a nested file is small and additive — it adds a thin delta rather than a full document — **the cost of adding one is low**. The decision bar is therefore qualitative, not size-based:
 
-
+> Would an agent working only from the nearest existing (ancestor) instruction file get something wrong, or miss something, in this directory's domain?
 
-
-subdirectories starting at depth 1-2 before writing a single broad recommendation.
+If yes, the directory is a coverage-gap candidate. If no — the ancestor file already covers everything an agent needs here — it is not, regardless of how many files the directory contains.
 
-
-distinct conventions are visible or the analysis stops yielding materially different guidance.
+**Worked example.** A package has ~29 source files overall and a root-level `AGENTS.md` describing the package's general conventions. Inside it, `src/bigquery-sync/` holds ~15 files implementing BigQuery sync: it has its own data models, its own terminology (sync cursors, watermark tables, backfill windows), and non-obvious invariants (ordering guarantees, idempotency keys, partition-boundary handling) that appear nowhere else in the package. The parent package is well under any "large" bar, so an app/package-only or size-gated reading would conclude "no nested file warranted." That conclusion is wrong: an agent editing `src/bigquery-sync/` from the package-level `AGENTS.md` alone would miss the sync invariants and likely introduce a correctness bug. `src/bigquery-sync/` is a legitimate coverage-gap candidate — it meets Indicator 4 — and should be surfaced with a scoped `AGENTS.md` recommendation that captures only the sync-specific delta.
 
-
+## Decomposing Broad Recommendations
 
-
+When an area you would recommend a single instruction file for actually spans **distinct sub-areas**, decompose the recommendation: assess and recommend per sub-area instead of writing one broad, vague file. The trigger for decomposition is **heterogeneity**, not file count.
+
+Decompose when the area's subdirectories differ by:
+
+- tech stack or runtime (for example, an embedded React client vs Node server code)
 - dominant file-type patterns (for example, resolvers vs repositories vs jobs)
 - build or tooling configuration (separate tsconfigs, bundlers, framework configs)
 - domain boundary or API surface
 
-
-
+When you find heterogeneity, assess its major subdirectories starting at depth 1–2 before writing a single broad recommendation. If the first pass still leaves a sub-area that is clearly heterogeneous, keep decomposing deeper until the distinct conventions are visible or the analysis stops yielding materially different guidance. A homogeneous area — even a large one — needs only one recommendation; there are no distinct sub-areas to split out.
+
+Record distinct sub-areas in the coverage gap assessment. A scoped `AGENTS.md` recommendation for a heterogeneous area should enumerate the major sub-areas and their conventions, not just report a total file count.
 
 ## Secondary Indicators (strengthen the case but not sufficient alone)
 
@@ -81,13 +96,15 @@ For each directory meeting 1+ primary indicators:
 **Severity mapping:**
 
 - **High:** Primary indicators 1-3 (own build, different stack, public API) — these are clear gaps
-- **Medium:** Primary indicators 4-5 (domain boundary,
+- **Medium:** Primary indicators 4-5 (domain boundary, significant codebase) — beneficial but not urgent
 
 ## Exclusions
 
 Do NOT flag these as needing instructions:
 
 - `node_modules/`, `dist/`, `build/`, `.git/` — generated/external
-- Directories
+- Directories that merely follow their parent's conventions with nothing distinct to capture — regardless of size. If an agent working from the nearest ancestor instruction file would already do the right thing here, a nested file would only repeat the parent.
 - Test directories that follow the same patterns as their parent — covered by parent instructions
 - Directories already covered by a parent's scoped rules (e.g., Cursor rule with `globs: packages/cli/**`)
+
+**Anti-sprawl:** Do not recommend an instruction file for a directory just because it contains many files. File count alone is never a trigger. If those files all follow the parent's conventions — no distinct domain, no divergent patterns, nothing an agent would get wrong from the ancestor file — the directory is excluded no matter how large it is. The positive trigger is always distinct, non-obvious conventions worth capturing, not size.
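Reviewer sketch for the criteria changes above: the severity mapping and the anti-sprawl rule together read like a small decision function. This is a rough illustration under stated assumptions, not code from the package. `DirectoryFacts`, its fields, and `assess` are hypothetical names, and reducing "distinct, non-obvious conventions" to a single boolean is a deliberate simplification.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryFacts:
    """Hypothetical summary of one directory; field names are illustrative."""
    path: str
    indicators: set = field(default_factory=set)  # primary indicator numbers (1-5) that apply
    source_files: int = 0                         # recorded, but never consulted by assess()
    diverges_from_parent: bool = False            # has conventions the ancestor file does not cover


def assess(facts):
    """Return 'High', 'Medium', or None when the directory is not a coverage gap."""
    fired = set(facts.indicators)
    # Indicator 5 never fires on file count alone; it needs divergent conventions.
    if not facts.diverges_from_parent:
        fired.discard(5)
    if fired & {1, 2, 3}:
        return "High"    # own build, different stack, public API
    if fired & {4, 5}:
        return "Medium"  # domain boundary, significant codebase
    return None


# Usage, mirroring the worked example: a small nested domain still qualifies.
sync = DirectoryFacts("src/bigquery-sync", {4}, source_files=15, diverges_from_parent=True)
bulk = DirectoryFacts("src/generated-handlers", {5}, source_files=200)
```

Here `assess(sync)` gives `"Medium"` through Indicator 4, while `assess(bulk)` gives `None` because size without divergent conventions is excluded. The `src/generated-handlers` path is made up for the illustration.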
package/assets/skills/oat-worktree-bootstrap-auto/SKILL.md
CHANGED
@@ -1,9 +1,9 @@
 ---
 name: oat-worktree-bootstrap-auto
-version: 1.
+version: 1.4.0
 description: Use when an orchestrator/subagent needs autonomous worktree bootstrap. Non-interactive companion to oat-worktree-bootstrap.
 argument-hint: '<branch-name> [--base <ref>] [--path <root>] [--baseline-policy <strict|allow-failing>]'
-disable-model-invocation:
+disable-model-invocation: false
 user-invocable: false
 allowed-tools: Read, Write, Bash, Glob, Grep
 ---
@@ -12,6 +12,8 @@ allowed-tools: Read, Write, Bash, Glob, Grep
 
 Non-interactive worktree bootstrap for orchestrator and subagent execution flows. Creates or reuses a worktree, runs baseline checks, and reports structured status — all without user prompts.
 
+This skill is **model-invocable** (`disable-model-invocation: false`): orchestrators such as `oat-project-implement` invoke it programmatically when a parallel phase group needs autonomous worktree bootstrap. It is **not** user-invocable (`user-invocable: false`) — it has no interactive surface and is never offered as a slash command.
+
 > ⚠️ **When not to substitute.** This skill is the **only** supported mechanism for orchestrator-driven worktree creation in OAT skills. Host-native isolation primitives — Claude Code's `Agent({ isolation: "worktree" })`, Cursor's worktree-isolated agent invocations, and equivalents in other hosts — are **not** substitutes. They may use the primary repo's checkout (often `main`) as the base regardless of the caller's current branch, silently producing a worktree at the wrong base. OAT orchestrators dispatching mid-run from a feature branch MUST go through this skill with an explicit `--base` so the resulting worktree contains the orchestrator's prior commits.
 
 ## Relationship to oat-worktree-bootstrap
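Reviewer sketch for the invocation note above: an orchestrator could assemble the skill's arguments in the shape of the `argument-hint` and refuse to omit `--base`. This is a guess at how a caller might guard against the wrong-base problem, not part of the package. `build_bootstrap_args` is a hypothetical helper, and although the hint marks `--base` as optional, the sketch makes it mandatory for the mid-run feature-branch case described in the warning.

```python
def build_bootstrap_args(branch, base, path=None, baseline_policy=None):
    """Assemble arguments shaped like the skill's argument-hint (hypothetical helper)."""
    if not base:
        # Assumption: the orchestrator always knows its own branch and passes it as the base.
        raise ValueError("pass an explicit --base so the worktree starts from the caller's branch")
    if baseline_policy not in (None, "strict", "allow-failing"):
        raise ValueError(f"unknown baseline policy: {baseline_policy}")
    args = [branch, "--base", base]
    if path:
        args += ["--path", path]
    if baseline_policy:
        args += ["--baseline-policy", baseline_policy]
    return args
```

For example, `build_bootstrap_args("feat/x-phase-2", "feat/x")` yields `["feat/x-phase-2", "--base", "feat/x"]`; the branch names are placeholders.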
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@open-agent-toolkit/cli",
-"version": "0.1.
+"version": "0.1.4",
 "private": false,
 "description": "Open Agent Toolkit CLI",
 "homepage": "https://github.com/voxmedia/open-agent-toolkit/tree/main/packages/cli",
@@ -33,7 +33,7 @@
 "ora": "^9.0.0",
 "yaml": "2.8.2",
 "zod": "^3.25.76",
-"@open-agent-toolkit/control-plane": "0.1.
+"@open-agent-toolkit/control-plane": "0.1.4"
 },
 "devDependencies": {
 "@types/node": "^22.10.0",