@open-agent-toolkit/cli 0.1.3 → 0.1.4

@@ -1,6 +1,6 @@
  {
- "cli": "0.1.3",
- "docs-config": "0.1.3",
- "docs-theme": "0.1.3",
- "docs-transforms": "0.1.3"
+ "cli": "0.1.4",
+ "docs-config": "0.1.4",
+ "docs-theme": "0.1.4",
+ "docs-transforms": "0.1.4"
  }
@@ -1,6 +1,6 @@
  ---
  name: oat-agent-instructions-analyze
- version: 1.9.0
+ version: 1.10.0
  description: Run when you need to evaluate agent instruction file coverage, quality, and drift. Produces a severity-rated analysis artifact. Run before oat-agent-instructions-apply to identify what needs improvement.
  disable-model-invocation: true
  user-invocable: true
@@ -294,7 +294,7 @@ Any discrepancy that would cause agents to follow the wrong pattern should be fl
 
  ### Step 4: Assess Coverage Gaps
 
- Walk the directory tree and evaluate each directory against `references/directory-assessment-criteria.md`.
+ Walk the directory tree and evaluate **each directory against `references/directory-assessment-criteria.md`, per-directory and at every depth**. The walk descends into subdirectories recursively — it is not limited to top-level apps and packages. A nested domain subdirectory such as `packages/<pkg>/src/<domain>/` is in scope and is assessed with the same primary indicators as a top-level package; the size of a directory's parent never gates whether that directory is evaluated.
 
  Before general coverage-gap analysis, assess **provider baseline gaps** for every active provider.
  These checks are mandatory even when the missing file does not appear in the discovered inventory.
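
The revised Step 4 wording above asks for a walk that reaches every directory at every depth. As a rough illustration only (this diff does not show how the skill performs the walk), such a traversal might be sketched in Python as follows. The names `walk_directories` and `SKIPPED` are hypothetical, and the skipped directory names are an assumption about what counts as generated or external content.

```python
import os
from typing import Iterator

# Hypothetical skip list: directories assumed to be generated or external.
SKIPPED = {"node_modules", "dist", "build", ".git"}


def walk_directories(root: str) -> Iterator[str]:
    """Yield every directory under `root`, at every depth, as a '/'-joined relative path."""
    for current, subdirs, _files in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        subdirs[:] = sorted(d for d in subdirs if d not in SKIPPED)
        rel = os.path.relpath(current, root)
        if rel != ".":
            # No depth limit and no parent-size check: every directory is a candidate.
            yield rel.replace(os.sep, "/")
```

Each yielded path would then be checked against the assessment criteria on its own, so a nested path such as `packages/<pkg>/src/<domain>/` is visited exactly like a top-level package.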
@@ -2,6 +2,8 @@
 
  When does a directory need its own instruction file? Use these criteria to identify coverage gaps during analysis. The full guidance lives in `docs/agent-instruction.md` — this is a distilled, actionable checklist.
 
+ Apply these criteria **per directory, at every depth** — not just to top-level apps and packages. A nested subdirectory such as `packages/<pkg>/src/<domain>/` is assessed with exactly the same indicators as a top-level package. The size of a directory's parent never gates whether that directory is evaluated.
+
  ## Primary Indicators (any one = likely needs instructions)
 
  ### 1. Has Own Build Configuration
@@ -26,32 +28,45 @@ When does a directory need its own instruction file? Use these criteria to ident
 
  - Represents a bounded context or module with domain-specific business logic
  - Has its own data models, terminology, or invariants
- - Example: `packages/billing/`, `services/auth/`, `lib/search-engine/`
- - **Signal strength:** Moderate (depends on complexity)
+ - Has non-obvious conventions an agent would otherwise miss — patterns that diverge from the parent's defaults
+ - **Applies at any depth.** A top-level package and a nested domain subdirectory are assessed the same way: `packages/billing/`, `services/auth/`, `lib/search-engine/`, **and** `packages/<pkg>/src/<domain>/` (for example, a `bigquery-sync/` or `payment-reconciliation/` subdirectory inside an otherwise modest package).
+ - **Signal strength:** Strong — this is the primary trigger for a nested instruction file. A bounded domain with its own models, terminology, invariants, and non-obvious conventions warrants a file **at any depth, regardless of how large or small its parent is**. Domain specificity, not parent size, decides this.
 
- ### 5. Significant Codebase (>10 source files)
+ ### 5. Significant Codebase
 
- - Contains more than ~10 source files with specialized conventions
+ - Contains a substantial body of code (loosely, 10+ source files) with specialized conventions
  - Has patterns or conventions that differ from the rest of the repo
- - **Signal strength:** Moderate — larger directories benefit more from explicit guidance
+ - **Signal strength:** Moderate — a larger directory benefits more from explicit guidance, but only when it has conventions of its own
+ - **File count is never sufficient on its own.** A directory with many files that all mirror the parent's conventions is not a trigger — there is nothing distinct to capture. This indicator only fires alongside genuinely divergent, non-obvious conventions. Treat the "10+" figure as a loose illustration of "non-trivial", not a precise threshold.
+
+ ## Nested Instruction Files (Progressive Specificity)
+
+ Instruction files form a hierarchy. A root `AGENTS.md` carries repo-wide conventions; deeper files carry progressively more specific guidance for the subtree they scope. **Deeper = more specific, never broader.**
+
+ A nested instruction file **inherits everything from its ancestors** and contains **only the domain-specific delta** — the conventions, models, terminology, and invariants that are true for that subtree and are not already captured (or are contradicted) above it. A child file must **not repeat** the parent: no copied conventions, no restated repo-wide rules. See `references/docs/agent-instruction.md` §13 ("Scoped Files (When and How)") for the full progressive-disclosure model.
+
+ Because a nested file is small and additive — it adds a thin delta rather than a full document — **the cost of adding one is low**. The decision bar is therefore qualitative, not size-based:
 
- ## Large Directory Decomposition
+ > Would an agent working only from the nearest existing (ancestor) instruction file get something wrong, or miss something, in this directory's domain?
 
- When a directory meets 1+ primary indicators and contains **more than 50 source files**, assess its major
- subdirectories starting at depth 1-2 before writing a single broad recommendation.
+ If yes, the directory is a coverage-gap candidate. If no (the ancestor file already covers everything an agent needs here), it is not, regardless of how many files the directory contains.
 
- If the first pass still leaves a sub-area that is large or clearly heterogeneous, keep decomposing deeper until the
- distinct conventions are visible or the analysis stops yielding materially different guidance.
+ **Worked example.** A package has ~29 source files overall and a root-level `AGENTS.md` describing the package's general conventions. Inside it, `src/bigquery-sync/` holds ~15 files implementing BigQuery sync: it has its own data models, its own terminology (sync cursors, watermark tables, backfill windows), and non-obvious invariants (ordering guarantees, idempotency keys, partition-boundary handling) that appear nowhere else in the package. The parent package is well under any "large" bar, so an app/package-only or size-gated reading would conclude "no nested file warranted." That conclusion is wrong: an agent editing `src/bigquery-sync/` from the package-level `AGENTS.md` alone would miss the sync invariants and likely introduce a correctness bug. `src/bigquery-sync/` is a legitimate coverage-gap candidate — it meets Indicator 4 — and should be surfaced with a scoped `AGENTS.md` recommendation that captures only the sync-specific delta.
 
- Check whether subdirectories differ by:
+ ## Decomposing Broad Recommendations
 
- - tech stack or runtime (for example, embedded React client vs Node server code)
+ When an area you would recommend a single instruction file for actually spans **distinct sub-areas**, decompose the recommendation: assess and recommend per sub-area instead of writing one broad, vague file. The trigger for decomposition is **heterogeneity**, not file count.
+
+ Decompose when the area's subdirectories differ by:
+
+ - tech stack or runtime (for example, an embedded React client vs Node server code)
  - dominant file-type patterns (for example, resolvers vs repositories vs jobs)
  - build or tooling configuration (separate tsconfigs, bundlers, framework configs)
  - domain boundary or API surface
 
- Record distinct sub-areas in the coverage gap assessment. A scoped `AGENTS.md` recommendation for a large directory
- should enumerate the major sub-areas and their conventions, not just report the total file count.
+ When you find heterogeneity, assess the area's major subdirectories starting at depth 1–2 before writing a single broad recommendation. If the first pass still leaves a sub-area that is clearly heterogeneous, keep decomposing deeper until the distinct conventions are visible or the analysis stops yielding materially different guidance. A homogeneous area, even a large one, needs only one recommendation; there are no distinct sub-areas to split out.
+
+ Record distinct sub-areas in the coverage gap assessment. A scoped `AGENTS.md` recommendation for a heterogeneous area should enumerate the major sub-areas and their conventions, not just report a total file count.
 
  ## Secondary Indicators (strengthen the case but not sufficient alone)
 
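
The "Nested Instruction Files" text in this hunk describes a hierarchy in which a nested file inherits from its ancestors and adds only a delta. One way to picture that chain is to list the candidate instruction-file paths from the repo root down to a given directory. The sketch below is illustrative only: the function name `ancestor_instruction_files` is hypothetical, it assumes every level uses the filename `AGENTS.md`, and it says nothing about how a particular host actually discovers or merges these files.

```python
from pathlib import PurePosixPath
from typing import List


def ancestor_instruction_files(directory: str, filename: str = "AGENTS.md") -> List[str]:
    """Return candidate instruction-file paths, root first and most specific last."""
    parts = PurePosixPath(directory).parts
    # The repo-root file carries the repo-wide conventions.
    paths = [filename]
    # Each deeper level may add a more specific file for its own subtree.
    for depth in range(1, len(parts) + 1):
        paths.append(str(PurePosixPath(*parts[:depth]) / filename))
    return paths
```

For a directory like `packages/pkg/src/bigquery-sync`, the last entry is the scoped file that would hold only the sync-specific delta, and every earlier entry is an ancestor whose content that file should not repeat.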
@@ -81,13 +96,15 @@ For each directory meeting 1+ primary indicators:
  **Severity mapping:**
 
  - **High:** Primary indicators 1-3 (own build, different stack, public API) — these are clear gaps
- - **Medium:** Primary indicators 4-5 (domain boundary, large codebase) — beneficial but not urgent
+ - **Medium:** Primary indicators 4-5 (domain boundary, significant codebase) — beneficial but not urgent
 
  ## Exclusions
 
  Do NOT flag these as needing instructions:
 
  - `node_modules/`, `dist/`, `build/`, `.git/` — generated/external
- - Directories with <5 source files and no build config: too small to warrant overhead
+ - Directories that merely follow their parent's conventions with nothing distinct to capture, regardless of size. If an agent working from the nearest ancestor instruction file would already do the right thing here, a nested file would only repeat the parent.
  - Test directories that follow the same patterns as their parent — covered by parent instructions
  - Directories already covered by a parent's scoped rules (e.g., Cursor rule with `globs: packages/cli/**`)
+
+ **Anti-sprawl:** Do not recommend an instruction file for a directory just because it contains many files. File count alone is never a trigger. If those files all follow the parent's conventions — no distinct domain, no divergent patterns, nothing an agent would get wrong from the ancestor file — the directory is excluded no matter how large it is. The positive trigger is always distinct, non-obvious conventions worth capturing, not size.
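
The severity mapping and the anti-sprawl rule in this hunk can be read together as a small decision procedure. The following is a minimal sketch under stated assumptions, not the checklist's own implementation: the `DirectoryFacts` fields, the function name `coverage_gap_severity`, and the constant `NON_TRIVIAL_FILE_COUNT` are all hypothetical, and the value 10 stands in for the text's loose "10+" illustration rather than a precise threshold.

```python
from dataclasses import dataclass
from typing import Optional

# Loose stand-in for "non-trivial"; the checklist says this is not a precise threshold.
NON_TRIVIAL_FILE_COUNT = 10


@dataclass
class DirectoryFacts:
    """Hypothetical summary of one directory; field names are illustrative."""
    has_own_build_config: bool = False       # primary indicator 1
    has_different_stack: bool = False        # primary indicator 2
    has_public_api: bool = False             # primary indicator 3
    is_domain_boundary: bool = False         # primary indicator 4
    source_file_count: int = 0               # input to primary indicator 5
    has_divergent_conventions: bool = False  # conventions that differ from the parent


def coverage_gap_severity(facts: DirectoryFacts) -> Optional[str]:
    """Return "High", "Medium", or None when the directory is not a coverage gap."""
    if facts.has_own_build_config or facts.has_different_stack or facts.has_public_api:
        return "High"
    if facts.is_domain_boundary:
        # Applies at any depth; the parent's size is deliberately not an input.
        return "Medium"
    if facts.source_file_count >= NON_TRIVIAL_FILE_COUNT and facts.has_divergent_conventions:
        # File count only counts alongside genuinely divergent conventions.
        return "Medium"
    # Many files that mirror the parent's conventions are not a trigger.
    return None
```

Under these assumptions, the worked example's `src/bigquery-sync/` directory (about 15 files, a distinct domain) maps to "Medium", while a 200-file directory that only mirrors its parent's conventions maps to no recommendation at all.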
@@ -1,9 +1,9 @@
  ---
  name: oat-worktree-bootstrap-auto
- version: 1.3.0
+ version: 1.4.0
  description: Use when an orchestrator/subagent needs autonomous worktree bootstrap. Non-interactive companion to oat-worktree-bootstrap.
  argument-hint: '<branch-name> [--base <ref>] [--path <root>] [--baseline-policy <strict|allow-failing>]'
- disable-model-invocation: true
+ disable-model-invocation: false
  user-invocable: false
  allowed-tools: Read, Write, Bash, Glob, Grep
  ---
@@ -12,6 +12,8 @@ allowed-tools: Read, Write, Bash, Glob, Grep
 
  Non-interactive worktree bootstrap for orchestrator and subagent execution flows. Creates or reuses a worktree, runs baseline checks, and reports structured status — all without user prompts.
 
+ This skill is **model-invocable** (`disable-model-invocation: false`): orchestrators such as `oat-project-implement` invoke it programmatically when a parallel phase group needs autonomous worktree bootstrap. It is **not** user-invocable (`user-invocable: false`) — it has no interactive surface and is never offered as a slash command.
+
  > ⚠️ **When not to substitute.** This skill is the **only** supported mechanism for orchestrator-driven worktree creation in OAT skills. Host-native isolation primitives — Claude Code's `Agent({ isolation: "worktree" })`, Cursor's worktree-isolated agent invocations, and equivalents in other hosts — are **not** substitutes. They may use the primary repo's checkout (often `main`) as the base regardless of the caller's current branch, silently producing a worktree at the wrong base. OAT orchestrators dispatching mid-run from a feature branch MUST go through this skill with an explicit `--base` so the resulting worktree contains the orchestrator's prior commits.
 
  ## Relationship to oat-worktree-bootstrap
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@open-agent-toolkit/cli",
- "version": "0.1.3",
+ "version": "0.1.4",
  "private": false,
  "description": "Open Agent Toolkit CLI",
  "homepage": "https://github.com/voxmedia/open-agent-toolkit/tree/main/packages/cli",
@@ -33,7 +33,7 @@
  "ora": "^9.0.0",
  "yaml": "2.8.2",
  "zod": "^3.25.76",
- "@open-agent-toolkit/control-plane": "0.1.3"
+ "@open-agent-toolkit/control-plane": "0.1.4"
  },
  "devDependencies": {
  "@types/node": "^22.10.0",