get-tbd 0.1.28 → 0.1.29

@@ -1,1397 +1,863 @@
1
1
  ---
2
- title: CLI Agent Skill Patterns
3
- description: Best practices for building TypeScript CLIs that function as agent skills in Claude Code and other AI coding agents
2
+ title: Agent Skills & CLI Integration Patterns
3
+ description: How to write skills and agent-integrated CLIs that work across Claude Code, Codex, and the broader coding-agent ecosystem — a simple baseline plus references for advanced, multi-subcommand tools
4
4
  author: Joshua Levy (github.com/jlevy) with LLM assistance
5
5
  ---
6
- # CLI Agent Skill Patterns
6
+ # Agent Skills & CLI Integration Patterns
7
7
 
8
- These patterns apply to building TypeScript CLI applications that function as powerful
9
- skills within Claude Code and other AI coding agents.
10
- The patterns are derived from the `tbd` CLI implementation, which serves as a reference
11
- architecture for agent-integrated command-line tools.
8
+ **Last Updated**: 2026-05-23 (research verified against primary sources May 2026)
12
9
 
13
- The core insight is that a CLI can be much more than a command executor—it can be a
14
- **dynamic skill module** that provides context management, self-documentation, and
15
- seamless integration with multiple AI agents through a single npm package.
10
+ This guideline covers how to package a capability so AI coding agents can discover and
11
+ use it well — from a single-file skill up to a full CLI with many subcommands exposed as
12
+ a skill. It is deliberately **not dogmatic**: most needs are met by a tiny `SKILL.md`,
13
+ and the heavier patterns are opt-in for tools that genuinely need them.
16
14
 
17
- **When to use this guideline**: When building a CLI tool that should work well with AI
18
- coding agents, when adding agent integration to an existing CLI, or when designing
19
- context management for agent-aware applications.
15
+ The patterns draw on the current (May 2026) state of the ecosystem and on `tbd`’s own
16
+ implementation, which serves as a reference for the advanced “CLI-as-skill” approach.
20
17
 
21
- ## Key Patterns to Consider for CLIs
18
+ **When to use this guideline**: when building or packaging anything an AI coding agent
19
+ should use — a prompt-only skill, a CLI tool, an MCP server, or a multi-agent
20
+ integration — and you want it to work across Claude Code, Codex, Cursor, Gemini CLI, and
21
+ others without rewriting it per agent.
22
22
 
23
- 1. **CLI as Dynamic Skill Module:** CLIs can provide context management,
24
- self-documentation, and multi-agent integration through a single npm package.
25
-
26
- 2. **CLI as Knowledge Library:** Bundle guidelines, shortcuts, and templates that agents
27
- can query on-demand to improve work quality.
28
- This transforms the CLI from a tool the agent tells users about into a resource the
29
- agent uses to better serve users.
30
-
31
- 3. **Context Injection Loop:** A recursive architecture where skill documentation
32
- references commands, those commands output more context, and that context references
33
- further commands. This creates a self-directing knowledge system where agents get
34
- progressively smarter as they work.
35
-
36
- 4. **Task Management Integration:** CLIs that help agents track work across sessions,
37
- discover available tasks, and enforce session boundaries lead to more reliable
38
- agentic workflows.
23
+ > **The single most important shift since 2025**: skills and project instructions are
24
+ > now **open standards**, not per-vendor formats.
25
+ > `AGENTS.md` is governed under the Linux Foundation’s **Agentic AI Foundation (AAIF)**;
26
+ > the **Agent Skills (`SKILL.md`)** format is an open standard published at
27
+ > [agentskills.io](https://agentskills.io) and implemented by 20+ agents.
28
+ > Write to the standard once; most agents pick it up for free.
39
29
 
40
30
  * * *
41
31
 
42
- ## 1. CLI as Skill Architecture
32
+ ## 0. Start Here: The Simple Baseline
43
33
 
44
- ### 1.1 Bundled Documentation Pattern
34
+ Read this section first.
35
+ For the large majority of cases, you are done after it.
45
36
 
46
- - **Bundle documentation files with CLI**: Include skill files, `README.md`, and docs in
47
- the CLI distribution.
48
- Maintain tiered skill files for different contexts:
37
+ ### 0.1 If you just want a skill (prompt-only capability)
49
38
 
50
- | Tier | File | Tokens | Purpose |
51
- | --- | --- | --- | --- |
52
- | Full | `skill-baseline.md` | ~2000 | Default installation, full workflow guide |
53
- | Brief | `skill-brief.md` | ~400 | Condensed version for `--brief` flag |
54
-
55
- ```typescript
56
- function getDocPath(filename: string): string {
57
- const __dirname = dirname(fileURLToPath(import.meta.url));
58
- return join(__dirname, 'docs', filename);
59
- }
39
+ Create one folder with one file:
60
40
 
61
- async function loadDocContent(filename: string): Promise<string> {
62
- // Try bundled location first
63
- try {
64
- return await readFile(getDocPath(filename), 'utf-8');
65
- } catch {
66
- // Fallback chain: dev path → repo path → error
67
- }
68
- }
41
+ ```
42
+ my-skill/
43
+ └── SKILL.md
69
44
  ```
70
45
 
71
- - **Use a `dist/docs/` directory**: Store bundled docs alongside the CLI for
72
- self-contained packages that work in sandboxed environments, containers, and CI.
46
+ ```markdown
47
+ ---
48
+ name: my-skill
49
+ description: >-
50
+   Analyze Excel spreadsheets, build pivot tables, and export results.
51
+   Use when the user mentions .xlsx files, tabular data, or spreadsheets.
52
+ ---
53
+ # My Skill
73
54
 
74
- - **Implement fallback loading**: Support bundled → development source → repo-level
75
- docs.
55
+ Step-by-step instructions the agent should follow...
56
+ ```
76
57
 
77
- - **Provide a `skill` subcommand**: Output skill content to stdout so agents can inspect
78
- or pipe it. Support verbosity flags:
79
- ```bash
80
- mycli skill # Full skill with dynamic content
81
- mycli skill --brief # Condensed version (~400 tokens)
82
- ```
58
+ That is the entire artifact: no build step, no runtime, no dependencies.
59
+ This is the **Agent Skills open standard** ([agentskills.io](https://agentskills.io)),
60
+ and the same folder works in Claude Code, Codex CLI, Gemini CLI, GitHub Copilot, Cursor,
61
+ Windsurf, Cline, pi, and 20+ other tools.
62
+ Garry Tan’s **gstack** (~97K stars) is 23 skills, each a plain `SKILL.md` and nothing
63
+ more — proof the baseline scales without custom tooling.
83
64
 
84
- ### 1.2 Multi-Agent Integration Files
65
+ **The only two things that matter for a basic skill:**
85
66
 
86
- Each agent platform has different file format requirements:
67
+ 1. A **`description`** that says *what it does* AND *when to use it* (this drives
68
+ activation — see §4.2).
69
+ 2. A **body under ~500 lines** of clear, imperative instructions.
87
70
 
88
- | Agent | File | Format | Location |
89
- | --- | --- | --- | --- |
90
- | Claude Code | SKILL.md | YAML frontmatter + Markdown | `.claude/skills/name/` |
91
- | Cursor IDE | CURSOR.mdc | MDC frontmatter + Markdown | `.cursor/rules/` |
92
- | Codex | AGENTS.md | HTML markers + Markdown | repo root |
71
+ **Install it** by copying into a known directory, or with the cross-agent package
72
+ manager:
93
73
 
94
- **SKILL.md Format** (Claude Code):
74
+ ```bash
75
+ npx skills add owner/repo # Vercel's skills.sh ecosystem (symlinks, 27+ agents)
76
+ # or commit it to a discovery directory:
77
+ # .agents/skills/my-skill/SKILL.md (cross-agent: Codex, pi, others)
78
+ # .claude/skills/my-skill/SKILL.md (Claude Code, project)
79
+ # ~/.claude/skills/my-skill/SKILL.md (Claude Code, personal)
80
+ ```
81
+
82
+ The `SKILL.md` folder is the portable **authoring** format; some agents add their own
83
+ discovery paths (Codex/pi also read `.agents/skills/`) and their own **distribution**
84
+ layers on top (Claude Code plugins, Codex plugins) — see §5.
85
+
86
+ ### 0.2 If your capability is a CLI
87
+
88
+ Most agents already know how to run CLIs from their training data, and benchmarks show a
89
+ CLI is far cheaper and more reliable than an MCP server for tools that have one (§7).
90
+ So:
91
+
92
+ 1. Make the CLI **agent-friendly**: a clear `--help`, a `--json` flag on every command,
93
+ actionable errors, and idempotent, non-interactive operation (`--yes`/`--auto`).
94
+ 2. Ship a **`SKILL.md`** (or an `AGENTS.md` snippet) that tells the agent the tool
95
+ exists, what it’s for, and the handful of commands to run.
96
+ Reference the CLI via a **pinned zero-install runner** (`npx`/`uvx <pkg>@<version>`)
97
+ so it works even in ephemeral/cloud environments — global install vs.
98
+ zero-install is its own design dimension (§6.7).
99
+ 3. That’s the baseline.
100
+ Stop here unless you have many subcommands or need cross-session state, structured
101
+ auth, or background services — then see §6 (CLI-as-skill) and §7 (MCP).
102
+
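The agent-friendly behaviors in item 1 can be sketched as follows. This is a minimal illustration, not `tbd`'s implementation: the `mycli status` command, the `run` helper, and the status fields are all hypothetical. The handler is a pure function, so the same logic serves human-readable and `--json` output and never prompts for input.

```typescript
// Hypothetical sketch: one handler, two output modes, no interactive prompts.
interface CliResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Illustrative data; a real tool would read this from its own state.
const STATUS = { open: 3, inProgress: 1, blocked: 1 };

function run(argv: string[]): CliResult {
  const json = argv.includes("--json");
  const command = argv.find((arg) => !arg.startsWith("-"));

  if (command !== "status") {
    // Actionable error: say what went wrong and what to run instead.
    const message = `Unknown command "${command ?? ""}". Try: mycli status --json`;
    return {
      exitCode: 2,
      stdout: "",
      stderr: json ? JSON.stringify({ error: message }) : message,
    };
  }

  const stdout = json
    ? JSON.stringify(STATUS)
    : `Tasks: ${STATUS.open} open (${STATUS.inProgress} in progress) | ${STATUS.blocked} blocked`;
  return { exitCode: 0, stdout, stderr: "" };
}

const result = run(["status", "--json"]);
console.log(result.stdout);
```

Keeping the handler free of direct terminal I/O also makes it easy to test that every command produces parseable JSON and a non-zero exit code on failure.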
103
+ ### 0.3 The one-paragraph decision guide
104
+
105
+ - **Prompt/instructions only** → ship a `SKILL.md`. (§3, §4)
106
+ - **Project-wide conventions** (build/test/style) → add an `AGENTS.md`. (§2)
107
+ - **You have a CLI** → `SKILL.md` + agent-friendly `--json` CLI. (§0.2, §6)
108
+ - **Many subcommands / a knowledge library** → CLI-as-skill meta-pattern.
109
+ (§6)
110
+ - **A service with no CLI, or you need OAuth / multi-tenant / audit** → MCP server.
111
+ (§7)
112
+ - **Maximum reach across many agents** → layer them: AGENTS.md + SKILL.md + CLI + MCP.
113
+ (§1)
114
+
115
+ Everything below is reference material.
116
+ You do not need most of it for most tools.
95
117
 
96
- ```yaml
97
- ---
98
- name: mycli
99
- description: Lightweight, git-native issue tracking...
100
- allowed-tools: Bash(mycli:*)
101
- ---
102
- # mycli Workflow
103
- ...
104
- ```
118
+ * * *
105
119
 
106
- **CURSOR.mdc Format** (Cursor IDE):
120
+ ## 1. The Layered Model — “Write Once, Integrate Many”
107
121
 
108
- ```yaml
109
- ---
110
- description: mycli workflow rules for git-native issue tracking...
111
- alwaysApply: false
112
- ---
113
- # mycli Workflow
114
- ...
115
- ```
122
+ There is no single integration surface that every agent uses, but the surfaces compose
123
+ cleanly.
124
+ Pick the lowest layer that satisfies your need; add higher layers only for reach
125
+ or capability.
116
126
 
117
- **AGENTS.md Format** (Codex):
127
+ | Layer | Artifact | What it’s for | Reach (May 2026) |
128
+ | --- | --- | --- | --- |
129
+ | 1. Project baseline | `AGENTS.md` | Build/test/style/conventions for *this repo* | Codex, Cursor, Copilot, Gemini CLI, Windsurf, Amp, Jules, Goose, Factory, Aider, opencode, pi |
130
+ | 2. Portable capability | `SKILL.md` (Agent Skills) | A reusable, on-demand capability | Claude Code, Codex, Gemini CLI, Copilot (VS Code), Cursor, Windsurf, Cline, pi, 20+ |
131
+ | 3. Execution | A CLI | Efficient, composable tool the agent invokes via shell | Every agent with a shell tool |
132
+ | 4. Structured/remote | MCP server | Services without a CLI; OAuth, multi-tenant, audit | Every major agent except Aider (native); pi via extension |
133
+ | 5. Per-agent polish | `.cursor/rules/*.mdc`, plugin packaging, ACP, etc. | Glob-scoped activation, enterprise distribution, editor discovery | Per-agent |
118
134
 
119
- ```markdown
120
- <!-- BEGIN MYCLI INTEGRATION -->
121
- # mycli Workflow
122
- ...
123
- <!-- END MYCLI INTEGRATION -->
124
- ```
135
+ **Recommended default for a tool author who wants broad reach**: ship an `AGENTS.md`
136
+ snippet (universal baseline) + a `SKILL.md` (portable capability) + an agent-friendly
137
+ CLI. Add an MCP server only when a CLI can’t serve the need.
138
+ Add agent-specific files last, and only where they buy something.
125
139
 
126
140
  * * *
127
141
 
128
- ## 2. Context Management Commands
142
+ ## 2. AGENTS.md: The Universal Project Baseline
143
+
144
+ `AGENTS.md` is a plain-Markdown file at the repo root that tells any agent how your
145
+ project works: build commands, test commands, conventions, gotchas.
146
+ It is **not** capability-specific — think of it as the README written for agents.
147
+
148
+ **Governance & reach**: Originated by OpenAI (Aug 2025); since Dec 2025 stewarded by the
149
+ **Agentic AI Foundation under the Linux Foundation** (co-founded by OpenAI, Anthropic,
150
+ and Block; ~180 member orgs).
151
+ Used by **60,000+** open-source projects.
152
+ Canonical spec: [agents.md](https://agents.md).
153
+
154
+ **Discovery & precedence** vary by agent — know your targets:
155
+
156
+ - **Codex**: reads global `~/.codex/AGENTS.md` (or `AGENTS.override.md`), then walks
157
+ from repo root down to the working directory, concatenating one file per directory
158
+ (root→leaf, deeper overrides shallower).
159
+ Combined content is capped at `project_doc_max_bytes` (**default 32 KiB**). It does
160
+ **not** lazy-load nested files when reading child directories (open request).
161
+ - **Cursor** (since v1.6), **Copilot** (since Aug 2025), **Windsurf**, **Gemini CLI**
162
+ (alongside `GEMINI.md`), **Amp** (falls back to `CLAUDE.md`), **Jules**, **Goose**
163
+ (hierarchical scoping), **Factory**, **opencode**, **pi**: all read `AGENTS.md`
164
+ natively.
165
+ - **Claude Code**: as of May 2026 does **not** auto-load `AGENTS.md`; it uses
166
+ `CLAUDE.md`. A common pattern is to symlink `CLAUDE.md → AGENTS.md`, or to maintain
167
+ both. (`tbd` writes a marked section into `AGENTS.md` and lets users keep a separate
168
+ `CLAUDE.md`.)
169
+ - **Aider**: uses `CONVENTIONS.md` but recommends `AGENTS.md` for interoperability.
170
+
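The root-to-leaf discovery described for Codex can be sketched roughly as below. This is not Codex's code. The function names and the in-memory file map are hypothetical, string length stands in for byte size, and two behaviors are assumptions: that a per-directory `AGENTS.override.md` takes priority over `AGENTS.md`, and that loading stops at the first file that would exceed the cap rather than truncating it.

```typescript
// Hypothetical sketch of root-to-leaf AGENTS.md discovery with a size cap.
// Paths are POSIX-style strings; contents come from an in-memory map.
const DEFAULT_MAX_BYTES = 32 * 1024; // mirrors the 32 KiB default cited above

function candidateDirs(repoRoot: string, cwd: string): string[] {
  // Assumes cwd is repoRoot or a descendant of it.
  const rel = cwd.slice(repoRoot.length).split("/").filter(Boolean);
  const dirs = [repoRoot];
  let current = repoRoot;
  for (const part of rel) {
    current = `${current}/${part}`;
    dirs.push(current);
  }
  return dirs; // root first, leaf last
}

function loadProjectDocs(
  files: Map<string, string>,
  repoRoot: string,
  cwd: string,
  maxBytes: number = DEFAULT_MAX_BYTES,
): string {
  const parts: string[] = [];
  let used = 0;
  for (const dir of candidateDirs(repoRoot, cwd)) {
    // One file per directory; assumed: the local override wins if present.
    const doc = files.get(`${dir}/AGENTS.override.md`) ?? files.get(`${dir}/AGENTS.md`);
    if (doc === undefined) continue;
    // String length approximates bytes (exact for ASCII content).
    if (used + doc.length > maxBytes) break;
    parts.push(doc);
    used += doc.length;
  }
  return parts.join("\n\n");
}

const exampleFiles = new Map([
  ["/repo/AGENTS.md", "Run `pnpm test` before committing."],
  ["/repo/packages/cli/AGENTS.md", "CLI code lives in src/commands."],
]);
console.log(loadProjectDocs(exampleFiles, "/repo", "/repo/packages/cli"));
```

Because deeper files come later in the concatenated text, a nested file can refine or override root-level guidance without editing it.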
171
+ **Author tip**: keep `AGENTS.md` concise (it loads into every turn and competes for
172
+ context).
173
+ Put deep, on-demand material in skills or files the agent can open when needed.
174
+ `AGENTS.override.md` lets a developer override the committed file locally without
175
+ editing it.
129
176
 
130
- ### 2.1 Prime Command Pattern
177
+ * * *
131
178
 
132
- The `prime` command is the key context management primitive.
133
- It outputs contextual information appropriate to the current state:
179
+ ## 3. The Agent Skills Standard (SKILL.md)
134
180
 
135
- - **Initialized repo**: Dashboard with status, rules, quick reference
136
- - **Not initialized**: Setup instructions
137
- - **Migration detected**: Migration warning and guidance
181
+ **What it is**: a folder with a `SKILL.md` file (YAML frontmatter + Markdown body), plus
182
+ optional supporting files.
183
+ Created by Anthropic (Dec 2025), published under Apache 2.0 at
184
+ [agentskills.io](https://agentskills.io).
185
+ 280,000+ public skills exist as of early 2026.
138
186
 
139
- **Key Features**:
187
+ **Standard frontmatter** (portable across agents): `name`, `description`. Optional but
188
+ widely recognized: `license`, `compatibility`, `metadata`, `allowed-tools`. Agents
189
+ silently ignore frontmatter keys they don’t understand, which is what makes a single
190
+ `SKILL.md` portable.
140
191
 
141
- - Called automatically via hooks at session start and before context compaction
142
- - `--brief` flag for constrained contexts (~200 tokens)
143
- - `--full` flag for complete skill documentation
144
- - Custom override via `.mycli/PRIME.md` file
145
- - CLI with no args shows help with prominent prompt to run `mycli prime` for full
146
- context
192
+ ### 3.1 Progressive disclosure (the core design principle)
147
193
 
148
- **Relationship with `skill` Command**:
194
+ Skills are loaded in three levels so they cost almost nothing until used:
149
195
 
150
- The `prime` and `skill` commands serve different purposes:
196
+ | Level | Content | Loaded | Budget |
197
+ | --- | --- | --- | --- |
198
+ | 1. Discovery | `name` + `description` only | Always (in the skill listing) | ~100 tokens |
199
+ | 2. Activation | Full `SKILL.md` body | When invoked (by user or model) | keep < ~5,000 tokens / 500 lines |
200
+ | 3. Execution | Supporting files, scripts, references | On demand, by the model reading the filesystem | unbounded |
151
201
 
152
- | Command | Purpose | When Called |
153
- | --- | --- | --- |
154
- | `mycli prime` | Dashboard + status + workflow rules | Session start, context compaction |
155
- | `mycli skill` | Pure skill content (no status/dashboard) | Agent inspection, installation preview |
202
+ **Constraints that matter**: keep the body **under 500 lines**; keep reference files
203
+ **one level deep** from `SKILL.md` (avoid `SKILL.md → a.md → b.md` chains); put bulky
204
+ material (schemas, examples, scripts) in supporting files.
205
+ Scripts execute *outside* the context window; only their output costs tokens, which is
206
+ why bundling a script can be far cheaper than inlining instructions.
156
207
 
157
- Use `prime` for context restoration with current state, `skill` for static skill
158
- documentation.
208
+ ### 3.2 Bundled scripts and resources
159
209
 
160
- **Dashboard Output Structure**:
210
+ A skill folder can ship executable helpers:
161
211
 
162
212
  ```
163
- mycli v1.0.0
164
-
165
- --- INSTALLATION ---
166
- mycli installed (v1.0.0)
167
- Initialized in this repo
168
- Hooks installed
169
-
170
- --- PROJECT STATUS ---
171
- Repository: proj
172
- Tasks: 3 open (1 in_progress) | 1 blocked
173
-
174
- --- WORKFLOW RULES ---
175
- - Track all task work using mycli
176
- - Check `mycli ready` for available work
177
- - Run `mycli sync` at session end
178
-
179
- --- QUICK REFERENCE ---
180
- mycli ready Show tasks ready to work
181
- mycli show <id> View task details
182
- ...
213
+ my-skill/
214
+ ├── SKILL.md
215
+ ├── reference.md # loaded only when the body points to it
216
+ ├── scripts/
217
+ │   └── transform.py   # run by the agent; only stdout enters context
218
+ └── assets/
219
+     └── template.xlsx
183
220
  ```
184
221
 
185
- ### 2.2 Progressive Disclosure Architecture
222
+ In Claude Code, `${CLAUDE_SKILL_DIR}` resolves to the skill’s directory for portable
223
+ script references. Anthropic’s own document skills (docx/pdf/pptx/xlsx) and the
224
+ `skill-creator` skill are good worked examples of this layout.
186
225
 
187
- The most important concept for building agent-integrated CLIs is **Progressive
188
- Disclosure**—showing just enough information to help agents decide what to do next, then
189
- revealing more details as needed.
190
- This minimizes token overhead while maintaining full capability.
191
-
192
- **Three-Level Architecture**:
193
-
194
- | Level | Content | Token Budget | When Loaded |
195
- | --- | --- | --- | --- |
196
- | Level 1 | Metadata only (name + description) | ~100 tokens | Always in system prompt |
197
- | Level 2 | SKILL.md body | ~1,500-5,000 tokens | On skill trigger |
198
- | Level 3 | Bundled resources (scripts, references) | Unlimited | As-needed |
199
-
200
- **Key Constraints**:
201
-
202
- - Keep SKILL.md under 500 lines
203
- - Reference files should stay one level deep from SKILL.md
204
- - Avoid chains like SKILL.md → advanced.md → details.md
205
- - Scripts execute outside context; only output uses tokens
206
-
207
- ### 2.3 Description Optimization Pattern
226
+ * * *
208
227
 
209
- Skill activation relies on **pure LLM reasoning**, not keyword matching or embeddings.
210
- Description quality directly impacts activation reliability.
228
+ ## 4. Writing a Great SKILL.md
211
229
 
212
- **Activation Reliability Data** (from real-world testing across 200+ prompts):
230
+ ### 4.1 Frontmatter: standard vs. Claude Code extensions
213
231
 
214
- | Approach | Success Rate |
215
- | --- | --- |
216
- | No optimization / vague descriptions | ~20% |
217
- | Optimized descriptions with “Use when …” | ~50% |
218
- | Descriptions with concrete examples | 72-90% |
219
- | Forced evaluation hooks | 80-84% |
232
+ Only `name` and `description` are required by the open standard.
233
+ Claude Code recognizes a larger set (other agents ignore the extras):
220
234
 
221
- **The Two-Part Rule**: Every description must answer:
235
+ | Field | Standard? | Notes |
236
+ | --- | --- | --- |
237
+ | `name` | ✓ | ≤ 64 chars; lowercase, digits, hyphens; defaults to directory name |
238
+ | `description` | ✓ | ≤ 1,024 chars in the field; see §4.2 |
239
+ | `license`, `compatibility`, `metadata` | ✓ (optional) | Portability/metadata |
240
+ | `allowed-tools` | ✓ (optional) | Pre-grant tools, e.g. `Bash(mycli:*), Read, Write` |
241
+ | `when_to_use` | Claude Code | Appended to `description` in the listing (shares the cap) |
242
+ | `argument-hint`, `arguments` | Claude Code | Autocomplete + `$name`/`$ARGUMENTS` substitution |
243
+ | `disable-model-invocation` | Claude Code | Skill won’t auto-trigger; user/explicit only |
244
+ | `user-invocable` | Claude Code | `false` = background knowledge, hidden from `/` menu |
245
+ | `model`, `effort` | Claude Code | Override model / reasoning effort for the skill’s turn |
246
+ | `context: fork`, `agent` | Claude Code | Run the skill in an isolated subagent |
247
+ | `paths` | Claude Code | Glob patterns that gate auto-activation to matching files |
248
+ | `hooks`, `shell` | Claude Code | Skill-scoped lifecycle hooks; `bash`/`powershell` |
249
+
250
+ Keep your committed `SKILL.md` to the **standard fields plus `allowed-tools`** if you
251
+ want maximum portability; add Claude-specific fields when you’re targeting Claude Code
252
+ specifically.
253
+
254
+ ### 4.2 Description optimization (this is what makes skills activate)
255
+
256
+ Activation is **pure LLM reasoning** — the model reads every installed skill’s `name` +
257
+ `description` and decides what to invoke.
258
+ There is no keyword matcher or embedding step.
259
+ So the description is the single highest-leverage thing you write.
260
+
261
+ **The two-part rule** — every description answers:
222
262
 
223
263
  1. **What does it do?** (capabilities)
224
- 2. **When to use it?** (activation triggers)
225
-
226
- **Anti-Pattern**:
264
+ 2. **When should it be used?** (explicit triggers, in the user’s words)
227
265
 
228
266
  ```yaml
267
+ # Anti-pattern
229
268
  description: Helps with documents
230
- ```
231
-
232
- **Preferred Pattern**:
233
269
 
234
- ```yaml
235
- description: >
270
+ # Preferred
271
+ description: >-
236
272
    Analyze Excel spreadsheets, create pivot tables, and export data.
237
- Use when analyzing .xlsx files, working with tabular data, or
238
- when the user mentions spreadsheets or Excel.
239
- ```
240
-
241
- **Writing Guidelines**:
273
+   Use when analyzing .xlsx files, working with tabular data, or when the
274
+   user mentions spreadsheets or Excel.
275
+ ```
276
+
277
+ **Writing rules**: third person (“Processes files,” not “I can help you”); front-load
278
+ the most important trigger keywords in the first ~50 characters (descriptions can be
279
+ truncated in large collections); state both capability and trigger.
280
+
281
+ **Activation reliability** (community 650-trial sandboxed eval — directional, not
282
+ official): vague descriptions ~20% → optimized “Use when…” descriptions ~50% → adding
283
+ concrete examples ~72–90%. Two distinct failure modes to design against: *activation
284
+ failure* (never invoked) and *execution failure* (invoked but steps skipped — fix with
285
+ clearer, checklist-style instructions).
286
+
287
+ ### 4.3 The description budget (changed in 2026 — verify against your target)
288
+
289
+ Earlier guidance cited a flat ~15K-character budget.
290
+ **Claude Code’s current model is different**:
291
+
292
+ - The skill listing gets a budget of **~1% of the model’s context window** by default
293
+ (`skillListingBudgetFraction`, default `0.01`). When it overflows, the
294
+ least-recently-invoked skills lose their descriptions first.
295
+ - Per-skill listing text (`description` + `when_to_use`) is truncated at **1,536 chars**
296
+ (`maxSkillDescriptionChars`).
297
+ - `SLASH_COMMAND_TOOL_CHAR_BUDGET` overrides the fraction with a fixed character count.
298
+ - `skillOverrides` can set any skill to `on` / `name-only` / `user-invocable-only` /
299
+ `off` without editing the file; `/doctor` reports overflow.
300
+
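To get a feel for the overflow rule, here is a toy model of it. It is not Claude Code's implementation: the `Skill` shape, the `lastInvoked` recency stamp, and passing the budget directly as a character count are assumptions made for illustration. Only the 1,536-character cap and the "least-recently-invoked skills lose their descriptions first" ordering come from the text above.

```typescript
// Toy model of the listing overflow rule; all names here are hypothetical.
interface Skill {
  name: string;
  description: string; // description plus when_to_use, already joined
  lastInvoked: number; // hypothetical recency stamp; larger means more recent
}

interface ListingEntry {
  name: string;
  description: string | null; // null means demoted to name-only
}

const LISTING_TEXT_CAP = 1536; // per-skill truncation cited above

function planListing(skills: Skill[], budgetChars: number): ListingEntry[] {
  const rows = skills.map((skill) => ({
    lastInvoked: skill.lastInvoked,
    entry: {
      name: skill.name,
      description: skill.description.slice(0, LISTING_TEXT_CAP),
    } as ListingEntry,
  }));

  let total = rows.reduce(
    (sum, row) => sum + row.entry.name.length + (row.entry.description?.length ?? 0),
    0,
  );

  // Drop descriptions from the least-recently-invoked skills until the listing fits.
  const stalestFirst = [...rows].sort((a, b) => a.lastInvoked - b.lastInvoked);
  for (const row of stalestFirst) {
    if (total <= budgetChars) break;
    total -= row.entry.description?.length ?? 0;
    row.entry.description = null;
  }
  return rows.map((row) => row.entry);
}

const listing = planListing(
  [
    { name: "xlsx", description: "Analyze spreadsheets. Use when the user mentions .xlsx files.", lastInvoked: 2 },
    { name: "notes", description: "Summarize meeting notes. Use when the user shares a transcript.", lastInvoked: 1 },
  ],
  80,
);
console.log(listing);
```

Even in this simplified form, the model shows why a rarely used skill with a long description is the first to become invisible to automatic activation.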
301
+ **Implication for tools that install many skills**: don’t. Use the **meta-skill
302
+ pattern** (§6.2) — one skill that exposes N resources via CLI subcommands consumes a
303
+ single listing slot instead of N. This is the strongest architectural reason to prefer
304
+ CLI-as-skill once you have more than a handful of capabilities.
305
+
306
+ ### 4.4 Test the skill before publishing
307
+
308
+ Because activation is probabilistic (§4.2) and the body is executable influence, test
309
+ it:
310
+
311
+ - **Positive activation**: a few realistic prompts that *should* trigger the skill —
312
+ does the agent invoke it?
313
+ - **Negative activation**: nearby prompts that should *not* trigger it — no false fires?
314
+ - **Explicit invocation**: `/skill-name` (or the agent equivalent) loads and runs
315
+ cleanly.
316
+ - **Sandbox / write-denial**: the skill (and any bundled script) degrades gracefully
317
+ when the agent runs read-only or without network (§5, Codex/Claude Code sandboxes).
318
+ - **CI validation**: lint the frontmatter (required `name`/`description`, length caps)
319
+ and check that every referenced supporting file and link resolves.
320
+ For a CLI-as-skill, also run every `cli guidelines/shortcut/<name>` reference and
321
+ assert it exists.
322
+
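The CI validation bullet can start as small as the sketch below. It is a hypothetical lint, not an official validator: `lintSkill` and its messages are invented for illustration, it only understands single-line `key: value` frontmatter (a real check should use a YAML parser so folded scalars such as `description: >-` work), and the "Use when" check is a heuristic based on the two-part rule in §4.2.

```typescript
// Minimal frontmatter lint for CI; handles single-line `key: value` pairs only.
const NAME_PATTERN = /^[a-z0-9-]{1,64}$/; // lowercase, digits, hyphens; max 64 chars
const MAX_DESCRIPTION_CHARS = 1024;

function lintSkill(markdown: string): string[] {
  const lines = markdown.split("\n");
  if (lines[0] !== "---") return ["missing YAML frontmatter"];
  const end = lines.indexOf("---", 1);
  if (end === -1) return ["unterminated YAML frontmatter"];

  const fields = new Map<string, string>();
  for (const line of lines.slice(1, end)) {
    const sep = line.indexOf(":");
    if (sep > 0) fields.set(line.slice(0, sep).trim(), line.slice(sep + 1).trim());
  }

  const problems: string[] = [];
  const name = fields.get("name");
  if (!name) {
    problems.push("missing required field: name");
  } else if (!NAME_PATTERN.test(name)) {
    problems.push("name must be at most 64 lowercase letters, digits, or hyphens");
  }

  const description = fields.get("description");
  if (!description) {
    problems.push("missing required field: description");
  } else {
    if (description.length > MAX_DESCRIPTION_CHARS) {
      problems.push(`description exceeds ${MAX_DESCRIPTION_CHARS} characters`);
    }
    // Heuristic only: the two-part rule asks for an explicit trigger.
    if (!description.toLowerCase().includes("use when")) {
      problems.push('description should say when to use the skill (e.g. "Use when ...")');
    }
  }
  return problems;
}

const sample = "---\nname: my-skill\ndescription: Helps with documents\n---\n# My Skill\n";
console.log(lintSkill(sample));
```

A CI job could run this over every `SKILL.md` in the repository and fail the build when the returned list is non-empty.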
323
+ ### 4.5 Keep portable; version deliberately
324
+
325
+ - **Portability**: keep the committed `SKILL.md` to the standard fields (plus
326
+ `allowed-tools`). Put vendor-specific behavior behind clearly labeled sections or
327
+ generated per-agent variants rather than non-standard frontmatter that other agents
328
+ silently drop.
329
+ - **Versioning**: for packaged skills/plugins, use semantic versions, keep a changelog,
330
+ and state compatibility (`compatibility` field / a “requires” note).
331
+ Pin consumers to a commit or version, not a moving tag.
332
+ - **Deprecation**: when removing or renaming a skill, leave a deprecation window with a
333
+ pointer to the replacement; don’t silently delete an activation trigger users rely on.
242
334
 
243
- - Use third person always ("Processes files" not “I can help you”)
244
- - Include explicit “Use when …” triggers with concrete scenarios
245
- - Be specific with keywords users would naturally say
246
- - State both capabilities AND activation conditions
247
- - Front-load the most important trigger keywords in the first 50 characters
248
- (descriptions may be truncated in large skill collections)
335
+ * * *
249
336
 
250
- ### 2.4 Description Length and Budget Constraints
337
+ ## 5. Per-Agent Integration Reference
338
+
339
+ Targets differ.
340
+ This matrix reflects May 2026; verify against current docs for the agents
341
+ you care about.
342
+
343
+ | Agent | Project file | Skill / rules mechanism | MCP | Hooks | Best integration path |
344
+ | --- | --- | --- | --- | --- | --- |
345
+ | **Claude Code** | `CLAUDE.md` | Agent Skills (`SKILL.md`), `.claude/skills/`; plugins/marketplaces | Yes (stdio + Streamable HTTP) | 29 events | SKILL.md (+ plugin for distribution) |
346
+ | **Codex CLI** | `AGENTS.md` | `SKILL.md` skills + plugins (skills+MCP); `~/.codex/prompts` (deprecated) | Yes (stdio + Streamable HTTP) | — | AGENTS.md + skills/plugins + MCP |
347
+ | **Cursor** | `.cursor/rules/*.mdc`, `AGENTS.md` | MDC rules (Always/Auto-glob/Agent-requested/Manual) | Yes | 6 events (incl. `beforeShellExecution`) | AGENTS.md + `.mdc` for glob scoping |
348
+ | **GitHub Copilot** | `.github/copilot-instructions.md`, `AGENTS.md` | `SKILL.md` (VS Code); `.agent.md` custom agents | Yes | `preToolUse`/`postToolUse`/… | SKILL.md + MCP; enterprise-managed plugins |
349
+ | **Gemini CLI** | `GEMINI.md` + `AGENTS.md` | Agent Skills; extensions (bundle hooks) | Yes (stdio + SSE) | ~12 events | AGENTS.md + MCP/extension |
350
+ | **Windsurf** | `.windsurf/rules/*.md` + `AGENTS.md` | Rules with activation modes | Yes (OAuth, Streamable HTTP) | pre-hooks can **block** (exit 2) | AGENTS.md + MCP |
351
+ | **Cline** | `.clinerules/` | Glob-scoped rules; Cline SDK plugins | Yes (mature marketplace) | SDK lifecycle events | `.clinerules` + MCP |
352
+ | **Aider** | `CONVENTIONS.md` / `AGENTS.md` | Conventions file (read-only context) | No native (proxy only) | Git hooks only | AGENTS.md/CONVENTIONS.md |
353
+ | **opencode** | `AGENTS.md` + `opencode.jsonc` | JS/TS plugins; skills dir | Yes | 25+ events, tool interception | Plugin + MCP |
354
+ | **Amp** | `AGENTS.md` (→ `CLAUDE.md`) | Plugins (tools + hooks) | Yes (Sourcegraph MCP + custom) | session/turn/tool | MCP + plugin |
355
+ | **Jules** (Google) | `AGENTS.md` | AGENTS.md only | Yes (curated, since Feb 2026) | — (cloud async) | AGENTS.md + Jules REST API |
356
+ | **Goose** (Block) | `AGENTS.md` | Recipes; 70+ MCP extensions | Yes (deepest) | extension lifecycle | MCP (primary) |
357
+ | **Zed** | `.rules` (reads `.cursorrules`, `CLAUDE.md`) | Rules Library | Yes (extensions) | — | MCP extension + `.rules` |
358
+ | **Factory** | `AGENTS.md` + `.factory/droids/*.md` | Custom Droids (sub-agents) | Yes | Delegator loop | AGENTS.md + droid file |
359
+ | **pi** | `AGENTS.md` / `CLAUDE.md`, `.pi/SYSTEM.md` | Agent Skills (`.pi/skills/`, `.agents/skills/`); TS extensions | **No (by design)** — use CLI+README or an extension | extension hooks | SKILL.md + CLI; extension for deep tool registration |
360
+
361
+ **Notes on the minimal end (pi)**: pi (Mario Zechner’s `@mariozechner/pi-coding-agent`,
362
+ ~44K stars) ships four tools (read/write/edit/bash) and treats context as a scarce
363
+ budget. It reads `AGENTS.md`/`CLAUDE.md`, supports the Agent Skills standard, and
364
+ **deliberately omits MCP** — its docs tell authors to “build CLI tools with READMEs” or
365
+ write a TypeScript extension (`pi.registerTool()` / `pi.registerCommand()`). This is a
366
+ clean endorsement of the CLI-as-skill approach: a self-documenting CLI plus a `SKILL.md`
367
+ is exactly what a minimal agent wants.
368
+
369
+ **Codex specifics** (it gained a real skill system in 2026): skills are `SKILL.md`
370
+ folders with the same progressive disclosure, discovered from
371
+ repository/user/admin/system `.agents/skills/` directories.
372
+ **Plugins** are one distribution layer on top (installable units bundling skills + MCP
373
+ servers — 90+ ship with Codex), not the only install path — a plain
374
+ `.agents/skills/<name>/SKILL.md` works without packaging.
375
+ Operational config lives in `~/.codex/config.toml` (or trusted per-project
376
+ `.codex/config.toml`): `model`, `approval_policy`
377
+ (`untrusted`/`on-request`/`granular`/`never`), `sandbox_mode`
378
+ (`read-only`/`workspace-write`/`danger-full-access`), and `[mcp_servers.*]`. A CLI your
379
+ tool ships will run **inside Codex’s sandbox** — under `workspace-write`, writes are
380
+ limited to workspace roots and network is off unless explicitly enabled.
381
+ Design your CLI to work read-only where possible and to fail with a clear message when
382
+ sandboxed.
251
383
 
252
- Claude Code has a **cumulative character budget** for all skill descriptions combined.
253
- Understanding these limits is critical for CLIs that install multiple skills.
384
+ * * *
254
385
 
255
- **Hard Limits**:
386
+ ## 6. CLI-as-Skill (Advanced) — One Tool, Many Self-Injecting Commands
256
387
 
257
- | Constraint | Limit | Notes |
258
- | --- | --- | --- |
259
- | Individual description | 1,024 characters | Per-skill maximum |
260
- | Skill name | 64 characters | Lowercase, numbers, hyphens only |
261
- | SKILL.md body | ~500 lines | Soft limit; use supporting files for more |
262
- | **Cumulative budget** | ~15,000-16,000 chars | For ALL skill descriptions combined |
388
+ This is the pattern for a richer tool: a CLI that is itself a skill, exposing many
389
+ capabilities as subcommands while costing a single description slot.
390
+ `tbd` is the reference implementation; **Beads/`bd`** (Steve Yegge), `tbd`’s lineage,
391
+ follows the same shape (subcommands + `AGENTS.md` + `--json` + an optional MCP server).
263
392
 
264
- **Per-Skill Overhead**: Each skill consumes ~109 characters of XML overhead (tags, name,
265
- location) plus the description length.
393
+ Use this when you have many capabilities, need cross-session state, or want a curated
394
+ knowledge library the agent pulls from.
395
+ For a single capability, the §0 baseline is better — don’t reach for this prematurely.
266
396
 
267
- **Truncation Behavior**: When the cumulative budget is exceeded, skills are hidden:
397
+ ### 6.1 Two kinds of commands
268
398
 
269
- | Skills Installed | Skills Visible | Hidden |
399
+ | Type | Purpose | Examples |
270
400
  | --- | --- | --- |
271
- | 63 | 42 | 33% |
272
- | 92 | 36 | 60% |
273
-
274
- **No warning is shown** when skills are truncated.
275
- Run `/context` to check for excluded skills.
276
-
277
- **Description Length Guidelines by Collection Size**:
278
-
279
- | Skill Collection Size | Recommended Description Length |
280
- | --- | --- |
281
- | < 40 skills | Up to 1,024 characters (full limit) |
282
- | 40-60 skills | ≤150 characters |
283
- | 60+ skills | ≤130 characters |
284
-
285
- **Override**: Set `SLASH_COMMAND_TOOL_CHAR_BUDGET` environment variable to increase the
286
- limit.
287
-
288
- **Meta-Skill Pattern**: For CLIs with many resources (50+), use a single meta-skill that
289
- exposes resources via CLI commands rather than individual skills.
290
- This consumes only one description slot (~200 chars) instead of 50+ slots that would
291
- exceed the budget. See
292
- [Skills vs Meta-Skill Architecture Research](../../project/research/current/research-skills-vs-meta-skill-architecture.md).
293
-
294
- ### 2.5 Skill Command Architecture
295
-
296
- Every agent-integrated CLI should have a `skill` subcommand that outputs skill content
297
- to stdout.
298
- This enables agents to inspect skill content, preview before installation, and
299
- pipe to other commands.
300
-
301
- **Basic Pattern**:
302
-
303
- ```bash
304
- mycli skill # Full skill content
305
- mycli skill --brief # Condensed version for constrained contexts
306
- ```
307
-
308
- **Key Behaviors**:
401
+ | **Action commands** | Perform operations | `create`, `close`, `sync` |
402
+ | **Informational commands** | Output guidance for the agent to *follow* | `guidelines <name>`, `shortcut <name>`, `template <name>` |
309
403
 
310
- - Outputs composed skill content (doesn’t just read a static file)
311
- - Dynamically generates resource directories from available docs
312
- - Supports verbosity flags for different contexts
313
- - Can be piped to files or other commands
404
+ Informational commands don’t *do* anything; they print instructions, best practices, or
405
+ templates the agent reads and acts on.
406
+ This is the mechanism behind tbd’s **knowledge-injection-via-subcommands**: rather than
407
+ installing dozens of skills, tbd installs *one* meta-skill and exposes its entire
408
+ library (`tbd guidelines --list`, `tbd guidelines typescript-rules`, …) as commands the
409
+ agent calls just-in-time.
314
410
 
315
- #### 2.5.1 Tiered Skill Files
411
+ This works well in practice and is the right answer for a many-subcommand CLI because:
316
412
 
317
- Maintain two tiers of skill content for different contexts:
413
+ - **Budget**: one listing slot, unbounded resources (vs.
414
+ the per-skill budget in §4.3).
415
+ - **Currency**: resources ship and version with the CLI; `--list` is generated from
416
+ what’s actually installed, so it never goes stale.
417
+ - **Composability**: each resource can reference other commands, forming a
418
+ self-directing context loop (§6.4).
318
419
 
319
- | Tier | File | Tokens | When Used |
320
- | --- | --- | --- | --- |
321
- | Full | `skill-baseline.md` | ~2000 | Default setup, `skill` command |
322
- | Brief | `skill-brief.md` | ~400 | `skill --brief`, compacted contexts |
323
-
324
- #### 2.5.2 Skill Composition Pattern
420
+ ### 6.2 The meta-skill composition
325
421
 
326
- Compose full skill content from separate components:
422
+ tbd’s `skill` command composes the installed `SKILL.md` from parts so the static guide
423
+ and the dynamic catalog stay in sync:
327
424
 
328
425
  ```
329
426
  ┌─────────────────────────────────────┐
330
- │ claude-header.md (YAML frontmatter) │
427
+ │ claude-header.md (YAML frontmatter) │ ← name + two-part description + allowed-tools
331
428
  ├─────────────────────────────────────┤
332
- │ skill-baseline.md (workflow guide)
429
+ │ skill-baseline.md (workflow guide) ← the durable "how to use this tool" body
333
430
  ├─────────────────────────────────────┤
334
- │ <!-- BEGIN SHORTCUT DIRECTORY -->
335
- (dynamically generated tables)
336
- │ <!-- END SHORTCUT DIRECTORY -->
431
+ │ <!-- BEGIN SHORTCUT DIRECTORY --> ← generated from the live DocCache
432
+ | command | title | description |
433
+ │ <!-- END SHORTCUT DIRECTORY -->
337
434
  └─────────────────────────────────────┘
338
435
  ```
339
436
 
340
- **Benefits**:
341
-
342
- - Header can be updated independently from content
343
- - Dynamic sections stay current with available resources
344
- - HTML comment markers enable partial updates without full regeneration
345
-
346
- **Implementation Pattern**:
347
-
348
- ```typescript
349
- async function composeFullSkill(): Promise<string> {
350
- // Load YAML header (Claude Code metadata)
351
- const header = await loadDocContent('install/claude-header.md');
352
-
353
- // Load base skill content
354
- const baseSkill = await loadDocContent('shortcuts/system/skill-baseline.md');
437
+ Maintain **two tiers** of skill content: a full `skill-baseline.md` (~2,000 tokens, used by
438
+ default setup and the `skill` command) and a `skill-brief.md` (~400 tokens, used by
439
+ `skill --brief` and in constrained or post-compaction contexts).
355
440
 
356
- // Generate dynamic resource directory
357
- const directory = await generateShortcutDirectory();
358
-
359
- // Compose: header + base + dynamic content
360
- let result = header + baseSkill;
361
- if (directory) {
362
- result = result.trimEnd() + '\n\n' + directory;
363
- }
364
-
365
- return result;
366
- }
367
- ```
441
+ ### 6.3 Resource directories: show the full command
368
442
 
369
- **Updatable Regions with HTML Markers**:
370
-
371
- When installing skill files, use HTML comment markers to identify sections that can be
372
- updated independently:
443
+ When listing resources, print the command to run, not just a name — it removes a step
444
+ for the agent.
373
445
 
374
446
  ```markdown
375
- <!-- BEGIN SHORTCUT DIRECTORY -->
376
447
  ## Available Shortcuts
377
- ...generated table...
378
- <!-- END SHORTCUT DIRECTORY -->
379
- ```
380
-
381
- This allows the CLI to update just the directory section when resources change, without
382
- regenerating the entire skill file.
383
-
384
- **“DO NOT EDIT” Markers**:
385
-
386
- For generated skill files installed in `.claude/skills/`, insert a “DO NOT EDIT” marker
387
- after the frontmatter to warn users:
388
-
389
- ```markdown
390
- ---
391
- name: mycli
392
- description: ...
393
- ---
394
- <!-- DO NOT EDIT: Generated by mycli setup.
395
- Run 'mycli setup' to update.
396
- -->
397
- # mycli Workflow
398
- ...
448
+ | Command | Purpose | Description |
449
+ |---------|---------|-------------|
450
+ | `mycli shortcut code-review` | Commit code | Pre-commit checks and commit flow |
451
+ | `mycli shortcut new-plan-spec` | Plan a feature | Create a planning specification |
399
452
  ```
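A directory like the one above can be generated rather than hand-written. The following is a minimal sketch, not tbd's actual implementation: the `Shortcut` shape and the `renderShortcutDirectory` name are hypothetical, and the markers are the ones shown in §6.2.

```typescript
// Hypothetical shape for one bundled shortcut; the field names are assumptions.
interface Shortcut {
  name: string;
  title: string;
  description: string;
}

// Render a marker-bounded directory whose first column is the full command
// the agent should run, not just the resource name.
function renderShortcutDirectory(cli: string, shortcuts: Shortcut[]): string {
  // Sort a copy so repeated runs produce byte-identical output.
  const sorted = [...shortcuts].sort((a, b) => a.name.localeCompare(b.name));
  const rows = sorted.map(
    (s) => `| \`${cli} shortcut ${s.name}\` | ${s.title} | ${s.description} |`,
  );
  return [
    "<!-- BEGIN SHORTCUT DIRECTORY -->",
    "## Available Shortcuts",
    "",
    "| Command | Purpose | Description |",
    "| --- | --- | --- |",
    ...rows,
    "<!-- END SHORTCUT DIRECTORY -->",
  ].join("\n");
}
```

Sorting before rendering is a design choice made here so that regenerating the skill file does not produce spurious diffs.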
400
453
 
401
- #### 2.5.3 File Management Patterns
402
-
403
- **Critical Architecture Principle**: CLIs should version control **source files**, not
404
- final installed files.
405
- Different file types have different management patterns.
406
-
407
- **File Ownership Summary**:
454
+ Back resource lookup with a **path-ordered cache** so project- and user-level files can
455
+ shadow built-ins (like `$PATH`): project `.mycli/docs/` → user `~/.mycli/docs/` →
456
+ bundled. This lets teams customize without forking.
408
457
 
409
- | File | Location | Managed By | User Editable? |
410
- | --- | --- | --- | --- |
411
- | **SKILL.md** | `.claude/skills/mycli/` | CLI (fully) | ❌ Never |
412
- | **AGENTS.md** | Repo root | CLI + User (hybrid) | ✓ Outside markers |
413
- | **CLAUDE.md** | Repo root | User (optional) | ✓ Fully |
414
-
415
- **Pattern 1: SKILL.md (CLI-Managed Only)**
458
+ ### 6.4 The context-injection loop
416
459
 
417
- The SKILL.md file is **entirely managed by the CLI**. Neither users nor agents should
418
- edit it directly—it will be overwritten on the next `setup` run.
460
+ The payoff of informational commands is a self-reinforcing chain:
419
461
 
420
462
  ```
421
- project-root/.claude/skills/mycli/SKILL.md # Generated, never edit
463
+ SKILL.md ── "for TypeScript work, run `mycli guidelines typescript-rules`"
464
+ └──▶ guideline ── "create issues with `mycli create`; for tests see `mycli guidelines testing`"
465
+ └──▶ action commands / more guidelines, loaded just-in-time
422
466
  ```
423
467
 
424
- **How tbd implements this**:
425
- - `tbd setup --auto` composes and installs `.claude/skills/tbd/SKILL.md`
426
- - Inserts “DO NOT EDIT” marker after frontmatter
427
- - Regenerates on each setup with latest CLI content + dynamic shortcut directory
468
+ Rules: reference commands **explicitly** (`mycli command arg`, never “see the docs”);
469
+ **limit chain depth to 3**; make every layer end in a concrete action.
428
470
 
429
- **Pattern 2: AGENTS.md (Hybrid Management)**
471
+ ### 6.5 Making the CLI agent-friendly
430
472
 
431
- The AGENTS.md file uses **HTML comment markers** to separate CLI-managed sections from
432
- user-editable content.
473
+ - **`--json` on every command**: one output path that renders human or machine output.
474
+ - **`--brief`/`--quiet`** for constrained contexts and scripts.
475
+ - **Idempotent `setup --auto`** (non-interactive) vs.
476
+ `setup --interactive` for humans; never let an agent get stuck on a prompt.
477
+ - **Actionable errors** that include the next command to run.
478
+ - **Discoverable help**: an `IMPORTANT:` epilog pointing at a context-restore command
479
+ (e.g., `mycli prime`), and a “Getting Started” one-liner.
480
+ - **A `prime` command** (dashboard + status + rules) for session start and post-compact,
481
+ distinct from `skill` (pure documentation).
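One way to get a single output path that serves both `--json` and human output, and that keeps errors actionable, is to have every command return a structured result and render it in one place. The sketch below is hypothetical: `CommandResult`, `OutputOptions`, and `render` are illustrative names, not part of any real CLI.

```typescript
// Hypothetical result shape returned by every command handler.
interface CommandResult {
  ok: boolean;
  message: string;
  data?: unknown;
  next?: string; // the command the agent should run next, if any
}

interface OutputOptions {
  json: boolean;
  quiet: boolean;
}

// Single rendering function: machine output for --json, terse output for
// --quiet, and a "Next:" hint appended to human-readable errors.
function render(result: CommandResult, opts: OutputOptions): string {
  if (opts.json) {
    return JSON.stringify(result);
  }
  if (opts.quiet && result.ok) {
    return "";
  }
  const lines = [result.message];
  if (!result.ok && result.next) {
    lines.push(`Next: run \`${result.next}\``);
  }
  return lines.join("\n");
}
```

Because handlers never print directly, adding a new output mode later means changing only `render`.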
433
482
 
434
- ```markdown
435
- # My Project
483
+ ### 6.6 Distribution & multi-agent install
436
484
 
437
- User-written context here... User can edit
485
+ A CLI can install itself into multiple agents from one `setup` run.
486
+ tbd writes a CLI-managed `SKILL.md` to `.claude/skills/tbd/` and a **marker-bounded
487
+ section** into `AGENTS.md` (which now also feeds Cursor, Codex, and Factory), preserving
488
+ user content outside the markers:
438
489
 
490
+ ```markdown
439
491
  <!-- BEGIN MYCLI INTEGRATION -->
440
- ...CLI-generated content...
492
+ CLI-generated… ← owned by the CLI; regenerated on setup
441
493
  <!-- END MYCLI INTEGRATION -->
442
-
443
- More user content... ✓ User can edit
444
- ```
445
-
446
- **Marker-based updates**:
447
- - CLI owns content **between markers** (`<!-- BEGIN ... -->` to `<!-- END ... -->`)
448
- - User can freely edit content **outside markers**
449
- - `mycli setup --auto` updates only the marked section, preserving user content
450
-
451
- **How tbd implements this**:
452
- ```typescript
453
- // Define marker boundaries
454
- const BEGIN_MARKER = '<!-- BEGIN TBD INTEGRATION -->';
455
- const END_MARKER = '<!-- END TBD INTEGRATION -->';
456
-
457
- // Update only marked section
458
- function updateSection(existingContent: string, newSection: string): string {
459
- const start = existingContent.indexOf(BEGIN_MARKER);
460
- const end = existingContent.indexOf(END_MARKER) + END_MARKER.length;
461
- return existingContent.slice(0, start) + newSection + existingContent.slice(end);
462
- }
463
- ```
464
-
465
- **Pattern 3: CLAUDE.md (Optional User File)**
466
-
467
- CLAUDE.md is typically **user-managed** for project-specific instructions.
468
- CLIs can support different approaches:
469
-
470
- 1. **Symlink to AGENTS.md** (recommended for identical content):
471
- ```bash
472
- ln -s AGENTS.md CLAUDE.md
473
- ```
474
-
475
- 2. **Copy of AGENTS.md** (for separate content):
476
- ```bash
477
- mycli setup --create-claude-md # CLI creates initial copy
478
- ```
479
-
480
- 3. **Separate user-maintained file** (tbd’s approach):
481
- - CLI doesn’t manage it
482
- - Users create/maintain manually
483
-
484
- **What to Version Control in Your CLI Package**:
485
-
486
- ```
487
- packages/mycli/
488
- ├── docs/
489
- │ ├── install/
490
- │ │ └── claude-header.md # ✓ YAML frontmatter source
491
- │ └── shortcuts/system/
492
- │ ├── skill-baseline.md # ✓ Full skill source
493
- │ └── skill-brief.md # ✓ Brief skill source
494
- └── dist/docs/
495
- └── SKILL.md # ✓ Bundled during build
496
- ```
497
-
498
- **What NOT to Version in CLI Package**:
499
-
500
- - ❌ `.claude/skills/mycli/SKILL.md` (installed per-project)
501
- - ❌ `AGENTS.md` (created in user projects)
502
- - ❌ `CLAUDE.md` (user-managed)
503
-
504
- **User Project Structure** (after `mycli setup --auto`):
505
-
506
- ```
507
- user-project/
508
- ├── .claude/skills/mycli/
509
- │ └── SKILL.md # CLI-managed, DO NOT EDIT
510
- ├── AGENTS.md # Hybrid: CLI section + user content
511
- └── CLAUDE.md # Optional: User-managed or symlink
512
- ```
513
-
514
- **Correct Workflows**:
515
-
516
- ```bash
517
- # Setup in user project
518
- npm install -g mycli@latest
519
- cd /path/to/project
520
- mycli setup --auto
521
-
522
- # ✓ Users can edit AGENTS.md outside markers
523
- vim AGENTS.md # Edit user sections, avoid marked regions
524
-
525
- # ✓ Update CLI-managed content by re-running setup
526
- mycli setup --auto # Idempotent, preserves user edits
527
-
528
- # ❌ Don't edit CLI-managed files
529
- vim .claude/skills/mycli/SKILL.md # Will be overwritten!
530
- ```
531
-
532
- **Key Principles**:
533
-
534
- 1. **Single Source of Truth**: Source files in CLI package are canonical
535
- 2. **Clear Ownership Boundaries**: Use markers and “DO NOT EDIT” warnings
536
- 3. **Preserve User Content**: Surgical updates to marked sections only
537
- 4. **Idempotent Setup**: Safe to run `setup --auto` multiple times
538
- 5. **Dynamic Per-Project Content**: Installed files may include project-specific
539
- additions
540
-
541
- * * *
542
-
543
- ## 3. Context Injection Loop Pattern
544
-
545
- One of the most powerful patterns in agent-integrated CLIs is the **context injection
546
- loop**—a recursive architecture where skill documentation references commands, those
547
- commands output more context, and that context references further commands.
548
-
549
- **The Loop Structure**:
550
-
551
- ```
552
- ┌─────────────────────────────────────────────────────────────────────┐
553
- │ SKILL.md (Level 2 - loaded on activation) │
554
- │ ├── Describes capabilities and when to use them │
555
- │ ├── References: "For TypeScript work, run `cli guidelines ts`" │
556
- │ └── References: "To plan features, run `cli shortcut new-plan`" │
557
- └─────────────────────────┬───────────────────────────────────────────┘
558
- │ Agent runs referenced command
559
-
560
- ┌─────────────────────────────────────────────────────────────────────┐
561
- │ Guidelines/Shortcuts (Level 3 - loaded on demand) │
562
- │ ├── Domain-specific knowledge injected into context │
563
- │ ├── References: "Create issues with `cli create`" │
564
- │ ├── References: "For testing patterns, see `cli guidelines tdd`" │
565
- │ └── References: "Use template: `cli template plan-spec`" │
566
- └─────────────────────────┬───────────────────────────────────────────┘
567
- │ Agent follows instructions, may run more commands
568
-
569
- ┌─────────────────────────────────────────────────────────────────────┐
570
- │ Action Commands or More Context │
571
- │ ├── Agent executes actions with full accumulated context │
572
- │ └── Or loads additional guidelines/templates as needed │
573
- └─────────────────────────────────────────────────────────────────────┘
574
- ```
575
-
576
- **Key Properties**:
577
-
578
- | Property | Description |
579
- | --- | --- |
580
- | **Self-directing** | Each context layer tells the agent what to do next |
581
- | **Just-in-time** | Context loads only when relevant, preserving token budget |
582
- | **Composable** | Guidelines can reference other guidelines |
583
- | **Actionable** | Context always leads to concrete actions |
584
-
585
- **Implementation Guidelines**:
586
-
587
- 1. **Every guideline should reference related guidelines**: If typescript-rules mentions
588
- testing, it should reference `cli guidelines testing-rules`
589
-
590
- 2. **Every shortcut should reference action commands**: Shortcuts are workflows—they
591
- must tell the agent which commands to run
592
-
593
- 3. **Limit chain depth to 3**: SKILL.md → Guideline → Sub-guideline is fine; deeper
594
- chains confuse agents
595
-
596
- 4. **Use consistent reference syntax**: Always `cli command arg` format, never prose
597
- like “you might want to check the testing guidelines”
598
-
599
- **Anti-patterns**:
600
-
601
- ```markdown
602
- # BAD: Vague reference
603
- See the testing documentation for more details.
604
-
605
- # GOOD: Explicit command reference
606
- For testing patterns, run `mycli guidelines general-testing-rules`.
607
- ```
608
-
609
- * * *
610
-
611
- ## 4. Agent Mental Model Patterns
612
-
613
- ### 4.1 Agent as Partner, Not Messenger
614
-
615
- The fundamental mental model shift for agent-integrated CLIs is from:
616
- > “Here are commands you can tell the user about”
617
-
618
- To:
619
- > “Here’s how this tool helps you (the agent) serve the user better”
620
-
621
- Structure skill file orientation around capabilities, not commands:
622
-
623
- ```markdown
624
- ## What This Tool Does
625
-
626
- 1. **Issue Tracking**: Track tasks, bugs, features. Never lose discovered work.
627
- 2. **Coding Guidelines**: Best practices the agent can pull in when relevant.
628
- 3. **Workflow Shortcuts**: Pre-built processes for common tasks.
629
- 4. **Templates**: Starting points for common document types.
630
-
631
- ## How to Use It to Help Users
632
-
633
- - User describes a bug → create an issue
634
- - User wants a feature → create a plan spec, then break into issues
635
- - Starting a session → check for available work
636
- - Completing work → close issues with clear reasons
637
- ```
638
-
639
- ### 4.2 Informational Commands Pattern
640
-
641
- A key architectural pattern is the distinction between **action commands** and
642
- **informational commands**:
643
-
644
- | Type | Purpose | Example |
645
- | --- | --- | --- |
646
- | Action commands | Perform operations | `create`, `close`, `sync` |
647
- | Informational commands | Output guidance for the agent to follow | `shortcut`, `guidelines`, `template` |
648
-
649
- Informational commands don’t perform actions—they display instructions, best practices,
650
- or templates that tell the agent *how* to do something well.
651
- The agent reads the output and follows the guidance.
652
-
653
- ### 4.3 Resource Library Pattern
654
-
655
- Beyond core functionality, agent-integrated CLIs can bundle **resource libraries**—
656
- collections of guidelines, shortcuts, and templates that agents access on-demand.
657
-
658
- **Resource Types**:
659
-
660
- | Resource | Purpose | Access Pattern |
661
- | --- | --- | --- |
662
- | Guidelines | Best practices for specific domains | `cli guidelines <name>` |
663
- | Shortcuts | Step-by-step workflow instructions | `cli shortcut <name>` |
664
- | Templates | Document starting points | `cli template <name>` |
665
-
666
- **Benefits**:
667
-
668
- 1. **Self-contained**: Resources ship with the CLI, no external dependencies
669
- 2. **Versionable**: Resource improvements ship with CLI updates
670
- 3. **Discoverable**: `--list` flags help agents find available resources
671
- 4. **Contextual**: Agents query relevant resources just-in-time
672
-
673
- ### 4.4 Resource Directory Pattern
674
-
675
- When documenting available resources, show the **full command to run**, not just the
676
- resource name. This removes friction for agents.
677
-
678
- **Anti-pattern** (name only):
679
-
680
- ```markdown
681
- ## Available Shortcuts
682
- - code-review-and-commit
683
- - create-or-update-pr-simple
684
- - new-plan-spec
685
494
  ```
686
495
 
687
- **Preferred pattern** (full command):
496
+ **File-ownership rules** distinguish three categories:
688
497
 
689
- ```markdown
690
- ## Available Shortcuts
498
+ - **Project instruction files** (`AGENTS.md`, `CLAUDE.md`): *commit these*. They hold
499
+ human-authored project norms (§2). A CLI may own a **marker-bounded section** inside
500
+ `AGENTS.md` (regenerated on setup) while the user owns everything outside the markers.
501
+ - **Fully generated install artifacts** (`.claude/skills/<tool>/SKILL.md` and the like):
502
+ CLI-owned; mark them “DO NOT EDIT.” Commit them only if the project intentionally
503
+ dogfoods them — otherwise leave them to `setup` (and consider gitignoring).
504
+ - **Source files** in the CLI package (header, baseline, brief): the canonical inputs —
505
+ always version-controlled.
691
506
 
692
- | Command | Purpose | Description |
693
- |---------|---------|-------------|
694
- | `mycli shortcut code-review-and-commit` | Commit Code | How to run pre-commit checks and commit |
695
- | `mycli shortcut create-pr` | Create PR | How to create a pull request |
696
- | `mycli shortcut new-plan-spec` | Plan Feature | How to create a planning specification |
697
- ```
507
+ Make setup idempotent: dedupe hooks before merging, overwrite generated skills rather
508
+ than patching them, update only the marked section of `AGENTS.md`, and clean up legacy
509
+ files each run. (Generated output must also be stable under whatever formatter the repo
510
+ runs; e.g., don’t emit a second YAML frontmatter block mid-document.)
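Updating only the marked section of `AGENTS.md` can be sketched as a pure string transform. This is a minimal sketch, assuming the `MYCLI INTEGRATION` markers from the example earlier in this section; `upsertManagedSection` is a hypothetical name, and the append-when-missing behavior is an assumption rather than tbd's documented behavior.

```typescript
const BEGIN_MARKER = "<!-- BEGIN MYCLI INTEGRATION -->";
const END_MARKER = "<!-- END MYCLI INTEGRATION -->";

// Replace the CLI-owned block if both markers exist; otherwise append it.
// Content outside the markers is never modified.
function upsertManagedSection(existing: string, generated: string): string {
  const block = `${BEGIN_MARKER}\n${generated.trim()}\n${END_MARKER}`;
  const start = existing.indexOf(BEGIN_MARKER);
  const end =
    start === -1 ? -1 : existing.indexOf(END_MARKER, start + BEGIN_MARKER.length);
  if (start === -1 || end === -1) {
    // No complete managed block yet: append after the user's content.
    const base = existing.trimEnd();
    return (base ? base + "\n\n" : "") + block + "\n";
  }
  return existing.slice(0, start) + block + existing.slice(end + END_MARKER.length);
}
```

Running the function twice with the same generated text yields the same file, which is the idempotency property described above.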
698
511
 
699
- * * *
512
+ ### 6.7 Making the CLI available: global install vs. zero-install
700
513
 
701
- ## 5. Setup Flow Patterns
514
+ A separate design dimension from §6.6 (how the CLI installs *itself into agents*) is how
515
+ the CLI **binary** is made available so the skill can invoke it.
516
+ Decide this explicitly and state the chosen invocation in `SKILL.md`/`AGENTS.md`.
702
517
 
703
- ### 5.1 Two-Tier Command Structure
518
+ **The two ends of the spectrum** (plus a useful middle):
704
519
 
705
- Implement two levels of setup commands:
706
-
707
- | Command | Purpose | Audience |
520
+ | Approach | How | Best when |
708
521
  | --- | --- | --- |
709
- | `mycli setup --auto` | Full setup with auto-detection | Agents, scripts |
710
- | `mycli setup --interactive` | Prompted setup | Humans |
711
- | `mycli init --prefix=X` | Surgical initialization only | Advanced users |
712
-
713
- ### 5.2 Mode Flags Pattern
714
-
715
- Always require explicit mode selection for setup commands:
522
+ | **Global install** | `npm i -g pkg`, `uv tool install pkg`, `pipx install pkg`, Homebrew, prebuilt binary | Persistent dev machines; offline/perf-sensitive use; the **project pins the version** (in `package.json`/lockfile or a tool manifest) so every run is identical |
523
+ | **Zero-install runner** | `npx pkg@x.y.z`, `bunx`, `pnpm dlx`, `uvx pkg@x.y.z`, `pipx run`, `go run mod@x.y.z` | Ephemeral/cloud agents (Claude Code Cloud, CI, fresh containers) where nothing persists; broad reach with no setup |
524
+ | **Persistent-on-first-use** | `uv tool install` then `uvx pkg` reuses it; a `SessionStart` bootstrap that installs once if absent | You want zero-install ergonomics *and* warm-start speed within a session |
525
+
526
+ **Trade-offs**
527
+
528
+ - **Global install**. *Pros*: fastest invocation (no per-call resolution), works
529
+ offline, and version is managed by the project (lockfile / `package.json` / `uv` tool
530
+ manifest), so it’s auditable and reproducible.
531
+ *Cons*: it’s a stateful prerequisite — in ephemeral or cloud environments the global
532
+ bin doesn’t persist, so the CLI can be **missing at session start** unless you
533
+ bootstrap it.
534
+ - **Zero-install** — *Pros*: works in any environment with no setup; nothing to persist;
535
+ ideal default for portability.
536
+ *Cons*: cold-start download/cache cost on first call (uvx cold ≈ 1s, cached ≈ tens of
537
+ ms; npx similar), needs network, and an **unpinned** invocation (`npx pkg`, `uvx pkg`)
538
+ silently pulls the newest release.
539
+
540
+ **Cloud / ephemeral bootstrap.** If you choose global install but target cloud agents,
541
+ ship a **`SessionStart` hook** (or an `ensure-installed` script) that installs/updates
542
+ the CLI if absent before first use.
543
+ tbd does exactly this: a `tbd-session.sh` hook ensures the `tbd` CLI is present, then
544
+ runs `tbd prime`. Without a bootstrap, a globally-installed CLI referenced by a skill
545
+ will fail on a fresh cloud session.
546
+
547
+ **Pin the version (security).** Whichever you choose, the skill’s referenced invocation
548
+ should pin a version so the agent can’t silently run a newer (possibly compromised)
549
+ release — the §9 / 14-day-package-age rule applied to the runner itself:
716
550
 
717
551
  ```bash
718
- mycli setup # Shows help, requires mode flag
719
- mycli setup --auto # Non-interactive (for agents)
720
- mycli setup --interactive # Interactive (for humans)
552
+ uvx mytool@1.4.2 ... # not `uvx mytool`
553
+ npx mytool@1.4.2 ... # not `npx mytool@latest`
721
554
  ```
722
555
 
723
- **Why This Matters**:
724
-
725
- - Prevents agents from getting stuck in interactive prompts
726
- - Ensures humans get guided experience when they want it
727
- - Explicit is better than implicit for setup operations
728
-
729
- ### 5.3 Setup Idempotency Requirements
730
-
731
- **Setup MUST be idempotent**—safe to run repeatedly without side effects or errors.
732
- This is critical because:
733
-
734
- - Agents may run setup multiple times to refresh configuration
735
- - Users may run setup after CLI updates to get new features
736
- - Setup may be called automatically via scripts or CI
737
-
738
- **Idempotency Patterns**:
739
-
740
- 1. **Hook Deduplication**: When merging hooks into `.claude/settings.json`, filter out
741
- existing hooks before adding new ones:
742
-
743
- ```typescript
744
- // Filter out existing CLI hooks before merging
745
- const existingHooks = currentSettings.hooks.SessionStart || [];
746
- const filtered = existingHooks.filter(
747
- (entry) => !entry.hooks?.some((h) => h.command?.includes('mycli'))
748
- );
749
- mergedHooks.SessionStart = [...filtered, ...newHooks];
750
- ```
556
+ Global installs get the same guarantee from the lockfile/manifest; zero-install gets it
557
+ only from an explicit `@version`.
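A setup or lint step could check that the invocation written into `SKILL.md` is actually pinned. The helper below is a hypothetical sketch: `isPinnedSpec` is an illustrative name, and it assumes exact three-part versions such as `1.4.2`, which is stricter than what the runners themselves accept.

```typescript
// True only for specs like "mytool@1.4.2" or "@scope/mytool@1.4.2".
// Rejects bare names, dist-tags such as "@latest", and ranges such as "@^1.4.2".
function isPinnedSpec(spec: string): boolean {
  const at = spec.lastIndexOf("@");
  if (at <= 0) {
    // No "@" at all, or only the leading "@" of a scoped package name.
    return false;
  }
  return /^\d+\.\d+\.\d+$/.test(spec.slice(at + 1));
}
```

Packages that use a different version scheme would need a looser pattern than the one assumed here.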
751
558
 
752
- 2. **Skill File Regeneration**: Always regenerate skill files with fresh content on each
753
- setup run. Don’t try to update in place—overwrite:
559
+ **Current tooling (May 2026)**
754
560
 
755
- ```typescript
756
- // Always regenerate skill file
757
- const skillContent = await composeFullSkill();
758
- await writeFile(skillPath, skillContent); // Overwrite, don't append
759
- ```
561
+ - **Node / TypeScript**: zero-install via `npx <pkg>@<ver>` (`-y` to skip the prompt),
562
+ `bunx`, `pnpm dlx`, or `deno run`; persistent via `npm i -g` / a project
563
+ devDependency.
564
+ - **Python**: `uvx <pkg>@<ver>` (= `uv tool run`, bundled with Astral’s `uv`, Rust-fast,
565
+ no Python prereq) or `pipx run`; persistent via `uv tool install` / `pipx install`.
566
+ `uvx` reuses a persistent install if one exists.
567
+ - **Go**: `go run <module>@<ver>` (compiles on the fly) or `go install`.
568
+ - **Rust**: no first-class zero-install runner — ship **prebuilt binaries** (GitHub
569
+ releases + a `curl … | sh` installer) or `cargo binstall`; `cargo install` compiles.
570
+ - **Cross-language**: a prebuilt binary + install script, or a container image (Docker
571
+ is emerging as the production-grade distribution for MCP servers).
760
572
 
761
- 3. **Legacy Cleanup**: Remove deprecated patterns on each run:
573
+ This mirrors how the ecosystem ships agent tooling today: **MCP servers** are most often
574
+ referenced as `command: npx <pkg>` (Node) or `command: uvx <pkg>` (Python) in agent
575
+ configs; **CLIs** like Beads offer `brew` / `npm -g` / `curl` installers, while tbd uses
576
+ `npm -g` plus the bootstrap hook above.
762
577
 
763
- ```typescript
764
- // Clean up old hook patterns
765
- const oldScripts = ['setup-mycli.sh', 'ensure-mycli.sh'];
766
- for (const script of oldScripts) {
767
- await rm(join(projectDir, '.claude/scripts', script)).catch(() => {});
768
- }
769
- ```
770
-
771
- 4. **Directory Creation**: Use `mkdir -p` style recursive creation that succeeds if
772
- directory already exists:
773
-
774
- ```typescript
775
- await mkdir(dirname(skillPath), { recursive: true });
776
- ```
777
-
778
- **Testing Idempotency**:
779
-
780
- Run setup twice and verify:
781
- - No duplicate hooks in settings.json
782
- - Skill file content is identical on both runs (except timestamps)
783
- - No error messages about existing files/configs
784
- - All features work correctly after both runs
785
-
786
- ### 5.4 Never Guess User Preferences
787
-
788
- For configuration values that are matters of user taste (not technical requirements),
789
- **never guess or auto-detect**. Always ask the user.
790
-
791
- **Examples of preference values**:
792
- - Project prefixes/abbreviations
793
- - Naming conventions
794
- - Style choices
578
+ **Recommendation**: default the skill to a **pinned zero-install invocation**
579
+ (`uvx`/`npx <pkg>@<version>`) for maximum reach across ephemeral and cloud agents; offer
580
+ **global install + a `SessionStart` bootstrap** as the optimization for persistent
581
+ environments where the project wants lockfile-managed versions and warm-start speed.
795
582
 
796
583
  * * *
797
584
 
798
- ## 6. Agent Integration Patterns
799
-
800
- ### 6.1 Claude Code Hooks
801
-
802
- Configure Claude Code hooks for automatic context management:
585
+ ## 7. CLI vs MCP vs Skill — Choosing the Surface
803
586
 
804
- **Global Hooks** (`~/.claude/settings.json`):
587
+ These are complementary, not competing.
588
+ Pick by need:
805
589
 
806
- ```json
807
- {
808
- "hooks": {
809
- "SessionStart": [{
810
- "matcher": "",
811
- "hooks": [{ "type": "command", "command": "mycli prime" }]
812
- }],
813
- "PreCompact": [{
814
- "matcher": "",
815
- "hooks": [{ "type": "command", "command": "mycli prime" }]
816
- }]
817
- }
818
- }
819
- ```
820
-
821
- **Alternative PreCompact Hook Pattern (Optional)**:
822
-
823
- For CLIs where token efficiency is critical, consider using `skill --brief` instead of
824
- `prime` in the PreCompact hook:
825
-
826
- ```json
827
- "PreCompact": [{
828
- "matcher": "",
829
- "hooks": [{ "type": "command", "command": "mycli skill --brief" }]
830
- }]
831
- ```
832
-
833
- **Tradeoffs**:
834
- - ✓ More token-efficient (~~400 tokens vs ~~800-1200 for `prime`)
835
- - ✓ Pure skill content without status/dashboard noise
836
- - ✗ No project-specific status information before compaction
837
- - ✗ Agent loses current state context
838
-
839
- Use `skill --brief` when the condensed workflow rules are sufficient, or `prime` when
840
- current project status is valuable for context restoration.
590
+ | Need | Surface |
591
+ | --- | --- |
592
+ | Prompt/instructions, portable | **SKILL.md** |
593
+ | Local processing, composable, you have/can build a CLI | **CLI** |
594
+ | Service with no CLI; OAuth, multi-tenant, audit, remote | **MCP server** |
595
+ | Both local convenience and structured/remote access | **CLI + MCP** |
596
+
597
+ **Why CLI usually wins when one exists**: benchmarks (2026) put a CLI at ~100% task
598
+ reliability and ~1.3K–8.7K tokens, vs.
599
+ MCP at ~72% reliability and ~32K–82K tokens — roughly **17× cheaper** at scale — because
600
+ LLMs already know common CLI usage and no tool schema is injected.
601
+ Use MCP when there’s no CLI to lean on or you need its auth/permission machinery.
602
+
603
+ **MCP current state (May 2026)**: governed by AAIF/Linux Foundation; two transports —
604
+ **stdio** (local) and **Streamable HTTP** (remote; replaced legacy SSE in the Nov 2025
605
+ spec, supports OAuth 2.1 + PKCE). Primitives: **tools**, **resources**, **prompts**.
606
+ Security is a real concern (a scan found 492 public servers with no auth) — authenticate
607
+ every request, scope every tool call, validate inputs, never pass tokens between
608
+ servers.
609
+
610
+ **Code execution with MCP** ("Code Mode"): instead of exposing many MCP tools as direct
611
+ calls (each ~550–1,400 tokens of schema), let the agent write code against a compact
612
+ tool API in a sandbox — reported 78–99% token reduction.
613
+ Worth it when an MCP server exposes *many* tools; overkill for one.
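To make the "many tools" threshold concrete, here is a rough arithmetic sketch that uses only the per-tool schema range quoted above (~550 to 1,400 tokens). The function name `schemaOverhead` is hypothetical, and the figures are the reported estimates, not measurements.

```typescript
// Per-tool schema cost range as quoted in the text above (approximate).
const SCHEMA_TOKENS_PER_TOOL = { low: 550, high: 1400 };

// Estimate the tokens spent on tool schemas alone when every tool is
// exposed as a direct call. Hypothetical helper for illustration.
function schemaOverhead(toolCount: number): { low: number; high: number } {
  return {
    low: toolCount * SCHEMA_TOKENS_PER_TOOL.low,
    high: toolCount * SCHEMA_TOKENS_PER_TOOL.high,
  };
}

// One tool: 550 to 1,400 tokens, hardly worth a sandbox.
// Forty tools: 22,000 to 56,000 tokens before the agent does any work.
const single = schemaOverhead(1);
const many = schemaOverhead(40);
```

Under these assumptions the overhead grows linearly with tool count, which is why code execution pays off for large servers and not for a single tool.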
841
614
 
842
- **Project Hooks** (`.claude/settings.json`):
615
+ * * *
843
616
 
844
- ```json
617
+ ## 8. Hooks & Lifecycle (Cross-Agent)
618
+
619
+ Hooks let a tool inject context or enforce invariants automatically.
620
+ Support varies:
621
+
622
+ - **Claude Code** has the richest set (~29 events incl.
623
+ `SessionStart`, `Setup`, `UserPromptSubmit`, `PreToolUse`, `PostToolUse`,
624
+ `PreCompact`/`PostCompact`, `Stop`, `SubagentStart/Stop`, `SessionEnd`). Inject
625
+ context via `additionalContext` (most events) or stdout
626
+ (`SessionStart`/`UserPromptSubmit`). Skills can declare their own scoped `hooks:`.
627
+ - **Cursor** (6 events, incl.
628
+ `beforeShellExecution`/`beforeMCPExecution`), **Windsurf** (pre-hooks can **block**
629
+ via exit code 2), **Gemini CLI** (~12), and **opencode** (25+, with tool interception)
630
+ all have lifecycle hooks.
631
+ - **Aider**, **Jules**, **Zed** have no agent hooks (Aider integrates Git pre-commit
632
+ hooks only).
633
+
634
+ **Common, portable use**: a `SessionStart` hook that runs your CLI’s `prime`/`skill`
635
+ command to restore workflow context; a `PreCompact` hook that re-injects a brief
636
+ (`skill --brief`) before the window is trimmed.
637
+ Keep injected context small — it competes with everything else.
638
+
639
+ ```jsonc
640
+ // Claude Code ~/.claude/settings.json
845
641
  {
846
642
  "hooks": {
847
- "PostToolUse": [{
848
- "matcher": "Bash",
849
- "hooks": [{
850
- "type": "command",
851
- "command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/closing-reminder.sh"
852
- }]
853
- }]
643
+ "SessionStart": [{ "matcher": "", "hooks": [{ "type": "command", "command": "mycli prime" }] }],
644
+ "PreCompact": [{ "matcher": "", "hooks": [{ "type": "command", "command": "mycli skill --brief" }] }]
854
645
  }
855
646
  }
856
647
  ```
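If a setup command writes these hooks, it should be safe to run repeatedly. The sketch below shows one way to merge a hook entry without duplicating it, assuming the settings shape from the example above; the `addHook` function and its types are hypothetical and not an API of Claude Code or any CLI.

```typescript
// Types mirror the settings.json shape shown above (illustrative only).
interface HookCommand { type: "command"; command: string }
interface HookEntry { matcher: string; hooks: HookCommand[] }
type HooksConfig = Record<string, HookEntry[]>;

// Return a config that contains the hook exactly once. If an entry with
// the same matcher and command already exists, the input is returned as-is.
function addHook(
  config: HooksConfig,
  event: string,
  command: string,
  matcher = "",
): HooksConfig {
  const entries = config[event] ?? [];
  const exists = entries.some(
    (e) => e.matcher === matcher && e.hooks.some((h) => h.command === command),
  );
  if (exists) return config;
  const entry: HookEntry = { matcher, hooks: [{ type: "command", command }] };
  return { ...config, [event]: [...entries, entry] };
}
```

Running it twice with the same arguments leaves a single entry, so a repeated `setup` run does not stack duplicate `prime` calls.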
857
648
 
858
- ### 6.2 PostToolUse Hook with JSON Parsing
859
-
860
- PostToolUse hooks receive JSON input describing the tool invocation.
861
- Use bash scripts with jq to parse and conditionally respond.
862
-
863
- ```bash
864
- #!/bin/bash
865
- input=$(cat)
866
- command=$(echo "$input" | jq -r '.tool_input.command // empty')
867
-
868
- # Trigger on specific patterns
869
- if [[ "$command" == git\ push* ]] || [[ "$command" == *"&& git push"* ]]; then
870
- if [ -d ".mycli" ]; then
871
- mycli closing
872
- fi
873
- fi
874
- exit 0
875
- ```
876
-
877
- ### 6.3 Help Structure for Agent Discovery
878
-
879
- Agents need clear signals about how to get started with your CLI. Structure help output
880
- to make setup commands and context restoration prominent.
881
-
882
- **Pattern 1: Help Epilog with IMPORTANT Section**
883
-
884
- Add prominent sections at the bottom of `--help` output:
885
-
886
- ```bash
887
- $ mycli --help
888
- Usage: mycli [options] [command]
889
- ...
890
-
891
- IMPORTANT:
892
- Agents unfamiliar with mycli should run `mycli prime` for full context.
893
-
894
- Getting Started:
895
- npm install -g mycli@latest && mycli setup --auto --prefix=<name>
896
- ```
897
-
898
- **Implementation with Commander.js**:
899
-
900
- ```typescript
901
- program
902
- .name('mycli')
903
- .description('Brief description')
904
- .addHelpText('after', `
905
- IMPORTANT:
906
- Agents unfamiliar with mycli should run \`mycli prime\` for full context.
907
-
908
- Getting Started:
909
- npm install -g mycli@latest && mycli setup --auto --prefix=<name>
910
- `);
911
- ```
912
-
913
- **Pattern 2: Context Recovery Prompt**
914
-
915
- When the CLI runs without arguments, don’t just show help—prompt for context:
916
-
917
- ```bash
918
- $ mycli
919
- Usage: mycli [command] [options]
920
- ...
921
- Tip: Run `mycli prime` to restore full workflow context.
922
- ```
923
-
924
- **Pattern 3: Setup Command Prominence**
925
-
926
- Ensure `setup` appears in a dedicated “Setup & Configuration” category in help output,
927
- not buried in a long alphabetical list:
928
-
929
- ```
930
- Setup & Configuration:
931
- init [options] Initialize mycli in a repository
932
- setup [options] Configure mycli integration with editors and tools
933
- config Manage configuration
934
- ```
935
-
936
- **Why This Matters**:
937
-
938
- - Agents often lose context after compaction or in new sessions
939
- - Prominent `prime` references help agents restore workflow rules
940
- - Clear setup commands help agents guide users through installation
941
- - “IMPORTANT” section is scanned even when full help is ignored
942
-
943
649
  * * *
944
650
 
945
- ## 7. Task Management Integration Patterns
946
-
947
- ### 7.1 Task Tracking Strategy Selection
948
-
949
- Agent-integrated CLIs often need to track work across sessions.
950
- The key architectural decision is **where task state lives** and **how complex the
951
- tracking needs to be**.
952
-
953
- **Three Strategies**:
954
-
955
- | Strategy | State Location | Complexity | Use Case |
956
- | --- | --- | --- | --- |
957
- | **Ephemeral** | None | Minimal | Quick calculations, queries, one-shot tasks |
958
- | **Session-local** | In-memory or temp file | Low | Multi-step tasks within a single session |
959
- | **Persistent** | Git-tracked files | Medium-High | Multi-session projects, team collaboration |
960
-
961
- **Decision Framework**:
962
-
963
- ```
964
- Is the task done in one command?
965
- Yes: Ephemeral (no tracking needed)
966
- No: Does it span multiple sessions?
967
- No: Session-local (agent's internal todo list)
968
- Yes: Persistent (tbd integration or equivalent)
969
- ```
970
-
971
- ### 7.2 Agent-Aware Task Patterns
972
-
973
- **Pattern 1: Auto-Discovery of Work**
974
-
975
- Include a “what should I work on?”
976
- command:
977
-
978
- ```bash
979
- mycli ready # Show tasks ready to work
980
- mycli next # Suggest the highest-priority task
981
- ```
982
-
983
- **Pattern 2: Context-Preserving Task Creation**
984
-
985
- When creating tasks from agent context, preserve relevant information:
986
-
987
- ```bash
988
- mycli task create "Fix authentication bug" \
989
- --context "User reported login fails after password reset" \
990
- --related-files "src/auth/login.ts,src/auth/reset.ts"
991
- ```
992
-
993
- **Pattern 3: Session Boundary Enforcement**
994
-
995
- The CLI should remind agents to handle tasks at session end:
996
-
997
- ```markdown
998
- ## Session Closing Protocol
999
-
1000
- Before completing a session:
1001
- 1. Close or update all tasks you worked on
1002
- 2. Create tasks for any discovered work
1003
- 3. Sync task state: `mycli sync`
1004
- ```
651
+ ## 9. Security & Supply Chain (Don’t Skip This)
652
+
653
+ Skills and instruction files are **executable influence** on an agent, which makes them
654
+ an attack surface. Treat them with the same care as dependencies.
655
+
656
+ - **Prompt injection via skills/instructions is real and effective**: 2026 security
657
+ research demonstrated up to ~80% attack success against frontier models using
658
+ malicious skills (instructions that exfiltrate data or escalate tool use).
659
+ Anything in `AGENTS.md`, `SKILL.md`, a fetched skill, or tool output is **untrusted
660
+ input**.
661
+ - **Never put secrets in skill/instruction files or tool output.** `AGENTS.md`,
662
+ `SKILL.md`, bundled scripts, and anything a command prints get loaded into agent
663
+ context (and often committed), so keep credentials, tokens, and keys out of them; read
664
+ secrets from the environment at runtime instead.
665
+ - **Vet third-party skills before install.** Prefer sources that scan (skills.sh runs
666
+ Snyk on every install).
667
+ Read the body and any bundled scripts — review them like dependency code.
668
+ Pin to a commit, not a moving tag.
669
+ - **Scope tools tightly.** Use `allowed-tools` to grant the minimum (e.g.,
670
+ `Bash(mycli:*)` not blanket `Bash`). Prefer `disable-model-invocation` for
671
+ destructive/action-heavy skills so they require explicit invocation.
672
+ - **Lean on sandboxing.** Claude Code’s OS-level sandbox and Codex’s
673
+ `read-only`/`workspace-write` modes contain damage; design your CLI to run within them
674
+ and degrade gracefully (clear error, no silent failure) when writes/network are
675
+ denied.
676
+ - **Apply the same currency discipline** you use for packages: if your skill ships a
677
+ script with dependencies, the project’s supply-chain rules (e.g., the 14-day
678
+ package-age rule) apply — and a skill that references a zero-install runner must pin
679
+ the version (§6.7), since unpinned `npx`/`uvx` bypasses the cool-off.
680
+ See `tbd guidelines supply-chain-hardening` for the cross-ecosystem policy, or
681
+ `tbd guidelines bun-monorepo-patterns` / `pnpm-monorepo-patterns` for monorepo
682
+ specifics.
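The "scope tools tightly" rule lends itself to a simple lint. This is a minimal sketch, assuming `allowed-tools` has already been parsed into a list of strings; the `broadGrants` function and the set of tools it treats as risky are hypothetical choices for illustration.

```typescript
// Assumption: these tools are risky when granted without a pattern.
const RISKY_TOOLS = new Set(["Bash"]);

// Return the grants that give blanket access to a risky tool, such as a
// bare `Bash` or `Bash(*)`. Scoped grants like `Bash(mycli:*)` pass.
function broadGrants(allowedTools: string[]): string[] {
  return allowedTools.filter((grant) => {
    const trimmed = grant.trim();
    const open = trimmed.indexOf("(");
    const tool = open === -1 ? trimmed : trimmed.slice(0, open);
    const pattern = open === -1 ? "" : trimmed.slice(open + 1, -1);
    return RISKY_TOOLS.has(tool) && (pattern === "" || pattern === "*");
  });
}
```

A review step could fail when this returns a non-empty list, prompting the author to narrow the grant before a skill ships.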
1005
683
 
1006
684
  * * *
1007
685
 
1008
- ## 8. Dynamic Generation Patterns
1009
-
1010
- ### 8.1 Dynamic Skill and Resource Directory Generation
1011
-
1012
- Rather than maintaining static content that can become stale, generate skill files and
1013
- resource directories dynamically at runtime from source components and installed
1014
- documents.
1015
-
1016
- **Pattern 1: Skill Composition from Multiple Sources**
1017
-
1018
- Compose skill content from separate files for maintainability:
1019
-
1020
- ```typescript
1021
- async function composeFullSkill(): Promise<string> {
1022
- // 1. Load YAML header (Claude Code metadata)
1023
- const header = await loadDocContent('install/claude-header.md');
1024
-
1025
- // 2. Load base skill workflow content
1026
- const baseSkill = await loadDocContent('shortcuts/system/skill-baseline.md');
1027
-
1028
- // 3. Generate dynamic resource directory from current docs
1029
- const directory = await generateShortcutDirectory();
1030
-
1031
- // 4. Compose final skill: header + base + dynamic content
1032
- let result = header + baseSkill;
1033
- if (directory) {
1034
- result = result.trimEnd() + '\n\n' + directory;
1035
- }
1036
-
1037
- return result;
1038
- }
1039
- ```
1040
-
1041
- **Pattern 2: Resource Directory Generation**
1042
-
1043
- Generate tables of available resources from loaded documents:
1044
-
1045
- ```typescript
1046
- async function generateShortcutDirectory(): Promise<string> {
1047
- const shortcuts = await docCache.listDocuments('shortcuts');
1048
- const rows = shortcuts.map(doc => {
1049
- const meta = doc.frontmatter;
1050
- return `| ${doc.name} | ${meta.title} | ${meta.description} |`;
1051
- });
1052
-
1053
- // Wrap in HTML markers for incremental updates
1054
- return [
1055
- '<!-- BEGIN SHORTCUT DIRECTORY -->',
1056
- '## Available Shortcuts',
1057
- '',
1058
- '| Name | Title | Description |',
1059
- '| --- | --- | --- |',
1060
- ...rows,
1061
- '<!-- END SHORTCUT DIRECTORY -->'
1062
- ].join('\n');
1063
- }
1064
- ```
1065
-
1066
- **Pattern 3: Incremental Updates with HTML Markers**
1067
-
1068
- Use HTML comment markers to identify sections that can be updated independently:
1069
-
1070
- ```markdown
1071
- <!-- BEGIN SHORTCUT DIRECTORY -->
1072
- ## Available Shortcuts
1073
- | Name | Description |
1074
- | --- | --- |
1075
- | code-review | Run pre-commit checks and commit |
1076
- | new-plan-spec | Create a planning specification |
1077
-
1078
- <!-- END SHORTCUT DIRECTORY -->
1079
- ```
1080
-
1081
- This allows updating just the directory section when resources change, without
1082
- regenerating the entire skill file.
1083
-
1084
- **Pattern 4: “DO NOT EDIT” Warnings**
1085
-
1086
- For generated files installed in `.claude/skills/`, insert warnings after frontmatter:
1087
-
1088
- ```typescript
1089
- function insertDoNotEditMarker(content: string): string {
1090
- const marker = `<!-- DO NOT EDIT: Generated by mycli setup.
1091
- Run 'mycli setup' to update.
1092
- -->`;
1093
-
1094
- // Insert after YAML frontmatter
1095
- const lines = content.split('\n');
1096
- const endOfFrontmatter = lines.findIndex((l, i) => i > 0 && l === '---');
1097
- lines.splice(endOfFrontmatter + 1, 0, marker);
1098
- return lines.join('\n');
1099
- }
1100
- ```
1101
-
1102
- **Benefits of Dynamic Generation**:
1103
-
1104
- 1. **Always Current**: Resource directories reflect actual available docs
1105
- 2. **DRY Principle**: Single source of truth (frontmatter) drives multiple outputs
1106
- 3. **Partial Updates**: HTML markers enable surgical updates to sections
1107
- 4. **Version-Safe**: Content updates ship with CLI version updates
1108
-
1109
- ### 8.2 DocCache Shadowing Pattern
1110
-
1111
- Implement path-ordered document loading that allows project-level resources to shadow
1112
- (override) built-in resources, similar to how shell `$PATH` works.
1113
-
1114
- **Loading Order** (earlier paths take precedence):
1115
-
1116
- 1. Project-level: `.mycli/docs/shortcuts/`
1117
- 2. User-level: `~/.mycli/docs/shortcuts/`
1118
- 3. Built-in: Bundled with CLI
1119
-
1120
- ```typescript
1121
- class DocCache {
1122
- private paths: string[]; // Ordered by priority
1123
-
1124
- async loadDocument(name: string): Promise<Document | null> {
1125
- for (const basePath of this.paths) {
1126
- const doc = await this.tryLoad(join(basePath, name));
1127
- if (doc) return doc; // First match wins
1128
- }
1129
- return null;
1130
- }
1131
- }
1132
- ```
686
+ ## 10. Emerging & Forward-Looking (Know It Exists)
687
+
688
+ You usually don’t need these to ship a skill, but they shape where the ecosystem is
689
+ going:
690
+
691
+ - **ACP (Agent Client Protocol)**: Zed’s “LSP for agents” (JSON-RPC over stdio); 25+
692
+ agents (Claude Code, Codex, Gemini CLI, opencode) and editors (Zed, JetBrains, Kiro).
693
+ Complements MCP (editor↔agent, while MCP is agent↔tools).
694
+ Your agent runtime speaks it; a skill author doesn’t implement it.
695
+ - **A2A (Agent2Agent)** — Google/Linux Foundation, v1.0, 150+ orgs; for enterprise
696
+ agent-to-agent delegation, not skill authoring.
697
+ Ignore unless you build autonomous multi-agent systems.
698
+ - **Codex App-Server** — JSON-RPC (Thread/Turn/Item) decoupling Codex logic from client
699
+ surfaces; relevant only for Codex-specific integration surfaces.
700
+ - **Plugin marketplaces & `npx skills`** — distribution is consolidating: Claude Code
701
+ plugin marketplaces (official + community), Codex plugins, and Vercel’s
702
+ `npx skills add` over the skills.sh directory (cross-agent symlinks).
703
+ - **Routines / scheduled agents, background monitors, `/run` & `/verify` skills** —
704
+ newer Claude Code capabilities for autonomous, event-triggered, and app-verifying
705
+ workflows (confirm GA vs.
706
+ preview for your version before relying on them).
1133
707
 
1134
708
  * * *
1135
709
 
1136
- ## 9. MCP Integration Patterns
1137
-
1138
- MCP (Model Context Protocol) and CLI-as-Skill are complementary approaches.
1139
- Understanding when to use each is critical.
1140
-
1141
- | Aspect | CLI-as-Skill | MCP Server |
1142
- | --- | --- | --- |
1143
- | **Deployment** | npm install | Separate process |
1144
- | **Integration** | SKILL.md + hooks | MCP protocol |
1145
- | **Context** | Skill content in prompt | Tool calls |
1146
- | **State** | Stateless (per-command) | Can maintain state |
1147
- | **Scope** | Single CLI capabilities | Ecosystem of servers |
1148
- | **Complexity** | Lower | Higher |
1149
-
1150
- **When to Use CLI-as-Skill**:
710
+ ## 11. Best-Practices Summary
1151
711
 
1152
- - Self-contained functionality
1153
- - Workflow guidance and documentation
1154
- - Resource libraries (guidelines, templates)
1155
- - Simple tool integration
1156
- - Quick setup requirements
712
+ **Start simple**
1157
713
 
1158
- **When to Use MCP**:
714
+ - One capability → one `SKILL.md` (name + two-part description + < 500-line body).
715
+ Stop.
716
+ - Project conventions → `AGENTS.md` (concise; it loads every turn).
717
+ - Have a CLI → make it agent-friendly (`--json`, idempotent, actionable errors) and
718
+ point a `SKILL.md` at it.
1159
719
 
1160
- - Persistent connections (databases, APIs)
1161
- - Stateful operations
1162
- - Cross-tool orchestration
1163
- - Real-time data streaming
1164
- - Complex tool ecosystems
720
+ **Descriptions & disclosure**
1165
721
 
1166
- * * *
722
+ - Two-part rule: *what it does* + *when to use it*; third person; front-load keywords.
723
+ - Progressive disclosure: metadata → body → supporting files; bundle scripts
724
+ (output-only cost).
725
+ - Respect the budget; verify the current model for your target agent (Claude Code ≈ 1%
726
+ of context window, not a flat char count).
1167
727
 
1168
- ## Best Practices Summary
1169
-
1170
- ### Architecture
1171
-
1172
- 1. **Bundle documentation with CLI**: Self-contained packages work in all environments
1173
- 2. **Maintain tiered skill files**: Full (baseline) and brief versions for different
1174
- contexts
1175
- 3. **Provide a `skill` subcommand**: Output skill content to stdout with `--brief` flag
1176
- 4. **Implement fallback loading**: Support both bundled and development modes
1177
- 5. **Use platform-appropriate formats**: SKILL.md for Claude, MDC for Cursor, markers
1178
- for AGENTS.md
1179
-
1180
- ### Context Management
1181
-
1182
- 6. **Implement a `prime` command**: Dashboard at session start, brief mode for
1183
- constrained contexts
1184
- 7. **Implement a `skill` command**: Output pure skill content (no dashboard) for
1185
- inspection and installation preview
1186
- 8. **Separate skill from dashboard**: `prime` = status + context, `skill` = pure
1187
- documentation
1188
- 9. **Compose skills from multiple sources**: Header + baseline + dynamic directory
1189
- 10. **Include context recovery instructions**: Agents need to know how to restore
1190
- context
1191
- 11. **Two-level orientation only**: Full (default) and brief—avoid more granularity
1192
- 12. **Use progressive disclosure**: Level 1 (metadata) → Level 2 (skill body) → Level 3
1193
- (resources)
1194
- 13. **Keep SKILL.md under 500 lines**: Move detailed content to reference files
1195
-
1196
- ### Description Optimization
1197
-
1198
- 10. **Use the two-part rule**: What does it do?
1199
- + When to use it?
1200
- 11. **Write in third person**: “Processes files” not “I can help you”
1201
- 12. **Include explicit trigger phrases**: Match how users naturally describe needs
1202
- 13. **Front-load keywords**: Put most important triggers in first 50 characters
1203
- 14. **Respect cumulative budget**: All descriptions share a ~15K character limit
1204
- 15. **Use meta-skill pattern for 50+ resources**: One skill + CLI beats 50 individual
1205
- skills
1206
-
1207
- ### Self-Documentation
1208
-
1209
- 14. **Provide documentation commands**: `readme`, `docs`, `design` as built-in commands
1210
- 15. **Include Getting Started in help epilog**: One-liner must be easily accessible
1211
- 16. **Add IMPORTANT section to help**: Prominently reference `prime` command for context
1212
- restoration
1213
-
1214
- ### Setup Flows
1215
-
1216
- 17. **Two-tier command structure**: High-level (`setup`) and surgical (`init`)
1217
- 18. **Require explicit mode flags**: `--auto` for agents, `--interactive` for humans
1218
- 19. **Make setup idempotent**: Safe to run multiple times without errors or duplicates
1219
- 20. **Deduplicate hooks on each run**: Filter existing hooks before merging new ones
1220
- 21. **Regenerate skill files**: Always overwrite with fresh content, don’t try to update
1221
- in place
1222
- 22. **Clean up legacy patterns**: Remove deprecated files/configs on each setup run
1223
- 23. **Never guess user preferences**: For taste-based config (prefixes), always ask
1224
-
1225
- ### Agent Integration
1226
-
1227
- 24. **Install hooks programmatically**: SessionStart, PreCompact, PostToolUse
1228
- 25. **Use skill directories**: `.claude/skills/`, `.cursor/rules/`
1229
- 26. **Support multiple agents**: Single CLI, multiple integration points
1230
- 27. **Structure help for agent discovery**: IMPORTANT section, Getting Started
1231
- one-liner, prominent setup commands
1232
-
1233
- ### Output
1234
-
1235
- 28. **Implement `--json` for all commands**: Machine-readable output is essential
1236
- 29. **Use `output.data()` pattern**: Single code path for JSON and human output
1237
- 30. **Provide `--quiet` mode**: For scripted usage without noise
1238
-
1239
- ### Error Handling
1240
-
1241
- 31. **Include next steps in errors**: Actionable guidance, not just error messages
1242
- 32. **Graceful deprecation**: Keep old commands working with migration guidance
1243
- 33. **Explicit completion protocols**: Checklists prevent premature completion
1244
-
1245
- ### Agent Mental Model
1246
-
1247
- 34. **Design for agent-as-partner**: Help agents serve users, not relay commands
1248
- 35. **Lead with value proposition**: Explain *why* before *how*
1249
- 36. **Distinguish action from informational commands**: Some commands teach, not do
1250
-
1251
- ### Resource Libraries
1252
-
1253
- 37. **Bundle guidelines, shortcuts, templates**: Ship curated knowledge with CLI
1254
- 38. **Show full commands in directories**: `cli shortcut X`, not just `X`
1255
- 39. **Organize resources by purpose**: Categories by workflow phase or domain
1256
- 40. **Enable on-demand knowledge queries**: Agents pull in relevant resources JIT
1257
- 41. **Implement shadowing for customization**: Project-level overrides without forking
1258
- 42. **Generate directories dynamically**: Avoid stale documentation
1259
- 43. **Use HTML markers for updatable sections**: Enable partial updates without full
1260
- regeneration
1261
-
1262
- ### Context Injection
1263
-
1264
- 44. **Design self-reinforcing context chains**: SKILL.md → guidelines → actions
1265
- 45. **Reference commands explicitly**: Always `cli command arg`, never vague prose
1266
- 46. **Limit chain depth to 3**: Avoid deep reference chains that confuse agents
1267
- 47. **Make every layer actionable**: Each context injection should lead to actions
1268
-
1269
- ### Task Management
1270
-
1271
- 48. **Choose appropriate tracking strategy**: Ephemeral, session-local, or persistent
1272
- 49. **Implement work discovery**: `ready` or `next` commands for session start
1273
- 50. **Add session boundary enforcement**: Remind agents to sync/close at session end
1274
- 51. **Consider tbd integration**: For persistent multi-session task tracking
1275
-
1276
- * * *
728
+ **Scale up only when needed**
1277
729
 
1278
- ## Integration Checklist for New CLIs
730
+ - Many capabilities → meta-skill + informational, self-injecting subcommands (one
731
+ listing slot, unbounded resources).
732
+ This is tbd’s validated approach.
733
+ - Path-ordered resource cache for project/user shadowing; generate `--list` dynamically.
734
+ - Context-injection loop with explicit `cli command arg` references; depth ≤ 3.
1279
735
 
1280
- **Agent Integration Files**
736
+ **Reach & surface**
1281
737
 
1282
- - [ ] SKILL.md with YAML frontmatter (name, description, allowed-tools)
1283
- - [ ] CURSOR.mdc with MDC frontmatter (description, alwaysApply)
1284
- - [ ] AGENTS.md section with HTML markers
1285
- - [ ] Tiered skill files: skill-baseline.md, skill-brief.md
1286
- - [ ] Separate header file: claude-header.md with YAML frontmatter
738
+ - Layer for reach: `AGENTS.md` + `SKILL.md` + CLI + (MCP if no CLI fits).
739
+ - Prefer CLI over MCP when a CLI exists (cheaper, more reliable); use MCP for
740
+ auth/multi-tenant/remote; consider code-execution mode for many-tool MCP servers.
741
+ - Add agent-specific files (`.cursor/rules`, plugins, ACP) last, only where they pay
742
+ off.
1287
743
 
1288
- **Description Quality**
744
+ **Operate safely**
1289
745
 
1290
- - [ ] Two-part description: capabilities + activation triggers
1291
- - [ ] Third-person language only
1292
- - [ ] Explicit “Use when …” trigger phrases matching user language
1293
- - [ ] Front-load important keywords in first 50 characters
1294
- - [ ] Description length appropriate for skill collection size (≤130 chars for 60+
1295
- skills)
746
+ - Treat all skill/instruction content and tool output as untrusted; vet and pin
747
+ third-party skills.
748
+ - Scope `allowed-tools` tightly; gate destructive skills; design for sandboxes.
749
+ - Idempotent multi-agent install with marker-bounded sections; version source files, not
750
+ fully generated install artifacts; mark generated files “DO NOT EDIT.”
1296
751
 
1297
- **Budget Management** (for CLIs installing multiple skills)
1298
-
1299
- - [ ] Calculate cumulative description size (descriptions + ~109 chars overhead each)
1300
- - [ ] Verify total stays under 15K character budget
1301
- - [ ] Use meta-skill pattern if resources exceed 50
1302
- - [ ] Run `/context` to verify skills aren’t being truncated
1303
-
1304
- **Context Management**
1305
-
1306
- - [ ] `prime` command with dashboard and brief modes (two levels only)
1307
- - [ ] `skill` command for full documentation output with `--brief` flag
1308
- - [ ] Skill composition from header + baseline + dynamic directory
1309
- - [ ] HTML markers for updatable sections (<!-- BEGIN/END -->)
1310
- - [ ] “DO NOT EDIT” warnings in generated skill files
1311
- - [ ] Value-first orientation in skill file (why before how)
1312
- - [ ] Context recovery instructions in all docs
1313
- - [ ] Session closing protocol checklist
1314
- - [ ] SKILL.md under 500 lines (progressive disclosure)
1315
-
1316
- **Setup Flow**
1317
-
1318
- - [ ] `setup --auto` for agent-friendly installation
1319
- - [ ] `init --prefix` for surgical initialization
1320
- - [ ] Multi-contributor detection (skip init if already configured)
1321
- - [ ] Setup is idempotent (safe to run multiple times)
1322
- - [ ] Hook deduplication (filter existing before merging)
1323
- - [ ] Skill file regeneration (always overwrite, don’t update in place)
1324
- - [ ] Legacy pattern cleanup on each setup run
1325
-
1326
- **Hooks**
1327
-
1328
- - [ ] SessionStart hook to call `prime`
1329
- - [ ] PreCompact hook to call `prime`
1330
- - [ ] PostToolUse hook for session completion reminders
1331
-
1332
- **Self-Documentation**
1333
-
1334
- - [ ] Help epilog with “IMPORTANT” section referencing `prime`
1335
- - [ ] Help epilog with “Getting Started” one-liner installation command
1336
- - [ ] Setup command in dedicated “Setup & Configuration” category
1337
- - [ ] Context recovery prompt when CLI runs without args
1338
- - [ ] Documentation commands (`readme`, `docs`)
1339
- - [ ] `--json` flag on all commands
1340
-
1341
- **Resource Libraries**
1342
-
1343
- - [ ] `shortcut` command with `--list` and category filtering
1344
- - [ ] `guidelines` command with `--list` and category filtering
1345
- - [ ] `template` command with `--list`
1346
- - [ ] Resources organized by purpose/workflow phase
1347
- - [ ] Resource directories generated dynamically
1348
- - [ ] Shadowing support for project-level overrides
1349
-
1350
- **Task Management**
752
+ * * *
1351
753
 
1352
- - [ ] Decide tracking strategy: ephemeral, session-local, or persistent
1353
- - [ ] Implement work discovery command (`ready`, `next`)
1354
- - [ ] Add session closing reminders for task sync
754
+ ## 12. Integration Checklist
755
+
756
+ **Baseline (every skill)**
757
+ - [ ] `SKILL.md` with `name` + two-part `description`
758
+ - [ ] Body < 500 lines; bulky material in supporting files one level deep
759
+ - [ ] Third-person description, trigger keywords front-loaded
760
+ - [ ] Installable via commit to `.claude/skills/` and/or `npx skills add`
761
+
762
+ **Project**
763
+ - [ ] `AGENTS.md` with build/test/style/conventions (concise)
764
+ - [ ] `CLAUDE.md` strategy decided (symlink to `AGENTS.md`, copy, or separate)
765
+
766
+ **CLI tool (if applicable)**
767
+ - [ ] `--json` on all commands; `--brief`/`--quiet`; actionable errors
768
+ - [ ] Idempotent `setup --auto`; `init` for surgical config
769
+ - [ ] Help epilog with `IMPORTANT:` + Getting Started one-liner
770
+ - [ ] `prime` (status/context) and `skill` (pure docs) commands
771
+ - [ ] Invocation strategy chosen (§6.7): pinned zero-install (`npx`/`uvx <pkg>@<ver>`)
772
+ by default, or global install + `SessionStart` bootstrap for persistent environments
773
+
774
+ **Advanced (many subcommands / knowledge library)**
775
+ - [ ] Meta-skill composition (header + baseline + dynamic directory)
776
+ - [ ] Informational commands (`guidelines`/`shortcut`/`template`) with `--list`
777
+ - [ ] Path-ordered DocCache with shadowing
778
+ - [ ] Tiered skill files (baseline + brief)
779
+ - [ ] Context-injection loop, explicit references, depth ≤ 3
780
+
781
+ **Reach**
782
+ - [ ] Decide target agents; add per-agent files only where needed
783
+ - [ ] MCP server only if no CLI fits, or for OAuth/multi-tenant/remote
784
+ - [ ] Marker-bounded multi-agent install; “DO NOT EDIT” on generated files
785
+
786
+ **Security**
787
+ - [ ] Third-party skills vetted, scanned, and pinned
788
+ - [ ] `allowed-tools` minimally scoped; destructive skills gated
789
+ - [ ] CLI works within agent sandboxes and degrades gracefully
1355
790
 
1356
791
  * * *
1357
792
 
1358
793
  ## References
1359
794
 
- ### Official Documentation
-
- - Claude Code Skills Documentation: https://code.claude.com/docs/en/skills
- - Agent Skills Open Standard: https://agentskills.io
- - Cursor IDE MDC Format: https://cursor.sh/docs/rules
-
- ### Community Resources
-
- - Claude Skills Best Practices:
- https://github.com/Dicklesworthstone/meta_skill/blob/main/BEST_PRACTICES_FOR_WRITING_AND_USING_SKILLS_MD_FILES.md
- - Claude Code Skills Guide (Gist):
- https://gist.github.com/mellanon/50816550ecb5f3b239aa77eef7b8ed8d
- - Awesome Claude Skills: https://github.com/travisvn/awesome-claude-skills
- - Skills Character Budget Research:
- https://github.com/anthropics/claude-code/issues/11045
- - Skill Activation Reliability Testing:
- https://scottspence.com/posts/how-to-make-claude-code-skills-activate-reliably
-
- ### MCP Resources
-
- - Anthropic MCP Engineering Blog:
- https://www.anthropic.com/engineering/code-execution-with-mcp
- - MCP Agent Frameworks Comparison:
- https://clickhouse.com/blog/how-to-build-ai-agents-mcp-12-frameworks
- - GitHub MCP Registry: https://github.com/modelcontextprotocol
-
- ### Implementation References
-
- - Commander.js: https://github.com/tj/commander.js
- - tbd Source Code: https://github.com/jlevy/tbd
+ ### Open standards & governance
+
+ - Agent Skills standard: https://agentskills.io (spec:
+ https://agentskills.io/specification)
+ - AGENTS.md: https://agents.md
+ - Agentic AI Foundation (Linux Foundation):
+ https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation
+
+ ### Claude Code
+
+ - Skills: https://code.claude.com/docs/en/skills
+ - Hooks: https://code.claude.com/docs/en/hooks
+ - Plugins: https://code.claude.com/docs/en/plugins
+ - Skill authoring best practices:
+ https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices
+ - Sandboxing: https://www.anthropic.com/engineering/claude-code-sandboxing
+
+ ### Codex
+
+ - AGENTS.md: https://developers.openai.com/codex/guides/agents-md
+ - Config reference: https://developers.openai.com/codex/config-reference
+ - Skills: https://developers.openai.com/codex/skills
+ - MCP: https://developers.openai.com/codex/mcp
+ - Sandboxing: https://developers.openai.com/codex/concepts/sandboxing
+
+ ### Other agents
+
+ - Cursor rules: https://cursor.com/docs/rules
+ - GitHub Copilot custom instructions:
+ https://docs.github.com/copilot/customizing-copilot/adding-custom-instructions-for-github-copilot
+ - Gemini CLI: https://geminicli.com/docs/cli/gemini-md/
+ - Windsurf AGENTS.md: https://docs.windsurf.com/windsurf/cascade/agents-md
+ - Cline rules: https://docs.cline.bot/customization/cline-rules
+ - Aider conventions: https://aider.chat/docs/usage/conventions.html
+ - opencode: https://opencode.ai/docs/rules/
+ - Amp: https://ampcode.com/manual
+ - pi: https://github.com/badlogic/pi-mono
+
+ ### MCP & protocols
+
+ - 2026 MCP roadmap: https://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/
+ - Code execution with MCP: https://www.anthropic.com/engineering/code-execution-with-mcp
+ - CLI vs MCP benchmarks: https://www.firecrawl.dev/blog/mcp-vs-cli
+ - ACP: https://zed.dev/acp
+ - A2A:
+ https://www.linuxfoundation.org/press/a2a-protocol-surpasses-150-organizations-lands-in-major-cloud-platforms-and-sees-enterprise-production-use-in-first-year
+
+ ### Distribution & ecosystem
+
+ - Vercel skills / skills.sh:
+ https://vercel.com/changelog/introducing-skills-the-open-agent-skills-ecosystem
+ - npx skills: https://github.com/vercel-labs/skills
+ - Anthropic skills (examples): https://github.com/anthropics/skills
+ - gstack: https://github.com/garrytan/gstack
+ - Beads (bd): https://github.com/gastownhall/beads
+
+ ### Security
+
+ - Securing the skill ecosystem (Snyk):
+ https://snyk.io/blog/snyk-vercel-securing-agent-skill-ecosystem/
+ - Malicious-skill research:
+ https://labs.reversec.com/posts/2026/05/skill-issues-compromising-claude-code-with-malicious-skills-agents-part-1
 
  ## Related Guidelines
 
- - For TypeScript CLI implementation details, see
- `tbd guidelines typescript-cli-tool-rules`
- - For testing patterns, see `tbd guidelines general-testing-rules`
- - For monorepo setup, see `tbd guidelines pnpm-monorepo-patterns` or
- `bun-monorepo-patterns`
+ - TypeScript CLI implementation: `tbd guidelines typescript-cli-tool-rules`
+ - Supply-chain / dependency currency: `tbd guidelines bun-monorepo-patterns` or
+ `pnpm-monorepo-patterns`
+ - Testing: `tbd guidelines general-testing-rules`