superacli 1.0.0
- package/.env.example +14 -0
- package/README.md +173 -0
- package/cli/adapters/http.js +72 -0
- package/cli/adapters/mcp.js +193 -0
- package/cli/adapters/openapi.js +160 -0
- package/cli/ask.js +208 -0
- package/cli/config.js +133 -0
- package/cli/executor.js +117 -0
- package/cli/help-json.js +46 -0
- package/cli/mcp-local.js +72 -0
- package/cli/plan-runtime.js +32 -0
- package/cli/planner.js +67 -0
- package/cli/skills.js +240 -0
- package/cli/supercli.js +704 -0
- package/docs/features/adapters.md +25 -0
- package/docs/features/agent-friendly.md +28 -0
- package/docs/features/ask.md +32 -0
- package/docs/features/config-sync.md +22 -0
- package/docs/features/execution-plans.md +25 -0
- package/docs/features/observability.md +22 -0
- package/docs/features/skills.md +25 -0
- package/docs/features/storage.md +25 -0
- package/docs/features/workflows.md +33 -0
- package/docs/initial/AGENTS_FRIENDLY_TOOLS.md +553 -0
- package/docs/initial/agent-friendly.md +447 -0
- package/docs/initial/architecture.md +436 -0
- package/docs/initial/built-in-mcp-server.md +64 -0
- package/docs/initial/command-plan.md +532 -0
- package/docs/initial/core-features-2.md +428 -0
- package/docs/initial/core-features.md +366 -0
- package/docs/initial/dag.md +20 -0
- package/docs/initial/description.txt +9 -0
- package/docs/initial/idea.txt +564 -0
- package/docs/initial/initial-spec-details.md +726 -0
- package/docs/initial/initial-spec.md +731 -0
- package/docs/initial/mcp-local-mode.md +53 -0
- package/docs/initial/mcp-sse-mode.md +54 -0
- package/docs/initial/skills-support.md +246 -0
- package/docs/initial/storage-adapter-example.md +155 -0
- package/docs/initial/supercli-vs-gwc.md +109 -0
- package/examples/mcp-sse/install-demo.js +86 -0
- package/examples/mcp-sse/server.js +81 -0
- package/examples/mcp-stdio/install-demo.js +78 -0
- package/examples/mcp-stdio/server.js +50 -0
- package/package.json +21 -0
- package/server/app.js +59 -0
- package/server/public/app.js +18 -0
- package/server/routes/ask.js +92 -0
- package/server/routes/commands.js +126 -0
- package/server/routes/config.js +58 -0
- package/server/routes/jobs.js +122 -0
- package/server/routes/mcp.js +79 -0
- package/server/routes/plans.js +134 -0
- package/server/routes/specs.js +79 -0
- package/server/services/configService.js +88 -0
- package/server/storage/adapter.js +32 -0
- package/server/storage/file.js +64 -0
- package/server/storage/mongo.js +55 -0
- package/server/views/command-edit.ejs +110 -0
- package/server/views/commands.ejs +49 -0
- package/server/views/jobs.ejs +72 -0
- package/server/views/layout.ejs +42 -0
- package/server/views/mcp.ejs +80 -0
- package/server/views/partials/foot.ejs +5 -0
- package/server/views/partials/head.ejs +27 -0
- package/server/views/specs.ejs +91 -0
- package/tests/test-cli.js +367 -0
- package/tests/test-mcp.js +189 -0
- package/tests/test-openapi.js +101 -0
@@ -0,0 +1,553 @@

# Agent-Friendly CLI Design Principles

## The Fundamental Question: What Is "Agent-Friendly"?

An agent-friendly tool is one designed for programmatic consumption where **information density** and **predictability** take precedence over human ergonomics. It recognizes that AI agents operate under fundamentally different constraints than human users.

## Core Differences: Agents vs Humans

### Agents Are Token-Constrained
- Every byte of help text, output, and command costs tokens
- Tokens are literal currency in agent operations
- Verbose output directly increases operational costs
- **Principle**: Maximum signal, minimum noise

### Agents Have Limited Context Windows
- No persistent memory between invocations unless explicitly provided
- Must understand tool capabilities quickly and completely
- Cannot "remember" previous interactions or help text
- **Principle**: Self-describing but concise interfaces

### Agents Require Deterministic Behavior
- Ambiguous output breaks automated pipelines
- Same input must always produce same output structure
- Schema changes are breaking changes
- **Principle**: Stable, predictable interfaces

## Five Foundational Principles

### 1. Machine-Friendly Escape Hatches

**Every command must support non-interactive execution.**

**Implementation:**
- `--no-prompt` / `--no-interactive` flags to disable stdin reads
- `--yes` / `-y` flags for automatic confirmations
- Environment variables for global configuration (e.g., `NO_COLOR=true`)
- Tool-specific environment variables (e.g., `BDG_PROJECT_ID=2558`)

**Rationale:** Agents cannot respond to prompts. Interactive tools break automation.

**Source:** InfoQ, "Patterns for AI Agent Driven CLIs" (August 2025)
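
The escape-hatch flags listed for this principle can be wired together in a few lines. The following is a minimal sketch in Python, not code from this package: the `confirm` and `parse_args` helpers and the `tool` program name are hypothetical, and only the flag names (`--yes`, `-y`, `--no-interactive`, `--no-prompt`) come from the list above.

```python
import argparse


def confirm(question: str, assume_yes: bool, interactive: bool) -> bool:
    """Answer a yes/no confirmation without ever blocking an agent."""
    if assume_yes:
        return True   # --yes / -y: confirm automatically
    if not interactive:
        return False  # --no-interactive: never read stdin, decline instead
    return input(f"{question} [y/N] ").strip().lower() in ("y", "yes")


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="tool")  # "tool" is a placeholder name
    parser.add_argument("-y", "--yes", action="store_true")
    # Both spellings clear the same flag; prompts stay enabled by default.
    parser.add_argument("--no-interactive", "--no-prompt",
                        dest="interactive", action="store_false")
    return parser.parse_args(argv)


args = parse_args(["--no-interactive"])
print(confirm("Delete cache?", args.yes, args.interactive))
```

Declining by default when prompts are disabled is one possible design choice; a tool could equally exit with an input error and ask the caller to pass `--yes` explicitly.
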

---

### 2. Treat Output as API Contracts

**Output formats are versioned interfaces that must remain stable.**

**Implementation:**
- Semantic versioning for output schema changes
- Schema validation on every change
- Additive changes only (new fields allowed, removing fields = major version bump)
- Version numbers in structured output: `{"version": "1.0", "data": {...}}`

**Rationale:** Breaking output format disrupts all downstream automation. Agents parse output programmatically; humans can adapt to changes.

**Source:** InfoQ, "Patterns for AI Agent Driven CLIs" (August 2025)
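
The "additive changes only" rule can be checked mechanically. The sketch below is an illustration under stated assumptions: the helper names `envelope` and `removed_fields` are hypothetical, and the check compares two sample payloads rather than formal schemas.

```python
import json

SCHEMA_VERSION = "1.0"  # assumed value, mirroring the example envelope


def envelope(data: dict) -> str:
    """Wrap payload data in a versioned envelope."""
    return json.dumps({"version": SCHEMA_VERSION, "data": data}, sort_keys=True)


def removed_fields(old: dict, new: dict, prefix: str = "") -> list[str]:
    """List fields present in `old` but missing from `new` (breaking changes)."""
    missing = []
    for key, value in old.items():
        path = f"{prefix}{key}"
        if key not in new:
            missing.append(path)
        elif isinstance(value, dict) and isinstance(new[key], dict):
            # Recurse so nested removals are reported with a dotted path.
            missing.extend(removed_fields(value, new[key], prefix=f"{path}."))
    return missing


old = {"id": "req_123", "status": 200}
new = {"id": "req_123", "status": 200, "duration_ms": 145}
print(removed_fields(old, new))  # adding a field removes nothing
print(removed_fields(new, old))  # dropping a field is reported as breaking
```
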

---

### 3. Semantic Exit Codes

**Exit codes communicate actionable information, not just success/failure.**

**Implementation (based on Square's system):**
```
0        Success
1        Generic failure (backward compatibility)
80-99    User errors (invalid arguments, bad permissions, resource issues)
100-119  Software errors (bugs, integration failures, timeouts)
```

**Recommended Subdivisions (for finer-grained error handling):**
```
80-89    Input/validation errors (invalid arguments, bad permissions)
90-99    Resource/state errors (not found, already exists, conflicts)
100-109  Integration/external errors (API down, timeout, auth failed)
110-119  Internal software errors (bugs in the tool itself, panics)
```

**Agent Decision Logic:**
- 0: Proceed to next step
- 80-89: Don't retry, fix input/permissions first
- 90-99: Ask for clarification or try alternate resource
- 100-109: Retry with backoff (likely transient failure)
- 110-119: Report bug, don't retry

**Rationale:** Agents make programmatic decisions based on exit codes. A bare 0/1 exit status says only whether the command failed, not what the agent should do next. Square's two-tier system (80-99 user, 100-119 software) provides the foundation; subdivisions enable more sophisticated retry logic.

**Source:** Square Engineering, "Command Line Observability with Semantic Exit Codes" (January 2023)
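
The agent decision logic for the subdivided ranges reduces to a small lookup. This is a minimal sketch; the function name `next_action` and the returned action labels are hypothetical, while the ranges are the ones listed above.

```python
def next_action(exit_code: int) -> str:
    """Map a semantic exit code to the agent decision for its range."""
    if exit_code == 0:
        return "proceed"
    if 80 <= exit_code <= 89:
        return "fix_input"                 # don't retry until input is fixed
    if 90 <= exit_code <= 99:
        return "clarify_or_try_alternate"
    if 100 <= exit_code <= 109:
        return "retry_with_backoff"        # likely transient
    if 110 <= exit_code <= 119:
        return "report_bug"                # don't retry
    return "unknown_failure"               # e.g. generic 1: no actionable signal


for code in (0, 85, 92, 105, 113, 1):
    print(code, next_action(code))
```
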

---

### 4. Structured Output with Multiple Formats

**Support both human-readable and machine-parseable output.**

**Implementation:**
- Default: Structured text (key-value pairs, line-based)
- `--json`: Full JSON structure
- `--plain`: Tab-separated for grep/awk compatibility
- Consistent flags across all commands: `-o json` or `--output json`

**Output Separation:**
- Primary data → `stdout`
- Logs/warnings/progress → `stderr`
- Errors → `stderr` (with structured format when `--json` used)

**Example:**
```bash
# Default (human & agent readable)
$ bdg network requests
ID: req_123
URL: https://api.example.com/data
Status: 200
Duration: 145ms

# JSON mode (pure machine readable)
$ bdg network requests --json
{"id":"req_123","url":"https://api.example.com/data","status":200,"duration_ms":145}
```

**Rationale:** Agents need parseable structure. Humans need readable context. Both served by the same tool with format flags.

**Source:** Command Line Interface Guidelines (clig.dev), AWS CLI documentation
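
One way to serve all three formats from a single record is a small rendering function. The sketch below is hypothetical (the `render` name and its `fmt` values are assumptions), and its text mode uses the raw field names rather than the prettier labels in the example above.

```python
import json


def render(record: dict, fmt: str = "text") -> str:
    """Render one record as key-value text, compact JSON, or tab-separated values."""
    if fmt == "json":
        return json.dumps(record, separators=(",", ":"))
    if fmt == "plain":
        return "\t".join(str(value) for value in record.values())
    if fmt == "text":
        return "\n".join(f"{key}: {value}" for key, value in record.items())
    raise ValueError(f"unknown format: {fmt}")


record = {"id": "req_123", "url": "https://api.example.com/data",
          "status": 200, "duration_ms": 145}
for fmt in ("text", "json", "plain"):
    print(render(record, fmt))
```
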

---

### 5. Real-Time Feedback for Long Operations

**Progress reporting prevents agent timeouts and enables early failure detection.**

**Implementation:**
- Progress indicators on `stderr` (never `stdout`)
- Event streaming for long-running operations
- Incremental output when possible (streaming JSON Lines)
- Timeout hints: `Estimated: 2m 30s remaining`

**Example:**
```bash
$ bdg performance trace --duration 30s
[stderr] Capturing trace... 15s elapsed
[stderr] Capturing trace... 30s complete
[stdout] {"trace_file": "/tmp/trace.json", "size_mb": 45.2}
```

**Rationale:** Long-running commands appear "hung" to agents without feedback. Progress on stderr allows agents to monitor without parsing complexity.

**Source:** InfoQ, "Patterns for AI Agent Driven CLIs" (August 2025)
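
The stream separation described for this principle can be sketched as follows. Everything here is illustrative: `run_with_progress` and the event field names are assumptions, and the in-memory streams stand in for the real `stdout` and `stderr` so the separation is easy to inspect.

```python
import io
import json
import sys
import time


def run_with_progress(steps, out=None, err=None, delay=0.0):
    """Emit progress on the error stream and JSON Lines on the output stream."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    for done in range(1, steps + 1):
        time.sleep(delay)  # stand-in for real work
        print(f"progress {done}/{steps}", file=err, flush=True)
        print(json.dumps({"event": "step", "index": done}), file=out, flush=True)
    print(json.dumps({"event": "done", "steps": steps}), file=out, flush=True)


out, err = io.StringIO(), io.StringIO()
run_with_progress(2, out=out, err=err)
# Every line on the output stream parses as JSON; progress never leaks into it.
records = [json.loads(line) for line in out.getvalue().splitlines()]
print(records[-1])
```
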

---

## Design Philosophy for Agent-First Tools

### Dual-Mode Architecture: Janus-Faced Design

Tools should **detect execution context** and adapt:

**Detection Strategy:**
```python
import sys

if sys.stdout.isatty():
    mode = "human"  # colors, formatting, helpful context
else:
    mode = "agent"  # structured output, no colors, minimal decoration
```

**Override with explicit flags:**
- `--json`: Force machine-readable output
- `--no-color`: Disable ANSI escape codes
- `--no-interactive`: Disable all prompts
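
How TTY detection and the explicit overrides might combine is sketched below. The `output_mode` function and its return shape are hypothetical; the sketch assumes that `--json` only changes the format, that color is enabled only on a TTY, and that a non-empty `NO_COLOR` environment variable disables color.

```python
import os
import sys
from typing import Mapping, Optional


def output_mode(is_tty: bool, json_flag: bool = False, no_color_flag: bool = False,
                env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve output format and color from context plus explicit overrides."""
    env = os.environ if env is None else env
    color = bool(
        is_tty                       # never colorize piped output
        and not json_flag            # JSON stays free of escape codes
        and not no_color_flag        # explicit --no-color wins
        and not env.get("NO_COLOR")  # any non-empty value disables color
    )
    return {"format": "json" if json_flag else "text", "color": color}


print(output_mode(sys.stdout.isatty()))
```

Passing `is_tty` and `env` in as arguments, rather than reading them inside the function, keeps the decision easy to test.
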

### Information Layering: Three Output Tiers

**Layer 1: Primary Data** (`stdout`)
- The answer to the question asked
- What agents will parse and compose
- Must be stable, versioned schema

**Layer 2: Metadata** (`stdout`, optional)
- Timestamps, URLs, identifiers
- Included with `--verbose` or `--metadata`
- Structured when present

**Layer 3: Context** (`stderr`)
- Progress indicators
- Warnings
- Explanatory messages
- Never interferes with pipelines

### Command Topology: Navigable Mental Model

**Commands should reflect investigation workflow:**
```
tool
├── resource              # Top-level entity
│   ├── get <id>          # Retrieve one
│   ├── list [filters]    # Query many
│   └── subresource <id>  # Navigate relationships
│
└── action                # Operational commands
    ├── start <target>
    └── stop <target>
```

**Composability Pattern:**
```bash
# Output of one command feeds the next
tool resource list --status=failed --json | \
  jq -r '.[] | .id' | \
  xargs -I {} tool resource get {} --json
```

Each command:
- Does one thing completely
- Returns structured, parseable output
- Can be composed with other commands via pipes

---

## Error Handling Philosophy

### Errors Are Typed Information

Agents need **semantic error signals** to make decisions, not friendly messages.

**Error Structure:**
```json
{
  "error": {
    "code": 92,
    "type": "resource_not_found",
    "message": "Network request req_123 not found",
    "details": {
      "request_id": "req_123",
      "reason": "Request may have been cleared from cache"
    },
    "recoverable": false,
    "retry_after": null,
    "suggestions": [
      "List recent requests: bdg network requests --recent",
      "Check request ID format: should be req_*"
    ]
  }
}
```

**Key Fields for Agent Decision-Making:**
- `code`: Semantic exit code (matches process exit code)
- `type`: Machine-readable error category
- `recoverable`: Should agent retry?
- `retry_after`: When to retry (for rate limits, timeouts)
- `suggestions`: Array of next actions agent can take
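
A typed error like this can be modeled as a small data class whose `code` doubles as the process exit code. This is a sketch only: the `CliError` class and `fail` helper are hypothetical names, while the field names follow the error structure shown above.

```python
import json
import sys
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class CliError:
    code: int
    type: str
    message: str
    details: dict = field(default_factory=dict)
    recoverable: bool = False
    retry_after: Optional[int] = None
    suggestions: list = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps({"error": asdict(self)})


def fail(error: CliError, err=None) -> int:
    """Write the structured error to stderr and return the matching exit code."""
    print(error.to_json(), file=sys.stderr if err is None else err)
    return error.code


error = CliError(code=92, type="resource_not_found",
                 message="Network request req_123 not found",
                 details={"request_id": "req_123"},
                 suggestions=["List recent requests: bdg network requests --recent"])
exit_code = fail(error)  # a real CLI would finish with sys.exit(exit_code)
```
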

### Tool Doesn't Retry - Agent Does

**Anti-Pattern:**
```bash
# Tool retries internally (bad)
$ bdg network requests
Connecting... failed
Retrying in 2s...
Retrying in 4s...
Error: Connection failed after 3 attempts
```

**Correct Pattern:**
```bash
# Tool reports error clearly (good)
$ bdg network requests --json
{"error": {"code": 105, "type": "connection_timeout", "recoverable": true, "retry_after": 2}}
$ echo $?
105
```

**Rationale:** Agents have their own retry logic, backoff strategies, and decision trees. Tool should provide clear signals, not hide failures behind retry loops.
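
On the agent side, the retry decision can then be driven entirely by the structured error. The sketch below is an assumption about how an agent might do this, not part of any tool: `run_with_retry` is a hypothetical helper, and `fake_run` stands in for `subprocess.run` so the example needs no real `bdg` binary.

```python
import json
import subprocess
import time
from types import SimpleNamespace


def run_with_retry(cmd, max_attempts=3, run=subprocess.run, sleep=time.sleep):
    """Re-run `cmd` only while the tool reports a recoverable error."""
    proc = None
    for attempt in range(1, max_attempts + 1):
        proc = run(cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            return proc
        try:
            error = json.loads(proc.stderr or proc.stdout)["error"]
            recoverable = bool(error["recoverable"])
        except (ValueError, KeyError, TypeError):
            return proc  # unstructured failure: do not guess
        if not recoverable or attempt == max_attempts:
            return proc
        # Honor the tool's hint, else fall back to exponential backoff.
        sleep(error.get("retry_after") or 2 ** attempt)
    return proc


calls, delays = [], []

def fake_run(cmd, **kwargs):
    """Stand-in for a tool that always times out with a structured error."""
    calls.append(cmd)
    payload = {"error": {"code": 105, "type": "connection_timeout",
                         "recoverable": True, "retry_after": 2}}
    return SimpleNamespace(returncode=105, stdout="", stderr=json.dumps(payload))


result = run_with_retry(["bdg", "network", "requests", "--json"],
                        run=fake_run, sleep=delays.append)
print(result.returncode, len(calls), delays)
```
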

---

## Unix Philosophy Foundations

### Do One Thing Well

**Each command has a specific, well-defined purpose:**
- `bdg network requests` → List network requests
- `bdg network failed` → List failed requests only
- `bdg console errors` → List console errors

**Not:**
- `bdg diagnose-everything` → Analyzes network, console, performance in one command

### Composability Through Pipes

**Design for composition:**
```bash
# Find slow requests, get details, extract URLs
bdg network slow --threshold 1000ms --json | \
  jq -r '.[] | .id' | \
  xargs -I {} bdg network timing {} --json | \
  jq -r '.url'
```

**Requirements:**
- Line-based or JSON output
- Stable field names
- Predictable structure
- Clean stdout (no decoration)

### Text Streams as Universal Interface

**Standard streams have distinct purposes:**
- `stdin`: Accept piped input when appropriate
- `stdout`: Primary data output (parseable)
- `stderr`: Logs, warnings, progress (ignorable)
- Exit code: Success/failure signal

**Never mix purposes:** Progress bars on stdout break pipes. Data written to stderr never reaches the next command in the pipe.

---

## Self-Describing Tools

### Tool Introspection

Agents don't read documentation - they query capabilities.

**Help as Data:**
```bash
$ bdg --help-json
{
  "version": "1.2.0",
  "commands": {
    "network": {
      "description": "Network request inspection",
      "subcommands": ["requests", "failed", "slow", "timing"],
      "flags": {
        "--json": "Output in JSON format",
        "--limit": "Maximum number of results (default: 50)"
      }
    }
  },
  "output_formats": ["text", "json"],
  "exit_codes": {
    "0": "success",
    "85": "invalid_argument",
    "92": "resource_not_found",
    "105": "connection_timeout"
  }
}
```
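
An agent can use help-as-data to check a command before running it. The sketch below is hypothetical: `can_run` is an assumed helper, and the embedded `HELP_JSON` is a trimmed copy of the example payload above rather than real tool output.

```python
import json

# Trimmed copy of the example --help-json payload; a real agent would
# capture this from the tool instead of embedding it.
HELP_JSON = """
{"version": "1.2.0",
 "commands": {"network": {"subcommands": ["requests", "failed", "slow", "timing"],
                          "flags": {"--json": "Output in JSON format"}}},
 "exit_codes": {"0": "success", "92": "resource_not_found"}}
"""


def can_run(help_doc: dict, command: str, subcommand: str, *flags: str) -> bool:
    """Check that a subcommand and every requested flag are advertised."""
    spec = help_doc.get("commands", {}).get(command)
    if spec is None or subcommand not in spec.get("subcommands", []):
        return False
    return all(flag in spec.get("flags", {}) for flag in flags)


help_doc = json.loads(HELP_JSON)
print(can_run(help_doc, "network", "slow", "--json"))
print(can_run(help_doc, "network", "replay"))
```
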

### Schema Discovery

For complex tools, provide JSON schemas:
```bash
$ bdg network requests --schema
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "id": {"type": "string"},
    "url": {"type": "string"},
    "method": {"type": "string", "enum": ["GET", "POST", "PUT", "DELETE"]},
    "status": {"type": "integer"},
    "duration_ms": {"type": "number"}
  }
}
```

**Rationale:** Agents can validate output, understand structure, and adapt to schema versions.
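
To illustrate what validating output against such a schema involves, here is a deliberately tiny checker using only the standard library. It is a sketch, not a JSON Schema implementation: the `violations` helper is hypothetical and understands only the `type` and `enum` keywords used in the example schema. A real agent would more likely use a full validator library.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "url": {"type": "string"},
        "method": {"type": "string", "enum": ["GET", "POST", "PUT", "DELETE"]},
        "status": {"type": "integer"},
        "duration_ms": {"type": "number"},
    },
}

# Python types accepted for each JSON Schema type name used above.
TYPES = {"string": str, "integer": int, "number": (int, float)}


def violations(record: dict, schema: dict) -> list:
    """Return human-readable problems; an empty list means the record passes."""
    problems = []
    for name, rule in schema.get("properties", {}).items():
        if name not in record:
            continue  # the example schema declares no required fields
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, TYPES[rule["type"]]):
            problems.append(f"{name}: expected {rule['type']}")
        elif "enum" in rule and value not in rule["enum"]:
            problems.append(f"{name}: not one of {rule['enum']}")
    return problems


print(violations({"id": "req_1", "method": "GET", "status": 200}, SCHEMA))
print(violations({"status": "200"}, SCHEMA))
```
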

---

## Context Without Verbosity

### Bad: Verbose Explanations
```
Connecting to Chrome DevTools Protocol...
Successfully established connection on port 9222
Querying network activity...
Found 47 requests in the last 30 seconds
Filtering for failed requests...
3 requests failed with status codes >= 400
Here are the results:
```

### Good: Structured Context
```
Connected: localhost:9222
Total Requests: 47
Failed: 3/47
Time Range: 30s

req_001 | POST /api/data | 500 | 145ms
req_015 | GET /config.json | 404 | 23ms
req_042 | PUT /update | 503 | 2341ms
```

### Best: JSON with Metadata
```json
{
  "connection": "localhost:9222",
  "summary": {
    "total_requests": 47,
    "failed_requests": 3,
    "time_range_seconds": 30
  },
  "requests": [
    {
      "id": "req_001",
      "method": "POST",
      "url": "/api/data",
      "status": 500,
      "duration_ms": 145
    }
  ]
}
```

**Key Principle:** Context is essential information, not chatty narration. Both humans and agents need context - they just need it structured differently.

---

## CLI vs MCP: Design Trade-offs for Agent Tools

**Note:** The industry consensus (including the InfoQ article) recommends **MCP adoption for agent integration**. MCP provides dynamic capability discovery and structured schemas. The observations below reflect personal experience building CLI-first tools and should be considered alongside MCP's benefits.

### CLI Advantages (Observed in Practice)

**Context Efficiency:**
- **CLI**: Command structure is the schema (`bdg network requests --failed`)
- **MCP**: Protocol overhead + server definitions + request/response wrapping
- **Observation**: Simpler commands can be more token-efficient in practice

**Debuggability:**
- **CLI**: `$ bdg network requests` fails → see exact error message
- **MCP**: Errors wrap in protocol layers, may require additional debugging steps

**Composability:**
- **CLI**: `bdg network requests | jq | grep | sort`
- **MCP**: Responses don't naturally compose with Unix tools
- **Strength**: Unix pipeline patterns for filtering and transformation

**Model Knowledge:**
- **CLI**: LLMs trained extensively on bash/zsh command patterns
- **MCP**: Newer protocol, less representation in training data
- **Caveat**: MCP enables dynamic discovery, which can offset this

### MCP Advantages (Industry Perspective)

**Dynamic Discovery:**
- Agents discover capabilities at runtime without hardcoded knowledge
- Schema validation prevents errors from format changes
- Versioned capability negotiation

**Standardization:**
- Single protocol for tool integration across ecosystems
- Reduces fragmentation compared to CLI tool diversity

**Complex Interactions:**
- Stateful, multi-turn interactions
- Complex authentication flows
- Real-time bidirectional communication

### Design Decision for bdg

**For this project (Chrome DevTools telemetry):** CLI is the chosen approach because:
- DevTools operations are atomic queries (list requests, get console logs)
- No stateful multi-turn workflows needed
- Target users already work in terminal environments
- Unix composability is a natural fit for data filtering/analysis

**This doesn't mean CLI is universally superior** - it's a trade-off based on use case. Tools requiring dynamic discovery, complex state management, or cross-platform consistency may benefit more from MCP.

---

## Practical Design Checklist

### ✅ Command Design
- [ ] Each command answers one specific question
- [ ] Subcommands reflect logical navigation path
- [ ] Command names are verbs or nouns, never sentences
- [ ] All commands support `--json` flag
- [ ] All commands support `--no-interactive` flag

### ✅ Output Design
- [ ] Primary data goes to stdout
- [ ] Logs/progress go to stderr
- [ ] Default output is human-readable AND line-parseable
- [ ] JSON output has stable schema with version number
- [ ] No ANSI colors when stdout is not a TTY

### ✅ Error Handling
- [ ] Exit codes follow semantic ranges (0, 80-89, 90-99, 100-109, 110-119)
- [ ] Errors include `type`, `code`, `recoverable`, `suggestions`
- [ ] Error messages on stderr
- [ ] JSON errors when `--json` flag used
- [ ] No retry logic (let agents decide)

### ✅ Composability
- [ ] Commands can be piped together
- [ ] Output can be filtered with grep/awk/jq
- [ ] Commands accept stdin when appropriate
- [ ] Each command has single responsibility

### ✅ Documentation
- [ ] `--help` provides human-readable usage
- [ ] `--help-json` provides machine-readable schema
- [ ] Examples in help show composition patterns
- [ ] Error messages include suggestions for next steps

---

## References

1. **InfoQ Article**: "Keep the Terminal Relevant: Patterns for AI Agent Driven CLIs" (August 2025)
   - URL: https://www.infoq.com/articles/ai-agent-cli/
   - Machine-friendly escape hatches
   - Output as API contracts
   - Real-time feedback patterns

2. **Square Engineering**: "Command Line Observability with Semantic Exit Codes" (January 2023)
   - URL: https://developer.squareup.com/blog/command-line-observability-with-semantic-exit-codes/
   - Exit code ranges: 80-99 user errors, 100-119 software errors
   - Error type separation for SLOs

3. **Command Line Interface Guidelines** (clig.dev)
   - URL: https://clig.dev/
   - GitHub: https://github.com/cli-guidelines/cli-guidelines
   - Unix philosophy application to modern CLIs
   - Output separation (stdout/stderr)
   - Composability patterns

4. **Unix Philosophy** (Bell Labs, Doug McIlroy, 1978)
   - Classic formulation of "do one thing well"
   - Expect output to become input to another program
   - Design for composition

5. **AWS CLI / Azure CLI Documentation**
   - AWS CLI: https://docs.aws.amazon.com/cli/
   - Azure CLI: https://docs.microsoft.com/en-us/cli/azure/
   - Multi-format output patterns
   - Consistent flag conventions
   - JMESPath query integration

---

## Conclusion

Agent-friendly tools are not a separate category from good CLI tools - they are an evolution that takes Unix philosophy seriously while adapting to LLM constraints.

**The Core Insight:** Design for deterministic, composable operations with structured output. This serves both agents (who need parseable data) and humans (who benefit from predictability).

**For bdg:** Every design decision should ask: "Does this help an agent make decisions?" If yes, implement it. If no, remove it.