caplets 0.9.0 → 0.11.0

package/README.md CHANGED
@@ -11,6 +11,52 @@ or call that backend's underlying tools or operations.
  This keeps the initial MCP tool list small, makes tool selection easier, and avoids
  flattened tool-name collisions across servers.
  
+ ## Why It Matters
+
+ Large MCP setups make agents worse before they make them better. If every downstream
+ server exposes every tool up front, the model starts with a noisy flat list, duplicate
+ tool names, and a bigger context surface before it knows which capability matters.
+
+ Caplets turns that flat tool wall into progressive disclosure: one capability card first,
+ then scoped discovery only after the agent chooses the relevant domain.
+
+ ## Benchmark Results
+
+ In Caplets' reproducible coding-agent benchmark, the same three mock MCP servers are
+ exposed two ways: direct flat MCP aggregation versus Caplets progressive disclosure.
+
+ | Initial Agent Surface     |   Direct Flat MCP |      Caplets |     Reduction |
+ | ------------------------- | ----------------: | -----------: | ------------: |
+ | Visible tools             |               106 |            3 |   97.2% fewer |
+ | Serialized MCP payload    |      32,090 bytes |  8,400 bytes | 73.8% smaller |
+ | Approx. context surface   |      8,023 tokens | 2,100 tokens |   5,923 fewer |
+ | Top-level name collisions | 3 duplicate names |            0 |    eliminated |
+
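The Reduction column can be rechecked from the other two columns. The sketch below is illustrative arithmetic only: the helper names are hypothetical, and the four-bytes-per-token ratio is an assumption that happens to reproduce the token row. `docs/benchmarks/coding-agent.md` remains the authority on how the benchmark actually estimates tokens.

```typescript
// Hypothetical helpers; only the input figures come from the table above.
interface Surface {
  tools: number;
  payloadBytes: number;
}

// Assumption: roughly four bytes per token. This ratio reproduces the
// "Approx. context surface" row but is not confirmed by the README.
const BYTES_PER_TOKEN = 4;

function approxTokens(bytes: number): number {
  return Math.round(bytes / BYTES_PER_TOKEN);
}

// Percentage drop from `before` to `after`, formatted to one decimal place.
function percentReduction(before: number, after: number): string {
  return (((before - after) / before) * 100).toFixed(1);
}

const flat: Surface = { tools: 106, payloadBytes: 32090 };
const caplets: Surface = { tools: 3, payloadBytes: 8400 };

const tokensSaved = approxTokens(flat.payloadBytes) - approxTokens(caplets.payloadBytes);

console.log(`${percentReduction(flat.tools, caplets.tools)}% fewer visible tools`);
console.log(`${percentReduction(flat.payloadBytes, caplets.payloadBytes)}% smaller payload`);
console.log(`${tokensSaved} fewer approximate tokens`);
```

Under that assumption the three lines print 97.2%, 73.8%, and 5923, matching the table.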
+ The important part: Caplets does not remove access to the downstream tools. It hides
+ them behind scoped discovery operations like `search_tools`, `get_tool`, and `call_tool`,
+ so the agent sees less up front while still being able to reach the same capabilities.
+
+ A local OpenCode live benchmark also completed the full benchmark matrix successfully:
+
+ | Agent                          | Mode            | Tasks Passed |
+ | ------------------------------ | --------------- | -----------: |
+ | OpenCode `openai/gpt-5.5-fast` | Direct flat MCP |          2/2 |
+ | OpenCode `openai/gpt-5.5-fast` | Caplets         |          2/2 |
+
+ Live results are intentionally not committed as product claims because they depend on
+ local agent CLIs, credentials, models, providers, and agent behavior. The deterministic
+ surface benchmark is the reproducible claim.
+
+ See [`docs/benchmarks/coding-agent.md`](docs/benchmarks/coding-agent.md) for methodology,
+ limitations, and reproduction commands.
+
+ ```sh
+ pnpm benchmark
+ pnpm benchmark:check
+ pnpm build
+ CAPLETS_BENCH_LIVE=1 pnpm benchmark:live:opencode -- --model openai/gpt-5.5-fast
+ ```
+
  ## Inspiration
  
  Caplets is a mashup of two ideas that work well separately but leave a gap together:
@@ -28,8 +74,8 @@ the agent chooses that server and asks to search, list, inspect, or call them.
  
  ## What It Does
  
- - Reads downstream MCP server definitions, native OpenAPI endpoint definitions, native GraphQL endpoint definitions, and explicit HTTP API action definitions from the user config file.
- - Registers one generated MCP tool for each enabled MCP server, OpenAPI endpoint, GraphQL endpoint, or HTTP API.
+ - Reads downstream MCP server definitions, native OpenAPI endpoint definitions, native GraphQL endpoint definitions, explicit HTTP API action definitions, and curated CLI tool definitions from the user config file.
+ - Registers one generated MCP tool for each enabled MCP server, OpenAPI endpoint, GraphQL endpoint, HTTP API, or CLI tools backend.
  - Uses the configured server ID as the generated tool name.
  - Uses the configured `name` and `description` as the capability card shown to agents.
  - Starts downstream MCP servers and loads OpenAPI specs lazily when an operation needs them.
@@ -38,6 +84,7 @@ the agent chooses that server and asks to search, list, inspect, or call them.
  - Converts OpenAPI operations into MCP-style tool metadata and executes HTTP calls directly.
  - Converts configured GraphQL operations into MCP-style tool metadata, and can auto-generate GraphQL tools from schema root query and mutation fields.
  - Converts explicitly configured HTTP actions into MCP-style tool metadata and executes HTTP calls directly.
+ - Converts explicitly configured CLI actions into MCP-style tool metadata and executes commands directly without a shell.
  - Preserves downstream tool results instead of rewriting them into a custom format.
  - Redacts secrets from structured errors.
  - Supports static remote auth and OAuth token storage for remote servers.
@@ -172,7 +219,7 @@ the committed schema stays in sync with the Zod config validator.
  
  For richer skill-like cards, add Markdown Caplet files beside `config.json`. Every Caplet
  file must include exactly one executable backend: `mcpServer`, `openapiEndpoint`,
- `graphqlEndpoint`, or `httpApi`;
+ `graphqlEndpoint`, `httpApi`, or `cliTools`;
  serverless Caplets are intentionally out of scope.
  
  Top-level files derive the Caplet ID from the filename:
@@ -255,6 +302,26 @@ httpApi:
  # Status API
  ```
  
+ CLI-backed Caplet files use `cliTools`:
+
+ ```md
+ ---
+ name: Repository CLI
+ description: Run curated repository workflows through local CLI commands.
+ cliTools:
+   cwd: /home/you/project
+   actions:
+     git_status:
+       description: Show concise Git working tree status.
+       command: git
+       args: ["status", "--short"]
+       annotations:
+         readOnlyHint: true
+ ---
+
+ # Repository CLI
+ ```
+
  Top-level files derive their Caplet ID from the filename. Directory-style Caplets use
  `linear/CAPLET.md`, which is exposed as `linear`; sibling files can be referenced with
  normal Markdown links from `CAPLET.md`.
@@ -264,6 +331,8 @@ This repository includes polished working examples under [`caplets/`](caplets/):
  - `github`: GitHub's official MCP server container, using `GITHUB_PERSONAL_ACCESS_TOKEN`.
  - `linear`: Linear's hosted OAuth MCP endpoint.
  - `context7`: Context7 documentation lookup through `@upstash/context7-mcp`.
+ - `repo-cli`: Read-oriented repository CLI workflows through `git` and package scripts.
+ - `github-cli`: Read-oriented GitHub workflows through the `gh` CLI.
  
  Install every example from a repo's `caplets/` directory:
  
@@ -304,7 +373,7 @@ caplets init --force
  
  ### Caplet IDs
  
- Each key under `mcpServers`, `openapiEndpoints`, `graphqlEndpoints`, or `httpApis` is the
+ Each key under `mcpServers`, `openapiEndpoints`, `graphqlEndpoints`, `httpApis`, or `cliTools` is the
  stable Caplet ID. It becomes the generated MCP tool name exactly, so keep it short and specific:
  
  ```json
@@ -321,7 +390,7 @@ stable Caplet ID. It becomes the generated MCP tool name exactly, so keep it sho
  ```
  
  Caplet IDs must match `^[a-zA-Z0-9_-]{1,64}$` and must be unique across `mcpServers`,
- `openapiEndpoints`, `graphqlEndpoints`, and `httpApis`. Spaces, dots, slashes, colons, and Unicode IDs are rejected.
+ `openapiEndpoints`, `graphqlEndpoints`, `httpApis`, and `cliTools`. Spaces, dots, slashes, colons, and Unicode IDs are rejected.
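The ID rule above can be checked with a few lines of TypeScript. This is a minimal sketch, not the package's Zod validator: the `ConfigSketch` type and `findIdProblems` function are hypothetical names, and only the regular expression and the five section names come from the README.

```typescript
// Pattern quoted from the README; everything else here is illustrative.
const CAPLET_ID_PATTERN = /^[a-zA-Z0-9_-]{1,64}$/;

const SECTIONS = [
  "mcpServers",
  "openapiEndpoints",
  "graphqlEndpoints",
  "httpApis",
  "cliTools",
] as const;

type SectionName = (typeof SECTIONS)[number];

// Hypothetical config shape: each section maps Caplet IDs to backend definitions.
type ConfigSketch = Partial<Record<SectionName, Record<string, unknown>>>;

// Returns one message per invalid or duplicated Caplet ID.
function findIdProblems(config: ConfigSketch): string[] {
  const problems: string[] = [];
  const firstSeenIn = new Map<string, SectionName>();

  for (const section of SECTIONS) {
    for (const id of Object.keys(config[section] ?? {})) {
      if (!CAPLET_ID_PATTERN.test(id)) {
        problems.push(`${section}.${id}: invalid Caplet ID`);
      }
      const earlier = firstSeenIn.get(id);
      if (earlier !== undefined) {
        problems.push(`${section}.${id}: duplicate of ${earlier}.${id}`);
      } else {
        firstSeenIn.set(id, section);
      }
    }
  }
  return problems;
}

const problems = findIdProblems({
  mcpServers: { github: {} },
  cliTools: { github: {}, "repo.cli": {} },
});
console.log(problems);
```

Here `github` is flagged because it appears under both `mcpServers` and `cliTools`, and `repo.cli` is flagged because dots are not allowed.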
  
  ### Stdio Servers
  
@@ -491,6 +560,49 @@ parsed `body` when present, and `elapsedMs`; non-2xx responses set `isError`, re
  timeouts are enforced, response bodies are capped by `maxResponseBytes` (default `1000000`), and
  errors redact secrets.
  
+ ### CLI Tools
+
+ Use `cliTools` for curated local command-line workflows. Each action is an explicitly configured
+ tool; Caplets does not expose arbitrary shell access and always spawns `command` plus `args`
+ without shell interpolation.
+
+ ```json
+ {
+   "name": "Repository CLI",
+   "description": "Run curated repository workflows through local CLI commands.",
+   "cwd": "/home/you/project",
+   "timeoutMs": 60000,
+   "maxOutputBytes": 1000000,
+   "actions": {
+     "git_status": {
+       "description": "Show concise Git working tree status.",
+       "command": "git",
+       "args": ["status", "--short"],
+       "annotations": { "readOnlyHint": true }
+     },
+     "run_tests": {
+       "description": "Run the package test script.",
+       "command": "pnpm",
+       "args": ["run", "test"],
+       "timeoutMs": 120000,
+       "annotations": { "readOnlyHint": true }
+     }
+   }
+ }
+ ```
+
+ CLI actions can set `inputSchema`, `outputSchema`, `env`, action-level `cwd`, `timeoutMs`,
+ `maxOutputBytes`, `output: {"type":"json"}`, and MCP annotations. `$input.field` references are
+ supported inside `args`, `env`, and `cwd` strings. Caplets performs basic required-field and
+ primitive-type validation before spawning. Results are returned as structured content with
+ `exitCode`, `stdout`, `stderr`, and `elapsedMs`; non-zero exits set `isError`.
+
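To make the paragraph above concrete, here is a rough sketch of how `$input.field` references could be resolved into an argument vector and how a finished process could be mapped to a result. It is not Caplets' implementation: `ActionSketch`, `resolveRefs`, `buildArgv`, and `toToolResult` are hypothetical names, and the exact reference syntax Caplets accepts (for example, whether a reference may be embedded in a longer string) is an assumption here.

```typescript
type Primitive = string | number | boolean;
type Input = Record<string, Primitive>;

// Hypothetical, trimmed-down view of one configured CLI action.
interface ActionSketch {
  command: string;
  args: string[];
  required?: string[]; // stand-in for the required list of an inputSchema
}

// Replace every `$input.<name>` token in a string with the matching input value.
function resolveRefs(template: string, input: Input): string {
  return template.replace(
    /\$input\.([A-Za-z_][A-Za-z0-9_]*)/g,
    (_match: string, field: string): string => {
      if (!(field in input)) throw new Error(`missing input field: ${field}`);
      return String(input[field]);
    },
  );
}

// Check required fields, then build [command, ...args]. Each configured arg
// stays exactly one array element, whatever the substituted value contains.
function buildArgv(action: ActionSketch, input: Input): string[] {
  for (const field of action.required ?? []) {
    if (!(field in input)) throw new Error(`missing required field: ${field}`);
  }
  return [action.command, ...action.args.map((arg) => resolveRefs(arg, input))];
}

// Result fields named in the README; the wrapper shape is an assumption.
interface CliResult {
  exitCode: number;
  stdout: string;
  stderr: string;
  elapsedMs: number;
}

function toToolResult(result: CliResult): { structuredContent: CliResult; isError: boolean } {
  return { structuredContent: result, isError: result.exitCode !== 0 };
}

const gitLog: ActionSketch = {
  command: "git",
  args: ["log", "--oneline", "$input.ref"],
  required: ["ref"],
};

const argv = buildArgv(gitLog, { ref: "main; echo pwned" });
console.log(argv);
```

Because the hostile-looking value stays a single array element, a no-shell spawn (for instance Node's `child_process.spawn`, whose `shell` option defaults to `false`) would pass it to `git` as one literal argument rather than letting a shell interpret the `;`.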
+ Generate a reviewable CLI Caplet manifest from a repository:
+
+ ```sh
+ caplets author cli repo-tools --repo . --include git,gh,package --output -
+ ```
+
  ### Authentication
  
  Remote servers can use:
@@ -0,0 +1,41 @@
+ ---
+ $schema: https://raw.githubusercontent.com/spiritledsoftware/caplets/main/schemas/caplet.schema.json
+ name: GitHub CLI
+ description: Inspect GitHub pull requests and issues through curated gh CLI commands.
+ tags:
+   - cli
+   - github
+   - code
+ cliTools:
+   actions:
+     gh_pr_status:
+       description: Show pull request status for the current branch as JSON.
+       command: gh
+       args:
+         - pr
+         - status
+         - --json
+         - currentBranch
+       output:
+         type: json
+       annotations:
+         readOnlyHint: true
+         openWorldHint: true
+     gh_issue_list:
+       description: List open GitHub issues as JSON.
+       command: gh
+       args:
+         - issue
+         - list
+         - --json
+         - number,title,state,url
+       output:
+         type: json
+       annotations:
+         readOnlyHint: true
+         openWorldHint: true
+ ---
+
+ # GitHub CLI
+
+ Use this Caplet to expose read-oriented GitHub workflows through `gh` without giving the agent an unrestricted shell.
@@ -0,0 +1,37 @@
+ ---
+ $schema: https://raw.githubusercontent.com/spiritledsoftware/caplets/main/schemas/caplet.schema.json
+ name: Repository CLI
+ description: Inspect and run common local repository workflows through curated CLI tools.
+ tags:
+   - cli
+   - code
+ cliTools:
+   actions:
+     git_status:
+       description: Show concise Git working tree status.
+       command: git
+       args:
+         - status
+         - --short
+       annotations:
+         readOnlyHint: true
+     git_current_branch:
+       description: Print the current Git branch name.
+       command: git
+       args:
+         - branch
+         - --show-current
+       annotations:
+         readOnlyHint: true
+     package_test:
+       description: Run the repository test script with pnpm.
+       command: pnpm
+       args:
+         - run
+         - test
+       timeoutMs: 120000
+ ---
+
+ # Repository CLI
+
+ Use this Caplet to expose a small, typed set of local repository commands without giving an agent arbitrary shell access.