caplets 0.18.0 → 0.18.1

Files changed (3)
  1. package/README.md +95 -52
  2. package/dist/index.js +2 -79525
  3. package/package.json +2 -2
package/README.md CHANGED
@@ -28,7 +28,7 @@
 
  Caplets turns MCP servers, APIs, and commands into focused agent capabilities: one card first, searchable tools next, inspectable schemas before calls, and preserved results after.
 
- Stop dumping every operation into context up front. Caplets wraps each tool source as a capability an agent can discover, inspect, call, and recover from one step at a time. Instead of exposing a giant flat wall of operations, Caplets shows a compact capability card with source, status, and next actions. The agent chooses a domain first, then uses scoped operations like `search_tools`, `get_tool`, and `call_tool` only when it needs more detail.
+ Stop dumping every operation into context up front. Caplets wraps each tool source as a capability an agent can discover, inspect, call, and recover from one step at a time. Instead of exposing a giant flat wall of operations, Caplets shows a compact capability card with source, status, and next actions. The agent chooses a domain first, then uses scoped operations like `search_tools`, `describe_tool`, and `call_tool` only when it needs more detail.
 
  For MCP-backed Caplets, the scoped operation set also includes resource discovery and reading, prompt listing and rendering, resource-template discovery, and completion for prompt or template arguments. Non-MCP backends expose focused tool and action operations.
 
@@ -43,7 +43,7 @@ caplets add mcp context7 --command npx --arg -y --arg @upstash/context7-mcp
  caplets serve
  ```
 
- In the deterministic benchmark, 106 flat tools became 3 top-level capabilities with an 87.9% smaller initial payload. Your agent starts with `context7`, then drills in through `inspect`, `search_tools`, `get_tool`, and `call_tool` only when needed.
+ In the deterministic benchmark, 106 flat tools became 3 top-level capabilities with an 87.9% smaller initial payload. Your agent starts with `context7`, then drills in through `inspect`, `search_tools`, `describe_tool`, and `call_tool` only when needed.
 
  ## Quick Start
 
@@ -139,15 +139,23 @@ Backends that require OAuth or token auth may need `caplets auth login <server>`
  Use Caplets as a normal MCP server everywhere, or install a native agent integration when
  your coding agent supports one.
 
- | Agent | Install | What It Provides |
- | -------------- | -------------------------------------------------------------- | -------------------------------------------------------------- |
- | Any MCP client | Add `caplets serve` or `caplets attach` manually in MCP config | Universal progressive-disclosure gateway |
- | Claude Code | Add `caplets serve` or `caplets attach` manually in MCP config | Local or remote/Cloud progressive-disclosure gateway |
- | Codex | Add `caplets serve` or `caplets attach` manually in MCP config | Local or remote/Cloud progressive-disclosure gateway |
- | OpenCode | Install [`@caplets/opencode`](packages/opencode/README.md) | Native `caplets_<id>` tools and prompt guidance hooks |
- | Pi | Install [`@caplets/pi`](packages/pi/README.md) | Native `caplets_<id>` tools with Pi prompt snippets/guidelines |
+ | Agent | Install | What It Provides |
+ | -------------- | -------------------------------------------------------------- | --------------------------------------------------------------- |
+ | Any MCP client | Add `caplets serve` or `caplets attach` manually in MCP config | Universal Code Mode gateway; progressive exposure is opt-in |
+ | Claude Code | Add `caplets serve` or `caplets attach` manually in MCP config | Local or remote/Cloud Code Mode gateway |
+ | Codex | Add `caplets serve` or `caplets attach` manually in MCP config | Local or remote/Cloud Code Mode gateway |
+ | OpenCode | Install [`@caplets/opencode`](packages/opencode/README.md) | Native `caplets__<id>` tools and prompt guidance hooks |
+ | Pi | Install [`@caplets/pi`](packages/pi/README.md) | Native `caplets__<id>` tools with Pi prompt snippets/guidelines |
 
- Manual local MCP config:
+ Codex local MCP config (`~/.codex/config.toml`):
+
+ ```toml
+ [mcp_servers.caplets]
+ command = "caplets"
+ args = ["serve"]
+ ```
+
+ Claude Code or generic JSON MCP config:
 
  ```json
  {
@@ -160,7 +168,15 @@ Manual local MCP config:
  }
  ```
 
- Manual remote or Cloud MCP config:
+ Codex remote or Cloud MCP config (`~/.codex/config.toml`):
+
+ ```toml
+ [mcp_servers.caplets]
+ command = "caplets"
+ args = ["attach"]
+ ```
+
+ Claude Code or generic JSON remote or Cloud MCP config:
 
  ```json
  {
@@ -195,7 +211,7 @@ Core Alchemy deploys the public landing page from `apps/landing`. It does not de
 
  ### Remote Caplets service
 
- OpenCode and Pi can use native `caplets_<id>` tools backed by a remote Caplets HTTP service. Codex, Claude Code, and any MCP client can connect to the same remote MCP endpoint directly.
+ OpenCode and Pi can use native `caplets__<id>` tools backed by a remote Caplets HTTP service. Codex, Claude Code, and any MCP client can connect to the same remote MCP endpoint directly.
 
  Hosted Caplets Cloud uses browser-mediated Cloud Auth:
 
@@ -312,38 +328,54 @@ Flat tool lists make agents guess before they understand. If every downstream se
  Caplets turns that flat wall into a staged path:
 
  1. **Choose** a capability, such as `GitHub`.
- 2. **Inspect** matching operations with `search_tools` or `list_tools`.
- 3. **Resolve** the exact schema with `get_tool`.
+ 2. **Inspect** matching operations with `search_tools` or `tools`.
+ 3. **Resolve** the exact schema with `describe_tool`.
  4. **Invoke** with `call_tool` while preserving downstream content, structured data, and error state.
 
  A backend enters agent context as a focused card with source, status, and next actions, not a wall of operations.
 
  ## Benchmark
 
- In Caplets' reproducible coding-agent benchmark, the same three mock MCP servers are
+ Caplets reduces the tool surface an agent has to carry while preserving access to the
+ same downstream operations.
+
+ In Caplets' deterministic coding-agent benchmark, the same seven mock MCP servers are
  exposed two ways: direct flat MCP aggregation versus Caplets progressive disclosure.
 
  | Initial Agent Surface | Direct Flat MCP | Caplets | Reduction |
  | ------------------------- | ----------------: | -----------: | ------------: |
- | Visible tools | 106 | 3 | 97.2% fewer |
- | Serialized MCP payload | 32,090 bytes | 8,442 bytes | 73.7% smaller |
- | Approx. context surface | 8,023 tokens | 2,111 tokens | 5,912 fewer |
+ | Visible tools | 215 | 7 | 96.7% fewer |
+ | Serialized MCP payload | 63,250 bytes | 12,720 bytes | 79.9% smaller |
+ | Approx. context surface | 15,813 tokens | 3,180 tokens | 12,633 fewer |
  | Top-level name collisions | 3 duplicate names | 0 | eliminated |
 
  Caplets does not remove access to downstream tools. It places them behind scoped
  discovery operations, so the agent sees less up front while retaining access to the same
  capabilities when needed.
 
- A local OpenCode live benchmark also completed the full benchmark matrix successfully:
-
- | Agent | Mode | Tasks Passed |
- | ------------------------------ | --------------- | -----------: |
- | OpenCode `openai/gpt-5.5-fast` | Direct flat MCP | 2/2 |
- | OpenCode `openai/gpt-5.5-fast` | Caplets | 2/2 |
-
- Live results are intentionally not committed as product claims because they depend on
- local agent CLIs, credentials, models, providers, and agent behavior. The deterministic
- surface benchmark is the reproducible claim.
+ In a live Pi eval on a real-world large MCP stack, Caplets Code Mode completed the same
+ 10/10 tasks as direct MCP and Executor while using far fewer total tokens. The stack used
+ GitHub, Context7, DeepWiki, Git, filesystem, Playwright, ast-grep, language-server, and
+ web-search MCP servers. The run used `openai-codex/gpt-5.5` as both the main model and
+ judge model, with 2 runs per task per mode.
+
+ | Mode | Tasks Passed | Avg request + output tokens | Avg provider tokens |
+ | ------------------------------- | -----------: | --------------------------: | ------------------: |
+ | Caplets Code Mode | 10/10 | 236,803 | 126,877 |
+ | Caplets progressive + Code Mode | 10/10 | 422,861 | 264,624 |
+ | Caplets progressive | 10/10 | 461,171 | 294,217 |
+ | Executor MCP | 10/10 | 675,842 | 369,992 |
+ | Direct vanilla MCP | 10/10 | 846,048 | 544,121 |
+
+ Against the same pass-rate baseline, Caplets Code Mode used 72.0% fewer request+output
+ tokens than direct vanilla MCP and 65.0% fewer than Executor MCP. Caplets progressive
+ disclosure also beat direct vanilla MCP by 45.5% and Executor MCP by 31.8% on
+ request+output tokens.
+
+ Live results depend on local agent CLIs, credentials, model/provider behavior, and the
+ date of the run. The deterministic surface benchmark remains the reproducible,
+ credential-free claim; the live eval demonstrates the same trend in a realistic large
+ MCP harness.
 
  See [`docs/benchmarks/coding-agent.md`](docs/benchmarks/coding-agent.md) for methodology,
  limitations, and reproduction commands.
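Reviewer note: the reduction figures added in the hunk above can be re-derived from the raw numbers in the same tables. The following is a minimal sketch of that arithmetic, not part of the package; `percent_reduction` and the variable names are hypothetical, and the inputs are copied from the added (`+`) table rows.

```python
def percent_reduction(baseline: float, value: float) -> float:
    """Return how much smaller `value` is than `baseline`, in percent (one decimal)."""
    return round((1 - value / baseline) * 100, 1)


# Deterministic surface benchmark, 0.18.1 rows.
visible_tools = percent_reduction(215, 7)            # "96.7% fewer"
payload_bytes = percent_reduction(63_250, 12_720)    # "79.9% smaller"
tokens_saved = 15_813 - 3_180                        # "12,633 fewer"

# Live Pi eval, average request + output tokens per mode.
VANILLA, EXECUTOR = 846_048, 675_842
CODE_MODE, PROGRESSIVE = 236_803, 461_171

code_mode_vs_vanilla = percent_reduction(VANILLA, CODE_MODE)
code_mode_vs_executor = percent_reduction(EXECUTOR, CODE_MODE)
progressive_vs_vanilla = percent_reduction(VANILLA, PROGRESSIVE)
progressive_vs_executor = percent_reduction(EXECUTOR, PROGRESSIVE)

print(visible_tools, payload_bytes, tokens_saved)
print(code_mode_vs_vanilla, code_mode_vs_executor)
print(progressive_vs_vanilla, progressive_vs_executor)
```

Each computed value matches the corresponding figure quoted in the added README text, so the new tables and the summary paragraph are internally consistent.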
@@ -352,7 +384,7 @@ limitations, and reproduction commands.
  pnpm benchmark
  pnpm benchmark:check
  pnpm build
- CAPLETS_BENCH_LIVE=1 pnpm benchmark:live:opencode -- --model openai/gpt-5.5-fast
+ CAPLETS_BENCH_LIVE=1 pnpm benchmark:live:pi-eval -- --task-suite mcp-real-world-large --mode caplets-code-mode,caplets-progressive,vanilla-mcp,executor-mcp --model openai-codex/gpt-5.5 --runs 2
  ```
 
  ## Design Model
@@ -390,7 +422,7 @@ If a backend fails, Caplets keeps the error scoped to the capability, preserves
  - Uses the configured `name` and `description` as the capability card shown to agents.
  - Starts downstream MCP servers and loads OpenAPI specs lazily when an operation needs them.
  - Supports stdio, Streamable HTTP, and legacy HTTP+SSE downstream servers.
- - Lets agents `list_tools`, `search_tools`, `get_tool`, and `call_tool` within one selected Caplet namespace.
+ - Lets agents `tools`, `search_tools`, `describe_tool`, and `call_tool` within one selected Caplet namespace.
  - Converts OpenAPI operations into MCP-style tool metadata and executes HTTP calls directly.
  - Converts configured GraphQL operations into MCP-style tool metadata, and can auto-generate GraphQL tools from schema root query and mutation fields.
  - Converts explicitly configured HTTP actions into MCP-style tool metadata and executes HTTP calls directly.
@@ -780,7 +812,7 @@ OpenAPI auth is explicit and supports:
  - `{"type": "oauth2", ...}`
  - `{"type": "oidc", ...}`
 
- OpenAPI `call_tool.arguments` uses grouped HTTP inputs:
+ OpenAPI `call_tool.args` uses grouped HTTP inputs:
 
  ```json
  {
@@ -824,7 +856,7 @@ endpoint and exactly one schema source: `schemaPath`, `schemaUrl`, or `introspec
 
  When `operations` is omitted or empty, Caplets auto-generates tools from schema root
  fields: `query_<field>` and `mutation_<field>`. Generated tools use bounded scalar
- selection sets and pass `call_tool.arguments` directly as GraphQL variables/root-field
+ selection sets and pass `call_tool.args` directly as GraphQL variables/root-field
  arguments.
 
  Every GraphQL endpoint can set:
@@ -878,7 +910,7 @@ must start with `/` and be URL paths that cannot change origin or escape the bas
  Action mappings can set `query`, `headers`, and `jsonBody`. `query` and `headers` must resolve
  to object maps whose values are strings, numbers, or booleans. `jsonBody` may use literals,
  nested arrays/objects, `$input.field` references, or `$input` for the whole argument object.
- Path placeholders such as `{service}` are read directly from `call_tool.arguments` and URL-encoded.
+ Path placeholders such as `{service}` are read directly from `call_tool.args` and URL-encoded.
  Configured action headers cannot set managed headers such as `authorization`, `host`,
  `content-length`, `connection`, or `content-type`; JSON bodies set `content-type` automatically.
 
@@ -939,8 +971,8 @@ an existing destination file.
  ### Caplet Sets
 
  Use `capletSets` to expose another Caplets collection as nested Caplets. Each child Caplet appears
- as one downstream tool and supports the full Caplets operation set: `inspect`, `check_backend`,
- `list_tools`, `search_tools`, `get_tool`, and `call_tool`.
+ as one downstream tool and supports the full Caplets operation set: `inspect`, `check`,
+ `tools`, `search_tools`, `describe_tool`, and `call_tool`.
 
  ```json
  {
@@ -1081,7 +1113,13 @@ their downstream connections keep running.
 
  ## Quick Integration Setup
 
- Use `caplets setup` to install or configure an agent integration:
+ Run the interactive setup flow to choose one or more agent integrations:
+
+ ```bash
+ caplets setup
+ ```
+
+ For scripted setup, pass the integration explicitly:
 
  ```bash
  caplets setup codex
@@ -1100,17 +1138,21 @@ caplets setup codex --dry-run
  For native integrations that should connect to a remote Caplets HTTP service:
 
  ```bash
- caplets setup opencode --remote --server-url https://caplets.example.com/caplets
+ caplets setup codex --remote-url https://caplets.example.com/caplets
+ caplets setup claude-code --remote-url https://caplets.example.com/caplets
+ caplets setup opencode --remote-url https://caplets.example.com/caplets
  ```
 
- `caplets setup` runs the supported agent installer commands or writes the explicit config
- path you pass with `--output`. It does not store secrets, edit unknown MCP client config
- locations, or start `caplets serve`.
+ For Codex and Claude Code, `caplets setup` uses each harness's MCP configuration command:
+ `codex mcp add caplets -- caplets serve` and
+ `claude mcp add --transport stdio --scope user caplets -- caplets serve`. Generic MCP
+ clients still require an explicit `--output` path because their config locations are not
+ standardized. The setup command does not store secrets or start `caplets serve`.
 
  ## Additional Native Integrations
 
  OpenCode and Pi support true native tool registration. Those integrations expose one
- prefixed tool per configured Caplet, such as `caplets_github`, while reusing the same
+ prefixed tool per configured Caplet, such as `caplets__github`, while reusing the same
  Caplets config and backend runtime.
 
  - [`@caplets/opencode`](packages/opencode/README.md): OpenCode plugin that injects prompt guidance through plugin hooks instead of editing `opencode.json`.
@@ -1135,7 +1177,7 @@ Each generated Caplet tool accepts an `operation`:
 
  ```json
  {
- "operation": "list_tools"
+ "operation": "tools"
  }
  ```
 
@@ -1153,7 +1195,7 @@ Inspect one exact downstream tool:
 
  ```json
  {
- "operation": "get_tool",
+ "operation": "describe_tool",
  "tool": "read_file"
  }
  ```
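Reviewer note: the hunks above rename three operations (`list_tools` to `tools`, `get_tool` to `describe_tool`, `check_backend` to `check`) and change the native tool prefix from `caplets_<id>` to `caplets__<id>`. As a rough sketch of how stored 0.18.0-style payloads might be updated, assuming requests are plain JSON objects with an `operation` field as in the README examples, the helpers below apply only those renames. `migrate_request` and `migrate_native_tool_name` are hypothetical names, not part of Caplets, and argument fields are deliberately left untouched.

```python
# Operation renames visible in the 0.18.0 -> 0.18.1 README diff.
OPERATION_RENAMES = {
    "list_tools": "tools",
    "get_tool": "describe_tool",
    "check_backend": "check",
}

OLD_PREFIX = "caplets_"
NEW_PREFIX = "caplets__"


def migrate_request(request: dict) -> dict:
    """Return a shallow copy of a Caplet request with a renamed operation, if any."""
    migrated = dict(request)
    operation = migrated.get("operation")
    if operation in OPERATION_RENAMES:
        migrated["operation"] = OPERATION_RENAMES[operation]
    return migrated


def migrate_native_tool_name(name: str) -> str:
    """Rewrite `caplets_<id>` to `caplets__<id>`; leave other names unchanged."""
    if name.startswith(NEW_PREFIX) or not name.startswith(OLD_PREFIX):
        return name
    return NEW_PREFIX + name[len(OLD_PREFIX):]


old_request = {"operation": "get_tool", "tool": "read_file"}
print(migrate_request(old_request))
print(migrate_native_tool_name("caplets_github"))
```

Operations that keep their names (`inspect`, `search_tools`, `call_tool`) pass through unchanged, and the helper is idempotent, so running it over already-migrated payloads is harmless.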
@@ -1173,23 +1215,23 @@ Call one exact downstream tool:
  Available operations:
 
  - `inspect`: return the configured capability card without starting the downstream server.
- - `check_backend`: verify the selected backend, whether MCP, OpenAPI, GraphQL, HTTP, CLI, or nested Caplets.
- - `list_tools`: return compact downstream tool metadata.
+ - `check`: verify the selected backend, whether MCP, OpenAPI, GraphQL, HTTP, CLI, or nested Caplets.
+ - `tools`: return compact downstream tool metadata.
  - `search_tools`: search downstream tool names and descriptions within this Caplet.
- - `get_tool`: return full metadata for one exact downstream tool.
+ - `describe_tool`: return full metadata for one exact downstream tool.
  - `call_tool`: invoke one exact downstream tool with JSON object arguments.
 
  Requests are strict: operation-specific extra fields are rejected, and `call_tool` requires
  `arguments` to be a JSON object.
 
- Discovery operations (`inspect`, `check_backend`, `list_tools`, `search_tools`, and
- `get_tool`) return wrapper-generated results whose `structuredContent.caplets` field
+ Discovery operations (`inspect`, `check`, `tools`, `search_tools`, and
+ `describe_tool`) return wrapper-generated results whose `structuredContent.caplets` field
  identifies the Caplet with `id`, plus backend, operation, status, and elapsed time when
  available. Discovery result objects and compact tool entries also use `id` for the
- configured Caplet identity. Compact `list_tools` and `search_tools` entries may include
+ configured Caplet identity. Compact `tools` and `search_tools` entries may include
  input/output schema hashes; treat those
  hashes as reuse hints for a schema you have already inspected, not as a replacement for
- `get_tool` when arguments, output, or semantics are unclear.
+ `describe_tool` when arguments, output, or semantics are unclear.
 
  Direct `call_tool` preserves the downstream tool result shape instead of wrapping it in
  `structuredContent.result`. When the result can carry MCP metadata, Caplets adds
@@ -1199,8 +1241,9 @@ or other saved files. Artifact `displayPath` values are either absolute local pa
  relative to the downstream MCP server process, not necessarily relative to the current
  project or Caplets process.
 
- For first use, the explicit progressive-discovery path is still safest: choose a Caplet,
- `search_tools` or `list_tools`, inspect uncertain tools with `get_tool`, then `call_tool`.
+ Code Mode is the default exposure because it keeps discovery, filtering, execution, and
+ summary work inside one compact tool call. To expose the older progressive wrapper tools,
+ set `options.exposure` to `progressive` or `progressive_and_code_mode`.
 
  ## Development
 