@napster-corp/webmcp-toolkit 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (48)
  1. package/LICENSE +21 -0
  2. package/README.md +531 -0
  3. package/bin/webmcp-toolkit.mjs +81 -0
  4. package/dist/debug.d.ts +5 -0
  5. package/dist/debug.d.ts.map +1 -0
  6. package/dist/debug.js +26 -0
  7. package/dist/debug.js.map +1 -0
  8. package/dist/dev-panel.d.ts +22 -0
  9. package/dist/dev-panel.d.ts.map +1 -0
  10. package/dist/dev-panel.js +1046 -0
  11. package/dist/dev-panel.js.map +1 -0
  12. package/dist/index.d.ts +6 -0
  13. package/dist/index.d.ts.map +1 -0
  14. package/dist/index.js +36 -0
  15. package/dist/index.js.map +1 -0
  16. package/dist/model-context.d.ts +13 -0
  17. package/dist/model-context.d.ts.map +1 -0
  18. package/dist/model-context.js +28 -0
  19. package/dist/model-context.js.map +1 -0
  20. package/dist/resources.d.ts +15 -0
  21. package/dist/resources.d.ts.map +1 -0
  22. package/dist/resources.js +179 -0
  23. package/dist/resources.js.map +1 -0
  24. package/dist/tiers.d.ts +31 -0
  25. package/dist/tiers.d.ts.map +1 -0
  26. package/dist/tiers.js +107 -0
  27. package/dist/tiers.js.map +1 -0
  28. package/dist/types.d.ts +145 -0
  29. package/dist/types.d.ts.map +1 -0
  30. package/dist/types.js +9 -0
  31. package/dist/types.js.map +1 -0
  32. package/hooks/post-commit +17 -0
  33. package/package.json +86 -0
  34. package/skills/add-edge-mcp-dev-panel/SKILL.md +206 -0
  35. package/skills/plan-capabilities-and-state/SKILL.md +168 -0
  36. package/skills/setup-edge-mcp/SKILL.md +546 -0
  37. package/skills/sync-webmcp-tools/SKILL.md +26 -0
  38. package/src/debug.ts +26 -0
  39. package/src/dev-panel.ts +1318 -0
  40. package/src/index.ts +66 -0
  41. package/src/model-context.ts +31 -0
  42. package/src/resources.ts +207 -0
  43. package/src/tiers.ts +132 -0
  44. package/src/types.ts +177 -0
  45. package/tools/generate-capabilities.mjs +266 -0
  46. package/tools/install-hook.mjs +81 -0
  47. package/tools/runners/anthropic.mjs +75 -0
  48. package/tools/runners/copilot.mjs +63 -0
@@ -0,0 +1,168 @@
1
+ ---
2
+ name: plan-capabilities-and-state
3
+ description: Analyze an existing web app and propose a curated plan for what an AI agent should be able to DO (capabilities) and SEE (live-state resources), with deliberate withholds. Use when the developer asks "what should I expose to the agent?", "help me figure out which capabilities to add", "what can my app do that an agent should know about?", or before running setup-edge-mcp on a new app. Produces a conversational plan that the developer reviews, edits, and approves — no file output. The setup-edge-mcp skill invokes this skill automatically if no plan has been agreed yet.
4
+ ---
5
+
6
+ # plan-capabilities-and-state
7
+
8
+ Read the developer's existing web app and propose a starter plan for a WebMCP integration: which operations should be exposed as **capabilities** the agent can invoke, which state slices should be exposed as **live-state resources** the agent can observe, and what should be deliberately **withheld**.
9
+
10
+ The output is a conversation, not a file. The developer reviews each item, edits or rejects freely, and approves the final list. That approved plan flows directly into the `setup-edge-mcp` skill, which turns it into code.
11
+
12
+ **Why this skill exists.** Without it, a developer setting up WebMCP starts from a blank canvas — they have to invent the capability list while also reasoning about side-effect tiers, the live-state resource gate, and what to deliberately withhold. That's high cognitive load. This skill does the inventory work first, so the developer reviews a structured proposal instead of brainstorming from scratch.
13
+
14
+ ## 0. Confirm you're reading the real target app
15
+
16
+ Before any analysis: skim the actual source — components, the store, the service calls the UI makes, the route structure. Don't trust ambient context files (`CLAUDE.md`, READMEs, project docs) if they describe a different project or contradict the code; the running code is the source of truth, and a stale context file has sent past runs down the wrong path.
17
+
18
+ If the codebase doesn't match the developer's description, stop and verify before proceeding.
19
+
20
+ ## 1. Identify the domain and the high-value workflows
21
+
22
+ Before listing individual capabilities, frame the bigger picture:
23
+
24
+ - **What is this app?** One sentence summarizing the domain. Apps that WebMCP fits include, but are not limited to:
25
+ - Commerce: *"Electronics e-commerce for logged-in customers"*
26
+ - Documentation / developer relations: *"API docs site with search, code samples, and a live-agent escape hatch"*
27
+ - Content / marketing: *"Marketing site with product pages, a blog, and a contact form"*
28
+ - Healthcare: *"Patient portal for appointment management"*
29
+ - Internal tools: *"CRM for sales reps"*, *"Read-mostly admin dashboard with a few destructive actions"*
30
+ - Productivity: *"Project management tool with tasks, comments, and a sidebar"*
31
+
32
+ Any of these is a valid target. Don't assume WebMCP only fits transactional apps — a docs site whose only "operations" are search, page-open, and copy-to-clipboard is just as legitimate as an e-commerce checkout.
33
+
34
+ - **Who is the user the agent will act on behalf of?** Often the signed-in user; for public or content-heavy sites it's the anonymous visitor. The agent runs scoped to that user's existing rights either way.
35
+
36
+ - **What are the two or three high-value workflows the agent should be good at?** Workflows look different per domain — match the *actual* shape of the app, not a transactional template:
37
+ - Commerce: *find and compare products, reorder a past purchase, start a return*
38
+ - Docs / dev rel: *find the right docs page, walk the user through API setup, copy snippets to clipboard, hand off to a live agent when needed*
39
+ - Content / marketing: *find a relevant article, jump to a specific section, submit a contact form*
40
+ - Healthcare: *schedule a follow-up, show upcoming visits*
41
+ - Internal tools: *look up a record, trigger a known recovery flow*
42
+
43
+ The capability list flows from these. A docs site with three workflows and five capabilities is a perfectly normal output of this skill — not a small or "weak" plan.
44
+
45
+ If you can't answer these from the codebase alone, ask the developer in plain language before going further.
46
+
47
+ ## 2. Propose candidate capabilities
48
+
49
+ For each high-value workflow, identify the real operations in the codebase that fulfill it. For each candidate, propose:
50
+
51
+ - **Name** — in the app's own domain terms, as `domain.verb`. Make cardinality obvious (`products.viewDetails` for one, `products.search` for many).
52
+ - **One-line purpose** — what the agent uses it for.
53
+ - **Arguments** — the input shape, in plain language drawn from the real function signature. List each argument with its type and whether it's required, or write "(no arguments)" if there are none. Example: "query (string, required), maxPrice (number, optional)". You are not writing JSON Schema yet — that's `setup-edge-mcp`'s job — but every capability's arguments must surface here, because the agent cannot use a capability whose input shape is unknown.
54
+ - **Side-effect tier** — using the table below.
55
+ - **Idempotency** — only relevant for `irreversible` tier; mark `true` only if the underlying operation tolerates safe retry (e.g. via an idempotency key).
56
+ - **Evidence** — the file and function in the codebase that backs it (e.g. `src/api/products.ts:searchProducts`). The developer should be able to grep-verify in seconds.
57
+ - **Confidence** — high / medium / low. Reserve low for cases where you're unsure the operation is technically exposable or appropriate.
58
+
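The per-candidate fields above can be pictured as a small record. This is a minimal sketch for illustration only: the type and field names (`CandidateCapability`, `ArgumentSketch`, `describeArgs`) are hypothetical and are not taken from the toolkit's exported types. It shows how the plain-language argument line might be derived from a structured candidate.

```typescript
// Hypothetical planning-time shapes; NOT the toolkit's real types.
type Tier = "read" | "reversible" | "irreversible";
type Confidence = "high" | "medium" | "low";

interface ArgumentSketch {
  name: string;
  type: string; // plain-language type, not JSON Schema
  required: boolean;
}

interface CandidateCapability {
  name: string; // domain.verb, e.g. "products.search"
  purpose: string; // one line
  args: ArgumentSketch[]; // empty array means "(no arguments)"
  tier: Tier;
  idempotent?: boolean; // only meaningful for the irreversible tier
  evidence: string; // file and function backing the capability
  confidence: Confidence;
}

// Render the argument list the way the plan presents it to the developer.
function describeArgs(candidate: CandidateCapability): string {
  if (candidate.args.length === 0) return "(no arguments)";
  return candidate.args
    .map((a) => `${a.name} (${a.type}, ${a.required ? "required" : "optional"})`)
    .join(", ");
}

const searchProducts: CandidateCapability = {
  name: "products.search",
  purpose: "Find products matching a query",
  args: [
    { name: "query", type: "string", required: true },
    { name: "maxPrice", type: "number", required: false },
  ],
  tier: "read",
  evidence: "src/api/products.ts:searchProducts",
  confidence: "high",
};

console.log(describeArgs(searchProducts));
```

The plan itself stays conversational; a structure like this is only a way to check that no field was skipped for any candidate.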
59
+ ### Side-effect tiers
60
+
61
+ | Tier | Effect | Examples | Default when ambiguous |
62
+ |---|---|---|---|
63
+ | `read` | Information; changes no durable state | search, look up, compare | — |
64
+ | `reversible` | A change that's easy to undo | add to cart, save draft, apply filter | escalate from `read` to here if any doubt |
65
+ | `irreversible` | A change that can't be cleanly undone | place order, charge card, cancel, send email | escalate from `reversible` to here if any doubt |
66
+
67
+ **Default to the more restrictive tier when ambiguous.** The `irreversible` tier triggers explicit verbal confirmation at runtime; classifying an operation down a tier loses that protection.
68
+
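The escalation rule in the table can be sketched as a tiny function. This is an illustrative sketch, not part of the toolkit: `chooseTier` and `escalate` are hypothetical names, and "doubt" is reduced to a boolean that the analyst sets by judgment.

```typescript
type Tier = "read" | "reversible" | "irreversible";

// Move one step toward the more restrictive tier; irreversible is the ceiling.
function escalate(tier: Tier): Tier {
  switch (tier) {
    case "read":
      return "reversible";
    case "reversible":
      return "irreversible";
    case "irreversible":
      return "irreversible";
  }
}

// Keep the best guess only when there is no doubt about the side effects.
function chooseTier(bestGuess: Tier, anyDoubt: boolean): Tier {
  return anyDoubt ? escalate(bestGuess) : bestGuess;
}

console.log(chooseTier("reversible", true));
```

Escalation only ever moves up one step, which matches the table: doubt about a `read` lands on `reversible`, and doubt about a `reversible` lands on `irreversible`.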
69
+ ### Rules for the capability list
70
+
71
+ - **One real operation = one capability.** No invented operations, and no wrappers that don't reflect real app behavior.
72
+ - **Composition is fine.** A `checkout.placeOrder` capability may chain several of the app's real operations (`startCheckout → updateCheckout → placeOrder → refresh`). That's orchestrating the app's own calls. Re-deriving business logic (recomputing prices, re-validating rules the app already owns) is what's forbidden.
73
+ - **Named navigation IS a capability.** `docs.openPage({ slug })`, `account.openSettings()`, `cart.viewDrawer()` are legitimate first-class capabilities — especially for docs sites, content-heavy apps, and anything where "take me to X" is a real user intent. The pattern to **strongly avoid** is a *generic* `navigate({ url })` or `goto({ path })` tool that hands raw routes to the agent. The bridge won't reject it, but it forces the agent to reason in routes instead of domain terms — the agent invents URLs that don't exist, calls break when you refactor a route, and the curation that's the whole point of WebMCP gets sidestepped. The distinguishing test: does the capability require the agent to know the *route system*, or just the *domain*? If domain (a page slug, a product id, a section name), it's fine — that's still a named intent with a parameter. If route system (a raw URL), prefer a named alternative. Routes stay an implementation detail of `execute`.
74
+ - **"Read-only" surfaces still have capabilities.** Even an app that doesn't mutate data has things the agent can DO: search, open a specific page, copy a snippet, open a modal, start a session, submit a form. If you find yourself thinking "this app has no real operations," look again — the operations are everything the user can trigger via a click.
75
+ - **Start small. The plan can also be very small, or zero.** Default to a high-value subset, not an inventory of everything. The plan should be the smallest set that supports the workflows from §1. Three to five capabilities is a normal-sized plan for a focused app. One or two is fine. **Zero is also a valid outcome:** if, after honestly reading the codebase, the app has no real operations worth exposing to the agent (a pure marketing page with one contact form, a static doc browser whose only "operation" is the user reading), say so. Recommend that the developer keep their existing MCP server (if any), use a non-Edge agent (a chatbot reading the docs), or skip the bridge entirely. Padding the plan with speculative capabilities (`copyPrompt` for a single hardcoded prompt, `openSomething` wrapping a button click that the user can already perform) is a worse outcome than an honest "WebMCP isn't the right fit here."
76
+
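The named-navigation rule above can be illustrated with a short sketch. Everything here is assumed for illustration: the slugs, the paths, and the `openDocsPage` helper are hypothetical, and the `navigate` callback stands in for whatever router the app already uses. The point is that the agent supplies a domain term (a slug) while the route stays private to the implementation.

```typescript
// Hypothetical slug-to-route table; the agent never sees these paths.
const DOC_ROUTES = new Map<string, string>([
  ["getting-started", "/docs/getting-started"],
  ["auth", "/docs/guides/auth"],
]);

type OpenPageResult =
  | { ok: true; path: string }
  | { ok: false; error: string; knownSlugs: string[] };

// Named intent with a domain parameter, as opposed to a generic navigate({ url }).
function openDocsPage(
  args: { slug: string },
  navigate: (path: string) => void,
): OpenPageResult {
  const path = DOC_ROUTES.get(args.slug);
  if (path === undefined) {
    // Unknown slugs fail with the valid options instead of navigating to an invented URL.
    return {
      ok: false,
      error: `Unknown docs page: ${args.slug}`,
      knownSlugs: [...DOC_ROUTES.keys()],
    };
  }
  navigate(path);
  return { ok: true, path };
}

const visited: string[] = [];
const opened = openDocsPage({ slug: "auth" }, (p) => {
  visited.push(p);
});
console.log(opened.ok, visited);
```

If a route is later refactored, only the table changes; calls that pass `auth` keep working, which is the property a raw-URL tool cannot offer.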
77
+ ## 3. Propose candidate live-state resources
78
+
79
+ For each candidate live-state resource, propose:
80
+
81
+ - **Name** — a noun identifying the slice (`cart`, `currentOrder`, `orderStatus`).
82
+ - **Why it cleared the gate** — see below.
83
+ - **Evidence** — where the underlying state lives (store, signal, query-cache key).
84
+ - **Confidence** — high / medium / low.
85
+
86
+ ### The gate (mechanical test)
87
+
88
+ Add a live-state resource **only for state that changes out-of-band** — state the agent needs to see that changes *without the agent acting*:
89
+
90
+ - The **user** changes it by hand (edits a cart quantity, toggles a filter, navigates somewhere), or
91
+ - It changes **server-side over time** (an order moves from `processing` to `shipped` while the conversation is open).
92
+
93
+ That's the whole test. **If a capability already returns the answer, do NOT add a resource that mirrors it.** A search that returns its results inline needs no `searchResults` resource; a `products.viewDetails` that returns the product needs no `currentProduct` resource. Resources are for the state the return *can't* give you, not a shadow copy of the state it can.
94
+
95
+ Common cases:
96
+ - `cart` → usually a good resource (user can edit it directly in the UI)
97
+ - `orderStatus` → usually a good resource (moves server-side)
98
+ - `searchResults` → usually NOT a resource (the search capability returns the results)
99
+ - `currentProduct` → usually NOT a resource (the `viewDetails` capability returns the product)
100
+
101
+ Apply the gate to each candidate and drop anything that fails it. The live-state resource list is the **exception, not the rule** — most apps need only a handful, and **many apps need zero**. Content sites, docs sites, marketing pages, and read-mostly admin tools often have no state that changes out-of-band: the user reads, the agent searches and navigates, nothing mutates behind anyone's back. An empty resources list is the correct outcome for those apps, not a sign that something is wrong.
102
+
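The gate can be written down as a predicate. This is a minimal sketch with hypothetical names (`ResourceCandidate`, `clearsGate`); the two flags correspond to the two out-of-band conditions listed above, and how they are determined for a real app remains a judgment made while reading the code.

```typescript
interface ResourceCandidate {
  name: string;
  userCanChangeDirectly: boolean; // the user edits it by hand in the UI
  changesServerSide: boolean; // it moves over time without anyone acting in the page
}

// A candidate clears the gate only if its state changes out-of-band.
function clearsGate(candidate: ResourceCandidate): boolean {
  return candidate.userCanChangeDirectly || candidate.changesServerSide;
}

const candidates: ResourceCandidate[] = [
  { name: "cart", userCanChangeDirectly: true, changesServerSide: false },
  { name: "orderStatus", userCanChangeDirectly: false, changesServerSide: true },
  { name: "searchResults", userCanChangeDirectly: false, changesServerSide: false },
  { name: "currentProduct", userCanChangeDirectly: false, changesServerSide: false },
];

const approvedResources = candidates.filter(clearsGate).map((c) => c.name);
console.log(approvedResources);
```

For a docs or marketing site, every candidate would typically fail both checks and the filtered list would be empty, which is the expected result rather than an error.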
103
+ ## 4. List the deliberate withholds
104
+
105
+ The negative space matters as much as the positive list. Identify and call out:
106
+
107
+ - **Capabilities deliberately withheld.** Destructive admin actions, billing internals, anything sensitive or rarely needed. Name them and explain why each is excluded. Common examples: `admin.deleteUser`, `account.changePassword`, raw payment-method management, refund disputes.
108
+ - **State the agent should NOT see.** Other users' data, sensitive auth state, private payment information, transient UI state that would confuse rather than help.
109
+ - **Deflections** — workflows the agent should redirect the user to handle themselves rather than attempt programmatically. Common examples: multi-factor confirmation flows, support disputes, anything requiring human review.
110
+
111
+ The framing here is important: **exposing a capability is an approval, not an inventory.** The agent runs as the signed-in user, but that doesn't mean every operation the user can do should flow through the agent. Curation is the point.
112
+
113
+ ## 5. Present the plan to the developer
114
+
115
+ Walk through the plan in conversation. Use a clear structure:
116
+
117
+ 1. **Domain summary + high-value workflows** — confirm framing.
118
+ 2. **Candidate capabilities, grouped by tier**, with arguments and confidence levels noted. Flag low-confidence items for closer review.
119
+ 3. **Candidate live-state resources**, each with one-line why-it-cleared-the-gate.
120
+ 4. **Deliberate withholds and deflections**, with reasoning.
121
+
122
+ Require an explicit per-item review even for high-confidence items, rather than passive accept-by-default. The developer should *approve* each capability, not silently accept the whole list.
123
+
124
+ For each item, the developer can:
125
+ - **Approve** as proposed.
126
+ - **Edit** name, tier, or scope.
127
+ - **Reject** with a reason (which adds it to the withhold list with a justification).
128
+ - **Add** a capability or resource you missed.
129
+
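The review outcomes above can be modeled as a small state transition. This is an illustrative sketch with hypothetical names (`ReviewState`, `Decision`, `review`); it assumes that an edited item needs a fresh explicit approval, which is one reading of the approval rule and not something the skill states outright. Adding a missed item is simply appending to `items`, so it is not modeled as a decision.

```typescript
type Tier = "read" | "reversible" | "irreversible";

interface PlanItem {
  name: string;
  tier: Tier;
  approved: boolean;
}

interface Withheld {
  name: string;
  reason: string;
}

interface ReviewState {
  items: PlanItem[];
  withheld: Withheld[];
}

type Decision =
  | { kind: "approve" }
  | { kind: "edit"; name?: string; tier?: Tier }
  | { kind: "reject"; reason: string };

// Apply one developer decision to one item and return the next state.
function review(state: ReviewState, itemName: string, decision: Decision): ReviewState {
  const target = state.items.find((i) => i.name === itemName);
  if (target === undefined) throw new Error(`No such item: ${itemName}`);

  switch (decision.kind) {
    case "approve":
      return {
        ...state,
        items: state.items.map((i) => (i === target ? { ...i, approved: true } : i)),
      };
    case "edit": {
      const { name, tier } = decision;
      // Assumption: an edited item goes back to unapproved until confirmed.
      const edited: PlanItem = {
        name: name ?? target.name,
        tier: tier ?? target.tier,
        approved: false,
      };
      return { ...state, items: state.items.map((i) => (i === target ? edited : i)) };
    }
    case "reject":
      // A rejection moves the item to the withhold list with its justification.
      return {
        items: state.items.filter((i) => i !== target),
        withheld: [...state.withheld, { name: target.name, reason: decision.reason }],
      };
  }
}

// The plan is ready for hand-off only when every remaining item is explicitly approved.
function fullyApproved(state: ReviewState): boolean {
  return state.items.every((i) => i.approved);
}
```

A short usage example: approving `products.search` and rejecting `admin.deleteUser` with a reason leaves one approved item and one withheld entry, at which point `fullyApproved` returns `true`.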
130
+ End the review with an open prompt: "Anything missing?" — the developer almost always thinks of one or two workflows the analysis didn't surface.
131
+
132
+ ## 6. Iterate until approved
133
+
134
+ The first pass is a draft. Be prepared to revise:
135
+ - Re-classify tiers or idempotency flags if the developer pushes back ("`cancelOrder` is actually idempotent; we have an idempotency key server-side").
136
+ - Drop resources that turn out to mirror returns once the capability shape is clear.
137
+ - Add capabilities you missed.
138
+ - Move items between the expose list and the withhold list as the conversation refines them.
139
+
140
+ Don't move on until the developer explicitly says the plan is good to go. The whole point is that they own the decision; rushing approval defeats the purpose.
141
+
142
+ ## 7. Hand off
143
+
144
+ Once the plan is approved, summarize the final state:
145
+
146
+ - N capabilities approved (M read, K reversible, J irreversible, with idempotency flags)
147
+ - P live-state resources approved
148
+ - Q items deliberately withheld
149
+ - R deflections noted
150
+
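The hand-off summary is simple counting over the approved plan. The sketch below is illustrative only: `ApprovedPlan` and `summarize` are hypothetical names, and the example entries are drawn from capability names used elsewhere in this skill.

```typescript
type Tier = "read" | "reversible" | "irreversible";

interface ApprovedPlan {
  capabilities: { name: string; tier: Tier; idempotent?: boolean }[];
  resources: string[];
  withheld: string[];
  deflections: string[];
}

// Produce the four summary lines in the order listed above.
function summarize(plan: ApprovedPlan): string[] {
  const count = (tier: Tier): number =>
    plan.capabilities.filter((c) => c.tier === tier).length;
  const idempotent = plan.capabilities.filter(
    (c) => c.tier === "irreversible" && c.idempotent === true,
  ).length;

  return [
    `Capabilities approved: ${plan.capabilities.length} ` +
      `(${count("read")} read, ${count("reversible")} reversible, ` +
      `${count("irreversible")} irreversible, ${idempotent} flagged idempotent)`,
    `Live-state resources approved: ${plan.resources.length}`,
    `Deliberately withheld: ${plan.withheld.length}`,
    `Deflections noted: ${plan.deflections.length}`,
  ];
}

const examplePlan: ApprovedPlan = {
  capabilities: [
    { name: "products.search", tier: "read" },
    { name: "cart.addItem", tier: "reversible" },
    { name: "checkout.placeOrder", tier: "irreversible", idempotent: true },
  ],
  resources: ["cart"],
  withheld: ["account.changePassword", "admin.deleteUser"],
  deflections: ["refund disputes"],
};

const summaryLines = summarize(examplePlan);
console.log(summaryLines.join("\n"));
```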
151
+ Then prompt to continue with `setup-edge-mcp` (or remind the developer that's the natural next step if they invoked this skill directly). The setup skill picks up the approved plan and turns it into registered code.
152
+
153
+ ### If the plan came out at zero
154
+
155
+ If the honest answer was "this app has no real operations worth exposing," don't run `setup-edge-mcp`. Tell the developer plainly:
156
+
157
+ > Based on the codebase, WebMCP isn't the right fit for this app right now. The agent has nothing real to DO here — only what the user is already doing by reading the page. Options: (a) keep using your existing chatbot / MCP server / docs-aware assistant if you have one; (b) revisit this if you add interactive features later (forms, dynamic content, multi-step flows). WebMCP earns its weight when the agent can act on the user's behalf; without that, it's scaffolding for no payoff.
158
+
159
+ Skip the setup. A zero-plan is a sign that this skill did its job — not a failure to find capabilities.
160
+
161
+ ## What you will NOT do in this skill
162
+
163
+ - Write any code. This skill is conversational. Code happens in `setup-edge-mcp`.
164
+ - Produce a JSON or markdown file. The plan lives in the conversation; the code is the record.
165
+ - Skip the gate for live-state resources. Every resource must have a one-line justification that maps to user-edit or server-side change.
166
+ - Approve items the developer hasn't reviewed. "Looks good?" with no explicit approval doesn't count.
167
+ - Suggest capabilities the codebase doesn't actually support. Every candidate must have evidence; if you can't cite a file and function, don't propose it.
168
+ - Plan vendor-specific integration (the Napster Web SDK auto-attach, Function configuration, voice/avatar persona). That's downstream of WebMCP and belongs to the agent vendor's skills.