anymcp 0.2.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 MCP Forge
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,159 @@
1
+ # anymcp
2
+
3
+ A **self-contained** MCP server that turns any website into a runnable MCP server — entirely in-process. No
4
+ backend, no Postgres, no Redis, no Docker. One `npx`, like Playwright MCP.
5
+
6
+ It scrapes a URL, figures out the tools, and writes a runnable MCP server to disk that you can install into
7
+ your MCP clients.
8
+
9
+ > Part of the MCP Forge multi-product repo. The hosted web platform + microservices and the Chrome extension
10
+ > are separate products; **this** is the standalone, install-and-go MCP server.
11
+
12
+ ---
13
+
14
+ ## Install (default: no API key needed)
15
+
16
+ By default, **the brain is the model you're already running**: the agent that called the tool (Claude Code,
17
+ Codex, Cursor, …) does the inference itself, much as Playwright MCP simply drives a browser. So the
18
+ default config needs **no API key and no external service**:
19
+
20
+ ```jsonc
21
+ {
22
+ "mcpServers": {
23
+ "anymcp": {
24
+ "command": "npx",
25
+ "args": ["-y", "anymcp"]
26
+ // No env needed. By default inference is done by the model you're already using (host-as-brain):
27
+ // forge_scrape returns the page, your model designs the tools, forge_emit_server writes the server.
28
+ }
29
+ }
30
+ }
31
+ ```
32
+
33
+ Then, in your client: *"Build me an MCP server for https://rubygems.org"*. The agent calls `forge_scrape`,
34
+ designs the tools, and calls `forge_emit_server`. Done — zero config.
35
+
36
+ ---
37
+
38
+ ## The three tools
39
+
40
+ | Tool | LLM? | What it does |
41
+ |------|------|--------------|
42
+ | `forge_scrape` | none | Scrape a URL → structured page analysis (forms, links, candidate endpoints, network, DOM sample). **You** design the tools from it. |
43
+ | `forge_emit_server` | none | Take the tool definitions you designed → deterministic codegen → writes a runnable MCP server to disk. |
44
+ | `forge_generate` | configurable | One-shot scrape→infer→build, using a server-side model (see below). For non-agentic clients or when you *want* a specific model. |
45
+
46
+ The recommended path is `forge_scrape` + `forge_emit_server` (no key, and your agent is usually the smartest
47
+ model available). `forge_generate` exists for clients that can't do multi-step tool calls.
48
+
49
+ ---
50
+
51
+ ## Optional: server-side inference for `forge_generate`
52
+
53
+ If you want the server itself to do the inference, set **`FORGE_INFERENCE`**. It accepts a provider name or the
54
+ LiteLLM-style `provider/model` form (e.g. `groq/llama-3.3-70b-versatile`). Except where the table notes a native client (`claude`, `gemini`), every hosted and local option below
55
+ goes through **one** OpenAI-compatible client (the standard the ecosystem has converged on), so adding a
56
+ provider is trivial, and each key is read from that provider's conventional env var.
57
+
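The two accepted forms of `FORGE_INFERENCE` can be illustrated with a small parser. This is a minimal sketch, not the package's actual code: `parseInferenceSpec` and `InferenceSpec` are hypothetical names, and splitting on the first slash and lowercasing the provider are assumptions made here so that model ids containing slashes stay intact.

```typescript
// Hypothetical helper, not taken from the anymcp source.
interface InferenceSpec {
  provider: string; // e.g. "groq", "ollama", "host"
  model?: string;   // present only for the provider/model form
}

function parseInferenceSpec(raw: string | undefined): InferenceSpec {
  const value = (raw ?? "").trim();
  // The README treats an unset variable and "host" as the same default mode.
  if (value === "") return { provider: "host" };

  const slash = value.indexOf("/");
  if (slash === -1) return { provider: value.toLowerCase() };

  // Assumption: split on the FIRST slash only, so a model id that itself
  // contains slashes is kept whole.
  const provider = value.slice(0, slash).toLowerCase();
  const model = value.slice(slash + 1);
  return model === "" ? { provider } : { provider, model };
}
```

Under these assumptions, `parseInferenceSpec("groq/llama-3.3-70b-versatile")` yields the provider `groq` with the model pinned, while a bare `ollama` yields a provider with no model, leaving the per-provider default in effect.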
58
+ | `FORGE_INFERENCE` | Needs | Notes |
59
+ |-------------------|-------|-------|
60
+ | *(unset)* / `host` | nothing | Default. Host-as-brain via scrape+emit; `forge_generate` falls back to the keyless heuristic. |
61
+ | `heuristic` | nothing | Keyless, rule-based. No LLM, no network. |
62
+ | `openai` | `OPENAI_API_KEY` | Defaults to `gpt-4o-mini`; pin another model with `openai/<model>`. |
63
+ | `groq` | `GROQ_API_KEY` | Fast. |
64
+ | `together` | `TOGETHER_API_KEY` | |
65
+ | `openrouter` | `OPENROUTER_API_KEY` | 300+ models behind one key. |
66
+ | `deepseek` | `DEEPSEEK_API_KEY` | |
67
+ | `mistral` | `MISTRAL_API_KEY` | |
68
+ | `fireworks` | `FIREWORKS_API_KEY` | |
69
+ | `xai` | `XAI_API_KEY` | Grok. |
70
+ | `claude` | `ANTHROPIC_API_KEY` | Native Anthropic client. |
71
+ | `gemini` | `GEMINI_API_KEY` | Native Google client. |
72
+ | `ollama` | nothing | **Fully local.** Runs against `ollama serve` (`OLLAMA_URL`, default `http://localhost:11434/v1`; `OLLAMA_MODEL`, default `llama3.1`). No key. |
73
+ | `lmstudio` | nothing | Local LM Studio (`LMSTUDIO_URL`, default `http://localhost:1234/v1`; `LMSTUDIO_MODEL`). No key. |
74
+ | `vllm` | nothing | Local vLLM (`VLLM_BASE_URL`, default `http://localhost:8000/v1`; `VLLM_MODEL`). No key. |
75
+ | `openai-compatible` | `FORGE_OPENAI_BASE_URL` | **Any** other OpenAI-compatible endpoint (a gateway, a proxy, a new provider). `FORGE_API_KEY` optional. |
76
+ | `http` | `FORGE_INFERENCE_URL` | **Bring your own logic** — POSTs the scraped page to your endpoint; you return the tool list. |
77
+
78
+ ### Examples
79
+
80
+ Local model, no key, nothing leaves your machine:
81
+
82
+ ```jsonc
83
+ { "mcpServers": { "anymcp": {
84
+ "command": "npx", "args": ["-y", "anymcp"],
85
+ "env": { "FORGE_INFERENCE": "ollama", "OLLAMA_MODEL": "llama3.1" }
86
+ } } }
87
+ ```
88
+
89
+ A hosted provider (any big one — swap the name + key):
90
+
91
+ ```jsonc
92
+ { "mcpServers": { "anymcp": {
93
+ "command": "npx", "args": ["-y", "anymcp"],
94
+ "env": { "FORGE_INFERENCE": "groq/llama-3.3-70b-versatile", "GROQ_API_KEY": "gsk_..." }
95
+ } } }
96
+ ```
97
+
98
+ Your own inference logic (a script, a router, anything that speaks back tool JSON):
99
+
100
+ ```jsonc
101
+ { "mcpServers": { "anymcp": {
102
+ "command": "npx", "args": ["-y", "anymcp"],
103
+ "env": { "FORGE_INFERENCE_URL": "https://my-host/infer" }
104
+ } } }
105
+ ```
106
+
107
+ Your custom endpoint receives `{ systemPrompt, url, payload, bundle }` and returns either a JSON array of
108
+ tools, `{ "tools": [...] }`, or a JSON string of the same.
109
+
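The three response shapes listed above can be folded into one list on the receiving side. The following is a hedged sketch of how that might look: `normalizeToolResponse` and `ToolDef` are hypothetical names, and the fields of a tool definition are left open because the README does not specify them.

```typescript
// Hypothetical shape: the README does not specify a tool definition's fields.
type ToolDef = Record<string, unknown>;

function normalizeToolResponse(body: unknown): ToolDef[] {
  let value: unknown = body;

  // Shape 3: a JSON string wrapping either of the other two shapes.
  if (typeof value === "string") {
    value = JSON.parse(value);
  }

  // Shape 1: a bare array of tools.
  if (Array.isArray(value)) return value as ToolDef[];

  // Shape 2: an object carrying a "tools" array.
  if (value !== null && typeof value === "object") {
    const tools = (value as { tools?: unknown }).tools;
    if (Array.isArray(tools)) return tools as ToolDef[];
  }

  throw new Error("Inference response is not a tool list");
}
```

Rejecting anything else with an error, rather than returning an empty list, is a design choice of this sketch: it keeps a misbehaving endpoint from silently producing a server with no tools.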
110
+ ---
111
+
112
+ ## Other env
113
+
114
+ | Var | Default | Meaning |
115
+ |-----|---------|---------|
116
+ | `MCP_FORGE_HOME` | `~/.mcp-forge` | Where generated servers + `registry.json` are written. |
117
+ | `FORGE_BROWSER` | *(on)* | In-process stealth browser capture (renders JS + captures XHR/fetch traffic) for dynamic / bot-walled sites. Set `0` to force the cheap static-only fetch. Needs Chromium: `npx playwright install chromium`. |
118
+ | `SCRAPER_DISCOVERY_MODE` | `1` | Escalate to the browser even on server-rendered pages so their API traffic is captured into tools. `0` keeps the static result when it's sufficient. |
119
+ | `SCRAPER_INTERACT` | `1` | During a browser capture, scroll / submit a search / click "load more" to surface action-only XHR. |
120
+ | `MCP_BROWSER_CHANNEL` | *(unset)* | Drive your real installed Chrome (`chrome`) instead of bundled Chromium — stronger stealth. |
121
+ | `MCP_BROWSER_DRIVER` | *(unset)* | Use a stealth-patched Playwright drop-in (`patchright` / `rebrowser-playwright`, install it yourself) for hard bot walls. |
122
+ | `SCRAPER_URL` | *(unset)* | If set, use the remote Python scraper service instead of the in-process browser (it provides 4-tier stealth, including Camoufox and nodriver). |
123
+ | `FORGE_MODEL` | per-provider | Override the model for the selected provider. |
124
+ | `FORGE_MAX_TOKENS` | `8192` | Max output tokens for OpenAI-compatible inference. Lower it for small-context local models. |
125
+ | `FORGE_INFERENCE_HEADERS` | *(unset)* | JSON object of extra headers for the `http` inference endpoint (e.g. auth). |
126
+ | `FORGE_FETCH_TIMEOUT_MS` | `20000` | Timeout for the built-in static page fetch. |
127
+ | `FORGE_FETCH_MAX_BYTES` | `5000000` | Max page size the built-in scraper will read (memory guard). |
128
+ | `FORGE_BROWSER_TIMEOUT_MS` | `30000` | Navigation timeout for the in-process browser capture. |
129
+ | `FORGE_INFERENCE_TIMEOUT_MS` | `60000` | Timeout for a custom (`http`) inference endpoint. |
130
+ | `FORGE_INFERENCE_RESPONSE_MAX_BYTES` | `1000000` | Max response body read from a custom (`http`) inference endpoint. |
131
+
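The numeric variables in the table above all follow the same pattern: an optional override with a documented default. A minimal sketch of reading them is shown below. `envInt` and `readLimits` are hypothetical names, and falling back to the default on a malformed or non-positive value is an assumption of this sketch, since the README does not say how invalid values are handled.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: parse a positive-integer env var, else use the default.
function envInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  // Assumption: NaN, fractions, zero and negatives fall back to the default
  // instead of being passed on to a timer or a size guard.
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Defaults copied from the table above.
function readLimits(env: Env) {
  return {
    maxTokens: envInt(env, "FORGE_MAX_TOKENS", 8192),
    fetchTimeoutMs: envInt(env, "FORGE_FETCH_TIMEOUT_MS", 20000),
    fetchMaxBytes: envInt(env, "FORGE_FETCH_MAX_BYTES", 5000000),
    browserTimeoutMs: envInt(env, "FORGE_BROWSER_TIMEOUT_MS", 30000),
    inferenceTimeoutMs: envInt(env, "FORGE_INFERENCE_TIMEOUT_MS", 60000),
    inferenceResponseMaxBytes: envInt(env, "FORGE_INFERENCE_RESPONSE_MAX_BYTES", 1000000),
  };
}
```

In a Node process this would be called as `readLimits(process.env)`; taking the environment as a parameter keeps the helper easy to test.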
132
+ ## Develop
133
+
134
+ ```bash
135
+ npm run build # build @mcp/generator + this package
136
+ npm test # 4 suites: provider resolution, stdio boot, emit-server e2e, full local pipeline (no key, no network)
137
+ ```
138
+
139
+ ## Dynamic / bot-walled sites
140
+
141
+ By default the server captures with an **in-process stealth browser** (Playwright): it renders client-side JS
142
+ and captures the page's XHR/fetch traffic, so it builds tools for SPAs and anti-bot-protected sites with **no
143
+ backend**. It needs a Chromium binary once:
144
+
145
+ ```bash
146
+ npx playwright install chromium
147
+ ```
148
+
149
+ The stealth setup mirrors that of the generated servers (`AutomationControlled` disabled, `navigator.webdriver` stripped). For hard
150
+ walls, set `MCP_BROWSER_CHANNEL=chrome` to drive your real Chrome, or `MCP_BROWSER_DRIVER=patchright`. Set
151
+ `FORGE_BROWSER=0` to skip the browser entirely (static-only). If Chromium isn't installed, the server
152
+ automatically falls back to the static fetch.
153
+
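The capture behavior described in this section can be summarized as a small decision function. This is only a sketch of the documented rules, with hypothetical names (`chooseCaptureMode`, `CaptureMode`); in particular, checking `SCRAPER_URL` before `FORGE_BROWSER` is an assumption, because the README does not state which setting wins when both are present.

```typescript
type CaptureMode = "remote-scraper" | "browser" | "static";
type Env = Record<string, string | undefined>;

// Hypothetical helper mirroring the documented rules, not the package's code.
function chooseCaptureMode(env: Env, chromiumInstalled: boolean): CaptureMode {
  // Assumption: a configured remote scraper takes precedence over everything.
  if (env["SCRAPER_URL"]) return "remote-scraper";

  // FORGE_BROWSER=0 forces the cheap static-only fetch.
  if (env["FORGE_BROWSER"] === "0") return "static";

  // The browser is on by default, falling back to static without Chromium.
  return chromiumInstalled ? "browser" : "static";
}
```

With an empty environment and Chromium installed this returns `"browser"`, matching the default described above.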
154
+ ## Limitations (honest)
155
+
156
+ - The in-process browser handles most dynamic sites, but the hardest anti-bot walls also score IP reputation:
157
+ from a datacenter IP some sites still block regardless of stealth. Use `MCP_BROWSER_CHANNEL=chrome` /
158
+ `MCP_BROWSER_DRIVER=patchright`, or point `SCRAPER_URL` at the full Python scraper (4-tier, Camoufox + nodriver).
159
+ - `forge_generate` quality depends on the model you pick; the keyless heuristic is a floor, not a ceiling.