@arabold/docs-mcp-server 1.36.0 → 1.37.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +48 -509
- package/dist/index.js +51 -5
- package/dist/index.js.map +1 -1
- package/package.json +2 -2
package/README.md
CHANGED
@@ -1,557 +1,96 @@
 # Grounded Docs: Your AI's Up-to-Date Documentation Expert
 
-
-
-AI coding assistants often struggle with outdated documentation and hallucinations. The **Docs MCP Server** solves this by providing a personal, always-current knowledge base for your AI. It **indexes 3rd party documentation** from various sources (websites, GitHub, npm, PyPI, local files) and offers powerful, version-aware search tools via the Model Context Protocol (MCP).
-
-This enables your AI agent to access the **latest official documentation**, dramatically improving the quality and reliability of generated code and integration details. It's **free**, **open-source**, runs **locally** for privacy, and integrates seamlessly into your development workflow.
-
-## Why Use the Docs MCP Server?
-
-LLM-assisted coding promises speed and efficiency, but often falls short due to:
-
-- 🌀 **Stale Knowledge:** LLMs train on snapshots of the internet and quickly fall behind new library releases and API changes.
-- 👻 **Code Hallucinations:** AI can invent plausible-looking code that is syntactically correct but functionally wrong or uses non-existent APIs.
-- ❓ **Version Ambiguity:** Generic answers rarely account for the specific version dependencies in your project, leading to subtle bugs.
-- ⏳ **Verification Overhead:** Developers spend valuable time double-checking AI suggestions against official documentation.
-
-**Docs MCP Server solves these problems by:**
-
-- ✅ **Providing Up-to-Date Context:** Fetches and indexes documentation directly from official sources (websites, GitHub, npm, PyPI, local files) on demand.
-- 🎯 **Delivering Version-Specific Answers:** Search queries can target exact library versions, ensuring information matches your project's dependencies.
-- 💡 **Reducing Hallucinations:** Grounds the LLM in real documentation for accurate examples and integration details.
-- ⚡ **Boosting Productivity:** Get trustworthy answers faster, integrated directly into your AI assistant workflow.
-
-## ✨ Key Features
-
-- **Accurate & Version-Aware AI Responses:** Provides up-to-date, version-specific documentation to reduce AI hallucinations and improve code accuracy.
-- **Broad Source Compatibility:** Scrapes documentation from websites, GitHub repos, package manager sites (npm, PyPI), and local file directories.
-- **Advanced Search & Processing:** Intelligently chunks documentation semantically, generates embeddings, and combines vector similarity with full-text search.
-- **Flexible Embedding Models:** Supports various providers including OpenAI (and compatible APIs), Google Gemini/Vertex AI, Azure OpenAI, and AWS Bedrock. Vector search is optional.
-- **Enterprise Authentication:** Optional OAuth2/OIDC authentication with dynamic client registration for secure deployments.
-- **Web Interface:** Easy-to-use web interface for searching and managing documentation.
-- **Local & Private:** Runs entirely on your machine, ensuring data and queries remain private.
-- **Free & Open Source:** Community-driven and freely available.
-- **Simple Deployment:** Easy setup via Docker or `npx`.
-- **Seamless Integration:** Works with MCP-compatible clients (like Claude, Cline, Roo).
-
-> **What is semantic chunking?**
->
-> Semantic chunking splits documentation into meaningful sections based on structure—like headings, code blocks, and tables—rather than arbitrary text size. Docs MCP Server preserves logical boundaries, keeps code and tables intact, and removes navigation clutter from HTML docs. This ensures LLMs receive coherent, context-rich information for more accurate and relevant answers.
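The chunking behavior that the removed note describes can be sketched in a few lines. This is an illustrative toy splitter only, not the server's actual implementation (the function name `chunk_markdown` is made up): it splits Markdown at headings while refusing to split inside a fenced code block.

```python
def chunk_markdown(text: str) -> list[str]:
    """Toy semantic splitter: break at headings, keep code fences intact."""
    chunks, current, in_code = [], [], False
    for line in text.splitlines():
        if line.startswith("```"):
            in_code = not in_code
        # Start a new chunk at each heading, but never inside a code fence
        if line.startswith("#") and not in_code and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "# API\nIntro.\n```\n# not a heading\n```\n## Usage\nCall it."
print(chunk_markdown(doc))  # two chunks; the fenced '#' line stays in chunk 1
```

A real splitter would also cap chunk size and keep tables whole; the point here is only that splits happen at structural boundaries.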
-
-## How to Run the Docs MCP Server
-
-Choose your deployment method:
-
-- [Standalone Server (Recommended)](#standalone-server-recommended)
-- [Embedded Server](#embedded-server)
-- [Advanced: Docker Compose (Scaling)](#advanced-docker-compose-scaling)
-
-## Standalone Server (Recommended)
-
-Run a standalone server that includes both MCP endpoints and web interface in a single process. This is the easiest way to get started.
-
-### Option 1: Docker
-
-1. **Install Docker.**
-2. **Start the server:**
-
-```bash
-docker run --rm \
-  -v docs-mcp-data:/data \
-  -v docs-mcp-config:/config \
-  -p 6280:6280 \
-  ghcr.io/arabold/docs-mcp-server:latest \
-  --protocol http --host 0.0.0.0 --port 6280
-```
-
-**Configuration:** The server writes its configuration to `/config/docs-mcp-server/config.yaml`. Mounting the `/config` volume ensures your settings persist across restarts.
-
-**Optional:** Add `-e OPENAI_API_KEY="your-openai-api-key"` to enable vector search for improved results.
-
-### Option 2: npx
-
-1. **Install Node.js 20.x or later.**
-2. **Start the server:**
-
-```bash
-npx @arabold/docs-mcp-server@latest
-```
-
-This runs the server on port 6280 by default.
-
-**Optional:** Prefix with `OPENAI_API_KEY="your-openai-api-key"` to enable vector search for improved results.
-
-### Configure Your MCP Client
-
-Add this to your MCP settings (VS Code, Claude Desktop, etc.):
-
-```json
-{
-  "mcpServers": {
-    "docs-mcp-server": {
-      "type": "sse",
-      "url": "http://localhost:6280/sse",
-      "disabled": false,
-      "autoApprove": []
-    }
-  }
-}
-```
-
-**Alternative connection types:**
-
-```jsonc
-// SSE (Server-Sent Events)
-"type": "sse", "url": "http://localhost:6280/sse"
-
-// HTTP (Streamable)
-"type": "http", "url": "http://localhost:6280/mcp"
-```
-
-Restart your AI assistant after updating the config.
-
-### Access the Web Interface
-
-Open `http://localhost:6280` in your browser to manage documentation and monitor jobs.
-
-### CLI Usage with Standalone Server
-
-You can also use CLI commands to interact with the local database:
-
-```bash
-# List indexed libraries
-OPENAI_API_KEY="your-key" npx @arabold/docs-mcp-server@latest list
-
-# Search documentation
-OPENAI_API_KEY="your-key" npx @arabold/docs-mcp-server@latest search react "useState hook"
-
-# Scrape new documentation (connects to running server's worker)
-npx @arabold/docs-mcp-server@latest scrape react https://react.dev/reference/react --server-url http://localhost:8080/api
-```
-
-### Adding Library Documentation
-
-1. Open the Web Interface at `http://localhost:6280`.
-2. Use the "Add New Documentation" form.
-3. Enter the documentation URL, library name, and (optionally) version.
-4. Click "Start Indexing". Monitor progress in the Job Queue.
-5. Repeat for each library you want indexed.
-
-Once a job completes, the docs are searchable via your AI assistant or the Web UI.
+**Docs MCP Server** solves the problem of AI hallucinations and outdated knowledge by providing a personal, always-current documentation index for your AI coding assistant. It fetches official docs from websites, GitHub, npm, PyPI, and local files, allowing your AI to query the exact version you are using.
 
 
 
-
-
-- Single command setup with both web UI and MCP server
-- Persistent data storage (Docker volume or local directory)
-- No repository cloning required
-- Full feature access including web interface
-
-To stop the server, press `Ctrl+C`.
-
-## Configuration overrides
-
-- **Configuration Precedence**: Configuration is loaded in the following order (last one wins):
-
-> For a complete reference of all configuration options, see the [Configuration Guide](docs/concepts/configuration.md).
-
-1. **Defaults**: Built-in default values.
-2. **Config File**: `config.json` or `config.yaml` in global store, project root, or current directory.
-3. **Environment Variables**: Specific `DOCS_MCP_*` variables override file settings.
-4. **CLI Arguments**: Command-line flags (e.g., `--port`) have the highest priority.
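The precedence order in the removed text (defaults, then config file, then environment variables, then CLI flags, last one wins) amounts to a layered dictionary merge. A minimal sketch with hypothetical keys, not the server's actual code:

```python
def resolve_config(defaults, file_cfg, env_cfg, cli_cfg):
    """Merge config layers; later layers win (defaults < file < env < CLI)."""
    merged = {}
    for layer in (defaults, file_cfg, env_cfg, cli_cfg):
        # Only explicitly set values override earlier layers
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged

cfg = resolve_config(
    defaults={"port": 6280, "host": "127.0.0.1"},
    file_cfg={"port": 8000},        # e.g. config.yaml
    env_cfg={"host": "0.0.0.0"},    # e.g. DOCS_MCP_HOST
    cli_cfg={"port": 6280},         # e.g. --port flag, highest priority
)
print(cfg)  # {'port': 6280, 'host': '0.0.0.0'}
```

The CLI's port wins over the file's, while the env var's host survives because no later layer sets it.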
-
-### Configuration File
-
-You can create a `config.json` or `config.yaml` file to persist your settings. The server searches for this file in:
-
-1. The path specified by `--config` (**Read-Only**).
-2. The path specified by `DOCS_MCP_CONFIG` environment variable (**Read-Only**).
-3. The system default configuration directory (**Read-Write**):
-   - **macOS**: `~/Library/Preferences/docs-mcp-server/config.yaml`
-   - **Linux**: `~/.config/docs-mcp-server/config.yaml` (or defined by `$XDG_CONFIG_HOME`)
-   - **Windows**: `%APPDATA%\docs-mcp-server\config\config.yaml`
-
-> **Note:** On startup, if no explicit configuration file is provided, the server will seek the system default config. If present, it loads it. If missing, it creates it with default values. It will also update it with any new setting keys. If you provide a custom config via `--config` or env var, the server treats it as **Read-Only** and will NOT modify it or write defaults back to it.
-
-**Example `config.yaml`:**
-
-```yaml
-server:
-  host: "0.0.0.0"
-  ports:
-    mcp: 9000
-    default: 8000
-scraper:
-  maxPages: 500
-  pageTimeoutMs: 10000
-splitter:
-  maxChunkSize: 2000
-embeddings:
-  vectorDimension: 1536
-```
-
-### Environment Variables
-
-Specific configuration options can be set via environment variables. These override values from the configuration file.
-
-| Environment Variable       | Config Path            | Description                               |
-| -------------------------- | ---------------------- | ----------------------------------------- |
-| `DOCS_MCP_PROTOCOL`        | `server.protocol`      | Server protocol (`auto`, `stdio`, `http`) |
-| `DOCS_MCP_HOST`, `HOST`    | `server.host`          | Host to bind the server to                |
-| `DOCS_MCP_PORT`, `PORT`    | `server.ports.default` | Default server port                       |
-| `DOCS_MCP_WEB_PORT`        | `server.ports.web`     | Web interface port                        |
-| `DOCS_MCP_STORE_PATH`      | `app.storePath`        | Custom storage directory path             |
-| `DOCS_MCP_READ_ONLY`       | `app.readOnly`         | Enable read-only mode                     |
-| `DOCS_MCP_AUTH_ENABLED`    | `auth.enabled`         | Enable authentication                     |
-| `DOCS_MCP_AUTH_ISSUER_URL` | `auth.issuerUrl`       | OIDC Issuer URL                           |
-| `DOCS_MCP_AUTH_AUDIENCE`   | `auth.audience`        | JWT Audience                              |
-| `DOCS_MCP_EMBEDDING_MODEL` | `app.embeddingModel`   | Embedding model string                    |
-| `DOCS_MCP_TELEMETRY`       | `app.telemetryEnabled` | Enable/disable telemetry                  |
-
-## Embedded Server
-
-Run the MCP server directly embedded in your AI assistant without a separate process or web interface. This method provides MCP integration only.
-
-### Configure Your MCP Client
-
-Add this to your MCP settings (VS Code, Claude Desktop, etc.):
-
-```json
-{
-  "mcpServers": {
-    "docs-mcp-server": {
-      "command": "npx",
-      "args": ["@arabold/docs-mcp-server@latest"],
-      "disabled": false,
-      "autoApprove": []
-    }
-  }
-}
-```
-
-**Optional:** To enable vector search for improved results, add an `env` section with your API key:
-
-```json
-{
-  "mcpServers": {
-    "docs-mcp-server": {
-      "command": "npx",
-      "args": ["@arabold/docs-mcp-server@latest"],
-      "env": {
-        "OPENAI_API_KEY": "sk-proj-..." // Your OpenAI API key
-      },
-      "disabled": false,
-      "autoApprove": []
-    }
-  }
-}
-```
-
-Restart your application after updating the config.
-
-### Adding Library Documentation
-
-**Option 1: Use MCP Tools**
+## ✨ Why Grounded Docs MCP Server?
 
-
-```
-Please scrape the React documentation from https://react.dev/reference/react for library "react" version "18.x"
-```
-
-**Option 2: Launch Web Interface**
-
-Start a temporary web interface that shares the same database:
-
-```bash
-OPENAI_API_KEY="your-key" npx @arabold/docs-mcp-server@latest web --port 6281
-```
-
-Then open `http://localhost:6281` to manage documentation. Stop the web interface when done (`Ctrl+C`).
-
-**Option 3: CLI Commands**
-
-Use CLI commands directly (avoid running scrape jobs concurrently with embedded server):
-
-```bash
-# List libraries
-OPENAI_API_KEY="your-key" npx @arabold/docs-mcp-server@latest list
-
-# Search documentation
-OPENAI_API_KEY="your-key" npx @arabold/docs-mcp-server@latest search react "useState hook"
-```
-
-**Benefits:**
-
-- Direct integration with AI assistant
-- No separate server process required
-- Persistent data storage in user's home directory
-- Shared database with standalone server and CLI
-
-**Limitations:**
-
-- No web interface (unless launched separately)
-- Documentation indexing requires MCP tools or separate commands
-
-## Scraping Local Files and Folders
-
-You can index documentation from your local filesystem by using a `file://` URL as the source. This works in both the Web UI and CLI.
-
-**Examples:**
-
-- Web: `https://react.dev/reference/react`
-- Local file: `file:///Users/me/docs/index.html`
-- Local folder: `file:///Users/me/docs/my-library`
-
-**Requirements:**
-
-- All files with a MIME type of `text/*` are processed. This includes HTML, Markdown, plain text, and source code files such as `.js`, `.ts`, `.tsx`, `.css`, etc. Binary files, PDFs, images, and other non-text formats are ignored.
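The `text/*` filter described in the removed requirement can be approximated with Python's standard `mimetypes` module. This is a rough sketch of that old behavior only: guess-by-extension misses source files the server would still index, and the new README adds PDF and Word support that this filter would reject.

```python
import mimetypes

def is_indexable(path: str) -> bool:
    """Approximate the old filter: keep files whose guessed MIME type is text/*."""
    mime, _ = mimetypes.guess_type(path)
    return mime is not None and mime.startswith("text/")

for path in ["index.html", "notes.txt", "style.css", "logo.png", "manual.pdf"]:
    print(path, is_indexable(path))  # HTML/text/CSS pass; PNG and PDF are skipped
```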
-- You must use the `file://` prefix for local files/folders.
-- The path must be accessible to the server process.
-- **If running in Docker:**
-  - You must mount the local folder into the container and use the container path in your `file://` URL.
-  - Example Docker run:
-    ```bash
-    docker run --rm \
-      -e OPENAI_API_KEY="your-key" \
-      -v /absolute/path/to/docs:/docs:ro \
-      -v docs-mcp-data:/data \
-      -p 6280:6280 \
-      ghcr.io/arabold/docs-mcp-server:latest \
-      scrape mylib file:///docs/my-library
-    ```
-  - In the Web UI, enter the path as `file:///docs/my-library` (matching the container path).
+The open-source alternative to **Context7**, **Nia**, and **Ref.Tools**.
 
-
+- ✅ **Up-to-Date Context:** Fetches documentation directly from official sources on demand.
+- 🎯 **Version-Specific:** Queries target the exact library versions in your project.
+- 💡 **Reduces Hallucinations:** Grounds LLMs in real documentation.
+- 🔒 **Private & Local:** Runs entirely on your machine; your code never leaves your network.
+- 🧩 **Broad Compatibility:** Works with any MCP-compatible client (Claude, Cline, etc.).
+- 📁 **Multiple Sources:** Index websites, GitHub repositories, local folders, and zip archives.
+- 📄 **Rich File Support:** Processes HTML, Markdown, PDF, Word (.docx), Excel, PowerPoint, and source code.
 
-
+---
 
-
+## 🚀 Quick Start
 
-**Start the
+**1. Start the server** (requires Node.js 20+):
 
 ```bash
-
-git clone https://github.com/arabold/docs-mcp-server.git
-cd docs-mcp-server
-
-# Set your environment variables
-export OPENAI_API_KEY="your-key-here"
-
-# Start all services
-docker compose up -d
+npx @arabold/docs-mcp-server@latest
 ```
 
-**
-
-- **Worker** (port 8080): Handles documentation processing jobs
-- **MCP Server** (port 6280): Provides `/sse` endpoint for AI tools
-- **Web Interface** (port 6281): Browser-based management interface
+**2. Open the Web UI** at **[http://localhost:6280](http://localhost:6280)** to add documentation.
 
-**
+**3. Connect your AI client** by adding this to your MCP settings (e.g., `claude_desktop_config.json`):
 
 ```json
 {
   "mcpServers": {
     "docs-mcp-server": {
       "type": "sse",
-      "url": "http://localhost:6280/sse",
-      "disabled": false,
-      "autoApprove": []
+      "url": "http://localhost:6280/sse"
     }
   }
 }
 ```
 
-**
-
-```json
-// SSE (Server-Sent Events)
-"type": "sse", "url": "http://localhost:6280/sse"
-
-// HTTP (Streamable)
-"type": "http", "url": "http://localhost:6280/mcp"
-```
-
-**Access interfaces:**
-
-- Web Interface: `http://localhost:6281`
-- MCP Endpoint (HTTP): `http://localhost:6280/mcp`
-- MCP Endpoint (SSE): `http://localhost:6280/sse`
-
-This architecture allows independent scaling of processing (workers) and user interfaces.
-
-## Embeddings
-
-Set the embedding model with YAML (`embeddings.model`), `DOCS_MCP_EMBEDDING_MODEL`, or `--embedding-model`. If you leave the model empty but provide `OPENAI_API_KEY`, the server defaults to `text-embedding-3-small`. Provider credentials use the provider-specific environment variables below.
-
-| Variable                           | Description                                           |
-| ---------------------------------- | ----------------------------------------------------- |
-| `DOCS_MCP_EMBEDDING_MODEL`         | Embedding model to use (see below for options).       |
-| `OPENAI_API_KEY`                   | OpenAI API key for embeddings.                        |
-| `OPENAI_API_BASE`                  | Custom OpenAI-compatible API endpoint (e.g., Ollama). |
-| `GOOGLE_API_KEY`                   | Google API key for Gemini embeddings.                 |
-| `GOOGLE_APPLICATION_CREDENTIALS`   | Path to Google service account JSON for Vertex AI.    |
-| `AWS_ACCESS_KEY_ID`                | AWS key for Bedrock embeddings.                       |
-| `AWS_SECRET_ACCESS_KEY`            | AWS secret for Bedrock embeddings.                    |
-| `AWS_REGION`                       | AWS region for Bedrock.                               |
-| `AZURE_OPENAI_API_KEY`             | Azure OpenAI API key.                                 |
-| `AZURE_OPENAI_API_INSTANCE_NAME`   | Azure OpenAI instance name.                           |
-| `AZURE_OPENAI_API_DEPLOYMENT_NAME` | Azure OpenAI deployment name.                         |
-| `AZURE_OPENAI_API_VERSION`         | Azure OpenAI API version.                             |
-
-See [examples above](#alternative-using-docker) for usage.
-
-### Embedding Model Options
-
-Set `DOCS_MCP_EMBEDDING_MODEL` to one of:
-
-- `text-embedding-3-small` (default, OpenAI)
-- `openai:snowflake-arctic-embed2` (OpenAI-compatible, Ollama)
-- `vertex:text-embedding-004` (Google Vertex AI)
-- `gemini:embedding-001` (Google Gemini)
-- `aws:amazon.titan-embed-text-v1` (AWS Bedrock)
-- `microsoft:text-embedding-ada-002` (Azure OpenAI)
-- Or any OpenAI-compatible model name
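Whichever provider is chosen, vector search over the resulting embeddings reduces to ranking stored chunks by cosine similarity against the query embedding. A toy illustration with made-up 3-dimensional vectors (real models such as `text-embedding-3-small` return 1536 dimensions), not the server's actual search code:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend these are stored chunk embeddings (values invented for illustration)
chunks = {
    "useState hook": [0.9, 0.1, 0.0],
    "docker volumes": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "react state management"
best = max(chunks, key=lambda name: cosine(query, chunks[name]))
print(best)  # useState hook
```

A hybrid setup, as the old README described, would combine this score with a full-text (keyword) score before ranking.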
-
-### Provider-Specific Configuration Examples
-
-Here are complete configuration examples for different embedding providers:
-
-**OpenAI (Default):**
-
-```bash
-OPENAI_API_KEY="sk-proj-your-openai-api-key" \
-DOCS_MCP_EMBEDDING_MODEL="text-embedding-3-small" \
-npx @arabold/docs-mcp-server@latest
-```
-
-**Ollama (Local):**
-
-```bash
-OPENAI_API_KEY="ollama" \
-OPENAI_API_BASE="http://localhost:11434/v1" \
-DOCS_MCP_EMBEDDING_MODEL="nomic-embed-text" \
-npx @arabold/docs-mcp-server@latest
-```
-
-**LM Studio (Local):**
-
-```bash
-OPENAI_API_KEY="lmstudio" \
-OPENAI_API_BASE="http://localhost:1234/v1" \
-DOCS_MCP_EMBEDDING_MODEL="text-embedding-qwen3-embedding-4b" \
-npx @arabold/docs-mcp-server@latest
-```
-
-**Google Gemini:**
-
-```bash
-GOOGLE_API_KEY="your-google-api-key" \
-DOCS_MCP_EMBEDDING_MODEL="gemini:embedding-001" \
-npx @arabold/docs-mcp-server@latest
-```
-
-**Google Vertex AI:**
-
-```bash
-GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/gcp-service-account.json" \
-DOCS_MCP_EMBEDDING_MODEL="vertex:text-embedding-004" \
-npx @arabold/docs-mcp-server@latest
-```
-
-**AWS Bedrock:**
+See **[Connecting Clients](docs/guides/mcp-clients.md)** for VS Code (Cline, Roo) and other setup options.
 
-
-
-AWS_SECRET_ACCESS_KEY="your-aws-secret-access-key" \
-AWS_REGION="us-east-1" \
-DOCS_MCP_EMBEDDING_MODEL="aws:amazon.titan-embed-text-v1" \
-npx @arabold/docs-mcp-server@latest
-```
-
-**Azure OpenAI:**
+<details>
+<summary>Alternative: Run with Docker</summary>
 
 ```bash
-
-
-
-
-
-
+docker run --rm \
+  -v docs-mcp-data:/data \
+  -v docs-mcp-config:/config \
+  -p 6280:6280 \
+  ghcr.io/arabold/docs-mcp-server:latest \
+  --protocol http --host 0.0.0.0 --port 6280
 ```
 
-
-
-For enterprise authentication and security features, see the [Authentication Guide](docs/infrastructure/authentication.md).
-
-## Telemetry
-
-The Docs MCP Server includes privacy-first telemetry to help improve the product. We collect anonymous usage data to understand how the tool is used and identify areas for improvement.
-
-### What We Collect
-
-- Command usage patterns and success rates
-- Tool execution metrics (counts, durations, error types)
-- Pipeline job statistics (progress, completion rates)
-- Service configuration patterns (auth enabled, read-only mode)
-- Performance metrics (response times, processing efficiency)
-- Protocol usage (stdio vs HTTP, transport modes)
+</details>
 
-###
+### 🧠 Configure Embedding Model (Recommended)
 
-
-- URLs being scraped or accessed
-- Document content or scraped data
-- Authentication tokens or credentials
-- Personal information or identifying data
+Using an embedding model is **optional** but dramatically improves search quality by enabling semantic vector search.
 
-
-
-You can disable telemetry collection entirely:
-
-**Option 1: CLI Flag**
-
-```bash
-npx @arabold/docs-mcp-server@latest --no-telemetry
-```
-
-**Option 2: Environment Variable**
+**Example: Enable OpenAI Embeddings**
 
 ```bash
-
-```
-
-**Option 3: Docker**
-
-```bash
-docker run \
-  -e DOCS_MCP_TELEMETRY=false \
-  -v docs-mcp-data:/data \
-  -p 6280:6280 \
-  ghcr.io/arabold/docs-mcp-server:latest
+OPENAI_API_KEY="sk-proj-..." npx @arabold/docs-mcp-server@latest
 ```
 
-
+See **[Embedding Models](docs/guides/embedding-models.md)** for configuring **Ollama**, **Gemini**, **Azure**, and others.
 
-
+---
 
-
+## 📚 Documentation
 
-
--
--
--
+### Getting Started
+- **[Installation](docs/setup/installation.md)**: Detailed setup guides for Docker, Node.js (npx), and Embedded mode.
+- **[Connecting Clients](docs/guides/mcp-clients.md)**: How to connect Claude, VS Code (Cline/Roo), and other MCP clients.
+- **[Basic Usage](docs/guides/basic-usage.md)**: Using the Web UI, CLI, and scraping local files.
+- **[Configuration](docs/setup/configuration.md)**: Full reference for config files and environment variables.
+- **[Embedding Models](docs/guides/embedding-models.md)**: Configure OpenAI, Ollama, Gemini, and other providers.
 
-
+### Key Concepts & Architecture
+- **[Deployment Modes](docs/infrastructure/deployment-modes.md)**: Standalone vs. Distributed (Docker Compose).
+- **[Authentication](docs/infrastructure/authentication.md)**: Securing your server with OAuth2/OIDC.
+- **[Telemetry](docs/infrastructure/telemetry.md)**: Privacy-first usage data collection.
+- **[Architecture](ARCHITECTURE.md)**: Deep dive into the system design.
 
-
+---
 
-
+## 🤝 Contributing
 
-
+We welcome contributions! Please see **[CONTRIBUTING.md](CONTRIBUTING.md)** for development guidelines and setup instructions.
 
 ## License
 