@arabold/docs-mcp-server 1.15.1 → 1.16.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,49 +1,44 @@
  # Docs MCP Server: Your AI's Up-to-Date Documentation Expert
 
- AI coding assistants often struggle with outdated documentation, leading to incorrect suggestions or hallucinated code examples. Verifying AI responses against specific library versions can be time-consuming and inefficient.
+ AI coding assistants often struggle with outdated documentation and hallucinations. The **Docs MCP Server** solves this by providing a personal, always-current knowledge base for your AI. It **indexes 3rd party documentation** from various sources (websites, GitHub, npm, PyPI, local files) and offers powerful, version-aware search tools via the Model Context Protocol (MCP).
 
- The **Docs MCP Server** solves this by acting as a personal, always-current knowledge base for your AI assistant. Its primary purpose is to **index 3rd party documentation** the libraries you actually use in your codebase. It scrapes websites, GitHub repositories, package managers (npm, PyPI), and even local files, cataloging the docs locally. It then provides powerful search tools via the Model Context Protocol (MCP) to your coding agent.
-
- This enables your LLM agent to access the **latest official documentation** for any library you add, dramatically improving the quality and reliability of generated code and integration details.
-
- By grounding AI responses in accurate, version-aware context, the Docs MCP Server enables you to receive concise and relevant integration details and code snippets, improving the reliability and efficiency of LLM-assisted development.
-
- It's **free**, **open-source**, runs **locally** for privacy, and integrates seamlessly into your development workflow.
+ This enables your AI agent to access the **latest official documentation**, dramatically improving the quality and reliability of generated code and integration details. It's **free**, **open-source**, runs **locally** for privacy, and integrates seamlessly into your development workflow.
 
  ## Why Use the Docs MCP Server?
 
  LLM-assisted coding promises speed and efficiency, but often falls short due to:
 
- - 🌀 **Stale Knowledge:** LLMs train on snapshots of the internet, quickly falling behind new library releases and API changes.
+ - 🌀 **Stale Knowledge:** LLMs train on snapshots of the internet and quickly fall behind new library releases and API changes.
  - 👻 **Code Hallucinations:** AI can invent plausible-looking code that is syntactically correct but functionally wrong or uses non-existent APIs.
- - ❓ **Version Ambiguity:** Generic answers rarely account for the specific version dependencies in _your_ project, leading to subtle bugs.
+ - ❓ **Version Ambiguity:** Generic answers rarely account for the specific version dependencies in your project, leading to subtle bugs.
  - ⏳ **Verification Overhead:** Developers spend valuable time double-checking AI suggestions against official documentation.
 
- **The Docs MCP Server tackles these problems head-on by:**
+ **The Docs MCP Server solves these problems by:**
 
- - ✅ **Providing Always Up-to-Date Context:** It fetches and indexes documentation _directly_ from official sources (websites, GitHub, npm, PyPI, local files) on demand.
- - 🎯 **Delivering Version-Specific Answers:** Search queries can target exact library versions, ensuring the information aligns with your project's dependencies.
- - 💡 **Reducing Hallucinations:** By grounding the LLM in real documentation, it provides accurate examples and integration details.
+ - ✅ **Providing Up-to-Date Context:** Fetches and indexes documentation directly from official sources (websites, GitHub, npm, PyPI, local files) on demand.
+ - 🎯 **Delivering Version-Specific Answers:** Search queries can target exact library versions, ensuring information matches your project's dependencies.
+ - 💡 **Reducing Hallucinations:** Grounds the LLM in real documentation for accurate examples and integration details.
  - ⚡ **Boosting Productivity:** Get trustworthy answers faster, integrated directly into your AI assistant workflow.
 
  ## ✨ Key Features
 
- - **Up-to-Date Knowledge:** Fetches the latest documentation directly from the source.
- - **Version-Aware Search:** Get answers relevant to specific library versions (e.g., `react@18.2.0` vs `react@17.0.0`).
- - **Accurate Snippets:** Reduces AI hallucinations by using context from official docs.
- - **Web Interface:** Provides a easy-to-use web interface for searching and managing documentation.
- - **Broad Source Compatibility:** Scrapes websites, GitHub repos, package manager sites (npm, PyPI), and even local file directories.
- - **Intelligent Processing:** Automatically chunks documentation semantically and generates embeddings.
- - **Flexible Embedding Models:** Supports OpenAI (incl. compatible APIs like Ollama), Google Gemini/Vertex AI, Azure OpenAI, AWS Bedrock, and more.
- - **Powerful Hybrid Search:** Combines vector similarity with full-text search for relevance.
- - **Local & Private:** Runs entirely on your machine, keeping your data and queries private.
- - **Free & Open Source:** Built for the community, by the community.
+ - **Accurate & Version-Aware AI Responses:** Provides up-to-date, version-specific documentation to reduce AI hallucinations and improve code accuracy.
+ - **Broad Source Compatibility:** Scrapes documentation from websites, GitHub repos, package manager sites (npm, PyPI), and local file directories.
+ - **Advanced Search & Processing:** Intelligently chunks documentation semantically, generates embeddings, and combines vector similarity with full-text search (see the sketch after this list).
+ - **Flexible Embedding Models:** Supports various providers including OpenAI (and compatible APIs), Google Gemini/Vertex AI, Azure OpenAI, and AWS Bedrock.
+ - **Web Interface:** Easy-to-use web interface for searching and managing documentation.
+ - **Local & Private:** Runs entirely on your machine, ensuring data and queries remain private.
+ - **Free & Open Source:** Community-driven and freely available.
  - **Simple Deployment:** Easy setup via Docker or `npx`.
  - **Seamless Integration:** Works with MCP-compatible clients (like Claude, Cline, Roo).
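
The hybrid search mentioned in the list above combines two rankings: nearest-neighbor matches from the embedding index and keyword matches from full-text search. The sketch below shows reciprocal rank fusion, one common way to merge such rankings; it is illustrative only, and the server's actual scoring may differ.

```typescript
// Illustrative sketch of hybrid-search score fusion via reciprocal rank fusion.
// Not the server's actual ranking code; the names and the constant k are assumptions.
type Ranked = { id: string; rank: number }; // rank: 1 = best

function reciprocalRankFusion(lists: Ranked[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    for (const { id, rank } of list) {
      // Each list contributes more for documents it ranks near the top.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    }
  }
  return scores; // higher is better; sort descending for the final order
}
```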
 
+ > **What is semantic chunking?**
+ >
+ > Semantic chunking splits documentation into meaningful sections based on structure (headings, code blocks, and tables) rather than arbitrary text size. Docs MCP Server preserves logical boundaries, keeps code and tables intact, and removes navigation clutter from HTML docs. This ensures LLMs receive coherent, context-rich information for more accurate and relevant answers.
+
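
To make the note above concrete, here is a minimal sketch of heading-based chunking for Markdown, assuming fenced code blocks must stay intact. It is a simplification for illustration, not the server's implementation, which also handles tables, HTML cleanup, and embedding generation.

```typescript
// Illustrative sketch of semantic chunking: split Markdown on headings,
// but never inside a fenced code block. Not the server's actual pipeline.
function chunkMarkdown(markdown: string): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let inCodeFence = false;
  for (const line of markdown.split("\n")) {
    if (line.trimStart().startsWith("```")) inCodeFence = !inCodeFence;
    // Start a new chunk at each heading, keeping code fences intact.
    if (!inCodeFence && /^#{1,6}\s/.test(line) && current.length > 0) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}
```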
  ## How to Run the Docs MCP Server
 
- Get up and running quickly! We recommend using Docker Desktop (Docker Compose) for the easiest setup and management.
+ Get started quickly:
 
  - [Recommended: Docker Desktop](#recommended-docker-desktop)
  - [Alternative: Using Docker](#alternative-using-docker)
@@ -51,100 +46,100 @@ Get up and running quickly! We recommend using Docker Desktop (Docker Compose) f
 
  ## Recommended: Docker Desktop
 
- This method provides a persistent local setup by running the server and web interface using Docker Compose. It requires cloning the repository but simplifies managing both services together.
-
- 1. **Ensure Docker and Docker Compose are installed and running.**
- 2. **Clone the repository:**
- ```bash
- git clone https://github.com/arabold/docs-mcp-server.git
- cd docs-mcp-server
- ```
- 3. **Set up your environment:**
- Copy the example environment file and edit it to add your OpenAI API key (required):
-
- ```bash
- cp .env.example .env
- # Edit the .env file and set your OpenAI API key:
- ```
+ Run the server and web interface together using Docker Compose.
 
- Example `.env`:
-
- ```
- OPENAI_API_KEY=your-api-key-here
- ```
-
- For additional configuration options (e.g., other providers, advanced settings), see the [Configuration](#configuration) section.
-
- 4. **Launch the services:**
- Run this command from the repository's root directory. It will build the images (if necessary) and start the server and web interface in the background.
-
- ```bash
- docker compose up -d
- ```
-
- - `-d`: Runs the containers in detached mode (in the background). Omit this to see logs directly in your terminal.
-
- **Note:** If you pull updates for the repository (e.g., using `git pull`), you'll need to rebuild the Docker images to include the changes by running `docker compose up -d --build`.
+ 1. **Install Docker and Docker Compose.**
+ 2. **Clone the repository:**
+ ```bash
+ git clone https://github.com/arabold/docs-mcp-server.git
+ cd docs-mcp-server
+ ```
+ 3. **Set up your environment:**
+ Copy the example environment file and add your OpenAI API key:
+ ```bash
+ cp .env.example .env
+ # Edit .env and set your OpenAI API key
+ ```
+ 4. **Start the services:**
+ ```bash
+ docker compose up -d
+ ```
+ - Use `-d` for detached mode. Omit it to see logs in your terminal.
+ - To rebuild after updates: `docker compose up -d --build`.
+ 5. **Configure your MCP client:**
+ Add this to your MCP settings:
+ ```json
+ {
+ "mcpServers": {
+ "docs-mcp-server": {
+ "url": "http://localhost:6280/sse",
+ "disabled": false,
+ "autoApprove": []
+ }
+ }
+ }
+ ```
+ Restart your AI assistant after updating the config.
+ 6. **Access the Web Interface:**
+ Open `http://localhost:6281` in your browser.
 
- 5. **Configure your MCP client:**
- Add the following configuration block to your MCP settings file (e.g., for Claude, Cline, Roo):
+ **Benefits:**
 
- ```json
- {
- "mcpServers": {
- "docs-mcp-server": {
- "url": "http://localhost:6280/sse", // Connects via HTTP to the Docker Compose service
- "disabled": false,
- "autoApprove": []
- }
- }
- }
- ```
+ - One command runs both server and web UI
+ - Persistent data storage via Docker volume
+ - Easy config via `.env`
 
- Restart your AI assistant application after updating the configuration.
+ To stop, run `docker compose down`.
 
- Note: The Docker Compose setup runs the Docs MCP Server in HTTP mode (via SSE) by design, as it's intended as a standalone, connectable instance. It does not support stdio communication.
+ ### Adding Library Documentation
 
- 6. **Access the Web Interface:**
- The web interface will be available at `http://localhost:6281`.
+ 1. Open the Web Interface at `http://localhost:6281`.
+ 2. Use the "Queue New Scrape Job" form.
+ 3. Enter the documentation URL, library name, and (optionally) version.
+ 4. Click "Queue Job". Monitor progress in the Job Queue.
+ 5. Repeat for each library you want indexed.
 
- **Benefits of this method:**
+ Once a job completes, the docs are searchable via your AI assistant or the Web UI.
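
At the protocol level, "searchable via your AI assistant" means the assistant issues an MCP `tools/call` request against this server. A rough sketch of such a payload follows; the tool name `search_docs` appears elsewhere in this README, but the argument shape shown is an assumption, not the documented schema.

```typescript
// Hypothetical MCP `tools/call` payload for the server's search tool.
// `search_docs` is named in this README; the argument names below
// (library, version, query) are assumptions for illustration.
const searchRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "search_docs",
    arguments: { library: "react", version: "18.2.0", query: "useEffect cleanup" },
  },
};
```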
 
- - Runs both the server and web UI with a single command.
- - Uses the local source code (rebuilds automatically if code changes and you run `docker compose up --build`).
- - Persistent data storage via the `docs-mcp-data` Docker volume.
- - Easy configuration management via the `.env` file.
+ ## Scraping Local Files and Folders
 
- To stop the services, run `docker compose down` from the repository directory.
+ You can index documentation from your local filesystem by using a `file://` URL as the source. This works in both the Web UI and CLI.
 
- ### Adding Library Documentation
+ **Examples:**
 
- ![Docs MCP Server Web Interface](docs/docs-mcp-server.png)
+ - Web: `https://react.dev/reference/react`
+ - Local file: `file:///Users/me/docs/index.html`
+ - Local folder: `file:///Users/me/docs/my-library`
 
- Once the Docs MCP Server is running, you can use the Web Interface to **add new documentation** to be indexed or **search existing documentation**.
+ **Requirements:**
 
- 1. **Open the Web Interface:** If you used the recommended Docker Compose setup, navigate your browser to `http://localhost:6281`.
- 2. **Find the "Queue New Scrape Job" Form:** This is usually prominently displayed on the main page.
- 3. **Enter the Details:**
- - **URL:** Provide the starting URL for the documentation you want to index (e.g., `https://react.dev/reference/react`, `https://github.com/expressjs/express`, `https://docs.python.org/3/`).
- - **Library Name:** Give it a short, memorable name (e.g., `react`, `express`, `python`). This is how you'll refer to it in searches.
- - **Version (Optional):** If you want to index a specific version, enter it here (e.g., `18.2.0`, `4.17.1`, `3.11`). If left blank, the server often tries to detect the latest version or indexes it as unversioned.
- - **(Optional) Advanced Settings:** Adjust `Scope` (e.g., 'Subpages', 'Hostname', 'Domain'), `Max Pages`, `Max Depth`, and `Follow Redirects` if needed. Defaults are usually sufficient.
- 4. **Click "Queue Job":** The server will start a background job to fetch, process, and index the documentation. You can monitor its progress in the "Job Queue" section of the Web UI.
- 5. **Repeat:** Repeat steps 3-4 for every library whose documentation you want the server to manage.
+ - All files with a MIME type of `text/*` are processed. This includes HTML, Markdown, plain text, and source code files such as `.js`, `.ts`, `.tsx`, `.css`, etc. Binary files, PDFs, images, and other non-text formats are ignored.
+ - You must use the `file://` prefix for local files/folders.
+ - The path must be accessible to the server process.
+ - **If running in Docker or Docker Compose:**
+ - You must mount the local folder into the container and use the container path in your `file://` URL.
+ - Example Docker run:
+ ```bash
+ docker run --rm \
+ -e OPENAI_API_KEY="your-key" \
+ -v /absolute/path/to/docs:/docs:ro \
+ -v docs-mcp-data:/data \
+ ghcr.io/arabold/docs-mcp-server:latest \
+ scrape mylib file:///docs/my-library
+ ```
+ - In the Web UI, enter the path as `file:///docs/my-library` (matching the container path).
 
- **That's it!** Once a job completes successfully, the documentation for that library and version becomes available for searching through your connected AI coding assistant (using the `search_docs` tool) or directly in the Web UI by clicking on the library name in the "Indexed Documenation" section.
+ See the tooltips in the Web UI and CLI help for more details.
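
A rough sketch of the `text/*` rule described above; the extension-to-MIME map is an assumption for illustration, and the server may detect MIME types differently.

```typescript
// Illustrative sketch of the `text/*` filter for local file scraping.
// The extension map is a stand-in; the server's detection may differ.
const mimeByExtension: Record<string, string> = {
  ".html": "text/html",
  ".md": "text/markdown",
  ".txt": "text/plain",
  ".ts": "text/x-typescript",
  ".css": "text/css",
  ".pdf": "application/pdf", // not text/*, so it would be skipped
};

function isIndexable(filePath: string): boolean {
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot).toLowerCase();
  const mime = mimeByExtension[ext] ?? "application/octet-stream";
  return mime.startsWith("text/"); // only text/* files are processed
}
```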
 
  ## Alternative: Using Docker
 
- This approach is easy, straightforward, and doesn't require cloning the repository.
-
- 1. **Ensure Docker is installed and running.**
- 2. **Configure your MCP settings:**
+ > **Note:** The published Docker images support both x86_64 (amd64) and Apple Silicon (arm64).
 
- **Claude/Cline/Roo Configuration Example:**
- Add the following configuration block to your MCP settings file (adjust path as needed):
+ This method is simple and doesn't require cloning the repository.
 
+ 1. **Install and start Docker.**
+ 2. **Configure your MCP client:**
+ Add this block to your MCP settings (adjust as needed):
  ```json
  {
  "mcpServers": {
@@ -161,7 +156,7 @@ This approach is easy, straightforward, and doesn't require cloning the reposito
  "ghcr.io/arabold/docs-mcp-server:latest"
  ],
  "env": {
- "OPENAI_API_KEY": "sk-proj-..." // Required if using OpenAI (default)
+ "OPENAI_API_KEY": "sk-proj-..." // Your OpenAI API key
  },
  "disabled": false,
  "autoApprove": []
@@ -169,56 +164,52 @@ This approach is easy, straightforward, and doesn't require cloning the reposito
  }
  }
  ```
-
- Remember to replace `"sk-proj-..."` with your actual OpenAI API key and restart the application.
-
- 3. **That's it!** The server will now be available to your AI assistant.
+ Replace `sk-proj-...` with your OpenAI API key. Restart your application.
+ 3. **Done!** The server is now available to your AI assistant.
 
  **Docker Container Settings:**
 
- - `-i`: Keep STDIN open, crucial for MCP communication over stdio.
- - `--rm`: Automatically remove the container when it exits.
- - `-e OPENAI_API_KEY`: **Required.** Set your OpenAI API key.
- - `-v docs-mcp-data:/data`: **Required for persistence.** Mounts a Docker named volume `docs-mcp-data` to store the database. You can replace with a specific host path if preferred (e.g., `-v /path/on/host:/data`).
+ - `-i`: Keeps STDIN open for MCP communication.
+ - `--rm`: Removes the container on exit.
+ - `-e OPENAI_API_KEY`: **Required.**
+ - `-v docs-mcp-data:/data`: **Required for persistence.**
+
+ You can pass any configuration environment variable (see [Configuration](#configuration)) using `-e`.
 
- Any of the configuration environment variables (see [Configuration](#configuration) above) can be passed to the container using the `-e` flag. For example:
+ **Examples:**
 
  ```bash
- # Example 1: Using OpenAI embeddings (default)
+ # OpenAI embeddings (default)
  docker run -i --rm \
- -e OPENAI_API_KEY="your-key-here" \
+ -e OPENAI_API_KEY="your-key" \
  -e DOCS_MCP_EMBEDDING_MODEL="text-embedding-3-small" \
  -v docs-mcp-data:/data \
- ghcr.io/arabold/docs-mcp-server:latest # Runs MCP server (stdio by default)
- # To run MCP server in HTTP mode on port 6280, append to the line above:
- # --protocol http --port 6280
+ ghcr.io/arabold/docs-mcp-server:latest
 
- # Example 2: Using OpenAI-compatible API (like Ollama)
+ # OpenAI-compatible API (Ollama)
  docker run -i --rm \
- -e OPENAI_API_KEY="your-key-here" \
+ -e OPENAI_API_KEY="your-key" \
  -e OPENAI_API_BASE="http://localhost:11434/v1" \
  -e DOCS_MCP_EMBEDDING_MODEL="embeddings" \
  -v docs-mcp-data:/data \
  ghcr.io/arabold/docs-mcp-server:latest
 
- # Example 3a: Using Google Cloud Vertex AI embeddings
+ # Google Vertex AI
  docker run -i --rm \
- -e OPENAI_API_KEY="your-openai-key" \ # For OpenAI provider
  -e DOCS_MCP_EMBEDDING_MODEL="vertex:text-embedding-004" \
  -e GOOGLE_APPLICATION_CREDENTIALS="/app/gcp-key.json" \
  -v docs-mcp-data:/data \
  -v /path/to/gcp-key.json:/app/gcp-key.json:ro \
  ghcr.io/arabold/docs-mcp-server:latest
 
- # Example 3b: Using Google Generative AI (Gemini) embeddings
+ # Google Gemini
  docker run -i --rm \
- -e OPENAI_API_KEY="your-openai-key" \ # For OpenAI provider
  -e DOCS_MCP_EMBEDDING_MODEL="gemini:embedding-001" \
  -e GOOGLE_API_KEY="your-google-api-key" \
  -v docs-mcp-data:/data \
  ghcr.io/arabold/docs-mcp-server:latest
 
- # Example 4: Using AWS Bedrock embeddings
+ # AWS Bedrock
  docker run -i --rm \
  -e AWS_ACCESS_KEY_ID="your-aws-key" \
  -e AWS_SECRET_ACCESS_KEY="your-aws-secret" \
@@ -227,7 +218,7 @@ docker run -i --rm \
  -v docs-mcp-data:/data \
  ghcr.io/arabold/docs-mcp-server:latest
 
- # Example 5: Using Azure OpenAI embeddings
+ # Azure OpenAI
  docker run -i --rm \
  -e AZURE_OPENAI_API_KEY="your-azure-key" \
  -e AZURE_OPENAI_API_INSTANCE_NAME="your-instance" \
@@ -238,242 +229,144 @@ docker run -i --rm \
  ghcr.io/arabold/docs-mcp-server:latest
  ```
 
- ### Launching Web Interface
-
- You can access a web-based GUI at `http://localhost:6281` to manage and search library documentation through your browser.
+ ### Web Interface via Docker
 
- If you're running the server with Docker, use Docker for the web interface as well:
+ Access the web UI at `http://localhost:6281`:
 
  ```bash
  docker run --rm \
- -e OPENAI_API_KEY="your-openai-api-key-here" \
+ -e OPENAI_API_KEY="your-openai-api-key" \
  -v docs-mcp-data:/data \
  -p 6281:6281 \
  ghcr.io/arabold/docs-mcp-server:latest \
  web --port 6281
  ```
 
- Make sure to:
+ - Use the same volume name as your server.
+ - Map port 6281 with `-p 6281:6281`.
+ - Pass config variables with `-e` as needed.
 
- - Use the same volume name (`docs-mcp-data` in this example) as your server
- - Map port 6281 with `-p 6281:6281`
- - Pass any configuration environment variables with `-e` flags
+ ### CLI via Docker
 
- ### Using the CLI
-
- You can use the CLI to manage documentation directly via Docker by passing CLI commands after the image name:
+ Run CLI commands by appending them after the image name:
 
  ```bash
  docker run --rm \
- -e OPENAI_API_KEY="your-openai-api-key-here" \
+ -e OPENAI_API_KEY="your-openai-api-key" \
  -v docs-mcp-data:/data \
  ghcr.io/arabold/docs-mcp-server:latest \
- <command> [options] # e.g., list, scrape <library> <url>, search <library> <query>
+ <command> [options]
  ```
 
  Example:
 
  ```bash
  docker run --rm \
- -e OPENAI_API_KEY="your-openai-api-key-here" \
+ -e OPENAI_API_KEY="your-openai-api-key" \
  -v docs-mcp-data:/data \
  ghcr.io/arabold/docs-mcp-server:latest \
  list
  ```
 
- Make sure to use the same volume name (`docs-mcp-data` in this example) as your MCP server container if you want them to share data. Any of the configuration environment variables (see [Configuration](#configuration) above) can be passed using `-e` flags.
-
- The main commands available are:
-
- - `scrape`: Scrapes and indexes documentation from a URL.
- - `search`: Searches the indexed documentation.
- - `list`: Lists all indexed libraries.
- - `remove`: Removes indexed documentation.
- - `fetch-url`: Fetches a single URL and converts to Markdown.
- - `find-version`: Finds the best matching version for a library.
+ Use the same volume for data sharing. For command help, run:
 
- For detailed command usage, run the CLI with the --help flag (e.g., `docker run ... ghcr.io/arabold/docs-mcp-server:latest --help`).
+ ```bash
+ docker run --rm ghcr.io/arabold/docs-mcp-server:latest --help
+ ```
 
  ## Alternative: Using npx
 
- This approach is useful when you need local file access (e.g., indexing documentation from your local file system). While this can also be achieved by mounting paths into a Docker container, using `npx` is simpler but requires a Node.js installation.
-
- 1. **Ensure Node.js is installed.**
- 2. **Configure your MCP settings:**
+ You can run the Docs MCP Server without installing or cloning the repo:
 
- **Claude/Cline/Roo Configuration Example:**
- Add the following configuration block to your MCP settings file:
-
- ```json
- {
- "mcpServers": {
- "docs-mcp-server": {
- "command": "npx",
- "args": ["-y", "@arabold/docs-mcp-server"],
- // This will run the default MCP server (stdio).
- // To run in HTTP mode, add arguments: e.g.
- // "args": ["-y", "@arabold/docs-mcp-server", "--protocol", "http", "--port", "6280"],
- "env": {
- "OPENAI_API_KEY": "sk-proj-..." // Required if using OpenAI (default)
- },
- "disabled": false,
- "autoApprove": []
- }
- }
- }
+ 1. **Run the server:**
+ ```bash
+ npx @arabold/docs-mcp-server@latest
  ```
+ 2. **Set your OpenAI API key:**
+ - Use the `OPENAI_API_KEY` environment variable.
+ - Example:
+ ```bash
+ OPENAI_API_KEY="sk-proj-..." npx @arabold/docs-mcp-server@latest
+ ```
+ 3. **Configure your MCP client:**
+ - Use the same settings as in the Docker example, but replace the `command` and `args` with the `npx` command above.
 
- Remember to replace `"sk-proj-..."` with your actual OpenAI API key and restart the application.
+ **Note:** Data is stored in a temporary directory and will not persist between runs. For persistent storage, use Docker or a local install.
 
- 3. **That's it!** The server will now be available to your AI assistant.
+ ### CLI via npx
 
- ### Launching Web Interface
-
- If you're running the MCP server with `npx` (as shown above, it runs by default), use `npx` for the web interface as well:
+ You can run CLI commands directly with npx, without installing the package globally:
 
  ```bash
- npx -y @arabold/docs-mcp-server web --port 6281
+ npx @arabold/docs-mcp-server@latest <command> [options]
  ```
 
- You can specify a different port for the web interface using its `--port` flag.
-
- The `npx` approach will use the default data directory on your system (typically in your home directory), ensuring consistency between server and web interface.
-
- ### Using the CLI
-
- If you're running the MCP server with `npx`, you can also use `npx` for CLI commands:
+ Example:
 
  ```bash
- npx -y @arabold/docs-mcp-server <command> [options]
+ npx @arabold/docs-mcp-server@latest list
  ```
 
- Example:
+ For command help, run:
 
  ```bash
- npx -y @arabold/docs-mcp-server list
+ npx @arabold/docs-mcp-server@latest --help
  ```
 
- The `npx` approach will use the default data directory on your system (typically in your home directory), ensuring consistency.
-
- For detailed command usage, run the CLI with the --help flag (e.g., `npx -y @arabold/docs-mcp-server --help`).
-
  ## Configuration
 
- The following environment variables are supported to configure the embedding model behavior. Specify them in your `.env` file or pass them as `-e` flags when running the server via Docker or npx.
-
- ### Embedding Model Configuration
-
- - `DOCS_MCP_EMBEDDING_MODEL`: **Optional.** Format: `provider:model_name` or just `model_name` (defaults to `text-embedding-3-small`). Supported providers and their required environment variables:
-
- - `openai` (default provider): Uses OpenAI's embedding models.
-
- - `OPENAI_API_KEY`: Your OpenAI API key. **Required if `openai` is the active provider.**
- - `OPENAI_ORG_ID`: **Optional.** Your OpenAI Organization ID
- - `OPENAI_API_BASE`: **Optional.** Custom base URL for OpenAI-compatible APIs (e.g., Ollama).
-
- - `vertex`: Uses Google Cloud Vertex AI embeddings
-
- - `GOOGLE_APPLICATION_CREDENTIALS`: **Required.** Path to service account JSON key file
-
- - `gemini`: Uses Google Generative AI (Gemini) embeddings
-
- - `GOOGLE_API_KEY`: **Required.** Your Google API key
-
- - `aws`: Uses AWS Bedrock embeddings
-
- - `AWS_ACCESS_KEY_ID`: **Required.** AWS access key
- - `AWS_SECRET_ACCESS_KEY`: **Required.** AWS secret key
- - `AWS_REGION` or `BEDROCK_AWS_REGION`: **Required.** AWS region for Bedrock
-
- - `microsoft`: Uses Azure OpenAI embeddings
- - `AZURE_OPENAI_API_KEY`: **Required.** Azure OpenAI API key
- - `AZURE_OPENAI_API_INSTANCE_NAME`: **Required.** Azure instance name
- - `AZURE_OPENAI_API_DEPLOYMENT_NAME`: **Required.** Azure deployment name
- - `AZURE_OPENAI_API_VERSION`: **Required.** Azure API version
-
- ### Vector Dimensions
-
- The database schema uses a fixed dimension of 1536 for embedding vectors. Only models that produce vectors with dimension ≤ 1536 are supported, except for certain providers (like Gemini) that support dimension reduction.
-
- For OpenAI-compatible APIs (like Ollama), use the `openai` provider with `OPENAI_API_BASE` pointing to your endpoint.
+ The Docs MCP Server is configured via environment variables. Set these in your shell, Docker, or MCP client config.
+
+ | Variable | Description |
+ | --- | --- |
+ | `DOCS_MCP_EMBEDDING_MODEL` | Embedding model to use (see below for options). |
+ | `OPENAI_API_KEY` | OpenAI API key for embeddings. |
+ | `OPENAI_API_BASE` | Custom OpenAI-compatible API endpoint (e.g., Ollama). |
+ | `GOOGLE_API_KEY` | Google API key for Gemini embeddings. |
+ | `GOOGLE_APPLICATION_CREDENTIALS` | Path to Google service account JSON for Vertex AI. |
+ | `AWS_ACCESS_KEY_ID` | AWS key for Bedrock embeddings. |
+ | `AWS_SECRET_ACCESS_KEY` | AWS secret for Bedrock embeddings. |
+ | `AWS_REGION` | AWS region for Bedrock. |
+ | `AZURE_OPENAI_API_KEY` | Azure OpenAI API key. |
+ | `AZURE_OPENAI_API_INSTANCE_NAME` | Azure OpenAI instance name. |
+ | `AZURE_OPENAI_API_DEPLOYMENT_NAME` | Azure OpenAI deployment name. |
+ | `AZURE_OPENAI_API_VERSION` | Azure OpenAI API version. |
+ | `DOCS_MCP_DATA_DIR` | Data directory (default: `./data`). |
+ | `DOCS_MCP_PORT` | Server port (default: `6281`). |
+
+ See [examples above](#alternative-using-docker) for usage.
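
As a rough picture of how these variables combine with their documented defaults (a sketch, not the server's actual configuration loader):

```typescript
// Sketch of reading the variables above with their documented defaults.
// Illustrative only; not the server's actual configuration code.
const config = {
  embeddingModel: process.env.DOCS_MCP_EMBEDDING_MODEL ?? "text-embedding-3-small",
  dataDir: process.env.DOCS_MCP_DATA_DIR ?? "./data",
  port: Number(process.env.DOCS_MCP_PORT ?? "6281"),
};
```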
+
+ ### Embedding Model Options
+
+ Set `DOCS_MCP_EMBEDDING_MODEL` to one of:
+
+ - `text-embedding-3-small` (default, OpenAI)
+ - `openai:llama2` (OpenAI-compatible, Ollama)
+ - `vertex:text-embedding-004` (Google Vertex AI)
+ - `gemini:embedding-001` (Google Gemini)
+ - `aws:amazon.titan-embed-text-v1` (AWS Bedrock)
+ - `microsoft:text-embedding-ada-002` (Azure OpenAI)
+ - Or any OpenAI-compatible model name
+
+ For more, see [ARCHITECTURE.md](ARCHITECTURE.md) and the [examples above](#alternative-using-docker).
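
The `provider:model_name` convention can be parsed as follows; a bare model name implies the default `openai` provider. A sketch for illustration, not the package's actual parser:

```typescript
// Sketch of the `provider:model_name` convention described above.
// Illustrative only; the package's real parsing may differ.
function parseEmbeddingModel(spec: string): { provider: string; model: string } {
  const idx = spec.indexOf(":");
  if (idx === -1) return { provider: "openai", model: spec }; // bare model name
  return { provider: spec.slice(0, idx), model: spec.slice(idx + 1) };
}

// parseEmbeddingModel("vertex:text-embedding-004")
//   -> { provider: "vertex", model: "text-embedding-004" }
```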
 
  ## Development
 
- This section covers running the server/CLI directly from the source code for development purposes. The primary usage method is via the public Docker image (`ghcr.io/arabold/docs-mcp-server:latest`), as detailed in the "Alternative: Using Docker" section, or via Docker Compose as described in the "Recommended: Docker Desktop" section.
-
- ### Running from Source
-
- > **Note:** Playwright browsers are not installed automatically during `npm install`. If you need to run tests or use features that require Playwright, run:
- >
- > ```bash
- > npx playwright install --no-shell --with-deps chromium
- > ```
-
- This provides an isolated environment and exposes the server via HTTP endpoints.
-
- This method is useful for contributing to the project or running un-published versions.
-
- 1. **Clone the repository:**
- ```bash
- git clone https://github.com/arabold/docs-mcp-server.git # Replace with actual URL if different
- cd docs-mcp-server
- ```
- 2. **Install dependencies:**
- ```bash
- npm install
- ```
- 3. **Build the project:**
- This compiles TypeScript to JavaScript in the `dist/` directory.
- ```bash
- npm run build
- ```
- 4. **Setup Environment:**
- Create and configure your `.env` file as described in the [Configuration](#configuration) section. This is crucial for providing the `OPENAI_API_KEY`.
-
- 5. **Run:**
- - **Default MCP Server (Development):**
- - Stdio mode (default): `npm run dev:server`
- - HTTP mode: `npm run dev:server:http` (uses default port)
- - Custom HTTP: `vite-node src/index.ts -- --protocol http --port <your_port>`
- - **Web Interface (Development):** `npm run dev:web`
- - This starts the web server (e.g., on port 6281) and watches for asset changes.
- - **CLI Commands (Development):** `npm run dev:cli -- <command> [options]`
- - Example: `npm run dev:cli -- list`
- - Example: `vite-node src/index.ts scrape <library> <url>`
- - **Production Mode (after `npm run build`):**
- - Default MCP Server (stdio): `npm run start` (or `node dist/index.js`)
- - MCP Server (HTTP): `npm run start -- --protocol http --port <your_port>` (or `node dist/index.js --protocol http --port <your_port>`)
- - Web Interface: `npm run web -- --port <web_port>` (or `node dist/index.js web --port <web_port>`)
- - CLI Commands: `npm run cli -- <command> [options]` (or `node dist/index.js <command> [options]`)
-
- ### Testing
-
- Since MCP servers communicate over stdio when run directly via Node.js (or `vite-node`), debugging can be challenging. We recommend using the [MCP Inspector](https://github.com/modelcontextprotocol/inspector).
-
- After building the project (`npm run build`):
-
- ```bash
- # For stdio mode (default)
- npx @modelcontextprotocol/inspector node dist/index.js
+ To develop or contribute to the Docs MCP Server:
 
- # For HTTP mode (e.g., on port 6280)
- npx @modelcontextprotocol/inspector node dist/index.js -- --protocol http --port 6280
- ```
+ - Fork the repository and create a feature branch.
+ - Follow the code conventions in [ARCHITECTURE.md](ARCHITECTURE.md).
+ - Write clear commit messages.
+ - Open a pull request with a clear description of your changes.
 
- If using `vite-node` for development:
-
- ```bash
- # For stdio mode (default)
- npx @modelcontextprotocol/inspector vite-node src/index.ts
-
- # For HTTP mode (e.g., on port 6280)
- npx @modelcontextprotocol/inspector vite-node src/index.ts -- --protocol http --port 6280
- ```
-
- The Inspector will provide a URL to access debugging tools in your browser.
+ For questions or suggestions, open an issue.
 
  ### Architecture
 
  For details on the project's architecture and design principles, please see [ARCHITECTURE.md](ARCHITECTURE.md).
 
  _Notably, the vast majority of this project's code was generated by the AI assistant Cline, leveraging the capabilities of this very MCP server._
+
+ ## License
+
+ This project is licensed under the MIT License. See [LICENSE](LICENSE) for details.