preflight-mcp 0.1.0 → 0.1.2

package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 preflight-mcp contributors
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md CHANGED
@@ -1,42 +1,193 @@
1
1
  # preflight-mcp
2
2
 
3
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
4
+ [![Node.js Version](https://img.shields.io/badge/node-%3E%3D18-brightgreen)](https://nodejs.org/)
5
+ [![MCP Compatible](https://img.shields.io/badge/MCP-Compatible-blue)](https://modelcontextprotocol.io/)
6
+
3
7
  > **English** | [中文](./README.zh-CN.md)
4
8
 
5
- An MCP (Model Context Protocol) **stdio** server
9
+ An MCP (Model Context Protocol) **stdio** server that creates evidence-based preflight bundles for GitHub repositories and library documentation.
6
10
 
7
11
  Each bundle contains:
8
12
  - A local copy of repo docs + code (normalized text)
9
13
  - A lightweight **full-text search index** (SQLite FTS5)
10
14
  - Agent-facing entry files: `START_HERE.md`, `AGENTS.md`, and `OVERVIEW.md` (factual-only, with evidence pointers)
11
15
 
12
- ## What you get
13
- - **10 Tools** to create/update/search/verify/read bundles
16
+ ## Features
17
+
18
+ - **13 MCP tools** to create/update/repair/search/verify/read bundles (plus resources)
19
+ - **De-duplication**: prevent repeated indexing of the same normalized inputs
20
+ - **Resilient GitHub fetching**: configurable git clone timeout + GitHub archive (zipball) fallback
21
+ - **Offline repair**: rebuild missing/empty derived artifacts (index/guides/overview) without re-fetching
14
22
  - **Static facts extraction** via `analysis/FACTS.json` (non-LLM)
15
23
  - **Evidence-based verification** to reduce hallucinations
16
24
  - **Resources** to read bundle files via `preflight://...` URIs
17
25
  - **Multi-path mirror backup** for cloud storage redundancy
18
26
  - **Resilient storage** with automatic failover when mounts are unavailable
27
+ - **Atomic bundle creation** with crash-safety and zero orphans
28
+ - **Fast background deletion**: delete requests return in under 100ms, a 100-300x faster response than blocking deletion
29
+ - **Auto-cleanup** on startup for historical orphan bundles
30
+
31
+ ## Architecture Improvements (v0.1.2)
32
+
33
+ ### 🚀 Atomic Bundle Creation
34
+ **Problem**: Bundle creation failures could leave incomplete orphan directories.
35
+
36
+ **Solution**: Temporary directory + atomic rename pattern:
37
+ 1. Create bundle in `tmpDir/bundles-wip/` (invisible to list)
38
+ 2. Validate completeness before making visible
39
+ 3. Atomic rename/move to final location
40
+ 4. Automatic cleanup on any failure
41
+
42
+ **Benefits**:
43
+ - ✅ Zero orphan bundles
44
+ - 🔒 Crash-safe (temp dirs auto-cleaned)
45
+ - 📏 Validation before visibility
46
+ - 🔄 Cross-filesystem fallback
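The temp-directory-plus-rename pattern described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: `createBundleAtomically`, the `build` callback, and the manifest-only validation step are hypothetical names and assumptions, and only Node.js built-in `fs`, `os`, and `path` calls are used.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: build a bundle in a work-in-progress directory,
// validate it, then publish it with a single rename.
function createBundleAtomically(
  finalDir: string,
  build: (wipDir: string) => void,
  wipRoot: string = path.join(os.tmpdir(), "bundles-wip"),
): void {
  fs.mkdirSync(wipRoot, { recursive: true });
  const wipDir = fs.mkdtempSync(path.join(wipRoot, "bundle-"));
  try {
    build(wipDir);
    // Validate before making the bundle visible.
    // Assumption: a present manifest.json is the completeness check.
    if (!fs.existsSync(path.join(wipDir, "manifest.json"))) {
      throw new Error("incomplete bundle: manifest.json missing");
    }
    fs.mkdirSync(path.dirname(finalDir), { recursive: true });
    try {
      // rename is atomic when source and target share a filesystem.
      fs.renameSync(wipDir, finalDir);
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "EXDEV") throw err;
      // Cross-filesystem fallback: copy instead of rename (not atomic).
      fs.cpSync(wipDir, finalDir, { recursive: true });
    }
  } finally {
    // Runs on success and failure; a no-op once the rename has succeeded.
    fs.rmSync(wipDir, { recursive: true, force: true });
  }
}
```

Because the work-in-progress directory lives outside the storage directory, a crash before the rename leaves nothing for a bundle listing to pick up.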
47
+
48
+ ### ⚡ Fast Background Deletion
49
+ **Problem**: Deleting large bundles could time out (10+ seconds).
50
+
51
+ **Solution**: Rename + background deletion:
52
+ 1. Instant rename to `.deleting.{timestamp}` (<100ms)
53
+ 2. Background deletion (fire-and-forget)
54
+ 3. Automatic cleanup of `.deleting` dirs on startup
55
+
56
+ **Benefits**:
57
+ - ⚡ 100-300x faster response (<100ms)
58
+ - 🔄 No blocking operations
59
+ - 👁️ Invisible to list (non-UUID format)
60
+ - 🛡️ Fallback to direct delete on rename failure
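A rough sketch of the rename-then-delete approach, under stated assumptions: `deleteBundleFast` is a hypothetical name, and the `<name>.deleting.<timestamp>` layout is one reading of the `.deleting.{timestamp}` naming above, not a confirmed detail of the implementation.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: hide the bundle with a cheap rename, then delete
// the renamed directory in the background. Returns the renamed path, or
// null when the fallback (blocking delete) was used.
function deleteBundleFast(bundleDir: string): string | null {
  const tombstone = path.join(
    path.dirname(bundleDir),
    `${path.basename(bundleDir)}.deleting.${Date.now()}`,
  );
  try {
    fs.renameSync(bundleDir, tombstone);
  } catch {
    // Fallback: delete directly when the rename fails.
    fs.rmSync(bundleDir, { recursive: true, force: true });
    return null;
  }
  // Fire-and-forget: the caller does not wait for the removal to finish.
  // Leftovers are expected to be cleaned up on the next startup.
  void fs.promises
    .rm(tombstone, { recursive: true, force: true })
    .catch(() => {});
  return tombstone;
}
```

The caller only pays for the rename, which is why the response time no longer depends on bundle size.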
61
+
62
+ ### 🔧 Auto-Cleanup on Startup
63
+ **Problem**: Historical orphan bundles need manual cleanup.
64
+
65
+ **Solution**: Automatic cleanup on MCP server startup:
66
+ 1. Scans storage directories for invalid bundles
67
+ 2. Checks manifest.json validity
68
+ 3. Deletes orphans older than 1 hour (safety margin)
69
+ 4. Cleans `.deleting` residues
70
+
71
+ **Benefits**:
72
+ - 🤖 Fully automatic
73
+ - 🛡️ Safe with 1-hour age threshold
74
+ - ⚡ Fast when no orphans (<10ms)
75
+ - 🚫 Non-blocking background execution
76
+
77
+ ### 🧹 Manual Cleanup Tool
78
+ **New Tool**: `preflight_cleanup_orphans`
79
+
80
+ Manually trigger orphan cleanup with full control:
81
+ ```json
+ {
+   "dryRun": true, // Only report, don't delete
+   "minAgeHours": 1 // Age threshold
+ }
+ ```
87
+
88
+ ### 🔍 UUID Validation
89
+ List and cleanup now strictly filter by UUID format:
90
+ - ✅ Only valid UUID v4 bundle IDs
91
+ - 🚫 Filters out system directories (`#recycle`, `tmp`)
92
+ - 🚫 Filters out `.deleting` directories
93
+ - 🛡️ Protects user custom directories
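As an illustration of the UUID filter, the sketch below keeps only directory names that look like UUID v4 values. The regular expression and the `isBundleId` / `filterBundleIds` names are assumptions made for this example; the package's own check may differ.

```typescript
// UUID v4: the version nibble is 4 and the variant nibble is 8, 9, a, or b.
const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Hypothetical helper: true only for names that are exactly a UUID v4.
function isBundleId(name: string): boolean {
  return UUID_V4.test(name);
}

// Hypothetical helper: drop system, `.deleting`, and custom directories.
function filterBundleIds(names: string[]): string[] {
  return names.filter(isBundleId);
}
```

Anchoring the pattern at both ends is what excludes `.deleting` directories: a suffix after a UUID no longer matches.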
94
+
95
+ For technical details, see:
96
+ - [ISSUES_ANALYSIS.md](./ISSUES_ANALYSIS.md) - Root cause analysis
97
+ - [IMPLEMENTATION_SUMMARY.md](./IMPLEMENTATION_SUMMARY.md) - Implementation details
98
+ - [CLEANUP_STRATEGY.md](./CLEANUP_STRATEGY.md) - MCP-specific cleanup design
99
+
100
+ ## Table of Contents
101
+
102
+ - [Requirements](#requirements)
103
+ - [Installation](#installation)
104
+ - [Quick Start](#quick-start)
105
+ - [Tools](#tools-13-total)
106
+ - [Environment Variables](#environment-variables)
107
+ - [Contributing](#contributing)
108
+ - [License](#license)
19
109
 
20
110
  ## Requirements
111
+
21
112
  - Node.js >= 18
22
113
  - `git` available on PATH
23
114
 
24
- ## Install
25
- Local dev:
26
- - `npm install`
27
- - `npm run build`
115
+ ## Installation
116
+
117
+ ### From npm (once published)
118
+
119
+ ```bash
+ npm install -g preflight-mcp
+ ```
122
+
123
+ ### Local Development
124
+
125
+ ```bash
+ git clone https://github.com/jonnyhoo/preflight-mcp.git
+ cd preflight-mcp
+ npm install
+ npm run build
+ ```
131
+
132
+ ## Quick Start
133
+
134
+ ### 1. Configure MCP Host (e.g., Claude Desktop)
28
135
 
29
- Publish install (after you publish to npm):
30
- - `npm install -g preflight-mcp`
136
+ Add to your MCP configuration file:
31
137
 
32
- ## Use (stdio MCP server)
33
- This server communicates over stdin/stdout, so you typically run it via an MCP host (e.g. mcp-hub).
138
+ ```json
+ {
+   "mcpServers": {
+     "preflight": {
+       "command": "npx",
+       "args": ["preflight-mcp"]
+     }
+   }
+ }
+ ```
34
148
 
35
- Example MCP host command:
36
- - `preflight-mcp`
149
+ Or for local development:
37
150
 
38
- Or, for local dev:
39
- - `node dist/index.js`
151
+ ```json
+ {
+   "mcpServers": {
+     "preflight": {
+       "command": "node",
+       "args": ["path/to/preflight-mcp/dist/index.js"]
+     }
+   }
+ }
+ ```
161
+
162
+ ### 2. Create Your First Bundle
163
+
164
+ Ask your AI assistant:
165
+
166
+ ```
+ "Create a bundle for the repository octocat/Hello-World"
+ ```
169
+
170
+ This will:
171
+ - Clone the repository
172
+ - Index all docs and code
173
+ - Generate a searchable SQLite FTS5 index
174
+ - Create `START_HERE.md`, `AGENTS.md`, and `OVERVIEW.md`
175
+
176
+ ### 3. Search the Bundle
177
+
178
+ ```
+ "Search for 'GitHub' in the bundle"
+ ```
181
+
182
+ ### 4. Test Locally (Optional)
183
+
184
+ Run the end-to-end smoke test:
185
+
186
+ ```bash
+ npm run smoke
+ ```
189
+
190
+ This will test bundle creation, search, and update operations.
40
191
 
41
192
  ## Smoke test
42
193
  Runs an end-to-end stdio client that:
@@ -51,7 +202,7 @@ Command:
51
202
 
52
203
  Note: the smoke test clones `octocat/Hello-World` from GitHub, so it needs internet access.
53
204
 
54
- ## Tools (10 total)
205
+ ## Tools (13 total)
55
206
 
56
207
  ### `preflight_list_bundles`
57
208
  List bundle IDs in storage.
@@ -61,10 +212,21 @@ List bundle IDs in storage.
61
212
  Create a new bundle from one or more inputs.
62
213
  - Triggers: "index this repo", "学习这个项目", "创建bundle"
63
214
 
215
+ Key semantics:
216
+ - **De-dup by default**: if a bundle already exists for the same normalized inputs, creation is rejected.
217
+ - Use `ifExists` to control behavior:
218
+ - `error` (default): reject duplicate
219
+ - `returnExisting`: return the existing bundle without fetching
220
+ - `updateExisting`: update the existing bundle then return it
221
+ - `createNew`: bypass de-duplication
222
+ - GitHub ingest uses **shallow clone**; if `git clone` fails, it will fall back to **GitHub archive (zipball)**.
223
+ - Supports `repos.kind: "local"` to ingest from a local directory (e.g. an extracted zip).
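The `ifExists` semantics listed above amount to a small decision table. The sketch below encodes it; `decideCreate` and the `CreateDecision` shape are hypothetical, and how the server finds an existing bundle for the same normalized inputs is left out.

```typescript
type IfExists = "error" | "returnExisting" | "updateExisting" | "createNew";

// Hypothetical result type describing what a create call should do.
type CreateDecision =
  | { action: "create" }
  | { action: "reject"; existingId: string }
  | { action: "return"; existingId: string }
  | { action: "update"; existingId: string };

// existingId: ID of a bundle with the same normalized inputs, or null.
function decideCreate(
  existingId: string | null,
  ifExists: IfExists = "error",
): CreateDecision {
  // No duplicate, or de-duplication explicitly bypassed.
  if (existingId === null || ifExists === "createNew") {
    return { action: "create" };
  }
  switch (ifExists) {
    case "returnExisting":
      return { action: "return", existingId };
    case "updateExisting":
      return { action: "update", existingId };
    default:
      // "error" is the default: reject the duplicate.
      return { action: "reject", existingId };
  }
}
```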
224
+
64
225
  Input (example):
65
- - `repos`: `[{ kind: "github", repo: "owner/repo" }, { kind: "deepwiki", url: "https://deepwiki.com/owner/repo" }]`
226
+ - `repos`: `[{ kind: "github", repo: "owner/repo" }, { kind: "local", repo: "owner/repo", path: "/path/to/dir" }, { kind: "deepwiki", url: "https://deepwiki.com/owner/repo" }]`
66
227
  - `libraries`: `["nextjs", "react"]` (Context7; optional)
67
228
  - `topics`: `["routing", "api"]` (Context7 topic filter; optional)
229
+ - `ifExists`: `"error" | "returnExisting" | "updateExisting" | "createNew"`
68
230
 
69
231
  ### `preflight_read_file`
70
232
  Read a file from bundle (OVERVIEW.md, START_HERE.md, AGENTS.md, or any repo file).
@@ -90,30 +252,64 @@ Optional parameters:
90
252
  Batch update all bundles at once.
91
253
  - Triggers: "批量更新", "全部刷新"
92
254
 
255
+ ### `preflight_find_bundle`
256
+ Check whether a bundle already exists for the given inputs (no fetching, no changes).
257
+ - Use when your UI/agent wants to decide whether to create/update.
258
+
259
+ ### `preflight_repair_bundle`
260
+ Offline repair for a bundle (no fetching): rebuild missing/empty derived artifacts.
261
+ - Rebuilds `indexes/search.sqlite3`, `START_HERE.md`, `AGENTS.md`, `OVERVIEW.md` when missing/empty.
262
+ - Use when: search fails due to index corruption, bundle files were partially deleted, etc.
263
+
93
264
  ### `preflight_search_bundle`
94
265
  Full-text search across ingested docs/code (line-based SQLite FTS5).
95
266
  - Triggers: "搜索bundle", "在仓库中查找", "搜代码"
96
267
 
97
- Optional parameters:
98
- - `ensureFresh`: If true, check if bundle needs update before searching.
99
- - `maxAgeHours`: Max age in hours before triggering auto-update (default: 24).
268
+ Important: **this tool is strictly read-only**.
269
+ - `ensureFresh` / `maxAgeHours` are **deprecated** and will error if provided.
270
+ - To update: call `preflight_update_bundle`, then search again.
271
+ - To repair: call `preflight_repair_bundle`, then search again.
100
272
 
101
273
  ### `preflight_search_by_tags`
102
274
  Search across multiple bundles filtered by tags (line-based SQLite FTS5).
103
275
  - Triggers: "search in MCP bundles", "search in all bundles", "在MCP项目中搜索", "搜索所有agent"
104
276
 
277
+ Notes:
278
+ - This tool is read-only and **does not auto-repair**.
279
+ - If some bundles fail to search (e.g. missing/corrupt index), they will be reported in `warnings`.
280
+
105
281
  Optional parameters:
106
282
  - `tags`: Filter bundles by tags (e.g., `["mcp", "agents"]`)
107
283
  - `scope`: Search scope (`docs`, `code`, or `all`)
108
284
  - `limit`: Max total hits across all bundles
109
285
 
286
+ Output additions:
287
+ - `warnings?: [{ bundleId, kind, message }]` (non-fatal per-bundle errors)
288
+ - `warningsTruncated?: true` if warnings were capped
289
+
110
290
  ### `preflight_verify_claim`
111
291
  Find evidence for a claim/statement in bundle.
112
292
  - Triggers: "验证说法", "找证据", "这个对吗"
113
293
 
114
- Optional parameters:
115
- - `ensureFresh`: If true, check if bundle needs update before verifying.
116
- - `maxAgeHours`: Max age in hours before triggering auto-update (default: 24).
294
+ Important: **this tool is strictly read-only**.
295
+ - `ensureFresh` / `maxAgeHours` are **deprecated** and will error if provided.
296
+ - To update: call `preflight_update_bundle`, then verify again.
297
+ - To repair: call `preflight_repair_bundle`, then verify again.
298
+
299
+ ### `preflight_cleanup_orphans`
300
+ Remove incomplete or corrupted bundles (bundles without valid manifest.json).
301
+ - Triggers: "clean up broken bundles", "remove orphans", "清理孤儿bundle"
302
+
303
+ Parameters:
304
+ - `dryRun` (default: true): Only report orphans without deleting
305
+ - `minAgeHours` (default: 1): Only clean bundles older than N hours
306
+
307
+ Output:
308
+ - `totalFound`: Number of orphan bundles found
309
+ - `totalCleaned`: Number of orphan bundles deleted
310
+ - `details`: Per-directory breakdown
311
+
312
+ Note: This is also automatically executed on server startup (background, non-blocking).
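A simplified sketch of how such a cleanup pass might work, combining the UUID filter, the manifest check, the age threshold, and `dryRun`. It is an assumption-laden illustration: `cleanupOrphans` and `hasValidManifest` are hypothetical, "valid" here only means that `manifest.json` parses as JSON, directory age is taken from its modification time, and the `details` output is omitted.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

interface CleanupOptions { dryRun?: boolean; minAgeHours?: number; }
interface CleanupResult { totalFound: number; totalCleaned: number; }

// Assumption: a manifest is "valid" if it exists and parses as JSON.
function hasValidManifest(bundleDir: string): boolean {
  try {
    JSON.parse(fs.readFileSync(path.join(bundleDir, "manifest.json"), "utf8"));
    return true;
  } catch {
    return false;
  }
}

function cleanupOrphans(storageDir: string, opts: CleanupOptions = {}): CleanupResult {
  const { dryRun = true, minAgeHours = 1 } = opts;
  const cutoff = Date.now() - minAgeHours * 60 * 60 * 1000;
  const result: CleanupResult = { totalFound: 0, totalCleaned: 0 };
  for (const entry of fs.readdirSync(storageDir, { withFileTypes: true })) {
    // Only UUID-named directories are candidates; custom dirs are untouched.
    if (!entry.isDirectory() || !UUID_V4.test(entry.name)) continue;
    const dir = path.join(storageDir, entry.name);
    // Skip healthy bundles and anything newer than the age threshold.
    if (hasValidManifest(dir) || fs.statSync(dir).mtimeMs > cutoff) continue;
    result.totalFound++;
    if (!dryRun) {
      fs.rmSync(dir, { recursive: true, force: true });
      result.totalCleaned++;
    }
  }
  return result;
}
```

With the default `dryRun: true`, the function only counts candidates, so a first call can be used to review what would be removed.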
117
313
 
118
314
  ## Resources
119
315
  ### `preflight://bundles`
@@ -126,6 +322,24 @@ Examples:
126
322
  - `preflight://bundle/<id>/file/START_HERE.md`
127
323
  - `preflight://bundle/<id>/file/repos%2Fowner%2Frepo%2Fnorm%2FREADME.md`
128
324
 
325
+ ## Error semantics (stable, UI-friendly)
326
+ Most tool errors are wrapped with a stable, machine-parseable prefix:
327
+ - `[preflight_error kind=<kind>] <message>`
328
+
329
+ Common kinds:
330
+ - `bundle_not_found`
331
+ - `file_not_found`
332
+ - `invalid_path` (unsafe path traversal attempt)
333
+ - `permission_denied`
334
+ - `index_missing_or_corrupt`
335
+ - `deprecated_parameter`
336
+ - `unknown`
337
+
338
+ This is designed so UIs/agents can reliably decide whether to:
339
+ - call `preflight_update_bundle`
340
+ - call `preflight_repair_bundle`
341
+ - prompt the user for a different bundleId/path
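A client could parse the prefix along these lines. This is a sketch under assumptions: `parsePreflightError` and `nextStep` are hypothetical, and the mapping from kinds to follow-up actions is one plausible reading of the list above rather than documented behavior (no kind is mapped to an update here because the source does not say which one should trigger it).

```typescript
interface PreflightError { kind: string; message: string; }

// Matches "[preflight_error kind=<kind>] <message>".
const ERROR_PREFIX = /^\[preflight_error kind=([a-z_]+)\]\s*([\s\S]*)$/;

// Returns null when the text does not carry the stable prefix.
function parsePreflightError(text: string): PreflightError | null {
  const m = ERROR_PREFIX.exec(text);
  if (!m) return null;
  const [, kind = "unknown", message = ""] = m;
  return { kind, message };
}

type NextStep = "repair" | "ask_user" | "report";

// Assumed mapping, for illustration only.
function nextStep(err: PreflightError): NextStep {
  switch (err.kind) {
    case "index_missing_or_corrupt":
      return "repair"; // e.g. call preflight_repair_bundle
    case "bundle_not_found":
    case "file_not_found":
    case "invalid_path":
      return "ask_user"; // prompt for a different bundleId/path
    default:
      return "report";
  }
}
```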
342
+
129
343
  ## Environment variables
130
344
  ### Storage
131
345
  - `PREFLIGHT_STORAGE_DIR`: bundle storage dir (default: `~/.preflight-mcp/bundles`)
@@ -138,20 +352,21 @@ Examples:
138
352
  - `PREFLIGHT_ANALYSIS_MODE`: Static analysis mode - `none` or `quick` (default: `quick`). Generates `analysis/FACTS.json`.
139
353
 
140
354
  ### GitHub & Context7
141
- - `GITHUB_TOKEN`: optional; used for GitHub API/auth patterns (currently not required for public repos)
355
+ - `GITHUB_TOKEN`: optional; used for GitHub API/auth patterns and GitHub archive fallback (public repos usually work without it)
356
+ - `PREFLIGHT_GIT_CLONE_TIMEOUT_MS`: optional; maximum time in milliseconds to allow `git clone` before failing over to the archive download (default: 5 minutes, i.e. `300000`)
142
357
  - `CONTEXT7_API_KEY`: optional; enables higher Context7 limits (runs without a key but may be rate-limited)
143
358
  - `CONTEXT7_MCP_URL`: optional; defaults to Context7 MCP endpoint
144
359
 
145
360
  ## Bundle layout (on disk)
146
361
  Inside a bundle directory:
147
- - `manifest.json`
362
+ - `manifest.json` (includes `fingerprint`, `displayName`, `tags`, and per-repo `source`)
148
363
  - `START_HERE.md`
149
364
  - `AGENTS.md`
150
365
  - `OVERVIEW.md`
151
366
  - `indexes/search.sqlite3`
152
367
  - **`analysis/FACTS.json`** (static analysis)
153
368
  - `repos/<owner>/<repo>/raw/...`
154
- - `repos/<owner>/<repo>/norm/...`
369
+ - `repos/<owner>/<repo>/norm/...` (GitHub/local snapshots)
155
370
  - `deepwiki/<owner>/<repo>/norm/index.md` (DeepWiki sources)
156
371
  - `deepwiki/<owner>/<repo>/meta.json`
157
372
  - `libraries/context7/<...>/meta.json`
@@ -179,7 +394,7 @@ $env:PREFLIGHT_STORAGE_DIRS = "D:\OneDrive\preflight;E:\GoogleDrive\preflight"
179
394
  ```
180
395
  ```bash
181
396
  # macOS/Linux
182
- export PREFLIGHT_STORAGE_DIRS="$HOME/OneDrive/preflight:$HOME/Dropbox/preflight"
397
+ export PREFLIGHT_STORAGE_DIRS="$HOME/OneDrive/preflight;$HOME/Dropbox/preflight"
183
398
  ```
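The macOS/Linux example now uses `;` as the separator, the same as the Windows example. A semicolon-separated value could be split as in the sketch below; `parseStorageDirs` is a hypothetical helper, and the server's actual parsing is not shown in this diff.

```typescript
// Hypothetical helper: split a PREFLIGHT_STORAGE_DIRS-style value on ";",
// trimming whitespace and dropping empty entries.
function parseStorageDirs(value: string | undefined): string[] {
  if (!value) return [];
  return value
    .split(";")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}
```

Splitting on `;` rather than `:` keeps Windows drive letters such as `D:\` intact, so one rule works on every platform.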
184
399
 
185
400
  ### MCP host config (Claude Desktop)
@@ -206,3 +421,43 @@ export PREFLIGHT_STORAGE_DIRS="$HOME/OneDrive/preflight:$HOME/Dropbox/preflight"
206
421
  ### Important notes
207
422
  - **Avoid concurrent access**: Only use on one machine at a time (SQLite conflicts)
208
423
  - **Wait for sync**: After updates, wait for cloud sync before switching machines
424
+
425
+ ## Contributing
426
+
427
+ We welcome contributions! Please see our [Contributing Guide](./CONTRIBUTING.md) for details on:
428
+
429
+ - Development setup
430
+ - Code style guidelines
431
+ - Testing requirements
432
+ - Pull request process
433
+
434
+ Please also read our [Code of Conduct](./CODE_OF_CONDUCT.md) before contributing.
435
+
436
+ ## Support
437
+
438
+ If you encounter any issues or have questions:
439
+
440
+ - **Issues**: [GitHub Issues](https://github.com/jonnyhoo/preflight-mcp/issues)
441
+ - **Discussions**: [GitHub Discussions](https://github.com/jonnyhoo/preflight-mcp/discussions)
442
+
443
+ ## License
444
+
445
+ This project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.
446
+
447
+ The MIT License allows you to:
448
+ - Use commercially
449
+ - Modify
450
+ - Distribute
451
+ - Use privately
452
+
453
+ The only requirement is to include the original copyright and license notice.
454
+
455
+ ## Acknowledgments
456
+
457
+ - Built on the [Model Context Protocol](https://modelcontextprotocol.io/)
458
+ - Uses SQLite FTS5 for efficient full-text search
459
+ - Inspired by the need for evidence-based AI assistance
460
+
461
+ ---
462
+
463
+ Made with ❤️ for the AI developer community