npm - preflight-mcp - Versions diffs - 0.2.6 → 0.3.1 - Mend

preflight-mcp 0.2.6 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/README.md +21 -2
package/README.zh-CN.md +21 -3
package/dist/bundle/tree.js +1 -1
package/dist/evidence/dependencyGraph.js +54 -18
package/dist/server.js +71 -11
package/package.json +1 -1

package/README.md CHANGED

@@ -197,6 +197,7 @@ Read file(s) from bundle. Two modes:
 ### `preflight_repo_tree`
 Get repository structure overview without wasting tokens on search.
 - Returns: ASCII directory tree, file count by extension/directory, entry point candidates
+- Default depth: 6 (v0.3.1+, was 4) - shows 2-3 levels under `norm/` directory
 - Use BEFORE deep analysis to understand project layout
 - Triggers: "show project structure", "what files are in this repo", "项目结构", "文件分布"
@@ -225,6 +226,12 @@ Important: **this tool is strictly read-only**.
 - To update: call `preflight_update_bundle`, then search again.
 - To repair: call `preflight_repair_bundle`, then search again.
+**New filtering options** (v0.3.1):
+- `excludePatterns`: Filter out paths matching patterns (e.g., `["**/tests/**", "**/__pycache__/**"]`)
+- `maxSnippetLength`: Limit snippet length per result (50-500 chars) to reduce token consumption
+**Deprecated parameters**: `ensureFresh`, `autoRepairIndex`, `maxAgeHours` are deprecated and will return warnings instead of errors.
 ### `preflight_search_by_tags`
 Search across multiple bundles filtered by tags (line-based SQLite FTS5).
 - Triggers: "search in MCP bundles", "在MCP项目中搜索", "搜索所有agent"
@@ -240,11 +247,23 @@ Optional parameters:
 ### `preflight_evidence_dependency_graph`
 Generate an evidence-based dependency graph. Two modes:
-- **Target mode** (provide `target.file`): Analyze a specific file's imports and callers
+- **Target mode** (provide `target.file`): Analyze a specific file's imports and references
 - **Global mode** (omit `target`): Generate project-wide import graph of all code files
 - Deterministic output with source ranges for edges.
 - Uses Tree-sitter parsing when `PREFLIGHT_AST_ENGINE=wasm`; falls back to regex extraction otherwise.
-- Emits `imports` edges (file → module) and `imports_resolved` edges (file → internal file).
+**Edge types** (v0.2.7+):
+- `edgeTypes: "imports"` (default): Only AST-based import edges (high confidence, recommended)
+- `edgeTypes: "all"`: Include FTS-based reference edges (name matching, may have false positives)
+**Cache transparency** (v0.2.7+):
+- Response includes `meta.cacheInfo` with `fromCache`, `generatedAt`, `cacheAgeMs`
+- Use `force: true` to regenerate cached global graphs
+**Large file handling**:
+- `options.maxFileSizeBytes` (default: 1MB): Skip files larger than this
+- `options.largeFileStrategy`: `"skip"` (default) or `"truncate"`
+- `options.excludeExtensions`: Filter out non-code files from reference search (default: `.json`, `.md`, `.txt`, `.yml`, etc.)
 ### `preflight_trace_upsert`
 Create or update traceability links (code↔test, code↔doc, file↔requirement).
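The options documented in this README diff for `preflight_search_bundle` and `preflight_evidence_dependency_graph` can be combined roughly as follows. This is a minimal sketch, not the package's own usage example: the bundle ID, query, and file path are placeholders, the presence of `bundleId` on the graph tool is an assumption, and `describeCache` is a hypothetical helper that reads only the `meta.cacheInfo` fields named above.

```javascript
// Hypothetical search arguments; only the parameter names come from the diff.
const searchArgs = {
  bundleId: 'example-bundle-id', // placeholder
  query: 'generateRepoTree',
  scope: 'code',
  limit: 30,
  excludePatterns: ['**/tests/**', '**/__pycache__/**'],
  maxSnippetLength: 200, // documented range: 50-500
};

// Hypothetical dependency-graph arguments (target mode).
const graphArgs = {
  bundleId: 'example-bundle-id', // assumption: same identifier as for search
  target: { file: 'path/to/file.js' }, // placeholder path
  edgeTypes: 'all', // opt in to FTS-based reference edges
  force: true, // regenerate instead of using a cached graph
  options: { maxFileSizeBytes: 1_000_000, largeFileStrategy: 'skip' },
};

// Summarize the documented meta.cacheInfo fields of a graph response.
function describeCache(response) {
  const info = response?.meta?.cacheInfo;
  if (!info) return 'no cache info';
  if (!info.fromCache) return 'freshly generated';
  return `cached at ${info.generatedAt}, age ${Math.round(info.cacheAgeMs / 1000)}s`;
}
```

Responses from versions before 0.2.7 would carry no `meta.cacheInfo`, which is why the helper treats a missing field as a separate case.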

package/README.zh-CN.md CHANGED

@@ -218,6 +218,7 @@ npm run smoke
 ### `preflight_repo_tree`
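 Get a repository structure overview without wasting tokens on search.
 - Returns: ASCII directory tree, file counts by extension/directory, entry point candidates
+- Default depth: 6 (v0.3.1+, previously 4); shows 2-3 levels of subdirectories under norm/
 - Use before deep analysis to understand the project layout
 - Triggers: 「项目结构」「文件分布」「show tree」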
@@ -243,10 +244,15 @@ npm run smoke
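 - Triggers: 「搜索bundle」「在仓库中查找」「搜代码」
 Important: **this tool is strictly read-only**.
-- `ensureFresh` / `maxAgeHours` are **deprecated** and raise an error when provided
 - To update: call `preflight_update_bundle` first, then search
 - To repair: call `preflight_repair_bundle` first, then search
+**New filtering options** (v0.3.1+):
+- `excludePatterns`: Exclude paths matching the patterns (e.g., `["**/tests/**", "**/__pycache__/**"]`)
+- `maxSnippetLength`: Limit the snippet length of each result (50-500 characters) to reduce token consumption
+**Deprecated parameters**: `ensureFresh`, `autoRepairIndex`, and `maxAgeHours` are deprecated and return a warning when used.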
 ### `preflight_search_by_tags`
 Search across multiple bundles filtered by tags (line-based SQLite FTS5).
 - Triggers: 「search in MCP bundles」「search in all bundles」「在MCP项目中搜索」「搜索所有agent」
@@ -261,10 +267,22 @@ npm run smoke
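 - `limit`: Maximum number of hits across all bundles
 ### `preflight_evidence_dependency_graph`
-Generate an "evidence-based" dependency graph for a target file/symbol (imports + callers).
+Generate an "evidence-based" dependency graph for a target file/symbol (imports + references).
 - Deterministic output (best-effort), with a traceable source range for every edge
 - Uses Tree-sitter when `PREFLIGHT_AST_ENGINE=wasm`; otherwise falls back to regex extraction
-- Emits `imports` (file → module) and, when resolvable, `imports_resolved` (file → file)
+**Edge types** (v0.2.7+):
+- `edgeTypes: "imports"` (default): Return only AST-based import edges (high confidence, recommended)
+- `edgeTypes: "all"`: Include FTS-based reference edges (name matching, may have false positives)
+**Cache transparency** (v0.2.7+):
+- Response includes `meta.cacheInfo`: `fromCache`, `generatedAt`, `cacheAgeMs`
+- Use `force: true` to regenerate the cached global graph
+**Large file handling**:
+- `options.maxFileSizeBytes` (default: 1MB): Skip files larger than this size
+- `options.largeFileStrategy`: `"skip"` (default) or `"truncate"`
+- `options.excludeExtensions`: Exclude non-code files from reference search (default: `.json`, `.md`, `.txt`, `.yml`, etc.)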
 ### `preflight_trace_upsert`
 Create or update bundle-level traceability links (commit↔ticket, symbol↔test, code↔doc, etc.).

package/dist/bundle/tree.js CHANGED

@@ -47,7 +47,7 @@ function shouldExclude(relativePath, excludePatterns) {
     return false;
 }
 export async function generateRepoTree(bundleRootDir, bundleId, options = {}) {
-    const depth = options.depth ?? 4;
+    const depth = options.depth ?? 6;
     const includePatterns = options.include ?? [];
     const excludePatterns = options.exclude ?? ['node_modules', '.git', '__pycache__', '.venv', 'venv', 'dist', 'build', '*.pyc'];
     const reposDir = path.join(bundleRootDir, 'repos');
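The one-line change above raises the fallback used when the caller gives no depth. A minimal sketch of that default resolution, assuming nothing beyond the `options.depth ?? 6` expression in the hunk (`resolveDepth` is a hypothetical helper, not a function in the package):

```javascript
// Hypothetical helper isolating the default-depth expression from the hunk.
function resolveDepth(options = {}) {
  // `??` falls back only for null or undefined, so an explicit 0 is kept.
  return options.depth ?? 6;
}
```

Callers that relied on the old default of 4 would need to pass `depth: 4` explicitly to keep the same output size.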

package/dist/evidence/dependencyGraph.js CHANGED

@@ -23,6 +23,10 @@ export const DependencyGraphInputSchema = {
         'Global mode shows import relationships between all files but may be truncated for large projects.'),
     force: z.boolean().default(false).describe('If true, regenerate the dependency graph even if cached. ' +
         'Global mode results are cached in the bundle; use force=true to refresh.'),
+    /** Edge types to include in the result. Default: only imports (AST-based, high confidence). */
+    edgeTypes: z.enum(['imports', 'all']).default('imports').describe('Edge types to include. "imports": only AST-based import edges (high confidence, recommended). ' +
+        '"all": include FTS-based reference edges (name matching, may have false positives). ' +
+        'Default: "imports" for accuracy. Use "all" only when you need to find callers/references.'),
     options: z
         .object({
         maxFiles: z.number().int().min(1).max(500).default(200),
@@ -38,8 +42,11 @@ export const DependencyGraphInputSchema = {
         /** If largeFileStrategy=truncate, how many lines to read */
         truncateLines: z.number().int().min(100).max(5000).default(500)
             .describe('When largeFileStrategy=truncate, read this many lines. Default 500.'),
+        /** File extensions to exclude from reference search (FTS). Helps reduce false positives. */
+        excludeExtensions: z.array(z.string()).default(['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock'])
+            .describe('File extensions to exclude from reference/caller search. Default excludes non-code files.'),
     })
-        .default({ maxFiles: 200, maxNodes: 300, maxEdges: 800, timeBudgetMs: 25_000, maxFileSizeBytes: 1_000_000, largeFileStrategy: 'skip', truncateLines: 500 }),
+        .default({ maxFiles: 200, maxNodes: 300, maxEdges: 800, timeBudgetMs: 25_000, maxFileSizeBytes: 1_000_000, largeFileStrategy: 'skip', truncateLines: 500, excludeExtensions: ['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock'] }),
 };
 function sha256Hex(text) {
     return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
@@ -217,11 +224,20 @@ export async function generateDependencyGraph(cfg, rawArgs) {
         try {
             const cached = await fs.readFile(paths.depsGraphPath, 'utf8');
             const parsed = JSON.parse(cached);
+            const cachedAt = parsed.meta.generatedAt ? new Date(parsed.meta.generatedAt).getTime() : 0;
+            const cacheAgeMs = cachedAt ? Date.now() - cachedAt : 0;
+            // Add cacheInfo to meta
+            parsed.meta.cacheInfo = {
+                fromCache: true,
+                generatedAt: parsed.meta.generatedAt,
+                cacheAgeMs,
+                hint: 'Use force=true to regenerate the graph.',
+            };
             // Add note that this is from cache
             parsed.signals.warnings = parsed.signals.warnings || [];
             parsed.signals.warnings.unshift({
                 code: 'from_cache',
-                message: `Loaded from cache (generated at ${parsed.meta.generatedAt}). Use force=true to regenerate.`,
+                message: `Loaded from cache (generated at ${parsed.meta.generatedAt}, age: ${Math.round(cacheAgeMs / 1000)}s). Use force=true to regenerate.`,
             });
             return parsed;
         }
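The cache-age arithmetic in the hunk above can be read in isolation as the sketch below. `buildCacheInfo` is a hypothetical helper, and the injectable `now` parameter is an addition for testability; the field names and the hint string are taken from the diff.

```javascript
// Hypothetical helper mirroring the cacheInfo construction in the hunk.
function buildCacheInfo(generatedAt, now = Date.now()) {
  // A missing timestamp yields 0; an unparseable one yields NaN.
  const cachedAt = generatedAt ? new Date(generatedAt).getTime() : 0;
  // Both 0 and NaN are falsy, so the age falls back to 0 in either case.
  const cacheAgeMs = cachedAt ? now - cachedAt : 0;
  return {
    fromCache: true,
    generatedAt,
    cacheAgeMs,
    hint: 'Use force=true to regenerate the graph.',
  };
}
```

One consequence of this shape is that `cacheAgeMs: 0` is ambiguous: it can mean a brand-new cache entry or a cache file without a usable `generatedAt`.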
@@ -777,26 +793,32 @@ export async function generateDependencyGraph(cfg, rawArgs) {
             message: `Failed to read target file for import extraction: ${err instanceof Error ? err.message : String(err)}`,
         });
     }
-    // 2) Upstream: find callers via FTS hits
+    // 2) Upstream: find references via FTS hits (only if edgeTypes='all')
     let searchHits = 0;
     let filesRead = 0;
-    let callEdges = 0;
+    let referenceEdges = 0;
     let importEdges = edges.filter((e) => e.type === 'imports').length;
-    if (targetSymbol && targetSymbol.length >= 2) {
+    const includeReferences = args.edgeTypes === 'all';
+    const excludeExtensions = new Set(args.options.excludeExtensions ?? ['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock']);
+    if (includeReferences && targetSymbol && targetSymbol.length >= 2) {
         const maxHits = Math.min(500, limits.maxFiles * 5);
         const hits = searchIndex(paths.searchDbPath, targetSymbol, 'code', maxHits, paths.rootDir);
         searchHits = hits.length;
         const fileLineCache = new Map();
         for (const hit of hits) {
-            if (checkBudget('timeBudget exceeded during caller scan'))
+            if (checkBudget('timeBudget exceeded during reference scan'))
                 break;
             if (edges.length >= limits.maxEdges)
                 break;
             const hitPath = hit.path;
             if (!hitPath || hit.kind !== 'code')
                 continue;
+            // P3: Filter out non-code files by extension
+            const hitExt = path.extname(hitPath).toLowerCase();
+            if (excludeExtensions.has(hitExt))
+                continue;
             // Skip obvious self-reference in the same file if no symbol boundary detection.
-            // We still allow calls within the same file (but avoid exploding edges).
+            // We still allow references within the same file (but avoid exploding edges).
             // Read file lines (cache)
             let lines = fileLineCache.get(hitPath);
             if (!lines) {
@@ -845,19 +867,19 @@ export async function generateDependencyGraph(cfg, rawArgs) {
                 snippet: clampSnippet(line, 200),
             };
             src.snippetSha256 = sha256Hex(src.snippet ?? '');
-            const evidenceId = makeEvidenceId(['calls', callerId, targetSymbolId, hitPath, String(hit.lineNo), String(call.startCol)]);
+            const evidenceId = makeEvidenceId(['references', callerId, targetSymbolId, hitPath, String(hit.lineNo), String(call.startCol)]);
             addEdge({
                 evidenceId,
                 kind: 'edge',
-                type: 'calls',
+                type: 'references',
                 from: callerId,
                 to: targetSymbolId,
                 method: 'heuristic',
-                confidence: 0.6,
+                confidence: 0.5,
                 sources: [src],
-                notes: ['call edge is name-based (no type/overload resolution)'],
+                notes: ['reference edge is FTS name-based (may include false positives from comments/strings/docs)'],
             });
-            callEdges++;
+            referenceEdges++;
             if (nodes.size >= limits.maxNodes) {
                 truncated = true;
                 truncatedReason = 'maxNodes reached';
@@ -871,21 +893,27 @@ export async function generateDependencyGraph(cfg, rawArgs) {
             });
         }
     }
+    else if (!includeReferences && targetSymbol && targetSymbol.length >= 2) {
+        warnings.push({
+            code: 'references_skipped',
+            message: 'Reference/caller search was skipped (edgeTypes="imports"). Use edgeTypes="all" to include FTS-based reference edges (may have false positives).',
+        });
+    }
     else {
         warnings.push({
             code: 'symbol_missing_or_too_short',
-            message: 'No symbol provided (or symbol too short). Upstream call graph was skipped; only imports were extracted from the target file.',
+            message: 'No symbol provided (or symbol too short). Reference graph was skipped; only imports were extracted from the target file.',
         });
     }
     // Post-process warnings
     warnings.push({
         code: 'limitations',
         message: usedAstForImports
-            ? 'This dependency graph uses deterministic parsing for imports (Tree-sitter WASM syntax AST) plus heuristics for callers (FTS + name-based). Results may be incomplete and are not type-resolved. Each edge includes method/confidence/sources for auditability.'
-            : 'This dependency graph is generated with deterministic heuristics (FTS + regex). Calls/imports may be incomplete and are not type-resolved. Each edge includes method/confidence/sources for auditability.',
+            ? 'This dependency graph uses deterministic parsing for imports (Tree-sitter WASM syntax AST). Reference edges (if enabled) use FTS + name-based heuristics and may have false positives. Each edge includes method/confidence/sources for auditability.'
+            : 'This dependency graph uses regex-based import extraction. Reference edges (if enabled) use FTS + name-based heuristics. Each edge includes method/confidence/sources for auditability.',
     });
     // Stats
-    importEdges = edges.filter((e) => e.type === 'imports').length;
+    importEdges = edges.filter((e) => e.type === 'imports' || e.type === 'imports_resolved').length;
     const out = {
         meta: {
             requestId,
@@ -901,6 +929,9 @@ export async function generateDependencyGraph(cfg, rawArgs) {
                 truncatedReason,
                 limits,
             },
+            cacheInfo: {
+                fromCache: false,
+            },
         },
         facts: {
             nodes: Array.from(nodes.values()),
@@ -910,7 +941,8 @@ export async function generateDependencyGraph(cfg, rawArgs) {
             stats: {
                 filesRead,
                 searchHits,
-                callEdges,
+                callEdges: referenceEdges, // deprecated, use referenceEdges
+                referenceEdges,
                 importEdges,
             },
             warnings,
@@ -1154,6 +1186,9 @@ async function generateGlobalDependencyGraph(ctx) {
                 truncatedReason,
                 limits,
             },
+            cacheInfo: {
+                fromCache: false,
+            },
         },
         facts: {
             nodes: Array.from(nodes.values()),
@@ -1163,7 +1198,8 @@ async function generateGlobalDependencyGraph(ctx) {
             stats: {
                 filesRead: filesProcessed,
                 searchHits: 0,
-                callEdges: 0,
+                callEdges: 0, // deprecated
+                referenceEdges: 0,
                 importEdges,
             },
             warnings,
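Taken together, the target-mode hunks above gate the FTS reference scan on `edgeTypes` and then drop hits by file extension. The sketch below condenses that decision into one function under stated assumptions: `selectReferenceHits` and the local `extname` are hypothetical (the package itself calls Node's `path.extname`), and hits are modeled as plain `{ path, kind }` objects.

```javascript
const DEFAULT_EXCLUDED_EXTENSIONS = ['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock'];

// Simplified stand-in for Node's path.extname, for '/'-separated paths only.
function extname(filePath) {
  const base = filePath.slice(filePath.lastIndexOf('/') + 1);
  const dot = base.lastIndexOf('.');
  return dot > 0 ? base.slice(dot) : '';
}

// Return the hits that would be considered for reference edges.
function selectReferenceHits(hits, edgeTypes = 'imports', excludeExtensions = DEFAULT_EXCLUDED_EXTENSIONS) {
  // With the default edgeTypes, the reference scan is skipped entirely.
  if (edgeTypes !== 'all') return [];
  const excluded = new Set(excludeExtensions);
  return hits.filter(
    (hit) =>
      Boolean(hit.path) &&
      hit.kind === 'code' &&
      !excluded.has(extname(hit.path).toLowerCase())
  );
}
```

Because the extension is lowercased before the lookup, `README.MD` is excluded just like `README.md`; entries passed in `excludeExtensions` are compared as given, so they should be lowercase and include the leading dot.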

package/dist/server.js CHANGED

@@ -52,6 +52,18 @@ const SearchBundleInputSchema = {
     query: z.string().describe('Search query. Prefix with fts: to use raw FTS syntax.'),
     scope: z.enum(['docs', 'code', 'all']).default('all').describe('Search scope.'),
     limit: z.number().int().min(1).max(200).default(30).describe('Max number of hits.'),
+    // New filtering options (v0.3.1)
+    excludePatterns: z
+        .array(z.string())
+        .optional()
+        .describe('Exclude paths matching these patterns (e.g., ["**/tests/**", "**/__pycache__/**"]). Reduces noise from test/config files.'),
+    maxSnippetLength: z
+        .number()
+        .int()
+        .min(50)
+        .max(500)
+        .optional()
+        .describe('Max length of snippet in each result (default: no limit). Use to reduce token consumption.'),
     // Deprecated (kept for backward compatibility): this tool is strictly read-only.
     ensureFresh: z
         .boolean()
@@ -1059,26 +1071,74 @@ export async function startServer() {
             if (!storageDir) {
                 throw new BundleNotFoundError(args.bundleId);
             }
-            if (args.ensureFresh) {
-                throw new Error('ensureFresh is deprecated and not supported in this tool. This tool is strictly read-only. ' +
-                    'Call preflight_update_bundle explicitly, then call preflight_search_bundle again.');
+            // P1: Collect warnings for deprecated parameters instead of throwing
+            const warnings = [];
+            if (args.ensureFresh !== undefined) {
+                warnings.push({
+                    code: 'DEPRECATED_PARAM',
+                    message: 'ensureFresh is deprecated and ignored. This tool is strictly read-only. Use preflight_update_bundle separately, then search again.',
+                });
+            }
+            if (args.autoRepairIndex !== undefined) {
+                warnings.push({
+                    code: 'DEPRECATED_PARAM',
+                    message: 'autoRepairIndex is deprecated and ignored. This tool is strictly read-only. Use preflight_repair_bundle separately, then search again.',
+                });
             }
-            if (args.autoRepairIndex) {
-                throw new Error('autoRepairIndex is deprecated and not supported in this tool. This tool is strictly read-only. ' +
-                    'Call preflight_repair_bundle explicitly, then call preflight_search_bundle again.');
+            if (args.maxAgeHours !== undefined) {
+                warnings.push({
+                    code: 'DEPRECATED_PARAM',
+                    message: 'maxAgeHours is deprecated and ignored (was only used with ensureFresh).',
+                });
             }
             const paths = getBundlePathsForId(storageDir, args.bundleId);
-            const rawHits = searchIndex(paths.searchDbPath, args.query, args.scope, args.limit, paths.rootDir);
-            const hits = rawHits.map((h) => ({
-                ...h,
-                uri: toBundleFileUri({ bundleId: args.bundleId, relativePath: h.path }),
-            }));
+            // Fetch more results if we need to filter, to ensure we still get enough after filtering
+            const fetchLimit = args.excludePatterns?.length ? Math.min(args.limit * 2, 200) : args.limit;
+            let rawHits = searchIndex(paths.searchDbPath, args.query, args.scope, fetchLimit, paths.rootDir);
+            // Apply excludePatterns filter
+            if (args.excludePatterns && args.excludePatterns.length > 0) {
+                const patterns = args.excludePatterns.map(p => {
+                    // Convert glob pattern to regex
+                    const regexStr = p
+                        .replace(/\./g, '\\.')
+                        .replace(/\*\*/g, '<<<DOUBLESTAR>>>')
+                        .replace(/\*/g, '[^/]*')
+                        .replace(/<<<DOUBLESTAR>>>/g, '.*');
+                    return new RegExp(regexStr, 'i');
+                });
+                rawHits = rawHits.filter(h => !patterns.some(re => re.test(h.path)));
+            }
+            // Limit to requested count after filtering
+            rawHits = rawHits.slice(0, args.limit);
+            const hits = rawHits.map((h) => {
+                const hit = {
+                    ...h,
+                    uri: toBundleFileUri({ bundleId: args.bundleId, relativePath: h.path }),
+                };
+                // Apply maxSnippetLength truncation
+                if (args.maxSnippetLength && h.snippet && h.snippet.length > args.maxSnippetLength) {
+                    hit.snippet = h.snippet.slice(0, args.maxSnippetLength) + '…';
+                }
+                // Truncate surroundingLines if maxSnippetLength is set
+                if (args.maxSnippetLength && h.context?.surroundingLines) {
+                    const maxLines = Math.max(3, Math.floor(args.maxSnippetLength / 50));
+                    hit.context = {
+                        ...h.context,
+                        surroundingLines: h.context.surroundingLines.slice(0, maxLines),
+                    };
+                }
+                return hit;
+            });
             const out = {
                 bundleId: args.bundleId,
                 query: args.query,
                 scope: args.scope,
                 hits,
             };
+            // Include warnings in output if any deprecated params were used
+            if (warnings.length > 0) {
+                out.warnings = warnings;
+            }
             return {
                 content: [{ type: 'text', text: JSON.stringify(out, null, 2) }],
                 structuredContent: out,
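The `excludePatterns` and `maxSnippetLength` handling in the hunk above can be exercised on its own. The sketch below reuses the replacement chain and truncation rule from the diff, but the function names (`globToRegExp`, `filterHits`, `truncateSnippet`) and the sample paths in the note are hypothetical.

```javascript
// Same replacement chain as the hunk: escape dots, then expand ** and *.
function globToRegExp(pattern) {
  const source = pattern
    .replace(/\./g, '\\.')
    .replace(/\*\*/g, '<<<DOUBLESTAR>>>')
    .replace(/\*/g, '[^/]*')
    .replace(/<<<DOUBLESTAR>>>/g, '.*');
  // The result is unanchored and case-insensitive.
  return new RegExp(source, 'i');
}

// Drop hits whose path matches any pattern, then cap at the requested limit.
function filterHits(hits, excludePatterns = [], limit = 30) {
  const regexes = excludePatterns.map(globToRegExp);
  return hits
    .filter((hit) => !regexes.some((re) => re.test(hit.path)))
    .slice(0, limit);
}

// Cut a snippet to maxLength characters and append an ellipsis marker.
function truncateSnippet(snippet, maxLength) {
  if (!maxLength || !snippet || snippet.length <= maxLength) return snippet;
  return snippet.slice(0, maxLength) + '…';
}
```

In this sketch `**/tests/**` becomes `.*/tests/.*`, so a path such as `repos/x/tests/t.py` is excluded while a path that begins with `tests/` (no leading slash) is not. A truncated snippet is one character longer than `maxLength` because of the appended ellipsis.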

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "preflight-mcp",
-  "version": "0.2.6",
+  "version": "0.3.1",
   "description": "MCP server that creates evidence-based preflight bundles for GitHub repositories and library docs.",
   "type": "module",
   "license": "MIT",