preflight-mcp 0.2.6 → 0.3.1

package/README.md CHANGED
@@ -197,6 +197,7 @@ Read file(s) from bundle. Two modes:
  ### `preflight_repo_tree`
  Get repository structure overview without wasting tokens on search.
  - Returns: ASCII directory tree, file count by extension/directory, entry point candidates
+ - Default depth: 6 (v0.3.1+, was 4) - shows 2-3 levels under `norm/` directory
  - Use BEFORE deep analysis to understand project layout
  - Triggers: "show project structure", "what files are in this repo", "项目结构", "文件分布"

@@ -225,6 +226,12 @@ Important: **this tool is strictly read-only**.
  - To update: call `preflight_update_bundle`, then search again.
  - To repair: call `preflight_repair_bundle`, then search again.

+ **New filtering options** (v0.3.1):
+ - `excludePatterns`: Filter out paths matching patterns (e.g., `["**/tests/**", "**/__pycache__/**"]`)
+ - `maxSnippetLength`: Limit snippet length per result (50-500 chars) to reduce token consumption
+
+ **Deprecated parameters**: `ensureFresh`, `autoRepairIndex`, `maxAgeHours` are deprecated and will return warnings instead of errors.
+
  ### `preflight_search_by_tags`
  Search across multiple bundles filtered by tags (line-based SQLite FTS5).
  - Triggers: "search in MCP bundles", "在MCP项目中搜索", "搜索所有agent"
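
The `maxSnippetLength` option described in this hunk is applied per hit in the `package/dist/server.js` change further down. The sketch below isolates that rule so its effect is easy to see. `truncateHit` and the sample hit are hypothetical names used only for illustration; the slicing and the `Math.max(3, …)` line cap mirror the diffed code.

```javascript
// Hypothetical helper isolating the truncation rule from the server.js hunk.
function truncateHit(hit, maxSnippetLength) {
  const out = { ...hit };
  // Cut the snippet and append an ellipsis only when it is too long.
  if (maxSnippetLength && hit.snippet && hit.snippet.length > maxSnippetLength) {
    out.snippet = hit.snippet.slice(0, maxSnippetLength) + '…';
  }
  // Keep one surrounding line per 50 characters, but never fewer than 3.
  if (maxSnippetLength && hit.context?.surroundingLines) {
    const maxLines = Math.max(3, Math.floor(maxSnippetLength / 50));
    out.context = {
      ...hit.context,
      surroundingLines: hit.context.surroundingLines.slice(0, maxLines),
    };
  }
  return out;
}

// Illustrative hit: a 120-character snippet with five context lines.
const sampleHit = {
  path: 'src/app.js',
  snippet: 'x'.repeat(120),
  context: { surroundingLines: ['a', 'b', 'c', 'd', 'e'] },
};

const shortened = truncateHit(sampleHit, 100);
console.log(shortened.snippet.length);                  // 101 (100 characters plus the ellipsis)
console.log(shortened.context.surroundingLines.length); // 3
```

Note that the ellipsis is added on top of the limit, so a truncated snippet is one character longer than `maxSnippetLength`.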
@@ -240,11 +247,23 @@ Optional parameters:

  ### `preflight_evidence_dependency_graph`
  Generate an evidence-based dependency graph. Two modes:
- - **Target mode** (provide `target.file`): Analyze a specific file's imports and callers
+ - **Target mode** (provide `target.file`): Analyze a specific file's imports and references
  - **Global mode** (omit `target`): Generate project-wide import graph of all code files
  - Deterministic output with source ranges for edges.
  - Uses Tree-sitter parsing when `PREFLIGHT_AST_ENGINE=wasm`; falls back to regex extraction otherwise.
- - Emits `imports` edges (file → module) and `imports_resolved` edges (file → internal file).
+
+ **Edge types** (v0.2.7+):
+ - `edgeTypes: "imports"` (default): Only AST-based import edges (high confidence, recommended)
+ - `edgeTypes: "all"`: Include FTS-based reference edges (name matching, may have false positives)
+
+ **Cache transparency** (v0.2.7+):
+ - Response includes `meta.cacheInfo` with `fromCache`, `generatedAt`, `cacheAgeMs`
+ - Use `force: true` to regenerate cached global graphs
+
+ **Large file handling**:
+ - `options.maxFileSizeBytes` (default: 1MB): Skip files larger than this
+ - `options.largeFileStrategy`: `"skip"` (default) or `"truncate"`
+ - `options.excludeExtensions`: Filter out non-code files from reference search (default: `.json`, `.md`, `.txt`, `.yml`, etc.)

  ### `preflight_trace_upsert`
  Create or update traceability links (code↔test, code↔doc, file↔requirement).
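
A caller can use the `meta.cacheInfo` fields documented above to decide whether to re-request a graph with `force: true`. The following is a minimal client-side sketch, not part of the package: `shouldForceRegenerate`, `buildGraphArgs`, the one-hour threshold, and the `bundleId` argument name are assumptions made for illustration. Only `target.file`, `edgeTypes`, `force`, and the `cacheInfo` field names come from the diff.

```javascript
// Hypothetical helper: true when a cached graph is older than maxAgeMs.
function shouldForceRegenerate(response, maxAgeMs) {
  const info = response?.meta?.cacheInfo;
  if (!info || !info.fromCache) return false; // freshly generated, nothing to refresh
  return typeof info.cacheAgeMs === 'number' && info.cacheAgeMs > maxAgeMs;
}

// Hypothetical argument builder; `bundleId` is an assumed parameter name.
function buildGraphArgs(bundleId, file, force) {
  return { bundleId, target: { file }, edgeTypes: 'imports', force };
}

// Illustrative cached response that is two hours old.
const cachedResponse = {
  meta: {
    cacheInfo: {
      fromCache: true,
      generatedAt: '2024-01-01T00:00:00.000Z',
      cacheAgeMs: 2 * 60 * 60 * 1000,
    },
  },
};

const force = shouldForceRegenerate(cachedResponse, 60 * 60 * 1000);
console.log(buildGraphArgs('my-bundle', 'src/index.js', force));
```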
package/README.zh-CN.md CHANGED
@@ -218,6 +218,7 @@ npm run smoke
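  ### `preflight_repo_tree`
  Get a repository structure overview without wasting tokens on search.
  - Returns: ASCII directory tree, file counts by extension/directory, entry point candidates
+ - Default depth: 6 (v0.3.1+, previously 4); shows 2-3 levels of subdirectories under norm/
  - Use before deep analysis to understand the project layout
  - Triggers: 「项目结构」「文件分布」「show tree」
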
@@ -243,10 +244,15 @@ npm run smoke
  - Triggers: 「搜索bundle」「在仓库中查找」「搜代码」

  Important: **this tool is strictly read-only**.
- - `ensureFresh` / `maxAgeHours` are **deprecated** and raise an error when provided
  - To update: call `preflight_update_bundle` first, then search
  - To repair: call `preflight_repair_bundle` first, then search

+ **New filtering options** (v0.3.1+):
+ - `excludePatterns`: exclude paths matching the given patterns (e.g., `["**/tests/**", "**/__pycache__/**"]`)
+ - `maxSnippetLength`: limit the snippet length of each result (50-500 characters) to reduce token consumption
+
+ **Deprecated parameters**: `ensureFresh`, `autoRepairIndex`, `maxAgeHours` are deprecated and return a warning when used.
+
  ### `preflight_search_by_tags`
  Search across multiple bundles filtered by tags (line-based SQLite FTS5).
  - Triggers: 「search in MCP bundles」「search in all bundles」「在MCP项目中搜索」「搜索所有agent」
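
The switch from errors to warnings for deprecated parameters can be pictured with a small sketch. It condenses the three `if` blocks from the `package/dist/server.js` hunk into a loop; `collectDeprecationWarnings` and the shortened message text are illustrative assumptions, while the `DEPRECATED_PARAM` code and the `!== undefined` check come from the diff.

```javascript
// Parameter names taken from the diff.
const DEPRECATED_PARAMS = ['ensureFresh', 'autoRepairIndex', 'maxAgeHours'];

// Hypothetical condensed form of the per-parameter checks in server.js.
function collectDeprecationWarnings(args) {
  const warnings = [];
  for (const name of DEPRECATED_PARAMS) {
    // Any explicitly supplied value, including `false`, produces a warning.
    if (args[name] !== undefined) {
      warnings.push({
        code: 'DEPRECATED_PARAM',
        message: `${name} is deprecated and ignored.`,
      });
    }
  }
  return warnings;
}

console.log(collectDeprecationWarnings({ query: 'foo', ensureFresh: false }));
console.log(collectDeprecationWarnings({ query: 'foo' })); // []
```

Because the check is `!== undefined` rather than a truthiness test, passing `ensureFresh: false` now yields a warning, whereas the previous `if (args.ensureFresh)` guard only reacted to truthy values.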
@@ -261,10 +267,22 @@ npm run smoke
  - `limit`: maximum number of hits across all bundles

  ### `preflight_evidence_dependency_graph`
- Generate an "evidence-based" dependency graph for a target file/symbol (imports + callers).
+ Generate an "evidence-based" dependency graph for a target file/symbol (imports + references).
  - Deterministic output (best-effort), with a traceable source range for every edge
  - Uses Tree-sitter when `PREFLIGHT_AST_ENGINE=wasm`; otherwise falls back to regex extraction
- - Emits `imports` edges (file → module) and, when resolvable, `imports_resolved` edges (file → file)
+
+ **Edge types** (v0.2.7+):
+ - `edgeTypes: "imports"` (default): return only AST-based import edges (high confidence, recommended)
+ - `edgeTypes: "all"`: include FTS-based reference edges (name matching, may produce false positives)
+
+ **Cache transparency** (v0.2.7+):
+ - Response includes `meta.cacheInfo`: `fromCache`, `generatedAt`, `cacheAgeMs`
+ - Use `force: true` to regenerate the cached global graph
+
+ **Large file handling**:
+ - `options.maxFileSizeBytes` (default: 1MB): skip files larger than this size
+ - `options.largeFileStrategy`: `"skip"` (default) or `"truncate"`
+ - `options.excludeExtensions`: exclude non-code files from reference search (default: `.json`, `.md`, `.txt`, `.yml`, etc.)

  ### `preflight_trace_upsert`
  Write/update bundle-level traceability links (commit↔ticket, symbol↔test, code↔doc, etc.).
@@ -47,7 +47,7 @@ function shouldExclude(relativePath, excludePatterns) {
      return false;
  }
  export async function generateRepoTree(bundleRootDir, bundleId, options = {}) {
-     const depth = options.depth ?? 4;
+     const depth = options.depth ?? 6;
      const includePatterns = options.include ?? [];
      const excludePatterns = options.exclude ?? ['node_modules', '.git', '__pycache__', '.venv', 'venv', 'dist', 'build', '*.pyc'];
      const reposDir = path.join(bundleRootDir, 'repos');
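
The new default only applies when the caller leaves `depth` unset. A minimal sketch of the `??` behaviour used in this hunk follows; `resolveDepth` is a hypothetical wrapper added for illustration.

```javascript
// Hypothetical wrapper around the defaulting expression from the hunk.
function resolveDepth(options = {}) {
  // `??` falls back only for null/undefined, so an explicit 0 is preserved.
  return options.depth ?? 6;
}

console.log(resolveDepth());             // 6
console.log(resolveDepth({ depth: 2 })); // 2
console.log(resolveDepth({ depth: 0 })); // 0
```

Callers that relied on the old default of 4 would need to pass `depth: 4` explicitly to keep the previous output size.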
@@ -23,6 +23,10 @@ export const DependencyGraphInputSchema = {
          'Global mode shows import relationships between all files but may be truncated for large projects.'),
      force: z.boolean().default(false).describe('If true, regenerate the dependency graph even if cached. ' +
          'Global mode results are cached in the bundle; use force=true to refresh.'),
+     /** Edge types to include in the result. Default: only imports (AST-based, high confidence). */
+     edgeTypes: z.enum(['imports', 'all']).default('imports').describe('Edge types to include. "imports": only AST-based import edges (high confidence, recommended). ' +
+         '"all": include FTS-based reference edges (name matching, may have false positives). ' +
+         'Default: "imports" for accuracy. Use "all" only when you need to find callers/references.'),
      options: z
          .object({
          maxFiles: z.number().int().min(1).max(500).default(200),
@@ -38,8 +42,11 @@ export const DependencyGraphInputSchema = {
          /** If largeFileStrategy=truncate, how many lines to read */
          truncateLines: z.number().int().min(100).max(5000).default(500)
              .describe('When largeFileStrategy=truncate, read this many lines. Default 500.'),
+         /** File extensions to exclude from reference search (FTS). Helps reduce false positives. */
+         excludeExtensions: z.array(z.string()).default(['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock'])
+             .describe('File extensions to exclude from reference/caller search. Default excludes non-code files.'),
      })
-         .default({ maxFiles: 200, maxNodes: 300, maxEdges: 800, timeBudgetMs: 25_000, maxFileSizeBytes: 1_000_000, largeFileStrategy: 'skip', truncateLines: 500 }),
+         .default({ maxFiles: 200, maxNodes: 300, maxEdges: 800, timeBudgetMs: 25_000, maxFileSizeBytes: 1_000_000, largeFileStrategy: 'skip', truncateLines: 500, excludeExtensions: ['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock'] }),
  };
  function sha256Hex(text) {
      return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
@@ -217,11 +224,20 @@ export async function generateDependencyGraph(cfg, rawArgs) {
  try {
      const cached = await fs.readFile(paths.depsGraphPath, 'utf8');
      const parsed = JSON.parse(cached);
+     const cachedAt = parsed.meta.generatedAt ? new Date(parsed.meta.generatedAt).getTime() : 0;
+     const cacheAgeMs = cachedAt ? Date.now() - cachedAt : 0;
+     // Add cacheInfo to meta
+     parsed.meta.cacheInfo = {
+         fromCache: true,
+         generatedAt: parsed.meta.generatedAt,
+         cacheAgeMs,
+         hint: 'Use force=true to regenerate the graph.',
+     };
      // Add note that this is from cache
      parsed.signals.warnings = parsed.signals.warnings || [];
      parsed.signals.warnings.unshift({
          code: 'from_cache',
-         message: `Loaded from cache (generated at ${parsed.meta.generatedAt}). Use force=true to regenerate.`,
+         message: `Loaded from cache (generated at ${parsed.meta.generatedAt}, age: ${Math.round(cacheAgeMs / 1000)}s). Use force=true to regenerate.`,
      });
      return parsed;
  }
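
The age calculation in this hunk can be exercised in isolation. The sketch below wraps it in a hypothetical `buildCacheInfo` function and injects the current time as a parameter (an assumption made so the result is reproducible; the diffed code calls `Date.now()` directly).

```javascript
// Hypothetical wrapper around the cacheInfo computation from the hunk.
// `now` is injected for reproducibility; the original uses Date.now().
function buildCacheInfo(meta, now = Date.now()) {
  const cachedAt = meta.generatedAt ? new Date(meta.generatedAt).getTime() : 0;
  // A missing or unparseable timestamp (NaN) is falsy, so the age falls back to 0.
  const cacheAgeMs = cachedAt ? now - cachedAt : 0;
  return {
    fromCache: true,
    generatedAt: meta.generatedAt,
    cacheAgeMs,
    hint: 'Use force=true to regenerate the graph.',
  };
}

const fixedNow = Date.parse('2024-01-01T00:01:30.000Z');
const cacheInfo = buildCacheInfo({ generatedAt: '2024-01-01T00:00:00.000Z' }, fixedNow);
console.log(cacheInfo.cacheAgeMs);                    // 90000
console.log(Math.round(cacheInfo.cacheAgeMs / 1000)); // 90, the value shown in the warning message
console.log(buildCacheInfo({}, fixedNow).cacheAgeMs); // 0
```

One consequence worth noting: an age of `0` is ambiguous, since it is reported both for a graph cached this instant and for a cache entry without a usable `generatedAt`.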
@@ -777,26 +793,32 @@ export async function generateDependencyGraph(cfg, rawArgs) {
          message: `Failed to read target file for import extraction: ${err instanceof Error ? err.message : String(err)}`,
      });
  }
- // 2) Upstream: find callers via FTS hits
+ // 2) Upstream: find references via FTS hits (only if edgeTypes='all')
  let searchHits = 0;
  let filesRead = 0;
- let callEdges = 0;
+ let referenceEdges = 0;
  let importEdges = edges.filter((e) => e.type === 'imports').length;
- if (targetSymbol && targetSymbol.length >= 2) {
+ const includeReferences = args.edgeTypes === 'all';
+ const excludeExtensions = new Set(args.options.excludeExtensions ?? ['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock']);
+ if (includeReferences && targetSymbol && targetSymbol.length >= 2) {
      const maxHits = Math.min(500, limits.maxFiles * 5);
      const hits = searchIndex(paths.searchDbPath, targetSymbol, 'code', maxHits, paths.rootDir);
      searchHits = hits.length;
      const fileLineCache = new Map();
      for (const hit of hits) {
-         if (checkBudget('timeBudget exceeded during caller scan'))
+         if (checkBudget('timeBudget exceeded during reference scan'))
              break;
          if (edges.length >= limits.maxEdges)
              break;
          const hitPath = hit.path;
          if (!hitPath || hit.kind !== 'code')
              continue;
+         // P3: Filter out non-code files by extension
+         const hitExt = path.extname(hitPath).toLowerCase();
+         if (excludeExtensions.has(hitExt))
+             continue;
          // Skip obvious self-reference in the same file if no symbol boundary detection.
-         // We still allow calls within the same file (but avoid exploding edges).
+         // We still allow references within the same file (but avoid exploding edges).
          // Read file lines (cache)
          let lines = fileLineCache.get(hitPath);
          if (!lines) {
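
The extension filter added to the reference scan can be sketched on its own. This version avoids imports, so `extOf` is a simplified, hypothetical stand-in for Node's `path.extname` (POSIX-style separators only), and `filterReferenceHits` is an illustrative name; the default extension list is the one from the schema in this diff.

```javascript
// Default list copied from the excludeExtensions schema default.
const DEFAULT_EXCLUDED = ['.json', '.md', '.txt', '.yml', '.yaml', '.toml', '.lock'];

// Simplified stand-in for path.extname, assuming '/' separators.
function extOf(filePath) {
  const base = filePath.slice(filePath.lastIndexOf('/') + 1);
  const dot = base.lastIndexOf('.');
  // A leading dot (e.g. ".gitignore") does not count as an extension.
  return dot > 0 ? base.slice(dot).toLowerCase() : '';
}

// Hypothetical filter mirroring the two `continue` checks in the scan loop.
function filterReferenceHits(hits, excludeExtensions = DEFAULT_EXCLUDED) {
  const excluded = new Set(excludeExtensions);
  return hits.filter(
    (hit) => hit.path && hit.kind === 'code' && !excluded.has(extOf(hit.path)),
  );
}

const scanHits = [
  { path: 'src/a.ts', kind: 'code' },
  { path: 'package.json', kind: 'code' },
  { path: 'docs/README.MD', kind: 'code' },
  { path: 'src/b.py', kind: 'doc' },
];
console.log(filterReferenceHits(scanHits).map((h) => h.path)); // [ 'src/a.ts' ]
```

Because the hit's extension is lowercased before the lookup, entries supplied in `excludeExtensions` should be lowercase as well for the match to succeed.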
@@ -845,19 +867,19 @@ export async function generateDependencyGraph(cfg, rawArgs) {
      snippet: clampSnippet(line, 200),
  };
  src.snippetSha256 = sha256Hex(src.snippet ?? '');
- const evidenceId = makeEvidenceId(['calls', callerId, targetSymbolId, hitPath, String(hit.lineNo), String(call.startCol)]);
+ const evidenceId = makeEvidenceId(['references', callerId, targetSymbolId, hitPath, String(hit.lineNo), String(call.startCol)]);
  addEdge({
      evidenceId,
      kind: 'edge',
-     type: 'calls',
+     type: 'references',
      from: callerId,
      to: targetSymbolId,
      method: 'heuristic',
-     confidence: 0.6,
+     confidence: 0.5,
      sources: [src],
-     notes: ['call edge is name-based (no type/overload resolution)'],
+     notes: ['reference edge is FTS name-based (may include false positives from comments/strings/docs)'],
  });
- callEdges++;
+ referenceEdges++;
  if (nodes.size >= limits.maxNodes) {
      truncated = true;
      truncatedReason = 'maxNodes reached';
@@ -871,21 +893,27 @@ export async function generateDependencyGraph(cfg, rawArgs) {
          });
      }
  }
+ else if (!includeReferences && targetSymbol && targetSymbol.length >= 2) {
+     warnings.push({
+         code: 'references_skipped',
+         message: 'Reference/caller search was skipped (edgeTypes="imports"). Use edgeTypes="all" to include FTS-based reference edges (may have false positives).',
+     });
+ }
  else {
      warnings.push({
          code: 'symbol_missing_or_too_short',
-         message: 'No symbol provided (or symbol too short). Upstream call graph was skipped; only imports were extracted from the target file.',
+         message: 'No symbol provided (or symbol too short). Reference graph was skipped; only imports were extracted from the target file.',
      });
  }
  // Post-process warnings
  warnings.push({
      code: 'limitations',
      message: usedAstForImports
-         ? 'This dependency graph uses deterministic parsing for imports (Tree-sitter WASM syntax AST) plus heuristics for callers (FTS + name-based). Results may be incomplete and are not type-resolved. Each edge includes method/confidence/sources for auditability.'
-         : 'This dependency graph is generated with deterministic heuristics (FTS + regex). Calls/imports may be incomplete and are not type-resolved. Each edge includes method/confidence/sources for auditability.',
+         ? 'This dependency graph uses deterministic parsing for imports (Tree-sitter WASM syntax AST). Reference edges (if enabled) use FTS + name-based heuristics and may have false positives. Each edge includes method/confidence/sources for auditability.'
+         : 'This dependency graph uses regex-based import extraction. Reference edges (if enabled) use FTS + name-based heuristics. Each edge includes method/confidence/sources for auditability.',
  });
  // Stats
- importEdges = edges.filter((e) => e.type === 'imports').length;
+ importEdges = edges.filter((e) => e.type === 'imports' || e.type === 'imports_resolved').length;
  const out = {
      meta: {
          requestId,
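
The three-way branch introduced here (scan, `references_skipped`, `symbol_missing_or_too_short`) depends on both `edgeTypes` and the target symbol. A small sketch of that decision table follows; `referenceScanOutcome` is a hypothetical function, and the string `'scan'` is an illustrative label for the branch that actually runs the FTS search.

```javascript
// Hypothetical summary of the if / else-if / else chain in the hunk.
function referenceScanOutcome(edgeTypes, targetSymbol) {
  const includeReferences = edgeTypes === 'all';
  const hasSymbol = Boolean(targetSymbol && targetSymbol.length >= 2);
  if (includeReferences && hasSymbol) return 'scan';
  if (!includeReferences && hasSymbol) return 'references_skipped';
  // Reached whenever the symbol is missing or shorter than 2 characters,
  // regardless of edgeTypes.
  return 'symbol_missing_or_too_short';
}

console.log(referenceScanOutcome('all', 'parseConfig'));     // scan
console.log(referenceScanOutcome('imports', 'parseConfig')); // references_skipped
console.log(referenceScanOutcome('all', 'x'));               // symbol_missing_or_too_short
console.log(referenceScanOutcome('imports', undefined));     // symbol_missing_or_too_short
```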
@@ -901,6 +929,9 @@ export async function generateDependencyGraph(cfg, rawArgs) {
          truncatedReason,
          limits,
      },
+     cacheInfo: {
+         fromCache: false,
+     },
  },
  facts: {
      nodes: Array.from(nodes.values()),
@@ -910,7 +941,8 @@ export async function generateDependencyGraph(cfg, rawArgs) {
  stats: {
      filesRead,
      searchHits,
-     callEdges,
+     callEdges: referenceEdges, // deprecated, use referenceEdges
+     referenceEdges,
      importEdges,
  },
  warnings,
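
Since `callEdges` is kept as a deprecated alias of `referenceEdges`, a consumer that has to handle output from both older and newer versions could read the count as sketched below. `referenceEdgeCount` is a hypothetical client-side helper, not part of the package.

```javascript
// Hypothetical reader: prefer the new field, fall back to the deprecated alias.
function referenceEdgeCount(stats) {
  return stats.referenceEdges ?? stats.callEdges ?? 0;
}

console.log(referenceEdgeCount({ referenceEdges: 4, callEdges: 4 })); // 4 (newer output)
console.log(referenceEdgeCount({ callEdges: 7 }));                    // 7 (older output)
console.log(referenceEdgeCount({}));                                  // 0
```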
@@ -1154,6 +1186,9 @@ async function generateGlobalDependencyGraph(ctx) {
          truncatedReason,
          limits,
      },
+     cacheInfo: {
+         fromCache: false,
+     },
  },
  facts: {
      nodes: Array.from(nodes.values()),
@@ -1163,7 +1198,8 @@ async function generateGlobalDependencyGraph(ctx) {
  stats: {
      filesRead: filesProcessed,
      searchHits: 0,
-     callEdges: 0,
+     callEdges: 0, // deprecated
+     referenceEdges: 0,
      importEdges,
  },
  warnings,
package/dist/server.js CHANGED
@@ -52,6 +52,18 @@ const SearchBundleInputSchema = {
      query: z.string().describe('Search query. Prefix with fts: to use raw FTS syntax.'),
      scope: z.enum(['docs', 'code', 'all']).default('all').describe('Search scope.'),
      limit: z.number().int().min(1).max(200).default(30).describe('Max number of hits.'),
+     // New filtering options (v0.3.1)
+     excludePatterns: z
+         .array(z.string())
+         .optional()
+         .describe('Exclude paths matching these patterns (e.g., ["**/tests/**", "**/__pycache__/**"]). Reduces noise from test/config files.'),
+     maxSnippetLength: z
+         .number()
+         .int()
+         .min(50)
+         .max(500)
+         .optional()
+         .describe('Max length of snippet in each result (default: no limit). Use to reduce token consumption.'),
      // Deprecated (kept for backward compatibility): this tool is strictly read-only.
      ensureFresh: z
          .boolean()
@@ -1059,26 +1071,74 @@ export async function startServer() {
  if (!storageDir) {
      throw new BundleNotFoundError(args.bundleId);
  }
- if (args.ensureFresh) {
-     throw new Error('ensureFresh is deprecated and not supported in this tool. This tool is strictly read-only. ' +
-         'Call preflight_update_bundle explicitly, then call preflight_search_bundle again.');
+ // P1: Collect warnings for deprecated parameters instead of throwing
+ const warnings = [];
+ if (args.ensureFresh !== undefined) {
+     warnings.push({
+         code: 'DEPRECATED_PARAM',
+         message: 'ensureFresh is deprecated and ignored. This tool is strictly read-only. Use preflight_update_bundle separately, then search again.',
+     });
+ }
+ if (args.autoRepairIndex !== undefined) {
+     warnings.push({
+         code: 'DEPRECATED_PARAM',
+         message: 'autoRepairIndex is deprecated and ignored. This tool is strictly read-only. Use preflight_repair_bundle separately, then search again.',
+     });
  }
- if (args.autoRepairIndex) {
-     throw new Error('autoRepairIndex is deprecated and not supported in this tool. This tool is strictly read-only. ' +
-         'Call preflight_repair_bundle explicitly, then call preflight_search_bundle again.');
+ if (args.maxAgeHours !== undefined) {
+     warnings.push({
+         code: 'DEPRECATED_PARAM',
+         message: 'maxAgeHours is deprecated and ignored (was only used with ensureFresh).',
+     });
  }
  const paths = getBundlePathsForId(storageDir, args.bundleId);
- const rawHits = searchIndex(paths.searchDbPath, args.query, args.scope, args.limit, paths.rootDir);
- const hits = rawHits.map((h) => ({
-     ...h,
-     uri: toBundleFileUri({ bundleId: args.bundleId, relativePath: h.path }),
- }));
+ // Fetch more results if we need to filter, to ensure we still get enough after filtering
+ const fetchLimit = args.excludePatterns?.length ? Math.min(args.limit * 2, 200) : args.limit;
+ let rawHits = searchIndex(paths.searchDbPath, args.query, args.scope, fetchLimit, paths.rootDir);
+ // Apply excludePatterns filter
+ if (args.excludePatterns && args.excludePatterns.length > 0) {
+     const patterns = args.excludePatterns.map(p => {
+         // Convert glob pattern to regex
+         const regexStr = p
+             .replace(/\./g, '\\.')
+             .replace(/\*\*/g, '<<<DOUBLESTAR>>>')
+             .replace(/\*/g, '[^/]*')
+             .replace(/<<<DOUBLESTAR>>>/g, '.*');
+         return new RegExp(regexStr, 'i');
+     });
+     rawHits = rawHits.filter(h => !patterns.some(re => re.test(h.path)));
+ }
+ // Limit to requested count after filtering
+ rawHits = rawHits.slice(0, args.limit);
+ const hits = rawHits.map((h) => {
+     const hit = {
+         ...h,
+         uri: toBundleFileUri({ bundleId: args.bundleId, relativePath: h.path }),
+     };
+     // Apply maxSnippetLength truncation
+     if (args.maxSnippetLength && h.snippet && h.snippet.length > args.maxSnippetLength) {
+         hit.snippet = h.snippet.slice(0, args.maxSnippetLength) + '…';
+     }
+     // Truncate surroundingLines if maxSnippetLength is set
+     if (args.maxSnippetLength && h.context?.surroundingLines) {
+         const maxLines = Math.max(3, Math.floor(args.maxSnippetLength / 50));
+         hit.context = {
+             ...h.context,
+             surroundingLines: h.context.surroundingLines.slice(0, maxLines),
+         };
+     }
+     return hit;
+ });
  const out = {
      bundleId: args.bundleId,
      query: args.query,
      scope: args.scope,
      hits,
  };
+ // Include warnings in output if any deprecated params were used
+ if (warnings.length > 0) {
+     out.warnings = warnings;
+ }
  return {
      content: [{ type: 'text', text: JSON.stringify(out, null, 2) }],
      structuredContent: out,
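
The glob-to-regex conversion in this hunk is compact enough to try in isolation. The sketch below copies the four `replace` steps verbatim; the wrapper names `globToRegExp` and `applyExcludePatterns` and the sample paths are assumptions added for illustration.

```javascript
// Conversion steps copied from the hunk; the function name is hypothetical.
function globToRegExp(pattern) {
  const regexStr = pattern
    .replace(/\./g, '\\.')                // escape literal dots
    .replace(/\*\*/g, '<<<DOUBLESTAR>>>') // protect ** before handling single *
    .replace(/\*/g, '[^/]*')              // * matches within one path segment
    .replace(/<<<DOUBLESTAR>>>/g, '.*');  // ** matches across segments
  return new RegExp(regexStr, 'i');       // unanchored and case-insensitive
}

// Hypothetical wrapper mirroring the rawHits.filter(...) call.
function applyExcludePatterns(hits, excludePatterns) {
  const patterns = excludePatterns.map(globToRegExp);
  return hits.filter((h) => !patterns.some((re) => re.test(h.path)));
}

const searchHits = [
  { path: 'src/tests/test_app.py' },
  { path: 'tests/test_top.py' },
  { path: 'src/app.py' },
];
const remaining = applyExcludePatterns(searchHits, ['**/tests/**']).map((h) => h.path);
console.log(remaining); // [ 'tests/test_top.py', 'src/app.py' ]
```

Two properties of this conversion show up in the output. The resulting pattern `.*/tests/.*` requires a `/` before `tests`, so a path that begins with `tests/` and has no leading separator is not excluded. The regex is also unanchored and escapes only `.`, so other regex metacharacters in a pattern (such as `+` or `(`) keep their regex meaning.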
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
      "name": "preflight-mcp",
-     "version": "0.2.6",
+     "version": "0.3.1",
      "description": "MCP server that creates evidence-based preflight bundles for GitHub repositories and library docs.",
      "type": "module",
      "license": "MIT",