@gulu9527/code-trust 0.2.0 → 0.3.0

@@ -0,0 +1,802 @@
1
+ # CodeTrust 深度调研报告 / CodeTrust Deep Research Report
2
+
3
+ > 版本 / Version: v1
4
+ > 日期 / Date: 2026-03-11
5
+ > 形式 / Format: Chinese main text + English translation
6
+ > 研究范围 / Scope: CodeTrust product direction, CI integration strategy, and best practices for baseline / suppression / SARIF / PR summary
7
+
8
+ ---
9
+
10
+ ## 1. 执行摘要 / Executive Summary
11
+
12
+ ### 中文
13
+
14
+ CodeTrust 已经不再是“想法验证”阶段,而是进入了“第一轮收敛”阶段。当前仓库已经具备一套可运行的产品骨架:CLI 入口、核心扫描引擎、规则引擎、评分逻辑、自动修复骨架、规则测试,以及 GitHub Action 初版。这说明产品方向已经成立,下一阶段的关键不再是继续增加规则数量,而是把工具从“会扫描、会打分”升级为“可以稳定进入 CI 的 trust gate”。
15
+
16
+ 这轮调研后的核心判断是:**CodeTrust 下一阶段最关键的升级,不是 score model 更复杂,而是建立完整的 finding lifecycle。**
17
+
18
+ 也就是说,CodeTrust 需要从下面这个旧模型:
19
+
20
+ - scanner + score
21
+
22
+ 升级为这个新模型:
23
+
24
+ - finding identity
25
+ - finding lifecycle
26
+ - policy engine
27
+ - delivery channels
28
+
29
+ 只有这四层建立起来,CodeTrust 才能真正支撑以下场景:
30
+
31
+ - 只拦截新增问题,而不是历史遗留问题
32
+ - 正确区分“文件未扫描”和“问题被压制”
33
+ - 在 PR、CLI、SARIF 中保持一致的结果语义
34
+ - 让 CI 决策有解释性,而不是只靠一个分数
35
+
36
+ 本报告的最终建议是:**把 P0 重心从“继续扩规则/优化分数”转移到“稳定 finding 指纹、baseline 比对、suppression 语义、tool health 可见性”。**
37
+
38
+ ### English Translation
39
+
40
+ CodeTrust is no longer in the “idea validation” stage. It has entered its first consolidation phase. The repository already contains a runnable product skeleton: CLI entrypoints, a core scan engine, a rule engine, scoring logic, an autofix foundation, rule-level tests, and an initial GitHub Action. This means the direction is already validated. The next step is not to keep adding more rules, but to turn the tool from “something that scans and scores” into “a trust gate that can reliably live inside CI.”
41
+
42
+ The central conclusion from this research is: **the most important upgrade for CodeTrust is not a more complex scoring model, but a complete finding lifecycle.**
43
+
44
+ In other words, CodeTrust needs to evolve from this old model:
45
+
46
+ - scanner + score
47
+
48
+ into this new model:
49
+
50
+ - finding identity
51
+ - finding lifecycle
52
+ - policy engine
53
+ - delivery channels
54
+
55
+ Only when these four layers exist can CodeTrust truly support the following workflows:
56
+
57
+ - block only newly introduced issues instead of legacy debt
58
+ - correctly distinguish “not scanned” from “suppressed”
59
+ - keep result semantics consistent across CLI, PR, and SARIF
60
+ - make CI decisions explainable instead of relying on a single score
61
+
62
+ The final recommendation of this report is: **move P0 focus away from “more rules / better scores” and toward stable finding fingerprints, baseline comparison, suppression semantics, and tool-health visibility.**
63
+
64
+ ---
65
+
66
+ ## 2. 调研方法 / Research Method
67
+
68
+ ### 中文
69
+
70
+ 本报告结合了两类输入:
71
+
72
+ 1. **仓库现状判断**
73
+ - 结合当前 CodeTrust 代码结构与命令面,评估其所处阶段与短板。
74
+ 2. **Exa 定向调研**
75
+ - 调研对象主要包括:
76
+ - Semgrep:diff-aware scan、finding lifecycle、ignore/suppression、blocking 行为
77
+ - GitHub Code Scanning:SARIF 支持子集、上传限制、结果去重、身份字段
78
+ - Snyk:ignore / policy file 模型
79
+ - reviewdog:PR annotations、changed-only 工作流
80
+
81
+ 本次调研关注的不是“谁功能更多”,而是“成熟工具在 CI 接入、误报控制、问题生命周期管理上是怎么设计的”。
82
+
83
+ ### English Translation
84
+
85
+ This report combines two inputs:
86
+
87
+ 1. **Repository assessment**
88
+ - Reviewing the current CodeTrust structure and command surface to understand its maturity and its main gaps.
89
+ 2. **Targeted research via Exa**
90
+ - The main systems examined were:
91
+ - Semgrep: diff-aware scan, finding lifecycle, ignore/suppression, blocking behavior
92
+ - GitHub Code Scanning: SARIF subset support, upload limits, deduplication, identity fields
93
+ - Snyk: ignore / policy file model
94
+ - reviewdog: PR annotations and changed-only workflows
95
+
96
+ The goal of this research was not to compare feature counts, but to understand how mature tools design CI integration, false-positive control, and finding lifecycle management.
97
+
98
+ ---
99
+
100
+ ## 3. 当前阶段判断 / Current Stage Assessment
101
+
102
+ ### 中文
103
+
104
+ CodeTrust 的现状可以概括为:**产品骨架已经成立,可信度体系开始工程化,但 finding lifecycle / suppression / delivery layer 仍未完成。**
105
+
106
+ 目前已经具备:
107
+
108
+ - CLI 入口与多命令表面
109
+ - 核心扫描引擎
110
+ - builtin rules 管理机制
111
+ - severity + dimension weight 的评分模型
112
+ - 自动修复骨架
113
+ - 规则单测
114
+ - 初版 GitHub Action
115
+ - 稳定 issue fingerprint 输出
116
+ - `toolHealth` 可见性字段
117
+ - 一轮面向 self-scan / CI trust gate 的安全规则误报收敛
118
+
119
+ 但下一阶段最大的缺口不在“还缺几条规则”,而在于以下三件事:
120
+
121
+ 1. **finding lifecycle 尚未落地**:虽然已有稳定指纹,但还没有 baseline/new/existing/fixed/suppressed 生命周期模型。
122
+ 2. **suppression 语义缺失**:没有正式的 suppression 模型,就很难管理误报和可接受风险。
123
+ 3. **delivery layer 不完整**:CLI、PR summary、annotation、SARIF 的职责还没有清晰分层。
124
+
125
+ ### English Translation
126
+
127
+ CodeTrust’s current state can be summarized as: **the product skeleton exists, and parts of the trust system are now engineered, but finding lifecycle, suppression, and delivery are still incomplete.**
128
+
129
+ It already has:
130
+
131
+ - a CLI surface with multiple commands
132
+ - a core scan engine
133
+ - builtin rule management
134
+ - a severity + dimension-weight scoring model
135
+ - an autofix foundation
136
+ - rule-level tests
137
+ - an initial GitHub Action
138
+ - stable issue fingerprint output
139
+ - visible `toolHealth` metadata
140
+ - an initial round of false-positive reduction for self-scan / CI trust-gate scenarios
141
+
142
+ But the biggest gap is not “a few missing rules.” It is the absence of three key capabilities:
143
+
144
+ 1. **Finding lifecycle is not implemented yet**: stable fingerprints now exist, but there is still no baseline/new/existing/fixed/suppressed lifecycle.
145
+ 2. **Missing suppression semantics**: without a formal suppression model, false positives and acceptable risk cannot be managed well.
146
+ 3. **Incomplete delivery layer**: CLI, PR summary, annotations, and SARIF do not yet have clearly separated responsibilities.
147
+
148
+ ---
149
+
150
+ ## 4. 核心研究发现 / Core Research Findings
151
+
152
+ ### 4.1 Baseline 不是附加功能,而是核心数据模型 / Baseline Is a Core Data Model, Not an Optional Feature
153
+
154
+ #### 中文
155
+
156
+ Semgrep 的 diff-aware scan 不是简单“只看改动文件”,而是通过 finding identity 跟踪问题生命周期。它关注的不是某条告警的行号,而是更稳定的识别元素,例如规则 ID、文件路径、语义上下文和重复索引。
157
+
158
+ 这对 CodeTrust 的直接启发是:
159
+
160
+ - baseline 不能只是“本次 JSON 与上次 JSON 做字符串对比”
161
+ - 必须先定义稳定 fingerprint
162
+ - baseline 设计应该早于 score model v2
163
+
164
+ 否则会出现:
165
+
166
+ - 行号变化导致老问题变成新问题
167
+ - 重排代码引发误报回潮
168
+ - CI gate 噪音过高
169
+
170
+ #### English Translation
171
+
172
+ Semgrep’s diff-aware scan is not simply “check changed files only.” It tracks the lifecycle of a finding through finding identity. The goal is not to match alerts by line number, but by more stable elements such as rule ID, file path, semantic context, and occurrence index.
173
+
174
+ The direct implication for CodeTrust is:
175
+
176
+ - baseline cannot be implemented as a raw string comparison between this run’s JSON and the previous run’s JSON
177
+ - a stable fingerprint must come first
178
+ - baseline should be prioritized before score model v2
179
+
180
+ Without this, you will get:
181
+
182
+ - old findings being treated as new because line numbers moved
183
+ - spurious “new” findings resurfacing after code is reordered
184
+ - an overly noisy CI gate
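The identity idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not CodeTrust's or Semgrep's actual algorithm: the `RawFinding` shape and all helper names are hypothetical, and FNV-1a is used purely as a dependency-free stand-in for a stronger hash.

```typescript
// Hypothetical finding shape; these field names are assumptions for illustration.
interface RawFinding {
  ruleId: string;
  filePath: string;
  line: number; // kept for display only, deliberately not part of the identity
  snippet: string; // the matched source text
}

// FNV-1a (32-bit): a dependency-free stand-in. A real tool would more likely
// use a cryptographic hash such as SHA-256.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Normalize separators and strip a leading "./" so one file maps to one path.
function normalizePath(filePath: string): string {
  return filePath.replace(/\\/g, "/").replace(/^\.\//, "");
}

// Collapse whitespace so re-indentation alone does not change the context hash.
function contextHash(snippet: string): string {
  return fnv1a(snippet.replace(/\s+/g, " ").trim());
}

// Identical (rule, file, context) triples are told apart by an occurrence
// index assigned in line order, never by the line number itself.
function fingerprintAll(findings: RawFinding[]): Map<RawFinding, string> {
  const seen = new Map<string, number>();
  const result = new Map<RawFinding, string>();
  const ordered = [...findings].sort((a, b) => a.line - b.line);
  for (const f of ordered) {
    const base = `${f.ruleId}|${normalizePath(f.filePath)}|${contextHash(f.snippet)}`;
    const occurrence = seen.get(base) ?? 0;
    seen.set(base, occurrence + 1);
    result.set(f, fnv1a(`${base}|${occurrence}`));
  }
  return result;
}
```

With this shape, a finding that moves from line 10 to line 42 keeps its fingerprint, while two identical matches in the same file still receive distinct ones.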
185
+
186
+ ### 4.2 exclude、disable、suppress 必须拆开 / Exclude, Disable, and Suppress Must Be Separate Concepts
187
+
188
+ #### 中文
189
+
190
+ 成熟工具通常把这三件事分开:
191
+
192
+ 1. **exclude / ignore path**:根本不扫描该目标
193
+ 2. **disable rule / policy off**:扫描目标,但不启用某条规则
194
+ 3. **suppress finding**:发现问题后,显式标记为 false positive 或 acceptable risk
195
+
196
+ CodeTrust 不能把它们都合并成一个“ignore”。
197
+
198
+ 推荐最少暴露三类计数:
199
+
200
+ - `filesExcluded`
201
+ - `rulesDisabled`
202
+ - `findingsSuppressed`
203
+
204
+ 如果把它们混为一谈,用户会无法判断:
205
+
206
+ - 是工具没扫到
207
+ - 是规则没开
208
+ - 还是问题被人为接受了
209
+
210
+ #### English Translation
211
+
212
+ Mature tools usually separate these three concepts:
213
+
214
+ 1. **exclude / ignore path**: do not scan the target at all
215
+ 2. **disable rule / policy off**: scan the target, but do not run a given rule
216
+ 3. **suppress finding**: detect the issue, then explicitly mark it as a false positive or acceptable risk
217
+
218
+ CodeTrust should not collapse all of these into a single “ignore” concept.
219
+
220
+ At minimum, it should expose three counters:
221
+
222
+ - `filesExcluded`
223
+ - `rulesDisabled`
224
+ - `findingsSuppressed`
225
+
226
+ If these are blended together, users cannot tell whether:
227
+
228
+ - the tool never scanned the file
229
+ - the rule was disabled
230
+ - or the finding was explicitly accepted
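A rough sketch of how the three concepts could stay separate inside a scan pipeline, with one counter per stage. All types and the prefix-based exclude matching are assumptions made for the example; a real implementation would likely use glob patterns and richer suppression records.

```typescript
// All shapes below are illustrative assumptions, not CodeTrust's real types.
interface Finding { ruleId: string; filePath: string; fingerprint: string; }
interface Rule { id: string; check: (filePath: string) => Finding[]; }
interface ScanConfig {
  excludePrefixes: string[]; // exclude: paths that are never scanned
  disabledRuleIds: string[]; // disable: rules that never run
  suppressedFingerprints: string[]; // suppress: findings detected, then accepted
}
interface ScanCounters {
  filesExcluded: number;
  rulesDisabled: number;
  findingsSuppressed: number;
}

function scanWithCounters(
  files: string[],
  rules: Rule[],
  config: ScanConfig,
): { reported: Finding[]; counters: ScanCounters } {
  // Stage 1 (exclude): drop files before any rule sees them.
  const scanned = files.filter((f) => !config.excludePrefixes.some((p) => f.startsWith(p)));
  // Stage 2 (disable): drop rules before execution.
  const active = rules.filter((r) => !config.disabledRuleIds.includes(r.id));
  // Stage 3 (suppress): run what is left, then hide explicitly accepted findings.
  const detected: Finding[] = [];
  for (const file of scanned) {
    for (const rule of active) detected.push(...rule.check(file));
  }
  const reported = detected.filter((d) => !config.suppressedFingerprints.includes(d.fingerprint));
  return {
    reported,
    counters: {
      filesExcluded: files.length - scanned.length,
      rulesDisabled: rules.length - active.length,
      findingsSuppressed: detected.length - reported.length,
    },
  };
}
```

Because each counter is derived at a different stage, a reader of the output can tell "never scanned" apart from "rule off" and from "accepted".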
231
+
232
+ ### 4.3 SARIF 是目标平台协议,不是简单导出格式 / SARIF Is a Platform Protocol, Not Just Another Export Format
233
+
234
+ #### 中文
235
+
236
+ GitHub 对 SARIF 的支持是受限子集,而不是无条件兼容全部字段。调研中最值得注意的点包括:
237
+
238
+ - 文件大小限制
239
+ - 结果数、规则数、位置数等限制
240
+ - 结果去重与稳定身份字段的重要性
241
+ - `partialFingerprints` 对结果稳定性的重要价值
242
+ - `automationDetails.id` / category 对不同自动化流程的区分作用
243
+
244
+ 因此,CodeTrust 后续做 SARIF 时,不能把现有 JSON 直接映射一层就结束,而应该单独设计 GitHub 兼容模式。
245
+
246
+ #### English Translation
247
+
248
+ GitHub supports a restricted subset of SARIF rather than the full model. The most important findings from the research were:
249
+
250
+ - file size limits
251
+ - limits on results, rules, and locations
252
+ - the importance of stable identity for deduplication
253
+ - the value of `partialFingerprints` for result stability
254
+ - the role of `automationDetails.id` / category in distinguishing automation streams
255
+
256
+ Therefore, when CodeTrust adds SARIF, it should not simply map the current JSON format into SARIF. It should design a GitHub-compatible SARIF mode deliberately.
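As a hedged illustration of what a deliberately GitHub-oriented mapping might look like, the sketch below builds a minimal SARIF 2.1.0 object. The `Finding` shape, the function name, and the decision to place CodeTrust's own fingerprint under `primaryLocationLineHash` are assumptions; `maxResults` is a caller-supplied cap because the actual limits are defined by GitHub's documentation, not by this example.

```typescript
// Hypothetical internal finding shape (an assumption for this sketch).
interface Finding {
  ruleId: string;
  message: string;
  filePath: string;
  line: number;
  fingerprint: string;
}

function toSarif(findings: Finding[], category: string, runId: string, maxResults: number) {
  // Naive truncation; a real exporter would likely prioritize by severity first.
  const kept = findings.slice(0, maxResults);
  const ruleIds = Array.from(new Set(kept.map((f) => f.ruleId)));
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: "CodeTrust", rules: ruleIds.map((id) => ({ id })) } },
        // Distinguishes this automation stream from other uploads for the same commit.
        automationDetails: { id: `${category}/${runId}` },
        results: kept.map((f) => ({
          ruleId: f.ruleId,
          message: { text: f.message },
          locations: [
            {
              physicalLocation: {
                artifactLocation: { uri: f.filePath },
                region: { startLine: f.line },
              },
            },
          ],
          // Stable identity so the platform can match results across runs.
          partialFingerprints: { primaryLocationLineHash: f.fingerprint },
        })),
      },
    ],
  };
}
```

Serializing the returned object with `JSON.stringify` would give the upload payload; validating it against the size and count limits is left out here.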
257
+
258
+ ### 4.4 suppressed finding 进入 SARIF 可能产生错误体验 / Suppressed Findings in SARIF Can Produce Bad UX
259
+
260
+ #### 中文
261
+
262
+ Semgrep 的历史讨论说明:即便把 suppressed findings 放进 SARIF 的 `suppressions` 字段,也不代表 GitHub 一定会按用户期待进行展示。现实中,这可能导致“本来已抑制的问题在 GitHub 里仍像开放告警一样出现”。
263
+
264
+ 因此,CodeTrust 的默认策略应当是:
265
+
266
+ - CLI / JSON:可以保留 suppressed findings,并明确状态
267
+ - SARIF:默认不输出 suppressed findings
268
+ - 如有需要,再通过显式开关启用
269
+
270
+ 推荐未来参数:
271
+
272
+ - `--sarif-include-suppressed`
273
+
274
+ 默认值建议为 `false`。
275
+
276
+ #### English Translation
277
+
278
+ Semgrep’s past discussions show that even when suppressed findings are exported through the SARIF `suppressions` field, GitHub may not present them the way users expect. In practice, an issue that was already suppressed can still show up in GitHub as if it were an open alert.
279
+
280
+ Therefore, CodeTrust’s default strategy should be:
281
+
282
+ - CLI / JSON: keep suppressed findings, but mark them clearly
283
+ - SARIF: do not export suppressed findings by default
284
+ - offer an explicit opt-in flag if needed
285
+
286
+ A future option could be:
287
+
288
+ - `--sarif-include-suppressed`
289
+
290
+ The suggested default is `false`.
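The per-channel default could be expressed roughly as follows. The option name `includeSuppressed` mirrors the proposed `--sarif-include-suppressed` flag, and the `Finding` shape is hypothetical; neither exists in CodeTrust today as far as this report states.

```typescript
// Hypothetical shapes for illustration only.
interface Finding { fingerprint: string; message: string; suppressed: boolean; }
type Status = "open" | "suppressed";
interface SarifExportOptions {
  includeSuppressed?: boolean; // mirrors the proposed --sarif-include-suppressed flag
}

// CLI / JSON channel: keep everything, but label the state explicitly.
function selectForJson(findings: Finding[]): Array<Finding & { status: Status }> {
  return findings.map((f) => {
    const status: Status = f.suppressed ? "suppressed" : "open";
    return { ...f, status };
  });
}

// SARIF channel: suppressed findings are dropped unless the caller opts in.
function selectForSarif(findings: Finding[], options: SarifExportOptions = {}): Finding[] {
  const includeSuppressed = options.includeSuppressed ?? false;
  return includeSuppressed ? findings : findings.filter((f) => !f.suppressed);
}
```

Keeping the two selectors separate makes the difference between channels an explicit design decision rather than a side effect of one shared filter.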
291
+
292
+ ### 4.5 PR 集成应该分层,不应该只输出一种结果 / PR Integration Should Be Layered, Not Monolithic
293
+
294
+ #### 中文
295
+
296
+ PR 集成至少应该分成三层:
297
+
298
+ 1. **Job Summary**:给人看的决策摘要
299
+ 2. **Annotations / changed-line comments**:给开发者的精准修复提示
300
+ 3. **SARIF upload**:给 GitHub Security / 历史跟踪的平台视图
301
+
302
+ 它们的职责分别是:
303
+
304
+ - Summary 负责解释“为什么这次通过/失败”
305
+ - Annotation 负责指出“改哪一行、修什么”
306
+ - SARIF 负责“沉淀与平台化消费”
307
+
308
+ 如果只做其中一种,体验会失衡。
309
+
310
+ #### English Translation
311
+
312
+ PR integration should have at least three layers:
313
+
314
+ 1. **Job Summary**: a human-readable decision summary
315
+ 2. **Annotations / changed-line comments**: precise developer-facing remediation hints
316
+ 3. **SARIF upload**: the platform-facing security and historical view
317
+
318
+ Their responsibilities are different:
319
+
320
+ - Summary explains why the run passed or failed
321
+ - Annotations explain exactly what to fix and where
322
+ - SARIF supports persistence and platform-level consumption
323
+
324
+ If you build only one of these, the experience becomes unbalanced.
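A minimal sketch of the first two layers, rendered as plain strings. It relies on two GitHub Actions mechanisms: annotations are created from `::error` / `::warning` workflow commands printed to standard output, and the job summary is Markdown appended to the file named by the `GITHUB_STEP_SUMMARY` environment variable. The `Finding` shape and function names are assumptions, and the actual writing step is omitted so the example stays runtime-agnostic.

```typescript
// Hypothetical finding shape for this sketch.
interface Finding {
  ruleId: string;
  filePath: string;
  line: number;
  message: string;
  blocking: boolean;
}

// Workflow-command escaping for message data and for property values.
function escapeData(s: string): string {
  return s.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}
function escapeProperty(s: string): string {
  return escapeData(s).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Layer 2: one annotation command per finding, pointing at the exact line.
function toAnnotation(f: Finding): string {
  const level = f.blocking ? "error" : "warning";
  const props = `file=${escapeProperty(f.filePath)},line=${f.line},title=${escapeProperty(f.ruleId)}`;
  return `::${level} ${props}::${escapeData(f.message)}`;
}

// Layer 1: Markdown for the job summary, explaining why the run passed or failed.
function toJobSummary(findings: Finding[], passed: boolean): string {
  const blocking = findings.filter((f) => f.blocking).length;
  return [
    `## CodeTrust: ${passed ? "passed" : "failed"}`,
    "",
    `- Blocking findings: ${blocking}`,
    `- Non-blocking findings: ${findings.length - blocking}`,
  ].join("\n");
}
```

The SARIF layer is intentionally left out of this sketch, since it targets the platform view rather than the pull request conversation.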
325
+
326
+ ### 4.6 score model 的优先级低于 lifecycle 与 policy / Score Model Is Lower Priority Than Lifecycle and Policy
327
+
328
+ #### 中文
329
+
330
+ 调研后最明确的结论之一是:成熟工具真正支撑 adoption 的,通常不是分数公式本身,而是:
331
+
332
+ - baseline 稳不稳定
333
+ - 新旧问题分得清不清楚
334
+ - suppression 是否合理
335
+ - CI 输出是否可解释
336
+ - 误报管理是否可持续
337
+
338
+ 所以 CodeTrust 应该把 gate 设计成“双轨制”:
339
+
340
+ 1. **blocking findings / blocking policy**
341
+ 2. **score threshold**
342
+
343
+ 也就是说:
344
+
345
+ - 某些问题 regardless of score 直接 fail
346
+ - 其余问题再用 score 评估整体可信度
347
+
348
+ #### English Translation
349
+
350
+ One of the clearest conclusions from the research is that adoption is rarely driven by the scoring formula alone. Mature tools succeed because of:
351
+
352
+ - stable baseline behavior
353
+ - clear new-vs-existing distinction
354
+ - practical suppression handling
355
+ - explainable CI output
356
+ - sustainable false-positive control
357
+
358
+ So CodeTrust should design its gate as a dual-track system:
359
+
360
+ 1. **blocking findings / blocking policy**
361
+ 2. **score threshold**
362
+
363
+ In other words:
364
+
365
+ - some issues should fail regardless of score
366
+ - the remaining issues can then contribute to the overall trust score
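The dual-track gate can be sketched as a single decision function that records a reason for every failure. The types, the default mode of `warn` for unlisted rules, and the rule that only `new` findings can block are assumptions chosen to match the principles in this report, not a description of existing CodeTrust behavior.

```typescript
type PolicyMode = "off" | "warn" | "block";
type Lifecycle = "new" | "existing" | "fixed" | "suppressed";

// Hypothetical shapes for illustration only.
interface Finding { ruleId: string; lifecycle: Lifecycle; }
interface GateInput {
  findings: Finding[];
  policy: Record<string, PolicyMode>; // per-rule mode; unlisted rules default to "warn" (assumption)
  score: number;
  minScore: number;
}
interface GateDecision { passed: boolean; reasons: string[]; }

function evaluateGate(input: GateInput): GateDecision {
  const reasons: string[] = [];
  // Track 1: blocking policy. Only newly introduced findings can block.
  for (const f of input.findings) {
    const mode = input.policy[f.ruleId] ?? "warn";
    if (f.lifecycle === "new" && mode === "block") {
      reasons.push(`new blocking finding: ${f.ruleId}`);
    }
  }
  // Track 2: score threshold, evaluated independently of track 1.
  if (input.score < input.minScore) {
    reasons.push(`score ${input.score} is below threshold ${input.minScore}`);
  }
  return { passed: reasons.length === 0, reasons };
}
```

Returning the list of reasons alongside the verdict is what keeps the CI decision explainable instead of reducing it to one number.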
367
+
368
+ ---
369
+
370
+ ## 5. 设计原则 / Design Principles
371
+
372
+ ### 中文
373
+
374
+ 基于这轮调研,建议 CodeTrust 采用以下产品与工程原则:
375
+
376
+ 1. **先定义 finding identity,再做 baseline 与 SARIF。**
377
+ 2. **先区分 tool health 和 code risk,再谈评分可信度。**
378
+ 3. **在 CI 中优先支持 only-new-findings,而不是全量历史问题阻塞。**
379
+ 4. **suppressions 必须显式、有理由、最好可过期。**
380
+ 5. **SARIF 默认走 GitHub-safe 策略。**
381
+ 6. **PR 集成必须同时服务“决策者”和“修复者”。**
382
+
383
+ ### English Translation
384
+
385
+ Based on this research, CodeTrust should adopt the following product and engineering principles:
386
+
387
+ 1. **Define finding identity before baseline and SARIF.**
388
+ 2. **Separate tool health from code risk before investing in score credibility.**
389
+ 3. **In CI, prioritize only-new-findings over blocking on the entire historical backlog.**
390
+ 4. **Suppressions must be explicit, justified, and ideally expirable.**
391
+ 5. **Use GitHub-safe defaults for SARIF.**
392
+ 6. **PR integration must serve both decision-makers and fixers.**
393
+
394
+ ---
395
+
396
+ ## 6. 推荐路线图 / Recommended Roadmap
397
+
398
+ ### 6.1 P0:信任地基 / P0: Trust Foundations
399
+
400
+ #### 中文
401
+
402
+ **目标:先解决“工具是否真的可信”这个问题。**
403
+
404
+ 推荐任务:
405
+
406
+ 1. **实现稳定 finding fingerprint**
407
+ - 输入建议:`ruleId + normalizedFilePath + contextHash + occurrenceIndex`
408
+ 2. **让 include/exclude 真正生效**
409
+ - 严格定义为 pre-scan filtering
410
+ 3. **让规则失败与扫描异常可见**
411
+ - 输出 `rulesExecuted`、`rulesFailed`、`filesSkipped`、`scanErrors`
412
+ 4. **固化 JSON schema v1**
413
+ - 明确 `toolHealth` 与 `analysisResult` 分层
414
+ 5. **收敛 `scan` 与 `report` 的职责边界**
415
+ - `scan` 做即时扫描
416
+ - `report` 做 artifact / baseline / previous result 展示
417
+
418
+ #### English Translation
419
+
420
+ **Goal: first settle whether the tool itself can be trusted.**
421
+
422
+ Recommended tasks:
423
+
424
+ 1. **Implement stable finding fingerprints**
425
+ - Suggested input: `ruleId + normalizedFilePath + contextHash + occurrenceIndex`
426
+ 2. **Make include/exclude truly effective**
427
+ - Define it strictly as pre-scan filtering
428
+ 3. **Make rule failures and scan errors visible**
429
+ - Output `rulesExecuted`, `rulesFailed`, `filesSkipped`, `scanErrors`
430
+ 4. **Freeze JSON schema v1**
431
+ - Separate `toolHealth` and `analysisResult`
432
+ 5. **Clarify the boundary between `scan` and `report`**
433
+ - `scan` performs live analysis
434
+ - `report` renders artifacts, baseline comparison, or previous results
435
+
436
+ ### 6.2 P1:把工具变成可进 CI 的 trust gate / P1: Turn the Tool into a CI-Ready Trust Gate
437
+
438
+ #### 中文
439
+
440
+ **目标:把“可扫描”升级为“可决策”。**
441
+
442
+ 推荐任务:
443
+
444
+ 1. **baseline / lifecycle 比对**
445
+ - 支持 `new / existing / fixed / suppressed`
446
+ 2. **suppression 模型**
447
+ - 支持 inline、file、rule、config 级 suppression
448
+ - 建议带 `reason`、`source`、`expiresAt`
449
+ 3. **policy engine**
450
+ - 支持 `off / warn / block`
451
+ 4. **GitHub Action v2**
452
+ - job summary
453
+ - changed-line annotation
454
+ - json artifact
455
+ - baseline ref 输入
456
+ - fail-on-new-blocking
457
+ - fail-on-score-below
458
+
459
+ #### English Translation
460
+
461
+ **Goal: move from “scan-capable” to “decision-capable.”**
462
+
463
+ Recommended tasks:
464
+
465
+ 1. **baseline / lifecycle comparison**
466
+ - Support `new / existing / fixed / suppressed`
467
+ 2. **suppression model**
468
+ - Support inline, file-level, rule-level, and config-level suppression
469
+ - Ideally include `reason`, `source`, and `expiresAt`
470
+ 3. **policy engine**
471
+ - Support `off / warn / block`
472
+ 4. **GitHub Action v2**
473
+ - job summary
474
+ - changed-line annotations
475
+ - JSON artifact output
476
+ - baseline ref input
477
+ - fail-on-new-blocking
478
+ - fail-on-score-below
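The baseline comparison and the expiring suppression from tasks 1 and 2 can be combined in one small classifier. This is a sketch under assumptions: the `Suppression` fields follow the suggested `reason` / `source` / `expiresAt`, but suppressions are keyed by fingerprint only here, so rule-level or file-level suppressions would need extra matching logic.

```typescript
type Lifecycle = "new" | "existing" | "fixed" | "suppressed";

// Hypothetical suppression record; field names follow the suggestion above.
interface Suppression {
  fingerprint: string;
  reason: string;
  source: "inline" | "file" | "rule" | "config";
  expiresAt?: string; // ISO 8601 date; absent means no expiry
}

// An unparseable date yields NaN, which compares as false and so counts as expired.
function isActive(s: Suppression, now: Date): boolean {
  return s.expiresAt === undefined || Date.parse(s.expiresAt) > now.getTime();
}

function classify(
  baseline: string[],
  current: string[],
  suppressions: Suppression[],
  now: Date,
): Map<string, Lifecycle> {
  const baselineSet = new Set(baseline);
  const currentSet = new Set(current);
  const suppressed = new Set(
    suppressions.filter((s) => isActive(s, now)).map((s) => s.fingerprint),
  );
  const states = new Map<string, Lifecycle>();
  for (const fp of current) {
    if (suppressed.has(fp)) states.set(fp, "suppressed");
    else states.set(fp, baselineSet.has(fp) ? "existing" : "new");
  }
  // Anything in the baseline that no longer appears is considered fixed.
  for (const fp of baseline) {
    if (!currentSet.has(fp)) states.set(fp, "fixed");
  }
  return states;
}
```

Note that an expired suppression silently falls back to `existing` or `new`, which is the behavior that makes `expiresAt` useful for temporary risk acceptance.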
479
+
480
+ ### 6.3 P2:专业化与生态桥接 / P2: Professionalization and Ecosystem Bridge
481
+
482
+ #### 中文
483
+
484
+ **目标:提高专业感与外部系统兼容性。**
485
+
486
+ 推荐任务:
487
+
488
+ 1. **GitHub-compatible SARIF exporter**
489
+ - 稳定 `partialFingerprints`
490
+ - 设定 `automationDetails.id`
491
+ - 默认排除 suppressed findings
492
+ 2. **explain 模式**
493
+ - `codetrust explain <rule-id>`
494
+ 3. **presets**
495
+ - `recommended`
496
+ - `strict`
497
+ - `ci-gate`
498
+ - `ai-suspicious`
499
+ 4. **top risk file / top risk dimension / top risk module**
500
+
501
+ #### English Translation
502
+
503
+ **Goal: add polish and improve compatibility with external systems.**
504
+
505
+ Recommended tasks:
506
+
507
+ 1. **GitHub-compatible SARIF exporter**
508
+ - stable `partialFingerprints`
509
+ - explicit `automationDetails.id`
510
+ - exclude suppressed findings by default
511
+ 2. **Explain mode**
512
+ - `codetrust explain <rule-id>`
513
+ 3. **Presets**
514
+ - `recommended`
515
+ - `strict`
516
+ - `ci-gate`
517
+ - `ai-suspicious`
518
+ 4. **top risk file / top risk dimension / top risk module**
519
+
520
+ ---
521
+
522
+ ## 7. 建议的输出模型 / Suggested Output Model
523
+
524
+ ### 中文
525
+
526
+ 建议 CodeTrust 的 JSON 输出模型从一开始就分为两部分:
527
+
528
+ #### 7.1 toolHealth
529
+
530
+ 用于说明工具这次“执行得怎么样”:
531
+
532
+ - `scanMode`
533
+ - `rulesExecuted`
534
+ - `rulesFailed`
535
+ - `filesConsidered`
536
+ - `filesExcluded`
537
+ - `filesSkipped`
538
+ - `scanErrors`
539
+ - `durationMs`
540
+
541
+ #### 7.2 analysisResult
542
+
543
+ 用于说明代码“风险长什么样”:
544
+
545
+ - `overall`
546
+ - `dimensions`
547
+ - `issues`
548
+ - `topRiskFiles`
549
+ - `thresholdResult`
550
+ - `lifecycleSummary`
551
+
552
+ 这样做的价值在于:
553
+
554
+ - 用户可以区分“分数低是因为代码差,还是工具没跑完整”
555
+ - 后续 PR summary 和 SARIF exporter 也更容易消费
556
+
557
+ ### English Translation
558
+
559
+ CodeTrust’s JSON output should be separated into two parts from the start:
560
+
561
+ #### 7.1 toolHealth
562
+
563
+ This explains how well the tool executed:
564
+
565
+ - `scanMode`
566
+ - `rulesExecuted`
567
+ - `rulesFailed`
568
+ - `filesConsidered`
569
+ - `filesExcluded`
570
+ - `filesSkipped`
571
+ - `scanErrors`
572
+ - `durationMs`
573
+
574
+ #### 7.2 analysisResult
575
+
576
+ This explains what the code risk looks like:
577
+
578
+ - `overall`
579
+ - `dimensions`
580
+ - `issues`
581
+ - `topRiskFiles`
582
+ - `thresholdResult`
583
+ - `lifecycleSummary`
584
+
585
+ The value of this split is:
586
+
587
+ - users can tell whether a low score comes from bad code or an incomplete scan
588
+ - PR summaries and SARIF exporters become much easier to build cleanly
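One possible typing of this split is shown below. The field names come from the two lists above, but every field type, the nested shapes, and the helper functions are assumptions made for the sketch rather than a published schema.

```typescript
// Field names follow the lists above; all field types are assumptions.
interface ToolHealth {
  scanMode: string;
  rulesExecuted: number;
  rulesFailed: number;
  filesConsidered: number;
  filesExcluded: number;
  filesSkipped: number;
  scanErrors: string[];
  durationMs: number;
}

interface AnalysisResult {
  overall: number;
  dimensions: Record<string, number>;
  issues: Array<{ ruleId: string; fingerprint: string }>;
  topRiskFiles: string[];
  thresholdResult: { minScore: number; passed: boolean };
  lifecycleSummary: Record<"new" | "existing" | "fixed" | "suppressed", number>;
}

interface ScanOutput {
  toolHealth: ToolHealth;
  analysisResult: AnalysisResult;
}

// Tool health: did the scan run completely?
function isScanComplete(h: ToolHealth): boolean {
  return h.rulesFailed === 0 && h.filesSkipped === 0 && h.scanErrors.length === 0;
}

// Check tool health first, so an incomplete scan is never reported as a code verdict.
function describeOutcome(out: ScanOutput): string {
  if (!isScanComplete(out.toolHealth)) {
    return "inconclusive: the scan did not run completely";
  }
  return out.analysisResult.thresholdResult.passed
    ? "pass"
    : "fail: the score is below the threshold";
}
```

The ordering inside `describeOutcome` is the point of the split: a low score from a partial scan is surfaced as a tool problem, not as a judgment on the code.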
589
+
590
+ ---
591
+
592
+ ## 8. 可直接创建的 GitHub Issues Backlog / GitHub-Issue-Ready Backlog
593
+
619
+ ### English Translation
620
+
621
+ #### P0
622
+
623
+ 1. **feat: add stable finding fingerprint generation**
624
+ 2. **fix: apply include/exclude filtering before scan execution**
625
+ 3. **feat: surface rule execution failures in scan metadata**
626
+ 4. **feat: add strict engine mode for CI**
627
+ 5. **feat: freeze JSON output schema v1**
628
+ 6. **refactor: make report artifact-based instead of live scan**
629
+
630
+ #### P1
631
+
632
+ 7. **feat: implement baseline comparison and finding lifecycle states**
633
+ 8. **feat: add suppression model with reason and optional expiry**
634
+ 9. **feat: add policy modes (off/warn/block) per rule/category**
635
+ 10. **feat: add GitHub Action job summary and changed-line annotations**
636
+
637
+ #### P2
638
+
639
+ 11. **feat: export GitHub-compatible SARIF with stable fingerprints**
640
+ 12. **feat: add explain command for rules and findings**
641
+ 13. **feat: add recommended/strict/ci-gate presets**
642
+ 14. **feat: report top-risk files and top-risk dimensions**
643
+
644
+ ---
645
+
646
+ ## 9. 建议的成功指标 / Suggested Success Metrics
647
+
648
+ ### 中文
649
+
650
+ 为了避免路线图只停留在“功能完成”,建议给下一阶段补上结果指标:
651
+
652
+ 1. **规则执行可靠性**
653
+ - `rulesFailed / rulesExecuted` 持续下降
654
+ 2. **baseline 稳定性**
655
+ - 同一问题在小范围重构后,不应被大量重新识别为 new
656
+ 3. **CI 可接受性**
657
+ - 团队实际愿意开启 blocking mode
658
+ 4. **误报管理成本**
659
+ - suppression 的创建与追踪成本可控
660
+ 5. **PR 可读性**
661
+ - summary 与 annotations 能帮助开发者在一次 review 中完成修复
662
+
663
+ ### English Translation
664
+
665
+ To avoid a roadmap that only measures “features shipped,” the next phase should also include outcome metrics:
666
+
667
+ 1. **Rule execution reliability**
668
+ - `rulesFailed / rulesExecuted` should trend downward
669
+ 2. **Baseline stability**
670
+ - the same issue should not frequently reappear as new after small refactors
671
+ 3. **CI acceptability**
672
+ - teams should actually be willing to enable blocking mode
673
+ 4. **False-positive management cost**
674
+ - creating and tracking suppressions should remain manageable
675
+ 5. **PR readability**
676
+ - summaries and annotations should help developers fix issues in a single review cycle
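The first two metrics are simple ratios and could be computed roughly as follows. The function names are hypothetical, and "baseline churn" is defined here, as an assumption, as the share of pre-refactor fingerprints that disappear after a refactor that should not have changed behavior.

```typescript
// Metric 1: share of rule executions that failed. Returns 0 when nothing ran.
function ruleFailureRate(rulesFailed: number, rulesExecuted: number): number {
  return rulesExecuted === 0 ? 0 : rulesFailed / rulesExecuted;
}

// Metric 2: baseline churn across a behavior-preserving refactor.
// Each lost fingerprint would resurface as a spurious "new" finding in CI.
function baselineChurn(before: string[], after: string[]): number {
  if (before.length === 0) return 0;
  const afterSet = new Set(after);
  const lost = before.filter((fp) => !afterSet.has(fp)).length;
  return lost / before.length;
}
```

Tracking both values per release would show whether fingerprint changes are making the baseline more or less stable over time.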
677
+
678
+ ---
679
+
680
+ ## 10. 当前不建议优先投入的方向 / Areas Not Worth Prioritizing Yet
681
+
682
+ ### 中文
683
+
684
+ 在 finding lifecycle 与 CI trust gate 建好之前,不建议把主精力投入到以下方向:
685
+
686
+ - 多语言支持
687
+ - VS Code 插件
688
+ - MCP server
689
+ - SaaS dashboard
690
+ - “AI probability” 或模糊型 AI 检测能力
691
+
692
+ 原因不是这些方向没价值,而是它们都建立在“核心工作流足够可信”之上。
693
+
694
+ ### English Translation
695
+
696
+ Before finding lifecycle and the CI trust gate are solid, the main effort should not go into:
697
+
698
+ - multi-language support
699
+ - a VS Code extension
700
+ - an MCP server
701
+ - a SaaS dashboard
702
+ - fuzzy “AI probability” style detectors
703
+
704
+ The reason is not that these are worthless, but that they all depend on a trustworthy core workflow.
705
+
706
+ ---
707
+
708
+ ## 11. 最终结论 / Final Conclusion
709
+
710
+ ### 中文
711
+
712
+ CodeTrust 当前最重要的升级方向,不是让自己成为“更强的 scanner”,而是让自己成为“更可靠的 decision system”。
713
+
714
+ 一句话总结:
715
+
716
+ **CodeTrust 的下一阶段,不应再以“规则数量”来定义进度,而应以“finding lifecycle、policy、delivery 是否成立”来定义成熟度。**
717
+
718
+ 如果只能保留一个优先级判断,那就是:
719
+
720
+ **先把 CodeTrust 做成一个让团队敢放进 CI 的工具,再考虑把它做成一个让人兴奋的工具。**
721
+
722
+ ### English Translation
723
+
724
+ The most important upgrade for CodeTrust is not to become “a stronger scanner,” but to become “a more reliable decision system.”
725
+
726
+ In one sentence:
727
+
728
+ **The next stage of CodeTrust should not be measured by rule count, but by whether finding lifecycle, policy, and delivery are truly in place.**
729
+
730
+ If there is only one priority judgment to keep, it is this:
731
+
732
+ **First make CodeTrust something teams are willing to put into CI. Then make it something that excites them.**
733
+
734
+ ---
735
+
736
+ ## 12. 参考资料 / References
737
+
771
+ ### English Translation
772
+
773
+ The following references were the core sources for this targeted Exa research:
774
+
775
+ - Semgrep Findings in CI
776
+ https://semgrep.dev/docs/semgrep-ci/findings-ci
777
+ - Semgrep Configure blocking findings
778
+ https://semgrep.dev/docs/semgrep-ci/configuring-blocking-and-errors-in-ci
779
+ - Semgrep Ignore files, folders, and code
780
+ https://semgrep.dev/docs/ignoring-files-folders-code
781
+ - Semgrep Semgrepignore v2 reference
782
+ https://semgrep.dev/docs/semgrepignore-v2-reference
783
+ - GitHub Uploading a SARIF file to GitHub
784
+ https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/uploading-a-sarif-file-to-github
785
+ - GitHub SARIF support for code scanning
786
+ https://docs.github.com/en/code-security/reference/code-scanning/sarif-files/sarif-support-for-code-scanning
787
+ - GitHub SARIF results exceed one or more limits
788
+ https://docs.github.com/en/code-security/code-scanning/troubleshooting-sarif-uploads/results-exceed-limit
789
+ - GitHub SARIF file is too large
790
+ https://docs.github.com/en/code-security/reference/code-scanning/sarif-files/troubleshoot-sarif-uploads/file-too-large
791
+ - Snyk Ignore issues
792
+ https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/ignore-issues
793
+ - Snyk The .snyk file
794
+ https://docs.snyk.io/manage-risk/policies/the-.snyk-file
795
+ - reviewdog repository
796
+ https://github.com/reviewdog/reviewdog
797
+ - Reviewdog filter settings with GitHub Actions
798
+ https://lornajane.net/posts/2024/reviewdog-filter-settings-with-github-actions
799
+ - Semgrep PR discussion: suppressed findings in SARIF
800
+ https://github.com/returntocorp/semgrep/pull/3616
801
+ - Semgrep issue: SARIF and suppressed findings caveat
802
+ https://github.com/returntocorp/semgrep/issues/7121