agentflight 0.3.2 → 0.4.0

Files changed (50)
  1. package/CHANGELOG.md +63 -0
  2. package/README.md +41 -9
  3. package/dist/commands/replay.d.ts.map +1 -1
  4. package/dist/commands/replay.js +9 -3
  5. package/dist/commands/replay.js.map +1 -1
  6. package/dist/commands/report.d.ts.map +1 -1
  7. package/dist/commands/report.js +9 -3
  8. package/dist/commands/report.js.map +1 -1
  9. package/dist/commands/resume.d.ts.map +1 -1
  10. package/dist/commands/resume.js +10 -2
  11. package/dist/commands/resume.js.map +1 -1
  12. package/dist/commands/snapshot.d.ts.map +1 -1
  13. package/dist/commands/snapshot.js +19 -1
  14. package/dist/commands/snapshot.js.map +1 -1
  15. package/dist/commands/status.d.ts.map +1 -1
  16. package/dist/commands/status.js +28 -5
  17. package/dist/commands/status.js.map +1 -1
  18. package/dist/core/changed-files.d.ts +7 -0
  19. package/dist/core/changed-files.d.ts.map +1 -0
  20. package/dist/core/changed-files.js +73 -0
  21. package/dist/core/changed-files.js.map +1 -0
  22. package/dist/core/config.d.ts.map +1 -1
  23. package/dist/core/config.js +3 -0
  24. package/dist/core/config.js.map +1 -1
  25. package/dist/core/git.d.ts.map +1 -1
  26. package/dist/core/git.js +5 -2
  27. package/dist/core/git.js.map +1 -1
  28. package/dist/core/review-intelligence.d.ts +9 -0
  29. package/dist/core/review-intelligence.d.ts.map +1 -0
  30. package/dist/core/review-intelligence.js +340 -0
  31. package/dist/core/review-intelligence.js.map +1 -0
  32. package/dist/renderers/html-replay.d.ts +2 -1
  33. package/dist/renderers/html-replay.d.ts.map +1 -1
  34. package/dist/renderers/html-replay.js +188 -50
  35. package/dist/renderers/html-replay.js.map +1 -1
  36. package/dist/renderers/markdown-report.d.ts +3 -3
  37. package/dist/renderers/markdown-report.d.ts.map +1 -1
  38. package/dist/renderers/markdown-report.js +35 -5
  39. package/dist/renderers/markdown-report.js.map +1 -1
  40. package/dist/renderers/resume-prompt.d.ts +4 -1
  41. package/dist/renderers/resume-prompt.d.ts.map +1 -1
  42. package/dist/renderers/resume-prompt.js +32 -2
  43. package/dist/renderers/resume-prompt.js.map +1 -1
  44. package/dist/types/index.d.ts +39 -0
  45. package/dist/types/index.d.ts.map +1 -1
  46. package/docs/assets/agentflight-replay-timeline.png +0 -0
  47. package/docs/development/changed-file-filters.md +58 -0
  48. package/docs/{roadmap.md → roadmap/index.md} +5 -0
  49. package/docs/roadmap/v0.4.0-review-intelligence-plan.md +882 -0
  50. package/package.json +4 -2
@@ -0,0 +1,882 @@
1
+ # AgentFlight v0.4.0 Review Intelligence Plan
2
+
3
+ ## Product Goal
4
+
5
+ AgentFlight v0.4.0 should help developers decide where to review first, why those files matter, what proof exists, what proof is missing, and whether the session is ready for human review.
6
+
7
+ The release should build on the v0.1 through v0.3 arc:
8
+
9
+ - v0.1: local session recording and proof artifacts.
10
+ - v0.2: real verification evidence.
11
+ - v0.3: snapshots and timelines.
12
+ - v0.4: deterministic review intelligence over the recorded local evidence.
13
+
14
+ AgentFlight should remain the control room around AI coding agents. It should not become a coding agent, CI system, cloud service, or PR automation product in this release.
15
+
16
+ ## Recommended Scope
17
+
18
+ Recommendation: **Option B: Review focus ranking + proof gap detection + readiness model + config-driven generated/internal file filters.**
19
+
20
+ Why this scope:
21
+
22
+ - Review intelligence is the primary product direction and directly improves `status`, `report`, `replay`, and `resume`.
23
+ - Dogfooding showed generated/internal artifacts can still distort review focus. v0.3.3 fixed AgentFlight runtime artifacts; v0.4.0 should let users configure additional local ignore patterns such as `.projscan-memory/**`.
24
+ - ProjScan enrichment is valuable, but depending on unstable or human-formatted output would make AgentFlight brittle. v0.4.0 should keep the integration defensive and prepare a clean hook for future structured hints.
25
+
26
+ Do not include ProjScan-enriched ranking in v0.4.0 unless ProjScan exposes a stable machine-readable contract before implementation begins.
27
+
28
+ ## Non-Goals
29
+
30
+ Do not build these in v0.4.0:
31
+
32
+ - GitHub PR comments.
33
+ - JSON output.
34
+ - CI integration.
35
+ - Cloud sync.
36
+ - Login.
37
+ - Billing.
38
+ - Pro or Team gating.
39
+ - GitHub App.
40
+ - LLM calls.
41
+ - Source upload.
42
+ - Review automation that claims a human review happened.
43
+ - Broad replay redesign.
44
+ - Full diff capture by default.
45
+ - Hosted dashboards.
46
+
47
+ Explicitly defer these dogfood findings:
48
+
49
+ - Interrupted verification cleanup.
50
+ - Long verification heartbeat or progress output.
51
+ - Tool availability messaging alignment.
52
+ - Deeper ProjScan risk enrichment.
53
+ - Larger report/replay design system work beyond surfacing review intelligence.
54
+
55
+ ## User Stories
56
+
57
+ 1. As a developer using Codex or Claude Code, I want AgentFlight to tell me which changed files need review first so I can avoid scanning everything equally.
58
+ 2. As a reviewer, I want each review focus item to explain why it matters so I can understand the risk without reading AgentFlight internals.
59
+ 3. As an agent operator, I want AgentFlight to tell me which proof is missing so I can run the right command before claiming completion.
60
+ 4. As a developer in a repo with generated files, I want to configure local changed-file ignores so generated tool artifacts do not distort risk and review guidance.
61
+ 5. As a handoff recipient, I want the resume prompt to include the most important review focus and the exact next verification command.
62
+ 6. As a security-conscious user, I want all analysis to stay local and avoid full source diffs by default.
63
+
64
+ ## Current-State Analysis
65
+
66
+ ### Risk Categorisation
67
+
68
+ Current file: `src/core/risk.ts`
69
+
70
+ AgentFlight already categorises changed files into:
71
+
72
+ - `auth`
73
+ - `billing/payments`
74
+ - `database/migrations`
75
+ - `security/secrets`
76
+ - `config`
77
+ - `tests`
78
+ - `docs`
79
+ - `frontend`
80
+ - `backend/api`
81
+ - `dependencies`
82
+ - `unknown`
83
+
84
+ It computes a session-level risk level and reasons. It does not rank individual files or connect file categories to evidence requirements.
85
+
86
+ ### Changed-File Filtering
87
+
88
+ Current file: `src/core/changed-files.ts`
89
+
90
+ AgentFlight v0.3.3 filters its own runtime artifacts:
91
+
92
+ - `.agentflight/sessions/**`
93
+ - `.agentflight/reports/**`
94
+ - `.agentflight/current/**`
95
+ - `.agentflight/evidence/**`
96
+
97
+ It intentionally keeps `.agentflight/config.json` visible because it is user-controlled project configuration.
98
+
99
+ The filter is not yet config-driven. Dogfooding found `.projscan-memory/memory.json` can appear as a normal changed file because it is generated by ProjScan, not AgentFlight.
100
+
101
+ ### Verification Evidence
102
+
103
+ Current file: `src/core/verification.ts`
104
+
105
+ AgentFlight records verification runs with command, timestamps, duration, exit code, pass/fail state, and stdout/stderr evidence paths. It builds a basic summary:
106
+
107
+ - passed count
108
+ - failed count
109
+ - missing configured commands
110
+ - basic gaps
111
+ - readiness
112
+ - next action
113
+
114
+ The current model does not detect category-specific proof gaps such as "auth files changed but no test evidence was captured."
115
+
116
+ ### Recommendation and Readiness Logic
117
+
118
+ Current file: `src/core/verification.ts`
119
+
120
+ Current readiness states are:
121
+
122
+ - `Ready for review`
123
+ - `Not ready for review`
124
+ - `Blocked`
125
+ - `Unknown`
126
+
127
+ This is useful but too coarse. v0.4.0 should make readiness explainable with:
128
+
129
+ - state
130
+ - reason
131
+ - next best action
132
+ - suggested command, when one can be inferred
133
+
134
+ Suggested v0.4.0 display states:
135
+
136
+ - `Ready for review`
137
+ - `Needs verification`
138
+ - `Not ready for review`
139
+ - `Blocked by failed verification`
140
+ - `Unknown`
141
+
142
+ ### Reports and Replay
143
+
144
+ Current files:
145
+
146
+ - `src/renderers/markdown-report.ts`
147
+ - `src/renderers/html-replay.ts`
148
+
149
+ Reports and replays already show changed files, risk categories, timelines, verification evidence, and a recommendation. They do not yet show ranked review focus items or proof gaps as first-class sections.
150
+
151
+ ### Resume Prompt
152
+
153
+ Current file: `src/renderers/resume-prompt.ts`
154
+
155
+ Resume prompts show changed files, risk reasons, verification gaps, latest snapshot note, verification state, and next action. v0.4.0 should add review focus and make the next command more precise.
156
+
157
+ ### ProjScan Adapter
158
+
159
+ Current file: `src/adapters/projscan.ts`
160
+
161
+ The adapter detects availability and runs a baseline command defensively. It captures summary text but does not expose structured hotspot or architecture data. v0.4.0 should not parse unstable human-readable output.
162
+
163
+ ## Proposed Data Model Changes
164
+
165
+ Add review-intelligence types to `src/types/index.ts`.
166
+
167
+ ```ts
168
+ export type ReviewReadinessState =
169
+ | "Ready for review"
170
+ | "Needs verification"
171
+ | "Not ready for review"
172
+ | "Blocked by failed verification"
173
+ | "Unknown";
174
+
175
+ export interface ReviewFocusItem {
176
+ rank: number;
177
+ file: string;
178
+ category: RiskCategory;
179
+ riskLevel: RiskLevel;
180
+ score: number;
181
+ reasons: string[];
182
+ proofStatus: "covered" | "missing" | "failed" | "not-required" | "unknown";
183
+ suggestedCommand?: string;
184
+ }
185
+
186
+ export interface ProofGap {
187
+ id: string;
188
+ severity: "blocking" | "recommended" | "informational";
189
+ message: string;
190
+ affectedFiles: string[];
191
+ suggestedCommand?: string;
192
+ }
193
+
194
+ export interface ReviewReadinessDecision {
195
+ state: ReviewReadinessState;
196
+ reason: string;
197
+ nextAction: string;
198
+ suggestedCommand?: string;
199
+ }
200
+
201
+ export interface ReviewIntelligence {
202
+ focus: ReviewFocusItem[];
203
+ proofGaps: ProofGap[];
204
+ readiness: ReviewReadinessDecision;
205
+ }
206
+ ```
207
+
208
+ These types do not need to be persisted into session files in v0.4.0. They can be computed from current session data, changed files, risk analysis, config, and verification evidence. Avoiding persistence preserves v0.1, v0.2, and v0.3 session compatibility.
209
+
210
+ ## Proposed Config Changes
211
+
212
+ Extend `AgentFlightConfig` with an optional changed-file filter section:
213
+
214
+ ```ts
215
+ export interface AgentFlightConfig {
216
+ version: 1;
217
+ projectName: string;
218
+ createdAt: string;
219
+ engines: {
220
+ projscan: {
221
+ enabled: boolean;
222
+ mode: AgentFlightEngineMode;
223
+ };
224
+ agentloopkit: {
225
+ enabled: boolean;
226
+ mode: AgentFlightEngineMode;
227
+ };
228
+ };
229
+ verification: {
230
+ commands: string[];
231
+ };
232
+ changedFileFilters?: {
233
+ ignore: string[];
234
+ };
235
+ privacy: {
236
+ localOnly: true;
237
+ telemetry: false;
238
+ };
239
+ }
240
+ ```
241
+
242
+ Backwards compatibility:
243
+
244
+ - Existing v0.1 through v0.3 configs without `changedFileFilters` continue to work.
245
+ - Missing `changedFileFilters.ignore` is treated as an empty list.
246
+ - Built-in AgentFlight runtime filtering remains separate and always active.
247
+
248
+ Default for new `agentflight init` configs:
249
+
250
+ ```json
251
+ "changedFileFilters": {
252
+ "ignore": []
253
+ }
254
+ ```
255
+
256
+ Reasoning:
257
+
258
+ - AgentFlight's own runtime artifacts are always hidden by built-in filters.
259
+ - Generated directories such as `.projscan-memory/**`, `dist/**`, `coverage/**`, and `.next/**` should be documented examples, not default ignores, because some projects intentionally review or commit generated outputs.
260
+
261
+ Docs should show optional examples:
262
+
263
+ ```json
264
+ "changedFileFilters": {
265
+ "ignore": [
266
+ ".projscan-memory/**",
267
+ "coverage/**",
268
+ "dist/**",
269
+ ".next/**"
270
+ ]
271
+ }
272
+ ```
273
+
274
+ ## Affected Commands
275
+
276
+ ### `agentflight status`
277
+
278
+ Add:
279
+
280
+ - `Review first` ranked list.
281
+ - `Missing proof` as structured proof gaps.
282
+ - Readiness state with reason.
283
+ - Next action with suggested command when available.
284
+
285
+ Keep output concise. Show the top 3 to 5 focus items in the terminal.
286
+
287
+ ### `agentflight report`
288
+
289
+ Add sections:
290
+
291
+ - `## Review First`
292
+ - `## Proof Gaps`
293
+ - `## Review Readiness`
294
+
295
+ The report should include all focus items, not just the top few shown by `status`, because it is a shareable artifact.
296
+
297
+ ### `agentflight replay`
298
+
299
+ Add:
300
+
301
+ - Review focus panel near the summary strip.
302
+ - Proof gap panel before or near verification evidence.
303
+ - Readiness reason in the recommendation section.
304
+
305
+ Keep the v0.3.3 calm developer-review visual direction.
306
+
307
+ ### `agentflight resume`
308
+
309
+ Add:
310
+
311
+ - Top review focus items.
312
+ - Proof gaps.
313
+ - Exact next suggested command.
314
+ - Clear instruction to handle the highest-ranked focus item first.
315
+
316
+ ### `agentflight snapshot`
317
+
318
+ Snapshots should continue capturing current risk and verification summary. v0.4.0 can optionally include a compact review intelligence summary in snapshot metadata:
319
+
320
+ ```json
321
+ "review": {
322
+ "readiness": "Needs verification",
323
+ "topFocusFiles": ["src/auth/session.ts"],
324
+ "proofGapCount": 2
325
+ }
326
+ ```
327
+
328
+ This is optional and should not block the release if it complicates compatibility. The primary outputs can compute review intelligence live.
329
+
330
+ ### `agentflight doctor`
331
+
332
+ No v0.4.0 behavior change required. Do not use this release to align tool availability messaging unless it falls out naturally from config validation.
333
+
334
+ ## Affected Renderers
335
+
336
+ ### Markdown Report
337
+
338
+ Modify `src/renderers/markdown-report.ts` input to accept `review: ReviewIntelligence`.
339
+
340
+ Render:
341
+
342
+ ```md
343
+ ## Review First
344
+
345
+ 1. src/auth/session.ts
346
+ Why: identity/session path; no passing test evidence
347
+ Suggested proof: npm test
348
+
349
+ ## Proof Gaps
350
+
351
+ - Auth files changed but no passing test evidence was recorded.
352
+
353
+ ## Review Readiness
354
+
355
+ Needs verification
356
+ Reason: High-risk files changed without matching proof.
357
+ Next action: Run agentflight verify -- npm test
358
+ ```
359
+
360
+ ### HTML Replay
361
+
362
+ Modify `src/renderers/html-replay.ts` input to accept `review: ReviewIntelligence`.
363
+
364
+ Render:
365
+
366
+ - Summary strip: readiness, risk, changed files, proof.
367
+ - Review focus section: ranked rows with file, category, reasons, and proof status.
368
+ - Proof gaps section: compact list with suggested command.
369
+ - Existing timeline and verification sections remain.
370
+
371
+ ### Resume Prompt
372
+
373
+ Modify `src/renderers/resume-prompt.ts` input to include:
374
+
375
+ - `reviewFocus: ReviewFocusItem[]`
376
+ - `proofGaps: ProofGap[]`
377
+ - `readiness: ReviewReadinessDecision`
378
+
379
+ Keep the prompt concise and agent-safe.
380
+
381
+ ## Review Scoring and Ranking Approach
382
+
383
+ Create `src/core/review-intelligence.ts`.
384
+
385
+ Inputs:
386
+
387
+ - changed files after built-in and config-driven filtering
388
+ - `RiskAnalysis`
389
+ - verification summary and runs
390
+ - session verification commands
391
+ - optional ProjScan hints in a future-compatible shape
392
+
393
+ Algorithm v0:
394
+
395
+ 1. Categorise each changed file with the existing `categorizeFile`.
396
+ 2. Assign a base score by category:
397
+
398
+ | Category | Base score |
399
+ | --------------------- | ---------- |
400
+ | `auth` | 100 |
401
+ | `billing/payments` | 95 |
402
+ | `security/secrets` | 95 |
403
+ | `database/migrations` | 90 |
404
+ | `config` | 75 |
405
+ | `backend/api` | 70 |
406
+ | `dependencies` | 65 |
407
+ | `unknown` | 50 |
408
+ | `frontend` | 35 |
409
+ | `tests` | 20 |
410
+ | `docs` | 10 |
411
+
412
+ 3. Add modifiers:
413
+
414
+ - `+30` if the category has no matching proof and proof is expected.
415
+ - `+40` if any verification failed and the file's category is not `docs`.
416
+ - `+20` if dependency files changed and no build or test proof exists.
417
+ - `+15` if config files changed and no lint, typecheck, or build proof exists.
418
+ - `+10` if file category is `unknown`.
419
+
420
+ 4. Sort descending by score.
421
+ 5. Tie-break by path name for deterministic output.
422
+ 6. Limit terminal status to top 5 items; report/replay can show all.
423
+
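The ranking steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the shipped `review-intelligence.js`: `ScoringInput`, `RankedFile`, `scoreFile`, and `rankFiles` are hypothetical names, `RiskCategory` is a local stand-in for the existing type, and the `+20` dependency and `+15` config modifiers are left out for brevity.

```ts
// Sketch only: local stand-in for the plan's RiskCategory type.
type RiskCategory =
  | "auth" | "billing/payments" | "security/secrets" | "database/migrations"
  | "config" | "backend/api" | "dependencies" | "unknown"
  | "frontend" | "tests" | "docs";

// Base scores copied from the table in this plan.
const BASE_SCORE: Record<RiskCategory, number> = {
  auth: 100,
  "billing/payments": 95,
  "security/secrets": 95,
  "database/migrations": 90,
  config: 75,
  "backend/api": 70,
  dependencies: 65,
  unknown: 50,
  frontend: 35,
  tests: 20,
  docs: 10,
};

// Hypothetical input: the caller has already categorised the file and
// decided which modifiers apply.
interface ScoringInput {
  file: string;
  category: RiskCategory;
  proofExpectedButMissing: boolean;
  anyVerificationFailed: boolean;
}

interface RankedFile {
  rank: number;
  file: string;
  category: RiskCategory;
  score: number;
}

function scoreFile(input: ScoringInput): number {
  let score = BASE_SCORE[input.category];
  if (input.proofExpectedButMissing) score += 30;
  if (input.anyVerificationFailed && input.category !== "docs") score += 40;
  if (input.category === "unknown") score += 10;
  return score;
}

function rankFiles(inputs: ScoringInput[]): RankedFile[] {
  return inputs
    .map((input) => ({ file: input.file, category: input.category, score: scoreFile(input) }))
    // Descending score, then plain code-unit path comparison so the order
    // does not depend on the machine's locale.
    .sort((a, b) => b.score - a.score || (a.file < b.file ? -1 : a.file > b.file ? 1 : 0))
    .map((item, index) => ({ rank: index + 1, ...item }));
}
```

Comparing paths with `<` and `>` rather than `localeCompare` is a deliberate choice in this sketch: it keeps equal-score ordering identical across environments, which is what the deterministic-output test would need.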
424
+ Review reasons should be exact and human-readable, for example:
425
+
426
+ - `identity/session path`
427
+ - `backend/API file`
428
+ - `dependency metadata changed`
429
+ - `no passing test evidence`
430
+ - `build evidence missing`
431
+ - `verification failed`
432
+
433
+ Do not use vague phrases such as "AI confidence" or "probably important."
434
+
435
+ ## Proof Gap Detection Approach
436
+
437
+ Create deterministic proof classes from verification command strings:
438
+
439
+ ```ts
440
+ type VerificationProofKind = "test" | "build" | "typecheck" | "lint" | "install" | "unknown";
441
+ ```
442
+
443
+ Classify commands by normalized text:
444
+
445
+ - `test`, `vitest`, `jest`, `mocha`, `playwright`, `cypress` -> `test`
446
+ - `build` -> `build`
447
+ - `typecheck`, `tsc --noEmit`, `tsc` -> `typecheck`
448
+ - `lint`, `eslint` -> `lint`
449
+ - `install`, `npm ci`, `pnpm install`, `yarn install` -> `install`
450
+ - everything else -> `unknown`
451
+
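One way to express the classification list above is an ordered rule table. This is a sketch under assumptions: `PROOF_RULES` and `classifyCommand` are hypothetical names, the rule order (which decides how compound commands such as `npm ci && npm test` resolve) is a choice made here, and word-boundary matching is used so that strings like `projscan@latest` are not mistaken for a `test` command.

```ts
type VerificationProofKind = "test" | "build" | "typecheck" | "lint" | "install" | "unknown";

// Ordered rules; the first matching rule wins. The order is an assumption
// of this sketch, not something the plan specifies.
const PROOF_RULES: Array<[VerificationProofKind, RegExp]> = [
  ["typecheck", /\b(typecheck|tsc)\b/],
  ["lint", /\b(lint|eslint)\b/],
  ["test", /\b(test|vitest|jest|mocha|playwright|cypress)\b/],
  ["build", /\bbuild\b/],
  ["install", /\b(install|npm ci)\b/],
];

function classifyCommand(command: string): VerificationProofKind {
  // Normalise: trim, lower-case, and collapse runs of whitespace.
  const normalized = command.trim().toLowerCase().replace(/\s+/g, " ");
  for (const [kind, pattern] of PROOF_RULES) {
    if (pattern.test(normalized)) return kind;
  }
  return "unknown";
}
```

Keeping the rules in one small table makes the mapping easy to inspect, which fits the plan's goal of simple and transparent proof-kind inference.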
452
+ Gap rules:
453
+
454
+ - Failed verification exists: blocking gap, readiness `Blocked by failed verification`.
455
+ - `auth`, `billing/payments`, `security/secrets`, or `database/migrations` changed without passing `test`: blocking or recommended gap depending on whether configured test commands exist.
456
+ - `backend/api` changed without passing `test` or `build`: recommended gap.
457
+ - `dependencies` changed without passing `install`, `build`, or `test`: recommended gap.
458
+ - `config` changed without passing `lint`, `typecheck`, or `build`: recommended gap.
459
+ - `frontend` changed without passing `build` or `test`: recommended gap.
460
+ - docs-only changes with no verification: informational, not blocking.
461
+ - tests-only changes with no verification: recommended gap, suggest running the test suite.
462
+ - no changed files: readiness `Unknown`.
463
+
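The category-based gap rules above lend themselves to a table-driven check. The sketch below is illustrative only: `ACCEPTED_PROOF`, `HIGH_RISK`, and `detectProofGaps` are hypothetical names, the types are local stand-ins, and it covers only the per-category rules (failed runs, docs-only, tests-only, and empty change sets are not handled). It also assumes that a high-risk gap is `blocking` when a test command is configured and `recommended` otherwise; the plan leaves that direction open.

```ts
type RiskCategory =
  | "auth" | "billing/payments" | "security/secrets" | "database/migrations"
  | "config" | "backend/api" | "dependencies" | "unknown"
  | "frontend" | "tests" | "docs";
type VerificationProofKind = "test" | "build" | "typecheck" | "lint" | "install" | "unknown";

interface ProofGap {
  id: string;
  severity: "blocking" | "recommended" | "informational";
  message: string;
  affectedFiles: string[];
  suggestedCommand?: string;
}

// Passing proof kinds that satisfy each category, per the gap rules above.
const ACCEPTED_PROOF: Partial<Record<RiskCategory, VerificationProofKind[]>> = {
  auth: ["test"],
  "billing/payments": ["test"],
  "security/secrets": ["test"],
  "database/migrations": ["test"],
  "backend/api": ["test", "build"],
  dependencies: ["install", "build", "test"],
  config: ["lint", "typecheck", "build"],
  frontend: ["build", "test"],
};

const HIGH_RISK = new Set<RiskCategory>([
  "auth", "billing/payments", "security/secrets", "database/migrations",
]);

function detectProofGaps(
  filesByCategory: Partial<Record<RiskCategory, string[]>>,
  passedKinds: ReadonlySet<VerificationProofKind>,
  testCommandConfigured: boolean,
): ProofGap[] {
  const gaps: ProofGap[] = [];
  for (const category of Object.keys(ACCEPTED_PROOF) as RiskCategory[]) {
    const files = filesByCategory[category] ?? [];
    const accepted = ACCEPTED_PROOF[category] ?? [];
    // No gap when the category is untouched or any accepted proof passed.
    if (files.length === 0 || accepted.some((kind) => passedKinds.has(kind))) continue;
    // Assumption of this sketch: configured-but-unrun tests make the gap blocking.
    const severity: ProofGap["severity"] =
      HIGH_RISK.has(category) && testCommandConfigured ? "blocking" : "recommended";
    gaps.push({
      id: `missing-proof:${category}`,
      severity,
      message: `${category} files changed but no passing ${accepted.join(" or ")} evidence was recorded.`,
      affectedFiles: [...files].sort(),
    });
  }
  return gaps;
}
```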
464
+ Suggested command selection:
465
+
466
+ 1. Prefer the first configured command that maps to the missing proof kind.
467
+ 2. Otherwise infer from package scripts:
468
+ - `test` -> `npm test`
469
+ - `build` -> `npm run build`
470
+ - `typecheck` -> `npm run typecheck`
471
+ - `lint` -> `npm run lint`
472
+ 3. If no script exists, suggest `agentflight verify -- <command>` with a plain explanation instead of inventing a command.
473
+
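The selection order above might look like this in code. It is a sketch, assuming a classifier function is passed in and that package scripts are available as a plain object; `SCRIPT_FALLBACK` and `suggestCommand` are hypothetical names. Returning `undefined` stands for step 3, where the caller would print generic `agentflight verify -- <command>` guidance rather than invent a command.

```ts
type VerificationProofKind = "test" | "build" | "typecheck" | "lint" | "install" | "unknown";

// Script-name fallbacks listed in the plan; kinds without an entry
// (install, unknown) never get an inferred command.
const SCRIPT_FALLBACK: Partial<Record<VerificationProofKind, { script: string; command: string }>> = {
  test: { script: "test", command: "npm test" },
  build: { script: "build", command: "npm run build" },
  typecheck: { script: "typecheck", command: "npm run typecheck" },
  lint: { script: "lint", command: "npm run lint" },
};

function suggestCommand(
  missing: VerificationProofKind,
  configuredCommands: string[],
  packageScripts: Record<string, string>,
  classify: (command: string) => VerificationProofKind,
): string | undefined {
  // 1. Prefer the first configured command that maps to the missing kind.
  const configured = configuredCommands.find((command) => classify(command) === missing);
  if (configured !== undefined) return configured;

  // 2. Otherwise infer from package scripts, only if the script exists.
  const fallback = SCRIPT_FALLBACK[missing];
  if (fallback && Object.prototype.hasOwnProperty.call(packageScripts, fallback.script)) {
    return fallback.command;
  }

  // 3. Nothing can be inferred; let the caller explain instead of guessing.
  return undefined;
}
```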
474
+ ## Review Readiness Model
475
+
476
+ Readiness should be derived from proof gaps, failed verification, changed files, and risk:
477
+
478
+ 1. `Blocked by failed verification`
479
+ - Any verification run failed.
480
+ - Reason: the failed command must be fixed or rerun successfully.
481
+ - Next action: rerun the failed command after fixing it.
482
+
483
+ 2. `Unknown`
484
+ - No changed files, or git status could not be read.
485
+ - Reason: there is not enough changed-file evidence.
486
+ - Next action: make changes or inspect git status.
487
+
488
+ 3. `Needs verification`
489
+ - Changed files exist, no failed runs, but required or recommended proof is missing.
490
+ - Reason: specific proof gaps.
491
+ - Next action: run the highest-priority suggested command.
492
+
493
+ 4. `Not ready for review`
494
+ - Proof exists, but high-risk gaps remain or the focus list contains high-risk files without matching proof.
495
+ - Reason: high-risk changed files still lack targeted evidence.
496
+ - Next action: run suggested verification and regenerate report/replay.
497
+
498
+ 5. `Ready for review`
499
+ - No failed runs.
500
+ - No blocking gaps.
501
+ - Required proof is present, or the change set is docs-only with no configured proof requirement.
502
+ - Reason: verification evidence matches the observed risk.
503
+ - Next action: generate or share report/replay and request scoped review.
504
+
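The readiness model above can be read as an ordered decision function. The following is a sketch, not the shipped logic: `ReadinessInput` and `decideReadiness` are hypothetical names, and `ProofGap` is trimmed to the fields used here. The boundary between `Needs verification` and `Not ready for review` is an interpretation made for this example: a blocking gap that remains after at least one passing run yields `Not ready for review`, and any other blocking or recommended gap yields `Needs verification`.

```ts
type ReviewReadinessState =
  | "Ready for review"
  | "Needs verification"
  | "Not ready for review"
  | "Blocked by failed verification"
  | "Unknown";

// Trimmed copy of the plan's ProofGap: only the fields this sketch reads.
interface ProofGap {
  severity: "blocking" | "recommended" | "informational";
  message: string;
  suggestedCommand?: string;
}

interface ReviewReadinessDecision {
  state: ReviewReadinessState;
  reason: string;
  nextAction: string;
  suggestedCommand?: string;
}

// Hypothetical input shape for illustration.
interface ReadinessInput {
  changedFiles: string[];
  failedCommands: string[];
  passedRunCount: number;
  proofGaps: ProofGap[];
}

function decideReadiness(input: ReadinessInput): ReviewReadinessDecision {
  const failed = input.failedCommands[0];
  if (failed !== undefined) {
    return {
      state: "Blocked by failed verification",
      reason: `Verification failed: ${failed}`,
      nextAction: "Fix the failure, then rerun the failed command.",
      suggestedCommand: failed,
    };
  }
  if (input.changedFiles.length === 0) {
    return {
      state: "Unknown",
      reason: "There is not enough changed-file evidence.",
      nextAction: "Make changes or inspect git status.",
    };
  }
  const blocking = input.proofGaps.filter((gap) => gap.severity === "blocking");
  const recommended = input.proofGaps.filter((gap) => gap.severity === "recommended");
  const top = [...blocking, ...recommended][0];
  if (top !== undefined) {
    // Interpretation used by this sketch; see the paragraph above.
    const state: ReviewReadinessState =
      blocking.length > 0 && input.passedRunCount > 0 ? "Not ready for review" : "Needs verification";
    return {
      state,
      reason: top.message,
      nextAction: top.suggestedCommand
        ? `Run agentflight verify -- ${top.suggestedCommand}`
        : "Run the missing verification through agentflight verify.",
      ...(top.suggestedCommand ? { suggestedCommand: top.suggestedCommand } : {}),
    };
  }
  return {
    state: "Ready for review",
    reason: "Verification evidence matches the observed risk.",
    nextAction: "Generate or share report/replay and request scoped review.",
  };
}
```

Informational gaps, such as docs-only changes with no verification, deliberately do not prevent `Ready for review` in this sketch.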
505
+ ## ProjScan Integration Approach
506
+
507
+ v0.4.0 should keep ProjScan defensive:
508
+
509
+ - Continue detecting availability through the existing adapter.
510
+ - Continue recording ProjScan status in tooling sections.
511
+ - Do not parse ProjScan human-readable text for ranking.
512
+ - Do not make ProjScan required for review intelligence.
513
+
514
+ Add a future-compatible extension point:
515
+
516
+ ```ts
517
+ export interface ExternalReviewHint {
518
+ file: string;
519
+ reason: string;
520
+ weight: number;
521
+ source: "projscan" | "agentloopkit" | "config";
522
+ }
523
+ ```
524
+
525
+ Do not wire ProjScan into this extension point until there is a stable structured output contract. A later v0.4.x release can add ProjScan hotspots or architecture-sensitive file hints.
526
+
527
+ ## Generated/Internal File Filtering Decision
528
+
529
+ Include config-driven ignore patterns in v0.4.0.
530
+
531
+ Implementation:
532
+
533
+ - Keep `filterAgentFlightRuntimePaths` as the built-in non-configurable AgentFlight runtime filter.
534
+ - Add `filterChangedFiles(files, config)` or similar helper that applies:
535
+ 1. built-in AgentFlight runtime filters
536
+ 2. optional `changedFileFilters.ignore` globs from config
537
+ - Use a small dependency-free glob matcher for simple patterns:
538
+ - exact file path
539
+ - directory prefix ending in `/**`
540
+ - basename wildcard such as `*.log` only if simple to test
541
+ - If pattern support becomes complex, use a small well-known package only after checking package cost and maintenance.
542
+
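A dependency-free matcher for the three pattern shapes above could be as small as the sketch below. It is an illustration under assumptions, not the contents of the shipped `changed-files.js`: `normalizePath`, `matchesIgnorePattern`, and `RUNTIME_PREFIXES` are hypothetical names, the config shape is reduced to the one field used, and anything beyond exact paths, `dir/**` prefixes, and `*.ext` basename wildcards is treated as an exact path.

```ts
// Built-in runtime locations listed earlier in this plan.
const RUNTIME_PREFIXES = [
  ".agentflight/sessions/",
  ".agentflight/reports/",
  ".agentflight/current/",
  ".agentflight/evidence/",
];

// Convert Windows separators and drop a leading "./".
function normalizePath(path: string): string {
  return path.replace(/\\/g, "/").replace(/^\.\//, "");
}

function matchesIgnorePattern(path: string, pattern: string): boolean {
  const normalized = normalizePath(pattern);
  if (normalized.endsWith("/**")) {
    // "dir/**" matches everything under "dir/".
    return path.startsWith(normalized.slice(0, -2));
  }
  if (normalized.startsWith("*.") && !normalized.includes("/")) {
    // "*.log" matches on the basename in any directory.
    const basename = path.slice(path.lastIndexOf("/") + 1);
    return basename.endsWith(normalized.slice(1));
  }
  return path === normalized;
}

function filterChangedFiles(
  files: string[],
  config: { changedFileFilters?: { ignore?: string[] } },
): string[] {
  // A missing section or list is treated as an empty ignore list.
  const ignore = config.changedFileFilters?.ignore ?? [];
  return files
    .map(normalizePath)
    // 1. built-in runtime filter, always active
    .filter((file) => !RUNTIME_PREFIXES.some((prefix) => file.startsWith(prefix)))
    // 2. optional config-driven ignores
    .filter((file) => !ignore.some((pattern) => matchesIgnorePattern(file, pattern)));
}
```

Because the built-in prefixes only cover the four runtime directories, `.agentflight/config.json` stays visible unless a user explicitly ignores it.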
543
+ Default new config:
544
+
545
+ - Use an empty `changedFileFilters.ignore` list.
546
+ - Do not include `.projscan-memory/**`, `dist/**`, `coverage/**`, or `.next/**` by default.
547
+
548
+ Docs:
549
+
550
+ - Explain that AgentFlight always hides its own runtime evidence from changed-file analysis.
551
+ - Explain that users can add generated/internal artifacts under `changedFileFilters.ignore`.
552
+ - Warn that ignored files do not appear in risk, report, replay, resume, or snapshot summaries.
553
+
554
+ ## Backwards Compatibility Strategy
555
+
556
+ Sessions:
557
+
558
+ - Do not require new persisted session fields.
559
+ - Continue supporting sessions without `verificationRuns`.
560
+ - Continue supporting sessions without `events`.
561
+ - Continue synthesizing timeline events for older sessions.
562
+
563
+ Configs:
564
+
565
+ - Treat missing `changedFileFilters` as `{ ignore: [] }`.
566
+ - Keep `version: 1` unless a breaking config migration is introduced. This plan does not require one.
567
+ - Do not rewrite existing config files automatically.
568
+
569
+ Outputs:
570
+
571
+ - Existing commands keep their names and basic behavior.
572
+ - New review intelligence appears as additional sections and clearer wording.
573
+ - Scripts using current commands should not break because no command arguments are removed.
574
+
575
+ ## Tests Required
576
+
577
+ ### Core Review Intelligence
578
+
579
+ Create `tests/core/review-intelligence.test.ts`.
580
+
581
+ Cover:
582
+
583
+ - ranks auth above docs and tests
584
+ - ranks billing and security as high priority
585
+ - adds missing proof reason when high-risk files lack test evidence
586
+ - marks failed verification as `Blocked by failed verification`
587
+ - marks docs-only changes as `Ready for review` when no proof requirement is configured
588
+ - suggests `npm test` for auth/backend proof gaps when test script is configured
589
+ - suggests `npm run build` for frontend/build gaps when build script is configured
590
+ - keeps output deterministic for equal scores
591
+ - handles empty changed files as `Unknown`
592
+ - handles old sessions without verification runs
593
+
594
+ ### Changed-File Filters
595
+
596
+ Create or extend `tests/core/changed-files.test.ts`.
597
+
598
+ Cover:
599
+
600
+ - built-in AgentFlight runtime paths remain filtered
601
+ - `.agentflight/config.json` remains visible
602
+ - config ignore `.projscan-memory/**` hides `.projscan-memory/memory.json`
603
+ - normal user files are not filtered
604
+ - ignored files do not feed risk analysis
605
+ - Windows-style paths normalize correctly
606
+
607
+ ### Config
608
+
609
+ Update `tests/core/config.test.ts`.
610
+
611
+ Cover:
612
+
613
+ - new configs include an empty `changedFileFilters.ignore` list
614
+ - older configs without the field still load
615
+
616
+ ### Commands
617
+
618
+ Update command tests:
619
+
620
+ - `tests/commands/workflow.test.ts`
621
+ - `tests/commands/snapshot.test.ts`
622
+ - `tests/commands/evidence-output.test.ts`
623
+
624
+ Cover:
625
+
626
+ - `status` includes `Review first`
627
+ - `status` includes readiness reason and suggested command
628
+ - `report` includes `Review First`, `Proof Gaps`, and `Review Readiness`
629
+ - `replay` includes review focus and proof gaps
630
+ - `resume` includes review focus and exact next command
631
+ - snapshot summaries are not polluted by ignored generated files
632
+
633
+ ### Renderers
634
+
635
+ Update:
636
+
637
+ - `tests/renderers/markdown-report.test.ts`
638
+ - `tests/renderers/html-replay.test.ts`
639
+ - `tests/renderers/resume-prompt.test.ts`
640
+
641
+ Cover:
642
+
643
+ - report renders ranked review focus
644
+ - report renders proof gaps honestly
645
+ - replay escapes review focus text
646
+ - replay shows readiness reason
647
+ - resume prompt includes top review files and suggested command
648
+
649
+ ### Adapters
650
+
651
+ Keep existing ProjScan and AgentLoopKit adapter tests. Add tests only if an external hints type is introduced without live adapter wiring.
652
+
653
+ ## Documentation Updates Required
654
+
655
+ Update:
656
+
657
+ - `README.md`
658
+ - Add v0.4.0 review intelligence to current capabilities.
659
+ - Update sample `status`, `report`, `replay`, and `resume` snippets with review focus and proof gaps.
660
+ - `docs/roadmap/index.md`
661
+ - Mark review intelligence as current after implementation.
662
+ - Move PR comments, JSON/CI, and deeper ProjScan enrichment to future sections.
663
+ - `docs/development/verification.md`
664
+ - Explain how proof gaps are inferred from verification commands.
665
+ - `docs/development/snapshots-and-timelines.md`
666
+ - Mention optional review summary in snapshot metadata if implemented.
667
+ - `docs/examples/basic-agentflight-session.md`
668
+ - Add a short example of `Review first` and `Missing proof`.
669
+ - `CHANGELOG.md`
670
+ - Add v0.4.0 section when release preparation begins.
671
+ - `AGENTFLIGHT_DEVLOG.md`
672
+ - Record implementation commands, ProjScan checks, AgentLoopKit checks, test results, and dogfood evidence.
673
+
674
+ Optional new doc:
675
+
676
+ - `docs/development/review-intelligence.md`
677
+ - Explain ranking, proof gap detection, readiness states, and generated-file filters.
678
+
679
+ ## Release Checklist
680
+
681
+ Before release:
682
+
683
+ - `npm run verify`
684
+ - `npm run format:check`
685
+ - `npm pack --dry-run`
686
+ - `npm audit --audit-level=moderate`
687
+ - `npx projscan@latest preflight --mode before_commit --format json`
688
+ - `npx agentloopkit@latest verify`
689
+
690
+ Dogfood:
691
+
692
+ - Start a v0.4.0 dogfood session in AgentFlight.
693
+ - Make a small docs or test change.
694
+ - Capture typecheck, lint, test, and build through `agentflight verify`.
695
+ - Confirm `status` ranks review focus sensibly.
696
+ - Confirm report/replay/resume all include the same review intelligence.
697
+ - Confirm that a generated `.projscan-memory/memory.json` file is hidden when `.projscan-memory/**` is listed under `changedFileFilters.ignore`.
698
+ - Confirm that `.agentflight/config.json` remains visible.
699
+
700
+ Package:
701
+
702
+ - Confirm `npm pack --dry-run` includes built files and docs.
703
+ - Confirm no runtime `.agentflight/sessions`, `.agentflight/reports`, `.agentflight/current`, or `.agentflight/evidence` files are included.
704
+ - Confirm `node dist/cli.js --version` reports the intended version during release prep.
705
+
706
+ ## Risks and Trade-Offs
707
+
708
+ ### Risk: Overstating Review Quality
709
+
710
+ AgentFlight should not imply it performed a human review. Use wording like `Review first`, `Proof gaps`, and `Ready for review`, not `Approved` or `Safe to merge`.
711
+
712
+ ### Risk: Brittle Proof Mapping
713
+
714
+ Command names vary by repo. Keep proof-kind mapping simple and transparent. If AgentFlight cannot infer a command, say so.
715
+
716
+ ### Risk: Hiding Important Files
717
+
718
+ Config-driven ignore patterns can hide files from risk and review analysis. Keep defaults conservative and document the effect clearly.
719
+
720
+ ### Risk: ProjScan Coupling
721
+
722
+ ProjScan enrichment is attractive but should wait for stable structured output. v0.4.0 should not parse human-formatted output.
723
+
724
+ ### Risk: Terminal Output Becomes Too Long
725
+
726
+ Limit `status` review focus to top 3 to 5 files. Put fuller detail in report/replay.
727
+
728
+ ## Phased Implementation Steps
729
+
730
+ ### Phase 1: Review Intelligence Core
731
+
732
+ Files:
733
+
734
+ - Create `src/core/review-intelligence.ts`
735
+ - Modify `src/types/index.ts`
736
+ - Test `tests/core/review-intelligence.test.ts`
737
+
738
+ Steps:
739
+
740
+ 1. Add review intelligence types.
741
+ 2. Write failing tests for ranking, proof gaps, readiness, and command suggestions.
742
+ 3. Implement proof-kind classification.
743
+ 4. Implement file scoring and deterministic ranking.
744
+ 5. Implement proof gap detection.
745
+ 6. Implement readiness decision.
746
+ 7. Run `npm test -- tests/core/review-intelligence.test.ts`.
747
+
748
+ ### Phase 2: Config-Driven Generated/Internal Filters
749
+
750
+ Files:
751
+
752
+ - Modify `src/types/index.ts`
753
+ - Modify `src/core/config.ts`
754
+ - Modify `src/core/changed-files.ts`
755
+ - Modify changed-file call sites in `src/commands/status.ts`, `report.ts`, `replay.ts`, `resume.ts`, and `snapshot.ts`
756
+ - Test `tests/core/changed-files.test.ts`
757
+ - Test `tests/core/config.test.ts`
758
+
759
+ Steps:
760
+
761
+ 1. Add optional `changedFileFilters.ignore` to config type.
762
+ 2. Seed an empty `changedFileFilters.ignore` list in new configs.
763
+ 3. Load config in commands before changed-file analysis.
764
+ 4. Apply built-in runtime filters first, then config filters.
765
+ 5. Add tests for `.projscan-memory/**`, `.agentflight/config.json`, normal files, and Windows paths.
766
+ 6. Run changed-file and config tests.
767
+
768
+ ### Phase 3: Command Integration
769
+
770
+ Files:
771
+
772
+ - Modify `src/commands/status.ts`
773
+ - Modify `src/commands/report.ts`
774
+ - Modify `src/commands/replay.ts`
775
+ - Modify `src/commands/resume.ts`
776
+ - Modify `src/commands/snapshot.ts` only if snapshot review metadata is included
777
+ - Test command workflow files
778
+
779
+ Steps:
780
+
781
+ 1. Compute review intelligence once per command after risk and verification summary.
782
+ 2. Add concise `Review first`, `Missing proof`, readiness reason, and next action to `status`.
783
+ 3. Pass review intelligence into report, replay, and resume renderers.
784
+ 4. Add optional compact review metadata to snapshot events if it remains simple.
785
+ 5. Run command tests.
786
+
787
+ ### Phase 4: Renderer Integration
788
+
789
+ Files:
790
+
791
+ - Modify `src/renderers/markdown-report.ts`
792
+ - Modify `src/renderers/html-replay.ts`
793
+ - Modify `src/renderers/resume-prompt.ts`
794
+ - Test renderer files
795
+
796
+ Steps:
797
+
798
+ 1. Add `Review First`, `Proof Gaps`, and `Review Readiness` sections to Markdown report.
799
+ 2. Add review focus and proof gap sections to HTML replay.
800
+ 3. Add top focus files and next command to resume prompt.
801
+ 4. Keep replay styling precise, calm, and trustworthy.
802
+ 5. Run renderer tests.
803
+
804
+ ### Phase 5: Docs and Dogfood
805
+
806
+ Files:
807
+
808
+ - Modify `README.md`
809
+ - Modify `docs/roadmap/index.md`
810
+ - Modify `docs/development/verification.md`
811
+ - Modify `docs/examples/basic-agentflight-session.md`
812
+ - Add `docs/development/review-intelligence.md` if needed
813
+ - Modify `CHANGELOG.md`
814
+ - Modify `AGENTFLIGHT_DEVLOG.md`
815
+
816
+ Steps:
817
+
818
+ 1. Update public workflow examples.
819
+ 2. Document review ranking and proof gap rules.
820
+ 3. Document generated/internal filters.
821
+ 4. Dogfood in AgentFlight.
822
+ 5. Record evidence in the devlog.
823
+ 6. Run full verification.
824
+
825
+ ## Option Decision
826
+
827
+ ### Option A: Review Focus Ranking + Proof Gap Detection + Readiness Model
828
+
829
+ Pros:
830
+
831
+ - Most focused review-intelligence release.
832
+ - Lowest implementation risk.
833
+ - Directly improves the main workflow.
834
+
835
+ Cons:
836
+
837
+ - Does not address generated ProjScan memory noise beyond v0.3.3 AgentFlight runtime filtering.
838
+
839
+ ### Option B: Option A + Config-Driven Generated/Internal File Filters
840
+
841
+ Pros:
842
+
843
+ - Addresses dogfood evidence from `.projscan-memory/memory.json`.
844
+ - Keeps filtering local, explicit, and user-controlled.
845
+ - Avoids hardcoding arbitrary generated paths.
846
+ - Still shippable as v0.4.0.
847
+
848
+ Cons:
849
+
850
+ - Adds config surface area.
851
+ - Needs careful docs so users understand ignored files disappear from review analysis.
852
+
853
+ ### Option C: Option A + ProjScan-Enriched Review Hints
854
+
855
+ Pros:
856
+
857
+ - Moves toward Baseframe Labs strategic architecture.
858
+ - Could make review ranking smarter.
859
+
860
+ Cons:
861
+
862
+ - Risky without stable structured ProjScan output.
863
+ - Could make AgentFlight brittle.
864
+ - Adds integration complexity beyond the core review-intelligence release.
865
+
866
+ ### Option D: Option A + B + C
867
+
868
+ Pros:
869
+
870
+ - Most ambitious.
871
+
872
+ Cons:
873
+
874
+ - Too broad for a clean v0.4.0.
875
+ - Mixes core product logic, config changes, and external enrichment in one release.
876
+ - Higher regression risk.
877
+
878
+ Final recommendation: **Option B**.
879
+
880
+ ## Implementation Readiness
881
+
882
+ This plan is ready for implementation after review. The release should start with the core review-intelligence model and tests, then thread that model through existing commands and renderers. Keep ProjScan enrichment as a future extension point unless a stable structured ProjScan output contract is available before implementation starts.