@xenonbyte/da-vinci-workflow 0.1.14 → 0.1.15

Files changed (37)
  1. package/CHANGELOG.md +9 -1
  2. package/README.md +17 -0
  3. package/README.zh-CN.md +17 -0
  4. package/SKILL.md +13 -0
  5. package/commands/claude/dv/design.md +2 -0
  6. package/commands/claude/dv/verify.md +2 -0
  7. package/commands/codex/prompts/dv-design.md +2 -0
  8. package/commands/codex/prompts/dv-verify.md +1 -0
  9. package/commands/gemini/dv/design.toml +2 -0
  10. package/commands/gemini/dv/verify.toml +1 -0
  11. package/docs/mcp-aware-gate-implementation.md +291 -0
  12. package/docs/mcp-aware-gate-tests.md +244 -0
  13. package/docs/mcp-aware-gate.md +246 -0
  14. package/docs/mode-use-cases.md +2 -0
  15. package/docs/prompt-presets/desktop-app.md +3 -0
  16. package/docs/prompt-presets/mobile-app.md +3 -0
  17. package/docs/prompt-presets/tablet-app.md +3 -0
  18. package/docs/prompt-presets/web-app.md +3 -0
  19. package/docs/workflow-examples.md +9 -4
  20. package/docs/zh-CN/mcp-aware-gate-implementation.md +290 -0
  21. package/docs/zh-CN/mcp-aware-gate-tests.md +244 -0
  22. package/docs/zh-CN/mcp-aware-gate.md +249 -0
  23. package/docs/zh-CN/mode-use-cases.md +3 -0
  24. package/docs/zh-CN/prompt-presets/desktop-app.md +3 -0
  25. package/docs/zh-CN/prompt-presets/mobile-app.md +3 -0
  26. package/docs/zh-CN/prompt-presets/tablet-app.md +3 -0
  27. package/docs/zh-CN/prompt-presets/web-app.md +3 -0
  28. package/docs/zh-CN/workflow-examples.md +9 -4
  29. package/lib/audit.js +348 -0
  30. package/lib/cli.js +47 -1
  31. package/lib/mcp-runtime-gate.js +342 -0
  32. package/package.json +3 -2
  33. package/references/artifact-templates.md +26 -1
  34. package/references/checkpoints.md +66 -1
  35. package/references/design-inputs.md +2 -1
  36. package/references/pencil-design-to-code.md +8 -0
  37. package/scripts/test-mcp-runtime-gate.js +199 -0
package/CHANGELOG.md CHANGED
@@ -2,7 +2,15 @@
2
2
 
3
3
  ## Unreleased
4
4
 
5
- - No unreleased changes yet.
5
+ ## v0.1.15 - 2026-03-27
6
+
7
+ ### Changed
8
+ - MCP-aware runtime gate now has a first implementation slice: a pure evaluator, runtime-gate recording shape, and workflow hooks that require live source convergence checks before terminal completion claims
9
+ - `da-vinci audit` now distinguishes `integrity` and `completion` modes so mid-workflow sanity checks do not masquerade as terminal completion gates
10
+ - completion guidance now blocks terminal `design complete` or `workflow complete` claims unless the registered project-local `.pen` source is shell-visible, standard artifacts exist, and the completion gate passes
11
+ - design-source rules now reject unnamed live editors such as `new` as persisted project sources and explicitly block screenshot or markdown pollution inside `.da-vinci/designs/`
12
+ - prompt presets, workflow examples, and mode guides now state that screenshot exports belong under `.da-vinci/changes/<change-id>/exports/` and cannot replace the `.pen` source of truth
13
+ - Pencil-operation guidance now treats repeated unsupported-property rollbacks on the same anchor surface as unstable progress instead of acceptable forward motion
6
14
 
7
15
  ## v0.1.14 - 2026-03-27
8
16
 
package/README.md CHANGED
@@ -414,9 +414,26 @@ Useful commands:
414
414
  ```bash
415
415
  da-vinci status
416
416
  da-vinci validate-assets
417
+ da-vinci audit --mode integrity /abs/path/to/project
418
+ da-vinci audit --mode completion --change <change-id> /abs/path/to/project
417
419
  da-vinci uninstall --platform codex,claude,gemini
418
420
  ```
419
421
 
422
+ `da-vinci audit` has two intended modes:
423
+
424
+ - `--mode integrity`: a mid-workflow filesystem-truth check for missing baseline artifacts, misplaced exports, polluted `.da-vinci/designs/`, and missing persisted `.pen` sources
425
+ - `--mode completion`: a strict pre-completion gate for one change scope; use `--change <change-id>` and treat any failure as blocking
426
+
427
+ Both modes check the most common workflow-integrity failures in a project:
428
+
429
+ - missing standard Da Vinci artifacts
430
+ - missing shell-visible project-local `.pen` sources
431
+ - pollution inside `.da-vinci/designs/`
432
+ - screenshot exports stored in the wrong place
433
+ - empty or partial change scaffolds
434
+
435
+ When Pencil MCP is active, Da Vinci now also expects an MCP runtime gate record in `pencil-design.md` before terminal completion claims. That runtime gate checks live editor/source convergence separately from filesystem audit.
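For scripted use, the two audit invocations in this hunk can be assembled programmatically. The following is a minimal sketch, not code from the package: `buildAuditArgs` is a hypothetical helper, and only the `audit` subcommand, `--mode`, `--change`, and the trailing project path come from the README text. Treating `--change` as mandatory in completion mode is an assumption drawn from the README wording.

```javascript
// Hypothetical helper: builds the argument list for `da-vinci audit`.
// Only `audit`, `--mode`, `--change`, and the project path come from the README.
function buildAuditArgs({ mode, changeId, projectPath }) {
  if (mode !== 'integrity' && mode !== 'completion') {
    throw new Error(`unknown audit mode: ${mode}`);
  }
  const args = ['audit', '--mode', mode];
  if (mode === 'completion') {
    // Assumption: completion audits are always scoped to one change.
    if (!changeId) throw new Error('completion mode expects a change id');
    args.push('--change', changeId);
  }
  args.push(projectPath);
  return args;
}
```

The resulting array could be passed to Node's `child_process.execFileSync('da-vinci', args)`; how the CLI signals failure is not shown in this diff section, so the sketch does not interpret the result.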
436
+
420
437
  Installation targets:
421
438
 
422
439
  - Codex prompts: `~/.codex/prompts/`
package/README.zh-CN.md CHANGED
@@ -343,9 +343,26 @@ da-vinci install --platform codex,claude,gemini
343
343
  ```bash
344
344
  da-vinci status
345
345
  da-vinci validate-assets
346
+ da-vinci audit --mode integrity /abs/path/to/project
347
+ da-vinci audit --mode completion --change <change-id> /abs/path/to/project
346
348
  da-vinci uninstall --platform codex,claude,gemini
347
349
  ```
348
350
 
351
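+ `da-vinci audit` now has two main modes:
352
+
353
+ - `--mode integrity`: for checking filesystem truth while work is in progress, such as missing baseline artifacts, wrong export paths, a polluted `.da-vinci/designs/`, or a project-local `.pen` that has not been persisted to disk
354
+ - `--mode completion`: for a strict check before claiming completion; use it with `--change <change-id>` and treat any failure as blocking
355
+
356
+ Both modes check the most common workflow-integrity problems in a project:
357
+
358
+ - missing standard Da Vinci artifacts
359
+ - missing shell-visible project-local `.pen` design sources
360
+ - pollution inside the `.da-vinci/designs/` directory
361
+ - screenshot exports written to the wrong location
362
+ - change scaffolds that are only empty directories or only half written
363
+
364
+ When Pencil MCP is available, Da Vinci now also requires the MCP runtime gate result to be recorded in `pencil-design.md` before any terminal completion claim. This gate checks live editor/source convergence and has a different responsibility from the filesystem audit.
365
+
349
366
  Installation targets:
350
367
 
351
368
  - Codex prompts: `~/.codex/prompts/`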
package/SKILL.md CHANGED
@@ -216,16 +216,24 @@ Default completion rule:
216
216
  - if the request is `design-only`, stop after design artifacts and bindings
217
217
  - otherwise assume `full-delivery` and continue through implementation and verification
218
218
 
219
+ Do not report `design complete`, `workflow complete`, or any equivalent terminal state unless the completion gate in `references/checkpoints.md` is satisfied.
220
+ When shell access is available, prefer `da-vinci audit --mode integrity <project-path>` during active workflow work and `da-vinci audit --mode completion --change <change-id> <project-path>` before any terminal completion claim.
221
+
219
222
  ## Pencil Generation Rules
220
223
 
221
224
  During active Pencil work:
222
225
 
226
+ - do not begin anchor-surface generation until the required discovery and design-source artifacts exist in their standard locations for the active mode
223
227
  - keep `.da-vinci/designs/` reserved for project-local `.pen` files; do not write workflow markdown such as inventories, proposals, or checkpoints into that directory
224
228
  - on `redesign-from-code`, write a short structural-delta note for each anchor surface explaining how the new composition differs from the current XML or layout grouping
225
229
  - after the first successful Pencil write, verify that the registered project-local `.pen` path exists as a shell-visible file before treating the design source as persistent
230
+ - after the first successful Pencil write, run the MCP runtime gate when Pencil MCP is available and record the result in `pencil-design.md`
231
+ - do not treat an unnamed live editor such as `new` as a persisted project design source; reconcile it to the registered project-local `.pen` path before the design pass is considered traceable
226
232
  - use only Pencil-supported properties; do not emit web- or CSS-only layout properties such as `flex` or `margin`
233
+ - if unsupported Pencil properties cause repeated rolled-back batches on the same anchor surface, treat that pass as unstable and fix the schema usage before expanding further
227
234
  - on complex redesigns, turn approved anchor surfaces into a small shared primitive family before broad page expansion
228
235
  - apply the resolved form-factor-specific layout hygiene profile before passing screenshot review on any anchor surface or other approval candidate
236
+ - exported screenshots are review artifacts only; place them under `.da-vinci/changes/<change-id>/exports/` and never treat them as a substitute for the project-local `.pen` source
229
237
  - screenshot review is binding: if the review calls out hierarchy, spacing, clarity, inconsistency, or unresolved-placeholder issues, revise the screen before treating the checkpoint as `PASS`
230
238
 
231
239
  ## Load References On Demand
@@ -573,6 +581,11 @@ When Pencil is available through MCP:
573
581
  - Before mapping or implementation closes, verify both:
574
582
  - the `.pen` path is readable through MCP
575
583
  - the same path exists as a shell-visible file inside the project
584
+ - Before broad expansion or terminal completion, run the MCP runtime gate:
585
+ - evaluate source convergence from the active editor, registered `.pen` path, and shell-visible `.pen` file
586
+ - evaluate screen presence for claimed anchor and review target ids
587
+ - evaluate review execution for approved surfaces
588
+ - append the runtime gate result to `pencil-design.md`
576
589
 
577
590
  When Pencil is not available:
578
591
 
package/commands/claude/dv/design.md CHANGED
@@ -18,3 +18,5 @@ Create or update:
18
18
  - `pencil-design.md`
19
19
 
20
20
  Run the `design checkpoint` before locking implementation tasks.
21
+ If Pencil MCP is active, run the MCP runtime gate after the first successful Pencil write and record it in `pencil-design.md`.
22
+ Before reporting `design complete` or `workflow complete`, run `da-vinci audit --mode completion --change <change-id> <project-path>` and treat any failure as blocking.
package/commands/claude/dv/verify.md CHANGED
@@ -14,3 +14,5 @@ Check:
14
14
 
15
15
  Create or update:
16
16
  - `verification.md`
17
+
18
+ If Pencil MCP is active and terminal completion is being considered, re-check the MCP runtime gate evidence before treating verification as complete.
package/commands/codex/prompts/dv-design.md CHANGED
@@ -12,3 +12,5 @@ Output should move the work toward:
12
12
  - `pencil-design.md`
13
13
 
14
14
  Use Pencil-backed structure as the design source when available.
15
+ If Pencil MCP is active, run the MCP runtime gate after the first successful Pencil write and record it in `pencil-design.md`.
16
+ Before claiming `design complete` or `workflow complete`, run `da-vinci audit --mode completion --change <change-id> <project-path>` and treat any failure as blocking.
package/commands/codex/prompts/dv-verify.md CHANGED
@@ -13,3 +13,4 @@ Check:
13
13
  - drift between artifacts and code
14
14
 
15
15
  Update `verification.md` when needed.
16
+ If Pencil MCP is active and terminal completion is being considered, re-check the MCP runtime gate evidence before treating verification as complete.
package/commands/gemini/dv/design.toml CHANGED
@@ -11,4 +11,6 @@ Create or update:
11
11
  - `pencil-design.md`
12
12
 
13
13
  Use Pencil-backed page coverage as the source of presentation truth.
14
+ If Pencil MCP is active, run the MCP runtime gate after the first successful Pencil write and record it in `pencil-design.md`.
15
+ Before reporting `design complete` or `workflow complete`, run `da-vinci audit --mode completion --change <change-id> <project-path>` and treat any failure as blocking.
14
16
  """
package/commands/gemini/dv/verify.toml CHANGED
@@ -12,4 +12,5 @@ Check:
12
12
  - drift between artifacts and code
13
13
 
14
14
  Update `verification.md` when needed.
15
+ If Pencil MCP is active and terminal completion is being considered, re-check the MCP runtime gate evidence before treating verification as complete.
15
16
  """
package/docs/mcp-aware-gate-implementation.md ADDED
@@ -0,0 +1,291 @@
1
+ # MCP-Aware Gate Implementation Design
2
+
3
+ This document turns the MCP-aware gate proposal into an implementation design.
4
+
5
+ It still does not commit to writing code.
6
+
7
+ ## Scope
8
+
9
+ This design covers only the first implementation slice:
10
+
11
+ - runtime source convergence
12
+ - runtime screen presence
13
+ - runtime review execution
14
+ - completion blocking when runtime truth and filesystem truth diverge
15
+
16
+ It does not cover:
17
+
18
+ - automatic `.pen` reconstruction
19
+ - CLI access to live MCP state
20
+ - session persistence or transport work
21
+
22
+ ## Design Goal
23
+
24
+ Add a narrow runtime checkpoint that can stop false completion claims caused by live-editor drift.
25
+
26
+ The gate should catch cases like:
27
+
28
+ - active editor is still `new`
29
+ - anchor screens exist only in the live session
30
+ - node ids used for screenshots do not exist in the current editor
31
+ - the workflow claims completion before runtime state and filesystem state converge
32
+
33
+ ## Existing Constraints
34
+
35
+ The current architecture already provides:
36
+
37
+ - filesystem `audit`
38
+ - checkpoint rules in `references/checkpoints.md`
39
+ - artifact expectations in `design-registry.md` and `pencil-design.md`
40
+ - MCP access to active editor state and screen nodes
41
+
42
+ The current architecture does not provide:
43
+
44
+ - a CLI bridge to MCP runtime state
45
+ - a stable session id outside the active agent context
46
+
47
+ That means the MCP-aware gate must be executed inside the agent workflow while MCP tools are live.
48
+
49
+ ## Implementation Placement
50
+
51
+ ### Primary insertion points
52
+
53
+ 1. After the first successful Pencil write in a design pass.
54
+ 2. Before any terminal `design complete` or `workflow complete` claim.
55
+
56
+ ### Secondary insertion point
57
+
58
+ 3. Before broad expansion beyond approved anchor surfaces when the design pass depends on screenshot-reviewed anchors.
59
+
60
+ ### Why these points
61
+
62
+ - after first write: catches `new`-editor drift early
63
+ - before completion: catches false success claims
64
+ - before broad expansion: prevents weak runtime state from spreading into more screens
65
+
66
+ ## Owning Workflow Stage
67
+
68
+ The runtime gate should be owned by the design phase, not the CLI.
69
+
70
+ That means:
71
+
72
+ - design routes should execute it while Pencil MCP is available
73
+ - verify routes may re-check it if design completion is being claimed
74
+ - build routes should not become the primary owner of runtime gate logic
75
+
76
+ ## Input Sources
77
+
78
+ ### MCP inputs
79
+
80
+ Required:
81
+
82
+ - active editor state
83
+ - top-level nodes
84
+ - targeted node reads for claimed anchor surfaces
85
+
86
+ Expected MCP operations:
87
+
88
+ - `pencil.get_editor_state`
89
+ - `pencil.batch_get`
90
+
91
+ ### Filesystem inputs
92
+
93
+ Required:
94
+
95
+ - shell-visible `.pen` existence
96
+ - registered `.pen` path from `design-registry.md`
97
+ - declared reviewed screens and screenshot targets from `pencil-design.md`
98
+
99
+ Expected shell or file reads:
100
+
101
+ - read `design-registry.md`
102
+ - read `pencil-design.md`
103
+ - check registered `.pen` path on disk
104
+
105
+ ## Runtime Snapshot Model
106
+
107
+ The runtime gate should build one structured snapshot in memory:
108
+
109
+ ```md
110
+ runtime snapshot
111
+ - activeEditor
112
+ - topLevelScreenIds
113
+ - topLevelScreenNames
114
+ - registeredPenPath
115
+ - shellVisiblePenExists
116
+ - claimedAnchorIds
117
+ - claimedReviewedScreenIds
118
+ - reviewTargets
119
+ ```
120
+
121
+ The evaluator should only depend on this snapshot.
122
+
123
+ That keeps the implementation testable without needing a real live Pencil session for every case.
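The snapshot model above can be sketched as a plain object factory. This is a minimal illustration, assuming the field names listed in the design doc; `createRuntimeSnapshot`, the default values, the freezing, and the example file name `app.pen` are hypothetical and are not taken from `lib/mcp-runtime-gate.js`.

```javascript
// Hypothetical factory for the in-memory runtime snapshot.
// Field names follow the design doc; defaults and freezing are assumptions.
function createRuntimeSnapshot(facts = {}) {
  return Object.freeze({
    activeEditor: facts.activeEditor ?? null,
    topLevelScreenIds: [...(facts.topLevelScreenIds ?? [])],
    topLevelScreenNames: [...(facts.topLevelScreenNames ?? [])],
    registeredPenPath: facts.registeredPenPath ?? null,
    shellVisiblePenExists: facts.shellVisiblePenExists === true,
    claimedAnchorIds: [...(facts.claimedAnchorIds ?? [])],
    claimedReviewedScreenIds: [...(facts.claimedReviewedScreenIds ?? [])],
    reviewTargets: [...(facts.reviewTargets ?? [])],
  });
}

// Synthetic snapshot for a unit test; no live Pencil session is involved.
const healthySnapshot = createRuntimeSnapshot({
  activeEditor: '.da-vinci/designs/app.pen', // illustrative path
  topLevelScreenIds: ['screen-home'],
  topLevelScreenNames: ['Home'],
  registeredPenPath: '.da-vinci/designs/app.pen',
  shellVisiblePenExists: true,
  claimedAnchorIds: ['screen-home'],
  claimedReviewedScreenIds: ['screen-home'],
  reviewTargets: ['screen-home'],
});
```

Copying the input arrays keeps later mutations by the caller from changing a snapshot that an evaluator has already seen, which supports the determinism the design asks for.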
124
+
125
+ ## Evaluation Stages
126
+
127
+ ### Stage 1: Source Convergence
128
+
129
+ Checks:
130
+
131
+ - active editor is not `new`
132
+ - registered `.pen` path exists in `design-registry.md`
133
+ - registered `.pen` path exists on disk
134
+ - active editor and registered source do not obviously diverge
135
+
136
+ Result rules:
137
+
138
+ - `PASS`: runtime source and registered source converge
139
+ - `WARN`: no new live edits happened yet, or a documented deferred baseline is still being used
140
+ - `BLOCK`: runtime source is unnamed, missing, or diverged
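The Stage 1 rules above could be expressed as a pure function over the snapshot. This is a sketch under stated assumptions: `activeEditor` is assumed to hold either the open document's path or the literal `new`, path comparison is a simple normalized string match, and `noNewLiveEdits` is a hypothetical flag that is not part of the documented snapshot. Checking blockers before the `WARN` case is also an assumption.

```javascript
// Assumption: paths are compared after normalizing separators and a leading "./".
function normalizePenPath(p) {
  return String(p).replace(/\\/g, '/').replace(/^\.\//, '');
}

function evaluateSourceConvergence(snapshot) {
  const editor = (snapshot.activeEditor || '').trim();
  const registered = (snapshot.registeredPenPath || '').trim();
  const reasons = [];

  if (editor === '' || editor === 'new') {
    reasons.push('active editor is unnamed');
  }
  if (registered === '') {
    reasons.push('no .pen path registered in design-registry.md');
  } else if (!snapshot.shellVisiblePenExists) {
    reasons.push('registered .pen path is missing on disk');
  }
  if (editor !== '' && editor !== 'new' && registered !== '' &&
      normalizePenPath(editor) !== normalizePenPath(registered)) {
    reasons.push('active editor and registered source diverge');
  }

  if (reasons.length > 0) return { status: 'BLOCK', reasons };
  // Hypothetical flag: the caller sets it when no Pencil write has happened yet.
  if (snapshot.noNewLiveEdits === true) {
    return { status: 'WARN', reasons: ['no new live edits yet'] };
  }
  return { status: 'PASS', reasons: [] };
}
```

Returning the reasons alongside the status lets a test assert which blocking condition fired, not only that the stage blocked.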
141
+
142
+ ### Stage 2: Screen Presence
143
+
144
+ Checks:
145
+
146
+ - claimed anchor ids exist in live MCP state
147
+ - claimed reviewed screens exist in live MCP state
148
+ - screenshot targets resolve in the active document
149
+
150
+ Result rules:
151
+
152
+ - `PASS`: claimed design output is traceable to live editor nodes
153
+ - `WARN`: screen naming drift exists but ids are still traceable
154
+ - `BLOCK`: claimed screens or targets do not resolve
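A Stage 2 check along these lines might look as follows. It is only a sketch: it assumes every claimed id and review target is a top-level screen id, that `reviewTargets` is an array of node id strings, and that `topLevelScreenNames` is index-aligned with `topLevelScreenIds`. The `claimedScreenNames` map used to detect naming drift is hypothetical and not part of the documented snapshot.

```javascript
function evaluateScreenPresence(snapshot) {
  const ids = snapshot.topLevelScreenIds || [];
  const liveIds = new Set(ids);
  const claimed = [
    ...(snapshot.claimedAnchorIds || []),
    ...(snapshot.claimedReviewedScreenIds || []),
    ...(snapshot.reviewTargets || []),
  ];
  // Any claimed id that does not resolve in the live editor blocks the stage.
  const missing = [...new Set(claimed)].filter((id) => !liveIds.has(id));
  if (missing.length > 0) return { status: 'BLOCK', missing, renamed: [] };

  // Hypothetical: claimedScreenNames maps screen id -> name recorded in pencil-design.md.
  const liveNames = snapshot.topLevelScreenNames || [];
  const expected = snapshot.claimedScreenNames || {};
  const renamed = ids.filter(
    (id, i) => expected[id] !== undefined && expected[id] !== liveNames[i]
  );
  return { status: renamed.length > 0 ? 'WARN' : 'PASS', missing: [], renamed };
}
```

With this shape, a snapshot whose `claimedAnchorIds` contains an id absent from `topLevelScreenIds` yields `BLOCK` and lists that id under `missing`.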
155
+
156
+ ### Stage 3: Review Execution
157
+
158
+ Checks:
159
+
160
+ - each approved anchor has a reviewed screen id or screenshot target
161
+ - runtime review records align with the current live editor
162
+ - review blockers were not ignored
163
+
164
+ Result rules:
165
+
166
+ - `PASS`: runtime review is credible
167
+ - `WARN`: review exists but requires follow-up before expansion
168
+ - `BLOCK`: approval claim is unsupported by runtime evidence
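The Stage 3 checks depend on review records whose shape this document does not define. The sketch below therefore invents one for illustration: each record is assumed to look like `{ anchorId, reviewedScreenId, screenshotTarget, openBlockers, needsFollowUp }`. None of those field names come from the source, and `evaluateReviewExecution` is a hypothetical name.

```javascript
// Hypothetical review record:
// { anchorId, reviewedScreenId, screenshotTarget, openBlockers, needsFollowUp }
function evaluateReviewExecution(approvedAnchorIds, reviews, liveIds) {
  const live = new Set(liveIds);
  const byAnchor = new Map(reviews.map((r) => [r.anchorId, r]));
  const reasons = [];
  let followUp = false;

  for (const anchorId of approvedAnchorIds) {
    const review = byAnchor.get(anchorId);
    const target = review && (review.reviewedScreenId || review.screenshotTarget);
    if (!target) {
      reasons.push(`${anchorId}: approved without a reviewed screen or screenshot target`);
    } else if (!live.has(target)) {
      reasons.push(`${anchorId}: review target ${target} is not in the live editor`);
    } else if ((review.openBlockers || []).length > 0) {
      reasons.push(`${anchorId}: approved with unresolved review blockers`);
    } else if (review.needsFollowUp) {
      followUp = true;
    }
  }

  if (reasons.length > 0) return { status: 'BLOCK', reasons };
  return { status: followUp ? 'WARN' : 'PASS', reasons: [] };
}
```

The three `BLOCK` branches correspond to the three checks listed above; the `WARN` branch covers a review that exists but still needs follow-up before expansion.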
169
+
170
+ ## Recording Strategy
171
+
172
+ Do not introduce a new artifact family.
173
+
174
+ Append a structured section to `pencil-design.md`:
175
+
176
+ ```md
177
+ ## MCP Runtime Gate
178
+ - Time:
179
+ - Active editor:
180
+ - Registered `.pen` path:
181
+ - Shell-visible `.pen` path:
182
+ - Claimed anchor ids:
183
+ - Reviewed screen ids:
184
+ - Source convergence: PASS | WARN | BLOCK
185
+ - Screen presence: PASS | WARN | BLOCK
186
+ - Review execution: PASS | WARN | BLOCK
187
+ - Final runtime gate status: PASS | WARN | BLOCK
188
+ - Notes:
189
+ ```
190
+
191
+ ### Why `pencil-design.md`
192
+
193
+ - it already records source path, screens, screenshots, and design notes
194
+ - it is the closest existing artifact to runtime design truth
195
+ - it avoids scattering checkpoint state across ad hoc files
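The recording step can be kept testable by separating formatting from file I/O. The following sketch assumes a record object whose property names (`time`, `activeEditor`, `sourceConvergence`, and so on) are illustrative; only the section heading and the bullet labels are taken from the template above.

```javascript
// Hypothetical formatter; heading and labels mirror the template in the design doc.
function formatRuntimeGateSection(record) {
  const list = (ids) => (ids && ids.length > 0 ? ids.join(', ') : 'none');
  return [
    '## MCP Runtime Gate',
    `- Time: ${record.time}`,
    `- Active editor: ${record.activeEditor}`,
    `- Registered \`.pen\` path: ${record.registeredPenPath}`,
    `- Shell-visible \`.pen\` path: ${record.shellVisiblePenPath}`,
    `- Claimed anchor ids: ${list(record.claimedAnchorIds)}`,
    `- Reviewed screen ids: ${list(record.reviewedScreenIds)}`,
    `- Source convergence: ${record.sourceConvergence}`,
    `- Screen presence: ${record.screenPresence}`,
    `- Review execution: ${record.reviewExecution}`,
    `- Final runtime gate status: ${record.finalStatus}`,
    `- Notes: ${record.notes || ''}`,
  ].join('\n');
}

// Append-only: existing content is kept (apart from trailing whitespace)
// and the new section is added at the end.
function appendRuntimeGateSection(existingMarkdown, record) {
  const base = existingMarkdown.replace(/\s+$/, '');
  return `${base}\n\n${formatRuntimeGateSection(record)}\n`;
}
```

A caller would read `pencil-design.md`, pass its contents through `appendRuntimeGateSection`, and write the result back, for example with `fs.writeFileSync`.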
196
+
197
+ ## Failure Handling
198
+
199
+ When runtime gate returns `BLOCK`:
200
+
201
+ - do not continue to broad multi-screen expansion
202
+ - do not claim design completion
203
+ - do not claim workflow completion
204
+ - record the mismatch explicitly in `pencil-design.md`
205
+
206
+ When runtime gate returns `WARN`:
207
+
208
+ - allow continuation only when the warning does not create source ambiguity
209
+ - do not allow terminal completion unless the warning is explicitly resolved or accepted by the workflow rules
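The failure-handling rules above can be read as a small policy table. The sketch below is one possible encoding, not the package's implementation. Deriving the final status as the most severe of the three layer statuses is an assumption, and `createsSourceAmbiguity` and `resolvedOrAccepted` are hypothetical flags that stand in for the workflow rules mentioned in the text.

```javascript
const GATE_ORDER = ['PASS', 'WARN', 'BLOCK'];

// Assumption: the final status is the most severe layer status.
function finalRuntimeGateStatus(layerStatuses) {
  if (layerStatuses.length === 0) throw new Error('no layers were evaluated');
  let worst = 0;
  for (const status of layerStatuses) {
    const rank = GATE_ORDER.indexOf(status);
    if (rank === -1) throw new Error(`unknown layer status: ${status}`);
    worst = Math.max(worst, rank);
  }
  return GATE_ORDER[worst];
}

// Hypothetical policy derived from the Failure Handling rules.
function allowedActions(finalStatus, warn = {}) {
  if (finalStatus === 'BLOCK') {
    return { expand: false, runtimeAllowsCompletion: false };
  }
  if (finalStatus === 'WARN') {
    return {
      expand: warn.createsSourceAmbiguity !== true,
      runtimeAllowsCompletion: warn.resolvedOrAccepted === true,
    };
  }
  return { expand: true, runtimeAllowsCompletion: true };
}
```

`runtimeAllowsCompletion` only says that the runtime layer does not object; the filesystem completion audit still has to pass before a terminal claim. Throwing on an empty layer list avoids reporting `PASS` when nothing was evaluated.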
210
+
211
+ ## Interaction With Filesystem Audit
212
+
213
+ The runtime gate should run first.
214
+
215
+ Then:
216
+
217
+ - if runtime gate is `BLOCK`, stop immediately
218
+ - if runtime gate is `PASS` or acceptable `WARN`, run filesystem completion audit before terminal completion
219
+
220
+ That yields this order:
221
+
222
+ 1. runtime gate
223
+ 2. filesystem completion audit
224
+ 3. completion claim
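The ordering above amounts to a short-circuit: the filesystem audit never runs when the runtime gate has already blocked. A minimal sketch, assuming both checks are injected as callbacks. `decideCompletion`, `warnAccepted`, and the return shape are hypothetical, and a real caller would probably be asynchronous because MCP reads are.

```javascript
// Hypothetical orchestration; runRuntimeGate and runCompletionAudit are
// injected callbacks, not functions exported by the package.
function decideCompletion({ runRuntimeGate, runCompletionAudit }) {
  const runtime = runRuntimeGate();
  const runtimeOk =
    runtime.status === 'PASS' ||
    (runtime.status === 'WARN' && runtime.warnAccepted === true);
  if (!runtimeOk) {
    // Covers BLOCK, an unaccepted WARN, and any skipped or unknown status.
    return { allowed: false, stoppedAt: 'runtime gate', auditRan: false };
  }
  const audit = runCompletionAudit();
  if (!audit.passed) {
    return { allowed: false, stoppedAt: 'filesystem completion audit', auditRan: true };
  }
  return { allowed: true, stoppedAt: null, auditRan: true };
}
```

Because any status other than `PASS` or an accepted `WARN` stops the flow, a skipped runtime gate is never reported as a pass by this function. The document's fallback to filesystem audit when MCP is unavailable would need a separate path.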
225
+
226
+ ## Minimal Pseudoflow
227
+
228
+ ```md
229
+ 1. perform first successful Pencil write
230
+ 2. read active editor via MCP
231
+ 3. read claimed anchor ids from `pencil-design.md`
232
+ 4. read registered `.pen` path from `design-registry.md`
233
+ 5. check shell-visible `.pen`
234
+ 6. read live nodes for claimed anchors
235
+ 7. evaluate source convergence
236
+ 8. evaluate screen presence
237
+ 9. evaluate review execution when relevant
238
+ 10. append runtime gate results to `pencil-design.md`
239
+ 11. if terminal completion is being claimed, run filesystem completion audit
240
+ 12. only report completion if both layers pass
241
+ ```
242
+
243
+ ## Boundary Decisions
244
+
245
+ ### When Pencil MCP is unavailable
246
+
247
+ Do not try to emulate runtime gate.
248
+
249
+ Instead:
250
+
251
+ - record that MCP runtime gate could not run
252
+ - fall back to filesystem audit plus documented constraints
253
+ - do not describe the runtime gate as passed
254
+
255
+ ### When no anchor ids are recorded yet
256
+
257
+ The runtime gate may run a reduced source-convergence-only check after the first Pencil write.
258
+
259
+ It should not pretend screen-presence or review-execution checks were completed.
260
+
261
+ ### When no new Pencil edits happened
262
+
263
+ Use `WARN` or skip runtime gate rather than fabricating a pass.
264
+
265
+ ## Non-Functional Requirements
266
+
267
+ The first implementation should be:
268
+
269
+ - deterministic
270
+ - append-only in artifact recording
271
+ - easy to unit-test from a runtime snapshot object
272
+ - independent from CLI transport changes
273
+
274
+ ## Implementation Steps
275
+
276
+ Recommended order:
277
+
278
+ 1. define a runtime snapshot shape
279
+ 2. define a pure evaluator over that snapshot
280
+ 3. add a writer that appends runtime gate results to `pencil-design.md`
281
+ 4. call the gate from design-phase runtime checkpoints
282
+ 5. wire terminal completion to require both runtime gate and filesystem completion audit
283
+
284
+ ## Deferred Work
285
+
286
+ Do not include these in the first implementation:
287
+
288
+ - auto-repair of editor/source mismatch
289
+ - multi-session state reconciliation
290
+ - CLI-facing live runtime commands
291
+ - generalized checkpoint orchestration engine
package/docs/mcp-aware-gate-tests.md ADDED
@@ -0,0 +1,244 @@
1
+ # MCP-Aware Gate Test Design
2
+
3
+ This document defines how to test the future MCP-aware gate implementation.
4
+
5
+ It is a design for validation coverage, not a test implementation.
6
+
7
+ ## Test Goal
8
+
9
+ Prove that the MCP-aware gate:
10
+
11
+ - blocks false runtime completion states
12
+ - does not replace filesystem audit
13
+ - does not misclassify healthy runtime design sessions
14
+ - behaves predictably when MCP runtime facts are incomplete
15
+
16
+ ## Test Layers
17
+
18
+ The test plan should cover three layers.
19
+
20
+ ### 1. Evaluator unit tests
21
+
22
+ Test the pure runtime-gate evaluator against synthetic snapshot inputs.
23
+
24
+ Why:
25
+
26
+ - fastest feedback
27
+ - deterministic
28
+ - does not depend on a live Pencil session
29
+
30
+ ### 2. Runtime integration tests
31
+
32
+ Test the runtime-gate caller against real or fixture-backed MCP responses.
33
+
34
+ Why:
35
+
36
+ - proves that active editor and node reads are interpreted correctly
37
+
38
+ ### 3. End-to-end workflow tests
39
+
40
+ Test that runtime gate and filesystem audit cooperate correctly in realistic workflow states.
41
+
42
+ Why:
43
+
44
+ - prevents regressions where the runtime gate passes but the completion claim is still false
45
+
46
+ ## Core Test Categories
47
+
48
+ ### A. Healthy runtime source convergence
49
+
50
+ Expected:
51
+
52
+ - active editor is a named project-local `.pen`
53
+ - registered `.pen` path matches
54
+ - shell-visible `.pen` exists
55
+ - result is `PASS`
56
+
57
+ ### B. Live editor still `new`
58
+
59
+ Expected:
60
+
61
+ - source convergence is `BLOCK`
62
+ - completion is blocked even if screens exist in the live editor
63
+
64
+ ### C. Registered `.pen` path missing on disk
65
+
66
+ Expected:
67
+
68
+ - source convergence or completion status becomes `BLOCK`
69
+ - runtime gate does not treat the live editor alone as sufficient
70
+
71
+ ### D. Live screens exist but are not persisted
72
+
73
+ Example:
74
+
75
+ - active editor contains `Splash`, `Home`, `SafeBox`
76
+ - shell-visible `.pen` does not exist
77
+
78
+ Expected:
79
+
80
+ - runtime gate can prove screens exist
81
+ - source convergence is still `BLOCK`
82
+
83
+ ### E. Claimed anchor id missing
84
+
85
+ Example:
86
+
87
+ - `pencil-design.md` claims `mCZ1G`
88
+ - current live editor does not contain `mCZ1G`
89
+
90
+ Expected:
91
+
92
+ - screen presence is `BLOCK`
93
+
94
+ ### F. Screenshot target missing
95
+
96
+ Expected:
97
+
98
+ - review execution is `BLOCK`
99
+
100
+ ### G. Review blockers ignored
101
+
102
+ Example:
103
+
104
+ - screenshot review records blocker-level layout-hygiene issues
105
+ - workflow still marks anchor as approved
106
+
107
+ Expected:
108
+
109
+ - review execution is `BLOCK`
110
+
111
+ ### H. Source mismatch with explicit reconciliation
112
+
113
+ Example:
114
+
115
+ - active editor path and registered path differ
116
+ - `pencil-design.md` records a documented reconciliation
117
+
118
+ Expected:
119
+
120
+ - evaluator returns `WARN` or `PASS` only if the reconciliation rules are satisfied; otherwise the mismatch stays `BLOCK`
121
+
122
+ ### I. No new Pencil edits yet
123
+
124
+ Expected:
125
+
126
+ - runtime gate may return `WARN` or partial result
127
+ - must not fabricate full `PASS`
128
+
129
+ ### J. Pencil MCP unavailable
130
+
131
+ Expected:
132
+
133
+ - runtime gate reports unavailable or skipped state
134
+ - workflow does not treat runtime gate as passed
135
+ - filesystem audit remains usable
136
+
137
+ ## Required End-to-End Scenarios
138
+
139
+ ### Scenario 1: Real failure pattern from Cipher
140
+
141
+ State:
142
+
143
+ - active editor is `new`
144
+ - live screens exist
145
+ - `.da-vinci/designs/` is polluted
146
+ - no shell-visible `.pen`
147
+
148
+ Expected:
149
+
150
+ - runtime gate blocks completion
151
+ - filesystem completion audit also fails
152
+
153
+ ### Scenario 2: Healthy redesign completion
154
+
155
+ State:
156
+
157
+ - active editor is the registered `.pen`
158
+ - anchor ids exist
159
+ - screenshot review exists
160
+ - shell-visible `.pen` exists
161
+ - completion audit passes
162
+
163
+ Expected:
164
+
165
+ - runtime gate passes
166
+ - completion audit passes
167
+ - terminal completion is allowed
168
+
169
+ ### Scenario 3: Runtime pass, filesystem fail
170
+
171
+ State:
172
+
173
+ - active editor is healthy
174
+ - shell-visible `.pen` exists
175
+ - but `.da-vinci/designs/` contains extra PNG or markdown files
176
+
177
+ Expected:
178
+
179
+ - runtime gate may pass
180
+ - filesystem completion audit fails
181
+ - terminal completion is blocked
182
+
183
+ ### Scenario 4: Filesystem pass, runtime fail
184
+
185
+ State:
186
+
187
+ - shell-visible `.pen` exists
188
+ - files look healthy
189
+ - active editor is still `new` or points elsewhere
190
+
191
+ Expected:
192
+
193
+ - runtime gate fails
194
+ - completion is blocked
195
+
196
+ ## Assertions That Must Exist
197
+
198
+ For every runtime-gate result, tests should assert:
199
+
200
+ - final status
201
+ - per-layer status
202
+ - which blocking condition fired
203
+ - whether completion would be allowed
204
+ - whether artifact recording shape stays stable
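Most of the assertions listed above could be collected in one reusable checker so that every test case asserts the same things. The sketch assumes a result shape of `{ finalStatus, layers: { sourceConvergence, screenPresence, reviewExecution }, blockingReasons, completionAllowed }`; that shape and the name `checkGateResult` are hypothetical. Recording-shape stability is not covered here and would need a separate check against the `pencil-design.md` section.

```javascript
const VALID_STATUSES = ['PASS', 'WARN', 'BLOCK'];
const GATE_LAYERS = ['sourceConvergence', 'screenPresence', 'reviewExecution'];

// Returns a list of problems; an empty list means the result matched.
function checkGateResult(result, expected) {
  const problems = [];
  if (!VALID_STATUSES.includes(result.finalStatus)) {
    problems.push('finalStatus is not PASS | WARN | BLOCK');
  }
  if (result.finalStatus !== expected.finalStatus) {
    problems.push(`finalStatus: got ${result.finalStatus}, expected ${expected.finalStatus}`);
  }
  for (const layer of GATE_LAYERS) {
    if (result.layers[layer] !== expected.layers[layer]) {
      problems.push(`${layer}: got ${result.layers[layer]}, expected ${expected.layers[layer]}`);
    }
  }
  if (result.finalStatus === 'BLOCK' && (result.blockingReasons || []).length === 0) {
    problems.push('BLOCK without a recorded blocking condition');
  }
  if (result.completionAllowed !== expected.completionAllowed) {
    problems.push('completionAllowed does not match the expectation');
  }
  return problems;
}
```

Returning a list instead of throwing on the first mismatch lets a failing test report every divergent layer at once.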
205
+
206
+ ## Artifact Recording Tests
207
+
208
+ Verify that runtime gate output:
209
+
210
+ - appends to `pencil-design.md`
211
+ - does not create a new artifact family
212
+ - does not write into `.da-vinci/designs/`
213
+ - does not overwrite unrelated sections
214
+
215
+ ## Regression Cases
216
+
217
+ These regressions should be covered:
218
+
219
+ 1. runtime gate accidentally behaves like a CLI-only audit
220
+ 2. runtime gate silently treats `new` as acceptable
221
+ 3. runtime gate treats PNG exports as source evidence
222
+ 4. runtime gate passes when anchor ids are missing
223
+ 5. runtime gate is skipped but reported as passed
224
+ 6. runtime gate writes ad hoc markdown files outside the allowed artifact path
225
+
226
+ ## Recommended Test Matrix
227
+
228
+ Minimum matrix:
229
+
230
+ - form factor: mobile first
231
+ - design stage: first Pencil write / approved anchor / terminal completion
232
+ - source state: healthy / `new` / diverged / missing `.pen`
233
+ - screen state: present / missing / stale ids
234
+ - review state: reviewed / missing / blocker ignored
235
+
236
+ ## Exit Criteria
237
+
238
+ The implementation should not be considered ready unless all of the following are true:
239
+
240
+ 1. evaluator unit tests cover all blocker branches
241
+ 2. at least one runtime integration test proves live editor detection works
242
+ 3. the Cipher-like failure scenario is blocked
243
+ 4. a healthy redesign scenario passes
244
+ 5. runtime gate plus filesystem audit together prevent false terminal completion