@hegemonart/get-design-done 1.27.7 → 1.28.5

Files changed (115)
  1. package/.claude-plugin/marketplace.json +2 -2
  2. package/.claude-plugin/plugin.json +2 -2
  3. package/CHANGELOG.md +142 -0
  4. package/SKILL.md +1 -1
  5. package/agents/design-verifier.md +17 -0
  6. package/hooks/gdd-decision-injector.js +149 -3
  7. package/package.json +1 -1
  8. package/reference/accessibility.md +4 -0
  9. package/reference/adr-format.md +96 -0
  10. package/reference/apply-reflections-procedure.md +68 -0
  11. package/reference/architecture-vocabulary.md +102 -0
  12. package/reference/audit-scoring.md +14 -0
  13. package/reference/cache-policy.md +126 -0
  14. package/reference/color-theory.md +279 -0
  15. package/reference/compare-rubric.md +171 -0
  16. package/reference/composition.md +349 -0
  17. package/reference/connections-onboarding.md +417 -0
  18. package/reference/context-md-format.md +106 -0
  19. package/reference/contrast-advanced.md +205 -0
  20. package/reference/darkmode-audit-procedure.md +258 -0
  21. package/reference/debug-feedback-loops.md +119 -0
  22. package/reference/design-procedure.md +304 -0
  23. package/reference/design-system-guidance.md +2 -0
  24. package/reference/discover-procedure.md +204 -0
  25. package/reference/explore-procedure.md +267 -0
  26. package/reference/form-patterns.md +2 -0
  27. package/reference/health-mcp-detection.md +44 -0
  28. package/reference/health-skill-length-report.md +69 -0
  29. package/reference/heuristics.md +84 -0
  30. package/reference/i18n.md +554 -0
  31. package/reference/iconography.md +2 -0
  32. package/reference/milestone-completeness-rubric.md +87 -0
  33. package/reference/motion-interpolate.md +1 -0
  34. package/reference/palette-catalog.md +2 -0
  35. package/reference/peer-cli-protocol.md +161 -0
  36. package/reference/plan-procedure.md +278 -0
  37. package/reference/proportion-systems.md +267 -0
  38. package/reference/registry.json +204 -1
  39. package/reference/registry.schema.json +1 -1
  40. package/reference/router-rules.md +84 -0
  41. package/reference/rtl-cjk-cultural.md +2 -0
  42. package/reference/scan-procedure.md +731 -0
  43. package/reference/shared-preamble.md +78 -6
  44. package/reference/skill-authoring-contract.md +128 -0
  45. package/reference/start-procedure.md +115 -0
  46. package/reference/style-doc-procedure.md +150 -0
  47. package/reference/style-vocabulary.md +2 -0
  48. package/reference/threat-modeling.md +101 -0
  49. package/reference/typography.md +4 -0
  50. package/reference/verify-procedure.md +512 -0
  51. package/reference/visual-hierarchy-layout.md +4 -0
  52. package/scripts/validate-skill-length.cjs +283 -0
  53. package/skills/add-backlog/SKILL.md +1 -0
  54. package/skills/analyze-dependencies/SKILL.md +33 -122
  55. package/skills/apply-reflections/SKILL.md +1 -40
  56. package/skills/audit/SKILL.md +3 -1
  57. package/skills/bandit-status/SKILL.md +31 -66
  58. package/skills/benchmark/SKILL.md +15 -55
  59. package/skills/brief/SKILL.md +12 -1
  60. package/skills/cache-manager/SKILL.md +3 -57
  61. package/skills/check-update/SKILL.md +38 -75
  62. package/skills/compare/SKILL.md +29 -269
  63. package/skills/complete-cycle/SKILL.md +1 -1
  64. package/skills/connections/SKILL.md +21 -427
  65. package/skills/continue/SKILL.md +1 -0
  66. package/skills/darkmode/SKILL.md +32 -287
  67. package/skills/debug/SKILL.md +11 -8
  68. package/skills/design/SKILL.md +27 -245
  69. package/skills/discover/SKILL.md +26 -133
  70. package/skills/discuss/SKILL.md +18 -2
  71. package/skills/explore/SKILL.md +42 -176
  72. package/skills/fast/SKILL.md +1 -0
  73. package/skills/figma-write/SKILL.md +2 -2
  74. package/skills/health/SKILL.md +11 -33
  75. package/skills/help/SKILL.md +1 -0
  76. package/skills/list-assumptions/SKILL.md +1 -0
  77. package/skills/map/SKILL.md +8 -31
  78. package/skills/new-cycle/SKILL.md +3 -1
  79. package/skills/next/SKILL.md +1 -0
  80. package/skills/note/SKILL.md +1 -0
  81. package/skills/optimize/SKILL.md +21 -44
  82. package/skills/pause/SKILL.md +1 -0
  83. package/skills/peer-cli-add/SKILL.md +26 -108
  84. package/skills/peer-cli-customize/SKILL.md +22 -42
  85. package/skills/peers/SKILL.md +33 -57
  86. package/skills/plan/SKILL.md +33 -220
  87. package/skills/plant-seed/SKILL.md +1 -0
  88. package/skills/pr-branch/SKILL.md +1 -0
  89. package/skills/progress/SKILL.md +1 -7
  90. package/skills/quality-gate/SKILL.md +34 -166
  91. package/skills/quick/SKILL.md +1 -0
  92. package/skills/reapply-patches/SKILL.md +1 -0
  93. package/skills/recall/SKILL.md +1 -0
  94. package/skills/resume/SKILL.md +1 -0
  95. package/skills/review-backlog/SKILL.md +1 -0
  96. package/skills/router/SKILL.md +3 -59
  97. package/skills/scan/SKILL.md +36 -675
  98. package/skills/settings/SKILL.md +1 -0
  99. package/skills/ship/SKILL.md +1 -0
  100. package/skills/sketch/SKILL.md +1 -1
  101. package/skills/sketch-wrap-up/SKILL.md +13 -54
  102. package/skills/spike/SKILL.md +1 -1
  103. package/skills/spike-wrap-up/SKILL.md +12 -46
  104. package/skills/start/SKILL.md +13 -112
  105. package/skills/stats/SKILL.md +1 -0
  106. package/skills/style/SKILL.md +18 -140
  107. package/skills/synthesize/SKILL.md +1 -0
  108. package/skills/timeline/SKILL.md +1 -0
  109. package/skills/todo/SKILL.md +1 -0
  110. package/skills/turn-closeout/SKILL.md +36 -56
  111. package/skills/undo/SKILL.md +1 -0
  112. package/skills/update/SKILL.md +1 -0
  113. package/skills/verify/SKILL.md +42 -457
  114. package/skills/warm-cache/SKILL.md +3 -35
  115. package/skills/zoom-out/SKILL.md +26 -0
@@ -0,0 +1,205 @@
1
+ ---
2
+ name: contrast-advanced
3
+ type: heuristic
4
+ version: 1.0.0
5
+ phase: 28
6
+ tags: [contrast, apca, wcag-3, accessibility]
7
+ last_updated: 2026-05-18
8
+ ---
9
+
10
+ # Contrast Advanced — APCA (WCAG 3 Draft)
11
+
12
+ WCAG 2.1 / 2.2 contrast (4.5:1 body, 3:1 large text and non-text UI) is owned by [`./accessibility.md`](./accessibility.md) §WCAG 2.1 AA — Required Thresholds. This file owns APCA, the WCAG 3 draft perceptual contrast model, whose verdicts diverge materially from WCAG 2.1 on thin, large, and colored text. APCA is currently part of the WCAG 3 draft (Silver) and has not reached candidate-recommendation stage; threshold values and the math can shift before ratification. Treat APCA as a **design-quality layer** that you stack on top of WCAG 2.1 AA certification, not as a replacement for it.
13
+
14
+ This is the file an agent should consult any time it is auditing a contrast pair that fails perceptually despite passing WCAG 2.1, or passes WCAG 2.1 with a margin that "feels wrong" — almost always one of: thin body text in a mid-gray, large colored text on white, or saturated text on a saturated background. Where WCAG 2.1 says "compute the luminance ratio and check the threshold", this file replaces that hand-wave with explicit perceptual thresholds, three worked misrank cases, and a heuristic mapping back to the legacy ratio so a single audit can satisfy both standards.
15
+
16
+ ---
17
+
18
+ ## APCA Lc Thresholds
19
+
20
+ APCA reports contrast as **Lc** — a perceptual lightness-contrast score on a scale that runs roughly **−108 to +106**. The sign carries directional meaning: a **positive Lc** denotes darker text on a lighter background (the common case for body copy), and a **negative Lc** denotes lighter text on a darker background (the common case for white-on-dark UI). Many practitioners report the **absolute value** `|Lc|` against the threshold — that is the convention this file uses below. Always confirm whether a calculator reports signed or unsigned Lc before comparing to a threshold.
21
+
22
+ The threshold ladder mirrors the WCAG-2.1 floor-by-use-case structure, but the breakpoints are set against perceptual contrast rather than luminance ratio, so the rank ordering between pairs sometimes flips relative to a WCAG-2.1 audit.
23
+
24
+ | Lc threshold | Use case | Rationale |
25
+ | ------------ | ---------------------- | -------------------------------------------------------------------------- |
26
+ | `Lc 75` | Body text (small) | Small glyphs at body weight need the most perceptual lift to remain legible |
27
+ | `Lc 60` | Large text | Larger glyph area tolerates lower perceptual contrast without legibility loss |
28
+ | `Lc 45` | Non-text UI | Buttons, borders, icons; functional but not body copy |
29
+ | `Lc 30` | Decorative / accent | Logos, accent dividers, brand marks; non-essential to comprehension |
30
+
31
+ A few practical notes on reading the ladder:
32
+
33
+ - **Sign convention.** `|Lc 75|` is the standard target whether the pair is dark-on-light (positive) or light-on-dark (negative). Do not confuse a calculator that returns `Lc −75` with one returning `Lc 75` — the magnitude is the same; only the polarity differs.
34
+ - **Weight and size sensitivity.** APCA's body-text threshold (`Lc 75`) is calibrated for small text at regular weight. Larger or heavier glyphs may legitimately pass at lower magnitudes; APCA's published lookup tables map exact weight × size cells to minimum Lc, with `Lc 60` and `Lc 45` covering the common "large text" and "non-text UI" buckets respectively.
35
+ - **Decorative is not "exempt".** `Lc 30` is the floor for elements where comprehension is not required (a brand watermark, a divider hairline). Anything a user must read or interact with belongs at `Lc 45` or above.
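The ladder and the sign convention above can be sketched as a small shell helper. This is a minimal illustration and not part of the package: the function name `apca_tier` and the tier labels are hypothetical, and the cut-offs are only the four draft thresholds from the table (the full APCA weight × size lookup is not modelled).

```shell
# Hypothetical helper: map a signed or unsigned Lc value to the highest
# tier of the threshold ladder that its magnitude clears.
apca_tier() {
  awk -v lc="$1" 'BEGIN {
    lc += 0                              # force numeric interpretation
    mag = (lc < 0) ? -lc : lc            # compare on |Lc|; polarity is ignored
    if      (mag >= 75) print "body"
    else if (mag >= 60) print "large-text"
    else if (mag >= 45) print "non-text-ui"
    else if (mag >= 30) print "decorative"
    else                print "below-floor"
  }'
}
```

Because the helper compares magnitudes, `apca_tier -75` and `apca_tier 75` report the same tier.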
36
+
37
+ ### Worked anchor pairs at each threshold
38
+
39
+ Holding the threshold ladder in mind is easier with one canonical pair anchoring each row. These pairs are reference points only — the calculator should always be consulted before shipping:
40
+
41
+ ```txt
42
+ Lc 75 body: #1A1A1A on #FFFFFF → APCA ~|95|, WCAG ~17:1 (over-clears both standards)
43
+ Lc 60 large: #333333 on #FFFFFF → APCA ~|85|, WCAG ~12.6:1 (clears both standards)
44
+ Lc 45 non-text UI: #5C5C5C on #FFFFFF → APCA ~|68|, WCAG ~7:1 (clears both; tightens at colored variants)
45
+ Lc 30 decorative:  #888888 on #FFFFFF → APCA ~|48|, WCAG ~3.5:1   (clears WCAG 3:1 large/UI but not 4.5:1 body; APCA-decorative only)
46
+ ```
47
+
48
+ The pattern is monotonic on neutral pairs — darker foregrounds raise both WCAG ratio and APCA Lc together. The interesting divergences only appear once color, weight, or size enters the comparison, which is the subject of the next section.
49
+
50
+ ---
51
+
52
+ ## Why 4.5:1 Misranks Thin / Large / Colored Text
53
+
54
+ WCAG 2.1's `4.5:1` is a **luminance contrast ratio**: a plain (non-logarithmic) ratio of the relative luminances of the lighter and darker colors, offset so that pure white against pure black yields the maximum of 21:1. The formula is `(L1 + 0.05) / (L2 + 0.05)` where `L1`/`L2` are the sRGB relative luminances of the lighter and darker color, and it knows nothing about font weight, glyph size, or the hue of either side of the pair.
55
+
56
+ APCA models perceptual contrast: lightness on a perceptually uniform scale, with adjustments for font weight, glyph area, and the directional bias of human contrast sensitivity (we read dark-on-light differently from light-on-dark). The two models agree most of the time on solid black-on-white body copy. They **disagree** in three predictable failure modes — and in each case it is APCA that tracks the human-readable reality:
57
+
58
+ - **Thin mid-grays on white** read worse than the ratio suggests (APCA flags; WCAG passes).
59
+ - **Large saturated text on white** reads with less margin than the ratio suggests (WCAG passes by a wide margin; APCA still passes but the margin is much narrower).
60
+ - **Saturated-on-saturated pairs** read worse than the ratio suggests, because human contrast sensitivity collapses when both sides carry strong hue at similar luminance (APCA flags clearly; WCAG sits ambiguously near the line).
61
+
62
+ Three worked examples make the math concrete. Ratios and Lc values below are **illustrative reference points** from the published APCA calculator at `apcacontrast.com` and a standard WCAG 2.1 contrast checker; production audits MUST recompute against a maintained calculator (APCA's tables update as the WCAG 3 draft advances) and MUST cite the calculator version + spec snapshot date for reproducibility.
63
+
64
+ ### Example 1 — Thin mid-gray on white
65
+
66
+ ```txt
67
+ Foreground: #666666 (rgb 102, 102, 102)
68
+ Background: #FFFFFF
69
+ Glyph context: body text, 16px, regular weight
70
+
71
+ WCAG 2.1 ratio: ~5.74:1 → PASSES 4.5:1 (body)
72
+ APCA Lc: ~|62| → FAILS Lc 75 (body)
73
+ ```
74
+
75
+ The luminance ratio crosses the 4.5:1 floor comfortably; on paper this pair is compliant. Perceptually, mid-gray body text on white runs out of lift well before the ratio would predict: the dark side is too light to anchor the glyph edges, and at body weight the strokes are thin enough that the eye loses crisp edge contrast. APCA's `Lc 75` body threshold catches this; WCAG 2.1's flat ratio does not. The fix is to darken the foreground (toward `#5A5A5A` or lower) or thicken the weight; both raise Lc.
76
+
77
+ ### Example 2 — Large colored text on white
78
+
79
+ ```txt
80
+ Foreground: #0066CC (rgb 0, 102, 204)
81
+ Background: #FFFFFF
82
+ Glyph context: large heading, 24px, bold
83
+
84
+ WCAG 2.1 ratio: ~5.57:1   → PASSES 4.5:1 and 3:1
85
+ APCA Lc: ~|78| → PASSES Lc 60 (large)
86
+ Disagreement: margin direction
87
+ ```
88
+
89
+ Both standards pass this pair, but they disagree on **how much margin** the design has. WCAG 2.1 reports a comfortable 5.57:1, well above the 3:1 large-text floor and even above the 4.5:1 body floor, which encourages the designer to read this as "high contrast, plenty of headroom". APCA reports `|Lc 78|`, which clears the large-text threshold (`Lc 60`) but is only marginally above the body threshold (`Lc 75`). The simple luminance ratio over-rates saturated blue against white because the formula collapses chroma into luminance; APCA tracks the perceptual reality that the blue glyph edges read with less crispness than a pure black would at the same ratio. The audit lesson: WCAG ratios on saturated colored text systematically overstate available contrast, and a designer who reduces saturation or shifts the hue while trusting the ratio will erode legibility before the ratio reports a problem.
90
+
91
+ ### Example 3 — Saturated text on saturated background
92
+
93
+ ```txt
94
+ Foreground: #FF6600 (rgb 255, 102, 0)
95
+ Background: #0033AA (rgb 0, 51, 170)
96
+ Glyph context: large UI label, 18px, semibold
97
+
98
+ WCAG 2.1 ratio: ~3.48:1   → FAILS 4.5:1; marginally PASSES 3:1 (large/UI)
99
+ APCA Lc: ~|42| → FAILS Lc 45 (non-text UI)
100
+ ```
101
+
102
+ The luminance ratio sits in the ambiguous gap: it fails body but marginally clears the large-text/UI floor. A WCAG-2.1-only audit might accept this for a button label or a chip. APCA reports `|Lc 42|`, below the `Lc 45` non-text UI threshold and well below the `Lc 60` large-text threshold — the pair is **not** safe for either use. The simple ratio over-rates this pair because two highly saturated colors at similar perceived lightness produce strong chromatic difference but weak lightness contrast; human contrast sensitivity for reading depends primarily on lightness, so the eye reads the boundary as fuzzier than the ratio suggests. APCA exposes the perceptual deficit; WCAG 2.1 hides it.
103
+
104
+ The three examples generalise to a heuristic an auditor can apply by hand: **when a pair involves thin text, large colored text, or saturated-on-saturated, distrust the WCAG ratio and re-check with APCA.** When both standards agree (solid black on white, dark navy on white, white on solid black) the ratio is reliable.
105
+
106
+ ### Why the math diverges
107
+
108
+ The structural reason the two models disagree on the three failure modes above is worth naming explicitly, because it explains why the disagreement is **predictable**, not random.
109
+
110
+ - **WCAG 2.1 uses sRGB relative luminance.** The ratio formula collapses each color's R/G/B channels into a single luminance value via a fixed weighting applied after gamma decoding (`0.2126·R + 0.7152·G + 0.0722·B` after sRGB-to-linear). The two luminances are then compared as a plain ratio with the `+0.05` offset on each side. The formula is hue-blind: a saturated blue and a mid-gray of equal luminance produce the same ratio against any background. It is also weight-blind and size-blind: a 4.5:1 pair is reported as 4.5:1 whether the text is 8px hairline or 96px black.
111
+ - **APCA uses perceptually weighted lightness.** It first transforms both colors into a perceptually uniform lightness space, then computes a signed contrast that accounts for the directional bias (dark-on-light vs light-on-dark) of human contrast sensitivity. Critically, the **threshold** APCA compares to is not a single number — it is a lookup keyed on the font weight × glyph size at which the pair will be rendered. A 16px regular-weight body pair targets `Lc 75`; a 24px bold pair targets `Lc 60`; a 14px hairline pair targets a value higher than `Lc 75` because the strokes are thinner.
112
+
113
+ The three failure modes line up against this structural difference: thin text fails because WCAG is weight-blind, large colored text passes with an overstated margin because WCAG is hue-blind, and saturated-on-saturated fails because the simple luminance ratio over-rates pairs whose difference is mostly chromatic rather than lightness-based. These are not edge cases the spec authors missed; they are the price of the simpler 2.1 model, which buys cheaper computation and easier hand-audit. APCA trades simplicity for perceptual accuracy; WCAG 2.1 makes the opposite trade.
114
+
115
+ ---
116
+
117
+ ## When to Use APCA vs WCAG 2.1 AA
118
+
119
+ The two standards are **not** an either/or choice for production work. APCA is more informative on perceptual edge cases; WCAG 2.1 AA is the contractually enforceable baseline for accessibility compliance. The defensible production pattern is **dual-target compliance**: ship designs that satisfy both, and fall back to a principled tiebreaker only when the two disagree.
120
+
121
+ ### Dual-target compliance pattern
122
+
123
+ 1. **Default audit floor: WCAG 2.1 AA.** Body text ≥ 4.5:1, large text and non-text UI ≥ 3:1. This is the floor for accessibility certification, procurement contracts, public-sector compliance, and any audit a third party will run.
124
+ 2. **Design-quality layer: APCA Lc thresholds.** Body `Lc 75`, large `Lc 60`, non-text UI `Lc 45`, decorative `Lc 30`. This is the floor for perceptual quality and for the three misrank cases above (thin, large-colored, colored-on-colored).
125
+ 3. **Tiebreaker when the two disagree.** Prefer **satisfying both** — almost every pair has a small foreground or background adjustment that clears both simultaneously. When forced to choose, prefer APCA for **body text** (perceptual legibility matters more than the legacy ratio when a user is reading) and prefer WCAG 2.1 AA for **non-text UI** (focus rings, button borders, icon glyphs — where contractual certification matters more than the perceptual edge case).
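The three steps above can be sketched as one check per contrast pair. This is a hedged illustration under stated assumptions: the function name `dual_target`, the role keywords (`body`, `large`, `ui`), and the output labels are hypothetical. The ratio and Lc values are expected to come from external calculators, and the text above names no tiebreaker for large text, so the sketch reports it as `unspecified`.

```shell
# Hypothetical helper: dual_target ROLE WCAG_RATIO LC
# Prints the combined verdict and, on disagreement, which standard the
# tiebreaker in step 3 favours.
dual_target() {
  awk -v role="$1" -v ratio="$2" -v lc="$3" 'BEGIN {
    if      (role == "body")  { min_ratio = 4.5; min_lc = 75; tie = "apca" }
    else if (role == "large") { min_ratio = 3;   min_lc = 60; tie = "unspecified" }
    else if (role == "ui")    { min_ratio = 3;   min_lc = 45; tie = "wcag" }
    else { print "unknown-role"; exit 1 }

    lc += 0
    mag  = (lc < 0) ? -lc : lc           # thresholds are compared on |Lc|
    wcag = (ratio + 0 >= min_ratio)      # step 1: WCAG 2.1 AA floor
    apca = (mag >= min_lc)               # step 2: APCA design-quality layer

    if      (wcag && apca) print "pass-both"
    else if (wcag)         print "wcag-pass-apca-fail tiebreak=" tie
    else if (apca)         print "apca-pass-wcag-fail tiebreak=" tie
    else                   print "fail-both"
  }'
}
```

A result such as `wcag-pass-apca-fail tiebreak=apca` for a body pair means the pair should be treated as failing until it is adjusted to clear both standards.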
126
+
127
+ The order is intentional. The WCAG 2.1 AA floor is the legally and contractually defensible baseline; treating APCA as a *replacement* would shrink the compliance surface and break audits run by third parties using legacy tools. Treating APCA as an *additive* design-quality layer expands the design surface without breaking compliance.
128
+
129
+ ### Common audit-finding patterns
130
+
131
+ When the dual-target pattern is applied in practice, a small set of finding patterns surfaces repeatedly. Naming them lets an audit triage the result rather than re-deriving the verdict from raw numbers each time.
132
+
133
+ - **`apca-flags-thin-body`** — WCAG 2.1 body passes (≥ 4.5:1), APCA fails `Lc 75`. Mid-gray body copy on white; the dominant fix is darkening the foreground by 1-2 modular steps on the lightness axis.
134
+ - **`wcag-margin-overstated`** — both standards pass, but APCA reports a margin substantially narrower than the WCAG ratio implies. Common on saturated colored text on white. Action: do not lean further into the apparent WCAG headroom (e.g., by lowering chroma or shifting hue toward background); the perceptual margin is already thinner than the ratio reports.
135
+ - **`saturated-on-saturated-trap`** — WCAG 2.1 sits ambiguously in the 3:1-to-4.5:1 band, APCA fails clearly. High-saturation color-on-color pairs. The dominant fix is widening the lightness gap (push foreground darker or background lighter), not adjusting the hue choice.
136
+ - **`focus-ring-wcag-pass-apca-borderline`** — non-text UI pair clears WCAG 2.1 `3:1` but sits at or just below APCA `Lc 45`. Common on light-gray focus rings around white inputs. Action: APCA-aware designers raise the ring contrast even though WCAG would let it ship.
137
+ - **`light-on-dark-asymmetry`** — same `|Lc|` magnitude reads differently dark-on-light vs light-on-dark; pairs that pass dark-mode body can fail when polarity-flipped to light-mode body. Audit both polarities independently rather than assuming symmetry.
138
+
139
+ These patterns are useful labels for the dual-target dashboard view: an audit can report "3 `apca-flags-thin-body` findings, 1 `saturated-on-saturated-trap` finding, 0 hard WCAG 2.1 fails" and the design team immediately knows the corrective work concentrates on the body-text color choice, not on the WCAG-side certification.
140
+
141
+ ### Draft-status caveat
142
+
143
+ APCA is currently in the WCAG 3 draft. The threshold values cited above (`Lc 75 / 60 / 45 / 30`), the calculator implementation, and the lookup-table mapping of weight × size to minimum Lc can all shift before WCAG 3 reaches candidate recommendation. Reproducible audits MUST cite:
144
+
145
+ - The APCA calculator version used (e.g., `apcacontrast.com` build `0.1.9 W3`)
146
+ - The WCAG 3 draft snapshot date used as the spec reference
147
+ - The conversion table version used (the table in the next section is heuristic and approximate)
148
+
149
+ Any audit that does not cite these is not reproducible — a re-audit six months later may produce different verdicts on the same pairs, not because the design changed but because the underlying spec advanced.
150
+
151
+ ---
152
+
153
+ ## Lc ↔ WCAG 2.1 Conversion Table
154
+
155
+ The table below is an **approximate heuristic** for legacy interop — it lets an auditor running a WCAG 2.1 contrast checker spot which pairs are likely to satisfy APCA-equivalent perceptual contrast and which are likely to fall short. APCA and WCAG 2.1 measure fundamentally different things (perceptual lightness contrast vs luminance ratio), so a precise one-to-one mapping does not exist; the table is calibrated against the common case of solid neutral text on a solid neutral background, and drifts in either direction once saturated hue enters the pair.
156
+
157
+ | APCA Lc threshold | WCAG 2.1 ratio (approx) | Common case |
158
+ | ----------------- | ----------------------- | --------------------------------- |
159
+ | `Lc 75` | `~7:1` | Body text, strict (AAA-equivalent) |
160
+ | `Lc 60` | `~4.5:1` | Body text minimum (AA body floor) |
161
+ | `Lc 45` | `~3:1` | Large text / non-text UI |
162
+ | `Lc 30` | `~2:1` | Decorative / accent |
163
+
164
+ Use this table in two directions:
165
+
166
+ - **WCAG-pass-to-APCA check.** A pair that satisfies WCAG 2.1 AA body (`4.5:1`) maps to roughly `Lc 60` — which is APCA's *large text* floor, not its body floor. A WCAG-2.1 body-compliant design is **not** automatically APCA body-compliant. Re-check body text against `Lc 75`.
167
+ - **APCA-pass-to-WCAG check.** A pair that satisfies APCA body (`Lc 75`) maps to roughly `7:1` — well above WCAG 2.1 AA's `4.5:1` floor. APCA body-compliant designs are almost always WCAG 2.1 AA body-compliant; the reverse is not true.
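The WCAG-to-APCA direction of the table can be sketched as a first-pass screening helper. It is an illustration only: the function name `lc_row_for_ratio` is hypothetical, and the mapping inherits every limitation of the approximate table (neutral pairs only, drifting once saturated hue is involved).

```shell
# Hypothetical helper: report the highest table row whose approximate
# WCAG 2.1 ratio the given ratio reaches.
lc_row_for_ratio() {
  awk -v r="$1" 'BEGIN {
    r += 0                               # force numeric interpretation
    if      (r >= 7)   print "Lc 75"
    else if (r >= 4.5) print "Lc 60"
    else if (r >= 3)   print "Lc 45"
    else if (r >= 2)   print "Lc 30"
    else               print "below Lc 30"
  }'
}
```

A body pair that screens at `Lc 60` or lower is a candidate for a direct APCA measurement; the helper never replaces that measurement.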
168
+
169
+ The asymmetry is the practical lesson: **APCA is the stricter floor for body text.** A design that targets APCA body and ships through a WCAG 2.1 AA audit will pass both with margin. A design that targets WCAG 2.1 AA body and is then APCA-audited will frequently surface "passes WCAG, fails APCA Lc 75" findings on thin and colored text — those findings are real, not noise.
170
+
171
+ The conversion drifts at the saturated-hue edges. On saturated-on-saturated pairs (Example 3 above), the WCAG ratio over-rates contrast relative to APCA Lc, so the table reads optimistic. On thin gray body text (Example 1), passing WCAG is also not enough: `5.74:1` sits between the `~4.5:1` (`Lc 60`) and `~7:1` (`Lc 75`) rows, the measured value was `~Lc 62`, and the body threshold is `Lc 75`, so the pair fails APCA body despite clearing the WCAG 2.1 AA body floor. **Treat the table as a screening tool, not as a substitute for re-running the actual APCA calculator on the actual pair.**
172
+
173
+ ### Applying the table to the three worked examples
174
+
175
+ Walking the three misrank cases through the conversion table makes the screening-tool behaviour explicit:
176
+
177
+ ```txt
178
+ Example 1 — #666666 on #FFFFFF (thin body)
179
+ WCAG ratio: ~5.74:1 → table-row ~Lc 60 (just above body floor)
180
+ APCA Lc: ~|62| → matches the table row
181
+ Verdict: APCA body threshold is Lc 75; the table-mapped Lc 60 fails body
182
+ Reading: table screening agrees with the direct APCA measurement here
183
+
184
+ Example 2 — #0066CC on #FFFFFF (large colored)
185
+   WCAG ratio: ~5.57:1   → table-row between Lc 60 and Lc 75 (body-minimum-to-strict)
185
+   APCA Lc:    ~|78|     → just above that band
187
+ Verdict: Both standards pass; APCA margin tighter than ratio suggests
188
+ Reading: table screening would have caught this had body been the target
189
+
190
+ Example 3 — #FF6600 on #0033AA (saturated-on-saturated)
191
+   WCAG ratio: ~3.48:1   → table-row between Lc 45 and Lc 60 (UI-to-body-minimum)
192
+   APCA Lc:    ~|42|     → below the table-implied band
193
+   Verdict:    APCA non-text UI threshold is Lc 45; this falls below
194
+   Reading:    table screening reads optimistic here; direct APCA exposes the shortfall
195
+ ```
196
+
197
+ In the first case the table screening converges on the verdict the direct APCA calculator produces. In the second (large colored text), the table screening *would* have caught the tight margin had the target been body text rather than large text, which is exactly the kind of design-quality consideration the dual-target pattern is designed to surface. In the third (saturated-on-saturated), the table reads optimistic and only the direct measurement exposes the shortfall. The table is therefore a useful first-pass filter for designers working in a WCAG-2.1-centric tool stack, with the explicit understanding that any pair the screening surfaces as borderline, and any pair involving saturated hue, must be re-run against the actual APCA calculator before a decision ships.
198
+
199
+ ---
200
+
201
+ ## Cross-References
202
+
203
+ - [`./accessibility.md`](./accessibility.md) §WCAG 2.1 AA — Required Thresholds: legacy luminance-ratio floor; pair with APCA Lc for dual-target compliance.
204
+
205
+ > The reciprocal inbound cross-link from [`./accessibility.md`](./accessibility.md) (a "see also: APCA / WCAG 3 draft" pointer into the §WCAG 2.1 / 2.2 section) lands in Phase 28-06 (additive-only, per D-06) and is not present at the time this file ships.
@@ -0,0 +1,258 @@
1
+ ---
2
+ name: darkmode-audit-procedure
3
+ type: meta-rules
4
+ version: 1.0.0
5
+ phase: 28.5
6
+ tags: [darkmode, dark-mode, contrast, audit, procedure, extracted]
7
+ last_updated: 2026-05-18
8
+ ---
9
+
10
+ Source: extracted from `skills/darkmode/SKILL.md` (Phase 28.5 rework — D-10 extract-then-link).
11
+ The skill's load-bearing routing + decision tree stays in `../skills/darkmode/SKILL.md`; this
12
+ file holds the architecture-detection greps, contrast computation, anti-pattern grep
13
+ snippets, and the `DARKMODE-AUDIT.md` report template.
14
+
15
+ # Dark Mode Audit Procedure
16
+
17
+ Detailed procedure for the `get-design-done:darkmode` standalone audit — companion to
18
+ `../skills/darkmode/SKILL.md`. Read this file when executing a specific audit step
19
+ (architecture detection, contrast computation, anti-pattern grep, report layout). The
20
+ SKILL.md keeps the load-bearing pre-flight + step routing; this file holds the deep
21
+ methodology.
22
+
23
+ For the perceptual layer (APCA / WCAG 3 draft) sitting on top of the WCAG 2.1 ratios used
24
+ in Step 2, see `./contrast-advanced.md`. For modern OKLCH-based dark token-pair generation,
25
+ see `./color-theory.md` §OKLCH. For the cross-skill output discipline + connection-probe
26
+ pattern, see `./shared-preamble.md#output-contract-reminders` and
27
+ `./shared-preamble.md#connection-handshake-summary`.
28
+
29
+ ---
30
+
31
+ ## Step 1: Architecture Detection (DARK-02)
32
+
33
+ Run all three architecture greps against `$SRC_ROOT`. Use `2>/dev/null` on each to suppress missing-directory errors.
34
+
35
+ ```bash
36
+ # Architecture 1: CSS custom properties with dark media query
37
+ arch1_count=$(grep -rEn "prefers-color-scheme.*dark|\.dark[[:space:]]*\{" "$SRC_ROOT" \
38
+ --include="*.css" --include="*.scss" 2>/dev/null | wc -l)
39
+
40
+ # Architecture 2: Tailwind dark: prefix
41
+ arch2_count=$(grep -rEn "dark:[a-z]" "$SRC_ROOT" \
42
+ --include="*.tsx" --include="*.jsx" --include="*.html" 2>/dev/null | wc -l)
43
+
44
+ # Architecture 3: JS class toggle on <html> / <body>
45
+ arch3_count=$(grep -rEn "classList.*dark|setAttribute.*dark|document\.documentElement" "$SRC_ROOT" \
46
+ --include="*.ts" --include="*.tsx" --include="*.js" 2>/dev/null | wc -l)
47
+ ```
48
+
49
+ **Classification rules:**
50
+
51
+ | Condition | Classification |
52
+ |-----------|---------------|
53
+ | All three counts < 3 | No dark mode — abort: "No dark mode implementation detected — nothing to audit." |
54
+ | Exactly one count ≥ 3 | Primary architecture = that one |
55
+ | Two or more counts ≥ 5 | Hybrid (list all detected architectures) |
56
+ | Two or more counts ≥ 3, fewer than two ≥ 5 | Primary = highest count |
57
+
58
+ Record `ARCH_DETECTED` as one of: `Architecture 1 (CSS custom props)`, `Architecture 2 (Tailwind dark:)`, `Architecture 3 (JS class toggle)`, or `Hybrid`.
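The classification rules can be sketched as a shell function over the three counts. This is one possible reading, not the skill's actual implementation: the function name `classify_arch` is hypothetical, the rules are assumed to apply in order (no dark mode, then hybrid, then primary), ties go to the lower-numbered architecture, and `none` stands in for the abort message.

```shell
# Hypothetical helper: classify_arch ARCH1_COUNT ARCH2_COUNT ARCH3_COUNT
classify_arch() {
  ge3=0; ge5=0; best=-1; best_idx=0; idx=0
  for n in "$1" "$2" "$3"; do
    idx=$((idx + 1))
    if [ "$n" -ge 3 ]; then ge3=$((ge3 + 1)); fi
    if [ "$n" -ge 5 ]; then ge5=$((ge5 + 1)); fi
    # Strict comparison: on a tie the earlier architecture stays primary.
    if [ "$n" -gt "$best" ]; then best=$n; best_idx=$idx; fi
  done

  if [ "$ge3" -eq 0 ]; then
    echo "none"                          # all three counts < 3
  elif [ "$ge5" -ge 2 ]; then
    echo "Hybrid"                        # two or more counts >= 5
  else
    case "$best_idx" in
      1) echo "Architecture 1 (CSS custom props)" ;;
      2) echo "Architecture 2 (Tailwind dark:)" ;;
      3) echo "Architecture 3 (JS class toggle)" ;;
    esac
  fi
}
```

A caller could then set `ARCH_DETECTED=$(classify_arch "$arch1_count" "$arch2_count" "$arch3_count")` and abort when the result is `none`.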
59
+
60
+ ---
61
+
62
+ ## Step 2: Contrast Audit (DARK-03)
63
+
64
+ For the detected architecture, enumerate color token + background token pairs used in dark context, then compute WCAG contrast ratios.
65
+
66
+ **Token extraction by architecture:**
67
+
68
+ **Architecture 1 (CSS custom props):**
69
+ ```bash
70
+ grep -rEh "\.dark[[:space:]]*\{|prefers-color-scheme.*dark" "$SRC_ROOT" \
71
+   --include="*.css" --include="*.scss" -A 30 2>/dev/null \
72
+   | grep -E "^[[:space:]]*--[a-z].*:[[:space:]]*(#[0-9a-fA-F]{3,8}|rgb)"
73
+ ```
74
+
75
+ **Architecture 2 (Tailwind dark:):**
76
+ ```bash
77
+ grep -rEho "dark:(bg|text)-[a-z0-9-]+" "$SRC_ROOT" \
78
+ --include="*.tsx" --include="*.jsx" --include="*.html" 2>/dev/null | sort -u
79
+ ```
80
+
81
+ **Architecture 3 (JS class toggle):**
82
+ ```bash
83
+ grep -rEn "\.dark[[:space:]]*\{" "$SRC_ROOT" \
84
+ --include="*.css" --include="*.scss" -A 30 2>/dev/null \
85
+ | grep -E "color|background"
86
+ ```
87
+
88
+ **WCAG contrast computation:**
89
+
90
+ Use the linearized-sRGB formula from `agents/design-executor.md` Type: accessibility (pre-calibrated — do not re-derive):
91
+
92
+ 1. Convert each hex channel to linear light: `c_lin = (c/255 ≤ 0.04045) ? c/255/12.92 : ((c/255 + 0.055)/1.055)^2.4`
93
+ 2. Relative luminance: `L = 0.2126 * R_lin + 0.7152 * G_lin + 0.0722 * B_lin`
94
+ 3. Contrast ratio: `(L_lighter + 0.05) / (L_darker + 0.05)`
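The three steps above can be sketched directly in shell with `awk` doing the arithmetic. This is a minimal illustration that assumes six-digit hex input (no shorthand, no alpha channel); the function name `wcag_ratio` is hypothetical and is not the pre-calibrated formula referenced in `agents/design-executor.md`.

```shell
# Hypothetical helper: wcag_ratio FG_HEX BG_HEX  (e.g. '#666666' '#FFFFFF')
wcag_ratio() {
  awk -v fg="${1#\#}" -v bg="${2#\#}" '
    # Decode one two-digit hex channel to 0..255.
    function chan(h,    hi, lo) {
      h = tolower(h)
      hi = index("0123456789abcdef", substr(h, 1, 1)) - 1
      lo = index("0123456789abcdef", substr(h, 2, 1)) - 1
      return hi * 16 + lo
    }
    # Step 1: sRGB channel to linear light.
    function lin(c,    s) {
      s = c / 255
      return (s <= 0.04045) ? s / 12.92 : ((s + 0.055) / 1.055) ^ 2.4
    }
    # Step 2: relative luminance of a six-digit hex color.
    function lum(hex,    r, g, b) {
      r = lin(chan(substr(hex, 1, 2)))
      g = lin(chan(substr(hex, 3, 2)))
      b = lin(chan(substr(hex, 5, 2)))
      return 0.2126 * r + 0.7152 * g + 0.0722 * b
    }
    # Step 3: ratio of lighter to darker, each offset by 0.05.
    BEGIN {
      hi_l = lum(fg); lo_l = lum(bg)
      if (hi_l < lo_l) { tmp = hi_l; hi_l = lo_l; lo_l = tmp }
      printf "%.2f\n", (hi_l + 0.05) / (lo_l + 0.05)
    }'
}
```

Argument order does not matter because the lighter color is always placed in the numerator; `wcag_ratio '#666666' '#FFFFFF'` prints `5.74`.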
95
+
96
+ **Thresholds:**
97
+
98
+ | Text type | Min ratio | Fail severity |
99
+ |-----------|-----------|---------------|
100
+ | Body text (< 18pt or < 14pt bold) | 4.5:1 | P0 (critical) |
101
+ | Large text (≥ 18pt or ≥ 14pt bold) | 3:1 | P1 (major) |
102
+ | UI component boundaries | 3:1 | P1 (major) |
103
+
104
+ Flag every pair that fails its threshold. Include token names, hex values, computed ratio, and required ratio in the fix description.
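
As a rough sketch of the flagging rule, the thresholds table can be turned into a lookup that returns a fix description only when a pair fails. The function name `contrast_finding`, the `kind` keys, and the exact message layout are assumptions made for illustration; the source only requires that token names, hex values, computed ratio, and required ratio appear.

```python
from typing import Optional

# Minimum ratio and failure severity per text type, copied from the thresholds table.
THRESHOLDS = {
    "body": (4.5, "P0"),
    "large": (3.0, "P1"),
    "ui": (3.0, "P1"),
}


def contrast_finding(fg_token: str, fg_hex: str, bg_token: str, bg_hex: str,
                     ratio: float, kind: str) -> Optional[str]:
    """Return a fix description for a failing pair, or None when the pair passes."""
    required, severity = THRESHOLDS[kind]
    if ratio >= required:
        return None
    return (f"[{severity}] {fg_token} ({fg_hex}) on {bg_token} ({bg_hex}): "
            f"ratio {ratio:.2f}:1, required {required:g}:1")
```

The same pair can pass as large text and fail as body text, so the text type has to be decided before the ratio is judged.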
105
+
106
+ For pairs that pass WCAG 2.1 but feel wrong perceptually (thin mid-gray text, large saturated text, saturated-on-saturated), cross-check with the APCA Lc thresholds in `./contrast-advanced.md` and annotate `[APCA-mismatch]` in the fix description.
107
+
108
+ ---
109
+
110
+ ## Step 3: Token Override Completeness (DARK-04)
111
+
112
+ Check that every light-mode color token has a corresponding dark-mode override.
113
+
114
+ **Enumerate light-mode tokens:**
115
+ ```bash
116
+ grep -rEhon "var\(--color-[a-z0-9-]+\)" "$SRC_ROOT" \
117
+ --include="*.css" --include="*.scss" --include="*.tsx" --include="*.jsx" 2>/dev/null \
118
+ | grep -oE "\-\-color-[a-z0-9-]+" | sort -u
119
+
120
+ grep -rEho "(bg|text|border|ring)-[a-z]+-[0-9]+" "$SRC_ROOT" \
121
+ --include="*.tsx" --include="*.jsx" 2>/dev/null | sort -u
122
+ ```
123
+
124
+ **Check dark overrides (architecture-specific):**
125
+ - Arch 1: Token appears in `.dark { --color-* }` block or `@media (prefers-color-scheme: dark) { --color-* }`
126
+ - Arch 2: A `dark:` prefixed variant of the Tailwind class exists in the same file or a shared layout
127
+ - Arch 3: Token appears in the dark CSS block activated by JS class toggle
128
+
129
+ **Flag:** Any light-mode color token with no dark override → P1 (major). For OKLCH-based pair generation guidance, see `./color-theory.md` §OKLCH.
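
For Architecture 1, the completeness check reduces to a set difference between tokens declared in light context and tokens redeclared in dark context. The sketch below assumes light tokens live in a flat `:root { ... }` block and dark overrides in a flat `.dark { ... }` block with no nested braces; both the layout and the function names are assumptions, and a real stylesheet would need a proper CSS parser.

```python
import re
from typing import List, Set


def declared_tokens(css: str, selector_pattern: str) -> Set[str]:
    """Collect --color-* custom properties declared inside blocks matching the selector."""
    tokens: Set[str] = set()
    for block in re.finditer(selector_pattern + r"\s*\{([^}]*)\}", css):
        tokens.update(re.findall(r"(--color-[a-z0-9-]+)\s*:", block.group(1)))
    return tokens


def missing_dark_overrides(css: str) -> List[str]:
    """Light-mode tokens that have no counterpart in the .dark block."""
    light = declared_tokens(css, r":root")
    dark = declared_tokens(css, r"\.dark")
    return sorted(light - dark)
```

Every name returned would be reported as a P1 finding under the rule above.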
130
+
131
+ ---
132
+
133
+ ## Step 4: Dark-Specific Anti-Patterns (DARK-05)
134
+
135
+ **Anti-pattern A: Images and SVGs without dark variant**
136
+
137
+ ```bash
138
+ grep -rEn "<img[^>]+src=|<svg" "$SRC_ROOT" \
139
+ --include="*.tsx" --include="*.jsx" --include="*.html" --include="*.vue" 2>/dev/null \
140
+ | grep -v "dark\."
141
+ ```
142
+
143
+ For each image/SVG found, check whether any of the following exist:
144
+ - A sibling file with pattern `[name]-dark.{png,svg,webp}`
145
+ - A `dark:hidden` / `dark:block` swap class pairing in the same component
146
+ - A `<picture>` element with a `prefers-color-scheme: dark` source
147
+
148
+ Flag images/SVGs with none of the above → P2 (minor).
149
+
150
+ **Anti-pattern B: Pure-black backgrounds (BAN-05)**
151
+
152
+ ```bash
153
+ grep -rEn "#000000|#000\b|rgb\([[:space:]]*0[[:space:]]*,[[:space:]]*0[[:space:]]*,[[:space:]]*0[[:space:]]*\)|background[^:]*:[[:space:]]*black" \
154
+ "$SRC_ROOT" --include="*.css" --include="*.scss" 2>/dev/null
155
+ ```
156
+
157
+ Any match within a `.dark {}` block or `@media (prefers-color-scheme: dark)` context → P1 (major). A pure-black (`#000000`) background produces the maximum possible contrast against light text, which can read as visually harsh in dark mode. Use near-black (`#0a0a0a` – `#1a1a1a`) instead.
158
+
159
+ **Anti-pattern C: Missing forced-colors media query**
160
+
161
+ ```bash
162
+ forced_count=$(grep -rEn "@media.*forced-colors" "$SRC_ROOT" \
163
+ --include="*.css" --include="*.scss" 2>/dev/null | wc -l)
164
+ ```
165
+
166
+ If `forced_count` equals 0 → P2 (minor). The `forced-colors` media query ensures the design respects Windows High Contrast mode and similar OS accessibility overrides.
167
+
168
+ ---
169
+
170
+ ## Step 5: Meta Property Check (DARK-06)
171
+
172
+ **color-scheme property:**
173
+ ```bash
174
+ cs_count=$(grep -rEn "(^|[^-])color-scheme" "$SRC_ROOT" public/ \
175
+ --include="*.html" --include="*.tsx" --include="*.css" 2>/dev/null | wc -l)
176
+ ```
177
+ If `cs_count` equals 0 → P2 (minor).
178
+
179
+ **prefers-color-scheme media query:**
180
+ ```bash
181
+ pcs_count=$(grep -rEn "prefers-color-scheme" "$SRC_ROOT" public/ \
182
+ --include="*.html" --include="*.tsx" --include="*.css" 2>/dev/null | wc -l)
183
+ ```
184
+ If `pcs_count` equals 0 → P2 (minor). Absence means the site ignores the OS-level dark mode preference.
185
+
186
+ ---
187
+
188
+ ## Step 5B: Dark Mode Rendering Screenshots (when preview: available)
189
+
190
+ Check `preview` status from `.design/STATE.md <connections>` (per `./shared-preamble.md#connection-handshake-summary`).
191
+
192
+ **If `preview: available`:**
193
+
194
+ 1. `preview_navigate` to the primary route (e.g., `http://localhost:3000/`).
195
+ 2. Capture light-mode: `preview_screenshot` → `.design/screenshots/darkmode/light.png`.
196
+ 3. Inject dark mode using the project's toggle mechanism (check `DESIGN-CONTEXT.md` D-XX decisions):
197
+ - Tailwind dark: `preview_eval("document.documentElement.classList.add('dark')")`
198
+ - data-theme: `preview_eval("document.documentElement.setAttribute('data-theme','dark')")`
199
+ - Custom class: `preview_eval("document.documentElement.classList.add('theme-dark')")`
200
+ - If mechanism is unknown: attempt Tailwind default first; note in `DARKMODE-AUDIT.md` which method was used.
201
+ 4. `preview_screenshot` → `.design/screenshots/darkmode/dark.png`.
202
+ 5. Record both paths (NOT base64) for embedding in `## Dark Mode Rendering` section.
203
+
204
+ **If `preview: unavailable` or `preview: not_configured`:** omit `## Dark Mode Rendering` section entirely. Emit `Visual dark mode check skipped — preview not configured.` in Notes.
205
+
206
+ ---
207
+
208
+ ## Step 6: DARKMODE-AUDIT.md Template
209
+
210
+ Output path: `.design/DARKMODE-AUDIT.md`.
211
+
212
+ ```markdown
213
+ # Dark Mode Audit
214
+
215
+ **Generated:** <ISO date>
216
+ **Architecture detected:** <Architecture 1 (CSS custom props) | Architecture 2 (Tailwind dark:) | Architecture 3 (JS class toggle) | Hybrid | None>
217
+ **Source scanned:** <SRC_ROOT>
218
+
219
+ ## Summary
220
+
221
+ | Category | Status | Issues |
222
+ |----------|--------|--------|
223
+ | Contrast (DARK-03) | <pass / fail> | <count> |
224
+ | Token Overrides (DARK-04) | <pass / fail> | <count> |
225
+ | Anti-Patterns (DARK-05) | <pass / fail> | <count> |
226
+ | Meta Properties (DARK-06) | <pass / fail> | <count> |
227
+
228
+ ## P0 Fixes (Critical — contrast failure on body text)
229
+ - [CONTRAST] <token-pair>: ratio <X:1> — required 4.5:1. File: <path>
230
+
231
+ ## P1 Fixes (Major — large-text contrast / missing dark overrides / pure-black)
232
+ - [CONTRAST-LARGE] <token-pair>: ratio <X:1> — required 3:1. File: <path>
233
+ - [TOKEN-OVERRIDE] Missing dark override for <--token-name>. Light value: <hex>. File: <path>
234
+ - [BAN-05] Pure-black background detected in dark context. File: <path>:line
235
+
236
+ ## P2 Fixes (Minor — missing SVG variants / forced-colors / meta props)
237
+ - [SVG-DARK] <image.svg> has no dark variant. File: <path>
238
+ - [FORCED-COLORS] No @media (forced-colors) block detected in any CSS file.
239
+ - [COLOR-SCHEME] No color-scheme property or meta tag detected.
240
+ - [PREFERS-COLOR-SCHEME] No prefers-color-scheme query detected.
241
+
242
+ ## P3 Fixes (Cosmetic)
243
+ - <cosmetic issues, if any>
244
+
245
+ ## Dark Mode Rendering
246
+ <Either side-by-side screenshot references, or "Visual dark mode check skipped — preview not configured.">
247
+
248
+ ## Notes
249
+ This audit is read-only. It does NOT write scores back to DESIGN.md.
250
+ To apply fixes, run the design pipeline and include dark mode decisions in DESIGN-CONTEXT.md.
251
+ Score writeback (V2-05) is deferred.
252
+ ```
253
+
254
+ If a priority bucket has no issues, omit that section or write "None."
255
+
256
+ ---
257
+
258
+ *Imported by: `../skills/darkmode/SKILL.md`. Maintained as part of Phase 28.5 (Bucket 2 rework — D-10).*
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: debug-feedback-loops
3
+ type: heuristic
4
+ version: 1.0.0
5
+ phase: 28.5
6
+ tags: [debug, feedback-loop, deterministic-signal, iterate-on-loop, mit-port, mattpocock]
7
+ last_updated: 2026-05-18
8
+ ---
9
+
10
+ Source: mattpocock/skills (MIT) — engineering/diagnose Phase 1 — adapted with permission. See `../NOTICE` for the full attribution block.
11
+
12
+ # Debug Feedback Loops
13
+
14
+ Build a feedback loop before any hypothesizing. A feedback loop is a deterministic, fast, agent-runnable pass/fail signal that tells you whether the bug is reproduced right now, every time, in well under a minute. Without it, debugging degenerates to speculation.
15
+
16
+ **Do not proceed to Phase 2 (hypothesis generation) until you have a loop you believe in.**
17
+
18
+ ## The 10 construction paths
19
+
20
+ Listed in priority order — try the cheaper, faster paths first. Each entry: when to reach for it, the shape of the loop, and the verification snippet.
21
+
22
+ ### 1. Failing test
23
+
24
+ The strongest signal. Write the failing test in the project's native test framework (pytest, jest, vitest, go test, cargo test, junit, rspec). Assert the expected (correct) behavior so the test fails while the bug is present; run with watch mode if available. When the test goes green, the bug is fixed. Best case: reproduces in under 2 seconds, runnable by a fresh agent with one command.
25
+
26
+ When to reach: the project already has a test runner the agent can invoke; the bug is in a unit/integration-testable code path; the symptom is expressible as a deterministic assertion.
27
+
28
+ ### 2. `curl` against a running endpoint
29
+
30
+ For HTTP endpoints, a `curl` against the staging or local server with assertion on body/status. Pipe to `jq` for structured assertions. Cheap, reliable, framework-free. Wrap in a 3-line bash script so the loop is one command.
31
+
32
+ When to reach: the bug is in an HTTP endpoint; the server is already runnable locally; the symptom shows up in response body/status/headers.
33
+
34
+ ### 3. CLI fixture
35
+
36
+ For CLI tools, a small shell script that invokes the binary with fixture inputs and `grep`s for the symptom in stdout/stderr. Bash one-liner is enough; no test framework. Capture exit code, stdout, and stderr separately.
37
+
38
+ When to reach: the bug is in a CLI tool; the symptom shows up in stdout/stderr/exit-code; the failure mode is reproducible from a fixed input.
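
The same idea can be sketched in Python when a shell one-liner becomes awkward. The helper name `symptom_reproduced` is hypothetical; the command, fixture input, and symptom string are supplied by the caller.

```python
import subprocess
from typing import List


def symptom_reproduced(cmd: List[str], fixture_input: str, symptom: str) -> bool:
    """Run the CLI once on a fixed input; True means the bug reproduced."""
    proc = subprocess.run(
        cmd,
        input=fixture_input,
        capture_output=True,  # keeps stdout and stderr as separate strings
        text=True,
        timeout=30,
    )
    # Report the three signals separately so a fresh reader can see which one carried the symptom.
    print(f"exit code: {proc.returncode}")
    print(f"stdout: {proc.stdout!r}")
    print(f"stderr: {proc.stderr!r}")
    return symptom in proc.stdout or symptom in proc.stderr
```

A wrapper script can then exit non-zero when the function returns `True`, which makes the loop a single pass/fail command.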
39
+
40
+ ### 4. Headless browser
41
+
42
+ For UI bugs, a Playwright or Puppeteer script that opens the page, performs the trigger action, and screenshots the symptom region. Compare against a known-good fixture or assert on a DOM selector's text/attribute. Use a deterministic seed for any randomness (animations, IDs, dates).
43
+
44
+ When to reach: the bug is visible only in a rendered browser context; DOM-level inspection isn't enough; the symptom involves layout, paint, or interaction timing.
45
+
46
+ ### 5. Trace replay
47
+
48
+ For bugs reproducible from a captured execution trace, replay the trace deterministically. `rr` on Linux for native binaries, record-and-replay plugins for some browsers, Chrome DevTools recording for web, or a captured production trace for distributed systems. Replays are bit-for-bit reproducible — the gold standard for non-deterministic bugs that have been captured once.
49
+
50
+ When to reach: the bug is hard to reproduce live but you have a captured trace; the trace runtime is available; bit-identical replay is achievable.
51
+
52
+ ### 6. Throwaway harness
53
+
54
+ Write a minimal `main()` that calls the buggy function with known inputs and prints the output. ~10 lines. Side-step framework startup costs entirely. Throw away when fixed. Useful when the test framework's setup is too heavy to iterate quickly.
55
+
56
+ When to reach: the bug is in a single function with a clear input/output contract; framework setup dominates iteration time; you'd rather iterate on a 10-line script than wait for the framework to boot.
57
+
58
+ ### 7. Fuzz
59
+
60
+ When the bug surfaces unpredictably, use property-based testing (Hypothesis, fast-check, jqwik) or AFL/libFuzzer for native code. The fuzzer narrows in on the failing input across runs. Combine with a "minimize" pass to shrink the failing input to its smallest form.
61
+
62
+ When to reach: the input space is exotic (parsers, serializers, network protocols); the bug isn't tied to a specific input you can name; you suspect a class of inputs but can't enumerate it.
63
+
64
+ ### 8. Bisect
65
+
66
+ When you know "it worked on commit X, fails on commit Y," `git bisect` is the loop. Each iteration runs the pass/fail script (test, curl, or CLI fixture) built in an earlier path. It finds the first bad commit in O(log N) steps for N commits in the range. Combine with `git bisect run <script>` to fully automate.
67
+
68
+ When to reach: the bug regressed between two known commits; you have a binary pass/fail script from one of the earlier paths; you don't yet know which commit introduced the regression.
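
A script passed to `git bisect run` communicates through its exit code: 0 marks the commit good, 125 asks git to skip it, and other codes from 1 to 127 mark it bad. The sketch below maps a build step and a loop step onto those codes; the function name `bisect_verdict` and the idea of a separate build command are assumptions for illustration.

```python
import subprocess
from typing import List

GOOD, BAD, SKIP = 0, 1, 125  # exit codes interpreted by `git bisect run`


def bisect_verdict(build_cmd: List[str], loop_cmd: List[str]) -> int:
    """Map one commit's build and loop results onto a bisect exit code."""
    if subprocess.run(build_cmd).returncode != 0:
        return SKIP  # commit cannot be built or tested, so ask git to skip it
    loop_passed = subprocess.run(loop_cmd).returncode == 0
    return GOOD if loop_passed else BAD
```

In a wrapper script the result would be handed to `sys.exit(...)`, and the script itself passed to `git bisect run`.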
69
+
70
+ ### 9. Differential
71
+
72
+ When you have two systems and one's correct, write a differential harness: feed the same input to both, assert outputs match. Cheap for migration bugs (old → new implementation) and protocol bugs (your client vs reference client). The harness becomes the regression test once the bug is fixed.
73
+
74
+ When to reach: a known-good reference implementation exists; the buggy code is supposed to match the reference; input space is enumerable enough to drive both implementations side by side.
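
A differential harness can be very small. The sketch below feeds each input to both implementations and collects the disagreements; `differential` is a hypothetical name, and the two callables stand in for whatever reference and candidate the project actually has.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def differential(reference: Callable[[T], R],
                 candidate: Callable[[T], R],
                 inputs: Iterable[T]) -> List[Tuple[T, R, R]]:
    """Return (input, expected, actual) for every input where the outputs differ."""
    mismatches = []
    for value in inputs:
        expected, actual = reference(value), candidate(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches
```

An empty list is the pass signal; any entry is a concrete failing input that can seed a regression test.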
75
+
76
+ ### 10. HITL bash (Human In The Loop)
77
+
78
+ Last resort. A documented sequence of bash commands a human runs that produces a pass/fail signal. Slower (human in the path), but better than no loop. The agent reads stdout/stderr; the human reads the screen and reports. Use only when no automatable signal is available — physical hardware, vendor portals, manual eyeball checks.
79
+
80
+ When to reach: every other path is blocked by an unautomatable surface; the human cost is acceptable for the duration of the investigation; the loop will be retired or upgraded the moment any earlier path becomes possible.
81
+
82
+ ## Iterate on the loop itself
83
+
84
+ The loop is a first-class artifact. Iterate on it before iterating on hypotheses. A 5-minute loop run 20 times costs 100 minutes; a 5-second loop run 20 times costs 100 seconds. Spending 30 minutes tightening the loop pays back inside the same session.
85
+
86
+ - **Cache setup**: hoist expensive setup (DB seed, fixture load, container start) out of the loop body. Run the loop only on the fast inner part. Use `--no-rebuild`, persistent containers, or test-runner watch modes.
87
+ - **Narrow scope**: if the loop runs the full test suite to verify one bug, narrow to just the failing test/group. Fewer side-effects, faster iterations. `pytest -k`, `jest -t`, `vitest --testNamePattern`, `go test -run` are your friends.
88
+ - **Pin time**: when the bug is time-dependent, freeze the clock (`sinon.useFakeTimers`, `jest.useFakeTimers`, `freezegun`, `time-machine`). Removes wallclock as a variable.
89
+ - **Seed RNG**: every random source gets a fixed seed in the loop (`random.seed(...)`, `rand::SeedableRng`). Sources that cannot be seeded, such as `Math.random` and `crypto.randomBytes`, need to be stubbed or replaced with an injected seeded generator. Determinism over coverage: the loop must be bit-identical given the same code state.
90
+ - **Isolate filesystem**: run in a `tmpdir`; reset between iterations. Avoids "fixed on my machine" via stale state. Bind-mount or copy fixtures into the tmpdir per iteration.
91
+ - **Freeze network**: mock or record/replay all outbound calls (`nock`, `vcr`, `polly.js`, `mitmproxy --server-replay`). Real-network loops are non-deterministic by definition.
92
+
93
+ The discipline: every iteration of the loop should be bit-identical given the same code state. If two iterations differ without a code change, the loop has a hidden input — find it and pin it before continuing.
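
One way to check that discipline is to run the loop body twice on unchanged code and compare output digests. The sketch below combines a fresh temporary directory and a seeded RNG for each run; the names `run_isolated` and `has_hidden_input`, and the convention that the loop body returns its output as bytes, are assumptions for illustration.

```python
import hashlib
import random
import tempfile
from pathlib import Path
from typing import Callable

LoopBody = Callable[[Path, random.Random], bytes]


def run_isolated(loop_body: LoopBody, seed: int = 0) -> str:
    """Run one iteration in a fresh temp dir with a seeded RNG; return an output digest."""
    with tempfile.TemporaryDirectory() as workdir:
        output = loop_body(Path(workdir), random.Random(seed))
    return hashlib.sha256(output).hexdigest()


def has_hidden_input(loop_body: LoopBody, seed: int = 0) -> bool:
    """True when two runs on unchanged code produce different output."""
    return run_isolated(loop_body, seed) != run_isolated(loop_body, seed)
```

If `has_hidden_input` returns `True`, something other than the code, the seed, and the working directory is feeding the loop, for example wall-clock time or leftover global state.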
94
+
95
+ ## Non-deterministic bugs
96
+
97
+ Some bugs surface only sometimes. The goal is NOT a clean repro; the goal is to raise the reproduction rate to a debuggable level.
98
+
99
+ - **Measure baseline rate**: run the loop N=20 times. Note pass/fail count. Record the rate so you can tell whether later changes helped.
100
+ - **Raise stressors**: add concurrency, contention, memory pressure, network jitter (use `tc qdisc add dev lo root netem delay 100ms 50ms` on Linux, `Network Link Conditioner` on macOS). Re-measure.
101
+ - **Target the suspect axis**: if you suspect a race, add a deterministic sleep at the suspect point and measure. If reproduction jumps to 100% with sleep, the race is in that region. If it stays at baseline, the race is elsewhere.
102
+ - **A 30% reproduction rate is debuggable.** A 5% rate isn't — keep raising stressors until you cross 30%. At 30% you can iterate; at 5% you're guessing whether your fix helped or you got lucky.
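
The gap between 30% and 5% can be made concrete with a little arithmetic. Assuming runs are independent, the chance of seeing n clean runs by luck alone while the bug is still present is (1 - rate)^n. The sketch below measures a rate and computes how many consecutive clean runs push that chance under a chosen bound; the 5% bound, the independence assumption, and the function names are illustrative choices, not part of the source.

```python
import math
from typing import Callable


def reproduction_rate(loop: Callable[[], bool], runs: int = 20) -> float:
    """Fraction of runs in which the loop reports the bug as reproduced."""
    return sum(1 for _ in range(runs) if loop()) / runs


def clean_runs_needed(rate: float, max_luck: float = 0.05) -> int:
    """Consecutive clean runs needed before luck alone explains them with probability below max_luck."""
    return math.ceil(math.log(max_luck) / math.log(1.0 - rate))
```

Under these assumptions, a 30% rate needs 9 clean runs to trust a fix, while a 5% rate needs 59, which is why raising the rate first pays off.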
103
+
104
+ ## When the loop is good enough
105
+
106
+ The loop is good enough when:
107
+
108
+ - It runs in under a minute (preferably under 10 seconds).
109
+ - It's deterministic (or, for non-determinism, reproduces at least 30% of the time).
110
+ - It's automatable — no human in the inner loop except by explicit choice (Path 10).
111
+ - A fresh agent could pick up the loop and run it without context.
112
+
113
+ Only after the loop is good enough should you proceed to hypothesizing the fix. The hypothesis cycle is governed by `./debugger-philosophy.md` (one variable at a time, do not stop at first plausible cause, the bug is where you didn't look).
114
+
115
+ ## Cross-references
116
+
117
+ - `./debugger-philosophy.md` — companion framing; the hypothesis-cycle discipline that runs in Phase 2 once the loop is in place.
118
+ - `../skills/debug/SKILL.md` — Phase 1 of the debug skill mandates this catalog before any hypothesis generation.
119
+ - `../NOTICE` — full mattpocock/skills MIT attribution.