@zigrivers/scaffold 3.4.1 → 3.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (194)
  1. package/README.md +91 -0
  2. package/content/knowledge/game/game-accessibility.md +328 -0
  3. package/content/knowledge/game/game-ai-patterns.md +542 -0
  4. package/content/knowledge/game/game-asset-pipeline.md +359 -0
  5. package/content/knowledge/game/game-audio-design.md +342 -0
  6. package/content/knowledge/game/game-binary-vcs-strategy.md +396 -0
  7. package/content/knowledge/game/game-design-document.md +260 -0
  8. package/content/knowledge/game/game-domain-patterns.md +297 -0
  9. package/content/knowledge/game/game-economy-design.md +355 -0
  10. package/content/knowledge/game/game-engine-selection.md +242 -0
  11. package/content/knowledge/game/game-input-systems.md +357 -0
  12. package/content/knowledge/game/game-level-content-design.md +455 -0
  13. package/content/knowledge/game/game-liveops-analytics.md +280 -0
  14. package/content/knowledge/game/game-localization.md +323 -0
  15. package/content/knowledge/game/game-milestone-definitions.md +337 -0
  16. package/content/knowledge/game/game-modding-ugc.md +390 -0
  17. package/content/knowledge/game/game-narrative-design.md +404 -0
  18. package/content/knowledge/game/game-networking.md +391 -0
  19. package/content/knowledge/game/game-performance-budgeting.md +378 -0
  20. package/content/knowledge/game/game-platform-certification.md +417 -0
  21. package/content/knowledge/game/game-project-structure.md +360 -0
  22. package/content/knowledge/game/game-save-systems.md +452 -0
  23. package/content/knowledge/game/game-testing-strategy.md +470 -0
  24. package/content/knowledge/game/game-ui-patterns.md +475 -0
  25. package/content/knowledge/game/game-vr-ar-design.md +313 -0
  26. package/content/knowledge/review/review-art-bible.md +305 -0
  27. package/content/knowledge/review/review-game-design.md +303 -0
  28. package/content/knowledge/review/review-game-economy.md +272 -0
  29. package/content/knowledge/review/review-netcode.md +280 -0
  30. package/content/knowledge/review/review-platform-cert.md +341 -0
  31. package/content/methodology/custom-defaults.yml +25 -0
  32. package/content/methodology/deep.yml +25 -0
  33. package/content/methodology/game-overlay.yml +145 -0
  34. package/content/methodology/mvp.yml +25 -0
  35. package/content/pipeline/architecture/ai-behavior-design.md +87 -0
  36. package/content/pipeline/architecture/netcode-spec.md +86 -0
  37. package/content/pipeline/architecture/review-netcode.md +78 -0
  38. package/content/pipeline/foundation/performance-budgets.md +91 -0
  39. package/content/pipeline/modeling/narrative-bible.md +84 -0
  40. package/content/pipeline/pre/game-design-document.md +89 -0
  41. package/content/pipeline/pre/review-gdd.md +74 -0
  42. package/content/pipeline/quality/analytics-telemetry.md +98 -0
  43. package/content/pipeline/quality/live-ops-plan.md +99 -0
  44. package/content/pipeline/quality/platform-cert-prep.md +129 -0
  45. package/content/pipeline/quality/playtest-plan.md +83 -0
  46. package/content/pipeline/specification/art-bible.md +87 -0
  47. package/content/pipeline/specification/audio-design.md +96 -0
  48. package/content/pipeline/specification/content-structure-design.md +141 -0
  49. package/content/pipeline/specification/economy-design.md +104 -0
  50. package/content/pipeline/specification/game-accessibility.md +82 -0
  51. package/content/pipeline/specification/game-ui-spec.md +97 -0
  52. package/content/pipeline/specification/input-controls-spec.md +81 -0
  53. package/content/pipeline/specification/localization-plan.md +113 -0
  54. package/content/pipeline/specification/modding-ugc-spec.md +116 -0
  55. package/content/pipeline/specification/online-services-spec.md +104 -0
  56. package/content/pipeline/specification/review-economy.md +87 -0
  57. package/content/pipeline/specification/review-game-ui.md +73 -0
  58. package/content/pipeline/specification/save-system-spec.md +116 -0
  59. package/dist/cli/commands/adopt.d.ts.map +1 -1
  60. package/dist/cli/commands/adopt.js +25 -0
  61. package/dist/cli/commands/adopt.js.map +1 -1
  62. package/dist/cli/commands/adopt.test.js +28 -1
  63. package/dist/cli/commands/adopt.test.js.map +1 -1
  64. package/dist/cli/commands/build.test.js +3 -0
  65. package/dist/cli/commands/build.test.js.map +1 -1
  66. package/dist/cli/commands/init.d.ts +1 -0
  67. package/dist/cli/commands/init.d.ts.map +1 -1
  68. package/dist/cli/commands/init.js +6 -0
  69. package/dist/cli/commands/init.js.map +1 -1
  70. package/dist/cli/commands/init.test.js +12 -1
  71. package/dist/cli/commands/init.test.js.map +1 -1
  72. package/dist/cli/commands/knowledge.test.js +8 -0
  73. package/dist/cli/commands/knowledge.test.js.map +1 -1
  74. package/dist/cli/commands/next.d.ts.map +1 -1
  75. package/dist/cli/commands/next.js +19 -5
  76. package/dist/cli/commands/next.js.map +1 -1
  77. package/dist/cli/commands/next.test.js +56 -0
  78. package/dist/cli/commands/next.test.js.map +1 -1
  79. package/dist/cli/commands/rework.d.ts.map +1 -1
  80. package/dist/cli/commands/rework.js +11 -2
  81. package/dist/cli/commands/rework.js.map +1 -1
  82. package/dist/cli/commands/rework.test.js +5 -0
  83. package/dist/cli/commands/rework.test.js.map +1 -1
  84. package/dist/cli/commands/run.d.ts.map +1 -1
  85. package/dist/cli/commands/run.js +54 -4
  86. package/dist/cli/commands/run.js.map +1 -1
  87. package/dist/cli/commands/run.test.js +384 -0
  88. package/dist/cli/commands/run.test.js.map +1 -1
  89. package/dist/cli/commands/skip.test.js +3 -0
  90. package/dist/cli/commands/skip.test.js.map +1 -1
  91. package/dist/cli/commands/status.d.ts.map +1 -1
  92. package/dist/cli/commands/status.js +16 -3
  93. package/dist/cli/commands/status.js.map +1 -1
  94. package/dist/cli/commands/status.test.js +55 -0
  95. package/dist/cli/commands/status.test.js.map +1 -1
  96. package/dist/cli/output/auto.d.ts +3 -0
  97. package/dist/cli/output/auto.d.ts.map +1 -1
  98. package/dist/cli/output/auto.js +9 -0
  99. package/dist/cli/output/auto.js.map +1 -1
  100. package/dist/cli/output/context.d.ts +6 -0
  101. package/dist/cli/output/context.d.ts.map +1 -1
  102. package/dist/cli/output/context.js.map +1 -1
  103. package/dist/cli/output/context.test.js +87 -0
  104. package/dist/cli/output/context.test.js.map +1 -1
  105. package/dist/cli/output/error-display.test.js +3 -0
  106. package/dist/cli/output/error-display.test.js.map +1 -1
  107. package/dist/cli/output/interactive.d.ts +3 -0
  108. package/dist/cli/output/interactive.d.ts.map +1 -1
  109. package/dist/cli/output/interactive.js +76 -0
  110. package/dist/cli/output/interactive.js.map +1 -1
  111. package/dist/cli/output/json.d.ts +3 -0
  112. package/dist/cli/output/json.d.ts.map +1 -1
  113. package/dist/cli/output/json.js +9 -0
  114. package/dist/cli/output/json.js.map +1 -1
  115. package/dist/config/loader.d.ts.map +1 -1
  116. package/dist/config/loader.js +3 -2
  117. package/dist/config/loader.js.map +1 -1
  118. package/dist/config/schema.d.ts +641 -15
  119. package/dist/config/schema.d.ts.map +1 -1
  120. package/dist/config/schema.js +26 -1
  121. package/dist/config/schema.js.map +1 -1
  122. package/dist/config/schema.test.js +192 -1
  123. package/dist/config/schema.test.js.map +1 -1
  124. package/dist/core/assembly/overlay-loader.d.ts +24 -0
  125. package/dist/core/assembly/overlay-loader.d.ts.map +1 -0
  126. package/dist/core/assembly/overlay-loader.js +190 -0
  127. package/dist/core/assembly/overlay-loader.js.map +1 -0
  128. package/dist/core/assembly/overlay-loader.test.d.ts +2 -0
  129. package/dist/core/assembly/overlay-loader.test.d.ts.map +1 -0
  130. package/dist/core/assembly/overlay-loader.test.js +106 -0
  131. package/dist/core/assembly/overlay-loader.test.js.map +1 -0
  132. package/dist/core/assembly/overlay-resolver.d.ts +15 -0
  133. package/dist/core/assembly/overlay-resolver.d.ts.map +1 -0
  134. package/dist/core/assembly/overlay-resolver.js +58 -0
  135. package/dist/core/assembly/overlay-resolver.js.map +1 -0
  136. package/dist/core/assembly/overlay-resolver.test.d.ts +2 -0
  137. package/dist/core/assembly/overlay-resolver.test.d.ts.map +1 -0
  138. package/dist/core/assembly/overlay-resolver.test.js +246 -0
  139. package/dist/core/assembly/overlay-resolver.test.js.map +1 -0
  140. package/dist/core/assembly/overlay-state-resolver.d.ts +26 -0
  141. package/dist/core/assembly/overlay-state-resolver.d.ts.map +1 -0
  142. package/dist/core/assembly/overlay-state-resolver.js +63 -0
  143. package/dist/core/assembly/overlay-state-resolver.js.map +1 -0
  144. package/dist/core/assembly/overlay-state-resolver.test.d.ts +2 -0
  145. package/dist/core/assembly/overlay-state-resolver.test.d.ts.map +1 -0
  146. package/dist/core/assembly/overlay-state-resolver.test.js +256 -0
  147. package/dist/core/assembly/overlay-state-resolver.test.js.map +1 -0
  148. package/dist/core/assembly/preset-loader.d.ts +1 -0
  149. package/dist/core/assembly/preset-loader.d.ts.map +1 -1
  150. package/dist/core/assembly/preset-loader.js +2 -0
  151. package/dist/core/assembly/preset-loader.js.map +1 -1
  152. package/dist/core/dependency/eligibility.test.js +3 -0
  153. package/dist/core/dependency/eligibility.test.js.map +1 -1
  154. package/dist/e2e/game-pipeline.test.d.ts +10 -0
  155. package/dist/e2e/game-pipeline.test.d.ts.map +1 -0
  156. package/dist/e2e/game-pipeline.test.js +298 -0
  157. package/dist/e2e/game-pipeline.test.js.map +1 -0
  158. package/dist/e2e/init.test.js +3 -0
  159. package/dist/e2e/init.test.js.map +1 -1
  160. package/dist/project/adopt.d.ts +3 -1
  161. package/dist/project/adopt.d.ts.map +1 -1
  162. package/dist/project/adopt.js +29 -1
  163. package/dist/project/adopt.js.map +1 -1
  164. package/dist/project/adopt.test.js +51 -1
  165. package/dist/project/adopt.test.js.map +1 -1
  166. package/dist/types/config.d.ts +50 -4
  167. package/dist/types/config.d.ts.map +1 -1
  168. package/dist/types/config.test.d.ts +2 -0
  169. package/dist/types/config.test.d.ts.map +1 -0
  170. package/dist/types/config.test.js +97 -0
  171. package/dist/types/config.test.js.map +1 -0
  172. package/dist/utils/eligible.d.ts +3 -2
  173. package/dist/utils/eligible.d.ts.map +1 -1
  174. package/dist/utils/eligible.js +18 -4
  175. package/dist/utils/eligible.js.map +1 -1
  176. package/dist/utils/errors.d.ts +4 -0
  177. package/dist/utils/errors.d.ts.map +1 -1
  178. package/dist/utils/errors.js +31 -0
  179. package/dist/utils/errors.js.map +1 -1
  180. package/dist/utils/errors.test.js +4 -1
  181. package/dist/utils/errors.test.js.map +1 -1
  182. package/dist/wizard/questions.d.ts +4 -0
  183. package/dist/wizard/questions.d.ts.map +1 -1
  184. package/dist/wizard/questions.js +59 -1
  185. package/dist/wizard/questions.js.map +1 -1
  186. package/dist/wizard/questions.test.js +178 -4
  187. package/dist/wizard/questions.test.js.map +1 -1
  188. package/dist/wizard/wizard.d.ts +1 -0
  189. package/dist/wizard/wizard.d.ts.map +1 -1
  190. package/dist/wizard/wizard.js +4 -1
  191. package/dist/wizard/wizard.js.map +1 -1
  192. package/dist/wizard/wizard.test.js +102 -4
  193. package/dist/wizard/wizard.test.js.map +1 -1
  194. package/package.json +1 -1
@@ -0,0 +1,470 @@
1
+ ---
2
+ name: game-testing-strategy
3
+ description: Simulation logic testing, visual and performance regression, soak testing, balance validation, playtest protocols, and CI integration
4
+ topics: [game-dev, testing, playtesting, soak, balance, visual-regression]
5
+ ---
6
+
7
+ Game testing spans a far wider range than typical software testing. Beyond unit and integration tests for code, games require visual regression testing (did the shader change break the look of every material?), performance regression testing (did the new particle system push frame time over budget?), soak testing (does the game crash after 48 hours of continuous play?), balance validation (is the new weapon overpowered?), and structured playtesting with real humans. Each testing layer catches a different class of defect, and skipping any layer means shipping that class of bug to players.
8
+
9
+ ## Summary
10
+
11
+ ### Testing Layers for Games
12
+
13
+ Game testing is organized into layers, each targeting different risk categories:
14
+
15
+ 1. **Simulation Logic Tests** — Automated tests that verify game rules, state machines, inventory calculations, damage formulas, AI decision logic, and other systems that can be isolated from rendering and input. These run fast, headlessly, and in CI. Frame as "simulation logic" rather than "deterministic" — true determinism is nearly impossible when standard Unity/Unreal physics engines introduce floating-point non-determinism across platforms and frame rates.
16
+
17
+ 2. **Integration Tests** — Tests that verify interactions between subsystems: does picking up an item correctly update the inventory UI, play the pickup sound, trigger the quest objective, and despawn the world object? Requires a running game instance but can be automated with scripted input.
18
+
19
+ 3. **Visual Regression Tests** — Screenshot comparison tests that detect unintended visual changes. Capture baseline screenshots of key scenes and compare against new builds pixel-by-pixel (with a threshold for acceptable variation). Catches shader bugs, broken materials, missing textures, and lighting changes.
20
+
21
+ 4. **Performance Regression Tests** — Automated frame timing measurements against budgets. Run standardized benchmark scenarios and fail the build if P95 frame time exceeds the target. Catches performance regressions before they compound.
22
+
23
+ 5. **Soak Tests** — Extended-duration tests (24–72 hours) that detect memory leaks, resource exhaustion, crash bugs, and degradation over time. Essential for live-service games and any game where players leave sessions running.
24
+
25
+ 6. **Balance Validation** — Automated simulation of thousands of combat encounters, economy transactions, or progression paths to detect statistical outliers. Not a substitute for human judgment but catches egregious balance errors before playtesting.
26
+
27
+ 7. **Playtest Sessions** — Structured observation of real players interacting with the game. Captures UX issues, difficulty spikes, confusion points, and emotional responses that no automated test can detect.
28
+
29
+ 8. **Compatibility Testing** — Testing across target hardware configurations, OS versions, driver versions, and peripheral combinations. Console certification testing is a specialized form of compatibility testing.
30
+
31
+ ### Simulation Logic Testing Principles
32
+
33
+ Simulation logic tests are the backbone of automated game testing. They verify the rules of the game without requiring a running renderer, audio system, or input device.
34
+
35
+ **What to test as simulation logic:**
36
+ - Damage calculations, healing, buff/debuff application and duration
37
+ - Inventory operations (add, remove, stack, split, capacity limits)
38
+ - Quest state transitions (objectives completed, prerequisites met)
39
+ - AI decision trees and utility scores given known inputs
40
+ - Crafting recipes (valid combinations, resource consumption, output)
41
+ - Economy transactions (buy, sell, trade, currency conversion)
42
+ - Spawn rules (wave composition, difficulty scaling, spawn point selection)
43
+ - Save/load round-trip fidelity (save state, load state, verify equivalence)
44
+
45
+ **What is NOT suitable for simulation logic tests:**
46
+ - Rendering correctness (use visual regression)
47
+ - Input feel and responsiveness (use playtesting)
48
+ - Audio mixing and spatialization (use listening sessions)
49
+ - Physics emergent behavior (use integration tests with replay)
50
+ - Performance characteristics (use performance regression)
51
+
52
+ ### Automated Replay Systems
53
+
54
+ Replay systems record player input and game state, then replay them to reproduce bugs, run regression tests, and generate screenshots for visual comparison.
55
+
56
+ ## Deep Guidance
57
+
58
+ ### Simulation Logic Test Implementation
59
+
60
+ The key to testable game logic is separating simulation from presentation. Game logic that depends on MonoBehaviour lifecycle, Update loops, or renderer state is untestable in isolation. Extract logic into plain classes.
61
+
62
+ ```csharp
+ using System;            // Math.Clamp and Math.Max (Math.Clamp needs .NET Standard 2.1 or later)
+ using NUnit.Framework;   // [TestFixture], [Test], Assert
+
+ // GOOD: Testable damage calculation. Pure function, no engine dependencies.
+ public static class DamageCalculator
+ {
+     public struct DamageInput
+     {
+         public float BaseDamage;
+         public float AttackerLevel;
+         public float DefenderArmor;
+         public float DefenderLevel;
+         public DamageType Type;          // enum assumed to be defined elsewhere in the game code
+         public float CritMultiplier;     // 1.0 = no crit
+         public float[] ActiveBuffMultipliers;
+         public float[] ActiveDebuffMultipliers;
+     }
+
+     public struct DamageResult
+     {
+         public float RawDamage;
+         public float MitigatedDamage;
+         public float FinalDamage;
+         public bool WasResisted;
+         public bool WasLethal;
+     }
+
+     public static DamageResult Calculate(DamageInput input, float defenderCurrentHP)
+     {
+         // Step 1: Base damage with level scaling
+         // (System.Math instead of UnityEngine.Mathf keeps this class engine-free)
+         float levelDelta = input.AttackerLevel - input.DefenderLevel;
+         float levelScalar = Math.Clamp(1f + (levelDelta * 0.05f), 0.5f, 2.0f);
+         float raw = input.BaseDamage * levelScalar * input.CritMultiplier;
+
+         // Step 2: Apply attacker buffs (multiplicative stacking)
+         float buffTotal = 1f;
+         foreach (float buff in input.ActiveBuffMultipliers)
+             buffTotal *= buff;
+         raw *= buffTotal;
+
+         // Step 3: Apply defender debuffs
+         float debuffTotal = 1f;
+         foreach (float debuff in input.ActiveDebuffMultipliers)
+             debuffTotal *= debuff;
+         raw *= debuffTotal;
+
+         // Step 4: Armor mitigation (diminishing returns formula)
+         float armorReduction = input.DefenderArmor / (input.DefenderArmor + 100f);
+         float mitigated = raw * (1f - armorReduction);
+
+         // Step 5: Floor at 1 damage (no zero-damage hits)
+         float final_dmg = Math.Max(1f, mitigated);
+
+         return new DamageResult
+         {
+             RawDamage = raw,
+             MitigatedDamage = mitigated,
+             FinalDamage = final_dmg,
+             // >= so that exactly 80% reduction (armor 400) counts as resisted
+             WasResisted = armorReduction >= 0.8f,
+             WasLethal = final_dmg >= defenderCurrentHP,
+         };
+     }
+ }
+
+ // Test class: runs under NUnit (the attributes below are NUnit's) without the Unity Editor
+ [TestFixture]
+ public class DamageCalculatorTests
+ {
+     [Test]
+     public void SameLevelNoCritNoArmor_ReturnsBaseDamage()
+     {
+         var input = new DamageCalculator.DamageInput
+         {
+             BaseDamage = 100f,
+             AttackerLevel = 10f,
+             DefenderLevel = 10f,
+             DefenderArmor = 0f,
+             CritMultiplier = 1f,
+             ActiveBuffMultipliers = new float[0],
+             ActiveDebuffMultipliers = new float[0],
+         };
+         var result = DamageCalculator.Calculate(input, 500f);
+         Assert.AreEqual(100f, result.FinalDamage, 0.01f);
+         Assert.IsFalse(result.WasLethal);
+     }
+
+     [Test]
+     public void HighArmor_ReducesDamageWithDiminishingReturns()
+     {
+         var input = new DamageCalculator.DamageInput
+         {
+             BaseDamage = 100f,
+             AttackerLevel = 10f,
+             DefenderLevel = 10f,
+             DefenderArmor = 400f, // 400/(400+100) = 80% reduction
+             CritMultiplier = 1f,
+             ActiveBuffMultipliers = new float[0],
+             ActiveDebuffMultipliers = new float[0],
+         };
+         var result = DamageCalculator.Calculate(input, 500f);
+         Assert.AreEqual(20f, result.FinalDamage, 0.01f);
+         Assert.IsTrue(result.WasResisted); // 80% mitigation meets the threshold
+     }
+
+     [Test]
+     public void MinimumDamage_NeverBelowOne()
+     {
+         var input = new DamageCalculator.DamageInput
+         {
+             BaseDamage = 1f,
+             AttackerLevel = 1f,
+             DefenderLevel = 50f,
+             DefenderArmor = 9999f,
+             CritMultiplier = 1f,
+             ActiveBuffMultipliers = new float[0],
+             ActiveDebuffMultipliers = new float[0],
+         };
+         var result = DamageCalculator.Calculate(input, 500f);
+         Assert.GreaterOrEqual(result.FinalDamage, 1f);
+     }
+ }
+ ```
182
+
183
+ ### Visual Regression Testing
184
+
185
+ Visual regression tests capture screenshots of specific game scenes and compare them against approved baselines to detect unintended visual changes.
186
+
187
+ **Implementation approach:**
188
+
189
+ 1. Define a set of "visual test scenes" — minimal scenes that isolate specific visual features (a scene with only the skybox, a scene with a character under standard lighting, a scene with all UI elements visible)
190
+ 2. Write an automated script that loads each scene, waits for rendering to stabilize, and captures a screenshot at a fixed resolution
191
+ 3. Compare each screenshot against the approved baseline using a pixel-difference algorithm with a configurable threshold (typically 0.1–1% pixel difference tolerance)
192
+ 4. Fail the build if any scene exceeds the threshold
193
+
194
+ **Screenshot comparison pipeline:**
195
+
196
+ ```python
+ # visual_regression.py: compare screenshots against baselines
+ # Requires: Python 3.9+, Pillow (PIL)
+
+ from PIL import Image, ImageChops
+ import sys
+ from pathlib import Path
+
+ BASELINE_DIR = Path("tests/visual/baselines")
+ CURRENT_DIR = Path("tests/visual/current")
+ DIFF_DIR = Path("tests/visual/diffs")
+ THRESHOLD_PERCENT = 0.5  # Max acceptable pixel difference
+
+ def compare_images(baseline_path: Path, current_path: Path) -> tuple[float, Path]:
+     """Compare two images, return difference percentage and diff image path."""
+     baseline = Image.open(baseline_path).convert("RGB")
+     current = Image.open(current_path).convert("RGB")
+
+     if baseline.size != current.size:
+         raise ValueError(
+             f"Size mismatch: baseline={baseline.size}, current={current.size}"
+         )
+
+     diff = ImageChops.difference(baseline, current)
+     # Convert to grayscale for simpler analysis
+     diff_gray = diff.convert("L")
+
+     # Count pixels that differ by more than a small noise threshold
+     pixels = list(diff_gray.getdata())
+     total_pixels = len(pixels)
+     changed_pixels = sum(1 for p in pixels if p > 10)  # ignore noise <= 10/255
+     diff_percent = (changed_pixels / total_pixels) * 100
+
+     # Save diff image (amplified for visibility)
+     diff_amplified = diff_gray.point(lambda x: min(255, x * 10))
+     diff_path = DIFF_DIR / f"diff_{baseline_path.stem}.png"
+     diff_amplified.save(diff_path)
+
+     return diff_percent, diff_path
+
+ def main():
+     DIFF_DIR.mkdir(parents=True, exist_ok=True)
+     failures = []
+
+     for baseline_file in sorted(BASELINE_DIR.glob("*.png")):
+         current_file = CURRENT_DIR / baseline_file.name
+         if not current_file.exists():
+             failures.append(f"MISSING: {baseline_file.name} - no current screenshot")
+             continue
+
+         try:
+             diff_pct, diff_path = compare_images(baseline_file, current_file)
+             status = "PASS" if diff_pct <= THRESHOLD_PERCENT else "FAIL"
+             print(f"  {status}: {baseline_file.name} - {diff_pct:.2f}% difference")
+             if diff_pct > THRESHOLD_PERCENT:
+                 failures.append(
+                     f"{baseline_file.name}: {diff_pct:.2f}% diff "
+                     f"(threshold: {THRESHOLD_PERCENT}%) - see {diff_path}"
+                 )
+         except ValueError as e:
+             failures.append(f"ERROR: {baseline_file.name} - {e}")
+
+     if failures:
+         print("\nVISUAL REGRESSION FAILURES:")
+         for f in failures:
+             print(f"  {f}")
+         print("\nTo update baselines (after visual review):")
+         print(f"  cp {CURRENT_DIR}/*.png {BASELINE_DIR}/")
+         sys.exit(1)
+     else:
+         print(f"\nAll {len(list(BASELINE_DIR.glob('*.png')))} visual tests passed.")
+
+ if __name__ == "__main__":
+     main()
+ ```
272
+
273
+ ### Performance Regression Testing
274
+
275
+ Performance regression tests run standardized gameplay scenarios and measure frame timing, memory usage, and draw call counts against defined budgets.
276
+
277
+ **Automated benchmark framework:**
278
+
279
+ 1. Define benchmark scenarios as recorded input sequences or scripted camera paths that exercise specific subsystems (combat encounter, dense environment, UI-heavy screen)
280
+ 2. Run each scenario for a fixed duration (30–60 seconds) while capturing frame timing data
281
+ 3. Calculate P50, P95, P99 frame times and compare against per-scenario budgets
282
+ 4. Track results over time to detect gradual degradation (a 0.5ms regression per week adds up to about 2ms over a four-week month)
283
+
284
+ **Metrics to capture per benchmark:**
285
+ - Frame time: P50, P95, P99, max
286
+ - CPU time: total and per-subsystem (physics, AI, rendering prep, audio)
287
+ - GPU time: total and per-pass (shadows, lighting, post-process)
288
+ - Draw calls: average and peak
289
+ - Memory: peak allocated, peak committed, GC pause count and duration
290
+ - Loading time: scene load duration
291
+
292
+ **Regression detection:**
293
+ - Compare against the previous successful build (catch immediate regressions)
294
+ - Compare against a weekly rolling baseline (catch gradual drift)
295
+ - Alert on any P95 frame time increase greater than 1ms
296
+ - Alert on any memory increase greater than 50MB
297
+ - Alert on any draw call increase greater than 500
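
The percentile calculation and the P95 alert rule above can be sketched in Python roughly as follows. This is a minimal illustration, not part of the package: the function names, the nearest-rank percentile method, and the budget value in the usage note are assumptions chosen for the example.

```python
import math

def percentile(samples_ms, pct):
    """Nearest-rank percentile (pct is an integer 1..100) of frame times in ms."""
    if not samples_ms:
        raise ValueError("no frame samples captured")
    ordered = sorted(samples_ms)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(samples_ms):
    """Collapse one benchmark run into the frame-time metrics listed above."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
        "max": max(samples_ms),
    }

def check_p95(current, previous, budget_ms, max_increase_ms=1.0):
    """Return a list of failure messages; an empty list means the run passed."""
    failures = []
    if current["p95"] > budget_ms:
        failures.append(
            f"P95 {current['p95']:.2f} ms exceeds budget {budget_ms:.2f} ms"
        )
    increase = current["p95"] - previous["p95"]
    if increase > max_increase_ms:
        failures.append(
            f"P95 rose by {increase:.2f} ms since the previous build "
            f"(limit: {max_increase_ms:.2f} ms)"
        )
    return failures
```

A CI step might call `check_p95(summarize(frame_times), previous_summary, budget_ms=16.67)` and fail the build when the returned list is non-empty. The 16.67 ms figure assumes a 60 FPS target; the same pattern extends to the memory and draw-call thresholds.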
298
+
299
+ ### Soak Testing
300
+
301
+ Soak tests run the game continuously for 24–72 hours to detect issues that only manifest over extended play sessions.
302
+
303
+ **What soak tests catch:**
304
+ - Memory leaks (gradual allocation growth that eventually exhausts available memory)
305
+ - Handle leaks (file handles, GPU resources, network sockets not properly released)
306
+ - Numerical drift (floating-point accumulation errors in world position, camera, or physics)
307
+ - State corruption (rare race conditions that require thousands of iterations to trigger)
308
+ - Server stability (for multiplayer: connection handling, session management, database growth)
309
+
310
+ **Soak test implementation:**
311
+ 1. Create an automated player that loops through core gameplay: load level, play for 10 minutes, return to menu, load next level, repeat
312
+ 2. Log memory usage, frame time, and error counts every 60 seconds
313
+ 3. After the soak period, analyze the time-series data for upward trends in memory or downward trends in frame rate
314
+ 4. Any error or crash during the soak period is a P0 bug: the game must stay stable for the entire soak run, with 72 hours as the bar for the longest runs
315
+
316
+ **Soak test analysis:**
317
+ - Plot memory usage over time — a linear upward trend indicates a leak
318
+ - Plot frame time over time — upward drift indicates resource accumulation
319
+ - Count unique error messages — any new error type appearing after hour 1 suggests time-dependent issues
320
+ - Monitor GC frequency — increasing GC frequency indicates allocation pressure building
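
As a rough sketch of the trend analysis above, a least-squares slope over the logged memory samples gives a growth rate that can be compared against a limit. The function names and the 5 MB/hour limit are hypothetical; a real project would pick a limit from its own memory budget.

```python
def growth_per_hour(values, interval_s=60.0):
    """Least-squares slope of evenly spaced samples, converted to units per hour."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = [i * interval_s for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope_per_second = numerator / denominator
    return slope_per_second * 3600.0

def leak_suspected(memory_mb, interval_s=60.0, max_growth_mb_per_hour=5.0):
    """Flag a run whose fitted memory growth exceeds the (assumed) limit."""
    return growth_per_hour(memory_mb, interval_s) > max_growth_mb_per_hour
```

Fitting a line over the whole run is less sensitive to GC sawtooth noise than comparing the first and last samples. The same helper can be applied to frame time, where a positive slope suggests resource accumulation.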
321
+
322
+ ### Balance Validation
323
+
324
+ Automated balance testing simulates thousands of game scenarios to detect statistical outliers in combat, economy, or progression systems.
325
+
326
+ **Monte Carlo combat simulation:**
327
+ - Simulate 10,000 encounters for each enemy type against a reference player build
328
+ - Record win rate, average time-to-kill, average damage taken, healing consumed
329
+ - Flag any encounter where win rate deviates more than 10% from the design target
330
+ - Flag any weapon/ability where usage rate in simulated optimal play exceeds 60% (dominance indicator)
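
A toy version of the Monte Carlo loop above, in Python. Everything here is a simplified assumption for illustration: the turn-based combat model, the `hit_chance` parameter, and the reading of "10%" as 10 percentage points of win rate. A real simulation would call the game's own combat rules.

```python
import random

def simulate_encounter(rng, player_hp, player_dmg, enemy_hp, enemy_dmg, hit_chance=0.8):
    """Toy alternating-turn fight; returns True if the player wins."""
    if hit_chance <= 0 or (player_dmg <= 0 and enemy_dmg <= 0):
        raise ValueError("encounter can never finish with these parameters")
    while True:
        if rng.random() < hit_chance:      # player attacks first
            enemy_hp -= player_dmg
            if enemy_hp <= 0:
                return True
        if rng.random() < hit_chance:      # enemy attacks back
            player_hp -= enemy_dmg
            if player_hp <= 0:
                return False

def win_rate(encounters, seed, **stats):
    """Run many encounters with a seeded RNG so results are reproducible."""
    rng = random.Random(seed)
    wins = sum(simulate_encounter(rng, **stats) for _ in range(encounters))
    return wins / encounters

def is_outlier(observed, target, tolerance=0.10):
    """Flag a win rate more than `tolerance` away from the design target."""
    return abs(observed - target) > tolerance
```

For example, `is_outlier(win_rate(10_000, seed=1, player_hp=100, player_dmg=12, enemy_hp=80, enemy_dmg=9), target=0.7)` would flag that enemy type if its simulated win rate strays too far from a 70% design target. Seeding the generator keeps a flagged result reproducible when a designer investigates it.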
331
+
332
+ **Economy simulation:**
333
+ - Simulate 1,000 player progression paths through the game's economy
334
+ - Track currency accumulation rate, item acquisition rate, and power growth curve
335
+ - Flag if any path results in a player being unable to afford required purchases (soft lock)
336
+ - Flag if any path results in currency accumulation exceeding 3x the intended rate (exploit indicator)
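
The two economy flags above can be sketched the same way. This is a deliberately small model under stated assumptions: each level grants one random reward and requires one purchase, and `intended_per_level` stands in for the designed earning rate. None of these names come from the package.

```python
import random

def simulate_path(rng, required_costs, reward_range):
    """Walk one progression path; returns (soft_locked, total_earned)."""
    low, high = reward_range
    balance = 0
    earned = 0
    for cost in required_costs:
        reward = rng.randint(low, high)
        balance += reward
        earned += reward
        if balance < cost:
            return True, earned        # cannot afford a required purchase
        balance -= cost
    return False, earned

def audit_paths(paths, seed, required_costs, reward_range, intended_per_level):
    """Count soft locks and paths earning more than 3x the intended rate."""
    rng = random.Random(seed)
    soft_locks = 0
    exploits = 0
    for _ in range(paths):
        locked, earned = simulate_path(rng, required_costs, reward_range)
        if locked:
            soft_locks += 1
        elif earned / len(required_costs) > 3 * intended_per_level:
            exploits += 1
    return {"soft_locks": soft_locks, "exploits": exploits}
```

A call such as `audit_paths(1000, seed=3, required_costs=[50, 80, 120], reward_range=(40, 150), intended_per_level=90)` returns the two counts; any non-zero value points at a path worth inspecting by hand.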
337
+
338
+ ### Playtest Protocol
339
+
340
+ Structured playtesting follows a repeatable protocol to produce actionable data.
341
+
342
+ **Pre-session setup:**
343
+ 1. Define the playtest goals (e.g., "evaluate onboarding flow for new players" or "assess difficulty of World 3 boss")
344
+ 2. Prepare the build — stable, no known crashes, telemetry enabled
345
+ 3. Recruit appropriate testers (new players for onboarding tests, experienced players for difficulty tests, target demographic for overall feel)
346
+ 4. Prepare observation forms with specific questions to answer
347
+
348
+ **During the session:**
349
+ - Observer does NOT help the player unless they are completely stuck for more than 3 minutes
350
+ - Record the session (screen + face cam if consented) for later review
351
+ - Note timestamps of confusion, frustration, delight, and surprise
352
+ - Track objective metrics: time to complete tutorial, deaths per section, items used
353
+
354
+ **Post-session:**
355
+ - Administer a short questionnaire (5–10 questions, Likert scale + free text)
356
+ - Conduct a brief interview asking about specific moments the observer noted
357
+ - Aggregate quantitative data across all testers in the session
358
+ - Create an actionable report: problems ranked by severity, frequency, and impact
359
+
360
+ ### Compatibility Testing Matrix
361
+
362
+ Game compatibility testing covers hardware, OS, and peripheral variations.
363
+
364
+ ```yaml
+ # compatibility_matrix.yaml: minimum test configurations
+
+ pc:
+   gpu_vendors: [nvidia, amd, intel]
+   gpu_tiers:
+     - name: min_spec
+       example: "GTX 1060 / RX 580 / Arc A380"
+       resolution: 1080p
+       settings: low
+     - name: recommended
+       example: "RTX 3060 / RX 6700 XT"
+       resolution: 1440p
+       settings: high
+     - name: high_end
+       example: "RTX 4080 / RX 7900 XTX"
+       resolution: 4K
+       settings: ultra
+   os: ["Windows 10 22H2", "Windows 11 23H2"]
+   drivers: ["latest stable", "one version behind"]
+   ram: [8GB, 16GB, 32GB]
+   storage: [HDD, SATA_SSD, NVMe]
+
+ console:
+   playstation:
+     - PS5 (disc)
+     - PS5 (digital)
+     - PS5 Pro (if applicable)
+   xbox:
+     - Xbox Series X
+     - Xbox Series S  # Lower GPU, lower memory; often the constraint
+   switch:
+     - Switch OLED (docked)
+     - Switch OLED (handheld)
+     - Switch Lite
+
+ mobile:
+   ios:
+     - iPhone 12 (min spec)
+     - iPhone 14 (target)
+     - iPhone 16 Pro (high end)
+     - iPad Air (tablet layout)
+   android:
+     - Samsung Galaxy A54 (mid-range baseline)
+     - Samsung Galaxy S24 (flagship)
+     - Pixel 8 (stock Android)
+     - Xiaomi device (custom Android skin)
+
+ peripherals:
+   controllers:
+     - Xbox Wireless Controller
+     - DualSense (PS5)
+     - Nintendo Pro Controller
+     - Generic XInput gamepad
+   input_devices:
+     - Mouse + keyboard
+     - Steam Deck controls
+     - Touch screen (mobile)
+   displays:
+     - 16:9 (1080p, 1440p, 4K)
+     - 21:9 ultrawide
+     - 16:10 (Steam Deck, some laptops)
+     - Variable refresh rate (G-Sync/FreeSync)
+ ```
428
+
429
+ ### Console Certification Test Procedures
430
+
431
+ Console platform holders require games to pass a certification test suite before they can be published. Failing certification delays launch.
432
+
433
+ **Common certification failure categories:**
434
+ - **Stability**: Crashes, hangs, or infinite loading screens during any reachable game state
435
+ - **Save data**: Failure to handle corrupted save data gracefully, save data exceeding platform limits, losing progress on unexpected power loss
436
+ - **User accounts**: Not handling sign-out during gameplay, not supporting multiple user profiles, not respecting parental controls
437
+ - **Network**: Not handling network disconnection gracefully, not displaying appropriate error messages, not respecting NAT types
438
+ - **Accessibility**: Missing subtitle options (required on some platforms), missing colorblind modes
439
+ - **Performance**: Frame rate below acceptable thresholds, loading times exceeding limits, memory budget violations
440
+
441
+ **Pre-certification checklist (internal):**
442
+ 1. Complete a full playthrough with no crashes on each target SKU
443
+ 2. Test every save/load path, including simulated storage full and corrupt save
444
+ 3. Test network disconnection at every online-capable game state
445
+ 4. Test controller disconnection and reconnection during gameplay
446
+ 5. Test suspend/resume (PS5 rest mode, Xbox Quick Resume, Switch sleep)
447
+ 6. Verify all required platform features (achievements/trophies, rich presence, cloud saves)
448
+ 7. Run the platform holder's own pre-certification tool if available (Sony's submission checker, Microsoft's XR validation tool)
449
+
450
+ ### Automated Replay Systems
451
+
452
+ Replay systems enable reproducible testing by recording and replaying game sessions.
453
+
454
+ **Recording approach:**
455
+ - Record input events (not game state) with frame-accurate timestamps
456
+ - Record the random seed used for the session
457
+ - Record the build version and content hash
458
+ - Store replays as compact binary files (typically 1–5 KB per minute of gameplay)
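
The recording approach above (inputs plus seed plus build, not game state) can be illustrated with a small Python sketch. The `Replay` class, the integer-only toy simulation, and the build string are hypothetical stand-ins; an engine integration would record real input events and step the real simulation at a fixed timestep.

```python
import hashlib
import random
from dataclasses import dataclass, field

@dataclass
class Replay:
    seed: int                                     # random seed for the session
    build: str                                    # build version / content hash
    events: list = field(default_factory=list)    # (frame, action) pairs

    def record(self, frame, action):
        """Store an input event keyed by frame number, not wall-clock time."""
        self.events.append((frame, action))

def simulate(seed, events, frames):
    """Toy fixed-timestep simulation using integer state only."""
    rng = random.Random(seed)
    by_frame = dict(events)          # toy limitation: one action per frame
    x = 0
    for frame in range(frames):
        action = by_frame.get(frame)
        if action == "right":
            x += 10
        elif action == "left":
            x -= 10
        x += rng.randint(-1, 1)      # seeded randomness replays identically
    return x

def state_hash(state):
    """Hash the final state so two runs can be compared cheaply."""
    return hashlib.sha256(str(state).encode("utf-8")).hexdigest()
```

Recording a session and replaying it with the same seed and frame count should produce the same `state_hash`; a mismatch against a newer build is the divergence signal. The toy uses integer state on purpose, which sidesteps the floating-point issues a real physics-driven game has to tolerate or engineer around.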
459
+
460
+ **Replay determinism challenges:**
461
+ - Standard Unity/Unreal physics engines are NOT frame-rate independent in a bit-exact sense — the same inputs at different frame rates produce slightly different outcomes
462
+ - Floating-point operations may produce different results across CPU architectures (x86 vs ARM)
463
+ - Multithreaded systems introduce ordering non-determinism
464
+ - Mitigation: run replays at a fixed timestep matching the recording, accept "close enough" validation rather than bit-exact reproduction, or implement a custom fixed-point simulation layer for competitive games that require exact replay
465
+
466
+ **Replay uses beyond testing:**
467
+ - Kill cam / highlight reel features
468
+ - Anti-cheat validation (replay suspicious sessions server-side)
469
+ - Player behavior analytics (aggregate replay data to find popular paths, death locations, cheese strategies)
470
+ - Regression testing (replay a library of sessions against new builds, flag any that diverge beyond threshold)