@zigrivers/scaffold 3.4.1 → 3.5.1
- package/README.md +91 -0
- package/content/knowledge/game/game-accessibility.md +328 -0
- package/content/knowledge/game/game-ai-patterns.md +567 -0
- package/content/knowledge/game/game-asset-pipeline.md +363 -0
- package/content/knowledge/game/game-audio-design.md +344 -0
- package/content/knowledge/game/game-binary-vcs-strategy.md +396 -0
- package/content/knowledge/game/game-design-document.md +269 -0
- package/content/knowledge/game/game-domain-patterns.md +299 -0
- package/content/knowledge/game/game-economy-design.md +355 -0
- package/content/knowledge/game/game-engine-selection.md +242 -0
- package/content/knowledge/game/game-input-systems.md +379 -0
- package/content/knowledge/game/game-level-content-design.md +483 -0
- package/content/knowledge/game/game-liveops-analytics.md +280 -0
- package/content/knowledge/game/game-localization.md +323 -0
- package/content/knowledge/game/game-milestone-definitions.md +337 -0
- package/content/knowledge/game/game-modding-ugc.md +390 -0
- package/content/knowledge/game/game-narrative-design.md +404 -0
- package/content/knowledge/game/game-networking.md +393 -0
- package/content/knowledge/game/game-performance-budgeting.md +389 -0
- package/content/knowledge/game/game-platform-certification.md +417 -0
- package/content/knowledge/game/game-project-structure.md +360 -0
- package/content/knowledge/game/game-save-systems.md +452 -0
- package/content/knowledge/game/game-testing-strategy.md +470 -0
- package/content/knowledge/game/game-ui-patterns.md +477 -0
- package/content/knowledge/game/game-vr-ar-design.md +313 -0
- package/content/knowledge/review/review-art-bible.md +305 -0
- package/content/knowledge/review/review-game-design.md +303 -0
- package/content/knowledge/review/review-game-economy.md +272 -0
- package/content/knowledge/review/review-game-ui.md +293 -0
- package/content/knowledge/review/review-netcode.md +280 -0
- package/content/knowledge/review/review-platform-cert.md +341 -0
- package/content/methodology/custom-defaults.yml +25 -0
- package/content/methodology/deep.yml +25 -0
- package/content/methodology/game-overlay.yml +145 -0
- package/content/methodology/mvp.yml +25 -0
- package/content/pipeline/architecture/ai-behavior-design.md +87 -0
- package/content/pipeline/architecture/netcode-spec.md +86 -0
- package/content/pipeline/architecture/review-netcode.md +78 -0
- package/content/pipeline/foundation/performance-budgets.md +91 -0
- package/content/pipeline/modeling/narrative-bible.md +84 -0
- package/content/pipeline/pre/game-design-document.md +90 -0
- package/content/pipeline/pre/review-gdd.md +74 -0
- package/content/pipeline/quality/analytics-telemetry.md +98 -0
- package/content/pipeline/quality/live-ops-plan.md +99 -0
- package/content/pipeline/quality/platform-cert-prep.md +129 -0
- package/content/pipeline/quality/playtest-plan.md +84 -0
- package/content/pipeline/specification/art-bible.md +87 -0
- package/content/pipeline/specification/audio-design.md +97 -0
- package/content/pipeline/specification/content-structure-design.md +142 -0
- package/content/pipeline/specification/economy-design.md +105 -0
- package/content/pipeline/specification/game-accessibility.md +82 -0
- package/content/pipeline/specification/game-ui-spec.md +97 -0
- package/content/pipeline/specification/input-controls-spec.md +81 -0
- package/content/pipeline/specification/localization-plan.md +113 -0
- package/content/pipeline/specification/modding-ugc-spec.md +116 -0
- package/content/pipeline/specification/online-services-spec.md +104 -0
- package/content/pipeline/specification/review-economy.md +87 -0
- package/content/pipeline/specification/review-game-ui.md +73 -0
- package/content/pipeline/specification/save-system-spec.md +116 -0
- package/dist/cli/commands/adopt.d.ts.map +1 -1
- package/dist/cli/commands/adopt.js +25 -0
- package/dist/cli/commands/adopt.js.map +1 -1
- package/dist/cli/commands/adopt.test.js +28 -1
- package/dist/cli/commands/adopt.test.js.map +1 -1
- package/dist/cli/commands/build.test.js +3 -0
- package/dist/cli/commands/build.test.js.map +1 -1
- package/dist/cli/commands/init.d.ts +1 -0
- package/dist/cli/commands/init.d.ts.map +1 -1
- package/dist/cli/commands/init.js +6 -0
- package/dist/cli/commands/init.js.map +1 -1
- package/dist/cli/commands/init.test.js +12 -1
- package/dist/cli/commands/init.test.js.map +1 -1
- package/dist/cli/commands/knowledge.test.js +8 -0
- package/dist/cli/commands/knowledge.test.js.map +1 -1
- package/dist/cli/commands/next.d.ts.map +1 -1
- package/dist/cli/commands/next.js +19 -5
- package/dist/cli/commands/next.js.map +1 -1
- package/dist/cli/commands/next.test.js +56 -0
- package/dist/cli/commands/next.test.js.map +1 -1
- package/dist/cli/commands/rework.d.ts.map +1 -1
- package/dist/cli/commands/rework.js +11 -2
- package/dist/cli/commands/rework.js.map +1 -1
- package/dist/cli/commands/rework.test.js +5 -0
- package/dist/cli/commands/rework.test.js.map +1 -1
- package/dist/cli/commands/run.d.ts.map +1 -1
- package/dist/cli/commands/run.js +54 -4
- package/dist/cli/commands/run.js.map +1 -1
- package/dist/cli/commands/run.test.js +384 -0
- package/dist/cli/commands/run.test.js.map +1 -1
- package/dist/cli/commands/skip.test.js +3 -0
- package/dist/cli/commands/skip.test.js.map +1 -1
- package/dist/cli/commands/status.d.ts.map +1 -1
- package/dist/cli/commands/status.js +16 -3
- package/dist/cli/commands/status.js.map +1 -1
- package/dist/cli/commands/status.test.js +55 -0
- package/dist/cli/commands/status.test.js.map +1 -1
- package/dist/cli/output/auto.d.ts +3 -0
- package/dist/cli/output/auto.d.ts.map +1 -1
- package/dist/cli/output/auto.js +9 -0
- package/dist/cli/output/auto.js.map +1 -1
- package/dist/cli/output/context.d.ts +6 -0
- package/dist/cli/output/context.d.ts.map +1 -1
- package/dist/cli/output/context.js.map +1 -1
- package/dist/cli/output/context.test.js +87 -0
- package/dist/cli/output/context.test.js.map +1 -1
- package/dist/cli/output/error-display.test.js +3 -0
- package/dist/cli/output/error-display.test.js.map +1 -1
- package/dist/cli/output/interactive.d.ts +3 -0
- package/dist/cli/output/interactive.d.ts.map +1 -1
- package/dist/cli/output/interactive.js +76 -0
- package/dist/cli/output/interactive.js.map +1 -1
- package/dist/cli/output/json.d.ts +3 -0
- package/dist/cli/output/json.d.ts.map +1 -1
- package/dist/cli/output/json.js +9 -0
- package/dist/cli/output/json.js.map +1 -1
- package/dist/config/loader.d.ts.map +1 -1
- package/dist/config/loader.js +3 -2
- package/dist/config/loader.js.map +1 -1
- package/dist/config/schema.d.ts +641 -15
- package/dist/config/schema.d.ts.map +1 -1
- package/dist/config/schema.js +26 -1
- package/dist/config/schema.js.map +1 -1
- package/dist/config/schema.test.js +192 -1
- package/dist/config/schema.test.js.map +1 -1
- package/dist/core/assembly/overlay-loader.d.ts +24 -0
- package/dist/core/assembly/overlay-loader.d.ts.map +1 -0
- package/dist/core/assembly/overlay-loader.js +190 -0
- package/dist/core/assembly/overlay-loader.js.map +1 -0
- package/dist/core/assembly/overlay-loader.test.d.ts +2 -0
- package/dist/core/assembly/overlay-loader.test.d.ts.map +1 -0
- package/dist/core/assembly/overlay-loader.test.js +106 -0
- package/dist/core/assembly/overlay-loader.test.js.map +1 -0
- package/dist/core/assembly/overlay-resolver.d.ts +15 -0
- package/dist/core/assembly/overlay-resolver.d.ts.map +1 -0
- package/dist/core/assembly/overlay-resolver.js +58 -0
- package/dist/core/assembly/overlay-resolver.js.map +1 -0
- package/dist/core/assembly/overlay-resolver.test.d.ts +2 -0
- package/dist/core/assembly/overlay-resolver.test.d.ts.map +1 -0
- package/dist/core/assembly/overlay-resolver.test.js +246 -0
- package/dist/core/assembly/overlay-resolver.test.js.map +1 -0
- package/dist/core/assembly/overlay-state-resolver.d.ts +26 -0
- package/dist/core/assembly/overlay-state-resolver.d.ts.map +1 -0
- package/dist/core/assembly/overlay-state-resolver.js +63 -0
- package/dist/core/assembly/overlay-state-resolver.js.map +1 -0
- package/dist/core/assembly/overlay-state-resolver.test.d.ts +2 -0
- package/dist/core/assembly/overlay-state-resolver.test.d.ts.map +1 -0
- package/dist/core/assembly/overlay-state-resolver.test.js +256 -0
- package/dist/core/assembly/overlay-state-resolver.test.js.map +1 -0
- package/dist/core/assembly/preset-loader.d.ts +1 -0
- package/dist/core/assembly/preset-loader.d.ts.map +1 -1
- package/dist/core/assembly/preset-loader.js +2 -0
- package/dist/core/assembly/preset-loader.js.map +1 -1
- package/dist/core/dependency/eligibility.test.js +3 -0
- package/dist/core/dependency/eligibility.test.js.map +1 -1
- package/dist/e2e/game-pipeline.test.d.ts +10 -0
- package/dist/e2e/game-pipeline.test.d.ts.map +1 -0
- package/dist/e2e/game-pipeline.test.js +298 -0
- package/dist/e2e/game-pipeline.test.js.map +1 -0
- package/dist/e2e/init.test.js +3 -0
- package/dist/e2e/init.test.js.map +1 -1
- package/dist/project/adopt.d.ts +3 -1
- package/dist/project/adopt.d.ts.map +1 -1
- package/dist/project/adopt.js +29 -1
- package/dist/project/adopt.js.map +1 -1
- package/dist/project/adopt.test.js +51 -1
- package/dist/project/adopt.test.js.map +1 -1
- package/dist/types/config.d.ts +50 -4
- package/dist/types/config.d.ts.map +1 -1
- package/dist/types/config.test.d.ts +2 -0
- package/dist/types/config.test.d.ts.map +1 -0
- package/dist/types/config.test.js +97 -0
- package/dist/types/config.test.js.map +1 -0
- package/dist/utils/eligible.d.ts +3 -2
- package/dist/utils/eligible.d.ts.map +1 -1
- package/dist/utils/eligible.js +18 -4
- package/dist/utils/eligible.js.map +1 -1
- package/dist/utils/errors.d.ts +4 -0
- package/dist/utils/errors.d.ts.map +1 -1
- package/dist/utils/errors.js +31 -0
- package/dist/utils/errors.js.map +1 -1
- package/dist/utils/errors.test.js +4 -1
- package/dist/utils/errors.test.js.map +1 -1
- package/dist/wizard/questions.d.ts +4 -0
- package/dist/wizard/questions.d.ts.map +1 -1
- package/dist/wizard/questions.js +59 -1
- package/dist/wizard/questions.js.map +1 -1
- package/dist/wizard/questions.test.js +178 -4
- package/dist/wizard/questions.test.js.map +1 -1
- package/dist/wizard/wizard.d.ts +1 -0
- package/dist/wizard/wizard.d.ts.map +1 -1
- package/dist/wizard/wizard.js +4 -1
- package/dist/wizard/wizard.js.map +1 -1
- package/dist/wizard/wizard.test.js +102 -4
- package/dist/wizard/wizard.test.js.map +1 -1
- package/package.json +1 -1
|
@@ -0,0 +1,389 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: game-performance-budgeting
|
|
3
|
+
description: Frame budget allocation, memory budgets per platform, GPU/draw call limits, loading targets, thermal constraints, and profiling tools
|
|
4
|
+
topics: [game-dev, performance, frame-budget, memory, gpu, optimization]
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Performance budgeting is the discipline of allocating fixed time, memory, and GPU resource envelopes to each game subsystem and enforcing those limits throughout development. Unlike web performance, where a slow page is merely annoying, a missed frame budget in a game causes visible hitches and input lag, and in VR it can cause motion sickness. Budgets must be established at project start, measured continuously with profiling tools, and treated as hard constraints, not aspirational targets.
|
|
8
|
+
|
|
9
|
+
## Summary
|
|
10
|
+
|
|
11
|
+
### Frame Budget Fundamentals
|
|
12
|
+
|
|
13
|
+
Every rendering frame must complete within a fixed time window determined by the target frame rate:
|
|
14
|
+
|
|
15
|
+
- **60 fps** = 16.67 ms per frame (standard for action games, shooters)
|
|
16
|
+
- **30 fps** = 33.33 ms per frame (acceptable for strategy, narrative, some open-world)
|
|
17
|
+
- **120 fps** = 8.33 ms per frame (competitive shooters, high-refresh displays)
|
|
18
|
+
- **VR targets**: 72 Hz (Quest 2 minimum) = 13.89 ms, 90 Hz (standard) = 11.11 ms, 120 Hz (high-end) = 8.33 ms — missed frames in VR cause motion sickness, making these hard requirements
|
|
19
|
+
|
|
20
|
+
The frame budget is divided across the CPU and GPU. On modern hardware, CPU and GPU run in parallel — the frame time is the **longer** of the two, not their sum. However, CPU work that feeds GPU work (draw call submission, compute dispatch) creates dependencies that can serialize the pipeline.
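
The arithmetic behind these figures is small enough to sketch. The helper names below (`frame_budget_ms`, `fits_budget`) are illustrative assumptions, not part of any engine API, and the model deliberately ignores the CPU-to-GPU dependencies mentioned above.

```python
def frame_budget_ms(target_hz: float) -> float:
    """Time available per frame, in milliseconds, for a given refresh rate."""
    if target_hz <= 0:
        raise ValueError("target_hz must be positive")
    return 1000.0 / target_hz


def fits_budget(cpu_ms: float, gpu_ms: float, target_hz: float) -> bool:
    """CPU and GPU overlap, so the frame costs the longer of the two, not the sum."""
    return max(cpu_ms, gpu_ms) <= frame_budget_ms(target_hz)


if __name__ == "__main__":
    for hz in (30, 60, 72, 90, 120):
        print(f"{hz:>3} Hz -> {frame_budget_ms(hz):.2f} ms per frame")
    # 12 ms of CPU work and 15 ms of GPU work still fit a 60 fps frame.
    print(fits_budget(cpu_ms=12.0, gpu_ms=15.0, target_hz=60))
```

Treating the frame as `max(cpu, gpu)` is the optimistic case; when the pipeline serializes, the real frame time moves toward the sum.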
|
|
21
|
+
|
|
22
|
+
A typical 16.67 ms budget breakdown for a 60 fps action game:
|
|
23
|
+
|
|
24
|
+
- **Game logic / simulation**: 3–4 ms (physics, AI, gameplay scripts, animation)
|
|
25
|
+
- **Rendering preparation**: 2–3 ms (culling, sorting, draw call setup)
|
|
26
|
+
- **Audio**: 1–2 ms (mixing, DSP, spatial calculations)
|
|
27
|
+
- **UI**: 0.5–1 ms (layout, rendering, input processing)
|
|
28
|
+
- **Networking**: 0.5–1 ms (send/receive, serialization, prediction)
|
|
29
|
+
- **Engine overhead**: 1–2 ms (garbage collection, job scheduling, memory management)
|
|
30
|
+
- **Headroom**: 2–4 ms (reserved for spikes, debug builds, min-spec hardware)
|
|
31
|
+
|
|
32
|
+
The headroom allocation is critical. Frame times vary from frame to frame, so a budget that uses 100% of the frame time on average will miss the target on roughly half of all frames.
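
To see why headroom matters, here is a rough sketch that estimates how often a frame overruns. It assumes frame times are normally distributed, which is a simplification (real captures tend to have a long tail of slow frames), and the 1.5 ms standard deviation is an invented example value.

```python
from statistics import NormalDist


def miss_probability(mean_ms: float, stddev_ms: float, budget_ms: float) -> float:
    """Chance that one frame exceeds budget_ms, under a normal-distribution assumption."""
    if stddev_ms <= 0:
        raise ValueError("stddev_ms must be positive")
    return 1.0 - NormalDist(mean_ms, stddev_ms).cdf(budget_ms)


if __name__ == "__main__":
    budget = 16.67
    # Average frame time equal to the budget: about half of all frames miss.
    print(f"no headroom:   {miss_probability(16.67, 1.5, budget):.1%}")
    # 3 ms of headroom with the same variance: misses become rare.
    print(f"3 ms headroom: {miss_probability(13.67, 1.5, budget):.1%}")
```

Because real frame-time distributions are skewed, treat the output as a lower bound on the miss rate, not a prediction.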
|
|
33
|
+
|
|
34
|
+
### Memory Budgets by Platform
|
|
35
|
+
|
|
36
|
+
Memory budgets vary dramatically across target platforms:
|
|
37
|
+
|
|
38
|
+
- **PC (mid-range)**: 4–8 GB available to the game (total system RAM 16 GB, shared with OS and other apps)
|
|
39
|
+
- **PlayStation 5**: 16 GB unified memory, ~12 GB available to games (rest reserved by OS)
|
|
40
|
+
- **Xbox Series X**: 16 GB, split into 10 GB fast (GPU-optimal) and 6 GB standard
|
|
41
|
+
- **Nintendo Switch**: 4 GB total, ~3.2 GB available to games (docked and handheld share the same budget)
|
|
42
|
+
- **Mobile (mid-range)**: 2–3 GB available (total 4–6 GB, aggressive OS memory management)
|
|
43
|
+
- **Mobile (low-end)**: 1–1.5 GB available (devices with 2–3 GB total)
|
|
44
|
+
|
|
45
|
+
Memory budgets must be subdivided by subsystem: textures (typically 40–60% of total), meshes (10–20%), audio (5–10%), scripts/game state (5–10%), engine overhead (10–15%).
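
A minimal sketch of that subdivision follows. The `SHARES` table is hypothetical: each value is picked from inside the ranges quoted above, and the 3200 MB total is the Switch figure from the platform list.

```python
# Hypothetical per-subsystem shares, chosen from within the quoted ranges.
SHARES = {
    "textures": 0.50,
    "meshes": 0.15,
    "audio": 0.075,
    "game_state": 0.075,
    "engine": 0.125,
}


def subdivide_memory(total_mb: int, shares: dict) -> dict:
    """Split a platform memory budget into per-subsystem budgets in MB."""
    if sum(shares.values()) > 1.0 + 1e-9:
        raise ValueError("shares add up to more than 100% of the budget")
    budget = {name: round(total_mb * share) for name, share in shares.items()}
    # Whatever the shares do not claim stays visible as explicit headroom.
    budget["unallocated"] = total_mb - sum(budget.values())
    return budget


if __name__ == "__main__":
    for name, mb in subdivide_memory(3200, SHARES).items():
        print(f"{name:<12} {mb:>5} MB")
```

Keeping the remainder as a named `unallocated` entry makes it obvious when a later change eats into the reserve.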
|
|
46
|
+
|
|
47
|
+
### GPU and Draw Call Budgets
|
|
48
|
+
|
|
49
|
+
- **Draw calls**: 2,000–5,000 per frame on modern hardware (batching and instancing reduce this)
|
|
50
|
+
- **Triangle count**: 2–10 million per frame depending on platform and LOD strategy
|
|
51
|
+
- **Texture memory**: Budget per-platform; use streaming and virtual textures for open worlds
|
|
52
|
+
- **Shader complexity**: Measure in GPU ms per material; flag any material exceeding 0.5 ms in isolation
|
|
53
|
+
- **Post-processing**: Budget 2–4 ms total for all post-process effects combined
|
|
54
|
+
|
|
55
|
+
### 2D and Non-3D Performance Budgets
|
|
56
|
+
|
|
57
|
+
2D games have different bottlenecks than 3D:
|
|
58
|
+
|
|
59
|
+
- **Sprite batch count**: Target 50-200 draw calls. Each atlas page = 1 batch. Minimize atlas count by grouping co-rendered sprites.
|
|
60
|
+
- **Fill rate**: Overdraw from layered sprites is the primary 2D GPU cost. Target < 4x overdraw. Transparent sprites are expensive — use opaque where possible.
|
|
61
|
+
- **Particle count (2D VFX)**: Budget 200-500 active particles for mobile, 1,000-2,000 for PC.
|
|
62
|
+
- **UI-heavy games** (visual novels, card games): UI element count < 500 visible, text rendering budget < 2ms, animation/tween count < 100 simultaneous.
|
|
63
|
+
|
|
64
|
+
### Loading Time Targets
|
|
65
|
+
|
|
66
|
+
- **Initial boot to menu**: Under 10 seconds on console (boot time is an area that platform certification requirements cover)
|
|
67
|
+
- **Level load**: Under 15 seconds with SSD, under 30 seconds with HDD
|
|
68
|
+
- **Fast travel / respawn**: Under 3 seconds (use streaming, not full loads)
|
|
69
|
+
- **Asset streaming during gameplay**: Zero visible pop-in at normal movement speed
|
|
70
|
+
|
|
71
|
+
### Mobile Thermal and Battery Constraints
|
|
72
|
+
|
|
73
|
+
Mobile devices throttle CPU and GPU when they overheat. A game that runs at 60 fps for the first five minutes and drops to 30 fps after thermal throttling has a broken performance budget.
|
|
74
|
+
|
|
75
|
+
- Target sustained performance, not peak performance
|
|
76
|
+
- Measure thermal state after 30 minutes of continuous play
|
|
77
|
+
- Budget power draw to allow 2–3 hours of gameplay per charge
|
|
78
|
+
- Reduce target frame rate to 30 fps on mobile unless the game genre demands 60 fps
|
|
79
|
+
|
|
80
|
+
**Concrete mobile thermal targets:** Thermal throttling typically begins at 40–42 °C skin temperature (varies by device). Sustainable GPU utilization: 50–60% of peak, not 100%. Power draw targets: 3 W sustained for phone games (~3 hr on a 4000 mAh battery), 5 W for tablets. Monitor with Android `dumpsys thermalservice` or iOS `ProcessInfo.thermalState`. At the `.serious` thermal state, reduce render resolution by 25% and particle count by 50%.
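
The `.serious` rule can be expressed as a small policy function. This is a language-neutral sketch, not engine code: `QualitySettings` and `apply_thermal_state` are hypothetical names, the state strings mirror the iOS thermal state names, and treating `critical` the same as `serious` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualitySettings:
    render_scale: float   # 1.0 = native resolution
    max_particles: int


def apply_thermal_state(base: QualitySettings, thermal_state: str) -> QualitySettings:
    """Return the settings to use for a thermal state reported by the OS."""
    if thermal_state in ("serious", "critical"):
        # Render resolution down 25%, particle count down 50%.
        return QualitySettings(
            render_scale=base.render_scale * 0.75,
            max_particles=base.max_particles // 2,
        )
    return base


if __name__ == "__main__":
    base = QualitySettings(render_scale=1.0, max_particles=400)
    print(apply_thermal_state(base, "nominal"))
    print(apply_thermal_state(base, "serious"))
```

Deriving the reduced settings from an unchanged base each time avoids compounding the reduction when the state is polled repeatedly.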
|
|
81
|
+
|
|
82
|
+
## Deep Guidance
|
|
83
|
+
|
|
84
|
+
### Frame Budget Allocation Template
|
|
85
|
+
|
|
86
|
+
```yaml
# Frame Budget Allocation: 60 fps target (16.67 ms)
# Assign budgets at project start, enforce via automated profiling

target_fps: 60
frame_budget_ms: 16.67
headroom_ms: 3.0          # Reserve for spikes and min-spec variance
usable_budget_ms: 13.67   # What subsystems share

cpu_budget:
  simulation:
    physics: 1.5          # Rigid bodies, collision detection
    ai: 1.0               # Behavior trees, pathfinding queries (not bake)
    gameplay: 1.5         # Scripts, abilities, damage, spawning
    animation: 0.5        # Blend tree evaluation, IK solving
    subtotal: 4.5

  rendering_prep:
    culling: 0.5          # Frustum + occlusion culling
    sorting: 0.3          # Render queue sorting
    draw_submit: 1.2      # Draw call submission to GPU
    subtotal: 2.0

  audio:
    mix: 0.5              # Final mix, bus processing
    spatial: 0.3          # 3D positioning, HRTF, occlusion
    decode: 0.2           # Streaming decode (Vorbis/Opus)
    subtotal: 1.0

  networking:
    recv_deser: 0.3       # Receive and deserialize packets
    prediction: 0.2       # Client prediction + reconciliation
    send_ser: 0.2         # Serialize and send packets
    subtotal: 0.7

  ui:
    layout: 0.3           # UI layout calculation
    render: 0.2           # UI draw calls
    subtotal: 0.5

  engine:
    gc_memory: 0.5        # Garbage collection / allocator maintenance
    job_scheduler: 0.2    # Job system overhead
    subsystem_tick: 0.3   # Misc engine ticks
    subtotal: 1.0

  # CPU total: 9.7 ms (3.97 ms of usable_budget_ms unallocated, on top of headroom_ms)

gpu_budget:
  depth_prepass: 1.0      # Z-prepass for early-Z rejection
  gbuffer: 3.0            # Geometry rendering (deferred) or forward pass
  lighting: 2.5           # Direct + indirect lighting, shadows
  post_process: 2.0       # Bloom, tone mapping, AA, motion blur
  ui_overlay: 0.5         # UI rendering on GPU
  particles: 1.0          # VFX / particle rendering
  # GPU total: 10.0 ms (3.67 ms of usable_budget_ms unallocated, on top of headroom_ms)

memory_budget_mb:         # Example: console target (12 GB available)
  textures: 5000          # ~42%: streaming pool + resident
  meshes: 2000            # ~17%: vertex/index buffers, LOD chain
  audio: 800              # ~7%: loaded banks + streaming buffers
  animation: 400          # ~3%: skeletal data, blend trees
  physics: 300            # ~2.5%: collision meshes, solver state
  game_state: 500         # ~4%: entity data, scripts, save state
  render_targets: 1500    # ~12.5%: GBuffer, shadow maps, post-FX
  engine: 1000            # ~8%: job system, allocator overhead
  headroom: 500           # ~4%: reserved for spikes
  # Total: 12000 MB
```
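
A template like this drifts as people edit individual entries, so it helps to check the arithmetic mechanically. The sketch below mirrors the `cpu_budget` section as a Python dictionary (to stay free of third-party YAML parsers) and verifies the subtotals and the total; `check_cpu_budget` is a hypothetical helper name.

```python
import math

# Mirror of the cpu_budget section of the template, in milliseconds.
CPU_BUDGET = {
    "simulation": {"physics": 1.5, "ai": 1.0, "gameplay": 1.5, "animation": 0.5, "subtotal": 4.5},
    "rendering_prep": {"culling": 0.5, "sorting": 0.3, "draw_submit": 1.2, "subtotal": 2.0},
    "audio": {"mix": 0.5, "spatial": 0.3, "decode": 0.2, "subtotal": 1.0},
    "networking": {"recv_deser": 0.3, "prediction": 0.2, "send_ser": 0.2, "subtotal": 0.7},
    "ui": {"layout": 0.3, "render": 0.2, "subtotal": 0.5},
    "engine": {"gc_memory": 0.5, "job_scheduler": 0.2, "subsystem_tick": 0.3, "subtotal": 1.0},
}


def check_cpu_budget(cpu_budget: dict, usable_budget_ms: float):
    """Return (total_ms, errors) after checking subtotals and the overall limit."""
    errors = []
    total = 0.0
    for group, entries in cpu_budget.items():
        items_sum = sum(ms for name, ms in entries.items() if name != "subtotal")
        if not math.isclose(items_sum, entries["subtotal"], abs_tol=1e-6):
            errors.append(f"{group}: entries sum to {items_sum:.2f}, subtotal says {entries['subtotal']:.2f}")
        total += items_sum
    if total > usable_budget_ms:
        errors.append(f"CPU total {total:.2f} ms exceeds usable budget {usable_budget_ms:.2f} ms")
    return total, errors


if __name__ == "__main__":
    total, errors = check_cpu_budget(CPU_BUDGET, usable_budget_ms=13.67)
    print(f"CPU total: {total:.2f} ms, unallocated: {13.67 - total:.2f} ms")
    for line in errors:
        print("ERROR:", line)
```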
|
|
155
|
+
|
|
156
|
+
### Profiling Tools by Engine
|
|
157
|
+
|
|
158
|
+
**Unity profiling stack:**
|
|
159
|
+
- **Unity Profiler**: Built-in CPU/GPU/memory profiler; use Deep Profile mode sparingly (10x overhead)
|
|
160
|
+
- **Frame Debugger**: Step through draw calls one at a time to find redundant draws
|
|
161
|
+
- **Memory Profiler package**: Snapshot-based memory analysis; compare snapshots to find leaks
|
|
162
|
+
- **Profile Analyzer**: Compare profiler captures across runs to detect regressions
|
|
163
|
+
- **Rendering Statistics**: Real-time overlay showing batches, triangles, set-pass calls
|
|
164
|
+
- **Platform-specific**: Xcode Instruments (iOS), Android GPU Inspector, RenderDoc (PC)
|
|
165
|
+
|
|
166
|
+
**Unreal profiling stack:**
|
|
167
|
+
- **Unreal Insights**: Modern trace-based profiler replacing the legacy stat system
|
|
168
|
+
- **stat commands**: `stat unit`, `stat fps`, `stat gpu`, `stat scenerendering` for real-time overlay
|
|
169
|
+
- **GPU Visualizer**: `ProfileGPU` command for per-pass GPU timing
|
|
170
|
+
- **Memreport**: Memory reporting by category
|
|
171
|
+
- **RenderDoc integration**: Capture and inspect individual frames
|
|
172
|
+
- **Platform-specific**: PIX (Xbox/Windows), Razor (PlayStation), Snapdragon Profiler (Android)
|
|
173
|
+
|
|
174
|
+
**Godot profiling stack:**
|
|
175
|
+
- **Built-in Profiler**: CPU time per function, physics, and rendering
|
|
176
|
+
- **Monitor panel**: FPS, draw calls, memory, physics bodies
|
|
177
|
+
- **Visual Profiler**: GPU frame breakdown
|
|
178
|
+
- External tools: RenderDoc, platform-native profilers
|
|
179
|
+
|
|
180
|
+
**Cross-engine tools:**
|
|
181
|
+
- **RenderDoc**: Free, open-source GPU frame capture and analysis (works with Vulkan, D3D11/12, OpenGL)
|
|
182
|
+
- **PIX**: Microsoft's GPU profiler for DirectX (Windows and Xbox)
|
|
183
|
+
- **Xcode Metal Debugger**: GPU profiling for Apple platforms
|
|
184
|
+
- **Tracy**: High-performance C++ profiler with frame-by-frame timeline view
|
|
185
|
+
- **Superluminal**: Low-overhead CPU profiler for Windows
|
|
186
|
+
|
|
187
|
+
### Draw Call Optimization Strategies
|
|
188
|
+
|
|
189
|
+
Draw calls are the primary CPU-side rendering bottleneck. Each draw call has fixed overhead from API state changes and command buffer submission.
|
|
190
|
+
|
|
191
|
+
**Batching techniques:**
|
|
192
|
+
- **Static batching**: Combine meshes that never move into a single draw call at build time. Costs memory (duplicated vertex data) but eliminates per-frame draw overhead.
|
|
193
|
+
- **Dynamic batching**: Combine small meshes (under ~300 vertices in Unity) at runtime. CPU overhead for combining may exceed the draw call savings for complex meshes.
|
|
194
|
+
- **GPU instancing**: Render many copies of the same mesh with a single draw call plus an instance buffer. Ideal for vegetation, debris, crowds.
|
|
195
|
+
- **Indirect drawing**: GPU-driven rendering where the GPU itself decides what to draw. Used in Nanite (Unreal) and custom engines for massive scene complexity.
|
|
196
|
+
|
|
197
|
+
**State change reduction:**
|
|
198
|
+
- Sort draws by material to minimize shader/texture state changes
|
|
199
|
+
- Use texture atlases and texture arrays to reduce texture bind changes
|
|
200
|
+
- Merge materials where possible (combine multiple texture maps into channels of a single texture)
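
The effect of material sorting is easy to demonstrate on a toy draw list. In this sketch a draw is a hypothetical `(material, mesh)` tuple and a "state change" is counted whenever the material differs from the previous draw. It applies to opaque geometry only; transparent draws generally need back-to-front order.

```python
def count_material_changes(draws) -> int:
    """Count how many times the bound material changes while walking the draw list."""
    changes = 0
    previous = None
    for material, _mesh in draws:
        if material != previous:
            changes += 1
            previous = material
    return changes


def sort_by_material(draws):
    """Group draws that share a material so they are submitted back to back."""
    return sorted(draws, key=lambda draw: draw[0])


if __name__ == "__main__":
    draws = [("rock", "a"), ("grass", "b"), ("rock", "c"), ("grass", "d"), ("rock", "e")]
    print("submission order:", count_material_changes(draws))
    print("sorted order:    ", count_material_changes(sort_by_material(draws)))
```

Real renderers usually sort on a packed key (shader, then texture set, then depth), but the principle is the same.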
|
|
201
|
+
|
|
202
|
+
**LOD (Level of Detail):**
|
|
203
|
+
- Every mesh visible beyond 10 meters should have at least 3 LOD levels
|
|
204
|
+
- LOD0: full detail (within 10 m), LOD1: 50% triangles (10–30 m), LOD2: 25% (30–100 m), LOD3: billboard or removed (100 m+)
|
|
205
|
+
- Cross-fade or dither between LOD levels to hide transitions
|
|
206
|
+
- Measure LOD savings: a scene with proper LODs typically uses 60–80% fewer triangles than LOD0 everywhere
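
As a sketch of the distance bands above: the table below encodes the LOD0 to LOD2 thresholds, and anything beyond 100 m falls to LOD3. Two assumptions are mine, not the text's: a distance exactly on a boundary goes to the coarser level, and LOD3 is counted as zero triangles (a billboard still costs a couple).

```python
# (max distance in metres, LOD level, fraction of LOD0 triangles)
LOD_TABLE = [
    (10.0, 0, 1.00),
    (30.0, 1, 0.50),
    (100.0, 2, 0.25),
]


def select_lod(distance_m: float):
    """Return (lod_level, triangle_fraction) for an object at distance_m."""
    for max_distance, level, fraction in LOD_TABLE:
        if distance_m < max_distance:
            return level, fraction
    return 3, 0.0  # billboard or removed


def scene_triangles(distances_m, lod0_triangles: int) -> float:
    """Total triangles for identical meshes placed at the given distances."""
    return sum(lod0_triangles * select_lod(d)[1] for d in distances_m)


if __name__ == "__main__":
    distances = [5, 20, 50, 150]
    with_lods = scene_triangles(distances, 10_000)
    without = 10_000 * len(distances)
    print(f"{with_lods:.0f} vs {without} triangles ({1 - with_lods / without:.0%} saved)")
```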
|
|
207
|
+
|
|
208
|
+
### Memory Leak Detection
|
|
209
|
+
|
|
210
|
+
Memory leaks in games manifest as gradually increasing memory use over play sessions, eventually causing crashes or OOM kills (especially on mobile).
|
|
211
|
+
|
|
212
|
+
**Common leak sources:**
|
|
213
|
+
- Event listeners not unsubscribed when objects are destroyed
|
|
214
|
+
- Loaded assets (textures, audio clips) referenced by destroyed objects preventing GC
|
|
215
|
+
- Growing collections (lists, dictionaries) that are appended to but never pruned
|
|
216
|
+
- Pooled objects that accumulate component references over their reuse lifecycle
|
|
217
|
+
- Native/unmanaged resources (file handles, GPU buffers) not explicitly released
|
|
218
|
+
|
|
219
|
+
**Detection workflow:**
|
|
220
|
+
1. Take a memory snapshot at a known-good state (e.g., main menu after fresh boot)
|
|
221
|
+
2. Play through a level, return to main menu
|
|
222
|
+
3. Force garbage collection
|
|
223
|
+
4. Take a second snapshot
|
|
224
|
+
5. Diff the snapshots — any growth is a potential leak
|
|
225
|
+
6. Repeat the cycle 3–5 times — true leaks show linear growth
|
|
226
|
+
|
|
227
|
+
```csharp
// Unity: Automated leak detection in development builds
using UnityEngine;
using UnityEngine.Profiling;
using System.Collections;

public class MemoryLeakDetector : MonoBehaviour
{
    private long _baselineBytes;
    private int _cycleCount;
    private const int WarningThresholdMB = 50;

    public void CaptureBaseline()
    {
        // Force GC before capturing baseline
        System.GC.Collect();
        System.GC.WaitForPendingFinalizers();
        System.GC.Collect();

        _baselineBytes = Profiler.GetTotalAllocatedMemoryLong();
        _cycleCount = 0;
        Debug.Log($"[LeakDetector] Baseline: {_baselineBytes / (1024 * 1024)} MB");
    }

    public void CheckForLeaks()
    {
        System.GC.Collect();
        System.GC.WaitForPendingFinalizers();
        System.GC.Collect();

        long currentBytes = Profiler.GetTotalAllocatedMemoryLong();
        long deltaBytes = currentBytes - _baselineBytes;
        float deltaMB = deltaBytes / (1024f * 1024f);
        _cycleCount++;

        Debug.Log($"[LeakDetector] Cycle {_cycleCount}: " +
                  $"Current={currentBytes / (1024 * 1024)} MB, " +
                  $"Delta={deltaMB:F1} MB from baseline");

        if (deltaMB > WarningThresholdMB)
        {
            Debug.LogError($"[LeakDetector] LEAK SUSPECTED: " +
                           $"{deltaMB:F1} MB growth after {_cycleCount} cycles. " +
                           $"Take a memory snapshot NOW for comparison.");
        }
    }
}
```
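
A fixed threshold cannot tell a one-time cache warm-up from a real leak, which is why the detection workflow repeats the cycle and looks for steady growth. The following offline sketch applies that criterion to logged post-GC memory readings, one per cycle. The function name and the 1 MB-per-cycle floor are assumptions to tune per project.

```python
def looks_like_leak(samples_mb, min_growth_mb_per_cycle: float = 1.0) -> bool:
    """True if memory grew by at least the floor on every consecutive cycle.

    samples_mb: memory after each play-then-return-to-menu cycle, measured after GC.
    """
    if len(samples_mb) < 3:
        raise ValueError("need at least 3 samples to judge a trend")
    deltas = [later - earlier for earlier, later in zip(samples_mb, samples_mb[1:])]
    return all(delta >= min_growth_mb_per_cycle for delta in deltas)


if __name__ == "__main__":
    steady_growth = [500, 512, 524, 537, 549]   # grows every cycle
    warm_up_only = [500, 530, 531, 530, 531]    # one jump, then flat
    print(looks_like_leak(steady_growth))
    print(looks_like_leak(warm_up_only))
```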
|
|
275
|
+
|
|
276
|
+
### Platform Certification Performance Requirements
|
|
277
|
+
|
|
278
|
+
Console platform holders enforce performance requirements during certification:
|
|
279
|
+
|
|
280
|
+
**PlayStation:**
|
|
281
|
+
- Game must not drop below target frame rate during normal gameplay for extended periods
|
|
282
|
+
- Loading screens must show activity (progress bar, animation) — no static screens over 3 seconds
|
|
283
|
+
- Suspend/resume must complete within platform-specified time limits
|
|
284
|
+
- Memory must not exceed allocation — OOM crashes are automatic certification failures
|
|
285
|
+
|
|
286
|
+
**Xbox:**
|
|
287
|
+
- Xbox Requirements (XRs) specify maximum memory usage per title profile
|
|
288
|
+
- Frame rate must be stable at the advertised rate (30 or 60 fps)
|
|
289
|
+
- Quick Resume must be supported — game state must survive suspend/resume cycles
|
|
290
|
+
- Loading from SSD must meet platform guidance (<2 seconds for fast travel)
|
|
291
|
+
|
|
292
|
+
**Nintendo Switch:**
|
|
293
|
+
- Must run acceptably in both docked (1080p target) and handheld (720p target) modes
|
|
294
|
+
- Dynamic resolution scaling is expected — games should lower resolution to maintain frame rate
|
|
295
|
+
- Memory budget is tight (~3.2 GB) — aggressive texture compression and streaming required
|
|
296
|
+
- Thermal throttling is common — test performance after 30+ minutes of handheld play
|
|
297
|
+
|
|
298
|
+
**Mobile (App Store / Google Play):**
|
|
299
|
+
- No hard certification for performance, but review teams flag severe issues
|
|
300
|
+
- ANR (Application Not Responding) on Android triggers if the main thread blocks >5 seconds
|
|
301
|
+
- iOS watchdog kills apps that take too long to launch (~20 seconds)
|
|
302
|
+
- App size: current iOS versions ask for user confirmation before downloading apps over 200 MB on cellular data; use asset bundles for larger content
|
|
303
|
+
|
|
304
|
+
### Automated Performance Regression Testing
|
|
305
|
+
|
|
306
|
+
Manual profiling catches issues reactively. Automated performance tests catch regressions before they ship.
|
|
307
|
+
|
|
308
|
+
```python
# performance_gate.py: CI script that fails the build on perf regression
# Run after automated playthrough captures profiler data

import json
import sys

# Load profiler output from automated test run
with open("profiler_results.json") as f:
    results = json.load(f)

# Define per-metric budgets (milliseconds unless noted otherwise)
budgets = {
    "frame_time_p95": 16.67,    # 95th percentile frame time
    "frame_time_p99": 20.0,     # 99th percentile (allow some spikes)
    "physics_avg": 1.5,
    "ai_avg": 1.0,
    "rendering_avg": 3.0,
    "audio_avg": 1.0,
    "gc_max": 2.0,              # Max single GC pause
    "draw_calls_avg": 3000,     # Average draw calls per frame (count)
    "memory_peak_mb": 3200,     # Peak memory in MB (Switch target)
}

failures = []
for metric, budget in budgets.items():
    actual = results.get(metric)
    if actual is None:
        # A metric missing from the capture must not pass silently
        failures.append(f"  FAIL: {metric} missing from profiler results")
        continue
    if actual > budget:
        pct_over = ((actual - budget) / budget) * 100
        failures.append(
            f"  FAIL: {metric} = {actual:.1f} "
            f"(budget: {budget:.1f}, {pct_over:.0f}% over)"
        )

if failures:
    print("PERFORMANCE BUDGET VIOLATIONS:")
    print("\n".join(failures))
    sys.exit(1)
else:
    print("All performance budgets passed.")
    sys.exit(0)
```
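
The gate reads `frame_time_p95` and `frame_time_p99`, so something upstream has to compute them from raw frame times. One possible sketch uses the nearest-rank percentile definition; the capture format (a flat list of per-frame milliseconds) and the function names are assumptions.

```python
import math


def percentile(values, pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarize_frame_times(frame_times_ms) -> dict:
    """Build the frame-time entries that a budget gate can compare against limits."""
    return {
        "frame_time_p95": percentile(frame_times_ms, 95),
        "frame_time_p99": percentile(frame_times_ms, 99),
    }


if __name__ == "__main__":
    # 19 smooth frames and one 40 ms hitch.
    capture = [16.0] * 19 + [40.0]
    print(summarize_frame_times(capture))
```

Percentiles matter here because an average hides hitches: the capture above averages about 17 ms yet contains a 40 ms frame that only the p99 figure exposes.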
|
|
350
|
+
|
|
351
|
+
### Thermal Profiling for Mobile
|
|
352
|
+
|
|
353
|
+
Mobile thermal throttling is the single most common cause of "it runs fine in the office" performance failures. Devices heat up during extended play and reduce clock speeds.
|
|
354
|
+
|
|
355
|
+
**Testing protocol:**
|
|
356
|
+
1. Charge the device to 100% and unplug it
|
|
357
|
+
2. Close all background apps
|
|
358
|
+
3. Run the game for 45 minutes continuously
|
|
359
|
+
4. Log frame time, CPU frequency, GPU frequency, and skin temperature every second
|
|
360
|
+
5. Plot all four on a timeline — look for the "thermal cliff" where clocks drop
|
|
361
|
+
6. The sustained performance after throttling is the real frame budget, not the initial burst
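
Steps 4 to 6 can be automated once the log exists. This sketch assumes the log has already been parsed into plain lists sampled once per second. The 80% clock-drop threshold and the "last quarter of the run" window are illustrative choices, not values from any platform guideline.

```python
from statistics import median


def find_thermal_cliff(freq_mhz, drop_ratio: float = 0.8):
    """Index of the first sample whose clock falls below drop_ratio of the initial clock, else None."""
    baseline = freq_mhz[0]
    for index, freq in enumerate(freq_mhz):
        if freq < baseline * drop_ratio:
            return index
    return None


def sustained_frame_time_ms(frame_times_ms, tail_fraction: float = 0.25) -> float:
    """Median frame time over the final part of the run, after the device has heated up."""
    tail_len = max(1, int(len(frame_times_ms) * tail_fraction))
    return median(frame_times_ms[-tail_len:])


if __name__ == "__main__":
    gpu_freq = [2800] * 5 + [2000] * 5        # clocks drop halfway through
    frame_times = [16.0] * 5 + [22.0] * 5     # frame time rises with them
    print("cliff at sample:", find_thermal_cliff(gpu_freq))
    print("sustained frame time:", sustained_frame_time_ms(frame_times), "ms")
```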
|
|
362
|
+
|
|
363
|
+
**Mitigation strategies:**
|
|
364
|
+
- Reduce simulation quality when thermal state is elevated (lower physics tick rate, simplified AI, reduced particle counts)
|
|
365
|
+
- Implement a "thermal budget" system that queries the OS for thermal state (iOS: `ProcessInfo.thermalState`, Android: `PowerManager` thermal API)
|
|
366
|
+
- Use lower rendering resolution with upscaling rather than reducing frame rate
|
|
367
|
+
- Schedule heavy background work (asset loading, pathfinding bakes) during low-activity gameplay moments
|
|
368
|
+
- Test on the oldest supported device in a warm environment (30+ °C ambient)
|
|
369
|
+
|
|
370
|
+
### Profiling Discipline
|
|
371
|
+
|
|
372
|
+
Performance optimization without profiling data is guesswork. Follow these rules:
|
|
373
|
+
|
|
374
|
+
1. **Measure first, optimize second** — never optimize based on assumptions about what is slow
|
|
375
|
+
2. **Profile on target hardware** — a developer PC with an RTX 4090 tells you nothing about Switch or mobile performance
|
|
376
|
+
3. **Profile release builds** — debug builds have 2–10x overhead from assertions, logging, and disabled optimizations
|
|
377
|
+
4. **Profile representative content** — the title screen is not representative; profile the most complex gameplay scenario
|
|
378
|
+
5. **Track metrics over time** — a single profile session is a snapshot; daily automated profiling catches regressions
|
|
379
|
+
6. **Budget from day one** — retrofitting performance into a game that has been running 3x over budget for a year is brutal; establish and enforce budgets from the first playable build
|
|
380
|
+
|
|
381
|
+
### Common Performance Antipatterns
|
|
382
|
+
|
|
383
|
+
- **Allocating during gameplay**: Any `new` call or list resize during the game loop risks GC pauses. Pre-allocate and pool everything.
|
|
384
|
+
- **Synchronous I/O on the main thread**: File reads, network calls, or asset loads on the main thread cause frame spikes. Use async I/O and streaming.
|
|
385
|
+
- **Per-frame string operations**: String concatenation, formatting, and parsing allocate memory every frame. Cache results, use `StringBuilder`, or use integer identifiers instead of strings.
|
|
386
|
+
- **Unbounded spatial queries**: "Find all enemies" without a spatial index (quadtree, grid) is O(n) per query, which becomes O(n^2) per frame when every entity runs it, and degrades as entity count grows.
|
|
387
|
+
- **Overdraw**: Transparent objects rendered back-to-front can cause massive pixel fill rate waste. Sort, z-test, and minimize layered transparency.
|
|
388
|
+
- **Shader complexity creep**: Artists add features to materials over time. Each additional texture sample, branch, or math operation costs GPU time multiplied by every pixel that material covers. Audit materials monthly.
|
|
389
|
+
- **Excessive raycasts**: Physics raycasts are not free. Budget them (e.g., max 50 per frame for AI line-of-sight checks) and use layers to filter collision masks.
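
As an illustration of the spatial-index remedy for unbounded queries, here is a minimal uniform grid in 2D. It is a sketch under simple assumptions (point entities, static positions, a hypothetical `SpatialGrid` class), not a replacement for an engine's built-in broadphase.

```python
import math
from collections import defaultdict


class SpatialGrid:
    """Uniform grid that buckets point entities by cell for fast radius queries."""

    def __init__(self, cell_size: float):
        if cell_size <= 0:
            raise ValueError("cell_size must be positive")
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x: float, y: float):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, entity_id, x: float, y: float) -> None:
        self.cells[self._cell(x, y)].append((entity_id, x, y))

    def query_radius(self, x: float, y: float, radius: float):
        """Return ids of entities within radius, visiting only the overlapping cells."""
        min_cx, min_cy = self._cell(x - radius, y - radius)
        max_cx, max_cy = self._cell(x + radius, y + radius)
        found = []
        for cx in range(min_cx, max_cx + 1):
            for cy in range(min_cy, max_cy + 1):
                for entity_id, ex, ey in self.cells.get((cx, cy), ()):
                    if (ex - x) ** 2 + (ey - y) ** 2 <= radius ** 2:
                        found.append(entity_id)
        return found


if __name__ == "__main__":
    grid = SpatialGrid(cell_size=10.0)
    grid.insert("a", 1, 1)
    grid.insert("b", 8, 8)
    grid.insert("c", 95, 95)
    print(sorted(grid.query_radius(5, 5, 6)))
```

A cell size close to the typical query radius keeps each query to a handful of cells; much smaller cells mean more cells to visit, much larger ones mean more entities to reject per cell.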
|