@plasius/gpu-renderer 0.2.5 → 0.2.6

package/CHANGELOG.md CHANGED
@@ -23,25 +23,14 @@ All notable changes to this project will be documented in this file.
  - **Security**
  - (placeholder)

- ## [0.2.5] - 2026-06-14
-
- - **Added**
- - (placeholder)
-
- - **Changed**
- - (placeholder)
-
- - **Fixed**
- - (placeholder)
-
- - **Security**
- - (placeholder)
-
- ## [0.2.4] - 2026-06-11
+ ## [0.2.6] - 2026-06-14

  - **Added**
  - Added `updateCamera(...)` support for wavefront renderers so validation
  views can animate camera movement without rebuilding mesh buffers.
+ - Added `gpuWorkerJobs` diagnostics to wavefront frame stats so callers can
+ inspect completed compute-dispatch jobs per frame, per second, and per
+ command submission alongside the existing GPU parallelism figures.
  - Added `gpuParallelism` diagnostics to wavefront frame stats and snapshots so
  consumers can inspect adapter compute limits, direct workgroups,
  indirect-dispatch estimates, and whether a frame exposes multi-workgroup GPU
@@ -52,17 +41,72 @@ All notable changes to this project will be documented in this file.
  - Added default-on `deferredPathResolve` support for wavefront tracing so
  surface traversal records material responses and terminal source radiance
  before the output pass resolves the path backward into pixel accumulation.
+ - Added design and ADR coverage for shadow-tested direct light and bounded
+ bounce attenuation in deferred wavefront path resolution.
+ - Added display-quality material-response support for sheen, clearcoat, and
+ decoded base-colour, metallic-roughness, normal, and occlusion maps in the
+ wavefront mesh path.
+ - Added GPU hit-time material atlas sampling for display-quality wavefront
+ tracing so exact hit UVs now drive base-colour, metallic-roughness, normal,
+ occlusion, and emissive texture evaluation on the GPU instead of relying on
+ CPU-baked triangle averages.
+ - Added ADR and design coverage for generic glTF material transport so
+ specular colour, sheen colour, transmission, clearcoat, and IOR can travel
+ through the shared wavefront shading path instead of being inferred from
+ demo-specific material names.
+ - Added ADR and design coverage for environment-driven glossy surface
+ response so specular, sheen, and clearcoat shading can use reflection-
+ aligned environment radiance instead of leaning primarily on a sun-only
+ highlight proxy.
+ - Added ADR and design coverage for prefiltered HDRI, BRDF LUT, and MIS-based
+ environment lighting in the wavefront display-quality path.
+ - Added adaptive `renderFrame(...)` sampling controls so callers can cap a
+ frame with `frameTimeBudgetMs`, guarantee a `minimumSamplesPerPixel`, and
+ inspect actual `renderedSamplesPerPixel` separately from the configured SPP
+ ceiling.
+ - Added `createWavefrontAdaptiveSamplingLevels(...)` so consumers can hand
+ wavefront SPP adaptation to `@plasius/gpu-performance` without duplicating
+ renderer-specific ladder construction in app or demo code.

  - **Changed**
+ - Changed internal wavefront frame batching and dispatch-diagnostics plumbing
+ to live in a dedicated runtime helper module so scheduling concerns stay
+ separated from shader and pipeline assembly.
  - Changed wavefront frame dispatch to split large tile/sample workloads into
  bounded command submissions instead of encoding an entire high-resolution
  frame into one command buffer.
- - Changed display-quality wavefront tracing to require one additional
- path-vertex storage buffer, raising the trace-stage storage-buffer request
- from 9 to 10.
+ - Changed display-quality wavefront tracing to pack HDRI importance-sampling
+ PDFs and CDFs into one GPU texture so the MIS path stays compatible with
+ adapters that expose only the existing 10 trace-stage storage buffers.
  - Changed wavefront terminal/direct environment estimates to consume
  `environmentLighting.sunlitBaseline` as a time-of-day daylight floor instead
  of relying only on restrained ambient colour.
+ - Changed deferred wavefront surface resolution to allow explicit
+ shadow-tested direct lighting before terminal continuation resolution and to
+ remap extremely dark bounce responses to a small scene-brightness floor.
+ - Changed wavefront denoise to adapt its kernel strength to SPP and to skip
+ the intermediate full-screen scratch pass for 4+ SPP frames, reducing blur
+ and denoise cost on cleaner renders.
+ - Changed async wavefront frame rendering to wait for submitted GPU work and
+ to batch higher-SPP workloads more defensibly, while raising the SPP ceiling
+ from 64 to 256 for higher-end GPUs.
+ - Changed awaited high-SPP wavefront rendering to fence submitted GPU work
+ once per `renderFrame(...)` call instead of after every intermediate command
+ submission.
+ - Changed wavefront scene, mesh, triangle, material, and hit records to carry
+ generic glTF-style specular colour, sheen colour, and transmission inputs
+ in addition to the existing roughness, metallic, clearcoat, and IOR data.
+ - Changed direct and terminal wavefront surface environment shading to sample
+ reflection-aligned environment radiance for glossy materials before adding
+ narrower sun highlights, improving leather, chrome, and polished-surface
+ response without model-specific rules.
+ - Changed wavefront environment-lighting setup to prepare roughness-aware HDRI
+ resources, a BRDF integration LUT, and HDRI importance-sampling tables for
+ display-quality renders.
+ - Changed wavefront continuation sampling to emit BSDF PDFs for diffuse,
+ conductor, clearcoat, and transmission paths so environment misses and
+ explicit HDRI samples can use MIS instead of the older light-guidance
+ heuristics.

  - **Fixed**
  - Fixed low-sample wavefront renders so non-emissive surface hits receive a
@@ -70,6 +114,12 @@ All notable changes to this project will be documented in this file.
  - Reduced deterministic environment fill on direct surface hits so ambient
  rescue lighting no longer washes dark materials toward the full scene
  ambient colour.
+ - Fixed wavefront output-probe readback so higher-SPP validation renders wait
+ for submitted GPU work before mapping the staging buffer, avoiding
+ `GPUBuffer.mapAsync(...)` lifetime failures during probe capture.
+ - Fixed display-quality environment misses so non-delta BSDF paths now apply
+ MIS weighting against the HDRI direction PDF instead of overcounting raw
+ sky radiance on termination.

  - **Security**
  - (placeholder)
@@ -363,5 +413,4 @@ All notable changes to this project will be documented in this file.
  [0.1.12]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.1.12
  [0.1.14]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.1.14
  [0.2.1]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.1
- [0.2.4]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.4
- [0.2.5]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.5
+ [0.2.6]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.6
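
Note on the MIS fix recorded in the changelog above: weighting environment misses against the HDRI direction PDF can be illustrated with a minimal sketch. It assumes the balance heuristic, which the changelog does not confirm (the renderer may use a power heuristic or another weighting), and `balanceHeuristic` and `weightEnvironmentMiss` are hypothetical names, not part of the package's API.

```javascript
// Balance heuristic (assumed, not confirmed by the changelog): weight for a
// sample drawn from strategy A when strategy B could have produced the same
// direction. Returns 0 when both PDFs are 0 to avoid dividing by zero.
function balanceHeuristic(pdfA, pdfB) {
  const sum = pdfA + pdfB;
  return sum > 0 ? pdfA / sum : 0;
}

// Hypothetical helper: weight the sky radiance picked up by a BSDF-sampled ray
// that missed all geometry. Delta lobes (perfect mirror or glass) cannot be
// reached by explicit HDRI sampling, so they keep full weight.
function weightEnvironmentMiss(radiance, bsdfPdf, hdriPdf, isDelta) {
  const weight = isDelta ? 1 : balanceHeuristic(bsdfPdf, hdriPdf);
  return radiance.map((channel) => channel * weight);
}
```

Without the weight, a direction that both the BSDF sampler and the explicit HDRI sampler can produce contributes its radiance twice, which is the overcounting the changelog entry describes.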
package/README.md CHANGED
@@ -113,7 +113,10 @@ debug validation scenes. It is compute-driven, tiled, and breadth-first by
  bounce depth, so queue buffers are bounded by tile size instead of presentation
  resolution. Renderer-owned GPU record sizes are part of the public compute
  limits so ray, hit, triangle, BVH, and accumulation buffers stay aligned with
- their WGSL layouts.
+ their WGSL layouts. Frame submission batching and dispatch diagnostics are kept
+ in dedicated runtime helpers so performance-facing integrations can consume
+ stable frame stats without inheriting the renderer's shader and pipeline
+ assembly internals.

  ```js
  import {
@@ -168,7 +171,8 @@ debugRenderer.renderOnce();
  ```

  Scene objects currently support analytic `sphere` and axis-aligned `box`
- records with colour, emission, roughness, metallic, opacity, and IOR fields.
+ records with colour, emission, roughness, metallic, opacity, IOR, clearcoat,
+ sheen colour, specular colour, and transmission fields.
  These records are debug fixtures only. Product Studio visual rendering requires
  the mesh BVH path described in
  `docs/adrs/adr-0007-triangle-mesh-wavefront-path-tracing.md`. This is a
@@ -185,6 +189,18 @@ fixtures and deterministic layout tests. GPU BVH construction now uses
  Morton-style centroid keys to sort leaf references before sorted leaves and
  level-concurrent internal nodes are materialized. The current mesh path is the
  GPU runtime baseline under active hardening.
+ When mesh inputs also carry UVs plus decoded base-colour,
+ metallic-roughness, normal, occlusion, or emissive maps, the display-quality
+ path now packs them into GPU texture atlases and samples them at the resolved
+ hit UV inside the wavefront trace pass. Generic glTF-style material factors
+ such as clearcoat, sheen colour, specular colour, transmission, and IOR are
+ also preserved through the GPU records so demo validation does not need
+ model-name overrides. CPU-side texture work is limited to load-time decode and
+ atlas packing; per-hit shading stays on the GPU. Direct and terminal glossy
+ response now also samples reflection-aligned environment radiance so leather,
+ chrome, and other polished authored materials can read from the active
+ environment map or procedural sky instead of relying mostly on a sun-direction
+ proxy.

  `samplesPerPixel` controls how many GPU primary-ray samples are accumulated per
  screen pixel within a single render. This multiplies dispatch work but does not
@@ -207,9 +223,12 @@ better material PDFs are hardened. By default, `deferredPathResolve` records
  per-bounce material responses in a tile-bounded path buffer and records the
  terminal emissive/HDRI/environment source in the final path slot. The output
  pass then resolves that recorded path backward and adds the weighted sample to
- the pixel accumulation, so surface traversal no longer injects broad visible
- ambient/direct environment light before a terminal source is known. Set
- `deferredPathResolve: false` only for legacy forward-accumulation comparison.
+ the pixel accumulation, so unresolved continuation light is still deferred until
+ a terminal source is known. Surface resolution may still add a small
+ shadow-tested direct-light term immediately when it has an explicit source and
+ visibility result, which keeps true occlusion shadows possible without falling
+ back to broad per-bounce ambient fill. Set `deferredPathResolve: false` only
+ for legacy forward-accumulation comparison.
  When an `environmentMap` is provided, the wavefront trace shader samples it as
  an equirectangular radiance source for environment misses and uses the same
  mapped radiance for terminal residuals before falling back to static ambient.
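
Note on the deferred resolve described in the hunk above: the backward pass can be sketched in a few lines. The record shape (`response`, optional `direct`) and the function name `resolvePathBackward` are assumptions for illustration; the renderer's actual WGSL record layout is not shown in this diff.

```javascript
// Hypothetical record shape: each bounce stores an RGB material response and an
// optional shadow-tested direct-light term; the terminal slot stores the RGB
// radiance of the emissive, HDRI, or environment source that ended the path.
function resolvePathBackward(bounces, terminalRadiance) {
  let radiance = [...terminalRadiance];
  // Walk from the last recorded bounce back toward the camera, attenuating the
  // carried radiance by each response and adding that bounce's direct term.
  for (let i = bounces.length - 1; i >= 0; i -= 1) {
    const { response, direct = [0, 0, 0] } = bounces[i];
    radiance = radiance.map((value, channel) => direct[channel] + response[channel] * value);
  }
  return radiance;
}
```

In this sketch nothing is added to the pixel until the terminal source is known, apart from the per-bounce `direct` term, which mirrors the shadow-tested direct-light exception the README text allows.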
@@ -217,14 +236,28 @@ The procedural horizon/zenith/sun model remains the fallback for callers that
  have not supplied an HDRI/radiance texture. `environmentLighting.sunlitBaseline`
  adds a time-of-day daylight floor to terminal and direct environment estimates,
  so bright presets retain colour at the last collision without returning to a
- whitewashed global ambient term.
+ whitewashed global ambient term. Extremely dark recorded bounce responses are
+ also remapped to a small scene-brightness-driven luminance floor so bright
+ low-sample scenes do not produce isolated black speckles when a valid terminal
+ source was already found.
  For static mesh scenes, the GPU acceleration build is submitted once and then
  reused by subsequent frames. Per-frame tracing writes one dynamic uniform slot
  per tile/sample or post-process pass and batches tile tracing, tile output,
  optional denoise, and presentation into bounded command submissions controlled
  by `maxFramePassesPerSubmission` to keep 4K/high-spp command buffers from
  becoming oversized. `updateCamera(...)` can update the per-frame camera uniforms
- between renders without rebuilding mesh buffers. Frame stats and snapshots expose
+ without rebuilding scene buffers. `renderFrame(...)` also accepts an optional
+ `frameTimeBudgetMs` plus `minimumSamplesPerPixel`: when present, configured
+ `samplesPerPixel` becomes a ceiling instead of a hard requirement, the renderer
+ guarantees at least the minimum full-screen pass, and frame stats report both
+ configured `samplesPerPixel` and actual `renderedSamplesPerPixel` so realtime
+ callers can budget motion frames without overstating delivered quality.
+ For consumers that want to hand wavefront SPP adaptation to
+ `@plasius/gpu-performance`, `createWavefrontAdaptiveSamplingLevels(...)` exposes
+ a bounded low-to-high ladder of per-frame `samplesPerPixel`,
+ `frameTimeBudgetMs`, and `minimumSamplesPerPixel` configs that stay aligned
+ with the renderer's supported adaptive-sampling surface. Frame stats and
+ snapshots expose
  `gpuParallelism` diagnostics with adapter compute limits, configured workgroup
  size, direct compute dispatches, known workgroups/invocations, indirect dispatch
  counts, and upper-bound indirect work estimates. WebGPU does not expose physical