@kaminos/webgpu-inference-kit 0.1.1 → 0.1.2

package/README.md CHANGED
@@ -1,6 +1,10 @@
  # @kaminos/webgpu-inference-kit
 
- Native WebGPU inference route substrate for Kaminos.
+ Composable browser WebGPU inference route contracts, runtime profiles, and scheduler envelopes.
+
+ This package is the shared substrate we are extracting from several browser-native model ports: MoGE depth/normal, SHARP image-to-splat, Kimodo text-to-motion, and Stable Fast 3D image-to-mesh. The bet is that these ports become more valuable when they can run as routes in the same browser GPU process, report the device and scheduling conditions they actually received, and hand outputs to each other without every repo inventing its own adapter grammar.
+
+ Receipts and evidence checks matter here, but they are not the point of the package. They are the guardrail that lets higher-level systems compose WebGPU routes without treating a fixture, fallback, stale cache, or half-profiled run as if it were live model output.
 
  Install:
 
@@ -12,145 +16,72 @@ Import:
 
  ```js
  import {
- createWebGpuRouteSchemaContract,
- createWebGpuLocalRouteReceipt,
- classifyWebGpuRouteReceiptEvidence,
+ createWebGpuRouteRegistry,
+ createRouteInvocationRequest,
+ createMogeDepthNormalRouteDefinition,
+ requestBrowserWebGpuDevice,
+ createWebGpuRouteSchedulerProfile,
  } from "@kaminos/webgpu-inference-kit";
  ```
 
- This package starts with contracts, not kernels. Its first job is to make
- browser-local model routes prove what they actually ran before Kaminos treats
- their outputs as asset evidence.
-
- Current surface:
-
- - `createWebGpuLocalRouteReceipt(input)`: creates a
- `kaminos.webgpu-route-receipt.v0` receipt for a `webgpu-local` route.
- - `createWebGpuRouteSchemaContract(input)`: exposes the kit-owned route
- definition/request/result/receipt/runtime-profile/evidence-classification
- schema strings as a compact contract object so route repos can run conformance
- checks instead of manually mirroring hidden constants.
- - `createWebGpuRouteReceiptFromArtifacts(input)` plus
- `createRouteReceiptArtifacts`, `createRouteReceiptInputArtifact`,
- `finishAndValidateRouteProfile`, and validation helpers: shared route receipt
- substrate used by MoGE, SHARP, Kimodo, and SF3D factories to preserve artifact
- identity, backend identity, and staged profile requirements without duplicating
- false-closure-prone boilerplate.
- - `validateRouteReceipt(receipt)`: validates requested/effective route identity,
- backend/model/kernel identity, input/output artifact ids, timings, and
- fallback status.
- - `assertAuthoritativeRouteReceipt(receipt)`: rejects fallback, cached,
- partial, missing, or non-real outputs before they can masquerade as
- authoritative Kaminos evidence.
- - `defineTensorManifest(input)` and `validateTensorManifest(manifest)`: normalize
- and validate model tensor metadata, including dtype sizes and byte lengths.
- - `createWebGpuDeviceRequest(adapter, options)`: derives requested WebGPU
- features and max adapter limits for model inference without silently capping
- below the adapter's own reported capacity.
- - `requestBrowserWebGpuDevice(gpu, options)`: requests a browser WebGPU adapter
- and device, then returns the effective device request and backend identity
- that route receipts should preserve.
- - `createWebGpuBackendIdentity(input)` and
- `validateWebGpuBackendIdentity(identity)`: preserve effective browser,
- adapter, feature, limit, and timestamp-query identity for route receipts.
- - `createStagedSubmitProfile(input)`, `addStagedSubmitStage(profile, stage)`,
- `finishStagedSubmitProfile(profile)`, and
- `validateStagedSubmitProfile(profile)`: record staged-submit timing evidence
- and reject timestamp-query timing unless it is validated against staged waits.
- - `createKernelProfileMetadata(input)` and
- `createRouteKernelProfileMetadata(input)`: normalize shared kit version,
- kernel profile, commit, required stage, and timing-source metadata for route
- definitions and receipts while keeping route-specific semantics local.
- - `createWebGpuRuntimeProfileInput(input)`,
- `createWebGpuRuntimeProfile(input)`, and
- `validateWebGpuRuntimeProfile(profile)`: combine effective WebGPU backend
- identity, kernel metadata, staged profile evidence, and evidence mode into a
- single producer-side runtime profile object.
- - `createWebGpuRouteSchedulerProfile(input)` and
- `validateWebGpuRouteSchedulerProfile(profile)`: preserve requested versus
- effective WebGPU route scheduling, including throughput/cooperative mode,
- route-specific phase chunk sizes, submitted-work waits, yield cadence, and
- unsupported scheduler fields so a route cannot claim cooperative behavior
- without effective telemetry.
- - `createWebGpuRouteBackpressureProfile(input)` and
- `validateWebGpuRouteBackpressureProfile(profile)`: record route pressure
- classification, warm/cache and memory-sharing posture, and frame-tail
- evidence for visible-wait/furnace classification without turning internal
- scheduler knobs into operator-facing controls.
- - `classifyWebGpuRouteReceiptEvidence(receipt)` and
- `classifyWebGpuRouteWorkerResultEvidence(result)`: commoner-side receipt
- classification helpers that distinguish authoritative live WebGPU evidence
- from fallback, partial, cache/demo, stale, route-mismatch, and invalid
- receipts, while surfacing scheduler verification and frame-tail fields when a
- route provides them.
- - `createMogeDepthNormalRouteReceipt(input)`: first concrete `webgpu-local`
- route receipt factory for `moge.depth-normal.webgpu-local.v0`.
- - `defineWebGpuRoute(input)`, `createWebGpuRouteRegistry(routes)`,
- `createRouteInvocationRequest(route, input)`, `createRouteWorkerResult(route,
- input)`, and their validators: define worker-executable routes, create
- invocation envelopes, and validate route results before Wake/Pipeline consume
- them as Kaminos evidence.
- - `createMogeDepthNormalRouteDefinition(input)`: first concrete route
- definition for MoGE source-image to depth/normal/pointmap truth-layer output.
- - `createSharpImageToSplatRouteReceipt(input)`: concrete receipt factory for
- `sharp.image-to-splat.webgpu-local.v0`, preserving source image, browser
- WebGPU backend identity, PLY splat candidate, depth map, SHARP metadata, and
- optional splat autocrop evidence.
- - `createSharpImageToSplatRouteDefinition(input)`: route definition aligned to
- the native SHARP-WebGPU browser adapter surface used by Kaminos Pipeline:
- source image in, splat candidate/depth/metadata out, with optional
- `kaminos.splat-autocrop-evidence.v0` side evidence.
- - `createKimodoTextToMotionRouteReceipt(input)` and
- `createKimodoTextToMotionRouteDefinition(input)`: browser WebGPU
- text-to-motion route contract for Kimodo SOMA-RP-v1.1, preserving prompt
- identity, SOMA77 joint output, motion sidecar output, optional filmstrip, and
- staged text-embedding/DDIM/FK/output-capture timing.
- - `createSf3dImageToMeshRouteReceipt(input)` and
- `createSf3dImageToMeshRouteDefinition(input)`: browser WebGPU image-to-mesh
- route contract for Stable Fast 3D, preserving source image, GLB mesh, albedo
- texture, normal map, optional OBJ, and DINOv2/two-stream/triplane/marching-tet
- stage identity.
-
- Near-term extraction order:
-
- 1. Route receipt and tensor manifest contracts. Done in the scaffold slice.
- 2. WebGPU device/feature/profiling identity helpers. First pure contract helpers
- are in place; browser adapters should wire into these next.
- 3. MoGE depth/normal route receipt. First factory is in place and the MoGE
- runtime emits this receipt from live inference.
- 4. Browser route boundary. Route registry, invocation request, worker result,
- browser device request, and MoGE route definition contracts are in place.
- 5. SHARP image-to-splat route contract. First factory and route definition are
- in place for the browser-native SHARP-WebGPU path; runtime emission remains
- owned by SHARP/Pipeline adapter surfaces.
- 6. Kimodo and SF3D route contracts. First factories and route definitions are
- in place for browser-native text-to-motion and image-to-mesh routes; runtime
- emission remains owned by those route repos and Kaminos motion/pipeline
- consumers.
- 7. MoGE schema mirror drift reduction. The kit exposes a schema contract object;
- MoGE has a dev conformance test against that contract while the runtime still
- avoids a brittle temporary worktree dependency.
- 8. Shared route receipt helper. Artifact normalization, backend identity
- validation, staged profile validation, and receipt construction now live in
- one helper consumed by all four concrete route factories.
- 9. Shared kernel/profile metadata helper. Kit version, kernel profile, commit,
- required stage, and timing-source normalization now live in one helper
- consumed by all four concrete route factories.
- 10. Runtime profile and commoner receipt classification helpers. Producers can
- normalize effective WebGPU runtime evidence, and downstream commoners can
- classify receipts before treating outputs as live route evidence.
- 11. Scheduler/backpressure contracts. Routes can now report throughput versus
- cooperative scheduling, requested/effective phase chunking, unsupported
- fields, visible-wait/furnace pressure, and frame-tail evidence without
- implying browser GPU preemption that WebGPU does not provide.
- 12. Pipeline, bind-group, uniform, and buffer caches from MoGE/SHARP.
- 13. Shared kernels only when at least two real routes need them or a measured
- kernel slice proves the extraction useful.
-
- Non-goals:
+ ## What This Is
+
+ `@kaminos/webgpu-inference-kit` is a small, route-facing contract library for browser-local WebGPU inference. It gives model ports a common way to describe:
+
+ - What route is being invoked, such as MoGE depth/normal or SHARP image-to-splat.
+ - Which browser WebGPU adapter, device features, limits, and timestamp capabilities were actually available.
+ - Which kernel/profile variant ran, and which stages are required for a useful runtime profile.
+ - How a route was scheduled: throughput mode, cooperative/yield posture, phase chunk sizes, submitted-work waits, and unsupported scheduler fields.
+ - Which artifacts went in and out, so downstream consumers can join routes without losing identity.
+
+ The immediate goal is practical composition inside Kaminos: MoGE can become a local geometry/depth route, SHARP can emit splat candidates, Kimodo can emit motion clips, SF3D can emit meshes, and pipeline/commoner code can consume those outputs through one route grammar. The longer-term opportunity is a browser-native inference runtime kit that makes future image generators, 3D generators, and possibly language-model routes easier to seat without rebuilding the same WebGPU plumbing from scratch.
+
+ ## Why Not Just Evidence?
+
+ Evidence is the accountability layer. The product center is route composition and runtime control.
+
+ WebGPU model ports have awkward failure modes: the browser may give a different adapter than expected, timestamp queries may be absent or misleading, a route may silently fall back to fixtures or stubs, and long GPU phases can monopolize the device unless the route reports how it yields. The kit keeps those facts attached to the route envelope so schedulers and downstream consumers can make sane choices.
+
+ So the intended stack is:
+
+ 1. **Route boundary:** define callable browser-local inference routes with stable input/output roles.
+ 2. **Runtime profile:** preserve adapter/device/kernel/stage identity for the run that actually happened.
+ 3. **Scheduler/backpressure profile:** expose whether the route is throughput-oriented, cooperative, furnace-class, warm, cached, or frame-tail-sensitive.
+ 4. **Receipt and classification:** reject stale, fallback, partial, mismatched, or invalid route output before another system treats it as authoritative.
+
+ The fourth layer protects the first three. It should not swallow the whole story.
+
+ ## Current Surface
+
+ - `defineWebGpuRoute(input)`, `createWebGpuRouteRegistry(routes)`, `createRouteInvocationRequest(route, input)`, `createRouteWorkerResult(route, input)`, and validators: define worker-executable browser routes, create invocation envelopes, and validate route results before downstream consumers compose them.
+ - `createMogeDepthNormalRouteDefinition(input)` and `createMogeDepthNormalRouteReceipt(input)`: MoGE source-image to depth/normal/pointmap route contract.
+ - `createSharpImageToSplatRouteDefinition(input)` and `createSharpImageToSplatRouteReceipt(input)`: SHARP source-image to splat candidate/depth/metadata route contract, including optional splat autocrop side output.
+ - `createKimodoTextToMotionRouteDefinition(input)` and `createKimodoTextToMotionRouteReceipt(input)`: Kimodo text-prompt to SOMA77 joints/motion-clip route contract, with optional filmstrip output and diffusion/FK/output timing stages.
+ - `createSf3dImageToMeshRouteDefinition(input)` and `createSf3dImageToMeshRouteReceipt(input)`: Stable Fast 3D source-image to GLB/albedo/normal/OBJ route contract with DINOv2, two-stream, triplane, and marching-tet stage identity.
+ - `createWebGpuDeviceRequest(adapter, options)` and `requestBrowserWebGpuDevice(gpu, options)`: request browser WebGPU devices using adapter-reported limits without imposing hidden caps, and return the effective request/backend identity for the route.
+ - `createWebGpuBackendIdentity(input)` and `validateWebGpuBackendIdentity(identity)`: preserve browser, adapter, feature, limit, and timestamp-query identity.
+ - `createStagedSubmitProfile(input)`, `addStagedSubmitStage(profile, stage)`, `finishStagedSubmitProfile(profile)`, and `validateStagedSubmitProfile(profile)`: describe staged queue-submit timing in a way that can be compared across routes.
+ - `createKernelProfileMetadata(input)` and `createRouteKernelProfileMetadata(input)`: normalize kit version, kernel profile, commit, required stages, and timing-source metadata for route definitions and receipts.
+ - `createWebGpuRuntimeProfileInput(input)`, `createWebGpuRuntimeProfile(input)`, and `validateWebGpuRuntimeProfile(profile)`: combine effective backend identity, kernel metadata, staged profile, and route mode into one producer-side runtime profile object.
+ - `createWebGpuRouteSchedulerProfile(input)` and `validateWebGpuRouteSchedulerProfile(profile)`: preserve requested versus effective scheduling, including throughput/cooperative mode, route-specific phase chunk sizes, submitted-work waits, yield cadence, and unsupported fields.
+ - `createWebGpuRouteBackpressureProfile(input)` and `validateWebGpuRouteBackpressureProfile(profile)`: record visible-wait/furnace pressure, warm/cache posture, memory-sharing posture, and frame-tail impact.
+ - `defineTensorManifest(input)` and `validateTensorManifest(manifest)`: normalize tensor metadata including dtype sizes and byte lengths.
+ - `createWebGpuLocalRouteReceipt(input)`, `createWebGpuRouteReceiptFromArtifacts(input)`, `createRouteReceiptArtifacts(input)`, `finishAndValidateRouteProfile(input)`, `validateRouteReceipt(receipt)`, and `assertAuthoritativeRouteReceipt(receipt)`: shared receipt construction and validation helpers.
+ - `classifyWebGpuRouteReceiptEvidence(receipt)` and `classifyWebGpuRouteWorkerResultEvidence(result)`: consumer-side classification helpers for authoritative, fallback, partial, cached, stale, route-mismatch, and invalid route outputs.
+ - `createWebGpuRouteSchemaContract(input)`: compact schema/version contract for route repos that need conformance tests against this package.
+
+ ## Near-Term Direction
+
+ 1. Keep the route boundary stable enough for MoGE, SHARP, Kimodo, SF3D, and Pipeline/commoners to consume one package.
+ 2. Move browser device acquisition, feature profiling, staged timing, scheduler/backpressure reporting, and route receipts out of individual model repos as shared utilities.
+ 3. Extract bind-group, pipeline, uniform, buffer-cache, and kernel helpers only when at least two real routes need the same machinery or a measured slice proves the extraction useful.
+ 4. Preserve enough runtime posture for long routes to become breathable: a route should be able to state where it can yield, what that costs, and whether the browser actually honored the requested scheduling shape.
+ 5. Avoid becoming a generic ONNX, LLM, or universal tensor runtime until a concrete WebGPU route exposes an advantage we can actually own.
+
+ ## Non-Goals
 
  - Generic ONNX import parity.
- - General LLM runtime competition.
+ - Competing with mature general-purpose browser LLM runtimes without a concrete route-level advantage.
  - Kaminos graph, scene, library, or promotion ownership.
- - Any route that hides fallback, stale output, fixture data, partial output, or
- effective backend identity.
+ - Hidden caps below adapter/device capacity without measured justification.
+ - Treating fallback, stale output, fixture data, partial output, or missing backend identity as successful live inference.
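
Note on the device-request bullet in the README diff above: the text says devices are requested "using adapter-reported limits without imposing hidden caps". The package source is not part of this diff, so the following is only a minimal sketch of what that idea could look like with the standard WebGPU adapter surface (`adapter.features`, `adapter.limits`). The helper name `deriveDeviceRequest`, its return shape, and the mock adapter are assumptions made for illustration; they are not the kit's `createWebGpuDeviceRequest`.

```javascript
// Hypothetical sketch: not the kit's createWebGpuDeviceRequest implementation.
// Builds a GPUDeviceDescriptor-shaped request from an adapter-like object.
function deriveDeviceRequest(adapter, wantedFeatures = []) {
  // Split the wanted optional features into those the adapter advertises
  // and those it does not, so the caller can report the gap.
  const requiredFeatures = [];
  const missingFeatures = [];
  for (const name of wantedFeatures) {
    (adapter.features.has(name) ? requiredFeatures : missingFeatures).push(name);
  }

  // Copy every numeric limit at the adapter's reported value (no extra cap).
  // for...in is used because a real GPUSupportedLimits object exposes its
  // limits as inherited accessors, which Object.keys would not list.
  const requiredLimits = {};
  for (const name in adapter.limits) {
    const value = adapter.limits[name];
    if (typeof value === "number") requiredLimits[name] = value;
  }

  return { descriptor: { requiredFeatures, requiredLimits }, missingFeatures };
}

// Stand-in adapter so the sketch runs outside a browser.
const mockAdapter = {
  features: new Set(["timestamp-query"]),
  limits: { maxBufferSize: 4294967296, maxComputeInvocationsPerWorkgroup: 1024 },
};
const request = deriveDeviceRequest(mockAdapter, ["timestamp-query", "shader-f16"]);
console.log(request.missingFeatures); // features the adapter did not advertise
```

In a browser, the same descriptor could be passed to `adapter.requestDevice(request.descriptor)` after `navigator.gpu.requestAdapter()` resolves. Keeping `missingFeatures` next to the descriptor is one way to preserve the requested versus effective distinction the README describes.
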
package/package.json CHANGED
@@ -1,9 +1,9 @@
  {
  "name": "@kaminos/webgpu-inference-kit",
- "version": "0.1.1",
+ "version": "0.1.2",
  "private": false,
  "type": "module",
- "description": "Native WebGPU inference route substrate for Kaminos.",
+ "description": "Composable browser WebGPU inference route contracts, runtime profiles, and scheduler envelopes.",
  "license": "UNLICENSED",
  "repository": {
  "type": "git",
@@ -1,4 +1,4 @@
- export const WEBGPU_INFERENCE_KIT_VERSION = '0.1.1';
+ export const WEBGPU_INFERENCE_KIT_VERSION = '0.1.2';
  const DEFAULT_KIT_VERSION = WEBGPU_INFERENCE_KIT_VERSION;
  const DEFAULT_TIMING_SOURCE = 'queue-submit-wait';
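
Note on `DEFAULT_TIMING_SOURCE`: the value `'queue-submit-wait'` and the README's "staged queue-submit timing" suggest that stage durations are measured on the host around waits for submitted GPU work, rather than taken from timestamp queries. The sketch below illustrates that measurement style under stated assumptions. `createStageRecorder`, its methods, and the record fields are hypothetical names, not the kit's `createStagedSubmitProfile` API; the clock is injectable so the sketch can run without a GPU.

```javascript
// Hypothetical recorder; names and field shapes are illustrative only.
function createStageRecorder(now = () => performance.now()) {
  const stages = [];
  let open = null;
  return {
    // Mark the start of a stage just before its GPU work is submitted.
    begin(name) {
      if (open) throw new Error(`stage "${open.name}" is still open`);
      open = { name, startMs: now() };
    },
    // Mark the end of the stage after the submitted work has been awaited.
    end() {
      if (!open) throw new Error("no open stage");
      stages.push({
        name: open.name,
        elapsedMs: now() - open.startMs,
        timingSource: "queue-submit-wait",
      });
      open = null;
    },
    // Summarize the run and report any required stage that never ran.
    finish(requiredStages = []) {
      const seen = new Set(stages.map((stage) => stage.name));
      const missingStages = requiredStages.filter((name) => !seen.has(name));
      return {
        stages: [...stages],
        missingStages,
        complete: open === null && missingStages.length === 0,
      };
    },
  };
}
```

With a real device, a stage would typically be bracketed as `recorder.begin("encoder")`, then `device.queue.submit([...])`, then `await device.queue.onSubmittedWorkDone()`, then `recorder.end()`. The elapsed time then includes queueing and host wait overhead, which is one reason to label the timing source explicitly instead of presenting it as GPU execution time.
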