npm - @plasius/gpu-lock-free-queue - Versions diffs - 0.1.2 → 0.2.2 - Mend

@plasius/gpu-lock-free-queue 0.1.2 → 0.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/CHANGELOG.md CHANGED

@@ -20,6 +20,85 @@ The format is based on **[Keep a Changelog](https://keepachangelog.com/en/1.1.0/
 - **Security**
   - (placeholder)
+## [0.2.2] - 2026-02-28
+- **Added**
+  - (placeholder)
+- **Changed**
+  - (placeholder)
+- **Fixed**
+  - (placeholder)
+- **Security**
+  - (placeholder)
+## [0.2.2] - 2026-02-28
+- **Added**
+  - (placeholder)
+- **Changed**
+  - (placeholder)
+- **Fixed**
+  - (placeholder)
+- **Security**
+  - (placeholder)
+## [0.2.2] - 2026-02-28
+- **Added**
+  - (placeholder)
+- **Changed**
+  - (placeholder)
+- **Fixed**
+  - (placeholder)
+- **Security**
+  - (placeholder)
+## [0.2.2] - 2026-02-28
+- **Added**
+  - (placeholder)
+- **Changed**
+  - (placeholder)
+- **Fixed**
+  - (placeholder)
+- **Security**
+  - (placeholder)
+## [0.2.1] - 2026-01-23
+- **Changed**
+  - **Breaking:** Queue payloads are now referenced by fixed metadata offsets into caller-managed payload buffers (no internal payload arena).
+  - **Breaking:** Queue header and bindings updated to remove payload arena fields and buffer.
+  - Demo and tests updated to reflect the new payload-handle layout.
+  - **Breaking:** Queue header now includes payload arena head/tail and capacity/mask fields.
+  - Queue helpers now expose a `queue_len` backlog snapshot for schedulers.
+  - Demo and tests updated to use job metadata and variable payload copies.
+- **Fixed**
+  - Payload allocations now validate arena capacity before enqueue.
+## [0.2.0] - 2026-01-23
+- **Changed**
+  - **Breaking:** WGSL bindings now include a dedicated payload ring buffer plus input/output payload buffers.
+  - Queue headers now carry `payload_stride` (u32 words) and job payloads are copied into the ring on enqueue.
+  - Demo and tests updated to use payload buffers instead of `input_jobs`/`output_jobs`.
+- **Fixed**
+  - Payload job counts now clamp to payload buffer lengths to prevent overruns.
 ## [0.1.2] - 2026-01-22
 - **Added**
@@ -51,3 +130,20 @@ The format is based on **[Keep a Changelog](https://keepachangelog.com/en/1.1.0/
 [0.1.0]: https://github.com/Plasius-LTD/gpu-lock-free-queue/releases/tag/v0.1.0
 [0.1.2]: https://github.com/Plasius-LTD/gpu-lock-free-queue/releases/tag/v0.1.2
+[0.2.0]: https://github.com/Plasius-LTD/gpu-lock-free-queue/releases/tag/v0.2.0
+[0.2.1]: https://github.com/Plasius-LTD/gpu-lock-free-queue/releases/tag/v0.2.1
+## [0.2.1] - 2026-02-11
+- **Added**
+  - Initial release.
+- **Changed**
+  - (placeholder)
+- **Fixed**
+  - (placeholder)
+- **Security**
+  - (placeholder)
+[0.2.2]: https://github.com/Plasius-LTD/gpu-lock-free-queue/releases/tag/v0.2.2

package/README.md CHANGED

@@ -1,6 +1,13 @@
 # @plasius/gpu-lock-free-queue
-[![npm version](https://img.shields.io/npm/v/@plasius/gpu-lock-free-queue)](https://www.npmjs.com/package/@plasius/gpu-lock-free-queue)
+[![npm version](https://img.shields.io/npm/v/@plasius/gpu-lock-free-queue.svg)](https://www.npmjs.com/package/@plasius/gpu-lock-free-queue)
+[![Build Status](https://img.shields.io/github/actions/workflow/status/Plasius-LTD/gpu-lock-free-queue/ci.yml?branch=main&label=build&style=flat)](https://github.com/Plasius-LTD/gpu-lock-free-queue/actions/workflows/ci.yml)
+[![coverage](https://img.shields.io/codecov/c/github/Plasius-LTD/gpu-lock-free-queue)](https://codecov.io/gh/Plasius-LTD/gpu-lock-free-queue)
+[![License](https://img.shields.io/github/license/Plasius-LTD/gpu-lock-free-queue)](./LICENSE)
+[![Code of Conduct](https://img.shields.io/badge/code%20of%20conduct-yes-blue.svg)](./CODE_OF_CONDUCT.md)
+[![Security Policy](https://img.shields.io/badge/security%20policy-yes-orange.svg)](./SECURITY.md)
+[![Changelog](https://img.shields.io/badge/changelog-md-blue.svg)](./CHANGELOG.md)
 [![CI](https://github.com/Plasius-LTD/gpu-lock-free-queue/actions/workflows/ci.yml/badge.svg)](https://github.com/Plasius-LTD/gpu-lock-free-queue/actions/workflows/ci.yml)
 [![license](https://img.shields.io/npm/l/@plasius/gpu-lock-free-queue)](./LICENSE)
@@ -25,11 +32,24 @@ const shaderCode = await loadQueueWgsl();
 ## What this is
 - Lock-free multi-producer, multi-consumer ring queue on the GPU.
 - Uses per-slot sequence numbers to avoid ABA for slots within a 32-bit epoch.
-- Fixed-size jobs (u32) for now; a "job" can be expanded to a fixed-size struct or an index into a separate payload buffer.
+- Fixed-size job metadata with payload offsets into a caller-managed data arena or buffer.
+## Buffer layout (breaking change in v0.4.0)
+Bindings are:
+1. `@binding(0)` queue header: `{ head, tail, capacity, mask }`
+2. `@binding(1)` slot array (`Slot` with `seq`, `job_type`, `payload_offset`, `payload_words`)
+3. `@binding(2)` input jobs (`array<JobMeta>` with `job_type`, `payload_offset`, `payload_words`)
+4. `@binding(3)` output jobs (`array<JobMeta>` with `job_type`, `payload_offset`, `payload_words`)
+5. `@binding(4)` input payloads (`array<u32>`, payload data referenced by `input_jobs.payload_offset`)
+6. `@binding(5)` output payloads (`array<u32>`, length `job_count * output_stride`)
+7. `@binding(6)` status flags (`array<u32>`, length `job_count`)
+8. `@binding(7)` params (`Params` with `job_count`, `output_stride`)
+`output_stride` is the per-job output stride (u32 words) used when copying payloads into `output_payloads`.
 ## Limitations
 - Sequence counters are 32-bit. At extreme throughput over a long time, counters wrap and ABA can reappear. If you need true long-running safety, consider a reset protocol, sharding, or a future 64-bit atomic extension.
-- Jobs are fixed-size and must be power-of-two capacity.
+- Payload lifetimes are managed by the caller. Ensure payload buffers remain valid until consumers finish, or use frame-bounded arenas/generation handles.
 - This demo is intentionally minimal; it is not yet integrated with a scheduler or backpressure policy.
 ## Run the demo
@@ -58,5 +78,5 @@ npm run test:e2e
 - `src/queue.wgsl`: Lock-free queue implementation.
 - `src/index.js`: Package entry point for loading the WGSL file.
-## Job shape
-Current jobs are `u32` values. If you need richer jobs, use a fixed-size struct (e.g., 16 bytes) or store indices into a separate payload buffer. Variable-length jobs should be modeled as an index + length into a payload arena to keep the queue fixed-size.
+## Payload shape
+Payloads are variable-length chunks stored in a caller-managed buffer. Each job specifies `job_type`, `payload_offset`, and `payload_words` in `input_jobs`; dequeue copies payloads from `input_payloads` into `output_payloads` using `output_stride` and mirrors the metadata into `output_jobs`. If you need `f32`, store `bitcast<u32>(value)` and reinterpret on the consumer side.
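
Note on the README change above: the new layout means the host has to build three buffers before dispatch: `input_jobs` (four `u32` words per `JobMeta`, including `_pad`), `input_payloads` (the caller-managed payload words), and `params` (`job_count`, `output_stride`, plus two padding words). A minimal host-side sketch of that packing follows. The helper names `packJobs` and `f32ToBits` and the `{ jobType, payload }` job shape are assumptions for illustration and are not part of the package.

```javascript
// Hypothetical host-side helpers; none of these names come from the package.
const JOB_META_WORDS = 4; // job_type, payload_offset, payload_words, _pad

// jobs: array of { jobType: number, payload: Uint32Array }
function packJobs(jobs, outputStride) {
  const totalWords = jobs.reduce((n, job) => n + job.payload.length, 0);
  const inputJobs = new Uint32Array(jobs.length * JOB_META_WORDS);
  const inputPayloads = new Uint32Array(totalWords);

  let offset = 0; // running offset in u32 words into inputPayloads
  jobs.forEach((job, i) => {
    const base = i * JOB_META_WORDS;
    inputJobs[base + 0] = job.jobType;        // job_type
    inputJobs[base + 1] = offset;             // payload_offset
    inputJobs[base + 2] = job.payload.length; // payload_words
    inputJobs[base + 3] = 0;                  // _pad
    inputPayloads.set(job.payload, offset);
    offset += job.payload.length;
  });

  // Params: job_count, output_stride, then _pad: vec2<u32>
  const params = new Uint32Array([jobs.length, outputStride, 0, 0]);
  return { inputJobs, inputPayloads, params };
}

// Host-side counterpart of storing bitcast<u32>(value) for f32 payloads:
// reinterpret the float bits as u32 words without converting the values.
function f32ToBits(values) {
  return new Uint32Array(new Float32Array(values).buffer);
}
```

The three typed arrays would then be uploaded to bindings 2, 4, and 7 respectively; buffer creation and upload are left out because they depend on the caller's WebGPU setup.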

package/dist/index.cjs CHANGED

@@ -1,46 +1,11 @@
-var __create = Object.create;
-var __defProp = Object.defineProperty;
-var __getOwnPropDesc = Object.getOwnPropertyDescriptor;
-var __getOwnPropNames = Object.getOwnPropertyNames;
-var __getProtoOf = Object.getPrototypeOf;
-var __hasOwnProp = Object.prototype.hasOwnProperty;
-var __export = (target, all) => {
-  for (var name in all)
-    __defProp(target, name, { get: all[name], enumerable: true });
-};
-var __copyProps = (to, from, except, desc) => {
-  if (from && typeof from === "object" || typeof from === "function") {
-    for (let key of __getOwnPropNames(from))
-      if (!__hasOwnProp.call(to, key) && key !== except)
-        __defProp(to, key, { get: () => from[key], enumerable: !(desc = __getOwnPropDesc(from, key)) || desc.enumerable });
-  }
-  return to;
-};
-var __toESM = (mod, isNodeMode, target) => (target = mod != null ? __create(__getProtoOf(mod)) : {}, __copyProps(
-  // If the importer is in node compatibility mode or this is not an ESM
-  // file that has been converted to a CommonJS file using a Babel-
-  // compatible transform (i.e. "__esModule" has not been set), then set
-  // "default" to the CommonJS "module.exports" for node compatibility.
-  isNodeMode || !mod || !mod.__esModule ? __defProp(target, "default", { value: mod, enumerable: true }) : target,
-  mod
-));
-var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: true }), mod);
-// src/index.js
-var index_exports = {};
-__export(index_exports, {
-  loadQueueWgsl: () => loadQueueWgsl,
-  queueWgslUrl: () => queueWgslUrl
-});
-module.exports = __toCommonJS(index_exports);
-var import_meta = {};
-var queueWgslUrl = new URL("./queue.wgsl", import_meta.url);
+// src/index.cjs
+var { pathToFileURL, fileURLToPath } = require("url");
+var { readFile } = require("fs/promises");
+var queueWgslUrl = new URL("./queue.wgsl", pathToFileURL(__filename));
 async function loadQueueWgsl(options = {}) {
   const { url = queueWgslUrl, fetcher = globalThis.fetch } = options ?? {};
   const wgslUrl = url instanceof URL ? url : new URL(url, queueWgslUrl);
   if (!fetcher || wgslUrl.protocol === "file:") {
-    const { readFile } = await import("fs/promises");
-    const { fileURLToPath } = await import("url");
     return readFile(fileURLToPath(wgslUrl), "utf8");
   }
   const response = await fetcher(wgslUrl);
@@ -52,9 +17,8 @@ async function loadQueueWgsl(options = {}) {
   }
   return response.text();
 }
-// Annotate the CommonJS export names for ESM import in node:
-0 && (module.exports = {
-  loadQueueWgsl,
-  queueWgslUrl
-});
+module.exports = {
+  queueWgslUrl,
+  loadQueueWgsl
+};
 //# sourceMappingURL=index.cjs.map
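
Note on the loader above: `loadQueueWgsl` accepts `url` and `fetcher` options, and it only calls the fetcher when one is available and the resolved URL is not a `file:` URL. That makes it possible to exercise the fetch path without a network by passing a stub. The sketch below shows one such stub; `makeStubFetcher` is a hypothetical helper, and the response object carries only the fields the loader reads in this diff (`ok`, `status`, `statusText`, `text()`).

```javascript
// Hypothetical test helper; not part of the package.
// Returns a fetch-like function that resolves to a minimal response object.
function makeStubFetcher(wgslSource, { ok = true, status = 200, statusText = "OK" } = {}) {
  return async function stubFetcher(_url) {
    return {
      ok,
      status,
      statusText,
      text: async () => wgslSource,
    };
  };
}

// Usage sketch (assumes the package is installed and resolvable via require):
//
//   const { loadQueueWgsl } = require("@plasius/gpu-lock-free-queue");
//   const code = await loadQueueWgsl({
//     url: "https://example.invalid/queue.wgsl", // non-file URL, so the fetcher is used
//     fetcher: makeStubFetcher("// wgsl source"),
//   });
```

Going by the error branch shown in the diff, a stub built with `{ ok: false, status: 404, statusText: "Not Found" }` should make the loader throw `Failed to load WGSL (404 Not Found)`.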

package/dist/index.cjs.map CHANGED

	@@ -1 +1 @@
1	- {"version":3,"sources":["../src/index.js"],"sourcesContent":["~~export~~ const queueWgslUrl = new URL(\"./queue.wgsl\", ~~import.meta.url~~);\n\~~nexport~~ ~~async~~ function loadQueueWgsl(options = {}) {\n const { url = queueWgslUrl, fetcher = globalThis.fetch } = options ?? {};\n const wgslUrl = url instanceof URL ? url : new URL(url, queueWgslUrl);\n\n if (!fetcher \|\| wgslUrl.protocol === \"file:\") {\n ~~const { readFile } = await import(\"node:fs/promises\");\n const { fileURLToPath } = await import(\"node:url\");\n~~ return readFile(fileURLToPath(wgslUrl), \"utf8\");\n }\n\n const response = await fetcher(wgslUrl);\n if (!response.ok) {\n const status = \"status\" in response ? response.status : \"unknown\";\n const statusText = \"statusText\" in response ? response.statusText : \"\";\n const detail = statusText ? `${status} ${statusText}` : `${status}`;\n throw new Error(`Failed to load WGSL (${detail})`);\n }\n return response.text();\n}\n"],"mappings":"~~;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AAAA~~;AAAA;~~AAAA~~;~~AAAA;AAAA;AAAA;AAAA;AAAO~~,IAAM,eAAe,IAAI,IAAI,gBAAgB,~~YAAY~~,~~GAAG~~;~~AAEnE~~,~~eAAsB~~,cAAc,UAAU,CAAC,GAAG;~~AAChD~~,QAAM,EAAE,MAAM,cAAc,UAAU,WAAW,MAAM,IAAI,WAAW,CAAC;AACvE,QAAM,UAAU,eAAe,MAAM,MAAM,IAAI,IAAI,KAAK,YAAY;AAEpE,MAAI,CAAC,WAAW,QAAQ,aAAa,SAAS;AAC5C,~~UAAM,EAAE,SAAS,IAAI,MAAM,OAAO,aAAkB;AACpD,UAAM,EAAE,cAAc,IAAI,MAAM,OAAO,KAAU;AACjD,~~WAAO,SAAS,cAAc,OAAO,GAAG,MAAM;AAAA,EAChD;AAEA,QAAM,WAAW,MAAM,QAAQ,OAAO;AACtC,MAAI,CAAC,SAAS,IAAI;AAChB,UAAM,SAAS,YAAY,WAAW,SAAS,SAAS;AACxD,UAAM,aAAa,gBAAgB,WAAW,SAAS,aAAa;AACpE,UAAM,SAAS,aAAa,GAAG,MAAM,IAAI,UAAU,KAAK,GAAG,MAAM;AACjE,UAAM,IAAI,MAAM,wBAAwB,MAAM,GAAG;AAAA,EACnD;AACA,SAAO,SAAS,KAAK;AACvB;","names":[]}
1	+ {"version":3,"sources":["../src/index.cjs"],"sourcesContent":["const { pathToFileURL, fileURLToPath } = require(\"node:url\");\nconst { readFile } = require(\"node:fs/promises\");\n\nconst queueWgslUrl = new URL(\"./queue.wgsl\", pathToFileURL(__filename));\n\nasync function loadQueueWgsl(options = {}) {\n const { url = queueWgslUrl, fetcher = globalThis.fetch } = options ?? {};\n const wgslUrl = url instanceof URL ? url : new URL(url, queueWgslUrl);\n\n if (!fetcher \|\| wgslUrl.protocol === \"file:\") {\n return readFile(fileURLToPath(wgslUrl), \"utf8\");\n }\n\n const response = await fetcher(wgslUrl);\n if (!response.ok) {\n const status = \"status\" in response ? response.status : \"unknown\";\n const statusText = \"statusText\" in response ? response.statusText : \"\";\n const detail = statusText ? `${status} ${statusText}` : `${status}`;\n throw new Error(`Failed to load WGSL (${detail})`);\n }\n return response.text();\n}\n\nmodule.exports = {\n queueWgslUrl,\n loadQueueWgsl,\n};\n"],"mappings":";AAAA,IAAM,EAAE,eAAe,cAAc,IAAI,QAAQ,KAAU;AAC3D,IAAM,EAAE,SAAS,IAAI,QAAQ,aAAkB;AAE/C,IAAM,eAAe,IAAI,IAAI,gBAAgB,cAAc,UAAU,CAAC;AAEtE,eAAe,cAAc,UAAU,CAAC,GAAG;AACzC,QAAM,EAAE,MAAM,cAAc,UAAU,WAAW,MAAM,IAAI,WAAW,CAAC;AACvE,QAAM,UAAU,eAAe,MAAM,MAAM,IAAI,IAAI,KAAK,YAAY;AAEpE,MAAI,CAAC,WAAW,QAAQ,aAAa,SAAS;AAC5C,WAAO,SAAS,cAAc,OAAO,GAAG,MAAM;AAAA,EAChD;AAEA,QAAM,WAAW,MAAM,QAAQ,OAAO;AACtC,MAAI,CAAC,SAAS,IAAI;AAChB,UAAM,SAAS,YAAY,WAAW,SAAS,SAAS;AACxD,UAAM,aAAa,gBAAgB,WAAW,SAAS,aAAa;AACpE,UAAM,SAAS,aAAa,GAAG,MAAM,IAAI,UAAU,KAAK,GAAG,MAAM;AACjE,UAAM,IAAI,MAAM,wBAAwB,MAAM,GAAG;AAAA,EACnD;AACA,SAAO,SAAS,KAAK;AACvB;AAEA,OAAO,UAAU;AAAA,EACf;AAAA,EACA;AACF;","names":[]}

package/dist/queue.wgsl CHANGED

@@ -3,26 +3,36 @@ struct Queue {
   tail: atomic<u32>,
   capacity: u32,
   mask: u32,
-  _pad: vec2<u32>,
 };
 struct Slot {
   seq: atomic<u32>,
-  value: u32,
-  _pad: vec2<u32>,
+  job_type: u32,
+  payload_offset: u32,
+  payload_words: u32,
+};
+struct JobMeta {
+  job_type: u32,
+  payload_offset: u32,
+  payload_words: u32,
+  _pad: u32,
 };
 struct Params {
   job_count: u32,
-  _pad: vec3<u32>,
+  output_stride: u32,
+  _pad: vec2<u32>,
 };
 @group(0) @binding(0) var<storage, read_write> queue: Queue;
 @group(0) @binding(1) var<storage, read_write> slots: array<Slot>;
-@group(0) @binding(2) var<storage, read> input_jobs: array<u32>;
-@group(0) @binding(3) var<storage, read_write> output_jobs: array<u32>;
-@group(0) @binding(4) var<storage, read_write> status: array<u32>;
-@group(0) @binding(5) var<uniform> params: Params;
+@group(0) @binding(2) var<storage, read> input_jobs: array<JobMeta>;
+@group(0) @binding(3) var<storage, read_write> output_jobs: array<JobMeta>;
+@group(0) @binding(4) var<storage, read> input_payloads: array<u32>;
+@group(0) @binding(5) var<storage, read_write> output_payloads: array<u32>;
+@group(0) @binding(6) var<storage, read_write> status: array<u32>;
+@group(0) @binding(7) var<uniform> params: Params;
 const MAX_RETRIES: u32 = 512u;
@@ -48,11 +58,28 @@ fn enqueue_job_count() -> u32 {
 }
 fn dequeue_job_count() -> u32 {
-  let count = min(params.job_count, arrayLength(&output_jobs));
+  if (params.output_stride == 0u) {
+    return 0u;
+  }
+  let payload_jobs = arrayLength(&output_payloads) / params.output_stride;
+  var count = min(params.job_count, arrayLength(&output_jobs));
+  count = min(count, payload_jobs);
   return min(count, arrayLength(&status));
 }
-fn enqueue(val: u32) -> u32 {
+fn queue_len() -> u32 {
+  let h = atomicLoad(&queue.head);
+  let t = atomicLoad(&queue.tail);
+  return t - h;
+}
+fn enqueue(idx: u32) -> u32 {
+  let job = input_jobs[idx];
+  let payload_words = job.payload_words;
+  let input_offset = job.payload_offset;
+  if (input_offset + payload_words > arrayLength(&input_payloads)) {
+    return 0u;
+  }
   for (var attempt: u32 = 0u; attempt < MAX_RETRIES; attempt++) {
     let t = atomicLoad(&queue.tail);
     let slot_index = t & queue.mask;
@@ -62,7 +89,9 @@ fn enqueue(val: u32) -> u32 {
     if (diff == 0) {
       let res = atomicCompareExchangeWeak(&queue.tail, t, t + 1u);
       if (res.exchanged) {
-        slots[slot_index].value = val;
+        slots[slot_index].job_type = job.job_type;
+        slots[slot_index].payload_offset = input_offset;
+        slots[slot_index].payload_words = payload_words;
         atomicStore(&slots[slot_index].seq, t + 1u);
         return 1u;
       }
@@ -84,8 +113,22 @@ fn dequeue(idx: u32) -> u32 {
     if (diff == 0) {
       let res = atomicCompareExchangeWeak(&queue.head, h, h + 1u);
       if (res.exchanged) {
-        let val = slots[slot_index].value;
-        output_jobs[idx] = val;
+        let payload_offset = slots[slot_index].payload_offset;
+        let payload_words = slots[slot_index].payload_words;
+        let job_type = slots[slot_index].job_type;
+        let output_stride = params.output_stride;
+        let dst_base = idx * output_stride;
+        let copy_words = min(payload_words, output_stride);
+        for (var i: u32 = 0u; i < copy_words; i = i + 1u) {
+          output_payloads[dst_base + i] = input_payloads[payload_offset + i];
+        }
+        for (var i: u32 = copy_words; i < output_stride; i = i + 1u) {
+          output_payloads[dst_base + i] = 0u;
+        }
+        output_jobs[idx].job_type = job_type;
+        output_jobs[idx].payload_offset = payload_offset;
+        output_jobs[idx].payload_words = payload_words;
+        output_jobs[idx]._pad = 0u;
         atomicStore(&slots[slot_index].seq, h + queue.capacity);
         return 1u;
       }
@@ -111,7 +154,7 @@ fn enqueue_main(@builtin(global_invocation_id) gid: vec3<u32>) {
     return;
   }
-  let ok = enqueue(input_jobs[idx]);
+  let ok = enqueue(idx);
   if (ok == 1u) {
     status[idx] = 1u;
   }
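
Note on the shader changes above: three of the new pieces are easy to mirror on the CPU, which may help when writing host-side tests: the wrapping subtraction in `queue_len`, the clamping in `dequeue_job_count`, and the copy-then-zero-fill loop in `dequeue`. The following is a reference sketch only. The function names are illustrative, and it models the arithmetic rather than the atomics or the retry loop.

```javascript
// CPU-side reference for helpers shown in the diff; names are illustrative.

// queue_len(): WGSL u32 subtraction wraps, so emulate it with >>> 0.
function queueLen(head, tail) {
  return (tail - head) >>> 0;
}

// dequeue_job_count(): clamp to every output buffer, and return 0 for a zero stride.
function dequeueJobCount(jobCount, outputJobsLen, outputPayloadsLen, statusLen, outputStride) {
  if (outputStride === 0) return 0;
  const payloadJobs = Math.floor(outputPayloadsLen / outputStride);
  return Math.min(jobCount, outputJobsLen, payloadJobs, statusLen);
}

// Payload copy inside dequeue(): copy at most outputStride words into the
// job's output slot, then zero the rest of the slot. Returns the words copied.
function copyPayload(inputPayloads, outputPayloads, idx, payloadOffset, payloadWords, outputStride) {
  const dstBase = idx * outputStride;
  const copyWords = Math.min(payloadWords, outputStride);
  for (let i = 0; i < copyWords; i++) {
    outputPayloads[dstBase + i] = inputPayloads[payloadOffset + i];
  }
  for (let i = copyWords; i < outputStride; i++) {
    outputPayloads[dstBase + i] = 0;
  }
  return copyWords;
}
```

One consequence visible in the diff: when `payload_words` exceeds `output_stride` the copy is truncated, while `output_jobs[idx].payload_words` still records the original length, so consumers may want to compare the two.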

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@plasius/gpu-lock-free-queue",
-  "version": "0.1.2",
+  "version": "0.2.2",
   "description": "WebGPU lock-free MPMC ring queue with sequence counters.",
   "type": "module",
   "sideEffects": false,
@@ -29,7 +29,9 @@
     "test": "npm run test:unit",
     "test:unit": "node --test",
     "test:e2e": "npx playwright install chromium && playwright test",
-    "test:coverage": "c8 --reporter=lcov --reporter=text node --test"
+    "test:coverage": "c8 --reporter=lcov --reporter=text node --test",
+    "pack:check": "node scripts/verify-public-package.cjs",
+    "prepublishOnly": "npm run build && npm run pack:check"
   },
   "keywords": [
     "webgpu",
@@ -71,5 +73,8 @@
       "type": "github",
       "url": "https://github.com/sponsors/Plasius-LTD"
     }
-  ]
+  ],
+  "overrides": {
+    "minimatch": "^10.2.1"
+  }
 }

package/src/index.cjs ADDED

@@ -0,0 +1,27 @@
+const { pathToFileURL, fileURLToPath } = require("node:url");
+const { readFile } = require("node:fs/promises");
+const queueWgslUrl = new URL("./queue.wgsl", pathToFileURL(__filename));
+async function loadQueueWgsl(options = {}) {
+  const { url = queueWgslUrl, fetcher = globalThis.fetch } = options ?? {};
+  const wgslUrl = url instanceof URL ? url : new URL(url, queueWgslUrl);
+  if (!fetcher || wgslUrl.protocol === "file:") {
+    return readFile(fileURLToPath(wgslUrl), "utf8");
+  }
+  const response = await fetcher(wgslUrl);
+  if (!response.ok) {
+    const status = "status" in response ? response.status : "unknown";
+    const statusText = "statusText" in response ? response.statusText : "";
+    const detail = statusText ? `${status} ${statusText}` : `${status}`;
+    throw new Error(`Failed to load WGSL (${detail})`);
+  }
+  return response.text();
+}
+module.exports = {
+  queueWgslUrl,
+  loadQueueWgsl,
+};

package/src/queue.wgsl CHANGED

@@ -3,26 +3,36 @@ struct Queue {
   tail: atomic<u32>,
   capacity: u32,
   mask: u32,
-  _pad: vec2<u32>,
 };
 struct Slot {
   seq: atomic<u32>,
-  value: u32,
-  _pad: vec2<u32>,
+  job_type: u32,
+  payload_offset: u32,
+  payload_words: u32,
+};
+struct JobMeta {
+  job_type: u32,
+  payload_offset: u32,
+  payload_words: u32,
+  _pad: u32,
 };
 struct Params {
   job_count: u32,
-  _pad: vec3<u32>,
+  output_stride: u32,
+  _pad: vec2<u32>,
 };
 @group(0) @binding(0) var<storage, read_write> queue: Queue;
 @group(0) @binding(1) var<storage, read_write> slots: array<Slot>;
-@group(0) @binding(2) var<storage, read> input_jobs: array<u32>;
-@group(0) @binding(3) var<storage, read_write> output_jobs: array<u32>;
-@group(0) @binding(4) var<storage, read_write> status: array<u32>;
-@group(0) @binding(5) var<uniform> params: Params;
+@group(0) @binding(2) var<storage, read> input_jobs: array<JobMeta>;
+@group(0) @binding(3) var<storage, read_write> output_jobs: array<JobMeta>;
+@group(0) @binding(4) var<storage, read> input_payloads: array<u32>;
+@group(0) @binding(5) var<storage, read_write> output_payloads: array<u32>;
+@group(0) @binding(6) var<storage, read_write> status: array<u32>;
+@group(0) @binding(7) var<uniform> params: Params;
 const MAX_RETRIES: u32 = 512u;
@@ -48,11 +58,28 @@ fn enqueue_job_count() -> u32 {
 }
 fn dequeue_job_count() -> u32 {
-  let count = min(params.job_count, arrayLength(&output_jobs));
+  if (params.output_stride == 0u) {
+    return 0u;
+  }
+  let payload_jobs = arrayLength(&output_payloads) / params.output_stride;
+  var count = min(params.job_count, arrayLength(&output_jobs));
+  count = min(count, payload_jobs);
   return min(count, arrayLength(&status));
 }
-fn enqueue(val: u32) -> u32 {
+fn queue_len() -> u32 {
+  let h = atomicLoad(&queue.head);
+  let t = atomicLoad(&queue.tail);
+  return t - h;
+}
+fn enqueue(idx: u32) -> u32 {
+  let job = input_jobs[idx];
+  let payload_words = job.payload_words;
+  let input_offset = job.payload_offset;
+  if (input_offset + payload_words > arrayLength(&input_payloads)) {
+    return 0u;
+  }
   for (var attempt: u32 = 0u; attempt < MAX_RETRIES; attempt++) {
     let t = atomicLoad(&queue.tail);
     let slot_index = t & queue.mask;
@@ -62,7 +89,9 @@ fn enqueue(val: u32) -> u32 {
     if (diff == 0) {
       let res = atomicCompareExchangeWeak(&queue.tail, t, t + 1u);
       if (res.exchanged) {
-        slots[slot_index].value = val;
+        slots[slot_index].job_type = job.job_type;
+        slots[slot_index].payload_offset = input_offset;
+        slots[slot_index].payload_words = payload_words;
         atomicStore(&slots[slot_index].seq, t + 1u);
         return 1u;
       }
@@ -84,8 +113,22 @@ fn dequeue(idx: u32) -> u32 {
     if (diff == 0) {
       let res = atomicCompareExchangeWeak(&queue.head, h, h + 1u);
       if (res.exchanged) {
-        let val = slots[slot_index].value;
-        output_jobs[idx] = val;
+        let payload_offset = slots[slot_index].payload_offset;
+        let payload_words = slots[slot_index].payload_words;
+        let job_type = slots[slot_index].job_type;
+        let output_stride = params.output_stride;
+        let dst_base = idx * output_stride;
+        let copy_words = min(payload_words, output_stride);
+        for (var i: u32 = 0u; i < copy_words; i = i + 1u) {
+          output_payloads[dst_base + i] = input_payloads[payload_offset + i];
+        }
+        for (var i: u32 = copy_words; i < output_stride; i = i + 1u) {
+          output_payloads[dst_base + i] = 0u;
+        }
+        output_jobs[idx].job_type = job_type;
+        output_jobs[idx].payload_offset = payload_offset;
+        output_jobs[idx].payload_words = payload_words;
+        output_jobs[idx]._pad = 0u;
         atomicStore(&slots[slot_index].seq, h + queue.capacity);
         return 1u;
       }
@@ -111,7 +154,7 @@ fn enqueue_main(@builtin(global_invocation_id) gid: vec3<u32>) {
     return;
   }
-  let ok = enqueue(input_jobs[idx]);
+  let ok = enqueue(idx);
   if (ok == 1u) {
     status[idx] = 1u;
   }
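
Note on the `enqueue` bounds check above: the check `input_offset + payload_words > arrayLength(&input_payloads)` is computed in `u32`, where addition wraps modulo 2^32. For very large offsets the sum can wrap to a small value and pass the check. The sketch below emulates the check as written and contrasts it with a wrap-safe variant. Both function names are hypothetical, and the variant is a suggestion under that wrapping assumption, not something the package does.

```javascript
// Emulates the u32 check as it appears in the diff (sum wraps modulo 2^32).
function inBoundsAsWritten(offset, words, len) {
  const sum = (offset + words) >>> 0; // emulate u32 wraparound
  return !(sum > len);
}

// Wrap-safe variant: compare without ever forming offset + words.
function inBoundsWrapSafe(offset, words, len) {
  return words <= len && offset <= len - words;
}

// Example: offset 0xFFFFFFFF with 2 words wraps to a sum of 1.
// inBoundsAsWritten(0xffffffff, 2, 16) accepts it; inBoundsWrapSafe rejects it.
```

For offsets and lengths that do not overflow, the two checks agree, so the variant only changes behavior for the wrapping case.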