queasy 0.1.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (43) hide show
  1. package/.github/workflows/check.yml +50 -0
  2. package/.github/workflows/publish.yml +44 -0
  3. package/AGENTS.md +2 -1
  4. package/CLAUDE.md +1 -1
  5. package/Readme.md +32 -22
  6. package/docker-compose.yml +0 -2
  7. package/fuzztest/Readme.md +185 -0
  8. package/fuzztest/fuzz.js +354 -0
  9. package/fuzztest/handlers/cascade-a.js +94 -0
  10. package/fuzztest/handlers/cascade-b.js +72 -0
  11. package/fuzztest/handlers/fail-handler.js +52 -0
  12. package/fuzztest/handlers/periodic.js +93 -0
  13. package/fuzztest/process.js +100 -0
  14. package/fuzztest/shared/chaos.js +28 -0
  15. package/fuzztest/shared/stream.js +40 -0
  16. package/package.json +7 -4
  17. package/plans/redis-options.md +279 -0
  18. package/src/client.js +100 -30
  19. package/src/constants.js +3 -0
  20. package/src/pool.js +4 -7
  21. package/src/queasy.lua +33 -40
  22. package/src/queue.js +4 -11
  23. package/src/types.ts +15 -0
  24. package/src/utils.js +26 -0
  25. package/test/client.test.js +39 -41
  26. package/test/errors.test.js +12 -12
  27. package/test/fixtures/always-fail-handler.js +3 -3
  28. package/test/fixtures/data-logger-handler.js +5 -0
  29. package/test/fixtures/failure-handler.js +2 -2
  30. package/test/fixtures/permanent-error-handler.js +3 -3
  31. package/test/fixtures/slow-handler.js +2 -2
  32. package/test/fixtures/success-handler.js +3 -3
  33. package/test/fixtures/with-failure-handler.js +3 -3
  34. package/test/guards.test.js +131 -0
  35. package/test/manager.test.js +217 -70
  36. package/test/pool.test.js +153 -57
  37. package/test/queue.test.js +6 -5
  38. package/test/redis-functions.test.js +18 -12
  39. package/test/utils.test.js +52 -0
  40. package/.claude/settings.local.json +0 -27
  41. package/.zed/settings.json +0 -39
  42. package/doc/Implementation.md +0 -70
  43. package/test/index.test.js +0 -55
@@ -0,0 +1,50 @@
1
+ name: Check
2
+
3
+ on:
4
+ pull_request:
5
+ branches: [master]
6
+
7
+ jobs:
8
+ check:
9
+ runs-on: ubuntu-latest
10
+
11
+ services:
12
+ redis:
13
+ image: redis:7
14
+ ports:
15
+ - 6379:6379
16
+ options: >-
17
+ --health-cmd "redis-cli ping"
18
+ --health-interval 10s
19
+ --health-timeout 5s
20
+ --health-retries 5
21
+
22
+ steps:
23
+ - uses: actions/checkout@v4
24
+ with:
25
+ fetch-depth: 0
26
+
27
+ - uses: actions/setup-node@v4
28
+ with:
29
+ node-version: 22
30
+
31
+ - run: npm ci
32
+
33
+ - name: Lint
34
+ run: npm run lint
35
+
36
+ - name: Typecheck
37
+ run: npm run typecheck
38
+
39
+ - name: Test with coverage
40
+ run: npm run test:coverage
41
+
42
+ - name: Check version is not already tagged
43
+ run: |
44
+ git fetch --tags
45
+ VERSION=$(node -e "console.log(require('./package.json').version)")
46
+ if [ -n "$(git tag -l "v$VERSION")" ]; then
47
+ echo "::error::Tag v$VERSION already exists. Bump the version in package.json."
48
+ exit 1
49
+ fi
50
+ echo "Version v$VERSION is not yet tagged"
@@ -0,0 +1,44 @@
1
+ name: Publish
2
+
3
+ on:
4
+ push:
5
+ branches: [master]
6
+
7
+ permissions:
8
+ contents: write
9
+ id-token: write
10
+
11
+ jobs:
12
+ publish:
13
+ runs-on: ubuntu-latest
14
+
15
+ steps:
16
+ - uses: actions/checkout@v4
17
+
18
+ - uses: actions/setup-node@v4
19
+ with:
20
+ node-version: 22
21
+ registry-url: https://registry.npmjs.org
22
+
23
+ - run: npm install -g npm@latest
24
+
25
+ - run: npm ci
26
+
27
+ - name: Extract version
28
+ id: version
29
+ run: echo "version=$(node -e "console.log(require('./package.json').version)")" >> "$GITHUB_OUTPUT"
30
+
31
+ - name: Publish to npm
32
+ run: npm publish --provenance --access public
33
+ env:
34
+ NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
35
+
36
+ - name: Create and push tag
37
+ run: |
38
+ git tag "v${{ steps.version.outputs.version }}"
39
+ git push origin "v${{ steps.version.outputs.version }}"
40
+
41
+ - name: Create GitHub release
42
+ run: gh release create "v${{ steps.version.outputs.version }}" --generate-notes
43
+ env:
44
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
package/AGENTS.md CHANGED
@@ -24,7 +24,8 @@ Queasy is a Redis-backed job queue for Node.js with **at-least-once** delivery s
24
24
  The JS side is split across several modules:
25
25
 
26
26
  - **`src/client.js`** (`Client` class): Top-level entry point. Wraps a `node-redis` connection, loads the Lua script into Redis via `FUNCTION LOAD REPLACE` on construction, and manages named `Queue` instances. Generates a unique `clientId` for heartbeats. All Redis `fCall` invocations live here (`dispatch`, `cancel`, `dequeue`, `finish`, `fail`, `retry`, `bump`). Exported from `src/index.js`.
27
- - **`src/queue.js`** (`Queue` class): Represents a single named queue. Holds dequeue options and handler path. `listen()` attaches a handler and starts a `setInterval` polling loop that calls `dequeue()`. `dequeue()` checks pool capacity, fetches jobs from Redis, and processes each via the pool. Handles retry/fail logic (backoff calculation, stall-count checks) on the JS side.
27
+ - **`src/queue.js`** (`Queue` class): Represents a single named queue. `listen()` attaches a handler path and options, optionally sets up a fail queue (`{key}-fail`), then registers itself with the `Manager` via `addQueue()`. `dequeue(count)` fetches jobs from Redis, processes each via the pool, and handles outcomes: finishes on success, retries with exponential backoff on retriable errors, and dispatches to the fail queue on permanent errors or when `maxRetries`/`maxStalls` limits are exceeded. Returns `{ count, promise }` so the manager can track whether the queue is saturated.
28
+ - **`src/manager.js`** (`Manager` class): Centralized dequeue scheduler shared across all queues on a client. When a queue calls `listen()`, it registers itself via `addQueue()`. The manager runs a single `next()` loop that round-robins through queues, calling `queue.dequeue(batchSize)` on each. Batch size is computed from pool capacity, the number of busy queues, and the handler's `size` option. After each dequeue, queues are re-sorted by a priority function (`compareQueueEntries`): busy queues first, then by `priority` (higher first), then by `lastDequeuedAt` (oldest first), then by `size` (larger first). The loop schedules the next tick immediately if the top queue is busy, otherwise waits `DEQUEUE_INTERVAL` ms from the last dequeue time.
28
29
  - **`src/pool.js`** (`Pool` class): Manages a set of `Worker` threads. Each worker has a `capacity` (default 100 units). `process()` picks the worker with the most spare capacity, posts the job, and returns a promise. Handles job timeouts: a timed-out job marks the worker as unhealthy, replaces it with a fresh one, and terminates the old worker once only stalled jobs remain.
29
30
  - **`src/worker.js`**: Runs inside a `Worker` thread. Receives `exec` messages, dynamically imports the handler module, calls `handle(data, job)`, and posts back `done` messages (with optional error info).
30
31
  - **`src/constants.js`**: Default retry options, heartbeat/timeout intervals, worker capacity, dequeue polling interval.
package/CLAUDE.md CHANGED
@@ -21,7 +21,7 @@ Tests require a running Redis instance. Use `docker:up` first if needed.
21
21
 
22
22
  Queasy is a Redis-backed job queue with **at-least-once** delivery semantics. The core logic lives in two layers:
23
23
 
24
- - **JS layer** (`src/queue.js`): The `queue()` factory returns `{ dispatch, cancel, listen }`. On first use, it uploads the Lua script to Redis via `FUNCTION LOAD REPLACE`. A `WeakSet` (`initializedClients`) tracks which Redis clients have already had the functions loaded. `listen()` is currently a TODO stub.
24
+ - **JS layer** (`src/client.js`): The `Client` class accepts a `RedisOptions` object and constructs its own node-redis connection via `createClient` (plain object) or `createCluster` (object with `rootNodes`). On construction it connects, then uploads the Lua script to Redis via `FUNCTION LOAD REPLACE`. The connection is torn down in `close()` via `destroy()`.
25
25
  - **Lua layer** (`src/queasy.lua`): All queue state mutations are atomic Redis functions registered under the `queasy` library. No queue logic should be duplicated in JS — the Lua functions are the single source of truth for state transitions.
26
26
 
27
27
  ### Redis data structures
package/Readme.md CHANGED
@@ -2,18 +2,18 @@
2
2
 
3
3
  A Redis-backed job queue for Node.js, featuring (in comparison with design inspiration BullMQ):
4
4
 
5
- - **Singleton jobs**: Guarantees that no more than one job with a given ID is be processed at a time, without trampolines or dropping jobs (“unsafe deduplication”).
6
- - **Fail handlers**: Guaranteed at-least-once handlers for failed or stalled jobs, which permits reliable periodic jobs without a external scheduling or “reviver” systems.
5
+ - **Singleton jobs**: Guarantees that no more than one job with a given ID is being processed at any time, without trampolines or dropping jobs (“unsafe deduplication”).
6
+ - **Fail handlers**: Guaranteed at-least-once handlers for failed or stalled jobs, enabling reliable periodic jobs without external scheduling or “reviver” systems.
7
7
  - **Instant config changes**: Most configuration changes take effect immediately no matter the queue length, as they apply at dequeue time.
8
8
  - **Worker threads**: Jobs are processed in worker threads, preventing main process stalling and failing health checks due to CPU-bound jobs
9
9
 - **Capacity model**: Worker capacity flexibly shared between heterogeneous queues based on priority and demand, rather than queue-specific “concurrency”.
10
- - **Job timeout**: Enforced by draining and terminating worker threads with timed out jobs
11
- - **Zombie protection**: Clients that have lost locks detect this and exit at next heartbeat
10
+ - **Job timeout**: Timed out jobs are killed by draining and terminating the worker thread they run on
11
+ - **Zombie protection**: Clients that lost their locks while stalled detect this upon recovery and terminate themselves immediately
12
12
  - **Fine-grained updates**: Control over individual attributes when one job updates another with the same ID
13
13
 
14
14
  ### Terminology
15
15
 
16
- A _client_ is an instance of Quesy that connects to a Redis database. A _job_ is the basic unit of work that is _dispatched_ into a _queue_.
16
+ A _client_ is an instance of Queasy. It manages its own Redis connection. A _job_ is the basic unit of work that is _dispatched_ into a _queue_.
17
17
 
18
18
  A _handler_ is JavaScript code that performs work. There are two kinds of handlers: _task handlers_, which process jobs, and _fail handlers_, which are invoked when a job fails permanently. Handlers run on _workers_, which are Node.js worker threads. By default, a Queasy client automatically creates one worker per CPU.
19
19
 
@@ -22,8 +22,9 @@ A _handler_ is JavaScript code that performs work. There are two kinds of handle
22
22
  - `id`: string; generated if unspecified. See _update semantics_ below for more information.
23
23
  - `data`: a JSON-serializable value passed to handlers
24
24
  - `runAt`: number; a unix timestamp, to delay job execution until at least that time
25
- - `stallCount`: number; how many times has this job caused the client or worker to stall?
26
- - `retryCount`: number; how many times has this job caused the handler to throw an error?
25
+ - `retryCount`: number; how many times has this job been retried for any reason?
26
+ - `stallCount`: number; how many times did the client processing this job stop sending heartbeats?
27
+ - `timeoutCount`: number; how many times did this job fail to complete in the allocated time?
27
28
 
28
29
  ### Job lifecycle
29
30
 
@@ -42,23 +43,29 @@ Queues are dequeued based on their priority and the ratio of available capacity
42
43
 
43
44
 When a worker starts processing a job, a timer is started; if the job completes or throws, the timer is cleared. If the timeout occurs, the job is marked stalled and the worker is removed from the pool so it no longer receives new jobs. A new worker is also created and added to the pool to replace it.
44
45
 
45
- The unhealthy worker (with stalled jobs) continues to run until it has *only* stalled jobs remaining. When this happens, the worker is terminated, and all its stalled jobs are retried.
46
+ The unhealthy worker (with at least one stalled job) continues to run until it has *only* stalled jobs remaining. When this happens, the worker is terminated, and all its stalled jobs are retried.
46
47
 
47
48
  ### Stall handling
48
49
 
49
50
  The client (in the main thread) sends periodic heartbeats to Redis for each queue it’s processing. If heartbeats from a client stop, a Lua script in Redis removes this client and returns all its active jobs into the waiting state with their stall count property incremented.
50
51
 
51
- When a job is dequeued, if its stall count exceeds the configured maximum, it is immediately considered permanently failed and its handler is not invoked.
52
+ When a job is dequeued, if its stall count exceeds the configured maximum, it is immediately considered permanently failed; its task handler is not invoked.
52
53
 
53
54
  The response of the heartbeat Lua function indicates whether the client had been removed due to an earlier stall; if it receives this response, the client terminates all its worker threads immediately and re-initializes the pool and queues.
54
55
 
55
56
  ## API
56
57
 
57
- ### `client(redisConnection, workerCount)`
58
- Returns a Queasy client.
59
- - `redisConnection`: a node-redis connection object.
58
+ ### `new Client(options, workerCount)`
59
+ Returns a Queasy client. Queasy creates and manages its own Redis connection internally.
60
+ - `options`: connection options. Two forms are accepted:
61
+ - **Single node** (plain object): passed to node-redis `createClient`. Accepts `url`, `socket`, `username`, `password`, and `database`. Defaults to `{}` (connects to `localhost:6379`).
62
+ - **Cluster** (object with `rootNodes`): passed to node-redis `createCluster`. Accepts:
63
+ - `rootNodes`: array of per-node connection options (same fields as single-node form); at least three nodes are recommended.
64
+ - `defaults`: options shared across all nodes (e.g. auth and TLS).
65
+ - `nodeAddressMap`: address translation map for NAT environments.
60
66
  - `workerCount`: number; Size of the worker pool. If 0, or if called in a queasy worker thread, no pool is created. Defaults to the number of CPUs.
61
67
 
68
+ The client object returned is an EventEmitter, which emits a 'disconnect' event when it fails permanently for any reason, such as a library version mismatch between different workers connected to the same Redis instance, or a lost-locks situation. When this happens, in general the application should exit the worker process and allow the supervisor to restart it.
62
69
 
63
70
  ### `client.queue(name)`
64
71
 
@@ -74,8 +81,7 @@ Adds a job to the queue. `data` may be any JSON value, which will be passed unch
74
81
  The following options take effect if an `id` is provided, and it matches that of a job already in the queue.
75
82
  - `updateData`: boolean; whether to replace the data of any waiting job with the same ID; default: true
76
83
  - `updateRunAt`: boolean | 'ifLater' | 'ifEarlier'; default: true
77
- - `updateRetryStrategy`: boolean; whether to replace `maxRetries`, `maxStalls`, `minBackoff` and `maxBackoff`
78
- - `resetCounts`: boolean; Whether to reset the internal failure and stall counts to 0; default: same as updateData
84
+ - `resetCounts`: boolean; Whether to reset the retry, timeout and stall counts to 0; default: same as updateData
79
85
 
80
86
  Returns a promise that resolves to the job ID when the job has been added to Redis.
81
87
 
@@ -92,10 +98,12 @@ Attaches handlers to a queue to process jobs that are added to it.
92
98
  The following options control retry behavior:
93
99
  - `maxRetries`: number; default: 10
94
100
  - `maxStalls`: number; default: 3
101
+ - `maxTimeouts`: number; default: 3
95
102
  - `minBackoff`: number; in milliseconds; default: 2,000
96
103
  - `maxBackoff`: number; default: 300,000
97
104
  - `size`: number; default: 10
98
105
  - `timeout`: number; in milliseconds; default: 60,000
106
+ - `priority`: number; higher values are given preference; default: 100
99
107
 
100
108
  Additional options affect failure handling:
101
109
  - `failHandler`: The path to a JavaScript module that exports the handler for failure jobs
@@ -107,13 +115,13 @@ Every handler module must have a named export `handle`, a function that is calle
107
115
 
108
116
  ### Task handlers
109
117
 
110
- It receives two arguments:
118
+ They receive two arguments:
111
119
  - `data`, the JSON value passed to dispatch
112
- - `job`, a Job object contains the job attributes except data
120
+ - `job`, a Job object containing other job attributes (excluding data)
113
121
 
114
- This function may throw (or return a Promise that rejects) to indicate job failure. If the thrown error is an
115
- instance of `PermanentError`, or if `maxRetries` has been reached, the job is not retried. Otherwise, the job
116
- is queued to be retried with `maxRetries` incremented.
122
+ This function may throw (or return a Promise that rejects) to indicate job failure. If the thrown error contains
123
+ a property `kind` with the value `permanent`, or if `maxRetries` has been reached, the job is not retried.
124
+ Otherwise, the job is queued to be retried with `retryCount` incremented.
117
125
 
118
126
  If the thrown error has a property `retryAt`, the job’s `runAt` is set to this value; otherwise, it’s set using
119
127
  the exponential backoff algorithm.
@@ -123,8 +131,10 @@ If it returns any value apart from a Promise that rejects, the job is considered
123
131
  ### Failure handlers
124
132
 
125
133
  This function receives three arguments:
126
- - `data`, the JSON value passed to dispatch
127
- - `job`
128
- - `error`, a JSON object with a copy of the enumerable properties of the error thrown by the final call to handle, or an instance of `StallError` if the final call to handle didn’t return or throw.
134
+ - `data`, a tuple (array) containing three items:
135
+ - `originalData`
136
+ - `originalJob`
137
+ - `error`, a JSON object with the name, message and kind properties of the error thrown by the final call to handle. Kind might be `permanent`, `retriable` or `stall`. In case of stall, the name property is either `StallError` or `TimeoutError`.
138
+ - `job`, details of the failure handling job
129
139
 
130
140
  If this function throws an error (or returns a Promise that rejects), it is retried using exponential backoff.
@@ -7,8 +7,6 @@ services:
7
7
  ports:
8
8
  - '6379:6379'
9
9
  command: redis-server --save 60 1 --loglevel warning
10
- volumes:
11
- - redis-data:/data
12
10
  healthcheck:
13
11
  test: ['CMD', 'redis-cli', 'ping']
14
12
  interval: 5s
@@ -0,0 +1,185 @@
1
+ # Queasy Fuzz Test Plan
2
+
3
+ A long-running end-to-end fuzz test that simulates random failures and continuously verifies core system invariants.
4
+
5
+ ## Invariants Verified
6
+
7
+ 1. **Mutual exclusion**: Two jobs with the same Job ID are never processed by different clients or worker threads simultaneously.
8
+ 2. **No re-processing of successful jobs**: A job that has succeeded is never processed again.
9
+ 3. **Scheduling**: No job is processed before its `run_at` time.
10
+ 4. **Priority ordering within a queue**: No job starts processing while another job in the same queue with a lower `run_at` is still waiting (i.e., eligible jobs are dequeued in order).
11
+ 5. **Fail handler completeness**: If a fail handler is registered, every job that does not eventually succeed MUST result in the fail handler being invoked.
12
+ 6. **Queue progress (priority starvation prevention)**: Non-empty queues at the highest priority level always make progress. When they drain, queues at the next priority level begin making progress.
13
+
14
+ ## Structure Overview
15
+
16
+ ```
17
+ fuzztest/
18
+ Readme.md # This file
19
+ fuzz.js # Orchestrator: spawns child processes, monitors shared state
20
+ process.js # Child process: sets up clients and listens on all queues
21
+ handlers/
22
+ periodic.js # Re-queues itself; dispatches cascade jobs; occasionally stalls/crashes
23
+ cascade-a.js # Dispatched by periodic; dispatches into cascade-b
24
+ cascade-b.js # Dispatched by cascade-a; final handler
25
+ fail-handler.js # Shared fail handler for all queues; records invocations
26
+ shared/
27
+ chaos.js # Shared chaos-behavior helpers (random failure injection)
28
+ stream.js # Helpers for writing events to the fuzz:events Redis stream
29
+ ```
30
+
31
+ ## Process Architecture
32
+
33
+ The orchestrator (`fuzz.js`) spawns **N child processes** (default: 4). Each child process creates one Redis client and calls `listen()` on every queue. The orchestrator itself does not process jobs — it only monitors invariants and manages the lifecycle.
34
+
35
+ Handlers write events (job start, finish, fail, stall) directly to a Redis stream (`fuzz:events`). The orchestrator reads from this stream and maintains a shared in-memory log of events, checking invariants after each one. Child processes do not need to forward events to the orchestrator themselves — the stream is the shared channel.
36
+
37
+ Child processes are deliberately killed and restarted periodically to simulate crashes. A killed process's checked-out jobs will be swept and retried/failed by the remaining processes.
38
+
39
+ ## Queue Configuration
40
+
41
+ Three queues at different priority levels, all listened on by every child process. Parameters are kept small to produce many events quickly:
42
+
43
+ | Parameter | `{fuzz}:periodic` | `{fuzz}:cascade-a` | `{fuzz}:cascade-b` |
44
+ |---|---|---|---|
45
+ | Handler | `periodic.js` | `cascade-a.js` | `cascade-b.js` |
46
+ | Priority | 300 | 200 | 100 |
47
+ | `maxRetries` | 3 | 3 | 3 |
48
+ | `maxStalls` | 2 | 2 | 2 |
49
+ | `minBackoff` | 200 ms | 200 ms | 200 ms |
50
+ | `maxBackoff` | 2 000 ms | 2 000 ms | 2 000 ms |
51
+ | `timeout` | 3 000 ms | 3 000 ms | 3 000 ms |
52
+ | `size` | 10 | 10 | 10 |
53
+ | `failHandler` | `fail-handler.js` | `fail-handler.js` | `fail-handler.js` |
54
+ | `failRetryOptions.maxRetries` | 5 | 5 | 5 |
55
+ | `failRetryOptions.minBackoff` | 200 ms | 200 ms | 200 ms |
56
+
57
+ The short `timeout` (3 s) means stalling jobs are detected and swept quickly. The short `minBackoff` / `maxBackoff` window (200 ms – 2 s) means retries cycle fast. With `maxRetries: 3` and `maxStalls: 2`, most failed jobs reach the fail handler within seconds.
58
+
59
+ ## Periodic Jobs (Seed)
60
+
61
+ A fixed set of periodic job IDs (e.g., `periodic-0` through `periodic-4`) are dispatched by the orchestrator at startup. Each periodic handler:
62
+
63
+ 1. Records the current processing event by writing `{ type: 'start', queue, id, threadId, clientId, startedAt }` to the `fuzz:events` Redis stream.
64
+ 2. Optionally sleeps for a random short delay.
65
+ 3. Dispatches a cascade-a job with a unique ID and a `runAt` randomly up to 2 seconds in the future.
66
+ 4. Re-dispatches itself (same job ID, `updateRunAt: true`) with a delay of 1–5 seconds, so the job continues to fire periodically.
67
+ 5. On success, writes `{ type: 'finish', queue, id, threadId, clientId, finishedAt }` to the `fuzz:events` stream.
68
+
69
+ The fail handler for periodic jobs also re-dispatches the same periodic job ID (with a delay), ensuring periodic jobs survive permanent failures. This lets the orchestrator assert that periodic jobs keep running indefinitely.
70
+
71
+ ## Cascade Jobs
72
+
73
+ `cascade-a.js`:
74
+ - Records start/finish events.
75
+ - Dispatches one or two `cascade-b` jobs with unique IDs.
76
+ - Subject to all chaos behaviors (see below).
77
+
78
+ `cascade-b.js`:
79
+ - Records start/finish events.
80
+ - Terminal handler; does not dispatch further jobs.
81
+ - Subject to all chaos behaviors (see below).
82
+
83
+ ## Chaos Behaviors
84
+
85
+ All handlers are subject to all chaos behaviors. The probabilities below are per-invocation and apply uniformly across `periodic.js`, `cascade-a.js`, and `cascade-b.js`:
86
+
87
+ | Behavior | Probability | Notes |
88
+ |---|---|---|
89
+ | Normal completion | ~65% | Dispatches downstream jobs (if any), then returns |
90
+ | Retriable error (throws `Error`) | ~15% | No downstream dispatch |
91
+ | Permanent error (throws `PermanentError`) | ~5% | No downstream dispatch |
92
+ | Stall (returns a never-resolving promise) | ~10% | Detected after `timeout` (3 s); counts as a stall |
93
+ | CPU spin (blocks the worker thread) | ~3% | Tight loop until the process detects the hang and kills the thread (via `timeout`) |
94
+ | Crash (causes the child process to exit) | ~2% | Handler posts a crash message to the main thread, which calls `process.exit()` |
95
+
96
+ With `timeout: 3000`, stalling and spinning jobs are swept within ~3–13 seconds (timeout + heartbeat sweep interval). With `maxStalls: 2`, two stalls exhaust the stall budget and the job is sent to the fail handler, cycling fast.
97
+
98
+ When a child process crashes, the orchestrator detects the exit event and restarts a new child process after a short delay.
99
+
100
+ ## Event Logging and Invariant Checking
101
+
102
+ The orchestrator maintains an append-only in-memory event log. Each entry contains:
103
+ ```js
104
+ { type, queue, id, threadId, clientId, timestamp }
105
+ ```
106
+ where `type` is one of: `start`, `finish`, `fail`, `stall`, `cancel`.
107
+
108
+ After each event is appended, the orchestrator runs incremental invariant checks:
109
+
110
+ ### Invariant 1: Mutual Exclusion
111
+ Maintain a `Map<jobId, { clientId, threadId, startedAt }>` of currently-active jobs. On `start`, check that the job ID is not already in the map. On `finish`/`fail`/`stall`, remove it.
112
+
113
+ If a `start` event arrives for a job ID already in the map → **VIOLATION**.
114
+
115
+ ### Invariant 2: No Re-processing of Succeeded Jobs
116
+ Maintain a `Set<jobId>` of successfully finished job IDs. On `start`, check that the ID is not in this set.
117
+
118
+ If a `start` event arrives for a job ID in the succeeded set → **VIOLATION**.
119
+
120
+ Note: Re-processing after a stall or retry is expected and must not be flagged.
121
+
122
+ ### Invariant 3: Scheduling (No Early Processing)
123
+ Each `start` event includes `startedAt` (wall clock). Each job dispatch records an intended `runAt`. On `start`, verify `startedAt >= runAt - CLOCK_TOLERANCE_MS`.
124
+
125
+ If `startedAt < runAt - CLOCK_TOLERANCE_MS` → **VIOLATION**.
126
+
127
+ `CLOCK_TOLERANCE_MS` accounts for clock skew between the orchestrator, child processes, and Redis (default: 100ms).
128
+
129
+ ### Invariant 4: Priority Ordering
130
+ Track the earliest-known `runAt` for jobs dispatched into each queue but not yet started. When a `start` event arrives for a job in that queue, verify no other eligible job (with `runAt <= now`) in the same queue has a lower `runAt` that has been waiting longer.
131
+
132
+ This invariant is best-effort and checked with a configurable lag (e.g., 200ms) to account for the inherent race between dequeue polling and dispatch. A violation is only flagged when the ordering difference exceeds this lag.
133
+
134
+ ### Invariant 5: Fail Handler Completeness
135
+ Track every job that has been dispatched (by ID). When a job exceeds its `maxRetries` or receives a permanent error, a fail event should be observed. Maintain a map `{ jobId → { exhausted: bool, failSeen: bool } }`. After a configurable drain period (e.g., 30 seconds after a queue goes quiet), check that every exhausted job has a corresponding `fail` event.
136
+
137
+ ### Invariant 6: Queue Progress
138
+ The orchestrator monitors the time since the last `start` event per queue. If a queue is known to be non-empty (based on dispatched vs finished counts) and no `start` event has been seen for more than a configurable `STALL_THRESHOLD_MS` (e.g., 30 seconds), flag a progress violation.
139
+
140
+ Priority starvation is checked by verifying that the low-priority queue does not process jobs while the high-priority queue has outstanding jobs older than the dequeue poll interval.
141
+
142
+ ## Output and Reporting
143
+
144
+ Violations are logged to stdout and to `fuzz-output.log` with full context. The process does **not** exit on a violation — it logs and continues, accumulating a count of violations. A summary is printed periodically (every 60 seconds) and on `SIGINT`.
145
+
146
+ Log format (newline-delimited JSON):
147
+ ```json
148
+ { "time": "...", "level": "info|warn|error", "msg": "...", "data": { ... } }
149
+ ```
150
+
151
+ Violation entries use level `"error"` and include the invariant name, the offending event, and relevant recent history.
152
+
153
+ ## Configuration
154
+
155
+ All tunable parameters live at the top of `fuzz.js` as named constants:
156
+
157
+ ```js
158
+ const NUM_PROCESSES = 4; // Child processes
159
+ const NUM_PERIODIC_JOBS = 5; // Fixed periodic job IDs
160
+ const PERIODIC_MIN_DELAY = 1000; // ms before re-queuing self
161
+ const PERIODIC_MAX_DELAY = 5000;
162
+ const CRASH_INTERVAL_MS = 30000; // Orchestrator kills a random child process this often
163
+ const CLOCK_TOLERANCE_MS = 100;
164
+ const STALL_THRESHOLD_MS = 30000;
165
+ const PRIORITY_LAG_MS = 200;
166
+ const LOG_FILE = 'fuzz-output.log';
167
+ ```
168
+
169
+ ## Running
170
+
171
+ The fuzz test is separate from the default test suite and is never run by `npm test`. It is started manually:
172
+
173
+ ```sh
174
+ node fuzztest/fuzz.js
175
+ ```
176
+
177
+ It runs indefinitely. Stop with `Ctrl+C`. A summary of violations and events processed will be printed on exit.
178
+
179
+ ## Notes on Implementation
180
+
181
+ - Child processes use the `queasy` library's public API (`queue()`, `dispatch()`, `listen()`). They do not talk directly to Redis.
182
+ - The orchestrator does not import from `src/`; it only spawns child processes and learns about child-process lifecycle from the `spawn` and `exit` events.
183
+ - All handler modules in `fuzztest/handlers/` must be self-contained ESM modules that can be passed as `handlerPath` to `queue.listen()`.
184
+ - Handlers write events to the `fuzz:events` Redis stream using a dedicated Redis client created at handler module load time. The orchestrator reads from this stream via `XREAD BLOCK`. This is the only communication channel between handlers and the orchestrator — no IPC is used.
185
+ - The chaos crash behavior must be triggered from the child process's main thread, not from inside a handler's worker thread. To simulate a crash, the handler uses `postMessage` to send a `{ type: 'crash' }` message to the main thread, which listens for it and calls `process.exit()`.