async-background 1.0.0 → 1.0.1 (RubyGems)

Files changed (12)

checksums.yaml +4 -4
data/CHANGELOG.md +319 -276
data/README.md +91 -109
data/lib/async/background/version.rb +1 -1
data/lib/async/background/web/app.rb +39 -18
data/lib/async/background/web/auth.rb +7 -2
data/lib/async/background/web/configuration.rb +17 -3
data/lib/async/background/web/event_hub.rb +25 -148
data/lib/async/background/web/response.rb +31 -9
data/lib/async/background/web/router.rb +3 -1
data/lib/async/background/web/stream.rb +54 -15
metadata +2 -2

data/CHANGELOG.md CHANGED

@@ -1,358 +1,401 @@
 # Changelog
-## Unreleased
+## 1.0.1
+Dashboard security headers and a fiber-native rewrite of the SSE stream.
+### Security
+- HTML shell now ships a strict CSP and `X-Frame-Options: DENY`. The CSP is
+  `default-src 'none'; script-src 'self'; style-src 'self'; img-src 'self'
+  data:; connect-src 'self'; frame-ancestors 'none'; base-uri 'none';
+  form-action 'none'`.
+- Every response carries `X-Content-Type-Options: nosniff`,
+  `Referrer-Policy: no-referrer`, and `Cross-Origin-Resource-Policy:
+  same-origin`.
+- `401` and `404` responses are now JSON like every other error — minor
+  breaking change for clients that parsed the previous `text/plain` body.
+- `HEAD` is accepted on every route that accepts `GET` (RFC 9110 §9.3.2).
+  `HEAD /api/stream` returns the SSE headers without opening a stream.
+- `mount_path` validation is stricter: must start with `/`, no trailing
+  slash, no control characters, no whitespace.
+- New `Configuration#logger` (any object responding to `#warn` / `#error`).
+  When set, auth-callable exceptions, internal `rescue StandardError`, and
+  SSE stream errors are surfaced instead of being silently swallowed.
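
Reviewer note (not part of the diff): the stricter `mount_path` rules in the list above can be pictured with the sketch below. `valid_mount_path?` is a hypothetical helper, not code from the gem. Rejecting a bare `/` comes from reading "no trailing slash" literally and is an assumption.

```ruby
# Hypothetical helper illustrating the 1.0.1 mount_path rules.
# Not taken from the gem; the real validation may differ in detail.
def valid_mount_path?(path)
  return false unless path.is_a?(String)
  return false unless path.start_with?('/')            # must start with "/"
  return false if path.end_with?('/')                  # no trailing slash (also rejects a bare "/")
  return false if path.match?(/[[:cntrl:][:space:]]/)  # no control characters, no whitespace
  true
end

valid_mount_path?('/admin/background')   # => true
valid_mount_path?('/admin/background/')  # => false
valid_mount_path?('admin')               # => false
valid_mount_path?("/admin\tjobs")        # => false
```
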
+### Architecture: fiber-native SSE
+- Removed the per-process monitor `Thread.new` and per-subscription
+  `Mutex` + `ConditionVariable`. The dashboard web subsystem spawns **zero**
+  native threads of its own.
+- `EventHub` is now a tiny mutex-guarded frame cache keyed by `data_version`.
+  No subscriptions, no monitor.
+- `Stream#each` runs the entire poll-and-yield loop inside the per-request
+  Falcon fiber: `sleep` (fiber-aware), read `PRAGMA data_version`, yield the
+  `overview` frame when the version moves, yield a heartbeat otherwise.
+- Each tab now polls `data_version` independently. At default 0.5 s and
+  realistic operator fan-out that's well under 50 SQLite header reads /
+  second per process. JSON rendering still happens at most once per version
+  thanks to the hub cache.
+- Shutdown is clean: `App#close` marks the hub closed, the next poll raises
+  `ClosedError`, the loop exits. No thread to join, no `Subscription` to
+  unsubscribe.
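
Reviewer note (not part of the diff): a minimal sketch of the design described in the list above, assuming a shared cache keyed by `data_version` and one poll loop per stream. The class names, constructor parameters, and the `max_polls` cutoff are illustrative only. For simplicity the sketch emits a heartbeat on every idle poll, which the real `Stream#each` may space out differently.

```ruby
# Sketch only: names and parameters are illustrative, not the gem's API.

# Caches one rendered frame per data_version, so many streams polling the
# same version trigger at most one render.
class FrameCacheSketch
  def initialize
    @mutex   = Mutex.new
    @version = nil
    @frame   = nil
  end

  # Returns the cached frame for `version`; calls the block only on a miss.
  def fetch(version)
    @mutex.synchronize do
      if @version != version
        @frame   = yield
        @version = version
      end
      @frame
    end
  end
end

# Rack-style streaming body: responds to #each and yields SSE frames.
class PollingStreamSketch
  def initialize(cache:, read_version:, render:, interval: 0.5, max_polls: nil)
    @cache        = cache
    @read_version = read_version  # stands in for reading PRAGMA data_version
    @render       = render        # stands in for rendering the overview JSON
    @interval     = interval
    @max_polls    = max_polls     # lets the sketch terminate; nil means run forever
  end

  def each
    last_seen = nil
    polls     = 0
    until @max_polls && polls >= @max_polls
      sleep @interval             # yields to other fibers when a fiber scheduler is active
      polls += 1
      version = @read_version.call
      if version != last_seen
        last_seen = version
        json = @cache.fetch(version) { @render.call }
        yield "event: overview\ndata: #{json}\n\n"
      else
        yield ": keepalive\n\n"   # SSE comment frame used as a heartbeat
      end
    end
  end
end
```

Because the cache is shared, a second stream that observes an already-rendered version reuses the cached frame instead of rendering again.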
+&nbsp;
-### Dashboard SSE hardening
+## 1.0.0
-- Make SSE the default dashboard transport. `:polling` remains an explicit compatibility fallback.
-- Replace one `PRAGMA data_version` loop **per browser connection** with one shared `EventHub` watcher per Rack app process while at least one dashboard tab is connected. It fans complete overview snapshots to every SSE body through latest-value mailboxes, so slow tabs do not accumulate unbounded event queues.
-- Use complete snapshots on connect and after a change rather than a replay log: reconnecting EventSource clients cannot miss the current queue state.
-- Coalesce client-side list refreshes after a burst of changes, cancel stale list fetches on tab switches, and retain previous pages when `Load more` is used.
-- Add SSE retry control, 25-second heartbeat frames, `no-cache, no-transform`, and `X-Accel-Buffering: no`. Remove the hop-by-hop `Connection` response header.
-- Fingerprint asset URLs and cache immutable assets by digest, so a dashboard deploy cannot leave a browser on incompatible HTML/JS/CSS.
-- Add coverage for event fan-out, latest-value coalescing, clean stream shutdown, forced overview reads, and the SSE configuration constraints.
+First stable release. The queue execution contract from 0.7.2 (claim-token
+CAS, lifecycle columns, barrier-based shutdown drain, per-status partial
+indexes, versioned migrations) is now considered the public API.
+### Dashboard
+A read-only Rack-mountable UI under `require 'async/background/web'`:
+vanilla HTML / CSS / JS, no framework, no npm.
+- Endpoints: `GET /`, `GET /assets/{app.css,app.js}`,
+  `GET /api/{overview,executing,claimed,pending,done,failed,metrics,config,stream}`.
+- The read path runs through `Async::Background::Web::Snapshot`, which
+  opens SQLite with `file:?mode=ro`, wraps a `Mutex` around a single shared
+  connection, and uses one read transaction per endpoint plus a TTL'd
+  overview cache (`counts_cache_ttl`, default 3 s).
+- Distinguishes **executing** (`status='running' AND started_at IS NOT
+  NULL`) from **claimed** (`status='running' AND started_at IS NULL`).
+- Cursor pagination for `done` / `failed` / `pending` using
+  `(finished_at, id)` / `(run_at, id)` tuples. Stable on ties.
+- Args hidden by default (`expose_args: false`); when enabled, content
+  runs through `redact_args`. All user content rendered through
+  `textContent`, never `innerHTML`.
+- `auth` is **mandatory**. `Configuration#validate!` rejects an
+  unconfigured `auth`. There is no permissive default — a falsey result
+  returns `401`.
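
Reviewer note (not part of the diff): the cursor pagination bullet above relies on a `(finished_at, id)` tuple so that rows sharing a timestamp are neither skipped nor repeated. The sketch below shows the idea on an in-memory array. `Row`, `page_after`, and the newest-first ordering are assumptions for illustration; the gem performs this in SQL.

```ruby
# Illustrative keyset pagination over (finished_at, id); not the gem's code.
Row = Struct.new(:id, :finished_at)

# Returns [page, next_cursor]. `cursor` is the [finished_at, id] pair of the
# last row already seen, or nil for the first page. Newest first (assumed).
def page_after(rows, cursor:, limit:)
  sorted = rows.sort_by { |r| [-r.finished_at, -r.id] }
  # Keep only rows strictly "after" the cursor in descending tuple order.
  sorted = sorted.select { |r| ([r.finished_at, r.id] <=> cursor) == -1 } if cursor
  page = sorted.first(limit)
  next_cursor = page.length == limit ? [page.last.finished_at, page.last.id] : nil
  [page, next_cursor]
end

rows = [Row.new(1, 100), Row.new(2, 100), Row.new(3, 100), Row.new(4, 90)]
first_page, cursor = page_after(rows, cursor: nil, limit: 2)
second_page, = page_after(rows, cursor: cursor, limit: 2)
first_page.map(&:id)   # => [3, 2]
second_page.map(&:id)  # => [1, 4]
```

Comparing the whole tuple rather than `finished_at` alone is what keeps the three rows with the same timestamp stable across pages.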
+### SSE transport
+The dashboard uses a single long-lived `text/event-stream` connection per
+browser tab instead of polling `/api/overview` every 2 seconds. One HTTP
+connection per tab regardless of how long it stays open.
+- `Configuration#transport` accepts `:sse` (default) or `:polling`. Anything
+  else raises `ConfigurationError`. The chosen transport is exposed at
+  `/api/config` so the client knows which path to take.
+- Client opens `EventSource(mount_path + '/api/stream')` once; the server
+  pushes an `overview` event when `PRAGMA data_version` changes and a
+  `:keepalive` comment frame every 25 s.
+- Server-supplied 5 s reconnect delay; each reconnect begins from a full
+  current snapshot (no event log).
+- Asset URLs are fingerprinted and cached immutably by digest, so a
+  dashboard deploy can't leave a browser on incompatible HTML / JS / CSS.
+### Server compatibility for SSE
+SSE holds the response open for the lifetime of the dashboard tab.
+- **Falcon** — recommended. Handles long-lived connections via fibers.
+- **Puma** — works. Each open tab holds one worker thread for its lifetime;
+  fine for a handful of operators, problematic if many concurrent operators
+  would starve the worker pool.
+- **Unicorn** — doesn't work. Blocking worker model can't hold long-lived
+  connections without timeouts. Stay on `:polling`.
+See the picture in the README for what each server is actually holding.
-## 1.1.0
+### Configuration
-Server-Sent Events transport for the dashboard. Replaces HTTP polling as the recommended transport.
+```ruby
+require 'async/background/web'
-### Added
+Async::Background::Queue::Store.prepare_dashboard!(path: '/var/lib/app/queue.db')
-- **SSE transport for the dashboard.** Set `c.transport = :sse` in `Async::Background::Web.configure` and the dashboard now uses a single long-lived `text/event-stream` connection per browser tab instead of polling `/api/overview` every 2 seconds. The browser opens `EventSource(mount_path + '/api/stream')` once; the server pushes an `overview` event whenever `PRAGMA data_version` changes, and a `:keepalive` comment frame every 30 seconds. Result: 1 HTTP connection per dashboard tab regardless of how long it stays open, instead of 30 req/min per tab.
+Async::Background::Web.configure do |c|
+  c.queue_path       = '/var/lib/app/queue.db'
+  c.auth             = ->(env) { env['warden'].user&.admin? }
+  c.expose_args      = false
+  c.metrics_path     = '/run/app/async-background.shm'
+  c.total_workers    = 4
+  c.counts_cache_ttl = 3.0
+  c.poll_interval_ms = 2000
+  c.list_limit       = 50
+  c.mount_path       = '/admin/background'
+  c.title            = 'My App background jobs'
+end
-  - New module `Async::Background::Web::Stream` implements the event loop as a Rack streaming body (responds to `#each`, yields SSE frames). Holds no state across requests.
-  - New route `GET /api/stream` returns `200 text/event-stream` when `transport == :sse`, `404` otherwise. Subject to the same auth gate as every other endpoint.
-  - New `Response.sse(body)` helper sets the correct headers including `x-accel-buffering: no` (disables nginx buffering for the streaming response).
-  - JS client (`assets.rb`) detects `state.config.transport === 'sse'` at boot and chooses `EventSource` over `setInterval(tick, ...)`. Both transports share the same `applyOverview()` and `refreshActiveList()` handlers, so the UI behaves identically.
+run Async::Background::Web.app
+```
-- **`Configuration#transport`** with default `:polling` (backward compatible) and accepted values `:polling | :sse`. Validation rejects anything else with `ConfigurationError`. The chosen transport is exposed at `/api/config` so the client knows which path to take.
+### Dependencies
-### Migration
+`rack` is optional. Required only when `require 'async/background/web'` is
+loaded. Core gem and worker processes don't require it.
-Existing deployments keep working unchanged. To opt into SSE:
+### Breaking changes from 0.7.x
-```ruby
-Async::Background::Web.configure do |c|
-  c.queue_path = ...
-  c.auth = ->(env) { ... }
-  c.transport = :sse
-end
-```
+None beyond what 0.7.2 already shipped. The 1.0 line locks the existing
+contract:
-### Server compatibility note
+- `Queue::Store#fetch` returns `claim_token` in the result hash.
+- All terminal `Queue::Store` methods (`complete`, `fail`, `retry_or_fail`)
+  require the `claim_token:` kwarg and return CAS success boolean /
+  `:retried` / `:failed` / `nil`.
+- Schema is versioned via `PRAGMA user_version`. Use
+  `Queue::Store.migrate!(path:)` to upgrade. Use
+  `Queue::Store.prepare_dashboard!(path:)` from the dashboard process to
+  add dashboard-only indexes.
-SSE holds the request thread/fiber open for the lifetime of the dashboard tab. **Recommended for Falcon**, which handles long-lived connections natively via fibers. **Puma works** but each open dashboard tab holds one worker thread for its lifetime — fine for an admin dashboard with a handful of operators, problematic if many concurrent operators would starve the worker pool. **Unicorn does not work** for SSE since its blocking worker model can't hold long-lived connections without timeouts; stay on `:polling` there.
-### Backend-side polling
-The server still polls `PRAGMA data_version` every 500ms inside the snapshot connection to detect changes. This is a connection-local PRAGMA call, microseconds, never hits a rate limiter. Client-facing transport is push.
+&nbsp;
-### Tests
+## 0.7.2
-- New `spec/async/background/web/stream_spec.rb` — covers overview event on data_version change, heartbeat after idle, graceful exit on `EPIPE`/`IOError`, error frame on `ClosedError`/`UnavailableError`.
-- Extended `spec/async/background/web/app_spec.rb` — `/api/stream` returns 404 on polling default, 200 text/event-stream on `:sse`, 401 without auth.
-- Extended `spec/async/background/web/configuration_spec.rb` — accepts `:sse`, rejects unknown transports.
+Harden queue execution, retries, shutdown, and metrics. Adds schema v1,
+optional dashboard indexes, and a faster enqueue path.
-## 1.0.0
-First stable release. The queue execution contract from 0.7.2 (claim-token CAS, lifecycle columns, barrier-based shutdown drain, per-status partial indexes, versioned migrations) is now considered the public API.
-### Features
+&nbsp;
-- **Web dashboard.** Rack-mountable read-only UI under `require 'async/background/web'`. Vanilla HTML/CSS/JS, no JS framework, no npm.
-  - Endpoints: `GET /`, `GET /assets/app.css`, `GET /assets/app.js`, `GET /api/overview`, `GET /api/executing`, `GET /api/claimed`, `GET /api/pending`, `GET /api/done`, `GET /api/failed`, `GET /api/metrics`, `GET /api/config`.
-  - Default transport is JSON polling (`poll_interval_ms`, default 2000). SSE adapter for Falcon is intentionally deferred to a later release; the dashboard already coalesces work via a shared overview cache, so adding SSE later is a backward-compatible change.
-  - Read path runs through `Async::Background::Web::Snapshot`, which opens SQLite with `file:?mode=ro`, wraps a `Mutex` around a single shared connection, and uses one read transaction per endpoint and caches each overview as one consistent snapshot.
-  - Distinguishes `Executing` (`status='running' AND started_at IS NOT NULL`) from `Claimed` (`status='running' AND started_at IS NULL`).
-  - Overview snapshot cache for `counts_cache_ttl` seconds (default 3.0) so a busy queue does not turn the dashboard into a hot reader.
-  - Cursor pagination for `done`/`failed`/`pending` using `(finished_at, id)` / `(run_at, id)` tuples. Stable on ties.
-  - Args hidden by default (`expose_args: false`); when enabled, content runs through `redact_args`. All user content rendered through `textContent`, never `innerHTML`.
-  - Auth hook is **mandatory**. `Configuration#validate!` rejects an unconfigured `auth`. There is no permissive default.
+## 0.7.1
-  - Add the optional Rack dashboard for the SQLite queue.
-  - Make sqlite3 an explicit runtime dependency for queue/dashboard installs.
+`Store` exposes three SQLite tuning knobs via `StoreOptions`, validated at
+construction time so misconfigurations fail fast:
-### Configuration
+- `mmap` (`true` / `false`, default `true`) — memory-mapped I/O.
+- `synchronous` (`:normal` / `:full` / `:extra`, default `:normal`) —
+  durability vs throughput.
+- `wal_autocheckpoint` (`Integer` in `100..10_000`, default `1_000`) — WAL
+  checkpoint frequency in pages.
-```ruby
-require 'async/background/web'
+**Breaking change.** `Store.new(path:, mmap:)` → `Store.new(path:, options:
+{ mmap: ... })`. The direct `mmap:` kwarg is removed in favor of the
+unified `options:` hash. Update any call site that constructs `Store`
+manually.
-Async::Background::Queue::Store.prepare_dashboard!(path: '/var/lib/app/queue.db')
+See [Get Started → Store tuning](docs/GET_STARTED.md#appendix-store-tuning)
+for trade-offs.
-Async::Background::Web.configure do |c|
-  c.queue_path        = '/var/lib/app/queue.db'
-  c.auth              = ->(env) { env['warden'].user&.admin? }
-  c.expose_args       = false
-  c.metrics_path      = '/run/app/async-background.shm'
-  c.total_workers     = 4
-  c.counts_cache_ttl  = 3.0
-  c.poll_interval_ms  = 2000
-  c.list_limit        = 50
-  c.mount_path        = '/admin/background'
-  c.title             = 'My App background jobs'
+&nbsp;
+## 0.6.2
+Queue jobs gain a **configurable timeout** at three levels — call-site
+`options:`, class-level `.options`, default 120 s — merged at enqueue time
+so the runner just reads the final value from the payload:
+```ruby
+class HeavyImportJob
+  include Async::Background::Job
+  options timeout: 600
 end
-run Async::Background::Web.app
+HeavyImportJob.perform_async(user_id, options: { timeout: 120 })  # wins
 ```
-### Dependencies
-- `rack` is an optional dependency. Required only when `require 'async/background/web'` is loaded. Core gem and worker processes do not require it.
+Side effects: an `options TEXT` column in SQLite (added idempotently via
+`ALTER TABLE … rescue nil` on existing databases), an extensible `options:`
+hash across the entire enqueue chain, a `Job::Options` schema via
+`Data.define` (unknown keys raise `ArgumentError`), and queue-timeout
+failure logs now include the actual value (`"timed out after 120s"`).
-### Breaking changes from 0.7.x
-None beyond what 0.7.2 already shipped. The 1.0 line locks the existing contract:
-- `Queue::Store#fetch` returns `claim_token` in the result hash.
-- All terminal `Queue::Store` methods (`complete`, `fail`, `retry_or_fail`) require the `claim_token:` kwarg and return CAS success boolean / `:retried` / `:failed` / `nil`.
-- Schema is versioned via `PRAGMA user_version`. Use `Queue::Store.migrate!(path:)` to upgrade. Use `Queue::Store.prepare_dashboard!(path:)` from the dashboard process to lazily create dashboard-only indexes (per-status partial indexes for `done` / `failed`, plus separate `executing` and `claimed` indexes).
+&nbsp;
-## 0.7.2
+## 0.6.1
-- Harden queue execution, retries, shutdown, and metrics.
-- Add schema v1, optional dashboard indexes, and a faster enqueue path.
+Two scheduler fixes and one notification fast path:
-## 0.7.1
+- **Cron busy-loop on overlap skip.** When a scheduled run was skipped
+  because the previous one was still active, the entry was re-pushed to the
+  heap without `reschedule`. `next_run_at` never advanced, so the next
+  iteration picked it up immediately. Skip branch now calls
+  `entry.reschedule(monotonic_now)` like the normal path.
+- **Prepared statement reset on fetch error.** `@fetch_stmt.reset!` ran
+  after `execute` returned, so an exception inside `execute` left the
+  statement dirty and the next `fetch` could fail. Wrapped in
+  `begin / ensure`.
+- **SocketNotifier: 1 connect per enqueue.** `notify_all` no longer
+  connects to all N worker sockets on every enqueue. Wakes a single worker
+  chosen by random offset, falls back through the ring only if the chosen
+  worker is dead. Happy path: 1 connect; worst case (all workers down): N.
+- Pending lookup now uses a partial index
+  `idx_jobs_pending(run_at, id) WHERE status = 'pending'`. Smaller on disk,
+  cheaper to update, and matches the only query that uses it.
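
Reviewer note (not part of the diff): the SocketNotifier bullet above describes waking one worker at a random offset and walking the ring only when that worker is unavailable. A minimal sketch of that selection logic follows. `notify_one` and its block, which stands in for the actual socket connect and returns true on success, are hypothetical.

```ruby
# Hypothetical sketch of "wake one worker, fall back through the ring".
# The block stands in for connecting to worker `index`; it returns true
# when the wake-up was delivered and false when the worker is unavailable.
def notify_one(worker_count, start: rand(worker_count))
  worker_count.times do |step|
    index = (start + step) % worker_count
    return index if yield(index)   # happy path: first attempt succeeds
  end
  nil                              # every worker was unavailable
end

alive = [false, false, true, true]
notify_one(4, start: 1) { |i| alive[i] }  # => 2 (worker 1 is down, worker 2 answers)
```

With all workers up this performs a single attempt per enqueue. Only when every worker is down does it make N attempts.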
-### Features
-- **Tunable `Store` options via `StoreOptions`** — three knobs exposed for SQLite tuning, validated at construction time so misconfigurations fail fast at boot:
-  - `mmap` (`true`/`false`, default `true`) — toggle memory-mapped I/O
-  - `synchronous` (`:normal`/`:full`/`:extra`, default `:normal`) — durability vs throughput
-  - `wal_autocheckpoint` (`Integer` in `100..10_000`, default `1_000`) — WAL checkpoint frequency in pages
-  Range and enum validation prevent foot-guns (e.g. `wal_autocheckpoint: 100_000` would bloat WAL beyond `journal_size_limit`). See [Get Started → Store tuning](docs/GET_STARTED.md) for trade-offs of each knob
-### Breaking changes
-- `Store.new(path:, mmap:)` → `Store.new(path:, options: { mmap: ... })`. Direct `mmap:` keyword argument removed in favor of the unified `options:` hash. Users who construct `Store` manually (e.g. for web-worker enqueue) need to update the call site
+&nbsp;
-## 0.6.2
+## 0.6.0
-### Features
-- **Configurable timeout for queue jobs** — queue jobs previously used a hardcoded 30-second timeout (`DEFAULT_TIMEOUT`). Now configurable via `options` hash at two levels:
-  ```ruby
-  # Class-level default
-  class HeavyImportJob
-    include Async::Background::Job
-    options timeout: 600
-    def perform(user_id) = # ...
-  end
-  # Call-site override (wins over class-level)
-  HeavyImportJob.perform_async(user_id, options: { timeout: 120 })
-  ```
-  Priority: call-site `options:` → class-level `options` → `DEFAULT_TIMEOUT` (30s). Options are merged at enqueue time so the runner simply reads the final value from the payload
-- **`options:` hash across the entire enqueue chain** — single extensible contract from `perform_async` through `Client` down to `Store`. Currently supports `:timeout`, designed to accommodate future keys (e.g. `:retry`) without API changes
-- **`Job::Options` schema via `Data.define`** — declares known option keys with types and defaults. Unknown keys raise `ArgumentError`, invalid types raise `TypeError`. No manual validation code
-- **`options TEXT` column in SQLite** — stores the merged options hash as JSON. Extensible without schema changes when new options are added
-### Improvements
-- **Queue timeout logged on failure** — `run_queue_job` error log now includes actual timeout value: `"timed out after 120s"` instead of generic `"timed out"`
-- **Idempotent schema migration** — existing databases get `ALTER TABLE jobs ADD COLUMN options TEXT` on first connection, wrapped in `rescue nil` for safe re-runs. New databases include the column in `CREATE TABLE`
+**Queue notification system rewritten.** The pipe-based `Notifier` is
+replaced with a Unix-domain-socket architecture: each worker listens on its
+own socket (`<dir>/async_bg_worker_N.sock`), producers broadcast wake-ups
+via `SocketNotifier`. Fork-safe by design (no shared FDs), resilient to
+restarts (stale-socket cleanup), and sub-100 µs wake-up latency
+(30–80 µs typical).
-## 0.6.1
+**Why.** The pipe-based notifier was fundamentally broken in the
+recommended multi-fork setup: `for_consumer!` closed the writer end in each
+child, making `Client#push → notify` fail silently with `IOError`. All
+writes hit `WRITE_DROPPED`, so the queue silently degraded to 5-second
+polling.
-### Bug Fixes
-- **Runner: cron jobs busy-loop on overlap skip** — when a scheduled run was skipped because the previous one was still active, the entry was re-pushed to the heap without calling `reschedule`. For cron jobs (where `interval` is `nil`), this meant `next_run_at` was never advanced to the next cron tick, causing the entry to be picked up again immediately on the next loop iteration. Skip branch now calls `entry.reschedule(monotonic_now)` like the normal path
-- **Store: prepared statement not reset on fetch error** — `@fetch_stmt.reset!` was called after `execute` returned, so an exception inside `execute` left the statement in a dirty state and the next `fetch` could fail. Wrapped in `begin/ensure` to guarantee reset
+**Breaking changes.** `Runner` now takes `queue_socket_dir:` instead of
+`queue_notifier:`. `Notifier#for_producer!` / `Notifier#for_consumer!` are
+removed. `Client#push` calls `notifier.notify_all`. Environment variable
+`QUEUE_SOCKET_PATH` is replaced by `QUEUE_SOCKET_DIR` (a directory now).
-### Improvements
-- **SocketNotifier: non-blocking enqueue with ring fallback** — `notify_all` no longer connects to all N worker sockets on every enqueue. `UNIXSocket.new` is a blocking, non-fiber-aware syscall, and notifying every worker blocked the Falcon reactor for N `connect()` calls on the hot HTTP enqueue path. Now wakes a single worker chosen by random offset, falling back through the ring only if the chosen worker is dead (`ECONNREFUSED` etc.). Happy path: 1 connect. Worst case (all workers down): N connects — same as before, but only when actually needed. Safe because the queue is shared in SQLite, not sharded per worker
-- **SocketNotifier: cleaned up `UNAVAILABLE` error list** — removed `IO::WaitWritable` and `Errno::EAGAIN`. They implied "socket buffer full", but `write_nonblock` of a single byte to a freshly-opened connection cannot fill the kernel buffer. Listing them only misled readers
-- **Store: partial index for pending lookup** — replaced `idx_jobs_status_run_at_id(status, run_at, id)` with partial index `idx_jobs_pending(run_at, id) WHERE status = 'pending'`. Smaller on disk, cheaper to update, and matches the only query that uses it (`fetch`). `done`/`failed`/`running` rows no longer occupy index pages
-## 0.6.0
-### Breaking Changes
-- **Queue notification system completely rewritten** — replaced pipe-based `Notifier` with Unix domain socket-based architecture
-  - `Runner` now takes `queue_socket_dir:` parameter instead of `queue_notifier:`
-  - Removed `Notifier#for_producer!` and `Notifier#for_consumer!` — no longer needed
-  - `Client#push` now calls `notifier.notify_all` instead of `notifier.notify`
-### Features
-- **Unix domain socket-based notifications** — solves all cross-process notification problems
-  - New `SocketWaker` class (consumer-side) — each worker listens on its own Unix socket (`/tmp/queue/sockets/async_bg_worker_N.sock`)
-  - New `SocketNotifier` class (producer-side) — connects to all worker sockets to broadcast wake-ups
-  - **Cross-process wake-up now works correctly** — web workers → background workers, background workers → background workers
-  - **Fork-safe by design** — no shared file descriptors, each process creates its own socket after fork
-  - **Resilient to restarts** — stale socket cleanup on worker startup, graceful degradation if worker unavailable
-  - **Sub-100µs latency** — typical wake-up time 30-80µs vs previous 5-second polling fallback
-### Bug Fixes
-- **CRITICAL: Notifier bug in recommended setup** — the old pipe-based `Notifier` was fundamentally broken in multi-fork scenarios:
-  - `for_consumer!` closed the writer end in each child process, making `Client#push → notify` fail silently with `IOError`
-  - All writes were caught by `WRITE_DROPPED` rescue block, causing jobs to use 5-second polling instead of instant wake-up
-  - Web workers had no way to notify background workers (no shared pipe after fork)
-  - The bug was masked by `WRITE_DROPPED` silently catching `IOError` — appeared to work but degraded to polling
-- **Socket cleanup race conditions** — `SocketWaker#cleanup_stale_socket` now validates if socket is truly stale by attempting connection
-### Improvements
-- Updated `docs/GET_STARTED.md` with new socket-based setup for Falcon
-- Added section on web worker → background worker job enqueuing with full example
-- Changed environment variable from `QUEUE_SOCKET_PATH` to `QUEUE_SOCKET_DIR` (directory instead of single socket path)
-- Better error handling in `SocketWaker` and `SocketNotifier` with comprehensive `UNAVAILABLE` error list
-- Integrated with `Async::Notification` for local wake-ups (shutdown signals)
-### Technical Details
-- **Why sockets over pipes?** Pipes require shared FDs across fork boundaries. The recommended Falcon setup calls `for_consumer!` in each child, which closes the writer, breaking the notification chain. Sockets use filesystem paths — any process can connect without inherited FDs.
-- **Performance impact:** Adding ~80µs per enqueue for 8 workers (8 socket connections) vs ~100µs for SQLite transaction = negligible overhead
-- **Graceful degradation:** If worker socket unavailable (`ENOENT`, `ECONNREFUSED`), producer silently skips — job still in database, will be picked up on next poll (5s max delay)
+&nbsp;
 ## 0.5.1
-### Testing Infrastructure
-- **Comprehensive CI setup** — full Docker-based integration testing environment with `Dockerfile.ci`, `docker-compose.ci.yml`, and `Gemfile.ci`
-- **End-to-end scenario testing** — new `ci/scenario_test.rb` validates real-world scenarios with forked workers:
-  - Normal execution of fast/slow/failing jobs across multiple workers
-  - Crash recovery after SIGKILL with automatic job pickup by remaining workers
-  - No duplicate execution guarantees under worker crashes
-  - Proper job distribution validation across worker pool
-- **Test fixtures** — dedicated `ci/fixtures/jobs.rb` and `ci/fixtures/schedule.yml` for scenario testing
-### Bug Fixes
-- **SQLite busy timeout** — added `PRAGMA busy_timeout = 5000` to `Queue::Store` to prevent `SQLITE_BUSY` errors under concurrent multi-process database access
-- **Enhanced Queue::Notifier error handling** — restructured IO error handling with clearer categorization:
-  - `WRITE_DROPPED` for write failures (`IO::WaitWritable`, `Errno::EAGAIN`, `IOError`, `Errno::EPIPE`) — all non-fatal as job is already in store
-  - `READ_EXHAUSTED` for read exhaustion (`IO::WaitReadable`, `EOFError`, `IOError`) — normal drain completion
-  - Added explanatory comments for each error type and handling strategy
+CI infrastructure: full Docker-based integration testing (`Dockerfile.ci`,
+`docker-compose.ci.yml`, `Gemfile.ci`) plus an end-to-end scenario test that
+validates forked-worker behavior — normal execution, crash recovery after
+SIGKILL, no duplicate execution under crashes, proper distribution across
+the pool.
+Also: `PRAGMA busy_timeout = 5000` on `Queue::Store` to prevent
+`SQLITE_BUSY` under concurrent multi-process access; cleaner IO error
+categorization in `Queue::Notifier` (`WRITE_DROPPED` vs `READ_EXHAUSTED`)
+with explanatory comments.
+&nbsp;
 ## 0.5.0
-### Features
-- **Delayed jobs** — full support for scheduling jobs in the future
-  - `Queue::Client#push_in(delay, class_name, args)` — enqueue with delay in seconds
-  - `Queue::Client#push_at(time, class_name, args)` — enqueue at a specific time
-  - `Queue.enqueue_in(delay, job_class, *args)` — class-level delayed enqueue
-  - `Queue.enqueue_at(time, job_class, *args)` — class-level scheduled enqueue
-  - New `run_at` column in SQLite `jobs` table — jobs are only fetched when `run_at <= now`
-- **Job module** — Sidekiq-like `include Async::Background::Job` interface
-  - `perform_async(*args)` — immediate queue execution
-  - `perform_in(delay, *args)` — delayed execution after N seconds
-  - `perform_at(time, *args)` — scheduled execution at a specific time
-  - Instance-level `#perform` with class-level `perform_now` delegation
-- **Clock module** — shared `monotonic_now` / `realtime_now` helpers extracted into `Async::Background::Clock`, included by `Runner`, `Queue::Store`, and `Queue::Client`
-### Bug Fixes
-- **Runner: incorrect task in `with_timeout`** — `semaphore.async { |job_task| ... }` now correctly receives the child task instead of capturing the parent `task` from the outer scope. Previously, `with_timeout` was applied to the parent task, which could cancel unrelated work
-### Improvements
-- **Job API: `#perform` instead of `#perform_now`** — job classes now define `#perform` instance method. The class-level `perform_now` creates instance and calls `#perform`, aligning with ActiveJob / Sidekiq conventions
-- Updated error messages: validation failures now suggest `must include Async::Background::Job` instead of `must implement .perform_now`
-- `Queue::Client` — extracted private `ensure_configured!` and `resolve_class_name` methods for cleaner validation and class name resolution logic
-- `Queue::Notifier` — extracted `IO_ERRORS` constant (`IO::WaitReadable`, `EOFError`, `IOError`) for cleaner `rescue` in `drain`
-- `Queue::Store` — replaced index `idx_jobs_status_id(status, id)` with `idx_jobs_status_run_at_id(status, run_at, id)` for efficient delayed job lookups
-- `Queue::Store` — `fetch` SQL now uses `WHERE status = 'pending' AND run_at <= ?` with `ORDER BY run_at, id` to process jobs in scheduled order
-- Removed duplicated `monotonic_now` / `realtime_now` from `Runner` and `Store` — now provided by `Clock` module
-- Updated documentation: README (Job module examples, Queue architecture diagram, Clock section), GET_STARTED (delayed jobs guide, Job module usage, minimal queue-only example)
+**Delayed jobs.** Full support for scheduling jobs in the future:
+```ruby
+SomeJob.perform_in(60, *args)
+SomeJob.perform_at(time, *args)
+```
+Backed by a new `run_at` column in the SQLite `jobs` table — jobs are only
+fetched when `run_at <= now`.
+**Job module.** Sidekiq-like `include Async::Background::Job` adds
+`perform_async`, `perform_in`, `perform_at`, instance-level `#perform`, and
+class-level `perform_now` delegation.
+**Clock module.** Shared `monotonic_now` / `realtime_now` helpers extracted
+to `Async::Background::Clock` and included by `Runner`, `Queue::Store`, and
+`Queue::Client`.
+&nbsp;
 ## 0.4.5
-### Breaking Changes
-- `PRAGMAS` is now a frozen lambda `PRAGMAS.call(mmap_size)` instead of a static string — if you referenced this constant directly, update your code
+**Fetch race condition fixed.** Wrapped `UPDATE ... RETURNING` in
+`BEGIN IMMEDIATE` to prevent two workers from picking up the same job
+simultaneously.
+**mmap on Docker overlay2.** `overlay2` does not guarantee `write()` /
+`mmap()` coherence, which corrupts the WAL under concurrent multi-process
+access. mmap is now configurable via `queue_mmap: false` instead of being
+hardcoded. Proper Docker setup with named volumes is documented in
+[Get Started → Docker](docs/GET_STARTED.md#step-3--docker-setup).
+Also: `PRAGMA optimize` on shutdown wrapped in `rescue nil`,
+`PRAGMA incremental_vacuum` actually works now (`PRAGMA auto_vacuum =
+INCREMENTAL` added to schema; only takes effect on new databases),
+composite index `idx_jobs_status_id(status, id)` to eliminate a sort in
+`fetch`. New `queue_mmap:` / `mmap:` parameters and a public
+`attr_reader :queue_store` on `Runner`.
+**Breaking-ish.** `PRAGMAS` is now a frozen lambda `PRAGMAS.call(mmap_size)`
+instead of a static string; update any direct reference.
-### Features
-- New `queue_mmap:` parameter on `Runner` (default: `true`) — allows disabling SQLite mmap for environments where it's unsafe (Docker overlay2)
-- New `mmap:` parameter on `Queue::Store` (default: `true`) — controls `PRAGMA mmap_size` (256 MB when enabled, 0 when disabled)
-- Public `attr_reader :queue_store` on `Runner` — eliminates need for `instance_variable_get` when sharing Store with Client
-### Bug Fixes
-- **CRITICAL: fetch race condition** — wrapped `UPDATE ... RETURNING` in `BEGIN IMMEDIATE` transaction to prevent two workers from picking up the same job simultaneously
-- **CRITICAL: mmap + Docker overlay2** — `overlay2` filesystem does not guarantee `write()`/`mmap()` coherence, causing SQLite WAL corruption under concurrent multi-process access. mmap is now configurable via `queue_mmap: false` instead of being hardcoded. Documented proper Docker setup with named volumes in `docs/GET_STARTED.md`
-- **`PRAGMA optimize` on shutdown** — wrapped in `rescue nil` to prevent `SQLite3::BusyException` when another process holds the write lock during graceful shutdown
-- **`PRAGMA incremental_vacuum` was a no-op** — added `PRAGMA auto_vacuum = INCREMENTAL` to schema. Without it, `incremental_vacuum` does nothing. Note: only takes effect on newly created databases; existing databases require a one-time `VACUUM`
-### Improvements
-- Replaced index `idx_jobs_status(status)` with composite `idx_jobs_status_id(status, id)` — eliminates sort step in `fetch` query (`ORDER BY id LIMIT 1` is now a direct B-tree lookup)
-- Fixed `finalize_statements` — changed `%i[@enqueue_stmt ...]` to `%i[enqueue_stmt ...]` with `:"@#{name}"` interpolation for idiomatic `instance_variable_get`/`set` usage
-- Added documentation: `README.md` (concise, with warning markers) and `docs/GET_STARTED.md` (step-by-step guide covering schedule config, Falcon integration, Docker setup, dynamic queue)
+&nbsp;
 ## 0.4.0
-### Features
-- **Dynamic job queue** — enqueue jobs at runtime from any process (web, console, rake) with automatic execution by background workers
-  - `Queue::Store` — SQLite-backed persistent storage with WAL mode, prepared statements, and optimized pragmas
-  - `Queue::Notifier` — `IO.pipe`-based zero-cost wakeup between producer and consumer processes (no polling)
-  - `Queue::Client` — public API: `Async::Background::Queue.enqueue(JobClass, *args)`
-  - Automatic recovery of stale `running` jobs on worker restart
-  - Periodic cleanup of completed jobs (piggyback on fetch, every 5 minutes)
-  - `PRAGMA incremental_vacuum` when cleanup removes 100+ rows
-  - Worker isolation via `ISOLATION_FORKS` env variable — exclude specific workers from queue processing
-  - Custom database path via `queue_db_path` parameter
-  - Requires optional `sqlite3` gem (`~> 2.0`) — not included by default, must be added to Gemfile explicitly
-- New Runner parameters: `queue_notifier:` and `queue_db_path:`
-### Improvements
-- Unified `monotonic_now` usage across `run_job` and `run_queue_job` (was using direct `Process.clock_gettime` call in `run_job`)
-- `Queue::Notifier#drain` — moved `rescue` inside the loop to avoid stack unwinding on each drain cycle
+**Dynamic job queue.** Enqueue jobs at runtime from any process (web,
+console, rake) with automatic execution by background workers.
-## 0.3.0
+- `Queue::Store` — SQLite-backed persistent storage with WAL mode,
+  prepared statements, and optimized pragmas.
+- `Queue::Notifier` — `IO.pipe`-based zero-cost wake-up between producer
+  and consumer processes.
+- `Queue::Client` — public API:
+  `Async::Background::Queue.enqueue(JobClass, *args)`.
+- Automatic recovery of stale `running` jobs on worker restart.
+- Periodic cleanup of completed jobs (piggybacked on fetch, every 5 min);
+  `PRAGMA incremental_vacuum` when cleanup removes 100+ rows.
+- `ISOLATION_FORKS` env var excludes specific workers from queue processing.
+- Custom database path via `queue_db_path:` on `Runner`.
-### Features
-- Added optional metrics collection system using shared memory
-- New `Metrics` class with worker-specific performance tracking
-- Public API: `runner.metrics.enabled?`, `runner.metrics.values`, `Metrics.read_all()`
-- Tracks total runs, successes, failures, timeouts, skips, active jobs, and execution times
-- Requires optional `async-utilization` gem dependency
-- Metrics stored in `/tmp/async-background.shm` with lock-free updates per worker
+Requires the optional `sqlite3` gem (`~> 2.0`).
-## 0.2.6
+(The 0.6.0 socket-based architecture supersedes the pipe-based notifier
+introduced here.)
-### Improvements
-- Micro-optimization in `wait_with_shutdown` method: use passed `task` parameter instead of `Async::Task.current` for better consistency and slight performance improvement
-## 0.2.5
-### Features
-- Added graceful shutdown via signal handlers for SIGINT and SIGTERM
-- Enhanced process lifecycle management with proper signal handling using `Signal.trap` and IO.pipe for async communication
-- Improved robustness for production deployments and container orchestration
-- Updated dependencies to work with latest Async 2.x API (removed deprecated `:parent` parameter usage)
+&nbsp;
-## 0.2.4
+## 0.3.0
-### Improvements
-- Removed hardcoded version warning from main module (was checking against fixed list: 0.1.0, 0.2.2, 0.2.3). Use semantic versioning with pre-release suffixes for unstable versions (e.g., 0.3.0.alpha1) instead
-- Removed hardcoded stable versions list from gemspec description — metadata should describe functionality, not versioning
-- Changed `while true` to idiomatic `loop do` in run method
-- Added `Gemfile.lock` to .gitignore (gems should not commit lockfile)
-- Updated README: clarified that job class must respond to `.perform_now` class method (removed confusing mention of instance `#perform`)
+Optional metrics collection via shared memory. `Metrics` tracks per-worker
+counters: `total_runs`, `total_successes`, `total_failures`,
+`total_timeouts`, `total_skips`, `active_jobs`, plus last-run timestamp and
+duration. Public API: `runner.metrics.enabled?`, `runner.metrics.values`,
+`Metrics.read_all(total_workers:)`. Requires the optional
+`async-utilization` gem; absent that, `enabled?` is `false` and `read_all`
+returns `[]`. Default file: `/tmp/async-background.shm`.
-## 0.2.2
-### Bug Fixes
-- **CRITICAL**: Removed logger parameter from Runner initialize (was unused). Fixed initialization to use Console.logger directly which now properly initializes in forked processes with correct context
-## 0.2.1
+&nbsp;
-### Bug Fixes
-- **CRITICAL**: Added missing `require 'console'` in main module. Logger was nil because Console gem was not imported, causing `undefined method 'info' for nil` errors on worker initialization
+## 0.2.x
-## 0.2.0
+- **0.2.6** — `wait_with_shutdown` uses the passed `task` parameter
+  instead of `Async::Task.current`.
+- **0.2.5** — Graceful shutdown via `SIGINT` / `SIGTERM` signal handlers
+  using `Signal.trap` and `IO.pipe`. Compatible with Async 2.x API
+  (removed deprecated `:parent`).
+- **0.2.4** — Removed hardcoded version warning. Use semver pre-release
+  suffixes for unstable versions (e.g. `0.3.0.alpha1`).
+- **0.2.2** — Removed unused `logger` parameter from `Runner#initialize`;
+  use `Console.logger` directly, which now initializes correctly in
+  forked processes.
+- **0.2.1** — Added missing `require 'console'` in main module. Logger
+  was `nil`, causing `undefined method 'info' for nil` on worker
+  initialization.
+- **0.2.0** — Removed hidden ActiveSupport dependency
+  (`safe_constantize` → `Object.const_get` + `NameError`). Job validation
+  now checks for `.perform_now` (class method) instead of `.perform`
+  (instance method). Fixed a race where an entry could disappear from the
+  heap during execution. Added `stop()` and `running?()` to `Runner`.
-### Bug Fixes
-- **CRITICAL**: Removed hidden ActiveSupport dependency. Replaced `safe_constantize` with `Object.const_get` + `NameError` handling
-- **CRITICAL**: Fixed validator mismatch: now validates `.perform_now` (class method) instead of `.perform` (instance method)
-- **CRITICAL**: Fixed race condition where entry could disappear from heap during execution. `reschedule` and `heap.push` now always execute after job processing
-- Added full exception backtrace to error logs for production debugging
-- Improved YAML security by removing `Symbol` from `permitted_classes`
-- Removed Mutex from graceful shutdown (anti-pattern in Async). Boolean assignment is atomic in MRI
-### Features
-- Added optional `logger` parameter to Runner constructor for custom loggers (Rails.logger, etc.)
-- Added `stop()` method for graceful shutdown
-- Added `running?()` method to check scheduler status
-### Breaking Changes
-- Job class validation now checks for `.perform_now` class method (was checking for `.perform` instance method)
+&nbsp;
 ## 0.1.0
-- Initial release
-- Single event loop with min-heap timer (O(log N) scheduling)
-- Skip overlapping execution
-- Startup jitter to prevent thundering herd
-- Monotonic clock for interval jobs, wall clock for cron jobs
-- Deterministic worker sharding via Zlib.crc32
-- Semaphore-based concurrency control
-- Per-job timeout protection
-- Structured logging via Console
+Initial release.
+- Single event loop with min-heap timer (`O(log N)` scheduling).
+- Skip overlapping execution.
+- Startup jitter to prevent thundering herd.
+- Monotonic clock for interval jobs, wall clock for cron jobs.
+- Deterministic worker sharding via `Zlib.crc32`.
+- Semaphore-based concurrency control.
+- Per-job timeout protection.
+- Structured logging via Console.
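
The 0.1.0 entry above mentions deterministic worker sharding via `Zlib.crc32` without showing what that looks like. The following is a minimal sketch of the general technique, not the gem's actual code: the helper names `shard_for` and `owned_by?`, the zero-based worker index, and the sample job names are all assumptions made for illustration.

```ruby
require "zlib"

# Hypothetical helper (not part of async-background): map a job name to
# a shard in 0...total_workers. CRC32 of the same string is identical in
# every process, so each worker computes the same assignment without
# any coordination between processes.
def shard_for(job_name, total_workers)
  Zlib.crc32(job_name) % total_workers
end

# Hypothetical helper: a worker schedules a job only if the job's shard
# matches its own index. Zero-based indexing is an assumption here.
def owned_by?(job_name, worker_index, total_workers)
  shard_for(job_name, total_workers) == worker_index
end

# Illustrative job names; any stable string identifier would do.
jobs = %w[cleanup_sessions send_digest refresh_cache]
total_workers = 4

assignments = jobs.to_h { |name| [name, shard_for(name, total_workers)] }
assignments.each { |name, shard| puts "#{name} -> worker #{shard}" }
```

Because the mapping depends only on the job name and the worker count, changing the number of workers reshuffles assignments, which is the usual trade-off of modulo-based sharding.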