solid_queue_web 1.0.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 65caaec8c90225e3645116b206b9a3f81c4ff2389b2dd688e2e3fccca8a0121d
4
- data.tar.gz: 217dd8f2c0dcdaf13fe69cff9f3bf1c0857a416e88de8643248fc54a85efe917
3
+ metadata.gz: e4178da05b230b0990212f6dd74b68f45b18e97e50fd1b732bc54a3243e1d61d
4
+ data.tar.gz: af6f4c2352ce0076fb441fe90d8e38f233bb3bee99fce040155d34e23af2e0b9
5
5
  SHA512:
6
- metadata.gz: ed20d5b71f733c2e7eea80a249ff42c45094507f129dd69899b5938bc9594f575cb0a5958fc3e94bbf7b28fda0c70256257ba1651738d42dfa5c0661e61a7b73
7
- data.tar.gz: d7ab81704638739b1108e666d87797a0f4d21c690d428ec321ce04ae98a3af15e181d783e5a498fd9e892b8b1e8041648d9666c356c1aebc194dbc76173cfdfe
6
+ metadata.gz: '0751948c9e66b3b5d0bbd3bb47007121466ebec956b523b2d3c3b1e16df399005d4acc823b2a1196f75a9511cc5167f135ba8bbb83f6e5c3778f405accc62e3f'
7
+ data.tar.gz: 422342bb5b419181d5ed4612cf2b06f907f5dcf540d43bb88973b145d1b08b5e4afd9e6319af5249dd6d16351480ab5440b54b111390f6692fc1209a217cfb4d
data/README.md CHANGED
@@ -8,6 +8,10 @@
8
8
 
9
9
  A monitoring and management dashboard for [Solid Queue](https://github.com/rails/solid_queue), mountable as a Rails engine in any app.
10
10
 
11
+ > **Note:** Development of this gem will continue, but if you need a unified dashboard that covers **Solid Queue**, **Solid Cable**, and **Solid Cache** in a single interface, check out [solid_stack_web](https://github.com/eclectic-coding/solid_stack_web).
12
+
13
+ ![SolidQueueWeb dashboard](docs/solid-queue-web.png)
14
+
11
15
  ## The problem
12
16
 
13
17
  Solid Queue ships without a web interface. When jobs fail, queues back up, or workers go silent in production, the only options are `rails console` or raw SQL queries. SolidQueueWeb gives your team a real-time dashboard to inspect, retry, and discard jobs without leaving the browser — and without standing up any additional infrastructure.
@@ -38,7 +42,7 @@ SolidQueueWeb surfaces all of this in a browser UI available at any route you ch
38
42
  - **Jobs** — filterable by status (ready, scheduled, claimed, blocked, failed), queue, and priority; search by job class name with dynamic auto-submit; time-based period filter (1 h / 24 h / 7 d); discard individual or all jobs; Turbo Frame navigation so only the table updates on filter or search; auto-refreshes every 10 seconds
39
43
  - **Scheduled job management** — reschedule a scheduled job to run immediately ("Run Now") or push its `scheduled_at` forward by 1 h, 24 h, or 7 d; Turbo Stream responses update the row in place; "Run All Now" bulk action promotes every scheduled job in the current filtered view in a single operation
40
44
  - **Failed jobs** — list of failed executions with error details; search by class name; filter by queue; time-based period filter; retry or discard individually or in bulk; bulk retry with configurable stagger (+5s / +10s / +30s / +1m) to avoid thundering herd on recovery
41
- - **Job detail** — full arguments, timestamps, blocked-until date, and error backtrace; action buttons based on job status
45
+ - **Job detail** — full arguments, timestamps, blocked-until date, and error backtrace; action buttons based on job status; failed jobs show an editable arguments textarea so you can correct a bad payload and retry in one step without redeploying
42
46
  - **Queue management** — pause and resume individual queues; queue-scoped job list with status filter, search, and discard
43
47
  - **Recurring tasks** — all configured recurring tasks with cron schedule, next run time, last run time, and static/dynamic classification; "Run Now" button enqueues a task immediately without waiting for its next scheduled run
44
48
  - **Processes** — workers, dispatchers, and supervisors with heartbeat health status; auto-refreshes every 10 seconds
@@ -50,13 +54,11 @@ SolidQueueWeb surfaces all of this in a browser UI available at any route you ch
50
54
  - **CSV export** — "Export CSV" button on the jobs, failed jobs, and history pages downloads all records matching the current filters; columns are tailored per view
51
55
  - **Slow job detection** — when `slow_job_threshold` is configured, claimed jobs running longer than the threshold are flagged with an orange row, a "slow" badge, and a "Running For" duration column on the Running tab; a "Slow Jobs" warning card appears on the dashboard with a link to the Running tab
52
56
  - **Webhook alerts** — set `alert_webhook_url` and `alert_failure_threshold` to receive a POST request whenever the failed job count meets or exceeds the threshold; fires asynchronously so dashboard performance is unaffected; a configurable cooldown (default 1 h) prevents repeated alerts while the count stays elevated
53
- - **Performance analytics** — per-job-class statistics at `/jobs/performance` showing run count, average, p50, p95, min, and max duration; sorted by p95 descending so the slowest classes surface first; period filter scopes to 1h / 24h / 7d or all time; each class name links to the filtered History view
57
+ - **Performance analytics** — per-job-class statistics at `/jobs/performance` showing run count, average, p50, p95, p99, standard deviation, min, and max duration; sorted by p95 descending so the slowest classes surface first; high std dev surfaces inconsistent jobs worth investigating; period filter scopes to 1h / 24h / 7d or all time; each class name links to the filtered History view
58
+ - **Failed job trend chart** — a "Failures — Last 12 Hours" bar chart on the dashboard shows failures per hour over the last 12 hours; bars are red, making failure spikes visible before clicking into the failed jobs list
59
+ - **Error frequency report** — `GET /jobs/failed_jobs/errors` groups all failed jobs by error class and message prefix, shows a count per group, and surfaces a sample backtrace in an expandable row; sorted by count descending so the most common errors appear first; accessible via the "Error Summary" button on the Failed Jobs page
54
60
  - **Metrics / health endpoint** — `GET /jobs/metrics.json` returns a machine-readable JSON document with job counts, throughput, per-queue depth and pause state, and process health summary; suitable for Prometheus scraping, uptime monitors, or external dashboards; `slow_jobs` count included when `slow_job_threshold` is configured
55
61
 
56
- ## Screenshots
57
-
58
- ![SolidQueueWeb dashboard](docs/solid-queue-web.png)
59
-
60
62
  ## Compatibility
61
63
 
62
64
  | Dependency | Version |
@@ -102,9 +104,10 @@ SolidQueueWeb.configure do |config|
102
104
  config.default_refresh_interval = 30_000 # jobs/processes/history auto-refresh in ms (default: 10_000)
103
105
  config.search_results_limit = 10 # max results per status in global search (default: 25)
104
106
  config.slow_job_threshold = 5.minutes # flag claimed jobs running longer than this (default: nil = disabled)
105
- config.alert_webhook_url = "https://hooks.example.com/solid-queue" # POST target (default: nil = disabled)
107
+ config.alert_webhook_url = "https://hooks.example.com/solid-queue" # POST target — string or array (default: nil = disabled)
106
108
  config.alert_failure_threshold = 10 # fire when failed count >= this (default: nil = disabled)
107
- config.alert_webhook_cooldown = 1800 # seconds between repeated alerts (default: 3600)
109
+ config.alert_queue_thresholds = { "critical" => 50, "default" => 200 } # fire when queue depth >= threshold (default: {})
110
+ config.alert_webhook_cooldown = 1800 # seconds between repeated alerts per alert type (default: 3600)
108
111
  config.connects_to = { reading: :reading, writing: :writing } # read replica (default: nil)
109
112
  end
110
113
 
@@ -129,6 +132,17 @@ SolidQueueWeb.configure do |config|
129
132
  end
130
133
  ```
131
134
 
135
+ To fan out to multiple endpoints (e.g. Slack and PagerDuty simultaneously), pass an array:
136
+
137
+ ```ruby
138
+ config.alert_webhook_url = [
139
+ "https://hooks.slack.com/services/...",
140
+ "https://events.pagerduty.com/..."
141
+ ]
142
+ ```
143
+
144
+ All configured URLs receive the same payload. A failure posting to one URL is logged and skipped without blocking the remaining targets.
145
+
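The fan-out behaviour described above can be sketched in plain Ruby. This is a minimal illustration, not the gem's implementation: `normalize_urls` and `fan_out` are hypothetical names, and the `poster` callable is injected so the sketch runs without network access.

```ruby
require "logger"

# Hypothetical helper: turns a string-or-array setting into a clean URL list.
def normalize_urls(setting)
  Array(setting).flatten.compact.reject { |url| url.to_s.strip.empty? }
end

# Sends the same payload to every URL. An error on one target is logged
# and skipped, so the remaining targets still receive the payload.
# Returns the list of URLs that were delivered to successfully.
def fan_out(setting, payload, poster:, logger: Logger.new($stderr))
  normalize_urls(setting).each_with_object([]) do |url, delivered|
    poster.call(url, payload)
    delivered << url
  rescue StandardError => e
    logger.error("webhook to #{url} failed: #{e.message}")
  end
end
```

Passing a single string or an array goes through the same code path, which is why the setting can accept either form.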
132
146
  The request body is JSON:
133
147
 
134
148
  ```json
@@ -142,6 +156,31 @@ The request body is JSON:
142
156
 
143
157
  The webhook fires asynchronously in a background thread so dashboard page loads are never delayed. HTTP errors are logged to `Rails.logger` and swallowed. The cooldown window prevents repeated alerts while the count stays elevated — the clock resets on each app restart.
144
158
 
159
+ ## Queue depth alerts
160
+
161
+ Set `alert_queue_thresholds` to fire a webhook when any queue's ready job count meets or exceeds a per-queue limit:
162
+
163
+ ```ruby
164
+ SolidQueueWeb.configure do |config|
165
+ config.alert_webhook_url = "https://hooks.example.com/solid-queue"
166
+ config.alert_queue_thresholds = { "critical" => 50, "default" => 200 }
167
+ end
168
+ ```
169
+
170
+ The same `alert_webhook_url` endpoint(s) receive the payload, with a distinct event type so you can route it differently:
171
+
172
+ ```json
173
+ {
174
+ "event": "queue_depth_threshold_exceeded",
175
+ "queue_name": "critical",
176
+ "depth": 63,
177
+ "threshold": 50,
178
+ "fired_at": "2026-05-21T12:34:56Z"
179
+ }
180
+ ```
181
+
182
+ Cooldown is tracked independently per queue, so a persistently deep "critical" queue does not suppress alerts for "default". The shared `alert_webhook_cooldown` setting applies to each queue separately.
183
+
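The per-queue cooldown can be pictured as one timestamp per queue name, guarded by a mutex. The sketch below assumes an in-memory tracker (so state would be lost on restart); `CooldownTracker` is a hypothetical class used only for illustration.

```ruby
# Hypothetical class for illustration; not part of the gem's public API.
class CooldownTracker
  def initialize(cooldown_seconds)
    @cooldown = cooldown_seconds
    @last_fired_at = {}
    @mutex = Mutex.new
  end

  # Returns true and records the time when `key` is outside its cooldown
  # window, false otherwise. Every key has its own independent window.
  def fire?(key, now: Time.now)
    @mutex.synchronize do
      last = @last_fired_at[key]
      next false if last && now - last < @cooldown

      @last_fired_at[key] = now
      true
    end
  end
end
```

With a 1800 second cooldown, a second alert for "critical" one minute after the first is suppressed, while a first alert for "default" at the same moment still fires.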
145
184
  ## Metrics endpoint
146
185
 
147
186
  `GET /jobs/metrics.json` returns a machine-readable JSON document suitable for Prometheus scraping, uptime monitors, or external dashboards. No configuration is required — the endpoint is available as soon as the engine is mounted.
@@ -205,15 +244,7 @@ When `connects_to` is `nil` (the default), no connection switching occurs and si
205
244
 
206
245
  ## Roadmap
207
246
 
208
- Post-1.0 planned features:
209
-
210
- **Operations**
211
- - Admin audit log — record who retried or discarded which jobs and when (requires host-app user identity)
212
- - Failed job retry with modified arguments — edit the arguments JSON from the job detail page before retrying; useful for correcting bad payloads without redeploying
213
-
214
- **Notifications**
215
- - Multiple webhook targets — support an array of `alert_webhook_url` values so alerts can fan out to Slack, PagerDuty, and custom endpoints simultaneously
216
- - Queue depth alert — fire a webhook when a queue's ready job count exceeds a configurable threshold (complements the existing failure-count alert)
247
+ See [ROADMAP.md](ROADMAP.md) for the full post-1.0 feature plan, organized by release milestone.
217
248
 
218
249
  Pull requests for any of these are welcome. See [Contributing](#contributing) below.
219
250
 
@@ -117,4 +117,28 @@
117
117
  background: var(--muted);
118
118
  border-color: var(--muted);
119
119
  color: #fff;
120
+ }
121
+
122
+ .sqd-textarea {
123
+ width: 100%;
124
+ padding: 0.5rem 0.75rem;
125
+ border: 1px solid var(--border);
126
+ border-radius: 5px;
127
+ font-size: 13px;
128
+ background: var(--surface);
129
+ color: var(--text);
130
+ line-height: 1.6;
131
+ resize: vertical;
132
+ box-sizing: border-box;
133
+ display: block;
134
+ }
135
+
136
+ .sqd-textarea:focus {
137
+ outline: 2px solid var(--primary);
138
+ outline-offset: -1px;
139
+ border-color: var(--primary);
140
+ }
141
+
142
+ .sqd-args-form__submit {
143
+ margin-top: 0.75rem;
120
144
  }
@@ -75,6 +75,13 @@
75
75
 
76
76
  .sqd-pre--muted { color: var(--muted); }
77
77
 
78
+ .sqd-error-details summary {
79
+ cursor: pointer;
80
+ list-style: none;
81
+ }
82
+ .sqd-error-details summary::-webkit-details-marker { display: none; }
83
+ .sqd-error-details .sqd-pre { margin-top: 0.5rem; }
84
+
78
85
  .sqd-error-header {
79
86
  font-size: 13px;
80
87
  padding: 0.5rem 0.75rem;
@@ -94,4 +94,8 @@
94
94
 
95
95
  .sqd-sparkline__bar--depth {
96
96
  background: var(--purple);
97
+ }
98
+
99
+ .sqd-sparkline__bar--failure {
100
+ background: var(--danger);
97
101
  }
data/app/controllers/solid_queue_web/dashboard_controller.rb CHANGED
@@ -3,6 +3,7 @@ module SolidQueueWeb
3
3
  def index
4
4
  @stats = DashboardStats.new
5
5
  AlertWebhook.call(failure_count: @stats.counts[:failed])
6
+ QueueDepthAlert.call
6
7
  end
7
8
  end
8
9
  end
data/app/controllers/solid_queue_web/failed_jobs/arguments_controller.rb ADDED
@@ -0,0 +1,15 @@
1
+ module SolidQueueWeb
2
+ class FailedJobs::ArgumentsController < ApplicationController
3
+ def update
4
+ execution = SolidQueue::FailedExecution.find(params[:failed_job_id])
5
+ new_arguments = JSON.parse(params[:arguments])
6
+ execution.job.update!(arguments: new_arguments)
7
+ execution.retry
8
+ redirect_to failed_jobs_path, notice: "Job arguments updated and queued for retry."
9
+ rescue JSON::ParserError
10
+ redirect_to job_path(execution.job), alert: "Invalid JSON: could not parse arguments."
11
+ rescue => e
12
+ redirect_to failed_jobs_path, alert: "Could not update job: #{e.message}"
13
+ end
14
+ end
15
+ end
data/app/controllers/solid_queue_web/failed_jobs/errors_controller.rb ADDED
@@ -0,0 +1,9 @@
1
+ module SolidQueueWeb
2
+ module FailedJobs
3
+ class ErrorsController < ApplicationController
4
+ def index
5
+ @groups = ErrorFrequencyReport.new.groups
6
+ end
7
+ end
8
+ end
9
+ end
data/app/services/solid_queue_web/alert_webhook.rb CHANGED
@@ -12,7 +12,8 @@ module SolidQueueWeb
12
12
  return if failure_count < SolidQueueWeb.alert_failure_threshold
13
13
  return unless should_fire?
14
14
 
15
- Thread.new { post(SolidQueueWeb.alert_webhook_url, failure_count) }
15
+ urls = webhook_urls
16
+ Thread.new { urls.each { |url| post(url, failure_count) } }
16
17
  end
17
18
 
18
19
  def reset!
@@ -22,7 +23,11 @@ module SolidQueueWeb
22
23
  private
23
24
 
24
25
  def configured?
25
- SolidQueueWeb.alert_webhook_url.present? && SolidQueueWeb.alert_failure_threshold.present?
26
+ webhook_urls.any? && SolidQueueWeb.alert_failure_threshold.present?
27
+ end
28
+
29
+ def webhook_urls
30
+ Array(SolidQueueWeb.alert_webhook_url).flatten.compact.select(&:present?)
26
31
  end
27
32
 
28
33
  def should_fire?
data/app/services/solid_queue_web/dashboard_stats.rb CHANGED
@@ -1,6 +1,6 @@
1
1
  module SolidQueueWeb
2
2
  class DashboardStats
3
- attr_reader :counts, :throughput, :sparkline, :depth_sparkline, :slow_jobs_count
3
+ attr_reader :counts, :throughput, :sparkline, :depth_sparkline, :failure_sparkline, :slow_jobs_count
4
4
 
5
5
  def initialize
6
6
  @now = Time.current
@@ -32,6 +32,13 @@ module SolidQueueWeb
32
32
  finished_times.count { |t| t >= from && t < to }
33
33
  end
34
34
 
35
+ failed_times = SolidQueue::FailedExecution.where(created_at: 12.hours.ago..@now).pluck(:created_at)
36
+ @failure_sparkline = 12.times.map do |i|
37
+ from = (12 - i).hours.ago
38
+ to = i == 11 ? @now : (11 - i).hours.ago
39
+ failed_times.count { |t| t >= from && t < to }
40
+ end
41
+
35
42
  threshold = SolidQueueWeb.slow_job_threshold
36
43
  @slow_jobs_count = threshold ? SolidQueue::ClaimedExecution.where("created_at <= ?", threshold.ago).count : 0
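For readers who want to check the bucketing logic outside Rails, here is a minimal plain-Ruby sketch of hourly bucketing. It is an alternative formulation under stated assumptions, not the code in this diff: `hourly_buckets` is a hypothetical helper that assigns each timestamp to its bucket in a single pass instead of scanning the list once per hour.

```ruby
# Buckets timestamps into `hours` consecutive one-hour windows ending at `now`.
# Index 0 is the oldest hour; the last index is the most recent hour.
# Timestamps before the window or at/after `now` are ignored.
def hourly_buckets(times, now:, hours: 12)
  start = now - hours * 3600
  counts = Array.new(hours, 0)
  times.each do |t|
    next if t < start || t >= now

    counts[((t - start) / 3600).floor] += 1
  end
  counts
end
```

Each window is half-open (`from <= t < to`), matching the `t >= from && t < to` comparison used for the sparkline, so no failure is counted in two bars.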
37
44
 
data/app/services/solid_queue_web/error_frequency_report.rb ADDED
@@ -0,0 +1,34 @@
1
+ module SolidQueueWeb
2
+ class ErrorFrequencyReport
3
+ Row = Data.define(:exception_class, :message_prefix, :count, :sample_backtrace)
4
+
5
+ MESSAGE_LIMIT = 120
6
+
7
+ def groups
8
+ SolidQueue::FailedExecution
9
+ .order(created_at: :desc)
10
+ .each_with_object({}) do |execution, acc|
11
+ key = [execution.exception_class.to_s, message_prefix(execution.message)]
12
+ entry = acc[key] ||= { count: 0, sample_backtrace: nil }
13
+ entry[:count] += 1
14
+ entry[:sample_backtrace] ||= execution.backtrace
15
+ end
16
+ .map do |(exception_class, prefix), data|
17
+ Row.new(
18
+ exception_class: exception_class,
19
+ message_prefix: prefix,
20
+ count: data[:count],
21
+ sample_backtrace: data[:sample_backtrace]
22
+ )
23
+ end
24
+ .sort_by { |row| -row.count }
25
+ end
26
+
27
+ private
28
+
29
+ def message_prefix(message)
30
+ return "" if message.nil?
31
+ message.length > MESSAGE_LIMIT ? "#{message[0, MESSAGE_LIMIT]}…" : message
32
+ end
33
+ end
34
+ end
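The grouping rule in this report (exception class plus a message prefix capped at 120 characters) can be tried on in-memory data. The sketch below is an illustration only: `FailedJob` and `error_groups` are hypothetical stand-ins for the Active Record rows, and it uses `group_by` rather than the accumulator hash shown in the diff.

```ruby
# Hypothetical stand-in for a failed execution row.
FailedJob = Struct.new(:exception_class, :message, :backtrace, keyword_init: true)

MESSAGE_LIMIT = 120

# Truncates long messages so that near-identical errors share one key.
def message_prefix(message)
  return "" if message.nil?

  message.length > MESSAGE_LIMIT ? "#{message[0, MESSAGE_LIMIT]}…" : message
end

# Groups failures by [class, message prefix], most frequent group first.
# The first member of each group supplies the sample backtrace.
def error_groups(failures)
  failures
    .group_by { |f| [f.exception_class.to_s, message_prefix(f.message)] }
    .map do |(klass, prefix), members|
      { exception_class: klass, message_prefix: prefix,
        count: members.size, sample_backtrace: members.first.backtrace }
    end
    .sort_by { |row| -row[:count] }
end
```

One consequence worth noting: two messages that differ only after the 120th character fall into the same group, while short messages that differ by an embedded ID stay in separate groups.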
data/app/services/solid_queue_web/job_performance_stats.rb CHANGED
@@ -1,6 +1,6 @@
1
1
  module SolidQueueWeb
2
2
  class JobPerformanceStats
3
- Row = Struct.new(:class_name, :count, :avg, :p50, :p95, :min, :max, keyword_init: true)
3
+ Row = Struct.new(:class_name, :count, :avg, :p50, :p95, :p99, :std_dev, :min, :max, keyword_init: true)
4
4
 
5
5
  def initialize(scope)
6
6
  @scope = scope
@@ -18,6 +18,8 @@ module SolidQueueWeb
18
18
  avg: mean(durations),
19
19
  p50: percentile(durations, 50),
20
20
  p95: percentile(durations, 95),
21
+ p99: percentile(durations, 99),
22
+ std_dev: std_dev(durations),
21
23
  min: durations.first,
22
24
  max: durations.last
23
25
  )
@@ -34,5 +36,11 @@ module SolidQueueWeb
34
36
  idx = [(pct / 100.0 * sorted.size).ceil - 1, 0].max
35
37
  sorted[idx]
36
38
  end
39
+
40
+ def std_dev(sorted)
41
+ return 0.0 if sorted.size < 2
42
+ m = mean(sorted)
43
+ Math.sqrt(sorted.sum { |x| (x - m)**2 } / sorted.size)
44
+ end
37
45
  end
38
46
  end
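A standalone sketch of the two statistics may help when reading the new columns. It mirrors the nearest-rank index formula and the population standard deviation (dividing by n) visible in this hunk; the empty-input guard in `percentile` is an assumption added for the sketch.

```ruby
# Nearest-rank percentile over an array sorted in ascending order.
def percentile(sorted, pct)
  return nil if sorted.empty? # guard added for this sketch

  idx = [(pct / 100.0 * sorted.size).ceil - 1, 0].max
  sorted[idx]
end

# Population standard deviation: divides by n, not n - 1.
def std_dev(values)
  return 0.0 if values.size < 2

  mean = values.sum.to_f / values.size
  Math.sqrt(values.sum { |x| (x - mean)**2 } / values.size)
end

durations = [1.0, 2.0, 3.0, 4.0, 100.0]
percentile(durations, 50)          # => 3.0
percentile(durations, 95)          # => 100.0
percentile(durations, 99)          # => 100.0
std_dev([2, 4, 4, 4, 5, 5, 7, 9])  # => 2.0
```

With only a handful of runs, p95 and p99 both resolve to the maximum, so the extra column becomes informative only once a job class has accumulated enough samples.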
data/app/services/solid_queue_web/queue_depth_alert.rb ADDED
@@ -0,0 +1,74 @@
1
+ require "net/http"
2
+ require "json"
3
+ require "uri"
4
+
5
+ module SolidQueueWeb
6
+ class QueueDepthAlert
7
+ MUTEX = Mutex.new
8
+
9
+ class << self
10
+ def call
11
+ return unless configured?
12
+
13
+ queue_depths = SolidQueue::ReadyExecution
14
+ .joins(:job)
15
+ .group("solid_queue_jobs.queue_name")
16
+ .count
17
+
18
+ queue_depths.each do |queue_name, depth|
19
+ threshold = SolidQueueWeb.alert_queue_thresholds[queue_name.to_s]
20
+ next unless threshold && depth >= threshold
21
+ next unless should_fire?(queue_name)
22
+
23
+ urls = webhook_urls
24
+ Thread.new { urls.each { |url| post(url, queue_name, depth, threshold) } }
25
+ end
26
+ end
27
+
28
+ def reset!
29
+ MUTEX.synchronize { @last_fired_at = {} }
30
+ end
31
+
32
+ private
33
+
34
+ def configured?
35
+ SolidQueueWeb.alert_queue_thresholds.any? && webhook_urls.any?
36
+ end
37
+
38
+ def webhook_urls
39
+ Array(SolidQueueWeb.alert_webhook_url).flatten.compact.select(&:present?)
40
+ end
41
+
42
+ def should_fire?(queue_name)
43
+ MUTEX.synchronize do
44
+ @last_fired_at ||= {}
45
+ cooldown = SolidQueueWeb.alert_webhook_cooldown
46
+ return false if @last_fired_at[queue_name] && Time.current - @last_fired_at[queue_name] < cooldown
47
+
48
+ @last_fired_at[queue_name] = Time.current
49
+ true
50
+ end
51
+ end
52
+
53
+ def post(url_string, queue_name, depth, threshold)
54
+ uri = URI.parse(url_string)
55
+ payload = JSON.generate(
56
+ event: "queue_depth_threshold_exceeded",
57
+ queue_name: queue_name,
58
+ depth: depth,
59
+ threshold: threshold,
60
+ fired_at: Time.current.iso8601
61
+ )
62
+ http = Net::HTTP.new(uri.host, uri.port)
63
+ http.use_ssl = uri.scheme == "https"
64
+ http.open_timeout = 5
65
+ http.read_timeout = 10
66
+ request = Net::HTTP::Post.new(uri.path.presence || "/", "Content-Type" => "application/json")
67
+ request.body = payload
68
+ http.request(request)
69
+ rescue => e
70
+ Rails.logger.error("[SolidQueueWeb] Queue depth alert webhook failed: #{e.message}")
71
+ end
72
+ end
73
+ end
74
+ end
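The threshold check at the heart of this class can be isolated as a pure function. This is a sketch for illustration; `breached_queues` is a hypothetical helper, and the depth hash stands in for the grouped count query.

```ruby
# Returns [queue_name, depth, threshold] for every queue whose depth meets
# or exceeds its configured threshold. Thresholds are looked up by string
# key, mirroring the `queue_name.to_s` lookup in the class above.
def breached_queues(depths, thresholds)
  depths.filter_map do |queue_name, depth|
    threshold = thresholds[queue_name.to_s]
    [queue_name, depth, threshold] if threshold && depth >= threshold
  end
end
```

Two details follow from the string lookup and the grouped count: a thresholds hash written with symbol keys (`{ critical: 50 }`) would never match, and a queue with no configured threshold is ignored however deep it gets.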
data/app/views/solid_queue_web/dashboard/index.html.erb CHANGED
@@ -104,6 +104,35 @@
104
104
  <% end %>
105
105
  </div>
106
106
 
107
+ <% max_failures = [@stats.failure_sparkline.max, 1].max %>
108
+ <div class="sqd-card" style="margin-bottom: 1rem;">
109
+ <div class="sqd-card__header">
110
+ <span class="sqd-card__title">Failures &mdash; Last 12 Hours</span>
111
+ <div class="sqd-throughput__summary">
112
+ <span>Total: <strong><%= @stats.failure_sparkline.sum %></strong></span>
113
+ </div>
114
+ </div>
115
+ <% if @stats.failure_sparkline.all?(&:zero?) %>
116
+ <div class="sqd-sparkline__empty">No failures in the last 12 hours</div>
117
+ <% else %>
118
+ <div class="sqd-sparkline" aria-label="Failed jobs per hour over the last 12 hours">
119
+ <% @stats.failure_sparkline.each_with_index do |count, i| %>
120
+ <% pct = (count.to_f / max_failures * 100).round %>
121
+ <% hour_start = (12 - i).hours.ago %>
122
+ <% show_tick = [0, 3, 6, 9, 11].include?(i) %>
123
+ <div class="sqd-sparkline__col">
124
+ <div class="sqd-sparkline__bar-wrap">
125
+ <div class="sqd-sparkline__bar sqd-sparkline__bar--failure"
126
+ style="height: <%= [pct, 3].max %>%"
127
+ title="<%= hour_start.strftime('%-I%p').downcase %>: <%= count %> <%= "failure".pluralize(count) %>"></div>
128
+ </div>
129
+ <div class="sqd-sparkline__tick"><%= show_tick ? (i == 11 ? "now" : hour_start.strftime("%-I%p").downcase) : "" %></div>
130
+ </div>
131
+ <% end %>
132
+ </div>
133
+ <% end %>
134
+ </div>
135
+
107
136
  <div style="display:grid; grid-template-columns: repeat(auto-fit, minmax(240px, 1fr)); gap: 1rem;">
108
137
  <div class="sqd-card">
109
138
  <div class="sqd-card__header">
data/app/views/solid_queue_web/failed_jobs/errors/index.html.erb ADDED
@@ -0,0 +1,44 @@
1
+ <div class="sqd-page-header">
2
+ <h1 class="sqd-page-title">Error Summary</h1>
3
+ <div class="sqd-actions">
4
+ <%= link_to "← Failed Jobs", failed_jobs_path, class: "sqd-btn sqd-btn--muted sqd-btn--sm" %>
5
+ </div>
6
+ </div>
7
+
8
+ <% if @groups.any? %>
9
+ <div class="sqd-card">
10
+ <table>
11
+ <thead>
12
+ <tr>
13
+ <th scope="col">Error Class</th>
14
+ <th scope="col">Message</th>
15
+ <th scope="col" style="text-align: right;">Count</th>
16
+ </tr>
17
+ </thead>
18
+ <tbody>
19
+ <% @groups.each do |group| %>
20
+ <tr>
21
+ <td class="sqd-mono"><%= group.exception_class.presence || "—" %></td>
22
+ <td>
23
+ <% if group.sample_backtrace.present? %>
24
+ <details class="sqd-error-details">
25
+ <summary class="sqd-truncate" title="<%= group.message_prefix %>">
26
+ <%= group.message_prefix.presence || "—" %>
27
+ </summary>
28
+ <pre class="sqd-pre sqd-pre--muted"><%= Array(group.sample_backtrace).first(10).join("\n") %></pre>
29
+ </details>
30
+ <% else %>
31
+ <span class="sqd-truncate" title="<%= group.message_prefix %>"><%= group.message_prefix.presence || "—" %></span>
32
+ <% end %>
33
+ </td>
34
+ <td style="text-align: right;"><%= group.count %></td>
35
+ </tr>
36
+ <% end %>
37
+ </tbody>
38
+ </table>
39
+ </div>
40
+ <% else %>
41
+ <div class="sqd-card">
42
+ <div class="sqd-empty">No failed jobs. All clear!</div>
43
+ </div>
44
+ <% end %>
data/app/views/solid_queue_web/failed_jobs/index.html.erb CHANGED
@@ -2,6 +2,7 @@
2
2
  <h1 class="sqd-page-title">Failed Jobs</h1>
3
3
  <% if @failed_jobs.any? %>
4
4
  <div class="sqd-actions">
5
+ <%= link_to "Error Summary", failed_job_errors_path, class: "sqd-btn sqd-btn--muted sqd-btn--sm" %>
5
6
  <%= link_to "Export CSV", failed_jobs_path(format: :csv, queue: @queue, q: @search, period: @period),
6
7
  class: "sqd-btn sqd-btn--muted", data: { turbo: false } %>
7
8
  <%= button_to "Retry All", retry_all_failed_jobs_path,
@@ -63,7 +63,19 @@
63
63
 
64
64
  <div class="sqd-card sqd-detail-section">
65
65
  <h2 class="sqd-section-title">Arguments</h2>
66
- <pre class="sqd-pre"><%= JSON.pretty_generate(@job.arguments) rescue @job.arguments.inspect %></pre>
66
+ <% if @execution_status == "failed" && @job.failed_execution %>
67
+ <% args_json = begin; JSON.pretty_generate(@job.arguments); rescue; @job.arguments.inspect; end %>
68
+ <%= form_with url: failed_job_arguments_path(@job.failed_execution), method: :patch do |f| %>
69
+ <%= f.text_area :arguments,
70
+ value: args_json,
71
+ class: "sqd-textarea sqd-mono",
72
+ rows: [args_json.lines.count + 1, 6].max,
73
+ "aria-label": "Job arguments JSON" %>
74
+ <%= f.submit "Retry with these arguments", class: "sqd-btn sqd-btn--primary sqd-args-form__submit" %>
75
+ <% end %>
76
+ <% else %>
77
+ <pre class="sqd-pre"><%= JSON.pretty_generate(@job.arguments) rescue @job.arguments.inspect %></pre>
78
+ <% end %>
67
79
  </div>
68
80
  </div>
69
81
 
@@ -21,6 +21,8 @@
21
21
  <th scope="col" style="text-align: right;">Avg</th>
22
22
  <th scope="col" style="text-align: right;">p50</th>
23
23
  <th scope="col" style="text-align: right;">p95</th>
24
+ <th scope="col" style="text-align: right;">p99</th>
25
+ <th scope="col" style="text-align: right;">Std Dev</th>
24
26
  <th scope="col" style="text-align: right;">Min</th>
25
27
  <th scope="col" style="text-align: right;">Max</th>
26
28
  </tr>
@@ -36,6 +38,8 @@
36
38
  <td class="sqd-mono" style="text-align: right;"><%= format_duration(row.avg) %></td>
37
39
  <td class="sqd-mono" style="text-align: right;"><%= format_duration(row.p50) %></td>
38
40
  <td class="sqd-mono" style="text-align: right;"><%= format_duration(row.p95) %></td>
41
+ <td class="sqd-mono" style="text-align: right;"><%= format_duration(row.p99) %></td>
42
+ <td class="sqd-mono" style="text-align: right;"><%= format_duration(row.std_dev) %></td>
39
43
  <td class="sqd-mono" style="text-align: right;"><%= format_duration(row.min) %></td>
40
44
  <td class="sqd-mono" style="text-align: right;"><%= format_duration(row.max) %></td>
41
45
  </tr>
data/config/routes.rb CHANGED
@@ -35,9 +35,12 @@ SolidQueueWeb::Engine.routes.draw do
35
35
  end
36
36
  end
37
37
 
38
+ get "failed_jobs/errors", to: "failed_jobs/errors#index", as: :failed_job_errors
39
+
38
40
  resource :failed_job_selection, path: "failed_jobs/selection", only: [:create, :destroy],
39
41
  controller: "failed_jobs/selections"
40
42
  resources :failed_jobs, only: [:index, :destroy] do
43
+ resource :arguments, only: [:update], controller: "failed_jobs/arguments"
41
44
  collection do
42
45
  post :retry_all, to: "retry_failed_jobs#create"
43
46
  post :discard_all, action: :destroy
@@ -1,3 +1,3 @@
1
1
  module SolidQueueWeb
2
- VERSION = "1.0.0"
2
+ VERSION = "1.2.0"
3
3
  end
@@ -6,7 +6,7 @@ module SolidQueueWeb
6
6
  class << self
7
7
  attr_writer :page_size, :dashboard_refresh_interval, :default_refresh_interval, :search_results_limit,
8
8
  :slow_job_threshold, :alert_webhook_url, :alert_failure_threshold, :alert_webhook_cooldown,
9
- :connects_to
9
+ :alert_queue_thresholds, :connects_to
10
10
 
11
11
  def page_size
12
12
  @page_size || 25
@@ -40,6 +40,10 @@ module SolidQueueWeb
40
40
  @alert_webhook_cooldown || 3600
41
41
  end
42
42
 
43
+ def alert_queue_thresholds
44
+ @alert_queue_thresholds || {}
45
+ end
46
+
43
47
  def connects_to
44
48
  @connects_to
45
49
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: solid_queue_web
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Chuck Smith
@@ -93,9 +93,12 @@ dependencies:
93
93
  - - ">="
94
94
  - !ruby/object:Gem::Version
95
95
  version: '1.2'
96
- description: Mount SolidQueueWeb in any Rails app using Solid Queue to get a dashboard
97
- for your queues, jobs by status, failed executions, and job actions (retry, discard)
98
- all without leaving your app.
96
+ description: 'Mount SolidQueueWeb in any Rails app using Solid Queue to get a full-featured
97
+ job dashboard: inspect jobs by status (ready, scheduled, running, blocked, failed),
98
+ retry or discard failed jobs, reschedule or run scheduled jobs immediately, manage
99
+ recurring tasks, filter by queue/priority/period, export to CSV, detect slow jobs,
100
+ view queue depth sparklines, track job performance (p50/p95), and scrape a /metrics
101
+ JSON endpoint for external monitoring — all without leaving your app.'
99
102
  email:
100
103
  - eclectic-coding@users.noreply.github.com
101
104
  executables: []
@@ -121,6 +124,8 @@ files:
121
124
  - app/controllers/solid_queue_web/application_controller.rb
122
125
  - app/controllers/solid_queue_web/blocked_jobs_controller.rb
123
126
  - app/controllers/solid_queue_web/dashboard_controller.rb
127
+ - app/controllers/solid_queue_web/failed_jobs/arguments_controller.rb
128
+ - app/controllers/solid_queue_web/failed_jobs/errors_controller.rb
124
129
  - app/controllers/solid_queue_web/failed_jobs/selections_controller.rb
125
130
  - app/controllers/solid_queue_web/failed_jobs_controller.rb
126
131
  - app/controllers/solid_queue_web/history_controller.rb
@@ -148,11 +153,14 @@ files:
148
153
  - app/models/solid_queue_web/job.rb
149
154
  - app/services/solid_queue_web/alert_webhook.rb
150
155
  - app/services/solid_queue_web/dashboard_stats.rb
156
+ - app/services/solid_queue_web/error_frequency_report.rb
151
157
  - app/services/solid_queue_web/job_performance_stats.rb
152
158
  - app/services/solid_queue_web/metrics_payload.rb
159
+ - app/services/solid_queue_web/queue_depth_alert.rb
153
160
  - app/services/solid_queue_web/queue_stats.rb
154
161
  - app/views/layouts/solid_queue_web/application.html.erb
155
162
  - app/views/solid_queue_web/dashboard/index.html.erb
163
+ - app/views/solid_queue_web/failed_jobs/errors/index.html.erb
156
164
  - app/views/solid_queue_web/failed_jobs/index.html.erb
157
165
  - app/views/solid_queue_web/history/index.html.erb
158
166
  - app/views/solid_queue_web/jobs/destroy.turbo_stream.erb