rails_health_checks 1.1.0 → 1.2.0 (RubyGems)

Files changed (9)

checksums.yaml +4 -4
data/README.md +168 -10
data/app/controllers/rails_health_checks/live_controller.rb +3 -6
data/app/controllers/rails_health_checks/ready_controller.rb +14 -0
data/config/routes.rb +7 -4
data/lib/generators/rails_health_checks/templates/initializer.rb +7 -0
data/lib/rails_health_checks/configuration.rb +3 -1
data/lib/rails_health_checks/version.rb +1 -1
metadata +2 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 934bfd1962ca10009ba6925f81cbf57f09d8d63ab017a6452296bfb32994352b
-  data.tar.gz: 7ba0cfd783e433b26de67d2f9a2dc00a13e341584c88b1559c5f08e1f237187c
+  metadata.gz: 9033111f01524e17546525feaba1a215ac6e3cad1c7c8b0b12dd527850c74e7f
+  data.tar.gz: 3eded20b26844b046d33689f6a5512efa1a7c7964f04679552d00020d9a7aa78
 SHA512:
-  metadata.gz: 2e52cb1373a3a8c26bd7a01ceefa6a32dedfb5c06421d754d94150a56349349b3ef467e529e696e9a74efd2449b7f8078bfff5b1a3cf9f2915786e8b16468d87
-  data.tar.gz: 9c1dbc7ea623885abc7cf39acbdc63741fffb25698c93d3beddc501f598783956ede451c60456b8ba1e54bc54c1b725fb70a2a2edc727e6836081d8ff93587fc
+  metadata.gz: 7a8373f504fde389fd3e0202f3a08a1ea9e5a6af15dca997f8270e8dfcbca2cf5155a5f57a490392c6df1d3bc65f15a452b1c64ba3f96650b214007be5506623
+  data.tar.gz: 3f58fb48bd0a1a416e078f3e3bae0bb70e9e220c1cb93eaae25b2ec1c7710d56c9965e965fffa33f6d3922c132a120f8e0b554c18358d10eaa0787cebad0b778
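
A `.gem` file is a tar archive whose members include `metadata.gz` and `data.tar.gz`, and the digests above cover those members. As a minimal sketch, assuming the member has already been extracted, a digest can be compared with Ruby's standard `digest` library. The helper name `sha256_matches?` and the file path in the comment are hypothetical.

```ruby
require "digest"

# Returns true when the SHA256 of `bytes` equals the expected hex digest.
# Hypothetical helper for illustration; not part of the gem.
def sha256_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# Usage sketch (the local path is an assumption):
#   sha256_matches?(
#     File.binread("data.tar.gz"),
#     "3eded20b26844b046d33689f6a5512efa1a7c7964f04679552d00020d9a7aa78"
#   )
```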

data/README.md CHANGED

@@ -11,6 +11,7 @@ A Rails engine that adds production-grade health check endpoints to any Rails ap
 **Built-in checks:** database · cache · Redis · SMTP · Sidekiq · SolidQueue · GoodJob · Resque · disk · memory · HTTP
 **Key features:**
+- **Two-tier endpoints:** `/live` (liveness — process only) and `/ready` (readiness — all deps) prevent cascade failures in Kubernetes and behind load balancers
 - Parallel check execution via `Concurrent::Future` — response time bounded by the slowest check, not the sum
 - Result caching (`config.cache_duration`) to absorb high-frequency probe traffic
 - Prometheus text exposition at `GET /health/metrics` (always HTTP 200)
@@ -21,9 +22,14 @@ A Rails engine that adds production-grade health check endpoints to any Rails ap
 ## Table of Contents
+- [Upgrading](#upgrading)
 - [Installation](#installation)
 - [Rack Applications](#rack-applications)
 - [Endpoints](#endpoints)
+  - [Liveness vs. Readiness](#liveness-vs-readiness--why-two-tiers)
+  - [Kubernetes wiring](#kubernetes-wiring)
+  - [Load balancer wiring](#load-balancer-wiring)
+  - [Configuring endpoint paths](#configuring-endpoint-paths)
 - [Configuration](#configuration)
   - [Configuration Reference](#configuration-reference)
 - [Authentication](#authentication)
@@ -43,6 +49,30 @@ A Rails engine that adds production-grade health check endpoints to any Rails ap
 ---
+## Upgrading
+### v1.1.x → v1.2.x — breaking change to `/live`
+> **`GET /health/live` no longer runs dependency checks.**
+Prior to v1.2.0, `/live` ran all configured checks (database, Redis, etc.) and returned `503` if any failed. This was readiness behaviour under a liveness name and is the root cause of the cascade failure footgun described below.
+**What changed:** `/live` now returns `200 OK` whenever the Ruby process is alive, regardless of dependency state. Authentication is also skipped on this endpoint so Kubernetes and load balancer probes work without credentials.
+**What to do:** If you were relying on `/live` to verify dependencies, switch to the new `/health/ready` endpoint. No configuration changes required.
+```
+# Before (was running dependency checks — now only liveness)
+GET /health/live   →  200 if process alive (deps ignored)
+# New endpoint for dependency checks
+GET /health/ready  →  200 if all deps pass, 503 if any fail
+```
+[↑ Back to top](#table-of-contents)
+---
 ## Installation
 Add to your Gemfile:
@@ -125,8 +155,9 @@ The routes are identical to the Rails engine, relative to the mount point:
 | Endpoint | Format | Use case |
 |----------|--------|----------|
-| `GET/HEAD /` | JSON | Health status |
-| `GET/HEAD /live` | Plain text | Liveness probe |
+| `GET/HEAD /` | JSON | Full dependency health (monitoring dashboards) |
+| `GET/HEAD /live` | Plain text | Liveness probe — process only, no deps |
+| `GET/HEAD /ready` | Plain text | Readiness probe — all configured dependency checks |
 | `GET /metrics` | Prometheus text | Prometheus scraping |
 | `GET /:group` | JSON | Scoped check group |
@@ -174,16 +205,140 @@ Token and IP allowlist strategies are unchanged.
 ## Endpoints
-| Endpoint | Format | Use case |
-|----------|--------|----------|
-| `GET /health` | JSON | Monitoring dashboards, detailed diagnostics |
-| `GET /health/live` | Plain text | Load balancer liveness probes |
-| `GET /health/metrics` | Prometheus text | Prometheus / OpenMetrics scraping |
-| `GET /health/:group` | JSON | Scoped check group (e.g. `/health/workers`) |
+| Endpoint | Runs checks? | Format | Use case |
+|----------|-------------|--------|----------|
+| `GET /health/live` | No — process only | Plain text | Kubernetes `livenessProbe`, load balancer health check |
+| `GET /health/ready` | Yes — all configured deps | Plain text | Kubernetes `readinessProbe`, external uptime monitors |
+| `GET /health` | Yes — all configured deps | JSON | Monitoring dashboards, alerting pipelines |
+| `GET /health/metrics` | Yes — all configured deps | Prometheus text | Prometheus / OpenMetrics scraping |
+| `GET /health/:group` | Yes — named subset | JSON | Scoped group (e.g. `/health/workers`) |
+`/health/live`, `/health/ready`, and `/health` also respond to `HEAD` requests.
-`/health` and `/health/live` also respond to `HEAD` requests (useful for lightweight load balancer probes).
+HTTP status: `200 OK` when all checks pass, `503 Service Unavailable` when any check fails (except `/metrics` which always returns `200`, and `/live` which always returns `200`).
+---
+### Liveness vs. Readiness — why two tiers?
+**Using a single health endpoint for both load balancer checks and dependency monitoring is a cascade failure footgun.** Here is the exact failure chain:
+1. Your database has a 30-second blip
+2. All running pods probe `/health/ready` → all return `503`
+3. The load balancer removes every pod from rotation simultaneously
+4. Traffic has nowhere to go — the app is fully down
+5. If the same endpoint drives `livenessProbe`, Kubernetes begins restarting every pod
+6. Restarting pods reconnect to the still-blipping database, fail again, restart again
+7. What was a 30-second DB hiccup is now a multi-minute outage driven by a thundering herd of pod restarts
+The fix is to separate the two concerns:
+| Endpoint | Question it answers | Correct probe |
+|----------|--------------------|--------------:|
+| `/health/live` | Is the process running and responsive? | `livenessProbe`, LB health check |
+| `/health/ready` | Are all dependencies reachable? | `readinessProbe`, uptime monitor |
+**Liveness (`/health/live`)** — returns `200 OK` as long as the Ruby process responds. No dependency checks run. Authentication is skipped so Kubernetes and load balancers work without credentials. When this fails, k8s restarts the pod because the process itself is stuck or crashed.
+**Readiness (`/health/ready`)** — runs all configured dependency checks. Returns `503` if any check fails. When this fails, k8s stops routing traffic to the pod but leaves it running. The pod rejoins rotation automatically once dependencies recover — no restart, no thundering herd.
+**Deep JSON (`/health`)** — same dependency checks as `/ready`, returned as structured JSON with per-check status and latency. Use for monitoring dashboards, alerting, or anywhere you need machine-readable detail. Do not use for liveness or readiness probes.
+---
+### Kubernetes wiring
+```yaml
+containers:
+  - name: web
+    ports:
+      - containerPort: 3000
+    livenessProbe:
+      httpGet:
+        path: /health/live   # process-only — DB blip does NOT restart this pod
+        port: 3000
+      initialDelaySeconds: 10
+      periodSeconds: 10
+      failureThreshold: 3    # restarts only if the process stops responding entirely
+    readinessProbe:
+      httpGet:
+        path: /health/ready  # dep checks — stops traffic but does NOT restart the pod
+        port: 3000
+      initialDelaySeconds: 5
+      periodSeconds: 10
+      failureThreshold: 2    # removes from rotation after 2 consecutive dep failures
+    startupProbe:            # optional: give the app time to boot before probing
+      httpGet:
+        path: /health/live
+        port: 3000
+      failureThreshold: 30
+      periodSeconds: 5
+```
+> **Warning:** Do not point `livenessProbe` at `/health/ready`. A single dependency failure will cause Kubernetes to restart every pod simultaneously, turning a recoverable dep outage into a full application restart loop.
+---
-HTTP status is `200 OK` when all checks pass, `503 Service Unavailable` otherwise (except `/metrics` which always returns `200`).
+### Load balancer wiring
+Always use the liveness endpoint for load balancer health checks. If you use the readiness endpoint and a dependency blips, the load balancer ejects all nodes at once and traffic has nowhere to go.
+**AWS ALB / NLB (target group health check)**
+```
+Health check path:    /health/live
+Healthy threshold:    2
+Unhealthy threshold:  3
+Timeout:              5s
+Interval:             10s
+```
+**Nginx upstream**
+```nginx
+upstream rails_app {
+  server app1:3000;
+  server app2:3000;
+}
+server {
+  location /health/live {
+    proxy_pass http://rails_app;
+  }
+}
+```
+**HAProxy**
+```
+backend rails_app
+  option httpchk GET /health/live
+  server app1 app1:3000 check
+  server app2 app2:3000 check
+```
+> **Note:** Reserve `/health/ready` for Kubernetes `readinessProbe` and external uptime monitors (Pingdom, UptimeRobot, Better Uptime). These are the right tools to alert you when dependencies are down — the load balancer is not.
+---
+### Configuring endpoint paths
+The readiness path defaults to `ready` (i.e. `/health/ready` when the engine is mounted at `/health`). Override it in your initializer:
+```ruby
+RailsHealthChecks.configure do |config|
+  config.readiness_path = "readyz"  # → /health/readyz
+end
+```
+The engine mount point is configurable in `config/routes.rb`:
+```ruby
+mount RailsHealthChecks::Engine => "/healthz"
+# exposes: /healthz/live, /healthz/ready, /healthz, /healthz/metrics
+```
+---
 ### JSON response shape
@@ -329,6 +484,7 @@ Configuration is validated at boot time. An unknown check name, a missing `http_
 | `checks` | `Array` | `[:database]` | Built-in or custom check names to run |
 | `timeout` | `Integer` | `5` | Global per-check timeout in seconds |
 | `cache_duration` | `Integer\|nil` | `nil` | Cache results for N seconds; `nil` disables caching |
+| `readiness_path` | `String` | `"ready"` | Path of the readiness endpoint within the engine (e.g. `"ready"` → `/health/ready`) |
 | `token` | `String\|nil` | `nil` | Bearer token for authentication |
 | `allowed_ips` | `Array\|nil` | `nil` | IP allowlist; accepts exact IPs and CIDR ranges |
 | `redis_url` | `String\|nil` | `nil` | Redis URL for `:redis` check; falls back to `REDIS_URL` env var then `redis://localhost:6379/0` |
@@ -354,6 +510,8 @@ Configuration is validated at boot time. An unknown check name, a missing `http_
 By default health endpoints are public. Use one of the following strategies to restrict access. Unauthenticated requests receive `401 Unauthorized`.
+> **Note:** `GET /health/live` always bypasses authentication regardless of the configured strategy. Liveness probes are called by Kubernetes and load balancers which cannot pass credentials, so enforcing auth on this endpoint would break infrastructure probing.
 ### Bearer token
 ```ruby

data/app/controllers/rails_health_checks/live_controller.rb CHANGED

@@ -2,13 +2,10 @@
 module RailsHealthChecks
   class LiveController < ApplicationController
+    skip_before_action :authenticate!
     def show
-      builder = ResponseBuilder.new(run_checks(RailsHealthChecks.configuration.checks))
-      if builder.overall_status == "ok"
-        render plain: "OK", status: :ok
-      else
-        render plain: "Service Unavailable", status: :service_unavailable
-      end
+      render plain: "OK", status: :ok
     end
   end
 end
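
The effect of this hunk can be modelled in plain Ruby without Rails. The sketch below assumes a hypothetical hash of check results and a liveness probe with `failureThreshold: 3`, as in the README's Kubernetes example. The method names are illustrative and do not exist in the gem.

```ruby
# Status the liveness endpoint returned in 1.1.0: it depended on every check.
def live_status_v1_1(check_results)
  check_results.values.all? ? 200 : 503
end

# Status in 1.2.0: constant for as long as the process can answer.
def live_status_v1_2(_check_results)
  200
end

# Kubernetes treats HTTP 200..399 as probe success and restarts the container
# after `failure_threshold` consecutive liveness failures.
def restarted?(statuses, failure_threshold: 3)
  statuses.each_cons(failure_threshold).any? do |window|
    window.none? { |status| (200...400).cover?(status) }
  end
end

# Hypothetical check results during a database outage.
blip = { database: false, redis: true }

probes_old = Array.new(3) { live_status_v1_1(blip) }
probes_new = Array.new(3) { live_status_v1_2(blip) }

puts restarted?(probes_old)
puts restarted?(probes_new)
```

Under these assumptions the 1.1.0 behaviour crosses the restart threshold during the outage, while the 1.2.0 behaviour never does.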

data/app/controllers/rails_health_checks/ready_controller.rb ADDED

@@ -0,0 +1,14 @@
+# frozen_string_literal: true
+module RailsHealthChecks
+  class ReadyController < ApplicationController
+    def show
+      builder = ResponseBuilder.new(run_checks(RailsHealthChecks.configuration.checks))
+      if builder.overall_status == "ok"
+        render plain: "OK", status: :ok
+      else
+        render plain: "Service Unavailable", status: :service_unavailable
+      end
+    end
+  end
+end
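
The diff does not show `run_checks` or `ResponseBuilder`, so the following is only a rough model of the decision the new controller makes, which is that any failing check yields `503`. It assumes each check is a callable that returns a truthy value when healthy. Treating a raised exception as a failure is likewise an assumption, not confirmed gem behaviour.

```ruby
# Hypothetical stand-in for the readiness decision. `checks` maps a name to a
# callable; a falsy return value or a raised StandardError counts as a failure.
def readiness_response(checks)
  failed = checks.reject do |_name, check|
    begin
      check.call
    rescue StandardError
      false
    end
  end

  failed.empty? ? [200, "OK"] : [503, "Service Unavailable"]
end

# Usage sketch:
#   readiness_response(database: -> { true })        # all checks pass
#   readiness_response(database: -> { raise "down" }) # one check fails
```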

data/config/routes.rb CHANGED

@@ -1,8 +1,11 @@
 # frozen_string_literal: true
 RailsHealthChecks::Engine.routes.draw do
-  match "/",      to: "health#show",   as: :health,      via: [:get, :head]
-  match "/live",  to: "live#show",     as: :health_live, via: [:get, :head]
-  get "/metrics", to: "metrics#show",  as: :health_metrics
-  get "/:id",     to: "groups#show",   as: :health_group
+  readiness_path = RailsHealthChecks.configuration.readiness_path
+  match "/",                    to: "health#show",   as: :health,          via: [:get, :head]
+  match "/live",                to: "live#show",     as: :health_live,     via: [:get, :head]
+  match "/#{readiness_path}",   to: "ready#show",    as: :health_ready,    via: [:get, :head]
+  get "/metrics",               to: "metrics#show",  as: :health_metrics
+  get "/:id",                   to: "groups#show",   as: :health_group
 end
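
Rails matches routes in the order they are defined, so the position of the readiness route ahead of the `/:id` catch-all matters, and the interpolated path is fixed when the routes are drawn. The plain-Ruby sketch below approximates that first-match behaviour. `build_routes` and `resolve` are hypothetical helpers, and the `:id` pattern is simplified to "any single path segment".

```ruby
# Ordered route table mirroring the new routes.rb for a given readiness path.
def build_routes(readiness_path)
  [
    ["/",                  "health#show"],
    ["/live",              "live#show"],
    ["/#{readiness_path}", "ready#show"],
    ["/metrics",           "metrics#show"],
    ["/:id",               "groups#show"]
  ]
end

# First matching route wins; "/:id" matches any single segment.
def resolve(routes, path)
  routes.each do |pattern, target|
    return target if pattern == path
    return target if pattern == "/:id" && path.match?(%r{\A/[^/]+\z})
  end
  nil
end

puts resolve(build_routes("ready"), "/ready")
puts resolve(build_routes("readyz"), "/ready")
puts resolve(build_routes("live"), "/live")
```

In this model, renaming the readiness path sends requests for the old `/ready` path to the group route, and a `readiness_path` of `"live"` is shadowed by the liveness route defined before it.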

data/lib/generators/rails_health_checks/templates/initializer.rb CHANGED

@@ -12,6 +12,13 @@ RailsHealthChecks.configure do |config|
   # Cache check results for N seconds to avoid re-running on every request (default: nil, disabled)
   # config.cache_duration = 10
+  # ---------------------------------------------------------------------------
+  # Endpoint paths
+  # ---------------------------------------------------------------------------
+  # Path for the readiness endpoint within the engine (default: "ready").
+  # When the engine is mounted at "/health", the readiness endpoint is "/health/ready".
+  # config.readiness_path = "ready"
   # ---------------------------------------------------------------------------
   # Authentication — all strategies are mutually exclusive; default is public
   # ---------------------------------------------------------------------------

data/lib/rails_health_checks/configuration.rb CHANGED

@@ -12,7 +12,8 @@ module RailsHealthChecks
                   :smtp_address, :smtp_port,
                   :sidekiq_queue_size, :solid_queue_job_count, :good_job_latency,
                   :resque_queue_size, :disk_warn_threshold, :disk_critical_threshold, :disk_path,
-                  :memory_threshold, :http_url, :http_expected_status, :http_headers
+                  :memory_threshold, :http_url, :http_expected_status, :http_headers,
+                  :readiness_path
     attr_reader :authenticate_block, :custom_checks, :groups
     def initialize
@@ -39,6 +40,7 @@ module RailsHealthChecks
       @custom_checks = {}
       @groups = {}
       @disabled_checks = {}
+      @readiness_path = "ready"
     end
     def checks
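
The diff only shows the new accessor and its default, not how `RailsHealthChecks.configure` is implemented. A common shape for this pattern, sketched here under a hypothetical `HealthChecksSketch` module rather than the gem's own namespace, is a memoized configuration object that is yielded to the block:

```ruby
# Hypothetical stand-in module; names mirror the gem's public usage only.
module HealthChecksSketch
  class Configuration
    attr_accessor :readiness_path

    def initialize
      @readiness_path = "ready" # same default as the diff above
    end
  end

  # Memoized so that every caller sees the same settings.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yields the shared configuration object to the caller's block.
  def self.configure
    yield configuration
  end
end

# Usage sketch:
#   HealthChecksSketch.configure { |config| config.readiness_path = "readyz" }
#   HealthChecksSketch.configuration.readiness_path
```

Because `routes.rb` reads `readiness_path` when the routes are drawn, an override only takes effect if it is set before that point, which is the case for files in `config/initializers`.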

data/lib/rails_health_checks/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module RailsHealthChecks
-  VERSION = "1.1.0"
+  VERSION = "1.2.0"
 end
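
Since the README describes a breaking change to `/live` in what is numerically a minor release, it may help to check whether an existing Gemfile constraint would pick up 1.2.0. The sketch below uses RubyGems' own `Gem::Requirement`. The two constraints are illustrative examples, not taken from the package.

```ruby
require "rubygems"

new_version = Gem::Version.new("1.2.0")

# "~> 1.1" allows any release from 1.1 up to (not including) 2.0.
loose = Gem::Requirement.new("~> 1.1")

# "~> 1.1.0" allows only 1.1.x patch releases.
strict = Gem::Requirement.new("~> 1.1.0")

puts loose.satisfied_by?(new_version)   # true
puts strict.satisfied_by?(new_version)  # false
```

A project pinned with `~> 1.1` would therefore receive the new `/live` behaviour on its next `bundle update`, while one pinned with `~> 1.1.0` would not.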

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: rails_health_checks
 version: !ruby/object:Gem::Version
-  version: 1.1.0
+  version: 1.2.0
 platform: ruby
 authors:
 - Chuck Smith
@@ -57,6 +57,7 @@ files:
 - app/controllers/rails_health_checks/health_controller.rb
 - app/controllers/rails_health_checks/live_controller.rb
 - app/controllers/rails_health_checks/metrics_controller.rb
+- app/controllers/rails_health_checks/ready_controller.rb
 - app/jobs/rails_health_checks/application_job.rb
 - app/mailers/rails_health_checks/application_mailer.rb
 - app/models/rails_health_checks/application_record.rb