allgood 0.1.0 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: fe79d04db962fcfc50cb094c0b122506d09f903b7ee02e2ee7ae0c3930c78557
4
- data.tar.gz: fb9a69be909db1c10d5019dfadbdbee5b7da514913d8a907933cd2873e73f2f0
3
+ metadata.gz: 45bac1780ef5cb92516f0a6f02cd7aa7b476213d6595a937252c600dc3e870e0
4
+ data.tar.gz: b1f8f7e8e30609d67c6c28fb66cd651d13414f9161f74b3e3880f02113408c10
5
5
  SHA512:
6
- metadata.gz: fc26bbc3685f38fbfa49e05047987537c64e06c55356c28d0c8a0a50b73b6981c10d66c96996ca44057720402f4f5ed1319309a33402b25a39d0bdf472518496
7
- data.tar.gz: ea88c1027193b356560b2361e65a28c42d3b10683df7e777796ff5de10c0c6e3aa93f7bb42fa70ef6ce9699d288e4efa74ceaa82667a5208ec006aa41506e6ea
6
+ metadata.gz: b169c7d38987605312e2e013f645814e12fa23f2b0574793fa560f9f9ec9f97491eea5347c625e8b6d0909f3ec32c912a744925906f9e66b710486eb365cf35f
7
+ data.tar.gz: 0573a0b4918ec2e6e4a0ce40831d3e57afed34db5e8ada879d0db4450764dd0e72a8ffe4b3e779bf58758cf3d29546728600f9823437cdf1618627a48648fe5a
data/CHANGELOG.md CHANGED
@@ -1,4 +1,19 @@
1
- ## [Unreleased]
1
+ ## [0.3.0] - 2024-10-27
2
+
3
+ - Added rate limiting for expensive checks with the `run: "N times per day/hour"` option
4
+ - Added a cache mechanism to store check results and error states, which allows for rate limiting and avoiding redundant runs when checks fail
5
+ - Added automatic cache key expiration
6
+ - Added error handling and feedback for rate-limited checks
7
+
8
+ ## [0.2.0] - 2024-10-26
9
+
10
+ - Improved the `allgood` DSL by adding optional conditionals on when individual checks are run
11
+ - Allow for environment-specific checks with `only` and `except` options (`check "Test Check", only: [:development, :test]`)
12
+ - Allow for conditional checks with `if` and `unless` options, which can be procs or any other condition (`check "Test Check", if: -> { condition }`)
13
+ - Added visual indication of skipped checks in the healthcheck page
14
+ - Improved developer experience by showing why checks were skipped (didn't meet conditions, environment-specific, etc.)
15
+ - New DSL changes are fully backward compatible with the previous version (new options are optional, and checks will run normally if they are not specified), so the new version won't break existing configurations
16
+ - Changed configuration loading to happen after Rails initialization so we fix the segfault that could occur when requiring gems in the `allgood.rb` configuration file before Rails was initialized
2
17
 
3
18
  ## [0.1.0] - 2024-08-22
4
19
 
data/README.md CHANGED
@@ -1,12 +1,22 @@
1
1
  # ✅ Allgood - Rails gem for health checks
2
2
 
3
- Add quick, simple, and beautiful health checks to your Rails application.
3
+ [![Gem Version](https://badge.fury.io/rb/allgood.svg)](https://badge.fury.io/rb/allgood)
4
4
 
5
- `allgood` allows you to define custom, business-oriented health checks (as in: are there any new users in the past 24 hours, are they actually using the app, does the last record have all the attributes we expect, etc.) in a very intuitive way that reads just like English – and provides a `/healthcheck` endpoint that displays the results in a beautiful page.
5
+ Add quick, simple, and beautiful health checks to your Rails application via a `/healthcheck` page.
6
6
 
7
- You can then use that endpoint to monitor the health of your application via UptimeRobot, Pingdom, etc. These services will load your `/healthcheck` page every few minutes, so all checks will be run when UptimeRobot fetches the page.
7
+ Use it for smoke testing, to make sure your app is healthy and functioning as expected.
8
8
 
9
- ![alt text](allgood.jpeg)
9
+ ![Example dashboard of the Allgood health check page](allgood.jpeg)
10
+
11
+ ## How it works
12
+
13
+ `allgood` allows you to define custom health checks / smoke tests (as in: can the Rails app connect to the DB, are there any new users in the past 24 hours, are they actually using the app, etc.) in a very intuitive way that reads just like English.
14
+
15
+ It provides a `/healthcheck` endpoint that displays the results in a beautiful page.
16
+
17
+ You can then [use that endpoint to monitor the health of your application via UptimeRobot](https://uptimerobot.com/?rid=854006b5fe82e4), Pingdom, etc. These services will load your `/healthcheck` page every few minutes, so all checks will be run when UptimeRobot fetches the page.
18
+
19
+ `allgood` aims to provide developers with peace of mind by answering the question "is production okay?" at a glance.
10
20
 
11
21
  ## Installation
12
22
 
@@ -17,9 +27,10 @@ gem 'allgood'
17
27
 
18
28
  Then run `bundle install`.
19
29
 
20
- ## Usage
30
+ After installing the gem, you need to mount the `/healthcheck` route and define your health checks in a `config/allgood.rb` file.
21
31
 
22
- ### Mounting the Engine
32
+
33
+ ## Mount the `/healthcheck` route
23
34
 
24
35
  In your `config/routes.rb` file, mount the Allgood engine:
25
36
  ```ruby
@@ -28,36 +39,83 @@ mount Allgood::Engine => '/healthcheck'
28
39
 
29
40
  You can now navigate to `/healthcheck` to see the health check results.
30
41
 
31
- The `/healthcheck` page returns a `200` HTTP code if all checks are successful – and error `503 Service Unavailable` otherwise.
42
+ The `/healthcheck` page returns HTTP codes:
43
+ - `200 OK` if all checks are successful
44
+ - `503 Service Unavailable` error otherwise
45
+
46
+ Services like UptimeRobot pick up these HTTP codes, which makes monitoring easy.
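As a rough illustration of how an external monitor might interpret these status codes, here is a minimal sketch in plain Ruby. The helper names `healthy_status?` and `probe` are hypothetical and not part of `allgood`; the sketch only assumes the `200` / `503` behaviour described above.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: interpret the HTTP status returned by the health check page.
# 200 means every check passed; anything else (e.g. 503) is treated as unhealthy.
def healthy_status?(status_code)
  status_code.to_i == 200
end

# Hypothetical helper: fetch the page and report whether the app looks healthy.
# Network errors are rescued so an unreachable app also counts as unhealthy.
def probe(url)
  response = Net::HTTP.get_response(URI.parse(url))
  healthy_status?(response.code)
rescue StandardError
  false
end
```

Calling `probe("https://example.com/healthcheck")` (a placeholder URL) would return `true` only when the page answers with `200 OK`.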
32
47
 
33
- `allgood` is also a nice replacement for the default `/up` Rails action, so Kamal to also checks things like if the database connection is good. Just change the mounting route to `/up` instead of `/healthcheck`
48
+ **Kamal**: `allgood` can also be used as a replacement for the default `/up` Rails action, to make [Kamal](https://github.com/basecamp/kamal) check things like if the database connection is healthy when deploying your app's containers. Just change `allgood`'s mounting route to `/up` instead of `/healthcheck`, or configure Kamal to use the `allgood` route.
34
49
 
50
+ > [!TIP]
51
+ > If you're using Kamal with `allgood`, container deployment will fail if any defined checks fail, [without feedback from Kamal](https://github.com/rameerez/allgood/issues/1) on what went wrong. Your containers will just not start, and you'll get a generic error message. To avoid this, you can either keep the `allgood.rb` file very minimal (e.g., only check for active DB connection, migrations up to date, etc.) so the app deployment is likely to succeed, or you can use the default `/up` route for Kamal, and then mount `allgood` on another route for more advanced business-oriented checks. What you want to avoid is your app deployment failing because of usage-dependent or business-oriented checks, like your app not deploying because it didn't get any users in the past hour, or something like that.
35
52
 
36
- ### Configuring Health Checks
53
+ ## Configure your health checks
37
54
 
38
- Create a file `config/allgood.rb` in your Rails application. This is where you'll define your health checks:
55
+ Create a file `config/allgood.rb` in your Rails application. This is where you'll define your health checks. Here's a simple example:
39
56
  ```ruby
40
57
  # config/allgood.rb
41
58
 
42
59
  check "We have an active database connection" do
43
- make_sure ActiveRecord::Base.connection.active?
60
+ make_sure ActiveRecord::Base.connection.connect!.active?
44
61
  end
45
62
  ```
46
63
 
47
- This will run the check upon page load, and will show "Check passed" or "Check failed" next to it. You can also specify a custom human-readable success / error message for each check, so you don't go crazy when things fail and you can't figure out what the check expected output was:
64
+ `allgood` will run all checks upon page load, and will show "Check passed" or "Check failed" next to each one. That's it: add as many health checks as you want!
65
+
66
+ Here's my default `config/allgood.rb` file, which should work for most Rails applications. Feel free to use it as a starting point:
67
+
48
68
  ```ruby
49
- check "Cache is accessible and functioning" do
50
- Rails.cache.write('health_check_test', 'ok')
51
- make_sure Rails.cache.read('health_check_test') == 'ok', "The `health_check_test` key in the cache should contain `'ok'`"
69
+ # config/allgood.rb
70
+
71
+ check "We have an active database connection" do
72
+ make_sure ActiveRecord::Base.connection.connect!.active?
73
+ end
74
+
75
+ check "Database can perform a simple query" do
76
+ make_sure ActiveRecord::Base.connection.execute("SELECT 1").any?
77
+ end
78
+
79
+ check "Database migrations are up to date" do
80
+ make_sure ActiveRecord::Migration.check_all_pending! == nil
81
+ end
82
+
83
+ check "Disk space usage is below 90%" do
84
+ usage = `df -h / | tail -1 | awk '{print $5}' | sed 's/%//'`.to_i
85
+ expect(usage).to_be_less_than(90)
86
+ end
87
+
88
+ check "Memory usage is below 90%" do
89
+ usage = `free | grep Mem | awk '{print $3/$2 * 100.0}' | cut -d. -f1`.to_i
90
+ expect(usage).to_be_less_than(90)
52
91
  end
53
92
  ```
54
93
 
55
- As you can see, there's a very simple DSL (Domain-Specific Language) you can use to define health checks. It reads almost like natural English, and allows you to define powerful yet simple checks to make sure your app is healthy.
94
+ I've also added an example [`config/allgood.rb`](examples/allgood.rb) file in the `examples` folder, with very comprehensive checks for a Rails 8+ app, that you can use as a starting point.
95
+
96
+ > [!IMPORTANT]
97
+ > Make sure you restart the Rails server (`bin/rails s`) every time you modify the `config/allgood.rb` file for the changes to apply – the `allgood` config is only loaded once when the Rails server starts.
98
+
99
+ ### The `allgood` DSL
100
+
101
+ As you can see, there's a very simple DSL (Domain-Specific Language) you can use to define health checks.
102
+
103
+ It reads almost like natural English, and allows you to define powerful yet simple checks to make sure your app is healthy.
104
+
105
+ For example, you can specify a custom human-readable success / error message for each check, so you don't go crazy when things fail and you can't figure out what the check's expected output was:
106
+ ```ruby
107
+ check "Cache is accessible and functioning" do
108
+ Rails.cache.write('allgood_test', 'ok')
109
+ make_sure Rails.cache.read('allgood_test') == 'ok', "The `allgood_test` key in the cache should contain `'ok'`"
110
+ end
111
+ ```
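To make the shape of the DSL more concrete, here is a minimal standalone sketch of how a `check` / `make_sure` pair could be wired together in plain Ruby. It is not the gem's implementation: `SketchRegistry`, `SketchFailure`, and `run_all` are hypothetical names, and the default "Check passed" / "Check failed" messages simply mirror the wording used above.

```ruby
# Hypothetical error raised when a condition is not met.
class SketchFailure < StandardError; end

# Hypothetical registry that collects named checks and runs them on demand.
class SketchRegistry
  attr_reader :checks

  def initialize
    @checks = []
  end

  # Register a named check; the block is stored and only evaluated by run_all.
  def check(name, &block)
    @checks << { name: name, block: block }
  end

  # Raise with the custom message (or a default one) when the condition is falsy.
  def make_sure(condition, message = nil)
    raise SketchFailure, (message || "Check failed") unless condition

    message || "Check passed"
  end

  # Run every check and turn failures or unexpected errors into result hashes.
  def run_all
    @checks.map do |entry|
      begin
        message = instance_exec(&entry[:block])
        { name: entry[:name], success: true, message: message }
      rescue SketchFailure => e
        { name: entry[:name], success: false, message: e.message }
      rescue StandardError => e
        { name: entry[:name], success: false, message: "Error: #{e.message}" }
      end
    end
  end
end
```

In this sketch, checks would be registered with `registry.instance_eval { check("...") { make_sure something } }`, so `make_sure` resolves against the registry, and `registry.run_all` would return one result hash per check.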
56
112
 
57
113
  Other than checking for an active database connection, it's useful to check for business-oriented metrics, such as whether your app has gotten any new users in the past 24 hours (to make sure your signup flow is not broken), check whether there have been any new posts / records created recently (to make sure your users are performing the actions you'd expect them to do in your app), check for recent purchases, check for external API connections, check whether new records contain values within expected range, etc.
58
114
 
59
115
  Some business health check examples that you'd need to adapt to the specifics of your particular app:
60
116
  ```ruby
117
+ # Adapt these to your app specifics
118
+
61
119
  check "There's been new signups in the past 24 hours" do
62
120
  count = User.where(created_at: 24.hours.ago..Time.now).count
63
121
  expect(count).to_be_greater_than(0)
@@ -70,63 +128,96 @@ check "The last created Purchase has a valid total" do
70
128
  end
71
129
  ```
72
130
 
73
- Other nice checks to have:
131
+ ### Available check methods
132
+
133
+ - `make_sure(condition, message = nil)`: Ensures that the given condition is true.
134
+ - `expect(actual).to_eq(expected)`: Checks if the actual value equals the expected value.
135
+ - `expect(actual).to_be_greater_than(expected)`: Checks if the actual value is greater than the expected value.
136
+ - `expect(actual).to_be_less_than(expected)`: Checks if the actual value is less than the expected value.
137
+
138
+ Please help us develop by adding more expectation methods in the `Expectation` class!
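As a starting point for contributors, here is a hedged sketch of what an expectation object with these matchers could look like in plain Ruby. It is an illustration only: `SketchExpectation`, `SketchCheckFailed`, `sketch_expect`, and the extra `to_be_between` matcher are hypothetical and may not match the gem's actual `Expectation` class.

```ruby
# Hypothetical error class for this sketch.
class SketchCheckFailed < StandardError; end

# A simplified, standalone expectation object.
class SketchExpectation
  def initialize(actual)
    @actual = actual
  end

  def to_eq(expected)
    verify(@actual == expected, "Expected #{@actual.inspect} to equal #{expected.inspect}")
  end

  def to_be_greater_than(expected)
    verify(@actual > expected, "Expected #{@actual.inspect} to be greater than #{expected.inspect}")
  end

  def to_be_less_than(expected)
    verify(@actual < expected, "Expected #{@actual.inspect} to be less than #{expected.inspect}")
  end

  # Hypothetical extra matcher, shown only as an example of extending the class.
  def to_be_between(min, max)
    verify(@actual.between?(min, max), "Expected #{@actual.inspect} to be between #{min.inspect} and #{max.inspect}")
  end

  private

  # Return true when the condition holds, otherwise raise with the given message.
  def verify(condition, failure_message)
    raise SketchCheckFailed, failure_message unless condition

    true
  end
end

# Hypothetical entry point, mirroring the `expect(actual)` style shown above.
def sketch_expect(actual)
  SketchExpectation.new(actual)
end
```

With this sketch, `sketch_expect(usage).to_be_less_than(90)` would raise `SketchCheckFailed` with a readable message whenever `usage` is 90 or more.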
139
+
140
+ ### Run checks only in specific environments or under certain conditions
141
+
142
+ You can also make certain checks run only in specific environments or under certain conditions. Some examples:
143
+
74
144
  ```ruby
75
- check "Database can perform a simple query" do
76
- make_sure ActiveRecord::Base.connection.execute("SELECT 1").any?
145
+ # Only run in production
146
+ check "There have been new user signups in the past hour", only: :production do
147
+ make_sure User.where(created_at: 1.hour.ago..Time.now).count.positive?
77
148
  end
78
149
 
79
- check "Database migrations are up to date" do
80
- make_sure ActiveRecord::Migration.check_all_pending! == nil
150
+ # Run in both staging and production
151
+ check "External API is responsive", only: [:staging, :production] do
152
+ # ...
81
153
  end
82
154
 
83
- check "Cache is accessible and functioning" do
84
- Rails.cache.write('health_check_test', 'ok')
85
- make_sure Rails.cache.read('health_check_test') == 'ok', "The `health_check_test` key in the cache should contain `'ok'`"
155
+ # Run everywhere except development
156
+ check "A SolidCable connection is active and healthy", except: :development do
157
+ # ...
86
158
  end
87
159
 
88
- check "Disk space usage is below 90%" do
89
- usage = `df -h / | tail -1 | awk '{print $5}' | sed 's/%//'`.to_i
90
- expect(usage).to_be_less_than(90)
160
+ # Using if with a direct boolean
161
+ check "Feature flag is enabled", if: ENV['FEATURE_ENABLED'] == 'true' do
162
+ # ...
91
163
  end
92
164
 
93
- check "Memory usage is below 90%" do
94
- usage = `free | grep Mem | awk '{print $3/$2 * 100.0}' | cut -d. -f1`.to_i
95
- expect(usage).to_be_less_than(90)
165
+ # Using if with a Proc for more complex conditions
166
+ check "Complex condition", if: -> { User.count > 1000 && User.last.created_at < 10.minutes.ago } do
167
+ # ...
96
168
  end
97
- ```
98
169
 
99
- If you have other nice default checks, please open a PR! I'd love to provide a good default `config/allgood.rb` file.
170
+ # Override default timeout (in seconds) for specific checks
171
+ # By default, each check has a timeout of 10 seconds
172
+ check "Slow external API", timeout: 30 do
173
+ # ...
174
+ end
100
175
 
101
- > ⚠️ Make sure to restart the Rails server every time you modify the `config/allgood.rb` file for the config to reload and the changes to apply.
176
+ # Combine multiple conditions
177
+ check "Complex check",
178
+ only: :production,
179
+ if: -> { User.count > 1000 },
180
+ timeout: 15 do
181
+ # ...
182
+ end
183
+ ```
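The way these options combine can be sketched in plain Ruby as follows. This is a simplified illustration, not the gem's code: `skip_reason_for` is a hypothetical helper, the current environment is passed in explicitly instead of being read from `Rails.env`, and the "Condition not met" wording is an assumption.

```ruby
# Hypothetical helper: return a human-readable skip reason, or nil if the check should run.
def skip_reason_for(options, current_env)
  # `only:` accepts a single environment or a list of them.
  only = Array(options[:only])
  return "Only runs in #{only.join(', ')}" if options[:only] && !only.include?(current_env)

  # `except:` works the other way around.
  except = Array(options[:except])
  return "This check doesn't run in #{except.join(', ')}" if except.include?(current_env)

  # `if:` may be a plain value or a Proc that is evaluated lazily.
  if options.key?(:if)
    condition = options[:if]
    met = condition.is_a?(Proc) ? condition.call : condition
    return "Condition not met" unless met # wording is an assumption
  end

  nil
end
```

For example, `skip_reason_for({ only: :production }, :development)` would describe why a production-only check is skipped in development, while an empty options hash never produces a skip reason.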
102
184
 
185
+ ### Rate Limiting Expensive Checks
103
186
 
104
- ### Available Check Methods
187
+ For expensive operations (like testing paid APIs), you can limit how often checks run:
105
188
 
106
- - `make_sure(condition, message = nil)`: Ensures that the given condition is true.
107
- - `expect(actual).to_eq(expected)`: Checks if the actual value equals the expected value.
108
- - `expect(actual).to_be_greater_than(expected)`: Checks if the actual value is greater than the expected value.
109
- - `expect(actual).to_be_less_than(expected)`: Checks if the actual value is less than the expected value.
189
+ ```ruby
190
+ # Run expensive checks a limited number of times
191
+ check "OpenAI is responding with a valid LLM message", run: "2 times per day" do
192
+ # expensive API call
193
+ end
110
194
 
111
- Please help us develop by adding more expectation methods in the `Expectation` class!
195
+ check "Analytics can be processed", run: "4 times per hour" do
196
+ # expensive operation
197
+ end
198
+ ```
112
199
 
113
- ## Customization
200
+ Important notes:
201
+ - Rate limits reset at the start of each period (hour/day)
202
+ - The error state persists between rate-limited runs
203
+ - Rate-limited checks show clear feedback about remaining runs and next reset time
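To illustrate the idea of limits that reset at the start of each period, here is a minimal in-memory sketch in plain Ruby. It is not how the gem stores its counters: `parse_frequency` and `SketchRateLimiter` are hypothetical names, and the accepted string format is assumed from the `"N times per day/hour"` examples above.

```ruby
# Hypothetical parser for strings like "2 times per day" or "4 times per hour".
def parse_frequency(text)
  match = text.to_s.strip.match(/\A(\d+)\s+times?\s+per\s+(day|hour)\z/i)
  raise ArgumentError, "Unsupported frequency: #{text.inspect}" unless match

  { max_runs: match[1].to_i, period: match[2].downcase.to_sym }
end

# Hypothetical in-memory limiter: runs are counted per calendar hour or day,
# so the counter effectively resets when a new period begins.
class SketchRateLimiter
  def initialize(rate)
    @rate = rate
    @counts = Hash.new(0)
  end

  # Returns true (and records the run) if another run is allowed at `now`.
  def allow?(now = Time.now)
    key = period_key(now)
    return false if @counts[key] >= @rate[:max_runs]

    @counts[key] += 1
    true
  end

  private

  # Build a key that changes whenever the hour or day changes.
  def period_key(now)
    pattern = @rate[:period] == :day ? "%Y-%m-%d" : "%Y-%m-%d-%H"
    now.strftime(pattern)
  end
end
```

A limiter built with `SketchRateLimiter.new(parse_frequency("2 times per day"))` would allow two runs on a given date and refuse further runs until the date changes.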
114
204
 
115
- ### Timeout
205
+ When a check is skipped due to its conditions not being met, it will appear in the healthcheck page with a skip emoji (⏭️) and a clear explanation of why it was skipped.
116
206
 
117
- By default, each check has a timeout of 10 seconds.
207
+ ![Example dashboard of the Allgood health check page with skipped checks](allgood_skipped.webp)
118
208
 
209
+ _Note: the `allgood` health check dashboard has an automatic dark mode, based on the system's appearance settings._
119
210
 
120
211
  ## Development
121
212
 
122
- After checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
213
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
123
214
 
124
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
215
+ To install this gem onto your local machine, run `bundle exec rake install`.
125
216
 
126
217
  ## Contributing
127
218
 
128
- Bug reports and pull requests are welcome on GitHub at https://github.com/rameerez/allgood Our code of conduct is: just be nice and make your mom proud of what you do and post online.
219
+ Bug reports and pull requests are welcome on GitHub at https://github.com/rameerez/allgood. Our code of conduct is: just be nice and make your mom proud of what you do and post online.
129
220
 
130
221
  ## License
131
222
 
132
- The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
223
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/allgood.jpeg CHANGED
Binary file
@@ -29,25 +29,77 @@ module Allgood
29
29
 
30
30
  def run_checks
31
31
  Allgood.configuration.checks.map do |check|
32
- run_single_check(check)
32
+ if check[:status] == :skipped
33
+ {
34
+ name: check[:name],
35
+ success: true,
36
+ skipped: true,
37
+ message: check[:skip_reason],
38
+ duration: 0
39
+ }
40
+ else
41
+ run_single_check(check)
42
+ end
33
43
  end
34
44
  end
35
45
 
36
46
  def run_single_check(check)
47
+ last_result_key = "allgood:last_result:#{check[:name].parameterize}"
48
+ last_result = Allgood::CacheStore.instance.read(last_result_key)
49
+
50
+ unless Allgood.configuration.should_run_check?(check)
51
+ message = check[:skip_reason]
52
+ if last_result
53
+ status_info = "Last check #{last_result[:success] ? 'passed' : 'failed'} #{time_ago_in_words(last_result[:time])} ago: #{last_result[:message]}"
54
+ message = "#{message}. #{status_info}"
55
+ end
56
+
57
+ return {
58
+ name: check[:name],
59
+ success: last_result ? last_result[:success] : true,
60
+ skipped: true,
61
+ message: message,
62
+ duration: 0
63
+ }
64
+ end
65
+
37
66
  start_time = Time.now
38
67
  result = { success: false, message: "Check timed out after #{check[:timeout]} seconds" }
68
+ error_key = "allgood:error:#{check[:name].parameterize}"
39
69
 
40
70
  begin
41
71
  Timeout.timeout(check[:timeout]) do
42
72
  check_result = Allgood.configuration.run_check(&check[:block])
43
73
  result = { success: check_result[:success], message: check_result[:message] }
74
+
75
+ if result[:success]
76
+ # Clear error state and store successful result
77
+ Allgood::CacheStore.instance.write(error_key, nil)
78
+ Allgood::CacheStore.instance.write(last_result_key, {
79
+ success: true,
80
+ message: result[:message],
81
+ time: Time.current
82
+ })
83
+ end
44
84
  end
45
- rescue Timeout::Error
46
- # The result is already set to a timeout message
47
- rescue Allgood::CheckFailedError => e
48
- result = { success: false, message: e.message }
49
- rescue StandardError => e
50
- result = { success: false, message: "Error: #{e.message}" }
85
+ rescue Timeout::Error, Allgood::CheckFailedError, StandardError => e
86
+ error_message = case e
87
+ when Timeout::Error
88
+ "Check timed out after #{check[:timeout]} seconds"
89
+ when Allgood::CheckFailedError
90
+ e.message
91
+ else
92
+ "Error: #{e.message}"
93
+ end
94
+
95
+ # Store error state and failed result
96
+ Allgood::CacheStore.instance.write(error_key, error_message)
97
+ Allgood::CacheStore.instance.write(last_result_key, {
98
+ success: false,
99
+ message: error_message,
100
+ time: Time.current
101
+ })
102
+ result = { success: false, message: error_message }
51
103
  end
52
104
 
53
105
  {
@@ -14,6 +14,10 @@
14
14
  .check {
15
15
  margin: 0.5em 0;
16
16
  }
17
+
18
+ .skipped {
19
+ opacity: 0.6;
20
+ }
17
21
  </style>
18
22
 
19
23
  <header>
@@ -22,11 +26,18 @@
22
26
 
23
27
  <% if @results.any? %>
24
28
  <% @results.each do |result| %>
25
- <div class="check">
26
- <%= result[:success] ? "✅" : "❌" %>
27
- <b><%= result[:name] %></b>: <i><%= result[:message] %></i> <code>[<%= result[:duration] %>ms]</code>
29
+ <div class="check <%= 'skipped' if result[:skipped] %>">
30
+ <% if result[:skipped] %>
31
+ ⏭️
32
+ <% else %>
33
+ <%= result[:success] ? "✅" : "❌" %>
34
+ <% end %>
35
+ <b><%= result[:name] %></b>: <i><%= result[:message] %></i>
36
+ <% unless result[:skipped] %>
37
+ <code>[<%= result[:duration] %>ms]</code>
38
+ <% end %>
28
39
  </div>
29
40
  <% end %>
30
41
  <% else %>
31
42
  <p>No health checks were run. Please check your configuration.</p>
32
- <% end %>
43
+ <% end %>
@@ -0,0 +1,216 @@
1
+ require 'open-uri'
2
+ TEST_IMAGE = URI.open("https://picsum.photos/id/237/536/354").read
3
+
4
+ # --- ACTIVE RECORD ---
5
+
6
+ check "We have an active database connection" do
7
+ make_sure ActiveRecord::Base.connection.connect!.active?
8
+ end
9
+
10
+ check "The database can perform a simple query" do
11
+ make_sure ActiveRecord::Base.connection.execute("SELECT 1 LIMIT 1").any?
12
+ end
13
+
14
+ check "The database can perform writes" do
15
+ table_name = "allgood_health_check_#{Time.now.to_i}"
16
+ random_id = rand(1..999999)
17
+
18
+ result = ActiveRecord::Base.connection.execute(<<~SQL)
19
+ DROP TABLE IF EXISTS #{table_name};
20
+ CREATE TEMPORARY TABLE #{table_name} (id integer);
21
+ INSERT INTO #{table_name} (id) VALUES (#{random_id});
22
+ SELECT id FROM #{table_name} LIMIT 1;
23
+ SQL
24
+
25
+ ActiveRecord::Base.connection.execute("DROP TABLE #{table_name}")
26
+
27
+ make_sure result.present? && result.first["id"] == random_id, "Able to write to temporary table"
28
+ end
29
+
30
+ check "The database connection pool is healthy" do
31
+ pool = ActiveRecord::Base.connection_pool
32
+
33
+ used_connections = pool.connections.count
34
+ max_connections = pool.size
35
+ usage_percentage = (used_connections.to_f / max_connections * 100).round
36
+
37
+ make_sure usage_percentage < 90, "Pool usage at #{usage_percentage}% (#{used_connections}/#{max_connections})"
38
+ end
39
+
40
+ check "Database migrations are up to date" do
41
+ make_sure ActiveRecord::Migration.check_all_pending! == nil
42
+ end
43
+
44
+ # --- IMAGE PROCESSING ---
45
+
46
+ check "Vips (libvips) is installed on Linux", except: :development do
47
+ output = `ldconfig -p | grep libvips`
48
+ make_sure output.present? && output.include?("libvips.so") && output.include?("libvips-cpp.so"), "libvips is found in the Linux system's library cache"
49
+ end
50
+
51
+ check "Vips is available to Rails" do
52
+ raise "ImageProcessing::Vips is not available" unless ImageProcessing::Vips.present? # Need this line to load `Vips`
53
+
54
+ make_sure Vips::VERSION.present?, "Vips available with version #{Vips::VERSION}"
55
+ end
56
+
57
+ check "Vips can perform operations on images" do
58
+ raise "ImageProcessing::Vips is not available" unless ImageProcessing::Vips.present? # Need this line to load `Vips`
59
+
60
+ image = Vips::Image.new_from_buffer(TEST_IMAGE, "")
61
+ processed_image = image
62
+ .gaussblur(10) # Apply Gaussian blur with sigma 10
63
+ .linear([1.2], [0]) # Increase brightness
64
+ .invert # Invert colors for a wild effect
65
+ .sharpen # Apply sharpening
66
+ .resize(0.5)
67
+
68
+ make_sure processed_image.present? && processed_image.width == 268 && processed_image.height == 177, "If we input an image of 536x354px, and we apply filters and a 0.5 resize, we should get an image of 268x177px"
69
+ end
70
+
71
+ check "ImageProcessing::Vips is available to Rails" do
72
+ make_sure ImageProcessing::Vips.present?
73
+ end
74
+
75
+ check "ImageProcessing can perform operations on images" do
76
+ image_processing_image = ImageProcessing::Vips
77
+ .source(Vips::Image.new_from_buffer(TEST_IMAGE, ""))
78
+ .resize_to_limit(123, 123) # Resize to fit within 123x123
79
+ .convert("webp") # Convert to webp format
80
+ .call
81
+ processed_image = Vips::Image.new_from_file(image_processing_image.path)
82
+
83
+ make_sure processed_image.present? && processed_image.width == 123 && processed_image.get("vips-loader") == "webpload", "ImageProcessing can resize and convert to webp"
84
+ end
85
+
86
+ # --- ACTIVE STORAGE ---
87
+
88
+ check "Active Storage is available to Rails" do
89
+ make_sure ActiveStorage.present?
90
+ end
91
+
92
+ check "Active Storage tables are present in the database" do
93
+ make_sure ActiveRecord::Base.connection.table_exists?("active_storage_attachments") && ActiveRecord::Base.connection.table_exists?("active_storage_blobs")
94
+ end
95
+
96
+ check "Active Storage has a valid client configured" do
97
+ service = ActiveStorage::Blob.service
98
+ service_name = service&.class&.name&.split("::")&.last&.split("Service")&.first
99
+
100
+ if !service_name.downcase.include?("disk")
101
+ make_sure service.present? && service.respond_to?(:client) && service.client.present?, "Active Storage service has a valid #{service_name} client configured"
102
+ else
103
+ make_sure !Rails.env.production? && service.present?, "Active Storage using #{service_name} service in #{Rails.env.to_s}"
104
+ end
105
+ end
106
+
107
+ check "ActiveStorage can store images, retrieve them, and purge them" do
108
+ blob = ActiveStorage::Blob.create_and_upload!(io: StringIO.new(TEST_IMAGE), filename: "allgood-test-image-#{Time.now.to_i}.jpg", content_type: "image/jpeg")
109
+ blob_key = blob.key
110
+ make_sure blob.persisted? && blob.service.exist?(blob_key)
111
+ blob.purge
112
+ make_sure !blob.service.exist?(blob_key), "Image needs to be successfully stored, retrieved, and purged from #{ActiveStorage::Blob.service.name} (#{ActiveStorage::Blob.service.class.name})"
113
+ end
114
+
115
+ # --- CACHE ---
116
+
117
+ check "Cache is accessible and functioning" do
118
+ cache_value = "allgood_#{Time.now.to_i}"
119
+ Rails.cache.write("allgood_health_check_test", cache_value, expires_in: 1.minute)
120
+ make_sure Rails.cache.read("allgood_health_check_test") == cache_value, "The `allgood_health_check_test` key in the cache should return the string `#{cache_value}`"
121
+ end
122
+
123
+ # --- SOLID QUEUE ---
124
+
125
+ check "SolidQueue is available to Rails" do
126
+ make_sure SolidQueue.present?
127
+ end
128
+
129
+ check "We have an active SolidQueue connection to the database" do
130
+ make_sure SolidQueue::Job.connection.connect!.active?
131
+ end
132
+
133
+ check "SolidQueue tables are present in the database" do
134
+ make_sure SolidQueue::Job.connection.table_exists?("solid_queue_jobs") && SolidQueue::Job.connection.table_exists?("solid_queue_failed_executions") && SolidQueue::Job.connection.table_exists?("solid_queue_semaphores")
135
+ end
136
+
137
+ check "The percentage of failed jobs in the last 24 hours is less than 1%", only: :production do
138
+ failed_jobs = SolidQueue::FailedExecution.where(created_at: 24.hours.ago..Time.now).count
139
+ all_jobs = SolidQueue::Job.where(created_at: 24.hours.ago..Time.now).count
140
+
141
+ if all_jobs > 10
142
+ percentage = failed_jobs.to_f / all_jobs * 100 # all_jobs > 10 here, so no division by zero
143
+ make_sure percentage < 1, "#{percentage.round(2)}% of jobs are failing"
144
+ else
145
+ make_sure true, "Not enough jobs to calculate meaningful failure rate (only #{all_jobs} jobs in last 24h)"
146
+ end
147
+ end
148
+
149
+ # --- ACTION CABLE ---
150
+
151
+ check "ActionCable is configured and running" do
152
+ make_sure ActionCable.server.present?, "ActionCable server should be running"
153
+ end
154
+
155
+ check "ActionCable is configured to accept connections with a valid adapter" do
156
+ make_sure ActionCable.server.config.allow_same_origin_as_host, "ActionCable server should be configured to accept connections"
157
+
158
+ adapter = ActionCable.server.config.cable["adapter"]
159
+
160
+ if Rails.env.production?
161
+ make_sure adapter.in?(["solid_cable", "redis"]), "ActionCable running #{adapter} adapter in #{Rails.env.to_s}"
162
+ else
163
+ make_sure adapter.in?(["solid_cable", "async"]), "ActionCable running #{adapter} adapter in #{Rails.env.to_s}"
164
+ end
165
+ end
166
+
167
+ check "ActionCable can broadcast messages and store them in SolidCable" do
168
+ test_message = "allgood_test_#{Time.now.to_i}"
169
+
170
+ begin
171
+ ActionCable.server.broadcast("allgood_test_channel", { message: test_message })
172
+
173
+ # Verify message was stored in SolidCable
174
+ message = SolidCable::Message.where(channel: "allgood_test_channel")
175
+ .order(created_at: :desc)
176
+ .first
177
+
178
+ make_sure message.present?, "Message should be stored in SolidCable"
179
+ make_sure message.payload.include?(test_message) && message.destroy, "Message payload should contain our test message"
180
+ rescue => e
181
+ make_sure false, "Failed to broadcast/verify message: #{e.message}"
182
+ end
183
+ end
184
+
185
+ # --- SYSTEM ---
186
+
187
+ check "Disk space usage is below 90%", only: :production do
188
+ usage = `df -h / | tail -1 | awk '{print $5}' | sed 's/%//'`.to_i
189
+ expect(usage).to_be_less_than(90)
190
+ end
191
+
192
+ check "Memory usage is below 90%", only: :production do
193
+ usage = `free | grep Mem | awk '{print $3/$2 * 100.0}' | cut -d. -f1`.to_i
194
+ expect(usage).to_be_less_than(90)
195
+ end
196
+
197
+ # --- SITEMAP ---
198
+
199
+ check "The sitemap generator is available" do
200
+ make_sure SitemapGenerator.present?
201
+ end
202
+
203
+ check "sitemap.xml.gz exists", only: :production do
204
+ make_sure File.exist?(Rails.public_path.join("sitemap.xml.gz"))
205
+ end
206
+
207
+
208
+ # --- USAGE-DEPENDENT CHECKS ---
209
+
210
+ check "SolidQueue has processed jobs in the last 24 hours", only: :production do
211
+ make_sure SolidQueue::Job.where(created_at: 24.hours.ago..Time.now).order(created_at: :desc).limit(1).any?
212
+ end
213
+
214
+ # --- PAY / STRIPE ---
215
+
216
+ # TODO: no error webhooks in the past 24 hours, new sales in the past few hours, etc.
@@ -0,0 +1,52 @@
+ # frozen_string_literal: true
+
+ module Allgood
+   class CacheStore
+     def self.instance
+       @instance ||= new
+     end
+
+     def initialize
+       @memory_store = {}
+     end
+
+     def read(key)
+       if rails_cache_available?
+         Rails.cache.read(key)
+       else
+         @memory_store[key]
+       end
+     end
+
+     def write(key, value)
+       if rails_cache_available?
+         expiry = key.include?('day') ? 1.day : 1.hour
+         Rails.cache.write(key, value, expires_in: expiry)
+       else
+         @memory_store[key] = value
+       end
+     end
+
+     def cleanup_old_keys
+       return unless rails_cache_available?
+
+       keys_pattern = "allgood:*"
+       if Rails.cache.respond_to?(:delete_matched)
+         Rails.cache.delete_matched("#{keys_pattern}:*:#{(Time.current - 2.days).strftime('%Y-%m-%d')}*")
+       end
+     rescue StandardError => e
+       Rails.logger.warn "Allgood: Failed to cleanup old cache keys: #{e.message}"
+     end
+
+     private
+
+     def rails_cache_available?
+       Rails.cache && Rails.cache.respond_to?(:read) && Rails.cache.respond_to?(:write) &&
+         Rails.cache.write("allgood_rails_cache_test_ok", "true") &&
+         Rails.cache.read("allgood_rails_cache_test_ok") == "true"
+     rescue StandardError => e
+       Rails.logger.warn "Allgood: Rails.cache not available (#{e.message}), falling back to memory store"
+       false
+     end
+   end
+ end
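In the file above, only the `Rails.cache` branch passes `expires_in`; the in-memory fallback keeps values indefinitely. As a rough illustration of how the same expiry heuristic could apply to a plain Hash, here is a sketch that runs without Rails. The class name `ExpiringMemoryStore` and the injectable `clock:` argument are assumptions made for this example and do not exist in the gem.

```ruby
# Hypothetical in-memory store with per-key expiry; not part of allgood.
class ExpiringMemoryStore
  Entry = Struct.new(:value, :expires_at)

  DAY_IN_SECONDS = 86_400
  HOUR_IN_SECONDS = 3_600

  # The clock is injectable so expiry can be exercised without sleeping.
  def initialize(clock: -> { Time.now })
    @entries = {}
    @clock = clock
  end

  def write(key, value)
    # Same heuristic as the diff: keys containing "day" live for a day,
    # every other key lives for an hour.
    ttl = key.include?("day") ? DAY_IN_SECONDS : HOUR_IN_SECONDS
    @entries[key] = Entry.new(value, @clock.call + ttl)
    true
  end

  def read(key)
    entry = @entries[key]
    return nil if entry.nil?

    if @clock.call >= entry.expires_at
      @entries.delete(key) # drop expired entries lazily on read
      return nil
    end

    entry.value
  end
end

now = Time.at(0)
store = ExpiringMemoryStore.new(clock: -> { now })
store.write("allgood:runs_count:example", 1)
now += 3_599
puts store.read("allgood:runs_count:example").inspect
now += 1
puts store.read("allgood:runs_count:example").inspect
```

The key in the usage lines does not contain the substring "day", so it falls into the one-hour bucket.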
@@ -8,13 +8,167 @@ module Allgood
        @default_timeout = 10 # Default timeout of 10 seconds
      end

-     def check(name, &block)
-       @checks << { name: name, block: block, timeout: @default_timeout }
+     def check(name, **options, &block)
+       check_info = {
+         name: name,
+         block: block,
+         timeout: options[:timeout] || @default_timeout,
+         options: options,
+         status: :pending
+       }
+
+       # Handle rate limiting
+       if options[:run]
+         begin
+           check_info[:rate] = parse_run_frequency(options[:run])
+         rescue ArgumentError => e
+           check_info[:status] = :skipped
+           check_info[:skip_reason] = "Invalid run frequency: #{e.message}"
+           @checks << check_info
+           return
+         end
+       end
+
+       # Handle environment-specific options
+       if options[:only]
+         environments = Array(options[:only])
+         unless environments.include?(Rails.env.to_sym)
+           check_info[:status] = :skipped
+           check_info[:skip_reason] = "Only runs in #{environments.join(', ')}"
+           @checks << check_info
+           return
+         end
+       end
+
+       if options[:except]
+         environments = Array(options[:except])
+         if environments.include?(Rails.env.to_sym)
+           check_info[:status] = :skipped
+           check_info[:skip_reason] = "This check doesn't run in #{environments.join(', ')}"
+           @checks << check_info
+           return
+         end
+       end
+
+       # Handle conditional checks
+       if options[:if]
+         condition = options[:if]
+         unless condition.is_a?(Proc) ? condition.call : condition
+           check_info[:status] = :skipped
+           check_info[:skip_reason] = "Check condition not met"
+           @checks << check_info
+           return
+         end
+       end
+
+       if options[:unless]
+         condition = options[:unless]
+         if condition.is_a?(Proc) ? condition.call : condition
+           check_info[:status] = :skipped
+           check_info[:skip_reason] = "Check `unless` condition met"
+           @checks << check_info
+           return
+         end
+       end
+
+       check_info[:status] = :active
+       @checks << check_info
      end

      def run_check(&block)
        CheckRunner.new.instance_eval(&block)
      end
+
+     def should_run_check?(check)
+       return true unless check[:rate]
+
+       cache_key = "allgood:last_run:#{check[:name].parameterize}"
+       runs_key = "allgood:runs_count:#{check[:name].parameterize}:#{current_period(check[:rate])}"
+       error_key = "allgood:error:#{check[:name].parameterize}"
+       last_result_key = "allgood:last_result:#{check[:name].parameterize}"
+
+       last_run = Allgood::CacheStore.instance.read(cache_key)
+       period_runs = Allgood::CacheStore.instance.read(runs_key).to_i
+       last_result = Allgood::CacheStore.instance.read(last_result_key)
+
+       current_period_key = current_period(check[:rate])
+       stored_period = Allgood::CacheStore.instance.read("allgood:current_period:#{check[:name].parameterize}")
+
+       # If we're in a new period, reset the counter
+       if stored_period != current_period_key
+         period_runs = 0
+         Allgood::CacheStore.instance.write("allgood:current_period:#{check[:name].parameterize}", current_period_key)
+         Allgood::CacheStore.instance.write(runs_key, 0)
+       end
+
+       # If there's an error, wait until next period
+       if previous_error = Allgood::CacheStore.instance.read(error_key)
+         next_period = next_period_start(check[:rate])
+         rate_info = "Rate limited (#{period_runs}/#{check[:rate][:max_runs]} runs this #{check[:rate][:period]})"
+         check[:skip_reason] = "#{rate_info}. Waiting until #{next_period.strftime('%H:%M:%S %Z')} to retry failed check"
+         return false
+       end
+
+       # If we haven't exceeded the max runs for this period
+       if period_runs < check[:rate][:max_runs]
+         Allgood::CacheStore.instance.write(cache_key, Time.current)
+         Allgood::CacheStore.instance.write(runs_key, period_runs + 1)
+         true
+       else
+         next_period = next_period_start(check[:rate])
+         rate_info = "Rate limited (#{period_runs}/#{check[:rate][:max_runs]} runs this #{check[:rate][:period]})"
+         next_run = "Next check at #{next_period.strftime('%H:%M:%S %Z')}"
+         check[:skip_reason] = "#{rate_info}. #{next_run}"
+         false
+       end
+     end
126
+ private
127
+
128
+ def parse_run_frequency(frequency)
129
+ case frequency.to_s.downcase
130
+ when /(\d+)\s+times?\s+per\s+(day|hour)/i
131
+ max_runs, period = $1.to_i, $2
132
+ if max_runs <= 0
133
+ raise ArgumentError, "Number of runs must be positive"
134
+ end
135
+ if max_runs > 1000
136
+ raise ArgumentError, "Maximum 1000 runs per period allowed"
137
+ end
138
+ { max_runs: max_runs, period: period }
139
+ else
140
+ raise ArgumentError, "Unsupported frequency format. Use 'N times per day' or 'N times per hour'"
141
+ end
142
+ end
143
+
144
+ def current_period(rate)
145
+ case rate[:period]
146
+ when 'day'
147
+ Time.current.strftime('%Y-%m-%d')
148
+ when 'hour'
149
+ Time.current.strftime('%Y-%m-%d-%H')
150
+ end
151
+ end
152
+
153
+ def new_period?(last_run, rate)
154
+ case rate[:period]
155
+ when 'day'
156
+ !last_run.to_date.equal?(Time.current.to_date)
157
+ when 'hour'
158
+ last_run.strftime('%Y-%m-%d-%H') != Time.current.strftime('%Y-%m-%d-%H')
159
+ end
160
+ end
161
+
162
+ def next_period_start(rate)
163
+ case rate[:period]
164
+ when 'day'
165
+ Time.current.beginning_of_day + 1.day
166
+ when 'hour'
167
+ Time.current.beginning_of_hour + 1.hour
168
+ else
169
+ raise ArgumentError, "Unsupported period: #{rate[:period]}"
170
+ end
171
+ end
18
172
  end
19
173
 
20
174
  class CheckRunner
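The rate-limiting helpers in the hunk above can be tried outside Rails. The following is a standalone sketch under stated assumptions: `parse_run_frequency` reuses the regular expression and error messages from the diff, while `period_key` and `allow_run?` are hypothetical names that replace the cache with a plain Hash and take the current time as an argument.

```ruby
require "date"

# Standalone version of the "N times per day/hour" parsing shown in the diff.
def parse_run_frequency(frequency)
  match = frequency.to_s.downcase.match(/(\d+)\s+times?\s+per\s+(day|hour)/)
  raise ArgumentError, "Unsupported frequency format. Use 'N times per day' or 'N times per hour'" unless match

  max_runs = match[1].to_i
  raise ArgumentError, "Number of runs must be positive" if max_runs <= 0
  raise ArgumentError, "Maximum 1000 runs per period allowed" if max_runs > 1000

  { max_runs: max_runs, period: match[2] }
end

# Period bucket as a string, using the same strftime formats as the diff.
def period_key(period, now)
  period == "day" ? now.strftime("%Y-%m-%d") : now.strftime("%Y-%m-%d-%H")
end

# Hypothetical counter backed by a plain Hash instead of a cache.
def allow_run?(counts, name, rate, now)
  key = "#{name}:#{period_key(rate[:period], now)}"
  return false if counts.fetch(key, 0) >= rate[:max_runs]

  counts[key] = counts.fetch(key, 0) + 1
  true
end

rate = parse_run_frequency("2 times per hour")
counts = {}
at = Time.utc(2024, 10, 27, 13, 5)
puts 3.times.map { allow_run?(counts, "demo", rate, at) }.inspect
puts allow_run?(counts, "demo", rate, at + 3_600)

# Date comparison: == compares values, equal? compares object identity.
a = Date.new(2024, 10, 27)
b = Date.new(2024, 10, 27)
puts a == b
puts a.equal?(b)
```

Because the period is part of the counter key, a new hour or day starts from zero without an explicit reset. The last two lines are worth keeping in mind when reading `new_period?` in the hunk: `equal?` tests object identity, so two separately built `Date` objects for the same day are not `equal?`, and `==` is the usual value comparison.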
@@ -2,7 +2,7 @@ module Allgood
    class Engine < ::Rails::Engine
      isolate_namespace Allgood

-     initializer "allgood.load_configuration" do
+     config.after_initialize do
        config_file = Rails.root.join("config", "allgood.rb")
        if File.exist?(config_file)
          Allgood.configure do |config|
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Allgood
-   VERSION = "0.1.0"
+   VERSION = "0.3.0"
  end
data/lib/allgood.rb CHANGED
@@ -3,6 +3,7 @@
  require_relative "allgood/version"
  require_relative "allgood/engine"
  require_relative "allgood/configuration"
+ require_relative "allgood/cache_store"

  module Allgood
    class Error < StandardError; end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: allgood
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.3.0
  platform: ruby
  authors:
  - rameerez
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-08-23 00:00:00.000000000 Z
+ date: 2024-11-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails
@@ -24,12 +24,13 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: 6.0.0
- description: 'Define custom, business-oriented health checks for your app (as in:
-   are there any new users in the past 24 hours) and see the results in a simple /healthcheck
-   page that you can use to monitor your app with UptimeRobot, Pingdom, or other monitoring
-   services.'
+ description: 'Define custom health checks for your app (as in: are there any new users
+   in the past 24 hours) and see the results in a simple /healthcheck page that you
+   can use to monitor your production app with UptimeRobot, Pingdom, or other monitoring
+   services. It''s also useful as a drop-in replacement for the default `/up` health
+   check endpoint for Kamal deployments.'
  email:
- - allgood@rameerez.com
+ - rubygems@rameerez.com
  executables: []
  extensions: []
  extra_rdoc_files: []
@@ -39,12 +40,15 @@ files:
  - README.md
  - Rakefile
  - allgood.jpeg
+ - allgood_skipped.webp
  - app/controllers/allgood/base_controller.rb
  - app/controllers/allgood/healthcheck_controller.rb
  - app/views/allgood/healthcheck/index.html.erb
  - app/views/layouts/allgood/application.html.erb
  - config/routes.rb
+ - examples/allgood.rb
  - lib/allgood.rb
+ - lib/allgood/cache_store.rb
  - lib/allgood/configuration.rb
  - lib/allgood/engine.rb
  - lib/allgood/version.rb
@@ -56,7 +60,7 @@ metadata:
    allowed_push_host: https://rubygems.org
    homepage_uri: https://github.com/rameerez/allgood
    source_code_uri: https://github.com/rameerez/allgood
-   changelog_uri: https://github.com/rameerez/allgood
+   changelog_uri: https://github.com/rameerez/allgood/blob/main/CHANGELOG.md
  post_install_message:
  rdoc_options: []
  require_paths:
@@ -72,7 +76,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubygems_version: 3.5.17
+ rubygems_version: 3.5.22
  signing_key:
  specification_version: 4
  summary: Add quick, simple, and beautiful health checks to your Rails application.