kennel 1.74.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 769ede5638522cc0c56394f0fde86207b88aad4ade1ff05e0360018972e52a55
4
+ data.tar.gz: 467dfab6d84ec29c1efbba1f7ab1df2fad3fd5767521a677c88b38c8bb8f2e19
5
+ SHA512:
6
+ metadata.gz: 64f8dbae6ba9ea0d51be0c20a955a261ca026c760e77cf05e945b0a2514e6ef99527f8a00a9dc39abc053235fd76474435fb13d610e6e1ed1ce4e92d99e81ea9
7
+ data.tar.gz: 695f7f97b7355291d66b72423ac3c1adcadf804ced559aa35222c431c92198beca00fa625ebbad70ab105173ec8ca0b11eb4ec13a3dd19c04945fa9709821552
@@ -0,0 +1,244 @@
1
+ # Kennel
2
+
3
+ ![](template/github/cage.jpg?raw=true)
4
+
5
+ Manage datadog monitors/dashboards/slos as code
6
+
7
+ - Documented, reusable, and searchable
8
+ - Changes are PR reviewed and auditable
9
+ - Updating shows diff before applying
10
+ - Automated import of existing monitors/dashboards/slos
11
+
12
+ ![](template/github/screen.png?raw=true)
13
+ <!-- NOT IN template/Readme.md -->
14
+ ## Install
15
+
16
+ - create a new private `kennel` repo for your organization (do not fork this repo)
17
+ - use the template folder as starting point:
18
+ ```Bash
19
+ git clone git@github.com:your-org/kennel.git
20
+ git clone git@github.com:grosser/kennel.git seed
21
+ mv seed/template/* kennel/
22
+ cd kennel && git add . && git commit -m 'initial'
23
+ ```
24
+ - add basic projects and teams so others can copy-paste to get started
25
+ - set up a travis build for your repo
26
+ - uncomment `.travis.yml` section for automated PR planning and datadog updates on merge
27
+ - follow `Setup` in your repo's Readme.md
28
+ <!-- NOT IN -->
29
+
30
+ ## Structure
31
+
32
+ - `projects/` monitors/dashboards/etc scoped by project
33
+ - `teams/` team definitions
34
+ - `parts/` monitors/dashboards/etc that are used by multiple projects
35
+ - `generated/` projects as json, to show current state and proposed changes in PRs
36
+
37
+ ## Workflows
38
+
39
+ <!-- ONLY IN template/Readme.md
40
+ ### Setup
41
+ - clone the repo
42
+ - `gem install bundler && bundle install`
43
+ - `cp .env.example .env`
44
+ - open [Datadog API Settings](https://app.datadoghq.com/account/settings#api)
45
+ - copy any `API Key` and add it to `.env` as `DATADOG_API_KEY`
46
+ - find or create (check last page) your personal "Application Key" and add it to `.env` as `DATADOG_APP_KEY=`
47
+ - change the `DATADOG_SUBDOMAIN=app` in `.env` to your company's subdomain if you have one
48
+ - verify it works by running `rake plan`, it might show some diff, but should not crash
49
+ -->
50
+
51
+ ### Adding a team
52
+
53
+ - `mention` is used for all team monitors via `super()`
54
+ - `renotify_interval` is used for all team monitors (defaults to `0` / off)
55
+ - `tags` is used for all team monitors/dashboards (defaults to `team:<team-name>`)
56
+
57
+ ```Ruby
58
+ # teams/my_team.rb
59
+ module Teams
60
+ class MyTeam < Kennel::Models::Team
61
+ defaults(
62
+ mention: -> { "@slack-my-team" }
63
+ )
64
+ end
65
+ end
66
+ ```
67
+
68
+ ### Adding a new monitor
69
+ - use [datadog monitor UI](https://app.datadoghq.com/monitors#create) to create a monitor
70
+ - see below
71
+
72
+ ### Updating an existing monitor
73
+ - use [datadog monitor UI](https://app.datadoghq.com/monitors/manage) to find a monitor
74
+ - get the `id` from the url
75
+ - run `URL='https://app.datadoghq.com/monitors/123' bundle exec rake kennel:import` and copy the output
76
+ - find or create a project in `projects/`
77
+ - add the monitor to `parts: [` list, for example:
78
+ ```Ruby
79
+ # projects/my_project.rb
80
+ class MyProject < Kennel::Models::Project
81
+ defaults(
82
+ team: -> { Teams::MyTeam.new }, # use existing team or create new one in teams/
83
+ parts: -> {
84
+ [
85
+ Kennel::Models::Monitor.new(
86
+ self,
87
+ id: -> { 123456 }, # id from datadog url, not necessary when creating a new monitor
88
+ type: -> { "query alert" },
89
+ kennel_id: -> { "load-too-high" }, # make up a unique name
90
+ name: -> { "Foobar Load too high" }, # nice descriptive name that will show up in alerts and emails
91
+ message: -> {
92
+ # Explain what behavior to expect and how to fix the cause
93
+ # Use #{super()} to add team notifications.
94
+ <<~TEXT
95
+ Foobar will be slow and that could cause Barfoo to go down.
96
+ Add capacity or debug why it is suddenly slow.
97
+ #{super()}
98
+ TEXT
99
+ },
100
+ query: -> { "avg(last_5m):avg:system.load.5{hostgroup:api} by {pod} > #{critical}" }, # replace actual value with #{critical} to keep them in sync
101
+ critical: -> { 20 }
102
+ )
103
+ ]
104
+ }
105
+ )
106
+ end
107
+ ```
108
+ - run `PROJECT=my_project bundle exec rake plan`, an Update to the existing monitor should be shown (not Create / Delete)
109
+ - alternatively: `bundle exec rake generate` to only locally update the generated `json` files
110
+ - review changes then `git commit`
111
+ - make a PR ... get reviewed ... merge
112
+ - datadog is updated by travis
113
+
114
+ ### Adding a new dashboard
115
+ - go to [datadog dashboard UI](https://app.datadoghq.com/dashboard/lists) and click on _New Dashboard_ to create a dashboard
116
+ - see below
117
+
118
+ ### Updating an existing dashboard
119
+ - go to [datadog dashboard UI](https://app.datadoghq.com/dashboard/lists) to find a dashboard
120
+ - get the `id` from the url
121
+ - run `URL='https://app.datadoghq.com/dashboard/bet-foo-bar' bundle exec rake kennel:import` and copy the output
122
+ - find or create a project in `projects/`
123
+ - add a dashboard to `parts: [` list, for example:
124
+ ```Ruby
125
+ class MyProject < Kennel::Models::Project
126
+ defaults(
127
+ team: -> { Teams::MyTeam.new }, # use existing team or create new one in teams/
128
+ parts: -> {
129
+ [
130
+ Kennel::Models::Dashboard.new(
131
+ self,
132
+ id: -> { "abc-def-ghi" }, # id from datadog url, not needed when creating a new dashboard
133
+ title: -> { "My Dashboard" },
134
+ description: -> { "Overview of foobar" },
135
+ template_variables: -> { ["environment"] }, # see https://docs.datadoghq.com/api/?lang=ruby#timeboards
136
+ kennel_id: -> { "overview-dashboard" }, # make up a unique name
137
+ layout_type: -> { "ordered" },
138
+ definitions: -> {
139
+ [ # An array of arrays, each one is a graph in the dashboard, alternatively a hash for finer control
140
+ [
141
+ # title, viz, type, query, edit an existing graph and see the json definition
142
+ "Graph name", "timeseries", "area", "sum:mystats.foobar{$environment}"
143
+ ],
144
+ [
145
+ # queries can be an Array as well, this will generate multiple requests
146
+ # for a single graph
147
+ "Graph name", "timeseries", "area", ["sum:mystats.foobar{$environment}", "sum:mystats.success{$environment}"],
148
+ # add events too ...
149
+ events: [{q: "tags:foobar,deploy", tags_execution: "and"}]
150
+ ]
151
+ ]
152
+ }
153
+ )
154
+ ]
155
+ }
156
+ )
157
+ end
158
+ ```
159
+
160
+ ### Skipping validations
161
+
162
+ Some validations might be too strict for your use case or just wrong. If so, please [open an issue](https://github.com/grosser/kennel/issues)
163
+ and use the `validate: -> { false }` option to unblock yourself in the meantime.
164
+
165
+ ### Linking with kennel_ids
166
+
167
+ To link to existing monitors via their kennel_id
168
+
169
+ - Screens `uptime` widgets can use `monitor: {id: "foo:bar"}`
170
+ - Screens `alert_graph` widgets can use `alert_id: "foo:bar"`
171
+ - Monitors `composite` can use `query: -> { "%{foo:bar} || %{foo:baz}" }`
172
+
173
+ ### Debugging changes locally
174
+
175
+ - rebase on updated `master` to not undo other changes
176
+ - figure out project name by converting the class name to snake-case
177
+ - run `PROJECT=foo bundle exec rake kennel:update_datadog` to test changes for a single project
178
+
179
+ ### Listing un-muted alerts
180
+
181
+ Run `rake kennel:alerts TAG=service:my-service` to see all un-muted alerts for a given datadog monitor tag.
182
+
183
+ ### Validating mentions work
184
+
185
+ `rake kennel:validate_mentions` should run as part of CI
186
+
187
+ ### Grepping through all of datadog
188
+
189
+ `TYPE=monitor rake kennel:dump`
190
+
191
+ ### Find all monitors with No-Data
192
+
193
+ `rake kennel:nodata TAG=team:foo`
194
+
195
+ ## Examples
196
+
197
+ ### Reusable monitors/dashes/etc
198
+
199
+ Add to `parts/<folder>`.
200
+
201
+ ```Ruby
202
+ module Monitors
203
+ class LoadTooHigh < Kennel::Models::Monitor
204
+ defaults(
205
+ name: -> { "#{project.name} load too high" },
206
+ message: -> { "Shut it down!" },
207
+ type: -> { "query alert" },
208
+ query: -> { "avg(last_5m):avg:system.load.5{hostgroup:#{project.kennel_id}} by {pod} > #{critical}" }
209
+ )
210
+ end
211
+ end
212
+ ```
213
+
214
+ Reuse it in multiple projects.
215
+
216
+ ```Ruby
217
+ class Database < Kennel::Models::Project
218
+ defaults(
219
+ team: -> { Kennel::Models::Team.new(mention: -> { '@slack-foo' }, kennel_id: -> { 'foo' }) },
220
+ parts: -> { [Monitors::LoadTooHigh.new(self, critical: -> { 13 })] }
221
+ )
222
+ end
223
+ ```
224
+ <!-- NOT IN template/Readme.md -->
225
+
226
+ ### Integration testing
227
+
228
+ ```Bash
229
+ rake play
230
+ cd template
231
+ rake plan
232
+ ```
233
+
234
+ Then make changes to play around, do not commit changes and make sure to revert with a `rake kennel:update_datadog` after deleting everything.
235
+
236
+ To make changes via the UI, make a new free datadog account and use its credentials instead.
237
+
238
+ Author
239
+ ======
240
+ [Michael Grosser](http://grosser.it)<br/>
241
+ michael@grosser.it<br/>
242
+ License: MIT<br/>
243
+ [![Build Status](https://travis-ci.org/grosser/kennel.png)](https://travis-ci.org/grosser/kennel)
244
+ <!-- NOT IN -->
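
The README's "Debugging changes locally" section asks you to derive the `PROJECT` name by converting the project class name to snake-case. The sketch below shows one way that conversion might look. `snake_case` here is a hypothetical helper written for illustration, not kennel's own implementation, so edge cases may differ from what the gem actually does.

```ruby
# Hypothetical helper (an assumption, not kennel's code):
# converts a class name such as "MyProject" into "my_project".
def snake_case(class_name)
  class_name
    .gsub("::", "_")                        # flatten namespaces
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')  # "HTTPServer" -> "HTTP_Server"
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')      # "MyProject"  -> "My_Project"
    .downcase
end

puts snake_case("MyProject")
puts snake_case("Database")
```

With a result like `my_project`, the command from the README becomes `PROJECT=my_project bundle exec rake kennel:update_datadog`.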
@@ -0,0 +1,90 @@
1
+ # frozen_string_literal: true
2
+ require "faraday"
3
+ require "json"
4
+ require "English"
5
+
6
+ require "kennel/version"
7
+ require "kennel/utils"
8
+ require "kennel/progress"
9
+ require "kennel/syncer"
10
+ require "kennel/api"
11
+ require "kennel/github_reporter"
12
+ require "kennel/subclass_tracking"
13
+ require "kennel/settings_as_methods"
14
+ require "kennel/file_cache"
15
+ require "kennel/template_variables"
16
+ require "kennel/optional_validations"
17
+ require "kennel/unmuted_alerts"
18
+
19
+ require "kennel/models/base"
20
+ require "kennel/models/record"
21
+
22
+ # records
23
+ require "kennel/models/dashboard"
24
+ require "kennel/models/monitor"
25
+ require "kennel/models/slo"
26
+
27
+ # settings
28
+ require "kennel/models/project"
29
+ require "kennel/models/team"
30
+
31
+ module Kennel
32
+ class ValidationError < RuntimeError
33
+ end
34
+
35
+ @out = $stdout
36
+ @err = $stderr
37
+
38
+ class << self
39
+ attr_accessor :out, :err
40
+
41
+ def generate
42
+ FileUtils.rm_rf("generated")
43
+ generated.each do |part|
44
+ path = "generated/#{part.tracking_id.sub(":", "/")}.json"
45
+ FileUtils.mkdir_p(File.dirname(path))
46
+ File.write(path, JSON.pretty_generate(part.as_json) << "\n")
47
+ end
48
+ end
49
+
50
+ def plan
51
+ syncer.plan
52
+ end
53
+
54
+ def update
55
+ syncer.plan
56
+ syncer.update if syncer.confirm
57
+ end
58
+
59
+ private
60
+
61
+ def syncer
62
+ @syncer ||= Syncer.new(api, generated, project: ENV["PROJECT"])
63
+ end
64
+
65
+ def api
66
+ @api ||= Api.new(ENV.fetch("DATADOG_APP_KEY"), ENV.fetch("DATADOG_API_KEY"))
67
+ end
68
+
69
+ def generated
70
+ @generated ||= begin
71
+ Progress.progress "Generating" do
72
+ load_all
73
+ parts = Models::Project.recursive_subclasses.flat_map do |project_class|
74
+ project_class.new.validated_parts
75
+ end
76
+ parts.map(&:tracking_id).group_by { |id| id }.select do |id, same|
77
+ raise "#{id} is defined #{same.size} times" if same.size != 1
78
+ end
79
+ parts
80
+ end
81
+ end
82
+ end
83
+
84
+ def load_all
85
+ ["teams", "parts", "projects"].each do |folder|
86
+ Dir["#{folder}/**/*.rb"].sort.each { |f| require "./#{f}" }
87
+ end
88
+ end
89
+ end
90
+ end
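
The `generated` method above rejects any `tracking_id` that is defined more than once. As a minimal standalone sketch of that check (the method name `assert_unique_ids!` and the use of `ArgumentError` are assumptions made for illustration, and `Array#tally` needs Ruby 2.7 or newer), the same idea can be tried on plain strings:

```ruby
# Illustrative sketch, not part of kennel: raise when any id occurs more than once.
def assert_unique_ids!(ids)
  duplicates = ids.tally.select { |_id, count| count > 1 }
  return if duplicates.empty?

  details = duplicates.map { |id, count| "#{id} is defined #{count} times" }
  raise ArgumentError, details.join(", ")
end

assert_unique_ids!(["foo:bar", "foo:baz"]) # passes silently

begin
  assert_unique_ids!(["foo:bar", "foo:baz", "foo:bar"])
rescue ArgumentError => e
  puts e.message
end
```

Unlike the original, which raises on the first duplicate it meets, this variant reports all duplicated ids in one message.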
@@ -0,0 +1,83 @@
1
+ # frozen_string_literal: true
2
+ module Kennel
3
+ class Api
4
+ def initialize(app_key, api_key)
5
+ @app_key = app_key
6
+ @api_key = api_key
7
+ @client = Faraday.new(url: "https://app.datadoghq.com") { |c| c.adapter :net_http_persistent }
8
+ end
9
+
10
+ def show(api_resource, id, params = {})
11
+ reply = request :get, "/api/v1/#{api_resource}/#{id}", params: params
12
+ api_resource == "slo" ? reply[:data] : reply
13
+ end
14
+
15
+ def list(api_resource, params = {})
16
+ if api_resource == "slo"
17
+ raise ArgumentError if params[:limit] || params[:offset]
18
+ limit = 1000
19
+ offset = 0
20
+ all = []
21
+
22
+ loop do
23
+ result = request :get, "/api/v1/#{api_resource}", params: params.merge(limit: limit, offset: offset)
24
+ data = result.fetch(:data)
25
+ all.concat data
26
+ break all if data.size < limit
27
+ offset += limit
28
+ end
29
+ else
30
+ result = request :get, "/api/v1/#{api_resource}", params: params
31
+ result = result.fetch(:dashboards) if api_resource == "dashboard"
32
+ result
33
+ end
34
+ end
35
+
36
+ def create(api_resource, attributes)
37
+ reply = request :post, "/api/v1/#{api_resource}", body: attributes
38
+ api_resource == "slo" ? reply[:data].first : reply
39
+ end
40
+
41
+ def update(api_resource, id, attributes)
42
+ request :put, "/api/v1/#{api_resource}/#{id}", body: attributes
43
+ end
44
+
45
+ def delete(api_resource, id)
46
+ request :delete, "/api/v1/#{api_resource}/#{id}", ignore_404: true
47
+ end
48
+
49
+ private
50
+
51
+ def request(method, path, body: nil, params: {}, ignore_404: false)
52
+ params = params.merge(application_key: @app_key, api_key: @api_key)
53
+ query = Faraday::FlatParamsEncoder.encode(params)
54
+ response = nil
55
+ tries = 2
56
+
57
+ tries.times do |i|
58
+ response = Utils.retry Faraday::ConnectionFailed, Faraday::TimeoutError, times: 2 do
59
+ @client.send(method, "#{path}?#{query}") do |request|
60
+ request.body = JSON.generate(body) if body
61
+ request.headers["Content-type"] = "application/json"
62
+ end
63
+ end
64
+
65
+ break if i == tries - 1 || method != :get || response.status < 500
66
+ Kennel.err.puts "Retrying on server error #{response.status} for #{path}"
67
+ end
68
+
69
+ if !response.success? && (response.status != 404 || !ignore_404)
70
+ message = +"Error #{response.status} during #{method.upcase} #{path}\n"
71
+ message << "request:\n#{JSON.pretty_generate(body)}\nresponse:\n" if body
72
+ message << response.body
73
+ raise message
74
+ end
75
+
76
+ if response.body.empty?
77
+ {}
78
+ else
79
+ JSON.parse(response.body, symbolize_names: true)
80
+ end
81
+ end
82
+ end
83
+ end
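
The `list` method above pages through SLOs with `limit` and `offset` until a page comes back shorter than `limit`. The sketch below isolates that loop so it can be run without network access. `fetch_all_pages` and the in-memory `ITEMS` array are stand-ins invented for this example; the block plays the role of the HTTP request.

```ruby
# Illustrative sketch: collect every item from an offset-paginated source.
# The block receives (limit, offset) and must return one page as an Array.
def fetch_all_pages(limit:)
  offset = 0
  all = []

  loop do
    page = yield(limit, offset)
    all.concat(page)
    break all if page.size < limit # a short page means there is nothing left
    offset += limit
  end
end

# Fake data source standing in for the API (an assumption for the demo).
ITEMS = (1..25).to_a

offsets = []
result = fetch_all_pages(limit: 10) do |limit, offset|
  offsets << offset
  ITEMS[offset, limit] || []
end

puts result.size
puts offsets.inspect
```

Note that when the total is an exact multiple of `limit`, this approach needs one extra request that returns an empty page before it stops.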
@@ -0,0 +1,53 @@
1
+ # frozen_string_literal: true
2
+
3
+ # cache that reads everything from a single file
4
+ # to avoid doing multiple disk reads while iterating all definitions
5
+ # it also replaces updated keys and has an overall expiry to not keep deleted things forever
6
+ module Kennel
7
+ class FileCache
8
+ def initialize(file, cache_version)
9
+ @file = file
10
+ @cache_version = cache_version
11
+ @now = Time.now.to_i
12
+ @expires = @now + (30 * 24 * 60 * 60) # 1 month
13
+ end
14
+
15
+ def open
16
+ load_data
17
+ expire_old_data
18
+ yield self
19
+ ensure
20
+ persist
21
+ end
22
+
23
+ def fetch(key, key_version)
24
+ old_value, old_version = @data[key]
25
+ return old_value if old_version == [key_version, @cache_version]
26
+
27
+ new_value = yield
28
+ @data[key] = [new_value, [key_version, @cache_version], @expires]
29
+ new_value
30
+ end
31
+
32
+ private
33
+
34
+ def load_data
35
+ @data =
36
+ begin
37
+ Marshal.load(File.read(@file)) # rubocop:disable Security/MarshalLoad
38
+ rescue StandardError
39
+ {}
40
+ end
41
+ end
42
+
43
+ def persist
44
+ dir = File.dirname(@file)
45
+ FileUtils.mkdir_p(dir) unless File.directory?(dir)
46
+ File.write(@file, Marshal.dump(@data))
47
+ end
48
+
49
+ def expire_old_data
50
+ @data.reject! { |_, (_, _, ex)| ex < @now }
51
+ end
52
+ end
53
+ end
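
`FileCache#fetch` above reuses a stored value only when both the per-key version and the cache-wide version match, and `expire_old_data` drops entries whose expiry timestamp has passed. The following in-memory sketch mirrors that behavior without touching disk. `VersionedCache`, its `now:` and `ttl:` keyword arguments, and the `size` reader are assumptions added so the example is self-contained and deterministic.

```ruby
# Illustrative in-memory variant of the versioned cache idea (not kennel's class).
class VersionedCache
  def initialize(cache_version, now: Time.now.to_i, ttl: 30 * 24 * 60 * 60)
    @cache_version = cache_version
    @now = now
    @expires = now + ttl
    @data = {}
  end

  # Returns the cached value if both versions match, otherwise stores the block's result.
  def fetch(key, key_version)
    value, version = @data[key]
    return value if version == [key_version, @cache_version]

    value = yield
    @data[key] = [value, [key_version, @cache_version], @expires]
    value
  end

  # Removes entries whose expiry timestamp lies before `now`.
  def expire_old_data(now = @now)
    @data.reject! { |_, (_, _, expires_at)| expires_at < now }
  end

  def size
    @data.size
  end
end

cache = VersionedCache.new("v1", now: 1_000, ttl: 60)
puts cache.fetch("monitor-1", 1) { "first" }  # miss: block runs
puts cache.fetch("monitor-1", 1) { "second" } # hit: block is skipped
puts cache.fetch("monitor-1", 2) { "third" }  # key version changed: block runs
```

As in the original, a cache hit returns the stored value without extending its expiry, so an entry is recomputed once its original expiry passes even if it is read often.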