cfn-guardian 0.4.0 → 0.6.2

Files changed (47)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/build-gem.yml +25 -0
  3. data/.github/workflows/release-gem.yml +25 -0
  4. data/.github/workflows/release-image.yml +33 -0
  5. data/.rspec +1 -0
  6. data/Gemfile.lock +13 -13
  7. data/README.md +3 -819
  8. data/cfn-guardian.gemspec +1 -3
  9. data/docs/alarm_templates.md +130 -0
  10. data/docs/cli.md +182 -0
  11. data/docs/composite_alarms.md +24 -0
  12. data/docs/custom_checks/azure_file_check.md +28 -0
  13. data/docs/custom_checks/domain_expiry.md +10 -0
  14. data/docs/custom_checks/http.md +59 -0
  15. data/docs/custom_checks/log_group_metric_filters.md +27 -0
  16. data/docs/custom_checks/nrpe.md +29 -0
  17. data/docs/custom_checks/port.md +40 -0
  18. data/docs/custom_checks/sftp.md +73 -0
  19. data/docs/custom_checks/sql.md +44 -0
  20. data/docs/custom_checks/tls.md +25 -0
  21. data/docs/custom_metrics.md +71 -0
  22. data/docs/event_subscriptions.md +67 -0
  23. data/docs/maintenance_mode.md +85 -0
  24. data/docs/notifiers.md +33 -0
  25. data/docs/overview.md +22 -0
  26. data/docs/resources.md +93 -0
  27. data/docs/variables.md +58 -0
  28. data/lib/cfnguardian.rb +72 -58
  29. data/lib/cfnguardian/cloudwatch.rb +43 -32
  30. data/lib/cfnguardian/compile.rb +82 -5
  31. data/lib/cfnguardian/deploy.rb +2 -16
  32. data/lib/cfnguardian/display_formatter.rb +1 -2
  33. data/lib/cfnguardian/error.rb +4 -0
  34. data/lib/cfnguardian/models/alarm.rb +40 -28
  35. data/lib/cfnguardian/models/check.rb +30 -12
  36. data/lib/cfnguardian/models/event.rb +43 -15
  37. data/lib/cfnguardian/models/event_subscription.rb +96 -0
  38. data/lib/cfnguardian/resources/azure_file.rb +20 -0
  39. data/lib/cfnguardian/resources/base.rb +111 -26
  40. data/lib/cfnguardian/resources/ec2_instance.rb +11 -0
  41. data/lib/cfnguardian/resources/http.rb +1 -0
  42. data/lib/cfnguardian/resources/rds_cluster.rb +14 -0
  43. data/lib/cfnguardian/resources/rds_instance.rb +71 -0
  44. data/lib/cfnguardian/stacks/main.rb +7 -6
  45. data/lib/cfnguardian/stacks/resources.rb +34 -5
  46. data/lib/cfnguardian/version.rb +1 -1
  47. metadata +35 -10
data/cfn-guardian.gemspec CHANGED
@@ -13,8 +13,6 @@ Gem::Specification.new do |spec|
13
13
  spec.homepage = "https://github.com/base2Services/cfn-guardian"
14
14
  spec.license = "MIT"
15
15
 
16
- spec.metadata["allowed_push_host"] = "https://rubygems.org"
17
-
18
16
  spec.metadata["homepage_uri"] = spec.homepage
19
17
  spec.metadata["source_code_uri"] = "https://github.com/base2Services/cfn-guardian"
20
18
  spec.metadata["changelog_uri"] = "https://github.com/base2Services/cfn-guardian"
@@ -39,5 +37,5 @@ Gem::Specification.new do |spec|
39
37
  spec.add_dependency 'aws-sdk-codepipeline', '~> 1.28', '<2'
40
38
 
41
39
  spec.add_development_dependency "bundler", "~> 2.0"
42
- spec.add_development_dependency "rake", "~> 10.0"
40
+ spec.add_development_dependency "rake", "~> 13.0"
43
41
  end
data/docs/alarm_templates.md ADDED
@@ -0,0 +1,130 @@
1
+ # Guardian Alarm Templates
2
+
3
+ Each resource group has a set of default alarm templates which define all the CloudWatch alarm options such as Threshold, Statistic, EvaluationPeriods etc. These can be manipulated in a few ways to change the values or create new alarms. They are defined under the top level key `Templates` in the yaml config file.
4
+
5
+ ## Alarm Defaults
6
+
7
+ To list the default alarms use the `show-alarms` command with the `--defaults` switch.
8
+ The list can be filtered using the `--group ApplicationTargetGroup` and `--alarm TargetResponseTime` optional switches
9
+
10
+ ```sh
11
+ cfn-guardian show-alarms --defaults --group ApplicationTargetGroup --alarm TargetResponseTime
12
+
13
+ +-------------------------+----------------------------------+
14
+ | ApplicationTargetGroup::TargetResponseTime |
15
+ | guardian-ApplicationTargetGroup-Default-TargetResponseTime |
16
+ +-------------------------+----------------------------------+
17
+ | Property | Config |
18
+ +-------------------------+----------------------------------+
19
+ | ResourceId | Default |
20
+ | ResourceHash | 7a1920d61156abc05a60135aefe8bc67 |
21
+ | Enabled | true |
22
+ | MetricName | TargetResponseTime |
23
+ | Dimensions | |
24
+ | Threshold | 5 |
25
+ | Period | 60 |
26
+ | EvaluationPeriods | 5 |
27
+ | ComparisonOperator | GreaterThanThreshold |
28
+ | Statistic | Maximum |
29
+ | ActionsEnabled | true |
30
+ | AlarmAction | Critical |
31
+ | TreatMissingData | notBreaching |
32
+ +-------------------------+----------------------------------+
33
+ ```
34
+
35
+ ## Overriding Defaults
36
+
37
+ Alarm properties such as `Threshold`, `AlarmAction`, etc. can be overridden at the alarm level or at the alarm group level.
38
+
39
+ ### Alarm Group Overrides
40
+
41
+ Alarm group level overrides apply to all alarms within the alarm group.
42
+
43
+ ```yaml
44
+ Templates:
45
+   # define the resource group
46
+   Ec2Instance:
47
+     # GroupOverrides key denotes the group level overrides
48
+     GroupOverrides:
49
+       # supply the key value of the alarm property you want to override
50
+       AlarmAction: Informational
51
+ ```
52
+
53
+ ### Alarm Overrides
54
+
55
+ Alarm overrides apply only to the alarm the property is applied to. This will override any alarm group level overrides.
56
+
57
+ ```yaml
58
+ Templates:
59
+   # define the resource group
60
+   Ec2Instance:
61
+     # define the Alarm name you want to override
62
+     CPUUtilizationHigh:
63
+       # supply the key value of the alarm property you want to override
64
+       Threshold: 80
65
+ ```
66
+
67
+ ## Creating A New Alarm From A Default
68
+
69
+ You can create a new alarm from a default alarm using the `Inherit:` key. This will inherit all properties from the default alarm, which can then be overridden.
70
+
71
+ ```yaml
72
+ Templates:
73
+   # define the resource group
74
+   Ec2Instance:
75
+     # define the name of the new alarm
76
+     CPUUtilizationWarning:
77
+       # Inherit the CPUUtilizationHigh alarm
78
+       Inherit: CPUUtilizationHigh
79
+       # supply the key value of the alarm property you want to override
80
+       Threshold: 75
81
+       EvaluationPeriods: 60
82
+       AlarmAction: Warning
83
+ ```
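The override and inheritance rules described above can be pictured as successive hash merges. The following is a minimal sketch in plain Ruby, not cfn-guardian's actual implementation: the `resolve_alarm` helper and the default values are hypothetical, and it assumes that `Inherit` copies the unmodified default alarm before group and alarm overrides are applied.

```ruby
# Hypothetical defaults for one resource group; the values are illustrative only.
DEFAULTS = {
  'CPUUtilizationHigh' => {
    'MetricName' => 'CPUUtilization',
    'Threshold' => 90,
    'EvaluationPeriods' => 10,
    'AlarmAction' => 'Critical'
  }
}.freeze

# Resolve one alarm (hypothetical helper). Properties are layered from least to
# most specific: default (or inherited) alarm, then GroupOverrides, then the
# alarm's own overrides, so the most specific setting wins.
def resolve_alarm(name, template, defaults)
  alarm_overrides = (template[name] || {}).dup
  parent = alarm_overrides.delete('Inherit') || name
  base = defaults.fetch(parent, {})
  group_overrides = template.fetch('GroupOverrides', {})
  base.merge(group_overrides).merge(alarm_overrides)
end

# The Ec2Instance template from the examples above, combined into one hash.
template = {
  'GroupOverrides' => { 'AlarmAction' => 'Informational' },
  'CPUUtilizationHigh' => { 'Threshold' => 80 },
  'CPUUtilizationWarning' => {
    'Inherit' => 'CPUUtilizationHigh',
    'Threshold' => 75,
    'AlarmAction' => 'Warning'
  }
}

high = resolve_alarm('CPUUtilizationHigh', template, DEFAULTS)
warning = resolve_alarm('CPUUtilizationWarning', template, DEFAULTS)
```

Under these assumptions `high` ends up with `Threshold` 80 and `AlarmAction` Informational (from the group override), while `warning` keeps the inherited `MetricName` but uses its own `Threshold` of 75 and `AlarmAction` of Warning.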
84
+
85
+ ## Creating A New Alarm With No Defaults
86
+
87
+ You can create a new alarm without inheriting an existing one. This will then inherit the default properties for the resource group.
88
+
89
+ ```yaml
90
+ Templates:
91
+   # define the resource group
92
+   Ec2Instance:
93
+     # define the name of the new alarm
94
+     CPUUtilizationWarning:
95
+       # metric name must be provided
96
+       MetricName: CPUUtilization
97
+       # supply the key value of the alarm property you want to override
98
+       Statistic: Minimum
99
+       Threshold: 75
100
+       EvaluationPeriods: 60
101
+       AlarmAction: Warning
102
+ ```
103
+
104
+ ## Disabling An Alarm
105
+
106
+ You can disable an alarm by setting the alarm to `false`
107
+
108
+ ```yaml
109
+ Templates:
110
+   # define the resource group
111
+   Ec2Instance:
112
+     # define the Alarm and set the value to false
113
+     CPUUtilizationHigh: false
114
+ ```
115
+
116
+ ## M Out Of N Metric Data Points
117
+
118
+ This can be useful for alerting on groups of spikes within a certain time frame without getting alerts for individual spikes.
119
+ It works by setting `EvaluationPeriods` as the N value and `DatapointsToAlarm` as the M value.
120
+ The following example will trigger the alarm if 6 out of 10 data points crossed the threshold of 90% CPU utilisation in a 10 minute period.
121
+
122
+ ```yaml
123
+ Templates:
124
+   Ec2Instance:
125
+     CPUUtilizationHigh:
126
+       Threshold: 90
127
+       Period: 60
128
+       EvaluationPeriods: 10
129
+       DatapointsToAlarm: 6
130
+ ```
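The M out of N rule above can be illustrated with a few lines of Ruby. This is only a sketch of the idea, not how CloudWatch evaluates alarms internally: the `m_out_of_n_breach?` helper is hypothetical, it assumes a `GreaterThanThreshold` comparison, and it ignores missing data points (`TreatMissingData`).

```ruby
# Returns true when at least `datapoints_to_alarm` (M) of the most recent
# `evaluation_periods` (N) data points are above the threshold.
def m_out_of_n_breach?(datapoints, threshold:, evaluation_periods:, datapoints_to_alarm:)
  window = datapoints.last(evaluation_periods)
  breaching = window.count { |value| value > threshold }
  breaching >= datapoints_to_alarm
end

# Ten one-minute CPU samples (percent); six of them are above 90.
samples = [95, 40, 96, 97, 50, 93, 30, 94, 20, 99]
spiky = m_out_of_n_breach?(samples, threshold: 90, evaluation_periods: 10, datapoints_to_alarm: 6)

# A single spike in the same ten-minute window does not trigger the alarm.
single = m_out_of_n_breach?([10] * 9 + [99], threshold: 90, evaluation_periods: 10, datapoints_to_alarm: 6)

puts spiky   # true
puts single  # false
```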
data/docs/cli.md ADDED
@@ -0,0 +1,182 @@
1
+ # Guardian CLI Commands
2
+
3
+ Guardian deployments are managed by AWS CodeBuild and AWS CodePipeline, but there are some useful commands to help debug an issue.
4
+
5
+ ## Install the cli
6
+
7
+ ```sh
8
+ gem install cfn-guardian
9
+ ```
10
+
11
+ ## CLI Help
12
+
13
+ ```sh
14
+ Commands:
15
+ cfn-guardian --version, -v # print the version
16
+ cfn-guardian compile c, --config=CONFIG # Generate monitoring CloudFormation templates
17
+ cfn-guardian deploy c, --config=CONFIG # Generates and deploys monitoring CloudFormation templates
18
+ cfn-guardian disable-alarms # Disable cloudwatch alarm notifications
19
+ cfn-guardian enable-alarms # Enable cloudwatch alarm notifications
20
+ cfn-guardian help [COMMAND] # Describe available commands or one specific command
21
+ cfn-guardian show-alarms # Shows alarm settings
22
+ cfn-guardian show-config-history # Shows the last 10 commits made to the codecommit repo
23
+ cfn-guardian show-drift # Cloudformation drift detection
24
+ cfn-guardian show-history # Shows alarm history for the last 7 days
25
+ cfn-guardian show-pipeline # Shows the current state of the AWS code pipeline
26
+ cfn-guardian show-state # Shows alarm state in cloudwatch
27
+
28
+ Options:
29
+ [--debug], [--no-debug] # enable debug logging
30
+ ```
31
+
32
+ ## Alarm Debugging
33
+
34
+ ### show-alarms
35
+
36
+ Displays the configured settings for each alarm. Can be filtered by resource group and alarm name. Defaults to show all configured alarms.
37
+ Alarms can be filtered using the `--filter` switch, providing multiple key:value pairs which are combined with the AND operator.
38
+
39
+ ```bash
40
+ Usage:
41
+ cfn-guardian show-alarms
42
+
43
+ Options:
44
+ c, [--config=CONFIG] # yaml config file
45
+ [--defaults], [--no-defaults] # display default alarms and properties
46
+ r, [--region=REGION] # set the AWS region
47
+ [--filter=key:value] # filter the displayed alarms by [group, resource-id, alarm, stack-id, topic, maintenance-group]
48
+ [--compare], [--no-compare] # compare config to deployed alarms
49
+ [--debug], [--no-debug] # enable debug logging
50
+ ```
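The AND behaviour of multiple `--filter` key:value pairs can be sketched as follows. This is an illustration only, not the gem's code: `parse_filters` and `filter_alarms` are hypothetical helpers, and the alarm records (including the `StatusCheckFailed` name and the resource ids) are made-up sample data.

```ruby
# Turn ["group:Ec2Instance", "alarm:CPUUtilizationHigh"] into a hash of filters.
def parse_filters(pairs)
  pairs.to_h { |pair| pair.split(':', 2) }
end

# Keep only the alarms that match every supplied filter (logical AND).
def filter_alarms(alarms, filters)
  alarms.select { |alarm| filters.all? { |key, value| alarm[key] == value } }
end

# Made-up alarm records keyed by the filter names listed in the usage above.
alarms = [
  { 'group' => 'Ec2Instance', 'alarm' => 'CPUUtilizationHigh', 'resource-id' => 'i-1234' },
  { 'group' => 'Ec2Instance', 'alarm' => 'StatusCheckFailed', 'resource-id' => 'i-1234' },
  { 'group' => 'ApplicationTargetGroup', 'alarm' => 'TargetResponseTime', 'resource-id' => 'tg-1' }
]

filters = parse_filters(['group:Ec2Instance', 'alarm:CPUUtilizationHigh'])
matches = filter_alarms(alarms, filters)
```

Here `matches` contains only the first record, because it is the only one that satisfies both the `group` and the `alarm` filter.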
51
+
52
+ ### show-history
53
+
54
+ Displays the alarm state or config history for the last 7 days. Alarms can be described in 2 different ways:
55
+
56
+ 1. Using the config to describe the alarms and filter via the group, alarm and resource id.
57
+ 2. Supplying a list of alarm names with the `--alarm-names` option.
58
+
59
+ *NOTE: Option 2 may find alarms not in the guardian stack.*
60
+
61
+ ```bash
62
+ Usage:
63
+ cfn-guardian show-history
64
+
65
+ Options:
66
+ r, [--region=REGION] # set the AWS region
67
+ [--alarm-names=one two three] # list of cloudwatch alarm names
68
+ t, [--type=TYPE] # filter by alarm state
69
+ # Default: state
70
+ # Possible values: state, config
71
+ [--alarm-prefix=ALARM_PREFIX] # cloudwatch alarm name prefix
72
+ # Default: guardian
73
+ [--filter=key:value] # filter the displayed alarms by [group, resource-id, alarm, stack-id]
74
+ [--debug], [--no-debug] # enable debug logging
75
+ ```
76
+
77
+ ### show-state
78
+
79
+ Displays the state of the deployed CloudWatch alarms. Alarms can be filtered using the `--filter` switch, providing multiple key:value pairs which are combined with the AND operator.
80
+ Alarms can also be filtered using `--alarm-prefix` to only list alarms that begin with the provided string.
81
+
82
+ ```bash
83
+ Usage:
84
+ cfn-guardian show-state
85
+
86
+ Options:
87
+ r, [--region=REGION] # set the AWS region
88
+ s, [--state=STATE] # filter by alarm state
89
+ # Possible values: OK, ALARM, INSUFFICIENT_DATA
90
+ [--alarm-names=one two three] # list of cloudwatch alarm names
91
+ [--alarm-prefix=ALARM_PREFIX] # cloudwatch alarm name prefix
92
+ # Default: guardian
93
+ [--filter=key:value] # filter the displayed alarms by [group, resource-id, alarm, stack-id, topic, maintenance-group]
94
+ [--debug], [--no-debug] # enable debug logging
95
+ ```
96
+
97
+ ### show-drift
98
+
99
+ Displays any CloudFormation drift detected in the CloudWatch alarms of the deployed stacks. Useful for detecting manual changes to an alarm.
100
+
101
+ ```bash
102
+ Usage:
103
+ cfn-guardian show-drift
104
+
105
+ Options:
106
+ s, [--stack-name=STACK_NAME] # set the Cloudformation stack name
107
+ # Default: guardian
108
+ [--debug], [--no-debug] # enable debug logging
109
+ ```
110
+
111
+ ## Enabling and Disabling Alarms
112
+
113
+ ### Disable Alarms
114
+
115
+ Disable CloudWatch alarm notifications for a maintenance group or for specific alarms. See the [maintenance groups](maintenance_mode.md) docs for more information.
116
+
117
+ ```bash
118
+ Usage:
119
+ cfn-guardian disable-alarms
120
+
121
+ Options:
122
+ r, [--region=REGION] # set the AWS region
123
+ g, [--group=GROUP] # name of the maintenance group defined in the config
124
+ [--alarm-prefix=ALARM_PREFIX] # cloud watch alarm name prefix
125
+ [--alarms=one two three] # List of cloudwatch alarm names
126
+ [--debug], [--no-debug] # enable debug logging
127
+ ```
128
+
129
+ ### Enable Alarms
130
+
131
+ Enable CloudWatch alarm notifications for a maintenance group or for specific alarms. Once alarms are enabled, the state is set back to OK so that notifications are re-sent for any failed alarms.
132
+
133
+ ```bash
134
+ Usage:
135
+ cfn-guardian enable-alarms
136
+
137
+ Options:
138
+ r, [--region=REGION] # set the AWS region
139
+ g, [--group=GROUP] # name of the maintenance group defined in the config
140
+ [--alarm-prefix=ALARM_PREFIX] # cloud watch alarm name prefix
141
+ [--alarms=one two three] # List of cloudwatch alarm names
142
+ [--debug], [--no-debug] # enable debug logging
143
+ ```
144
+
145
+ ## Cloudformation Stack
146
+
147
+ ### compile
148
+
149
+ Generates CloudFormation templates from the alarm configuration and outputs them to the out/ directory. Useful if you want to debug a config issue locally.
150
+
151
+ ```bash
152
+ Usage:
153
+ cfn-guardian compile c, --config=CONFIG
154
+
155
+ Options:
156
+ c, --config=CONFIG # yaml config file
157
+ [--validate], [--no-validate] # validate cfn templates
158
+ # Default: true
159
+ [--bucket=BUCKET] # provide custom bucket name, will create a default bucket if not provided
160
+ r, [--region=REGION] # set the AWS region
161
+ [--debug], [--no-debug] # enable debug logging
162
+ ```
163
+
164
+ ### deploy
165
+
166
+ Generates CloudFormation templates from the alarm configuration and outputs them to the out/ directory. Then copies the files to the S3 bucket and deploys the CloudFormation stack.
167
+
168
+ ```bash
169
+ Usage:
170
+ cfn-guardian deploy c, --config=CONFIG
171
+
172
+ Options:
173
+ c, --config=CONFIG # yaml config file
174
+ [--bucket=BUCKET] # provide custom bucket name, will create a default bucket if not provided
175
+ r, [--region=REGION] # set the AWS region
176
+ s, [--stack-name=STACK_NAME] # set the Cloudformation stack name. Defaults to `guardian`
177
+ [--sns-critical=SNS_CRITICAL] # sns topic arn for the critical alarms
178
+ [--sns-warning=SNS_WARNING] # sns topic arn for the warning alarms
179
+ [--sns-task=SNS_TASK] # sns topic arn for the task alarms
180
+ [--sns-informational=SNS_INFORMATIONAL] # sns topic arn for the informational alarms
181
+ [--debug], [--no-debug] # enable debug logging
182
+ ```
data/docs/composite_alarms.md ADDED
@@ -0,0 +1,24 @@
1
+ # Composite Alarms
2
+
3
+ Composite alarms take into account a combination of alarm states and only alarm when all conditions in the rule are met. See the AWS [documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_PutCompositeAlarm.html) for rule syntax.
4
+
5
+ Using the `Composites:` top level key, create the alarm using the following syntax.
6
+
7
+ **NOTE:** Each composite alarm costs $0.50/month
8
+
9
+ ```yaml
10
+ Composites:
11
+
12
+   # the key is used as the alarm name
13
+   AlarmName:
14
+     # Set the notification SNS topic, defaults to no notifications
15
+     Action: Informational
16
+     # Set a meaningful alarm description
17
+     Description: test
18
+     # Set the alarm rule by providing the alarm names. See above for rule syntax.
19
+     # Use the show-state command to get a list of the alarm names.
20
+     Rule: >-
21
+       ALARM(guardian-alarm-1)
22
+       AND
23
+       ALARM(guardian-alarm-2)
24
+ ```
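To make the rule in the example concrete, the sketch below evaluates a rule of the form `ALARM(a) AND ALARM(b)` against a map of alarm states. It is a deliberately simplified illustration: the `composite_in_alarm?` helper is hypothetical, it only understands `ALARM(...)` terms joined by `AND`, and it does not implement the full CloudWatch rule syntax (`OR`, `NOT`, `OK(...)`, parentheses).

```ruby
# Evaluate a rule made only of ALARM(name) terms joined by AND.
# Returns true when every referenced alarm is currently in the ALARM state.
def composite_in_alarm?(rule, states)
  names = rule.scan(/ALARM\(([^)]+)\)/).flatten
  !names.empty? && names.all? { |name| states[name] == 'ALARM' }
end

rule = 'ALARM(guardian-alarm-1) AND ALARM(guardian-alarm-2)'

both_failing = composite_in_alarm?(rule, 'guardian-alarm-1' => 'ALARM', 'guardian-alarm-2' => 'ALARM')
one_failing  = composite_in_alarm?(rule, 'guardian-alarm-1' => 'ALARM', 'guardian-alarm-2' => 'OK')

puts both_failing  # true
puts one_failing   # false
```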
data/docs/custom_checks/azure_file_check.md ADDED
@@ -0,0 +1,28 @@
1
+ # Azure File Check
2
+
3
+ CloudWatch Namespace: `FileAgeCheck`
4
+
5
+ Alerts based on file age being older than expected
6
+ ```yaml
7
+ Resources:
8
+   AzureFile:
9
+     # Storage account
10
+     - Id: us187fnakrap
11
+       # Container within storage account
12
+       Container: mybackups
13
+       # SSM Param within the AWS account which contains the storage account connection string
14
+       ConnectionString: /azurefilecheck/test/connection_string
15
+       # List of search objects
16
+       Search:
17
+         -
18
+           # Prefix used to filter returned items in blob storage
19
+           PREFIX: file123
20
+           # File identifier to perform age check on
21
+           REGEX: .log
22
+           # Oldest expected file age in seconds
23
+           OLDEST: 300
24
+         -
25
+           PREFIX: file456
26
+           REGEX: .bak
27
+           OLDEST: 86400
28
+ ```
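One way to read the `PREFIX`, `REGEX` and `OLDEST` settings above is sketched below. This is an assumption about the check's behaviour, not the actual check implementation: the `stale?` helper is hypothetical, blobs are plain in-memory hashes rather than Azure API results, and treating "no matching file" as stale is a choice made for the example.

```ruby
# Returns true when the newest blob matching the prefix and regex is older than
# `oldest` seconds, or when no blob matches at all (assumed behaviour).
def stale?(blobs, prefix:, regex:, oldest:, now: Time.now)
  pattern = Regexp.new(regex)
  matching = blobs.select { |blob| blob[:name].start_with?(prefix) && blob[:name].match?(pattern) }
  return true if matching.empty?

  newest = matching.map { |blob| blob[:last_modified] }.max
  (now - newest) > oldest
end

# Made-up blob listing with a fixed reference time.
now = Time.utc(2024, 1, 1, 12, 0, 0)
blobs = [
  { name: 'file123-a.log', last_modified: now - 120 },
  { name: 'file123-b.log', last_modified: now - 900 },
  { name: 'file456-a.bak', last_modified: now - 100_000 }
]

logs_stale = stale?(blobs, prefix: 'file123', regex: '.log', oldest: 300, now: now)
backups_stale = stale?(blobs, prefix: 'file456', regex: '.bak', oldest: 86_400, now: now)

puts logs_stale     # false
puts backups_stale  # true
```

Note that `.log` is treated as a regular expression here, so the leading dot matches any character rather than a literal period.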
data/docs/custom_checks/domain_expiry.md ADDED
@@ -0,0 +1,10 @@
1
+ # DomainExpiry
2
+
3
+ Cloudwatch NameSpace: `DNS`
4
+
5
+ ```yaml
6
+ Resources:
7
+   DomainExpiry:
8
+     # Array of resources defining the domain with the Id: key
9
+     - Id: example.com
10
+ ```
data/docs/custom_checks/http.md ADDED
@@ -0,0 +1,59 @@
1
+ # HTTP
2
+
3
+ ## Public HTTP Check
4
+
5
+ Cloudwatch NameSpace: `HttpCheck`
6
+
7
+ ```yaml
8
+ Resources:
9
+   Http:
10
+     # Array of resources defining the http endpoint with the Id: key
11
+     - Id: https://api.example.com
12
+       # enables the status code check
13
+       StatusCode: 200
14
+       # enables the SSL check
15
+       Ssl: true
16
+       # boolean to request a compressed response
17
+       Compressed: true
18
+     - Id: https://www.example.com
19
+       StatusCode: 301
20
+     - Id: https://example.com
21
+       StatusCode: 200
22
+       Ssl: true
23
+       # enables the body regex check
24
+       BodyRegex: 'helloworld'
25
+     - Id: http://www.example.com/images/cat.jpg
26
+       StatusCode: 200
27
+       # md5 hash of the image
28
+       BodyRegex: ae49b4246a89efcb5c639f00a013e812
29
+     - Id: https://api.example.com/user
30
+       StatusCode: 201
31
+       # default method is get but can be overridden to support post/put/head etc
32
+       Method: post
33
+       # specify headers using "key=value key=value"
34
+       Headers: content-type=application/json
35
+       # pass in custom payload for the request
36
+       Payload: '{"name": "john"}'
37
+ ```
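The `Headers` value uses a space-separated `key=value key=value` format. A minimal sketch of how such a string could be turned into a header hash is shown below; `parse_headers` is a hypothetical helper for illustration, not the parser used by the check itself.

```ruby
# Split "key=value key=value" into a hash of header names and values.
# Only the first "=" in each pair separates the name from the value.
def parse_headers(spec)
  spec.split(' ').to_h { |pair| pair.split('=', 2) }
end

headers = parse_headers('content-type=application/json accept=application/json')
puts headers['content-type']  # application/json
```

With this format a header value cannot itself contain a space, since spaces separate the pairs.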
38
+
39
+ ## Private HTTP Check
40
+
41
+ Cloudwatch NameSpace: `InternalHttpCheck`
42
+
43
+ ```yaml
44
+ Resources:
45
+   InternalHttp:
46
+     # Array of host groups with the unique identifier of Environment.
47
+     # This will create a nrpe lambda per group attached to the defined vpc and subnets
48
+     - Environment: Prod
49
+       # VPC id for the vpc the EC2 hosts are running in
50
+       VpcId: vpc-1234
51
+       # Array of subnets to attach to the lambda function. Supply multiple if you want to be multi AZ.
52
+       # Multiple subnets from the same AZ cannot be used!
53
+       Subnets:
54
+         - subnet-abcd
55
+       Hosts:
56
+         # Array of resources defining the http endpoint with the Id: key
57
+         # All the same options as Http including ssl check on the internal endpoint
58
+         - Id: http://api.example.com
59
+ ```