runbooks 0.9.2__py3-none-any.whl → 0.9.4__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. runbooks/__init__.py +15 -6
  2. runbooks/cfat/__init__.py +3 -1
  3. runbooks/cloudops/__init__.py +3 -1
  4. runbooks/common/aws_utils.py +367 -0
  5. runbooks/common/enhanced_logging_example.py +239 -0
  6. runbooks/common/enhanced_logging_integration_example.py +257 -0
  7. runbooks/common/logging_integration_helper.py +344 -0
  8. runbooks/common/profile_utils.py +8 -6
  9. runbooks/common/rich_utils.py +347 -3
  10. runbooks/enterprise/logging.py +400 -38
  11. runbooks/finops/README.md +262 -406
  12. runbooks/finops/__init__.py +2 -1
  13. runbooks/finops/accuracy_cross_validator.py +12 -3
  14. runbooks/finops/commvault_ec2_analysis.py +415 -0
  15. runbooks/finops/cost_processor.py +718 -42
  16. runbooks/finops/dashboard_router.py +44 -22
  17. runbooks/finops/dashboard_runner.py +302 -39
  18. runbooks/finops/embedded_mcp_validator.py +358 -48
  19. runbooks/finops/finops_scenarios.py +771 -0
  20. runbooks/finops/multi_dashboard.py +30 -15
  21. runbooks/finops/single_dashboard.py +386 -58
  22. runbooks/finops/types.py +29 -4
  23. runbooks/inventory/__init__.py +2 -1
  24. runbooks/main.py +522 -29
  25. runbooks/operate/__init__.py +3 -1
  26. runbooks/remediation/__init__.py +3 -1
  27. runbooks/remediation/commons.py +55 -16
  28. runbooks/remediation/commvault_ec2_analysis.py +259 -0
  29. runbooks/remediation/rds_snapshot_list.py +267 -102
  30. runbooks/remediation/workspaces_list.py +182 -31
  31. runbooks/security/__init__.py +3 -1
  32. runbooks/sre/__init__.py +2 -1
  33. runbooks/utils/__init__.py +81 -6
  34. runbooks/utils/version_validator.py +241 -0
  35. runbooks/vpc/__init__.py +2 -1
  36. {runbooks-0.9.2.dist-info → runbooks-0.9.4.dist-info}/METADATA +98 -60
  37. {runbooks-0.9.2.dist-info → runbooks-0.9.4.dist-info}/RECORD +41 -38
  38. {runbooks-0.9.2.dist-info → runbooks-0.9.4.dist-info}/entry_points.txt +1 -0
  39. runbooks/inventory/cloudtrail.md +0 -727
  40. runbooks/inventory/discovery.md +0 -81
  41. runbooks/remediation/CLAUDE.md +0 -100
  42. runbooks/remediation/DOME9.md +0 -218
  43. runbooks/security/ENTERPRISE_SECURITY_FRAMEWORK.md +0 -506
  44. {runbooks-0.9.2.dist-info → runbooks-0.9.4.dist-info}/WHEEL +0 -0
  45. {runbooks-0.9.2.dist-info → runbooks-0.9.4.dist-info}/licenses/LICENSE +0 -0
  46. {runbooks-0.9.2.dist-info → runbooks-0.9.4.dist-info}/top_level.txt +0 -0
@@ -1,727 +0,0 @@
1
- # CloudTrail: Accountability and Governance
2
-
3
- ??? info "python check_cloudtrail_status.py"
4
-
5
- | Parent Acct | Account Number | Region | Trail Name | Trail Type | S3 Bucket |
6
- | ------------ | -------------- | --------- | -------------------- | ---------- | ----------------------------- |
7
- | 909135376185 | 909135376185 | us-east-1 | ams-tf-cloudtraillog | OrgTrail | ams-cloudtrail-vector-all-org |
8
-
9
- ---
10
-
11
- ## How to identify WHO changed an AWS Resource, WHEN, and HOW it happened
12
-
13
-
14
-
15
- ??? note "🔎 Objective: Changed an AWS Security Group (sg-xxx) in an AWS account"
16
-
17
- > We need to know:
18
-
19
- - **Who** made the change
20
- - **When** it happened
21
- - **How** (what method/tool was used)
22
- - Optionally: **What** exactly was changed (rules added/removed/modified)
23
-
24
-
25
- ## ✅ Step-by-Step Forensic Workflow
26
-
27
- ### 🛠️ **1. Enable or Query AWS CloudTrail (Primary Source)**
28
-
29
- CloudTrail is your **single source of truth** for who did what, when, and how in AWS.
30
-
31
- #### 🔍 How to Query CloudTrail for SG Changes:
32
- **Console:**
33
- 1. Go to **CloudTrail > Event History**
34
- 2. Set **Lookup Attribute** to:
35
- - **Event name**: `AuthorizeSecurityGroupIngress`, `AuthorizeSecurityGroupEgress`, `RevokeSecurityGroupIngress`, `RevokeSecurityGroupEgress`, `UpdateSecurityGroupRuleDescriptionsIngress`, or `UpdateSecurityGroupRuleDescriptionsEgress`
36
- - Or filter by **Resource Name**: `sg-xxx`
37
- 3. Set **Time Range** (last 7–30 days typically)
38
-
39
- **Important Fields to Look At:**
40
- | Field | Description |
41
- |-------|-------------|
42
- | **Event Time** | When the change occurred |
43
- | **Event Name** | Type of operation (authorize/revoke/modify) |
44
- | **User Identity** | IAM principal who initiated the change |
45
- | **Access Key / Session Context** | Whether it was via console, CLI, or automation |
46
- | **Source IP** | IP address where the change came from |
47
- | **Event Source** | Always `ec2.amazonaws.com` for SG changes |
48
- | **Request Parameters** | IP ranges, ports, protocols involved in the change |
49
-
50
- > ✅ **Pro Tip**: If you're using **AWS Organizations**, query from the **Audit Account's CloudTrail**, or the central logging bucket if it's delivered via S3.
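The console lookup above can also be scripted. The following is a minimal sketch, assuming `boto3` is installed and credentials are configured; the helper names `build_lookup_kwargs` and `find_sg_changes` are hypothetical. It relies on the CloudTrail `LookupEvents` API, which accepts a single lookup attribute per call, so the sketch queries one event name at a time and filters on the security group ID client-side.

```python
from datetime import datetime, timedelta, timezone

# Event names taken from the console instructions above.
SG_EVENT_NAMES = [
    "AuthorizeSecurityGroupIngress",
    "AuthorizeSecurityGroupEgress",
    "RevokeSecurityGroupIngress",
    "RevokeSecurityGroupEgress",
]


def build_lookup_kwargs(event_name, days_back=7, now=None):
    """Build keyword arguments for one CloudTrail LookupEvents call."""
    end = now or datetime.now(timezone.utc)
    return {
        "LookupAttributes": [
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ],
        "StartTime": end - timedelta(days=days_back),
        "EndTime": end,
    }


def find_sg_changes(group_id, days_back=7):
    """Yield Event History records whose resources include group_id."""
    import boto3  # imported lazily so build_lookup_kwargs works without boto3

    client = boto3.client("cloudtrail")
    paginator = client.get_paginator("lookup_events")
    for name in SG_EVENT_NAMES:
        for page in paginator.paginate(**build_lookup_kwargs(name, days_back)):
            for event in page["Events"]:
                resources = event.get("Resources", [])
                if any(r.get("ResourceName") == group_id for r in resources):
                    yield event
```

Event History only covers the most recent 90 days of management events; for older changes, the S3-backed Athena approach is the better fit.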
51
-
52
- ---
53
-
54
- ### 📜 **2. Use AWS Config for Historical Diff View**
55
-
56
- If AWS Config is enabled for your account (highly recommended), it provides **resource-level history and diffs**.
57
-
58
- #### 🔍 How to Use:
59
- 1. Go to **AWS Config > Resources**
60
- 2. Filter by **Resource Type**: *EC2 Security Group*
61
- 3. Search for **sg-xxx**
62
- 4. View **Configuration Timeline**:
63
- - You’ll see **before/after diffs**
64
- - You can pinpoint what rule (CIDR/port/protocol) was added or removed
65
-
66
- > 🔐 **Bonus**: AWS Config also tells you **compliance status**, i.e., whether the change violated your internal security baselines.
67
-
68
- ---
69
-
70
- ### 📊 **3. Use Athena + CloudTrail Logs for Advanced Search (Optional)**
71
-
72
- If your CloudTrail is delivered to S3 (recommended for long-term logging), you can:
73
- - Use **Amazon Athena** with the **AWS CloudTrail partitioned schema**
74
- - Run a SQL query like this:
75
-
76
- ```sql
77
- SELECT
78
- eventTime,
79
- eventName,
80
- userIdentity.arn,
81
- sourceIPAddress,
82
- json_extract_scalar(requestParameters, '$.groupId') AS groupId,
83
- json_extract(requestParameters, '$.ipPermissions') AS ipPermissions
84
- FROM cloudtrail_logs
85
- WHERE eventSource = 'ec2.amazonaws.com'
86
- AND json_extract_scalar(requestParameters, '$.groupId') = 'sg-xxx'
87
- AND eventName IN (
88
- 'AuthorizeSecurityGroupIngress',
89
- 'RevokeSecurityGroupIngress',
90
- 'AuthorizeSecurityGroupEgress',
91
- 'RevokeSecurityGroupEgress'
92
- )
93
- ORDER BY eventTime DESC
94
- ```
95
-
96
- ---
97
-
98
- ### 🧑‍💻 **4. Correlate IAM User/Role Access**
99
-
100
- Once you identify **who** made the change, validate:
101
-
102
- - Was it a human user (`IAMUser`) or an automated role session (`AssumedRole`)?
103
- - Was MFA enforced for human access?
104
- - Did the user belong to a **delegated group (like `Admin`, `NetworkOps`)**?
105
-
106
- Use **AWS IAM > Access Analyzer** or **CloudTrail "userIdentity" block** for this.
107
-
108
- > 🔐 If it's an assumed role like `DevOpsAssumeRole`, look for **"sessionContext > sessionIssuer > userName"** in CloudTrail to trace back the original IAM identity.
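The checks above can be applied to a parsed CloudTrail record programmatically. This is a minimal sketch, assuming the `userIdentity` block has already been loaded into a Python dict; the function name `describe_actor` and the shape of its return value are hypothetical. CloudTrail reports `mfaAuthenticated` as the string `"true"` or `"false"`, so the sketch compares against the string.

```python
def describe_actor(user_identity):
    """Summarize a CloudTrail userIdentity block as a small dict."""
    identity_type = user_identity.get("type", "Unknown")
    session = user_identity.get("sessionContext", {})
    issuer = session.get("sessionIssuer", {})
    mfa = session.get("attributes", {}).get("mfaAuthenticated")

    if identity_type == "AssumedRole":
        # For role sessions, sessionIssuer.userName is the role that was assumed.
        name = issuer.get("userName", "Unknown")
    else:
        # IAM users carry userName directly; Root has none, so fall back to the type.
        name = user_identity.get("userName", identity_type)

    return {
        "type": identity_type,
        "name": name,
        "arn": user_identity.get("arn", "Unknown"),
        "mfa": mfa == "true",
    }
```

For an assumed role, the session name at the end of the ARN still has to be correlated with the `AssumeRole` event to find the original caller.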
109
-
110
- ---
111
-
112
- ### 🔄 **5. Cross-Check via Change Management / ITSM**
113
-
114
- If you're practicing good governance, you should have:
115
- - A **JIRA ticket**, **ServiceNow request**, or **GitOps pull request** associated with the change
116
- - Cross-reference **timestamp** and **user** from CloudTrail with ticket system logs
117
- - If deployed via Terraform/CDK: check the commit history or CI/CD job logs
118
-
119
- ---
120
-
121
- ## 🔥 Security Best Practices Going Forward
122
-
123
- | Practice | Why |
124
- |---------|-----|
125
- | ✅ Enable **CloudTrail org-wide** and deliver logs to central S3 | Full audit trail |
126
- | ✅ Use **AWS Config** across accounts | Historical visibility of resource changes |
127
- | ✅ Integrate **CloudTrail Insights** | Detect unusual activity like bulk security group changes |
128
- | ✅ Tag SGs with `Owner`, `Environment`, `ChangeTicket` | Aids investigation |
129
- | ✅ Enforce **IAM Conditions** for `ec2:AuthorizeSecurityGroupIngress` etc. | Only allow changes through approved paths |
130
- | ✅ Use **GuardDuty** or **Security Hub** to flag risky changes (e.g., `0.0.0.0/0` open port) | Detection & alerting |
131
-
132
- ---
133
-
134
- > **An enterprise-grade, fully automated solution** for **monitoring and alerting on AWS Security Group (`sg-xxx`) changes** with forensic-level visibility, real-time alerts, and infrastructure governance.
135
-
136
- We're going to:
137
-
138
- 1. ✅ Deep dive: How to **use Athena with CloudTrail partitioned schema**
139
- 2. ✅ Craft **Athena queries** to detect SG changes
140
- 3. ✅ Configure **AWS Config rules** to enforce and detect policy violations
141
- 4. ✅ Build a **real-time alert workflow** using **EventBridge + SNS + Microsoft Teams**
142
- 5. ✅ Wrap with best practices for **automation, security, and audit-readiness**
143
-
144
- ---
145
-
146
- ## 🔍 PART 1: Use Amazon Athena with AWS CloudTrail Logs (Partitioned Schema)
147
-
148
- Amazon Athena allows you to **query CloudTrail logs stored in S3 using SQL**, which is ideal for forensic investigations or continuous audits.
149
-
150
- ### ✅ Step-by-Step Setup
151
-
152
- #### **Step 1 – Ensure CloudTrail is Delivered to S3**
153
- If not already configured:
154
- - Go to **CloudTrail > Trails**
155
- - Ensure **S3 logging is enabled** and directed to a known bucket (e.g., `cloudtrail-logs-org-central`)
156
- - Ideally use **organization trail** for centralized auditing
157
-
158
- #### **Step 2 – Create Athena Table for CloudTrail Logs**
159
-
160
- Use this sample schema to create an Athena table (you only need to do this once per account or org-wide audit bucket):
161
-
162
- > ~~s3://<your-cloudtrail-bucket-name>/AWSLogs/<account-id>/CloudTrail/~~ --> s3://cloudtrail-logs-org-central/AWSLogs/~~Your_Management_Account_ID~~/CloudTrail/
163
-
164
- s3://ams-cloudtrail-vector-all-org/AWSLogs/909135376185/CloudTrail/
165
-
166
- > create-cloudtrail_logs-table.sql
167
-
168
- ```sql
169
- CREATE EXTERNAL TABLE IF NOT EXISTS cloudtrail_logs (
170
- eventVersion STRING,
171
- eventTime TIMESTAMP,
172
- eventSource STRING,
173
- eventName STRING,
174
- awsRegion STRING,
175
- sourceIPAddress STRING,
176
- userAgent STRING,
177
- userIdentity STRUCT<
178
- type: STRING,
179
- principalId: STRING,
180
- arn: STRING,
181
- accountId: STRING,
182
- accessKeyId: STRING,
183
- userName: STRING,
184
- sessionContext: STRUCT<
185
- attributes: STRUCT<
186
- mfaAuthenticated: STRING,
187
- creationDate: STRING>,
188
- sessionIssuer: STRUCT<
189
- type: STRING,
190
- principalId: STRING,
191
- arn: STRING,
192
- accountId: STRING,
193
- userName: STRING>>>,
194
- requestParameters STRING,
195
- responseElements STRING,
196
- additionalEventData STRING,
197
- errorCode STRING,
198
- errorMessage STRING,
199
- requestID STRING,
200
- eventID STRING,
201
- readOnly STRING,
202
- eventType STRING,
203
- apiVersion STRING,
204
- managementEvent BOOLEAN,
205
- recipientAccountId STRING,
206
- sharedEventID STRING,
207
- vpcEndpointId STRING
208
- )
209
- PARTITIONED BY (`region` STRING, `year` STRING, `month` STRING, `day` STRING)
210
- STORED AS PARQUET
211
- LOCATION 's3://your-cloudtrail-logs/AWSLogs/<account-id>/CloudTrail/'
212
- TBLPROPERTIES (
213
- "classification"="parquet",
214
- "projection.enabled"="true",
215
- "projection.region.type"="enum",
216
- "projection.region.values"="us-east-1,us-west-2,ap-southeast-2",
217
- "projection.year.type"="integer",
218
- "projection.year.range"="2024,2030",
219
- "projection.month.type"="integer",
220
- "projection.month.range"="1,12", "projection.month.digits"="2",
221
- "projection.day.type"="integer",
222
- "projection.day.range"="1,31", "projection.day.digits"="2",
223
- "storage.location.template"="s3://your-cloudtrail-logs/AWSLogs/<account-id>/CloudTrail/${region}/${year}/${month}/${day}/"
224
- );
225
- ```
226
-
227
- > Replace `your-cloudtrail-logs` and `<account-id>` in both `LOCATION` and `storage.location.template` with your bucket name and account ID.
228
-
229
- #### **Step 3 – Repair Partitions (very important)**
230
-
231
- ```sql
232
- MSCK REPAIR TABLE cloudtrail_logs;
233
- ```
234
-
235
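- This loads available partitions into Athena for querying. When partition projection is enabled (`"projection.enabled"="true"` in the table properties), Athena computes partitions from the projection settings instead, so this step is not required.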
236
-
237
- ---
238
-
239
- ## 📊 PART 2: Craft Athena Query to Detect SG Changes
240
-
241
- Here's an optimized, real-world Athena SQL query to detect all Security Group changes (`sg-xxx`):
242
-
243
- ```sql
244
- SELECT
245
- eventTime,
246
- eventName,
247
- userIdentity.arn AS actor,
248
- userIdentity.sessionContext.sessionIssuer.userName AS assumedBy,
249
- sourceIPAddress,
250
- requestParameters,
251
- json_extract_scalar(requestParameters, '$.groupId') AS securityGroupId
252
- FROM cloudtrail_logs
253
- WHERE eventName IN (
254
- 'AuthorizeSecurityGroupIngress',
255
- 'RevokeSecurityGroupIngress',
256
- 'AuthorizeSecurityGroupEgress',
257
- 'RevokeSecurityGroupEgress',
258
- 'UpdateSecurityGroupRuleDescriptionsEgress',
259
- 'UpdateSecurityGroupRuleDescriptionsIngress'
260
- )
261
- AND json_extract_scalar(requestParameters, '$.groupId') = 'sg-xxx'
262
- AND year = '2025'
263
- AND month = '04'
264
- ORDER BY eventTime DESC;
265
- ```
266
-
267
- > ✅ **Customize** the `year` and `month` fields based on your timeframe. This is important for partition pruning and performance.
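Hand-editing the `year`, `month`, and `day` filters is error-prone once a time window crosses a month boundary. The sketch below generates the partition predicate from two dates; the function name `partition_predicate` is hypothetical, and it assumes the string-typed, zero-padded partition columns used in the queries here.

```python
from datetime import date, timedelta


def partition_predicate(start, end):
    """Return a SQL predicate selecting year/month/day partitions in [start, end]."""
    if end < start:
        raise ValueError("end must not be before start")
    clauses = []
    current = start
    while current <= end:
        # Zero-padded strings match partition values such as month = '04'.
        clauses.append(
            f"(year = '{current:%Y}' AND month = '{current:%m}' AND day = '{current:%d}')"
        )
        current += timedelta(days=1)
    return "(" + " OR ".join(clauses) + ")"


if __name__ == "__main__":
    print(partition_predicate(date(2025, 4, 30), date(2025, 5, 1)))
```

The result can be appended to a `WHERE` clause with `AND`. For long ranges, a coarser predicate on `year` and `month` keeps the query text short.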
268
-
269
- ---
270
-
271
- ## 🛡️ PART 3: Use AWS Config to Detect Non-Compliant SG Changes
272
-
273
- ### ✅ AWS Config Setup
274
-
275
- Ensure AWS Config is:
276
- - **Enabled in the account or centrally via org**
277
- - Recording **EC2:SecurityGroup** as a tracked resource
278
-
279
- ### 🔧 Create AWS Managed or Custom Rule
280
-
281
- Use AWS Managed Rule: `INCOMING_SSH_DISABLED`, `RESTRICTED_INCOMING_TRAFFIC`, or create a **custom rule** (Lambda-backed) to detect violations like:
282
-
283
- - **Ports open to 0.0.0.0/0**
284
- - **New rules outside of allowed IP range**
285
- - **Unauthorized source CIDRs**
286
-
287
- ### ✅ Example: Custom AWS Config Rule for CIDR Scope
288
-
289
- You can use [AWS Config Rule example](https://docs.aws.amazon.com/config/latest/developerguide/evaluate-config_develop-rules_nodejs.html) or build a custom rule in Python that flags any ingress rule with `0.0.0.0/0`.
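The core check of such a Python rule could look like the sketch below. It covers only the evaluation logic, not the Lambda handler or the call that reports results back to AWS Config. The function name `evaluate_security_group` is hypothetical, and the field names (`ipPermissions`, `ipv4Ranges`, `cidrIp`, `ipv6Ranges`, `cidrIpv6`) are assumptions about the configuration item for `AWS::EC2::SecurityGroup` that should be verified against a real item. Flagging `::/0` is an extension beyond the `0.0.0.0/0` case named above.

```python
# CIDRs treated as "open to the world"; ::/0 is an added assumption.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def evaluate_security_group(configuration):
    """Return 'NON_COMPLIANT' if any ingress rule allows an open CIDR."""
    for permission in configuration.get("ipPermissions", []):
        ipv4 = [r.get("cidrIp") for r in permission.get("ipv4Ranges", [])]
        ipv6 = [r.get("cidrIpv6") for r in permission.get("ipv6Ranges", [])]
        if OPEN_CIDRS.intersection(ipv4 + ipv6):
            return "NON_COMPLIANT"
    return "COMPLIANT"
```

The return values match the compliance type strings that AWS Config expects in an evaluation.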
290
-
291
- ---
292
-
293
- ## 🚨 PART 4: Automate Real-Time Alerting with EventBridge + SNS + MS Teams
294
-
295
- ### 🧱 Overview of Architecture
296
-
297
- 1. **EventBridge Rule** watches for specific `AuthorizeSecurityGroupIngress`, etc.
298
- 2. **SNS Topic** receives those events
299
- 3. **Lambda Function** transforms event into MS Teams format and posts via webhook
300
-
301
- ---
302
-
303
- ### 🔧 Step-by-Step: Setup
304
-
305
- #### ✅ 1. Create EventBridge Rule
306
-
307
- ```json
308
- {
309
- "source": ["aws.ec2"],
310
- "detail-type": ["AWS API Call via CloudTrail"],
311
- "detail": {
312
- "eventName": [
313
- "AuthorizeSecurityGroupIngress",
314
- "RevokeSecurityGroupIngress",
315
- "AuthorizeSecurityGroupEgress",
316
- "RevokeSecurityGroupEgress"
317
- ],
318
- "requestParameters": { "groupId": ["sg-xxx"] }
319
- }
320
- }
321
- ```
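EventBridge matches nested event fields through nested objects in the pattern, and each leaf is a list of accepted values. To sanity-check a pattern against a sample event before deploying it, a simplified local matcher can help. The sketch below is an illustration only: `matches` is a hypothetical helper that handles exact-value matching and nesting, not the full EventBridge pattern language (prefix, numeric, or `anything-but` filters).

```python
def matches(pattern, event):
    """Simplified EventBridge-style matching: dicts recurse, lists mean 'any of'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event value must be an object that matches too.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Pattern equivalent to the rule above, written with nested objects.
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventName": [
            "AuthorizeSecurityGroupIngress",
            "RevokeSecurityGroupIngress",
            "AuthorizeSecurityGroupEgress",
            "RevokeSecurityGroupEgress",
        ],
        "requestParameters": {"groupId": ["sg-xxx"]},
    },
}
```

For an authoritative check, the EventBridge `TestEventPattern` API evaluates a pattern against an event using the real matching rules.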
322
-
323
- #### ✅ 2. Create SNS Topic (e.g., `SGChangeAlerts`)
324
-
325
- - In **SNS**, create a topic
326
- - Add Lambda (below) as a subscriber
327
-
328
- #### ✅ 3. Create MS Teams Incoming Webhook
329
-
330
- - In MS Teams:
331
- - Go to a channel
332
- - Choose “Connectors” → “Incoming Webhook”
333
- - Name it, copy the **Webhook URL**
334
-
335
- ---
336
-
337
- #### ✅ 4. Lambda Function to Post to MS Teams
338
-
339
- Use a simple Python Lambda (add your webhook URL):
340
-
341
- ```python
342
- import json
343
- import urllib3
344
-
345
- http = urllib3.PoolManager()
346
- teams_webhook_url = "<YOUR_MS_TEAMS_WEBHOOK_URL>"
347
-
348
- def lambda_handler(event, context):
349
-     for record in event['Records']:
349
-         message = json.loads(record['Sns']['Message'])
350
-         detail = message.get('detail', {})
351
-
352
-         sg_id = (detail.get('requestParameters') or {}).get('groupId', 'Unknown SG')
353
-         event_name = detail.get('eventName', 'UnknownEvent')
354
-         actor = detail.get('userIdentity', {}).get('arn', 'Unknown')
355
-         region = detail.get('awsRegion', 'Unknown')
356
-         source_ip = detail.get('sourceIPAddress', 'Unknown')
357
-
358
-         msg = {
359
-             "title": "⚠️ AWS Security Group Change Detected",
360
-             "text": f"**Event**: {event_name}\n**SG**: {sg_id}\n**User**: {actor}\n**Region**: {region}\n**Source IP**: {source_ip}"
361
-         }
362
-
363
-         http.request('POST', teams_webhook_url, body=json.dumps(msg), headers={'Content-Type': 'application/json'})
365
- ```
366
-
367
- - Attach basic execution policy (SNS invocation + logs)
368
- - Test by triggering a manual SG change
369
-
370
- ---
371
-
372
- ## 🔄 PART 5: Best Practices for Automation & Audit Maturity
373
-
374
- | Category | Best Practice |
375
- |----------|----------------|
376
- | **Logging** | Store CloudTrail in centralized, versioned, encrypted S3 bucket |
377
- | **Query** | Partition Athena tables by `year/month/day` for efficiency |
378
- | **Alerting** | Always include user identity, IP, region in alerts |
379
- | **Tagging** | Tag SGs with `Owner`, `Environment`, `ChangeTicket` |
380
- | **Governance** | Use SCPs and IAM boundaries to restrict unauthorized SG changes |
381
- | **Forensics** | Store Athena queries in a shared repo; automate with scheduled queries for weekly audits |
382
- | **Cost Optimization** | Use Amazon Athena scheduled queries + QuickSight dashboards instead of external tools |
383
-
384
- ---
385
-
386
- ## 🎯 Final Thoughts: Governance-Driven Cloud Security
387
-
388
- This solution provides:
389
- - **Immediate visibility** (EventBridge + Teams alerting)
390
- - **Historical traceability** (CloudTrail + Athena)
391
- - **Policy enforcement** (AWS Config + IAM/SCP)
392
- - **Audit readiness** (Tagging + documentation + centralized logs)
393
-
394
- By integrating these layers, you go beyond reaction and establish a **proactive, auditable, and secure cloud operating model**.
395
-
396
- ---
397
-
398
- Absolutely. Let's take our time and raise the bar.
399
-
400
- We are now enhancing the **Athena Query Suite** to support **real-time or near-real-time** analysis of **Critical Alerts** across:
401
-
402
- 1. 🔐 **Security**
403
- 2. 🌐 **Network**
404
- 3. 🏗️ **Infrastructure**
405
- 4. 💻 **EC2 Runtime**
406
-
407
- ---
408
-
409
- ## 🧠 Goal:
410
-
411
- Design and implement **production-ready Athena SQL queries**, with **partition projection**, **structured access**, and **cost-efficient filtering**, for all **critical alert types**. These queries should:
412
- - Align with **AWS best practices**
413
- - Be easy to integrate with **scheduled queries, dashboards, or alert pipelines**
414
- - Prioritize **precision, performance, and auditability**
415
-
416
- ---
417
-
418
- ## 🧩 Improvements Identified from Previous Athena Queries:
419
-
420
- | Area | Original | To-Be |
421
- |------|----------|-------|
422
- | Filtering | Broad or unpartitioned | Partitioned by `year`, `month`, `day`, `region` |
423
- | Identity Insight | Simple `userIdentity.arn` | Full `sessionContext.sessionIssuer.userName`, MFA check |
424
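- | Parameter Handling | `json_extract_scalar` | Structured `MAP` access (e.g. `requestParameters['groupId']`), which requires declaring `requestParameters` as `MAP<STRING,STRING>` rather than the `STRING` column used in the table definition above |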
425
- | Output Fields | Too generic | Specific fields like `eventName`, `caller`, `ip`, `action`, `resourceId` |
426
- | Extensibility | Hardcoded SG ID | Accepts any SG, port, or IP — makes it reusable |
427
-
428
- ---
429
-
430
- ## ✅ Let’s Now Write the Full Set of **Critical Athena Queries**
431
-
432
- ---
433
-
434
- ### 🔐 **1. Security Alerts**
435
-
436
- ---
437
-
438
- #### 🔸 A. Unauthorized API Calls (Brute Force, Exploitation Attempts)
439
-
440
- > UnauthorizedOperation.sql
441
-
442
- ```sql
443
- SELECT
444
- eventTime,
445
- userIdentity.arn AS user_arn,
446
- sourceIPAddress,
447
- awsRegion,
448
- eventName,
449
- errorCode
450
- FROM cloudtrail_logs
451
- WHERE errorCode IN ('AccessDenied', 'UnauthorizedOperation', 'Client.UnauthorizedOperation')
452
- AND year = '2025'
453
- AND month = '04'
454
- AND day BETWEEN '01' AND '08'
455
- ORDER BY eventTime DESC;
456
- ```
457
-
458
- > 📌 **Note**: `UnauthorizedOperation` and `AccessDenied` are `errorCode` values, not event names, so the filter is applied to `errorCode`. EC2 reports the former as `Client.UnauthorizedOperation`.
459
-
460
- ---
461
-
462
- #### 🔸 B. Root Account Usage
463
-
464
- > RootAccountUsage.sql
465
-
466
- ```sql
467
- SELECT
468
- eventTime,
469
- eventName,
470
- sourceIPAddress,
471
- userIdentity.arn AS user_arn,
472
- userAgent
473
- FROM cloudtrail_logs
474
- WHERE userIdentity.type = 'Root'
475
- AND year = '2025'
476
- AND month = '04'
477
- AND day BETWEEN '01' AND '08'
478
- ORDER BY eventTime DESC;
479
- ```
480
-
481
- > ✅ Use `userIdentity.type = 'Root'` — most accurate method to detect root usage across API calls.
482
-
483
- ---
484
-
485
- #### 🔸 C. Security Group Rule Changes
486
-
487
- > SecurityGroupRuleChanges.sql
488
-
489
- ```sql
490
- SELECT
491
- eventTime,
492
- userIdentity.arn AS user,
493
- eventName,
494
- requestParameters['groupId'] AS securityGroupId,
495
- requestParameters['ipPermissions'] AS modifiedPermissions,
496
- sourceIPAddress
497
- FROM cloudtrail_logs
498
- WHERE eventName IN (
499
- 'AuthorizeSecurityGroupIngress',
500
- 'RevokeSecurityGroupIngress',
501
- 'AuthorizeSecurityGroupEgress',
502
- 'RevokeSecurityGroupEgress',
503
- 'UpdateSecurityGroupRuleDescriptionsIngress',
504
- 'UpdateSecurityGroupRuleDescriptionsEgress'
505
- )
506
- AND year = '2025'
507
- AND month = '04'
508
- AND day BETWEEN '01' AND '08'
509
- ORDER BY eventTime DESC;
510
- ```
511
-
512
- ---
513
-
514
- #### 🔸 D. IAM Policy/Role Changes
515
-
516
- ```sql
517
- SELECT
518
- eventTime,
519
- eventName,
520
- userIdentity.arn AS user,
521
- requestParameters['roleName'] AS role,
522
- requestParameters['policyDocument'] AS newPolicy,
523
- sourceIPAddress
524
- FROM cloudtrail_logs
525
- WHERE eventName IN (
526
- 'PutRolePolicy', 'AttachRolePolicy', 'CreatePolicy',
527
- 'CreateRole', 'UpdateAssumeRolePolicy'
528
- )
529
- AND year = '2025'
530
- AND month = '04'
531
- AND day BETWEEN '01' AND '08'
532
- ORDER BY eventTime DESC;
533
- ```
534
-
535
- ---
536
-
537
- #### 🔸 E. Port 22/3389 Open to 0.0.0.0/0
538
-
539
- ```sql
540
- SELECT
541
- eventTime,
542
- userIdentity.arn AS user,
543
- requestParameters['groupId'] AS sg_id,
544
- requestParameters['ipPermissions'] AS permissions,
545
- sourceIPAddress
546
- FROM cloudtrail_logs
547
- WHERE eventName IN ('AuthorizeSecurityGroupIngress')
548
- AND requestParameters['ipPermissions'] LIKE '%0.0.0.0/0%'
549
- AND (
550
- requestParameters['ipPermissions'] LIKE '%22%' OR
551
- requestParameters['ipPermissions'] LIKE '%3389%'
552
- )
553
- AND year = '2025'
554
- AND month = '04'
555
- AND day BETWEEN '01' AND '08'
556
- ORDER BY eventTime DESC;
557
- ```
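The `LIKE '%22%'` filter above also matches unrelated values such as port `2222` or a CIDR like `10.22.0.0/16`, so its results are candidates rather than confirmed findings. A post-filter can parse the rule structure and compare port ranges exactly. The sketch below is hypothetical: `exposes_risky_port` is an illustrative name, and it assumes the CloudTrail `requestParameters` layout in which `ipPermissions` and `ipRanges` each wrap their entries in an `items` list, which should be confirmed against a real event.

```python
import json

RISKY_PORTS = (22, 3389)
OPEN_CIDR = "0.0.0.0/0"


def exposes_risky_port(request_parameters):
    """Return True if an ingress request opens port 22 or 3389 to 0.0.0.0/0."""
    if isinstance(request_parameters, str):
        params = json.loads(request_parameters)
    else:
        params = request_parameters
    for perm in params.get("ipPermissions", {}).get("items", []):
        cidrs = [r.get("cidrIp") for r in perm.get("ipRanges", {}).get("items", [])]
        if OPEN_CIDR not in cidrs:
            continue
        if str(perm.get("ipProtocol")) == "-1":
            return True  # protocol -1 means all protocols and all ports
        low, high = perm.get("fromPort"), perm.get("toPort")
        if low is None or high is None:
            continue
        # A rule exposes a risky port when the port falls inside its range.
        if any(low <= port <= high for port in RISKY_PORTS):
            return True
    return False
```

Running the Athena query first and applying this check to each row's `requestParameters` keeps the SQL simple while removing false positives.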
558
-
559
- ---
560
-
561
- ### 🌐 **2. Network Alerts**
562
-
563
- ---
564
-
565
- #### 🔸 A. NAT Gateway or Internet Gateway Failures (CloudTrail-Based)
566
-
567
- CloudTrail doesn't capture health-check failure natively. Use **CloudWatch logs + SNS** for real-time alerts. But you can **detect removal** of NAT/IGW:
568
-
569
- ```sql
570
- SELECT
571
- eventTime,
572
- eventName,
573
- userIdentity.arn AS user,
574
- COALESCE(requestParameters['natGatewayId'], requestParameters['internetGatewayId']) AS gateway_id,
575
- sourceIPAddress
576
- FROM cloudtrail_logs
577
- WHERE eventName IN ('DeleteNatGateway', 'DetachInternetGateway')
578
- AND year = '2025'
579
- AND month = '04'
580
- AND day BETWEEN '01' AND '08'
581
- ORDER BY eventTime DESC;
582
- ```
583
-
584
- ---
585
-
586
- #### 🔸 B. VPC Flow Logs — Suspicious Connections (JOIN with parsed Flow Logs)
587
-
588
- This requires **VPC Flow Logs** parsed via **Athena + Glue**. Sample query pattern:
589
-
590
- ```sql
591
- SELECT *
592
- FROM vpc_flow_logs_parquet
593
- WHERE dstaddr IN ('1.2.3.4', '8.8.8.8')
594
- AND dstport IN (22, 3389, 3306)
595
- AND action = 'ACCEPT'
596
- AND day BETWEEN '2025-04-01' AND '2025-04-08'
597
- ORDER BY start DESC;
598
- ```
599
-
600
- ---
601
-
602
- ### 🏗️ **3. Infrastructure Monitoring Alerts**
603
-
604
- > These are best handled via **CloudWatch alarms**, but some are detectable via Athena from CloudTrail as *resource state changes*.
605
-
606
- ---
607
-
608
- #### 🔸 A. Auto Scaling Group Launch Failures
609
-
610
- ```sql
611
- SELECT
612
- eventTime,
613
- userIdentity.arn,
614
- eventName,
615
- errorMessage,
616
- sourceIPAddress
617
- FROM cloudtrail_logs
618
- WHERE eventSource = 'autoscaling.amazonaws.com'
619
- AND eventName = 'CreateAutoScalingGroup'
620
- AND errorCode IS NOT NULL
621
- AND year = '2025'
622
- AND month = '04'
623
- ORDER BY eventTime DESC;
624
- ```
625
-
626
- ---
627
-
628
- #### 🔸 B. Backup Failure (AWS Backup)
629
-
630
- ```sql
631
- SELECT
632
- eventTime,
633
- eventName,
634
- userIdentity.arn,
635
- requestParameters['backupVaultName'],
636
- errorMessage
637
- FROM cloudtrail_logs
638
- WHERE eventSource = 'backup.amazonaws.com'
639
- AND eventName = 'StartBackupJob'
640
- AND errorCode IS NOT NULL
641
- AND year = '2025'
642
- AND month = '04'
643
- ORDER BY eventTime DESC;
644
- ```
645
-
646
- ---
647
-
648
- ### 💻 **4. EC2 Instance Alerts**
649
-
650
- ---
651
-
652
- #### 🔸 A. Unexpected Stop or Termination
653
-
654
- ```sql
655
- SELECT
656
- eventTime,
657
- userIdentity.arn,
658
- eventName,
659
- requestParameters['instanceId'],
660
- sourceIPAddress
661
- FROM cloudtrail_logs
662
- WHERE eventSource = 'ec2.amazonaws.com'
663
- AND eventName IN ('StopInstances', 'TerminateInstances')
664
- AND year = '2025'
665
- AND month = '04'
666
- ORDER BY eventTime DESC;
667
- ```
668
-
669
- ---
670
-
671
- #### 🔸 B. EC2 Status Check Failure (Indirect Detection)
672
-
673
- You can track `DescribeInstanceStatus` calls or hook into **CloudWatch Alarm**. Sample from CloudTrail:
674
-
675
- ```sql
676
- SELECT
677
- eventTime,
678
- userIdentity.arn,
679
- eventName,
680
- requestParameters['instanceId'],
681
- sourceIPAddress
682
- FROM cloudtrail_logs
683
- WHERE eventSource = 'ec2.amazonaws.com'
684
- AND eventName = 'DescribeInstanceStatus'
685
- AND year = '2025'
686
- AND month = '04'
687
- ORDER BY eventTime DESC;
688
- ```
689
-
690
- ---
691
-
692
- ## 🧱 Implementation Tips
693
-
694
- - Automate Athena queries via **Scheduled Queries (daily/hourly)**
695
- - Export results to **S3 + QuickSight** or **alert on non-empty results**
696
- - Pair with **EventBridge rules** for real-time alerts
697
- - Use **Lambda** to format and send alert messages to:
698
- - **MS Teams / Slack / Email**
699
- - Custom dashboards
700
-
701
- ---
702
-
703
- ## ✅ Summary Table of Enhanced Queries
704
-
705
- | Alert Type | Query Name | Detection Method |
706
- |------------|------------|------------------|
707
- | Security | Unauthorized API Calls | CloudTrail via Athena |
708
- | Security | Root Account Usage | `userIdentity.type = 'Root'` |
709
- | Security | SG Rule Changes | `eventName` in SG Ops |
710
- | Security | IAM Policy Changes | `PutRolePolicy`, etc. |
711
- | Security | Public SSH/RDP | `ipPermissions` contains `0.0.0.0/0` |
712
- | Network | NAT/IGW Delete | `DeleteNatGateway`, `DetachInternetGateway` |
713
- | Network | VPC Flow Anomaly | Join with VPC logs |
714
- | Infra | ASG Fail | CloudTrail errorCode on `CreateAutoScalingGroup` |
715
- | Infra | AWS Backup Fail | `StartBackupJob` with error |
716
- | EC2 | Unexpected Stop | `StopInstances`, `TerminateInstances` |
717
- | EC2 | Status Check Fail | `DescribeInstanceStatus` queries |
718
-
719
- ---
720
-
721
- > TODO: Let’s turn this into a full-scale, automated DevSecOps solution.
722
-
723
- - **Terraform/CDK to create scheduled Athena queries + alerts?**
724
- - A ready-made **QuickSight security dashboard?**
725
- - A **GitHub repo** for these SQL files + alert infrastructure?
726
-
727
- ---