@query-ai/digital-workers 1.0.0

@@ -0,0 +1,525 @@
1
+ # FSQL Investigation Cheat Sheet
2
+
3
+ > **This is a quick-reference supplement, not the complete syntax reference.** For full FSQL syntax (all ~40 observable types, SUMMARIZE/GROUP BY, set operations, depth modifiers), read the MCP resource `fsql://docs/syntax-reference` via `ReadMcpResourceTool`.
4
+
5
+ ## Query Structure
6
+
7
+ ```
8
+ QUERY <fields> [WITH <filters>] [BEFORE <end>] [AFTER <start>] [FROM <connectors>]
9
+ ```
10
+
11
+ Alternative (filter-first):
12
+ ```
13
+ QUERY <filter> [SHOW <fields>] [BEFORE <end>] [AFTER <start>] [FROM <connectors>]
14
+ ```
15
+
16
+ ## Selectors
17
+
18
+ | Selector | Description | Example |
19
+ |----------|-------------|---------|
20
+ | `.` | Direct path | `authentication.user.username` |
21
+ | `*` | All fields at level | `authentication.*` |
22
+ | `**` | All fields recursively | `authentication.**` |
23
+ | `#` | Category (all events in category) | `#network.*` |
24
+ | `%` | Observable (searches all matching fields) | `%ip`, `%hash`, `%email`, `%username`, `%domain` |
25
+ | `@` | Type filter | `@ip`, `@user` |
26
+
27
+ ## Operators
28
+
29
+ | Operator | Aliases | Description |
30
+ |----------|---------|-------------|
31
+ | `=` | | Equals |
32
+ | `==` | | Case-insensitive equals |
33
+ | `!=` | | Not equals |
34
+ | `CONTAINS` | `~` | Contains substring |
35
+ | `ICONTAINS` | `~~` | Case-insensitive contains |
36
+ | `STARTSWITH` | `^=` | Starts with |
37
+ | `ENDSWITH` | `$=` | Ends with |
38
+ | `IN` | | In list (comma-separated) |
39
+ | `<`, `>`, `<=`, `>=` | | Numeric comparison |
40
+ | `empty` | | Field is null/empty |
41
+ | `ANY` | | Array quantifier: any element matches |
42
+ | `ALL` | | Array quantifier: all elements match |
43
+
44
+ ## Combining Filters
45
+
46
+ - `AND` — both conditions must match
47
+ - `OR` — at least one must match
48
+ - Parentheses for grouping: `(A OR B) AND C`
49
+
50
+ ## Time Ranges
51
+
52
+ - `AFTER 24h`: window starts 24 hours ago (events from the last 24 hours)
52
+ - `BEFORE 12h`: window ends 12 hours ago (events older than 12 hours)
54
+ - Units: `h/hr/hrs`, `d/day/days`, `w/week/weeks`, `m/month/months`
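A minimal sketch of how these relative windows can be read, assuming `AFTER` sets the window start and `BEFORE` sets the window end (as the `<start>` and `<end>` placeholders in Query Structure suggest). The helper names and the 30-day month are illustrative assumptions, not part of FSQL.

```python
import re
from datetime import datetime, timedelta, timezone

# Unit aliases from the list above, expressed in hours.
# Assumption: a month is treated as 30 days for illustration only.
UNIT_HOURS = {
    "h": 1, "hr": 1, "hrs": 1,
    "d": 24, "day": 24, "days": 24,
    "w": 168, "week": 168, "weeks": 168,
    "m": 720, "month": 720, "months": 720,
}


def parse_offset(text):
    """Turn a relative value such as '24h' or '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([a-z]+)", text.strip().lower())
    if match is None or match.group(2) not in UNIT_HOURS:
        raise ValueError(f"unrecognized time offset: {text!r}")
    return timedelta(hours=int(match.group(1)) * UNIT_HOURS[match.group(2)])


def resolve_window(now, after=None, before=None):
    """Return (start, end); start is None when no AFTER value is given."""
    start = now - parse_offset(after) if after else None
    end = now - parse_offset(before) if before else now
    return start, end


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
start, end = resolve_window(now, after="24h", before="12h")
print(start.isoformat(), end.isoformat())
```

Combining both clauses, as in `AFTER 24h BEFORE 12h`, yields a window that covers the 12 hours between the two offsets.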
55
+
56
+ ## Data Sources
57
+
58
+ ```
59
+ FROM 'Crowdstrike Falcon', 'AWS CloudTrail', 'Splunk'
60
+ ```
61
+
62
+ ## EXPLAIN Commands (Schema Discovery)
63
+
64
+ ```
65
+ EXPLAIN ATTRIBUTES network_activity.%ip -- List all IP fields in network_activity
66
+ EXPLAIN SCHEMA network_activity.proxy.%ip -- Schema details for matching fields
67
+ EXPLAIN GRAPHQL QUERY ... -- Translate FSQL to GraphQL (validation)
68
+ ```
69
+
70
+ ## Key OCSF Event Types
71
+
72
+ | Category | Events | Use For |
73
+ |----------|--------|---------|
74
+ | Findings | `detection_finding`, `security_finding`, `vulnerability_finding` | Alerts, detections |
75
+ | Identity | `authentication`, `account_change`, `authorize_session` | Auth analysis |
76
+ | Network | `network_activity`, `http_activity`, `dns_activity`, `ssh_activity`, `email_activity` | Network investigation |
77
+ | System | `process_activity`, `file_activity`, `module_activity` | Endpoint investigation |
78
+ | Application | `api_activity`, `web_resource_activity` | Cloud/app investigation |
79
+
80
+ ## Investigation Query Patterns
81
+
82
+ ### Discovery Scans (Layer 1a — always start here)
83
+
84
+ Select only `*.message, *.time` to find which event types hold data for your IOCs without pulling full events, which overflow. The `__event` field in the results tells you the event type.
85
+
86
+ **BATCH same-type IOCs using `IN` — one query per IOC type, not one per IOC:**
87
+
88
+ ```
89
+ -- Discover where multiple IPs appear (1 query instead of 5)
90
+ QUERY *.message, *.time WITH %ip IN '10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4', '10.0.0.5' AFTER 7d
91
+
92
+ -- Discover where multiple hashes appear
93
+ QUERY *.message, *.time WITH %hash IN '44d88612fea8a8f36de82e1278abb02f', 'abc123def456' AFTER 7d
94
+
95
+ -- Discover where multiple usernames appear
96
+ QUERY *.message, *.time WITH %username IN 'jsmith', 'jdoe', 'admin' AFTER 7d
97
+
98
+ -- Single IOC is fine too
99
+ QUERY *.message, *.time WITH %domain = 'evil.com' AFTER 7d
100
+ ```
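The batching rule above can be sketched as a small helper that groups IOCs by observable type and emits one discovery query per type. The function name and the `(type, value)` input shape are assumptions for illustration. Values containing a single quote are rejected because this sheet does not show how FSQL escapes them.

```python
from collections import defaultdict


def build_discovery_queries(iocs, lookback="7d"):
    """Build one Layer 1a discovery query per observable type.

    iocs: iterable of (observable_type, value) pairs, e.g. ("ip", "10.0.0.1").
    """
    grouped = defaultdict(list)
    for obs_type, value in iocs:
        if "'" in value:
            raise ValueError(f"cannot quote value safely: {value!r}")
        if value not in grouped[obs_type]:  # drop duplicates, keep order
            grouped[obs_type].append(value)

    queries = []
    for obs_type, values in grouped.items():
        quoted = ", ".join(f"'{value}'" for value in values)
        # A single IOC uses '='; several of the same type are batched with IN.
        operator = "=" if len(values) == 1 else "IN"
        queries.append(
            f"QUERY *.message, *.time WITH %{obs_type} {operator} {quoted} "
            f"AFTER {lookback}"
        )
    return queries


for query in build_discovery_queries(
    [("ip", "10.0.0.1"), ("ip", "10.0.0.2"), ("domain", "evil.com")]
):
    print(query)
```

Three IOCs of two types produce two queries rather than three, which is the saving the rule is after.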
101
+
102
+ **NEVER use `QUERY ** WITH %observable` or bare `QUERY %observable` — these return full OCSF events and will overflow.**
103
+
104
+ ### Aggregation Patterns (SUMMARIZE — after discovery)
105
+
106
+ Use SUMMARIZE when you need counts or distributions instead of individual events. Run it only after Layer 1a/1b has identified what you're looking at. Pass SUMMARIZE queries to `Validate_FSQL_Query` the same way as QUERY (the tool prepends `VALIDATE` automatically, so do NOT include `VALIDATE` in the query string). All fields must reference the same OCSF event class.
107
+
108
+ **Execution constraints:** Filtering on `status_id = NEW` fails on `detection_finding` under SUMMARIZE (use `GROUP BY status_id` instead). The `FROM` clause is not supported. High-cardinality GROUP BY (IPs, hashes) can overflow, so scope it with a WITH filter. SUMMARIZE execution fails for `email_activity` and `file_activity`. If SUMMARIZE returns empty, fall back to QUERY. Check `summarize_support` in the environment profile before querying.
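As a rough pre-flight check, the constraints above can be sketched as a string linter run before a SUMMARIZE query is submitted. This is a hypothetical helper based on simple pattern matching, not an FSQL parser and not part of any tool named in this sheet.

```python
import re

# Event classes the constraints above list as failing under SUMMARIZE.
FAILING_CLASSES = {"email_activity", "file_activity"}


def summarize_issues(query):
    """Return a list of constraint violations found in a SUMMARIZE string."""
    issues = []
    # Drop quoted literals so words inside values are not read as keywords.
    bare = re.sub(r"'[^']*'", "''", query)

    if re.search(r"\bVALIDATE\b", bare, re.IGNORECASE):
        issues.append("remove VALIDATE; the tool prepends it")
    if re.search(r"\bFROM\b", bare, re.IGNORECASE):
        issues.append("FROM clause is not supported in SUMMARIZE")

    # Dotted paths such as detection_finding.status_id; first segment = class.
    paths = re.findall(r"[A-Za-z_]\w*(?:\.\w+)+", bare)
    classes = sorted({path.split(".")[0] for path in paths})
    if len(classes) > 1:
        issues.append("fields span several event classes: " + ", ".join(classes))
    for name in classes:
        if name in FAILING_CLASSES:
            issues.append(f"SUMMARIZE execution fails for {name}")

    if "detection_finding" in classes and re.search(
        r"status_id\s*=\s*NEW\b", bare, re.IGNORECASE
    ):
        issues.append("filter on status_id = NEW fails; GROUP BY status_id instead")
    return issues


print(summarize_issues(
    "SUMMARIZE COUNT email_activity.message "
    "GROUP BY authentication.user.uid FROM 'Splunk'"
))
```

An empty list means none of the listed constraints were tripped. It does not mean the query is valid, so validation through `Validate_FSQL_Query` still applies.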
109
+
110
+ **Alert triage — severity and status distribution:**
111
+ ```
112
+ -- How many alerts by type? (GROUP BY status_id to separate NEW from RESOLVED in results)
113
+ SUMMARIZE COUNT detection_finding.message GROUP BY detection_finding.message, detection_finding.status_id
114
+ AFTER 24h
115
+
116
+ -- Status distribution (are these NEW or already RESOLVED?)
117
+ SUMMARIZE COUNT detection_finding.status_id GROUP BY detection_finding.status_id
118
+ WITH detection_finding.severity_id IN HIGH, CRITICAL AFTER 7d
119
+
120
+ -- Severity breakdown with status
121
+ SUMMARIZE COUNT detection_finding.severity_id GROUP BY detection_finding.severity_id, detection_finding.status_id
122
+ AFTER 24h
123
+ ```
124
+
125
+ **Scope assessment — how many hosts/users/IPs are affected?**
126
+ ```
127
+ -- Per-host alert count with status (filter NEW in results, not in query)
128
+ SUMMARIZE COUNT detection_finding.message GROUP BY detection_finding.device.hostname, detection_finding.status_id
129
+ WITH detection_finding.severity_id IN HIGH, CRITICAL AFTER 24h
130
+
131
+ -- Per-host alert count over wider window
132
+ SUMMARIZE COUNT detection_finding.message GROUP BY detection_finding.device.hostname, detection_finding.status_id
133
+ AFTER 7d
134
+
135
+ -- Unique users in authentication events
136
+ SUMMARIZE COUNT DISTINCT authentication.actor.user.email_addr
137
+ WITH authentication.status_id = FAILURE AFTER 24h
138
+ ```
139
+
140
+ **Identity investigation — auth failure distribution:**
141
+ ```
142
+ -- Failure count by source IP (spray detection)
143
+ SUMMARIZE COUNT authentication.user.uid GROUP BY authentication.src_endpoint.ip
144
+ WITH authentication.status_id = FAILURE AFTER 24h
145
+
146
+ -- Distinct users per source IP (credential stuffing signal)
147
+ SUMMARIZE COUNT DISTINCT authentication.actor.user.email_addr
148
+ GROUP BY authentication.src_endpoint.ip
149
+ WITH authentication.status_id = FAILURE AFTER 24h
150
+
151
+ -- Distinct IPs per user (impossible travel signal)
152
+ SUMMARIZE COUNT DISTINCT authentication.device.ip
153
+ GROUP BY authentication.actor.user.email_addr
154
+ WITH authentication.status_id = SUCCESS AFTER 24h
155
+ ```
156
+
157
+ **Network investigation — connection volume and port distribution:**
158
+ ```
159
+ -- Outbound connection count by source (scanning detection)
160
+ SUMMARIZE COUNT network_activity.message GROUP BY network_activity.src_endpoint.ip
161
+ WITH network_activity.src_endpoint.ip IN '10.0.0.1', '10.0.0.2' AFTER 7d
162
+
163
+ -- Unique destination ports (port scan breadth)
164
+ SUMMARIZE COUNT DISTINCT network_activity.dst_endpoint.port
165
+ WITH network_activity.src_endpoint.ip = '10.0.0.1' AFTER 7d
166
+ ```
167
+
168
+ **False positive verification:**
169
+ ```
170
+ -- Confirm all instances of an alert are resolved
171
+ SUMMARIZE COUNT detection_finding.status_id GROUP BY detection_finding.status_id
172
+ WITH detection_finding.message = 'Yttrium Actor activity detected' AFTER 7d
173
+ -- If result shows 100% RESOLVED → confirmed false positive pattern
174
+ ```
175
+
176
+ **Hash/observable distribution across event types:**
177
+ ```
178
+ -- Which event types have hits for a hash? (complements Layer 1a discovery)
179
+ -- Note: Use Layer 1a (*.message, *.time) first. Use SUMMARIZE to quantify.
180
+ SUMMARIZE COUNT detection_finding.message GROUP BY detection_finding.message
181
+ WITH %hash = 'f6c3023f' AFTER 7d
182
+ ```
183
+
184
+ ### Targeted Follow-Ups (Layer 1b — after discovery)
185
+
186
+ Once you know which event types have hits, query specific fields:
187
+
188
+ ```
189
+ -- Pull new high/critical alerts (ALWAYS use status_id = NEW)
190
+ QUERY detection_finding.message, detection_finding.severity_id, detection_finding.status_id,
191
+ detection_finding.time, detection_finding.observables, detection_finding.attacks
192
+ WITH detection_finding.severity_id IN HIGH, CRITICAL, FATAL
193
+ AND detection_finding.status_id = NEW AFTER 24h
194
+
195
+ -- Auth failures for a user
196
+ QUERY authentication.message, authentication.time, authentication.status_id,
197
+ authentication.src_endpoint.ip, authentication.user.username
198
+ WITH authentication.user.username = 'jsmith' AND authentication.status_id = FAILURE AFTER 7d
199
+
200
+ -- Email activity for phishing investigation
201
+ QUERY email_activity.message, email_activity.time, email_activity.actor.user.email_addr,
202
+ email_activity.email.subject
203
+ WITH %email = 'suspect@evil.com' AFTER 7d
204
+
205
+ -- Process activity (suspicious execution)
206
+ QUERY process_activity.message, process_activity.time, process_activity.device.hostname,
207
+ process_activity.process.cmd_line
208
+ WITH process_activity.process.cmd_line CONTAINS 'powershell'
209
+ AND process_activity.process.cmd_line CONTAINS 'hidden' AFTER 24h
210
+ ```
211
+
212
+ ### Per-Host Deep Dives (priority hosts only)
213
+
214
+ After identifying the 2-5 most suspicious hosts, run per-host queries with wider lookback:
215
+
216
+ ```
217
+ -- Detections on a single host (7d lookback for multi-day patterns)
218
+ QUERY detection_finding.message, detection_finding.severity_id, detection_finding.status_id,
219
+ detection_finding.time, detection_finding.attacks
220
+ WITH detection_finding.device.hostname = 'BD-2578' AFTER 7d
221
+
222
+ -- Full process activity on a single host (scoped — ** is OK here)
223
+ QUERY process_activity.** WITH process_activity.device.hostname = 'BD-2578' AFTER 24h
224
+
225
+ -- All network activity from a single host
226
+ QUERY #network.src_endpoint.ip, #network.dst_endpoint.ip, #network.dst_endpoint.port,
227
+ #network.message, #network.time
228
+ WITH #network.src_endpoint.hostname = 'BD-2578' AFTER 48h
229
+ ```
230
+
231
+ ---
232
+
233
+ ## Pivoting from Findings to Telemetry
234
+
235
+ **The most common investigation mistake is staying in `detection_finding` for the entire investigation.** Detection findings tell you *what an alert fired on*. Telemetry event types tell you *what actually happened*. You need both.
236
+
237
+ ### When to Pivot
238
+
239
+ After Gate 1 (intake) or Gate 2 (enrichment), once you know which hosts and IOCs are involved, **always query the underlying telemetry**. The pivot depends on what the alert is about:
240
+
241
+ | Alert Type | Pivot To | Why |
242
+ |------------|----------|-----|
243
+ | Suspicious process/script execution | `process_activity` | See the actual command lines, parent processes, execution chains |
244
+ | Malware/file-based alert | `file_activity` | See file creation, modification, drops on the host |
245
+ | Network/C2/lateral movement | `network_activity`, `dns_activity`, `http_activity` | See actual traffic flows, DNS lookups, HTTP connections |
246
+ | Identity/auth anomaly | `authentication` | See login patterns, source IPs, success/failure over time |
247
+ | Email/phishing | `email_activity` | See delivery chain, recipients, attachment metadata |
248
+ | Cloud/API abuse | `api_activity` | See actual API calls, who made them, from where |
249
+ | DLL sideloading/injection | `module_activity` | See DLL loads, image load events |
250
+ | Registry persistence | `evidence_info` | See registry key changes (from XDR connectors) |
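The pivot table can be sketched as a lookup that turns an alert family into host-scoped telemetry queries. The short family keys and the function name are hypothetical labels. Only event types that this sheet's examples scope by `device.hostname` get a generated query; the others need a different pivot field (IP, username, email) and are skipped here.

```python
# Alert family -> telemetry event types, copied from the pivot table above.
# The short keys ("process", "file", ...) are hypothetical labels.
PIVOT_TARGETS = {
    "process": ["process_activity"],
    "file": ["file_activity"],
    "network": ["network_activity", "dns_activity", "http_activity"],
    "identity": ["authentication"],
    "email": ["email_activity"],
    "cloud": ["api_activity"],
    "module": ["module_activity"],
    "registry": ["evidence_info"],
}

# Event types that the examples in this sheet filter by device.hostname.
HOST_SCOPED = {"process_activity", "file_activity", "module_activity", "evidence_info"}


def pivot_queries(alert_family, hostname, lookback="24h"):
    """Return host-scoped pivot queries for one alert family."""
    queries = []
    for event_type in PIVOT_TARGETS[alert_family]:
        if event_type not in HOST_SCOPED:
            continue  # pivot on IP, username, or email instead
        queries.append(
            f"QUERY {event_type}.message, {event_type}.time "
            f"WITH {event_type}.device.hostname = '{hostname}' AFTER {lookback}"
        )
    return queries


print(pivot_queries("process", "BD-3263"))
```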
251
+
252
+ ### The Pivot Pattern
253
+
254
+ ```
255
+ -- Step 1: You have a detection finding about PowerShell on BD-3263
256
+ -- (from Gate 1 intake)
257
+
258
+ -- Step 2: PIVOT — query process_activity for actual PS execution on that host
259
+ QUERY process_activity.message, process_activity.time,
260
+ process_activity.process.name, process_activity.process.cmd_line,
261
+ process_activity.actor.process.name, process_activity.device.hostname
262
+ WITH process_activity.device.hostname = 'BD-3263'
263
+ AND %process_name = 'powershell.exe' AFTER 7d
264
+
265
+ -- Step 3: EXPAND — what else ran on that host around the same time?
266
+ QUERY process_activity.message, process_activity.time,
267
+ process_activity.process.name, process_activity.process.cmd_line,
268
+ process_activity.actor.process.name
269
+ WITH process_activity.device.hostname = 'BD-3263' AFTER 24h
270
+
271
+ -- Step 4: CORRELATE — check file drops too
272
+ QUERY file_activity.message, file_activity.time,
273
+ file_activity.file.name, file_activity.file.path,
274
+ file_activity.device.hostname
275
+ WITH file_activity.device.hostname = 'BD-3263' AFTER 24h
276
+ ```
277
+
278
+ ---
279
+
280
+ ## Event-Type Query Patterns by Investigation Need
281
+
282
+ ### Endpoint / Process Investigation
283
+
284
+ **When to use:** Any alert involving process execution, script activity, suspicious commands, fileless attacks, LOLBins.
285
+
286
+ **Available data:** 4 connectors provide `process_activity` (including CarbonBlack, XDR, Kalibr SYSLOG).
287
+
288
+ **Key observables:** `%process_name`, `%command_line`, `%script_content`, `%file_name`
289
+
290
+ ```
291
+ -- Process execution by name on a host
292
+ QUERY process_activity.message, process_activity.time,
293
+ process_activity.process.name, process_activity.process.cmd_line,
294
+ process_activity.actor.process.name, process_activity.device.hostname
295
+ WITH process_activity.device.hostname = 'BD-3263'
296
+ AND %process_name = 'powershell.exe' AFTER 7d
297
+
298
+ -- Suspicious command line patterns (encoded commands, download cradles)
299
+ QUERY process_activity.message, process_activity.time,
300
+ process_activity.process.cmd_line, process_activity.device.hostname
301
+ WITH %command_line ICONTAINS '-encodedcommand' AFTER 24h
302
+
303
+ QUERY process_activity.message, process_activity.time,
304
+ process_activity.process.cmd_line, process_activity.device.hostname
305
+ WITH %command_line ICONTAINS 'downloadstring' AFTER 24h
306
+
307
+ -- Who launched what? (parent process analysis)
308
+ QUERY process_activity.message, process_activity.time,
309
+ process_activity.process.name, process_activity.process.cmd_line,
310
+ process_activity.actor.process.name
311
+ WITH process_activity.device.hostname = 'BD-3263'
312
+ AND process_activity.actor.process.name = 'cmd.exe' AFTER 24h
313
+
314
+ -- All process activity on a host (scoped — ** OK for single host)
315
+ QUERY process_activity.** WITH process_activity.device.hostname = 'BD-3263' AFTER 24h
316
+ ```
317
+
318
+ **File activity (file drops, malware staging):**
319
+
320
+ Available data: 2 connectors (CarbonBlack, XDR).
321
+
322
+ ```
323
+ -- Files created/modified on a host
324
+ QUERY file_activity.message, file_activity.time,
325
+ file_activity.file.name, file_activity.file.path,
326
+ file_activity.device.hostname
327
+ WITH file_activity.device.hostname = 'BD-3263' AFTER 24h
328
+
329
+ -- Search for a specific file by name
330
+ QUERY file_activity.message, file_activity.time,
331
+ file_activity.file.name, file_activity.file.path,
332
+ file_activity.device.hostname
333
+ WITH %file_name ICONTAINS 'ContentServer.exe' AFTER 7d
334
+
335
+ -- Search for a file by hash
336
+ QUERY file_activity.message, file_activity.time,
337
+ file_activity.file.name, file_activity.device.hostname
338
+ WITH %hash = 'e7fc03267e47814e23e004e5f3a1205b' AFTER 7d
339
+ ```
340
+
341
+ **Module/DLL activity (sideloading, injection):**
342
+
343
+ Available data: 1 connector (XDR DevImgLoad).
344
+
345
+ ```
346
+ QUERY module_activity.message, module_activity.time,
347
+ module_activity.module.file.name, module_activity.device.hostname
348
+ WITH module_activity.device.hostname = 'BD-3263' AFTER 24h
349
+ ```
350
+
351
+ **Registry activity (persistence, IFEO, AppCertDLLs):**
352
+
353
+ Available data: 1 connector (XDR DevRegEvts) via `evidence_info`.
354
+
355
+ ```
356
+ QUERY evidence_info.message, evidence_info.time,
357
+ evidence_info.device.hostname
358
+ WITH evidence_info.device.hostname = 'BD-3263' AFTER 7d
359
+ ```
360
+
361
+ ### Identity / Authentication Investigation
362
+
363
+ **When to use:** Unfamiliar sign-in, brute force, account compromise, privilege escalation alerts.
364
+
365
+ **Available data:** 5+ connectors (Entra ID, Okta, device logon events).
366
+
367
+ **Key observables:** `%username`, `%email`, `%ip`
368
+
369
+ ```
370
+ -- Login history for a user (success and failure)
371
+ QUERY authentication.message, authentication.time, authentication.status_id,
372
+ authentication.src_endpoint.ip, authentication.user.username,
373
+ authentication.http_request.user_agent
374
+ WITH authentication.user.username = 'jsmith' AFTER 7d
375
+
376
+ -- Failed logins only (brute force detection)
377
+ QUERY authentication.message, authentication.time, authentication.status_id,
378
+ authentication.src_endpoint.ip, authentication.user.username
379
+ WITH authentication.user.username = 'jsmith'
380
+ AND authentication.status_id = FAILURE AFTER 7d
381
+
382
+ -- All logins from a suspicious IP
383
+ QUERY authentication.message, authentication.time,
384
+ authentication.user.username, authentication.status_id
385
+ WITH %ip = '136.179.10.135' AFTER 7d
386
+
387
+ -- Account changes (privilege escalation, group membership)
388
+ QUERY account_change.message, account_change.time,
389
+ account_change.user.username, account_change.type_name
390
+ WITH %username = 'jsmith' AFTER 7d
391
+ ```
392
+
393
+ ### Network Investigation
394
+
395
+ **When to use:** C2 communication, lateral movement, port scanning, data exfiltration alerts.
396
+
397
+ **Available data:** `network_activity` (2 connectors: VPC Flow, IPS/IDS), `dns_activity` (1: Route53), `http_activity` (4, including WAF, URL filtering, Cribl).
398
+
399
+ **Key observables:** `%ip`, `%domain`, `%url`, `%http_user_agent`
400
+
401
+ ```
402
+ -- Network flows from/to a suspicious IP
403
+ QUERY network_activity.message, network_activity.time,
404
+ network_activity.src_endpoint.ip, network_activity.dst_endpoint.ip,
405
+ network_activity.dst_endpoint.port, network_activity.traffic.bytes_in,
406
+ network_activity.traffic.bytes_out
407
+ WITH %ip = '10.100.21.239' AFTER 7d
408
+
409
+ -- Using category selector for all network event types at once
410
+ QUERY #network.src_endpoint.ip, #network.dst_endpoint.ip,
411
+ #network.dst_endpoint.port, #network.message, #network.time
412
+ WITH #network.src_endpoint.ip = '172.16.16.58' AFTER 48h
413
+
414
+ -- DNS lookups from a host (beaconing, C2 domains)
415
+ QUERY dns_activity.message, dns_activity.time,
416
+ dns_activity.query.hostname, dns_activity.src_endpoint.ip
417
+ WITH dns_activity.src_endpoint.ip = '172.16.16.58' AFTER 7d
418
+
419
+ -- DNS lookups for a specific domain
420
+ QUERY dns_activity.message, dns_activity.time,
421
+ dns_activity.query.hostname, dns_activity.src_endpoint.ip
422
+ WITH %domain ICONTAINS 'evil.com' AFTER 7d
423
+
424
+ -- HTTP activity (C2 callbacks, download URLs)
425
+ QUERY http_activity.message, http_activity.time,
426
+ http_activity.src_endpoint.ip, http_activity.dst_endpoint.ip,
427
+ http_activity.http_request.url.text
428
+ WITH %ip = '52.39.83.27' AFTER 7d
429
+
430
+ -- Suspicious user agents
431
+ QUERY http_activity.message, http_activity.time,
432
+ http_activity.http_request.user_agent, http_activity.src_endpoint.ip
433
+ WITH %http_user_agent ICONTAINS 'python-requests' AFTER 24h
434
+ ```
435
+
436
+ ### Email / Phishing Investigation
437
+
438
+ **When to use:** Phishing delivery, malicious attachment, BEC alerts.
439
+
440
+ **Available data:** 3+ connectors (Proofpoint, O365 Email Security).
441
+
442
+ **Key observables:** `%email`, `%hash`, `%domain`
443
+
444
+ ```
445
+ -- Emails with a specific attachment hash
446
+ QUERY email_activity.message, email_activity.time,
447
+ email_activity.actor.user.email_addr, email_activity.email.subject
448
+ WITH %hash = 'e7fc03267e47814e23e004e5f3a1205b' AFTER 7d
449
+
450
+ -- Emails from a sender
451
+ QUERY email_activity.message, email_activity.time,
452
+ email_activity.actor.user.email_addr, email_activity.email.subject
453
+ WITH %email = 'attacker@evil.com' AFTER 7d
454
+
455
+ -- Emails to a specific recipient
456
+ QUERY email_activity.message, email_activity.time,
457
+ email_activity.actor.user.email_addr, email_activity.email.subject
458
+ WITH %email = 'carolyn.carter@directory.query.ai' AFTER 7d
459
+ ```
460
+
461
+ ### Cloud / API Investigation
462
+
463
+ **When to use:** Unusual API calls, cloud resource access, IAM changes, container activity.
464
+
465
+ **Available data:** 5+ connectors (CloudTrail, EKS Audit, Azure Activity).
466
+
467
+ **Key observables:** `%username`, `%ip`
468
+
469
+ ```
470
+ -- API calls from a specific user
471
+ QUERY api_activity.message, api_activity.time,
472
+ api_activity.actor.user.username, api_activity.src_endpoint.ip
473
+ WITH %username = 'admin-service-account' AFTER 7d
474
+
475
+ -- API calls from a suspicious IP
476
+ QUERY api_activity.message, api_activity.time,
477
+ api_activity.actor.user.username, api_activity.src_endpoint.ip
478
+ WITH %ip = '136.179.10.135' AFTER 7d
479
+ ```
480
+
481
+ ### Enrichment / OSINT Lookup
482
+
483
+ **When to use:** Checking IOC reputation — hashes, IPs, domains.
484
+
485
+ **Available data:** 10+ OSINT connectors (VirusTotal, AlienVault, AbuseIPDB, Shodan, etc.)
486
+
487
+ ```
488
+ -- Hash reputation lookup
489
+ QUERY osint_inventory_info.message, osint_inventory_info.time
490
+ WITH %hash = 'e7fc03267e47814e23e004e5f3a1205b' AFTER 30d
491
+
492
+ -- IP reputation lookup
493
+ QUERY osint_inventory_info.message, osint_inventory_info.time
494
+ WITH %ip = '136.179.10.135' AFTER 30d
495
+
496
+ -- Domain reputation lookup
497
+ QUERY osint_inventory_info.message, osint_inventory_info.time
498
+ WITH %domain = 'evil.com' AFTER 30d
499
+ ```
500
+
501
+ ---
502
+
503
+ ## Observable Quick Reference
504
+
505
+ Use these `%` observables for cross-event-type searches. They search all matching fields across all event types automatically.
506
+
507
+ | Observable | Matches | Best For |
508
+ |------------|---------|----------|
509
+ | `%ip` | All IP fields | Network IOCs, source tracking |
510
+ | `%hash` | All hash fields (MD5, SHA1, SHA256) | Malware, file reputation |
511
+ | `%domain` | All domain/hostname fields | C2, DNS, phishing domains |
512
+ | `%email` | All email address fields | Phishing, identity correlation |
513
+ | `%username` | All username fields | Identity investigation |
514
+ | `%process_name` | All process name fields | Endpoint, malware execution |
515
+ | `%command_line` | All command line fields | Script/LOLBin analysis |
516
+ | `%script_content` | All script content fields | Fileless attack analysis |
517
+ | `%file_name` | All file name fields | Malware drops, staging |
518
+ | `%file_path` | All file path fields | File location analysis |
519
+ | `%url` | All URL fields | C2 callbacks, downloads |
520
+ | `%http_user_agent` | All user-agent fields | Bot/tool identification |
521
+ | `%registry_key_path` | All registry key fields | Persistence mechanisms |
522
+ | `%mac` | All MAC address fields | Device tracking |
523
+ | `%port` | All port fields | Network services |
524
+
525
+ ---
@@ -0,0 +1,150 @@
1
+ ---
2
+ name: hunt-pattern-analyzer
3
+ description: Use when hunt query results need classification — determines if findings represent active threats, historical threats, suspicious patterns, coverage gaps, or clean results
4
+ ---
5
+
6
+ # Hunt Pattern Analyzer
7
+
8
+ ## Iron Law
9
+
10
+ **ABSENCE OF EVIDENCE IS NOT EVIDENCE OF ABSENCE.**
11
+
12
+ A clean result does not mean the environment is safe. It means the hypothesis was tested against available data and no evidence was found. The difference is critical — document what was tested, at what confidence, and what data was unavailable.
13
+
14
+ ## When to Invoke
15
+
16
+ Called by the `threat-hunt` orchestrator at Phase 3, after the hunt investigation queries complete.
17
+
18
+ ## Process
19
+
20
+ ### Step 1: Review Hunt Queries
21
+
22
+ Review the hunt's `queries.md`: every query executed and every result. Understand what was asked, what came back, and what returned empty.
23
+
24
+ ### Step 2: Review Data Availability
25
+
26
+ Review the data availability map from Phase 1. Know which connectors, event types, and fields were available and which were not. This defines the ceiling of hunt confidence.
27
+
28
+ ### Step 3: Classify Each Finding
29
+
30
+ For each finding or query result set, classify using the following table:
31
+
32
+ | Type | Meaning | Action |
33
+ |------|---------|--------|
34
+ | Active Threat | Ongoing malicious activity | Hand off to alert-investigation immediately |
35
+ | Historical Threat | Past activity, now dormant | Document, generate detections, recommend forensic review |
36
+ | Suspicious Pattern | Anomalous but inconclusive | Document, recommend monitoring |
37
+ | Coverage Gap | Data needed doesn't exist | Document gap + impact, recommend data source |
38
+ | Clean | Hypothesis tested, no evidence | Document what was tested, note confidence |
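One way to keep the Clean versus Coverage Gap distinction mechanical is sketched below. The five types and their actions come from the table; the decision inputs (`data_available`, `hit_count`, `confirmed_malicious`, `ongoing`) are assumptions about what an analyst has already judged, not fields defined by this skill.

```python
from enum import Enum


class Classification(Enum):
    ACTIVE_THREAT = "Active Threat"
    HISTORICAL_THREAT = "Historical Threat"
    SUSPICIOUS_PATTERN = "Suspicious Pattern"
    COVERAGE_GAP = "Coverage Gap"
    CLEAN = "Clean"


# Actions copied from the classification table above.
ACTIONS = {
    Classification.ACTIVE_THREAT: "Hand off to alert-investigation immediately",
    Classification.HISTORICAL_THREAT: "Document, generate detections, recommend forensic review",
    Classification.SUSPICIOUS_PATTERN: "Document, recommend monitoring",
    Classification.COVERAGE_GAP: "Document gap + impact, recommend data source",
    Classification.CLEAN: "Document what was tested, note confidence",
}


def classify(data_available, hit_count, confirmed_malicious=False, ongoing=False):
    """Pick a classification; the input flags are hypothetical analyst judgments."""
    if not data_available:
        # No data to test against is a gap, never a clean result.
        return Classification.COVERAGE_GAP
    if hit_count == 0:
        return Classification.CLEAN
    if not confirmed_malicious:
        return Classification.SUSPICIOUS_PATTERN
    return Classification.ACTIVE_THREAT if ongoing else Classification.HISTORICAL_THREAT


result = classify(data_available=False, hit_count=0)
print(result.value, "->", ACTIONS[result])
```

Checking data availability first means an empty result from a missing data source can never be recorded as Clean.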
39
+
40
+ ### Step 4: Map to MITRE ATT&CK
41
+
42
+ Map each finding to MITRE ATT&CK techniques. This feeds detection engineering and coverage tracking. Every finding gets a technique mapping — no exceptions.
43
+
44
+ ### Step 5: Build Behavioral Patterns
45
+
46
+ Build behavioral pattern descriptions for detection-engineer — what does this look like as a repeatable detection? Describe the observable behavior, not just the IOC.
47
+
48
+ ### Step 6: Cross-Reference for Kill Chains
49
+
50
+ Cross-reference findings against each other — do multiple findings form a kill chain? Individual findings may look benign. Together they may form an attack chain. Map correlated findings to ATT&CK stages.
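A minimal sketch of this cross-referencing, assuming each finding is a dict with hypothetical `host`, `tactic`, and `title` keys. Hosts whose findings span at least two ATT&CK tactics are surfaced as candidate chains, ordered by the Enterprise matrix tactic order. The two-tactic threshold is an illustrative assumption, and real correlation would also weigh timing and shared users.

```python
from collections import defaultdict

# Enterprise ATT&CK tactics in matrix order.
TACTIC_ORDER = [
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion", "Credential Access",
    "Discovery", "Lateral Movement", "Collection", "Command and Control",
    "Exfiltration", "Impact",
]


def candidate_chains(findings, min_tactics=2):
    """Group findings by host and return hosts spanning several tactics."""
    by_host = defaultdict(list)
    for finding in findings:
        by_host[finding["host"]].append(finding)

    chains = {}
    for host, items in by_host.items():
        tactics = {item["tactic"] for item in items}
        if len(tactics) >= min_tactics:
            # Order the findings by ATT&CK stage to read them as a chain.
            chains[host] = sorted(
                items, key=lambda item: TACTIC_ORDER.index(item["tactic"])
            )
    return chains


findings = [
    {"host": "BD-3263", "tactic": "Persistence", "title": "Registry run key"},
    {"host": "BD-3263", "tactic": "Execution", "title": "Encoded PowerShell"},
    {"host": "BD-2578", "tactic": "Discovery", "title": "Port scan"},
]
for host, chain in candidate_chains(findings).items():
    print(host, [item["title"] for item in chain])
```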
51
+
52
+ ### Step 7: Check for Active Threats
53
+
54
+ Check for active threat indicators — if ANY finding suggests ongoing compromise:
55
+
56
+ 1. Immediately flag for hand-off to alert-investigation
57
+ 2. Package the evidence (queries, results, timeline, affected systems)
58
+ 3. The hunt PAUSES — active threats take priority
59
+
60
+ ### Step 8: Identify Coverage Gaps
61
+
62
+ For each TTP in the hypothesis that could NOT be tested:
63
+
64
+ 1. What data was missing?
65
+ 2. Which connector/event type would fill the gap?
66
+ 3. What's the risk of the gap? (what threats are invisible?)
67
+
68
+ ### Step 9: Assess Hunt Confidence
69
+
70
+ Assess overall hunt confidence against the Phase 1 targets: data coverage, TTP coverage, and enrichment depth.
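The three components can be sketched as simple ratios. How they combine into the overall figure is not stated in this skill, so the unweighted mean below is an assumption, as is treating exactly 90% as a pass; the 90% threshold itself comes from the output template.

```python
THRESHOLD = 90.0  # the output template compares Overall Confidence to 90%


def pct(done, total):
    """Percentage of done over total, or 0.0 when nothing was in scope."""
    return round(100.0 * done / total, 1) if total else 0.0


def hunt_confidence(sources, ttps, enriched):
    """Each argument is a (done, total) pair.

    sources:  mapped data sources queried
    ttps:     hypothesis behaviors tested
    enriched: findings enriched with TI/context
    """
    data, ttp, enrich = pct(*sources), pct(*ttps), pct(*enriched)
    # Assumption: unweighted mean of the three components.
    overall = round((data + ttp + enrich) / 3, 1)
    return {
        "data_coverage": data,
        "ttp_coverage": ttp,
        "enrichment_depth": enrich,
        "overall": overall,
        "verdict": "PASS" if overall >= THRESHOLD else "BELOW THRESHOLD",
    }


print(hunt_confidence(sources=(9, 10), ttps=(4, 5), enriched=(3, 3)))
```

A stricter variant could take the minimum of the three components, so that one weak dimension cannot be averaged away.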
71
+
72
+ ### Step 10: Assess HMM Maturity
73
+
74
+ Assess the Hunting Maturity Model (HMM) level this hunt demonstrates. Document the justification.
75
+
76
+ ## Output
77
+
78
+ ```
79
+ HUNT FINDINGS ANALYSIS
80
+ ━━━━━━━━━━━━━━━━━━━━━
81
+ Hypothesis: [hypothesis statement]
82
+ Hunt Tier: [Focused/Broad]
83
+ Overall Confidence: [percentage] — [above/below] 90% threshold
84
+
85
+ FINDINGS:
86
+
87
+ Finding 1: [title]
88
+ Classification: [Active Threat / Historical / Suspicious / Coverage Gap / Clean]
89
+ MITRE ATT&CK: [technique ID] — [technique name]
90
+ Evidence: [summary of supporting queries and results]
91
+ Behavioral Pattern: [description for detection engineering]
92
+ Affected Systems: [hosts/users/segments]
93
+ Timeline: [when activity occurred]
94
+ Action: [specific next step]
95
+
96
+ Finding 2: [...]
97
+
98
+ COVERAGE GAPS:
99
+ Gap 1: [description]
100
+ Missing: [event type / field / connector]
101
+ Blocks: [MITRE techniques that cannot be tested]
102
+ Impact: [HIGH/MEDIUM/LOW] — [what's invisible]
103
+ Remediation: [specific action]
104
+
105
+ Gap 2: [...]
106
+
107
+ KILL CHAIN ASSESSMENT:
108
+ [Do the findings form a coherent attack chain? Map to ATT&CK stages]
109
+ [Or are they isolated, unrelated findings?]
110
+
111
+ HUNT CONFIDENCE:
112
+ Data Coverage: [X]% — [N/M] mapped data sources queried
113
+ TTP Coverage: [X]% — [N/M] hypothesis behaviors tested
114
+ Enrichment Depth: [X]% — findings enriched with TI/context
115
+ Overall: [X]% — [PASS/BELOW THRESHOLD]
116
+
117
+ HMM ASSESSMENT:
118
+ This hunt demonstrates HMM Level [N]: [brief justification]
119
+ ```
120
+
121
+ **Active Threat Escalation (if applicable):**
122
+
123
+ ```
124
+ ACTIVE THREAT DETECTED — Recommending hunt pause
125
+
126
+ Evidence:
127
+ [summary of findings indicating ongoing compromise]
128
+
129
+ Affected Systems: [list]
130
+ Timeline: Active as of [most recent evidence timestamp]
131
+
132
+ Recommended Action: Hand off to alert-investigation for formal triage.
133
+ Evidence package assembled in [hunt artifact directory].
134
+
135
+ Awaiting analyst decision: "escalate" to hand off, or "continue hunting" to proceed.
136
+ ```
137
+
138
+ **Return findings analysis to the threat-hunt orchestrator and continue. Do not present to the user or wait for input. Exception: Active threat detection — present the escalation prompt and await analyst decision.**
139
+
140
+ ## Red Flags
141
+
142
+ | Red Flag | Correct Action |
143
+ |----------|---------------|
144
+ | "No findings, the environment is clean" | STOP. Clean means the hypothesis was tested and no evidence was found. It does NOT mean the environment is safe. Document what was tested and at what confidence. |
145
+ | Classifying a coverage gap as "clean" | STOP. If data doesn't exist to test a TTP, that's a gap, not a clean result. The difference matters for the gap remediation plan. |
146
+ | Ignoring low-confidence findings | STOP. A suspicious pattern with 60% confidence is still a finding. Document it. Recommend monitoring. |
147
+ | Not checking for kill chain patterns | STOP. Individual findings may look benign. Together they may form an attack chain. Always cross-reference. |
148
+ | Skipping ATT&CK mapping | STOP. Every finding maps to a technique. This feeds detection engineering and coverage tracking. |
149
+ | Detecting active threat and continuing the hunt | STOP. Active threats get immediate escalation to alert-investigation. The hunt pauses. |
150
+ | Not documenting what was tested for clean results | STOP. "Clean" without documentation is meaningless. Future analysts need to know what was checked. |