agentic-threat-hunting-framework 0.2.3__py3-none-any.whl → 0.3.0__py3-none-any.whl

Files changed (43)
  1. {agentic_threat_hunting_framework-0.2.3.dist-info → agentic_threat_hunting_framework-0.3.0.dist-info}/METADATA +38 -40
  2. agentic_threat_hunting_framework-0.3.0.dist-info/RECORD +51 -0
  3. athf/__version__.py +1 -1
  4. athf/cli.py +7 -2
  5. athf/commands/__init__.py +4 -0
  6. athf/commands/agent.py +452 -0
  7. athf/commands/context.py +6 -9
  8. athf/commands/env.py +2 -2
  9. athf/commands/hunt.py +3 -3
  10. athf/commands/init.py +45 -0
  11. athf/commands/research.py +530 -0
  12. athf/commands/similar.py +5 -5
  13. athf/core/research_manager.py +419 -0
  14. athf/core/web_search.py +340 -0
  15. athf/data/__init__.py +19 -0
  16. athf/data/docs/CHANGELOG.md +147 -0
  17. athf/data/docs/CLI_REFERENCE.md +1797 -0
  18. athf/data/docs/INSTALL.md +594 -0
  19. athf/data/docs/README.md +31 -0
  20. athf/data/docs/environment.md +256 -0
  21. athf/data/docs/getting-started.md +419 -0
  22. athf/data/docs/level4-agentic-workflows.md +480 -0
  23. athf/data/docs/lock-pattern.md +149 -0
  24. athf/data/docs/maturity-model.md +400 -0
  25. athf/data/docs/why-athf.md +44 -0
  26. athf/data/hunts/FORMAT_GUIDELINES.md +507 -0
  27. athf/data/hunts/H-0001.md +453 -0
  28. athf/data/hunts/H-0002.md +436 -0
  29. athf/data/hunts/H-0003.md +546 -0
  30. athf/data/hunts/README.md +231 -0
  31. athf/data/integrations/MCP_CATALOG.md +45 -0
  32. athf/data/integrations/README.md +129 -0
  33. athf/data/integrations/quickstart/splunk.md +162 -0
  34. athf/data/knowledge/hunting-knowledge.md +2375 -0
  35. athf/data/prompts/README.md +172 -0
  36. athf/data/prompts/ai-workflow.md +581 -0
  37. athf/data/prompts/basic-prompts.md +316 -0
  38. athf/data/templates/HUNT_LOCK.md +228 -0
  39. agentic_threat_hunting_framework-0.2.3.dist-info/RECORD +0 -23
  40. {agentic_threat_hunting_framework-0.2.3.dist-info → agentic_threat_hunting_framework-0.3.0.dist-info}/WHEEL +0 -0
  41. {agentic_threat_hunting_framework-0.2.3.dist-info → agentic_threat_hunting_framework-0.3.0.dist-info}/entry_points.txt +0 -0
  42. {agentic_threat_hunting_framework-0.2.3.dist-info → agentic_threat_hunting_framework-0.3.0.dist-info}/licenses/LICENSE +0 -0
  43. {agentic_threat_hunting_framework-0.2.3.dist-info → agentic_threat_hunting_framework-0.3.0.dist-info}/top_level.txt +0 -0
athf/data/hunts/FORMAT_GUIDELINES.md
@@ -0,0 +1,507 @@
1
+ # Hunt Format Guidelines
2
+
3
+ ## Standard: LOCK Methodology with ABLE Scoping
4
+
5
+ All hunt files in this framework follow the **LOCK** (Learn-Observe-Check-Keep) methodology combined with **ABLE** scoping (Actor-Behavior-Location-Evidence). This unified format combines hypothesis planning and execution results in a single document.
6
+
7
+ ## File Naming Convention
8
+
9
+ - **Format:** `H-####.md` (e.g., `H-0001.md`, `H-0002.md`)
10
+ - **Sequential numbering** starting from 0001
11
+ - **Single file** contains both the hunt methodology and execution results
12
+ - **No dated execution files** - update the same file as you iterate
13
+
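A minimal sketch of the naming convention above, not part of the diffed file. It assumes filenames have exactly four digits and that the next ID is one more than the highest existing number. The helper name `next_hunt_filename` is hypothetical.

```python
import re

# Assumed pattern for the convention: "H-", exactly four digits, ".md".
HUNT_FILE_RE = re.compile(r"H-(\d{4})\.md")


def next_hunt_filename(existing):
    """Return the next sequential hunt filename, ignoring non-hunt files."""
    numbers = []
    for name in existing:
        match = HUNT_FILE_RE.fullmatch(name)
        if match:
            numbers.append(int(match.group(1)))
    # Numbering starts at 0001 when no hunt files exist yet.
    return f"H-{max(numbers, default=0) + 1:04d}.md"
```

For example, a directory holding `H-0001.md`, `H-0003.md`, and `README.md` would yield `H-0004.md`.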
14
+ ## Template Structure
15
+
16
+ The HUNT_LOCK.md template contains these sections:
17
+
18
+ ### Title & Metadata
19
+
20
+ ```markdown
21
+ # H-XXXX: [Hunt Title]
22
+
23
+ **Hunt Metadata**
24
+ - Date: YYYY-MM-DD
25
+ - Hunter: [Your Name]
26
+ - Status: [Planning|In Progress|Completed]
27
+ - MITRE ATT&CK: [T####.### - Technique Name]
28
+ ```
29
+
30
+ **Status Values:**
31
+
32
+ - **Planning** - Hunt is being designed
33
+ - **In Progress** - Hunt is actively being executed
34
+ - **Completed** - Hunt execution finished, results documented
35
+
36
+ ---
37
+
38
+ ## YAML Frontmatter (Optional at Level 0-1, Recommended at Level 2, Required at Level 3+)
39
+
40
+ ### What is YAML Frontmatter?
41
+
42
+ YAML frontmatter is machine-readable metadata placed at the very top of hunt files. It enables:
43
+
44
+ - **AI-powered filtering** - Quickly find hunts by status, tactics, techniques, platform
45
+ - **Automated analysis** - Calculate hunt success rates, track findings, identify coverage gaps
46
+ - **Hunt relationships** - Link related hunts for context and knowledge building
47
+ - **Maturity progression** - Prepares your hunting program for Level 3+ automation
48
+
49
+ ### When to Use YAML Frontmatter
50
+
51
+ | Maturity Level | Recommendation | Rationale |
52
+ |----------------|----------------|-----------|
53
+ | **Level 0-1** (Manual) | Optional | Focus on learning LOCK methodology first |
54
+ | **Level 2** (Searchable) | Recommended | AI can now filter and reference hunts programmatically |
55
+ | **Level 3+** (Generative/Agentic) | Required | Automation requires structured metadata |
56
+
57
+ ### Dual-Format Approach
58
+
59
+ Hunt files use **both** YAML frontmatter and markdown metadata:
60
+
61
+ 1. **YAML frontmatter** (lines 1-17) - Machine-readable, enables automation
62
+ 2. **Markdown metadata** (below title) - Human-readable, provides visual context
63
+
64
+ **Why both?** YAML enables AI filtering while markdown provides at-a-glance context when reading hunts.
65
+
66
+ ### Complete YAML Schema
67
+
68
+ ```yaml
69
+ ---
70
+ hunt_id: H-XXXX # Unique hunt identifier (required)
71
+ title: [Hunt Title] # Full hunt title (required)
72
+ status: planning # Options: planning, in-progress, completed (required)
73
+ date: YYYY-MM-DD # Hunt creation or last update date (required)
74
+ hunter: [Your Name] # Person or team conducting hunt (required)
75
+ platform: [Windows, macOS, Linux] # Target platforms - array format (required)
76
+ tactics: [credential-access] # MITRE ATT&CK tactics (required)
77
+ techniques: [T1003.001] # MITRE ATT&CK technique IDs (required)
78
+ data_sources: [Splunk] # SIEM/log platforms used (required)
79
+ related_hunts: [] # Related hunt IDs (optional)
80
+ findings_count: 0 # Total findings discovered (optional)
81
+ true_positives: 0 # Confirmed malicious activity (optional)
82
+ false_positives: 0 # Benign activity flagged (optional)
83
+ customer_deliverables: [] # For MSPs tracking deliverables (optional)
84
+ tags: [credential-theft] # Freeform categorization tags (optional)
85
+ ---
86
+ ```
87
+
88
+ ### Field Definitions
89
+
90
+ #### Required Fields
91
+
92
+ | Field | Type | Purpose | Example |
93
+ |-------|------|---------|---------|
94
+ | `hunt_id` | string | Unique identifier matching filename | `H-0042` |
95
+ | `title` | string | Full descriptive hunt title | `Kerberoasting Detection via Service Ticket Requests` |
96
+ | `status` | string | Hunt lifecycle stage | `planning`, `in-progress`, `completed` |
97
+ | `date` | string (YYYY-MM-DD) | Hunt creation or last updated date | `2025-11-30` |
98
+ | `hunter` | string | Person or team conducting hunt | `Security Team`, `John Doe` |
99
+ | `platform` | array | Operating systems or environments targeted | `[Windows, macOS, Linux]`, `[Cloud]`, `[Network]` |
100
+ | `tactics` | array | MITRE ATT&CK tactics (lowercase with hyphens) | `[credential-access, persistence]` |
101
+ | `techniques` | array | MITRE ATT&CK technique IDs | `[T1003.001, T1558.003]` |
102
+ | `data_sources` | array | SIEM, EDR, or log platforms used | `[Splunk, Sentinel, ClickHouse]` |
103
+
104
+ #### Optional Fields
105
+
106
+ | Field | Type | Purpose | Example | When to Use |
107
+ |-------|------|---------|---------|-------------|
108
+ | `related_hunts` | array | Hunt IDs that relate to this hunt | `[H-0015, H-0038]` | When building on past work or pivoting |
109
+ | `findings_count` | integer | Total findings (TP + FP + suspicious) | `15` | Post-execution or during KEEP phase |
110
+ | `true_positives` | integer | Confirmed malicious activity | `3` | Post-execution summary |
111
+ | `false_positives` | integer | Benign activity flagged | `12` | Post-execution summary |
112
+ | `customer_deliverables` | array | Client report references (for MSPs) | `[CUST-2025-Q1-001]` | Managed service providers |
113
+ | `tags` | array | Freeform categorization keywords | `[supply-chain, living-off-the-land]` | Additional context beyond ATT&CK |
114
+
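A minimal sketch of how the required fields in the tables above could be checked, not part of the diffed file. It uses a deliberately simplified reader that understands only flat `key: value` lines and inline `[a, b]` lists. That is an assumption for illustration and not a full YAML parser. `parse_frontmatter` and `missing_required` are hypothetical names.

```python
REQUIRED_FIELDS = (
    "hunt_id", "title", "status", "date", "hunter",
    "platform", "tactics", "techniques", "data_sources",
)


def parse_frontmatter(text):
    """Read flat frontmatter between the opening and closing `---` lines.

    Simplified on purpose: scalars and inline lists only, no nesting or quoting.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # Drop a trailing " # comment" as used in the schema listing.
        value = value.split(" #", 1)[0].strip()
        if value.startswith("[") and value.endswith("]"):
            items = [item.strip() for item in value[1:-1].split(",")]
            fields[key.strip()] = [item for item in items if item]
        else:
            fields[key.strip()] = value
    return fields


def missing_required(fields):
    """List required field names that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if fields.get(name) in (None, "", [])]
```

A real implementation would more likely use a YAML library. The hand-rolled reader only keeps the sketch dependency-free.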
115
+ ### MITRE ATT&CK Tactic Reference
116
+
117
+ Use these **lowercase hyphenated** values for the `tactics` field:
118
+
119
+ - `initial-access`
120
+ - `execution`
121
+ - `persistence`
122
+ - `privilege-escalation`
123
+ - `defense-evasion`
124
+ - `credential-access`
125
+ - `discovery`
126
+ - `lateral-movement`
127
+ - `collection`
128
+ - `command-and-control`
129
+ - `exfiltration`
130
+ - `impact`
131
+
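A minimal sketch of normalizing free-form tactic names to the lowercase hyphenated values listed above, not part of the diffed file. The allowed set is copied from that list. The function name `normalize_tactic` and the choice to raise `ValueError` on unknown values are assumptions.

```python
# Tactic values copied from the reference list above.
ALLOWED_TACTICS = {
    "initial-access", "execution", "persistence", "privilege-escalation",
    "defense-evasion", "credential-access", "discovery", "lateral-movement",
    "collection", "command-and-control", "exfiltration", "impact",
}


def normalize_tactic(name):
    """Convert e.g. 'Credential Access' to 'credential-access' and validate it."""
    # Treat underscores like spaces, collapse whitespace, then join with hyphens.
    slug = "-".join(name.strip().lower().replace("_", " ").split())
    if slug not in ALLOWED_TACTICS:
        raise ValueError(f"unknown tactic: {name!r}")
    return slug
```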
132
+ ### Platform Values
133
+
134
+ Common values for the `platform` array:
135
+
136
+ - `Windows` - Windows endpoints/servers
137
+ - `macOS` - Apple macOS systems
138
+ - `Linux` - Linux distributions
139
+ - `Cloud` - Cloud services (AWS, Azure, GCP)
140
+ - `Network` - Network devices/traffic
141
+ - `Container` - Docker, Kubernetes
142
+ - `SaaS` - Software-as-a-Service applications
143
+
144
+ ### Examples
145
+
146
+ #### Minimal YAML Frontmatter (Level 0-1)
147
+
148
+ ```yaml
149
+ ---
150
+ hunt_id: H-0001
151
+ title: macOS Data Collection via AppleScript
152
+ status: completed
153
+ date: 2025-11-19
154
+ hunter: Security Team
155
+ platform: [macOS]
156
+ tactics: [collection]
157
+ techniques: [T1005]
158
+ data_sources: [Splunk]
159
+ related_hunts: []
160
+ findings_count: 0
161
+ true_positives: 0
162
+ false_positives: 0
163
+ customer_deliverables: []
164
+ tags: [macos, applescript]
165
+ ---
166
+ ```
167
+
168
+ #### Comprehensive YAML Frontmatter (Level 2+)
169
+
170
+ ```yaml
171
+ ---
172
+ hunt_id: H-0042
173
+ title: Kerberoasting Detection via Service Ticket Requests
174
+ status: completed
175
+ date: 2025-11-30
176
+ hunter: Threat Hunting Team
177
+ platform: [Windows]
178
+ tactics: [credential-access]
179
+ techniques: [T1558.003]
180
+ data_sources: [Splunk, Windows Event Logs]
181
+ related_hunts: [H-0015, H-0038]
182
+ findings_count: 15
183
+ true_positives: 3
184
+ false_positives: 12
185
+ customer_deliverables: []
186
+ tags: [kerberos, active-directory, credential-theft, service-accounts]
187
+ ---
188
+ ```
189
+
190
+ #### Multi-Platform Hunt Example
191
+
192
+ ```yaml
193
+ ---
194
+ hunt_id: H-0078
195
+ title: JavaScript Malware Execution Detection
196
+ status: in-progress
197
+ date: 2025-12-01
198
+ hunter: Detection Engineering
199
+ platform: [Windows, macOS, Linux] # Cross-platform TTP
200
+ tactics: [execution]
201
+ techniques: [T1059.007]
202
+ data_sources: [EDR, Sysmon]
203
+ related_hunts: [H-0004]
204
+ findings_count: 0
205
+ true_positives: 0
206
+ false_positives: 0
207
+ customer_deliverables: []
208
+ tags: [javascript, node-js, living-off-the-land, supply-chain]
209
+ ---
210
+ ```
211
+
212
+ ### Adoption by Maturity Level
213
+
214
+ #### Level 0-1: Manual & Documented
215
+
216
+ **Goal:** Learn LOCK methodology, build hunting muscle memory
217
+
218
+ **YAML Frontmatter:** Optional - You can omit it entirely or include minimal fields
219
+
220
+ **Focus on:**
221
+
222
+ - Understanding hypothesis generation
223
+ - Writing clear queries
224
+ - Documenting lessons learned
225
+
226
+ **Example:** Use markdown metadata only, skip YAML frontmatter
227
+
228
+ #### Level 2: Searchable
229
+
230
+ **Goal:** Enable AI to search and reference past hunts
231
+
232
+ **YAML Frontmatter:** Recommended - Include all required fields + tags
233
+
234
+ **Focus on:**
235
+
236
+ - Consistent metadata across hunts
237
+ - Using AI to find related work
238
+ - Tracking ATT&CK coverage
239
+
240
+ **Example:** Add YAML with required fields, populate `tags` and `related_hunts`
241
+
242
+ #### Level 3+: Generative & Agentic
243
+
244
+ **Goal:** Automation, metrics, programmatic hunt generation
245
+
246
+ **YAML Frontmatter:** Required - All fields, especially findings metrics
247
+
248
+ **Focus on:**
249
+
250
+ - Hunt success rate analysis
251
+ - Automated coverage gap identification
252
+ - Programmatic hunt scheduling
253
+ - Knowledge graph construction
254
+
255
+ **Example:** Full YAML schema with all optional fields populated post-execution
256
+
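A minimal sketch of the kind of metric the findings fields make possible, not part of the diffed file. The guidelines do not define "success rate". This sketch assumes it means the share of completed hunts with at least one true positive, and also reports the true-positive share of all classified findings. `summarize_hunts` is a hypothetical name, and each hunt is assumed to be a dict of frontmatter fields.

```python
def summarize_hunts(hunts):
    """Aggregate findings metrics over completed hunts only."""
    completed = [h for h in hunts if h.get("status") == "completed"]
    tp = sum(int(h.get("true_positives", 0)) for h in completed)
    fp = sum(int(h.get("false_positives", 0)) for h in completed)
    with_tp = [h for h in completed if int(h.get("true_positives", 0)) > 0]
    return {
        "completed": len(completed),
        # Assumed definition: completed hunts that confirmed malicious activity.
        "success_rate": len(with_tp) / len(completed) if completed else 0.0,
        # Share of classified findings that were true positives.
        "tp_share": tp / (tp + fp) if (tp + fp) else 0.0,
    }
```

Using the three example frontmatter blocks above (H-0001, H-0042, H-0078), two hunts are completed, one of them has true positives, and 3 of the 15 classified findings are true positives.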
257
+ ### YAML Best Practices
258
+
259
+ **Consistency**
260
+
261
+ - Use lowercase hyphenated tactics (`credential-access`, not `Credential Access`)
262
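+ - Use the most specific ATT&CK technique ID available (sub-technique `T1003.001` rather than parent `T1003`); a parent ID such as `T1005` is fine when no sub-technique applies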
263
+ - Use kebab-case for multi-word tags (`living-off-the-land`, not `living_off_the_land`)
264
+
265
+ **When to Update**
266
+
267
+ - **During Planning:** Set `status: planning`, populate tactics/techniques/platform
268
+ - **During Execution:** Change `status: in-progress`
269
+ - **Post-Execution:** Update `status: completed`, add findings counts
270
+
271
+ **Related Hunts**
272
+
273
+ - Link to hunts you built upon (`related_hunts: [H-0022]`)
274
+ - Link to hunts that extend your findings (`related_hunts: [H-0045, H-0046]`)
275
+ - Don't overlink - only add hunts that directly inform or extend this work
276
+
277
+ **Tags**
278
+
279
+ - Supplement ATT&CK with context: `supply-chain`, `zero-day`, `apt29`
280
+ - Use lowercase hyphenated format: `credential-theft`, not `Credential Theft`
281
+ - 3-6 tags per hunt is ideal
282
+
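A minimal sketch of a tag check based on the guidance above, not part of the diffed file. The regular expression is an assumed reading of "lowercase hyphenated". The 3-6 count is reported as advice rather than an error, since the minimal example earlier uses only two tags. `check_tags` is a hypothetical name.

```python
import re

# Assumed reading of "lowercase hyphenated": lowercase alphanumeric words joined by "-".
TAG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_tags(tags):
    """Return a list of human-readable problems; an empty list means no issues."""
    problems = [f"not lowercase-hyphenated: {tag}" for tag in tags
                if not TAG_RE.fullmatch(tag)]
    if not 3 <= len(tags) <= 6:
        problems.append(f"{len(tags)} tags; 3-6 is the suggested range")
    return problems
```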
283
+ ---
284
+
285
+ ### LEARN: Prepare the Hunt
286
+
287
+ Educational foundation and hunt planning.
288
+
289
+ **Hypothesis Statement:**
290
+ Clear statement of what you're hunting and why.
291
+
292
+ **ABLE Scoping Table:**
293
+
294
+ | Field | Your Input |
295
+ |-------|-----------|
296
+ | **Actor** *(Optional)* | Threat actor or "N/A" |
297
+ | **Behavior** | Actions, TTPs, methods, tools |
298
+ | **Location** | Endpoint, network, cloud environment |
299
+ | **Evidence** | **Source:** [Log source]<br>**Key Fields:** [field1, field2]<br>**Example:** [What malicious activity looks like] |
300
+
301
+ **Threat Intel & Research:**
302
+
303
+ - MITRE ATT&CK techniques
304
+ - CTI sources and references
305
+ - Historical context for your environment
306
+
307
+ **Related Tickets:**
308
+ Cross-references to SOC, IR, TI, or DE tickets
309
+
310
+ ---
311
+
312
+ ### OBSERVE: Expected Behaviors
313
+
314
+ Hypothesis of what you expect to find.
315
+
316
+ **What Normal Looks Like:**
317
+ Describe legitimate activity (false positive sources)
318
+
319
+ **What Suspicious Looks Like:**
320
+ Describe anomalous behaviors you're hunting
321
+
322
+ **Expected Observables:**
323
+
324
+ - Processes, network connections, files, registry keys, authentication events
325
+
326
+ ---
327
+
328
+ ### CHECK: Execute & Analyze
329
+
330
+ Investigation execution and results.
331
+
332
+ **Data Source Information:**
333
+
334
+ - Index/data source
335
+ - Time range analyzed
336
+ - Events processed
337
+ - Data quality notes
338
+
339
+ **Hunting Queries:**
340
+
341
+ ```[language]
342
+ [Initial query]
343
+ ```
344
+
345
+ **Query Notes:** Did it work? FPs? Gaps?
346
+
347
+ ```[language]
348
+ [Refined query after iteration]
349
+ ```
350
+
351
+ **Refinement Rationale:** What changed and why?
352
+
353
+ **Visualization & Analytics:**
354
+
355
+ - Time-series, heatmaps, anomaly detection used
356
+ - Patterns observed
357
+ - Screenshots referenced
358
+
359
+ **Query Performance:**
360
+
361
+ - What worked well
362
+ - What didn't work
363
+ - Iterations made
364
+
365
+ ---
366
+
367
+ ### KEEP: Findings & Response
368
+
369
+ Results, lessons, and follow-up actions.
370
+
371
+ **Executive Summary:**
372
+ 3-5 sentences: What was found? Was the hypothesis proved or disproved?
373
+
374
+ **Findings Table:**
375
+
376
+ | Finding | Ticket | Description |
377
+ |---------|--------|-------------|
378
+ | [TP/FP/Suspicious] | [INC-####] | [Brief description] |
379
+
380
+ **True Positives:** Count and summary
381
+ **False Positives:** Count and patterns
382
+ **Suspicious Events:** Count requiring investigation
383
+
384
+ **Detection Logic:**
385
+
386
+ - Could this become automated detection?
387
+ - Thresholds and conditions for alerts
388
+ - Tuning considerations
389
+
390
+ **Lessons Learned:**
391
+
392
+ - What worked well
393
+ - What could be improved
394
+ - Telemetry gaps identified
395
+
396
+ **Follow-up Actions:**
397
+
398
+ - [ ] Checklist items for next steps
399
+ - [ ] Detection rule creation
400
+ - [ ] Hypothesis updates needed
401
+
402
+ **Follow-up Hunts:**
403
+
404
+ - H-XXXX: [New hunt ideas from findings]
405
+
406
+ ---
407
+
408
+ **Hunt Completed:** [Date]
409
+
410
+ ---
411
+
412
+ ## Section Purpose Guide
413
+
414
+ ### LEARN
415
+
416
+ **Purpose:** Build understanding and context before hunting.
417
+
418
+ - Explain the TTP being hunted
419
+ - Provide threat intel context
420
+ - Document why this hunt matters now
421
+ - Use ABLE framework to scope precisely
422
+
423
+ ### OBSERVE
424
+
425
+ **Purpose:** Create a clear, testable hypothesis.
426
+
427
+ - State what you expect to find
428
+ - Distinguish normal from suspicious
429
+ - List specific observables
430
+
431
+ ### CHECK
432
+
433
+ **Purpose:** Execute the investigation and document the process.
434
+
435
+ - Embed queries directly in markdown
436
+ - Document what worked and what didn't
437
+ - Show query iterations and refinements
438
+ - Track performance and data quality
439
+
440
+ ### KEEP
441
+
442
+ **Purpose:** Capture outcomes and enable improvement.
443
+
444
+ - Summarize findings (TP/FP/Suspicious)
445
+ - Extract lessons learned
446
+ - Identify detection opportunities
447
+ - Plan follow-up actions and hunts
448
+
449
+ ## Hunt Workflow Best Practices
450
+
451
+ ### Query Development
452
+
453
+ - **Embed queries in markdown** using code blocks with syntax highlighting
454
+ - **Comment your queries** to explain detection logic
455
+ - **Document iterations** - Show initial query and refinements
456
+ - **Explain why queries changed** based on findings
457
+
458
+ ### ABLE Scoping
459
+
460
+ - **Actor is optional** - Focus on behavior unless actor context adds value
461
+ - **Evidence section is critical** - Include log sources, key fields, and examples
462
+ - **Be specific** - Vague scoping leads to vague results
463
+
464
+ ### Single-File Workflow
465
+
466
+ - **Update the same file** as you iterate (no dated copies)
467
+ - **Status field tracks progress** (Planning → In Progress → Completed)
468
+ - **Keep query history** - Comment out old queries, don't delete them
469
+ - **Document why things changed** in lessons learned
470
+
471
+ ### Status Management
472
+
473
+ - **Planning:** Hypothesis defined, queries being developed
474
+ - **In Progress:** Actively hunting, collecting data, refining queries
475
+ - **Completed:** Results documented, findings summarized, lessons captured
476
+
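A minimal sketch tying the markdown status labels (`Planning`, `In Progress`, `Completed`) to the YAML values (`planning`, `in-progress`, `completed`), not part of the diffed file. The rule that a status may only stay the same or advance one step is an assumption drawn from the Planning → In Progress → Completed ordering. The guidelines do not state it as a hard constraint, and both function names are hypothetical.

```python
# Lifecycle order as written in the guidelines.
STATUS_ORDER = ["planning", "in-progress", "completed"]


def to_yaml_status(label):
    """Map a markdown label such as 'In Progress' to its YAML value."""
    slug = "-".join(label.strip().lower().split())
    if slug not in STATUS_ORDER:
        raise ValueError(f"unknown status: {label!r}")
    return slug


def is_forward_transition(current, new):
    """Assumed rule: a status may stay the same or advance by exactly one step."""
    step = STATUS_ORDER.index(to_yaml_status(new)) - STATUS_ORDER.index(to_yaml_status(current))
    return step in (0, 1)
```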
477
+ ## Why LOCK + ABLE?
478
+
479
+ **LOCK methodology** ensures structured hunting:
480
+
481
+ - **Learn:** Educational foundation
482
+ - **Observe:** Clear hypothesis
483
+ - **Check:** Actionable detection
484
+ - **Keep:** Captured lessons
485
+
486
+ **ABLE scoping** provides precision:
487
+
488
+ - **Actor:** Who (optional context)
489
+ - **Behavior:** What (required - the actions)
490
+ - **Location:** Where (required - the environment)
491
+ - **Evidence:** How to find it (required - data sources and examples)
492
+
493
+ Together they create hunts that are educational, repeatable, and improve over time.
494
+
495
+ ## Design Inspiration
496
+
497
+ This template combines best practices from multiple threat hunting frameworks:
498
+
499
+ - **LOCK methodology** (Learn-Observe-Check-Keep) for structured, educational hunting
500
+ - **ABLE scoping** (Actor-Behavior-Location-Evidence) for precise hunt definition
501
+ - **PEAK framework** (Prepare-Execute-Act with Knowledge) for single-file hypothesis+execution workflow
502
+
503
+ The result is a condensed, practical template that guides hunters from hypothesis through results while maintaining comprehensive documentation.
504
+
505
+ ## Example Reference
506
+
507
+ See [H-0001.md](H-0001.md), [H-0002.md](H-0002.md), and [H-0003.md](H-0003.md) for complete hunt examples.