npm - @query-ai/digital-workers - Versions diffs - 1.0.0 - Mend

@query-ai/digital-workers 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/.claude-plugin/marketplace.json +27 -0
package/.claude-plugin/plugin.json +11 -0
package/README.md +430 -0
package/hooks/hooks.json +16 -0
package/hooks/run-hook.cmd +4 -0
package/hooks/session-start +32 -0
package/package.json +16 -0
package/skills/alert-classifier/SKILL.md +111 -0
package/skills/alert-investigation/SKILL.md +838 -0
package/skills/detection-engineer/SKILL.md +170 -0
package/skills/evidence-quality-checker/SKILL.md +109 -0
package/skills/fsql-expert/SKILL.md +308 -0
package/skills/fsql-expert/fsql-reference.md +525 -0
package/skills/hunt-pattern-analyzer/SKILL.md +150 -0
package/skills/hunt-quality-checker/SKILL.md +105 -0
package/skills/hypothesis-builder/SKILL.md +303 -0
package/skills/identity-investigator/SKILL.md +172 -0
package/skills/itdr/SKILL.md +1178 -0
package/skills/network-investigator/SKILL.md +196 -0
package/skills/report-writer/SKILL.md +158 -0
package/skills/senior-analyst-review/SKILL.md +199 -0
package/skills/severity-scorer/SKILL.md +131 -0
package/skills/templates/org-policy-template.md +516 -0
package/skills/templates/runbook-template.md +300 -0
package/skills/threat-hunt/SKILL.md +628 -0
package/skills/threat-intel-enricher/SKILL.md +127 -0
package/skills/using-digital-workers/SKILL.md +76 -0

package/skills/threat-hunt/SKILL.md ADDED

@@ -0,0 +1,628 @@
+---
+name: threat-hunt
+description: Use when hunting for threats, testing a hypothesis, proactively searching for TTPs, or when the analyst asks "what should I hunt?"
+---
+# Threat Hunt
+## ABSOLUTE RULES — Read These First
+**1. NEVER use Bash, cat, python, jq, or any shell command to process data.** Not for MCP results. Not for saved files. Not for "just extracting a summary." NEVER. If results overflow to a file, your query was too broad — re-query with specific field selectors or tighter filters.
+**2. NEVER use `**` on broad queries.** The `**` wildcard returns entire OCSF events (millions of characters). Use specific field selectors. Use `**` only when scoped to a single host or single event type with a narrow filter.
+**3. ALWAYS validate before execute.** Every FSQL query goes through `Validate_FSQL_Query` before `Execute_FSQL_Query`. No exceptions.
+**4. ALWAYS save artifacts.** Write evidence files at every phase exit using the Write tool.
+**5. LOG EVERY QUERY.** Every `Execute_FSQL_Query` call — whether it succeeds, fails, or returns empty — MUST be appended to `queries.md` with the query text, result count, and a 1-line summary. No undocumented queries. The audit trail is incomplete if even one query is missing.
+**6. TREAT MCP TOOLS AS COLLABORATORS.** Do not fire-and-forget. `FSQL_Query_Generation` may need 2-3 rounds of iteration. `Search_FSQL_SCHEMA` may need different search terms. Validate before execute. Always.
+These rules are non-negotiable. If you find yourself reaching for cat, python, or `**` on a broad query, STOP and re-read this section.
+## Iron Law
+**EVERY HUNT STARTS WITH A HYPOTHESIS AND ENDS WITH A VERDICT. NO FISHING EXPEDITIONS.**
+The hypothesis is the anchor. Without it, you are querying blindly. With it, every query has purpose, every result has context, and every gap is measurable.
+## Overview
+You are the master orchestrator for the Digital Workers proactive threat hunting workflow. This implements the Sqrrl hunting loop: Hypothesis > Investigate > Discover Patterns > Automate Detections. You invoke specialist skills at each phase.
+Hunts are NOT investigations. Investigations respond to alerts. Hunts proactively seek threats that evaded detection. The workflow, artifact structure, and completion model are different.
+## Step 0: Detect Hunt Trigger Mode
+**Read the analyst's prompt and detect the mode. Announce it and start immediately — do not wait for confirmation.**
+**DEFAULT BEHAVIOR: If the analyst invokes a hunt without a specific hypothesis, technique, intel, or anomaly — default to Suggest Mode.** Do NOT run the qualification gate for vague prompts. Suggest mode IS the answer to vague prompts.
+### Mode Detection Rules
+**Parse the prompt for these signals:**
+| Mode | Signal | Phase 0 Behavior |
+|------|--------|-------------------|
+| **Direct Hypothesis** | Analyst states a hypothesis ("I think attackers are using RDP..." / "Hunt for credential dumping on domain controllers") | Validate testable, extract scope, proceed |
+| **Intel-Driven** | Analyst provides article/advisory/IOC list ("Hunt based on this advisory..." / "Here are IOCs from the latest campaign") | Invoke `digital-workers:hypothesis-builder` to extract TTPs |
+| **TTP-Driven** | Analyst names a MITRE technique ("Hunt for T1021.001" / "Look for lateral movement via remote services") | Invoke `digital-workers:hypothesis-builder` for data-source-aware hypothesis |
+| **Data-Driven** | Analyst references anomaly/pattern ("I noticed unusual DNS traffic..." / "There's a spike in auth failures") | Invoke `digital-workers:hypothesis-builder` to frame as testable hypothesis |
+| **Suggest Mode (DEFAULT)** | Analyst asks "what should I hunt?" / "suggest hunts" / "let's go hunting" / ANY prompt without qualifying input | Invoke `digital-workers:hypothesis-builder` in suggest mode — presents Top 10 ranked recommendations |
+**Suggest mode triggers include:**
+- "What should I hunt?"
+- "Suggest hunts" / "What's worth looking at?"
+- "Let's go hunting" (no specific target)
+- "Start a hunt" (no hypothesis attached)
+- "Hunt for threats" (too vague for direct hypothesis)
+- Any invocation of `threat-hunt` that doesn't contain a specific TTP, technique ID, IOC, threat actor, or described anomaly
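The mode detection rules above can be sketched as a small classifier. This is an illustration for human readers only, written in Python: the skill applies these rules by reading the prompt, and the keyword patterns, their ordering, and the `detect_mode` name are assumptions, not part of the package. Technique names given without an ID (for example "lateral movement via remote services") are beyond this sketch.

```python
import re

# Hypothetical keyword patterns; the skill describes these signals only in prose.
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")
INTEL_HINTS = re.compile(r"\b(advisory|article|iocs?|campaign)\b")
ANOMALY_HINTS = re.compile(r"\b(noticed|unusual|spike|anomal\w*)\b")
HYPOTHESIS_HINTS = ("i think", "hunt for ")
TOO_VAGUE = {"hunt for threats"}  # listed in the source as a suggest-mode trigger


def detect_mode(prompt: str) -> str:
    """Return the trigger mode name for an analyst prompt."""
    text = prompt.lower().strip(" .!?")
    if TECHNIQUE_ID.search(prompt):
        return "TTP-Driven"
    if INTEL_HINTS.search(text):
        return "Intel-Driven"
    if ANOMALY_HINTS.search(text):
        return "Data-Driven"
    if text in TOO_VAGUE:
        return "Suggest Mode"
    if any(hint in text for hint in HYPOTHESIS_HINTS):
        return "Direct Hypothesis"
    # No qualifying input: Suggest Mode is the default.
    return "Suggest Mode"
```

The final `return` mirrors the default behavior: any prompt that carries no qualifying signal falls through to Suggest Mode.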
+**Announce format — one line, then start immediately:**
+> "**Direct Hypothesis** — hunting for RDP-based lateral movement across domain controllers, 7d lookback."
+> "**Intel-Driven** — extracting TTPs from advisory, building data-source-aware hypothesis."
+> "**Suggest Mode** — analyzing environment data sources, hunt history, and threat landscape to recommend Top 10 hunts."
+---
+## The Sqrrl Hunting Loop: Phases 0-4
+```
+PHASE 0: HYPOTHESIS INTAKE
+  |  Trigger mode comes from Step 0: direct hypothesis, intel, TTP, data anomaly, or suggest
+  |  Invoke: digital-workers:hypothesis-builder (always — even for direct hypotheses,
+  |    to map TTPs to data sources, determine tier, and define confidence targets)
+  |  Output: Testable hypothesis with scope, TTPs, data sources, AND hunt tier
+  |  Announce tier: "Broad hunt — 4 TTPs across 5 connectors, environment-wide scope"
+  v
+PHASE 1: HUNT PLANNING & DATA AVAILABILITY MAPPING
+  |  Call FSQL_Connectors -> list active connectors
+  |  For each relevant connector, fetch docs (static) or Search_FSQL_SCHEMA (dynamic)
+  |  Read environment profile for known gaps
+  |  Build DATA AVAILABILITY MAP
+  |  Catalog available threat intel connectors for IOC enrichment
+  |  Define query strategy via fsql-expert
+  |  Pre-validate planned queries via Validate_FSQL_Query
+  |  Set scope (time range, hosts/users/segments)
+  |  Apply circuit breaker from tier (Focused: 25 min, Broad: 45 min)
+  |  Document COVERAGE GAPS -> feeds Phase 3
+  v
+PHASE 2: INVESTIGATION (The Hunt)
+  |  Execute queries via digital-workers:fsql-expert
+  |  Layer 1a/1b discovery -> targeted telemetry pivots
+  |  Invoke specialists when triggered (identity-investigator, network-investigator) — MANDATORY
+  |  Invoke threat-intel-enricher if IOCs surface
+  |  Evidence quality check at phase exit
+  |  Log every query to queries.md
+  |  LIVE PROGRESS REPORTING after every query
+  v
+PHASE 3: PATTERN & ATTACK DISCOVERY
+  |  Invoke: digital-workers:hunt-pattern-analyzer
+  |  Classify findings
+  |  IF ACTIVE THREAT -> hand off to alert-investigation
+  v
+PHASE 4: DETECTION AUTOMATION
+  |  Invoke: digital-workers:detection-engineer
+  |  Generate FSQL detections, Sigma rules, Query recipes
+  |  Test detections against historical data
+  |  Generate gap remediation plan
+  v
+COMPLETE — Present hunt report + detection package
+```
+---
+## Hunt Tiers
+| Tier | Scope | Circuit Breaker |
+|------|-------|-----------------|
+| **Focused** | Single TTP, narrow scope (specific hosts, users, or segments) | Pause at 25 min |
+| **Broad** | Multiple TTPs, org-wide sweep, environment-wide scope | Pause at 45 min |
+Tier is determined by `digital-workers:hypothesis-builder` during Phase 0 based on the number of TTPs, data sources, and scope breadth. The analyst can override: "keep it focused" caps at Focused, "go broad" forces Broad.
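The tier table and override phrases can be expressed as a lookup. This is a minimal sketch in Python for human readers, not something the skill runs. The 25 and 45 minute limits and the two override phrases come from the text above; the function names and the substring matching are assumptions.

```python
from typing import Optional

# Circuit breaker limits per tier, taken from the Hunt Tiers table.
CIRCUIT_BREAKER_MINUTES = {"Focused": 25, "Broad": 45}

# Analyst override phrases named in the text above.
ANALYST_OVERRIDES = {"keep it focused": "Focused", "go broad": "Broad"}


def resolve_tier(builder_tier: str, analyst_message: Optional[str] = None) -> str:
    """Use the hypothesis-builder's tier unless the analyst overrides it."""
    if analyst_message:
        lowered = analyst_message.lower()
        for phrase, tier in ANALYST_OVERRIDES.items():
            if phrase in lowered:
                return tier
    return builder_tier


def should_pause(tier: str, elapsed_minutes: float) -> bool:
    """True once the hunt has reached its tier's circuit breaker."""
    return elapsed_minutes >= CIRCUIT_BREAKER_MINUTES[tier]
```

For example, `should_pause("Focused", 25)` is true while `should_pause("Broad", 25)` is not, which matches the pause points in the table.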
+---
+## Completion Model — Confidence-Based
+Hunts do NOT have a hard query budget. Unlike investigations (which cap at 5/15/25 queries), hunts are driven by confidence. The goal is to reach 90% confidence that the hypothesis has been adequately tested.
+Track confidence across three dimensions:
+| Dimension | Weight | What It Measures |
+|-----------|--------|------------------|
+| **Data Coverage** | 40% | Have we queried all data sources in the availability map? |
+| **TTP Coverage** | 40% | Have we tested all behaviors from the hypothesis? |
+| **Enrichment Depth** | 20% | For findings, have we pivoted, enriched IOCs, checked TI? |
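+Overall confidence is the weighted average of the three dimensions (0.40 × Data Coverage + 0.40 × TTP Coverage + 0.20 × Enrichment Depth). The hunt concludes once it reaches **90%** or higher.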
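The weighting can be checked with a few lines of arithmetic. This Python sketch is for human readers and is not something the skill runs. The weights and the 90% threshold come from the table above; the function names and the choice to pass each dimension as a fraction between 0 and 1 are assumptions.

```python
# Weights from the confidence table; they sum to 1.0.
WEIGHTS = {"data_coverage": 0.40, "ttp_coverage": 0.40, "enrichment_depth": 0.20}
CONCLUDE_AT = 0.90


def overall_confidence(data_coverage: float, ttp_coverage: float,
                       enrichment_depth: float) -> float:
    """Weighted average of the three dimensions, each given as a 0..1 fraction."""
    return (WEIGHTS["data_coverage"] * data_coverage
            + WEIGHTS["ttp_coverage"] * ttp_coverage
            + WEIGHTS["enrichment_depth"] * enrichment_depth)


def hunt_can_conclude(confidence: float) -> bool:
    """True when confidence meets the 90% threshold."""
    return confidence >= CONCLUDE_AT


# 3 of 4 data sources queried, 2 of 3 TTPs tested, every finding enriched.
example = overall_confidence(3 / 4, 2 / 3, 1.0)
```

Because Data Coverage and TTP Coverage carry 80% of the weight between them, full enrichment alone cannot lift a hunt over the threshold: the example above comes to roughly 77%.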
+**Circuit breaker output when time limit hit:**
+```
+HUNT PAUSED — Circuit breaker at 25 min (Focused tier)
+  Queries executed: 22
+  Confidence: 78% — below 90% threshold
+  Untested: T1021.001 via [connector] (email-based lateral indicators)
+  Options:
+    "continue" — extend the hunt to reach 90% confidence
+    "wrap up"  — conclude with current findings (78% confidence noted in report)
+```
+---
+## Phase 0: Hypothesis Intake
+### Entry: Analyst provides hunt request
+**Step 1: Detect trigger mode** (see Step 0 table above).
+**Step 2: Invoke `digital-workers:hypothesis-builder`**
+Always invoke the hypothesis-builder — even for direct hypotheses. It performs critical functions:
+- **Detects hunt history tier** (Tier 1: zero history, Tier 2: local `docs/hunts/`, Tier 3: integrated via Linear/Notion/JIRA MCP)
+- Maps the hypothesis to specific MITRE ATT&CK techniques
+- Identifies required data sources per TTP
+- Determines hunt tier (Focused vs. Broad)
+- Defines confidence targets per dimension
+- Validates that the hypothesis is testable (falsifiable, scoped, time-bounded)
+**For Suggest Mode (default when no hypothesis provided):**
+The hypothesis-builder detects the hunt history tier, discovers available connectors, maps coverage to MITRE ATT&CK, and presents **Top 10 ranked recommendations** scored across 5 dimensions (data availability, TTP risk impact, never tested, TI relevance, environment change). The analyst selects a recommendation, and the hunt proceeds from that selection.
+**Step 3: Announce and proceed**
+Announce the tier and hypothesis, then continue immediately to Phase 1:
+> "**Focused hunt** — T1059.001 PowerShell execution on domain controllers, 7d lookback. 1 TTP, 2 connectors."
+> "**Broad hunt** — 4 TTPs across 5 connectors, environment-wide scope. Lateral movement campaign indicators."
+**Phase 0 Exit — Save artifacts:**
+1. Create the hunt directory: `docs/hunts/YYYY-MM-DD-<hypothesis-slug>/`
+2. Write `hypothesis.md` — testable hypothesis, TTPs, scope, tier, confidence targets
+**Phase 0 Exit Criteria:** Testable hypothesis defined, TTPs mapped, tier assigned. **Continue immediately to Phase 1.**
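The hunt directory name follows the pattern `docs/hunts/YYYY-MM-DD-<hypothesis-slug>/`. The source does not say how the slug is derived, so the rule below (lowercase, alphanumeric words joined by hyphens, capped at six words) is purely an assumption, as is the `hunt_directory` name. The sketch only builds the path string for illustration; the skill creates the directory and files with the Write tool.

```python
import re
from datetime import date


def hunt_directory(hypothesis: str, day: date, max_words: int = 6) -> str:
    """Build a hunt directory path from a hypothesis and a date (slug rule is hypothetical)."""
    words = re.findall(r"[a-z0-9]+", hypothesis.lower())
    slug = "-".join(words[:max_words])
    return f"docs/hunts/{day.isoformat()}-{slug}/"
```

Under these assumptions, a hypothesis of "RDP lateral movement on domain controllers" on 2025-01-15 maps to `docs/hunts/2025-01-15-rdp-lateral-movement-on-domain-controllers/`.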
+**HARD GATE — Phase 0 Exit:**
+Before proceeding to Phase 1, verify:
+- [ ] `digital-workers:hypothesis-builder` was invoked (not just hypothesis.md written — the skill performs TTP mapping, data source identification, and tier determination that the orchestrator must not replicate)
+- [ ] `hypothesis.md` has been written to the hunt directory
+- [ ] Hypothesis is testable (falsifiable, scoped, time-bounded)
+- [ ] TTPs are mapped to MITRE ATT&CK technique IDs
+- [ ] Hunt tier is assigned (Focused or Broad)
+- [ ] Circuit breaker time is set
+If `hypothesis-builder` was not invoked, STOP. Invoke it now.
+If `hypothesis.md` does not exist, STOP. Write it now. Do NOT proceed to Phase 1 without both.
+Write the artifact NOW, before the next phase begins. The artifact is the gate — if it doesn't exist, the phase isn't complete.
+Continue immediately to Phase 1 — do not stop or wait for user input.
+---
+## Phase 1: Hunt Planning & Data Availability Mapping
+### Entry: Hypothesis defined with TTPs and scope
+**Step 1: Discover connectors**
+Call `FSQL_Connectors` to get the current connector landscape. Classify each connector as relevant or irrelevant to this hunt.
+**Step 2: Build the Data Availability Map**
+For each relevant connector:
+- **Static Schema connectors:** Fixed API contracts, documented at `https://docs.query.ai/docs/<connector-slug>`. Fetch docs to confirm supported event types.
+- **Dynamic Schema connectors:** Customer-defined OCSF mappings. Use `Search_FSQL_SCHEMA` to discover actual coverage. These require verification — field paths may differ from standard OCSF.
+**Step 3: Read environment profile**
+Read `digital-workers/learned/environment-profile.json` if it exists. Cross-reference known gaps, known false positives, and query performance hints.
+**Step 4: Catalog threat intel connectors**
+1. From the `FSQL_Connectors` output, identify TI connectors
+2. Classify: Reputation, OSINT/Threat Feed, Vulnerability Intel, Infrastructure Recon, Geolocation
+3. For each TI connector, use `Search_FSQL_SCHEMA` to find queryable fields
+4. Build TI section of the data availability map
+**Step 5: Define query strategy**
+Invoke `digital-workers:fsql-expert` to plan the query sequence. Pre-validate planned queries via `Validate_FSQL_Query`.
+**Step 6: Set scope and circuit breaker**
+Define time range, target hosts/users/segments. Apply the circuit breaker from the tier:
+- Focused: 25 min
+- Broad: 45 min
+**Data Availability Map format:**
+```
+DATA AVAILABILITY MAP
+━━━━━━━━━━━━━━━━━━━━
+Hypothesis: [statement]
+STATIC SCHEMA CONNECTORS:
+  [Connector Name] (static)
+    Supports: [event types from docs]
+    Source: Query docs
+DYNAMIC SCHEMA CONNECTORS:
+  [Connector Name] (dynamic)
+    Mapped event classes: [confirmed via schema search]
+    NOTE: Customer-defined mappings — verify paths
+TTP COVERAGE:
+  [Technique ID] ([Name])
+    Need: [event type] with [fields]
+    Available: [YES/NO/PARTIAL]
+    Status: [AVAILABLE / PARTIAL / COVERAGE GAP]
+THREAT INTEL CONNECTORS:
+  [Discovered at runtime]
+GAPS IDENTIFIED:
+  1. [description] -> Impact: [which TTPs blocked]
+```
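The TTP COVERAGE section of the map boils down to one record per technique plus a derived status. A minimal Python sketch of that structure, for human readers only, follows. The class and field names are hypothetical, and `connector-a` is a placeholder, since real connector names are discovered at runtime via `FSQL_Connectors`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TtpCoverage:
    technique_id: str
    name: str
    connectors: List[str] = field(default_factory=list)  # sources able to test this TTP
    partial: bool = False  # True when only some of the needed fields are available

    @property
    def status(self) -> str:
        """Derive the map's Status value from the available connectors."""
        if not self.connectors:
            return "COVERAGE GAP"
        return "PARTIAL" if self.partial else "AVAILABLE"


def coverage_gaps(entries: List[TtpCoverage]) -> List[str]:
    """Technique IDs with no data source; these feed the gap list."""
    return [e.technique_id for e in entries if e.status == "COVERAGE GAP"]


entries = [
    TtpCoverage("T1021.001", "Remote Desktop Protocol", ["connector-a"]),
    TtpCoverage("T1059.001", "PowerShell"),
]
```

Here `coverage_gaps(entries)` returns only `T1059.001`, the technique that would need a documented coverage gap before leaving Phase 1.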
+**Phase 1 Exit — Save artifacts:**
+1. Write `data-map.md` — full data availability map
+2. Append to `queries.md` — any schema discovery queries run
+**Phase 1 Exit Criteria:** Data availability map complete, query strategy defined, scope set, circuit breaker configured. **Continue immediately to Phase 2.**
+**HARD GATE — Phase 1 Exit:**
+Before proceeding to Phase 2, verify:
+- [ ] `data-map.md` has been written to the hunt directory
+- [ ] At least one connector identified per TTP in the hypothesis
+- [ ] Coverage gaps documented for any TTP without a data source
+- [ ] Query strategy defined (which event types, which connectors, what order)
+If `data-map.md` does not exist, STOP. Write it now. Do NOT proceed to Phase 2 without it.
+Write the artifact NOW, before the next phase begins. The artifact is the gate — if it doesn't exist, the phase isn't complete.
+Continue immediately to Phase 2 — do not stop or wait for user input.
+---
+## Phase 2: Investigation (The Hunt)
+### Entry: Data availability map complete, query strategy defined
+**Step 1: Execute queries**
+Invoke `digital-workers:fsql-expert` to execute the planned query sequence. Follow the layered approach:
+- **Layer 1a: Discovery scans** — broad-but-lightweight `*.message, *.time` queries to find where IOCs/patterns appear
+- **Layer 1b: Targeted detail** — specific field selectors on event types identified in Layer 1a
+- **Layer 1c: SUMMARIZE for hypothesis quantification** — when your hypothesis is about prevalence or scope ("is this TTP widespread?", "how many hosts show this behavior?"), use SUMMARIZE after Layer 1a confirms the TTP exists:
+  ```
+  -- How many hosts show this behavior?
+  SUMMARIZE COUNT DISTINCT process_activity.device.hostname
+  WITH process_activity.process.cmd_line ICONTAINS '-encodedcommand' AFTER 7d
+  -- How many events per host? (concentrated vs distributed)
+  SUMMARIZE COUNT process_activity.message GROUP BY process_activity.device.hostname
+  WITH process_activity.process.cmd_line ICONTAINS '-encodedcommand' AFTER 7d
+  -- Detection coverage: are these being caught?
+  SUMMARIZE COUNT detection_finding.message GROUP BY detection_finding.message, detection_finding.status_id
+  WITH detection_finding.attacks.technique.uid = 'T1059.001' AFTER 7d
+  ```
+  SUMMARIZE answers "how much?" and "how widespread?". QUERY answers "what exactly happened?". Use both — SUMMARIZE for scope, QUERY for detail on the most interesting hits.
+  > **Constraints:** SUMMARIZE has known execution limits: filtering on `status_id` fails for `detection_finding` (use GROUP BY instead), `FROM` is not supported, and a high-cardinality GROUP BY can overflow. If SUMMARIZE returns empty, fall back to QUERY. See fsql-expert Layer 1c for workarounds, and check `summarize_support` in the environment profile.
+- **Telemetry pivots** — for any findings, pivot from detection data to raw telemetry (process_activity, file_activity, authentication, network_activity)
+**Artifact timing — write as you go, not at the end:**
+- `queries.md`: Append after EVERY query execution. Do not batch-write at phase exit.
+  The audit trail must be current at all times — if the hunt is interrupted, the
+  log must reflect all work completed to that point.
+**Step 2: Invoke specialists (MANDATORY when trigger conditions are met)**
+Based on what the hunt uncovers:
+| Finding Type | Specialist |
+|-------------|------------|
+| Identity/auth anomalies | `digital-workers:identity-investigator` |
+| Network/lateral indicators | `digital-workers:network-investigator` |
+| IOCs surfaced | `digital-workers:threat-intel-enricher` |
+**MANDATORY: You MUST invoke specialist skills when their trigger conditions are met.**
+Do NOT perform specialist work yourself. The specialist skills exist because they apply domain-specific logic, query patterns, and quality checks that the orchestrator does not replicate.
+- If authentication/identity findings surface → invoke `digital-workers:identity-investigator`
+- If network/lateral indicators surface → invoke `digital-workers:network-investigator`
+- If IOCs surface → invoke `digital-workers:threat-intel-enricher`
+Similarly:
+- Phase 0: You MUST invoke `digital-workers:hypothesis-builder` (already stated, reinforced here)
+- Phase 3: You MUST invoke `digital-workers:hunt-pattern-analyzer`
+- Phase 4: You MUST invoke `digital-workers:detection-engineer`
+**Step 3: Track confidence**
+After every query, update the confidence dimensions:
+- Data Coverage: [queried sources] / [total sources in map]
+- TTP Coverage: [tested techniques] / [total techniques in hypothesis]
+- Enrichment Depth: [enriched findings] / [total findings]
+**Step 4: Live progress reporting (MANDATORY)**
+After EVERY query, report status to the analyst:
+```
+━━━ HUNT STATUS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Phase 2: Investigation | Query 17 | 18 min elapsed
+Hypothesis: [hypothesis statement]
+Currently testing: T1021.001 — authentication patterns across [connector]
+Coverage: Data 3/4 (75%) | TTPs 2/3 (67%) | Confidence: ~80%
+Findings so far: 1 suspicious pattern, 0 active threats
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+**Finding alerts — announce immediately when discovered:**
+```
+FINDING: Suspicious authentication pattern detected
+   3 service accounts with RDP logins from 5+ unique source IPs in 24h
+   Querying lateral movement indicators next...
+```
+**Active threat escalation — stop the hunt immediately:**
+```
+ACTIVE THREAT DETECTED — Pausing hunt
+   Evidence of ongoing credential stuffing against service accounts
+   Handing off to alert-investigation for formal triage...
+```
+**Step 5: Evidence quality check at phase exit**
+Before transitioning to Phase 3, verify:
+- All planned data sources were queried (or gaps documented)
+- All TTPs from the hypothesis were tested
+- Findings have been enriched (pivots, TI, specialist analysis)
+- Every query is logged in `queries.md`
+**Assumptions & Caveats — MANDATORY for dynamic connectors:**
+Every empty result against a dynamic connector must document:
+```
+QUERY: [the FSQL query]
+CONNECTOR: [name] (dynamic schema)
+RESULT: Empty — 0 records
+POSSIBLE EXPLANATIONS:
+  (a) Data genuinely does not exist
+  (b) Data exists but mapped to different OCSF path
+  (c) Time range or filter excluded records
+RECOMMENDATION: Verify field mapping in Query UI
+```
+**Phase 2 Exit — Save artifacts:**
+1. Append to `queries.md` — all hunt queries with result counts and findings
+2. Write initial `findings.md` if findings exist
+3. Write `iocs.md` if IOCs were discovered
+**Phase transition announcement:**
+```
+━━━ PHASE TRANSITION ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Moving to Phase 3: Pattern & Attack Discovery
+Phase 2 complete: 22 queries | 3 findings | 1 coverage gap | 24 min
+Confidence at Phase 2 exit: 92%
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+**MANDATORY: Before transitioning to Phase 3, invoke `digital-workers:hunt-quality-checker`.**
+This is the hunt equivalent of `evidence-quality-checker` at Gate 2/3. Do NOT skip it.
+If any check FAILs, fix the issue before proceeding.
+**Phase 2 Exit Criteria:** Confidence at 90%+ (or circuit breaker hit with analyst decision), all queries logged, findings documented. **Continue immediately to Phase 3.**
+---
+## Phase 3: Pattern & Attack Discovery
+### Entry: Investigation phase complete with findings
+**Step 1: Invoke `digital-workers:hunt-pattern-analyzer`**
+Pass all findings, queries, and the hypothesis for pattern analysis.
+**Step 2: Classify findings**
+Each finding gets classified:
+| Classification | Description | Next Action |
+|---------------|-------------|-------------|
+| **Active Threat** | Confirmed malicious activity in progress | STOP hunt. Hand off to `alert-investigation` with evidence package. |
+| **Suspicious Pattern** | Anomalous behavior that warrants monitoring but is not confirmed malicious | Document. Feed to Phase 4 for detection creation. |
+| **Policy Gap** | Behavior that should be prevented by policy but isn't | Document. Recommend policy update. |
+| **Coverage Gap** | Data source or detection missing for a TTP | Document. Feed to Phase 4 for gap remediation. |
+| **Clean** | Hypothesis tested, no evidence found | Document with confidence level and caveats. |
+**HARD GATE — Active Threat Detection:**
+If ANY finding is classified as Active Threat at ANY point during the hunt:
+1. STOP the hunt immediately. Do NOT execute any more hunt queries.
+2. Write all accumulated artifacts (hypothesis.md, data-map.md, queries.md, findings.md).
+3. Hand off to `alert-investigation` with the evidence package.
+4. The hunt is PAUSED. It may resume only after the active threat investigation completes.
+This is non-negotiable. An active threat takes priority over hunt completion.
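The classification table and the active-threat gate can be read as a mapping plus one stop condition. This Python sketch is for human readers and is not run by the skill. The five classifications and their next actions are paraphrased from the table above; the enum, dictionary, and function names are hypothetical.

```python
from enum import Enum
from typing import Iterable


class Classification(Enum):
    ACTIVE_THREAT = "Active Threat"
    SUSPICIOUS_PATTERN = "Suspicious Pattern"
    POLICY_GAP = "Policy Gap"
    COVERAGE_GAP = "Coverage Gap"
    CLEAN = "Clean"


# Next actions, paraphrased from the classification table.
NEXT_ACTION = {
    Classification.ACTIVE_THREAT: "Stop hunt; hand off to alert-investigation",
    Classification.SUSPICIOUS_PATTERN: "Document; feed Phase 4 detection creation",
    Classification.POLICY_GAP: "Document; recommend policy update",
    Classification.COVERAGE_GAP: "Document; feed Phase 4 gap remediation",
    Classification.CLEAN: "Document with confidence level and caveats",
}


def must_stop_hunt(findings: Iterable[Classification]) -> bool:
    """The hard gate: a single Active Threat finding stops the hunt."""
    return any(c is Classification.ACTIVE_THREAT for c in findings)
```

The gate is deliberately a plain `any`: no number of Clean or Suspicious Pattern findings offsets one Active Threat.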
+**Phase 3 Exit — Save artifacts:**
+1. Write `findings.md` — all classified findings with evidence
+2. Update `queries.md` if additional pattern analysis queries were run
+**Phase 3 Exit Criteria:** All findings classified, active threats escalated, patterns identified. **Continue immediately to Phase 4.**
+---
+## Phase 4: Detection Automation
+### Entry: Findings classified, patterns identified
+**You MUST complete Phase 4 before writing report.md.** Phase 4 is not optional.
+Every hunt produces detections — even clean hunts produce zero-baseline detections
+and gap remediation plans. If findings are clean, generate detections that would
+catch the hypothesized behavior if it occurs in the future.
+You MUST invoke `digital-workers:detection-engineer` to generate detections.
+Do NOT write detections inline — the specialist skill ensures consistent format,
+Sigma rule quality, and recipe generation.
+**Step 1: Invoke `digital-workers:detection-engineer`**
+For each finding classified as Suspicious Pattern, Policy Gap, or Coverage Gap:
+- Generate FSQL detection queries
+- Generate Sigma rules
+- Generate Query recipes where applicable
+**Step 2: Test detections**
+Test generated detections against historical data to validate they would have caught the pattern. Document true positive rate, false positive rate, and any tuning needed.
+**Step 3: Generate gap remediation plan**
+For every coverage gap identified in Phases 1-3:
+- What data source is missing or incomplete
+- What the gap means for detection capability
+- Recommended remediation (enable logging, add connector, adjust mapping)
+- Priority (critical gap vs. nice-to-have)
+**Phase 4 Exit — Save artifacts:**
+1. Write `detections.md` — all generated detections with test results
+2. Write `gaps.md` — gap remediation plan
+3. Write `report.md` — full hunt report
+**Phase 4 Exit Criteria:** Detections generated and tested, gap remediation plan complete, hunt report written.
+---
+## Hunt Artifact Structure
+```
+docs/hunts/YYYY-MM-DD-<hypothesis-slug>/
+  hypothesis.md     <- Phase 0 output
+  data-map.md       <- Phase 1 output
+  queries.md        <- Every FSQL query + results (ALL phases)
+  findings.md       <- Phase 2 initial draft (if findings exist), finalized in Phase 3
+  detections.md     <- Phase 4 output
+  gaps.md           <- Phase 4 output: gap remediation plan
+  report.md         <- Phase 4 output: full hunt report
+  iocs.md           <- Phase 2 output: IOCs discovered (if any)
+```
+Create the hunt directory at Phase 0 exit. **Use the Write tool** (never Bash/echo) to save artifacts. Write markdown, not JSON.
+**The `queries.md` log is mandatory.** After every FSQL query execution, whether it succeeds, fails, or returns empty, append the query text, result count, and a 1-line summary. This is the audit trail.
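One possible shape for a `queries.md` entry is sketched below. The source requires only the query text, the result count, and a short summary for every execution; the heading style, the `Outcome` field, and the `format_query_entry` name are assumptions. The function only builds the markdown text for illustration. In the skill, the append itself goes through the Write tool, never a script.

```python
def format_query_entry(number: int, query: str, result_count: int,
                       summary: str, outcome: str = "success") -> str:
    """Render one audit-trail entry as markdown (layout is hypothetical)."""
    lines = [
        f"### Query {number}",
        f"- Query: `{query}`",
        f"- Outcome: {outcome}",  # success, failed, or empty: all three get logged
        f"- Result count: {result_count}",
        f"- Summary: {summary}",
        "",
    ]
    return "\n".join(lines)


entry = format_query_entry(
    3,
    "SUMMARIZE COUNT DISTINCT process_activity.device.hostname "
    "WITH process_activity.process.cmd_line ICONTAINS '-encodedcommand' AFTER 7d",
    0,
    "No hosts matched; see the empty-result caveat for dynamic connectors.",
    outcome="empty",
)
```

Logging empty and failed queries in the same format as successful ones keeps the trail complete even when a hunt is interrupted.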
+---
+## Live Progress Reporting — MANDATORY
+The analyst must never stare at a blank screen during a hunt. Report status after every query.
+### After Every Query
+```
+━━━ HUNT STATUS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Phase 2: Investigation | Query 17 | 18 min elapsed
+Hypothesis: [hypothesis statement]
+Currently testing: T1021.001 — authentication patterns across [connector]
+Coverage: Data 3/4 (75%) | TTPs 2/3 (67%) | Confidence: ~80%
+Findings so far: 1 suspicious pattern, 0 active threats
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+### Phase Transitions
+```
+━━━ PHASE TRANSITION ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Moving to Phase 3: Pattern & Attack Discovery
+Phase 2 complete: 22 queries | 3 findings | 1 coverage gap | 24 min
+Confidence at Phase 2 exit: 92%
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+### Finding Alerts
+```
+FINDING: Suspicious authentication pattern detected
+   3 service accounts with RDP logins from 5+ unique source IPs in 24h
+   Querying lateral movement indicators next...
+```
+### Active Threat Escalation
+```
+ACTIVE THREAT DETECTED — Pausing hunt
+   Evidence of ongoing credential stuffing against service accounts
+   Handing off to alert-investigation for formal triage...
+```
+### Circuit Breaker
+```
+HUNT PAUSED — Circuit breaker at 25 min (Focused tier)
+  Queries executed: 22
+  Confidence: 78% — below 90% threshold
+  Untested: T1021.001 via [connector] (email-based lateral indicators)
+  Options:
+    "continue" — extend the hunt to reach 90% confidence
+    "wrap up"  — conclude with current findings (78% confidence noted in report)
+```
+---
+## Connector Documentation Awareness
+Two types of connectors exist in the Query mesh:
+- **Static Schema:** Fixed API contracts, documented at `https://docs.query.ai/docs/<connector-slug>`. Schema is predictable.
+- **Dynamic Schema:** Customer-defined OCSF mappings. Use `Search_FSQL_SCHEMA` to discover actual coverage. Field paths may differ from standard OCSF.
+Always verify dynamic schema connectors before writing queries. Do not assume field paths exist — confirm via `Search_FSQL_SCHEMA`.
+---
+## Integration with alert-investigation
+If a finding is classified as an active threat at any point during the hunt:
+1. STOP the hunt immediately
+2. Package the evidence: hypothesis, queries, findings, IOCs
+3. Hand off to `alert-investigation` — the hunt evidence becomes the intake
+4. The hunt pauses until the active threat is resolved
+5. After resolution, the hunt may resume to complete remaining TTPs
+---
+## Red Flags
+| Red Flag | Correct Action |
+|----------|---------------|
+| "Let's just query everything and see what we find" | STOP. That's a fishing expedition, not a hunt. Start with a hypothesis. |
+| Analyst says "let's hunt" with no hypothesis and you run the qualification gate | STOP. Default to suggest mode. Present data-driven Top 10 recommendations. The qualification gate is for when suggest mode output needs narrowing, not for vague prompts. |
+| Starting Phase 2 without a data availability map | STOP. Go back to Phase 1. Blind queries waste time and produce false confidence. |
+| Skipping `Validate_FSQL_Query` before `Execute_FSQL_Query` | STOP. Validate every query. No exceptions. |
+| Running 45 minutes on a Focused hunt without pausing | STOP. Circuit breaker exists for a reason. Pause, report confidence, let the analyst decide. |
+| Not reporting progress after queries | STOP. The analyst must never stare at a blank screen. Report status after every query. |
+| Classifying "no data" as "clean" | STOP. No data may mean a coverage gap, not a clean environment. Check the data availability map. |
+| Active threat found, continuing the hunt | STOP. Active threats get immediate escalation. The hunt pauses. |
+| Fire-and-forget on MCP tools | STOP. `FSQL_Query_Generation` output needs review. Iterate. Validate. Cross-reference with schema. |
+| Not saving artifacts at phase exits | STOP. Write `hypothesis.md`, `data-map.md`, `queries.md`, `findings.md`, `detections.md`, `gaps.md`. The audit trail is the value. |
+| Hardcoding connector names | STOP. All connectors discovered at runtime via `FSQL_Connectors`. |
+| Empty results on dynamic connector without assumptions caveat | STOP. Document possible explanations. Empty does not equal confirmed absent for dynamic schema. |
+| Presenting hunt report without gap remediation plan | STOP. Gaps are a first-class deliverable. Every hunt produces detections AND gap remediation. |
+| Using Bash, cat, python, or jq to process data | STOP. **Never use shell commands to process hunt data.** Analyze MCP results directly as an LLM. |