@query-ai/digital-workers 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/marketplace.json +27 -0
- package/.claude-plugin/plugin.json +11 -0
- package/README.md +430 -0
- package/hooks/hooks.json +16 -0
- package/hooks/run-hook.cmd +4 -0
- package/hooks/session-start +32 -0
- package/package.json +16 -0
- package/skills/alert-classifier/SKILL.md +111 -0
- package/skills/alert-investigation/SKILL.md +838 -0
- package/skills/detection-engineer/SKILL.md +170 -0
- package/skills/evidence-quality-checker/SKILL.md +109 -0
- package/skills/fsql-expert/SKILL.md +308 -0
- package/skills/fsql-expert/fsql-reference.md +525 -0
- package/skills/hunt-pattern-analyzer/SKILL.md +150 -0
- package/skills/hunt-quality-checker/SKILL.md +105 -0
- package/skills/hypothesis-builder/SKILL.md +303 -0
- package/skills/identity-investigator/SKILL.md +172 -0
- package/skills/itdr/SKILL.md +1178 -0
- package/skills/network-investigator/SKILL.md +196 -0
- package/skills/report-writer/SKILL.md +158 -0
- package/skills/senior-analyst-review/SKILL.md +199 -0
- package/skills/severity-scorer/SKILL.md +131 -0
- package/skills/templates/org-policy-template.md +516 -0
- package/skills/templates/runbook-template.md +300 -0
- package/skills/threat-hunt/SKILL.md +628 -0
- package/skills/threat-intel-enricher/SKILL.md +127 -0
- package/skills/using-digital-workers/SKILL.md +76 -0

@@ -0,0 +1,196 @@
---
name: network-investigator
description: Use when investigation involves network activity, lateral movement, C2 communication, unusual traffic patterns, or network-based IOCs
---

# Network Investigator

## Iron Law

**MAP THE FULL COMMUNICATION PATTERN, NOT JUST THE SINGLE CONNECTION.**

A single suspicious connection means nothing without context. How often does it occur? What's the pattern? What other hosts are involved? Build the full picture.

## When to Invoke

Called by `alert-investigation` when the `alert-classifier` identifies:
- Alert type: Network/Lateral
- IOCs include: IP addresses, domains, ports, hostnames
- MITRE techniques: T1071 (Application Layer Protocol), T1572 (Protocol Tunneling), T1021 (Remote Services), T1570 (Lateral Tool Transfer)

## Investigation Process

Use `digital-workers:fsql-expert` for ALL queries below.

### Step 1: Map the Connection

From the alert, identify the primary connection:
- Source endpoint (IP, hostname, port)
- Destination endpoint (IP, hostname, port)
- Protocol and service
- Direction (inbound, outbound, internal)
- Timestamp and duration

### Step 2: Source Endpoint Activity

What else has the source endpoint been doing?

```
QUERY #network.src_endpoint.ip, #network.dst_endpoint.ip, #network.dst_endpoint.port,
      #network.message, #network.time
WITH #network.src_endpoint.ip = '<source_ip>' AFTER 24h
```

Look for:
- **Volume**: How many connections from this host? Normal baseline?
- **Destinations**: How many unique destinations? Any known-bad?
- **Ports**: What destination ports? Any unusual (4444, 8080, 6666)?
- **Protocols**: Expected protocols for this host type?
- **Data volume**: Unusual bytes transferred?

**SUMMARIZE for network pattern analysis:**

Use SUMMARIZE to quantify network behavior before examining individual connections:

```
-- Outbound connection volume by source (scanning detection)
SUMMARIZE COUNT network_activity.message GROUP BY network_activity.src_endpoint.ip
WITH network_activity.src_endpoint.ip IN '<ip1>', '<ip2>' AFTER 7d

-- Port scan breadth (how many unique ports targeted?)
SUMMARIZE COUNT DISTINCT network_activity.dst_endpoint.port
WITH network_activity.src_endpoint.ip = '<ip>' AFTER 7d

-- Destination diversity (how many unique IPs contacted?)
SUMMARIZE COUNT DISTINCT network_activity.dst_endpoint.ip
WITH network_activity.src_endpoint.ip = '<ip>' AFTER 7d

-- Lateral movement scope (internal destinations per source)
SUMMARIZE COUNT DISTINCT network_activity.dst_endpoint.ip
GROUP BY network_activity.src_endpoint.ip
WITH network_activity.dst_endpoint.port IN 3389, 445, 22, 5985 AFTER 7d
```

SUMMARIZE answers "how much?" and "how broad?". QUERY answers "to where exactly?" and "when?".

> **Constraints:** SUMMARIZE has known execution limits — `status_id` filtering fails on detection_finding (use GROUP BY instead), `FROM` is not supported, and high-cardinality GROUP BY can overflow. If SUMMARIZE returns empty, fall back to QUERY. See fsql-expert Layer 1c for workarounds and check `summarize_support` in the environment profile.

### Step 3: Destination Analysis

Is the destination known-good, known-bad, or unknown?

```
QUERY #network.src_endpoint.ip, #network.dst_endpoint.ip, #network.dst_endpoint.port,
      #network.message, #network.time
WITH #network.dst_endpoint.ip = '<dest_ip>' AFTER 7d
```

Look for:
- How many internal hosts connect to this destination?
- Is this a known service/infrastructure IP?
- First-seen: is this a new destination?
- Pass to `threat-intel-enricher` for reputation

### Step 4: Lateral Movement Detection

Search for the source endpoint connecting to internal systems:

```
QUERY #network.src_endpoint.ip, #network.dst_endpoint.ip, #network.dst_endpoint.port,
      #network.message, #network.time
WITH #network.src_endpoint.ip = '<source_ip>'
  AND #network.dst_endpoint.port IN 22, 135, 139, 445, 3389, 5985, 5986 AFTER 48h
```

These ports indicate remote access attempts (SSH, SMB, RDP, WinRM). Look for:
- Number of unique internal destinations
- Success vs. failure of connections
- Time pattern (scanning = rapid sequential connections)
- Whether the source is authorized for this access
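
The breadth and time-pattern checks above can be scripted once the query results are exported. A minimal sketch, assuming rows have been extracted from the FSQL results as `(timestamp, dst_ip)` pairs; the function name and the `max_gap_s` / `min_targets` thresholds are illustrative assumptions, not values defined by this skill:

```python
from datetime import datetime

def lateral_movement_signals(rows, max_gap_s=5.0, min_targets=10):
    """Flag scanning-like behavior: many unique internal targets hit
    in rapid succession. rows = [(iso_timestamp, dst_ip), ...]"""
    events = sorted((datetime.fromisoformat(ts), ip) for ts, ip in rows)
    targets = {ip for _, ip in events}
    # Median gap between consecutive connections (small = rapid/sequential)
    gaps = sorted((b[0] - a[0]).total_seconds()
                  for a, b in zip(events, events[1:]))
    median_gap = gaps[len(gaps) // 2] if gaps else None
    return {
        "unique_targets": len(targets),
        "median_gap_s": median_gap,
        "scan_like": len(targets) >= min_targets
                     and median_gap is not None
                     and median_gap <= max_gap_s,
    }
```

Tune the thresholds against the environment's baseline before trusting the `scan_like` flag; an admin jump host legitimately touches many internal destinations.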

### Step 5: DNS Activity

If domain names are involved:

```
QUERY dns_activity.message, dns_activity.time, dns_activity.query.hostname,
      dns_activity.src_endpoint.ip
WITH dns_activity.query.hostname = '<domain>' AFTER 7d
```

Look for:
- Which internal hosts queried this domain?
- Query frequency (beaconing = regular intervals)
- DNS record types (TXT records may indicate tunneling)
- Resolution to known-bad IPs
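
"Long subdomain strings" can be checked mechanically over the hostnames returned by the query above. A sketch of one common heuristic (label length plus character entropy); the thresholds are illustrative assumptions and should be calibrated against legitimate CDN/telemetry domains in the environment:

```python
import math
from collections import Counter

def dns_tunnel_suspect(hostname, max_label_len=40, min_entropy=3.5):
    """Heuristic: tunneled data tends to appear as very long,
    high-entropy leftmost labels (e.g. base32 blobs) under a
    fixed parent domain."""
    label = hostname.split(".")[0]
    if not label:
        return False
    counts = Counter(label.lower())
    # Shannon entropy of the label's character distribution
    entropy = -sum((c / len(label)) * math.log2(c / len(label))
                   for c in counts.values())
    return len(label) > max_label_len or entropy >= min_entropy
```

A hit here is a lead, not a verdict: confirm with query volume and record types from Step 5 before calling it tunneling.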

### Step 6: Timeline Reconstruction

Build a chronological view of all network activity for the involved endpoints:

```
-- Layer 1a: Discover all event types for this IP
QUERY *.message, *.time WITH %ip = '<source_ip>' AFTER 48h

-- Layer 1b: Query specific event types found in Layer 1a with targeted fields
```

Order by timestamp. Identify:
- Initial compromise point
- Reconnaissance/scanning phase
- Lateral movement attempts
- Data staging/exfiltration
- C2 communication pattern (interval, data volume, protocol)

## C2 Beaconing Indicators

Look for these patterns in connection data:
- **Regular intervals**: Connections every 30s, 60s, 5m (with jitter)
- **Small data transfers**: < 1KB per connection (heartbeat)
- **Unusual protocols**: HTTP/HTTPS to non-standard ports
- **Long connections**: Persistent connections to external IPs
- **DNS anomalies**: High volume of DNS queries, TXT record lookups, long subdomain strings
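
"Regular intervals with jitter" can be quantified as a low coefficient of variation (CV) of inter-arrival times. A sketch over connection timestamps for one src/dst pair; the `max_cv` cutoff is an illustrative assumption (real jittered beacons often land well under it, human browsing well over it):

```python
import statistics
from datetime import datetime

def beacon_score(timestamps, max_cv=0.2):
    """Low CV of inter-arrival times suggests beaconing: a periodic
    timer plus small jitter. timestamps are ISO-8601 strings for one
    src/dst pair."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    if len(gaps) < 3:
        return None  # too few connections to judge periodicity
    mean = statistics.mean(gaps)
    cv = statistics.stdev(gaps) / mean if mean > 0 else float("inf")
    return {"mean_interval_s": mean, "cv": cv, "beacon_like": cv <= max_cv}
```

Combine the CV with the other indicators above (bytes per connection, port, destination reputation) before concluding C2.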

## Output

```
NETWORK INVESTIGATION:
Primary Connection: [src] -> [dst] ([protocol]:[port])
Direction: [Inbound/Outbound/Internal]

Source Endpoint Profile:
  Hostname: [name]
  Total connections (24h): [count]
  Unique destinations: [count]
  Unusual ports: [list or "None"]

Destination Assessment:
  IP/Domain: [value]
  Reputation: [from threat-intel-enricher]
  Internal hosts connecting: [count]
  First seen: [date]

Lateral Movement: [Detected/Not detected]
  Internal targets: [list of IPs/hostnames]
  Ports targeted: [list]
  Success rate: [X/Y attempts]

C2 Indicators: [Present/Not present]
  Pattern: [description if present]
  Interval: [if beaconing detected]

Timeline: [chronological summary]

IOCs Discovered: [new indicators]
Recommended Follow-Up: [additional queries or skills]
```

**Return this investigation output to the calling orchestrator and continue. Do not present to the user or wait for input — the orchestrator will incorporate findings into the evidence package.**

## Red Flags

| Red Flag | Correct Action |
|----------|---------------|
| "Just one connection to a suspicious IP" | Complete the pattern analysis. One connection may be the tip of the iceberg. |
| "Internal traffic, probably safe" | STOP. Lateral movement IS internal traffic. Investigate. |
| "No data on this destination" | That's a finding. An unknown destination that internal hosts connect to is itself suspicious. Note as unknown, flag for enrichment. |

@@ -0,0 +1,158 @@
---
name: report-writer
description: Use when investigation findings need to be formatted as a response-ready report — produces technical investigation with business summary, switchable to full executive format on request
---

# Report Writer

## Iron Law

**EVERY REPORT IS RESPONSE-READY. IF THE READER CAN'T ACT ON IT, REWRITE IT.**

The person reading this report should know exactly what happened, how confident we are, and what to do next — without needing to re-investigate.

## Recommendation Model — MANDATORY

**NEVER write "INCIDENT DECLARED" in any report.** The Digital Worker recommends — it does not declare incidents. Incident declaration has compliance and business implications that require human judgment.

Use these phrases:
- "Proposed Disposition: Critical Threat — Recommend Incident Escalation"
- "RECOMMEND ESCALATION" (not "INCIDENT DECLARED")
- "Evidence supports incident declaration — analyst action required"
- "All findings require human analyst validation before action"

The report header should use "RECOMMEND ESCALATION" or "Proposed Disposition: Critical Threat", never "INCIDENT DECLARED" or "Status: INCIDENT".

## When to Invoke

Called by `alert-investigation` at Gate 6 (Incident Notification) after the investigation is complete.

## Default Output: Technical Report with Business Summary

### Section 1: Business Summary (ALWAYS PRESENT)

Write in plain English. No jargon. No acronyms without explanation. A non-technical executive should understand this section completely.

Include:
- **What happened** — one sentence
- **Impact** — what systems, data, or operations are affected
- **Current status** — contained? under investigation? resolved?
- **Risk level** — plain language (e.g., "moderate risk to customer data")
- **What's being done** — immediate actions taken or recommended
- **What's needed** — any decisions or actions required from leadership

**Length:** 3-6 sentences. Concise. Direct.

### Section 2: Technical Investigation

#### Proposed Disposition

The disposition below is the worker's **recommended** disposition based on evidence gathered. The analyst confirms or overrides before the disposition is finalized.

- **Critical Threat** — confirmed malicious activity; recommend immediate response and incident escalation
- **Policy Violation** — real activity violating policy, not necessarily malicious
- **False Positive** — detection triggered incorrectly
- **Benign Activity** — real activity that is expected/authorized

#### Five W's

| Question | Answer | Confidence |
|----------|--------|------------|
| **Who** | [user, account, actor] | [Confirmed/High/Medium/Low] |
| **What** | [action or event] | [Confirmed/High/Medium/Low] |
| **When** | [timestamp range] | [Confirmed/High/Medium/Low] |
| **Where** | [systems, IPs, locations] | [Confirmed/High/Medium/Low] |
| **Why** | [assessed motive or cause] | [Confirmed/High/Medium/Low] |

#### MITRE ATT&CK Mapping

- **Tactic:** [tactic name]
- **Technique:** [technique name] ([technique ID])
- **Sub-technique:** [if applicable]

#### Indicators of Compromise

| IOC | Type | Reputation | Context |
|-----|------|-----------|---------|
| [value] | [IP/Hash/Domain/etc.] | [Known Malicious/Suspicious/Unknown/Clean] | [where found, significance] |

#### Evidence Chain

For each key finding, document:
1. **FSQL query executed** (exact query text)
2. **What the results showed** (summary)
3. **Conclusion drawn** (with confidence level)
4. **How this connects** to the overall investigation narrative

#### Correlated Alerts

List any related alerts discovered during investigation:
- Alert title, source, timestamp
- Relationship to primary alert

#### Recommended Next Steps

Specific, actionable items:
- For **Critical Threat**: Containment actions, escalation targets, notification requirements. Present incident criteria assessment — map evidence against criteria and recommend whether to formally declare an incident.
- For **Policy Violation**: Remediation steps, policy owner to notify
- For **False Positive**: Detection tuning recommendation, what to adjust
- For **Benign Activity**: Baseline update, suppression rule suggestion

Include customer operating procedures if a custom skill is loaded (e.g., `acme-escalation-policy`).

### Section 3: Show Your Work (On Request)

Only include when asked or when confidence is low on key findings:
- Complete list of FSQL queries executed with results
- Reasoning chain: how each finding led to the next query
- What was ruled out and why
- **Knowns**: facts supported by data
- **Unknowns**: what couldn't be determined (and why — data unavailable, insufficient time range, connector down)
- **Assumptions**: any inferences made without direct evidence (with rationale)

## Executive Report (On Request)

When asked to produce a business/executive version, restructure the entire report for a non-technical audience:
- Lead with business impact and risk
- Translate all technical findings to business language
- Remove FSQL queries, IOC tables, ATT&CK mappings
- Add: financial impact estimate (if applicable), regulatory implications, reputational risk
- Add: what decisions leadership needs to make
- Keep it to one page / one screen

## Org-Policy Integration

If an org-policy skill is loaded, check it for:
- **Report Format preferences** — which sections to include/exclude, whether to expand Show Your Work by default
- **Escalation Targets** — include org-specific notification routing in Recommended Next Steps
- **Custom escalation procedures** — reference org-specific runbooks or escalation paths

If no org-policy is loaded, use the default format documented above.

## Quality Checklist

Before finalizing any report, verify:

- [ ] Business summary is present and jargon-free
- [ ] Disposition is one of the four valid values
- [ ] All Five W's are answered (or explicitly marked as Unknown)
- [ ] Every conclusion cites specific evidence (query + result)
- [ ] Confidence level is stated for each finding
- [ ] IOCs are listed with type and reputation
- [ ] Next steps are specific and actionable
- [ ] Unknowns are explicitly stated
- [ ] Report is response-ready (reader can act without re-investigating)

**Return the completed report to the calling orchestrator. Do NOT present it directly to the analyst — the orchestrator will run senior-analyst-review first (if triggered) before presenting. Do not wait for user input.**

## Red Flags

| Red Flag | Correct Action |
|----------|---------------|
| "Investigation complete" (without business summary) | STOP. Business summary is mandatory on every report. |
| "Probably a false positive" (without evidence) | STOP. State the evidence that supports the disposition. |
| "Recommend further investigation" (as only next step) | STOP. Be specific. What should be investigated? Which queries? Which systems? |
| Writing technical jargon in the business summary | STOP. Rewrite. If your manager's manager can't understand it, simplify. |
| Skipping the confidence column | STOP. Every finding gets a confidence level. No exceptions. |

@@ -0,0 +1,199 @@
---
name: senior-analyst-review
description: Use when a completed investigation needs quality review — checks evidence completeness, logic, missed indicators, severity calibration, and blind spots. Auto-triggered on HIGH/CRITICAL, manually invocable on any investigation.
---

# Senior Analyst Review

## Iron Law

**REVIEW THE EVIDENCE, NOT THE CONCLUSION. Re-examine what was found and what was missed — then evaluate whether the disposition follows.**

## When to Invoke

- **Automatically** by `alert-investigation` after Gate 6 when the investigation's composite severity is >= 3.0 (DEEP) or the proposed disposition is Critical Threat
- **Manually** by analyst: `skill: "digital-workers:senior-analyst-review"`
- **Configurable** via org-policy skill (`review_trigger: all | high_critical | critical_only | manual`)

**Never** during an investigation. Always after the investigation is complete (post-Gate 6). The review must be independent of the investigation process.

## Evidence Sources

The investigation writes artifacts to the investigation directory (`docs/investigations/YYYY-MM-DD-<description>/`). Use the **Read tool** to review these files during the quality check:

| File | Use During |
|------|-----------|
| `queries.md` | Check 1 (evidence completeness) — were all relevant queries run? |
| `iocs.md` | Check 3 (missed indicators) — were all IOCs followed up? |
| `gate-1-intake.md` | Check 1 — what was the starting scope? |
| `gate-2-enrichment.md` | Check 1, 3 — what was enriched, what was found? |
| `gate-3-severity.md` | Check 4 (severity calibration) — are scores justified? |
| `gate-4-investigation.md` | Check 2 (logic) — does disposition follow from evidence? |
| `report.md` | Check 5, 6 — are recommendations actionable? Are blind spots documented? |

## The Nine Checks

### Check 1: Evidence Completeness

Were all relevant data sources queried?

- Cross-reference the connectors available in the mesh (via `FSQL_Connectors`) against the connectors actually queried during the investigation
- Were all extracted IOCs followed up? Every IOC from Gate 1 should have at least one enrichment query
- Are the Five W's fully populated? Each "W" answered or explicitly marked Unknown with a reason
- Were time ranges appropriate? Not too narrow (missed context), not too broad (noise)
- Were observable searches run? (`%ip`, `%hash`, `%email`, `%domain`) for key indicators

**Flag:** "Data source X is connected but was not queried — relevant because [reason]"
**Flag:** "IOC [value] was extracted in Gate 1 but has no follow-up query"
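
The first two bullets of Check 1 are set arithmetic, and scripting them keeps the review honest. A sketch, assuming the connector names and IOC values have already been extracted into plain lists (the function and argument names are illustrative):

```python
def coverage_gaps(available, queried, ioc_list, enriched_iocs):
    """Check 1 as set differences: connectors never queried, and IOCs
    with no follow-up query. Inputs are plain lists of names/values."""
    unqueried = sorted(set(available) - set(queried))
    orphan_iocs = sorted(set(ioc_list) - set(enriched_iocs))
    return {
        "unqueried_connectors": unqueried,
        "iocs_without_followup": orphan_iocs,
        "complete": not unqueried and not orphan_iocs,
    }
```

Each entry in `unqueried_connectors` still needs a judgment call on relevance before it becomes a flag; the set difference only guarantees nothing is silently skipped.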

### Check 2: Logic Check

Does the proposed disposition follow from the evidence?

- For each conclusion in the report, can you trace it back to a specific query result?
- Are there unsupported leaps? (Conclusions stated without corresponding evidence)
- Is the confidence calibration justified? Each finding's confidence level should match evidence quality
- Were alternative explanations considered? (Could a Critical Threat be a Policy Violation? Could a False Positive be a True Positive with insufficient data?)
- Audit test: if this disposition were challenged in an incident review, would the evidence chain hold?

**Flag:** "Conclusion X is stated with HIGH confidence but the supporting query returned only [N] records"
**Flag:** "Alternative explanation not considered: [scenario]"

### Check 3: Missed Indicators

Are there IOCs in the evidence that weren't followed up?

Review all query results from the investigation:
- IPs that appeared in results but weren't searched across the mesh
- Usernames mentioned but no authentication pattern analysis run
- File hashes found but no threat intel lookup performed
- Domains referenced but no DNS activity check
- Pivot points: query results that contained new IOCs not extracted and pursued

**Flag:** "IP [value] appears in query 3 results but was not searched as an IOC"
**Flag:** "User [name] appears in correlated alerts but identity-investigator was not invoked"

### Check 4: Severity Calibration

Is the scoring justified by the evidence?

- Review each of the five dimension scores against the evidence
- Would different reasonable weights change the depth routing?
- Are override rules correctly applied (or correctly not applied)?
- Check for anchoring bias — was the score influenced by the source tool's severity label rather than contextual evidence?
- If an org-policy defines custom weights or crown jewels, were they applied?

**Flag:** "Asset Criticality scored 3/5 but the compromised host [name] appears in the org crown jewels list"
**Flag:** "Override condition 'lateral movement detected' is met by [evidence] but DEEP was not triggered"

### Check 5: Recommendations Check

Are the response actions specific and actionable?

- "Investigate further" is not actionable. "Query VPN logs for user X in the 24h before the alert" is.
- Are containment recommendations proportional to the threat?
- Are notification targets identified (or flagged as needing org-policy)?
- For Critical Threat: are incident criteria clearly mapped against evidence?
- Can the analyst act on every recommendation without additional research?

**Flag:** "Recommendation 'monitor the situation' lacks specificity — what to monitor, for how long, what triggers escalation?"

### Check 6: Blind Spots

What couldn't be determined — and was that called out?

- Were all unknowns explicitly documented in the report?
- Are there data sources that were unavailable? (Connector down, no data for time range, query returned zero results)
- Are there investigation paths not taken? Should they have been?
- Were critical gaps called out with impact assessment? ("No auth logs available — cannot verify MFA bypass" is good. Silently not checking auth is bad.)

**Flag:** "Query for DNS activity returned zero results but this gap is not mentioned in the report"
**Flag:** "Network flow data was not queried — if available, would show exfiltration volume"

### Check 7: Status Verification

Were all findings cited as evidence verified for `status_id`?

- Review every detection finding cited in the report. Check its `status_id` and `status_detail`.
- Are any RESOLVED/Benign findings cited as active threats?
- Does the alert count in the report include a status breakdown (NEW vs. RESOLVED vs. null)?
- Were `status_detail` values like "UnsupportedAlertType" interpreted correctly?

**Flag:** "Report cites [N] APT detections but [M] are RESOLVED/Benign — these do not support the narrative"
**Flag:** "Alert count of [N] has no status breakdown — [M] are resolved, only [K] are NEW"

### Check 8: Attribution Integrity

Is threat actor attribution supported by independently verified evidence?

- Is the attribution based on NEW/unresolved detections, or on RESOLVED/Benign vendor labels?
- Was the attribution independently verified via threat intel enrichment (hash lookups, IP reputation)?
- If threat intel enrichment was not possible (no hashes, no external IPs), is the confidence ceiling stated?
- Are vendor labels from the same platform counted as independent corroboration? (They shouldn't be.)

**Flag:** "APT28 attribution based entirely on vendor labels — [N] of [M] supporting detections are RESOLVED/Benign"
**Flag:** "Attribution at HIGH confidence but no independent threat intel verification was possible"

### Check 9: Specialist Invocation

If the investigation was routed to DEEP depth, were the appropriate specialist skills invoked?

- Cross-reference alert types from classification against specialist skills available
- Identity/Access alerts → was `identity-investigator` invoked?
- Network/Lateral alerts → was `network-investigator` invoked?
- If specialists were not invoked: is there a documented reason?

**Flag:** "Investigation routed DEEP but identity-investigator was never invoked despite [N] identity-type alerts"
**Flag:** "No specialist skills invoked on a DEEP investigation — Gate 4 specialist requirement was skipped"

## Output

```
SENIOR ANALYST REVIEW
━━━━━━━━━━━━━━━━━━━━
Investigation: [alert title / investigation ID]
Proposed Disposition: [disposition]
Review Verdict: ✅ APPROVED | ⚠️ GAPS IDENTIFIED

Evidence Completeness: [PASS | GAPS]
Logic Check: [PASS | CONCERNS]
Missed Indicators: [NONE FOUND | FOUND]
Severity Calibration: [APPROPRIATE | ADJUSTMENT SUGGESTED]
Recommendations: [ACTIONABLE | NEEDS IMPROVEMENT]
Blind Spots: [DOCUMENTED | UNDOCUMENTED GAPS]
Status Verification: [PASS | FINDINGS ON RESOLVED ALERTS]
Attribution Integrity: [PASS | UNSUPPORTED CLAIMS]
Specialist Invocation: [PASS | SKILLS NOT INVOKED]

[If GAPS IDENTIFIED:]

Issues Found:
1. [Specific issue with category and suggested action]
2. [Specific issue with category and suggested action]

Suggested Follow-Up Queries:
1. [Specific FSQL query or investigation action]
2. [Specific FSQL query or investigation action]

[If APPROVED:]

Investigation is thorough and the proposed disposition is well-supported
by the evidence. Ready for analyst review.
```

## What Happens After Review

**Return the review output to the calling orchestrator immediately. Do not present review results directly to the user or wait for input.**

- **APPROVED**: The orchestrator proceeds immediately to analyst presentation (Gate 6 Step 4).
- **GAPS IDENTIFIED**: The orchestrator runs the suggested follow-up queries, updates the evidence package and Five W's, regenerates the report, and re-submits for review. Maximum 2 review cycles — if gaps persist after 2 cycles, present the investigation with the review notes attached so the analyst sees both the findings and the reviewer's concerns.

## Red Flags

| Red Flag | Correct Action |
|----------|---------------|
| "The investigation looks fine, approved" without running all nine checks | STOP. Run every check. A cursory review is worse than no review. |
| Changing the disposition directly | STOP. The reviewer flags concerns and suggests — it doesn't override. The analyst decides. |
| Reviewing while the investigation is still in progress | STOP. Review runs only on completed investigations. Independence matters. |
| "No gaps found" on a 2-query investigation | STOP. A DEEP investigation with only 2 queries almost certainly has evidence gaps. |
| Skipping Check 3 (Missed Indicators) because "the disposition seems right" | STOP. Missed indicators are how real threats slip through correct-seeming investigations. |
|
@@ -0,0 +1,131 @@

---
name: severity-scorer
description: Use when an alert needs multi-factor risk scoring to determine investigation depth — combines raw severity with asset criticality, business impact, and confidence
---

# Severity Scorer

## Iron Law

**RAW SEVERITY IS NOT RISK. CONTEXT DETERMINES DEPTH.**

A Critical alert on a dev test server may be lower risk than a Medium alert on a production database holding customer PII. Always score contextually.

## When to Invoke

Called by `alert-investigation` at Gate 3 (Analyze Situation) after enrichment is complete.

## Org-Policy Integration

If an org-policy skill is loaded, check it for:
- **Custom dimension weights** (Severity Weights section) — use instead of defaults
- **Custom depth routing thresholds** — use instead of 1.0-1.9 / 2.0-2.9 / 3.0-5.0
- **Crown jewels list** (Crown Jewels section) — use for Asset Criticality scoring instead of inferring
- **Additional override rules** — add to the built-in override list

If no org-policy is loaded, use all defaults documented below.
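One way to model the defaults-plus-overrides behavior is a merge over a parsed policy. This is a sketch under assumptions: the dict keys are illustrative, and real org-policy skills are markdown documents, not dicts:

```python
# Sketch of org-policy integration: policy values replace defaults where
# present, except override rules, which are additive. Key names are
# illustrative assumptions, not the plugin's actual schema.
DEFAULT_WEIGHTS = {
    "alert_severity": 0.20, "asset_criticality": 0.25,
    "business_impact": 0.25, "confidence": 0.15, "threat_context": 0.15,
}
DEFAULT_THRESHOLDS = [(1.0, 1.9, "AUTO-CLOSE"), (2.0, 2.9, "STANDARD"),
                      (3.0, 5.0, "DEEP")]
BUILTIN_OVERRIDES = [
    "lateral_movement", "persistence_mechanism", "credential_compromise",
    "data_exfiltration", "correlated_campaign", "novel_technique",
]

def effective_config(org_policy=None):
    """Return the scoring config with any org-policy customizations applied."""
    policy = org_policy or {}
    return {
        "weights": policy.get("severity_weights", DEFAULT_WEIGHTS),
        "thresholds": policy.get("depth_thresholds", DEFAULT_THRESHOLDS),
        # Override rules extend the built-in list; they never replace it.
        "overrides": BUILTIN_OVERRIDES + policy.get("override_rules", []),
    }
```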

## Scoring Dimensions

Score each dimension 1-5, then compute composite:

### 1. Alert Severity (from source tool)

| Score | Level | Description |
|-------|-------|-------------|
| 5 | FATAL/CRITICAL | Highest severity from detection tool |
| 4 | HIGH | Significant threat indicated |
| 3 | MEDIUM | Moderate concern |
| 2 | LOW | Minor or informational |
| 1 | INFO | Informational only |

### 2. Asset Criticality

| Score | Level | Indicators |
|-------|-------|-----------|
| 5 | Crown Jewel | Production customer data, financial systems, domain controllers, CA servers |
| 4 | High Value | Production application servers, email systems, VPN gateways |
| 3 | Standard | Standard workstations, internal applications |
| 2 | Low Value | Dev/test systems, sandbox environments |
| 1 | Minimal | Decommissioned, isolated lab systems |

If an org-policy skill defines a Crown Jewels list, use it for scoring. If a separate `crown-jewels` skill is loaded, use that. Otherwise, infer from hostname, IP range, and system role.

### 3. Business Impact (Potential)

| Score | Level | Description |
|-------|-------|-------------|
| 5 | Severe | PII/PCI exposure, regulatory notification required, major service outage |
| 4 | High | Internal sensitive data at risk, partial service impact |
| 3 | Moderate | Business process disruption, limited data exposure |
| 2 | Low | Minimal operational impact, no data exposure |
| 1 | Negligible | No measurable business impact |

### 4. Confidence

| Score | Level | Description |
|-------|-------|-------------|
| 5 | Confirmed | Multiple corroborating data sources, clear evidence |
| 4 | High | Strong single-source evidence, consistent with known patterns |
| 3 | Medium | Plausible but incomplete evidence, some ambiguity |
| 2 | Low | Weak evidence, high chance of false positive |
| 1 | Minimal | Single low-fidelity signal, no corroboration |

**Anchoring guard**: If the primary evidence for scoring Confidence >= 4 (High) is vendor detection labels alone (e.g., "YTTRIUM malicious file detected", "Dukozy malware"), cap Confidence at 3 (Medium). Vendor labels from the same platform are not independent corroboration — five detections from SecLake are one source, not five. To score Confidence >= 4, require at least one of:
- File hash confirmed malicious via independent threat intel
- Behavioral correlation across two or more independent tools
- Network IOC (IP/domain) confirmed malicious via independent reputation service
- Manual analyst confirmation
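The anchoring guard reduces to a small cap function. This sketch is illustrative; the single boolean stands in for "any of the four corroboration types above is present":

```python
# Sketch of the anchoring guard: vendor detection labels alone can never
# justify Confidence >= 4, no matter how many there are from one platform.
def capped_confidence(proposed_score, independent_corroboration):
    """independent_corroboration: True if any of intel-confirmed hash,
    cross-tool behavioral correlation, reputation-confirmed network IOC,
    or manual analyst confirmation is present."""
    if proposed_score >= 4 and not independent_corroboration:
        return 3  # five detections from one platform are one source
    return proposed_score
```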

### 5. Threat Context

| Score | Level | Description |
|-------|-------|-------------|
| 5 | Active Campaign | Matches known active threat campaign targeting this industry |
| 4 | Known APT TTP | Technique matches known advanced persistent threat |
| 3 | Known Technique | Common attack technique, well-understood |
| 2 | Generic | Commodity scanning, automated probing |
| 1 | Unknown | No threat intelligence correlation |

## Composite Score and Depth Routing

**Composite = (Alert Severity x 0.20) + (Asset Criticality x 0.25) + (Business Impact x 0.25) + (Confidence x 0.15) + (Threat Context x 0.15)**

| Composite Score | Depth | Action |
|----------------|-------|--------|
| **1.0 - 1.9** | AUTO-CLOSE | Document disposition, update FP patterns |
| **2.0 - 2.9** | STANDARD | Five W's, basic enrichment, verdict, report |
| **3.0 - 5.0** | DEEP | Full timeline, multi-source correlation, IOC analysis, ATT&CK mapping |

**Override rules (always DEEP regardless of score):**
- Lateral movement detected
- Persistence mechanism discovered
- Credential compromise indicators
- Data exfiltration signals
- Multiple correlated alerts from same actor/campaign
- Novel or previously unseen technique
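The formula, routing bands, and overrides above can be expressed as a runnable sketch. It is illustrative, not part of the plugin, and one detail is an assumption: the bands as written leave small gaps (e.g., 1.95 falls between 1.9 and 2.0), so each band is treated as extending up to the start of the next:

```python
# Sketch of the composite score and depth routing. Band-boundary handling
# is an assumption; the weights and bands are the documented defaults.
WEIGHTS = {
    "alert_severity": 0.20, "asset_criticality": 0.25,
    "business_impact": 0.25, "confidence": 0.15, "threat_context": 0.15,
}

def composite_score(scores):
    """scores: dict mapping each of the five dimensions to an int 1-5."""
    return round(sum(scores[dim] * w for dim, w in WEIGHTS.items()), 2)

def route_depth(scores, overrides_triggered=()):
    """Return (composite, depth). Any triggered override forces DEEP."""
    score = composite_score(scores)
    if overrides_triggered:
        return score, "DEEP"
    if score < 2.0:
        return score, "AUTO-CLOSE"
    if score < 3.0:
        return score, "STANDARD"
    return score, "DEEP"
```

Worked example of the Iron Law: a CRITICAL alert (severity 5) on a dev box (criticality 2, impact 2) with medium confidence (3) and generic threat context (2) scores 5(0.20) + 2(0.25) + 2(0.25) + 3(0.15) + 2(0.15) = 2.75 and routes to STANDARD, not DEEP.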

## Output

```
SEVERITY ASSESSMENT:
Alert Severity: [score] — [rationale]
Asset Criticality: [score] — [rationale]
Business Impact: [score] — [rationale]
Confidence: [score] — [rationale]
Threat Context: [score] — [rationale]

Composite Score: [X.X]
Investigation Depth: [AUTO-CLOSE | STANDARD | DEEP]
Override Applied: [Yes/No — reason if yes]
```

**Return this severity assessment to the calling orchestrator and continue. Do not present to the user or wait for input.**

## Red Flags

| Red Flag | Correct Action |
|----------|---------------|
| "I don't know the asset criticality, so I'll assume low" | STOP. Query the mesh for the asset. Check hostname patterns, IP ranges. Note as Unknown, not Low. |
| "This is clearly a false positive" | STOP. Score it. If the score says auto-close, then auto-close. Don't prejudge. |
| "Critical severity = deep investigation always" | STOP. Critical on a test box with low confidence may be standard depth. Score all dimensions. |