sonobat 0.1.0 → 0.2.0
- package/README.md +76 -3
- package/dist/index.js +1256 -8
- package/dist/index.js.map +1 -1
- package/package.json +3 -3
package/README.md
CHANGED
@@ -1,15 +1,18 @@
 # sonobat
 
+[](https://github.com/0x6d61/sonobat/actions/workflows/ci.yml)
+
 **AttackDataGraph for autonomous penetration testing.**
 
-sonobat is a normalized data store that ingests tool outputs (nmap, ffuf, nuclei), builds a structured attack graph, and proposes next-step actions based on missing data. It exposes an [MCP Server](https://modelcontextprotocol.io/) so that LLM agents can drive the entire reconnaissance-to-exploitation loop autonomously.
+sonobat is a normalized data store that ingests tool outputs (nmap, ffuf, nuclei), builds a structured attack graph, and proposes next-step actions based on missing data. It includes a built-in **Datalog inference engine** for attack path analysis and exposes an [MCP Server](https://modelcontextprotocol.io/) so that LLM agents can drive the entire reconnaissance-to-exploitation loop autonomously.
 
 ## Features
 
 - **Ingest** — Parse nmap XML, ffuf JSON, and nuclei JSONL into a normalized SQLite graph
 - **Normalize** — Deduplicate and link hosts, services, endpoints, inputs, observations, credentials, and vulnerabilities
 - **Propose** — Gap-driven engine suggests what to scan next based on missing data
-- **
+- **Datalog Inference** — Built-in Datalog engine for attack path analysis with preset and custom rules
+- **MCP Server** — 17 tools + 3 resources accessible via stdio for LLM agents (Claude Desktop, Claude Code, etc.)
 
 ## Data Model
 
@@ -53,7 +56,7 @@ npm test
 
 ## MCP Server
 
-sonobat runs as an MCP server over stdio. LLM agents connect to it and use tools to ingest data, query the graph, and get next-step proposals.
+sonobat runs as an MCP server over stdio. LLM agents connect to it and use tools to ingest data, query the graph, run Datalog inference, and get next-step proposals.
 
 ### Available Tools
 
@@ -73,6 +76,9 @@ sonobat runs as an MCP server over stdio. LLM agents connect to it and use tools
 | | `add_credential` | Add a credential for a service |
 | | `add_vulnerability` | Add a vulnerability for a service |
 | | `link_cve` | Link a CVE record to a vulnerability |
+| **Datalog** | `list_facts` | Show database contents as Datalog facts |
+| | `run_datalog` | Execute a custom Datalog program against the database |
+| | `query_attack_paths` | Run preset or saved attack pattern analysis |
 
 ### MCP Resources
 
@@ -82,6 +88,69 @@ sonobat runs as an MCP server over stdio. LLM agents connect to it and use tools
 | `sonobat://hosts/{id}` | Host detail with full service tree |
 | `sonobat://summary` | Overall statistics |
 
+## Datalog Inference Engine
+
+sonobat includes a built-in Datalog inference engine that enables attack path analysis by reasoning over the normalized database.
+
+### How It Works
+
+1. **Fact Extraction** — Database rows are automatically converted to Datalog facts (e.g., `host("h-001", "10.0.0.1", "IP")`)
+2. **Rule Evaluation** — Naive bottom-up evaluator with fixed-point iteration derives new facts from rules
+3. **Query Answering** — Queries return matching tuples with variable bindings
+
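Reviewer note: the "naive bottom-up evaluator with fixed-point iteration" in step 2 can be pictured with the minimal sketch below. It is not taken from `package/dist/index.js`; the `Term`, `Atom`, `Rule`, `Fact`, and `evaluate` names are hypothetical, the sample facts are invented, and negation and built-ins are ignored. It only shows the general technique: apply every rule to all known facts, add whatever is new, and repeat until a pass derives nothing.

```typescript
// Hypothetical types for illustration; sonobat's internal representation may differ.
type Value = string | number;
type Term = { kind: "var"; name: string } | { kind: "const"; value: Value };
type Atom = { pred: string; args: Term[] };
type Rule = { head: Atom; body: Atom[] };
type Fact = { pred: string; args: Value[] };
type Env = Record<string, Value>;

const v = (name: string): Term => ({ kind: "var", name });
const c = (value: Value): Term => ({ kind: "const", value });
const has = (env: Env, name: string): boolean =>
  Object.prototype.hasOwnProperty.call(env, name);

// Match one body atom against one fact, extending the variable bindings.
function unify(atom: Atom, fact: Fact, env: Env): Env | null {
  if (atom.pred !== fact.pred || atom.args.length !== fact.args.length) return null;
  const next: Env = { ...env };
  for (let i = 0; i < atom.args.length; i++) {
    const term = atom.args[i];
    const value = fact.args[i];
    if (term.kind === "const") {
      if (term.value !== value) return null;
    } else if (!has(next, term.name)) {
      next[term.name] = value;
    } else if (next[term.name] !== value) {
      return null;
    }
  }
  return next;
}

// Naive evaluation: re-run every rule over all known facts until nothing new appears.
// Assumes every head variable also occurs in the body (range-restricted rules).
function evaluate(facts: Fact[], rules: Rule[]): Fact[] {
  const known: Fact[] = [...facts];
  const seen: Record<string, true> = {};
  const keyOf = (f: Fact): string => JSON.stringify([f.pred, f.args]);
  for (const f of known) seen[keyOf(f)] = true;

  let changed = true;
  while (changed) {
    changed = false;
    for (const rule of rules) {
      // Join the body atoms left to right, carrying candidate bindings along.
      let envs: Env[] = [{}];
      for (const atom of rule.body) {
        const extended: Env[] = [];
        for (const env of envs) {
          for (const fact of known) {
            const next = unify(atom, fact, env);
            if (next !== null) extended.push(next);
          }
        }
        envs = extended;
      }
      // Instantiate the head once per surviving binding set.
      for (const env of envs) {
        const derivedFact: Fact = {
          pred: rule.head.pred,
          args: rule.head.args.map((t) => (t.kind === "const" ? t.value : env[t.name])),
        };
        const key = keyOf(derivedFact);
        if (!seen[key]) {
          seen[key] = true;
          known.push(derivedFact);
          changed = true;
        }
      }
    }
  }
  return known;
}

// Invented sample data shaped like the service/vulnerability predicates in the README.
const facts: Fact[] = [
  { pred: "service", args: ["h-001", "s-001", "tcp", 80, "http", "open"] },
  { pred: "service", args: ["h-001", "s-002", "tcp", 22, "ssh", "open"] },
  { pred: "vulnerability", args: ["s-001", "v-001", "sqli", "Login form SQLi", "high", "high"] },
];
const rules: Rule[] = [{
  head: { pred: "sqli_service", args: [v("HostId"), v("ServiceId"), v("Title")] },
  body: [
    { pred: "service", args: [v("HostId"), v("ServiceId"), c("tcp"), v("Port"), c("http"), c("open")] },
    { pred: "vulnerability", args: [v("ServiceId"), v("VulnId"), c("sqli"), v("Title"), v("Severity"), v("Confidence")] },
  ],
}];
const derived = evaluate(facts, rules).filter((f) => f.pred === "sqli_service");
console.log(derived);
```

Naive evaluation re-joins all facts on every pass, which is simple but repeats work. Semi-naive evaluation is the usual refinement, and nothing in this diff indicates which variant the package would move to.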
+### Available Predicates
+
+| Predicate | Arity | Source Table |
+|-----------|-------|-------------|
+| `host(Id, Authority, Kind)` | 3 | hosts |
+| `service(HostId, Id, Transport, Port, AppProto, State)` | 6 | services |
+| `http_endpoint(ServiceId, Id, Method, Path, StatusCode)` | 5 | http_endpoints |
+| `input(ServiceId, Id, Location, Name)` | 4 | inputs |
+| `endpoint_input(EndpointId, InputId)` | 2 | endpoint_inputs |
+| `observation(InputId, Id, RawValue, Source, Confidence)` | 5 | observations |
+| `credential(ServiceId, Id, Username, SecretType, Source, Confidence)` | 6 | credentials |
+| `vulnerability(ServiceId, Id, VulnType, Title, Severity, Confidence)` | 6 | vulnerabilities |
+| `vulnerability_endpoint(VulnId, EndpointId)` | 2 | vulnerabilities |
+| `cve(VulnId, CveId, CvssScore)` | 3 | cves |
+| `vhost(HostId, Id, Hostname, Source)` | 4 | vhosts |
+
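Reviewer note: the predicate table implies a row-to-fact mapping. The sketch below shows one way such a mapping could look for the `host` and `service` predicates. The row interfaces, their column names, and the `formatFact` helper are assumptions for illustration and are not read from the package. Quoting strings with `JSON.stringify` and printing `Port` as a bare number are also assumptions about the fact syntax.

```typescript
type Value = string | number;

// Hypothetical row shapes; the real column names in sonobat's SQLite schema may differ.
interface HostRow { id: string; authority: string; kind: string }
interface ServiceRow {
  hostId: string; id: string; transport: string;
  port: number; appProto: string; state: string;
}

// Render a predicate name and its arguments as Datalog fact text.
function formatFact(pred: string, args: Value[]): string {
  const rendered = args.map((a) => (typeof a === "string" ? JSON.stringify(a) : String(a)));
  return `${pred}(${rendered.join(", ")})`;
}

// Argument order follows the predicate table: host(Id, Authority, Kind).
function hostFact(row: HostRow): string {
  return formatFact("host", [row.id, row.authority, row.kind]);
}

// service(HostId, Id, Transport, Port, AppProto, State)
function serviceFact(row: ServiceRow): string {
  return formatFact("service", [row.hostId, row.id, row.transport, row.port, row.appProto, row.state]);
}

const hostText = hostFact({ id: "h-001", authority: "10.0.0.1", kind: "IP" });
const serviceText = serviceFact({
  hostId: "h-001", id: "s-001", transport: "tcp", port: 80, appProto: "http", state: "open",
});
console.log(hostText);    // host("h-001", "10.0.0.1", "IP")
console.log(serviceText); // service("h-001", "s-001", "tcp", 80, "http", "open")
```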
+### Preset Attack Patterns
+
+| Pattern | Description |
+|---------|-------------|
+| `reachable_services` | Open services reachable on each host |
+| `authenticated_access` | Services with known credentials |
+| `exploitable_endpoints` | Endpoints with confirmed vulnerabilities |
+| `critical_vulns` | Critical and high severity vulnerabilities |
+| `attack_surface` | Full attack surface overview |
+| `unfuzzed_inputs` | Inputs with observations but no vulnerabilities found yet |
+
+### Custom Rules
+
+LLM agents can write and execute custom Datalog rules via the `run_datalog` MCP tool. Rules can be saved to the database with a `generated_by` field (`human` or `ai`) for future reuse.
+
+```
+% Example: Find all HTTP services with SQL injection vulnerabilities
+sqli_service(HostId, ServiceId, Title) :-
+  service(HostId, ServiceId, "tcp", Port, "http", "open"),
+  vulnerability(ServiceId, VulnId, "sqli", Title, Severity, Confidence).
+?- sqli_service(HostId, ServiceId, Title).
+```
+
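Reviewer note: for readers wondering what invoking `run_datalog` might look like on the wire, here is a rough sketch of an MCP `tools/call` request carrying the example program. The JSON-RPC envelope (`jsonrpc`, `id`, `method: "tools/call"`, `params.name`, `params.arguments`) follows the MCP specification. The argument key `program` is a guess, since the tool's input schema is not visible in this diff, and `buildToolCall` is a hypothetical helper. A real client would normally go through an MCP SDK and complete the `initialize` handshake first.

```typescript
// The Datalog program from the README example, joined into one string.
const program = [
  "sqli_service(HostId, ServiceId, Title) :-",
  '  service(HostId, ServiceId, "tcp", Port, "http", "open"),',
  '  vulnerability(ServiceId, VulnId, "sqli", Title, Severity, Confidence).',
  "?- sqli_service(HostId, ServiceId, Title).",
].join("\n");

// Hypothetical helper: wrap a tool name and arguments in a JSON-RPC 2.0 request.
function buildToolCall(id: number, name: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name, arguments: args },
  };
}

// "program" is an assumed argument name; check the tool's published input schema.
const request = buildToolCall(1, "run_datalog", { program });

// Over stdio, each message is sent as a single line of JSON.
const wireLine = JSON.stringify(request);
console.log(wireLine);
```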
+## Propose Engine
+
+The proposer analyzes missing data in the attack graph and suggests next actions:
+
+| Missing Data Pattern | Proposed Action | Description |
+|---------------------|----------------|-------------|
+| Host has no services | `nmap_scan` | Port scan the host |
+| HTTP service has no endpoints | `ffuf_discovery` | Directory/file discovery |
+| Endpoint has no inputs | `parameter_discovery` | Find input parameters |
+| Input has no observations | `value_collection` | Collect parameter values |
+| Input has observations but no vulnerabilities | `value_fuzz` | Fuzz the parameter with attack payloads |
+| HTTP service has no vhosts | `vhost_discovery` | Virtual host enumeration |
+| HTTP service has no vulnerability scan | `nuclei_scan` | Run vulnerability scanner |
+
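Reviewer note: the gap table above reads like a set of independent checks over the graph. A minimal sketch of that idea follows. The nested `HostNode`, `ServiceNode`, `EndpointNode`, and `InputNode` shapes, the `nucleiScanned` flag, and the `propose` function are all assumptions made for illustration. The real engine works against the SQLite store and may order, scope, or deduplicate proposals differently. Only the action names come from the table.

```typescript
// Hypothetical in-memory view of the attack graph (not sonobat's actual schema).
interface InputNode { id: string; observations: number; vulnerabilities: number }
interface EndpointNode { id: string; inputs: InputNode[] }
interface ServiceNode { id: string; appProto: string; endpoints: EndpointNode[]; nucleiScanned: boolean }
interface HostNode { id: string; services: ServiceNode[]; vhosts: string[] }
interface Proposal { action: string; targetId: string }

// Walk the graph and emit one proposal per missing-data pattern that matches.
function propose(hosts: HostNode[]): Proposal[] {
  const out: Proposal[] = [];
  for (const host of hosts) {
    if (host.services.length === 0) out.push({ action: "nmap_scan", targetId: host.id });
    for (const svc of host.services) {
      // Assumption: only services labelled exactly "http" count as HTTP services.
      if (svc.appProto !== "http") continue;
      if (svc.endpoints.length === 0) out.push({ action: "ffuf_discovery", targetId: svc.id });
      for (const ep of svc.endpoints) {
        if (ep.inputs.length === 0) out.push({ action: "parameter_discovery", targetId: ep.id });
        for (const input of ep.inputs) {
          if (input.observations === 0) {
            out.push({ action: "value_collection", targetId: input.id });
          } else if (input.vulnerabilities === 0) {
            out.push({ action: "value_fuzz", targetId: input.id });
          }
        }
      }
      // Assumption: vhosts are stored per host, as the vhost(HostId, ...) predicate suggests.
      if (host.vhosts.length === 0) out.push({ action: "vhost_discovery", targetId: svc.id });
      if (!svc.nucleiScanned) out.push({ action: "nuclei_scan", targetId: svc.id });
    }
  }
  return out;
}

// Invented sample graph: one unscanned host, one host with a bare HTTP service.
const hosts: HostNode[] = [
  { id: "h-001", services: [], vhosts: [] },
  {
    id: "h-002",
    vhosts: [],
    services: [{ id: "s-001", appProto: "http", endpoints: [], nucleiScanned: false }],
  },
];
const proposals = propose(hosts).map((p) => `${p.action}:${p.targetId}`);
console.log(proposals);
```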
 ### Claude Desktop
 
 Add to `claude_desktop_config.json`:
@@ -142,6 +211,8 @@ npx @modelcontextprotocol/inspector npx tsx src/index.ts
 | Validation | Zod |
 | Build | tsup (esbuild) |
 | Test | Vitest |
+| Linter | ESLint + @typescript-eslint |
+| Formatter | Prettier |
 
 ## Development
 
@@ -151,7 +222,9 @@ npm test # Run all tests
 npm run test:watch # Watch mode
 npm run test:coverage # Coverage report
 npm run lint # ESLint
+npm run lint:fix # ESLint with auto-fix
 npm run format # Prettier
+npm run format:check # Prettier check
 npm run typecheck # tsc --noEmit
 npm run build # Production build
 ```