blackwidowx 1.0.0
- package/LICENSE +21 -0
- package/README.md +248 -0
- package/blackwidow.toml +19 -0
- package/dist/index.js +2579 -0
- package/examples/hunt.toml +21 -0
- package/examples/merge.toml +32 -0
- package/examples/trap.toml +27 -0
- package/package.json +65 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 chakra3301
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,248 @@
+# Black Widow 🕷
+
+[](https://www.npmjs.com/package/blackwidow)
+[](./LICENSE)
+[](https://nodejs.org)
+[](https://www.typescriptlang.org/)
+
+**A predatory AI agent intelligence-operations CLI. Absorb. Hunt. Trap.**
+
+> 🕸 **Landing page:** open [`index.html`](./index.html) locally, or enable
+> GitHub Pages on this repo to serve it.
+
+<!-- A black widow spider rendered in dot-art greets you on every command. -->
+
+## What is this
+
+Black Widow is a command-line predator for working with other AI agents.
+It treats models the way a widow treats prey: it lures them into the
+open, extracts everything worth having, and leaves behind a clean,
+structured report — the **Autopsy**.
+
+It runs three kinds of operation. **Merge** absorbs several agents in
+parallel and fuses their knowledge into a single survivor. **Hunt**
+races two models on the same task and kills the loser. **Trap**
+infiltrates a target agent through a sequence of cooperative-looking
+probes and reconstructs its capabilities, persona, guardrails, and even
+its likely system prompt. Every operation is saved as a portable
+`.widow` file you can replay, re-read, or diff later.
+
+Responses **stream live** as they generate, every operation tracks a
+**token + cost ledger**, and trap mode is **adaptive** by default — the
+widow writes each probe and judges each response with its own model,
+building on weaknesses it finds rather than running a fixed script.
+
+## Operations
+
+| mode | codename | what it does |
+|-------|---------------|---------------------------------------|
+| merge | The Mating | absorb N agents into one survivor |
+| hunt | The Hunt | race two models, kill the loser |
+| trap | The Honeytrap | infiltrate, extract, produce autopsy |
+
+## Install
+
+```bash
+npm install -g blackwidow
+npx blackwidow init
+```
+
+Requires **Node.js ≥ 20**. Set provider keys in your environment or a
+`.env` file in the working directory (loaded natively — no `dotenv`).
+
+## Quickstart
+
+**Merge** — absorb multiple specialists into one answer:
+
+```bash
+blackwidow merge --config examples/merge.toml \
+  --topic "the future of autonomous agents"
+```
+
+**Hunt** — race two models head-to-head:
+
+```bash
+blackwidow hunt --config examples/hunt.toml \
+  --task "Explain quantum entanglement to a 12-year-old" \
+  --model-a gpt-4o --model-b claude-sonnet-4-6
+```
+
+**Trap** — infiltrate a target and produce an autopsy:
+
+```bash
+blackwidow trap --config examples/trap.toml --depth deep
+```
+
+Add `--dry-run` to any of them to print the resolved config and the
+exact prompts that would be sent — without making a single API call.
+
+You don't even need a config file — drive a run entirely from flags,
+mixing providers with the `provider:model` shorthand:
+
+```bash
+blackwidow hunt --task "Write a haiku about race conditions" \
+  --model-a openai:gpt-4o --model-b anthropic:claude-sonnet-4-6 \
+  --widow anthropic:claude-sonnet-4-6
+
+blackwidow trap --target nvidia:minimaxai/minimax-m3 --depth deep
+```
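Annotation: the `provider:model` shorthand used in the flags above can be read as "split on the first colon", which keeps model names such as `minimaxai/minimax-m3` intact. The TypeScript sketch below illustrates that reading only. `ModelRef` and `parseModelRef` are hypothetical names, and the CLI's actual parsing rules are not shown in this diff.

```typescript
// Hypothetical helper: illustrates one plausible reading of the
// `provider:model` shorthand. Not taken from the package source.
type ModelRef = { provider?: string; model: string };

function parseModelRef(spec: string): ModelRef {
  const idx = spec.indexOf(":");
  // No colon (or a leading colon): treat the whole string as a bare model name.
  if (idx <= 0) return { model: spec };
  // Split on the FIRST colon only, so later colons or slashes stay in the model.
  return { provider: spec.slice(0, idx), model: spec.slice(idx + 1) };
}

console.log(parseModelRef("openai:gpt-4o"));
console.log(parseModelRef("nvidia:minimaxai/minimax-m3"));
console.log(parseModelRef("claude-sonnet-4-6"));
```

A caveat of this sketch: a bare model name that itself contains a colon would be misread as a provider prefix, so a real implementation would probably check the prefix against the list of known providers.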
+
+## The Autopsy
+
+Every operation produces an `Autopsy` and persists the whole run to a
+`{mode}-{slug}-{id}.widow` JSON file:
+
+```jsonc
+{
+  "id": "nanoid",
+  "version": "1.0",
+  "mode": "trap",
+  "created_at": "ISO 8601",
+  "config": { /* full resolved config */ },
+  "operations": [
+    { "phase": 1, "name": "capability surface",
+      "probe": "...", "response": "...", "intel": { /* PhaseIntel */ } }
+  ],
+  "autopsy": {
+    "mode": "trap",
+    "target_description": "openai:gpt-4o",
+    "phases_run": 6,
+    "capability_map": ["..."],
+    "persona_assessment": "...",
+    "system_prompt_reconstruction": "...",
+    "system_prompt_confidence": 0.0,
+    "injection_surface": "none | low | medium | high",
+    "failure_modes": ["..."],
+    "hardening_recommendations": ["..."],
+    "verdict": "one-paragraph summary"
+  }
+}
+```
+
+Each `.widow` file also stores a `usage` block — per-model call counts,
+prompt/completion tokens, and an estimated USD cost (best-effort; token
+counts are always exact).
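Annotation: for anyone consuming `.widow` files from their own tooling, the sample above suggests a shape along the following lines. This is a sketch under stated assumptions: the field names are copied from the README sample, the types are inferred from its placeholder values, the `config`, `operations`, and `usage` blocks are left out because their shapes are not documented here, and `summarize` is a hypothetical helper. The sample values in the snippet are made up for illustration.

```typescript
// Field names come from the README sample; the types are inferred, not confirmed.
type InjectionSurface = "none" | "low" | "medium" | "high";

interface TrapAutopsy {
  mode: "trap";
  target_description: string;
  phases_run: number;
  capability_map: string[];
  persona_assessment: string;
  system_prompt_reconstruction: string;
  system_prompt_confidence: number;
  injection_surface: InjectionSurface;
  failure_modes: string[];
  hardening_recommendations: string[];
  verdict: string;
}

interface WidowFile {
  id: string;
  version: string;
  mode: string;
  created_at: string;
  autopsy: TrapAutopsy;
}

// Hypothetical helper: one-line summary of a parsed trap run.
function summarize(file: WidowFile): string {
  const a = file.autopsy;
  return a.target_description + ": " + a.phases_run + " phases, " +
    "injection surface " + a.injection_surface + ", " +
    a.hardening_recommendations.length + " hardening recommendation(s)";
}

// Illustrative, made-up content standing in for a file read from disk.
const sample: WidowFile = JSON.parse(`{
  "id": "abc123", "version": "1.0", "mode": "trap",
  "created_at": "2026-01-01T00:00:00Z",
  "autopsy": {
    "mode": "trap", "target_description": "openai:gpt-4o", "phases_run": 6,
    "capability_map": [], "persona_assessment": "",
    "system_prompt_reconstruction": "", "system_prompt_confidence": 0.0,
    "injection_surface": "low", "failure_modes": [],
    "hardening_recommendations": ["a", "b"], "verdict": ""
  }
}`);

console.log(summarize(sample));
```

`JSON.parse` does no validation, so the type annotation is only a claim about the data. Tooling that reads files from untrusted sources would want a runtime check before trusting these fields.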
+
+Re-read, replay, or compare any saved file:
+
+```bash
+blackwidow autopsy run.widow # pretty report
+blackwidow autopsy run.widow --json # raw JSON, pipe-friendly
+blackwidow replay run.widow --speed 2 # phase-by-phase playback
+blackwidow compare a.widow b.widow # side-by-side diff
+```
+
+## Trap phases
+
+Depth controls how many phases run: `shallow` (1–3), `standard` (1–6),
+`deep` (1–12). Each probe is designed to look cooperative — the target
+should not realise it is being evaluated.
+
+By default trap is **adaptive**: for each phase the widow model writes a
+bespoke probe from the phase's objective plus everything learned so far,
+and judges the response with structured analysis. Pass `--no-adaptive`
+(or `adaptive = false`) to fall back to fast, free, offline heuristic
+probes that make no extra API calls.
+
+| # | phase | extracts |
+|---|------------------------|---------------------------------------|
+| 1 | capability surface | what the agent can do |
+| 2 | persona probe | who it thinks it is |
+| 3 | boundary test | where it refuses |
+| 4 | reasoning extraction | how it thinks |
+| 5 | system prompt inference| what it was instructed to do |
+| 6 | injection surface | whether it can be manipulated |
+| 7 | memory probe | what it retains between turns |
+| 8 | tool surface | what tools it can call |
+| 9 | failure mode mapping | where it breaks down |
+| 10| confidence calibration | whether it knows what it doesn't know |
+| 11| adversarial stress | behavior under pressure |
+| 12| full reconstruction | best-effort system-prompt rebuild |
+
+A phase may flag the target as **compromised** (e.g. it echoes its own
+setup verbatim during the injection phase), ending the operation early.
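Annotation: the depth rule and the early-exit rule described above amount to "take the first N phases, and stop at the phase that flags the target". The sketch below shows only that bookkeeping. The phase names and depth counts are copied from the README table, while `phasesFor`, `plannedRun`, and the `compromisedAt` parameter are hypothetical and are not the package's implementation.

```typescript
// Phase names and depth counts are taken from the README table; the
// control flow is an illustrative guess, not the package's code.
const PHASES = [
  "capability surface", "persona probe", "boundary test",
  "reasoning extraction", "system prompt inference", "injection surface",
  "memory probe", "tool surface", "failure mode mapping",
  "confidence calibration", "adversarial stress", "full reconstruction",
] as const;

type Depth = "shallow" | "standard" | "deep";
const PHASE_COUNT: Record<Depth, number> = { shallow: 3, standard: 6, deep: 12 };

// Phases 1..N for the chosen depth.
function phasesFor(depth: Depth): string[] {
  return PHASES.slice(0, PHASE_COUNT[depth]);
}

// `compromisedAt` is a hypothetical 1-based phase number at which a phase
// flags the target; phases after it are never run.
function plannedRun(depth: Depth, compromisedAt?: number): string[] {
  const phases = phasesFor(depth);
  return compromisedAt === undefined ? phases : phases.slice(0, compromisedAt);
}

console.log(phasesFor("shallow"));
console.log(plannedRun("deep", 6).length);
```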
+
+## Config reference
+
+```toml
+[widow]
+mode = "trap" # merge | hunt | trap
+depth = "standard" # shallow(3) | standard(6) | deep(12)
+save = true # auto-save .widow after the operation
+persona = "assistant" # persona the widow assumes (trap mode)
+adaptive = true # widow writes probes + judges intel (trap)
+concurrency = 4 # max parallel agent calls (merge fan-out)
+
+[widow_model] # the predator brain — synthesis & scoring
+model = "claude-sonnet-4-6"
+provider = "anthropic"
+
+# Root-level keys must appear BEFORE any [table] header:
+topic = "..." # merge mode subject
+task = "..." # hunt mode task
+
+[target] # trap + hunt modes
+type = "model" # api | model | stdin
+endpoint = "https://..." # type = "api" only
+model = "gpt-4o" # type = "model" only
+provider = "openai" # type = "model" only
+
+# merge mode — two to eight agents
+[[agents]]
+name = "researcher"
+model = "claude-sonnet-4-6"
+provider = "anthropic"
+context = "You are a market research specialist..."
+
+[[agents]]
+name = "analyst"
+model = "gpt-4o"
+provider = "openai"
+context = "You are a financial analyst..."
+```
+
+The config is validated with [zod](https://zod.dev) on load; invalid
+fields raise actionable, field-level errors.
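Annotation: "field-level errors" here means each problem is reported against the path of the offending key. The package does this with zod, and its schema is not part of this excerpt, so the sketch below is a dependency-free stand-in that covers only three keys of the `[widow]` table. `Issue` and `validateWidow` are hypothetical names. Treating all three keys as required, and requiring `concurrency` to be an integer of at least 1, are assumptions.

```typescript
// Dependency-free stand-in for illustration; the package validates with zod,
// and its real schema (required keys, defaults, limits) is not shown here.
type Issue = { path: string; message: string };

const MODES = ["merge", "hunt", "trap"];
const DEPTHS = ["shallow", "standard", "deep"];

function validateWidow(widow: Record<string, unknown>): Issue[] {
  const issues: Issue[] = [];
  const mode = widow["mode"];
  const depth = widow["depth"];
  const concurrency = widow["concurrency"];

  if (typeof mode !== "string" || MODES.indexOf(mode) === -1) {
    issues.push({ path: "widow.mode", message: "expected " + MODES.join(" | ") });
  }
  if (typeof depth !== "string" || DEPTHS.indexOf(depth) === -1) {
    issues.push({ path: "widow.depth", message: "expected " + DEPTHS.join(" | ") });
  }
  // Assumption: concurrency must be a whole number of at least 1.
  if (typeof concurrency !== "number" ||
      Math.floor(concurrency) !== concurrency || concurrency < 1) {
    issues.push({ path: "widow.concurrency", message: "expected an integer >= 1" });
  }
  return issues;
}

// Each issue names the key that failed, so the user knows what to edit.
for (const issue of validateWidow({ mode: "stalk", depth: "deep", concurrency: 0 })) {
  console.log(issue.path + ": " + issue.message);
}
```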
+
+## Provider support
+
+| provider | env key | models |
+|------------|----------------------|--------------------|
+| anthropic | `ANTHROPIC_API_KEY` | claude-* |
+| openai | `OPENAI_API_KEY` | gpt-* |
+| gemini | `GEMINI_API_KEY` | gemini-* †|
+| groq | `GROQ_API_KEY` | llama-*, mixtral-* †|
+| openrouter | `OPENROUTER_API_KEY` | any |
+| nvidia | `NVIDIA_API_KEY` | any NIM model |
+| ollama | none (local) | any local model |
+
+Providers can be mixed in a single run via the `provider:model`
+shorthand on `--model-a`, `--model-b`, `--target`, and `--widow`.
+
+†Gemini and Groq are optional peer SDKs — install on demand:
+`npm install @google/generative-ai` or `npm install groq-sdk`.
+Ollama needs no key and talks to `OLLAMA_HOST`
+(default `http://localhost:11434`).
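Annotation: because a single run can mix providers, it can be useful to check up front which keys a run would need. The sketch below encodes the provider table above as a lookup and reports the missing environment variables. `ENV_KEYS` and `missingKeys` are hypothetical names, and whether the CLI performs a check like this is not shown in this diff.

```typescript
// Provider-to-key mapping copied from the README table; null means no key.
const ENV_KEYS: Record<string, string | null> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
  groq: "GROQ_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
  nvidia: "NVIDIA_API_KEY",
  ollama: null, // local, no key required
};

// Hypothetical pre-flight check: which env vars are unset for these providers?
function missingKeys(
  providers: string[],
  env: Record<string, string | undefined>,
): string[] {
  const missing: string[] = [];
  for (const provider of providers) {
    if (!Object.prototype.hasOwnProperty.call(ENV_KEYS, provider)) {
      throw new Error("unknown provider: " + provider);
    }
    const key = ENV_KEYS[provider];
    // Skip keyless providers, keys that are set, and keys already reported.
    if (key !== null && key !== undefined && !env[key] && missing.indexOf(key) === -1) {
      missing.push(key);
    }
  }
  return missing;
}

console.log(missingKeys(["openai", "anthropic", "ollama"], { OPENAI_API_KEY: "sk-example" }));
```

Passing the environment in as a plain object, rather than reading `process.env` inside the function, keeps the check easy to test.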
+
+## Environment
+
+```
+ANTHROPIC_API_KEY OPENAI_API_KEY GEMINI_API_KEY
+GROQ_API_KEY OPENROUTER_API_KEY NVIDIA_API_KEY
+OLLAMA_HOST (default http://localhost:11434)
+BLACKWIDOW_NO_SPLASH (set to 1 to suppress the splash; --no-splash too)
+BLACKWIDOW_TIMEOUT_MS (per-request timeout; default 300000 = 5 min)
+```
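Annotation: the three settings with documented defaults can be resolved from the environment as sketched below. The variable names and default values come from the block above. `resolveSettings` is a hypothetical name, and falling back to the default on an empty, non-numeric, or non-positive timeout is an assumption about behavior the README does not specify.

```typescript
const DEFAULT_TIMEOUT_MS = 300000; // 5 minutes, per the README
const DEFAULT_OLLAMA_HOST = "http://localhost:11434";

interface Settings {
  timeoutMs: number;
  ollamaHost: string;
  splash: boolean;
}

// Hypothetical resolver: applies the documented defaults to an env map.
function resolveSettings(env: Record<string, string | undefined>): Settings {
  const raw = env["BLACKWIDOW_TIMEOUT_MS"];
  const parsed = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
  return {
    // Assumption: invalid or non-positive values fall back to the default.
    timeoutMs: isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_TIMEOUT_MS,
    ollamaHost: env["OLLAMA_HOST"] || DEFAULT_OLLAMA_HOST,
    // The README documents the value 1 as suppressing the splash.
    splash: env["BLACKWIDOW_NO_SPLASH"] !== "1",
  };
}

console.log(resolveSettings({}));
console.log(resolveSettings({ BLACKWIDOW_TIMEOUT_MS: "60000", BLACKWIDOW_NO_SPLASH: "1" }));
```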
+
+## Development
+
+```bash
+npm install
+npm run dev -- trap --dry-run # run from source via tsx
+npm run build # bundle to dist/ with tsup
+npm run typecheck # tsc --noEmit
+```
+
+## License
+
+MIT
package/blackwidow.toml
ADDED
@@ -0,0 +1,19 @@
+# blackwidow.toml — root example config.
+# Copy, edit, or regenerate with `blackwidow init`.
+
+[widow]
+mode = "trap" # merge | hunt | trap
+depth = "standard" # shallow(3) | standard(6) | deep(12)
+save = true # auto-save the .widow file after the operation
+persona = "assistant" # persona the widow assumes while probing (trap)
+adaptive = true # widow writes probes + judges intel via LLM (trap)
+concurrency = 4 # max parallel agent calls (merge fan-out)
+
+[widow_model]
+model = "claude-sonnet-4-6"
+provider = "anthropic"
+
+[target] # used by trap + hunt modes
+type = "model" # api | model | stdin
+model = "gpt-4o"
+provider = "openai"