@trenchwork/erosolar 1.1.20 → 1.1.21

package/README.md CHANGED
@@ -1,232 +1,435 @@
1
- # Erosolar Coder
1
+ # @trenchwork/erosolar
2
2
 
3
3
  [![npm version](https://img.shields.io/npm/v/@trenchwork/erosolar)](https://www.npmjs.com/package/@trenchwork/erosolar)
4
+ [![license: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
4
5
 
5
- > **First public research run 3 hours of unattended offensive
6
- > security research, useful enough to submit to Google Bug Hunters.**
7
- >
8
- > The first prompt I asked
9
- > [erosolar](https://www.npmjs.com/package/@trenchwork/erosolar)
10
- > to run autonomously was an automated security-research pass for
11
- > submission to the [Google Bug Hunters](https://bughunters.google.com/)
12
- > program. It ran unattended for **3 continuous hours** on a CLI
13
- > still under initial development and produced useful offensive-
14
- > security research. For a starting CLI that's an amazing accomplishment.
15
- >
16
- > The open-source first-prompt result is at
17
- > [`Aroxora/google-bug-hunters-initial-AI-research-and-offensive-research-results`](https://github.com/Aroxora/google-bug-hunters-initial-AI-research-and-offensive-research-results).
18
- > All subsequent research is in a separate **private** repository
19
- > under responsible-AI and cyber-safety guidelines, by my own AI-safety,
20
- > regulatory, and moral judgement. I'm open-sourcing the first prompt
21
- > that ran — which showed extreme promise — to share the research
22
- > result responsibly with others.
6
+ A multi-provider terminal agent for coding work and authorized offensive
7
+ security research. Default backend is DeepSeek-V4-Pro (via a shared
8
+ key gated by Erosolar Auth, no key bundled in the package), with
9
+ Anthropic / OpenAI / Google / xAI / Qwen / Ollama as opt-in providers.
23
10
 
24
- ---
11
+ This README documents the **current shipped state** of the CLI: the
12
+ agent loop, capability surface, profiles, rulebook system, MCP
13
+ integration, artifact store, and the auth-gated key fetch. Sections
14
+ toward the end cover architecture and contribution. Examples,
15
+ "try asking…" hints, and onboarding fluff are deliberately absent —
16
+ [CLAUDE.md](CLAUDE.md) is strict about that.
25
17
 
26
- Erosolar Coder is an AI-powered CLI agent that uses DeepSeek-V4-Pro to
27
- deliver coding assistance at ~30× the cost-efficiency of comparable
28
- agents. The CLI is the production-blessed surface for coding work, and
29
- [Helia](https://ero.solar/helia) is the companion macOS browser with
30
- the same Erosolar agent in a side panel. The two share auth, memory,
31
- balance, and the model brain via the AWS Lambda backend.
32
-
33
- The agent is built guardrail-free for full-machine offensive-security
34
- research. The source is privately held. Under U.S. law, offensive-cyber
35
- tooling is a *dual-use commercial item* (Commerce Control List,
36
- [ECCN 4D004](https://www.federalregister.gov/documents/2021/10/21/2021-22774/information-security-controls-cybersecurity-items)),
37
- not a "defense article" on the U.S. Munitions List — so it is not a
38
- "weapon" in the ITAR sense. EAR controls govern *international export*;
39
- they do not restrict private domestic development or sale to U.S.
40
- government agencies, and BIS's vulnerability-disclosure carve-out
41
- explicitly exempts ordinary security-research activity. See
42
- [`/about`](https://ero.solar/about) for the full disclosure (with links
43
- to BIS / DDTC and the relevant rulemaking) and
44
- [`docs/ENGINEERING.md`](docs/ENGINEERING.md) if you're trying to
45
- understand how the system actually works.
18
+ ---
46
19
 
47
20
  ## Install
48
21
 
49
- ```bash
22
+ ```sh
50
23
  npm install -g @trenchwork/erosolar
51
24
  ```
52
25
 
53
- Exposes two CLIs: `deepseek`, `erosolar` (synonyms; pick the one you
54
- prefer).
26
+ Two CLI binaries land on `$PATH`, both pointing to the same entry:
27
+
28
+ - `erosolar` — preferred
29
+ - `trenchwork` — alternate name
55
30
 
56
- ```bash
57
- erosolar # interactive shell
58
- erosolar -q "explain X" # one-shot prompt
59
- git diff | erosolar # pipe mode
60
- erosolar --key sk-... # set DeepSeek key
31
+ Node ≥ 20 required (`"engines": { "node": ">=20.0.0" }`).
32
+
33
+ ### First-run boot
34
+
35
+ ```sh
36
+ erosolar
61
37
  ```
62
38
 
63
- In-shell commands: `/model`, `/secrets`, `/help`, `/clear`. See
64
- `/help` for the full list.
39
+ On first launch the CLI:
40
+
41
+ 1. Pops a browser tab to `https://ero.solar/auth?port=…`. You sign in
42
+ (Erosolar Auth — Firebase Auth under the hood, project
43
+ `erosolar-1b0db`).
44
+ 2. Stores a Firebase ID token + refresh token at `~/.erosolar/auth.json`
45
+ (`0o600`). The refresh token is long-lived; the ID token is auto-
46
+ renewed before expiry.
47
+ 3. Fetches the shared DeepSeek API key from Firestore at
48
+ `shared_secrets/deepseek` and writes it into `process.env.DEEPSEEK_API_KEY`
49
+ for the duration of the process. **Nothing lands on disk.** Reads
50
+ are gated by Firestore security rules to authenticated users only;
51
+ writes are admin-only.
52
+ 4. Starts the interactive Ink-based shell.
53
+
54
+ ### Bring-your-own key
55
+
56
+ If you supply your own DeepSeek key (or any other provider key), the
57
+ Firestore fetch is skipped. Keys are resolved in this order:
58
+
59
+ 1. `process.env.DEEPSEEK_API_KEY` (env override always wins)
60
+ 2. `~/.erosolar/secrets.json` (set in-CLI via `/key sk-…` or `/secrets`)
61
+ 3. Firestore `shared_secrets/deepseek` (the post-login fallback)
62
+
63
+ Same chain applies to `TAVILY_API_KEY` and any other provider key the
64
+ shared-secrets module knows about.
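
The resolution order above can be sketched as a small pure function. This is a minimal illustration, not the code in the shared-secrets module: `resolveKey`, `KeySource`, and the synchronous `fetchShared` callback are hypothetical names, and the real Firestore fetch is presumably asynchronous.

```typescript
// Sketch only: names and shapes are assumptions; the three-step order is from the README.
type KeySource = "env" | "secrets.json" | "firestore";

interface ResolvedKey {
  value: string;
  source: KeySource;
}

function resolveKey(
  name: string,
  env: Record<string, string | undefined>,
  localSecrets: Record<string, string | undefined>,
  fetchShared: (name: string) => string | undefined,
): ResolvedKey | undefined {
  // 1. An environment override always wins.
  const fromEnv = env[name];
  if (fromEnv) return { value: fromEnv, source: "env" };

  // 2. Then the user-set local secrets file.
  const fromFile = localSecrets[name];
  if (fromFile) return { value: fromFile, source: "secrets.json" };

  // 3. Only then fall back to the shared key; earlier hits skip this fetch.
  const shared = fetchShared(name);
  if (shared) return { value: shared, source: "firestore" };

  return undefined;
}
```

Returning the source alongside the value makes it easy to report which step supplied the key without ever printing the key itself.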
65
+
66
+ ### Modes
67
+
68
+ ```sh
69
+ erosolar # interactive
70
+ erosolar -q "explain X" # one-shot, prints answer, exits
71
+ git diff | erosolar # pipe mode — stdin becomes the prompt
72
+ erosolar --profile variant-research "<goal>" # use a non-default profile
73
+ erosolar --self-test # provider/auth/capability smoke test
74
+ ```
65
75
 
66
- ## How it works (skim)
76
+ ---
67
77
 
78
+ ## Profiles
79
+
80
+ A **profile** binds a system prompt template, default model/provider,
81
+ and a [rulebook](#rulebooks) (a phase-structured guidance document the
82
+ LLM reads from its system prompt).
83
+
84
+ ### `erosolar-code` (default)
85
+
86
+ General-purpose terminal agent for coding work. Default model
87
+ `deepseek-v4-pro`, default provider `deepseek`. The full default tool
88
+ inventory is mounted: filesystem (Read/Write/Edit/MultiEdit), Bash,
89
+ Glob/Grep/Search, Git (status/log/diff/show/blame), TodoWrite, Memory
90
+ (save/load/list/delete), Skills, NotebookEdit, WebSearch/WebExtract,
91
+ HITL, plus any MCP servers you declare in `mcp.json`. Rulebook at
92
+ [`agents/erosolar-code.rules.json`](agents/erosolar-code.rules.json).
93
+
94
+ ### `variant-research`
95
+
96
+ Authorized offsec / vulnerability-research surface. Walks the standard
97
+ n-day-to-0-day pivot: recon → vulnerable+patched binary acquisition →
98
+ patch diff (Ghidra MCP) → variant search → fuzz campaign (AFL++) →
99
+ crash triage (gdb/pwndbg) → PoC development (pwntools) → coordinated
100
+ disclosure. Rulebook at
101
+ [`agents/variant-research.rules.json`](agents/variant-research.rules.json).
102
+
103
+ The disclosure terminal is **pinned to coordinated channels**
104
+ (HackerOne / Bugcrowd / vendor PSIRT / CERT-CC / internal write-up /
105
+ 90-day published advisory). The rulebook contains an explicit
106
+ `vr.r.no_brokerage` rule: it is not a vector for selling unreported
107
+ exploits.
108
+
109
+ Operator authorizes targets — the CLI does not second-guess argv or
110
+ target identifiers. The companion research workspace lives at
111
+ [`Aroxora/patchpivot`](https://github.com/Aroxora/patchpivot) (private):
112
+ a target portfolio, per-investigation findings dirs, and a disclosure
113
+ log all driven by this profile.
114
+
115
+ ```sh
116
+ erosolar --profile variant-research "investigate the patch at <commit-url>"
68
117
  ```
69
- CLI ──Firebase ID token──▶ AWS API Gateway ─▶ AWS Lambda
70
- │ │
71
- ▼ ▼
72
- Firebase Hosting + Auth DeepSeek / Stripe / GitHub /
73
- + Firestore (Spark plan) Tavily / Anthropic / Proton SMTP
118
+
119
+ ---
120
+
121
+ ## Capability surface
122
+
123
+ Capabilities are typed wrappers around external tools, registered in
124
+ `src/capabilities/` and exposed flat in the LLM's tool list.
125
+
126
+ ### Coding default (always on)
127
+
128
+ | Family | Tools |
129
+ |---|---|
130
+ | Filesystem | `Read`, `Write`, `Edit`, `MultiEdit`, `Glob`, `Grep`, `Search` |
131
+ | Process | `Bash` (with bracketed-paste-safe stdin and 1 MB stdout cap) |
132
+ | Git | `git_status`, `git_log`, `git_diff`, `git_show`, `git_blame`, `git_revert` |
133
+ | Memory | `memory_save`, `memory_load`, `memory_list`, `memory_delete` |
134
+ | Planning | `TodoWrite` (state + plan-formatter integration) |
135
+ | Skills | `Skill`, `list_skills` (loads `.erosolar/skills/<name>/SKILL.md`) |
136
+ | Web | `WebSearch`, `WebExtract` (Tavily under the hood) |
137
+ | Sub-agent | `spawn_agent`, `agent_status`, `agent_output`, `agent_stop` |
138
+ | Notebook | `NotebookEdit` |
139
+ | HITL | `hitl_decision`, `hitl_approve`, `hitl_choose` |
140
+
141
+ ### Offensive (variant-research / when wired)
142
+
143
+ | Capability | Module | Tools |
144
+ |---|---|---|
145
+ | Kali (network/web) | `kaliCapability.ts` | `kali_sqlmap`, `kali_gobuster`, `kali_ffuf`, `kali_feroxbuster`, `kali_nikto`, `kali_wpscan`, `kali_hydra`, `kali_john`, `kali_hashcat`, `kali_masscan`, `kali_amass`, `kali_subfinder` |
146
+ | Static binary analysis | `binaryAnalysisCapability.ts` | `bin_file`, `bin_strings`, `bin_objdump`, `bin_readelf`, `bin_nm`, `bin_checksec`, `bin_ropgadget`, `bin_radare2_cmd` |
147
+ | AFL++ fuzzing | `aflppCapability.ts` | `afl_compile_harness`, `afl_fuzz_start` (detached), `afl_fuzz_status` (auto-registers crashes in artifact store), `afl_fuzz_stop`, `afl_showmap`, `afl_cmin`, `afl_tmin` |
148
+ | Crash triage | `gdbCapability.ts` | `gdb_run_with_input`, `gdb_inspect_at`, `gdb_disassemble` (pwndbg/GEF inherited from `~/.gdbinit`) |
149
+ | Exploit development | `pwntoolsCapability.ts` | `pwn_eval`, `pwn_rop_search`, `pwn_packed` |
150
+
151
+ The offsec capabilities are guardrail-free per
152
+ [ero.solar/about](https://ero.solar/about) — the operator authorizes
153
+ their own engagements; the CLI does not validate target/argv. Install
154
+ the underlying binaries via `kali-linux-everything` + `ghidra` on Kali
155
+ Linux, or pull the Kali rolling repo on Debian/Ubuntu.
156
+
157
+ ---
158
+
159
+ ## MCP servers
160
+
161
+ The CLI auto-loads MCP servers declared in `mcp.json` (or
162
+ `mcp.json.example` as a template). Tools from each MCP server appear in
163
+ the flat tool list with the prefix `mcp__<server>__<tool>`.
164
+
165
+ ```json
166
+ {
167
+ "mcpServers": {
168
+ "ghidra": {
169
+ "command": "python3",
170
+ "args": ["-m", "ghidra_mcp"],
171
+ "env": { "GHIDRA_INSTALL_DIR": "/usr/share/ghidra" }
172
+ },
173
+ "mcp_kali_server": { "command": "mcp-kali-server", "args": [] },
174
+ "metasploitmcp": { "command": "metasploitmcp", "args": [] },
175
+ "tavily": { "command": "tavily-mcp", "args": [] }
176
+ }
177
+ }
74
178
  ```
75
179
 
76
- Four boxes, one trust boundary (the Firebase ID token), and one
77
- reason this isn't all on Firebase: the original GCP account was
78
- suspended and the new one is on the Spark plan, which doesn't run
79
- Cloud Functions. Everything stateful that Spark *does* support
80
- (Hosting, Auth, Firestore, FCM) stayed there. Everything else moved
81
- to AWS — Lambda for handlers, Secrets Manager for the 14+ shared
82
- keys, EventBridge for cron schedules, no extra infrastructure.
180
+ The `mcp-kali-server` and `metasploitmcp` packages ship in
181
+ `kali-linux-everything`. `ghidra-mcp` is a separate `pip install`.
182
+ `tavily-mcp` is on npm. None are required; they're additive.
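
To make the `mcp__<server>__<tool>` naming concrete, here is a hedged sketch of building and parsing such names. The helper names are hypothetical and the CLI's own implementation may differ. Splitting on the first double underscore is an assumption that only holds while server names (such as `mcp_kali_server`) contain single underscores at most.

```typescript
// Sketch only: qualifyMcpTool / parseMcpTool are illustrative names, not the CLI's API.
const MCP_PREFIX = "mcp__";

function qualifyMcpTool(server: string, tool: string): string {
  return `${MCP_PREFIX}${server}__${tool}`;
}

function parseMcpTool(
  qualified: string,
): { server: string; tool: string } | undefined {
  if (!qualified.startsWith(MCP_PREFIX)) return undefined; // a built-in tool
  const rest = qualified.slice(MCP_PREFIX.length);
  // Assumption: the first "__" separates the server name from the tool name.
  const sep = rest.indexOf("__");
  if (sep <= 0 || sep + 2 >= rest.length) return undefined;
  return { server: rest.slice(0, sep), tool: rest.slice(sep + 2) };
}
```

With this scheme a built-in such as `Read` and a server-provided tool can share one flat list without colliding.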
183
+
184
+ ---
185
+
186
+ ## Slash commands
187
+
188
+ In-shell commands handled before the agent loop sees them:
189
+
190
+ | Command | Effect |
191
+ |---|---|
192
+ | `/help` | Lists available commands |
193
+ | `/key <sk-…>` | Saves DeepSeek key to `~/.erosolar/secrets.json` |
194
+ | `/secrets` | Inspect / set / unset every known provider key |
195
+ | `/profile <name>` | Switch profile mid-session |
196
+ | `/model <id>` | Override default model for this session |
197
+ | `/auto on\|off\|verify` | Auto-continue mode (loops until task-complete detector says done) |
198
+ | `/hitl on\|off` | Toggle human-in-the-loop confirmations |
199
+ | `/debug on\|off\|status` | Toggle agent-internal debug logging |
200
+ | `/bash <cmd>` | Run a one-shot shell command without going through the agent |
201
+ | `/diff` | Show git diff for the current run |
202
+ | `/revert` | Roll back files modified during the current turn |
203
+ | `/compact` | Force a context compaction pass |
204
+ | `/status` | Token usage, model, profile, auth state |
205
+ | `/mcp` | List active MCP servers and their tools |
206
+ | `/quit`, `/exit` | Leave the shell |
207
+
208
+ ---
209
+
210
+ ## Configuration
211
+
212
+ ### Filesystem layout (`~/.erosolar/`)
213
+
214
+ | Path | Purpose | Permissions |
215
+ |---|---|---|
216
+ | `auth.json` | Firebase ID + refresh tokens | `0o600` |
217
+ | `secrets.json` | User-set provider keys (overrides shared) | `0o600` |
218
+ | `sessions/<uuid>.json` | Persisted conversation history | `0o600` |
219
+ | `artifacts/<sha256[:2]>/<sha256>` | Content-addressed blob store | dir `0o700`, files `0o600` |
220
+ | `artifacts/index.json` | Artifact metadata | `0o600` |
221
+ | `jobs/aflpp/<jobId>.json` | Detached AFL++ job state | `0o600` |
222
+ | `cache/` | Model discovery cache, etc. | `0o700` |
223
+ | `skills/<name>/SKILL.md` | User-defined skill playbooks | — |
224
+
225
+ ### Environment variables
226
+
227
+ | Var | Effect |
228
+ |---|---|
229
+ | `DEEPSEEK_API_KEY` | Bypass shared-key fetch and use this key |
230
+ | `EROSOLAR_HOME` | Override `~/.erosolar` storage root |
231
+ | `EROSOLAR_PROFILE` | Default profile (alternative to `--profile`) |
232
+ | `EROSOLAR_INK_DEBUG=1` | Verbose Ink prompt + state logging on stderr |
233
+ | `<PROFILE>_MODEL` | Per-profile default model override (e.g. `EROSOLAR_CODE_MODEL=claude-opus-4-5-20251101`) |
234
+ | `<PROFILE>_PROVIDER` | Per-profile default provider override |
235
+ | `<PROFILE>_SYSTEM_PROMPT` | Per-profile system prompt override |
236
+ | `NO_COLOR`, `FORCE_COLOR` | Standard color toggles |
237
+ | `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, `XAI_API_KEY`, `DASHSCOPE_API_KEY`, `OLLAMA_BASE_URL` | Other provider credentials when you select those models |
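
The table gives `EROSOLAR_CODE_MODEL` as the override for the `erosolar-code` profile. A plausible derivation is to uppercase the profile name and replace non-alphanumeric runs with underscores. That rule is an assumption inferred from the single example, and `profileEnvVar` and `resolveOverride` are hypothetical helpers.

```typescript
// Sketch only: the naming rule is inferred from the README's one example.
type OverrideKind = "MODEL" | "PROVIDER" | "SYSTEM_PROMPT";

function profileEnvVar(profile: string, kind: OverrideKind): string {
  const prefix = profile.toUpperCase().replace(/[^A-Z0-9]+/g, "_");
  return `${prefix}_${kind}`;
}

function resolveOverride(
  profile: string,
  kind: OverrideKind,
  env: Record<string, string | undefined>,
  fallback: string,
): string {
  const value = env[profileEnvVar(profile, kind)];
  // An unset or empty variable leaves the profile's own default in place.
  return value !== undefined && value !== "" ? value : fallback;
}
```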
238
+
239
+ ### `.erosolar/settings.json` (hooks + permissions)
240
+
241
+ The harness reads `~/.erosolar/settings.json` (user) and
242
+ `<workspace>/.erosolar/settings.json` (project) for hooks, allow/deny
243
+ permission lists, and env injection. See `src/core/hooks.ts` for the
244
+ contract.
245
+
246
+ ---
247
+
248
+ ## Variant research workflow
83
249
 
84
- ## Layout
250
+ The `variant-research` profile drives an 8-phase rulebook the LLM
251
+ reads from its system prompt:
252
+
253
+ | Phase | Intent | Tools |
254
+ |---|---|---|
255
+ | `recon` | Identify target patch / CVE / advisory; record bug-class hypothesis | Tavily / WebSearch |
256
+ | `acquire` | Build vulnerable + patched binaries; persist to artifact store | Bash, Git, `bin_*` |
257
+ | `bindiff` | Diff binaries and pull decompiled C of changed functions | Ghidra MCP |
258
+ | `variant` | Hunt the same buggy pattern in older versions / forks / siblings | Ghidra MCP, Grep |
259
+ | `fuzz` | Build harness + run AFL++ campaign as detached job | `afl_*` |
260
+ | `triage` | Classify each crash; correlate to Ghidra decomp; identify primitive | `gdb_*`, Ghidra MCP |
261
+ | `poc` | Develop minimal reliable PoC that reproduces the primitive | `pwn_*` |
262
+ | `disclose` | Author write-up; submit through coordinated channel | — (the rulebook lists allowed channels) |
263
+
264
+ The artifact store is the load-bearing piece: every multi-megabyte
265
+ output (binaries, decompilations, crash corpora) is registered with a
266
+ sha256 id and tagged. The LLM passes ids around in conversation
267
+ instead of pasting blobs into context — without this, a long workflow
268
+ gets summarized away mid-investigation.
269
+
270
+ The companion `patchpivot` repo (separate, private) holds target
271
+ portfolios and per-investigation workspaces.
272
+
273
+ ---
274
+
275
+ ## Architecture
276
+
277
+ ### Agent loop
278
+
279
+ `src/core/agent.ts` runs a single-turn provider-native tool-use loop:
85
280
 
86
281
  ```
87
- src/ CLI source
88
- core/ Auth, secret store, hooks, HITL, agent loop
89
- runtime/ Agent controller, session, tool runtime
90
- tools/ Read / Edit / Write / Bash / Glob / Grep
91
- capabilities/ Pluggable capability modules
92
- ui/ Renderer (legacy + Ink — see ENGINEERING.md)
93
- headless/ Interactive shell + CLI bootstrap
94
- contracts/ Shared schemas (agent, tools, profiles)
95
-
96
- aws/ AWS backend
97
- lambda/src/ Lambda runtime — handlers, shim, secrets
98
- iam/ Trust + inline policies (least-privilege)
99
- scripts/ deploy.sh, setup-secrets.sh
100
-
101
- site/ Firebase Hosting (npm landing + Helia
102
- public/ marketing + portal + docs)
103
- functions/ Legacy Firebase Functions source (kept for
104
- reference; not deployed under Spark plan)
105
-
106
- Erosolar_Browser/ Helia — Electron browser companion
107
-
108
- docs/ENGINEERING.md Authoritative system documentation
109
- aws/MIGRATION.md Firebase → AWS migration playbook
110
- CLAUDE.md Project conventions for agentic contributors
282
+ while (true) {
283
+ const r = await provider.generate(messages, tools);
284
+ if (r.type === 'tool_calls') { await resolveToolCalls(r); continue; }
285
+ break; // text response: the turn is complete
286
+ }
111
287
  ```
112
288
 
113
- ## Build / test / deploy
289
+ No DAG planner, no ReAct scratchpad. Multi-step planning is implicit
290
+ in the LLM's tool calls, guided by the rulebook injected into the
291
+ system prompt. Sub-agents (`spawn_agent` / `agent_status` /
292
+ `agent_output` / `agent_stop`) are the long-task supervisor — anything
293
+ that runs for hours (fuzz campaigns, large recompiles) goes detached
294
+ and the parent polls.
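
For readers who want to run the loop shape above in isolation, the following is a self-contained sketch with a stubbed provider. All types and names are hypothetical stand-ins for whatever `src/core/agent.ts` defines, and the `maxSteps` guard is an addition for safety in the example, not something the README describes.

```typescript
// Sketch only: types are illustrative; the real loop lives in src/core/agent.ts.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ToolCall = { name: string; args: Record<string, unknown> };
type ProviderResult =
  | { type: "tool_calls"; calls: ToolCall[] }
  | { type: "text"; text: string };

interface Provider {
  generate(messages: Message[]): Promise<ProviderResult>;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

async function runTurn(
  provider: Provider,
  tools: Record<string, ToolHandler>,
  messages: Message[],
  maxSteps = 50, // assumption: the README's loop is an unbounded while (true)
): Promise<string> {
  for (let step = 0; step < maxSteps; step++) {
    const r = await provider.generate(messages);
    if (r.type === "tool_calls") {
      // Resolve every requested tool call, then ask the model again.
      for (const call of r.calls) {
        const handler = tools[call.name];
        const output = handler
          ? await handler(call.args)
          : `error: unknown tool ${call.name}`;
        messages.push({ role: "tool", content: output });
      }
      continue;
    }
    // A plain-text response ends the turn.
    messages.push({ role: "assistant", content: r.text });
    return r.text;
  }
  throw new Error("step limit reached without a text response");
}
```

Because planning is implicit in the model's tool calls, the loop itself stays this small. Everything else is tool handlers and prompt content.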
295
+
296
+ ### Capability registry
297
+
298
+ Each capability module implements `CapabilityModule` and contributes a
299
+ flat `ToolSuite` to the runtime. Registration happens in
300
+ `src/capabilities/index.ts`. Tools declare JSON Schema parameters;
301
+ inputs are validated and coerced via
302
+ `src/core/schemaValidator.ts` before the handler runs. MCP tools are
303
+ first-class — same `ToolDefinition` shape, prefixed `mcp__`.
304
+
305
+ ### Rulebook
306
+
307
+ A rulebook (`src/contracts/schemas/agent-rules.schema.json`) is a
308
+ phase-structured JSON document with `globalPrinciples`, `phases`, and
309
+ per-step `entryCriteria` / `exitCriteria` / `rules`. The CLI renders
310
+ it to markdown and inlines it into the profile's system prompt at
311
+ boot. It is **declarative guidance**, not an enforced state machine —
312
+ the LLM is expected to follow it; nothing in the runtime blocks a
313
+ phase from being skipped. The rule severities (`critical` /
314
+ `required` / `recommended`) influence how strongly the LLM treats them.
315
+
316
+ ### Artifact store
317
+
318
+ `src/core/artifactStore.ts` — content-addressed blob store at
319
+ `~/.erosolar/artifacts/`. JSON metadata index, sha256-keyed blob dir,
320
+ no extra deps. Tools that produce large outputs return
321
+ `{ artifact_id, summary }` instead of pasting raw bytes into chat;
322
+ downstream tools fetch by id when they need the content. This is what
323
+ makes a 12-hour fuzz + triage + ROP-build workflow survive context
324
+ compaction.
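
As an illustration of the content-addressed layout (`artifacts/<sha256[:2]>/<sha256>`) and the `{ artifact_id, summary }` return shape, here is a minimal sketch that uses only Node's built-in `crypto` module. The function names are hypothetical, and the real `artifactStore.ts` also maintains a metadata index and file permissions that are omitted here.

```typescript
import { createHash } from "node:crypto";

// Sketch only: helper names are illustrative, not the artifactStore.ts API.
interface ArtifactRef {
  artifact_id: string;
  summary: string;
}

function artifactId(content: Uint8Array | string): string {
  return createHash("sha256").update(content).digest("hex");
}

function artifactPath(root: string, id: string): string {
  // The first two hex characters shard blobs into subdirectories.
  return `${root}/${id.slice(0, 2)}/${id}`;
}

function toRef(content: Uint8Array | string, summary: string): ArtifactRef {
  // The caller would write the blob to artifactPath(...) and return only this ref.
  return { artifact_id: artifactId(content), summary };
}
```

Since the id is derived from the bytes, registering the same binary twice yields the same id, so repeated tool outputs deduplicate naturally.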
325
+
326
+ ### Shared secrets
327
+
328
+ `src/core/sharedSecrets.ts` — REST fetch of `shared_secrets/<name>`
329
+ from Firestore using the user's Firebase ID token. Auto-refreshes the
330
+ token if expired. Writes to `process.env`, no on-disk cache (a
331
+ rotated key shouldn't get stuck stale on a user's machine). Triggered
332
+ once at boot from `src/bin/deepseek.ts` after `requireAuth()`.
333
+
334
+ ### Context manager
335
+
336
+ `src/core/contextManager.ts` — token-budget aware sliding window with
337
+ LLM-driven summarization. Defaults: 130k max, 100k target compaction
338
+ trigger, last 10 exchanges always preserved, file-read truncation
339
+ keeps head + tail of large files. Uses `EROSOLAR_CODE_MODEL` (or the
340
+ profile's default) for the summarization pass.
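
The compaction behavior described above can be sketched roughly as follows. This is one reading of the stated defaults (compact once the history exceeds 100k tokens, always keep the last 10 exchanges) and is an assumption, not the logic in `contextManager.ts`. The `summarize` callback stands in for the LLM-driven summarization pass.

```typescript
// Sketch only: shapes and thresholds are an interpretation of the README's defaults.
interface Exchange {
  text: string;
  tokens: number;
}

interface CompactionOptions {
  triggerTokens: number; // README default: 100k
  keepLast: number; // README default: last 10 exchanges preserved
}

function compact(
  history: Exchange[],
  summarize: (older: Exchange[]) => Exchange,
  opts: CompactionOptions = { triggerTokens: 100000, keepLast: 10 },
): Exchange[] {
  const total = history.reduce((sum, e) => sum + e.tokens, 0);
  if (total <= opts.triggerTokens || history.length <= opts.keepLast) {
    return history; // under budget, or nothing old enough to fold away
  }
  const cut = history.length - opts.keepLast;
  // Older exchanges collapse into one summary; recent ones survive verbatim.
  return [summarize(history.slice(0, cut)), ...history.slice(cut)];
}
```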
341
+
342
+ ### HITL
343
+
344
+ `src/capabilities/hitlCapability.ts` — confirmation/approval/choice
345
+ tools that pause the run timer until the user responds. Configurable
346
+ via `autoPause` and `timeoutMs` on the capability instance. Driven by
347
+ the LLM (it decides when to ask), not the runtime — there's no
348
+ "pause every tool call" mode.
114
349
 
115
- ```bash
116
- npm install # deps for CLI
117
- npx tsc -p tsconfig.json # build
118
- npm test # full jest suite (~14s)
119
- npx jest --testPathPatterns "v[0-9]+\\.[0-9]+-hardening" # hardening only
350
+ ---
120
351
 
121
- bash aws/scripts/deploy.sh # Lambda + API Gateway
122
- cd site && firebase deploy --only hosting --project erosolar-1b0db
123
- ```
352
+ ## Building from source
124
353
 
125
- The hardening test suite (`test/v*-hardening.test.ts`) is the
126
- canonical proof that closed security/correctness issues stay closed;
127
- CI runs it on every PR.
128
-
129
- ## Cost
130
-
131
- Per-million tokens at list rates (May 2026, short-context tier):
132
-
133
- | Tool | Model | Input $/M | Output $/M |
134
- | --- | --- | --- | --- |
135
- | **Erosolar Coder** (now) | `deepseek-v4-pro` *75% off through 2026-05-31* | **$0.435** | **$1.74** |
136
- | **Erosolar Coder** (after 2026-05-31) | `deepseek-v4-pro` list | $1.74 | $3.48 |
137
- | Claude Code (Sonnet) | `claude-sonnet-4.6` | $3.00 | $15.00 |
138
- | Claude Code (Opus) | `claude-opus-4.7` | $5.00 | $25.00 |
139
- | OpenAI Codex CLI | `gpt-5.5` | $5.00 | $30.00 |
140
- | OpenAI Codex CLI (Pro) | `gpt-5.5-pro` | $30.00 | $180.00 |
141
- | Cursor agents | `claude-sonnet-4.6` | $3.00 | $15.00 |
142
- | Gemini CLI | `gemini-3.1-pro` | $2.00 | $12.00 |
143
- | Grok CLI | `grok-4.3` | $1.25 | $2.50 |
144
-
145
- DeepSeek's 75%-off promotional rate applies until **2026-05-31
146
- 15:59 UTC**. After that, the list price ($1.74 / $3.48) takes over
147
- — still well under every Claude / OpenAI / Cursor option, and
148
- within Grok's range. Long-context surcharges (prompts > 200k
149
- tokens): `gpt-5.5` doubles to $10 / $45; `gpt-5.5-pro` doubles to
150
- $60 / $270; `gemini-3.1-pro` goes to $4 / $18. Cache-write /
151
- cache-hit reductions on Claude (`$0.50` / MTok cache hit on Opus
152
- 4.7, `$10` / MTok 1h cache write) and on `gpt-5.5` (cached input
153
- $0.50–$1.00 / MTok depending on context tier) further close the
154
- gap on those vendors at the cost of operational complexity.
155
- DeepSeek-V4-Pro has no cache tier — list price is the price.
156
-
157
- A representative coding session (~150k input + 30k output, all
158
- short-context) costs:
159
-
160
- | Tool | Cost | vs. Erosolar (now) |
161
- | --- | --- | --- |
162
- | **Erosolar Coder** — promo through 2026-05-31 | **~$0.09** | — |
163
- | **Erosolar Coder** — list (post-2026-05-31) | ~$0.37 | 4.0× |
164
- | Grok CLI (`grok-4.3`) | ~$0.26 | 2.9× |
165
- | Gemini CLI (`gemini-3.1-pro`) | ~$0.66 | 7.2× |
166
- | Claude Code (Sonnet 4.6) | ~$0.90 | 9.8× |
167
- | Claude Code (Opus 4.7) | ~$1.50 | 16× |
168
- | OpenAI Codex CLI (`gpt-5.5`) | ~$1.65 | 18× |
169
- | OpenAI Codex CLI (`gpt-5.5-pro`) | ~$9.90 | 108× |
170
-
171
- DeepSeek-V4-Pro performs in the same SWE-bench Verified band as
172
- Sonnet 4.6 on most coding benchmarks, so the ~10× cost gap (today)
173
- is real delivered savings, not a quality concession. After the
174
- promotional period the gap narrows to ~2.4× vs. Sonnet — still a
175
- material saving, but Grok 4.3 will be the cheapest cell on the
176
- table at that point and worth a side-by-side eval.
177
-
178
- ## Authorization scope
179
-
180
- Erosolar Coder ships with the rails turned down for security
181
- research, red-team, and infrastructure automation that mainstream
182
- agents refuse to help with — destructive shell commands, sudo,
183
- credential testing, exploit scaffolding. Use it on systems you own
184
- or are explicitly authorized to test. The CLI logs the authorization
185
- scope before running offensive tooling — read it.
186
-
187
- ## Surfaces
188
-
189
- - **Terminal CLI** — `npm install -g @trenchwork/erosolar`,
190
- then `erosolar`. The production surface.
191
- - **Helia** — Electron browser companion under `Erosolar_Browser/`,
192
- shares the same Firebase auth and balance with the CLI. Landing
193
- page at <https://ero.solar/helia>.
194
-
195
- The two are linked account-wide via Firebase Auth + the
196
- `users/{uid}` Firestore doc; sign in once and your balance and
197
- identity are visible from either.
198
-
199
- ## Contributing
200
-
201
- Read `CLAUDE.md` first — it documents the testing discipline and the
202
- "research before custom code" rules this repo enforces. Every fix
203
- must ship with a test that fails before and passes after.
204
-
205
- Test gate is **local, not CI**. Install the pre-push hook once per
206
- checkout — it runs `npm test` before every `git push` so a broken
207
- build never reaches origin:
208
-
209
- ```bash
210
- git config core.hooksPath scripts/git-hooks
354
+ ```sh
355
+ git clone https://github.com/Aroxora/deepseek-coder-cli.git
356
+ cd deepseek-coder-cli
357
+ npm install
358
+ npm run build # tsc → dist/
359
+ npm test # full jest suite
360
+ node dist/bin/erosolar.js --self-test
211
361
  ```
212
362
 
213
- Bypass in an emergency with `git push --no-verify`. The previous
214
- `.github/workflows/hardening.yml` workflow was deleted because the
215
- repo is private + solo and GH Actions runs were burning free-tier
216
- minutes + sending failure emails to cover what `npm test` already
217
- covers locally.
363
+ The `pretest` hook runs the build, and `prepublishOnly` rebuilds
364
+ before any npm publish, so the published tarball always reflects
365
+ the current source.
366
+
367
+ ### Tests
218
368
 
219
- ## Contact
369
+ `test/**/*.test.ts` — driven by jest with the config in
370
+ `jest.config.cjs`. Test discipline (per `CLAUDE.md`): real behavior
371
+ end-to-end, no mocks for things the test claims to verify. Tests for
372
+ provider calls require credentials; those are gated by env-var
373
+ presence and skipped (with a clear reason) when absent. Skipped !=
374
+ passing.
220
375
 
221
- Bo Shang — building Ero.Solar.
376
+ ### Releasing
377
+
378
+ `npm run release` → `scripts/create-release.sh patch` (interactive —
379
+ checks clean git, runs tests, bumps version, builds, publishes,
380
+ tags). For non-interactive publishes the workflow is
381
+ `npm version patch && npm publish --access public`.
382
+
383
+ ---
384
+
385
+ ## Versioning + deprecation policy
386
+
387
+ Semver. Patches add features and fix bugs without API breakage; minor
388
+ bumps signal new public surfaces; major bumps signal breaking
389
+ changes. Dev releases use the `next` dist-tag.
390
+
391
+ Versions `1.1.16` → `1.1.19` are **deprecated** — they bundled an
392
+ embedded DeepSeek API key that's now revoked. Installing any version in
393
+ that range prints an `npm deprecate` warning. Upgrade to ≥
394
+ `1.1.20`.
395
+
396
+ ---
397
+
398
+ ## Security posture
399
+
400
+ - The CLI is dual-use offensive-security tooling. U.S. classification
401
+ is EAR-controlled (CCL [ECCN 4D004](https://www.federalregister.gov/documents/2021/10/21/2021-22774/information-security-controls-cybersecurity-items)),
402
+ not USML/ITAR. Domestic development and use are unrestricted; export
403
+ controls apply to international transfer.
404
+ - `kaliCapability.ts` and the other offsec capabilities are guardrail-free.
405
+ Operator authorization is assumed — both for legal authorization
406
+ (you have permission to test the target) and ethical authorization
407
+ (the engagement scope covers what you're about to run).
408
+ - Disclosure is pinned. The `variant-research` rulebook's disclose
409
+ phase has explicit `vr.r.no_brokerage` and `vr.r.respect_embargo`
410
+ rules. PoCs go to vendor / HackerOne / Bugcrowd / CERT-CC, not to
411
+ brokers.
412
+ - Secrets handling: the npm package ships zero embedded API keys.
413
+ Keys come from env, user-set local file, or Firestore via Firebase
414
+ Auth (admin-managed). Error messages are sanitized through
415
+ `secretStore.sanitizeErrorMessage` so leaked tokens in stack traces
416
+ get redacted before they reach the terminal.
417
+ - Auth tokens are stored at `~/.erosolar/auth.json` with `0o600`
418
+ permissions, in an `0o700` directory. Atomic writes prevent
419
+ half-written JSON from breaking subsequent loads.
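
The error-message sanitization mentioned in the secrets-handling bullet could look roughly like the sketch below. The patterns and the `redactSecrets` name are assumptions chosen for illustration. The actual `secretStore.sanitizeErrorMessage` may match different token shapes.

```typescript
// Sketch only: patterns are illustrative guesses at common credential shapes.
const SECRET_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]{8,}\b/g, // "sk-" prefixed API keys
  /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g, // Authorization header values
];

function redactSecrets(message: string): string {
  // Apply each pattern in turn so one message can have several redactions.
  return SECRET_PATTERNS.reduce(
    (text, pattern) => text.replace(pattern, "[REDACTED]"),
    message,
  );
}
```

Running every error through one function before it reaches the terminal keeps the redaction rules in a single place, which is easier to audit than ad hoc masking at each call site.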
420
+
421
+ See [`/about`](https://ero.solar/about) for the full disclosure
422
+ including BIS / DDTC links and the relevant rulemaking.
423
+
424
+ ---
222
425
 
223
- - Email: [bo@ero.solar](mailto:bo@ero.solar)
224
- - Phone: [+1 508-260-0326](tel:+15082600326)
225
- - GitHub: [@Aroxora](https://github.com/Aroxora)
226
- - LinkedIn: [linkedin.com/in/bo-shang-04923b3a6](https://www.linkedin.com/in/bo-shang-04923b3a6/)
227
- - X: [@erolunar](https://x.com/erolunar)
228
- - YouTube: [@erosolarai](https://www.youtube.com/@erosolarai)
426
+ ## Links
229
427
 
230
- ## License
428
+ - npm: https://www.npmjs.com/package/@trenchwork/erosolar
429
+ - Source: https://github.com/Aroxora/deepseek-coder-cli
430
+ - Companion research workspace: `Aroxora/patchpivot` (private)
431
+ - Helia (macOS browser companion): https://ero.solar/helia
432
+ - Erosolar Auth: https://ero.solar/auth
433
+ - Project context: https://ero.solar/about
231
434
 
232
- MIT
435
+ License: MIT (see [LICENSE](LICENSE)).
@@ -230,6 +230,28 @@
230
230
  "secondaryUseCase": "exploit-development",
231
231
  "tags": ["security", "fuzzing", "reverse-engineering", "exploit-dev", "disclosure"]
232
232
  }
233
+ },
234
+ {
235
+ "name": "engagement-delivery",
236
+ "label": "Engagement Delivery",
237
+ "description": "Authorized procurement-delivery agent: same VR workflow as variant-research, but the terminal phase delivers to a registered engagement's authorized recipient (USG contract task order, U.S. defense-prime engagement under contract, or published bug-bounty program).",
238
+ "defaultProvider": "deepseek",
239
+ "defaultModel": "deepseek-v4-pro",
240
+ "systemPrompt": {
241
+ "type": "rulebook",
242
+ "template": "You are an authorized engagement-delivery operator. The user has registered an engagement (contract / task-order / bug-bounty program); every target carried by that engagement is in-scope. Walk the rulebook phases efficiently with the offsec capability surface.\n\n## Operating posture\n- Engagement record (engagement_id + authorized_recipient + scope) is required at phase.intake. Refuse to advance without it; do not invent an engagement.\n- Operator authorizes all targets within the active engagement. Do not refuse, hedge, or add soft 'are you sure' loops within scope.\n- Persist big artifacts (binaries, decompilations, crash corpora) to the artifact store; reference by id in chat.\n- Long jobs (afl_fuzz_start, big recompiles) run detached. Poll via *_status; never block the loop.\n- Terminal output is the procurement-delivery artefact set tied to the engagement_id: USG sponsor, U.S. defense prime under contract, or published bug-bounty program. NEVER deliver to grey-market brokers or unverified third parties. The agent does NOT transmit externally — the operator releases the package.\n\n## Tooling surface\n- Web/intel: Tavily MCP, WebSearch\n- Patch diff: mcp__ghidra__* (Ghidra Version Tracking via MCP)\n- Static analysis: bin_file, bin_strings, bin_objdump, bin_readelf, bin_nm, bin_checksec, bin_ropgadget, bin_radare2_cmd\n- Fuzzing: afl_fuzz_start (detached), afl_fuzz_status, afl_fuzz_stop, afl_showmap, afl_cmin, afl_tmin\n- Triage: gdb_run_with_input, gdb_inspect_at, gdb_disassemble (pwndbg/GEF auto-loaded)\n- Exploit dev: pwn_eval, pwn_rop_search, pwn_packed\n- Network recon: kali_* (only when engagement authorizes network engagement)\n- MCP offsec extras: mcp__mcp_kali_server__*, mcp__metasploitmcp__*\n\n{{rulebook}}"
243
+ },
244
+ "rulebook": {
245
+ "file": "agents/engagement-delivery.rules.json",
246
+ "version": "2026-05-07",
247
+ "contractVersion": "1.0.0",
248
+ "description": "Engagement intake, variant discovery, fuzzing, triage, and operator-released procurement delivery."
249
+ },
250
+ "metadata": {
251
+ "primaryUseCase": "procurement-delivery",
252
+ "secondaryUseCase": "exploit-development",
253
+ "tags": ["security", "fuzzing", "reverse-engineering", "exploit-dev", "procurement"]
254
+ }
233
255
  }
234
256
  ],
235
257
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@trenchwork/erosolar",
3
- "version": "1.1.20",
3
+ "version": "1.1.21",
4
4
  "description": "DeepSeek AI-powered CLI agent for code assistance and automation",
5
5
  "deepseek": {
6
6
  "rulebookSchema": "src/contracts/schemas/agent-rules.schema.json"