npm - @trenchwork/erosolar - Versions diffs - 1.1.20 → 1.1.22 - Mend

@trenchwork/erosolar 1.1.20 → 1.1.22

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/README.md +401 -198
package/dist/contracts/agent-schemas.json +22 -0
package/dist/headless/interactiveShell.js +17 -17
package/dist/headless/interactiveShell.js.map +1 -1
package/dist/ui/ink/App.d.ts.map +1 -1
package/dist/ui/ink/App.js +1 -1
package/dist/ui/ink/App.js.map +1 -1
package/dist/ui/ink/ChatStatic.d.ts +1 -1
package/dist/ui/ink/ChatStatic.d.ts.map +1 -1
package/dist/ui/ink/ChatStatic.js +19 -3
package/dist/ui/ink/ChatStatic.js.map +1 -1
package/dist/ui/ink/InkPromptController.js +1 -1
package/dist/ui/ink/InkPromptController.js.map +1 -1
package/dist/ui/ink/StatusLine.d.ts.map +1 -1
package/dist/ui/ink/StatusLine.js +1 -1
package/dist/ui/ink/StatusLine.js.map +1 -1
package/package.json +1 -1

package/README.md CHANGED

@@ -1,232 +1,435 @@
-# Erosolar Coder
+# Erosolar Coder · @trenchwork/erosolar
 [![npm version](https://img.shields.io/npm/v/@trenchwork/erosolar)](https://www.npmjs.com/package/@trenchwork/erosolar)
+[![license: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
-> **First public research run — 3 hours of unattended offensive
-> security research, useful enough to submit to Google Bug Hunters.**
->
-> The first prompt I asked
-> [erosolar](https://www.npmjs.com/package/@trenchwork/erosolar)
-> to run autonomously was an automated security-research pass for
-> submission to the [Google Bug Hunters](https://bughunters.google.com/)
-> program. It ran unattended for **3 continuous hours** on a CLI
-> still under initial development and produced useful offensive-
-> security research. For a starting CLI that's an amazing accomplishment.
->
-> The open-source first-prompt result is at
-> [`Aroxora/google-bug-hunters-initial-AI-research-and-offensive-research-results`](https://github.com/Aroxora/google-bug-hunters-initial-AI-research-and-offensive-research-results).
-> All subsequent research is in a separate **private** repository
-> under responsible-AI and cyber-safety guidelines, by my own AI-safety,
-> regulatory, and moral judgement. I'm open-sourcing the first prompt
-> that ran — which showed extreme promise — to share the research
-> result responsibly with others.
+A multi-provider terminal agent for coding work and authorized offensive
+security research. Default backend is DeepSeek-V4-Pro (via a shared
+key gated by Erosolar Auth, no key bundled in the package), with
+Anthropic / OpenAI / Google / xAI / Qwen / Ollama as opt-in providers.
----
+This README documents the **current shipped state** of the CLI: the
+agent loop, capability surface, profiles, rulebook system, MCP
+integration, artifact store, and the auth-gated key fetch. Sections
+toward the end cover architecture and contribution. Examples,
+"try asking…" hints, and onboarding fluff are deliberately absent —
+[CLAUDE.md](CLAUDE.md) is strict about that.
-Erosolar Coder is an AI-powered CLI agent that uses DeepSeek-V4-Pro to
-deliver coding assistance at ~30× the cost-efficiency of comparable
-agents. The CLI is the production-blessed surface for coding work, and
-[Helia](https://ero.solar/helia) is the companion macOS browser with
-the same Erosolar agent in a side panel. The two share auth, memory,
-balance, and the model brain via the AWS Lambda backend.
-The agent is built guardrail-free for full-machine offensive-security
-research. The source is privately held. Under U.S. law, offensive-cyber
-tooling is a *dual-use commercial item* (Commerce Control List,
-[ECCN 4D004](https://www.federalregister.gov/documents/2021/10/21/2021-22774/information-security-controls-cybersecurity-items)),
-not a "defense article" on the U.S. Munitions List — so it is not a
-"weapon" in the ITAR sense. EAR controls govern *international export*;
-they do not restrict private domestic development or sale to U.S.
-government agencies, and BIS's vulnerability-disclosure carve-out
-explicitly exempts ordinary security-research activity. See
-[`/about`](https://ero.solar/about) for the full disclosure (with links
-to BIS / DDTC and the relevant rulemaking) and
-[`docs/ENGINEERING.md`](docs/ENGINEERING.md) if you're trying to
-understand how the system actually works.
+---
 ## Install
-```bash
+```sh
 npm install -g @trenchwork/erosolar
 ```
-Exposes two CLIs: `deepseek`, `erosolar` (synonyms; pick the one you
-prefer).
+Two CLI binaries land on `$PATH`, both pointing to the same entry:
+- `erosolar` — preferred
+- `trenchwork` — alternate name
-```bash
-erosolar                       # interactive shell
-erosolar -q "explain X"        # one-shot prompt
-git diff | erosolar            # pipe mode
-erosolar --key sk-...          # set DeepSeek key
+Node ≥ 20 required (`"engines": { "node": ">=20.0.0" }`).
+### First-run boot
+```sh
+erosolar
 ```
-In-shell commands: `/model`, `/secrets`, `/help`, `/clear`. See
-`/help` for the full list.
+On first launch the CLI:
+1. Pops a browser tab to `https://ero.solar/auth?port=…`. You sign in
+   (Erosolar Auth — Firebase Auth under the hood, project
+   `erosolar-1b0db`).
+2. Stores a Firebase ID token + refresh token at `~/.erosolar/auth.json`
+   (`0o600`). The refresh token is long-lived; the ID token is auto-
+   renewed before expiry.
+3. Fetches the shared DeepSeek API key from Firestore at
+   `shared_secrets/deepseek` and writes it into `process.env.DEEPSEEK_API_KEY`
+   for the duration of the process. **Nothing lands on disk.** Reads
+   are gated by Firestore security rules to authenticated users only;
+   writes are admin-only.
+4. Starts the interactive Ink-based shell.
+### Bring-your-own key
+If you'd rather use your own DeepSeek key (or any other provider key),
+the Firestore fetch is skipped. Resolution order is:
+1. `process.env.DEEPSEEK_API_KEY` (env override always wins)
+2. `~/.erosolar/secrets.json` (set in-CLI via `/key sk-…` or `/secrets`)
+3. Firestore `shared_secrets/deepseek` (the post-login fallback)
+Same chain applies to `TAVILY_API_KEY` and any other provider key the
+shared-secrets module knows about.
+### Modes
+```sh
+erosolar                        # interactive
+erosolar -q "explain X"         # one-shot, prints answer, exits
+git diff | erosolar             # pipe mode — stdin becomes the prompt
+erosolar --profile variant-research "<goal>"   # use a non-default profile
+erosolar --self-test            # provider/auth/capability smoke test
+```
-## How it works (skim)
+---
+## Profiles
+A **profile** binds a system prompt template, default model/provider,
+and a [rulebook](#rulebooks) (a phase-structured guidance document the
+LLM reads from its system prompt).
+### `erosolar-code` (default)
+General-purpose terminal agent for coding work. Default model
+`deepseek-v4-pro`, default provider `deepseek`. The full default tool
+inventory is mounted: filesystem (Read/Write/Edit/MultiEdit), Bash,
+Glob/Grep/Search, Git (status/log/diff/show/blame), TodoWrite, Memory
+(save/load/list/delete), Skills, NotebookEdit, WebSearch/WebExtract,
+HITL, plus any MCP servers you declare in `mcp.json`. Rulebook at
+[`agents/erosolar-code.rules.json`](agents/erosolar-code.rules.json).
+### `variant-research`
+Authorized offsec / vulnerability-research surface. Walks the standard
+n-day-to-0-day pivot: recon → vulnerable+patched binary acquisition →
+patch diff (Ghidra MCP) → variant search → fuzz campaign (AFL++) →
+crash triage (gdb/pwndbg) → PoC development (pwntools) → coordinated
+disclosure. Rulebook at
+[`agents/variant-research.rules.json`](agents/variant-research.rules.json).
+The disclosure terminal is **pinned to coordinated channels**
+(HackerOne / Bugcrowd / vendor PSIRT / CERT-CC / internal write-up /
+90-day published advisory). The rulebook contains an explicit
+`vr.r.no_brokerage` rule: it is not a vector for selling unreported
+exploits.
+Operator authorizes targets — the CLI does not second-guess argv or
+target identifiers. The companion research workspace lives at
+[`Aroxora/patchpivot`](https://github.com/Aroxora/patchpivot) (private):
+a target portfolio, per-investigation findings dirs, and a disclosure
+log all driven by this profile.
+```sh
+erosolar --profile variant-research "investigate the patch at <commit-url>"
 ```
-CLI ──Firebase ID token──▶ AWS API Gateway ─▶ AWS Lambda
- │                                              │
- ▼                                              ▼
-Firebase Hosting + Auth                    DeepSeek / Stripe / GitHub /
-+ Firestore (Spark plan)                   Tavily / Anthropic / Proton SMTP
+---
+## Capability surface
+Capabilities are typed wrappers around external tools, registered in
+`src/capabilities/` and exposed flat in the LLM's tool list.
+### Coding default (always on)
+| Family | Tools |
+|---|---|
+| Filesystem | `Read`, `Write`, `Edit`, `MultiEdit`, `Glob`, `Grep`, `Search` |
+| Process | `Bash` (with bracketed-paste-safe stdin and 1MB stdout cap) |
+| Git | `git_status`, `git_log`, `git_diff`, `git_show`, `git_blame`, `git_revert` |
+| Memory | `memory_save`, `memory_load`, `memory_list`, `memory_delete` |
+| Planning | `TodoWrite` (state + plan-formatter integration) |
+| Skills | `Skill`, `list_skills` (loads `.erosolar/skills/<name>/SKILL.md`) |
+| Web | `WebSearch`, `WebExtract` (Tavily under the hood) |
+| Sub-agent | `spawn_agent`, `agent_status`, `agent_output`, `agent_stop` |
+| Notebook | `NotebookEdit` |
+| HITL | `hitl_decision`, `hitl_approve`, `hitl_choose` |
+### Offensive (variant-research / when wired)
+| Capability | Module | Tools |
+|---|---|---|
+| Kali (network/web) | `kaliCapability.ts` | `kali_sqlmap`, `kali_gobuster`, `kali_ffuf`, `kali_feroxbuster`, `kali_nikto`, `kali_wpscan`, `kali_hydra`, `kali_john`, `kali_hashcat`, `kali_masscan`, `kali_amass`, `kali_subfinder` |
+| Static binary analysis | `binaryAnalysisCapability.ts` | `bin_file`, `bin_strings`, `bin_objdump`, `bin_readelf`, `bin_nm`, `bin_checksec`, `bin_ropgadget`, `bin_radare2_cmd` |
+| AFL++ fuzzing | `aflppCapability.ts` | `afl_compile_harness`, `afl_fuzz_start` (detached), `afl_fuzz_status` (auto-registers crashes in artifact store), `afl_fuzz_stop`, `afl_showmap`, `afl_cmin`, `afl_tmin` |
+| Crash triage | `gdbCapability.ts` | `gdb_run_with_input`, `gdb_inspect_at`, `gdb_disassemble` (pwndbg/GEF inherited from `~/.gdbinit`) |
+| Exploit development | `pwntoolsCapability.ts` | `pwn_eval`, `pwn_rop_search`, `pwn_packed` |
+The offsec capabilities are guardrail-free per
+[ero.solar/about](https://ero.solar/about) — the operator authorizes
+their own engagements; the CLI does not validate target/argv. Install
+the underlying binaries via `kali-linux-everything` + `ghidra` on Kali
+Linux, or pull the Kali rolling repo on Debian/Ubuntu.
+---
+## MCP servers
+The CLI auto-loads MCP servers declared in `mcp.json` (or
+`mcp.json.example` as a template). Tools from each MCP server appear in
+the flat tool list with the prefix `mcp__<server>__<tool>`.
+```json
+{
+  "mcpServers": {
+    "ghidra": {
+      "command": "python3",
+      "args": ["-m", "ghidra_mcp"],
+      "env": { "GHIDRA_INSTALL_DIR": "/usr/share/ghidra" }
+    },
+    "mcp_kali_server": { "command": "mcp-kali-server", "args": [] },
+    "metasploitmcp":   { "command": "metasploitmcp",    "args": [] },
+    "tavily":          { "command": "tavily-mcp",        "args": [] }
+  }
+}
 ```
-Four boxes, one trust boundary (the Firebase ID token), and one
-reason this isn't all on Firebase: the original GCP account was
-suspended and the new one is on the Spark plan, which doesn't run
-Cloud Functions. Everything stateful that Spark *does* support
-(Hosting, Auth, Firestore, FCM) stayed there. Everything else moved
-to AWS — Lambda for handlers, Secrets Manager for the 14+ shared
-keys, EventBridge for cron schedules, no extra infrastructure.
+The `mcp-kali-server` and `metasploitmcp` packages ship in
+`kali-linux-everything`. `ghidra-mcp` is a separate `pip install`.
+`tavily-mcp` is on npm. None are required; they're additive.
+---
+## Slash commands
+In-shell commands handled before the agent loop sees them:
+| Command | Effect |
+|---|---|
+| `/help` | Lists available commands |
+| `/key <sk-…>` | Saves DeepSeek key to `~/.erosolar/secrets.json` |
+| `/secrets` | Inspect / set / unset every known provider key |
+| `/profile <name>` | Switch profile mid-session |
+| `/model <id>` | Override default model for this session |
+| `/auto on\|off\|verify` | Auto-continue mode (loops until task-complete detector says done) |
+| `/hitl on\|off` | Toggle human-in-the-loop confirmations |
+| `/debug on\|off\|status` | Toggle agent-internal debug logging |
+| `/bash <cmd>` | Run a one-shot shell command without going through the agent |
+| `/diff` | Show git diff for the current run |
+| `/revert` | Roll back files modified during the current turn |
+| `/compact` | Force a context compaction pass |
+| `/status` | Token usage, model, profile, auth state |
+| `/mcp` | List active MCP servers and their tools |
+| `/quit`, `/exit` | Leave the shell |
+---
+## Configuration
+### Filesystem layout (`~/.erosolar/`)
+| Path | Purpose | Permissions |
+|---|---|---|
+| `auth.json` | Firebase ID + refresh tokens | `0o600` |
+| `secrets.json` | User-set provider keys (overrides shared) | `0o600` |
+| `sessions/<uuid>.json` | Persisted conversation history | `0o600` |
+| `artifacts/<sha256[:2]>/<sha256>` | Content-addressed blob store | dir `0o700`, files `0o600` |
+| `artifacts/index.json` | Artifact metadata | `0o600` |
+| `jobs/aflpp/<jobId>.json` | Detached AFL++ job state | `0o600` |
+| `cache/` | Model discovery cache, etc. | `0o700` |
+| `skills/<name>/SKILL.md` | User-defined skill playbooks | — |
+### Environment variables
+| Var | Effect |
+|---|---|
+| `DEEPSEEK_API_KEY` | Bypass shared-key fetch and use this key |
+| `EROSOLAR_HOME` | Override `~/.erosolar` storage root |
+| `EROSOLAR_PROFILE` | Default profile (alternative to `--profile`) |
+| `EROSOLAR_INK_DEBUG=1` | Verbose Ink prompt + state logging on stderr |
+| `<PROFILE>_MODEL` | Per-profile default model override (e.g. `EROSOLAR_CODE_MODEL=claude-opus-4-5-20251101`) |
+| `<PROFILE>_PROVIDER` | Per-profile default provider override |
+| `<PROFILE>_SYSTEM_PROMPT` | Per-profile system prompt override |
+| `NO_COLOR`, `FORCE_COLOR` | Standard color toggles |
+| `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, `XAI_API_KEY`, `DASHSCOPE_API_KEY`, `OLLAMA_BASE_URL` | Other provider credentials when you select those models |
+### `.erosolar/settings.json` (hooks + permissions)
+The harness reads `~/.erosolar/settings.json` (user) and
+`<workspace>/.erosolar/settings.json` (project) for hooks, allow/deny
+permission lists, and env injection. See `src/core/hooks.ts` for the
+contract.
+---
+## Variant research workflow
-## Layout
+The `variant-research` profile drives an 8-phase rulebook the LLM
+reads from its system prompt:
+| Phase | Intent | Tools |
+|---|---|---|
+| `recon` | Identify target patch / CVE / advisory; record bug-class hypothesis | Tavily / WebSearch |
+| `acquire` | Build vulnerable + patched binaries; persist to artifact store | Bash, Git, `bin_*` |
+| `bindiff` | Diff binaries and pull decompiled C of changed functions | Ghidra MCP |
+| `variant` | Hunt the same buggy pattern in older versions / forks / siblings | Ghidra MCP, Grep |
+| `fuzz` | Build harness + run AFL++ campaign as detached job | `afl_*` |
+| `triage` | Classify each crash; correlate to Ghidra decomp; identify primitive | `gdb_*`, Ghidra MCP |
+| `poc` | Develop minimal reliable PoC that reproduces the primitive | `pwn_*` |
+| `disclose` | Author write-up; submit through coordinated channel | — (the rulebook lists allowed channels) |
+The artifact store is the load-bearing piece: every multi-megabyte
+output (binaries, decompilations, crash corpora) is registered with a
+sha256 id and tagged. The LLM passes ids around in conversation
+instead of pasting blobs into context — without this, a long workflow
+gets summarized away mid-investigation.
+The companion `patchpivot` repo (separate, private) holds target
+portfolios and per-investigation workspaces.
+---
+## Architecture
+### Agent loop
+`src/core/agent.ts` runs a single-turn provider-native tool-use loop:
 ```
-src/                       CLI source
-  core/                    Auth, secret store, hooks, HITL, agent loop
-  runtime/                 Agent controller, session, tool runtime
-  tools/                   Read / Edit / Write / Bash / Glob / Grep
-  capabilities/            Pluggable capability modules
-  ui/                      Renderer (legacy + Ink — see ENGINEERING.md)
-  headless/                Interactive shell + CLI bootstrap
-  contracts/               Shared schemas (agent, tools, profiles)
-aws/                       AWS backend
-  lambda/src/              Lambda runtime — handlers, shim, secrets
-  iam/                     Trust + inline policies (least-privilege)
-  scripts/                 deploy.sh, setup-secrets.sh
-site/                      Firebase Hosting (npm landing + Helia
-  public/                  marketing + portal + docs)
-  functions/               Legacy Firebase Functions source (kept for
-                           reference; not deployed under Spark plan)
-Erosolar_Browser/          Helia — Electron browser companion
-docs/ENGINEERING.md        Authoritative system documentation
-aws/MIGRATION.md           Firebase → AWS migration playbook
-CLAUDE.md                  Project conventions for agentic contributors
+while (true) {
+  const r = await provider.generate(messages, tools);
+  if (r.type === 'tool_calls') { await resolveToolCalls(r); continue; }
+  break;   // text response — turn complete
+}
 ```
-## Build / test / deploy
+No DAG planner, no ReAct scratchpad. Multi-step planning is implicit
+in the LLM's tool calls, guided by the rulebook injected into the
+system prompt. Sub-agents (`spawn_agent` / `agent_status` /
+`agent_output` / `agent_stop`) are the long-task supervisor — anything
+that runs for hours (fuzz campaigns, large recompiles) goes detached
+and the parent polls.
+### Capability registry
+Each capability module implements `CapabilityModule` and contributes a
+flat `ToolSuite` to the runtime. Registration happens in
+`src/capabilities/index.ts`. Tools declare JSON Schema parameters,
+inputs are validated and coerced via
+`src/core/schemaValidator.ts` before the handler runs. MCP tools are
+first-class — same `ToolDefinition` shape, prefixed `mcp__`.
+### Rulebook
+A rulebook (`src/contracts/schemas/agent-rules.schema.json`) is a
+phase-structured JSON document with `globalPrinciples`, `phases`, and
+per-step `entryCriteria` / `exitCriteria` / `rules`. The CLI renders
+it to markdown and inlines it into the profile's system prompt at
+boot. It is **declarative guidance**, not an enforced state machine —
+the LLM is expected to follow it; nothing in the runtime blocks a
+phase from being skipped. The rule severities (`critical` /
+`required` / `recommended`) influence how strongly the LLM treats them.
+### Artifact store
+`src/core/artifactStore.ts` — content-addressed blob store at
+`~/.erosolar/artifacts/`. JSON metadata index, sha256-keyed blob dir,
+no extra deps. Tools that produce large outputs return
+`{ artifact_id, summary }` instead of pasting raw bytes into chat;
+downstream tools fetch by id when they need the content. This is what
+makes a 12-hour fuzz + triage + ROP-build workflow survive context
+compaction.
+### Shared secrets
+`src/core/sharedSecrets.ts` — REST fetch of `shared_secrets/<name>`
+from Firestore using the user's Firebase ID token. Auto-refreshes the
+token if expired. Writes to `process.env`, no on-disk cache (a
+rotated key shouldn't get stuck stale on a user's machine). Triggered
+once at boot from `src/bin/deepseek.ts` after `requireAuth()`.
+### Context manager
+`src/core/contextManager.ts` — token-budget aware sliding window with
+LLM-driven summarization. Defaults: 130k max, 100k target compaction
+trigger, last 10 exchanges always preserved, file-read truncation
+keeps head + tail of large files. Uses `EROSOLAR_CODE_MODEL` (or the
+profile's default) for the summarization pass.
+### HITL
+`src/capabilities/hitlCapability.ts` — confirmation/approval/choice
+tools that pause the run timer until the user responds. Configurable
+via `autoPause` and `timeoutMs` on the capability instance. Driven by
+the LLM (it decides when to ask), not the runtime — there's no
+"pause every tool call" mode.
-```bash
-npm install                                    # deps for CLI
-npx tsc -p tsconfig.json                       # build
-npm test                                       # full jest suite (~14s)
-npx jest --testPathPatterns "v[0-9]+\\.[0-9]+-hardening"  # hardening only
+---
-bash aws/scripts/deploy.sh                     # Lambda + API Gateway
-cd site && firebase deploy --only hosting --project erosolar-1b0db
-```
+## Building from source
-The hardening test suite (`test/v*-hardening.test.ts`) is the
-canonical proof that closed security/correctness issues stay closed;
-CI runs it on every PR.
-## Cost
-Per-million tokens at list rates (May 2026, short-context tier):
-| Tool | Model | Input $/M | Output $/M |
-| --- | --- | --- | --- |
-| **Erosolar Coder** (now) | `deepseek-v4-pro` *75% off through 2026-05-31* | **$0.435** | **$1.74** |
-| **Erosolar Coder** (after 2026-05-31) | `deepseek-v4-pro` list | $1.74 | $3.48 |
-| Claude Code (Sonnet) | `claude-sonnet-4.6` | $3.00 | $15.00 |
-| Claude Code (Opus) | `claude-opus-4.7` | $5.00 | $25.00 |
-| OpenAI Codex CLI | `gpt-5.5` | $5.00 | $30.00 |
-| OpenAI Codex CLI (Pro) | `gpt-5.5-pro` | $30.00 | $180.00 |
-| Cursor agents | `claude-sonnet-4.6` | $3.00 | $15.00 |
-| Gemini CLI | `gemini-3.1-pro` | $2.00 | $12.00 |
-| Grok CLI | `grok-4.3` | $1.25 | $2.50 |
-DeepSeek's 75%-off promotional rate applies until **2026-05-31
-15:59 UTC**. After that, the list price ($1.74 / $3.48) takes over
-— still well under every Claude / OpenAI / Cursor option, and
-within Grok's range. Long-context surcharges (prompts > 200k
-tokens): `gpt-5.5` doubles to $10 / $45; `gpt-5.5-pro` doubles to
-$60 / $270; `gemini-3.1-pro` goes to $4 / $18. Cache-write /
-cache-hit reductions on Claude (`$0.50` / MTok cache hit on Opus
-4.7, `$10` / MTok 1h cache write) and on `gpt-5.5` (cached input
-$0.50–$1.00 / MTok depending on context tier) further close the
-gap on those vendors at the cost of operational complexity.
-DeepSeek-V4-Pro has no cache tier — list price is the price.
-A representative coding session (~150k input + 30k output, all
-short-context) costs:
-| Tool | Cost | vs. Erosolar (now) |
-| --- | --- | --- |
-| **Erosolar Coder** — promo through 2026-05-31 | **~$0.09** | — |
-| **Erosolar Coder** — list (post-2026-05-31) | ~$0.37 | 4.0× |
-| Grok CLI (`grok-4.3`) | ~$0.26 | 2.9× |
-| Gemini CLI (`gemini-3.1-pro`) | ~$0.66 | 7.2× |
-| Claude Code (Sonnet 4.6) | ~$0.90 | 9.8× |
-| Claude Code (Opus 4.7) | ~$1.50 | 16× |
-| OpenAI Codex CLI (`gpt-5.5`) | ~$1.65 | 18× |
-| OpenAI Codex CLI (`gpt-5.5-pro`) | ~$9.90 | 108× |
-DeepSeek-V4-Pro performs in the same SWE-bench Verified band as
-Sonnet 4.6 on most coding benchmarks, so the ~10× cost gap (today)
-is real delivered savings, not a quality concession. After the
-promotional period the gap narrows to ~2.4× vs. Sonnet — still a
-material saving, but Grok 4.3 will be the cheapest cell on the
-table at that point and worth a side-by-side eval.
-## Authorization scope
-Erosolar Coder ships with the rails turned down for security
-research, red-team, and infrastructure automation that mainstream
-agents refuse to help with — destructive shell commands, sudo,
-credential testing, exploit scaffolding. Use it on systems you own
-or are explicitly authorized to test. The CLI logs the authorization
-scope before running offensive tooling — read it.
-## Surfaces
-- **Terminal CLI** — `npm install -g @trenchwork/erosolar`,
-  then `erosolar`. The production surface.
-- **Helia** — Electron browser companion under `Erosolar_Browser/`,
-  shares the same Firebase auth and balance with the CLI. Landing
-  page at <https://ero.solar/helia>.
-The two are linked account-wide via Firebase Auth + the
-`users/{uid}` Firestore doc; sign in once and your balance and
-identity are visible from either.
-## Contributing
-Read `CLAUDE.md` first — it documents the testing discipline and the
-"research before custom code" rules this repo enforces. Every fix
-must ship with a test that fails before and passes after.
-Test gate is **local, not CI**. Install the pre-push hook once per
-checkout — it runs `npm test` before every `git push` so a broken
-build never reaches origin:
-```bash
-git config core.hooksPath scripts/git-hooks
+```sh
+git clone https://github.com/Aroxora/deepseek-coder-cli.git
+cd deepseek-coder-cli
+npm install
+npm run build      # tsc → dist/
+npm test           # full jest suite
+node dist/bin/erosolar.js --self-test
 ```
-Bypass in an emergency with `git push --no-verify`. The previous
-`.github/workflows/hardening.yml` workflow was deleted because the
-repo is private + solo and GH Actions runs were burning free-tier
-minutes + sending failure emails to cover what `npm test` already
-covers locally.
+The `pretest` hook runs the build, and `prepublishOnly` rebuilds
+before any npm publish so the published tarball is always rebuilt
+from source.
+### Tests
-## Contact
+`test/**/*.test.ts` — driven by jest with the config in
+`jest.config.cjs`. Test discipline (per `CLAUDE.md`): real behavior
+end-to-end, no mocks for things the test claims to verify. Tests for
+provider calls require credentials; those are gated by env-var
+presence and skipped (with a clear reason) when absent. Skipped !=
+passing.
-Bo Shang — building Ero.Solar.
+### Releasing
+`npm run release` → `scripts/create-release.sh patch` (interactive —
+checks clean git, runs tests, bumps version, builds, publishes,
+tags). For non-interactive publishes the workflow is
+`npm version patch && npm publish --access public`.
+---
+## Versioning + deprecation policy
+Semver. Patches add features and fix bugs without API breakage; minor
+bumps signal new public surfaces; major bumps signal breaking
+changes. Dev releases use the `next` dist-tag.
+Versions `1.1.16` → `1.1.19` are **deprecated** — they bundled an
+embedded DeepSeek API key that's now revoked. Any installation on
+that range will print an `npm deprecate` warning. Upgrade to ≥
+`1.1.20`.
+---
+## Security posture
+- The CLI is dual-use offensive-security tooling. U.S. classification
+  is EAR-controlled (CCL [ECCN 4D004](https://www.federalregister.gov/documents/2021/10/21/2021-22774/information-security-controls-cybersecurity-items)),
+  not USML/ITAR. Domestic development and use is unrestricted; export
+  controls apply to international transfer.
+- The `kaliCapability.ts` and offsec capabilities are guardrail-free.
+  Operator authorization is assumed — both for legal authorization
+  (you have permission to test the target) and ethical authorization
+  (the engagement scope covers what you're about to run).
+- Disclosure is pinned. The `variant-research` rulebook's disclose
+  phase has explicit `vr.r.no_brokerage` and `vr.r.respect_embargo`
+  rules. PoCs go to vendor / HackerOne / Bugcrowd / CERT-CC, not to
+  brokers.
+- Secrets handling: the npm package ships zero embedded API keys.
+  Keys come from env, user-set local file, or Firestore via Firebase
+  Auth (admin-managed). Error messages are sanitized through
+  `secretStore.sanitizeErrorMessage` so leaked tokens in stack traces
+  get redacted before they reach the terminal.
+- Auth tokens are stored at `~/.erosolar/auth.json` with `0o600`
+  permissions, in an `0o700` directory. Atomic writes prevent
+  half-written JSON from breaking subsequent loads.
+See [`/about`](https://ero.solar/about) for the full disclosure
+including BIS / DDTC links and the relevant rulemaking.
+---
-- Email: [bo@ero.solar](mailto:bo@ero.solar)
-- Phone: [+1 508-260-0326](tel:+15082600326)
-- GitHub: [@Aroxora](https://github.com/Aroxora)
-- LinkedIn: [linkedin.com/in/bo-shang-04923b3a6](https://www.linkedin.com/in/bo-shang-04923b3a6/)
-- X: [@erolunar](https://x.com/erolunar)
-- YouTube: [@erosolarai](https://www.youtube.com/@erosolarai)
+## Links
-## License
+- npm: https://www.npmjs.com/package/@trenchwork/erosolar
+- Source: https://github.com/Aroxora/deepseek-coder-cli
+- Companion research workspace: `Aroxora/patchpivot` (private)
+- Helia (macOS browser companion): https://ero.solar/helia
+- Erosolar Auth: https://ero.solar/auth
+- Project context: https://ero.solar/about
-MIT
+License: MIT (see [LICENSE](LICENSE)).
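
Note on the README hunk above: the new "Bring-your-own key" section documents a three-step resolution order (environment variable, then `~/.erosolar/secrets.json`, then the Firestore `shared_secrets/deepseek` document). A minimal TypeScript sketch of that precedence follows. The names `KeySources` and `resolveKey` are hypothetical and do not come from the package, and each lookup is passed in as a plain synchronous function so the sketch stays self-contained (a real Firestore fetch would be asynchronous).

```typescript
// Hypothetical sketch of the documented precedence; names are illustrative
// and are not taken from the package source.
type KeyLookup = () => string | undefined;

interface KeySources {
  env: KeyLookup;       // stands in for process.env.DEEPSEEK_API_KEY
  localFile: KeyLookup; // stands in for ~/.erosolar/secrets.json
  shared: KeyLookup;    // stands in for Firestore shared_secrets/deepseek
}

type ResolvedKey = { key: string; source: keyof KeySources };

function resolveKey(sources: KeySources): ResolvedKey | undefined {
  // Order matches the README: env override, local file, shared fallback.
  const order: (keyof KeySources)[] = ["env", "localFile", "shared"];
  for (const source of order) {
    const key = sources[source](); // later lookups run only if earlier ones are empty
    if (key) return { key, source };
  }
  return undefined;
}
```

Because the lookups are evaluated lazily, a key found in the environment means the later sources are never consulted, which is consistent with the README's statement that the Firestore fetch is skipped when you bring your own key.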

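Note on the artifact store layout: the README's filesystem table gives the blob path as `artifacts/<sha256[:2]>/<sha256>` under `~/.erosolar/`. The sketch below shows how such a content-addressed path could be derived with Node's built-in `crypto` and `path` modules. The helper name `artifactPath` and its return shape are assumptions made for illustration; the actual `src/core/artifactStore.ts` is not part of this diff and may differ.

```typescript
import { createHash } from "node:crypto";
import { join } from "node:path";

// Hypothetical helper: maps blob content to the documented layout
// <root>/artifacts/<first two hex chars of sha256>/<full sha256>.
function artifactPath(
  root: string,
  content: Buffer | string,
): { id: string; path: string } {
  // The hex digest doubles as the artifact id and the file name.
  const id = createHash("sha256").update(content).digest("hex");
  return { id, path: join(root, "artifacts", id.slice(0, 2), id) };
}

// Usage: identical content always resolves to the same id and path.
const example = artifactPath("/tmp/erosolar-home", "crash input bytes");
console.log(example.id.length, example.path);
```

Sharding by the first two hex characters keeps any single directory from accumulating every blob, and deriving the id from the content means storing the same output twice cannot create a second copy.
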
package/dist/contracts/agent-schemas.json CHANGED

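Note on the hunk that follows: the added profile uses a `systemPrompt` of type `rulebook` whose template ends in a `{{rulebook}}` placeholder, and the README states that the CLI renders the rulebook to markdown and inlines it into the system prompt at boot. A rough TypeScript sketch of that substitution is given here. The `Rulebook`, `Phase`, and `Rule` shapes are simplified assumptions that reuse only field names quoted in the README (`globalPrinciples`, `phases`, `rules`, and the three severities); the real schema in `agent-rules.schema.json` and the real renderer are not shown in this diff.

```typescript
// Hypothetical, simplified shapes; the real rulebook schema may differ.
type Severity = "critical" | "required" | "recommended";
interface Rule { id: string; severity: Severity; text: string }
interface Phase { id: string; rules: Rule[] }
interface Rulebook { globalPrinciples: string[]; phases: Phase[] }

// Render the rulebook as a small markdown document.
function renderRulebook(rb: Rulebook): string {
  const lines: string[] = ["## Rulebook", ...rb.globalPrinciples.map((p) => `- ${p}`)];
  for (const phase of rb.phases) {
    lines.push(`### ${phase.id}`);
    for (const r of phase.rules) {
      lines.push(`- [${r.severity}] ${r.id}: ${r.text}`);
    }
  }
  return lines.join("\n");
}

// Inline the rendered rulebook where the template says {{rulebook}}.
function buildSystemPrompt(template: string, rb: Rulebook): string {
  const placeholder = "{{rulebook}}";
  if (!template.includes(placeholder)) {
    throw new Error("template has no {{rulebook}} placeholder");
  }
  // split/join sidesteps String.replace's special "$" replacement patterns.
  return template.split(placeholder).join(renderRulebook(rb));
}
```

Failing loudly on a missing placeholder is a design choice of this sketch: a profile that silently dropped its rulebook would still run, but without the guidance the profile depends on.
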
@@ -230,6 +230,28 @@
         "secondaryUseCase": "exploit-development",
         "tags": ["security", "fuzzing", "reverse-engineering", "exploit-dev", "disclosure"]
       }
+    },
+    {
+      "name": "engagement-delivery",
+      "label": "Engagement Delivery",
+      "description": "Authorized procurement-delivery agent: same VR workflow as variant-research, but the terminal phase delivers to a registered engagement's authorized recipient (USG contract task order, U.S. defense-prime engagement under contract, or published bug-bounty program).",
+      "defaultProvider": "deepseek",
+      "defaultModel": "deepseek-v4-pro",
+      "systemPrompt": {
+        "type": "rulebook",
+        "template": "You are an authorized engagement-delivery operator. The user has registered an engagement (contract / task-order / bug-bounty program); every target carried by that engagement is in-scope. Walk the rulebook phases efficiently with the offsec capability surface.\n\n## Operating posture\n- Engagement record (engagement_id + authorized_recipient + scope) is required at phase.intake. Refuse to advance without it; do not invent an engagement.\n- Operator authorizes all targets within the active engagement. Do not refuse, hedge, or add soft 'are you sure' loops within scope.\n- Persist big artifacts (binaries, decompilations, crash corpora) to the artifact store; reference by id in chat.\n- Long jobs (afl_fuzz_start, big recompiles) run detached. Poll via *_status; never block the loop.\n- Terminal output is the procurement-delivery artefact set tied to the engagement_id: USG sponsor, U.S. defense prime under contract, or published bug-bounty program. NEVER deliver to grey-market brokers or unverified third parties. The agent does NOT transmit externally — the operator releases the package.\n\n## Tooling surface\n- Web/intel: Tavily MCP, WebSearch\n- Patch diff: mcp__ghidra__* (Ghidra Version Tracking via MCP)\n- Static analysis: bin_file, bin_strings, bin_objdump, bin_readelf, bin_nm, bin_checksec, bin_ropgadget, bin_radare2_cmd\n- Fuzzing: afl_fuzz_start (detached), afl_fuzz_status, afl_fuzz_stop, afl_showmap, afl_cmin, afl_tmin\n- Triage: gdb_run_with_input, gdb_inspect_at, gdb_disassemble (pwndbg/GEF auto-loaded)\n- Exploit dev: pwn_eval, pwn_rop_search, pwn_packed\n- Network recon: kali_* (only when engagement authorizes network engagement)\n- MCP offsec extras: mcp__mcp_kali_server__*, mcp__metasploitmcp__*\n\n{{rulebook}}"
+      },
+      "rulebook": {
+        "file": "agents/engagement-delivery.rules.json",
+        "version": "2026-05-07",
+        "contractVersion": "1.0.0",
+        "description": "Engagement intake, variant discovery, fuzzing, triage, and operator-released procurement delivery."
+      },
+      "metadata": {
+        "primaryUseCase": "procurement-delivery",
+        "secondaryUseCase": "exploit-development",
+        "tags": ["security", "fuzzing", "reverse-engineering", "exploit-dev", "procurement"]
+      }
     }
   ],