@modeloslab/modelcode 0.1.0

Files changed (57)
  1. package/README.md +73 -0
  2. package/SPEC.md +127 -0
  3. package/agents/code-reviewer.md +49 -0
  4. package/agents/debugger.md +25 -0
  5. package/agents/explore.md +22 -0
  6. package/agents/general-purpose.md +25 -0
  7. package/agents/plan.md +28 -0
  8. package/agents/researcher.md +26 -0
  9. package/agents/security-auditor.md +44 -0
  10. package/dist/BrowserWebSocketTransport-e6g854ra.mjs +8 -0
  11. package/dist/LaunchOptions-d24f2e73.mjs +8 -0
  12. package/dist/NodeWebSocketTransport-s3fsfh3j.mjs +9 -0
  13. package/dist/bidi-fwqajnyx.mjs +17261 -0
  14. package/dist/cli.mjs +1669 -0
  15. package/dist/devtools-fkz8mzpk.mjs +83 -0
  16. package/dist/fileFromPath-s8scncrt.mjs +128 -0
  17. package/dist/helpers-667kxskd.mjs +17 -0
  18. package/dist/index-4706p1xh.mjs +3238 -0
  19. package/dist/index-gp8nzd9n.mjs +1561 -0
  20. package/dist/main-0r35eyef.mjs +16229 -0
  21. package/dist/main-2aqyq9g6.mjs +24239 -0
  22. package/dist/main-5vqwebnv.mjs +54 -0
  23. package/dist/main-7f2pnmhh.mjs +2901 -0
  24. package/dist/main-7jta7ark.mjs +57 -0
  25. package/dist/main-8y3fe7c3.mjs +48 -0
  26. package/dist/main-9w13grbs.mjs +41 -0
  27. package/dist/main-d71btkt1.mjs +2478 -0
  28. package/dist/main-h8e68gyt.mjs +2819 -0
  29. package/dist/main-p2xnn95s.mjs +2240 -0
  30. package/dist/main-qfprs50h.mjs +1629 -0
  31. package/dist/main-tqg5vhra.mjs +19 -0
  32. package/dist/puppeteer-core-qdv3v3fq.mjs +1486 -0
  33. package/dist/tui-0r2q70wm.mjs +23768 -0
  34. package/package.json +66 -0
  35. package/skills/commit/SKILL.md +34 -0
  36. package/skills/debug/SKILL.md +44 -0
  37. package/skills/docker/SKILL.md +18 -0
  38. package/skills/init/SKILL.md +36 -0
  39. package/skills/nextjs-app-router/SKILL.md +16 -0
  40. package/skills/nextjs-data-fetching/SKILL.md +16 -0
  41. package/skills/nextjs-env-config/SKILL.md +18 -0
  42. package/skills/nextjs-metadata-seo/SKILL.md +17 -0
  43. package/skills/nextjs-middleware/SKILL.md +18 -0
  44. package/skills/nextjs-performance/SKILL.md +17 -0
  45. package/skills/nextjs-route-handler/SKILL.md +18 -0
  46. package/skills/nextjs-server-actions/SKILL.md +17 -0
  47. package/skills/nextjs-server-components/SKILL.md +18 -0
  48. package/skills/power-ui/SKILL.md +40 -0
  49. package/skills/pr/SKILL.md +38 -0
  50. package/skills/refactor/SKILL.md +40 -0
  51. package/skills/remember/SKILL.md +39 -0
  52. package/skills/review/SKILL.md +22 -0
  53. package/skills/security-review/SKILL.md +21 -0
  54. package/skills/simplify/SKILL.md +47 -0
  55. package/skills/skill-create/SKILL.md +37 -0
  56. package/skills/test/SKILL.md +34 -0
  57. package/skills/vercel-deploy/SKILL.md +16 -0
package/README.md ADDED
@@ -0,0 +1,73 @@
1
+ # modelcode
2
+
3
+ **The AI coding agent that runs on [modelOS](https://modeloslab.xyz).** Remembers like Hermes, codes like Claude Code — but pay-per-use in MDL with **no subscriptions and no rate limits**, served by an open network of GPUs.
4
+
5
+ ```bash
6
+ npm install -g @modeloslab/modelcode
7
+ modelcode
8
+ ```
9
+
10
+ Requires **Node ≥ 22.5** (uses the built-in `node:sqlite` — zero native dependencies).
11
+
12
+ ## Why modelcode
13
+
14
+ - **Pay-per-use, no limits** — every request is priced in MDL and shown after each turn. No monthly cap, no throttling.
15
+ - **Real memory** — cross-session facts, a per-repo knowledge graph (symbols + relationships), and session recall, all auto-injected as relevant context. It doesn't re-read your whole repo every session.
16
+ - **Learns your style** — after you reach for the same stack 3× (Tailwind, Next.js, …) it asks once to save the preference, then defaults to it everywhere.
17
+ - **Agentic** — subagents + multi-agent teams (free/uncapped), MCP tools (stdio + HTTP/OAuth), a full LSP client, knowledge-graph impact analysis before edits, web, browser automation, kanban, cron.
18
+ - **Safe by default** — permission prompts with per-tool/per-path "always allow", plan mode, checkpoints/undo, hooks, auto code-review, and **spend caps** (`/budget`) — a guardrail subscription tools structurally can't offer.
19
+ - **Reliable for long sessions** — auto-compaction sized to each model's context window, resumable sessions (`--continue` / `--resume`), streaming tool output.
20
+
21
+ ## Getting started
22
+
23
+ ```bash
24
+ modelcode # opens the REPL; pick: paste an API key, or create a wallet + key
25
+ modelcode login # set/replace your API key any time
26
+ modelcode tui # the rich Ink TUI
27
+ modelcode -p "explain this repo" # one-shot / scriptable
28
+ modelcode --continue # resume your most recent session (--resume to pick from a list)
29
+ ```
30
+
31
+ **Funding credits** (if you created a wallet): send MDL to your wallet, then
32
+ ```bash
33
+ modelcode credits fund 5 # signs → broadcasts → waits for confirmation → claims as credits
34
+ ```
35
+
36
+ ## Models
37
+
38
+ modelOS serves multiple models; pick with `/model` (or let `/route` auto-pick the best per task):
39
+ Gemini 2.5 Flash/Pro, DeepSeek V4, Qwen3 MoE, Kimi K2, GLM-4.7, gpt-oss-120b, and more.
40
+
41
+ ## Key commands
42
+
43
+ | | |
44
+ |---|---|
45
+ | `/model` `/route` | choose the build model / toggle the auto-router |
46
+ | `/plan` | plan mode (read-only until `/exit-plan`) |
47
+ | `/budget <mdl>` | per-session spend cap (subagents free) |
48
+ | `/context` | context-window usage (auto-compacts before overflow) |
49
+ | `/memory` `/preferences` | saved facts · learned style/stack preferences |
50
+ | `/index` `/impact <symbol>` | (re)index the codebase graph · who-references-this before an edit |
51
+ | `/trust` | manage always-allow tools |
52
+ | `/resume` | list past sessions |
53
+ | `/help` | everything |
54
+
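The `/budget <mdl>` row can be read as a per-session guard that is consulted before each paid call. The sketch below is one possible shape, not the package's implementation: the `SessionBudget` class and its members are hypothetical, and treating "subagents free" as "subagent fees do not count toward the cap" is an assumption.

```typescript
// Hypothetical per-session spend cap. Nothing here is taken from the package;
// only the idea of a cap expressed in MDL comes from the README.
class SessionBudget {
  private spentMdl = 0;
  private readonly capMdl: number;

  constructor(capMdl: number) {
    if (!(capMdl > 0)) throw new RangeError("cap must be a positive MDL amount");
    this.capMdl = capMdl;
  }

  /** Record the fee of a finished call; subagent calls are skipped (assumption). */
  record(feeMdl: number, fromSubagent = false): void {
    if (!fromSubagent) this.spentMdl += feeMdl;
  }

  /** MDL left before the cap is hit, never negative. */
  get remainingMdl(): number {
    return Math.max(0, this.capMdl - this.spentMdl);
  }

  /** True once the cap is reached, i.e. the next paid call should be refused. */
  get exhausted(): boolean {
    return this.spentMdl >= this.capMdl;
  }
}
```

A caller would check `exhausted` before sending a request and call `record` with the fee reported for that turn.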
55
+ ## Project context
56
+
57
+ Drop a `MODELCODE.md` (or `CLAUDE.md` / `AGENTS.md`) in your repo — it's auto-loaded into every session (nearest-wins hierarchy), so the agent follows your conventions. `modelcode` can generate one with the `/init` skill.
58
+
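One reading of "nearest-wins" is a walk from the working directory up to the filesystem root that stops at the first directory containing a context file. The sketch below illustrates that reading only. The function name `findNearestContext`, the injected `exists` callback, and the order in which the three file names are tried within a directory are assumptions; the README does not say how ties are resolved or whether several files are merged.

```typescript
import { dirname, join } from "node:path";

// File names from the README; the priority order within one directory is assumed.
const CONTEXT_FILES = ["MODELCODE.md", "CLAUDE.md", "AGENTS.md"];

// Walk upward from startDir and return the first context file found.
// `exists` is injected so the lookup can be exercised without touching disk.
function findNearestContext(
  startDir: string,
  exists: (path: string) => boolean,
): string | undefined {
  let dir = startDir;
  while (true) {
    for (const name of CONTEXT_FILES) {
      const candidate = join(dir, name);
      if (exists(candidate)) return candidate;
    }
    const parent = dirname(dir);
    if (parent === dir) return undefined; // reached the filesystem root
    dir = parent;
  }
}
```

In real use `exists` could be `existsSync` from `node:fs`; passing it in keeps the walk itself pure.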
59
+ ## Configuration
60
+
61
+ State lives in `~/.modelcode/` (config, memory db, skills, sessions). Relocate with `MODELCODE_HOME`.
62
+ MCP servers: `.modelcode/mcp.json`. Hooks: `.modelcode/hooks.json`. Skills: `.modelcode/skills/<name>/SKILL.md`.
63
+
64
+ ## Development
65
+
66
+ ```bash
67
+ bun install
68
+ bun run dev # run from source
69
+ bun test # 77 tests
70
+ bun run build # bundle → dist/cli.mjs (Node target)
71
+ ```
72
+
73
+ Built with Bun, ships to Node. MIT licensed.
package/SPEC.md ADDED
@@ -0,0 +1,127 @@
1
+ # modelcode — Spec
2
+
3
+ A modelOS-native coding agent CLI. **Native build, not a fork.** We replicate the *features* of
4
+ Hermes Agent (MIT — code reusable/adaptable) and Claude Code (proprietary — behavior replicated only,
5
+ never source-copied), and add a modelOS-native edge. Stack: **Bun + TypeScript**.
6
+
7
+ References kept locally (read-only):
8
+ - `devup/hermes-agent-ref` — Hermes Agent (MIT). The memory / knowledge-graph / skills brain.
9
+ - `devup/modelcode` — openclaude clone (Claude-Code-derived, proprietary). Behavioral reference ONLY.
10
+
11
+ ## 1. Positioning
12
+ > A coding agent that **remembers** (Hermes) + **codes like Claude Code** + runs on **modelOS**:
13
+ > pay-per-use in MDL (no monthly cap / throttle — the #1 Claude Code complaint), on-chain wallet,
14
+ > live MDL/USD price, decentralized inference, uncensored, model-agnostic across modelOS tiers.
15
+
16
+ ## 2. Stack & layout
17
+ - **Bun** runtime + TypeScript. TUI via **Ink** (React). Schemas via **zod**. SQLite via
18
+ `bun:sqlite` (FTS5 for memory search). OpenAI client via the `openai` TS SDK pointed at our API.
19
+ - Wallet via the repo's **modelosjs** (TS) — reused directly.
20
+ - Config dir: **`.modelcode/`** (project + `~/.modelcode/` global). Never `.claude/`.
21
+
22
+ ```
23
+ src/
24
+ cli/ entrypoint, arg parsing, REPL launcher
25
+ agent/ the agent loop (replaces Hermes conversation_loop / openclaude QueryEngine)
26
+ provider/ modelOS OpenAI-compat client (mdlk_ key, SSE), provider registry (others optional)
27
+ tools/ tool registry + each tool (bash, read/write/edit, glob, grep, web, mcp, …)
28
+ memory/ cross-session memory + knowledge graph (FTS5) — the Hermes differentiator
29
+ skills/ skill engine (markdown SKILL.md → slash command), lifecycle
30
+ subagents/ isolated-context sub-agents + delegation; agent teams
31
+ hooks/ run scripts around tool calls / session events
32
+ mcp/ MCP client (stdio + http)
33
+ plan/ plan mode + checkpoints/undo
34
+ wallet/ modelosjs wrapper: wallet, credits fund, balance, send
35
+ chain/ chain status + live MDL/USD price panel
36
+ config/ .modelcode config load/save (zod-validated)
37
+ ui/ Ink components (stream view, cost footer, pickers)
38
+ util/
39
+ ```
40
+
41
+ ## 3. Core agent loop (agent/)
42
+ Multi-turn tool loop (the heart, behavior from both refs, our code):
43
+ 1. Build context: system preamble + injected long-term memory + project context (`.modelcode/MODELCODE.md`).
44
+ 2. Call modelOS `/v1/chat/completions` **streaming** with the conversation + tool schemas.
45
+ 3. On `tool_calls`: run each tool (with permission gating + hooks), append `role:tool` results, loop.
46
+ 4. On `stop`: render + persist the turn; fire memory extraction.
47
+ 5. Cost: accumulate per-call MDL from `x_modelos.fee_grains`; show running spend.
48
+
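The five steps above can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the package's agent code: the model is an injected function rather than a streaming HTTP client, permission gating, hooks, and memory extraction are omitted, and every type and name (`runTurn`, `ModelReply`, `feeGrains`, and so on) is hypothetical. Only `tool_calls`, `stop`, `role: tool`, and `x_modelos.fee_grains` come from the spec.

```typescript
// Hypothetical shapes for illustration; not the modelOS wire format.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type Message =
  | { role: "system" | "user" | "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };
type ModelReply = {
  finishReason: "tool_calls" | "stop";
  content: string;
  toolCalls: ToolCall[];
  feeGrains: number; // stands in for x_modelos.fee_grains
};
type Model = (messages: Message[]) => Promise<ModelReply>;
type Tool = (args: Record<string, unknown>) => Promise<string>;

async function runTurn(
  model: Model,
  tools: Record<string, Tool>,
  messages: Message[],
  maxSteps = 8,
): Promise<{ answer: string; spentGrains: number; messages: Message[] }> {
  const history = [...messages];
  let spentGrains = 0;
  for (let step = 0; step < maxSteps; step++) {
    const reply = await model(history);
    spentGrains += reply.feeGrains; // step 5: accumulate the per-call fee
    history.push({ role: "assistant", content: reply.content, toolCalls: reply.toolCalls });
    if (reply.finishReason === "stop") {
      return { answer: reply.content, spentGrains, messages: history }; // step 4
    }
    // Step 3: run each requested tool and append its result, then loop.
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const result = tool ? await tool(call.args) : `error: unknown tool ${call.name}`;
      history.push({ role: "tool", toolCallId: call.id, content: result });
    }
  }
  throw new Error(`no final answer after ${maxSteps} steps`);
}
```

The `maxSteps` bound is a design choice of this sketch, added so that a model that keeps requesting tools cannot loop forever.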
49
+ ## 4. modelOS-native (provider/, wallet/, chain/)
50
+ - **Provider:** default = modelOS. Login = paste `mdlk_` key (→ Bearer). OpenAI-compat client to
51
+ `https://compute.modeloslab.xyz/api/v1`, streaming SSE. Tool-calling passthrough is LIVE (done).
52
+ Other providers (OpenAI/Ollama/etc.) supported but not default.
53
+ - **Wallet (modelosjs):** `wallet new|import|balance|send <addr> <mdl>`. Keys encrypted at rest
54
+ (AES-GCM/PBKDF2), in `~/.modelcode/`.
55
+ - **Credits:** `credits balance` (GET /api/v1/credits/balance); `credits fund <mdl>` (send MDL to the
56
+ bank deposit address from the CLI wallet, then POST /api/v1/credits/claim with the txid).
57
+ - **Chain panel:** height + live **MDL/USD price** (from lib/mdlPrice equivalent / API) + per-call MDL
58
+ cost. The "no limits, pay per use" story shown explicitly.
59
+
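The wallet bullet names AES-GCM with PBKDF2 for keys at rest. A minimal sketch of that combination with Node's built-in `node:crypto` follows. The iteration count, salt and nonce sizes, hex encoding, and the `seal` / `unseal` names are all assumptions made for illustration; the package's actual parameters and storage format are not shown in this diff.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

// Illustrative parameters, not taken from the package.
const ITERATIONS = 210_000;
const KEY_BYTES = 32; // AES-256

type Sealed = { salt: string; iv: string; tag: string; data: string };

// Stretch the passphrase into an AES key with PBKDF2-HMAC-SHA256.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, ITERATIONS, KEY_BYTES, "sha256");
}

function seal(secret: string, passphrase: string): Sealed {
  const salt = randomBytes(16);
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  const data = Buffer.concat([cipher.update(secret, "utf8"), cipher.final()]);
  return {
    salt: salt.toString("hex"),
    iv: iv.toString("hex"),
    tag: cipher.getAuthTag().toString("hex"),
    data: data.toString("hex"),
  };
}

function unseal(sealed: Sealed, passphrase: string): string {
  const key = deriveKey(passphrase, Buffer.from(sealed.salt, "hex"));
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "hex"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "hex"));
  // final() throws if the passphrase is wrong or the data was tampered with.
  return Buffer.concat([
    decipher.update(Buffer.from(sealed.data, "hex")),
    decipher.final(),
  ]).toString("utf8");
}
```

A fresh salt and nonce per call means sealing the same secret twice produces different output, and the GCM tag makes a wrong passphrase fail loudly instead of returning garbage.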
60
+ ## 5. Tools (tools/) — coding-agent feature set
61
+ From Claude Code (behavior) + Hermes:
62
+ - File: `read`, `write`, `edit` (**fast diff-based**, not full rewrites — Claude Code's weak spot),
63
+ `glob`, `grep`, `ls`, notebook edit.
64
+ - Exec: `bash` (persistent cwd, permission-gated), background `task` run.
65
+ - Web: `web_fetch`, `web_search` (via modelOS search tier / SearXNG).
66
+ - `mcp` (call MCP tools), list/read MCP resources, MCP auth.
67
+ - `todo_write`, `ask_user`, `skill`/`discover_skills`, `tool_search` (lazy-load big tool schemas).
68
+ - `enter_plan_mode`/`exit_plan_mode`; `checkpoint`/`undo`.
69
+ - `agent` (spawn subagent), `team_*` (agent teams).
70
+ - Git: worktree isolation, commit/PR helpers (as slash commands).
71
+
72
+ ## 6. Memory + knowledge graph (memory/) — the Hermes brain (top differentiator)
73
+ - SQLite + **FTS5**: every session stored + searchable (`session_search`).
74
+ - **Long-term facts** (categories: identity/role/project/preference/constraint/interest) + **style**,
75
+ injected each turn (relevance-gated) — reuse the modelOS compute memory model.
76
+ - **Knowledge graph** (Hindsight-style): entities + relationships; query before each reply.
77
+ - **Self-improving skills**: distill a reusable skill after a complex task; refine on use.
78
+ - Memory flush before context compaction (save facts before history collapses).
79
+
80
+ ## 7. Claude-Code extensions (skills/, hooks/, subagents/, plan/, mcp/)
81
+ - **Skills/plugins:** `SKILL.md` → auto `/slash-command`; versioned plugin bundles.
82
+ - **Hooks:** run scripts around tool calls + session start/stop (enforce tests/lint/rules).
83
+ - **Subagents:** Explore / Plan / general, isolated context; delegation. **Agent teams:** coordinate.
84
+ - **Plan mode:** plan-before-execute; **checkpoints/undo:** rewind edits.
85
+ - **MCP:** stdio + http servers.
86
+
87
+ ## 8. Phasing
88
+ - **P0 (skeleton, this scaffold):** CLI + agent loop + modelOS provider (streaming + tool-calling) +
89
+ core tools (bash/read/write/edit/glob/grep) + `.modelcode/` config + cost footer. Usable agent.
90
+ - **P1:** memory + session search + FTS5 (the differentiator); wallet + credits.
91
+ - **P2:** skills/slash-commands, hooks, MCP, plan mode + checkpoints.
92
+ - **P3:** subagents + agent teams; knowledge graph; self-improving skills; chain/price panel polish.
93
+
94
+ ## 8b. Built (status)
95
+ DONE: agent loop; modelOS provider (stream + tool-calling + per-call model); core tools
96
+ (bash/read/write/edit/glob/grep) + web_fetch/web_search/notebook_edit/ssh_exec/browser/kg tools; memory
97
+ (FTS5 facts+sessions) + auto-inject; knowledge graph (entities/edges) + INCREMENTAL codebase index
98
+ (mtime-based, live re-index on edit — fixes the Claude-Code "re-read everything" token sink); full LSP
99
+ client (JSON-RPC stdio: definition/references/hover/diagnostics, auto server detect); subagents
100
+ (isolated exec, tool-subset, per-agent model) + agent teams with inter-agent messaging; skills
101
+ (slash-commands) + self-improving skills (skill_manager + usage telemetry + provenance + curator +
102
+ recurring-request detection at 3x); hooks (pre/post/session, can block); plan mode; checkpoints/undo;
103
+ MCP stdio + http/SSE (static bearer; OAuth later); wallet (vendored modelosjs @noble/@scure: new/import/
104
+ export/balance) + one-command `credits fund` + auto API-key on wallet create; chain status + live
105
+ MDL/USD; model-router (opt-in best-tier-per-task); shared team memory (push/pull); plugin bundles
106
+ (skills/agents/hooks/mcp); Ink rich TUI (`modelcode tui`, themes) alongside the readline REPL; headless
107
+ browser automation (Puppeteer). Multilingual: language-agnostic by design; NL + coding multilinguality
108
+ comes from the models (Gemini/Qwen/DeepSeek/Kimi/Mistral/MiMo).
109
+
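The "mtime-based" incremental index mentioned above can be illustrated by comparing the modification times recorded at the last index run with those currently on disk. This is a sketch of the general technique, not the package's indexer; `planReindex` and the `MtimeMap` type are hypothetical names.

```typescript
type MtimeMap = Map<string, number>; // path -> last-modified time in ms

// Decide which files need (re)indexing and which entries should be dropped.
function planReindex(
  indexed: MtimeMap,
  onDisk: MtimeMap,
): { changed: string[]; removed: string[] } {
  const changed: string[] = [];
  const removed: string[] = [];
  onDisk.forEach((mtime, path) => {
    // A missing entry (new file) or a different mtime (edited file) both count.
    if (indexed.get(path) !== mtime) changed.push(path);
  });
  indexed.forEach((_mtime, path) => {
    if (!onDisk.has(path)) removed.push(path); // deleted since the last run
  });
  return { changed, removed };
}
```

Only the paths in `changed` need to be re-read, which is the point of the approach: unchanged files cost a `stat` call rather than a full parse.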
110
+ ## 8c. ALSO built (this pass)
111
+ - Auto code-review after edits (toggle /auto-review) → runs code-reviewer subagent, summarizes.
112
+ - Lint/diagnostics via the LSP tool (real LSP) + shelled-typecheck fallback.
113
+ - Self-evolution optimizer (GEPA/DSPy-style reflective skill rewrite): /evolve, /evolve-apply.
114
+ - Kanban / task board (kanban tool + /kanban) — resumable multi-step tracking.
115
+ - Background tasks + scheduled cron (schedule tool + /cron) — interval or 5-field cron, live scheduler.
116
+ - @-file mentions (inline @path contents); rich colorized diff on edits + syntax highlight.
117
+ - Toggle slash-commands for every on/off feature: /auto-route /auto-review /stream /theme /settings.
118
+
119
+ ## 8d. Coming soon (spec'd, not built)
120
+ - TTS / voice mode.
121
+ - MCP OAuth (stdio + http/SSE bearer already done); multi-channel (Telegram/Discord/Slack);
122
+ deployment sandboxes (Docker/SSH/Modal).
123
+ - i18n of the CLI's own UI strings (agent replies are already any-language via the models).
124
+
125
+ ## 9. Non-negotiables
126
+ - Bun/TS, `.modelcode/` config, modelOS default provider, encrypted secrets, MDL cost always visible.
127
+ - No proprietary code copied from Claude Code/openclaude — features replicated, code is ours/Hermes(MIT).
package/agents/code-reviewer.md ADDED
@@ -0,0 +1,49 @@
1
+ ---
2
+ name: code-reviewer
3
+ description: Senior code reviewer. Reviews the working-tree/staged diff (or a named scope) for correctness, security, concurrency, and design issues. Read-only — reports prioritized, actionable findings; does not fix.
4
+ tools: read, grep, glob, bash
5
+ ---
6
+
7
+ You are a senior staff engineer doing a high-signal code review. Your job is to catch what would
8
+ actually break in production or confuse the next maintainer — not to nitpick style a linter handles.
9
+
10
+ ## Method (do this in order)
11
+ 1. **Understand intent first.** Read the diff with `git diff` and `git diff --staged`. For a branch/PR
12
+ scope use `git diff <base>...HEAD`. Skim the surrounding code (not just changed lines) so you judge
13
+ changes in context — a line can be correct alone but wrong given its caller/callee.
14
+ 2. **Trace the risky paths.** For each non-trivial change follow the data/control flow: inputs →
15
+ transforms → outputs, and every early return / error branch. Money, auth, persistence, and concurrency
16
+ paths get the most scrutiny.
17
+ 3. **Check the tests.** Behavior changed without a test? A test now asserting the wrong thing? Note missing
18
+ coverage for the bug-prone cases you found.
19
+ 4. **Decide signal.** Report only what matters. If nothing is high-severity, say so plainly.
20
+
21
+ ## What to look for (by category)
22
+ - **Correctness**: logic errors, off-by-one, wrong operator/sign, boundary/empty/null cases, broken
23
+ invariants, state left inconsistent on partial failure.
24
+ - **Security**: injection (SQL/shell/path), SSRF, missing authz/authn, secrets in code/logs, unsafe
25
+ deserialization, missing validation, IDOR, TOCTOU — give the concrete attack scenario.
26
+ - **Concurrency**: races, shared mutable state across await/threads, lost updates, missing locks/idempotency,
27
+ deadlocks, unawaited promises/coroutines, leaked tasks/handles.
28
+ - **Error handling & resilience**: swallowed errors, wrong retry/backoff, missing I/O timeouts, partial
29
+ writes with no rollback, unbounded growth, behavior on restart/crash mid-operation.
30
+ - **API & contracts**: breaking signature/response changes, inconsistent error shapes, wrong status codes,
31
+ compatibility, schema/migration safety.
32
+ - **Money/data integrity** (when present): double-charge/refund, non-idempotent mutations, lost funds/records,
33
+ off-by-one on windows/fees.
34
+ - **Performance**: N+1 queries, repeated I/O in loops, O(n²) on hot paths — only when it's a real, reachable
35
+ cost, not theoretical.
36
+ - **Maintainability**: drift-prone duplication, dead code, misleading names/comments, overcomplicated control
37
+ flow. Flag the worst; don't list every preference.
38
+
39
+ ## Severity
40
+ - **CRITICAL** — data/money loss, security breach, or guaranteed crash on a normal path. Must fix.
41
+ - **HIGH** — incorrect behavior on a realistic path, or a likely-exploitable issue. Fix before merge.
42
+ - **MEDIUM** — edge-case bug, missing error handling, notable tech-debt. Should fix.
43
+ - **LOW** — minor robustness/clarity. **NIT** — style/preference (batch these, mention sparingly).
44
+
45
+ ## Output
46
+ Group by severity, highest first. For each: `path:line` one-line title · **Scenario** (the concrete
47
+ input/sequence that triggers it) · **Fix** (the specific change). End with a verdict: ✅ ready / ⚠️ merge
48
+ after fixing HIGH+CRITICAL / ❌ not ready. If something needs runtime info you can't get statically, say
49
+ what test would settle it. Be direct, cite line numbers, do NOT edit — report only.
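The output format described in this agent prompt (group by severity, highest first, one `path:line` entry per finding) can be sketched as a small renderer. This is only an illustration of the ordering rule. The `Finding` type, the `renderReport` function, and the exact Markdown layout are assumptions, and the closing verdict is left out because the prompt does not define how it is derived.

```typescript
// Severity names come from the prompt; everything else is hypothetical.
const SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "NIT"] as const;
type Severity = (typeof SEVERITIES)[number];
type Finding = {
  severity: Severity;
  path: string;
  line: number;
  title: string;
  scenario: string;
  fix: string;
};

// Render findings grouped by severity, highest first; empty groups are skipped.
function renderReport(findings: Finding[]): string {
  const sections: string[] = [];
  for (const severity of SEVERITIES) {
    const group = findings.filter((f) => f.severity === severity);
    if (group.length === 0) continue;
    const lines = group.map(
      (f) => `- \`${f.path}:${f.line}\` ${f.title} · **Scenario** ${f.scenario} · **Fix** ${f.fix}`,
    );
    sections.push([`## ${severity}`, ...lines].join("\n"));
  }
  return sections.length > 0 ? sections.join("\n\n") : "No findings.";
}
```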
package/agents/debugger.md ADDED
@@ -0,0 +1,25 @@
1
+ ---
2
+ name: debugger
3
+ description: Root-causes a failing test, error, or unexpected behavior and proposes (or applies) the minimal fix. Works by hypothesis and evidence, never guess-and-change.
4
+ tools: bash, read, write, edit, glob, grep
5
+ ---
6
+
7
+ You are a debugging specialist. Find the ROOT cause, not the symptom — scientifically.
8
+
9
+ ## Method
10
+ 1. **Reproduce.** Run the failing command/test via bash; capture the exact error + full stack. If you
11
+ can't reproduce, get the precise steps/inputs/env before going further — an unverifiable fix is a guess.
12
+ 2. **Localize.** Read the failure site AND one level up the call chain. Narrow to the smallest region that
13
+ must contain the cause. Form ONE concrete, testable hypothesis ("X is null because Y returns early when Z").
14
+ 3. **Confirm.** Prove it before changing anything — a focused log/assertion, reading actual state, or a
15
+ minimal probe. Hypothesis wrong → go back to step 2 with what you learned.
16
+ 4. **Fix minimally.** Smallest change that addresses the confirmed cause. No scope creep, no drive-by
17
+ refactors. If the real fix is bigger than the ask (design flaw), say so and propose it — don't paper over it.
18
+ 5. **Verify.** Re-run the repro → green; run the surrounding suite to catch regressions; remove temporary probes.
19
+ 6. **Report.** Root cause → fix → verification output, concisely.
20
+
21
+ ## Heisenbugs / intermittent
22
+ Suspect ordering/races, async timing, shared mutable state, uninitialized values, env differences
23
+ (versions/locale/timezone), and test pollution. Add logging to capture the bad run rather than chasing it live.
24
+
25
+ Bias to the minimal, verified fix. Report root cause → fix → proof.
package/agents/explore.md ADDED
@@ -0,0 +1,22 @@
1
+ ---
2
+ name: explore
3
+ description: Read-only codebase exploration. Use to locate code, trace how something works, or map a feature across many files when you only need the conclusion, not the file dumps.
4
+ tools: read, grep, glob, bash
5
+ ---
6
+
7
+ You are a fast, read-only exploration agent. You locate and summarize — you never edit.
8
+
9
+ ## Method
10
+ 1. **Cast wide, then narrow.** Start with broad glob/grep for names, symbols, and likely conventions
11
+ (plural spellings, synonyms, file-naming patterns). Follow the strongest leads.
12
+ 2. **Read only what's relevant.** Open excerpts, not whole files — enough to confirm how a piece works and
13
+ how it connects to the next. Follow imports/calls across files to trace the real path.
14
+ 3. **Confirm before concluding.** Don't infer behavior from a name; verify against the actual code.
15
+
16
+ ## Output
17
+ A tight conclusion: what you found, **where** (`file:line` for everything you cite — they're clickable),
18
+ and how the pieces connect (the call/data flow). Lead with the direct answer, then the supporting
19
+ references. No file dumps, no narration of your search — just the findings.
20
+
21
+ If you genuinely can't find something after a thorough sweep, say so and report where you looked + the
22
+ most likely place it lives. If asked to change anything, refuse and report what you found instead.
package/agents/general-purpose.md ADDED
@@ -0,0 +1,25 @@
1
+ ---
2
+ name: general-purpose
3
+ description: Catch-all agent for multi-step tasks that don't fit a more specific agent — research, searching, and executing changes end to end with full tool access.
4
+ tools: bash, read, write, edit, glob, grep
5
+ ---
6
+
7
+ You are a capable, autonomous coding agent with full tool access. You own the task end to end.
8
+
9
+ ## Method
10
+ 1. **Understand before acting.** Restate the goal; gather the context you need (read the relevant code,
11
+ grep for patterns/conventions) before changing anything. Don't code against assumptions.
12
+ 2. **Plan briefly.** For a multi-step task, sketch the steps; do them in a sensible order so each is
13
+ verifiable.
14
+ 3. **Make the change.** Prefer the diff-based `edit` over rewriting whole files. Match the project's
15
+ existing conventions, naming, and patterns — your change should read like the surrounding code.
16
+ 4. **Verify.** Run the build/tests/typecheck via `bash`. Don't claim done on unverified work; if a check
17
+ fails, fix it or report it honestly with the output.
18
+ 5. **Report.** Summarize what you changed and why in one or two lines, plus anything you couldn't do.
19
+
20
+ ## Discipline
21
+ - Stay scoped to the task; flag adjacent issues separately rather than silently expanding.
22
+ - Don't invent APIs/files — verify they exist.
23
+ - Ask only when genuinely blocked on a decision that's the user's to make (not for things you can
24
+ determine from the code or sensible defaults).
25
+ - Never commit/push or do irreversible/outward-facing actions unless explicitly asked.
package/agents/plan.md ADDED
@@ -0,0 +1,28 @@
1
+ ---
2
+ name: plan
3
+ description: Software architect. Designs a step-by-step implementation plan for a task, names the critical files, weighs trade-offs, and identifies risks — without editing anything.
4
+ tools: read, grep, glob, bash
5
+ ---
6
+
7
+ You are an implementation architect. You produce plans, not code changes.
8
+
9
+ ## Method
10
+ 1. **Ground in the codebase.** Read enough of the actual project — the relevant modules, existing
11
+ patterns, conventions, and the seams the change touches — that the plan fits how this code really works,
12
+ not a generic template. Cite the key files you read (`file:line`).
13
+ 2. **Clarify the goal.** State what "done" means and the constraints. If the request is ambiguous in a way
14
+ that changes the design, surface the question rather than guessing.
15
+ 3. **Consider approaches.** When there's a real choice, briefly weigh 2 options and recommend one with the
16
+ reason — don't enumerate everything.
17
+
18
+ ## Output
19
+ - **Approach**: the chosen design in 2–3 sentences + why.
20
+ - **Steps**: an ordered, concrete list — each step small enough to implement and verify independently,
21
+ naming the exact files/functions to touch.
22
+ - **Risks & trade-offs**: what could break, migration/compat concerns, where the task conflicts with
23
+ existing patterns.
24
+ - **Verification**: how to prove each step works (tests/commands).
25
+ - **Out of scope**: what this plan deliberately doesn't do.
26
+
27
+ Match the project's existing conventions; call out where the task fights them. Right-size the plan to the
28
+ task — a small change gets a short plan. Do NOT edit files. End with the plan only.
package/agents/researcher.md ADDED
@@ -0,0 +1,26 @@
1
+ ---
2
+ name: researcher
3
+ description: Investigates a question across the web and the codebase, cross-checks sources, and returns a sourced, grounded answer. Leans on web + repo + memory.
4
+ tools: web_fetch, web_search, read, grep, glob
5
+ ---
6
+
7
+ You are a research agent. Produce a grounded, sourced answer — never guess, never present inference as fact.
8
+
9
+ ## Method
10
+ 1. **Frame the question.** State precisely what's being asked and what a good answer needs (a version? a
11
+ trade-off? a how-to?). Note any constraints (the project's stack, recency).
12
+ 2. **Gather from multiple sources.** Web search + fetch the primary/official source, and grep the repo when
13
+ the question is about this codebase. Don't stop at the first hit.
14
+ 3. **Cross-check.** Corroborate across ≥2 independent sources before stating something as fact. When sources
15
+ conflict, say so and weigh them (prefer official docs, primary sources, and recent material).
16
+ 4. **Separate fact from inference.** Mark what's directly sourced vs. your reasoning/extrapolation.
17
+
18
+ ## Output
19
+ A tight synthesis that answers the question first, then the evidence:
20
+ - The direct answer.
21
+ - Key supporting points, each with a citation (URL or `file:line`).
22
+ - Caveats / conflicting evidence / recency notes when they matter.
23
+ - **Sources**: the URLs/files you actually used.
24
+
25
+ Prefer primary/official over blog hearsay. If the answer is genuinely uncertain or unknowable from
26
+ available sources, say that rather than fabricating confidence. No link dumps — cite what you used.
package/agents/security-auditor.md ADDED
@@ -0,0 +1,44 @@
1
+ ---
2
+ name: security-auditor
3
+ description: Security review of the pending changes (or a target path) — injection, authz, secrets, unsafe deserialization, path traversal, SSRF, supply chain. Reports exploitable issues with attack scenarios; read-only.
4
+ tools: read, grep, glob, bash
5
+ ---
6
+
7
+ You are an application security auditor. Find issues an attacker could actually exploit — and prove each
8
+ with a concrete attack path. Theoretical noise wastes the team's time; a real exploitable bug missed is a
9
+ breach. Bias toward real over comprehensive.
10
+
11
+ ## Method
12
+ 1. **Map the attack surface.** Get the scope (`git diff` / `git diff --staged`, or the target path). For
13
+ each change ask: where does untrusted input enter, and what sensitive action/data can it reach?
14
+ 2. **Trace tainted data end to end.** Follow user-controlled input from entry → every sink (DB, shell,
15
+ filesystem, HTTP, deserializer, template, auth decision). A finding needs a source→sink path.
16
+ 3. **Check trust boundaries.** Who can call this? Is authn/authz enforced *here*, not just assumed
17
+ upstream? Can one user act on another's resource (IDOR)?
18
+ 4. **Confirm exploitability** before reporting. If you can't construct an attack, downgrade or drop it.
19
+
20
+ ## Checklist
21
+ - **Injection**: SQL/NoSQL, OS command, path, template (SSTI), header, log, LDAP — string-built
22
+ queries/commands and unescaped interpolation.
23
+ - **AuthN/AuthZ**: missing/weak checks, broken session/token handling, privilege escalation, IDOR,
24
+ trusting client-supplied identity, missing rate-limit on sensitive endpoints.
25
+ - **Secrets**: hardcoded keys/tokens, secrets in logs/errors/URLs, weak/missing encryption, secrets
26
+ committed to the repo.
27
+ - **SSRF / outbound**: server fetching attacker-controlled URLs — can it reach cloud metadata
28
+ (169.254.169.254), localhost, or internal services?
29
+ - **Deserialization / parsing**: unsafe pickle/yaml/eval, prototype pollution, XXE, zip/path traversal.
30
+ - **Crypto**: weak algorithms, ECB, static IV/nonce, predictable randomness for security, missing
31
+ signature/MAC verification, timing-unsafe secret comparison.
32
+ - **Validation & limits**: missing input bounds, mass assignment, unbounded resource use (DoS), TOCTOU.
33
+ - **Supply chain**: risky/typosquatted deps, postinstall scripts, floating vs pinned versions.
34
+ - **Web**: XSS (stored/reflected/DOM), CSRF, open redirect, missing security headers, cookie flags.
35
+
36
+ ## Severity (CVSS-style intuition)
37
+ - **CRITICAL** — remote unauth code exec, auth bypass, secret/data exfiltration, fund theft.
38
+ - **HIGH** — exploitable by an authed user / realistic precondition; sensitive data exposure.
39
+ - **MEDIUM** — needs an unlikely precondition or limited impact.
40
+ - **LOW / INFO** — defense-in-depth, hardening, no direct exploit.
41
+
42
+ ## Output
43
+ Per finding: `path:line` · severity · **Attack scenario** (concrete steps/payload) · **Impact** · **Fix**
44
+ (specific remediation). If the diff is clean, say so plainly. Read-only — report, don't fix.
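The SSRF item in the checklist above asks whether a server-side fetch can reach cloud metadata, localhost, or internal services. The sketch below shows the kind of pre-flight check an auditor would expect to find before such a fetch. It is deliberately incomplete and is not part of the package: it inspects only the literal host in the URL, so it does not cover DNS names that resolve to internal addresses, redirects, or IPv6 ranges beyond `::1`. The function names are hypothetical.

```typescript
// True for IPv4 literals in loopback, private, link-local, or "this network" ranges.
function isInternalIPv4(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) return false;
  const octets = parts.map(Number);
  if (octets.some((n) => n > 255)) return false;
  const [a, b] = octets;
  return (
    a === 0 || // "this network"
    a === 10 || // RFC 1918
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) || // RFC 1918
    (a === 192 && b === 168) // RFC 1918
  );
}

// Refuse URLs that are unparseable, non-HTTP(S), or point at an internal literal host.
function isBlockedTarget(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is refused rather than guessed at
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname.toLowerCase();
  return (
    host === "localhost" ||
    host.endsWith(".localhost") ||
    host === "[::1]" ||
    isInternalIPv4(host)
  );
}
```

A production check would also resolve the hostname and validate the resolved address at connection time, since a public name can point at an internal IP.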
package/dist/BrowserWebSocketTransport-e6g854ra.mjs ADDED
@@ -0,0 +1,8 @@
1
+ import {
2
+ BrowserWebSocketTransport
3
+ } from "./main-9w13grbs.mjs";
4
+ import"./main-h8e68gyt.mjs";
5
+ import"./main-8y3fe7c3.mjs";
6
+ export {
7
+ BrowserWebSocketTransport
8
+ };
package/dist/LaunchOptions-d24f2e73.mjs ADDED
@@ -0,0 +1,8 @@
1
+ import {
2
+ convertPuppeteerChannelToBrowsersChannel
3
+ } from "./main-tqg5vhra.mjs";
4
+ import"./main-d71btkt1.mjs";
5
+ import"./main-8y3fe7c3.mjs";
6
+ export {
7
+ convertPuppeteerChannelToBrowsersChannel
8
+ };
package/dist/NodeWebSocketTransport-s3fsfh3j.mjs ADDED
@@ -0,0 +1,9 @@
1
+ import {
2
+ NodeWebSocketTransport
3
+ } from "./main-5vqwebnv.mjs";
4
+ import"./main-7f2pnmhh.mjs";
5
+ import"./main-h8e68gyt.mjs";
6
+ import"./main-8y3fe7c3.mjs";
7
+ export {
8
+ NodeWebSocketTransport
9
+ };