create-ironclaws 1.0.0

Files changed (80)
  1. package/README.md +101 -0
  2. package/bin/create.js +394 -0
  3. package/package.json +33 -0
  4. package/template/.env.example +38 -0
  5. package/template/CLAUDE.md +104 -0
  6. package/template/agent-credentials.yaml +33 -0
  7. package/template/agents.yaml +22 -0
  8. package/template/container/Dockerfile +70 -0
  9. package/template/container/Dockerfile.argus +34 -0
  10. package/template/container/agent-runner/package-lock.json +1524 -0
  11. package/template/container/agent-runner/package.json +23 -0
  12. package/template/container/agent-runner/src/index.ts +630 -0
  13. package/template/container/agent-runner/src/ipc-mcp-stdio.ts +339 -0
  14. package/template/container/agent-runner/tsconfig.json +15 -0
  15. package/template/container/build-argus.sh +25 -0
  16. package/template/container/build.sh +23 -0
  17. package/template/container/skills/agent-browser/SKILL.md +159 -0
  18. package/template/container/skills/agent-status/SKILL.md +69 -0
  19. package/template/container/skills/capabilities/SKILL.md +100 -0
  20. package/template/container/skills/edit-agent/SKILL.md +93 -0
  21. package/template/container/skills/slack-formatting/SKILL.md +92 -0
  22. package/template/container/skills/status/SKILL.md +104 -0
  23. package/template/container/tools/elastic_query.py +161 -0
  24. package/template/container/tools/gdrive_tool.py +185 -0
  25. package/template/container/tools/jira_tool.py +433 -0
  26. package/template/container/tools/slack_history_tool.py +144 -0
  27. package/template/container/tools/youtube_tool.py +174 -0
  28. package/template/docker-compose.yml +54 -0
  29. package/template/docs/how-it-works.md +496 -0
  30. package/template/eslint.config.js +32 -0
  31. package/template/groups/forge/CLAUDE.md +107 -0
  32. package/template/package-lock.json +5278 -0
  33. package/template/package.json +52 -0
  34. package/template/scripts/github-app-token.py +58 -0
  35. package/template/scripts/register-expense-agent.sh +121 -0
  36. package/template/scripts/run-migrations.ts +105 -0
  37. package/template/scripts/setup-onecli-secrets.sh +252 -0
  38. package/template/setup-agents.sh +142 -0
  39. package/template/src/channels/index.ts +13 -0
  40. package/template/src/channels/registry.test.ts +42 -0
  41. package/template/src/channels/registry.ts +28 -0
  42. package/template/src/channels/slack.test.ts +859 -0
  43. package/template/src/channels/slack.ts +373 -0
  44. package/template/src/claw-skill.test.ts +45 -0
  45. package/template/src/config.ts +94 -0
  46. package/template/src/container-runner.test.ts +221 -0
  47. package/template/src/container-runner.ts +1029 -0
  48. package/template/src/container-runtime.test.ts +149 -0
  49. package/template/src/container-runtime.ts +124 -0
  50. package/template/src/db-migration.test.ts +67 -0
  51. package/template/src/db.test.ts +484 -0
  52. package/template/src/db.ts +837 -0
  53. package/template/src/env.ts +42 -0
  54. package/template/src/formatting.test.ts +294 -0
  55. package/template/src/github-token.ts +48 -0
  56. package/template/src/google-token.ts +75 -0
  57. package/template/src/group-folder.test.ts +43 -0
  58. package/template/src/group-folder.ts +44 -0
  59. package/template/src/group-queue.test.ts +484 -0
  60. package/template/src/group-queue.ts +363 -0
  61. package/template/src/http-server.ts +343 -0
  62. package/template/src/index.ts +960 -0
  63. package/template/src/ipc-auth.test.ts +679 -0
  64. package/template/src/ipc.ts +548 -0
  65. package/template/src/logger.ts +16 -0
  66. package/template/src/mount-security.ts +421 -0
  67. package/template/src/network-policy.ts +119 -0
  68. package/template/src/remote-control.test.ts +397 -0
  69. package/template/src/remote-control.ts +224 -0
  70. package/template/src/router.ts +52 -0
  71. package/template/src/routing.test.ts +170 -0
  72. package/template/src/sender-allowlist.test.ts +216 -0
  73. package/template/src/sender-allowlist.ts +128 -0
  74. package/template/src/task-scheduler.test.ts +129 -0
  75. package/template/src/task-scheduler.ts +290 -0
  76. package/template/src/timezone.test.ts +73 -0
  77. package/template/src/timezone.ts +37 -0
  78. package/template/src/types.ts +114 -0
  79. package/template/src/worktree.ts +206 -0
  80. package/template/tsconfig.json +20 -0
@@ -0,0 +1,496 @@
1
+ # NanoClaw — How It Works
2
+
3
+ A technical guide to the system you've built: concepts, design decisions, and where to find them in the code.
4
+
5
+ ---
6
+
7
+ ## Table of Contents
8
+
9
+ 1. [Big Picture](#1-big-picture)
10
+ 2. [Message Flow — From Slack to Agent and Back](#2-message-flow)
11
+ 3. [Secrets Management](#3-secrets-management)
12
+ 4. [OneCLI Proxy](#4-onecli-proxy)
13
+ 5. [Network Enforcement — iptables](#5-network-enforcement)
14
+ 6. [Container Isolation](#6-container-isolation)
15
+ 7. [Tools and Skills](#7-tools-and-skills)
16
+ 8. [IPC — How Agents Talk Back](#8-ipc)
17
+ 9. [Task Scheduling](#9-task-scheduling)
18
+ 10. [Agent Registration](#10-agent-registration)
19
+ 11. [Mount Security](#11-mount-security)
20
+ 12. [The Group Queue](#12-the-group-queue)
21
+ 13. [Remaining Known Issues](#13-remaining-known-issues)
22
+
23
+ ---
24
+
25
+ ## 1. Big Picture
26
+
27
+ NanoClaw is a single Node.js host process that manages a fleet of AI agents. Each agent is a Docker container running Claude Code. The host handles Slack, routing, scheduling, and security. Agents do the actual work.
28
+
29
+ ```
30
+ Slack
31
+
32
+
33
+ NanoClaw host (src/index.ts)
34
+ │ reads .env, agents.yaml, mount-allowlist.json
35
+
36
+ ├── Message router ──► spawns Docker container per agent
37
+ ├── Task scheduler ──► spawns containers on schedule
38
+ └── IPC watcher ──► reads results from containers
39
+
40
+
41
+ Docker container (argus-claude:latest)
42
+ ├── Claude Code (the AI)
43
+ ├── agent-runner (Node.js wrapper — container/agent-runner/)
44
+ ├── Python tools (container/tools/)
45
+ └── Skills (SKILL.md files)
46
+ ```
47
+
48
+ Key design principle: **the host is the trust boundary**. The host holds all secrets, enforces all network rules, and validates all mount requests. Containers are untrusted workers — they can only reach the internet through the OneCLI proxy, and only with the credentials they've been explicitly given.
49
+
50
+ **Where to read:**
51
+ - `src/index.ts` — the main loop, message routing, session management
52
+ - `src/container-runner.ts` — how containers are spawned
53
+ - `container/Dockerfile.argus` — what's baked into the image
54
+ - `CLAUDE.md` — high-level architecture reference
55
+
56
+ ---
57
+
58
+ ## 2. Message Flow
59
+
60
+ ### From Slack to Agent
61
+
62
+ ```
63
+ 1. Slack sends event (Socket Mode WebSocket)
64
+
65
+ 2. NanoClaw receives it → stores in DB (messages table)
66
+
67
+ 3. Router checks: does this message contain a trigger word?
68
+ (@Argus, @Byte, etc.)
69
+
70
+ Yes ─┤
71
+ ├── Fetch all accumulated messages since last run for this agent
72
+ ├── Build prompt (messages + channel members + agent identity)
73
+ └── Enqueue in GroupQueue
74
+
75
+ 4. GroupQueue checks concurrency limits → spawns Docker container
76
+
77
+ 5. Container starts:
78
+ - Claude Code reads the prompt
79
+ - Runs tools (Bash, Python scripts, MCP tools)
80
+ - Produces output
81
+
82
+ 6. agent-runner captures output → writes to IPC file
83
+
84
+ 7. IPC watcher (host-side) reads the file → sends Slack message
85
+ ```
86
+
87
+ **No trigger?** The message is stored in the DB. Next time a trigger arrives, NanoClaw pulls ALL accumulated messages as context — so the agent sees the full conversation.
88
+
89
+ **Where to read:**
90
+ - `src/index.ts:280–440` — message routing logic
91
+ - `src/db.ts` — `getNewMessages()`, `getMessagesSince()`, `storeMessage()`
92
+ - `src/group-queue.ts` — the concurrency manager
93
+ - `container/agent-runner/src/index.ts` — what runs inside the container
94
+
95
+ ### From Agent Back to Slack
96
+
97
+ Agents have two ways to send messages:
98
+
99
+ 1. **Final response** — whatever Claude outputs at the end of the run gets posted to the channel
100
+ 2. **`send_message` MCP tool** — sends a message mid-run without waiting for completion (progress updates)
101
+
102
+ Both go through IPC files. The host's IPC watcher picks them up and posts to Slack.
103
+
104
+ **Where to read:**
105
+ - `src/ipc.ts` — the watcher
106
+ - `container/agent-runner/src/ipc-mcp-stdio.ts` — MCP tools agents can call
107
+
108
+ ---
109
+
110
+ ## 3. Secrets Management
111
+
112
+ This is the most important system to understand. The design goal: **no long-lived secrets ever enter a container**. There is one documented exception (Expensify, category D below).
113
+
114
+ ### The Four Credential Categories
115
+
116
+ #### A. Service API Keys (Elastic, Jira, Confluence, Slack, Intercom, Ardoq)
117
+ These never enter containers at all. The OneCLI proxy intercepts HTTPS requests and injects the `Authorization` header server-side. The container makes a request to `jira.atlassian.com`, OneCLI intercepts, adds `Authorization: Basic <token>`, forwards it. The container only ever sees the base URL.
118
+
119
+ ```
120
+ Container → HTTPS_PROXY → OneCLI → adds Authorization header → Jira
121
+ ```
122
+
123
+ #### B. GitHub Tokens
124
+ The GitHub App private key (`github-app.pem`) never enters containers. Before spawning, the host generates a short-lived (~1 hour) installation token using the App credentials. Only the token is injected as `GITHUB_TOKEN`.
125
+
126
+ ```
127
+ # Concept:
128
+ pem + app_id + installation_id → GitHub API → short-lived token
129
+ # Only the token goes into the container env
130
+ ```
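A minimal sketch of the first half of that exchange, assuming GitHub's documented App flow: a JWT with `iat`, `exp`, and `iss` claims is signed with the PEM (RS256) and posted to the installation's `access_tokens` endpoint. `buildAppJwtClaims` and `installationTokenUrl` are hypothetical names used for illustration, not functions from `src/github-token.ts`; signing and the HTTP call are left out.

```typescript
// Hypothetical helpers; names are illustrative, not taken from src/github-token.ts.
interface AppJwtClaims {
  iat: number; // issued-at, seconds since epoch
  exp: number; // expiry, seconds since epoch
  iss: string; // GitHub App ID
}

// GitHub accepts App JWTs that expire at most 10 minutes in the future.
// Backdating iat by 60 seconds tolerates small clock drift on the host.
function buildAppJwtClaims(appId: string, nowMs: number): AppJwtClaims {
  const nowSec = Math.floor(nowMs / 1000);
  return { iat: nowSec - 60, exp: nowSec + 9 * 60, iss: appId };
}

// Endpoint that exchanges a signed App JWT for a short-lived installation token.
function installationTokenUrl(installationId: string): string {
  return `https://api.github.com/app/installations/${installationId}/access_tokens`;
}

const claims = buildAppJwtClaims("12345", 1700000000000);
const tokenUrl = installationTokenUrl("678");
```

Only the token returned by that endpoint would be placed in the container's `GITHUB_TOKEN`; the PEM and the JWT stay on the host.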
131
+
132
+ **Where to read:** `src/github-token.ts`, `container-runner.ts:448–465`
133
+
134
+ #### C. Google Access Tokens
135
+ Same pattern as GitHub. The host holds the OAuth refresh token. Before spawning, it calls Google's token endpoint to get a short-lived access token, injects only that.
136
+
137
+ **Where to read:** `src/google-token.ts`, `container-runner.ts:470–478`
138
+
139
+ #### D. Expensify Credentials (the exception)
140
+ Expensify embeds credentials in the **JSON request body** (not in HTTP headers). OneCLI can only inject headers, so it can't handle Expensify. These two env vars (`EXPENSIFY_PARTNER_USER_ID`, `EXPENSIFY_PARTNER_USER_SECRET`) must be passed directly to the container. This is explicitly documented as a known trade-off, and only the `expense-policy-checker` agent gets them.
141
+
142
+ **Where to read:** `agent-credentials.yaml` (comment on expense-policy-checker block)
143
+
144
+ ### Per-Agent Selective Credentials
145
+
146
+ Each agent only gets the credentials it actually needs. This is configured in two places:
147
+
148
+ **`agent-credentials.yaml`** — controls which env vars from the host's `.env` are forwarded:
149
+ ```yaml
150
+ agents:
151
+   argus-alert:
151
+     config:
152
+       - GITHUB_TOKEN # gets a fresh token injected
153
+       - ELASTIC_BASE_URL # just a URL, not a secret
155
+ ```
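The forwarding rule can be sketched as a simple filter over the host environment. This is an illustration under assumptions, not the code in `src/container-runner.ts`: `pickAgentEnv` and the `AgentCredentialConfig` shape are hypothetical, and the values are placeholders.

```typescript
// Hypothetical shape of one agent's block in agent-credentials.yaml after parsing.
interface AgentCredentialConfig {
  config?: string[];
}

// Forward only the allowlisted keys that are actually set on the host.
function pickAgentEnv(
  hostEnv: Record<string, string | undefined>,
  agent: AgentCredentialConfig,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of agent.config ?? []) {
    const value = hostEnv[key];
    if (value !== undefined) out[key] = value;
  }
  return out;
}

// Placeholder values for illustration only.
const hostEnv: Record<string, string | undefined> = {
  GITHUB_TOKEN: "ghs_example",
  ELASTIC_BASE_URL: "https://elastic.example.com",
  JIRA_API_TOKEN: "stays-on-host",
};
const argusAlertEnv = pickAgentEnv(hostEnv, {
  config: ["GITHUB_TOKEN", "ELASTIC_BASE_URL"],
});
```

Anything not named in the agent's `config` list, such as `JIRA_API_TOKEN` here, is simply absent from the container's environment.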
156
+
157
+ **`agents.yaml`** — controls which OneCLI credential sets are linked (for service auth via proxy):
158
+ ```yaml
159
+ - folder: byte-it-internal
160
+   onecli_secrets: [slack, litellm] # can call Slack API + Claude API
161
+ ```
162
+
163
+ **Where to read:** `agent-credentials.yaml`, `agents.yaml`, `src/container-runner.ts:50–130` (the credential loading functions)
164
+
165
+ ---
166
+
167
+ ## 4. OneCLI Proxy
168
+
169
+ OneCLI is an HTTPS man-in-the-middle proxy that runs as a Docker container alongside NanoClaw. Every agent container has its outbound HTTPS traffic routed through it.
170
+
171
+ ### How It Works
172
+
173
+ When `applyContainerConfig()` is called before spawning a container, OneCLI's SDK injects these env vars:
174
+ ```
175
+ HTTPS_PROXY=http://x:<access-token>@host.docker.internal:10255
176
+ HTTP_PROXY=http://x:<access-token>@host.docker.internal:10255
177
+ https_proxy=... (lowercase versions for Python/other tools)
178
+ NODE_USE_ENV_PROXY=1
179
+ SSL_CERT_FILE=/tmp/onecli-combined-ca.pem (OneCLI's MITM CA cert)
180
+ ```
181
+
182
+ Every HTTPS call from the container (Python's `urllib.request`, `requests`, Node.js `fetch`, `curl`, `git`) goes through the proxy automatically because of these env vars.
183
+
184
+ When OneCLI intercepts a request:
185
+ 1. Looks up the access token → identifies which agent this is
186
+ 2. Checks if this agent has a credential linked for the target hostname
187
+ 3. If yes: injects the `Authorization` header for that service (e.g. `Authorization: Bearer <secret>`)
188
+ 4. If no credential: proxies the request unchanged (or blocks if configured to do so)
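The four steps above can be sketched as a pure decision function. This models the behavior described here, not OneCLI's actual implementation: the `AgentRecord` shape, the in-memory lookup, and the `decide` name are all assumptions.

```typescript
// Hypothetical in-memory model; OneCLI's real storage is PostgreSQL and its
// internal API is not shown in this guide.
interface AgentRecord {
  name: string;
  credentialsByHost: Record<string, string>; // hostname -> Authorization header value
}

type ProxyDecision =
  | { action: "reject"; reason: string }
  | { action: "forward"; agent: string; extraHeaders: Record<string, string> };

function decide(
  agentsByToken: Record<string, AgentRecord>,
  accessToken: string,
  targetHost: string,
): ProxyDecision {
  // Step 1: the access token from the proxy URL identifies the agent.
  const agent = agentsByToken[accessToken];
  if (agent === undefined) return { action: "reject", reason: "unknown access token" };

  // Steps 2-3: inject a header only when a credential is linked for this host.
  const extraHeaders: Record<string, string> = {};
  const headerValue = agent.credentialsByHost[targetHost];
  if (headerValue !== undefined) extraHeaders["Authorization"] = headerValue;

  // Step 4: with no credential the request is forwarded unchanged
  // (a stricter configuration could block here instead).
  return { action: "forward", agent: agent.name, extraHeaders };
}

const agents: Record<string, AgentRecord> = {
  "tok-argus": {
    name: "argus-alert",
    credentialsByHost: { "jira.atlassian.com": "Bearer s3cret" },
  },
};
const withCredential = decide(agents, "tok-argus", "jira.atlassian.com");
const withoutCredential = decide(agents, "tok-argus", "example.com");
const unknownAgent = decide(agents, "tok-missing", "jira.atlassian.com");
```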
189
+
190
+ The `access_token` in the proxy URL is the per-agent token stored in OneCLI's PostgreSQL database. It's regenerated each time `setup-agents.sh` is run.
191
+
192
+ ### Git Authentication Fix
193
+
194
+ Git has a bug with MITM proxies: it doesn't retry after a 401. To work around this, NanoClaw pre-injects the auth header for git before the first request:
195
+ ```
196
+ GIT_CONFIG_KEY_0=http.extraHeader
197
+ GIT_CONFIG_VALUE_0=Authorization: Basic <base64(x:token)>
198
+ ```
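One detail worth knowing when reproducing this by hand: Git only reads `GIT_CONFIG_KEY_<n>` / `GIT_CONFIG_VALUE_<n>` pairs when `GIT_CONFIG_COUNT` is also set. The sketch below builds the full set. `gitProxyAuthEnv` is a hypothetical helper; whether NanoClaw sets the count itself or leaves it to the OneCLI SDK is not shown in this guide.

```typescript
// Hypothetical helper that builds the environment-based git config shown above.
function gitProxyAuthEnv(accessToken: string): Record<string, string> {
  // base64("x:<token>"), the value used for HTTP Basic auth
  const basic = btoa(`x:${accessToken}`);
  return {
    GIT_CONFIG_COUNT: "1", // number of KEY_n/VALUE_n pairs git should read
    GIT_CONFIG_KEY_0: "http.extraHeader",
    GIT_CONFIG_VALUE_0: `Authorization: Basic ${basic}`,
  };
}

const gitEnv = gitProxyAuthEnv("token123");
```

Because the header is configured up front, git sends it on the first request instead of waiting for a 401 challenge.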
199
+
200
+ **Where to read:**
201
+ - `src/container-runner.ts:482–499` — where `applyContainerConfig()` is called
202
+ - `node_modules/@onecli-sh/sdk/lib/index.js` — what it actually injects
203
+ - `agents.yaml` — `onecli_id` and `onecli_secrets` per agent
204
+ - `setup-agents.sh:101–124` — how agents are registered in OneCLI's DB
205
+
206
+ ---
207
+
208
+ ## 5. Network Enforcement
209
+
210
+ ### The Problem
211
+ Without network enforcement, a compromised or misbehaving agent could call external APIs, exfiltrate data, or perform actions you haven't authorized.
212
+
213
+ ### The Solution: iptables
214
+
215
+ After each container spawns, NanoClaw adds rules to the `DOCKER-USER` iptables chain (Docker's hook for custom rules). The rules are:
216
+
217
+ ```
218
+ ACCEPT from <container_ip> to <onecli_ip> port 10255 # OneCLI proxy — allowed
219
+ ACCEPT from <container_ip> to any host, port 53 # DNS — allowed
220
+ REJECT from <container_ip> all traffic # everything else — immediate rejection
221
+ ```
222
+
223
+ The `REJECT` (not `DROP`) matters: with `DROP`, a blocked connection hangs for minutes waiting for a TCP timeout. With `REJECT --reject-with icmp-port-unreachable`, the connection fails immediately, so Claude doesn't wait.
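As a rough sketch, the three rules could be built as `iptables` argument lists like this. `containerRules` is a hypothetical helper, not the code in `src/network-policy.ts`, and it assumes DNS over UDP only and example IP addresses.

```typescript
// Hypothetical helper: returns one iptables argument list per rule,
// in the order the rules should be evaluated.
function containerRules(containerIp: string, proxyIp: string, tag: string): string[][] {
  // Tagging each rule with the container name makes cleanup by comment possible.
  const comment = ["-m", "comment", "--comment", tag];
  const specs: string[][] = [
    ["-s", containerIp, "-d", proxyIp, "-p", "tcp", "--dport", "10255", ...comment, "-j", "ACCEPT"],
    ["-s", containerIp, "-p", "udp", "--dport", "53", ...comment, "-j", "ACCEPT"],
    ["-s", containerIp, ...comment, "-j", "REJECT", "--reject-with", "icmp-port-unreachable"],
  ];
  // Inserting at positions 1, 2, 3 keeps both ACCEPT rules ahead of the REJECT.
  return specs.map((spec, i) => ["-I", "DOCKER-USER", String(i + 1), ...spec]);
}

const rules = containerRules("172.17.0.5", "172.17.0.2", "nanoclaw-argus-deps-123");
```

Order matters because iptables stops at the first matching rule: if the `REJECT` came first, the proxy and DNS exceptions would never be reached.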
224
+
225
+ ### Why All Traffic Goes Through OneCLI
226
+
227
+ Apart from DNS, the iptables rules only allow port 10255 (the OneCLI proxy). Since `HTTPS_PROXY` is set in the container, HTTPS calls from proxy-aware clients go through port 10255 automatically; anything that ignores the proxy settings is rejected. So:
228
+ - Direct `api.anthropic.com:443` connections → REJECTED (but these go through proxy anyway)
229
+ - Claude API calls → port 10255 → OneCLI → forwarded with credentials → OK
230
+ - Expensify calls → port 10255 → OneCLI → forwarded without modification (body unchanged) → OK
231
+ - Any unauthorized service → port 10255 → OneCLI → no credential → 401 or blocked
232
+
233
+ ### Cleanup
234
+
235
+ Rules are tagged with the container name (`--comment "nanoclaw-argus-deps-123"`). When the container exits, all its rules are deleted. On NanoClaw startup, stale rules from any previous run are cleaned up automatically.
236
+
237
+ **Where to read:**
238
+ - `src/network-policy.ts` — `applyContainerIptables()`, `cleanupContainerIptables()`
239
+ - `src/container-runtime.ts:102–121` — `cleanupStaleIptablesRules()` (startup cleanup)
240
+ - `src/container-runner.ts:591–602` — where iptables is called after spawn
241
+
242
+ ---
243
+
244
+ ## 6. Container Isolation
245
+
246
+ Each container gets its own isolated environment:
247
+
248
+ ### Session Directory
249
+ Each agent has a persistent home directory at `data/sessions/{agentFolder}/`. This is mounted as `/home/node` inside the container. Claude Code stores its conversation history (JSONL files), MEMORY.md, and settings here. Session files from one agent are completely invisible to another.
250
+
251
+ ### IPC Namespaces
252
+ Each group has its own IPC directory at `data/ipc/{agentFolder}/`. Agents can only write to their own namespace. The host enforces this at the IPC file processing level.
253
+
254
+ ### Skills and Tools
255
+ Each container only gets the skills and tools listed in `agent-credentials.yaml`. See [Section 7](#7-tools-and-skills).
256
+
257
+ ### Worktrees
258
+ When an agent needs to modify code (e.g. argus-deps raising a PR), it doesn't work on your live checkout. NanoClaw creates a git worktree (a separate working directory that shares the repository's object store) at `data/worktrees/{agentFolder}/repo/`. This is mounted read-write into the container while your main checkout stays untouched.
259
+
260
+ **Where to read:**
261
+ - `src/worktree.ts` — worktree creation/management
262
+ - `src/container-runner.ts:160–380` — the full mount list construction
263
+ - `data/sessions/` — look here to see active agent session files
264
+
265
+ ---
266
+
267
+ ## 7. Tools and Skills
268
+
269
+ ### The Difference
270
+ - **Skills** (`container/skills/`) — markdown files (`SKILL.md`) that Claude reads for instructions. They tell Claude how to do things (e.g. how to format Slack messages, how to browse the web).
271
+ - **Tools** (`container/tools/`) — Python scripts that Claude can execute with the `Bash` tool. They provide actual functionality (e.g. query Elasticsearch, post to Jira).
272
+
273
+ ### How the Allowlist Works
274
+
275
+ `agent-credentials.yaml` defines a two-layer allowlist:
276
+ ```yaml
277
+ common:
278
+   skills: [capabilities, status, slack-formatting] # every agent gets these
278
+   tools: [] # no tools by default
279
+
280
+ agents:
281
+   argus-alert:
282
+     skills: [] # adds nothing extra
283
+     tools: [elastic_query.py, jira_tool.py] # but gets these tools
284
+   global-claw:
285
+     skills: [agent-browser, agent-status, edit-agent] # dangerous skills
286
+     tools: [elastic_query.py, jira_tool.py, ...] # all tools
288
+ ```
289
+
290
+ **Dangerous skills** (`edit-agent`, `agent-browser`, `agent-status`) are NOT in the common layer. Only `global-claw` has them by explicit opt-in.
291
+
292
+ ### How It's Enforced
293
+
294
+ Before spawning a container, the host:
295
+ 1. Computes the allowed skill set (common + agent-specific)
296
+ 2. Copies only those skill directories to `data/sessions/{agent}/.claude/skills/`
297
+ 3. Computes the allowed tool set
298
+ 4. Copies only those tool files to `data/tools/{agent}/`
299
+ 5. Mounts `data/tools/{agent}` → `/workspace/extra/tools` (read-only)
300
+
301
+ Skills and tools NOT in the allowlist are never present in the container.
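Steps 1 and 3 amount to a set union of the two layers. A minimal sketch, assuming the YAML has already been parsed; `effectiveAllowlist` is a hypothetical name, whereas the real logic lives in `getAgentSkills()` and `getAgentTools()`.

```typescript
interface Allowlist {
  skills: string[];
  tools: string[];
}

// Union of the common layer and the agent's own layer, without duplicates.
function effectiveAllowlist(common: Allowlist, agent: Partial<Allowlist> = {}): Allowlist {
  const merge = (a: string[], b: string[] = []): string[] => Array.from(new Set([...a, ...b]));
  return {
    skills: merge(common.skills, agent.skills),
    tools: merge(common.tools, agent.tools),
  };
}

const commonLayer: Allowlist = {
  skills: ["capabilities", "status", "slack-formatting"],
  tools: [],
};
const argusAlert = effectiveAllowlist(commonLayer, {
  skills: [],
  tools: ["elastic_query.py", "jira_tool.py"],
});
```

The copy step then only has to iterate over the result; a skill such as `edit-agent` never reaches `argus-alert` because it appears in neither layer.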
302
+
303
+ **Where to read:**
304
+ - `src/container-runner.ts:50–160` — `getAgentEnvKeys()`, `getAgentSkills()`, `getAgentTools()`
305
+ - `agent-credentials.yaml` — the full allowlist configuration
306
+
307
+ ---
308
+
309
+ ## 8. IPC
310
+
311
+ IPC (Inter-Process Communication) is how agents running inside containers send things back to the host without waiting for their run to finish.
312
+
313
+ ### The Mechanism
314
+
315
+ The host mounts a directory into the container at `/workspace/ipc/{agentFolder}/`. It contains three subdirectories of JSON files; the container writes to the first two and the host writes to the third:
316
+ - `messages/` — send a Slack message now (mid-run)
317
+ - `tasks/` — create/update/pause/cancel a scheduled task
318
+ - `input/` — the host writes here to send follow-up messages to an active container
319
+
320
+ The host runs a polling loop (every 1 second) watching these directories. When it sees a new file, it processes it and deletes it.
321
+
322
+ ### The MCP Bridge
323
+
324
+ Inside the container, agents don't write files directly. They call MCP tools:
325
+ ```
326
+ Claude Code → mcp__nanoclaw__send_message → ipc-mcp-stdio.js → writes JSON file → /workspace/ipc/...
327
+ ```
328
+
329
+ The `ipc-mcp-stdio.js` process is a local MCP server. Claude Code connects to it via stdio. When Claude calls `send_message`, `schedule_task`, etc., the MCP server writes the corresponding IPC file atomically (write to `.tmp`, then rename — so the host never reads a partial file).
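The write-then-rename pattern looks roughly like this. It is a sketch under assumptions rather than the code in `ipc-mcp-stdio.ts`: `writeIpcFileAtomic`, the file name, and the payload fields are hypothetical, and the demo writes to a temporary directory instead of `/workspace/ipc/`.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Write to a temporary name in the same directory, then rename into place.
// A reader polling the directory sees either no file or the complete file.
function writeIpcFileAtomic(dir: string, name: string, payload: unknown): string {
  const finalPath = path.join(dir, name);
  const tmpPath = `${finalPath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(payload));
  fs.renameSync(tmpPath, finalPath);
  return finalPath;
}

// Demo against a throwaway directory; payload fields are placeholders.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "ipc-demo-"));
const written = writeIpcFileAtomic(demoDir, "msg-1.json", {
  type: "message",
  text: "halfway done",
});
```

Staging the temporary file in the same directory matters, since a rename is only atomic within one filesystem. A reader polling the directory would also need to ignore names ending in `.tmp`.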
330
+
331
+ ### Available MCP Tools
332
+
333
+ | Tool | What it does |
334
+ |------|-------------|
335
+ | `send_message` | Send a Slack message immediately while still running |
336
+ | `schedule_task` | Create a recurring or one-time scheduled task |
337
+ | `update_task` | Modify an existing task's schedule or prompt |
338
+ | `pause_task` / `resume_task` / `cancel_task` | Manage task lifecycle |
339
+ | `list_tasks` | See scheduled tasks for this group |
340
+ | `register_group` | Register a new agent channel (main group only) |
341
+
342
+ **Where to read:**
343
+ - `src/ipc.ts` — the host-side watcher
344
+ - `container/agent-runner/src/ipc-mcp-stdio.ts` — the MCP server inside containers
345
+ - `container/agent-runner/src/index.ts:410–435` — how agent-runner connects to the MCP server
346
+
347
+ ---
348
+
349
+ ## 9. Task Scheduling
350
+
351
+ Agents can schedule themselves (or be pre-configured) to run on a schedule — no human message needed.
352
+
353
+ ### Schedule Types
354
+
355
+ ```
356
+ cron: "0 11 * * 1-5" → runs at 11:00 AM on weekdays (uses TIMEZONE from config)
357
+ interval: 3600000 → runs every 60 minutes (milliseconds, drift-free)
358
+ once: ISO timestamp → runs once at that time, then marked complete
359
+ ```
360
+
361
+ ### How It Works
362
+
363
+ Every 60 seconds, the scheduler queries:
364
+ ```sql
365
+ SELECT * FROM scheduled_tasks
366
+ WHERE status = 'active' AND next_run IS NOT NULL AND next_run <= now
367
+ ```
368
+
369
+ For each due task, it enqueues a container run. Crucially, `next_run` is advanced **before** the container starts — so if the run takes 10 minutes, the scheduler won't see it as due again during that time.
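For `interval` schedules, "drift-free" can be read as anchoring the next run to the previous scheduled time rather than to when the run finished. The sketch below shows one way to do that; `nextIntervalRun` is a hypothetical function, and skipping slots that were missed entirely is an assumption, not confirmed behavior of `computeNextRun()`.

```typescript
// Next run is anchored to the previous *scheduled* time, so a slow run
// does not push later runs back.
function nextIntervalRun(scheduledMs: number, intervalMs: number, nowMs: number): number {
  if (intervalMs <= 0) throw new Error("intervalMs must be positive");
  let next = scheduledMs + intervalMs;
  if (next <= nowMs) {
    // Assumption: slots missed entirely (e.g. host was down) are skipped,
    // so the result is always strictly in the future.
    const missed = Math.floor((nowMs - next) / intervalMs) + 1;
    next += missed * intervalMs;
  }
  return next;
}

const HOUR_MS = 3600000;
// Run scheduled at t=0 is picked up 10 minutes late: next run is still t=1h.
const afterLateStart = nextIntervalRun(0, HOUR_MS, 600000);
// Host was down for a bit over two hours: jump to the first future slot.
const afterOutage = nextIntervalRun(0, HOUR_MS, 7500000);
```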
370
+
371
+ ### Silent Tasks
372
+
373
+ Tasks can have `silent=1` in the database. Silent tasks run, but their results are NOT posted to Slack. Used for background maintenance — the byte daily transcript sync is silent so it doesn't clutter the standup channel.
374
+
375
+ ### Context Mode
376
+
377
+ Tasks have a `context_mode`:
378
+ - `"group"` — uses the agent's current conversation session; the agent has memory of what was discussed
379
+ - `"isolated"` — fresh session each time; task prompt must be self-contained
380
+
381
+ ### Pre-Run Scripts
382
+
383
+ A task can have a `script` field (bash script). The script runs first. The last line of its output must be JSON:
384
+ ```json
385
+ {"wakeAgent": true, "data": {"key": "value"}}
386
+ ```
387
+ If `wakeAgent: false`, the task ends without calling Claude — useful for tasks that only need to run under certain conditions (e.g. "only wake the agent if there are new alerts").
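Reading that contract might look like the sketch below. `parseScriptResult` is a hypothetical name, and returning `null` for a malformed last line is an assumption; this guide does not say how `runTask()` treats that case.

```typescript
interface ScriptResult {
  wakeAgent: boolean;
  data?: unknown;
}

// Parses the last non-empty line of the script's output.
// Returns null when that line is not the expected JSON object.
function parseScriptResult(stdout: string): ScriptResult | null {
  const lines = stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const last = lines[lines.length - 1];
  if (last === undefined) return null;
  try {
    const parsed: unknown = JSON.parse(last);
    if (typeof parsed !== "object" || parsed === null) return null;
    const obj = parsed as { wakeAgent?: unknown; data?: unknown };
    if (typeof obj.wakeAgent !== "boolean") return null;
    return { wakeAgent: obj.wakeAgent, data: obj.data };
  } catch (_err) {
    return null; // last line was not valid JSON
  }
}

const quiet = parseScriptResult('checking alerts...\n{"wakeAgent": false}\n');
const wake = parseScriptResult('{"wakeAgent": true, "data": {"key": "value"}}');
const broken = parseScriptResult("no json here");
```

Earlier lines of output are ignored, so the script is free to log progress before printing its verdict.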
388
+
389
+ **Where to read:**
390
+ - `src/task-scheduler.ts` — `startSchedulerLoop()`, `runTask()`, `computeNextRun()`
391
+ - `src/db.ts:515–541` — `getDueTasks()`, `updateTaskAfterRun()`
392
+
393
+ ---
394
+
395
+ ## 10. Agent Registration
396
+
397
+ ### Automatic Registration on Startup
398
+ NanoClaw reads `agents.yaml` on every startup and auto-registers any agent whose channel ID env var is set in `.env` but isn't in the DB yet. This means a fresh install only requires filling in `.env` and starting NanoClaw — no manual script needed.
399
+
400
+ The `autoRegisterAgentsFromYaml()` function in `src/index.ts` handles this. It also runs `scripts/setup-onecli-secrets.sh` automatically when credentials change (detected via hash of relevant env vars).
401
+
402
+ ### Dynamic Registration (Main Group)
403
+ The `global-claw` agent (the meta-agent) can register new agents at runtime using the `register_group` MCP tool. Only the main group can do this — NanoClaw enforces this at the IPC processing level.
404
+
405
+ ### What Registration Does
406
+
407
+ 1. Inserts row in `registered_groups` table (NanoClaw's SQLite)
408
+ 2. Creates MEMORY.md in the group's folder
409
+ 3. Calls `onecli.ensureAgent()` to create the agent in OneCLI's PostgreSQL (best-effort)
410
+
411
+ **Where to read:**
412
+ - `agents.yaml` — agent definitions (single source of truth)
413
+ - `src/index.ts` — `autoRegisterAgentsFromYaml()`, `ensureOneCLISecrets()`
414
+ - `src/ipc.ts:280–340` — `register_group` IPC processing
415
+ - `src/db.ts:700–760` — `upsertRegisteredGroup()`
416
+
417
+ ---
418
+
419
+ ## 11. Mount Security
420
+
421
+ The host can mount directories into containers (e.g. repos, config files). Without guardrails, an agent could request a mount of `~/.ssh` and exfiltrate keys.
422
+
423
+ ### The Allowlist
424
+
425
+ Stored at `~/.config/nanoclaw/mount-allowlist.json` — **outside the project directory**. Containers can't read this file. Only the host process reads it.
426
+
427
+ ```json
428
+ {
428
+   "allowedRoots": [
429
+     { "path": "~/repos/my-repo", "allowReadWrite": false }
430
+   ],
431
+   "blockedPatterns": [".ssh", ".aws", ".gnupg", "id_rsa", "credentials", "..."],
432
+   "nonMainReadOnly": true
434
+ }
435
+ ```
436
+
437
+ Blocked patterns are checked against both the requested path and all parent directories. If a path matches any blocked pattern, the mount is rejected even if the root is allowed.
438
+
439
+ `nonMainReadOnly: true` forces all mounts for non-main agents to read-only regardless of what was requested.
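The two checks can be sketched as follows. Both functions are hypothetical, and matching a pattern as a substring of each path component is an assumption; `validateAdditionalMounts()` in `src/mount-security.ts` may match differently.

```typescript
// Returns the first blocked pattern found in any path component, or null.
// Checking every component covers the requested path and all its parents.
function findBlockedPattern(requestedPath: string, blockedPatterns: string[]): string | null {
  const components = requestedPath.split("/").filter((c) => c.length > 0);
  for (const component of components) {
    for (const pattern of blockedPatterns) {
      if (component.indexOf(pattern) !== -1) return pattern;
    }
  }
  return null;
}

// Decides whether a mount ends up read-only.
function effectiveReadOnly(
  requestedReadWrite: boolean,
  rootAllowsReadWrite: boolean,
  isMainAgent: boolean,
  nonMainReadOnly: boolean,
): boolean {
  if (!isMainAgent && nonMainReadOnly) return true; // forced for non-main agents
  return !(requestedReadWrite && rootAllowsReadWrite);
}

const blocked = [".ssh", ".aws", ".gnupg", "id_rsa", "credentials"];
const sshHit = findBlockedPattern("/home/me/.ssh/config", blocked);
const repoHit = findBlockedPattern("/home/me/repos/my-repo", blocked);
```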
440
+
441
+ **Where to read:**
442
+ - `src/mount-security.ts` — `validateAdditionalMounts()`
443
+ - `~/.config/nanoclaw/mount-allowlist.json` — your actual allowlist
444
+
445
+ ---
446
+
447
+ ## 12. The Group Queue
448
+
449
+ NanoClaw runs multiple agents concurrently but has limits. The `GroupQueue` (`src/group-queue.ts`) manages this.
450
+
451
+ ### Concurrency Model
452
+
453
+ - Global limit: `MAX_CONCURRENT_CONTAINERS` (configurable in `.env`)
454
+ - Per-group: one container at a time (new messages wait while one is running)
455
+ - Task priority: pending tasks are run before pending messages
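Taken together, the three rules decide what a group may start next. A minimal sketch, with a hypothetical `GroupState` shape and `nextWorkFor` function that are not taken from `src/group-queue.ts`:

```typescript
interface GroupState {
  running: boolean; // a container for this group is already active
  pendingTasks: string[]; // queued scheduled-task IDs
  pendingMessages: boolean; // messages arrived while the group was busy
}

type NextWork = { kind: "task"; id: string } | { kind: "messages" } | null;

function nextWorkFor(group: GroupState, activeContainers: number, maxConcurrent: number): NextWork {
  // Per-group limit: one container at a time. Global limit: maxConcurrent.
  if (group.running || activeContainers >= maxConcurrent) return null;
  // Pending tasks run before pending messages.
  const taskId = group.pendingTasks[0];
  if (taskId !== undefined) return { kind: "task", id: taskId };
  return group.pendingMessages ? { kind: "messages" } : null;
}

const idleGroup: GroupState = { running: false, pendingTasks: ["task-7"], pendingMessages: true };
const picksTask = nextWorkFor(idleGroup, 1, 5);
const atCapacity = nextWorkFor(idleGroup, 5, 5);
```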
456
+
457
+ ### Key Behaviors
458
+
459
+ **Message queuing**: If a message arrives while the agent is already running, it's queued. When the current run finishes, the next run picks up queued messages.
460
+
461
+ **Retry with backoff**: If a container exits with an error, NanoClaw retries with exponential backoff (5s → 10s → 20s → 40s → 80s). After 5 retries, the message is dropped. On "No conversation found" errors, the session is cleared before retry.
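The delay schedule is a doubling sequence starting at 5 seconds. A short sketch, with hypothetical constant and function names:

```typescript
const BASE_DELAY_MS = 5000;
const MAX_RETRIES = 5;

// Delay before retry number `attempt` (1-based), or null once retries are exhausted.
function retryDelayMs(attempt: number): number | null {
  if (attempt < 1 || attempt > MAX_RETRIES) return null;
  return BASE_DELAY_MS * Math.pow(2, attempt - 1);
}

const schedule = [1, 2, 3, 4, 5].map(retryDelayMs);
```

A `null` result for the sixth attempt corresponds to the point where the message is dropped.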
462
+
463
+ **Idle close**: When an agent finishes its response, its container stays open for `IDLE_TIMEOUT` (30 minutes by default) — waiting for follow-up messages. If none arrive, the container gets a close sentinel and exits.
464
+
465
+ **Where to read:**
466
+ - `src/group-queue.ts` — the full implementation
467
+ - `src/config.ts` — `MAX_CONCURRENT_CONTAINERS`, `IDLE_TIMEOUT`
468
+
469
+ ---
470
+
471
+ ## 13. Remaining Known Issues
472
+
473
+ Three items remain open from the original audit. Everything else has been addressed.
474
+
475
+ | Issue | Location | Fix |
476
+ |-------|----------|-----|
477
+ | **Mount allowlist not reloadable** — changes to `mount-allowlist.json` require NanoClaw restart | `src/mount-security.ts` | Watch the file with `fs.watch`, or document restart requirement clearly |
478
+ | **No rate limit on IPC writes** — a container could write thousands of IPC files before the host drains them | `src/ipc.ts` | Add a file count limit per group namespace |
479
+ | **No per-group concurrency limit** — one runaway agent could hold all container slots and starve others | `src/group-queue.ts` | Track active count per group; cap at configurable limit |
480
+
481
+ ---
482
+
483
+ ## What to Read Next
484
+
485
+ If you want to go deeper on a specific area:
486
+
487
+ | Topic | Start here |
488
+ |-------|-----------|
489
+ | How a message becomes a container | `src/index.ts:280–430` |
490
+ | How containers are built | `src/container-runner.ts` (top to bottom) |
491
+ | Secrets and credentials | `agent-credentials.yaml` + `src/container-runner.ts:50–130` |
492
+ | Network security | `src/network-policy.ts` |
493
+ | How agents talk back | `src/ipc.ts` + `container/agent-runner/src/ipc-mcp-stdio.ts` |
494
+ | Scheduling | `src/task-scheduler.ts` |
495
+ | What OneCLI does | `setup-agents.sh` + `node_modules/@onecli-sh/sdk/lib/index.js` |
496
+ | Security model end-to-end | This doc → `src/mount-security.ts` → `src/network-policy.ts` |
@@ -0,0 +1,32 @@
1
+ import globals from 'globals'
2
+ import pluginJs from '@eslint/js'
3
+ import tseslint from 'typescript-eslint'
4
+ import noCatchAll from 'eslint-plugin-no-catch-all'
5
+
6
+ export default [
7
+   { ignores: ['node_modules/', 'dist/', 'container/', 'groups/'] },
7
+   { files: ['src/**/*.{js,ts}'] },
8
+   { languageOptions: { globals: globals.node } },
9
+   pluginJs.configs.recommended,
10
+   ...tseslint.configs.recommended,
11
+   {
12
+     plugins: { 'no-catch-all': noCatchAll },
13
+     rules: {
14
+       'preserve-caught-error': ['error', { requireCatchParameter: true }],
15
+       '@typescript-eslint/no-unused-vars': [
16
+         'error',
17
+         {
18
+           args: 'all',
19
+           argsIgnorePattern: '^_',
20
+           caughtErrors: 'all',
21
+           caughtErrorsIgnorePattern: '^_',
22
+           destructuredArrayIgnorePattern: '^_',
23
+           varsIgnorePattern: '^_',
24
+           ignoreRestSiblings: true,
25
+         },
26
+       ],
27
+       'no-catch-all/no-catch-all': 'warn',
28
+       '@typescript-eslint/no-explicit-any': 'warn',
29
+     },
30
+   },
32
+ ]
@@ -0,0 +1,107 @@
1
+ # Forge — IronClaws Guide Agent
2
+
3
+ You are Forge, the built-in guide for IronClaws. You help people understand how the platform works, build new agents, troubleshoot problems, and get the most out of their setup.
4
+
5
+ You are running inside IronClaws itself — which means you can inspect the codebase, read configuration files, and help make changes directly.
6
+
7
+ ---
8
+
9
+ ## What you know
10
+
11
+ ### How IronClaws works
12
+
13
+ IronClaws is a Node.js host process that manages a fleet of AI agents. Each agent runs in its own Docker container with isolated credentials, filesystem, and memory. The host handles Slack, routing, scheduling, and security.
14
+
15
+ Key files:
16
+ - `agents.yaml` — defines the agent fleet (folder, name, channel, secrets)
17
+ - `agent-credentials.yaml` — controls what each agent can access (env vars, tools, skills)
18
+ - `.env` — all credentials and channel IDs (never committed to git)
19
+ - `groups/{agent-name}/CLAUDE.md` — the agent's identity and instructions
20
+ - `docker-compose.yml` — starts OneCLI (the credential proxy) and Postgres
21
+ - `src/` — the host process TypeScript source
22
+
23
+ ### How to add a new agent
24
+
25
+ **The short version:**
26
+ 1. Create `groups/my-agent/CLAUDE.md` — write the agent's identity and instructions
27
+ 2. Add an entry to `agents.yaml`
28
+ 3. Add the channel ID to `.env`
29
+ 4. Restart IronClaws — the agent auto-registers
30
+
31
+ **`agents.yaml` entry:**
32
+ ```yaml
33
+ - folder: my-agent
34
+   name: "My Agent"
35
+   trigger: "@Argus"
36
+   channel_env: MY_AGENT_CHANNEL_ID
37
+   requires_trigger: false
38
+   onecli_secrets: [litellm]
39
+   onecli_id: <python3 -c "import uuid; print(uuid.uuid4())">
40
+ ```
41
+
42
+ **`.env` entry:**
43
+ ```
44
+ MY_AGENT_CHANNEL_ID=C...
45
+ ```
46
+
47
+ **`groups/my-agent/CLAUDE.md` starter:**
48
+ ```markdown
49
+ # Agent Name
50
+
51
+ One sentence: what this agent does.
52
+
53
+ ## What you do
54
+ Clear description of responsibilities.
55
+
56
+ ## How to respond
57
+ Tone, format, constraints.
58
+ ```
59
+
60
+ **`agent-credentials.yaml`** (only needed if the agent uses specific tools or env vars):
61
+ ```yaml
62
+ my-agent:
63
+   skills: []
64
+   tools: [jira_tool.py] # tools from container/tools/
65
+   config:
66
+     - SOME_API_URL
67
+ ```
68
+
69
+ ### How credentials work
70
+
71
+ Agents never see raw API keys. OneCLI (the HTTPS proxy) intercepts outbound requests and injects Authorization headers. `onecli_secrets` in `agents.yaml` controls which credentials flow to which agent. Run `bash scripts/setup-onecli-secrets.sh` to register new credentials — this runs automatically on startup when credentials change.
72
+
73
+ ### Available tools
74
+
75
+ | Tool | What it does |
76
+ |------|-------------|
77
+ | `jira_tool.py` | Jira CRUD — get, search, create, update, comment |
78
+ | `elastic_query.py` | Query Kibana/Elasticsearch |
79
+ | `slack_history_tool.py` | Read Slack channel history |
80
+ | `gdrive_tool.py` | Read Google Drive files |
81
+
82
+ Add tools to an agent via `agent-credentials.yaml` → `tools:`.
83
+
84
+ ---
85
+
86
+ ## How to help
87
+
88
+ **Building a new agent** — draft the CLAUDE.md, show the agents.yaml + .env entries, explain what OneCLI secrets they need if any.
89
+
90
+ **Explaining how things work** — you have full access to the codebase via `Read` and `Bash`. Read the source and explain in plain language.
91
+
92
+ **Troubleshooting:**
93
+ - Agent not responding → check `~/.config/nanoclaw/sender-allowlist.json` has the channel (auto-added on restart if channel is in `.env`)
94
+ - Container not starting → Docker Desktop running? Image built? (`cd container && ./build.sh && ./build-argus.sh`)
95
+ - Credentials failing → run `bash scripts/setup-onecli-secrets.sh`
96
+ - Slow startup → large JSONL files in `data/sessions/`; delete to reset
97
+
98
+ ---
99
+
100
+ ## Memory
101
+
102
+ Append to `MEMORY.md` when you build something or find something notable:
103
+ ```
104
+ ## YYYY-MM-DD
105
+ - Built: [agent name] — [what it does]
106
+ - Fixed: [issue]
107
+ ```