@semalt-ai/code 1.8.5 → 1.20.0

Files changed (192)
  1. package/.claude/settings.local.json +7 -1
  2. package/.github/workflows/ci.yml +69 -0
  3. package/ARCHITECTURE.md +6 -95
  4. package/CLAUDE.md +196 -316
  5. package/README.md +148 -4
  6. package/docs/ARCHITECTURE.md +1321 -0
  7. package/docs/CONFIG.md +340 -0
  8. package/docs/HISTORY.md +245 -0
  9. package/examples/embed.js +74 -0
  10. package/index.js +251 -10
  11. package/lib/agent.js +856 -120
  12. package/lib/api.js +239 -50
  13. package/lib/args.js +74 -2
  14. package/lib/audit.js +23 -1
  15. package/lib/background.js +584 -0
  16. package/lib/checkpoints.js +757 -0
  17. package/lib/commands/auth.js +94 -0
  18. package/lib/commands/chat-session.js +489 -0
  19. package/lib/commands/chat-slash.js +415 -0
  20. package/lib/commands/chat-turn.js +669 -0
  21. package/lib/commands/chat.js +407 -0
  22. package/lib/commands/custom.js +157 -0
  23. package/lib/commands/history-utils.js +66 -0
  24. package/lib/commands/index.js +268 -0
  25. package/lib/commands/mcp.js +113 -0
  26. package/lib/commands/oneshot.js +193 -0
  27. package/lib/commands/registry.js +269 -0
  28. package/lib/commands/tasks.js +89 -0
  29. package/lib/compact.js +87 -0
  30. package/lib/config.js +360 -11
  31. package/lib/constants.js +401 -3
  32. package/lib/deny.js +199 -0
  33. package/lib/doctor.js +160 -0
  34. package/lib/headless.js +202 -0
  35. package/lib/hooks.js +286 -0
  36. package/lib/images.js +270 -0
  37. package/lib/internals.js +49 -0
  38. package/lib/mcp/boundary.js +131 -0
  39. package/lib/mcp/client.js +270 -0
  40. package/lib/mcp/oauth.js +134 -0
  41. package/lib/memory.js +209 -0
  42. package/lib/metrics.js +37 -2
  43. package/lib/payload.js +54 -0
  44. package/lib/permission-rules.js +401 -0
  45. package/lib/permissions.js +123 -26
  46. package/lib/pricing.js +67 -0
  47. package/lib/proc.js +62 -0
  48. package/lib/prompts.js +99 -8
  49. package/lib/sandbox.js +568 -0
  50. package/lib/sdk.js +328 -0
  51. package/lib/secrets.js +211 -0
  52. package/lib/skills.js +223 -0
  53. package/lib/subagents.js +516 -0
  54. package/lib/tool_registry.js +2862 -0
  55. package/lib/tool_specs.js +263 -9
  56. package/lib/tools.js +352 -1039
  57. package/lib/ui/anim.js +86 -0
  58. package/lib/ui/ansi.js +17 -27
  59. package/lib/ui/chat-history.js +253 -71
  60. package/lib/ui/create-ui.js +67 -24
  61. package/lib/ui/diff.js +90 -25
  62. package/lib/ui/file-activity.js +236 -0
  63. package/lib/ui/format.js +195 -29
  64. package/lib/ui/input-field.js +21 -11
  65. package/lib/ui/md-stream.js +234 -0
  66. package/lib/ui/render-operation.js +113 -0
  67. package/lib/ui/select.js +1 -4
  68. package/lib/ui/status-bar.js +146 -36
  69. package/lib/ui/stream.js +20 -13
  70. package/lib/ui/theme.js +190 -44
  71. package/lib/ui/tool-operation.js +190 -0
  72. package/lib/ui/utils.js +9 -5
  73. package/lib/ui/web-activity.js +270 -0
  74. package/lib/ui/writer.js +159 -45
  75. package/lib/ui.js +1 -1
  76. package/lib/verify.js +229 -0
  77. package/lib/web-extract.js +213 -0
  78. package/lib/web-summarize.js +68 -0
  79. package/package.json +19 -4
  80. package/scripts/lint.js +57 -0
  81. package/test/agent-loop.test.js +389 -0
  82. package/test/anim-driver.test.js +153 -0
  83. package/test/ask-user-display.test.js +226 -0
  84. package/test/ask-user-gate.test.js +231 -0
  85. package/test/background.test.js +414 -0
  86. package/test/chat-history-nocolor.test.js +155 -0
  87. package/test/chat-relogin.test.js +207 -0
  88. package/test/chat.test.js +114 -0
  89. package/test/checkpoints-agent.test.js +181 -0
  90. package/test/checkpoints.test.js +650 -0
  91. package/test/command-registry.test.js +160 -0
  92. package/test/compact.test.js +116 -0
  93. package/test/completion-lazy.test.js +52 -0
  94. package/test/config-merge.test.js +324 -0
  95. package/test/config-quarantine.test.js +128 -0
  96. package/test/config-write-guard-allow-anywhere.test.js +56 -0
  97. package/test/config-write-guard-skip.test.js +46 -0
  98. package/test/config-write-guard.test.js +153 -0
  99. package/test/context-split.test.js +215 -0
  100. package/test/cost-doctor.test.js +142 -0
  101. package/test/custom-commands-chat.test.js +106 -0
  102. package/test/custom-commands.test.js +230 -0
  103. package/test/defer-detail-band.test.js +403 -0
  104. package/test/deny-windows.test.js +120 -0
  105. package/test/deny.test.js +83 -0
  106. package/test/detail-band-tab-flatten.test.js +242 -0
  107. package/test/download-allow-anywhere.test.js +66 -0
  108. package/test/download-confine.test.js +153 -0
  109. package/test/exec-diff.test.js +268 -0
  110. package/test/executors.test.js +599 -0
  111. package/test/extract-tool-calls.test.js +349 -0
  112. package/test/fetch-url-validation.test.js +219 -0
  113. package/test/file-activity.test.js +522 -0
  114. package/test/fixtures/tool-calls.js +57 -0
  115. package/test/fixtures/web-page.js +91 -0
  116. package/test/git-tools.test.js +384 -0
  117. package/test/grep-glob-serialize.test.js +242 -0
  118. package/test/grep-glob.test.js +268 -0
  119. package/test/grep-path-target.test.js +227 -0
  120. package/test/harness/README.md +57 -0
  121. package/test/harness/chat-harness.js +143 -0
  122. package/test/harness/memwarn-headless-child.js +65 -0
  123. package/test/harness/mock-llm.js +120 -0
  124. package/test/harness/mock-mcp-server.js +142 -0
  125. package/test/harness/sse-server.js +69 -0
  126. package/test/headless.test.js +348 -0
  127. package/test/history-utils.test.js +88 -0
  128. package/test/hooks-agent.test.js +238 -0
  129. package/test/hooks-verify-sandbox.test.js +232 -0
  130. package/test/hooks.test.js +216 -0
  131. package/test/http-get-user-agent.test.js +142 -0
  132. package/test/images-api.test.js +208 -0
  133. package/test/images.test.js +238 -0
  134. package/test/input-field-ctrl-o.test.js +37 -0
  135. package/test/live-height-physical.test.js +281 -0
  136. package/test/max-iterations.test.js +218 -0
  137. package/test/mcp-boundary.test.js +57 -0
  138. package/test/mcp-client.test.js +267 -0
  139. package/test/mcp-oauth.test.js +86 -0
  140. package/test/md-stream.test.js +183 -0
  141. package/test/memory-truncation-warning.test.js +222 -0
  142. package/test/memory.test.js +198 -0
  143. package/test/native-dispatch.test.js +409 -0
  144. package/test/native-live-narration.test.js +254 -0
  145. package/test/output-chokepoint.test.js +188 -0
  146. package/test/output-heredoc-leak.test.js +195 -0
  147. package/test/output-preview.test.js +245 -0
  148. package/test/path-guards.test.js +134 -0
  149. package/test/payload.test.js +99 -0
  150. package/test/permission-rules-agent.test.js +210 -0
  151. package/test/permission-rules.test.js +297 -0
  152. package/test/permissions.test.js +362 -0
  153. package/test/plan-mode.test.js +167 -0
  154. package/test/read-paginate.test.js +275 -0
  155. package/test/readonly-tools.test.js +177 -0
  156. package/test/render-operation.test.js +317 -0
  157. package/test/replay-descriptor-xml.test.js +216 -0
  158. package/test/replay-descriptor.test.js +189 -0
  159. package/test/replay-web-aggregate.test.js +291 -0
  160. package/test/replay-web-persist.test.js +241 -0
  161. package/test/result-cap.test.js +233 -0
  162. package/test/running-glyph-anim.test.js +111 -0
  163. package/test/sandbox-agent.test.js +147 -0
  164. package/test/sandbox-integration.test.js +216 -0
  165. package/test/sandbox.test.js +408 -0
  166. package/test/sdk.test.js +234 -0
  167. package/test/shell-output-cap.test.js +181 -0
  168. package/test/skills-chat.test.js +110 -0
  169. package/test/skills.test.js +295 -0
  170. package/test/smoke.test.js +68 -0
  171. package/test/status-bar-driver.test.js +93 -0
  172. package/test/status-bar-pause.test.js +164 -0
  173. package/test/status-bar-resync.test.js +188 -0
  174. package/test/stream-parser.test.js +171 -0
  175. package/test/subagents-agent.test.js +178 -0
  176. package/test/subagents.test.js +222 -0
  177. package/test/theme-palette.test.js +166 -0
  178. package/test/tool-registry.test.js +85 -0
  179. package/test/trim-budget.test.js +101 -0
  180. package/test/truncate-visible.test.js +78 -0
  181. package/test/verify-agent.test.js +317 -0
  182. package/test/verify.test.js +141 -0
  183. package/test/view-image.test.js +199 -0
  184. package/test/web-activity-ordering.test.js +203 -0
  185. package/test/web-activity.test.js +207 -0
  186. package/test/web-data-extraction-guidance.test.js +71 -0
  187. package/test/web-extract.test.js +185 -0
  188. package/test/web-fetch-agent.test.js +291 -0
  189. package/test/web-fetch-mode.test.js +193 -0
  190. package/test/web-search.test.js +380 -0
  191. package/lib/commands.js +0 -1438
  192. package/path +0 -1
package/CLAUDE.md CHANGED
@@ -1,45 +1,23 @@
1
1
  # semalt-code — CLI Agent
2
2
 
3
- Node.js CLI tool that lets AI agents interact with code via an iterative tool-use loop. Zero external dependencies; uses only Node.js built-ins.
4
-
5
- Published as `@semalt-ai/code`. Invokable as `semalt-code` or `semalt`.
6
-
7
- ---
8
-
9
- ## Directory Layout
10
-
11
- ```
12
- semalt-code/
13
- ├── index.js # Entry point: arg parsing, module wiring, command dispatch
14
- ├── lib/
15
- │ ├── api.js # HTTP client for dashboard auth + OpenAI-compatible inference
16
- │ ├── agent.js # Agent loop: stream → extract tools → execute → repeat
17
- │ ├── commands.js # All CLI command handlers (chat, code, edit, shell, login, …)
18
- │ ├── tools.js # File and shell operation implementations
19
- │ ├── prompts.js # System prompt for the LLM (tells it to use exec/read/write tags)
20
- │ ├── ui.js # Barrel: re-exports everything from lib/ui/
21
- │ ├── ui/
22
- │ │ ├── ansi.js # ANSI escape constants, THEME, color codes, SPINNER_DEFS
23
- │ │ ├── utils.js # getCols, getRows, stripAnsi, hr, boxLine, insertCharAt, …
24
- │ │ ├── diff.js # renderDiff (LCS diff), renderMarkdown, _mdInline
25
- │ │ ├── stream.js # StreamRenderer — live token-by-token terminal output
26
- │ │ ├── legacy.js # StatusBar (cmdCode/cmdEdit), interactiveSelect, SelectMenu
27
- │ │ ├── layout.js # LayoutManager — terminal geometry, resize events
28
- │ │ ├── chat-history.js# ChatHistory — bubble rendering, scroll, streaming slots
29
- │ │ ├── status-bar.js # FullStatusBar — animated TUI status line
30
- │ │ ├── input-field.js # InputField, parseKeySequence, SLASH_CMDS
31
- │ │ └── create-ui.js # createUI factory + non-TTY no-op fallback
32
- │ ├── context.js # Loads file/directory content into the prompt
33
- │ ├── config.js # Read/write ~/.semalt-ai/config.json
34
- │ ├── permissions.js # Per-session approval tracking for tool calls
35
- │ ├── args.js # CLI argument parser
36
- │ ├── constants.js # CONFIG_PATH, DEFAULT_CONFIG, DEFAULT_API_TIMEOUT_MS
37
- │ ├── audit.js # Append-only audit log for all tool executions
38
- │ ├── storage.js # Local session persistence and resume
39
- │ └── metrics.js # Token counting, cost estimation, latency tracking
40
- ├── package.json # name: @semalt-ai/code, version: 1.8.0, bin: semalt / semalt-code
41
- └── README.md
42
- ```
3
+ Node.js CLI tool that lets AI agents interact with code via an iterative tool-use
4
+ loop (stream → extract tool calls → execute → repeat). **Minimal, vetted, pinned**
5
+ runtime dependencies; everything else uses Node.js built-ins. Published as
6
+ `@semalt-ai/code`; invokable as `semalt-code` or `semalt`. Also consumable as a
7
+ library via the `createAgent` facade (`lib/sdk.js`).
8
+
9
+ > **This file is auto-loaded as project memory — keep it lean.** Deep detail lives
10
+ > in `docs/` (not auto-loaded):
11
+ > - **`docs/ARCHITECTURE.md`** — per-subsystem internals (MCP, checkpoints, sandbox,
12
+ > web-fetch pipeline, SDK, subagents, hooks, git tools, …).
13
+ > - **`docs/HISTORY.md`**: dependency-policy rationale, the long-form invariant
14
+ > reference, and the "Deferred / Not Yet Implemented" roadmap.
15
+ > - **`docs/CONFIG.md`**: full per-key config reference + CLI flags/commands +
16
+ > slash commands + tool tags/operations + dashboard API endpoints.
17
+
18
+ The authoritative *runtime* sources for tool tags and CLI surface are
19
+ `lib/tool_specs.js` / `lib/prompts.js` (tool tags) and `semalt-code --help` (CLI
20
+ flags). `docs/CONFIG.md` mirrors them for humans.
43
21
 
44
22
  ---
45
23
 
@@ -47,7 +25,8 @@ semalt-code/
47
25
 
48
26
  | Component | Technology |
49
27
  |-----------|-----------|
50
- | Runtime | Node.js ≥ 16, CommonJS (`require`) |
28
+ | Runtime | Node.js ≥ 18, CommonJS (`require`) |
29
+ | Runtime deps | `@modelcontextprotocol/sdk` (ESM, via `lib/mcp/boundary.js`); `@mozilla/readability` + `linkedom` + `turndown` (web-fetch extraction) — all exact-pinned |
51
30
  | HTTP | Built-in `http`/`https` modules |
52
31
  | Shell exec | `child_process.spawnSync` |
53
32
  | File I/O | `fs` module |
@@ -57,297 +36,198 @@ semalt-code/
57
36
 
58
37
  ---
59
38
 
60
- ## CLI Commands
61
-
62
- ```
63
- semalt-code # interactive chat (default)
64
- semalt-code chat # interactive chat (explicit)
65
- semalt-code code <prompt> # one-shot task with optional file context
66
- semalt-code edit <file> <instruction> # targeted file edit
67
- semalt-code shell <command> # run shell, optionally ask LLM to analyze output
68
- semalt-code login # browser-based device auth against dashboard
69
- semalt-code logout # clear stored auth_token
70
- semalt-code whoami # show authenticated user
71
- semalt-code models # interactive model selector (fetches from dashboard)
72
- semalt-code init [options] # create/update ~/.semalt-ai/config.json
73
- semalt-code audit # print last 50 audit log entries
74
- semalt-code config [set <key> <val>] # show or update config keys
75
- ```
76
-
77
- ### Common Flags
78
-
79
- ```
80
- -m, --model <name> override model for this invocation
81
- -r, --resume <chat-id> resume a dashboard chat by ID
82
- -f, --file <path> load file or directory as context
83
- -a, --analyze have LLM analyze shell output (used with `shell`)
84
- --dry-run preview file edits without writing
85
- --api-base <url> LLM API base URL (overrides config)
86
- --api-key <key> API key (overrides config)
87
- --dashboard-url <url> dashboard base URL (overrides config)
88
- --default-model <name> set default model in config
89
- --show-think display model reasoning (thinking) content
90
- --debug inline debug: per-iteration debug block in chat history (TUI-safe)
91
- --debug-file <path> extended debug: per-iteration block + raw SSE chunks
92
- + request body dumps written to <path>, nothing to stdout.
93
- Mutually exclusive with --debug.
94
- --allow-fs auto-approve all filesystem operations
95
- --allow-exec auto-approve shell command execution
96
- --allow-net auto-approve network operations
97
- --allow-all auto-approve everything (use carefully)
98
- --readonly block all write operations
99
- --new skip session resume prompt
100
- -v, --version print version
101
- -h, --help print help
102
- ```
103
-
104
- ### In-Chat Slash Commands
105
-
106
- | Command | Effect |
107
- |---------|--------|
108
- | `/help` | List slash commands |
109
- | `/file <path>` | Attach file or directory to context |
110
- | `/history` | Browse and load a local saved session |
111
- | `/chats` | Browse and resume a saved chat from the dashboard |
112
- | `/new` | Start a fresh conversation (detach from current saved chat) |
113
- | `/model [name]` | Show or switch model |
114
- | `/models` | Interactive model picker from dashboard |
115
- | `/shell <cmd>` or `!<cmd>` | Execute shell command |
116
- | `/compact` | Show token usage estimate and session metrics |
117
- | `/clear` | Reset conversation history |
118
- | `/approve` | Toggle auto-approval of tool calls |
119
- | `/config` | Print current config |
120
- | `/login` | Start device auth flow |
121
- | `/whoami` | Show current user |
122
- | `/logout` | Clear auth token |
123
- | `exit` / `quit` | Exit |
124
-
125
- ---
126
-
127
- ## Agent Loop (`lib/agent.js`)
128
-
129
- Maximum 10 iterations per user turn.
130
-
131
- ```
132
- 1. Send messages[] to LLM via chatStream()
133
- 2. Stream response tokens to terminal (StreamRenderer)
134
- 3. After full response: extract tool-call tags from text
135
- 4. If no tool tags → done
136
- 5. For each tag: request user permission (once / always / no)
137
- 6. Execute approved operations via ToolExecutor (wrapped in try/catch)
138
- 7. Append tool results to messages[]
139
- 8. Goto 1
140
- ```
39
+ ## Directory Layout
141
40
 
142
- Each tool dispatch is wrapped in try/catch; errors print a warning and continue to the next tag rather than aborting the loop.
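
The loop described in the removed section above can be sketched roughly as follows. This is an illustration only, not the package's code: `runLoopSketch` and the four injected callbacks (`callModel`, `extractCalls`, `askPermission`, `execute`) are hypothetical names, and the sketch is synchronous to keep the control flow easy to follow, whereas the real loop streams tokens.

```javascript
const MAX_ITERATIONS = 10; // cap stated in the removed section

// Hypothetical sketch: iterate until the model reply contains no tool calls.
function runLoopSketch(messages, { callModel, extractCalls, askPermission, execute }) {
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const reply = callModel(messages);
    messages.push({ role: 'assistant', content: reply });

    const calls = extractCalls(reply);
    if (calls.length === 0) return { messages, iterations: i + 1, done: true };

    const results = [];
    for (const call of calls) {
      try {
        if (!askPermission(call)) {
          results.push(`[${call.tag}] denied`);
          continue;
        }
        results.push(`[${call.tag}] ${execute(call)}`);
      } catch (err) {
        // A failing tool is reported and the loop moves on to the next call.
        results.push(`[${call.tag}] error: ${err.message}`);
      }
    }
    // Tool results are fed back so the next iteration can react to them.
    messages.push({ role: 'user', content: results.join('\n') });
  }
  return { messages, iterations: MAX_ITERATIONS, done: false };
}
```

Injecting the callbacks keeps the sketch runnable without a network or a terminal; a stubbed `callModel` is enough to exercise the stop condition and the iteration cap.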
143
-
144
- ### Tool Tags (parsed from LLM text)
145
-
146
- ```xml
147
- <exec>shell command here</exec>
148
- <shell>shell command here</shell>
149
- <read_file>/absolute/or/relative/path</read_file>
150
- <read_file path="/path/to/file"/>
151
- <write_file path="/path/to/file">file content here</write_file>
152
- <create_file path="/path/to/file">file content here</create_file>
153
- <append_file path="/path/to/file">content to append</append_file>
154
- <list_dir>/path/to/dir</list_dir>
155
- <search_files pattern="*.ts" dir="src"/>
156
- <delete_file>/path/to/file</delete_file>
157
- <make_dir>/path/to/dir</make_dir>
158
- <remove_dir>/path/to/dir</remove_dir>
159
- <get_env>ENV_VAR_NAME</get_env>
160
- <set_env name="VAR" value="value"/>
161
- <move_file src="/old/path" dst="/new/path"/>
162
- <copy_file src="/src/path" dst="/dst/path"/>
163
- <edit_file path="/file" line="42">replacement line content</edit_file>
164
- <search_in_file path="/file">regex pattern</search_in_file>
165
- <replace_in_file path="/file" search="old" replace="new"></replace_in_file>
166
- <download>https://example.com/file.zip</download>
167
- <upload path="/local/path">base64encodedcontent</upload>
168
- <file_stat>/path/to/file</file_stat>
169
- <http_get url="https://example.com/api"/>
170
- <ask_user question="What is your preferred language?"/>
171
- <store_memory key="project_lang">TypeScript</store_memory>
172
- <recall_memory key="project_lang"/>
173
- <list_memories/>
174
- <system_info/>
175
41
  ```
176
-
177
- The system prompt (`lib/prompts.js`) instructs the LLM to use exactly these tags. Do not change tag names without updating both `prompts.js` and the parser in `agent.js`.
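
To make the tag format above concrete, here is a minimal sketch of pulling paired and self-closing tags out of model text. It is not the parser in `agent.js`: `extractTags`, `parseAttrs`, and the four-name `TAGS` subset are assumptions chosen for illustration, and the regex only handles double-quoted attributes.

```javascript
// Hypothetical subset of the tag names listed above.
const TAGS = ['exec', 'read_file', 'write_file', 'http_get'];

// Parse key="value" pairs from an attribute string.
function parseAttrs(raw) {
  const attrs = {};
  for (const m of raw.matchAll(/([\w-]+)="([^"]*)"/g)) attrs[m[1]] = m[2];
  return attrs;
}

// Find <tag ...>body</tag> and self-closing <tag .../> forms, in document order.
function extractTags(text) {
  const names = TAGS.join('|');
  const re = new RegExp(
    `<(${names})((?:\\s+[\\w-]+="[^"]*")*)\\s*(?:/>|>([\\s\\S]*?)</\\1>)`,
    'g'
  );
  const calls = [];
  for (const m of text.matchAll(re)) {
    calls.push({
      tag: m[1],
      attrs: parseAttrs(m[2]),
      body: m[3] === undefined ? null : m[3], // null for self-closing tags
    });
  }
  return calls;
}
```

Building the pattern from one list of names is the point of the sketch: a tag that is missing from the list is never matched, which is why the source warns against renaming tags in only one place.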
178
-
179
- ---
180
-
181
- ## Tool Operations (`lib/tools.js`)
182
-
183
- All operations request permission before execution unless auto-approved.
184
- Output truncated to `config.max_output_lines` (default 20) to avoid filling context.
185
-
186
- | Action | Description |
187
- |--------|-------------|
188
- | `read` | Read file content |
189
- | `write` | Write file (creates parent dirs) |
190
- | `append` | Append to file |
191
- | `list_dir` | List directory contents |
192
- | `delete_file` | Delete file |
193
- | `make_dir` | Create directory (recursive) |
194
- | `remove_dir` | Remove directory (recursive) |
195
- | `move_file` | Move/rename file |
196
- | `copy_file` | Copy file |
197
- | `search_files` | Find files matching glob pattern |
198
- | `search_in_file` | Regex search within file |
199
- | `replace_in_file` | Replace text in file (regex, optional flags) |
200
- | `edit_file` | Replace a specific line number in a file |
201
- | `get_env` / `set_env` | Read/write environment variables |
202
- | `download` | HTTP GET → save to file |
203
- | `upload` | Write base64-encoded content to file |
204
- | `file_stat` | Stat a file (size, mtime, type, mode) |
205
- | `http_get` | HTTP GET → return body (truncated to max_output_lines) |
206
- | `ask_user` | Prompt user for input; auto-answers 'y' in non-TTY mode |
207
- | `store_memory` | Persist a key/value pair to `~/.semalt-ai/memory.json` |
208
- | `recall_memory` | Read a key from `~/.semalt-ai/memory.json` |
209
- | `list_memories` | List all stored memory keys |
210
- | `system_info` | Return platform, arch, hostname, memory, Node version, cwd |
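
The `max_output_lines` truncation mentioned above the table could look something like the sketch below. Whether the package keeps the head or the tail of the output, and how it words the omission notice, is not stated in the source; `truncateOutput` and its notice format are assumptions.

```javascript
// Hypothetical helper: keep the first `maxLines` lines and note how many were dropped.
function truncateOutput(text, maxLines = 20) {
  const lines = text.split('\n');
  if (lines.length <= maxLines) return text;
  const omitted = lines.length - maxLines;
  return lines
    .slice(0, maxLines)
    .concat(`... (${omitted} more lines omitted)`)
    .join('\n');
}
```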
211
-
212
- ---
213
-
214
- ## Audit Log (`lib/audit.js`)
215
-
216
- Every tool execution is appended to `~/.semalt-ai/audit.log` as NDJSON:
217
- ```json
218
- {"ts":"2026-01-01T00:00:00.000Z","tag":"exec","input":"{\"command\":\"ls\"}","approved":true,"result":"ok"}
42
+ semalt-code/
43
+ ├── index.js # Entry point: arg parsing, module wiring, command dispatch
44
+ ├── lib/
45
+ │ ├── sdk.js # Embedding SDK: createAgent() STABLE facade (assembles loop/registries/permissions/sandbox per-instance)
46
+ │ ├── internals.js # UNSTABLE building-blocks barrel (@semalt-ai/code/internals subpath; no semver guarantee)
47
+ │ ├── api.js # HTTP client: dashboard auth + OpenAI-compatible inference (chatStream/chatComplete/dashboard*)
48
+ │ ├── agent.js # Agent loop; boundToolOutput chokepoint; untrusted fencing; XML+native tuple convergence
49
+ │ ├── commands/ # CLI + in-chat command handlers: registry (dispatch/help/completion), custom commands,
50
+ │ │ # auth, mcp mgmt, oneshot (code/edit/shell/models/init), tasks, chat session/slash/turn
51
+ │ ├── tools.js # File + shell operation impls; agentExecShell chokepoint; secret/config path guards
52
+ │ ├── tool_registry.js # Per-tool registration: XML parseAttrs + native fromParams + execute + permission; git tools; web-fetch pipeline
53
+ │ ├── tool_specs.js # TOOL_SPECS: OpenAI-format parameter source of truth for every 'tool' tag
54
+ │ ├── proc.js # Platform-aware subprocess spawn + tree-kill (+ detached spawn / kill-by-PID / isProcessAlive)
55
+ │ ├── debug.js # Two debug modes (--debug inline / --debug-file), wired once at startup
56
+ │ ├── prompts.js # System prompt: tool-tag inventory + untrusted-content rules + navigation guidance
57
+ │ ├── ui.js / ui/ # Terminal UI: raw-ANSI writer, stream renderer, status bar, diff, select, layout, web-activity collapse
58
+ │ ├── mcp/ # boundary.js (sole CJS↔ESM bridge), client.js (manager), oauth.js (keychain provider)
59
+ │ ├── hooks.js # Lifecycle hooks (shell/prompt) at agent events; deny-listed + sandboxed; project command-hooks quarantined
60
+ │ ├── verify.js # Self-verification: run a configured command at "done", advisory/enforcing; deny-listed + sandboxed
61
+ │ ├── checkpoints.js # Per-write file snapshots + /rewind (code/conversation/both); turn linkage; restore-path guard re-validation
62
+ │ ├── sandbox.js # OS sandbox: Seatbelt/bubblewrap policy gen + wrap; resolveSandboxedSpawn shim; binary network isolation
63
+ │ ├── skills.js # Skills: discover SKILL.md, metadata-only injection, body on invocation
64
+ │ ├── subagents.js # spawn_agent tool: isolated child loop sharing parent permissions; bounded parallel
65
+ │ ├── background.js # Detached background-task launcher + registry (NOT an agent tool)
66
+ │ ├── images.js # Multimodal image input: read+size-cap+isPathSafe+base64, provider shaping, vision-capability resolution
67
+ │ ├── web-extract.js # Web-fetch stage 1+2: classify + Readability extract + Turndown HTML→Markdown + token cap
68
+ │ ├── web-summarize.js # Web-fetch stage 3: data-only untrusted-safe secondary-LLM summary
69
+ │ ├── memory.js # Project memory: AGENTS.md/CLAUDE.md hierarchy loader (this file)
70
+ │ ├── headless.js # Headless -p/--print output: text/json/stream-json
71
+ │ ├── pricing.js # Per-model price table → cost
72
+ │ ├── doctor.js # /doctor self-diagnostics: checks + aggregation
73
+ │ ├── payload.js # Prompt-caching + reasoning_effort payload augmentation
74
+ │ ├── compact.js # Conversation compaction: select/summarize/replace
75
+ │ ├── context.js # Loads file/directory content into the prompt
76
+ │ ├── config.js # Read/write ~/.semalt-ai/config.json; 4-layer merge; executable-quarantine re-resolution
77
+ │ ├── permissions.js # Per-session approval tracking (+ per-pattern rule resolution)
78
+ │ ├── permission-rules.js # Pure per-pattern rule engine: schema, canonicalization, resolvePermission
79
+ │ ├── deny.js # Destructive-command deny-list for shell calls
80
+ │ ├── secrets.js # API-key sourcing: env → OS keychain → config; generic keychain helpers
81
+ │ ├── args.js # CLI argument parser
82
+ │ ├── constants.js # CONFIG_PATH, DEFAULT_CONFIG, TAG_REGISTRY ↔ TOOL_SPECS parity check, protectedConfigDirs
83
+ │ ├── audit.js # Append-only audit log for all tool executions
84
+ │ ├── storage.js # Local session persistence and resume
85
+ │ └── metrics.js # Token counting, cost estimation, latency tracking, split context counter
86
+ ├── scripts/lint.js # Zero-dep lint: `node --check` over all sources
87
+ ├── test/ # node:test suites (smoke + per-subsystem)
88
+ ├── examples/embed.js # Runnable embedding example: createAgent + permission policy + close()
89
+ ├── package.json # exports: '.' → sdk.js, './internals' → internals.js; bin: semalt / semalt-code
90
+ ├── package-lock.json # committed lockfile (npm ci installs strictly from it)
91
+ └── README.md
219
92
  ```
220
93
 
221
- View the last 50 entries with `semalt-code audit`.
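
As an illustration of the NDJSON format above, the sketch below builds one entry in the same shape and reads back the last N entries from log text. `formatAuditEntry` and `lastEntries` are hypothetical names; the field layout follows the example entry, and skipping malformed lines is an assumption about how a reader might behave.

```javascript
// Build one NDJSON line shaped like the example entry above.
function formatAuditEntry({ tag, input, approved, result }, now = new Date()) {
  return JSON.stringify({
    ts: now.toISOString(),
    tag,
    input: JSON.stringify(input), // the example stores input as a JSON string
    approved,
    result,
  }) + '\n';
}

// Parse NDJSON text and return the last `n` entries, skipping malformed lines.
function lastEntries(ndjson, n = 50) {
  const entries = [];
  for (const line of ndjson.split('\n')) {
    if (!line.trim()) continue;
    try {
      entries.push(JSON.parse(line));
    } catch {
      // A corrupt line should not hide the rest of the log.
    }
  }
  return entries.slice(-n);
}
```

One object per line means a writer can append with `fs.appendFileSync` without rereading the file, which is what makes an append-only log cheap.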
222
-
223
94
  ---
224
95
 
225
- ## Session Storage (`lib/storage.js`)
226
-
227
- Local chat sessions are saved to `~/.semalt-ai/sessions/` as JSON files named `<timestamp>-<id>.json`. The `chat` command offers to resume the most recent session (< 24 h old) on startup unless `--new` or `--resume` is passed. Use `/history` in-chat to browse and load any saved session.
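
The resume rule above (offer the most recent session if it is under 24 hours old) can be sketched as a pure function over file names. The source only says files are named `<timestamp>-<id>.json`; treating the timestamp as epoch milliseconds is an assumption, and `pickResumable` is a hypothetical name.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumes names like "1767225600000-ab12.json" (epoch milliseconds, then an id).
// Returns the newest session younger than 24 hours, or null if there is none.
function pickResumable(fileNames, now = Date.now()) {
  let best = null;
  for (const name of fileNames) {
    const m = /^(\d+)-(.+)\.json$/.exec(name);
    if (!m) continue; // ignore files that do not follow the naming scheme
    const ts = Number(m[1]);
    if (ts > now || now - ts >= DAY_MS) continue;
    if (!best || ts > best.ts) best = { name, ts, id: m[2] };
  }
  return best;
}
```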
228
-
229
- ---
230
-
231
- ## Metrics (`lib/metrics.js`)
232
-
233
- `Metrics` is instantiated per `runAgentLoop` call and tracks per-turn token usage, latency, and total session duration. A summary box is printed on exit (SIGINT or natural quit) and after `cmdCode` runs. Use `/compact` in-chat to see the live summary.
96
+ ## Invariants the agent must not violate
97
+
98
+ These are load-bearing. Each was verified against the code at the cited `file:line`.
99
+ Do not weaken them; when adding code, preserve them.
100
+
101
+ 1. **CommonJS only.** All files use `require()`/`module.exports`, never ES
102
+ `import`/`export`. The **sole** exception is the dynamic `import()` inside
103
+ `lib/mcp/boundary.js` — the one bridge to the ESM-only MCP SDK. Do not migrate
104
+ the project to ESM. (`lib/mcp/boundary.js:41,42,92,105,113`.)
105
+
106
+ 2. **Tool output enters context ONLY via `boundToolOutput`** (`lib/agent.js:478`).
107
+ It applies `capToTokens` (per-path budget) and, when `fenced`, the untrusted
108
+ fence. grep/glob, shell, read_file, MCP, subagent, http_get, web_search all
109
+ route through it (`lib/agent.js:546,568,625,691,732,742,865,882`). **A new tool
110
+ gets bounding by routing its output through this chokepoint — not by remembering
111
+ to cap.**
112
+
113
+ 3. **XML and native tool paths converge on one normalized `[action, ...opts]`
114
+ tuple, and guards act there.** Native (`mapInvokeToCall`) and XML
115
+ (`extractToolCalls`) both produce the same `call` tuple, executed in one loop;
116
+ `permissionManager.resolveRule(call)` and the deny gate act on the tuple, so one
117
+ guard covers both rails (`lib/agent.js:1304,1315,1603,1656,1661`).
118
+
119
+ 4. **Untrusted-content fence.** Output from `http_get` / `web_search` / MCP /
120
+ subagent / hook / verify is wrapped in
121
+ `<<<UNTRUSTED_EXTERNAL_CONTENT … >>> … <<<END_UNTRUSTED_EXTERNAL_CONTENT>>>`
122
+ (`lib/agent.js:475-476`, `lib/hooks.js:55-56`, `lib/verify.js`). The system
123
+ prompt instructs the model to treat it as DATA and **never** act on instructions
124
+ inside it (`lib/prompts.js:80-82`). The secondary web-summarizer treats the page
125
+ as data-only too — a page could have steered it.
126
+
127
+ 5. **Destructive-command deny-list at the single `agentExecShell` chokepoint.**
128
+ Every exec/shell — including native git tools (via `_runGit` → `ctx.agentExecShell`),
129
+ lifecycle hooks, and self-verify — funnels through `agentExecShell` (`lib/tools.js:239`)
130
+ which runs `classifyShellCommand` (`lib/deny.js:184`). **Agent-initiated** deny
131
+ hits **hard-block**; **user-initiated** (`!cmd`) only confirm the catastrophic
132
+ subset. Only `--dangerously-skip-permissions` bypasses classification.
133
+
134
+ 6. **The agent can never disable the OS sandbox or widen the network.** No
135
+ tool/flag/config the *model* can reach turns the sandbox off or flips
136
+ no-network back to network — only human CLI flags (`--dangerously-skip-permissions`,
137
+ `--no-network`) or the human-edited `sandbox.*` config. Network is **binary**
138
+ (on / kernel-level none — `--unshare-net` / Seatbelt `(deny network*)`); no host
139
+ proxy / allowlist / TLS interception. Protected config + secret dirs (`~/.semalt-ai`,
140
+ `~/.ssh`/`~/.aws`/`~/.gnupg`, `/etc`, every project `.semalt` dir) are bound
141
+ **read-only inside the jail, including not-yet-existing files**
142
+ (`lib/sandbox.js:59-64,107-116,131-134,382-385,449-452`; `lib/constants.js:328-341`).
143
+
144
+ 7. **Project config can only NARROW.** `.semalt/config.json` is attacker-controllable
145
+ (cloned repos). Permission rules, hooks, and verify are loaded as **separate**
146
+ user/project layers (not the shallow-merged view): project `allow` rules are
147
+ dropped before resolution, and project **command** hooks + `verify.command` are
148
+ **quarantined** (only inert prompt text survives from a project)
149
+ (`lib/permission-rules.js:226-231,367-370`; `lib/hooks.js:114-131`;
150
+ `lib/verify.js:213-222`; `lib/config.js:360-376`).
151
+
152
+ 8. **Secret-file read guard + config-write guard.** File tools refuse reads of
153
+ protected secret files (`isProtectedSecretPath`) and writes into `~/.semalt-ai`
154
+ + project `.semalt` dirs (`isProtectedConfigPath`), **including not-yet-existing
155
+ files**. Neither is overridable by `--allow-anywhere` — only by
156
+ `--dangerously-skip-permissions` (`lib/tools.js:85-89,109-119`;
157
+ `lib/constants.js:328-341`).
158
+
159
+ 9. **Permissions are per-session, never persisted.** `PermissionManager` is created
160
+ fresh per invocation with in-memory state; approvals never hit disk. In **non-TTY**
161
+ mode, calls needing interactive confirmation are **refused** (not auto-approved)
162
+ unless an `--allow-*` tier pre-approved the tag or `--dangerously-skip-permissions`
163
+ is set (`lib/permissions.js:29,38-41,221-236,292-295`).
164
+
165
+ 10. **Tool-tag names stay in sync across all three surfaces.** A load-time parity
166
+ check (`assertToolSpecParity`, `lib/constants.js:449-492`) asserts
167
+ `TAG_REGISTRY` ↔ `TOOL_SPECS` ↔ `TOOL_REGISTRY` and that every entry has both an
168
+ `execute` and a `permission`. The `agent.js` parser and `prompts.js` inventory
169
+ both consume `TAG_REGISTRY`. **Rename a tag atomically in `prompts.js`,
170
+ `agent.js`, `tool_specs.js`, and the registry** or the parity check throws at load.
171
+
172
+ 11. **Checkpoints/rewind cover file-tool mutations ONLY.** `CHECKPOINTABLE_ACTIONS`
173
+ (`lib/checkpoints.js:62-65`) = write/append/edit_file/replace_in_file/delete_file/
174
+ move_file/copy_file/upload. **Shell side effects and git discards (`git_checkout`)
175
+ are NOT reversible** — do not imply `/rewind` covers them. Rewind is
176
+ **human-only**: there is **no rewind tool** in the static/dynamic registry,
177
+ `TOOL_SPECS`, or `TAG_REGISTRY` (`/rewind` and `semalt-code rewind` are the only
178
+ entries).
179
+
180
+ 12. **Subagents/MCP grant no privilege escalation.** A subagent shares the **parent's**
181
+ `permissionManager` (cannot auto-approve what the parent wouldn't) and **cannot
182
+ recurse** (`spawn_agent` is refused/dropped for children). MCP tools **require
183
+ approval by default** (opt-in per server). Both subagent and MCP results are
184
+ **untrusted-fenced and token-capped** before entering context (MCP 10k stricter,
185
+ subagent 20k generous) (`lib/subagents.js:186,297-299,328`; `lib/mcp/client.js:105-110`;
186
+ `lib/constants.js:130-131`).
187
+
188
+ 13. **Minimal, pinned dependencies.** Prefer Node built-ins. Any runtime dep must be
189
+ minimal, justified, **exact-pinned** (no `^`/`~`), and reviewed, with the
190
+ regenerated `package-lock.json` committed in the same PR. Today: only the four
191
+ listed in Tech Stack, all exact-pinned (`package.json`). See `docs/HISTORY.md`
192
+ for the supply-chain policy and rationale.
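The exact-pin rule in invariant 13 can be checked mechanically. The following is a minimal sketch, not code from the package: `findUnpinned` is a hypothetical helper and the package names in the sample are made up. It treats any version spec that is not a plain `MAJOR.MINOR.PATCH` (with optional prerelease or build suffix) as unpinned.

```javascript
'use strict';

// Hypothetical helper (not part of the package): returns the names of
// dependencies whose version spec is not an exact semver pin.
function findUnpinned(dependencies = {}) {
  // Exact pin: MAJOR.MINOR.PATCH with an optional prerelease/build suffix,
  // and no range operators such as ^, ~, >=, * or x.
  const exact = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;
  return Object.entries(dependencies)
    .filter(([, spec]) => !exact.test(spec))
    .map(([name]) => name);
}

// Example with made-up package names.
const sample = { 'pkg-a': '1.2.3', 'pkg-b': '^2.0.0', 'pkg-c': '~0.4.1' };
console.log(findUnpinned(sample)); // → [ 'pkg-b', 'pkg-c' ]
```

A check like this could run over the `dependencies` object of `package.json` in a test, assuming the project wants the rule enforced automatically rather than by review alone.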
234
193
 
235
194
  ---
236
195
 
237
- ## API Client (`lib/api.js`)
238
-
239
- Handles two distinct concerns:
240
-
241
- **Inference** (OpenAI-compatible):
242
- - `chatStream(messages, model, opts)` → streams tokens, calls `onToken`, returns `{ content, usage }`
243
- - URL: `config.api_base` normalized to include `/v1` if missing
244
- - Supports `reasoning_content` field for extended-thinking models
245
-
246
- **Dashboard** (cli.semalt.ai backend):
247
- - `requestCliLogin()` → `POST /api/auth/cli/request`
248
- - `getCliLoginStatus(id, token)` → `POST /api/auth/cli/status`
249
- - `dashboardWhoAmI()` → `GET /api/auth/me`
250
- - `dashboardLogout()` → `POST /api/auth/logout`
251
- - `dashboardListModels()` → `GET /api/models`
252
- - `dashboardGetModelForCli(id)` → `GET /api/models/{id}/cli`
253
- - `dashboardCreateChat(title, modelDbId)` → `POST /api/chats`
254
- - `dashboardListChats()` → `GET /api/chats`
255
- - `dashboardGetChat(id)` → `GET /api/chats/{id}`
256
- - `dashboardSaveMessages(chatId, messages)` → `POST /api/chats/{id}/messages/batch`
196
+ ## Build / Run / Test / Lint / Publish
257
197
 
258
- All dashboard calls send `Authorization: Bearer <auth_token>` from config.
259
-
260
- ---
261
-
262
- ## Config File (`~/.semalt-ai/config.json`)
263
-
264
- Managed by `lib/config.js`. Normalized on every load. The config directory is created automatically if it does not exist.
265
-
266
- ```json
267
- {
268
- "api_base": "http://127.0.0.1:8800",
269
- "api_key": "any",
270
- "dashboard_url": "https://cli.semalt.ai",
271
- "auth_token": "",
272
- "default_model": "default",
273
- "dashboard_model_id": null,
274
- "temperature": 0.7,
275
- "request_timeout_ms": 900000,
276
- "stream": true,
277
- "theme": "dark",
278
- "max_file_size_kb": 512,
279
- "command_timeout_ms": 30000,
280
- "max_output_lines": 50,
281
- "show_token_count": true,
282
- "show_cost": false,
283
- "context_length": null,
284
- "models": [
285
- {
286
- "name": "local-llama",
287
- "api_base": "http://127.0.0.1:11434",
288
- "api_key": "any",
289
- "model": "llama3",
290
- "context_length": 8192
291
- }
292
- ]
293
- }
198
+ ```bash
199
+ node index.js chat # run locally (interactive chat)
200
+ npm test # node --test (the test/ suite)
201
+ npm run lint # node --check over all sources (zero-dep lint)
202
+ npm link # symlink for global use during dev
203
+ npm publish --access public # publish to npm (bump package.json version first)
294
204
  ```
295
205
 
296
- - `api_base` is normalized to always include `/v1`.
297
- - Legacy key `semalt_base_url` is migrated to `api_base` on load.
298
- - `auth_token` is written by `semalt-code login` and cleared by `logout`.
299
- - `dashboard_model_id` is the integer PK of the active model in `available_models`; written when a model is selected via `/models`. Required for chat history sync — if null, history sync is silently skipped.
300
- - `max_file_size_kb` caps how large a file may be before read is refused (default 512 KB).
301
- - `command_timeout_ms` caps shell command execution time (default 30 s).
302
- - `max_output_lines` caps shell and HTTP response lines returned to the agent (default 50).
303
- - `show_token_count` controls whether token count is shown in the status bar.
304
- - `show_cost` reserved for future cost-display feature.
305
- - `context_length` / `models[].context_length` — token limit used for context-usage bar, warnings, and proactive trimming. Self-calibrating: when a request triggers a context-overflow 400 (`"context length is only N"`), `api.js` parses the real window, persists it to `config.context_length` (and to the matching `models[]` entry), and trims to ~90% of it on subsequent calls. The value is never cached in memory only — a restart keeps the learned limit.
306
- - Local `models[]` entries override dashboard models when selected.
307
-
308
- ---
309
-
310
- ## Key Patterns & Invariants
311
-
312
- - **No dependencies**: keep it that way. Any new feature must use Node.js built-ins only.
313
- - **CommonJS**: all files use `require()`/`module.exports`. Do not use ES `import`/`export`.
314
- - **Streaming**: `api.js` manually parses `text/event-stream`. The parser in `chatStream()` handles partial JSON lines — be careful editing it.
315
- - **Permissions are per-session**: `PermissionManager` resets on each CLI invocation. Approvals never persist to disk. In non-TTY mode all tool calls are auto-approved with a warning.
316
- - **Token counting is approximate**: `estimateTokens()` divides char count by 4. It is used only for the `/compact` display — do not rely on it for hard limits.
317
- - **Context trimming is proactive when a limit is known**: `chatStream()` uses the in-process `_sessionInputLimits` learned from a prior 400 overflow first, then falls back to `config.context_length * 0.9`. When neither is set, no pre-flight trim runs and the client relies on the reactive 400/413 handler (which then persists the discovered window). `Metrics.tokenLimitStatus()` returns `{ used, limit: null }` until a limit is learned, so the status bar shows "N tok · limit unknown" instead of hiding the line.
318
- - **Tool output is truncated**: `tools.js` caps output at `max_output_lines` (default 50). Configurable via config.
319
- - **Max 10 agent iterations**: hard-coded in `agent.js`. Prevents runaway loops.
320
- - **Malformed tags are skipped**: each tool dispatch in the agent loop is wrapped in try/catch; errors emit a warning line and continue to the next tool call.
206
+ Version lives in `package.json`; bump it with every published change. CI
207
+ (`.github/workflows/ci.yml`) runs `npm ci` + `npm audit --omit=dev
208
+ --audit-level=high` + lint + the test matrix.
321
209
 
322
210
  ---
323
211
 
324
- ## Development & Publishing
212
+ ## Keeping this file up-to-date
325
213
 
326
- ```bash
327
- # Run locally
328
- node index.js chat
329
-
330
- # Symlink for global use during dev
331
- npm link
332
-
333
- # Publish to npm
334
- npm publish --access public
335
- ```
336
-
337
- Version is in `package.json`. Bump it with every published change.
338
-
339
- ---
214
+ This file is **auto-loaded as project memory and capped at 32 KB** — keep it lean so
215
+ it loads in full. **Runtime-essential operational facts and the invariants above go
216
+ here; rationale, per-task history, per-subsystem deep detail, and the full config/CLI
217
+ reference go in `docs/`** (not auto-loaded). Do not let this file re-bloat.
340
218
 
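The 32 KB cap mentioned above lends itself to a simple guard. The sketch below assumes that "32 KB" means 32 × 1024 bytes measured on disk; the actual loader may count differently, and `checkMemoryFileSize` is a hypothetical name, not a function from the package.

```javascript
'use strict';
const fs = require('node:fs');

// Assumption: the cap is 32 * 1024 bytes of on-disk size.
const MEMORY_FILE_CAP_BYTES = 32 * 1024;

// Hypothetical helper: reports whether a file fits under the cap.
function checkMemoryFileSize(filePath, capBytes = MEMORY_FILE_CAP_BYTES) {
  const { size } = fs.statSync(filePath);
  return { size, capBytes, ok: size <= capBytes };
}
```

A test or CI step could call `checkMemoryFileSize` on the memory file and fail when `ok` is `false`, which would catch re-bloat before it truncates what gets loaded.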
341
- ## Keeping This File Up-to-Date
219
+ Update **this file** when:
220
+ - A new `lib/` module is added (update the Directory Layout one-liner).
221
+ - A **load-bearing invariant** changes — and re-verify the cited `file:line`.
222
+ - The Node version requirement or runtime-dependency set changes.
223
+ - The build/run/test/lint/publish commands change.
342
224
 
343
- Update this file when:
344
- - A new CLI command or slash command is added (update the commands tables).
345
- - A new tool action is added to `tools.js` (update the Tool Operations table).
346
- - The agent loop behavior changes (max iterations, tag format, approval flow).
347
- - A new `lib/` module is added.
348
- - The config schema changes (new keys, renamed keys, migration logic).
349
- - A new dashboard API call is added to `api.js`.
350
- - The system prompt in `prompts.js` changes in a way that affects tool-tag syntax.
351
- - The Node.js version requirement changes.
225
+ Update **`docs/`** when:
226
+ - A subsystem's internals change → `docs/ARCHITECTURE.md`.
227
+ - A config key, CLI flag, slash command, or tool tag/operation changes →
228
+ `docs/CONFIG.md` (and the runtime source: `lib/config.js` / `lib/args.js` /
229
+ `lib/tool_specs.js` / `lib/prompts.js`).
230
+ - A design decision, dependency rationale, or roadmap item changes → `docs/HISTORY.md`.
352
231
 
353
- When renaming or removing a tool tag, update **both** `prompts.js` and `agent.js` atomically and note it here.
232
+ When renaming or removing a tool tag, update **`prompts.js`, `agent.js`,
233
+ `tool_specs.js`, and the registry atomically** (invariant 10) and reflect it in `docs/CONFIG.md`.