npm - @newsails/veil-cli - Versions diffs - 1.0.1 - Mend

@newsails/veil-cli 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (199) hide show

package/.veil/agents/analyst/AGENT.md +21 -0
package/.veil/agents/analyst/agent.json +23 -0
package/.veil/agents/assistant/AGENT.md +15 -0
package/.veil/agents/assistant/agent.json +19 -0
package/.veil/agents/coder/AGENT.md +18 -0
package/.veil/agents/coder/agent.json +19 -0
package/.veil/agents/hello/AGENT.md +5 -0
package/.veil/agents/hello/agent.json +13 -0
package/.veil/agents/writer/AGENT.md +12 -0
package/.veil/agents/writer/agent.json +17 -0
package/.veil/memory/MEMORY.md +343 -0
package/.veil/memory/agents/analyst/MEMORY.md +55 -0
package/.veil/memory/agents/hello/MEMORY.md +12 -0
package/.veil/runtime.pid +1 -0
package/.veil/settings.json +10 -0
package/.veil-studio/studio.db +0 -0
package/.veil-studio/studio.db-shm +0 -0
package/.veil-studio/studio.db-wal +0 -0
package/PLAN/01-vision.md +26 -0
package/PLAN/02-tech-stack.md +94 -0
package/PLAN/03-agents.md +232 -0
package/PLAN/04-runtime.md +171 -0
package/PLAN/05-tools.md +211 -0
package/PLAN/06-communication.md +243 -0
package/PLAN/07-storage.md +218 -0
package/PLAN/08-api-cli.md +153 -0
package/PLAN/09-permissions.md +108 -0
package/PLAN/10-ably.md +105 -0
package/PLAN/11-file-formats.md +442 -0
package/PLAN/12-folder-structure.md +205 -0
package/PLAN/13-operations.md +212 -0
package/PLAN/README.md +23 -0
package/README.md +128 -0
package/REPORT.md +174 -0
package/TODO.md +45 -0
package/ai-tests/FRONTEND_PROMPT.md +220 -0
package/ai-tests/Research & Planning.md +814 -0
package/ai-tests/prompt-001-basic-api.md +230 -0
package/ai-tests/prompt-002-basic-flows.md +230 -0
package/ai-tests/prompt-003-agent-behaviors.md +220 -0
package/api/middleware.js +60 -0
package/api/routes/agents.js +193 -0
package/api/routes/chat.js +93 -0
package/api/routes/completions.js +122 -0
package/api/routes/daemons.js +80 -0
package/api/routes/memory.js +169 -0
package/api/routes/models.js +40 -0
package/api/routes/remote-methods.js +74 -0
package/api/routes/sessions.js +208 -0
package/api/routes/settings.js +108 -0
package/api/routes/system.js +50 -0
package/api/routes/tasks.js +270 -0
package/api/server.js +120 -0
package/cli/formatter.js +70 -0
package/cli/index.js +443 -0
package/cli/parser.js +113 -0
package/config/config.json +10 -0
package/config/models.json +6826 -0
package/core/agent.js +329 -0
package/core/cancel.js +38 -0
package/core/compaction.js +176 -0
package/core/events.js +13 -0
package/core/loop.js +564 -0
package/core/memory.js +51 -0
package/core/prompt.js +185 -0
package/core/queue.js +96 -0
package/core/registry.js +291 -0
package/core/remote-methods.js +124 -0
package/core/router.js +386 -0
package/core/running-sessions.js +18 -0
package/docs/api/01-system.md +84 -0
package/docs/api/02-agents.md +374 -0
package/docs/api/03-chat.md +269 -0
package/docs/api/04-tasks.md +470 -0
package/docs/api/05-sessions.md +444 -0
package/docs/api/06-daemons.md +142 -0
package/docs/api/07-memory.md +186 -0
package/docs/api/08-settings.md +133 -0
package/docs/api/09-models.md +119 -0
package/docs/api/09-websocket.md +350 -0
package/docs/api/10-completions.md +134 -0
package/docs/api/README.md +116 -0
package/docs/guide/01-quickstart.md +220 -0
package/docs/guide/02-folder-structure.md +185 -0
package/docs/guide/03-configuration.md +252 -0
package/docs/guide/04-agents.md +267 -0
package/docs/guide/05-cli.md +290 -0
package/docs/guide/06-tools.md +643 -0
package/docs/guide/07-permissions.md +236 -0
package/docs/guide/08-memory.md +139 -0
package/docs/guide/09-multi-agent.md +271 -0
package/docs/guide/10-daemons.md +226 -0
package/docs/guide/README.md +53 -0
package/docs/index.html +623 -0
package/examples/README.md +151 -0
package/examples/agents/assistant/AGENT.md +31 -0
package/examples/agents/assistant/SOUL.md +9 -0
package/examples/agents/assistant/agent.json +74 -0
package/examples/agents/hello/AGENT.md +15 -0
package/examples/agents/hello/agent.json +14 -0
package/examples/agents/monitor/AGENT.md +51 -0
package/examples/agents/monitor/agent.json +33 -0
package/examples/agents/monitor/heartbeats/monitor.md +24 -0
package/examples/agents/orchestrator/AGENT.md +70 -0
package/examples/agents/orchestrator/agent.json +30 -0
package/examples/agents/researcher/AGENT.md +52 -0
package/examples/agents/researcher/agent.json +49 -0
package/examples/agents/researcher/skills/web-research.md +28 -0
package/examples/skills/code-review.md +72 -0
package/examples/skills/summarise.md +59 -0
package/examples/skills/web-research.md +42 -0
package/examples/tools/word-count/index.js +27 -0
package/examples/tools/word-count/tool.json +18 -0
package/infrastructure/database.js +563 -0
package/infrastructure/scheduler.js +122 -0
package/llm/client.js +206 -0
package/migrations/001-initial.sql +121 -0
package/migrations/002-debuggability.sql +13 -0
package/migrations/003-drop-orphaned-columns.sql +72 -0
package/migrations/004-session-message-token-fields.sql +78 -0
package/migrations/005-session-thinking.sql +5 -0
package/package.json +30 -0
package/schemas/agent.json +143 -0
package/schemas/settings.json +111 -0
package/scripts/fetch-models.js +93 -0
package/session-debug-scenario.md +248 -0
package/settings/fields.js +52 -0
package/system-prompts/base-core.md +7 -0
package/system-prompts/environment.md +13 -0
package/system-prompts/reminders/anti-drift.md +6 -0
package/system-prompts/reminders/stall-recovery.md +10 -0
package/system-prompts/safety-rules.md +25 -0
package/system-prompts/task-heuristics.md +27 -0
package/test/client.js +71 -0
package/test/integration/01-health.test.js +25 -0
package/test/integration/02-agents.test.js +80 -0
package/test/integration/03-chat-hello.test.js +48 -0
package/test/integration/04-chat-multiturn.test.js +61 -0
package/test/integration/05-chat-writer.test.js +48 -0
package/test/integration/06-task-basic.test.js +68 -0
package/test/integration/07-task-tools.test.js +74 -0
package/test/integration/08-task-code-analysis.test.js +69 -0
package/test/integration/09-memory-analyst.test.js +63 -0
package/test/integration/10-task-advanced.test.js +85 -0
package/test/integration/11-sessions-advanced.test.js +84 -0
package/test/integration/12-assistant-chat-tools.test.js +75 -0
package/test/integration/13-edge-cases.test.js +99 -0
package/test/integration/14-cancel.test.js +62 -0
package/test/integration/15-debug.test.js +106 -0
package/test/integration/16-memory-api.test.js +83 -0
package/test/integration/17-settings-api.test.js +41 -0
package/test/integration/18-tool-search-activation.test.js +119 -0
package/test/results/.gitkeep +0 -0
package/test/runner.js +206 -0
package/test/smoke.js +216 -0
package/tools/agent_message.js +85 -0
package/tools/agent_send.js +80 -0
package/tools/agent_spawn.js +44 -0
package/tools/bash.js +49 -0
package/tools/edit_file.js +41 -0
package/tools/glob.js +64 -0
package/tools/grep.js +82 -0
package/tools/list_dir.js +63 -0
package/tools/log_write.js +31 -0
package/tools/memory_read.js +38 -0
package/tools/memory_search.js +65 -0
package/tools/memory_write.js +42 -0
package/tools/read_file.js +48 -0
package/tools/sleep.js +22 -0
package/tools/task_create.js +41 -0
package/tools/task_respond.js +37 -0
package/tools/task_spawn.js +64 -0
package/tools/task_status.js +39 -0
package/tools/task_subscribe.js +37 -0
package/tools/todo_read.js +26 -0
package/tools/todo_write.js +38 -0
package/tools/tool_activate.js +24 -0
package/tools/tool_search.js +24 -0
package/tools/web_fetch.js +50 -0
package/tools/web_search.js +52 -0
package/tools/write_file.js +28 -0
package/ui/api.js +190 -0
package/ui/app.js +281 -0
package/ui/index.html +382 -0
package/ui/views/agents.js +377 -0
package/ui/views/chat.js +610 -0
package/ui/views/connection.js +96 -0
package/ui/views/daemons.js +129 -0
package/ui/views/feed.js +194 -0
package/ui/views/memory.js +263 -0
package/ui/views/models.js +146 -0
package/ui/views/sessions.js +314 -0
package/ui/views/settings.js +142 -0
package/ui/views/tasks.js +415 -0
package/utils/context.js +49 -0
package/utils/id.js +16 -0
package/utils/models.js +88 -0
package/utils/paths.js +213 -0
package/utils/settings.js +172 -0

package/PLAN/02-tech-stack.md ADDED Viewed

@@ -0,0 +1,94 @@
+# 02 — Tech Stack
+## Language
+**Node.js + JSDoc** (no TypeScript). Plain JavaScript with JSDoc type annotations for IDE support and type checking.
+> **v2 Note:** TypeScript is strongly recommended for a future version. Most of the function-signature mismatches, missing exports, and field-name inconsistencies discovered during testing would be caught at compile time.
+## Dependencies (Minimal Set)
+| Concern | Package | Justification |
+|---------|---------|---------------|
+| **LLM Client** | native `fetch` (built-in) | Node 18+ built-in. No SDK version drift. Works with any OpenAI-compatible endpoint (OpenRouter, Anthropic, Ollama, LM Studio, etc.) without external dependencies. |
+| **REST Server** | `express` | Simplest, most maintained, vast middleware ecosystem |
+| **CORS** | `cors` | Required by express for cross-origin requests |
+| **Validation** | `ajv` + `ajv-formats` | JSON Schema validator, compiles schemas to fast native functions. `ajv-formats` required for `date-time` and other format validators. |
+| **Database** | `better-sqlite3` | Fastest SQLite for Node.js, synchronous (optimal for local) |
+| **Cron** | `node-cron` | Lightweight, Unix-native cron syntax |
+| **MCP Client** | TBD (existing lib) | Not custom-built, supports stdio + Streamable HTTP |
+| **CLI Parsing** | `util.parseArgs` | Built-in Node.js (>=18.3), zero external deps |
+**Rejected alternatives:**
+- `openai` SDK — SDK versioning caused `OpenAI is not a constructor` failures (v3 vs v4 API mismatch). Native `fetch` is simpler, portable, and requires no versioning decisions.
+- `nanoid` — was missing from package.json, caused runtime crash. Replaced by `crypto.randomBytes()` from Node's built-in `crypto` module (see `src/utils/id.js`).
+- `fastify` — more complex setup for marginal perf gain
+- `langchain` / `@vercel/ai` — framework lock-in
+- `sqlite3` (async) — slower than sync for local operations
+- `yargs` / `commander` — unnecessary weight when `util.parseArgs` exists
+- `chokidar` — not needed (no file watching for agent discovery)
+## LLM Interface
+Single interface for all providers using native `fetch`:
+```
+base_url + api_key + model_name
+```
+Works with: OpenRouter, Anthropic, OpenAI, Google, Ollama, LM Studio, or any OpenAI-compatible endpoint. No provider-specific code paths. No SDK version to manage.
+```javascript
+// src/llm/client.js — fetch-based, no SDK dependency
+async function callLLM({ baseUrl, apiKey, model, messages, tools }) {
+  const response = await fetch(`${baseUrl}/chat/completions`, {
+    method: 'POST',
+    headers: {
+      'Content-Type': 'application/json',
+      'Authorization': `Bearer ${apiKey}`
+    },
+    body: JSON.stringify({ model, messages, tools })
+  });
+  if (!response.ok) throw new Error(`LLM error: ${response.status}`);
+  return response.json();
+}
+```
+## ID Generation
+Use `crypto.randomBytes()` from Node's built-in `crypto` module. **Do not use `nanoid`** — it must be installed as an external dependency and was the cause of runtime crashes when omitted from `package.json`.
+```javascript
+// src/utils/id.js
+const crypto = require('crypto');
+function generateId(prefix) {
+  return `${prefix}${crypto.randomBytes(8).toString('hex')}`;
+}
+// Usage: generateId('sess_') → 'sess_a1b2c3d4e5f6g7h8'
+```
+## Node.js Version
+Minimum: **Node.js 18.3+** (for native `fetch`, `util.parseArgs`, `node:test`, and `AbortController`).
+**Enforced two ways:**
+1. `package.json` engines field:
+```json
+{ "engines": { "node": ">=18.3.0" } }
+```
+2. Runtime check at startup entry point:
+```javascript
+const [major] = process.version.slice(1).split('.').map(Number);
+if (major < 18) {
+  console.error(`VeilCLI requires Node.js 18+. Current: ${process.version}`);
+  process.exit(1);
+}
+```
+## Dependency Discipline
+- Use **exact versions** in `package.json` (no `^` or `~` for production deps)
+- Run `npm ci` in CI, not `npm install`
+- Audit `package.json` completeness before any commit (ensure all `require()`d packages are listed)
+- All built-in Node.js modules must be listed as comments in `package.json` (`// built-in: crypto, path, fs, util`) to make dependency inventory obvious

package/PLAN/03-agents.md ADDED Viewed

@@ -0,0 +1,232 @@
+# 03 — Agent Architecture
+## Core Concept
+An **agent** is an autonomous AI entity defined by files. **How it's invoked** (chat, task, daemon, subagent) is a **mode**, not a type. The same agent can support multiple modes with different tools, skills, and permissions per mode.
+> **Required location:** Agents MUST be placed in `.veil/agents/<agentName>/`. A directory named `agents/` at the project root is NOT scanned. This is not configurable. Claude Code projects must be imported via `veil import` first.
+## Agent Definition (Two Files)
+Agent folders are **static and read-only** — safe to download from a registry, version in git, or share between projects. All mutable state (memory, heartbeats) lives elsewhere in `.veil/`.
+```
+.veil/agents/<agentName>/
+├── AGENT.md        ← Instructions / system prompt (read-only)
+├── agent.json      ← Configuration (modes, model, tools, permissions)
+├── SOUL.md         ← Optional persona and behavioral identity
+├── skills/         ← Agent-specific skills
+│   └── <skillName>/
+│       ├── SKILL.md
+│       └── scripts/    ← Scripts referenced by the skill
+└── tools/          ← Agent-level custom tools (scoped to this agent only)
+    └── <toolName>/
+        ├── tool.json
+        └── index.js
+```
+### AGENT.md
+Plain markdown. The agent's instructions, rules, and personality. Always loaded into the system prompt. No configuration here — that goes in `agent.json`.
+### agent.json
+All configuration. Defines which modes the agent supports and how each mode behaves.
+> **Validated at load time.** Every `agent.json` is validated against the JSON schema using ajv when the runtime loads an agent. Invalid files cause a startup error with a clear message — the agent will not be available until fixed. This catches field type errors early rather than causing silent failures during LLM calls.
+```json
+{
+  "name": "researcher",
+  "description": "Deep research agent that investigates topics",
+  "model": "anthropic/claude-3.5-sonnet",
+  "temperature": 0.7,
+  "reasoning": "medium",
+  "memory": { "enabled": true, "maxLines": 200 },
+  "skillDiscovery": true,
+  "modes": {
+    "chat": {
+      "enabled": true,
+      "model": "anthropic/claude-3.5-sonnet",
+      "temperature": 0.7,
+      "tools": [],
+      "disallowedTools": [],
+      "skills": [],
+      "autoLoadSkills": [],
+      "mcpServers": [],
+      "allowedAgents": [],
+      "disallowedAgents": [],
+      "permissions": {}
+    },
+    "task": {
+      "enabled": true,
+      "model": "anthropic/claude-3.5-sonnet",
+      "temperature": 0.5,
+      "reasoning": "medium",
+      "tools": [],
+      "disallowedTools": ["bash"],
+      "skills": [],
+      "autoLoadSkills": [],
+      "mcpServers": [],
+      "allowedAgents": [],
+      "disallowedAgents": ["deploy-prod"],
+      "maxIterations": 50,
+      "maxDurationSeconds": 300,
+      "permissions": {}
+    },
+    "subagent": {
+      "enabled": true,
+      "model": "anthropic/claude-3-haiku",
+      "temperature": 0.3,
+      "reasoning": "medium",
+      "tools": ["web_search", "read_file", "grep"],
+      "disallowedTools": [],
+      "skills": ["summarize"],
+      "autoLoadSkills": [],
+      "mcpServers": [],
+      "allowedAgents": [],
+      "disallowedAgents": [],
+      "maxIterations": 20,
+      "maxDurationSeconds": 120,
+      "permissions": {}
+    },
+    "daemon": {
+      "enabled": false,
+      "model": "anthropic/claude-3-haiku",
+      "temperature": 0,
+      "reasoning": "medium",
+      "cron": "*/5 * * * *",
+      "triggers": ["task.completed", "task.failed"],
+      "conflictPolicy": "skip",
+      "alertRouting": ["log", "webhook:https://slack.com/webhook/xxx"],
+      "heartbeatFile": "",
+      "tools": [],
+      "disallowedTools": [],
+      "skills": [],
+      "autoLoadSkills": [],
+      "mcpServers": [],
+      "allowedAgents": [],
+      "disallowedAgents": [],
+      "permissions": {}
+    }
+  }
+}
+```
+**Critical field type rules (these caused real failures during testing):**
+- **`model`** — must be a **string** e.g. `"anthropic/claude-3.5-sonnet"`. NOT an object like `{ "role": "main" }`. Using an object here causes every agent to fail schema validation.
+- **`modes.chat` / `modes.task` / etc.** — must be an **object** with at minimum `{ "enabled": boolean }`. NOT a bare boolean like `true` or `false`. Using a bare boolean here causes every agent to fail schema validation.
+**Field rules:**
+- **`tools`** — non-empty = whitelist (only these tools). Empty = all available.
+- **`disallowedTools`** — non-empty = blacklist (all tools EXCEPT these). Empty = no exclusions.
+- `tools` and `disallowedTools` are mutually exclusive — setting both non-empty is a validation error. Both empty = all tools allowed (not an error).
+- Same whitelist/blacklist logic for `allowedAgents`/`disallowedAgents`. `skills` and `mcpServers` are whitelist-only (empty = all).
+- **`skillDiscovery`** — when `false`, no skill names/descriptions listed in system prompt. Agent is skill-blind unless given via `autoLoadSkills`. Default: `true`.
+- **`memory.enabled`** — enables persistent memory for this agent. Default: `true`.
+- **`memory.maxLines`** — MEMORY.md line cap before auto-export. Default: 200 (from settings.json). Per-agent override.
+- **`mcpServers`** — whitelist of MCP server names. Empty = all configured servers available.
+- **`allowedAgents`/`disallowedAgents`** — controls which agents this agent can spawn/message. Available in all modes. Empty = all agents.
+- **`heartbeatFile`** — path to heartbeat checklist for daemon mode. Default: `.veil/heartbeats/<agentName>.md`.
+- `autoLoadSkills` = skills whose full content is always in the system prompt (not on-demand)
+- `permissions` = per-mode overrides (empty = inherit from settings.json)
+- `model` at root = default for all modes. Per-mode `model` overrides the root value.
+- `temperature` at root = default for all modes. Per-mode `temperature` overrides the root value.
+- `reasoning` at root = default for all modes. Per-mode `reasoning` overrides the root value.
+- **`alertRouting`** — array of routing targets: `"log"`, `"webhook:<url>"`, `"agent:<name>"`, `"ably:<channel>"`. Single string also accepted.
+## `$AGENT_FOLDER` Variable
+The runtime exposes `$AGENT_FOLDER` as a template variable in AGENT.md, SKILL.md, and heartbeat files. Resolved at load time to the **absolute path** of the agent's folder.
+This lets agents and skills reference co-located scripts:
+```markdown
+Use the script at $AGENT_FOLDER/skills/scrape/scripts/url_to_markdown.js to convert URLs to markdown.
+```
+Resolved example: `$AGENT_FOLDER` → `/home/user/project/.veil/agents/scraper`
+Also available as the `VEIL_AGENT_FOLDER` environment variable in:
+- Custom tool execution
+- Skill scripts (referenced in SKILL.md)
+- PreToolUse / PostToolUse hooks
+**Not available in:** MCP tool execution (MCP servers are external processes with their own environment).
+## Example Agents
+The source tree ships an `examples/` directory with reference agents that:
+- Are guaranteed to pass the JSON schema validator
+- Have correct field types (string model, object modes)
+- Are used directly in smoke tests and documentation
+Minimal reference: `examples/agents/hello/` — a simple chat-only agent. Always check this when unsure about the correct format before writing a new `agent.json` from scratch.
+## Invocation Modes
+### Chat Mode
+- **Trigger:** API call `POST /agents/:name/chat`
+- **Behavior:** Synchronous request/response. Stateful via sessions. User sends message, agent responds.
+- **Session:** Created on first message, resumed via session_id.
+- **Completion:** Never "completes" — session stays open until closed.
+### Task Mode
+- **Trigger:** API call `POST /agents/:name/task` or CLI `veil run --agent <name>`
+- **Behavior:** Asynchronous. Agent runs autonomously to completion.
+- **Status flow:** `pending` → `processing` → `waiting` → `finished` / `failed` / `canceled`
+- **Completion:** Agent decides it's done (no more tool calls) or hits limits.
+- **Waiting:** Agent can pause and wait for human input via `task_respond`.
+### Daemon Mode
+- **Trigger:** Cron schedule or **system-wide** event trigger (task.completed, task.failed, agent.error)
+- **System-wide events:** These fire when ANY task in the system completes/fails — not a specific one. The daemon gets woken up to handle it. This is different from `task_subscribe` (which an already-running agent uses to watch a specific task it created).
+- **Behavior:** Each tick: reads heartbeat file (from `.veil/heartbeats/<agentName>.md` or custom path), decides action, outputs result.
+- **Output contract:** Must output `HEARTBEAT_OK` or `HEARTBEAT_ALERT: <message>`
+- **Alert routing:** Configured in agent.json — log, webhook URL, another agent, Ably channel.
+- **Conflict policy:** What to do when tick fires mid-execution:
+  - `skip` — discard this tick, no-op
+  - `queue` — add to FIFO queue (max 10 pending ticks, oldest dropped if exceeded)
+  - `restart` — cancel running tick immediately, start new one
+- **Auto-start:** All agents with `daemon.enabled: true` start on `veil start`.
+### Sub-Agent Mode
+- **Trigger:** Another agent calls `agent_spawn(agent: "<name>")` (sync or async)
+- **Behavior:** Uses `modes.subagent` config for tools/skills/permissions. (`task_create` always uses `modes.task`, not subagent.)
+- **Sync vs Async:** `agent_spawn(wait: true)` blocks until sub-agent completes (default). `agent_spawn(wait: false)` returns a `taskId` immediately — sub-agent runs in background, result retrievable via `task_status`.
+- **Purpose:** An agent may need different behavior when called as a sub-agent vs running independently. For example, a "researcher" agent might have full web_search + bash in chat mode, but only web_search + read_file when spawned as a sub-agent.
+- **Parallel fan-out:** An orchestrator can `agent_spawn(wait: false)` × N to run multiple sub-agents in parallel. Caller is auto-subscribed — notifications arrive as each completes. Results retrieved via `task_status`.
+- **Fallback:** If `modes.subagent` is not defined or not enabled, the runtime falls back to `modes.chat` config. If `modes.chat` is also not enabled, the spawn fails with an error. Fallback emits a warning log.
+## Sub-Agents
+Any agent can be a sub-agent. The `subagent` mode gives fine-grained control over behavior when called by other agents.
+- **`agent_spawn`** — Sync (`wait: true`, default) or async (`wait: false`). Creates an isolated context window using `modes.subagent`. Sync returns condensed result directly. Async returns `taskId` for later retrieval.
+- **`task_create`** — Always asynchronous. Creates a task in the queue using `modes.task`. Parent can continue or wait.
+Built-in agent patterns (all just regular agents with specific tool configs):
+- **Explore** — read-only tools (read_file, grep, glob, list_dir, web_search)
+- **Plan** — read-only + todo_write, no execution tools
+- **Execute** — full tool access
+## Built-in Hidden Agents
+Shipped with the runtime, not visible to users:
+- **compact** — Context compaction (uses `compact` model)
+- **title** — Session title generation (uses `title` model)
+- **summary** — Session summary generation (uses `title` model)
+## Agent Discovery
+Agent resolution order (when an agent is needed by name):
+```
+1. ./.veil/agents/<name>/     ← project-local (user owns, never auto-updated)
+2. ~/.veil/agents/<name>/     ← global (auto-updated if version hash differs)
+3. Veil.com registry          ← remote (if connected, cached locally)
+4. NOT FOUND → error
+```
+**Startup scan:** On `veil start`, the runtime scans agent folders to auto-start daemons (`daemon.enabled: true`) and populate the `GET /agents` list. No file watcher — agents added after startup are discovered on next invocation or restart.
+Global instructions: `.veil/AGENT.md` only. No fallback to project root. Claude Code projects must be converted via `veil import`.
+> **Common mistake:** Placing agents in `./agents/` (visible project root directory) instead of `./.veil/agents/`. The runtime only scans `.veil/agents/` and `~/.veil/agents/`. Agents in the wrong location will silently not appear in `GET /agents`.

package/PLAN/04-runtime.md ADDED Viewed

@@ -0,0 +1,171 @@
+# 04 — Agent Runtime
+## Agentic Loop
+The core execution model. Runs identically for all modes — the mode only determines how the loop is started and how results are returned.
+```
+Load agent definition (AGENT.md + agent.json)
+    ↓
+Compile system prompt (7-layer assembly)
+    ↓
+Send to LLM (via native fetch — OpenAI-compatible API)
+    ↓
+LLM responds with text and/or tool_calls
+    ↓
+┌─ If tool_calls: ──────────────────────────────┐
+│   For each tool call:                          │
+│     1. Drain message queue (inject pending)    │
+│     2. Check permissions (deny/ask/allow)      │
+│     3. Run PreToolUse hook                     │
+│     4. Validate args against JSON Schema (ajv) │
+│     5. Execute tool                            │
+│     6. Run PostToolUse hook                    │
+│     7. Append tool result to messages          │
+│   Loop back to "Send to LLM"                  │
+└────────────────────────────────────────────┘
+┌─ If no tool_calls: ───────────────────────────┐
+│   Chat mode: return response to user           │
+│   Task mode: check if agent is done or stuck   │
+│   Daemon mode: parse HEARTBEAT_OK/ALERT        │
+│   Drain followup queue (inject followup msgs)  │
+└────────────────────────────────────────────┘
+    ↓
+Check limits:
+  - max_iterations reached? → fail
+  - max_duration reached? → fail
+  - context > 85% threshold? → extract memories → trigger compaction → continue
+```
+**Implementation pattern:** Async generators (`async function*`). Each iteration yields a state event, allowing consumers (API, CLI) to observe progress without coupling.
+**Function interface discipline:** All core functions use **object-destructuring parameters**, not positional arguments. This prevents signature mismatch bugs (a real source of failures found in testing) and makes call sites self-documenting:
+```javascript
+// Correct — named params, order-independent:
+async function runChat({ agent, message, sessionId, settings, cwd }) { ... }
+// Wrong — positional, fragile:
+async function runChat(agent, message, options) { ... }
+```
+All public functions in `src/core/`, `src/api/`, and `src/cli/` follow this convention. JSDoc `@param` annotations are required on all exported functions.
+## Model Delegation (v1)
+Three model roles. Each can have a different `base_url` + `api_key` + `model`.
+```json
+{
+  "models": {
+    "main": { "model": "anthropic/claude-3.5-sonnet" },
+    "compact": { "model": "anthropic/claude-3-haiku" },
+    "title": { "model": "anthropic/claude-3-haiku" }
+  }
+}
+```
+| Role | Used For | Typical Model |
+|------|----------|---------------|
+| `main` | Reasoning, planning, tool decisions, file edits (str_replace) | Claude 3.5 Sonnet, GPT-4o |
+| `compact` | Context compaction / summarization | Claude Haiku, GPT-4o-mini |
+| `title` | Session title, session summary | Cheapest available |
+If only `main` is configured, all roles fall back to `main`.
+**LLM calls use native `fetch` directly** — no OpenAI SDK. See `02-tech-stack.md` and `src/llm/client.js`. The `callLLM()` function validates its inputs before calling the API:
+```javascript
+async function callLLM({ baseUrl, apiKey, model, messages, tools = [] }) {
+  if (!Array.isArray(tools)) throw new Error('tools must be an array');
+  if (!Array.isArray(messages) || messages.length === 0) throw new Error('messages must be a non-empty array');
+  // ... fetch call
+}
+```
+Failing fast on bad inputs prevents silent failures where tool calls never reach the LLM.
+**Apply layer deferred to future version.** For v1, `edit_file` uses search-and-replace via the main model (like Claude Code).
+## Context Engineering
+### Auto-Compaction
+- Triggered when context window usage exceeds **85%** (configurable)
+- Hidden `compact` agent summarizes conversation history (runs using the `compact` model role from settings.json)
+- Preserves: key decisions, unresolved issues, file paths, active TODOs, error patterns
+- Result replaces old messages with a single compacted summary message
+- TODOs from `todo_write` are always included in compacted context
+### Pre-Compaction Memory Extraction
+Before compaction runs, the runtime asks the main model:
+> "What key facts, decisions, or learnings from this conversation should be remembered long-term?"
+- Model returns structured memories → runtime writes them to agent's MEMORY.md via `memory_write`
+- Model can return `NO_MEMORY` to skip
+- Only runs if `memory.enabled: true` in agent.json (default: on)
+- This ensures important context survives compaction even if the agent didn't explicitly `memory_write`
+### Context Trimming (applied every iteration, in order)
+**Execution order:**
+1. **Observation masking** (first) — Tool outputs beyond 10 turns old are replaced with `[output hidden]` placeholders. Agent's reasoning text is preserved. Configurable via `settings.json` → `compaction.observationMaskingTurns` (default: 10).
+2. **Tool result clearing** (second) — For messages in turns 1–10 (within masking window), large raw outputs (file reads, bash outputs) are truncated to ~500 chars + `"... [truncated]"`. This keeps recent tool outputs readable but bounded. Messages already masked in step 1 are skipped (already replaced with placeholder).
+3. **Compaction** (third, conditional) — If context still exceeds 85% threshold after steps 1-2, full compaction triggers (see above).
+**Why this order:** Masking is the cheapest operation (placeholder replacement). Clearing is slightly more work (truncation). Compaction is expensive (LLM call). Running them in this order minimizes cost while progressively reducing context.
+### Token Tracking
+Per session/task:
+- Input tokens, output tokens, cache tokens
+- Estimated cost (based on model pricing config)
+- Iteration count, tool call count
+- Stored in SQLite, queryable via API
+## System Prompt Architecture (7 Layers)
+Assembled at session start. Layers 1-2 are static (optimize for prompt caching). Layers 3-7 are dynamic.
+| Layer | Content | Type |
+|-------|---------|------|
+| 1. Base Core | Identity, version, mandate | Static, cached |
+| 2. Safety & Rules | Permission rules, instruction hierarchy | Static, cached |
+| 3. Environment | OS, shell, cwd, git status, MCP servers | Dynamic per session |
+| 4. Agent Instructions | AGENT.md + SOUL.md + Memory files | User-defined |
+| 5. Tools & Capabilities | Tool names + descriptions (progressive disclosure) | Conditional |
+| 6. Task Heuristics | "Read before edit", "no over-engineering", mode-specific tips | Conditional |
+| 7. Session Context | Compacted history, active TODOs, task brief | Dynamic |
+**Token budget:** ~6,400-8,400 tokens target. Hard cap: 12,000 tokens.
+### System Reminders (Mid-Conversation Injections)
+| Reminder | Trigger | Purpose |
+|----------|---------|---------|
+| Anti-drift | Every N iterations | Refocus on original task |
+| Stall recovery | 3+ iterations with no progress | Suggest different approach |
+| Safe-edit | Before write/edit tool | Remind to read first |
+| Plan adherence | When TODO exists | Check progress against plan |
+| Compaction warning | At 75% context | Prepare for imminent compaction |
+### Instruction Priority Hierarchy
+Baked into Layer 2 (Safety & Rules):
+```
+Safety rules (Layer 2) > User's current message > AGENT.md > SOUL.md > MEMORY.md > System defaults
+```
+## Progress Events
+The runtime emits progress events during execution. These are:
+- **Written to SQLite** (queryable via `GET /tasks/:id/events`)
+- **Written to stdout** in one-shot mode (`VEIL_PROGRESS={...}`)
+- **Pushed via Ably** for remote consumers
+**Automatic events** (emitted by runtime, `type: "iteration"`):
+- `iteration.start` — new loop iteration
+- `tool.start` — tool execution beginning
+- `tool.end` — tool execution complete (with duration)
+- `compaction.start` / `compaction.end`
+- `status.change` — task status transition
+**Agent-driven events** (via `log_write` tool, `type: "custom"`):
+- Custom progress messages, research updates, milestones
+- Format is free-form — agents decide what's useful to report
+All events include a `type` field (`"iteration"` or `"custom"`) so API consumers can filter/distinguish runtime progress from agent-specific progress.