npm - open-agents-ai - Versions diffs - 0.187.173 → 0.187.174 - Mend

open-agents-ai 0.187.173 → 0.187.174

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/README.md CHANGED Viewed

@@ -40,7 +40,8 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
 - [Model-Tier Awareness](#model-tier-awareness)
 - [Live Code Knowledge Graph](#live-code-knowledge-graph)
 - [Auto-Expanding Context Window](#auto-expanding-context-window)
-- [Tools (68+)](#tools-68)
+- [Tools (85+)](#tools-85)
+- [Associative Memory & Cross-Modal Binding](#associative-memory--cross-modal-binding)
 - [Ralph Loop — Iteration-First Design](#ralph-loop--iteration-first-design)
 - [Task Control](#task-control)
 - [COHERE Cognitive Framework](#cohere-cognitive-framework)
@@ -923,7 +924,7 @@ On startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an
-## Tools (68)
+## Tools (85+)
 <div align="right"><a href="#top">back to top</a></div>
@@ -1010,11 +1011,24 @@ On startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an
 | **Hardware Access** | |
 | `camera_capture` | Access system cameras — list devices, capture JPEG frames, query capabilities. Uses ffmpeg + v4l2. Supports USB, CSI, and 360 cameras (QooCam, RealSense). Captured images can be piped to vision tools |
 | `audio_capture` | Record from microphone — list input devices, record WAV/MP3 (configurable duration/rate/channels), check real-time mic level (RMS dBFS). Uses arecord + ffmpeg backends |
-| `audio_playback` | Speaker control and TTS — play audio files (WAV/MP3/OGG), text-to-speech via espeak-ng (multi-language), get/set system volume. Uses aplay/ffplay/amixer backends |
+| `audio_playback` | Speaker control and TTS — play audio files (WAV/MP3/OGG), text-to-speech via LuxTTS voice clone (persistent GPU daemon, ~2s synthesis), get/set system volume. Uses aplay/ffplay/amixer backends |
 | `wifi_control` | WiFi network scanning and management — scan nearby networks (SSID, signal, channel, security), list WiFi adapters (built-in + USB dongles), connect/disconnect, check connection status, toggle monitor mode. Auto-detects AC600/RTL8811AU and other USB adapters |
 | `bluetooth_scan` | Bluetooth device discovery — scan for Classic and BLE devices, list HCI adapters, get device info. Uses hcitool/bluetoothctl backends |
 | `sdr_scan` | Software-defined radio scanning — frequency sweeps, ADS-B aircraft tracking (1090 MHz), FM radio capture. Auto-installs rtl-sdr tools when RTL-SDR hardware detected. Uses rtl_power/rtl_fm/dump1090 |
 | `flipper_zero` | Flipper Zero multi-tool control — Sub-GHz scanning (315/433/868/915 MHz), NFC tag reading, 125kHz RFID reading, IR capture, GPIO pin reading, storage browsing. Serial CLI via /dev/ttyACM* |
+| `meshtastic` | Mesh network communication via LoRa — send/receive messages, list nodes, get device info, configure channels. Auto-installs meshtastic CLI in venv, auto-fixes serial permissions via pkexec |
+| `gps_location` | GPS positioning from 45+ USB receivers — auto-detects device, probes NMEA at multiple baud rates. Uses pyserial+pynmea2 for reliable parsing. Returns lat/lon/alt/speed/heading |
+| `audio_analyze` | Audio scene analysis — YAMNet 521-class classification (AudioSet taxonomy), Silero VAD voice activity detection, FFT spectrum analysis with peak frequency detection |
+| `asr_listen` | Record from microphone and transcribe speech to text — combines audio capture + Whisper ASR in one call. Uses PipeWire (bluetooth/USB) → faster-whisper → openai-whisper backends |
+| **Visual Intelligence** | |
+| `visual_memory` | Face recognition + object memory — InsightFace ArcFace 512d face enrollment/identification, CLIP ViT-B/32 object teaching/recognition. Persistent face+object databases in `.open-agents/visual-memory/` |
+| `multimodal_memory` | Cross-modal episode binding — captures face + voice + text + location into unified episodes. Actions: capture (photo+audio), meet (register person with name+face+voice), recall (associative retrieval), timeline (chronological query) |
+| **Associative Memory** | |
+| `episode_store` | SQLite episode store with triple-factor scoring (recency x importance x relevance), 4-class temporal decay (session/daily/procedural/permanent), Ebbinghaus strengthening on retrieval |
+| `temporal_graph` | Temporal knowledge graph with Graphiti-style valid_from/valid_until edges, entity upsert with mention counting, temporal queries, neighbor traversal for context building |
+| `zettelkasten` | A-MEM Zettelkasten note linking — retroactive context evolution, top-3 neighbor discovery via cosine similarity, bidirectional linking |
+| `ppr_retrieval` | HippoRAG Personalized PageRank retrieval — entity extraction, seed node mapping, multi-hop associative traversal over temporal KG, episode scoring |
+| `gist_compressor` | ReadAgent-style trajectory compression — deterministic gist extraction from multi-turn interactions, no LLM needed |
 Read-only tools execute concurrently when called in the same turn. Mutating tools run sequentially.
@@ -1049,7 +1063,7 @@ The agent can access physical hardware — cameras, microphones, and speakers
 | List cameras | `camera_capture` action=list | Discover `/dev/video*` devices |
 | Record audio | `audio_capture` action=record duration=10 | Record 10s WAV from default mic |
 | Check if mic works | `audio_capture` action=level | RMS level in dBFS |
-| Speak aloud | `audio_playback` action=speak text="Hello" | TTS via espeak-ng |
+| Speak aloud | `audio_playback` action=speak text="Hello" | TTS via LuxTTS voice clone |
 | Play a sound file | `audio_playback` action=play file=alert.wav | Play WAV/MP3/OGG |
 | Check volume | `audio_playback` action=volume | Get current volume % |
 | Set volume | `audio_playback` action=volume volume=50 | Set to 50% |
@@ -1067,12 +1081,33 @@ The agent can access physical hardware — cameras, microphones, and speakers
 | Sub-GHz scan | `flipper_zero` action=subghz_scan frequency=433920000 | RF signals |
 | Read NFC tag | `flipper_zero` action=nfc_read | Tag UID, type |
 | Read RFID tag | `flipper_zero` action=rfid_read | 125kHz tag ID |
+| Send mesh message | `meshtastic` action=send message="Hello mesh" | LoRa broadcast |
+| List mesh nodes | `meshtastic` action=nodes | All nodes + signal info |
+| Get GPS location | `gps_location` action=locate | Lat/lon/alt/speed |
+| Analyze audio scene | `audio_analyze` action=classify file="rec.wav" | Top AudioSet classes |
+| Detect voice activity | `audio_analyze` action=vad file="rec.wav" | Speech segments |
+| Listen + transcribe | `asr_listen` action=listen duration=8 | Record + Whisper ASR |
+| Transcribe audio file | `asr_listen` action=transcribe file="rec.wav" | Whisper transcription |
+| Enroll a face | `visual_memory` action=enroll name="Alice" image="photo.jpg" | Face database entry |
+| Identify faces | `visual_memory` action=identify image="photo.jpg" | Known face matches |
+| Teach an object | `visual_memory` action=teach label="coffee_mug" image="obj.jpg" | CLIP object memory |
+| Meet a person | `multimodal_memory` action=meet name="Bob" | Photo+voice+text episode |
+| Recall a person | `multimodal_memory` action=recall query="Bob" | Associative memory search |
+| Event timeline | `multimodal_memory` action=timeline | Chronological episodes |
-**Prerequisites**: `ffmpeg`, `arecord`, `aplay`, `amixer` (ALSA utils), `espeak-ng`, `bluez` (Bluetooth). Install: `sudo apt install ffmpeg alsa-utils espeak-ng bluez`
+**Prerequisites**: `ffmpeg`, `arecord`, `aplay`, `amixer` (ALSA utils), `bluez` (Bluetooth). Install: `sudo apt install ffmpeg alsa-utils bluez`
-**Camera support**: USB cameras (UVC), Intel RealSense (via UVC), 360 cameras (QooCam, Ricoh Theta — raw fisheye via v4l2loopback + ffmpeg crop). The captured frame is returned as base64 JPEG that can be fed directly to the `vision` tool for analysis.
+**Camera support**: USB cameras (UVC), Intel RealSense (via UVC), QooCam 8K 360 via WiFi OSC protocol (auto-discovers hotspot, connects, switches modes, captures frames). Captured frames returned as base64 JPEG for direct piping to `vision` or `visual_memory` tools.
-**Audio workflow**: Record → transcribe → analyze: `audio_capture action=record` → `transcribe_file` → process transcript. The tools handle device enumeration and graceful degradation when hardware is unavailable.
+**Audio workflow**: Record → transcribe → analyze → remember:
+1. `audio_capture action=record` → WAV recording
+2. `asr_listen action=listen` → record + Whisper transcription in one call
+3. `audio_analyze action=classify` → YAMNet scene classification (521 AudioSet classes)
+4. `multimodal_memory action=meet` → bind face + voice + text into persistent episode
+**Mesh/GPS/SDR**: Auto-installs dependencies when hardware is detected. Meshtastic creates a Python venv with the CLI. GPS auto-probes NMEA at multiple baud rates. RTL-SDR auto-blacklists kernel modules and installs udev rules via pkexec.
+**Visual Intelligence**: `visual_memory` provides persistent face recognition (InsightFace ArcFace 512d) and object memory (CLIP ViT-B/32). `multimodal_memory` binds all modalities into cross-session episodes with associative recall.
 ## Ralph Loop — Iteration-First Design
@@ -1561,7 +1596,7 @@ The emotion system is informed by peer-reviewed and preprint research:
 /voice clone overwatch  # Generate clone ref from Overwatch → LuxTTS
 ```
-Auto-downloads the ONNX voice model (~50MB) on first use. Install `espeak-ng` for best quality (`apt install espeak-ng` / `brew install espeak-ng`).
+Auto-downloads the ONNX voice model (~50MB) on first use. LuxTTS is the primary TTS engine with a persistent GPU daemon that keeps the model warm in VRAM for ~2s synthesis latency.
 ### LuxTTS Voice Cloning
@@ -1583,6 +1618,8 @@ Auto-downloads the ONNX voice model (~50MB) on first use. Install `espeak-ng` fo
 - **Pitch** → post-synthesis resampling via `resamplePitch()` (valence+arousal tanh curve)
 - **Volume** → WAV sample scaling (dominance-driven)
+**Persistent GPU daemon**: The `audio_playback` tool runs a persistent LuxTTS daemon process that keeps the ZipVoice model warm in GPU memory (~19GB VRAM). First call starts the daemon (~7s model load), subsequent calls synthesize in ~2s. The daemon communicates via JSON-over-stdin/stdout protocol and caches encoded voice prompts for instant reuse. Falls back to standalone synthesis (~10s) if the daemon stalls.
 Output: 48kHz WAV, compatible with Telegram voice messages and WebSocket streaming.
 ### Narration Engine Architecture
@@ -2477,6 +2514,98 @@ Every completed task is logged to `.oa/trajectories/trajectories.jsonl` with ful
 | **Skill extraction** | Post-task via `/skillify` | Converts corrections into reusable SKILL.md |
+## Associative Memory & Cross-Modal Binding
+<div align="right"><a href="#top">back to top</a></div>
+Open Agents implements a full associative memory system inspired by hippocampal episodic memory research. Every tool call, observation, and interaction is captured as a richly-linked episode that can be retrieved through multi-hop associative traversal — not just keyword search.
+### Architecture
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                    Associative Memory Pipeline                   │
+│                                                                  │
+│  Tool Call → Episode Store → Temporal KG → Zettelkasten Links   │
+│                  │                │              │                │
+│            Triple-Factor    Entity Edges    Neighbor Discovery   │
+│            Scoring          (Graphiti)      (A-MEM cosine)      │
+│                  │                │              │                │
+│                  └───── PPR Retrieval ───────────┘                │
+│                         (HippoRAG)                               │
+│                              │                                   │
+│                    Context Injection (every 3 turns)             │
+└─────────────────────────────────────────────────────────────────┘
+```
+### Episode Store (SQLite)
+Every tool call generates an episode stored in SQLite with WAL journal mode:
+| Field | Description |
+|-------|-------------|
+| `content` | Tool name + args + result summary |
+| `importance` | 0-10 scale (errors=8, file edits=6, reads=3) |
+| `decay_class` | session (1h), daily (1d), procedural (30d), permanent (∞) |
+| `embedding` | 384d vector for semantic similarity |
+| `strength` | Ebbinghaus curve — increases on each retrieval |
+**Scoring**: `score = recency_weight × importance × relevance` — the triple-factor model from [Generative Agents (Park et al., 2023)](https://arxiv.org/abs/2304.03442).
+### Temporal Knowledge Graph
+Entities extracted from tool results form a temporal KG with [Graphiti](https://arxiv.org/abs/2501.13956)-style edges:
+- **Nodes**: files, functions, errors, people, concepts — with `mention_count` and `last_seen`
+- **Edges**: causal relationships (`modifies`, `calls`, `causes_error`, `met_person`) with `valid_from`/`valid_until` temporal bounds
+- **Temporal queries**: "What was the state at time T?" via validity filtering
+### Zettelkasten Linking (A-MEM)
+After embedding computation, each episode discovers its top-3 nearest neighbors by cosine similarity and creates bidirectional links — implementing the [A-MEM Zettelkasten pattern (NeurIPS 2025)](https://arxiv.org/abs/2502.12110). Over time, episodes form a densely connected knowledge graph where context evolves retroactively as new episodes link to old ones.
+### PPR Retrieval (HippoRAG)
+Retrieval uses [Personalized PageRank over the temporal KG](https://arxiv.org/abs/2405.14831):
+1. **Entity extraction** from the current query
+2. **Seed node mapping** — find KG nodes matching query entities
+3. **PPR diffusion** — importance flows along edges with damping factor α=0.15
+4. **Episode scoring** — episodes connected to high-PPR nodes are ranked
+5. **Context injection** — top episodes injected every 3 turns as `[ASSOCIATIVE MEMORY]` context
+This enables multi-hop retrieval: asking about "the auth bug" can surface episodes about the specific file, the test that caught it, and the person who reported it — even if those episodes don't share keywords.
+### Cross-Modal Binding
+The `multimodal_memory` tool binds face, voice, text, and location into unified episodes:
+```
+meet("Cole") → {
+  face: InsightFace ArcFace 512d embedding,
+  voice: Whisper transcription of spoken name,
+  photo: CLIP ViT-B/32 768d scene embedding,
+  text: "My name is Cole",
+  episode_id: shared across all modalities,
+  timestamp: ISO-8601
+}
+```
+**Recall** uses the shared `episode_id` to retrieve all modalities at once. CLIP embeddings enable visual queries ("who was in the photo with the whiteboard?") and face embeddings enable identity queries ("when did I last see Cole?").
+### Gist Compression
+Post-task, the [ReadAgent](https://arxiv.org/abs/2402.09727) gist compressor creates deterministic summaries of multi-turn trajectories (>10 turns), preserving key decisions and outcomes while discarding redundant intermediate steps. No LLM needed — uses extractive heuristics.
+### Near-Critical Cognitive Architecture
+The associative memory integrates with a near-critical cognitive framework inspired by [Beggs & Plenz (2003)](https://doi.org/10.1523/JNEUROSCI.23-35-11167.2003) neuronal avalanche dynamics:
+- **Auto-consolidation**: At task boundaries, the system writes consolidation snapshots to `.oa/consolidations/` with lessons learned and key patterns
+- **Provenance KG**: Every agent action is tracked in `.oa/provenance/` for full action traceability
+- **Homeostasis modulation**: Error rate drives exploration guidance — high error rates inject more careful approaches, low error rates encourage bolder exploration
+- **Error pattern learning**: Recurring error patterns are detected, stored globally in `~/.open-agents/error-patterns.json`, and injected as `[LEARNED FROM EXPERIENCE]` guidance before similar actions in future sessions
 ## Dream Mode — Creative Idle Exploration
@@ -3371,16 +3500,22 @@ The COHERE collective intelligence system, self-play idle loop, identity evoluti
 | Hyperagents: Self-Referential Meta-Improvement | [2603.19461](https://arxiv.org/abs/2603.19461) | Mar 2026 | D6: Recursive meta-improvement |
 | STOP: Self-Taught Optimizer | [2310.02304](https://arxiv.org/abs/2310.02304) | COLM 2024 | D6: Scaffold self-improvement |
-### Memory & Identity
+### Memory, Identity & Associative Retrieval
 | Paper | ArXiv | Venue | Used In |
 |-------|-------|-------|---------|
 | MemoryOS: Memory Operating System | [2506.06326](https://arxiv.org/abs/2506.06326) | EMNLP 2025 Oral | D3: Three-tier consolidation |
-| A-MEM: Agentic Memory (Zettelkasten) | [2502.12110](https://arxiv.org/abs/2502.12110) | NeurIPS 2025 | D3: Retroactive narrative |
+| A-MEM: Agentic Memory (Zettelkasten) | [2502.12110](https://arxiv.org/abs/2502.12110) | NeurIPS 2025 | Zettelkasten linking, retroactive context evolution |
+| HippoRAG: Neurobiological Retrieval | [2405.14831](https://arxiv.org/abs/2405.14831) | NeurIPS 2024 | PPR retrieval over temporal KG |
+| Generative Agents: Interactive Simulacra | [2304.03442](https://arxiv.org/abs/2304.03442) | UIST 2023 | Triple-factor scoring (recency × importance × relevance) |
+| Graphiti: Temporal Knowledge Graphs | [2501.13956](https://arxiv.org/abs/2501.13956) | Jan 2025 | Temporal edges with valid_from/valid_until |
+| ReadAgent: Gist Memories | [2402.09727](https://arxiv.org/abs/2402.09727) | Feb 2024 | Post-task trajectory compression |
+| RGMem: Phase-Transition Memory | — | — | Phase-transition threshold θ_inf=3 |
 | MemRL: Runtime RL on Episodic Memory | [2601.03192](https://arxiv.org/abs/2601.03192) | Jan 2026 | D3: Value-based retrieval |
 | Memory-R1: RL Memory Manager | [2508.19828](https://arxiv.org/abs/2508.19828) | Jan 2026 | D3: ADD/UPDATE/DELETE ops |
 | ExpeL: Experiential Learning | [2308.10144](https://arxiv.org/abs/2308.10144) | AAAI 2024 | D2: Insight extraction |
 | Experiential Reflective Learning | [2603.24639](https://arxiv.org/abs/2603.24639) | Mar 2026 | D2: Heuristics > trajectories |
 | EvoSkill: Automated Skill Discovery | [2603.02766](https://arxiv.org/abs/2603.02766) | Mar 2026 | D2+D4: Pareto + zero-shot transfer |
+| JARVIS-1: Open-World Multi-Modal Agent | [2311.05997](https://arxiv.org/abs/2311.05997) | NeurIPS 2023 | Cross-modal CLIP retrieval pattern |
 ### Collective Identity & Emergence
 | Paper | ArXiv | Venue | Used In |

package/dist/index.js CHANGED Viewed

@@ -254788,13 +254788,32 @@ import { homedir as homedir12, tmpdir as tmpdir10 } from "node:os";
 function ensureLuxttsDaemon() {
   if (_luxttsDaemon && !_luxttsDaemon.killed && _luxttsReady)
     return Promise.resolve(true);
+  if (_luxttsDaemon && !_luxttsDaemon.killed) {
+    try {
+      _luxttsDaemon.kill("SIGTERM");
+    } catch {
+    }
+    _luxttsDaemon = null;
+    _luxttsReady = false;
+  }
+  if (_luxttsStarting)
+    return Promise.resolve(false);
   const venvPy = join46(homedir12(), ".open-agents", "voice", "luxtts-venv", "bin", "python3");
   const inferScript = join46(homedir12(), ".open-agents", "voice", "luxtts-infer.py");
   const repoDir = join46(homedir12(), ".open-agents", "voice", "LuxTTS");
   if (!existsSync30(venvPy) || !existsSync30(inferScript))
     return Promise.resolve(false);
+  _luxttsStarting = true;
   return new Promise((resolve39) => {
     const timeout2 = setTimeout(() => {
+      _luxttsStarting = false;
+      if (_luxttsDaemon && !_luxttsReady) {
+        try {
+          _luxttsDaemon.kill("SIGTERM");
+        } catch {
+        }
+        _luxttsDaemon = null;
+      }
       resolve39(false);
     }, 12e4);
     const daemon = spawn13(venvPy, [inferScript], {
@@ -254816,6 +254835,7 @@ function ensureLuxttsDaemon() {
           const msg = JSON.parse(line);
           if (msg.type === "ready") {
             _luxttsReady = true;
+            _luxttsStarting = false;
             clearTimeout(timeout2);
             resolve39(true);
           } else if (msg.type === "result" && msg.id) {
@@ -254838,10 +254858,12 @@ function ensureLuxttsDaemon() {
     daemon.on("exit", () => {
       _luxttsDaemon = null;
       _luxttsReady = false;
+      _luxttsStarting = false;
     });
     daemon.on("error", () => {
       _luxttsDaemon = null;
       _luxttsReady = false;
+      _luxttsStarting = false;
       clearTimeout(timeout2);
       resolve39(false);
     });
@@ -254873,7 +254895,7 @@ function luxttsSynthesize(text, cloneRef) {
     _luxttsDaemon.stdin.write(req2 + "\n");
   });
 }
-var _luxttsDaemon, _luxttsReady, _luxttsRequestId, _luxttsPending, _luxttsBuffer, AudioPlaybackTool;
+var _luxttsDaemon, _luxttsReady, _luxttsRequestId, _luxttsPending, _luxttsBuffer, _luxttsStarting, AudioPlaybackTool;
 var init_audio_playback = __esm({
   "packages/execution/dist/tools/audio-playback.js"() {
     "use strict";
@@ -254882,9 +254904,10 @@ var init_audio_playback = __esm({
     _luxttsRequestId = 0;
     _luxttsPending = /* @__PURE__ */ new Map();
     _luxttsBuffer = "";
+    _luxttsStarting = false;
     AudioPlaybackTool = class {
       name = "audio_playback";
-      description = "Play audio through speakers or use text-to-speech. Actions: 'play' to play an audio file (WAV/MP3/OGG \u2014 including recordings from memory episodes), 'speak' to convert text to speech (uses LuxTTS voice clone if available, falls back to Kokoro/piper/espeak), 'volume' to get or set system volume, 'list' to enumerate audio output devices. Use this to communicate audibly, play sounds, replay recorded audio from memory episodes, or control speaker volume. To replay a memory episode's audio, use the recording path from multimodal_memory recall.";
+      description = "Play audio through speakers or use text-to-speech. Actions: 'play' to play an audio file (WAV/MP3/OGG \u2014 including recordings from memory episodes), 'speak' to convert text to speech (uses LuxTTS voice clone with persistent GPU daemon for fast synthesis), 'volume' to get or set system volume, 'list' to enumerate audio output devices. Use this to communicate audibly, play sounds, replay recorded audio from memory episodes, or control speaker volume. To replay a memory episode's audio, use the recording path from multimodal_memory recall.";
       parameters = {
         type: "object",
         properties: {
@@ -255001,6 +255024,38 @@ var init_audio_playback = __esm({
           } catch {
           }
         }
+        if (cloneRef) {
+          const venvPy = join46(homedir12(), ".open-agents", "voice", "luxtts-venv", "bin", "python3");
+          const repoDir = join46(homedir12(), ".open-agents", "voice", "LuxTTS");
+          if (existsSync30(venvPy)) {
+            try {
+              const outFile = join46(tmpdir10(), `oa-luxtts-standalone-${Date.now()}.wav`);
+              const safeText2 = text.replace(/\\/g, "\\\\").replace(/'/g, "\\'").replace(/"/g, '\\"');
+              const pyScript = [
+                `import sys; sys.path.insert(0, "${repoDir}")`,
+                `import torch, numpy as np, wave`,
+                `from zipvoice.luxvoice import LuxTTS`,
+                `dev = "cuda" if torch.cuda.is_available() else "cpu"`,
+                `tts = LuxTTS(model_path="YatharthS/LuxTTS", device=dev, threads=4)`,
+                `enc = tts.encode_prompt("${cloneRef}", duration=5, rms=0.001)`,
+                `wav = tts.generate_speech("${safeText2}", enc, num_steps=4, guidance_scale=3.0, t_shift=0.5, speed=${speed > 100 ? (speed / 160).toFixed(2) : "1.0"})`,
+                `d = (np.clip(wav.cpu().numpy().squeeze(), -1, 1) * 32767).astype(np.int16)`,
+                `f = wave.open("${outFile}", "wb"); f.setnchannels(1); f.setsampwidth(2); f.setframerate(48000); f.writeframes(d.tobytes()); f.close()`,
+                `print("OK")`
+              ].join("; ");
+              execSync29(`"${venvPy}" -c '${pyScript}' 2>/dev/null`, { timeout: 6e4, stdio: "pipe" });
+              if (existsSync30(outFile)) {
+                execSync29(`aplay -q "${outFile}" 2>/dev/null || ffplay -nodisp -autoexit -loglevel error "${outFile}"`, { timeout: 6e4, stdio: "pipe" });
+                try {
+                  unlinkSync7(outFile);
+                } catch {
+                }
+                return { success: true, output: `Spoke via LuxTTS (voice clone, standalone): ${text.length} chars`, durationMs: performance.now() - start2 };
+              }
+            } catch {
+            }
+          }
+        }
         try {
           execSync29("which piper", { stdio: "pipe", timeout: 2e3 });
           const outFile = join46(tmpdir10(), `oa-tts-${Date.now()}.wav`);
@@ -255019,7 +255074,7 @@ var init_audio_playback = __esm({
         try {
           execSync29("which espeak-ng", { stdio: "pipe", timeout: 2e3 });
         } catch {
-          return { success: false, output: "", error: "No TTS available", durationMs: performance.now() - start2 };
+          return { success: false, output: "", error: "No TTS engine available. Install LuxTTS or piper.", durationMs: performance.now() - start2 };
         }
         const safeText = text.replace(/'/g, "'\\''");
         try {
@@ -255027,7 +255082,7 @@ var init_audio_playback = __esm({
         } catch (err) {
           return { success: false, output: "", error: `TTS failed: ${err instanceof Error ? err.message : String(err)}`, durationMs: performance.now() - start2 };
         }
-        return { success: true, output: `Spoke via espeak-ng: ${text.length} chars`, durationMs: performance.now() - start2 };
+        return { success: true, output: `Spoke via espeak-ng (fallback): ${text.length} chars`, durationMs: performance.now() - start2 };
       }
       controlVolume(args, start2) {
         const targetVolume = args["volume"];
@@ -296201,7 +296256,7 @@ Error: ${err2 instanceof Error ? err2.message : String(err2)}`
                 if (brewResult.stdout) process.stdout.write(brewResult.stdout);
                 if (brewResult.stderr) {
                   const { renderVerbose: renderVerbose2 } = await Promise.resolve().then(() => (init_render(), render_exports));
-                  renderVerbose2(brewResult.stderr);
+                  renderVerbose2(brewResult.stderr.toString());
                 }
                 const llvmPrefix = spawnSync3("brew", ["--prefix", "llvm"], { stdio: "pipe", timeout: 5e3 });
                 if (llvmPrefix.stdout) {
@@ -296238,7 +296293,7 @@ Error: ${err2 instanceof Error ? err2.message : String(err2)}`
                 if (aptResult.stdout) process.stdout.write(aptResult.stdout);
                 if (aptResult.stderr) {
                   const { renderVerbose: renderVerbose2 } = await Promise.resolve().then(() => (init_render(), render_exports));
-                  renderVerbose2(aptResult.stderr);
+                  renderVerbose2(aptResult.stderr.toString());
                 }
               } else {
                 renderWarning("  sudo not available \u2014 skipping system build deps. librosa/lhotse may fail to compile.");

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "open-agents-ai",
-  "version": "0.187.173",
+  "version": "0.187.174",
   "description": "AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop",
   "type": "module",
   "main": "./dist/index.js",
@@ -93,5 +93,5 @@
     "node-pty": "^1.1.0",
     "viem": "^2.47.6"
   },
-  "readme": "<a name=\"top\"></a>\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/robit-man/openagents.nexus/main/openagents-banner.png\" alt=\"Open Agents P2P Network\" width=\"100%\" />\n</p>\n<h1 align=\"center\">Open Agents — P2P Inference</h1>\n\n<p align=\"center\">\n  <strong>AI coding agent powered entirely by open-weight models.</strong><br>\n  No API keys. No cloud. Your code never leaves your machine.\n</p>\n\n<p align=\"center\">\n  <a href=\"https://www.npmjs.com/package/open-agents-ai\"><img src=\"https://img.shields.io/npm/v/open-agents-ai?color=7C3AED&style=flat-square\" alt=\"npm version\" /></a>\n  <a href=\"https://www.npmjs.com/package/open-agents-ai\"><img src=\"https://img.shields.io/npm/dm/open-agents-ai?color=06B6D4&style=flat-square\" alt=\"npm downloads\" /></a>\n  <img src=\"https://img.shields.io/badge/license-CC--BY--NC--4.0-10B981?style=flat-square\" alt=\"license\" />\n  <img src=\"https://img.shields.io/badge/node-%3E%3D20-F59E0B?style=flat-square\" alt=\"node version\" />\n  <img src=\"https://img.shields.io/badge/models-open--weight-EC4899?style=flat-square\" alt=\"open-weight models\" />\n  <a href=\"https://x.com/intent/post?url=https%3A%2F%2Fwww.npmjs.com%2Fpackage%2Fopen-agents-ai\"><img src=\"https://img.shields.io/badge/SHARE%20ON%20X-000000?style=for-the-badge&logo=x&logoColor=white\" alt=\"Share on X\" /></a>\n</p>\n\n---\n\n```bash\nnpm i -g open-agents-ai && oa\n```\n\nAn autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and configures the optimal model with expanded context window automatically.\n\n\n## Table of Contents\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n- [The Organism, Not the Cortex](#the-organism-not-the-cortex)\n- [How It Works](#how-it-works)\n- [Features](#features)\n- [Enterprise & Headless Mode](#enterprise--headless-mode)\n- [Architecture](#architecture)\n- [Context Engineering](#context-engineering)\n- [Model-Tier Awareness](#model-tier-awareness)\n- [Live Code Knowledge Graph](#live-code-knowledge-graph)\n- [Auto-Expanding Context Window](#auto-expanding-context-window)\n- [Tools (68+)](#tools-68)\n- [Ralph Loop — Iteration-First Design](#ralph-loop--iteration-first-design)\n- [Task Control](#task-control)\n- [COHERE Cognitive Framework](#cohere-cognitive-framework)\n- [Context Compaction — Research-Backed Memory Management](#context-compaction--research-backed-memory-management)\n- [Personality Core — SAC Framework Style Control](#personality-core--sac-framework-style-control)\n- [Emotion Engine — Affective State Modulation](#emotion-engine--affective-state-modulation)\n- [Voice Feedback (TTS)](#voice-feedback-tts)\n- [Listen Mode — Live Bidirectional Audio](#listen-mode--live-bidirectional-audio)\n- [Vision & Desktop Automation (Moondream)](#vision--desktop-automation-moondream)\n- [Interactive TUI](#interactive-tui)\n- [Telegram Bridge — Sub-Agent Per Chat](#telegram-bridge--sub-agent-per-chat)\n- [x402 Payment Rails & Nexus P2P](#x402-payment-rails--nexus-p2p)\n- [Sponsored Inference — Share Your GPU With the World](#sponsored-inference--share-your-gpu-with-the-world)\n- [COHERE Distributed Mind](#cohere-distributed-mind)\n- [Self-Improvement & Learning](#self-improvement--learning)\n- [Dream Mode — Creative Idle Exploration](#dream-mode--creative-idle-exploration)\n- [Blessed Mode — Infinite Warm Loop](#blessed-mode--infinite-warm-loop)\n- [Docker Sandbox & Collective Intelligence](#docker-sandbox--collective-intelligence)\n- [Code Sandbox](#code-sandbox)\n- [Structured Data Tools](#structured-data-tools)\n- [Multi-Provider Web Search](#multi-provider-web-search)\n- [Task Templates](#task-templates)\n- [Human Expert Speed Ratio](#human-expert-speed-ratio)\n- [Cost Tracking & Session Metrics](#cost-tracking--session-metrics)\n- [Configuration](#configuration)\n- [Model Support](#model-support)\n- [Supported Inference Providers](#supported-inference-providers)\n- [Evaluation Suite](#evaluation-suite)\n- [AIWG Integration](#aiwg-integration)\n- [Research Citations](#research-citations)\n- [License](#license)\n\n\n\n## The Organism, Not the Cortex\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nAn LLM is a high-bandwidth associative generative core — closer to a cortex-like prior than to a complete agent. Its weights contain broad latent structure, but they do not by themselves give you situated continuity, durable task state, calibrated action policies, or grounded memory management. Open Agents treats the model as one organ inside a larger organism. The framework provides the rest: sensors, effectors, memory stores, routing, gating, evaluation, and persistence.\n\n**What the framework provides:**\n\n| Layer | Biological Analog | Implementation |\n|---|---|---|\n| Associative core | Cortex | LLM weights (any size) |\n| Current workspace | Global workspace / attention | `assembleContext()` — structured context assembly |\n| Episodic memory | Hippocampus | `.oa/memory/` — write, search, retrieve across sessions |\n| Cognitive map | Hippocampal spatial maps | `semantic-map.ts` + `repo-map.ts` (PageRank) |\n| Action gating | Basal ganglia | Tool selection policy (task-aware filtering) |\n| Temporal hierarchy | Prefrontal executive | Task decomposition, sub-agent delegation |\n| Self-model | Metacognition | Environment snapshot, process health monitoring |\n| Skill chunks | Cerebellum | Compiled tools, slash commands, verified routines |\n| Safety / limits | Autonomic / immune system | Turn limits, budgets, timeout watchdogs |\n\nDon't chase larger models. Build the organism around whatever model you have.\n\n\n\n\n## How It Works\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n```\nYou: oa \"fix the null check in auth.ts\"\n\nAgent: [Turn 1] file_read(src/auth.ts)\n       [Turn 2] grep_search(pattern=\"null\", path=\"src/auth.ts\")\n       [Turn 3] file_edit(old_string=\"if (user)\", new_string=\"if (user != null)\")\n       [Turn 4] shell(command=\"npm test\")\n       [Turn 5] task_complete(summary=\"Fixed null check — all tests pass\")\n```\n\nThe agent uses tools autonomously in a loop — reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.\n\n\n\n\n## Features\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n- **61 autonomous tools** — file I/O, shell, grep, web search/fetch/crawl, memory (read/write/search), sub-agents, background tasks, image/OCR/PDF, git, diagnostics, vision, desktop automation, browser automation, temporal agency (scheduler/reminders/agenda), structured files, code sandbox, transcription, skills, opencode delegation, cron agents, nexus P2P networking + x402 micropayments, **COHERE cognitive stack** (persistent REPL, recursive LLM calls, memory metabolism, identity kernel, reflection, exploration)\n- **Moondream vision** — see and interact with the desktop via Moondream VLM (caption, query, detect, point-and-click)\n- **Desktop automation** — vision-guided clicking: describe a UI element in natural language, the agent finds and clicks it\n- **Auto-install desktop deps** — screenshot, mouse, OCR, and image tools auto-install missing system packages (scrot, xdotool, tesseract, imagemagick) on first use\n- **Parallel tool execution** — read-only tools run concurrently via `Promise.allSettled`\n- **Sub-agent delegation** — spawn independent agents for parallel workstreams\n- **OpenCode delegation** — offload coding tasks to opencode (sst/opencode) as an autonomous sub-agent with auto-install, progress monitoring, and result evaluation\n- **Long-horizon cron agents** — schedule recurring autonomous agent tasks with goals, completion criteria, execution history, and automatic evaluation (daily code reviews, weekly dep updates, continuous monitoring)\n- **Nexus P2P networking** — decentralized agent-to-agent communication via [open-agents-nexus](https://www.npmjs.com/package/open-agents-nexus). Join rooms, discover peers, share resources, and communicate across the agent mesh with encrypted P2P transport\n- **x402 micropayments** — native x402 payment rails via open-agents-nexus@1.5.6. Agents create secp256k1/EVM wallets (AES-256-GCM encrypted, keys never exposed to LLM), register inference with USDC pricing on Base, auto-handle `payment_required`/`payment_proof` negotiation, track earnings/spending in ledger.jsonl, enforce budget policies, and sign gasless EIP-3009 transfers\n- **Inference capability proof** — benchmark local models with anti-spoofing SHA-256 hashed proofs, generate capability scorecards for peer verification\n- **Ralph Loop** — iterative task execution that keeps retrying until completion criteria are met\n- **Dream Mode** — creative idle exploration modeled after real sleep architecture (NREM→REM cycles)\n- **COHERE Cognitive Stack** — layered cognitive architecture implementing [Recursive Language Models](https://arxiv.org/abs/2512.24601), [SPRINT parallel reasoning](https://arxiv.org/abs/2506.05745), governed memory metabolism, identity kernel with continuity register, immune-system reflection, [strategy-space exploration](https://arxiv.org/abs/2603.02045), and **distributed inference mesh** — any `/cohere` participant automatically serves AND consumes inference from the network with complexity-based model routing, multi-node claim coordination, IPFS-pinned identity persistence, model exposure control, and Ollama safety hardening. See [COHERE Framework](#cohere-cognitive-framework) below\n- **Persistent Python REPL** — `repl_exec` tool maintains variables, imports, and functions across calls. Write Python code that processes data iteratively, with `llm_query()` available for recursive LLM sub-calls from within code\n- **Recursive LLM calls** — `llm_query(prompt, context)` invokes the model from inside REPL code, enabling loop-based semantic analysis of large inputs ([RLM paper](https://arxiv.org/abs/2512.24601)). `parallel_llm_query()` runs multiple calls concurrently ([SPRINT](https://arxiv.org/abs/2506.05745))\n- **Memory metabolism** — governed memory lifecycle: classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence), consolidate lessons from trajectories. Inspired by [TIMG](https://arxiv.org/abs/2603.10600) and [MemMA](https://arxiv.org/abs/2603.18718)\n- **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.oa/identity/`\n- **Reflection & integrity** — immune-system audit: diagnostic (\"what's wrong?\"), epistemic (\"what evidence is missing?\"), constitutional (\"should this change become part of self?\"). Inspired by [LEAFE](https://arxiv.org/abs/2603.16843) and [RewardHackingAgents](https://arxiv.org/abs/2603.11337)\n- **Exploration & culture** — ARCHE strategy-space exploration: generate competing hypotheses, archive successful variants, retrieve past strategies. Inspired by [SGE](https://arxiv.org/abs/2603.02045) and [Darwin Gödel Machine](https://arxiv.org/abs/2505.22954)\n- **Autoresearch Swarm** — 5-agent GPU experiment loop during REM sleep: Researcher, Monitor, Evaluator, Critic, Flow Maintainer autonomously run ML training experiments, keep improvements, discard regressions\n- **Live Listen** — bidirectional voice communication with real-time Whisper transcription\n- **Live Voice Session** — `/listen` with `/voice` enabled spawns a cloudflared tunnel with a real-time WebSocket audio endpoint. A floating presence UI shows live transcription, connected users, and audio visualization. Echo cancellation prevents TTS feedback loops\n- **Call Sub-Agent** — each WebSocket caller gets a dedicated AgenticRunner for low-latency voice-to-voice loops, with admin/public access tiers and bidirectional activity sharing with the main agent\n- **Telegram Voice** — `/voice` enabled via Telegram forwards TTS audio as voice messages alongside text responses. Incoming voice messages are auto-transcribed and handled as text\n- **Neural TTS** — hear what the agent is doing via GLaDOS, Overwatch, Kokoro, or LuxTTS voice clone, with literature-grounded narration engine (sNeuron-TST structure rotation, Moshi ring buffer dedup, UDDETTS emotion-driven prosody, SEST metadata, LuxTTS flow-matching voice cloning)\n- **Personality Core** — SAC framework-based style control (concise/balanced/verbose/pedagogical) that shapes agent response depth, voice expressiveness, and system prompt behavior\n- **Human expert speed ratio** — real-time `Exp: Nx` gauge comparing agent speed to a leading human expert, calibrated across 47 tool baselines\n- **Cost tracking** — real-time token cost estimation for 15+ cloud providers\n- **Work evaluation** — LLM-as-judge scoring with task-type-specific rubrics\n- **Session metrics** — track turns, tool calls, tokens, files modified, tasks completed per session\n- **Structured file generation** — create CSV, TSV, JSON, Markdown tables, and Excel-compatible files\n- **Code sandbox** — isolated code execution in subprocess or Docker (JS, Python, Bash, TypeScript)\n- **Structured file reading** — parse CSV, TSV, JSON, Markdown tables with binary format detection\n- **Multi-provider web search** — DuckDuckGo (free), Tavily (structured), Jina AI (markdown) with auto-detection\n- **Browser automation** — headless Chrome control via Selenium: navigate, click, type, screenshot, read DOM — auto-starts on first use with self-bootstrapping Python venv\n- **Temporal agency** — schedule future tasks via OS cron, set cross-session reminders, flag attention items — startup injection surfaces due items automatically\n- **Web crawling** — multi-page web scraping with Crawlee/Playwright for deep documentation extraction\n- **Task templates** — specialized system prompts and tool recommendations for code, document, analysis, plan tasks\n- **Inference capability scoring** — canirun.ai-style hardware assessment at first launch: memory/compute/speed scores, per-model compatibility matrix, recommended model selection\n- **Auto-install everything** — first-run wizard auto-installs Ollama, curl, Python3, python3-venv with platform-aware package managers (apt, dnf, yum, pacman, apk, zypper, brew)\n- **Sponsored inference** — `/sponsor` walks through a 5-step wizard to share your GPU with the world: select endpoints, choose banner animation (8 presets + AI-generated custom), set header message/links, configure transport (cloudflared/libp2p) + rate limits, and go live. Consumers discover sponsors via `/endpoint sponsor`. Secure proxy relay with per-IP rate limiting, daily token budgets, model allowlist, and concurrent request caps. Sponsor's raw API URL is never exposed. See [Sponsored Inference](#sponsored-inference--share-your-gpu-with-the-world) below\n- **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across OA restarts\n- **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{OA_VAR_*}}`), trust tiers (LOCAL/TEE/VERIFIED/PUBLIC), WebSocket peer mesh, and inference routing with automatic secret redaction/injection\n- **Secret vault** — `/secrets` manages API keys and credentials with AES-256-GCM encrypted persistence; secrets are automatically redacted before sending to untrusted inference peers and re-injected on response\n- **Auto-expanding context** — detects RAM/VRAM and creates an optimized model variant on first run\n- **Mid-task steering** — type while the agent works to add context without interrupting\n- **Smart compaction** — 6 context compaction strategies (default, aggressive, decisions, errors, summary, structured) with ARC-inspired active context revision ([arXiv:2601.12030](https://arxiv.org/abs/2601.12030)) that preserves structural file content through compaction, preventing small-model repetitive loops at the root cause\n- **Memex experience archive** — large tool outputs archived during compaction with hash-based retrieval\n- **Persistent memory** — learned patterns stored in `.oa/memory/` across sessions\n- **Structured procedural memory (SQLite)** — replaces flat JSON with a full relational database: CRUD with soft-delete, revision tracking, embedding storage (float32 BLOB), bidirectional memory linking with confidence scores. Inspired by [ExpeL](https://arxiv.org/abs/2308.10144) (contrastive extraction) and [TIMG](https://arxiv.org/abs/2603.10600) (structured procedural format). 79 unit tests\n- **Semantic memory search** — vector embeddings via [Ollama /api/embed](https://ollama.com) (nomic-embed-text, 768-dim) with cosine similarity search over stored memories. Auto-generates embeddings on memory creation. Auto-links related memories when similarity > 0.6. Graceful fallback to text search when Ollama unavailable\n- **LLM-based memory extraction** — post-task, the LLM itself extracts structured procedural memories (CATEGORY/TRIGGER/LESSON/STEPS) instead of copying raw error text verbatim. Based on [ExpeL](https://arxiv.org/abs/2308.10144) and [AWM](https://arxiv.org/abs/2409.07429) patterns\n- **IPFS content-addressed storage** — [Helia](https://helia.io/) IPFS node with blockstore-fs for persistent content pinning. Real CID generation (`bafk...`), cross-node content resolution, and SHA-256 fallback when Helia unavailable. Verified: store→CID→retrieve round-trip test passes\n- **IPFS sharing surface** — `/ipfs` status page with peer info + identity kernel metrics + memory sentiment. `/ipfs pin <CID>` to pin remote agent content. `/ipfs publish` to share identity kernel. `/ipfs share tool/skill` to publish agent-created tools with secret stripping. `/ipfs import <CID>` to retrieve shared content\n- **Fortemi-React bridge** — `/fortemi start/status/stop` connects to [fortemi-react](https://github.com/robit-man/fortemi-react) (browser-first PGlite+pgvector knowledge system) via JWT auth. Proxy tools: `fortemi_capture`, `fortemi_search`, `fortemi_list`, `fortemi_get` auto-register when bridge is connected\n- **Content ingestion** — `/ingest <file>` imports audio (transcribe via Whisper), PDF (pdftotext), or text files into structured memory with 800-char/100-overlap chunking (matches fortemi pattern)\n- **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.oa/images/`\n- **Node visualization** — [openagents.nexus](https://github.com/robit-man/openagents.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents\n- **TTS sanitizer** — strips markdown syntax (`##`, `**`, `` ` ``), emoji (prevents \"white heavy checkmark\"), box-drawing chars, and ANSI codes before feeding to ALL TTS engines\n- **LuxTTS gapless playback** — look-ahead pre-synthesis pipeline: next chunk synthesizes while current plays, eliminating inter-sentence gaps. Jetson ARM support with NVIDIA's prebuilt PyTorch wheel\n- **Unified color scheme** — `ui.primary` (252), `ui.error` (198/magenta), `ui.warn` (214/orange), `ui.accent` (178/yellow) applied consistently across all TUI surfaces\n- **Clickable header buttons** — `help`, `voice`, `cohere`, `model` buttons on banner row 3 with hover/click visual states. OSC 8 hyperlinks for pointer cursor. Mouse click fires the slash command directly\n- **Dynamic terminal title** — updates with current task + version: `\"fix auth bug · OA v0.141.0\"`\n- **Session context persistence** — auto-saves context on task completion, manual `/context save|restore` across sessions\n- **Self-learning** — auto-fetches docs from the web when encountering unfamiliar APIs\n- **Seamless `/update`** — in-place update and reload with automatic context save/restore\n- **Blessed mode** — `/full-send-bless` infinite warm loop keeps model weights in VRAM, auto-cycles tasks, never exits until you say stop\n- **Telegram bridge** — `/telegram --key <token> --admin <userid>` public ingress/egress with admin filter and mandatory safety filter; bare `/telegram` toggles the service watchdog\n- **Task control** — `/pause` (gentle halt at turn boundary), `/stop` (immediate kill), `/resume` to continue\n- **Model-tier awareness** — dynamic tool sets, prompt complexity, and context limits scale with model size (small/medium/large)\n\n### Support Development\n\nIf you like this repository — whether you're an agent or a human — if this expands your consciousness and capabilities, please consider donating to support further development and of course token costs.\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/ethereum-eth-logo.svg\" width=\"20\" height=\"20\" alt=\"ETH\" /> <strong>ETH</strong></p>\n\n```bash\n0x81Ce81F0B6B5928E15d3a2850F913C88D07051ec\n```\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/bitcoin-btc-logo.svg\" width=\"20\" height=\"20\" alt=\"BTC\" /> <strong>BTC</strong></p>\n\n```bash\nbc1qlptj5wz8xj6dp5w4pw62s5kt7ct6w8k57w39ak\n```\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/solana-sol-logo.svg\" width=\"20\" height=\"20\" alt=\"SOL\" /> <strong>SOL</strong></p>\n\n```bash\nD8AgCTrxpDKD5meJ2bpAfVwcST3NF3EPuy9xczYycnXn\n```\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/polygon-matic-logo.svg\" width=\"20\" height=\"20\" alt=\"POL\" /> <strong>POL</strong></p>\n\n```bash\n0x81Ce81F0B6B5928E15d3a2850F913C88D07051ec\n```\n\n\n\n\n## Enterprise & Headless Mode\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nRun Open Agents as a headless service for CI/CD pipelines, automation, and enterprise deployments.\n\n### Non-Interactive Mode\n\n```bash\noa \"fix all lint errors\" --non-interactive    # Run task, exit when done\noa \"generate API docs\" --json                 # Structured JSON output (no ANSI)\noa \"run security audit\" --background          # Detached background job\n```\n\n### Background Jobs\n\n```bash\noa \"migrate database\" --background            # Returns job ID immediately\noa status job-abc123                          # Check job progress\noa jobs                                       # List all running/completed jobs\n```\n\nJobs run as detached processes — survive terminal disconnection. Output saved to `.oa/jobs/{id}.json`.\n\n### JSON Output Mode\n\nWith `--json`, all output is structured NDJSON:\n```json\n{\"type\":\"tool_call\",\"tool\":\"file_edit\",\"args\":{\"path\":\"src/api.ts\"},\"timestamp\":\"...\"}\n{\"type\":\"tool_result\",\"tool\":\"file_edit\",\"result\":\"OK\",\"timestamp\":\"...\"}\n{\"type\":\"task_complete\",\"summary\":\"Fixed 3 lint errors\",\"timestamp\":\"...\"}\n```\n\nPipe to `jq`, ingest into monitoring systems, or feed to other agents.\n\n### Process Management\n\n```bash\n/destroy processes              # Kill orphaned OA processes (local project)\n/destroy processes --global     # Kill ALL orphaned OA processes system-wide\n```\n\nShows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale OA instances.\n\n### REST API Service (Port 11435)\n\nOpen Agents runs a persistent REST API — like Ollama's `/api/` surface but with agentic task execution, OpenAI compatibility, and full TUI command access.\n\n```bash\noa serve                                              # Start on default port 11435\noa serve --port 9999                                   # Custom port\nOA_API_KEY=mysecret oa serve                           # Single admin key\nOA_API_KEYS=\"key1:admin:alice,key2:run:ci,key3:read:grafana\" oa serve  # Scoped multi-key\n```\n\n#### Working Directory\n\nPass `X-Working-Directory` header to run commands in your current terminal directory:\n\n```bash\n# Auto-inject current dir — agent operates on YOUR project, not the server's cwd\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"X-Working-Directory: $(pwd)\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"fix all lint errors\"}'\n```\n\nOr set it in the JSON body: `\"working_directory\": \"/path/to/project\"`\n\n#### Health & Observability\n\n```bash\n# Liveness\ncurl http://localhost:11435/health\n```\n```json\n{\"status\":\"ok\",\"uptime_s\":142,\"version\":\"0.184.33\"}\n```\n\n```bash\n# Readiness (probes Ollama backend)\ncurl http://localhost:11435/health/ready\n```\n```json\n{\"status\":\"ready\",\"ollama\":\"reachable\"}\n```\n\n```bash\n# Version info\ncurl http://localhost:11435/version\n```\n```json\n{\"version\":\"0.184.33\",\"node\":\"v24.14.0\",\"platform\":\"linux\"}\n```\n\n```bash\n# Prometheus metrics (scrape with Grafana/Prometheus)\ncurl http://localhost:11435/metrics\n```\n```\n# HELP oa_requests_total Total HTTP requests\n# TYPE oa_requests_total counter\noa_requests_total{method=\"POST\",path=\"/v1/chat/completions\",status=\"200\"} 47\noa_tokens_in_total 12450\noa_tokens_out_total 8230\noa_errors_total 0\n```\n\n#### OpenAI-Compatible Inference\n\nDrop-in replacement for any OpenAI client library. Change `api.openai.com` → `localhost:11435`.\n\n```bash\n# List models\ncurl http://localhost:11435/v1/models\n```\n```json\n{\"object\":\"list\",\"data\":[{\"id\":\"qwen3.5:9b\",\"object\":\"model\",\"created\":0,\"owned_by\":\"local\"},{\"id\":\"qwen3.5:4b\",\"object\":\"model\",...}]}\n```\n\n```bash\n# Chat completion (non-streaming)\ncurl -X POST http://localhost:11435/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"qwen3.5:9b\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"What is 2+2?\"}]\n  }'\n```\n```json\n{\n  \"id\": \"chatcmpl-a1b2c3d4e5f6\",\n  \"object\": \"chat.completion\",\n  \"model\": \"qwen3.5:9b\",\n  \"choices\": [{\n    \"index\": 0,\n    \"message\": {\"role\": \"assistant\", \"content\": \"4\"},\n    \"finish_reason\": \"stop\"\n  }],\n  \"usage\": {\"prompt_tokens\": 25, \"completion_tokens\": 2, \"total_tokens\": 27}\n}\n```\n\n```bash\n# Chat completion (SSE streaming)\ncurl -N -X POST http://localhost:11435/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"qwen3.5:9b\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}],\"stream\":true}'\n```\n```\ndata: {\"id\":\"chatcmpl-...\",\"choices\":[{\"delta\":{\"role\":\"assistant\",\"content\":\"Hi\"}}]}\ndata: {\"id\":\"chatcmpl-...\",\"choices\":[{\"delta\":{\"content\":\" there!\"}}]}\ndata: {\"id\":\"chatcmpl-...\",\"choices\":[{\"delta\":{},\"finish_reason\":\"stop\"}]}\ndata: [DONE]\n```\n\n#### Agentic Task Execution\n\nThe unique OA capability — submit a coding task and get an autonomous agent loop.\n\n```bash\n# Run task in your current directory\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Content-Type: application/json\" \\\n  -H \"X-Working-Directory: $(pwd)\" \\\n  -d '{\n    \"task\": \"fix all TypeScript errors in src/\",\n    \"model\": \"qwen3.5:9b\",\n    \"max_turns\": 25,\n    \"stream\": true\n  }'\n```\n```\ndata: {\"type\":\"run_started\",\"run_id\":\"job-a1b2c3\",\"pid\":12345}\ndata: {\"type\":\"stdout\",\"data\":\"{\\\"turn\\\":1,\\\"tool\\\":\\\"file_read\\\",...}\"}\ndata: {\"type\":\"stdout\",\"data\":\"{\\\"turn\\\":2,\\\"tool\\\":\\\"file_edit\\\",...}\"}\ndata: {\"type\":\"exit\",\"code\":0}\ndata: [DONE]\n```\n\n```bash\n# Run in isolated sandbox (temp workspace, safe for untrusted tasks)\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"write a hello world app\",\"isolate\":true}'\n```\n\n```bash\n# List all runs\ncurl http://localhost:11435/v1/runs\n```\n```json\n{\"runs\":[{\"id\":\"job-a1b2c3\",\"task\":\"fix TypeScript errors\",\"status\":\"completed\",\"startedAt\":\"...\"}]}\n```\n\n```bash\n# Get specific run status\ncurl http://localhost:11435/v1/runs/job-a1b2c3\n```\n\n```bash\n# Abort a running task\ncurl -X DELETE http://localhost:11435/v1/runs/job-a1b2c3\n```\n```json\n{\"status\":\"aborted\",\"run_id\":\"job-a1b2c3\"}\n```\n\n#### Configuration\n\n```bash\n# Get all config\ncurl http://localhost:11435/v1/config\n```\n```json\n{\"config\":{\"backendUrl\":\"http://127.0.0.1:11434\",\"model\":\"qwen3.5:122b\",\"backendType\":\"ollama\",...}}\n```\n\n```bash\n# Get current model\ncurl http://localhost:11435/v1/config/model\n```\n```json\n{\"model\":\"qwen3.5:122b\"}\n```\n\n```bash\n# Switch model\ncurl -X PUT http://localhost:11435/v1/config/model \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"qwen3.5:27b\"}'\n```\n```json\n{\"model\":\"qwen3.5:27b\",\"status\":\"updated\"}\n```\n\n```bash\n# Get endpoint\ncurl http://localhost:11435/v1/config/endpoint\n```\n```json\n{\"url\":\"http://127.0.0.1:11434\",\"backendType\":\"ollama\",\"auth\":\"none\"}\n```\n\n```bash\n# Switch endpoint (e.g., to Chutes AI)\ncurl -X PUT http://localhost:11435/v1/config/endpoint \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"url\":\"https://llm.chutes.ai\",\"auth\":\"Bearer cpk_...\"}'\n```\n\n```bash\n# Update settings (admin scope required)\ncurl -X PATCH http://localhost:11435/v1/config \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"verbose\":true}'\n```\n```json\n{\"config\":{...},\"updated\":[\"verbose\"]}\n```\n\n#### Slash Commands via REST\n\nEvery `/command` from the TUI is available as a REST endpoint.\n\n```bash\n# List all available commands\ncurl http://localhost:11435/v1/commands\n```\n```json\n{\"commands\":[{\"command\":\"/help\",\"description\":\"Show help\"},{\"command\":\"/stats\",\"description\":\"Session metrics\"},...]}\n```\n\n```bash\n# Execute /stats\ncurl -X POST http://localhost:11435/v1/commands/stats\n```\n\n```bash\n# Execute /nexus status\ncurl -X POST http://localhost:11435/v1/commands/nexus \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"args\":\"status\"}'\n```\n\n```bash\n# Execute /destroy processes --global\ncurl -X POST http://localhost:11435/v1/commands/destroy \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"args\":\"processes --global\"}'\n```\n\n#### Auth Scopes\n\n```bash\n# Multi-key setup: read (monitoring), run (CI), admin (ops)\nOA_API_KEYS=\"grafana-key:read:grafana,ci-key:run:github-actions,ops-key:admin:ops-team\" oa serve\n```\n\n| Scope | Can do | Cannot do |\n|-------|--------|-----------|\n| `read` | GET /v1/models, /v1/config, /v1/runs, /v1/commands | POST /v1/run, PATCH /v1/config |\n| `run` | Everything in `read` + POST /v1/run, POST /v1/commands | PATCH /v1/config, PUT endpoints |\n| `admin` | Everything | — |\n\n```bash\n# With auth\ncurl -H \"Authorization: Bearer ops-key\" http://localhost:11435/v1/models\n```\n\n#### Tool-Use Profiles\n\nEnterprise access control — define which tools, shell commands, and settings the agent can use per API key or per request.\n\n**3 built-in presets:**\n\n| Profile | Description | Tools |\n|---------|-------------|-------|\n| `full` | No restrictions | All tools and commands |\n| `ci-safe` | CI/CD — read + test only | file_read, grep, shell (npm test only) |\n| `readonly` | Read-only analysis | No writes, no shell mutations |\n\n```bash\n# List all profiles (presets + custom)\ncurl -H \"Authorization: Bearer $KEY\" http://localhost:11435/v1/profiles\n```\n```json\n{\"profiles\":[{\"name\":\"readonly\",\"description\":\"Read-only\",\"encrypted\":false,\"source\":\"preset\"},{\"name\":\"ci-safe\",...}]}\n```\n\n```bash\n# Get profile details\ncurl -H \"Authorization: Bearer $KEY\" http://localhost:11435/v1/profiles/ci-safe\n```\n```json\n{\"profile\":{\"name\":\"ci-safe\",\"tools\":{\"allow\":[\"file_read\",\"grep_search\",\"shell\"],\"shell_allow\":[\"npm test\",\"npx eslint\"]},\"limits\":{\"max_turns\":15}}}\n```\n\n```bash\n# Create custom profile (admin only)\ncurl -X POST http://localhost:11435/v1/profiles \\\n  -H \"Authorization: Bearer $ADMIN_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"name\": \"frontend-dev\",\n    \"description\": \"Frontend team — no backend access\",\n    \"tools\": {\n      \"allow\": [\"file_read\", \"file_write\", \"file_edit\", \"shell\", \"grep_search\"],\n      \"shell_deny\": [\"rm -rf\", \"sudo\", \"docker\", \"kubectl\"]\n    },\n    \"commands\": { \"deny\": [\"destroy\", \"expose\", \"sponsor\"] },\n    \"limits\": { \"max_turns\": 20, \"timeout_s\": 300 }\n  }'\n```\n\n```bash\n# Create password-protected profile (AES-256-GCM encrypted)\ncurl -X POST http://localhost:11435/v1/profiles \\\n  -H \"Authorization: Bearer $ADMIN_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"name\":\"prod-ops\",\"password\":\"s3cret\",\"tools\":{\"deny\":[\"file_write\"]}}'\n```\n\n```bash\n# Use a profile with /v1/run (header or body)\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Authorization: Bearer $KEY\" \\\n  -H \"X-Tool-Profile: ci-safe\" \\\n  -H \"X-Working-Directory: $(pwd)\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"run the test suite and report failures\"}'\n\n# Or in the body:\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Authorization: Bearer $KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"analyze code quality\",\"profile\":\"readonly\"}'\n```\n\n```bash\n# Load encrypted profile (password in header)\ncurl -H \"Authorization: Bearer $KEY\" \\\n  -H \"X-Profile-Password: s3cret\" \\\n  http://localhost:11435/v1/profiles/prod-ops\n```\n\n```bash\n# Delete a custom profile (admin only, presets cannot be deleted)\ncurl -X DELETE -H \"Authorization: Bearer $ADMIN_KEY\" \\\n  http://localhost:11435/v1/profiles/frontend-dev\n```\n\n#### Endpoint Reference\n\n| Method | Path | Auth | Description |\n|--------|------|------|-------------|\n| GET | `/health` | none | Liveness probe |\n| GET | `/health/ready` | none | Readiness (probes Ollama) |\n| GET | `/health/startup` | none | Startup complete |\n| GET | `/version` | none | Version + platform |\n| GET | `/metrics` | none | Prometheus counters |\n| GET | `/v1/models` | read | List models (OpenAI format) |\n| POST | `/v1/chat/completions` | run | Chat inference (stream + sync) |\n| POST | `/v1/embeddings` | run | Generate embeddings |\n| POST | `/v1/chat` | run | Stateful chat with full tool access (sessions, context, memory) |\n| GET | `/v1/chat/sessions` | read | List active chat sessions |\n| GET | `/v1/system` | none | GPU/RAM/CPU info + model recommendations |\n| GET | `/v1/audit` | read | Query audit log (since, user, limit filters) |\n| GET | `/openapi.json` | none | OpenAPI 3.0 specification |\n| GET | `/docs` | none | Swagger UI (interactive API docs) |\n| POST | `/v1/run` | run | Submit agentic task |\n| GET | `/v1/runs` | read | List all runs |\n| GET | `/v1/runs/:id` | read | Run status |\n| DELETE | `/v1/runs/:id` | run | Abort run |\n| GET | `/v1/config` | read | All settings |\n| PATCH | `/v1/config` | admin | Update settings |\n| GET | `/v1/config/model` | read | Current model |\n| PUT | `/v1/config/model` | admin | Switch model |\n| GET | `/v1/config/endpoint` | read | Current endpoint |\n| PUT | `/v1/config/endpoint` | admin | Switch endpoint |\n| GET | `/v1/commands` | read | List commands |\n| POST | `/v1/commands/:cmd` | run | Execute command |\n| GET | `/v1/profiles` | read | List all profiles (presets + custom) |\n| GET | `/v1/profiles/:name` | read | Get profile details (X-Profile-Password for encrypted) |\n| POST | `/v1/profiles` | admin | Create/update profile (password field for encryption) |\n| DELETE | `/v1/profiles/:name` | admin | Delete custom profile |\n\n#### Stateful Chat — `/v1/chat`\n\nUnlike `/v1/chat/completions` (raw Ollama proxy), `/v1/chat` spawns the full OA agent with all 61 tools for each message. The agent can search the web, read files, run shell commands, and use memory — exactly like the TUI.\n\n```bash\n# Send a chat message (full tool access)\ncurl -s http://localhost:11435/v1/chat \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"message\": \"What is happening in the world today?\", \"model\": \"qwen3.5:9b\", \"stream\": false}'\n\n# Response: {\"session_id\": \"abc123\", \"message\": {\"role\": \"assistant\", \"content\": \"...\"}}\n```\n\n**Request body:**\n```json\n{\n  \"message\": \"What is happening in the world?\",\n  \"model\": \"qwen3.5:9b\",\n  \"session_id\": \"optional-uuid-from-previous-response\",\n  \"stream\": true,\n  \"max_tokens\": 4096\n}\n```\n\n**Response (non-streaming):**\n```json\n{\n  \"session_id\": \"abc123-def4-5678-ghij-klmnopqrstuv\",\n  \"message\": {\n    \"role\": \"assistant\",\n    \"content\": \"Here are the major events happening today...\"\n  }\n}\n```\n\n**Response (streaming `stream: true`):** Server-Sent Events:\n```\ndata: {\"type\":\"tool_call\",\"tool\":\"web_search\",\"args\":{\"query\":\"world news today\"}}\ndata: {\"type\":\"tool_result\",\"output\":\"Top results: ...\"}\ndata: {\"id\":\"chatcmpl-abc\",\"object\":\"chat.completion.chunk\",\"choices\":[{\"delta\":{\"content\":\"Based on...\"}}]}\ndata: {\"type\":\"complete\",\"turns\":\"3\",\"tokens\":\"12,450\",\"duration\":8500}\ndata: [DONE]\n```\n\n**Session management:** Each chat message returns a `session_id`. Send it back to maintain conversation context across turns:\n\n```bash\ncurl -s http://localhost:11435/v1/chat \\\n  -d '{\"session_id\": \"abc123\", \"message\": \"Tell me more about that\", \"model\": \"qwen3.5:9b\", \"stream\": false}'\n```\n\nSessions expire after 30 minutes of inactivity. List active sessions: `GET /v1/chat/sessions`.\n\n**Streaming:** Set `\"stream\": true` for Server-Sent Events with tool call visualization and incremental content.\n\n#### Web Interface\n\nOpen `http://localhost:11435/` in a browser when `oa serve` is running. Zero external dependencies — single self-contained HTML page.\n\n**Tabs:**\n- **Chat** — Conversational interface using `/v1/chat` with full tool access, session persistence, streaming responses, and collapsible tool call dropdowns\n- **Agent** — Submit agentic tasks via `/v1/run`, profile selection, live SSE event stream, abort button\n- **Dashboard** — System health (GPU, RAM, uptime), per-provider token usage (persistent across restarts), active process monitor, job history with pagination\n- **Config** — Server settings table, model switcher, endpoint manager (add/change inference providers), profile list\n- **Activity** — Real-time audit log feed with color-coded status codes\n\n**Design:** Dark theme (#1a1a1e background, #b2920a gold accent, SF Mono font) matching the TUI and /call voice interface. Mobile responsive with CSS media queries.\n\n**Features:**\n- Model picker populated from `/v1/models`\n- API key support (stored in localStorage)\n- System prompt (collapsible textarea)\n- Markdown rendering with code block copy buttons\n- Docker sandbox toggle (native vs container execution)\n- Workspace sidebar (toggleable file tree)\n- Token counter per conversation\n- Conversation export (Markdown or JSON)\n- GPU/VRAM detection with model compatibility recommendations\n- Per-provider token tracking (persisted to `.oa/usage/token-usage.json`)\n\n### Enterprise Licensing\n\nFree for non-commercial use under CC-BY-NC-4.0. For enterprise/commercial licensing, contact [zoomerconsulting.com](https://zoomerconsulting.com).\n\n\n\n\n## Architecture\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nThe core is `AgenticRunner` — a multi-turn tool-calling loop with structured context assembly:\n\n```\nUser task → assembleContext(c_instr, c_state, c_know) → LLM → tool_calls → Execute → Feed results → LLM\n                                                                ↓                                      ↑\n                                                          Compaction check ─── Memex archive ─── Context restore\n                                                                (repeat until task_complete or max turns)\n```\n\n- **Context-first** — structured context assembly (C = A equation) replaces ad-hoc prompt construction\n- **Tool-first** — the model explores via tools, not pre-stuffed context\n- **Iterative** — tests, sees failures, fixes them\n- **Parallel-safe** — read-only tools concurrent, mutating tools sequential\n- **Observable** — every tool call, context composition, and result emitted as a real-time event\n- **Bounded** — max turns, timeout, output limits prevent runaway loops\n- **Context-aware** — dynamic compaction, Memex archiving, session persistence, model-tier scaling\n- **Brute-force** — optional auto re-engagement when turn limit is hit (keeps going until task_complete or user abort)\n\n\n\n\n## Context Engineering\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nThe agent implements structured context assembly based on current research in context engineering, modular prompt optimization, and instruction hierarchy:\n\n```\nC = A(c_instr, c_know, c_tools, c_mem, c_state, c_query)\n```\n\n| Component | Priority | Description |\n|-----------|----------|-------------|\n| `c_instr` | P0 (highest) | Core system instructions — immutable, cannot be overridden |\n| `c_state` | P10 | Personality profile, session state |\n| `c_know` | P20 | Dynamic project context, retrieved knowledge |\n| `c_retrieval` | P20 | Task-specific retrieval (RRF-fused lexical + semantic + graph expansion) |\n| `c_graph` | P20 | Live code knowledge graph (PageRank-ranked symbols, community summaries) |\n| `c_plan` | P20 | Plan skeleton (completed/current/pending steps, re-injected every turn) |\n| `c_tools` | P30 (lowest) | Tool outputs — may contain untrusted content |\n\nKey design decisions grounded in research:\n\n- **Instruction hierarchy** — 4-tier priority system (P0/P10/P20/P30) prevents prompt injection from tool outputs overriding system rules. Implemented across all 3 prompt tiers (large/medium/small) with model-appropriate verbosity\n- **Live code knowledge graph** — SQLite-backed graph (files/symbols/edges) auto-updates via filesystem watcher and post-edit hooks. PageRank-ranked symbols injected into every prompt. Louvain community detection compresses 1M+ LOC repos into ~200 navigable clusters. Research: [Codebase-Memory](https://arxiv.org/abs/2603.27277), [FastCode](https://arxiv.org/abs/2603.01012), [Stack Graphs](https://arxiv.org/abs/2211.01224)\n- **Plan-skeleton re-injection** — every turn includes a compact `[done/current/pending]` plan derived from task state, preventing goal drift in multi-step tasks. Research: [ReCAP](https://arxiv.org/abs/2510.23822) (+32% on multi-step tasks)\n- **Retrieval-augmented context** — Reciprocal Rank Fusion merges lexical search, semantic search, and graph expansion into a single ranked result set. Token-budgeted snippet packing ensures relevant code reaches the model without overflow\n- **Proactive quality guidance** — instead of banning tools after repeated use, the agent receives contextual next-step suggestions appended to tool output, preserving tool availability while steering toward productive actions\n- **Tiered system prompts** — large (>=30B), medium (8-29B), and small (<=7B) models get appropriately sized instruction sets, balancing capability with context budget\n- **Context composition tracing** — every context assembly emits a structured event showing section labels and token estimates for eval observability\n\nResearch provenance: grounded in \"A Survey of Context Engineering for LLMs\" (context assembly equation), \"Modular Prompt Optimization\" (section-local textual gradients), \"Reasoning Up the Instruction Ladder\" (priority hierarchy), \"GEPA\" (reflective prompt evolution), \"Prompt Flow Integrity\" (least-privilege context passing), [RepoMaster](https://arxiv.org/abs/2505.21577) (8K token budget validation), and [RIG](https://arxiv.org/abs/2601.10112) (flat graph format).\n\n\n\n\n## Model-Tier Awareness\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nOpen Agents classifies models into three tiers and adapts its behavior accordingly:\n\n| Tier | Parameters | Base Tools | System Prompt | Compaction |\n|------|-----------|------------|---------------|------------|\n| **Large** (>=30B) | 70B, 122B | All 67 tools | Full | 75% of context window |\n| **Medium** (8-29B) | 9B, 27B | 15 core + task-relevant | Condensed | 70% of context window |\n| **Small** (<=7B) | 4B, 1.5B | 6 base + explore_tools | Minimal + scaffolding | 65% of context window |\n\n### Small Model Optimization (Research-Backed)\n\nSmall models (4B-7B) receive 10+ optimizations that larger models don't need, each backed by published research:\n\n| Optimization | Research Basis | Impact |\n|-------------|---------------|--------|\n| **Plan-skeleton re-injection** | [ReCAP](https://arxiv.org/abs/2510.23822) (NeurIPS 2025) | +32% multi-step task completion |\n| **Goal re-injection after compaction** | [Lost in the Middle](https://arxiv.org/abs/2307.03172) | Prevents #1 cause of drift |\n| **Decomposition guidance** | [ReCode](https://arxiv.org/abs/2510.23564) | +20.9% for 7B, zero training cost |\n| **Structured error recovery** | [Polaris](https://arxiv.org/abs/2603.23129) | Actionable [RECOVERY] guidance per error type |\n| **LATS pivot directive** | [LATS](https://arxiv.org/abs/2310.04406) (ICML 2024) | Forces approach change after consecutive failures |\n| **Self-consistency voting** | [SRLM](https://arxiv.org/abs/2603.15653) | +22% via K-alternative majority voting (opt-in) |\n| **Tier-adaptive compaction** | [Codebase-Memory](https://arxiv.org/abs/2603.27277) | Context budget scales per tier, not hardcoded |\n| **Tool deferral** | [EASYTOOL](https://arxiv.org/abs/2401.06201), [Gorilla](https://arxiv.org/abs/2305.15334) | 60-80% tool token reduction via search |\n| **Best-of-N execution** | [SWE-RM](https://arxiv.org/abs/2512.21919) | +7-10 pts via N independent attempts (opt-in) |\n| **Recursive sub-agents** | [RLM](https://arxiv.org/abs/2512.24601), [Yang/Srebro](https://arxiv.org/abs/2603.02112) | Depth-tracked delegation (max 3), 100x effective context |\n\n**Eval-verified result:** A 4B model completes a hard multi-file refactoring task in 20 turns (down from 25 before these optimizations) and passes 92% of core eval tasks.\n\n### Tool Nesting for Small Models\n\nSmall models use an **explore_tools** meta-tool pattern inspired by hierarchical API retrieval research ([ToolLLM](https://arxiv.org/abs/2307.16789)). Instead of presenting all 64+ tools (which overwhelms small context windows), only core tools are loaded initially. The agent calls `explore_tools()` to discover additional capabilities, then activates specific tools as needed. This reduces tool schema tokens by ~80% while preserving access to the full toolset.\n\n### Dynamic Context Limits\n\nAll context-dependent values scale automatically with the actual context window size:\n\n| Setting | How It Scales |\n|---------|---------------|\n| Compaction threshold | min(tier default, 75% of context window) |\n| Recent messages kept | 1 message per 2-4K of context (tier-dependent) |\n| Max output tokens | 25% of context window (min 2048) |\n| Tool output cap | 2K-8K chars (scales with context) |\n| File read limits | 80-120 line cap for small/medium context windows |\n\n\n\n\n## Live Code Knowledge Graph\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nOpen Agents builds and maintains a **persistent, auto-updating knowledge graph** of the codebase that scales from small projects to repositories with 1M+ lines of code.\n\n### How It Works\n\n```\nSource files  ──>  Regex symbol extraction  ──>  SQLite graph DB (.oa/index/code-graph.db)\n     |                                                    |\n     |  fs.watch() + debounce ──>  File hash check  ──>  Incremental re-index (per file)\n     |                                                    |\n     └── post-edit hook (file_write/edit) ─────────────>  Instant re-index of modified files\n```\n\n1. **Symbol extraction** parses every source file for functions, classes, types, interfaces, exports, and constants\n2. **Import graph** traces dependency relationships (which file imports which)\n3. **PageRank scoring** ranks files by how many other files depend on them\n4. **Community detection** (Louvain-inspired) groups related files into logical modules with summaries\n5. **Auto-update** via filesystem watcher and post-tool-edit hooks keeps the graph fresh as code changes\n\n### What the Agent Sees\n\nEach turn, the agent receives a compact graph summary (500-1500 tokens depending on model tier) showing:\n- The most important files ranked by cross-reference count\n- Their exported symbols (functions, classes, types)\n- Import relationships (what depends on what)\n\nFor 1M+ LOC codebases, the Louvain community compression reduces 50K+ symbols into ~200 navigable module summaries, each with a name and key exports.\n\n### Graph Tools\n\n| Tool | What It Does |\n|------|-------------|\n| `repo_map` | PageRank-sorted codebase skeleton with token budget control |\n| `import_graph` | Show dependencies, dependents, and 1-hop transitive connections for any file |\n| `semantic_map` | Agent-curated notes, hotspot tracking, and file relationships across sessions |\n| `codebase_map` | High-level structural overview (directories, language breakdown) |\n| `file_explore` | Chunked exploration with overview/outline/search/chunk strategies |\n\n### Storage\n\nThe graph persists in `.oa/index/code-graph.db` (SQLite with WAL mode) across sessions. Incremental updates mean editing a single file costs <50ms regardless of codebase size.\n\n### Research Basis\n\n- [Codebase-Memory](https://arxiv.org/abs/2603.27277) (2026) — Tree-Sitter + Louvain communities, Linux kernel 2.1M nodes in 3 minutes, incremental via XXH3 hashing\n- [FastCode](https://arxiv.org/abs/2603.01012) (2026) — 3-layer graph schema (dependency/inheritance/call), cleanest decomposition\n- [Stack Graphs](https://arxiv.org/abs/2211.01224) (GitHub production) — File-level isolation for incremental updates at millions-of-repos scale\n- [RepoMaster](https://arxiv.org/abs/2505.21577) (2025) — 8K token budget validated, +62.96% task-pass rate\n- [Code-Craft/HCGS](https://arxiv.org/abs/2504.08975) (2025) — Hierarchical code graph summaries, 82% retrieval precision improvement\n\n\n\n## Auto-Expanding Context Window\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nOn startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an optimized model variant:\n\n| Available Memory | Context Window |\n|-----------------|---------------|\n| 200GB+ | 128K tokens |\n| 100GB+ | 64K tokens |\n| 50GB+ | 32K tokens |\n| 20GB+ | 16K tokens |\n| 8GB+ | 8K tokens |\n| < 8GB | 4K tokens |\n\n\n\n\n## Tools (68)\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n| Tool | Description |\n|------|-------------|\n| **File Operations** | |\n| `file_read` | Read file contents with line numbers (offset/limit for large files) |\n| `file_write` | Create or overwrite files with automatic directory creation |\n| `file_edit` | Precise string replacement in files (preferred over rewriting) |\n| `file_patch` | Edit specific line ranges in large files (replace, insert_before/after, delete) |\n| `batch_edit` | Multiple edits across files in one call |\n| `list_directory` | List directory contents with types and sizes |\n| **Search & Navigation** | |\n| `grep_search` | Search file contents with regex (ripgrep with grep fallback) |\n| `find_files` | Find files by glob pattern (excludes node_modules/.git) |\n| `codebase_map` | High-level project structure overview with directory tree and language breakdown |\n| **Shell & Execution** | |\n| `shell` | Execute any shell command (non-interactive, CI=true, sudo support) |\n| `code_sandbox` | Isolated code execution (JS, Python, Bash, TS) in subprocess or Docker |\n| `background_run` | Run shell command in background, returns task ID |\n| `task_status` | Check background task status |\n| `task_output` | Read background task output |\n| `task_stop` | Stop a background task |\n| **Web** | |\n| `web_search` | Search the web for pages matching a query — returns links+snippets, not content. Providers: DuckDuckGo (free), Tavily (TAVILY_API_KEY), Jina (JINA_API_KEY) |\n| `web_fetch` | Fetch a single URL's text content (fastest, no JS rendering). Supports `mode=reader` for Jina Reader markdown output with JS rendering. Auto-fallback to Jina when raw content is too short |\n| `web_crawl` | Crawl pages with link-following and optional JS rendering. Strategies: `beautifulsoup` (fast HTTP) or `playwright` (headless Chromium). Supports `extract_schema` for structured data extraction |\n| `browser_action` | Interactive headless Chrome: login, fill forms, click buttons, screenshot. Session persists between calls. Actions: navigate, click, click_xy, type, screenshot, dom, scroll, back, forward, close |\n| **Structured Data** | |\n| `structured_file` | Generate CSV, TSV, JSON, Markdown tables, Excel-compatible files |\n| `structured_read` | Parse CSV, TSV, JSON, Markdown tables with binary format detection |\n| **Vision & Desktop** | |\n| `vision` | Moondream VLM — caption, query, detect, point on any image |\n| `desktop_click` | Vision-guided clicking: describe a UI element, agent finds and clicks it |\n| `desktop_describe` | Screenshot + Moondream caption/query for desktop awareness |\n| `image_read` | Read images (base64 + OCR metadata) |\n| `screenshot` | Capture screen/window/active window |\n| `ocr` | Extract text from images (Tesseract with multi-variant preprocessing) |\n| `ocr_image_advanced` | Advanced multi-variant OCR pipeline with preprocessing, multi-PSM, and confidence scoring |\n| `ocr_pdf` | Add searchable text layer to scanned/image PDFs |\n| `pdf_to_text` | Extract text from PDF using pdftotext (Poppler) with OCR fallback |\n| **Transcription** | |\n| `transcribe_file` | Transcribe local audio/video files to text (Whisper) |\n| `transcribe_url` | Download and transcribe audio/video from URLs |\n| **Memory & Knowledge** | |\n| `memory_read` | Read from persistent memory store by topic and key |\n| `memory_write` | Store facts/patterns in persistent memory with provenance tracking |\n| `memory_search` | Semantic search across all memory entries by query |\n| `memex_retrieve` | Recover full tool output archived during context compaction by hash ID |\n| **Git & Diagnostics** | |\n| `diagnostic` | Lint/typecheck/test/build validation pipeline in one call |\n| `git_info` | Structured git status, log, diff, branch, staged/unstaged files |\n| **Agents & Delegation** | |\n| `sub_agent` | Delegate subtasks to independent agent instances (foreground or background) |\n| `explore_tools` | Meta-tool: discover and unlock additional tools on demand (for small models) |\n| `task_complete` | Signal task completion with summary |\n| **Custom Tools & Skills** | |\n| `create_tool` | Create reusable custom tools from workflow patterns at runtime |\n| `manage_tools` | List, inspect, delete custom tools |\n| `skill_list` | Discover available AIWG skills |\n| `skill_execute` | Run an AIWG skill |\n| **Temporal Agency** | |\n| `scheduler` | Schedule tasks for automatic future execution via OS cron (presets, natural language, raw cron) |\n| `reminder` | Set cross-session reminders with priority, due dates, tags — surfaces at startup |\n| `agenda` | Unified view of reminders, schedules, and attention items with startup brief |\n| **AIWG SDLC** | |\n| `aiwg_setup` | Deploy AIWG SDLC framework |\n| `aiwg_health` | Analyze project SDLC health and readiness |\n| `aiwg_workflow` | Execute AIWG commands and workflows |\n| **Nexus P2P & x402 Payments** | |\n| `nexus` | Decentralized agent networking — connect, rooms, DMs, peer discovery, invoke capabilities, metering, trust/blocking, IPFS storage |\n| `nexus:expose` | Expose local models or forward upstream endpoints as metered inference capabilities with pricing, passthrough, and load balancing |\n| `nexus:wallet_create` | Generate secp256k1/EVM wallet (Base mainnet USDC) with AES-256-GCM encryption + x402-wallet.key |\n| `nexus:spend` | Sign EIP-3009 USDC TransferWithAuthorization — budget-checked, gasless for payer |\n| `nexus:remote_infer` | Route inference to a remote peer's model — auto-discovers peers, budget-checks, invokes, returns result |\n| `nexus:ledger_status` | Transaction history (earned/spent/pending USDC) |\n| `nexus:budget_set` | Configure spending limits — daily cap, per-invoke max, auto-approve threshold |\n| **COHERE Cognitive Stack** | |\n| `repl_exec` | Persistent Python REPL — variables/imports persist between calls, `llm_query()` and `parallel_llm_query()` available for recursive LLM invocation, `retrieve()` for handle access |\n| `memory_metabolize` | Governed memory lifecycle — classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence/identity_relevance), consolidate lessons from trajectories |\n| `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.oa/identity/` |\n| `reflect` | Immune-system reflection — diagnostic (find flaws), epistemic (identify missing evidence), constitutional (review self-updates). Returns pass/revise/block verdict |\n| `explore` | ARCHE strategy-space exploration — generate diverse strategies, archive successful variants with tags/confidence, compare competing approaches, retrieve past strategies |\n| **Hardware Access** | |\n| `camera_capture` | Access system cameras — list devices, capture JPEG frames, query capabilities. Uses ffmpeg + v4l2. Supports USB, CSI, and 360 cameras (QooCam, RealSense). Captured images can be piped to vision tools |\n| `audio_capture` | Record from microphone — list input devices, record WAV/MP3 (configurable duration/rate/channels), check real-time mic level (RMS dBFS). Uses arecord + ffmpeg backends |\n| `audio_playback` | Speaker control and TTS — play audio files (WAV/MP3/OGG), text-to-speech via espeak-ng (multi-language), get/set system volume. Uses aplay/ffplay/amixer backends |\n| `wifi_control` | WiFi network scanning and management — scan nearby networks (SSID, signal, channel, security), list WiFi adapters (built-in + USB dongles), connect/disconnect, check connection status, toggle monitor mode. Auto-detects AC600/RTL8811AU and other USB adapters |\n| `bluetooth_scan` | Bluetooth device discovery — scan for Classic and BLE devices, list HCI adapters, get device info. Uses hcitool/bluetoothctl backends |\n| `sdr_scan` | Software-defined radio scanning — frequency sweeps, ADS-B aircraft tracking (1090 MHz), FM radio capture. Auto-installs rtl-sdr tools when RTL-SDR hardware detected. Uses rtl_power/rtl_fm/dump1090 |\n| `flipper_zero` | Flipper Zero multi-tool control — Sub-GHz scanning (315/433/868/915 MHz), NFC tag reading, 125kHz RFID reading, IR capture, GPIO pin reading, storage browsing. Serial CLI via /dev/ttyACM* |\n\nRead-only tools execute concurrently when called in the same turn. Mutating tools run sequentially.\n\n### Web Tool Selection Guide\n\nThe agent has 4 web tools. Pick the right one:\n\n| Need | Tool | Why |\n|------|------|-----|\n| Find pages about a topic | `web_search` | Returns links+snippets to fetch later |\n| Read a URL you already have | `web_fetch` | Fastest — plain text, no JS rendering |\n| Page is blank or JS-heavy (SPA) | `web_crawl` strategy=playwright | Renders JavaScript via headless Chromium |\n| Follow links across a site | `web_crawl` max_depth=1+ | Multi-page crawl with metadata |\n| Extract structured data (prices, tables) | `web_crawl` + extract_schema | Regex-based field extraction from page text |\n| Login / fill forms / click buttons | `browser_action` | Persistent session with cookies and state |\n| Screenshot of a rendered page | `browser_action` action=screenshot | Visual rendering via Chrome |\n| Clean markdown from any URL | `web_fetch` mode=reader | Jina Reader (r.jina.ai) — handles JS, images |\n\n**Routing order**: `web_search` (find) → `web_fetch` (read) → `web_crawl` (if JS/multi-page) → `browser_action` (if interactive)\n\n**Jina Reader**: Set `JINA_API_KEY` for higher rate limits. Works without a key for basic use. When `web_fetch` gets very short content (<200 chars), it automatically retries via Jina Reader.\n\n**Structured extraction**: Pass `extract_schema='{\"price\": \"number\", \"name\": \"string\"}'` to `web_crawl` for best-effort regex-based field extraction from page content.\n\n### Hardware Tool Guide\n\nThe agent can access physical hardware — cameras, microphones, and speakers — through three dedicated tools:\n\n| Need | Tool | Example |\n|------|------|---------|\n| See the environment | `camera_capture` action=capture | Grab a JPEG frame from any USB/CSI camera |\n| List cameras | `camera_capture` action=list | Discover `/dev/video*` devices |\n| Record audio | `audio_capture` action=record duration=10 | Record 10s WAV from default mic |\n| Check if mic works | `audio_capture` action=level | RMS level in dBFS |\n| Speak aloud | `audio_playback` action=speak text=\"Hello\" | TTS via espeak-ng |\n| Play a sound file | `audio_playback` action=play file=alert.wav | Play WAV/MP3/OGG |\n| Check volume | `audio_playback` action=volume | Get current volume % |\n| Set volume | `audio_playback` action=volume volume=50 | Set to 50% |\n| Scan WiFi networks | `wifi_control` action=scan | All SSIDs, signals, channels |\n| List WiFi adapters | `wifi_control` action=interfaces | Built-in + USB dongles |\n| Connect to WiFi | `wifi_control` action=connect ssid=\"MyNet\" password=\"pass\" | Join network |\n| WiFi status | `wifi_control` action=status | Current SSID, IP, signal |\n| Scan Bluetooth | `bluetooth_scan` action=scan | Classic + BLE devices |\n| List BT adapters | `bluetooth_scan` action=interfaces | HCI adapters |\n| SDR device check | `sdr_scan` action=info | RTL-SDR hardware status |\n| RF frequency sweep | `sdr_scan` action=scan start_freq=\"433M\" end_freq=\"434M\" | Signal power levels |\n| Aircraft tracking | `sdr_scan` action=adsb duration=30 | ADS-B transponder messages |\n| FM radio capture | `sdr_scan` action=fm frequency=\"98.1M\" | Record FM audio |\n| Detect Flipper Zero | `flipper_zero` action=detect | Connected Flippers |\n| Sub-GHz scan | `flipper_zero` action=subghz_scan frequency=433920000 | RF signals |\n| Read NFC tag | `flipper_zero` action=nfc_read | Tag UID, type |\n| Read RFID tag | `flipper_zero` action=rfid_read | 125kHz tag ID |\n\n**Prerequisites**: `ffmpeg`, `arecord`, `aplay`, `amixer` (ALSA utils), `espeak-ng`, `bluez` (Bluetooth). Install: `sudo apt install ffmpeg alsa-utils espeak-ng bluez`\n\n**Camera support**: USB cameras (UVC), Intel RealSense (via UVC), 360 cameras (QooCam, Ricoh Theta — raw fisheye via v4l2loopback + ffmpeg crop). The captured frame is returned as base64 JPEG that can be fed directly to the `vision` tool for analysis.\n\n**Audio workflow**: Record → transcribe → analyze: `audio_capture action=record` → `transcribe_file` → process transcript. The tools handle device enumeration and graceful degradation when hardware is unavailable.\n\n\n## Ralph Loop — Iteration-First Design\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nThe Ralph Loop is the core execution philosophy: **iteration beats perfection**. Instead of trying to get everything right on the first attempt, the agent executes in a retry loop where errors become learning data rather than session-ending failures.\n\n```\n/ralph \"fix all failing tests\" --completion \"npm test passes with 0 failures\"\n/ralph \"migrate to TypeScript\" --completion \"npx tsc --noEmit exits 0\" --max-iterations 20\n/ralph \"reach 80% coverage\" --completion \"coverage report shows >80%\" --timeout 120\n```\n\nEach iteration:\n1. **Execute** — make changes based on the task + all accumulated learnings\n2. **Verify** — run the completion command (tests, build, lint, coverage)\n3. **Learn** — if verification fails, extract what went wrong and why\n4. **Iterate** — retry with the new knowledge until passing or limits reached\n\nThe loop tracks iteration history, generates completion reports saved to `.aiwg/ralph/`, and supports resume/abort for interrupted sessions. Safety bounds (max iterations, timeout) prevent runaway loops.\n\n```\n/ralph-status     # Check current/previous loop status\n/ralph-resume     # Resume interrupted loop\n/ralph-abort      # Cancel running loop\n```\n\n\n\n\n## Task Control\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n### Pause, Stop, Resume, Destroy\n\n| Command | Behavior |\n|---------|----------|\n| `/pause` | **Gentle halt** — lets the current inference turn finish, then stops before the next turn. No new tool calls or inference will begin until `/resume`. |\n| `/stop` | **Immediate kill** — aborts the current inference mid-stream, saves task state for later resumption. |\n| `/resume` | **Continue** — resumes a paused or stopped task from where it left off. Also resumes tasks saved by `/stop` or interrupted by `/update`. |\n| `/destroy` | **Nuclear option** — aborts any active task, deletes the `.oa/` directory, clears the console, and exits to shell. |\n\n### Session Context Persistence\n\nContext is automatically saved on every task completion and preserved across `/update` restarts.\n\n```bash\n/context save      # Force-save current session context\n/context resto"
+  "readme": "<a name=\"top\"></a>\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/robit-man/openagents.nexus/main/openagents-banner.png\" alt=\"Open Agents P2P Network\" width=\"100%\" />\n</p>\n<h1 align=\"center\">Open Agents — P2P Inference</h1>\n\n<p align=\"center\">\n  <strong>AI coding agent powered entirely by open-weight models.</strong><br>\n  No API keys. No cloud. Your code never leaves your machine.\n</p>\n\n<p align=\"center\">\n  <a href=\"https://www.npmjs.com/package/open-agents-ai\"><img src=\"https://img.shields.io/npm/v/open-agents-ai?color=7C3AED&style=flat-square\" alt=\"npm version\" /></a>\n  <a href=\"https://www.npmjs.com/package/open-agents-ai\"><img src=\"https://img.shields.io/npm/dm/open-agents-ai?color=06B6D4&style=flat-square\" alt=\"npm downloads\" /></a>\n  <img src=\"https://img.shields.io/badge/license-CC--BY--NC--4.0-10B981?style=flat-square\" alt=\"license\" />\n  <img src=\"https://img.shields.io/badge/node-%3E%3D20-F59E0B?style=flat-square\" alt=\"node version\" />\n  <img src=\"https://img.shields.io/badge/models-open--weight-EC4899?style=flat-square\" alt=\"open-weight models\" />\n  <a href=\"https://x.com/intent/post?url=https%3A%2F%2Fwww.npmjs.com%2Fpackage%2Fopen-agents-ai\"><img src=\"https://img.shields.io/badge/SHARE%20ON%20X-000000?style=for-the-badge&logo=x&logoColor=white\" alt=\"Share on X\" /></a>\n</p>\n\n---\n\n```bash\nnpm i -g open-agents-ai && oa\n```\n\nAn autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and configures the optimal model with expanded context window automatically.\n\n\n## Table of Contents\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n- [The Organism, Not the Cortex](#the-organism-not-the-cortex)\n- [How It Works](#how-it-works)\n- [Features](#features)\n- [Enterprise & Headless Mode](#enterprise--headless-mode)\n- [Architecture](#architecture)\n- [Context Engineering](#context-engineering)\n- [Model-Tier Awareness](#model-tier-awareness)\n- [Live Code Knowledge Graph](#live-code-knowledge-graph)\n- [Auto-Expanding Context Window](#auto-expanding-context-window)\n- [Tools (85+)](#tools-85)\n- [Associative Memory & Cross-Modal Binding](#associative-memory--cross-modal-binding)\n- [Ralph Loop — Iteration-First Design](#ralph-loop--iteration-first-design)\n- [Task Control](#task-control)\n- [COHERE Cognitive Framework](#cohere-cognitive-framework)\n- [Context Compaction — Research-Backed Memory Management](#context-compaction--research-backed-memory-management)\n- [Personality Core — SAC Framework Style Control](#personality-core--sac-framework-style-control)\n- [Emotion Engine — Affective State Modulation](#emotion-engine--affective-state-modulation)\n- [Voice Feedback (TTS)](#voice-feedback-tts)\n- [Listen Mode — Live Bidirectional Audio](#listen-mode--live-bidirectional-audio)\n- [Vision & Desktop Automation (Moondream)](#vision--desktop-automation-moondream)\n- [Interactive TUI](#interactive-tui)\n- [Telegram Bridge — Sub-Agent Per Chat](#telegram-bridge--sub-agent-per-chat)\n- [x402 Payment Rails & Nexus P2P](#x402-payment-rails--nexus-p2p)\n- [Sponsored Inference — Share Your GPU With the World](#sponsored-inference--share-your-gpu-with-the-world)\n- [COHERE Distributed Mind](#cohere-distributed-mind)\n- [Self-Improvement & Learning](#self-improvement--learning)\n- [Dream Mode — Creative Idle Exploration](#dream-mode--creative-idle-exploration)\n- [Blessed Mode — Infinite Warm Loop](#blessed-mode--infinite-warm-loop)\n- [Docker Sandbox & Collective Intelligence](#docker-sandbox--collective-intelligence)\n- [Code Sandbox](#code-sandbox)\n- [Structured Data Tools](#structured-data-tools)\n- [Multi-Provider Web Search](#multi-provider-web-search)\n- [Task Templates](#task-templates)\n- [Human Expert Speed Ratio](#human-expert-speed-ratio)\n- [Cost Tracking & Session Metrics](#cost-tracking--session-metrics)\n- [Configuration](#configuration)\n- [Model Support](#model-support)\n- [Supported Inference Providers](#supported-inference-providers)\n- [Evaluation Suite](#evaluation-suite)\n- [AIWG Integration](#aiwg-integration)\n- [Research Citations](#research-citations)\n- [License](#license)\n\n\n\n## The Organism, Not the Cortex\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nAn LLM is a high-bandwidth associative generative core — closer to a cortex-like prior than to a complete agent. Its weights contain broad latent structure, but they do not by themselves give you situated continuity, durable task state, calibrated action policies, or grounded memory management. Open Agents treats the model as one organ inside a larger organism. The framework provides the rest: sensors, effectors, memory stores, routing, gating, evaluation, and persistence.\n\n**What the framework provides:**\n\n| Layer | Biological Analog | Implementation |\n|---|---|---|\n| Associative core | Cortex | LLM weights (any size) |\n| Current workspace | Global workspace / attention | `assembleContext()` — structured context assembly |\n| Episodic memory | Hippocampus | `.oa/memory/` — write, search, retrieve across sessions |\n| Cognitive map | Hippocampal spatial maps | `semantic-map.ts` + `repo-map.ts` (PageRank) |\n| Action gating | Basal ganglia | Tool selection policy (task-aware filtering) |\n| Temporal hierarchy | Prefrontal executive | Task decomposition, sub-agent delegation |\n| Self-model | Metacognition | Environment snapshot, process health monitoring |\n| Skill chunks | Cerebellum | Compiled tools, slash commands, verified routines |\n| Safety / limits | Autonomic / immune system | Turn limits, budgets, timeout watchdogs |\n\nDon't chase larger models. Build the organism around whatever model you have.\n\n\n\n\n## How It Works\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n```\nYou: oa \"fix the null check in auth.ts\"\n\nAgent: [Turn 1] file_read(src/auth.ts)\n       [Turn 2] grep_search(pattern=\"null\", path=\"src/auth.ts\")\n       [Turn 3] file_edit(old_string=\"if (user)\", new_string=\"if (user != null)\")\n       [Turn 4] shell(command=\"npm test\")\n       [Turn 5] task_complete(summary=\"Fixed null check — all tests pass\")\n```\n\nThe agent uses tools autonomously in a loop — reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.\n\n\n\n\n## Features\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n- **61 autonomous tools** — file I/O, shell, grep, web search/fetch/crawl, memory (read/write/search), sub-agents, background tasks, image/OCR/PDF, git, diagnostics, vision, desktop automation, browser automation, temporal agency (scheduler/reminders/agenda), structured files, code sandbox, transcription, skills, opencode delegation, cron agents, nexus P2P networking + x402 micropayments, **COHERE cognitive stack** (persistent REPL, recursive LLM calls, memory metabolism, identity kernel, reflection, exploration)\n- **Moondream vision** — see and interact with the desktop via Moondream VLM (caption, query, detect, point-and-click)\n- **Desktop automation** — vision-guided clicking: describe a UI element in natural language, the agent finds and clicks it\n- **Auto-install desktop deps** — screenshot, mouse, OCR, and image tools auto-install missing system packages (scrot, xdotool, tesseract, imagemagick) on first use\n- **Parallel tool execution** — read-only tools run concurrently via `Promise.allSettled`\n- **Sub-agent delegation** — spawn independent agents for parallel workstreams\n- **OpenCode delegation** — offload coding tasks to opencode (sst/opencode) as an autonomous sub-agent with auto-install, progress monitoring, and result evaluation\n- **Long-horizon cron agents** — schedule recurring autonomous agent tasks with goals, completion criteria, execution history, and automatic evaluation (daily code reviews, weekly dep updates, continuous monitoring)\n- **Nexus P2P networking** — decentralized agent-to-agent communication via [open-agents-nexus](https://www.npmjs.com/package/open-agents-nexus). Join rooms, discover peers, share resources, and communicate across the agent mesh with encrypted P2P transport\n- **x402 micropayments** — native x402 payment rails via open-agents-nexus@1.5.6. Agents create secp256k1/EVM wallets (AES-256-GCM encrypted, keys never exposed to LLM), register inference with USDC pricing on Base, auto-handle `payment_required`/`payment_proof` negotiation, track earnings/spending in ledger.jsonl, enforce budget policies, and sign gasless EIP-3009 transfers\n- **Inference capability proof** — benchmark local models with anti-spoofing SHA-256 hashed proofs, generate capability scorecards for peer verification\n- **Ralph Loop** — iterative task execution that keeps retrying until completion criteria are met\n- **Dream Mode** — creative idle exploration modeled after real sleep architecture (NREM→REM cycles)\n- **COHERE Cognitive Stack** — layered cognitive architecture implementing [Recursive Language Models](https://arxiv.org/abs/2512.24601), [SPRINT parallel reasoning](https://arxiv.org/abs/2506.05745), governed memory metabolism, identity kernel with continuity register, immune-system reflection, [strategy-space exploration](https://arxiv.org/abs/2603.02045), and **distributed inference mesh** — any `/cohere` participant automatically serves AND consumes inference from the network with complexity-based model routing, multi-node claim coordination, IPFS-pinned identity persistence, model exposure control, and Ollama safety hardening. See [COHERE Framework](#cohere-cognitive-framework) below\n- **Persistent Python REPL** — `repl_exec` tool maintains variables, imports, and functions across calls. Write Python code that processes data iteratively, with `llm_query()` available for recursive LLM sub-calls from within code\n- **Recursive LLM calls** — `llm_query(prompt, context)` invokes the model from inside REPL code, enabling loop-based semantic analysis of large inputs ([RLM paper](https://arxiv.org/abs/2512.24601)). `parallel_llm_query()` runs multiple calls concurrently ([SPRINT](https://arxiv.org/abs/2506.05745))\n- **Memory metabolism** — governed memory lifecycle: classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence), consolidate lessons from trajectories. Inspired by [TIMG](https://arxiv.org/abs/2603.10600) and [MemMA](https://arxiv.org/abs/2603.18718)\n- **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.oa/identity/`\n- **Reflection & integrity** — immune-system audit: diagnostic (\"what's wrong?\"), epistemic (\"what evidence is missing?\"), constitutional (\"should this change become part of self?\"). Inspired by [LEAFE](https://arxiv.org/abs/2603.16843) and [RewardHackingAgents](https://arxiv.org/abs/2603.11337)\n- **Exploration & culture** — ARCHE strategy-space exploration: generate competing hypotheses, archive successful variants, retrieve past strategies. Inspired by [SGE](https://arxiv.org/abs/2603.02045) and [Darwin Gödel Machine](https://arxiv.org/abs/2505.22954)\n- **Autoresearch Swarm** — 5-agent GPU experiment loop during REM sleep: Researcher, Monitor, Evaluator, Critic, Flow Maintainer autonomously run ML training experiments, keep improvements, discard regressions\n- **Live Listen** — bidirectional voice communication with real-time Whisper transcription\n- **Live Voice Session** — `/listen` with `/voice` enabled spawns a cloudflared tunnel with a real-time WebSocket audio endpoint. A floating presence UI shows live transcription, connected users, and audio visualization. Echo cancellation prevents TTS feedback loops\n- **Call Sub-Agent** — each WebSocket caller gets a dedicated AgenticRunner for low-latency voice-to-voice loops, with admin/public access tiers and bidirectional activity sharing with the main agent\n- **Telegram Voice** — `/voice` enabled via Telegram forwards TTS audio as voice messages alongside text responses. Incoming voice messages are auto-transcribed and handled as text\n- **Neural TTS** — hear what the agent is doing via GLaDOS, Overwatch, Kokoro, or LuxTTS voice clone, with literature-grounded narration engine (sNeuron-TST structure rotation, Moshi ring buffer dedup, UDDETTS emotion-driven prosody, SEST metadata, LuxTTS flow-matching voice cloning)\n- **Personality Core** — SAC framework-based style control (concise/balanced/verbose/pedagogical) that shapes agent response depth, voice expressiveness, and system prompt behavior\n- **Human expert speed ratio** — real-time `Exp: Nx` gauge comparing agent speed to a leading human expert, calibrated across 47 tool baselines\n- **Cost tracking** — real-time token cost estimation for 15+ cloud providers\n- **Work evaluation** — LLM-as-judge scoring with task-type-specific rubrics\n- **Session metrics** — track turns, tool calls, tokens, files modified, tasks completed per session\n- **Structured file generation** — create CSV, TSV, JSON, Markdown tables, and Excel-compatible files\n- **Code sandbox** — isolated code execution in subprocess or Docker (JS, Python, Bash, TypeScript)\n- **Structured file reading** — parse CSV, TSV, JSON, Markdown tables with binary format detection\n- **Multi-provider web search** — DuckDuckGo (free), Tavily (structured), Jina AI (markdown) with auto-detection\n- **Browser automation** — headless Chrome control via Selenium: navigate, click, type, screenshot, read DOM — auto-starts on first use with self-bootstrapping Python venv\n- **Temporal agency** — schedule future tasks via OS cron, set cross-session reminders, flag attention items — startup injection surfaces due items automatically\n- **Web crawling** — multi-page web scraping with Crawlee/Playwright for deep documentation extraction\n- **Task templates** — specialized system prompts and tool recommendations for code, document, analysis, plan tasks\n- **Inference capability scoring** — canirun.ai-style hardware assessment at first launch: memory/compute/speed scores, per-model compatibility matrix, recommended model selection\n- **Auto-install everything** — first-run wizard auto-installs Ollama, curl, Python3, python3-venv with platform-aware package managers (apt, dnf, yum, pacman, apk, zypper, brew)\n- **Sponsored inference** — `/sponsor` walks through a 5-step wizard to share your GPU with the world: select endpoints, choose banner animation (8 presets + AI-generated custom), set header message/links, configure transport (cloudflared/libp2p) + rate limits, and go live. Consumers discover sponsors via `/endpoint sponsor`. Secure proxy relay with per-IP rate limiting, daily token budgets, model allowlist, and concurrent request caps. Sponsor's raw API URL is never exposed. See [Sponsored Inference](#sponsored-inference--share-your-gpu-with-the-world) below\n- **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across OA restarts\n- **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{OA_VAR_*}}`), trust tiers (LOCAL/TEE/VERIFIED/PUBLIC), WebSocket peer mesh, and inference routing with automatic secret redaction/injection\n- **Secret vault** — `/secrets` manages API keys and credentials with AES-256-GCM encrypted persistence; secrets are automatically redacted before sending to untrusted inference peers and re-injected on response\n- **Auto-expanding context** — detects RAM/VRAM and creates an optimized model variant on first run\n- **Mid-task steering** — type while the agent works to add context without interrupting\n- **Smart compaction** — 6 context compaction strategies (default, aggressive, decisions, errors, summary, structured) with ARC-inspired active context revision ([arXiv:2601.12030](https://arxiv.org/abs/2601.12030)) that preserves structural file content through compaction, preventing small-model repetitive loops at the root cause\n- **Memex experience archive** — large tool outputs archived during compaction with hash-based retrieval\n- **Persistent memory** — learned patterns stored in `.oa/memory/` across sessions\n- **Structured procedural memory (SQLite)** — replaces flat JSON with a full relational database: CRUD with soft-delete, revision tracking, embedding storage (float32 BLOB), bidirectional memory linking with confidence scores. Inspired by [ExpeL](https://arxiv.org/abs/2308.10144) (contrastive extraction) and [TIMG](https://arxiv.org/abs/2603.10600) (structured procedural format). 79 unit tests\n- **Semantic memory search** — vector embeddings via [Ollama /api/embed](https://ollama.com) (nomic-embed-text, 768-dim) with cosine similarity search over stored memories. Auto-generates embeddings on memory creation. Auto-links related memories when similarity > 0.6. Graceful fallback to text search when Ollama unavailable\n- **LLM-based memory extraction** — post-task, the LLM itself extracts structured procedural memories (CATEGORY/TRIGGER/LESSON/STEPS) instead of copying raw error text verbatim. Based on [ExpeL](https://arxiv.org/abs/2308.10144) and [AWM](https://arxiv.org/abs/2409.07429) patterns\n- **IPFS content-addressed storage** — [Helia](https://helia.io/) IPFS node with blockstore-fs for persistent content pinning. Real CID generation (`bafk...`), cross-node content resolution, and SHA-256 fallback when Helia unavailable. Verified: store→CID→retrieve round-trip test passes\n- **IPFS sharing surface** — `/ipfs` status page with peer info + identity kernel metrics + memory sentiment. `/ipfs pin <CID>` to pin remote agent content. `/ipfs publish` to share identity kernel. `/ipfs share tool/skill` to publish agent-created tools with secret stripping. `/ipfs import <CID>` to retrieve shared content\n- **Fortemi-React bridge** — `/fortemi start/status/stop` connects to [fortemi-react](https://github.com/robit-man/fortemi-react) (browser-first PGlite+pgvector knowledge system) via JWT auth. Proxy tools: `fortemi_capture`, `fortemi_search`, `fortemi_list`, `fortemi_get` auto-register when bridge is connected\n- **Content ingestion** — `/ingest <file>` imports audio (transcribe via Whisper), PDF (pdftotext), or text files into structured memory with 800-char/100-overlap chunking (matches fortemi pattern)\n- **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.oa/images/`\n- **Node visualization** — [openagents.nexus](https://github.com/robit-man/openagents.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents\n- **TTS sanitizer** — strips markdown syntax (`##`, `**`, `` ` ``), emoji (prevents \"white heavy checkmark\"), box-drawing chars, and ANSI codes before feeding to ALL TTS engines\n- **LuxTTS gapless playback** — look-ahead pre-synthesis pipeline: next chunk synthesizes while current plays, eliminating inter-sentence gaps. Jetson ARM support with NVIDIA's prebuilt PyTorch wheel\n- **Unified color scheme** — `ui.primary` (252), `ui.error` (198/magenta), `ui.warn` (214/orange), `ui.accent` (178/yellow) applied consistently across all TUI surfaces\n- **Clickable header buttons** — `help`, `voice`, `cohere`, `model` buttons on banner row 3 with hover/click visual states. OSC 8 hyperlinks for pointer cursor. Mouse click fires the slash command directly\n- **Dynamic terminal title** — updates with current task + version: `\"fix auth bug · OA v0.141.0\"`\n- **Session context persistence** — auto-saves context on task completion, manual `/context save|restore` across sessions\n- **Self-learning** — auto-fetches docs from the web when encountering unfamiliar APIs\n- **Seamless `/update`** — in-place update and reload with automatic context save/restore\n- **Blessed mode** — `/full-send-bless` infinite warm loop keeps model weights in VRAM, auto-cycles tasks, never exits until you say stop\n- **Telegram bridge** — `/telegram --key <token> --admin <userid>` public ingress/egress with admin filter and mandatory safety filter; bare `/telegram` toggles the service watchdog\n- **Task control** — `/pause` (gentle halt at turn boundary), `/stop` (immediate kill), `/resume` to continue\n- **Model-tier awareness** — dynamic tool sets, prompt complexity, and context limits scale with model size (small/medium/large)\n\n### Support Development\n\nIf you like this repository — whether you're an agent or a human — if this expands your consciousness and capabilities, please consider donating to support further development and of course token costs.\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/ethereum-eth-logo.svg\" width=\"20\" height=\"20\" alt=\"ETH\" /> <strong>ETH</strong></p>\n\n```bash\n0x81Ce81F0B6B5928E15d3a2850F913C88D07051ec\n```\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/bitcoin-btc-logo.svg\" width=\"20\" height=\"20\" alt=\"BTC\" /> <strong>BTC</strong></p>\n\n```bash\nbc1qlptj5wz8xj6dp5w4pw62s5kt7ct6w8k57w39ak\n```\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/solana-sol-logo.svg\" width=\"20\" height=\"20\" alt=\"SOL\" /> <strong>SOL</strong></p>\n\n```bash\nD8AgCTrxpDKD5meJ2bpAfVwcST3NF3EPuy9xczYycnXn\n```\n\n<p align=\"center\"><img src=\"https://cryptologos.cc/logos/polygon-matic-logo.svg\" width=\"20\" height=\"20\" alt=\"POL\" /> <strong>POL</strong></p>\n\n```bash\n0x81Ce81F0B6B5928E15d3a2850F913C88D07051ec\n```\n\n\n\n\n## Enterprise & Headless Mode\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nRun Open Agents as a headless service for CI/CD pipelines, automation, and enterprise deployments.\n\n### Non-Interactive Mode\n\n```bash\noa \"fix all lint errors\" --non-interactive    # Run task, exit when done\noa \"generate API docs\" --json                 # Structured JSON output (no ANSI)\noa \"run security audit\" --background          # Detached background job\n```\n\n### Background Jobs\n\n```bash\noa \"migrate database\" --background            # Returns job ID immediately\noa status job-abc123                          # Check job progress\noa jobs                                       # List all running/completed jobs\n```\n\nJobs run as detached processes — survive terminal disconnection. Output saved to `.oa/jobs/{id}.json`.\n\n### JSON Output Mode\n\nWith `--json`, all output is structured NDJSON:\n```json\n{\"type\":\"tool_call\",\"tool\":\"file_edit\",\"args\":{\"path\":\"src/api.ts\"},\"timestamp\":\"...\"}\n{\"type\":\"tool_result\",\"tool\":\"file_edit\",\"result\":\"OK\",\"timestamp\":\"...\"}\n{\"type\":\"task_complete\",\"summary\":\"Fixed 3 lint errors\",\"timestamp\":\"...\"}\n```\n\nPipe to `jq`, ingest into monitoring systems, or feed to other agents.\n\n### Process Management\n\n```bash\n/destroy processes              # Kill orphaned OA processes (local project)\n/destroy processes --global     # Kill ALL orphaned OA processes system-wide\n```\n\nShows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale OA instances.\n\n### REST API Service (Port 11435)\n\nOpen Agents runs a persistent REST API — like Ollama's `/api/` surface but with agentic task execution, OpenAI compatibility, and full TUI command access.\n\n```bash\noa serve                                              # Start on default port 11435\noa serve --port 9999                                   # Custom port\nOA_API_KEY=mysecret oa serve                           # Single admin key\nOA_API_KEYS=\"key1:admin:alice,key2:run:ci,key3:read:grafana\" oa serve  # Scoped multi-key\n```\n\n#### Working Directory\n\nPass `X-Working-Directory` header to run commands in your current terminal directory:\n\n```bash\n# Auto-inject current dir — agent operates on YOUR project, not the server's cwd\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"X-Working-Directory: $(pwd)\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"fix all lint errors\"}'\n```\n\nOr set it in the JSON body: `\"working_directory\": \"/path/to/project\"`\n\n#### Health & Observability\n\n```bash\n# Liveness\ncurl http://localhost:11435/health\n```\n```json\n{\"status\":\"ok\",\"uptime_s\":142,\"version\":\"0.184.33\"}\n```\n\n```bash\n# Readiness (probes Ollama backend)\ncurl http://localhost:11435/health/ready\n```\n```json\n{\"status\":\"ready\",\"ollama\":\"reachable\"}\n```\n\n```bash\n# Version info\ncurl http://localhost:11435/version\n```\n```json\n{\"version\":\"0.184.33\",\"node\":\"v24.14.0\",\"platform\":\"linux\"}\n```\n\n```bash\n# Prometheus metrics (scrape with Grafana/Prometheus)\ncurl http://localhost:11435/metrics\n```\n```\n# HELP oa_requests_total Total HTTP requests\n# TYPE oa_requests_total counter\noa_requests_total{method=\"POST\",path=\"/v1/chat/completions\",status=\"200\"} 47\noa_tokens_in_total 12450\noa_tokens_out_total 8230\noa_errors_total 0\n```\n\n#### OpenAI-Compatible Inference\n\nDrop-in replacement for any OpenAI client library. Change `api.openai.com` → `localhost:11435`.\n\n```bash\n# List models\ncurl http://localhost:11435/v1/models\n```\n```json\n{\"object\":\"list\",\"data\":[{\"id\":\"qwen3.5:9b\",\"object\":\"model\",\"created\":0,\"owned_by\":\"local\"},{\"id\":\"qwen3.5:4b\",\"object\":\"model\",...}]}\n```\n\n```bash\n# Chat completion (non-streaming)\ncurl -X POST http://localhost:11435/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"qwen3.5:9b\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"What is 2+2?\"}]\n  }'\n```\n```json\n{\n  \"id\": \"chatcmpl-a1b2c3d4e5f6\",\n  \"object\": \"chat.completion\",\n  \"model\": \"qwen3.5:9b\",\n  \"choices\": [{\n    \"index\": 0,\n    \"message\": {\"role\": \"assistant\", \"content\": \"4\"},\n    \"finish_reason\": \"stop\"\n  }],\n  \"usage\": {\"prompt_tokens\": 25, \"completion_tokens\": 2, \"total_tokens\": 27}\n}\n```\n\n```bash\n# Chat completion (SSE streaming)\ncurl -N -X POST http://localhost:11435/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"qwen3.5:9b\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}],\"stream\":true}'\n```\n```\ndata: {\"id\":\"chatcmpl-...\",\"choices\":[{\"delta\":{\"role\":\"assistant\",\"content\":\"Hi\"}}]}\ndata: {\"id\":\"chatcmpl-...\",\"choices\":[{\"delta\":{\"content\":\" there!\"}}]}\ndata: {\"id\":\"chatcmpl-...\",\"choices\":[{\"delta\":{},\"finish_reason\":\"stop\"}]}\ndata: [DONE]\n```\n\n#### Agentic Task Execution\n\nThe unique OA capability — submit a coding task and get an autonomous agent loop.\n\n```bash\n# Run task in your current directory\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Content-Type: application/json\" \\\n  -H \"X-Working-Directory: $(pwd)\" \\\n  -d '{\n    \"task\": \"fix all TypeScript errors in src/\",\n    \"model\": \"qwen3.5:9b\",\n    \"max_turns\": 25,\n    \"stream\": true\n  }'\n```\n```\ndata: {\"type\":\"run_started\",\"run_id\":\"job-a1b2c3\",\"pid\":12345}\ndata: {\"type\":\"stdout\",\"data\":\"{\\\"turn\\\":1,\\\"tool\\\":\\\"file_read\\\",...}\"}\ndata: {\"type\":\"stdout\",\"data\":\"{\\\"turn\\\":2,\\\"tool\\\":\\\"file_edit\\\",...}\"}\ndata: {\"type\":\"exit\",\"code\":0}\ndata: [DONE]\n```\n\n```bash\n# Run in isolated sandbox (temp workspace, safe for untrusted tasks)\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"write a hello world app\",\"isolate\":true}'\n```\n\n```bash\n# List all runs\ncurl http://localhost:11435/v1/runs\n```\n```json\n{\"runs\":[{\"id\":\"job-a1b2c3\",\"task\":\"fix TypeScript errors\",\"status\":\"completed\",\"startedAt\":\"...\"}]}\n```\n\n```bash\n# Get specific run status\ncurl http://localhost:11435/v1/runs/job-a1b2c3\n```\n\n```bash\n# Abort a running task\ncurl -X DELETE http://localhost:11435/v1/runs/job-a1b2c3\n```\n```json\n{\"status\":\"aborted\",\"run_id\":\"job-a1b2c3\"}\n```\n\n#### Configuration\n\n```bash\n# Get all config\ncurl http://localhost:11435/v1/config\n```\n```json\n{\"config\":{\"backendUrl\":\"http://127.0.0.1:11434\",\"model\":\"qwen3.5:122b\",\"backendType\":\"ollama\",...}}\n```\n\n```bash\n# Get current model\ncurl http://localhost:11435/v1/config/model\n```\n```json\n{\"model\":\"qwen3.5:122b\"}\n```\n\n```bash\n# Switch model\ncurl -X PUT http://localhost:11435/v1/config/model \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"qwen3.5:27b\"}'\n```\n```json\n{\"model\":\"qwen3.5:27b\",\"status\":\"updated\"}\n```\n\n```bash\n# Get endpoint\ncurl http://localhost:11435/v1/config/endpoint\n```\n```json\n{\"url\":\"http://127.0.0.1:11434\",\"backendType\":\"ollama\",\"auth\":\"none\"}\n```\n\n```bash\n# Switch endpoint (e.g., to Chutes AI)\ncurl -X PUT http://localhost:11435/v1/config/endpoint \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"url\":\"https://llm.chutes.ai\",\"auth\":\"Bearer cpk_...\"}'\n```\n\n```bash\n# Update settings (admin scope required)\ncurl -X PATCH http://localhost:11435/v1/config \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"verbose\":true}'\n```\n```json\n{\"config\":{...},\"updated\":[\"verbose\"]}\n```\n\n#### Slash Commands via REST\n\nEvery `/command` from the TUI is available as a REST endpoint.\n\n```bash\n# List all available commands\ncurl http://localhost:11435/v1/commands\n```\n```json\n{\"commands\":[{\"command\":\"/help\",\"description\":\"Show help\"},{\"command\":\"/stats\",\"description\":\"Session metrics\"},...]}\n```\n\n```bash\n# Execute /stats\ncurl -X POST http://localhost:11435/v1/commands/stats\n```\n\n```bash\n# Execute /nexus status\ncurl -X POST http://localhost:11435/v1/commands/nexus \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"args\":\"status\"}'\n```\n\n```bash\n# Execute /destroy processes --global\ncurl -X POST http://localhost:11435/v1/commands/destroy \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"args\":\"processes --global\"}'\n```\n\n#### Auth Scopes\n\n```bash\n# Multi-key setup: read (monitoring), run (CI), admin (ops)\nOA_API_KEYS=\"grafana-key:read:grafana,ci-key:run:github-actions,ops-key:admin:ops-team\" oa serve\n```\n\n| Scope | Can do | Cannot do |\n|-------|--------|-----------|\n| `read` | GET /v1/models, /v1/config, /v1/runs, /v1/commands | POST /v1/run, PATCH /v1/config |\n| `run` | Everything in `read` + POST /v1/run, POST /v1/commands | PATCH /v1/config, PUT endpoints |\n| `admin` | Everything | — |\n\n```bash\n# With auth\ncurl -H \"Authorization: Bearer ops-key\" http://localhost:11435/v1/models\n```\n\n#### Tool-Use Profiles\n\nEnterprise access control — define which tools, shell commands, and settings the agent can use per API key or per request.\n\n**3 built-in presets:**\n\n| Profile | Description | Tools |\n|---------|-------------|-------|\n| `full` | No restrictions | All tools and commands |\n| `ci-safe` | CI/CD — read + test only | file_read, grep, shell (npm test only) |\n| `readonly` | Read-only analysis | No writes, no shell mutations |\n\n```bash\n# List all profiles (presets + custom)\ncurl -H \"Authorization: Bearer $KEY\" http://localhost:11435/v1/profiles\n```\n```json\n{\"profiles\":[{\"name\":\"readonly\",\"description\":\"Read-only\",\"encrypted\":false,\"source\":\"preset\"},{\"name\":\"ci-safe\",...}]}\n```\n\n```bash\n# Get profile details\ncurl -H \"Authorization: Bearer $KEY\" http://localhost:11435/v1/profiles/ci-safe\n```\n```json\n{\"profile\":{\"name\":\"ci-safe\",\"tools\":{\"allow\":[\"file_read\",\"grep_search\",\"shell\"],\"shell_allow\":[\"npm test\",\"npx eslint\"]},\"limits\":{\"max_turns\":15}}}\n```\n\n```bash\n# Create custom profile (admin only)\ncurl -X POST http://localhost:11435/v1/profiles \\\n  -H \"Authorization: Bearer $ADMIN_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"name\": \"frontend-dev\",\n    \"description\": \"Frontend team — no backend access\",\n    \"tools\": {\n      \"allow\": [\"file_read\", \"file_write\", \"file_edit\", \"shell\", \"grep_search\"],\n      \"shell_deny\": [\"rm -rf\", \"sudo\", \"docker\", \"kubectl\"]\n    },\n    \"commands\": { \"deny\": [\"destroy\", \"expose\", \"sponsor\"] },\n    \"limits\": { \"max_turns\": 20, \"timeout_s\": 300 }\n  }'\n```\n\n```bash\n# Create password-protected profile (AES-256-GCM encrypted)\ncurl -X POST http://localhost:11435/v1/profiles \\\n  -H \"Authorization: Bearer $ADMIN_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"name\":\"prod-ops\",\"password\":\"s3cret\",\"tools\":{\"deny\":[\"file_write\"]}}'\n```\n\n```bash\n# Use a profile with /v1/run (header or body)\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Authorization: Bearer $KEY\" \\\n  -H \"X-Tool-Profile: ci-safe\" \\\n  -H \"X-Working-Directory: $(pwd)\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"run the test suite and report failures\"}'\n\n# Or in the body:\ncurl -X POST http://localhost:11435/v1/run \\\n  -H \"Authorization: Bearer $KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"task\":\"analyze code quality\",\"profile\":\"readonly\"}'\n```\n\n```bash\n# Load encrypted profile (password in header)\ncurl -H \"Authorization: Bearer $KEY\" \\\n  -H \"X-Profile-Password: s3cret\" \\\n  http://localhost:11435/v1/profiles/prod-ops\n```\n\n```bash\n# Delete a custom profile (admin only, presets cannot be deleted)\ncurl -X DELETE -H \"Authorization: Bearer $ADMIN_KEY\" \\\n  http://localhost:11435/v1/profiles/frontend-dev\n```\n\n#### Endpoint Reference\n\n| Method | Path | Auth | Description |\n|--------|------|------|-------------|\n| GET | `/health` | none | Liveness probe |\n| GET | `/health/ready` | none | Readiness (probes Ollama) |\n| GET | `/health/startup` | none | Startup complete |\n| GET | `/version` | none | Version + platform |\n| GET | `/metrics` | none | Prometheus counters |\n| GET | `/v1/models` | read | List models (OpenAI format) |\n| POST | `/v1/chat/completions` | run | Chat inference (stream + sync) |\n| POST | `/v1/embeddings` | run | Generate embeddings |\n| POST | `/v1/chat` | run | Stateful chat with full tool access (sessions, context, memory) |\n| GET | `/v1/chat/sessions` | read | List active chat sessions |\n| GET | `/v1/system` | none | GPU/RAM/CPU info + model recommendations |\n| GET | `/v1/audit` | read | Query audit log (since, user, limit filters) |\n| GET | `/openapi.json` | none | OpenAPI 3.0 specification |\n| GET | `/docs` | none | Swagger UI (interactive API docs) |\n| POST | `/v1/run` | run | Submit agentic task |\n| GET | `/v1/runs` | read | List all runs |\n| GET | `/v1/runs/:id` | read | Run status |\n| DELETE | `/v1/runs/:id` | run | Abort run |\n| GET | `/v1/config` | read | All settings |\n| PATCH | `/v1/config` | admin | Update settings |\n| GET | `/v1/config/model` | read | Current model |\n| PUT | `/v1/config/model` | admin | Switch model |\n| GET | `/v1/config/endpoint` | read | Current endpoint |\n| PUT | `/v1/config/endpoint` | admin | Switch endpoint |\n| GET | `/v1/commands` | read | List commands |\n| POST | `/v1/commands/:cmd` | run | Execute command |\n| GET | `/v1/profiles` | read | List all profiles (presets + custom) |\n| GET | `/v1/profiles/:name` | read | Get profile details (X-Profile-Password for encrypted) |\n| POST | `/v1/profiles` | admin | Create/update profile (password field for encryption) |\n| DELETE | `/v1/profiles/:name` | admin | Delete custom profile |\n\n#### Stateful Chat — `/v1/chat`\n\nUnlike `/v1/chat/completions` (raw Ollama proxy), `/v1/chat` spawns the full OA agent with all 61 tools for each message. The agent can search the web, read files, run shell commands, and use memory — exactly like the TUI.\n\n```bash\n# Send a chat message (full tool access)\ncurl -s http://localhost:11435/v1/chat \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"message\": \"What is happening in the world today?\", \"model\": \"qwen3.5:9b\", \"stream\": false}'\n\n# Response: {\"session_id\": \"abc123\", \"message\": {\"role\": \"assistant\", \"content\": \"...\"}}\n```\n\n**Request body:**\n```json\n{\n  \"message\": \"What is happening in the world?\",\n  \"model\": \"qwen3.5:9b\",\n  \"session_id\": \"optional-uuid-from-previous-response\",\n  \"stream\": true,\n  \"max_tokens\": 4096\n}\n```\n\n**Response (non-streaming):**\n```json\n{\n  \"session_id\": \"abc123-def4-5678-ghij-klmnopqrstuv\",\n  \"message\": {\n    \"role\": \"assistant\",\n    \"content\": \"Here are the major events happening today...\"\n  }\n}\n```\n\n**Response (streaming `stream: true`):** Server-Sent Events:\n```\ndata: {\"type\":\"tool_call\",\"tool\":\"web_search\",\"args\":{\"query\":\"world news today\"}}\ndata: {\"type\":\"tool_result\",\"output\":\"Top results: ...\"}\ndata: {\"id\":\"chatcmpl-abc\",\"object\":\"chat.completion.chunk\",\"choices\":[{\"delta\":{\"content\":\"Based on...\"}}]}\ndata: {\"type\":\"complete\",\"turns\":\"3\",\"tokens\":\"12,450\",\"duration\":8500}\ndata: [DONE]\n```\n\n**Session management:** Each chat message returns a `session_id`. Send it back to maintain conversation context across turns:\n\n```bash\ncurl -s http://localhost:11435/v1/chat \\\n  -d '{\"session_id\": \"abc123\", \"message\": \"Tell me more about that\", \"model\": \"qwen3.5:9b\", \"stream\": false}'\n```\n\nSessions expire after 30 minutes of inactivity. List active sessions: `GET /v1/chat/sessions`.\n\n**Streaming:** Set `\"stream\": true` for Server-Sent Events with tool call visualization and incremental content.\n\n#### Web Interface\n\nOpen `http://localhost:11435/` in a browser when `oa serve` is running. Zero external dependencies — single self-contained HTML page.\n\n**Tabs:**\n- **Chat** — Conversational interface using `/v1/chat` with full tool access, session persistence, streaming responses, and collapsible tool call dropdowns\n- **Agent** — Submit agentic tasks via `/v1/run`, profile selection, live SSE event stream, abort button\n- **Dashboard** — System health (GPU, RAM, uptime), per-provider token usage (persistent across restarts), active process monitor, job history with pagination\n- **Config** — Server settings table, model switcher, endpoint manager (add/change inference providers), profile list\n- **Activity** — Real-time audit log feed with color-coded status codes\n\n**Design:** Dark theme (#1a1a1e background, #b2920a gold accent, SF Mono font) matching the TUI and /call voice interface. Mobile responsive with CSS media queries.\n\n**Features:**\n- Model picker populated from `/v1/models`\n- API key support (stored in localStorage)\n- System prompt (collapsible textarea)\n- Markdown rendering with code block copy buttons\n- Docker sandbox toggle (native vs container execution)\n- Workspace sidebar (toggleable file tree)\n- Token counter per conversation\n- Conversation export (Markdown or JSON)\n- GPU/VRAM detection with model compatibility recommendations\n- Per-provider token tracking (persisted to `.oa/usage/token-usage.json`)\n\n### Enterprise Licensing\n\nFree for non-commercial use under CC-BY-NC-4.0. For enterprise/commercial licensing, contact [zoomerconsulting.com](https://zoomerconsulting.com).\n\n\n\n\n## Architecture\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nThe core is `AgenticRunner` — a multi-turn tool-calling loop with structured context assembly:\n\n```\nUser task → assembleContext(c_instr, c_state, c_know) → LLM → tool_calls → Execute → Feed results → LLM\n                                                                ↓                                      ↑\n                                                          Compaction check ─── Memex archive ─── Context restore\n                                                                (repeat until task_complete or max turns)\n```\n\n- **Context-first** — structured context assembly (C = A equation) replaces ad-hoc prompt construction\n- **Tool-first** — the model explores via tools, not pre-stuffed context\n- **Iterative** — tests, sees failures, fixes them\n- **Parallel-safe** — read-only tools concurrent, mutating tools sequential\n- **Observable** — every tool call, context composition, and result emitted as a real-time event\n- **Bounded** — max turns, timeout, output limits prevent runaway loops\n- **Context-aware** — dynamic compaction, Memex archiving, session persistence, model-tier scaling\n- **Brute-force** — optional auto re-engagement when turn limit is hit (keeps going until task_complete or user abort)\n\n\n\n\n## Context Engineering\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nThe agent implements structured context assembly based on current research in context engineering, modular prompt optimization, and instruction hierarchy:\n\n```\nC = A(c_instr, c_know, c_tools, c_mem, c_state, c_query)\n```\n\n| Component | Priority | Description |\n|-----------|----------|-------------|\n| `c_instr` | P0 (highest) | Core system instructions — immutable, cannot be overridden |\n| `c_state` | P10 | Personality profile, session state |\n| `c_know` | P20 | Dynamic project context, retrieved knowledge |\n| `c_retrieval` | P20 | Task-specific retrieval (RRF-fused lexical + semantic + graph expansion) |\n| `c_graph` | P20 | Live code knowledge graph (PageRank-ranked symbols, community summaries) |\n| `c_plan` | P20 | Plan skeleton (completed/current/pending steps, re-injected every turn) |\n| `c_tools` | P30 (lowest) | Tool outputs — may contain untrusted content |\n\nKey design decisions grounded in research:\n\n- **Instruction hierarchy** — 4-tier priority system (P0/P10/P20/P30) prevents prompt injection from tool outputs overriding system rules. Implemented across all 3 prompt tiers (large/medium/small) with model-appropriate verbosity\n- **Live code knowledge graph** — SQLite-backed graph (files/symbols/edges) auto-updates via filesystem watcher and post-edit hooks. PageRank-ranked symbols injected into every prompt. Louvain community detection compresses 1M+ LOC repos into ~200 navigable clusters. Research: [Codebase-Memory](https://arxiv.org/abs/2603.27277), [FastCode](https://arxiv.org/abs/2603.01012), [Stack Graphs](https://arxiv.org/abs/2211.01224)\n- **Plan-skeleton re-injection** — every turn includes a compact `[done/current/pending]` plan derived from task state, preventing goal drift in multi-step tasks. Research: [ReCAP](https://arxiv.org/abs/2510.23822) (+32% on multi-step tasks)\n- **Retrieval-augmented context** — Reciprocal Rank Fusion merges lexical search, semantic search, and graph expansion into a single ranked result set. Token-budgeted snippet packing ensures relevant code reaches the model without overflow\n- **Proactive quality guidance** — instead of banning tools after repeated use, the agent receives contextual next-step suggestions appended to tool output, preserving tool availability while steering toward productive actions\n- **Tiered system prompts** — large (>=30B), medium (8-29B), and small (<=7B) models get appropriately sized instruction sets, balancing capability with context budget\n- **Context composition tracing** — every context assembly emits a structured event showing section labels and token estimates for eval observability\n\nResearch provenance: grounded in \"A Survey of Context Engineering for LLMs\" (context assembly equation), \"Modular Prompt Optimization\" (section-local textual gradients), \"Reasoning Up the Instruction Ladder\" (priority hierarchy), \"GEPA\" (reflective prompt evolution), \"Prompt Flow Integrity\" (least-privilege context passing), [RepoMaster](https://arxiv.org/abs/2505.21577) (8K token budget validation), and [RIG](https://arxiv.org/abs/2601.10112) (flat graph format).\n\n\n\n\n## Model-Tier Awareness\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nOpen Agents classifies models into three tiers and adapts its behavior accordingly:\n\n| Tier | Parameters | Base Tools | System Prompt | Compaction |\n|------|-----------|------------|---------------|------------|\n| **Large** (>=30B) | 70B, 122B | All 67 tools | Full | 75% of context window |\n| **Medium** (8-29B) | 9B, 27B | 15 core + task-relevant | Condensed | 70% of context window |\n| **Small** (<=7B) | 4B, 1.5B | 6 base + explore_tools | Minimal + scaffolding | 65% of context window |\n\n### Small Model Optimization (Research-Backed)\n\nSmall models (4B-7B) receive 10+ optimizations that larger models don't need, each backed by published research:\n\n| Optimization | Research Basis | Impact |\n|-------------|---------------|--------|\n| **Plan-skeleton re-injection** | [ReCAP](https://arxiv.org/abs/2510.23822) (NeurIPS 2025) | +32% multi-step task completion |\n| **Goal re-injection after compaction** | [Lost in the Middle](https://arxiv.org/abs/2307.03172) | Prevents #1 cause of drift |\n| **Decomposition guidance** | [ReCode](https://arxiv.org/abs/2510.23564) | +20.9% for 7B, zero training cost |\n| **Structured error recovery** | [Polaris](https://arxiv.org/abs/2603.23129) | Actionable [RECOVERY] guidance per error type |\n| **LATS pivot directive** | [LATS](https://arxiv.org/abs/2310.04406) (ICML 2024) | Forces approach change after consecutive failures |\n| **Self-consistency voting** | [SRLM](https://arxiv.org/abs/2603.15653) | +22% via K-alternative majority voting (opt-in) |\n| **Tier-adaptive compaction** | [Codebase-Memory](https://arxiv.org/abs/2603.27277) | Context budget scales per tier, not hardcoded |\n| **Tool deferral** | [EASYTOOL](https://arxiv.org/abs/2401.06201), [Gorilla](https://arxiv.org/abs/2305.15334) | 60-80% tool token reduction via search |\n| **Best-of-N execution** | [SWE-RM](https://arxiv.org/abs/2512.21919) | +7-10 pts via N independent attempts (opt-in) |\n| **Recursive sub-agents** | [RLM](https://arxiv.org/abs/2512.24601), [Yang/Srebro](https://arxiv.org/abs/2603.02112) | Depth-tracked delegation (max 3), 100x effective context |\n\n**Eval-verified result:** A 4B model completes a hard multi-file refactoring task in 20 turns (down from 25 before these optimizations) and passes 92% of core eval tasks.\n\n### Tool Nesting for Small Models\n\nSmall models use an **explore_tools** meta-tool pattern inspired by hierarchical API retrieval research ([ToolLLM](https://arxiv.org/abs/2307.16789)). Instead of presenting all 64+ tools (which overwhelms small context windows), only core tools are loaded initially. The agent calls `explore_tools()` to discover additional capabilities, then activates specific tools as needed. This reduces tool schema tokens by ~80% while preserving access to the full toolset.\n\n### Dynamic Context Limits\n\nAll context-dependent values scale automatically with the actual context window size:\n\n| Setting | How It Scales |\n|---------|---------------|\n| Compaction threshold | min(tier default, 75% of context window) |\n| Recent messages kept | 1 message per 2-4K of context (tier-dependent) |\n| Max output tokens | 25% of context window (min 2048) |\n| Tool output cap | 2K-8K chars (scales with context) |\n| File read limits | 80-120 line cap for small/medium context windows |\n\n\n\n\n## Live Code Knowledge Graph\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nOpen Agents builds and maintains a **persistent, auto-updating knowledge graph** of the codebase that scales from small projects to repositories with 1M+ lines of code.\n\n### How It Works\n\n```\nSource files  ──>  Regex symbol extraction  ──>  SQLite graph DB (.oa/index/code-graph.db)\n     |                                                    |\n     |  fs.watch() + debounce ──>  File hash check  ──>  Incremental re-index (per file)\n     |                                                    |\n     └── post-edit hook (file_write/edit) ─────────────>  Instant re-index of modified files\n```\n\n1. **Symbol extraction** parses every source file for functions, classes, types, interfaces, exports, and constants\n2. **Import graph** traces dependency relationships (which file imports which)\n3. **PageRank scoring** ranks files by how many other files depend on them\n4. **Community detection** (Louvain-inspired) groups related files into logical modules with summaries\n5. **Auto-update** via filesystem watcher and post-tool-edit hooks keeps the graph fresh as code changes\n\n### What the Agent Sees\n\nEach turn, the agent receives a compact graph summary (500-1500 tokens depending on model tier) showing:\n- The most important files ranked by cross-reference count\n- Their exported symbols (functions, classes, types)\n- Import relationships (what depends on what)\n\nFor 1M+ LOC codebases, the Louvain community compression reduces 50K+ symbols into ~200 navigable module summaries, each with a name and key exports.\n\n### Graph Tools\n\n| Tool | What It Does |\n|------|-------------|\n| `repo_map` | PageRank-sorted codebase skeleton with token budget control |\n| `import_graph` | Show dependencies, dependents, and 1-hop transitive connections for any file |\n| `semantic_map` | Agent-curated notes, hotspot tracking, and file relationships across sessions |\n| `codebase_map` | High-level structural overview (directories, language breakdown) |\n| `file_explore` | Chunked exploration with overview/outline/search/chunk strategies |\n\n### Storage\n\nThe graph persists in `.oa/index/code-graph.db` (SQLite with WAL mode) across sessions. Incremental updates mean editing a single file costs <50ms regardless of codebase size.\n\n### Research Basis\n\n- [Codebase-Memory](https://arxiv.org/abs/2603.27277) (2026) — Tree-Sitter + Louvain communities, Linux kernel 2.1M nodes in 3 minutes, incremental via XXH3 hashing\n- [FastCode](https://arxiv.org/abs/2603.01012) (2026) — 3-layer graph schema (dependency/inheritance/call), cleanest decomposition\n- [Stack Graphs](https://arxiv.org/abs/2211.01224) (GitHub production) — File-level isolation for incremental updates at millions-of-repos scale\n- [RepoMaster](https://arxiv.org/abs/2505.21577) (2025) — 8K token budget validated, +62.96% task-pass rate\n- [Code-Craft/HCGS](https://arxiv.org/abs/2504.08975) (2025) — Hierarchical code graph summaries, 82% retrieval precision improvement\n\n\n\n## Auto-Expanding Context Window\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\nOn startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an optimized model variant:\n\n| Available Memory | Context Window |\n|-----------------|---------------|\n| 200GB+ | 128K tokens |\n| 100GB+ | 64K tokens |\n| 50GB+ | 32K tokens |\n| 20GB+ | 16K tokens |\n| 8GB+ | 8K tokens |\n| < 8GB | 4K tokens |\n\n\n\n\n## Tools (85+)\n\n<div align=\"right\"><a href=\"#top\">back to top</a></div>\n\n| Tool | Description |\n|------|-------------|\n| **File Operations** | |\n| `file_read` | Read file contents with line numbers (offset/limit for large files) |\n| `file_write` | Create or overwrite files with automatic directory creation |\n| `file_edit` | Precise string replacement in files (preferred over rewriting) |\n| `file_patch` | Edit specific line ranges in large files (replace, insert_before/after, delete) |\n| `batch_edit` | Multiple edits across files in one call |\n| `list_directory` | List directory contents with types and sizes |\n| **Search & Navigation** | |\n| `grep_search` | Search file contents with regex (ripgrep with grep fallback) |\n| `find_files` | Find files by glob pattern (excludes node_modules/.git) |\n| `codebase_map` | High-level project structure overview with directory tree and language breakdown |\n| **Shell & Execution** | |\n| `shell` | Execute any shell command (non-interactive, CI=true, sudo support) |\n| `code_sandbox` | Isolated code execution (JS, Python, Bash, TS) in subprocess or Docker |\n| `background_run` | Run shell command in background, returns task ID |\n| `task_status` | Check background task status |\n| `task_output` | Read background task output |\n| `task_stop` | Stop a background task |\n| **Web** | |\n| `web_search` | Search the web for pages matching a query — returns links+snippets, not content. Providers: DuckDuckGo (free), Tavily (TAVILY_API_KEY), Jina (JINA_API_KEY) |\n| `web_fetch` | Fetch a single URL's text content (fastest, no JS rendering). Supports `mode=reader` for Jina Reader markdown output with JS rendering. Auto-fallback to Jina when raw content is too short |\n| `web_crawl` | Crawl pages with link-following and optional JS rendering. Strategies: `beautifulsoup` (fast HTTP) or `playwright` (headless Chromium). Supports `extract_schema` for structured data extraction |\n| `browser_action` | Interactive headless Chrome: login, fill forms, click buttons, screenshot. Session persists between calls. Actions: navigate, click, click_xy, type, screenshot, dom, scroll, back, forward, close |\n| **Structured Data** | |\n| `structured_file` | Generate CSV, TSV, JSON, Markdown tables, Excel-compatible files |\n| `structured_read` | Parse CSV, TSV, JSON, Markdown tables with binary format detection |\n| **Vision & Desktop** | |\n| `vision` | Moondream VLM — caption, query, detect, point on any image |\n| `desktop_click` | Vision-guided clicking: describe a UI element, agent finds and clicks it |\n| `desktop_describe` | Screenshot + Moondream caption/query for desktop awareness |\n| `image_read` | Read images (base64 + OCR metadata) |\n| `screenshot` | Capture screen/window/active window |\n| `ocr` | Extract text from images (Tesseract with multi-variant preprocessing) |\n| `ocr_image_advanced` | Advanced multi-variant OCR pipeline with preprocessing, multi-PSM, and confidence scoring |\n| `ocr_pdf` | Add searchable text layer to scanned/image PDFs |\n| `pdf_to_text` | Extract text from PDF using pdftotext (Poppler) with OCR fallback |\n| **Transcription** | |\n| `transcribe_file` | Transcribe local audio/video files to text (Whisper) |\n| `transcribe_url` | Download and transcribe audio/video from URLs |\n| **Memory & Knowledge** | |\n| `memory_read` | Read from persistent memory store by topic and key |\n| `memory_write` | Store facts/patterns in persistent memory with provenance tracking |\n| `memory_search` | Semantic search across all memory entries by query |\n| `memex_retrieve` | Recover full tool output archived during context compaction by hash ID |\n| **Git & Diagnostics** | |\n| `diagnostic` | Lint/typecheck/test/build validation pipeline in one call |\n| `git_info` | Structured git status, log, diff, branch, staged/unstaged files |\n| **Agents & Delegation** | |\n| `sub_agent` | Delegate subtasks to independent agent instances (foreground or background) |\n| `explore_tools` | Meta-tool: discover and unlock additional tools on demand (for small models) |\n| `task_complete` | Signal task completion with summary |\n| **Custom Tools & Skills** | |\n| `create_tool` | Create reusable custom tools from workflow patterns at runtime |\n| `manage_tools` | List, inspect, delete custom tools |\n| `skill_list` | Discover available AIWG skills |\n| `skill_execute` | Run an AIWG skill |\n| **Temporal Agency** | |\n| `scheduler` | Schedule tasks for automatic future execution via OS cron (presets, natural language, raw cron) |\n| `reminder` | Set cross-session reminders with priority, due dates, tags — surfaces at startup |\n| `agenda` | Unified view of reminders, schedules, and attention items with startup brief |\n| **AIWG SDLC** | |\n| `aiwg_setup` | Deploy AIWG SDLC framework |\n| `aiwg_health` | Analyze project SDLC health and readiness |\n| `aiwg_workflow` | Execute AIWG commands and workflows |\n| **Nexus P2P & x402 Payments** | |\n| `nexus` | Decentralized agent networking — connect, rooms, DMs, peer discovery, invoke capabilities, metering, trust/blocking, IPFS storage |\n| `nexus:expose` | Expose local models or forward upstream endpoints as metered inference capabilities with pricing, passthrough, and load balancing |\n| `nexus:wallet_create` | Generate secp256k1/EVM wallet (Base mainnet USDC) with AES-256-GCM encryption + x402-wallet.key |\n| `nexus:spend` | Sign EIP-3009 USDC TransferWithAuthorization — budget-checked, gasless for payer |\n| `nexus:remote_infer` | Route inference to a remote peer's model — auto-discovers peers, budget-checks, invokes, returns result |\n| `nexus:ledger_status` | Transaction history (earned/spent/pending USDC) |\n| `nexus:budget_set` | Configure spending limits — daily cap, per-invoke max, auto-approve threshold |\n| **COHERE Cognitive Stack** | |\n| `repl_exec` | Persistent Python REPL — variables/imports persist between calls, `llm_query()` and `parallel_llm_query()` available for recursive LLM invocation, `retrieve()` for handle access |\n| `memory_metabolize` | Governed memory lifecycle — classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence/identity_relevance), consolidate lessons from trajectories |\n| `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.oa/identity/` |\n| `reflect` | Immune-system reflection — diagnostic (find flaws), epistemic (identify missing evidence), constitutional (review self-updates). Returns pass/revise/block verdict |\n| `explore` | ARCHE strategy-space exploration — generate diverse strategies, archive successful variants with tags/confidence, compare competing approaches, retrieve past strategies |\n| **Hardware Access** | |\n| `camera_capture` | Access system cameras — list devices, capture JPEG frames, query capabilities. Uses ffmpeg + v4l2. Supports USB, CSI, and 360 cameras (QooCam, RealSense). Captured images can be piped to vision tools |\n| `audio_capture` | Record from microphone — list input devices, record WAV/MP3 (configurable duration/rate/channels), check real-time mic level (RMS dBFS). Uses arecord + ffmpeg backends |\n| `audio_playback` | Speaker control and TTS — play audio files (WAV/MP3/OGG), text-to-speech via LuxTTS voice clone (persistent GPU daemon, ~2s synthesis), get/set system volume. Uses aplay/ffplay/amixer backends |\n| `wifi_control` | WiFi network scanning and management — scan nearby networks (SSID, signal, channel, security), list WiFi adapters (built-in + USB dongles), connect/disconnect, check connection status, toggle monitor mode. Auto-detects AC600/RTL8811AU and other USB adapters |\n| `bluetooth_scan` | Bluetooth device discovery — scan for Classic and BLE devices, list HCI adapters, get device info. Uses hcitool/bluetoothctl backends |\n| `sdr_scan` | Software-defined radio scanning — frequency sweeps, ADS-B aircraft tracking (1090 MHz), FM radio capture. Auto-installs rtl-sdr tools when RTL-SDR hardware detected. Uses rtl_power/rtl_fm/dump1090 |\n| `flipper_zero` | Flipper Zero multi-tool control — Sub-GHz scanning (315/433/868/915 MHz), NFC tag reading, 125kHz RFID reading, IR capture, GPIO pin reading, storage browsing. Serial CLI via /dev/ttyACM* |\n| `meshtastic` | Mesh network communication via LoRa — send/receive messages, list nodes, get device info, configure channels. Auto-installs meshtastic CLI in venv, auto-fixes serial permissions via pkexec |\n| `gps_location` | GPS positioning from 45+ USB receivers — auto-detects device, probes NMEA at multiple baud rates. Uses pyserial+pynmea2 for reliable parsing. Returns lat/lon/alt/speed/heading |\n| `audio_analyze` | Audio scene analysis — YAMNet 521-class classification (AudioSet taxonomy), Silero VAD voice activity detection, FFT spectrum analysis with peak frequency detection |\n| `asr_listen` | Record from microphone and transcribe speech to text — combines audio capture + Whisper ASR in one call. Uses PipeWire (bluetooth/USB) → faster-whisper → openai-whisper backends |\n| **Visual Intelligence** | |\n| `visual_memory` | Face recognition + object memory — InsightFace ArcFace 512d face enrollment/identification, CLIP ViT-B/32 object teaching/recognition. Persistent face+object databases in `.open-agents/visual-memory/` |\n| `multimodal_memory` | Cross-modal episode binding — captures face + voice + text + location into unified episodes. Actions: capture (photo+audio), meet (register person with name+face+voice), recall (associative retrieval), timeline (chronological query) |\n| **Associative Memory** | |\n| `episode_store` | SQLite episode store with triple-factor scoring (recency x importance x relevance), 4-class temporal decay (session/daily/procedural/permanent), Ebbinghaus strengthening on retrieval |\n| `temporal_graph` | Temporal knowledge graph with Graphiti-style valid_from/valid_until edges, entity upsert with mention counting, temporal queries, neighbor traversal for context building |\n| `zettelkasten` | A-MEM Zettelkasten note linking — retroactive context evolution, top-3 neighbor discovery via cosine similarity, bidirectional linking |\n| `ppr_retrieval` | HippoRAG Personalized PageRank retrieval — entity extraction, seed node mapping, multi-hop associative traversal over temporal KG, episode scoring |\n| `gist_compressor` | ReadAgent-style trajectory compression — deterministic gist extraction from multi-turn interactions, no LLM needed |\n\nRead-only tools execute concurrently when called in the same turn. Mutating tools run sequentially.\n\n### Web Tool Selection Guide\n\nThe agent has 4 web tools. Pick the right one:\n\n| Need | Tool | Why |\n|------|------|-----|\n| Find pages about a topic | `web_search` | Returns links+snippets to fetch later |\n| Read a URL you already have | `web_fetch` | Fastest — plain text, no JS rendering |\n| Page is blank or JS-heavy (SPA) | `web_crawl` strategy=playwright | Renders JavaScript via headless Chromium |\n| Follow links across a site | `web_crawl` max_depth=1+ | Multi-page crawl with metadata |\n| Extract structured data (prices, tables) | `web_crawl` + extract_schema | Regex-based field extraction from page text |\n| Login / fill forms / click buttons | `browser_action` | Persistent session with cookies and state |\n| Screenshot of a rendered page | `browser_action` action=screenshot | Visual rendering via Chrome |\n| Clean markdown from any URL | `web_fetch` mode=reader | Jina Reader (r.jina.ai) — handles JS, images |\n\n**Routing order**: `web_search` (find) → `web_fetch` (read) → `web_crawl` (if JS/multi-page) → `browser_action` (if interactive)\n\n**Jina Reader**: Set `JINA_API_KEY` for higher rate limits. Works without a key for basic use. When `web_fetch` gets very short content (<200 chars), it automatically retries via Jina Reader.\n\n**Structured extraction**: Pass `extract_schema='{\"price\": \"number\", \"name\": \"string\"}'` to `web_crawl` for best-effort regex-based field extraction from page content.\n\n### Hardware Tool Guide\n\nThe agent can access physical hardware — cameras, microphones, and speakers — through three dedicated tools:\n\n| Need | Tool | Example |\n|------|------|---------|\n| See the environment | `camera_capture` action=capture | Grab a JPEG frame from any USB/CSI camera |\n| List cameras | `camera_capture` action=list | Discover `/dev/video*` devices |\n| Record audio | `audio_capture` action=record duration=10 | Record 10s WAV from default mic |\n| Check if mic works | `audio_capture` action=level | RMS level in dBFS |\n| Speak aloud | `audio_playback` action=speak text=\"Hello\" | TTS via LuxTTS voice clone |\n| Play a sound file | `audio_playback` action=play file=alert.wav | Play WAV/MP3/OGG |\n| Check volume | `audio_playback` action=volume | Get current volume % |\n| Set volume | `audio_playback` action=volume volume=50 | Set to 50% |\n| Scan WiFi networks | `wifi_control` action=scan | All SSIDs, signals, channels |\n| List WiFi adapters | `wifi_control` action=interfaces | Built-in + USB dongles |\n| Connect to WiFi | `wifi_control` action=connect ssid=\"MyNet\" password=\"pass\" | Join network |\n| WiFi status | `wifi_control` action=status | Current SSID, IP, signal |\n| Scan Bluetooth | `bluetooth_scan` action=scan | Classic + BLE devices |\n| List BT adapters | `bluetooth_scan` action=interfaces | HCI adapters |\n| SDR device check | `sdr_scan` action=info | RTL-SDR hardware status |\n| RF frequency sweep | `sdr_scan` action=scan start_freq=\"433M\" end_freq=\"434M\" | Signal power levels |\n| Aircraft tracking | `sdr_scan` action=adsb duration=30 | ADS-B transponder messages |\n| FM radio capture | `sdr_scan` action=fm frequency=\"98.1M\" | Record FM audio |\n| Detect Flipper Zero | `flipper_zero` action=detect | Connected Flippers |\n| Sub-GHz scan | `flipper_zero` action=subghz_scan frequency=433920000 | RF signals |\n| Read NFC tag | `flipper_zero` action=nfc_read | Tag UID, type |\n| Read RFID tag | `flipper_zero` action=rfid_read | 125kHz tag ID |\n| Send mesh message | `meshtastic` action=send message=\"Hello mesh\" | LoRa broadcast |\n| List mesh nodes | `meshtastic` action=nodes | All nodes + signal info |\n| Get GPS location | `gps_location` action=locate | Lat/lon/alt/speed |\n| Analyze audio scene | `audio_analyze` action=classify file=\"rec.wav\" | Top AudioSet classes |\n| Detect voice activity | `audio_analyze` action=vad file=\"rec.wav\" | Speech segments |\n| Listen + transcribe | `asr_listen` action=listen duration=8 | Record + Whisper ASR |\n| Transcribe audio file | "
 }