npm - open-agents-ai - Versions diffs - 0.186.62 → 0.186.63 - Mend

open-agents-ai 0.186.62 → 0.186.63

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +241 -0
package/package.json +1 -1

package/README.md CHANGED

@@ -23,6 +23,7 @@ npm i -g open-agents-ai && oa
 An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and configures the optimal model with expanded context window automatically.
 ## Table of Contents
 <div align="right"><a href="#top">back to top</a></div>
@@ -67,6 +68,10 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
 - [Research Citations](#research-citations)
 - [License](#license)
+<details id="the-organism-not-the-cortex">
+<summary><strong>The Organism, Not the Cortex</strong> — Why the LLM is one organ inside a larger organism</summary>
 ## The Organism, Not the Cortex
 <div align="right"><a href="#top">back to top</a></div>
@@ -89,6 +94,12 @@ An LLM is a high-bandwidth associative generative core — closer to a cortex-li
 Don't chase larger models. Build the organism around whatever model you have.
+</details>
+<details id="how-it-works">
+<summary><strong>How It Works</strong> — Multi-turn autonomous tool-calling loop in action</summary>
 ## How It Works
 <div align="right"><a href="#top">back to top</a></div>
@@ -105,6 +116,12 @@ Agent: [Turn 1] file_read(src/auth.ts)
 The agent uses tools autonomously in a loop — reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.
+</details>
+<details id="features">
+<summary><strong>Features</strong> — 61 tools, voice, vision, P2P mesh, self-play, COHERE cognitive stack</summary>
 ## Features
 <div align="right"><a href="#top">back to top</a></div>
@@ -209,6 +226,12 @@ D8AgCTrxpDKD5meJ2bpAfVwcST3NF3EPuy9xczYycnXn
 0x81Ce81F0B6B5928E15d3a2850F913C88D07051ec
 ```
+</details>
+<details id="enterprise--headless-mode">
+<summary><strong>Enterprise & Headless Mode</strong> — REST API, background jobs, JSON output, auth scopes, tool profiles</summary>
 ## Enterprise & Headless Mode
 <div align="right"><a href="#top">back to top</a></div>
@@ -720,6 +743,12 @@ Open `http://localhost:11435/` in a browser when `oa serve` is running. Zero ext
 Free for non-commercial use under CC-BY-NC-4.0. For enterprise/commercial licensing, contact [zoomerconsulting.com](https://zoomerconsulting.com).
+</details>
+<details id="architecture">
+<summary><strong>Architecture</strong> — AgenticRunner core loop with structured context assembly</summary>
 ## Architecture
 <div align="right"><a href="#top">back to top</a></div>
@@ -742,6 +771,12 @@ User task → assembleContext(c_instr, c_state, c_know) → LLM → tool_calls
 - **Context-aware** — dynamic compaction, Memex archiving, session persistence, model-tier scaling
 - **Brute-force** — optional auto re-engagement when turn limit is hit (keeps going until task_complete or user abort)
+</details>
+<details id="context-engineering">
+<summary><strong>Context Engineering</strong> — C = A(c_instr, c_know, c_tools, c_mem, c_state, c_query) structured assembly</summary>
 ## Context Engineering
 <div align="right"><a href="#top">back to top</a></div>
@@ -768,6 +803,12 @@ Key design decisions grounded in research:
 Research provenance: grounded in "A Survey of Context Engineering for LLMs" (context assembly equation), "Modular Prompt Optimization" (section-local textual gradients), "Reasoning Up the Instruction Ladder" (priority hierarchy), "GEPA" (reflective prompt evolution), and "Prompt Flow Integrity" (least-privilege context passing).
+</details>
+<details id="model-tier-awareness">
+<summary><strong>Model-Tier Awareness</strong> — Dynamic tool sets, prompts, and limits that scale with model size</summary>
 ## Model-Tier Awareness
 <div align="right"><a href="#top">back to top</a></div>
@@ -805,6 +846,12 @@ All context-dependent values scale automatically with the actual context window
 | Tool output cap | 2K-8K chars (scales with context) |
 | File read limits | 80-120 line cap for small/medium context windows |
+</details>
+<details id="auto-expanding-context-window">
+<summary><strong>Auto-Expanding Context Window</strong> — RAM/VRAM detection creates optimized model variants automatically</summary>
 ## Auto-Expanding Context Window
 <div align="right"><a href="#top">back to top</a></div>
@@ -820,6 +867,12 @@ On startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an
 | 8GB+ | 8K tokens |
 | < 8GB | 4K tokens |
+</details>
+<details id="tools-61">
+<summary><strong>Tools (61)</strong> — File I/O, shell, web, vision, memory, agents, COHERE, P2P, x402</summary>
 ## Tools (61)
 <div align="right"><a href="#top">back to top</a></div>
@@ -928,6 +981,12 @@ The agent has 4 web tools. Pick the right one:
 **Structured extraction**: Pass `extract_schema='{"price": "number", "name": "string"}'` to `web_crawl` for best-effort regex-based field extraction from page content.
+</details>
+<details id="ralph-loop--iteration-first-design">
+<summary><strong>Ralph Loop — Iteration-First Design</strong> — Iterative retry loop where errors become learning data</summary>
 ## Ralph Loop — Iteration-First Design
 <div align="right"><a href="#top">back to top</a></div>
@@ -954,6 +1013,12 @@ The loop tracks iteration history, generates completion reports saved to `.aiwg/
 /ralph-abort      # Cancel running loop
 ```
+</details>
+<details id="task-control">
+<summary><strong>Task Control</strong> — Pause, stop, resume, destroy, and session context persistence</summary>
 ## Task Control
 <div align="right"><a href="#top">back to top</a></div>
@@ -995,6 +1060,12 @@ When you launch `oa` in a workspace that has saved session context from a previo
 Type `y` to restore — the previous session context will be prepended to your next task, giving the agent full continuity. Type `n` (or anything else) to start fresh. The prompt only appears on fresh starts, not on `/update` resumes (which auto-restore context).
+</details>
+<details id="cohere-cognitive-framework">
+<summary><strong>COHERE Cognitive Framework</strong> — 8-layer cognitive stack with distributed inference, identity, and reflection</summary>
 ## COHERE Cognitive Framework
 <div align="right"><a href="#top">back to top</a></div>
@@ -1075,6 +1146,12 @@ The identity kernel maintains a persistent self-model across sessions, the refle
 | L8 | Darwin Gödel Machine: Open-Ended Self-Improvement (2025) | [arxiv:2505.22954](https://arxiv.org/abs/2505.22954) |
 | L8 | i-MENTOR: Intrinsic Motivation Exploration (2025) | [arxiv:2505.17621](https://arxiv.org/abs/2505.17621) |
+</details>
+<details id="agent-immune-system--constraint-enforcement--pressure-resistance">
+<summary><strong>Agent Immune System — Constraint Enforcement & Pressure Resistance</strong> — Behavioral constraints, pressure-aware decision gates, and audit logging</summary>
 ## Agent Immune System — Constraint Enforcement & Pressure Resistance
 <div align="right"><a href="#top">back to top</a></div>
@@ -1136,6 +1213,12 @@ User (frustrated): "fix this broken shit"
   → Model fixes the architecture instead of adding a prompt hack
 ```
+</details>
+<details id="context-compaction--research-backed-memory-management">
+<summary><strong>Context Compaction — Research-Backed Memory Management</strong> — 6 compaction strategies, Memex archive, SNR tracking, deep context mode</summary>
 ## Context Compaction — Research-Backed Memory Management
 <div align="right"><a href="#top">back to top</a></div>
@@ -1264,6 +1347,12 @@ Compaction summaries include:
 This ensures the agent can resume coherently after compaction without re-reading files or re-running commands.
+</details>
+<details id="personality-core--sac-framework-style-control">
+<summary><strong>Personality Core — SAC Framework Style Control</strong> — Five-dimension behavioral intensity from silent operator to teacher mode</summary>
 ## Personality Core — SAC Framework Style Control
 <div align="right"><a href="#top">back to top</a></div>
@@ -1314,6 +1403,12 @@ The personality system draws on:
 - **Linear Personality Probing** ([arXiv:2512.17639](https://arxiv.org/abs/2512.17639)) — Prompt-level steering completely dominates activation-level interventions
 - **The Prompt Report** ([arXiv:2406.06608](https://arxiv.org/abs/2406.06608)) — Positive framing outperforms negated instructions for behavioral control
+</details>
+<details id="emotion-engine--affective-state-modulation">
+<summary><strong>Emotion Engine — Affective State Modulation</strong> — Circumplex affect model with valence, arousal, dominance axes</summary>
 ## Emotion Engine — Affective State Modulation
 <div align="right"><a href="#top">back to top</a></div>
@@ -1378,6 +1473,12 @@ The emotion system is informed by peer-reviewed and preprint research:
 8. **EmotionBench** — Huang et al. ([arXiv:2308.03656](https://arxiv.org/abs/2308.03656), 2023). LLMs cannot maintain emotional state across turns implicitly — argues for explicit external mood state representation (which this engine implements).
+</details>
+<details id="voice-feedback-tts">
+<summary><strong>Voice Feedback (TTS)</strong> — GLaDOS, Overwatch, Kokoro, LuxTTS voice clone with emotion-driven prosody</summary>
 ## Voice Feedback (TTS)
 <div align="right"><a href="#top">back to top</a></div>
@@ -1571,6 +1672,12 @@ The stochastic narration engine generates spoken descriptions of what the agent
 - **Personality scaling** — terse mode (level 1-2) uses short functional descriptions; conversational (3) adds natural phrasing; chatty (4-5) adds theatrical commentary and content references
 - **Natural silence** — on bland successes without notable content, ~40% of the time the narration is skipped entirely for a more natural rhythm
+</details>
+<details id="listen-mode--live-bidirectional-audio">
+<summary><strong>Listen Mode — Live Bidirectional Audio</strong> — Real-time Whisper transcription with hands-free auto-submit</summary>
 ## Listen Mode — Live Bidirectional Audio
 <div align="right"><a href="#top">back to top</a></div>
@@ -1609,6 +1716,12 @@ The `transcribe-cli` dependency auto-installs in the background on first use. On
 **File transcription**: Drag-and-drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.oa/transcripts/`.
+</details>
+<details id="vision--desktop-automation-moondream">
+<summary><strong>Vision & Desktop Automation (Moondream)</strong> — Local VLM for screenshots, point-and-click, browser automation, OCR</summary>
 ## Vision & Desktop Automation (Moondream)
 <div align="right"><a href="#top">back to top</a></div>
@@ -1797,6 +1910,12 @@ Supports `apt` (Debian/Ubuntu), `dnf` (Fedora), `pacman` (Arch), and `brew` (mac
 - Moondream Station (local) — runs entirely on your machine, no API keys needed
 - Moondream Cloud API — set `MOONDREAM_API_KEY` for cloud inference
+</details>
+<details id="interactive-tui">
+<summary><strong>Interactive TUI</strong> — REPL with slash commands, mid-task steering, animated metrics bar</summary>
 ## Interactive TUI
 <div align="right"><a href="#top">back to top</a></div>
@@ -1902,6 +2021,12 @@ The steering sub-agent uses the same model and backend as the main agent with `m
 - **LATS** (Zhou et al., 2024) — mid-execution replanning with user-provided value signals improves task completion on complex multi-step problems
 - **AutoGen** (Wu et al., 2023) — human-in-the-loop patterns work best when user messages are expanded into structured instructions, reducing ambiguity for the primary agent
+</details>
+<details id="telegram-bridge--sub-agent-per-chat">
+<summary><strong>Telegram Bridge — Sub-Agent Per Chat</strong> — Per-chat sub-agents with admin passthrough, media handling, and streaming</summary>
 ## Telegram Bridge — Sub-Agent Per Chat
 <div align="right"><a href="#top">back to top</a></div>
@@ -2035,6 +2160,12 @@ The bridge automatically handles Telegram's rate limits (HTTP 429) with exponent
 **Combined with blessed mode** — `/full-send-bless` + `/telegram` creates a persistent, always-on agent that processes Telegram messages around the clock while keeping the model warm.
+</details>
+<details id="x402-payment-rails--nexus-p2p">
+<summary><strong>x402 Payment Rails & Nexus P2P</strong> — EVM wallets, EIP-3009 USDC transfers, metered inference, budget policies</summary>
 ## x402 Payment Rails & Nexus P2P
 <div align="right"><a href="#top">back to top</a></div>
@@ -2094,6 +2225,12 @@ nexus(action='budget_set', auto_approve_below='0.01')  # Auto-approve micropayme
 - All outbound messages scanned for key material before sending
 - Keys NEVER appear in tool output, logs, or LLM context
+</details>
+<details id="sponsored-inference--share-your-gpu-with-the-world">
+<summary><strong>Sponsored Inference — Share Your GPU With the World</strong> — 5-step wizard to share models via secure branded relay</summary>
 ## Sponsored Inference — Share Your GPU With the World
 <div align="right"><a href="#top">back to top</a></div>
@@ -2160,6 +2297,12 @@ Consumer OA ──→ Cloudflared Tunnel ──→ Sponsor Proxy ──→ Ollam
 The tunnel fix uses debounced restarts with exponential cooldown (10s → 20s → 40s), stopping auto-restart after 3 consecutive failures to prevent Cloudflare rate limiting. Progress indicators emit every 5 seconds during startup, and specific error messages are shown for common failure modes (ENOENT, port conflict, 429, DNS).
+</details>
+<details id="cohere-distributed-mind">
+<summary><strong>COHERE Distributed Mind</strong> — Multi-node mesh with NATS pub/sub, peer review, collective learning</summary>
 ## COHERE Distributed Mind
 <div align="right"><a href="#top">back to top</a></div>
@@ -2226,6 +2369,12 @@ Inbound queries are scanned for prompt injection attempts before processing:
 - Remote constraints from peer nodes (CM-07, published every 5 minutes)
 - Blocked queries increment `queriesErrors` and are silently dropped
+</details>
+<details id="dream-mode--creative-idle-exploration">
+<summary><strong>Dream Mode — Creative Idle Exploration</strong> — NREM/REM sleep cycles with autoresearch swarm on GPU</summary>
 ## Dream Mode — Creative Idle Exploration
 <div align="right"><a href="#top">back to top</a></div>
@@ -2302,6 +2451,12 @@ If the Python scripts are invoked directly (without `uv run`), they self-bootstr
 If no GPU is detected, the REM stage falls back to the standard multi-agent creative exploration (Visionary + Pragmatist + Cross-Pollinator + Synthesizer).
+</details>
+<details id="blessed-mode--infinite-warm-loop">
+<summary><strong>Blessed Mode — Infinite Warm Loop</strong> — Keep model warm in VRAM, auto-cycle tasks, Default Mode Network</summary>
 ## Blessed Mode — Infinite Warm Loop
 <div align="right"><a href="#top">back to top</a></div>
@@ -2341,6 +2496,12 @@ Each DMN cycle runs a lightweight LLM agent (15 max turns, temperature 0.4) with
 **Research basis**: Reflexion ([arXiv:2303.11366](https://arxiv.org/abs/2303.11366)), Self-Rewarding LMs ([arXiv:2401.10020](https://arxiv.org/abs/2401.10020)), Generative Agents ([arXiv:2304.03442](https://arxiv.org/abs/2304.03442)), STOP ([arXiv:2310.02226](https://arxiv.org/abs/2310.02226)), Voyager ([arXiv:2305.16291](https://arxiv.org/abs/2305.16291))
+</details>
+<details id="docker-sandbox--collective-intelligence">
+<summary><strong>Docker Sandbox & Collective Intelligence</strong> — Container isolation, multi-agent testbed, self-play loop</summary>
 ## Docker Sandbox & Collective Intelligence
 <div align="right"><a href="#top">back to top</a></div>
@@ -2458,6 +2619,12 @@ Nodes share identity kernel updates via `nexus.cohere.kernel.delta` on NATS. Ado
 4. **Tool Use = Quality** — Agents using `web_search` produced current, verifiable data. Non-tool responses were generic.
 5. **Identity Divergence** — Different task exposure → different specializations. Intern gained `web-research` from heavy search; Director gained nothing (still loading).
+</details>
+<details id="code-sandbox">
+<summary><strong>Code Sandbox</strong> — Isolated JS, Python, Bash, TypeScript execution in subprocess or Docker</summary>
 ## Code Sandbox
 <div align="right"><a href="#top">back to top</a></div>
@@ -2476,6 +2643,12 @@ Supports JavaScript, TypeScript, Python, and Bash. Two execution modes:
 - **Subprocess** (default) — runs in a child process with timeout and output limits
 - **Docker** — runs in an isolated container when `docker` is available
+</details>
+<details id="structured-data-tools">
+<summary><strong>Structured Data Tools</strong> — Generate and parse CSV, TSV, JSON, Markdown tables, Excel files</summary>
 ## Structured Data Tools
 <div align="right"><a href="#top">back to top</a></div>
@@ -2504,6 +2677,12 @@ Agent: read_structured_file(path="report.md")
 Detects binary formats (XLSX, PDF, DOCX) and suggests conversion tools.
+</details>
+<details id="multi-provider-web-search">
+<summary><strong>Multi-Provider Web Search</strong> — DuckDuckGo, Tavily, and Jina AI with auto-detection</summary>
 ## Multi-Provider Web Search
 <div align="right"><a href="#top">back to top</a></div>
@@ -2521,6 +2700,12 @@ export TAVILY_API_KEY=tvly-...   # Enable Tavily (optional)
 export JINA_API_KEY=jina_...     # Enable Jina AI (optional)
 ```
+</details>
+<details id="task-templates">
+<summary><strong>Task Templates</strong> — Specialized system prompts for code, document, analysis, and plan tasks</summary>
 ## Task Templates
 <div align="right"><a href="#top">back to top</a></div>
@@ -2534,6 +2719,12 @@ Set a task type to get specialized system prompts, recommended tools, and output
 /task-type plan       # Planning — emphasizes steps, dependencies, risks
 ```
+</details>
+<details id="human-expert-speed-ratio">
+<summary><strong>Human Expert Speed Ratio</strong> — Real-time Exp: Nx gauge calibrated across 47 tool baselines</summary>
 ## Human Expert Speed Ratio
 <div align="right"><a href="#top">back to top</a></div>
@@ -2575,6 +2766,12 @@ Color coding: green (2x+ faster), yellow (1-2x, comparable), red (<1x, slower th
 All 47 tools have calibrated baselines ranging from 3s (`task_stop`) to 180s (`codebase_map`). Unknown tools default to 20s.
+</details>
+<details id="cost-tracking--session-metrics">
+<summary><strong>Cost Tracking & Session Metrics</strong> — Token cost estimation for 15+ providers with LLM-as-judge evaluation</summary>
 ## Cost Tracking & Session Metrics
 <div align="right"><a href="#top">back to top</a></div>
@@ -2591,6 +2788,12 @@ Cost tracking supports 15+ providers including Groq, Together AI, OpenRouter, Fi
 Work evaluation uses five task-type-specific rubrics (code, document, analysis, plan, general) scoring correctness, completeness, efficiency, code quality, and communication on a 1-5 scale.
+</details>
+<details id="configuration">
+<summary><strong>Configuration</strong> — CLI flags, env vars, config files, project context, and .oa/ directory</summary>
 ## Configuration
 <div align="right"><a href="#top">back to top</a></div>
@@ -2623,6 +2826,12 @@ Create `AGENTS.md`, `OA.md`, or `.open-agents.md` in your project root for agent
 └── pending-task.json  # Saved task state for /stop and /update resume
 ```
+</details>
+<details id="model-support">
+<summary><strong>Model Support</strong> — Qwen3.5-122B primary target, any Ollama or OpenAI-compatible model</summary>
 ## Model Support
 <div align="right"><a href="#top">back to top</a></div>
@@ -2637,6 +2846,12 @@ oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
 oa --backend-url http://10.0.0.5:11434 "refactor auth"
 ```
+</details>
+<details id="supported-inference-providers">
+<summary><strong>Supported Inference Providers</strong> — 14 providers from local Ollama to Groq, Chutes, OpenRouter, and P2P mesh</summary>
 ## Supported Inference Providers
 <div align="right"><a href="#top">back to top</a></div>
@@ -2795,6 +3010,12 @@ When you've used multiple endpoints, the agent automatically builds a failover c
 No configuration needed — the cascade is built from your endpoint usage history. Works across local Ollama, cloud providers, and P2P peers.
+</details>
+<details id="evaluation-suite">
+<summary><strong>Evaluation Suite</strong> — 23 web nav + 46 coding + 35 enterprise tasks with pass^k reliability</summary>
 ## Evaluation Suite
 <div align="right"><a href="#top">back to top</a></div>
@@ -2986,6 +3207,12 @@ The PoT (Program-of-Thought) guidance achieves **100% code generation rate** —
 - **~80 tokens of prompt additions** (PoT math guidance + search-when-uncertain) took the eval from 41.2% to 100% across all tiers — no fine-tuning required.
 - 4B models match 9B/27B on structured domain tasks (healthcare, DevOps, e-commerce) but need search tools for specialized regulatory knowledge.
+</details>
+<details id="aiwg-integration">
+<summary><strong>AIWG Integration</strong> — AI-augmented SDLC with 85+ agents, structured memory, and traceability</summary>
 ## AIWG Integration
 <div align="right"><a href="#top">back to top</a></div>
@@ -3005,6 +3232,12 @@ oa "analyze this project's SDLC health and set up documentation"
 | **85+ Agents** | Specialized AI personas (Test Engineer, Security Auditor, API Designer) |
 | **Traceability** | @-mention system links requirements to code to tests |
+</details>
+<details id="research-citations">
+<summary><strong>Research Citations</strong> — 32 papers (2023-2026) grounding self-play, memory, identity, and containers</summary>
 ## Research Citations
 <div align="right"><a href="#top">back to top</a></div>
@@ -3058,6 +3291,12 @@ The COHERE collective intelligence system, self-play idle loop, identity evoluti
 | LatentMAS: Latent-Space Collaboration | [2511.20639](https://arxiv.org/abs/2511.20639) | Nov 2025 | Future: 4x faster, 70-84% token reduction |
 | Agent-Kernel Microkernel Architecture | [2512.01610](https://arxiv.org/abs/2512.01610) | Dec 2025 | Architecture: 10k agent coordination |
+</details>
+<details id="license">
+<summary><strong>License</strong> — CC BY-NC 4.0 with enterprise licensing available</summary>
 ## License
 <div align="right"><a href="#top">back to top</a></div>
@@ -3065,3 +3304,5 @@ The COHERE collective intelligence system, self-play idle loop, identity evoluti
 [Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/)
 Free for non-commercial use. For enterprise/commercial licensing, contact [zoomerconsulting.com](https://zoomerconsulting.com).
+</details>

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "open-agents-ai",
-  "version": "0.186.62",
+  "version": "0.186.63",
   "description": "AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop",
   "type": "module",
   "main": "./dist/index.js",