open-agents-ai 0.186.62 → 0.186.63
- package/README.md +241 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -23,6 +23,7 @@ npm i -g open-agents-ai && oa
 
 An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and configures the optimal model with expanded context window automatically.
 
+
 ## Table of Contents
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -67,6 +68,10 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
 - [Research Citations](#research-citations)
 - [License](#license)
 
+
+<details id="the-organism-not-the-cortex">
+<summary><strong>The Organism, Not the Cortex</strong> — Why the LLM is one organ inside a larger organism</summary>
+
 ## The Organism, Not the Cortex
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -89,6 +94,12 @@ An LLM is a high-bandwidth associative generative core — closer to a cortex-li
 
 Don't chase larger models. Build the organism around whatever model you have.
 
+</details>
+
+
+<details id="how-it-works">
+<summary><strong>How It Works</strong> — Multi-turn autonomous tool-calling loop in action</summary>
+
 ## How It Works
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -105,6 +116,12 @@ Agent: [Turn 1] file_read(src/auth.ts)
 
 The agent uses tools autonomously in a loop — reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.
 
+</details>
+
+
+<details id="features">
+<summary><strong>Features</strong> — 61 tools, voice, vision, P2P mesh, self-play, COHERE cognitive stack</summary>
+
 ## Features
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -209,6 +226,12 @@ D8AgCTrxpDKD5meJ2bpAfVwcST3NF3EPuy9xczYycnXn
 0x81Ce81F0B6B5928E15d3a2850F913C88D07051ec
 ```
 
+</details>
+
+
+<details id="enterprise--headless-mode">
+<summary><strong>Enterprise & Headless Mode</strong> — REST API, background jobs, JSON output, auth scopes, tool profiles</summary>
+
 ## Enterprise & Headless Mode
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -720,6 +743,12 @@ Open `http://localhost:11435/` in a browser when `oa serve` is running. Zero ext
 
 Free for non-commercial use under CC-BY-NC-4.0. For enterprise/commercial licensing, contact [zoomerconsulting.com](https://zoomerconsulting.com).
 
+</details>
+
+
+<details id="architecture">
+<summary><strong>Architecture</strong> — AgenticRunner core loop with structured context assembly</summary>
+
 ## Architecture
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -742,6 +771,12 @@ User task → assembleContext(c_instr, c_state, c_know) → LLM → tool_calls
 - **Context-aware** — dynamic compaction, Memex archiving, session persistence, model-tier scaling
 - **Brute-force** — optional auto re-engagement when turn limit is hit (keeps going until task_complete or user abort)
 
+</details>
+
+
+<details id="context-engineering">
+<summary><strong>Context Engineering</strong> — C = A(c_instr, c_know, c_tools, c_mem, c_state, c_query) structured assembly</summary>
+
 ## Context Engineering
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -768,6 +803,12 @@ Key design decisions grounded in research:
 
 Research provenance: grounded in "A Survey of Context Engineering for LLMs" (context assembly equation), "Modular Prompt Optimization" (section-local textual gradients), "Reasoning Up the Instruction Ladder" (priority hierarchy), "GEPA" (reflective prompt evolution), and "Prompt Flow Integrity" (least-privilege context passing).
 
+</details>
+
+
+<details id="model-tier-awareness">
+<summary><strong>Model-Tier Awareness</strong> — Dynamic tool sets, prompts, and limits that scale with model size</summary>
+
 ## Model-Tier Awareness
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -805,6 +846,12 @@ All context-dependent values scale automatically with the actual context window
 | Tool output cap | 2K-8K chars (scales with context) |
 | File read limits | 80-120 line cap for small/medium context windows |
 
+</details>
+
+
+<details id="auto-expanding-context-window">
+<summary><strong>Auto-Expanding Context Window</strong> — RAM/VRAM detection creates optimized model variants automatically</summary>
+
 ## Auto-Expanding Context Window
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -820,6 +867,12 @@ On startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an
 | 8GB+ | 8K tokens |
 | < 8GB | 4K tokens |
 
+</details>
+
+
+<details id="tools-61">
+<summary><strong>Tools (61)</strong> — File I/O, shell, web, vision, memory, agents, COHERE, P2P, x402</summary>
+
 ## Tools (61)
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -928,6 +981,12 @@ The agent has 4 web tools. Pick the right one:
 
 **Structured extraction**: Pass `extract_schema='{"price": "number", "name": "string"}'` to `web_crawl` for best-effort regex-based field extraction from page content.
 
+</details>
+
+
+<details id="ralph-loop--iteration-first-design">
+<summary><strong>Ralph Loop — Iteration-First Design</strong> — Iterative retry loop where errors become learning data</summary>
+
 ## Ralph Loop — Iteration-First Design
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -954,6 +1013,12 @@ The loop tracks iteration history, generates completion reports saved to `.aiwg/
 /ralph-abort # Cancel running loop
 ```
 
+</details>
+
+
+<details id="task-control">
+<summary><strong>Task Control</strong> — Pause, stop, resume, destroy, and session context persistence</summary>
+
 ## Task Control
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -995,6 +1060,12 @@ When you launch `oa` in a workspace that has saved session context from a previo
 
 Type `y` to restore — the previous session context will be prepended to your next task, giving the agent full continuity. Type `n` (or anything else) to start fresh. The prompt only appears on fresh starts, not on `/update` resumes (which auto-restore context).
 
+</details>
+
+
+<details id="cohere-cognitive-framework">
+<summary><strong>COHERE Cognitive Framework</strong> — 8-layer cognitive stack with distributed inference, identity, and reflection</summary>
+
 ## COHERE Cognitive Framework
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1075,6 +1146,12 @@ The identity kernel maintains a persistent self-model across sessions, the refle
 | L8 | Darwin Gödel Machine: Open-Ended Self-Improvement (2025) | [arxiv:2505.22954](https://arxiv.org/abs/2505.22954) |
 | L8 | i-MENTOR: Intrinsic Motivation Exploration (2025) | [arxiv:2505.17621](https://arxiv.org/abs/2505.17621) |
 
+</details>
+
+
+<details id="agent-immune-system--constraint-enforcement--pressure-resistance">
+<summary><strong>Agent Immune System — Constraint Enforcement & Pressure Resistance</strong> — Behavioral constraints, pressure-aware decision gates, and audit logging</summary>
+
 ## Agent Immune System — Constraint Enforcement & Pressure Resistance
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1136,6 +1213,12 @@ User (frustrated): "fix this broken shit"
 → Model fixes the architecture instead of adding a prompt hack
 ```
 
+</details>
+
+
+<details id="context-compaction--research-backed-memory-management">
+<summary><strong>Context Compaction — Research-Backed Memory Management</strong> — 6 compaction strategies, Memex archive, SNR tracking, deep context mode</summary>
+
 ## Context Compaction — Research-Backed Memory Management
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1264,6 +1347,12 @@ Compaction summaries include:
 
 This ensures the agent can resume coherently after compaction without re-reading files or re-running commands.
 
+</details>
+
+
+<details id="personality-core--sac-framework-style-control">
+<summary><strong>Personality Core — SAC Framework Style Control</strong> — Five-dimension behavioral intensity from silent operator to teacher mode</summary>
+
 ## Personality Core — SAC Framework Style Control
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1314,6 +1403,12 @@ The personality system draws on:
 - **Linear Personality Probing** ([arXiv:2512.17639](https://arxiv.org/abs/2512.17639)) — Prompt-level steering completely dominates activation-level interventions
 - **The Prompt Report** ([arXiv:2406.06608](https://arxiv.org/abs/2406.06608)) — Positive framing outperforms negated instructions for behavioral control
 
+</details>
+
+
+<details id="emotion-engine--affective-state-modulation">
+<summary><strong>Emotion Engine — Affective State Modulation</strong> — Circumplex affect model with valence, arousal, dominance axes</summary>
+
 ## Emotion Engine — Affective State Modulation
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1378,6 +1473,12 @@ The emotion system is informed by peer-reviewed and preprint research:
 
 8. **EmotionBench** — Huang et al. ([arXiv:2308.03656](https://arxiv.org/abs/2308.03656), 2023). LLMs cannot maintain emotional state across turns implicitly — argues for explicit external mood state representation (which this engine implements).
 
+</details>
+
+
+<details id="voice-feedback-tts">
+<summary><strong>Voice Feedback (TTS)</strong> — GLaDOS, Overwatch, Kokoro, LuxTTS voice clone with emotion-driven prosody</summary>
+
 ## Voice Feedback (TTS)
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1571,6 +1672,12 @@ The stochastic narration engine generates spoken descriptions of what the agent
 - **Personality scaling** — terse mode (level 1-2) uses short functional descriptions; conversational (3) adds natural phrasing; chatty (4-5) adds theatrical commentary and content references
 - **Natural silence** — on bland successes without notable content, ~40% of the time the narration is skipped entirely for a more natural rhythm
 
+</details>
+
+
+<details id="listen-mode--live-bidirectional-audio">
+<summary><strong>Listen Mode — Live Bidirectional Audio</strong> — Real-time Whisper transcription with hands-free auto-submit</summary>
+
 ## Listen Mode — Live Bidirectional Audio
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1609,6 +1716,12 @@ The `transcribe-cli` dependency auto-installs in the background on first use. On
 
 **File transcription**: Drag-and-drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.oa/transcripts/`.
 
+</details>
+
+
+<details id="vision--desktop-automation-moondream">
+<summary><strong>Vision & Desktop Automation (Moondream)</strong> — Local VLM for screenshots, point-and-click, browser automation, OCR</summary>
+
 ## Vision & Desktop Automation (Moondream)
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1797,6 +1910,12 @@ Supports `apt` (Debian/Ubuntu), `dnf` (Fedora), `pacman` (Arch), and `brew` (mac
 - Moondream Station (local) — runs entirely on your machine, no API keys needed
 - Moondream Cloud API — set `MOONDREAM_API_KEY` for cloud inference
 
+</details>
+
+
+<details id="interactive-tui">
+<summary><strong>Interactive TUI</strong> — REPL with slash commands, mid-task steering, animated metrics bar</summary>
+
 ## Interactive TUI
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -1902,6 +2021,12 @@ The steering sub-agent uses the same model and backend as the main agent with `m
 - **LATS** (Zhou et al., 2024) — mid-execution replanning with user-provided value signals improves task completion on complex multi-step problems
 - **AutoGen** (Wu et al., 2023) — human-in-the-loop patterns work best when user messages are expanded into structured instructions, reducing ambiguity for the primary agent
 
+</details>
+
+
+<details id="telegram-bridge--sub-agent-per-chat">
+<summary><strong>Telegram Bridge — Sub-Agent Per Chat</strong> — Per-chat sub-agents with admin passthrough, media handling, and streaming</summary>
+
 ## Telegram Bridge — Sub-Agent Per Chat
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2035,6 +2160,12 @@ The bridge automatically handles Telegram's rate limits (HTTP 429) with exponent
 
 **Combined with blessed mode** — `/full-send-bless` + `/telegram` creates a persistent, always-on agent that processes Telegram messages around the clock while keeping the model warm.
 
+</details>
+
+
+<details id="x402-payment-rails--nexus-p2p">
+<summary><strong>x402 Payment Rails & Nexus P2P</strong> — EVM wallets, EIP-3009 USDC transfers, metered inference, budget policies</summary>
+
 ## x402 Payment Rails & Nexus P2P
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2094,6 +2225,12 @@ nexus(action='budget_set', auto_approve_below='0.01') # Auto-approve micropayme
 - All outbound messages scanned for key material before sending
 - Keys NEVER appear in tool output, logs, or LLM context
 
+</details>
+
+
+<details id="sponsored-inference--share-your-gpu-with-the-world">
+<summary><strong>Sponsored Inference — Share Your GPU With the World</strong> — 5-step wizard to share models via secure branded relay</summary>
+
 ## Sponsored Inference — Share Your GPU With the World
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2160,6 +2297,12 @@ Consumer OA ──→ Cloudflared Tunnel ──→ Sponsor Proxy ──→ Ollam
 
 The tunnel fix uses debounced restarts with exponential cooldown (10s → 20s → 40s), stopping auto-restart after 3 consecutive failures to prevent Cloudflare rate limiting. Progress indicators emit every 5 seconds during startup, and specific error messages are shown for common failure modes (ENOENT, port conflict, 429, DNS).
 
+</details>
+
+
+<details id="cohere-distributed-mind">
+<summary><strong>COHERE Distributed Mind</strong> — Multi-node mesh with NATS pub/sub, peer review, collective learning</summary>
+
 ## COHERE Distributed Mind
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2226,6 +2369,12 @@ Inbound queries are scanned for prompt injection attempts before processing:
 - Remote constraints from peer nodes (CM-07, published every 5 minutes)
 - Blocked queries increment `queriesErrors` and are silently dropped
 
+</details>
+
+
+<details id="dream-mode--creative-idle-exploration">
+<summary><strong>Dream Mode — Creative Idle Exploration</strong> — NREM/REM sleep cycles with autoresearch swarm on GPU</summary>
+
 ## Dream Mode — Creative Idle Exploration
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2302,6 +2451,12 @@ If the Python scripts are invoked directly (without `uv run`), they self-bootstr
 
 If no GPU is detected, the REM stage falls back to the standard multi-agent creative exploration (Visionary + Pragmatist + Cross-Pollinator + Synthesizer).
 
+</details>
+
+
+<details id="blessed-mode--infinite-warm-loop">
+<summary><strong>Blessed Mode — Infinite Warm Loop</strong> — Keep model warm in VRAM, auto-cycle tasks, Default Mode Network</summary>
+
 ## Blessed Mode — Infinite Warm Loop
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2341,6 +2496,12 @@ Each DMN cycle runs a lightweight LLM agent (15 max turns, temperature 0.4) with
 
 **Research basis**: Reflexion ([arXiv:2303.11366](https://arxiv.org/abs/2303.11366)), Self-Rewarding LMs ([arXiv:2401.10020](https://arxiv.org/abs/2401.10020)), Generative Agents ([arXiv:2304.03442](https://arxiv.org/abs/2304.03442)), STOP ([arXiv:2310.02226](https://arxiv.org/abs/2310.02226)), Voyager ([arXiv:2305.16291](https://arxiv.org/abs/2305.16291))
 
+</details>
+
+
+<details id="docker-sandbox--collective-intelligence">
+<summary><strong>Docker Sandbox & Collective Intelligence</strong> — Container isolation, multi-agent testbed, self-play loop</summary>
+
 ## Docker Sandbox & Collective Intelligence
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2458,6 +2619,12 @@ Nodes share identity kernel updates via `nexus.cohere.kernel.delta` on NATS. Ado
 4. **Tool Use = Quality** — Agents using `web_search` produced current, verifiable data. Non-tool responses were generic.
 5. **Identity Divergence** — Different task exposure → different specializations. Intern gained `web-research` from heavy search; Director gained nothing (still loading).
 
+</details>
+
+
+<details id="code-sandbox">
+<summary><strong>Code Sandbox</strong> — Isolated JS, Python, Bash, TypeScript execution in subprocess or Docker</summary>
+
 ## Code Sandbox
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2476,6 +2643,12 @@ Supports JavaScript, TypeScript, Python, and Bash. Two execution modes:
 - **Subprocess** (default) — runs in a child process with timeout and output limits
 - **Docker** — runs in an isolated container when `docker` is available
 
+</details>
+
+
+<details id="structured-data-tools">
+<summary><strong>Structured Data Tools</strong> — Generate and parse CSV, TSV, JSON, Markdown tables, Excel files</summary>
+
 ## Structured Data Tools
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2504,6 +2677,12 @@ Agent: read_structured_file(path="report.md")
 
 Detects binary formats (XLSX, PDF, DOCX) and suggests conversion tools.
 
+</details>
+
+
+<details id="multi-provider-web-search">
+<summary><strong>Multi-Provider Web Search</strong> — DuckDuckGo, Tavily, and Jina AI with auto-detection</summary>
+
 ## Multi-Provider Web Search
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2521,6 +2700,12 @@ export TAVILY_API_KEY=tvly-... # Enable Tavily (optional)
 export JINA_API_KEY=jina_... # Enable Jina AI (optional)
 ```
 
+</details>
+
+
+<details id="task-templates">
+<summary><strong>Task Templates</strong> — Specialized system prompts for code, document, analysis, and plan tasks</summary>
+
 ## Task Templates
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2534,6 +2719,12 @@ Set a task type to get specialized system prompts, recommended tools, and output
 /task-type plan # Planning — emphasizes steps, dependencies, risks
 ```
 
+</details>
+
+
+<details id="human-expert-speed-ratio">
+<summary><strong>Human Expert Speed Ratio</strong> — Real-time Exp: Nx gauge calibrated across 47 tool baselines</summary>
+
 ## Human Expert Speed Ratio
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2575,6 +2766,12 @@ Color coding: green (2x+ faster), yellow (1-2x, comparable), red (<1x, slower th
 
 All 47 tools have calibrated baselines ranging from 3s (`task_stop`) to 180s (`codebase_map`). Unknown tools default to 20s.
 
+</details>
+
+
+<details id="cost-tracking--session-metrics">
+<summary><strong>Cost Tracking & Session Metrics</strong> — Token cost estimation for 15+ providers with LLM-as-judge evaluation</summary>
+
 ## Cost Tracking & Session Metrics
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2591,6 +2788,12 @@ Cost tracking supports 15+ providers including Groq, Together AI, OpenRouter, Fi
 
 Work evaluation uses five task-type-specific rubrics (code, document, analysis, plan, general) scoring correctness, completeness, efficiency, code quality, and communication on a 1-5 scale.
 
+</details>
+
+
+<details id="configuration">
+<summary><strong>Configuration</strong> — CLI flags, env vars, config files, project context, and .oa/ directory</summary>
+
 ## Configuration
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2623,6 +2826,12 @@ Create `AGENTS.md`, `OA.md`, or `.open-agents.md` in your project root for agent
 └── pending-task.json # Saved task state for /stop and /update resume
 ```
 
+</details>
+
+
+<details id="model-support">
+<summary><strong>Model Support</strong> — Qwen3.5-122B primary target, any Ollama or OpenAI-compatible model</summary>
+
 ## Model Support
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2637,6 +2846,12 @@ oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
 oa --backend-url http://10.0.0.5:11434 "refactor auth"
 ```
 
+</details>
+
+
+<details id="supported-inference-providers">
+<summary><strong>Supported Inference Providers</strong> — 14 providers from local Ollama to Groq, Chutes, OpenRouter, and P2P mesh</summary>
+
 ## Supported Inference Providers
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2795,6 +3010,12 @@ When you've used multiple endpoints, the agent automatically builds a failover c
 
 No configuration needed — the cascade is built from your endpoint usage history. Works across local Ollama, cloud providers, and P2P peers.
 
+</details>
+
+
+<details id="evaluation-suite">
+<summary><strong>Evaluation Suite</strong> — 23 web nav + 46 coding + 35 enterprise tasks with pass^k reliability</summary>
+
 ## Evaluation Suite
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -2986,6 +3207,12 @@ The PoT (Program-of-Thought) guidance achieves **100% code generation rate** —
 - **~80 tokens of prompt additions** (PoT math guidance + search-when-uncertain) took the eval from 41.2% to 100% across all tiers — no fine-tuning required.
 - 4B models match 9B/27B on structured domain tasks (healthcare, DevOps, e-commerce) but need search tools for specialized regulatory knowledge.
 
+</details>
+
+
+<details id="aiwg-integration">
+<summary><strong>AIWG Integration</strong> — AI-augmented SDLC with 85+ agents, structured memory, and traceability</summary>
+
 ## AIWG Integration
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -3005,6 +3232,12 @@ oa "analyze this project's SDLC health and set up documentation"
 | **85+ Agents** | Specialized AI personas (Test Engineer, Security Auditor, API Designer) |
 | **Traceability** | @-mention system links requirements to code to tests |
 
+</details>
+
+
+<details id="research-citations">
+<summary><strong>Research Citations</strong> — 32 papers (2023-2026) grounding self-play, memory, identity, and containers</summary>
+
 ## Research Citations
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -3058,6 +3291,12 @@ The COHERE collective intelligence system, self-play idle loop, identity evoluti
 | LatentMAS: Latent-Space Collaboration | [2511.20639](https://arxiv.org/abs/2511.20639) | Nov 2025 | Future: 4x faster, 70-84% token reduction |
 | Agent-Kernel Microkernel Architecture | [2512.01610](https://arxiv.org/abs/2512.01610) | Dec 2025 | Architecture: 10k agent coordination |
 
+</details>
+
+
+<details id="license">
+<summary><strong>License</strong> — CC BY-NC 4.0 with enterprise licensing available</summary>
+
 ## License
 
 <div align="right"><a href="#top">back to top</a></div>
@@ -3065,3 +3304,5 @@ The COHERE collective intelligence system, self-play idle loop, identity evoluti
 [Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/)
 
 Free for non-commercial use. For enterprise/commercial licensing, contact [zoomerconsulting.com](https://zoomerconsulting.com).
+
+</details>
package/package.json
CHANGED