open-agents-ai 0.187.133 → 0.187.135
- package/README.md +26 -3
- package/dist/index.js +508 -226
- package/package.json +2 -2
package/README.md
CHANGED
@@ -40,7 +40,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
 - [Model-Tier Awareness](#model-tier-awareness)
 - [Live Code Knowledge Graph](#live-code-knowledge-graph)
 - [Auto-Expanding Context Window](#auto-expanding-context-window)
-- [Tools (
+- [Tools (64+)](#tools-64)
 - [Ralph Loop — Iteration-First Design](#ralph-loop--iteration-first-design)
 - [Task Control](#task-control)
 - [COHERE Cognitive Framework](#cohere-cognitive-framework)
@@ -833,7 +833,7 @@ Small models (4B-7B) receive 10+ optimizations that larger models don't need, ea
 
 ### Tool Nesting for Small Models
 
-Small models use an **explore_tools** meta-tool pattern inspired by hierarchical API retrieval research ([ToolLLM](https://arxiv.org/abs/2307.16789)). Instead of presenting all
+Small models use an **explore_tools** meta-tool pattern inspired by hierarchical API retrieval research ([ToolLLM](https://arxiv.org/abs/2307.16789)). Instead of presenting all 64+ tools (which overwhelms small context windows), only core tools are loaded initially. The agent calls `explore_tools()` to discover additional capabilities, then activates specific tools as needed. This reduces tool schema tokens by ~80% while preserving access to the full toolset.
 
 ### Dynamic Context Limits
 
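Note on the `explore_tools` paragraph in the hunk above: the pattern it describes can be sketched in a few lines. This is a minimal illustration only, not the package's implementation (which lives in `package/dist/index.js` and is not shown here). The class and method names (`Tool`, `NestedToolbox`, `visible_schemas`, `activate`) and the `file_read` core tool are hypothetical; `camera_capture`, `audio_capture`, and `web_crawl` are tool names taken from the README.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Tool:
    name: str
    description: str
    category: str


@dataclass
class NestedToolbox:
    """Hypothetical registry: core tools are always visible, the rest are discovered."""
    core: Dict[str, Tool]
    extended: Dict[str, Tool]
    active: Dict[str, Tool] = field(default_factory=dict)

    def visible_schemas(self) -> List[str]:
        # Only core tools, activated tools, and the meta-tool itself reach the model.
        return sorted(list(self.core) + list(self.active) + ["explore_tools"])

    def explore_tools(self, category: Optional[str] = None) -> List[Dict[str, str]]:
        # Return short summaries (not full schemas) of tools that are not loaded yet.
        return [
            {"name": t.name, "category": t.category, "description": t.description}
            for t in self.extended.values()
            if t.name not in self.active
            and (category is None or t.category == category)
        ]

    def activate(self, name: str) -> Tool:
        # Promote a discovered tool so its schema is included from the next turn on.
        tool = self.extended[name]
        self.active[name] = tool
        return tool


toolbox = NestedToolbox(
    core={"file_read": Tool("file_read", "Read a file", "files")},  # hypothetical core tool
    extended={
        "camera_capture": Tool("camera_capture", "Capture a camera frame", "hardware"),
        "audio_capture": Tool("audio_capture", "Record from microphone", "hardware"),
        "web_crawl": Tool("web_crawl", "Crawl a web page", "web"),
    },
)
print(toolbox.visible_schemas())
print([t["name"] for t in toolbox.explore_tools("hardware")])
```

The token saving comes from `visible_schemas()` staying small until the model asks for more; how much is saved in practice depends on the real schemas, so the ~80% figure quoted in the README is not something this sketch reproduces.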
@@ -923,7 +923,7 @@ On startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an
 
 
 
-## Tools (
+## Tools (64)
 
 <div align="right"><a href="#top">back to top</a></div>
 
@@ -1007,6 +1007,10 @@ On startup and `/model` switch, Open Agents detects your RAM/VRAM and creates an
 | `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.oa/identity/` |
 | `reflect` | Immune-system reflection — diagnostic (find flaws), epistemic (identify missing evidence), constitutional (review self-updates). Returns pass/revise/block verdict |
 | `explore` | ARCHE strategy-space exploration — generate diverse strategies, archive successful variants with tags/confidence, compare competing approaches, retrieve past strategies |
+| **Hardware Access** | |
+| `camera_capture` | Access system cameras — list devices, capture JPEG frames, query capabilities. Uses ffmpeg + v4l2. Supports USB, CSI, and 360 cameras (QooCam, RealSense). Captured images can be piped to vision tools |
+| `audio_capture` | Record from microphone — list input devices, record WAV/MP3 (configurable duration/rate/channels), check real-time mic level (RMS dBFS). Uses arecord + ffmpeg backends |
+| `audio_playback` | Speaker control and TTS — play audio files (WAV/MP3/OGG), text-to-speech via espeak-ng (multi-language), get/set system volume. Uses aplay/ffplay/amixer backends |
 
 Read-only tools execute concurrently when called in the same turn. Mutating tools run sequentially.
 
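Note on the scheduling rule in the context line above ("Read-only tools execute concurrently when called in the same turn. Mutating tools run sequentially."): one way to picture it is the sketch below. It is an assumption-laden illustration, not the package's scheduler. The function name `run_turn` and the `(name, fn, read_only)` tuple shape are hypothetical, and running the read-only batch before the mutating calls is a choice made here for simplicity; the README does not state the relative order.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

ToolFn = Callable[[], Awaitable[str]]


async def run_turn(calls: List[Tuple[str, ToolFn, bool]]) -> Dict[str, str]:
    """Run one turn's tool calls. Each call is (name, fn, read_only)."""
    results: Dict[str, str] = {}
    read_only = [(name, fn) for name, fn, ro in calls if ro]
    mutating = [(name, fn) for name, fn, ro in calls if not ro]

    # Read-only calls cannot interfere with each other, so they run together.
    outputs = await asyncio.gather(*(fn() for _, fn in read_only))
    for (name, _), out in zip(read_only, outputs):
        results[name] = out

    # Mutating calls run one at a time, in the order they were requested.
    for name, fn in mutating:
        results[name] = await fn()
    return results
```

With this shape, a turn that lists cameras and checks the mic level would overlap those two calls, while two file writes in the same turn would never run at the same time.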
@@ -1031,7 +1035,26 @@ The agent has 4 web tools. Pick the right one:
 
 **Structured extraction**: Pass `extract_schema='{"price": "number", "name": "string"}'` to `web_crawl` for best-effort regex-based field extraction from page content.
 
+### Hardware Tool Guide
 
+The agent can access physical hardware — cameras, microphones, and speakers — through three dedicated tools:
+
+| Need | Tool | Example |
+|------|------|---------|
+| See the environment | `camera_capture` action=capture | Grab a JPEG frame from any USB/CSI camera |
+| List cameras | `camera_capture` action=list | Discover `/dev/video*` devices |
+| Record audio | `audio_capture` action=record duration=10 | Record 10s WAV from default mic |
+| Check if mic works | `audio_capture` action=level | RMS level in dBFS |
+| Speak aloud | `audio_playback` action=speak text="Hello" | TTS via espeak-ng |
+| Play a sound file | `audio_playback` action=play file=alert.wav | Play WAV/MP3/OGG |
+| Check volume | `audio_playback` action=volume | Get current volume % |
+| Set volume | `audio_playback` action=volume volume=50 | Set to 50% |
+
+**Prerequisites**: `ffmpeg`, `arecord`, `aplay`, `amixer` (ALSA utils), `espeak-ng`. Install: `sudo apt install ffmpeg alsa-utils espeak-ng`
+
+**Camera support**: USB cameras (UVC), Intel RealSense (via UVC), 360 cameras (QooCam, Ricoh Theta — raw fisheye via v4l2loopback + ffmpeg crop). The captured frame is returned as base64 JPEG that can be fed directly to the `vision` tool for analysis.
+
+**Audio workflow**: Record → transcribe → analyze: `audio_capture action=record` → `transcribe_file` → process transcript. The tools handle device enumeration and graceful degradation when hardware is unavailable.
 
 
 ## Ralph Loop — Iteration-First Design
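Note on the Hardware Tool Guide added in the hunk above: two of its claims are easy to make concrete. The sketch below shows (1) a prerequisite check that supports the "graceful degradation" behaviour by reporting which of the listed binaries are missing from `PATH`, and (2) the standard definition of an RMS level in dBFS, which is what `audio_capture action=level` is described as returning. Both functions (`missing_prerequisites`, `rms_dbfs`) are hypothetical helpers written for illustration; the diff does not show how the package itself implements either step, and 16-bit signed PCM is an assumption.

```python
import math
import shutil
from typing import Iterable, List, Sequence

# Binaries named in the README's Prerequisites line.
REQUIRED_BINARIES = ("ffmpeg", "arecord", "aplay", "amixer", "espeak-ng")


def missing_prerequisites(binaries: Iterable[str] = REQUIRED_BINARIES) -> List[str]:
    """Return the binaries not found on PATH, so a caller can degrade gracefully."""
    return [name for name in binaries if shutil.which(name) is None]


def rms_dbfs(samples: Sequence[int], full_scale: int = 32768) -> float:
    """RMS level of signed 16-bit PCM samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(math.sqrt(mean_square) / full_scale)


# A synthetic full-scale sine (8 samples per period) stands in for a recording.
sine = [round(32767 * math.sin(2 * math.pi * k / 8)) for k in range(8000)]
print("missing:", missing_prerequisites())
print("level (dBFS):", round(rms_dbfs(sine), 2))
```

A full-scale sine sits roughly 3 dB below full scale, so a reading near that value on a loud test tone, or a very low value on a silent input, gives a quick sanity check that a microphone is delivering signal.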