npm - open-agents-ai - Versions diffs - 0.187.580 → 0.187.582 - Mend

open-agents-ai 0.187.580 → 0.187.582

Files changed (4)

package/README.md CHANGED

@@ -131,6 +131,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
   - [Research Foundations](#research-foundations)
 - [Voice Feedback (TTS)](#voice-feedback-tts)
   - [LuxTTS Voice Cloning](#luxtts-voice-cloning)
+  - [Supertonic Expressive Tags](#supertonic-expressive-tags)
   - [Narration Engine Architecture](#narration-engine-architecture)
   - [Emotion-Driven Prosody (SEST)](#emotion-driven-prosody-sest)
   - [Personality-Aware Voice](#personality-aware-voice)
@@ -148,6 +149,8 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
   - [Browser Automation](#browser-automation)
 - [Interactive TUI](#interactive-tui)
   - [Slash Commands](#slash-commands)
+  - [Platform Connectors](#platform-connectors)
+  - [Workspace Explorer](#workspace-explorer)
   - [Mid-Task Steering (Sub-Agent Architecture)](#mid-task-steering-sub-agent-architecture)
 - [Telegram Bridge — Sub-Agent Per Chat](#telegram-bridge--sub-agent-per-chat)
   - [Admin Slash Command Passthrough](#admin-slash-command-passthrough)
@@ -281,6 +284,8 @@ The agent uses tools autonomously in a loop — reading errors, fixing code, and
 - **Littleman Observer** — parallel meta-analysis system that watches the agent loop in real-time. Detects false failure claims after successful tools, blocks redundant re-execution, catches runaway one-sided output in conversations, and dynamically extends turn limits when active work is detected. Emits `debug_context` and `debug_littleman` events for live observability
 - **Interactive Session Lock** — generic `SESSION_ACTIVE` protocol prevents premature task completion during long-running sessions (phone calls, live chat, monitoring). Any MCP contract can adopt the protocol. Paired with context-engineered system prompts that teach small models to maintain conversation loops
 - **Voice Chat** — `/voicechat` starts an async voice conversation that runs parallel to the main agent loop. Mic audio is transcribed via Whisper and injected as user messages; agent responses are synthesized to speech via TTS. Neither blocks the other — talk to the agent while it works
+- **Platform connector menus** — `/platforms menu` opens a TUI onboarding and status surface for Telegram plus Discord, Slack, Matrix, and webhook adapter configuration. `/gateway` is the alias for the same connector surface
+- **Workspace explorer** — `/files`, `/files <query>`, and `/files menu` provide a root-bounded working-directory browser with searchable file selection, noisy-directory filtering, file classification, and inline previews in the TUI scrollback
 ### Cross-Modal Workers
@@ -312,6 +317,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
 - **Call Sub-Agent** — each WebSocket caller gets a dedicated AgenticRunner for low-latency voice-to-voice loops, with admin/public access tiers and bidirectional activity sharing with the main agent
 - **Telegram Voice** — `/voice` enabled via Telegram forwards TTS audio as voice messages alongside text responses. Incoming voice messages are auto-transcribed and handled as text
 - **Neural TTS** — hear what the agent is doing via GLaDOS, Overwatch, Kokoro, or LuxTTS voice clone, with literature-grounded narration engine (sNeuron-TST structure rotation, Moshi ring buffer dedup, UDDETTS emotion-driven prosody, SEST metadata, LuxTTS flow-matching voice cloning)
+- **Supertonic expressive tags** — when `/voice supertonic` is active, OA inserts supported expression tags such as `<sigh>`, `<breath>`, and `<laugh>` into spoken status updates based on failure, recovery, sentence boundaries, success, and playful tone. Other voice backends receive sanitized plain text
 - **Personality Core** — SAC framework-based style control (concise/balanced/verbose/pedagogical) that shapes agent response depth, voice expressiveness, and system prompt behavior
 - **Human expert speed ratio** — real-time `Exp: Nx` gauge comparing agent speed to a leading human expert, calibrated across 47 tool baselines
 - **Cost tracking** — real-time token cost estimation for 15+ cloud providers
@@ -2814,6 +2820,7 @@ The emotion system is informed by peer-reviewed and preprint research:
 /voice overwatch    # Overwatch voice (ONNX, ~50MB)
 /voice kokoro       # Kokoro voice (MLX, macOS Apple Silicon)
 /voice luxtts       # LuxTTS voice clone (flow-matching, any platform)
+/voice supertonic   # Supertonic3 voice with expressive tags
 /voice clone <file> # Set clone reference audio for LuxTTS (wav/mp3/ogg/flac)
 /voice clone glados # Generate clone ref from GLaDOS → LuxTTS
 /voice clone overwatch  # Generate clone ref from Overwatch → LuxTTS
@@ -2845,6 +2852,20 @@ Auto-downloads the ONNX voice model (~50MB) on first use. LuxTTS is the primary
 Output: 48kHz WAV, compatible with Telegram voice messages and WebSocket streaming.
+### Supertonic Expressive Tags
+When Supertonic is the active voice backend, OA decorates spoken status updates with the expression tags Supertonic supports. The tag pass runs after markdown/ANSI cleanup and only for Supertonic, so GLaDOS, Overwatch, Kokoro, and LuxTTS continue receiving plain sanitized text.
+Tag placement is context-aware:
+| Tag | Used for |
+|-----|----------|
+| `<sigh>` | Failures, retry loops, blocked work, and error recovery |
+| `<breath>` | Natural pauses between sentences or before follow-up checks |
+| `<laugh>` | Playful or successful moments where a lighter delivery fits |
+Existing tags are preserved and duplicate tags are suppressed. The result is still concise TUI narration, but failure modes, recovery, success, and playful updates sound less flat when `/voice supertonic` is selected.
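The tag pass described in this section could look roughly like the following. This is a minimal sketch, not the package's code: `decorate`, `stripTags`, `collapseDuplicateTags`, and the `NarrationContext` type are hypothetical names, the input is assumed to have already been through markdown/ANSI cleanup, and the placement rules are a simplified reading of the table above.

```typescript
// Hypothetical sketch: none of these identifiers come from open-agents-ai.
type NarrationContext = "failure" | "success" | "neutral";

const TAG_PATTERN = /<(?:sigh|breath|laugh)>/g;

// Non-Supertonic backends get plain text: drop tags and tidy whitespace.
function stripTags(text: string): string {
  return text.replace(TAG_PATTERN, "").replace(/\s{2,}/g, " ").trim();
}

// Collapse runs of the same tag, e.g. "<sigh> <sigh>" becomes "<sigh>".
function collapseDuplicateTags(text: string): string {
  return text.replace(/(<(?:sigh|breath|laugh)>)(?:\s*\1)+/g, "$1");
}

function decorate(text: string, backend: string, context: NarrationContext): string {
  if (backend !== "supertonic") return stripTags(text);

  let out = text.trim();
  // Add a pause at sentence boundaries unless a tag already follows.
  out = out.replace(/([.!?])\s+(?!\s*<)/g, "$1 <breath> ");

  // Assumed mapping: failures lead with a sigh, successes with a laugh.
  const lead = context === "failure" ? "<sigh>" : context === "success" ? "<laugh>" : "";
  // An existing leading tag is preserved rather than overridden.
  if (lead !== "" && !out.startsWith("<")) out = `${lead} ${out}`;

  return collapseDuplicateTags(out);
}
```

Under these assumptions, `decorate("Build failed. Retrying now.", "supertonic", "failure")` returns `<sigh> Build failed. <breath> Retrying now.`, while the same call with any other backend name returns the text with all tags removed.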
 ### Narration Engine Architecture
 The voice narration system produces **zero static phrase pools** — every spoken sentence is dynamically composed from live tool state, session metrics, and emotion coordinates. The architecture is grounded in 2024-2026 TTS and emotion research:
@@ -3271,7 +3292,7 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
 | `/compact` | Force context compaction now (default strategy) |
 | `/compact <strategy>` | Compact with strategy: `aggressive`, `decisions`, `errors`, `summary`, `structured` |
 | **Audio & Vision** | |
-| `/voice [model]` | Toggle TTS voice (GLaDOS, Overwatch, Kokoro, LuxTTS) |
+| `/voice [model]` | Toggle TTS voice (GLaDOS, Overwatch, Kokoro, LuxTTS, Supertonic) |
 | `/listen [mode]` | Toggle live microphone transcription |
 | `/dream [mode]` | Start dream mode (default, deep, lucid) |
 | **Display & Behavior** | |
@@ -3284,6 +3305,29 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
 | `/tools` | List agent-created custom tools |
 | `/skills [keyword]` | List/search available AIWG skills |
 | `/<skill-name> [args]` | Invoke an AIWG skill directly |
+| **Workspace Explorer** | |
+| `/files` | Show searchable working-directory file overview |
+| `/files <query>` | Filter the working-directory file overview |
+| `/files menu` | Open searchable workspace file selector and inline preview |
+| `/workspace` | Alias for `/files` |
+| `/workspace <query>` | Alias for `/files <query>` |
+| `/workspace menu` | Alias for `/files menu` |
+| **Platform Connectors** | |
+| `/platforms` | Show platform adapter state |
+| `/platforms status` | Show platform adapter state |
+| `/platforms menu` | Open interactive platform connectivity menu |
+| `/platforms setup` | Open platform onboarding menu |
+| `/platforms onboard` | Open platform onboarding menu |
+| `/platforms config` | Show redacted platform adapter configuration |
+| `/platforms enable <id>` | Enable a platform adapter |
+| `/platforms disable <id>` | Disable a platform adapter |
+| `/platforms set <id> token-env <ENV>` | Set adapter token environment variable |
+| `/platforms set <id> base-url <url>` | Set adapter base URL or webhook URL |
+| `/platforms set <id> target <target>` | Set default channel, room, or target |
+| `/platforms set <id> mode <polling\|webhook>` | Set adapter polling or webhook mode |
+| `/platforms clear <id> <key>` | Clear saved adapter token, token-env, base-url, or target |
+| `/gateway` | Alias for `/platforms` |
+| `/gateway menu` | Alias for `/platforms menu` |
 | **P2P & Secrets** | |
 | `/p2p start` | Start the P2P inference mesh node |
 | `/p2p connect <url>` | Connect to a remote peer |
@@ -3317,6 +3361,29 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
 All settings commands accept `--local` to save to project `.oa/settings.json` instead of global config.
+### Platform Connectors
+`/platforms menu` opens a keyboard-driven connectivity surface for platform status and onboarding. The same surface is available through `/gateway menu`.
+Supported adapter IDs are `telegram`, `discord`, `slack`, `matrix`, and `webhook`. Telegram is wired into the existing bridge controls, so the menu can save the bot token, save the admin user ID, start the bridge, stop it, and show status. The other platform adapters expose the shared connector configuration surface: enable/disable, token environment variable, base URL or webhook URL, default target/channel/room, polling vs webhook mode, redacted config display, and status/health reporting.
+Use environment variables for credentials:
+```bash
+/platforms set slack token-env SLACK_BOT_TOKEN
+/platforms set slack base-url https://slack.com/api
+/platforms set slack target C0123456789
+/platforms set slack mode webhook
+/platforms enable slack
+/platforms config
+```
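One way to picture the state those commands modify is a small per-adapter record. The sketch below is an assumption for illustration only: `AdapterConfig`, `setOption`, and `redact` are hypothetical names, and the package's real storage format is not shown in this diff. The keys mirror the `/platforms set <id> ...` options listed in the command table.

```typescript
// Hypothetical sketch: the real config shape in open-agents-ai is not shown in the diff.
type AdapterMode = "polling" | "webhook";

interface AdapterConfig {
  enabled: boolean;
  token?: string;    // a directly saved token, if any
  tokenEnv?: string; // name of the environment variable holding the token
  baseUrl?: string;
  target?: string;
  mode?: AdapterMode;
}

// Apply one `/platforms set <id> <key> <value>` style update.
function setOption(config: AdapterConfig, key: string, value: string): AdapterConfig {
  switch (key) {
    case "token-env":
      return { ...config, tokenEnv: value };
    case "base-url":
      return { ...config, baseUrl: value };
    case "target":
      return { ...config, target: value };
    case "mode":
      if (value === "polling" || value === "webhook") return { ...config, mode: value };
      throw new Error(`unknown mode: ${value}`);
    default:
      throw new Error(`unknown key: ${key}`);
  }
}

// Produce a copy that is safe to display: a saved token is masked,
// while the environment variable name stays visible.
function redact(config: AdapterConfig): AdapterConfig {
  return config.token === undefined ? { ...config } : { ...config, token: "[redacted]" };
}

// Replaying the Slack example from the README hunk above.
let slack: AdapterConfig = { enabled: false };
slack = setOption(slack, "token-env", "SLACK_BOT_TOKEN");
slack = setOption(slack, "base-url", "https://slack.com/api");
slack = setOption(slack, "target", "C0123456789");
slack = setOption(slack, "mode", "webhook");
slack = { ...slack, enabled: true };
```

Storing only the variable name in `tokenEnv` keeps the secret itself out of the saved configuration, which is presumably why the README recommends environment variables for credentials.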
+### Workspace Explorer
+`/files` renders a compact working-directory overview directly in the TUI scrollback. It classifies files as source, test, docs, config, assets, or other, skips noisy directories like `.git`, `node_modules`, build outputs, caches, and publish artifacts, and keeps all preview reads bounded to the current workspace root.
+Use `/files <query>` for fast path filtering, or `/files menu` for the searchable selector. Selecting a file prints an inline, line-numbered preview without changing editor focus. `/workspace`, `/workspace <query>`, and `/workspace menu` are aliases for the same explorer.
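The three behaviors named above (noisy-directory filtering, file classification, and root-bounded reads) can be sketched as plain functions. Everything here is a hypothetical illustration: the function names, the directory list, and the extension rules are assumptions, not the package's implementation, and the path check handles only POSIX-style relative paths without following symlinks.

```typescript
// Hypothetical sketch: names and rules are illustrative assumptions.
type FileKind = "source" | "test" | "docs" | "config" | "assets" | "other";

// Assumed examples of "noisy" directories; the real list is not shown in the diff.
const NOISY_DIRS = new Set([".git", "node_modules", "dist", "build", ".cache"]);

function isNoisy(relPath: string): boolean {
  return relPath.split("/").some((segment) => NOISY_DIRS.has(segment));
}

function classify(relPath: string): FileKind {
  const name = relPath.split("/").pop() ?? "";
  if (/\.(test|spec)\.[cm]?[jt]sx?$/.test(name) || /(^|\/)(tests?|__tests__)\//.test(relPath)) return "test";
  if (/\.(md|mdx|rst|txt)$/.test(name)) return "docs";
  if (/\.(json|ya?ml|toml|ini)$/.test(name) || name.startsWith(".")) return "config";
  if (/\.(png|jpe?g|gif|svg|webp|ico)$/.test(name)) return "assets";
  if (/\.[cm]?[jt]sx?$|\.(py|go|rs|java)$/.test(name)) return "source";
  return "other";
}

// Resolve a relative path against the root; return null if it would climb above it.
// A leading "/" is ignored, so the path is always treated as root-relative.
function resolveInsideRoot(root: string, relPath: string): string | null {
  const parts: string[] = [];
  for (const segment of relPath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) return null;
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return [root.replace(/\/+$/, ""), ...parts].join("/");
}

// Filtered, classified listing in the spirit of `/files <query>`.
function overview(paths: string[], query = ""): Array<{ path: string; kind: FileKind }> {
  const needle = query.toLowerCase();
  return paths
    .filter((p) => !isNoisy(p) && p.toLowerCase().includes(needle))
    .map((p) => ({ path: p, kind: classify(p) }));
}
```

With these rules, `resolveInsideRoot("/work/app", "../secrets.txt")` returns `null`, so a preview request for that path would be rejected before any file is read.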
 ### Mid-Task Steering (Sub-Agent Architecture)
 While the agent is working (shown by the `+` prompt), type to add context. A **dedicated steering sub-agent** spins up in the background to process your input: