npm - open-agents-ai - Versions diffs - 0.185.71 → 0.185.72 - Mend

open-agents-ai 0.185.71 → 0.185.72

Files changed (2)

package/README.md +5 -5
package/package.json +1 -1

package/README.md CHANGED

@@ -1286,7 +1286,7 @@ The voice narration system produces **zero static phrase pools** — every spoke
 | 2 | Contextual opener | "Moving to voice.ts" |
 | 3 | Gerund-led | "Taking a deeper look at voice.ts now" |
-**Ring buffer deduplication** ([Moshi inner monologue, arXiv:2410.00037](https://arxiv.org/abs/2410.00037)): A sliding window of the last 8 utterances catches near-duplicates via Jaccard word-level similarity (threshold 0.7). When a near-duplicate is detected, **DITTO adaptive rotation** ([arXiv:2206.02369](https://arxiv.org/abs/2206.02369), NeurIPS 2022) advances the structure pattern by 2 positions to break self-reinforcing repetition loops.
+**Ring buffer deduplication** ([Moshi inner monologue, [arXiv:2410.00037](https://arxiv.org/abs/2410.00037)](https://arxiv.org/abs/2410.00037)): A sliding window of the last 8 utterances catches near-duplicates via Jaccard word-level similarity (threshold 0.7). When a near-duplicate is detected, **DITTO adaptive rotation** ([arXiv:2206.02369](https://arxiv.org/abs/2206.02369), NeurIPS 2022) advances the structure pattern by 2 positions to break self-reinforcing repetition loops.
 **State-computed emotion interjections**: Instead of word pools, emotion interjections are computed from real session metrics. The emotion quadrant (from ADV coordinates) determines *which* metrics to surface:
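
The ring buffer deduplication described in the hunk above can be sketched roughly as follows. This is a minimal TypeScript illustration, not the package's actual code: the names `UtteranceDeduper`, `wordSet`, and `jaccard`, the tokenizer, the `>=` comparison at the threshold, and the choice to keep near-duplicates out of the window are all assumptions. Only the window size (8), the Jaccard threshold (0.7), and the rotation step (2) come from the README text.

```typescript
// Hypothetical sketch of windowed near-duplicate detection with pattern rotation.
// Values taken from the README: window of 8, threshold 0.7, rotation step 2.
const WINDOW_SIZE = 8;
const JACCARD_THRESHOLD = 0.7;
const ROTATION_STEP = 2;

// Assumed tokenizer: lowercase, then split on non-word characters.
function wordSet(text: string): Set<string> {
  const words = text.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
  return new Set<string>(words);
}

// Jaccard similarity = |A ∩ B| / |A ∪ B| over word sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared += 1;
  });
  return shared / (a.size + b.size - shared);
}

class UtteranceDeduper {
  private recent: string[] = [];
  private patternIndex = 0;
  private patternCount: number;

  // patternCount is the number of structure patterns available (assumed input).
  constructor(patternCount: number) {
    this.patternCount = patternCount;
  }

  currentPattern(): number {
    return this.patternIndex;
  }

  // Returns true when the candidate is a near-duplicate of a recent utterance.
  // On a hit, the structure pattern advances by ROTATION_STEP, wrapping around.
  check(candidate: string): boolean {
    const candidateWords = wordSet(candidate);
    const isDuplicate = this.recent.some(
      (prev) => jaccard(candidateWords, wordSet(prev)) >= JACCARD_THRESHOLD
    );
    if (isDuplicate) {
      this.patternIndex = (this.patternIndex + ROTATION_STEP) % this.patternCount;
      return true;
    }
    // Only accepted utterances enter the sliding window (an assumption).
    this.recent.push(candidate);
    if (this.recent.length > WINDOW_SIZE) this.recent.shift();
    return false;
  }
}
```

Under these assumptions, "Moving to voice.ts" followed by "Moving to voice.ts now" shares 4 of 5 distinct words (similarity 0.8), so the second utterance would be flagged and the pattern index would move forward by 2.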
@@ -1299,7 +1299,7 @@ The voice narration system produces **zero static phrase pools** — every spoke
 ### Emotion-Driven Prosody (SEST)
-The voice engine modulates **three prosodic dimensions** from the emotion state — text vocabulary stays factual, emotion is expressed through *how* it sounds, not *what* it says ([EmoShift, arXiv:2601.22873](https://arxiv.org/abs/2601.22873)):
+The voice engine modulates **three prosodic dimensions** from the emotion state — text vocabulary stays factual, emotion is expressed through *how* it sounds, not *what* it says ([EmoShift, [arXiv:2601.22873](https://arxiv.org/abs/2601.22873)](https://arxiv.org/abs/2601.22873)):
 | Dimension | Source | Effect | Range |
 |-----------|--------|--------|-------|
@@ -1307,7 +1307,7 @@ The voice engine modulates **three prosodic dimensions** from the emotion state
 | **Speed** | Arousal (primary) + Dominance (secondary) | High arousal = faster, high dominance = more deliberate | [0.85x, 1.15x] |
 | **Volume** | Speaker role | Primary = 100%, subordinate (sub-agent) = 55% | [0.55, 1.0] |
-Pitch and speed use **nonlinear tanh squashing** ([UDDETTS, arXiv:2505.10599](https://arxiv.org/abs/2505.10599)) — moderate emotions get amplified for expressiveness, extreme emotions saturate gracefully instead of clipping.
+Pitch and speed use **nonlinear tanh squashing** ([UDDETTS, [arXiv:2505.10599](https://arxiv.org/abs/2505.10599)](https://arxiv.org/abs/2505.10599)) — moderate emotions get amplified for expressiveness, extreme emotions saturate gracefully instead of clipping.
 Each narration also emits a **ProsodyHint** metadata object following the RLAIF-SPA SEST schema ([arXiv:2510.14628](https://arxiv.org/abs/2510.14628)) — Structure/Emotion/Speed/Tone — which downstream consumers (WebSocket voice sessions, Telegram TTS) can use independently:
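
To make the tanh squashing in the hunk above concrete, here is a small sketch for the speed dimension only. It assumes arousal and dominance are normalized to [-1, 1]; the function name `speedMultiplier` and the constants `GAIN` and `DOMINANCE_WEIGHT` are hypothetical and chosen purely for illustration. Only the [0.85x, 1.15x] range and the direction of each effect (high arousal is faster, high dominance is more deliberate) come from the README table.

```typescript
// Hypothetical sketch: map arousal/dominance to a speed multiplier via tanh.
// The [0.85x, 1.15x] range is from the README; GAIN and DOMINANCE_WEIGHT are assumed.
const SPEED_CENTER = 1.0;
const SPEED_SPAN = 0.15; // half-width of the [0.85, 1.15] range
const GAIN = 1.5; // slope at zero; values above 1 amplify moderate emotions
const DOMINANCE_WEIGHT = 0.3; // dominance is the secondary, slowing input

function speedMultiplier(arousal: number, dominance: number): number {
  // Arousal pushes speed up, dominance pulls it down (more deliberate).
  const drive = arousal - DOMINANCE_WEIGHT * dominance;
  // tanh is bounded by -1 and 1, so the result cannot leave the range.
  return SPEED_CENTER + SPEED_SPAN * Math.tanh(GAIN * drive);
}
```

With these assumed constants, a moderate arousal of 0.3 yields a larger speed change than a linear map `SPEED_SPAN * 0.3` would, while extreme inputs approach the range limits smoothly rather than being clipped.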
@@ -2533,7 +2533,7 @@ The eval runner supports `--runs N` for pass^k reliability measurement (consiste
 ### REST API Enterprise Evaluation (v0.185.68)
-35 test cases executed against the real REST API (`oa serve` on port 11435) across **10 industries** and **3 model tiers**. Each case sends a domain-specific prompt via `/v1/chat/completions` and verifies correctness against expected patterns.
+35 test cases executed against the oa REST API (`oa serve` on port 11435) across **10 industries** and **3 model tiers**. Each case sends a domain-specific prompt via `/v1/chat/completions` and verifies correctness against expected patterns.
 ```bash
 node eval/api-enterprise-eval.mjs                    # Run all 85 tests (35 cases × 3 models)
@@ -2572,7 +2572,7 @@ node eval/api-enterprise-eval.mjs                    # Run all 85 tests (35 case
 | 9B + PoT hint | 13% | **100%** | Models write correct Python but chat API can't execute it |
 | 27B + PoT hint | 47% | **100%** | Larger models can trace code mentally; full accuracy requires `repl_exec` in agentic mode |
-The PoT (Program-of-Thought) guidance achieves **100% code generation rate** — every model writes Python instead of computing in-head. Full correctness is realized in agentic mode where `repl_exec` executes the code. Research basis: PAL (arXiv:2211.10435), PoT (arXiv:2211.12588), ToRA (arXiv:2309.17452), START (arXiv:2503.04625).
+The PoT (Program-of-Thought) guidance achieves **100% code generation rate** — every model writes Python instead of computing in-head. Full correctness is realized in agentic mode where `repl_exec` executes the code. Research basis: PAL ([arXiv:2211.10435](https://arxiv.org/abs/2211.10435)), PoT ([arXiv:2211.12588](https://arxiv.org/abs/2211.12588)), ToRA ([arXiv:2309.17452](https://arxiv.org/abs/2309.17452)), START ([arXiv:2503.04625](https://arxiv.org/abs/2503.04625)).
 **Key architectural findings:**
 - API proxy timeout of 10s caused **100% failure** for cold model loads (Ollama needs 15-115s to load models). Fixed to 120s in v0.185.60.
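
A single evaluation case of the kind described in the REST API hunk above might look like the sketch below. The port (11435) and the `/v1/chat/completions` path come from the README; everything else is an assumption: the `localhost` host, the `EvalCase` shape, the function names, the use of a regular expression as the "expected pattern", and the request body fields (`model`, `messages`), which follow the common chat-completions convention rather than anything confirmed by this diff.

```typescript
// Hypothetical sketch of one eval case against `oa serve`; not the actual eval runner.
interface EvalCase {
  industry: string;
  prompt: string;
  expected: RegExp; // assumed form of the "expected pattern"
}

// Port 11435 is from the README; the host is an assumption.
const BASE_URL = "http://localhost:11435";

// Builds the URL and fetch-style options for a case without sending anything.
function buildRequest(model: string, evalCase: EvalCase) {
  return {
    url: `${BASE_URL}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // Assumed body shape (chat-completions convention).
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: evalCase.prompt }],
      }),
    },
  };
}

// Checks a response text against the case's expected pattern.
function passes(evalCase: EvalCase, responseText: string): boolean {
  return evalCase.expected.test(responseText);
}
```

The `url` and `init` values can be passed to `fetch`. Given the cold-load times noted in the findings (15-115s), a client-side timeout shorter than the 120s proxy timeout would likely abort requests that the server could still answer.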

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "open-agents-ai",
-  "version": "0.185.71",
+  "version": "0.185.72",
   "description": "AI coding agent powered by open-source models (Ollama/vLLM) — interactive TUI with agentic tool-calling loop",
   "type": "module",
   "main": "./dist/index.js",