@framers/agentos 0.5.9 → 0.5.10
This diff shows the changes between two publicly released versions of the package, as published to a supported registry. It is provided for informational purposes only.
- package/README.md +119 -410
- package/dist/api/agency.d.ts.map +1 -1
- package/dist/api/agency.js +0 -2
- package/dist/api/agency.js.map +1 -1
- package/dist/api/runtime/AgentOSOrchestrator.d.ts.map +1 -1
- package/dist/api/runtime/AgentOSOrchestrator.js +6 -3
- package/dist/api/runtime/AgentOSOrchestrator.js.map +1 -1
- package/dist/api/runtime/strategies/debate.d.ts.map +1 -1
- package/dist/api/runtime/strategies/debate.js +64 -21
- package/dist/api/runtime/strategies/debate.js.map +1 -1
- package/dist/api/runtime/strategies/graph.d.ts.map +1 -1
- package/dist/api/runtime/strategies/graph.js +11 -25
- package/dist/api/runtime/strategies/graph.js.map +1 -1
- package/dist/api/runtime/strategies/hierarchical.d.ts.map +1 -1
- package/dist/api/runtime/strategies/hierarchical.js +27 -7
- package/dist/api/runtime/strategies/hierarchical.js.map +1 -1
- package/dist/api/runtime/strategies/parallel.d.ts.map +1 -1
- package/dist/api/runtime/strategies/parallel.js +4 -8
- package/dist/api/runtime/strategies/parallel.js.map +1 -1
- package/dist/api/runtime/strategies/review-loop.d.ts +25 -0
- package/dist/api/runtime/strategies/review-loop.d.ts.map +1 -1
- package/dist/api/runtime/strategies/review-loop.js +5 -13
- package/dist/api/runtime/strategies/review-loop.js.map +1 -1
- package/dist/api/runtime/strategies/sequential.d.ts.map +1 -1
- package/dist/api/runtime/strategies/sequential.js +11 -25
- package/dist/api/runtime/strategies/sequential.js.map +1 -1
- package/dist/api/runtime/strategies/shared.d.ts +45 -8
- package/dist/api/runtime/strategies/shared.d.ts.map +1 -1
- package/dist/api/runtime/strategies/shared.js +47 -8
- package/dist/api/runtime/strategies/shared.js.map +1 -1
- package/dist/api/types.d.ts +16 -0
- package/dist/api/types.d.ts.map +1 -1
- package/dist/api/types.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,12 +1,14 @@
 <div align="center">
 
 <a href="https://agentos.sh">
-<img src="https://raw.githubusercontent.com/framersai/agentos/master/assets/agentos-primary-no-tagline-transparent-2x.png" alt="AgentOS — TypeScript AI Agent Framework" height="100" />
+<img src="https://raw.githubusercontent.com/framersai/agentos/master/assets/agentos-primary-no-tagline-transparent-2x.png" alt="AgentOS — TypeScript AI Agent Framework with Cognitive Memory" height="100" />
 </a>
 
 <br />
 
-**Open-
+# **AgentOS** — Open-Source TypeScript AI Agent Runtime with Cognitive Memory, HEXACO Personality, and Runtime Tool Forging
+
+**85.6% on LongMemEval-S** at $0.0090/correct · **70.2% on LongMemEval-M** (first open-source library above 65% on the 1.5M-token variant) · 16 LLM providers · 8 neuroscience-backed memory mechanisms · MIT-friendly Apache 2.0
 
 [](https://www.npmjs.com/package/@framers/agentos)
 [](https://github.com/framersai/agentos/actions/workflows/ci.yml)
@@ -16,316 +18,132 @@
 [](https://opensource.org/licenses/Apache-2.0)
 [](https://wilds.ai/discord)
 
-[Website](https://agentos.sh) · [Docs](https://docs.agentos.sh) · [npm](https://www.npmjs.com/package/@framers/agentos) · [
+[**Benchmarks**](https://docs.agentos.sh/benchmarks) · [Website](https://agentos.sh) · [Docs](https://docs.agentos.sh) · [npm](https://www.npmjs.com/package/@framers/agentos) · [Discord](https://wilds.ai/discord) · [Blog](https://docs.agentos.sh/blog)
 
 </div>
 
 ---
 
-##
-
-AgentOS is a TypeScript runtime for building AI agents that remember, adapt, and create new tools at runtime. Each agent is a **Generalized Mind Instance** (GMI) with its own personality, memory lifecycle, and behavioral adaptation loop.
-
-### Why AgentOS over alternatives?
-
-| vs. | AgentOS differentiator |
-|-----|------------------------|
-| **LangChain / LangGraph** | Cognitive memory (8 neuroscience-backed mechanisms), HEXACO personality, runtime tool forging |
-| **Vercel AI SDK** | Multi-agent teams (6 strategies), full RAG pipeline (7 vector backends), guardrails, voice/telephony |
-| **CrewAI / Mastra** | Unified orchestration (workflow DAGs + agent graphs + goal-driven missions), personality-driven routing |
-
-> **Full comparison:** [AgentOS vs LangGraph vs CrewAI vs Mastra](https://docs.agentos.sh/blog/2026/02/20/agentos-vs-langgraph-vs-crewai)
-
----
-
-## Classifier-Driven Memory Pipeline
-
-Most memory libraries retrieve on every query. AgentOS gates memory through three independent LLM-as-judge classifiers, so trivial queries skip retrieval entirely, queries that need memory get the right architecture, and the right reader handles each category.
+## Install
 
+```bash
+npm install @framers/agentos
 ```
-User query
-   │
-   ▼
-┌──────────────────────────────────┐
-│ Stage 1: QueryClassifier         │  gpt-5-mini few-shot, ~$0.0001 / query
-│ Memory needed at all?            │
-│   T0 = none ────────────────► answer from context, skip retrieval
-│   T1+ = simple/moderate/complex  │
-└──────────────────────────────────┘
-   │ (T1+ only)
-   ▼
-┌──────────────────────────────────┐
-│ Stage 2: MemoryRouter            │  reuses Stage 1 classification
-│ Which retrieval architecture?    │
-│   canonical-hybrid · OM-v10 · OM-v11
-└──────────────────────────────────┘
-   │
-   ▼
-┌──────────────────────────────────┐
-│ Stage 3: ReaderRouter            │  reuses Stage 1 classification
-│ Which reader tier?               │
-│   gpt-4o (TR/SSU) · gpt-5-mini (SSA/SSP/KU/MS)
-└──────────────────────────────────┘
-   │
-   ▼
-Grounded answer
-```
-
-Each stage is a small LLM-as-judge classifier (gpt-5-mini, ~$0.0001-0.0014 per call). Each stage is independent and shippable on its own. Stages 2 and 3 reuse the Stage 1 classification output, so the full pipeline costs **one classifier call per query**, not three.
 
+```typescript
+import { agent } from '@framers/agentos';
 
-
-
-
-
-
+const tutor = agent({
+  provider: 'anthropic',
+  instructions: 'You are a patient CS tutor.',
+  personality: { openness: 0.9, conscientiousness: 0.95 },
+  memory: { types: ['episodic', 'semantic'], working: { enabled: true } },
+});
 
-
+const session = tutor.session('student-1');
+await session.send('Explain recursion with an analogy.');
+await session.send('Can you expand on that?'); // remembers context
+```
 
-
+[Full quickstart](https://docs.agentos.sh/getting-started) · [Examples cookbook](https://docs.agentos.sh/getting-started/examples) · [API reference](https://docs.agentos.sh/api)
 
 ---
 
 ## Memory Benchmarks at Matched Reader
 
-Honest, apples-to-apples comparison: same
+Honest, apples-to-apples comparison: same `gpt-4o` reader, same dataset, same Phase B N=500, same `gpt-4o-2024-08-06` judge with rubric `2026-04-18.1` (judge FPR 1% [0%, 3%]). Cross-provider configurations are excluded because they cannot be reproduced from public methodology disclosures.
 
-### LongMemEval-S Phase B (115K tokens, 50 sessions
+### LongMemEval-S Phase B (115K tokens, 50 sessions)
 
-| System (gpt-4o reader
+| System (gpt-4o reader) | Accuracy | 95% CI | $/correct | p50 latency | Source |
 |---|---:|---|---:|---:|---|
 | EmergenceMem Internal | 86.0% | not published | not published | 5,650 ms | [emergence.ai](https://www.emergence.ai/blog/sota-on-longmemeval-with-rag) |
-| **🚀 AgentOS canonical-hybrid + reader-router** | **85.6%** | **[82.4%, 88.6%]** | **$0.0090** | **3,558 ms** | [
+| **🚀 AgentOS canonical-hybrid + reader-router** | **85.6%** | **[82.4%, 88.6%]** | **$0.0090** | **3,558 ms** | [post](https://docs.agentos.sh/blog/2026/04/28/reader-router-pareto-win) |
 | Mastra OM gpt-4o (gemini-flash observer) | 84.23% | not published | not published | not published | [mastra.ai](https://mastra.ai/research/observational-memory) |
 | Supermemory gpt-4o | 81.6% | not published | not published | not published | [supermemory.ai](https://supermemory.ai/research/) |
-| EmergenceMem Simple Fast (
-| Zep self
+| EmergenceMem Simple Fast (in our harness) | 80.6% | [77.0%, 84.0%] | $0.0586 | 3,703 ms | [adapter](https://github.com/framersai/agentos-bench/blob/master/vendors/emergence-simple-fast/) |
+| Zep self / independent reproduction | 71.2% / 63.8% | not published | not published | — | [self](https://blog.getzep.com/state-of-the-art-agent-memory/) / [arXiv](https://arxiv.org/abs/2512.13564) |
+
+**+1.4 pp at point estimate over Mastra OM gpt-4o at the matched reader.** Mastra publishes no CI; their 84.23% sits inside our 95% CI [82.4%, 88.6%], so the gap is at the threshold of statistical significance. EmergenceMem Internal's 86.0% (no CI) also sits inside our CI; we are statistically tied with both. AgentOS p50 latency 3,558 ms vs EmergenceMem's published median 5,650 ms (-2,092 ms at the median; the only vendor that publishes a comparable latency number).
 
-
+**Cost at scale**: $0.0090 per memory-grounded answer = $9 per 1,000 RAG calls. A chatbot averaging 5 RAG calls per conversation across 1,000 conversations costs ~$45.
 
-### LongMemEval-M Phase B (1.5M tokens, 500 sessions
+### LongMemEval-M Phase B (1.5M tokens, 500 sessions)
 
-The harder variant. M's haystacks exceed every production context window
+The harder variant. M's haystacks exceed every production context window. Most vendors stop at S because raw long-context fits there.
 
 | System | Accuracy | 95% CI | License | Source |
 |---|---:|---|---|---|
-| AgentBrain | 71.7%
-| **🚀 AgentOS (sem-embed + reader-router + top-K=5)
-| LongMemEval paper academic baseline | 65.7% | not published | open repo | [Wu et al., ICLR 2025
-| Mem0 v3, Mastra
-
-**Statistically tied with AgentBrain's closed-source SaaS** (their 71.7% sits inside our CI [66.0%, 74.0%]). **+4.5 pp above the LongMemEval paper's published academic ceiling.** **First open-source memory library on the public record above 65% on M with full methodology disclosure** (bootstrap CIs, per-case run JSONs, reproducible CLI, MIT-licensed).
-
-### Methodology disclosure (12 axes most vendors omit)
-
-| Axis | AgentOS | Most vendors |
-|---|:-:|:-:|
-| Aggregate accuracy | yes | yes |
-| 95% bootstrap CI on headline | yes | no |
-| Per-category 95% CI | yes | no |
-| Reader model disclosed | yes | mostly |
-| Observer / ingest model disclosed | yes | mostly |
-| USD cost per correct | yes | no |
-| Latency avg / p50 / p95 | yes | rarely |
-| Per-category breakdown | yes | sometimes |
-| Open-source benchmark runner | yes | rarely |
-| Per-case run JSONs at fixed seed | yes | no |
-| Judge-adversarial FPR probe | yes (1% S, 2% M, 0% LOCOMO) | no |
-| Matched-reader cross-vendor table | yes | partial |
-
-The full audit framework is at [Memory Benchmark Transparency Audit](https://docs.agentos.sh/blog/2026/04/24/memory-benchmark-transparency-audit). Every run referenced above ships with a per-case run JSON at `seed=42`.
+| AgentBrain | 71.7% | not published | closed-source SaaS | [github.com/AgentBrainHQ](https://github.com/AgentBrainHQ) |
+| **🚀 AgentOS** (sem-embed + reader-router + top-K=5) | **70.2%** | **[66.0%, 74.0%]** | **MIT** | [post](https://docs.agentos.sh/blog/2026/04/29/longmemeval-m-70-with-topk5) |
+| LongMemEval paper academic baseline | 65.7% | not published | open repo | [Wu et al., ICLR 2025](https://arxiv.org/abs/2410.10813) |
+| Mem0 v3, Mastra, Hindsight, Zep, EmergenceMem, Supermemory, Letta, others | not published | — | various | reports S only |
 
-
+**Statistically tied with AgentBrain's closed-source SaaS** (their 71.7% sits inside our CI). **+4.5 pp above the LongMemEval paper's academic ceiling.** **First open-source memory library above 65% on M with full methodology disclosure** (bootstrap CIs, per-case run JSONs, reproducible CLI).
 
-
-
-### 🌀 Paracosm — AI Agent Swarm Simulation
-
-Define any scenario as JSON. Run it with AI commanders that have different HEXACO personalities. Same starting conditions, different decisions, divergent civilizations. Built on AgentOS.
-
-```bash
-npm install paracosm
-```
-
-**[Live Demo](https://paracosm.agentos.sh/sim)** · **[GitHub](https://github.com/framersai/paracosm)** · **[npm](https://www.npmjs.com/package/paracosm)** · **[Landing Page](https://paracosm.agentos.sh)**
+> **[Full benchmarks page →](https://docs.agentos.sh/benchmarks)** · **[Reproducible run JSONs →](https://github.com/framersai/agentos-bench/tree/master/results/runs)** · **[Methodology audit →](https://docs.agentos.sh/blog/2026/04/24/memory-benchmark-transparency-audit)**
 
 ---
 
-##
-
-```bash
-npm install @framers/agentos
-```
-
-### Configure API Keys
-
-```bash
-# Environment variables (recommended for production)
-export OPENAI_API_KEY=sk-...
-export ANTHROPIC_API_KEY=sk-ant-...
-export GEMINI_API_KEY=AIza...
-
-# Key rotation — comma-separated keys auto-rotate with quota detection
-export OPENAI_API_KEY=sk-key1,sk-key2,sk-key3
-```
-
-```typescript
-// Or pass apiKey inline (multi-tenant apps, tests, dynamic config)
-await generateText({ provider: 'openai', apiKey: 'sk-...', prompt: '...' });
-```
-
-All high-level functions accept `apiKey` and `baseUrl` parameters.
-
----
-
-## Quick Start
-
-### Generate Text
-
-```typescript
-import { generateText } from '@framers/agentos';
-
-// Auto-detect provider from env vars
-const { text } = await generateText({
-  prompt: 'Explain TCP handshakes in 3 bullets.',
-});
-
-// Pin a provider
-const { text: claude } = await generateText({
-  provider: 'anthropic',
-  prompt: 'Compare TCP and UDP.',
-});
-```
-
-16 providers, automatic fallback. When the primary provider returns a retryable error (HTTP 402/429/5xx, network failures, auth issues), `generateText` walks the canonical fallback chain for that provider using whichever API keys are present in the environment — no extra imports, no chain construction needed:
-
-```typescript
-import { generateText } from '@framers/agentos';
-
-const { text } = await generateText({
-  provider: 'anthropic',
-  prompt: 'Compare TCP and UDP.',
-});
-// Anthropic primary, falls through to OpenAI / Gemini / OpenRouter / etc. on retryable errors
-```
-
-Want strict single-provider routing (e.g. for billing isolation, capability auditing, or provider-pinned tests)? Pass an empty array to opt out:
-
-```typescript
-const { text } = await generateText({
-  provider: 'anthropic',
-  prompt: 'Compare TCP and UDP.',
-  fallbackProviders: [], // strict mode — fail if Anthropic is unavailable
-});
-```
-
-Or supply your own chain (and import `buildFallbackChain` only if you want to derive a default chain to splice from):
-
-```typescript
-import { generateText, buildFallbackChain } from '@framers/agentos';
-
-const { text } = await generateText({
-  provider: 'anthropic',
-  prompt: 'Compare TCP and UDP.',
-  fallbackProviders: [
-    { provider: 'openai', model: 'gpt-4o-mini' },
-    { provider: 'openrouter' },
-  ],
-});
-```
-
-### Streaming
+## Classifier-Driven Memory Pipeline
 
-
-import { streamText } from '@framers/agentos';
+Most memory libraries retrieve on every query. AgentOS gates memory through three LLM-as-judge classifiers in a single shared pass, so trivial queries skip retrieval entirely and the rest get the right architecture and reader per category.
 
-const stream = streamText({ provider: 'openai', prompt: 'Write a haiku.' });
-for await (const chunk of stream.textStream) process.stdout.write(chunk);
 ```
-
-
-
-
-
-
-
-
-
-
-  schema: z.object({
-    sentiment: z.enum(['positive', 'negative', 'neutral']),
-    topics: z.array(z.string()),
-  }),
-  prompt: 'Analyze: "Great camera but disappointing battery."',
-});
+User query
+  │
+  ▼ Stage 1: QueryClassifier (gpt-5-mini, ~$0.0001/query)
+  │    T0=none ─────► answer from context, skip retrieval
+  │    T1+=needs memory
+  ▼ Stage 2: MemoryRouter → canonical-hybrid · OM-v10 · OM-v11
+  ▼ Stage 3: ReaderRouter → gpt-4o (TR/SSU) · gpt-5-mini (SSA/SSP/KU/MS)
+  ▼
+Grounded answer
 ```
 
-
+Stages 2 and 3 reuse the Stage 1 classification, so the full pipeline costs **one classifier call per query**, not three. **The T0 / no-memory gate is the novel piece**: removing retrieval entirely for greetings and small talk saves the embedding + rerank + reader cost on a substantial fraction of typical agent traffic.
 
-
-
-
-
-
-console.log(reply.text);
-```
-
-### Agent with Personality & Memory
-
-```typescript
-const tutor = agent({
-  provider: 'anthropic',
-  instructions: 'You are a patient CS tutor.',
-  personality: {
-    openness: 0.9,
-    conscientiousness: 0.95,
-    agreeableness: 0.85,
-  },
-  memory: {
-    types: ['episodic', 'semantic'],
-    working: { enabled: true, maxTokens: 1200 },
-  },
-});
+| Primitive | Source | Decision |
+|---|---|---|
+| `QueryClassifier` | [`@framers/agentos/query-router`](https://docs.agentos.sh/features/query-routing) | T0/none vs T1/simple vs T2/moderate vs T3/complex |
+| `MemoryRouter` | [`@framers/agentos/memory-router`](https://docs.agentos.sh/features/memory-router) | canonical-hybrid vs observational-memory-v10 vs v11 |
+| `ReaderRouter` | [`@framers/agentos/memory-router`](https://docs.agentos.sh/features/memory-router) | gpt-4o vs gpt-5-mini per category |
 
-
-await session.send('Explain recursion with an analogy.');
-await session.send('Can you expand on that?'); // remembers context
-```
+[Cognitive Pipeline docs →](https://docs.agentos.sh/features/cognitive-pipeline) · [Architecture deep dive →](https://docs.agentos.sh/blog/2026/04/10/cognitive-memory-architecture-deep-dive) · [Beyond RAG →](https://docs.agentos.sh/blog/2026/03/31/cognitive-memory-beyond-rag)
 
-
+---
 
-
+## Why AgentOS
 
-
-
+| vs. | AgentOS differentiator |
+|---|---|
+| **LangChain / LangGraph** | Cognitive memory ([8 neuroscience-backed mechanisms](https://docs.agentos.sh/features/cognitive-memory)), HEXACO personality, runtime tool forging |
+| **Vercel AI SDK** | Multi-agent teams (6 strategies), 7 vector backends, [guardrails](https://docs.agentos.sh/features/guardrails-architecture), voice/telephony |
+| **CrewAI / Mastra** | Unified orchestration (DAGs + graphs + missions), personality-driven routing, **published reproducible numbers on LongMemEval-S (85.6%) and LongMemEval-M (70.2%) with full methodology disclosure** |
 
-
-  async getContext(text, opts) {
-    return { contextText: await recallRelevant(text, opts?.tokenBudget) };
-  },
-  async observe(role, text) {
-    await persist(role, text);
-  },
-};
+[Full framework comparison →](https://docs.agentos.sh/blog/2026/02/20/agentos-vs-langgraph-vs-crewai)
 
-
-  provider: 'anthropic',
-  instructions: 'You are a patient CS tutor.',
-  memoryProvider: myProvider,
-});
+---
 
-
-// recorded after. No session required.
-const stream = tutor.stream('Explain recursion.');
+## Key Features
 
-
-
-
-
+| Category | Highlights |
+|---|---|
+| **LLM Providers** | 16: OpenAI, Anthropic, Gemini, Groq, Ollama, OpenRouter, Together, Mistral, xAI, Claude/Gemini CLI, + 5 image/video |
+| **Cognitive Memory** | 8 mechanisms: reconsolidation, retrieval-induced forgetting, involuntary recall, FOK, gist extraction, schema encoding, source decay, emotion regulation |
+| **HEXACO Personality** | 6 traits modulate memory, retrieval bias, response style |
+| **RAG Pipeline** | 7 vector backends · 4 retrieval strategies · GraphRAG · HyDE · Cohere rerank-v3.5 |
+| **Multi-Agent Teams** | 6 coordination strategies · shared memory · inter-agent messaging · HITL gates |
+| **Orchestration** | `workflow()` DAGs · `AgentGraph` cycles · `mission()` goal-driven planning · checkpointing |
+| **Guardrails** | 5 security tiers · 6 packs (PII, ML classifiers, topicality, code safety, grounding, content policy) |
+| **Emergent Capabilities** | Runtime tool forging · 4 self-improvement tools · tiered promotion · skill export |
+| **Voice & Telephony** | ElevenLabs, Deepgram, Whisper · Twilio, Telnyx, Plivo |
+| **Channels** | 37 platform adapters (Telegram, Discord, Slack, WhatsApp, webchat, ...) |
+| **Observability** | OpenTelemetry · usage ledger · cost guard · circuit breaker |
 
-
+---
 
-
+## Multi-Agent in 6 Lines
 
 ```typescript
 import { agency } from '@framers/agentos';
@@ -334,180 +152,81 @@ const team = agency({
|
|
|
334
152
|
strategy: 'graph',
|
|
335
153
|
agents: {
|
|
336
154
|
researcher: { provider: 'anthropic', instructions: 'Find relevant facts.' },
|
|
337
|
-
writer: { provider: 'openai',
|
|
338
|
-
reviewer: { provider: 'gemini',
|
|
155
|
+
writer: { provider: 'openai', instructions: 'Summarize clearly.', dependsOn: ['researcher'] },
|
|
156
|
+
reviewer: { provider: 'gemini', instructions: 'Check accuracy.', dependsOn: ['writer'] },
|
|
339
157
|
},
|
|
340
158
|
});
|
|
341
159
|
|
|
342
160
|
const result = await team.generate('Compare TCP vs UDP for game networking.');
|
|
343
161
|
```
|
|
344
162
|
|
|
345
|
-
|
|
163
|
+
Strategies: `sequential` · `parallel` · `debate` · `review-loop` · `hierarchical` · `graph`. [Multi-agent docs →](https://docs.agentos.sh/features/multi-agent)
|
|
346
164
|
|
|
347
|
-
|
|
348
|
-
|
|
349
|
-
```typescript
|
|
350
|
-
import { generateImage, generateVideo, generateMusic, performOCR, embedText } from '@framers/agentos';
|
|
351
|
-
|
|
352
|
-
const image = await generateImage({ provider: 'openai', prompt: 'Neon cityscape at sunset' });
|
|
353
|
-
const video = await generateVideo({ prompt: 'Drone over misty forest' });
|
|
354
|
-
const music = await generateMusic({ prompt: 'Lo-fi hip hop beat' });
|
|
355
|
-
const ocr = await performOCR({ image: './receipt.png', strategy: 'progressive' });
|
|
356
|
-
const embed = await embedText({ provider: 'openai', input: ['hello', 'world'] });
|
|
357
|
-
```
|
|
358
|
-
|
|
359
|
-
### Orchestration
|
|
360
|
-
|
|
361
|
-
Three authoring APIs, one graph runtime:
|
|
165
|
+
---
|
|
362
166
|
|
|
363
|
-
|
|
364
|
-
import { workflow, AgentGraph, mission } from '@framers/agentos/orchestration';
|
|
167
|
+
## See It In Action
|
|
365
168
|
|
|
366
|
-
|
|
367
|
-
const pipe = workflow('content').step('research', { tool: 'web_search' }).then('draft', { gmi: { instructions: '...' } }).compile();
|
|
169
|
+
### 🌀 Paracosm — AI Agent Swarm Simulation
|
|
368
170
|
|
|
369
|
-
|
|
370
|
-
const graph = new AgentGraph('review').addNode('draft', gmiNode({...})).addNode('review', judgeNode({...})).addEdge('draft','review').compile();
|
|
171
|
+
Define any scenario as JSON. Run it with AI commanders that have different HEXACO personalities. Same starting conditions, different decisions, divergent civilizations. Built on AgentOS.
|
|
371
172
|
|
|
372
|
-
|
|
373
|
-
|
|
173
|
+
```bash
|
|
174
|
+
npm install paracosm
|
|
374
175
|
```
|
|
375
176
|
|
|
376
|
-
|
|
377
|
-
|
|
378
|
-
AgentOS exposes related entry points at different depths. The shared config surface does not imply identical enforcement across them.
|
|
379
|
-
|
|
380
|
-
- The lightweight `agent()` facade owns prompt assembly, sessions, personality shaping, hooks, tools, and usage-ledger forwarding.
|
|
381
|
-
- `generateText()` and `streamText()` are the low-level generation helpers for provider control, native tool calling, and text-fallback tool loops.
|
|
382
|
-
- The full `AgentOS` runtime and `agency()` own emergent tooling, guardrails, discovery, RAG initialization, permissions/security tiers, HITL, channels/voice, and provenance-aware orchestration.
|
|
383
|
-
|
|
384
|
-
---
|
|
385
|
-
|
|
386
|
-
## Key Features
|
|
387
|
-
|
|
388
|
-
| Category | Highlights |
|
|
389
|
-
|----------|-----------|
|
|
390
|
-
| **LLM Providers** | 16 providers: OpenAI, Anthropic, Gemini, Groq, Ollama, OpenRouter, Together, Mistral, xAI, Claude CLI, Gemini CLI, + 5 image/video |
|
|
391
|
-
| **Cognitive Memory** | 8 neuroscience-backed mechanisms (reconsolidation, RIF, involuntary recall, FOK, gist extraction, schema encoding, source decay, emotion regulation) |
|
|
392
|
-
| **HEXACO Personality** | 6 traits modulate memory, retrieval bias, response style — agents have consistent identity |
|
|
393
|
-
| **RAG Pipeline** | 7 vector backends (InMemory, SQL, HNSW, Qdrant, Neo4j, pgvector, Pinecone) · 4 retrieval strategies · GraphRAG |
|
|
394
|
-
| **Multi-Agent Teams** | 6 coordination strategies · shared memory · inter-agent messaging · HITL approval gates |
|
|
395
|
-
| **Orchestration** | `workflow()` DAGs · `AgentGraph` cycles/subgraphs · `mission()` goal-driven planning · persistent checkpointing |
|
|
396
|
-
| **Guardrails** | 5 security tiers · 6 packs (PII redaction, ML classifiers, topicality, code safety, grounding, content policy) |
|
|
397
|
-
| **Emergent Capabilities** | Runtime tool forging · 4 self-improvement tools · tiered promotion (session → agent → shared) · skill export |
|
|
398
|
-
| **Capability Discovery** | Semantic per-turn tool selection · ~90% token reduction · 3-tier context model · Neo4j graph backend |
|
|
399
|
-
| **Skills** | 88 curated skills · 3-tier architecture (engine, content, catalog SDK) · auto-update on install |
|
|
400
|
-
| **Voice & Telephony** | ElevenLabs, Deepgram, OpenAI Whisper · Twilio, Telnyx, Plivo |
|
|
401
|
-
| **Channels** | 37 platform adapters (Telegram, Discord, Slack, WhatsApp, webchat, and more) |
|
|
402
|
-
| **Structured Output** | Zod-validated JSON extraction with retry · provider-native structured output |
|
|
403
|
-
| **Observability** | OpenTelemetry traces/metrics · usage ledger · cost guard · circuit breaker |
|
|
177
|
+
[Live Demo](https://paracosm.agentos.sh/sim) · [GitHub](https://github.com/framersai/paracosm) · [npm](https://www.npmjs.com/package/paracosm)
|
|
404
178
|
|
|
405
179
|
---
|
|
406
180
|
|
|
407
|
-
##
|
|
408
|
-
|
|
409
|
-
| Provider | Text Model | Image Model | Env Var |
|
|
410
|
-
|---|---|---|---|
|
|
411
|
-
| `openai` | gpt-4o | gpt-image-1 | `OPENAI_API_KEY` |
|
|
412
|
-
| `anthropic` | claude-sonnet-4 | — | `ANTHROPIC_API_KEY` |
|
|
413
|
-
| `gemini` | gemini-2.5-flash | — | `GEMINI_API_KEY` |
|
|
414
|
-
| `groq` | llama-3.3-70b | — | `GROQ_API_KEY` |
|
|
415
|
-
| `ollama` | llama3.2 | stable-diffusion | `OLLAMA_BASE_URL` |
|
|
416
|
-
| `openrouter` | openai/gpt-4o | — | `OPENROUTER_API_KEY` |
|
|
417
|
-
| `together` | Llama-3.1-70B | — | `TOGETHER_API_KEY` |
|
|
418
|
-
| `mistral` | mistral-large | — | `MISTRAL_API_KEY` |
|
|
419
|
-
| `xai` | grok-2 | — | `XAI_API_KEY` |
|
|
420
|
-
| `stability` | — | stable-diffusion-xl | `STABILITY_API_KEY` |
|
|
421
|
-
| `replicate` | — | flux-1.1-pro | `REPLICATE_API_TOKEN` |
|
|
422
|
-
| `bfl` | — | flux-pro-1.1 | `BFL_API_KEY` |
|
|
423
|
-
| `fal` | — | fal-ai/flux/dev | `FAL_API_KEY` |
|
|
424
|
-
| `claude-code-cli` | claude-sonnet-4 | — | `claude` on PATH |
|
|
425
|
-
| `gemini-cli` | gemini-2.5-flash | — | `gemini` on PATH |
|
|
426
|
-
|
|
427
|
-
Auto-detection: OpenAI → Anthropic → OpenRouter → Gemini → Groq → Together → Mistral → xAI → CLI → Ollama
|
|
428
|
-
|
|
429
|
-
### Model String Formats
|
|
430
|
-
|
|
431
|
-
Three ways to specify a model:
|
|
432
|
-
|
|
433
|
-
```ts
|
|
434
|
-
// 1. Separate fields (recommended)
|
|
435
|
-
generateText({ provider: 'anthropic', model: 'claude-sonnet-4-20250514', prompt: '...' });
|
|
181
|
+
## Configure API Keys
|
|
436
182
|
|
|
437
|
-
|
|
438
|
-
|
|
439
|
-
|
|
440
|
-
|
|
441
|
-
generateText({ model: 'anthropic/claude-sonnet-4-20250514', prompt: '...' });
|
|
183
|
+
```bash
|
|
184
|
+
export OPENAI_API_KEY=sk-...
|
|
185
|
+
export ANTHROPIC_API_KEY=sk-ant-...
|
|
186
|
+
export GEMINI_API_KEY=AIza...
|
|
442
187
|
|
|
443
|
-
|
|
444
|
-
|
|
188
|
+
# Comma-separated keys auto-rotate with quota detection
|
|
189
|
+
export OPENAI_API_KEY=sk-key1,sk-key2,sk-key3
|
|
445
190
|
```
|
|
446
191
|
|
|
447
|
-
|
|
192
|
+
Or pass `apiKey` inline on any call. Auto-detection order: OpenAI → Anthropic → OpenRouter → Gemini → Groq → Together → Mistral → xAI → CLI → Ollama. [Default models per provider →](https://docs.agentos.sh/architecture/llm-providers)
 
 ---
 
-## API
-
-### High-Level Functions
-
-| Function | Description |
-|----------|-------------|
-| `generateText()` | Text generation with multi-step tool calling |
-| `streamText()` | Streaming text with async iterables |
-| `generateObject()` | Zod-validated structured output |
-| `streamObject()` | Streaming structured output |
-| `generateImage()` | Image generation (7 providers, character consistency) |
-| `generateVideo()` | Video generation |
-| `generateMusic()` / `generateSFX()` | Audio generation |
-| `performOCR()` | Text extraction from images |
-| `embedText()` | Embedding generation |
-| `agent()` | Lightweight stateful agent for prompts, tools, memory, and sessions |
-| `agency()` | Multi-agent teams plus full runtime-owned orchestration features |
-
-### Orchestration
-
-| `mission(name)` | Goal-driven, planner decides steps |
-
-Full API reference
+## API Surfaces
+
+- **`agent()`**: lightweight stateful agent. Prompts, sessions, personality, hooks, tools, memory.
+- **`agency()`**: multi-agent teams + full runtime. Emergent tooling, guardrails, RAG, voice, channels, HITL.
+- **`generateText()` / `streamText()` / `generateObject()` / `generateImage()` / `generateVideo()` / `generateMusic()` / `performOCR()` / `embedText()`**: low-level multi-modal helpers with native tool calling.
+- **`workflow()` / `AgentGraph` / `mission()`**: three orchestration authoring APIs over one graph runtime.
 
+[Full API reference →](https://docs.agentos.sh/api) · [High-Level API guide →](https://docs.agentos.sh/getting-started/high-level-api)
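The last bullet in the new section claims three authoring APIs sit over one graph runtime. A toy sketch of what that layering means, under stated assumptions: `MiniGraph`, `Step`, and this `workflow()` are hypothetical stand-ins, not the actual AgentOS types, and the real `AgentGraph` surely supports far more than linear edges.

```typescript
// Illustrative-only: a minimal node graph, plus a sequential workflow()
// helper that compiles down onto it. Not the real AgentOS implementation.
type Step = (input: string) => string;

class MiniGraph {
  private nodes = new Map<string, Step>();
  private edges = new Map<string, string>();

  addNode(id: string, step: Step): this {
    this.nodes.set(id, step);
    return this;
  }

  addEdge(from: string, to: string): this {
    this.edges.set(from, to);
    return this;
  }

  // Walk from `start`, threading the value through each node
  // until a node has no outgoing edge.
  run(start: string, input: string): string {
    let id: string | undefined = start;
    let value = input;
    while (id !== undefined) {
      const step = this.nodes.get(id);
      if (!step) throw new Error(`unknown node: ${id}`);
      value = step(value);
      id = this.edges.get(id);
    }
    return value;
  }
}

// A sequential workflow is just a linear graph: step i feeds step i+1.
function workflow(steps: Step[]): (input: string) => string {
  const g = new MiniGraph();
  steps.forEach((step, i) => {
    g.addNode(String(i), step);
    if (i > 0) g.addEdge(String(i - 1), String(i));
  });
  return (input) => g.run('0', input);
}
```

The payoff of this layering is that higher-level surfaces (sequential pipelines, planner-driven missions) share one execution engine, so features like tracing or checkpointing only need to exist at the graph level.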
 
 ---
 
 ## Ecosystem
 
 | Package | Description |
-
-| [`@framers/agentos`](https://www.npmjs.com/package/@framers/agentos) | Core runtime
+|---|---|
+| [`@framers/agentos`](https://www.npmjs.com/package/@framers/agentos) | Core runtime |
 | [`@framers/agentos-extensions`](https://www.npmjs.com/package/@framers/agentos-extensions) | 100+ extensions and templates |
-| [`@framers/agentos-extensions-registry`](https://www.npmjs.com/package/@framers/agentos-extensions-registry) | Curated manifest builder |
 | [`@framers/agentos-skills`](https://www.npmjs.com/package/@framers/agentos-skills) | 88 curated SKILL.md definitions |
-| [`@framers/agentos-
+| [`@framers/agentos-bench`](https://github.com/framersai/agentos-bench) | Open benchmark harness with bootstrap CIs, judge-FPR probes, per-case run JSONs |
 | [`@framers/sql-storage-adapter`](https://www.npmjs.com/package/@framers/sql-storage-adapter) | SQL persistence (SQLite, Postgres, IndexedDB) |
+| [paracosm](https://www.npmjs.com/package/paracosm) | AI agent swarm simulation engine |
 
 ---
 
-## Documentation
+## Documentation & Community
 
-
-| [Human-in-the-Loop](./docs/safety/HUMAN_IN_THE_LOOP.md) | Approval workflows, escalation |
-| [Emergent Capabilities](./docs/architecture/EMERGENT_CAPABILITIES.md) | Runtime tool forging |
-| [Channels & Platforms](./docs/architecture/PLATFORM_SUPPORT.md) | 37 platform adapters |
-| [Voice Pipeline](./docs/features/VOICE_PIPELINE.md) | TTS, STT, telephony |
-| [Uncensored Content](./docs/features/UNCENSORED_CONTENT.md) | `policyTier`-driven routing for mature text + image generation |
-
-Full documentation: [docs.agentos.sh](https://docs.agentos.sh)
+- **[Benchmarks](https://docs.agentos.sh/benchmarks)**: matched-reader SOTA tables, bootstrap CIs, methodology audit
+- **[Architecture](https://docs.agentos.sh/architecture/system-architecture)**: system design, layer breakdown
+- **[Cognitive Memory](https://docs.agentos.sh/features/cognitive-memory)**: 8 mechanisms with 30+ APA citations
+- **[RAG Configuration](https://docs.agentos.sh/features/rag-memory-configuration)**: vector stores, embeddings, sources
+- **[Guardrails](https://docs.agentos.sh/features/guardrails-architecture)**: 5 tiers, 6 packs
+- **[Voice Pipeline](https://docs.agentos.sh/features/voice-pipeline)**: TTS, STT, telephony
+- **[Blog](https://docs.agentos.sh/blog)**: engineering posts, benchmark publications, transparency audits
+- **[Discord](https://wilds.ai/discord)** · **[GitHub Issues](https://github.com/framersai/agentos/issues)** · **[Wilds.ai](https://wilds.ai)** (AI game worlds powered by AgentOS)
 
 ---
 
@@ -518,17 +237,7 @@ git clone https://github.com/framersai/agentos.git && cd agentos
 pnpm install && pnpm build && pnpm test
 ```
 
-
----
-
-## Community
-
-- **Discord:** [wilds.ai/discord](https://wilds.ai/discord)
-- **GitHub Issues:** [github.com/framersai/agentos/issues](https://github.com/framersai/agentos/issues)
-- **Blog:** [docs.agentos.sh/blog](https://docs.agentos.sh/blog)
-- **Paracosm:** [paracosm.agentos.sh](https://paracosm.agentos.sh) — AI agent swarm simulation engine built on AgentOS
-- **Wilds.ai:** [wilds.ai](https://wilds.ai) — AI game worlds powered by AgentOS
+[Contributing Guide](https://github.com/framersai/agentos/blob/master/CONTRIBUTING.md) · We use [Conventional Commits](https://www.conventionalcommits.org/).
 
 ---
 