open-agents-ai 0.187.194 → 0.187.196

Files changed (3)
  1. package/README.md +168 -0
  2. package/dist/index.js +1135 -54
  3. package/package.json +2 -2
package/README.md CHANGED
@@ -1053,6 +1053,174 @@ MODEL=qwen3.5:32b OA_TIMEOUT=600 bash scripts/oa-vs-ollama-chat-compare.sh # hi
1053
1053
 
1054
1054
  **Bottom line**: for any question that needs fresh data, system access, or filesystem visibility — bare Ollama is wrong or refuses; OA with the full agent is correct with citations. That's the differentiator captured live in the harness output.
1055
1055
 
1056
+ #### One-Off Completions — `/api/generate` + `/v1/generate`
1057
+
1058
+ Drop-in for **Ollama `/api/generate`**. Same body shape, same response shape, same port-swap semantics as `/api/chat`. No session history — pure one-shot completion. The full agent runs under the hood by default (`tools: true`), returning the final `assistant_text` wrapped in Ollama's shape.
1059
+
1060
+ ```bash
1061
+ # Ollama (bare LLM)
1062
+ curl -s http://127.0.0.1:11434/api/generate \
1063
+ -d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
1064
+
1065
+ # OA with full agent — only port changed
1066
+ curl -s http://127.0.0.1:11435/api/generate \
1067
+ -d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
1068
+
1069
+ # OA direct backend bypass (fast path, no agent)
1070
+ curl -s http://127.0.0.1:11435/api/generate \
1071
+ -d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false,"tools":false}'
1072
+ ```
1073
+
1074
+ **Response shape** — Ollama-native so any client parsing `done`, `response`, `total_duration` keeps working:
1075
+
1076
+ ```json
1077
+ {
1078
+ "model": "qwen3.5:9b",
1079
+ "created_at": "2026-04-07T22:01:08Z",
1080
+ "response": "1. PostgreSQL\n2. MongoDB\n3. Redis",
1081
+ "done": true,
1082
+ "done_reason": "stop",
1083
+ "total_duration": 18000000000,
1084
+ "eval_count": 45,
1085
+ "_oa": {
1086
+ "tool_calls": 0,
1087
+ "finish_reason": "stop",
1088
+ "duration_ms": 17991,
1089
+ "request_id": "..."
1090
+ }
1091
+ }
1092
+ ```
1093
+
1094
+ The `_oa` extension block carries the OA-specific metadata (tool call count, agent duration, request ID for correlation with `/v1/audit`). Clients that follow the usual practice of ignoring unknown JSON fields need no changes.
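The response shape above can be consumed the same way whether it comes from bare Ollama or from OA. The following is a minimal sketch, not the project's own client code: `summarize_generate` is a hypothetical helper name, and it assumes only the fields shown in the example response, with `total_duration` in nanoseconds as in Ollama's API.

```python
from typing import Any


def summarize_generate(resp: dict[str, Any]) -> dict[str, Any]:
    """Extract the portable Ollama fields plus the optional `_oa` block.

    Hypothetical helper for illustration; field names are taken from the
    example response, nothing else is assumed about the daemon.
    """
    oa = resp.get("_oa") or {}  # absent when the reply comes from bare Ollama
    return {
        "text": resp.get("response", ""),
        "done": bool(resp.get("done", False)),
        # total_duration is reported in nanoseconds; convert to seconds.
        "seconds": resp.get("total_duration", 0) / 1e9,
        "tool_calls": oa.get("tool_calls", 0),
        "request_id": oa.get("request_id"),
    }


if __name__ == "__main__":
    sample = {
        "model": "qwen3.5:9b",
        "response": "1. PostgreSQL\n2. MongoDB\n3. Redis",
        "done": True,
        "total_duration": 18000000000,
        "_oa": {"tool_calls": 0, "duration_ms": 17991, "request_id": "..."},
    }
    print(summarize_generate(sample))
```

Because the `_oa` lookup falls back to an empty dict, the same function works unchanged against port 11434 and port 11435.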
1095
+
1096
+ **Streaming** — set `"stream": true` and receive Ollama-style NDJSON chunks:
1097
+
1098
+ ```
1099
+ {"model":"qwen3.5:9b","created_at":"...","response":"","done":false,"_oa":{"type":"tool_call","tool":"web_search","args":{...}}}
1100
+ {"model":"qwen3.5:9b","created_at":"...","response":"PostgreSQL...","done":false}
1101
+ {"model":"qwen3.5:9b","created_at":"...","response":"...","done":true,"done_reason":"stop","total_duration":18000000000,"eval_count":45}
1102
+ ```
1103
+
1104
+ Tool-call events appear as NDJSON frames with `_oa.type: "tool_call"` interleaved between content frames.
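A client can split the interleaved stream into answer text and tool-call events with a few lines of parsing. This sketch assumes, as in Ollama's own streaming, that each frame's `response` holds an incremental piece of text; `collect_stream` is a hypothetical name, and the `args` value in the sample frames is made up for illustration.

```python
import json
from typing import Any, Iterable


def collect_stream(lines: Iterable[str]) -> tuple[str, list[dict[str, Any]]]:
    """Concatenate `response` pieces and gather `_oa` tool-call events."""
    text_parts: list[str] = []
    tool_events: list[dict[str, Any]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        frame = json.loads(line)
        oa = frame.get("_oa") or {}
        if oa.get("type") == "tool_call":
            tool_events.append(oa)
        text_parts.append(frame.get("response", ""))
        if frame.get("done"):
            break  # the final frame carries done: true
    return "".join(text_parts), tool_events


if __name__ == "__main__":
    # Illustrative frames only; the args payload is hypothetical.
    sample = [
        '{"model":"qwen3.5:9b","response":"","done":false,'
        '"_oa":{"type":"tool_call","tool":"web_search","args":{"query":"example"}}}',
        '{"model":"qwen3.5:9b","response":"PostgreSQL, ","done":false}',
        '{"model":"qwen3.5:9b","response":"MySQL","done":true,"done_reason":"stop"}',
    ]
    text, events = collect_stream(sample)
    print(text)
    print([e["tool"] for e in events])
```

Any iterable of lines works as input, so the same function can be fed from a file, a test fixture, or the line iterator of an HTTP response.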
1105
+
1106
+ #### Embeddings — `/v1/embeddings` + `/api/embed`
1107
+
1108
+ Drop-in for Ollama `/api/embed` (returns Ollama's `{embeddings: [[...]]}` shape) **and** OpenAI `/v1/embeddings` (returns OpenAI's `{object:"list", data: [{object:"embedding", embedding:[...], index: 0}]}` shape). The endpoint path determines the response shape; both wire to the same backend embedding model.
1109
+
1110
+ ```bash
1111
+ # Ollama shape
1112
+ curl -s http://127.0.0.1:11435/api/embed \
1113
+ -d '{"model":"nomic-embed-text","input":"hello world"}'
1114
+
1115
+ # OpenAI shape
1116
+ curl -s http://127.0.0.1:11435/v1/embeddings \
1117
+ -d '{"model":"nomic-embed-text","input":"hello world"}'
1118
+ ```
1119
+
1120
+ Both paths accept `{input: "..."}` or `{prompt: "..."}` in the body, and both support `input: ["a","b","c"]` for batched embeddings.
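Since the two paths differ only in response shape, client code can normalize both into a plain list of vectors. A minimal sketch, assuming only the two shapes quoted above; `extract_vectors` is a hypothetical helper, not part of the package.

```python
from typing import Any


def extract_vectors(body: dict[str, Any]) -> list[list[float]]:
    """Return embedding vectors from either response shape."""
    if "embeddings" in body:
        # Ollama /api/embed shape: {"embeddings": [[...], ...]}
        return body["embeddings"]
    if body.get("object") == "list":
        # OpenAI /v1/embeddings shape: order the items by their index field.
        items = sorted(body["data"], key=lambda item: item["index"])
        return [item["embedding"] for item in items]
    raise ValueError("unrecognized embeddings response shape")


if __name__ == "__main__":
    ollama_style = {"embeddings": [[0.1, 0.2], [0.3, 0.4]]}
    openai_style = {
        "object": "list",
        "data": [
            {"object": "embedding", "embedding": [0.3, 0.4], "index": 1},
            {"object": "embedding", "embedding": [0.1, 0.2], "index": 0},
        ],
    }
    print(extract_vectors(ollama_style) == extract_vectors(openai_style))
```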
1121
+
1122
+ #### Memory Recall + Knowledge Graph — `/v1/memory/*`
1123
+
1124
+ Backed by `@open-agents/memory` (SQLite + better-sqlite3). The endpoints expose the daemon's persistent memory stores that the agent uses under the hood.
1125
+
1126
+ ```bash
1127
+ # Backend summary
1128
+ curl -s http://127.0.0.1:11435/v1/memory
1129
+
1130
+ # Write a memory entry (run scope)
1131
+ curl -s -X POST http://127.0.0.1:11435/v1/memory/write \
1132
+ -d '{"kind":"fact","content":"PostgreSQL supports JSONB indexing via GIN.","tags":["db","postgres"]}'
1133
+
1134
+ # Semantic/keyword search (returns ranked episodes)
1135
+ curl -s -X POST http://127.0.0.1:11435/v1/memory/search \
1136
+ -d '{"query":"postgres indexing","limit":5}'
1137
+
1138
+ # Paginated episode walk (knowledge graph)
1139
+ curl -s 'http://127.0.0.1:11435/v1/memory/episodes?limit=10'
1140
+
1141
+ # Paginated failure store (anti-patterns)
1142
+ curl -s 'http://127.0.0.1:11435/v1/memory/failures?limit=10'
1143
+ ```
1144
+
1145
+ **Example search response** — search returns real episode records with timestamps, content, importance scores, and retrieval counts:
1146
+
1147
+ ```json
1148
+ {
1149
+ "query": "sorting algorithm complexity",
1150
+ "results": [
1151
+ {
1152
+ "kind": "episode",
1153
+ "id": "89e5b7f3-e6ee-462f-97fa-e9f1bbec3d73",
1154
+ "timestamp": 1775599267977,
1155
+ "content": "The QuickSort algorithm has average O(n log n), worst case O(n²)",
1156
+ "contentHash": "fd43a4bc9bfbec3b",
1157
+ "importance": 0.5,
1158
+ "decayClass": "daily",
1159
+ "strength": 2,
1160
+ "lastRetrieved": 1775599267983
1161
+ }
1162
+ ]
1163
+ }
1164
+ ```
1165
+
1166
+ The `strength` and `lastRetrieved` fields are updated on every search — the store keeps a read-count that decays over time, matching the spaced-repetition model used by the agent for context selection.
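The numeric fields in the example record are easier to inspect once converted. The sketch below assumes `timestamp` and `lastRetrieved` are Unix epoch milliseconds, which is an inference from the sample values rather than something the README states; `describe_episode` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Any

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to an aware UTC datetime (integer-exact)."""
    return EPOCH + timedelta(milliseconds=ms)


def describe_episode(ep: dict[str, Any]) -> dict[str, Any]:
    """Summarize one search result; assumes epoch-millisecond timestamps."""
    return {
        "id": ep["id"],
        "written_at": ms_to_utc(ep["timestamp"]),
        "last_retrieved_at": ms_to_utc(ep["lastRetrieved"]),
        # Milliseconds between the write and the most recent retrieval.
        "retrieval_lag_ms": ep["lastRetrieved"] - ep["timestamp"],
        "strength": ep["strength"],
    }


if __name__ == "__main__":
    episode = {
        "id": "89e5b7f3-e6ee-462f-97fa-e9f1bbec3d73",
        "timestamp": 1775599267977,
        "strength": 2,
        "lastRetrieved": 1775599267983,
    }
    info = describe_episode(episode)
    print(info["written_at"].isoformat(), info["retrieval_lag_ms"])
```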
1167
+
1168
+ #### Generate/Embed/Memory Test Harness
1169
+
1170
+ A second harness at [`scripts/oa-vs-ollama-generate-embed-memory.sh`](scripts/oa-vs-ollama-generate-embed-memory.sh) covers the four non-chat endpoint families:
1171
+
1172
+ ```bash
1173
+ MODEL=qwen3.5:9b EMBED_MODEL=nomic-embed-text \
1174
+ bash scripts/oa-vs-ollama-generate-embed-memory.sh
1175
+ ```
1176
+
1177
+ **Tested results from `open-agents-ai@0.187.195`** (live, single run, `qwen3.5:9b` + `nomic-embed-text`):
1178
+
1179
+ **Part 1 — `/api/generate` one-off prompts**:
1180
+
1181
+ | Prompt | Ollama | OA direct | OA full agent |
1182
+ |---|---|---|---|
1183
+ | "TCP vs UDP in one sentence" | 26.8s — correct | 12.5s — correct | 43.8s — correct, **1 tool call** |
1184
+ | "One-line Python square function" | 32.1s — correct | 12.2s — correct | ~3min — correct, **2 tool calls** |
1185
+ | "Name 3 open-source databases" | 36.6s — Postgres/MySQL/SQLite | 21.0s — Postgres/MySQL/MongoDB | 18.2s — Postgres/MongoDB/Redis |
1186
+
1187
+ **Part 2 — `/api/embed` cosine similarity sanity** (4 test sentences):
1188
+
1189
+ Both Ollama and OA emitted **identical 768-dim vectors** (same backend). Cosine similarity matrix:
1190
+
1191
+ ```
1192
+ France→Par Paris→Fran Germany→Be Bananas
1193
+ France→Paris 1.000 0.979 1.000 0.449
1194
+ Paris→France 0.979 1.000 0.979 0.477
1195
+ Germany→Berlin 1.000 0.979 1.000 0.449
1196
+ Bananas 0.449 0.477 0.449 1.000
1197
+ ```
1198
+
1199
+ Semantic sanity check: `sim(France→Paris, Paris→France) = 0.979 > sim(France→Paris, Bananas) = 0.449`. ✅ Both endpoints took `0.22–0.25s` for the 4 embeddings.
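A matrix like the one above can be recomputed from the returned vectors with the standard cosine formula, dot(a, b) / (|a| · |b|). This is a generic sketch, not the harness's own code; the toy vectors stand in for the 768-dim embeddings.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> list[list[float]]:
    """Pairwise cosine similarities; symmetric, with a diagonal of ~1.0."""
    return [[cosine(a, b) for b in vectors] for a in vectors]


if __name__ == "__main__":
    # Toy 3-dim vectors for illustration only.
    toy = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
    for row in similarity_matrix(toy):
        print(" ".join(f"{value:.3f}" for value in row))
```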
1200
+
1201
+ **Part 3 — `/v1/memory/write` + `/v1/memory/search`** round-trip:
1202
+
1203
+ ```
1204
+ write: "The QuickSort algorithm has O(n log n) average..." → {"status":"written", "timestamp":"2026-04-07T22:01:07.931Z"}
1205
+ write: "HTTP/2 uses binary framing..." → {"status":"written", ...}
1206
+ write: "The Rust ownership model enforces memory safety..." → {"status":"written", ...}
1207
+
1208
+ search query="sorting algorithm complexity" → 3 episodes returned with content, importance, strength, lastRetrieved
1209
+ search query="network protocol streaming" → 3 episodes returned (strength incremented on re-read)
1210
+ ```
1211
+
1212
+ Every write round-trips correctly. Search returns ranked episodes with updated `strength` and `lastRetrieved` timestamps — the spaced-repetition reinforcement loop is live.
1213
+
1214
+ **Part 4 — Knowledge graph walk** (`/v1/memory/episodes`, `/v1/memory/failures`):
1215
+
1216
+ ```
1217
+ GET /v1/memory → backends: episodes (available), failures (available), temporal_graph (available)
1218
+ GET /v1/memory/episodes → paginated episode list with {data, pagination}
1219
+ GET /v1/memory/failures → paginated failure list with {data, pagination}
1220
+ ```
1221
+
1222
+ These lists are empty on a fresh daemon and populate as the agent runs tasks. The memory endpoints were fixed in v0.187.195: earlier versions silently fell back to "memory stores unavailable" because the dynamic `await import("@open-agents/memory")` didn't resolve in the esbuild-bundled daemon. The daemon now uses a static top-level import.
1223
+
1056
1224
  #### AIWG Cascade — `/v1/aiwg/*`
1057
1225
 
1058
1226
  Exposes the entire AIWG ecosystem (5 frameworks, 19 addons, 136+ skills, ~42 MB / ~2M tokens of markdown) through a **4-tier cascade loader** that auto-sizes responses to the detected model tier and **never overflows small-model context**.