omnius 1.0.0 → 1.0.3
- package/README.md +169 -167
- package/dist/index.js +2216 -2198
- package/dist/launcher.cjs +1 -1
- package/dist/postinstall-daemon.cjs +78 -78
- package/dist/preinstall.cjs +8 -8
- package/dist/scripts/ocr-advanced.py +2 -2
- package/dist/scripts/start-moondream.py +1 -1
- package/dist/scripts/tor/tor_setup.sh +1 -1
- package/npm-shrinkwrap.json +3 -7
- package/package.json +3 -7
- package/prompts/agentic/system-large.md +10 -10
- package/prompts/agentic/system-medium.md +2 -2
- package/prompts/agentic/system-small.md +2 -2
- package/prompts/tui/dream-consolidate.md +1 -1
- package/prompts/tui/dream-lucid-eval.md +1 -1
- package/prompts/tui/dream-lucid-implement.md +1 -1
- package/prompts/tui/dream-stages.md +1 -1
package/README.md
CHANGED
|
@@ -1,13 +1,15 @@
|
|
|
1
1
|
<a name="top"></a>
|
|
2
|
-
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
2
|
+
```text
|
|
3
|
+
░▒▓██████▓▒░░▒▓██████████████▓▒░░▒▓███████▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓███████▓▒░
|
|
4
|
+
░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
|
|
5
|
+
░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
|
|
6
|
+
░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓██████▓▒░
|
|
7
|
+
░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░ ░▒▓█▓▒░
|
|
8
|
+
░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░ ░▒▓█▓▒░
|
|
9
|
+
░▒▓██████▓▒░░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓██████▓▒░░▒▓███████▓▒░
|
|
10
|
+
|
|
11
|
+
|
|
12
|
+
```
|
|
11
13
|
|
|
12
14
|
<p align="center">
|
|
13
15
|
<strong>AI coding agent powered entirely by open-weight models.</strong><br>
|
|
@@ -26,7 +28,7 @@
 ---
 
 ```bash
-npm i -g omnius &&
+npm i -g omnius && omnius
 ```
 
 An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and configures the optimal model with expanded context window automatically.
@@ -57,7 +59,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
 - [Parallelism & Concurrency](#parallelism--concurrency)
 - [Endpoint Reference](#endpoint-reference)
 - [Stateful Chat — `/v1/chat` + `/api/chat` (OpenAI drop-in with full agent under the hood)](#stateful-chat--v1chat--apichat-openai-drop-in-with-full-agent-under-the-hood)
-- [Live Comparison: Ollama vs
+- [Live Comparison: Ollama vs Omnius Full Agent](#live-comparison-ollama-vs-omnius-full-agent)
 - [One-Off Completions — `/api/generate` + `/v1/generate`](#one-off-completions--apigenerate--v1generate)
 - [Embeddings — `/v1/embeddings` + `/api/embed`](#embeddings--v1embeddings--apiembed)
 - [Memory Recall + Knowledge Graph — `/v1/memory/*`](#memory-recall--knowledge-graph--v1memory)
@@ -210,7 +212,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
 - [Configuration](#configuration)
 - [Network Access & Binding](#network-access--binding)
 - [Project Context](#project-context)
-- [`.
+- [`.omnius/` Project Directory](#omnius-project-directory)
 - [Model Support](#model-support)
 - [Supported Inference Providers](#supported-inference-providers)
 - [Connecting to a Provider](#connecting-to-a-provider)
@@ -240,7 +242,7 @@ An LLM is a high-bandwidth associative generative core — closer to a cortex-li
 |---|---|---|
 | Associative core | Cortex | LLM weights (any size) |
 | Current workspace | Global workspace / attention | `assembleContext()` — structured context assembly |
-| Episodic memory | Hippocampus | `.
+| Episodic memory | Hippocampus | `.omnius/memory/` — write, search, retrieve across sessions |
 | Cognitive map | Hippocampal spatial maps | `semantic-map.ts` + `repo-map.ts` (PageRank) |
 | Action gating | Basal ganglia | Tool selection policy (task-aware filtering) |
 | Temporal hierarchy | Prefrontal executive | Task decomposition, sub-agent delegation |
@@ -258,7 +260,7 @@ Don't chase larger models. Build the organism around whatever model you have.
 <div align="right"><a href="#top">back to top</a></div>
 
 ```
-You:
+You: omnius "fix the null check in auth.ts"
 
 Agent: [Turn 1] file_read(src/auth.ts)
 [Turn 2] grep_search(pattern="null", path="src/auth.ts")
@@ -284,8 +286,8 @@ The agent uses tools autonomously in a loop — reading errors, fixing code, and
 - **Sub-agent delegation** — spawn independent agents for parallel workstreams
 - **OpenCode delegation** — offload coding tasks to opencode (sst/opencode) as an autonomous sub-agent with auto-install, progress monitoring, and result evaluation
 - **Long-horizon cron agents** — schedule recurring autonomous agent tasks with goals, completion criteria, execution history, and automatic evaluation (daily code reviews, weekly dep updates, continuous monitoring)
-- **Nexus P2P networking** — decentralized agent-to-agent communication via [
-- **x402 micropayments** — native x402 payment rails via
+- **Nexus P2P networking** — decentralized agent-to-agent communication via [open-agents-nexus](https://www.npmjs.com/package/open-agents-nexus). Join rooms, discover peers, share resources, and communicate across the agent mesh with encrypted P2P transport
+- **x402 micropayments** — native x402 payment rails via open-agents-nexus@1.5.6. Agents create secp256k1/EVM wallets (AES-256-GCM encrypted, keys never exposed to LLM), register inference with USDC pricing on Base, auto-handle `payment_required`/`payment_proof` negotiation, track earnings/spending in ledger.jsonl, enforce budget policies, and sign gasless EIP-3009 transfers
 - **Inference capability proof** — benchmark local models with anti-spoofing SHA-256 hashed proofs, generate capability scorecards for peer verification
 - **Littleman Observer** — parallel meta-analysis system that watches the agent loop in real-time. Detects false failure claims after successful tools, blocks redundant re-execution, catches runaway one-sided output in conversations, and dynamically extends turn limits when active work is detected. Emits `debug_context` and `debug_littleman` events for live observability
 - **Interactive Session Lock** — generic `SESSION_ACTIVE` protocol prevents premature task completion during long-running sessions (phone calls, live chat, monitoring). Any MCP contract can adopt the protocol. Paired with context-engineered system prompts that teach small models to maintain conversation loops
@@ -304,8 +306,8 @@ Omnius includes background workers that compute and associate embeddings across
 
 Config (env vars):
 
-- `
-- `
+- `OMNIUS_COOCUR_WINDOW_MS` — max time delta between visual and transcript episodes to create co‑occurrence links (default: 120000 ms).
+- `OMNIUS_COOCUR_CLIP_SIM_MIN` — minimum CLIP text↔image cosine (0..1, default: 0.22) for linking when both embeddings are available.
 
 The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile, speechbrain, Whisper) into `~/.omnius/venv` and registers providers automatically. No manual installs are required.
 - **Ralph Loop** — iterative task execution that keeps retrying until completion criteria are met
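One way to read the two thresholds in this hunk (`OMNIUS_COOCUR_WINDOW_MS` and `OMNIUS_COOCUR_CLIP_SIM_MIN`) is as a pair of gates on a candidate link between a visual episode and a transcript episode. The Python sketch below is only an interpretation of the README text, not the package's implementation: `should_link`, `cosine`, the timestamp arguments, and the embedding arguments are hypothetical names, and falling back to the time window alone when an embedding is missing is an assumption.

```python
import math
import os
from typing import Optional, Sequence

# Defaults taken from the README text; the env var names are the ones the diff introduces.
WINDOW_MS = int(os.environ.get("OMNIUS_COOCUR_WINDOW_MS", "120000"))
CLIP_SIM_MIN = float(os.environ.get("OMNIUS_COOCUR_CLIP_SIM_MIN", "0.22"))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def should_link(
    visual_ts_ms: int,
    transcript_ts_ms: int,
    image_emb: Optional[Sequence[float]] = None,
    text_emb: Optional[Sequence[float]] = None,
    window_ms: int = WINDOW_MS,
    sim_min: float = CLIP_SIM_MIN,
) -> bool:
    """Hypothetical gate: time window first, then CLIP similarity if both embeddings exist."""
    if abs(visual_ts_ms - transcript_ts_ms) > window_ms:
        return False
    if image_emb is not None and text_emb is not None:
        return cosine(image_emb, text_emb) >= sim_min
    # Assumption: without both embeddings, only the time window applies.
    return True
```

Passing `window_ms` and `sim_min` explicitly keeps the function easy to exercise without touching the environment.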
@@ -314,7 +316,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
 - **Persistent Python REPL** — `repl_exec` tool maintains variables, imports, and functions across calls. Write Python code that processes data iteratively, with `llm_query()` available for recursive LLM sub-calls from within code
 - **Recursive LLM calls** — `llm_query(prompt, context)` invokes the model from inside REPL code, enabling loop-based semantic analysis of large inputs ([RLM paper](https://arxiv.org/abs/2512.24601)). `parallel_llm_query()` runs multiple calls concurrently ([SPRINT](https://arxiv.org/abs/2506.05745))
 - **Memory metabolism** — governed memory lifecycle: classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence), consolidate lessons from trajectories. Inspired by [TIMG](https://arxiv.org/abs/2603.10600) and [MemMA](https://arxiv.org/abs/2603.18718)
-- **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.
+- **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.omnius/identity/`
 - **Reflection & integrity** — immune-system audit: diagnostic ("what's wrong?"), epistemic ("what evidence is missing?"), constitutional ("should this change become part of self?"). Inspired by [LEAFE](https://arxiv.org/abs/2603.16843) and [RewardHackingAgents](https://arxiv.org/abs/2603.11337)
 - **Exploration & culture** — ARCHE strategy-space exploration: generate competing hypotheses, archive successful variants, retrieve past strategies. Inspired by [SGE](https://arxiv.org/abs/2603.02045) and [Darwin Gödel Machine](https://arxiv.org/abs/2505.22954)
 - **Autoresearch Swarm** — 5-agent GPU experiment loop during REM sleep: Researcher, Monitor, Evaluator, Critic, Flow Maintainer autonomously run ML training experiments, keep improvements, discard regressions
@@ -323,7 +325,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
 - **Call Sub-Agent** — each WebSocket caller gets a dedicated AgenticRunner for low-latency voice-to-voice loops, with admin/public access tiers and bidirectional activity sharing with the main agent
 - **Telegram Voice** — `/voice` enabled via Telegram forwards TTS audio as voice messages alongside text responses. Incoming voice messages are auto-transcribed and handled as text
 - **Neural TTS** — hear what the agent is doing via GLaDOS, Overwatch, Kokoro, or LuxTTS voice clone, with literature-grounded narration engine (sNeuron-TST structure rotation, Moshi ring buffer dedup, UDDETTS emotion-driven prosody, SEST metadata, LuxTTS flow-matching voice cloning)
-- **Supertonic expressive tags** — when `/voice supertonic` is active,
+- **Supertonic expressive tags** — when `/voice supertonic` is active, Omnius inserts supported expression tags such as `<sigh>`, `<breath>`, and `<laugh>` into spoken status updates based on failure, recovery, sentence boundaries, success, and playful tone. Other voice backends receive sanitized plain text
 - **Personality Core** — SAC framework-based style control (concise/balanced/verbose/pedagogical) that shapes agent response depth, voice expressiveness, and system prompt behavior
 - **Human expert speed ratio** — real-time `Exp: Nx` gauge comparing agent speed to a leading human expert, calibrated across 47 tool baselines
 - **Cost tracking** — real-time token cost estimation for 15+ cloud providers
@@ -340,14 +342,14 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
 - **Inference capability scoring** — canirun.ai-style hardware assessment at first launch: memory/compute/speed scores, per-model compatibility matrix, recommended model selection
 - **Auto-install everything** — first-run wizard auto-installs Ollama, curl, Python3, python3-venv with platform-aware package managers (apt, dnf, yum, pacman, apk, zypper, brew)
 - **Sponsored inference** — `/sponsor` walks through a 5-step wizard to share your GPU with the world: select endpoints, choose banner animation (8 presets + AI-generated custom), set header message/links, configure transport (cloudflared/libp2p) + rate limits, and go live. Consumers discover sponsors via `/endpoint sponsor`. Secure proxy relay with per-IP rate limiting, daily token budgets, model allowlist, and concurrent request caps. Sponsor's raw API URL is never exposed. See [Sponsored Inference](#sponsored-inference--share-your-gpu-with-the-world) below
-- **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across
-- **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{
+- **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across Omnius restarts
+- **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{OMNIUS_VAR_*}}`), trust tiers (LOCAL/TEE/VERIFIED/PUBLIC), WebSocket peer mesh, and inference routing with automatic secret redaction/injection
 - **Secret vault** — `/secrets` manages API keys and credentials with AES-256-GCM encrypted persistence; secrets are automatically redacted before sending to untrusted inference peers and re-injected on response
 - **Auto-expanding context** — detects RAM/VRAM and creates an optimized model variant on first run
 - **Mid-task steering** — type while the agent works to add context without interrupting
 - **Smart compaction** — 6 context compaction strategies (default, aggressive, decisions, errors, summary, structured) with ARC-inspired active context revision ([arXiv:2601.12030](https://arxiv.org/abs/2601.12030)) that preserves structural file content through compaction, preventing small-model repetitive loops at the root cause. Success signals and content previews survive compaction so models never lose evidence that tools succeeded
 - **Memex experience archive** — large tool outputs archived during compaction with hash-based retrieval
-- **Persistent memory** — learned patterns stored in `.
+- **Persistent memory** — learned patterns stored in `.omnius/memory/` across sessions
 - **Structured procedural memory (SQLite)** — replaces flat JSON with a full relational database: CRUD with soft-delete, revision tracking, embedding storage (float32 BLOB), bidirectional memory linking with confidence scores. Inspired by [ExpeL](https://arxiv.org/abs/2308.10144) (contrastive extraction) and [TIMG](https://arxiv.org/abs/2603.10600) (structured procedural format). 79 unit tests
 - **Semantic memory search** — vector embeddings via [Ollama /api/embed](https://ollama.com) (nomic-embed-text, 768-dim) with cosine similarity search over stored memories. Auto-generates embeddings on memory creation. Auto-links related memories when similarity > 0.6. Graceful fallback to text search when Ollama unavailable
 - **LLM-based memory extraction** — post-task, the LLM itself extracts structured procedural memories (CATEGORY/TRIGGER/LESSON/STEPS) instead of copying raw error text verbatim. Based on [ExpeL](https://arxiv.org/abs/2308.10144) and [AWM](https://arxiv.org/abs/2409.07429) patterns
@@ -355,13 +357,13 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
 - **IPFS sharing surface** — `/ipfs` status page with peer info + identity kernel metrics + memory sentiment. `/ipfs pin <CID>` to pin remote agent content. `/ipfs publish` to share identity kernel. `/ipfs share tool/skill` to publish agent-created tools with secret stripping. `/ipfs import <CID>` to retrieve shared content
 - **Fortemi-React bridge** — `/fortemi start/status/stop` connects to [fortemi-react](https://github.com/robit-man/fortemi-react) (browser-first PGlite+pgvector knowledge system) via JWT auth. Proxy tools: `fortemi_capture`, `fortemi_search`, `fortemi_list`, `fortemi_get` auto-register when bridge is connected
 - **Content ingestion** — `/ingest <file>` imports audio (transcribe via Whisper), PDF (pdftotext), or text files into structured memory with 800-char/100-overlap chunking (matches fortemi pattern)
-- **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.
-- **Node visualization** — [
+- **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.omnius/images/`
+- **Node visualization** — [omnius.nexus](https://github.com/robit-man/omnius.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents
 - **TTS sanitizer** — strips markdown syntax (`##`, `**`, `` ` ``), emoji (prevents "white heavy checkmark"), box-drawing chars, and ANSI codes before feeding to ALL TTS engines
 - **LuxTTS gapless playback** — look-ahead pre-synthesis pipeline: next chunk synthesizes while current plays, eliminating inter-sentence gaps. Jetson ARM support with NVIDIA's prebuilt PyTorch wheel
 - **Unified color scheme** — `ui.primary` (252), `ui.error` (198/magenta), `ui.warn` (214/orange), `ui.accent` (178/yellow) applied consistently across all TUI surfaces
 - **Clickable header buttons** — `help`, `voice`, `cohere`, `model` buttons on banner row 3 with hover/click visual states. OSC 8 hyperlinks for pointer cursor. Mouse click fires the slash command directly
-- **Dynamic terminal title** — updates with current task + version: `"fix auth bug ·
+- **Dynamic terminal title** — updates with current task + version: `"fix auth bug · Omnius v0.141.0"`
 - **Session context persistence** — auto-saves context on task completion, manual `/context save|restore` across sessions
 - **Self-learning** — auto-fetches docs from the web when encountering unfamiliar APIs
 - **Seamless `/update`** — in-place update and reload with automatic context save/restore
@@ -410,20 +412,20 @@ Run Omnius as a headless service for CI/CD pipelines, automation, and enterprise
 ### Non-Interactive Mode
 
 ```bash
-
-
-
+omnius "fix all lint errors" --non-interactive # Run task, exit when done
+omnius "generate API docs" --json # Structured JSON output (no ANSI)
+omnius "run security audit" --background # Detached background job
 ```
 
 ### Background Jobs
 
 ```bash
-
-
-
+omnius "migrate database" --background # Returns job ID immediately
+omnius status job-abc123 # Check job progress
+omnius jobs # List all running/completed jobs
 ```
 
-Jobs run as detached processes — survive terminal disconnection. Output saved to `.
+Jobs run as detached processes — survive terminal disconnection. Output saved to `.omnius/jobs/{id}.json`.
 
 ### JSON Output Mode
 
@@ -439,15 +441,15 @@ Pipe to `jq`, ingest into monitoring systems, or feed to other agents.
 ### Process Management
 
 ```bash
-/destroy processes # Kill orphaned
-/destroy processes --global # Kill ALL orphaned
+/destroy processes # Kill orphaned Omnius processes (local project)
+/destroy processes --global # Kill ALL orphaned Omnius processes system-wide
 ```
 
-Shows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale
+Shows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale Omnius instances.
 
 ### REST API Service (Port 11435)
 
-Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — installed automatically by `npm i -g omnius` (systemd user unit on Linux, launchd on macOS, scheduled task on Windows). It exposes the **full
+Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — installed automatically by `npm i -g omnius` (systemd user unit on Linux, launchd on macOS, scheduled task on Windows). It exposes the **full Omnius capability surface** through standards most organizations expect:
 
 - **OpenAI / Ollama drop-in** — `/v1/chat`, `/v1/chat/completions`, `/v1/embeddings`, `/v1/models` are wire-compatible with both ecosystems
 - **API discovery** — `GET /help` returns a full human and agent-readable guide with quickstart curl commands, all 70+ endpoints by category, MCP integration instructions, and auth documentation
@@ -462,19 +464,19 @@ Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — inst
 - **`X-Request-ID`** echoed or generated for correlation
 - **SSE event bus** at `/v1/events` with optional `?type=foo.*` filter, tagged with `aims:control` for auditors
 - **Bearer auth + scoped keys** (`read` / `run` / `admin`) and OIDC JWT support
-- **Per-key concurrency limits** (`maxJobs` in `
+- **Per-key concurrency limits** (`maxJobs` in `OMNIUS_API_KEYS` is now actually enforced)
 - **Atomic job record writes** with 64-bit job IDs (no race conditions)
 - **OpenAPI 3.0** at `/openapi.json` and Swagger UI at `/docs`
 - **Web chat UI** at `/`
 
-> **Daemon auto-start.** After `npm i -g omnius`, the daemon comes online automatically. Verify with `systemctl --user status omnius-daemon` (Linux) or `launchctl print gui/$(id -u)/ai.omnius.daemon` (macOS). Opt out with `
+> **Daemon auto-start.** After `npm i -g omnius`, the daemon comes online automatically. Verify with `systemctl --user status omnius-daemon` (Linux) or `launchctl print gui/$(id -u)/ai.omnius.daemon` (macOS). Opt out with `OMNIUS_SKIP_DAEMON_INSTALL=1 npm i -g omnius`.
 
 ```bash
 # Manually run the server (the daemon already does this for you)
-
-
-
-
+omnius serve # Start on default port 11435
+omnius serve --port 9999 # Custom port
+OMNIUS_API_KEY=mysecret omnius serve # Single admin key
+OMNIUS_API_KEYS="key1:admin:alice:30:50000:5,key2:run:ci:60::3,key3:read:grafana" omnius serve # Scoped multi-key with rpm:tpd:maxjobs
 ```
 
 > **Every example below is verified against `omnius@0.187.189` on a live daemon.** Examples from earlier versions are deprecated.
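To make the colon-separated layout of the `OMNIUS_API_KEYS` value in this hunk concrete, here is a minimal Python sketch that splits such a string into records. It is not taken from the package: `ApiKey` and `parse_api_keys` are hypothetical names, the reading of `rpm` and `tpd` as per-minute and per-day limits is a guess the source does not spell out, and treating empty or missing trailing fields as "not set" is an assumption inferred from the examples `key2:run:ci:60::3` and `key3:read:grafana`.

```python
from dataclasses import dataclass
from typing import List, Optional

VALID_SCOPES = ("read", "run", "admin")  # scopes named in the README


@dataclass
class ApiKey:
    key: str
    scope: str
    user: str
    rpm: Optional[int] = None       # assumed: requests per minute; None = not set
    tpd: Optional[int] = None       # assumed: tokens per day; None = not set
    max_jobs: Optional[int] = None  # concurrent agentic tasks; None = not set


def _opt_int(field: str) -> Optional[int]:
    """Convert a possibly empty field to int, mapping empty to None."""
    field = field.strip()
    return int(field) if field else None


def parse_api_keys(value: str) -> List[ApiKey]:
    """Parse 'key:scope:user[:rpm[:tpd[:maxJobs]]]' entries separated by commas."""
    keys: List[ApiKey] = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = entry.split(":")
        if not 3 <= len(parts) <= 6:
            raise ValueError(f"expected 3 to 6 fields, got {len(parts)}: {entry!r}")
        parts += [""] * (6 - len(parts))  # pad missing optional fields
        key, scope, user, rpm, tpd, max_jobs = (p.strip() for p in parts)
        if scope not in VALID_SCOPES:
            raise ValueError(f"unknown scope {scope!r} in {entry!r}")
        keys.append(ApiKey(key, scope, user, _opt_int(rpm), _opt_int(tpd), _opt_int(max_jobs)))
    return keys
```

For the example value in the hunk, this yields three records, with `key2` having no `tpd` and `key3` having no limits at all.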
@@ -484,7 +486,7 @@ OA_API_KEYS="key1:admin:alice:30:50000:5,key2:run:ci:60::3,key3:read:grafana" oa
 Control who can reach the daemon and where it binds:
 
 - TUI commands: `/access loopback|lan|any`, `/host <host[:port]>`, `/network config` (interactive), `--local` to save per‑project.
-- Environment: `
+- Environment: `OMNIUS_ACCESS=loopback|lan|any`, `OMNIUS_HOST=host[:port]`.
 - See Configuration → [Network Access & Binding](#network-access--binding) for full details and security guidance.
 
 #### Working Directory
@@ -532,12 +534,12 @@ curl http://localhost:11435/version
 curl http://localhost:11435/metrics
 ```
 ```
-# HELP
-# TYPE
-
-
-
-
+# HELP omnius_requests_total Total HTTP requests
+# TYPE omnius_requests_total counter
+omnius_requests_total{method="POST",path="/v1/chat/completions",status="200"} 47
+omnius_tokens_in_total 12450
+omnius_tokens_out_total 8230
+omnius_errors_total 0
 ```
 
 #### OpenAI-Compatible Inference
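The `/metrics` sample in this hunk is in the Prometheus text exposition format. A minimal Python sketch for reading such output follows. `parse_metrics` is a hypothetical helper, not part of Omnius, and it handles only the simple `name{labels} value` lines shown here (no escaped quotes inside label values, no timestamps).

```python
import re
from typing import Dict, Tuple

Labels = Tuple[Tuple[str, str], ...]

# name, optional {label block}, whitespace, value
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text: str) -> Dict[Tuple[str, Labels], float]:
    """Map (metric name, sorted label pairs) to the sample value."""
    samples: Dict[Tuple[str, Labels], float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels: Labels = tuple(sorted(_LABEL.findall(raw_labels or "")))
        samples[(name, labels)] = float(value)
    return samples
```

Keying on the sorted label pairs keeps samples of the same metric with different labels apart; a production scraper would normally use an existing Prometheus client library instead.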
@@ -590,7 +592,7 @@ data: [DONE]
 
 #### Agentic Task Execution
 
-The unique
+The unique Omnius capability — submit a coding task and get an autonomous agent loop.
 
 ```bash
 # Run task in your current directory
@@ -728,7 +730,7 @@ curl -X POST http://localhost:11435/v1/commands/destroy \
 
 ```bash
 # Multi-key setup: read (monitoring), run (CI), admin (ops)
-
+OMNIUS_API_KEYS="grafana-key:read:grafana,ci-key:run:github-actions,ops-key:admin:ops-team" omnius serve
 ```
 
 | Scope | Can do | Cannot do |
@@ -828,21 +830,21 @@ curl -X DELETE -H "Authorization: Bearer $ADMIN_KEY" \
 
 The daemon is built for **unbounded concurrent requests** with per-key enforcement. Every agentic task (`/v1/run`, `/v1/chat`, `/api/chat`, `/api/generate`) spawns its own subprocess, so multiple jobs run in true parallel — same model or different models, same or different profiles, same or different sandbox modes.
 
-**Per-key concurrency limits** are enforced from the `
+**Per-key concurrency limits** are enforced from the `OMNIUS_API_KEYS` env var:
 
 ```bash
 # key:scope:user:rpm:tpd:maxJobs
-
+OMNIUS_API_KEYS="ci-key:run:github-actions:60:100000:5, \
 ops-key:admin:ops:120:500000:20, \
 read-key:read:grafana:600::"
-
+omnius serve
 ```
 
 The 6th field is `maxJobs` — the maximum number of **concurrent** (in-flight) agentic tasks for that key. When exceeded, the daemon returns **RFC 7807 `429 Too Many Requests`**:
 
 ```json
 {
-"type": "https://
+"type": "https://omnius.nexus/problems/rate-limited",
 "title": "Concurrent job limit exceeded",
 "status": 429,
 "detail": "Concurrent job limit exceeded for github-actions: 5/5",
@@ -869,7 +871,7 @@ done
 wait
 ```
 
-Each subprocess inherits a **clean env** — `
+Each subprocess inherits a **clean env** — `OMNIUS_DAEMON` and `OMNIUS_PORT` are explicitly stripped so the child doesn't re-enter daemon mode. Fixed in v0.187.189 (root cause of the earlier "Task incomplete (0 turns, 0 tool calls)" bug).
 
 **Observing parallelism live** — subscribe to the event bus to watch every job lifecycle event:
 
@@ -930,7 +932,7 @@ Also cleans up the Docker container if the job was spawned with `"sandbox":"cont
 | Method | Path | Auth | Description |
 |--------|------|------|-------------|
 | POST | `/v1/chat` | run | Full agent under the hood, OpenAI chat.completion shape. Default = tools=true (subprocess agent). Set `tools:false` for direct backend bypass. Supports `timeout_s` body field (default 180s). Non-streaming path has a safety SIGTERM→SIGKILL after `timeout_s + 30s`. |
-| POST | `/api/chat` | run | **Ollama-compatible alias** — same handler as `/v1/chat`. Accepts both
+| POST | `/api/chat` | run | **Ollama-compatible alias** — same handler as `/v1/chat`. Accepts both Omnius-shape (`{message, model}`) and Ollama-shape (`{model, messages: [...]}`) bodies. Returns OpenAI `chat.completion` shape on success and failure (failure uses `finish_reason:"error"`). |
 | POST | `/v1/generate` | run | **One-off completion** — same agent stack as `/v1/chat` but no session history. Returns Ollama-shape `{model, response, done, total_duration}`. |
 | POST | `/api/generate` | run | **Ollama-compatible alias** of `/v1/generate`. Drop-in for Ollama `/api/generate`. |
 | GET | `/v1/chat/sessions` | read | List active chat sessions |
@@ -997,7 +999,7 @@ Also cleans up the Docker container if the job was spawned with `"sandbox":"cont
 **Sessions + context**
 | Method | Path | Auth | Description |
 |--------|------|------|-------------|
-| GET | `/v1/sessions` | read |
+| GET | `/v1/sessions` | read | Omnius task session archive |
 | GET | `/v1/sessions/:id` | read | Session history |
 | GET | `/v1/context` | read | Show current session context |
 | POST | `/v1/context/save` | run | Save a context entry |
@@ -1064,15 +1066,15 @@ The chat endpoint is mounted at **two paths on port 11435**:
 
 | Path | Purpose |
 |------|---------|
-| `POST /v1/chat` |
+| `POST /v1/chat` | Omnius-native path |
 | `POST /api/chat` | **Ollama-compatible alias** — same handler, so clients pointing at Ollama can be flipped over by changing only the port (`11434` → `11435`) |
 
-It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/chat`**. The endpoint runs the full
+It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/chat`**. The endpoint runs the full Omnius agent (tools, multi-agent, memory, skills) under the hood and returns an **OpenAI `chat.completion`-shaped response** so any client SDK can use it without modification.
 
 **Both body shapes are accepted** on either path:
 
 ```jsonc
-//
+// Omnius-native
 {"message": "hello", "model": "qwen3.5:9b", "stream": false}
 
 // Ollama-native (the `messages` array; the last user message is extracted)
@@ -1080,18 +1082,18 @@ It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/
|
|
|
1080
1082
|
```
|
|
1081
1083
|
|
|
1082
1084
|
> **Two execution modes:**
|
|
1083
|
-
> - **Default (`tools` unset or `tools: true`)** — full agent: spawns the
|
|
1085
|
+
> - **Default (`tools` unset or `tools: true`)** — full agent: spawns the Omnius subprocess with the entire 82-tool set, runs the agent loop, returns the final answer with `tool_calls` metadata.
|
|
1084
1086
|
> - **Direct (`tools: false`)** — fast path: bypasses the agent and forwards straight to the configured backend (Ollama/vLLM) using the session history. Useful for plain chat without tools.
|
|
1085
1087
|
|
|
1086
1088
|
**Safety timeout** — every non-streaming request is bounded by `timeout_s` (default **180s**). If the agent subprocess doesn't close in `timeout_s + 30s`, the daemon SIGTERMs (then SIGKILLs) it and returns an OpenAI-shaped error with `finish_reason:"error"` and a clear explanation. No more hung requests.
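
The SIGTERM-then-SIGKILL escalation described above can be sketched as follows. This is an illustration under stated assumptions, not the daemon's code: `run_bounded`, the `kill_wait_s` parameter, and the simplified result dictionary are hypothetical; only the `timeout_s` default of 180s, the 30s grace window, and `finish_reason: "error"` come from the text.

```python
import subprocess
import sys


def run_bounded(cmd, timeout_s=180.0, grace_s=30.0, kill_wait_s=5.0):
    """Run cmd; past the deadline send SIGTERM, then SIGKILL if it still runs."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
    )
    deadline = timeout_s + grace_s
    try:
        out, _ = proc.communicate(timeout=deadline)
        return {"finish_reason": "stop", "content": out}
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX
        try:
            proc.communicate(timeout=kill_wait_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX
            proc.communicate()  # reap the process and drain its pipe
        # Simplified stand-in for the OpenAI-shaped error body.
        return {
            "finish_reason": "error",
            "error": f"agent exceeded {deadline:.0f}s and was terminated",
        }
```

The second `communicate()` after `kill()` matters: without it the child is never reaped and the stdout pipe stays open.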
|
|
1087
1089
|
|
|
1088
|
-
**Flip Ollama →
|
|
1090
|
+
**Flip Ollama → Omnius by port alone** — this is verified to work via `scripts/omnius-vs-ollama-chat-compare.sh` (see [Live Comparison](#live-comparison-ollama-vs-omnius-full-agent) below):
|
|
1089
1091
|
|
|
1090
1092
|
```bash
|
|
1091
1093
|
# Before (Ollama)
|
|
1092
1094
|
curl -s http://127.0.0.1:11434/api/chat -d '{"model":"qwen3.5:9b","messages":[{"role":"user","content":"hi"}],"stream":false}'
|
|
1093
1095
|
|
|
1094
|
-
# After (
|
|
1096
|
+
# After (Omnius with full agent) — only port changed
|
|
1095
1097
|
curl -s http://127.0.0.1:11435/api/chat -d '{"model":"qwen3.5:9b","messages":[{"role":"user","content":"hi"}],"stream":false}'
|
|
1096
1098
|
```
|
|
1097
1099
|
|
|
@@ -1195,32 +1197,32 @@ curl -s http://localhost:11435/v1/chat \
|
|
|
1195
1197
|
|
|
1196
1198
|
Sessions expire after 30 minutes of inactivity. List active sessions: `GET /v1/chat/sessions`.
|
|
1197
1199
|
|
|
1198
|
-
#### Live Comparison: Ollama vs
|
|
1200
|
+
#### Live Comparison: Ollama vs Omnius Full Agent
|
|
1199
1201
|
|
|
1200
|
-
The repo ships a reproducible side-by-side harness at [`scripts/
|
|
1202
|
+
The repo ships a reproducible side-by-side harness at [`scripts/omnius-vs-ollama-chat-compare.sh`](scripts/omnius-vs-ollama-chat-compare.sh). It runs **5 tool-call-required prompts** × **4 phases** (Ollama non-stream, Omnius non-stream, Ollama stream, Omnius stream) = **20 runs per invocation** with the same model and the same `/api/chat` path on both ports.
|
|
1201
1203
|
|
|
1202
1204
|
```bash
|
|
1203
|
-
MODEL=qwen3.5:9b bash scripts/
|
|
1205
|
+
MODEL=qwen3.5:9b bash scripts/omnius-vs-ollama-chat-compare.sh
|
|
1204
1206
|
```
|
|
1205
1207
|
|
|
1206
1208
|
**Results from `omnius@0.187.191` with `qwen3.5:9b`** (all 20 runs completed, zero timeouts):
|
|
1207
1209
|
|
|
1208
1210
|
| # | Prompt | Ollama (bare) | Omnius (full agent) | Winner |
|
|
1209
1211
|
|---|---|---|---|---|
|
|
1210
|
-
| 1 | "Latest stable Node.js version + source URL" | ❌ **v22.10.0** — hallucinated from Aug-2024 training cutoff | ✅ **v25.9.0** fetched from `nodejs.org/download/current`, **3 tool calls** (`web_search` → `web_fetch` → `task_complete`) | **
|
|
1211
|
-
| 2 | "Biggest tech news this week + source URL" | ❌ "I don't have real-time access" + generic AI trend guess | ✅ **Anthropic Mythos, Intel Terafab, Apple foldable, Russian router breach, Firmus $5.5B** — sourced from TechCrunch, **4 tool calls** | **
|
|
1212
|
-
| 3 | "Current OS, CPU cores, free memory — use shell tools" | ❌ Confabulated **"Linux / 8 cores / 6.1 GB"** (all wrong) | ✅ **Ubuntu 24.04.2 / 48 cores / 120 GB** (all correct), **6–7 shell tool calls** | **
|
|
1213
|
-
| 4 | "List files in cwd, count top level, most recent" | ❌ "I cannot access your filesystem" | ✅ **20 files, 50+ dirs, `.claude.json` (81 KB, 09:09 UTC)** via `list_directory`, **2 tool calls** | **
|
|
1214
|
-
| 5 | "2022 FIFA World Cup final winner + score" (both endpoints have this in training data) | ✅ Argentina 4–2 France | ✅ Argentina 3–3 France, **4–2 on penalties at Lusail Stadium, Dec 18 2022** — grounded with 4 tool calls | **Tie (
|
|
1212
|
+
| 1 | "Latest stable Node.js version + source URL" | ❌ **v22.10.0** — hallucinated from Aug-2024 training cutoff | ✅ **v25.9.0** fetched from `nodejs.org/download/current`, **3 tool calls** (`web_search` → `web_fetch` → `task_complete`) | **Omnius** |
|
|
1213
|
+
| 2 | "Biggest tech news this week + source URL" | ❌ "I don't have real-time access" + generic AI trend guess | ✅ **Anthropic Mythos, Intel Terafab, Apple foldable, Russian router breach, Firmus $5.5B** — sourced from TechCrunch, **4 tool calls** | **Omnius** |
|
|
1214
|
+
| 3 | "Current OS, CPU cores, free memory — use shell tools" | ❌ Confabulated **"Linux / 8 cores / 6.1 GB"** (all wrong) | ✅ **Ubuntu 24.04.2 / 48 cores / 120 GB** (all correct), **6–7 shell tool calls** | **Omnius** |
|
|
1215
|
+
| 4 | "List files in cwd, count top level, most recent" | ❌ "I cannot access your filesystem" | ✅ **20 files, 50+ dirs, `.claude.json` (81 KB, 09:09 UTC)** via `list_directory`, **2 tool calls** | **Omnius** |
|
|
1216
|
+
| 5 | "2022 FIFA World Cup final winner + score" (both endpoints have this in training data) | ✅ Argentina 4–2 France | ✅ Argentina 3–3 France, **4–2 on penalties at Lusail Stadium, Dec 18 2022** — grounded with 4 tool calls | **Tie (Omnius more detailed)** |
|
|
1215
1217
|
|
|
1216
1218
|
**Latency profile** (wall clock, 5-prompt median):
|
|
1217
1219
|
|
|
1218
|
-
| Phase | Ollama |
|
|
1220
|
+
| Phase | Ollama | Omnius agent | Omnius overhead |
|
|
1219
1221
|
|---|---|---|---|
|
|
1220
1222
|
| Non-streaming | 12–18s | 24–42s | 12–26s (agent loop + tool calls) |
|
|
1221
1223
|
| Streaming SSE | 11–16s | 24–56s | 10–40s |
|
|
1222
1224
|
|
|
1223
|
-
**Streaming parser validation** — every
|
|
1225
|
+
**Streaming parser validation** — every Omnius stream delivered:
|
|
1224
1226
|
- Live intermediate `tool_call` events mid-stream (e.g. `['web_search', 'web_fetch', 'task_complete']`)
|
|
1225
1227
|
- OpenAI `chat.completion.chunk` deltas with `id`, `model`, `finish_reason`
|
|
1226
1228
|
- Clean `data: [DONE]` termination with `finish_reason:"stop"`
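
A client-side check for the properties listed above might look like the sketch below. It assumes standard OpenAI-style `chat.completion.chunk` payloads (`choices[].delta.content`, `choices[].finish_reason`); the function name `parse_sse` is hypothetical, and intermediate `tool_call` events are skipped because their wire format is not shown here.

```python
import json


def parse_sse(lines):
    """Collect content deltas, the last finish_reason, and whether [DONE] arrived."""
    text, finish_reason, done = [], None, False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            done = True
            break
        chunk = json.loads(payload)
        # Chunks without "choices" (for example tool events) contribute no text.
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(text), finish_reason, done
```

A stream that ends without `done` being true, or with a `finish_reason` other than `"stop"`, would indicate the kind of regression the harness is meant to catch.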
|
|
@@ -1228,12 +1230,12 @@ MODEL=qwen3.5:9b bash scripts/oa-vs-ollama-chat-compare.sh
|
|
|
1228
1230
|
The harness is **reproducible** — rerun it after any `/v1/chat` change to catch regressions:
|
|
1229
1231
|
|
|
1230
1232
|
```bash
|
|
1231
|
-
MODEL=qwen3.5:4b bash scripts/
|
|
1232
|
-
MODEL=qwen3.5:9b
|
|
1233
|
-
MODEL=qwen3.5:32b
|
|
1233
|
+
MODEL=qwen3.5:4b bash scripts/omnius-vs-ollama-chat-compare.sh # faster tier for quick smoke
|
|
1234
|
+
MODEL=qwen3.5:9b OMNIUS_TIMEOUT=300 bash scripts/omnius-vs-ollama-chat-compare.sh # default
|
|
1235
|
+
MODEL=qwen3.5:32b OMNIUS_TIMEOUT=600 bash scripts/omnius-vs-ollama-chat-compare.sh # higher tier
|
|
1234
1236
|
```
|
|
1235
1237
|
|
|
1236
|
-
**Bottom line**: for any question that needs fresh data, system access, or filesystem visibility — bare Ollama is wrong or refuses;
|
|
1238
|
+
**Bottom line**: for any question that needs fresh data, system access, or filesystem visibility — bare Ollama is wrong or refuses; Omnius with the full agent is correct with citations. That's the differentiator captured live in the harness output.
|
|
1237
1239
|
|
|
1238
1240
|
#### One-Off Completions — `/api/generate` + `/v1/generate`
|
|
1239
1241
|
|
|
@@ -1244,11 +1246,11 @@ Drop-in for **Ollama `/api/generate`**. Same body shape, same response shape, sa
|
|
|
1244
1246
|
curl -s http://127.0.0.1:11434/api/generate \
|
|
1245
1247
|
-d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
|
|
1246
1248
|
|
|
1247
|
-
#
|
|
1249
|
+
# Omnius with full agent — only port changed
|
|
1248
1250
|
curl -s http://127.0.0.1:11435/api/generate \
|
|
1249
1251
|
-d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
|
|
1250
1252
|
|
|
1251
|
-
#
|
|
1253
|
+
# Omnius direct backend bypass (fast path, no agent)
|
|
1252
1254
|
curl -s http://127.0.0.1:11435/api/generate \
|
|
1253
1255
|
-d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false,"tools":false}'
|
|
1254
1256
|
```
|
|
@@ -1273,7 +1275,7 @@ curl -s http://127.0.0.1:11435/api/generate \
|
|
|
1273
1275
|
}
|
|
1274
1276
|
```
|
|
1275
1277
|
|
|
1276
|
-
The `_oa` extension block carries the
|
|
1278
|
+
The `_oa` extension block carries the Omnius-specific metadata (tool call count, agent duration, request ID for correlation with `/v1/audit`). Strict Ollama clients ignore unknown fields — no client changes required.
|
|
1277
1279
|
|
|
1278
1280
|
**Streaming** — set `"stream": true` and receive Ollama-style NDJSON chunks:
|
|
1279
1281
|
|
|
@@ -1349,18 +1351,18 @@ The `strength` and `lastRetrieved` fields are updated on every search — the st
|
|
|
1349
1351
|
|
|
1350
1352
|
#### Generate/Embed/Memory Test Harness
|
|
1351
1353
|
|
|
1352
|
-
A second harness at [`scripts/
|
|
1354
|
+
A second harness at [`scripts/omnius-vs-ollama-generate-embed-memory.sh`](scripts/omnius-vs-ollama-generate-embed-memory.sh) covers the four non-chat endpoint families:
|
|
1353
1355
|
|
|
1354
1356
|
```bash
|
|
1355
1357
|
MODEL=qwen3.5:9b EMBED_MODEL=nomic-embed-text \
|
|
1356
|
-
bash scripts/
|
|
1358
|
+
bash scripts/omnius-vs-ollama-generate-embed-memory.sh
|
|
1357
1359
|
```
|
|
1358
1360
|
|
|
1359
1361
|
**Tested results from `omnius@0.187.195`** (live, single run, `qwen3.5:9b` + `nomic-embed-text`):
|
|
1360
1362
|
|
|
1361
1363
|
**Part 1 — `/api/generate` one-off prompts**:
|
|
1362
1364
|
|
|
1363
|
-
| Prompt | Ollama |
|
|
1365
|
+
| Prompt | Ollama | Omnius direct | Omnius full agent |
|
|
1364
1366
|
|---|---|---|---|
|
|
1365
1367
|
| "TCP vs UDP in one sentence" | 26.8s — correct | 12.5s — correct | 43.8s — correct, **1 tool call** |
|
|
1366
1368
|
| "One-line Python square function" | 32.1s — correct | 12.2s — correct | ~3min — correct, **2 tool calls** |
|
|
@@ -1368,7 +1370,7 @@ MODEL=qwen3.5:9b EMBED_MODEL=nomic-embed-text \
|
|
|
1368
1370
|
|
|
1369
1371
|
**Part 2 — `/api/embed` cosine similarity sanity** (4 test sentences):
|
|
1370
1372
|
|
|
1371
|
-
Both Ollama and
|
|
1373
|
+
Both Ollama and Omnius emitted **identical 768-dim vectors** (same backend). Cosine similarity matrix:
|
|
1372
1374
|
|
|
1373
1375
|
```
|
|
1374
1376
|
France→Par Paris→Fran Germany→Be Bananas
|
|
@@ -1618,7 +1620,7 @@ curl -s -X POST http://localhost:11435/v1/files/read \
|
|
|
1618
1620
|
#### Sessions, Context, Cost, Sponsors, Nexus
|
|
1619
1621
|
|
|
1620
1622
|
```bash
|
|
1621
|
-
#
|
|
1623
|
+
# Omnius task session archive (not chat sessions)
|
|
1622
1624
|
curl -s 'http://localhost:11435/v1/sessions?limit=10'
|
|
1623
1625
|
curl -s http://localhost:11435/v1/sessions/{session_id}
|
|
1624
1626
|
|
|
@@ -1651,7 +1653,7 @@ curl -s -X POST http://localhost:11435/v1/files/read -d '{}'
|
|
|
1651
1653
|
```
|
|
1652
1654
|
```json
|
|
1653
1655
|
{
|
|
1654
|
-
"type": "https://
|
|
1656
|
+
"type": "https://omnius.nexus/problems/invalid-request",
|
|
1655
1657
|
"title": "Missing 'path'",
|
|
1656
1658
|
"status": 400,
|
|
1657
1659
|
"detail": "POST body must include {path: string, offset?: number, limit?: number}",
|
|
@@ -1695,7 +1697,7 @@ curl -s -o /dev/null -w '%{http_code}\n' \
|
|
|
1695
1697
|
|
|
1696
1698
|
#### Web Interface
|
|
1697
1699
|
|
|
1698
|
-
Open `http://localhost:11435/` in a browser when `
|
|
1700
|
+
Open `http://localhost:11435/` in a browser when `omnius serve` is running. Zero external dependencies — single self-contained HTML page.
|
|
1699
1701
|
|
|
1700
1702
|
**Tabs:**
|
|
1701
1703
|
- **Chat** — Conversational interface using `/v1/chat` with full tool access, session persistence, streaming responses, and collapsible tool call dropdowns
|
|
@@ -1716,7 +1718,7 @@ Open `http://localhost:11435/` in a browser when `oa serve` is running. Zero ext
|
|
|
1716
1718
|
- Token counter per conversation
|
|
1717
1719
|
- Conversation export (Markdown or JSON)
|
|
1718
1720
|
- GPU/VRAM detection with model compatibility recommendations
|
|
1719
|
-
- Per-provider token tracking (persisted to `.
|
|
1721
|
+
- Per-provider token tracking (persisted to `.omnius/usage/token-usage.json`)
|
|
1720
1722
|
|
|
1721
1723
|
### Enterprise Licensing
|
|
1722
1724
|
|
|
@@ -1795,16 +1797,16 @@ SUGGESTED NEXT STEP: A completed todo claims a missing artifact...
|
|
|
1795
1797
|
Prior `<world-state>` blocks are stripped before injecting the freshest one — only the current snapshot lives in context. Plan reconciliation uses `verifyCommand` + `declaredArtifacts` from the todo store + heuristic filename matching. Disk scan is gitignore-aware, capped at 200 files. Generic across stacks.
|
|
1796
1798
|
*Lit anchors*: MetaGPT (Hong et al. ICLR 2024) — SOP-encoded state representation; AlphaCodium (Ridnik et al. 2024) — symbol-aware iteration.
|
|
1797
1799
|
|
|
1798
|
-
Configurable via `
|
|
1800
|
+
Configurable via `OMNIUS_WORLD_STATE_INTERVAL` (default 8), `OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD` (default 5), `OMNIUS_WORLD_STATE_MAX_FILES` (default 200).
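
As an illustration of how these three variables and their documented defaults could be read, here is a small sketch. The `WorldStateConfig` class, the `_env_int` helper, and the fall-back-on-invalid-value behavior are assumptions for the example, not the orchestrator's actual loader.

```python
import os
from dataclasses import dataclass


def _env_int(name, default, env):
    """Read an integer variable; empty or non-numeric values fall back to the default."""
    raw = env.get(name, "").strip()
    try:
        return int(raw) if raw else default
    except ValueError:
        return default


@dataclass(frozen=True)
class WorldStateConfig:
    interval: int              # turns between world-state injections
    file_write_threshold: int  # file writes that trigger an early injection
    max_files: int             # disk-scan cap


def load_world_state_config(env=None):
    env = os.environ if env is None else env
    return WorldStateConfig(
        interval=_env_int("OMNIUS_WORLD_STATE_INTERVAL", 8, env),
        file_write_threshold=_env_int("OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD", 5, env),
        max_files=_env_int("OMNIUS_WORLD_STATE_MAX_FILES", 200, env),
    )
```

Passing `env` explicitly keeps the loader easy to exercise without mutating the real process environment.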
|
|
1799
1801
|
|
|
1800
1802
|
### REG-47 — Backward-pass critic on `task_complete`
|
|
1801
1803
|
|
|
1802
|
-
When the agent calls `task_complete` AND ≥ 1 file mutation occurred AND `
|
|
1804
|
+
When the agent calls `task_complete` AND ≥ 1 file mutation occurred AND `OMNIUS_BACKWARD_PASS=on`, the orchestrator spawns a dedicated CRITIC sub-agent against the same backend. The critic gets the diff + plan reconciliation + recent failures + a 10-point structural audit checklist (dead refs, missing imports, off-by-one, null-handling, stateful regex, hardcoded paths, untested code paths, plan-disk gaps, unresolved failures, generic-vs-specific drift) and votes:
|
|
1803
1805
|
- **approve** → task_complete proceeds, run terminates
|
|
1804
1806
|
- **request_changes** → issue feedback injected as a system message; agent loops to address
|
|
1805
1807
|
- **reject** → critical event; same as request_changes but with escalation marker
|
|
1806
1808
|
|
|
1807
|
-
Cycle-bounded (default 2 cycles before fail-soft). Default OFF — explicit opt-in via `
|
|
1809
|
+
Cycle-bounded (default 2 cycles before fail-soft). Default OFF — explicit opt-in via `OMNIUS_BACKWARD_PASS=on`.
|
|
1808
1810
|
*Lit anchors*: Self-Refine (Madaan et al. NeurIPS 2023) — +6-12% HumanEval correctness from a dedicated reviewer; CodeT (Chen et al. arxiv 2306.03907) — critic-contested implementer claims.
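
The gating and cycle bound for REG-47 can be pictured as a small loop. The sketch below is a simplified model, not the orchestrator's implementation: `run_backward_pass`, the `critic` callable, the `[ESCALATED]` marker text, and the `"fail_soft"` outcome label are all hypothetical; the three votes, the opt-in switch, the minimum-writes trigger, and the default of 2 cycles come from the description above.

```python
from typing import Callable


def run_backward_pass(
    critic: Callable[[int], tuple[str, str]],
    file_writes: int,
    enabled: bool = False,   # mirrors the default-off opt-in
    max_cycles: int = 2,
    min_writes: int = 1,
) -> tuple[str, list[str]]:
    """Return (outcome, feedback injected so far)."""
    # No review unless the feature is on and enough file mutations occurred.
    if not enabled or file_writes < min_writes:
        return "complete", []

    feedback: list[str] = []
    for cycle in range(max_cycles):
        vote, notes = critic(cycle)
        if vote == "approve":
            return "complete", feedback
        # request_changes and reject both loop; reject adds an escalation marker.
        prefix = "[ESCALATED] " if vote == "reject" else ""
        feedback.append(prefix + notes)
    # Cycle budget exhausted: stop reviewing rather than loop forever.
    return "fail_soft", feedback
```

In a real run the agent would act on each feedback message before the critic is consulted again; here the `critic` callable stands in for that whole round trip.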
|
|
1809
1811
|
|
|
1810
1812
|
### REG-48 — Cross-file specification drift detection
|
|
@@ -1859,29 +1861,29 @@ Run-by-run progression of the orchestrator:
|
|
|
1859
1861
|
| #18 | 43/44/45/46/47 | killed @ ~30m, 8/9 phases done, test-debug stuck | 62 | ✓ | partial |
|
|
1860
1862
|
| **#19** | **43/44/45/46/47/48** | **completed cleanly** | **62** | **✓** | **6/6 pass** |
|
|
1861
1863
|
|
|
1862
|
-
Detailed archival report: [`.aiwg/
|
|
1864
|
+
Detailed archival report: [`.aiwg/omnius-eval/RESULTS-RUN-19.md`](.aiwg/omnius-eval/RESULTS-RUN-19.md).
|
|
1863
1865
|
|
|
1864
1866
|
### Configuration summary
|
|
1865
1867
|
|
|
1866
1868
|
```bash
|
|
1867
1869
|
# Defense activation (set in daemon env or systemd unit)
|
|
1868
|
-
|
|
1869
|
-
|
|
1870
|
-
|
|
1871
|
-
|
|
1872
|
-
|
|
1873
|
-
|
|
1874
|
-
|
|
1870
|
+
OMNIUS_BACKWARD_PASS=on # enable REG-47 critic (default: off)
|
|
1871
|
+
OMNIUS_BACKWARD_PASS_MAX_CYCLES=2 # max review iterations
|
|
1872
|
+
OMNIUS_BACKWARD_PASS_MIN_WRITES=1 # min file mutations to trigger review
|
|
1873
|
+
OMNIUS_BACKWARD_PASS_TIMEOUT_MS=120000 # critic call timeout
|
|
1874
|
+
OMNIUS_BACKWARD_PASS_MAX_TOKENS=4096 # critic response cap
|
|
1875
|
+
OMNIUS_BACKWARD_PASS_MAX_FILES=60 # max files in critic prompt
|
|
1876
|
+
OMNIUS_BACKWARD_PASS_MAX_FILE_PREVIEW=8000
|
|
1875
1877
|
|
|
1876
|
-
|
|
1877
|
-
|
|
1878
|
-
|
|
1878
|
+
OMNIUS_WORLD_STATE_INTERVAL=8 # REG-46 turn-cadence (default: 8)
|
|
1879
|
+
OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD=5 # REG-46 write-trigger (default: 5)
|
|
1880
|
+
OMNIUS_WORLD_STATE_MAX_FILES=200 # REG-46 disk-scan cap
|
|
1879
1881
|
|
|
1880
|
-
|
|
1881
|
-
|
|
1882
|
+
OMNIUS_WORLD_STATE_DRIFT=on # REG-48 drift detector (default: on)
|
|
1883
|
+
OMNIUS_DRIFT_ALIASES='{"~/":"src/"}' # extra path aliases (JSON)
|
|
1882
1884
|
|
|
1883
|
-
|
|
1884
|
-
|
|
1885
|
+
OMNIUS_RUN_RETENTION_H=24 # run-record GC (default: 24h, 0 disables)
|
|
1886
|
+
OMNIUS_TOOL_OVERRIDES='{"shell":{"off_device_allowed":true}}' # per-tool security overrides
|
|
1885
1887
|
```
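
Two of the variables above carry JSON rather than scalars. The sketch below shows one way such values could be parsed and how a path alias map like `{"~/":"src/"}` might be applied. Both helpers (`parse_json_env`, `apply_alias`) and the longest-prefix-wins rule are assumptions made for illustration; the README does not specify how aliases are resolved.

```python
import json


def parse_json_env(raw, default):
    """Parse a JSON object from an env value; empty or invalid input yields the default."""
    if not raw:
        return default
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return default
    return value if isinstance(value, dict) else default


def apply_alias(path, aliases):
    """Rewrite the longest matching alias prefix; return the path unchanged otherwise."""
    for prefix in sorted(aliases, key=len, reverse=True):
        if path.startswith(prefix):
            return aliases[prefix] + path[len(prefix):]
    return path
```

For example, with `OMNIUS_DRIFT_ALIASES='{"~/":"src/"}'` this sketch would map `~/utils/io.ts` to `src/utils/io.ts` and leave `lib/x.ts` untouched.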
|
|
1886
1888
|
|
|
1887
1889
|
|
|
@@ -1997,7 +1999,7 @@ Omnius builds and maintains a **persistent, auto-updating knowledge graph** of t
|
|
|
1997
1999
|
### How It Works
|
|
1998
2000
|
|
|
1999
2001
|
```
|
|
2000
|
-
Source files ──> Regex symbol extraction ──> SQLite graph DB (.
|
|
2002
|
+
Source files ──> Regex symbol extraction ──> SQLite graph DB (.omnius/index/code-graph.db)
|
|
2001
2003
|
| |
|
|
2002
2004
|
| fs.watch() + debounce ──> File hash check ──> Incremental re-index (per file)
|
|
2003
2005
|
| |
|
|
@@ -2031,7 +2033,7 @@ For 1M+ LOC codebases, the Louvain community compression reduces 50K+ symbols in
|
|
|
2031
2033
|
|
|
2032
2034
|
### Storage
|
|
2033
2035
|
|
|
2034
|
-
The graph persists in `.
|
|
2036
|
+
The graph persists in `.omnius/index/code-graph.db` (SQLite with WAL mode) across sessions. Incremental updates mean editing a single file costs <50ms regardless of codebase size.
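
The hash check that makes updates incremental can be sketched with the standard `sqlite3` module. This is a minimal model under assumptions: the `files` table, its columns, and the function names are hypothetical and say nothing about the real schema of `code-graph.db`; only SQLite, WAL mode, and the per-file hash check come from the text.

```python
import hashlib
import sqlite3


def open_index(db_path):
    """Open (or create) the index database with write-ahead logging enabled."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    return conn


def reindex_if_changed(conn, path, content: bytes) -> bool:
    """Return True if the file was (re)indexed, False if its hash was unchanged."""
    digest = hashlib.sha256(content).hexdigest()
    row = conn.execute("SELECT sha256 FROM files WHERE path = ?", (path,)).fetchone()
    if row and row[0] == digest:
        return False  # unchanged: skip symbol extraction entirely
    # Symbol extraction for this one file would run here.
    conn.execute(
        "INSERT OR REPLACE INTO files (path, sha256) VALUES (?, ?)", (path, digest)
    )
    conn.commit()
    return True
```

Because the work is keyed on a single path, the cost of an edit depends on that file alone rather than on the size of the codebase.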
|
|
2035
2037
|
|
|
2036
2038
|
### Research Basis
|
|
2037
2039
|
|
|
@@ -2142,7 +2144,7 @@ On startup and `/model` switch, Omnius detects your RAM/VRAM and creates an opti
|
|
|
2142
2144
|
| **COHERE Cognitive Stack** | |
|
|
2143
2145
|
| `repl_exec` | Persistent Python REPL — variables/imports persist between calls, `llm_query()` and `parallel_llm_query()` available for recursive LLM invocation, `retrieve()` for handle access |
|
|
2144
2146
|
| `memory_metabolize` | Governed memory lifecycle — classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence/identity_relevance), consolidate lessons from trajectories |
|
|
2145
|
-
| `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.
|
|
2147
|
+
| `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.omnius/identity/` |
|
|
2146
2148
|
| `reflect` | Immune-system reflection — diagnostic (find flaws), epistemic (identify missing evidence), constitutional (review self-updates). Returns pass/revise/block verdict |
|
|
2147
2149
|
| `explore` | ARCHE strategy-space exploration — generate diverse strategies, archive successful variants with tags/confidence, compare competing approaches, retrieve past strategies |
|
|
2148
2150
|
| **Hardware Access** | |
|
|
@@ -2267,7 +2269,7 @@ Instead of writing custom integrations, point Omnius at an MCP server and its to
|
|
|
2267
2269
|
}
|
|
2268
2270
|
```
|
|
2269
2271
|
|
|
2270
|
-
Save that as `.
|
|
2272
|
+
Save that as `.omnius/mcp.json` (project) or `~/.omnius/mcp.json` (global). On startup, every server is spawned, the handshake runs, and every tool it advertises is exposed under the namespace `mcp__<server>__<tool>` — selectable by the agent like any built-in.
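
The `mcp__<server>__<tool>` naming scheme is easy to build and take apart. The helpers below are a hypothetical sketch (neither function is part of the Omnius API) and assume that server names never contain a double underscore.

```python
def namespaced(server: str, tool: str) -> str:
    """Build the agent-visible name for a tool advertised by an MCP server."""
    return f"mcp__{server}__{tool}"


def split_namespaced(name: str) -> tuple[str, str]:
    """Split 'mcp__<server>__<tool>' back into (server, tool)."""
    prefix, sep, rest = name.partition("__")
    if prefix != "mcp" or not sep:
        raise ValueError(f"not an MCP tool name: {name!r}")
    # Split on the first separator only, so tool names may contain '__'.
    server, sep, tool = rest.partition("__")
    if not server or not sep or not tool:
        raise ValueError(f"malformed MCP tool name: {name!r}")
    return server, tool
```

Splitting on the first separator after the prefix means a tool called `a__b` on server `fs` still round-trips as `("fs", "a__b")`.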
|
|
2271
2273
|
|
|
2272
2274
|
### Spec compliance — what we implement
|
|
2273
2275
|
|
|
@@ -2287,9 +2289,9 @@ The transport layer lives in `packages/execution/src/mcp/transport.ts`; the clie
|
|
|
2287
2289
|
|
|
2288
2290
|
### Three ways to add a server
|
|
2289
2291
|
|
|
2290
|
-
**1. Edit `.
|
|
2292
|
+
**1. Edit `.omnius/mcp.json` directly** — drop in the JSON shape above. On next launch the server is spawned and connected automatically.
|
|
2291
2293
|
|
|
2292
|
-
**2. Drag-and-drop a markdown file** — drop any README that contains an MCP config block (Claude Desktop format, bare server JSON, or `npx -y @scope/server-foo` install instructions in a code block) onto the
|
|
2294
|
+
**2. Drag-and-drop a markdown file** — drop any README that contains an MCP config block (Claude Desktop format, bare server JSON, or `npx -y @scope/server-foo` install instructions in a code block) onto the Omnius terminal. The MD parser detects the configuration with confidence scoring, persists it to `.omnius/mcp.json`, and connects immediately. No restart needed. Implementation: `packages/execution/src/mcp/md-intake.ts`.
|
|
2293
2295
|
|
|
2294
2296
|
**3. Use the `/mcp` slash command** — interactive TUI registry browser:
|
|
2295
2297
|
|
|
@@ -2297,7 +2299,7 @@ The transport layer lives in `packages/execution/src/mcp/transport.ts`; the clie
|
|
|
2297
2299
|
/mcp # Open the MCP registry menu
|
|
2298
2300
|
/mcp status # Quick connection table
|
|
2299
2301
|
/mcp ls # Same as status
|
|
2300
|
-
/mcp reload # Reconnect every server from .
|
|
2302
|
+
/mcp reload # Reconnect every server from .omnius/mcp.json
|
|
2301
2303
|
```
|
|
2302
2304
|
|
|
2303
2305
|
The main menu lists every configured server with status (●), transport type, tool count, and any error. Selecting a server opens a detail view showing every advertised tool with its description, plus actions to **Edit**, **Reconnect**, **Delete**, or go **Back**. Edit accepts a one-line JSON config; Save returns to the main list with the updated server reconnected.
|
|
@@ -2327,7 +2329,7 @@ We test the streaming features end-to-end against the [official everything refer
|
|
|
2327
2329
|
|
|
2328
2330
|
### Programmatic API
|
|
2329
2331
|
|
|
2330
|
-
If you want to drive an MCP server directly from code (instead of through an agent), the
|
|
2332
|
+
If you want to drive an MCP server directly from code (instead of through an agent), the Omnius package re-exports the client:
|
|
2331
2333
|
|
|
2332
2334
|
```typescript
|
|
2333
2335
|
import { McpClient } from "omnius";
|
|
@@ -2401,7 +2403,7 @@ The loop tracks iteration history, generates completion reports saved to `.aiwg/
|
|
|
2401
2403
|
| `/pause` | **Gentle halt** — lets the current inference turn finish, then stops before the next turn. No new tool calls or inference will begin until `/resume`. |
|
|
2402
2404
|
| `/stop` | **Immediate kill** — aborts the current inference mid-stream, saves task state for later resumption. |
|
|
2403
2405
|
| `/resume` | **Continue** — resumes a paused or stopped task from where it left off. Also resumes tasks saved by `/stop` or interrupted by `/update`. |
|
|
2404
|
-
| `/destroy` | **Nuclear option** — aborts any active task, deletes the `.
|
|
2406
|
+
| `/destroy` | **Nuclear option** — aborts any active task, deletes the `.omnius/` directory, clears the console, and exits to shell. |
|
|
2405
2407
|
|
|
2406
2408
|
### Session Context Persistence
|
|
2407
2409
|
|
|
@@ -2413,13 +2415,13 @@ Context is automatically saved on every task completion and preserved across `/u
|
|
|
2413
2415
|
/context show # Show saved context status (entries, last saved)
|
|
2414
2416
|
```
|
|
2415
2417
|
|
|
2416
|
-
The system maintains a rolling window of the last 20 session entries in `.
|
|
2418
|
+
The system maintains a rolling window of the last 20 session entries in `.omnius/context/session-context.json`. When you run `/context restore`, the last 10 entries are formatted into a restore prompt and injected into your next task, giving the agent continuity across sessions.
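
The rolling window and the restore prompt can be modeled in a few lines. This is a sketch only: the entry fields (`task`, `summary`), the prompt wording, and both function names are assumptions; the limits of 20 stored entries and 10 restored entries come from the paragraph above.

```python
from collections import deque

MAX_ENTRIES = 20      # entries kept in the session context file
RESTORE_ENTRIES = 10  # entries formatted into the restore prompt


def append_entry(entries, entry):
    """Append an entry, dropping the oldest once the window is full."""
    window = deque(entries, maxlen=MAX_ENTRIES)
    window.append(entry)
    return list(window)


def build_restore_prompt(entries):
    """Format the most recent entries as a bullet list for the next task."""
    recent = entries[-RESTORE_ENTRIES:]
    lines = [f"- {e['task']}: {e['summary']}" for e in recent]
    return "Previous session context:\n" + "\n".join(lines)
```

Keeping more entries on disk than are restored leaves room to raise the restore count later without losing history that has already been written.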
|
|
2417
2419
|
|
|
2418
2420
|
During `/update`, context is automatically saved before the process restarts and restored when the new version resumes your task.
|
|
2419
2421
|
|
|
2420
2422
|
### Auto-Restore on Startup
|
|
2421
2423
|
|
|
2422
|
-
When you launch `
|
|
2424
|
+
When you launch `omnius` in a workspace that has saved session context from a previous run, you'll be prompted to restore it:
|
|
2423
2425
|
|
|
2424
2426
|
```
|
|
2425
2427
|
ℹ Previous session found (5 entries, last active 2h ago)
|
|
@@ -2462,7 +2464,7 @@ Daemon: COHERE enabled — listening on nexus.cohere.query
|
|
|
2462
2464
|
Capacity announcement: 3 models, warm=qwen3.5:122b
|
|
2463
2465
|
|
|
2464
2466
|
Peer: "Explain TCP vs UDP" → NATS broadcast
|
|
2465
|
-
Your
|
|
2467
|
+
Your Omnius: claim → route to qwen3:4b (trivial) → respond in 1.2s
|
|
2466
2468
|
```
|
|
2467
2469
|
|
|
2468
2470
|
**How it works:**
|
|
@@ -2473,7 +2475,7 @@ Your OA: claim → route to qwen3:4b (trivial) → respond in 1.2s
|
|
|
2473
2475
|
- **Model allowlist** — `/cohere allow qwen3:4b` controls which models are exposed
|
|
2474
2476
|
- **Ollama safety** — remote queries can ONLY run inference on existing models; `/api/pull`, `/api/delete`, `/api/create` are never called
|
|
2475
2477
|
- **Identity pinning** — snapshots published to IPFS (Helia) with SHA-256 content addressing; survives daemon restarts
|
|
2476
|
-
- **Background daemon** persists across
|
|
2478
|
+
- **Background daemon** persists across Omnius restarts (`detached: true` + PID file reconnection)
|
|
2477
2479
|
|
|
2478
2480
|
```bash
|
|
2479
2481
|
/cohere stats # Network transparency — queries in/out, model usage, peer activity
|
|
@@ -2523,7 +2525,7 @@ The identity kernel maintains a persistent self-model across sessions, the refle
|
|
|
2523
2525
|
|
|
2524
2526
|
Omnius includes a behavioral immune system that prevents the agent from making pattern-matched mistakes under pressure. Inspired by biological immune systems: constraints are the antibodies, pressure detection is the inflammatory response, and memory injection is the recall mechanism.
|
|
2525
2527
|
|
|
2526
|
-
### Constraint Enforcement (`.
|
|
2528
|
+
### Constraint Enforcement (`.omnius/constraints.json`)
|
|
2527
2529
|
|
|
2528
2530
|
Machine-readable rules checked **before every tool execution**:
|
|
2529
2531
|
|
|
@@ -2548,7 +2550,7 @@ Machine-readable rules checked **before every tool execution**:
|
|
|
2548
2550
|
| `warn` | Executes tool but emits warning in agent's next turn context |
|
|
2549
2551
|
| `log` | Silent recording to audit log, no interruption |
|
|
2550
2552
|
|
|
2551
|
-
Constraints are scoped: global (`~/.omnius/constraints.json`), project (`.
|
|
2553
|
+
Constraints are scoped: global (`~/.omnius/constraints.json`), project (`.omnius/constraints.json`), or session (ephemeral).
|
|
2552
2554
|
|
|
2553
2555
|
### Pressure-Aware Decision Gate
|
|
2554
2556
|
|
|
@@ -2640,7 +2642,7 @@ Use deep context for:
|
|
|
2640
2642
|
- Long debugging sessions where error context from earlier is critical
|
|
2641
2643
|
- Tasks where the agent needs to reason about patterns across many files
|
|
2642
2644
|
|
|
2643
|
-
The setting persists to `.
|
|
2645
|
+
The setting persists to `.omnius/settings.json`. Deep context is particularly valuable for models with 64K+ context windows (Qwen3.5-122B, Llama 3.1 70B, etc.) where the default thresholds were leaving significant capacity unused.
|
|
2644
2646
|
|
|
2645
2647
|
### Status Bar Context Tracking (`Ctx:` + `SNR:`)
|
|
2646
2648
|
|
|
@@ -2750,7 +2752,7 @@ The profile is compiled into a system prompt suffix (max 80 tokens) injected at
|
|
|
2750
2752
|
|
|
2751
2753
|
### Persistence
|
|
2752
2754
|
|
|
2753
|
-
The style is saved to `.
|
|
2755
|
+
The style is saved to `.omnius/settings.json` (with `--local`) or `~/.omnius/config.json` (global) and persists across sessions. Change it anytime with `/style <preset>` — takes effect on the next task.
|
|
2754
2756
|
|
|
2755
2757
|
### Research Provenance
|
|
2756
2758
|
|
|
@@ -2876,7 +2878,7 @@ Output: 48kHz WAV, compatible with Telegram voice messages and WebSocket streami
|
|
|
2876
2878
|
|
|
2877
2879
|
### Supertonic Expressive Tags
|
|
2878
2880
|
|
|
2879
|
-
When Supertonic is the active voice backend,
|
|
2881
|
+
When Supertonic is the active voice backend, Omnius decorates spoken status updates with the expression tags Supertonic supports. The tag pass runs after markdown/ANSI cleanup and only for Supertonic, so GLaDOS, Overwatch, Kokoro, and LuxTTS continue receiving plain sanitized text.
|
|
2880
2882
|
|
|
2881
2883
|
Tag placement is context-aware:
|
|
2882
2884
|
|
|
@@ -3084,7 +3086,7 @@ When combined with `/voice`, you get full bidirectional audio — speak your tas
|
|
|
3084
3086
|
|
|
3085
3087
|
The `transcribe-cli` dependency auto-installs in the background on first use. On ARM or when transcribe-cli fails, the system automatically falls back to `openai-whisper` via a self-managed Python venv (same approach used by Moondream vision).
|
|
3086
3088
|
|
|
3087
|
-
**File transcription**: Drag-and-drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.
|
|
3089
|
+
**File transcription**: Drag and drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.omnius/transcripts/`.
|
|
3088
3090
|
|
|
3089
3091
|
|
|
3090
3092
|
|
|
@@ -3233,7 +3235,7 @@ Agent: agenda()
|
|
|
3233
3235
|
|
|
3234
3236
|
| Decision | Research Basis | Key Finding |
|
|
3235
3237
|
|----------|---------------|-------------|
|
|
3236
|
-
| Separate directive store (`.
|
|
3238
|
+
| Separate directive store (`.omnius/scheduled/`, not `.omnius/memory/`) | SSGM ([arXiv:2603.11768](https://arxiv.org/abs/2603.11768), 2026) | Directives in summarizable memory corrupt via compaction — semantic drift degrades scheduling data |
|
|
3237
3239
|
| File-based persistence survives process death | MemGPT/Letta (Packer et al. 2023, [arXiv:2310.08560](https://arxiv.org/abs/2310.08560)) | Agents are ephemeral; state must be external to the process |
|
|
3238
3240
|
| Priority-based startup surfacing | A-MAC ([arXiv:2603.04549](https://arxiv.org/abs/2603.04549), 2026) | 5-factor attention scoring; content type prior is most influential factor (31% latency reduction) |
|
|
3239
3241
|
| Cross-session self-reflection | Reflexion (Shinn et al. 2023, [arXiv:2303.11366](https://arxiv.org/abs/2303.11366)) | Persistent self-reflection stored as text improves task success 20-30% |
|
|
@@ -3287,7 +3289,7 @@ Supports `apt` (Debian/Ubuntu), `dnf` (Fedora), `pacman` (Arch), and `brew` (mac
|
|
|
3287
3289
|
Launch without arguments to enter the interactive REPL:
|
|
3288
3290
|
|
|
3289
3291
|
```bash
|
|
3290
|
-
|
|
3292
|
+
omnius
|
|
3291
3293
|
```
|
|
3292
3294
|
|
|
3293
3295
|
The TUI features an animated multilingual phrase carousel, live metrics bar with pastel-colored labels (token in/out, context window usage, human expert speed ratio, cost), rotating tips, syntax-highlighted tool output, and dynamic terminal-width cropping.
|
|
@@ -3306,9 +3308,9 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
|
|
|
3306
3308
|
| `/pause` | Pause after current turn finishes (gentle halt) |
|
|
3307
3309
|
| `/stop` | Kill current inference immediately, save state |
|
|
3308
3310
|
| `/resume` | Resume a paused or stopped task |
|
|
3309
|
-
| `/destroy` | Remove `.
|
|
3311
|
+
| `/destroy` | Remove `.omnius/` folder, kill all tasks, clear console, exit |
|
|
3310
3312
|
| **Context & Memory** | |
|
|
3311
|
-
| `/context save` | Force-save session context to `.
|
|
3313
|
+
| `/context save` | Force-save session context to `.omnius/context/` |
|
|
3312
3314
|
| `/context restore` | Restore context from previous sessions into next task |
|
|
3313
3315
|
| `/context show` | Show saved session context status |
|
|
3314
3316
|
| `/compact` | Force context compaction now (default strategy) |
|
|
@@ -3381,7 +3383,7 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
|
|
|
3381
3383
|
| `/help` | Show all available commands |
|
|
3382
3384
|
| `/quit` | Exit |
|
|
3383
3385
|
|
|
3384
|
-
All settings commands accept `--local` to save to project `.
|
|
3386
|
+
All settings commands accept `--local` to save to project `.omnius/settings.json` instead of global config.
|
|
3385
3387
|
|
|
3386
3388
|
### Platform Connectors
|
|
3387
3389
|
|
|
@@ -3441,7 +3443,7 @@ The steering sub-agent uses the same model and backend as the main agent with `m
|
|
|
3441
3443
|
Connect the agent to a Telegram bot. Telegram can run in auto, chat, or action mode: conversational messages get rapid streamed replies in chat mode, while codebase/file/run requests use dedicated action sub-agents that are visible in the terminal waterfall alongside other agent activity.
|
|
3442
3444
|
|
|
3443
3445
|
```bash
|
|
3444
|
-
/telegram --key <token> # Save bot token (persisted to .
|
|
3446
|
+
/telegram --key <token> # Save bot token (persisted to .omnius/settings.json)
|
|
3445
3447
|
/telegram --admin <userid> # Set admin user — gets full memory + tools
|
|
3446
3448
|
/telegram # Toggle bridge on/off (uses saved key)
|
|
3447
3449
|
/telegram status # Show connection status + active sub-agents
|
|
@@ -3488,7 +3490,7 @@ On success, that Telegram user ID is saved as the admin user and future private-
|
|
|
3488
3490
|
The Telegram bridge handles modern Bot API traffic directly:
|
|
3489
3491
|
|
|
3490
3492
|
- **Guest Mode** — inbound `guest_message` updates are normalized into regular agent work and answered through `answerGuestQuery`, so users can interact from profile-surface guest chats before a normal bot DM exists.
|
|
3491
|
-
- **Command menu registration** — when the bridge starts,
|
|
3493
|
+
- **Command menu registration** — when the bridge starts, Omnius registers the local slash-command surface with Telegram via `setMyCommands`; Telegram-safe names such as `/full_send_bless` are mapped back to canonical TUI commands like `/full-send-bless` before execution.
|
|
3492
3494
|
- **Bot-to-bot sends** — `/telegram bot <username> <text>` targets another bot by username using Telegram's supported bot-to-bot message subset.
|
|
3493
3495
|
- **Managed bot access** — `/telegram access get|set` reads and configures managed-bot access restrictions by managed bot user ID.
|
|
3494
3496
|
- **Polls and live photos** — incoming polls, poll media summaries, option media, country/member limits, and live photos are captured as first-class Telegram message context; `/telegram poll` and `/telegram live-photo` send the matching Bot API payloads.
|
|
@@ -3592,7 +3594,7 @@ The bridge distinguishes between **private DMs** and **group/supergroup chats**,
|
|
|
3592
3594
|
|
|
3593
3595
|
Photos, audio, voice messages, video, video notes, and documents sent via Telegram are automatically downloaded and processed:
|
|
3594
3596
|
|
|
3595
|
-
1. **Download** — files are fetched via the Telegram `getFile` API and cached to `.
|
|
3597
|
+
1. **Download** — files are fetched via the Telegram `getFile` API and cached to `.omnius/media-cache/`
|
|
3596
3598
|
2. **Processing** — routed to the appropriate pipeline:
|
|
3597
3599
|
- Images → `vision` / `image_read` / `ocr` tools
|
|
3598
3600
|
- Audio/voice → `transcribe_file` tool
|
|
@@ -3621,7 +3623,7 @@ The bridge automatically handles Telegram's rate limits (HTTP 429) with exponent
|
|
|
3621
3623
|
|
|
3622
3624
|
<div align="right"><a href="#top">back to top</a></div>
|
|
3623
3625
|
|
|
3624
|
-
Agents can earn and spend USDC on Base mainnet through the native x402 protocol built into [
|
|
3626
|
+
Agents can earn and spend USDC on Base mainnet through the native x402 protocol built into [open-agents-nexus@1.5.6](https://www.npmjs.com/package/open-agents-nexus).
|
|
3625
3627
|
|
|
3626
3628
|
### Wallet & Identity
|
|
3627
3629
|
```
|
|
@@ -3642,7 +3644,7 @@ When margin > 0, capabilities are registered with USDC pricing metadata. The dae
|
|
|
3642
3644
|
```
|
|
3643
3645
|
nexus(action='spend', target_address='0x...', amount_usdc='0.10')
|
|
3644
3646
|
```
|
|
3645
|
-
Signs an EIP-3009 `TransferWithAuthorization`. Budget-checked before signing. The recipient (or any facilitator) submits on-chain — no gas needed from the payer. Proof saved to `.
|
|
3647
|
+
Signs an EIP-3009 `TransferWithAuthorization`. Budget-checked before signing. The recipient (or any facilitator) submits on-chain — no gas needed from the payer. Proof saved to `.omnius/nexus/pending-transfer.json`.
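As a rough sketch of the budget check followed by building the authorization, the snippet below assembles the message fields that EIP-3009 defines for `TransferWithAuthorization` (`from`, `to`, `value`, `validAfter`, `validBefore`, `nonce`). The function name, the `remaining_budget_usdc` parameter, and the one-hour validity window are assumptions for illustration. Signing (EIP-712) and on-chain submission are deliberately left out.

```python
import secrets
import time
from decimal import Decimal

USDC_DECIMALS = 6  # USDC amounts are expressed in units of 10^-6


def build_authorization(payer, recipient, amount_usdc, remaining_budget_usdc,
                        ttl_seconds=3600, now=None):
    """Budget-check a payment, then build the unsigned EIP-3009 message fields.

    Hypothetical helper: the budget parameter and TTL are illustrative only.
    """
    amount = Decimal(amount_usdc)
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > Decimal(remaining_budget_usdc):
        raise ValueError("amount exceeds remaining budget; refusing to sign")

    now = int(time.time()) if now is None else now
    return {
        "from": payer,
        "to": recipient,
        "value": int(amount * 10 ** USDC_DECIMALS),  # integer base units
        "validAfter": 0,
        "validBefore": now + ttl_seconds,
        "nonce": "0x" + secrets.token_hex(32),       # random bytes32, single use
    }
```

Because the payer only produces a signature over these fields, whoever submits the signed authorization on-chain pays the gas, which is why the payer needs none.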
|
|
3646
3648
|
|
|
3647
3649
|
### Remote Inference — Tap Into the Mesh
|
|
3648
3650
|
```
|
|
@@ -3708,7 +3710,7 @@ Step 5 → Review and Go Live
|
|
|
3708
3710
|
- **libp2p P2P mesh** provides decentralized relay — no DNS, no port forwarding, NAT-traversing
|
|
3709
3711
|
- Cloudflared tunnel available as HTTPS fallback for non-P2P consumers
|
|
3710
3712
|
- Your raw API endpoint URL is **never exposed** — consumers connect via peerId or tunnel
|
|
3711
|
-
- Config persists to `.
|
|
3713
|
+
- Config persists to `.omnius/sponsor/config.json` — survives restarts
|
|
3712
3714
|
|
|
3713
3715
|
**Management:**
|
|
3714
3716
|
```bash
|
|
@@ -3734,11 +3736,11 @@ When using sponsored inference, the sponsor's banner animation and message appea
|
|
|
3734
3736
|
|
|
3735
3737
|
```
|
|
3736
3738
|
Primary path (libp2p):
|
|
3737
|
-
Consumer
|
|
3739
|
+
Consumer Omnius ──→ libp2p mesh ──→ Sponsor Daemon ──→ Ollama/vLLM
|
|
3738
3740
|
(P2P, NAT-traversing) (auth + rate limit) (local)
|
|
3739
3741
|
|
|
3740
3742
|
Fallback path (tunnel):
|
|
3741
|
-
Consumer
|
|
3743
|
+
Consumer Omnius ──→ Cloudflared Tunnel ──→ Sponsor Proxy ──→ Ollama/vLLM
|
|
3742
3744
|
(HTTPS) (auth + rate limit) (local)
|
|
3743
3745
|
|
|
3744
3746
|
Both paths enforce:
|
|
@@ -3782,7 +3784,7 @@ The `--full` flag is required to grant remote peers model management access. Spo
|
|
|
3782
3784
|
|
|
3783
3785
|
<div align="right"><a href="#top">back to top</a></div>
|
|
3784
3786
|
|
|
3785
|
-
COHERE (Collaborative Orchestration of Heuristic Emergent Reasoning Engines) is a distributed collective intelligence system where multiple
|
|
3787
|
+
COHERE (Collaborative Orchestration of Heuristic Emergent Reasoning Engines) is a distributed collective intelligence system where multiple Omnius nodes form a mesh that learns, evolves, and improves collectively. Queries from the [omnius.nexus](https://omnius.nexus) frontend or CLI are broadcast via NATS, processed by elected nodes through the full AgenticRunner (tools, context engineering, system prompts), and responses are peer-reviewed before delivery.
|
|
3786
3788
|
|
|
3787
3789
|
### How COHERE Works
|
|
3788
3790
|
|
|
@@ -3855,7 +3857,7 @@ Omnius includes infrastructure for the agent to learn from its own execution, im
|
|
|
3855
3857
|
|
|
3856
3858
|
### Trajectory Logging
|
|
3857
3859
|
|
|
3858
|
-
Every completed task is logged to `.
|
|
3860
|
+
Every completed task is logged to `.omnius/trajectories/trajectories.jsonl` with full metadata: task description, outcome (pass/fail), tool calls made, files modified, failed approaches, and timing. This data feeds the rejection fine-tuning pipeline. Research: [Golubev et al.](https://arxiv.org/abs/2508.03501) showed RFT on passing trajectories alone improved Qwen-72B from 11% to 25% on SWE-bench.
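One way to picture how the trajectory log could feed rejection fine-tuning is a filter that keeps only passing records. The sketch below assumes each JSONL line is an object with an `outcome` field set to `"pass"` or `"fail"`. That field name and its values are guesses based on the description above, not the documented schema.

```python
import json


def parse_jsonl(lines):
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def passing_only(records, outcome_key="outcome", pass_value="pass"):
    """Keep trajectories whose outcome is a pass (key and value are assumed names)."""
    return [record for record in records if record.get(outcome_key) == pass_value]


def load_passing(path):
    """Read a trajectories file and return only the passing records."""
    with open(path, encoding="utf-8") as handle:
        return passing_only(parse_jsonl(handle))
```

With this sketch, `load_passing(".omnius/trajectories/trajectories.jsonl")` would return the candidate training set, and failed runs would simply be dropped.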
|
|
3859
3861
|
|
|
3860
3862
|
### Rejection Fine-Tuning Pipeline
|
|
3861
3863
|
|
|
@@ -3969,14 +3971,14 @@ Omnius binds entities across image, audio, and text using joint‑embedding mode
|
|
|
3969
3971
|
- Voiceprint linkage: speaker embeddings (x‑vector/ECAPA) are associated with entities when co‑occurring in time with a visual track and a transcribed utterance; robust to background noise via median pooling across windows.
|
|
3970
3972
|
- Text label fusion: natural‑language labels (names, roles, tags) are bound to the same entity when co‑referents appear in proximate context windows (heuristics + clustering).
|
|
3971
3973
|
- Association graph: cross‑modal edges (image↔voice↔text) consolidate into a unified entity node with provenance (model, score, timestamp) and decay‑based confidence.
|
|
3972
|
-
- Privacy & safety: raw media never leaves the machine; embeddings are stored locally under `.
|
|
3974
|
+
- Privacy & safety: raw media never leaves the machine; embeddings are stored locally under `.omnius/memory/`. Redaction controls can drop embeddings by label or recency.
|
|
3973
3975
|
|
|
3974
3976
|
This enables queries like: “Find where Alex spoke about deployment,” “Show files edited after the person in the red sweater approved the PR,” or “Summarize conversations where Speaker‑B and Alice appear together.”
|
|
3975
3977
|
|
|
3976
3978
|
The associative memory integrates with a near-critical cognitive framework inspired by [Beggs & Plenz (2003)](https://doi.org/10.1523/JNEUROSCI.23-35-11167.2003) neuronal avalanche dynamics:
|
|
3977
3979
|
|
|
3978
|
-
- **Auto-consolidation**: At task boundaries, the system writes consolidation snapshots to `.
|
|
3979
|
-
- **Provenance KG**: Every agent action is tracked in `.
|
|
3980
|
+
- **Auto-consolidation**: At task boundaries, the system writes consolidation snapshots to `.omnius/consolidations/` with lessons learned and key patterns
|
|
3981
|
+
- **Provenance KG**: Every agent action is tracked in `.omnius/provenance/` for full action traceability
|
|
3980
3982
|
- **Homeostasis modulation**: Error rate drives exploration guidance — high error rates inject more careful approaches, low error rates encourage bolder exploration
|
|
3981
3983
|
- **Error pattern learning**: Recurring error patterns are detected, stored globally in `~/.omnius/error-patterns.json`, and injected as `[LEARNED FROM EXPERIENCE]` guidance before similar actions in future sessions
|
|
3982
3984
|
|
|
@@ -3997,18 +3999,18 @@ When you're not actively tasking the agent, Dream Mode lets it creatively explor
|
|
|
3997
3999
|
Each cycle expands through all four stages then contracts (evaluation, pruning of weak ideas). Three modes control how far the agent can go:
|
|
3998
4000
|
|
|
3999
4001
|
```bash
|
|
4000
|
-
/dream # Default — read-only exploration, proposals saved to .
|
|
4002
|
+
/dream # Default — read-only exploration, proposals saved to .omnius/dreams/
|
|
4001
4003
|
/dream deep # Multi-cycle deep exploration with expansion/contraction phases
|
|
4002
4004
|
/dream lucid # Full implementation — saves workspace backup, then implements,
|
|
4003
4005
|
# tests, evaluates, and self-plays each proposal with checkpoints
|
|
4004
4006
|
/dream stop # Wake up — stop dreaming
|
|
4005
4007
|
```
|
|
4006
4008
|
|
|
4007
|
-
**Default** and **Deep** modes are completely safe — the agent can only read your code and write proposals to `.
|
|
4009
|
+
**Default** and **Deep** modes are completely safe — the agent can only read your code and write proposals to `.omnius/dreams/`. File writes, edits, and shell commands outside that directory are blocked by sandboxed dream tools.
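The write restriction for Default and Deep modes amounts to a path-containment check. The following is a minimal sketch of such a check, assuming the sandbox is the `.omnius/dreams/` directory under the workspace root. The function name is hypothetical and the real sandboxed dream tools may work differently.

```python
from pathlib import Path


def is_write_allowed(target, workspace, sandbox_rel=".omnius/dreams"):
    """Return True only if `target` resolves to a path inside the dream sandbox.

    Hypothetical helper. Resolving both paths first means '..' segments and
    absolute paths cannot be used to escape the sandbox directory.
    """
    sandbox = (Path(workspace) / sandbox_rel).resolve()
    candidate = (Path(workspace) / target).resolve()
    return sandbox in candidate.parents
```

Comparing resolved paths rather than string prefixes avoids false positives such as `.omnius/dreams-evil/` matching a `.omnius/dreams` prefix.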
|
|
4008
4010
|
|
|
4009
4011
|
**Lucid** mode unlocks full write access. Before making changes, it saves a workspace checkpoint so you can roll back. Each cycle goes: dream → implement → test → evaluate → checkpoint → next cycle.
|
|
4010
4012
|
|
|
4011
|
-
All proposals are indexed in `.
|
|
4013
|
+
All proposals are indexed in `.omnius/dreams/PROPOSAL-INDEX.md` for easy review.
|
|
4012
4014
|
|
|
4013
4015
|
### Autoresearch Swarm — 5-Agent GPU Experiment Loop
|
|
4014
4016
|
|
|
@@ -4021,7 +4023,7 @@ The swarm operates in four phases:
|
|
|
4021
4023
|
| **Phase 0: Load** | Reads autoresearch memory (best config, experiment log, failed approaches, hypothesis queue, architectural insights) + detects GPU specs |
|
|
4022
4024
|
| **Phase 1: Hypothesis** | Critic generates 5-8 hypotheses; Flow Maintainer plans experiment ordering and round budget |
|
|
4023
4025
|
| **Phase 2: Experiment** | Sequential rounds (up to 3): Critic pre-screens → Researcher modifies train.py + runs → Monitor watches GPU → Evaluator keeps/discards → Flow Maintainer decides continue/stop |
|
|
4024
|
-
| **Phase 3: Summary** | Flow Maintainer writes consolidated summary to memory + dream report to `.
|
|
4026
|
+
| **Phase 3: Summary** | Flow Maintainer writes consolidated summary to memory + dream report to `.omnius/dreams/` |
|
|
4025
4027
|
|
|
4026
4028
|
#### The 5 Agent Roles
|
|
4027
4029
|
|
|
@@ -4035,7 +4037,7 @@ The swarm operates in four phases:
|
|
|
4035
4037
|
|
|
4036
4038
|
#### Bidirectional Memory
|
|
4037
4039
|
|
|
4038
|
-
The swarm maintains persistent memory in `.
|
|
4040
|
+
The swarm maintains persistent memory in `.omnius/memory/autoresearch.json` with five keys:
|
|
4039
4041
|
|
|
4040
4042
|
- **best_config** — best val_bpb and what train.py changes produced it
|
|
4041
4043
|
- **experiment_log** — chronological list of experiments with hypotheses, results, and verdicts
|
|
@@ -4132,7 +4134,7 @@ curl -X POST http://localhost:11435/v1/run \
|
|
|
4132
4134
|
|
|
4133
4135
|
### Multi-Agent Collective Testbed
|
|
4134
4136
|
|
|
4135
|
-
Spawn multiple
|
|
4137
|
+
Spawn multiple Omnius instances in Docker for collective intelligence experiments:
|
|
4136
4138
|
|
|
4137
4139
|
```bash
|
|
4138
4140
|
cd testbed
|
|
@@ -4379,12 +4381,12 @@ omnius config set backendUrl http://localhost:11434
|
|
|
4379
4381
|
|
|
4380
4382
|
### Project Context
|
|
4381
4383
|
|
|
4382
|
-
Create `AGENTS.md`, `
|
|
4384
|
+
Create `AGENTS.md`, `Omnius.md`, or `.omnius.md` in your project root for agent instructions. Context files merge from parent to child directories.
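To illustrate parent-to-child merging, the sketch below walks from an outer directory down to the working directory and concatenates any context files it finds, outermost first. Treating "merge" as plain concatenation, the `stop_dir` parameter, and the function name are all assumptions. Only the three file names come from the README.

```python
from pathlib import Path

CONTEXT_FILENAMES = ("AGENTS.md", "Omnius.md", ".omnius.md")


def collect_context(start_dir, stop_dir):
    """Concatenate context files from `stop_dir` down to `start_dir`.

    Hypothetical helper: parent instructions come first so that text from a
    child directory appears later and can refine them.
    """
    start = Path(start_dir).resolve()
    stop = Path(stop_dir).resolve()

    # start, its parent, its grandparent, ... truncated at stop_dir if present.
    chain = [start, *start.parents]
    if stop in chain:
        chain = chain[: chain.index(stop) + 1]

    parts = []
    for directory in reversed(chain):  # outermost directory first
        for name in CONTEXT_FILENAMES:
            candidate = directory / name
            if candidate.is_file():
                parts.append(candidate.read_text(encoding="utf-8"))
    return "\n\n".join(parts)
```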
|
|
4383
4385
|
|
|
4384
|
-
### `.
|
|
4386
|
+
### `.omnius/` Project Directory
|
|
4385
4387
|
|
|
4386
4388
|
```
|
|
4387
|
-
.
|
|
4389
|
+
.omnius/
|
|
4388
4390
|
├── config.json # Project config overrides
|
|
4389
4391
|
├── settings.json # TUI settings (model, endpoint, voice, stream, etc.)
|
|
4390
4392
|
├── memory/ # Persistent memory store (topics, patterns, facts)
|
|
@@ -4410,9 +4412,9 @@ Create `AGENTS.md`, `OA.md`, or `.omnius.md` in your project root for agent inst
|
|
|
4410
4412
|
Any Ollama or OpenAI-compatible API model with tool calling works:
|
|
4411
4413
|
|
|
4412
4414
|
```bash
|
|
4413
|
-
|
|
4414
|
-
|
|
4415
|
-
|
|
4415
|
+
omnius --model qwen2.5-coder:32b "fix the bug"
|
|
4416
|
+
omnius --backend vllm --backend-url http://localhost:8000/v1 "add tests"
|
|
4417
|
+
omnius --backend-url http://10.0.0.5:11434 "refactor auth"
|
|
4416
4418
|
```
|
|
4417
4419
|
|
|
4418
4420
|
|
|
@@ -4506,8 +4508,8 @@ Forward any configured `/endpoint` (Chutes, Groq, OpenRouter, Together, vLLM, et
|
|
|
4506
4508
|
- Your node registers inference capabilities on the P2P mesh using your upstream endpoint's models
|
|
4507
4509
|
- Remote peers discover and invoke these capabilities via libp2p streams (DHT/mDNS/NATS)
|
|
4508
4510
|
- Requests are forwarded to your upstream API, responses streamed back to the peer
|
|
4509
|
-
- The libp2p daemon persists in the background — it survives
|
|
4510
|
-
- When you reopen
|
|
4511
|
+
- The libp2p daemon persists in the background — it survives Omnius restarts and remains discoverable even when the TUI is closed
|
|
4512
|
+
- When you reopen Omnius, it reconnects to the existing daemon and resumes stats tracking
|
|
4511
4513
|
|
|
4512
4514
|
**Rate limit distribution (`--loadbalance`):**
|
|
4513
4515
|
- Captures `x-ratelimit-remaining-tokens` and `x-ratelimit-limit-tokens` headers from upstream API responses
|
|
@@ -4776,7 +4778,7 @@ node eval/run-agentic.mjs --model qwen3.5:4b # Different model tier
|
|
|
4776
4778
|
|
|
4777
4779
|
### REST API Enterprise Evaluation (v0.185.68)
|
|
4778
4780
|
|
|
4779
|
-
35 test cases executed against the
|
|
4781
|
+
35 test cases executed against the omnius REST API (`omnius serve` on port 11435) across **10 industries** and **3 model tiers**. Each case sends a domain-specific prompt via `/v1/chat/completions` and verifies correctness against expected patterns.
|
|
4780
4782
|
|
|
4781
4783
|
```bash
|
|
4782
4784
|
node eval/api-enterprise-eval.mjs # Run all 85 tests (35 cases × 3 models)
|
|
@@ -4833,7 +4835,7 @@ Omnius integrates with [AIWG](https://aiwg.io) ([npm](https://www.npmjs.com/pack
|
|
|
4833
4835
|
|
|
4834
4836
|
```bash
|
|
4835
4837
|
npm i -g aiwg
|
|
4836
|
-
|
|
4838
|
+
omnius "analyze this project's SDLC health and set up documentation"
|
|
4837
4839
|
```
|
|
4838
4840
|
|
|
4839
4841
|
| Capability | Description |
|
|
@@ -4930,26 +4932,26 @@ Control it live from the TUI:
|
|
|
4930
4932
|
|
|
4931
4933
|
```
|
|
4932
4934
|
/access # show current access + host
|
|
4933
|
-
/access loopback|lan|any # set access policy (
|
|
4934
|
-
/host 127.0.0.1:11435 # bind to loopback only (
|
|
4935
|
+
/access loopback|lan|any # set access policy (OMNIUS_ACCESS) and restart daemon
|
|
4936
|
+
/host 127.0.0.1:11435 # bind to loopback only (OMNIUS_HOST) and restart daemon
|
|
4935
4937
|
/host 0.0.0.0:11435 # bind all interfaces and restart daemon
|
|
4936
4938
|
/network config # interactive menu (arrow keys) to change both
|
|
4937
4939
|
|
|
4938
4940
|
# Project-local persistence
|
|
4939
|
-
/access any --local # save to ./.
|
|
4941
|
+
/access any --local # save to ./.omnius/settings.json
|
|
4940
4942
|
/host 127.0.0.1:11435 --local
|
|
4941
4943
|
```
|
|
4942
4944
|
|
|
4943
4945
|
Environment variables (non-TUI usage):
|
|
4944
4946
|
|
|
4945
4947
|
```
|
|
4946
|
-
|
|
4948
|
+
OMNIUS_ACCESS=lan OMNIUS_HOST=0.0.0.0:11435 omnius
|
|
4947
4949
|
```
|
|
4948
4950
|
|
|
4949
4951
|
Persistence and startup behavior:
|
|
4950
4952
|
|
|
4951
|
-
- The TUI saves your choices to `.
|
|
4952
|
-
- On startup, the TUI loads saved `
|
|
4953
|
+
- The TUI saves your choices to `.omnius/settings.json` (project) or `~/.omnius/settings.json` (global).
|
|
4954
|
+
- On startup, the TUI loads saved `omniusAccess`/`omniusHost` and seeds `OMNIUS_ACCESS`/`OMNIUS_HOST` before ensuring the daemon, so the 11435 service picks them up immediately.
|
|
4953
4955
|
- Explicit environment variables always win over saved settings.
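The precedence rules above can be sketched as a small resolver. The environment variable names and the `omniusAccess`/`omniusHost` keys come from the text. The function itself is hypothetical, and letting project settings take priority over global ones is an assumption, since the README only states that explicit environment variables win.

```python
SETTING_KEYS = (
    ("OMNIUS_ACCESS", "omniusAccess"),
    ("OMNIUS_HOST", "omniusHost"),
)


def resolve_network_settings(env, project_settings, global_settings):
    """Pick the effective value for each variable.

    Order used here: explicit environment variable, then project settings,
    then global settings (the project-over-global order is an assumption).
    """
    resolved = {}
    for env_name, key in SETTING_KEYS:
        if env.get(env_name):
            resolved[env_name] = env[env_name]
        elif key in project_settings:
            resolved[env_name] = project_settings[key]
        elif key in global_settings:
            resolved[env_name] = global_settings[key]
    return resolved
```

In practice `env` would be `os.environ`, and the two dictionaries would be loaded from `.omnius/settings.json` and `~/.omnius/settings.json`.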
|
|
4954
4956
|
|
|
4955
4957
|
Security tips:
|