omnius 1.0.2 → 1.0.3
- package/README.md +158 -158
- package/dist/index.js +2216 -2198
- package/dist/launcher.cjs +1 -1
- package/dist/postinstall-daemon.cjs +78 -78
- package/dist/preinstall.cjs +8 -8
- package/dist/scripts/ocr-advanced.py +2 -2
- package/dist/scripts/start-moondream.py +1 -1
- package/dist/scripts/tor/tor_setup.sh +1 -1
- package/npm-shrinkwrap.json +3 -7
- package/package.json +3 -7
- package/prompts/agentic/system-large.md +10 -10
- package/prompts/agentic/system-medium.md +2 -2
- package/prompts/agentic/system-small.md +2 -2
- package/prompts/tui/dream-consolidate.md +1 -1
- package/prompts/tui/dream-lucid-eval.md +1 -1
- package/prompts/tui/dream-lucid-implement.md +1 -1
- package/prompts/tui/dream-stages.md +1 -1
package/README.md
CHANGED
|
@@ -28,7 +28,7 @@
|
|
|
28
28
|
---
|
|
29
29
|
|
|
30
30
|
```bash
|
|
31
|
-
npm i -g omnius &&
|
|
31
|
+
npm i -g omnius && omnius
|
|
32
32
|
```
|
|
33
33
|
|
|
34
34
|
An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and configures the optimal model with expanded context window automatically.
|
|
@@ -59,7 +59,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
|
|
|
59
59
|
- [Parallelism & Concurrency](#parallelism--concurrency)
|
|
60
60
|
- [Endpoint Reference](#endpoint-reference)
|
|
61
61
|
- [Stateful Chat — `/v1/chat` + `/api/chat` (OpenAI drop-in with full agent under the hood)](#stateful-chat--v1chat--apichat-openai-drop-in-with-full-agent-under-the-hood)
|
|
62
|
-
- [Live Comparison: Ollama vs
|
|
62
|
+
- [Live Comparison: Ollama vs Omnius Full Agent](#live-comparison-ollama-vs-omnius-full-agent)
|
|
63
63
|
- [One-Off Completions — `/api/generate` + `/v1/generate`](#one-off-completions--apigenerate--v1generate)
|
|
64
64
|
- [Embeddings — `/v1/embeddings` + `/api/embed`](#embeddings--v1embeddings--apiembed)
|
|
65
65
|
- [Memory Recall + Knowledge Graph — `/v1/memory/*`](#memory-recall--knowledge-graph--v1memory)
|
|
@@ -212,7 +212,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
|
|
|
212
212
|
- [Configuration](#configuration)
|
|
213
213
|
- [Network Access & Binding](#network-access--binding)
|
|
214
214
|
- [Project Context](#project-context)
|
|
215
|
-
- [`.
|
|
215
|
+
- [`.omnius/` Project Directory](#omnius-project-directory)
|
|
216
216
|
- [Model Support](#model-support)
|
|
217
217
|
- [Supported Inference Providers](#supported-inference-providers)
|
|
218
218
|
- [Connecting to a Provider](#connecting-to-a-provider)
|
|
@@ -242,7 +242,7 @@ An LLM is a high-bandwidth associative generative core — closer to a cortex-li
|
|
|
242
242
|
|---|---|---|
|
|
243
243
|
| Associative core | Cortex | LLM weights (any size) |
|
|
244
244
|
| Current workspace | Global workspace / attention | `assembleContext()` — structured context assembly |
|
|
245
|
-
| Episodic memory | Hippocampus | `.
|
|
245
|
+
| Episodic memory | Hippocampus | `.omnius/memory/` — write, search, retrieve across sessions |
|
|
246
246
|
| Cognitive map | Hippocampal spatial maps | `semantic-map.ts` + `repo-map.ts` (PageRank) |
|
|
247
247
|
| Action gating | Basal ganglia | Tool selection policy (task-aware filtering) |
|
|
248
248
|
| Temporal hierarchy | Prefrontal executive | Task decomposition, sub-agent delegation |
|
|
@@ -260,7 +260,7 @@ Don't chase larger models. Build the organism around whatever model you have.
|
|
|
260
260
|
<div align="right"><a href="#top">back to top</a></div>
|
|
261
261
|
|
|
262
262
|
```
|
|
263
|
-
You:
|
|
263
|
+
You: omnius "fix the null check in auth.ts"
|
|
264
264
|
|
|
265
265
|
Agent: [Turn 1] file_read(src/auth.ts)
|
|
266
266
|
[Turn 2] grep_search(pattern="null", path="src/auth.ts")
|
|
@@ -286,8 +286,8 @@ The agent uses tools autonomously in a loop — reading errors, fixing code, and
|
|
|
286
286
|
- **Sub-agent delegation** — spawn independent agents for parallel workstreams
|
|
287
287
|
- **OpenCode delegation** — offload coding tasks to opencode (sst/opencode) as an autonomous sub-agent with auto-install, progress monitoring, and result evaluation
|
|
288
288
|
- **Long-horizon cron agents** — schedule recurring autonomous agent tasks with goals, completion criteria, execution history, and automatic evaluation (daily code reviews, weekly dep updates, continuous monitoring)
|
|
289
|
-
- **Nexus P2P networking** — decentralized agent-to-agent communication via [
|
|
290
|
-
- **x402 micropayments** — native x402 payment rails via
|
|
289
|
+
- **Nexus P2P networking** — decentralized agent-to-agent communication via [open-agents-nexus](https://www.npmjs.com/package/open-agents-nexus). Join rooms, discover peers, share resources, and communicate across the agent mesh with encrypted P2P transport
|
|
290
|
+
- **x402 micropayments** — native x402 payment rails via open-agents-nexus@1.5.6. Agents create secp256k1/EVM wallets (AES-256-GCM encrypted, keys never exposed to LLM), register inference with USDC pricing on Base, auto-handle `payment_required`/`payment_proof` negotiation, track earnings/spending in ledger.jsonl, enforce budget policies, and sign gasless EIP-3009 transfers
|
|
291
291
|
- **Inference capability proof** — benchmark local models with anti-spoofing SHA-256 hashed proofs, generate capability scorecards for peer verification
|
|
292
292
|
- **Littleman Observer** — parallel meta-analysis system that watches the agent loop in real-time. Detects false failure claims after successful tools, blocks redundant re-execution, catches runaway one-sided output in conversations, and dynamically extends turn limits when active work is detected. Emits `debug_context` and `debug_littleman` events for live observability
|
|
293
293
|
- **Interactive Session Lock** — generic `SESSION_ACTIVE` protocol prevents premature task completion during long-running sessions (phone calls, live chat, monitoring). Any MCP contract can adopt the protocol. Paired with context-engineered system prompts that teach small models to maintain conversation loops
|
|
@@ -306,8 +306,8 @@ Omnius includes background workers that compute and associate embeddings across
|
|
|
306
306
|
|
|
307
307
|
Config (env vars):
|
|
308
308
|
|
|
309
|
-
- `
|
|
310
|
-
- `
|
|
309
|
+
- `OMNIUS_COOCUR_WINDOW_MS` — max time delta between visual and transcript episodes to create co‑occurrence links (default: 120000 ms).
|
|
310
|
+
- `OMNIUS_COOCUR_CLIP_SIM_MIN` — minimum CLIP text↔image cosine (0..1, default: 0.22) for linking when both embeddings are available.
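The two settings above amount to a gating rule for co-occurrence links, which can be sketched as follows. This is only an illustration under stated assumptions, not the package's implementation: the function names (`cosine`, `should_link`) are hypothetical, and the time-only fallback when an embedding is missing is an assumption, since the README only says the CLIP threshold applies "when both embeddings are available".

```python
import math
from typing import Optional, Sequence

# Defaults as documented in the README for the two env vars.
WINDOW_MS = 120_000   # OMNIUS_COOCUR_WINDOW_MS
CLIP_SIM_MIN = 0.22   # OMNIUS_COOCUR_CLIP_SIM_MIN


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_link(
    visual_ts_ms: int,
    transcript_ts_ms: int,
    image_emb: Optional[Sequence[float]] = None,
    text_emb: Optional[Sequence[float]] = None,
    window_ms: int = WINDOW_MS,
    sim_min: float = CLIP_SIM_MIN,
) -> bool:
    """Hypothetical gate: link two episodes if they are close in time and,
    when both embeddings exist, similar enough."""
    # Episodes further apart than the window never link.
    if abs(visual_ts_ms - transcript_ts_ms) > window_ms:
        return False
    # The similarity threshold only applies when both embeddings are available.
    if image_emb is not None and text_emb is not None:
        return cosine(image_emb, text_emb) >= sim_min
    # Assumption: with an embedding missing, fall back to the time window alone.
    return True
```

Raising the similarity threshold trades recall for precision; widening the window does the opposite.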
|
|
311
311
|
|
|
312
312
|
The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile, speechbrain, Whisper) into `~/.omnius/venv` and registers providers automatically. No manual installs are required.
|
|
313
313
|
- **Ralph Loop** — iterative task execution that keeps retrying until completion criteria are met
|
|
@@ -316,7 +316,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
|
|
|
316
316
|
- **Persistent Python REPL** — `repl_exec` tool maintains variables, imports, and functions across calls. Write Python code that processes data iteratively, with `llm_query()` available for recursive LLM sub-calls from within code
|
|
317
317
|
- **Recursive LLM calls** — `llm_query(prompt, context)` invokes the model from inside REPL code, enabling loop-based semantic analysis of large inputs ([RLM paper](https://arxiv.org/abs/2512.24601)). `parallel_llm_query()` runs multiple calls concurrently ([SPRINT](https://arxiv.org/abs/2506.05745))
|
|
318
318
|
- **Memory metabolism** — governed memory lifecycle: classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence), consolidate lessons from trajectories. Inspired by [TIMG](https://arxiv.org/abs/2603.10600) and [MemMA](https://arxiv.org/abs/2603.18718)
|
|
319
|
-
- **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.
|
|
319
|
+
- **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.omnius/identity/`
|
|
320
320
|
- **Reflection & integrity** — immune-system audit: diagnostic ("what's wrong?"), epistemic ("what evidence is missing?"), constitutional ("should this change become part of self?"). Inspired by [LEAFE](https://arxiv.org/abs/2603.16843) and [RewardHackingAgents](https://arxiv.org/abs/2603.11337)
|
|
321
321
|
- **Exploration & culture** — ARCHE strategy-space exploration: generate competing hypotheses, archive successful variants, retrieve past strategies. Inspired by [SGE](https://arxiv.org/abs/2603.02045) and [Darwin Gödel Machine](https://arxiv.org/abs/2505.22954)
|
|
322
322
|
- **Autoresearch Swarm** — 5-agent GPU experiment loop during REM sleep: Researcher, Monitor, Evaluator, Critic, Flow Maintainer autonomously run ML training experiments, keep improvements, discard regressions
|
|
@@ -325,7 +325,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
|
|
|
325
325
|
- **Call Sub-Agent** — each WebSocket caller gets a dedicated AgenticRunner for low-latency voice-to-voice loops, with admin/public access tiers and bidirectional activity sharing with the main agent
|
|
326
326
|
- **Telegram Voice** — `/voice` enabled via Telegram forwards TTS audio as voice messages alongside text responses. Incoming voice messages are auto-transcribed and handled as text
|
|
327
327
|
- **Neural TTS** — hear what the agent is doing via GLaDOS, Overwatch, Kokoro, or LuxTTS voice clone, with literature-grounded narration engine (sNeuron-TST structure rotation, Moshi ring buffer dedup, UDDETTS emotion-driven prosody, SEST metadata, LuxTTS flow-matching voice cloning)
|
|
328
|
-
- **Supertonic expressive tags** — when `/voice supertonic` is active,
|
|
328
|
+
- **Supertonic expressive tags** — when `/voice supertonic` is active, Omnius inserts supported expression tags such as `<sigh>`, `<breath>`, and `<laugh>` into spoken status updates based on failure, recovery, sentence boundaries, success, and playful tone. Other voice backends receive sanitized plain text
|
|
329
329
|
- **Personality Core** — SAC framework-based style control (concise/balanced/verbose/pedagogical) that shapes agent response depth, voice expressiveness, and system prompt behavior
|
|
330
330
|
- **Human expert speed ratio** — real-time `Exp: Nx` gauge comparing agent speed to a leading human expert, calibrated across 47 tool baselines
|
|
331
331
|
- **Cost tracking** — real-time token cost estimation for 15+ cloud providers
|
|
@@ -342,14 +342,14 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
|
|
|
342
342
|
- **Inference capability scoring** — canirun.ai-style hardware assessment at first launch: memory/compute/speed scores, per-model compatibility matrix, recommended model selection
|
|
343
343
|
- **Auto-install everything** — first-run wizard auto-installs Ollama, curl, Python3, python3-venv with platform-aware package managers (apt, dnf, yum, pacman, apk, zypper, brew)
|
|
344
344
|
- **Sponsored inference** — `/sponsor` walks through a 5-step wizard to share your GPU with the world: select endpoints, choose banner animation (8 presets + AI-generated custom), set header message/links, configure transport (cloudflared/libp2p) + rate limits, and go live. Consumers discover sponsors via `/endpoint sponsor`. Secure proxy relay with per-IP rate limiting, daily token budgets, model allowlist, and concurrent request caps. Sponsor's raw API URL is never exposed. See [Sponsored Inference](#sponsored-inference--share-your-gpu-with-the-world) below
|
|
345
|
-
- **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across
|
|
346
|
-
- **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{
|
|
345
|
+
- **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across Omnius restarts
|
|
346
|
+
- **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{OMNIUS_VAR_*}}`), trust tiers (LOCAL/TEE/VERIFIED/PUBLIC), WebSocket peer mesh, and inference routing with automatic secret redaction/injection
|
|
347
347
|
- **Secret vault** — `/secrets` manages API keys and credentials with AES-256-GCM encrypted persistence; secrets are automatically redacted before sending to untrusted inference peers and re-injected on response
|
|
348
348
|
- **Auto-expanding context** — detects RAM/VRAM and creates an optimized model variant on first run
|
|
349
349
|
- **Mid-task steering** — type while the agent works to add context without interrupting
|
|
350
350
|
- **Smart compaction** — 6 context compaction strategies (default, aggressive, decisions, errors, summary, structured) with ARC-inspired active context revision ([arXiv:2601.12030](https://arxiv.org/abs/2601.12030)) that preserves structural file content through compaction, preventing small-model repetitive loops at the root cause. Success signals and content previews survive compaction so models never lose evidence that tools succeeded
|
|
351
351
|
- **Memex experience archive** — large tool outputs archived during compaction with hash-based retrieval
|
|
352
|
-
- **Persistent memory** — learned patterns stored in `.
|
|
352
|
+
- **Persistent memory** — learned patterns stored in `.omnius/memory/` across sessions
|
|
353
353
|
- **Structured procedural memory (SQLite)** — replaces flat JSON with a full relational database: CRUD with soft-delete, revision tracking, embedding storage (float32 BLOB), bidirectional memory linking with confidence scores. Inspired by [ExpeL](https://arxiv.org/abs/2308.10144) (contrastive extraction) and [TIMG](https://arxiv.org/abs/2603.10600) (structured procedural format). 79 unit tests
|
|
354
354
|
- **Semantic memory search** — vector embeddings via [Ollama /api/embed](https://ollama.com) (nomic-embed-text, 768-dim) with cosine similarity search over stored memories. Auto-generates embeddings on memory creation. Auto-links related memories when similarity > 0.6. Graceful fallback to text search when Ollama unavailable
|
|
355
355
|
- **LLM-based memory extraction** — post-task, the LLM itself extracts structured procedural memories (CATEGORY/TRIGGER/LESSON/STEPS) instead of copying raw error text verbatim. Based on [ExpeL](https://arxiv.org/abs/2308.10144) and [AWM](https://arxiv.org/abs/2409.07429) patterns
|
|
@@ -357,13 +357,13 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
|
|
|
357
357
|
- **IPFS sharing surface** — `/ipfs` status page with peer info + identity kernel metrics + memory sentiment. `/ipfs pin <CID>` to pin remote agent content. `/ipfs publish` to share identity kernel. `/ipfs share tool/skill` to publish agent-created tools with secret stripping. `/ipfs import <CID>` to retrieve shared content
|
|
358
358
|
- **Fortemi-React bridge** — `/fortemi start/status/stop` connects to [fortemi-react](https://github.com/robit-man/fortemi-react) (browser-first PGlite+pgvector knowledge system) via JWT auth. Proxy tools: `fortemi_capture`, `fortemi_search`, `fortemi_list`, `fortemi_get` auto-register when bridge is connected
|
|
359
359
|
- **Content ingestion** — `/ingest <file>` imports audio (transcribe via Whisper), PDF (pdftotext), or text files into structured memory with 800-char/100-overlap chunking (matches fortemi pattern)
|
|
360
|
-
- **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.
|
|
361
|
-
- **Node visualization** — [
|
|
360
|
+
- **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.omnius/images/`
|
|
361
|
+
- **Node visualization** — [omnius.nexus](https://github.com/robit-man/omnius.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents
|
|
362
362
|
- **TTS sanitizer** — strips markdown syntax (`##`, `**`, `` ` ``), emoji (prevents "white heavy checkmark"), box-drawing chars, and ANSI codes before feeding to ALL TTS engines
|
|
363
363
|
- **LuxTTS gapless playback** — look-ahead pre-synthesis pipeline: next chunk synthesizes while current plays, eliminating inter-sentence gaps. Jetson ARM support with NVIDIA's prebuilt PyTorch wheel
|
|
364
364
|
- **Unified color scheme** — `ui.primary` (252), `ui.error` (198/magenta), `ui.warn` (214/orange), `ui.accent` (178/yellow) applied consistently across all TUI surfaces
|
|
365
365
|
- **Clickable header buttons** — `help`, `voice`, `cohere`, `model` buttons on banner row 3 with hover/click visual states. OSC 8 hyperlinks for pointer cursor. Mouse click fires the slash command directly
|
|
366
|
-
- **Dynamic terminal title** — updates with current task + version: `"fix auth bug ·
|
|
366
|
+
- **Dynamic terminal title** — updates with current task + version: `"fix auth bug · Omnius v0.141.0"`
|
|
367
367
|
- **Session context persistence** — auto-saves context on task completion, manual `/context save|restore` across sessions
|
|
368
368
|
- **Self-learning** — auto-fetches docs from the web when encountering unfamiliar APIs
|
|
369
369
|
- **Seamless `/update`** — in-place update and reload with automatic context save/restore
|
|
@@ -412,20 +412,20 @@ Run Omnius as a headless service for CI/CD pipelines, automation, and enterprise
|
|
|
412
412
|
### Non-Interactive Mode
|
|
413
413
|
|
|
414
414
|
```bash
|
|
415
|
-
|
|
416
|
-
|
|
417
|
-
|
|
415
|
+
omnius "fix all lint errors" --non-interactive # Run task, exit when done
|
|
416
|
+
omnius "generate API docs" --json # Structured JSON output (no ANSI)
|
|
417
|
+
omnius "run security audit" --background # Detached background job
|
|
418
418
|
```
|
|
419
419
|
|
|
420
420
|
### Background Jobs
|
|
421
421
|
|
|
422
422
|
```bash
|
|
423
|
-
|
|
424
|
-
|
|
425
|
-
|
|
423
|
+
omnius "migrate database" --background # Returns job ID immediately
|
|
424
|
+
omnius status job-abc123 # Check job progress
|
|
425
|
+
omnius jobs # List all running/completed jobs
|
|
426
426
|
```
|
|
427
427
|
|
|
428
|
-
Jobs run as detached processes — survive terminal disconnection. Output saved to `.
|
|
428
|
+
Jobs run as detached processes — survive terminal disconnection. Output saved to `.omnius/jobs/{id}.json`.
|
|
429
429
|
|
|
430
430
|
### JSON Output Mode
|
|
431
431
|
|
|
@@ -441,15 +441,15 @@ Pipe to `jq`, ingest into monitoring systems, or feed to other agents.
|
|
|
441
441
|
### Process Management
|
|
442
442
|
|
|
443
443
|
```bash
|
|
444
|
-
/destroy processes # Kill orphaned
|
|
445
|
-
/destroy processes --global # Kill ALL orphaned
|
|
444
|
+
/destroy processes # Kill orphaned Omnius processes (local project)
|
|
445
|
+
/destroy processes --global # Kill ALL orphaned Omnius processes system-wide
|
|
446
446
|
```
|
|
447
447
|
|
|
448
|
-
Shows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale
|
|
448
|
+
Shows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale Omnius instances.
|
|
449
449
|
|
|
450
450
|
### REST API Service (Port 11435)
|
|
451
451
|
|
|
452
|
-
Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — installed automatically by `npm i -g omnius` (systemd user unit on Linux, launchd on macOS, scheduled task on Windows). It exposes the **full
|
|
452
|
+
Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — installed automatically by `npm i -g omnius` (systemd user unit on Linux, launchd on macOS, scheduled task on Windows). It exposes the **full Omnius capability surface** through standards most organizations expect:
|
|
453
453
|
|
|
454
454
|
- **OpenAI / Ollama drop-in** — `/v1/chat`, `/v1/chat/completions`, `/v1/embeddings`, `/v1/models` are wire-compatible with both ecosystems
|
|
455
455
|
- **API discovery** — `GET /help` returns a full human and agent-readable guide with quickstart curl commands, all 70+ endpoints by category, MCP integration instructions, and auth documentation
|
|
@@ -464,19 +464,19 @@ Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — inst
|
|
|
464
464
|
- **`X-Request-ID`** echoed or generated for correlation
|
|
465
465
|
- **SSE event bus** at `/v1/events` with optional `?type=foo.*` filter, tagged with `aims:control` for auditors
|
|
466
466
|
- **Bearer auth + scoped keys** (`read` / `run` / `admin`) and OIDC JWT support
|
|
467
|
-
- **Per-key concurrency limits** (`maxJobs` in `
|
|
467
|
+
- **Per-key concurrency limits** (`maxJobs` in `OMNIUS_API_KEYS` is now actually enforced)
|
|
468
468
|
- **Atomic job record writes** with 64-bit job IDs (no race conditions)
|
|
469
469
|
- **OpenAPI 3.0** at `/openapi.json` and Swagger UI at `/docs`
|
|
470
470
|
- **Web chat UI** at `/`
|
|
471
471
|
|
|
472
|
-
> **Daemon auto-start.** After `npm i -g omnius`, the daemon comes online automatically. Verify with `systemctl --user status omnius-daemon` (Linux) or `launchctl print gui/$(id -u)/ai.omnius.daemon` (macOS). Opt out with `
|
|
472
|
+
> **Daemon auto-start.** After `npm i -g omnius`, the daemon comes online automatically. Verify with `systemctl --user status omnius-daemon` (Linux) or `launchctl print gui/$(id -u)/ai.omnius.daemon` (macOS). Opt out with `OMNIUS_SKIP_DAEMON_INSTALL=1 npm i -g omnius`.
|
|
473
473
|
|
|
474
474
|
```bash
|
|
475
475
|
# Manually run the server (the daemon already does this for you)
|
|
476
|
-
|
|
477
|
-
|
|
478
|
-
|
|
479
|
-
|
|
476
|
+
omnius serve # Start on default port 11435
|
|
477
|
+
omnius serve --port 9999 # Custom port
|
|
478
|
+
OMNIUS_API_KEY=mysecret omnius serve # Single admin key
|
|
479
|
+
OMNIUS_API_KEYS="key1:admin:alice:30:50000:5,key2:run:ci:60::3,key3:read:grafana" omnius serve # Scoped multi-key with rpm:tpd:maxjobs
|
|
480
480
|
```
|
|
481
481
|
|
|
482
482
|
> **Every example below is verified against `omnius@0.187.189` on a live daemon.** Examples from earlier versions are deprecated.
|
|
@@ -486,7 +486,7 @@ OA_API_KEYS="key1:admin:alice:30:50000:5,key2:run:ci:60::3,key3:read:grafana" oa
|
|
|
486
486
|
Control who can reach the daemon and where it binds:
|
|
487
487
|
|
|
488
488
|
- TUI commands: `/access loopback|lan|any`, `/host <host[:port]>`, `/network config` (interactive), `--local` to save per‑project.
|
|
489
|
-
- Environment: `
|
|
489
|
+
- Environment: `OMNIUS_ACCESS=loopback|lan|any`, `OMNIUS_HOST=host[:port]`.
|
|
490
490
|
- See Configuration → [Network Access & Binding](#network-access--binding) for full details and security guidance.
|
|
491
491
|
|
|
492
492
|
#### Working Directory
|
|
@@ -534,12 +534,12 @@ curl http://localhost:11435/version
|
|
|
534
534
|
curl http://localhost:11435/metrics
|
|
535
535
|
```
|
|
536
536
|
```
|
|
537
|
-
# HELP
|
|
538
|
-
# TYPE
|
|
539
|
-
|
|
540
|
-
|
|
541
|
-
|
|
542
|
-
|
|
537
|
+
# HELP omnius_requests_total Total HTTP requests
|
|
538
|
+
# TYPE omnius_requests_total counter
|
|
539
|
+
omnius_requests_total{method="POST",path="/v1/chat/completions",status="200"} 47
|
|
540
|
+
omnius_tokens_in_total 12450
|
|
541
|
+
omnius_tokens_out_total 8230
|
|
542
|
+
omnius_errors_total 0
|
|
543
543
|
```
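For readers who want to consume this output without a Prometheus server, here is a minimal parsing sketch. It handles only the subset of the text exposition format shown above (comment lines, optional labels, a numeric value, no timestamps), and `parse_metrics` is a hypothetical helper, not part of Omnius.

```python
import re

# The sample output shown in the README.
SAMPLE = """# HELP omnius_requests_total Total HTTP requests
# TYPE omnius_requests_total counter
omnius_requests_total{method="POST",path="/v1/chat/completions",status="200"} 47
omnius_tokens_in_total 12450
omnius_tokens_out_total 8230
omnius_errors_total 0
"""

# name, optional {labels}, then a single value token.
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples; comments are skipped."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and blank lines
        match = LINE_RE.match(line)
        if match is None:
            continue  # outside the subset this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

For production scraping, a real Prometheus client or server is the safer choice, since the full format also allows timestamps and escaped label values.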
|
|
544
544
|
|
|
545
545
|
#### OpenAI-Compatible Inference
|
|
@@ -592,7 +592,7 @@ data: [DONE]
|
|
|
592
592
|
|
|
593
593
|
#### Agentic Task Execution
|
|
594
594
|
|
|
595
|
-
The unique
|
|
595
|
+
The unique Omnius capability — submit a coding task and get an autonomous agent loop.
|
|
596
596
|
|
|
597
597
|
```bash
|
|
598
598
|
# Run task in your current directory
|
|
@@ -730,7 +730,7 @@ curl -X POST http://localhost:11435/v1/commands/destroy \
|
|
|
730
730
|
|
|
731
731
|
```bash
|
|
732
732
|
# Multi-key setup: read (monitoring), run (CI), admin (ops)
|
|
733
|
-
|
|
733
|
+
OMNIUS_API_KEYS="grafana-key:read:grafana,ci-key:run:github-actions,ops-key:admin:ops-team" omnius serve
|
|
734
734
|
```
|
|
735
735
|
|
|
736
736
|
| Scope | Can do | Cannot do |
|
|
@@ -830,21 +830,21 @@ curl -X DELETE -H "Authorization: Bearer $ADMIN_KEY" \
|
|
|
830
830
|
|
|
831
831
|
The daemon is built for **unbounded concurrent requests** with per-key enforcement. Every agentic task (`/v1/run`, `/v1/chat`, `/api/chat`, `/api/generate`) spawns its own subprocess, so multiple jobs run in true parallel — same model or different models, same or different profiles, same or different sandbox modes.
|
|
832
832
|
|
|
833
|
-
**Per-key concurrency limits** are enforced from the `
|
|
833
|
+
**Per-key concurrency limits** are enforced from the `OMNIUS_API_KEYS` env var:
|
|
834
834
|
|
|
835
835
|
```bash
|
|
836
836
|
# key:scope:user:rpm:tpd:maxJobs
|
|
837
|
-
|
|
837
|
+
OMNIUS_API_KEYS="ci-key:run:github-actions:60:100000:5, \
|
|
838
838
|
ops-key:admin:ops:120:500000:20, \
|
|
839
839
|
read-key:read:grafana:600::"
|
|
840
|
-
|
|
840
|
+
omnius serve
|
|
841
841
|
```
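To make the `key:scope:user:rpm:tpd:maxJobs` format concrete, the sketch below parses such a string and checks a concurrency limit. It is an illustration only, not the daemon's code: `ApiKey`, `parse_api_keys`, and `check_concurrency` are hypothetical names. Treating an empty or missing field as "no limit" is an assumption, and the problem body copies only the fields this README shows for the 429 response.

```python
from dataclasses import dataclass
from typing import Dict, Optional

VALID_SCOPES = ("read", "run", "admin")


@dataclass
class ApiKey:
    key: str
    scope: str
    user: str
    rpm: Optional[int] = None       # requests per minute
    tpd: Optional[int] = None       # tokens per day
    max_jobs: Optional[int] = None  # concurrent in-flight agentic tasks


def _opt_int(field: str) -> Optional[int]:
    # Assumption: an empty field (e.g. "60::3") means "unset".
    return int(field) if field else None


def parse_api_keys(spec: str) -> Dict[str, ApiKey]:
    """Parse a comma-separated list of key:scope:user[:rpm[:tpd[:maxJobs]]] entries."""
    keys: Dict[str, ApiKey] = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = [p.strip() for p in entry.split(":")]
        parts += [""] * (6 - len(parts))  # pad optional trailing fields
        key, scope, user, rpm, tpd, max_jobs = parts[:6]
        if scope not in VALID_SCOPES:
            raise ValueError(f"unknown scope {scope!r} in entry {entry!r}")
        keys[key] = ApiKey(key, scope, user, _opt_int(rpm), _opt_int(tpd), _opt_int(max_jobs))
    return keys


def check_concurrency(api_key: ApiKey, in_flight: int) -> Optional[dict]:
    """Return None if another job may start, else a problem body with status 429."""
    if api_key.max_jobs is None or in_flight < api_key.max_jobs:
        return None
    return {
        "type": "https://omnius.nexus/problems/rate-limited",
        "title": "Concurrent job limit exceeded",
        "status": 429,
        "detail": f"Concurrent job limit exceeded for {api_key.user}: "
                  f"{in_flight}/{api_key.max_jobs}",
    }
```

A client that receives such a 429 would typically wait for one of its in-flight jobs to finish before resubmitting.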
|
|
842
842
|
|
|
843
843
|
The 6th field is `maxJobs` — the maximum number of **concurrent** (in-flight) agentic tasks for that key. When exceeded, the daemon returns **RFC 7807 `429 Too Many Requests`**:
|
|
844
844
|
|
|
845
845
|
```json
|
|
846
846
|
{
|
|
847
|
-
"type": "https://
|
|
847
|
+
"type": "https://omnius.nexus/problems/rate-limited",
|
|
848
848
|
"title": "Concurrent job limit exceeded",
|
|
849
849
|
"status": 429,
|
|
850
850
|
"detail": "Concurrent job limit exceeded for github-actions: 5/5",
|
|
@@ -871,7 +871,7 @@ done
|
|
|
871
871
|
wait
|
|
872
872
|
```
|
|
873
873
|
|
|
874
|
-
Each subprocess inherits a **clean env** — `
|
|
874
|
+
Each subprocess inherits a **clean env** — `OMNIUS_DAEMON` and `OMNIUS_PORT` are explicitly stripped so the child doesn't re-enter daemon mode. Fixed in v0.187.189 (root cause of the earlier "Task incomplete (0 turns, 0 tool calls)" bug).
|
|
875
875
|
|
|
876
876
|
**Observing parallelism live** — subscribe to the event bus to watch every job lifecycle event:
|
|
877
877
|
|
|
@@ -932,7 +932,7 @@ Also cleans up the Docker container if the job was spawned with `"sandbox":"cont
|
|
|
932
932
|
| Method | Path | Auth | Description |
|
|
933
933
|
|--------|------|------|-------------|
|
|
934
934
|
| POST | `/v1/chat` | run | Full agent under the hood, OpenAI chat.completion shape. Default = tools=true (subprocess agent). Set `tools:false` for direct backend bypass. Supports `timeout_s` body field (default 180s). Non-streaming path has a safety SIGTERM→SIGKILL after `timeout_s + 30s`. |
|
|
935
|
-
| POST | `/api/chat` | run | **Ollama-compatible alias** — same handler as `/v1/chat`. Accepts both
|
|
935
|
+
| POST | `/api/chat` | run | **Ollama-compatible alias** — same handler as `/v1/chat`. Accepts both Omnius-shape (`{message, model}`) and Ollama-shape (`{model, messages: [...]}`) bodies. Returns OpenAI `chat.completion` shape on success and failure (failure uses `finish_reason:"error"`). |
|
|
936
936
|
| POST | `/v1/generate` | run | **One-off completion** — same agent stack as `/v1/chat` but no session history. Returns Ollama-shape `{model, response, done, total_duration}`. |
|
|
937
937
|
| POST | `/api/generate` | run | **Ollama-compatible alias** of `/v1/generate`. Drop-in for Ollama `/api/generate`. |
|
|
938
938
|
| GET | `/v1/chat/sessions` | read | List active chat sessions |
|
|
@@ -999,7 +999,7 @@ Also cleans up the Docker container if the job was spawned with `"sandbox":"cont
|
|
|
999
999
|
**Sessions + context**
|
|
1000
1000
|
| Method | Path | Auth | Description |
|
|
1001
1001
|
|--------|------|------|-------------|
|
|
1002
|
-
| GET | `/v1/sessions` | read |
|
|
1002
|
+
| GET | `/v1/sessions` | read | Omnius task session archive |
|
|
1003
1003
|
| GET | `/v1/sessions/:id` | read | Session history |
|
|
1004
1004
|
| GET | `/v1/context` | read | Show current session context |
|
|
1005
1005
|
| POST | `/v1/context/save` | run | Save a context entry |
|
|
@@ -1066,15 +1066,15 @@ The chat endpoint is mounted at **two paths on port 11435**:
|
|
|
1066
1066
|
|
|
1067
1067
|
| Path | Purpose |
|
|
1068
1068
|
|------|---------|
|
|
1069
|
-
| `POST /v1/chat` |
|
|
1069
|
+
| `POST /v1/chat` | Omnius-native path |
|
|
1070
1070
|
| `POST /api/chat` | **Ollama-compatible alias** — same handler, so clients pointing at Ollama can be flipped over by changing only the port (`11434` → `11435`) |
|
|
1071
1071
|
|
|
1072
|
-
It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/chat`**. The endpoint runs the full
|
|
1072
|
+
It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/chat`**. The endpoint runs the full Omnius agent (tools, multi-agent, memory, skills) under the hood and returns an **OpenAI `chat.completion`-shaped response** so any client SDK can use it without modification.
|
|
1073
1073
|
|
|
1074
1074
|
**Both body shapes are accepted** on either path:
|
|
1075
1075
|
|
|
1076
1076
|
```jsonc
|
|
1077
|
-
//
|
|
1077
|
+
// Omnius-native
|
|
1078
1078
|
{"message": "hello", "model": "qwen3.5:9b", "stream": false}
|
|
1079
1079
|
|
|
1080
1080
|
// Ollama-native (the `messages` array; the last user message is extracted)
|
|
@@ -1082,18 +1082,18 @@ It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/
|
|
|
1082
1082
|
```
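The dual-shape handling described above can be pictured as a small normalization step. The sketch below is hypothetical, not the handler Omnius ships: `normalize_chat_body` is an invented name. Giving `message` precedence when both fields are present, and defaulting `stream` to false, are both assumptions.

```python
def normalize_chat_body(body: dict) -> dict:
    """Reduce either accepted body shape to {message, model, stream}."""
    model = body.get("model")
    stream = bool(body.get("stream", False))  # assumption: non-streaming by default

    if "message" in body:
        # Omnius-native shape: a single `message` string.
        message = body["message"]
    else:
        # Ollama-native shape: take the last message whose role is "user".
        messages = body.get("messages") or []
        user_messages = [m for m in messages if m.get("role") == "user"]
        if not user_messages:
            raise ValueError("body has neither `message` nor a user entry in `messages`")
        message = user_messages[-1].get("content", "")

    return {"message": message, "model": model, "stream": stream}
```

For example, `{"model": "qwen3.5:9b", "messages": [{"role": "user", "content": "hi"}]}` and `{"message": "hi", "model": "qwen3.5:9b"}` normalize to the same dictionary.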
|
|
1083
1083
|
|
|
1084
1084
|
> **Two execution modes:**
|
|
1085
|
-
> - **Default (`tools` unset or `tools: true`)** — full agent: spawns the
|
|
1085
|
+
> - **Default (`tools` unset or `tools: true`)** — full agent: spawns the Omnius subprocess with the entire 82-tool set, runs the agent loop, returns the final answer with `tool_calls` metadata.
|
|
1086
1086
|
> - **Direct (`tools: false`)** — fast path: bypasses the agent and forwards straight to the configured backend (Ollama/vLLM) using the session history. Useful for plain chat without tools.
|
|
1087
1087
|
|
|
1088
1088
|
**Safety timeout** — every non-streaming request is bounded by `timeout_s` (default **180s**). If the agent subprocess doesn't close in `timeout_s + 30s`, the daemon SIGTERMs (then SIGKILLs) it and returns an OpenAI-shaped error with `finish_reason:"error"` and a clear explanation. No more hung requests.
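The SIGTERM-then-SIGKILL pattern described here can be sketched with Python's standard `subprocess` module. This is a generic illustration, not the daemon's code: the function name, the `kill_wait_s` parameter, and the simplified return dictionary are assumptions. Only the 180s default, the 30s grace period, and the `finish_reason:"error"` convention come from the README.

```python
import subprocess
import sys


def run_with_safety_timeout(cmd, timeout_s=180.0, grace_s=30.0, kill_wait_s=5.0):
    """Run `cmd`; if it has not exited after timeout_s + grace_s, terminate it,
    then kill it if it ignores termination. Returns a simplified result dict."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        out, _err = proc.communicate(timeout=timeout_s + grace_s)
        # Exit-code handling is omitted in this sketch.
        return {"finish_reason": "stop", "content": out}
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX
        try:
            proc.communicate(timeout=kill_wait_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX
            proc.communicate()  # reap the process and drain its pipes
        return {
            "finish_reason": "error",
            "content": f"subprocess exceeded {timeout_s + grace_s:.1f}s and was terminated",
        }


if __name__ == "__main__":
    # A child that sleeps far longer than the (deliberately tiny) limits.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_safety_timeout(sleeper, timeout_s=0.2, grace_s=0.1))
```

Sending the termination signal first gives the child a chance to flush output and clean up before it is killed outright.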
|
|
1089
1089
|
|
|
1090
|
-
**Flip Ollama →
|
|
1090
|
+
**Flip Ollama → Omnius by port alone** — this is verified to work via `scripts/omnius-vs-ollama-chat-compare.sh` (see [Live Comparison](#live-comparison-ollama-vs-omnius-full-agent) below):
|
|
1091
1091
|
|
|
1092
1092
|
```bash
|
|
1093
1093
|
# Before (Ollama)
|
|
1094
1094
|
curl -s http://127.0.0.1:11434/api/chat -d '{"model":"qwen3.5:9b","messages":[{"role":"user","content":"hi"}],"stream":false}'
|
|
1095
1095
|
|
|
1096
|
-
# After (
|
|
1096
|
+
# After (Omnius with full agent) — only port changed
|
|
1097
1097
|
curl -s http://127.0.0.1:11435/api/chat -d '{"model":"qwen3.5:9b","messages":[{"role":"user","content":"hi"}],"stream":false}'
|
|
1098
1098
|
```
|
|
1099
1099
|
|
|
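The two accepted body shapes and the safety timeout described in this hunk can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the daemon's actual code: `extract_prompt` and `kill_deadline_s` are hypothetical helper names, and treating `timeout_s` as a request-body field is an assumption. Only the behaviour itself (last user message is extracted, 180 s default, 30 s grace before the subprocess is terminated) comes from the text.

```python
from typing import Any, Dict

DEFAULT_TIMEOUT_S = 180  # documented default for non-streaming requests
KILL_GRACE_S = 30        # documented grace period before SIGTERM


def extract_prompt(body: Dict[str, Any]) -> str:
    """Return the user prompt from either accepted body shape (hypothetical helper)."""
    # Omnius-native shape: {"message": "..."}
    if isinstance(body.get("message"), str):
        return body["message"]
    # Ollama-native shape: {"messages": [...]}; the last user message is used
    for msg in reversed(body.get("messages", [])):
        if msg.get("role") == "user":
            return msg.get("content", "")
    raise ValueError("body has neither 'message' nor a user entry in 'messages'")


def kill_deadline_s(body: Dict[str, Any]) -> int:
    """Seconds after which the agent subprocess would be terminated.

    Assumes `timeout_s` is read from the request body; the text only
    states the default and the 30-second grace.
    """
    return int(body.get("timeout_s", DEFAULT_TIMEOUT_S)) + KILL_GRACE_S
```

With the defaults, a request that never completes would be cut off after 210 seconds.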
@@ -1197,32 +1197,32 @@ curl -s http://localhost:11435/v1/chat \
|
|
|
1197
1197
|
|
|
1198
1198
|
Sessions expire after 30 minutes of inactivity. List active sessions: `GET /v1/chat/sessions`.
|
|
1199
1199
|
|
|
1200
|
-
#### Live Comparison: Ollama vs
|
|
1200
|
+
#### Live Comparison: Ollama vs Omnius Full Agent
|
|
1201
1201
|
|
|
1202
|
-
The repo ships a reproducible side-by-side harness at [`scripts/
|
|
1202
|
+
The repo ships a reproducible side-by-side harness at [`scripts/omnius-vs-ollama-chat-compare.sh`](scripts/omnius-vs-ollama-chat-compare.sh). It runs **5 tool-call-required prompts** × **4 phases** (Ollama non-stream, Omnius non-stream, Ollama stream, Omnius stream) = **20 runs per invocation** with the same model and the same `/api/chat` path on both ports.
|
|
1203
1203
|
|
|
1204
1204
|
```bash
|
|
1205
|
-
MODEL=qwen3.5:9b bash scripts/
|
|
1205
|
+
MODEL=qwen3.5:9b bash scripts/omnius-vs-ollama-chat-compare.sh
|
|
1206
1206
|
```
|
|
1207
1207
|
|
|
1208
1208
|
**Results from `omnius@0.187.191` with `qwen3.5:9b`** (all 20 runs completed, zero timeouts):
|
|
1209
1209
|
|
|
1210
1210
|
| # | Prompt | Ollama (bare) | Omnius (full agent) | Winner |
|
|
1211
1211
|
|---|---|---|---|---|
|
|
1212
|
-
| 1 | "Latest stable Node.js version + source URL" | ❌ **v22.10.0** — hallucinated from Aug-2024 training cutoff | ✅ **v25.9.0** fetched from `nodejs.org/download/current`, **3 tool calls** (`web_search` → `web_fetch` → `task_complete`) | **
|
|
1213
|
-
| 2 | "Biggest tech news this week + source URL" | ❌ "I don't have real-time access" + generic AI trend guess | ✅ **Anthropic Mythos, Intel Terafab, Apple foldable, Russian router breach, Firmus $5.5B** — sourced from TechCrunch, **4 tool calls** | **
|
|
1214
|
-
| 3 | "Current OS, CPU cores, free memory — use shell tools" | ❌ Confabulated **"Linux / 8 cores / 6.1 GB"** (all wrong) | ✅ **Ubuntu 24.04.2 / 48 cores / 120 GB** (all correct), **6–7 shell tool calls** | **
|
|
1215
|
-
| 4 | "List files in cwd, count top level, most recent" | ❌ "I cannot access your filesystem" | ✅ **20 files, 50+ dirs, `.claude.json` (81 KB, 09:09 UTC)** via `list_directory`, **2 tool calls** | **
|
|
1216
|
-
| 5 | "2022 FIFA World Cup final winner + score" (both endpoints have this in training data) | ✅ Argentina 4–2 France | ✅ Argentina 3–3 France, **4–2 on penalties at Lusail Stadium, Dec 18 2022** — grounded with 4 tool calls | **Tie (
|
|
1212
|
+
| 1 | "Latest stable Node.js version + source URL" | ❌ **v22.10.0** — hallucinated from Aug-2024 training cutoff | ✅ **v25.9.0** fetched from `nodejs.org/download/current`, **3 tool calls** (`web_search` → `web_fetch` → `task_complete`) | **Omnius** |
|
|
1213
|
+
| 2 | "Biggest tech news this week + source URL" | ❌ "I don't have real-time access" + generic AI trend guess | ✅ **Anthropic Mythos, Intel Terafab, Apple foldable, Russian router breach, Firmus $5.5B** — sourced from TechCrunch, **4 tool calls** | **Omnius** |
|
|
1214
|
+
| 3 | "Current OS, CPU cores, free memory — use shell tools" | ❌ Confabulated **"Linux / 8 cores / 6.1 GB"** (all wrong) | ✅ **Ubuntu 24.04.2 / 48 cores / 120 GB** (all correct), **6–7 shell tool calls** | **Omnius** |
|
|
1215
|
+
| 4 | "List files in cwd, count top level, most recent" | ❌ "I cannot access your filesystem" | ✅ **20 files, 50+ dirs, `.claude.json` (81 KB, 09:09 UTC)** via `list_directory`, **2 tool calls** | **Omnius** |
|
|
1216
|
+
| 5 | "2022 FIFA World Cup final winner + score" (both endpoints have this in training data) | ✅ Argentina 4–2 France | ✅ Argentina 3–3 France, **4–2 on penalties at Lusail Stadium, Dec 18 2022** — grounded with 4 tool calls | **Tie (Omnius more detailed)** |
|
|
1217
1217
|
|
|
1218
1218
|
**Latency profile** (wall clock, 5-prompt median):
|
|
1219
1219
|
|
|
1220
|
-
| Phase | Ollama |
|
|
1220
|
+
| Phase | Ollama | Omnius agent | Omnius overhead |
|
|
1221
1221
|
|---|---|---|---|
|
|
1222
1222
|
| Non-streaming | 12–18s | 24–42s | 12–26s (agent loop + tool calls) |
|
|
1223
1223
|
| Streaming SSE | 11–16s | 24–56s | 10–40s |
|
|
1224
1224
|
|
|
1225
|
-
**Streaming parser validation** — every
|
|
1225
|
+
**Streaming parser validation** — every Omnius stream delivered:
|
|
1226
1226
|
- Live intermediate `tool_call` events mid-stream (e.g. `['web_search', 'web_fetch', 'task_complete']`)
|
|
1227
1227
|
- OpenAI `chat.completion.chunk` deltas with `id`, `model`, `finish_reason`
|
|
1228
1228
|
- Clean `data: [DONE]` termination with `finish_reason:"stop"`
|
|
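The streaming checks listed above can be reproduced client-side with a small parser. The sketch below assumes the standard OpenAI `chat.completion.chunk` layout (`choices[].delta.content`, `choices[].finish_reason`) and a `data: [DONE]` sentinel. The wire format of the intermediate `tool_call` events is not specified in this section, so any payload without `choices` is simply skipped. `collect_stream` is a hypothetical name.

```python
import json
from typing import Iterable, List, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[str], bool]:
    """Fold SSE lines into (text, finish_reason, saw_done)."""
    parts: List[str] = []
    finish_reason: Optional[str] = None
    saw_done = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            break
        chunk = json.loads(payload)
        # Events without "choices" (e.g. tool-call notifications) are ignored here
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, saw_done
```

A stream passes the three checks when the returned text is non-empty, `finish_reason` is `"stop"`, and `saw_done` is true.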
@@ -1230,12 +1230,12 @@ MODEL=qwen3.5:9b bash scripts/oa-vs-ollama-chat-compare.sh
|
|
|
1230
1230
|
The harness is **reproducible** — rerun it after any `/v1/chat` change to catch regressions:
|
|
1231
1231
|
|
|
1232
1232
|
```bash
|
|
1233
|
-
MODEL=qwen3.5:4b bash scripts/
|
|
1234
|
-
MODEL=qwen3.5:9b
|
|
1235
|
-
MODEL=qwen3.5:32b
|
|
1233
|
+
MODEL=qwen3.5:4b bash scripts/omnius-vs-ollama-chat-compare.sh # faster tier for quick smoke
|
|
1234
|
+
MODEL=qwen3.5:9b OMNIUS_TIMEOUT=300 bash scripts/omnius-vs-ollama-chat-compare.sh # default
|
|
1235
|
+
MODEL=qwen3.5:32b OMNIUS_TIMEOUT=600 bash scripts/omnius-vs-ollama-chat-compare.sh # higher tier
|
|
1236
1236
|
```
|
|
1237
1237
|
|
|
1238
|
-
**Bottom line**: for any question that needs fresh data, system access, or filesystem visibility — bare Ollama is wrong or refuses;
|
|
1238
|
+
**Bottom line**: for any question that needs fresh data, system access, or filesystem visibility — bare Ollama is wrong or refuses; Omnius with the full agent is correct with citations. That's the differentiator captured live in the harness output.
|
|
1239
1239
|
|
|
1240
1240
|
#### One-Off Completions — `/api/generate` + `/v1/generate`
|
|
1241
1241
|
|
|
@@ -1246,11 +1246,11 @@ Drop-in for **Ollama `/api/generate`**. Same body shape, same response shape, sa
|
|
|
1246
1246
|
curl -s http://127.0.0.1:11434/api/generate \
|
|
1247
1247
|
-d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
|
|
1248
1248
|
|
|
1249
|
-
#
|
|
1249
|
+
# Omnius with full agent — only port changed
|
|
1250
1250
|
curl -s http://127.0.0.1:11435/api/generate \
|
|
1251
1251
|
-d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
|
|
1252
1252
|
|
|
1253
|
-
#
|
|
1253
|
+
# Omnius direct backend bypass (fast path, no agent)
|
|
1254
1254
|
curl -s http://127.0.0.1:11435/api/generate \
|
|
1255
1255
|
-d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false,"tools":false}'
|
|
1256
1256
|
```
|
|
@@ -1275,7 +1275,7 @@ curl -s http://127.0.0.1:11435/api/generate \
|
|
|
1275
1275
|
}
|
|
1276
1276
|
```
|
|
1277
1277
|
|
|
1278
|
-
The `_oa` extension block carries the
|
|
1278
|
+
The `_oa` extension block carries the Omnius-specific metadata (tool call count, agent duration, request ID for correlation with `/v1/audit`). Strict Ollama clients ignore unknown fields — no client changes required.
|
|
1279
1279
|
|
|
1280
1280
|
**Streaming** — set `"stream": true` and receive Ollama-style NDJSON chunks:
|
|
1281
1281
|
|
|
@@ -1351,18 +1351,18 @@ The `strength` and `lastRetrieved` fields are updated on every search — the st
|
|
|
1351
1351
|
|
|
1352
1352
|
#### Generate/Embed/Memory Test Harness
|
|
1353
1353
|
|
|
1354
|
-
A second harness at [`scripts/
|
|
1354
|
+
A second harness at [`scripts/omnius-vs-ollama-generate-embed-memory.sh`](scripts/omnius-vs-ollama-generate-embed-memory.sh) covers the four non-chat endpoint families:
|
|
1355
1355
|
|
|
1356
1356
|
```bash
|
|
1357
1357
|
MODEL=qwen3.5:9b EMBED_MODEL=nomic-embed-text \
|
|
1358
|
-
bash scripts/
|
|
1358
|
+
bash scripts/omnius-vs-ollama-generate-embed-memory.sh
|
|
1359
1359
|
```
|
|
1360
1360
|
|
|
1361
1361
|
**Tested results from `omnius@0.187.195`** (live, single run, `qwen3.5:9b` + `nomic-embed-text`):
|
|
1362
1362
|
|
|
1363
1363
|
**Part 1 — `/api/generate` one-off prompts**:
|
|
1364
1364
|
|
|
1365
|
-
| Prompt | Ollama |
|
|
1365
|
+
| Prompt | Ollama | Omnius direct | Omnius full agent |
|
|
1366
1366
|
|---|---|---|---|
|
|
1367
1367
|
| "TCP vs UDP in one sentence" | 26.8s — correct | 12.5s — correct | 43.8s — correct, **1 tool call** |
|
|
1368
1368
|
| "One-line Python square function" | 32.1s — correct | 12.2s — correct | ~3min — correct, **2 tool calls** |
|
|
@@ -1370,7 +1370,7 @@ MODEL=qwen3.5:9b EMBED_MODEL=nomic-embed-text \
|
|
|
1370
1370
|
|
|
1371
1371
|
**Part 2 — `/api/embed` cosine similarity sanity** (4 test sentences):
|
|
1372
1372
|
|
|
1373
|
-
Both Ollama and
|
|
1373
|
+
Both Ollama and Omnius emitted **identical 768-dim vectors** (same backend). Cosine similarity matrix:
|
|
1374
1374
|
|
|
1375
1375
|
```
|
|
1376
1376
|
France→Par Paris→Fran Germany→Be Bananas
|
|
@@ -1620,7 +1620,7 @@ curl -s -X POST http://localhost:11435/v1/files/read \
|
|
|
1620
1620
|
#### Sessions, Context, Cost, Sponsors, Nexus
|
|
1621
1621
|
|
|
1622
1622
|
```bash
|
|
1623
|
-
#
|
|
1623
|
+
# Omnius task session archive (not chat sessions)
|
|
1624
1624
|
curl -s 'http://localhost:11435/v1/sessions?limit=10'
|
|
1625
1625
|
curl -s http://localhost:11435/v1/sessions/{session_id}
|
|
1626
1626
|
|
|
@@ -1653,7 +1653,7 @@ curl -s -X POST http://localhost:11435/v1/files/read -d '{}'
|
|
|
1653
1653
|
```
|
|
1654
1654
|
```json
|
|
1655
1655
|
{
|
|
1656
|
-
"type": "https://
|
|
1656
|
+
"type": "https://omnius.nexus/problems/invalid-request",
|
|
1657
1657
|
"title": "Missing 'path'",
|
|
1658
1658
|
"status": 400,
|
|
1659
1659
|
"detail": "POST body must include {path: string, offset?: number, limit?: number}",
|
|
@@ -1697,7 +1697,7 @@ curl -s -o /dev/null -w '%{http_code}\n' \
|
|
|
1697
1697
|
|
|
1698
1698
|
#### Web Interface
|
|
1699
1699
|
|
|
1700
|
-
Open `http://localhost:11435/` in a browser when `
|
|
1700
|
+
Open `http://localhost:11435/` in a browser when `omnius serve` is running. Zero external dependencies — single self-contained HTML page.
|
|
1701
1701
|
|
|
1702
1702
|
**Tabs:**
|
|
1703
1703
|
- **Chat** — Conversational interface using `/v1/chat` with full tool access, session persistence, streaming responses, and collapsible tool call dropdowns
|
|
@@ -1718,7 +1718,7 @@ Open `http://localhost:11435/` in a browser when `oa serve` is running. Zero ext
|
|
|
1718
1718
|
- Token counter per conversation
|
|
1719
1719
|
- Conversation export (Markdown or JSON)
|
|
1720
1720
|
- GPU/VRAM detection with model compatibility recommendations
|
|
1721
|
-
- Per-provider token tracking (persisted to `.
|
|
1721
|
+
- Per-provider token tracking (persisted to `.omnius/usage/token-usage.json`)
|
|
1722
1722
|
|
|
1723
1723
|
### Enterprise Licensing
|
|
1724
1724
|
|
|
@@ -1797,16 +1797,16 @@ SUGGESTED NEXT STEP: A completed todo claims a missing artifact...
|
|
|
1797
1797
|
Prior `<world-state>` blocks are stripped before injecting the freshest one — only the current snapshot lives in context. Plan reconciliation uses `verifyCommand` + `declaredArtifacts` from the todo store + heuristic filename matching. Disk scan is gitignore-aware, capped at 200 files. Generic across stacks.
|
|
1798
1798
|
*Lit anchors*: MetaGPT (Hong et al. ICLR 2024) — SOP-encoded state representation; AlphaCodium (Ridnik et al. 2024) — symbol-aware iteration.
|
|
1799
1799
|
|
|
1800
|
-
Configurable via `
|
|
1800
|
+
Configurable via `OMNIUS_WORLD_STATE_INTERVAL` (default 8), `OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD` (default 5), `OMNIUS_WORLD_STATE_MAX_FILES` (default 200).
|
|
1801
1801
|
|
|
1802
1802
|
### REG-47 — Backward-pass critic on `task_complete`
|
|
1803
1803
|
|
|
1804
|
-
When the agent calls `task_complete` AND ≥ 1 file mutation occurred AND `
|
|
1804
|
+
When the agent calls `task_complete` AND ≥ 1 file mutation occurred AND `OMNIUS_BACKWARD_PASS=on`, the orchestrator spawns a dedicated CRITIC sub-agent against the same backend. The critic gets the diff + plan reconciliation + recent failures + a 10-point structural audit checklist (dead refs, missing imports, off-by-one, null-handling, stateful regex, hardcoded paths, untested code paths, plan-disk gaps, unresolved failures, generic-vs-specific drift) and votes:
|
|
1805
1805
|
- **approve** → task_complete proceeds, run terminates
|
|
1806
1806
|
- **request_changes** → issue feedback injected as a system message; agent loops to address
|
|
1807
1807
|
- **reject** → critical event; same as request_changes but with escalation marker
|
|
1808
1808
|
|
|
1809
|
-
Cycle-bounded (default 2 cycles before fail-soft). Default OFF — explicit opt-in via `
|
|
1809
|
+
Cycle-bounded (default 2 cycles before fail-soft). Default OFF — explicit opt-in via `OMNIUS_BACKWARD_PASS=on`.
|
|
1810
1810
|
*Lit anchors*: Self-Refine (Madaan et al. NeurIPS 2023) — +6-12% HumanEval correctness from a dedicated reviewer; CodeT (Chen et al. arXiv 2207.10397) — critic-contested implementer claims.
|
|
1811
1811
|
|
|
1812
1812
|
### REG-48 — Cross-file specification drift detection
|
|
@@ -1861,29 +1861,29 @@ Run-by-run progression of the orchestrator:
|
|
|
1861
1861
|
| #18 | 43/44/45/46/47 | killed @ ~30m, 8/9 phases done, test-debug stuck | 62 | ✓ | partial |
|
|
1862
1862
|
| **#19** | **43/44/45/46/47/48** | **completed cleanly** | **62** | **✓** | **6/6 pass** |
|
|
1863
1863
|
|
|
1864
|
-
Detailed archival report: [`.aiwg/
|
|
1864
|
+
Detailed archival report: [`.aiwg/omnius-eval/RESULTS-RUN-19.md`](.aiwg/omnius-eval/RESULTS-RUN-19.md).
|
|
1865
1865
|
|
|
1866
1866
|
### Configuration summary
|
|
1867
1867
|
|
|
1868
1868
|
```bash
|
|
1869
1869
|
# Defense activation (set in daemon env or systemd unit)
|
|
1870
|
-
|
|
1871
|
-
|
|
1872
|
-
|
|
1873
|
-
|
|
1874
|
-
|
|
1875
|
-
|
|
1876
|
-
|
|
1870
|
+
OMNIUS_BACKWARD_PASS=on # enable REG-47 critic (default: off)
|
|
1871
|
+
OMNIUS_BACKWARD_PASS_MAX_CYCLES=2 # max review iterations
|
|
1872
|
+
OMNIUS_BACKWARD_PASS_MIN_WRITES=1 # min file mutations to trigger review
|
|
1873
|
+
OMNIUS_BACKWARD_PASS_TIMEOUT_MS=120000 # critic call timeout
|
|
1874
|
+
OMNIUS_BACKWARD_PASS_MAX_TOKENS=4096 # critic response cap
|
|
1875
|
+
OMNIUS_BACKWARD_PASS_MAX_FILES=60 # max files in critic prompt
|
|
1876
|
+
OMNIUS_BACKWARD_PASS_MAX_FILE_PREVIEW=8000
|
|
1877
1877
|
|
|
1878
|
-
|
|
1879
|
-
|
|
1880
|
-
|
|
1878
|
+
OMNIUS_WORLD_STATE_INTERVAL=8 # REG-46 turn-cadence (default: 8)
|
|
1879
|
+
OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD=5 # REG-46 write-trigger (default: 5)
|
|
1880
|
+
OMNIUS_WORLD_STATE_MAX_FILES=200 # REG-46 disk-scan cap
|
|
1881
1881
|
|
|
1882
|
-
|
|
1883
|
-
|
|
1882
|
+
OMNIUS_WORLD_STATE_DRIFT=on # REG-48 drift detector (default: on)
|
|
1883
|
+
OMNIUS_DRIFT_ALIASES='{"~/":"src/"}' # extra path aliases (JSON)
|
|
1884
1884
|
|
|
1885
|
-
|
|
1886
|
-
|
|
1885
|
+
OMNIUS_RUN_RETENTION_H=24 # run-record GC (default: 24h, 0 disables)
|
|
1886
|
+
OMNIUS_TOOL_OVERRIDES='{"shell":{"off_device_allowed":true}}' # per-tool security overrides
|
|
1887
1887
|
```
|
|
1888
1888
|
|
|
1889
1889
|
|
|
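For readers wiring these variables into their own tooling, the sketch below shows one way to read a subset of them with the defaults listed above and to decide whether the REG-47 critic would be triggered. It is an illustration only: `DefenseConfig`, `load_config`, and `critic_should_run` are hypothetical names, the exact-match check on `on` is an assumption, and the real orchestrator may parse these values differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DefenseConfig:
    backward_pass: bool
    max_cycles: int
    min_writes: int
    world_state_interval: int
    world_state_write_threshold: int
    world_state_max_files: int


def load_config(env: Mapping[str, str] = os.environ) -> DefenseConfig:
    """Read the documented variables, falling back to the documented defaults."""
    def as_int(name: str, default: int) -> int:
        return int(env.get(name, default))

    return DefenseConfig(
        # Assumption: only the literal value "on" enables the critic
        backward_pass=env.get("OMNIUS_BACKWARD_PASS", "off") == "on",
        max_cycles=as_int("OMNIUS_BACKWARD_PASS_MAX_CYCLES", 2),
        min_writes=as_int("OMNIUS_BACKWARD_PASS_MIN_WRITES", 1),
        world_state_interval=as_int("OMNIUS_WORLD_STATE_INTERVAL", 8),
        world_state_write_threshold=as_int("OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD", 5),
        world_state_max_files=as_int("OMNIUS_WORLD_STATE_MAX_FILES", 200),
    )


def critic_should_run(cfg: DefenseConfig, file_mutations: int, cycles_used: int) -> bool:
    """True when task_complete would be reviewed under the documented conditions."""
    return (
        cfg.backward_pass
        and file_mutations >= cfg.min_writes
        and cycles_used < cfg.max_cycles
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the process environment.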
@@ -1999,7 +1999,7 @@ Omnius builds and maintains a **persistent, auto-updating knowledge graph** of t
|
|
|
1999
1999
|
### How It Works
|
|
2000
2000
|
|
|
2001
2001
|
```
|
|
2002
|
-
Source files ──> Regex symbol extraction ──> SQLite graph DB (.
|
|
2002
|
+
Source files ──> Regex symbol extraction ──> SQLite graph DB (.omnius/index/code-graph.db)
|
|
2003
2003
|
| |
|
|
2004
2004
|
| fs.watch() + debounce ──> File hash check ──> Incremental re-index (per file)
|
|
2005
2005
|
| |
|
|
@@ -2033,7 +2033,7 @@ For 1M+ LOC codebases, the Louvain community compression reduces 50K+ symbols in
|
|
|
2033
2033
|
|
|
2034
2034
|
### Storage
|
|
2035
2035
|
|
|
2036
|
-
The graph persists in `.
|
|
2036
|
+
The graph persists in `.omnius/index/code-graph.db` (SQLite with WAL mode) across sessions. Incremental updates mean editing a single file costs <50ms regardless of codebase size.
|
|
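The hash-check step that keeps updates incremental can be sketched as follows. The table layout, the choice of SHA-256, and the function names are assumptions made for illustration; the actual schema of `code-graph.db` is not shown in this section.

```python
import hashlib
import sqlite3


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical file-hash table with WAL journaling."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # no effect for in-memory databases
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    return conn


def needs_reindex(conn: sqlite3.Connection, path: str, content: bytes) -> bool:
    """Return True and record the new hash only when the file content changed."""
    digest = hashlib.sha256(content).hexdigest()
    row = conn.execute("SELECT sha256 FROM files WHERE path = ?", (path,)).fetchone()
    if row is not None and row[0] == digest:
        return False  # unchanged: skip symbol extraction for this file
    conn.execute(
        "INSERT OR REPLACE INTO files (path, sha256) VALUES (?, ?)", (path, digest)
    )
    conn.commit()
    return True
```

Because the check touches one row per edited file, its cost does not grow with the number of files already indexed, which is consistent with the per-file update behaviour described above.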
2037
2037
|
|
|
2038
2038
|
### Research Basis
|
|
2039
2039
|
|
|
@@ -2144,7 +2144,7 @@ On startup and `/model` switch, Omnius detects your RAM/VRAM and creates an opti
|
|
|
2144
2144
|
| **COHERE Cognitive Stack** | |
|
|
2145
2145
|
| `repl_exec` | Persistent Python REPL — variables/imports persist between calls, `llm_query()` and `parallel_llm_query()` available for recursive LLM invocation, `retrieve()` for handle access |
|
|
2146
2146
|
| `memory_metabolize` | Governed memory lifecycle — classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence/identity_relevance), consolidate lessons from trajectories |
|
|
2147
|
-
| `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.
|
|
2147
|
+
| `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.omnius/identity/` |
|
|
2148
2148
|
| `reflect` | Immune-system reflection — diagnostic (find flaws), epistemic (identify missing evidence), constitutional (review self-updates). Returns pass/revise/block verdict |
|
|
2149
2149
|
| `explore` | ARCHE strategy-space exploration — generate diverse strategies, archive successful variants with tags/confidence, compare competing approaches, retrieve past strategies |
|
|
2150
2150
|
| **Hardware Access** | |
|
|
@@ -2269,7 +2269,7 @@ Instead of writing custom integrations, point Omnius at an MCP server and its to
|
|
|
2269
2269
|
}
|
|
2270
2270
|
```
|
|
2271
2271
|
|
|
2272
|
-
Save that as `.
|
|
2272
|
+
Save that as `.omnius/mcp.json` (project) or `~/.omnius/mcp.json` (global). On startup, every server is spawned, the handshake runs, and every tool it advertises is exposed under the namespace `mcp__<server>__<tool>` — selectable by the agent like any built-in.
|
|
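The `mcp__<server>__<tool>` naming scheme is easy to build and take apart programmatically. The helpers below are a hypothetical sketch, not part of the Omnius API; they assume server names do not themselves contain a double underscore, since the text does not say how such names are handled.

```python
from typing import Tuple

PREFIX = "mcp"
SEP = "__"


def namespaced(server: str, tool: str) -> str:
    """Build the exposed tool name for a server/tool pair."""
    return SEP.join((PREFIX, server, tool))


def split_namespaced(name: str) -> Tuple[str, str]:
    """Split an exposed tool name back into (server, tool)."""
    prefix, sep, rest = name.partition(SEP)
    if prefix != PREFIX or not sep:
        raise ValueError(f"not an MCP tool name: {name!r}")
    # Only the first separator after the prefix delimits the server name
    server, sep, tool = rest.partition(SEP)
    if not sep or not server or not tool:
        raise ValueError(f"malformed MCP tool name: {name!r}")
    return server, tool
```

Splitting on the first separator only means a tool name may contain `__`, while a server name may not.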
2273
2273
|
|
|
2274
2274
|
### Spec compliance — what we implement
|
|
2275
2275
|
|
|
@@ -2289,9 +2289,9 @@ The transport layer lives in `packages/execution/src/mcp/transport.ts`; the clie
|
|
|
2289
2289
|
|
|
2290
2290
|
### Three ways to add a server
|
|
2291
2291
|
|
|
2292
|
-
**1. Edit `.
|
|
2292
|
+
**1. Edit `.omnius/mcp.json` directly** — drop in the JSON shape above. On next launch the server is spawned and connected automatically.
|
|
2293
2293
|
|
|
2294
|
-
**2. Drag-and-drop a markdown file** — drop any README that contains an MCP config block (Claude Desktop format, bare server JSON, or `npx -y @scope/server-foo` install instructions in a code block) onto the
|
|
2294
|
+
**2. Drag-and-drop a markdown file** — drop any README that contains an MCP config block (Claude Desktop format, bare server JSON, or `npx -y @scope/server-foo` install instructions in a code block) onto the Omnius terminal. The MD parser detects the configuration with confidence scoring, persists it to `.omnius/mcp.json`, and connects immediately. No restart needed. Implementation: `packages/execution/src/mcp/md-intake.ts`.
|
|
2295
2295
|
|
|
2296
2296
|
**3. Use the `/mcp` slash command** — interactive TUI registry browser:
|
|
2297
2297
|
|
|
@@ -2299,7 +2299,7 @@ The transport layer lives in `packages/execution/src/mcp/transport.ts`; the clie
|
|
|
2299
2299
|
/mcp # Open the MCP registry menu
|
|
2300
2300
|
/mcp status # Quick connection table
|
|
2301
2301
|
/mcp ls # Same as status
|
|
2302
|
-
/mcp reload # Reconnect every server from .
|
|
2302
|
+
/mcp reload # Reconnect every server from .omnius/mcp.json
|
|
2303
2303
|
```
|
|
2304
2304
|
|
|
2305
2305
|
The main menu lists every configured server with status (●), transport type, tool count, and any error. Selecting a server opens a detail view showing every advertised tool with its description, plus actions to **Edit**, **Reconnect**, **Delete**, or go **Back**. Edit accepts a one-line JSON config; Save returns to the main list with the updated server reconnected.
|
|
@@ -2329,7 +2329,7 @@ We test the streaming features end-to-end against the [official everything refer
|
|
|
2329
2329
|
|
|
2330
2330
|
### Programmatic API
|
|
2331
2331
|
|
|
2332
|
-
If you want to drive an MCP server directly from code (instead of through an agent), the
|
|
2332
|
+
If you want to drive an MCP server directly from code (instead of through an agent), the Omnius package re-exports the client:
|
|
2333
2333
|
|
|
2334
2334
|
```typescript
|
|
2335
2335
|
import { McpClient } from "omnius";
|
|
@@ -2403,7 +2403,7 @@ The loop tracks iteration history, generates completion reports saved to `.aiwg/
|
|
|
2403
2403
|
| `/pause` | **Gentle halt** — lets the current inference turn finish, then stops before the next turn. No new tool calls or inference will begin until `/resume`. |
|
|
2404
2404
|
| `/stop` | **Immediate kill** — aborts the current inference mid-stream, saves task state for later resumption. |
|
|
2405
2405
|
| `/resume` | **Continue** — resumes a paused or stopped task from where it left off. Also resumes tasks saved by `/stop` or interrupted by `/update`. |
|
|
2406
|
-
| `/destroy` | **Nuclear option** — aborts any active task, deletes the `.
|
|
2406
|
+
| `/destroy` | **Nuclear option** — aborts any active task, deletes the `.omnius/` directory, clears the console, and exits to shell. |
|
|
2407
2407
|
|
|
2408
2408
|
### Session Context Persistence
|
|
2409
2409
|
|
|
@@ -2415,13 +2415,13 @@ Context is automatically saved on every task completion and preserved across `/u
|
|
|
2415
2415
|
/context show # Show saved context status (entries, last saved)
|
|
2416
2416
|
```
|
|
2417
2417
|
|
|
2418
|
-
The system maintains a rolling window of the last 20 session entries in `.
|
|
2418
|
+
The system maintains a rolling window of the last 20 session entries in `.omnius/context/session-context.json`. When you run `/context restore`, the last 10 entries are formatted into a restore prompt and injected into your next task, giving the agent continuity across sessions.
|
|
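The rolling-window behaviour can be pictured with a bounded deque. This is a minimal sketch under assumptions: entries are modelled as plain strings and the prompt wording is invented for illustration, since the real layout of `session-context.json` and the restore prompt are not shown here. Only the sizes (20 kept, 10 restored) come from the text.

```python
from collections import deque
from typing import Deque

MAX_ENTRIES = 20      # entries kept in the rolling window
RESTORE_ENTRIES = 10  # entries injected by /context restore


def new_window() -> Deque[str]:
    """Create an empty window that silently drops the oldest entry when full."""
    return deque(maxlen=MAX_ENTRIES)


def build_restore_prompt(window: Deque[str]) -> str:
    """Format the most recent entries into a hypothetical restore prompt."""
    recent = list(window)[-RESTORE_ENTRIES:]
    lines = ["Previous session context (oldest first):"]
    lines.extend(f"- {entry}" for entry in recent)
    return "\n".join(lines)
```

After 25 recorded tasks, the window holds tasks 5 through 24 and the restore prompt lists tasks 15 through 24.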
2419
2419
|
|
|
2420
2420
|
During `/update`, context is automatically saved before the process restarts and restored when the new version resumes your task.
|
|
2421
2421
|
|
|
2422
2422
|
### Auto-Restore on Startup
|
|
2423
2423
|
|
|
2424
|
-
When you launch `
|
|
2424
|
+
When you launch `omnius` in a workspace that has saved session context from a previous run, you'll be prompted to restore it:
|
|
2425
2425
|
|
|
2426
2426
|
```
|
|
2427
2427
|
ℹ Previous session found (5 entries, last active 2h ago)
|
|
@@ -2464,7 +2464,7 @@ Daemon: COHERE enabled — listening on nexus.cohere.query
|
|
|
2464
2464
|
Capacity announcement: 3 models, warm=qwen3.5:122b
|
|
2465
2465
|
|
|
2466
2466
|
Peer: "Explain TCP vs UDP" → NATS broadcast
|
|
2467
|
-
Your
|
|
2467
|
+
Your Omnius: claim → route to qwen3:4b (trivial) → respond in 1.2s
|
|
2468
2468
|
```
|
|
2469
2469
|
|
|
2470
2470
|
**How it works:**
|
|
@@ -2475,7 +2475,7 @@ Your OA: claim → route to qwen3:4b (trivial) → respond in 1.2s
|
|
|
2475
2475
|
- **Model allowlist** — `/cohere allow qwen3:4b` controls which models are exposed
|
|
2476
2476
|
- **Ollama safety** — remote queries can ONLY run inference on existing models; `/api/pull`, `/api/delete`, `/api/create` are never called
|
|
2477
2477
|
- **Identity pinning** — snapshots published to IPFS (Helia) with SHA-256 content addressing; survives daemon restarts
|
|
2478
|
-
- **Background daemon** persists across
|
|
2478
|
+
- **Background daemon** persists across Omnius restarts (`detached: true` + PID file reconnection)
|
|
2479
2479
|
|
|
2480
2480
|
```bash
|
|
2481
2481
|
/cohere stats # Network transparency — queries in/out, model usage, peer activity
|
|
@@ -2525,7 +2525,7 @@ The identity kernel maintains a persistent self-model across sessions, the refle
|
|
|
2525
2525
|
|
|
2526
2526
|
Omnius includes a behavioral immune system that prevents the agent from making pattern-matched mistakes under pressure. Inspired by biological immune systems: constraints are the antibodies, pressure detection is the inflammatory response, and memory injection is the recall mechanism.
|
|
2527
2527
|
|
|
2528
|
-
### Constraint Enforcement (`.
|
|
2528
|
+
### Constraint Enforcement (`.omnius/constraints.json`)
|
|
2529
2529
|
|
|
2530
2530
|
Machine-readable rules checked **before every tool execution**:
|
|
2531
2531
|
|
|
@@ -2550,7 +2550,7 @@ Machine-readable rules checked **before every tool execution**:
|
|
|
2550
2550
|
| `warn` | Executes tool but emits warning in agent's next turn context |
|
|
2551
2551
|
| `log` | Silent recording to audit log, no interruption |
|
|
2552
2552
|
|
|
2553
|
-
Constraints are scoped: global (`~/.omnius/constraints.json`), project (`.
|
|
2553
|
+
Constraints are scoped: global (`~/.omnius/constraints.json`), project (`.omnius/constraints.json`), or session (ephemeral).
|
|
2554
2554
|
|
|
2555
2555
|
### Pressure-Aware Decision Gate
|
|
2556
2556
|
|
|
@@ -2642,7 +2642,7 @@ Use deep context for:
|
|
|
2642
2642
|
- Long debugging sessions where error context from earlier is critical
|
|
2643
2643
|
- Tasks where the agent needs to reason about patterns across many files
|
|
2644
2644
|
|
|
2645
|
-
The setting persists to `.
|
|
2645
|
+
The setting persists to `.omnius/settings.json`. Deep context is particularly valuable for models with 64K+ context windows (Qwen3.5-122B, Llama 3.1 70B, etc.) where the default thresholds were leaving significant capacity unused.
|
|
2646
2646
|
|
|
2647
2647
|
### Status Bar Context Tracking (`Ctx:` + `SNR:`)
|
|
2648
2648
|
|
|
@@ -2752,7 +2752,7 @@ The profile is compiled into a system prompt suffix (max 80 tokens) injected at
|
|
|
2752
2752
|
|
|
2753
2753
|
### Persistence
|
|
2754
2754
|
|
|
2755
|
-
The style is saved to `.
|
|
2755
|
+
The style is saved to `.omnius/settings.json` (with `--local`) or `~/.omnius/config.json` (global) and persists across sessions. Change it anytime with `/style <preset>` — takes effect on the next task.
|
|
2756
2756
|
|
|
2757
2757
|
### Research Provenance
|
|
2758
2758
|
|
|
@@ -2878,7 +2878,7 @@ Output: 48kHz WAV, compatible with Telegram voice messages and WebSocket streami
|
|
|
2878
2878
|
|
|
2879
2879
|
### Supertonic Expressive Tags
|
|
2880
2880
|
|
|
2881
|
-
When Supertonic is the active voice backend,
|
|
2881
|
+
When Supertonic is the active voice backend, Omnius decorates spoken status updates with the expression tags Supertonic supports. The tag pass runs after markdown/ANSI cleanup and only for Supertonic, so GLaDOS, Overwatch, Kokoro, and LuxTTS continue receiving plain sanitized text.
|
|
2882
2882
|
|
|
2883
2883
|
Tag placement is context-aware:
|
|
2884
2884
|
|
|
@@ -3086,7 +3086,7 @@ When combined with `/voice`, you get full bidirectional audio — speak your tas
|
|
|
3086
3086
|
|
|
3087
3087
|
The `transcribe-cli` dependency auto-installs in the background on first use. On ARM or when transcribe-cli fails, the system automatically falls back to `openai-whisper` via a self-managed Python venv (same approach used by Moondream vision).
|
|
3088
3088
|
|
|
3089
|
-
**File transcription**: Drag-and-drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.
|
|
3089
|
+
**File transcription**: Drag-and-drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.omnius/transcripts/`.
|
|
3090
3090
|
|
|
3091
3091
|
|
|
3092
3092
|
|
|
@@ -3235,7 +3235,7 @@ Agent: agenda()
|
|
|
3235
3235
|
|
|
3236
3236
|
| Decision | Research Basis | Key Finding |
|
|
3237
3237
|
|----------|---------------|-------------|
|
|
3238
|
-
| Separate directive store (`.
|
|
3238
|
+
| Separate directive store (`.omnius/scheduled/`, not `.omnius/memory/`) | SSGM ([arXiv:2603.11768](https://arxiv.org/abs/2603.11768), 2026) | Directives in summarizable memory corrupt via compaction — semantic drift degrades scheduling data |
|
|
3239
3239
|
| File-based persistence survives process death | MemGPT/Letta (Packer et al. 2023, [arXiv:2310.08560](https://arxiv.org/abs/2310.08560)) | Agents are ephemeral; state must be external to the process |
|
|
3240
3240
|
| Priority-based startup surfacing | A-MAC ([arXiv:2603.04549](https://arxiv.org/abs/2603.04549), 2026) | 5-factor attention scoring; content type prior is most influential factor (31% latency reduction) |
|
|
3241
3241
|
| Cross-session self-reflection | Reflexion (Shinn et al. 2023, [arXiv:2303.11366](https://arxiv.org/abs/2303.11366)) | Persistent self-reflection stored as text improves task success 20-30% |
|
|
@@ -3289,7 +3289,7 @@ Supports `apt` (Debian/Ubuntu), `dnf` (Fedora), `pacman` (Arch), and `brew` (mac
|
|
|
3289
3289
|
Launch without arguments to enter the interactive REPL:
|
|
3290
3290
|
|
|
3291
3291
|
```bash
|
|
3292
|
-
|
|
3292
|
+
omnius
|
|
3293
3293
|
```
|
|
3294
3294
|
|
|
3295
3295
|
The TUI features an animated multilingual phrase carousel, live metrics bar with pastel-colored labels (token in/out, context window usage, human expert speed ratio, cost), rotating tips, syntax-highlighted tool output, and dynamic terminal-width cropping.
|
|
@@ -3308,9 +3308,9 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
|
|
|
3308
3308
|
| `/pause` | Pause after current turn finishes (gentle halt) |
|
|
3309
3309
|
| `/stop` | Kill current inference immediately, save state |
|
|
3310
3310
|
| `/resume` | Resume a paused or stopped task |
|
|
3311
|
-
| `/destroy` | Remove `.
|
|
3311
|
+
| `/destroy` | Remove `.omnius/` folder, kill all tasks, clear console, exit |
|
|
3312
3312
|
| **Context & Memory** | |
|
|
3313
|
-
| `/context save` | Force-save session context to `.
|
|
3313
|
+
| `/context save` | Force-save session context to `.omnius/context/` |
|
|
3314
3314
|
| `/context restore` | Restore context from previous sessions into next task |
|
|
3315
3315
|
| `/context show` | Show saved session context status |
|
|
3316
3316
|
| `/compact` | Force context compaction now (default strategy) |
|
|
@@ -3383,7 +3383,7 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
|
|
|
3383
3383
|
| `/help` | Show all available commands |
|
|
3384
3384
|
| `/quit` | Exit |
|
|
3385
3385
|
|
|
3386
|
-
All settings commands accept `--local` to save to project `.
|
|
3386
|
+
All settings commands accept `--local` to save to project `.omnius/settings.json` instead of global config.
|
|
3387
3387
|
|
|
3388
3388
|
### Platform Connectors
|
|
3389
3389
|
|
|
@@ -3443,7 +3443,7 @@ The steering sub-agent uses the same model and backend as the main agent with `m
 Connect the agent to a Telegram bot. Telegram can run in auto, chat, or action mode: conversational messages get rapid streamed replies in chat mode, while codebase/file/run requests use dedicated action sub-agents that are visible in the terminal waterfall alongside other agent activity.
 
 ```bash
-/telegram --key <token> # Save bot token (persisted to .
+/telegram --key <token> # Save bot token (persisted to .omnius/settings.json)
 /telegram --admin <userid> # Set admin user — gets full memory + tools
 /telegram # Toggle bridge on/off (uses saved key)
 /telegram status # Show connection status + active sub-agents
@@ -3490,7 +3490,7 @@ On success, that Telegram user ID is saved as the admin user and future private-
 The Telegram bridge handles modern Bot API traffic directly:
 
 - **Guest Mode** — inbound `guest_message` updates are normalized into regular agent work and answered through `answerGuestQuery`, so users can interact from profile-surface guest chats before a normal bot DM exists.
-- **Command menu registration** — when the bridge starts,
+- **Command menu registration** — when the bridge starts, Omnius registers the local slash-command surface with Telegram via `setMyCommands`; Telegram-safe names such as `/full_send_bless` are mapped back to canonical TUI commands like `/full-send-bless` before execution.
 - **Bot-to-bot sends** — `/telegram bot <username> <text>` targets another bot by username using Telegram's supported bot-to-bot message subset.
 - **Managed bot access** — `/telegram access get|set` reads and configures managed-bot access restrictions by managed bot user ID.
 - **Polls and live photos** — incoming polls, poll media summaries, option media, country/member limits, and live photos are captured as first-class Telegram message context; `/telegram poll` and `/telegram live-photo` send the matching Bot API payloads.
@@ -3594,7 +3594,7 @@ The bridge distinguishes between **private DMs** and **group/supergroup chats**,
 
 Photos, audio, voice messages, video, video notes, and documents sent via Telegram are automatically downloaded and processed:
 
-1. **Download** — files are fetched via the Telegram `getFile` API and cached to `.
+1. **Download** — files are fetched via the Telegram `getFile` API and cached to `.omnius/media-cache/`
 2. **Processing** — routed to the appropriate pipeline:
 - Images → `vision` / `image_read` / `ocr` tools
 - Audio/voice → `transcribe_file` tool
@@ -3623,7 +3623,7 @@ The bridge automatically handles Telegram's rate limits (HTTP 429) with exponent
 
 <div align="right"><a href="#top">back to top</a></div>
 
-Agents can earn and spend USDC on Base mainnet through the native x402 protocol built into [
+Agents can earn and spend USDC on Base mainnet through the native x402 protocol built into [open-agents-nexus@1.5.6](https://www.npmjs.com/package/open-agents-nexus).
 
 ### Wallet & Identity
 ```
@@ -3644,7 +3644,7 @@ When margin > 0, capabilities are registered with USDC pricing metadata. The dae
 ```
 nexus(action='spend', target_address='0x...', amount_usdc='0.10')
 ```
-Signs an EIP-3009 `TransferWithAuthorization`. Budget-checked before signing. The recipient (or any facilitator) submits on-chain — no gas needed from the payer. Proof saved to `.
+Signs an EIP-3009 `TransferWithAuthorization`. Budget-checked before signing. The recipient (or any facilitator) submits on-chain — no gas needed from the payer. Proof saved to `.omnius/nexus/pending-transfer.json`.
 
 ### Remote Inference — Tap Into the Mesh
 ```
@@ -3710,7 +3710,7 @@ Step 5 → Review and Go Live
 - **libp2p P2P mesh** provides decentralized relay — no DNS, no port forwarding, NAT-traversing
 - Cloudflared tunnel available as HTTPS fallback for non-P2P consumers
 - Your raw API endpoint URL is **never exposed** — consumers connect via peerId or tunnel
-- Config persists to `.
+- Config persists to `.omnius/sponsor/config.json` — survives restarts
 
 **Management:**
 ```bash
@@ -3736,11 +3736,11 @@ When using sponsored inference, the sponsor's banner animation and message appea
 
 ```
 Primary path (libp2p):
-Consumer
+Consumer Omnius ──→ libp2p mesh ──→ Sponsor Daemon ──→ Ollama/vLLM
 (P2P, NAT-traversing) (auth + rate limit) (local)
 
 Fallback path (tunnel):
-Consumer
+Consumer Omnius ──→ Cloudflared Tunnel ──→ Sponsor Proxy ──→ Ollama/vLLM
 (HTTPS) (auth + rate limit) (local)
 
 Both paths enforce:
@@ -3784,7 +3784,7 @@ The `--full` flag is required to grant remote peers model management access. Spo
 
 <div align="right"><a href="#top">back to top</a></div>
 
-COHERE (Collaborative Orchestration of Heuristic Emergent Reasoning Engines) is a distributed collective intelligence system where multiple
+COHERE (Collaborative Orchestration of Heuristic Emergent Reasoning Engines) is a distributed collective intelligence system where multiple Omnius nodes form a mesh that learns, evolves, and improves collectively. Queries from the [omnius.nexus](https://omnius.nexus) frontend or CLI are broadcast via NATS, processed by elected nodes through the full AgenticRunner (tools, context engineering, system prompts), and responses are peer-reviewed before delivery.
 
 ### How COHERE Works
 
@@ -3857,7 +3857,7 @@ Omnius includes infrastructure for the agent to learn from its own execution, im
 
 ### Trajectory Logging
 
-Every completed task is logged to `.
+Every completed task is logged to `.omnius/trajectories/trajectories.jsonl` with full metadata: task description, outcome (pass/fail), tool calls made, files modified, failed approaches, and timing. This data feeds the rejection fine-tuning pipeline. Research: [Golubev et al.](https://arxiv.org/abs/2508.03501) showed RFT on passing trajectories alone improved Qwen-72B from 11% to 25% on SWE-bench.
 
 ### Rejection Fine-Tuning Pipeline
 
@@ -3971,14 +3971,14 @@ Omnius binds entities across image, audio, and text using joint‑embedding mode
 - Voiceprint linkage: speaker embeddings (x‑vector/ECAPA) are associated with entities when co‑occurring in time with a visual track and a transcribed utterance; robust to background noise via median pooling across windows.
 - Text label fusion: natural‑language labels (names, roles, tags) are bound to the same entity when co‑referents appear in proximate context windows (heuristics + clustering).
 - Association graph: cross‑modal edges (image↔voice↔text) consolidate into a unified entity node with provenance (model, score, timestamp) and decay‑based confidence.
-- Privacy & safety: raw media never leaves the machine; embeddings are stored locally under `.
+- Privacy & safety: raw media never leaves the machine; embeddings are stored locally under `.omnius/memory/`. Redaction controls can drop embeddings by label or recency.
 
 This enables queries like: “Find where Alex spoke about deployment,” “Show files edited after the person in the red sweater approved the PR,” or “Summarize conversations where Speaker‑B and Alice appear together.”
 
 The associative memory integrates with a near-critical cognitive framework inspired by [Beggs & Plenz (2003)](https://doi.org/10.1523/JNEUROSCI.23-35-11167.2003) neuronal avalanche dynamics:
 
-- **Auto-consolidation**: At task boundaries, the system writes consolidation snapshots to `.
-- **Provenance KG**: Every agent action is tracked in `.
+- **Auto-consolidation**: At task boundaries, the system writes consolidation snapshots to `.omnius/consolidations/` with lessons learned and key patterns
+- **Provenance KG**: Every agent action is tracked in `.omnius/provenance/` for full action traceability
 - **Homeostasis modulation**: Error rate drives exploration guidance — high error rates inject more careful approaches, low error rates encourage bolder exploration
 - **Error pattern learning**: Recurring error patterns are detected, stored globally in `~/.omnius/error-patterns.json`, and injected as `[LEARNED FROM EXPERIENCE]` guidance before similar actions in future sessions
 
@@ -3999,18 +3999,18 @@ When you're not actively tasking the agent, Dream Mode lets it creatively explor
 Each cycle expands through all four stages then contracts (evaluation, pruning of weak ideas). Three modes control how far the agent can go:
 
 ```bash
-/dream # Default — read-only exploration, proposals saved to .
+/dream # Default — read-only exploration, proposals saved to .omnius/dreams/
 /dream deep # Multi-cycle deep exploration with expansion/contraction phases
 /dream lucid # Full implementation — saves workspace backup, then implements,
 # tests, evaluates, and self-plays each proposal with checkpoints
 /dream stop # Wake up — stop dreaming
 ```
 
-**Default** and **Deep** modes are completely safe — the agent can only read your code and write proposals to `.
+**Default** and **Deep** modes are completely safe — the agent can only read your code and write proposals to `.omnius/dreams/`. File writes, edits, and shell commands outside that directory are blocked by sandboxed dream tools.
 
 **Lucid** mode unlocks full write access. Before making changes, it saves a workspace checkpoint so you can roll back. Each cycle goes: dream → implement → test → evaluate → checkpoint → next cycle.
 
-All proposals are indexed in `.
+All proposals are indexed in `.omnius/dreams/PROPOSAL-INDEX.md` for easy review.
 
 ### Autoresearch Swarm — 5-Agent GPU Experiment Loop
 
@@ -4023,7 +4023,7 @@ The swarm operates in four phases:
 | **Phase 0: Load** | Reads autoresearch memory (best config, experiment log, failed approaches, hypothesis queue, architectural insights) + detects GPU specs |
 | **Phase 1: Hypothesis** | Critic generates 5-8 hypotheses; Flow Maintainer plans experiment ordering and round budget |
 | **Phase 2: Experiment** | Sequential rounds (up to 3): Critic pre-screens → Researcher modifies train.py + runs → Monitor watches GPU → Evaluator keeps/discards → Flow Maintainer decides continue/stop |
-| **Phase 3: Summary** | Flow Maintainer writes consolidated summary to memory + dream report to `.
+| **Phase 3: Summary** | Flow Maintainer writes consolidated summary to memory + dream report to `.omnius/dreams/` |
 
 #### The 5 Agent Roles
 
@@ -4037,7 +4037,7 @@ The swarm operates in four phases:
 
 #### Bidirectional Memory
 
-The swarm maintains persistent memory in `.
+The swarm maintains persistent memory in `.omnius/memory/autoresearch.json` with five keys:
 
 - **best_config** — best val_bpb and what train.py changes produced it
 - **experiment_log** — chronological list of experiments with hypotheses, results, and verdicts
@@ -4134,7 +4134,7 @@ curl -X POST http://localhost:11435/v1/run \
 
 ### Multi-Agent Collective Testbed
 
-Spawn multiple
+Spawn multiple Omnius instances in Docker for collective intelligence experiments:
 
 ```bash
 cd testbed
@@ -4381,12 +4381,12 @@ omnius config set backendUrl http://localhost:11434
 
 ### Project Context
 
-Create `AGENTS.md`, `
+Create `AGENTS.md`, `Omnius.md`, or `.omnius.md` in your project root for agent instructions. Context files merge from parent to child directories.
 
-### `.
+### `.omnius/` Project Directory
 
 ```
-.
+.omnius/
 ├── config.json # Project config overrides
 ├── settings.json # TUI settings (model, endpoint, voice, stream, etc.)
 ├── memory/ # Persistent memory store (topics, patterns, facts)
@@ -4412,9 +4412,9 @@ Create `AGENTS.md`, `OA.md`, or `.omnius.md` in your project root for agent inst
 Any Ollama or OpenAI-compatible API model with tool calling works:
 
 ```bash
-
-
-
+omnius --model qwen2.5-coder:32b "fix the bug"
+omnius --backend vllm --backend-url http://localhost:8000/v1 "add tests"
+omnius --backend-url http://10.0.0.5:11434 "refactor auth"
 ```
 
 
@@ -4508,8 +4508,8 @@ Forward any configured `/endpoint` (Chutes, Groq, OpenRouter, Together, vLLM, et
 - Your node registers inference capabilities on the P2P mesh using your upstream endpoint's models
 - Remote peers discover and invoke these capabilities via libp2p streams (DHT/mDNS/NATS)
 - Requests are forwarded to your upstream API, responses streamed back to the peer
-- The libp2p daemon persists in the background — it survives
-- When you reopen
+- The libp2p daemon persists in the background — it survives Omnius restarts and remains discoverable even when the TUI is closed
+- When you reopen Omnius, it reconnects to the existing daemon and resumes stats tracking
 
 **Rate limit distribution (`--loadbalance`):**
 - Captures `x-ratelimit-remaining-tokens` and `x-ratelimit-limit-tokens` headers from upstream API responses
@@ -4778,7 +4778,7 @@ node eval/run-agentic.mjs --model qwen3.5:4b # Different model tier
 
 ### REST API Enterprise Evaluation (v0.185.68)
 
-35 test cases executed against the
+35 test cases executed against the omnius REST API (`omnius serve` on port 11435) across **10 industries** and **3 model tiers**. Each case sends a domain-specific prompt via `/v1/chat/completions` and verifies correctness against expected patterns.
 
 ```bash
 node eval/api-enterprise-eval.mjs # Run all 85 tests (35 cases × 3 models)
@@ -4835,7 +4835,7 @@ Omnius integrates with [AIWG](https://aiwg.io) ([npm](https://www.npmjs.com/pack
 
 ```bash
 npm i -g aiwg
-
+omnius "analyze this project's SDLC health and set up documentation"
 ```
 
 | Capability | Description |
@@ -4932,26 +4932,26 @@ Control it live from the TUI:
 
 ```
 /access # show current access + host
-/access loopback|lan|any # set access policy (
-/host 127.0.0.1:11435 # bind to loopback only (
+/access loopback|lan|any # set access policy (OMNIUS_ACCESS) and restart daemon
+/host 127.0.0.1:11435 # bind to loopback only (OMNIUS_HOST) and restart daemon
 /host 0.0.0.0:11435 # bind all interfaces and restart daemon
 /network config # interactive menu (arrow keys) to change both
 
 # Project-local persistence
-/access any --local # save to ./.
+/access any --local # save to ./.omnius/settings.json
 /host 127.0.0.1:11435 --local
 ```
 
 Environment variables (non-TUI usage):
 
 ```
-
+OMNIUS_ACCESS=lan OMNIUS_HOST=0.0.0.0:11435 omnius
 ```
 
 Persistence and startup behavior:
 
-- The TUI saves your choices to `.
-- On startup, the TUI loads saved `
+- The TUI saves your choices to `.omnius/settings.json` (project) or `~/.omnius/settings.json` (global).
+- On startup, the TUI loads saved `omniusAccess`/`omniusHost` and seeds `OMNIUS_ACCESS`/`OMNIUS_HOST` before ensuring the daemon, so the 11435 service picks them up immediately.
 - Explicit environment variables always win over saved settings.
 
 Security tips: