omnius 1.0.0 → 1.0.3

package/README.md CHANGED
@@ -1,13 +1,15 @@
1
1
  <a name="top"></a>
2
- <code>
3
- _______ _______ _______ ________ ________ ________
4
- / \\/ \\// / \ / \/ / \/ \
5
- / // /// /_/ // / _/
6
- / / / // / //- /
7
- \________/\__/__/__/\__/_____/ \\_______/\_______//\_______//
8
-
9
- </code>
10
- <h1 align="center">Omnius — P2P Inference</h1>
2
+ ```text
3
+ ░▒▓██████▓▒░░▒▓██████████████▓▒░░▒▓███████▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓███████▓▒░
4
+ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
5
+ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
6
+ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓██████▓▒░
7
+ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░ ░▒▓█▓▒░
8
+ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░ ░▒▓█▓▒░
9
+ ░▒▓██████▓▒░░▒▓█▓▒░░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓██████▓▒░░▒▓███████▓▒░
10
+
11
+
12
+ ```
11
13
 
12
14
  <p align="center">
13
15
  <strong>AI coding agent powered entirely by open-weight models.</strong><br>
@@ -26,7 +28,7 @@
26
28
  ---
27
29
 
28
30
  ```bash
29
- npm i -g omnius && oa
31
+ npm i -g omnius && omnius
30
32
  ```
31
33
 
32
34
  An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures in an iterative loop until the task is complete. First launch auto-detects your hardware and configures the optimal model with expanded context window automatically.
@@ -57,7 +59,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
57
59
  - [Parallelism & Concurrency](#parallelism--concurrency)
58
60
  - [Endpoint Reference](#endpoint-reference)
59
61
  - [Stateful Chat — `/v1/chat` + `/api/chat` (OpenAI drop-in with full agent under the hood)](#stateful-chat--v1chat--apichat-openai-drop-in-with-full-agent-under-the-hood)
60
- - [Live Comparison: Ollama vs OA Full Agent](#live-comparison-ollama-vs-oa-full-agent)
62
+ - [Live Comparison: Ollama vs Omnius Full Agent](#live-comparison-ollama-vs-omnius-full-agent)
61
63
  - [One-Off Completions — `/api/generate` + `/v1/generate`](#one-off-completions--apigenerate--v1generate)
62
64
  - [Embeddings — `/v1/embeddings` + `/api/embed`](#embeddings--v1embeddings--apiembed)
63
65
  - [Memory Recall + Knowledge Graph — `/v1/memory/*`](#memory-recall--knowledge-graph--v1memory)
@@ -210,7 +212,7 @@ An autonomous multi-turn tool-calling agent that reads your code, makes changes,
210
212
  - [Configuration](#configuration)
211
213
  - [Network Access & Binding](#network-access--binding)
212
214
  - [Project Context](#project-context)
213
- - [`.oa/` Project Directory](#oa-project-directory)
215
+ - [`.omnius/` Project Directory](#omnius-project-directory)
214
216
  - [Model Support](#model-support)
215
217
  - [Supported Inference Providers](#supported-inference-providers)
216
218
  - [Connecting to a Provider](#connecting-to-a-provider)
@@ -240,7 +242,7 @@ An LLM is a high-bandwidth associative generative core — closer to a cortex-li
240
242
  |---|---|---|
241
243
  | Associative core | Cortex | LLM weights (any size) |
242
244
  | Current workspace | Global workspace / attention | `assembleContext()` — structured context assembly |
243
- | Episodic memory | Hippocampus | `.oa/memory/` — write, search, retrieve across sessions |
245
+ | Episodic memory | Hippocampus | `.omnius/memory/` — write, search, retrieve across sessions |
244
246
  | Cognitive map | Hippocampal spatial maps | `semantic-map.ts` + `repo-map.ts` (PageRank) |
245
247
  | Action gating | Basal ganglia | Tool selection policy (task-aware filtering) |
246
248
  | Temporal hierarchy | Prefrontal executive | Task decomposition, sub-agent delegation |
@@ -258,7 +260,7 @@ Don't chase larger models. Build the organism around whatever model you have.
258
260
  <div align="right"><a href="#top">back to top</a></div>
259
261
 
260
262
  ```
261
- You: oa "fix the null check in auth.ts"
263
+ You: omnius "fix the null check in auth.ts"
262
264
 
263
265
  Agent: [Turn 1] file_read(src/auth.ts)
264
266
  [Turn 2] grep_search(pattern="null", path="src/auth.ts")
@@ -284,8 +286,8 @@ The agent uses tools autonomously in a loop — reading errors, fixing code, and
284
286
  - **Sub-agent delegation** — spawn independent agents for parallel workstreams
285
287
  - **OpenCode delegation** — offload coding tasks to opencode (sst/opencode) as an autonomous sub-agent with auto-install, progress monitoring, and result evaluation
286
288
  - **Long-horizon cron agents** — schedule recurring autonomous agent tasks with goals, completion criteria, execution history, and automatic evaluation (daily code reviews, weekly dep updates, continuous monitoring)
287
- - **Nexus P2P networking** — decentralized agent-to-agent communication via [omnius-nexus](https://www.npmjs.com/package/omnius-nexus). Join rooms, discover peers, share resources, and communicate across the agent mesh with encrypted P2P transport
288
- - **x402 micropayments** — native x402 payment rails via omnius-nexus@1.5.6. Agents create secp256k1/EVM wallets (AES-256-GCM encrypted, keys never exposed to LLM), register inference with USDC pricing on Base, auto-handle `payment_required`/`payment_proof` negotiation, track earnings/spending in ledger.jsonl, enforce budget policies, and sign gasless EIP-3009 transfers
289
+ - **Nexus P2P networking** — decentralized agent-to-agent communication via [open-agents-nexus](https://www.npmjs.com/package/open-agents-nexus). Join rooms, discover peers, share resources, and communicate across the agent mesh with encrypted P2P transport
290
+ - **x402 micropayments** — native x402 payment rails via open-agents-nexus@1.5.6. Agents create secp256k1/EVM wallets (AES-256-GCM encrypted, keys never exposed to LLM), register inference with USDC pricing on Base, auto-handle `payment_required`/`payment_proof` negotiation, track earnings/spending in ledger.jsonl, enforce budget policies, and sign gasless EIP-3009 transfers
289
291
  - **Inference capability proof** — benchmark local models with anti-spoofing SHA-256 hashed proofs, generate capability scorecards for peer verification
290
292
  - **Littleman Observer** — parallel meta-analysis system that watches the agent loop in real-time. Detects false failure claims after successful tools, blocks redundant re-execution, catches runaway one-sided output in conversations, and dynamically extends turn limits when active work is detected. Emits `debug_context` and `debug_littleman` events for live observability
291
293
  - **Interactive Session Lock** — generic `SESSION_ACTIVE` protocol prevents premature task completion during long-running sessions (phone calls, live chat, monitoring). Any MCP contract can adopt the protocol. Paired with context-engineered system prompts that teach small models to maintain conversation loops
@@ -304,8 +306,8 @@ Omnius includes background workers that compute and associate embeddings across
304
306
 
305
307
  Config (env vars):
306
308
 
307
- - `OA_COOCUR_WINDOW_MS` — max time delta between visual and transcript episodes to create co‑occurrence links (default: 120000 ms).
308
- - `OA_COOCUR_CLIP_SIM_MIN` — minimum CLIP text↔image cosine (0..1, default: 0.22) for linking when both embeddings are available.
309
+ - `OMNIUS_COOCUR_WINDOW_MS` — max time delta between visual and transcript episodes to create co‑occurrence links (default: 120000 ms).
310
+ - `OMNIUS_COOCUR_CLIP_SIM_MIN` — minimum CLIP text↔image cosine (0..1, default: 0.22) for linking when both embeddings are available.
309
311
 
310
312
  The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile, speechbrain, Whisper) into `~/.omnius/venv` and registers providers automatically. No manual installs are required.
311
313
  - **Ralph Loop** — iterative task execution that keeps retrying until completion criteria are met
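
The two renamed co-occurrence settings in the hunk above can be read as a simple gate on whether a visual episode and a transcript episode get linked. The sketch below is only an illustration of that reading, not the daemon's code: the function names are hypothetical, the window is assumed to be inclusive, and falling back to the time window alone when an embedding is missing is an assumption (the README only says the CLIP threshold applies when both embeddings are available).

```python
import math


def load_config(env):
    """Read the renamed settings from a mapping such as os.environ.

    Defaults (120000 ms, 0.22) are the ones stated in the README text.
    """
    window_ms = int(env.get("OMNIUS_COOCUR_WINDOW_MS", "120000"))
    clip_sim_min = float(env.get("OMNIUS_COOCUR_CLIP_SIM_MIN", "0.22"))
    return window_ms, clip_sim_min


def cosine(a, b):
    """Plain cosine similarity; returns 0.0 for a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_link(visual_ts_ms, transcript_ts_ms, image_vec=None, text_vec=None,
                window_ms=120000, clip_sim_min=0.22):
    """Hypothetical gate: link two episodes if they are close in time and,
    when both embeddings exist, similar enough in CLIP space."""
    # Assumption: a delta exactly equal to the window still qualifies.
    if abs(visual_ts_ms - transcript_ts_ms) > window_ms:
        return False
    if image_vec is not None and text_vec is not None:
        return cosine(image_vec, text_vec) >= clip_sim_min
    # Assumption: without both embeddings, the time window decides alone.
    return True
```

Passing `os.environ` to `load_config` and forwarding the two values to `should_link` keeps the thresholds configurable under the new `OMNIUS_` names.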
@@ -314,7 +316,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
314
316
  - **Persistent Python REPL** — `repl_exec` tool maintains variables, imports, and functions across calls. Write Python code that processes data iteratively, with `llm_query()` available for recursive LLM sub-calls from within code
315
317
  - **Recursive LLM calls** — `llm_query(prompt, context)` invokes the model from inside REPL code, enabling loop-based semantic analysis of large inputs ([RLM paper](https://arxiv.org/abs/2512.24601)). `parallel_llm_query()` runs multiple calls concurrently ([SPRINT](https://arxiv.org/abs/2506.05745))
316
318
  - **Memory metabolism** — governed memory lifecycle: classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence), consolidate lessons from trajectories. Inspired by [TIMG](https://arxiv.org/abs/2603.10600) and [MemMA](https://arxiv.org/abs/2603.18718)
317
- - **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.oa/identity/`
319
+ - **Identity kernel** — persistent self-state with continuity register, homeostasis estimation, relationship models, and version lineage. Persists across sessions in `.omnius/identity/`
318
320
  - **Reflection & integrity** — immune-system audit: diagnostic ("what's wrong?"), epistemic ("what evidence is missing?"), constitutional ("should this change become part of self?"). Inspired by [LEAFE](https://arxiv.org/abs/2603.16843) and [RewardHackingAgents](https://arxiv.org/abs/2603.11337)
319
321
  - **Exploration & culture** — ARCHE strategy-space exploration: generate competing hypotheses, archive successful variants, retrieve past strategies. Inspired by [SGE](https://arxiv.org/abs/2603.02045) and [Darwin Gödel Machine](https://arxiv.org/abs/2505.22954)
320
322
  - **Autoresearch Swarm** — 5-agent GPU experiment loop during REM sleep: Researcher, Monitor, Evaluator, Critic, Flow Maintainer autonomously run ML training experiments, keep improvements, discard regressions
@@ -323,7 +325,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
323
325
  - **Call Sub-Agent** — each WebSocket caller gets a dedicated AgenticRunner for low-latency voice-to-voice loops, with admin/public access tiers and bidirectional activity sharing with the main agent
324
326
  - **Telegram Voice** — `/voice` enabled via Telegram forwards TTS audio as voice messages alongside text responses. Incoming voice messages are auto-transcribed and handled as text
325
327
  - **Neural TTS** — hear what the agent is doing via GLaDOS, Overwatch, Kokoro, or LuxTTS voice clone, with literature-grounded narration engine (sNeuron-TST structure rotation, Moshi ring buffer dedup, UDDETTS emotion-driven prosody, SEST metadata, LuxTTS flow-matching voice cloning)
326
- - **Supertonic expressive tags** — when `/voice supertonic` is active, OA inserts supported expression tags such as `<sigh>`, `<breath>`, and `<laugh>` into spoken status updates based on failure, recovery, sentence boundaries, success, and playful tone. Other voice backends receive sanitized plain text
328
+ - **Supertonic expressive tags** — when `/voice supertonic` is active, Omnius inserts supported expression tags such as `<sigh>`, `<breath>`, and `<laugh>` into spoken status updates based on failure, recovery, sentence boundaries, success, and playful tone. Other voice backends receive sanitized plain text
327
329
  - **Personality Core** — SAC framework-based style control (concise/balanced/verbose/pedagogical) that shapes agent response depth, voice expressiveness, and system prompt behavior
328
330
  - **Human expert speed ratio** — real-time `Exp: Nx` gauge comparing agent speed to a leading human expert, calibrated across 47 tool baselines
329
331
  - **Cost tracking** — real-time token cost estimation for 15+ cloud providers
@@ -340,14 +342,14 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
340
342
  - **Inference capability scoring** — canirun.ai-style hardware assessment at first launch: memory/compute/speed scores, per-model compatibility matrix, recommended model selection
341
343
  - **Auto-install everything** — first-run wizard auto-installs Ollama, curl, Python3, python3-venv with platform-aware package managers (apt, dnf, yum, pacman, apk, zypper, brew)
342
344
  - **Sponsored inference** — `/sponsor` walks through a 5-step wizard to share your GPU with the world: select endpoints, choose banner animation (8 presets + AI-generated custom), set header message/links, configure transport (cloudflared/libp2p) + rate limits, and go live. Consumers discover sponsors via `/endpoint sponsor`. Secure proxy relay with per-IP rate limiting, daily token budgets, model allowlist, and concurrent request caps. Sponsor's raw API URL is never exposed. See [Sponsored Inference](#sponsored-inference--share-your-gpu-with-the-world) below
343
- - **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across OA restarts
344
- - **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{OA_VAR_*}}`), trust tiers (LOCAL/TEE/VERIFIED/PUBLIC), WebSocket peer mesh, and inference routing with automatic secret redaction/injection
345
+ - **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across Omnius restarts
346
+ - **P2P mesh networking** — `/p2p` with secret-safe variable placeholders (`{{OMNIUS_VAR_*}}`), trust tiers (LOCAL/TEE/VERIFIED/PUBLIC), WebSocket peer mesh, and inference routing with automatic secret redaction/injection
345
347
  - **Secret vault** — `/secrets` manages API keys and credentials with AES-256-GCM encrypted persistence; secrets are automatically redacted before sending to untrusted inference peers and re-injected on response
346
348
  - **Auto-expanding context** — detects RAM/VRAM and creates an optimized model variant on first run
347
349
  - **Mid-task steering** — type while the agent works to add context without interrupting
348
350
  - **Smart compaction** — 6 context compaction strategies (default, aggressive, decisions, errors, summary, structured) with ARC-inspired active context revision ([arXiv:2601.12030](https://arxiv.org/abs/2601.12030)) that preserves structural file content through compaction, preventing small-model repetitive loops at the root cause. Success signals and content previews survive compaction so models never lose evidence that tools succeeded
349
351
  - **Memex experience archive** — large tool outputs archived during compaction with hash-based retrieval
350
- - **Persistent memory** — learned patterns stored in `.oa/memory/` across sessions
352
+ - **Persistent memory** — learned patterns stored in `.omnius/memory/` across sessions
351
353
  - **Structured procedural memory (SQLite)** — replaces flat JSON with a full relational database: CRUD with soft-delete, revision tracking, embedding storage (float32 BLOB), bidirectional memory linking with confidence scores. Inspired by [ExpeL](https://arxiv.org/abs/2308.10144) (contrastive extraction) and [TIMG](https://arxiv.org/abs/2603.10600) (structured procedural format). 79 unit tests
352
354
  - **Semantic memory search** — vector embeddings via [Ollama /api/embed](https://ollama.com) (nomic-embed-text, 768-dim) with cosine similarity search over stored memories. Auto-generates embeddings on memory creation. Auto-links related memories when similarity > 0.6. Graceful fallback to text search when Ollama unavailable
353
355
  - **LLM-based memory extraction** — post-task, the LLM itself extracts structured procedural memories (CATEGORY/TRIGGER/LESSON/STEPS) instead of copying raw error text verbatim. Based on [ExpeL](https://arxiv.org/abs/2308.10144) and [AWM](https://arxiv.org/abs/2409.07429) patterns
@@ -355,13 +357,13 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
355
357
  - **IPFS sharing surface** — `/ipfs` status page with peer info + identity kernel metrics + memory sentiment. `/ipfs pin <CID>` to pin remote agent content. `/ipfs publish` to share identity kernel. `/ipfs share tool/skill` to publish agent-created tools with secret stripping. `/ipfs import <CID>` to retrieve shared content
356
358
  - **Fortemi-React bridge** — `/fortemi start/status/stop` connects to [fortemi-react](https://github.com/robit-man/fortemi-react) (browser-first PGlite+pgvector knowledge system) via JWT auth. Proxy tools: `fortemi_capture`, `fortemi_search`, `fortemi_list`, `fortemi_get` auto-register when bridge is connected
357
359
  - **Content ingestion** — `/ingest <file>` imports audio (transcribe via Whisper), PDF (pdftotext), or text files into structured memory with 800-char/100-overlap chunking (matches fortemi pattern)
358
- - **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.oa/images/`
359
- - **Node visualization** — [openagents.nexus](https://github.com/robit-man/openagents.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents
360
+ - **Image generation** — `generate_image` tool using Ollama experimental models ([x/z-image-turbo](https://ollama.com/x/z-image-turbo), [x/flux2-klein](https://ollama.com/x/flux2-klein)). Auto-detect or auto-pull models. Saves PNG to `.omnius/images/`
361
+ - **Node visualization** — [omnius.nexus](https://github.com/robit-man/omnius.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents
360
362
  - **TTS sanitizer** — strips markdown syntax (`##`, `**`, `` ` ``), emoji (prevents "white heavy checkmark"), box-drawing chars, and ANSI codes before feeding to ALL TTS engines
361
363
  - **LuxTTS gapless playback** — look-ahead pre-synthesis pipeline: next chunk synthesizes while current plays, eliminating inter-sentence gaps. Jetson ARM support with NVIDIA's prebuilt PyTorch wheel
362
364
  - **Unified color scheme** — `ui.primary` (252), `ui.error` (198/magenta), `ui.warn` (214/orange), `ui.accent` (178/yellow) applied consistently across all TUI surfaces
363
365
  - **Clickable header buttons** — `help`, `voice`, `cohere`, `model` buttons on banner row 3 with hover/click visual states. OSC 8 hyperlinks for pointer cursor. Mouse click fires the slash command directly
364
- - **Dynamic terminal title** — updates with current task + version: `"fix auth bug · OA v0.141.0"`
366
+ - **Dynamic terminal title** — updates with current task + version: `"fix auth bug · Omnius v0.141.0"`
365
367
  - **Session context persistence** — auto-saves context on task completion, manual `/context save|restore` across sessions
366
368
  - **Self-learning** — auto-fetches docs from the web when encountering unfamiliar APIs
367
369
  - **Seamless `/update`** — in-place update and reload with automatic context save/restore
@@ -410,20 +412,20 @@ Run Omnius as a headless service for CI/CD pipelines, automation, and enterprise
410
412
  ### Non-Interactive Mode
411
413
 
412
414
  ```bash
413
- oa "fix all lint errors" --non-interactive # Run task, exit when done
414
- oa "generate API docs" --json # Structured JSON output (no ANSI)
415
- oa "run security audit" --background # Detached background job
415
+ omnius "fix all lint errors" --non-interactive # Run task, exit when done
416
+ omnius "generate API docs" --json # Structured JSON output (no ANSI)
417
+ omnius "run security audit" --background # Detached background job
416
418
  ```
417
419
 
418
420
  ### Background Jobs
419
421
 
420
422
  ```bash
421
- oa "migrate database" --background # Returns job ID immediately
422
- oa status job-abc123 # Check job progress
423
- oa jobs # List all running/completed jobs
423
+ omnius "migrate database" --background # Returns job ID immediately
424
+ omnius status job-abc123 # Check job progress
425
+ omnius jobs # List all running/completed jobs
424
426
  ```
425
427
 
426
- Jobs run as detached processes — survive terminal disconnection. Output saved to `.oa/jobs/{id}.json`.
428
+ Jobs run as detached processes — survive terminal disconnection. Output saved to `.omnius/jobs/{id}.json`.
427
429
 
428
430
  ### JSON Output Mode
429
431
 
@@ -439,15 +441,15 @@ Pipe to `jq`, ingest into monitoring systems, or feed to other agents.
439
441
  ### Process Management
440
442
 
441
443
  ```bash
442
- /destroy processes # Kill orphaned OA processes (local project)
443
- /destroy processes --global # Kill ALL orphaned OA processes system-wide
444
+ /destroy processes # Kill orphaned Omnius processes (local project)
445
+ /destroy processes --global # Kill ALL orphaned Omnius processes system-wide
444
446
  ```
445
447
 
446
- Shows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale OA instances.
448
+ Shows per-process RAM and CPU usage before killing. Detects: cloudflared tunnels, nexus daemons, headless Chrome, TTS servers, Python REPLs, stale Omnius instances.
447
449
 
448
450
  ### REST API Service (Port 11435)
449
451
 
450
- Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — installed automatically by `npm i -g omnius` (systemd user unit on Linux, launchd on macOS, scheduled task on Windows). It exposes the **full OA capability surface** through standards most organizations expect:
452
+ Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — installed automatically by `npm i -g omnius` (systemd user unit on Linux, launchd on macOS, scheduled task on Windows). It exposes the **full Omnius capability surface** through standards most organizations expect:
451
453
 
452
454
  - **OpenAI / Ollama drop-in** — `/v1/chat`, `/v1/chat/completions`, `/v1/embeddings`, `/v1/models` are wire-compatible with both ecosystems
453
455
  - **API discovery** — `GET /help` returns a full human and agent-readable guide with quickstart curl commands, all 70+ endpoints by category, MCP integration instructions, and auth documentation
@@ -462,19 +464,19 @@ Omnius runs a persistent enterprise-grade REST API on `127.0.0.1:11435` — inst
462
464
  - **`X-Request-ID`** echoed or generated for correlation
463
465
  - **SSE event bus** at `/v1/events` with optional `?type=foo.*` filter, tagged with `aims:control` for auditors
464
466
  - **Bearer auth + scoped keys** (`read` / `run` / `admin`) and OIDC JWT support
465
- - **Per-key concurrency limits** (`maxJobs` in `OA_API_KEYS` is now actually enforced)
467
+ - **Per-key concurrency limits** (`maxJobs` in `OMNIUS_API_KEYS` is now actually enforced)
466
468
  - **Atomic job record writes** with 64-bit job IDs (no race conditions)
467
469
  - **OpenAPI 3.0** at `/openapi.json` and Swagger UI at `/docs`
468
470
  - **Web chat UI** at `/`
469
471
 
470
- > **Daemon auto-start.** After `npm i -g omnius`, the daemon comes online automatically. Verify with `systemctl --user status omnius-daemon` (Linux) or `launchctl print gui/$(id -u)/ai.omnius.daemon` (macOS). Opt out with `OA_SKIP_DAEMON_INSTALL=1 npm i -g omnius`.
472
+ > **Daemon auto-start.** After `npm i -g omnius`, the daemon comes online automatically. Verify with `systemctl --user status omnius-daemon` (Linux) or `launchctl print gui/$(id -u)/ai.omnius.daemon` (macOS). Opt out with `OMNIUS_SKIP_DAEMON_INSTALL=1 npm i -g omnius`.
471
473
 
472
474
  ```bash
473
475
  # Manually run the server (the daemon already does this for you)
474
- oa serve # Start on default port 11435
475
- oa serve --port 9999 # Custom port
476
- OA_API_KEY=mysecret oa serve # Single admin key
477
- OA_API_KEYS="key1:admin:alice:30:50000:5,key2:run:ci:60::3,key3:read:grafana" oa serve # Scoped multi-key with rpm:tpd:maxjobs
476
+ omnius serve # Start on default port 11435
477
+ omnius serve --port 9999 # Custom port
478
+ OMNIUS_API_KEY=mysecret omnius serve # Single admin key
479
+ OMNIUS_API_KEYS="key1:admin:alice:30:50000:5,key2:run:ci:60::3,key3:read:grafana" omnius serve # Scoped multi-key with rpm:tpd:maxjobs
478
480
  ```
479
481
 
480
482
  > **Every example below is verified against `omnius@0.187.189` on a live daemon.** Examples from earlier versions are deprecated.
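
The `OMNIUS_API_KEYS` value in the hunk above packs several keys into one string, each with up to six colon-separated fields (`key:scope:user:rpm:tpd:maxJobs` per the README comment). As a rough sketch of how a client-side tool might validate such a string before exporting it, assuming that empty or missing trailing fields mean "no limit" and that keys never contain `:` or `,` (neither is stated in the source), and with `ApiKey` and `parse_api_keys` being hypothetical names:

```python
from dataclasses import dataclass
from typing import List, Optional

# Scopes named in the README: read / run / admin.
SCOPES = {"read", "run", "admin"}


@dataclass
class ApiKey:
    key: str
    scope: str
    user: Optional[str] = None
    rpm: Optional[int] = None       # fourth field
    tpd: Optional[int] = None       # fifth field
    max_jobs: Optional[int] = None  # sixth field (maxJobs)


def _opt_int(field: str) -> Optional[int]:
    """Empty field means 'unset' (an assumption, see lead-in)."""
    return int(field) if field else None


def parse_api_keys(spec: str) -> List[ApiKey]:
    """Parse a comma-separated list of key:scope[:user[:rpm[:tpd[:maxJobs]]]]."""
    keys = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        fields = entry.split(":")
        if not 2 <= len(fields) <= 6:
            raise ValueError(f"expected 2 to 6 fields, got {len(fields)}: {entry!r}")
        # Pad so short entries like "key3:read:grafana" unpack cleanly.
        fields += [""] * (6 - len(fields))
        key, scope, user, rpm, tpd, max_jobs = fields
        if scope not in SCOPES:
            raise ValueError(f"unknown scope {scope!r} in {entry!r}")
        keys.append(ApiKey(key, scope, user or None,
                           _opt_int(rpm), _opt_int(tpd), _opt_int(max_jobs)))
    return keys
```

For the multi-key string shown in the hunk, this yields three entries; `key2` has no `tpd` value and `key3` carries no limits at all.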
@@ -484,7 +486,7 @@ OA_API_KEYS="key1:admin:alice:30:50000:5,key2:run:ci:60::3,key3:read:grafana" oa
484
486
  Control who can reach the daemon and where it binds:
485
487
 
486
488
  - TUI commands: `/access loopback|lan|any`, `/host <host[:port]>`, `/network config` (interactive), `--local` to save per‑project.
487
- - Environment: `OA_ACCESS=loopback|lan|any`, `OA_HOST=host[:port]`.
489
+ - Environment: `OMNIUS_ACCESS=loopback|lan|any`, `OMNIUS_HOST=host[:port]`.
488
490
  - See Configuration → [Network Access & Binding](#network-access--binding) for full details and security guidance.
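
For scripts that set the renamed `OMNIUS_ACCESS` and `OMNIUS_HOST` variables, a small validation step can catch typos before the daemon starts. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the defaults (`loopback`, `127.0.0.1`, port `11435`) are inferred from the README's statement that the API listens on `127.0.0.1:11435`, and IPv6 literals are not handled.

```python
# Access modes listed in the hunk above.
ACCESS_MODES = ("loopback", "lan", "any")


def parse_host(value, default_port=11435):
    """Split 'host[:port]' into (host, port). IPv6 literals are out of scope."""
    host, sep, port = value.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    return value, default_port


def read_network_env(env):
    """Return (access, host, port) from a mapping such as os.environ."""
    access = env.get("OMNIUS_ACCESS", "loopback")  # assumed default
    if access not in ACCESS_MODES:
        raise ValueError(f"OMNIUS_ACCESS must be one of {ACCESS_MODES}, got {access!r}")
    host, port = parse_host(env.get("OMNIUS_HOST", "127.0.0.1"))
    return access, host, port
```

Calling `read_network_env(os.environ)` in a launcher script gives a single place to reject an invalid access mode or a malformed host value.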
489
491
 
490
492
  #### Working Directory
@@ -532,12 +534,12 @@ curl http://localhost:11435/version
532
534
  curl http://localhost:11435/metrics
533
535
  ```
534
536
  ```
535
- # HELP oa_requests_total Total HTTP requests
536
- # TYPE oa_requests_total counter
537
- oa_requests_total{method="POST",path="/v1/chat/completions",status="200"} 47
538
- oa_tokens_in_total 12450
539
- oa_tokens_out_total 8230
540
- oa_errors_total 0
537
+ # HELP omnius_requests_total Total HTTP requests
538
+ # TYPE omnius_requests_total counter
539
+ omnius_requests_total{method="POST",path="/v1/chat/completions",status="200"} 47
540
+ omnius_tokens_in_total 12450
541
+ omnius_tokens_out_total 8230
542
+ omnius_errors_total 0
541
543
  ```
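
Because the metric names change prefix from `oa_` to `omnius_` in this hunk, dashboards or alerts that query the old names will likely need updating. The following sketch, with hypothetical helper names, parses the plain-text output shown above into per-metric totals and flags any names that still use the old prefix. It handles only the simple `name{labels} value` form from the sample (no timestamps, no escaped braces inside label values).

```python
import re

# name, optional {labels}, then a numeric value
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)$')


def parse_metrics(text):
    """Sum sample values per metric name, ignoring comments and labels."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if not match:
            continue  # skip anything this simple parser does not understand
        name, _labels, value = match.groups()
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def legacy_metric_names(text):
    """Names still using the pre-rename 'oa_' prefix."""
    return sorted(n for n in parse_metrics(text) if n.startswith("oa_"))


SAMPLE = """\
# HELP omnius_requests_total Total HTTP requests
# TYPE omnius_requests_total counter
omnius_requests_total{method="POST",path="/v1/chat/completions",status="200"} 47
omnius_tokens_in_total 12450
omnius_tokens_out_total 8230
omnius_errors_total 0
"""
```

Running `parse_metrics` over the body returned by `/metrics` gives a dictionary keyed by metric name, which is usually enough for a quick smoke test after upgrading.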
542
544
 
543
545
  #### OpenAI-Compatible Inference
@@ -590,7 +592,7 @@ data: [DONE]
590
592
 
591
593
  #### Agentic Task Execution
592
594
 
593
- The unique OA capability — submit a coding task and get an autonomous agent loop.
595
+ The unique Omnius capability — submit a coding task and get an autonomous agent loop.
594
596
 
595
597
  ```bash
596
598
  # Run task in your current directory
@@ -728,7 +730,7 @@ curl -X POST http://localhost:11435/v1/commands/destroy \
728
730
 
729
731
  ```bash
730
732
  # Multi-key setup: read (monitoring), run (CI), admin (ops)
731
- OA_API_KEYS="grafana-key:read:grafana,ci-key:run:github-actions,ops-key:admin:ops-team" oa serve
733
+ OMNIUS_API_KEYS="grafana-key:read:grafana,ci-key:run:github-actions,ops-key:admin:ops-team" omnius serve
732
734
  ```
733
735
 
734
736
  | Scope | Can do | Cannot do |
@@ -828,21 +830,21 @@ curl -X DELETE -H "Authorization: Bearer $ADMIN_KEY" \
828
830
 
829
831
  The daemon is built for **unbounded concurrent requests** with per-key enforcement. Every agentic task (`/v1/run`, `/v1/chat`, `/api/chat`, `/api/generate`) spawns its own subprocess, so multiple jobs run in true parallel — same model or different models, same or different profiles, same or different sandbox modes.
830
832
 
831
- **Per-key concurrency limits** are enforced from the `OA_API_KEYS` env var:
833
+ **Per-key concurrency limits** are enforced from the `OMNIUS_API_KEYS` env var:
832
834
 
833
835
  ```bash
834
836
  # key:scope:user:rpm:tpd:maxJobs
835
- OA_API_KEYS="ci-key:run:github-actions:60:100000:5, \
837
+ OMNIUS_API_KEYS="ci-key:run:github-actions:60:100000:5, \
836
838
  ops-key:admin:ops:120:500000:20, \
837
839
  read-key:read:grafana:600::"
838
- oa serve
840
+ omnius serve
839
841
  ```
840
842
 
841
843
  The 6th field is `maxJobs` — the maximum number of **concurrent** (in-flight) agentic tasks for that key. When exceeded, the daemon returns **RFC 7807 `429 Too Many Requests`**:
842
844
 
843
845
  ```json
844
846
  {
845
- "type": "https://openagents.nexus/problems/rate-limited",
847
+ "type": "https://omnius.nexus/problems/rate-limited",
846
848
  "title": "Concurrent job limit exceeded",
847
849
  "status": 429,
848
850
  "detail": "Concurrent job limit exceeded for github-actions: 5/5",
@@ -869,7 +871,7 @@ done
869
871
  wait
870
872
  ```
871
873
 
872
- Each subprocess inherits a **clean env** — `OA_DAEMON` and `OA_PORT` are explicitly stripped so the child doesn't re-enter daemon mode. Fixed in v0.187.189 (root cause of the earlier "Task incomplete (0 turns, 0 tool calls)" bug).
874
+ Each subprocess inherits a **clean env** — `OMNIUS_DAEMON` and `OMNIUS_PORT` are explicitly stripped so the child doesn't re-enter daemon mode. Fixed in v0.187.189 (root cause of the earlier "Task incomplete (0 turns, 0 tool calls)" bug).
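
The "clean env" idea described above can be illustrated in a few lines. This is a generic sketch of the technique, not the daemon's actual spawning code (which the diff does not show): `clean_child_env` and `probe_child` are hypothetical names, and only the two variable names come from the README.

```python
import os
import subprocess
import sys

# Variables the README says are stripped before spawning a child.
STRIPPED = ("OMNIUS_DAEMON", "OMNIUS_PORT")


def clean_child_env(parent_env):
    """Copy the parent environment without the daemon-mode variables."""
    return {k: v for k, v in parent_env.items() if k not in STRIPPED}


def probe_child(parent_env):
    """Spawn a child Python process and report whether it sees OMNIUS_DAEMON."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print('OMNIUS_DAEMON' in os.environ)"],
        env=clean_child_env(parent_env),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Simulate a daemon process whose environment carries both variables.
    daemon_env = dict(os.environ, OMNIUS_DAEMON="1", OMNIUS_PORT="11435")
    print(probe_child(daemon_env))
```

Filtering into a new dictionary, rather than deleting keys from `os.environ`, leaves the parent process untouched while the child starts without the daemon markers.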
873
875
 
874
876
  **Observing parallelism live** — subscribe to the event bus to watch every job lifecycle event:
875
877
 
@@ -930,7 +932,7 @@ Also cleans up the Docker container if the job was spawned with `"sandbox":"cont
930
932
  | Method | Path | Auth | Description |
931
933
  |--------|------|------|-------------|
932
934
  | POST | `/v1/chat` | run | Full agent under the hood, OpenAI chat.completion shape. Default = tools=true (subprocess agent). Set `tools:false` for direct backend bypass. Supports `timeout_s` body field (default 180s). Non-streaming path has a safety SIGTERM→SIGKILL after `timeout_s + 30s`. |
933
- | POST | `/api/chat` | run | **Ollama-compatible alias** — same handler as `/v1/chat`. Accepts both OA-shape (`{message, model}`) and Ollama-shape (`{model, messages: [...]}`) bodies. Returns OpenAI `chat.completion` shape on success and failure (failure uses `finish_reason:"error"`). |
935
+ | POST | `/api/chat` | run | **Ollama-compatible alias** — same handler as `/v1/chat`. Accepts both Omnius-shape (`{message, model}`) and Ollama-shape (`{model, messages: [...]}`) bodies. Returns OpenAI `chat.completion` shape on success and failure (failure uses `finish_reason:"error"`). |
934
936
  | POST | `/v1/generate` | run | **One-off completion** — same agent stack as `/v1/chat` but no session history. Returns Ollama-shape `{model, response, done, total_duration}`. |
935
937
  | POST | `/api/generate` | run | **Ollama-compatible alias** of `/v1/generate`. Drop-in for Ollama `/api/generate`. |
936
938
  | GET | `/v1/chat/sessions` | read | List active chat sessions |
@@ -997,7 +999,7 @@ Also cleans up the Docker container if the job was spawned with `"sandbox":"cont
997
999
  **Sessions + context**
998
1000
  | Method | Path | Auth | Description |
999
1001
  |--------|------|------|-------------|
1000
- | GET | `/v1/sessions` | read | OA task session archive |
1002
+ | GET | `/v1/sessions` | read | Omnius task session archive |
1001
1003
  | GET | `/v1/sessions/:id` | read | Session history |
1002
1004
  | GET | `/v1/context` | read | Show current session context |
1003
1005
  | POST | `/v1/context/save` | run | Save a context entry |
@@ -1064,15 +1066,15 @@ The chat endpoint is mounted at **two paths on port 11435**:
1064
1066
 
1065
1067
  | Path | Purpose |
1066
1068
  |------|---------|
1067
- | `POST /v1/chat` | OA-native path |
1069
+ | `POST /v1/chat` | Omnius-native path |
1068
1070
  | `POST /api/chat` | **Ollama-compatible alias** — same handler, so clients pointing at Ollama can be flipped over by changing only the port (`11434` → `11435`) |
1069
1071
 
1070
- It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/chat`**. The endpoint runs the full OA agent (tools, multi-agent, memory, skills) under the hood and returns an **OpenAI `chat.completion`-shaped response** so any client SDK can use it without modification.
1072
+ It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/chat`**. The endpoint runs the full Omnius agent (tools, multi-agent, memory, skills) under the hood and returns an **OpenAI `chat.completion`-shaped response** so any client SDK can use it without modification.
1071
1073
 
1072
1074
  **Both body shapes are accepted** on either path:
1073
1075
 
1074
1076
  ```jsonc
1075
- // OA-native
1077
+ // Omnius-native
1076
1078
  {"message": "hello", "model": "qwen3.5:9b", "stream": false}
1077
1079
 
1078
1080
  // Ollama-native (the `messages` array; the last user message is extracted)
@@ -1080,18 +1082,18 @@ It's a **drop-in replacement for OpenAI `/v1/chat/completions` and Ollama `/api/
1080
1082
  ```
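To illustrate how the two body shapes can be treated as equivalent, here is a minimal Python sketch that reduces either one to a single prompt string. The helper name `extract_prompt` is hypothetical and not part of Omnius. The only behaviour taken from the text above is that the last user message of a `messages` array is the one used.

```python
from typing import Any, Dict


def extract_prompt(body: Dict[str, Any]) -> str:
    """Return the user prompt from either accepted body shape (hypothetical helper)."""
    # Omnius-shape: {"message": "...", "model": "..."}
    message = body.get("message")
    if isinstance(message, str):
        return message

    # Ollama-shape: {"model": "...", "messages": [{"role": ..., "content": ...}, ...]}
    # Walk backwards so the most recent user turn wins.
    for entry in reversed(body.get("messages") or []):
        if entry.get("role") == "user":
            return str(entry.get("content", ""))

    raise ValueError("body has neither 'message' nor a user entry in 'messages'")
```

A client-side check like this can be handy in tests that send the same prompt in both shapes and expect the same answer.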
1081
1083
 
1082
1084
  > **Two execution modes:**
1083
- > - **Default (`tools` unset or `tools: true`)** — full agent: spawns the OA subprocess with the entire 82-tool set, runs the agent loop, returns the final answer with `tool_calls` metadata.
1085
+ > - **Default (`tools` unset or `tools: true`)** — full agent: spawns the Omnius subprocess with the entire 82-tool set, runs the agent loop, returns the final answer with `tool_calls` metadata.
1084
1086
  > - **Direct (`tools: false`)** — fast path: bypasses the agent and forwards straight to the configured backend (Ollama/vLLM) using the session history. Useful for plain chat without tools.
1085
1087
 
1086
1088
  **Safety timeout** — every non-streaming request is bounded by `timeout_s` (default **180s**). If the agent subprocess doesn't close in `timeout_s + 30s`, the daemon SIGTERMs (then SIGKILLs) it and returns an OpenAI-shaped error with `finish_reason:"error"` and a clear explanation. No more hung requests.
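The escalation described above (wait `timeout_s + 30s`, then SIGTERM, then SIGKILL) can be sketched in Python with the standard `subprocess` module. This is an illustration of the pattern, not the daemon's actual code: the function name, the returned status strings, and the `kill_wait_s` pause between SIGTERM and SIGKILL are assumptions, since the README does not state how long the daemon waits between the two signals.

```python
import subprocess
import sys
from typing import List, Tuple


def run_with_deadline(cmd: List[str], timeout_s: float, grace_s: float = 30.0,
                      kill_wait_s: float = 5.0) -> Tuple[str, int]:
    """Run cmd and escalate SIGTERM -> SIGKILL if it outlives timeout_s + grace_s."""
    proc = subprocess.Popen(cmd)
    try:
        return "completed", proc.wait(timeout=timeout_s + grace_s)
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX
        try:
            return "terminated", proc.wait(timeout=kill_wait_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX; kill_wait_s is an assumed value
            return "killed", proc.wait()


if __name__ == "__main__":
    # A child that would sleep far past the deadline is stopped early.
    child = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_deadline(child, timeout_s=1, grace_s=1))
```

Bounding the wait after SIGTERM as well means a child that ignores the signal cannot hang the caller.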
1087
1089
 
1088
- **Flip Ollama → OA by port alone** — this is verified to work via `scripts/oa-vs-ollama-chat-compare.sh` (see [Live Comparison](#live-comparison-ollama-vs-oa-full-agent) below):
1090
+ **Flip Ollama → Omnius by port alone** — this is verified to work via `scripts/omnius-vs-ollama-chat-compare.sh` (see [Live Comparison](#live-comparison-ollama-vs-omnius-full-agent) below):
1089
1091
 
1090
1092
  ```bash
1091
1093
  # Before (Ollama)
1092
1094
  curl -s http://127.0.0.1:11434/api/chat -d '{"model":"qwen3.5:9b","messages":[{"role":"user","content":"hi"}],"stream":false}'
1093
1095
 
1094
- # After (OA with full agent) — only port changed
1096
+ # After (Omnius with full agent) — only port changed
1095
1097
  curl -s http://127.0.0.1:11435/api/chat -d '{"model":"qwen3.5:9b","messages":[{"role":"user","content":"hi"}],"stream":false}'
1096
1098
  ```
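The same port flip can be done from Python using only the standard library. The sketch below assumes the ports and request body shown in the curl commands above. The helper names are illustrative. One caveat for client code: according to the endpoint table, Omnius answers `/api/chat` with an OpenAI `chat.completion` shape, while Ollama's own non-streaming `/api/chat` reply carries the text in a top-level `message` object, so `reply_text` accepts both.

```python
import json
import urllib.request
from typing import Any, Dict

OLLAMA_PORT = 11434
OMNIUS_PORT = 11435


def build_chat_request(port: int, model: str, prompt: str) -> urllib.request.Request:
    """Build an Ollama-shape POST /api/chat request for the given local port."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reply_text(response: Dict[str, Any]) -> str:
    """Extract the assistant text from an OpenAI-shape or Ollama-shape reply."""
    if "choices" in response:  # OpenAI chat.completion shape
        return response["choices"][0]["message"]["content"]
    return response["message"]["content"]  # Ollama /api/chat shape


if __name__ == "__main__":
    # Requires a daemon listening locally; swap OMNIUS_PORT for OLLAMA_PORT to compare.
    request = build_chat_request(OMNIUS_PORT, "qwen3.5:9b", "hi")
    with urllib.request.urlopen(request, timeout=210) as resp:
        print(reply_text(json.load(resp)))
```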
1097
1099
 
@@ -1195,32 +1197,32 @@ curl -s http://localhost:11435/v1/chat \
1195
1197
 
1196
1198
  Sessions expire after 30 minutes of inactivity. List active sessions: `GET /v1/chat/sessions`.
1197
1199
 
1198
- #### Live Comparison: Ollama vs OA Full Agent
1200
+ #### Live Comparison: Ollama vs Omnius Full Agent
1199
1201
 
1200
- The repo ships a reproducible side-by-side harness at [`scripts/oa-vs-ollama-chat-compare.sh`](scripts/oa-vs-ollama-chat-compare.sh). It runs **5 tool-call-required prompts** × **4 phases** (Ollama non-stream, OA non-stream, Ollama stream, OA stream) = **20 runs per invocation** with the same model and the same `/api/chat` path on both ports.
1202
+ The repo ships a reproducible side-by-side harness at [`scripts/omnius-vs-ollama-chat-compare.sh`](scripts/omnius-vs-ollama-chat-compare.sh). It runs **5 tool-call-required prompts** × **4 phases** (Ollama non-stream, Omnius non-stream, Ollama stream, Omnius stream) = **20 runs per invocation** with the same model and the same `/api/chat` path on both ports.
1201
1203
 
1202
1204
  ```bash
1203
- MODEL=qwen3.5:9b bash scripts/oa-vs-ollama-chat-compare.sh
1205
+ MODEL=qwen3.5:9b bash scripts/omnius-vs-ollama-chat-compare.sh
1204
1206
  ```
1205
1207
 
1206
1208
  **Results from `omnius@0.187.191` with `qwen3.5:9b`** (all 20 runs completed, zero timeouts):
1207
1209
 
1208
1210
  | # | Prompt | Ollama (bare) | Omnius (full agent) | Winner |
1209
1211
  |---|---|---|---|---|
1210
- | 1 | "Latest stable Node.js version + source URL" | ❌ **v22.10.0** — hallucinated from Aug-2024 training cutoff | ✅ **v25.9.0** fetched from `nodejs.org/download/current`, **3 tool calls** (`web_search` → `web_fetch` → `task_complete`) | **OA** |
1211
- | 2 | "Biggest tech news this week + source URL" | ❌ "I don't have real-time access" + generic AI trend guess | ✅ **Anthropic Mythos, Intel Terafab, Apple foldable, Russian router breach, Firmus $5.5B** — sourced from TechCrunch, **4 tool calls** | **OA** |
1212
- | 3 | "Current OS, CPU cores, free memory — use shell tools" | ❌ Confabulated **"Linux / 8 cores / 6.1 GB"** (all wrong) | ✅ **Ubuntu 24.04.2 / 48 cores / 120 GB** (all correct), **6–7 shell tool calls** | **OA** |
1213
- | 4 | "List files in cwd, count top level, most recent" | ❌ "I cannot access your filesystem" | ✅ **20 files, 50+ dirs, `.claude.json` (81 KB, 09:09 UTC)** via `list_directory`, **2 tool calls** | **OA** |
1214
- | 5 | "2022 FIFA World Cup final winner + score" (both endpoints have this in training data) | ✅ Argentina 4–2 France | ✅ Argentina 3–3 France, **4–2 on penalties at Lusail Stadium, Dec 18 2022** — grounded with 4 tool calls | **Tie (OA more detailed)** |
1212
+ | 1 | "Latest stable Node.js version + source URL" | ❌ **v22.10.0** — hallucinated from Aug-2024 training cutoff | ✅ **v25.9.0** fetched from `nodejs.org/download/current`, **3 tool calls** (`web_search` → `web_fetch` → `task_complete`) | **Omnius** |
1213
+ | 2 | "Biggest tech news this week + source URL" | ❌ "I don't have real-time access" + generic AI trend guess | ✅ **Anthropic Mythos, Intel Terafab, Apple foldable, Russian router breach, Firmus $5.5B** — sourced from TechCrunch, **4 tool calls** | **Omnius** |
1214
+ | 3 | "Current OS, CPU cores, free memory — use shell tools" | ❌ Confabulated **"Linux / 8 cores / 6.1 GB"** (all wrong) | ✅ **Ubuntu 24.04.2 / 48 cores / 120 GB** (all correct), **6–7 shell tool calls** | **Omnius** |
1215
+ | 4 | "List files in cwd, count top level, most recent" | ❌ "I cannot access your filesystem" | ✅ **20 files, 50+ dirs, `.claude.json` (81 KB, 09:09 UTC)** via `list_directory`, **2 tool calls** | **Omnius** |
1216
+ | 5 | "2022 FIFA World Cup final winner + score" (both endpoints have this in training data) | ✅ Argentina 4–2 France | ✅ Argentina 3–3 France, **4–2 on penalties at Lusail Stadium, Dec 18 2022** — grounded with 4 tool calls | **Tie (Omnius more detailed)** |
1215
1217
 
1216
1218
  **Latency profile** (wall clock, 5-prompt median):
1217
1219
 
1218
- | Phase | Ollama | OA agent | OA overhead |
1220
+ | Phase | Ollama | Omnius agent | Omnius overhead |
1219
1221
  |---|---|---|---|
1220
1222
  | Non-streaming | 12–18s | 24–42s | 12–26s (agent loop + tool calls) |
1221
1223
  | Streaming SSE | 11–16s | 24–56s | 10–40s |
1222
1224
 
1223
- **Streaming parser validation** — every OA stream delivered:
1225
+ **Streaming parser validation** — every Omnius stream delivered:
1224
1226
  - Live intermediate `tool_call` events mid-stream (e.g. `['web_search', 'web_fetch', 'task_complete']`)
1225
1227
  - OpenAI `chat.completion.chunk` deltas with `id`, `model`, `finish_reason`
1226
1228
  - Clean `data: [DONE]` termination with `finish_reason:"stop"`
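A client can check the same three properties with a small parser. The Python sketch below consumes `data:` lines, stops at the `[DONE]` sentinel, and joins the `delta.content` pieces of OpenAI-style `chat.completion.chunk` objects. The exact shape of the intermediate `tool_call` events is not documented in this section, so the sketch simply skips any payload without `choices`. The function names are illustrative.

```python
import json
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from 'data:' lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Concatenate delta.content pieces and return (text, last finish_reason seen)."""
    parts: List[str] = []
    finish_reason: Optional[str] = None
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason
```

Feeding it the decoded lines of a streaming response should end with a `finish_reason` of `"stop"` when the stream terminated cleanly.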
@@ -1228,12 +1230,12 @@ MODEL=qwen3.5:9b bash scripts/oa-vs-ollama-chat-compare.sh
1228
1230
  The harness is **reproducible** — rerun it after any `/v1/chat` change to catch regressions:
1229
1231
 
1230
1232
  ```bash
1231
- MODEL=qwen3.5:4b bash scripts/oa-vs-ollama-chat-compare.sh # faster tier for quick smoke
1232
- MODEL=qwen3.5:9b OA_TIMEOUT=300 bash scripts/oa-vs-ollama-chat-compare.sh # default
1233
- MODEL=qwen3.5:32b OA_TIMEOUT=600 bash scripts/oa-vs-ollama-chat-compare.sh # higher tier
1233
+ MODEL=qwen3.5:4b bash scripts/omnius-vs-ollama-chat-compare.sh # faster tier for quick smoke
1234
+ MODEL=qwen3.5:9b OMNIUS_TIMEOUT=300 bash scripts/omnius-vs-ollama-chat-compare.sh # default
1235
+ MODEL=qwen3.5:32b OMNIUS_TIMEOUT=600 bash scripts/omnius-vs-ollama-chat-compare.sh # higher tier
1234
1236
  ```
1235
1237
 
1236
- **Bottom line**: for any question that needs fresh data, system access, or filesystem visibility — bare Ollama is wrong or refuses; OA with the full agent is correct with citations. That's the differentiator captured live in the harness output.
1238
+ **Bottom line**: for any question that needs fresh data, system access, or filesystem visibility — bare Ollama is wrong or refuses; Omnius with the full agent is correct with citations. That's the differentiator captured live in the harness output.
1237
1239
 
1238
1240
  #### One-Off Completions — `/api/generate` + `/v1/generate`
1239
1241
 
@@ -1244,11 +1246,11 @@ Drop-in for **Ollama `/api/generate`**. Same body shape, same response shape, sa
1244
1246
  curl -s http://127.0.0.1:11434/api/generate \
1245
1247
  -d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
1246
1248
 
1247
- # OA with full agent — only port changed
1249
+ # Omnius with full agent — only port changed
1248
1250
  curl -s http://127.0.0.1:11435/api/generate \
1249
1251
  -d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false}'
1250
1252
 
1251
- # OA direct backend bypass (fast path, no agent)
1253
+ # Omnius direct backend bypass (fast path, no agent)
1252
1254
  curl -s http://127.0.0.1:11435/api/generate \
1253
1255
  -d '{"model":"qwen3.5:9b","prompt":"Name 3 open-source databases.","stream":false,"tools":false}'
1254
1256
  ```
@@ -1273,7 +1275,7 @@ curl -s http://127.0.0.1:11435/api/generate \
1273
1275
  }
1274
1276
  ```
1275
1277
 
1276
- The `_oa` extension block carries the OA-specific metadata (tool call count, agent duration, request ID for correlation with `/v1/audit`). Strict Ollama clients ignore unknown fields — no client changes required.
1278
+ The `_oa` extension block carries the Omnius-specific metadata (tool call count, agent duration, request ID for correlation with `/v1/audit`). Strict Ollama clients ignore unknown fields — no client changes required.
1277
1279
 
1278
1280
  **Streaming** — set `"stream": true` and receive Ollama-style NDJSON chunks:
1279
1281
 
@@ -1349,18 +1351,18 @@ The `strength` and `lastRetrieved` fields are updated on every search — the st
1349
1351
 
1350
1352
  #### Generate/Embed/Memory Test Harness
1351
1353
 
1352
- A second harness at [`scripts/oa-vs-ollama-generate-embed-memory.sh`](scripts/oa-vs-ollama-generate-embed-memory.sh) covers the four non-chat endpoint families:
1354
+ A second harness at [`scripts/omnius-vs-ollama-generate-embed-memory.sh`](scripts/omnius-vs-ollama-generate-embed-memory.sh) covers the four non-chat endpoint families:
1353
1355
 
1354
1356
  ```bash
1355
1357
  MODEL=qwen3.5:9b EMBED_MODEL=nomic-embed-text \
1356
- bash scripts/oa-vs-ollama-generate-embed-memory.sh
1358
+ bash scripts/omnius-vs-ollama-generate-embed-memory.sh
1357
1359
  ```
1358
1360
 
1359
1361
  **Tested results from `omnius@0.187.195`** (live, single run, `qwen3.5:9b` + `nomic-embed-text`):
1360
1362
 
1361
1363
  **Part 1 — `/api/generate` one-off prompts**:
1362
1364
 
1363
- | Prompt | Ollama | OA direct | OA full agent |
1365
+ | Prompt | Ollama | Omnius direct | Omnius full agent |
1364
1366
  |---|---|---|---|
1365
1367
  | "TCP vs UDP in one sentence" | 26.8s — correct | 12.5s — correct | 43.8s — correct, **1 tool call** |
1366
1368
  | "One-line Python square function" | 32.1s — correct | 12.2s — correct | ~3min — correct, **2 tool calls** |
@@ -1368,7 +1370,7 @@ MODEL=qwen3.5:9b EMBED_MODEL=nomic-embed-text \
1368
1370
 
1369
1371
  **Part 2 — `/api/embed` cosine similarity sanity** (4 test sentences):
1370
1372
 
1371
- Both Ollama and OA emitted **identical 768-dim vectors** (same backend). Cosine similarity matrix:
1373
+ Both Ollama and Omnius emitted **identical 768-dim vectors** (same backend). Cosine similarity matrix:
1372
1374
 
1373
1375
  ```
1374
1376
  France→Par Paris→Fran Germany→Be Bananas
@@ -1618,7 +1620,7 @@ curl -s -X POST http://localhost:11435/v1/files/read \
1618
1620
  #### Sessions, Context, Cost, Sponsors, Nexus
1619
1621
 
1620
1622
  ```bash
1621
- # OA task session archive (not chat sessions)
1623
+ # Omnius task session archive (not chat sessions)
1622
1624
  curl -s 'http://localhost:11435/v1/sessions?limit=10'
1623
1625
  curl -s http://localhost:11435/v1/sessions/{session_id}
1624
1626
 
@@ -1651,7 +1653,7 @@ curl -s -X POST http://localhost:11435/v1/files/read -d '{}'
1651
1653
  ```
1652
1654
  ```json
1653
1655
  {
1654
- "type": "https://openagents.nexus/problems/invalid-request",
1656
+ "type": "https://omnius.nexus/problems/invalid-request",
1655
1657
  "title": "Missing 'path'",
1656
1658
  "status": 400,
1657
1659
  "detail": "POST body must include {path: string, offset?: number, limit?: number}",
@@ -1695,7 +1697,7 @@ curl -s -o /dev/null -w '%{http_code}\n' \
1695
1697
 
1696
1698
  #### Web Interface
1697
1699
 
1698
- Open `http://localhost:11435/` in a browser when `oa serve` is running. Zero external dependencies — single self-contained HTML page.
1700
+ Open `http://localhost:11435/` in a browser when `omnius serve` is running. Zero external dependencies — single self-contained HTML page.
1699
1701
 
1700
1702
  **Tabs:**
1701
1703
  - **Chat** — Conversational interface using `/v1/chat` with full tool access, session persistence, streaming responses, and collapsible tool call dropdowns
@@ -1716,7 +1718,7 @@ Open `http://localhost:11435/` in a browser when `oa serve` is running. Zero ext
1716
1718
  - Token counter per conversation
1717
1719
  - Conversation export (Markdown or JSON)
1718
1720
  - GPU/VRAM detection with model compatibility recommendations
1719
- - Per-provider token tracking (persisted to `.oa/usage/token-usage.json`)
1721
+ - Per-provider token tracking (persisted to `.omnius/usage/token-usage.json`)
1720
1722
 
1721
1723
  ### Enterprise Licensing
1722
1724
 
@@ -1795,16 +1797,16 @@ SUGGESTED NEXT STEP: A completed todo claims a missing artifact...
1795
1797
  Prior `<world-state>` blocks are stripped before injecting the freshest one — only the current snapshot lives in context. Plan reconciliation uses `verifyCommand` + `declaredArtifacts` from the todo store + heuristic filename matching. Disk scan is gitignore-aware, capped at 200 files. Generic across stacks.
1796
1798
  *Lit anchors*: MetaGPT (Hong et al. ICLR 2024) — SOP-encoded state representation; AlphaCodium (Pinto 2024) — symbol-aware iteration.
1797
1799
 
1798
- Configurable via `OA_WORLD_STATE_INTERVAL` (default 8), `OA_WORLD_STATE_FILE_WRITE_THRESHOLD` (default 5), `OA_WORLD_STATE_MAX_FILES` (default 200).
1800
+ Configurable via `OMNIUS_WORLD_STATE_INTERVAL` (default 8), `OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD` (default 5), `OMNIUS_WORLD_STATE_MAX_FILES` (default 200).
1799
1801
 
1800
1802
  ### REG-47 — Backward-pass critic on `task_complete`
1801
1803
 
1802
- When the agent calls `task_complete` AND ≥ 1 file mutation occurred AND `OA_BACKWARD_PASS=on`, the orchestrator spawns a dedicated CRITIC sub-agent against the same backend. The critic gets the diff + plan reconciliation + recent failures + a 10-point structural audit checklist (dead refs, missing imports, off-by-one, null-handling, stateful regex, hardcoded paths, untested code paths, plan-disk gaps, unresolved failures, generic-vs-specific drift) and votes:
1804
+ When the agent calls `task_complete` AND ≥ 1 file mutation occurred AND `OMNIUS_BACKWARD_PASS=on`, the orchestrator spawns a dedicated CRITIC sub-agent against the same backend. The critic gets the diff + plan reconciliation + recent failures + a 10-point structural audit checklist (dead refs, missing imports, off-by-one, null-handling, stateful regex, hardcoded paths, untested code paths, plan-disk gaps, unresolved failures, generic-vs-specific drift) and votes:
1803
1805
  - **approve** → task_complete proceeds, run terminates
1804
1806
  - **request_changes** → issue feedback injected as a system message; agent loops to address
1805
1807
  - **reject** → critical event; same as request_changes but with escalation marker
1806
1808
 
1807
- Cycle-bounded (default 2 cycles before fail-soft). Default OFF — explicit opt-in via `OA_BACKWARD_PASS=on`.
1809
+ Cycle-bounded (default 2 cycles before fail-soft). Default OFF — explicit opt-in via `OMNIUS_BACKWARD_PASS=on`.
1808
1810
  *Lit anchors*: Self-Refine (Madaan et al. NeurIPS 2024) — +6-12% HumanEval correctness from a dedicated reviewer; CodeT (Chen et al. arxiv 2306.03907) — critic-contested implementer claims.
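The control flow of the critic can be summarised in a few lines of Python. This is a sketch under stated assumptions, not the orchestrator's implementation: `review` stands in for the critic sub-agent call, and the outcome labels `"completed"` and `"fail_soft"` are placeholders, since the text above does not specify what the fail-soft path does beyond ending the review cycles.

```python
from typing import Callable, List, Tuple

VERDICTS = {"approve", "request_changes", "reject"}


def backward_pass(review: Callable[[int], str], max_cycles: int = 2) -> Tuple[str, List[str]]:
    """Run up to max_cycles critic reviews and return (outcome, verdict history)."""
    history: List[str] = []
    for cycle in range(max_cycles):
        verdict = review(cycle)  # hypothetical critic call for this cycle
        if verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {verdict!r}")
        history.append(verdict)
        if verdict == "approve":
            return "completed", history
        # request_changes and reject both send feedback back to the agent,
        # which then loops; reject additionally carries an escalation marker.
    return "fail_soft", history
```

With the default of two cycles, a critic that never approves is consulted twice before the run falls through to the fail-soft path.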
1809
1811
 
1810
1812
  ### REG-48 — Cross-file specification drift detection
@@ -1859,29 +1861,29 @@ Run-by-run progression of the orchestrator:
1859
1861
  | #18 | 43/44/45/46/47 | killed @ ~30m, 8/9 phases done, test-debug stuck | 62 | ✓ | partial |
1860
1862
  | **#19** | **43/44/45/46/47/48** | **completed cleanly** | **62** | **✓** | **6/6 pass** |
1861
1863
 
1862
- Detailed archival report: [`.aiwg/oa-eval/RESULTS-RUN-19.md`](.aiwg/oa-eval/RESULTS-RUN-19.md).
1864
+ Detailed archival report: [`.aiwg/omnius-eval/RESULTS-RUN-19.md`](.aiwg/omnius-eval/RESULTS-RUN-19.md).
1863
1865
 
1864
1866
  ### Configuration summary
1865
1867
 
1866
1868
  ```bash
1867
1869
  # Defense activation (set in daemon env or systemd unit)
1868
- OA_BACKWARD_PASS=on # enable REG-47 critic (default: off)
1869
- OA_BACKWARD_PASS_MAX_CYCLES=2 # max review iterations
1870
- OA_BACKWARD_PASS_MIN_WRITES=1 # min file mutations to trigger review
1871
- OA_BACKWARD_PASS_TIMEOUT_MS=120000 # critic call timeout
1872
- OA_BACKWARD_PASS_MAX_TOKENS=4096 # critic response cap
1873
- OA_BACKWARD_PASS_MAX_FILES=60 # max files in critic prompt
1874
- OA_BACKWARD_PASS_MAX_FILE_PREVIEW=8000
1870
+ OMNIUS_BACKWARD_PASS=on # enable REG-47 critic (default: off)
1871
+ OMNIUS_BACKWARD_PASS_MAX_CYCLES=2 # max review iterations
1872
+ OMNIUS_BACKWARD_PASS_MIN_WRITES=1 # min file mutations to trigger review
1873
+ OMNIUS_BACKWARD_PASS_TIMEOUT_MS=120000 # critic call timeout
1874
+ OMNIUS_BACKWARD_PASS_MAX_TOKENS=4096 # critic response cap
1875
+ OMNIUS_BACKWARD_PASS_MAX_FILES=60 # max files in critic prompt
1876
+ OMNIUS_BACKWARD_PASS_MAX_FILE_PREVIEW=8000
1875
1877
 
1876
- OA_WORLD_STATE_INTERVAL=8 # REG-46 turn-cadence (default: 8)
1877
- OA_WORLD_STATE_FILE_WRITE_THRESHOLD=5 # REG-46 write-trigger (default: 5)
1878
- OA_WORLD_STATE_MAX_FILES=200 # REG-46 disk-scan cap
1878
+ OMNIUS_WORLD_STATE_INTERVAL=8 # REG-46 turn-cadence (default: 8)
1879
+ OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD=5 # REG-46 write-trigger (default: 5)
1880
+ OMNIUS_WORLD_STATE_MAX_FILES=200 # REG-46 disk-scan cap
1879
1881
 
1880
- OA_WORLD_STATE_DRIFT=on # REG-48 drift detector (default: on)
1881
- OA_DRIFT_ALIASES='{"~/":"src/"}' # extra path aliases (JSON)
1882
+ OMNIUS_WORLD_STATE_DRIFT=on # REG-48 drift detector (default: on)
1883
+ OMNIUS_DRIFT_ALIASES='{"~/":"src/"}' # extra path aliases (JSON)
1882
1884
 
1883
- OA_RUN_RETENTION_H=24 # run-record GC (default: 24h, 0 disables)
1884
- OA_TOOL_OVERRIDES='{"shell":{"off_device_allowed":true}}' # per-tool security overrides
1885
+ OMNIUS_RUN_RETENTION_H=24 # run-record GC (default: 24h, 0 disables)
1886
+ OMNIUS_TOOL_OVERRIDES='{"shell":{"off_device_allowed":true}}' # per-tool security overrides
1885
1887
  ```
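For tooling that needs to read the same settings, for example a wrapper script that logs the effective configuration, the Python sketch below parses a subset of these variables with the defaults listed above. The `DefenseConfig` class and `load_config` function are hypothetical. Treating any value other than `on` as off, and defaulting `OMNIUS_DRIFT_ALIASES` to an empty mapping, are assumptions.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Dict, Mapping


@dataclass
class DefenseConfig:
    backward_pass: bool = False                # OMNIUS_BACKWARD_PASS (default: off)
    backward_pass_max_cycles: int = 2          # OMNIUS_BACKWARD_PASS_MAX_CYCLES
    world_state_interval: int = 8              # OMNIUS_WORLD_STATE_INTERVAL
    world_state_file_write_threshold: int = 5  # OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD
    world_state_max_files: int = 200           # OMNIUS_WORLD_STATE_MAX_FILES
    world_state_drift: bool = True             # OMNIUS_WORLD_STATE_DRIFT (default: on)
    drift_aliases: Dict[str, str] = field(default_factory=dict)  # assumed empty default
    run_retention_h: int = 24                  # OMNIUS_RUN_RETENTION_H (0 disables GC)


def load_config(env: Mapping[str, str] = os.environ) -> DefenseConfig:
    """Build a DefenseConfig from environment-style key/value pairs."""
    def flag(name: str, default: bool) -> bool:
        raw = env.get(name)
        return default if raw is None else raw.strip().lower() == "on"

    def number(name: str, default: int) -> int:
        raw = env.get(name)
        return default if raw is None else int(raw)

    return DefenseConfig(
        backward_pass=flag("OMNIUS_BACKWARD_PASS", False),
        backward_pass_max_cycles=number("OMNIUS_BACKWARD_PASS_MAX_CYCLES", 2),
        world_state_interval=number("OMNIUS_WORLD_STATE_INTERVAL", 8),
        world_state_file_write_threshold=number("OMNIUS_WORLD_STATE_FILE_WRITE_THRESHOLD", 5),
        world_state_max_files=number("OMNIUS_WORLD_STATE_MAX_FILES", 200),
        world_state_drift=flag("OMNIUS_WORLD_STATE_DRIFT", True),
        drift_aliases=json.loads(env.get("OMNIUS_DRIFT_ALIASES", "{}")),
        run_retention_h=number("OMNIUS_RUN_RETENTION_H", 24),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the parser easy to test.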
1886
1888
 
1887
1889
 
@@ -1997,7 +1999,7 @@ Omnius builds and maintains a **persistent, auto-updating knowledge graph** of t
1997
1999
  ### How It Works
1998
2000
 
1999
2001
  ```
2000
- Source files ──> Regex symbol extraction ──> SQLite graph DB (.oa/index/code-graph.db)
2002
+ Source files ──> Regex symbol extraction ──> SQLite graph DB (.omnius/index/code-graph.db)
2001
2003
  | |
2002
2004
  | fs.watch() + debounce ──> File hash check ──> Incremental re-index (per file)
2003
2005
  | |
@@ -2031,7 +2033,7 @@ For 1M+ LOC codebases, the Louvain community compression reduces 50K+ symbols in
2031
2033
 
2032
2034
  ### Storage
2033
2035
 
2034
- The graph persists in `.oa/index/code-graph.db` (SQLite with WAL mode) across sessions. Incremental updates mean editing a single file costs <50ms regardless of codebase size.
2036
+ The graph persists in `.omnius/index/code-graph.db` (SQLite with WAL mode) across sessions. Incremental updates mean editing a single file costs <50ms regardless of codebase size.
2035
2037
 
2036
2038
  ### Research Basis
2037
2039
 
@@ -2142,7 +2144,7 @@ On startup and `/model` switch, Omnius detects your RAM/VRAM and creates an opti
2142
2144
  | **COHERE Cognitive Stack** | |
2143
2145
  | `repl_exec` | Persistent Python REPL — variables/imports persist between calls, `llm_query()` and `parallel_llm_query()` available for recursive LLM invocation, `retrieve()` for handle access |
2144
2146
  | `memory_metabolize` | Governed memory lifecycle — classify (episodic/semantic/procedural/normative), score (novelty/utility/confidence/identity_relevance), consolidate lessons from trajectories |
2145
- | `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.oa/identity/` |
2147
+ | `identity_kernel` | Persistent identity state — hydrate, observe events, propose updates with justification, publish snapshot, reconcile contradictions. Persists in `.omnius/identity/` |
2146
2148
  | `reflect` | Immune-system reflection — diagnostic (find flaws), epistemic (identify missing evidence), constitutional (review self-updates). Returns pass/revise/block verdict |
2147
2149
  | `explore` | ARCHE strategy-space exploration — generate diverse strategies, archive successful variants with tags/confidence, compare competing approaches, retrieve past strategies |
2148
2150
  | **Hardware Access** | |
@@ -2267,7 +2269,7 @@ Instead of writing custom integrations, point Omnius at an MCP server and its to
2267
2269
  }
2268
2270
  ```
2269
2271
 
2270
- Save that as `.oa/mcp.json` (project) or `~/.omnius/mcp.json` (global). On startup, every server is spawned, the handshake runs, and every tool it advertises is exposed under the namespace `mcp__<server>__<tool>` — selectable by the agent like any built-in.
2272
+ Save that as `.omnius/mcp.json` (project) or `~/.omnius/mcp.json` (global). On startup, every server is spawned, the handshake runs, and every tool it advertises is exposed under the namespace `mcp__<server>__<tool>` — selectable by the agent like any built-in.
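The `mcp__<server>__<tool>` naming scheme is easy to build and take apart mechanically. The Python sketch below shows one way to do so. Both helper names are hypothetical, and the parser assumes that server names never contain a double underscore, which this section does not guarantee.

```python
from typing import Tuple

PREFIX = "mcp"
SEP = "__"


def qualify(server: str, tool: str) -> str:
    """Return the namespaced tool name for a server/tool pair."""
    return f"{PREFIX}{SEP}{server}{SEP}{tool}"


def split_qualified(name: str) -> Tuple[str, str]:
    """Split 'mcp__<server>__<tool>' into (server, tool)."""
    prefix, sep, rest = name.partition(SEP)
    if prefix != PREFIX or not sep:
        raise ValueError(f"not an MCP tool name: {name!r}")
    # Assumption: the server name has no '__', so the first separator ends it.
    server, sep, tool = rest.partition(SEP)
    if not sep or not server or not tool:
        raise ValueError(f"not an MCP tool name: {name!r}")
    return server, tool
```

Splitting on the first separator only means a tool name may itself contain `__` without confusing the parser.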
2271
2273
 
2272
2274
  ### Spec compliance — what we implement
2273
2275
 
@@ -2287,9 +2289,9 @@ The transport layer lives in `packages/execution/src/mcp/transport.ts`; the clie
2287
2289
 
2288
2290
  ### Three ways to add a server
2289
2291
 
2290
- **1. Edit `.oa/mcp.json` directly** — drop in the JSON shape above. On next launch the server is spawned and connected automatically.
2292
+ **1. Edit `.omnius/mcp.json` directly** — drop in the JSON shape above. On next launch the server is spawned and connected automatically.
2291
2293
 
2292
- **2. Drag-and-drop a markdown file** — drop any README that contains an MCP config block (Claude Desktop format, bare server JSON, or `npx -y @scope/server-foo` install instructions in a code block) onto the OA terminal. The MD parser detects the configuration with confidence scoring, persists it to `.oa/mcp.json`, and connects immediately. No restart needed. Implementation: `packages/execution/src/mcp/md-intake.ts`.
2294
+ **2. Drag-and-drop a markdown file** — drop any README that contains an MCP config block (Claude Desktop format, bare server JSON, or `npx -y @scope/server-foo` install instructions in a code block) onto the Omnius terminal. The MD parser detects the configuration with confidence scoring, persists it to `.omnius/mcp.json`, and connects immediately. No restart needed. Implementation: `packages/execution/src/mcp/md-intake.ts`.
2293
2295
 
2294
2296
  **3. Use the `/mcp` slash command** — interactive TUI registry browser:
2295
2297
 
@@ -2297,7 +2299,7 @@ The transport layer lives in `packages/execution/src/mcp/transport.ts`; the clie
2297
2299
  /mcp # Open the MCP registry menu
2298
2300
  /mcp status # Quick connection table
2299
2301
  /mcp ls # Same as status
2300
- /mcp reload # Reconnect every server from .oa/mcp.json
2302
+ /mcp reload # Reconnect every server from .omnius/mcp.json
2301
2303
  ```
2302
2304
 
2303
2305
  The main menu lists every configured server with status (●), transport type, tool count, and any error. Selecting a server opens a detail view showing every advertised tool with its description, plus actions to **Edit**, **Reconnect**, **Delete**, or go **Back**. Edit accepts a one-line JSON config; Save returns to the main list with the updated server reconnected.
@@ -2327,7 +2329,7 @@ We test the streaming features end-to-end against the [official everything refer
2327
2329
 
2328
2330
  ### Programmatic API
2329
2331
 
2330
- If you want to drive an MCP server directly from code (instead of through an agent), the OA package re-exports the client:
2332
+ If you want to drive an MCP server directly from code (instead of through an agent), the Omnius package re-exports the client:
2331
2333
 
2332
2334
  ```typescript
2333
2335
  import { McpClient } from "omnius";
@@ -2401,7 +2403,7 @@ The loop tracks iteration history, generates completion reports saved to `.aiwg/
2401
2403
  | `/pause` | **Gentle halt** — lets the current inference turn finish, then stops before the next turn. No new tool calls or inference will begin until `/resume`. |
2402
2404
  | `/stop` | **Immediate kill** — aborts the current inference mid-stream, saves task state for later resumption. |
2403
2405
  | `/resume` | **Continue** — resumes a paused or stopped task from where it left off. Also resumes tasks saved by `/stop` or interrupted by `/update`. |
2404
- | `/destroy` | **Nuclear option** — aborts any active task, deletes the `.oa/` directory, clears the console, and exits to shell. |
2406
+ | `/destroy` | **Nuclear option** — aborts any active task, deletes the `.omnius/` directory, clears the console, and exits to shell. |
2405
2407
 
2406
2408
  ### Session Context Persistence
2407
2409
 
@@ -2413,13 +2415,13 @@ Context is automatically saved on every task completion and preserved across `/u
2413
2415
  /context show # Show saved context status (entries, last saved)
2414
2416
  ```
2415
2417
 
2416
- The system maintains a rolling window of the last 20 session entries in `.oa/context/session-context.json`. When you run `/context restore`, the last 10 entries are formatted into a restore prompt and injected into your next task, giving the agent continuity across sessions.
2418
+ The system maintains a rolling window of the last 20 session entries in `.omnius/context/session-context.json`. When you run `/context restore`, the last 10 entries are formatted into a restore prompt and injected into your next task, giving the agent continuity across sessions.
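The rolling-window behaviour can be modelled in a few lines of Python. The window sizes (20 kept, 10 restored) come from the paragraph above. The entry fields `task` and `summary`, the prompt wording, and the function names are assumptions made for the sketch, since the on-disk schema of `session-context.json` is not shown here.

```python
from collections import deque
from typing import Dict, List

MAX_ENTRIES = 20      # entries kept in the rolling window
RESTORE_ENTRIES = 10  # entries replayed by /context restore


def append_entry(entries: List[Dict[str, str]], entry: Dict[str, str]) -> List[Dict[str, str]]:
    """Append an entry and drop the oldest ones beyond MAX_ENTRIES."""
    window = deque(entries, maxlen=MAX_ENTRIES)
    window.append(entry)
    return list(window)


def build_restore_prompt(entries: List[Dict[str, str]]) -> str:
    """Format the most recent RESTORE_ENTRIES entries as a restore prompt."""
    recent = entries[-RESTORE_ENTRIES:]
    lines = [f"- {e['task']}: {e['summary']}" for e in recent]  # hypothetical fields
    return "Previous session context:\n" + "\n".join(lines)
```

Keeping more entries on disk than are restored leaves room to widen the restore window later without losing history.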
2417
2419
 
2418
2420
  During `/update`, context is automatically saved before the process restarts and restored when the new version resumes your task.
2419
2421
 
2420
2422
  ### Auto-Restore on Startup
2421
2423
 
2422
- When you launch `oa` in a workspace that has saved session context from a previous run, you'll be prompted to restore it:
2424
+ When you launch `omnius` in a workspace that has saved session context from a previous run, you'll be prompted to restore it:
2423
2425
 
2424
2426
  ```
2425
2427
  ℹ Previous session found (5 entries, last active 2h ago)
@@ -2462,7 +2464,7 @@ Daemon: COHERE enabled — listening on nexus.cohere.query
2462
2464
  Capacity announcement: 3 models, warm=qwen3.5:122b
2463
2465
 
2464
2466
  Peer: "Explain TCP vs UDP" → NATS broadcast
2465
- Your OA: claim → route to qwen3:4b (trivial) → respond in 1.2s
2467
+ Your Omnius: claim → route to qwen3:4b (trivial) → respond in 1.2s
2466
2468
  ```
2467
2469
 
2468
2470
  **How it works:**
@@ -2473,7 +2475,7 @@ Your OA: claim → route to qwen3:4b (trivial) → respond in 1.2s
2473
2475
  - **Model allowlist** — `/cohere allow qwen3:4b` controls which models are exposed
2474
2476
  - **Ollama safety** — remote queries can ONLY run inference on existing models; `/api/pull`, `/api/delete`, `/api/create` are never called
2475
2477
  - **Identity pinning** — snapshots published to IPFS (Helia) with SHA-256 content addressing; survives daemon restarts
2476
- - **Background daemon** persists across OA restarts (`detached: true` + PID file reconnection)
2478
+ - **Background daemon** persists across Omnius restarts (`detached: true` + PID file reconnection)
2477
2479
 
2478
2480
  ```bash
2479
2481
  /cohere stats # Network transparency — queries in/out, model usage, peer activity
@@ -2523,7 +2525,7 @@ The identity kernel maintains a persistent self-model across sessions, the refle
2523
2525
 
2524
2526
  Omnius includes a behavioral immune system that prevents the agent from making pattern-matched mistakes under pressure. Inspired by biological immune systems: constraints are the antibodies, pressure detection is the inflammatory response, and memory injection is the recall mechanism.
2525
2527
 
2526
- ### Constraint Enforcement (`.oa/constraints.json`)
2528
+ ### Constraint Enforcement (`.omnius/constraints.json`)
2527
2529
 
2528
2530
  Machine-readable rules checked **before every tool execution**:
2529
2531
 
@@ -2548,7 +2550,7 @@ Machine-readable rules checked **before every tool execution**:
2548
2550
  | `warn` | Executes tool but emits warning in agent's next turn context |
2549
2551
  | `log` | Silent recording to audit log, no interruption |
2550
2552
 
2551
- Constraints are scoped: global (`~/.omnius/constraints.json`), project (`.oa/constraints.json`), or session (ephemeral).
2553
+ Constraints are scoped: global (`~/.omnius/constraints.json`), project (`.omnius/constraints.json`), or session (ephemeral).
2552
2554
 
2553
2555
  ### Pressure-Aware Decision Gate
2554
2556
 
@@ -2640,7 +2642,7 @@ Use deep context for:
2640
2642
  - Long debugging sessions where error context from earlier is critical
2641
2643
  - Tasks where the agent needs to reason about patterns across many files
2642
2644
 
2643
- The setting persists to `.oa/settings.json`. Deep context is particularly valuable for models with 64K+ context windows (Qwen3.5-122B, Llama 3.1 70B, etc.) where the default thresholds were leaving significant capacity unused.
2645
+ The setting persists to `.omnius/settings.json`. Deep context is particularly valuable for models with 64K+ context windows (Qwen3.5-122B, Llama 3.1 70B, etc.) where the default thresholds were leaving significant capacity unused.
2644
2646
 
2645
2647
  ### Status Bar Context Tracking (`Ctx:` + `SNR:`)
2646
2648
 
@@ -2750,7 +2752,7 @@ The profile is compiled into a system prompt suffix (max 80 tokens) injected at
2750
2752
 
2751
2753
  ### Persistence
2752
2754
 
2753
- The style is saved to `.oa/settings.json` (with `--local`) or `~/.omnius/config.json` (global) and persists across sessions. Change it anytime with `/style <preset>` — takes effect on the next task.
2755
+ The style is saved to `.omnius/settings.json` (with `--local`) or `~/.omnius/config.json` (global) and persists across sessions. Change it anytime with `/style <preset>` — takes effect on the next task.
2754
2756
 
2755
2757
  ### Research Provenance
2756
2758
 
@@ -2876,7 +2878,7 @@ Output: 48kHz WAV, compatible with Telegram voice messages and WebSocket streami
2876
2878
 
2877
2879
  ### Supertonic Expressive Tags
2878
2880
 
2879
- When Supertonic is the active voice backend, OA decorates spoken status updates with the expression tags Supertonic supports. The tag pass runs after markdown/ANSI cleanup and only for Supertonic, so GLaDOS, Overwatch, Kokoro, and LuxTTS continue receiving plain sanitized text.
2881
+ When Supertonic is the active voice backend, Omnius decorates spoken status updates with the expression tags Supertonic supports. The tag pass runs after markdown/ANSI cleanup and only for Supertonic, so GLaDOS, Overwatch, Kokoro, and LuxTTS continue receiving plain sanitized text.
2880
2882
 
2881
2883
  Tag placement is context-aware:
2882
2884
 
@@ -3084,7 +3086,7 @@ When combined with `/voice`, you get full bidirectional audio — speak your tas
3084
3086
 
3085
3087
  The `transcribe-cli` dependency auto-installs in the background on first use. On ARM or when transcribe-cli fails, the system automatically falls back to `openai-whisper` via a self-managed Python venv (same approach used by Moondream vision).
3086
3088
 
3087
- **File transcription**: Drag-and-drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.oa/transcripts/`.
3089
+ **File transcription**: Drag-and-drop audio/video files (`.mp3`, `.wav`, `.mp4`, `.mkv`, etc.) onto the terminal to transcribe them. Results are saved to `.omnius/transcripts/`.
3088
3090
 
3089
3091
 
3090
3092
 
@@ -3233,7 +3235,7 @@ Agent: agenda()
3233
3235
 
3234
3236
  | Decision | Research Basis | Key Finding |
3235
3237
  |----------|---------------|-------------|
3236
- | Separate directive store (`.oa/scheduled/`, not `.oa/memory/`) | SSGM ([arXiv:2603.11768](https://arxiv.org/abs/2603.11768), 2026) | Directives in summarizable memory corrupt via compaction — semantic drift degrades scheduling data |
3238
+ | Separate directive store (`.omnius/scheduled/`, not `.omnius/memory/`) | SSGM ([arXiv:2603.11768](https://arxiv.org/abs/2603.11768), 2026) | Directives in summarizable memory corrupt via compaction — semantic drift degrades scheduling data |
3237
3239
  | File-based persistence survives process death | MemGPT/Letta (Packer et al. 2023, [arXiv:2310.08560](https://arxiv.org/abs/2310.08560)) | Agents are ephemeral; state must be external to the process |
3238
3240
  | Priority-based startup surfacing | A-MAC ([arXiv:2603.04549](https://arxiv.org/abs/2603.04549), 2026) | 5-factor attention scoring; content type prior is most influential factor (31% latency reduction) |
3239
3241
  | Cross-session self-reflection | Reflexion (Shinn et al. 2023, [arXiv:2303.11366](https://arxiv.org/abs/2303.11366)) | Persistent self-reflection stored as text improves task success 20-30% |
@@ -3287,7 +3289,7 @@ Supports `apt` (Debian/Ubuntu), `dnf` (Fedora), `pacman` (Arch), and `brew` (mac
3287
3289
  Launch without arguments to enter the interactive REPL:
3288
3290
 
3289
3291
  ```bash
3290
- oa
3292
+ omnius
3291
3293
  ```
3292
3294
 
3293
3295
  The TUI features an animated multilingual phrase carousel, live metrics bar with pastel-colored labels (token in/out, context window usage, human expert speed ratio, cost), rotating tips, syntax-highlighted tool output, and dynamic terminal-width cropping.
@@ -3306,9 +3308,9 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
3306
3308
  | `/pause` | Pause after current turn finishes (gentle halt) |
3307
3309
  | `/stop` | Kill current inference immediately, save state |
3308
3310
  | `/resume` | Resume a paused or stopped task |
3309
- | `/destroy` | Remove `.oa/` folder, kill all tasks, clear console, exit |
3311
+ | `/destroy` | Remove `.omnius/` folder, kill all tasks, clear console, exit |
3310
3312
  | **Context & Memory** | |
3311
- | `/context save` | Force-save session context to `.oa/context/` |
3313
+ | `/context save` | Force-save session context to `.omnius/context/` |
3312
3314
  | `/context restore` | Restore context from previous sessions into next task |
3313
3315
  | `/context show` | Show saved session context status |
3314
3316
  | `/compact` | Force context compaction now (default strategy) |
@@ -3381,7 +3383,7 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
3381
3383
  | `/help` | Show all available commands |
3382
3384
  | `/quit` | Exit |
3383
3385
 
3384
- All settings commands accept `--local` to save to project `.oa/settings.json` instead of global config.
3386
+ All settings commands accept `--local` to save to project `.omnius/settings.json` instead of global config.
3385
3387
 
3386
3388
  ### Platform Connectors
3387
3389
 
@@ -3441,7 +3443,7 @@ The steering sub-agent uses the same model and backend as the main agent with `m
3441
3443
  Connect the agent to a Telegram bot. Telegram can run in auto, chat, or action mode: conversational messages get rapid streamed replies in chat mode, while codebase/file/run requests use dedicated action sub-agents that are visible in the terminal waterfall alongside other agent activity.
3442
3444
 
3443
3445
  ```bash
3444
- /telegram --key <token> # Save bot token (persisted to .oa/settings.json)
3446
+ /telegram --key <token> # Save bot token (persisted to .omnius/settings.json)
3445
3447
  /telegram --admin <userid> # Set admin user — gets full memory + tools
3446
3448
  /telegram # Toggle bridge on/off (uses saved key)
3447
3449
  /telegram status # Show connection status + active sub-agents
@@ -3488,7 +3490,7 @@ On success, that Telegram user ID is saved as the admin user and future private-
3488
3490
  The Telegram bridge handles modern Bot API traffic directly:
3489
3491
 
3490
3492
  - **Guest Mode** — inbound `guest_message` updates are normalized into regular agent work and answered through `answerGuestQuery`, so users can interact from profile-surface guest chats before a normal bot DM exists.
3491
- - **Command menu registration** — when the bridge starts, OA registers the local slash-command surface with Telegram via `setMyCommands`; Telegram-safe names such as `/full_send_bless` are mapped back to canonical TUI commands like `/full-send-bless` before execution.
3493
+ - **Command menu registration** — when the bridge starts, Omnius registers the local slash-command surface with Telegram via `setMyCommands`; Telegram-safe names such as `/full_send_bless` are mapped back to canonical TUI commands like `/full-send-bless` before execution.
3492
3494
  - **Bot-to-bot sends** — `/telegram bot <username> <text>` targets another bot by username using Telegram's supported bot-to-bot message subset.
3493
3495
  - **Managed bot access** — `/telegram access get|set` reads and configures managed-bot access restrictions by managed bot user ID.
3494
3496
  - **Polls and live photos** — incoming polls, poll media summaries, option media, country/member limits, and live photos are captured as first-class Telegram message context; `/telegram poll` and `/telegram live-photo` send the matching Bot API payloads.
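
The command-name mapping described in the list above can be sketched as follows. Telegram bot commands may contain only lowercase letters, digits, and underscores (up to 32 characters), so hyphenated TUI commands need a reversible alias. The helper names here are hypothetical and this is not the bridge's actual code; it only shows one way to build the reverse lookup.

```python
import re
from typing import Dict, Iterable, Optional

# Telegram's documented limits for a bot command name.
TELEGRAM_COMMAND = re.compile(r"^[a-z0-9_]{1,32}$")


def to_telegram_name(canonical: str) -> str:
    """Convert a TUI command such as '/full-send-bless' to a Telegram-safe name."""
    name = canonical.lstrip("/").lower().replace("-", "_")
    if not TELEGRAM_COMMAND.match(name):
        raise ValueError(f"cannot express {canonical!r} as a Telegram command")
    return name


def build_reverse_map(canonical_commands: Iterable[str]) -> Dict[str, str]:
    """Map each Telegram-safe name back to its canonical TUI command."""
    reverse: Dict[str, str] = {}
    for cmd in canonical_commands:
        safe = to_telegram_name(cmd)
        if safe in reverse and reverse[safe] != cmd:
            raise ValueError(f"{cmd!r} and {reverse[safe]!r} collide as {safe!r}")
        reverse[safe] = cmd
    return reverse


def resolve(incoming: str, reverse: Dict[str, str]) -> Optional[str]:
    """Look up the canonical command for an incoming Telegram command."""
    # In group chats Telegram may append '@botname'; drop it before the lookup.
    name = incoming.lstrip("/").split("@", 1)[0]
    return reverse.get(name)
```

Detecting collisions up front matters because `/a-b` and `/a_b` would otherwise map to the same Telegram name.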
@@ -3592,7 +3594,7 @@ The bridge distinguishes between **private DMs** and **group/supergroup chats**,
3592
3594
 
3593
3595
  Photos, audio, voice messages, video, video notes, and documents sent via Telegram are automatically downloaded and processed:
3594
3596
 
3595
- 1. **Download** — files are fetched via the Telegram `getFile` API and cached to `.oa/media-cache/`
3597
+ 1. **Download** — files are fetched via the Telegram `getFile` API and cached to `.omnius/media-cache/`
3596
3598
  2. **Processing** — routed to the appropriate pipeline:
3597
3599
  - Images → `vision` / `image_read` / `ocr` tools
3598
3600
  - Audio/voice → `transcribe_file` tool
@@ -3621,7 +3623,7 @@ The bridge automatically handles Telegram's rate limits (HTTP 429) with exponent
3621
3623
 
3622
3624
  <div align="right"><a href="#top">back to top</a></div>
3623
3625
 
3624
- Agents can earn and spend USDC on Base mainnet through the native x402 protocol built into [omnius-nexus@1.5.6](https://www.npmjs.com/package/omnius-nexus).
3626
+ Agents can earn and spend USDC on Base mainnet through the native x402 protocol built into [open-agents-nexus@1.5.6](https://www.npmjs.com/package/open-agents-nexus).
3625
3627
 
3626
3628
  ### Wallet & Identity
3627
3629
  ```
@@ -3642,7 +3644,7 @@ When margin > 0, capabilities are registered with USDC pricing metadata. The dae
3642
3644
  ```
3643
3645
  nexus(action='spend', target_address='0x...', amount_usdc='0.10')
3644
3646
  ```
3645
- Signs an EIP-3009 `TransferWithAuthorization`. Budget-checked before signing. The recipient (or any facilitator) submits on-chain — no gas needed from the payer. Proof saved to `.oa/nexus/pending-transfer.json`.
3647
+ Signs an EIP-3009 `TransferWithAuthorization`. Budget-checked before signing. The recipient (or any facilitator) submits on-chain — no gas needed from the payer. Proof saved to `.omnius/nexus/pending-transfer.json`.
3646
3648
 
3647
3649
  ### Remote Inference — Tap Into the Mesh
3648
3650
  ```
@@ -3708,7 +3710,7 @@ Step 5 → Review and Go Live
3708
3710
  - **libp2p P2P mesh** provides decentralized relay — no DNS, no port forwarding, NAT-traversing
3709
3711
  - Cloudflared tunnel available as HTTPS fallback for non-P2P consumers
3710
3712
  - Your raw API endpoint URL is **never exposed** — consumers connect via peerId or tunnel
3711
- - Config persists to `.oa/sponsor/config.json` — survives restarts
3713
+ - Config persists to `.omnius/sponsor/config.json` — survives restarts
3712
3714
 
3713
3715
  **Management:**
3714
3716
  ```bash
@@ -3734,11 +3736,11 @@ When using sponsored inference, the sponsor's banner animation and message appea
3734
3736
 
3735
3737
  ```
3736
3738
  Primary path (libp2p):
3737
- Consumer OA ──→ libp2p mesh ──→ Sponsor Daemon ──→ Ollama/vLLM
3739
+ Consumer Omnius ──→ libp2p mesh ──→ Sponsor Daemon ──→ Ollama/vLLM
3738
3740
  (P2P, NAT-traversing) (auth + rate limit) (local)
3739
3741
 
3740
3742
  Fallback path (tunnel):
3741
- Consumer OA ──→ Cloudflared Tunnel ──→ Sponsor Proxy ──→ Ollama/vLLM
3743
+ Consumer Omnius ──→ Cloudflared Tunnel ──→ Sponsor Proxy ──→ Ollama/vLLM
3742
3744
  (HTTPS) (auth + rate limit) (local)
3743
3745
 
3744
3746
  Both paths enforce:
@@ -3782,7 +3784,7 @@ The `--full` flag is required to grant remote peers model management access. Spo
3782
3784
 
3783
3785
  <div align="right"><a href="#top">back to top</a></div>
3784
3786
 
3785
- COHERE (Collaborative Orchestration of Heuristic Emergent Reasoning Engines) is a distributed collective intelligence system where multiple OA nodes form a mesh that learns, evolves, and improves collectively. Queries from the [openagents.nexus](https://openagents.nexus) frontend or CLI are broadcast via NATS, processed by elected nodes through the full AgenticRunner (tools, context engineering, system prompts), and responses are peer-reviewed before delivery.
3787
+ COHERE (Collaborative Orchestration of Heuristic Emergent Reasoning Engines) is a distributed collective intelligence system where multiple Omnius nodes form a mesh that learns, evolves, and improves collectively. Queries from the [omnius.nexus](https://omnius.nexus) frontend or CLI are broadcast via NATS, processed by elected nodes through the full AgenticRunner (tools, context engineering, system prompts), and responses are peer-reviewed before delivery.
3786
3788
 
3787
3789
  ### How COHERE Works
3788
3790
 
@@ -3855,7 +3857,7 @@ Omnius includes infrastructure for the agent to learn from its own execution, im
3855
3857
 
3856
3858
  ### Trajectory Logging
3857
3859
 
3858
- Every completed task is logged to `.oa/trajectories/trajectories.jsonl` with full metadata: task description, outcome (pass/fail), tool calls made, files modified, failed approaches, and timing. This data feeds the rejection fine-tuning pipeline. Research: [Golubev et al.](https://arxiv.org/abs/2508.03501) showed RFT on passing trajectories alone improved Qwen-72B from 11% to 25% on SWE-bench.
3860
+ Every completed task is logged to `.omnius/trajectories/trajectories.jsonl` with full metadata: task description, outcome (pass/fail), tool calls made, files modified, failed approaches, and timing. This data feeds the rejection fine-tuning pipeline. Research: [Golubev et al.](https://arxiv.org/abs/2508.03501) showed RFT on passing trajectories alone improved Qwen-72B from 11% to 25% on SWE-bench.
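
A minimal sketch of this kind of JSONL log, assuming one JSON object per line and an `outcome` field whose value is `"pass"` or `"fail"`. The field names and helper functions are illustrative assumptions; the README lists the metadata categories but not the exact schema.

```python
import json
from pathlib import Path


def append_trajectory(log_path: Path, record: dict) -> None:
    """Append one task record as a single JSON line."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def passing_trajectories(log_path: Path) -> list:
    """Return only the records whose (assumed) 'outcome' field is 'pass'."""
    if not log_path.is_file():
        return []
    kept = []
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if record.get("outcome") == "pass":
                kept.append(record)
    return kept
```

Filtering to passing records is the selection step that rejection fine-tuning relies on: failed runs stay in the log for analysis but are excluded from the training set.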
3859
3861
 
3860
3862
  ### Rejection Fine-Tuning Pipeline
3861
3863
 
@@ -3969,14 +3971,14 @@ Omnius binds entities across image, audio, and text using joint‑embedding mode
3969
3971
  - Voiceprint linkage: speaker embeddings (x‑vector/ECAPA) are associated with entities when co‑occurring in time with a visual track and a transcribed utterance; robust to background noise via median pooling across windows.
3970
3972
  - Text label fusion: natural‑language labels (names, roles, tags) are bound to the same entity when co‑referents appear in proximate context windows (heuristics + clustering).
3971
3973
  - Association graph: cross‑modal edges (image↔voice↔text) consolidate into a unified entity node with provenance (model, score, timestamp) and decay‑based confidence.
3972
- - Privacy & safety: raw media never leaves the machine; embeddings are stored locally under `.oa/memory/`. Redaction controls can drop embeddings by label or recency.
3974
+ - Privacy & safety: raw media never leaves the machine; embeddings are stored locally under `.omnius/memory/`. Redaction controls can drop embeddings by label or recency.
3973
3975
 
3974
3976
  This enables queries like: “Find where Alex spoke about deployment,” “Show files edited after the person in the red sweater approved the PR,” or “Summarize conversations where Speaker‑B and Alice appear together.”
3975
3977
 
3976
3978
  The associative memory integrates with a near-critical cognitive framework inspired by [Beggs & Plenz (2003)](https://doi.org/10.1523/JNEUROSCI.23-35-11167.2003) neuronal avalanche dynamics:
3977
3979
 
3978
- - **Auto-consolidation**: At task boundaries, the system writes consolidation snapshots to `.oa/consolidations/` with lessons learned and key patterns
3979
- - **Provenance KG**: Every agent action is tracked in `.oa/provenance/` for full action traceability
3980
+ - **Auto-consolidation**: At task boundaries, the system writes consolidation snapshots to `.omnius/consolidations/` with lessons learned and key patterns
3981
+ - **Provenance KG**: Every agent action is tracked in `.omnius/provenance/` for full action traceability
3980
3982
  - **Homeostasis modulation**: Error rate drives exploration guidance — high error rates inject more careful approaches, low error rates encourage bolder exploration
3981
3983
  - **Error pattern learning**: Recurring error patterns are detected, stored globally in `~/.omnius/error-patterns.json`, and injected as `[LEARNED FROM EXPERIENCE]` guidance before similar actions in future sessions
3982
3984
 
@@ -3997,18 +3999,18 @@ When you're not actively tasking the agent, Dream Mode lets it creatively explor
3997
3999
  Each cycle expands through all four stages then contracts (evaluation, pruning of weak ideas). Three modes control how far the agent can go:
3998
4000
 
3999
4001
  ```bash
4000
- /dream # Default — read-only exploration, proposals saved to .oa/dreams/
4002
+ /dream # Default — read-only exploration, proposals saved to .omnius/dreams/
4001
4003
  /dream deep # Multi-cycle deep exploration with expansion/contraction phases
4002
4004
  /dream lucid # Full implementation — saves workspace backup, then implements,
4003
4005
  # tests, evaluates, and self-plays each proposal with checkpoints
4004
4006
  /dream stop # Wake up — stop dreaming
4005
4007
  ```
4006
4008
 
4007
- **Default** and **Deep** modes are completely safe — the agent can only read your code and write proposals to `.oa/dreams/`. File writes, edits, and shell commands outside that directory are blocked by sandboxed dream tools.
4009
+ **Default** and **Deep** modes are completely safe — the agent can only read your code and write proposals to `.omnius/dreams/`. File writes, edits, and shell commands outside that directory are blocked by sandboxed dream tools.
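
One way such a write guard could work is a path-containment check: resolve the target and allow it only if it lies inside the dreams directory. The function below is a sketch under that assumption, with a hypothetical name; it is not the sandbox's actual implementation.

```python
from pathlib import Path


def is_write_allowed(target: str, project_root: Path) -> bool:
    """Allow a write only when the resolved target lies inside .omnius/dreams/."""
    sandbox = (project_root / ".omnius" / "dreams").resolve()
    # Resolving first collapses '..' segments and symlinks, so a path like
    # '.omnius/dreams/../settings.json' cannot escape the sandbox.
    candidate = (project_root / target).resolve()
    return sandbox in candidate.parents
```

Absolute targets are handled as well, because joining an absolute path onto `project_root` yields the absolute path unchanged.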
4008
4010
 
4009
4011
  **Lucid** mode unlocks full write access. Before making changes, it saves a workspace checkpoint so you can roll back. Each cycle goes: dream → implement → test → evaluate → checkpoint → next cycle.
4010
4012
 
4011
- All proposals are indexed in `.oa/dreams/PROPOSAL-INDEX.md` for easy review.
4013
+ All proposals are indexed in `.omnius/dreams/PROPOSAL-INDEX.md` for easy review.
4012
4014
 
4013
4015
  ### Autoresearch Swarm — 5-Agent GPU Experiment Loop
4014
4016
 
@@ -4021,7 +4023,7 @@ The swarm operates in four phases:
4021
4023
  | **Phase 0: Load** | Reads autoresearch memory (best config, experiment log, failed approaches, hypothesis queue, architectural insights) + detects GPU specs |
4022
4024
  | **Phase 1: Hypothesis** | Critic generates 5-8 hypotheses; Flow Maintainer plans experiment ordering and round budget |
4023
4025
  | **Phase 2: Experiment** | Sequential rounds (up to 3): Critic pre-screens → Researcher modifies train.py + runs → Monitor watches GPU → Evaluator keeps/discards → Flow Maintainer decides continue/stop |
4024
- | **Phase 3: Summary** | Flow Maintainer writes consolidated summary to memory + dream report to `.oa/dreams/` |
4026
+ | **Phase 3: Summary** | Flow Maintainer writes consolidated summary to memory + dream report to `.omnius/dreams/` |
4025
4027
 
4026
4028
  #### The 5 Agent Roles
4027
4029
 
@@ -4035,7 +4037,7 @@ The swarm operates in four phases:
4035
4037
 
4036
4038
  #### Bidirectional Memory
4037
4039
 
4038
- The swarm maintains persistent memory in `.oa/memory/autoresearch.json` with five keys:
4040
+ The swarm maintains persistent memory in `.omnius/memory/autoresearch.json` with five keys:
4039
4041
 
4040
4042
  - **best_config** — best val_bpb and what train.py changes produced it
4041
4043
  - **experiment_log** — chronological list of experiments with hypotheses, results, and verdicts
@@ -4132,7 +4134,7 @@ curl -X POST http://localhost:11435/v1/run \
4132
4134
 
4133
4135
  ### Multi-Agent Collective Testbed
4134
4136
 
4135
- Spawn multiple OA instances in Docker for collective intelligence experiments:
4137
+ Spawn multiple Omnius instances in Docker for collective intelligence experiments:
4136
4138
 
4137
4139
  ```bash
4138
4140
  cd testbed
@@ -4379,12 +4381,12 @@ omnius config set backendUrl http://localhost:11434
4379
4381
 
4380
4382
  ### Project Context
4381
4383
 
4382
- Create `AGENTS.md`, `OA.md`, or `.omnius.md` in your project root for agent instructions. Context files merge from parent to child directories.
4384
+ Create `AGENTS.md`, `Omnius.md`, or `.omnius.md` in your project root for agent instructions. Context files merge from parent to child directories.
4383
4385
 
4384
- ### `.oa/` Project Directory
4386
+ ### `.omnius/` Project Directory
4385
4387
 
4386
4388
  ```
4387
- .oa/
4389
+ .omnius/
4388
4390
  ├── config.json # Project config overrides
4389
4391
  ├── settings.json # TUI settings (model, endpoint, voice, stream, etc.)
4390
4392
  ├── memory/ # Persistent memory store (topics, patterns, facts)
@@ -4410,9 +4412,9 @@ Create `AGENTS.md`, `OA.md`, or `.omnius.md` in your project root for agent inst
4410
4412
  Any Ollama or OpenAI-compatible API model with tool calling works:
4411
4413
 
4412
4414
  ```bash
4413
- oa --model qwen2.5-coder:32b "fix the bug"
4414
- oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
4415
- oa --backend-url http://10.0.0.5:11434 "refactor auth"
4415
+ omnius --model qwen2.5-coder:32b "fix the bug"
4416
+ omnius --backend vllm --backend-url http://localhost:8000/v1 "add tests"
4417
+ omnius --backend-url http://10.0.0.5:11434 "refactor auth"
4416
4418
  ```
4417
4419
 
4418
4420
 
@@ -4506,8 +4508,8 @@ Forward any configured `/endpoint` (Chutes, Groq, OpenRouter, Together, vLLM, et
4506
4508
  - Your node registers inference capabilities on the P2P mesh using your upstream endpoint's models
4507
4509
  - Remote peers discover and invoke these capabilities via libp2p streams (DHT/mDNS/NATS)
4508
4510
  - Requests are forwarded to your upstream API, responses streamed back to the peer
4509
- - The libp2p daemon persists in the background — it survives OA restarts and remains discoverable even when the TUI is closed
4510
- - When you reopen OA, it reconnects to the existing daemon and resumes stats tracking
4511
+ - The libp2p daemon persists in the background — it survives Omnius restarts and remains discoverable even when the TUI is closed
4512
+ - When you reopen Omnius, it reconnects to the existing daemon and resumes stats tracking
4511
4513
 
4512
4514
  **Rate limit distribution (`--loadbalance`):**
4513
4515
  - Captures `x-ratelimit-remaining-tokens` and `x-ratelimit-limit-tokens` headers from upstream API responses
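
To make the header capture concrete, the sketch below turns the two headers into a remaining-capacity fraction that a load balancer could use as a weight. The function name and the choice of a clamped 0 to 1 fraction are assumptions for illustration; the README does not describe how the captured values are combined.

```python
from typing import Mapping, Optional


def remaining_token_fraction(headers: Mapping[str, str]) -> Optional[float]:
    """Return remaining/limit as a value in [0, 1], or None if unavailable."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        remaining = float(lowered["x-ratelimit-remaining-tokens"])
        limit = float(lowered["x-ratelimit-limit-tokens"])
    except (KeyError, ValueError):
        return None  # header missing or not numeric
    if limit <= 0:
        return None
    return max(0.0, min(1.0, remaining / limit))
```

Returning `None` instead of a default keeps "no data" distinguishable from "no capacity left".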
@@ -4776,7 +4778,7 @@ node eval/run-agentic.mjs --model qwen3.5:4b # Different model tier
4776
4778
 
4777
4779
  ### REST API Enterprise Evaluation (v0.185.68)
4778
4780
 
4779
- 35 test cases executed against the oa REST API (`oa serve` on port 11435) across **10 industries** and **3 model tiers**. Each case sends a domain-specific prompt via `/v1/chat/completions` and verifies correctness against expected patterns.
4781
+ 35 test cases executed against the omnius REST API (`omnius serve` on port 11435) across **10 industries** and **3 model tiers**. Each case sends a domain-specific prompt via `/v1/chat/completions` and verifies correctness against expected patterns.
4780
4782
 
4781
4783
  ```bash
4782
4784
  node eval/api-enterprise-eval.mjs # Run all 85 tests (35 cases × 3 models)
@@ -4833,7 +4835,7 @@ Omnius integrates with [AIWG](https://aiwg.io) ([npm](https://www.npmjs.com/pack
4833
4835
 
4834
4836
  ```bash
4835
4837
  npm i -g aiwg
4836
- oa "analyze this project's SDLC health and set up documentation"
4838
+ omnius "analyze this project's SDLC health and set up documentation"
4837
4839
  ```
4838
4840
 
4839
4841
  | Capability | Description |
@@ -4930,26 +4932,26 @@ Control it live from the TUI:
4930
4932
 
4931
4933
  ```
4932
4934
  /access # show current access + host
4933
- /access loopback|lan|any # set access policy (OA_ACCESS) and restart daemon
4934
- /host 127.0.0.1:11435 # bind to loopback only (OA_HOST) and restart daemon
4935
+ /access loopback|lan|any # set access policy (OMNIUS_ACCESS) and restart daemon
4936
+ /host 127.0.0.1:11435 # bind to loopback only (OMNIUS_HOST) and restart daemon
4935
4937
  /host 0.0.0.0:11435 # bind all interfaces and restart daemon
4936
4938
  /network config # interactive menu (arrow keys) to change both
4937
4939
 
4938
4940
  # Project-local persistence
4939
- /access any --local # save to ./.oa/settings.json
4941
+ /access any --local # save to ./.omnius/settings.json
4940
4942
  /host 127.0.0.1:11435 --local
4941
4943
  ```
4942
4944
 
4943
4945
  Environment variables (non-TUI usage):
4944
4946
 
4945
4947
  ```
4946
- OA_ACCESS=lan OA_HOST=0.0.0.0:11435 oa
4948
+ OMNIUS_ACCESS=lan OMNIUS_HOST=0.0.0.0:11435 omnius
4947
4949
  ```
4948
4950
 
4949
4951
  Persistence and startup behavior:
4950
4952
 
4951
- - The TUI saves your choices to `.oa/settings.json` (project) or `~/.omnius/settings.json` (global).
4952
- - On startup, the TUI loads saved `oaAccess`/`oaHost` and seeds `OA_ACCESS`/`OA_HOST` before ensuring the daemon, so the 11435 service picks them up immediately.
4953
+ - The TUI saves your choices to `.omnius/settings.json` (project) or `~/.omnius/settings.json` (global).
4954
+ - On startup, the TUI loads saved `omniusAccess`/`omniusHost` and seeds `OMNIUS_ACCESS`/`OMNIUS_HOST` before ensuring the daemon, so the 11435 service picks them up immediately.
4953
4955
  - Explicit environment variables always win over saved settings.
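
The precedence rules above can be sketched as a small resolver. The environment-over-saved ordering comes from the text; letting project settings override global ones is an assumption of this example, as are the function names and the flat JSON layout of `settings.json`.

```python
import json
from pathlib import Path
from typing import Mapping


def read_settings(path: Path) -> dict:
    """Load a settings file, treating a missing or invalid file as empty."""
    if not path.is_file():
        return {}
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return {}
    return data if isinstance(data, dict) else {}


def resolve_network(env: Mapping[str, str], project_dir: Path, home_dir: Path) -> dict:
    """Pick access/host values: environment first, then saved settings."""
    saved = {
        **read_settings(home_dir / ".omnius" / "settings.json"),
        # Assumption: project-local values override global ones.
        **read_settings(project_dir / ".omnius" / "settings.json"),
    }
    return {
        "access": env.get("OMNIUS_ACCESS") or saved.get("omniusAccess"),
        "host": env.get("OMNIUS_HOST") or saved.get("omniusHost"),
    }
```

Passing `os.environ` as `env` gives the real behavior, while a plain dict makes the resolver easy to test.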
4954
4956
 
4955
4957
  Security tips: