omnius 1.0.8 → 1.0.9
- package/README.md +43 -22
- package/npm-shrinkwrap.json +559 -2
- package/package.json +3 -2
package/README.md
CHANGED
|
@@ -1,19 +1,17 @@
|
|
|
1
1
|
<a name="top"></a>
|
|
2
2
|
```text
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
3
|
+
|
|
4
|
+
░░ ░░░ ░░░░ ░░ ░░░ ░░ ░░ ░░░░ ░░░ ░░
|
|
5
|
+
▒ ▒▒▒▒ ▒▒ ▒▒ ▒▒ ▒▒ ▒▒▒▒▒ ▒▒▒▒▒ ▒▒▒▒ ▒▒ ▒▒▒▒▒▒▒
|
|
6
|
+
▓ ▓▓▓▓ ▓▓ ▓▓ ▓ ▓ ▓▓▓▓▓ ▓▓▓▓▓ ▓▓▓▓ ▓▓▓ ▓▓
|
|
7
|
+
█ ████ ██ █ █ ██ ██ █████ █████ ████ ████████ █
|
|
8
|
+
██ ███ ████ ██ ███ ██ ███ ████ ██
|
|
9
|
+
|
|
12
10
|
```
|
|
13
11
|
|
|
14
12
|
<p align="center">
|
|
15
13
|
<strong>AI coding agent powered entirely by open-weight models.</strong><br>
|
|
16
|
-
No API keys. No cloud. Your code never leaves your machine
|
|
14
|
+
No API keys. No cloud. Your code never leaves your machine <i>(unless you want it to!)</i>
|
|
17
15
|
</p>
|
|
18
16
|
|
|
19
17
|
<p align="center">
|
|
@@ -280,8 +278,10 @@ The agent uses tools autonomously in a loop — reading errors, fixing code, and
|
|
|
280
278
|
|
|
281
279
|
- **61 autonomous tools** — file I/O, shell, grep, web search/fetch/crawl, memory (read/write/search), sub-agents, background tasks, image/OCR/PDF, git, diagnostics, vision, desktop automation, browser automation, temporal agency (scheduler/reminders/agenda), structured files, code sandbox, transcription, skills, opencode delegation, cron agents, nexus P2P networking + x402 micropayments, **COHERE cognitive stack** (persistent REPL, recursive LLM calls, memory metabolism, identity kernel, reflection, exploration)
|
|
282
280
|
- **Moondream vision** — see and interact with the desktop via Moondream VLM (caption, query, detect, point-and-click)
|
|
281
|
+
- **Image generation with TUI previews** — `/image <prompt>` and the `generate_image` tool create PNGs under `.omnius/images/`, support explicit `--model` selection, and render generated, pasted, screenshot, and camera-capture images as auto-sized ASCII previews via `image-to-ascii`
|
|
283
282
|
- **Desktop automation** — vision-guided clicking: describe a UI element in natural language, the agent finds and clicks it
|
|
284
283
|
- **Auto-install desktop deps** — screenshot, mouse, OCR, and image tools auto-install missing system packages (scrot, xdotool, tesseract, imagemagick) on first use
|
|
284
|
+
- **Hardware-rated model lists** — first-run setup, `/models`, `/score`, and `/image list` score model fit against detected RAM/VRAM/GPU so text and image model choices are visible before you switch or generate
|
|
285
285
|
- **Parallel tool execution** — read-only tools run concurrently via `Promise.allSettled`
|
|
286
286
|
- **Sub-agent delegation** — spawn independent agents for parallel workstreams
|
|
287
287
|
- **OpenCode delegation** — offload coding tasks to opencode (sst/opencode) as an autonomous sub-agent with auto-install, progress monitoring, and result evaluation
|
|
@@ -339,7 +339,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
|
|
|
339
339
|
- **Temporal agency** — schedule future tasks via OS cron, set cross-session reminders, flag attention items — startup injection surfaces due items automatically
|
|
340
340
|
- **Web crawling** — multi-page web scraping with Crawlee/Playwright for deep documentation extraction
|
|
341
341
|
- **Task templates** — specialized system prompts and tool recommendations for code, document, analysis, plan tasks
|
|
342
|
-
- **Inference capability scoring** — canirun.ai-style hardware assessment at first launch: memory/compute/speed scores, per-model compatibility matrix, recommended model selection
|
|
342
|
+
- **Inference capability scoring** — canirun.ai-style hardware assessment at first launch and on demand: memory/compute/speed scores, per-model compatibility matrix, `/models` runtime fit ratings, `/image list` image-model fit ratings, and recommended model selection
|
|
343
343
|
- **Auto-install everything** — first-run wizard auto-installs Ollama, curl, Python3, python3-venv with platform-aware package managers (apt, dnf, yum, pacman, apk, zypper, brew)
|
|
344
344
|
- **Sponsored inference** — `/sponsor` walks through a 5-step wizard to share your GPU with the world: select endpoints, choose banner animation (8 presets + AI-generated custom), set header message/links, configure transport (cloudflared/libp2p) + rate limits, and go live. Consumers discover sponsors via `/endpoint sponsor`. Secure proxy relay with per-IP rate limiting, daily token budgets, model allowlist, and concurrent request caps. Sponsor's raw API URL is never exposed. See [Sponsored Inference](#sponsored-inference--share-your-gpu-with-the-world) below
|
|
345
345
|
- **P2P inference network** — `/expose` local models or forward any `/endpoint` (Chutes, Groq, OpenRouter, etc.) through the libp2p P2P mesh. Passthrough mode (`/expose passthrough`) relays upstream API requests; `--loadbalance` distributes rate-limited token budgets across peers. `/expose config` provides an arrow-key menu for all settings. Gateway stats show budget remaining from `x-ratelimit-*` headers. Background daemon persists across Omnius restarts
|
|
@@ -357,7 +357,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
|
|
|
357
357
|
- **IPFS sharing surface** — `/ipfs` status page with peer info + identity kernel metrics + memory sentiment. `/ipfs pin <CID>` to pin remote agent content. `/ipfs publish` to share identity kernel. `/ipfs share tool/skill` to publish agent-created tools with secret stripping. `/ipfs import <CID>` to retrieve shared content
|
|
358
358
|
- **Fortemi-React bridge** — `/fortemi start/status/stop` connects to [fortemi-react](https://github.com/robit-man/fortemi-react) (browser-first PGlite+pgvector knowledge system) via JWT auth. Proxy tools: `fortemi_capture`, `fortemi_search`, `fortemi_list`, `fortemi_get` auto-register when bridge is connected
|
|
359
359
|
- **Content ingestion** — `/ingest <file>` imports audio (transcribe via Whisper), PDF (pdftotext), or text files into structured memory with 800-char/100-overlap chunking (matches fortemi pattern)
|
|
360
|
-
- **Image generation** — `generate_image`
|
|
360
|
+
- **Image generation** — `generate_image` supports Ollama image models, Diffusers models, and stable-diffusion.cpp checkpoints/GGUF. SDXL Turbo is the practical default auto-install path under `.omnius/image-gen/.venv`; FLUX.1 dev and Stable Diffusion 3.5 Large are the primary high-realism baselines when hardware allows. `/image list` groups models by type, size, quality expectations, and hardware fit
|
|
361
361
|
- **Node visualization** — [omnius.nexus](https://github.com/robit-man/omnius.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents
|
|
362
362
|
- **TTS sanitizer** — strips markdown syntax (`##`, `**`, `` ` ``), emoji (prevents "white heavy checkmark"), box-drawing chars, and ANSI codes before feeding to ALL TTS engines
|
|
363
363
|
- **LuxTTS gapless playback** — look-ahead pre-synthesis pipeline: next chunk synthesizes while current plays, eliminating inter-sentence gaps. Jetson ARM support with NVIDIA's prebuilt PyTorch wheel
|
|
@@ -368,7 +368,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
|
|
|
368
368
|
- **Self-learning** — auto-fetches docs from the web when encountering unfamiliar APIs
|
|
369
369
|
- **Seamless `/update`** — in-place update and reload with automatic context save/restore
|
|
370
370
|
- **Blessed mode** — `/full-send-bless` infinite warm loop keeps model weights in VRAM, auto-cycles tasks, never exits until you say stop
|
|
371
|
-
- **Telegram bridge** — `/telegram --key <token> --admin <userid>` public ingress/egress with admin filter and mandatory safety filter; bare `/telegram` toggles the service watchdog
|
|
371
|
+
- **Telegram bridge** — `/telegram --key <token> --admin <userid>` public ingress/egress with admin filter, scoped memory, per-chat personality profiles, sandboxed public creative file/image/audio tools, generated-artifact send-back, and mandatory safety filter; bare `/telegram` toggles the service watchdog
|
|
372
372
|
- **Task control** — `/pause` (gentle halt at turn boundary), `/stop` (immediate kill), `/resume` to continue
|
|
373
373
|
- **Model-tier awareness** — dynamic tool sets, prompt complexity, and context limits scale with model size (small/medium/large)
|
|
374
374
|
|
|
@@ -3294,13 +3294,16 @@ omnius
|
|
|
3294
3294
|
|
|
3295
3295
|
The TUI features an animated multilingual phrase carousel, live metrics bar with pastel-colored labels (token in/out, context window usage, human expert speed ratio, cost), rotating tips, syntax-highlighted tool output, and dynamic terminal-width cropping.
|
|
3296
3296
|
|
|
3297
|
+
Image surfaces are first-class in the terminal. `/image` generations, generated-image tool results, pasted image context, screenshots, and camera captures are converted through `image-to-ascii` and sized to the current terminal before being printed into the main scrollback. Each generated image also includes the saved file path below the preview.
|
|
3298
|
+
|
|
3297
3299
|
### Slash Commands
|
|
3298
3300
|
|
|
3299
3301
|
| Command | Description |
|
|
3300
3302
|
|---------|-------------|
|
|
3301
3303
|
| **Model & Endpoint** | |
|
|
3302
3304
|
| `/model <name>` | Switch to a different model |
|
|
3303
|
-
| `/models` | List
|
|
3305
|
+
| `/models` | List available text models with detected hardware-fit ratings |
|
|
3306
|
+
| `/score` | Show the inference capability scorecard: memory, compute, speed, and model compatibility |
|
|
3304
3307
|
| `/endpoint <url>` | Connect to a remote vLLM or OpenAI-compatible API |
|
|
3305
3308
|
| `/endpoint <url> --auth <key>` | Set endpoint with Bearer auth |
|
|
3306
3309
|
| `/endpoint <peerId> --auth <key>` | Connect to a libp2p peer via nexus P2P network |
|
|
@@ -3318,6 +3321,11 @@ The TUI features an animated multilingual phrase carousel, live metrics bar with
|
|
|
3318
3321
|
| **Audio & Vision** | |
|
|
3319
3322
|
| `/voice [model]` | Toggle TTS voice (GLaDOS, Overwatch, Kokoro, LuxTTS, Supertonic) |
|
|
3320
3323
|
| `/listen [mode]` | Toggle live microphone transcription |
|
|
3324
|
+
| `/image` | Open the image-generation model/setup menu |
|
|
3325
|
+
| `/image <prompt>` | Generate an image and show an auto-sized ASCII preview in the TUI |
|
|
3326
|
+
| `/image --model <model> <prompt>` | Generate with an explicit image model |
|
|
3327
|
+
| `/image list` | List image models by category, size, quality expectation, and hardware fit |
|
|
3328
|
+
| `/image setup <ollama\|diffusers\|sdcpp>` | Show setup commands for an image-generation backend |
|
|
3321
3329
|
| `/dream [mode]` | Start dream mode (default, deep, lucid) |
|
|
3322
3330
|
| **Display & Behavior** | |
|
|
3323
3331
|
| `/stream` | Toggle streaming token display with pastel syntax highlighting |
|
|
@@ -3440,7 +3448,7 @@ The steering sub-agent uses the same model and backend as the main agent with `m
|
|
|
3440
3448
|
|
|
3441
3449
|
<div align="right"><a href="#top">back to top</a></div>
|
|
3442
3450
|
|
|
3443
|
-
Connect the agent to a Telegram bot. Telegram can run in auto, chat, or action mode: conversational messages get rapid streamed replies in chat mode, while codebase/file/run requests use dedicated action sub-agents that are visible in the terminal waterfall alongside other agent activity.
|
|
3451
|
+
Connect the agent to a Telegram bot. Telegram can run in auto, chat, or action mode: conversational messages get rapid streamed replies in chat mode, while codebase/file/run requests use dedicated action sub-agents that are visible in the terminal waterfall alongside other agent activity. Public group chats get scoped memory, live reply discretion, and sandboxed creative tools for generating files, audio, and images without exposing the local workspace.
|
|
3444
3452
|
|
|
3445
3453
|
```bash
|
|
3446
3454
|
/telegram --key <token> # Save bot token (persisted to .omnius/settings.json)
|
|
@@ -3469,7 +3477,7 @@ The bot token, admin ID, and interaction mode are persisted to settings, so you
|
|
|
3469
3477
|
|
|
3470
3478
|
Use `/telegram mode auto|chat|action` to control how inbound Telegram messages are routed:
|
|
3471
3479
|
|
|
3472
|
-
- **auto** —
|
|
3480
|
+
- **auto** — the live router decides whether the message is conversational, actionable, or not directed at the bot. Reply-worthy conversational turns use fast streamed chat replies; explicit codebase/file/command/run/test requests use action sub-agents.
|
|
3473
3481
|
- **chat** — every non-command message gets a direct quick-chat completion with no tool loop. This is best for rapid back-and-forth conversation.
|
|
3474
3482
|
- **action** — every non-command message runs through the Telegram sub-agent path with the configured tool policy.
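
The three modes above can be read as a small dispatch table. The following is a minimal TypeScript sketch, not the bridge's actual code: `TelegramMode`, `RouterVerdict`, `Route`, and `routeMessage` are hypothetical names, and the live-router verdict used in auto mode is assumed to be computed elsewhere.

```typescript
// Hypothetical types: the bridge's real names and shapes are not shown in the README.
type TelegramMode = "auto" | "chat" | "action";
type RouterVerdict = "conversational" | "actionable" | "not_directed";
type Route = "quick_chat" | "action_subagent" | "skip";

// Decide how a non-command message is handled for a given mode.
// Only auto mode consults the (assumed) live-router verdict.
function routeMessage(mode: TelegramMode, verdict: RouterVerdict): Route {
  if (mode === "chat") return "quick_chat";
  if (mode === "action") return "action_subagent";
  switch (verdict) {
    case "conversational":
      return "quick_chat";
    case "actionable":
      return "action_subagent";
    case "not_directed":
      return "skip"; // kept as context, no reply sent
  }
}
```

In this sketch, `chat` and `action` ignore the verdict entirely, which mirrors the "every non-command message" wording above.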
|
|
3475
3483
|
|
|
@@ -3528,13 +3536,13 @@ If a user sends another message while their sub-agent is still running, it's inj
|
|
|
3528
3536
|
|-------|----------|-------|--------|
|
|
3529
3537
|
| **Admin DM** (`--admin`, private chat) | 30 | All tools except shell (overridable) | Full read + write |
|
|
3530
3538
|
| **Admin Group** (admin in group chat) | 15 | Read-only + web + vision/OCR/transcription | Full read + write |
|
|
3531
|
-
| **Public** (everyone else) | 8 | memory
|
|
3539
|
+
| **Public** (everyone else) | 8 | scoped memory, web fetch/search, media analysis, sandboxed creative file/image/audio tools | Scoped per-chat |
|
|
3532
3540
|
|
|
3533
3541
|
**Admin DM** — full agent experience in private chat. File read, grep, glob, memory, web research, all tools except shell (which can be unblocked via config).
|
|
3534
3542
|
|
|
3535
3543
|
**Admin Group** — when the admin speaks in a group chat, the agent responds with read-only capabilities. No system-mutating tools (no shell, no file write, no code execution). Vision, OCR, transcription, and web tools are available for analyzing shared media and answering questions.
|
|
3536
3544
|
|
|
3537
|
-
**Public** — lightweight assistant with safety guardrails. No
|
|
3545
|
+
**Public** — lightweight assistant with safety guardrails. No shell and no access to arbitrary local files. Web search, scoped memory, media analysis, and creative artifact generation are available inside a per-chat sandbox. Reply discretion is active in groups.
|
|
3538
3546
|
|
|
3539
3547
|
### Streaming Responses
|
|
3540
3548
|
|
|
@@ -3542,12 +3550,25 @@ While the sub-agent is working, users see:
|
|
|
3542
3550
|
1. **Typing indicator** — "typing..." appears immediately and refreshes every 4 seconds until the response is ready
|
|
3543
3551
|
2. **Admin live streaming** — a placeholder message is sent immediately, then progressively edited via `editMessageText` with accumulated content + intermediate states (tool calls, results, status updates). Admin sees `🔧 tool_name(...)` and `✔ tool_name: result` inline as the agent works
|
|
3544
3552
|
3. **Markdown → HTML conversion** — all responses are automatically converted from GitHub-flavored Markdown to Telegram-compatible HTML (`<b>`, `<i>`, `<code>`, `<pre>`, `<s>`, `<a>`) with plaintext fallback
|
|
3545
|
-
4. **Final message** —
|
|
3553
|
+
4. **Final message selection** — the bridge prefers the assistant's refined visible content over task-complete summaries, router decisions, memory-stage notes, or `no_reply` markers
|
|
3554
|
+
5. **Artifact send-back** — generated images, documents, and audio files created inside the scoped creative workspace are uploaded back to Telegram via the appropriate Bot API method
|
|
3555
|
+
6. **Final message** — committed via `editMessageText` (admin) or `sendMessage` (public) when the agent completes
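
The Markdown → HTML step in item 3 could look roughly like the sketch below. It is an illustration under stated assumptions, not the bridge's converter: `escapeHtml` and `markdownToTelegramHtml` are hypothetical names, and only bold, strikethrough, inline code, and links are handled (no italics, fenced blocks, or plaintext fallback).

```typescript
// Escape the characters that must be escaped in Telegram's HTML parse mode.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Convert a small Markdown subset to Telegram-compatible HTML.
function markdownToTelegramHtml(md: string): string {
  // Split on inline code spans so their contents are escaped but never re-formatted.
  return md
    .split(/(`[^`\n]+`)/)
    .map((part) => {
      if (part.length > 2 && part.startsWith("`") && part.endsWith("`")) {
        return `<code>${escapeHtml(part.slice(1, -1))}</code>`;
      }
      return escapeHtml(part)
        .replace(/\*\*(.+?)\*\*/g, "<b>$1</b>")
        .replace(/~~(.+?)~~/g, "<s>$1</s>")
        .replace(/\[([^\]]+)\]\((https?:\/\/[^\s)"]+)\)/g, '<a href="$2">$1</a>');
    })
    .join("");
}
```

Escaping runs before the tag substitutions so that user-supplied `<` and `&` never reach Telegram as markup.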
|
|
3546
3556
|
|
|
3547
3557
|
### Public User Isolation
|
|
3548
3558
|
|
|
3549
3559
|
Public users get **per-chat isolated memory** — each chat has its own scoped memory namespace (`telegram-{chatId}-{topic}`) so public users can store and retrieve facts about their conversation without accessing or polluting global agent memory. Public tools include: `memory_read`, `memory_write` (scoped), `memory_search`, `web_search`, `web_fetch`.
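
As a rough illustration of the `telegram-{chatId}-{topic}` namespace shape, here is a minimal sketch. The function name `scopedNamespace`, the slug rules, and the `"general"` fallback are assumptions made for the example; the README only specifies the overall pattern.

```typescript
// Build a per-chat memory namespace in the `telegram-{chatId}-{topic}` shape.
// Assumed slug rules: lowercase, runs of other characters become "-", edges trimmed.
function scopedNamespace(chatId: number | string, topic: string): string {
  const slug = topic
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  // "general" is a hypothetical fallback for an empty topic.
  return `telegram-${chatId}-${slug || "general"}`;
}
```

Because the chat ID is part of the key, two chats that use the same topic name still read and write separate namespaces.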
|
|
3550
3560
|
|
|
3561
|
+
The bridge also maintains a per-chat conversation state file with recent history, participants, relationship signals, and lightweight Zettelkasten memory cards. Each Telegram group or private chat gets its own scoped personality document under `.omnius/scoped-personality/telegram-chat/`; that profile is updated as people talk and injected into future Telegram context so tone, pacing, names, and relationships stay available turn to turn.
|
|
3562
|
+
|
|
3563
|
+
### Public Creative Artifacts
|
|
3564
|
+
|
|
3565
|
+
Public chats can ask Omnius to create files, images, and audio without giving the model arbitrary write access. The bridge injects a per-chat creative workspace under `.omnius/telegram-creative/<chat>/` and exposes scoped tools that can only create or edit files inside that folder. Generated artifacts are tracked by manifest and by tool result text, then uploaded back into the chat.
|
|
3566
|
+
|
|
3567
|
+
- **Images** — the model can call `generate_image` directly when the conversation asks for an image; generated PNGs are sent with `sendPhoto` when Telegram accepts them
|
|
3568
|
+
- **Documents** — Markdown, text, JSON, CSV, and other generated files are sent with `sendDocument`
|
|
3569
|
+
- **Audio** — generated WAV/voice artifacts are sent as audio or voice media based on file type
|
|
3570
|
+
- **Sandbox rule** — public creative tools cannot delete or mutate anything outside the scoped chat folder
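
The sandbox rule and the send-back step can be sketched as two small pure functions. This is a hedged illustration, not the bridge's implementation: `normalizeInside`, `resolveInChatSandbox`, and `pickSendMethod` are hypothetical names, paths are treated as POSIX-style strings, symlinks are not considered, and the extension-to-method mapping is an assumption. `sendPhoto`, `sendDocument`, `sendAudio`, and `sendVoice` are real Telegram Bot API methods.

```typescript
// Only plain numeric Telegram chat IDs are accepted as folder names (an assumption).
const CHAT_ID = /^-?\d+$/;

// Normalize a relative path; return null if it is absolute, empty, or climbs out.
function normalizeInside(requested: string): string | null {
  if (requested.startsWith("/") || requested.includes("\\")) return null;
  const parts: string[] = [];
  for (const seg of requested.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (parts.length === 0) return null; // would leave the chat folder
      parts.pop();
      continue;
    }
    parts.push(seg);
  }
  return parts.length > 0 ? parts.join("/") : null;
}

// Map a requested file name to a path inside `.omnius/telegram-creative/<chat>/`.
function resolveInChatSandbox(chatId: string, requested: string): string {
  const rel = CHAT_ID.test(chatId) ? normalizeInside(requested) : null;
  if (rel === null) {
    throw new Error(`rejected path for chat ${chatId}: ${requested}`);
  }
  return `.omnius/telegram-creative/${chatId}/${rel}`;
}

type SendMethod = "sendPhoto" | "sendDocument" | "sendAudio" | "sendVoice";

// Pick a Bot API upload method from the file extension (mapping is assumed).
function pickSendMethod(fileName: string): SendMethod {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
  switch (ext) {
    case ".png":
    case ".jpg":
    case ".jpeg":
      return "sendPhoto";
    case ".ogg":
    case ".oga":
      return "sendVoice";
    case ".wav": // the README says WAV goes out as audio or voice; audio is assumed here
    case ".mp3":
    case ".m4a":
      return "sendAudio";
    default:
      return "sendDocument";
  }
}
```

A real implementation would more likely build on Node's `path` module and resolve symlinks before comparing against the chat folder; the string-only version here just keeps the containment check easy to follow.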
|
|
3571
|
+
|
|
3551
3572
|
### Context-Aware Tool Policy
|
|
3552
3573
|
|
|
3553
3574
|
Tools are gated per execution context. The system enforces strict separation between what's available in a terminal session versus a public Telegram group:
|
|
@@ -3557,7 +3578,7 @@ Tools are gated per execution context. The system enforces strict separation bet
|
|
|
3557
3578
|
| `terminal` | All tools | Wide open — shell, file read/write, everything |
|
|
3558
3579
|
| `telegram-admin-dm` | All except shell | Admin DM — full tools, shell blocked by default (overridable) |
|
|
3559
3580
|
| `telegram-admin-group` | Read-only + web + vision/OCR | Admin in public group — no system mutation tools |
|
|
3560
|
-
| `telegram-public` | Memory r/w, web fetch/search | Public users —
|
|
3581
|
+
| `telegram-public` | Memory r/w, web fetch/search, scoped creative tools | Public users — no arbitrary local file access or shell |
|
|
3561
3582
|
| `api` | All tools | API endpoint — configurable |
|
|
3562
3583
|
|
|
3563
3584
|
**System tools** (`shell`, `file_write`, `file_edit`, `file_read`, `file_patch`, `batch_edit`, `grep_search`, `glob_find`, `list_directory`, `code_sandbox`, `codebase_map`, `git_info`, etc.) are **never exposed** in public-facing contexts.
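
One way to picture the per-context gating is a policy table keyed by execution context. The sketch below is illustrative only: `Policy`, `POLICIES`, and `isToolAllowed` are hypothetical names, the public allowlist uses just the tool names this README mentions, and the admin-group entry is approximated with a deny set of mutating tools even though the table describes it as read-only plus web and vision/OCR.

```typescript
// Execution contexts as listed in the table above.
type ExecContext =
  | "terminal"
  | "telegram-admin-dm"
  | "telegram-admin-group"
  | "telegram-public"
  | "api";

// Hypothetical policy shape: allow everything, deny a named set, or allow only a named set.
type Policy =
  | { kind: "all" }
  | { kind: "deny"; tools: ReadonlySet<string> }
  | { kind: "allow"; tools: ReadonlySet<string> };

// Assumed subset of mutating tools blocked when the admin speaks in a group.
const MUTATING = new Set([
  "shell", "file_write", "file_edit", "file_patch", "batch_edit", "code_sandbox",
]);

// Assumed public allowlist, limited to tool names mentioned for public chats.
const PUBLIC = new Set([
  "memory_read", "memory_write", "memory_search", "web_search", "web_fetch", "generate_image",
]);

const POLICIES: Record<ExecContext, Policy> = {
  terminal: { kind: "all" },
  api: { kind: "all" }, // described as configurable; "all" is the assumed default
  "telegram-admin-dm": { kind: "deny", tools: new Set(["shell"]) },
  "telegram-admin-group": { kind: "deny", tools: MUTATING },
  "telegram-public": { kind: "allow", tools: PUBLIC },
};

// Return true when `tool` may run in the given context.
function isToolAllowed(ctx: ExecContext, tool: string): boolean {
  const policy = POLICIES[ctx];
  switch (policy.kind) {
    case "all":
      return true;
    case "deny":
      return !policy.tools.has(tool);
    case "allow":
      return policy.tools.has(tool);
  }
}
```

Using an allowlist for the public context means any tool added later stays unavailable there until it is explicitly listed, which matches the "never exposed" wording above.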
|
|
@@ -3586,9 +3607,9 @@ The bridge distinguishes between **private DMs** and **group/supergroup chats**,
|
|
|
3586
3607
|
|
|
3587
3608
|
- **Admin DM** → full tool access, live streaming via `editMessageText`, project context injected
|
|
3588
3609
|
- **Admin in group** → read-only tools + web + vision/OCR, no live streaming, concise responses
|
|
3589
|
-
- **Public in group** →
|
|
3610
|
+
- **Public in group** → scoped memory + web + media + creative sandbox tools, reply discretion active
|
|
3590
3611
|
|
|
3591
|
-
**Reply discretion** — in group chats, the
|
|
3612
|
+
**Reply discretion** — in group chats, the live router evaluates whether a message warrants a response using the current conversation stream, participants, mentions, replies, and recent tone. Chatter that doesn't involve the bot is silently skipped and retained as context. Skip decisions are not sent back into the chat.
|
|
3592
3613
|
|
|
3593
3614
|
### Media Handling
|
|
3594
3615
|
|