npm - omnius - Versions diffs - 1.0.19 → 1.0.21 - Mend

omnius 1.0.19 → 1.0.21

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md CHANGED

@@ -278,7 +278,8 @@ The agent uses tools autonomously in a loop — reading errors, fixing code, and
 - **61 autonomous tools** — file I/O, shell, grep, web search/fetch/crawl, memory (read/write/search), sub-agents, background tasks, image/OCR/PDF, git, diagnostics, vision, desktop automation, browser automation, temporal agency (scheduler/reminders/agenda), structured files, code sandbox, transcription, skills, opencode delegation, cron agents, nexus P2P networking + x402 micropayments, **COHERE cognitive stack** (persistent REPL, recursive LLM calls, memory metabolism, identity kernel, reflection, exploration)
 - **Moondream vision** — see and interact with the desktop via Moondream VLM (caption, query, detect, point-and-click)
-- **Image generation with TUI previews** — `/image <prompt>` and the `generate_image` tool create PNGs under `.omnius/images/`, support explicit `--model` selection, and render generated, pasted, screenshot, and camera-capture images as auto-sized ASCII previews via the bundled `image-to-ascii` renderer
+- **Image generation with TUI previews** — `/image <prompt>` and the `generate_image` tool create PNGs under `.omnius/images/`, support explicit `--model` selection, try a ranked quality fallback ladder from FLUX.1 dev / SD3.5 Large down to lightweight smoke-test models when setup or generation fails, and render generated, pasted, screenshot, and camera-capture images as auto-sized ASCII previews via the bundled `image-to-ascii` renderer
+- **Sound and music generation** — `/sound` and `/music` generate WAV files under `.omnius/audio/`, auto-create backend venvs under `.omnius/audio-gen/`, and fall back from high-quality Stable Audio / AudioLDM / MusicGen tiers to smaller practical models when a larger setup or model download fails. Stable Audio uses Diffusers `StableAudioPipeline` instead of the build-prone `stable-audio-tools` package
 - **Desktop automation** — vision-guided clicking: describe a UI element in natural language, the agent finds and clicks it
 - **Auto-install desktop deps** — screenshot, mouse, OCR, and image tools auto-install missing system packages (scrot, xdotool, tesseract, imagemagick) on first use
 - **Hardware-rated model lists** — first-run setup, `/models`, `/score`, and `/image list` score model fit against detected RAM/VRAM/GPU so text and image model choices are visible before you switch or generate
@@ -357,7 +358,7 @@ The daemon auto-installs Python dependencies (OpenCLIP, torchaudio + soundfile,
 - **IPFS sharing surface** — `/ipfs` status page with peer info + identity kernel metrics + memory sentiment. `/ipfs pin <CID>` to pin remote agent content. `/ipfs publish` to share identity kernel. `/ipfs share tool/skill` to publish agent-created tools with secret stripping. `/ipfs import <CID>` to retrieve shared content
 - **Fortemi-React bridge** — `/fortemi start/status/stop` connects to [fortemi-react](https://github.com/robit-man/fortemi-react) (browser-first PGlite+pgvector knowledge system) via JWT auth. Proxy tools: `fortemi_capture`, `fortemi_search`, `fortemi_list`, `fortemi_get` auto-register when bridge is connected
 - **Content ingestion** — `/ingest <file>` imports audio (transcribe via Whisper), PDF (pdftotext), or text files into structured memory with 800-char/100-overlap chunking (matches fortemi pattern)
-- **Image generation** — `generate_image` supports Ollama image models, Diffusers models, and stable-diffusion.cpp checkpoints/GGUF. SDXL Turbo is the practical default auto-install path under `.omnius/image-gen/.venv`; FLUX.1 dev and Stable Diffusion 3.5 Large are the primary high-realism baselines when hardware allows. `/image list` groups models by type, size, quality expectations, and hardware fit
+- **Image generation** — `generate_image` supports Ollama image models, Diffusers models, and stable-diffusion.cpp checkpoints/GGUF. SDXL Turbo is the practical default auto-install path under `.omnius/image-gen/.venv`; FLUX.1 dev and Stable Diffusion 3.5 Large are the primary high-realism baselines when hardware allows. `/image list` groups models by type, size, quality expectations, and hardware fit. Generation falls through the ranked model ladder unless `strict_model=true` or `fallback=false` is set
 - **Node visualization** — [omnius.nexus](https://github.com/robit-man/omnius.nexus) Three.js dashboard: 5-color emotional state mapping (neutral/focused/stressed/dreaming/excited), dynamic node size by memory depth + IPFS storage, activity-modulated connections, identity synchrony golden threads between mutually-pinned agents
 - **TTS sanitizer** — strips markdown syntax (`##`, `**`, `` ` ``), emoji (prevents "white heavy checkmark"), box-drawing chars, and ANSI codes before feeding to ALL TTS engines
 - **LuxTTS gapless playback** — look-ahead pre-synthesis pipeline: next chunk synthesizes while current plays, eliminating inter-sentence gaps. Jetson ARM support with NVIDIA's prebuilt PyTorch wheel
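
The added README lines describe generation "falling through" a ranked model ladder unless `strict_model=true` or `fallback=false` is set. The diff does not show how omnius implements this, so the following is only a minimal sketch of the general pattern. It is written in Python for brevity, and everything in it is hypothetical: the function name `generate_with_fallback`, the `run_model` callback, the ladder entries, and the choice to fall back only to rungs below an explicitly requested model are assumptions. Only the two option names are taken from the README text.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class GenerationResult:
    model: str           # model that actually produced the output
    path: str            # path returned by the (hypothetical) backend
    attempts: list[str]  # every model tried, in order


def generate_with_fallback(
    prompt: str,
    ladder: Sequence[str],
    run_model: Callable[[str, str], str],
    model: Optional[str] = None,
    strict_model: bool = False,
    fallback: bool = True,
) -> GenerationResult:
    """Try models from best to smallest until one succeeds.

    `ladder` is ordered from highest quality to most lightweight.
    `run_model(name, prompt)` is a stand-in for a real backend; it
    returns an output path or raises on setup/generation failure.
    """
    if model is None and not ladder:
        raise ValueError("no model requested and the ladder is empty")

    first = model if model is not None else ladder[0]
    if strict_model or not fallback:
        # Either flag disables the ladder: only the first choice is tried.
        candidates = [first]
    elif first in ladder:
        # Assumption: fall back only to rungs ranked below the first choice.
        candidates = list(ladder[list(ladder).index(first):])
    else:
        # Requested model is not on the ladder: try it, then the whole ladder.
        candidates = [first, *ladder]

    attempts: list[str] = []
    last_error: Optional[Exception] = None
    for name in candidates:
        attempts.append(name)
        try:
            path = run_model(name, prompt)
        except Exception as exc:  # setup or generation failed, try next rung
            last_error = exc
            continue
        return GenerationResult(model=name, path=path, attempts=attempts)

    raise RuntimeError(f"all candidates failed: {attempts}") from last_error


# Placeholder ladder; the names are illustrative, not omnius model IDs.
LADDER = ["flux.1-dev", "sd3.5-large", "sdxl-turbo"]


def fake_backend(name: str, prompt: str) -> str:
    """Stand-in backend where only the smallest model works."""
    if name != "sdxl-turbo":
        raise RuntimeError(f"{name}: setup failed")
    return f".omnius/images/{name}.png"


result = generate_with_fallback("a lighthouse at dusk", LADDER, fake_backend)
print(result.model, result.attempts)
```

With the stand-in backend, the default call walks all three rungs and succeeds on the last one, while passing `strict_model=True` or `fallback=False` raises after the first failure instead of trying a smaller model.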