noosphere 0.5.0 → 0.8.0

package/README.md CHANGED
@@ -6,11 +6,14 @@ One import. Every model. Every modality.
 
  ## Features
 
- - **4 modalities** — LLM chat, image generation, video generation, and text-to-speech
+ - **7 modalities** — LLM, image, video, TTS, STT, music, and embeddings
  - **Always up-to-date models** — Dynamic auto-fetch from ALL provider APIs at runtime (OpenAI, Anthropic, Google, Groq, Mistral, xAI, Cerebras, OpenRouter)
+ - **Dynamic descriptions** — Model descriptions fetched from source (Ollama library, HuggingFace READMEs, CivitAI API) — no hardcoded strings
+ - **Modality-filtered sync** — `syncModels('llm')` only fetches LLM providers, avoiding unnecessary requests
  - **867+ media endpoints** — via FAL (Flux, SDXL, Kling, Sora 2, VEO 3, Kokoro, ElevenLabs, and hundreds more)
  - **30+ HuggingFace tasks** — LLM, image, TTS, translation, summarization, classification, and more
- - **Local-first architecture** — Auto-detects ComfyUI, Ollama, Piper, and Kokoro on your machine
+ - **Local-first architecture** — Auto-detects Ollama, ComfyUI, Whisper, AudioCraft, Piper, and Kokoro on your machine
+ - **Org-aware logos** — HuggingFace models show the real org logo (Meta, Google, NVIDIA) instead of generic HF logo
  - **Agentic capabilities** — Tool use, function calling, reasoning/thinking, vision, and agent loops via Pi-AI
  - **Failover & retry** — Automatic retries with exponential backoff and cross-provider failover
  - **Usage tracking** — Real-time cost, latency, and token tracking across all providers
@@ -324,6 +327,288 @@ const imageModels = await ai.getModels('image');
 
  ---
 
+ ## Local Models — Run Everything on Your Machine
+
+ Noosphere has **comprehensive local model support** across all modalities: LLM, image, video, TTS, STT, and music. It auto-discovers what's installed, catalogs what's available to download, and provides a unified API for everything.
+
+ ### Quick Start
+
+ ```typescript
+ const ai = new Noosphere();
+ await ai.syncModels();
+
+ // 774 models discovered — cloud + local, all modalities
+ const all = await ai.getModels();
+
+ // Filter by what you can run locally
+ const localModels = all.filter(m => m.local || m.status === 'installed');
+
+ // What's installed vs what's available to download
+ const installed = all.filter(m => m.status === 'installed'); // 39 models ready to use
+ const available = all.filter(m => m.status === 'available'); // 251 models you can download
+
+ // Chat with a local Ollama model — same API as cloud
+ const result = await ai.chat({
+   model: 'qwen3:8b',
+   provider: 'ollama',
+   messages: [{ role: 'user', content: 'Hello!' }],
+ });
+ console.log(result.content); // "Hello! How can I help?"
+ console.log(result.usage); // { cost: 0, input: 24, output: 198, unit: 'tokens' }
+
+ // Install a new model from Ollama library
+ await ai.installModel('deepseek-r1:14b');
+
+ // Uninstall
+ await ai.uninstallModel('deepseek-r1:14b');
+ ```
+
+ ### 10 Providers, 7 Modalities, 774+ Models
+
+ | Provider | Modality | Models | Source | Auto-Detect |
+ |---|---|---|---|---|
+ | **pi-ai** | LLM | 482 | OpenAI, Anthropic, Google, Groq, Mistral, xAI, OpenRouter, Cerebras | API keys |
+ | **ollama** | LLM, embedding | 70 | 38 installed + 32 from Ollama web catalog | `localhost:11434` |
+ | **hf-local** | image, video, tts, stt, music | 220 | HuggingFace catalog (FLUX, SDXL, Wan2.2, Whisper, MusicGen) | Always (no API key) |
+ | **huggingface** | LLM, image, tts | dynamic | HuggingFace Inference API | `HUGGINGFACE_TOKEN` |
+ | **comfyui** | image, video | dynamic | Installed checkpoints + CivitAI catalog | `localhost:8188` |
+ | **openai-compat** | LLM | dynamic | llama.cpp, LM Studio, vLLM, LocalAI, KoboldCpp, Jan, TabbyAPI | Scans ports |
+ | **fal** | image, video, tts | 867+ | FAL.ai (Flux, SDXL, Kling, Sora 2, Kokoro, ElevenLabs) | `FAL_KEY` |
+ | **piper** | TTS | 2+ | Piper voices installed locally | Binary detection |
+ | **whisper-local** | STT | 8 | Whisper/Faster-Whisper (tiny → large-v3) | Python detection |
+ | **audiocraft** | music | 5 | MusicGen (small/medium/large/melody) + AudioGen | Python detection |
+
+ ### Modality-Filtered Sync — Only Fetch What You Need
+
+ Sync **only the providers relevant to a specific modality** instead of fetching everything. This avoids unnecessary network requests (e.g., fetching 270+ HuggingFace READMEs when you only need LLMs).
+
+ ```typescript
+ // Sync only LLM providers (Ollama, pi-ai, openai-compat, huggingface)
+ await ai.syncModels('llm');
+
+ // Sync only image providers (hf-local, comfyui, fal, huggingface)
+ await ai.syncModels('image');
+
+ // Sync only STT providers (whisper-local, hf-local)
+ await ai.syncModels('stt');
+
+ // Sync everything (backward compatible)
+ await ai.syncModels();
+ ```
+
+ **Which providers sync for each modality:**
+
+ | Modality | Providers Synced |
+ |---|---|
+ | `llm` | pi-ai, ollama, openai-compat, huggingface (cloud, needs API key) |
+ | `image` | hf-local, comfyui, fal, huggingface (cloud) |
+ | `video` | hf-local, comfyui, fal |
+ | `tts` | hf-local (speech models), fal, piper, kokoro, huggingface (cloud) |
+ | `stt` | hf-local, whisper-local |
+ | `music` | hf-local (MusicGen, AudioLDM, etc.), audiocraft |
+ | `embedding` | ollama |
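
The table above can be read as a lookup from modality to provider names. The following is a minimal sketch of such a lookup, not noosphere's actual implementation: `PROVIDERS_BY_MODALITY` and `providersToSync` are hypothetical names introduced for illustration, and the provider lists are copied from the table.

```typescript
// Hypothetical illustration only; noosphere's internals may differ.
type Modality = 'llm' | 'image' | 'video' | 'tts' | 'stt' | 'music' | 'embedding';

// Provider names copied from the modality table above.
const PROVIDERS_BY_MODALITY: Record<Modality, string[]> = {
  llm: ['pi-ai', 'ollama', 'openai-compat', 'huggingface'],
  image: ['hf-local', 'comfyui', 'fal', 'huggingface'],
  video: ['hf-local', 'comfyui', 'fal'],
  tts: ['hf-local', 'fal', 'piper', 'kokoro', 'huggingface'],
  stt: ['hf-local', 'whisper-local'],
  music: ['hf-local', 'audiocraft'],
  embedding: ['ollama'],
};

// With a modality, return only its providers.
// Without one, return every provider exactly once (the "sync everything" case).
function providersToSync(modality?: Modality): string[] {
  if (modality !== undefined) return PROVIDERS_BY_MODALITY[modality].slice();
  const all: string[] = [];
  for (const key of Object.keys(PROVIDERS_BY_MODALITY) as Modality[]) {
    for (const name of PROVIDERS_BY_MODALITY[key]) {
      if (all.indexOf(name) === -1) all.push(name);
    }
  }
  return all;
}

console.log(providersToSync('stt')); // [ 'hf-local', 'whisper-local' ]
console.log(providersToSync().length); // 11
```

Keeping the mapping as plain data means a filtered sync only has to iterate the returned names, so providers outside the requested modality are never contacted.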
+
+ ### Models by Modality
+
+ ```typescript
+ const models = await ai.getModels();
+
+ // Filter by modality
+ const llm = models.filter(m => m.modality === 'llm'); // 552 (cloud + Ollama local)
+ const image = models.filter(m => m.modality === 'image'); // 101 (FLUX, SDXL, SD3, PixArt...)
+ const tts = models.filter(m => m.modality === 'tts'); // 61 (MusicGen, Bark, Piper, Kokoro...)
+ const video = models.filter(m => m.modality === 'video'); // 30 (Wan2.2, CogVideoX, AnimateDiff...)
+ const stt = models.filter(m => m.modality === 'stt'); // 30 (Whisper, wav2vec2...)
+ ```
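
When only the counts are needed, a single pass can replace the repeated `filter` calls. This is a small sketch under the assumption that each model exposes a `modality` string as in the snippet above; `ModelLike` and `countByModality` are hypothetical helpers, not part of the package.

```typescript
// Minimal stand-in for the model shape; only the fields used here.
interface ModelLike {
  id: string;
  modality: string;
}

// Count models per modality in one pass over the list.
function countByModality(models: ModelLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const m of models) {
    counts[m.modality] = (counts[m.modality] ?? 0) + 1;
  }
  return counts;
}

// Sample data for illustration; real lists come from ai.getModels().
const sample: ModelLike[] = [
  { id: 'qwen3:8b', modality: 'llm' },
  { id: 'llama3.3:70b', modality: 'llm' },
  { id: 'FLUX.1-dev', modality: 'image' },
];

console.log(countByModality(sample)); // { llm: 2, image: 1 }
```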
+
+ ### Ollama Provider — Local LLM
+
+ Full integration with Ollama's API:
+
+ ```typescript
+ // Auto-detected on startup — no config needed
+ // Models include full metadata from Ollama
+
+ const ollamaModels = models.filter(m => m.provider === 'ollama');
+ for (const m of ollamaModels) {
+   console.log(m.id); // "llama3.3:70b"
+   console.log(m.status); // "installed" | "available" | "running"
+   console.log(m.localInfo?.parameterSize); // "70.6B"
+   console.log(m.localInfo?.quantization); // "Q4_K_M"
+   console.log(m.localInfo?.sizeBytes); // 42520413916
+   console.log(m.localInfo?.family); // "llama"
+   console.log(m.logo); // { svg: "...meta.svg", png: "...meta.png" }
+ }
+
+ // Chat with streaming
+ const stream = ai.stream({
+   model: 'qwen3:8b',
+   provider: 'ollama',
+   messages: [{ role: 'user', content: 'Explain quantum computing' }],
+ });
+
+ for await (const event of stream) {
+   if (event.type === 'text_delta') process.stdout.write(event.delta);
+ }
+
+ const finalResult = await stream.result();
+
+ // Model management
+ await ai.installModel('deepseek-r1:14b'); // Downloads from Ollama library
+ await ai.uninstallModel('old-model:7b'); // Removes from disk
+
+ // Hardware info
+ const hw = await ai.getHardware();
+ // { ollama: true, runningModels: [{ name: 'qwen3:8b', size: 5200000000, ... }] }
+ ```
464
+
465
+ ### OpenAI-Compatible Provider — Any Local Server
466
+
467
+ Connects to ANY server that implements the OpenAI API:
468
+
469
+ ```typescript
470
+ // Auto-detects servers on common ports:
471
+ // llama.cpp (:8080), LM Studio (:1234), vLLM (:8000)
472
+ // LocalAI (:8080), TabbyAPI (:5000), KoboldCpp (:5001), Jan (:1337)
473
+
474
+ // Or configure manually:
475
+ const ai = new Noosphere({
476
+ openaiCompat: [
477
+ { baseUrl: 'http://localhost:1234/v1', name: 'LM Studio' },
478
+ { baseUrl: 'http://192.168.1.100:8080/v1', name: 'Remote llama.cpp' },
479
+ ],
480
+ });
481
+ ```
+
+ ### HuggingFace Local Catalog
+
+ Auto-fetches the top models by downloads for each modality:
+
+ ```typescript
+ const imageModels = models.filter(m => m.provider === 'hf-local' && m.modality === 'image');
+ // → FLUX.1-dev, FLUX.1-schnell, SDXL, SD 3.5, PixArt-Σ, Playground v2.5, Kolors...
+
+ const videoModels = models.filter(m => m.provider === 'hf-local' && m.modality === 'video');
+ // → Wan2.2-T2V, CogVideoX-5b, AnimateDiff, Stable Video Diffusion...
+
+ const ttsModels = models.filter(m => m.provider === 'hf-local' && m.modality === 'tts');
+ // → MusicGen, Stable Audio Open, Bark, ACE-Step...
+
+ const sttModels = models.filter(m => m.provider === 'hf-local' && m.modality === 'stt');
+ // → Whisper large-v3, Whisper large-v3-turbo, wav2vec2...
+ ```
+
+ Models already downloaded to `~/.cache/huggingface/hub/` are automatically detected as `status: 'installed'`.
+
+ ### ComfyUI — Dynamic Workflow Engine
+
+ When ComfyUI is running, noosphere discovers all installed checkpoints, LoRAs, and models:
+
+ ```typescript
+ // Auto-detected on localhost:8188
+ const comfyModels = models.filter(m => m.provider === 'comfyui');
+ // → All checkpoints (SD 1.5, SDXL, FLUX, Pony, etc.)
+
+ // Also fetches top models from CivitAI as "available"
+ const civitai = comfyModels.filter(m => m.status === 'available');
+ ```
+
+ ### Model Descriptions — Dynamic from Source
+
+ Every model includes a `description` field fetched dynamically from its source — no hardcoded strings:
+
+ ```typescript
+ const models = await ai.getModels('llm');
+
+ for (const m of models) {
+   console.log(m.name, m.description);
+   // "llama3.1" "Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B"
+   // "qwen3" "Qwen3 is the latest generation of large language models in Qwen series"
+   // "gemma3" "The current, most capable model that runs on a single GPU"
+ }
+
+ const imageModels = await ai.getModels('image');
+ for (const m of imageModels) {
+   console.log(m.name, m.description);
+   // "stable-diffusion-xl-base-1.0" "Stable Diffusion XL (SDXL) is a latent text-to-image..."
+   // "FLUX.1-dev" "FLUX.1 [dev] is a 12 billion parameter rectified flow..."
+ }
+ ```
+
+ | Provider | Description Source |
+ |---|---|
+ | **Ollama** | Scraped from `ollama.com/library` page |
+ | **HuggingFace Local** | Parsed from each model's `README.md` on HuggingFace Hub |
+ | **CivitAI/ComfyUI** | Extracted from CivitAI API response |
+ | **Whisper** | Parsed from OpenAI's Whisper README on HuggingFace |
+ | **AudioCraft** | Parsed from Meta's AudioCraft README on HuggingFace |
+
+ All description fetches are **parallel and fail-safe** — if a source is unreachable, models are returned without descriptions. No API keys required.
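
One way to get the "parallel and fail-safe" behavior described above is to start every fetch at once and turn each rejection into `undefined`. The sketch below shows that pattern only; `attachDescriptions`, `DescriptionFetcher`, and `fakeFetcher` are hypothetical names, and the package may implement this differently.

```typescript
// Hypothetical sketch of a parallel, fail-safe description fetch.
interface Described {
  id: string;
  description?: string;
}

// Any function that resolves to a description or rejects on failure.
type DescriptionFetcher = (id: string) => Promise<string>;

async function attachDescriptions(
  models: Described[],
  fetchDescription: DescriptionFetcher,
): Promise<Described[]> {
  // All fetches start immediately; a rejected fetch resolves to undefined,
  // so one unreachable source cannot fail the whole batch.
  const descriptions = await Promise.all(
    models.map(m => fetchDescription(m.id).catch(() => undefined)),
  );
  return models.map((m, i) => ({
    ...m,
    description: descriptions[i] ?? m.description,
  }));
}

// Stand-in fetcher for illustration: one id simulates an unreachable source.
const fakeFetcher: DescriptionFetcher = async id => {
  if (id === 'offline-model') throw new Error('source unreachable');
  return `Description for ${id}`;
};

attachDescriptions([{ id: 'qwen3' }, { id: 'offline-model' }], fakeFetcher)
  .then(result => console.log(result));
```

The second model comes back without a description instead of causing an error, which matches the behavior the paragraph above describes.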
547
+
548
+ ### Model Status & Local Info
549
+
550
+ Every local model includes rich metadata:
551
+
552
+ ```typescript
553
+ interface ModelInfo {
554
+ id: string;
555
+ provider: string;
556
+ name: string;
557
+ description?: string; // Dynamic from source (Ollama library, HF README, CivitAI)
558
+ modality: 'llm' | 'image' | 'video' | 'tts' | 'stt' | 'music' | 'embedding';
559
+ status?: 'installed' | 'available' | 'downloading' | 'running' | 'error';
560
+ local: boolean;
561
+ logo?: { svg?: string; png?: string };
562
+ localInfo?: {
563
+ sizeBytes: number;
564
+ family?: string; // "llama", "gemma3", "qwen2"
565
+ parameterSize?: string; // "70.6B", "7B", "3.2B"
566
+ quantization?: string; // "Q4_K_M", "Q8_0", "F16"
567
+ format?: string; // "gguf", "safetensors", "onnx"
568
+ digest?: string;
569
+ modifiedAt?: string;
570
+ running?: boolean;
571
+ runtime: string; // "ollama", "diffusers", "comfyui", "piper", "whisper"
572
+ };
573
+ capabilities: {
574
+ contextWindow?: number;
575
+ maxTokens?: number;
576
+ supportsVision?: boolean;
577
+ supportsStreaming?: boolean;
578
+ };
579
+ }
580
+ ```
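
Because `status` and `localInfo` are optional in the interface above, code that reads them needs to handle their absence. A short sketch of doing so follows; `LocalModelLike`, `formatSize`, and `installedBytes` are hypothetical helpers that mirror only the fields they read, and counting just `status === 'installed'` is an assumption of this example.

```typescript
// Subset of the ModelInfo shape above: only the fields this sketch reads.
interface LocalModelLike {
  id: string;
  status?: 'installed' | 'available' | 'downloading' | 'running' | 'error';
  localInfo?: { sizeBytes: number; quantization?: string };
}

// Format a byte count in binary units (1 GiB = 1024^3 bytes).
function formatSize(bytes: number): string {
  const units = ['B', 'KiB', 'MiB', 'GiB', 'TiB'];
  let value = bytes;
  let unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  return `${value.toFixed(1)} ${units[unit]}`;
}

// Sum the disk usage of installed models; a missing localInfo counts as 0 bytes.
function installedBytes(models: LocalModelLike[]): number {
  return models
    .filter(m => m.status === 'installed')
    .reduce((sum, m) => sum + (m.localInfo?.sizeBytes ?? 0), 0);
}

console.log(formatSize(42520413916)); // 39.6 GiB
```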
+
+ ### Web Catalogs (Auto-Fetched)
+
+ | Source | API | What it provides |
+ |---|---|---|
+ | **Ollama Library** | `ollama.com/api/tags` | 215+ LLM families with sizes and quantizations |
+ | **HuggingFace** | `huggingface.co/api/models?pipeline_tag=...` | Top models per modality (image, video, TTS, STT) |
+ | **CivitAI** | `civitai.com/api/v1/models` | SD/SDXL/FLUX checkpoints with previews |
590
+ ### Auto-Detection — Zero Config
591
+
592
+ Noosphere auto-detects all local runtimes on startup:
593
+
594
+ | Runtime | Detection Method | Default Port |
595
+ |---|---|---|
596
+ | Ollama | `GET localhost:11434/api/version` | 11434 |
597
+ | ComfyUI | `GET localhost:8188/system_stats` | 8188 |
598
+ | llama.cpp | `GET localhost:8080/health` | 8080 |
599
+ | LM Studio | `GET localhost:1234/v1/models` | 1234 |
600
+ | vLLM | `GET localhost:8000/v1/models` | 8000 |
601
+ | KoboldCpp | `GET localhost:5001/v1/models` | 5001 |
602
+ | TabbyAPI | `GET localhost:5000/v1/models` | 5000 |
603
+ | Jan | `GET localhost:1337/v1/models` | 1337 |
604
+ | Piper | Binary in PATH | — |
605
+ | Whisper | Python package detection | — |
606
+ | AudioCraft | Python package detection | — |
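
The HTTP rows of the table above amount to a list of probe URLs checked in parallel. The following sketch illustrates that idea under stated assumptions: `HTTP_PROBES`, `Reachability`, and `detectRuntimes` are hypothetical names, the `http://` scheme is assumed, and the reachability check is injected so the example does not depend on any particular HTTP client.

```typescript
// Hypothetical sketch; endpoints copied from the table above, http:// assumed.
interface RuntimeProbe {
  runtime: string;
  url: string;
}

const HTTP_PROBES: RuntimeProbe[] = [
  { runtime: 'Ollama', url: 'http://localhost:11434/api/version' },
  { runtime: 'ComfyUI', url: 'http://localhost:8188/system_stats' },
  { runtime: 'llama.cpp', url: 'http://localhost:8080/health' },
  { runtime: 'LM Studio', url: 'http://localhost:1234/v1/models' },
  { runtime: 'vLLM', url: 'http://localhost:8000/v1/models' },
  { runtime: 'KoboldCpp', url: 'http://localhost:5001/v1/models' },
  { runtime: 'TabbyAPI', url: 'http://localhost:5000/v1/models' },
  { runtime: 'Jan', url: 'http://localhost:1337/v1/models' },
];

// Resolves to true when the URL answers; injected by the caller.
type Reachability = (url: string) => Promise<boolean>;

// Probe every endpoint in parallel; a rejected probe counts as "not running".
async function detectRuntimes(isReachable: Reachability): Promise<string[]> {
  const results = await Promise.all(
    HTTP_PROBES.map(p => isReachable(p.url).catch(() => false)),
  );
  return HTTP_PROBES.filter((_, i) => results[i]).map(p => p.runtime);
}

// Example with a stand-in check that only "finds" the Ollama port.
detectRuntimes(async url => url.indexOf(':11434') !== -1)
  .then(found => console.log(found)); // [ 'Ollama' ]
```

In a real setup the injected check could wrap an HTTP GET with a short timeout so that closed ports do not delay startup. Piper, Whisper, and AudioCraft are detected without HTTP according to the table, so they are left out of this sketch.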
+
+ > 📄 **Full research:** [`docs/LOCAL_AI_RESEARCH.md`](./docs/LOCAL_AI_RESEARCH.md) — 44KB covering 12+ runtimes across all modalities
+
+ ---
+
  ## Configuration
 
  API keys are resolved from the constructor config or environment variables (config takes priority):