noosphere 0.7.0 → 0.9.0
- package/README.md +95 -6
- package/dist/index.cjs +695 -34
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +34 -2
- package/dist/index.d.ts +34 -2
- package/dist/index.js +693 -34
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -6,11 +6,16 @@ One import. Every model. Every modality.
 
 ## Features
 
-- **
+- **7 modalities** — LLM, image, video, TTS, STT, music, and embeddings
+- **OpenAI media** — GPT-Image-1/1.5, DALL-E 2/3, Sora 2/Pro (video), TTS-1/HD, Whisper — all auto-fetched from `OPENAI_API_KEY`
+- **Google media** — Imagen 4.0 (image), Veo 2/3/3.1 (video), Gemini TTS — all auto-fetched from `GEMINI_API_KEY`
 - **Always up-to-date models** — Dynamic auto-fetch from ALL provider APIs at runtime (OpenAI, Anthropic, Google, Groq, Mistral, xAI, Cerebras, OpenRouter)
+- **Dynamic descriptions** — Model descriptions fetched from source (Ollama library, HuggingFace READMEs, CivitAI API) — no hardcoded strings
+- **Modality-filtered sync** — `syncModels('llm')` only fetches LLM providers, avoiding unnecessary requests
 - **867+ media endpoints** — via FAL (Flux, SDXL, Kling, Sora 2, VEO 3, Kokoro, ElevenLabs, and hundreds more)
 - **30+ HuggingFace tasks** — LLM, image, TTS, translation, summarization, classification, and more
-- **Local-first architecture** — Auto-detects ComfyUI,
+- **Local-first architecture** — Auto-detects Ollama, ComfyUI, Whisper, AudioCraft, Piper, and Kokoro on your machine
+- **Org-aware logos** — HuggingFace models show the real org logo (Meta, Google, NVIDIA) instead of generic HF logo
 - **Agentic capabilities** — Tool use, function calling, reasoning/thinking, vision, and agent loops via Pi-AI
 - **Failover & retry** — Automatic retries with exponential backoff and cross-provider failover
 - **Usage tracking** — Real-time cost, latency, and token tracking across all providers
@@ -35,13 +40,29 @@ const response = await ai.chat({
 });
 console.log(response.content);
 
-// Generate an image
+// Generate an image with GPT-Image-1 (OpenAI) — just needs OPENAI_API_KEY
 const image = await ai.image({
   prompt: 'A sunset over mountains',
+  provider: 'openai-media',
+});
+// image.buffer contains the PNG data
+
+// Generate an image with Imagen 4.0 (Google) — just needs GEMINI_API_KEY
+const googleImage = await ai.image({
+  prompt: 'A sunset over mountains',
+  provider: 'google-media',
+});
+// googleImage.buffer contains the PNG data
+
+// Generate an image with DALL-E 3
+const dalle = await ai.image({
+  prompt: 'A sunset over mountains',
+  provider: 'openai-media',
+  model: 'dall-e-3',
   width: 1024,
   height: 1024,
 });
-console.log(
+console.log(dalle.url);
 
 // Generate a video
 const video = await ai.video({
@@ -50,7 +71,7 @@ const video = await ai.video({
 });
 console.log(video.url);
 
-// Text-to-speech
+// Text-to-speech with OpenAI TTS — just needs OPENAI_API_KEY
 const audio = await ai.speak({
   text: 'Welcome to Noosphere',
   voice: 'alloy',
@@ -365,14 +386,48 @@ await ai.uninstallModel('deepseek-r1:14b');
 | Provider | Modality | Models | Source | Auto-Detect |
 |---|---|---|---|---|
 | **pi-ai** | LLM | 482 | OpenAI, Anthropic, Google, Groq, Mistral, xAI, OpenRouter, Cerebras | API keys |
+| **openai-media** | image, video, tts, stt | 12 | GPT-Image-1/1.5, DALL-E 2/3, Sora 2/Pro, TTS-1/HD, Whisper | `OPENAI_API_KEY` |
+| **google-media** | image, video, tts | 10 | Imagen 4.0, Veo 2/3/3.1, Gemini TTS (Flash/Pro) | `GEMINI_API_KEY` |
 | **ollama** | LLM, embedding | 70 | 38 installed + 32 from Ollama web catalog | `localhost:11434` |
-| **hf-local** | image, video, tts, stt | 220 | HuggingFace catalog (FLUX, SDXL, Wan2.2, Whisper, MusicGen) | Always |
+| **hf-local** | image, video, tts, stt, music | 220 | HuggingFace catalog (FLUX, SDXL, Wan2.2, Whisper, MusicGen) | Always (no API key) |
+| **huggingface** | LLM, image, tts | dynamic | HuggingFace Inference API | `HUGGINGFACE_TOKEN` |
 | **comfyui** | image, video | dynamic | Installed checkpoints + CivitAI catalog | `localhost:8188` |
 | **openai-compat** | LLM | dynamic | llama.cpp, LM Studio, vLLM, LocalAI, KoboldCpp, Jan, TabbyAPI | Scans ports |
+| **fal** | image, video, tts | 867+ | FAL.ai (Flux, SDXL, Kling, Sora 2, Kokoro, ElevenLabs) | `FAL_KEY` |
 | **piper** | TTS | 2+ | Piper voices installed locally | Binary detection |
 | **whisper-local** | STT | 8 | Whisper/Faster-Whisper (tiny → large-v3) | Python detection |
 | **audiocraft** | music | 5 | MusicGen (small/medium/large/melody) + AudioGen | Python detection |
 
+### Modality-Filtered Sync — Only Fetch What You Need
+
+Sync **only the providers relevant to a specific modality** instead of fetching everything. This avoids unnecessary network requests (e.g., fetching 270+ HuggingFace READMEs when you only need LLMs).
+
+```typescript
+// Sync only LLM providers (Ollama, pi-ai, openai-compat, huggingface)
+await ai.syncModels('llm');
+
+// Sync only image providers (hf-local, comfyui, fal, huggingface)
+await ai.syncModels('image');
+
+// Sync only STT providers (whisper-local, hf-local)
+await ai.syncModels('stt');
+
+// Sync everything (backward compatible)
+await ai.syncModels();
+```
+
+**Which providers sync for each modality:**
+
+| Modality | Providers Synced |
+|---|---|
+| `llm` | pi-ai, ollama, openai-compat, huggingface (cloud) |
+| `image` | **openai-media** (GPT-Image-1, DALL-E), **google-media** (Imagen 4.0), hf-local, comfyui, fal, huggingface (cloud) |
+| `video` | **openai-media** (Sora 2/Pro), **google-media** (Veo 2/3/3.1), hf-local, comfyui, fal |
+| `tts` | **openai-media** (TTS-1, TTS-1-HD), **google-media** (Gemini TTS), hf-local, fal, piper, kokoro, huggingface (cloud) |
+| `stt` | **openai-media** (Whisper), hf-local, whisper-local |
+| `music` | hf-local (MusicGen, AudioLDM, etc.), audiocraft |
+| `embedding` | ollama |
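
Reviewer note: the modality table added above can be read as a plain lookup from modality to provider list. The sketch below is one way to model that lookup. It is not noosphere's implementation: `PROVIDERS_BY_MODALITY` and `providersToSync` are hypothetical names, and the provider lists are copied from the table rows, so it only illustrates which providers a filtered `syncModels(...)` call is documented to touch.

```typescript
// Hypothetical sketch; not taken from the noosphere source.
type Modality = 'llm' | 'image' | 'video' | 'tts' | 'stt' | 'music' | 'embedding';

const ALL_MODALITIES: Modality[] = ['llm', 'image', 'video', 'tts', 'stt', 'music', 'embedding'];

// Provider lists copied from the "Which providers sync for each modality" table.
const PROVIDERS_BY_MODALITY: Record<Modality, readonly string[]> = {
  llm: ['pi-ai', 'ollama', 'openai-compat', 'huggingface'],
  image: ['openai-media', 'google-media', 'hf-local', 'comfyui', 'fal', 'huggingface'],
  video: ['openai-media', 'google-media', 'hf-local', 'comfyui', 'fal'],
  tts: ['openai-media', 'google-media', 'hf-local', 'fal', 'piper', 'kokoro', 'huggingface'],
  stt: ['openai-media', 'hf-local', 'whisper-local'],
  music: ['hf-local', 'audiocraft'],
  embedding: ['ollama'],
};

// With a modality, return only its providers; with no argument, return every
// provider exactly once (the "sync everything" case).
function providersToSync(modality?: Modality): string[] {
  const modalities = modality === undefined ? ALL_MODALITIES : [modality];
  const result: string[] = [];
  for (const m of modalities) {
    for (const provider of PROVIDERS_BY_MODALITY[m]) {
      if (result.indexOf(provider) === -1) {
        result.push(provider);
      }
    }
  }
  return result;
}
```

Under this reading, `providersToSync('embedding')` yields only `ollama`, which matches the README's claim that a filtered sync avoids requests to unrelated providers.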
+
 ### Models by Modality
 
 ```typescript
@@ -478,6 +533,38 @@ const comfyModels = models.filter(m => m.provider === 'comfyui');
 const civitai = comfyModels.filter(m => m.status === 'available');
 ```
 
+### Model Descriptions — Dynamic from Source
+
+Every model includes a `description` field fetched dynamically from its source — no hardcoded strings:
+
+```typescript
+const models = await ai.getModels('llm');
+
+for (const m of models) {
+  console.log(m.name, m.description);
+  // "llama3.1" "Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B"
+  // "qwen3" "Qwen3 is the latest generation of large language models in Qwen series"
+  // "gemma3" "The current, most capable model that runs on a single GPU"
+}
+
+const imageModels = await ai.getModels('image');
+for (const m of imageModels) {
+  console.log(m.name, m.description);
+  // "stable-diffusion-xl-base-1.0" "Stable Diffusion XL (SDXL) is a latent text-to-image..."
+  // "FLUX.1-dev" "FLUX.1 [dev] is a 12 billion parameter rectified flow..."
+}
+```
+
+| Provider | Description Source |
+|---|---|
+| **Ollama** | Scraped from `ollama.com/library` page |
+| **HuggingFace Local** | Parsed from each model's `README.md` on HuggingFace Hub |
+| **CivitAI/ComfyUI** | Extracted from CivitAI API response |
+| **Whisper** | Parsed from OpenAI's Whisper README on HuggingFace |
+| **AudioCraft** | Parsed from Meta's AudioCraft README on HuggingFace |
+
+All description fetches are **parallel and fail-safe** — if a source is unreachable, models are returned without descriptions. No API keys required.
+
 ### Model Status & Local Info
 
 Every local model includes rich metadata:
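
Reviewer note: the README text added in this hunk describes description fetches as parallel and fail-safe. A minimal sketch of that pattern follows, assuming a caller-supplied `fetchDescription` function. `ModelStub`, `DescriptionFetcher`, and `attachDescriptions` are hypothetical names for illustration and are not part of noosphere's documented API; the package's real fetch logic lives in `dist/index.js` and may differ.

```typescript
// Sketch only: these names are hypothetical, not types exported by noosphere.
interface ModelStub {
  name: string;
  description?: string;
}

type DescriptionFetcher = (name: string) => Promise<string>;

// Start every fetch at once. A failed fetch resolves to undefined instead of
// rejecting, so one unreachable source cannot fail the whole batch.
async function attachDescriptions(
  models: ModelStub[],
  fetchDescription: DescriptionFetcher,
): Promise<ModelStub[]> {
  const descriptions = await Promise.all(
    models.map((m) =>
      Promise.resolve()
        .then(() => fetchDescription(m.name))
        .catch(() => undefined),
    ),
  );
  // Models whose fetch failed are returned unchanged.
  return models.map((m, i) => {
    const description = descriptions[i];
    return description === undefined ? { ...m } : { ...m, description };
  });
}
```

Catching per promise, rather than around the whole `Promise.all`, is what keeps the successful descriptions when a single source is down.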
@@ -486,6 +573,8 @@ Every local model includes rich metadata:
 interface ModelInfo {
   id: string;
   provider: string;
+  name: string;
+  description?: string; // Dynamic from source (Ollama library, HF README, CivitAI)
   modality: 'llm' | 'image' | 'video' | 'tts' | 'stt' | 'music' | 'embedding';
   status?: 'installed' | 'available' | 'downloading' | 'running' | 'error';
   local: boolean;
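
Reviewer note: because `description` and `status` are optional in the `ModelInfo` hunk above, consumers need a fallback when either is absent. The sketch below shows one way to handle that. It uses only the fields visible in the hunk (the full interface is cut off by the diff context), and `ModelInfoSubset` and `describeModel` are hypothetical names, not exports of the package.

```typescript
// Subset of ModelInfo limited to the fields visible in the hunk above.
interface ModelInfoSubset {
  id: string;
  provider: string;
  name: string;
  description?: string;
  modality: 'llm' | 'image' | 'video' | 'tts' | 'stt' | 'music' | 'embedding';
  status?: 'installed' | 'available' | 'downloading' | 'running' | 'error';
  local: boolean;
}

// Hypothetical helper: one display line per model, tolerating a missing
// description or status.
function describeModel(m: ModelInfoSubset): string {
  const status = m.status ?? 'unknown';
  const description = m.description ?? '(no description)';
  return `${m.provider}/${m.name} [${m.modality}, ${status}]: ${description}`;
}
```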