@j-o-r/hello-dave 0.0.10 → 0.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +2 -0
- package/README.md.bak.1779452127 +240 -0
- package/TODO.md +30 -8
- package/agents/code_agent.js +6 -6
- package/agents/daisy_agent.js +10 -7
- package/agents/minimax.js +173 -0
- package/agents/stability.js +173 -0
- package/bin/codeDave +1 -1
- package/bin/dave.js +1 -1
- package/docs/music-toolsets.md +137 -0
- package/docs/plans/minimax-music-generation.md +80 -0
- package/docs/plans/unified-agent-architecture.md +146 -0
- package/docs/plans/websocket-streaming-plan.md.bak +317 -0
- package/docs/prompt/task_clarification_and_documentation.md +35 -0
- package/lib/API/minimax/ImageToolset.js +169 -0
- package/lib/API/minimax/MusicToolset.js +290 -0
- package/lib/API/minimax/VideoToolset.js +296 -0
- package/lib/API/minimax/image.generation.md +239 -0
- package/lib/API/minimax/image.js +219 -0
- package/lib/API/minimax/image.to.image.md +257 -0
- package/lib/API/minimax/index.js +16 -0
- package/lib/API/minimax/music.cover.preprocess.md +206 -0
- package/lib/API/minimax/music.generation.md +346 -0
- package/lib/API/minimax/music.js +257 -0
- package/lib/API/minimax/music.lyrics.generation.md +205 -0
- package/lib/API/minimax/video.download.md +133 -0
- package/lib/API/minimax/video.first.last.image.md +186 -0
- package/lib/API/minimax/video.from.image.md +206 -0
- package/lib/API/minimax/video.from.subject.md +164 -0
- package/lib/API/minimax/video.generation.md +192 -0
- package/lib/API/minimax/video.js +339 -0
- package/lib/API/minimax/video.query.md +128 -0
- package/lib/API/stability.ai/ImageToolset.js +357 -0
- package/lib/API/stability.ai/MusicToolset.js +302 -0
- package/lib/API/stability.ai/audio-3.md +205 -0
- package/lib/API/stability.ai/audio.js +679 -0
- package/lib/API/stability.ai/image.js +911 -0
- package/lib/API/stability.ai/image.md +271 -0
- package/lib/API/stability.ai/index.js +11 -0
- package/lib/API/stability.ai/openapi.json +17118 -0
- package/lib/API/x.ai/ImageToolset.js +165 -0
- package/lib/API/x.ai/image.editing.md +86 -0
- package/lib/API/x.ai/image.js +393 -0
- package/lib/API/x.ai/image.md +213 -0
- package/lib/API/x.ai/image.to.generation.md +494 -0
- package/lib/API/x.ai/image.to.video.md +23 -0
- package/lib/API/x.ai/index.js +7 -0
- package/lib/AgentManager.js +1 -1
- package/lib/CdnToolset.js +191 -0
- package/lib/ToolSet.js +19 -1
- package/lib/cdn.js +373 -0
- package/lib/fafs.js +3 -1
- package/lib/genericToolset.js +43 -166
- package/lib/index.js +9 -1
- package/package.json +2 -2
- package/types/API/minimax/ImageToolset.d.ts +3 -0
- package/types/API/minimax/MusicToolset.d.ts +3 -0
- package/types/API/minimax/VideoToolset.d.ts +3 -0
- package/types/API/minimax/image.d.ts +109 -0
- package/types/API/minimax/index.d.ts +15 -0
- package/types/API/minimax/music.d.ts +46 -0
- package/types/API/minimax/video.d.ts +165 -0
- package/types/API/stability.ai/ImageToolset.d.ts +3 -0
- package/types/API/stability.ai/MusicToolset.d.ts +3 -0
- package/types/API/stability.ai/audio.d.ts +193 -0
- package/types/API/stability.ai/image.d.ts +274 -0
- package/types/API/stability.ai/index.d.ts +11 -0
- package/types/API/x.ai/ImageToolset.d.ts +3 -0
- package/types/API/x.ai/image.d.ts +82 -0
- package/types/API/x.ai/index.d.ts +7 -0
- package/types/AgentManager.d.ts +1 -1
- package/types/CdnToolset.d.ts +20 -0
- package/types/ToolSet.d.ts +8 -0
- package/types/cdn.d.ts +141 -0
- package/types/index.d.ts +9 -2
- package/docs/multi-agent-clusters.md.bak +0 -229
@@ -0,0 +1,205 @@
# Stable Audio 3 API Specifications

> Extracted from `lib/API/stability.ai/openapi.json` for Stable Audio 3.0 (`stable-audio-3` model).
> This document serves as the source specification for generating the HTTP wrapper `audio.js` (similar to `lib/API/minimax/music.js` and associated `.md` files).

## Overview

**Stable Audio 3.0**: Fast, Best-Quality, Long-Form Music & Audio Generation

Our most advanced audio generation model, capable of generating up to 6-minute, 44.1 kHz stereo compositions. Stable Audio 3.0 supports text-to-audio, audio-to-audio, and audio-inpaint workflows - allowing creators to upload a sound and transform it into new instruments, styles, or genres using natural language prompts. Ideal for music production, cinematic sound design, and remixing.

- **Model ID**: `stable-audio-3`
- **Credits**: Flat rate of 26 credits per successful generation (not charged for failed generations).
- **Max Duration**: 380 seconds (default 190s)
- **Output Formats**: `mp3` (default), `wav`
- **Sample Rate**: 44.1 kHz stereo
- **Base URL**: `https://api.stability.ai`
- **Authentication**: Bearer token in `authorization` header (`Bearer sk-...`)
- **Async Workflow**: All generation endpoints return HTTP 202 with a `generation_id`. Poll `GET /v2beta/audio/results/{id}` until 200 (audio ready) or error.
- **Accept Header**: Use `audio/*` for direct binary audio response, or `application/json` for base64-encoded JSON.
- **Notes**:
  - No copyrighted content allowed.
  - Max request size: 100MB for audio uploads.
  - Audio input validation: 6 to 380 seconds.
  - English is the only supported language (per error examples).
  - See official docs: [Stable Audio 3.0 announcement](https://stability.ai/news/stable-audio-3-0)

## Endpoints

### 1. Text-to-Audio
**POST** `/v2beta/audio/stable-audio/text-to-audio`

Generates audio from a text prompt. No input audio required.

#### Request
- **Content-Type**: `multipart/form-data`
- **Headers**:
  - `authorization`: Bearer API key (required)
  - `accept`: `audio/*` | `application/json` (optional, default `audio/*`)
  - Other client headers optional (stability-client-id, etc.)

#### Request Body Parameters (multipart/form-data)
| Parameter       | Type    | Required | Default          | Description |
|-----------------|---------|----------|------------------|-------------|
| `prompt`        | string  | Yes      | -                | Strong, descriptive prompt defining instruments, moods, styles, genre. Max 10000 chars. |
| `model`         | string  | No       | `stable-audio-3` | Must be `stable-audio-3`. |
| `duration`      | number  | No       | 190              | Duration in seconds (1-380). |
| `seed`          | number  | No       | 0                | Random seed (0 for random, 0-4294967294). |
| `steps`         | integer | No       | 8                | Sampling steps (4-8). |
| `cfg_scale`     | number  | No       | 1                | Prompt adherence (1-25). |
| `output_format` | string  | No       | `mp3`            | `mp3` or `wav`. |

#### Responses
- **202 Accepted**: `{ "id": "generation_id" }` - Poll results endpoint.
- **400/422**: Validation or content errors (e.g., invalid params, copyrighted content, language issues).
- **403**: Content moderation flag.
- **429**: Rate limit.
- **500**: Server error.

#### Example (from OpenAPI)
See code samples in openapi.json for Python, JS (axios + FormData), cURL.
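
The request above can be sketched as follows, assuming Node.js 18+ (global `FormData` and `fetch`). The helper name `buildTextToAudioRequest` is hypothetical and is not taken from `audio.js`; only the path, field names, and headers come from this specification.

```javascript
// Hypothetical helper (not from audio.js): assembles the multipart request
// for POST /v2beta/audio/stable-audio/text-to-audio.
const BASE_URL = 'https://api.stability.ai';
const OPTIONAL_FIELDS = ['model', 'duration', 'seed', 'steps', 'cfg_scale', 'output_format'];

function buildTextToAudioRequest(apiKey, params) {
  if (typeof params.prompt !== 'string' || params.prompt.length === 0) {
    throw new Error('prompt: is required');
  }
  const form = new FormData();
  form.append('prompt', params.prompt);
  // Send only the optional fields the caller set; the API applies its own defaults.
  for (const key of OPTIONAL_FIELDS) {
    if (params[key] !== undefined) form.append(key, String(params[key]));
  }
  return {
    url: `${BASE_URL}/v2beta/audio/stable-audio/text-to-audio`,
    init: {
      method: 'POST',
      // No manual content-type: fetch adds the multipart boundary for FormData bodies.
      headers: { authorization: `Bearer ${apiKey}`, accept: 'audio/*' },
      body: form,
    },
  };
}

// Usage (performs a real network call, so it is left commented out):
// const { url, init } = buildTextToAudioRequest(apiKey, { prompt: 'warm lo-fi piano', duration: 30 });
// const res = await fetch(url, init); // expect HTTP 202 with { id }
```

Returning the URL and `init` object instead of calling `fetch` directly keeps the helper easy to inspect without network access.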

### 2. Audio-to-Audio
**POST** `/v2beta/audio/stable-audio/audio-to-audio`

Transforms an existing audio sample using a text prompt.

#### Request
- Same headers as Text-to-Audio.
- **Required Files/Fields**: `audio` (binary file: mp3/wav, 6-380s).

#### Request Body Parameters (multipart/form-data)
| Parameter       | Type    | Required | Default          | Description |
|-----------------|---------|----------|------------------|-------------|
| `prompt`        | string  | Yes      | -                | Same as text-to-audio. |
| `model`         | string  | No       | `stable-audio-3` | `stable-audio-3`. |
| `duration`      | number  | No       | 190              | 1-380s. |
| `seed`          | number  | No       | 0                | 0-4294967294. |
| `steps`         | integer | No       | 8                | 4-8. |
| `cfg_scale`     | number  | No       | 1                | 1-25. |
| `output_format` | string  | No       | `mp3`            | `mp3` or `wav`. |
| `strength`      | number  | No       | 1                | Denoising strength (0-1). 0 = identical to input, 1 = no audio influence. |
| `audio`         | file    | Yes      | -                | Input audio file (mp3/wav). |

#### Responses
Same as Text-to-Audio (202 with id, errors, etc.).
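
As an illustration of the extra fields this endpoint takes, here is a minimal sketch that checks `strength` and attaches the audio bytes. It assumes Node.js 18+ (global `FormData` and `Blob`); `buildAudioToAudioForm` and the default filename `input.mp3` are hypothetical names chosen for this example.

```javascript
// Hypothetical helper (not from audio.js): builds the multipart body for
// POST /v2beta/audio/stable-audio/audio-to-audio.
function buildAudioToAudioForm(params, audioBytes, filename = 'input.mp3') {
  if (typeof params.prompt !== 'string' || params.prompt.length === 0) {
    throw new Error('prompt: is required');
  }
  const { strength } = params;
  // strength is optional; when given it must lie in the documented 0-1 range.
  if (strength !== undefined && !(strength >= 0 && strength <= 1)) {
    throw new RangeError('strength: must be between 0 and 1');
  }
  const form = new FormData();
  form.append('prompt', params.prompt);
  if (strength !== undefined) form.append('strength', String(strength));
  // The third argument sets the filename reported in the multipart part.
  form.append('audio', new Blob([audioBytes]), filename);
  return form;
}
```

The remaining optional fields (`duration`, `seed`, and so on) would be appended the same way as `strength`.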

### 3. Inpaint
**POST** `/v2beta/audio/stable-audio/inpaint`

Replaces a section of an audio sample, selected by a time mask, using a text prompt.

#### Request
- Same headers as Text-to-Audio.
- **Required**: `audio` file + `prompt`.

#### Request Body Parameters (multipart/form-data)
| Parameter       | Type    | Required | Default          | Description |
|-----------------|---------|----------|------------------|-------------|
| `prompt`        | string  | Yes      | -                | Same as text-to-audio. |
| `model`         | string  | No       | `stable-audio-3` | `stable-audio-3`. |
| `duration`      | number  | No       | 190              | 1-380s. |
| `seed`          | number  | No       | 0                | ... |
| `steps`         | integer | No       | 8                | ... |
| `cfg_scale`     | number  | No       | 1                | ... |
| `output_format` | string  | No       | `mp3`            | ... |
| `mask_start`    | number  | No       | 30               | Start time in seconds for inpaint segment (0-380). |
| `mask_end`      | number  | No       | 380              | End time in seconds (0-380). |
| `audio`         | file    | Yes      | -                | Input audio file. |

#### Responses
Same as Text-to-Audio.
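
A client would likely want to check the mask window before uploading. The sketch below uses the documented range (0-380) and defaults (30 and 380); the rule that `mask_start` must be smaller than `mask_end` is an assumption not stated in this specification, and `validateMask` is a hypothetical name.

```javascript
// Hypothetical helper (not from audio.js): validates the inpaint time mask.
function validateMask(maskStart = 30, maskEnd = 380) {
  for (const [name, value] of [['mask_start', maskStart], ['mask_end', maskEnd]]) {
    if (typeof value !== 'number' || Number.isNaN(value) || value < 0 || value > 380) {
      throw new RangeError(`${name}: must be a number between 0 and 380`);
    }
  }
  // Assumption: an empty or inverted window is rejected client-side.
  if (maskStart >= maskEnd) {
    throw new RangeError('mask_start: must be less than mask_end');
  }
  return { mask_start: maskStart, mask_end: maskEnd };
}
```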

### 4. Fetch Audio Result
**GET** `/v2beta/audio/results/{id}`

Polls for the result of a generation. Use after receiving a 202 from one of the generation endpoints.

#### Path Parameters
- `id`: Generation ID from 202 response.

#### Headers
- `authorization`: Bearer key (required)
- `accept`: `audio/*` (binary) or `application/json` (base64)

#### Responses
- **200 OK**:
  - If `accept: audio/*`: Binary audio bytes + headers: `content-type`, `finish-reason: SUCCESS`, `seed`, `x-request-id`.
  - If `accept: application/json`: JSON `{ "audio": "base64...", "seed": ..., "finish_reason": "SUCCESS" }`
- **202**: Still in progress: `{ "id": "...", "status": "in-progress" }`
- **404**: Generation not found or expired.
- Other errors as above.

#### Polling Recommendation
Poll every 5-10 seconds until 200 or error.
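
The polling behaviour described above can be sketched as a loop that repeats on 202, returns on 200, and fails on anything else. This is a minimal sketch assuming Node.js 18+ (global `fetch`); `pollResult`, the injectable `fetchImpl` option, and the 10-minute default timeout are assumptions made for this example, not part of the specification.

```javascript
// Hypothetical helper (not from audio.js): polls GET /v2beta/audio/results/{id}.
async function pollResult(id, apiKey, options = {}) {
  // fetchImpl is injectable so the loop can be exercised without network access.
  const { fetchImpl = fetch, intervalMs = 5000, timeoutMs = 600000 } = options;
  const url = `https://api.stability.ai/v2beta/audio/results/${encodeURIComponent(id)}`;
  const headers = { authorization: `Bearer ${apiKey}`, accept: 'application/json' };
  const deadline = Date.now() + timeoutMs;

  for (;;) {
    const res = await fetchImpl(url, { headers });
    if (res.status === 200) return res.json(); // { audio, seed, finish_reason }
    if (res.status !== 202) {
      throw new Error(`Stability API error: HTTP ${res.status}`);
    }
    // 202 means still in progress: give up if the next attempt would pass the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out waiting for generation ${id}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Requesting `application/json` keeps the return value a plain object; a variant using `accept: audio/*` would read the body with `res.arrayBuffer()` instead.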

## Common Error Responses
- **400 Bad Request**: Invalid parameters.
- **403 Forbidden**: Content moderation.
- **422 Unprocessable**: Well-formed but rejected (e.g., copyrighted content, invalid language).
- **429 Too Many Requests**: Rate limit exceeded.
- **500 Internal Server Error**: Server issue.

Error body example:
```json
{
  "id": "error-id",
  "name": "bad_request",
  "errors": ["some-field: is required"]
}
```
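
One way a wrapper might surface this body is to fold it into a single `Error`. The sketch below is illustrative only: `toStabilityError` is a hypothetical name, and marking 429 and 5xx responses as retryable is an assumption rather than documented API behaviour.

```javascript
// Hypothetical helper (not from audio.js): converts an error body into an Error.
function toStabilityError(status, body) {
  const name = body?.name ?? 'unknown_error';
  const details = Array.isArray(body?.errors) ? body.errors.join('; ') : 'no details';
  const err = new Error(`${name} (HTTP ${status}): ${details}`);
  err.status = status;
  err.id = body?.id;
  // Assumption: only rate limits and server errors are worth retrying.
  err.retryable = status === 429 || status >= 500;
  return err;
}
```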

## OpenAPI Excerpts (Key Paths)

### Text-to-Audio Path
```json
{
  "post": {
    "tags": ["Stable Audio"],
    "summary": "Text-to-Audio",
    "description": "Stable Audio transforms existing audio samples into new high-quality compositions up to six minutes long at 44.1kHz stereo using text instructions. ...",
    "x-launchDarklyEnableFlag": "allow-access-stable-audio-3-api",
    "requestBody": {
      "content": {
        "multipart/form-data": {
          "schema": {
            "type": "object",
            "properties": {
              "prompt": { "type": "string", "maxLength": 10000, ... },
              "model": { "type": "string", "enum": ["stable-audio-3"], "default": "stable-audio-3", ... },
              "duration": { "type": "number", "minimum": 1, "maximum": 380, "default": 190, ... },
              "seed": { ... },
              "steps": { "type": "integer", "minimum": 4, "maximum": 8, "default": 8, ... },
              "cfg_scale": { ... },
              "output_format": { "type": "string", "enum": ["mp3", "wav"], "default": "mp3", ... }
            },
            "required": ["prompt"]
          }
        }
      }
    },
    "responses": { "202": { ... }, "400": {...}, ... }
  }
}
```

(Similar structures for audio-to-audio with added `strength` and `audio` file; inpaint with `mask_start`, `mask_end`, `audio`; results as GET.)

## Code Generation Notes for `audio.js`
- Implement async methods: `textToAudio(params)`, `audioToAudio(params, audioFile)`, `inpaint(params, audioFile)`, `fetchResult(id, acceptHeader)`.
- Use FormData for multipart uploads.
- Handle 202 polling loop with timeout/retry.
- Support both binary download and base64 JSON.
- Add validation for duration (1-380), strength (0-1), masks, etc.
- Error handling matching Minimax style (specific error messages, retries).
- Default to `stable-audio-3` model, mp3 output.
- Preserve compatibility with potential future stable-audio-2 if needed, but focus on v3.
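
For the base64 JSON path mentioned in the notes above, decoding could look like the following sketch. It relies on Node.js's global `Buffer`; `decodeAudioResult` is a hypothetical name, and treating any `finish_reason` other than `SUCCESS` as a failure is an assumption, since this document only shows the `SUCCESS` value.

```javascript
// Hypothetical helper (not from audio.js): decodes the application/json result
// of GET /v2beta/audio/results/{id} into raw audio bytes.
function decodeAudioResult(result) {
  // Assumption: anything other than SUCCESS means there is no usable audio.
  if (result.finish_reason !== 'SUCCESS') {
    throw new Error(`Generation finished with ${result.finish_reason}`);
  }
  if (typeof result.audio !== 'string') {
    throw new Error('audio: expected a base64 string');
  }
  return { bytes: Buffer.from(result.audio, 'base64'), seed: result.seed };
}
```

The returned `bytes` buffer can be written to disk as-is, using whichever extension matches the requested `output_format`.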

## References
- Full OpenAPI: `lib/API/stability.ai/openapi.json`
- Related: Stable Audio 2 endpoints exist under `/stable-audio-2/` (different model).
- Official: https://platform.stability.ai/docs (or stableaudio.com)

---
*Generated on 2026-05-22. Specs extracted via jq from openapi.json paths: /v2beta/audio/stable-audio/* and /v2beta/audio/results/{id}.*