elevenlabs_client 0.3.0 → 0.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +52 -1
- data/README.md +78 -1
- data/lib/elevenlabs_client/client.rb +63 -1
- data/lib/elevenlabs_client/endpoints/audio_isolation.rb +71 -0
- data/lib/elevenlabs_client/endpoints/audio_native.rb +103 -0
- data/lib/elevenlabs_client/endpoints/dubs.rb +208 -2
- data/lib/elevenlabs_client/endpoints/forced_alignment.rb +41 -0
- data/lib/elevenlabs_client/endpoints/speech_to_speech.rb +125 -0
- data/lib/elevenlabs_client/endpoints/speech_to_text.rb +108 -0
- data/lib/elevenlabs_client/endpoints/text_to_dialogue_stream.rb +50 -0
- data/lib/elevenlabs_client/endpoints/text_to_speech_stream.rb +1 -0
- data/lib/elevenlabs_client/endpoints/text_to_speech_stream_with_timestamps.rb +75 -0
- data/lib/elevenlabs_client/endpoints/text_to_speech_with_timestamps.rb +73 -0
- data/lib/elevenlabs_client/endpoints/voices.rb +362 -0
- data/lib/elevenlabs_client/endpoints/websocket_text_to_speech.rb +250 -0
- data/lib/elevenlabs_client/version.rb +1 -1
- data/lib/elevenlabs_client.rb +9 -2
- metadata +25 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f1c75ecb60655822ec4d8b88e22ebae5e0a1714e5573000cd5a36c3e80bcb886
+  data.tar.gz: 5d05b4e838bc30cbc1c290b615b1c0d686ea6d6aafe9521097ddc00d0ba28189
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e26733f1b2ddaaec79432e7458f2af56b50d0f29bb52bdddc4fcbdbb564c85eea40949c7304fef7a4af3da5ff2c364bb42341b3755bf385cc6e81bb429f81aa5
+  data.tar.gz: 59f527fa65e17375fa3a33eb8f7d140a3f59ccd87f51168e6f43bac5c94c3d93fa49026dc4ee6fbe5eb4fb7b0a772f6e6925b6051092f112b12936e3b154009e
data/CHANGELOG.md
CHANGED
@@ -5,6 +5,57 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.5.0] - 2025-09-14
+
+### Added
+
+- Text-to-Speech With Timestamps
+  - `client.text_to_speech_with_timestamps.generate(voice_id, text, **options)`
+  - Character-level `alignment` and `normalized_alignment`
+- Streaming Text-to-Speech With Timestamps
+  - `client.text_to_speech_stream_with_timestamps.stream(voice_id, text, **options, &block)`
+  - JSON streaming with audio chunks and timing per chunk
+- WebSocket Streaming Enhancements
+  - Single-context and multi-context improvements; correct query param ordering and filtering
+  - Docs: `docs/WEBSOCKET_STREAMING.md`
+- Text-to-Dialogue Streaming
+  - `client.text_to_dialogue_stream.stream(inputs, **options, &block)`
+  - Docs: `docs/TEXT_TO_DIALOGUE_STREAMING.md`
+
+### Improved
+
+- Client streaming JSON handling for timestamp streams (`post_streaming_with_timestamps`)
+- Robust parsing and block yielding across streaming tests
+- URL query parameter ordering to match expectations in tests
+
+### Tests
+
+- Added comprehensive unit and integration tests for all new endpoints
+- Full suite now: 687 examples, 0 failures
+
+### Notes
+
+- These features require valid ElevenLabs API keys and correct model/voice permissions
+
+## [0.4.0] - 2025-09-12
+
+### Added
+
+- **🎵 Dubbing Generation API**
+  - `delete(dubbing_id)` - Delete dubbing projects
+  - `get_resource(dubbing_id)` - Get detailed resource information
+  - `create_segment(options)` - Create new segments
+  - `delete_segment(options)` - Delete segments
+  - `update_segment(options)` - Update segment text/timing
+  - `transcribe_segment(options)` - Regenerate transcriptions
+  - `translate_segment(options)` - Regenerate translations
+  - `dub_segment(options)` - Regenerate dubs
+  - `render_project(options)` - Render output media
+  - `update_speaker(options)` - Update speaker voices
+  - `get_similar_voices(options)` - Get voice recommendations
+- **🔧 HTTP Client Improvements** - Added HTTP method
+  - Added `patch` method for PATCH requests
+
 ## [0.3.0] - 2025-09-12
 
 ### Added
@@ -252,4 +303,4 @@ client.dubs.create(file_io: file, filename: "video.mp4", target_languages: ["es"
 - **File Support**: Multiple video and audio formats (MP4, MOV, MP3, WAV, etc.)
 - **Language Support**: Multiple target languages for dubbing
 - **Configuration**: Flexible API key and endpoint configuration
-- **Testing**: Comprehensive test suite with integration tests
+- **Testing**: Comprehensive test suite with integration tests
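The character-level `alignment` data introduced in 0.5.0 pairs each character with start/end times. As an illustrative sketch of consuming it, the snippet below derives word timings from character timings; the field names follow the API's documented alignment shape, but the sample values and the `words_with_timings` helper are fabricated here for demonstration, not part of the gem:

```ruby
# Hypothetical sample of the character-level alignment structure returned by
# the with-timestamps endpoints: parallel arrays of characters and their
# start/end times in seconds. Values below are made up for illustration.
alignment = {
  "characters" => ["H", "i", " ", "y", "o", "u"],
  "character_start_times_seconds" => [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
  "character_end_times_seconds"   => [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
}

# Group consecutive non-space characters into words, merging their timings.
def words_with_timings(alignment)
  words = []
  current = nil
  alignment["characters"].each_with_index do |ch, i|
    start_t = alignment["character_start_times_seconds"][i]
    end_t   = alignment["character_end_times_seconds"][i]
    if ch == " "
      words << current if current
      current = nil
    else
      current ||= { "text" => "", "start" => start_t }
      current["text"] += ch
      current["end"] = end_t
    end
  end
  words << current if current
  words
end

words_with_timings(alignment).each do |word|
  puts "#{word['text']}: #{word['start']}s - #{word['end']}s"
end
```

The same grouping applies to `normalized_alignment`, which uses the normalized form of the input text.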
data/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 [![Gem Version](https://badge.fury.io/rb/elevenlabs_client.svg)](https://badge.fury.io/rb/elevenlabs_client)
 
-A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, sound effects,
+A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, sound effects, AI music composition, voice transformation, speech transcription, audio isolation, and advanced audio processing features.
 
 ## Features
 
@@ -13,6 +13,11 @@ A comprehensive Ruby client library for the ElevenLabs API, supporting voice syn
 🎵 **Music Generation** - AI-powered music composition and streaming
 🎨 **Voice Design** - Create custom voices from text descriptions
 🎭 **Voice Management** - Create, edit, and manage individual voices
+🔄 **Speech-to-Speech** - Transform audio from one voice to another (Voice Changer)
+📝 **Speech-to-Text** - Transcribe audio and video files with advanced features
+🔇 **Audio Isolation** - Remove background noise from audio files
+📱 **Audio Native** - Create embeddable audio players for websites
+⏱️ **Forced Alignment** - Get precise timing information for audio transcripts
 🤖 **Models** - List available models and their capabilities
 📡 **Streaming** - Real-time audio streaming
 ⚙️ **Configurable** - Flexible configuration options
@@ -141,6 +146,60 @@ music_data = client.music.compose(
 )
 File.open("generated_music.mp3", "wb") { |f| f.write(music_data) }
 
+# Speech-to-Speech (Voice Changer)
+File.open("input_audio.mp3", "rb") do |audio_file|
+  converted_audio = client.speech_to_speech.convert(
+    "target_voice_id",
+    audio_file,
+    "input_audio.mp3",
+    remove_background_noise: true
+  )
+  File.open("converted_audio.mp3", "wb") { |f| f.write(converted_audio) }
+end
+
+# Speech-to-Text Transcription
+File.open("audio.mp3", "rb") do |audio_file|
+  transcription = client.speech_to_text.create(
+    "scribe_v1",
+    file: audio_file,
+    filename: "audio.mp3",
+    diarize: true,
+    timestamps_granularity: "word"
+  )
+  puts "Transcribed: #{transcription['text']}"
+end
+
+# Audio Isolation (Background Noise Removal)
+File.open("noisy_audio.mp3", "rb") do |audio_file|
+  clean_audio = client.audio_isolation.isolate(audio_file, "noisy_audio.mp3")
+  File.open("clean_audio.mp3", "wb") { |f| f.write(clean_audio) }
+end
+
+# Audio Native (Embeddable Player)
+File.open("article.html", "rb") do |html_file|
+  project = client.audio_native.create(
+    "My Article",
+    file: html_file,
+    filename: "article.html",
+    voice_id: "voice_id",
+    auto_convert: true
+  )
+  puts "Player HTML: #{project['html_snippet']}"
+end
+
+# Forced Alignment
+File.open("speech.wav", "rb") do |audio_file|
+  alignment = client.forced_alignment.create(
+    audio_file,
+    "speech.wav",
+    "Hello world, this is a test transcript"
+  )
+
+  alignment['words'].each do |word|
+    puts "#{word['text']}: #{word['start']}s - #{word['end']}s"
+  end
+end
+
 # Streaming Text-to-Speech
 client.text_to_speech_stream.stream("voice_id", "Streaming text") do |chunk|
   # Process audio chunk in real-time
@@ -160,6 +219,11 @@ end
 - **[Music Generation API](docs/MUSIC.md)** - AI-powered music composition and streaming
 - **[Text-to-Voice API](docs/TEXT_TO_VOICE.md)** - Design and create custom voices
 - **[Voice Management API](docs/VOICES.md)** - Manage individual voices (CRUD operations)
+- **[Speech-to-Speech API](docs/SPEECH_TO_SPEECH.md)** - Transform audio from one voice to another
+- **[Speech-to-Text API](docs/SPEECH_TO_TEXT.md)** - Transcribe audio and video files
+- **[Audio Isolation API](docs/AUDIO_ISOLATION.md)** - Remove background noise from audio
+- **[Audio Native API](docs/AUDIO_NATIVE.md)** - Create embeddable audio players
+- **[Forced Alignment API](docs/FORCED_ALIGNMENT.md)** - Get precise timing information
 - **[Models API](docs/MODELS.md)** - List available models and capabilities
 
 ### Available Endpoints
@@ -174,6 +238,11 @@ end
 | `client.music.*` | AI music composition and streaming | [MUSIC.md](docs/MUSIC.md) |
 | `client.text_to_voice.*` | Voice design and creation | [TEXT_TO_VOICE.md](docs/TEXT_TO_VOICE.md) |
 | `client.voices.*` | Voice management (CRUD) | [VOICES.md](docs/VOICES.md) |
+| `client.speech_to_speech.*` | Voice changer and audio transformation | [SPEECH_TO_SPEECH.md](docs/SPEECH_TO_SPEECH.md) |
+| `client.speech_to_text.*` | Audio/video transcription | [SPEECH_TO_TEXT.md](docs/SPEECH_TO_TEXT.md) |
+| `client.audio_isolation.*` | Background noise removal | [AUDIO_ISOLATION.md](docs/AUDIO_ISOLATION.md) |
+| `client.audio_native.*` | Embeddable audio players | [AUDIO_NATIVE.md](docs/AUDIO_NATIVE.md) |
+| `client.forced_alignment.*` | Audio-text timing alignment | [FORCED_ALIGNMENT.md](docs/FORCED_ALIGNMENT.md) |
 | `client.models.*` | Model information and capabilities | [MODELS.md](docs/MODELS.md) |
 
 ## Configuration Options
@@ -221,6 +290,9 @@ end
 - `AuthenticationError` - Invalid API key or authentication failure
 - `RateLimitError` - Rate limit exceeded
 - `ValidationError` - Invalid request parameters
+- `NotFoundError` - Resource not found (e.g., voice ID, transcript ID)
+- `BadRequestError` - Bad request with invalid parameters
+- `UnprocessableEntityError` - Request cannot be processed (e.g., invalid file format)
 - `APIError` - General API errors
 
 ## Rails Integration
@@ -235,6 +307,11 @@ The gem is designed to work seamlessly with Rails applications. See the [example
 - [MusicController](examples/music_controller.rb) - AI music composition and streaming
 - [TextToVoiceController](examples/text_to_voice_controller.rb) - Voice design and creation
 - [VoicesController](examples/voices_controller.rb) - Voice management (CRUD operations)
+- [SpeechToSpeechController](examples/speech_to_speech_controller.rb) - Voice changer and audio transformation
+- [SpeechToTextController](examples/speech_to_text_controller.rb) - Audio/video transcription with advanced features
+- [AudioIsolationController](examples/audio_isolation_controller.rb) - Background noise removal and audio cleanup
+- [AudioNativeController](examples/audio_native_controller.rb) - Embeddable audio players for websites
+- [ForcedAlignmentController](examples/forced_alignment_controller.rb) - Audio-text timing alignment and subtitle generation
 
 ## Development
 
data/lib/elevenlabs_client/client.rb
CHANGED
@@ -2,12 +2,13 @@
 
 require "faraday"
 require "faraday/multipart"
+require "json"
 
 module ElevenlabsClient
   class Client
     DEFAULT_BASE_URL = "https://api.elevenlabs.io"
 
-    attr_reader :base_url, :api_key, :dubs, :text_to_speech, :text_to_speech_stream, :text_to_dialogue, :sound_generation, :text_to_voice, :models, :voices, :music
+    attr_reader :base_url, :api_key, :dubs, :text_to_speech, :text_to_speech_stream, :text_to_speech_with_timestamps, :text_to_speech_stream_with_timestamps, :text_to_dialogue, :text_to_dialogue_stream, :sound_generation, :text_to_voice, :models, :voices, :music, :audio_isolation, :audio_native, :forced_alignment, :speech_to_speech, :speech_to_text, :websocket_text_to_speech
 
     def initialize(api_key: nil, base_url: nil, api_key_env: "ELEVENLABS_API_KEY", base_url_env: "ELEVENLABS_BASE_URL")
       @api_key = api_key || fetch_api_key(api_key_env)
@@ -16,12 +17,21 @@ module ElevenlabsClient
       @dubs = Dubs.new(self)
       @text_to_speech = TextToSpeech.new(self)
       @text_to_speech_stream = TextToSpeechStream.new(self)
+      @text_to_speech_with_timestamps = TextToSpeechWithTimestamps.new(self)
+      @text_to_speech_stream_with_timestamps = TextToSpeechStreamWithTimestamps.new(self)
       @text_to_dialogue = TextToDialogue.new(self)
+      @text_to_dialogue_stream = TextToDialogueStream.new(self)
       @sound_generation = SoundGeneration.new(self)
       @text_to_voice = TextToVoice.new(self)
       @models = Models.new(self)
       @voices = Voices.new(self)
       @music = Endpoints::Music.new(self)
+      @audio_isolation = AudioIsolation.new(self)
+      @audio_native = AudioNative.new(self)
+      @forced_alignment = ForcedAlignment.new(self)
+      @speech_to_speech = SpeechToSpeech.new(self)
+      @speech_to_text = SpeechToText.new(self)
+      @websocket_text_to_speech = WebSocketTextToSpeech.new(self)
     end
 
     # Makes an authenticated GET request
@@ -61,6 +71,20 @@ module ElevenlabsClient
       handle_response(response)
     end
 
+    # Makes an authenticated PATCH request
+    # @param path [String] API endpoint path
+    # @param body [Hash, nil] Request body
+    # @return [Hash] Response body
+    def patch(path, body = nil)
+      response = @conn.patch(path) do |req|
+        req.headers["xi-api-key"] = api_key
+        req.headers["Content-Type"] = "application/json"
+        req.body = body.to_json if body
+      end
+
+      handle_response(response)
+    end
+
     # Makes an authenticated multipart POST request
     # @param path [String] API endpoint path
     # @param payload [Hash] Multipart payload
@@ -130,6 +154,44 @@ module ElevenlabsClient
       handle_response(response)
     end
 
+    # Makes an authenticated POST request with streaming response for timestamp data
+    # @param path [String] API endpoint path
+    # @param body [Hash, nil] Request body
+    # @param block [Proc] Block to handle each JSON chunk with timestamps
+    # @return [Faraday::Response] Response object
+    def post_streaming_with_timestamps(path, body = nil, &block)
+      buffer = ""
+
+      response = @conn.post(path) do |req|
+        req.headers["xi-api-key"] = api_key
+        req.headers["Content-Type"] = "application/json"
+        req.body = body.to_json if body
+
+        # Set up streaming callback for JSON chunks
+        req.options.on_data = proc do |chunk, _|
+          if block_given?
+            buffer += chunk
+
+            # Process complete JSON objects
+            while buffer.include?("\n")
+              line, buffer = buffer.split("\n", 2)
+              next if line.strip.empty?
+
+              begin
+                json_data = JSON.parse(line)
+                block.call(json_data)
+              rescue JSON::ParserError
+                # Skip malformed JSON lines
+                next
+              end
+            end
+          end
+        end
+      end
+
+      handle_response(response)
+    end
+
     # Helper method to create Faraday::Multipart::FilePart
     # @param file_io [IO] File IO object
     # @param filename [String] Original filename
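The `post_streaming_with_timestamps` helper accumulates network chunks in a buffer and yields each complete newline-delimited JSON object, discarding malformed lines. That buffering loop can be exercised on its own, independent of Faraday; in this standalone sketch, `NdjsonBuffer` is a name invented here for illustration, not part of the gem:

```ruby
require "json"

# Feed arbitrary chunks in; the block receives each complete JSON line.
# Mirrors the buffer-and-split loop in Client#post_streaming_with_timestamps.
class NdjsonBuffer
  def initialize(&block)
    @buffer = ""
    @block = block
  end

  def <<(chunk)
    @buffer += chunk
    # Process every complete line currently in the buffer.
    while @buffer.include?("\n")
      line, @buffer = @buffer.split("\n", 2)
      next if line.strip.empty?
      begin
        @block.call(JSON.parse(line))
      rescue JSON::ParserError
        next # skip malformed JSON lines, as the client does
      end
    end
  end
end

events = []
buf = NdjsonBuffer.new { |obj| events << obj }
# A JSON object may arrive split across chunks; a bad line is skipped.
buf << %({"audio_base64":"abc","ali)
buf << %(gnment":null}\n{"audio_base64":"def"}\nnot-json\n)
```

Keeping the partial tail in the buffer between chunks is what makes the loop safe against objects that straddle chunk boundaries.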
data/lib/elevenlabs_client/endpoints/audio_isolation.rb
ADDED
@@ -0,0 +1,71 @@
+# frozen_string_literal: true
+
+module ElevenlabsClient
+  class AudioIsolation
+    def initialize(client)
+      @client = client
+    end
+
+    # POST /v1/audio-isolation
+    # Removes background noise from audio
+    # Documentation: https://elevenlabs.io/docs/api-reference/audio-isolation
+    #
+    # @param audio_file [IO, File] The audio file from which vocals/speech will be isolated
+    # @param filename [String] Original filename for the audio file
+    # @param options [Hash] Optional parameters
+    # @option options [String] :file_format Format of input audio ('pcm_s16le_16' or 'other', defaults to 'other')
+    # @return [String] Binary audio data with background noise removed
+    def isolate(audio_file, filename, **options)
+      endpoint = "/v1/audio-isolation"
+
+      payload = {
+        audio: @client.file_part(audio_file, filename)
+      }
+
+      # Add optional parameters if provided
+      payload[:file_format] = options[:file_format] if options[:file_format]
+
+      @client.post_multipart(endpoint, payload)
+    end
+
+    # POST /v1/audio-isolation/stream
+    # Removes background noise from audio with streaming response
+    # Documentation: https://elevenlabs.io/docs/api-reference/audio-isolation/stream
+    #
+    # @param audio_file [IO, File] The audio file from which vocals/speech will be isolated
+    # @param filename [String] Original filename for the audio file
+    # @param options [Hash] Optional parameters
+    # @option options [String] :file_format Format of input audio ('pcm_s16le_16' or 'other', defaults to 'other')
+    # @param block [Proc] Block to handle each chunk of streaming audio data
+    # @return [Faraday::Response] Response object for streaming
+    def isolate_stream(audio_file, filename, **options, &block)
+      endpoint = "/v1/audio-isolation/stream"
+
+      payload = {
+        audio: @client.file_part(audio_file, filename)
+      }
+
+      # Add optional parameters if provided
+      payload[:file_format] = options[:file_format] if options[:file_format]
+
+      # Use streaming multipart request
+      response = @client.instance_variable_get(:@conn).post(endpoint) do |req|
+        req.headers["xi-api-key"] = @client.api_key
+        req.body = payload
+
+        # Set up streaming callback if block provided
+        if block_given?
+          req.options.on_data = proc do |chunk, _|
+            block.call(chunk)
+          end
+        end
+      end
+
+      @client.send(:handle_response, response)
+    end
+
+    private
+
+    attr_reader :client
+  end
+end
data/lib/elevenlabs_client/endpoints/audio_native.rb
ADDED
@@ -0,0 +1,103 @@
+# frozen_string_literal: true
+
+module ElevenlabsClient
+  class AudioNative
+    def initialize(client)
+      @client = client
+    end
+
+    # POST /v1/audio-native
+    # Creates Audio Native enabled project, optionally starts conversion and returns project ID and embeddable HTML snippet
+    # Documentation: https://elevenlabs.io/docs/api-reference/audio-native/create
+    #
+    # @param name [String] Project name
+    # @param options [Hash] Optional parameters
+    # @option options [String] :image Image URL used in the player (deprecated)
+    # @option options [String] :author Author used in the player
+    # @option options [String] :title Title used in the player
+    # @option options [Boolean] :small Whether to use small player (deprecated, defaults to false)
+    # @option options [String] :text_color Text color used in the player
+    # @option options [String] :background_color Background color used in the player
+    # @option options [Integer] :sessionization Minutes to persist session (deprecated, defaults to 0)
+    # @option options [String] :voice_id Voice ID used to voice the content
+    # @option options [String] :model_id TTS Model ID used in the player
+    # @option options [IO, File] :file Text or HTML input file containing article content
+    # @option options [String] :filename Original filename for the file
+    # @option options [Boolean] :auto_convert Whether to auto convert project to audio (defaults to false)
+    # @option options [String] :apply_text_normalization Text normalization mode ('auto', 'on', 'off', 'apply_english')
+    # @return [Hash] JSON response containing project_id, converting status, and html_snippet
+    def create(name, **options)
+      endpoint = "/v1/audio-native"
+
+      payload = { name: name }
+
+      # Add optional parameters if provided
+      payload[:image] = options[:image] if options[:image]
+      payload[:author] = options[:author] if options[:author]
+      payload[:title] = options[:title] if options[:title]
+      payload[:small] = options[:small] unless options[:small].nil?
+      payload[:text_color] = options[:text_color] if options[:text_color]
+      payload[:background_color] = options[:background_color] if options[:background_color]
+      payload[:sessionization] = options[:sessionization] if options[:sessionization]
+      payload[:voice_id] = options[:voice_id] if options[:voice_id]
+      payload[:model_id] = options[:model_id] if options[:model_id]
+      payload[:auto_convert] = options[:auto_convert] unless options[:auto_convert].nil?
+      payload[:apply_text_normalization] = options[:apply_text_normalization] if options[:apply_text_normalization]
+
+      # Add file if provided
+      if options[:file] && options[:filename]
+        payload[:file] = @client.file_part(options[:file], options[:filename])
+      end
+
+      @client.post_multipart(endpoint, payload)
+    end
+
+    # POST /v1/audio-native/:project_id/content
+    # Updates content for the specific AudioNative Project
+    # Documentation: https://elevenlabs.io/docs/api-reference/audio-native/update
+    #
+    # @param project_id [String] The ID of the project to be used
+    # @param options [Hash] Optional parameters
+    # @option options [IO, File] :file Text or HTML input file containing article content
+    # @option options [String] :filename Original filename for the file
+    # @option options [Boolean] :auto_convert Whether to auto convert project to audio (defaults to false)
+    # @option options [Boolean] :auto_publish Whether to auto publish after conversion (defaults to false)
+    # @return [Hash] JSON response containing project_id, converting, publishing status, and html_snippet
+    def update_content(project_id, **options)
+      endpoint = "/v1/audio-native/#{project_id}/content"
+
+      payload = {}
+
+      # Add optional parameters if provided
+      payload[:auto_convert] = options[:auto_convert] unless options[:auto_convert].nil?
+      payload[:auto_publish] = options[:auto_publish] unless options[:auto_publish].nil?
+
+      # Add file if provided
+      if options[:file] && options[:filename]
+        payload[:file] = @client.file_part(options[:file], options[:filename])
+      end
+
+      @client.post_multipart(endpoint, payload)
+    end
+
+    # GET /v1/audio-native/:project_id/settings
+    # Get player settings for the specific project
+    # Documentation: https://elevenlabs.io/docs/api-reference/audio-native/settings
+    #
+    # @param project_id [String] The ID of the Studio project
+    # @return [Hash] JSON response containing enabled status, snapshot_id, and settings
+    def get_settings(project_id)
+      endpoint = "/v1/audio-native/#{project_id}/settings"
+      @client.get(endpoint)
+    end
+
+    # Alias methods for convenience
+    alias_method :create_project, :create
+    alias_method :update_project_content, :update_content
+    alias_method :project_settings, :get_settings
+
+    private
+
+    attr_reader :client
+  end
+end
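One detail of `AudioNative#create` worth noting: boolean options are guarded with `unless options[...].nil?` so an explicit `false` still reaches the payload, whereas the truthy `if options[...]` guard used for string options would silently drop it. A standalone sketch of that filtering pattern (the helper name is hypothetical, not part of the gem):

```ruby
# Sketch of the option-filtering pattern in AudioNative#create:
# string-ish options are included when truthy; boolean options are included
# whenever explicitly set, so `auto_convert: false` is not lost.
def build_audio_native_payload(name, **options)
  payload = { name: name }
  %i[image author title text_color background_color sessionization
     voice_id model_id apply_text_normalization].each do |key|
    payload[key] = options[key] if options[key]
  end
  %i[small auto_convert].each do |key|
    payload[key] = options[key] unless options[key].nil?
  end
  payload
end

# Explicit false survives; absent and nil options are omitted.
p build_audio_native_payload("My Article", voice_id: "v1", auto_convert: false)
```

The same `.nil?` guard appears in `update_content` for `auto_convert` and `auto_publish`.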