elevenlabs_client 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +79 -9
- data/README.md +172 -90
- data/lib/elevenlabs_client/client.rb +91 -1
- data/lib/elevenlabs_client/endpoints/sound_generation.rb +46 -0
- data/lib/elevenlabs_client/endpoints/text_to_dialogue.rb +40 -0
- data/lib/elevenlabs_client/endpoints/text_to_speech.rb +50 -0
- data/lib/elevenlabs_client/endpoints/text_to_speech_stream.rb +42 -0
- data/lib/elevenlabs_client/version.rb +1 -1
- data/lib/elevenlabs_client.rb +5 -1
- metadata +6 -2
- /data/lib/elevenlabs_client/{dubs.rb → endpoints/dubs.rb} +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2eb4466ffb626d55734bcd3569141c50293bbee7219fcffc252bd161f3bacac5
+  data.tar.gz: 2b045b85c15865d17f000a924c2f5088558d81f85b50e1273a11b6950e4eda9f
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 30cfe941d5a311175436c55e5952d743efaa42410fc38e3e2c6df5c1b8db374720d960f86cc57c8985127028b6bc9312a360d4744e7536913576a1ce4c3bbb9e
+  data.tar.gz: ec656950468c78eede815ae879c1e7475e6c8d64d725a850c4d6ebf95f7ce9c058174b734a4cec19c4c65133ede31aac0b57f08b271e17ce6a87904c2609e3e3
data/CHANGELOG.md
CHANGED
@@ -7,6 +7,84 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.2.0] - 2025-09-12
+
+### Added
+- **Text-to-Speech API** - Convert text to natural-sounding speech with voice customization
+- **Text-to-Speech Streaming API** - Real-time audio streaming for live applications
+- **Text-to-Dialogue API** - Multi-speaker conversation generation
+- **Sound Generation API** - AI-generated sound effects and ambient audio
+- **Comprehensive Documentation** - Separate documentation files for each API endpoint
+- **Rails Integration Examples** - Complete controller examples for all endpoints
+- **Enhanced Configuration** - Flexible configuration with Settings module
+- **Streaming Support** - Real-time audio chunk processing with block callbacks
+- **Binary Response Handling** - Proper handling of audio data responses
+- **Query Parameter Support** - URL query parameters for API requests
+
+### Enhanced
+- **Endpoint Organization** - Moved all endpoints to dedicated `lib/elevenlabs_client/endpoints/` directory
+- **Client Architecture** - Separated HTTP client logic from endpoint-specific functionality
+- **Error Handling** - Enhanced error handling with streaming-specific exceptions
+- **Test Coverage** - Expanded test suite to 187+ tests covering all new functionality
+- **Configuration System** - Priority-based configuration (explicit > Settings > ENV)
+
+### Documentation
+- **Modular Documentation** - Split endpoint documentation into separate files:
+  - [DUBBING.md](docs/DUBBING.md) - Audio/video dubbing functionality
+  - [TEXT_TO_SPEECH.md](docs/TEXT_TO_SPEECH.md) - Text-to-speech conversion
+  - [TEXT_TO_SPEECH_STREAMING.md](docs/TEXT_TO_SPEECH_STREAMING.md) - Real-time streaming
+  - [TEXT_TO_DIALOGUE.md](docs/TEXT_TO_DIALOGUE.md) - Multi-speaker dialogues
+  - [SOUND_GENERATION.md](docs/SOUND_GENERATION.md) - Sound effect generation
+- **Improved README** - Streamlined main README with quick start guide
+- **Rails Examples** - Complete controller implementations for all endpoints
+- **Usage Examples** - Comprehensive examples for each API feature
+
+### New Endpoints
+- `client.text_to_speech.*` - Text-to-speech conversion with voice settings
+- `client.text_to_speech_stream.*` - Real-time streaming text-to-speech
+- `client.text_to_dialogue.*` - Multi-speaker dialogue generation
+- `client.sound_generation.*` - AI sound effect and ambient audio generation
+
+### New Features
+- **Voice Customization** - Stability, similarity boost, style controls
+- **Audio Formats** - Multiple output formats (MP3, PCM) with quality options
+- **Looping Audio** - Generate seamless looping sound effects
+- **Deterministic Generation** - Seed support for consistent results
+- **Batch Processing** - Multiple sound generation in single requests
+- **WebSocket Integration** - Real-time streaming to WebSocket connections
+- **File Format Support** - Enhanced support for various audio/video formats
+
+### Technical Improvements
+- **Modular Architecture** - Clean separation of concerns with endpoint classes
+- **HTTP Client Enhancement** - Added streaming, binary, and custom header support
+- **Settings Management** - Centralized configuration with Rails initializer support
+- **Memory Management** - Efficient handling of large audio files and streams
+- **Concurrent Testing** - Parallel test execution for faster development
+
+### Examples Added
+- `examples/dubs_controller.rb` - Complete dubbing workflow with batch processing
+- `examples/text_to_speech_controller.rb` - TTS with voice customization
+- `examples/streaming_audio_controller.rb` - Real-time streaming with WebSocket support
+- `examples/text_to_dialogue_controller.rb` - Specialized dialogue endpoints
+- `examples/sound_generation_controller.rb` - Sound effects with presets and batch processing
+- `examples/rails_initializer.rb` - Rails configuration example
+
+### Breaking Changes
+- **Endpoint Access** - Dubbing methods moved from `client.create_dub` to `client.dubs.create`
+- **File Structure** - Endpoint classes moved to `lib/elevenlabs_client/endpoints/`
+- **Configuration** - Enhanced configuration system with new precedence rules
+
+### Migration Guide
+```ruby
+# Before (v0.1.0)
+client.create_dub(file_io: file, filename: "video.mp4", target_languages: ["es"])
+
+# After (v0.2.0)
+client.dubs.create(file_io: file, filename: "video.mp4", target_languages: ["es"])
+```
+
+## [0.1.0] - 2025-09-12
+
 ### Added
 - Initial release of ElevenLabs Client gem
 - Support for ElevenLabs Dubbing API
@@ -26,12 +104,4 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **File Support**: Multiple video and audio formats (MP4, MOV, MP3, WAV, etc.)
 - **Language Support**: Multiple target languages for dubbing
 - **Configuration**: Flexible API key and endpoint configuration
-- **Testing**: Comprehensive test suite with integration tests
-
-## [0.1.0] - 2025-01-XX
-
-### Added
-- Initial implementation of ElevenLabs Client
-- Basic dubbing functionality
-- Error handling and validation
-- Documentation and examples
+- **Testing**: Comprehensive test suite with integration tests
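The migration guide in the changelog above renames `client.create_dub` to `client.dubs.create`. For callers that cannot update every call site at once, one option is a small forwarding shim. The sketch below is hypothetical and not part of the gem: `LegacyDubbingShim`, `FakeDubs`, and `FakeClient` are illustration-only names, and the fake classes stand in for the real client so the example runs without the gem or network access.

```ruby
# Hypothetical shim (not shipped with the gem): forwards the 0.1.0-style
# call to the 0.2.0-style endpoint object and warns about the old name.
module LegacyDubbingShim
  def create_dub(**kwargs)
    warn "create_dub is deprecated; call dubs.create instead"
    dubs.create(**kwargs)
  end
end

# Stand-in for the dubbing endpoint object; records calls instead of
# talking to the API.
FakeDubs = Struct.new(:calls) do
  def create(**kwargs)
    calls << kwargs
    { "dubbing_id" => "demo" }
  end
end

# Stand-in for the client; exposes `dubs` the way the 0.2.0 client does.
class FakeClient
  include LegacyDubbingShim
  attr_reader :dubs

  def initialize
    @dubs = FakeDubs.new([])
  end
end

client = FakeClient.new
result = client.create_dub(filename: "video.mp4", target_languages: ["es"])
```

With the real gem, the same module could presumably be included into the client class, but that is an assumption about how the class can be extended rather than something this diff shows.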
data/README.md
CHANGED
@@ -1,24 +1,45 @@
 # ElevenlabsClient
 
-
+[](https://badge.fury.io/rb/elevenlabs_client)
+[](https://github.com/yourusername/elevenlabs_client/actions)
+
+A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, and sound effects.
+
+## Features
+
+🎙️ **Text-to-Speech** - Convert text to natural-sounding speech
+🎬 **Dubbing** - Create dubbed versions of audio/video content
+💬 **Dialogue Generation** - Multi-speaker conversations
+🔊 **Sound Generation** - AI-generated sound effects and ambient audio
+📡 **Streaming** - Real-time audio streaming
+⚙️ **Configurable** - Flexible configuration options
+🧪 **Well-tested** - Comprehensive test coverage
 
 ## Installation
 
 Add this line to your application's Gemfile:
 
 ```ruby
-gem 'elevenlabs_client'
+gem 'elevenlabs_client'
 ```
 
 And then execute:
 
-
+```bash
+$ bundle install
+```
+
+Or install it yourself as:
+
+```bash
+$ gem install elevenlabs_client
+```
 
-##
+## Quick Start
 
 ### Configuration
 
-#### Rails
+#### Rails Applications (Recommended)
 
 Create `config/initializers/elevenlabs_client.rb`:
 
@@ -26,149 +47,210 @@ Create `config/initializers/elevenlabs_client.rb`:
 ElevenlabsClient::Settings.configure do |config|
   config.properties = {
     elevenlabs_base_uri: ENV["ELEVENLABS_BASE_URL"],
-    elevenlabs_api_key: ENV["ELEVENLABS_API_KEY"]
+    elevenlabs_api_key: ENV["ELEVENLABS_API_KEY"]
   }
 end
 ```
 
-
+Set your environment variables:
 
-```
-
-#
+```bash
+export ELEVENLABS_API_KEY="your_api_key_here"
+export ELEVENLABS_BASE_URL="https://api.elevenlabs.io" # Optional, defaults to official API
 ```
 
-####
-
-You can also use the module-level configure method:
+#### Direct Configuration
 
 ```ruby
+# Module-level configuration
 ElevenlabsClient.configure do |config|
   config.properties = {
     elevenlabs_base_uri: "https://api.elevenlabs.io",
     elevenlabs_api_key: "your_api_key_here"
   }
 end
-```
-
-#### Configuration Precedence
-
-The client uses the following precedence order for configuration:
-
-1. **Explicit parameters** passed to `Client.new` (highest priority)
-2. **Settings.properties** configured via initializer
-3. **Environment variables** (lowest priority)
-
-This allows you to set defaults in your initializer while still being able to override them when needed.
 
-
-
-There are several ways to create a client:
-
-```ruby
-# Using environment variables (default behavior)
-client = ElevenlabsClient.new
-
-# Passing API key directly
-client = ElevenlabsClient::Client.new(api_key: "your_api_key_here")
-
-# Custom base URL
-client = ElevenlabsClient::Client.new(
+# Or pass directly to client
+client = ElevenlabsClient.new(
   api_key: "your_api_key_here",
-  base_url: "https://
-)
-
-# Custom environment variable names
-client = ElevenlabsClient::Client.new(
-  api_key_env: "MY_CUSTOM_API_KEY_VAR",
-  base_url_env: "MY_CUSTOM_BASE_URL_VAR"
+  base_url: "https://api.elevenlabs.io"
 )
 ```
 
 ### Basic Usage
 
 ```ruby
-
-
-# Create a client
+# Initialize client (uses configured settings)
 client = ElevenlabsClient.new
 
-#
+# Text-to-Speech
+audio_data = client.text_to_speech.convert("21m00Tcm4TlvDq8ikWAM", "Hello, world!")
+File.open("hello.mp3", "wb") { |f| f.write(audio_data) }
+
+# Dubbing
 File.open("video.mp4", "rb") do |file|
   result = client.dubs.create(
     file_io: file,
     filename: "video.mp4",
-    target_languages: ["es", "
-    name: "My Video Dub",
-    drop_background_audio: true,
-    use_profanity_filter: false
+    target_languages: ["es", "fr", "de"]
   )
-
-  puts "Dubbing job created: #{result['dubbing_id']}"
 end
 
-#
-
-
+# Dialogue Generation
+dialogue = [
+  { text: "Hello, how are you?", voice_id: "voice_1" },
+  { text: "I'm doing great, thanks!", voice_id: "voice_2" }
+]
+audio_data = client.text_to_dialogue.convert(dialogue)
 
-#
-
-puts "Completed dubs: #{dubs['dubs'].length}"
+# Sound Generation
+audio_data = client.sound_generation.generate("Ocean waves crashing on rocks")
 
-#
-
-
+# Streaming Text-to-Speech
+client.text_to_speech_stream.stream("voice_id", "Streaming text") do |chunk|
+  # Process audio chunk in real-time
+  puts "Received #{chunk.bytesize} bytes"
+end
 ```
 
-
+## API Documentation
+
+### Core APIs
 
-
+- **[Dubbing API](docs/DUBBING.md)** - Create dubbed versions of audio/video content
+- **[Text-to-Speech API](docs/TEXT_TO_SPEECH.md)** - Convert text to natural speech
+- **[Text-to-Speech Streaming API](docs/TEXT_TO_SPEECH_STREAMING.md)** - Real-time audio streaming
+- **[Text-to-Dialogue API](docs/TEXT_TO_DIALOGUE.md)** - Multi-speaker conversations
+- **[Sound Generation API](docs/SOUND_GENERATION.md)** - AI-generated sound effects
 
-
-- `client.dubs.get(dubbing_id)` - Get dubbing job details
-- `client.dubs.list(params = {})` - List dubbing jobs with optional filters
-- `client.dubs.resources(dubbing_id)` - Get dubbing resources for editing
+### Available Endpoints
 
-
+| Endpoint | Description | Documentation |
+|----------|-------------|---------------|
+| `client.dubs.*` | Audio/video dubbing | [DUBBING.md](docs/DUBBING.md) |
+| `client.text_to_speech.*` | Text-to-speech conversion | [TEXT_TO_SPEECH.md](docs/TEXT_TO_SPEECH.md) |
+| `client.text_to_speech_stream.*` | Streaming TTS | [TEXT_TO_SPEECH_STREAMING.md](docs/TEXT_TO_SPEECH_STREAMING.md) |
+| `client.text_to_dialogue.*` | Dialogue generation | [TEXT_TO_DIALOGUE.md](docs/TEXT_TO_DIALOGUE.md) |
+| `client.sound_generation.*` | Sound effect generation | [SOUND_GENERATION.md](docs/SOUND_GENERATION.md) |
+
+## Configuration Options
+
+### Configuration Precedence
+
+1. **Explicit parameters** (highest priority)
+2. **Settings.properties** (configured via initializer)
+3. **Environment variables** (lowest priority)
 
-
-
-- `
-- `
-
-
-
-
-
-
-
+### Environment Variables
+
+- `ELEVENLABS_API_KEY` - Your ElevenLabs API key (required)
+- `ELEVENLABS_BASE_URL` - API base URL (optional, defaults to `https://api.elevenlabs.io`)
+
+### Custom Environment Variable Names
+
+```ruby
+client = ElevenlabsClient.new(
+  api_key_env: "CUSTOM_API_KEY_VAR",
+  base_url_env: "CUSTOM_BASE_URL_VAR"
+)
+```
 
 ## Error Handling
 
-The client
+The client provides specific exception types for different error conditions:
 
 ```ruby
 begin
-  client.
-rescue ElevenlabsClient::AuthenticationError
-  puts "Invalid API key
-rescue ElevenlabsClient::RateLimitError
-  puts "Rate limit exceeded
+  result = client.text_to_speech.convert(voice_id, text)
+rescue ElevenlabsClient::AuthenticationError
+  puts "Invalid API key"
+rescue ElevenlabsClient::RateLimitError
+  puts "Rate limit exceeded"
 rescue ElevenlabsClient::ValidationError => e
-  puts "
+  puts "Invalid parameters: #{e.message}"
 rescue ElevenlabsClient::APIError => e
   puts "API error: #{e.message}"
 end
 ```
 
+### Exception Types
+
+- `AuthenticationError` - Invalid API key or authentication failure
+- `RateLimitError` - Rate limit exceeded
+- `ValidationError` - Invalid request parameters
+- `APIError` - General API errors
+
+## Rails Integration
+
+The gem is designed to work seamlessly with Rails applications. See the [examples](examples/) directory for complete controller implementations:
+
+- [DubsController](examples/dubs_controller.rb) - Complete dubbing workflow
+- [TextToSpeechController](examples/text_to_speech_controller.rb) - TTS with error handling
+- [StreamingAudioController](examples/streaming_audio_controller.rb) - Real-time streaming
+- [TextToDialogueController](examples/text_to_dialogue_controller.rb) - Dialogue generation
+- [SoundGenerationController](examples/sound_generation_controller.rb) - Sound effects
+
 ## Development
 
-After checking out the repo, run
+After checking out the repo, run:
+
+```bash
+bin/setup # Install dependencies
+bundle exec rspec # Run tests
+```
+
+To install this gem onto your local machine:
+
+```bash
+bundle exec rake install
+```
+
+To release a new version:
+
+1. Update the version number in `version.rb`
+2. Update `CHANGELOG.md`
+3. Run `bundle exec rake release`
+
+## Testing
+
+The gem includes comprehensive test coverage with RSpec:
+
+```bash
+# Run all tests
+bundle exec rspec
+
+# Run specific test files
+bundle exec rspec spec/elevenlabs_client/endpoints/
+bundle exec rspec spec/integration/
+
+# Run with documentation format
+bundle exec rspec --format documentation
+```
 
 ## Contributing
 
-Bug reports and pull requests are welcome on GitHub.
+Bug reports and pull requests are welcome on GitHub at https://github.com/yourusername/elevenlabs_client.
+
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Add some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create a new Pull Request
 
 ## License
 
 The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+
+## Changelog
+
+See [CHANGELOG.md](CHANGELOG.md) for a detailed list of changes and version history.
+
+## Support
+
+- 📖 **Documentation**: [API Documentation](docs/)
+- 🐛 **Issues**: [GitHub Issues](https://github.com/yourusername/elevenlabs_client/issues)
+- 💬 **Discussions**: [GitHub Discussions](https://github.com/yourusername/elevenlabs_client/discussions)
+
+---
+
+Made with ❤️ for the Ruby community
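The README's "Configuration Precedence" list (explicit parameters, then `Settings.properties`, then environment variables) can be illustrated with a small resolver. This is a minimal sketch under stated assumptions: `resolve_setting` is a hypothetical helper, the plain hashes stand in for the gem's Settings store and for `ENV`, and the gem's own lookup code is not shown in this diff.

```ruby
# Hypothetical helper illustrating the documented precedence:
# explicit argument first, then a Settings-style hash, then the environment.
def resolve_setting(explicit, settings, settings_key, env, env_name)
  return explicit unless explicit.nil?

  from_settings = settings[settings_key]
  return from_settings unless from_settings.nil?

  env[env_name]
end

# Plain hashes standing in for Settings.properties and ENV.
settings = { elevenlabs_api_key: "from_settings" }
env = {
  "ELEVENLABS_API_KEY" => "from_env",
  "ELEVENLABS_BASE_URL" => "https://api.elevenlabs.io"
}

key_explicit = resolve_setting("explicit_key", settings, :elevenlabs_api_key, env, "ELEVENLABS_API_KEY")
key_settings = resolve_setting(nil, settings, :elevenlabs_api_key, env, "ELEVENLABS_API_KEY")
base_url     = resolve_setting(nil, settings, :elevenlabs_base_uri, env, "ELEVENLABS_BASE_URL")
```

Here the explicit key wins when given, the settings value wins over the environment, and the base URL falls through to the environment because the settings hash has no entry for it.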
data/lib/elevenlabs_client/client.rb
CHANGED
@@ -7,13 +7,17 @@ module ElevenlabsClient
   class Client
     DEFAULT_BASE_URL = "https://api.elevenlabs.io"
 
-    attr_reader :base_url, :api_key, :dubs
+    attr_reader :base_url, :api_key, :dubs, :text_to_speech, :text_to_speech_stream, :text_to_dialogue, :sound_generation
 
     def initialize(api_key: nil, base_url: nil, api_key_env: "ELEVENLABS_API_KEY", base_url_env: "ELEVENLABS_BASE_URL")
       @api_key = api_key || fetch_api_key(api_key_env)
       @base_url = base_url || fetch_base_url(base_url_env)
       @conn = build_connection
       @dubs = Dubs.new(self)
+      @text_to_speech = TextToSpeech.new(self)
+      @text_to_speech_stream = TextToSpeechStream.new(self)
+      @text_to_dialogue = TextToDialogue.new(self)
+      @sound_generation = SoundGeneration.new(self)
     end
 
     # Makes an authenticated GET request
@@ -54,6 +58,62 @@ module ElevenlabsClient
       handle_response(response)
     end
 
+    # Makes an authenticated POST request expecting binary response
+    # @param path [String] API endpoint path
+    # @param body [Hash, nil] Request body
+    # @return [String] Binary response body
+    def post_binary(path, body = nil)
+      response = @conn.post(path) do |req|
+        req.headers["xi-api-key"] = api_key
+        req.headers["Content-Type"] = "application/json"
+        req.body = body.to_json if body
+      end
+
+      handle_binary_response(response)
+    end
+
+    # Makes an authenticated POST request with custom headers
+    # @param path [String] API endpoint path
+    # @param body [Hash, nil] Request body
+    # @param custom_headers [Hash] Additional headers
+    # @return [String] Response body (binary or text)
+    def post_with_custom_headers(path, body = nil, custom_headers = {})
+      response = @conn.post(path) do |req|
+        req.headers["xi-api-key"] = api_key
+        req.headers["Content-Type"] = "application/json"
+        custom_headers.each { |key, value| req.headers[key] = value }
+        req.body = body.to_json if body
+      end
+
+      # For streaming/binary responses, return raw body
+      if custom_headers["Accept"]&.include?("audio") || custom_headers["Transfer-Encoding"] == "chunked"
+        handle_binary_response(response)
+      else
+        handle_response(response)
+      end
+    end
+
+    # Makes an authenticated POST request with streaming response
+    # @param path [String] API endpoint path
+    # @param body [Hash, nil] Request body
+    # @param block [Proc] Block to handle each chunk
+    # @return [Faraday::Response] Response object
+    def post_streaming(path, body = nil, &block)
+      response = @conn.post(path) do |req|
+        req.headers["xi-api-key"] = api_key
+        req.headers["Content-Type"] = "application/json"
+        req.headers["Accept"] = "audio/mpeg"
+        req.body = body.to_json if body
+
+        # Set up streaming callback
+        req.options.on_data = proc do |chunk, _|
+          block.call(chunk) if block_given?
+        end
+      end
+
+      handle_streaming_response(response)
+    end
+
     # Helper method to create Faraday::Multipart::FilePart
     # @param file_io [IO] File IO object
     # @param filename [String] Original filename
@@ -108,6 +168,36 @@ module ElevenlabsClient
       end
     end
 
+    def handle_binary_response(response)
+      case response.status
+      when 200..299
+        response.body
+      when 401
+        raise AuthenticationError, "Invalid API key or authentication failed"
+      when 429
+        raise RateLimitError, "Rate limit exceeded"
+      when 400..499
+        raise ValidationError, "API request failed with status #{response.status}"
+      else
+        raise APIError, "API request failed with status #{response.status}"
+      end
+    end
+
+    def handle_streaming_response(response)
+      case response.status
+      when 200..299
+        response
+      when 401
+        raise AuthenticationError, "Invalid API key or authentication failed"
+      when 429
+        raise RateLimitError, "Rate limit exceeded"
+      when 400..499
+        raise ValidationError, "API request failed with status #{response.status}"
+      else
+        raise APIError, "API request failed with status #{response.status}"
+      end
+    end
+
     def mime_for(filename)
       ext = File.extname(filename).downcase
       case ext
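Both new response handlers above map HTTP status codes to exception classes with a `case` whose branch order matters: `401` and `429` are listed before the `400..499` range, and Ruby's `case` takes the first matching branch. The sketch below isolates that mapping so it can be exercised without HTTP. The `StatusSketch` module, its error classes, and their inheritance from a common base are stand-ins assumed for illustration; the gem's `errors.rb` is not part of this diff.

```ruby
# Stand-in error classes so the sketch is self-contained; the hierarchy is
# an assumption, not taken from the gem.
module StatusSketch
  class APIError < StandardError; end
  class AuthenticationError < APIError; end
  class RateLimitError < APIError; end
  class ValidationError < APIError; end

  # Returns nil for success, otherwise the error class for the status.
  # Specific codes come before the 400..499 range because `case` stops at
  # the first branch that matches.
  def self.error_class_for(status)
    case status
    when 200..299 then nil
    when 401 then AuthenticationError
    when 429 then RateLimitError
    when 400..499 then ValidationError
    else APIError
    end
  end
end

unauthorized = StatusSketch.error_class_for(401)
not_found    = StatusSketch.error_class_for(404)
server_error = StatusSketch.error_class_for(500)
```

If the `400..499` branch were listed first, a 401 or 429 would be reported as a validation error, which is why the order in the diff is significant.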
data/lib/elevenlabs_client/endpoints/sound_generation.rb
ADDED
@@ -0,0 +1,46 @@
+# frozen_string_literal: true
+
+module ElevenlabsClient
+  class SoundGeneration
+    def initialize(client)
+      @client = client
+    end
+
+    # POST /v1/sound-generation
+    # Convert text to sound effects and retrieve audio (binary data)
+    # Documentation: https://elevenlabs.io/docs/api-reference/sound-generation
+    #
+    # @param text [String] Text prompt describing the sound effect
+    # @param options [Hash] Optional parameters
+    # @option options [Boolean] :loop Whether to create a looping sound effect (default: false)
+    # @option options [Float] :duration_seconds Duration in seconds (0.5 to 30, default: nil for auto-detection)
+    # @option options [Float] :prompt_influence Prompt influence (0.0 to 1.0, default: 0.3)
+    # @option options [String] :output_format Output format (e.g., "mp3_22050_32", default: "mp3_44100_128")
+    # @return [String] The binary audio data (usually an MP3)
+    def generate(text, **options)
+      endpoint = "/v1/sound-generation"
+      request_body = { text: text }
+
+      # Add optional parameters if provided
+      request_body[:loop] = options[:loop] unless options[:loop].nil?
+      request_body[:duration_seconds] = options[:duration_seconds] if options[:duration_seconds]
+      request_body[:prompt_influence] = options[:prompt_influence] if options[:prompt_influence]
+
+      # Handle output_format as query parameter
+      query_params = {}
+      query_params[:output_format] = options[:output_format] if options[:output_format]
+
+      # Build endpoint with query parameters if any
+      full_endpoint = query_params.any? ? "#{endpoint}?#{URI.encode_www_form(query_params)}" : endpoint
+
+      @client.post_binary(full_endpoint, request_body)
+    end
+
+    # Alias for backward compatibility and convenience
+    alias_method :sound_generation, :generate
+
+    private
+
+    attr_reader :client
+  end
+end
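The `generate` method above sends `output_format` as a URL query parameter rather than in the JSON body, building the query string with the standard library's `URI.encode_www_form`. A minimal standalone sketch of that step follows; `endpoint_with_query` is a hypothetical helper name, and dropping `nil` values is an illustrative choice rather than something the gem is shown to do.

```ruby
require "uri"

# Hypothetical helper: appends a query string to a path only when there is
# at least one non-nil parameter. URI.encode_www_form handles escaping.
def endpoint_with_query(path, query = {})
  present = query.reject { |_, value| value.nil? }
  return path if present.empty?

  "#{path}?#{URI.encode_www_form(present)}"
end

with_format    = endpoint_with_query("/v1/sound-generation", { output_format: "mp3_22050_32" })
without_format = endpoint_with_query("/v1/sound-generation", { output_format: nil })
```

Using `URI.encode_www_form` instead of string interpolation means values containing spaces or `&` are escaped rather than corrupting the URL.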
data/lib/elevenlabs_client/endpoints/text_to_dialogue.rb
ADDED
@@ -0,0 +1,40 @@
+# frozen_string_literal: true
+
+module ElevenlabsClient
+  class TextToDialogue
+    def initialize(client)
+      @client = client
+    end
+
+    # POST /v1/text-to-dialogue
+    # Converts a list of text and voice ID pairs into speech (dialogue) and returns audio.
+    # Documentation: https://elevenlabs.io/docs/api-reference/text-to-dialogue/convert
+    #
+    # @param inputs [Array<Hash>] A list of dialogue inputs, each containing text and a voice ID
+    # @option inputs [String] :text The text to be converted to speech
+    # @option inputs [String] :voice_id The voice ID to use for this text
+    # @param options [Hash] Optional parameters
+    # @option options [String] :model_id Identifier of the model to be used
+    # @option options [Hash] :settings Settings controlling the dialogue generation
+    # @option options [Integer] :seed Best effort to sample deterministically
+    # @return [String] The binary audio data (usually an MP3)
+    def convert(inputs, **options)
+      endpoint = "/v1/text-to-dialogue"
+      request_body = { inputs: inputs }
+
+      # Add optional parameters
+      request_body[:model_id] = options[:model_id] if options[:model_id]
+      request_body[:settings] = options[:settings] if options[:settings] && !options[:settings].empty?
+      request_body[:seed] = options[:seed] if options[:seed]
+
+      @client.post_binary(endpoint, request_body)
+    end
+
+    # Alias for backward compatibility and convenience
+    alias_method :text_to_dialogue, :convert
+
+    private
+
+    attr_reader :client
+  end
+end
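The `convert` method above builds its JSON body from a required `inputs` array plus optional `model_id`, `settings`, and `seed` keys that are only included when supplied. The sketch below reproduces that assembly as a pure function so the resulting payload can be inspected without sending a request. `dialogue_body` is a hypothetical name, and the `"example_model"` value and `stability` setting are placeholders, not identifiers confirmed by this diff.

```ruby
require "json"

# Hypothetical pure-function version of the body assembly: optional keys
# are added only when a usable value is given.
def dialogue_body(inputs, model_id: nil, settings: nil, seed: nil)
  body = { inputs: inputs }
  body[:model_id] = model_id if model_id
  body[:settings] = settings if settings && !settings.empty?
  body[:seed] = seed unless seed.nil?
  body
end

inputs = [
  { text: "Hello, how are you?", voice_id: "voice_1" },
  { text: "I'm doing great, thanks!", voice_id: "voice_2" }
]

minimal = dialogue_body(inputs)
full    = dialogue_body(inputs, model_id: "example_model", settings: { stability: 0.5 }, seed: 42)
payload = JSON.parse(JSON.generate(full))
```

Omitting unset keys keeps the payload small and leaves server-side defaults in effect, which appears to be the intent of the conditionals in the diff.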
data/lib/elevenlabs_client/endpoints/text_to_speech.rb
ADDED
@@ -0,0 +1,50 @@
+# frozen_string_literal: true
+
+module ElevenlabsClient
+  class TextToSpeech
+    def initialize(client)
+      @client = client
+    end
+
+    # POST /v1/text-to-speech/{voice_id}
+    # Convert text to speech and retrieve audio (binary data)
+    # Documentation: https://elevenlabs.io/docs/api-reference/text-to-speech/convert
+    #
+    # @param voice_id [String] The ID of the voice to use
+    # @param text [String] Text to synthesize
+    # @param options [Hash] Optional TTS parameters
+    # @option options [String] :model_id Model to use (e.g. "eleven_monolingual_v1" or "eleven_multilingual_v1")
+    # @option options [Hash] :voice_settings Voice configuration (stability, similarity_boost, style, use_speaker_boost, etc.)
+    # @option options [Boolean] :optimize_streaming Whether to receive chunked streaming audio
+    # @return [String] The binary audio data (usually an MP3)
+    def convert(voice_id, text, **options)
+      endpoint = "/v1/text-to-speech/#{voice_id}"
+      request_body = { text: text }
+
+      # Add optional parameters
+      request_body[:model_id] = options[:model_id] if options[:model_id]
+      request_body[:voice_settings] = options[:voice_settings] if options[:voice_settings]
+
+      # Handle streaming optimization
+      if options[:optimize_streaming]
+        @client.post_with_custom_headers(endpoint, request_body, streaming_headers)
+      else
+        @client.post_binary(endpoint, request_body)
+      end
+    end
+
+    # Alias for backward compatibility and convenience
+    alias_method :text_to_speech, :convert
+
+    private
+
+    attr_reader :client
+
+    def streaming_headers
+      {
+        "Accept" => "audio/mpeg",
+        "Transfer-Encoding" => "chunked"
+      }
+    end
+  end
+end
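When `optimize_streaming` is set, `convert` passes the `streaming_headers` hash to `post_with_custom_headers`, which decides between the binary and JSON response handlers by inspecting those headers. The check can be written as a small predicate, sketched below. `binary_response_expected?` is a hypothetical name; the condition mirrors the one in the `client.rb` hunk.

```ruby
# Hypothetical predicate mirroring the header check in
# post_with_custom_headers: an audio Accept header or a chunked
# Transfer-Encoding selects the binary handler.
def binary_response_expected?(custom_headers)
  accept = custom_headers["Accept"]
  (!accept.nil? && accept.include?("audio")) ||
    custom_headers["Transfer-Encoding"] == "chunked"
end

streaming = binary_response_expected?("Accept" => "audio/mpeg", "Transfer-Encoding" => "chunked")
json_only = binary_response_expected?("Accept" => "application/json")
lowercase = binary_response_expected?("accept" => "audio/mpeg")
```

Because this is a plain `Hash` lookup, the header names are case-sensitive: a caller passing `"accept"` in lowercase would not match, so `lowercase` is `false` here even though HTTP header names themselves are case-insensitive.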
data/lib/elevenlabs_client/endpoints/text_to_speech_stream.rb
ADDED
@@ -0,0 +1,42 @@
+# frozen_string_literal: true
+
+module ElevenlabsClient
+  class TextToSpeechStream
+    def initialize(client)
+      @client = client
+    end
+
+    # POST /v1/text-to-speech/{voice_id}/stream
+    # Stream text-to-speech audio in real-time chunks
+    #
+    # @param voice_id [String] The ID of the voice to use
+    # @param text [String] Text to synthesize
+    # @param options [Hash] Optional TTS parameters
+    # @option options [String] :model_id Model to use (defaults to "eleven_multilingual_v2")
+    # @option options [String] :output_format Output format (defaults to "mp3_44100_128")
+    # @option options [Hash] :voice_settings Voice configuration
+    # @param block [Proc] Block to handle each audio chunk
+    # @return [Faraday::Response] The response object
+    def stream(voice_id, text, **options, &block)
+      output_format = options[:output_format] || "mp3_44100_128"
+      endpoint = "/v1/text-to-speech/#{voice_id}/stream?output_format=#{output_format}"
+
+      request_body = {
+        text: text,
+        model_id: options[:model_id] || "eleven_multilingual_v2"
+      }
+
+      # Add voice_settings if provided
+      request_body[:voice_settings] = options[:voice_settings] if options[:voice_settings]
+
+      @client.post_streaming(endpoint, request_body, &block)
+    end
+
+    # Alias for backward compatibility
+    alias_method :text_to_speech_stream, :stream
+
+    private
+
+    attr_reader :client
+  end
+end
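The `stream` method above yields each audio chunk to the caller's block as it arrives. A common consumer pattern is to append chunks to a binary buffer while tracking the byte count. The sketch below shows that pattern against a stand-in: `fake_stream` is a hypothetical method that yields canned chunks, used here only so the example runs without the gem or network access.

```ruby
require "stringio"

# Stand-in for a streaming call: yields each chunk to the block in order
# and returns how many chunks were delivered.
def fake_stream(chunks)
  chunks.each { |chunk| yield chunk }
  chunks.length
end

# Collect chunks into an in-memory binary buffer. Writing to a File opened
# with "wb" would follow the same shape.
buffer = StringIO.new
buffer.binmode
total_bytes = 0

chunk_count = fake_stream(["ID3".b, "\x00\x01".b, "tail".b]) do |chunk|
  buffer.write(chunk)
  total_bytes += chunk.bytesize
end

audio = buffer.string
```

With the real client the block body would stay the same and only the call would change, assuming the block receives one `String` chunk per invocation as the README example suggests.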
data/lib/elevenlabs_client.rb
CHANGED
@@ -3,7 +3,11 @@
 require_relative "elevenlabs_client/version"
 require_relative "elevenlabs_client/errors"
 require_relative "elevenlabs_client/settings"
-require_relative "elevenlabs_client/dubs"
+require_relative "elevenlabs_client/endpoints/dubs"
+require_relative "elevenlabs_client/endpoints/text_to_speech"
+require_relative "elevenlabs_client/endpoints/text_to_speech_stream"
+require_relative "elevenlabs_client/endpoints/text_to_dialogue"
+require_relative "elevenlabs_client/endpoints/sound_generation"
 require_relative "elevenlabs_client/client"
 
 module ElevenlabsClient
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: elevenlabs_client
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.2.0
 platform: ruby
 authors:
 - Vitor Oliveira
@@ -121,7 +121,11 @@ files:
 - README.md
 - lib/elevenlabs_client.rb
 - lib/elevenlabs_client/client.rb
-- lib/elevenlabs_client/dubs.rb
+- lib/elevenlabs_client/endpoints/dubs.rb
+- lib/elevenlabs_client/endpoints/sound_generation.rb
+- lib/elevenlabs_client/endpoints/text_to_dialogue.rb
+- lib/elevenlabs_client/endpoints/text_to_speech.rb
+- lib/elevenlabs_client/endpoints/text_to_speech_stream.rb
 - lib/elevenlabs_client/errors.rb
 - lib/elevenlabs_client/settings.rb
 - lib/elevenlabs_client/version.rb
/data/lib/elevenlabs_client/{dubs.rb → endpoints/dubs.rb}
File without changes