elevenlabs_client 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 244f4e543adab6725041a23c4742e95b32e6352635496ecf7ea3dbb7ae518d8b
-   data.tar.gz: 0f4444f50015137e1627a82edc7b9a6159a5dc4bdb41de521c8d2609130d642c
+   metadata.gz: 2eb4466ffb626d55734bcd3569141c50293bbee7219fcffc252bd161f3bacac5
+   data.tar.gz: 2b045b85c15865d17f000a924c2f5088558d81f85b50e1273a11b6950e4eda9f
  SHA512:
-   metadata.gz: 4fda1901bc041645ef289c56bc1474ae9f5e3c9ec965dc70371271344b5412c799c8e9fb519d28d780b1c26de32b7318abc63b2d0d0681337fd004ec53e48179
-   data.tar.gz: 92dc43058342c80fd0c72d66d8936fac2d66f1a38c31d606481a40c244bc8c8cd1cfc452f959e1dc014b514c2a56135a40e103166c53d1cbd3c1d15e0f3849fd
+   metadata.gz: 30cfe941d5a311175436c55e5952d743efaa42410fc38e3e2c6df5c1b8db374720d960f86cc57c8985127028b6bc9312a360d4744e7536913576a1ce4c3bbb9e
+   data.tar.gz: ec656950468c78eede815ae879c1e7475e6c8d64d725a850c4d6ebf95f7ce9c058174b734a4cec19c4c65133ede31aac0b57f08b271e17ce6a87904c2609e3e3
data/CHANGELOG.md CHANGED
@@ -7,6 +7,84 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

  ## [Unreleased]

+ ## [0.2.0] - 2025-09-12
+
+ ### Added
+ - **Text-to-Speech API** - Convert text to natural-sounding speech with voice customization
+ - **Text-to-Speech Streaming API** - Real-time audio streaming for live applications
+ - **Text-to-Dialogue API** - Multi-speaker conversation generation
+ - **Sound Generation API** - AI-generated sound effects and ambient audio
+ - **Comprehensive Documentation** - Separate documentation files for each API endpoint
+ - **Rails Integration Examples** - Complete controller examples for all endpoints
+ - **Enhanced Configuration** - Flexible configuration with Settings module
+ - **Streaming Support** - Real-time audio chunk processing with block callbacks
+ - **Binary Response Handling** - Proper handling of audio data responses
+ - **Query Parameter Support** - URL query parameters for API requests
+
+ ### Enhanced
+ - **Endpoint Organization** - Moved all endpoints to dedicated `lib/elevenlabs_client/endpoints/` directory
+ - **Client Architecture** - Separated HTTP client logic from endpoint-specific functionality
+ - **Error Handling** - Enhanced error handling with streaming-specific exceptions
+ - **Test Coverage** - Expanded test suite to 187+ tests covering all new functionality
+ - **Configuration System** - Priority-based configuration (explicit > Settings > ENV)
+
+ ### Documentation
+ - **Modular Documentation** - Split endpoint documentation into separate files:
+   - [DUBBING.md](docs/DUBBING.md) - Audio/video dubbing functionality
+   - [TEXT_TO_SPEECH.md](docs/TEXT_TO_SPEECH.md) - Text-to-speech conversion
+   - [TEXT_TO_SPEECH_STREAMING.md](docs/TEXT_TO_SPEECH_STREAMING.md) - Real-time streaming
+   - [TEXT_TO_DIALOGUE.md](docs/TEXT_TO_DIALOGUE.md) - Multi-speaker dialogues
+   - [SOUND_GENERATION.md](docs/SOUND_GENERATION.md) - Sound effect generation
+ - **Improved README** - Streamlined main README with quick start guide
+ - **Rails Examples** - Complete controller implementations for all endpoints
+ - **Usage Examples** - Comprehensive examples for each API feature
+
+ ### New Endpoints
+ - `client.text_to_speech.*` - Text-to-speech conversion with voice settings
+ - `client.text_to_speech_stream.*` - Real-time streaming text-to-speech
+ - `client.text_to_dialogue.*` - Multi-speaker dialogue generation
+ - `client.sound_generation.*` - AI sound effect and ambient audio generation
+
+ ### New Features
+ - **Voice Customization** - Stability, similarity boost, style controls
+ - **Audio Formats** - Multiple output formats (MP3, PCM) with quality options
+ - **Looping Audio** - Generate seamless looping sound effects
+ - **Deterministic Generation** - Seed support for consistent results
+ - **Batch Processing** - Multiple sound generation in single requests
+ - **WebSocket Integration** - Real-time streaming to WebSocket connections
+ - **File Format Support** - Enhanced support for various audio/video formats
+
+ ### Technical Improvements
+ - **Modular Architecture** - Clean separation of concerns with endpoint classes
+ - **HTTP Client Enhancement** - Added streaming, binary, and custom header support
+ - **Settings Management** - Centralized configuration with Rails initializer support
+ - **Memory Management** - Efficient handling of large audio files and streams
+ - **Concurrent Testing** - Parallel test execution for faster development
+
+ ### Examples Added
+ - `examples/dubs_controller.rb` - Complete dubbing workflow with batch processing
+ - `examples/text_to_speech_controller.rb` - TTS with voice customization
+ - `examples/streaming_audio_controller.rb` - Real-time streaming with WebSocket support
+ - `examples/text_to_dialogue_controller.rb` - Specialized dialogue endpoints
+ - `examples/sound_generation_controller.rb` - Sound effects with presets and batch processing
+ - `examples/rails_initializer.rb` - Rails configuration example
+
+ ### Breaking Changes
+ - **Endpoint Access** - Dubbing methods moved from `client.create_dub` to `client.dubs.create`
+ - **File Structure** - Endpoint classes moved to `lib/elevenlabs_client/endpoints/`
+ - **Configuration** - Enhanced configuration system with new precedence rules
+
+ ### Migration Guide
+ ```ruby
+ # Before (v0.1.0)
+ client.create_dub(file_io: file, filename: "video.mp4", target_languages: ["es"])
+
+ # After (v0.2.0)
+ client.dubs.create(file_io: file, filename: "video.mp4", target_languages: ["es"])
+ ```
+
+ ## [0.1.0] - 2025-09-12
+
  ### Added
  - Initial release of ElevenLabs Client gem
  - Support for ElevenLabs Dubbing API
@@ -26,12 +104,4 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - **File Support**: Multiple video and audio formats (MP4, MOV, MP3, WAV, etc.)
  - **Language Support**: Multiple target languages for dubbing
  - **Configuration**: Flexible API key and endpoint configuration
- - **Testing**: Comprehensive test suite with integration tests
-
- ## [0.1.0] - 2025-01-XX
-
- ### Added
- - Initial implementation of ElevenLabs Client
- - Basic dubbing functionality
- - Error handling and validation
- - Documentation and examples
+ - **Testing**: Comprehensive test suite with integration tests
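The Breaking Changes entry in the 0.2.0 changelog renames `client.create_dub` to `client.dubs.create`. For callers that cannot migrate in one step, a small forwarding shim is one option. The sketch below is hypothetical and not part of the gem: `LegacyDubbingShim`, `FakeDubs`, and `FakeClient` are illustrative names, and the fake classes stand in for the real client so the example runs without network access.

```ruby
# Hypothetical compatibility shim; not shipped with elevenlabs_client.
module LegacyDubbingShim
  # Forwards the v0.1.0-style call to the v0.2.0 location and warns on stderr.
  def create_dub(**kwargs)
    warn "[DEPRECATED] client.create_dub is now client.dubs.create"
    dubs.create(**kwargs)
  end
end

# Stand-in for the gem's dubbing endpoint; it only echoes its arguments.
class FakeDubs
  def create(file_io:, filename:, target_languages:, **_options)
    { "filename" => filename, "target_languages" => target_languages }
  end
end

# Stand-in for the gem's client; exposes `dubs` like the v0.2.0 client does.
class FakeClient
  include LegacyDubbingShim

  attr_reader :dubs

  def initialize
    @dubs = FakeDubs.new
  end
end
```

Against the real gem, the same module could presumably be mixed into `ElevenlabsClient::Client`, assuming that class exposes `dubs` as the `attr_reader` in this release suggests.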
data/README.md CHANGED
@@ -1,24 +1,45 @@
  # ElevenlabsClient

- A Ruby client library for interacting with ElevenLabs APIs, including dubbing and voice synthesis.
+ [![Gem Version](https://badge.fury.io/rb/elevenlabs_client.svg)](https://badge.fury.io/rb/elevenlabs_client)
+ [![Build Status](https://github.com/yourusername/elevenlabs_client/workflows/CI/badge.svg)](https://github.com/yourusername/elevenlabs_client/actions)
+
+ A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, and sound effects.
+
+ ## Features
+
+ 🎙️ **Text-to-Speech** - Convert text to natural-sounding speech
+ 🎬 **Dubbing** - Create dubbed versions of audio/video content
+ 💬 **Dialogue Generation** - Multi-speaker conversations
+ 🔊 **Sound Generation** - AI-generated sound effects and ambient audio
+ 📡 **Streaming** - Real-time audio streaming
+ ⚙️ **Configurable** - Flexible configuration options
+ 🧪 **Well-tested** - Comprehensive test coverage

  ## Installation

  Add this line to your application's Gemfile:

  ```ruby
- gem 'elevenlabs_client', path: 'lib/elevenlabs_client'
+ gem 'elevenlabs_client'
  ```

  And then execute:

- $ bundle install
+ ```bash
+ $ bundle install
+ ```
+
+ Or install it yourself as:
+
+ ```bash
+ $ gem install elevenlabs_client
+ ```

- ## Usage
+ ## Quick Start

  ### Configuration

- #### Rails Initializer (Recommended for Rails apps)
+ #### Rails Applications (Recommended)

  Create `config/initializers/elevenlabs_client.rb`:

@@ -26,149 +47,210 @@ Create `config/initializers/elevenlabs_client.rb`:
  ElevenlabsClient::Settings.configure do |config|
    config.properties = {
      elevenlabs_base_uri: ENV["ELEVENLABS_BASE_URL"],
-     elevenlabs_api_key: ENV["ELEVENLABS_API_KEY"],
+     elevenlabs_api_key: ENV["ELEVENLABS_API_KEY"]
    }
  end
  ```

- Once configured this way, you can create clients without passing any parameters:
+ Set your environment variables:

- ```ruby
- client = ElevenlabsClient.new
- # Uses the configured settings automatically
+ ```bash
+ export ELEVENLABS_API_KEY="your_api_key_here"
+ export ELEVENLABS_BASE_URL="https://api.elevenlabs.io" # Optional, defaults to official API
  ```

- #### Alternative Configuration Syntax
-
- You can also use the module-level configure method:
+ #### Direct Configuration

  ```ruby
+ # Module-level configuration
  ElevenlabsClient.configure do |config|
    config.properties = {
      elevenlabs_base_uri: "https://api.elevenlabs.io",
      elevenlabs_api_key: "your_api_key_here"
    }
  end
- ```
-
- #### Configuration Precedence
-
- The client uses the following precedence order for configuration:
-
- 1. **Explicit parameters** passed to `Client.new` (highest priority)
- 2. **Settings.properties** configured via initializer
- 3. **Environment variables** (lowest priority)
-
- This allows you to set defaults in your initializer while still being able to override them when needed.

- ### Client Initialization
-
- There are several ways to create a client:
-
- ```ruby
- # Using environment variables (default behavior)
- client = ElevenlabsClient.new
-
- # Passing API key directly
- client = ElevenlabsClient::Client.new(api_key: "your_api_key_here")
-
- # Custom base URL
- client = ElevenlabsClient::Client.new(
+ # Or pass directly to client
+ client = ElevenlabsClient.new(
    api_key: "your_api_key_here",
-   base_url: "https://custom-api.elevenlabs.io"
- )
-
- # Custom environment variable names
- client = ElevenlabsClient::Client.new(
-   api_key_env: "MY_CUSTOM_API_KEY_VAR",
-   base_url_env: "MY_CUSTOM_BASE_URL_VAR"
+   base_url: "https://api.elevenlabs.io"
  )
  ```

  ### Basic Usage

  ```ruby
- require 'elevenlabs_client'
-
- # Create a client
+ # Initialize client (uses configured settings)
  client = ElevenlabsClient.new

- # Create a dubbing job
+ # Text-to-Speech
+ audio_data = client.text_to_speech.convert("21m00Tcm4TlvDq8ikWAM", "Hello, world!")
+ File.open("hello.mp3", "wb") { |f| f.write(audio_data) }
+
+ # Dubbing
  File.open("video.mp4", "rb") do |file|
    result = client.dubs.create(
      file_io: file,
      filename: "video.mp4",
-     target_languages: ["es", "pt", "fr"],
-     name: "My Video Dub",
-     drop_background_audio: true,
-     use_profanity_filter: false
+     target_languages: ["es", "fr", "de"]
    )
-
-   puts "Dubbing job created: #{result['dubbing_id']}"
  end

- # Check dubbing status
- dub_details = client.dubs.get("dubbing_id_here")
- puts "Status: #{dub_details['status']}"
+ # Dialogue Generation
+ dialogue = [
+   { text: "Hello, how are you?", voice_id: "voice_1" },
+   { text: "I'm doing great, thanks!", voice_id: "voice_2" }
+ ]
+ audio_data = client.text_to_dialogue.convert(dialogue)

- # List all dubbing jobs
- dubs = client.dubs.list(dubbing_status: "dubbed")
- puts "Completed dubs: #{dubs['dubs'].length}"
+ # Sound Generation
+ audio_data = client.sound_generation.generate("Ocean waves crashing on rocks")

- # Get dubbing resources (for editing)
- resources = client.dubs.resources("dubbing_id_here")
- puts "Audio files: #{resources['resources']['audio_files']}"
+ # Streaming Text-to-Speech
+ client.text_to_speech_stream.stream("voice_id", "Streaming text") do |chunk|
+   # Process audio chunk in real-time
+   puts "Received #{chunk.bytesize} bytes"
+ end
  ```

- ### Available Dubbing Methods
+ ## API Documentation
+
+ ### Core APIs

- The client provides access to all dubbing endpoints through the `client.dubs` interface:
+ - **[Dubbing API](docs/DUBBING.md)** - Create dubbed versions of audio/video content
+ - **[Text-to-Speech API](docs/TEXT_TO_SPEECH.md)** - Convert text to natural speech
+ - **[Text-to-Speech Streaming API](docs/TEXT_TO_SPEECH_STREAMING.md)** - Real-time audio streaming
+ - **[Text-to-Dialogue API](docs/TEXT_TO_DIALOGUE.md)** - Multi-speaker conversations
+ - **[Sound Generation API](docs/SOUND_GENERATION.md)** - AI-generated sound effects

- - `client.dubs.create(file_io:, filename:, target_languages:, **options)` - Create a new dubbing job
- - `client.dubs.get(dubbing_id)` - Get dubbing job details
- - `client.dubs.list(params = {})` - List dubbing jobs with optional filters
- - `client.dubs.resources(dubbing_id)` - Get dubbing resources for editing
+ ### Available Endpoints

- ## Supported Language Codes
+ | Endpoint | Description | Documentation |
+ |----------|-------------|---------------|
+ | `client.dubs.*` | Audio/video dubbing | [DUBBING.md](docs/DUBBING.md) |
+ | `client.text_to_speech.*` | Text-to-speech conversion | [TEXT_TO_SPEECH.md](docs/TEXT_TO_SPEECH.md) |
+ | `client.text_to_speech_stream.*` | Streaming TTS | [TEXT_TO_SPEECH_STREAMING.md](docs/TEXT_TO_SPEECH_STREAMING.md) |
+ | `client.text_to_dialogue.*` | Dialogue generation | [TEXT_TO_DIALOGUE.md](docs/TEXT_TO_DIALOGUE.md) |
+ | `client.sound_generation.*` | Sound effect generation | [SOUND_GENERATION.md](docs/SOUND_GENERATION.md) |
+
+ ## Configuration Options
+
+ ### Configuration Precedence
+
+ 1. **Explicit parameters** (highest priority)
+ 2. **Settings.properties** (configured via initializer)
+ 3. **Environment variables** (lowest priority)
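The three-level precedence listed above can be sketched as a single lookup. This is only an illustration under assumptions: `resolve_setting` is a hypothetical helper, and the gem's own resolution code is not shown in this part of the diff and may differ.

```ruby
# Hypothetical helper illustrating the documented precedence order.
# `settings` plays the role of Settings.properties and `env` the role of ENV;
# both are passed in so the sketch has no global state.
def resolve_setting(explicit:, settings:, key:, env:, env_name:)
  return explicit unless explicit.nil?            # 1. explicit parameter

  from_settings = settings[key]
  return from_settings unless from_settings.nil?  # 2. Settings.properties

  env[env_name]                                   # 3. environment variable
end
```

Passing the environment as an argument is a design choice for the sketch only; it keeps the lookup easy to exercise with plain hashes.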

- Common target languages include:
- - `es` - Spanish
- - `pt` - Portuguese
- - `fr` - French
- - `de` - German
- - `it` - Italian
- - `pl` - Polish
- - `ja` - Japanese
- - `ko` - Korean
- - `zh` - Chinese
- - `hi` - Hindi
+ ### Environment Variables
+
+ - `ELEVENLABS_API_KEY` - Your ElevenLabs API key (required)
+ - `ELEVENLABS_BASE_URL` - API base URL (optional, defaults to `https://api.elevenlabs.io`)
+
+ ### Custom Environment Variable Names
+
+ ```ruby
+ client = ElevenlabsClient.new(
+   api_key_env: "CUSTOM_API_KEY_VAR",
+   base_url_env: "CUSTOM_BASE_URL_VAR"
+ )
+ ```

  ## Error Handling

- The client raises specific exceptions for different error conditions:
+ The client provides specific exception types for different error conditions:

  ```ruby
  begin
-   client.create_dub(...)
- rescue ElevenlabsClient::AuthenticationError => e
-   puts "Invalid API key: #{e.message}"
- rescue ElevenlabsClient::RateLimitError => e
-   puts "Rate limit exceeded: #{e.message}"
+   result = client.text_to_speech.convert(voice_id, text)
+ rescue ElevenlabsClient::AuthenticationError
+   puts "Invalid API key"
+ rescue ElevenlabsClient::RateLimitError
+   puts "Rate limit exceeded"
  rescue ElevenlabsClient::ValidationError => e
-   puts "Validation error: #{e.message}"
+   puts "Invalid parameters: #{e.message}"
  rescue ElevenlabsClient::APIError => e
    puts "API error: #{e.message}"
  end
  ```

+ ### Exception Types
+
+ - `AuthenticationError` - Invalid API key or authentication failure
+ - `RateLimitError` - Rate limit exceeded
+ - `ValidationError` - Invalid request parameters
+ - `APIError` - General API errors
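Of the exception types listed above, `RateLimitError` is the one callers most often retry. A minimal sketch of exponential backoff follows. `with_rate_limit_retry` and `SketchRateLimitError` are hypothetical names (the latter stands in for `ElevenlabsClient::RateLimitError` so the example runs without the gem), and the delay values are arbitrary.

```ruby
# Stand-in for ElevenlabsClient::RateLimitError; defined locally so the
# sketch does not depend on the gem being installed.
class SketchRateLimitError < StandardError; end

# Hypothetical helper: runs the block, retrying on the rate-limit error with
# delays of base_delay, 2 * base_delay, 4 * base_delay, and so on.
# `sleeper` is injectable so callers (and tests) can avoid real sleeping.
def with_rate_limit_retry(max_attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue SketchRateLimitError
    raise if attempts >= max_attempts # give up and re-raise the last error

    sleeper.call(base_delay * (2**(attempts - 1)))
    retry
  end
end
```

With the real client, the block would wrap a call such as `client.text_to_speech.convert(voice_id, text)` and the rescue clause would name the gem's own error class.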
+
+ ## Rails Integration
+
+ The gem is designed to work seamlessly with Rails applications. See the [examples](examples/) directory for complete controller implementations:
+
+ - [DubsController](examples/dubs_controller.rb) - Complete dubbing workflow
+ - [TextToSpeechController](examples/text_to_speech_controller.rb) - TTS with error handling
+ - [StreamingAudioController](examples/streaming_audio_controller.rb) - Real-time streaming
+ - [TextToDialogueController](examples/text_to_dialogue_controller.rb) - Dialogue generation
+ - [SoundGenerationController](examples/sound_generation_controller.rb) - Sound effects
+
  ## Development

- After checking out the repo, run `bundle install` to install dependencies.
+ After checking out the repo, run:
+
+ ```bash
+ bin/setup # Install dependencies
+ bundle exec rspec # Run tests
+ ```
+
+ To install this gem onto your local machine:
+
+ ```bash
+ bundle exec rake install
+ ```
+
+ To release a new version:
+
+ 1. Update the version number in `version.rb`
+ 2. Update `CHANGELOG.md`
+ 3. Run `bundle exec rake release`
+
+ ## Testing
+
+ The gem includes comprehensive test coverage with RSpec:
+
+ ```bash
+ # Run all tests
+ bundle exec rspec
+
+ # Run specific test files
+ bundle exec rspec spec/elevenlabs_client/endpoints/
+ bundle exec rspec spec/integration/
+
+ # Run with documentation format
+ bundle exec rspec --format documentation
+ ```

  ## Contributing

- Bug reports and pull requests are welcome on GitHub.
+ Bug reports and pull requests are welcome on GitHub at https://github.com/yourusername/elevenlabs_client.
+
+ 1. Fork it
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
+ 4. Push to the branch (`git push origin my-new-feature`)
+ 5. Create a new Pull Request

  ## License

  The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+
+ ## Changelog
+
+ See [CHANGELOG.md](CHANGELOG.md) for a detailed list of changes and version history.
+
+ ## Support
+
+ - 📖 **Documentation**: [API Documentation](docs/)
+ - 🐛 **Issues**: [GitHub Issues](https://github.com/yourusername/elevenlabs_client/issues)
+ - 💬 **Discussions**: [GitHub Discussions](https://github.com/yourusername/elevenlabs_client/discussions)
+
+ ---
+
+ Made with ❤️ for the Ruby community
data/lib/elevenlabs_client/client.rb CHANGED
@@ -7,13 +7,17 @@ module ElevenlabsClient
    class Client
      DEFAULT_BASE_URL = "https://api.elevenlabs.io"

-     attr_reader :base_url, :api_key, :dubs
+     attr_reader :base_url, :api_key, :dubs, :text_to_speech, :text_to_speech_stream, :text_to_dialogue, :sound_generation

      def initialize(api_key: nil, base_url: nil, api_key_env: "ELEVENLABS_API_KEY", base_url_env: "ELEVENLABS_BASE_URL")
        @api_key = api_key || fetch_api_key(api_key_env)
        @base_url = base_url || fetch_base_url(base_url_env)
        @conn = build_connection
        @dubs = Dubs.new(self)
+       @text_to_speech = TextToSpeech.new(self)
+       @text_to_speech_stream = TextToSpeechStream.new(self)
+       @text_to_dialogue = TextToDialogue.new(self)
+       @sound_generation = SoundGeneration.new(self)
      end

      # Makes an authenticated GET request
@@ -54,6 +58,62 @@ module ElevenlabsClient
        handle_response(response)
      end

+     # Makes an authenticated POST request expecting binary response
+     # @param path [String] API endpoint path
+     # @param body [Hash, nil] Request body
+     # @return [String] Binary response body
+     def post_binary(path, body = nil)
+       response = @conn.post(path) do |req|
+         req.headers["xi-api-key"] = api_key
+         req.headers["Content-Type"] = "application/json"
+         req.body = body.to_json if body
+       end
+
+       handle_binary_response(response)
+     end
+
+     # Makes an authenticated POST request with custom headers
+     # @param path [String] API endpoint path
+     # @param body [Hash, nil] Request body
+     # @param custom_headers [Hash] Additional headers
+     # @return [String] Response body (binary or text)
+     def post_with_custom_headers(path, body = nil, custom_headers = {})
+       response = @conn.post(path) do |req|
+         req.headers["xi-api-key"] = api_key
+         req.headers["Content-Type"] = "application/json"
+         custom_headers.each { |key, value| req.headers[key] = value }
+         req.body = body.to_json if body
+       end
+
+       # For streaming/binary responses, return raw body
+       if custom_headers["Accept"]&.include?("audio") || custom_headers["Transfer-Encoding"] == "chunked"
+         handle_binary_response(response)
+       else
+         handle_response(response)
+       end
+     end
+
+     # Makes an authenticated POST request with streaming response
+     # @param path [String] API endpoint path
+     # @param body [Hash, nil] Request body
+     # @param block [Proc] Block to handle each chunk
+     # @return [Faraday::Response] Response object
+     def post_streaming(path, body = nil, &block)
+       response = @conn.post(path) do |req|
+         req.headers["xi-api-key"] = api_key
+         req.headers["Content-Type"] = "application/json"
+         req.headers["Accept"] = "audio/mpeg"
+         req.body = body.to_json if body
+
+         # Set up streaming callback
+         req.options.on_data = proc do |chunk, _|
+           block.call(chunk) if block_given?
+         end
+       end
+
+       handle_streaming_response(response)
+     end
+
      # Helper method to create Faraday::Multipart::FilePart
      # @param file_io [IO] File IO object
      # @param filename [String] Original filename
@@ -108,6 +168,36 @@ module ElevenlabsClient
        end
      end

+     def handle_binary_response(response)
+       case response.status
+       when 200..299
+         response.body
+       when 401
+         raise AuthenticationError, "Invalid API key or authentication failed"
+       when 429
+         raise RateLimitError, "Rate limit exceeded"
+       when 400..499
+         raise ValidationError, "API request failed with status #{response.status}"
+       else
+         raise APIError, "API request failed with status #{response.status}"
+       end
+     end
+
+     def handle_streaming_response(response)
+       case response.status
+       when 200..299
+         response
+       when 401
+         raise AuthenticationError, "Invalid API key or authentication failed"
+       when 429
+         raise RateLimitError, "Rate limit exceeded"
+       when 400..499
+         raise ValidationError, "API request failed with status #{response.status}"
+       else
+         raise APIError, "API request failed with status #{response.status}"
+       end
+     end
+
      def mime_for(filename)
        ext = File.extname(filename).downcase
  case ext
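Both new response handlers in the hunk above share one status mapping, and the order of the `when` branches matters: `401` and `429` have to be matched before the broader `400..499` range. The sketch below restates that mapping in isolation. `StatusSketch` and its locally defined error classes are stand-ins for illustration, not the gem's own constants.

```ruby
# Stand-in namespace; the real classes live under ElevenlabsClient.
module StatusSketch
  AuthenticationError = Class.new(StandardError)
  RateLimitError      = Class.new(StandardError)
  ValidationError     = Class.new(StandardError)
  APIError            = Class.new(StandardError)

  # Returns nil for a successful status, otherwise the error class that the
  # handlers in the diff would raise for it.
  def self.error_class_for(status)
    case status
    when 200..299 then nil
    when 401      then AuthenticationError # specific codes first
    when 429      then RateLimitError
    when 400..499 then ValidationError     # every other 4xx
    else               APIError            # 1xx, 3xx, 5xx, anything else
    end
  end
end
```

Under this mapping a `403` or `404` surfaces as `ValidationError`, and any 3xx or 5xx status as `APIError`.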
data/lib/elevenlabs_client/endpoints/sound_generation.rb ADDED
@@ -0,0 +1,46 @@
+ # frozen_string_literal: true
+
+ module ElevenlabsClient
+   class SoundGeneration
+     def initialize(client)
+       @client = client
+     end
+
+     # POST /v1/sound-generation
+     # Convert text to sound effects and retrieve audio (binary data)
+     # Documentation: https://elevenlabs.io/docs/api-reference/sound-generation
+     #
+     # @param text [String] Text prompt describing the sound effect
+     # @param options [Hash] Optional parameters
+     # @option options [Boolean] :loop Whether to create a looping sound effect (default: false)
+     # @option options [Float] :duration_seconds Duration in seconds (0.5 to 30, default: nil for auto-detection)
+     # @option options [Float] :prompt_influence Prompt influence (0.0 to 1.0, default: 0.3)
+     # @option options [String] :output_format Output format (e.g., "mp3_22050_32", default: "mp3_44100_128")
+     # @return [String] The binary audio data (usually an MP3)
+     def generate(text, **options)
+       endpoint = "/v1/sound-generation"
+       request_body = { text: text }
+
+       # Add optional parameters if provided
+       request_body[:loop] = options[:loop] unless options[:loop].nil?
+       request_body[:duration_seconds] = options[:duration_seconds] if options[:duration_seconds]
+       request_body[:prompt_influence] = options[:prompt_influence] if options[:prompt_influence]
+
+       # Handle output_format as query parameter
+       query_params = {}
+       query_params[:output_format] = options[:output_format] if options[:output_format]
+
+       # Build endpoint with query parameters if any
+       full_endpoint = query_params.any? ? "#{endpoint}?#{URI.encode_www_form(query_params)}" : endpoint
+
+       @client.post_binary(full_endpoint, request_body)
+     end
+
+     # Alias for backward compatibility and convenience
+     alias_method :sound_generation, :generate
+
+     private
+
+     attr_reader :client
+   end
+ end
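`generate` sends `output_format` as a query parameter and keeps the remaining options in the JSON body. The helper below sketches that path construction on its own. `endpoint_with_query` is a hypothetical name; the only library call is `URI.encode_www_form` from Ruby's standard library, which the sketch requires explicitly.

```ruby
require "uri"

# Hypothetical helper mirroring the path construction in `generate`:
# drops nil values, then appends a form-encoded query string if any remain.
def endpoint_with_query(path, params)
  present = params.reject { |_key, value| value.nil? }
  return path if present.empty?

  "#{path}?#{URI.encode_www_form(present)}"
end

# endpoint_with_query("/v1/sound-generation", { output_format: "mp3_22050_32" })
# => "/v1/sound-generation?output_format=mp3_22050_32"
```

Form-encoding the value also covers characters that would otherwise need escaping in a URL.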
data/lib/elevenlabs_client/endpoints/text_to_dialogue.rb ADDED
@@ -0,0 +1,40 @@
+ # frozen_string_literal: true
+
+ module ElevenlabsClient
+   class TextToDialogue
+     def initialize(client)
+       @client = client
+     end
+
+     # POST /v1/text-to-dialogue
+     # Converts a list of text and voice ID pairs into speech (dialogue) and returns audio.
+     # Documentation: https://elevenlabs.io/docs/api-reference/text-to-dialogue/convert
+     #
+     # @param inputs [Array<Hash>] A list of dialogue inputs, each containing text and a voice ID
+     # @option inputs [String] :text The text to be converted to speech
+     # @option inputs [String] :voice_id The voice ID to use for this text
+     # @param options [Hash] Optional parameters
+     # @option options [String] :model_id Identifier of the model to be used
+     # @option options [Hash] :settings Settings controlling the dialogue generation
+     # @option options [Integer] :seed Best effort to sample deterministically
+     # @return [String] The binary audio data (usually an MP3)
+     def convert(inputs, **options)
+       endpoint = "/v1/text-to-dialogue"
+       request_body = { inputs: inputs }
+
+       # Add optional parameters
+       request_body[:model_id] = options[:model_id] if options[:model_id]
+       request_body[:settings] = options[:settings] if options[:settings] && !options[:settings].empty?
+       request_body[:seed] = options[:seed] if options[:seed]
+
+       @client.post_binary(endpoint, request_body)
+     end
+
+     # Alias for backward compatibility and convenience
+     alias_method :text_to_dialogue, :convert
+
+     private
+
+     attr_reader :client
+   end
+ end
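`convert` forwards `inputs` to the API unchanged, so a malformed entry is only reported after a round trip. A caller-side check is one way to fail earlier. `build_dialogue_body` below is a hypothetical helper, and the rule that every entry needs a non-empty `:text` and `:voice_id` is inferred from the doc comments in the file above, not from the API specification.

```ruby
# Hypothetical pre-flight check and body builder. It expects symbol keys,
# matching the README example; string-keyed hashes would be rejected.
def build_dialogue_body(inputs, model_id: nil, seed: nil)
  inputs.each_with_index do |input, index|
    %i[text voice_id].each do |key|
      value = input[key]
      if value.nil? || value.to_s.strip.empty?
        raise ArgumentError, "inputs[#{index}] is missing :#{key}"
      end
    end
  end

  # Same shape as the request body assembled in `convert`.
  body = { inputs: inputs }
  body[:model_id] = model_id if model_id
  body[:seed] = seed if seed
  body
end
```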
data/lib/elevenlabs_client/endpoints/text_to_speech.rb ADDED
@@ -0,0 +1,50 @@
+ # frozen_string_literal: true
+
+ module ElevenlabsClient
+   class TextToSpeech
+     def initialize(client)
+       @client = client
+     end
+
+     # POST /v1/text-to-speech/{voice_id}
+     # Convert text to speech and retrieve audio (binary data)
+     # Documentation: https://elevenlabs.io/docs/api-reference/text-to-speech/convert
+     #
+     # @param voice_id [String] The ID of the voice to use
+     # @param text [String] Text to synthesize
+     # @param options [Hash] Optional TTS parameters
+     # @option options [String] :model_id Model to use (e.g. "eleven_monolingual_v1" or "eleven_multilingual_v1")
+     # @option options [Hash] :voice_settings Voice configuration (stability, similarity_boost, style, use_speaker_boost, etc.)
+     # @option options [Boolean] :optimize_streaming Whether to receive chunked streaming audio
+     # @return [String] The binary audio data (usually an MP3)
+     def convert(voice_id, text, **options)
+       endpoint = "/v1/text-to-speech/#{voice_id}"
+       request_body = { text: text }
+
+       # Add optional parameters
+       request_body[:model_id] = options[:model_id] if options[:model_id]
+       request_body[:voice_settings] = options[:voice_settings] if options[:voice_settings]
+
+       # Handle streaming optimization
+       if options[:optimize_streaming]
+         @client.post_with_custom_headers(endpoint, request_body, streaming_headers)
+       else
+         @client.post_binary(endpoint, request_body)
+       end
+     end
+
+     # Alias for backward compatibility and convenience
+     alias_method :text_to_speech, :convert
+
+     private
+
+     attr_reader :client
+
+     def streaming_headers
+       {
+         "Accept" => "audio/mpeg",
+         "Transfer-Encoding" => "chunked"
+       }
+     end
+   end
+ end
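The `:voice_settings` hash is passed through to the request body exactly as given. The sketch below builds such a hash from keyword arguments and clamps each value. `clamp_voice_settings` is a hypothetical helper, and the `0.0..1.0` range is an assumption about the API, not something this diff states.

```ruby
# Hypothetical helper: keeps only the settings that were supplied and clamps
# each one into 0.0..1.0 (an assumed range; check the API reference).
def clamp_voice_settings(stability: nil, similarity_boost: nil, style: nil)
  { stability: stability, similarity_boost: similarity_boost, style: style }
    .compact
    .transform_values { |value| value.to_f.clamp(0.0, 1.0) }
end
```

The result could then be passed as `voice_settings:` to `convert`. Silent clamping is a choice made for the sketch; raising on out-of-range input would be an equally reasonable design.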
data/lib/elevenlabs_client/endpoints/text_to_speech_stream.rb ADDED
@@ -0,0 +1,42 @@
+ # frozen_string_literal: true
+
+ module ElevenlabsClient
+   class TextToSpeechStream
+     def initialize(client)
+       @client = client
+     end
+
+     # POST /v1/text-to-speech/{voice_id}/stream
+     # Stream text-to-speech audio in real-time chunks
+     #
+     # @param voice_id [String] The ID of the voice to use
+     # @param text [String] Text to synthesize
+     # @param options [Hash] Optional TTS parameters
+     # @option options [String] :model_id Model to use (defaults to "eleven_multilingual_v2")
+     # @option options [String] :output_format Output format (defaults to "mp3_44100_128")
+     # @option options [Hash] :voice_settings Voice configuration
+     # @param block [Proc] Block to handle each audio chunk
+     # @return [Faraday::Response] The response object
+     def stream(voice_id, text, **options, &block)
+       output_format = options[:output_format] || "mp3_44100_128"
+       endpoint = "/v1/text-to-speech/#{voice_id}/stream?output_format=#{output_format}"
+
+       request_body = {
+         text: text,
+         model_id: options[:model_id] || "eleven_multilingual_v2"
+       }
+
+       # Add voice_settings if provided
+       request_body[:voice_settings] = options[:voice_settings] if options[:voice_settings]
+
+       @client.post_streaming(endpoint, request_body, &block)
+     end
+
+     # Alias for backward compatibility
+     alias_method :text_to_speech_stream, :stream
+
+     private
+
+     attr_reader :client
+   end
+ end
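Because `stream` yields each chunk to the caller's block, audio can be written out as it arrives instead of being buffered whole. A minimal sketch follows, with `FakeStream` standing in for `client.text_to_speech_stream` (it yields fixed strings and makes no HTTP request) and `stream_to_io` as a hypothetical helper.

```ruby
require "stringio"

# Stand-in for the streaming endpoint: yields preset chunks to the block.
class FakeStream
  def initialize(chunks)
    @chunks = chunks
  end

  def stream(_voice_id, _text, **_options)
    @chunks.each { |chunk| yield chunk }
    nil
  end
end

# Hypothetical helper: writes each chunk to `io` as it arrives and returns
# the total number of bytes written.
def stream_to_io(streamer, voice_id, text, io, **options)
  total = 0
  streamer.stream(voice_id, text, **options) do |chunk|
    io.write(chunk)
    total += chunk.bytesize
  end
  total
end
```

With the real client, `io` would typically be a file opened in `"wb"` mode and `streamer` would be `client.text_to_speech_stream`.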
data/lib/elevenlabs_client/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module ElevenlabsClient
-   VERSION = "0.1.0"
+   VERSION = "0.2.0"
  end
data/lib/elevenlabs_client.rb CHANGED
@@ -3,7 +3,11 @@
  require_relative "elevenlabs_client/version"
  require_relative "elevenlabs_client/errors"
  require_relative "elevenlabs_client/settings"
- require_relative "elevenlabs_client/dubs"
+ require_relative "elevenlabs_client/endpoints/dubs"
+ require_relative "elevenlabs_client/endpoints/text_to_speech"
+ require_relative "elevenlabs_client/endpoints/text_to_speech_stream"
+ require_relative "elevenlabs_client/endpoints/text_to_dialogue"
+ require_relative "elevenlabs_client/endpoints/sound_generation"
  require_relative "elevenlabs_client/client"

  module ElevenlabsClient
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: elevenlabs_client
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.2.0
  platform: ruby
  authors:
  - Vitor Oliveira
@@ -121,7 +121,11 @@ files:
  - README.md
  - lib/elevenlabs_client.rb
  - lib/elevenlabs_client/client.rb
- - lib/elevenlabs_client/dubs.rb
+ - lib/elevenlabs_client/endpoints/dubs.rb
+ - lib/elevenlabs_client/endpoints/sound_generation.rb
+ - lib/elevenlabs_client/endpoints/text_to_dialogue.rb
+ - lib/elevenlabs_client/endpoints/text_to_speech.rb
+ - lib/elevenlabs_client/endpoints/text_to_speech_stream.rb
  - lib/elevenlabs_client/errors.rb
  - lib/elevenlabs_client/settings.rb
  - lib/elevenlabs_client/version.rb