RubyGems - elevenlabs - Versions diffs - 0.0.4 → 0.0.6 - Mend

elevenlabs 0.0.4 → 0.0.6


Files changed (6)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 79aed88bf019f49fd61d4f198328b29a95c979c640918ba640e6fb8a9f09a90a
-  data.tar.gz: 7dfe933f008fa3ddfa4576eae3775565caff3000a3219cab0d3f8b4527b86a2a
+  metadata.gz: 22770e41ca0d3c88d2dc5f83e3e4d9de510610bf1a0adaf9bf675951a647ab30
+  data.tar.gz: 8f6ffc3ef844da02c3f45385a730ebbddfe3c711d11e1b983837153c8dbd859a
 SHA512:
-  metadata.gz: 0a5a3ea26078b7bf3bba5bc504225471a0f075d3f90524a865e0e06ba35f9008f198bb80e1174a560a54f68f5838ef785b7969994ee2353ceeec56c0ef30f231
-  data.tar.gz: 45f4e0bdcb8c42f025c400e5ca553ba24da0fffbf9e8c81041e48f4ad7bd70595d2949fa08ea5c7b26ad42cb0908807e89d66b4111a3e9d18d06bd5b979f0c3a
+  metadata.gz: 1b094e808358b342f7fe8cb08cf993dbafe2bac989bcb1e4655d5de5dd2884ff83626cb098da9b7c06d60a697302d33e848419f80a26fb53f34198f7894390b7
+  data.tar.gz: 44ad5334ed45f2628a22be91a0ed307b74f7e8611f6ba71cca0b4ee79f7e510f1f655e45321e47060b413e476ed9ea9913e304a44a789ec388bb7411235d42b4
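The entries above are SHA256 and SHA512 digests of the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. As a minimal sketch of how such a digest could be checked with Ruby's standard `Digest` library, the snippet below compares a file against an expected hex digest. `checksum_matches?` is a hypothetical helper, not part of the gem or of RubyGems, and the demo runs against a temporary file rather than a real gem archive.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper (not part of the gem): returns true when the file's
# SHA256 hex digest equals the expected value, compared case-insensitively.
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Demonstration with a temporary file standing in for data.tar.gz.
demo = Tempfile.new("demo")
demo.write("hello")
demo.flush

expected = Digest::SHA256.hexdigest("hello")
puts checksum_matches?(demo.path, expected)
```

To check a real release, the same comparison would be run on the extracted `data.tar.gz` with the `SHA256` value listed in `checksums.yaml`.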

data/README.md CHANGED

@@ -12,6 +12,8 @@ This gem provides an easy-to-use interface for:
 - **Editing an existing voice**
 - **Deleting a voice**
 - **Converting text to speech** and retrieving the generated audio
+- **Designing a voice** based on a text description
+- **Streaming text-to-speech audio**
 All requests are handled via [Faraday](https://github.com/lostisland/faraday).
@@ -39,6 +41,7 @@ All requests are handled via [Faraday](https://github.com/lostisland/faraday).
 - **Simple and intuitive API client** for ElevenLabs.
 - **Multipart file uploads** for training custom voices.
+- **Voice design** via text prompts to generate voice previews.
 - **Automatic authentication** via API key configuration.
 - **Error handling** with custom exceptions.
 - **Rails integration support** (including credentials storage).
@@ -52,16 +55,25 @@ Add the gem to your `Gemfile`:
 ```ruby
 gem "elevenlabs"
 ```
 Then run:
-```ruby
+```bash
 bundle install
 ```
 Or install it directly using:
-```ruby
+```bash
 gem install elevenlabs
 ```
-Usage
-Basic Example (Standalone Ruby)
+---
+## Usage
+### Basic Example (Standalone Ruby)
 ```ruby
 require "elevenlabs"
@@ -85,27 +97,58 @@ audio_data = client.text_to_speech(voice_id, text)
 # 5. Save the audio file
 File.open("output.mp3", "wb") { |f| f.write(audio_data) }
 puts "Audio file saved to output.mp3"
+# 6. Design a voice with a text prompt
+response = client.design_voice(
+  "A deep, resonant male voice with a British accent, suitable for storytelling",
+  output_format: "mp3_44100_192",
+  model_id: "eleven_multilingual_ttv_v2",
+  text: "In a land far away, where the mountains meet the sky, a great adventure began. Brave heroes embarked on a quest to find the lost artifact, facing challenges and forging bonds that would last a lifetime. Their journey took them through enchanted forests, across raging rivers, and into the heart of ancient ruins.",
+  auto_generate_text: false,
+  loudness: 0.5,
+  seed: 12345,
+  guidance_scale: 5.0,
+  stream_previews: false
+)
+# 7. Save voice preview audio
+require "base64"
+response["previews"].each_with_index do |preview, index|
+  audio_data = Base64.decode64(preview["audio_base_64"])
+  File.open("preview_#{index}.mp3", "wb") { |f| f.write(audio_data) }
+  puts "Saved preview #{index + 1} to preview_#{index}.mp3"
+end
 ```
 Note: You can override the API key per request:
 ```ruby
 client = Elevenlabs::Client.new(api_key: "DIFFERENT_API_KEY")
 ```
-Rails Integration
-Store API Key in Rails Credentials
+### Rails Integration
+#### Store API Key in Rails Credentials
 1. Open your encrypted credentials:
-```ruby
+```bash
 EDITOR=vim rails credentials:edit
 ```
 2. Add the ElevenLabs API key:
-```ruby
+```yaml
 eleven_labs:
   api_key: YOUR_SECURE_KEY
 ```
 3. Save and exit. Rails will securely encrypt your API key.
-Rails Initializer
-Create an initializer file: config/initializers/elevenlabs.rb
+#### Rails Initializer
+Create an initializer file: `config/initializers/elevenlabs.rb`
 ```ruby
 # config/initializers/elevenlabs.rb
 require "elevenlabs"
@@ -116,59 +159,108 @@ Rails.application.config.to_prepare do
   end
 end
 ```
 Now you can simply call:
 ```ruby
 client = Elevenlabs::Client.new
 ```
 without manually providing an API key.
-Endpoints
-1. List Voices
+#### Controller Example
+```ruby
+class AudioController < ApplicationController
+  def generate
+    client = Elevenlabs::Client.new
+    voice_id = params[:voice_id]
+    text = params[:text]
+    begin
+      audio_data = client.text_to_speech(voice_id, text)
+      send_data audio_data, type: "audio/mpeg", disposition: "attachment", filename: "output.mp3"
+    rescue Elevenlabs::APIError => e
+      render json: { error: e.message }, status: :bad_request
+    end
+  end
+end
+```
+---
+## Endpoints
+1. **List Voices**
 ```ruby
 client.list_voices
 # => { "voices" => [...] }
 ```
-2. Get Voice Details
+2. **List Models**
+```ruby
+client.list_models
+# => [...]
+```
+3. **Get Voice Details**
 ```ruby
 client.get_voice("VOICE_ID")
 # => { "voice_id" => "...", "name" => "...", ... }
 ```
-3. Create a Custom Voice
+4. **Create a Custom Voice**
 ```ruby
 sample_files = [File.open("sample1.mp3", "rb")]
 client.create_voice("Custom Voice", sample_files, description: "My custom AI voice")
 # => JSON response with new voice details
 ```
-4. Check if a voice is banned?
+5. **Check if a Voice is Banned**
 ```ruby
 sample_files = [File.open("trump.mp3", "rb")]
 client.create_voice("Donald Trump", sample_files, description: "My Trump Voice")
-  => {"voice_id"=>"<RETURNED_VOICE_ID>", "requires_verification"=>false}
-  trump= "<RETURNED_VOICE_ID>"
-  client.banned? trump
-=> true
+# => {"voice_id"=>"<RETURNED_VOICE_ID>", "requires_verification"=>false}
+trump = "<RETURNED_VOICE_ID>"
+client.banned?(trump)
+# => true
 ```
-5. Edit a Voice
+6. **Edit a Voice**
 ```ruby
 client.edit_voice("VOICE_ID", name: "Updated Voice Name")
 # => JSON response with updated details
 ```
-6. Delete a Voice
+7. **Delete a Voice**
 ```ruby
 client.delete_voice("VOICE_ID")
 # => JSON response acknowledging deletion
 ```
-7. Convert Text to Speech
+8. **Convert Text to Speech**
 ```ruby
 audio_data = client.text_to_speech("VOICE_ID", "Hello world!")
 File.open("output.mp3", "wb") { |f| f.write(audio_data) }
 ```
-8 Stream Text to Speech
-stream from terminal
-```ruby
-Mac: brew install sox
-Linux: sudo apt install sox
+9. **Stream Text to Speech**
+Stream from terminal:
+```bash
+# Mac: Install sox
+brew install sox
+# Linux: Install sox
+sudo apt install sox
+```
+```ruby
 IO.popen("play -t mp3 -", "wb") do |audio_pipe| # Notice "wb" (write binary)
   client.text_to_speech_stream("VOICE_ID", "Some text to stream back in chunks") do |chunk|
     audio_pipe.write(chunk.b) # Ensure chunk is written as binary
@@ -176,19 +268,71 @@ IO.popen("play -t mp3 -", "wb") do |audio_pipe| # Notice "wb" (write binary)
 end
 ```
-Error Handling
+10. **Create a Voice from a Design**
+Once you’ve generated a voice design using `client.design_voice`, you can turn it into a permanent voice in your account by passing its `generated_voice_id` to `client.create_from_generated_voice`.
+```ruby
+# Step 1: Design a voice (returns previews + generated_voice_id)
+design_response = client.design_voice(
+  "A warm, friendly female voice with a slight Australian accent",
+  model_id: "eleven_multilingual_ttv_v2",
+  text: "Welcome to our podcast, where every story is an adventure, taking you on a journey through fascinating worlds, inspiring voices, and unforgettable moments.",
+  auto_generate_text: false
+)
+# Three previews are returned; this example uses the first one.
+generated_voice_id = design_response["previews"].first["generated_voice_id"]
+# Step 2: Create the permanent voice
+create_response = client.create_from_generated_voice(
+  "Friendly Aussie",
+  "A warm, friendly Australian-accented voice for podcasts",
+  generated_voice_id
+)
+voice_id = create_response["voice_id"] # This is the ID you can use for TTS
+# Step 3: Use the new voice for TTS
+audio_data = client.text_to_speech(voice_id, "This is my new permanent designed voice.")
+File.open("friendly_aussie.mp3", "wb") { |f| f.write(audio_data) }
+```
+Important notes:
+- Always store the `voice_id` returned by `create_from_generated_voice`. This is the permanent identifier for TTS.
+- Designed voices cannot be used for TTS until they are created in your account.
+- If the voice is not immediately available for TTS, wait a few seconds or check its status via `client.get_voice(voice_id)` until it’s "active".
+11. **Create a Multi-Speaker Dialogue**
+```ruby
+inputs = [{text: "It smells like updog in here", voice_id: "TX3LPaxmHKxFdv7VOQHJ"}, {text: "What's updog?", voice_id: "RILOU7YmBhvwJGDGjNmP"}, {text: "Not much, you?", voice_id: "TX3LPaxmHKxFdv7VOQHJ"}]
+audio_data = client.text_to_dialogue(inputs)
+File.open("what's updog.mp3", "wb") { |f| f.write(audio_data) }
+```
+---
+## Error Handling
 When the API returns an error, the gem raises specific exceptions:
-Exception	Meaning
-Elevenlabs::BadRequestError	Invalid request parameters
-Elevenlabs::AuthenticationError	Invalid API key
-Elevenlabs::NotFoundError	Resource (voice) not found
-Elevenlabs::APIError	General API failure
+| Exception                     | Meaning                          |
+|-------------------------------|----------------------------------|
+| `Elevenlabs::BadRequestError` | Invalid request parameters       |
+| `Elevenlabs::AuthenticationError` | Invalid API key              |
+| `Elevenlabs::NotFoundError`   | Resource (voice) not found       |
+| `Elevenlabs::UnprocessableEntityError` | Unprocessable entity (e.g., invalid input format) |
+| `Elevenlabs::APIError`        | General API failure              |
 Example:
 ```ruby
 begin
-  client.text_to_speech("INVALID_VOICE_ID", "Test")
+  client.design_voice("Short description") # Too short, will raise error
+rescue Elevenlabs::UnprocessableEntityError => e
+  puts "Validation error: #{e.message}"
 rescue Elevenlabs::AuthenticationError => e
   puts "Invalid API key: #{e.message}"
 rescue Elevenlabs::NotFoundError => e
@@ -198,38 +342,54 @@ rescue Elevenlabs::APIError => e
 end
 ```
-Development
-Clone this repository
+---
+## Development
+Clone this repository:
 ```bash
 git clone https://github.com/your-username/elevenlabs.git
 cd elevenlabs
 ```
-Install dependencies
+Install dependencies:
 ```bash
 bundle install
 ```
-Build the gem
+Build the gem:
 ```bash
 gem build elevenlabs.gemspec
 ```
-Install the gem locally
+Install the gem locally:
 ```bash
-gem install ./elevenlabs-0.0.3.gem
+gem install ./elevenlabs-0.0.5.gem
 ```
-Contributing
+---
+## Contributing
 Contributions are welcome! Please follow these steps:
-Fork the repository
-Create a feature branch (git checkout -b feature/my-new-feature)
-Commit your changes (git commit -am 'Add new feature')
-Push to your branch (git push origin feature/my-new-feature)
-Create a Pull Request describing your changes
+1. Fork the repository
+2. Create a feature branch (`git checkout -b feature/my-new-feature`)
+3. Commit your changes (`git commit -am 'Add new feature'`)
+4. Push to your branch (`git push origin feature/my-new-feature`)
+5. Create a Pull Request describing your changes
 For bug reports, please open an issue with details.
-License
+---
+## License
 This project is licensed under the MIT License. See the LICENSE file for details.
-⭐ Thank you for using the Elevenlabs Ruby Gem!
+⭐ Thank you for using the Elevenlabs Ruby Gem!
 If you have any questions or suggestions, feel free to open an issue or submit a Pull Request!
-# elevenlabs
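The README changes above show voice previews arriving as Base64 strings under `"previews"`. A minimal sketch of the decoding step, run against a stubbed response so that no API call is made, might look like the following. `decode_previews` is a hypothetical helper, and the response shape (`"audio_base_64"`, `"generated_voice_id"`) is taken from the README example rather than verified against the live API.

```ruby
require "base64"

# Hypothetical helper (not part of the gem): decodes each preview's
# Base64 audio into raw bytes. Assumes the response shape shown in the
# README: "previews" entries with "audio_base_64" and "generated_voice_id".
def decode_previews(response)
  response.fetch("previews", []).map do |preview|
    {
      id: preview["generated_voice_id"],
      audio: Base64.decode64(preview["audio_base_64"])
    }
  end
end

# Stubbed response for illustration only; the ID and bytes are made up.
stub = {
  "previews" => [
    {
      "generated_voice_id" => "demo-id",
      "audio_base_64" => Base64.strict_encode64("fake-bytes")
    }
  ]
}

decoded = decode_previews(stub)
puts decoded.first[:id]
```

Each `:audio` value can then be written to disk in binary mode (`"wb"`), as the README does.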

data/lib/elevenlabs/client.rb CHANGED

@@ -88,6 +88,133 @@ module Elevenlabs
       handle_error(e)
     end
+    #####################################################
+    #              Text-to-Dialogue                     #
+    #    (POST /v1/text-to-dialogue)                    #
+    #####################################################
+    # Converts a list of text and voice ID pairs into speech (dialogue) and returns audio.
+    # Documentation: https://elevenlabs.io/docs/api-reference/text-to-dialogue/convert
+    #
+    # @param [Array<Hash>] inputs - A list of dialogue inputs, each containing text and a voice ID, which will be converted into speech
+    #   :text => String
+    #   :voice_id => String
+    # @param [String] model_id - Optional identifier of the model to be used
+    # @param [Hash] settings - Optional settings controlling the dialogue generation
+    #   :stability => Float - 0.0 = Creative, 0.5 = Natural, 1.0 = Robust
+    #   :use_speaker_boost => Boolean
+    # @param [Integer] seed - Optional; best effort to sample deterministically.
+    #
+    # @return [String] The binary audio data (usually an MP3).
+    def text_to_dialogue(inputs, model_id = nil, settings = {}, seed = nil)
+      endpoint = "/v1/text-to-dialogue"
+      request_body = {}.tap do |r|
+        r[:inputs] = inputs
+        r[:model_id] = model_id if model_id
+        r[:settings] = settings unless settings.empty?
+        r[:seed] = seed if seed
+      end
+      headers = default_headers
+      headers["Accept"] = "audio/mpeg"
+      response = @connection.post(endpoint) do |req|
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      # Returns raw binary data (often MP3)
+      response.body
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    #####################################################
+    #                  Design a Voice                   #
+    #      (POST /v1/text-to-voice/design)              #
+    #####################################################
+    # Designs a voice based on a description
+    # Documentation: https://elevenlabs.io/docs/api-reference/text-to-voice/design
+    #
+    # @param [String] voice_description - Description of the voice (20-1000 characters)
+    # @param [Hash] options - Optional parameters
+    #   :output_format              => String   (e.g., "mp3_44100_192", default: "mp3_44100_192")
+    #   :model_id                  => String   (e.g., "eleven_multilingual_ttv_v2", "eleven_ttv_v3")
+    #   :text                      => String   (100-1000 characters, optional)
+    #   :auto_generate_text        => Boolean  (default: false)
+    #   :loudness                  => Float    (-1 to 1, default: 0.5)
+    #   :seed                      => Integer  (0 to 2147483647, optional)
+    #   :guidance_scale            => Float    (0 to 100, default: 5)
+    #   :stream_previews           => Boolean  (default: false)
+    #   :remixing_session_id       => String   (optional)
+    #   :remixing_session_iteration_id => String (optional)
+    #   :quality                   => Float    (-1 to 1, optional)
+    #   :reference_audio_base64    => String   (base64 encoded audio, optional, requires eleven_ttv_v3)
+    #   :prompt_strength           => Float    (0 to 1, optional, requires eleven_ttv_v3)
+    #
+    # @return [Hash] JSON response containing previews and text
+    def design_voice(voice_description, options = {})
+      endpoint = "/v1/text-to-voice/design"
+      request_body = { voice_description: voice_description }
+      # Add optional parameters if provided
+      request_body[:output_format] = options[:output_format] if options[:output_format]
+      request_body[:model_id] = options[:model_id] if options[:model_id]
+      request_body[:text] = options[:text] if options[:text]
+      request_body[:auto_generate_text] = options[:auto_generate_text] unless options[:auto_generate_text].nil?
+      request_body[:loudness] = options[:loudness] if options[:loudness]
+      request_body[:seed] = options[:seed] if options[:seed]
+      request_body[:guidance_scale] = options[:guidance_scale] if options[:guidance_scale]
+      request_body[:stream_previews] = options[:stream_previews] unless options[:stream_previews].nil?
+      request_body[:remixing_session_id] = options[:remixing_session_id] if options[:remixing_session_id]
+      request_body[:remixing_session_iteration_id] = options[:remixing_session_iteration_id] if options[:remixing_session_iteration_id]
+      request_body[:quality] = options[:quality] if options[:quality]
+      request_body[:reference_audio_base64] = options[:reference_audio_base64] if options[:reference_audio_base64]
+      request_body[:prompt_strength] = options[:prompt_strength] if options[:prompt_strength]
+      response = @connection.post(endpoint) do |req|
+        req.headers = default_headers
+        req.body = request_body.to_json
+      end
+      JSON.parse(response.body)
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    #####################################################
+    #                  Create a Voice                   #
+    #      (POST /v1/text-to-voice)                     #
+    #####################################################
+    # Creates a voice from the designed voice generated_voice_id
+    # Documentation: https://elevenlabs.io/docs/api-reference/text-to-voice
+    #
+    # @param [String] voice_name - Name of the voice
+    # @param [String] voice_description - Description of the voice (20-1000 characters)
+    # @param [String] generated_voice_id - The generated voice ID from design_voice
+    # @param [Hash] labels - Optional metadata for the voice
+    # @param [Array<String>] played_not_selected_voice_ids - Optional list of voice IDs played but not selected
+    #
+    # @return [Hash] JSON response containing voice_id and other voice details
+    def create_from_generated_voice(voice_name, voice_description, generated_voice_id, labels: nil, played_not_selected_voice_ids: nil)
+      endpoint = "/v1/text-to-voice"
+      request_body = {
+        voice_name: voice_name,
+        voice_description: voice_description,
+        generated_voice_id: generated_voice_id,
+        labels: labels,
+        played_not_selected_voice_ids: played_not_selected_voice_ids
+      }.compact
+      response = @connection.post(endpoint) do |req|
+        req.headers = default_headers
+        req.body = request_body.to_json
+      end
+      JSON.parse(response.body)
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
     #####################################################
     #                     GET Voices                    #
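The `design_voice` method in the hunk above copies each option into the request body with its own conditional. As a sketch of an alternative, and not the gem's implementation, the same body could be built from an allow-list with `Hash#slice` and `Hash#compact`. `DESIGN_OPTION_KEYS` and `build_design_body` are hypothetical names. One behavioural difference to be aware of: `compact` drops only `nil`, so an explicit `false` is kept for every key, whereas the original keeps `false` only for `auto_generate_text` and `stream_previews`.

```ruby
require "json"

# Hypothetical allow-list mirroring the options documented for design_voice.
DESIGN_OPTION_KEYS = %i[
  output_format model_id text auto_generate_text loudness seed
  guidance_scale stream_previews remixing_session_id
  remixing_session_iteration_id quality reference_audio_base64
  prompt_strength
].freeze

# Builds the request body: keeps only known keys and drops nil values,
# so an explicit false (e.g. auto_generate_text: false) is still sent.
def build_design_body(voice_description, options = {})
  { voice_description: voice_description }
    .merge(options.slice(*DESIGN_OPTION_KEYS).compact)
end

body = build_design_body(
  "A calm narrator voice with a soft tone",
  auto_generate_text: false, seed: nil, unknown_key: 1
)
puts body.to_json
```

Here `seed: nil` and the unknown key are dropped, while `auto_generate_text: false` survives into the body.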
@@ -108,6 +235,25 @@ module Elevenlabs
       handle_error(e)
     end
+    #####################################################
+    #                     GET Models                    #
+    #                  (GET /v1/models)                 #
+    #####################################################
+    # Gets a list of available models
+    # Documentation: https://elevenlabs.io/docs/api-reference/models/list
+    #
+    # @return [Array<Hash>] The parsed JSON response: an array of models
+    def list_models
+      endpoint = "/v1/models"
+      response = @connection.get(endpoint) do |req|
+        req.headers = default_headers
+      end
+      JSON.parse(response.body)
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
     #####################################################
     #                 GET a Single Voice                #
     #               (GET /v1/voices/{voice_id})         #
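Among the methods added to `client.rb` in this diff, `text_to_dialogue` expects `inputs` as an array of hashes with `:text` and `:voice_id`. The sketch below shows one way a caller might assemble that array from a script of speaker and line pairs. `build_dialogue_inputs` is a hypothetical helper, and the voice IDs are placeholders.

```ruby
# Hypothetical helper (not part of the gem): turns [speaker, line] pairs
# into the array of { text:, voice_id: } hashes that text_to_dialogue
# takes as its `inputs` argument.
def build_dialogue_inputs(script, voices)
  script.map do |speaker, line|
    voice_id = voices.fetch(speaker) do
      raise ArgumentError, "no voice configured for #{speaker}"
    end
    { text: line, voice_id: voice_id }
  end
end

# Placeholder voice IDs for illustration only.
voices = { alice: "VOICE_ID_A", bob: "VOICE_ID_B" }
script = [
  [:alice, "It smells like updog in here"],
  [:bob, "What's updog?"]
]

inputs = build_dialogue_inputs(script, voices)
puts inputs.length
```

The resulting `inputs` array could then be passed as the first argument to `client.text_to_dialogue`.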

data/lib/elevenlabs/errors.rb CHANGED

@@ -6,6 +6,6 @@ module Elevenlabs
   class AuthenticationError < Error; end
   class NotFoundError < Error; end
   class BadRequestError < Error; end
-  # ... add more as needed ...
+  class UnprocessableEntityError < Error; end
 end
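The diff adds `UnprocessableEntityError` but does not show how `handle_error` chooses an exception class. One plausible arrangement, offered purely as an assumption, is a lookup from HTTP status code to error class with a general fallback. The sketch uses a stand-in `DemoErrors` module so it runs without the gem; the status-to-class mapping and the `error_class_for` name are hypothetical.

```ruby
# Stand-in error hierarchy so the sketch runs without the gem installed.
# The class names mirror the README table; the hierarchy is assumed.
module DemoErrors
  class Error < StandardError; end
  class APIError < Error; end
  class AuthenticationError < Error; end
  class NotFoundError < Error; end
  class BadRequestError < Error; end
  class UnprocessableEntityError < Error; end

  # Assumed mapping from HTTP status to exception class.
  STATUS_MAP = {
    400 => BadRequestError,
    401 => AuthenticationError,
    404 => NotFoundError,
    422 => UnprocessableEntityError
  }.freeze

  # Returns the exception class for a status, falling back to APIError.
  def self.error_class_for(status)
    STATUS_MAP.fetch(status, APIError)
  end
end

puts DemoErrors.error_class_for(422)
```

A table like this keeps the status handling in one place, so supporting a new status means adding one entry rather than another branch.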

data/lib/elevenlabs.rb CHANGED

@@ -5,7 +5,7 @@ require_relative "elevenlabs/client"
 require_relative "elevenlabs/errors"
 module Elevenlabs
-  VERSION = "0.0.3"
+  VERSION = "0.0.6"
   # Optional global configuration
   class << self

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: elevenlabs
 version: !ruby/object:Gem::Version
-  version: 0.0.4
+  version: 0.0.6
 platform: ruby
 authors:
 - hackliteracy
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-06-20 00:00:00.000000000 Z
+date: 2025-08-23 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: faraday
@@ -39,7 +39,7 @@ dependencies:
       - !ruby/object:Gem::Version
         version: '1.1'
 description: This gem provides a convenient Ruby interface to the ElevenLabs TTS,
-  Voice Cloning, and Streaming endpoints.
+  Voice Cloning, Voice Design, Voice dialogues and Streaming endpoints.
 email:
 - hackliteracy@gmail.com
 executables: []