RubyGems - elevenlabs - Versions diffs - 0.0.5 → 0.0.7 - Mend

elevenlabs 0.0.5 → 0.0.7

Files changed (5)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 59c61b4f80cd6efaaa39b93d6ab8ebfdc3a3edce59d2012a984ce8b9f032352d
-  data.tar.gz: 06c3253d6d21cd59fa5942620de9dee173a3ac926c484dcd17ca378793f6920e
+  metadata.gz: 2daafae7b6dbf3724b93ce2022b2fe6ac3703bfbcac12326b75e1a37cd188a39
+  data.tar.gz: ba2227a765efc7538e4aadbe0fcb0917a55c1ba70540a2660b4c75b2545f85da
 SHA512:
-  metadata.gz: 5f1cd0d00b602fe88356591d3292dfc6154d722eab9904dea0ec0cd99a5340651be30f44e0470783c1504cabeeed4a0e4dcfab6732c742a650c3c4161129c775
-  data.tar.gz: b9e3f4821d6e355f811f0d783b36cec64dd103a83713eb13a3ff3ae8c886009294371df5de825ff9287be969b92489b19c90ce426584a81ddb34d8f6e5e97182
+  metadata.gz: 07d40969dd5fdf8926c2f09c21359df4b5060b1f212797de03ef16c4fcf0dc2b6495c476a9a0741247dccf95dd30fde8d6da7404370b6acc4b61a0ea0ce8f7cd
+  data.tar.gz: 927e01fdc01e4f985466b62e2725f676117f757d3f809d8b4da7ea420e54c0e0a57ab2279cf031dd466451ce8c7a96dd76219da4a7e3d02ada842682a20246f8
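
The new digests can be checked against a locally unpacked copy of the gem. The following is a minimal sketch, assuming you have the text of `checksums.yaml` and a local path to `data.tar.gz` or `metadata.gz`; `sha256_matches?` is a hypothetical helper written for this illustration, not something RubyGems or the gem provides.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: compares the SHA256 digest of a local file with the
# value recorded for `file_name` in a checksums.yaml document.
def sha256_matches?(checksums_yaml, file_name, file_path)
  expected = YAML.safe_load(checksums_yaml).dig("SHA256", file_name)
  return false if expected.nil? # no entry recorded for this file

  Digest::SHA256.file(file_path).hexdigest == expected
end
```

A call such as `sha256_matches?(File.read("checksums.yaml"), "data.tar.gz", "data.tar.gz")` would then report whether the archive on disk corresponds to the recorded digest.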

data/README.md CHANGED

@@ -14,6 +14,7 @@ This gem provides an easy-to-use interface for:
 - **Converting text to speech** and retrieving the generated audio
 - **Designing a voice** based on a text description
 - **Streaming text-to-speech audio**
+- **Music Generation**
 All requests are handled via [Faraday](https://github.com/lostisland/faraday).
@@ -196,16 +197,20 @@ end
 ```ruby
 client.list_voices
 # => { "voices" => [...] }
-```
-2. **Get Voice Details**
+2. List Models
+client.list_models
+# => [...]
+3. **Get Voice Details**
 ```ruby
 client.get_voice("VOICE_ID")
 # => { "voice_id" => "...", "name" => "...", ... }
 ```
-3. **Create a Custom Voice**
+4. **Create a Custom Voice**
 ```ruby
 sample_files = [File.open("sample1.mp3", "rb")]
@@ -213,7 +218,7 @@ client.create_voice("Custom Voice", sample_files, description: "My custom AI voi
 # => JSON response with new voice details
 ```
-4. **Check if a Voice is Banned**
+5. **Check if a Voice is Banned**
 ```ruby
 sample_files = [File.open("trump.mp3", "rb")]
@@ -224,28 +229,28 @@ client.banned?(trump)
 # => true
 ```
-5. **Edit a Voice**
+6. **Edit a Voice**
 ```ruby
 client.edit_voice("VOICE_ID", name: "Updated Voice Name")
 # => JSON response with updated details
 ```
-6. **Delete a Voice**
+7. **Delete a Voice**
 ```ruby
 client.delete_voice("VOICE_ID")
 # => JSON response acknowledging deletion
 ```
-7. **Convert Text to Speech**
+8. **Convert Text to Speech**
 ```ruby
 audio_data = client.text_to_speech("VOICE_ID", "Hello world!")
 File.open("output.mp3", "wb") { |f| f.write(audio_data) }
 ```
-8. **Stream Text to Speech**
+9. **Stream Text to Speech**
 Stream from terminal:
@@ -264,22 +269,75 @@ IO.popen("play -t mp3 -", "wb") do |audio_pipe| # Notice "wb" (write binary)
 end
 ```
-9. **Design a Voice**
+10. **Create a Voice from a Design**
-Generate voice previews based on a text description:
+Once you’ve generated a voice design using client.design_voice, you can turn it into a permanent voice in your account by passing its generated_voice_id to client.create_from_generated_voice.
+# Step 1: Design a voice (returns previews + generated_voice_id)
 ```ruby
-response = client.design_voice(
-  "A deep, resonant male voice with a British accent, suitable for storytelling",
-  output_format: "mp3_44100_192",
+design_response = client.design_voice(
+  "A warm, friendly female voice with a slight Australian accent",
   model_id: "eleven_multilingual_ttv_v2",
-  text: "In a land far away, where the mountains meet the sky, a great adventure began. Brave heroes embarked on a quest to find the lost artifact, facing challenges and forging bonds that would last a lifetime. Their journey took them through enchanted forests, across raging rivers, and into the heart of ancient ruins.",
+  text: "Welcome to our podcast, where every story is an adventure, taking you on a journey through fascinating worlds, inspiring voices, and unforgettable moments.",
   auto_generate_text: false
 )
-# Save the first preview to an MP3 file
-require "base64"
-audio_data = Base64.decode64(response["previews"][0]["audio_base_64"])
-File.open("voice_preview.mp3", "wb") { |f| f.write(audio_data) }
+generated_voice_id = design_response["previews"].first["generated_voice_id"] # Three previews are returned; this example uses the first one to create the voice
+# Step 2: Create the permanent voice
+create_response = client.create_from_generated_voice(
+  "Friendly Aussie",
+  "A warm, friendly Australian-accented voice for podcasts",
+   generated_voice_id,
+)
+voice_id = create_response["voice_id"] # This is the ID you can use for TTS
+# Step 3: Use the new voice for TTS
+audio_data = client.text_to_speech(voice_id, "This is my new permanent designed voice.")
+File.open("friendly_aussie.mp3", "wb") { |f| f.write(audio_data) }
+```
+Important notes:
+Always store the returned voice_id from create_from_generated_voice. This is the permanent identifier for TTS.
+Designed voices cannot be used for TTS until they are created in your account.
+If the voice is not immediately available for TTS, wait a few seconds or check its status via client.get_voice(voice_id) until it’s "active".
+11. Create a multi-speaker dialogue
+```ruby
+inputs = [{text: "It smells like updog in here", voice_id: "TX3LPaxmHKxFdv7VOQHJ"}, {text: "What's updog?", voice_id: "RILOU7YmBhvwJGDGjNmP"}, {text: "Not much, you?", voice_id: "TX3LPaxmHKxFdv7VOQHJ"}]
+audio_data = client.text_to_dialogue(inputs)
+File.open("what's updog.mp3", "wb") { |f| f.write(audio_data) }
+```
+12. **Generate Music from prompt**
+```ruby
+audio = client.compose_music(prompt: "Lo-fi hip hop beat", music_length_ms: 30000)
+File.binwrite("lofi.mp3", audio)
+```
+13. **Stream Music Generated from prompt**
+```ruby
+File.open("epic_stream.mp3", "wb") do |f|
+  client.compose_music_stream(prompt: "Epic orchestral build", music_length_ms: 60000) do |chunk|
+    f.write(chunk)
+  end
+end
+```
+14. **Generate Music with Detailed Metadata (metadata + audio) from prompt**
+```ruby
+result = client.compose_music_detailed(prompt: "Jazz piano trio", music_length_ms: 20000)
+puts result # raw multipart data (needs parsing)
+```
+15. **Create a music composition plan from prompt**
+```ruby
+plan = client.create_music_plan(prompt: "Upbeat pop song with verse and chorus", music_length_ms: 60000)
+puts plan[:sections]
 ```
 ---
@@ -338,7 +396,7 @@ gem build elevenlabs.gemspec
 Install the gem locally:
 ```bash
-gem install ./elevenlabs-0.0.5.gem
+gem install ./elevenlabs-0.0.7.gem
 ```
 ---
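
The README example for `compose_music_detailed` prints the raw multipart body and notes that it needs parsing. The following is a minimal sketch of one way to split such a body into its parts. `split_multipart` is a hypothetical helper written for this illustration, not part of the gem. It assumes the boundary string is already known (in a multipart response it is normally carried in the `Content-Type` header, which the gem's method does not return) and that parts are separated by CRLF line endings.

```ruby
# Hypothetical helper: splits a multipart body into parts, each returned as
# { headers: {...}, body: "..." }. Header names are downcased.
def split_multipart(body, boundary)
  delimiter = "--#{boundary}".b

  body.b.split(delimiter).filter_map do |segment|
    # Skip the empty preamble and the closing "--" marker.
    next if segment.empty? || segment.start_with?("--")

    raw_headers, content = segment.delete_prefix("\r\n").split("\r\n\r\n", 2)
    next if content.nil? # not a well-formed part

    headers = raw_headers.split("\r\n").to_h do |line|
      name, value = line.split(":", 2)
      [name.strip.downcase, value.to_s.strip]
    end

    # Drop the CRLF that precedes the next delimiter.
    { headers: headers, body: content.delete_suffix("\r\n") }
  end
end
```

With the parts separated, the one whose `content-type` is JSON could be passed to `JSON.parse`, and the audio part written to disk with `File.binwrite`.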

data/lib/elevenlabs/client.rb CHANGED

@@ -88,6 +88,47 @@ module Elevenlabs
       handle_error(e)
     end
+    #####################################################
+    #              Text-to-Dialogue                     #
+    #    (POST /v1/text-to-dialogue)                    #
+    #####################################################
+    # Converts a list of text and voice ID pairs into speech (dialogue) and returns audio.
+    # Documentation: https://elevenlabs.io/docs/api-reference/text-to-dialogue/convert
+    #
+    # @param [Array<Hash>] inputs - A list of dialogue inputs, each containing text and a voice ID, which will be converted into speech
+    #   :text => String
+    #   :voice_id => String
+    # @param [String] model_id - optional Identifier of the model to be used
+    # @param [Hash] settings - optional Settings controlling the dialogue generation
+    #   :stability => double - 0.0 = Creative, 0.5 = Natural, 1.0 = Robust
+    #   :use_speaker_boost => boolean
+    # @param [Integer] seed - optional Best effort to sample deterministically.
+    #
+    # @return [String] The binary audio data (usually an MP3).
+    def text_to_dialogue(inputs, model_id = nil, settings = {}, seed = nil)
+      endpoint = "/v1/text-to-dialogue"
+      request_body = {}.tap do |r|
+        r[:inputs] = inputs
+        r[:model_id] = model_id if model_id
+        r[:settings] = settings unless settings.empty?
+        r[:seed] = seed if seed
+      end
+      headers = default_headers
+      headers["Accept"] = "audio/mpeg"
+      response = @connection.post(endpoint) do |req|
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      # Returns raw binary data (often MP3)
+      response.body
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
     #####################################################
     #                  Design a Voice                   #
     #      (POST /v1/text-to-voice/design)              #
@@ -194,6 +235,25 @@ module Elevenlabs
       handle_error(e)
     end
+    #####################################################
+    #                    GET Models                     #
+    #                  (GET /v1/models)                 #
+    #####################################################
+    # Gets a list of available models
+    # Documentation: https://elevenlabs.io/docs/api-reference/models/list
+    #
+    # @return [Hash] The JSON response containing an array of models
+    def list_models
+      endpoint = "/v1/models"
+      response = @connection.get(endpoint) do |req|
+        req.headers = default_headers
+      end
+      JSON.parse(response.body)
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
     #####################################################
     #                 GET a Single Voice                #
     #               (GET /v1/voices/{voice_id})         #
@@ -345,6 +405,110 @@ module Elevenlabs
       voice_id.in?(active_voices)
     end
+    #####################################################
+    #                     Music API                     #
+    #####################################################
+    # 1. Compose music (basic)
+    # POST /v1/music
+    def compose_music(options = {})
+      endpoint = "/v1/music"
+      request_body = {
+        prompt: options[:prompt],
+        composition_plan: options[:composition_plan],
+        music_length_ms: options[:music_length_ms],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      headers = default_headers.merge("Accept" => "audio/mpeg")
+      query = {}
+      query[:output_format] = options[:output_format] if options[:output_format]
+      response = @connection.post("#{endpoint}?#{URI.encode_www_form(query)}") do |req|
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      response.body # raw binary audio
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    # 2. Stream music
+    # POST /v1/music/stream
+    def compose_music_stream(options = {}, &block)
+      endpoint = "/v1/music/stream"
+      request_body = {
+        prompt: options[:prompt],
+        composition_plan: options[:composition_plan],
+        music_length_ms: options[:music_length_ms],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      headers = default_headers.merge("Accept" => "audio/mpeg")
+      query = {}
+      query[:output_format] = options[:output_format] if options[:output_format]
+      @connection.post("#{endpoint}?#{URI.encode_www_form(query)}") do |req|
+        req.options.on_data = Proc.new do |chunk, _|
+          block.call(chunk) if block
+        end
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      nil # audio streamed via block
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    # 3. Compose detailed music (metadata + audio)
+    # POST /v1/music/detailed
+    def compose_music_detailed(options = {})
+      endpoint = "/v1/music/detailed"
+      request_body = {
+        prompt: options[:prompt],
+        composition_plan: options[:composition_plan],
+        music_length_ms: options[:music_length_ms],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      headers = default_headers
+      query = {}
+      query[:output_format] = options[:output_format] if options[:output_format]
+      response = @connection.post("#{endpoint}?#{URI.encode_www_form(query)}") do |req|
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      response.body # multipart/mixed with JSON + binary audio
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    # 4. Create a composition plan
+    # POST /v1/music/plan
+    def create_music_plan(options = {})
+      endpoint = "/v1/music/plan"
+      request_body = {
+        prompt: options[:prompt],
+        music_length_ms: options[:music_length_ms],
+        source_composition_plan: options[:source_composition_plan],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      response = @connection.post(endpoint) do |req|
+        req.headers = default_headers
+        req.body = request_body.to_json
+      end
+      JSON.parse(response.body, symbolize_names: true)
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
     private
     # Common headers needed by Elevenlabs

data/lib/elevenlabs.rb CHANGED

@@ -5,7 +5,7 @@ require_relative "elevenlabs/client"
 require_relative "elevenlabs/errors"
 module Elevenlabs
-  VERSION = "0.0.5"
+  VERSION = "0.0.7"
   # Optional global configuration
   class << self

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: elevenlabs
 version: !ruby/object:Gem::Version
-  version: 0.0.5
+  version: 0.0.7
 platform: ruby
 authors:
 - hackliteracy
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-08-09 00:00:00.000000000 Z
+date: 2025-08-25 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: faraday
@@ -39,7 +39,8 @@ dependencies:
       - !ruby/object:Gem::Version
         version: '1.1'
 description: This gem provides a convenient Ruby interface to the ElevenLabs TTS,
-  Voice Cloning, Voice Design and Streaming endpoints.
+  Voice Cloning, Voice Design, Voice dialogues, TTS Streaming, Music Generation and
+  Streaming endpoints.
 email:
 - hackliteracy@gmail.com
 executables: []