RubyGems - elevenlabs - Versions diffs - 0.0.6 → 0.0.8 - Mend

elevenlabs 0.0.6 → 0.0.8

This diff shows the contents of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.

Files changed (5)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 22770e41ca0d3c88d2dc5f83e3e4d9de510610bf1a0adaf9bf675951a647ab30
-  data.tar.gz: 8f6ffc3ef844da02c3f45385a730ebbddfe3c711d11e1b983837153c8dbd859a
+  metadata.gz: ea563591f2116a24c1911bd964ed30bfee9dd468e9890db98a7f38ad84138bd4
+  data.tar.gz: e1c155f3dc9f5daaff7ceb283ccd7493335f639a1beb66df93adaf14ac8fac96
 SHA512:
-  metadata.gz: 1b094e808358b342f7fe8cb08cf993dbafe2bac989bcb1e4655d5de5dd2884ff83626cb098da9b7c06d60a697302d33e848419f80a26fb53f34198f7894390b7
-  data.tar.gz: 44ad5334ed45f2628a22be91a0ed307b74f7e8611f6ba71cca0b4ee79f7e510f1f655e45321e47060b413e476ed9ea9913e304a44a789ec388bb7411235d42b4
+  metadata.gz: 8559ecdf9dc45be3f7018afb389cb784603ba6c255db024f7276df60490504da366f462cbf2fb3886eeaf39f305ff530ef5ee4f9dcee577df883e62e48684af7
+  data.tar.gz: 62c761c794c8641940d5b297a0252b37dc9f41e2afcaaad777701acf7c94ee8ec3089f76559eaffd26bdf4292930cf2a9f5827f5d7ed5c523445356fa3e085b4
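
The entries above are hex digests of the gem's `metadata.gz` and `data.tar.gz` members. A minimal sketch of how such a SHA256 value could be checked with Ruby's standard `Digest` library follows; `verify_sha256` is a hypothetical helper, and the demo uses a temporary stand-in file rather than the real archive.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: true when the file's SHA256 hex digest equals expected_hex.
def verify_sha256(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Demo with a stand-in file (not the real data.tar.gz).
Tempfile.create("data.tar.gz") do |f|
  f.binmode
  f.write("example bytes")
  f.flush
  expected = Digest::SHA256.hexdigest("example bytes")
  puts verify_sha256(f.path, expected)   # digest computed from the same bytes
  puts verify_sha256(f.path, "0" * 64)   # deliberately wrong digest
end
```

For a real check, the path would point at the extracted `data.tar.gz` and the expected value would be the `SHA256` entry from `checksums.yaml`.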

data/README.md CHANGED

@@ -14,6 +14,8 @@ This gem provides an easy-to-use interface for:
 - **Converting text to speech** and retrieving the generated audio
 - **Designing a voice** based on a text description
 - **Streaming text-to-speech audio**
+- **Music Generation**
+- **Sound Effect Generation**
 All requests are handled via [Faraday](https://github.com/lostisland/faraday).
@@ -304,7 +306,7 @@ Designed voices cannot be used for TTS until they are created in your account.
 If the voice is not immediately available for TTS, wait a few seconds or check its status via client.get_voice(voice_id) until it’s "active".
-10. Create a multi-speaker dialogue
+11. Create a multi-speaker dialogue
 ```ruby
 inputs = [{text: "It smells like updog in here", voice_id: "TX3LPaxmHKxFdv7VOQHJ"}, {text: "What's updog?", voice_id: "RILOU7YmBhvwJGDGjNmP"}, {text: "Not much, you?", voice_id: "TX3LPaxmHKxFdv7VOQHJ"}]
@@ -312,6 +314,78 @@ audio_data = client.text_to_dialogue(inputs)
 File.open("what's updog.mp3", "wb") { |f| f.write(audio_data) }
 ```
+12. **Generate Music from prompt**
+```ruby
+audio = client.compose_music(prompt: "Lo-fi hip hop beat", music_length_ms: 30000)
+File.binwrite("lofi.mp3", audio)
+```
+12. **Stream Music Generated from prompt**
+```ruby
+File.open("epic_stream.mp3", "wb") do |f|
+  client.compose_music_stream(prompt: "Epic orchestral build", music_length_ms: 60000) do |chunk|
+    f.write(chunk)
+  end
+end
+```
+13. **Generate Music with Detailed Metadata (metadata + audio) from prompt**
+```ruby
+result = client.compose_music_detailed(prompt: "Jazz piano trio", music_length_ms: 20000)
+puts result # raw multipart data (needs parsing)
+```
+14. **Create a music composition plan from prompt**
+```ruby
+plan = client.create_music_plan(prompt: "Upbeat pop song with verse and chorus", music_length_ms: 60000)
+puts plan[:sections]
+```
+15. **Create sound effects from a prompt**
+Basic Usage: Simple Prompt
+  Generate a sound effect with only a text prompt, using default settings (output_format: "mp3_44100_128", duration_seconds: nil (auto-detected), prompt_influence: 0.3).
+```ruby
+audio_data = client.sound_generation("Futuristic laser blast in a space battle")
+# Save the audio to a file
+File.open("laser_blast.mp3", "wb") { |f| f.write(audio_data) }
+```
+Advanced Usage: Custom Duration, Influence, and Format
+Specify duration_seconds, prompt_influence, and output_format for precise control over the sound effect.
+# Generate a roaring dragon sound with specific settings
+```ruby
+audio_data = client.sound_generation(
+  "Roaring dragon in a fantasy cave",
+  duration_seconds: 3.0,
+  prompt_influence: 0.7, # Higher influence for closer adherence to the prompt
+  output_format: "mp3_22050_32"
+)
+# Save the audio to a file
+File.open("dragon_roar.mp3", "wb") { |f| f.write(audio_data) }
+```
+Looping Sound Effect
+Create a looping sound effect for continuous playback, such as background ambiance in a video game.
+# Generate a looping ambient sound for a haunted forest
+```ruby
+audio_data = client.sound_generation(
+  "Eerie wind and distant owl hoots in a haunted forest",
+  loop: true,
+  duration_seconds: 10.0,
+  prompt_influence: 0.5,
+  output_format: "mp3_22050_32"
+)
+# Save the audio to a file
+File.open("haunted_forest_loop.mp3", "wb") { |f| f.write(audio_data) }
+```
+For more details, see the ElevenLabs Sound Generation API documentation.
 ---
 ## Error Handling
@@ -368,7 +442,7 @@ gem build elevenlabs.gemspec
 Install the gem locally:
 ```bash
-gem install ./elevenlabs-0.0.5.gem
+gem install ./elevenlabs-0.0.8.gem
 ```
 ---
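
The README notes that `compose_music_detailed` returns raw multipart data that needs parsing. Here is a minimal sketch of one way to split such a body. It assumes the body starts with its `--boundary` line and uses CRLF line endings, as in standard MIME multipart. `split_multipart` and the sample payload (including its JSON field) are hypothetical; the actual response layout should be checked against the ElevenLabs API documentation.

```ruby
require "json"

# Hypothetical helper: split a multipart body into [{ headers:, body: }, ...].
# Assumes the body begins with its "--boundary" delimiter line and uses CRLF.
def split_multipart(raw)
  data = raw.b # work on a binary copy so audio bytes are left untouched
  delimiter = data[/\A--[^\r\n]+/]
  raise ArgumentError, "body does not start with a boundary line" unless delimiter

  segments = data.split(delimiter)
  segments.shift # text before the first delimiter (empty here)
  parts = []
  segments.each do |segment|
    break if segment.start_with?("--") # closing delimiter "--boundary--"
    head, body = segment.delete_prefix("\r\n").delete_suffix("\r\n").split("\r\n\r\n", 2)
    headers = head.to_s.split("\r\n").to_h do |line|
      name, value = line.split(":", 2)
      [name.strip.downcase, value.to_s.strip]
    end
    parts << { headers: headers, body: body.to_s }
  end
  parts
end

# Made-up payload for illustration only.
sample = "--demo\r\nContent-Type: application/json\r\n\r\n{\"title\":\"Jazz\"}\r\n" \
         "--demo\r\nContent-Type: audio/mpeg\r\n\r\nFAKEAUDIO\r\n--demo--\r\n"
parts = split_multipart(sample)
meta = JSON.parse(parts[0][:body].dup.force_encoding("UTF-8"))
puts meta["title"]
puts parts[1][:headers]["content-type"]
```

The audio part's `:body` can then be written with `File.binwrite`, as in the other README examples.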

data/lib/elevenlabs/client.rb CHANGED

@@ -9,13 +9,15 @@ module Elevenlabs
     BASE_URL = "https://api.elevenlabs.io"
     # Note the default param: `api_key: nil`
-    def initialize(api_key: nil)
+    def initialize(api_key: nil, open_timeout: 5, read_timeout: 120)
       # If the caller doesn’t provide an api_key, use the gem-wide config
       @api_key = api_key || Elevenlabs.configuration&.api_key
       @connection = Faraday.new(url: BASE_URL) do |conn|
         conn.request :url_encoded
         conn.response :raise_error
+        conn.options.open_timeout = open_timeout   # time to open connection
+        conn.options.timeout      = read_timeout   # time to wait for response
         conn.adapter Faraday.default_adapter
       end
     end
@@ -129,6 +131,48 @@ module Elevenlabs
       handle_error(e)
     end
+    #####################################################
+    #                 Sound Generation                  #
+    #      (POST /v1/sound-generation)                  #
+    #####################################################
+    # Convert text to sound effects and retrieve audio (binary data)
+    # Documentation: https://elevenlabs.io/docs/api-reference/sound-generation
+    #
+    # @param [String] text - text prompt describing the sound effect
+    # @param [Hash] options - optional parameters
+    #   :loop              => Boolean  (whether to create a looping sound effect, default: false)
+    #   :duration_seconds  => Float    (0.5 to 30 seconds, default: nil for auto-detection)
+    #   :prompt_influence  => Float    (0.0 to 1.0, default: 0.3)
+    #   :output_format     => String   (e.g., "mp3_22050_32", default: "mp3_44100_128")
+    #
+    # @return [String] The binary audio data (usually an MP3).
+    def sound_generation(text, options = {})
+      endpoint = "/v1/sound-generation"
+      request_body = { text: text }
+      # Add optional parameters if provided
+      request_body[:loop] = options[:loop] unless options[:loop].nil?
+      request_body[:duration_seconds] = options[:duration_seconds] if options[:duration_seconds]
+      request_body[:prompt_influence] = options[:prompt_influence] if options[:prompt_influence]
+      headers = default_headers
+      headers["Accept"] = "audio/mpeg"
+      query = {}
+      query[:output_format] = options[:output_format] if options[:output_format]
+      response = @connection.post("#{endpoint}?#{URI.encode_www_form(query)}") do |req|
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      # Returns raw binary data (often MP3)
+      response.body
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
     #####################################################
     #                  Design a Voice                   #
     #      (POST /v1/text-to-voice/design)              #
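
The request-building logic in `sound_generation` can be examined without a network call. The sketch below mirrors the added lines in a stand-alone function; `build_sound_request` is a hypothetical name that is not part of the gem, and nothing is sent to the API.

```ruby
require "json"
require "uri"

# Hypothetical stand-alone mirror of the request-building logic in the diff.
# It performs no HTTP call and only returns the path and JSON body.
def build_sound_request(text, options = {})
  body = { text: text }
  body[:loop] = options[:loop] unless options[:loop].nil?
  body[:duration_seconds] = options[:duration_seconds] if options[:duration_seconds]
  body[:prompt_influence] = options[:prompt_influence] if options[:prompt_influence]

  query = {}
  query[:output_format] = options[:output_format] if options[:output_format]

  { path: "/v1/sound-generation?#{URI.encode_www_form(query)}", body: body.to_json }
end

req = build_sound_request("Roaring dragon", loop: false, prompt_influence: 0.0,
                          output_format: "mp3_22050_32")
puts req[:path]
puts req[:body]
```

Two details follow from the conditions as written. An explicit `loop: false` is still sent, because that key is tested with `nil?`. A `prompt_influence` of `0.0` is also sent, because `0.0` is truthy in Ruby. When no `output_format` is given, the path ends in a bare `?`.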
@@ -405,6 +449,110 @@ module Elevenlabs
       voice_id.in?(active_voices)
     end
+    #####################################################
+    #                     Music API                     #
+    #####################################################
+    # 1. Compose music (basic)
+    # POST /v1/music
+    def compose_music(options = {})
+      endpoint = "/v1/music"
+      request_body = {
+        prompt: options[:prompt],
+        composition_plan: options[:composition_plan],
+        music_length_ms: options[:music_length_ms],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      headers = default_headers.merge("Accept" => "audio/mpeg")
+      query = {}
+      query[:output_format] = options[:output_format] if options[:output_format]
+      response = @connection.post("#{endpoint}?#{URI.encode_www_form(query)}") do |req|
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      response.body # raw binary audio
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    # 2. Stream music
+    # POST /v1/music/stream
+    def compose_music_stream(options = {}, &block)
+      endpoint = "/v1/music/stream"
+      request_body = {
+        prompt: options[:prompt],
+        composition_plan: options[:composition_plan],
+        music_length_ms: options[:music_length_ms],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      headers = default_headers.merge("Accept" => "audio/mpeg")
+      query = {}
+      query[:output_format] = options[:output_format] if options[:output_format]
+      @connection.post("#{endpoint}?#{URI.encode_www_form(query)}") do |req|
+        req.options.on_data = Proc.new do |chunk, _|
+          block.call(chunk) if block
+        end
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      nil # audio streamed via block
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    # 3. Compose detailed music (metadata + audio)
+    # POST /v1/music/detailed
+    def compose_music_detailed(options = {})
+      endpoint = "/v1/music/detailed"
+      request_body = {
+        prompt: options[:prompt],
+        composition_plan: options[:composition_plan],
+        music_length_ms: options[:music_length_ms],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      headers = default_headers
+      query = {}
+      query[:output_format] = options[:output_format] if options[:output_format]
+      response = @connection.post("#{endpoint}?#{URI.encode_www_form(query)}") do |req|
+        req.headers = headers
+        req.body = request_body.to_json
+      end
+      response.body # multipart/mixed with JSON + binary audio
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
+    # 4. Create a composition plan
+    # POST /v1/music/plan
+    def create_music_plan(options = {})
+      endpoint = "/v1/music/plan"
+      request_body = {
+        prompt: options[:prompt],
+        music_length_ms: options[:music_length_ms],
+        source_composition_plan: options[:source_composition_plan],
+        model_id: options[:model_id] || "music_v1"
+      }.compact
+      response = @connection.post(endpoint) do |req|
+        req.headers = default_headers
+        req.body = request_body.to_json
+      end
+      JSON.parse(response.body, symbolize_names: true)
+    rescue Faraday::ClientError => e
+      handle_error(e)
+    end
     private
     # Common headers needed by Elevenlabs
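
The three compose methods build their JSON body the same way: they collect the options into a hash, default `model_id` to `"music_v1"`, and drop unset keys with `compact`. A small sketch of that step in isolation follows; `music_request_body` is a hypothetical helper, not a method of the gem.

```ruby
require "json"

# Hypothetical mirror of the body-building step used by the compose methods in the diff.
def music_request_body(options = {})
  {
    prompt: options[:prompt],
    composition_plan: options[:composition_plan],
    music_length_ms: options[:music_length_ms],
    model_id: options[:model_id] || "music_v1"
  }.compact # removes keys whose value is nil, so unset options are not sent
end

puts music_request_body(prompt: "Lo-fi hip hop beat", music_length_ms: 30000).to_json
```

Because `compact` removes only `nil` values, an option explicitly set to `false` or an empty string would still be included in the body.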

data/lib/elevenlabs.rb CHANGED

@@ -5,7 +5,7 @@ require_relative "elevenlabs/client"
 require_relative "elevenlabs/errors"
 module Elevenlabs
-  VERSION = "0.0.6"
+  VERSION = "0.0.8"
   # Optional global configuration
   class << self

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: elevenlabs
 version: !ruby/object:Gem::Version
-  version: 0.0.6
+  version: 0.0.8
 platform: ruby
 authors:
 - hackliteracy
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-08-23 00:00:00.000000000 Z
+date: 2025-09-03 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: faraday
@@ -39,7 +39,8 @@ dependencies:
       - !ruby/object:Gem::Version
         version: '1.1'
 description: This gem provides a convenient Ruby interface to the ElevenLabs TTS,
-  Voice Cloning, Voice Design, Voice dialogues and Streaming endpoints.
+  Voice Cloning, Voice Design, Voice dialogues, TTS Streaming, Music Generation and
+  Streaming endpoints.
 email:
 - hackliteracy@gmail.com
 executables: []
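
The version bump recorded in the metadata can be compared programmatically with RubyGems' own `Gem::Version` and `Gem::Requirement` classes. The sketch below is illustrative only; the `"~> 0.0.8"` constraint is an assumed example and is not taken from the gemspec.

```ruby
require "rubygems"

old_version = Gem::Version.new("0.0.6")
new_version = Gem::Version.new("0.0.8")

# Assumed example constraint: "~> 0.0.8" admits 0.0.8 and later 0.0.x releases only.
requirement = Gem::Requirement.new("~> 0.0.8")

puts new_version > old_version
puts requirement.satisfied_by?(new_version)
puts requirement.satisfied_by?(old_version)
```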