elevenlabs_client 0.4.0 → 0.5.0

This diff shows the changes between publicly released versions of the package, as they appear in the public registry, and is provided for informational purposes only.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 2b65be08b17b9ae232158f2c004511a7840688e76853c014f9be37984a9639d1
- data.tar.gz: 2de88d74e59af044cfe943e32f0d2f2d8b45f71200469a69b3f95fc7605436d2
+ metadata.gz: f1c75ecb60655822ec4d8b88e22ebae5e0a1714e5573000cd5a36c3e80bcb886
+ data.tar.gz: 5d05b4e838bc30cbc1c290b615b1c0d686ea6d6aafe9521097ddc00d0ba28189
  SHA512:
- metadata.gz: f919ebdf7090d2f4cdd812589eccd1425e8be5e86615c04f3bf2882c7e8e8e07058db4f37cdf420e6b1948c668c3acbb177fe796b48c7f23563fa41a7204f4ce
- data.tar.gz: 23b7dc77bb3ca90e2019d4098887b8555d103d66c1fdf259f883f7952b4c26f57671f9bd2e250e27491acc42c2518e85b37bcb48cdb678fd163f9b3be9b1d7e4
+ metadata.gz: e26733f1b2ddaaec79432e7458f2af56b50d0f29bb52bdddc4fcbdbb564c85eea40949c7304fef7a4af3da5ff2c364bb42341b3755bf385cc6e81bb429f81aa5
+ data.tar.gz: 59f527fa65e17375fa3a33eb8f7d140a3f59ccd87f51168e6f43bac5c94c3d93fa49026dc4ee6fbe5eb4fb7b0a772f6e6925b6051092f112b12936e3b154009e
data/CHANGELOG.md CHANGED
@@ -5,6 +5,38 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [0.5.0] - 2025-09-14
+
+ ### Added
+
+ - Text-to-Speech With Timestamps
+   - `client.text_to_speech_with_timestamps.generate(voice_id, text, **options)`
+   - Character-level `alignment` and `normalized_alignment`
+ - Streaming Text-to-Speech With Timestamps
+   - `client.text_to_speech_stream_with_timestamps.stream(voice_id, text, **options, &block)`
+   - JSON streaming with audio chunks and timing per chunk
+ - WebSocket Streaming Enhancements
+   - Single-context and multi-context improvements; correct query param ordering and filtering
+   - Docs: `docs/WEBSOCKET_STREAMING.md`
+ - Text-to-Dialogue Streaming
+   - `client.text_to_dialogue_stream.stream(inputs, **options, &block)`
+   - Docs: `docs/TEXT_TO_DIALOGUE_STREAMING.md`
+
+ ### Improved
+
+ - Client streaming JSON handling for timestamp streams (`post_streaming_with_timestamps`)
+ - Robust parsing and block yielding across streaming tests
+ - URL query parameter ordering to match expectations in tests
+
+ ### Tests
+
+ - Added comprehensive unit and integration tests for all new endpoints
+ - Full suite now passes: 687 examples, 0 failures
+
+ ### Notes
+
+ - These features require a valid ElevenLabs API key and the appropriate model/voice permissions
+
  ## [0.4.0] - 2025-09-12
 
  ### Added
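The character-level alignment added in 0.5.0 pairs each character of the input text with start and end times. As a rough sketch of consuming such a payload (the field names `characters`, `character_start_times_seconds`, and `character_end_times_seconds` are assumptions about the response shape, not taken from this changelog):

```ruby
# Sketch only: assumes the alignment object carries parallel arrays of
# characters and per-character start/end times (field names are assumed).
def character_timings(response)
  alignment = response["alignment"]
  return [] unless alignment

  alignment["characters"].zip(
    alignment["character_start_times_seconds"],
    alignment["character_end_times_seconds"]
  ).map { |ch, start_s, end_s| { char: ch, start: start_s, end: end_s } }
end
```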
data/README.md CHANGED
@@ -2,7 +2,7 @@
 
  [![Gem Version](https://badge.fury.io/rb/elevenlabs_client.svg)](https://badge.fury.io/rb/elevenlabs_client)
 
- A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, sound effects, and AI music composition.
+ A comprehensive Ruby client library for the ElevenLabs API, supporting voice synthesis, dubbing, dialogue generation, sound effects, AI music composition, voice transformation, speech transcription, audio isolation, and advanced audio processing features.
 
  ## Features
 
@@ -13,6 +13,11 @@ A comprehensive Ruby client library for the ElevenLabs API, supporting voice syn
  🎵 **Music Generation** - AI-powered music composition and streaming
  🎨 **Voice Design** - Create custom voices from text descriptions
  🎭 **Voice Management** - Create, edit, and manage individual voices
+ 🔄 **Speech-to-Speech** - Transform audio from one voice to another (Voice Changer)
+ 📝 **Speech-to-Text** - Transcribe audio and video files with advanced features
+ 🔇 **Audio Isolation** - Remove background noise from audio files
+ 📱 **Audio Native** - Create embeddable audio players for websites
+ ⏱️ **Forced Alignment** - Get precise timing information for audio transcripts
  🤖 **Models** - List available models and their capabilities
  📡 **Streaming** - Real-time audio streaming
  ⚙️ **Configurable** - Flexible configuration options
@@ -141,6 +146,60 @@ music_data = client.music.compose(
  )
  File.open("generated_music.mp3", "wb") { |f| f.write(music_data) }
 
+ # Speech-to-Speech (Voice Changer)
+ File.open("input_audio.mp3", "rb") do |audio_file|
+   converted_audio = client.speech_to_speech.convert(
+     "target_voice_id",
+     audio_file,
+     "input_audio.mp3",
+     remove_background_noise: true
+   )
+   File.open("converted_audio.mp3", "wb") { |f| f.write(converted_audio) }
+ end
+
+ # Speech-to-Text Transcription
+ File.open("audio.mp3", "rb") do |audio_file|
+   transcription = client.speech_to_text.create(
+     "scribe_v1",
+     file: audio_file,
+     filename: "audio.mp3",
+     diarize: true,
+     timestamps_granularity: "word"
+   )
+   puts "Transcribed: #{transcription['text']}"
+ end
+
+ # Audio Isolation (Background Noise Removal)
+ File.open("noisy_audio.mp3", "rb") do |audio_file|
+   clean_audio = client.audio_isolation.isolate(audio_file, "noisy_audio.mp3")
+   File.open("clean_audio.mp3", "wb") { |f| f.write(clean_audio) }
+ end
+
+ # Audio Native (Embeddable Player)
+ File.open("article.html", "rb") do |html_file|
+   project = client.audio_native.create(
+     "My Article",
+     file: html_file,
+     filename: "article.html",
+     voice_id: "voice_id",
+     auto_convert: true
+   )
+   puts "Player HTML: #{project['html_snippet']}"
+ end
+
+ # Forced Alignment
+ File.open("speech.wav", "rb") do |audio_file|
+   alignment = client.forced_alignment.create(
+     audio_file,
+     "speech.wav",
+     "Hello world, this is a test transcript"
+   )
+
+   alignment['words'].each do |word|
+     puts "#{word['text']}: #{word['start']}s - #{word['end']}s"
+   end
+ end
+
  # Streaming Text-to-Speech
  client.text_to_speech_stream.stream("voice_id", "Streaming text") do |chunk|
    # Process audio chunk in real-time
@@ -160,6 +219,11 @@ end
  - **[Music Generation API](docs/MUSIC.md)** - AI-powered music composition and streaming
  - **[Text-to-Voice API](docs/TEXT_TO_VOICE.md)** - Design and create custom voices
  - **[Voice Management API](docs/VOICES.md)** - Manage individual voices (CRUD operations)
+ - **[Speech-to-Speech API](docs/SPEECH_TO_SPEECH.md)** - Transform audio from one voice to another
+ - **[Speech-to-Text API](docs/SPEECH_TO_TEXT.md)** - Transcribe audio and video files
+ - **[Audio Isolation API](docs/AUDIO_ISOLATION.md)** - Remove background noise from audio
+ - **[Audio Native API](docs/AUDIO_NATIVE.md)** - Create embeddable audio players
+ - **[Forced Alignment API](docs/FORCED_ALIGNMENT.md)** - Get precise timing information
  - **[Models API](docs/MODELS.md)** - List available models and capabilities
 
  ### Available Endpoints
@@ -174,6 +238,11 @@ end
  | `client.music.*` | AI music composition and streaming | [MUSIC.md](docs/MUSIC.md) |
  | `client.text_to_voice.*` | Voice design and creation | [TEXT_TO_VOICE.md](docs/TEXT_TO_VOICE.md) |
  | `client.voices.*` | Voice management (CRUD) | [VOICES.md](docs/VOICES.md) |
+ | `client.speech_to_speech.*` | Voice changer and audio transformation | [SPEECH_TO_SPEECH.md](docs/SPEECH_TO_SPEECH.md) |
+ | `client.speech_to_text.*` | Audio/video transcription | [SPEECH_TO_TEXT.md](docs/SPEECH_TO_TEXT.md) |
+ | `client.audio_isolation.*` | Background noise removal | [AUDIO_ISOLATION.md](docs/AUDIO_ISOLATION.md) |
+ | `client.audio_native.*` | Embeddable audio players | [AUDIO_NATIVE.md](docs/AUDIO_NATIVE.md) |
+ | `client.forced_alignment.*` | Audio-text timing alignment | [FORCED_ALIGNMENT.md](docs/FORCED_ALIGNMENT.md) |
  | `client.models.*` | Model information and capabilities | [MODELS.md](docs/MODELS.md) |
 
  ## Configuration Options
@@ -221,6 +290,9 @@ end
  - `AuthenticationError` - Invalid API key or authentication failure
  - `RateLimitError` - Rate limit exceeded
  - `ValidationError` - Invalid request parameters
+ - `NotFoundError` - Resource not found (e.g., voice ID, transcript ID)
+ - `BadRequestError` - Bad request with invalid parameters
+ - `UnprocessableEntityError` - Request cannot be processed (e.g., invalid file format)
  - `APIError` - General API errors
 
  ## Rails Integration
@@ -235,6 +307,11 @@ The gem is designed to work seamlessly with Rails applications. See the [example
  - [MusicController](examples/music_controller.rb) - AI music composition and streaming
  - [TextToVoiceController](examples/text_to_voice_controller.rb) - Voice design and creation
  - [VoicesController](examples/voices_controller.rb) - Voice management (CRUD operations)
+ - [SpeechToSpeechController](examples/speech_to_speech_controller.rb) - Voice changer and audio transformation
+ - [SpeechToTextController](examples/speech_to_text_controller.rb) - Audio/video transcription with advanced features
+ - [AudioIsolationController](examples/audio_isolation_controller.rb) - Background noise removal and audio cleanup
+ - [AudioNativeController](examples/audio_native_controller.rb) - Embeddable audio players for websites
+ - [ForcedAlignmentController](examples/forced_alignment_controller.rb) - Audio-text timing alignment and subtitle generation
 
  ## Development
 
@@ -2,12 +2,13 @@
 
  require "faraday"
  require "faraday/multipart"
+ require "json"
 
  module ElevenlabsClient
    class Client
      DEFAULT_BASE_URL = "https://api.elevenlabs.io"
 
-     attr_reader :base_url, :api_key, :dubs, :text_to_speech, :text_to_speech_stream, :text_to_dialogue, :sound_generation, :text_to_voice, :models, :voices, :music
+     attr_reader :base_url, :api_key, :dubs, :text_to_speech, :text_to_speech_stream, :text_to_speech_with_timestamps, :text_to_speech_stream_with_timestamps, :text_to_dialogue, :text_to_dialogue_stream, :sound_generation, :text_to_voice, :models, :voices, :music, :audio_isolation, :audio_native, :forced_alignment, :speech_to_speech, :speech_to_text, :websocket_text_to_speech
 
      def initialize(api_key: nil, base_url: nil, api_key_env: "ELEVENLABS_API_KEY", base_url_env: "ELEVENLABS_BASE_URL")
        @api_key = api_key || fetch_api_key(api_key_env)
@@ -16,12 +17,21 @@ module ElevenlabsClient
        @dubs = Dubs.new(self)
        @text_to_speech = TextToSpeech.new(self)
        @text_to_speech_stream = TextToSpeechStream.new(self)
+       @text_to_speech_with_timestamps = TextToSpeechWithTimestamps.new(self)
+       @text_to_speech_stream_with_timestamps = TextToSpeechStreamWithTimestamps.new(self)
        @text_to_dialogue = TextToDialogue.new(self)
+       @text_to_dialogue_stream = TextToDialogueStream.new(self)
        @sound_generation = SoundGeneration.new(self)
        @text_to_voice = TextToVoice.new(self)
        @models = Models.new(self)
        @voices = Voices.new(self)
        @music = Endpoints::Music.new(self)
+       @audio_isolation = AudioIsolation.new(self)
+       @audio_native = AudioNative.new(self)
+       @forced_alignment = ForcedAlignment.new(self)
+       @speech_to_speech = SpeechToSpeech.new(self)
+       @speech_to_text = SpeechToText.new(self)
+       @websocket_text_to_speech = WebSocketTextToSpeech.new(self)
      end
 
      # Makes an authenticated GET request
@@ -144,6 +154,44 @@ module ElevenlabsClient
        handle_response(response)
      end
 
+     # Makes an authenticated POST request with streaming response for timestamp data
+     # @param path [String] API endpoint path
+     # @param body [Hash, nil] Request body
+     # @param block [Proc] Block to handle each JSON chunk with timestamps
+     # @return [Faraday::Response] Response object
+     def post_streaming_with_timestamps(path, body = nil, &block)
+       buffer = ""
+
+       response = @conn.post(path) do |req|
+         req.headers["xi-api-key"] = api_key
+         req.headers["Content-Type"] = "application/json"
+         req.body = body.to_json if body
+
+         # Set up streaming callback for JSON chunks
+         req.options.on_data = proc do |chunk, _|
+           if block_given?
+             buffer += chunk
+
+             # Process complete JSON objects
+             while buffer.include?("\n")
+               line, buffer = buffer.split("\n", 2)
+               next if line.strip.empty?
+
+               begin
+                 json_data = JSON.parse(line)
+                 block.call(json_data)
+               rescue JSON::ParserError
+                 # Skip malformed JSON lines
+                 next
+               end
+             end
+           end
+         end
+       end
+
+       handle_response(response)
+     end
+
      # Helper method to create Faraday::Multipart::FilePart
      # @param file_io [IO] File IO object
      # @param filename [String] Original filename
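The `post_streaming_with_timestamps` hunk above buffers partial network chunks until a newline completes one JSON object, since a chunk boundary can fall mid-object. The same line-buffering logic, pulled out as a standalone sketch for clarity:

```ruby
require "json"

# Replays the client's buffering strategy: accumulate chunks until a
# newline arrives, then parse the stream one complete line at a time.
def parse_ndjson_chunks(chunks)
  buffer = ""
  events = []

  chunks.each do |chunk|
    buffer += chunk
    while buffer.include?("\n")
      line, buffer = buffer.split("\n", 2)
      next if line.strip.empty?

      begin
        events << JSON.parse(line)
      rescue JSON::ParserError
        next # skip malformed lines, mirroring the client
      end
    end
  end

  events
end
```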
@@ -0,0 +1,71 @@
+ # frozen_string_literal: true
+
+ module ElevenlabsClient
+   class AudioIsolation
+     def initialize(client)
+       @client = client
+     end
+
+     # POST /v1/audio-isolation
+     # Removes background noise from audio
+     # Documentation: https://elevenlabs.io/docs/api-reference/audio-isolation
+     #
+     # @param audio_file [IO, File] The audio file from which vocals/speech will be isolated
+     # @param filename [String] Original filename for the audio file
+     # @param options [Hash] Optional parameters
+     # @option options [String] :file_format Format of input audio ('pcm_s16le_16' or 'other', defaults to 'other')
+     # @return [String] Binary audio data with background noise removed
+     def isolate(audio_file, filename, **options)
+       endpoint = "/v1/audio-isolation"
+
+       payload = {
+         audio: @client.file_part(audio_file, filename)
+       }
+
+       # Add optional parameters if provided
+       payload[:file_format] = options[:file_format] if options[:file_format]
+
+       @client.post_multipart(endpoint, payload)
+     end
+
+     # POST /v1/audio-isolation/stream
+     # Removes background noise from audio with streaming response
+     # Documentation: https://elevenlabs.io/docs/api-reference/audio-isolation/stream
+     #
+     # @param audio_file [IO, File] The audio file from which vocals/speech will be isolated
+     # @param filename [String] Original filename for the audio file
+     # @param options [Hash] Optional parameters
+     # @option options [String] :file_format Format of input audio ('pcm_s16le_16' or 'other', defaults to 'other')
+     # @param block [Proc] Block to handle each chunk of streaming audio data
+     # @return [Faraday::Response] Response object for streaming
+     def isolate_stream(audio_file, filename, **options, &block)
+       endpoint = "/v1/audio-isolation/stream"
+
+       payload = {
+         audio: @client.file_part(audio_file, filename)
+       }
+
+       # Add optional parameters if provided
+       payload[:file_format] = options[:file_format] if options[:file_format]
+
+       # Use streaming multipart request
+       response = @client.instance_variable_get(:@conn).post(endpoint) do |req|
+         req.headers["xi-api-key"] = @client.api_key
+         req.body = payload
+
+         # Set up streaming callback if block provided
+         if block_given?
+           req.options.on_data = proc do |chunk, _|
+             block.call(chunk)
+           end
+         end
+       end
+
+       @client.send(:handle_response, response)
+     end
+
+     private
+
+     attr_reader :client
+   end
+ end
@@ -0,0 +1,103 @@
+ # frozen_string_literal: true
+
+ module ElevenlabsClient
+   class AudioNative
+     def initialize(client)
+       @client = client
+     end
+
+     # POST /v1/audio-native
+     # Creates an Audio Native enabled project, optionally starts conversion, and returns the project ID and an embeddable HTML snippet
+     # Documentation: https://elevenlabs.io/docs/api-reference/audio-native/create
+     #
+     # @param name [String] Project name
+     # @param options [Hash] Optional parameters
+     # @option options [String] :image Image URL used in the player (deprecated)
+     # @option options [String] :author Author used in the player
+     # @option options [String] :title Title used in the player
+     # @option options [Boolean] :small Whether to use small player (deprecated, defaults to false)
+     # @option options [String] :text_color Text color used in the player
+     # @option options [String] :background_color Background color used in the player
+     # @option options [Integer] :sessionization Minutes to persist session (deprecated, defaults to 0)
+     # @option options [String] :voice_id Voice ID used to voice the content
+     # @option options [String] :model_id TTS Model ID used in the player
+     # @option options [IO, File] :file Text or HTML input file containing article content
+     # @option options [String] :filename Original filename for the file
+     # @option options [Boolean] :auto_convert Whether to auto convert project to audio (defaults to false)
+     # @option options [String] :apply_text_normalization Text normalization mode ('auto', 'on', 'off', 'apply_english')
+     # @return [Hash] JSON response containing project_id, converting status, and html_snippet
+     def create(name, **options)
+       endpoint = "/v1/audio-native"
+
+       payload = { name: name }
+
+       # Add optional parameters if provided
+       payload[:image] = options[:image] if options[:image]
+       payload[:author] = options[:author] if options[:author]
+       payload[:title] = options[:title] if options[:title]
+       payload[:small] = options[:small] unless options[:small].nil?
+       payload[:text_color] = options[:text_color] if options[:text_color]
+       payload[:background_color] = options[:background_color] if options[:background_color]
+       payload[:sessionization] = options[:sessionization] if options[:sessionization]
+       payload[:voice_id] = options[:voice_id] if options[:voice_id]
+       payload[:model_id] = options[:model_id] if options[:model_id]
+       payload[:auto_convert] = options[:auto_convert] unless options[:auto_convert].nil?
+       payload[:apply_text_normalization] = options[:apply_text_normalization] if options[:apply_text_normalization]
+
+       # Add file if provided
+       if options[:file] && options[:filename]
+         payload[:file] = @client.file_part(options[:file], options[:filename])
+       end
+
+       @client.post_multipart(endpoint, payload)
+     end
+
+     # POST /v1/audio-native/:project_id/content
+     # Updates content for the specified Audio Native project
+     # Documentation: https://elevenlabs.io/docs/api-reference/audio-native/update
+     #
+     # @param project_id [String] The ID of the project to be used
+     # @param options [Hash] Optional parameters
+     # @option options [IO, File] :file Text or HTML input file containing article content
+     # @option options [String] :filename Original filename for the file
+     # @option options [Boolean] :auto_convert Whether to auto convert project to audio (defaults to false)
+     # @option options [Boolean] :auto_publish Whether to auto publish after conversion (defaults to false)
+     # @return [Hash] JSON response containing project_id, converting, publishing status, and html_snippet
+     def update_content(project_id, **options)
+       endpoint = "/v1/audio-native/#{project_id}/content"
+
+       payload = {}
+
+       # Add optional parameters if provided
+       payload[:auto_convert] = options[:auto_convert] unless options[:auto_convert].nil?
+       payload[:auto_publish] = options[:auto_publish] unless options[:auto_publish].nil?
+
+       # Add file if provided
+       if options[:file] && options[:filename]
+         payload[:file] = @client.file_part(options[:file], options[:filename])
+       end
+
+       @client.post_multipart(endpoint, payload)
+     end
+
+     # GET /v1/audio-native/:project_id/settings
+     # Gets player settings for the specified project
+     # Documentation: https://elevenlabs.io/docs/api-reference/audio-native/settings
+     #
+     # @param project_id [String] The ID of the Studio project
+     # @return [Hash] JSON response containing enabled status, snapshot_id, and settings
+     def get_settings(project_id)
+       endpoint = "/v1/audio-native/#{project_id}/settings"
+       @client.get(endpoint)
+     end
+
+     # Alias methods for convenience
+     alias_method :create_project, :create
+     alias_method :update_project_content, :update_content
+     alias_method :project_settings, :get_settings
+
+     private
+
+     attr_reader :client
+   end
+ end
@@ -8,6 +8,7 @@ module ElevenlabsClient
 
      # POST /v1/dubbing (multipart)
      # Creates a new dubbing job
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/create
      #
      # @param file_io [IO] The audio/video file to dub
      # @param filename [String] Original filename
@@ -19,8 +20,9 @@ module ElevenlabsClient
        payload = {
          file: @client.file_part(file_io, filename),
          mode: "automatic",
-         target_languages: target_languages,
-         name: name
+         name: name,
+         target_lang: target_languages.first,
+         num_speakers: 1
        }.compact.merge(options)
 
        @client.post_multipart("/v1/dubbing", payload)
28
30
 
29
31
  # GET /v1/dubbing/{id}
30
32
  # Retrieves dubbing job details
33
+ # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/get
31
34
  #
32
35
  # @param dubbing_id [String] The dubbing job ID
33
36
  # @return [Hash] Dubbing job details
@@ -37,6 +40,7 @@ module ElevenlabsClient
37
40
 
38
41
  # GET /v1/dubbing
39
42
  # Lists dubbing jobs
43
+ # Documentation: https://elevenlabs.io/docs/api-reference/dubbing
40
44
  #
41
45
  # @param params [Hash] Query parameters (dubbing_status, page_size, etc.)
42
46
  # @return [Hash] List of dubbing jobs
@@ -46,6 +50,7 @@ module ElevenlabsClient
46
50
 
47
51
  # GET /v1/dubbing/{id}/resources
48
52
  # Retrieves dubbing resources for editing (if dubbing_studio: true was used)
53
+ # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/get-resource
49
54
  #
50
55
  # @param dubbing_id [String] The dubbing job ID
51
56
  # @return [Hash] Dubbing resources
@@ -55,6 +60,7 @@ module ElevenlabsClient
55
60
 
56
61
  # DELETE /v1/dubbing/{id}
57
62
  # Deletes a dubbing project
63
+ # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/delete
58
64
  #
59
65
  # @param dubbing_id [String] The dubbing job ID
60
66
  # @return [Hash] Response with status
@@ -64,6 +70,7 @@ module ElevenlabsClient
64
70
 
65
71
  # GET /v1/dubbing/resource/{dubbing_id}
66
72
  # Gets dubbing resource with detailed information including segments, speakers, etc.
73
+ # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/get-resource
67
74
  #
68
75
  # @param dubbing_id [String] The dubbing job ID
69
76
  # @return [Hash] Detailed dubbing resource information
@@ -73,6 +80,7 @@ module ElevenlabsClient
 
      # POST /v1/dubbing/resource/{dubbing_id}/speaker/{speaker_id}/segment
      # Creates a new segment in dubbing resource
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/create-segment
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param speaker_id [String] The speaker ID
@@ -94,6 +102,7 @@ module ElevenlabsClient
 
      # DELETE /v1/dubbing/resource/{dubbing_id}/segment/{segment_id}
      # Deletes a single segment from the dubbing
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/delete-segment
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param segment_id [String] The segment ID
@@ -104,6 +113,7 @@ module ElevenlabsClient
 
      # PATCH /v1/dubbing/resource/{dubbing_id}/segment/{segment_id}/{language}
      # Updates a single segment with new text and/or start/end times
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/update-segment
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param segment_id [String] The segment ID
@@ -124,6 +134,7 @@ module ElevenlabsClient
 
      # POST /v1/dubbing/resource/{dubbing_id}/transcribe
      # Regenerates transcriptions for specified segments
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/transcribe-segment
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param segments [Array<String>] List of segment IDs to transcribe
@@ -135,6 +146,7 @@ module ElevenlabsClient
 
      # POST /v1/dubbing/resource/{dubbing_id}/translate
      # Regenerates translations for specified segments/languages
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/translate-segment
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param segments [Array<String>] List of segment IDs to translate
@@ -151,6 +163,7 @@ module ElevenlabsClient
 
      # POST /v1/dubbing/resource/{dubbing_id}/dub
      # Regenerates dubs for specified segments/languages
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/dub-segment
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param segments [Array<String>] List of segment IDs to dub
@@ -167,6 +180,7 @@ module ElevenlabsClient
 
      # POST /v1/dubbing/resource/{dubbing_id}/render/{language}
      # Renders the output media for a language
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/render-project
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param language [String] The language to render
@@ -184,6 +198,7 @@ module ElevenlabsClient
 
      # PATCH /v1/dubbing/resource/{dubbing_id}/speaker/{speaker_id}
      # Updates speaker metadata such as voice
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/update-speaker
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param speaker_id [String] The speaker ID
@@ -201,6 +216,7 @@ module ElevenlabsClient
 
      # GET /v1/dubbing/resource/{dubbing_id}/speaker/{speaker_id}/similar-voices
      # Gets similar voices for a speaker
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/resources/get-similar-voices
      #
      # @param dubbing_id [String] The dubbing job ID
      # @param speaker_id [String] The speaker ID
@@ -209,6 +225,40 @@ module ElevenlabsClient
        @client.get("/v1/dubbing/resource/#{dubbing_id}/speaker/#{speaker_id}/similar-voices")
      end
 
+     # GET /v1/dubbing/{dubbing_id}/audio/{language_code}
+     # Returns dub as a streamed MP3 or MP4 file
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/audio/get
+     #
+     # @param dubbing_id [String] ID of the dubbing project
+     # @param language_code [String] ID of the language
+     # @return [String] Binary audio/video data
+     def get_dubbed_audio(dubbing_id, language_code)
+       endpoint = "/v1/dubbing/#{dubbing_id}/audio/#{language_code}"
+       @client.get(endpoint)
+     end
+
+     # GET /v1/dubbing/{dubbing_id}/transcript/{language_code}
+     # Returns transcript for the dub as an SRT or WEBVTT file
+     # Documentation: https://elevenlabs.io/docs/api-reference/dubbing/transcript/get-transcript-for-dub
+     #
+     # @param dubbing_id [String] ID of the dubbing project
+     # @param language_code [String] ID of the language
+     # @param options [Hash] Optional parameters
+     # @option options [String] :format_type Format to use ("srt" or "webvtt", default: "srt")
+     # @return [String] Transcript in specified format
+     def get_dubbed_transcript(dubbing_id, language_code, **options)
+       endpoint = "/v1/dubbing/#{dubbing_id}/transcript/#{language_code}"
+
+       params = {}
+       params[:format_type] = options[:format_type] if options[:format_type]
+
+       @client.get(endpoint, params)
+     end
+
+     # Alias methods for convenience
+     alias_method :dubbed_audio, :get_dubbed_audio
+     alias_method :dubbed_transcript, :get_dubbed_transcript
+
      private
 
      attr_reader :client
@@ -0,0 +1,41 @@
+ # frozen_string_literal: true
+
+ module ElevenlabsClient
+   class ForcedAlignment
+     def initialize(client)
+       @client = client
+     end
+
+     # POST /v1/forced-alignment
+     # Force-aligns an audio file to text, returning timing information for each character and word
+     # Documentation: https://elevenlabs.io/docs/api-reference/forced-alignment
+     #
+     # @param audio_file [IO, File] The audio file to align (must be less than 1GB)
+     # @param filename [String] Original filename for the audio file
+     # @param text [String] The text to align with the audio
+     # @param options [Hash] Optional parameters
+     # @option options [Boolean] :enabled_spooled_file Stream file in chunks for large files (defaults to false)
+     # @return [Hash] JSON response containing characters, words arrays with timing info, and loss score
+     def create(audio_file, filename, text, **options)
+       endpoint = "/v1/forced-alignment"
+
+       payload = {
+         file: @client.file_part(audio_file, filename),
+         text: text
+       }
+
+       # Add optional parameters if provided
+       payload[:enabled_spooled_file] = options[:enabled_spooled_file] unless options[:enabled_spooled_file].nil?
+
+       @client.post_multipart(endpoint, payload)
+     end
+
+     # Alias methods for convenience
+     alias_method :align, :create
+     alias_method :force_align, :create
+
+     private
+
+     attr_reader :client
+   end
+ end
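The word-level `start`/`end` values in the forced-alignment response are plain seconds, and the README lists the ForcedAlignmentController example as covering subtitle generation. Turning a seconds value into a subtitle timestamp is a small formatting exercise (hypothetical helper, not part of the gem):

```ruby
# Hypothetical helper, not part of the gem: format a seconds value from
# the forced-alignment response as an SRT timestamp (HH:MM:SS,mmm).
def srt_timestamp(seconds)
  millis = (seconds * 1000).round
  hours, rem = millis.divmod(3_600_000)
  minutes, rem = rem.divmod(60_000)
  secs, ms = rem.divmod(1_000)
  format("%02d:%02d:%02d,%03d", hours, minutes, secs, ms)
end
```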