speech2text 0.01 → 0.3.0 (RubyGems)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

data/README.rdoc +18 -5
data/bin/speech2text +5 -0
data/lib/speech.rb +1 -0
data/lib/speech/audio_inspector.rb +1 -0
data/lib/speech/audio_splitter.rb +2 -1
data/lib/speech/audio_to_text.rb +6 -0
data/lib/speech/version.rb +2 -1
data/speech2text.gemspec +9 -8
data/test/audio_inspector_test.rb +1 -0
data/test/audio_splitter_test.rb +2 -1
data/test/audio_to_text_test.rb +21 -0
data/test/samples/i-like-pickles.json +1 -0
metadata +10 -9
data/lib/speech/text.rb +0 -11
data/test/i-like-pickles.wav +0 -0

data/README.rdoc CHANGED

@@ -1,10 +1,23 @@
-== Speech2Text
+= Speech2Text
 Using the power of ffmpeg/flac/Google and ruby here is a simple interface to play with to convert speech to text.
-At this point the API from Google is not documented and seemly free.
-The Google API will frequently return 500 errors without providing much reason as to why.
+Using a new undocumentd speech API from Google with the help of this article: http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/
-It's possible that Google will decide to not open this API up and this effort my completely be for not...
+We're able to provide a very simple API in Ruby to decode simple audio to text.
-This was all made possible in short order all thanks to Chrome 11 and http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/
+The API from Google is not yet public and so may change. It also seems to be very fragile as more times than not it will return a 500, so the library has retry code built in - for larger audio files 10+ failures may return before a successful result is retrieved...
+It also appears that the API only likes smaller audio files so there is a built in chunker that allows us to split the audio up into smaller chunks.
+== Example
+ audio = Speech::AudioToText.new("i-like-pickles.wav")
+ puts audio.to_text.inspect
+ => {"captured_json"=>[["I like pickles", 0.92731786]], "confidence"=>0.92731786}
+== Command Line
+ speech2text i-like-pickles.wav
+ cat i-like-pickles.json
+ {"captured_json"=>[["I like pickles", 0.92731786]], "confidence"=>0.92731786}
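A minimal sketch of reading the result shown in the README example above. It assumes the structure printed there (a "captured_json" list of [text, confidence] pairs plus a top-level "confidence") stays as shown, which the README itself warns may not hold for an unpublished API. The literal JSON string stands in for the contents of i-like-pickles.json.

```ruby
require 'json'

# Sample output copied from the README hunk above; a real run would read
# the JSON file produced by the command-line tool instead of this literal.
raw = '{"captured_json":[["I like pickles",0.92731786]],"confidence":0.92731786}'

result = JSON.parse(raw)

# Each entry in "captured_json" is a [text, confidence] pair.
text, score = result["captured_json"].first

puts "text:       #{text}"
puts "confidence: #{score}"
```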

data/bin/speech2text CHANGED

@@ -1,2 +1,7 @@
 #!/this/will/be/replaced/by/rubygems
 # -*- encoding: binary -*-
+require 'speech'
+captured_json = Speech::AudioToText.new(ARGV[0]).to_text
+puts captured_json.inspect

data/lib/speech.rb CHANGED

@@ -1,3 +1,4 @@
+# -*- encoding: binary -*-
 require 'curb'
 require 'json'

data/lib/speech/audio_inspector.rb CHANGED

@@ -1,3 +1,4 @@
+# -*- encoding: binary -*-
 module Speech
   class AudioInspector

data/lib/speech/audio_splitter.rb CHANGED

@@ -1,3 +1,4 @@
+# -*- encoding: binary -*-
 module Speech
   class AudioSplitter
@@ -8,7 +9,7 @@ module Speech
       def initialize(splitter, offset, duration)
         self.offset = offset
-        self.chunk = "chunk-" + splitter.original_file.gsub(/\.(.*)$/, "-#{offset}" + '.\1')
+        self.chunk = File.join(File.dirname(splitter.original_file), "chunk-" + File.basename(splitter.original_file).gsub(/\.(.*)$/, "-#{offset}" + '.\1'))
         self.duration = duration
         self.splitter = splitter
       end
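To see what the changed line in this hunk does to a path that contains a directory, here is a small sketch that applies the old and new expressions to the same input. The method names `old_chunk_name` and `new_chunk_name` are hypothetical wrappers added for illustration; only the expressions inside them come from the diff.

```ruby
# Old expression (0.01): the "chunk-" prefix is put in front of the whole
# path, so a directory component ends up inside the prefixed name.
def old_chunk_name(original_file, offset)
  "chunk-" + original_file.gsub(/\.(.*)$/, "-#{offset}" + '.\1')
end

# New expression (0.3.0): the prefix is applied to the basename only and
# the result is joined back onto the original directory.
def new_chunk_name(original_file, offset)
  File.join(File.dirname(original_file),
            "chunk-" + File.basename(original_file).gsub(/\.(.*)$/, "-#{offset}" + '.\1'))
end

path = "samples/i-like-pickles.wav"
puts old_chunk_name(path, 0) # chunk-samples/i-like-pickles-0.wav
puts new_chunk_name(path, 0) # samples/chunk-i-like-pickles-0.wav
```

The old form points into a "chunk-samples" directory, which would normally not exist. That is consistent with the test in this release moving its fixture to samples/i-like-pickles.wav.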

data/lib/speech/audio_to_text.rb CHANGED

@@ -1,3 +1,4 @@
+# -*- encoding: binary -*-
 module Speech
   class AudioToText
@@ -21,6 +22,10 @@ module Speech
       JSON.parse(File.read(self.captured_file))
     end
+    def clean
+      File.unlink self.captured_file if self.captured_file && File.exist?(self.captured_file)
+    end
   protected
     def convert_chunk(easy, chunk, options={})
@@ -29,6 +34,7 @@ module Speech
       while retrying
         #easy.verbose = true
         easy.headers['Content-Type'] = "audio/x-flac; rate=#{chunk.flac_rate}"
+        easy.headers['User-Agent'] = "https://github.com/taf2/speech2text"
         easy.post_body = "Content=#{chunk.to_flac_bytes}"
         easy.on_progress {|dl_total, dl_now, ul_total, ul_now| printf("%.2f/%.2f\r", ul_now, ul_total); true }
         easy.on_complete {|easy| puts }
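The README in this release says the library retries because the endpoint often returns a 500. The hunk above shows only part of that `while retrying` loop, so the following is a generic sketch of the retry idea rather than the gem's actual code. `with_retries` and its `max_attempts` parameter are hypothetical, and the failing request is simulated so the example runs without network access.

```ruby
# Hypothetical helper, not part of speech2text: run the block until it
# succeeds or max_attempts is reached, then re-raise the last error.
def with_retries(max_attempts: 10)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

# Simulated flaky endpoint: fails twice (like a 500), then succeeds.
calls = 0
result = with_retries(max_attempts: 5) do |attempt|
  calls += 1
  raise "HTTP 500" if attempt < 3
  "ok after #{attempt} attempts"
end

puts result
```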

data/lib/speech/version.rb CHANGED

@@ -1,5 +1,6 @@
+# -*- encoding: binary -*-
 module Speech
   class Info
-    VERSION='0.01'
+    VERSION='0.3.0'
   end
 end

data/speech2text.gemspec CHANGED

@@ -2,14 +2,15 @@ $:.unshift File.expand_path(File.dirname(__FILE__) + "/lib")
 require "speech/version"
 Gem::Specification.new do |s|
-  s.name    = "speech2text"
-  s.authors = ["Todd A. Fisher"]
-  s.email   = "todd.fisher@gmail.com"
-  s.version = Speech::Info::VERSION
-  s.homepage = "https://github.com/taf2/speech2text"
-  s.summary  = "Speech to Text Library"
-  s.description = "Super powers of Google wrapped in a nice Ruby interface"
-  s.files = Dir["{lib,bin,test}/**/*", "Rakefile", "README.rdoc", "*.gemspec"]
+  s.name           = "speech2text"
+  s.authors        = ["Todd A. Fisher"]
+  s.email          = "todd.fisher@gmail.com"
+  s.version        = Speech::Info::VERSION
+  s.homepage       = "https://github.com/taf2/speech2text"
+  s.summary        = "Speech to Text Library"
+  s.description    = "Super powers of Google wrapped in a nice Ruby interface"
+  s.files          = Dir["{lib,bin,test}/**/*", "Rakefile", "README.rdoc", "*.gemspec"]
+  s.executables    = %w(speech2text)
   s.add_dependency "curb"
   s.add_dependency "json"

data/test/audio_inspector_test.rb CHANGED

@@ -1,3 +1,4 @@
+# -*- encoding: binary -*-
 require 'test/unit'
 $:.unshift File.expand_path(File.dirname(__FILE__) + '/../lib')
 require 'speech'

data/test/audio_splitter_test.rb CHANGED

@@ -1,3 +1,4 @@
+# -*- encoding: binary -*-
 require 'test/unit'
 $:.unshift File.expand_path(File.dirname(__FILE__) + '/../lib')
 require 'speech'
@@ -5,7 +6,7 @@ require 'speech'
 class SpeechAudioSplitterTest < Test::Unit::TestCase
   def test_audio_splitter
-    splitter = Speech::AudioSplitter.new("i-like-pickles.wav", 1)
+    splitter = Speech::AudioSplitter.new("samples/i-like-pickles.wav", 1)
     assert_equal '00:00:03:52', splitter.duration.to_s
     assert_equal 3.52, splitter.duration.to_f

data/test/audio_to_text_test.rb ADDED

@@ -0,0 +1,21 @@
+# -*- encoding: binary -*-
+require 'test/unit'
+$:.unshift File.expand_path(File.dirname(__FILE__) + '/../lib')
+require 'speech'
+class SpeechAudioToTextTest < Test::Unit::TestCase
+  def test_audio_to_text
+    audio = Speech::AudioToText.new("samples/i-like-pickles.wav")
+    captured_json = audio.to_text
+    assert captured_json
+    assert captured_json.key?("captured_json")
+    assert !captured_json['captured_json'].empty?
+    assert_equal ['captured_json', 'confidence'], captured_json.keys.sort
+    assert_equal "I like pickles", captured_json['captured_json'].flatten.first
+    assert captured_json['confidence'] > 0.9
+#    {"captured_json"=>[["I like pickles", 0.92731786]], "confidence"=>0.92731786}
+#    puts captured_json.inspect
+  ensure
+    audio.clean
+  end
+end

data/test/samples/i-like-pickles.json ADDED

@@ -0,0 +1 @@
+ {"captured_json":[["I like pickles",0.92731786]],"confidence":0.92731786}

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: speech2text
 version: !ruby/object:Gem::Version
-  version: '0.01'
+  version: 0.3.0
   prerelease:
 platform: ruby
 authors:
@@ -9,12 +9,12 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2011-03-24 00:00:00.000000000 -04:00
+date: 2011-03-25 00:00:00.000000000 -04:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
   name: curb
-  requirement: &2157005460 !ruby/object:Gem::Requirement
+  requirement: &2157005180 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -22,10 +22,10 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements: *2157005460
+  version_requirements: *2157005180
 - !ruby/object:Gem::Dependency
   name: json
-  requirement: &2157005040 !ruby/object:Gem::Requirement
+  requirement: &2157004740 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -33,24 +33,25 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements: *2157005040
+  version_requirements: *2157004740
 description: Super powers of Google wrapped in a nice Ruby interface
 email: todd.fisher@gmail.com
-executables: []
+executables:
+- speech2text
 extensions: []
 extra_rdoc_files: []
 files:
 - lib/speech/audio_inspector.rb
 - lib/speech/audio_splitter.rb
 - lib/speech/audio_to_text.rb
-- lib/speech/text.rb
 - lib/speech/version.rb
 - lib/speech.rb
 - bin/speech2text
 - test/audio_inspector_test.rb
 - test/audio_splitter_test.rb
-- test/i-like-pickles.wav
+- test/audio_to_text_test.rb
 - test/SampleAudio.wav
+- test/samples/i-like-pickles.json
 - test/samples/i-like-pickles.wav
 - Rakefile
 - README.rdoc

data/lib/speech/text.rb DELETED

@@ -1,11 +0,0 @@
-module Speech
-  class Text
-    def initialize(audio_file, options={})
-    end
-    def decode_audio(flac16k_audio)
-    end
-  end
-end

data/test/i-like-pickles.wav DELETED

Binary file