pocketsphinx-ruby 0.0.1 → 0.0.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: b67faaa7d60b0ff377d160f8fd88a28ebcc1eaa4
4
- data.tar.gz: 163809adeed0af96f876f10bb2b2e93b4ebc8ba6
3
+ metadata.gz: 440699d34e0585b3670bd4bfa91e6e9a87b2331f
4
+ data.tar.gz: 8a697fa2d7e491e4eccfb47fe678a6acfcf695a7
5
5
  SHA512:
6
- metadata.gz: f7e109bae75aadd7a1ea053fb21cfd099e95c8e7476c08ea0d45d01991d84a93713e346e63d415e14981353f557d6c78fd11177cac726f1145ecbd1b80567f9f
7
- data.tar.gz: a5b0af62adb929f617239f651d6929eb88b55861e5318970f14e46eaf81b49433fb1492bcfe2d176ea53f982843c3cd674d5cd7e63155ebe788618c0cbaa397b
6
+ metadata.gz: 7d913ab82f397056b9b90bb5f7d4fb6609a618a367a361c3936afd4e850caf908614bb16a174cae26160241cc8424eb7abae4404595b1d27e6a90bc0e431f2cf
7
+ data.tar.gz: ab6b8b36f3b9ef07f0086cca1e28b49b146116b15a3acf7e06f0fc80d50fc133a2653e05f437344c2f8532284c4e8fa8205151bca3de379f4a4f354048accdf5
data/README.md CHANGED
@@ -6,7 +6,7 @@
6
6
 
7
7
  This gem provides Ruby [FFI](https://github.com/ffi/ffi) bindings for [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx), a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop. Pocketsphinx is part of the [CMU Sphinx](http://cmusphinx.sourceforge.net/) Open Source Toolkit For Speech Recognition.
8
8
 
9
- I had initially looked at using Pocketsphinx's [SWIG](http://www.swig.org/) interface for this gem, but decided in favor of FFI for many of the reasons outlined [here](https://github.com/ffi/ffi/wiki/Why-use-FFI), but most importantly ease of maintenance and JRuby support.
9
+ Pocketsphinx's [SWIG](http://www.swig.org/) interface was initially considered for this gem, but dropped in favor of FFI for many of the reasons outlined [here](https://github.com/ffi/ffi/wiki/Why-use-FFI), most importantly ease of maintenance and JRuby support.
10
10
 
11
11
  The goal of this project is to make it as easy as possible for the Ruby community to experiment with speech recognition. Please do contribute fixes and enhancements.
12
12
 
@@ -62,6 +62,41 @@ Pocketsphinx::LiveSpeechRecognizer.new.recognize do |speech|
62
62
  end
63
63
  ```
64
64
 
65
+ The `AudioFileSpeechRecognizer` decodes directly from an audio file by coordinating interactions between an `AudioFile` and `Decoder`.
66
+
67
+ ```ruby
68
+ recognizer = Pocketsphinx::AudioFileSpeechRecognizer.new
69
+
70
+ recognizer.recognize('spec/assets/audio/goforward.raw') do |speech|
71
+ puts speech # => "go forward ten years"
72
+ end
73
+ ```
74
+
75
+ These two classes split speech into utterances by detecting silence between them. By default this uses Pocketsphinx's internal Voice Activity Detection (VAD), which can be configured by adjusting the `vad_postspeech`, `vad_prespeech`, and `vad_threshold` configuration settings.
76
+
77
+
78
+ ## Configuration
79
+
80
+ All of Pocketsphinx's decoding settings are managed by the `Configuration` class, which can be passed into the high-level speech recognizers:
81
+
82
+ ```ruby
83
+ configuration = Pocketsphinx::Configuration.default
84
+ configuration.details('vad_threshold')
85
+ # => {
86
+ # :name => "vad_threshold",
87
+ # :type => :float,
88
+ # :default => 2.0,
89
+ # :value => 2.0,
90
+ # :info => "Threshold for decision between noise and silence frames. Log-ratio between signal level and noise level."
91
+ # }
92
+
93
+ configuration['vad_threshold'] = 4
94
+
95
+ Pocketsphinx::LiveSpeechRecognizer.new(configuration)
96
+ ```
97
+
98
+ The full output of `configuration.details` is available [here](https://github.com/watsonbox/pocketsphinx-ruby/wiki/Default-Pocketsphinx-Configuration) for more information on the various settings.
99
+
65
100
 
66
101
  ## Microphone
67
102
 
@@ -86,6 +121,20 @@ File.open("test.raw", "wb") do |file|
86
121
  end
87
122
  ```
88
123
 
124
+ To open this audio file, take a look at [this wiki page](https://github.com/watsonbox/pocketsphinx-ruby/wiki/Importing-raw-PCM-audio-with-Audacity).
125
+
126
+
127
+ ## Decoder
128
+
129
+ The `Decoder` class uses Pocketsphinx's libpocketsphinx to decode audio data into text. For example, to decode a single utterance:
130
+
131
+ ```ruby
132
+ decoder = Decoder.new(Configuration.default)
133
+ decoder.decode 'spec/assets/audio/goforward.raw'
134
+
135
+ puts decoder.hypothesis # => "go forward ten years"
136
+ ```
137
+
89
138
 
90
139
  ## Contributing
91
140
 
@@ -0,0 +1,11 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "pocketsphinx-ruby"
5
+
6
+ include Pocketsphinx
7
+
8
+ decoder = Decoder.new(Configuration.default)
9
+ decoder.decode 'spec/assets/audio/goforward.raw'
10
+
11
+ puts decoder.hypothesis # => "go forward ten years"
@@ -16,7 +16,7 @@ microphone = Microphone.new
16
16
  File.open("test_write.raw", "wb") do |file|
17
17
  microphone.record do
18
18
  FFI::MemoryPointer.new(:int16, MAX_SAMPLES) do |buffer|
19
- (RECORDING_LENGTH / RECORDING_INTERVAL).times do
19
+ (RECORDING_LENGTH / RECORDING_INTERVAL).to_i.times do
20
20
  sample_count = microphone.read_audio(buffer, MAX_SAMPLES)
21
21
 
22
22
  # sample_count * 2 since this is length in bytes
data/lib/pocketsphinx.rb CHANGED
@@ -6,10 +6,12 @@ require "pocketsphinx/api/sphinxad"
6
6
  require "pocketsphinx/api/pocketsphinx"
7
7
 
8
8
  require "pocketsphinx/configuration"
9
+ require "pocketsphinx/audio_file"
9
10
  require "pocketsphinx/microphone"
10
11
  require "pocketsphinx/decoder"
11
12
  require "pocketsphinx/speech_recognizer"
12
13
  require "pocketsphinx/live_speech_recognizer"
14
+ require "pocketsphinx/audio_file_speech_recognizer"
13
15
 
14
16
  module Pocketsphinx
15
17
 
@@ -4,14 +4,18 @@ module Pocketsphinx
4
4
  extend FFI::Library
5
5
  ffi_lib "libpocketsphinx"
6
6
 
7
- attach_function :ps_init, [:pointer], :pointer
7
+ typedef :pointer, :decoder
8
+ typedef :pointer, :configuration
9
+
10
+ attach_function :ps_init, [:configuration], :decoder
8
11
  attach_function :ps_default_search_args, [:pointer], :void
9
12
  attach_function :ps_args, [], :pointer
10
- attach_function :ps_process_raw, [:pointer, :pointer, :size_t, :int, :int], :int
11
- attach_function :ps_start_utt, [:pointer, :string], :int
12
- attach_function :ps_end_utt, [:pointer], :int
13
- attach_function :ps_get_in_speech, [:pointer], :uint8
14
- attach_function :ps_get_hyp, [:pointer, :pointer, :pointer], :string
13
+ attach_function :ps_decode_raw, [:decoder, :pointer, :string, :long], :int
14
+ attach_function :ps_process_raw, [:decoder, :pointer, :size_t, :int, :int], :int
15
+ attach_function :ps_start_utt, [:decoder, :string], :int
16
+ attach_function :ps_end_utt, [:decoder], :int
17
+ attach_function :ps_get_in_speech, [:decoder], :uint8
18
+ attach_function :ps_get_hyp, [:decoder, :pointer, :pointer], :string
15
19
  end
16
20
  end
17
21
  end
@@ -0,0 +1,32 @@
1
+ module Pocketsphinx
2
+ # Implements Recordable interface (#record and #read_audio)
3
+ class AudioFile < Struct.new(:file_path)
4
+ def record
5
+ File.open(file_path, 'rb') do |file|
6
+ self.file = file
7
+ yield
8
+ self.file = nil
9
+ end
10
+ end
11
+
12
+ # Read next block of audio samples from file; up to max samples into buffer.
13
+ #
14
+ # @param [FFI::Pointer] buffer 16bit buffer of at least max_samples in size
15
+ # @param [Fixnum] max_samples The maximum number of samples to read from the audio file
16
+ # @return [Fixnum] Samples actually read; nil if EOF
17
+ def read_audio(buffer, max_samples = 4096)
18
+ if file.nil?
19
+ raise "Can't read audio: use AudioFile#record to open the file first"
20
+ end
21
+
22
+ if data = file.read(max_samples * 2)
23
+ buffer.write_string(data)
24
+ data.length / 2
25
+ end
26
+ end
27
+
28
+ private
29
+
30
+ attr_accessor :file
31
+ end
32
+ end
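The byte and sample bookkeeping in `AudioFile#read_audio` can be sketched without FFI. The following is a minimal illustration, not the gem's code: `FakeBuffer` and `read_samples` are hypothetical stand-ins (the real method writes into an `FFI::Pointer`), and the input is an in-memory `StringIO` rather than a file. It assumes 16-bit samples, so two bytes per sample.

```ruby
require 'stringio'

# Hypothetical stand-in for an FFI::Pointer; it only records what was written.
class FakeBuffer
  attr_reader :bytes

  def write_string(data)
    @bytes = data
  end
end

BYTES_PER_SAMPLE = 2 # 16-bit PCM

# Reads up to max_samples samples from io into buffer.
# Returns the number of samples read, or nil at end of stream.
def read_samples(io, buffer, max_samples)
  data = io.read(max_samples * BYTES_PER_SAMPLE)
  return nil if data.nil?

  buffer.write_string(data)
  data.bytesize / BYTES_PER_SAMPLE
end

# Five little-endian 16-bit samples (10 bytes), read three samples at a time
io = StringIO.new([1, 2, 3, 4, 5].pack('s<*'))
buffer = FakeBuffer.new

counts = []
while (count = read_samples(io, buffer, 3))
  counts << count
end

counts # => [3, 2]
```

The final read returns `nil` rather than `0`, which is what lets a caller distinguish "end of input" from "no samples available yet".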
@@ -0,0 +1,12 @@
1
+ module Pocketsphinx
2
+ # High-level class for speech recognition from a raw audio file.
3
+ class AudioFileSpeechRecognizer < SpeechRecognizer
4
+ def recognize(file_path, max_samples = 4096)
5
+ self.recordable = AudioFile.new(file_path)
6
+
7
+ super(max_samples) do |speech|
8
+ yield speech if block_given?
9
+ end
10
+ end
11
+ end
12
+ end
@@ -3,6 +3,7 @@ require 'pocketsphinx/configuration/setting_definition'
3
3
  module Pocketsphinx
4
4
  class Configuration
5
5
  attr_reader :ps_config
6
+ attr_reader :setting_definitions
6
7
 
7
8
  private_class_method :new
8
9
 
@@ -22,12 +23,33 @@ module Pocketsphinx
22
23
  new(API::Pocketsphinx.ps_args)
23
24
  end
24
25
 
25
- def [](name)
26
- unless definition = @setting_definitions[name]
27
- raise "Configuration setting '#{name}' does not exist"
26
+ def setting_names
27
+ setting_definitions.keys.sort
28
+ end
29
+
30
+ # Get details for one or all configuration settings
31
+ #
32
+ # @param [String] name Name of setting to get details for. Gets details for all settings if nil.
33
+ def details(name = nil)
34
+ details = [name || setting_names].flatten.map do |name|
35
+ definition = find_definition(name)
36
+
37
+ {
38
+ name: name,
39
+ type: definition.type,
40
+ default: definition.default,
41
+ required: definition.required?,
42
+ value: self[name],
43
+ info: definition.doc
44
+ }
28
45
  end
29
46
 
30
- case definition.type
47
+ name ? details.first : details
48
+ end
49
+
50
+ # Get a configuration setting
51
+ def [](name)
52
+ case find_definition(name).type
31
53
  when :integer
32
54
  API::Sphinxbase.cmd_ln_int_r(@ps_config, "-#{name}")
33
55
  when :float
@@ -41,12 +63,9 @@ module Pocketsphinx
41
63
  end
42
64
  end
43
65
 
66
+ # Set a configuration setting with type checking
44
67
  def []=(name, value)
45
- unless definition = @setting_definitions[name]
46
- raise "Configuration setting '#{name}' does not exist"
47
- end
48
-
49
- case definition.type
68
+ case find_definition(name).type
50
69
  when :integer
51
70
  raise "Configuration setting '#{name}' must be a Fixnum" unless value.respond_to?(:to_i)
52
71
  API::Sphinxbase.cmd_ln_set_int_r(@ps_config, "-#{name}", value.to_i)
@@ -61,5 +80,11 @@ module Pocketsphinx
61
80
  raise NotImplementedError
62
81
  end
63
82
  end
83
+
84
+ private
85
+
86
+ def find_definition(name)
87
+ setting_definitions[name] or raise "Configuration setting '#{name}' does not exist"
88
+ end
64
89
  end
65
90
  end
@@ -1,19 +1,25 @@
1
1
  module Pocketsphinx
2
2
  class Configuration
3
- class SettingDefinition
3
+ class SettingDefinition < Struct.new(:name, :type_code, :deflt, :doc)
4
4
  TYPES = [:integer, :float, :string, :boolean, :string_list]
5
5
 
6
- def initialize(name, type_code, default, doc)
7
- @name, @type_code, @default, @doc = name, type_code, default, doc
8
- end
9
-
10
6
  def type
11
7
  # Remove the required bit if it exists and find type from log2 of code
12
- TYPES[Math.log2(@type_code - @type_code%2) - 1]
8
+ TYPES[Math.log2(type_code - type_code%2) - 1]
9
+ end
10
+
11
+ # Convert string defaults from pocketsphinx to Ruby types
12
+ def default
13
+ case type
14
+ when :integer then deflt.to_i
15
+ when :float then deflt.to_f
16
+ when :boolean then deflt == 'yes'
17
+ else deflt
18
+ end
13
19
  end
14
20
 
15
21
  def required?
16
- @type_code % 2 == 1
22
+ type_code % 2 == 1
17
23
  end
18
24
 
19
25
  # Build setting definitions from pocketsphinx argument definitions
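The `type` and `required?` methods above treat the type code as a bit field: the lowest bit marks a required argument and a single higher bit selects the type. The sketch below illustrates that decoding. The concrete codes (2 for integer, 4 for float, and so on) are inferred from the `log2` lookup in this diff and should be treated as an assumption, and `decode_type_code` is a hypothetical helper, not part of the gem.

```ruby
TYPES = [:integer, :float, :string, :boolean, :string_list]

# Hypothetical helper: split a type code into [type, required].
# Assumes bit 0 is the "required" flag and one of bits 1..5 selects the type.
def decode_type_code(type_code)
  required = type_code.odd?
  type_bits = type_code - (type_code % 2) # clear the required bit
  index = Math.log2(type_bits).to_i - 1   # 2 -> 0, 4 -> 1, 8 -> 2, ...
  [TYPES[index], required]
end

decode_type_code(4)  # => [:float, false]
decode_type_code(9)  # => [:string, true]
decode_type_code(16) # => [:boolean, false]
```

Converting the logarithm with `to_i` makes the array index an explicit integer instead of relying on `Array#[]` truncating a float.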
@@ -10,6 +10,42 @@ module Pocketsphinx
10
10
  @ps_decoder = ps_api.ps_init(configuration.ps_config)
11
11
  end
12
12
 
13
+ # Decode a raw audio stream as a single utterance, opening a file if path given
14
+ #
15
+ # See #decode_raw
16
+ #
17
+ # @param [IO] audio_path_or_file The raw audio stream or file path to decode as a single utterance
18
+ # @param [Fixnum] max_samples The maximum samples to process from the stream on each iteration
19
+ def decode(audio_path_or_file, max_samples = 2048)
20
+ case audio_path_or_file
21
+ when String
22
+ File.open(audio_path_or_file, 'rb') { |f| decode_raw(f, max_samples) }
23
+ else
24
+ decode_raw(audio_path_or_file, max_samples)
25
+ end
26
+ end
27
+
28
+ # Decode a raw audio stream as a single utterance.
29
+ #
30
+ # No headers are recognized in this file. The configuration parameters samprate
31
+ # and input_endian are used to determine the sampling rate and endianness of the stream,
32
+ # respectively. Audio is always assumed to be 16-bit signed PCM.
33
+ #
34
+ # @param [IO] audio_file The raw audio stream to decode as a single utterance
35
+ # @param [Fixnum] max_samples The maximum samples to process from the stream on each iteration
36
+ def decode_raw(audio_file, max_samples = 2048)
37
+ start_utterance
38
+
39
+ FFI::MemoryPointer.new(:int16, max_samples) do |buffer|
40
+ while data = audio_file.read(max_samples * 2)
41
+ buffer.write_string(data)
42
+ process_raw(buffer, data.length / 2)
43
+ end
44
+ end
45
+
46
+ end_utterance
47
+ end
48
+
13
49
  # Decode raw audio data.
14
50
  #
15
51
  # @param [Boolean] no_search If non-zero, perform feature extraction but don't do any
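`Decoder#decode` accepts either a file path or an already open stream. As a rough sketch of that dispatch pattern in isolation (the `with_audio_stream` and `sample_count` helpers are hypothetical and count samples instead of decoding them):

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: yields an open binary stream for a path or an IO
def with_audio_stream(path_or_io)
  case path_or_io
  when String
    File.open(path_or_io, 'rb') { |file| yield file }
  else
    yield path_or_io
  end
end

# Count 16-bit samples from either kind of input
def sample_count(path_or_io)
  with_audio_stream(path_or_io) { |io| io.read.bytesize / 2 }
end

# Four bytes from an in-memory stream
from_io = sample_count(StringIO.new("\x00\x01\x02\x03"))

# Six bytes from a temporary file, addressed by path
from_path = Tempfile.create('audio') do |file|
  file.binmode
  file.write("\x00\x01\x02\x03\x04\x05")
  file.flush
  sample_count(file.path)
end

[from_io, from_path] # => [2, 3]
```

Opening the file in block form means a path argument is always closed again, while a stream passed in by the caller is left open for the caller to manage.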
@@ -3,46 +3,8 @@ module Pocketsphinx
3
3
  #
4
4
  # Modeled on the LiveSpeechRecognizer from Sphinx4.
5
5
  class LiveSpeechRecognizer < SpeechRecognizer
6
- attr_writer :microphone
7
-
8
- def microphone
9
- @microphone ||= Microphone.new
10
- end
11
-
12
- # Recognize utterances and yield hypotheses in infinite loop
13
- #
14
- # @param [Float]
15
- def recognize(recording_interval = 0.1, max_samples = 4096)
16
- decoder.start_utterance
17
-
18
- microphone.record do
19
- FFI::MemoryPointer.new(:int16, max_samples) do |buffer|
20
- loop do
21
- if decoder.in_speech?
22
- process_audio(buffer, max_samples, recording_interval) while decoder.in_speech?
23
- yield get_hypothesis
24
- else
25
- process_audio(buffer, max_samples, recording_interval)
26
- end
27
- end
28
- end
29
- end
30
- end
31
-
32
- private
33
-
34
- def process_audio(buffer, max_samples, delay)
35
- sample_count = microphone.read_audio(buffer, max_samples)
36
- decoder.process_raw(buffer, sample_count)
37
- sleep delay
38
- end
39
-
40
- # Called on speech -> silence transition
41
- def get_hypothesis
42
- decoder.end_utterance
43
- decoder.hypothesis.tap do
44
- decoder.start_utterance
45
- end
6
+ def recordable
7
+ @recordable ||= Microphone.new
46
8
  end
47
9
  end
48
10
  end
@@ -1,10 +1,13 @@
1
1
  module Pocketsphinx
2
- # Provides non-blocking audio recording using libsphinxad
2
+ # Provides non-blocking live audio recording using libsphinxad
3
+ #
4
+ # Implements Recordable interface (#record and #read_audio)
3
5
  class Microphone
4
6
  Error = Class.new(StandardError)
5
7
 
6
8
  attr_reader :ps_audio_device
7
9
  attr_writer :ps_api
10
+ attr_reader :sample_rate
8
11
 
9
12
  # Opens an audio device for recording
10
13
  #
@@ -14,8 +17,9 @@ module Pocketsphinx
14
17
  # @param [String] default_device The device name
15
18
  # @param [Object] ps_api A SphinxAD API implementation to use, API::SphinxAD if not provided
16
19
  def initialize(sample_rate = 16000, default_device = nil, ps_api = nil)
20
+ @sample_rate = sample_rate
17
21
  @ps_api = ps_api
18
- @ps_audio_device = ps_api.ad_open_dev(default_device, sample_rate)
22
+ @ps_audio_device = self.ps_api.ad_open_dev(default_device, sample_rate)
19
23
 
20
24
  # Ensure that audio device is closed when object is garbage collected
21
25
  ObjectSpace.define_finalizer(self, self.class.finalize(ps_api, @ps_audio_device))
@@ -46,10 +50,22 @@ module Pocketsphinx
46
50
  # Read next block of audio samples while recording; read up to max samples into buffer.
47
51
  #
48
52
  # @param [FFI::Pointer] buffer 16bit buffer of at least max_samples in size
49
- # @return [Fixnum] Samples actually read (could be 0 since non-blocking); -1 if not
53
+ # @param [Fixnum] max_samples The maximum number of samples to read from the audio device
54
+ # @return [Fixnum] Samples actually read (could be 0 since non-blocking); nil if not
50
55
  # recording and no more samples remaining to be read from most recent recording.
51
56
  def read_audio(buffer, max_samples = 4096)
52
- ps_api.ad_read(@ps_audio_device, buffer, max_samples)
57
+ samples = ps_api.ad_read(@ps_audio_device, buffer, max_samples)
58
+ samples if samples >= 0
59
+ end
60
+
61
+ # A Recordable may specify an audio reading delay
62
+ #
63
+ # In the case of the Microphone, because we are doing non-blocking reads,
64
+ # we specify a delay which should fill half of the max buffer size
65
+ #
66
+ # @param [Fixnum] max_samples The maximum samples we tried to read from the audio device
67
+ def read_audio_delay(max_samples = 4096)
68
+ max_samples.to_f / (2 * sample_rate)
53
69
  end
54
70
 
55
71
  def close_device
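The delay returned by `read_audio_delay` is meant to be the time it takes to fill half of the buffer. As a worked example of that arithmetic (a sketch with a hypothetical `half_buffer_delay` helper, assuming the defaults of 4096 samples and a 16000 Hz sample rate): dividing two Ruby integers truncates, so the sketch converts to a float first.

```ruby
# Hypothetical helper: seconds needed to record half of max_samples
def half_buffer_delay(max_samples, sample_rate)
  max_samples.to_f / (2 * sample_rate)
end

half_buffer_delay(4096, 16_000) # => 0.128 (seconds)
4096 / (2 * 16_000)             # => 0, because integer division truncates
```

With integer operands the result would be a zero-second sleep, which turns the non-blocking read loop into a busy loop.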
@@ -1,9 +1,83 @@
1
1
  module Pocketsphinx
2
+ # Reads audio data from a recordable interface and decodes it into utterances
3
+ #
4
+ # Essentially orchestrates interaction between Recordable and Decoder, and detects new utterances.
2
5
  class SpeechRecognizer
3
- attr_reader :decoder
6
+ # Recordable interface must implement #record and #read_audio
7
+ attr_writer :recordable
8
+ attr_writer :decoder
4
9
 
5
- def initialize(configuration= nil)
6
- @decoder = Decoder.new(configuration || Configuration.default)
10
+ def initialize(configuration = nil)
11
+ @configuration = configuration
12
+ end
13
+
14
+ def recordable
15
+ @recordable or raise "A SpeechRecognizer must have a recordable interface"
16
+ end
17
+
18
+ def decoder
19
+ @decoder ||= Decoder.new(configuration)
20
+ end
21
+
22
+ def configuration
23
+ @configuration ||= Configuration.default
24
+ end
25
+
26
+ # Recognize utterances and yield hypotheses in infinite loop
27
+ #
28
+ # Splits speech into utterances by detecting silence between them.
29
+ # By default this uses Pocketsphinx's internal Voice Activity Detection (VAD) which can be
30
+ # configured by adjusting the `vad_postspeech`, `vad_prespeech`, and `vad_threshold` settings.
31
+ #
32
+ # @param [Fixnum] max_samples Number of samples to process at a time
33
+ def recognize(max_samples = 4096)
34
+ decoder.start_utterance
35
+
36
+ recordable.record do
37
+ FFI::MemoryPointer.new(:int16, max_samples) do |buffer|
38
+ loop do
39
+ if in_speech?
40
+ while decoder.in_speech?
41
+ process_audio(buffer, max_samples) or break
42
+ end
43
+
44
+ yield get_hypothesis
45
+ else
46
+ process_audio(buffer, max_samples) or break
47
+ end
48
+ end
49
+ end
50
+ end
51
+ end
52
+
53
+ def in_speech?
54
+ # Use Pocketsphinx's implementation by default
55
+ decoder.in_speech?
56
+ end
57
+
58
+ private
59
+
60
+ def process_audio(buffer, max_samples)
61
+ sample_count = recordable.read_audio(buffer, max_samples)
62
+
63
+ if sample_count
64
+ decoder.process_raw(buffer, sample_count)
65
+
66
+ # Check for a delay for example in case of non-blocking live audio
67
+ if recordable.respond_to?(:read_audio_delay)
68
+ sleep recordable.read_audio_delay(max_samples)
69
+ end
70
+ end
71
+
72
+ sample_count
73
+ end
74
+
75
+ # Called on speech -> silence transition
76
+ def get_hypothesis
77
+ decoder.end_utterance
78
+ decoder.hypothesis.tap do
79
+ decoder.start_utterance
80
+ end
7
81
  end
8
82
  end
9
83
  end
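The control flow of `SpeechRecognizer#recognize` can be illustrated without audio hardware. The sketch below is a simplification under stated assumptions: `ScriptedRecordable` and `ScriptedDecoder` are hypothetical stubs that replay pre-labelled frames instead of decoding audio, and `recognize_sketch` mirrors only the loop structure (read until the recordable returns nil, yield a hypothesis on each speech-to-silence transition).

```ruby
# Hypothetical recordable: replays a fixed list of frames
class ScriptedRecordable
  def initialize(frames)
    @frames = frames.dup
  end

  def record
    yield
  end

  # Returns the next frame, or nil when the input is exhausted
  def read_audio
    @frames.shift
  end
end

# Hypothetical decoder: a frame is either a word (speech) or :silence
class ScriptedDecoder
  def initialize
    @words = []
    @in_speech = false
  end

  def in_speech?
    @in_speech
  end

  def process(frame)
    @in_speech = (frame != :silence)
    @words << frame if @in_speech
  end

  # Return the words heard so far and reset for the next utterance
  def end_utterance
    hypothesis = @words.join(' ')
    @words = []
    @in_speech = false
    hypothesis
  end
end

def recognize_sketch(recordable, decoder)
  # Feed one frame to the decoder; returns nil at end of input
  process = lambda do
    frame = recordable.read_audio
    decoder.process(frame) if frame
    frame
  end

  recordable.record do
    loop do
      if decoder.in_speech?
        while decoder.in_speech?
          process.call or break
        end
        yield decoder.end_utterance
      else
        process.call or break
      end
    end
  end
end

frames = ['go', 'forward', :silence, 'ten', 'years', :silence]
utterances = []
recognize_sketch(ScriptedRecordable.new(frames), ScriptedDecoder.new) do |speech|
  utterances << speech
end

utterances # => ["go forward", "ten years"]
```

In this stub, `end_utterance` also clears the speech flag so that input ending mid-utterance still terminates; whether the real decoder behaves the same way is not shown in this diff.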
@@ -1,3 +1,3 @@
1
1
  module Pocketsphinx
2
- VERSION = "0.0.1"
2
+ VERSION = "0.0.2"
3
3
  end
Binary file
@@ -44,4 +44,37 @@ describe Configuration do
44
44
  it 'raises exceptions when a setting is unknown' do
45
45
  expect { subject['unknown'] = true }.to raise_exception "Configuration setting 'unknown' does not exist"
46
46
  end
47
+
48
+ describe '#setting_names' do
49
+ it 'contains the names of all possible system settings' do
50
+ expect(subject.setting_names.count).to eq(117)
51
+ end
52
+ end
53
+
54
+ describe '#details' do
55
+ it 'gives details for a single setting' do
56
+ expect(subject.details 'vad_threshold').to eq({
57
+ name: "vad_threshold",
58
+ type: :float,
59
+ default: 2.0,
60
+ required: false,
61
+ value: 2.0,
62
+ info: "Threshold for decision between noise and silence frames. Log-ratio between signal level and noise level."
63
+ })
64
+ end
65
+
66
+ it 'gives details for all settings when no name is specified' do
67
+ details = subject.details
68
+
69
+ expect(details.count).to eq(117)
70
+ expect(details.first).to eq({
71
+ name: "agc",
72
+ type: :string,
73
+ default: "none",
74
+ required: false,
75
+ value: "none",
76
+ info: "Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')"
77
+ })
78
+ end
79
+ end
47
80
  end
data/spec/decoder_spec.rb CHANGED
@@ -9,6 +9,22 @@ describe Decoder do
9
9
  @decoder = Decoder.new(Configuration.default)
10
10
  end
11
11
 
12
+ # Full integration test
13
+ describe '#decode' do
14
+ it 'correctly decodes the speech in goforward.raw' do
15
+ subject.decode File.open('spec/assets/audio/goforward.raw', 'rb')
16
+
17
+ # With the default configuration (no specific grammar), pocketsphinx doesn't actually
18
+ # get this quite right, but nonetheless this is the expected output
19
+ expect(subject.hypothesis).to eq("go forward ten years")
20
+ end
21
+
22
+ it 'accepts a file path as well as a stream' do
23
+ subject.decode 'spec/assets/audio/goforward.raw'
24
+ expect(subject.hypothesis).to eq("go forward ten years")
25
+ end
26
+ end
27
+
12
28
  describe '#process_raw' do
13
29
  it 'calls libpocketsphinx' do
14
30
  FFI::MemoryPointer.new(:int16, 4096) do |buffer|
@@ -0,0 +1,23 @@
1
+ require 'spec_helper'
2
+
3
+ describe SpeechRecognizer do
4
+ let(:recordable) { AudioFile.new('spec/assets/audio/goforward.raw') }
5
+
6
+ subject do
7
+ SpeechRecognizer.new.tap do |speech_recognizer|
8
+ speech_recognizer.recordable = recordable
9
+ speech_recognizer.decoder = @decoder
10
+ end
11
+ end
12
+
13
+ # Share decoder across all examples for speed
14
+ before :all do
15
+ @decoder = Decoder.new(Configuration.default)
16
+ end
17
+
18
+ describe '#recognize' do
19
+ it 'should decode speech in raw audio' do
20
+ expect { |b| subject.recognize(4096, &b) }.to yield_with_args("go forward ten years")
21
+ end
22
+ end
23
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pocketsphinx-ruby
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.0.2
5
5
  platform: ruby
6
6
  authors:
7
7
  - Howard Wilson
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2014-10-19 00:00:00.000000000 Z
11
+ date: 2014-10-20 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: ffi
@@ -94,6 +94,7 @@ files:
94
94
  - LICENSE.txt
95
95
  - README.md
96
96
  - Rakefile
97
+ - examples/decode_audio_file.rb
97
98
  - examples/pocketsphinx_continuous.rb
98
99
  - examples/record_audio_file.rb
99
100
  - lib/pocketsphinx-ruby.rb
@@ -101,6 +102,8 @@ files:
101
102
  - lib/pocketsphinx/api/pocketsphinx.rb
102
103
  - lib/pocketsphinx/api/sphinxad.rb
103
104
  - lib/pocketsphinx/api/sphinxbase.rb
105
+ - lib/pocketsphinx/audio_file.rb
106
+ - lib/pocketsphinx/audio_file_speech_recognizer.rb
104
107
  - lib/pocketsphinx/configuration.rb
105
108
  - lib/pocketsphinx/configuration/setting_definition.rb
106
109
  - lib/pocketsphinx/decoder.rb
@@ -109,10 +112,12 @@ files:
109
112
  - lib/pocketsphinx/speech_recognizer.rb
110
113
  - lib/pocketsphinx/version.rb
111
114
  - pocketsphinx-ruby.gemspec
115
+ - spec/assets/audio/goforward.raw
112
116
  - spec/configuration_spec.rb
113
117
  - spec/decoder_spec.rb
114
118
  - spec/microphone_spec.rb
115
119
  - spec/spec_helper.rb
120
+ - spec/speech_recognizer_spec.rb
116
121
  homepage: https://github.com/watsonbox/pocketsphinx-ruby
117
122
  licenses:
118
123
  - MIT
@@ -138,7 +143,9 @@ signing_key:
138
143
  specification_version: 4
139
144
  summary: Ruby FFI pocketsphinx bindings
140
145
  test_files:
146
+ - spec/assets/audio/goforward.raw
141
147
  - spec/configuration_spec.rb
142
148
  - spec/decoder_spec.rb
143
149
  - spec/microphone_spec.rb
144
150
  - spec/spec_helper.rb
151
+ - spec/speech_recognizer_spec.rb