pocketsphinx-ruby 0.0.2 → 0.0.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 440699d34e0585b3670bd4bfa91e6e9a87b2331f
4
- data.tar.gz: 8a697fa2d7e491e4eccfb47fe678a6acfcf695a7
3
+ metadata.gz: 3bf38b30cbc9fd5c2375d2c330238d1ad429e44c
4
+ data.tar.gz: 82aecda7d2c95378f15f47cc897b13419b83ce00
5
5
  SHA512:
6
- metadata.gz: 7d913ab82f397056b9b90bb5f7d4fb6609a618a367a361c3936afd4e850caf908614bb16a174cae26160241cc8424eb7abae4404595b1d27e6a90bc0e431f2cf
7
- data.tar.gz: ab6b8b36f3b9ef07f0086cca1e28b49b146116b15a3acf7e06f0fc80d50fc133a2653e05f437344c2f8532284c4e8fa8205151bca3de379f4a4f354048accdf5
6
+ metadata.gz: 2bb177461a17173815f299d3b17807014b80a22c3fae818569a4d29355e33dd506d3e1d4d195f8fa84ee09073014212fadd17a00e3f088a82d4278ccceea936a
7
+ data.tar.gz: 1ba9d9b74c999a05091e870b0fe2aee6c45df623b4d84a7364268eab1e4cf4e41dcf11b7206e90f5e3afd1d9ee292e503f8b4a6712b859e151e611944fe867a4
data/README.md CHANGED
@@ -3,6 +3,7 @@
3
3
  [![Build Status](http://img.shields.io/travis/watsonbox/pocketsphinx-ruby.svg?style=flat)](https://travis-ci.org/watsonbox/pocketsphinx-ruby)
4
4
  [![Code Climate](http://img.shields.io/codeclimate/github/watsonbox/pocketsphinx-ruby/badges/gpa.svg?style=flat)](https://codeclimate.com/github/watsonbox/pocketsphinx-ruby)
5
5
  [![Coverage Status](https://img.shields.io/coveralls/watsonbox/pocketsphinx-ruby.svg?style=flat)](https://coveralls.io/r/watsonbox/pocketsphinx-ruby)
6
+ [![Yard Docs](http://img.shields.io/badge/yard-docs-blue.svg?style=flat)](http://www.rubydoc.info/gems/pocketsphinx-ruby/frames)
6
7
 
7
8
  This gem provides Ruby [FFI](https://github.com/ffi/ffi) bindings for [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx), a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop. Pocketsphinx is part of the [CMU Sphinx](http://cmusphinx.sourceforge.net/) Open Source Toolkit For Speech Recognition.
8
9
 
@@ -50,7 +51,7 @@ Or install it yourself as:
50
51
  $ gem install pocketsphinx-ruby
51
52
 
52
53
 
53
- ## Basic Usage
54
+ ## Usage
54
55
 
55
56
  The `LiveSpeechRecognizer` is modeled on the same class in [Sphinx4](http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4). It uses the `Microphone` and `Decoder` classes internally to provide a simple, high-level recognition interface:
56
57
 
@@ -75,7 +76,7 @@ end
75
76
  These two classes split speech into utterances by detecting silence between them. By default this uses Pocketsphinx's internal Voice Activity Detection (VAD) which can be configured by adjusting the `vad_postspeech`, `vad_prespeech`, and `vad_threshold` configuration settings.
76
77
 
77
78
 
78
- ## Configuration
79
+ ### Configuration
79
80
 
80
81
  All of Pocketsphinx's decoding settings are managed by the `Configuration` class, which can be passed into the high-level speech recognizers:
81
82
 
@@ -98,7 +99,7 @@ Pocketsphinx::LiveSpeechRecognizer.new(configuration)
98
99
  You can find the output of `configuration.details` [here](https://github.com/watsonbox/pocketsphinx-ruby/wiki/Default-Pocketsphinx-Configuration) for more information on the various different settings.
99
100
 
100
101
 
101
- ## Microphone
102
+ ### Microphone
102
103
 
103
104
  The `Microphone` class uses Pocketsphinx's libsphinxad to record audio for speech recognition. For desktop applications this should normally be 16bit/16kHz raw PCM audio, so these are the default settings. The exact audio backend depends on [what was selected](https://github.com/cmusphinx/sphinxbase/blob/master/configure.in#L138) when libsphinxad was built. On OSX, OpenAL is [now supported](https://github.com/cmusphinx/sphinxbase/commit/5cc55c4721273681200e1f754ff0798ac073b950) and should work just fine.
104
105
 
@@ -124,7 +125,7 @@ end
124
125
  To open this audio file take a look at [this wiki page](https://github.com/watsonbox/pocketsphinx-ruby/wiki/Importing-raw-PCM-audio-with-Audacity).
125
126
 
126
127
 
127
- ## Decoder
128
+ ### Decoder
128
129
 
129
130
  The `Decoder` class uses Pocketsphinx's libpocketsphinx to decode audio data into text. For example to decode a single utterance:
130
131
 
@@ -136,6 +137,18 @@ puts decoder.hypothesis # => "go forward ten years"
136
137
  ```
137
138
 
138
139
 
140
+ ### Keyword Spotting
141
+
142
+ Keyword spotting is another feature that is not in the current stable (0.8) releases of Pocketsphinx, having been [merged into trunk](https://github.com/cmusphinx/pocketsphinx/commit/f562f9356cc7f1ade4941ebdde0c377642a023e3) early in 2014. It can be useful for detecting an activation keyword in a command and control application, while ignoring all other speech. Set up a recognizer as follows:
143
+
144
+ ```ruby
145
+ configuration = Configuration::KeywordSpotting.new('Okay computer')
146
+ recognizer = LiveSpeechRecognizer.new(configuration)
147
+ ```
148
+
149
+ The `KeywordSpotting` configuration accepts a second argument for adjusting the sensitivity of the keyword detection. Note that this is just a wrapper which sets the `keyphrase` and `kws_threshold` settings on the default configuration.
150
+
151
+
139
152
  ## Contributing
140
153
 
141
154
  1. Fork it ( https://github.com/[my-github-username]/pocketsphinx-ruby/fork )
data/examples/keyword_spotter.rb ADDED
@@ -0,0 +1,21 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "pocketsphinx-ruby"
5
+
6
+ include Pocketsphinx
7
+
8
+ configuration = Configuration::KeywordSpotting.new('hello computer')
9
+ recognizer = LiveSpeechRecognizer.new(configuration)
10
+
11
+ recognizer.recognize do |speech|
12
+ if configuration.keyword == 'hello computer'
13
+ configuration.keyword = 'goodbye computer'
14
+ else
15
+ configuration.keyword = 'hello computer'
16
+ end
17
+
18
+ recognizer.reconfigure
19
+
20
+ puts "You said '#{speech}'. Keyword is now '#{configuration.keyword}'"
21
+ end
data/lib/pocketsphinx.rb CHANGED
@@ -1,11 +1,18 @@
1
1
  require 'ffi'
2
2
 
3
3
  require "pocketsphinx/version"
4
+
5
+ # Pocketsphinx FFI API
4
6
  require "pocketsphinx/api/sphinxbase"
5
7
  require "pocketsphinx/api/sphinxad"
6
8
  require "pocketsphinx/api/pocketsphinx"
7
9
 
8
- require "pocketsphinx/configuration"
10
+ # Configuration
11
+ require 'pocketsphinx/configuration/setting_definition'
12
+ require "pocketsphinx/configuration/base"
13
+ require "pocketsphinx/configuration/default"
14
+ require "pocketsphinx/configuration/keyword_spotting"
15
+
9
16
  require "pocketsphinx/audio_file"
10
17
  require "pocketsphinx/microphone"
11
18
  require "pocketsphinx/decoder"
data/lib/pocketsphinx/api/pocketsphinx.rb CHANGED
@@ -8,6 +8,7 @@ module Pocketsphinx
8
8
  typedef :pointer, :configuration
9
9
 
10
10
  attach_function :ps_init, [:configuration], :decoder
11
+ attach_function :ps_reinit, [:decoder, :configuration], :int
11
12
  attach_function :ps_default_search_args, [:pointer], :void
12
13
  attach_function :ps_args, [], :pointer
13
14
  attach_function :ps_decode_raw, [:decoder, :pointer, :string, :long], :int
data/lib/pocketsphinx/configuration/base.rb ADDED
@@ -0,0 +1,95 @@
1
+ module Pocketsphinx
2
+ module Configuration
3
+ class Base
4
+ attr_reader :ps_config
5
+ attr_reader :setting_definitions
6
+
7
+ def initialize
8
+ @ps_arg_defs = API::Pocketsphinx.ps_args
9
+ @setting_definitions = SettingDefinition.from_arg_defs(@ps_arg_defs)
10
+
11
+ # Sets default settings based on definitions
12
+ @ps_config = API::Sphinxbase.cmd_ln_parse_r(nil, @ps_arg_defs, 0, nil, 1)
13
+ end
14
+
15
+ def setting_names
16
+ setting_definitions.keys.sort
17
+ end
18
+
19
+ # Get details for one or all configuration settings
20
+ #
21
+ # @param [String] name Name of setting to get details for. Gets details for all settings if nil.
22
+ def details(name = nil)
23
+ details = [name || setting_names].flatten.map do |name|
24
+ definition = find_definition(name)
25
+
26
+ {
27
+ name: name,
28
+ type: definition.type,
29
+ default: definition.default,
30
+ required: definition.required?,
31
+ value: self[name],
32
+ info: definition.doc
33
+ }
34
+ end
35
+
36
+ name ? details.first : details
37
+ end
38
+
39
+ # Get a configuration setting
40
+ def [](name)
41
+ case find_definition(name).type
42
+ when :integer
43
+ API::Sphinxbase.cmd_ln_int_r(ps_config, "-#{name}")
44
+ when :float
45
+ API::Sphinxbase.cmd_ln_float_r(ps_config, "-#{name}")
46
+ when :string
47
+ API::Sphinxbase.cmd_ln_str_r(ps_config, "-#{name}")
48
+ when :boolean
49
+ API::Sphinxbase.cmd_ln_int_r(ps_config, "-#{name}") != 0
50
+ when :string_list
51
+ raise NotImplementedException
52
+ end
53
+ end
54
+
55
+ # Set a configuration setting with type checking
56
+ def []=(name, value)
57
+ check_type(name, type = find_definition(name).type, value)
58
+
59
+ case type
60
+ when :integer
61
+ API::Sphinxbase.cmd_ln_set_int_r(ps_config, "-#{name}", value.to_i)
62
+ when :float
63
+ API::Sphinxbase.cmd_ln_set_float_r(ps_config, "-#{name}", value.to_f)
64
+ when :string
65
+ API::Sphinxbase.cmd_ln_set_str_r(ps_config, "-#{name}", (value.to_s if value))
66
+ when :boolean
67
+ API::Sphinxbase.cmd_ln_set_int_r(ps_config, "-#{name}", value ? 1 : 0)
68
+ when :string_list
69
+ raise NotImplementedException
70
+ end
71
+ end
72
+
73
+ private
74
+
75
+ def find_definition(name)
76
+ setting_definitions[name] or raise "Configuration setting '#{name}' does not exist"
77
+ end
78
+
79
+ def check_type(name, expected_type, value)
80
+ conversion_method = case expected_type
81
+ when :integer then :to_i
82
+ when :float then :to_f
83
+ end
84
+
85
+ if conversion_method && !value.respond_to?(conversion_method)
86
+ raise "Configuration setting '#{name}' must be of type #{expected_type.to_s.capitalize}"
87
+ end
88
+
89
+ if value.nil? && expected_type != :string
90
+ raise "Only string settings can be set to nil"
91
+ end
92
+ end
93
+ end
94
+ end
95
+ end
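Reviewer note: the new `check_type` guard in `Configuration::Base` can be tried out without libpocketsphinx. The sketch below copies its two guard clauses into a standalone method. `error_message_for` is a hypothetical helper added only for illustration, and the setting names are borrowed from the specs in this diff.

```ruby
# Standalone copy of the guard clauses in Configuration::Base#check_type.
# No Pocketsphinx bindings are loaded; this only mirrors the checks themselves.
def check_type(name, expected_type, value)
  conversion_method = case expected_type
                      when :integer then :to_i
                      when :float then :to_f
                      end

  if conversion_method && !value.respond_to?(conversion_method)
    raise "Configuration setting '#{name}' must be of type #{expected_type.to_s.capitalize}"
  end

  if value.nil? && expected_type != :string
    raise "Only string settings can be set to nil"
  end
end

# Hypothetical helper (not part of the gem): returns the error message,
# or nil when the value passes both guards.
def error_message_for(name, expected_type, value)
  check_type(name, expected_type, value)
  nil
rescue RuntimeError => e
  e.message
end

puts error_message_for('frate', :integer, 50).inspect
puts error_message_for('frate', :integer, true).inspect
puts error_message_for('frate', :integer, nil).inspect
puts error_message_for('warp_type', :string, nil).inspect
```

Because `nil` responds to `to_i` and `to_f`, it passes the first guard. The second guard is what rejects `nil` for every non-string setting.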
data/lib/pocketsphinx/configuration/default.rb ADDED
@@ -0,0 +1,17 @@
1
+ module Pocketsphinx
2
+ module Configuration
3
+ class Default < Base
4
+ def initialize
5
+ super
6
+
7
+ # Sets default grammar and language model if they are not set explicitly and
8
+ # are present in the default search path.
9
+ API::Pocketsphinx.ps_default_search_args(@ps_config)
10
+ end
11
+ end
12
+
13
+ def self.default
14
+ Default.new
15
+ end
16
+ end
17
+ end
data/lib/pocketsphinx/configuration/keyword_spotting.rb ADDED
@@ -0,0 +1,37 @@
1
+ module Pocketsphinx
2
+ module Configuration
3
+ class KeywordSpotting < Default
4
+ attr_reader :kws_threshold
5
+
6
+ def initialize(keyword, threshold = nil)
7
+ super()
8
+
9
+ self['lm'] = nil
10
+ self.keyword = keyword
11
+ self.kws_threshold = threshold if threshold
12
+ end
13
+
14
+ def keyword
15
+ self['keyphrase']
16
+ end
17
+
18
+ def keyword=(value)
19
+ self['keyphrase'] = sanitize_keyword value
20
+ end
21
+
22
+ def kws_threshold
23
+ self['kws_threshold']
24
+ end
25
+
26
+ def kws_threshold=(value)
27
+ self['kws_threshold'] = value
28
+ end
29
+
30
+ private
31
+
32
+ def sanitize_keyword(keyword)
33
+ keyword.downcase
34
+ end
35
+ end
36
+ end
37
+ end
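Reviewer note: `KeywordSpotting` is a thin wrapper that clears `lm` and writes `keyphrase` and `kws_threshold` through the ordinary `[]=` setter. The following is a rough sketch of that shape under stated assumptions: `SketchDefault` is a hypothetical hash-backed stand-in for `Configuration::Default`, and its initial values are placeholders, not Pocketsphinx defaults.

```ruby
# Hypothetical hash-backed stand-in for Configuration::Default, so the sketch
# runs without libpocketsphinx. The real class stores settings via Sphinxbase.
class SketchDefault
  def initialize
    # Placeholder values only; these are not the real Pocketsphinx defaults.
    @settings = { 'lm' => 'placeholder.lm', 'keyphrase' => nil, 'kws_threshold' => 1.0 }
  end

  def [](name)
    @settings.fetch(name)
  end

  def []=(name, value)
    @settings[name] = value
  end
end

# Mirrors the structure of Configuration::KeywordSpotting in this diff.
class SketchKeywordSpotting < SketchDefault
  def initialize(keyword, threshold = nil)
    super()

    self['lm'] = nil                      # keyword spotting uses no language model
    self.keyword = keyword
    self.kws_threshold = threshold if threshold
  end

  def keyword
    self['keyphrase']
  end

  def keyword=(value)
    self['keyphrase'] = value.downcase    # same sanitising step as the gem
  end

  def kws_threshold
    self['kws_threshold']
  end

  def kws_threshold=(value)
    self['kws_threshold'] = value
  end
end

config = SketchKeywordSpotting.new('Okay computer', 1e-20)
puts config.keyword
puts config['lm'].inspect
puts config.kws_threshold
```

Keeping the wrapper this thin means the keyword can be changed later through `keyword=` and picked up by a subsequent `reconfigure`, as the `keyword_spotter.rb` example in this release does.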
data/lib/pocketsphinx/configuration/setting_definition.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  module Pocketsphinx
2
- class Configuration
2
+ module Configuration
3
3
  class SettingDefinition < Struct.new(:name, :type_code, :deflt, :doc)
4
4
  TYPES = [:integer, :float, :string, :boolean, :string_list]
5
5
 
data/lib/pocketsphinx/decoder.rb CHANGED
@@ -1,13 +1,22 @@
1
1
  module Pocketsphinx
2
- class Decoder
2
+ class Decoder < Struct.new(:configuration)
3
3
  Error = Class.new(StandardError)
4
4
 
5
- attr_reader :ps_decoder
6
5
  attr_writer :ps_api
7
6
 
8
- def initialize(configuration)
9
- @configuration = configuration
10
- @ps_decoder = ps_api.ps_init(configuration.ps_config)
7
+ # Reinitialize the decoder with updated configuration.
8
+ #
9
+ # This function allows you to switch the acoustic model, dictionary, or other configuration
10
+ # without creating an entirely new decoding object.
11
+ #
12
+ # @param [Configuration] configuration An optional new configuration to use. If this is
13
+ # nil, the previous configuration will be reloaded, with any changes applied.
14
+ def reconfigure(configuration = nil)
15
+ self.configuration = configuration if configuration
16
+
17
+ ps_api.ps_reinit(ps_decoder, self.configuration.ps_config).tap do |result|
18
+ raise Error, "Decoder#reconfigure failed with error code #{result}" if result < 0
19
+ end
11
20
  end
12
21
 
13
22
  # Decode a raw audio stream as a single utterance, opening a file if path given
@@ -55,7 +64,7 @@ module Pocketsphinx
55
64
  # worth of data. This may allow the recognizer to produce more accurate results.
56
65
  # @return Number of frames of data searched
57
66
  def process_raw(buffer, size, no_search = false, full_utt = false)
58
- ps_api.ps_process_raw(@ps_decoder, buffer, size, no_search ? 1 : 0, full_utt ? 1 : 0).tap do |result|
67
+ ps_api.ps_process_raw(ps_decoder, buffer, size, no_search ? 1 : 0, full_utt ? 1 : 0).tap do |result|
59
68
  raise Error, "Decoder#process_raw failed with error code #{result}" if result < 0
60
69
  end
61
70
  end
@@ -68,21 +77,21 @@ module Pocketsphinx
68
77
  #
69
78
  # @param [String] name String uniquely identifying this utterance. If nil, one will be created.
70
79
  def start_utterance(name = nil)
71
- ps_api.ps_start_utt(@ps_decoder, name).tap do |result|
80
+ ps_api.ps_start_utt(ps_decoder, name).tap do |result|
72
81
  raise Error, "Decoder#start_utterance failed with error code #{result}" if result < 0
73
82
  end
74
83
  end
75
84
 
76
85
  # End utterance processing
77
86
  def end_utterance
78
- ps_api.ps_end_utt(@ps_decoder).tap do |result|
87
+ ps_api.ps_end_utt(ps_decoder).tap do |result|
79
88
  raise Error, "Decoder#end_utterance failed with error code #{result}" if result < 0
80
89
  end
81
90
  end
82
91
 
83
92
  # Checks if the last feed audio buffer contained speech
84
93
  def in_speech?
85
- ps_api.ps_get_in_speech(@ps_decoder) != 0
94
+ ps_api.ps_get_in_speech(ps_decoder) != 0
86
95
  end
87
96
 
88
97
  # Get hypothesis string and path score.
@@ -90,11 +99,15 @@ module Pocketsphinx
90
99
  # @return [String] Hypothesis string
91
100
  # @todo Expand to return path score and utterance ID
92
101
  def hypothesis
93
- ps_api.ps_get_hyp(@ps_decoder, nil, nil)
102
+ ps_api.ps_get_hyp(ps_decoder, nil, nil)
94
103
  end
95
104
 
96
105
  def ps_api
97
106
  @ps_api || API::Pocketsphinx
98
107
  end
108
+
109
+ def ps_decoder
110
+ @ps_decoder ||= ps_api.ps_init(configuration.ps_config)
111
+ end
99
112
  end
100
113
  end
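Reviewer note: two things change in `Decoder` here. The native decoder is now created lazily via `@ps_decoder ||= ps_api.ps_init(...)`, and `reconfigure` reuses that handle through `ps_reinit`. The following is a minimal sketch of that pattern, assuming a hypothetical `FakeApi` in place of `API::Pocketsphinx` and a hypothetical `FakeConfig` struct in place of a real configuration.

```ruby
# Hypothetical stand-in for API::Pocketsphinx: records calls instead of calling C.
class FakeApi
  attr_reader :init_calls, :reinit_calls

  def initialize
    @init_calls = 0
    @reinit_calls = []
  end

  def ps_init(ps_config)
    @init_calls += 1
    :"decoder_for_#{ps_config}"
  end

  def ps_reinit(ps_decoder, ps_config)
    @reinit_calls << [ps_decoder, ps_config]
    0 # a negative value would signal failure
  end
end

# Hypothetical configuration object exposing only #ps_config.
FakeConfig = Struct.new(:ps_config)

# Mirrors the structure of the Decoder changes in this diff.
class SketchDecoder < Struct.new(:configuration)
  Error = Class.new(StandardError)

  attr_writer :ps_api

  def reconfigure(configuration = nil)
    self.configuration = configuration if configuration

    ps_api.ps_reinit(ps_decoder, self.configuration.ps_config).tap do |result|
      raise Error, "Decoder#reconfigure failed with error code #{result}" if result < 0
    end
  end

  def ps_api
    @ps_api ||= FakeApi.new
  end

  # Created on first use, then memoised, so constructing a decoder is cheap.
  def ps_decoder
    @ps_decoder ||= ps_api.ps_init(configuration.ps_config)
  end
end

api = FakeApi.new
decoder = SketchDecoder.new(FakeConfig.new(:a))
decoder.ps_api = api

puts api.init_calls                    # nothing initialised yet
decoder.reconfigure(FakeConfig.new(:b))
puts api.init_calls                    # initialised once, on first access
puts api.reinit_calls.inspect
```

One consequence worth noting: the native handle is created from whichever configuration is current at first access, so a test double assigned through `ps_api=` before that point intercepts `ps_init` as well. The rewritten `decoder_spec.rb` relies on this.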
data/lib/pocketsphinx/microphone.rb CHANGED
@@ -65,7 +65,7 @@ module Pocketsphinx
65
65
  #
66
66
  # @param [Fixnum] max_samples The maximum samples we tried to read from the audio device
67
67
  def read_audio_delay(max_samples = 4096)
68
- max_samples / (2 * sample_rate)
68
+ max_samples.to_f / (2 * sample_rate)
69
69
  end
70
70
 
71
71
  def close_device
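Reviewer note: the one-line `read_audio_delay` change fixes an integer-division bug. The sketch below reproduces both versions as standalone methods. Passing `sample_rate` as a parameter is an assumption made for illustration; in the gem it is read from the microphone itself.

```ruby
# Standalone versions of Microphone#read_audio_delay before and after this release.
# sample_rate is passed explicitly here so the sketch needs no audio device.
def read_audio_delay_old(max_samples = 4096, sample_rate = 16_000)
  max_samples / (2 * sample_rate)       # Integer / Integer truncates
end

def read_audio_delay_new(max_samples = 4096, sample_rate = 16_000)
  max_samples.to_f / (2 * sample_rate)  # Float division keeps the fraction
end

puts read_audio_delay_old  # 0
puts read_audio_delay_new  # 0.128
```

With the old version the computed delay was always zero for any buffer smaller than two seconds of audio. The new microphone spec pins the 0.128 second value.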
data/lib/pocketsphinx/speech_recognizer.rb CHANGED
@@ -6,6 +6,7 @@ module Pocketsphinx
6
6
  # Recordable interface must implement #record and #read_audio
7
7
  attr_writer :recordable
8
8
  attr_writer :decoder
9
+ attr_writer :configuration
9
10
 
10
11
  def initialize(configuration = nil)
11
12
  @configuration = configuration
@@ -23,6 +24,19 @@ module Pocketsphinx
23
24
  @configuration ||= Configuration.default
24
25
  end
25
26
 
27
+ # Reinitialize the decoder with updated configuration.
28
+ #
29
+ # See Decoder#reconfigure
30
+ #
31
+ # @param [Configuration] configuration An optional new configuration to use. If this is
32
+ # nil, the previous configuration will be reloaded, with any changes applied.
33
+ def reconfigure(configuration = nil)
34
+ self.configuration = configuration if configuration
35
+
36
+ decoder.reconfigure(configuration)
37
+ decoder.start_utterance if recognizing?
38
+ end
39
+
26
40
  # Recognize utterances and yield hypotheses in infinite loop
27
41
  #
28
42
  # Splits speech into utterances by detecting silence between them.
@@ -32,6 +46,7 @@ module Pocketsphinx
32
46
  # @param [Fixnum] max_samples Number of samples to process at a time
33
47
  def recognize(max_samples = 4096)
34
48
  decoder.start_utterance
49
+ @recognizing = true
35
50
 
36
51
  recordable.record do
37
52
  FFI::MemoryPointer.new(:int16, max_samples) do |buffer|
@@ -41,13 +56,16 @@ module Pocketsphinx
41
56
  process_audio(buffer, max_samples) or break
42
57
  end
43
58
 
44
- yield get_hypothesis
59
+ hypothesis = get_hypothesis
60
+ yield hypothesis if hypothesis
45
61
  else
46
62
  process_audio(buffer, max_samples) or break
47
63
  end
48
64
  end
49
65
  end
50
66
  end
67
+ ensure
68
+ @recognizing = false
51
69
  end
52
70
 
53
71
  def in_speech?
@@ -55,6 +73,10 @@ module Pocketsphinx
55
73
  decoder.in_speech?
56
74
  end
57
75
 
76
+ def recognizing?
77
+ @recognizing == true
78
+ end
79
+
58
80
  private
59
81
 
60
82
  def process_audio(buffer, max_samples)
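Reviewer note: `recognize` now tracks a `@recognizing` flag and clears it in an `ensure` clause, and `reconfigure` restarts the utterance only while that flag is set. The following is a simplified sketch of the interaction. `SketchRecognizer` is hypothetical, it counts utterance starts instead of talking to a decoder, and it yields itself rather than a hypothesis.

```ruby
# Hypothetical, decoder-free model of the recognizing flag in SpeechRecognizer.
class SketchRecognizer
  attr_reader :utterances_started

  def initialize
    @utterances_started = 0
  end

  def recognize
    start_utterance
    @recognizing = true
    yield self
  ensure
    # Runs on normal return, on break, and when the block raises.
    @recognizing = false
  end

  def reconfigure
    # The real method calls decoder.reconfigure before this line.
    start_utterance if recognizing?
  end

  def recognizing?
    @recognizing == true
  end

  private

  def start_utterance
    @utterances_started += 1
  end
end

recognizer = SketchRecognizer.new
recognizer.reconfigure                       # outside recognize: no restart
puts recognizer.utterances_started
recognizer.recognize { |r| r.reconfigure }   # inside recognize: utterance restarted
puts recognizer.utterances_started
puts recognizer.recognizing?
```

Comparing with `== true` means an unset flag (`nil`) reads as `false`, so `recognizing?` is safe to call before `recognize` has ever run.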
data/lib/pocketsphinx/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Pocketsphinx
2
- VERSION = "0.0.2"
2
+ VERSION = "0.0.3"
3
3
  end
data/spec/configuration_spec.rb CHANGED
@@ -4,7 +4,7 @@ describe Configuration do
4
4
  subject { Pocketsphinx::Configuration.default }
5
5
 
6
6
  it "provides a default pocketsphinx configuration" do
7
- expect(subject).to be_a(Pocketsphinx::Configuration)
7
+ expect(subject).to be_a(Pocketsphinx::Configuration::Default)
8
8
  end
9
9
 
10
10
  it "supports integer settings" do
@@ -13,6 +13,8 @@ describe Configuration do
13
13
 
14
14
  subject['frate'] = 50
15
15
  expect(subject['frate']).to eq(50)
16
+
17
+ expect { subject['frate'] = nil }.to raise_exception "Only string settings can be set to nil"
16
18
  end
17
19
 
18
20
  it "supports float settings" do
@@ -21,24 +23,31 @@ describe Configuration do
21
23
 
22
24
  subject['samprate'] = 8000
23
25
  expect(subject['samprate']).to eq(8000)
26
+
27
+ expect { subject['samprate'] = nil }.to raise_exception "Only string settings can be set to nil"
24
28
  end
25
29
 
26
- it "supports getting strings" do
30
+ it "supports string settings" do
27
31
  expect(subject['warp_type']).to eq('inverse_linear')
28
32
 
29
33
  subject['warp_type'] = 'different_type'
30
34
  expect(subject['warp_type']).to eq('different_type')
35
+
36
+ subject['warp_type'] = nil
37
+ expect(subject['warp_type']).to eq(nil)
31
38
  end
32
39
 
33
- it "supports getting booleans" do
40
+ it "supports boolean settings" do
34
41
  expect(subject['smoothspec']).to eq(false)
35
42
 
36
43
  subject['smoothspec'] = true
37
44
  expect(subject['smoothspec']).to eq(true)
45
+
46
+ expect { subject['smoothspec'] = nil }.to raise_exception "Only string settings can be set to nil"
38
47
  end
39
48
 
40
49
  it 'raises exceptions when setting with incorrectly typed values' do
41
- expect { subject['frate'] = true }.to raise_exception "Configuration setting 'frate' must be a Fixnum"
50
+ expect { subject['frate'] = true }.to raise_exception "Configuration setting 'frate' must be of type Integer"
42
51
  end
43
52
 
44
53
  it 'raises exceptions when a setting is unknown' do
@@ -77,4 +86,30 @@ describe Configuration do
77
86
  })
78
87
  end
79
88
  end
89
+
90
+ context 'keyword spotting configuration' do
91
+ subject { Configuration::KeywordSpotting.new('Okay computer') }
92
+
93
+ it 'sets the lowercase keyphrase' do
94
+ expect(subject['keyphrase']).to eq('okay computer')
95
+ end
96
+
97
+ it 'uses no language model' do
98
+ expect(subject['lm']).to be_nil
99
+ end
100
+
101
+ it 'exposes the keyphrase setting as #keyword' do
102
+ subject.keyword = 'Hello computer'
103
+
104
+ expect(subject.keyword).to eq('hello computer')
105
+ expect(subject['keyphrase']).to eq('hello computer')
106
+ end
107
+
108
+ it 'exposes the kws_threshold setting as #kws_threshold' do
109
+ subject.kws_threshold = 24
110
+
111
+ expect(subject.kws_threshold).to eq(24)
112
+ expect(subject['kws_threshold']).to eq(24)
113
+ end
114
+ end
80
115
  end
data/spec/decoder_spec.rb CHANGED
@@ -1,27 +1,47 @@
1
1
  require 'spec_helper'
2
2
 
3
3
  describe Decoder do
4
- subject { @decoder }
5
- let(:ps_api) { @decoder.ps_api = double }
6
-
7
- # Share decoder across all examples for speed
8
- before :all do
9
- @decoder = Decoder.new(Configuration.default)
4
+ subject { Decoder.new(configuration) }
5
+ let(:ps_api) { subject.ps_api }
6
+ let(:ps_decoder) { double }
7
+ let(:configuration) { Configuration.default }
8
+
9
+ before do
10
+ subject.ps_api = double
11
+ allow(ps_api).to receive(:ps_init).and_return(ps_decoder)
10
12
  end
11
13
 
12
- # Full integration test
13
- describe '#decode' do
14
- it 'correctly decodes the speech in goforward.raw' do
15
- subject.decode File.open('spec/assets/audio/goforward.raw', 'rb')
14
+ describe '#reconfigure' do
15
+ it 'calls libpocketsphinx' do
16
+ expect(ps_api)
17
+ .to receive(:ps_reinit)
18
+ .with(subject.ps_decoder, configuration.ps_config)
19
+ .and_return(0)
16
20
 
17
- # With the default configuration (no specific grammar), pocketsphinx doesn't actually
18
- # get this quite right, but nonetheless this is the expected output
19
- expect(subject.hypothesis).to eq("go forward ten years")
21
+ subject.reconfigure
20
22
  end
21
23
 
22
- it 'accepts a file path as well as a stream' do
23
- subject.decode 'spec/assets/audio/goforward.raw'
24
- expect(subject.hypothesis).to eq("go forward ten years")
24
+ it 'sets a new configuration if one is passed' do
25
+ new_config = Struct.new(:ps_config).new(:ps_config)
26
+
27
+ expect(ps_api)
28
+ .to receive(:ps_reinit)
29
+ .with(subject.ps_decoder, new_config.ps_config)
30
+ .and_return(0)
31
+
32
+ subject.reconfigure(new_config)
33
+
34
+ expect(subject.configuration).to be(new_config)
35
+ end
36
+
37
+ it 'raises an exception on error' do
38
+ expect(ps_api)
39
+ .to receive(:ps_reinit)
40
+ .with(subject.ps_decoder, configuration.ps_config)
41
+ .and_return(-1)
42
+
43
+ expect { subject.reconfigure }
44
+ .to raise_exception "Decoder#reconfigure failed with error code -1"
25
45
  end
26
46
  end
27
47
 
data/spec/integration/decoder_spec.rb ADDED
@@ -0,0 +1,28 @@
1
+ require 'spec_helper'
2
+
3
+ describe Decoder do
4
+ subject { @decoder }
5
+ let(:configuration) { @configuration }
6
+
7
+ # Share decoder across all examples for speed
8
+ before :all do
9
+ @configuration = Configuration.default
10
+ @decoder = Decoder.new(@configuration)
11
+ end
12
+
13
+ describe '#decode' do
14
+ it 'correctly decodes the speech in goforward.raw' do
15
+ @decoder.ps_api = nil
16
+ subject.decode File.open('spec/assets/audio/goforward.raw', 'rb')
17
+
18
+ # With the default configuration (no specific grammar), pocketsphinx doesn't actually
19
+ # get this quite right, but nonetheless this is the expected output
20
+ expect(subject.hypothesis).to eq("go forward ten years")
21
+ end
22
+
23
+ it 'accepts a file path as well as a stream' do
24
+ subject.decode 'spec/assets/audio/goforward.raw'
25
+ expect(subject.hypothesis).to eq("go forward ten years")
26
+ end
27
+ end
28
+ end
data/spec/integration/speech_recognizer_spec.rb ADDED
@@ -0,0 +1,23 @@
1
+ require 'spec_helper'
2
+
3
+ describe SpeechRecognizer do
4
+ let(:recordable) { AudioFile.new('spec/assets/audio/goforward.raw') }
5
+
6
+ subject do
7
+ SpeechRecognizer.new.tap do |speech_recognizer|
8
+ speech_recognizer.recordable = recordable
9
+ speech_recognizer.decoder = @decoder
10
+ end
11
+ end
12
+
13
+ # Share decoder across all examples for speed
14
+ before :all do
15
+ @decoder = Decoder.new(Configuration.default)
16
+ end
17
+
18
+ describe '#recognize' do
19
+ it 'should decode speech in raw audio' do
20
+ expect { |b| subject.recognize(4096, &b) }.to yield_with_args("go forward ten years")
21
+ end
22
+ end
23
+ end
data/spec/microphone_spec.rb CHANGED
@@ -79,6 +79,12 @@ describe Microphone do
79
79
  end
80
80
  end
81
81
 
82
+ describe '#read_audio_delay' do
83
+ it 'should be 0.128 seconds for a max_samples of 4096 and sample rate of 16kHz' do
84
+ expect(subject.read_audio_delay(4096)).to eq(0.128)
85
+ end
86
+ end
87
+
82
88
  describe '#close_device' do
83
89
  it 'calls libsphinxad' do
84
90
  expect(ps_api)
data/spec/speech_recognizer_spec.rb CHANGED
@@ -1,23 +1,40 @@
1
1
  require 'spec_helper'
2
2
 
3
3
  describe SpeechRecognizer do
4
- let(:recordable) { AudioFile.new('spec/assets/audio/goforward.raw') }
4
+ let(:configuration) { double }
5
+ let(:recordable) { double }
6
+ let(:decoder) { double }
7
+ subject { SpeechRecognizer.new(configuration) }
5
8
 
6
- subject do
7
- SpeechRecognizer.new.tap do |speech_recognizer|
8
- speech_recognizer.recordable = recordable
9
- speech_recognizer.decoder = @decoder
10
- end
9
+ before do
10
+ subject.decoder = decoder
11
+ subject.recordable = recordable
11
12
  end
12
13
 
13
- # Share decoder across all examples for speed
14
- before :all do
15
- @decoder = Decoder.new(Configuration.default)
16
- end
14
+ describe '#reconfigure' do
15
+ before do
16
+ allow(decoder).to receive(:reconfigure)
17
+ allow(decoder).to receive(:start_utterance)
18
+ end
19
+
20
+ it 'saves the configuration if one is given' do
21
+ subject.reconfigure(:new_configuration)
22
+ expect(subject.configuration).to eq(:new_configuration)
23
+ end
24
+
25
+ it 'reconfigures the decoder' do
26
+ expect(decoder).to receive(:reconfigure).with(nil).ordered
27
+ expect(decoder).to receive(:reconfigure).with(:new_configuration).ordered
28
+
29
+ subject.reconfigure
30
+ subject.reconfigure(:new_configuration)
31
+ end
32
+
33
+ it 'restarts an utterance if recognition was interrupted' do
34
+ expect(subject).to receive(:recognizing?).and_return(true)
35
+ expect(decoder).to receive(:start_utterance)
17
36
 
18
- describe '#recognize' do
19
- it 'should decode speech in raw audio' do
20
- expect { |b| subject.recognize(4096, &b) }.to yield_with_args("go forward ten years")
37
+ subject.reconfigure
21
38
  end
22
39
  end
23
40
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pocketsphinx-ruby
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.2
4
+ version: 0.0.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - Howard Wilson
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2014-10-20 00:00:00.000000000 Z
11
+ date: 2014-10-21 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: ffi
@@ -95,6 +95,7 @@ files:
95
95
  - README.md
96
96
  - Rakefile
97
97
  - examples/decode_audio_file.rb
98
+ - examples/keyword_spotter.rb
98
99
  - examples/pocketsphinx_continuous.rb
99
100
  - examples/record_audio_file.rb
100
101
  - lib/pocketsphinx-ruby.rb
@@ -104,7 +105,9 @@ files:
104
105
  - lib/pocketsphinx/api/sphinxbase.rb
105
106
  - lib/pocketsphinx/audio_file.rb
106
107
  - lib/pocketsphinx/audio_file_speech_recognizer.rb
107
- - lib/pocketsphinx/configuration.rb
108
+ - lib/pocketsphinx/configuration/base.rb
109
+ - lib/pocketsphinx/configuration/default.rb
110
+ - lib/pocketsphinx/configuration/keyword_spotting.rb
108
111
  - lib/pocketsphinx/configuration/setting_definition.rb
109
112
  - lib/pocketsphinx/decoder.rb
110
113
  - lib/pocketsphinx/live_speech_recognizer.rb
@@ -115,6 +118,8 @@ files:
115
118
  - spec/assets/audio/goforward.raw
116
119
  - spec/configuration_spec.rb
117
120
  - spec/decoder_spec.rb
121
+ - spec/integration/decoder_spec.rb
122
+ - spec/integration/speech_recognizer_spec.rb
118
123
  - spec/microphone_spec.rb
119
124
  - spec/spec_helper.rb
120
125
  - spec/speech_recognizer_spec.rb
@@ -146,6 +151,8 @@ test_files:
146
151
  - spec/assets/audio/goforward.raw
147
152
  - spec/configuration_spec.rb
148
153
  - spec/decoder_spec.rb
154
+ - spec/integration/decoder_spec.rb
155
+ - spec/integration/speech_recognizer_spec.rb
149
156
  - spec/microphone_spec.rb
150
157
  - spec/spec_helper.rb
151
158
  - spec/speech_recognizer_spec.rb
data/lib/pocketsphinx/configuration.rb DELETED
@@ -1,90 +0,0 @@
1
- require 'pocketsphinx/configuration/setting_definition'
2
-
3
- module Pocketsphinx
4
- class Configuration
5
- attr_reader :ps_config
6
- attr_reader :setting_definitions
7
-
8
- private_class_method :new
9
-
10
- def initialize(ps_arg_defs)
11
- @ps_arg_defs = ps_arg_defs
12
- @setting_definitions = SettingDefinition.from_arg_defs(ps_arg_defs)
13
-
14
- # Sets default settings based on definitions
15
- @ps_config = API::Sphinxbase.cmd_ln_parse_r(nil, ps_arg_defs, 0, nil, 1)
16
-
17
- # Sets default grammar and language model if they are not set explicitly and
18
- # are present in the default search path.
19
- API::Pocketsphinx.ps_default_search_args(@ps_config)
20
- end
21
-
22
- def self.default
23
- new(API::Pocketsphinx.ps_args)
24
- end
25
-
26
- def setting_names
27
- setting_definitions.keys.sort
28
- end
29
-
30
- # Get details for one or all configuration settings
31
- #
32
- # @param [String] name Name of setting to get details for. Gets details for all settings if nil.
33
- def details(name = nil)
34
- details = [name || setting_names].flatten.map do |name|
35
- definition = find_definition(name)
36
-
37
- {
38
- name: name,
39
- type: definition.type,
40
- default: definition.default,
41
- required: definition.required?,
42
- value: self[name],
43
- info: definition.doc
44
- }
45
- end
46
-
47
- name ? details.first : details
48
- end
49
-
50
- # Get a configuration setting
51
- def [](name)
52
- case find_definition(name).type
53
- when :integer
54
- API::Sphinxbase.cmd_ln_int_r(@ps_config, "-#{name}")
55
- when :float
56
- API::Sphinxbase.cmd_ln_float_r(@ps_config, "-#{name}")
57
- when :string
58
- API::Sphinxbase.cmd_ln_str_r(@ps_config, "-#{name}")
59
- when :boolean
60
- API::Sphinxbase.cmd_ln_int_r(@ps_config, "-#{name}") != 0
61
- when :string_list
62
- raise NotImplementedException
63
- end
64
- end
65
-
66
- # Set a configuration setting with type checking
67
- def []=(name, value)
68
- case find_definition(name).type
69
- when :integer
70
- raise "Configuration setting '#{name}' must be a Fixnum" unless value.respond_to?(:to_i)
71
- API::Sphinxbase.cmd_ln_set_int_r(@ps_config, "-#{name}", value.to_i)
72
- when :float
73
- raise "Configuration setting '#{name}' must be a Float" unless value.respond_to?(:to_i)
74
- API::Sphinxbase.cmd_ln_set_float_r(@ps_config, "-#{name}", value.to_f)
75
- when :string
76
- API::Sphinxbase.cmd_ln_set_str_r(@ps_config, "-#{name}", value.to_s)
77
- when :boolean
78
- API::Sphinxbase.cmd_ln_set_int_r(@ps_config, "-#{name}", value ? 1 : 0)
79
- when :string_list
80
- raise NotImplementedException
81
- end
82
- end
83
-
84
- private
85
-
86
- def find_definition(name)
87
- setting_definitions[name] or raise "Configuration setting '#{name}' does not exist"
88
- end
89
- end
90
- end