ruby_speech 1.0.2 → 1.1.0
- data/CHANGELOG.md +3 -0
- data/README.md +115 -3
- data/lib/ruby_speech.rb +14 -0
- data/lib/ruby_speech/nlsml.rb +19 -0
- data/lib/ruby_speech/nlsml/builder.rb +47 -0
- data/lib/ruby_speech/nlsml/document.rb +116 -0
- data/lib/ruby_speech/version.rb +1 -1
- data/spec/ruby_speech/nlsml_spec.rb +395 -0
- data/spec/ruby_speech_spec.rb +134 -0
- metadata +11 -4
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,8 @@
 # [develop](https://github.com/benlangfeld/ruby_speech)
 
+# [1.1.0](https://github.com/benlangfeld/ruby_speech/compare/v1.0.2...v1.1.0) - [2013-03-02](https://rubygems.org/gems/ruby_speech/versions/1.1.0)
+* Feature: NLSML building & parsing
+
 # [1.0.2](https://github.com/benlangfeld/ruby_speech/compare/v1.0.1...v1.0.2) - [2012-12-26](https://rubygems.org/gems/ruby_speech/versions/1.0.2)
 * Bugfix: Get test suite passing on JRuby
 
data/README.md
CHANGED
@@ -1,10 +1,12 @@
 # RubySpeech
-RubySpeech is a library for constructing and parsing Text to Speech (TTS) and Automatic Speech Recognition (ASR) documents such as [SSML](http://www.w3.org/TR/speech-synthesis)
+RubySpeech is a library for constructing and parsing Text to Speech (TTS) and Automatic Speech Recognition (ASR) documents such as [SSML](http://www.w3.org/TR/speech-synthesis), [GRXML](http://www.w3.org/TR/speech-grammar/) and [NLSML](http://www.w3.org/TR/nl-spec/). Such documents can be constructed to be processed by TTS and ASR engines, parsed as the result from such, or used in the implementation of such engines.
 
 ## Installation
 gem install ruby_speech
 
 ## Library
+
+### SSML
 RubySpeech provides a DSL for constructing SSML documents like so:
 
 ```ruby
@@ -45,6 +47,7 @@ Once your `Speak` is fully prepared and you're ready to send it off for processi
 
 You may also then need to call `to_s`.
 
+### GRXML
 
 Construct a GRXML (SRGS) document like this:
 
@@ -103,7 +106,7 @@ which becomes
 </grammar>
 ```
 
-
+#### Grammar matching
 
 It is possible to match some arbitrary input against a GRXML grammar. In order to do so, certain normalization routines should first be run on the grammar in order to prepare it for matching. These are reference inlining, tokenization and whitespace normalization, and are described [in the SRGS spec](http://www.w3.org/TR/speech-grammar/#S2.1). This process will transform the above grammar like so:
 
@@ -198,6 +201,111 @@ Matching against some sample input strings then returns the following results:
 => #<RubySpeech::GRXML::NoMatch:0x00000101371660>
 ```
 
+### NLSML
+
+[Natural Language Semantics Markup Language](http://www.w3.org/TR/nl-spec/) is the format used by many Speech Recognition engines and natural language processors to add semantic information to human language. RubySpeech is capable of generating and parsing such documents.
+
+It is possible to generate an NLSML document like so:
+
+```ruby
+require 'ruby_speech'
+
+nlsml = RubySpeech::NLSML.draw(grammar: 'http://flight', 'xmlns:myApp' => 'foo') do
+  interpretation confidence: 0.6 do
+    input "I want to go to Pittsburgh", mode: :speech
+
+    model do
+      group name: 'airline' do
+        string name: 'to_city'
+      end
+    end
+
+    instance do
+      self['myApp'].airline do
+        to_city 'Pittsburgh'
+      end
+    end
+  end
+
+  interpretation confidence: 0.4 do
+    input "I want to go to Stockholm"
+
+    model do
+      group name: 'airline' do
+        string name: 'to_city'
+      end
+    end
+
+    instance do
+      self['myApp'].airline do
+        to_city "Stockholm"
+      end
+    end
+  end
+end
+
+nlsml.to_s
+```
+
+becomes:
+
+```xml
+<?xml version="1.0"?>
+<result xmlns:myApp="foo" xmlns:xf="http://www.w3.org/2000/xforms" grammar="http://flight">
+  <interpretation confidence="60">
+    <input mode="speech">I want to go to Pittsburgh</input>
+    <xf:model>
+      <xf:group name="airline">
+        <xf:string name="to_city"/>
+      </xf:group>
+    </xf:model>
+    <xf:instance>
+      <myApp:airline>
+        <myApp:to_city>Pittsburgh</myApp:to_city>
+      </myApp:airline>
+    </xf:instance>
+  </interpretation>
+  <interpretation confidence="40">
+    <input>I want to go to Stockholm</input>
+    <xf:model>
+      <xf:group name="airline">
+        <xf:string name="to_city"/>
+      </xf:group>
+    </xf:model>
+    <xf:instance>
+      <myApp:airline>
+        <myApp:to_city>Stockholm</myApp:to_city>
+      </myApp:airline>
+    </xf:instance>
+  </interpretation>
+</result>
+```
+
+It's also possible to parse an NLSML document and extract useful information from it. Taking the above example, one may do:
+
+```ruby
+document = RubySpeech.parse nlsml.to_s
+
+document.match? # => true
+document.interpretations # => [
+  {
+    confidence: 0.6,
+    input: { mode: :speech, content: 'I want to go to Pittsburgh' },
+    instance: { airline: { to_city: 'Pittsburgh' } }
+  },
+  {
+    confidence: 0.4,
+    input: { content: 'I want to go to Stockholm' },
+    instance: { airline: { to_city: 'Stockholm' } }
+  }
+]
+document.best_interpretation # => {
+  confidence: 0.6,
+  input: { mode: :speech, content: 'I want to go to Pittsburgh' },
+  instance: { airline: { to_city: 'Pittsburgh' } }
+}
+```
+
 Check out the [YARD documentation](http://rdoc.info/github/benlangfeld/ruby_speech/master/frames) for more
 
 ## Features:
@@ -226,6 +334,10 @@ Check out the [YARD documentation](http://rdoc.info/github/benlangfeld/ruby_spee
 * `<tag/>`
 * `<token/>`
 
+### NLSML
+* Document construction
+* Simple data extraction from documents
+
 ## TODO:
 ### SSML
 * `<lexicon/>`
@@ -236,11 +348,11 @@ Check out the [YARD documentation](http://rdoc.info/github/benlangfeld/ruby_spee
 * `<example/>`
 * `<lexicon/>`
 
-
 ## Links:
 * [Source](https://github.com/benlangfeld/ruby_speech)
 * [Documentation](http://rdoc.info/gems/ruby_speech/frames)
 * [Bug Tracker](https://github.com/benlangfeld/ruby_speech/issues)
+* [CI](https://travis-ci.org/#!/benlangfeld/ruby_speech)
 
 ## Note on Patches/Pull Requests
 
data/lib/ruby_speech.rb
CHANGED
@@ -15,8 +15,22 @@ module RubySpeech
     autoload :GenericElement
     autoload :SSML
     autoload :GRXML
+    autoload :NLSML
     autoload :XML
   end
+
+  def self.parse(string)
+    document = Nokogiri::XML.parse string, nil, nil, Nokogiri::XML::ParseOptions::NOBLANKS
+    namespace = document.root.namespace
+    case namespace && namespace.href
+    when SSML::SSML_NAMESPACE
+      SSML::Element.import string
+    when GRXML::GRXML_NAMESPACE
+      GRXML::Element.import string
+    when NLSML::NLSML_NAMESPACE, nil
+      NLSML::Document.new document
+    end
+  end
 end
 
 ActiveSupport::Autoload.eager_autoload!
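Note on `RubySpeech.parse`: the method dispatches on the namespace URI of the document's root element, and a root with no namespace falls through to NLSML. The following is a minimal sketch of that dispatch in plain Ruby, without Nokogiri. The `kind_for_namespace` helper and the symbols it returns are hypothetical stand-ins for the `SSML::Element.import`, `GRXML::Element.import` and `NLSML::Document.new` calls; the SSML and GRXML URIs are assumed from the example documents in the specs of this release.

```ruby
# Namespace URIs as they appear in this diff and in its spec fixtures.
SSML_NAMESPACE  = 'http://www.w3.org/2001/10/synthesis'
GRXML_NAMESPACE = 'http://www.w3.org/2001/06/grammar'
NLSML_NAMESPACE = 'http://www.w3c.org/2000/11/nlsml'

# Hypothetical helper: maps a root namespace URI (or nil) to a document kind.
def kind_for_namespace(href)
  case href
  when SSML_NAMESPACE then :ssml
  when GRXML_NAMESPACE then :grxml
  when NLSML_NAMESPACE, nil then :nlsml # un-namespaced roots are treated as NLSML
  end
end

kind_for_namespace(GRXML_NAMESPACE) # => :grxml
kind_for_namespace(nil)             # => :nlsml
kind_for_namespace('urn:unknown')   # => nil
```

As in the diff, the `case` has no `else` branch, so a root element in an unrecognised namespace yields `nil` rather than raising.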
data/lib/ruby_speech/nlsml.rb
ADDED
@@ -0,0 +1,19 @@
+module RubySpeech
+  module NLSML
+    extend ActiveSupport::Autoload
+
+    NLSML_NAMESPACE = 'http://www.w3c.org/2000/11/nlsml'
+    XFORMS_NAMESPACE = 'http://www.w3.org/2000/xforms'
+
+    eager_autoload do
+      autoload :Builder
+      autoload :Document
+    end
+
+    def self.draw(options = {}, &block)
+      Builder.new(options, &block).document
+    end
+  end
+end
+
+ActiveSupport::Autoload.eager_autoload!
data/lib/ruby_speech/nlsml/builder.rb
ADDED
@@ -0,0 +1,47 @@
+module RubySpeech
+  module NLSML
+    class Builder
+      attr_reader :document
+
+      def initialize(options = {}, &block)
+        options = {'xmlns' => NLSML_NAMESPACE, 'xmlns:xf' => XFORMS_NAMESPACE}.merge(options)
+        @document = Nokogiri::XML::Builder.new do |builder|
+          builder.result options do |r|
+            apply_block r, &block
+          end
+        end.doc
+      end
+
+      def interpretation(*args, &block)
+        if args.last.respond_to?(:has_key?) && args.last.has_key?(:confidence)
+          args.last[:confidence] = (args.last[:confidence] * 100).to_i
+        end
+        @result.send :interpretation, *args, &block
+      end
+
+      def model(*args, &block)
+        xf_namespaced_element :model, *args, &block
+      end
+
+      def instance(*args, &block)
+        xf_namespaced_element :instance, *args, &block
+      end
+
+      def method_missing(method_name, *args, &block)
+        @result.send method_name, *args, &block
+      end
+
+      private
+
+      def apply_block(result, &block)
+        @result = result
+        instance_eval &block
+      end
+
+      def xf_namespaced_element(element_name, *args, &block)
+        namespace = @result.send :[], 'xf'
+        namespace.send element_name, &block
+      end
+    end
+  end
+end
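Note on `Builder#interpretation`: a confidence in the range 0..1 is written out as an integer percentage via `(confidence * 100).to_i`, and `Document` divides by 100 again when parsing. Because `to_i` truncates, some floats appear to lose a point on that round trip. The sketch below shows the behaviour in plain Ruby. `to_percent_truncating` and `to_percent_rounding` are hypothetical names introduced here, and the rounding variant is only an illustration, not part of this release.

```ruby
# Mirrors the conversion used in Builder#interpretation: truncates toward zero.
def to_percent_truncating(confidence)
  (confidence * 100).to_i
end

# Hypothetical alternative that rounds to the nearest integer instead.
def to_percent_rounding(confidence)
  (confidence * 100).round
end

to_percent_truncating(0.6)  # => 60
to_percent_truncating(0.29) # => 28, because 0.29 * 100 is 28.999999999999996
to_percent_rounding(0.29)   # => 29
```

The values used in the README and specs (0.6 and 0.4) convert exactly, so the difference only shows up for confidences whose product with 100 lands just below an integer.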
data/lib/ruby_speech/nlsml/document.rb
ADDED
@@ -0,0 +1,116 @@
+require 'delegate'
+
+module RubySpeech
+  module NLSML
+    class Document < SimpleDelegator
+      def initialize(xml)
+        super
+        @xml = xml
+      end
+
+      def grammar
+        result['grammar']
+      end
+
+      def interpretations
+        interpretation_nodes.map do |interpretation|
+          interpretation_hash_for_interpretation interpretation
+        end
+      end
+
+      def best_interpretation
+        interpretation_hash_for_interpretation interpretation_nodes.first
+      end
+
+      def match?
+        interpretation_nodes.count > 0 && !nomatch? && !noinput?
+      end
+
+      def ==(other)
+        to_xml == other.to_xml
+      end
+
+      def noinput?
+        noinput_elements.any?
+      end
+
+      private
+
+      def nomatch?
+        nomatch_elements.count >= input_elements.count
+      end
+
+      def nomatch_elements
+        result.xpath 'ns:interpretation/ns:input/ns:nomatch|interpretation/input/nomatch', 'ns' => NLSML_NAMESPACE
+      end
+
+      def noinput_elements
+        result.xpath 'ns:interpretation/ns:input/ns:noinput|interpretation/input/noinput', 'ns' => NLSML_NAMESPACE
+      end
+
+      def input_elements
+        result.xpath 'ns:interpretation/ns:input|interpretation/input', 'ns' => NLSML_NAMESPACE
+      end
+
+      def input_hash_for_interpretation(interpretation)
+        input_element = interpretation.at_xpath '(ns:input|input)', 'ns' => NLSML_NAMESPACE
+        { content: input_element.content }.tap do |h|
+          h[:mode] = input_element['mode'].to_sym if input_element['mode']
+        end
+      end
+
+      def instance_hash_for_interpretation(interpretation)
+        instances = instance_elements interpretation
+        return unless instances.any?
+        element_children_key_value instances.first
+      end
+
+      def instances_collection_for_interpretation(interpretation)
+        instances = instance_elements interpretation
+        instances.map do |instance|
+          element_children_key_value instance
+        end
+      end
+
+      def instance_elements(interpretation)
+        interpretation.xpath '(xf:instance|ns:instance|instance)', 'xf' => XFORMS_NAMESPACE, 'ns' => NLSML_NAMESPACE
+      end
+
+      def element_children_key_value(element)
+        element.children.inject({}) do |acc, child|
+          acc[child.node_name.to_sym] = case child.children.count
+          when 0
+            child.content
+          when 1
+            if child.children.first.is_a?(Nokogiri::XML::Text)
+              child.children.first.content
+            else
+              element_children_key_value child
+            end
+          else
+            element_children_key_value child
+          end
+          acc
+        end
+      end
+
+      def interpretation_hash_for_interpretation(interpretation)
+        {
+          confidence: interpretation['confidence'].to_f/100,
+          input: input_hash_for_interpretation(interpretation),
+          instance: instance_hash_for_interpretation(interpretation),
+          instances: instances_collection_for_interpretation(interpretation)
+        }
+      end
+
+      def result
+        root
+      end
+
+      def interpretation_nodes
+        nodes = result.xpath '(ns:interpretation|interpretation)', 'ns' => NLSML_NAMESPACE
+        nodes.sort_by { |int| -int[:confidence].to_i }
+      end
+    end
+  end
+end
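Note on `Document`: two ideas carry most of the parsing. Instance elements are folded recursively into a nested hash keyed by element name, and interpretations are sorted by descending confidence so that the first one is the best. The sketch below illustrates both in plain Ruby under simplifying assumptions: `Node`, `children_to_hash` and `best_interpretation` are hypothetical names, and the `Node` struct stands in for a Nokogiri element (the real code also special-cases a single text child).

```ruby
# Hypothetical stand-in for an XML element: a name plus either text or child nodes.
Node = Struct.new(:name, :text, :children)

# Recursively converts child nodes into a nested Hash keyed by element name.
# Leaf nodes contribute their text; other nodes contribute a nested Hash.
def children_to_hash(node)
  node.children.each_with_object({}) do |child, acc|
    acc[child.name.to_sym] =
      child.children.empty? ? child.text : children_to_hash(child)
  end
end

# Picks the entry with the highest confidence, as interpretation_nodes.first does.
def best_interpretation(interpretations)
  interpretations.sort_by { |i| -i[:confidence] }.first
end

to_city  = Node.new('to_city', 'Pittsburgh', [])
airline  = Node.new('airline', nil, [to_city])
instance = Node.new('instance', nil, [airline])

children_to_hash(instance) == { airline: { to_city: 'Pittsburgh' } } # => true
best_interpretation([{ confidence: 40 }, { confidence: 60 }])        # => the 60 entry
```

One consequence of keying by element name, in the sketch and in the diff alike, is that repeated sibling elements with the same name overwrite one another in the resulting hash.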
data/lib/ruby_speech/version.rb
CHANGED
data/spec/ruby_speech/nlsml_spec.rb
ADDED
@@ -0,0 +1,395 @@
+require 'spec_helper'
+
+describe RubySpeech::NLSML do
+  let :example_document do
+    '''
+      <result xmlns="http://www.w3c.org/2000/11/nlsml" xmlns:xf="http://www.w3.org/2000/xforms" xmlns:myApp="foo" grammar="http://flight">
+        <interpretation confidence="60">
+          <input mode="speech">I want to go to Pittsburgh</input>
+          <xf:model>
+            <xf:group name="airline">
+              <xf:string name="to_city"/>
+            </xf:group>
+          </xf:model>
+          <xf:instance>
+            <myApp:airline>
+              <myApp:to_city>Pittsburgh</myApp:to_city>
+            </myApp:airline>
+          </xf:instance>
+        </interpretation>
+        <interpretation confidence="40">
+          <input>I want to go to Stockholm</input>
+          <xf:model>
+            <xf:group name="airline">
+              <xf:string name="to_city"/>
+            </xf:group>
+          </xf:model>
+          <xf:instance>
+            <myApp:airline>
+              <myApp:to_city>Stockholm</myApp:to_city>
+            </myApp:airline>
+          </xf:instance>
+        </interpretation>
+      </result>
+    '''
+  end
+
+  describe 'drawing a document' do
+    let :expected_document do
+      Nokogiri::XML(example_document, nil, nil, Nokogiri::XML::ParseOptions::NOBLANKS).to_xml
+    end
+
+    it "should allow building a document" do
+      document = RubySpeech::NLSML.draw(grammar: 'http://flight', 'xmlns:myApp' => 'foo') do
+        interpretation confidence: 0.6 do
+          input "I want to go to Pittsburgh", mode: :speech
+
+          model do
+            group name: 'airline' do
+              string name: 'to_city'
+            end
+          end
+
+          instance do
+            self['myApp'].airline do
+              to_city 'Pittsburgh'
+            end
+          end
+        end
+
+        interpretation confidence: 0.4 do
+          input "I want to go to Stockholm"
+
+          model do
+            group name: 'airline' do
+              string name: 'to_city'
+            end
+          end
+
+          instance do
+            self['myApp'].airline do
+              to_city "Stockholm"
+            end
+          end
+        end
+      end
+
+      if RUBY_ENGINE == 'jruby'
+        expected_document.gsub! 'myApp:to_city', 'to_city'
+        expected_document.gsub! 'xf:group', 'group'
+        expected_document.gsub! 'xf:string', 'string'
+      end
+
+      document.to_xml.should == expected_document
+    end
+  end
+
+  describe "parsing a document" do
+    subject do
+      RubySpeech.parse example_document
+    end
+
+    let(:empty_result) { '<result xmlns="http://www.w3c.org/2000/11/nlsml" xmlns:xf="http://www.w3.org/2000/xforms"/>' }
+
+    its(:grammar) { should == 'http://flight' }
+
+    it { should be_match }
+
+    let(:expected_best_interpretation) do
+      {
+        confidence: 0.6,
+        input: { mode: :speech, content: 'I want to go to Pittsburgh' },
+        instance: { airline: { to_city: 'Pittsburgh' } },
+        instances: [{ airline: { to_city: 'Pittsburgh' } }]
+      }
+    end
+
+    let(:expected_interpretations) do
+      [
+        expected_best_interpretation,
+        {
+          confidence: 0.4,
+          input: { content: 'I want to go to Stockholm' },
+          instance: { airline: { to_city: 'Stockholm' } },
+          instances: [{ airline: { to_city: 'Stockholm' } }]
+        }
+      ]
+    end
+
+    its(:interpretations) { should == expected_interpretations }
+    its(:best_interpretation) { should == expected_best_interpretation }
+
+    it "should be equal if the XML is the same" do
+      subject.should be == RubySpeech.parse(example_document)
+    end
+
+    it "should not be equal if the XML is different" do
+      subject.should_not be == RubySpeech.parse(empty_result)
+    end
+
+    context "with an interpretation that has no model/instance" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" grammar="http://flight">
+            <interpretation confidence="60">
+              <input mode="speech">I want to go to Pittsburgh</input>
+            </interpretation>
+            <interpretation confidence="40">
+              <input>I want to go to Stockholm</input>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      let(:expected_best_interpretation) do
+        {
+          confidence: 0.6,
+          input: { mode: :speech, content: 'I want to go to Pittsburgh' },
+          instance: nil,
+          instances: []
+        }
+      end
+
+      let(:expected_interpretations) do
+        [
+          expected_best_interpretation,
+          {
+            confidence: 0.4,
+            input: { content: 'I want to go to Stockholm' },
+            instance: nil,
+            instances: []
+          }
+        ]
+      end
+
+      its(:interpretations) { should == expected_interpretations }
+      its(:best_interpretation) { should == expected_best_interpretation }
+    end
+
+    context "without any interpretations" do
+      subject do
+        RubySpeech.parse empty_result
+      end
+
+      it { should_not be_match }
+    end
+
+    context "with interpretations out of confidence order" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" xmlns:myApp="foo" xmlns:xf="http://www.w3.org/2000/xforms" grammar="http://flight">
+            <interpretation confidence="40">
+              <input>I want to go to Stockholm</input>
+              <xf:model>
+                <xf:group name="airline">
+                  <xf:string name="to_city"/>
+                </xf:group>
+              </xf:model>
+              <xf:instance>
+                <myApp:airline>
+                  <myApp:to_city>Stockholm</myApp:to_city>
+                </myApp:airline>
+              </xf:instance>
+            </interpretation>
+            <interpretation confidence="60">
+              <input mode="speech">I want to go to Pittsburgh</input>
+              <xf:model>
+                <xf:group name="airline">
+                  <xf:string name="to_city"/>
+                </xf:group>
+              </xf:model>
+              <xf:instance>
+                <myApp:airline>
+                  <myApp:to_city>Pittsburgh</myApp:to_city>
+                </myApp:airline>
+              </xf:instance>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      its(:interpretations) { should == expected_interpretations }
+      its(:best_interpretation) { should == expected_best_interpretation }
+    end
+
+    context "with multiple instances for a single interpretation" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" xmlns:myApp="foo" xmlns:xf="http://www.w3.org/2000/xforms" grammar="http://flight">
+            <interpretation confidence="100">
+              <input mode="speech">I want to go to Boston</input>
+              <xf:model>
+                <xf:group name="airline">
+                  <xf:string name="to_city"/>
+                </xf:group>
+              </xf:model>
+              <xf:instance>
+                <myApp:airline>
+                  <myApp:to_city>Boston, MA</myApp:to_city>
+                </myApp:airline>
+              </xf:instance>
+              <xf:instance>
+                <myApp:airline>
+                  <myApp:to_city>Boston, UK</myApp:to_city>
+                </myApp:airline>
+              </xf:instance>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      let(:expected_interpretation) do
+        {
+          confidence: 1.0,
+          input: { content: 'I want to go to Boston', mode: :speech },
+          instance: { airline: { to_city: 'Boston, MA' } },
+          instances: [
+            { airline: { to_city: 'Boston, MA' } },
+            { airline: { to_city: 'Boston, UK' } }
+          ]
+        }
+      end
+
+      its(:interpretations) { should == [expected_interpretation] }
+      its(:best_interpretation) { should == expected_interpretation }
+    end
+
+    context "with no namespaces (because some vendors think this is ok)" do
+      let :example_document do
+        '''
+          <result grammar="http://flight">
+            <interpretation confidence="60">
+              <input mode="speech">I want to go to Pittsburgh</input>
+              <model>
+                <group name="airline">
+                  <string name="to_city"/>
+                </group>
+              </model>
+              <instance>
+                <airline>
+                  <to_city>Pittsburgh</to_city>
+                </airline>
+              </instance>
+            </interpretation>
+            <interpretation confidence="40">
+              <input>I want to go to Stockholm</input>
+              <model>
+                <group name="airline">
+                  <string name="to_city"/>
+                </group>
+              </model>
+              <instance>
+                <airline>
+                  <to_city>Stockholm</to_city>
+                </airline>
+              </instance>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      its(:interpretations) { should == expected_interpretations }
+      its(:best_interpretation) { should == expected_best_interpretation }
+    end
+
+    context "with just an NLSML namespace (because we need something, damnit!)" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" grammar="http://flight">
+            <interpretation confidence="60">
+              <input mode="speech">I want to go to Pittsburgh</input>
+              <model>
+                <group name="airline">
+                  <string name="to_city"/>
+                </group>
+              </model>
+              <instance>
+                <airline>
+                  <to_city>Pittsburgh</to_city>
+                </airline>
+              </instance>
+            </interpretation>
+            <interpretation confidence="40">
+              <input>I want to go to Stockholm</input>
+              <model>
+                <group name="airline">
+                  <string name="to_city"/>
+                </group>
+              </model>
+              <instance>
+                <airline>
+                  <to_city>Stockholm</to_city>
+                </airline>
+              </instance>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      its(:interpretations) { should == expected_interpretations }
+      its(:best_interpretation) { should == expected_best_interpretation }
+    end
+
+    context "with a single interpretation with a nomatch input" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" grammar="http://flight">
+            <interpretation>
+              <input>
+                <nomatch/>
+              </input>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      it { should_not be_match }
+    end
+
+    context "with multiple interpretations where one is a nomatch input" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" grammar="http://flight">
+            <interpretation confidence="60">
+              <input mode="speech">I want to go to Pittsburgh</input>
+              <model>
+                <group name="airline">
+                  <string name="to_city"/>
+                </group>
+              </model>
+              <instance>
+                <airline>
+                  <to_city>Pittsburgh</to_city>
+                </airline>
+              </instance>
+            </interpretation>
+            <interpretation>
+              <input>
+                <nomatch/>
+              </input>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      it { should be_match }
+    end
+
+    context "with a single interpretation with a noinput" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" grammar="http://flight">
+            <interpretation>
+              <input>
+                <noinput/>
+              </input>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      it { should_not be_match }
+      it { should be_noinput }
+    end
+  end
+end
data/spec/ruby_speech_spec.rb
ADDED
@@ -0,0 +1,134 @@
+require 'spec_helper'
+
+describe RubySpeech do
+  describe ".parse" do
+    subject do
+      RubySpeech.parse example_document
+    end
+
+    context "with an SSML document" do
+      let :example_document do
+        '''<?xml version="1.0"?>
+          <!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
+          "http://www.w3.org/TR/speech-synthesis/synthesis.dtd">
+          <speak version="1.0"
+          xmlns="http://www.w3.org/2001/10/synthesis"
+          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+          xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+          http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+          xml:lang="en-US">
+            <p>
+              <s>You have 4 new messages.</s>
+              <s>The first is from Stephanie Williams and arrived at <break/> 3:45pm.
+              </s>
+              <s>
+                The subject is <prosody rate="-20%">ski trip</prosody>
+              </s>
+
+            </p>
+          </speak>
+        '''
+      end
+
+      it { should be_a RubySpeech::SSML::Element }
+    end
+
+    context "with a GRXML document" do
+      let :example_document do
+        '''<?xml version="1.0" encoding="UTF-8"?>
+
+          <!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN"
+          "http://www.w3.org/TR/speech-grammar/grammar.dtd">
+
+          <grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en"
+          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+          xsi:schemaLocation="http://www.w3.org/2001/06/grammar
+          http://www.w3.org/TR/speech-grammar/grammar.xsd"
+          version="1.0" mode="voice" root="basicCmd">
+
+            <meta name="author" content="Stephanie Williams"/>
+
+            <rule id="basicCmd" scope="public">
+              <example> please move the window </example>
+              <example> open a file </example>
+
+              <ruleref uri="http://grammar.example.com/politeness.grxml#startPolite"/>
+
+              <ruleref uri="#command"/>
+              <ruleref uri="http://grammar.example.com/politeness.grxml#endPolite"/>
+
+            </rule>
+
+            <rule id="command">
+              <ruleref uri="#action"/> <ruleref uri="#object"/>
+            </rule>
+
+            <rule id="action">
+              <one-of>
+                <item weight="10"> open <tag>TAG-CONTENT-1</tag> </item>
+                <item weight="2"> close <tag>TAG-CONTENT-2</tag> </item>
+                <item weight="1"> delete <tag>TAG-CONTENT-3</tag> </item>
+                <item weight="1"> move <tag>TAG-CONTENT-4</tag> </item>
+              </one-of>
+            </rule>
+
+            <rule id="object">
+              <item repeat="0-1">
+                <one-of>
+                  <item> the </item>
+                  <item> a </item>
+                </one-of>
+              </item>
+
+              <one-of>
+                <item> window </item>
+                <item> file </item>
+                <item> menu </item>
+              </one-of>
+            </rule>
+
+          </grammar>
+        '''
+      end
+
+      it { should be_a RubySpeech::GRXML::Element }
+    end
+
+    context "with an NLSML document" do
+      let :example_document do
+        '''
+          <result xmlns="http://www.w3c.org/2000/11/nlsml" xmlns:xf="http://www.w3.org/2000/xforms" xmlns:myApp="foo" grammar="http://flight">
+            <interpretation confidence="60">
+              <input mode="speech">I want to go to Pittsburgh</input>
+              <xf:model>
+                <xf:group name="airline">
+                  <xf:string name="to_city"/>
+                </xf:group>
+              </xf:model>
+              <xf:instance>
+                <myApp:airline>
+                  <myApp:to_city>Pittsburgh</myApp:to_city>
+                </myApp:airline>
+              </xf:instance>
+            </interpretation>
+            <interpretation confidence="40">
+              <input>I want to go to Stockholm</input>
+              <xf:model>
+                <xf:group name="airline">
+                  <xf:string name="to_city"/>
+                </xf:group>
+              </xf:model>
+              <xf:instance>
+                <myApp:airline>
+                  <myApp:to_city>Stockholm</myApp:to_city>
+                </myApp:airline>
+              </xf:instance>
+            </interpretation>
+          </result>
+        '''
+      end
+
+      it { should be_a RubySpeech::NLSML::Document }
+    end
+  end
+end
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ruby_speech
 version: !ruby/object:Gem::Version
-  version: 1.0
+  version: 1.1.0
 prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2013-03-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: niceogiri
@@ -266,6 +266,9 @@ files:
 - lib/ruby_speech/grxml/ruleref.rb
 - lib/ruby_speech/grxml/tag.rb
 - lib/ruby_speech/grxml/token.rb
+- lib/ruby_speech/nlsml.rb
+- lib/ruby_speech/nlsml/builder.rb
+- lib/ruby_speech/nlsml/document.rb
 - lib/ruby_speech/ssml.rb
 - lib/ruby_speech/ssml/audio.rb
 - lib/ruby_speech/ssml/break.rb
@@ -296,6 +299,7 @@ files:
 - spec/ruby_speech/grxml/tag_spec.rb
 - spec/ruby_speech/grxml/token_spec.rb
 - spec/ruby_speech/grxml_spec.rb
+- spec/ruby_speech/nlsml_spec.rb
 - spec/ruby_speech/ssml/audio_spec.rb
 - spec/ruby_speech/ssml/break_spec.rb
 - spec/ruby_speech/ssml/desc_spec.rb
@@ -310,6 +314,7 @@ files:
 - spec/ruby_speech/ssml/sub_spec.rb
 - spec/ruby_speech/ssml/voice_spec.rb
 - spec/ruby_speech/ssml_spec.rb
+- spec/ruby_speech_spec.rb
 - spec/spec_helper.rb
 - spec/support/matchers.rb
 homepage: https://github.com/benlangfeld/ruby_speech
@@ -326,7 +331,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash:
+      hash: -2548283956083168054
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
@@ -335,7 +340,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash:
+      hash: -2548283956083168054
 requirements: []
 rubyforge_project: ruby_speech
 rubygems_version: 1.8.24
@@ -354,6 +359,7 @@ test_files:
 - spec/ruby_speech/grxml/tag_spec.rb
 - spec/ruby_speech/grxml/token_spec.rb
 - spec/ruby_speech/grxml_spec.rb
+- spec/ruby_speech/nlsml_spec.rb
 - spec/ruby_speech/ssml/audio_spec.rb
 - spec/ruby_speech/ssml/break_spec.rb
 - spec/ruby_speech/ssml/desc_spec.rb
@@ -368,6 +374,7 @@ test_files:
 - spec/ruby_speech/ssml/sub_spec.rb
 - spec/ruby_speech/ssml/voice_spec.rb
 - spec/ruby_speech/ssml_spec.rb
+- spec/ruby_speech_spec.rb
 - spec/spec_helper.rb
 - spec/support/matchers.rb
 has_rdoc: