noyes 0.4.1 → 0.6.1
- data/FAQ +35 -0
- data/README +144 -5
- data/bin/nrec +61 -0
- data/lib/common/parallel_filter.rb +1 -1
- data/lib/common/serial_filter.rb +1 -1
- data/lib/common.rb +5 -0
- data/lib/java_impl/dct.rb +13 -0
- data/lib/java_impl/delta.rb +14 -0
- data/lib/java_impl/discrete_fourier_transform.rb +10 -0
- data/lib/java_impl/filter.rb +0 -0
- data/lib/java_impl/hamming_window.rb +10 -0
- data/lib/java_impl/java_filter.rb +15 -0
- data/lib/java_impl/live_cmn.rb +10 -0
- data/lib/java_impl/log_compress.rb +10 -0
- data/lib/java_impl/mel_filter.rb +28 -0
- data/lib/java_impl/power_spec.rb +10 -0
- data/lib/java_impl/preemphasis.rb +11 -0
- data/lib/java_impl/segment.rb +11 -0
- data/lib/noyes.rb +12 -15
- data/lib/noyes_java.rb +14 -0
- data/lib/ruby_impl/dct.rb +1 -1
- data/lib/ruby_impl/discrete_fourier_transform.rb +28 -25
- data/lib/ruby_impl/mel_filter.rb +2 -1
- data/lib/ruby_impl/power_spec.rb +2 -3
- data/ship/noyes.jar +0 -0
- metadata +24 -10
- data/bin/recognize.sh +0 -15
- data/doc/overview.rdoc +0 -51
- /data/bin/{noyes_dump44k.sh → noyes_dump44k} +0 -0
- /data/bin/{noyes_dump8k.sh → noyes_dump8k} +0 -0
data/FAQ
ADDED
@@ -0,0 +1,35 @@
|
|
1
|
+
Q:
|
2
|
+
Does this contain a pure Ruby implementation?
|
3
|
+
A:
|
4
|
+
Yes.
|
5
|
+
|
6
|
+
Q:
|
7
|
+
Does this contain a pure Java implementation?
|
8
|
+
A:
|
9
|
+
Yes.
|
10
|
+
|
11
|
+
Q:
|
12
|
+
How do I use the Java implementation from JRuby?
|
13
|
+
A:
|
14
|
+
require 'noyes_java.jar'
|
15
|
+
include NoyesJava
|
16
|
+
|
17
|
+
Q:
|
18
|
+
How do I use the pure ruby implementation?
|
19
|
+
A:
|
20
|
+
require 'noyes'
|
21
|
+
include Noyes
|
22
|
+
|
23
|
+
Q:
|
24
|
+
Is there a recognizer I can use with this front end?
|
25
|
+
A:
|
26
|
+
Yes, the included command line program nrec sends the data
|
27
|
+
to a recognizer running somewhere in the cloud. Currently it
|
28
|
+
recognizes the names of NFL teams. You can get audio data from
|
29
|
+
http://github.com/talkhouse/audiodata/tree/master/nfl/
|
30
|
+
|
31
|
+
Q:
|
32
|
+
What does nrec stand for?
|
33
|
+
|
34
|
+
A:
|
35
|
+
Noyes Recognizer.
|
data/README
CHANGED
@@ -5,10 +5,30 @@ Pronunciation: Typically pronounced the same as 'noise'. But "NO!... YES!" is
|
|
5
5
|
considered acceptable if you yell it loudly enough or at least with sufficient
|
6
6
|
conviction to make people think you have truly changed your mind.
|
7
7
|
|
8
|
+
Noyes is a general purpose signal processing tool that is flexible enough for
|
9
|
+
many purposes. However, it exists because there is a need for low-latency high
|
10
|
+
quality speech recognition on portable wireless devices. The most powerful
|
11
|
+
speech recognizers are very large with huge models running on powerful cloud
|
12
|
+
based systems. But transmitting raw audio to these recognizers creates too
|
13
|
+
much latency because raw audio uses too much bandwidth. By sending compressed
|
14
|
+
features instead of raw audio the bandwidth can be greatly reduced without
|
15
|
+
compromising recognition accuracy. In some cases the effect of inadequate
|
16
|
+
bandwidth on latency can be reduced to zero.
|
17
|
+
|
18
|
+
Because handsets require different implementations, the Noyes library is
|
19
|
+
designed to quickly and efficiently work with and develop multiple underlying
|
20
|
+
implementations. All implementations are accessible via a high level dynamic
|
21
|
+
language that includes a very expressive domain specific language for handling
|
22
|
+
signal processing routines. In addition, all implementations share unit tests
|
23
|
+
written in a high level dynamic language.
|
24
|
+
|
8
25
|
Noyes is implemented entirely in Ruby. It's also implemented entirely in Java.
|
9
26
|
The Java version has Ruby bindings too. So you can have Java's speed from
|
10
|
-
Ruby.
|
11
|
-
|
27
|
+
Ruby. If you need a pure Java version you can use the generated jar. There is
|
28
|
+
a lot of flexibility without a lot of overhead. All versions share the same
|
29
|
+
unit tests, which are written in Ruby.
|
30
|
+
|
31
|
+
The design goal is to have signal processing routines that are so simple and so
|
12
32
|
disentangled from the overall system that anyone could extract any of the
|
13
33
|
routines and use them elsewhere with little trouble. Benchmarks are included.
|
14
34
|
|
@@ -23,10 +43,129 @@ the gem.
|
|
23
43
|
|
24
44
|
Requirements:
|
25
45
|
Almost any version of ruby & rake.
|
26
|
-
Java, if you want to use the Java
|
46
|
+
Java, if you want to use the Java implementation instead of the default pure
|
47
|
+
ruby implementation.
|
27
48
|
|
28
|
-
Some of the utility scripts may use sox, but
|
49
|
+
Some of the utility scripts such as nrec and jrec may use sox, but
|
29
50
|
none of the core routines use it.
|
30
51
|
|
31
|
-
|
52
|
+
Build instructions
|
32
53
|
rake -T
|
54
|
+
|
55
|
+
|
56
|
+
= USAGE
|
57
|
+
|
58
|
+
All signal processing routines use a simple DSL style interface. Below are some
|
59
|
+
examples.
|
60
|
+
|
61
|
+
== Filter operator example.
|
62
|
+
The '>>=' operator is called the filter operator. It modifies the data on the
|
63
|
+
left using the filter on the right. This is similar to the way the += operator
|
64
|
+
works for numbers. Note that the >>= actually looks like a filter making it easy
|
65
|
+
to remember.
|
66
|
+
|
67
|
+
require 'noyes'
|
68
|
+
data = (1..12).to_a # An array of nonsense data.
|
69
|
+
segmenter = Segmenter.new 4, 2 # window size, window shift
|
70
|
+
hamming_filter = HammingWindow.new 4 # window size
|
71
|
+
power_spec_filter = PowerSpectrumFilter.new 8 # number of ffts
|
72
|
+
|
73
|
+
data >>= segmenter
|
74
|
+
data >>= hamming_filter
|
75
|
+
data >>= power_spec_filter
|
76
|
+
data >>= dct_filter
|
77
|
+
|
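To make the "window size, window shift" comment in the example above concrete, here is a minimal sketch of overlapping segmentation. It is not the Noyes Segmenter: `toy_segment` is a made-up helper, and it assumes that a segmenter simply emits every full window and drops any leftover samples, which the real implementation may handle differently.

```ruby
# Toy segmentation (hypothetical helper, not the Noyes Segmenter):
# overlapping windows of win_size samples, each starting win_shift
# samples after the previous one. Incomplete trailing windows are dropped.
def toy_segment(data, win_size, win_shift)
  segments = []
  start = 0
  while start + win_size <= data.size
    segments << data[start, win_size]
    start += win_shift
  end
  segments
end

segments = toy_segment((1..12).to_a, 4, 2)
segments.each { |s| p s }
# prints [1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8], [7, 8, 9, 10], [9, 10, 11, 12]
```

With a shift of half the window size, every sample except those at the very edges lands in two windows, which is the usual reason for applying a tapering window such as Hamming afterwards.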
78
|
+
You can expand the >>= operator out, but I think the flow is worse and there is
|
79
|
+
more repetition, particularly when you have a lot of filters in sequence. This
|
80
|
+
is perfectly valid syntax though. Also, this is very useful if you don't want
|
81
|
+
to keep a reference to your original data.
|
82
|
+
|
83
|
+
require 'noyes'
|
84
|
+
pcm_data = (1..12).to_a
|
85
|
+
segmenter = Segmenter.new 4, 2
|
86
|
+
hamming_filter = HammingWindow.new 4
|
87
|
+
segmented_data = segmenter << pcm_data
|
88
|
+
hamming_data = hamming_filter << segmented_data
|
89
|
+
power_spectrum_data = power_spec_filter << hamming_data
|
90
|
+
dct_data = dct_filter << power_spectrum_data
|
91
|
+
|
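The filter operator leans on a plain Ruby rule: `a >>= b` is shorthand for `a = a >> b`. The sketch below is not the Noyes source. It assumes the DSL gives arrays a `>>` method that hands the data to the filter's `<<` method, and it uses two made-up filters (`ToyScale` and `ToyOffset`) so that it runs without the gem.

```ruby
# Hypothetical stand-in for the DSL hook: data >> filter calls filter << data.
module ToyFilterDSL
  def >>(filter)
    filter << self
  end
end

class Array
  include ToyFilterDSL
end

# Toy filters (not part of Noyes). A filter only needs to respond to <<.
class ToyScale
  def initialize(factor)
    @factor = factor
  end

  def <<(data)
    data.map { |x| x * @factor }
  end
end

class ToyOffset
  def initialize(amount)
    @amount = amount
  end

  def <<(data)
    data.map { |x| x + @amount }
  end
end

data = [1, 2, 3]
data >>= ToyScale.new(2)   # same as: data = data >> ToyScale.new(2)
data >>= ToyOffset.new(1)  # same as: data = data >> ToyOffset.new(1)
p data                     # prints [3, 5, 7]
```

Because `>>=` rebinds the variable rather than mutating the array, the original data is untouched unless you drop your last reference to it.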
92
|
+
== Advanced filter DSLs
|
93
|
+
For most things, the filter operator is simple, easy to remember, and
|
94
|
+
very concise. But sometimes you want to build more elaborate combinations
|
95
|
+
of filters and use them as if you had a single filter. In this case
|
96
|
+
making a new class for every possible combination creates an explosion
|
97
|
+
of new classes and a maintenance nightmare. Instead, there is a simple
|
98
|
+
graph notation you can use to combine filters. In the following example
|
99
|
+
we'll combine all the filters from a previous example and then use them
|
100
|
+
as if they were a single filter.
|
101
|
+
|
102
|
+
serial_filter = segmenter & hamming_filter & power_spec_filter & dct_filter
|
103
|
+
data >>= serial_filter
|
104
|
+
|
105
|
+
It's also possible to take parallel data streams and pipe them through
|
106
|
+
parallel filters as if you had only one data stream and only one filter.
|
107
|
+
|
108
|
+
data = [stream_1,stream_2]
|
109
|
+
parallel_filter = filter_1 | filter_2
|
110
|
+
data >>= parallel_filter
|
111
|
+
|
112
|
+
It is not necessary for the data to be synchronous when using parallel filters.
|
113
|
+
When using parallel filters the number of elements going through one filter
|
114
|
+
does not have to equal the number of elements going through the second filter.
|
115
|
+
|
116
|
+
You can see that you can make arbitrarily complicated graphs of filters by
|
117
|
+
combined use of the '&' and '|' operators. Almost identical notation is used
|
118
|
+
to specify graphs for context free grammars. Keep in mind that '&' takes
|
119
|
+
precedence over '|'. In the example below stream 1 goes through filter 1 and
|
120
|
+
filter 2 while stream 2 goes through filters 3, 4, and 5.
|
121
|
+
|
122
|
+
parallel_data = [stream_1,stream_2]
|
123
|
+
big_filter = filter_1 & filter_2 | filter_3 & filter_4 & filter_5
|
124
|
+
parallel_data >>= big_filter
|
125
|
+
|
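The graph notation works because Ruby itself parses `&` with higher precedence than `|`, so `a & b | c & d` groups as `(a & b) | (c & d)` with no extra parentheses. The following is a minimal sketch of how such combinators could be built. It is loosely modelled on the serial and parallel filters touched in this diff but is not their source; `ToyCombinators`, `ToySerial`, `ToyParallel`, and `ToyAdd` are all hypothetical names.

```ruby
# Toy combinators (hypothetical, not the Noyes classes).
module ToyCombinators
  def &(other)
    ToySerial.new([self, other])    # run self, then other
  end

  def |(other)
    ToyParallel.new([self, other])  # one filter per stream
  end
end

class ToySerial
  include ToyCombinators

  def initialize(filters)
    @filters = filters
  end

  # Feed the output of each filter into the next one.
  def <<(data)
    @filters.inject(data) { |d, f| f << d }
  end
end

class ToyParallel
  include ToyCombinators

  def initialize(filters)
    @filters = filters
  end

  # Keep a flat list so a | b | c handles three streams.
  def |(other)
    ToyParallel.new(@filters + [other])
  end

  # The i-th filter processes the i-th stream.
  def <<(streams)
    @filters.each_with_index.map { |f, i| f << streams[i] }
  end
end

class ToyAdd
  include ToyCombinators

  def initialize(n)
    @n = n
  end

  def <<(data)
    data.map { |x| x + @n }
  end
end

f1, f2, f3, f4, f5 = (1..5).map { |n| ToyAdd.new(n) }
big_filter = f1 & f2 | f3 & f4 & f5
result = big_filter << [[0, 0], [10]]
p result  # prints [[3, 3], [22]]
```

The two streams have different lengths on purpose: each branch of the parallel filter only ever sees its own stream, so nothing forces them to stay in step.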
126
|
+
== Command Line Utilities
|
127
|
+
|
128
|
+
The program nrec will process almost any format of audio file into speech
|
129
|
+
features and send the data to a cloud hosted speech recognizer. The resulting
|
130
|
+
transcript will be sent back and printed out. The nrec program uses whatever
|
131
|
+
version of Ruby is on the path of your current environment. It is compatible
|
132
|
+
with ruby 1.9, ruby 1.8.x, and JRuby. When run under JRuby it can
|
133
|
+
optionally use a Java implementation, which is very fast. See nrec --help for
|
134
|
+
more information.
|
135
|
+
|
136
|
+
== Assessing Performance for Wireless Devices
|
137
|
+
|
138
|
+
It's important to note that the performance characteristics of live data and
|
139
|
+
recorded data are different. Any delay experienced by a user starts from the
|
140
|
+
time they stop speaking. In contrast, any delay experienced when processing a
|
141
|
+
file starts from the time a file starts processing. For that reason file
|
142
|
+
processing always seems slower. Modern recognizers are easily capable of
|
143
|
+
exceeding real time performance so that is not a factor. The delay experienced
|
144
|
+
by a user is typically due to the time required to transmit the audio to the
|
145
|
+
recognizer and the time required to detect end of utterance, assuming end of
|
146
|
+
utterance detection is used.
|
147
|
+
|
148
|
+
If end of utterance detection is used the recognizer must wait until it has
|
149
|
+
sufficient evidence to be reasonably sure the user has stopped talking. This
|
150
|
+
could mean that a suitable period of silence has passed which means the user
|
151
|
+
incurs a slight but unavoidable delay. End of utterance detection also could
|
152
|
+
mean the grammar or language model does not allow for any other reasonable
|
153
|
+
possibility even if more data were available, which may mean no delay at all
|
154
|
+
(or even a negative delay in some cases).
|
155
|
+
|
156
|
+
If the bandwidth of the network is low enough, which is often the case for the
|
157
|
+
data channel of portable wireless handsets, it will take time for raw
|
158
|
+
uncompressed audio to traverse the network. By computing features on the
|
159
|
+
handset it is possible to have significant reduction in bandwidth requirements
|
160
|
+
eliminating much of the latency. These features in turn may then be compressed
|
161
|
+
for further bandwidth reduction. This method exceeds what is possible with
|
162
|
+
alternative methods of audio compression. Further, it eliminates many of the
|
163
|
+
distortion components that may compromise recognition accuracy.
|
164
|
+
|
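A rough back-of-the-envelope calculation shows why sending features can remove the bandwidth penalty. Every number below is an illustrative assumption rather than a measurement from Noyes: 8 kHz 16-bit mono audio, 13 features per frame at 100 frames per second stored as 32-bit floats, and a hypothetical 64 kbit/s uplink.

```ruby
# Illustrative numbers only; none of these are taken from Noyes itself.
sample_rate_hz     = 8_000
bits_per_sample    = 16
features_per_frame = 13    # assumed feature vector length
frames_per_second  = 100   # assumed frame rate
bits_per_feature   = 32    # uncompressed single precision float
link_bps           = 64_000 # hypothetical uplink speed
utterance_seconds  = 2

raw_bps     = sample_rate_hz * bits_per_sample
feature_bps = features_per_frame * frames_per_second * bits_per_feature

# Time needed to push the whole utterance through the link.
raw_send_time     = (raw_bps * utterance_seconds).to_f / link_bps
feature_send_time = (feature_bps * utterance_seconds).to_f / link_bps

puts "raw audio: #{raw_bps} bit/s, features: #{feature_bps} bit/s"
puts "time to send raw audio: #{raw_send_time} s"
puts "time to send features:  #{feature_send_time} s"
```

Under these assumptions the raw audio needs longer to send than the utterance lasts, so the transfer is still running after the user stops speaking. The features need less time than the utterance lasts, so a streaming sender can keep up while the user is talking, which is the case where the bandwidth contribution to latency drops to roughly zero.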
165
|
+
If all you want is a rough feeling of how responsive speech recognition will be
|
166
|
+
over your network try speaking an utterance at the same time you enter a
|
167
|
+
command to have a prerecorded utterance recognized. You'll probably be
|
168
|
+
surprised by how quickly the network is able to respond. You may find that the
|
169
|
+
Java implementation feels like instant response even though it takes time for
|
170
|
+
the JVM to launch. Ruby 1.9 is actually surprisingly quick on a reasonably
|
171
|
+
powerful laptop.
|
data/bin/nrec
ADDED
@@ -0,0 +1,61 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
# vim: set filetype=ruby :
|
3
|
+
ROOT = File.dirname(File.dirname(__FILE__))
|
4
|
+
$: << "#{ROOT}/lib" << "#{ROOT}/ship"
|
5
|
+
|
6
|
+
require 'optparse'
|
7
|
+
options = {}
|
8
|
+
OptionParser.new do |opt|
|
9
|
+
opt.banner = 'Usage: nrec [options] file1 file2 ...'
|
10
|
+
opt.on '-v', '--verbose', 'Output more information' do
|
11
|
+
options[:verbose] = true
|
12
|
+
end
|
13
|
+
options[:implementation] = :ruby
|
14
|
+
opt.on '-j', '--java', 'Use java implementation' do
|
15
|
+
options[:implementation] = :java
|
16
|
+
end
|
17
|
+
opt.on( '-h', '--help', 'Display this screen' ) do
|
18
|
+
puts opt
|
19
|
+
exit
|
20
|
+
end
|
21
|
+
end.parse!
|
22
|
+
|
23
|
+
# Must set implementation specific library path before requiring libraries.
|
24
|
+
case options[:implementation]
|
25
|
+
when :java
|
26
|
+
if RUBY_PLATFORM != 'java'
|
27
|
+
puts "The Java implementation is not accessible from Ruby, only JRuby."
|
28
|
+
puts "You'll need to check your environment carefully. If you've"
|
29
|
+
puts "installed this gem under both ruby and jruby and both are in"
|
30
|
+
puts "your current environment you may have created a conflict."
|
31
|
+
puts "You must make sure the jruby path precedes the ruby path."
|
32
|
+
exit
|
33
|
+
end
|
34
|
+
puts "Using Java implementation" if options[:verbose]
|
35
|
+
require 'noyes_java'
|
36
|
+
include NoyesJava
|
37
|
+
when :ruby
|
38
|
+
if options[:verbose]
|
39
|
+
if RUBY_PLATFORM == 'java'
|
40
|
+
puts "Using pure ruby implementation under JRuby #{RUBY_VERSION}."
|
41
|
+
else
|
42
|
+
puts "Using pure ruby implementation under Ruby #{RUBY_VERSION}."
|
43
|
+
end
|
44
|
+
end
|
45
|
+
|
46
|
+
require 'noyes'
|
47
|
+
include Noyes
|
48
|
+
end
|
49
|
+
require 'socket'
|
50
|
+
|
51
|
+
def recognize file, node='174.129.244.159', port=2348
|
52
|
+
TCPSocket.open(node, port) do |client|
|
53
|
+
send_incremental_features file, client, client
|
54
|
+
end
|
55
|
+
end
|
56
|
+
|
57
|
+
ARGV.each do |file|
|
58
|
+
puts "recognizing file #{file}" if options[:verbose]
|
59
|
+
result = recognize file
|
60
|
+
puts "\n#{result}"
|
61
|
+
end
|
data/lib/common/parallel_filter.rb
CHANGED
@@ -12,7 +12,7 @@ module NoyesFilterDSL
|
|
12
12
|
offset = -1
|
13
13
|
@filters.map {|f| f << data[offset+=1]}
|
14
14
|
end
|
15
|
-
def
|
15
|
+
def & other
|
16
16
|
raise "Parameter does not respond to <<." unless other.respond_to? :<<
|
17
17
|
if other.kind_of?(ParallelFilter) && filters.size != other.filters.size
|
18
18
|
raise "Parallel filters must have equal dimensions %d vs %d " %
|
data/lib/common/serial_filter.rb
CHANGED
@@ -8,7 +8,7 @@ module NoyesFilterDSL
|
|
8
8
|
@filters.each {|f| data >>= f}
|
9
9
|
data
|
10
10
|
end
|
11
|
-
def
|
11
|
+
def & other
|
12
12
|
raise "Parameter does not respond to <<." unless other.respond_to? :<<
|
13
13
|
if other.kind_of? SerialFilter
|
14
14
|
return SerialFilter.new(@filters.clone + other.filters.clone)
|
data/lib/common.rb
ADDED
data/lib/java_impl/dct.rb
ADDED
@@ -0,0 +1,13 @@
|
|
1
|
+
require 'java_impl/java_filter'
|
2
|
+
|
3
|
+
module NoyesJava
|
4
|
+
class DCT
|
5
|
+
include JavaFilter
|
6
|
+
def initialize order, ncol
|
7
|
+
@filter = Java::talkhouse.DiscreteCosineTransform.new order, ncol
|
8
|
+
end
|
9
|
+
def melcos
|
10
|
+
@filter.melcos.map {|a|a.to_a}
|
11
|
+
end
|
12
|
+
end
|
13
|
+
end
|
data/lib/java_impl/delta.rb
ADDED
@@ -0,0 +1,14 @@
|
|
1
|
+
require 'java_impl/java_filter'
|
2
|
+
|
3
|
+
module NoyesJava
|
4
|
+
class DoubleDeltaFilter
|
5
|
+
include JavaFilter
|
6
|
+
def initialize
|
7
|
+
@filter = Java::talkhouse.DoubleDeltaFilter.new
|
8
|
+
end
|
9
|
+
def final_estimate
|
10
|
+
x = @filter.final_estimate
|
11
|
+
x.map{|a|a.to_a}
|
12
|
+
end
|
13
|
+
end
|
14
|
+
end
|
File without changes
|
data/lib/java_impl/java_filter.rb
ADDED
@@ -0,0 +1,15 @@
|
|
1
|
+
module NoyesJava
|
2
|
+
module JavaFilter
|
3
|
+
def << data
|
4
|
+
java_matrix = @filter.apply data.to_java Java::double[]
|
5
|
+
java_matrix.map {|java_array|java_array.to_a}
|
6
|
+
end
|
7
|
+
def self.ensure_jarray array
|
8
|
+
if array.respond_to? :each
|
9
|
+
array.to_java(Java::double[]).to_a
|
10
|
+
else
|
11
|
+
array
|
12
|
+
end
|
13
|
+
end
|
14
|
+
end
|
15
|
+
end
|
data/lib/java_impl/mel_filter.rb
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
require 'java_impl/java_filter'
|
2
|
+
|
3
|
+
module NoyesJava
|
4
|
+
class MelFilter
|
5
|
+
include JavaFilter
|
6
|
+
def initialize srate, nfft, nfilt, lowerf, upperf
|
7
|
+
@filter = Java::talkhouse.MelFilter.new srate, nfft, nfilt, lowerf, upperf
|
8
|
+
end
|
9
|
+
def self.make_bank_parameters srate, nfft, nfilt, lowerf, upperf
|
10
|
+
parameters = Java::talkhouse.MelFilter.make_bank_parameters srate, nfft,
|
11
|
+
nfilt, lowerf, upperf
|
12
|
+
parameters.map {|array|array.to_a}
|
13
|
+
end
|
14
|
+
def self.make_filter left, center, right, init_freq, delta
|
15
|
+
filters = Java::talkhouse.MelFilter.make_filter left, center, right,
|
16
|
+
init_freq, delta
|
17
|
+
filters = filters.to_a
|
18
|
+
indefilters = filters.shift
|
19
|
+
[indefilters, filters]
|
20
|
+
end
|
21
|
+
def self.to_mel f
|
22
|
+
x = Java::talkhouse.MelFilter.mel JavaFilter.ensure_jarray f
|
23
|
+
end
|
24
|
+
def self.to_linear mel
|
25
|
+
Java::talkhouse.MelFilter.melinv JavaFilter.ensure_jarray mel
|
26
|
+
end
|
27
|
+
end
|
28
|
+
end
|
data/lib/java_impl/segment.rb
ADDED
@@ -0,0 +1,11 @@
|
|
1
|
+
module NoyesJava
|
2
|
+
class Segmenter
|
3
|
+
def initialize win_size, win_shift
|
4
|
+
@filter = Java::talkhouse.Segmenter.new win_size, win_shift
|
5
|
+
end
|
6
|
+
def << data
|
7
|
+
java_matrix = @filter.apply data.to_java(:double)
|
8
|
+
java_matrix.map {|java_array|java_array.to_a} if java_matrix
|
9
|
+
end
|
10
|
+
end
|
11
|
+
end
|
data/lib/noyes.rb
CHANGED
@@ -1,15 +1,12 @@
|
|
1
|
-
require '
|
2
|
-
require '
|
3
|
-
require '
|
4
|
-
require '
|
5
|
-
require '
|
6
|
-
require '
|
7
|
-
require '
|
8
|
-
require '
|
9
|
-
require '
|
10
|
-
require '
|
11
|
-
require '
|
12
|
-
require '
|
13
|
-
require 'power_spec'
|
14
|
-
require 'preemphasis'
|
15
|
-
require 'segment'
|
1
|
+
require 'common'
|
2
|
+
require 'ruby_impl/dct'
|
3
|
+
require 'ruby_impl/delta'
|
4
|
+
require 'ruby_impl/filter'
|
5
|
+
require 'ruby_impl/mel_filter'
|
6
|
+
require 'ruby_impl/hamming_window'
|
7
|
+
require 'ruby_impl/log_compress'
|
8
|
+
require 'ruby_impl/live_cmn'
|
9
|
+
require 'ruby_impl/discrete_fourier_transform'
|
10
|
+
require 'ruby_impl/power_spec'
|
11
|
+
require 'ruby_impl/preemphasis'
|
12
|
+
require 'ruby_impl/segment'
|
data/lib/noyes_java.rb
ADDED
@@ -0,0 +1,14 @@
|
|
1
|
+
require 'common'
|
2
|
+
require 'java'
|
3
|
+
require 'java_impl/dct'
|
4
|
+
require 'java_impl/delta'
|
5
|
+
require 'java_impl/filter'
|
6
|
+
require 'java_impl/mel_filter'
|
7
|
+
require 'java_impl/hamming_window'
|
8
|
+
require 'java_impl/live_cmn'
|
9
|
+
require 'java_impl/log_compress'
|
10
|
+
require 'java_impl/discrete_fourier_transform'
|
11
|
+
require 'java_impl/power_spec'
|
12
|
+
require 'java_impl/preemphasis'
|
13
|
+
require 'java_impl/segment'
|
14
|
+
require 'noyes.jar'
|
data/lib/ruby_impl/dct.rb
CHANGED
data/lib/ruby_impl/discrete_fourier_transform.rb
CHANGED
@@ -4,31 +4,34 @@ module Noyes
|
|
4
4
|
include Math
|
5
5
|
# Takes the discrete Fourier transform.
|
6
6
|
def dft data,size
|
7
|
-
vals = Array.new(size)
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
7
|
+
vals = Array.new(size) do |i|
|
8
|
+
i < data.size ? Complex(data[i],0) : Complex(0,0)
|
9
|
+
end
|
10
|
+
j=0
|
11
|
+
size.times do |i|
|
12
|
+
vals[j],vals[i] = vals[i],vals[j] if i<j
|
13
|
+
m = size/2
|
14
|
+
while j>=m && m>1
|
15
|
+
j-=m
|
14
16
|
m/=2
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
30
|
-
|
31
|
-
|
32
|
-
|
17
|
+
end
|
18
|
+
j+=m
|
19
|
+
end
|
20
|
+
k=1
|
21
|
+
while k<size
|
22
|
+
incr = 2*k
|
23
|
+
mul = Complex.polar 1, Math::PI/k
|
24
|
+
w = Complex(1, 0)
|
25
|
+
k.times do |i|
|
26
|
+
i.step(size-1,incr) do |j|
|
27
|
+
tmp = w * vals[j+k]
|
28
|
+
vals[j+k],vals[j]=vals[j]-tmp,vals[j]+tmp
|
29
|
+
end
|
30
|
+
w *= mul;
|
31
|
+
end
|
32
|
+
k=incr
|
33
|
+
end
|
34
|
+
vals
|
33
35
|
end
|
36
|
+
module_function :dft
|
34
37
|
end
|
data/lib/ruby_impl/mel_filter.rb
CHANGED
data/lib/ruby_impl/power_spec.rb
CHANGED
@@ -1,17 +1,16 @@
|
|
1
|
-
require 'discrete_fourier_transform'
|
1
|
+
require 'ruby_impl/discrete_fourier_transform'
|
2
2
|
module Noyes
|
3
3
|
# The square of the DFT. You must specify the number of ffts. The power
|
4
4
|
# spectrum returns an array of arrays where each inner array is of length
|
5
5
|
# nfft/2 + 1. The length of the outer array does not change.
|
6
6
|
class PowerSpectrumFilter
|
7
|
-
include Noyes
|
8
7
|
def initialize nfft
|
9
8
|
@nfft = nfft
|
10
9
|
end
|
11
10
|
def << data
|
12
11
|
nuniqdftpts = @nfft/2 + 1
|
13
12
|
data.map do |datavec|
|
14
|
-
datavecfft = dft datavec, @nfft
|
13
|
+
datavecfft = Noyes.dft datavec, @nfft
|
15
14
|
Array.new(nuniqdftpts){|i| datavecfft[i].abs**2}
|
16
15
|
end
|
17
16
|
end
|
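The new dft routine in the hunk further up is an in-place radix-2 transform (bit reversal followed by butterflies), so it presumably expects the size to be a power of two. A slow O(n²) reference is handy for cross-checking the power spectrum in unit tests. The sketch below is not part of the gem; `naive_dft` and `naive_power_spectrum` are illustrative names. It uses the conventional negative exponent, while the routine in the diff appears to use a positive one (`Complex.polar 1, Math::PI/k`). For real input the two results are complex conjugates, so the squared magnitudes should agree.

```ruby
# Slow reference DFT (illustrative helper, not part of the gem).
# Zero-pads or truncates the input to `size` points, as the routine
# in the diff does when it builds its working array.
def naive_dft(data, size)
  padded = Array.new(size) { |i| i < data.size ? data[i] : 0 }
  Array.new(size) do |k|
    sum = Complex(0, 0)
    padded.each_with_index do |x, t|
      sum += x * Complex.polar(1, -2 * Math::PI * k * t / size)
    end
    sum
  end
end

# Squared magnitude of the first nfft/2 + 1 bins, matching the length
# described in the PowerSpectrumFilter comment.
def naive_power_spectrum(frame, nfft)
  spectrum = naive_dft(frame, nfft)
  Array.new(nfft / 2 + 1) { |i| spectrum[i].abs**2 }
end

impulse = [1, 0, 0, 0]
power = naive_power_spectrum(impulse, 8)
p power.size  # prints 5
```

A unit impulse has a flat spectrum, so every bin of `power` should be 1 to within floating point error. A constant signal puts all of its energy in bin 0.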
data/ship/noyes.jar
ADDED
Binary file
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: noyes
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.6.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Joe Woelfel
|
@@ -9,29 +9,43 @@ autorequire:
|
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
11
|
|
12
|
-
date: 2010-02-
|
12
|
+
date: 2010-02-25 00:00:00 -05:00
|
13
13
|
default_executable:
|
14
14
|
dependencies: []
|
15
15
|
|
16
|
-
description: Currently sufficient to create
|
16
|
+
description: Currently sufficient to create features for speech recognition
|
17
17
|
email: joe@talkhouse.com
|
18
18
|
executables:
|
19
|
-
- noyes_dump44k
|
20
|
-
- noyes_dump8k
|
21
|
-
-
|
19
|
+
- noyes_dump44k
|
20
|
+
- noyes_dump8k
|
21
|
+
- nrec
|
22
22
|
extensions: []
|
23
23
|
|
24
24
|
extra_rdoc_files:
|
25
25
|
- COPYING
|
26
|
+
- FAQ
|
26
27
|
- README
|
27
|
-
- doc/overview.rdoc
|
28
28
|
files:
|
29
|
+
- lib/common.rb
|
29
30
|
- lib/common/noyes_dsl.rb
|
30
31
|
- lib/common/noyes_math.rb
|
31
32
|
- lib/common/parallel_filter.rb
|
32
33
|
- lib/common/send_incrementally.rb
|
33
34
|
- lib/common/serial_filter.rb
|
35
|
+
- lib/java_impl/dct.rb
|
36
|
+
- lib/java_impl/delta.rb
|
37
|
+
- lib/java_impl/discrete_fourier_transform.rb
|
38
|
+
- lib/java_impl/filter.rb
|
39
|
+
- lib/java_impl/hamming_window.rb
|
40
|
+
- lib/java_impl/java_filter.rb
|
41
|
+
- lib/java_impl/live_cmn.rb
|
42
|
+
- lib/java_impl/log_compress.rb
|
43
|
+
- lib/java_impl/mel_filter.rb
|
44
|
+
- lib/java_impl/power_spec.rb
|
45
|
+
- lib/java_impl/preemphasis.rb
|
46
|
+
- lib/java_impl/segment.rb
|
34
47
|
- lib/noyes.rb
|
48
|
+
- lib/noyes_java.rb
|
35
49
|
- lib/ruby_impl/dct.rb
|
36
50
|
- lib/ruby_impl/delta.rb
|
37
51
|
- lib/ruby_impl/discrete_fourier_transform.rb
|
@@ -43,9 +57,10 @@ files:
|
|
43
57
|
- lib/ruby_impl/power_spec.rb
|
44
58
|
- lib/ruby_impl/preemphasis.rb
|
45
59
|
- lib/ruby_impl/segment.rb
|
60
|
+
- ship/noyes.jar
|
46
61
|
- COPYING
|
62
|
+
- FAQ
|
47
63
|
- README
|
48
|
-
- doc/overview.rdoc
|
49
64
|
has_rdoc: true
|
50
65
|
homepage: http://github.com/talkhouse/noyes
|
51
66
|
licenses: []
|
@@ -54,9 +69,8 @@ post_install_message:
|
|
54
69
|
rdoc_options:
|
55
70
|
- --charset=UTF-8
|
56
71
|
require_paths:
|
57
|
-
- lib/ruby_impl
|
58
|
-
- lib/common
|
59
72
|
- lib
|
73
|
+
- ship
|
60
74
|
required_ruby_version: !ruby/object:Gem::Requirement
|
61
75
|
requirements:
|
62
76
|
- - ">="
|
data/bin/recognize.sh
DELETED
@@ -1,15 +0,0 @@
|
|
1
|
-
#!/usr/bin/env jruby
|
2
|
-
# vim: set filetype=ruby :
|
3
|
-
ROOT = File.dirname(File.dirname(__FILE__))
|
4
|
-
$: << "#{ROOT}/lib/ruby"
|
5
|
-
$: << "#{ROOT}/lib/common"
|
6
|
-
require 'socket'
|
7
|
-
require 'send_incrementally'
|
8
|
-
|
9
|
-
def recognize file, node='localhost', port=2318
|
10
|
-
TCPSocket.open(node, port) do |client|
|
11
|
-
send_incremental_features file, client, client
|
12
|
-
end
|
13
|
-
end
|
14
|
-
|
15
|
-
puts recognize ARGV[0]
|
data/doc/overview.rdoc
DELETED
@@ -1,51 +0,0 @@
|
|
1
|
-
# = Overview
|
2
|
-
#
|
3
|
-
# All signal processing routines use a simple DSL style inteface. Below are
|
4
|
-
# some examples.
|
5
|
-
#
|
6
|
-
# == Filter operator example.
|
7
|
-
# Each example below is the data on the left being operated on by the filter on
|
8
|
-
# the right. This is similar to the way the += operator works for numbers. The
|
9
|
-
# data is not modified in place currently and it should probably stay that way.
|
10
|
-
# It could be if efficiency demanded it, but that would require a bit more care
|
11
|
-
# to avoid side effects when using the API. The >>= actually looks like a
|
12
|
-
# filter.
|
13
|
-
#
|
14
|
-
# data = (1..12).to_a
|
15
|
-
# segmenter = Segmenter.new 4, 2 # window size, window shift
|
16
|
-
# hamming_filter = HammingWindow.new 4 # window size
|
17
|
-
# power_spec_filter = PowerSpectrumFilter.new 8 # number of ffts
|
18
|
-
#
|
19
|
-
# data >>= segmenter
|
20
|
-
# data >>= hamming_filter
|
21
|
-
# data >>= power_spec_filter
|
22
|
-
# data >>= dct_filter
|
23
|
-
#
|
24
|
-
# You can expand the >>= operator out, but I think the flow is worse and there
|
25
|
-
# is more repetition, particularly when you have a lot of filters in sequence.
|
26
|
-
# This is perfectly valid syntax though. Also, this is very useful if you don't
|
27
|
-
# want to keep a reference to your original data.
|
28
|
-
#
|
29
|
-
# pcm_data = (1..12).to_a
|
30
|
-
# segmenter = Segmenter.new
|
31
|
-
# hamming_filter = HammingWindow.new 4
|
32
|
-
# segmented_data = segmenter << pcm_data, 4, 2
|
33
|
-
# hamming_data = hamming_filter << segmented_data
|
34
|
-
# power_spectrum data = power_spec_filter hamminging_data, 8
|
35
|
-
# dct_data = dct_filter << power_spectrum_data
|
36
|
-
#
|
37
|
-
# Here is an older version with function calls instead of operator overloading.
|
38
|
-
# The trouble with it is that the flow is hard to follow, and there is
|
39
|
-
# repetition. Filter and process are really synonyms. And this requires
|
40
|
-
# repeating the data component twice. Also, power spec is a function here
|
41
|
-
# with additional arguments. I think I'd rather have the configuration
|
42
|
-
# details, such as number of ffts all grouped at the top. It's easier to
|
43
|
-
# follow this way.
|
44
|
-
#
|
45
|
-
# data = (1..12).to_a
|
46
|
-
# seg = Segmenter.new
|
47
|
-
# ham = HammingWindow.new 4
|
48
|
-
# segments = segmenter.process data, 4, 2
|
49
|
-
# hamming_ = hamming_filter.process segments
|
50
|
-
# power = power_spec.filter hamming, 8
|
51
|
-
# dct = dct.process power
|
File without changes
|
File without changes
|