noyes 0.4.1 → 0.6.1
- data/FAQ +35 -0
- data/README +144 -5
- data/bin/nrec +61 -0
- data/lib/common/parallel_filter.rb +1 -1
- data/lib/common/serial_filter.rb +1 -1
- data/lib/common.rb +5 -0
- data/lib/java_impl/dct.rb +13 -0
- data/lib/java_impl/delta.rb +14 -0
- data/lib/java_impl/discrete_fourier_transform.rb +10 -0
- data/lib/java_impl/filter.rb +0 -0
- data/lib/java_impl/hamming_window.rb +10 -0
- data/lib/java_impl/java_filter.rb +15 -0
- data/lib/java_impl/live_cmn.rb +10 -0
- data/lib/java_impl/log_compress.rb +10 -0
- data/lib/java_impl/mel_filter.rb +28 -0
- data/lib/java_impl/power_spec.rb +10 -0
- data/lib/java_impl/preemphasis.rb +11 -0
- data/lib/java_impl/segment.rb +11 -0
- data/lib/noyes.rb +12 -15
- data/lib/noyes_java.rb +14 -0
- data/lib/ruby_impl/dct.rb +1 -1
- data/lib/ruby_impl/discrete_fourier_transform.rb +28 -25
- data/lib/ruby_impl/mel_filter.rb +2 -1
- data/lib/ruby_impl/power_spec.rb +2 -3
- data/ship/noyes.jar +0 -0
- metadata +24 -10
- data/bin/recognize.sh +0 -15
- data/doc/overview.rdoc +0 -51
- /data/bin/{noyes_dump44k.sh → noyes_dump44k} +0 -0
- /data/bin/{noyes_dump8k.sh → noyes_dump8k} +0 -0
data/FAQ
ADDED
@@ -0,0 +1,35 @@
|
|
1
|
+
Q:
|
2
|
+
Does this contain a pure Ruby implementation?
|
3
|
+
A:
|
4
|
+
Yes.
|
5
|
+
|
6
|
+
Q:
|
7
|
+
Does this contain a pure Java implementation?
|
8
|
+
A:
|
9
|
+
Yes.
|
10
|
+
|
11
|
+
Q:
|
12
|
+
How do I use the Java implementation from JRuby?
|
13
|
+
A:
|
14
|
+
require 'noyes_java'
|
15
|
+
include NoyesJava
|
16
|
+
|
17
|
+
Q:
|
18
|
+
How do I use the pure Ruby implementation?
|
19
|
+
A:
|
20
|
+
require 'noyes'
|
21
|
+
include Noyes
|
22
|
+
|
23
|
+
Q:
|
24
|
+
Is there a recognizer I can use with this front end?
|
25
|
+
A:
|
26
|
+
Yes, the included command line program nrec sends the data
|
27
|
+
to a recognizer running somewhere in the cloud. Currently it
|
28
|
+
recognizes the names of NFL teams. You can get audio data from
|
29
|
+
http://github.com/talkhouse/audiodata/tree/master/nfl/
|
30
|
+
|
31
|
+
Q:
|
32
|
+
What does nrec stand for?
|
33
|
+
|
34
|
+
A:
|
35
|
+
Noyes Recognizer.
|
data/README
CHANGED
@@ -5,10 +5,30 @@ Pronunciation: Typically pronounced the same as 'noise'. But "NO!... YES!" is
|
|
5
5
|
considered acceptable if you yell it loudly enough or at least with sufficient
|
6
6
|
conviction to make people think you have truly changed your mind.
|
7
7
|
|
8
|
+
Noyes is a general purpose signal processing tool that is flexible enough for
|
9
|
+
many purposes. However, it exists because there is a need for low-latency high
|
10
|
+
quality speech recognition on portable wireless devices. The most powerful
|
11
|
+
speech recognizers are very large with huge models running on powerful cloud
|
12
|
+
based systems. But transmitting raw audio to these recognizers creates too
|
13
|
+
much latency because raw audio uses too much bandwidth. By sending compressed
|
14
|
+
features instead of raw audio the bandwidth can be greatly reduced without
|
15
|
+
compromising recognition accuracy. In some cases the effect of inadequate
|
16
|
+
bandwidth on latency can be reduced to zero.
|
17
|
+
|
18
|
+
Because handsets require different implementations the Noyes library is
|
19
|
+
designed to quickly and efficiently work with and develop multiple underlying
|
20
|
+
implementations. All implementations are accessible via a high level dynamic
|
21
|
+
language that includes a very expressive domain specific language for handling
|
22
|
+
signal processing routines. In addition, all implementations share unit tests
|
23
|
+
written in a high level dynamic language.
|
24
|
+
|
8
25
|
Noyes is implemented entirely in Ruby. It's also implemented entirely in Java.
|
9
26
|
The Java version has Ruby bindings too. So you can have Java's speed from
|
10
|
-
Ruby.
|
11
|
-
|
27
|
+
Ruby. If you need a pure Java version you can use the generated jar. There is
|
28
|
+
a lot of flexibility without a lot of overhead. All versions share the same
|
29
|
+
unit tests, which are written in Ruby.
|
30
|
+
|
31
|
+
The design goal is to have signal processing routines that are so simple and so
|
12
32
|
disentangled from the overall system that anyone could extract any of the
|
13
33
|
routines and use them elsewhere with little trouble. Benchmarks are included.
|
14
34
|
|
@@ -23,10 +43,129 @@ the gem.
|
|
23
43
|
|
24
44
|
Requirements:
|
25
45
|
Almost any version of ruby & rake.
|
26
|
-
Java, if you want to use the Java
|
46
|
+
Java, if you want to use the Java implementation instead of the default pure
|
47
|
+
ruby implementation.
|
27
48
|
|
28
|
-
Some of the utility scripts may use sox, but
|
49
|
+
Some of the utility scripts such as nrec and jrec may use sox, but
|
29
50
|
none of the core routines use it.
|
30
51
|
|
31
|
-
|
52
|
+
Build instructions
|
32
53
|
rake -T
|
54
|
+
|
55
|
+
|
56
|
+
= USAGE
|
57
|
+
|
58
|
+
All signal processing routines use a simple DSL style interface. Below are some
|
59
|
+
examples.
|
60
|
+
|
61
|
+
== Filter operator example.
|
62
|
+
The '>>=' operator is called the filter operator. It modifies the data on the
|
63
|
+
left using the filter on the right. This is similar to the way the += operator
|
64
|
+
works for numbers. Note that the >>= actually looks like a filter making it easy
|
65
|
+
to remember.
|
66
|
+
|
67
|
+
require 'noyes'
|
68
|
+
data = (1..12).to_a # An array of nonsense data.
|
69
|
+
segmenter = Segmenter.new 4, 2 # window size, window shift
|
70
|
+
hamming_filter = HammingWindow.new 4 # window size
|
71
|
+
power_spec_filter = PowerSpectrumFilter.new 8 # number of ffts
|
72
|
+
|
73
|
+
data >>= segmenter
|
74
|
+
data >>= hamming_filter
|
75
|
+
data >>= power_spec_filter
|
76
|
+
data >>= dct_filter
|
77
|
+
|
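The filter operator relies on ordinary Ruby semantics: `data >>= f` is shorthand for `data = data >> f`. The sketch below shows one way that could be wired up. It is not the gem's own code (the DSL source is not part of this diff); the `Array#>>` patch and the `Doubler` class are hypothetical names used only for illustration.

```ruby
# Hypothetical sketch, not the gem's actual DSL implementation.
class Array
  # data >> filter hands the array to the filter's << method.
  def >>(filter)
    filter << self
  end
end

# A toy filter that doubles every element.
class Doubler
  def <<(data)
    data.map { |x| x * 2 }
  end
end

data = [1, 2, 3]
data >>= Doubler.new   # same as: data = data >> Doubler.new
p data
```

Because `>>=` reassigns the variable, the original array is left untouched and simply becomes unreachable once nothing else refers to it.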
78
|
+
You can expand the >>= operator out, but I think the flow is worse and there is
|
79
|
+
more repetition, particularly when you have a lot of filters in sequence. This
|
80
|
+
is perfectly valid syntax though. Also, this is very useful if you don't want
|
81
|
+
to keep a reference to your original data.
|
82
|
+
|
83
|
+
require 'noyes'
|
84
|
+
pcm_data = (1..12).to_a
|
85
|
+
segmenter = Segmenter.new 4, 2
|
86
|
+
hamming_filter = HammingWindow.new 4
|
87
|
+
segmented_data = segmenter << pcm_data
|
88
|
+
hamming_data = hamming_filter << segmented_data
|
89
|
+
power_spectrum_data = power_spec_filter << hamming_data
|
90
|
+
dct_data = dct_filter << power_spectrum_data
|
91
|
+
|
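To make the first two stages concrete, here is a plain Ruby sketch of what segmenting and Hamming windowing compute for a window size of 4 and a shift of 2. It assumes the segmenter produces overlapping windows and drops a trailing partial window, and it uses the textbook Hamming formula; the gem's own classes may differ in both respects. The method names `segment` and `hamming` are hypothetical.

```ruby
# Assumed behaviour: overlapping windows of win_size samples that advance
# by win_shift samples; a trailing partial window is dropped.
def segment(data, win_size, win_shift)
  (0..data.size - win_size).step(win_shift).map { |i| data[i, win_size] }
end

# Textbook Hamming window: 0.54 - 0.46 * cos(2*pi*n / (N - 1)).
def hamming(segments)
  segments.map do |seg|
    n = seg.size
    seg.each_with_index.map do |x, i|
      x * (0.54 - 0.46 * Math.cos(2 * Math::PI * i / (n - 1)))
    end
  end
end

pcm_data = (1..12).to_a
segmented_data = segment(pcm_data, 4, 2)   # 5 windows of 4 samples
hamming_data = hamming(segmented_data)
p segmented_data
```

With 12 samples, a window of 4, and a shift of 2, the windows start at offsets 0, 2, 4, 6, and 8.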
92
|
+
== Advanced filter DSLs
|
93
|
+
For most things, the filter operator is simple, easy to remember, and
|
94
|
+
very concise. But sometimes you want to build more elaborate combinations
|
95
|
+
of filters and use them as if you had a single filter. In this case
|
96
|
+
making a new class for every possible combination creates an explosion
|
97
|
+
of new classes and a maintenance nightmare. Instead, there is a simple
|
98
|
+
graph notation you can use to combine filters. In the following example
|
99
|
+
we'll combine all the filters from a previous example and then use them
|
100
|
+
as if they were a single filter.
|
101
|
+
|
102
|
+
serial_filter = segmenter & hamming_filter & power_spec_filter & dct_filter
|
103
|
+
data >>= serial_filter
|
104
|
+
|
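A serial combination can be pictured as a list of filters applied in order. The sketch below is a minimal stand-in, loosely modeled on the fragments of serial_filter.rb visible in this diff; the `Chain`, `AddOne`, and `Double` classes are hypothetical, and the real filters presumably get `&` through a shared DSL module rather than a wrapper class.

```ruby
# Hypothetical stand-in for a serial filter; names are illustrative.
class Chain
  attr_reader :filters

  def initialize(filters)
    @filters = filters
  end

  # Feed the data through each filter in order.
  def <<(data)
    @filters.inject(data) { |d, f| f << d }
  end

  # Appending with & flattens nested chains into one list.
  def &(other)
    raise "Parameter does not respond to <<." unless other.respond_to?(:<<)
    extra = other.is_a?(Chain) ? other.filters : [other]
    Chain.new(@filters + extra)
  end
end

# Toy filters used only for this demonstration.
class AddOne
  def <<(data)
    data.map { |x| x + 1 }
  end
end

class Double
  def <<(data)
    data.map { |x| x * 2 }
  end
end

serial = Chain.new([AddOne.new]) & Double.new & AddOne.new
result = serial << [1, 2, 3]   # computes ((x + 1) * 2) + 1 per element
p result
```

The combined object responds to `<<` just like a single filter, which is what lets it be used on the right-hand side of the filter operator.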
105
|
+
It's also possible to take parallel data streams and pipe them through
|
106
|
+
parallel filters as if you had only one data stream and only one filter.
|
107
|
+
|
108
|
+
data = [stream_1,stream_2]
|
109
|
+
parallel_filter = filter_1 | filter_2
|
110
|
+
data >>= parallel_filter
|
111
|
+
|
112
|
+
It is not necessary for the data to be synchronous when using parallel filters.
|
113
|
+
When using parallel filters the number of elements going through one filter
|
114
|
+
does not have to equal the number of elements going through the second filter.
|
115
|
+
|
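The parallel case can be sketched the same way: stream i is handed to filter i, and nothing requires the streams to be the same length. This is an illustration under that assumption, not the gem's ParallelFilter; `Fanout` and `Scale` are hypothetical names.

```ruby
# Hypothetical stand-in for a parallel filter.
class Fanout
  def initialize(filters)
    @filters = filters
  end

  # data is an array of streams; stream i goes through filter i.
  def <<(data)
    @filters.each_with_index.map { |f, i| f << data[i] }
  end
end

# Toy filter that multiplies every element by a fixed factor.
class Scale
  def initialize(factor)
    @factor = factor
  end

  def <<(data)
    data.map { |x| x * @factor }
  end
end

stream_1 = [1, 2, 3]
stream_2 = [10, 20]            # a different length is fine
parallel = Fanout.new([Scale.new(2), Scale.new(3)])
out = parallel << [stream_1, stream_2]
p out
```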
116
|
+
You can see that you can make arbitrarily complicated graphs of filters by
|
117
|
+
combined use of the '&' and '|' operators. Almost identical notation is used
|
118
|
+
to specify graphs for context free grammars. Keep in mind that '&' takes
|
119
|
+
precedence over '|'. In the example below stream 1 goes through filter 1 and
|
120
|
+
filter 2 while stream 2 goes through filters 3, 4, and 5.
|
121
|
+
|
122
|
+
parallel_data = [stream_1,stream_2]
|
123
|
+
big_filter = filter_1 & filter_2 | filter_3 & filter_4 & filter_5
|
124
|
+
parallel_data >>= big_filter
|
125
|
+
|
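The precedence rule comes from Ruby itself: `&` binds tighter than `|`, and both are left associative. The small sketch below makes the grouping visible by recording it as a string; the `Node` class is a hypothetical helper and has nothing to do with the gem.

```ruby
# Records how Ruby groups & and | so the structure can be printed.
class Node
  attr_reader :label

  def initialize(label)
    @label = label
  end

  def &(other)
    Node.new("(#{label} & #{other.label})")
  end

  def |(other)
    Node.new("(#{label} | #{other.label})")
  end
end

f1, f2, f3, f4, f5 = %w[f1 f2 f3 f4 f5].map { |name| Node.new(name) }
graph = f1 & f2 | f3 & f4 & f5
puts graph.label   # prints ((f1 & f2) | ((f3 & f4) & f5))
```

If a different grouping is wanted, explicit parentheses override the default precedence in the usual way.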
126
|
+
== Command Line Utilities
|
127
|
+
|
128
|
+
The program nrec will process almost any format of audio file into speech
|
129
|
+
features and send the data to a cloud hosted speech recognizer. The resulting
|
130
|
+
transcript will be sent back and printed out. The nrec program uses whatever
|
131
|
+
version of Ruby is on the path of your current environment. It is compatible
|
132
|
+
with Ruby 1.9, Ruby 1.8.x, and JRuby. When run under JRuby it can
|
133
|
+
optionally use a Java implementation, which is very fast. See nrec --help for
|
134
|
+
more information.
|
135
|
+
|
136
|
+
== Assessing Performance for Wireless Devices
|
137
|
+
|
138
|
+
It's important to note that the performance characteristics of live data and
|
139
|
+
recorded data are different. Any delay experienced by a user starts from the
|
140
|
+
time they stop speaking. In contrast, any delay experienced when processing a
|
141
|
+
file starts from the time a file starts processing. For that reason file
|
142
|
+
processing always seems slower. Modern recognizers are easily capable of
|
143
|
+
exceeding real time performance so that is not a factor. The delay experienced
|
144
|
+
by a user is typically due to the time required to transmit the audio to the
|
145
|
+
recognizer and the time required to detect end of utterance, assuming end of
|
146
|
+
utterance detection is used.
|
147
|
+
|
148
|
+
If end of utterance detection is used the recognizer must wait until it has
|
149
|
+
sufficient evidence to be reasonably sure the user has stopped talking. This
|
150
|
+
could mean that a suitable period of silence has passed which means the user
|
151
|
+
incurs a slight but unavoidable delay. End of utterance detection also could
|
152
|
+
mean the grammar or language model does not allow for any other reasonable
|
153
|
+
possibility even if more data were available, which may mean no delay at all
|
154
|
+
(or even a negative delay in some cases).
|
155
|
+
|
156
|
+
If the bandwidth of the network is low enough, which is often the case for the
|
157
|
+
data channel of portable wireless handsets, it will take time for raw
|
158
|
+
uncompressed audio to traverse the network. By computing features on the
|
159
|
+
handset it is possible to have significant reduction in bandwidth requirements
|
160
|
+
eliminating much of the latency. These features in turn may then be compressed
|
161
|
+
for further bandwidth reduction. This method exceeds what is possible with
|
162
|
+
alternative methods of audio compression. Further, it eliminates many of the
|
163
|
+
distortion components that may compromise recognition accuracy.
|
164
|
+
|
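A back-of-the-envelope calculation shows the kind of saving being described. Every figure below is an illustrative assumption (sample rate, frame rate, coefficients per frame, and word size are not taken from the README), so treat the result as a rough order of magnitude before any further feature compression.

```ruby
# All figures are illustrative assumptions, not measurements.
sample_rate_hz   = 8_000   # assumed telephone-quality audio
bits_per_sample  = 16
frames_per_sec   = 100     # assumed 10 ms frame shift
coeffs_per_frame = 13      # assumed number of features per frame
bits_per_coeff   = 32      # uncompressed single-precision float

raw_bps     = sample_rate_hz * bits_per_sample
feature_bps = frames_per_sec * coeffs_per_frame * bits_per_coeff
ratio       = raw_bps.to_f / feature_bps

puts "raw audio: #{raw_bps} bit/s"
puts "features:  #{feature_bps} bit/s"
puts format("reduction: %.1fx", ratio)
```

Under these assumptions the uncompressed features need roughly a third of the bandwidth of the raw audio.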
165
|
+
If all you want is a rough feeling of how responsive speech recognition will be
|
166
|
+
over your network try speaking an utterance at the same time you enter a
|
167
|
+
command to have a prerecorded utterance recognized. You'll probably be
|
168
|
+
surprised by how quickly the network is able to respond. You may find that the
|
169
|
+
Java implementation feels like instant response even though it takes time for
|
170
|
+
the JVM to launch. Ruby 1.9 is actually surprisingly quick on a reasonably
|
171
|
+
powerful laptop.
|
data/bin/nrec
ADDED
@@ -0,0 +1,61 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
# vim: set filetype=ruby :
|
3
|
+
ROOT = File.dirname(File.dirname(__FILE__))
|
4
|
+
$: << "#{ROOT}/lib" << "#{ROOT}/ship"
|
5
|
+
|
6
|
+
require 'optparse'
|
7
|
+
options = {}
|
8
|
+
OptionParser.new do |opt|
|
9
|
+
opt.banner = 'Usage: nrec [options] file1 file2 ...'
|
10
|
+
opt.on '-v', '--verbose', 'Output more information' do
|
11
|
+
options[:verbose] = true
|
12
|
+
end
|
13
|
+
options[:implementation] = :ruby
|
14
|
+
opt.on '-j', '--java', 'Use java implementation' do
|
15
|
+
options[:implementation] = :java
|
16
|
+
end
|
17
|
+
opt.on( '-h', '--help', 'Display this screen' ) do
|
18
|
+
puts opt
|
19
|
+
exit
|
20
|
+
end
|
21
|
+
end.parse!
|
22
|
+
|
23
|
+
# Must set implementation specific library path before requiring libraries.
|
24
|
+
case options[:implementation]
|
25
|
+
when :java
|
26
|
+
if RUBY_PLATFORM != 'java'
|
27
|
+
puts "The Java implementation is not accessible from Ruby, only JRuby."
|
28
|
+
puts "You'll need to check your environment carefully. If you've"
|
29
|
+
puts "installed this gem under both ruby and jruby and both are in"
|
30
|
+
puts "your current environment you may have created a conflict."
|
31
|
+
puts "You must make sure the jruby path precedes the ruby path."
|
32
|
+
exit
|
33
|
+
end
|
34
|
+
puts "Using Java implementation" if options[:verbose]
|
35
|
+
require 'noyes_java'
|
36
|
+
include NoyesJava
|
37
|
+
when :ruby
|
38
|
+
if options[:verbose]
|
39
|
+
if RUBY_PLATFORM == 'java'
|
40
|
+
puts "Using pure ruby implementation under JRuby #{RUBY_VERSION}."
|
41
|
+
else
|
42
|
+
puts "Using pure ruby implementation under Ruby #{RUBY_VERSION}."
|
43
|
+
end
|
44
|
+
end
|
45
|
+
|
46
|
+
require 'noyes'
|
47
|
+
include Noyes
|
48
|
+
end
|
49
|
+
require 'socket'
|
50
|
+
|
51
|
+
def recognize file, node='174.129.244.159', port=2348
|
52
|
+
TCPSocket.open(node, port) do |client|
|
53
|
+
send_incremental_features file, client, client
|
54
|
+
end
|
55
|
+
end
|
56
|
+
|
57
|
+
ARGV.each do |file|
|
58
|
+
puts "recognizing file #{file}" if options[:verbose]
|
59
|
+
result = recognize file
|
60
|
+
puts "\n#{result}"
|
61
|
+
end
|
@@ -12,7 +12,7 @@ module NoyesFilterDSL
|
|
12
12
|
offset = -1
|
13
13
|
@filters.map {|f| f << data[offset+=1]}
|
14
14
|
end
|
15
|
-
def
|
15
|
+
def & other
|
16
16
|
raise "Parameter does not respond to <<." unless other.respond_to? :<<
|
17
17
|
if other.kind_of?(ParallelFilter) && filters.size != other.filters.size
|
18
18
|
raise "Parallel filters must have equal dimensions %d vs %d " %
|
data/lib/common/serial_filter.rb
CHANGED
@@ -8,7 +8,7 @@ module NoyesFilterDSL
|
|
8
8
|
@filters.each {|f| data >>= f}
|
9
9
|
data
|
10
10
|
end
|
11
|
-
def
|
11
|
+
def & other
|
12
12
|
raise "Parameter does not respond to <<." unless other.respond_to? :<<
|
13
13
|
if other.kind_of? SerialFilter
|
14
14
|
return SerialFilter.new(@filters.clone + other.filters.clone)
|
data/lib/common.rb
ADDED
@@ -0,0 +1,13 @@
|
|
1
|
+
require 'java_impl/java_filter'
|
2
|
+
|
3
|
+
module NoyesJava
|
4
|
+
class DCT
|
5
|
+
include JavaFilter
|
6
|
+
def initialize order, ncol
|
7
|
+
@filter = Java::talkhouse.DiscreteCosineTransform.new order, ncol
|
8
|
+
end
|
9
|
+
def melcos
|
10
|
+
@filter.melcos.map {|a|a.to_a}
|
11
|
+
end
|
12
|
+
end
|
13
|
+
end
|
@@ -0,0 +1,14 @@
|
|
1
|
+
require 'java_impl/java_filter'
|
2
|
+
|
3
|
+
module NoyesJava
|
4
|
+
class DoubleDeltaFilter
|
5
|
+
include JavaFilter
|
6
|
+
def initialize
|
7
|
+
@filter = Java::talkhouse.DoubleDeltaFilter.new
|
8
|
+
end
|
9
|
+
def final_estimate
|
10
|
+
x = @filter.final_estimate
|
11
|
+
x.map{|a|a.to_a}
|
12
|
+
end
|
13
|
+
end
|
14
|
+
end
|
File without changes
|
@@ -0,0 +1,15 @@
|
|
1
|
+
module NoyesJava
|
2
|
+
module JavaFilter
|
3
|
+
def << data
|
4
|
+
java_matrix = @filter.apply data.to_java Java::double[]
|
5
|
+
java_matrix.map {|java_array|java_array.to_a}
|
6
|
+
end
|
7
|
+
def self.ensure_jarray array
|
8
|
+
if array.respond_to? :each
|
9
|
+
array.to_java(Java::double[]).to_a
|
10
|
+
else
|
11
|
+
array
|
12
|
+
end
|
13
|
+
end
|
14
|
+
end
|
15
|
+
end
|
@@ -0,0 +1,28 @@
|
|
1
|
+
require 'java_impl/java_filter'
|
2
|
+
|
3
|
+
module NoyesJava
|
4
|
+
class MelFilter
|
5
|
+
include JavaFilter
|
6
|
+
def initialize srate, nfft, nfilt, lowerf, upperf
|
7
|
+
@filter = Java::talkhouse.MelFilter.new srate, nfft, nfilt, lowerf, upperf
|
8
|
+
end
|
9
|
+
def self.make_bank_parameters srate, nfft, nfilt, lowerf, upperf
|
10
|
+
parameters = Java::talkhouse.MelFilter.make_bank_parameters srate, nfft,
|
11
|
+
nfilt, lowerf, upperf
|
12
|
+
parameters.map {|array|array.to_a}
|
13
|
+
end
|
14
|
+
def self.make_filter left, center, right, init_freq, delta
|
15
|
+
filters = Java::talkhouse.MelFilter.make_filter left, center, right,
|
16
|
+
init_freq, delta
|
17
|
+
filters = filters.to_a
|
18
|
+
indefilters = filters.shift
|
19
|
+
[indefilters, filters]
|
20
|
+
end
|
21
|
+
def self.to_mel f
|
22
|
+
x = Java::talkhouse.MelFilter.mel JavaFilter.ensure_jarray f
|
23
|
+
end
|
24
|
+
def self.to_linear mel
|
25
|
+
Java::talkhouse.MelFilter.melinv JavaFilter.ensure_jarray mel
|
26
|
+
end
|
27
|
+
end
|
28
|
+
end
|
@@ -0,0 +1,11 @@
|
|
1
|
+
module NoyesJava
|
2
|
+
class Segmenter
|
3
|
+
def initialize win_size, win_shift
|
4
|
+
@filter = Java::talkhouse.Segmenter.new win_size, win_shift
|
5
|
+
end
|
6
|
+
def << data
|
7
|
+
java_matrix = @filter.apply data.to_java(:double)
|
8
|
+
java_matrix.map {|java_array|java_array.to_a} if java_matrix
|
9
|
+
end
|
10
|
+
end
|
11
|
+
end
|
data/lib/noyes.rb
CHANGED
@@ -1,15 +1,12 @@
|
|
1
|
-
require '
|
2
|
-
require '
|
3
|
-
require '
|
4
|
-
require '
|
5
|
-
require '
|
6
|
-
require '
|
7
|
-
require '
|
8
|
-
require '
|
9
|
-
require '
|
10
|
-
require '
|
11
|
-
require '
|
12
|
-
require '
|
13
|
-
require 'power_spec'
|
14
|
-
require 'preemphasis'
|
15
|
-
require 'segment'
|
1
|
+
require 'common'
|
2
|
+
require 'ruby_impl/dct'
|
3
|
+
require 'ruby_impl/delta'
|
4
|
+
require 'ruby_impl/filter'
|
5
|
+
require 'ruby_impl/mel_filter'
|
6
|
+
require 'ruby_impl/hamming_window'
|
7
|
+
require 'ruby_impl/log_compress'
|
8
|
+
require 'ruby_impl/live_cmn'
|
9
|
+
require 'ruby_impl/discrete_fourier_transform'
|
10
|
+
require 'ruby_impl/power_spec'
|
11
|
+
require 'ruby_impl/preemphasis'
|
12
|
+
require 'ruby_impl/segment'
|
data/lib/noyes_java.rb
ADDED
@@ -0,0 +1,14 @@
|
|
1
|
+
require 'common'
|
2
|
+
require 'java'
|
3
|
+
require 'java_impl/dct'
|
4
|
+
require 'java_impl/delta'
|
5
|
+
require 'java_impl/filter'
|
6
|
+
require 'java_impl/mel_filter'
|
7
|
+
require 'java_impl/hamming_window'
|
8
|
+
require 'java_impl/live_cmn'
|
9
|
+
require 'java_impl/log_compress'
|
10
|
+
require 'java_impl/discrete_fourier_transform'
|
11
|
+
require 'java_impl/power_spec'
|
12
|
+
require 'java_impl/preemphasis'
|
13
|
+
require 'java_impl/segment'
|
14
|
+
require 'noyes.jar'
|
data/lib/ruby_impl/dct.rb
CHANGED
@@ -4,31 +4,34 @@ module Noyes
|
|
4
4
|
include Math
|
5
5
|
# Takes the discrete Fourier transform.
|
6
6
|
def dft data,size
|
7
|
-
vals = Array.new(size)
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
7
|
+
vals = Array.new(size) do |i|
|
8
|
+
i < data.size ? Complex(data[i],0) : Complex(0,0)
|
9
|
+
end
|
10
|
+
j=0
|
11
|
+
size.times do |i|
|
12
|
+
vals[j],vals[i] = vals[i],vals[j] if i<j
|
13
|
+
m = size/2
|
14
|
+
while j>=m && m>1
|
15
|
+
j-=m
|
14
16
|
m/=2
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
30
|
-
|
31
|
-
|
32
|
-
|
17
|
+
end
|
18
|
+
j+=m
|
19
|
+
end
|
20
|
+
k=1
|
21
|
+
while k<size
|
22
|
+
incr = 2*k
|
23
|
+
mul = Complex.polar 1, Math::PI/k
|
24
|
+
w = Complex(1, 0)
|
25
|
+
k.times do |i|
|
26
|
+
i.step(size-1,incr) do |j|
|
27
|
+
tmp = w * vals[j+k]
|
28
|
+
vals[j+k],vals[j]=vals[j]-tmp,vals[j]+tmp
|
29
|
+
end
|
30
|
+
w *= mul;
|
31
|
+
end
|
32
|
+
k=incr
|
33
|
+
end
|
34
|
+
vals
|
33
35
|
end
|
36
|
+
module_function :dft
|
34
37
|
end
|
data/lib/ruby_impl/mel_filter.rb
CHANGED
data/lib/ruby_impl/power_spec.rb
CHANGED
@@ -1,17 +1,16 @@
|
|
1
|
-
require 'discrete_fourier_transform'
|
1
|
+
require 'ruby_impl/discrete_fourier_transform'
|
2
2
|
module Noyes
|
3
3
|
# The square of the DFT. You must specify the number of ffts. The power
|
4
4
|
# spectrum returns an array of arrays where each inner array is of length
|
5
5
|
# nfft/2 + 1. The length of the outer array does not change.
|
6
6
|
class PowerSpectrumFilter
|
7
|
-
include Noyes
|
8
7
|
def initialize nfft
|
9
8
|
@nfft = nfft
|
10
9
|
end
|
11
10
|
def << data
|
12
11
|
nuniqdftpts = @nfft/2 + 1
|
13
12
|
data.map do |datavec|
|
14
|
-
datavecfft = dft datavec, @nfft
|
13
|
+
datavecfft = Noyes.dft datavec, @nfft
|
15
14
|
Array.new(nuniqdftpts){|i| datavecfft[i].abs**2}
|
16
15
|
end
|
17
16
|
end
|
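A slow reference transform is a handy cross-check for the fast `dft` above and for the `nfft/2 + 1` bound used by the power spectrum. The sketch below is a naive O(n^2) DFT, not the gem's code. It uses a positive exponent because the twiddle factor `Complex.polar 1, Math::PI/k` in the hunk appears to do the same; for power values the sign makes no difference. The name `naive_dft` is hypothetical.

```ruby
# Naive O(n^2) DFT with zero padding, intended only as a reference.
def naive_dft(data, size)
  padded = Array.new(size) { |i| i < data.size ? data[i].to_f : 0.0 }
  Array.new(size) do |k|
    padded.each_with_index.inject(Complex(0, 0)) do |sum, (x, n)|
      sum + x * Complex.polar(1, 2 * Math::PI * k * n / size)
    end
  end
end

nfft = 8
signal = [1, 2, 3, 4]                 # zero padded to nfft samples
power = naive_dft(signal, nfft).map { |c| c.abs**2 }

# For real input, bin k and bin nfft - k have equal power, so only the
# first nfft/2 + 1 bins carry distinct information.
unique = power.first(nfft / 2 + 1)
p unique
```

Comparing `unique` against the output of the fast routine on the same input is a simple way to catch indexing mistakes in the butterfly loops.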
data/ship/noyes.jar
ADDED
Binary file
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: noyes
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.6.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Joe Woelfel
|
@@ -9,29 +9,43 @@ autorequire:
|
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
11
|
|
12
|
-
date: 2010-02-
|
12
|
+
date: 2010-02-25 00:00:00 -05:00
|
13
13
|
default_executable:
|
14
14
|
dependencies: []
|
15
15
|
|
16
|
-
description: Currently sufficient to create
|
16
|
+
description: Currently sufficient to create features for speech recognition
|
17
17
|
email: joe@talkhouse.com
|
18
18
|
executables:
|
19
|
-
- noyes_dump44k
|
20
|
-
- noyes_dump8k
|
21
|
-
-
|
19
|
+
- noyes_dump44k
|
20
|
+
- noyes_dump8k
|
21
|
+
- nrec
|
22
22
|
extensions: []
|
23
23
|
|
24
24
|
extra_rdoc_files:
|
25
25
|
- COPYING
|
26
|
+
- FAQ
|
26
27
|
- README
|
27
|
-
- doc/overview.rdoc
|
28
28
|
files:
|
29
|
+
- lib/common.rb
|
29
30
|
- lib/common/noyes_dsl.rb
|
30
31
|
- lib/common/noyes_math.rb
|
31
32
|
- lib/common/parallel_filter.rb
|
32
33
|
- lib/common/send_incrementally.rb
|
33
34
|
- lib/common/serial_filter.rb
|
35
|
+
- lib/java_impl/dct.rb
|
36
|
+
- lib/java_impl/delta.rb
|
37
|
+
- lib/java_impl/discrete_fourier_transform.rb
|
38
|
+
- lib/java_impl/filter.rb
|
39
|
+
- lib/java_impl/hamming_window.rb
|
40
|
+
- lib/java_impl/java_filter.rb
|
41
|
+
- lib/java_impl/live_cmn.rb
|
42
|
+
- lib/java_impl/log_compress.rb
|
43
|
+
- lib/java_impl/mel_filter.rb
|
44
|
+
- lib/java_impl/power_spec.rb
|
45
|
+
- lib/java_impl/preemphasis.rb
|
46
|
+
- lib/java_impl/segment.rb
|
34
47
|
- lib/noyes.rb
|
48
|
+
- lib/noyes_java.rb
|
35
49
|
- lib/ruby_impl/dct.rb
|
36
50
|
- lib/ruby_impl/delta.rb
|
37
51
|
- lib/ruby_impl/discrete_fourier_transform.rb
|
@@ -43,9 +57,10 @@ files:
|
|
43
57
|
- lib/ruby_impl/power_spec.rb
|
44
58
|
- lib/ruby_impl/preemphasis.rb
|
45
59
|
- lib/ruby_impl/segment.rb
|
60
|
+
- ship/noyes.jar
|
46
61
|
- COPYING
|
62
|
+
- FAQ
|
47
63
|
- README
|
48
|
-
- doc/overview.rdoc
|
49
64
|
has_rdoc: true
|
50
65
|
homepage: http://github.com/talkhouse/noyes
|
51
66
|
licenses: []
|
@@ -54,9 +69,8 @@ post_install_message:
|
|
54
69
|
rdoc_options:
|
55
70
|
- --charset=UTF-8
|
56
71
|
require_paths:
|
57
|
-
- lib/ruby_impl
|
58
|
-
- lib/common
|
59
72
|
- lib
|
73
|
+
- ship
|
60
74
|
required_ruby_version: !ruby/object:Gem::Requirement
|
61
75
|
requirements:
|
62
76
|
- - ">="
|
data/bin/recognize.sh
DELETED
@@ -1,15 +0,0 @@
|
|
1
|
-
#!/usr/bin/env jruby
|
2
|
-
# vim: set filetype=ruby :
|
3
|
-
ROOT = File.dirname(File.dirname(__FILE__))
|
4
|
-
$: << "#{ROOT}/lib/ruby"
|
5
|
-
$: << "#{ROOT}/lib/common"
|
6
|
-
require 'socket'
|
7
|
-
require 'send_incrementally'
|
8
|
-
|
9
|
-
def recognize file, node='localhost', port=2318
|
10
|
-
TCPSocket.open(node, port) do |client|
|
11
|
-
send_incremental_features file, client, client
|
12
|
-
end
|
13
|
-
end
|
14
|
-
|
15
|
-
puts recognize ARGV[0]
|
data/doc/overview.rdoc
DELETED
@@ -1,51 +0,0 @@
|
|
1
|
-
# = Overview
|
2
|
-
#
|
3
|
-
# All signal processing routines use a simple DSL style inteface. Below are
|
4
|
-
# some examples.
|
5
|
-
#
|
6
|
-
# == Filter operator example.
|
7
|
-
# Each example below is the data on the left being operated on by the filter on
|
8
|
-
# the right. This is similar to the way the += operator works for numbers. The
|
9
|
-
# data is not modified in place currently and it should probably stay that way.
|
10
|
-
# It could be if efficiency demanded it, but that would require a bit more care
|
11
|
-
# to avoid side effects when using the API. The >>= actually looks like a
|
12
|
-
# filter.
|
13
|
-
#
|
14
|
-
# data = (1..12).to_a
|
15
|
-
# segmenter = Segmenter.new 4, 2 # window size, window shift
|
16
|
-
# hamming_filter = HammingWindow.new 4 # window size
|
17
|
-
# power_spec_filter = PowerSpectrumFilter.new 8 # number of ffts
|
18
|
-
#
|
19
|
-
# data >>= segmenter
|
20
|
-
# data >>= hamming_filter
|
21
|
-
# data >>= power_spec_filter
|
22
|
-
# data >>= dct_filter
|
23
|
-
#
|
24
|
-
# You can expand the >>= operator out, but I think the flow is worse and there
|
25
|
-
# is more repetition, particularly when you have a lot of filters in sequence.
|
26
|
-
# This is perfectly valid syntax though. Also, this is very useful if you don't
|
27
|
-
# want to keep a reference to your original data.
|
28
|
-
#
|
29
|
-
# pcm_data = (1..12).to_a
|
30
|
-
# segmenter = Segmenter.new
|
31
|
-
# hamming_filter = HammingWindow.new 4
|
32
|
-
# segmented_data = segmenter << pcm_data, 4, 2
|
33
|
-
# hamming_data = hamming_filter << segmented_data
|
34
|
-
# power_spectrum data = power_spec_filter hamminging_data, 8
|
35
|
-
# dct_data = dct_filter << power_spectrum_data
|
36
|
-
#
|
37
|
-
# Here is an older version with function calls instead of operator overloading.
|
38
|
-
# The trouble with it is that the flow is hard to follow, and there is
|
39
|
-
# repetition. Filter and process are really synonyms. And this requires
|
40
|
-
# repeating the data component twice. Also, power spec is a function here
|
41
|
-
# with additional arguments. I think I'd rather have the configuration
|
42
|
-
# details, such as number of ffts all grouped at the top. It's easier to
|
43
|
-
# follow this way.
|
44
|
-
#
|
45
|
-
# data = (1..12).to_a
|
46
|
-
# seg = Segmenter.new
|
47
|
-
# ham = HammingWindow.new 4
|
48
|
-
# segments = segmenter.process data, 4, 2
|
49
|
-
# hamming_ = hamming_filter.process segments
|
50
|
-
# power = power_spec.filter hamming, 8
|
51
|
-
# dct = dct.process power
|
File without changes
|
File without changes
|