primord-tools 1.0.0 → 1.0.3

checksums.yaml CHANGED
@@ -1,15 +1,15 @@
1
1
  ---
2
2
  !binary "U0hBMQ==":
3
3
  metadata.gz: !binary |-
4
- MTdiZjAzZGNjOGYxNDJiNTY1OWFjNWFmOTdmMDhlNDk1NmQ5ZTQwZg==
4
+ YWVjZTlhNjQ4MDgyMjIxY2I5NWE3N2ZkNjIwZGVmZThhOGFjYjAzMQ==
5
5
  data.tar.gz: !binary |-
6
- OTdjYjY1YzNlZjg5YjM4ZmM1YjYzOTI0NGNkYmIyYmUyNzMzYWU2Nw==
6
+ MTMzZWIyMzRhYTVkYzFlMTAyZGU5ZWYyYWRhOGQ2MDcxZDBhN2U2NQ==
7
7
  SHA512:
8
8
  metadata.gz: !binary |-
9
- MjM4NWJmMGE3N2M1ODQ2MmExMmIwY2E1YjVjODdiYzZiN2E0NTg4Y2ZjZTA5
10
- ZjQ2NjNhNzU0NGU1OWQ3NmM2ZTY5ZTU5YjAzZWY0OTNhZWI1ZjdiOGY0ZDY5
11
- ODI2YTU2MzEyNjQ0NWMzMDc1NzBlMDkyMzk0NjZlMDAyMTA2NjY=
9
+ NWZiMmU4OTUzMmU2OTM1NDcyMWFiODg4YWNjM2E2NjY1YzAzNTU2Njk4MGFl
10
+ OTUwZWJjZjg4NTc4YTZjOTYxZDRmOTJiOGEyOTQ1YmY4YTY2ZDY2YTI1Y2Rj
11
+ YmJiMGU5ZmEwZDBkOTdhNGEyZDlkNjE0YTM5Njg4NjQxODg2OWM=
12
12
  data.tar.gz: !binary |-
13
- YzNkZjI4YWIyOTk2MTMxY2I0NzdlYTk2NzVjZGRiZjhhNzBmNjU1YWVlOTNj
14
- MmE1MzRlYjBlOWIwYTc0MTA4OTdlN2E3NWE2OWU1ZGM2M2M5ZDA3YzBlNDNh
15
- ZGZiOGEzOTY0YTA3NGNlYzEyZjBhYjUzYzZhZTdjZWUwODI4ZTY=
13
+ MmI3NmQ2MTVmN2RkMzA5YTdmOWEwY2UyZDRjMzEwMDY0ZDcwYmJjZWU4ODcx
14
+ YWY3MDMzNzFjMTE0NWYyNzU2N2QzNTM3MWZmODAxNjg5ZTU2YmI1ZmFmZTQ2
15
+ YzcwNWFlYzY4NDAyYjBjMmM5MjhkMjI4YjZlZDYxNTk0ZTkyNzI=
data/.gitignore CHANGED
@@ -1 +1,5 @@
1
1
  /pkg
2
+
3
+ # that is a symlink to the actual data
4
+ /stuff
5
+ .R*
data/README.md CHANGED
@@ -2,3 +2,65 @@ primord-tools
2
2
  =============
3
3
 
4
4
  Tools used to create metadata of primord rf data
5
+
6
+ Designed to analyze the three types of files generated
7
+ by the experiment: bgdist, pylog, and pyerr.
8
+
9
+ Usage
10
+ -----
11
+
12
+ The basic usage is `primord-metadata <directive> ...`
13
+
14
+ Supported directives are
15
+ * `metadata` - generate metadata
16
+ * `help` - print help message
17
+
18
+ Metadata
19
+ --------
20
+
21
+ The syntax for this directive is `primord-metadata metadata <outfile> <bgdist|pyerr|pylog> <param,param,...> <files>`
22
+
23
+ * `outfile` - file to write the output to in csv format
24
+ * `bgdist|pyerr|pylog` - the type of file to process (only one can be specified)
25
+ * `param,param,...` - the parameters to collect/compute (e.g. `file,lines,date` for the filename, number of lines, and date)
26
+ * `files` - a space-separated list of input files
27
+
28
+
29
+
30
+ Parameters
31
+ ----------
32
+
33
+ All files support these parameters
34
+ * `file` - name of the file
35
+ * `date` - date of data gathered
36
+ * `lines` - number of lines in the file
37
+
38
+ The following parameters are available for `bgdist`
39
+ * `has_multi` - whether the data has multi-frequency data
40
+ * `stat_sf_aic` - the Akaike Information Criterion for the single frequency model
41
+ * `stat_sf_intercept` - the count intercept used in the single frequency model
42
+ * `stat_mf_aic` - the Akaike Information Criterion for the multi frequency model
43
+ * `stat_mf_intercept` - the count intercept used in the multi frequency model
44
+ * `stat_sf_count` - number of single frequency events observed
45
+ * `stat_mf_count` - number of multi frequency events observed
46
+
47
+ The following parameters are available for `pylog`
48
+
49
+ The following parameters are available for `pyerr`
50
+
51
+
52
+ BGDist Model
53
+ ------------
54
+
55
+ The background noise should follow a Poisson distribution. This is no surprise, as
56
+ many random processes follow this distribution. Due to hardware limitations of the
57
+ windowed FFT as well as bandwidth limitations between the USRP2 and the computer
58
+ logging the data, signals below a threshold were not logged. This makes computing
59
+ the Poisson PDF for the data slightly more challenging. The method that is employed
60
+ is a zero-inflated GLM using a logarithmic link function.
61
+
62
+ In many cases the model fails for one of the following reasons
63
+ * insufficient data - if there are signals in only a couple of bins, the optimization fails
64
+ * other weird/unknown issues - I am investigating other sources of failure
65
+
66
+
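As a rough illustration of the model the README names, the following Ruby sketch writes out a zero-inflated Poisson probability mass function together with a log link. It is not the fit itself: in the script the fitting is delegated to R's `zeroinfl`. The function names `poisson_pmf`, `zip_pmf`, and `log_link_mean`, and the parameters `rate`, `pi_zero`, `intercept`, and `slope`, are assumptions made up for this example.

```ruby
# Plain Poisson mass function: rate^k * e^(-rate) / k!
# Built up iteratively so large k does not need a huge factorial.
def poisson_pmf(k, rate)
  prob = Math.exp(-rate)
  (1..k).each { |i| prob *= rate / i }
  prob
end

# Zero-inflated Poisson: with probability pi_zero the count is a
# "structural" zero, otherwise it is drawn from Poisson(rate).
def zip_pmf(k, rate, pi_zero)
  base = (1.0 - pi_zero) * poisson_pmf(k, rate)
  k.zero? ? pi_zero + base : base
end

# Log link: the Poisson mean is the exponential of a linear predictor.
# intercept and slope are hypothetical stand-ins for fitted coefficients.
def log_link_mean(intercept, slope, bin)
  Math.exp(intercept + slope * bin)
end

# The probabilities over all counts should sum to (approximately) 1.
total = (0..100).sum { |k| zip_pmf(k, 2.0, 0.3) }
puts total.round(6)
```

The extra `pi_zero` mass at zero is what separates this from an ordinary Poisson model. Whether it fully captures the thresholding described in the README is not something this sketch tries to settle.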
data/Rakefile CHANGED
@@ -13,7 +13,7 @@ spec = Gem::Specification.new do |gem|
13
13
  gem.summary = "Creates metadata for primord rf data"
14
14
  gem.description = "Process the logs from gnuradio and generates metadata"
15
15
 
16
- gem.files = `git ls-files`.split($\)
16
+ gem.files = `git ls-files | grep -E '^tmp' -v`.split($\)
17
17
 
18
18
  gem.executables = ["primord-metadata"]
19
19
  end
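The `grep -E '^tmp' -v` change above could also be done in Ruby itself, without a second process. The sketch below is one possible way. `without_tmp` is a hypothetical helper that is not part of the gem, and the sample paths other than `README.md` and `bin/primord-metadata` are invented for the demonstration.

```ruby
# Hypothetical helper: drop every path that starts with "tmp", which is
# what `grep -E '^tmp' -v` does to each line of the `git ls-files` output.
def without_tmp(paths)
  paths.reject { |path| path.start_with?('tmp') }
end

sample = ['README.md', 'tmp/cache.csv', 'bin/primord-metadata', 'tmpfile', 'lib/tmp.rb']
puts without_tmp(sample).inspect
```

In the gemspec this would presumably be used as `gem.files = without_tmp(`git ls-files`.split($\))`. Note that, like the grep pattern, it also drops top-level names such as `tmpfile`, not only the `tmp/` directory.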
data/TODO.md ADDED
@@ -0,0 +1,25 @@
1
+ Main part
2
+ * a good README
3
+ * a better help message
4
+ * option parsing
5
+ * pyerr/pylog - parse the logs and find errors
6
+ * cross-correlate the time of errors with bgdist data
7
+ * bgdist - get the threshold
8
+ * bgdist - aggregate data from short time periods together
9
+ * bgdist - find data duration
10
+
11
+ On the R side of things
12
+ * fit poisson to data [here](http://stats.stackexchange.com/questions/70558/diagnostic-plots-for-count-regression)
13
+ and [here](http://www.ats.ucla.edu/stat/r/dae/zipoisson.htm)
14
+ * get it working on windows
15
+ * caching of computed statistics to make re-running it on largish subsets of the data faster
16
+ * bgdist - get mu and lambda
17
+
18
+ Package Issues
19
+ * Fix up the rake file
20
+ * rinruby is a pain so do something about it
21
+
22
+ Wishlist
23
+ * get data in a time line
24
+ * make a nice web ui with an interactive timeline
25
+ * make the timeline pinable
@@ -1,101 +1,207 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
- # TODO write to output file
4
- # TODO tabs vs space consistency
5
- # TODO other useful things
3
+ # TODO proper option parsing
4
+ # TODO this thing is turning into a beast
5
+ # - split into multiple files
6
+ # - modularize the code
6
7
 
7
- module Utils
8
- def self.print_usage
9
- puts "./metadaer.rb <pyerr|pylog|bgdist> <param,param,...> <files>"
10
- end
11
- def self.count_lines file
12
- lines = 0
13
- file.each_line { |l| lines += 1 }
14
- file.rewind
15
- lines
16
- end
17
- def self.extract_date str
18
- a = str.match /.*([0-9]{4})([0-9]{2})([0-9]{2})/
19
- "#{a[2]}-#{a[3]}-#{a[1]}"
20
- end
21
- end # class Stuff
8
+ # Wonky bug in rinruby (see http://hfeild-software.blogspot.com/2013/01/rinruby-woes.html)
9
+ # I will send a PR to get it fixed sometime in the future
10
+ R = ""
11
+
12
+ # TODO fix this in the Rakefile
13
+ $:.unshift File.dirname(__FILE__) + '/../lib/'
14
+ require 'utils'
22
15
 
23
- # classes for each type of file we want to
24
- # work with
16
+ # classes for each type of file we want to work with
25
17
  class DataFile
26
18
  @@params
19
+ @@outfile
27
20
  def initialize filename
21
+ @filename = filename
28
22
  @data = {}
29
- @data['file'] = filename
30
- @file = File.open filename, 'r'
31
-
23
+ @lines = IO.readlines @filename
24
+
25
+ # generic stuff that all files can have
26
+ def self.data_file
27
+ @data['file'] = @filename
28
+ end
29
+ def self.data_lines
30
+ @data['lines'] = @lines.length
31
+ end
32
+ def self.data_date
33
+ @data['date'] = Utils::extract_date @filename
34
+ end
35
+
36
+ # all the computation happens here
32
37
  def self.write
38
+ # fill out @data
39
+ @@params.each do |param|
40
+ self.send "data_#{param}"
41
+ end
42
+ # write everything out to stdout
33
43
  str = ""
34
44
  @@params.each do |param|
35
45
  str += "#{@data[param]},"
36
46
  end
37
- puts str[0..-2]
47
+ @@outfile.puts str[0..-2]
38
48
  end
39
49
  end
40
- def self.format= params
50
+ # set @@params and write the csv header
51
+ def self.format params, outfile
52
+ @@outfile = outfile
41
53
  @@params = params
42
54
  str = ""
43
55
  @@params.each do |param|
44
56
  str += "#{param},"
45
57
  end
46
- puts str[0..-2]
58
+ @@outfile.puts str[0..-2]
59
+
60
+ # check if we need to do the statistical modeling
61
+ @@do_stats = false
62
+ @@params.each do |param|
63
+ if param =~ /^stat_/
64
+ @@do_stats = true
65
+ require 'rinruby'
66
+ # NOTE not even close to threadsafe
67
+ @@R = RinRuby.new :interactive => false, :echo => true, :executable => "'#{`which R`.chomp.chomp}'"
68
+ @@R.eval 'library(pscl)'
69
+ break
70
+ end
71
+ end
47
72
  end
48
73
  end
49
74
 
50
75
  class Pyerr < DataFile
51
76
  def initialize filename
52
77
  super
53
- # count the lines
54
- @data['lines'] = Utils::count_lines @file
55
- # get the date
56
- @data['date'] = Utils::extract_date filename
78
+ # nothing special
57
79
  end
58
80
  end # class DataFile
59
81
 
60
82
  class Pylog < DataFile
61
83
  def initialize filename
62
84
  super
63
- # count the lines
64
- @data['lines'] = Utils::count_lines @file
65
- # get the date
66
- @data['date'] = Utils::extract_date filename
85
+ # nothing special
67
86
  end
68
87
  end # class Pylog
69
88
 
70
89
  class Bgdist < DataFile
71
90
  def initialize filename
72
91
  super
73
- # count the lines
74
- @data['lines'] = Utils::count_lines @file
75
- # get the date
76
- @data['date'] = Utils::extract_date filename
92
+ def self.data_has_multi
93
+ if @lines[0] =~ /bin,avg_amp,N_avg_amp,,freq,N_freq/
94
+ @data['has_multi'] = false
95
+ elsif @lines[0] =~ /bin,avg_amp_sf,N_avg_amp_sf,N_avg_amp_mf,,freq,N_freq_sf,N_freq_mf/
96
+ @data['has_multi'] = true
97
+ else
98
+ abort "\nerror in comprehending bgdist header: #{@lines[0]}\n"
99
+ end
100
+ end
101
+ if @@do_stats
102
+ rcode = <<EOF
103
+ print_log <- function (what, msg)
104
+ {
105
+ print(paste("#{@filename}:", what, " -> ", msg), stdout());
106
+ }
107
+
108
+ D <- read.csv('#{@filename}')
109
+ D <- D[1:length(D$bin)-1,]
110
+ EOF
111
+ self.data_has_multi
112
+ if @data['has_multi']
113
+ rcode += <<EOF
114
+ mod.sf <- tryCatch({
115
+ zeroinfl(D$N_avg_amp_sf ~ bin, data=D, dist="poisson")
116
+ }, warning = function(w) {
117
+ print_log("fitting single frequency", w)
118
+ }, error = function(e) {
119
+ print_log("fitting single frequency", e)
120
+ }, finally = {})
121
+ mod.sf_count = sum(D$N_avg_amp_sf)
122
+
123
+ mod.mf <- tryCatch({
124
+ zeroinfl(D$N_avg_amp_mf ~ bin, data=D, dist="poisson")
125
+ }, warning = function(w) {
126
+ print_log("fitting multi frequency", w)
127
+ }, error = function(e) {
128
+ print_log("fitting multi frequency", e)
129
+ }, finally = {})
130
+ mod.mf_count = sum(D$N_avg_amp_mf)
131
+ EOF
132
+ else # single frequency
133
+ rcode += <<EOF
134
+ mod.sf <- tryCatch({
135
+ zeroinfl(D$N_avg_amp ~ bin, data=D, dist="poisson")
136
+ }, warning = function(w) {
137
+ print_log("fitting single frequency", w)
138
+ }, error = function(e) {
139
+ print_log("fitting single frequency", e)
140
+ }, finally = {})
141
+ mod.sf_count = sum(D$N_avg_amp)
142
+
143
+ mod.mf = NA
144
+ mod.mf_count = as.integer(0)
145
+ EOF
146
+ end # end of single or multifrequency block
147
+ # rinruby does not support the logical type, thus we use NaN which is a double
148
+ rcode += <<EOF
149
+ if (typeof(mod.sf) == "list") {
150
+ mod.sf_aic = as.double(AIC(mod.sf))
151
+ mod.sf_intercept = mod.sf$coefficients$count[1]
152
+ } else {
153
+ mod.sf_aic = NaN
154
+ mod.sf_intercept = NaN
155
+ }
156
+ if (typeof(mod.mf) == "list") {
157
+ mod.mf_aic = as.double(AIC(mod.mf))
158
+ mod.mf_intercept = mod.mf$coefficients$count[1]
159
+ } else {
160
+ mod.mf_aic = NaN
161
+ mod.mf_intercept = NaN
162
+ }
163
+ EOF
164
+ @@R.eval rcode
165
+ end
166
+ # ruby metaprogramming :P
167
+ def self.method_missing m, *args, &block
168
+ if m =~ /^data_(stat_)(.*)/
169
+ @data["#{$1}#{$2}"] = @@R.send "mod.#{$2}"
170
+ if @data["#{$1}#{$2}"].respond_to? :nan? and @data["#{$1}#{$2}"].nan?
171
+ @data["#{$1}#{$2}"] = 'NA'
172
+ end
173
+ else
174
+ abort "your missing your method named #{m}, you should go find it\n"
175
+ end
176
+ end
77
177
  end
78
178
  end # class BGdist
79
179
 
80
- # main
81
180
 
181
+
182
+ # main
82
183
  directive = ARGV.shift
83
184
  case directive
84
185
  when 'help'
85
186
  Utils::print_usage
86
187
  exit 0
87
- else
188
+ when 'metadata'
189
+ outfile = ARGV.shift
190
+ clas = Object.const_get ARGV.shift.capitalize
88
191
  params = []
89
192
  ARGV.shift.gsub(/([^,]+)/){ |p| params << p }
90
- DataFile::format = params
91
- clas = Object.const_get directive.capitalize
193
+ DataFile::format params, (File.open outfile, 'w')
194
+ # This is the super fancy option that I will work on this weekend
195
+ # I want to create a one page web app that is centered on a timeline
196
+ #
197
+ when 'timeserve'
198
+
199
+
200
+ else
201
+ puts 'unknown directive'
92
202
  end
93
203
 
94
- objs = []
95
204
  ARGV.each do |filename|
96
- objs << (clas.new filename)
97
- end
98
-
99
- objs.each do |obj|
205
+ obj = clas.new filename
100
206
  obj.write
101
207
  end
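The script above turns each requested parameter into a call to a `data_<param>` method and writes the results as one comma-separated row. A stripped-down sketch of that dispatch pattern follows. `SketchFile` and its `row` method are hypothetical stand-ins rather than the gem's `DataFile`, and the row is built with `join` instead of trimming a trailing comma.

```ruby
# Hypothetical stand-in for DataFile: each parameter name maps to a
# data_<param> method, and a row is the comma-joined results.
class SketchFile
  def initialize(filename, lines)
    @filename = filename
    @lines = lines
  end

  def data_file
    @filename
  end

  def data_lines
    @lines.length
  end

  # Build one CSV row by dispatching on each requested parameter name.
  def row(params)
    params.map { |param| public_send("data_#{param}") }.join(',')
  end
end

# Splitting on commas and dropping empty pieces collects the same names
# as the gsub-with-block idiom used in the script.
params = 'file,lines'.split(',').reject(&:empty?)
puts params.join(',')
puts SketchFile.new('example.csv', %w[a b c]).row(params)
```

With `public_send`, an unknown parameter name raises `NoMethodError` unless the class defines `method_missing`, which is the hook the `Bgdist` class uses for its `stat_` parameters.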
@@ -0,0 +1,10 @@
1
+ module Utils
2
+ def self.print_usage
3
+ puts "primord-metadata <directive> ...\n\tSee https://github.com/floomby/primord-tools for more info"
4
+ end
5
+ # get date the file was created from the filename
6
+ def self.extract_date str
7
+ a = str.match /.*([0-9]{4})([0-9]{2})([0-9]{2})/
8
+ "#{a[2]}-#{a[3]}-#{a[1]}"
9
+ end
10
+ end # module Utils
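`Utils.extract_date` assumes the filename contains a `YYYYMMDD` run. When it does not, `match` returns `nil` and the interpolation raises. A hedged variant that returns `nil` instead might look like the sketch below. `extract_date_or_nil` and both sample filenames are hypothetical.

```ruby
# Hypothetical variant of Utils.extract_date: same regex and same
# MM-DD-YYYY output, but nil when the name holds no YYYYMMDD run.
def extract_date_or_nil(str)
  m = str.match(/.*([0-9]{4})([0-9]{2})([0-9]{2})/)
  m && "#{m[2]}-#{m[3]}-#{m[1]}"
end

puts extract_date_or_nil('bgdist_20140618.csv')   # prints 06-18-2014
p extract_date_or_nil('notes.txt')                # prints nil
```

Because the leading `.*` is greedy, the last eight-digit run in the name is the one that gets parsed.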
@@ -0,0 +1,6 @@
1
+ #!/bin/bash
2
+
3
+ # setup the environment needed to use R on windows
4
+ # under cygwin with the rinruby gem
5
+
6
+ export PATH=$PATH:/cygdrive/c/Program\ Files/R/R-2.15.2/bin/x64
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: primord-tools
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.0.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - ! 'floomby
@@ -10,7 +10,7 @@ authors:
10
10
  autorequire:
11
11
  bindir: bin
12
12
  cert_chain: []
13
- date: 2014-06-18 00:00:00.000000000 Z
13
+ date: 2014-07-03 00:00:00.000000000 Z
14
14
  dependencies: []
15
15
  description: Process the logs from gnuradio and generates metadata
16
16
  email: ! 'floomby@nmt.edu
@@ -24,8 +24,10 @@ files:
24
24
  - .gitignore
25
25
  - README.md
26
26
  - Rakefile
27
- - TODO
27
+ - TODO.md
28
28
  - bin/primord-metadata
29
+ - lib/utils.rb
30
+ - setup-env.sh
29
31
  homepage: https://github.com/floomby/gitver
30
32
  licenses: []
31
33
  metadata: {}
data/TODO DELETED
@@ -1,2 +0,0 @@
1
- 1) a good README
2
- 2) a better help message