primord-tools 1.0.0 → 1.0.3

checksums.yaml CHANGED
@@ -1,15 +1,15 @@
  ---
  !binary "U0hBMQ==":
    metadata.gz: !binary |-
-     MTdiZjAzZGNjOGYxNDJiNTY1OWFjNWFmOTdmMDhlNDk1NmQ5ZTQwZg==
+     YWVjZTlhNjQ4MDgyMjIxY2I5NWE3N2ZkNjIwZGVmZThhOGFjYjAzMQ==
    data.tar.gz: !binary |-
-     OTdjYjY1YzNlZjg5YjM4ZmM1YjYzOTI0NGNkYmIyYmUyNzMzYWU2Nw==
+     MTMzZWIyMzRhYTVkYzFlMTAyZGU5ZWYyYWRhOGQ2MDcxZDBhN2U2NQ==
  SHA512:
    metadata.gz: !binary |-
-     MjM4NWJmMGE3N2M1ODQ2MmExMmIwY2E1YjVjODdiYzZiN2E0NTg4Y2ZjZTA5
-     ZjQ2NjNhNzU0NGU1OWQ3NmM2ZTY5ZTU5YjAzZWY0OTNhZWI1ZjdiOGY0ZDY5
-     ODI2YTU2MzEyNjQ0NWMzMDc1NzBlMDkyMzk0NjZlMDAyMTA2NjY=
+     NWZiMmU4OTUzMmU2OTM1NDcyMWFiODg4YWNjM2E2NjY1YzAzNTU2Njk4MGFl
+     OTUwZWJjZjg4NTc4YTZjOTYxZDRmOTJiOGEyOTQ1YmY4YTY2ZDY2YTI1Y2Rj
+     YmJiMGU5ZmEwZDBkOTdhNGEyZDlkNjE0YTM5Njg4NjQxODg2OWM=
    data.tar.gz: !binary |-
-     YzNkZjI4YWIyOTk2MTMxY2I0NzdlYTk2NzVjZGRiZjhhNzBmNjU1YWVlOTNj
-     MmE1MzRlYjBlOWIwYTc0MTA4OTdlN2E3NWE2OWU1ZGM2M2M5ZDA3YzBlNDNh
-     ZGZiOGEzOTY0YTA3NGNlYzEyZjBhYjUzYzZhZTdjZWUwODI4ZTY=
+     MmI3NmQ2MTVmN2RkMzA5YTdmOWEwY2UyZDRjMzEwMDY0ZDcwYmJjZWU4ODcx
+     YWY3MDMzNzFjMTE0NWYyNzU2N2QzNTM3MWZmODAxNjg5ZTU2YmI1ZmFmZTQ2
+     YzcwNWFlYzY4NDAyYjBjMmM5MjhkMjI4YjZlZDYxNTk0ZTkyNzI=
data/.gitignore CHANGED
@@ -1 +1,5 @@
  /pkg
+
+ # that is a symlink to the actual data
+ /stuff
+ .R*
data/README.md CHANGED
@@ -2,3 +2,65 @@ primord-tools
  =============

  Tools used to create metadata of primord rf data
+
+ Designed to analyze the three types of files generated
+ by the experiment: bgdist, pylog, and pyerr.
+
+ Usage
+ -----
+
+ The basic usage is `primord-metadata <directive> ...`
+
+ Supported directives are
+ * `metadata` - generate metadata
+ * `help` - print help message
+
+ Metadata
+ --------
+
+ The syntax for this directive is `primord-metadata metadata <outfile> <bgdist|pyerr|pylog> <param,param,...> <files>`
+
+ * `outfile` - file to write the output to in csv format
+ * `bgdist|pyerr|pylog` - the type of file to process (only one can be specified)
+ * `param,param,...` - the parameters to collect/compute (e.g. `file,lines,date` for the filename, number of lines, and date)
+ * `files` - a space-separated list of input files
+
+
+
+ Parameters
+ ----------
+
+ All files support these parameters
+ * `file` - name of the file
+ * `date` - date the data was gathered
+ * `lines` - number of lines in the file
+
+ The following parameters are available for `bgdist`
+ * `has_multi` - whether the data has multi-frequency data
+ * `stat_sf_aic` - the Akaike Information Criterion for the single frequency model
+ * `stat_sf_intercept` - the count intercept used in the single frequency model
+ * `stat_mf_aic` - the Akaike Information Criterion for the multi frequency model
+ * `stat_mf_intercept` - the count intercept used in the multi frequency model
+ * `stat_sf_count` - number of single frequency events observed
+ * `stat_mf_count` - number of multi frequency events observed
+
+ The following parameters are available for `pylog`
+
+ The following parameters are available for `pyerr`
+
+
+ BGDist Model
+ ------------
+
+ The background noise should follow a Poisson distribution. This is no surprise, as
+ many random processes follow this distribution. Due to hardware limitations of the
+ windowed FFT as well as bandwidth limitations between the USRP2 and the computer
+ logging the data, signals below a threshold were not logged. This makes computing
+ the Poisson PDF for the data slightly more challenging. The method employed
+ is a zero-inflated GLM using a logarithmic link function.
+
+ In many cases the model fails for one of the following reasons
+ * insufficient data - if there are only signals in a couple of bins the optimization fails
+ * other weird/unknown issues - I am investigating other sources of failure
+
+
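Review note on the "BGDist Model" README section: the package itself fits the model in R (the script calls `zeroinfl` from `pscl`), so the following is only a minimal Ruby sketch of what a zero-inflated Poisson distribution looks like, not the package's code. The names `poisson_pmf`, `zip_pmf`, `pi0`, and `rate` are hypothetical and chosen for illustration.

```ruby
# Probability of observing k events under a plain Poisson model with mean `rate`.
def poisson_pmf(k, rate)
  factorial = (1..k).reduce(1, :*)   # k!, with 0! == 1
  (rate**k) * Math.exp(-rate) / factorial
end

# Zero-inflated Poisson: with probability pi0 the count is a structural zero,
# otherwise it is drawn from a Poisson distribution with mean `rate`.
def zip_pmf(k, pi0, rate)
  base = (1.0 - pi0) * poisson_pmf(k, rate)
  k.zero? ? pi0 + base : base
end
```

The only difference from a plain Poisson PMF is the extra mass `pi0` placed on zero; every other count is scaled by `1 - pi0`, so the probabilities still sum to one.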
data/Rakefile CHANGED
@@ -13,7 +13,7 @@ spec = Gem::Specification.new do |gem|
    gem.summary = "Creates metadata for primord rf data"
    gem.description = "Process the logs from gnuradio and generates metadata"

-   gem.files = `git ls-files`.split($\)
+   gem.files = `git ls-files | grep -E '^tmp' -v`.split($\)

    gem.executables = ["primord-metadata"]
  end
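Review note on the `gem.files` change: the new line shells out to `grep -E '^tmp' -v` to drop paths beginning with `tmp`. A possible alternative, shown here only as a sketch, is to do the same filtering in Ruby so the Rakefile does not depend on `grep` being available (which may matter for the Windows goal in TODO.md). The `tracked` array is a hypothetical stand-in for the output of `git ls-files`.

```ruby
# Hypothetical file list standing in for `git ls-files`.split($\).
tracked = [
  "README.md",
  "Rakefile",
  "bin/primord-metadata",
  "tmp/scratch.csv",
  "tmpfile.txt"
]

# Drop every path that starts with "tmp", mirroring grep -E '^tmp' -v.
packaged = tracked.reject { |path| path.start_with?("tmp") }
```

Note that, like the `grep` pattern, this also excludes a top-level file such as `tmpfile.txt`, not just the `tmp/` directory.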
data/TODO.md ADDED
@@ -0,0 +1,25 @@
+ Main part
+ * a good README
+ * a better help message
+ * option parsing
+ * pyerr/pylog - parse the logs and find errors
+ * cross-correlate time of error to bgdist data
+ * bgdist - get the threshold
+ * bgdist - aggregate data in short time periods together
+ * bgdist - find data duration
+
+ On the R side of things
+ * fit poisson to data [here](http://stats.stackexchange.com/questions/70558/diagnostic-plots-for-count-regression)
+ and [here](http://www.ats.ucla.edu/stat/r/dae/zipoisson.htm)
+ * get it working on windows
+ * caching of computed statistics to make re-running it on largish subsets of the data faster
+ * bgdist - get mu and lambda
+
+ Package Issues
+ * Fix up the rake file
+ * rinruby is a pain so do something about it
+
+ Wishlist
+ * get data in a time line
+ * make a nice web ui with an interactive timeline
+ * make the timeline pinable
data/bin/primord-metadata CHANGED
@@ -1,101 +1,207 @@
  #!/usr/bin/env ruby

- # TODO write to output file
- # TODO tabs vs space consistency
- # TODO other useful things
+ # TODO proper option parsing
+ # TODO this thing is turning into a beast
+ #   - split into multiple files
+ #   - modularize the code

- module Utils
-   def self.print_usage
-     puts "./metadaer.rb <pyerr|pylog|bgdist> <param,param,...> <files>"
-   end
-   def self.count_lines file
-     lines = 0
-     file.each_line { |l| lines += 1 }
-     file.rewind
-     lines
-   end
-   def self.extract_date str
-     a = str.match /.*([0-9]{4})([0-9]{2})([0-9]{2})/
-     "#{a[2]}-#{a[3]}-#{a[1]}"
-   end
- end # class Stuff
+ # Wonky bug in rinruby (see http://hfeild-software.blogspot.com/2013/01/rinruby-woes.html)
+ # I will send a PR to get it fixed sometime in the future
+ R = ""
+
+ # TODO fix this in the Rakefile
+ $:.unshift File.dirname(__FILE__) + '/../lib/'
+ require 'utils'

- # classes for each type of file we want to
- # work with
+ # classes for each type of file we want to work with
  class DataFile
    @@params
+   @@outfile
    def initialize filename
+     @filename = filename
      @data = {}
-     @data['file'] = filename
-     @file = File.open filename, 'r'
-
+     @lines = IO.readlines @filename
+
+     # generic stuff that all files can have
+     def self.data_file
+       @data['file'] = @filename
+     end
+     def self.data_lines
+       @data['lines'] = @lines.length
+     end
+     def self.data_date
+       @data['date'] = Utils::extract_date @filename
+     end
+
+     # all the computation happens here
      def self.write
+       # fill out @data
+       @@params.each do |param|
+         self.send "data_#{param}"
+       end
+       # write everything out to the output file
        str = ""
        @@params.each do |param|
          str += "#{@data[param]},"
        end
-       puts str[0..-2]
+       @@outfile.puts str[0..-2]
      end
    end
-   def self.format= params
+   # set @@params and write the csv header
+   def self.format params, outfile
+     @@outfile = outfile
      @@params = params
      str = ""
      @@params.each do |param|
        str += "#{param},"
      end
-     puts str[0..-2]
+     @@outfile.puts str[0..-2]
+
+     # check if we need to do the statistical modeling
+     @@do_stats = false
+     @@params.each do |param|
+       if param =~ /^stat_/
+         @@do_stats = true
+         require 'rinruby'
+         # NOTE not even close to threadsafe
+         @@R = RinRuby.new :interactive => false, :echo => true, :executable => "'#{`which R`.chomp.chomp}'"
+         @@R.eval 'library(pscl)'
+         break
+       end
+     end
    end
  end

  class Pyerr < DataFile
    def initialize filename
      super
-     # count the lines
-     @data['lines'] = Utils::count_lines @file
-     # get the date
-     @data['date'] = Utils::extract_date filename
+     # nothing special
    end
  end # class Pyerr

  class Pylog < DataFile
    def initialize filename
      super
-     # count the lines
-     @data['lines'] = Utils::count_lines @file
-     # get the date
-     @data['date'] = Utils::extract_date filename
+     # nothing special
    end
  end # class Pylog

  class Bgdist < DataFile
    def initialize filename
      super
-     # count the lines
-     @data['lines'] = Utils::count_lines @file
-     # get the date
-     @data['date'] = Utils::extract_date filename
+     def self.data_has_multi
+       if @lines[0] =~ /bin,avg_amp,N_avg_amp,,freq,N_freq/
+         @data['has_multi'] = false
+       elsif @lines[0] =~ /bin,avg_amp_sf,N_avg_amp_sf,N_avg_amp_mf,,freq,N_freq_sf,N_freq_mf/
+         @data['has_multi'] = true
+       else
+         abort "\nerror in comprehending bgdist header: #{@lines[0]}\n"
+       end
+     end
+     if @@do_stats
+       rcode = <<EOF
+ print_log <- function (what, msg)
+ {
+   print(paste("#{@filename}:", what, " -> ", msg), stdout());
+ }
+
+ D <- read.csv('#{@filename}')
+ D <- D[1:length(D$bin)-1,]
+ EOF
+       self.data_has_multi
+       if @data['has_multi']
+         rcode += <<EOF
+ mod.sf <- tryCatch({
+   zeroinfl(D$N_avg_amp_sf ~ bin, data=D, dist="poisson")
+ }, warning = function(w) {
+   print_log("fitting single frequency", w)
+ }, error = function(e) {
+   print_log("fitting single frequency", e)
+ }, finally = {})
+ mod.sf_count = sum(D$N_avg_amp_sf)
+
+ mod.mf <- tryCatch({
+   zeroinfl(D$N_avg_amp_mf ~ bin, data=D, dist="poisson")
+ }, warning = function(w) {
+   print_log("fitting multi frequency", w)
+ }, error = function(e) {
+   print_log("fitting multi frequency", e)
+ }, finally = {})
+ mod.mf_count = sum(D$N_avg_amp_mf)
+ EOF
+       else # single frequency
+         rcode += <<EOF
+ mod.sf <- tryCatch({
+   zeroinfl(D$N_avg_amp ~ bin, data=D, dist="poisson")
+ }, warning = function(w) {
+   print_log("fitting single frequency", w)
+ }, error = function(e) {
+   print_log("fitting single frequency", e)
+ }, finally = {})
+ mod.sf_count = sum(D$N_avg_amp)
+
+ mod.mf = NA
+ mod.mf_count = as.integer(0)
+ EOF
+       end # end of single or multifrequency block
+       # rinruby does not support the logical type, thus we use NaN which is a double
+       rcode += <<EOF
+ if (typeof(mod.sf) == "list") {
+   mod.sf_aic = as.double(AIC(mod.sf))
+   mod.sf_intercept = mod.sf$coefficients$count[1]
+ } else {
+   mod.sf_aic = NaN
+   mod.sf_intercept = NaN
+ }
+ if (typeof(mod.mf) == "list") {
+   mod.mf_aic = as.double(AIC(mod.mf))
+   mod.mf_intercept = mod.mf$coefficients$count[1]
+ } else {
+   mod.mf_aic = NaN
+   mod.mf_intercept = NaN
+ }
+ EOF
+       @@R.eval rcode
+     end
+     # ruby metaprogramming :P
+     def self.method_missing m, *args, &block
+       if m =~ /^data_(stat_)(.*)/
+         @data["#{$1}#{$2}"] = @@R.send "mod.#{$2}"
+         if @data["#{$1}#{$2}"].respond_to? :nan? and @data["#{$1}#{$2}"].nan?
+           @data["#{$1}#{$2}"] = 'NA'
+         end
+       else
+         abort "your missing your method named #{m}, you should go find it\n"
+       end
+     end
    end
  end # class Bgdist

- # main

+
+ # main
  directive = ARGV.shift
  case directive
  when 'help'
    Utils::print_usage
    exit 0
- else
+ when 'metadata'
+   outfile = ARGV.shift
+   clas = Object.const_get ARGV.shift.capitalize
    params = []
    ARGV.shift.gsub(/([^,]+)/){ |p| params << p }
-   DataFile::format = params
-   clas = Object.const_get directive.capitalize
+   DataFile::format params, (File.open outfile, 'w')
+ # This is the super fancy option that I will work on this weekend
+ # I want to create a one page web app that is centered on a timeline
+ #
+ when 'timeserve'
+
+
+ else
+   puts 'unknown directive'
  end

- objs = []
  ARGV.each do |filename|
-   objs << (clas.new filename)
- end
-
- objs.each do |obj|
+   obj = clas.new filename
    obj.write
  end
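Review note on the dispatch in `write`: each requested parameter is turned into a `data_<param>` call, and `stat_*` names that have no explicit method fall through to `method_missing`. The sketch below illustrates that pattern in isolation, under stated assumptions: `Collector`, its `stats` hash, and the hard-coded line count are hypothetical stand-ins (the hash replaces the R session), and unlike the script it defers to `super` for unknown names rather than calling `abort`.

```ruby
# Illustrative only: a tiny collector that resolves parameter names to methods.
class Collector
  def initialize(stats)
    @stats = stats   # hypothetical precomputed statistics, keyed by short name
    @data = {}
  end

  # An explicitly defined parameter.
  def data_lines
    @data['lines'] = 3
  end

  # Any data_stat_<name> call is answered from the stats hash.
  def method_missing(name, *args, &block)
    if name.to_s =~ /\Adata_stat_(.*)\z/
      key = "stat_#{$1}"
      value = @stats.fetch($1)
      # NaN marks a failed fit and is reported as 'NA' in the output row.
      @data[key] = (value.respond_to?(:nan?) && value.nan?) ? 'NA' : value
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('data_stat_') || super
  end

  # Fill in every requested parameter, then join them as one csv row.
  def row(params)
    params.each { |param| send("data_#{param}") }
    params.map { |param| @data[param] }.join(',')
  end
end

collector = Collector.new('sf_aic' => 12.5, 'mf_aic' => Float::NAN)
line = collector.row(%w[lines stat_sf_aic stat_mf_aic])
```

Defining `respond_to_missing?` alongside `method_missing` keeps `respond_to?` consistent with what the object actually answers, which is the usual Ruby convention for this kind of dynamic dispatch.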
data/lib/utils.rb ADDED
@@ -0,0 +1,10 @@
+ module Utils
+   def self.print_usage
+     puts "primord-metadata <directive> ...\n\tSee https://github.com/floomby/primord-tools for more info"
+   end
+   # get the date the file was created from the filename
+   def self.extract_date str
+     a = str.match /.*([0-9]{4})([0-9]{2})([0-9]{2})/
+     "#{a[2]}-#{a[3]}-#{a[1]}"
+   end
+ end # module Utils
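Review note on `extract_date`: the regex captures an eight-digit `YYYYMMDD` run in the filename and re-emits it as `MM-DD-YYYY`; when the name carries no such run, `match` returns `nil` and the interpolation raises `NoMethodError`. The standalone sketch below shows the same extraction with a guard. The `nil` return and both filenames are assumptions made for illustration, not behaviour of the package.

```ruby
# Extract a date from a filename containing YYYYMMDD, formatted as MM-DD-YYYY.
def extract_date(str)
  match = str.match(/.*([0-9]{4})([0-9]{2})([0-9]{2})/)
  return nil if match.nil?   # assumption: callers can cope with a missing date
  "#{match[2]}-#{match[3]}-#{match[1]}"
end

with_date = extract_date("bgdist_20140618.csv")   # "06-18-2014"
without_date = extract_date("notes.txt")          # nil
```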
data/setup-env.sh ADDED
@@ -0,0 +1,6 @@
+ #!/bin/bash
+
+ # set up the environment needed to use R on windows
+ # under cygwin with the rinruby gem
+
+ export PATH=$PATH:/cygdrive/c/Program\ Files/R/R-2.15.2/bin/x64
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: primord-tools
  version: !ruby/object:Gem::Version
-   version: 1.0.0
+   version: 1.0.3
  platform: ruby
  authors:
  - ! 'floomby
@@ -10,7 +10,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-06-18 00:00:00.000000000 Z
+ date: 2014-07-03 00:00:00.000000000 Z
  dependencies: []
  description: Process the logs from gnuradio and generates metadata
  email: ! 'floomby@nmt.edu
@@ -24,8 +24,10 @@ files:
  - .gitignore
  - README.md
  - Rakefile
- - TODO
+ - TODO.md
  - bin/primord-metadata
+ - lib/utils.rb
+ - setup-env.sh
  homepage: https://github.com/floomby/gitver
  licenses: []
  metadata: {}
data/TODO DELETED
@@ -1,2 +0,0 @@
- 1) a good README
- 2) a better help message