puppet-community-mvp 0.0.3 → 0.0.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
- metadata.gz: f77ade2721786fdca7fd96827ca510e0404a02a8
- data.tar.gz: ad32f5a43392ed7f23d49d60db3fbe308ae73e2d
+ SHA256:
+ metadata.gz: dd83202b003a900b8744b0fc8da5bb14b6024ca37f1419c841d68afaa4b487dd
+ data.tar.gz: c69cfa9c035b30136593d10dbad96d588fd0fe8dd870ba0763068f0c756af5cc
  SHA512:
- metadata.gz: be5282a77000b433c3aedd58ddfd29ea594df089007e38e74cb8b18c5c6b576ff048b218ec30cb9f60b8291faa1b62525f8d2c79694444bb3f95df2ede98efc0
- data.tar.gz: 94894492ffba9d187f9178a17d51e412d80dc7b3434681bf43da79f422c6bbb0ddddba7f347b49ed048e0134a72eefe5ff3d83a615d5c4402d809f3e8010c1ac
+ metadata.gz: 5370badaaa4208281fa6e864a398482cd6403aeacaa8ef1ec7ac661d39287cff944a483c3f16ed352feb27b3c230573d627ebf75c43d5be4d496313553706f11
+ data.tar.gz: 1418192f0b6adc010b7982b34c2b8f1654b6a3fa6fbd88533a2bf766aea2a4ad3ffb63ba6d3d3481c67c1cf74fa0ec9796dc520474558750bda62fd32d05d24a
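The digests here moved from SHA1 to SHA256. If you want to check them yourself, a `.gem` file is a plain tar archive containing these members, so something like the following should reproduce the values (a sketch):

```
$ tar -xf puppet-community-mvp-0.0.7.gem
$ sha256sum metadata.gz data.tar.gz
```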
data/README.md CHANGED
@@ -0,0 +1,78 @@
+ # Puppet Community MVP tool
+
+ This is a simple tool to generate stats about the Puppet community. It was
+ originally intended to show the "most valuable players" but has since morphed to
+ show a lot of other things too. We primarily use it on a weekly cron job to
+ gather information using the Forge APIs and normalize it so that it can be
+ easily combined with simple SQL queries to generate usage information.
+
+ ## Interactive usage
+
+ If you're not working on our community stats pipeline, then there are only three
+ subcommands you'll be interested in.
+
+ ### `stats`
+
+ This subcommand will use cached data to generate a report of Forge community
+ statistics. For example, it will generate distributions of module quality
+ scores, releases per module, modules per author, and so on. It will also generate
+ sparklines showing the contributions over time of the most prolific Forge
+ authors, and it will highlight authors who aren't as active as they used to be.
+
+ Unfortunately, this report is not customizable or templatable at this point.
+
+ You will need cached data before you can generate this report. See the `get` subcommand.
+
+
+ ### `get`
+
+ This subcommand will download and cache a local mirror of the data stored in our
+ BigQuery database. This data is used for the `stats` command.
+
+
+ ### `analyze`
+
+ This subcommand is perhaps the most interesting. Many useful bits of
+ information can be gathered by inspecting the source code of modules rather than
+ by running SQL queries against their statistics. For example, `find manifests/ -name
+ '*.pp' | wc -l` will tell you how many manifests any given module includes, and
+ `grep -rn '--no-external-facts' facts.d/` will tell you how many external facts
+ are invoking `facter` to gather and use _other_ facts while running.
+
+ This command lets you write that little bit of analysis code as a script, and
+ then systematically run that script against the current release of every single
+ module on the Forge and collate the generated output.
+
+ A script can be written in any language and will be executed from the root of
+ the unpacked module. It will be invoked with an environment containing the following
+ variables:
+
+ * `mvp_owner` -- the Forge namespace of the module, aka the author's username
+ * `mvp_name` -- the name of the module itself
+ * `mvp_version` -- the current version of the module
+ * `mvp_downloads` -- the number of downloads this module has; a *rough* estimate of popularity
+
+ The script should print an array of arrays in JSON format to STDOUT. These will be
+ combined to make a CSV file, the columns of which are defined by the data you
+ return. In other words, the items in the inner array(s) are entirely up to you;
+ they become the columns of the generated CSV file.
+
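For instance, a minimal analysis script might look like this (a hypothetical sketch modeled on `scripts/manifest_count.rb`, using only the environment variables and output contract described above):

```ruby
#! /usr/bin/env ruby
# Count the manifests in the unpacked module and emit one CSV row.
require 'json'

manifests = Dir.glob('manifests/**/*.pp').size
puts JSON.generate([[ENV['mvp_owner'], ENV['mvp_name'], ENV['mvp_version'], manifests]])
```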
+ The parameters relevant to this subcommand are:
+
+ ```
+ -o, --output_file OUTPUT_FILE The path to save a csv report.
+ --script SCRIPT The script file to analyze a module. See docs for interface.
+ --count N For debugging. Select a random list of this many modules to analyze.
+ -d, --debug Display extra debugging information.
+ ```
+
+ See files in the `scripts/` directory for examples of analysis scripts. To use
+ one, just pass the path of a script, like so:
+
+ ```
+ $ mvp analyze --script scripts/manifest_count.rb --count 5
+ [✔] stdlib (OK)
+ $ cat analyzed.csv
+ ...
+ ```
+
data/bin/mvp CHANGED
@@ -13,19 +13,21 @@ optparse = OptionParser.new { |opts|
  opts.banner = "Usage : #{NAME} [command] [target] [options]

  This tool will scrape the Puppet Forge API for interesting module & author stats.
- The following CLI commands are available.
+ It can also mirror public BigQuery tables or views into our dataset for efficiency,
+ or download and itemize each Forge module.

- * get | retrieve | download [target]
- * Downloads and caches all Forge metadata.
- * Optional targets: all, authors, modules, releases
- * upload | insert [target]
- * Uploads data to BigQuery
- * Optional targets: all, authors, modules, releases, mirrors
  * mirror [target]
  * Runs the download & then upload tasks.
+ * Optional targets: all, authors, modules, releases, validations, itemizations, puppetfiles, tables
+ * get | retrieve | download [target]
+ * Downloads and caches data locally so you can run the stats task.
  * Optional targets: all, authors, modules, releases
  * stats
  * Print out a summary of interesting stats.
+ * analyze <script file>
+ * Run a specified script to analyze each module to generate arbitrary stats
+ * Writes output to a csv file, analyzed.csv by default
+
  "

  opts.on("-f FORGEAPI", "--forgeapi FORGEAPI", "Forge API server. Rarely needed.") do |arg|
@@ -60,10 +62,22 @@ The following CLI commands are available.
  options[:output_file] = arg
  end

+ opts.on("--script SCRIPT", "The script file to analyze a module. See docs for interface.") do |arg|
+ options[:script] = arg
+ end
+
+ opts.on("--count N", "For debugging. Select a random list of this many modules to analyze.") do |arg|
+ options[:count] = arg.to_i
+ end
+
  opts.on("-d", "--debug", "Display extra debugging information.") do
  options[:debug] = true
  end

+ opts.on("-n", "--noop", "Don't actually upload data.") do
+ options[:noop] = true
+ end
+
  opts.separator('')

  opts.on("-h", "--help", "Displays this help") do
@@ -83,31 +97,29 @@ options[:gcloud][:dataset] ||= 'community'
  options[:gcloud][:project] ||= 'puppet'
  options[:gcloud][:keyfile] ||= '~/.mvp/credentials.json'

+ options[:script] = File.expand_path(options[:script]) if options[:script]
  options[:cachedir] = File.expand_path(options[:cachedir])
  options[:github_data] = File.expand_path(options[:github_data])
  options[:gcloud][:keyfile] = File.expand_path(options[:gcloud][:keyfile])
  FileUtils.mkdir_p(options[:cachedir])

+ command, target = ARGV
+ case command
+ when 'analyze'
+ options[:output_file] ||= 'analyzed.csv'
+ end
+
  $logger = Logger::new(STDOUT)
  $logger.level = options[:debug] ? Logger::DEBUG : Logger::INFO
  $logger.formatter = proc { |severity,datetime,progname,msg| "#{severity}: #{msg}\n" }

  runner = Mvp::Runner.new(options)

- command, target = ARGV
  case command
  when 'get', 'retrieve', 'download'
  target ||= :all
  runner.retrieve(target.to_sym)

- when 'transform'
- target ||= :all
- runner.retrieve(target.to_sym, false)
-
- when 'insert', 'upload'
- target ||= :all
- runner.upload(target.to_sym)
-
  when 'mirror'
  target ||= :all
  runner.mirror(target.to_sym)
@@ -116,6 +128,9 @@ when 'stats'
  target ||= :all
  runner.stats(target.to_sym)

+ when 'analyze'
+ runner.analyze
+
  when 'test'
  runner.test

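Putting the new options together, an invocation might look like this (a hypothetical session; `--noop` walks the mirror pipeline without writing anything to BigQuery):

```
$ mvp mirror modules --noop
$ mvp analyze --script scripts/manifest_count.rb --count 5
```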
data/bin/pftest.rb ADDED
@@ -0,0 +1,22 @@
+ #! /usr/bin/env ruby
+
+ require 'mvp/puppetfile_parser'
+ require 'open-uri'
+ require 'json'
+ require 'logger'
+
+ $logger = Logger::new(STDOUT)
+ $logger.level = Logger::INFO
+ $logger.formatter = proc { |severity,datetime,progname,msg| "#{severity}: #{msg}\n" }
+
+ pf = open(ARGV.first)
+ parser = Mvp::PuppetfileParser.new()
+
+
+ repo = {
+ :repo_name => 'testing',
+ :md5 => 'wakka wakka',
+ :content => pf.read,
+ }
+
+ puts JSON.pretty_generate(parser.parse(repo))
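Because the harness reads `ARGV.first` with `open-uri`, it accepts either a local path or a URL (both shown below are hypothetical):

```
$ ruby bin/pftest.rb ./Puppetfile
$ ruby bin/pftest.rb https://raw.githubusercontent.com/example/control-repo/production/Puppetfile
```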
data/lib/mvp.rb CHANGED
@@ -1,4 +1,2 @@
  require 'mvp/runner'
- require 'mvp/downloader'
- require 'mvp/uploader'
- require 'mvp/stats'
+ require 'mvp/stats'
data/lib/mvp/uploader.rb → data/lib/mvp/bigquery.rb RENAMED
@@ -3,10 +3,10 @@ require 'tty-spinner'
  require "google/cloud/bigquery"

  class Mvp
- class Uploader
+ class Bigquery
  def initialize(options = {})
+ @options = options
  @cachedir = options[:cachedir]
- @mirrors = options[:gcloud][:mirrors]
  @bigquery = Google::Cloud::Bigquery.new(
  :project_id => options[:gcloud][:project],
  :credentials => Google::Cloud::Bigquery::Credentials.new(options[:gcloud][:keyfile]),
@@ -16,7 +16,7 @@ class Mvp
  raise "\nThere is a problem with the gCloud configuration: \n #{JSON.pretty_generate(options)}" if @dataset.nil?

  @itemized = @dataset.table('forge_itemized') || @dataset.create_table('forge_itemized') do |table|
- table.name = 'Itemied dependencies between modules'
+ table.name = 'Itemized dependencies between modules'
  table.description = 'A list of all types/classes/functions used by each module and where they come from'
  table.schema do |s|
  s.string "module", mode: :required
@@ -27,9 +27,24 @@ class Mvp
  s.integer "count", mode: :required
  end
  end
+
+ @puppetfile_usage = @dataset.table('github_puppetfile_usage') || @dataset.create_table('github_puppetfile_usage') do |table|
+ table.name = 'Puppetfile Module Usage'
+ table.description = 'A list of all modules referenced in public Puppetfiles'
+ table.schema do |s|
+ s.string "repo_name", mode: :required
+ s.string "module", mode: :required
+ s.string "type", mode: :required
+ s.string "source"
+ s.string "version"
+ s.string "md5", mode: :required
+ end
+ end
  end

  def truncate(entity)
+ return if @options[:noop]
+
  begin
  case entity
  when :authors
@@ -65,6 +80,7 @@ class Mvp
  s.timestamp "created_at", mode: :required
  s.timestamp "updated_at", mode: :required
  s.string "tasks", mode: :repeated
+ s.string "plans", mode: :repeated
  s.string "homepage_url"
  s.string "project_page"
  s.string "issues_url"
@@ -125,6 +141,7 @@ class Mvp
  s.timestamp "deleted_at"
  s.string "deleted_for"
  s.string "tasks", mode: :repeated
+ s.string "plans", mode: :repeated
  s.string "project_page"
  s.string "issues_url"
  s.string "source"
@@ -144,11 +161,9 @@ class Mvp
  s.boolean "puppet_99x"
  s.string "dependencies", mode: :repeated
  s.string "file_uri", mode: :required
- s.string "file_md5", mode: :required
+ s.string "file_md5"
+ s.string "file_sha256"
  s.integer "file_size", mode: :required
- s.string "changelog"
- s.string "reference"
- s.string "readme"
  s.string "license"
  s.string "metadata", mode: :required
  end
@@ -163,99 +178,90 @@ class Mvp
  end
  end

- def authors()
- upload('authors')
- end
-
- def modules()
- upload('modules')
+ def retrieve(entity)
+ get(entity, ['*'])
  end

- def releases()
- upload('releases')
- end
+ def mirror_table(entity)
+ return if @options[:noop]

- def validations()
- upload('validations')
- end
-
- def github_mirrors()
- @mirrors.each do |entity|
- begin
- spinner = TTY::Spinner.new("[:spinner] :title")
- spinner.update(title: "Mirroring #{entity[:type]} #{entity[:name]} to BigQuery...")
- spinner.auto_spin
-
- case entity[:type]
- when :view
- @dataset.table(entity[:name]).delete rescue nil # delete if exists
- @dataset.create_view(entity[:name], entity[:query],
- :legacy_sql => true)
-
- when :table
- job = @dataset.query_job(entity[:query],
- :legacy_sql => true,
- :write => 'truncate',
- :table => @dataset.table(entity[:name], :skip_lookup => true))
- job.wait_until_done!
+ begin
+ case entity[:type]
+ when :view
+ @dataset.table(entity[:name]).delete rescue nil # delete if exists
+ @dataset.create_view(entity[:name], entity[:query])

- else
- $logger.error "Unknown mirror type: #{entity[:type]}"
- end
+ when :table
+ job = @dataset.query_job(entity[:query],
+ :write => 'truncate',
+ :table => @dataset.table(entity[:name], :skip_lookup => true))
+ job.wait_until_done!

- spinner.success('(OK)')
- rescue => e
- spinner.error("(Google Cloud error: #{e.message})")
- $logger.error e.backtrace.join("\n")
+ else
+ $logger.error "Unknown mirror type: #{entity[:type]}"
  end
+ rescue => e
+ $logger.error("(Google Cloud error: #{e.message})")
+ $logger.debug e.backtrace.join("\n")
  end
  end

- def insert(entity, data)
- table = @dataset.table("forge_#{entity}")
+ def insert(entity, data, suite = 'forge')
+ return if @options[:noop]
+ return if data.empty?
+
+ table = @dataset.table("#{suite}_#{entity}")
  response = table.insert(data)

  unless response.success?
- errors = {}
+ $logger.error '========================================================================='
  response.insert_errors.each do |err|
- errors[err.row['slug']] = err.errors
+ $logger.debug JSON.pretty_generate(err.row.reject {|k,v| ['metadata'].include? k})
+ $logger.error JSON.pretty_generate(err.errors)
  end
- $logger.error JSON.pretty_generate(errors)
  end
  end

- def upload(entity)
- begin
- spinner = TTY::Spinner.new("[:spinner] :title")
- spinner.update(title: "Uploading #{entity} to BigQuery ...")
- spinner.auto_spin
+ def delete(entity, field, match, suite = 'forge')
+ @dataset.query("DELETE FROM #{suite}_#{entity} WHERE #{field} = '#{match}'")
+ end

- @dataset.load("forge_#{entity}", "#{@cachedir}/nld_#{entity}.json",
- :write => 'truncate',
- :autodetect => true)
+ def get(entity, fields, suite = 'forge')
+ raise 'pass fields as an array' unless fields.is_a? Array
+ @dataset.query("SELECT #{fields.join(', ')} FROM #{suite}_#{entity}")
+ end

- # table = @dataset.table("forge_#{entity}")
- # File.readlines("#{@cachedir}/nld_#{entity}.json").each do |line|
- # data = JSON.parse(line)
- #
- # begin
- # table.insert data
- # rescue
- # require 'pry'
- # binding.pry
- # end
- # end
+ def module_sources()
+ get('modules', ['slug', 'source'])
+ end

+ def puppetfiles()
+ sql = 'SELECT f.repo_name, f.path, c.content, c.md5
+ FROM github_puppetfile_files AS f
+ JOIN github_puppetfile_contents AS c
+ ON c.id = f.id

- spinner.success('(OK)')
- rescue => e
- spinner.error("(Google Cloud error: #{e.message})")
- $logger.error e.backtrace.join("\n")
- end
+ WHERE c.md5 NOT IN (
+ SELECT u.md5
+ FROM github_puppetfile_usage AS u
+ WHERE u.repo_name = f.repo_name
+ ) AND LOWER(repo_name) NOT LIKE "%boxen%"'
+ @dataset.query(sql)
+ end
+
+ def unitemized()
+ sql = 'SELECT m.name, m.slug, m.version, m.dependencies
+ FROM forge_modules AS m
+ WHERE m.version NOT IN (
+ SELECT i.version
+ FROM forge_itemized AS i
+ WHERE module = m.slug
+ )'
+ @dataset.query(sql)
  end

  def version_itemized?(mod, version)
- str = "SELECT version FROM forge_itemized WHERE name = '#{mod}' UNIQUE"
+ str = "SELECT DISTINCT version FROM forge_itemized WHERE module = '#{mod}'"
  versions = @dataset.query(str).map {|row| row[:version] } rescue []

  versions.include? version
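A quick sketch of how the new generic helpers compose, based on the signatures above (`options` and `rows` are assumed to exist; the repo name is hypothetical). Table names are built as `"#{suite}_#{entity}"`, defaulting to the `forge` suite:

```ruby
bq = Mvp::Bigquery.new(options)

bq.get(:modules, ['slug', 'source'])   # SELECT slug, source FROM forge_modules
bq.retrieve(:authors)                  # equivalent to get(:authors, ['*'])

# the :github suite targets the github_puppetfile_usage table created above
bq.delete(:puppetfile_usage, :repo_name, 'acme/control-repo', :github)
bq.insert(:puppetfile_usage, rows, :github)  # skipped entirely when options[:noop] is set
```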
data/lib/mvp/downloader.rb → data/lib/mvp/forge.rb RENAMED
@@ -2,151 +2,82 @@ require 'json'
  require 'httparty'
  require 'tty-spinner'
  require 'semantic_puppet'
- require 'mvp/monkeypatches'
- require 'mvp/itemizer'

  class Mvp
- class Downloader
+ class Forge
  def initialize(options = {})
  @useragent = 'Puppet Community Stats Monitor'
- @cachedir = options[:cachedir]
  @forgeapi = options[:forgeapi] ||'https://forgeapi.puppet.com'
- @itemizer = Mvp::Itemizer.new(options)
  end

- def mirror(entity, uploader)
- # using authors for git repo terminology consistency
- item = (entity == :authors) ? 'users' : entity.to_s
- download(item) do |data|
- case entity
- when :modules
- uploader.insert(:validations, flatten_validations(retrieve_validations(data)))
- data = flatten_modules(data)
-
- @itemizer.run!(data, uploader)
- when :releases
- data = flatten_releases(data)
- end
-
- uploader.insert(entity, data)
- end
- end
-
- def retrieve(entity, download = true)
- if download
- # I am focusing on authorship rather than just users, so for now I'm using the word authors
- item = (entity == :authors) ? 'users' : entity.to_s
- data = []
- download(item) do |resp|
- data.concat resp
- end
- save_json(entity, data)
- else
- data = File.read("#{@cachedir}/#{entity}.json")
- end
-
- case entity
- when :modules
- data = flatten_modules(data)
- when :releases
- data = flatten_releases(data)
- end
- save_nld_json(entity.to_s, data)
- end
-
- def retrieve_validations(modules, period = 25)
- results = {}
+ def retrieve(entity)
+ raise 'Please process downloaded data by passing a block' unless block_given?

+ # using authors for git repo terminology consistency
+ entity = :users if entity == :authors
  begin
  offset = 0
- endpoint = "/private/validations/"
- modules.each do |mod|
- name = "#{mod['owner']['username']}-#{mod['name']}"
- response = HTTParty.get("#{@forgeapi}#{endpoint}#{name}", headers: {'User-Agent' => @useragent})
+ endpoint = "/v3/#{entity}?sort_by=downloads&limit=50"
+
+ while endpoint do
+ response = HTTParty.get("#{@forgeapi}#{endpoint}", headers: {"User-Agent" => @useragent})
  raise "Forge Error: #{@response.body}" unless response.code == 200
+ data = JSON.parse(response.body)
+ results = munge_dates(data['results'])
+
+ case entity
+ when :modules
+ results = flatten_modules(results)
+ when :releases
+ results = flatten_releases(results)
+ end

- results[name] = JSON.parse(response.body)
- offset += 1
+ yield results, offset

- if block_given? and (offset % period == 0)
- yield offset
+ offset += 50
+ endpoint = data['pagination']['next']
+ if (endpoint and (offset % 250 == 0))
  GC.start
  end
  end
+
  rescue => e
  $logger.error e.message
  $logger.debug e.backtrace.join("\n")
  end

- results
+ nil
  end

- def validations()
- cache = "#{@cachedir}/modules.json"
-
- if File.exist? cache
- module_data = JSON.parse(File.read(cache))
- else
- module_data = retrieve(:modules)
- end
+ def retrieve_validations(modules, period = 25)
+ raise 'Please process validations by passing a block' unless block_given?

+ offset = 0
  begin
- spinner = TTY::Spinner.new("[:spinner] :title")
- spinner.update(title: "Downloading module validations ...")
- spinner.auto_spin
+ modules.each_slice(period) do |group|
+ offset += period
+ results = group.map { |mod| validations(mod[:slug]) }

- results = retrieve_validations(module_data) do |offset|
- spinner.update(title: "Downloading module validations [#{offset}]...")
+ yield results, offset
+ GC.start
  end
-
- spinner.success('(OK)')
  rescue => e
- spinner.error('API error')
  $logger.error e.message
  $logger.debug e.backtrace.join("\n")
  end

- save_json('validations', results)
- save_nld_json('validations', flatten_validations(results))
- results
+ nil
  end

- def download(entity)
- raise 'Please process downloaded data by passing a block' unless block_given?
-
- begin
- offset = 0
- endpoint = "/v3/#{entity}?sort_by=downloads&limit=50"
- spinner = TTY::Spinner.new("[:spinner] :title")
- spinner.update(title: "Downloading #{entity} ...")
- spinner.auto_spin
-
- while endpoint do
- response = HTTParty.get("#{@forgeapi}#{endpoint}", headers: {"User-Agent" => @useragent})
- raise "Forge Error: #{@response.body}" unless response.code == 200
- data = JSON.parse(response.body)
-
- offset += 50
- endpoint = data['pagination']['next']
-
- yield munge_dates(data['results'])
-
- if (endpoint and (offset % 250 == 0))
- spinner.update(title: "Downloading #{entity} [#{offset}]...")
- GC.start
- end
- end
-
- spinner.success('(OK)')
- rescue => e
- spinner.error('API error')
- $logger.error e.message
- $logger.debug e.backtrace.join("\n")
- end
+ def validations(name)
+ endpoint = "/private/validations/"
+ response = HTTParty.get("#{@forgeapi}#{endpoint}#{name}", headers: {'User-Agent' => @useragent})
+ raise "Forge Error: #{@response.body}" unless response.code == 200

- nil
+ flatten_validations(name, JSON.parse(response.body))
  end

+
  # transform dates into a format that bigquery knows
  def munge_dates(object)
  ["created_at", "updated_at", "deprecated_at", "deleted_at"].each do |field|
@@ -160,16 +91,6 @@ class Mvp
  object
  end

- def save_json(thing, data)
- File.write("#{@cachedir}/#{thing}.json", data.to_json)
- end
-
- # store data in a way that bigquery can grok
- # uploading files is far easier than streaming data, when replacing a dataset
- def save_nld_json(thing, data)
- File.write("#{@cachedir}/nld_#{thing}.json", data.to_newline_delimited_json)
- end
-
  def flatten_modules(data)
  data.each do |row|
  row['owner'] = row['owner']['username']
@@ -183,6 +104,7 @@ class Mvp
  row['project_page'] = row['current_release']['metadata']['project_page']
  row['issues_url'] = row['current_release']['metadata']['issues_url']
  row['tasks'] = row['current_release']['tasks'].map{|task| task['name']} rescue []
+ row['plans'] = row['current_release']['plans'].map{|task| task['name']} rescue []

  row['release_count'] = row['releases'].count rescue 0
  row['releases'] = row['releases'].map{|r| r['version']} rescue []
@@ -202,21 +124,24 @@ class Mvp
  row['project_page'] = row['metadata']['project_page']
  row['issues_url'] = row['metadata']['issues_url']
  row['tasks'] = row['tasks'].map{|task| task['name']} rescue []
+ row['plans'] = row['plans'].map{|task| task['name']} rescue []

  simplify_metadata(row, row['metadata'])
- row.delete('module')
+
+ # These items are just too big to store in the table, and the malware scan isn't done yet
+ ['module', 'changelog', 'readme', 'reference', 'malware_scan'].each do |column|
+ row.delete(column)
+ end
  end
  data
  end

- def flatten_validations(data)
- data.map do |name, scores|
- row = { 'name' => name }
- scores.each do |entry|
- row[entry['name']] = entry['score']
- end
- row
+ def flatten_validations(name, scores)
+ row = { 'name' => name }
+ scores.each do |entry|
+ row[entry['name']] = entry['score']
  end
+ row
  end

  def simplify_metadata(data, metadata)
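The reworked `retrieve` above is now purely block-driven: it pages through the Forge API 50 records at a time, flattens each page, and yields it along with a running offset. A minimal consumer looks like this (a sketch mirroring how `Mvp::Runner#mirror` calls it; `options` is assumed):

```ruby
forge = Mvp::Forge.new(options)
forge.retrieve(:modules) do |results, offset|
  # each block call receives one flattened page of up to 50 records
  puts "offset #{offset}: #{results.size} modules"
end
```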
data/lib/mvp/itemizer.rb CHANGED
@@ -12,7 +12,7 @@ class Mvp
  def run!(data, uploader)
  data.each do |mod|
- modname = mod['slug']
+ modname = mod['name']
  version = mod['version']
  return if uploader.version_itemized?(modname, version)
@@ -27,13 +27,23 @@ class Mvp
  end
  end

+ def itemized(mod)
+ modname = mod[:slug]
+ version = mod[:version]
+ baserow = { :module => modname, :version => version, :kind => 'admin', :element => 'version', :count => 0}
+
+ table(itemize(modname, version), mod) << baserow
+ end
+
  def download(path, modname, version)
  filename = "#{modname}-#{version}.tar.gz"
  Dir.chdir(path) do
  File.open(filename, "w") do |file|
  file << HTTParty.get( "#{@forge}/v3/files/#{filename}" )
  end
- system("tar -xf #{filename}")
+ # Why is tar terrible?
+ FileUtils.mkdir("#{modname}-#{version}")
+ system("tar -xf #{filename} -C #{modname}-#{version} --strip-components=1")
  FileUtils.rm(filename)
  end
  end
@@ -55,23 +65,67 @@ class Mvp
  end
  end

+ def analyze(mod, script, debug)
+ require 'open3'
+ require 'json'
+
+ # sanitize an environment
+ env = {'mvp_script' => script}
+ mod.each do |key, value|
+ env["mvp_#{key}"] = value.to_s
+ end
+
+ downloads = mod[:downloads]
+ Dir.mktmpdir('mvp') do |path|
+ download(path, "#{mod[:owner]}-#{mod[:name]}", mod[:version])
+
+ rows = []
+ Dir.chdir("#{path}/#{mod[:owner]}-#{mod[:name]}-#{mod[:version]}") do
+ if debug
+ exit(1) unless system(env, ENV['SHELL'])
+ end
+
+ stdout, stderr, status = Open3.capture3(env, script)
+
+ if status.success?
+ rows = JSON.parse(stdout)
+ else
+ $logger.error stderr
+ end
+ end
+
+ return rows unless rows.empty?
+ end
+ end
+
  # Build a table with this schema
  # module | version | source | kind | element | count
  def table(itemized, data)
- modname = data['slug']
- version = data['version']
- dependencies = data['dependencies']
+ modname = data[:name]
+ slug = data[:slug]
+ version = data[:version]
+ dependencies = data[:dependencies]

  itemized.map do |kind, elements|
  # the kind of element comes pluralized from puppet-itemize
  kind = kind.to_s
  kind = kind.end_with?('ses') ? kind.chomp('es') : kind.chomp('s')
  elements.map do |name, count|
- # TODO: this may suffer from collisions, (module foo, function foo, for example)
- depname = name.split('::').first
+ if name == modname
+ depname = name
+ else
+ # This relies on a little guesswork.
+ segments = name.split('::') # First see if it's already namespaced and we can just use it
+ segments = name.split('_') if segments.size == 1 # If not, then maybe it follows the pattern like 'mysql_password'
+ depname = segments.first
+ end
+
+ # There's a chance of collisions here. For example, if you depended on a module
+ # named 'foobar-notify' and you used a 'notify' resource, then the resource would
+ # be improperly linked to that module. That's a pretty small edge case though.
  source = dependencies.find {|row| row.split('-').last == depname} rescue nil

- { :module => modname, :version => version, :source => source, :kind => kind, :element => name, :count => count }
+ { :module => slug, :version => version, :source => source, :kind => kind, :element => name, :count => count }
  end
  end.flatten(1)
  end
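The namespace guesswork in `table` above is easiest to see with a worked example (a standalone sketch of the same logic, not library API):

```ruby
def guess_depname(name)
  segments = name.split('::')                       # 'mysql::server'  => ['mysql', 'server']
  segments = name.split('_') if segments.size == 1  # 'mysql_password' => ['mysql', 'password']
  segments.first
end

deps = ['puppetlabs-mysql', 'puppetlabs-stdlib']
deps.find { |row| row.split('-').last == guess_depname('mysql::server') }   # => "puppetlabs-mysql"
deps.find { |row| row.split('-').last == guess_depname('mysql_password') }  # => "puppetlabs-mysql"
```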
data/lib/mvp/puppetfile_parser.rb ADDED
@@ -0,0 +1,171 @@
+ class Mvp
+ class PuppetfileParser
+ def initialize(options = {})
+ @sources = {}
+ @modules = []
+ @repo = nil
+ end
+
+ def suitable?
+ defined?(RubyVM::AbstractSyntaxTree)
+ end
+
+ def sources=(modules)
+ modules.each do |row|
+ next unless row[:source]
+ next if row[:source] == 'UNKNOWN'
+
+ @sources[canonical_git_repo(row[:source])] = row[:slug]
+ end
+ end
+
+ def parse(repo)
+ # This only works on Ruby 2.6+
+ return unless suitable?
+
+ begin
+ root = RubyVM::AbstractSyntaxTree.parse(repo[:content])
+ rescue SyntaxError => e
+ $logger.warn "Syntax error in #{repo[:repo_name]}/Puppetfile"
+ $logger.warn e.message
+ end
+
+ @repo = repo
+ @modules = []
+ traverse(root)
+ @modules.compact.map do |row|
+ row[:repo_name] = repo[:repo_name]
+ row[:md5] = repo[:md5]
+ row[:module] = canonical_name(row[:module], row[:source])
+ stringify(row)
+ end
+ end
+
+ def stringify(row)
+ row.each do |key, value|
+ if value.is_a? RubyVM::AbstractSyntaxTree::Node
+ row[key] = :'#<programmatically generated via ruby code>'
+ end
+ end
+ end
+
+ def canonical_name(name, repo)
+ return name if name.include?('-')
+ repo = canonical_git_repo(repo)
+
+ return @sources[repo] if @sources.include?(repo)
+ name
+ end
+
+ def canonical_git_repo(repo)
+ return unless repo
+ return unless repo.is_a? String
+ repo.sub(/^git@github.com\:/, 'github.com/')
+ .sub(/^(git|https?)\:\/\//, '')
+ .sub(/\.git$/, '')
+ end
+
+ def add_module(name, args)
+ unless name.is_a? String
+ $logger.warn "Non string module name in #{@repo[:repo_name]}/Puppetfile"
+ return nil
+ end
+ name.gsub!('/', '-')
+ case args
+ when String, Symbol, NilClass
+ @modules << {
+ :module => name,
+ :type => :forge,
+ :source => :forge,
+ :version => args,
+ }
+ when Hash
+ @modules << parse_args(name, args)
+ else
+ $logger.warn "#{@repo[:repo_name]}/Puppetfile: Unknown format: mod('#{name}', #{args.inspect})"
+ end
+ end
+
+ def parse_args(name, args)
+ data = {:module => name}
+
+ if args.include? :git
+ data[:type] = :git
+ data[:source] = args[:git]
+ data[:version] = args[:ref] || args[:tag] || args[:commit] || args[:branch] || :latest
+ elsif args.include? :svn
+ data[:type] = :svn
+ data[:source] = args[:svn]
+ data[:version] = args[:rev] || args[:revision] || :latest
+ elsif args.include? :boxen
+ data[:type] = :boxen
+ data[:source] = args[:repo]
+ data[:version] = args[:version] || :latest
+ else
+ $logger.warn "#{@repo[:repo_name]}/Puppetfile: Unknown args format: mod('#{name}', #{args.inspect})"
+ return nil
+ end
+
+ data
+ end
+
+ def traverse(node)
+ begin
+ if node.type == :FCALL
+ name = node.children.first
+ args = node.children.last.children.map do |item|
+ next if item.nil?
+
+ case item.type
+ when :HASH
+ Hash[*item.children.first.children.compact.map {|n| n.children.first }]
+ else
+ item.children.first
+ end
+ end.compact
+
+ case name
+ when :mod
+ add_module(args.shift, args.shift)
+ when :forge
+ # noop
+ when :moduledir
+ # noop
+ when :github
+ # oh boxen, you so silly.
+ # The order of the unpacking below *is* important.
+ modname = args.shift
+ version = args.shift
+ data = args.shift || {}
+
+ # this is gross but I'm not sure I actually care right now.
+ if (modname.is_a? String and [String, NilClass].include? version.class and data.is_a? Hash)
+ data[:boxen] = :boxen
+ data[:version] = version
+ add_module(modname, data)
+ else
+ $logger.warn "#{@repo[:repo_name]}/Puppetfile: malformed boxen"
+ end
+ else
+ # Should we record unexpected Ruby code or just log it to stdout?
+ args = args.map {|a| a.is_a?(String) ? "'#{a}'" : a}.join(', ')
+ $logger.warn "#{@repo[:repo_name]}/Puppetfile: Unexpected invocation of #{name}(#{args})"
+ end
+ end
+
+ node.children.each do |n|
+ next unless n.is_a? RubyVM::AbstractSyntaxTree::Node
+
+ traverse(n)
+ end
+ rescue => e
+ puts e.message
+ end
+ end
+
+ def test()
+ require 'pry'
+ binding.pry
+ end
+ end
+ end
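The parser's input/output contract, sketched below (requires Ruby 2.6+ for `RubyVM::AbstractSyntaxTree`; the Puppetfile content, repo name, and md5 are hypothetical):

```ruby
require 'mvp/puppetfile_parser'

content = <<~PUPPETFILE
  mod 'puppetlabs/stdlib', '8.1.0'
  mod 'apache', :git => 'https://github.com/puppetlabs/puppetlabs-apache.git', :tag => 'v7.0.0'
PUPPETFILE

parser = Mvp::PuppetfileParser.new
rows = parser.parse(:repo_name => 'acme/control-repo', :md5 => 'abc123', :content => content)
# each row carries :module, :type (:forge, :git, :svn, or :boxen), :source, and
# :version, plus the :repo_name and :md5 of the Puppetfile it came from
```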
data/lib/mvp/runner.rb CHANGED
@@ -1,6 +1,10 @@
- require 'mvp/downloader'
- require 'mvp/uploader'
+ require 'mvp/forge'
+ require 'mvp/bigquery'
  require 'mvp/stats'
+ require 'mvp/itemizer'
+ require 'mvp/puppetfile_parser'
+
+ require 'tty-spinner'

  class Mvp
  class Runner
@@ -11,52 +15,144 @@ class Mvp
  end

  def retrieve(target = :all, download = true)
- downloader = Mvp::Downloader.new(@options)
+ bigquery = Mvp::Bigquery.new(@options)

- [:authors, :modules, :releases].each do |thing|
- next unless [:all, thing].include? target
- downloader.retrieve(thing, download)
- end
+ begin
+ [:authors, :modules, :releases, :validations].each do |thing|
+ next unless [:all, thing].include? target
+ spinner = mkspinner("Retrieving #{thing} ...")
+ data = bigquery.retrieve(thing)
+ save_json(thing, data)
+ spinner.success('(OK)')
+ end

- if [:all, :validations].include? target
- downloader.validations()
+ rescue => e
+ spinner.error("API error: #{e.message}")
+ $logger.error "API error: #{e.message}"
+ $logger.debug e.backtrace.join("\n")
+ sleep 10
  end
  end

- def upload(target = :all)
- uploader = Mvp::Uploader.new(@options)
+ def mirror(target = :all)
+ forge = Mvp::Forge.new(@options)
+ bigquery = Mvp::Bigquery.new(@options)
+ itemizer = Mvp::Itemizer.new(@options)
+ pfparser = Mvp::PuppetfileParser.new(@options)

- [:authors, :modules, :releases, :validations, :github_mirrors].each do |thing|
- next unless [:all, thing].include? target
- uploader.send(thing)
+ begin
+ [:authors, :modules, :releases].each do |thing|
+ next unless [:all, thing].include? target
+ spinner = mkspinner("Mirroring #{thing}...")
+ bigquery.truncate(thing)
+ forge.retrieve(thing) do |data, offset|
+ spinner.update(title: "Mirroring #{thing} [#{offset}]...")
+ bigquery.insert(thing, data)
+ end
+ spinner.success('(OK)')
+ end
+
+ if [:all, :validations].include? target
+ spinner = mkspinner("Mirroring validations...")
+ modules = bigquery.get(:modules, [:slug])
+ bigquery.truncate(:validations)
+ forge.retrieve_validations(modules) do |data, offset|
+ spinner.update(title: "Mirroring validations [#{offset}]...")
+ bigquery.insert(:validations, data)
+ end
+ spinner.success('(OK)')
+ end
+
+ if [:all, :itemizations].include? target
+ spinner = mkspinner("Itemizing modules...")
+ bigquery.unitemized.each do |mod|
+ spinner.update(title: "Itemizing [#{mod[:slug]}]...")
+ rows = itemizer.itemized(mod)
+ bigquery.delete(:itemized, :module, mod[:slug])
+ bigquery.insert(:itemized, rows)
+ end
+ spinner.success('(OK)')
+ end
+
+ if [:all, :mirrors, :tables].include? target
+ @options[:gcloud][:mirrors].each do |entity|
+ spinner = mkspinner("Mirroring #{entity[:type]} #{entity[:name]} to BigQuery...")
+ bigquery.mirror_table(entity)
+ spinner.success('(OK)')
+ end
+ end
+
+ if [:all, :puppetfiles].include? target
+ spinner = mkspinner("Analyzing Puppetfile module references...")
+ if pfparser.suitable?
+ pfparser.sources = bigquery.module_sources
+ bigquery.puppetfiles.each do |repo|
+ spinner.update(title: "Analyzing [#{repo[:repo_name]}/Puppetfile]...")
+ rows = pfparser.parse(repo)
+ bigquery.delete(:puppetfile_usage, :repo_name, repo[:repo_name], :github)
+ bigquery.insert(:puppetfile_usage, rows, :github)
+ end
+ spinner.success('(OK)')
+ else
+ spinner.error("(Not functional on Ruby #{RUBY_VERSION})")
+ end
+ end
+
+ rescue => e
+ spinner.error("API error: #{e.message}")
+ $logger.error "API error: #{e.message}"
+ $logger.debug e.backtrace.join("\n")
+ sleep 10
  end
  end

- def mirror(target = :all)
- downloader = Mvp::Downloader.new(@options)
- uploader = Mvp::Uploader.new(@options)
+ def analyze
+ bigquery = Mvp::Bigquery.new(@options)
+ itemizer = Mvp::Itemizer.new(@options)

- # validations are downloaded with modules
- [:authors, :modules, :releases].each do |thing|
- next unless [:all, thing].include? target
- uploader.truncate(thing)
- downloader.mirror(thing, uploader)
- end
+ begin
+ spinner = mkspinner("Analyzing modules...")
+ modules = bigquery.get(:modules, [:owner, :name, :version, :downloads])
+ modules = modules.sample(@options[:count]) if @options[:count]
+
+ require 'csv'
+ csv_string = CSV.generate do |csv|
+ modules.each do |mod|
+ spinner.stop if @options[:debug]
+ rows = itemizer.analyze(mod, @options[:script], @options[:debug])
+ spinner.start if @options[:debug]
+
+ next unless rows
+ spinner.update(title: mod[:name])
+ rows.each {|row| csv << row}
+ end
+ end

- if [:all, :mirrors].include? target
- uploader.github_mirrors()
+ File.write(@options[:output_file], csv_string)
+ spinner.success('(OK)')
  end
  end

  def stats(target)
  stats = Mvp::Stats.new(@options)

- [:authors, :modules, :releases, :relationships, :github, :validations].each do |thing|
+ [:authors, :modules, :releases, :relationships, :validations].each do |thing|
  next unless [:all, thing].include? target
  stats.send(thing)
  end
  end

+ def mkspinner(title)
+ spinner = TTY::Spinner.new("[:spinner] :title")
+ spinner.update(title: title)
+ spinner.auto_spin
+ spinner
+ end
+
+ def save_json(thing, data)
+ File.write("#{@cachedir}/#{thing}.json", data.to_json)
+ end
+
  def test()
  require 'pry'
  binding.pry
data/lib/mvp/stats.rb CHANGED
@@ -19,7 +19,8 @@ class Mvp
  def draw_graph(series, width, title = nil)
  series.compact!
- graph = []
+ width = [width, series.size].min
+ graph = []
  (bins, freqs) = series.histogram(:bin_width => width)

  bins.each_with_index do |item, index|
@@ -44,6 +45,20 @@ class Mvp
  days_ago(datestr)/365
  end

+ def current_releases
+ return @current_releases if @current_releases
+
+ data_m = load('modules').reject {|m| m['owner'] == 'puppetlabs' }
+ data_r = load('releases').reject {|m| m['owner'] == 'puppetlabs' }
+
+ @current_releases = data_m.map {|mod|
+ name = mod['slug']
+ curr = mod['releases'].first
+
+ data_r.find {|r| r['slug'] == "#{name}-#{curr}" }
+ }.compact
+ end
+
  def tally_author_info(releases, target, scope='module_count')
  # update the author records with the fields we need
  target.each do |author|
@@ -52,7 +67,7 @@ class Mvp
  end

  releases.each do |mod|
- username = mod['module']['owner']['username']
+ username = mod['owner']
  score = mod['validation_score']
  author = target.select{|m| m['username'] == username}.first
@@ -111,9 +126,10 @@ class Mvp
  end

  def modules()
- data_m = load('modules').reject {|m| m['owner']['username'] == 'puppetlabs' }
+ data_m = load('modules').reject {|m| m['owner'] == 'puppetlabs' }
  data_a = load('authors').reject {|u| u['username'] == 'puppetlabs' or u['module_count'] == 0}
- current = data_m.map {|m| m['current_release'] }
+
+ current = current_releases

  tally_author_info(current, data_a, 'module_count')
@@ -155,7 +171,7 @@ class Mvp
  end

  def releases()
- data_r = load('releases').reject {|m| m['module']['owner']['username'] == 'puppetlabs' }
+ data_r = load('releases').reject {|m| m['owner'] == 'puppetlabs' }
  data_a = load('authors').reject {|u| u['username'] == 'puppetlabs' or u['module_count'] == 0}

  tally_author_info(data_r, data_a, 'release_count')
@@ -236,12 +252,12 @@ class Mvp
  end

  def relationships()
- data_m = load('modules').reject {|m| m['owner']['username'] == 'puppetlabs' }
  data_a = load('authors').reject {|u| u['username'] == 'puppetlabs' or u['module_count'] == 0}
- current = data_m.map {|m| m['current_release'] }
+ current = current_releases.dup

  current.each do |mod|
- mod['metadata']['dependants'] = []
+ mod['metadata'] = JSON.parse(mod['metadata'])
+ mod['metadata']['dependants'] = []
  end
  current.each do |mod|
  mod['metadata']['dependencies'].each do |dependency|
@@ -257,7 +273,7 @@ class Mvp
  count = mod['metadata']['dependants'].count
  next unless count > 0

- author = data_a.select{|m| m['username'] == mod['module']['owner']['username']}.first
+ author = data_a.select{|m| m['username'] == mod['owner']}.first
  author['dependants'] << count
  end
  data_a.each { |a| a['average_dependants'] = average(a['dependants']) }
@@ -280,6 +296,7 @@ class Mvp
  author['module_count'],
  author['release_count'] ]
  end
+ puts
  end

  def github()
@@ -328,7 +345,7 @@ class Mvp
  end

  def validations()
- puts 'got nothing for you yet'
+ puts 'No validations yet'
  end

  def test()
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: puppet-community-mvp
  version: !ruby/object:Gem::Version
- version: 0.0.3
+ version: 0.0.7
  platform: ruby
  authors:
  - Ben Ford
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2018-10-26 00:00:00.000000000 Z
+ date: 2021-08-16 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: json
@@ -109,7 +109,21 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
- name: google-cloud
+ name: google-cloud-bigquery
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ - !ruby/object:Gem::Dependency
+ name: puppet-itemize
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
@@ -137,13 +151,14 @@ files:
  - LICENSE
  - README.md
  - bin/mvp
+ - bin/pftest.rb
  - lib/mvp.rb
- - lib/mvp/downloader.rb
+ - lib/mvp/bigquery.rb
+ - lib/mvp/forge.rb
  - lib/mvp/itemizer.rb
- - lib/mvp/monkeypatches.rb
+ - lib/mvp/puppetfile_parser.rb
  - lib/mvp/runner.rb
  - lib/mvp/stats.rb
- - lib/mvp/uploader.rb
  homepage:
  licenses:
  - Apache 2
@@ -163,8 +178,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubyforge_project:
- rubygems_version: 2.6.10
+ rubygems_version: 3.0.3
  signing_key:
  specification_version: 4
  summary: Generate some stats about the Puppet Community.
data/lib/mvp/monkeypatches.rb DELETED
@@ -1,8 +0,0 @@
- # BigQuery uses newline delimited json
- # https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON
-
- class Array
- def to_newline_delimited_json
- self.map(&:to_json).join("\n")
- end
- end