compare_compressors 0.0.1
- checksums.yaml +7 -0
- data/README.md +132 -0
- data/bin/compare_compressors +6 -0
- data/lib/compare_compressors.rb +40 -0
- data/lib/compare_compressors/command_line_interface.rb +223 -0
- data/lib/compare_compressors/comparer.rb +70 -0
- data/lib/compare_compressors/compressor.rb +150 -0
- data/lib/compare_compressors/compressors/brotli_compressor.rb +43 -0
- data/lib/compare_compressors/compressors/bzip2_compressor.rb +37 -0
- data/lib/compare_compressors/compressors/gzip_compressor.rb +34 -0
- data/lib/compare_compressors/compressors/seven_zip_compressor.rb +43 -0
- data/lib/compare_compressors/compressors/xz_compressor.rb +37 -0
- data/lib/compare_compressors/compressors/zstd_compressor.rb +37 -0
- data/lib/compare_compressors/cost_model.rb +55 -0
- data/lib/compare_compressors/costed_group_result.rb +87 -0
- data/lib/compare_compressors/group_result.rb +62 -0
- data/lib/compare_compressors/plotter.rb +164 -0
- data/lib/compare_compressors/plotters/cost_plotter.rb +90 -0
- data/lib/compare_compressors/plotters/raw_plotter.rb +61 -0
- data/lib/compare_compressors/plotters/size_plotter.rb +76 -0
- data/lib/compare_compressors/result.rb +81 -0
- data/lib/compare_compressors/version.rb +8 -0
- data/test/compare_compressors/compare_compressors_test.rb +271 -0
- metadata +101 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA1:
  metadata.gz: 8962f01ee986bcfd6c301100f04565ae6844d4ed
  data.tar.gz: e1e02bb86cc4bc150c599f439e59a06718c5be9d
SHA512:
  metadata.gz: 41d405711e8947af177e741d819df1416e24838ca77ef672571e811d6419e21e0c732b0feba92d928acee5becac5f5e4dda70ef64e872d96d9d3ed18edbb1812
  data.tar.gz: eb4ce9b89346ee291e9778ea5450180772c2788792a2272bd8757a21fd018e1702a2dfec5fe1ebc053f04784132fd6d57ad0d59e272c38c029c4925eb1add646
data/README.md
ADDED
@@ -0,0 +1,132 @@
# compare_compressors

https://github.com/jdleesmiller/compare_compressors

[![Build Status](https://travis-ci.org/jdleesmiller/compare_compressors.svg?branch=master)](https://travis-ci.org/jdleesmiller/compare_compressors)

## Synopsis

Evaluate different compression tools and their settings by running them on a sample of data.

See [this blog post for an example of how to use the tool](TODO).

### Usage

This utility has many system dependencies, so the easiest way to run it is via Docker:

```shell
$ docker pull jdleesmiller/compare_compressors
```

Generally you will run a `compare` step, followed by a `plot` or `summarize` step.

#### Compare

This step runs the compressors on the sample files and saves the results to a CSV file. Assuming that your sample files are in a folder called `data` in the current directory, and they are called `test1`, `test2`, etc., the command would look like:

```
docker run --rm \
  --volume `pwd`/data:/home/app/compare_compressors/data:ro \
  --volume /tmp:/tmp \
  jdleesmiller/compare_compressors compare data/test* >data/compare.csv
```

where:

- The `--rm` flag tells Docker to remove the container when it's finished.

- The ```--volume `pwd`/data:/home/app/compare_compressors/data:ro``` flag mounts `./data` on the host inside the container, so the utility can access the sample files. The trick here is that `/home/app/compare_compressors` is the utility's working directory inside the container, so the relative paths `data/test*` for the sample files will be the same both inside and outside of the container. The `:ro` makes it a read-only mount; this is optional, but it provides added assurance that the utility won't change your data files.

- The `--volume /tmp:/tmp` flag is optional but may improve performance. The utility does its compression and decompression in `/tmp` inside the container, and all of the writes inside the container go through Docker's union file system. By mounting `/tmp` on the host, we bypass the union file system. (Ideally, we'd just do this in the Dockerfile, but unfortunately it's 10x slower on Docker for Mac; hopefully that will improve soon.)

#### Plot

Once you've generated a CSV file with results, the tool can read it and generate a `gnuplot` script to plot the results. Note that you need to have `gnuplot` installed on the host for this to work.

There are several plotting commands. The `plot` command gives you a 2D plot of compression time vs compressed size; pass its `--decompression` option to plot decompression time vs compressed size instead.

```
docker run --rm \
  --volume `pwd`/data:/home/app/compare_compressors/data:ro \
  --volume /tmp:/tmp \
  jdleesmiller/compare_compressors plot data/compare.csv | gnuplot
```

The `plot_costs` command takes three cost coefficients: cost per GiB of stored data, cost per hour to run the compression program, and cost per hour to run the decompression program. The program then computes a simple linear cost function. To keep the plot in 2D, the two time costs are added together.

```
docker run --rm \
  --volume `pwd`/data:/home/app/compare_compressors/data:ro \
  jdleesmiller/compare_compressors plot_costs \
  --gibyte-cost 56.05 \
  --compression-hour-cost 32.35 \
  --decompression-hour-cost 177.91 \
  --currency '£' \
  data/compare.csv | gnuplot
```
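
The linear cost function described above can be sketched in Ruby as follows. This is only an illustration, not the gem's `CostModel`: the method name `total_cost`, its keyword arguments, and the sample size and timings are hypothetical; only the three coefficients are taken from the example command.

```ruby
# Hypothetical sketch of a linear cost function; not the gem's actual code.
GIB = 1024.0**3           # bytes per GiB
SECONDS_PER_HOUR = 3600.0

# Returns [size_cost, time_cost, total], all in the chosen currency.
def total_cost(size_bytes:, compression_seconds:, decompression_seconds:,
               gibyte_cost:, compression_hour_cost:, decompression_hour_cost:)
  size_cost = gibyte_cost * size_bytes / GIB
  # The two time costs are added together, which keeps a cost plot in 2D.
  time_cost =
    compression_hour_cost * compression_seconds / SECONDS_PER_HOUR +
    decompression_hour_cost * decompression_seconds / SECONDS_PER_HOUR
  [size_cost, time_cost, size_cost + time_cost]
end

# Made-up measurements for one compressor setting.
size_cost, time_cost, total = total_cost(
  size_bytes: 2 * 1024**3,      # 2 GiB of compressed output
  compression_seconds: 1800,    # 0.5 hours
  decompression_seconds: 360,   # 0.1 hours
  gibyte_cost: 56.05,
  compression_hour_cost: 32.35,
  decompression_hour_cost: 177.91
)
puts format('size %.2f + time %.2f = total %.2f', size_cost, time_cost, total)
```

Because the function is linear, doubling the data size or either running time doubles the corresponding term, so the coefficients can be read as prices per unit.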

#### Summarize

Print a list of the compressors and settings in descending order by cost. The cost function is of the same form as for `plot_costs`.

```
docker run --rm \
  --volume `pwd`/data:/home/app/compare_compressors/data \
  jdleesmiller/compare_compressors summarize \
  --gibyte-cost 56.05 \
  --compression-hour-cost 32.35 \
  --decompression-hour-cost 177.91 \
  --currency '£' \
  data/compare.csv
```

## Requirements

A Linux-like `/usr/bin/time` utility is required, along with several system packages. See the [Dockerfile](Dockerfile) for a list of the packages that this utility depends on. To make the plots, you will also need `gnuplot`.

If you are installing natively, without Docker, you will need Ruby; then install the gem:

```
$ gem install compare_compressors
```

## Development

For development, you will probably want to (1) override the default entrypoint and (2) mount the application root inside the container. To run the tests, for example:

```
docker run --rm -it --entrypoint='' \
  --volume=compare_compressors_bundle:/home/app/compare_compressors/.bundle \
  --volume=`pwd`:/home/app/compare_compressors \
  compare_compressors bundle exec rake
```

The only caveat is that you need to preserve the `.bundle` folder inside the container by mounting it as a volume; the above command does this using a named volume, `compare_compressors_bundle`, which will persist between runs and be easier to identify in the `docker volume ls` output.

## Related

- See [this blog post for an example of how to use the tool](TODO).
- For many more compression algorithms, see https://quixdb.github.io/squash-benchmark/

## License

(The MIT License)

Copyright (c) 2017 John Lees-Miller

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
'Software'), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/lib/compare_compressors.rb
ADDED
@@ -0,0 +1,40 @@
# frozen_string_literal: true

require 'English'

require_relative 'compare_compressors/version'

require_relative 'compare_compressors/comparer'
require_relative 'compare_compressors/cost_model'
require_relative 'compare_compressors/result'
require_relative 'compare_compressors/group_result'
require_relative 'compare_compressors/costed_group_result'

require_relative 'compare_compressors/plotter'
require_relative 'compare_compressors/plotters/cost_plotter'
require_relative 'compare_compressors/plotters/raw_plotter'
require_relative 'compare_compressors/plotters/size_plotter'

require_relative 'compare_compressors/compressor'
require_relative 'compare_compressors/compressors/brotli_compressor'
require_relative 'compare_compressors/compressors/bzip2_compressor'
require_relative 'compare_compressors/compressors/gzip_compressor'
require_relative 'compare_compressors/compressors/seven_zip_compressor'
require_relative 'compare_compressors/compressors/xz_compressor'
require_relative 'compare_compressors/compressors/zstd_compressor'

require_relative 'compare_compressors/command_line_interface'

#
# Compare compression algorithms.
#
module CompareCompressors
  COMPRESSORS = [
    BrotliCompressor,
    Bzip2Compressor,
    GzipCompressor,
    SevenZipCompressor,
    XzCompressor,
    ZstdCompressor
  ].map(&:new).freeze
end
data/lib/compare_compressors/command_line_interface.rb
ADDED
@@ -0,0 +1,223 @@
# frozen_string_literal: true

require 'csv'
require 'thor'

module CompareCompressors
  #
  # Handle generic command line options and run the relevant command.
  #
  class CommandLineInterface < Thor
    desc \
      'version',
      'print version (also available as --version)'
    def version
      puts "compare_compressors-#{CompareCompressors::VERSION}"
      COMPRESSORS.each do |compressor|
        puts format('%10s: %s', compressor.name, compressor.version || '?')
      end
    end
    map %w(--version -v) => :version

    desc \
      'compare <target files>',
      'Run compression tools on targets and write a CSV'
    def compare(*targets)
      CSV do |csv|
        Comparer.new.run(csv, COMPRESSORS, targets)
      end
    end

    class <<self
      def scale_option
        option \
          :scale,
          type: :numeric,
          desc: 'scale factor from sample targets to full dataset',
          default: 1.0
      end

      def use_cpu_time_option
        option \
          :use_cpu_time,
          type: :boolean,
          desc: 'use CPU time rather than elapsed time',
          default: CostModel::DEFAULT_USE_CPU_TIME
      end

      def cost_options
        option \
          :gibyte_cost,
          type: :numeric,
          desc: 'storage cost per gigabyte of compressed output',
          default: CostModel::DEFAULT_GIBYTE_COST
        option \
          :compression_hour_cost,
          type: :numeric,
          desc: 'compute cost per hour of CPU time for compression',
          default: CostModel::DEFAULT_HOUR_COST
        option \
          :decompression_hour_cost,
          type: :numeric,
          desc: 'compute cost per hour of CPU time for decompression',
          default: CostModel::DEFAULT_HOUR_COST
        option \
          :currency,
          type: :string,
          desc: 'currency symbol for display',
          default: CostModel::DEFAULT_CURRENCY
      end

      def plot_options
        option \
          :terminal,
          desc: 'the terminal line for gnuplot',
          default: Plotter::DEFAULT_TERMINAL
        option \
          :output,
          desc: 'the output name for gnuplot',
          default: Plotter::DEFAULT_OUTPUT
        option \
          :pareto_only,
          desc: 'plot only non-dominated compressor-level pairs',
          type: :boolean,
          default: true
        option \
          :logscale_size,
          desc: 'use a log10 scale for the size (lucky you if you need this)',
          type: :boolean,
          default: Plotter::DEFAULT_LOGSCALE_SIZE
        option \
          :autoscale_fix,
          desc: 'zoom axes to fit the points tightly',
          type: :boolean,
          default: Plotter::DEFAULT_AUTOSCALE_FIX
        option \
          :show_labels,
          desc: 'show compression level labels on the plot',
          type: :boolean,
          default: Plotter::DEFAULT_SHOW_LABELS
        option \
          :lmargin,
          desc: 'adjust lmargin (workaround if y label is cut off on png)',
          type: :numeric,
          default: Plotter::DEFAULT_LMARGIN
        option \
          :title,
          desc: 'main title (must not contain double quotes)',
          type: :string,
          default: Plotter::DEFAULT_TITLE
      end
    end

    desc \
      'plot [csv file]',
      'Write a gnuplot script for a basic 2D plot with the CSV from compare'
    scale_option
    use_cpu_time_option
    plot_options
    option \
      :decompression,
      desc: 'show decompression time instead of compression time',
      type: :boolean,
      default: SizePlotter::DEFAULT_DECOMPRESSION
    def plot(csv_file = nil)
      results = read_results(csv_file)
      group_results = GroupResult.group(results, scale: options[:scale])
      plotter = make_plotter(
        SizePlotter, options,
        decompression: options[:decompression]
      )
      plotter.plot(group_results, pareto_only: options[:pareto_only])
    end

    desc \
      'plot_3d [csv file]',
      'Write a gnuplot script for a 3D plot with the CSV from compare'
    scale_option
    use_cpu_time_option
    plot_options
    def plot_3d(csv_file = nil)
      results = read_results(csv_file)
      group_results = GroupResult.group(results, scale: options[:scale])
      plotter = make_plotter(RawPlotter, options)
      plotter.plot(group_results, pareto_only: options[:pareto_only])
    end

    desc \
      'plot_costs [csv file]',
      'Write a gnuplot script for a 2D cost plot with the CSV from compare'
    scale_option
    use_cpu_time_option
    cost_options
    plot_options
    option \
      :show_cost_contours,
      desc: 'show cost function contours',
      type: :boolean,
      default: CostPlotter::DEFAULT_SHOW_COST_CONTOURS
    def plot_costs(csv_file = nil)
      results = read_results(csv_file)
      group_results = GroupResult.group(results, scale: options[:scale])
      cost_model = make_cost_model(options)
      costed_group_results =
        CostedGroupResult.from_group_results(cost_model, group_results)
      plotter = make_plotter(
        CostPlotter, options, cost_model,
        show_cost_contours: options[:show_cost_contours]
      )
      plotter.plot(costed_group_results, pareto_only: options[:pareto_only])
    end

    desc \
      'summarize [csv file]',
      'Read CSV from compare and write out a summary'
    scale_option
    use_cpu_time_option
    cost_options
    option \
      :top,
      desc: 'number of results to include',
      type: :numeric,
      default: CostModel::DEFAULT_SUMMARIZE_TOP
    def summarize(csv_file = nil)
      results = read_results(csv_file)
      group_results = GroupResult.group(results, scale: options[:scale])
      cost_model = make_cost_model(options)
      costed_group_results =
        CostedGroupResult.from_group_results(cost_model, group_results)
      puts cost_model.summarize(costed_group_results, options[:top])
    end

    private

    def make_cost_model(options)
      CostModel.new(
        gibyte_cost: options[:gibyte_cost],
        compression_hour_cost: options[:compression_hour_cost],
        decompression_hour_cost: options[:decompression_hour_cost],
        use_cpu_time: options[:use_cpu_time],
        currency: options[:currency]
      )
    end

    def make_plotter(klass, options, *args, **kwargs)
      klass.new(
        *args,
        terminal: options[:terminal],
        output: options[:output],
        logscale_size: options[:logscale_size],
        autoscale_fix: options[:autoscale_fix],
        show_labels: options[:show_labels],
        lmargin: options[:lmargin],
        title: options[:title],
        use_cpu_time: options[:use_cpu_time],
        **kwargs
      )
    end

    def read_results(csv_file)
      Result.read_csv(csv_file ? File.read(csv_file) : STDIN)
    end
  end
end
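
The `pareto_only` option above restricts the plots to non-dominated compressor-level pairs. As a rough illustration of what "non-dominated" means (a sketch, not the gem's implementation; the `Point` struct, the method names, and the restriction to two metrics are all assumptions), one setting dominates another when it is no worse on every metric and strictly better on at least one:

```ruby
# Hypothetical two-metric example; the gem may compare more dimensions.
Point = Struct.new(:name, :seconds, :bytes)

# True if `a` is at least as good as `b` on both metrics and better on one.
def dominates?(a, b)
  a.seconds <= b.seconds && a.bytes <= b.bytes &&
    (a.seconds < b.seconds || a.bytes < b.bytes)
end

# Keep only the points that no other point dominates (simple O(n^2) scan).
def pareto_front(points)
  points.reject do |candidate|
    points.any? { |other| dominates?(other, candidate) }
  end
end

# Made-up numbers, for illustration only.
points = [
  Point.new('gzip -1', 1.0, 500),
  Point.new('gzip -9', 4.0, 450),
  Point.new('bzip2 -9', 6.0, 460), # slower and larger than 'gzip -9'
  Point.new('xz -6', 9.0, 400)
]
front = pareto_front(points)
puts front.map(&:name).join(', ')
```

With these made-up numbers, `bzip2 -9` is dropped because `gzip -9` is both faster and smaller; the remaining three settings each represent a different trade-off between time and size.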
data/lib/compare_compressors/comparer.rb
ADDED
@@ -0,0 +1,70 @@
# frozen_string_literal: true

require 'digest'
require 'fileutils'
require 'tmpdir'

module CompareCompressors
  #
  # Run compressors on targets and record the results.
  #
  # The general approach is, for each target:
  #
  # 1. Copy the original target (read only) to a temporary folder (read write);
  #    the copy is the 'work target'.
  # 2. Hash the work target so we can make sure we don't change it.
  # 3. For each compressor and level, compress the work target.
  # 4. Remove the work target (if the compressor left it)
  # 5. Decompress the compressed target; this should restore the work target.
  # 6. Check the work target's hash before we start the next compressor or
  #    level, to make sure the compression hasn't broken it somehow.
  #
  # This approach is a bit complicated, but it lets us (1) make sure we don't
  # change the original targets, since they're copied, (2) make sure we
  # don't accidentally change the work target during the run, which would
  # invalidate the results, and (3) avoid copying the work target from the
  # target repeatedly.
  #
  class Comparer
    #
    # @param [CSV] csv CSV writer for output
    # @param [Array<Compressor>] compressors
    # @param [Array<String>] targets pathnames of targets (read only)
    #
    def run(csv, compressors, targets)
      csv << Result.members
      targets.each do |target|
        Dir.mktmpdir do |tmp|
          work_target = stage_target(tmp, target)
          evaluate_target(csv, compressors, target, work_target)
        end
      end
      nil
    end

    private

    def stage_target(tmp, target)
      pathname = File.join(tmp, 'data')
      FileUtils.cp target, pathname
      pathname
    end

    def evaluate_target(csv, compressors, target, work_target)
      target_digest = find_digest(work_target)
      compressors.each do |compressor|
        compressor.levels.each do |level|
          if find_digest(work_target) != target_digest
            raise "digest mismatch: #{compressor.name}" \
              " level #{level} on #{target}"
          end
          csv << compressor.evaluate(target, work_target, level)
        end
      end
    end

    def find_digest(pathname)
      Digest::SHA256.file(pathname).hexdigest
    end
  end
end
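
The stage, hash, compress, decompress, and verify cycle described in the `Comparer` comments can be tried out with the Ruby standard library alone. This is a sketch under stated assumptions: it uses `Zlib` in-process as a stand-in for the external compressor programs that the gem runs, and the helper name `round_trip_ok?` and the `.z` suffix are hypothetical.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'
require 'zlib'

# Hypothetical helper: returns true if compressing and then decompressing a
# staged copy of `target` restores a byte-identical work target.
def round_trip_ok?(target, level)
  Dir.mktmpdir do |tmp|
    # 1. Stage a copy, so the original target is never touched.
    work_target = File.join(tmp, 'data')
    FileUtils.cp target, work_target
    # 2. Hash the work target before doing anything to it.
    digest_before = Digest::SHA256.file(work_target).hexdigest

    # 3. Compress the work target (Zlib stands in for an external tool).
    compressed = "#{work_target}.z"
    data = File.binread(work_target)
    File.binwrite(compressed, Zlib::Deflate.deflate(data, level))
    # 4. Remove the work target, in case the compressor left it.
    FileUtils.rm_f work_target
    # 5. Decompress; this should restore the work target.
    File.binwrite(work_target, Zlib::Inflate.inflate(File.binread(compressed)))

    # 6. The digests must match before the next compressor or level runs.
    Digest::SHA256.file(work_target).hexdigest == digest_before
  end
end

Dir.mktmpdir do |dir|
  sample = File.join(dir, 'sample.txt')
  File.write(sample, 'hello compressors ' * 1000)
  (1..9).each do |level|
    raise "digest mismatch at level #{level}" unless round_trip_ok?(sample, level)
  end
  puts 'all levels round-trip cleanly'
end
```

Unlike this sketch, `Comparer` stages each target once and reuses the same work target across all compressors and levels, which is why it re-checks the digest before every evaluation rather than copying the target again.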