errata 1.1.0 → 1.1.1
- data/CHANGELOG +6 -0
- data/{README.md → README.markdown} +62 -5
- data/lib/errata.rb +4 -2
- data/lib/errata/erratum.rb +3 -0
- data/lib/errata/version.rb +1 -1
- data/rfc_editor.png +0 -0
- data/test/test_errata.rb +57 -16
- metadata +4 -4
- data/.document +0 -5
data/CHANGELOG
CHANGED
data/{README.md → README.markdown}
CHANGED
@@ -1,8 +1,36 @@
 # errata
 
-
+Define an errata in table format (CSV) and then apply it to an arbitrary source. Inspired by RFC Errata, lets you keep your own errata in a transparent way.
 
-
+Tested in MRI 1.8.7+, MRI 1.9.2+, and JRuby 1.6.7+. Thread safe.
+
+## Real-world usage
+
+<p><a href="http://brighterplanet.com"><img src="https://s3.amazonaws.com/static.brighterplanet.com/assets/logos/flush-left/inline/green/rasterized/brighter_planet-160-transparent.png" alt="Brighter Planet logo"/></a></p>
+
+We use `errata` for [data science at Brighter Planet](http://brighterplanet.com/research) and in production at
+
+* [Brighter Planet's reference data web service](http://data.brighterplanet.com)
+* [Brighter Planet's impact estimate web service](http://impact.brighterplanet.com)
+
+The killer combination:
+
+1. [`active_record_inline_schema`](https://github.com/seamusabshere/active_record_inline_schema) - define table structure
+2. [`remote_table`](https://github.com/seamusabshere/remote_table) - download data and parse it
+3. [`errata`](https://github.com/seamusabshere/errata) (this library!) - apply corrections in a transparent way
+4. [`data_miner`](https://github.com/seamusabshere/remote_table) - import data idempotently
+
+## Inspiration
+
+There's a process for reporting errata on RFC:
+
+* [RFC Errata](http://www.rfc-editor.org/errata.php)
+* [Status and Type Descriptions for RFC Errata](http://www.rfc-editor.org/status_type_desc.html)
+* [How to report errata](http://www.rfc-editor.org/how_to_report.html)
+
+<p><a href="http://www.rfc-editor.org"><img src="https://github.com/seamusabshere/errata/raw/master/rfc_editor.png" alt="screenshot of the RFC Editor" /></a></p>
+
+## Example
 
 Every errata has a table structure based on the [IETF RFC Editor's "How to Report Errata"](http://www.rfc-editor.org/how_to_report.html).
 
@@ -167,15 +195,44 @@ And then used
 
 Assumes all input strings are UTF-8. Otherwise there can be problems with Ruby 1.9 and Regexp::FIXEDENCODING. Specifically, ASCII-8BIT regexps might be applied to UTF-8 strings (or vice-versa), resulting in Encoding::CompatibilityError.
 
-##
+## More advanced usage
+
+The [`earth` library](https://github.com/brighterplanet/earth) has dozens of real-life examples showing errata in action:
 
-
+<table>
+  <tr>
+    <th>Model</th>
+    <th>Reference</th>
+    <th>Errata file</th>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/countries">Country</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/country/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/country/wri_errata.csv">wri_errata.csv</a></td>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/aircraft">Aircraft</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/aircraft/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/aircraft/faa_errata.csv">faa_errata.csv</a></td>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/airports">Airports</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/airport/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/airport/openflights_errata.csv">openflights_errata.csv</a></td>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/automobile_make_model_year_variants">Automobile model variants</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/automobile/automobile_make_model_year_variant/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/automobile_make_model_year_variant/feg_errata.csv">feg_errata.csv</a></td>
+  </tr>
+</table>
 
 ## Authors
 
 * Seamus Abshere <seamus@abshere.net>
 * Andy Rossmeissl <andy@rossmeissl.net>
+* Ian Hough <ijhough@gmail.com>
 
 ## Copyright
 
-Copyright (c)
+Copyright (c) 2012 Brighter Planet. See LICENSE for details.
data/lib/errata.rb
CHANGED
@@ -23,13 +23,14 @@ class Errata
     options = options.symbolize_keys
 
     responder = options.delete :responder
-    raise "[errata] :responder is required" unless responder
     if responder.is_a?(::String)
       @lazy_load_responder_mutex = ::Mutex.new
       @lazy_load_responder_class_name = responder
-
+    elsif responder
       ::Kernel.warn %{[errata] Passing an object as :responder is deprecated. It's recommended to pass a class name instead, which will be constantized and instantiated with no arguments.}
       @responder = responder
+    else
+      @no_responder = true
     end
 
     if table = options.delete(:table)
@@ -52,6 +53,7 @@ class Errata
   end
 
   def responder
+    return if @no_responder == true
     @responder || @lazy_load_responder_mutex.synchronize do
       @responder ||= lazy_load_responder_class_name.constantize.new
     end
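The hunks above make `:responder` optional: a class name is instantiated lazily behind a mutex, an object is used as given, and a missing responder short-circuits the accessor. Here is a minimal, self-contained sketch of that pattern. It assumes a hypothetical `LazyResponder` wrapper (not part of the gem) and uses `Object.const_get` in place of ActiveSupport's `constantize`:

```ruby
# Hypothetical stand-in for the responder handling shown in the diff;
# the class name LazyResponder is illustrative and not the gem's API.
class LazyResponder
  def initialize(responder = nil)
    if responder.is_a?(String)
      # Class name given: defer instantiation until first use.
      @mutex = Mutex.new
      @class_name = responder
    elsif responder
      # Object given: use it directly.
      @responder = responder
    else
      # Nothing given: remember that so #responder can return nil early.
      @no_responder = true
    end
  end

  def responder
    return if @no_responder
    @responder || @mutex.synchronize do
      # Re-check inside the lock so only one thread instantiates the class.
      @responder ||= Object.const_get(@class_name).new
    end
  end
end

# Responder class modeled on the one in the test file, defined here
# only so the sketch runs on its own.
class ColoradoGuru
  def is_denver_airport?(record)
    record['name'].to_s.downcase.include?('denver')
  end
end
```

With these definitions, `LazyResponder.new('ColoradoGuru').responder` builds the guru on first access and reuses it afterwards, while `LazyResponder.new.responder` returns `nil`.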
data/lib/errata/erratum.rb
CHANGED
@@ -24,6 +24,9 @@ class Errata
       @matching_methods = options[:condition].split(SEMICOLON_DELIMITER).map do |method_id|
         method_id.strip.gsub(/\W/, '_').downcase + '?'
       end
+      if @matching_methods.any? and @responder.nil?
+        raise ::ArgumentError, %{[errata] Conditions like #{@matching_methods.first.inspect} used, but no :responder defined}
+      end
       @matching_expression = if options[:x].blank?
         nil
       elsif (options[:x].start_with?('/') or options[:x].start_with?('%r{')) and as_regexp = options[:x].as_regexp
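The guard added above fails fast when an erratum row names conditions but no responder is available. Below is a rough sketch of the same idea outside the gem. The `matching_methods_for` helper is hypothetical, and the literal `';'` is an assumption, since the diff shows only the constant name `SEMICOLON_DELIMITER`:

```ruby
# Hypothetical helper mirroring the normalization in the hunk above:
# each condition becomes a predicate method name to call on the responder.
def matching_methods_for(condition, responder)
  methods = condition.to_s.split(';').map do |method_id|
    # Trim, replace non-word characters with underscores, lowercase, add '?'.
    method_id.strip.gsub(/\W/, '_').downcase + '?'
  end
  # Conditions without a responder cannot be evaluated, so raise early.
  if methods.any? && responder.nil?
    raise ArgumentError, "[errata] Conditions like #{methods.first.inspect} used, but no :responder defined"
  end
  methods
end
```

Under these assumptions, a condition cell such as `Is Denver Airport` maps to the method name `is_denver_airport?`, which matches the predicate defined on `ColoradoGuru` in the test file.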
data/lib/errata/version.rb
CHANGED
data/rfc_editor.png
ADDED
Binary file
|
data/test/test_errata.rb
CHANGED
@@ -2,24 +2,65 @@ require 'helper'
 require 'models'
 
 describe Errata do
-
-
-
+  describe 'without responder' do
+    it "doesn't require a responder" do
+      e = Errata.new :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdHEtemF2YTZzdGRYbE1MTHFMRXpRUHc&single=true&gid=0&output=csv'
+      row = { 'name' => 'denver intl airport' }
+      e.correct! row
+      row['name'].must_equal 'denver International airport'
+    end
   end
-
-
-
-
-
-
-
-
-
+
+  describe 'with conditions' do
+    it "uses a responder to answer conditions" do
+      eval %{
+        class ColoradoGuru
+          def is_denver_airport?(record)
+            record['name'].to_s.downcase.include? 'denver'
+          end
+        end
+      }
+      e = Errata.new(
+        :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdG9PcFBjVnE4SGpLVXNTakVhSFY2VFE&single=true&gid=0&output=csv',
+        :responder => 'ColoradoGuru'
+      )
+      row = { 'name' => 'denver intl airport' }
+      e.correct! row
+      row['name'].must_equal 'denver International airport' # matched condition
+      row = { 'name' => 'madison intl airport' }
+      e.correct! row
+      row['name'].must_equal 'madison intl airport' # didn't match
+    end
+
+    it "blows up if you have conditions but no responder" do
+      e = Errata.new :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdG9PcFBjVnE4SGpLVXNTakVhSFY2VFE&single=true&gid=0&output=csv'
+      row = { 'name' => 'denver intl airport' }
+      lambda do
+        e.correct! row
+      end.must_raise ArgumentError, /conditions.*used/i
+    end
   end
+
+  describe 'to correct automobile model details' do
+    before do
+      @e = Errata.new :url => 'http://spreadsheets.google.com/pub?key=t9WkYT39zjrStx7ruCFrZJg',
+                      :responder => 'AutomobileVariantGuru'
+    end
+
+    it "corrects rows" do
+      alfa = { "carline_mfr_name"=>"ALFA ROMEO" }
+      @e.correct!(alfa)
+      alfa['carline_mfr_name'].must_equal 'Alfa Romeo'
+    end
 
-
-
-
-
+    it "rejects rows" do
+      @e.rejects?('carline_mfr_name' => 'AURORA CARS').must_equal true
+    end
+
+    it "tries multiple conditions" do
+      bentley = { 'carline_mfr_name' => 'ROLLS-ROYCE BENTLEY', "carline name" => 'Super Bentley' }
+      @e.correct!(bentley)
+      bentley['carline_mfr_name'].must_equal 'Bentley'
+    end
   end
 end
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: errata
 version: !ruby/object:Gem::Version
-  version: 1.1.
+  version: 1.1.1
 prerelease:
 platform: ruby
 authors:
@@ -10,7 +10,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-05-
+date: 2012-05-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activesupport
@@ -67,12 +67,11 @@ executables: []
 extensions: []
 extra_rdoc_files: []
 files:
-- .document
 - .gitignore
 - CHANGELOG
 - Gemfile
 - LICENSE
-- README.
+- README.markdown
 - Rakefile
 - errata.gemspec
 - lib/errata.rb
@@ -84,6 +83,7 @@ files:
 - lib/errata/erratum/transform.rb
 - lib/errata/erratum/truncate.rb
 - lib/errata/version.rb
+- rfc_editor.png
 - test/helper.rb
 - test/models.rb
 - test/test_errata.rb