errata 1.1.0 → 1.1.1
- data/CHANGELOG +6 -0
- data/{README.md → README.markdown} +62 -5
- data/lib/errata.rb +4 -2
- data/lib/errata/erratum.rb +3 -0
- data/lib/errata/version.rb +1 -1
- data/rfc_editor.png +0 -0
- data/test/test_errata.rb +57 -16
- metadata +4 -4
- data/.document +0 -5
data/CHANGELOG
CHANGED
data/{README.md → README.markdown}
@@ -1,8 +1,36 @@
 # errata
 
-
+Define an errata in table format (CSV) and then apply it to an arbitrary source. Inspired by RFC Errata, lets you keep your own errata in a transparent way.
 
-
+Tested in MRI 1.8.7+, MRI 1.9.2+, and JRuby 1.6.7+. Thread safe.
+
+## Real-world usage
+
+<p><a href="http://brighterplanet.com"><img src="https://s3.amazonaws.com/static.brighterplanet.com/assets/logos/flush-left/inline/green/rasterized/brighter_planet-160-transparent.png" alt="Brighter Planet logo"/></a></p>
+
+We use `errata` for [data science at Brighter Planet](http://brighterplanet.com/research) and in production at
+
+* [Brighter Planet's reference data web service](http://data.brighterplanet.com)
+* [Brighter Planet's impact estimate web service](http://impact.brighterplanet.com)
+
+The killer combination:
+
+1. [`active_record_inline_schema`](https://github.com/seamusabshere/active_record_inline_schema) - define table structure
+2. [`remote_table`](https://github.com/seamusabshere/remote_table) - download data and parse it
+3. [`errata`](https://github.com/seamusabshere/errata) (this library!) - apply corrections in a transparent way
+4. [`data_miner`](https://github.com/seamusabshere/remote_table) - import data idempotently
+
+## Inspiration
+
+There's a process for reporting errata on RFC:
+
+* [RFC Errata](http://www.rfc-editor.org/errata.php)
+* [Status and Type Descriptions for RFC Errata](http://www.rfc-editor.org/status_type_desc.html)
+* [How to report errata](http://www.rfc-editor.org/how_to_report.html)
+
+<p><a href="http://www.rfc-editor.org"><img src="https://github.com/seamusabshere/errata/raw/master/rfc_editor.png" alt="screenshot of the RFC Editor" /></a></p>
+
+## Example
 
 Every errata has a table structure based on the [IETF RFC Editor's "How to Report Errata"](http://www.rfc-editor.org/how_to_report.html).
 
@@ -167,15 +195,44 @@ And then used
 
 Assumes all input strings are UTF-8. Otherwise there can be problems with Ruby 1.9 and Regexp::FIXEDENCODING. Specifically, ASCII-8BIT regexps might be applied to UTF-8 strings (or vice-versa), resulting in Encoding::CompatibilityError.
 
-##
+## More advanced usage
+
+The [`earth` library](https://github.com/brighterplanet/earth) has dozens of real-life examples showing errata in action:
 
-
+<table>
+  <tr>
+    <th>Model</th>
+    <th>Reference</th>
+    <th>Errata file</th>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/countries">Country</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/country/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/country/wri_errata.csv">wri_errata.csv</a></td>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/aircraft">Aircraft</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/aircraft/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/aircraft/faa_errata.csv">faa_errata.csv</a></td>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/airports">Airports</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/airport/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/airport/openflights_errata.csv">openflights_errata.csv</a></td>
+  </tr>
+  <tr>
+    <td><a href="http://data.brighterplanet.com/automobile_make_model_year_variants">Automobile model variants</a></td>
+    <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/automobile/automobile_make_model_year_variant/data_miner.rb">data_miner.rb</a></td>
+    <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/automobile_make_model_year_variant/feg_errata.csv">feg_errata.csv</a></td>
+  </tr>
+</table>
 
 ## Authors
 
 * Seamus Abshere <seamus@abshere.net>
 * Andy Rossmeissl <andy@rossmeissl.net>
+* Ian Hough <ijhough@gmail.com>
 
 ## Copyright
 
-Copyright (c)
+Copyright (c) 2012 Brighter Planet. See LICENSE for details.
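The README's UTF-8 caveat above can be reproduced in isolation. The following is a minimal sketch that uses only core Ruby (1.9 or later), not the `errata` API; the sample string and variable names are assumptions chosen for illustration. It applies a regexp built from an ASCII-8BIT source to a UTF-8 string, which should raise `Encoding::CompatibilityError`, and then shows that re-tagging the pattern source as UTF-8 avoids the mismatch.

```ruby
# encoding: utf-8

# A UTF-8 input string containing a non-ASCII character (illustrative data).
utf8_text = "Z\u00FCrich airport"

# The two bytes that encode that character in UTF-8, tagged as ASCII-8BIT.
binary_source = "\xC3\xBC".dup.force_encoding(Encoding::ASCII_8BIT)
binary_pattern = Regexp.new(binary_source)

# Matching a fixed-encoding binary regexp against non-ASCII UTF-8 text fails.
error = begin
  binary_pattern =~ utf8_text
  nil
rescue Encoding::CompatibilityError => e
  e
end

# Re-tagging the pattern source as UTF-8 before compiling avoids the mismatch.
utf8_pattern = Regexp.new(binary_source.dup.force_encoding(Encoding::UTF_8))
match_position = utf8_pattern =~ utf8_text

puts "binary regexp has fixed encoding: #{binary_pattern.fixed_encoding?}"
puts "mismatch error class: #{error.class}"
puts "UTF-8 match position: #{match_position.inspect}"
```

Normalizing both the errata table and the input rows to UTF-8 before calling into the library is one way to stay clear of this error.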
data/lib/errata.rb
CHANGED
@@ -23,13 +23,14 @@ class Errata
     options = options.symbolize_keys
 
     responder = options.delete :responder
-    raise "[errata] :responder is required" unless responder
     if responder.is_a?(::String)
       @lazy_load_responder_mutex = ::Mutex.new
       @lazy_load_responder_class_name = responder
-
+    elsif responder
       ::Kernel.warn %{[errata] Passing an object as :responder is deprecated. It's recommended to pass a class name instead, which will be constantized and instantiated with no arguments.}
       @responder = responder
+    else
+      @no_responder = true
     end
 
     if table = options.delete(:table)
@@ -52,6 +53,7 @@ class Errata
   end
 
   def responder
+    return if @no_responder == true
     @responder || @lazy_load_responder_mutex.synchronize do
       @responder ||= lazy_load_responder_class_name.constantize.new
     end
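The hunks above make `:responder` optional and keep the lazy, mutex-guarded instantiation for the class-name case. The following is a minimal standalone sketch of that pattern, not the library's code: the class name `ResponderHolder` is hypothetical, and `Object.const_get` stands in for ActiveSupport's `constantize` so the example runs without dependencies.

```ruby
require 'thread'

# Hypothetical holder class illustrating an optional, lazily loaded responder.
class ResponderHolder
  def initialize(responder_class_name = nil)
    if responder_class_name
      @mutex = Mutex.new
      @responder_class_name = responder_class_name
    else
      # No responder configured: #responder will simply return nil.
      @no_responder = true
    end
  end

  def responder
    return if @no_responder
    # Instantiate at most once, even if several threads ask at the same time.
    @responder || @mutex.synchronize do
      @responder ||= Object.const_get(@responder_class_name).new
    end
  end
end

# Responder class borrowed from the test file in this diff.
class ColoradoGuru
  def is_denver_airport?(record)
    record['name'].to_s.downcase.include? 'denver'
  end
end

with_responder = ResponderHolder.new('ColoradoGuru')
without_responder = ResponderHolder.new

puts with_responder.responder.class
puts without_responder.responder.inspect
```

Passing a class name rather than an instance defers instantiation until the first condition needs to be checked, which is presumably why the library deprecates passing an object.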
data/lib/errata/erratum.rb
CHANGED
@@ -24,6 +24,9 @@ class Errata
       @matching_methods = options[:condition].split(SEMICOLON_DELIMITER).map do |method_id|
         method_id.strip.gsub(/\W/, '_').downcase + '?'
       end
+      if @matching_methods.any? and @responder.nil?
+        raise ::ArgumentError, %{[errata] Conditions like #{@matching_methods.first.inspect} used, but no :responder defined}
+      end
       @matching_expression = if options[:x].blank?
         nil
       elsif (options[:x].start_with?('/') or options[:x].start_with?('%r{')) and as_regexp = options[:x].as_regexp
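The hunk above turns each entry in the condition column into a predicate method name and now fails early when no responder is available to answer it. A minimal sketch of that mapping follows. The helper names `matching_methods` and `check_conditions!` are hypothetical, and splitting on a plain `';'` is an assumption, since the value of `SEMICOLON_DELIMITER` is not shown in this diff.

```ruby
# Hypothetical helper: turn a condition cell into predicate method names,
# using the same strip/gsub/downcase chain as the hunk above.
def matching_methods(condition)
  condition.to_s.split(';').map do |method_id|
    method_id.strip.gsub(/\W/, '_').downcase + '?'
  end
end

# Hypothetical helper: mirror the new guard that rejects conditions
# when there is no responder to answer them.
def check_conditions!(condition, responder)
  methods = matching_methods(condition)
  if methods.any? && responder.nil?
    raise ArgumentError, %{[errata] Conditions like #{methods.first.inspect} used, but no :responder defined}
  end
  methods
end

p matching_methods('Is Denver airport; is intl')

begin
  check_conditions!('Is Denver airport', nil)
rescue ArgumentError => e
  puts e.message
end
```

An empty condition cell yields no method names, so rows without conditions keep working when no responder is given.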
data/lib/errata/version.rb
CHANGED
data/rfc_editor.png
ADDED
Binary file
data/test/test_errata.rb
CHANGED
@@ -2,24 +2,65 @@ require 'helper'
 require 'models'
 
 describe Errata do
-
-
-
+  describe 'without responder' do
+    it "doesn't require a responder" do
+      e = Errata.new :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdHEtemF2YTZzdGRYbE1MTHFMRXpRUHc&single=true&gid=0&output=csv'
+      row = { 'name' => 'denver intl airport' }
+      e.correct! row
+      row['name'].must_equal 'denver International airport'
+    end
   end
-
-
-
-
-
-
-
-
-
+
+  describe 'with conditions' do
+    it "uses a responder to answer conditions" do
+      eval %{
+        class ColoradoGuru
+          def is_denver_airport?(record)
+            record['name'].to_s.downcase.include? 'denver'
+          end
+        end
+      }
+      e = Errata.new(
+        :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdG9PcFBjVnE4SGpLVXNTakVhSFY2VFE&single=true&gid=0&output=csv',
+        :responder => 'ColoradoGuru'
+      )
+      row = { 'name' => 'denver intl airport' }
+      e.correct! row
+      row['name'].must_equal 'denver International airport' # matched condition
+      row = { 'name' => 'madison intl airport' }
+      e.correct! row
+      row['name'].must_equal 'madison intl airport' # didn't match
+    end
+
+    it "blows up if you have conditions but no responder" do
+      e = Errata.new :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdG9PcFBjVnE4SGpLVXNTakVhSFY2VFE&single=true&gid=0&output=csv'
+      row = { 'name' => 'denver intl airport' }
+      lambda do
+        e.correct! row
+      end.must_raise ArgumentError, /conditions.*used/i
+    end
   end
+
+  describe 'to correct automobile model details' do
+    before do
+      @e = Errata.new :url => 'http://spreadsheets.google.com/pub?key=t9WkYT39zjrStx7ruCFrZJg',
+                      :responder => 'AutomobileVariantGuru'
+    end
+
+    it "corrects rows" do
+      alfa = { "carline_mfr_name"=>"ALFA ROMEO" }
+      @e.correct!(alfa)
+      alfa['carline_mfr_name'].must_equal 'Alfa Romeo'
+    end
 
-
-
-
-
+    it "rejects rows" do
+      @e.rejects?('carline_mfr_name' => 'AURORA CARS').must_equal true
+    end
+
+    it "tries multiple conditions" do
+      bentley = { 'carline_mfr_name' => 'ROLLS-ROYCE BENTLEY', "carline name" => 'Super Bentley' }
+      @e.correct!(bentley)
+      bentley['carline_mfr_name'].must_equal 'Bentley'
+    end
   end
 end
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: errata
 version: !ruby/object:Gem::Version
-  version: 1.1.
+  version: 1.1.1
   prerelease:
 platform: ruby
 authors:
@@ -10,7 +10,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-05-
+date: 2012-05-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activesupport
@@ -67,12 +67,11 @@ executables: []
 extensions: []
 extra_rdoc_files: []
 files:
-- .document
 - .gitignore
 - CHANGELOG
 - Gemfile
 - LICENSE
-- README.
+- README.markdown
 - Rakefile
 - errata.gemspec
 - lib/errata.rb
@@ -84,6 +83,7 @@ files:
 - lib/errata/erratum/transform.rb
 - lib/errata/erratum/truncate.rb
 - lib/errata/version.rb
+- rfc_editor.png
 - test/helper.rb
 - test/models.rb
 - test/test_errata.rb