errata 1.1.0 → 1.1.1

data/CHANGELOG CHANGED
@@ -1,3 +1,9 @@
+ 1.1.1 / 2012-05-11
+
+ * Enhancements
+
+ * Don't require a :responder if there are no conditions
+
  1.1.0 / 2012-05-03
 
  * Breaking changes
@@ -1,8 +1,36 @@
  # errata
 
- Correct strings based on remote errata files.
+ Define an errata in table format (CSV) and then apply it to an arbitrary source. Inspired by RFC Errata, lets you keep your own errata in a transparent way.
 
- # Example
+ Tested in MRI 1.8.7+, MRI 1.9.2+, and JRuby 1.6.7+. Thread safe.
+
+ ## Real-world usage
+
+ <p><a href="http://brighterplanet.com"><img src="https://s3.amazonaws.com/static.brighterplanet.com/assets/logos/flush-left/inline/green/rasterized/brighter_planet-160-transparent.png" alt="Brighter Planet logo"/></a></p>
+
+ We use `errata` for [data science at Brighter Planet](http://brighterplanet.com/research) and in production at
+
+ * [Brighter Planet's reference data web service](http://data.brighterplanet.com)
+ * [Brighter Planet's impact estimate web service](http://impact.brighterplanet.com)
+
+ The killer combination:
+
+ 1. [`active_record_inline_schema`](https://github.com/seamusabshere/active_record_inline_schema) - define table structure
+ 2. [`remote_table`](https://github.com/seamusabshere/remote_table) - download data and parse it
+ 3. [`errata`](https://github.com/seamusabshere/errata) (this library!) - apply corrections in a transparent way
+ 4. [`data_miner`](https://github.com/seamusabshere/remote_table) - import data idempotently
+
+ ## Inspiration
+
+ There's a process for reporting errata on RFC:
+
+ * [RFC Errata](http://www.rfc-editor.org/errata.php)
+ * [Status and Type Descriptions for RFC Errata](http://www.rfc-editor.org/status_type_desc.html)
+ * [How to report errata](http://www.rfc-editor.org/how_to_report.html)
+
+ <p><a href="http://www.rfc-editor.org"><img src="https://github.com/seamusabshere/errata/raw/master/rfc_editor.png" alt="screenshot of the RFC Editor" /></a></p>
+
+ ## Example
 
  Every errata has a table structure based on the [IETF RFC Editor's "How to Report Errata"](http://www.rfc-editor.org/how_to_report.html).
 
@@ -167,15 +195,44 @@ And then used
 
  Assumes all input strings are UTF-8. Otherwise there can be problems with Ruby 1.9 and Regexp::FIXEDENCODING. Specifically, ASCII-8BIT regexps might be applied to UTF-8 strings (or vice-versa), resulting in Encoding::CompatibilityError.
 
- ## Real-life usage
+ ## More advanced usage
+
+ The [`earth` library](https://github.com/brighterplanet/earth) has dozens of real-life examples showing errata in action:
 
- Used by [data_miner](http://github.com/seamusabshere/data_miner)
+ <table>
+ <tr>
+ <th>Model</th>
+ <th>Reference</th>
+ <th>Errata file</th>
+ </tr>
+ <tr>
+ <td><a href="http://data.brighterplanet.com/countries">Country</a></td>
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/country/data_miner.rb">data_miner.rb</a></td>
+ <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/country/wri_errata.csv">wri_errata.csv</a></td>
+ </tr>
+ <tr>
+ <td><a href="http://data.brighterplanet.com/aircraft">Aircraft</a></td>
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/aircraft/data_miner.rb">data_miner.rb</a></td>
+ <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/aircraft/faa_errata.csv">faa_errata.csv</a></td>
+ </tr>
+ <tr>
+ <td><a href="http://data.brighterplanet.com/airports">Airports</a></td>
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/airport/data_miner.rb">data_miner.rb</a></td>
+ <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/airport/openflights_errata.csv">openflights_errata.csv</a></td>
+ </tr>
+ <tr>
+ <td><a href="http://data.brighterplanet.com/automobile_make_model_year_variants">Automobile model variants</a></td>
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/automobile/automobile_make_model_year_variant/data_miner.rb">data_miner.rb</a></td>
+ <td><a href="https://raw.github.com/brighterplanet/earth/master/errata/automobile_make_model_year_variant/feg_errata.csv">feg_errata.csv</a></td>
+ </tr>
+ </table>
 
  ## Authors
 
  * Seamus Abshere <seamus@abshere.net>
  * Andy Rossmeissl <andy@rossmeissl.net>
+ * Ian Hough <ijhough@gmail.com>
 
  ## Copyright
 
- Copyright (c) 2011 Brighter Planet. See LICENSE for details.
+ Copyright (c) 2012 Brighter Planet. See LICENSE for details.
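
Note on the README's encoding caveat ("Assumes all input strings are UTF-8"): the failure it describes can be reproduced without errata at all. The following is a minimal sketch, assuming Ruby 2.0 or later (for `String#b`); the constant `BINARY_REGEXP` and the helper `try_match` are hypothetical names used only for illustration.

```ruby
# A regexp built from a binary (ASCII-8BIT) source that contains a non-ASCII
# byte is pinned to ASCII-8BIT (Regexp::FIXEDENCODING).
BINARY_REGEXP = Regexp.new("\xC3\xA9".b)

# Hypothetical helper: reports how a match attempt ends instead of raising.
def try_match(regexp, string)
  regexp =~ string ? :matched : :no_match
rescue Encoding::CompatibilityError
  :incompatible
end

# UTF-8 string with a non-ASCII character against the ASCII-8BIT regexp.
try_match(BINARY_REGEXP, "caf\u00e9") # => :incompatible

# The same character expressed as a UTF-8 regexp matches normally.
try_match(/\u00e9/, "caf\u00e9")      # => :matched
```

Keeping both the errata rows and the records being corrected in UTF-8 avoids the first case.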
@@ -23,13 +23,14 @@ class Errata
      options = options.symbolize_keys
 
      responder = options.delete :responder
-     raise "[errata] :responder is required" unless responder
      if responder.is_a?(::String)
        @lazy_load_responder_mutex = ::Mutex.new
        @lazy_load_responder_class_name = responder
-     else
+     elsif responder
        ::Kernel.warn %{[errata] Passing an object as :responder is deprecated. It's recommended to pass a class name instead, which will be constantized and instantiated with no arguments.}
        @responder = responder
+     else
+       @no_responder = true
      end
 
      if table = options.delete(:table)
@@ -52,6 +53,7 @@ class Errata
    end
 
    def responder
+     return if @no_responder == true
      @responder || @lazy_load_responder_mutex.synchronize do
        @responder ||= lazy_load_responder_class_name.constantize.new
      end
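
Note on the two hunks above: taken together they make `:responder` optional and keep the lazy, mutex-guarded instantiation for class names. The sketch below mirrors that control flow in isolation. It is not the gem's code: `LazyResponderHolder` and `ExampleGuru` are hypothetical names, and `Object.const_get` stands in for ActiveSupport's `constantize` so the example has no dependencies.

```ruby
# Hypothetical stand-in for the constructor and #responder logic in the diff.
class LazyResponderHolder
  def initialize(options = {})
    responder = options[:responder]
    if responder.is_a?(String)
      # Class name given: defer instantiation until first use.
      @mutex = Mutex.new
      @responder_class_name = responder
    elsif responder
      # Object given: use it directly (deprecated in the gem).
      @responder = responder
    else
      # Nothing given: allowed as of 1.1.1.
      @no_responder = true
    end
  end

  def responder
    return if @no_responder
    @responder || @mutex.synchronize do
      # Re-check inside the lock so only one thread instantiates.
      @responder ||= Object.const_get(@responder_class_name).new
    end
  end
end

# Hypothetical responder class with no behavior.
class ExampleGuru; end

LazyResponderHolder.new.responder                               # => nil
LazyResponderHolder.new(:responder => 'ExampleGuru').responder  # an ExampleGuru instance
```

The early `return` matters because, without a responder, no mutex is ever created, so falling through to `synchronize` would call a method on `nil`.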
@@ -24,6 +24,9 @@ class Errata
        @matching_methods = options[:condition].split(SEMICOLON_DELIMITER).map do |method_id|
          method_id.strip.gsub(/\W/, '_').downcase + '?'
        end
+       if @matching_methods.any? and @responder.nil?
+         raise ::ArgumentError, %{[errata] Conditions like #{@matching_methods.first.inspect} used, but no :responder defined}
+       end
        @matching_expression = if options[:x].blank?
          nil
        elsif (options[:x].start_with?('/') or options[:x].start_with?('%r{')) and as_regexp = options[:x].as_regexp
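
Note on the hunk above: each condition string is turned into a predicate method name, and the new guard raises only when at least one such name exists and no responder is available. A minimal sketch of that mapping follows. The value of `SEMICOLON_DELIMITER` is not shown in this diff, so the regexp below is an assumption, and `matching_methods` and `check_conditions!` are hypothetical helper names.

```ruby
# Assumed value; the real constant is defined outside this diff.
SEMICOLON_DELIMITER = /\s*;\s*/

# Turn "Is Denver airport; is intl" into ["is_denver_airport?", "is_intl?"].
def matching_methods(condition)
  condition.to_s.split(SEMICOLON_DELIMITER).map do |method_id|
    method_id.strip.gsub(/\W/, '_').downcase + '?'
  end
end

# Raise only if conditions are present and there is nobody to answer them.
def check_conditions!(condition, responder)
  methods = matching_methods(condition)
  if methods.any? && responder.nil?
    raise ArgumentError, %{Conditions like #{methods.first.inspect} used, but no :responder defined}
  end
  methods
end

matching_methods('Is Denver airport; is intl') # => ["is_denver_airport?", "is_intl?"]
check_conditions!('', nil)                     # => []
```

An erratum with an empty condition column therefore never needs a responder, which is the behavior the 1.1.1 changelog entry describes.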
@@ -1,3 +1,3 @@
  class Errata
-   VERSION = '1.1.0'
+   VERSION = '1.1.1'
  end
Binary file
@@ -2,24 +2,65 @@ require 'helper'
  require 'models'
 
  describe Errata do
-   before do
-     @e = Errata.new :url => 'http://spreadsheets.google.com/pub?key=t9WkYT39zjrStx7ruCFrZJg',
-       :responder => 'AutomobileVariantGuru'
+   describe 'without responder' do
+     it "doesn't require a responder" do
+       e = Errata.new :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdHEtemF2YTZzdGRYbE1MTHFMRXpRUHc&single=true&gid=0&output=csv'
+       row = { 'name' => 'denver intl airport' }
+       e.correct! row
+       row['name'].must_equal 'denver International airport'
+     end
    end
-
-   it "corrects rows" do
-     alfa = { "carline_mfr_name"=>"ALFA ROMEO" }
-     @e.correct!(alfa)
-     alfa['carline_mfr_name'].must_equal 'Alfa Romeo'
-   end
-
-   it "rejects rows" do
-     @e.rejects?('carline_mfr_name' => 'AURORA CARS').must_equal true
+
+   describe 'with conditions' do
+     it "uses a responder to answer conditions" do
+       eval %{
+         class ColoradoGuru
+           def is_denver_airport?(record)
+             record['name'].to_s.downcase.include? 'denver'
+           end
+         end
+       }
+       e = Errata.new(
+         :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdG9PcFBjVnE4SGpLVXNTakVhSFY2VFE&single=true&gid=0&output=csv',
+         :responder => 'ColoradoGuru'
+       )
+       row = { 'name' => 'denver intl airport' }
+       e.correct! row
+       row['name'].must_equal 'denver International airport' # matched condition
+       row = { 'name' => 'madison intl airport' }
+       e.correct! row
+       row['name'].must_equal 'madison intl airport' # didn't match
+     end
+
+     it "blows up if you have conditions but no responder" do
+       e = Errata.new :url => 'https://docs.google.com/spreadsheet/pub?key=0AkCJNpm9Ks6JdG9PcFBjVnE4SGpLVXNTakVhSFY2VFE&single=true&gid=0&output=csv'
+       row = { 'name' => 'denver intl airport' }
+       lambda do
+         e.correct! row
+       end.must_raise ArgumentError, /conditions.*used/i
+     end
    end
+
+   describe 'to correct automobile model details' do
+     before do
+       @e = Errata.new :url => 'http://spreadsheets.google.com/pub?key=t9WkYT39zjrStx7ruCFrZJg',
+         :responder => 'AutomobileVariantGuru'
+     end
+
+     it "corrects rows" do
+       alfa = { "carline_mfr_name"=>"ALFA ROMEO" }
+       @e.correct!(alfa)
+       alfa['carline_mfr_name'].must_equal 'Alfa Romeo'
+     end
 
-   it "tries multiple conditions" do
-     bentley = { 'carline_mfr_name' => 'ROLLS-ROYCE BENTLEY', "carline name" => 'Super Bentley' }
-     @e.correct!(bentley)
-     bentley['carline_mfr_name'].must_equal 'Bentley'
+     it "rejects rows" do
+       @e.rejects?('carline_mfr_name' => 'AURORA CARS').must_equal true
+     end
+
+     it "tries multiple conditions" do
+       bentley = { 'carline_mfr_name' => 'ROLLS-ROYCE BENTLEY', "carline name" => 'Super Bentley' }
+       @e.correct!(bentley)
+       bentley['carline_mfr_name'].must_equal 'Bentley'
+     end
    end
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: errata
  version: !ruby/object:Gem::Version
-   version: 1.1.0
+   version: 1.1.1
  prerelease:
  platform: ruby
  authors:
@@ -10,7 +10,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-05-03 00:00:00.000000000 Z
+ date: 2012-05-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activesupport
@@ -67,12 +67,11 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - .document
  - .gitignore
  - CHANGELOG
  - Gemfile
  - LICENSE
- - README.md
+ - README.markdown
  - Rakefile
  - errata.gemspec
  - lib/errata.rb
@@ -84,6 +83,7 @@ files:
  - lib/errata/erratum/transform.rb
  - lib/errata/erratum/truncate.rb
  - lib/errata/version.rb
+ - rfc_editor.png
  - test/helper.rb
  - test/models.rb
  - test/test_errata.rb
data/.document DELETED
@@ -1,5 +0,0 @@
- README.rdoc
- lib/**/*.rb
- bin/*
- features/**/*.feature
- LICENSE