pupa 0.1.4 → 0.1.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: b859cc31c591efea3402d3d4b0134fdccb394550
- data.tar.gz: f077f4c7765c50c463e8f65124b19ce14b42f095
+ metadata.gz: d794649266b975f92ee8ff502a3de21390dc540b
+ data.tar.gz: 59b89a81274d35ee848d944da9f4337295ab8567
  SHA512:
- metadata.gz: 3e9eeddce347575a99e50ea67d57e8eb8c1310b8929184f10ffc5ca62da9cc843198a831241411abef829c9d8cb8165f3a6438b67fe6e4b32f4e1317ae7e0f67
- data.tar.gz: 8ca3a8dde166ac6712baaf53743846ba8cd80e6af87e8436eb8f9481fc24f4cbde180ae9e12f3fc747838292bfbf67f6f3b0659bcf8cb0693772c96b5f5c0e83
+ metadata.gz: 19582ce0e29e5a9ad52d4dabb216664d418fe738b4bf9b534a51b161eae209616df8bc87138a5f34253dace8c657ae21c0a2684024258dc918a94dfc63476a68
+ data.tar.gz: 3464169a23f255de3e4b357e245135fc571bc11642a1929d448a88fc1382e5418ed96b3214317c30bae90cb535f90ab6bd4a2bd94d1403ce840a1ccbf6cad734
data/PERFORMANCE.md ADDED
@@ -0,0 +1,129 @@
+ # Pupa.rb: A Data Scraping Framework
+
+ ## Performance
+
+ Pupa.rb offers several ways to significantly improve performance.
+
+ In an example case, reducing disk I/O and skipping validation as described below cut the time to scrape 10,000 documents from 100 cached HTTP responses from 100 seconds down to 5 seconds. Like fast tests, fast scrapers make development smoother.
+
+ The `import` action's performance is currently limited by the database when a dependency graph is used to determine the evaluation order. If a dependency graph cannot be used because you don't know a related object's ID, [several optimizations](https://github.com/opennorth/pupa-ruby/issues/12) can be implemented to improve performance.
+
+ ### Reducing HTTP requests
+
+ HTTP requests consume the most time. To avoid repeating HTTP requests while developing a scraper, cache all HTTP responses. By default, Pupa.rb uses a `web_cache` directory in the same directory as your script. You can change the directory by setting the `--cache_dir` switch on the command line, for example:
+
+     ruby cat.rb --cache_dir /tmp/my_cache_dir
+
+ ### Parallelizing HTTP requests
+
+ To enable parallel requests, use the `typhoeus` gem. Unless you are using an old version of Typhoeus (< 0.5), both Faraday and Typhoeus define a Typhoeus adapter for Faraday, but you must use the one defined by Typhoeus, like so:
+
+ ```ruby
+ require 'pupa'
+ require 'typhoeus'
+ require 'typhoeus/adapters/faraday'
+ ```
+
+ Then, in your scraping methods, write code like:
+
+ ```ruby
+ responses = []
+
+ # Change the maximum number of concurrent requests (default 200). You usually
+ # need to tweak this number by trial and error.
+ # @see https://github.com/lostisland/faraday/wiki/Parallel-requests#advanced-use
+ manager = Typhoeus::Hydra.new(max_concurrency: 20)
+
+ begin
+   # Send HTTP requests in parallel.
+   client.in_parallel(manager) do
+     responses << client.get('http://example.com/foo')
+     responses << client.get('http://example.com/bar')
+     # More requests...
+   end
+ rescue Faraday::Error::ClientError => e
+   # Log an error message if, for example, you exceed a server's maximum number
+   # of concurrent connections or if you exceed an API's rate limit.
+   error(e.response.inspect)
+ end
+
+ # Responses are now available for use.
+ responses.each do |response|
+   # Only process the finished responses.
+   if response.success?
+     # If success...
+   elsif response.finished?
+     # If error...
+   end
+ end
+ ```
+
+ ### Reducing disk I/O
+
+ After HTTP requests, disk I/O is the slowest operation. Two types of files are written to disk: HTTP responses are written to the cache directory, and JSON documents are written to the output directory. Writing to memory is much faster than writing to disk.
+
+ #### RAM file systems
+
+ A simple solution is to create a file system in RAM, such as `tmpfs` on Linux, and to use it as your `output_dir` and `cache_dir`. On OS X, you must create a RAM disk. To create a 128 MB RAM disk, for example, run:
+
+     ramdisk=$(hdiutil attach -nomount ram://$((128 * 2048)) | tr -d ' \t')
+     diskutil erasevolume HFS+ 'ramdisk' $ramdisk
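The `ram://` argument to `hdiutil` is a size in 512-byte sectors, which is where the `128 * 2048` figure comes from; a quick check of the arithmetic:

```ruby
# hdiutil's ram:// size is given in 512-byte sectors,
# so a 128 MB disk needs (128 * 1024 * 1024) / 512 sectors.
mb = 128
sectors = mb * 1024 * 1024 / 512
puts sectors     # 262144
puts 128 * 2048  # 262144, the figure used in the command above
```

To size the disk differently, scale the multiplier accordingly (e.g. `256 * 2048` for 256 MB).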
+
+ You can then set the `output_dir` and `cache_dir` on OS X as:
+
+     ruby cat.rb --output_dir /Volumes/ramdisk/scraped_data --cache_dir /Volumes/ramdisk/web_cache
+
+ Once you are done with the RAM disk, release the memory:
+
+     diskutil unmount $ramdisk
+     hdiutil detach $ramdisk
+
+ Using a RAM disk will significantly improve performance; however, the data will be lost between reboots unless you move the data to a hard disk. Using Memcached (for caching) and Redis (for storage) is moderately faster than using a RAM disk, and Redis will not lose your output data between reboots.
+
+ #### Memcached
+
+ You may cache HTTP responses in [Memcached](http://memcached.org/). First, require the `dalli` gem. Then:
+
+     ruby cat.rb --cache_dir memcached://localhost:11211
+
+ The data in Memcached will be lost between reboots.
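The `--cache_dir` switch doubles as a Memcached URL: as the client hunk later in this diff shows, Pupa.rb extracts the server address by matching the `memcached://` scheme, and falls back to a file store otherwise. The extraction can be checked in isolation:

```ruby
# The pattern below mirrors the one in the client code in this diff:
# a memcached:// value yields the server address; anything else
# (a plain directory path) yields nil and is used as a file cache.
pattern = %r{\Amemcached://(.+)\z}

puts 'memcached://localhost:11211'[pattern, 1].inspect  # "localhost:11211"
puts '/tmp/my_cache_dir'[pattern, 1].inspect            # nil
```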
+
+ #### Redis
+
+ You may dump JSON documents in [Redis](http://redis.io/). First, require the `redis-store` gem. Then:
+
+     ruby cat.rb --output_dir redis://localhost:6379/0
+
+ To dump JSON documents in Redis moderately faster, use [pipelining](http://redis.io/topics/pipelining):
+
+     ruby cat.rb --output_dir redis://localhost:6379/0 --pipelined
+
+ Requiring the `hiredis` gem will slightly improve performance.
+
+ Note that Pupa.rb flushes the Redis database before scraping. If you use Redis, **DO NOT** share a Redis database between Pupa.rb and other applications. You can select a database other than the default `0` for use with Pupa.rb by passing an argument like `redis://localhost:6379/15`, where `15` is the database number.
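The database number is simply the path component of the Redis URL. An illustrative parse using Ruby's standard `URI` library (not Pupa.rb's own code):

```ruby
require 'uri'

# The Redis database number is the path component of the URL,
# with the leading slash removed.
uri = URI.parse('redis://localhost:6379/15')
db = uri.path[1..-1].to_i
puts db  # 15
```

This is why `redis://localhost:6379/15` selects database 15: everything after the final slash is the database index.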
+
+ ### Skipping validation
+
+ The `json-schema` gem is slow compared to, for example, [JSV](https://github.com/garycourt/JSV). Setting the `--no-validate` switch and running JSON Schema validations separately can further reduce a scraper's running time.
+
+ The [pupa-validate](https://npmjs.org/package/pupa-validate) npm package can be used to validate JSON documents using the faster JSV. In an example case, using JSV instead of the `json-schema` gem halved the time to validate 10,000 documents.
+
+ ### Ruby version
+
+ Pupa.rb requires Ruby 2.x. If you have already made all the above optimizations, you may notice a significant improvement by using Ruby 2.1, which has better garbage collection than Ruby 2.0.
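Ruby 2.1's improvement comes from generational garbage collection (RGenGC), which runs cheap minor collections far more often than full major ones. You can observe this on your own interpreter with `GC.stat`:

```ruby
# GC.stat exposes the interpreter's garbage collector counters.
# On Ruby 2.1+ (generational GC), minor and major collections are
# counted separately; a high minor-to-major ratio is RGenGC at work.
stats = GC.stat
puts "Ruby #{RUBY_VERSION}"
puts "total GC runs: #{stats[:count]}"
puts "minor GC runs: #{stats[:minor_gc_count]}"
puts "major GC runs: #{stats[:major_gc_count]}"
```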
+
+ ### Profiling
+
+ You can profile your code using [perftools.rb](https://github.com/tmm1/perftools.rb). First, install the gem:
+
+     gem install perftools.rb
+
+ Then, run your script with the profiler (changing `/tmp/PROFILE_NAME` and `script.rb` as appropriate):
+
+     CPUPROFILE=/tmp/PROFILE_NAME RUBYOPT="-r`gem which perftools | tail -1`" ruby script.rb
+
+ You may want to set the `CPUPROFILE_REALTIME=1` flag, although it seems to interfere with HTTP requests.
+
+ [perftools.rb](https://github.com/tmm1/perftools.rb) has several output formats. If your code is straightforward, you can draw a graph (changing `/tmp/PROFILE_NAME` and `/tmp/PROFILE_NAME.pdf` as appropriate):
+
+     pprof.rb --pdf /tmp/PROFILE_NAME > /tmp/PROFILE_NAME.pdf
data/README.md CHANGED
@@ -69,7 +69,7 @@ The [organization.rb](http://opennorth.github.io/pupa-ruby/docs/organization.htm
 
  JSON parsing is enabled by default. To enable automatic parsing of HTML and XML, require the `nokogiri` and `multi_xml` gems.
 
- ### [OpenCivicData](http://opencivicdata.org/) compatibility
+ ## [OpenCivicData](http://opencivicdata.org/) compatibility
 
  Both Pupa.rb and Sunlight Labs' [Pupa](https://github.com/opencivicdata/pupa) implement models for people, organizations and memberships from the [Popolo](http://popoloproject.com/) open government data specification. Pupa.rb lets you use your own classes, but Pupa only supports a fixed set of classes. A consequence of Pupa.rb's flexibility is that the value of the `_type` property for `Person`, `Organization` and `Membership` objects differs between Pupa.rb and Pupa. Pupa.rb has namespaced types like `pupa/person` – to allow Ruby to load the `Person` class in the `Pupa` module – whereas Pupa has unnamespaced types like `person`.
 
@@ -81,138 +81,8 @@ require 'pupa/refinements/opencivicdata'
 
  It is not currently possible to run the `scrape` action with one of Pupa.rb and Pupa, and to then run the `import` action with the other. Both actions must be run by the same library.
 
- ## Performance
-
- Pupa.rb offers several ways to significantly improve performance.
-
- In an example case, reducing disk I/O and skipping validation as described below reduced the time to scrape 10,000 documents from 100 cached HTTP responses from 100 seconds down to 5 seconds. Like fast tests, fast scrapers make development smoother.
-
- The `import` action's performance is currently limited by MongoDB when a dependency graph is used to determine the evaluation order. If a dependency graph cannot be used because you don't know a related object's ID, [several optimizations](https://github.com/opennorth/pupa-ruby/issues/12) can be implemented to improve performance.
-
- ### Reducing HTTP requests
-
- HTTP requests consume the most time. To avoid repeat HTTP requests while developing a scraper, cache all HTTP responses. Pupa.rb will by default use a `web_cache` directory in the same directory as your script. You can change the directory by setting the `--cache_dir` switch on the command line, for example:
-
-     ruby cat.rb --cache_dir /tmp/my_cache_dir
-
- ### Parallelizing HTTP requests
-
- To enable parallel requests, use the `typhoeus` gem. Unless you are using an old version of Typhoeus (< 0.5), both Faraday and Typhoeus define a Faraday adapter, but you must use the one defined by Typhoeus, like so:
-
- ```ruby
- require 'pupa'
- require 'typhoeus'
- require 'typhoeus/adapters/faraday'
- ```
-
- Then, in your scraping methods, write code like:
-
- ```ruby
- responses = []
-
- # Change the maximum number of concurrent requests (default 200). You usually
- # need to tweak this number by trial and error.
- # @see https://github.com/lostisland/faraday/wiki/Parallel-requests#advanced-use
- manager = Typhoeus::Hydra.new(max_concurrency: 20)
-
- begin
-   # Send HTTP requests in parallel.
-   client.in_parallel(manager) do
-     responses << client.get('http://example.com/foo')
-     responses << client.get('http://example.com/bar')
-     # More requests...
-   end
- rescue Faraday::Error::ClientError => e
-   # Log an error message if, for example, you exceed a server's maximum number
-   # of concurrent connections or if you exceed an API's rate limit.
-   error(e.response.inspect)
- end
-
- # Responses are now available for use.
- responses.each do |response|
-   # Only process the finished responses.
-   if response.success?
-     # If success...
-   elsif response.finished?
-     # If error...
-   end
- end
- ```
-
- ### Reducing disk I/O
-
- After HTTP requests, disk I/O is the slowest operation. Two types of files are written to disk: HTTP responses are written to the cache directory, and JSON documents are written to the output directory. Writing to memory is much faster than writing to disk.
-
- #### RAM file systems
-
- A simple solution is to create a file system in RAM, like `tmpfs` on Linux for example, and to use it as your `output_dir` and `cache_dir`. On OS X, you must create a RAM disk. To create a 128MB RAM disk, for example, run:
-
-     ramdisk=$(hdiutil attach -nomount ram://$((128 * 2048)) | tr -d ' \t')
-     diskutil erasevolume HFS+ 'ramdisk' $ramdisk
-
- You can then set the `output_dir` and `cache_dir` on OS X as:
-
-     ruby cat.rb --output_dir /Volumes/ramdisk/scraped_data --cache_dir /Volumes/ramdisk/web_cache
-
- Once you are done with the RAM disk, release the memory:
-
-     diskutil unmount $ramdisk
-     hdiutil detach $ramdisk
-
- Using a RAM disk will significantly improve performance; however, the data will be lost between reboots unless you move the data to a hard disk. Using Memcached (for caching) and Redis (for storage) is moderately faster than using a RAM disk, and Redis will not lose your output data between reboots.
-
- #### Memcached
-
- You may cache HTTP responses in [Memcached](http://memcached.org/). First, require the `dalli` gem. Then:
-
-     ruby cat.rb --cache_dir memcached://localhost:11211
-
- The data in Memcached will be lost between reboots.
-
- #### Redis
-
- You may dump JSON documents in [Redis](http://redis.io/). First, require the `redis-store` gem. Then:
-
-     ruby cat.rb --output_dir redis://localhost:6379/0
-
- To dump JSON documents in Redis moderately faster, use [pipelining](http://redis.io/topics/pipelining):
-
-     ruby cat.rb --output_dir redis://localhost:6379/0 --pipelined
-
- Requiring the `hiredis` gem will slightly improve performance.
-
- Note that Pupa.rb flushes the Redis database before scraping. If you use Redis, **DO NOT** share a Redis database with Pupa.rb and other applications. You can select a different database than the default `0` for use with Pupa.rb by passing an argument like `redis://localhost:6379/15`, where `15` is the database number.
-
- ### Skipping validation
-
- The `json-schema` gem is slow compared to, for example, [JSV](https://github.com/garycourt/JSV). Setting the `--no-validate` switch and running JSON Schema validations separately can further reduce a scraper's running time.
-
- The [pupa-validate](https://npmjs.org/package/pupa-validate) npm package can be used to validate JSON documents using the faster JSV. In an example case, using JSV instead of the `json-schema` gem reduced by half the time to validate 10,000 documents.
-
- ### Ruby version
-
- Pupa.rb requires Ruby 2.x. If you have already made all the above optimizations, you may notice a significant improvement by using Ruby 2.1, which has better garbage collection than Ruby 2.0.
-
- ### Profiling
-
- You can profile your code using [perftools.rb](https://github.com/tmm1/perftools.rb). First, install the gem:
-
-     gem install perftools.rb
-
- Then, run your script with the profiler (changing `/tmp/PROFILE_NAME` and `script.rb` as appropriate):
-
-     CPUPROFILE=/tmp/PROFILE_NAME RUBYOPT="-r`gem which perftools | tail -1`" ruby script.rb
-
- You may want to set the `CPUPROFILE_REALTIME=1` flag; however, it seems to interfere with HTTP requests, for whatever reason.
-
- [perftools.rb](https://github.com/tmm1/perftools.rb) has several output formats. If your code is straight-forward, you can draw a graph (changing `/tmp/PROFILE_NAME` and `/tmp/PROFILE_NAME.pdf` as appropriate):
-
-     pprof.rb --pdf /tmp/PROFILE_NAME > /tmp/PROFILE_NAME.pdf
-
  ## Integration with ODMs
 
- ### Mongoid
-
  `Pupa::Model` is incompatible with `Mongoid::Document`. Don't do this:
 
  ```ruby
@@ -224,6 +94,10 @@ end
 
  Instead, have a scraping model that includes `Pupa::Model` and an app model that includes `Mongoid::Document`.
 
+ ## Performance
+
+ Pupa.rb offers several ways to significantly improve performance. [Read the documentation.](https://github.com/opennorth/pupa-ruby/blob/master/PERFORMANCE.md#readme)
+
  ## Testing
 
  **DO NOT** run this gem's specs if you are using Redis database number 15 on `localhost`!
@@ -30,9 +30,12 @@ module Pupa
  # (e.g. `memcached://localhost:11211`) in which to cache requests
  # @param [Integer] expires_in the cache's expiration time in seconds
  # @param [Integer] value_max_bytes the maximum Memcached item size
+ # @param [String] memcached_username the Memcached username
+ # @param [String] memcached_password the Memcached password
  # @param [String] level the log level
+ # @param [String,IO] logdev the log device
  # @return [Faraday::Connection] a configured Faraday HTTP client
- def self.new(cache_dir: nil, expires_in: 86400, value_max_bytes: 1048576, level: 'INFO') # 1 day
+ def self.new(cache_dir: nil, expires_in: 86400, value_max_bytes: 1048576, memcached_username: nil, memcached_password: nil, level: 'INFO', logdev: STDOUT) # 1 day
    Faraday.new do |connection|
      connection.request :url_encoded
      connection.use Middleware::Logger, Logger.new('faraday', level: level)
@@ -59,7 +62,7 @@ module Pupa
      connection.response :caching do
        address = cache_dir[%r{\Amemcached://(.+)\z}, 1]
        if address
-         ActiveSupport::Cache::MemCacheStore.new(address, expires_in: expires_in, value_max_bytes: Integer(value_max_bytes))
+         ActiveSupport::Cache::MemCacheStore.new(address, expires_in: expires_in, value_max_bytes: Integer(value_max_bytes), username: memcached_username, password: memcached_password)
        else
          ActiveSupport::Cache::FileStore.new(cache_dir, expires_in: expires_in)
        end
@@ -25,14 +25,16 @@ module Pupa
  # (e.g. `memcached://localhost:11211`) in which to cache HTTP responses
  # @param [Integer] expires_in the cache's expiration time in seconds
  # @param [Integer] value_max_bytes the maximum Memcached item size
+ # @param [String] memcached_username the Memcached username
+ # @param [String] memcached_password the Memcached password
  # @param [String] database_url the database URL
  # @param [Boolean] validate whether to validate JSON documents
  # @param [String] level the log level
  # @param [String,IO] logdev the log device
  # @param [Hash] options criteria for selecting the methods to run
- def initialize(output_dir, pipelined: false, cache_dir: nil, expires_in: 86400, value_max_bytes: 1048576, database_url: 'mongodb://localhost:27017/pupa', validate: true, level: 'INFO', logdev: STDOUT, options: {})
+ def initialize(output_dir, pipelined: false, cache_dir: nil, expires_in: 86400, value_max_bytes: 1048576, memcached_username: nil, memcached_password: nil, database_url: 'mongodb://localhost:27017/pupa', validate: true, level: 'INFO', logdev: STDOUT, options: {})
    @store = DocumentStore.new(output_dir, pipelined: pipelined)
-   @client = Client.new(cache_dir: cache_dir, expires_in: expires_in, value_max_bytes: value_max_bytes, level: level)
+   @client = Client.new(cache_dir: cache_dir, expires_in: expires_in, value_max_bytes: value_max_bytes, memcached_username: memcached_username, memcached_password: memcached_password, level: level, logdev: logdev)
    @connection = Connection.new(database_url)
    @logger = Logger.new('pupa', level: level, logdev: logdev)
    @validate = validate
data/lib/pupa/runner.rb CHANGED
@@ -11,17 +11,19 @@ module Pupa
    @processor_class = processor_class
 
    @options = OpenStruct.new({
-     actions: [],
-     tasks: [],
-     output_dir: File.expand_path('scraped_data', Dir.pwd),
-     pipelined: false,
-     cache_dir: File.expand_path('web_cache', Dir.pwd),
-     expires_in: 86400, # 1 day
-     value_max_bytes: 1048576, # 1 MB
-     database_url: 'mongodb://localhost:27017/pupa',
-     validate: true,
-     level: 'INFO',
-     dry_run: false,
+     actions: [],
+     tasks: [],
+     output_dir: File.expand_path('scraped_data', Dir.pwd),
+     pipelined: false,
+     cache_dir: File.expand_path('web_cache', Dir.pwd),
+     expires_in: 86400, # 1 day
+     value_max_bytes: 1048576, # 1 MB
+     memcached_username: nil,
+     memcached_password: nil,
+     database_url: 'mongodb://localhost:27017/pupa',
+     validate: true,
+     level: 'INFO',
+     dry_run: false,
    }.merge(defaults))
 
    @actions = {
@@ -86,7 +88,13 @@ module Pupa
    opts.on('--value_max_bytes BYTES', "The maximum Memcached item size") do |v|
      options.value_max_bytes = v
    end
-   opts.on('-d', '--database_url SCHEME://USERNAME:PASSWORD@HOST:PORT/DATABASE', 'The database URL') do |v|
+   opts.on('--memcached_username USERNAME', "The Memcached username") do |v|
+     options.memcached_username = v
+   end
+   opts.on('--memcached_password PASSWORD', "The Memcached password") do |v|
+     options.memcached_password = v
+   end
+   opts.on('-d', '--database_url DATABASE_URL', 'The database URL (e.g. mongodb://USER:PASSWORD@localhost:27017/pupa or postgres://USER:PASSWORD@localhost:5432/pupa)') do |v|
      options.database_url = v
    end
    opts.on('--[no-]validate', 'Validate JSON documents') do |v|
@@ -147,6 +155,8 @@ module Pupa
      cache_dir: options.cache_dir,
      expires_in: options.expires_in,
      value_max_bytes: options.value_max_bytes,
+     memcached_username: options.memcached_username,
+     memcached_password: options.memcached_password,
      database_url: options.database_url,
      validate: options.validate,
      level: options.level,
@@ -165,7 +175,7 @@ module Pupa
    end
 
    if options.level == 'DEBUG'
-     %w(output_dir pipelined cache_dir expires_in value_max_bytes database_url validate level).each do |option|
+     %w(output_dir pipelined cache_dir expires_in value_max_bytes memcached_username memcached_password database_url validate level).each do |option|
        puts "#{option}: #{options[option]}"
      end
      unless rest.empty?
data/lib/pupa/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Pupa
-   VERSION = "0.1.4"
+   VERSION = "0.1.5"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: pupa
  version: !ruby/object:Gem::Version
-   version: 0.1.4
+   version: 0.1.5
  platform: ruby
  authors:
  - Open North
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-05-23 00:00:00.000000000 Z
+ date: 2014-07-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activesupport
@@ -288,6 +288,7 @@ files:
  - ".yardopts"
  - Gemfile
  - LICENSE
+ - PERFORMANCE.md
  - README.md
  - Rakefile
  - USAGE