wayfarer 0.0.3

Files changed (235)
  1. checksums.yaml +7 -0
  2. data/.gitignore +8 -0
  3. data/.rbenv-gemsets +1 -0
  4. data/.rspec +3 -0
  5. data/.rubocop.yml +21 -0
  6. data/.ruby-version +1 -0
  7. data/.travis.yml +5 -0
  8. data/.yardopts +3 -0
  9. data/Changelog.md +10 -0
  10. data/Gemfile +11 -0
  11. data/LICENSE +19 -0
  12. data/README.md +21 -0
  13. data/Rakefile +114 -0
  14. data/benchmark/frontiers.rb +143 -0
  15. data/bin/wayfarer +116 -0
  16. data/docs/.gitignore +2 -0
  17. data/docs/_config.yml +15 -0
  18. data/docs/_includes/base.html +7 -0
  19. data/docs/_includes/head.html +10 -0
  20. data/docs/_includes/navigation.html +187 -0
  21. data/docs/_layouts/default.html +42 -0
  22. data/docs/_sass/base.scss +439 -0
  23. data/docs/_sass/variables.scss +24 -0
  24. data/docs/_sass/vendor/bourbon/_bourbon-deprecate.scss +19 -0
  25. data/docs/_sass/vendor/bourbon/_bourbon-deprecated-upcoming.scss +425 -0
  26. data/docs/_sass/vendor/bourbon/_bourbon.scss +90 -0
  27. data/docs/_sass/vendor/bourbon/addons/_border-color.scss +29 -0
  28. data/docs/_sass/vendor/bourbon/addons/_border-radius.scss +48 -0
  29. data/docs/_sass/vendor/bourbon/addons/_border-style.scss +28 -0
  30. data/docs/_sass/vendor/bourbon/addons/_border-width.scss +28 -0
  31. data/docs/_sass/vendor/bourbon/addons/_buttons.scss +69 -0
  32. data/docs/_sass/vendor/bourbon/addons/_clearfix.scss +25 -0
  33. data/docs/_sass/vendor/bourbon/addons/_ellipsis.scss +30 -0
  34. data/docs/_sass/vendor/bourbon/addons/_font-stacks.scss +31 -0
  35. data/docs/_sass/vendor/bourbon/addons/_hide-text.scss +27 -0
  36. data/docs/_sass/vendor/bourbon/addons/_margin.scss +29 -0
  37. data/docs/_sass/vendor/bourbon/addons/_padding.scss +29 -0
  38. data/docs/_sass/vendor/bourbon/addons/_position.scss +51 -0
  39. data/docs/_sass/vendor/bourbon/addons/_prefixer.scss +66 -0
  40. data/docs/_sass/vendor/bourbon/addons/_retina-image.scss +27 -0
  41. data/docs/_sass/vendor/bourbon/addons/_size.scss +56 -0
  42. data/docs/_sass/vendor/bourbon/addons/_text-inputs.scss +118 -0
  43. data/docs/_sass/vendor/bourbon/addons/_timing-functions.scss +34 -0
  44. data/docs/_sass/vendor/bourbon/addons/_triangle.scss +63 -0
  45. data/docs/_sass/vendor/bourbon/addons/_word-wrap.scss +29 -0
  46. data/docs/_sass/vendor/bourbon/css3/_animation.scss +61 -0
  47. data/docs/_sass/vendor/bourbon/css3/_appearance.scss +5 -0
  48. data/docs/_sass/vendor/bourbon/css3/_backface-visibility.scss +5 -0
  49. data/docs/_sass/vendor/bourbon/css3/_background-image.scss +44 -0
  50. data/docs/_sass/vendor/bourbon/css3/_background.scss +57 -0
  51. data/docs/_sass/vendor/bourbon/css3/_border-image.scss +61 -0
  52. data/docs/_sass/vendor/bourbon/css3/_calc.scss +6 -0
  53. data/docs/_sass/vendor/bourbon/css3/_columns.scss +67 -0
  54. data/docs/_sass/vendor/bourbon/css3/_filter.scss +6 -0
  55. data/docs/_sass/vendor/bourbon/css3/_flex-box.scss +327 -0
  56. data/docs/_sass/vendor/bourbon/css3/_font-face.scss +29 -0
  57. data/docs/_sass/vendor/bourbon/css3/_font-feature-settings.scss +6 -0
  58. data/docs/_sass/vendor/bourbon/css3/_hidpi-media-query.scss +12 -0
  59. data/docs/_sass/vendor/bourbon/css3/_hyphens.scss +6 -0
  60. data/docs/_sass/vendor/bourbon/css3/_image-rendering.scss +15 -0
  61. data/docs/_sass/vendor/bourbon/css3/_keyframes.scss +38 -0
  62. data/docs/_sass/vendor/bourbon/css3/_linear-gradient.scss +40 -0
  63. data/docs/_sass/vendor/bourbon/css3/_perspective.scss +12 -0
  64. data/docs/_sass/vendor/bourbon/css3/_placeholder.scss +10 -0
  65. data/docs/_sass/vendor/bourbon/css3/_radial-gradient.scss +40 -0
  66. data/docs/_sass/vendor/bourbon/css3/_selection.scss +44 -0
  67. data/docs/_sass/vendor/bourbon/css3/_text-decoration.scss +27 -0
  68. data/docs/_sass/vendor/bourbon/css3/_transform.scss +21 -0
  69. data/docs/_sass/vendor/bourbon/css3/_transition.scss +81 -0
  70. data/docs/_sass/vendor/bourbon/css3/_user-select.scss +5 -0
  71. data/docs/_sass/vendor/bourbon/functions/_assign-inputs.scss +16 -0
  72. data/docs/_sass/vendor/bourbon/functions/_contains-falsy.scss +25 -0
  73. data/docs/_sass/vendor/bourbon/functions/_contains.scss +31 -0
  74. data/docs/_sass/vendor/bourbon/functions/_is-length.scss +16 -0
  75. data/docs/_sass/vendor/bourbon/functions/_is-light.scss +26 -0
  76. data/docs/_sass/vendor/bourbon/functions/_is-number.scss +16 -0
  77. data/docs/_sass/vendor/bourbon/functions/_is-size.scss +23 -0
  78. data/docs/_sass/vendor/bourbon/functions/_modular-scale.scss +74 -0
  79. data/docs/_sass/vendor/bourbon/functions/_px-to-em.scss +24 -0
  80. data/docs/_sass/vendor/bourbon/functions/_px-to-rem.scss +26 -0
  81. data/docs/_sass/vendor/bourbon/functions/_shade.scss +24 -0
  82. data/docs/_sass/vendor/bourbon/functions/_strip-units.scss +22 -0
  83. data/docs/_sass/vendor/bourbon/functions/_tint.scss +24 -0
  84. data/docs/_sass/vendor/bourbon/functions/_transition-property-name.scss +37 -0
  85. data/docs/_sass/vendor/bourbon/functions/_unpack.scss +32 -0
  86. data/docs/_sass/vendor/bourbon/helpers/_convert-units.scss +26 -0
  87. data/docs/_sass/vendor/bourbon/helpers/_directional-values.scss +108 -0
  88. data/docs/_sass/vendor/bourbon/helpers/_font-source-declaration.scss +53 -0
  89. data/docs/_sass/vendor/bourbon/helpers/_gradient-positions-parser.scss +24 -0
  90. data/docs/_sass/vendor/bourbon/helpers/_linear-angle-parser.scss +35 -0
  91. data/docs/_sass/vendor/bourbon/helpers/_linear-gradient-parser.scss +51 -0
  92. data/docs/_sass/vendor/bourbon/helpers/_linear-positions-parser.scss +77 -0
  93. data/docs/_sass/vendor/bourbon/helpers/_linear-side-corner-parser.scss +41 -0
  94. data/docs/_sass/vendor/bourbon/helpers/_radial-arg-parser.scss +74 -0
  95. data/docs/_sass/vendor/bourbon/helpers/_radial-gradient-parser.scss +55 -0
  96. data/docs/_sass/vendor/bourbon/helpers/_radial-positions-parser.scss +28 -0
  97. data/docs/_sass/vendor/bourbon/helpers/_render-gradients.scss +31 -0
  98. data/docs/_sass/vendor/bourbon/helpers/_shape-size-stripper.scss +15 -0
  99. data/docs/_sass/vendor/bourbon/helpers/_str-to-num.scss +55 -0
  100. data/docs/_sass/vendor/bourbon/settings/_asset-pipeline.scss +7 -0
  101. data/docs/_sass/vendor/bourbon/settings/_deprecation-warnings.scss +8 -0
  102. data/docs/_sass/vendor/bourbon/settings/_prefixer.scss +9 -0
  103. data/docs/_sass/vendor/bourbon/settings/_px-to-em.scss +1 -0
  104. data/docs/_sass/vendor/neat/_neat-helpers.scss +11 -0
  105. data/docs/_sass/vendor/neat/_neat.scss +23 -0
  106. data/docs/_sass/vendor/neat/functions/_new-breakpoint.scss +49 -0
  107. data/docs/_sass/vendor/neat/functions/_private.scss +114 -0
  108. data/docs/_sass/vendor/neat/grid/_box-sizing.scss +15 -0
  109. data/docs/_sass/vendor/neat/grid/_direction-context.scss +33 -0
  110. data/docs/_sass/vendor/neat/grid/_display-context.scss +28 -0
  111. data/docs/_sass/vendor/neat/grid/_fill-parent.scss +22 -0
  112. data/docs/_sass/vendor/neat/grid/_media.scss +92 -0
  113. data/docs/_sass/vendor/neat/grid/_omega.scss +87 -0
  114. data/docs/_sass/vendor/neat/grid/_outer-container.scss +34 -0
  115. data/docs/_sass/vendor/neat/grid/_pad.scss +25 -0
  116. data/docs/_sass/vendor/neat/grid/_private.scss +35 -0
  117. data/docs/_sass/vendor/neat/grid/_row.scss +52 -0
  118. data/docs/_sass/vendor/neat/grid/_shift.scss +50 -0
  119. data/docs/_sass/vendor/neat/grid/_span-columns.scss +94 -0
  120. data/docs/_sass/vendor/neat/grid/_to-deprecate.scss +97 -0
  121. data/docs/_sass/vendor/neat/grid/_visual-grid.scss +42 -0
  122. data/docs/_sass/vendor/neat/mixins/_clearfix.scss +25 -0
  123. data/docs/_sass/vendor/neat/settings/_disable-warnings.scss +13 -0
  124. data/docs/_sass/vendor/neat/settings/_grid.scss +51 -0
  125. data/docs/_sass/vendor/neat/settings/_visual-grid.scss +27 -0
  126. data/docs/_sass/vendor/normalize-3.0.2.scss +427 -0
  127. data/docs/_sass/vendor/pygments.scss +356 -0
  128. data/docs/automating_browsers/capybara.md +70 -0
  129. data/docs/css/screen.scss +7 -0
  130. data/docs/guides/callbacks.md +45 -0
  131. data/docs/guides/cli.md +52 -0
  132. data/docs/guides/configuration.md +184 -0
  133. data/docs/guides/error_handling.md +46 -0
  134. data/docs/guides/frontiers.md +93 -0
  135. data/docs/guides/halting.md +23 -0
  136. data/docs/guides/job_queues.md +26 -0
  137. data/docs/guides/locals.md +36 -0
  138. data/docs/guides/logging.md +22 -0
  139. data/docs/guides/page_objects.md +67 -0
  140. data/docs/guides/peeking.md +46 -0
  141. data/docs/guides/selenium_capybara.md +100 -0
  142. data/docs/guides/tutorial.md +452 -0
  143. data/docs/index.md +82 -0
  144. data/docs/js/navigation.js +11 -0
  145. data/docs/misc/contributing.md +20 -0
  146. data/docs/misc/testing.md +11 -0
  147. data/docs/recipes/authentication.md +23 -0
  148. data/docs/recipes/csv.md +29 -0
  149. data/docs/recipes/javascript.md +20 -0
  150. data/docs/recipes/multiple_uris.md +18 -0
  151. data/docs/recipes/screenshots.md +20 -0
  152. data/docs/routing/custom_rules.md +16 -0
  153. data/docs/routing/filetypes_rules.md +21 -0
  154. data/docs/routing/host_rules.md +24 -0
  155. data/docs/routing/path_rules.md +33 -0
  156. data/docs/routing/protocol_rules.md +17 -0
  157. data/docs/routing/query_rules.md +69 -0
  158. data/docs/routing/routes.md +96 -0
  159. data/docs/routing/uri_rules.md +18 -0
  160. data/examples/collect_github_issues.rb +65 -0
  161. data/examples/find_foobar_on_wikipedia.rb +23 -0
  162. data/lib/wayfarer/configuration.rb +86 -0
  163. data/lib/wayfarer/crawl.rb +79 -0
  164. data/lib/wayfarer/crawl_observer.rb +103 -0
  165. data/lib/wayfarer/dispatcher.rb +104 -0
  166. data/lib/wayfarer/finders.rb +61 -0
  167. data/lib/wayfarer/frontiers/frontier.rb +79 -0
  168. data/lib/wayfarer/frontiers/memory_bloomfilter.rb +32 -0
  169. data/lib/wayfarer/frontiers/memory_frontier.rb +76 -0
  170. data/lib/wayfarer/frontiers/memory_trie_frontier.rb +39 -0
  171. data/lib/wayfarer/frontiers/normalize_uris.rb +48 -0
  172. data/lib/wayfarer/frontiers/redis_bloomfilter.rb +34 -0
  173. data/lib/wayfarer/frontiers/redis_frontier.rb +83 -0
  174. data/lib/wayfarer/http_adapters/adapter_pool.rb +62 -0
  175. data/lib/wayfarer/http_adapters/net_http_adapter.rb +77 -0
  176. data/lib/wayfarer/http_adapters/selenium_adapter.rb +80 -0
  177. data/lib/wayfarer/job.rb +211 -0
  178. data/lib/wayfarer/locals.rb +40 -0
  179. data/lib/wayfarer/page.rb +94 -0
  180. data/lib/wayfarer/parsers/json_parser.rb +20 -0
  181. data/lib/wayfarer/parsers/xml_parser.rb +27 -0
  182. data/lib/wayfarer/processor.rb +103 -0
  183. data/lib/wayfarer/routing/custom_rule.rb +21 -0
  184. data/lib/wayfarer/routing/filetypes_rule.rb +20 -0
  185. data/lib/wayfarer/routing/host_rule.rb +19 -0
  186. data/lib/wayfarer/routing/path_rule.rb +54 -0
  187. data/lib/wayfarer/routing/protocol_rule.rb +21 -0
  188. data/lib/wayfarer/routing/query_rule.rb +59 -0
  189. data/lib/wayfarer/routing/router.rb +71 -0
  190. data/lib/wayfarer/routing/rule.rb +114 -0
  191. data/lib/wayfarer/routing/uri_rule.rb +21 -0
  192. data/lib/wayfarer.rb +68 -0
  193. data/spec/configuration_spec.rb +26 -0
  194. data/spec/crawl_spec.rb +48 -0
  195. data/spec/finders_spec.rb +49 -0
  196. data/spec/frontiers/memory_bloomfilter_spec.rb +6 -0
  197. data/spec/frontiers/memory_frontier_spec.rb +6 -0
  198. data/spec/frontiers/memory_trie_frontier_spec.rb +6 -0
  199. data/spec/frontiers/normalize_uris_spec.rb +59 -0
  200. data/spec/frontiers/redis_bloomfilter_spec.rb +6 -0
  201. data/spec/frontiers/redis_frontier_spec.rb +6 -0
  202. data/spec/http_adapters/adapter_pool_spec.rb +33 -0
  203. data/spec/http_adapters/net_http_adapter_spec.rb +83 -0
  204. data/spec/http_adapters/selenium_adapter_spec.rb +53 -0
  205. data/spec/integration/callbacks_spec.rb +42 -0
  206. data/spec/integration/locals_spec.rb +106 -0
  207. data/spec/integration/peeking_spec.rb +61 -0
  208. data/spec/job_spec.rb +122 -0
  209. data/spec/page_spec.rb +38 -0
  210. data/spec/parsers/json_parser_spec.rb +30 -0
  211. data/spec/parsers/xml_parser_spec.rb +24 -0
  212. data/spec/processor_spec.rb +31 -0
  213. data/spec/routing/custom_rule_spec.rb +26 -0
  214. data/spec/routing/filetypes_rule_spec.rb +40 -0
  215. data/spec/routing/host_rule_spec.rb +48 -0
  216. data/spec/routing/path_rule_spec.rb +66 -0
  217. data/spec/routing/protocol_rule_spec.rb +26 -0
  218. data/spec/routing/query_rule_spec.rb +124 -0
  219. data/spec/routing/router_spec.rb +67 -0
  220. data/spec/routing/rule_spec.rb +251 -0
  221. data/spec/routing/uri_rule_spec.rb +24 -0
  222. data/spec/shared/frontier.rb +96 -0
  223. data/spec/spec_helpers.rb +62 -0
  224. data/spec/wayfarer_spec.rb +24 -0
  225. data/support/static/finders.html +38 -0
  226. data/support/static/graph/details/a.html +10 -0
  227. data/support/static/graph/details/b.html +10 -0
  228. data/support/static/graph/index.html +20 -0
  229. data/support/static/json/dummy.json +13 -0
  230. data/support/static/links/links.html +28 -0
  231. data/support/static/xml/dummy.xml +120 -0
  232. data/support/test_app.rb +45 -0
  233. data/wayfarer-jruby.gemspec +49 -0
  234. data/wayfarer.gemspec +53 -0
  235. metadata +697 -0
@@ -0,0 +1,356 @@
1
+ /* Generated by Pygments CSS Theme Builder - https://jwarby.github.io/jekyll-pygments-themes/builder.html */
2
+ /* Base Style */
3
+ .highlight pre {
4
+ color: #333333;
5
+ background-color: transparent;
6
+ }
7
+ /* Punctuation */
8
+ .highlight .p {
9
+ color: #333333;
10
+ background-color: transparent;
11
+ }
12
+ /* Error */
13
+ .highlight .err {
14
+ color: #333333;
15
+ background-color: transparent;
16
+ }
17
+ /* Base Style */
18
+ .highlight .n {
19
+ color: #333333;
20
+ background-color: transparent;
21
+ }
22
+ /* Name Attribute */
23
+ .highlight .na {
24
+ color: #333333;
25
+ background-color: transparent;
26
+ }
27
+ /* Name Builtin */
28
+ .highlight .nb {
29
+ color: #333333;
30
+ background-color: transparent;
31
+ }
32
+ /* Name Class */
33
+ .highlight .nc {
34
+ color: #333333;
35
+ background-color: transparent;
36
+ }
37
+ /* Name Constant */
38
+ .highlight .no {
39
+ color: #333333;
40
+ background-color: transparent;
41
+ }
42
+ /* Name Decorator */
43
+ .highlight .nd {
44
+ color: #333333;
45
+ background-color: transparent;
46
+ }
47
+ /* Name Entity */
48
+ .highlight .ni {
49
+ color: #a20e30;
50
+ background-color: transparent;
51
+ }
52
+ /* Name Exception */
53
+ .highlight .ne {
54
+ color: #333333;
55
+ background-color: transparent;
56
+ }
57
+ /* Name Function */
58
+ .highlight .nf {
59
+ color: #333333;
60
+ background-color: transparent;
61
+ }
62
+ /* Name Label */
63
+ .highlight .nl {
64
+ color: #333333;
65
+ background-color: transparent;
66
+ }
67
+ /* Name Namespace */
68
+ .highlight .nn {
69
+ color: #333333;
70
+ background-color: transparent;
71
+ }
72
+ /* Name Other */
73
+ .highlight .nx {
74
+ color: #333333;
75
+ background-color: transparent;
76
+ }
77
+ /* Name Property */
78
+ .highlight .py {
79
+ color: #333333;
80
+ background-color: transparent;
81
+ }
82
+ /* Name Tag */
83
+ .highlight .nt {
84
+ color: #333333;
85
+ background-color: transparent;
86
+ }
87
+ /* Name Variable */
88
+ .highlight .nv {
89
+ color: #333333;
90
+ background-color: transparent;
91
+ }
92
+ /* Name Variable Class */
93
+ .highlight .vc {
94
+ color: #333333;
95
+ background-color: transparent;
96
+ }
97
+ /* Name Variable Global */
98
+ .highlight .vg {
99
+ color: #333333;
100
+ background-color: transparent;
101
+ }
102
+ /* Name Variable Instance */
103
+ .highlight .vi {
104
+ color: #333333;
105
+ background-color: transparent;
106
+ }
107
+ /* Name Builtin Pseudo */
108
+ .highlight .bp {
109
+ color: #333333;
110
+ background-color: transparent;
111
+ }
112
+ /* Base Style */
113
+ .highlight .g {
114
+ color: #333333;
115
+ background-color: transparent;
116
+ }
117
+ /* Generic Deleted */
118
+ .highlight .gd {
119
+ color: #333333;
120
+ background-color: transparent;
121
+ }
122
+ /* Base Style */
123
+ .highlight .o {
124
+ color: #333333;
125
+ background-color: transparent;
126
+ }
127
+ /* Operator Word */
128
+ .highlight .ow {
129
+ color: #333333;
130
+ background-color: transparent;
131
+ }
132
+ /* Base Style */
133
+ .highlight .c {
134
+ color: #727273;
135
+ background-color: transparent;
136
+ }
137
+ /* Comment Multiline */
138
+ .highlight .cm {
139
+ color: #727273;
140
+ background-color: transparent;
141
+ }
142
+ /* Comment Preproc */
143
+ .highlight .cp {
144
+ color: #727273;
145
+ background-color: transparent;
146
+ }
147
+ /* Comment Single */
148
+ .highlight .c1 {
149
+ color: #727273;
150
+ background-color: transparent;
151
+ }
152
+ /* Comment Special */
153
+ .highlight .cs {
154
+ color: #727273;
155
+ background-color: transparent;
156
+ }
157
+ /* Base Style */
158
+ .highlight .k {
159
+ color: #333333;
160
+ background-color: transparent;
161
+ }
162
+ /* Keyword Constant */
163
+ .highlight .kc {
164
+ color: #333333;
165
+ background-color: transparent;
166
+ }
167
+ /* Keyword Declaration */
168
+ .highlight .kd {
169
+ color: #2f3661;
170
+ background-color: transparent;
171
+ }
172
+ /* Keyword Namespace */
173
+ .highlight .kn {
174
+ color: #333333;
175
+ background-color: transparent;
176
+ }
177
+ /* Keyword Pseudo */
178
+ .highlight .kp {
179
+ color: #333333;
180
+ background-color: transparent;
181
+ }
182
+ /* Keyword Reserved */
183
+ .highlight .kr {
184
+ color: #333333;
185
+ background-color: transparent;
186
+ }
187
+ /* Keyword Type */
188
+ .highlight .kt {
189
+ color: #333333;
190
+ background-color: transparent;
191
+ }
192
+ /* Base Style */
193
+ .highlight .l {
194
+ color: #a20e30;
195
+ background-color: transparent;
196
+ }
197
+ /* Literal Date */
198
+ .highlight .ld {
199
+ color: #a20e30;
200
+ background-color: transparent;
201
+ }
202
+ /* Literal Number */
203
+ .highlight .m {
204
+ color: #a20e30;
205
+ background-color: transparent;
206
+ }
207
+ /* Literal Number Float */
208
+ .highlight .mf {
209
+ color: #a20e30;
210
+ background-color: transparent;
211
+ }
212
+ /* Literal Number Hex */
213
+ .highlight .mh {
214
+ color: #333333;
215
+ background-color: transparent;
216
+ }
217
+ /* Literal Number Integer */
218
+ .highlight .mi {
219
+ color: #a20e30;
220
+ background-color: transparent;
221
+ }
222
+ /* Literal Number Oct */
223
+ .highlight .mo {
224
+ color: #a20e30;
225
+ background-color: transparent;
226
+ }
227
+ /* Literal Number Integer Long */
228
+ .highlight .il {
229
+ color: #a20e30;
230
+ background-color: transparent;
231
+ }
232
+ /* Literal String */
233
+ .highlight .s {
234
+ color: #a20e30;
235
+ background-color: transparent;
236
+ }
237
+ /* Literal String Backtick */
238
+ .highlight .sb {
239
+ color: #a20e30;
240
+ background-color: transparent;
241
+ }
242
+ /* Literal String Char */
243
+ .highlight .sc {
244
+ color: #a20e30;
245
+ background-color: transparent;
246
+ }
247
+ /* Literal String Doc */
248
+ .highlight .sd {
249
+ color: #a20e30;
250
+ background-color: transparent;
251
+ }
252
+ /* Literal String Double */
253
+ .highlight .s2 {
254
+ color: #a20e30;
255
+ background-color: transparent;
256
+ }
257
+ /* Literal String Escape */
258
+ .highlight .se {
259
+ color: #a20e30;
260
+ background-color: transparent;
261
+ }
262
+ /* Literal String Heredoc */
263
+ .highlight .sh {
264
+ color: #a20e30;
265
+ background-color: transparent;
266
+ }
267
+ /* Literal String Interpol */
268
+ .highlight .si {
269
+ color: #a20e30;
270
+ background-color: transparent;
271
+ }
272
+ /* Literal String Other */
273
+ .highlight .sx {
274
+ color: #a20e30;
275
+ background-color: transparent;
276
+ }
277
+ /* Literal String Regex */
278
+ .highlight .sr {
279
+ color: #a20e30;
280
+ background-color: transparent;
281
+ }
282
+ /* Literal String Single */
283
+ .highlight .s1 {
284
+ color: #a20e30;
285
+ background-color: transparent;
286
+ }
287
+ /* Literal String Symbol */
288
+ .highlight .ss {
289
+ color: #a20e30;
290
+ background-color: transparent;
291
+ }
292
+ /* Base Style */
293
+ .highlight .g {
294
+ color: #333333;
295
+ background-color: transparent;
296
+ }
297
+ /* Generic Deleted */
298
+ .highlight .gd {
299
+ color: #333333;
300
+ background-color: transparent;
301
+ }
302
+ /* Generic Emph */
303
+ .highlight .ge {
304
+ color: #333333;
305
+ background-color: transparent;
306
+ }
307
+ /* Generic Error */
308
+ .highlight .gr {
309
+ color: #333333;
310
+ background-color: transparent;
311
+ }
312
+ /* Generic Heading */
313
+ .highlight .gh {
314
+ color: #333333;
315
+ background-color: transparent;
316
+ }
317
+ /* Generic Inserted */
318
+ .highlight .gi {
319
+ color: #333333;
320
+ background-color: transparent;
321
+ }
322
+ /* Generic Output */
323
+ .highlight .go {
324
+ color: #333333;
325
+ background-color: transparent;
326
+ }
327
+ /* Generic Prompt */
328
+ .highlight .gp {
329
+ color: #333333;
330
+ background-color: transparent;
331
+ }
332
+ /* Generic Strong */
333
+ .highlight .gs {
334
+ color: #333333;
335
+ background-color: transparent;
336
+ }
337
+ /* Generic Subheading */
338
+ .highlight .gu {
339
+ color: #333333;
340
+ background-color: transparent;
341
+ }
342
+ /* Generic Traceback */
343
+ .highlight .gt {
344
+ color: #333333;
345
+ background-color: transparent;
346
+ }
347
+ /* Other */
348
+ .highlight .x {
349
+ color: #333333;
350
+ background-color: transparent;
351
+ }
352
+ /* Text Whitespace */
353
+ .highlight .w {
354
+ color: #333333;
355
+ background-color: transparent;
356
+ }
@@ -0,0 +1,70 @@
1
+ ---
2
+ layout: default
3
+ title: Using Capybara
4
+ ---
5
+
6
+ # Using Capybara
7
+ When the Selenium HTTP adapter is enabled, job actions have access to a Selenium WebDriver. You can execute JavaScript, take screenshots, interact with the page, and so on. For an exhaustive list, see [the official API documentation](http://www.rubydoc.info/gems/selenium-webdriver/0.0.28/Selenium/WebDriver/Driver).
8
+
9
+ See [examples/selenium.rb](../examples/selenium.rb).
10
+
11
+ ## Setup
12
+ Inside your instance methods, you have access to `#driver`, which returns a Selenium driver:
13
+
14
+ {% highlight ruby %}
15
+ class DummyJob < Wayfarer::Job
16
+ config do |c|
17
+ c.http_adapter = :selenium
18
+ c.selenium_argv = [:firefox]
19
+ c.connection_count = 4 # Number of instantiated WebDrivers
20
+ end
21
+
22
+ draw uri: "https://example.com"
23
+ def foo
24
+ driver # => #<Selenium::WebDriver::Driver:...>
25
+ end
26
+ end
27
+ {% endhighlight %}
28
+
29
+ ### Selenium Grid
30
+ {% highlight ruby %}
31
+ class DummyJob < Wayfarer::Job
32
+ config do |c|
33
+ c.http_adapter = :selenium
34
+ c.selenium_argv = [
35
+ :remote, url: "http://localhost:4444/wd/hub", desired_capabilities: :firefox
36
+ ]
37
+ end
38
+ end
39
+ {% endhighlight %}
40
+
41
+ ## Executing JavaScript
42
+ ```ruby
43
+ class DummyJob < Wayfarer::Job
44
+ config do |c|
45
+ c.http_adapter = :selenium
46
+ c.selenium_argv = [:firefox]
47
+ end
48
+
49
+ draw uri: "https://example.com"
50
+ def example
51
+ driver.execute_script("console.log('Hello from wayfarer!')")
52
+ end
53
+ end
54
+ ```
55
+
56
+ ## Taking screenshots
57
+ {% highlight ruby %}
58
+ class DummyJob < Wayfarer::Job
59
+ config do |c|
60
+ c.http_adapter = :selenium
61
+ c.selenium_argv = [:firefox]
62
+ c.window_size = [1024, 768]
63
+ end
64
+
65
+ draw uri: "https://example.com"
66
+ def example
67
+ driver.save_screenshot("/tmp/screenshot.png")
68
+ end
69
+ end
70
+ {% endhighlight %}
@@ -0,0 +1,7 @@
1
+ ---
2
+ ---
3
+
4
+ @charset "utf-8";
5
+
6
+ @import "variables";
7
+ @import "base";
@@ -0,0 +1,45 @@
1
+ ---
2
+ layout: default
3
+ title: Callbacks
4
+ ---
5
+
6
+ # Callbacks
7
+
8
+ Besides all [ActiveJob callbacks](http://api.rubyonrails.org/classes/ActiveJob/Callbacks/ClassMethods.html), three other callbacks are available. You have access to [locals](/guides/locals) in all callbacks.
9
+
10
+ ## `before_crawl`
11
+ Fires __once__ before any pages have been retrieved.
12
+
13
+ {% highlight ruby %}
14
+ class DummyJob < Wayfarer::Job
15
+ before_crawl { puts "Work is about to happen" }
16
+ end
17
+ {% endhighlight %}
18
+
19
+ ## `after_crawl`
20
+ Fires __once__ after all pages have been retrieved and processing is done.
21
+
22
+ {% highlight ruby %}
23
+ class DummyJob < Wayfarer::Job
24
+ after_crawl { puts "Work did happen" }
25
+ end
26
+ {% endhighlight %}
27
+
28
+ ## `setup_adapter`
29
+ Fires for every adapter immediately after its creation.
30
+
31
+ The block is yielded the adapter. With the Selenium HTTP adapter, it is also yielded a WebDriver and a wrapping Capybara driver; with other adapters, these two arguments are `nil`.
32
+
33
+ {% highlight ruby %}
34
+ class DummyJob < Wayfarer::Job
35
+ config.http_adapter = :selenium
36
+ config.connection_count = 4
37
+
38
+ setup_adapter do |adapter, driver, browser|
39
+ # This block gets called 4 times with different adapters
40
+ adapter # => The HTTP adapter
41
+ driver # => #<Selenium::WebDriver::Driver:...> or nil
42
+ browser # => #<Capybara::Selenium::Driver:...> or nil
43
+ end
44
+ end
45
+ {% endhighlight %}
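
The "fires once" behaviour of `before_crawl` and `after_crawl` can be pictured with a small plain-Ruby model. This is only a sketch: `SketchJob`, its `hooks` registry, and its `crawl` method are hypothetical stand-ins, not Wayfarer's implementation.

```ruby
# Hypothetical stand-in for a job base class; not Wayfarer's implementation.
class SketchJob
  class << self
    # Per-class registry of callback blocks, keyed by callback name.
    def hooks
      @hooks ||= Hash.new { |hash, name| hash[name] = [] }
    end

    def before_crawl(&block)
      hooks[:before_crawl] << block
    end

    def after_crawl(&block)
      hooks[:after_crawl] << block
    end
  end

  def crawl(uris)
    run_hooks(:before_crawl) # once, before any page is handled
    uris.each { |uri| puts "processing #{uri}" }
    run_hooks(:after_crawl)  # once, after all pages are handled
  end

  private

  def run_hooks(name)
    self.class.hooks[name].each { |hook| instance_exec(&hook) }
  end
end

class DummyJob < SketchJob
  before_crawl { puts "Work is about to happen" }
  after_crawl  { puts "Work did happen" }
end

DummyJob.new.crawl(%w[http://example.com/a http://example.com/b])
```

However many URIs are passed in, each of the two callbacks runs exactly once per crawl in this model.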
@@ -0,0 +1,52 @@
1
+ ---
2
+ layout: default
3
+ title: CLI
4
+ ---
5
+
6
+ # Command-line interface
7
+ Wayfarer ships with a small executable, `wayfarer`.
8
+
9
+ Job classes are loaded by naming convention, e.g. if you pass `./directory/foo_bar.rb` as the `FILE` parameter, that file is expected to define the class `FooBar`. You can leave off the `.rb` extension.
10
+
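
The naming convention described above can be sketched in plain Ruby. The helper `class_name_for` is hypothetical and only illustrates the snake_case to CamelCase mapping; it is not taken from the `wayfarer` executable.

```ruby
# Hypothetical helper: derive the expected class name from a FILE argument.
# Assumes snake_case file names map to CamelCase class names.
def class_name_for(path)
  basename = File.basename(path, ".rb") # also works without the extension
  basename.split("_").map(&:capitalize).join
end

puts class_name_for("./directory/foo_bar.rb") # prints "FooBar"
puts class_name_for("./foo_bar")              # prints "FooBar"
```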
11
+ ## `% wayfarer route FILE URI`
12
+ Loads the job defined in `FILE`, and prints the first matching route for `URI`.
13
+
14
+ ## `% wayfarer enqueue FILE URI`
15
+ Loads and enqueues the job in `FILE`, starting from `URI`.
16
+
17
+ * `--log_level LEVEL`
18
+ Option. Which log messages to print.
19
+
20
+ * Default: `info`
21
+ * Recognized values: `unknown`, `debug`, `error`, `fatal`, `info`, `warn`
22
+
23
+ * `--queue_adapter ADAPTER`
24
+ Option. Which ActiveJob queue adapter to use (e.g. `sidekiq`, `resque`).
25
+ * Recognized values: strings, see [documentation](http://api.rubyonrails.org/)
26
+
27
+ * `--wait VALUE`
28
+ Option. Point in time at which the enqueued job should run.
29
+
30
+ 1. If the value can be converted to an integer, it is interpreted as a number of seconds from now.
31
+ 2. If the value can be parsed by `Time::parse`, the job gets scheduled at that point in time.
32
+ 3. If the value is a human-readable time string that [Chronic](https://github.com/mojombo/chronic) can make sense of, the job is scheduled at that point in time.
33
+
34
+ __Examples:__
35
+
36
+ 60 seconds from now:
37
+
38
+ ```
39
+ % wayfarer enqueue ./foo_bar http://google.com --wait 60
40
+ ```
41
+
42
+ 6pm, today:
43
+
44
+ ```
45
+ % wayfarer enqueue ./foo_bar http://google.com --wait 18:00
46
+ ```
47
+
48
+ Tomorrow:
49
+
50
+ ```
51
+ % wayfarer enqueue ./foo_bar http://google.com --wait tomorrow
52
+ ```
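
The order in which `--wait` values are interpreted can be sketched with the standard library alone. The helper `resolve_wait` is hypothetical, and the Chronic step is deliberately left out so the sketch has no gem dependencies; it returns `nil` where the real lookup would fall through to Chronic.

```ruby
require "time"

# Hypothetical helper mirroring the first two documented rules for --wait.
def resolve_wait(value, now: Time.now)
  seconds = Integer(value, exception: false)
  return now + seconds if seconds # rule 1: integer => seconds from now

  Time.parse(value, now)          # rule 2: anything Time.parse understands
rescue ArgumentError
  nil                             # rule 3 would hand the string to Chronic
end

now = Time.new(2016, 1, 1, 12, 0, 0)
p resolve_wait("60", now: now)       # one minute after `now`
p resolve_wait("18:00", now: now)    # 6pm on the same day as `now`
p resolve_wait("tomorrow", now: now) # not handled by rules 1 or 2 in this sketch
```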
@@ -0,0 +1,184 @@
1
+ ---
2
+ layout: default
3
+ title: Configuration
4
+ ---
5
+
6
+ # Configuration
7
+
8
+ All job classes base their configuration off the global one.
9
+
10
+ {% highlight ruby %}
11
+ # Setting a key globally applies to all jobs ...
12
+ Wayfarer.config.key = :value
13
+
14
+ class DummyJob < Wayfarer::Job
15
+ # ... unless a job overrides it
16
+ config.key = :other_value
17
+ end
18
+
19
+ class DummyJob < Wayfarer::Job
20
+ # Alternatively, have the configuration yielded to a block
21
+ config { |c| c.key = :other_value }
22
+ end
23
+ {% endhighlight %}
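
One way to picture "job classes base their configuration off the global one" is a per-class copy of a global settings object. The sketch below is a plain-Ruby model under that assumption; `SketchConfig`, `GLOBAL_CONFIG`, and `SketchJob` are hypothetical names, and the moment at which the copy is taken (first access) is also an assumption.

```ruby
# Hypothetical settings object holding two of the documented keys.
SketchConfig = Struct.new(:connection_count, :http_adapter, keyword_init: true)

# Global defaults, using the default values listed in this guide.
GLOBAL_CONFIG = SketchConfig.new(connection_count: 4, http_adapter: :net_http)

class SketchJob
  # Each job class lazily copies the global config, then may override keys.
  def self.config
    @config ||= GLOBAL_CONFIG.dup
    yield @config if block_given?
    @config
  end
end

class FirstJob < SketchJob
  config.http_adapter = :selenium
end

class SecondJob < SketchJob
  config { |c| c.connection_count = 8 }
end

p FirstJob.config.http_adapter      # overridden in FirstJob only
p SecondJob.config.connection_count # overridden in SecondJob only
p GLOBAL_CONFIG.http_adapter        # global default is untouched
```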
24
+
25
+ ## Recognized keys and values
26
+
27
+ ### `print_stacktraces`
28
+ * Default: `true`
29
+ * Recognized values: Booleans
30
+
31
+ Whether to print stacktraces when encountering unhandled exceptions in job actions. See [Error handling]({{base}}/guides/error_handling.html).
32
+
33
+ ---
34
+
35
+ ### `reraise_exceptions`
36
+
37
+ * Default: `false`
38
+ * Recognized values: Booleans
39
+
40
+ Whether to crash when encountering unhandled exceptions in job actions. See [Error handling]({{base}}/guides/error_handling.html).
41
+
42
+ ---
43
+
44
+ ### `allow_circulation`
45
+
46
+ * Default: `false`
47
+ * Recognized values: Booleans
48
+
49
+ Whether URIs may be visited twice.
50
+
51
+ <aside class="note">
52
+ Allowing circulation might cause your jobs to not terminate.
53
+ </aside>
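
Why circulation can keep a job from terminating is easiest to see on a link graph with a cycle. The following sketch is a toy breadth-first crawl in plain Ruby; `LINKS`, `crawl`, and the `max_steps` safety cap are all hypothetical and exist only so the example halts.

```ruby
require "set"

# Hypothetical link graph with a cycle: a -> b -> a.
LINKS = { "a" => ["b"], "b" => ["a", "c"], "c" => [] }.freeze

def crawl(start, allow_circulation: false, max_steps: 10)
  visited = Set.new
  queue = [start]
  order = []
  until queue.empty? || order.size >= max_steps
    uri = queue.shift
    # Without circulation, a URI that was already visited is skipped.
    next if !allow_circulation && visited.include?(uri)

    visited << uri
    order << uri
    queue.concat(LINKS.fetch(uri))
  end
  order
end

p crawl("a")                          # each URI once, then the queue drains
p crawl("a", allow_circulation: true) # only stops because of max_steps
```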
54
+
55
+ ---
56
+
57
+ ### `normalize_uris`
58
+
59
+ * Default: `true`
60
+ * Recognized values: Booleans
61
+
62
+ Whether to strip fragments, reorder query keys, etc. when staging and caching URIs. Customizable with the `:normalize_uri_options` key. See [normalize_url](https://github.com/rwz/normalize_url).
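
To make the effect of normalization concrete, here is a minimal sketch using only Ruby's standard `uri` library. It covers just the two transformations named above (dropping the fragment and ordering query keys); the helper `normalize` is hypothetical, and the normalize_url gem applies more rules than this.

```ruby
require "uri"

# Hypothetical helper: strip the fragment and sort query parameters.
def normalize(uri_string)
  uri = URI.parse(uri_string)
  uri.fragment = nil
  if uri.query
    pairs = URI.decode_www_form(uri.query).sort
    uri.query = URI.encode_www_form(pairs)
  end
  uri.to_s
end

first  = normalize("http://example.com/search?b=2&a=1#results")
second = normalize("http://example.com/search?a=1&b=2")

puts first           # prints "http://example.com/search?a=1&b=2"
puts first == second # prints "true"
```

With normalization in place, both spellings count as the same URI when deciding whether a page has already been visited.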
63
+
64
+ ---
65
+
66
+ ### `normalize_uri_options`
67
+
68
+ * Default: `{}`
69
+ * Recognized values: See [normalize_url](https://github.com/rwz/normalize_url).
70
+
71
+ ---
72
+
73
+ ### `frontier`
74
+ * Default: `:memory`
75
+ * Recognized values: See [(Redis) frontiers](frontiers.html).
76
+
77
+ Which frontier to use.
78
+
79
+ <aside class="note">
80
+ Bloom filters may yield false positives. See the <a href="https://en.wikipedia.org/wiki/Bloom_filter">Wikipedia article</a>.
81
+ </aside>
82
+
83
+ ---
84
+
85
+ ### `connection_count`
86
+
87
+ * Default: `4`
88
+ * Recognized values: Integers
89
+
90
+ How many threads and HTTP adapters to use (1:1 correspondence).
91
+
92
+ ---
93
+
94
+ ### `http_adapter`
95
+
96
+ * Default: `:net_http`
97
+ * Recognized values: `:net_http`, `:selenium`
98
+
99
+ Which HTTP adapter to use. See [Selenium & Capybara](selenium_capybara.html).
100
+
101
+ ---
102
+
103
+ ### `connection_timeout`
104
+
105
+ * Default: `Float::INFINITY`
106
+ * Recognized values: Floats
107
+
108
+ Time in seconds that a job instance may hold an HTTP adapter. Instances that exceed this time limit raise an exception.
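
The interplay of `connection_count` and `connection_timeout` can be sketched as a small pool built on Ruby's thread-safe `Queue`. `SketchAdapterPool` is a hypothetical model, not Wayfarer's adapter pool, and the strings stand in for real HTTP adapters. The sketch uses a short finite timeout rather than the default `Float::INFINITY`.

```ruby
require "timeout"

# Hypothetical pool: one adapter per worker, checked out for a limited time.
class SketchAdapterPool
  def initialize(adapters, timeout:)
    @queue = Queue.new
    adapters.each { |adapter| @queue << adapter }
    @timeout = timeout
  end

  # Checks out an adapter, yields it, and always returns it to the pool.
  def with
    adapter = @queue.pop
    begin
      Timeout.timeout(@timeout) { yield adapter }
    ensure
      @queue << adapter
    end
  end

  def available
    @queue.size
  end
end

pool = SketchAdapterPool.new(%w[adapter-1 adapter-2], timeout: 0.05)
puts pool.with { |adapter| "fetched with #{adapter}" }

begin
  pool.with { |_adapter| sleep 0.5 } # holds the adapter for too long
rescue Timeout::Error
  puts "adapter held longer than the timeout"
end

puts pool.available # both adapters are back in the pool
```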
109
+
110
+ ---
111
+
112
+ ### `max_http_redirects`
113
+
114
+ * Default: `3`
115
+ * Recognized values: Integers
116
+
117
+ How many 3xx redirects to follow.
118
+
119
+ <aside class="note">
120
+ Has no effect when using the <code>:selenium</code> HTTP adapter.
121
+ </aside>
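
A redirect limit of this kind can be illustrated without any network access. In the sketch below, `RESPONSES` is a hypothetical in-memory "web", and raising `TooManyRedirects` once the limit is exceeded is an assumption made for the example, not documented Wayfarer behaviour.

```ruby
# Hypothetical in-memory "web": each URI maps to [status, redirect target].
RESPONSES = {
  "http://example.com/a" => [301, "http://example.com/b"],
  "http://example.com/b" => [302, "http://example.com/c"],
  "http://example.com/c" => [200, nil]
}.freeze

class TooManyRedirects < StandardError; end

# Follows 3xx responses until a non-redirect is reached or the limit is hit.
def fetch(uri, max_redirects: 3)
  redirects = 0
  loop do
    status, location = RESPONSES.fetch(uri)
    return uri unless (300..399).cover?(status)
    raise TooManyRedirects, uri if redirects >= max_redirects

    redirects += 1
    uri = location
  end
end

puts fetch("http://example.com/a") # prints "http://example.com/c"

begin
  fetch("http://example.com/a", max_redirects: 1)
rescue TooManyRedirects => error
  puts "gave up at #{error.message}" # prints "gave up at http://example.com/b"
end
```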
122
+
123
+ ---
124
+
125
+ ### `selenium_argv`
126
+
127
+ * Default: `[:firefox]`
128
+ * Recognized values: See [Selenium & Capybara](selenium_capybara.html)
129
+
130
+ Argument vector passed to [`Selenium::WebDriver::Driver::for`](http://www.rubydoc.info/gems/selenium-webdriver/Selenium/WebDriver/Driver#for-class_method).
131
+
132
+ ---
133
+
134
+ ### `redis_opts`
135
+
136
+ * Default: `{ host: "localhost", port: 6379 }`
137
+ * Recognized values: [See documentation](http://www.rubydoc.info/github/redis/redis-rb/Redis%3Ainitialize)
138
+
139
+ Options passed to [`Redis#initialize`](http://www.rubydoc.info/github/redis/redis-rb/Redis%3Ainitialize).
140
+
141
+ ---
142
+
143
+ ### `bloomfilter_opts`
144
+
145
+ * Default:
146
+ ```
147
+ {
148
+ size: 100,
149
+ hashes: 2,
150
+ seed: 1,
151
+ bucket: 3,
152
+ raise: false
153
+ }
154
+ ```
155
+ * Recognized values:
156
+ * `size`: Integers; number of buckets in a bloom filter
157
+ * `hashes`: Integers; number of hash functions
158
+ * `seed`: Integers; seed of hash functions
159
+ * `bucket`: Integers; number of bits in a bloom filter bucket
160
+ * `raise`: Booleans; whether to raise on bucket overflow
161
+
162
+ Options for [bloomfilter-rb](https://github.com/igrigorik/bloomfilter-rb).
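
To show what `size`, `hashes`, and `seed` control, and why false positives are possible, here is a minimal bloom filter in plain Ruby. It is a sketch only: `SketchBloomFilter` is hypothetical, uses one bit per bucket (so the `bucket` and `raise` options have no counterpart), and hashes with `Zlib.crc32`, which is not necessarily what bloomfilter-rb does.

```ruby
require "zlib"

# Minimal, non-counting bloom filter for illustration only.
class SketchBloomFilter
  def initialize(size: 100, hashes: 2, seed: 1)
    @size = size
    @hashes = hashes
    @seed = seed
    @bits = Array.new(size, false)
  end

  def insert(key)
    positions(key).each { |index| @bits[index] = true }
  end

  # true means "possibly seen"; false means "definitely not seen".
  def include?(key)
    positions(key).all? { |index| @bits[index] }
  end

  private

  # One bucket index per hash function; the seed varies the hash input.
  def positions(key)
    (0...@hashes).map { |n| Zlib.crc32("#{@seed + n}:#{key}") % @size }
  end
end

filter = SketchBloomFilter.new
seen = (1..50).map { |n| "http://example.com/#{n}" }
seen.each { |uri| filter.insert(uri) }

puts seen.all? { |uri| filter.include?(uri) } # prints "true": no false negatives

unseen = (51..150).map { |n| "http://example.com/#{n}" }
false_positives = unseen.count { |uri| filter.include?(uri) }
puts "false positives among #{unseen.size} unseen URIs: #{false_positives}"
```

With only 100 buckets, a crawl of more than a handful of URIs fills most of the filter, so a larger `size` is the first knob to turn when unseen URIs are being reported as already visited.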
163
+
164
+ ---
165
+
166
+ ### `window_size`
167
+
168
+ * Default: `[1024, 768]`
169
+ * Recognized values: `[Integer, Integer]`
170
+
171
+ Dimensions of browser windows.
172
+
173
+ <aside class="note">
174
+ Only has an effect when using the <code>:selenium</code> HTTP adapter.
175
+ </aside>
176
+
177
+ ---
178
+
179
+ ### `mustermann_type`
180
+
181
+ * Default: `:sinatra`
182
+ * Recognized values: [See documentation](https://github.com/sinatra/mustermann)
183
+
184
+ Which [Mustermann](https://github.com/sinatra/mustermann) pattern type to use.