wukong 3.0.0.pre → 3.0.0.pre2

Files changed (476)
  1. data/.gitignore +46 -33
  2. data/.gitmodules +3 -0
  3. data/.rspec +1 -1
  4. data/.travis.yml +8 -1
  5. data/.yardopts +0 -13
  6. data/Guardfile +4 -6
  7. data/{LICENSE.textile → LICENSE.md} +43 -55
  8. data/README-old.md +422 -0
  9. data/README.md +279 -418
  10. data/Rakefile +21 -5
  11. data/TODO.md +6 -6
  12. data/bin/wu-clean-encoding +31 -0
  13. data/bin/wu-lign +2 -2
  14. data/bin/wu-local +69 -0
  15. data/bin/wu-server +70 -0
  16. data/examples/Gemfile +38 -0
  17. data/examples/README.md +9 -0
  18. data/examples/dataflow/apache_log_line.rb +64 -25
  19. data/examples/dataflow/fibonacci_series.rb +101 -0
  20. data/examples/dataflow/parse_apache_logs.rb +37 -7
  21. data/examples/{dataflow.rb → dataflow/scraper_macro_flow.rb} +0 -0
  22. data/examples/dataflow/simple.rb +4 -4
  23. data/examples/geo.rb +4 -0
  24. data/examples/geo/geo_grids.numbers +0 -0
  25. data/examples/geo/geolocated.rb +331 -0
  26. data/examples/geo/quadtile.rb +69 -0
  27. data/examples/geo/spec/geolocated_spec.rb +247 -0
  28. data/examples/geo/tile_fetcher.rb +77 -0
  29. data/examples/graph/minimum_spanning_tree.rb +61 -61
  30. data/examples/jabberwocky.txt +36 -0
  31. data/examples/models/wikipedia.rb +20 -0
  32. data/examples/munging/Gemfile +8 -0
  33. data/examples/munging/airline_flights/airline.rb +57 -0
  34. data/examples/munging/airline_flights/airline_flights.rake +83 -0
  35. data/{lib/wukong/settings.rb → examples/munging/airline_flights/airplane.rb} +0 -0
  36. data/examples/munging/airline_flights/airport.rb +211 -0
  37. data/examples/munging/airline_flights/airport_id_unification.rb +129 -0
  38. data/examples/munging/airline_flights/airport_ok_chars.rb +4 -0
  39. data/examples/munging/airline_flights/flight.rb +156 -0
  40. data/examples/munging/airline_flights/models.rb +4 -0
  41. data/examples/munging/airline_flights/parse.rb +26 -0
  42. data/examples/munging/airline_flights/reconcile_airports.rb +142 -0
  43. data/examples/munging/airline_flights/route.rb +35 -0
  44. data/examples/munging/airline_flights/tasks.rake +83 -0
  45. data/examples/munging/airline_flights/timezone_fixup.rb +62 -0
  46. data/examples/munging/airline_flights/topcities.rb +167 -0
  47. data/examples/munging/airports/40_wbans.txt +40 -0
  48. data/examples/munging/airports/filter_weather_reports.rb +37 -0
  49. data/examples/munging/airports/join.pig +31 -0
  50. data/examples/munging/airports/to_tsv.rb +33 -0
  51. data/examples/munging/airports/usa_wbans.pig +19 -0
  52. data/examples/munging/airports/usa_wbans.txt +2157 -0
  53. data/examples/munging/airports/wbans.pig +19 -0
  54. data/examples/munging/airports/wbans.txt +2310 -0
  55. data/examples/munging/geo/geo_json.rb +54 -0
  56. data/examples/munging/geo/geo_models.rb +69 -0
  57. data/examples/munging/geo/geonames_models.rb +78 -0
  58. data/examples/munging/geo/iso_codes.rb +172 -0
  59. data/examples/munging/geo/reconcile_countries.rb +124 -0
  60. data/examples/munging/geo/tasks.rake +71 -0
  61. data/examples/munging/rake_helper.rb +62 -0
  62. data/examples/munging/weather/.gitignore +1 -0
  63. data/examples/munging/weather/Gemfile +4 -0
  64. data/examples/munging/weather/Rakefile +28 -0
  65. data/examples/munging/weather/extract_ish.rb +13 -0
  66. data/examples/munging/weather/models/weather.rb +119 -0
  67. data/examples/munging/weather/utils/noaa_downloader.rb +46 -0
  68. data/examples/munging/wikipedia/README.md +34 -0
  69. data/examples/munging/wikipedia/Rakefile +193 -0
  70. data/examples/munging/wikipedia/articles/extract_articles-parsed.rb +79 -0
  71. data/examples/munging/wikipedia/articles/extract_articles-templated.rb +136 -0
  72. data/examples/munging/wikipedia/articles/textualize_articles.rb +54 -0
  73. data/examples/munging/wikipedia/articles/verify_structure.rb +43 -0
  74. data/examples/munging/wikipedia/articles/wp2txt-LICENSE.txt +22 -0
  75. data/examples/munging/wikipedia/articles/wp2txt_article.rb +259 -0
  76. data/examples/munging/wikipedia/articles/wp2txt_utils.rb +452 -0
  77. data/examples/munging/wikipedia/dbpedia/dbpedia_common.rb +4 -0
  78. data/examples/munging/wikipedia/dbpedia/dbpedia_extract_geocoordinates.rb +78 -0
  79. data/examples/munging/wikipedia/dbpedia/extract_links.rb +193 -0
  80. data/examples/munging/wikipedia/dbpedia/sameas_extractor.rb +20 -0
  81. data/examples/munging/wikipedia/n1_subuniverse/n1_nodes.pig +18 -0
  82. data/examples/munging/wikipedia/page_metadata/extract_page_metadata.rb +21 -0
  83. data/examples/munging/wikipedia/page_metadata/extract_page_metadata.rb.old +27 -0
  84. data/examples/munging/wikipedia/pagelinks/augment_pagelinks.pig +29 -0
  85. data/examples/munging/wikipedia/pagelinks/extract_pagelinks.rb +14 -0
  86. data/examples/munging/wikipedia/pagelinks/extract_pagelinks.rb.old +25 -0
  87. data/examples/munging/wikipedia/pagelinks/undirect_pagelinks.pig +29 -0
  88. data/examples/munging/wikipedia/pageviews/augment_pageviews.pig +32 -0
  89. data/examples/munging/wikipedia/pageviews/extract_pageviews.rb +85 -0
  90. data/examples/munging/wikipedia/pig_style_guide.md +25 -0
  91. data/examples/munging/wikipedia/redirects/redirects_page_metadata.pig +19 -0
  92. data/examples/munging/wikipedia/subuniverse/sub_articles.pig +23 -0
  93. data/examples/munging/wikipedia/subuniverse/sub_page_metadata.pig +24 -0
  94. data/examples/munging/wikipedia/subuniverse/sub_pagelinks_from.pig +22 -0
  95. data/examples/munging/wikipedia/subuniverse/sub_pagelinks_into.pig +22 -0
  96. data/examples/munging/wikipedia/subuniverse/sub_pagelinks_within.pig +26 -0
  97. data/examples/munging/wikipedia/subuniverse/sub_pageviews.pig +29 -0
  98. data/examples/munging/wikipedia/subuniverse/sub_undirected_pagelinks_within.pig +24 -0
  99. data/examples/munging/wikipedia/utils/get_namespaces.rb +86 -0
  100. data/examples/munging/wikipedia/utils/munging_utils.rb +68 -0
  101. data/examples/munging/wikipedia/utils/namespaces.json +1 -0
  102. data/examples/rake_helper.rb +85 -0
  103. data/examples/server_logs/geo_ip_mapping/munge_geolite.rb +82 -0
  104. data/examples/server_logs/logline.rb +95 -0
  105. data/examples/server_logs/models.rb +66 -0
  106. data/examples/server_logs/page_counts.pig +48 -0
  107. data/examples/server_logs/server_logs-01-parse-script.rb +13 -0
  108. data/examples/server_logs/server_logs-02-histograms-full.rb +33 -0
  109. data/examples/server_logs/server_logs-02-histograms-mapper.rb +14 -0
  110. data/{old/examples/server_logs/breadcrumbs.rb → examples/server_logs/server_logs-03-breadcrumbs-full.rb} +26 -30
  111. data/examples/server_logs/server_logs-04-page_page_edges-full.rb +40 -0
  112. data/examples/string_reverser.rb +26 -0
  113. data/examples/text/pig_latin.rb +2 -2
  114. data/examples/text/regional_flavor/README.md +14 -0
  115. data/examples/text/regional_flavor/article_wordbags.pig +39 -0
  116. data/examples/text/regional_flavor/j01-article_wordbags.rb +4 -0
  117. data/examples/text/regional_flavor/simple_pig_script.pig +27 -0
  118. data/examples/word_count/accumulator.rb +26 -0
  119. data/examples/word_count/tokenizer.rb +13 -0
  120. data/examples/word_count/word_count.rb +6 -0
  121. data/examples/workflow/cherry_pie.dot +97 -0
  122. data/examples/workflow/cherry_pie.png +0 -0
  123. data/examples/workflow/cherry_pie.rb +61 -26
  124. data/lib/hanuman.rb +34 -7
  125. data/lib/hanuman/graph.rb +55 -31
  126. data/lib/hanuman/graphvizzer.rb +199 -178
  127. data/lib/hanuman/graphvizzer/gv_models.rb +161 -0
  128. data/lib/hanuman/graphvizzer/gv_presenter.rb +97 -0
  129. data/lib/hanuman/link.rb +35 -0
  130. data/lib/hanuman/registry.rb +46 -0
  131. data/lib/hanuman/stage.rb +76 -32
  132. data/lib/wukong.rb +23 -24
  133. data/lib/wukong/boot.rb +87 -0
  134. data/lib/wukong/configuration.rb +8 -0
  135. data/lib/wukong/dataflow.rb +45 -78
  136. data/lib/wukong/driver.rb +99 -0
  137. data/lib/wukong/emitter.rb +22 -0
  138. data/lib/wukong/model/faker.rb +24 -24
  139. data/lib/wukong/model/flatpack_parser/flat.rb +60 -0
  140. data/lib/wukong/model/flatpack_parser/flatpack.rb +4 -0
  141. data/lib/wukong/model/flatpack_parser/lang.rb +46 -0
  142. data/lib/wukong/model/flatpack_parser/parser.rb +55 -0
  143. data/lib/wukong/model/flatpack_parser/tokens.rb +130 -0
  144. data/lib/wukong/processor.rb +60 -114
  145. data/lib/wukong/spec_helpers.rb +81 -0
  146. data/lib/wukong/spec_helpers/integration_driver.rb +144 -0
  147. data/lib/wukong/spec_helpers/integration_driver_matchers.rb +219 -0
  148. data/lib/wukong/spec_helpers/processor_helpers.rb +95 -0
  149. data/lib/wukong/spec_helpers/processor_methods.rb +108 -0
  150. data/lib/wukong/spec_helpers/shared_examples.rb +15 -0
  151. data/lib/wukong/spec_helpers/spec_driver.rb +28 -0
  152. data/lib/wukong/spec_helpers/spec_driver_matchers.rb +195 -0
  153. data/lib/wukong/version.rb +2 -1
  154. data/lib/wukong/widget/filters.rb +311 -0
  155. data/lib/wukong/widget/processors.rb +156 -0
  156. data/lib/wukong/widget/reducers.rb +7 -0
  157. data/lib/wukong/widget/reducers/accumulator.rb +73 -0
  158. data/lib/wukong/widget/reducers/bin.rb +318 -0
  159. data/lib/wukong/widget/reducers/count.rb +61 -0
  160. data/lib/wukong/widget/reducers/group.rb +85 -0
  161. data/lib/wukong/widget/reducers/group_concat.rb +70 -0
  162. data/lib/wukong/widget/reducers/moments.rb +72 -0
  163. data/lib/wukong/widget/reducers/sort.rb +130 -0
  164. data/lib/wukong/widget/serializers.rb +287 -0
  165. data/lib/wukong/widget/sink.rb +10 -52
  166. data/lib/wukong/widget/source.rb +7 -113
  167. data/lib/wukong/widget/utils.rb +46 -0
  168. data/lib/wukong/widgets.rb +6 -0
  169. data/spec/examples/dataflow/fibonacci_series_spec.rb +18 -0
  170. data/spec/examples/dataflow/parsing_spec.rb +12 -11
  171. data/spec/examples/dataflow/simple_spec.rb +32 -6
  172. data/spec/examples/dataflow/telegram_spec.rb +36 -36
  173. data/spec/examples/graph/minimum_spanning_tree_spec.rb +30 -31
  174. data/spec/examples/munging/airline_flights/identifiers_spec.rb +16 -0
  175. data/spec/examples/munging/airline_flights_spec.rb +202 -0
  176. data/spec/examples/text/pig_latin_spec.rb +13 -16
  177. data/spec/examples/workflow/cherry_pie_spec.rb +34 -4
  178. data/spec/hanuman/graph_spec.rb +27 -2
  179. data/spec/hanuman/hanuman_spec.rb +10 -0
  180. data/spec/hanuman/registry_spec.rb +123 -0
  181. data/spec/hanuman/stage_spec.rb +61 -7
  182. data/spec/spec_helper.rb +29 -19
  183. data/spec/support/hanuman_test_helpers.rb +14 -12
  184. data/spec/support/shared_context_for_reducers.rb +37 -0
  185. data/spec/support/shared_examples_for_builders.rb +101 -0
  186. data/spec/support/shared_examples_for_shortcuts.rb +57 -0
  187. data/spec/support/wukong_test_helpers.rb +37 -11
  188. data/spec/wukong/dataflow_spec.rb +77 -55
  189. data/spec/wukong/local_runner_spec.rb +24 -24
  190. data/spec/wukong/model/faker_spec.rb +132 -131
  191. data/spec/wukong/runner_spec.rb +8 -8
  192. data/spec/wukong/widget/filters_spec.rb +61 -0
  193. data/spec/wukong/widget/processors_spec.rb +126 -0
  194. data/spec/wukong/widget/reducers/bin_spec.rb +92 -0
  195. data/spec/wukong/widget/reducers/count_spec.rb +11 -0
  196. data/spec/wukong/widget/reducers/group_spec.rb +20 -0
  197. data/spec/wukong/widget/reducers/moments_spec.rb +36 -0
  198. data/spec/wukong/widget/reducers/sort_spec.rb +26 -0
  199. data/spec/wukong/widget/serializers_spec.rb +92 -0
  200. data/spec/wukong/widget/sink_spec.rb +15 -15
  201. data/spec/wukong/widget/source_spec.rb +65 -41
  202. data/spec/wukong/wukong_spec.rb +10 -0
  203. data/wukong.gemspec +17 -10
  204. metadata +359 -335
  205. data/.document +0 -5
  206. data/VERSION +0 -1
  207. data/bin/hdp-bin +0 -44
  208. data/bin/hdp-bzip +0 -23
  209. data/bin/hdp-cat +0 -3
  210. data/bin/hdp-catd +0 -3
  211. data/bin/hdp-cp +0 -3
  212. data/bin/hdp-du +0 -86
  213. data/bin/hdp-get +0 -3
  214. data/bin/hdp-kill +0 -3
  215. data/bin/hdp-kill-task +0 -3
  216. data/bin/hdp-ls +0 -11
  217. data/bin/hdp-mkdir +0 -2
  218. data/bin/hdp-mkdirp +0 -12
  219. data/bin/hdp-mv +0 -3
  220. data/bin/hdp-parts_to_keys.rb +0 -77
  221. data/bin/hdp-ps +0 -3
  222. data/bin/hdp-put +0 -3
  223. data/bin/hdp-rm +0 -32
  224. data/bin/hdp-sort +0 -40
  225. data/bin/hdp-stream +0 -40
  226. data/bin/hdp-stream-flat +0 -22
  227. data/bin/hdp-stream2 +0 -39
  228. data/bin/hdp-sync +0 -17
  229. data/bin/hdp-wc +0 -67
  230. data/bin/wu-flow +0 -10
  231. data/bin/wu-map +0 -17
  232. data/bin/wu-red +0 -17
  233. data/bin/wukong +0 -17
  234. data/data/CREDITS.md +0 -355
  235. data/data/graph/airfares.tsv +0 -2174
  236. data/data/text/gift_of_the_magi.txt +0 -225
  237. data/data/text/jabberwocky.txt +0 -36
  238. data/data/text/rectification_of_names.txt +0 -33
  239. data/data/twitter/a_atsigns_b.tsv +0 -64
  240. data/data/twitter/a_follows_b.tsv +0 -53
  241. data/data/twitter/tweet.tsv +0 -167
  242. data/data/twitter/twitter_user.tsv +0 -55
  243. data/data/wikipedia/dbpedia-sentences.tsv +0 -1000
  244. data/docpages/INSTALL.textile +0 -92
  245. data/docpages/LICENSE.textile +0 -107
  246. data/docpages/README-elastic_map_reduce.textile +0 -377
  247. data/docpages/README-performance.textile +0 -90
  248. data/docpages/README-wulign.textile +0 -65
  249. data/docpages/UsingWukong-part1-get_ready.textile +0 -17
  250. data/docpages/UsingWukong-part2-ThinkingBigData.textile +0 -75
  251. data/docpages/UsingWukong-part3-parsing.textile +0 -138
  252. data/docpages/_config.yml +0 -39
  253. data/docpages/avro/avro_notes.textile +0 -56
  254. data/docpages/avro/performance.textile +0 -36
  255. data/docpages/avro/tethering.textile +0 -19
  256. data/docpages/bigdata-tips.textile +0 -143
  257. data/docpages/code/api_response_example.txt +0 -20
  258. data/docpages/code/parser_skeleton.rb +0 -38
  259. data/docpages/diagrams/MapReduceDiagram.graffle +0 -0
  260. data/docpages/favicon.ico +0 -0
  261. data/docpages/gem.css +0 -16
  262. data/docpages/hadoop-tips.textile +0 -83
  263. data/docpages/index.textile +0 -92
  264. data/docpages/intro.textile +0 -8
  265. data/docpages/moreinfo.textile +0 -174
  266. data/docpages/news.html +0 -24
  267. data/docpages/pig/PigLatinExpressionsList.txt +0 -122
  268. data/docpages/pig/PigLatinReferenceManual.txt +0 -1640
  269. data/docpages/pig/commandline_params.txt +0 -26
  270. data/docpages/pig/cookbook.html +0 -481
  271. data/docpages/pig/images/hadoop-logo.jpg +0 -0
  272. data/docpages/pig/images/instruction_arrow.png +0 -0
  273. data/docpages/pig/images/pig-logo.gif +0 -0
  274. data/docpages/pig/piglatin_ref1.html +0 -1103
  275. data/docpages/pig/piglatin_ref2.html +0 -14340
  276. data/docpages/pig/setup.html +0 -505
  277. data/docpages/pig/skin/basic.css +0 -166
  278. data/docpages/pig/skin/breadcrumbs.js +0 -237
  279. data/docpages/pig/skin/fontsize.js +0 -166
  280. data/docpages/pig/skin/getBlank.js +0 -40
  281. data/docpages/pig/skin/getMenu.js +0 -45
  282. data/docpages/pig/skin/images/chapter.gif +0 -0
  283. data/docpages/pig/skin/images/chapter_open.gif +0 -0
  284. data/docpages/pig/skin/images/current.gif +0 -0
  285. data/docpages/pig/skin/images/external-link.gif +0 -0
  286. data/docpages/pig/skin/images/header_white_line.gif +0 -0
  287. data/docpages/pig/skin/images/page.gif +0 -0
  288. data/docpages/pig/skin/images/pdfdoc.gif +0 -0
  289. data/docpages/pig/skin/images/rc-b-l-15-1body-2menu-3menu.png +0 -0
  290. data/docpages/pig/skin/images/rc-b-r-15-1body-2menu-3menu.png +0 -0
  291. data/docpages/pig/skin/images/rc-b-r-5-1header-2tab-selected-3tab-selected.png +0 -0
  292. data/docpages/pig/skin/images/rc-t-l-5-1header-2searchbox-3searchbox.png +0 -0
  293. data/docpages/pig/skin/images/rc-t-l-5-1header-2tab-selected-3tab-selected.png +0 -0
  294. data/docpages/pig/skin/images/rc-t-l-5-1header-2tab-unselected-3tab-unselected.png +0 -0
  295. data/docpages/pig/skin/images/rc-t-r-15-1body-2menu-3menu.png +0 -0
  296. data/docpages/pig/skin/images/rc-t-r-5-1header-2searchbox-3searchbox.png +0 -0
  297. data/docpages/pig/skin/images/rc-t-r-5-1header-2tab-selected-3tab-selected.png +0 -0
  298. data/docpages/pig/skin/images/rc-t-r-5-1header-2tab-unselected-3tab-unselected.png +0 -0
  299. data/docpages/pig/skin/print.css +0 -54
  300. data/docpages/pig/skin/profile.css +0 -181
  301. data/docpages/pig/skin/screen.css +0 -587
  302. data/docpages/pig/tutorial.html +0 -1059
  303. data/docpages/pig/udf.html +0 -1509
  304. data/docpages/tutorial.textile +0 -283
  305. data/docpages/usage.textile +0 -195
  306. data/docpages/wutils.textile +0 -263
  307. data/examples/dataflow/complex.rb +0 -11
  308. data/examples/dataflow/donuts.rb +0 -13
  309. data/examples/tiny_count/jabberwocky_output.tsv +0 -92
  310. data/examples/word_count.rb +0 -48
  311. data/examples/workflow/fiddle.rb +0 -24
  312. data/lib/away/escapement.rb +0 -129
  313. data/lib/away/exe.rb +0 -11
  314. data/lib/away/experimental.rb +0 -5
  315. data/lib/away/from_file.rb +0 -52
  316. data/lib/away/job.rb +0 -56
  317. data/lib/away/job/rake_compat.rb +0 -17
  318. data/lib/away/registry.rb +0 -79
  319. data/lib/away/runner.rb +0 -276
  320. data/lib/away/runner/execute.rb +0 -121
  321. data/lib/away/script.rb +0 -161
  322. data/lib/away/script/hadoop_command.rb +0 -240
  323. data/lib/away/source/file_list_source.rb +0 -15
  324. data/lib/away/source/looper.rb +0 -18
  325. data/lib/away/task.rb +0 -219
  326. data/lib/hanuman/action.rb +0 -21
  327. data/lib/hanuman/chain.rb +0 -4
  328. data/lib/hanuman/graphviz.rb +0 -74
  329. data/lib/hanuman/resource.rb +0 -6
  330. data/lib/hanuman/slot.rb +0 -87
  331. data/lib/hanuman/slottable.rb +0 -220
  332. data/lib/wukong/bad_record.rb +0 -15
  333. data/lib/wukong/event.rb +0 -44
  334. data/lib/wukong/local_runner.rb +0 -55
  335. data/lib/wukong/mapred.rb +0 -3
  336. data/lib/wukong/universe.rb +0 -48
  337. data/lib/wukong/widget/filter.rb +0 -81
  338. data/lib/wukong/widget/gibberish.rb +0 -123
  339. data/lib/wukong/widget/monitor.rb +0 -26
  340. data/lib/wukong/widget/reducer.rb +0 -66
  341. data/lib/wukong/widget/stringifier.rb +0 -50
  342. data/lib/wukong/workflow.rb +0 -22
  343. data/lib/wukong/workflow/command.rb +0 -42
  344. data/old/config/emr-example.yaml +0 -48
  345. data/old/examples/README.txt +0 -17
  346. data/old/examples/contrib/jeans/README.markdown +0 -165
  347. data/old/examples/contrib/jeans/data/normalized_sizes +0 -3
  348. data/old/examples/contrib/jeans/data/orders.tsv +0 -1302
  349. data/old/examples/contrib/jeans/data/sizes +0 -3
  350. data/old/examples/contrib/jeans/normalize.rb +0 -20
  351. data/old/examples/contrib/jeans/sizes.rb +0 -55
  352. data/old/examples/corpus/bnc_word_freq.rb +0 -44
  353. data/old/examples/corpus/bucket_counter.rb +0 -47
  354. data/old/examples/corpus/dbpedia_abstract_to_sentences.rb +0 -86
  355. data/old/examples/corpus/sentence_bigrams.rb +0 -53
  356. data/old/examples/corpus/sentence_coocurrence.rb +0 -66
  357. data/old/examples/corpus/stopwords.rb +0 -138
  358. data/old/examples/corpus/words_to_bigrams.rb +0 -53
  359. data/old/examples/emr/README.textile +0 -110
  360. data/old/examples/emr/dot_wukong_dir/credentials.json +0 -7
  361. data/old/examples/emr/dot_wukong_dir/emr.yaml +0 -69
  362. data/old/examples/emr/dot_wukong_dir/emr_bootstrap.sh +0 -33
  363. data/old/examples/emr/elastic_mapreduce_example.rb +0 -28
  364. data/old/examples/network_graph/adjacency_list.rb +0 -74
  365. data/old/examples/network_graph/breadth_first_search.rb +0 -72
  366. data/old/examples/network_graph/gen_2paths.rb +0 -68
  367. data/old/examples/network_graph/gen_multi_edge.rb +0 -112
  368. data/old/examples/network_graph/gen_symmetric_links.rb +0 -64
  369. data/old/examples/pagerank/README.textile +0 -6
  370. data/old/examples/pagerank/gen_initial_pagerank_graph.pig +0 -57
  371. data/old/examples/pagerank/pagerank.rb +0 -72
  372. data/old/examples/pagerank/pagerank_initialize.rb +0 -42
  373. data/old/examples/pagerank/run_pagerank.sh +0 -21
  374. data/old/examples/sample_records.rb +0 -33
  375. data/old/examples/server_logs/apache_log_parser.rb +0 -15
  376. data/old/examples/server_logs/nook.rb +0 -48
  377. data/old/examples/server_logs/nook/faraday_dummy_adapter.rb +0 -94
  378. data/old/examples/server_logs/user_agent.rb +0 -40
  379. data/old/examples/simple_word_count.rb +0 -82
  380. data/old/examples/size.rb +0 -61
  381. data/old/examples/stats/avg_value_frequency.rb +0 -86
  382. data/old/examples/stats/binning_percentile_estimator.rb +0 -140
  383. data/old/examples/stats/data/avg_value_frequency.tsv +0 -3
  384. data/old/examples/stats/rank_and_bin.rb +0 -173
  385. data/old/examples/stupidly_simple_filter.rb +0 -40
  386. data/old/examples/word_count.rb +0 -75
  387. data/old/graph/graphviz_builder.rb +0 -580
  388. data/old/graph_easy/Attributes.pm +0 -4181
  389. data/old/graph_easy/Graphviz.pm +0 -2232
  390. data/old/wukong.rb +0 -18
  391. data/old/wukong/and_pig.rb +0 -38
  392. data/old/wukong/bad_record.rb +0 -18
  393. data/old/wukong/datatypes.rb +0 -24
  394. data/old/wukong/datatypes/enum.rb +0 -127
  395. data/old/wukong/datatypes/fake_types.rb +0 -17
  396. data/old/wukong/decorator.rb +0 -28
  397. data/old/wukong/encoding/asciize.rb +0 -108
  398. data/old/wukong/extensions.rb +0 -16
  399. data/old/wukong/extensions/array.rb +0 -18
  400. data/old/wukong/extensions/blank.rb +0 -93
  401. data/old/wukong/extensions/class.rb +0 -189
  402. data/old/wukong/extensions/date_time.rb +0 -53
  403. data/old/wukong/extensions/emittable.rb +0 -69
  404. data/old/wukong/extensions/enumerable.rb +0 -79
  405. data/old/wukong/extensions/hash.rb +0 -167
  406. data/old/wukong/extensions/hash_keys.rb +0 -16
  407. data/old/wukong/extensions/hash_like.rb +0 -150
  408. data/old/wukong/extensions/hashlike_class.rb +0 -47
  409. data/old/wukong/extensions/module.rb +0 -2
  410. data/old/wukong/extensions/pathname.rb +0 -27
  411. data/old/wukong/extensions/string.rb +0 -65
  412. data/old/wukong/extensions/struct.rb +0 -17
  413. data/old/wukong/extensions/symbol.rb +0 -11
  414. data/old/wukong/filename_pattern.rb +0 -74
  415. data/old/wukong/helper.rb +0 -7
  416. data/old/wukong/helper/stopwords.rb +0 -195
  417. data/old/wukong/helper/tokenize.rb +0 -35
  418. data/old/wukong/logger.rb +0 -38
  419. data/old/wukong/periodic_monitor.rb +0 -72
  420. data/old/wukong/schema.rb +0 -269
  421. data/old/wukong/script.rb +0 -286
  422. data/old/wukong/script/avro_command.rb +0 -5
  423. data/old/wukong/script/cassandra_loader_script.rb +0 -40
  424. data/old/wukong/script/emr_command.rb +0 -168
  425. data/old/wukong/script/hadoop_command.rb +0 -237
  426. data/old/wukong/script/local_command.rb +0 -41
  427. data/old/wukong/store.rb +0 -10
  428. data/old/wukong/store/base.rb +0 -27
  429. data/old/wukong/store/cassandra.rb +0 -10
  430. data/old/wukong/store/cassandra/streaming.rb +0 -75
  431. data/old/wukong/store/cassandra/struct_loader.rb +0 -21
  432. data/old/wukong/store/cassandra_model.rb +0 -91
  433. data/old/wukong/store/chh_chunked_flat_file_store.rb +0 -37
  434. data/old/wukong/store/chunked_flat_file_store.rb +0 -48
  435. data/old/wukong/store/conditional_store.rb +0 -57
  436. data/old/wukong/store/factory.rb +0 -8
  437. data/old/wukong/store/flat_file_store.rb +0 -89
  438. data/old/wukong/store/key_store.rb +0 -51
  439. data/old/wukong/store/null_store.rb +0 -15
  440. data/old/wukong/store/read_thru_store.rb +0 -22
  441. data/old/wukong/store/tokyo_tdb_key_store.rb +0 -33
  442. data/old/wukong/store/tyrant_rdb_key_store.rb +0 -57
  443. data/old/wukong/store/tyrant_tdb_key_store.rb +0 -20
  444. data/old/wukong/streamer.rb +0 -30
  445. data/old/wukong/streamer/accumulating_reducer.rb +0 -83
  446. data/old/wukong/streamer/base.rb +0 -126
  447. data/old/wukong/streamer/counting_reducer.rb +0 -25
  448. data/old/wukong/streamer/filter.rb +0 -20
  449. data/old/wukong/streamer/instance_streamer.rb +0 -15
  450. data/old/wukong/streamer/json_streamer.rb +0 -21
  451. data/old/wukong/streamer/line_streamer.rb +0 -12
  452. data/old/wukong/streamer/list_reducer.rb +0 -31
  453. data/old/wukong/streamer/rank_and_bin_reducer.rb +0 -145
  454. data/old/wukong/streamer/record_streamer.rb +0 -14
  455. data/old/wukong/streamer/reducer.rb +0 -11
  456. data/old/wukong/streamer/set_reducer.rb +0 -14
  457. data/old/wukong/streamer/struct_streamer.rb +0 -48
  458. data/old/wukong/streamer/summing_reducer.rb +0 -29
  459. data/old/wukong/streamer/uniq_by_last_reducer.rb +0 -51
  460. data/old/wukong/typed_struct.rb +0 -12
  461. data/spec/away/encoding_spec.rb +0 -32
  462. data/spec/away/exe_spec.rb +0 -20
  463. data/spec/away/flow_spec.rb +0 -82
  464. data/spec/away/graph_spec.rb +0 -6
  465. data/spec/away/job_spec.rb +0 -15
  466. data/spec/away/rake_compat_spec.rb +0 -9
  467. data/spec/away/script_spec.rb +0 -81
  468. data/spec/hanuman/graphviz_spec.rb +0 -29
  469. data/spec/hanuman/slot_spec.rb +0 -2
  470. data/spec/support/examples_helper.rb +0 -10
  471. data/spec/support/streamer_test_helpers.rb +0 -6
  472. data/spec/support/wukong_widget_helpers.rb +0 -66
  473. data/spec/wukong/processor_spec.rb +0 -109
  474. data/spec/wukong/widget/filter_spec.rb +0 -99
  475. data/spec/wukong/widget/stringifier_spec.rb +0 -51
  476. data/spec/wukong/workflow/command_spec.rb +0 -5
data/bin/hdp-stream2 DELETED
@@ -1,39 +0,0 @@
- #!/usr/bin/env ruby
- require 'wukong'
-
- # Example usage:
- #
- # ~/ics/wukong/bin/hdp-stream2 input_path1,input_path2 output_path \
- # "`which cuttab` 2,3,7" "`which uniq` -c" 1 3 -jobconf mapred.reduce.tasks=23
-
-
- # options = Settings[:runner_defaults].dup
-
- # cmdline_opts = Hash.zip(
- # [ :input_file, :output_file,
- # :map_command, :reduce_command,
- # :partition_fields, :sort_fields],
- # ARGV.map{|s| s.blank? ? nil : s }
- # )
- # argvs = ARGV.slice!(0..5) ;
- # ARGV.unshift cmdline_opts[:input_file];
- # ARGV.unshift cmdline_opts[:output_file]
- # p [argvs, ARGV]
- #
- # # cmdline_opts[:map_command] = `which cat`.chomp if cmdline_opts[:map_command].blank?
- # # cmdline_opts[:reduce_command] = nil if cmdline_opts[:reduce_command].blank?
- # cmdline_opts[:dry_run] = true
- # cmdline_opts[:run] = true
-
- #p cmdline_opts, Settings[:runner_defaults]
-
- # Go script go!
- runner = Wukong::Script.new(
-   nil, # use mapper_command
-   nil, # use reducer_command
-   :run => true
- )
- # runner.options.merge cmdline_opts
- runner.options[:reuse_jvms] = true if runner.options[:reuse_jvms].blank?
-
- runner.run
data/bin/hdp-sync DELETED
@@ -1,17 +0,0 @@
- #!/usr/bin/env ruby
- require 'wukong'
-
- src_dir, dest_dir = ARGV[0..1]
- src_files = Dir[src_dir + '/*']
- dest_files = Wukong::Dfs.list_files dest_dir
- Wukong::Dfs.compare_listings(src_files, dest_files) do |comparison, src_file, dest_file|
-   case comparison
-   when :missing
-     dest_filename = "%s/%s" % [dest_dir, dest_file]
-     puts "Copying #{src_file} #{dest_filename}"
-     puts `hadoop dfs -put #{src_file} #{dest_filename}`
-   when :differ
-     src_ls = `ls -l #{src_file}`.split(/\s+/).join("\t")
-     puts "Differ: #{src_ls} \n#{dest_file}"
-   end
- end
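The removed script above walks a source listing against a destination listing and acts on entries tagged `:missing` or `:differ`. As a rough illustration of that comparison step only, here is a minimal plain-Ruby sketch. `sync_plan` is a hypothetical helper, and comparing by file size is an assumption; it does not reproduce `Wukong::Dfs.compare_listings`, whose exact behaviour is not shown in this diff.

```ruby
# Hypothetical helper, not part of Wukong: build a list of sync actions.
# Both arguments map a file's basename to its size in bytes (an assumed
# representation; the removed script worked on directory listings).
def sync_plan(src_sizes, dest_sizes)
  src_sizes.each_with_object([]) do |(name, size), plan|
    if !dest_sizes.key?(name)
      plan << [:missing, name]   # present locally, absent at the destination
    elsif dest_sizes[name] != size
      plan << [:differ, name]    # present on both sides, sizes disagree
    end
  end
end

src  = { 'part-00000' => 120, 'part-00001' => 98, 'part-00002' => 77 }
dest = { 'part-00000' => 120, 'part-00001' => 64 }

sync_plan(src, dest).each do |status, name|
  case status
  when :missing then puts "Copying #{name}"
  when :differ  then puts "Differ: #{name}"
  end
end
```

Separating the plan from the side effects (the `hadoop dfs -put` call in the removed script) keeps the comparison testable without a cluster.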
data/bin/hdp-wc DELETED
@@ -1,67 +0,0 @@
- #!/usr/bin/env ruby
- require 'wukong'
- NEWLINE_LENGTH = $/.length # KLUDGE
-
- #
- #
- #
- # !! The +words+ count comes out higher than that of +wc+ -- don't know
- # why. (It's close: a 10GB, 1M line dataset it showed 367833839 vs. 367713271)
- #
- class WcMapper < Wukong::Streamer::LineStreamer
-   attr_accessor :lines, :fields, :words, :chars, :bytes
-
-   def before_stream
-     self.lines, self.fields, self.words, self.chars, self.bytes = [0,0,0,0,0]
-   end
-
-   def process line
-     return unless line
-     self.lines += 1
-     self.fields += 1 + line.count("\t")
-     self.words += 1 + line.strip.scan(/\s+/).length unless line.blank?
-     self.chars += line.chars.to_a.length + NEWLINE_LENGTH
-     self.bytes += line.bytesize + NEWLINE_LENGTH
-     $stderr.puts line if (line.chars.to_a.length != line.bytesize)
-   end
-
-   def after_stream
-     emit [lines, fields, words, chars, bytes]
-   end
- end
-
- #
- #
- class WcReducer < Wukong::Streamer::Base
-   attr_accessor :lines, :fields, :words, :chars, :bytes
-
-   def before_stream
-     self.lines, self.fields, self.words, self.chars, self.bytes = [0,0,0,0,0]
-   end
-
-   def process m_lines, m_fields, m_words, m_chars, m_bytes
-     self.lines += m_lines.to_i
-     self.fields += m_fields.to_i
-     self.words += m_words.to_i
-     self.chars += m_chars.to_i
-     self.bytes += m_bytes.to_i
-   end
-
-   def after_stream
-     emit [lines, fields, words, chars, bytes]
-   end
- end
-
- Wukong::Script.new(WcMapper, WcReducer, :reduce_tasks => 1).run
-
- # class FooScript < Wukong::Script
- # def map_command
- # '/usr/bin/wc'
- # end
- # def reduce_command
- # '/bin/cat'
- # end
- # end
- # FooScript.new(nil, nil, :reduce_tasks => 1).run
- #
- # ruby -ne 'wc_v = `echo "#{$_.chomp}" | wc`; gr_v=($_.strip.empty? ? 0 : $_.strip.scan(/\s+/).length + 1 ) ; puts [wc_v.chomp, " ", gr_v, $_.chomp].join("\t")'
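The per-line tallies in the removed `WcMapper` can be tried without Wukong or Hadoop. The following is a minimal plain-Ruby sketch of the same arithmetic, under a few assumptions: `line_counts` and `total_counts` are hypothetical helper names, the newline is taken to be one character (the original used `$/.length`), and `strip.empty?` stands in for the `blank?` extension.

```ruby
# Tally one line the way the removed mapper's `process` method did.
# Assumes the line arrives without its trailing newline.
def line_counts(line)
  stripped = line.strip
  {
    lines:  1,
    fields: 1 + line.count("\t"),                       # tab-separated fields
    words:  stripped.empty? ? 0 : 1 + stripped.scan(/\s+/).length,
    chars:  line.length + 1,                            # +1 for the newline
    bytes:  line.bytesize + 1
  }
end

# Sum the per-line tallies, as the reducer did across mapper outputs.
def total_counts(lines)
  lines.each_with_object(Hash.new(0)) do |line, totals|
    line_counts(line).each { |key, value| totals[key] += value }
  end
end

p total_counts(["twas brillig\tand the", "slithy toves", ""])
```

Note that `chars` and `bytes` diverge only on lines containing multi-byte characters, which is the condition the removed mapper logged to `$stderr`.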
data/bin/wu-flow DELETED
@@ -1,10 +0,0 @@
- #!/usr/bin/env ruby
- $LOAD_PATH.unshift(File.expand_path("../lib", File.realdirpath(File.dirname(__FILE__))))
- require 'wukong'
-
- dataflow_file = ARGV[0]
- flow_name = File.basename(dataflow_file, '.rb').to_sym
-
- load(dataflow_file)
-
- Wukong::LocalRunner.run(Wukong.dataflow(flow_name), :default)
data/bin/wu-map DELETED
@@ -1,17 +0,0 @@
- #!/usr/bin/env ruby
- require 'rubygems' unless defined?(Gem)
- $LOAD_PATH.unshift(File.expand_path("../lib", File.dirname(__FILE__)))
- require 'wukong'
-
- load ARGV[0]
-
- case File.basename($0)
- when 'wu-map' then run_mode = :mapper
- when 'wu-red' then run_mode = :reducer
- end
-
- Wukong::LocalRunner.new do
-   source :stdin, Wukong::Source::Stdin.new
-   sink :stdout, Wukong::Sink::Stdout.new
-   flow Wukong.dataflow(run_mode)
- end.run
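The removed wrapper picks its run mode from the name it was invoked under, which is how one script body could serve several executables. A minimal sketch of that dispatch in isolation follows; `run_mode_for` is a hypothetical helper introduced here for illustration, not a Wukong method.

```ruby
# Map the invoked program name to a run mode, mirroring the `case` on
# File.basename($0) in the removed script. Returns nil for any other name.
def run_mode_for(program_path)
  case File.basename(program_path)
  when 'wu-map' then :mapper
  when 'wu-red' then :reducer
  end
end

p run_mode_for('/usr/local/bin/wu-map')  # prints :mapper
p run_mode_for('bin/wu-red')             # prints :reducer
p run_mode_for('bin/wukong')             # prints nil
```

The last call shows the corner case: under a name that matches neither branch, no mode is selected, so the removed script would have reached `Wukong.dataflow(run_mode)` with `run_mode` still `nil`.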
data/bin/wu-red DELETED
@@ -1,17 +0,0 @@
- #!/usr/bin/env ruby
- require 'rubygems' unless defined?(Gem)
- $LOAD_PATH.unshift(File.expand_path("../lib", File.dirname(__FILE__)))
- require 'wukong'
-
- load ARGV[0]
-
- case File.basename($0)
- when 'wu-map' then run_mode = :mapper
- when 'wu-red' then run_mode = :reducer
- end
-
- Wukong::LocalRunner.new do
-   source :stdin, Wukong::Source::Stdin.new
-   sink :stdout, Wukong::Sink::Stdout.new
-   flow Wukong.dataflow(run_mode)
- end.run
data/bin/wukong DELETED
@@ -1,17 +0,0 @@
- #!/usr/bin/env ruby
- require 'rubygems' unless defined?(Gem)
- $LOAD_PATH.unshift(File.expand_path("../lib", File.dirname(__FILE__)))
- require 'wukong'
-
- load ARGV[0]
-
- case File.basename($0)
- when 'wu-map' then run_mode = :mapper
- when 'wu-red' then run_mode = :reducer
- end
-
- Wukong::LocalRunner.new do
-   source :stdin, Wukong::Source::Stdin.new
-   sink :stdout, Wukong::Sink::Stdout.new
-   flow Wukong.dataflow(run_mode)
- end.run
data/data/CREDITS.md DELETED
@@ -1,355 +0,0 @@
-
- ### graph/airfares.tsv
-
- Approximate city-city airfares, scraped from the [Bing Flight Search](http://www.bing.com/travel/flights) site on 2012 June 3. Used URLs of the form http://www.bing.com/travel/flight/flexibleTravel?form=TRHPFG&o=BWI and extracted fields directly from page text.
-
- ### text/gift_of_the_magi.txt
-
-
- The [Project Gutenberg EBook of The Gift of the Magi, by O. Henry](http://www.gutenberg.org/cache/epub/7256/pg7256.txt)
-
- Title: The Gift of the Magi
- Author: O. Henry
- Produced by: Susan Ritchie. HTML version by Jose Menendez.
- Posting Date: December 17, 2011 [EBook #7256]
- Release Date: January, 2005
- Last Updated: December 20, 2003
-
- This eBook is for the use of anyone anywhere at no cost and with
- almost no restrictions whatsoever. You may copy it, give it away or
- re-use it under the terms of the Project Gutenberg License included
- with this eBook or online at www.gutenberg.net
-
- Language: English
-
- ## Gutenberg License
-
- *** START: FULL LICENSE ***
-
- THE FULL PROJECT GUTENBERG LICENSE
- PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK
-
- To protect the Project Gutenberg-tm mission of promoting the free
- distribution of electronic works, by using or distributing this work
- (or any other work associated in any way with the phrase "Project
- Gutenberg"), you agree to comply with all the terms of the Full Project
- Gutenberg-tm License (available with this file or online at
- http://gutenberg.net/license).
-
- Section 1. General Terms of Use and Redistributing Project Gutenberg-tm
- electronic works
-
- 1.A. By reading or using any part of this Project Gutenberg-tm
- electronic work, you indicate that you have read, understand, agree to
- and accept all the terms of this license and intellectual property
- (trademark/copyright) agreement. If you do not agree to abide by all
- the terms of this agreement, you must cease using and return or destroy
- all copies of Project Gutenberg-tm electronic works in your possession.
- If you paid a fee for obtaining a copy of or access to a Project
- Gutenberg-tm electronic work and you do not agree to be bound by the
- terms of this agreement, you may obtain a refund from the person or
51
- entity to whom you paid the fee as set forth in paragraph 1.E.8.
52
-
53
- 1.B. "Project Gutenberg" is a registered trademark. It may only be
54
- used on or associated in any way with an electronic work by people who
55
- agree to be bound by the terms of this agreement. There are a few
56
- things that you can do with most Project Gutenberg-tm electronic works
57
- even without complying with the full terms of this agreement. See
58
- paragraph 1.C below. There are a lot of things you can do with Project
59
- Gutenberg-tm electronic works if you follow the terms of this agreement
60
- and help preserve free future access to Project Gutenberg-tm electronic
61
- works. See paragraph 1.E below.
62
-
63
- 1.C. The Project Gutenberg Literary Archive Foundation ("the Foundation"
64
- or PGLAF), owns a compilation copyright in the collection of Project
65
- Gutenberg-tm electronic works. Nearly all the individual works in the
66
- collection are in the public domain in the United States. If an
67
- individual work is in the public domain in the United States and you are
68
- located in the United States, we do not claim a right to prevent you from
69
- copying, distributing, performing, displaying or creating derivative
70
- works based on the work as long as all references to Project Gutenberg
71
- are removed. Of course, we hope that you will support the Project
72
- Gutenberg-tm mission of promoting free access to electronic works by
73
- freely sharing Project Gutenberg-tm works in compliance with the terms of
74
- this agreement for keeping the Project Gutenberg-tm name associated with
75
- the work. You can easily comply with the terms of this agreement by
76
- keeping this work in the same format with its attached full Project
77
- Gutenberg-tm License when you share it without charge with others.
78
-
79
- 1.D. The copyright laws of the place where you are located also govern
80
- what you can do with this work. Copyright laws in most countries are in
81
- a constant state of change. If you are outside the United States, check
82
- the laws of your country in addition to the terms of this agreement
83
- before downloading, copying, displaying, performing, distributing or
84
- creating derivative works based on this work or any other Project
85
- Gutenberg-tm work. The Foundation makes no representations concerning
86
- the copyright status of any work in any country outside the United
87
- States.
88
-
89
- 1.E. Unless you have removed all references to Project Gutenberg:
90
-
91
- 1.E.1. The following sentence, with active links to, or other immediate
92
- access to, the full Project Gutenberg-tm License must appear prominently
93
- whenever any copy of a Project Gutenberg-tm work (any work on which the
94
- phrase "Project Gutenberg" appears, or with which the phrase "Project
95
- Gutenberg" is associated) is accessed, displayed, performed, viewed,
96
- copied or distributed:
97
-
98
- This eBook is for the use of anyone anywhere at no cost and with
99
- almost no restrictions whatsoever. You may copy it, give it away or
100
- re-use it under the terms of the Project Gutenberg License included
101
- with this eBook or online at www.gutenberg.net
102
-
103
- 1.E.2. If an individual Project Gutenberg-tm electronic work is derived
104
- from the public domain (does not contain a notice indicating that it is
105
- posted with permission of the copyright holder), the work can be copied
106
- and distributed to anyone in the United States without paying any fees
107
- or charges. If you are redistributing or providing access to a work
108
- with the phrase "Project Gutenberg" associated with or appearing on the
109
- work, you must comply either with the requirements of paragraphs 1.E.1
110
- through 1.E.7 or obtain permission for the use of the work and the
111
- Project Gutenberg-tm trademark as set forth in paragraphs 1.E.8 or
112
- 1.E.9.
113
-
114
- 1.E.3. If an individual Project Gutenberg-tm electronic work is posted
115
- with the permission of the copyright holder, your use and distribution
116
- must comply with both paragraphs 1.E.1 through 1.E.7 and any additional
117
- terms imposed by the copyright holder. Additional terms will be linked
118
- to the Project Gutenberg-tm License for all works posted with the
119
- permission of the copyright holder found at the beginning of this work.
120
-
121
- 1.E.4. Do not unlink or detach or remove the full Project Gutenberg-tm
122
- License terms from this work, or any files containing a part of this
123
- work or any other work associated with Project Gutenberg-tm.
124
-
125
- 1.E.5. Do not copy, display, perform, distribute or redistribute this
126
- electronic work, or any part of this electronic work, without
127
- prominently displaying the sentence set forth in paragraph 1.E.1 with
128
- active links or immediate access to the full terms of the Project
129
- Gutenberg-tm License.
130
-
131
- 1.E.6. You may convert to and distribute this work in any binary,
132
- compressed, marked up, nonproprietary or proprietary form, including any
133
- word processing or hypertext form. However, if you provide access to or
134
- distribute copies of a Project Gutenberg-tm work in a format other than
135
- "Plain Vanilla ASCII" or other format used in the official version
136
- posted on the official Project Gutenberg-tm web site (www.gutenberg.net),
137
- you must, at no additional cost, fee or expense to the user, provide a
138
- copy, a means of exporting a copy, or a means of obtaining a copy upon
139
- request, of the work in its original "Plain Vanilla ASCII" or other
140
- form. Any alternate format must include the full Project Gutenberg-tm
141
- License as specified in paragraph 1.E.1.
142
-
143
- 1.E.7. Do not charge a fee for access to, viewing, displaying,
144
- performing, copying or distributing any Project Gutenberg-tm works
145
- unless you comply with paragraph 1.E.8 or 1.E.9.
146
-
147
- 1.E.8. You may charge a reasonable fee for copies of or providing
148
- access to or distributing Project Gutenberg-tm electronic works provided
149
- that
150
-
151
- - You pay a royalty fee of 20% of the gross profits you derive from
152
- the use of Project Gutenberg-tm works calculated using the method
153
- you already use to calculate your applicable taxes. The fee is
154
- owed to the owner of the Project Gutenberg-tm trademark, but he
155
- has agreed to donate royalties under this paragraph to the
156
- Project Gutenberg Literary Archive Foundation. Royalty payments
157
- must be paid within 60 days following each date on which you
158
- prepare (or are legally required to prepare) your periodic tax
159
- returns. Royalty payments should be clearly marked as such and
160
- sent to the Project Gutenberg Literary Archive Foundation at the
161
- address specified in Section 4, "Information about donations to
162
- the Project Gutenberg Literary Archive Foundation."
163
-
164
- - You provide a full refund of any money paid by a user who notifies
165
- you in writing (or by e-mail) within 30 days of receipt that s/he
166
- does not agree to the terms of the full Project Gutenberg-tm
167
- License. You must require such a user to return or
168
- destroy all copies of the works possessed in a physical medium
169
- and discontinue all use of and all access to other copies of
170
- Project Gutenberg-tm works.
171
-
172
- - You provide, in accordance with paragraph 1.F.3, a full refund of any
173
- money paid for a work or a replacement copy, if a defect in the
174
- electronic work is discovered and reported to you within 90 days
175
- of receipt of the work.
176
-
177
- - You comply with all other terms of this agreement for free
178
- distribution of Project Gutenberg-tm works.
179
-
180
- 1.E.9. If you wish to charge a fee or distribute a Project Gutenberg-tm
181
- electronic work or group of works on different terms than are set
182
- forth in this agreement, you must obtain permission in writing from
183
- both the Project Gutenberg Literary Archive Foundation and Michael
184
- Hart, the owner of the Project Gutenberg-tm trademark. Contact the
185
- Foundation as set forth in Section 3 below.
186
-
187
- 1.F.
188
-
189
- 1.F.1. Project Gutenberg volunteers and employees expend considerable
190
- effort to identify, do copyright research on, transcribe and proofread
191
- public domain works in creating the Project Gutenberg-tm
192
- collection. Despite these efforts, Project Gutenberg-tm electronic
193
- works, and the medium on which they may be stored, may contain
194
- "Defects," such as, but not limited to, incomplete, inaccurate or
195
- corrupt data, transcription errors, a copyright or other intellectual
196
- property infringement, a defective or damaged disk or other medium, a
197
- computer virus, or computer codes that damage or cannot be read by
198
- your equipment.
199
-
200
- 1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except for the "Right
201
- of Replacement or Refund" described in paragraph 1.F.3, the Project
202
- Gutenberg Literary Archive Foundation, the owner of the Project
203
- Gutenberg-tm trademark, and any other party distributing a Project
204
- Gutenberg-tm electronic work under this agreement, disclaim all
205
- liability to you for damages, costs and expenses, including legal
206
- fees. YOU AGREE THAT YOU HAVE NO REMEDIES FOR NEGLIGENCE, STRICT
207
- LIABILITY, BREACH OF WARRANTY OR BREACH OF CONTRACT EXCEPT THOSE
208
- PROVIDED IN PARAGRAPH 1.F.3. YOU AGREE THAT THE FOUNDATION, THE
209
- TRADEMARK OWNER, AND ANY DISTRIBUTOR UNDER THIS AGREEMENT WILL NOT BE
210
- LIABLE TO YOU FOR ACTUAL, DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE OR
211
- INCIDENTAL DAMAGES EVEN IF YOU GIVE NOTICE OF THE POSSIBILITY OF SUCH
212
- DAMAGE.
213
-
214
- 1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you discover a
215
- defect in this electronic work within 90 days of receiving it, you can
216
- receive a refund of the money (if any) you paid for it by sending a
217
- written explanation to the person you received the work from. If you
218
- received the work on a physical medium, you must return the medium with
219
- your written explanation. The person or entity that provided you with
220
- the defective work may elect to provide a replacement copy in lieu of a
221
- refund. If you received the work electronically, the person or entity
222
- providing it to you may choose to give you a second opportunity to
223
- receive the work electronically in lieu of a refund. If the second copy
224
- is also defective, you may demand a refund in writing without further
225
- opportunities to fix the problem.
226
-
227
- 1.F.4. Except for the limited right of replacement or refund set forth
228
- in paragraph 1.F.3, this work is provided to you 'AS-IS' WITH NO OTHER
229
- WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
230
- WARRANTIES OF MERCHANTIBILITY OR FITNESS FOR ANY PURPOSE.
231
-
232
- 1.F.5. Some states do not allow disclaimers of certain implied
233
- warranties or the exclusion or limitation of certain types of damages.
234
- If any disclaimer or limitation set forth in this agreement violates the
235
- law of the state applicable to this agreement, the agreement shall be
236
- interpreted to make the maximum disclaimer or limitation permitted by
237
- the applicable state law. The invalidity or unenforceability of any
238
- provision of this agreement shall not void the remaining provisions.
239
-
240
- 1.F.6. INDEMNITY - You agree to indemnify and hold the Foundation, the
241
- trademark owner, any agent or employee of the Foundation, anyone
242
- providing copies of Project Gutenberg-tm electronic works in accordance
243
- with this agreement, and any volunteers associated with the production,
244
- promotion and distribution of Project Gutenberg-tm electronic works,
245
- harmless from all liability, costs and expenses, including legal fees,
246
- that arise directly or indirectly from any of the following which you do
247
- or cause to occur: (a) distribution of this or any Project Gutenberg-tm
248
- work, (b) alteration, modification, or additions or deletions to any
249
- Project Gutenberg-tm work, and (c) any Defect you cause.
250
-
251
-
252
- Section 2. Information about the Mission of Project Gutenberg-tm
253
-
254
- Project Gutenberg-tm is synonymous with the free distribution of
255
- electronic works in formats readable by the widest variety of computers
256
- including obsolete, old, middle-aged and new computers. It exists
257
- because of the efforts of hundreds of volunteers and donations from
258
- people in all walks of life.
259
-
260
- Volunteers and financial support to provide volunteers with the
261
- assistance they need are critical to reaching Project Gutenberg-tm's
262
- goals and ensuring that the Project Gutenberg-tm collection will
263
- remain freely available for generations to come. In 2001, the Project
264
- Gutenberg Literary Archive Foundation was created to provide a secure
265
- and permanent future for Project Gutenberg-tm and future generations.
266
- To learn more about the Project Gutenberg Literary Archive Foundation
267
- and how your efforts and donations can help, see Sections 3 and 4
268
- and the Foundation web page at http://www.pglaf.org.
269
-
270
-
271
- Section 3. Information about the Project Gutenberg Literary Archive
272
- Foundation
273
-
274
- The Project Gutenberg Literary Archive Foundation is a non profit
275
- 501(c)(3) educational corporation organized under the laws of the
276
- state of Mississippi and granted tax exempt status by the Internal
277
- Revenue Service. The Foundation's EIN or federal tax identification
278
- number is 64-6221541. Its 501(c)(3) letter is posted at
279
- http://pglaf.org/fundraising. Contributions to the Project Gutenberg
280
- Literary Archive Foundation are tax deductible to the full extent
281
- permitted by U.S. federal laws and your state's laws.
282
-
283
- The Foundation's principal office is located at 4557 Melan Dr. S.
284
- Fairbanks, AK, 99712., but its volunteers and employees are scattered
285
- throughout numerous locations. Its business office is located at
286
- 809 North 1500 West, Salt Lake City, UT 84116, (801) 596-1887, email
287
- business@pglaf.org. Email contact links and up to date contact
288
- information can be found at the Foundation's web site and official
289
- page at http://pglaf.org
290
-
291
- For additional contact information:
292
- Dr. Gregory B. Newby
293
- Chief Executive and Director
294
- gbnewby@pglaf.org
295
-
296
-
297
- Section 4. Information about Donations to the Project Gutenberg
298
- Literary Archive Foundation
299
-
300
- Project Gutenberg-tm depends upon and cannot survive without wide
301
- spread public support and donations to carry out its mission of
302
- increasing the number of public domain and licensed works that can be
303
- freely distributed in machine readable form accessible by the widest
304
- array of equipment including outdated equipment. Many small donations
305
- ($1 to $5,000) are particularly important to maintaining tax exempt
306
- status with the IRS.
307
-
308
- The Foundation is committed to complying with the laws regulating
309
- charities and charitable donations in all 50 states of the United
310
- States. Compliance requirements are not uniform and it takes a
311
- considerable effort, much paperwork and many fees to meet and keep up
312
- with these requirements. We do not solicit donations in locations
313
- where we have not received written confirmation of compliance. To
314
- SEND DONATIONS or determine the status of compliance for any
315
- particular state visit http://pglaf.org
316
-
317
- While we cannot and do not solicit contributions from states where we
318
- have not met the solicitation requirements, we know of no prohibition
319
- against accepting unsolicited donations from donors in such states who
320
- approach us with offers to donate.
321
-
322
- International donations are gratefully accepted, but we cannot make
323
- any statements concerning tax treatment of donations received from
324
- outside the United States. U.S. laws alone swamp our small staff.
325
-
326
- Please check the Project Gutenberg Web pages for current donation
327
- methods and addresses. Donations are accepted in a number of other
328
- ways including including checks, online payments and credit card
329
- donations. To donate, please visit: http://pglaf.org/donate
330
-
331
-
332
- Section 5. General Information About Project Gutenberg-tm electronic
333
- works.
334
-
335
- Professor Michael S. Hart is the originator of the Project Gutenberg-tm
336
- concept of a library of electronic works that could be freely shared
337
- with anyone. For thirty years, he produced and distributed Project
338
- Gutenberg-tm eBooks with only a loose network of volunteer support.
339
-
340
-
341
- Project Gutenberg-tm eBooks are often created from several printed
342
- editions, all of which are confirmed as Public Domain in the U.S.
343
- unless a copyright notice is included. Thus, we do not necessarily
344
- keep eBooks in compliance with any particular paper edition.
345
-
346
-
347
- Most people start at our Web site which has the main PG search facility:
348
-
349
- http://www.gutenberg.net
350
-
351
- This Web site includes information about Project Gutenberg-tm,
352
- including how to make donations to the Project Gutenberg Literary
353
- Archive Foundation, how to help produce our new eBooks, and how to
354
- subscribe to our email newsletter to hear about new eBooks.
355
-