wukong 3.0.0.pre → 3.0.0.pre2

Files changed (476)
  1. data/.gitignore +46 -33
  2. data/.gitmodules +3 -0
  3. data/.rspec +1 -1
  4. data/.travis.yml +8 -1
  5. data/.yardopts +0 -13
  6. data/Guardfile +4 -6
  7. data/{LICENSE.textile → LICENSE.md} +43 -55
  8. data/README-old.md +422 -0
  9. data/README.md +279 -418
  10. data/Rakefile +21 -5
  11. data/TODO.md +6 -6
  12. data/bin/wu-clean-encoding +31 -0
  13. data/bin/wu-lign +2 -2
  14. data/bin/wu-local +69 -0
  15. data/bin/wu-server +70 -0
  16. data/examples/Gemfile +38 -0
  17. data/examples/README.md +9 -0
  18. data/examples/dataflow/apache_log_line.rb +64 -25
  19. data/examples/dataflow/fibonacci_series.rb +101 -0
  20. data/examples/dataflow/parse_apache_logs.rb +37 -7
  21. data/examples/{dataflow.rb → dataflow/scraper_macro_flow.rb} +0 -0
  22. data/examples/dataflow/simple.rb +4 -4
  23. data/examples/geo.rb +4 -0
  24. data/examples/geo/geo_grids.numbers +0 -0
  25. data/examples/geo/geolocated.rb +331 -0
  26. data/examples/geo/quadtile.rb +69 -0
  27. data/examples/geo/spec/geolocated_spec.rb +247 -0
  28. data/examples/geo/tile_fetcher.rb +77 -0
  29. data/examples/graph/minimum_spanning_tree.rb +61 -61
  30. data/examples/jabberwocky.txt +36 -0
  31. data/examples/models/wikipedia.rb +20 -0
  32. data/examples/munging/Gemfile +8 -0
  33. data/examples/munging/airline_flights/airline.rb +57 -0
  34. data/examples/munging/airline_flights/airline_flights.rake +83 -0
  35. data/{lib/wukong/settings.rb → examples/munging/airline_flights/airplane.rb} +0 -0
  36. data/examples/munging/airline_flights/airport.rb +211 -0
  37. data/examples/munging/airline_flights/airport_id_unification.rb +129 -0
  38. data/examples/munging/airline_flights/airport_ok_chars.rb +4 -0
  39. data/examples/munging/airline_flights/flight.rb +156 -0
  40. data/examples/munging/airline_flights/models.rb +4 -0
  41. data/examples/munging/airline_flights/parse.rb +26 -0
  42. data/examples/munging/airline_flights/reconcile_airports.rb +142 -0
  43. data/examples/munging/airline_flights/route.rb +35 -0
  44. data/examples/munging/airline_flights/tasks.rake +83 -0
  45. data/examples/munging/airline_flights/timezone_fixup.rb +62 -0
  46. data/examples/munging/airline_flights/topcities.rb +167 -0
  47. data/examples/munging/airports/40_wbans.txt +40 -0
  48. data/examples/munging/airports/filter_weather_reports.rb +37 -0
  49. data/examples/munging/airports/join.pig +31 -0
  50. data/examples/munging/airports/to_tsv.rb +33 -0
  51. data/examples/munging/airports/usa_wbans.pig +19 -0
  52. data/examples/munging/airports/usa_wbans.txt +2157 -0
  53. data/examples/munging/airports/wbans.pig +19 -0
  54. data/examples/munging/airports/wbans.txt +2310 -0
  55. data/examples/munging/geo/geo_json.rb +54 -0
  56. data/examples/munging/geo/geo_models.rb +69 -0
  57. data/examples/munging/geo/geonames_models.rb +78 -0
  58. data/examples/munging/geo/iso_codes.rb +172 -0
  59. data/examples/munging/geo/reconcile_countries.rb +124 -0
  60. data/examples/munging/geo/tasks.rake +71 -0
  61. data/examples/munging/rake_helper.rb +62 -0
  62. data/examples/munging/weather/.gitignore +1 -0
  63. data/examples/munging/weather/Gemfile +4 -0
  64. data/examples/munging/weather/Rakefile +28 -0
  65. data/examples/munging/weather/extract_ish.rb +13 -0
  66. data/examples/munging/weather/models/weather.rb +119 -0
  67. data/examples/munging/weather/utils/noaa_downloader.rb +46 -0
  68. data/examples/munging/wikipedia/README.md +34 -0
  69. data/examples/munging/wikipedia/Rakefile +193 -0
  70. data/examples/munging/wikipedia/articles/extract_articles-parsed.rb +79 -0
  71. data/examples/munging/wikipedia/articles/extract_articles-templated.rb +136 -0
  72. data/examples/munging/wikipedia/articles/textualize_articles.rb +54 -0
  73. data/examples/munging/wikipedia/articles/verify_structure.rb +43 -0
  74. data/examples/munging/wikipedia/articles/wp2txt-LICENSE.txt +22 -0
  75. data/examples/munging/wikipedia/articles/wp2txt_article.rb +259 -0
  76. data/examples/munging/wikipedia/articles/wp2txt_utils.rb +452 -0
  77. data/examples/munging/wikipedia/dbpedia/dbpedia_common.rb +4 -0
  78. data/examples/munging/wikipedia/dbpedia/dbpedia_extract_geocoordinates.rb +78 -0
  79. data/examples/munging/wikipedia/dbpedia/extract_links.rb +193 -0
  80. data/examples/munging/wikipedia/dbpedia/sameas_extractor.rb +20 -0
  81. data/examples/munging/wikipedia/n1_subuniverse/n1_nodes.pig +18 -0
  82. data/examples/munging/wikipedia/page_metadata/extract_page_metadata.rb +21 -0
  83. data/examples/munging/wikipedia/page_metadata/extract_page_metadata.rb.old +27 -0
  84. data/examples/munging/wikipedia/pagelinks/augment_pagelinks.pig +29 -0
  85. data/examples/munging/wikipedia/pagelinks/extract_pagelinks.rb +14 -0
  86. data/examples/munging/wikipedia/pagelinks/extract_pagelinks.rb.old +25 -0
  87. data/examples/munging/wikipedia/pagelinks/undirect_pagelinks.pig +29 -0
  88. data/examples/munging/wikipedia/pageviews/augment_pageviews.pig +32 -0
  89. data/examples/munging/wikipedia/pageviews/extract_pageviews.rb +85 -0
  90. data/examples/munging/wikipedia/pig_style_guide.md +25 -0
  91. data/examples/munging/wikipedia/redirects/redirects_page_metadata.pig +19 -0
  92. data/examples/munging/wikipedia/subuniverse/sub_articles.pig +23 -0
  93. data/examples/munging/wikipedia/subuniverse/sub_page_metadata.pig +24 -0
  94. data/examples/munging/wikipedia/subuniverse/sub_pagelinks_from.pig +22 -0
  95. data/examples/munging/wikipedia/subuniverse/sub_pagelinks_into.pig +22 -0
  96. data/examples/munging/wikipedia/subuniverse/sub_pagelinks_within.pig +26 -0
  97. data/examples/munging/wikipedia/subuniverse/sub_pageviews.pig +29 -0
  98. data/examples/munging/wikipedia/subuniverse/sub_undirected_pagelinks_within.pig +24 -0
  99. data/examples/munging/wikipedia/utils/get_namespaces.rb +86 -0
  100. data/examples/munging/wikipedia/utils/munging_utils.rb +68 -0
  101. data/examples/munging/wikipedia/utils/namespaces.json +1 -0
  102. data/examples/rake_helper.rb +85 -0
  103. data/examples/server_logs/geo_ip_mapping/munge_geolite.rb +82 -0
  104. data/examples/server_logs/logline.rb +95 -0
  105. data/examples/server_logs/models.rb +66 -0
  106. data/examples/server_logs/page_counts.pig +48 -0
  107. data/examples/server_logs/server_logs-01-parse-script.rb +13 -0
  108. data/examples/server_logs/server_logs-02-histograms-full.rb +33 -0
  109. data/examples/server_logs/server_logs-02-histograms-mapper.rb +14 -0
  110. data/{old/examples/server_logs/breadcrumbs.rb → examples/server_logs/server_logs-03-breadcrumbs-full.rb} +26 -30
  111. data/examples/server_logs/server_logs-04-page_page_edges-full.rb +40 -0
  112. data/examples/string_reverser.rb +26 -0
  113. data/examples/text/pig_latin.rb +2 -2
  114. data/examples/text/regional_flavor/README.md +14 -0
  115. data/examples/text/regional_flavor/article_wordbags.pig +39 -0
  116. data/examples/text/regional_flavor/j01-article_wordbags.rb +4 -0
  117. data/examples/text/regional_flavor/simple_pig_script.pig +27 -0
  118. data/examples/word_count/accumulator.rb +26 -0
  119. data/examples/word_count/tokenizer.rb +13 -0
  120. data/examples/word_count/word_count.rb +6 -0
  121. data/examples/workflow/cherry_pie.dot +97 -0
  122. data/examples/workflow/cherry_pie.png +0 -0
  123. data/examples/workflow/cherry_pie.rb +61 -26
  124. data/lib/hanuman.rb +34 -7
  125. data/lib/hanuman/graph.rb +55 -31
  126. data/lib/hanuman/graphvizzer.rb +199 -178
  127. data/lib/hanuman/graphvizzer/gv_models.rb +161 -0
  128. data/lib/hanuman/graphvizzer/gv_presenter.rb +97 -0
  129. data/lib/hanuman/link.rb +35 -0
  130. data/lib/hanuman/registry.rb +46 -0
  131. data/lib/hanuman/stage.rb +76 -32
  132. data/lib/wukong.rb +23 -24
  133. data/lib/wukong/boot.rb +87 -0
  134. data/lib/wukong/configuration.rb +8 -0
  135. data/lib/wukong/dataflow.rb +45 -78
  136. data/lib/wukong/driver.rb +99 -0
  137. data/lib/wukong/emitter.rb +22 -0
  138. data/lib/wukong/model/faker.rb +24 -24
  139. data/lib/wukong/model/flatpack_parser/flat.rb +60 -0
  140. data/lib/wukong/model/flatpack_parser/flatpack.rb +4 -0
  141. data/lib/wukong/model/flatpack_parser/lang.rb +46 -0
  142. data/lib/wukong/model/flatpack_parser/parser.rb +55 -0
  143. data/lib/wukong/model/flatpack_parser/tokens.rb +130 -0
  144. data/lib/wukong/processor.rb +60 -114
  145. data/lib/wukong/spec_helpers.rb +81 -0
  146. data/lib/wukong/spec_helpers/integration_driver.rb +144 -0
  147. data/lib/wukong/spec_helpers/integration_driver_matchers.rb +219 -0
  148. data/lib/wukong/spec_helpers/processor_helpers.rb +95 -0
  149. data/lib/wukong/spec_helpers/processor_methods.rb +108 -0
  150. data/lib/wukong/spec_helpers/shared_examples.rb +15 -0
  151. data/lib/wukong/spec_helpers/spec_driver.rb +28 -0
  152. data/lib/wukong/spec_helpers/spec_driver_matchers.rb +195 -0
  153. data/lib/wukong/version.rb +2 -1
  154. data/lib/wukong/widget/filters.rb +311 -0
  155. data/lib/wukong/widget/processors.rb +156 -0
  156. data/lib/wukong/widget/reducers.rb +7 -0
  157. data/lib/wukong/widget/reducers/accumulator.rb +73 -0
  158. data/lib/wukong/widget/reducers/bin.rb +318 -0
  159. data/lib/wukong/widget/reducers/count.rb +61 -0
  160. data/lib/wukong/widget/reducers/group.rb +85 -0
  161. data/lib/wukong/widget/reducers/group_concat.rb +70 -0
  162. data/lib/wukong/widget/reducers/moments.rb +72 -0
  163. data/lib/wukong/widget/reducers/sort.rb +130 -0
  164. data/lib/wukong/widget/serializers.rb +287 -0
  165. data/lib/wukong/widget/sink.rb +10 -52
  166. data/lib/wukong/widget/source.rb +7 -113
  167. data/lib/wukong/widget/utils.rb +46 -0
  168. data/lib/wukong/widgets.rb +6 -0
  169. data/spec/examples/dataflow/fibonacci_series_spec.rb +18 -0
  170. data/spec/examples/dataflow/parsing_spec.rb +12 -11
  171. data/spec/examples/dataflow/simple_spec.rb +32 -6
  172. data/spec/examples/dataflow/telegram_spec.rb +36 -36
  173. data/spec/examples/graph/minimum_spanning_tree_spec.rb +30 -31
  174. data/spec/examples/munging/airline_flights/identifiers_spec.rb +16 -0
  175. data/spec/examples/munging/airline_flights_spec.rb +202 -0
  176. data/spec/examples/text/pig_latin_spec.rb +13 -16
  177. data/spec/examples/workflow/cherry_pie_spec.rb +34 -4
  178. data/spec/hanuman/graph_spec.rb +27 -2
  179. data/spec/hanuman/hanuman_spec.rb +10 -0
  180. data/spec/hanuman/registry_spec.rb +123 -0
  181. data/spec/hanuman/stage_spec.rb +61 -7
  182. data/spec/spec_helper.rb +29 -19
  183. data/spec/support/hanuman_test_helpers.rb +14 -12
  184. data/spec/support/shared_context_for_reducers.rb +37 -0
  185. data/spec/support/shared_examples_for_builders.rb +101 -0
  186. data/spec/support/shared_examples_for_shortcuts.rb +57 -0
  187. data/spec/support/wukong_test_helpers.rb +37 -11
  188. data/spec/wukong/dataflow_spec.rb +77 -55
  189. data/spec/wukong/local_runner_spec.rb +24 -24
  190. data/spec/wukong/model/faker_spec.rb +132 -131
  191. data/spec/wukong/runner_spec.rb +8 -8
  192. data/spec/wukong/widget/filters_spec.rb +61 -0
  193. data/spec/wukong/widget/processors_spec.rb +126 -0
  194. data/spec/wukong/widget/reducers/bin_spec.rb +92 -0
  195. data/spec/wukong/widget/reducers/count_spec.rb +11 -0
  196. data/spec/wukong/widget/reducers/group_spec.rb +20 -0
  197. data/spec/wukong/widget/reducers/moments_spec.rb +36 -0
  198. data/spec/wukong/widget/reducers/sort_spec.rb +26 -0
  199. data/spec/wukong/widget/serializers_spec.rb +92 -0
  200. data/spec/wukong/widget/sink_spec.rb +15 -15
  201. data/spec/wukong/widget/source_spec.rb +65 -41
  202. data/spec/wukong/wukong_spec.rb +10 -0
  203. data/wukong.gemspec +17 -10
  204. metadata +359 -335
  205. data/.document +0 -5
  206. data/VERSION +0 -1
  207. data/bin/hdp-bin +0 -44
  208. data/bin/hdp-bzip +0 -23
  209. data/bin/hdp-cat +0 -3
  210. data/bin/hdp-catd +0 -3
  211. data/bin/hdp-cp +0 -3
  212. data/bin/hdp-du +0 -86
  213. data/bin/hdp-get +0 -3
  214. data/bin/hdp-kill +0 -3
  215. data/bin/hdp-kill-task +0 -3
  216. data/bin/hdp-ls +0 -11
  217. data/bin/hdp-mkdir +0 -2
  218. data/bin/hdp-mkdirp +0 -12
  219. data/bin/hdp-mv +0 -3
  220. data/bin/hdp-parts_to_keys.rb +0 -77
  221. data/bin/hdp-ps +0 -3
  222. data/bin/hdp-put +0 -3
  223. data/bin/hdp-rm +0 -32
  224. data/bin/hdp-sort +0 -40
  225. data/bin/hdp-stream +0 -40
  226. data/bin/hdp-stream-flat +0 -22
  227. data/bin/hdp-stream2 +0 -39
  228. data/bin/hdp-sync +0 -17
  229. data/bin/hdp-wc +0 -67
  230. data/bin/wu-flow +0 -10
  231. data/bin/wu-map +0 -17
  232. data/bin/wu-red +0 -17
  233. data/bin/wukong +0 -17
  234. data/data/CREDITS.md +0 -355
  235. data/data/graph/airfares.tsv +0 -2174
  236. data/data/text/gift_of_the_magi.txt +0 -225
  237. data/data/text/jabberwocky.txt +0 -36
  238. data/data/text/rectification_of_names.txt +0 -33
  239. data/data/twitter/a_atsigns_b.tsv +0 -64
  240. data/data/twitter/a_follows_b.tsv +0 -53
  241. data/data/twitter/tweet.tsv +0 -167
  242. data/data/twitter/twitter_user.tsv +0 -55
  243. data/data/wikipedia/dbpedia-sentences.tsv +0 -1000
  244. data/docpages/INSTALL.textile +0 -92
  245. data/docpages/LICENSE.textile +0 -107
  246. data/docpages/README-elastic_map_reduce.textile +0 -377
  247. data/docpages/README-performance.textile +0 -90
  248. data/docpages/README-wulign.textile +0 -65
  249. data/docpages/UsingWukong-part1-get_ready.textile +0 -17
  250. data/docpages/UsingWukong-part2-ThinkingBigData.textile +0 -75
  251. data/docpages/UsingWukong-part3-parsing.textile +0 -138
  252. data/docpages/_config.yml +0 -39
  253. data/docpages/avro/avro_notes.textile +0 -56
  254. data/docpages/avro/performance.textile +0 -36
  255. data/docpages/avro/tethering.textile +0 -19
  256. data/docpages/bigdata-tips.textile +0 -143
  257. data/docpages/code/api_response_example.txt +0 -20
  258. data/docpages/code/parser_skeleton.rb +0 -38
  259. data/docpages/diagrams/MapReduceDiagram.graffle +0 -0
  260. data/docpages/favicon.ico +0 -0
  261. data/docpages/gem.css +0 -16
  262. data/docpages/hadoop-tips.textile +0 -83
  263. data/docpages/index.textile +0 -92
  264. data/docpages/intro.textile +0 -8
  265. data/docpages/moreinfo.textile +0 -174
  266. data/docpages/news.html +0 -24
  267. data/docpages/pig/PigLatinExpressionsList.txt +0 -122
  268. data/docpages/pig/PigLatinReferenceManual.txt +0 -1640
  269. data/docpages/pig/commandline_params.txt +0 -26
  270. data/docpages/pig/cookbook.html +0 -481
  271. data/docpages/pig/images/hadoop-logo.jpg +0 -0
  272. data/docpages/pig/images/instruction_arrow.png +0 -0
  273. data/docpages/pig/images/pig-logo.gif +0 -0
  274. data/docpages/pig/piglatin_ref1.html +0 -1103
  275. data/docpages/pig/piglatin_ref2.html +0 -14340
  276. data/docpages/pig/setup.html +0 -505
  277. data/docpages/pig/skin/basic.css +0 -166
  278. data/docpages/pig/skin/breadcrumbs.js +0 -237
  279. data/docpages/pig/skin/fontsize.js +0 -166
  280. data/docpages/pig/skin/getBlank.js +0 -40
  281. data/docpages/pig/skin/getMenu.js +0 -45
  282. data/docpages/pig/skin/images/chapter.gif +0 -0
  283. data/docpages/pig/skin/images/chapter_open.gif +0 -0
  284. data/docpages/pig/skin/images/current.gif +0 -0
  285. data/docpages/pig/skin/images/external-link.gif +0 -0
  286. data/docpages/pig/skin/images/header_white_line.gif +0 -0
  287. data/docpages/pig/skin/images/page.gif +0 -0
  288. data/docpages/pig/skin/images/pdfdoc.gif +0 -0
  289. data/docpages/pig/skin/images/rc-b-l-15-1body-2menu-3menu.png +0 -0
  290. data/docpages/pig/skin/images/rc-b-r-15-1body-2menu-3menu.png +0 -0
  291. data/docpages/pig/skin/images/rc-b-r-5-1header-2tab-selected-3tab-selected.png +0 -0
  292. data/docpages/pig/skin/images/rc-t-l-5-1header-2searchbox-3searchbox.png +0 -0
  293. data/docpages/pig/skin/images/rc-t-l-5-1header-2tab-selected-3tab-selected.png +0 -0
  294. data/docpages/pig/skin/images/rc-t-l-5-1header-2tab-unselected-3tab-unselected.png +0 -0
  295. data/docpages/pig/skin/images/rc-t-r-15-1body-2menu-3menu.png +0 -0
  296. data/docpages/pig/skin/images/rc-t-r-5-1header-2searchbox-3searchbox.png +0 -0
  297. data/docpages/pig/skin/images/rc-t-r-5-1header-2tab-selected-3tab-selected.png +0 -0
  298. data/docpages/pig/skin/images/rc-t-r-5-1header-2tab-unselected-3tab-unselected.png +0 -0
  299. data/docpages/pig/skin/print.css +0 -54
  300. data/docpages/pig/skin/profile.css +0 -181
  301. data/docpages/pig/skin/screen.css +0 -587
  302. data/docpages/pig/tutorial.html +0 -1059
  303. data/docpages/pig/udf.html +0 -1509
  304. data/docpages/tutorial.textile +0 -283
  305. data/docpages/usage.textile +0 -195
  306. data/docpages/wutils.textile +0 -263
  307. data/examples/dataflow/complex.rb +0 -11
  308. data/examples/dataflow/donuts.rb +0 -13
  309. data/examples/tiny_count/jabberwocky_output.tsv +0 -92
  310. data/examples/word_count.rb +0 -48
  311. data/examples/workflow/fiddle.rb +0 -24
  312. data/lib/away/escapement.rb +0 -129
  313. data/lib/away/exe.rb +0 -11
  314. data/lib/away/experimental.rb +0 -5
  315. data/lib/away/from_file.rb +0 -52
  316. data/lib/away/job.rb +0 -56
  317. data/lib/away/job/rake_compat.rb +0 -17
  318. data/lib/away/registry.rb +0 -79
  319. data/lib/away/runner.rb +0 -276
  320. data/lib/away/runner/execute.rb +0 -121
  321. data/lib/away/script.rb +0 -161
  322. data/lib/away/script/hadoop_command.rb +0 -240
  323. data/lib/away/source/file_list_source.rb +0 -15
  324. data/lib/away/source/looper.rb +0 -18
  325. data/lib/away/task.rb +0 -219
  326. data/lib/hanuman/action.rb +0 -21
  327. data/lib/hanuman/chain.rb +0 -4
  328. data/lib/hanuman/graphviz.rb +0 -74
  329. data/lib/hanuman/resource.rb +0 -6
  330. data/lib/hanuman/slot.rb +0 -87
  331. data/lib/hanuman/slottable.rb +0 -220
  332. data/lib/wukong/bad_record.rb +0 -15
  333. data/lib/wukong/event.rb +0 -44
  334. data/lib/wukong/local_runner.rb +0 -55
  335. data/lib/wukong/mapred.rb +0 -3
  336. data/lib/wukong/universe.rb +0 -48
  337. data/lib/wukong/widget/filter.rb +0 -81
  338. data/lib/wukong/widget/gibberish.rb +0 -123
  339. data/lib/wukong/widget/monitor.rb +0 -26
  340. data/lib/wukong/widget/reducer.rb +0 -66
  341. data/lib/wukong/widget/stringifier.rb +0 -50
  342. data/lib/wukong/workflow.rb +0 -22
  343. data/lib/wukong/workflow/command.rb +0 -42
  344. data/old/config/emr-example.yaml +0 -48
  345. data/old/examples/README.txt +0 -17
  346. data/old/examples/contrib/jeans/README.markdown +0 -165
  347. data/old/examples/contrib/jeans/data/normalized_sizes +0 -3
  348. data/old/examples/contrib/jeans/data/orders.tsv +0 -1302
  349. data/old/examples/contrib/jeans/data/sizes +0 -3
  350. data/old/examples/contrib/jeans/normalize.rb +0 -20
  351. data/old/examples/contrib/jeans/sizes.rb +0 -55
  352. data/old/examples/corpus/bnc_word_freq.rb +0 -44
  353. data/old/examples/corpus/bucket_counter.rb +0 -47
  354. data/old/examples/corpus/dbpedia_abstract_to_sentences.rb +0 -86
  355. data/old/examples/corpus/sentence_bigrams.rb +0 -53
  356. data/old/examples/corpus/sentence_coocurrence.rb +0 -66
  357. data/old/examples/corpus/stopwords.rb +0 -138
  358. data/old/examples/corpus/words_to_bigrams.rb +0 -53
  359. data/old/examples/emr/README.textile +0 -110
  360. data/old/examples/emr/dot_wukong_dir/credentials.json +0 -7
  361. data/old/examples/emr/dot_wukong_dir/emr.yaml +0 -69
  362. data/old/examples/emr/dot_wukong_dir/emr_bootstrap.sh +0 -33
  363. data/old/examples/emr/elastic_mapreduce_example.rb +0 -28
  364. data/old/examples/network_graph/adjacency_list.rb +0 -74
  365. data/old/examples/network_graph/breadth_first_search.rb +0 -72
  366. data/old/examples/network_graph/gen_2paths.rb +0 -68
  367. data/old/examples/network_graph/gen_multi_edge.rb +0 -112
  368. data/old/examples/network_graph/gen_symmetric_links.rb +0 -64
  369. data/old/examples/pagerank/README.textile +0 -6
  370. data/old/examples/pagerank/gen_initial_pagerank_graph.pig +0 -57
  371. data/old/examples/pagerank/pagerank.rb +0 -72
  372. data/old/examples/pagerank/pagerank_initialize.rb +0 -42
  373. data/old/examples/pagerank/run_pagerank.sh +0 -21
  374. data/old/examples/sample_records.rb +0 -33
  375. data/old/examples/server_logs/apache_log_parser.rb +0 -15
  376. data/old/examples/server_logs/nook.rb +0 -48
  377. data/old/examples/server_logs/nook/faraday_dummy_adapter.rb +0 -94
  378. data/old/examples/server_logs/user_agent.rb +0 -40
  379. data/old/examples/simple_word_count.rb +0 -82
  380. data/old/examples/size.rb +0 -61
  381. data/old/examples/stats/avg_value_frequency.rb +0 -86
  382. data/old/examples/stats/binning_percentile_estimator.rb +0 -140
  383. data/old/examples/stats/data/avg_value_frequency.tsv +0 -3
  384. data/old/examples/stats/rank_and_bin.rb +0 -173
  385. data/old/examples/stupidly_simple_filter.rb +0 -40
  386. data/old/examples/word_count.rb +0 -75
  387. data/old/graph/graphviz_builder.rb +0 -580
  388. data/old/graph_easy/Attributes.pm +0 -4181
  389. data/old/graph_easy/Graphviz.pm +0 -2232
  390. data/old/wukong.rb +0 -18
  391. data/old/wukong/and_pig.rb +0 -38
  392. data/old/wukong/bad_record.rb +0 -18
  393. data/old/wukong/datatypes.rb +0 -24
  394. data/old/wukong/datatypes/enum.rb +0 -127
  395. data/old/wukong/datatypes/fake_types.rb +0 -17
  396. data/old/wukong/decorator.rb +0 -28
  397. data/old/wukong/encoding/asciize.rb +0 -108
  398. data/old/wukong/extensions.rb +0 -16
  399. data/old/wukong/extensions/array.rb +0 -18
  400. data/old/wukong/extensions/blank.rb +0 -93
  401. data/old/wukong/extensions/class.rb +0 -189
  402. data/old/wukong/extensions/date_time.rb +0 -53
  403. data/old/wukong/extensions/emittable.rb +0 -69
  404. data/old/wukong/extensions/enumerable.rb +0 -79
  405. data/old/wukong/extensions/hash.rb +0 -167
  406. data/old/wukong/extensions/hash_keys.rb +0 -16
  407. data/old/wukong/extensions/hash_like.rb +0 -150
  408. data/old/wukong/extensions/hashlike_class.rb +0 -47
  409. data/old/wukong/extensions/module.rb +0 -2
  410. data/old/wukong/extensions/pathname.rb +0 -27
  411. data/old/wukong/extensions/string.rb +0 -65
  412. data/old/wukong/extensions/struct.rb +0 -17
  413. data/old/wukong/extensions/symbol.rb +0 -11
  414. data/old/wukong/filename_pattern.rb +0 -74
  415. data/old/wukong/helper.rb +0 -7
  416. data/old/wukong/helper/stopwords.rb +0 -195
  417. data/old/wukong/helper/tokenize.rb +0 -35
  418. data/old/wukong/logger.rb +0 -38
  419. data/old/wukong/periodic_monitor.rb +0 -72
  420. data/old/wukong/schema.rb +0 -269
  421. data/old/wukong/script.rb +0 -286
  422. data/old/wukong/script/avro_command.rb +0 -5
  423. data/old/wukong/script/cassandra_loader_script.rb +0 -40
  424. data/old/wukong/script/emr_command.rb +0 -168
  425. data/old/wukong/script/hadoop_command.rb +0 -237
  426. data/old/wukong/script/local_command.rb +0 -41
  427. data/old/wukong/store.rb +0 -10
  428. data/old/wukong/store/base.rb +0 -27
  429. data/old/wukong/store/cassandra.rb +0 -10
  430. data/old/wukong/store/cassandra/streaming.rb +0 -75
  431. data/old/wukong/store/cassandra/struct_loader.rb +0 -21
  432. data/old/wukong/store/cassandra_model.rb +0 -91
  433. data/old/wukong/store/chh_chunked_flat_file_store.rb +0 -37
  434. data/old/wukong/store/chunked_flat_file_store.rb +0 -48
  435. data/old/wukong/store/conditional_store.rb +0 -57
  436. data/old/wukong/store/factory.rb +0 -8
  437. data/old/wukong/store/flat_file_store.rb +0 -89
  438. data/old/wukong/store/key_store.rb +0 -51
  439. data/old/wukong/store/null_store.rb +0 -15
  440. data/old/wukong/store/read_thru_store.rb +0 -22
  441. data/old/wukong/store/tokyo_tdb_key_store.rb +0 -33
  442. data/old/wukong/store/tyrant_rdb_key_store.rb +0 -57
  443. data/old/wukong/store/tyrant_tdb_key_store.rb +0 -20
  444. data/old/wukong/streamer.rb +0 -30
  445. data/old/wukong/streamer/accumulating_reducer.rb +0 -83
  446. data/old/wukong/streamer/base.rb +0 -126
  447. data/old/wukong/streamer/counting_reducer.rb +0 -25
  448. data/old/wukong/streamer/filter.rb +0 -20
  449. data/old/wukong/streamer/instance_streamer.rb +0 -15
  450. data/old/wukong/streamer/json_streamer.rb +0 -21
  451. data/old/wukong/streamer/line_streamer.rb +0 -12
  452. data/old/wukong/streamer/list_reducer.rb +0 -31
  453. data/old/wukong/streamer/rank_and_bin_reducer.rb +0 -145
  454. data/old/wukong/streamer/record_streamer.rb +0 -14
  455. data/old/wukong/streamer/reducer.rb +0 -11
  456. data/old/wukong/streamer/set_reducer.rb +0 -14
  457. data/old/wukong/streamer/struct_streamer.rb +0 -48
  458. data/old/wukong/streamer/summing_reducer.rb +0 -29
  459. data/old/wukong/streamer/uniq_by_last_reducer.rb +0 -51
  460. data/old/wukong/typed_struct.rb +0 -12
  461. data/spec/away/encoding_spec.rb +0 -32
  462. data/spec/away/exe_spec.rb +0 -20
  463. data/spec/away/flow_spec.rb +0 -82
  464. data/spec/away/graph_spec.rb +0 -6
  465. data/spec/away/job_spec.rb +0 -15
  466. data/spec/away/rake_compat_spec.rb +0 -9
  467. data/spec/away/script_spec.rb +0 -81
  468. data/spec/hanuman/graphviz_spec.rb +0 -29
  469. data/spec/hanuman/slot_spec.rb +0 -2
  470. data/spec/support/examples_helper.rb +0 -10
  471. data/spec/support/streamer_test_helpers.rb +0 -6
  472. data/spec/support/wukong_widget_helpers.rb +0 -66
  473. data/spec/wukong/processor_spec.rb +0 -109
  474. data/spec/wukong/widget/filter_spec.rb +0 -99
  475. data/spec/wukong/widget/stringifier_spec.rb +0 -51
  476. data/spec/wukong/workflow/command_spec.rb +0 -5
@@ -1,2232 +0,0 @@
1
- #############################################################################
2
- # Parse graphviz/dot text into a Graph::Easy object
3
- #
4
- #############################################################################
5
-
6
- package Graph::Easy::Parser::Graphviz;
7
-
8
- $VERSION = '0.17';
9
- use Graph::Easy::Parser;
10
- @ISA = qw/Graph::Easy::Parser/;
11
-
12
- use strict;
13
- use utf8;
14
- use constant NO_MULTIPLES => 1;
15
-
16
- sub _init
17
- {
18
- my $self = shift;
19
-
20
- $self->SUPER::_init(@_);
21
- $self->{attr_sep} = '=';
22
- # remove " <p1> " from autosplit (shape=record) labels
23
- $self->{_qr_part_clean} = qr/\s*<([^>]*)>/;
24
-
25
- $self;
26
- }
27
-
28
- sub reset
29
- {
30
- my $self = shift;
31
-
32
- $self->SUPER::reset(@_);
33
-
34
- # set some default attributes on the graph object, because graphviz has
35
- # different defaults as Graph::Easy
36
- my $g = $self->{_graph};
37
-
38
- $g->set_attribute('colorscheme','x11');
39
- $g->set_attribute('flow','south');
40
- $g->set_attribute('edge','arrow-style', 'filled');
41
- $g->set_attribute('group','align', 'center');
42
- $g->set_attribute('group','fill', 'inherit');
43
-
44
- $self->{scope_stack} = [];
45
-
46
- # allow some temp. values during parsing
47
- $g->_allow_special_attributes(
48
- {
49
- node => {
50
- shape => [
51
- "",
52
- [ qw/ circle diamond edge ellipse hexagon house invisible
53
- invhouse invtrapezium invtriangle octagon parallelogram pentagon
54
- point triangle trapezium septagon rect rounded none img record Mrecord/ ],
55
- '',
56
- '',
57
- undef,
58
- ],
59
- },
60
- } );
61
-
62
- $g->{_warn_on_unknown_attributes} = 1;
63
-
64
- $self;
65
- }
66
-
67
- # map "&tilde;" to "~"
68
- my %entities = (
69
- 'amp' => '&',
70
- 'quot' => '"',
71
- 'lt' => '<',
72
- 'gt' => '>',
73
- 'nbsp' => ' ', # this is a non-break-space between '' here!
74
- 'iexcl' => '¡',
75
- 'cent' => '¢',
76
- 'pound' => '£',
77
- 'curren' => '¤',
78
- 'yen' => '¥',
79
- 'brvbar' => '¦',
80
- 'sect' => '§',
81
- 'uml' => '¨',
82
- 'copy' => '©',
83
- 'ordf' => 'ª',
84
- 'ordf' => 'ª',
85
- 'laquo' => '«',
86
- 'not' => '¬',
87
- 'shy' => "\x{00AD}", # soft-hyphen
88
- 'reg' => '®',
89
- 'macr' => '¯',
90
- 'deg' => '°',
91
- 'plusmn' => '±',
92
- 'sup2' => '²',
93
- 'sup3' => '³',
94
- 'acute' => '´',
95
- 'micro' => 'µ',
96
- 'para' => '¶',
97
- 'midot' => '·',
98
- 'cedil' => '¸',
99
- 'sup1' => '¹',
100
- 'ordm' => 'º',
101
- 'raquo' => '»',
102
- 'frac14' => '¼',
103
- 'frac12' => '½',
104
- 'frac34' => '¾',
105
- 'iquest' => '¿',
106
- 'Agrave' => 'À',
107
- 'Aacute' => 'Á',
108
- 'Acirc' => 'Â',
109
- 'Atilde' => 'Ã',
110
- 'Auml' => 'Ä',
111
- 'Aring' => 'Å',
112
- 'Aelig' => 'Æ',
113
- 'Ccedil' => 'Ç',
114
- 'Egrave' => 'È',
115
- 'Eacute' => 'É',
116
- 'Ecirc' => 'Ê',
117
- 'Euml' => 'Ë',
118
- 'Igrave' => 'Ì',
119
- 'Iacute' => 'Í',
120
- 'Icirc' => 'Î',
121
- 'Iuml' => 'Ï',
122
- 'ETH' => 'Ð',
123
- 'Ntilde' => 'Ñ',
124
- 'Ograve' => 'Ò',
125
- 'Oacute' => 'Ó',
126
- 'Ocirc' => 'Ô',
127
- 'Otilde' => 'Õ',
128
- 'Ouml' => 'Ö',
129
- 'times' => '×',
130
- 'Oslash' => 'Ø',
131
- 'Ugrave' => 'Ù',
132
- 'Uacute' => 'Ù',
133
- 'Ucirc' => 'Û',
134
- 'Uuml' => 'Ü',
135
- 'Yacute' => 'Ý',
136
- 'THORN' => 'Þ',
137
- 'szlig' => 'ß',
138
- 'agrave' => 'à',
139
- 'aacute' => 'á',
140
- 'acirc' => 'â',
141
- 'atilde' => 'ã',
142
- 'auml' => 'ä',
143
- 'aring' => 'å',
144
- 'aelig' => 'æ',
145
- 'ccedil' => 'ç',
146
- 'egrave' => 'è',
147
- 'eacute' => 'é',
148
- 'ecirc' => 'ê',
149
- 'euml' => 'ë',
150
- 'igrave' => 'ì',
151
- 'iacute' => 'í',
152
- 'icirc' => 'î',
153
- 'iuml' => 'ï',
154
- 'eth' => 'ð',
155
- 'ntilde' => 'ñ',
156
- 'ograve' => 'ò',
157
- 'oacute' => 'ó',
158
- 'ocirc' => 'ô',
159
- 'otilde' => 'õ',
160
- 'ouml' => 'ö',
161
- 'divide' => '÷',
162
- 'oslash' => 'ø',
163
- 'ugrave' => 'ù',
164
- 'uacute' => 'ú',
165
- 'ucirc' => 'û',
166
- 'uuml' => 'ü',
167
- 'yacute' => 'ý',
168
- 'thorn' => 'þ',
169
- 'yuml' => 'ÿ',
170
- 'Oelig' => 'Œ',
171
- 'oelig' => 'œ',
172
- 'Scaron' => 'Š',
173
- 'scaron' => 'š',
174
- 'Yuml' => 'Ÿ',
175
- 'fnof' => 'ƒ',
176
- 'circ' => '^',
177
- 'tilde' => '~',
178
- 'Alpha' => 'Α',
179
- 'Beta' => 'Β',
180
- 'Gamma' => 'Γ',
181
- 'Delta' => 'Δ',
182
- 'Epsilon'=> 'Ε',
183
- 'Zeta' => 'Ζ',
184
- 'Eta' => 'Η',
185
- 'Theta' => 'Θ',
186
- 'Iota' => 'Ι',
187
- 'Kappa' => 'Κ',
188
- 'Lambda' => 'Λ',
189
- 'Mu' => 'Μ',
190
- 'Nu' => 'Ν',
191
- 'Xi' => 'Ξ',
192
- 'Omicron'=> 'Ο',
193
- 'Pi' => 'Π',
194
- 'Rho' => 'Ρ',
195
- 'Sigma' => 'Σ',
196
- 'Tau' => 'Τ',
197
- 'Upsilon'=> 'Υ',
198
- 'Phi' => 'Φ',
199
- 'Chi' => 'Χ',
200
- 'Psi' => 'Ψ',
201
- 'Omega' => 'Ω',
202
- 'alpha' => 'α',
203
- 'beta' => 'β',
204
- 'gamma' => 'γ',
205
- 'delta' => 'δ',
206
- 'epsilon'=> 'ε',
207
- 'zeta' => 'ζ',
208
- 'eta' => 'η',
209
- 'theta' => 'θ',
210
- 'iota' => 'ι',
211
- 'kappa' => 'κ',
212
- 'lambda' => 'λ',
213
- 'mu' => 'μ',
214
- 'nu' => 'ν',
215
- 'xi' => 'ξ',
216
- 'omicron'=> 'ο',
217
- 'pi' => 'π',
218
- 'rho' => 'ρ',
219
- 'sigma' => 'σ',
220
- 'tau' => 'τ',
221
- 'upsilon'=> 'υ',
222
- 'phi' => 'φ',
223
- 'chi' => 'χ',
224
- 'psi' => 'ψ',
225
- 'omega' => 'ω',
226
- 'thetasym'=>'ϑ',
227
- 'upsih' => 'ϒ',
228
- 'piv' => 'ϖ',
229
- 'ensp' => "\x{2003}", # normal wide space
230
- 'emsp' => "\x{2004}", # wide space
231
- 'thinsp' => "\x{2009}", # very thin space
232
- 'zwnj' => "\x{200c}", # zero-width-non-joiner
233
- 'zwj' => "\x{200d}", # zero-width-joiner
234
- 'lrm' => "\x{200e}", # left-to-right
235
- 'rlm' => "\x{200f}", # right-to-left
236
- 'ndash' => '–',
237
- 'mdash' => '—',
238
- 'lsquo' => '‘',
239
- 'rsquo' => '’',
240
- 'sbquo' => '‚',
241
- 'ldquo' => '“',
242
- 'rdquo' => '”',
243
- 'bdquo' => '„',
244
- 'dagger' => '†',
245
- 'Dagger' => '‡',
246
- 'bull' => '•',
247
- 'hellip' => '…',
248
- 'permil' => '‰',
249
- 'prime' => '′',
250
- 'Prime' => '′',
251
- 'lsaquo' => '‹',
252
- 'rsaquo' => '›',
253
- 'oline' => '‾',
254
- 'frasl' => '⁄',
255
- 'euro' => '€',
256
- 'image' => 'ℑ',
257
- 'weierp' => '℘',
258
- 'real' => 'ℜ',
259
- 'trade' => '™',
260
- 'alefsym'=> 'ℵ',
261
- 'larr' => '←',
262
- 'uarr' => '↑',
263
- 'rarr' => '→',
264
- 'darr' => '↓',
265
- 'harr' => '↔',
266
- 'crarr' => '↵',
267
- 'lArr' => '⇐',
268
- 'uArr' => '⇑',
269
- 'rArr' => '⇒',
270
- 'dArr' => '⇓',
271
- 'hArr' => '⇔',
272
- 'forall' => '∀',
273
- 'part' => '∂',
274
- 'exist' => '∃',
275
- 'empty' => '∅',
276
- 'nabla' => '∇',
277
- 'isin' => '∈',
278
- 'notin' => '∉',
279
- 'ni' => '∋',
280
- 'prod' => '∏',
281
- 'sum' => '∑',
282
- 'minus' => '−',
283
- 'lowast' => '∗',
284
- 'radic' => '√',
285
- 'prop' => '∝',
286
- 'infin' => '∞',
287
- 'ang' => '∠',
288
- 'and' => '∧',
289
- 'or' => '∨',
290
- 'cap' => '∩',
291
- 'cup' => '∪',
292
- 'int' => '∫',
293
- 'there4' => '∴',
294
- 'sim' => '∼',
295
- 'cong' => '≅',
296
- 'asymp' => '≈',
297
- 'ne' => '≠',
298
- 'eq' => '=',
299
- 'le' => '≤',
300
- 'ge' => '≥',
301
- 'sub' => '⊂',
302
- 'sup' => '⊃',
303
- 'nsub' => '⊄',
304
- 'nsup' => '⊅',
305
- 'sube' => '⊆',
306
- 'supe' => '⊇',
307
- 'oplus' => '⊕',
308
- 'otimes' => '⊗',
309
- 'perp' => '⊥',
310
- 'sdot' => '⋅',
311
- 'lceil' => '⌈',
312
- 'rceil' => '⌉',
313
- 'lfloor' => '⌊',
314
- 'rfloor' => '⌋',
315
- 'lang' => '〈',
316
- 'rang' => '〉',
317
- 'loz' => '◊',
318
- 'spades' => '♠',
319
- 'clubs' => '♣',
320
- 'diamonds'=>'♦',
321
- 'hearts' => '♥',
322
- );
323
-
324
- sub _unquote_attribute
325
- {
326
- my ($self,$name,$val) = @_;
327
-
328
- my $html_like = 0;
329
- if ($name eq 'label')
330
- {
331
- $html_like = 1 if $val =~ /^\s*<\s*</;
332
- # '< >' => ' ', ' < a > ' => ' a '
333
- if ($html_like == 0 && $val =~ /\s*<(.*)>\s*\z/)
334
- {
335
- $val = $1; $val = ' ' if $val eq '';
336
- }
337
- }
338
-
339
- my $v = $self->_unquote($val);
340
-
341
- # Now HTML labels always start with "<", while non-HTML labels
342
- # start with " <" or anything else.
343
- if ($html_like == 0)
344
- {
345
- $v = ' ' . $v if $v =~ /^</;
346
- }
347
- else
348
- {
349
- $v =~ s/^\s*//; $v =~ s/\s*\z//;
350
- }
351
-
352
- $v;
353
- }
354
-
355
- sub _unquote
356
- {
357
- my ($self, $name) = @_;
358
-
359
- $name = '' unless defined $name;
360
-
361
- # string concat
362
- # "foo" + " bar" => "foo bar"
363
- $name =~ s/^
364
- "((?:\\"|[^"])*)" # "foo"
365
- \s*\+\s*"((?:\\"|[^"])*)" # followed by ' + "bar"'
366
- /"$1$2"/x
367
- while $name =~ /^
368
- "(?:\\"|[^"])*" # "foo"
369
- \s*\+\s*"(?:\\"|[^"])*" # followed by ' + "bar"'
370
- /x;
371
-
372
- # map "&!;" to "!"
373
- $name =~ s/&(.);/$1/g;
374
-
375
- # map "&amp;" to "&"
376
- $name =~ s/&([^;]+);/$entities{$1} || '';/eg;
377
-
378
- # "foo bar" => foo bar
379
- $name =~ s/^"\s*//; # remove left-over quotes
380
- $name =~ s/\s*"\z//;
381
-
382
- # unquote special chars
383
- $name =~ s/\\([\[\(\{\}\]\)#"])/$1/g;
384
-
385
- $name;
386
- }
387
-
388
- sub _clean_line
389
- {
390
- # do some cleanups on a line before handling it
391
- my ($self,$line) = @_;
392
-
393
- chomp($line);
394
-
395
- # collapse white space at start
396
- $line =~ s/^\s+//;
397
- # line ending in '\' means a continuation
398
- $line =~ s/\\\z//;
399
-
400
- $line;
401
- }
402
-
403
- sub _line_insert
404
- {
405
- # "a1 -> a2\na3 -> a4" => "a1 -> a2 a3 -> a4"
406
- ' ';
407
- }
408
-
409
- #############################################################################
410
-
411
- sub _match_boolean
412
- {
413
- # not used yet, match a boolean value
414
- qr/(true|false|\d+)/;
415
- }
416
-
417
- sub _match_comment
418
- {
419
- # match the start of a comment
420
-
421
- # // comment
422
- qr#(:[^\\]|)//#;
423
- }
424
-
425
- sub _match_multi_line_comment
426
- {
427
- # match a multi line comment
428
-
429
- # /* * comment * */
430
- qr#(?:\s*/\*.*?\*/\s*)+#;
431
- }
432
-
433
- sub _match_optional_multi_line_comment
434
- {
435
- # match an optional multi line comment
436
-
437
- # "/* * comment * */" or /* a */ /* b */ or ""
438
- qr#(?:(?:\s*/\*.*?\*/\s*)*|\s+)#;
439
- }
440
-
441
- sub _match_name
442
- {
443
- # Return a regexp that matches an ID in the DOT language.
444
- # See http://www.graphviz.org/doc/info/lang.html for reference.
445
-
446
- # "node", "graph", "edge", "digraph", "subgraph" and "strict" are reserved:
447
- qr/\s*
448
- (
449
- # double quoted string
450
- "(?:\\"|[^"])*" # "foo"
451
- (?:\s*\+\s*"(?:\\"|[^"])*")* # followed by 0 or more ' + "bar"'
452
- |
453
- # number
454
- -? # optional minus sign
455
- (?: # non-capture group
456
- \.[0-9]+ # .00019
457
- | # or
458
- [0-9]+(?:\.[0-9]*)? # 123 or 123.1
459
- )
460
- |
461
- # plain node name (a-z0-9_+)
462
- (?!(?i:node|edge|digraph|subgraph|graph|strict)\s)[\w]+
463
- )/xi;
464
- }
465
-
466
- sub _match_node
467
- {
468
- # Return a regexp that matches something like '"bonn"' or 'bonn' or 'bonn:f1'
469
- my $self = shift;
470
-
471
- my $qr_n = $self->_match_name();
472
-
473
- # Examples: "bonn", "Bonn":f1, "Bonn":"f1", "Bonn":"port":"w", Bonn:port:w
474
- qr/
475
- $qr_n # node name (see _match_name)
476
- (?:
477
- :$qr_n
478
- (?: :(n|ne|e|se|s|sw|w|nw) )? # :port:compass_direction
479
- |
480
- :(n|ne|e|se|s|sw|w|nw) # :compass_direction
481
- )? # optional
482
- /x;
483
- }
484
-
485
- sub _match_group_start
486
- {
487
- # match a subgraph at the beginning (f.i. "graph { ")
488
- my $self = shift;
489
- my $qr_n = $self->_match_name();
490
-
491
- qr/^\s*(?:strict\s+)?(?:(?i)digraph|subgraph|graph)\s+$qr_n\s*\{/i;
492
- }
493
-
494
- sub _match_pseudo_group_start_at_beginning
495
- {
496
- # match an anonymous group start at the beginning (aka " { ")
497
- qr/^\s*\{/;
498
- }
499
-
500
- sub _match_pseudo_group_start
501
- {
502
- # match an anonymous group start (aka " { ")
503
- qr/\s*\{/;
504
- }
505
-
506
- sub _match_group_end
507
- {
508
- # return a regexp that matches something like " }" or "} ;".
509
- qr/^\s*\}\s*;?\s*/;
510
- }
511
-
512
- sub _match_edge
513
- {
514
- # Matches an edge
515
- qr/\s*(->|--)/;
516
- }
517
-
518
- sub _match_html_regexps
519
- {
520
- # Return hash with regexps matching different parts of an HTML label.
521
- my $qr =
522
- {
523
- # BORDER="2"
524
- attribute => qr/\s*([A-Za-z]+)\s*=\s*"((?:\\"|[^"])*)"/,
525
- # BORDER="2" COLSPAN="2"
526
- attributes => qr/(?:\s+(?:[A-Za-z]+)\s*=\s*"(?:\\"|[^"])*")*/,
527
- text => qr/.*?/,
528
- tr => qr/\s*<TR>/i,
529
- tr_end => qr/\s*<\/TR>/i,
530
- td => qr/\s*<TD[^>]*>/i,
531
- td_tag => qr/\s*<TD\s*/i,
532
- td_end => qr/\s*<\/TD>/i,
533
- table => qr/\s*<TABLE[^>]*>/i,
534
- table_tag => qr/\s*<TABLE\s*/i,
535
- table_end => qr/\s*<\/TABLE>/i,
536
- };
537
- $qr->{row} = qr/$qr->{tr}(?:$qr->{td}$qr->{text}$qr->{td_end})*$qr->{tr_end}/;
538
-
539
- $qr;
540
- }
541
-
542
- sub _match_html
543
- {
544
- # build a giant regular expression that matches an HTML label
545
-
546
- # label=<
547
- # <TABLE BORDER="2" CELLBORDER="1" CELLSPACING="0" BGCOLOR="#ffffff">
548
- # <TR><TD PORT="portname" COLSPAN="3" BGCOLOR="#aabbcc" ALIGN="CENTER">port</TD></TR>
549
- # <TR><TD PORT="port2" COLSPAN="2" ALIGN="LEFT">port2</TD><TD PORT="port3" ALIGN="LEFT">port3</TD></TR>
550
- # </TABLE>>
551
-
552
- my $qr = _match_html_regexps();
553
-
554
- # < <TABLE> .. </TABLE> >
555
- qr/<$qr->{table}(?:$qr->{row})*$qr->{table_end}\s*>/;
556
- }
557
-
558
- sub _match_single_attribute
559
- {
560
- my $qr_html = _match_html();
561
-
562
- qr/\s*(\w+)\s*=\s* # the attribute name (label=")
563
- (
564
- "(?:\\"|[^"])*" # "foo"
565
- (?:\s*\+\s*"(?:\\"|[^"])*")* # followed by 0 or more ' + "bar"'
566
- |
567
- $qr_html # or < <TABLE>..<\/TABLE> >
568
- |
569
- <[^>]*> # or something like < a >
570
- |
571
- [^<][^,\]\}\n\s;]* # or simple 'fooobar'
572
- )
573
- [,\]\n\}\s;]?\s*/x; # possible ",", "\n" etc.
574
- }
575
-
576
- sub _match_special_attribute
577
- {
578
- # match boolean attributes, these can appear without a value
579
- qr/\s*(
580
- center|
581
- compound|
582
- concentrate|
583
- constraint|
584
- decorate|
585
- diredgeconstraints|
586
- fixedsize|
587
- headclip|
588
- labelfloat|
589
- landscape|
590
- mosek|
591
- nojustify|
592
- normalize|
593
- overlap|
594
- pack|
595
- pin|
596
- regular|
597
- remincross|
598
- root|
599
- splines|
600
- tailclip|
601
- truecolor
602
- )[,;\s]?\s*/x;
603
- }
604
-
605
- sub _match_attributes
606
- {
607
- # return a regexp that matches something like " [ color=red; ]" and returns
608
- # the inner text without the []
609
-
610
- my $qr_att = _match_single_attribute();
611
- my $qr_satt = _match_special_attribute();
612
- my $qr_cmt = _match_multi_line_comment();
613
-
614
- qr/\s*\[\s*((?:$qr_att|$qr_satt|$qr_cmt)*)\s*\];?/;
615
- }
616
-
617
- sub _match_graph_attribute
618
- {
619
- # return a regexp that matches something like " color=red; " for attributes
620
- # that apply to a graph/subgraph
621
- qr/^\s*(\w+\s*=\s*("[^"]+"|[^;\n\s]+))([;\n\s]\s*|\z)/;
622
- }
623
-
624
- sub _match_optional_attributes
625
- {
626
- # return a regexp that matches something like " [ color=red; ]" and returns
627
- # the inner text with the []
628
-
629
- my $qr_att = _match_single_attribute();
630
- my $qr_satt = _match_special_attribute();
631
- my $qr_cmt = _match_multi_line_comment();
632
-
633
- qr/\s*(\[\s*((?:$qr_att|$qr_satt|$qr_cmt)*)\s*\])?;?/;
634
- }
635
-
636
- sub _clean_attributes
637
- {
638
- my ($self,$text) = @_;
639
-
640
- $text =~ s/^\s*\[\s*//; # remove left-over "[" and spaces
641
- $text =~ s/\s*;?\s*\]\s*\z//; # remove left-over "]" and spaces
642
-
643
- $text;
644
- }
645
-
646
- #############################################################################
647
-
648
- sub _new_scope
649
- {
650
- # create a new scope, with attributes from current scope
651
- my ($self, $is_group) = @_;
652
-
653
- my $scope = {};
654
-
655
- if (@{$self->{scope_stack}} > 0)
656
- {
657
- my $old_scope = $self->{scope_stack}->[-1];
658
-
659
- # make a copy of the old scope's attributes
660
- for my $t (keys %$old_scope)
661
- {
662
- next if $t =~ /^_/;
663
- my $s = $old_scope->{$t};
664
- $scope->{$t} = {} unless ref $scope->{$t}; my $sc = $scope->{$t};
665
- for my $k (keys %$s)
666
- {
667
- # skip things like "_is_group"
668
- $sc->{$k} = $s->{$k} unless $k =~ /^_/;
669
- }
670
- }
671
- }
672
- $scope->{_is_group} = 1 if defined $is_group;
673
-
674
- push @{$self->{scope_stack}}, $scope;
675
- $scope;
676
- }
677
-
678
- sub _add_group_match
679
- {
680
- # register handlers for group start/end
681
- my $self = shift;
682
-
683
- my $qr_pseudo_group_start = $self->_match_pseudo_group_start_at_beginning();
684
- my $qr_group_start = $self->_match_group_start();
685
- my $qr_group_end = $self->_match_group_end();
686
- my $qr_edge = $self->_match_edge();
687
- my $qr_ocmt = $self->_match_optional_multi_line_comment();
688
-
689
- # "subgraph G {"
690
- $self->_register_handler( $qr_group_start,
691
- sub
692
- {
693
- my $self = shift;
694
- my $graph = $self->{_graph};
695
- my $gn = $self->_unquote($1);
696
- print STDERR "# Parser: found subcluster '$gn'\n" if $self->{debug};
697
- push @{$self->{group_stack}}, $self->_new_group($gn);
698
- $self->_new_scope( 1 );
699
- 1;
700
- } );
701
-
702
- # "{ "
703
- $self->_register_handler( $qr_pseudo_group_start,
704
- sub
705
- {
706
- my $self = shift;
707
- print STDERR "# Parser: Creating new scope\n" if $self->{debug};
708
- $self->_new_scope();
709
- # forget the left side
710
- $self->{left_edge} = undef;
711
- $self->{left_stack} = [ ];
712
- 1;
713
- } );
714
-
715
- # "} -> " group/cluster/scope end with an edge
716
- $self->_register_handler( qr/$qr_group_end$qr_ocmt$qr_edge/,
717
- sub
718
- {
719
- my $self = shift;
720
-
721
- my $scope = pop @{$self->{scope_stack}};
722
- return $self->parse_error(0) if !defined $scope;
723
-
724
- if ($scope->{_is_group} && @{$self->{group_stack}})
725
- {
726
- print STDERR "# Parser: end subcluster '$self->{group_stack}->[-1]->{name}'\n" if $self->{debug};
727
- pop @{$self->{group_stack}};
728
- }
729
- else { print STDERR "# Parser: end scope\n" if $self->{debug}; }
730
-
731
- 1;
732
- },
733
- sub
734
- {
735
- my ($self, $line) = @_;
736
- $line =~ qr/$qr_group_end$qr_edge/;
737
- $1 . ' ';
738
- } );
739
-
740
- # "}" group/cluster/scope end
741
- $self->_register_handler( $qr_group_end,
742
- sub
743
- {
744
- my $self = shift;
745
-
746
- my $scope = pop @{$self->{scope_stack}};
747
- return $self->parse_error(0) if !defined $scope;
748
-
749
- if ($scope->{_is_group} && @{$self->{group_stack}})
750
- {
751
- print STDERR "# Parser: end subcluster '$self->{group_stack}->[-1]->{name}'\n" if $self->{debug};
752
- pop @{$self->{group_stack}};
753
- }
754
- # always reset the stack
755
- $self->{stack} = [ ];
756
- 1;
757
- } );
758
- }
759
-
760
- sub _edge_style
761
- {
762
- # To convert "--" or "->" we simply do nothing, since the edge style in
763
- # Graphviz can only be set via the attribute "style"
764
- my ($self, $ed) = @_;
765
-
766
- 'solid';
767
- }
768
-
769
- sub _new_nodes
770
- {
771
- my ($self, $name, $group_stack, $att, $port, $stack) = @_;
772
-
773
- $port = '' unless defined $port;
774
- my @rc = ();
775
- # "name1" => "name1"
776
- if ($port ne '')
777
- {
778
- # create a special node
779
- $name =~ s/^"//; $name =~ s/"\z//;
780
- $port =~ s/^"//; $port =~ s/"\z//;
781
- # XXX TODO: find unique name?
782
- @rc = $self->_new_node ($self->{_graph}, "$name:$port", $group_stack, $att, $stack);
783
- my $node = $rc[0];
784
- $node->{_graphviz_portlet} = $port;
785
- $node->{_graphviz_basename} = $name;
786
- }
787
- else
788
- {
789
- @rc = $self->_new_node ($self->{_graph}, $name, $group_stack, $att, $stack);
790
- }
791
- @rc;
792
- }
793
-
794
- sub _build_match_stack
795
- {
796
- my $self = shift;
797
-
798
- my $qr_node = $self->_match_node();
799
- my $qr_name = $self->_match_name();
800
- my $qr_cmt = $self->_match_multi_line_comment();
801
- my $qr_ocmt = $self->_match_optional_multi_line_comment();
802
- my $qr_attr = $self->_match_attributes();
803
- my $qr_gatr = $self->_match_graph_attribute();
804
- my $qr_oatr = $self->_match_optional_attributes();
805
- my $qr_edge = $self->_match_edge();
806
- my $qr_pgr = $self->_match_pseudo_group_start();
807
-
808
- # remove multi line comments /* comment */
809
- $self->_register_handler( qr/^$qr_cmt/, undef );
810
-
811
- # remove single line comment // comment
812
- $self->_register_handler( qr/^\s*\/\/.*/, undef );
813
-
814
- # simply remove the graph start, but remember that we did this
815
- $self->_register_handler( qr/^\s*((?i)strict)?$qr_ocmt((?i)digraph|graph)$qr_ocmt$qr_node$qr_ocmt\{/,
816
- sub
817
- {
818
- my $self = shift;
819
- return $self->parse_error(6) if @{$self->{scope_stack}} > 0;
820
- $self->{_graphviz_graph_name} = $3;
821
- $self->_new_scope(1);
822
- $self->{_graph}->set_attribute('type','undirected') if lc($2) eq 'graph';
823
- 1;
824
- } );
825
-
826
- # simply remove the graph start, but remember that we did this
827
- $self->_register_handler( qr/^\s*(strict)?$qr_ocmt(di)?graph$qr_ocmt\{/i,
828
- sub
829
- {
830
- my $self = shift;
831
- return $self->parse_error(6) if @{$self->{scope_stack}} > 0;
832
- $self->{_graphviz_graph_name} = 'unnamed';
833
- $self->_new_scope(1);
834
- $self->{_graph}->set_attribute('type','undirected') if lc($2) ne 'di';
835
- 1;
836
- } );
837
-
838
- # end-of-statement
839
- $self->_register_handler( qr/^\s*;/, undef );
840
-
841
- # cluster/subgraph "subgraph G { .. }"
842
- # scope (dummy group): "{ .. }"
843
- # scope/group/subgraph end: "}"
844
- $self->_add_group_match();
845
-
846
- # node [ color="red" ] etc.
847
- # The "(?i)" makes the keywords match case-insensitive.
848
- $self->_register_handler( qr/^\s*((?i)node|graph|edge)$qr_ocmt$qr_attr/,
849
- sub
850
- {
851
- my $self = shift;
852
- my $type = lc($1 || '');
853
- my $att = $self->_parse_attributes($2 || '', $type, NO_MULTIPLES );
854
- return undef unless defined $att; # error in attributes?
855
-
856
- if ($type ne 'graph')
857
- {
858
- # apply the attributes to the current scope
859
- my $scope = $self->{scope_stack}->[-1];
860
- $scope->{$type} = {} unless ref $scope->{$type};
861
- my $s = $scope->{$type};
862
- for my $k (keys %$att)
863
- {
864
- $s->{$k} = $att->{$k};
865
- }
866
- }
867
- else
868
- {
869
- my $graph = $self->{_graph};
870
- $graph->set_attributes ($type, $att);
871
- }
872
-
873
- # forget stacks
874
- $self->{stack} = [];
875
- $self->{left_edge} = undef;
876
- $self->{left_stack} = [];
877
- 1;
878
- } );
879
-
880
- # color=red; (for graphs or subgraphs)
881
- $self->_register_attribute_handler($qr_gatr, 'parent');
882
- # [ color=red; ] (for nodes/edges)
883
- $self->_register_attribute_handler($qr_attr);
884
-
885
- # node chain continued like "-> { ... "
886
- $self->_register_handler( qr/^$qr_edge$qr_ocmt$qr_pgr/,
887
- sub
888
- {
889
- my $self = shift;
890
-
891
- return if @{$self->{stack}} == 0; # only match this if stack non-empty
892
-
893
- my $graph = $self->{_graph};
894
- my $eg = $1; # entire edge ("->" etc)
895
-
896
- my $edge_un = 0; $edge_un = 1 if $eg eq '--'; # undirected edge?
897
-
898
- # need to defer edge attribute parsing until the edge exists
899
- # if inside a scope, set the scope attributes, too:
900
- my $scope = $self->{scope_stack}->[-1] || {};
901
- my $edge_atr = $scope->{edge} || {};
902
-
903
- # create a new scope
904
- $self->_new_scope();
905
-
906
- # remember the left side
907
- $self->{left_edge} = [ 'solid', '', $edge_atr, 0, $edge_un ];
908
- $self->{left_stack} = $self->{stack};
909
-
910
- # forget stack and remember the right side instead
911
- $self->{stack} = [];
912
-
913
- 1;
914
- } );
915
-
916
- # "Berlin"
917
- $self->_register_handler( qr/^$qr_node/,
918
- sub
919
- {
920
- my $self = shift;
921
- my $graph = $self->{_graph};
922
-
923
- # only match this inside a "{ }" (normal, non-group) scope
924
- return if exists $self->{scope_stack}->[-1]->{_is_group};
925
-
926
- my $n1 = $1;
927
- my $port = $2;
928
- push @{$self->{stack}},
929
- $self->_new_nodes ($n1, $self->{group_stack}, {}, $port, $self->{stack});
930
-
931
- if (defined $self->{left_edge})
932
- {
933
- my $e = $self->{use_class}->{edge};
934
- my ($style, $edge_label, $edge_atr, $edge_bd, $edge_un) = @{$self->{left_edge}};
935
-
936
- foreach my $node (@{$self->{left_stack}})
937
- {
938
- my $edge = $e->new( { style => $style, name => $edge_label } );
939
-
940
- # if inside a scope, set the scope attributes, too:
941
- my $scope = $self->{scope_stack}->[-1];
942
- $edge->set_attributes($scope->{edge}) if $scope;
943
-
944
- # override with the local attributes
945
- # 'string' => [ 'string' ]
946
- # [ { hash }, 'string' ] => [ { hash }, 'string' ]
947
- my $e = $edge_atr; $e = [ $edge_atr ] unless ref($e) eq 'ARRAY';
948
-
949
- for my $a (@$e)
950
- {
951
- if (ref $a)
952
- {
953
- $edge->set_attributes($a);
954
- }
955
- else
956
- {
957
- # deferred parsing with the object as param:
958
- my $out = $self->_parse_attributes($a, $edge, NO_MULTIPLES);
959
- return undef unless defined $out; # error in attributes?
960
- $edge->set_attributes($out);
961
- }
962
- }
963
-
964
- # "<--->": bidirectional
965
- $edge->bidirectional(1) if $edge_bd;
966
- $edge->undirected(1) if $edge_un;
967
- $graph->add_edge ( $node, $self->{stack}->[-1], $edge );
968
- }
969
- }
970
- 1;
971
- } );
972
-
973
- # "Berlin" [ color=red ] or "Bonn":"a" [ color=red ]
974
- $self->_register_handler( qr/^$qr_node$qr_oatr/,
975
- sub
976
- {
977
- my $self = shift;
978
- my $name = $1;
979
- my $port = $2;
980
- my $compass = $4 || ''; $port .= ":$compass" if $compass;
981
-
982
- $self->{stack} = [ $self->_new_nodes ($name, $self->{group_stack}, {}, $port ) ];
983
-
984
- # defer attribute parsing until object exists
985
- my $node = $self->{stack}->[0];
986
- my $a1 = $self->_parse_attributes($5||'', $node);
987
- return undef if $self->{error};
988
- $node->set_attributes($a1);
989
-
990
- # forget left stack
991
- $self->{left_edge} = undef;
992
- $self->{left_stack} = [];
993
- 1;
994
- } );
995
-
996
- # Things like ' "Node" ' will be consumed before, so we do not need a case
997
- # for '"Bonn" -> "Berlin"'
998
-
999
- # node chain continued like "-> "Kassel" [ ... ]"
1000
- $self->_register_handler( qr/^$qr_edge$qr_ocmt$qr_node$qr_ocmt$qr_oatr/,
1001
- sub
1002
- {
1003
- my $self = shift;
1004
-
1005
- return if @{$self->{stack}} == 0; # only match this if stack non-empty
1006
-
1007
- my $graph = $self->{_graph};
1008
- my $eg = $1; # entire edge ("->" etc)
1009
- my $n = $2; # node name
1010
- my $port = $3;
1011
- my $compass = $4 || $5 || ''; $port .= ":$compass" if $compass;
1012
-
1013
- my $edge_un = 0; $edge_un = 1 if $eg eq '--'; # undirected edge?
1014
-
1015
- my $scope = $self->{scope_stack}->[-1] || {};
1016
-
1017
- # need to defer edge attribute parsing until the edge exists
1018
- my $edge_atr = [ $6||'', $scope->{edge} || {} ];
1019
-
1020
- # the right side nodes:
1021
- my $nodes_b = [ $self->_new_nodes ($n, $self->{group_stack}, {}, $port) ];
1022
-
1023
- my $style = $self->_link_lists( $self->{stack}, $nodes_b,
1024
- '--', '', $edge_atr, 0, $edge_un);
1025
-
1026
- # remember the left side
1027
- $self->{left_edge} = [ $style, '', $edge_atr, 0, $edge_un ];
1028
- $self->{left_stack} = $self->{stack};
1029
-
1030
- # forget stack and remember the right side instead
1031
- $self->{stack} = $nodes_b;
1032
- 1;
1033
- } );
1034
-
1035
- $self;
1036
- }
1037
-
1038
- sub _add_node
1039
- {
1040
- # add a node to the graph, overridable by subclasses
1041
- my ($self, $graph, $name) = @_;
1042
-
1043
- # "a -- clusterB" should not create a spurious node named "clusterB"
1044
- my @groups = $graph->groups();
1045
- for my $g (@groups)
1046
- {
1047
- return $g if $g->{name} eq $name;
1048
- }
1049
-
1050
- my $node = $graph->node($name);
1051
-
1052
- if (!defined $node)
1053
- {
1054
- $node = $graph->add_node($name); # add
1055
-
1056
- # apply attributes from the current scope (only for new nodes)
1057
- my $scope = $self->{scope_stack}->[-1];
1058
- return $self->error("Scope stack is empty!") unless defined $scope;
1059
-
1060
- my $is_group = $scope->{_is_group};
1061
- delete $scope->{_is_group};
1062
- $node->set_attributes($scope->{node});
1063
- $scope->{_is_group} = $is_group if $is_group;
1064
- }
1065
-
1066
- $node;
1067
- }
1068
-
1069
- #############################################################################
1070
- # attribute remapping
1071
-
1072
- # undef => drop that attribute
1073
- # not listed attributes will result in "x-dot-$attribute" and a warning
1074
-
1075
- my $remap = {
1076
- 'node' => {
1077
- 'distortion' => 'x-dot-distortion',
1078
-
1079
- 'fixedsize' => undef,
1080
- 'group' => 'x-dot-group',
1081
- 'height' => 'x-dot-height',
1082
-
1083
- # XXX TODO: ignore non-node attributes set in a scope
1084
- 'dir' => undef,
1085
-
1086
- 'layer' => 'x-dot-layer',
1087
- 'margin' => 'x-dot-margin',
1088
- 'orientation' => \&_from_graphviz_node_orientation,
1089
- 'peripheries' => \&_from_graphviz_node_peripheries,
1090
- 'pin' => 'x-dot-pin',
1091
- 'pos' => 'x-dot-pos',
1092
- # XXX TODO: rank=0 should make that node the root node
1093
- # 'rank' => undef,
1094
- 'rects' => 'x-dot-rects',
1095
- 'regular' => 'x-dot-regular',
1096
- # 'root' => undef,
1097
- 'sides' => 'x-dot-sides',
1098
- 'shapefile' => 'x-dot-shapefile',
1099
- 'shape' => \&_from_graphviz_node_shape,
1100
- 'skew' => 'x-dot-skew',
1101
- 'style' => \&_from_graphviz_style,
1102
- 'width' => 'x-dot-width',
1103
- 'z' => 'x-dot-z',
1104
- },
1105
-
1106
- 'edge' => {
1107
- 'arrowsize' => 'x-dot-arrowsize',
1108
- 'arrowhead' => \&_from_graphviz_arrow_style,
1109
- 'arrowtail' => 'x-dot-arrowtail',
1110
- # important for color lists like "red:red" => double edge
1111
- 'color' => \&_from_graphviz_edge_color,
1112
- 'constraint' => 'x-dot-constraint',
1113
- 'dir' => \&_from_graphviz_edge_dir,
1114
- 'decorate' => 'x-dot-decorate',
1115
- 'f' => 'x-dot-f',
1116
- 'headclip' => 'x-dot-headclip',
1117
- 'headhref' => 'headlink',
1118
- 'headurl' => 'headlink',
1119
- 'headport' => \&_from_graphviz_headport,
1120
- 'headlabel' => 'headlabel',
1121
- 'headtarget' => 'x-dot-headtarget',
1122
- 'headtooltip' => 'headtitle',
1123
- 'labelangle' => 'x-dot-labelangle',
1124
- 'labeldistance' => 'x-dot-labeldistance',
1125
- 'labelfloat' => 'x-dot-labelfloat',
1126
- 'labelfontcolor' => \&_from_graphviz_color,
1127
- 'labelfontname' => 'font',
1128
- 'labelfontsize' => 'font-size',
1129
- 'layer' => 'x-dot-layer',
1130
- 'len' => 'x-dot-len',
1131
- 'lhead' => 'x-dot-lhead',
1132
- 'ltail' => 'x-dot-ltail',
1133
- 'minlen' => \&_from_graphviz_edge_minlen,
1134
- 'pos' => 'x-dot-pos',
1135
- 'samehead' => 'x-dot-samehead',
1136
- 'samearrowhead' => 'x-dot-samearrowhead',
1137
- 'sametail' => 'x-dot-sametail',
1138
- 'style' => \&_from_graphviz_edge_style,
1139
- 'tailclip' => 'x-dot-tailclip',
1140
- 'tailhref' => 'taillink',
1141
- 'tailurl' => 'taillink',
1142
- 'tailport' => \&_from_graphviz_tailport,
1143
- 'taillabel' => 'taillabel',
1144
- 'tailtarget' => 'x-dot-tailtarget',
1145
- 'tailtooltip' => 'tailtitle',
1146
- 'weight' => 'x-dot-weight',
1147
- },
1148
-
1149
- 'graph' => {
1150
- 'damping' => 'x-dot-damping',
1151
- 'K' => 'x-dot-k',
1152
- 'bb' => 'x-dot-bb',
1153
- 'center' => 'x-dot-center',
1154
- # will be handled automatically:
1155
- 'charset' => undef,
1156
- 'clusterrank' => 'x-dot-clusterrank',
1157
- 'compound' => 'x-dot-compound',
1158
- 'concentrate' => 'x-dot-concentrate',
1159
- 'defaultdist' => 'x-dot-defaultdist',
1160
- 'dim' => 'x-dot-dim',
1161
- 'dpi' => 'x-dot-dpi',
1162
- 'epsilon' => 'x-dot-epsilon',
1163
- 'esep' => 'x-dot-esep',
1164
- 'fontpath' => 'x-dot-fontpath',
1165
- 'labeljust' => \&_from_graphviz_graph_labeljust,
1166
- 'labelloc' => \&_from_graphviz_labelloc,
1167
- 'landscape' => 'x-dot-landscape',
1168
- 'layers' => 'x-dot-layers',
1169
- 'layersep' => 'x-dot-layersep',
1170
- 'levelsgap' => 'x-dot-levelsgap',
1171
- 'margin' => 'x-dot-margin',
1172
- 'maxiter' => 'x-dot-maxiter',
1173
- 'mclimit' => 'x-dot-mclimit',
1174
- 'mindist' => 'x-dot-mindist',
1175
- 'minquit' => 'x-dot-minquit',
1176
- 'mode' => 'x-dot-mode',
1177
- 'model' => 'x-dot-model',
1178
- 'nodesep' => 'x-dot-nodesep',
1179
- 'normalize' => 'x-dot-normalize',
1180
- 'nslimit' => 'x-dot-nslimit',
1181
- 'nslimit1' => 'x-dot-nslimit1',
1182
- 'ordering' => 'x-dot-ordering',
1183
- 'orientation' => 'x-dot-orientation',
1184
- 'output' => 'output',
1185
- 'outputorder' => 'x-dot-outputorder',
1186
- 'overlap' => 'x-dot-overlap',
1187
- 'pack' => 'x-dot-pack',
1188
- 'packmode' => 'x-dot-packmode',
1189
- 'page' => 'x-dot-page',
1190
- 'pagedir' => 'x-dot-pagedir',
1191
- 'pencolor' => \&_from_graphviz_color,
1192
- 'quantum' => 'x-dot-quantum',
1193
- 'rankdir' => \&_from_graphviz_graph_rankdir,
1194
- 'ranksep' => 'x-dot-ranksep',
1195
- 'ratio' => 'x-dot-ratio',
1196
- 'remincross' => 'x-dot-remincross',
1197
- 'resolution' => 'x-dot-resolution',
1198
- 'rotate' => 'x-dot-rotate',
1199
- 'samplepoints' => 'x-dot-samplepoints',
1200
- 'searchsize' => 'x-dot-searchsize',
1201
- 'sep' => 'x-dot-sep',
1202
- 'size' => 'x-dot-size',
1203
- 'splines' => 'x-dot-splines',
1204
- 'start' => 'x-dot-start',
1205
- 'style' => \&_from_graphviz_style,
1206
- 'stylesheet' => 'x-dot-stylesheet',
1207
- 'truecolor' => 'x-dot-truecolor',
1208
- 'viewport' => 'x-dot-viewport',
1209
- 'voro-margin' => 'x-dot-voro-margin',
1210
- },
1211
-
1212
- 'group' => {
1213
- 'labeljust' => \&_from_graphviz_graph_labeljust,
1214
- 'labelloc' => \&_from_graphviz_labelloc,
1215
- 'pencolor' => \&_from_graphviz_color,
1216
- 'style' => \&_from_graphviz_style,
1217
- 'K' => 'x-dot-k',
1218
- },
1219
-
1220
- 'all' => {
1221
- 'color' => \&_from_graphviz_color,
1222
- 'colorscheme' => 'x-colorscheme',
1223
- 'bgcolor' => \&_from_graphviz_color,
1224
- 'fillcolor' => \&_from_graphviz_color,
1225
- 'fontsize' => \&_from_graphviz_font_size,
1226
- 'fontcolor' => \&_from_graphviz_color,
1227
- 'fontname' => 'font',
1228
- 'lp' => 'x-dot-lp',
1229
- 'nojustify' => 'x-dot-nojustify',
1230
- 'rank' => 'x-dot-rank',
1231
- 'showboxes' => 'x-dot-showboxes',
1232
- 'target' => 'x-dot-target',
1233
- 'tooltip' => 'title',
1234
- 'URL' => 'link',
1235
- 'href' => 'link',
1236
- },
1237
- };
1238
-
1239
- sub _remap { $remap; }
1240
-
1241
- my $rankdir = {
1242
- 'LR' => 'east',
1243
- 'RL' => 'west',
1244
- 'TB' => 'south',
1245
- 'BT' => 'north',
1246
- };
1247
-
1248
- sub _from_graphviz_graph_rankdir
1249
- {
1250
- my ($self, $name, $dir, $object) = @_;
1251
-
1252
- my $d = $rankdir->{$dir} || 'east';
1253
-
1254
- ('flow', $d);
1255
- }
1256
-
1257
- my $shapes = {
1258
- box => 'rect',
1259
- polygon => 'rect',
1260
- egg => 'rect',
1261
- rectangle => 'rect',
1262
- mdiamond => 'diamond',
1263
- msquare => 'rect',
1264
- plaintext => 'none',
1265
- none => 'none',
1266
- #
1267
- mrecord => 'record',
1268
- Mrecord => 'record',
1269
- square => 'rect',
1270
- triangle => 'diamond',
1271
- };
1272
-
1273
- sub _from_graphviz_node_shape
1274
- {
1275
- my ($self, $name, $shape) = @_;
1276
-
1277
- my @rc;
1278
- my $s = lc($shape);
1279
- if ($s =~ /^(triple|double)/)
1280
- {
1281
- $s =~ s/^(triple|double)//;
1282
- push @rc, ('border-style','double');
1283
- }
1284
-
1285
- # map the name to what Graph::Easy expects (ellipse stays as ellipse f.i.)
1286
- $s = $shapes->{$s} || $s;
1287
-
1288
- (@rc, $name, $s);
1289
- }
1290
-
1291
- sub _from_graphviz_style
1292
- {
1293
- my ($self, $name, $style, $class) = @_;
1294
-
1295
- my @styles = split /\s*,\s*/, $style;
1296
-
1297
- my $is_node = 0;
1298
- $is_node = 1 if ref($class) && !$class->isa('Graph::Easy::Group');
1299
- $is_node = 1 if !ref($class) && defined $class && $class eq 'node';
1300
-
1301
- my @rc;
1302
- for my $s (@styles)
1303
- {
1304
- @rc = ('shape', 'rounded') if $s eq 'rounded';
1305
- @rc = ('shape', 'invisible') if $s eq 'invis';
1306
- @rc = ('border', 'black ' . $1) if $s =~ /^(bold|dotted|dashed)\z/;
1307
- if ($is_node != 0)
1308
- {
1309
- @rc = ('shape', 'rect') if $s eq 'filled';
1310
- }
1311
- # convert "setlinewidth(12)" => border style ('solid', 'bold', 'broad' or 'wide')
1312
- if ($s =~ /setlinewidth\((\d+|\d*\.\d+)\)/)
1313
- {
1314
- my $width = abs($1 || 1);
1315
- my $style = '';
1316
- $style = 'wide'; # >= 11
1317
- $style = 'solid' if $width < 3;
1318
- $style = 'bold' if $width >= 3 && $width < 5;
1319
- $style = 'broad' if $width >= 5 && $width < 11;
1320
- push @rc, ('borderstyle',$style);
1321
- }
1322
- }
1323
-
1324
- @rc;
1325
- }
1326
-
1327
- sub _from_graphviz_node_orientation
1328
- {
1329
- my ($self, $name, $o) = @_;
1330
-
1331
- my $r = int($o);
1332
-
1333
- return (undef,undef) if $r == 0;
1334
-
1335
- # 1.0 => 1
1336
- ('rotate', $r);
1337
- }
1338
-
1339
- my $port_remap = {
1340
- n => 'north',
1341
- e => 'east',
1342
- w => 'west',
1343
- s => 'south',
1344
- };
1345
-
1346
- sub _from_graphviz_headport
1347
- {
1348
- my ($self, $name, $compass) = @_;
1349
-
1350
- # XXX TODO
1351
- # handle "port:compass" too
1352
-
1353
- # one of "n","ne","e","se","s","sw","w","nw"
1354
- # "ne => n"
1355
- my $c = $port_remap->{ substr(lc($compass),0,1) } || 'east';
1356
-
1357
- ('end', $c);
1358
- }
1359
-
1360
- sub _from_graphviz_tailport
1361
- {
1362
- my ($self, $name, $compass) = @_;
1363
-
1364
- # XXX TODO
1365
- # handle "port:compass" too
1366
-
1367
- # one of "n","ne","e","se","s","sw","w","nw"
1368
- # "ne => n" => "north"
1369
- my $c = $port_remap->{ substr(lc($compass),0,1) } || 'east';
1370
-
1371
- ('start', $c);
1372
- }
1373
-
1374
- sub _from_graphviz_node_peripheries
1375
- {
1376
- my ($self, $name, $cnt) = @_;
1377
-
1378
- return (undef,undef) if $cnt < 2;
1379
-
1380
- # peripheries = 2 => double border
1381
- ('border-style', 'double');
1382
- }
1383
-
1384
- sub _from_graphviz_edge_minlen
1385
- {
1386
- my ($self, $name, $len) = @_;
1387
-
1388
- # 1 => 1, 2 => 3, 3 => 5 etc
1389
- $len = $len * 2 - 1;
1390
- ($name, $len);
1391
- }
1392
-
1393
- sub _from_graphviz_font_size
1394
- {
1395
- my ($self, $f, $size) = @_;
1396
-
1397
- # 20 => 20px
1398
- $size = $size . 'px' if $size =~ /^\d+(\.\d+)?\z/;
1399
-
1400
- ('fontsize', $size);
1401
- }
1402
-
1403
- sub _from_graphviz_labelloc
1404
- {
1405
- my ($self, $name, $loc) = @_;
1406
-
1407
- my $l = 'top';
1408
- $l = 'bottom' if $loc =~ /^b/;
1409
-
1410
- ('labelpos', $l);
1411
- }
1412
-
1413
- sub _from_graphviz_edge_dir
1414
- {
1415
- my ($self, $name, $dir, $edge) = @_;
1416
-
1417
- # Modify the edge, depending on dir
1418
- if (ref($edge))
1419
- {
1420
- # "forward" is the default and ignored
1421
- $edge->flip() if $dir eq 'back';
1422
- $edge->bidirectional(1) if $dir eq 'both';
1423
- $edge->undirected(1) if $dir eq 'none';
1424
- }
1425
-
1426
- (undef, undef);
1427
- }
1428
-
1429
- sub _from_graphviz_edge_style
1430
- {
1431
- my ($self, $name, $style, $object) = @_;
1432
-
1433
- # input: solid dashed dotted bold invis
1434
- $style = 'invisible' if $style eq 'invis';
1435
-
1436
- # although "normal" is not documented, it occurs in the wild
1437
- $style = 'solid' if $style eq 'normal';
1438
-
1439
- # convert "setlinewidth(12)" => "wide"
1440
- if ($style =~ /setlinewidth\((\d+|\d*\.\d+)\)/)
1441
- {
1442
- my $width = abs($1 || 1);
1443
- $style = 'wide'; # > 11
1444
- $style = 'solid' if $width < 3;
1445
- $style = 'bold' if $width >= 3 && $width < 5;
1446
- $style = 'broad' if $width >= 5 && $width < 11;
1447
- }
1448
-
1449
- ($name, $style);
1450
- }
1451
-
1452
- sub _from_graphviz_arrow_style
1453
- {
1454
- my ($self, $name, $shape, $object) = @_;
1455
-
1456
- my $style = 'open';
1457
-
1458
- $style = 'closed' if $shape =~ /^(empty|onormal)\z/;
1459
- $style = 'filled' if $shape eq 'normal' || $shape eq 'normalnormal';
1460
- $style = 'open' if $shape eq 'vee' || $shape eq 'veevee';
1461
- $style = 'none' if $shape eq 'none' || $shape eq 'nonenone';
1462
-
1463
- ('arrow-style', $style);
1464
- }
1465
-
1466
- my $color_atr_map = {
1467
- fontcolor => 'color',
1468
- bgcolor => 'background',
1469
- fillcolor => 'fill',
1470
- pencolor => 'bordercolor',
1471
- labelfontcolor => 'labelcolor',
1472
- color => 'color',
1473
- };
1474
-
1475
- sub _from_graphviz_color
1476
- {
1477
- # Remap the color name and value
1478
- my ($self, $name, $color) = @_;
1479
-
1480
- # "//red" => "red"
1481
- $color =~ s/^\/\///;
1482
-
1483
- my $colorscheme = 'x11';
1484
- if ($color =~ /^\//)
1485
- {
1486
- # "/set9/red" => "red"
1487
- $color =~ s/^\/([^\/]+)\///;
1488
- $colorscheme = $1;
1489
- # map the color to the right color according to the colorscheme
1490
- $color = Graph::Easy->color_value($color,$colorscheme) || 'black';
1491
- }
1492
-
1493
- # "#AA BB CC" => "#AABBCC"
1494
- $color =~ s/\s+//g if $color =~ /^#/;
1495
-
1496
- # "0.1 0.4 0.5" => "hsv(0.1,0.4,0.5)"
1497
- $color =~ s/\s+/,/g if $color =~ /\s/;
1498
- $color = 'hsv(' . $color . ')' if $color =~ /,/;
1499
-
1500
- ($color_atr_map->{$name}, $color);
1501
- }
1502
-
1503
- sub _from_graphviz_edge_color
1504
- {
1505
- # remap the color name and value
1506
- my ($self, $name, $color) = @_;
1507
-
1508
- my @colors = split /:/, $color;
1509
-
1510
- for my $c (@colors)
1511
- {
1512
- $c = Graph::Easy::Parser::Graphviz::_from_graphviz_color($self,$name,$c);
1513
- }
1514
-
1515
- my @rc;
1516
- if (@colors > 1)
1517
- {
1518
- # 'red:blue' => "style: double; color: red"
1519
- push @rc, 'style', 'double';
1520
- }
1521
-
1522
- (@rc, $color_atr_map->{$name}, $colors[0]);
1523
- }
1524
-
1525
- sub _from_graphviz_graph_labeljust
1526
- {
1527
- my ($self, $name, $l) = @_;
1528
-
1529
- # input: "l" "r" or "c", output "left", "right" or "center"
1530
- my $a = 'center';
1531
- $a = 'left' if $l eq 'l';
1532
- $a = 'right' if $l eq 'r';
1533
-
1534
- ('align', $a);
1535
- }
1536
-
1537
- #############################################################################
1538
-
1539
- sub _remap_attributes
1540
- {
1541
- my ($self, $att, $object, $r) = @_;
1542
-
1543
- if ($self->{debug})
1544
- {
1545
- my $o = ''; $o = " for $object" if $object;
1546
- print STDERR "# remapping attributes '$att'$o\n";
1547
- require Data::Dumper; print STDERR "#" , Data::Dumper::Dumper($att),"\n";
1548
- }
1549
-
1550
- $r = $self->_remap() unless defined $r;
1551
-
1552
- $self->{_graph}->_remap_attributes($object, $att, $r, 'noquote', undef, undef);
1553
- }
1554
-
1555
- #############################################################################
1556
-
1557
- my $html_remap = {
1558
- 'table' => {
1559
- 'align' => 'align',
1560
- 'balign' => undef,
1561
- 'bgcolor' => 'fill',
1562
- 'border' => 'border',
1563
- # XXX TODO
1564
- 'cellborder' => 'border',
1565
- 'cellspacing' => undef,
1566
- 'cellpadding' => undef,
1567
- 'fixedsize' => undef,
1568
- 'height' => undef,
1569
- 'href' => 'link',
1570
- 'port' => undef,
1571
- 'target' => undef,
1572
- 'title' => 'title',
1573
- 'tooltip' => 'title',
1574
- 'valign' => undef,
1575
- 'width' => undef,
1576
- },
1577
- 'td' => {
1578
- 'align' => 'align',
1579
- 'balign' => undef,
1580
- 'bgcolor' => 'fill',
1581
- 'border' => 'border',
1582
- 'cellspacing' => undef,
1583
- 'cellpadding' => undef,
1584
- 'colspan' => 'columns',
1585
- 'fixedsize' => undef,
1586
- 'height' => undef,
1587
- 'href' => 'link',
1588
- 'port' => undef,
1589
- 'rowspan' => 'rows',
1590
- 'target' => undef,
1591
- 'title' => 'title',
1592
- 'tooltip' => 'title',
1593
- 'valign' => undef,
1594
- 'width' => undef,
1595
- },
1596
- };
1597
-
1598
- sub _parse_html_attributes
1599
- {
1600
- my ($self, $text, $qr, $tag) = @_;
1601
-
1602
- # "<TD ...>" => " ..."
1603
- $text =~ s/^$qr->{td_tag}//;
1604
- $text =~ s/\s*>\z//;
1605
-
1606
- my $attr = {};
1607
- while ($text ne '')
1608
- {
1609
-
1610
- return $self->error("HTML-like attribute '$text' doesn't look valid to me.")
1611
- unless $text =~ s/^($qr->{attribute})//;
1612
-
1613
- my $name = lc($2); my $value = $3;
1614
-
1615
- $self->_unquote($value);
1616
- $value = lc($value) if $name eq 'align';
1617
- $self->error ("Unknown attribute '$name' in HTML-like label") unless exists $html_remap->{$tag}->{$name};
1618
- # filter out attributes we do not yet support
1619
- $attr->{$name} = $value if defined $html_remap->{$tag}->{$name};
1620
- }
1621
-
1622
- $attr;
1623
- }
1624
-
1625
- sub _html_per_table
1626
- {
1627
- # take the HTML-like attributes found per TABLE and create a hash with them
1628
- # so they can be applied as default to each node
1629
- my ($self, $attributes) = @_;
1630
-
1631
- $self->_remap_attributes($attributes,'table',$html_remap);
1632
- }
1633
-
1634
- sub _html_per_node
1635
- {
1636
- # take the HTML-like attributes found per TD and apply them to the node
1637
- my ($self, $attr, $node) = @_;
1638
-
1639
- my $c = $attr->{colspan} || 1;
1640
- $node->set_attribute('columns',$c) if $c != 1;
1641
-
1642
- my $r = $attr->{rowspan} || 1;
1643
- $node->set_attribute('rows',$r) if $r != 1;
1644
-
1645
- $node->{autosplit_portname} = $attr->{port} if exists $attr->{port};
1646
-
1647
- for my $k (qw/port colspan rowspan/)
1648
- {
1649
- delete $attr->{$k};
1650
- }
1651
-
1652
- my $att = $self->_remap_attributes($attr,$node,$html_remap);
1653
-
1654
- $node->set_attributes($att);
1655
-
1656
- $self;
1657
- }
1658
-
1659
- sub _parse_html
1660
- {
1661
- # Given an HTML label, parses that into the individual parts. Returns a
1662
- # list of nodes.
1663
- my ($self, $n, $qr) = @_;
1664
-
1665
- my $graph = $self->{_graph};
1666
-
1667
- my $label = $n->label(1); $label = '' unless defined $label;
1668
- my $org_label = $label;
1669
-
1670
- # print STDERR "# 1 HTML-like label is now: $label\n";
1671
-
1672
- # "unquote" the HTML-like label
1673
- $label =~ s/^<\s*//;
1674
- $label =~ s/\s*>\z//;
1675
-
1676
- # print STDERR "# 2 HTML-like label is now: $label\n";
1677
-
1678
- # remove the table end (at the end)
1679
- $label =~ s/$qr->{table_end}\s*\z//;
1680
- # print STDERR "# 2.a HTML-like label is now: $label\n";
1681
- # remove the table start
1682
- $label =~ s/($qr->{table})//;
1683
-
1684
- # print STDERR "# 3 HTML-like label is now: $label\n";
1685
-
1686
- my $table_tag = $1 || '';
1687
- $table_tag =~ /$qr->{table_tag}(.*?)>/;
1688
- my $table_attr = $self->_parse_html_attributes($1 || '', $qr, 'table');
1689
-
1690
- # use Data::Dumper;
1691
- # print STDERR "# 3 HTML-like table-tag attributes are: ", Dumper($table_attr),"\n";
1692
-
1693
- # generate the base name from the actual graphviz node name to allow links to
1694
- # it
1695
- my $base_name = $n->{name};
1696
-
1697
- my $class = $self->{use_class}->{node};
1698
-
1699
- my $raw_attributes = $n->raw_attributes();
1700
- delete $raw_attributes->{label};
1701
- delete $raw_attributes->{shape};
1702
-
1703
- my @rc; my $first_in_row;
1704
- my $x = 0; my $y = 0; my $idx = 0;
1705
- while ($label ne '')
1706
- {
1707
- $label =~ s/^\s*($qr->{row})//;
1708
-
1709
- return $self->error ("Cannot parse HTML-like label: '$label'")
1710
- unless defined $1;
1711
-
1712
- # we now got one row:
1713
- my $row = $1;
1714
-
1715
- # print STDERR "# 3 HTML-like row is $row\n";
1716
-
1717
- # remove <TR>
1718
- $row =~ s/^\s*$qr->{tr}\s*//;
1719
- # remove </TR>
1720
- $row =~ s/\s*$qr->{tr_end}\s*\z//;
1721
-
1722
- my $first = 1;
1723
- while ($row ne '')
1724
- {
1725
- # remove one TD from the current row text
1726
- $row =~ s/^($qr->{td})($qr->{text})$qr->{td_end}//;
1727
- return $self->error ("Cannot parse HTML-like row: '$row'")
1728
- unless defined $1;
1729
-
1730
- my $node_label = $2;
1731
- my $attr_txt = $1;
1732
-
1733
- # convert "<BR/>" etc. to line breaks
1734
- # XXX TODO apply here the default of BALIGN
1735
- $node_label =~ s/<BR\s*\/?>/\\n/gi;
1736
-
1737
- # if the font covers the entire node, set "font" attribute
1738
- my $font_face = undef;
1739
- if ($node_label =~ /^[ ]*<FONT FACE="([^"]+)">(.*)<\/FONT>[ ]*\z/i)
1740
- {
1741
- $node_label = $2; $font_face = $1;
1742
- }
1743
- # XXX TODO if not, allow inline font changes
1744
- $node_label =~ s/<FONT[^>]+>(.*)<\/FONT>/$1/ig;
1745
-
1746
- my $node_name = $base_name . '.' . $idx;
1747
-
1748
- # if it doesn't exist, add it, otherwise retrieve node object to $node
1749
-
1750
- my $node = $graph->node($node_name);
1751
- if (!defined $node)
1752
- {
1753
- # create node object from the correct class
1754
- $node = $class->new($node_name);
1755
- $graph->add_node($node);
1756
- $node->set_attributes($raw_attributes);
1757
- $node->{autosplit_portname} = $idx; # some sensible default
1758
- }
1759
-
1760
- # apply the default attributes from the table
1761
- $node->set_attributes($table_attr);
1762
- # if found a global font attribute, override the font attribute with it
1763
- $node->set_attribute('font',$font_face) if defined $font_face;
1764
-
1765
- # parse the attributes and apply them to the node
1766
- $self->_html_per_node( $self->_parse_html_attributes($attr_txt,$qr,'td'), $node );
1767
-
1768
- # print STDERR "# Created $node_name\n";
1769
-
1770
- $node->{autosplit_label} = $node_label;
1771
- $node->{autosplit_basename} = $base_name;
1772
-
1773
- push @rc, $node;
1774
- if (@rc == 1)
1775
- {
1776
- # for correct as_txt output
1777
- $node->{autosplit} = $org_label;
1778
- $node->{autosplit} =~ s/\s+\z//; # strip trailing spaces
1779
- $node->{autosplit} =~ s/^\s+//; # strip leading spaces
1780
- $first_in_row = $node;
1781
- }
1782
- else
1783
- {
1784
- # second, third etc. get previous as origin
1785
- my ($sx,$sy) = (1,0);
1786
- my $origin = $rc[-2];
1787
- # the first node in one row is relative to the first node in the
1788
- # prev row
1789
- if ($first == 1)
1790
- {
1791
- ($sx,$sy) = (0,1); $origin = $first_in_row;
1792
- $first_in_row = $node;
1793
- $first = 0;
1794
- }
1795
- $node->relative_to($origin,$sx,$sy);
1796
- # suppress as_txt output for other parts
1797
- $node->{autosplit} = undef;
1798
- }
1799
- # necessary for border-collapse
1800
- $node->{autosplit_xy} = "$x,$y";
1801
-
1802
- $idx++; # next node ID
1803
- $x++;
1804
- }
1805
-
1806
- # next row
1807
- $y++;
1808
- }
1809
-
1810
- # return created nodes
1811
- @rc;
1812
- }
1813
-
1814
- #############################################################################
1815
-
1816
- sub _parser_cleanup
1817
- {
1818
- # After initial parsing, do cleanup, e.g. autosplit nodes with shape record,
1819
- # parse HTML-like labels, re-connect edges to the parts etc.
1820
- my ($self) = @_;
1821
-
1822
- print STDERR "# Parser cleanup pass\n" if $self->{debug};
1823
-
1824
- my $g = $self->{_graph};
1825
- my @nodes = $g->nodes();
1826
-
1827
- # For all nodes that have a shape of "record", break down their label into
1828
- # parts and create these as autosplit nodes.
1829
- # For all nodes that have a label starting with "<", parse it as HTML.
1830
-
1831
- # keep a record of all nodes to be deleted later:
1832
- my $delete = {};
1833
-
1834
- my $html_regexps = $self->_match_html_regexps();
1835
- my $graph_flow = $g->attribute('flow');
1836
- for my $n (@nodes)
1837
- {
1838
- my $label = $n->label(1);
1839
- # we can get away with a direct lookup, since DOT does not have classes
1840
- my $shape = $n->{att}->{shape} || 'rect';
1841
-
1842
- if ($shape !~ /record/ && $label =~ /^<\s*<.*>\z/)
1843
- {
1844
- print STDERR "# HTML-like label found: $label\n" if $self->{debug};
1845
- my @nodes = $self->_parse_html($n, $html_regexps);
1846
- # remove the temp. and spurious node
1847
- $delete->{$n->{name}} = undef;
1848
- my @edges = $n->edges();
1849
- # reconnect the found edges to the new autosplit parts
1850
- for my $e (@edges)
1851
- {
1852
- # XXX TODO: connect to better suited parts based on flow?
1853
- $e->start_at($nodes[0]) if ($e->{from} == $n);
1854
- $e->end_at($nodes[0]) if ($e->{to} == $n);
1855
- }
1856
- $g->del_node($n);
1857
- next;
1858
- }
1859
-
1860
- if ($shape =~ /record/ && $label =~ /\|/)
1861
- {
1862
- my $att = {};
1863
- # create basename only when node name differs from label
1864
- $att->{basename} = $n->{name};
1865
- if ($n->{name} ne $label)
1866
- {
1867
- $att->{basename} = $n->{name};
1868
- }
1869
- # XXX TODO: autosplit needs to handle nesting like "{}".
1870
-
1871
- # Replace "{ ... | ... | ... }" with "...|| ... || ...." as a cheat
1872
- # to fix some common cases
1873
- if ($label =~ /^\s*\{[^\{\}]+\}\s*\z/)
1874
- {
1875
- $label =~ s/[\{\}]//g; # {..|..} => ..|..
1876
- # if flow up/down: {A||B} => "[ A|| || B ]"
1877
- $label =~ s/\|/\|\| /g # ..|.. => ..|| ..
1878
- if ($graph_flow =~ /^(east|west)/);
1879
- # if flow left/right: {A||B} => "[ A| |B ]"
1880
- $label =~ s/\|\|/\| \|/g # ..|.. => ..| |..
1881
- if ($graph_flow =~ /^(north|south)/);
1882
- }
1883
- my @rc = $self->_autosplit_node($g, $label, $att, 0 );
1884
- my $group = $n->group();
1885
- $n->del_attribute('label');
1886
-
1887
- my $qr_clean = $self->{_qr_part_clean};
1888
- # clean the base name of ports:
1889
- # "<f1> test | <f2> test" => "test|test"
1890
- $rc[0]->{autosplit} =~ s/(^|\|)$qr_clean/$1/g;
1891
- $rc[0]->{att}->{basename} =~ s/(^|\|)$qr_clean/$1/g;
1892
- $rc[0]->{autosplit} =~ s/^\s*//;
1893
- $rc[0]->{att}->{basename} =~ s/^\s*//;
1894
- # '| |' => '| |' to avoid empty parts via as_txt() => as_ascii()
1895
- $rc[0]->{autosplit} =~ s/\|\s\|/\| \|/g;
1896
- $rc[0]->{att}->{basename} =~ s/\|\s\|/\| \|/g;
1897
- $rc[0]->{autosplit} =~ s/\|\s\|/\| \|/g;
1898
- $rc[0]->{att}->{basename} =~ s/\|\s\|/\| \|/g;
1899
- delete $rc[0]->{att}->{basename} if $rc[0]->{att}->{basename} eq $rc[0]->{autosplit};
1900
-
1901
- for my $n1 (@rc)
1902
- {
1903
- $n1->add_to_group($group) if $group;
1904
- $n1->set_attributes($n->{att});
1905
- # remove the temp. "shape=record"
1906
- $n1->del_attribute('shape');
1907
- }
1908
-
1909
- # If the helper node has edges, reconnect them to the first
1910
- # part of the autosplit node (dot seems to render them arbitrarily
1911
- # on the autosplit node):
1912
-
1913
- for my $e (values %{$n->{edges}})
1914
- {
1915
- $e->start_at($rc[0]) if $e->{from} == $n;
1916
- $e->end_at($rc[0]) if $e->{to} == $n;
1917
- }
1918
- # remove the temp. and spurious node
1919
- $delete->{$n->{name}} = undef;
1920
- $g->del_node($n);
1921
- }
1922
- }
1923
-
1924
- # During parsing, "bonn:f1" -> "berlin:f2" results in "bonn:f1" and
1925
- # "berlin:f2" as nodes, plus an edge connecting them
1926
-
1927
- # We find all of these nodes, move the edges to the freshly created
1928
- # autosplit parts above, then delete the superfluous temporary nodes.
1929
-
1930
- # if we looked up "Bonn:f1", remember it here to save time:
1931
- my $node_cache = {};
1932
-
1933
- my @edges = $g->edges();
1934
- @nodes = $g->nodes(); # get a fresh list of nodes after split
1935
- for my $e (@edges)
1936
- {
1937
- # do this for both the "from" and "to" side of the edge:
1938
- for my $side ('from','to')
1939
- {
1940
- my $n = $e->{$side};
1941
- next unless defined $n->{_graphviz_portlet};
1942
-
1943
- my $port = $n->{_graphviz_portlet};
1944
- my $base = $n->{_graphviz_basename};
1945
-
1946
- my $compass = '';
1947
- if ($port =~ s/:(n|ne|e|se|s|sw|w|nw)\z//)
1948
- {
1949
- $compass = $1;
1950
- }
1951
- # "Bonn:w" is port "w", and only "west" when that port doesn't exist
1952
-
1953
- # look it up in the cache first
1954
- my $node = $node_cache->{"$base:$port"};
1955
-
1956
- my $p = undef;
1957
- if (!defined $node)
1958
- {
1959
- # go through all nodes and see if we find one with the right port name
1960
- for my $na (@nodes)
1961
- {
1962
- next unless exists $na->{autosplit_portname} && exists $na->{autosplit_basename};
1963
- next unless $na->{autosplit_basename} eq $base;
1964
- next unless $na->{autosplit_portname} eq $port;
1965
- # cache result
1966
- $node_cache->{"$base:$port"} = $na;
1967
- $node = $na;
1968
- $p = $port_remap->{substr($compass,0,1)} if $compass; # ne => n => north
1969
- }
1970
- }
1971
-
1972
- if (!defined $node)
1973
- {
1974
- # Still not defined?
1975
- # port looks like a compass node?
1976
- if ($port =~ /^(n|ne|e|se|s|sw|w|nw)\z/)
1977
- {
1978
- # get the first node matching the base
1979
- for my $na (@nodes)
1980
- {
1981
- #print STDERR "# evaluating $na ($na->{name} $na->{autosplit_basename}) ($base)\n";
1982
- next unless exists $na->{autosplit_basename};
1983
- next unless $na->{autosplit_basename} eq $base;
1984
- # cache result
1985
- $node_cache->{"$base:$port"} = $na;
1986
- $node = $na;
1987
- }
1988
- if (!defined $node)
1989
- {
1990
- return $self->error("Cannot find autosplit node for $base:$port on edge $e->{id}");
1991
- }
1992
- $p = $port_remap->{substr($port,0,1)}; # ne => n => north
1993
- }
1994
- else
1995
- {
1996
- # uhoh...
1997
- return $self->error("Cannot find autosplit node for $base:$port on edge $e->{id}");
1998
- }
1999
- }
2000
-
2001
- if ($side eq 'from')
2002
- {
2003
- $delete->{$e->{from}->{name}} = undef;
2004
- print STDERR "# Setting new edge start point to $node->{name}\n" if $self->{debug};
2005
- $e->start_at($node);
2006
- print STDERR "# Setting new edge start point to start at $p\n" if $self->{debug} && $p;
2007
- $e->set_attribute('start', $p) if $p;
2008
- }
2009
- else
2010
- {
2011
- $delete->{$e->{to}->{name}} = undef;
2012
- print STDERR "# Setting new edge end point to $node->{name}\n" if $self->{debug};
2013
- $e->end_at($node);
2014
- print STDERR "# Setting new edge end point to end at $p\n" if $self->{debug} && $p;
2015
- $e->set_attribute('end', $p) if $p;
2016
- }
2017
-
2018
- } # end for side "from" and "to"
2019
- # we have reconnected this edge
2020
- }
2021
-
2022
- # after reconnecting all edges, we can delete temp. nodes:
2023
- for my $n (@nodes)
2024
- {
2025
- next unless exists $n->{_graphviz_portlet};
2026
- # "c:w" => "c"
2027
- my $name = $n->{name}; $name =~ s/:.*?\z//;
2028
- # add "c" unless we should delete the base node (this deletes record
2029
- # and autosplit nodes, but keeps loners like "c:w" around as "c":
2030
- $g->add_node($name) unless exists $delete->{$name};
2031
- # delete "c:w"
2032
- $g->del_node($n);
2033
- }
2034
-
2035
- # if the graph doesn't have a title, set the graph name as title
2036
- $g->set_attribute('title', $self->{_graphviz_graph_name})
2037
- unless defined $g->raw_attribute('title');
2038
-
2039
- # cleanup if there are no groups
2040
- if ($g->groups() == 0)
2041
- {
2042
- $g->del_attribute('group', 'align');
2043
- $g->del_attribute('group', 'fill');
2044
- }
2045
- $g->{_warn_on_unknown_attributes} = 0; # reset to die again
2046
-
2047
- $self;
2048
- }
2049
-
2050
- 1;
2051
- __END__
2052
-
2053
- =head1 NAME
2054
-
2055
- Graph::Easy::Parser::Graphviz - Parse Graphviz text into Graph::Easy
2056
-
2057
- =head1 SYNOPSIS
2058
-
2059
- # creating a graph from a textual description
2060
-
2061
- use Graph::Easy::Parser::Graphviz;
2062
- my $parser = Graph::Easy::Parser::Graphviz->new();
2063
-
2064
- my $graph = $parser->from_text(
2065
- "digraph MyGraph { \n" .
2066
- " Bonn -> \"Berlin\" \n }"
2067
- );
2068
- print $graph->as_ascii();
2069
-
2070
- print $parser->from_file('mygraph.dot')->as_ascii();
2071
-
2072
- =head1 DESCRIPTION
2073
-
2074
- C<Graph::Easy::Parser::Graphviz> parses the text format from the DOT language
2075
- used by Graphviz and constructs a C<Graph::Easy> object from it.
2076
-
2077
- The resulting object can then be used to lay out and output the graph
2078
- in various formats.
2079
-
2080
- Please see the Graphviz manual for a full description of the syntax
2081
- rules of the DOT language.
2082
-
2083
- =head2 Output
2084
-
2085
- The output will be a L<Graph::Easy|Graph::Easy> object (unless overridden
2086
- with C<use_class()>); see the documentation for Graph::Easy for what you can do
2087
- with it.
2088
-
2089
- =head2 Attributes
2090
-
2091
- Attributes will be remapped to the proper Graph::Easy attribute names and
2092
- values, as much as possible.
2093
-
2094
- Anything else will be converted to custom attributes starting with "x-dot-".
2095
- So "ranksep: 2" will become "x-dot-ranksep: 2".
2096
-
2097
- =head1 METHODS
2098
-
2099
- C<Graph::Easy::Parser::Graphviz> supports the same methods
2100
- as its parent class C<Graph::Easy::Parser>:
2101
-
2102
- =head2 new()
2103
-
2104
- use Graph::Easy::Parser::Graphviz;
2105
- my $parser = Graph::Easy::Parser::Graphviz->new();
2106
-
2107
- Creates a new parser object. There are two valid parameters:
2108
-
2109
- debug
2110
- fatal_errors
2111
-
2112
- Both take either a false or a true value.
2113
-
2114
- my $parser = Graph::Easy::Parser::Graphviz->new( debug => 1 );
2115
- $parser->from_text('digraph G { A -> B }');
2116
-
2117
- =head2 reset()
2118
-
2119
- $parser->reset();
2120
-
2121
- Reset the status of the parser, clear errors etc. Automatically called
2122
- when you call any of the C<from_XXX()> methods below.
2123
-
2124
- =head2 use_class()
2125
-
2126
- $parser->use_class('node', 'Graph::Easy::MyNode');
2127
-
2128
- Override the class to be used to construct objects while parsing.
2129
-
2130
- See L<Graph::Easy::Parser> for further information.
2131
-
2132
- =head2 from_text()
2133
-
2134
- my $graph = $parser->from_text( $text );
2135
-
2136
- Create a L<Graph::Easy|Graph::Easy> object from the textual description in C<$text>.
2137
-
2138
- Returns undef on error; you can find out what the error was
2139
- with L<error()>.
2140
-
2141
- This method will reset any previous error, and thus the C<$parser> object
2142
- can be re-used to parse different texts by just calling C<from_text()>
2143
- multiple times.
2144
-
2145
- =head2 from_file()
2146
-
2147
- my $graph = $parser->from_file( $filename );
2148
- my $graph = Graph::Easy::Parser->from_file( $filename );
2149
-
2150
- Creates a L<Graph::Easy|Graph::Easy> object from the textual description in the file
2151
- C<$filename>.
2152
-
2153
- The second calling style will create a temporary parser object,
2154
- parse the file and return the resulting C<Graph::Easy> object.
2155
-
2156
- Returns undef on error; you can find out what the error was
2157
- with L<error()> when using the first calling style.
2158
-
2159
- =head2 error()
2160
-
2161
- my $error = $parser->error();
2162
-
2163
- Returns the last error, or the empty string if no error occurred.
2164
-
2165
- =head2 parse_error()
2166
-
2167
- $parser->parse_error( $msg_nr, @params);
2168
-
2169
- Sets an error message from a message number and replaces embedded
2170
- templates like C<##param1##> with the passed parameters.
2171
-
2172
- =head1 CAVEATS
2173
-
2174
- The parser has problems with the following things:
2175
-
2176
- =over 12
2177
-
2178
- =item encoding and charset attribute
2179
-
2180
- The parser assumes the input to be C<utf-8>. Input files in C<Latin1>
2181
- are not parsed properly, even when they have the charset attribute set.
2182
-
2183
- =item shape=record
2184
-
2185
- Nodes with shape record are only parsed properly when the label does not
2186
- contain groups delimited by "{" and "}", so the following is parsed
2187
- wrongly:
2188
-
2189
- node1 [ shape=record, label="A|{B|C}" ]
2190
-
2191
- =item default shape
2192
-
2193
- The default shape for a node is 'rect', as opposed to 'ellipse', which is how dot renders
2194
- nodes.
2195
-
2196
- =item attributes
2197
-
2198
- Some attributes are B<not> remapped properly to what Graph::Easy expects, thus
2199
- losing information, either because Graph::Easy doesn't support this feature
2200
- yet, or because the mapping is incomplete.
2201
-
2202
- Some attributes meant only for nodes or edges etc. might be incorrectly applied
2203
- to other objects, resulting in unnecessary warnings while parsing.
2204
-
2205
- Attributes not valid in the original DOT language are silently ignored by dot,
2206
- but result in a warning when parsing under Graph::Easy. This helps catch all
2207
- these pesky misspellings, but it's not yet possible to disable these warnings.
2208
-
2209
- =item comments
2210
-
2211
- Comments written in the source code itself are discarded. If you want to have
2212
- comments on the graph, clusters, nodes or edges, use the attribute C<comment>.
2213
- These are correctly read in and stored, and then output into the different
2214
- formats, too.
2215
-
2216
- =back
2217
-
2218
- =head1 EXPORT
2219
-
2220
- Exports nothing.
2221
-
2222
- =head1 SEE ALSO
2223
-
2224
- L<Graph::Easy>, L<Graph::Reader::Dot>.
2225
-
2226
- =head1 AUTHOR
2227
-
2228
- Copyright (C) 2005 - 2007 by Tels L<http://bloodgate.com>
2229
-
2230
- See the LICENSE file for information.
2231
-
2232
- =cut