ariel 0.0.1 → 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (47)
  1. data/README +49 -83
  2. data/bin/ariel +29 -20
  3. data/examples/google_calculator/structure.rb +2 -2
  4. data/examples/google_calculator/structure.yaml +13 -15
  5. data/examples/raa/labeled/highline.html +5 -4
  6. data/examples/raa/labeled/mongrel.html +9 -8
  7. data/examples/raa/structure.rb +4 -2
  8. data/examples/raa/structure.yaml +94 -78
  9. data/lib/ariel.rb +71 -33
  10. data/lib/ariel/{candidate_selector.rb → candidate_refiner.rb} +39 -38
  11. data/lib/ariel/label_utils.rb +46 -18
  12. data/lib/ariel/labeled_document_loader.rb +77 -0
  13. data/lib/ariel/learner.rb +60 -38
  14. data/lib/ariel/log.rb +67 -0
  15. data/lib/ariel/node.rb +52 -0
  16. data/lib/ariel/node/extracted.rb +90 -0
  17. data/lib/ariel/node/structure.rb +91 -0
  18. data/lib/ariel/rule.rb +114 -32
  19. data/lib/ariel/rule_set.rb +34 -15
  20. data/lib/ariel/token.rb +9 -3
  21. data/lib/ariel/token_stream.rb +32 -17
  22. data/lib/ariel/wildcards.rb +19 -15
  23. data/test/fixtures.rb +45 -3
  24. data/test/specs/candidate_refiner_spec.rb +48 -0
  25. data/test/specs/label_utils_spec.rb +97 -0
  26. data/test/specs/learner_spec.rb +39 -0
  27. data/test/specs/node_extracted_spec.rb +90 -0
  28. data/test/specs/node_spec.rb +76 -0
  29. data/test/specs/node_structure_spec.rb +74 -0
  30. data/test/specs/rule_set_spec.rb +85 -0
  31. data/test/specs/rule_spec.rb +110 -0
  32. data/test/specs/token_stream_spec.rb +100 -7
  33. metadata +21 -28
  34. data/lib/ariel/example_document_loader.rb +0 -59
  35. data/lib/ariel/extracted_node.rb +0 -20
  36. data/lib/ariel/node_like.rb +0 -26
  37. data/lib/ariel/structure_node.rb +0 -75
  38. data/test/ariel_test_case.rb +0 -15
  39. data/test/test_candidate_selector.rb +0 -58
  40. data/test/test_example_document_loader.rb +0 -7
  41. data/test/test_label_utils.rb +0 -15
  42. data/test/test_learner.rb +0 -38
  43. data/test/test_rule.rb +0 -38
  44. data/test/test_structure_node.rb +0 -81
  45. data/test/test_token.rb +0 -16
  46. data/test/test_token_stream.rb +0 -82
  47. data/test/test_wildcards.rb +0 -18
data/README CHANGED
@@ -1,98 +1,64 @@
1
- = Ariel release 0.0.1
1
+ = Ariel release 0.1.0
2
+
3
+ == About - Ariel: A Ruby Information Extraction Library
4
+ Ariel is a library that allows you to extract information from semi-structured
5
+ documents (such as websites). It is different to existing tools because rather
6
+ than expecting the developer to write rules to extract the desired information,
7
+ Ariel will use a small number of labeled examples to generate and learn
8
+ effective extraction rules. It is developed by Alex Bradbury and released under
9
+ the MIT license. Ariel was started as a Google Summer of Code project mentored
10
+ by Austin Ziegler in 2006.
2
11
 
3
12
  == Install
4
13
  gem install ariel
5
14
 
6
15
  == Announcement
7
- This is the first public release of Ariel - A Ruby Information Extraction
8
- Library. See my previous post, ruby-talk:200140[http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/200140]
9
- for more background information. This release supports defining a tree document
10
- structure and learning rules to extract each node of this true. Handling of list
11
- extraction and learning is not yet implemented, and is the next immediate
12
- priority. See the examples directory included in this release and below for
13
- discussion of the included examples. Rule learning is functional, and appears to
14
- work well, but many refinements are possible. Look out for more updates and a
15
- new releases shortly.
16
-
17
- == About Ariel
18
- Ariel intends to assist in extracting information from semi-structured
19
- documents including (but not in any way limited to) web pages. Although you
20
- may use libraries such as Hpricot or Rubyful Soup, or even plain Regular
21
- Expressions to achieve the same goal, Ariel approaches the problem very
22
- differently. Ariel relies on the user labeling examples of the data they
23
- want to extract, and then finds patterns across several such labeled
24
- examples in order to produce a set of general rules for extracting this
25
- information from any similar document. It uses the MIT license.
26
-
27
- == Examples
28
- This release includes two examples in the example directory (which should now
29
- be in the directory to which rubygems installed ariel). The first is the
30
- google_calculator directory (inspired by Justin Bailey's post to my Ariel
31
- progress report). The structure is very simple, a calculation is extracted from
32
- the page, and then the actual result is extracted from that calculation. 3
33
- labeled examples are included. Ariel reads each of these, tokenizes them,
34
- and extracts each label. 4 sets of rules are learnt:
35
- 1. Rules to locate the start of the calculation in the original document.
36
- 2. Rules to locate the end of the calculation in the original document (applied
37
- from the end of the document).
38
- 3. Rules to locate the start of the result of the calculation from the
39
- extracted calculation.
40
- 4. Rules to locate the end of the result of the calculation from the extracted
41
- calculation (applied from the end of the calculation).
42
-
43
- Take note of 3 and 4 - this is the advantage of treating a document as a tree in
44
- this way. Deeply nested elements can be located by generating a series of simple
45
- rules, rather than generating a rule with complexity that increases at each
46
- level. Sets of rules are generated because it may not be possible to generate a
47
- single rule that will catch all cases. A rule is found that matches as many of
48
- the examples as possible (and fails on the rest), these examples are then removed
49
- and a rule is found that will match as many of the remaining examples and so on.
50
- When it comes to applying these learnt rules, the rules are applied in order
51
- until there is a rule that matches.
52
-
53
- To see this example for yourself just execute structure.rb in the
54
- examples/google_calculator directory to create a locally writable
55
- structure.yaml. Then do:
56
- ariel -D -m learn -s structure.yaml -d /path/to/examples/google_calculator/labeled
57
-
58
- You'll have to wait a while (see my note about performance below). At the end,
59
- the learnt rules will be printed in YAML format, and structure.yaml will be
60
- updated to include these rules. Apply these learnt rules to some unlabeled
61
- documents by doing:
62
- ariel -D -m extract -s structure.yaml -d /path/to/examples/google_calculator/unlabeled
63
-
64
- You should see the results of a successful extraction printed to your terminal,
65
- such as this one:
66
16
 
67
- Results for unlabeled/2:
68
- calculation: 3.5 U.S. dollars = 1.8486241 British pounds
69
- result: 1.8486241 British pounds
17
+ I'm happy to announce the release of Ariel 0.1.0, the result of my Summer of
18
+ Code work. This release should be easy to use, very functional, and hopefully
19
+ useful - so it's worth trying out. I've put a lot of effort in to writing clear
20
+ and straightforward documentation to get your started, so take a look at the
21
+ docs available at http://ariel.rubyforge.org. In particular, flick through the
22
+ tutorial and quick start guide. If you're interested, you may also want to take
23
+ a look at the theory page where I've made a good start on describing the method
24
+ Ariel uses to learn extraction rules. If you have any problems or find any bugs,
25
+ just send me an email or add it to the issue tracker (see link below). Enjoy.
26
+ See the FAQ for a vim snippet to make labeling examples a little easier.
70
27
 
71
- The second example (raa) learns rules using just 2 labeled examples. This is probably
72
- fewer than I'd recommend in most cases, but as it works... This example consists
73
- of project entries in the Ruby Application Archive. The structure of the page is
74
- very flat, so all rules are applied to the full page. Rules are learnt and
75
- applied as shown above. The structure.yaml files included in the examples
76
- directories already include rules generated by Ariel, use these if you just want
77
- to see extraction working.
28
+ == Quickstart/Basic usage
78
29
 
79
- Note: The interface demonstrated by ariel above is not very flexible or
80
- friendly, it's just to serve as a demonstration for the moment.
30
+ * @require 'ariel'@
31
+ * Define a structure for the information you wish to extract:
32
+ structure = Ariel::Node::Structure.new do |r|
33
+ r.item :title
34
+ r.item :body
35
+ r.list :comments do |c|
36
+ c.list_item :comment do |d|
37
+ d.item :author
38
+ d.item :body
39
+ end
40
+ end
41
+ end
42
+ * Collect a few examples of the sort of document you wish to extract information
43
+ from (pages from the same website for instance).
44
+ * Label each example with tags such as <l:title>, <l:comment> and so on in the
45
+ relevant places.
46
+ * Ariel.learn structure, labeled_file1, labeled_file2, labeled_file3
47
+ * Find the documents you want to extract information from.
48
+ * extractions = Ariel.extract structure, unlabeled_file1,
49
+ unlabeled_file2
50
+ * extractions[0].search('comments/*/body').each {|e| puts e.extracted_text} =>
51
+ "Great stuff, loving it", "I love life", .....
52
+ * extractions[0].at('comments/34') => nil</tt> (there is no 34th comment, #at
53
+ returns the first result rather than an array of matches).
81
54
 
82
- == Performance
83
- Generating rules takes quite a long time. It is always going to be an intensive
84
- operation, but there are some very simple and obvious improvements in efficiency
85
- that can be made. For a start, the rule candidate refining process currently
86
- re-applies the same rules over and over every time the remaining rule candidates
87
- are ranked. This is where most time is spent, and caching these should make a
88
- big difference. This will definitely be implemented. Other performance
89
- enhancements are bound to be there, but my focus at this time is to get
90
- something that works.
91
55
 
92
56
  == Credits
93
57
  Ariel is developed by Alex Bradbury as a Google Summer of Code project under the
94
58
  mentoring of Austin Ziegler.
95
59
 
96
60
  == Links
97
- Watch my development through the subversion repository at http://rubyforge.org/projects/ariel
98
- I've also just started using the tracker at http://code.google.com/p/ariel/
61
+ SVN Repository: http://rubyforge.org/projects/ariel
62
+ Issue tracker: http://code.google.com/p/ariel/issues/
63
+ Documentation/homepage: http://ariel.rubyforge.org
64
+ RDoc: http://ariel.rubyforge.org/rdoc/
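The quickstart's `search('comments/*/body')` and `at('comments/34')` calls suggest a path query over the extracted tree, with `*` matching any child. As a rough illustration only (this is not Ariel's implementation; the `TinyNode` class, its methods, and the author names are hypothetical stand-ins), such a query could be sketched as:

```ruby
# Hypothetical stand-in for an extracted node; not part of Ariel.
class TinyNode
  attr_reader :name, :text, :children

  def initialize(name, text = nil)
    @name = name.to_s
    @text = text
    @children = []
  end

  # Adds a child and returns it so the tree can be built step by step.
  def add(name, text = nil)
    child = TinyNode.new(name, text)
    @children << child
    child
  end

  # Follows the slash-separated path one segment at a time and returns
  # every node reached; "*" matches any child name at that level.
  def search(path)
    path.split('/').inject([self]) do |nodes, segment|
      nodes.flat_map do |node|
        node.children.select { |c| segment == '*' || c.name == segment }
      end
    end
  end

  # Returns the first match, or nil when nothing matches.
  def at(path)
    search(path).first
  end
end

root = TinyNode.new(:root)
comments = root.add(:comments)
first = comments.add(0)
first.add(:author, 'alice') # invented sample data
first.add(:body, 'Great stuff, loving it')
second = comments.add(1)
second.add(:author, 'bob')  # invented sample data
second.add(:body, 'I love life')

root.search('comments/*/body').each { |node| puts node.text }
p root.at('comments/34')
```

In this sketch `search` always returns an array (possibly empty), while `at` collapses it to a single node or `nil`, which matches the behaviour the README describes for a missing 34th comment.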
data/bin/ariel CHANGED
@@ -14,43 +14,52 @@ OptionParser.new do |opts|
14
14
  end
15
15
 
16
16
  opts.on('-d', '--dir=DIRECTORY', 'Directory to look for documents to operate on.') do |dir|
17
+ raise ArgumentError, "directory does not exist" unless FileTest.directory? dir
17
18
  options[:dir]=dir
18
19
  end
19
20
 
20
- opts.on('-D', '--debug', 'Directory to look for documents to operate on.') do
21
+ opts.on('-D', '--debug', 'Enable debugging output.') do
21
22
  $DEBUG=true
22
23
  end
23
24
 
24
25
  opts.on('-s', '--structure=STRUCTURE', 'YAML file in which the structure is defined') do |structure|
25
26
  options[:structure]=structure
26
27
  end
28
+
29
+ opts.on('-o', '--output-dir=DIRECTORY', 'Directory to output to') do |dir|
30
+ raise ArgumentError, "directory does not exist" unless FileTest.directory? dir
31
+ options[:output_dir]=dir
32
+ end
27
33
  end.parse!
28
34
 
29
- require 'ariel' #After option parsing to debug setting can take effect
35
+ require 'ariel' #After option parsing so debug setting can take effect
36
+
37
+ files=Dir["#{options[:dir]}/*"].select {|file_name| File.file? file_name}
38
+ structure=YAML.load_file options[:structure]
30
39
 
31
40
  case options[:mode]
32
41
  when "learn"
33
- structure=YAML.load_file options[:structure]
34
- learnt_structure=Ariel::ExampleDocumentLoader.load_directory options[:dir], structure
42
+ Ariel.learn(structure, *files)
35
43
  File.open(options[:structure], 'wb') do |file|
36
- YAML.dump(learnt_structure, file)
37
- end
38
- learnt_structure.each_descendant do |structure_node|
39
- puts structure_node.meta.name.to_s
40
- puts structure_node.ruleset.to_yaml
44
+ YAML.dump(structure, file)
41
45
  end
46
+
42
47
  when "extract"
43
- learnt_structure=YAML.load_file options[:structure]
44
- Dir.glob("#{options[:dir]}/*") do |file|
45
- tokenstream=Ariel::TokenStream.new
46
- tokenstream.tokenize File.read(file)
47
- root_node=Ariel::ExtractedNode.new :root, tokenstream, learnt_structure
48
- learnt_structure.apply_extraction_tree_on root_node
49
- puts "Results for #{file}:"
50
- root_node.each_descendant do |node|
51
- puts "#{node.meta.name}: #{node.tokenstream.text}"
48
+ extractions = Ariel.extract(structure, *files)
49
+ if options[:output_dir]
50
+ extractions.zip(files) do |extraction, file|
51
+ filename=File.join(options[:output_dir], File.basename(file)+'.yaml')
52
+ File.open(filename, 'wb') do |f|
53
+ YAML.dump(extraction, f)
54
+ end
55
+ end
56
+ else
57
+ puts "No --output-dir given, so printing extractions to stdout"
58
+ extractions.each do |extraction|
59
+ extraction.each_descendant do |node|
60
+ puts "#{node.node_name}: #{node.tokenstream.text}"
61
+ end
62
+ puts #Blank line looks prettier
52
63
  end
53
- puts
54
- # puts root_node.to_yaml
55
64
  end
56
65
  end
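The new `-o`/`--output-dir` branch writes one YAML file per input document, named after the input's basename. The sketch below reproduces just that file handling with the standard library. The `dump_each` helper and the plain hashes standing in for extraction results are assumptions made for illustration, since running the real script needs the ariel gem and a learnt structure.

```ruby
require 'fileutils'
require 'tmpdir'
require 'yaml'

# Hypothetical helper mirroring the loop in bin/ariel: pair each result
# with its source file and dump it to <output_dir>/<basename>.yaml.
def dump_each(results, files, output_dir)
  results.zip(files).map do |result, file|
    filename = File.join(output_dir, File.basename(file) + '.yaml')
    File.open(filename, 'wb') { |f| YAML.dump(result, f) }
    filename
  end
end

input_dir = Dir.mktmpdir
output_dir = Dir.mktmpdir

# Two dummy input documents plus a subdirectory that must be skipped.
File.write(File.join(input_dir, 'a.html'), '<b>1</b>')
File.write(File.join(input_dir, 'b.html'), '<b>2</b>')
Dir.mkdir(File.join(input_dir, 'nested'))

# Same selection as the script: regular files only (sorted here so the
# order is predictable).
files = Dir["#{input_dir}/*"].select { |name| File.file?(name) }.sort

# Plain hashes stand in for the objects Ariel.extract would return.
results = files.map { |file| { 'source' => File.basename(file) } }

written = dump_each(results, files, output_dir)
loaded = written.map { |path| [File.basename(path), YAML.load_file(path)] }
loaded.each { |name, data| puts "#{name}: #{data.inspect}" }

# Clean up the temporary directories.
FileUtils.remove_entry(input_dir)
FileUtils.remove_entry(output_dir)
```

Because `zip` pairs results with files by position, this naming scheme relies on the extraction call returning results in the same order as the file list it was given.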
data/examples/google_calculator/structure.rb CHANGED
@@ -1,12 +1,12 @@
1
1
  require 'ariel'
2
2
  require 'yaml'
3
3
 
4
- structure = Ariel::StructureNode.new do |r|
4
+ structure = Ariel::Node::Structure.new do |r|
5
5
  r.item :calculation do |c|
6
6
  c.item :result
7
7
  end
8
8
  end
9
9
 
10
- File.open('structure.yaml') do |file|
10
+ File.open('structure.yaml', 'w') do |file|
11
11
  YAML.dump structure, file
12
12
  end
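The change from `File.open('structure.yaml')` to `File.open('structure.yaml', 'w')` matters because `File.open` with no mode argument opens the file read-only. A small standalone demonstration, using a temporary directory and a plain hash in place of the Ariel structure (both are assumptions made so the example runs without the gem):

```ruby
require 'tmpdir'
require 'yaml'

data = { 'calculation' => { 'result' => {} } } # stand-in for the structure
outcome = {}

Dir.mktmpdir do |dir|
  path = File.join(dir, 'structure.yaml')

  # Without a mode the file is opened for reading; it does not exist yet,
  # so the open itself fails.
  begin
    File.open(path) { |file| YAML.dump(data, file) }
  rescue Errno::ENOENT => e
    outcome[:missing] = e.class
  end

  # With 'w' the file is created (or truncated) and the dump succeeds.
  File.open(path, 'w') { |file| YAML.dump(data, file) }
  outcome[:round_trip] = YAML.load_file(path)

  # Even when the file exists, a read-only handle rejects any write.
  begin
    File.open(path) { |file| file.write(YAML.dump(data)) }
  rescue IOError => e
    outcome[:read_only] = e.class
  end
end

p outcome
```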
data/examples/google_calculator/structure.yaml CHANGED
@@ -1,46 +1,44 @@
1
- --- &id002 !ruby/object:Ariel::StructureNode
1
+ --- &id002 !ruby/object:Ariel::Node::Structure
2
2
  children:
3
- :calculation: &id001 !ruby/object:Ariel::StructureNode
3
+ :calculation: &id001 !ruby/object:Ariel::Node::Structure
4
4
  children:
5
- :result: !ruby/object:Ariel::StructureNode
5
+ :result: !ruby/object:Ariel::Node::Structure
6
6
  children: {}
7
7
 
8
- meta: !ruby/object:OpenStruct
9
- table:
10
- :node_type: :not_list
11
- :name: :result
8
+ node_name: :result
9
+ node_type: :not_list
12
10
  parent: *id001
13
11
  ruleset: !ruby/object:Ariel::RuleSet
14
12
  end_rules:
15
13
  - !ruby/object:Ariel::Rule
16
14
  direction: :back
15
+ exhaustive: false
17
16
  landmarks: []
18
17
 
19
18
  start_rules:
20
19
  - !ruby/object:Ariel::Rule
21
20
  direction: :forward
21
+ exhaustive: false
22
22
  landmarks:
23
23
  - - "="
24
- meta: !ruby/object:OpenStruct
25
- table:
26
- :node_type: :not_list
27
- :name: :calculation
24
+ node_name: :calculation
25
+ node_type: :not_list
28
26
  parent: *id002
29
27
  ruleset: !ruby/object:Ariel::RuleSet
30
28
  end_rules:
31
29
  - !ruby/object:Ariel::Rule
32
30
  direction: :back
31
+ exhaustive: false
33
32
  landmarks:
34
33
  - - </b>
35
34
  - - </b>
36
35
  start_rules:
37
36
  - !ruby/object:Ariel::Rule
38
37
  direction: :forward
38
+ exhaustive: false
39
39
  landmarks:
40
40
  - - <b>
41
41
  - - gif
42
42
  - - <b>
43
- meta: !ruby/object:OpenStruct
44
- table:
45
- :node_type: :not_list
46
- :name: :root
43
+ node_name: :root
44
+ node_type: :not_list
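Each serialized rule carries a `direction` and a list of `landmarks`, and the 0.0.1 README notes that end rules are applied from the end of the document. One plausible reading, borrowed from landmark-based wrapper induction in general and not confirmed against Ariel's source, is that a rule skips tokens until each landmark has been seen in turn. The `apply_rule` helper below is hypothetical, works on a hand-written token list, and treats every landmark as a single literal token:

```ruby
# Hypothetical sketch, not Ariel's Rule class. Scans tokens in the given
# direction, skipping to each landmark in turn. For :forward the result is
# the index of the first token after the last landmark; for :back it is the
# index of the last landmark matched, usable as an exclusive end index.
# Returns nil if any landmark is never found.
def apply_rule(tokens, landmarks, direction)
  seq = direction == :back ? tokens.reverse : tokens
  pos = 0
  landmarks.each do |landmark|
    offset = seq[pos..-1].index(landmark)
    return nil if offset.nil?
    pos += offset + 1
  end
  direction == :back ? tokens.size - pos : pos
end

tokens = %w[<b> 3 + 4 = 7 </b>]
start_index = apply_rule(tokens, ['='], :forward) # first token after "="
end_index   = apply_rule(tokens, ['</b>'], :back) # position of "</b>"
puts tokens[start_index...end_index].join(' ')
```

Under this reading, a `:back` rule with an empty landmark list (as on the `:result` node) consumes nothing and simply marks the end of the parent's tokens.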
data/examples/raa/labeled/highline.html CHANGED
@@ -96,17 +96,18 @@ highline / <l:current_version>1.2.0</l:current_version>
96
96
 
97
97
  <tr><th>Versions: </th>
98
98
  <td>
99
- <l:version_history>[<a href="project/highline/1.2.0">1.2.0</a> (2006-03-23)]
99
+ <l:version_history>[<a
100
+ href="project/highline/1.2.0"><l:version>1.2.0</l:version></a> (2006-03-23)]
100
101
 
101
102
  [<a href="project/highline/1.0.2">1.0.2</a> (2006-02-20)]
102
103
 
103
- [<a href="project/highline/1.0.1">1.0.1</a> (2005-07-07)]
104
+ [<a href="project/highline/1.0.1"><l:version>1.0.1</l:version></a> (2005-07-07)]
104
105
 
105
106
  [<a href="project/highline/1.0.0">1.0.0</a> (2005-07-07)]
106
107
 
107
- [<a href="project/highline/0.6.1">0.6.1</a> (2005-05-26)]
108
+ [<a href="project/highline/0.6.1"><l:version>0.6.1</l:version></a> (2005-05-26)]
108
109
 
109
- [<a href="project/highline/0.6.0">0.6.0</a>
110
+ [<a href="project/highline/0.6.0"><l:version>0.6.0</l:version></a>
110
111
  (2005-05-21)]</l:version_history>
111
112
 
112
113
  </td>
data/examples/raa/labeled/mongrel.html CHANGED
@@ -126,21 +126,22 @@ mongrel / <l:current_version>0.3.12</l:current_version>
126
126
 
127
127
  <tr><th>Versions: </th>
128
128
  <td>
129
- <l:version_history>[<a href="project/mongrel/0.3.12">0.3.12</a> (2006-03-30)]
129
+ <l:version_history>[<a
130
+ href="project/mongrel/0.3.12"><l:version>0.3.12</l:version></a> (2006-03-30)]
130
131
 
131
- [<a href="project/mongrel/0.3.11">0.3.11</a> (2006-03-15)]
132
+ [<a href="project/mongrel/0.3.11"><l:version>0.3.11</l:version></a> (2006-03-15)]
132
133
 
133
- [<a href="project/mongrel/0.3.10">0.3.10</a> (2006-03-12)]
134
+ [<a href="project/mongrel/0.3.10"><l:version>0.3.10</l:version></a> (2006-03-12)]
134
135
 
135
- [<a href="project/mongrel/0.3.9">0.3.9</a> (2006-03-06)]
136
+ [<a href="project/mongrel/0.3.9"><l:version>0.3.9</l:version></a> (2006-03-06)]
136
137
 
137
- [<a href="project/mongrel/0.3.8">0.3.8</a> (2006-03-04)]
138
+ [<a href="project/mongrel/0.3.8"><l:version>0.3.8</l:version></a> (2006-03-04)]
138
139
 
139
- [<a href="project/mongrel/0.3.6">0.3.6</a> (2006-02-23)]
140
+ [<a href="project/mongrel/0.3.6"><l:version>0.3.6</l:version></a> (2006-02-23)]
140
141
 
141
- [<a href="project/mongrel/0.3.2">0.3.2</a> (2006-02-13)]
142
+ [<a href="project/mongrel/0.3.2"><l:version>0.3.2</l:version></a> (2006-02-13)]
142
143
 
143
- [<a href="project/mongrel/0.3.1">0.3.1</a> (2006-02-12)]</l:version_history>
144
+ [<a href="project/mongrel/0.3.1"><l:version>0.3.1</l:version></a> (2006-02-12)]</l:version_history>
144
145
 
145
146
  </td>
146
147
  </tr>
data/examples/raa/structure.rb CHANGED
@@ -1,7 +1,7 @@
1
1
  require 'ariel'
2
2
  require 'yaml'
3
3
 
4
- structure = Ariel::StructureNode.new do |r|
4
+ structure = Ariel::Node::Structure.new do |r|
5
5
  r.item :name
6
6
  r.item :current_version
7
7
  r.item :short_description
@@ -9,7 +9,9 @@ structure = Ariel::StructureNode.new do |r|
9
9
  r.item :owner
10
10
  r.item :homepage
11
11
  r.item :license
12
- r.item :version_history
12
+ r.list :version_history do |v|
13
+ v.list_item :version
14
+ end
13
15
  end
14
16
 
15
17
  File.open('structure.yaml', 'wb') do |file|
data/examples/raa/structure.yaml CHANGED
@@ -1,38 +1,16 @@
1
- --- &id001 !ruby/object:Ariel::StructureNode
1
+ --- &id001 !ruby/object:Ariel::Node::Structure
2
2
  children:
3
- :version_history: !ruby/object:Ariel::StructureNode
3
+ :short_description: !ruby/object:Ariel::Node::Structure
4
4
  children: {}
5
5
 
6
- meta: !ruby/object:OpenStruct
7
- table:
8
- :name: :version_history
9
- :node_type: :not_list
10
- parent: *id001
11
- ruleset: !ruby/object:Ariel::RuleSet
12
- end_rules:
13
- - !ruby/object:Ariel::Rule
14
- direction: :back
15
- landmarks:
16
- - - </td>
17
- start_rules:
18
- - !ruby/object:Ariel::Rule
19
- direction: :forward
20
- landmarks:
21
- - - <td>
22
- - - Versions
23
- - - <td>
24
- :short_description: !ruby/object:Ariel::StructureNode
25
- children: {}
26
-
27
- meta: !ruby/object:OpenStruct
28
- table:
29
- :name: :short_description
30
- :node_type: :not_list
6
+ node_name: :short_description
7
+ node_type: :not_list
31
8
  parent: *id001
32
9
  ruleset: !ruby/object:Ariel::RuleSet
33
10
  end_rules:
34
11
  - !ruby/object:Ariel::Rule
35
12
  direction: :back
13
+ exhaustive: false
36
14
  landmarks:
37
15
  - - </td>
38
16
  - - Category
@@ -40,109 +18,109 @@ children:
40
18
  start_rules:
41
19
  - !ruby/object:Ariel::Rule
42
20
  direction: :forward
21
+ exhaustive: false
43
22
  landmarks:
44
23
  - - <td>
45
- :current_version: !ruby/object:Ariel::StructureNode
24
+ :homepage: !ruby/object:Ariel::Node::Structure
46
25
  children: {}
47
26
 
48
- meta: !ruby/object:OpenStruct
49
- table:
50
- :name: :current_version
51
- :node_type: :not_list
27
+ node_name: :homepage
28
+ node_type: :not_list
52
29
  parent: *id001
53
30
  ruleset: !ruby/object:Ariel::RuleSet
54
31
  end_rules:
55
32
  - !ruby/object:Ariel::Rule
56
33
  direction: :back
34
+ exhaustive: false
57
35
  landmarks:
58
- - - </p>
59
- - - table
60
- - - </p>
36
+ - - </a>
37
+ - - Download
38
+ - - </a>
61
39
  start_rules:
62
40
  - !ruby/object:Ariel::Rule
63
41
  direction: :forward
42
+ exhaustive: false
64
43
  landmarks:
65
- - - /
66
- - - caption
67
- - - /
68
- :homepage: !ruby/object:Ariel::StructureNode
44
+ - - ">"
45
+ - - rubyforge
46
+ - - ">"
47
+ :category: !ruby/object:Ariel::Node::Structure
69
48
  children: {}
70
49
 
71
- meta: !ruby/object:OpenStruct
72
- table:
73
- :name: :homepage
74
- :node_type: :not_list
50
+ node_name: :category
51
+ node_type: :not_list
75
52
  parent: *id001
76
53
  ruleset: !ruby/object:Ariel::RuleSet
77
54
  end_rules:
78
55
  - !ruby/object:Ariel::Rule
79
56
  direction: :back
57
+ exhaustive: false
80
58
  landmarks:
81
- - - </a>
82
- - - Download
83
- - - </a>
59
+ - - </td>
60
+ - - Status
61
+ - - </td>
84
62
  start_rules:
85
63
  - !ruby/object:Ariel::Rule
86
64
  direction: :forward
65
+ exhaustive: false
87
66
  landmarks:
88
- - - ">"
89
- - - rubyforge
90
- - - ">"
91
- :category: !ruby/object:Ariel::StructureNode
67
+ - - <td>
68
+ - - <td>
69
+ :current_version: !ruby/object:Ariel::Node::Structure
92
70
  children: {}
93
71
 
94
- meta: !ruby/object:OpenStruct
95
- table:
96
- :name: :category
97
- :node_type: :not_list
72
+ node_name: :current_version
73
+ node_type: :not_list
98
74
  parent: *id001
99
75
  ruleset: !ruby/object:Ariel::RuleSet
100
76
  end_rules:
101
77
  - !ruby/object:Ariel::Rule
102
78
  direction: :back
79
+ exhaustive: false
103
80
  landmarks:
104
- - - </td>
105
- - - Status
106
- - - </td>
81
+ - - </p>
82
+ - - table
83
+ - - </p>
107
84
  start_rules:
108
85
  - !ruby/object:Ariel::Rule
109
86
  direction: :forward
87
+ exhaustive: false
110
88
  landmarks:
111
- - - <td>
112
- - - <td>
113
- :name: !ruby/object:Ariel::StructureNode
89
+ - - :anything
90
+ - - caption
91
+ - - /
92
+ :name: !ruby/object:Ariel::Node::Structure
114
93
  children: {}
115
94
 
116
- meta: !ruby/object:OpenStruct
117
- table:
118
- :name: :name
119
- :node_type: :not_list
95
+ node_name: :name
96
+ node_type: :not_list
120
97
  parent: *id001
121
98
  ruleset: !ruby/object:Ariel::RuleSet
122
99
  end_rules:
123
100
  - !ruby/object:Ariel::Rule
124
101
  direction: :back
102
+ exhaustive: false
125
103
  landmarks:
126
104
  - - </title>
127
105
  start_rules:
128
106
  - !ruby/object:Ariel::Rule
129
107
  direction: :forward
108
+ exhaustive: false
130
109
  landmarks:
131
110
  - - "-"
132
111
  - - RAA
133
112
  - "-"
134
- :owner: !ruby/object:Ariel::StructureNode
113
+ :owner: !ruby/object:Ariel::Node::Structure
135
114
  children: {}
136
115
 
137
- meta: !ruby/object:OpenStruct
138
- table:
139
- :name: :owner
140
- :node_type: :not_list
116
+ node_name: :owner
117
+ node_type: :not_list
141
118
  parent: *id001
142
119
  ruleset: !ruby/object:Ariel::RuleSet
143
120
  end_rules:
144
121
  - !ruby/object:Ariel::Rule
145
122
  direction: :back
123
+ exhaustive: false
146
124
  landmarks:
147
125
  - - </a>
148
126
  - - id
@@ -150,22 +128,22 @@ children:
150
128
  start_rules:
151
129
  - !ruby/object:Ariel::Rule
152
130
  direction: :forward
131
+ exhaustive: false
153
132
  landmarks:
154
133
  - - ">"
155
134
  - - Owner
156
135
  - - ">"
157
- :license: !ruby/object:Ariel::StructureNode
136
+ :license: !ruby/object:Ariel::Node::Structure
158
137
  children: {}
159
138
 
160
- meta: !ruby/object:OpenStruct
161
- table:
162
- :name: :license
163
- :node_type: :not_list
139
+ node_name: :license
140
+ node_type: :not_list
164
141
  parent: *id001
165
142
  ruleset: !ruby/object:Ariel::RuleSet
166
143
  end_rules:
167
144
  - !ruby/object:Ariel::Rule
168
145
  direction: :back
146
+ exhaustive: false
169
147
  landmarks:
170
148
  - - </td>
171
149
  - - Dependency
@@ -173,11 +151,49 @@ children:
173
151
  start_rules:
174
152
  - !ruby/object:Ariel::Rule
175
153
  direction: :forward
154
+ exhaustive: false
176
155
  landmarks:
177
156
  - - <td>
178
157
  - - License
179
158
  - - <td>
180
- meta: !ruby/object:OpenStruct
181
- table:
182
- :name: :root
183
- :node_type: :not_list
159
+ :version_history: &id002 !ruby/object:Ariel::Node::Structure
160
+ children:
161
+ :version: !ruby/object:Ariel::Node::Structure
162
+ children: {}
163
+
164
+ node_name: :version
165
+ node_type: :list_item
166
+ parent: *id002
167
+ ruleset: !ruby/object:Ariel::RuleSet
168
+ end_rules:
169
+ - !ruby/object:Ariel::Rule
170
+ direction: :back
171
+ exhaustive: true
172
+ landmarks:
173
+ - - </a>
174
+ start_rules:
175
+ - !ruby/object:Ariel::Rule
176
+ direction: :forward
177
+ exhaustive: true
178
+ landmarks:
179
+ - - ">"
180
+ node_name: :version_history
181
+ node_type: :not_list
182
+ parent: *id001
183
+ ruleset: !ruby/object:Ariel::RuleSet
184
+ end_rules:
185
+ - !ruby/object:Ariel::Rule
186
+ direction: :back
187
+ exhaustive: false
188
+ landmarks:
189
+ - - </td>
190
+ start_rules:
191
+ - !ruby/object:Ariel::Rule
192
+ direction: :forward
193
+ exhaustive: false
194
+ landmarks:
195
+ - - <td>
196
+ - - Versions
197
+ - - <td>
198
+ node_name: :root
199
+ node_type: :not_list
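In the regenerated file, `exhaustive: true` appears only on the rules for the `:version` list item, whose landmarks are `>` (start) and `</a>` (end). A plausible interpretation, stated here as an assumption and not taken from Ariel's source, is that an exhaustive rule is applied repeatedly so that every list item is collected instead of only the first. The `extract_all` helper and the simplified token list are hypothetical:

```ruby
# Hypothetical sketch, not Ariel's code: collect every item that sits
# between a start landmark and the next end landmark.
def extract_all(tokens, start_landmark, end_landmark)
  items = []
  pos = 0
  while (start_offset = tokens[pos..-1].index(start_landmark))
    item_start = pos + start_offset + 1
    end_offset = tokens[item_start..-1].index(end_landmark)
    break if end_offset.nil?
    item_end = item_start + end_offset
    items << tokens[item_start...item_end].join(' ')
    pos = item_end + 1 # resume scanning after the end landmark
  end
  items
end

# Simplified, hand-tokenized version history (attributes omitted).
tokens = ['<a', '>', '1.2.0', '</a>', '<a', '>', '1.0.1', '</a>']
versions = extract_all(tokens, '>', '</a>')
puts versions.inspect
```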