babel_bridge 0.5.1 → 0.5.3

Files changed (40)
  1. data/CHANGE_LOG +165 -0
  2. data/Gemfile +4 -0
  3. data/Guardfile +7 -0
  4. data/LICENCE +24 -0
  5. data/README.md +244 -0
  6. data/Rakefile +8 -2
  7. data/TODO +100 -0
  8. data/babel_bridge.gemspec +11 -3
  9. data/examples/json/json_parser.rb +23 -0
  10. data/examples/json/json_parser2.rb +37 -0
  11. data/lib/babel_bridge.rb +3 -2
  12. data/lib/{nodes.rb → babel_bridge/nodes.rb} +0 -0
  13. data/lib/{nodes → babel_bridge/nodes}/empty_node.rb +0 -0
  14. data/lib/{nodes → babel_bridge/nodes}/node.rb +1 -1
  15. data/lib/{nodes → babel_bridge/nodes}/non_terminal_node.rb +0 -8
  16. data/lib/{nodes → babel_bridge/nodes}/root_node.rb +0 -0
  17. data/lib/{nodes → babel_bridge/nodes}/rule_node.rb +0 -0
  18. data/lib/{nodes → babel_bridge/nodes}/terminal_node.rb +0 -0
  19. data/lib/{parser.rb → babel_bridge/parser.rb} +7 -14
  20. data/lib/{pattern_element.rb → babel_bridge/pattern_element.rb} +27 -25
  21. data/lib/babel_bridge/pattern_element_hash.rb +22 -0
  22. data/lib/{rule.rb → babel_bridge/rule.rb} +0 -0
  23. data/lib/{rule_variant.rb → babel_bridge/rule_variant.rb} +0 -4
  24. data/lib/{shell.rb → babel_bridge/shell.rb} +0 -0
  25. data/lib/{string.rb → babel_bridge/string.rb} +0 -0
  26. data/lib/{tools.rb → babel_bridge/tools.rb} +0 -0
  27. data/lib/babel_bridge/version.rb +3 -0
  28. data/spec/advanced_parsers_spec.rb +1 -0
  29. data/spec/basic_parsing_spec.rb +43 -0
  30. data/spec/bb_spec.rb +19 -0
  31. data/spec/compound_patterns_spec.rb +61 -0
  32. data/spec/node_spec.rb +3 -3
  33. data/spec/pattern_generators_spec.rb +4 -4
  34. data/spec/spec_helper.rb +3 -0
  35. metadata +115 -33
  36. data/README +0 -144
  37. data/examples/turing/examples.turing +0 -33
  38. data/examples/turing/notes.rb +0 -111
  39. data/examples/turing/turing_demo.rb +0 -71
  40. data/lib/version.rb +0 -4
data/CHANGE_LOG ADDED
@@ -0,0 +1,165 @@
1
+ 2013-2-12 v0.5.3
2
+
3
+ fixed bug with 0-length matches' to_s returning non-zero-length strings
4
+
5
+ 2012-1-25 v0.5.1
6
+
7
+ added parser.relative_source_file
8
+
9
+ 2012-1-12 v0.5.0
10
+
11
+ added Parser.new :source_file => String
12
+ Sets parser.source_file value
13
+
14
+ Changed uniform_tabs to NOT include at least one space. If you want to ensure at least one space, you should add a space after your tab.
15
+
16
+ Fixed out-of-date tests in tools_spec.
17
+
18
+ 2012-1-6 v0.5.0
19
+
20
+ Nodes now have #line and #column methods which return the line and column of the source for the start of that Node's match.
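How a line and column can be derived from a match offset is sketched below in plain Ruby. The helper `line_and_column` is hypothetical and not part of babel_bridge; the 1-based numbering is an assumption, chosen to agree with the "line 1 column 4 offset 3" wording in the README's parser failure output.

```ruby
# Hypothetical helper (not part of babel_bridge): derive a 1-based line and
# column from a 0-based character offset into the source string.
def line_and_column(source, offset)
  before = source[0, offset]           # text preceding the offset
  line = before.count("\n") + 1        # one line per newline seen so far
  last_newline = before.rindex("\n")   # nil while still on the first line
  column = last_newline ? offset - last_newline : offset + 1
  [line, column]
end

p line_and_column("foo is not immediately followed by bar", 3) # => [1, 4]
p line_and_column("a\nbc", 3)                                  # => [2, 2]
```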
21
+
22
+ 2012-1-5 v0.5.0
23
+
24
+ Completely reworked ignore_whitespace - again.
25
+
26
+ Now there is a global "delimiter" pattern which is matched between every sub-pattern of every rule AND at the beginning and end of the entire parse.
27
+
28
+ ignore_whitespace sets this delimiter to: /\s*/
29
+
30
+ You can set your own delimiter with the delimiter method:
31
+
32
+ class MyParser < BabelBridge::Parser
33
+ delimiter :hi, "there", "/[mM]ust/", "be between every sub-pattern!" # delimiter can take any pattern "rule" can
34
+ rule :hi, "hi"
35
+ end
36
+
37
+ You can override the delimiter pattern for a single rule to put in special code:
38
+
39
+ class MyParser < BabelBridge::Parser
40
+ ignore_whitespace
41
+
42
+ rule :root, many(:statement, ';')
43
+ rule :statement, many(:word, / +/), :delimiter => // # disable the global delimiter
44
+ end
45
+
46
+ INCOMPATIBLE CHANGE: node.matches is no longer positional
47
+
48
+ node.matches now includes only things that were matched. This means conditional matches which do not match no longer add an EmptyNode to node.matches.
49
+
50
+ node.matches now contains all delimiter matches.
51
+
52
+ INCOMPATIBLE CHANGE: no more ManyNode
53
+
54
+ The many(rule) parser pattern no longer generates a special kind of parse-tree node. Instead it adds all its matches to the parent rule's .matches list. It also adds all the many-delimiters.
55
+
56
+ NOTE: 'delimiter' refers to the global delimiter pattern or the rule-local override. 'many-delimiter' refers to the optional, explicit delimiter specified for the many-pattern.
57
+
58
+ NOTE: many(:rule,:many_delimiter) will effectively match: [rule]([delimiter][many_delimiter][delimiter][rule])*
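The expansion in the note above can be tried out with ordinary regexps. This is an illustration only: the three regexps are stand-ins chosen for the example, not babel_bridge patterns, and the sketch leaves out the delimiter match at the beginning and end of the whole parse.

```ruby
# Illustration only: plain regexps standing in for babel_bridge patterns.
rule           = /[a-z]+/ # stands in for :rule
delimiter      = /\s*/    # what ignore_whitespace sets the global delimiter to
many_delimiter = /,/      # stands in for the explicit many-delimiter

# [rule]([delimiter][many_delimiter][delimiter][rule])*
many = /#{rule}(?:#{delimiter}#{many_delimiter}#{delimiter}#{rule})*/

# True only when the pattern covers the whole text.
def full_match?(pattern, text)
  !!(text =~ /\A#{pattern}\z/)
end

puts full_match?(many, "a , b,c") # => true
puts full_match?(many, "a b")     # => false (no many-delimiter between a and b)
```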
59
+
60
+ 2012-12-31 v0.4.2
61
+
62
+ Bugfix: parser_failure_info now works when nothing is matched
63
+
64
+ 2012-12-17 v0.4.1
65
+
66
+ rewind_whitespace usage example:
67
+
68
+ rule :end_statement, rewind_whitespace, /([\t ]*[\n;])+/
69
+
70
+ In this example, end_statement is similar to the end-of-statement pattern for the Ruby language. Each statement ends with either a new line or a semicolon. "rewind_whitespace" indicates the parser should back up to the end of the last match and then continue matching.
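Why backing up matters can be shown with Ruby's standard `StringScanner`. This is a sketch of the idea, not of babel_bridge's internals: once whitespace has been skipped after a token, the newline is gone, so the end-of-statement pattern only matches after returning to the end of that token.

```ruby
require "strscan"

END_STATEMENT = /([\t ]*[\n;])+/

scanner = StringScanner.new("0 \n0")
scanner.scan(/0/)          # the first statement
after_token = scanner.pos  # end of the last match
scanner.scan(/\s*/)        # whitespace skipped, as ignore_whitespace would

# check matches at the current position without advancing.
without_rewind = scanner.check(END_STATEMENT) # nil: the newline is already consumed
scanner.pos = after_token                     # back up to the end of the last match
with_rewind = scanner.check(END_STATEMENT)    # " \n"

p without_rewind
p with_rewind
```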
71
+
72
+ 2012-11-20 v0.4.0
73
+
74
+ INCOMPATIBLE CHANGE: Removed the post-match pattern option from the "many" pattern matcher. It simplifies things and can easily be reproduced with a custom rule.
75
+
76
+ Did significant code cleanup. NonTerminalNode was renamed RuleNode and a new NonTerminalNode class was created as a parent for RuleNode and ManyNode.
77
+
78
+ ignore_whitespace is now just a regexp. An empty regexp is used if ignore_whitespace is not specified. It is now handled consistently throughout. Every node has postwhitespace_range and prewhitespace_range methods that allow you to find the whitespace after and before that node.
79
+
80
+ node.to_s and node.text now both just return the matched text WITHOUT the preceding and trailing whitespace. Note, however, that it will still include any whitespace in between, as it is just a single slice out of the source.
81
+
82
+ 2012-11-13
83
+
84
+ ignore_whitespace now optionally takes a regexp for what to ignore after every TerminalNode. Default: /\s*/
85
+
86
+ rewind_whitespace matching pattern added. This allows you to match the string ignored by "ignore_whitespace" after the previous token.
87
+
88
+ Example: Implements the Ruby ";" / new-line parsing rule.
89
+
90
+ class MyParser < BabelBridge::Parser
91
+ ignore_whitespace
92
+
93
+ rule :pair, :statement, :end_statement, :statement
94
+ rule :end_statement, rewind_whitespace(/([\t ]*[\n;])+/)
95
+ rule :statement, "0"
96
+ end
97
+
98
+ # matches two 0s separated by one or more ";" or "\n" and any whitespace
99
+
100
+
101
+ 2012-09-28
102
+
103
+ Added to_sym on nodes.
104
+
105
+ 2012-09-19 version 0.3.1
106
+
107
+ Added refinements to the parser-failure output.
108
+
109
+ 2012-09-13
110
+
111
+ Reversed the precedence order for binary_operators_rule. The first element has the highest precedence, i.e., it is computed first.
112
+
113
+ Now, the correct precedence order for the basic operators is:
114
+
115
+ [["*", "/"], ["+", "-"]]
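One way to picture precedence resolution with this ordering is the plain-Ruby sketch below. It is not babel_bridge's implementation: `BinaryNode` and `resolve` are hypothetical names, the input is assumed to be an already-tokenized list alternating operands and operators, and right-associative operators are not handled. The node fields mirror the left, right and operator accessors described in the 2012-09-12 entry.

```ruby
# Hypothetical stand-in for a parse-tree node; not a babel_bridge class.
BinaryNode = Struct.new(:left, :operator, :right)

# tokens alternates operands and operators, e.g. [1, "+", 2, "*", 3].
# levels lists operator groups, highest precedence first.
def resolve(tokens, levels)
  levels.each do |ops|
    out = [tokens.first]
    tokens.drop(1).each_slice(2) do |op, operand|
      if ops.include?(op)
        # Fold into the previous operand: left-to-right within one level.
        out[-1] = BinaryNode.new(out[-1], op.to_sym, operand)
      else
        out.push(op, operand) # leave for a lower-precedence level
      end
    end
    tokens = out
  end
  tokens.first
end

tree = resolve([1, "+", 2, "*", 3, "-", 4], [["*", "/"], ["+", "-"]])
p tree.operator            # => :-
p tree.left.right.operator # => :*
```

Because "*" is folded in the first pass, `2 * 3` becomes a single operand before "+" and "-" are grouped, giving `(1 + (2 * 3)) - 4`.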
116
+
117
+ 2012-09-12
118
+
119
+ using readline for shell
120
+
121
+ added support for infix binary operator precedence resolution:
122
+
123
+ USAGE:
124
+
125
+ binary_operators_rule :any_rule_name, :operands_pattern, operators, [:right_operators => [...]]
126
+
127
+ Where "operators" is an array of operators, ordered by precedence such as: ["+", "-", "*", "/"].
128
+
129
+ The last operators in the array are matched first.
130
+
131
+ You can also group operators into the same precedence level: [["+", "-"], ["*", "/"]]
132
+
133
+ Operators in the same precedence level are matched left-to-right.
134
+
135
+ You optionally can list one or more "right_operators" - which can be strings or regexps - to specify which operators are right-associative.
136
+
137
+ MATCHING:
138
+
139
+ binary_operators_rule :any_rule_name, :operands_pattern, ["+", "-", "*", "/"]
140
+
141
+ matches the same string as:
142
+
143
+ rule :any_rule_name, many(:operands_pattern,/[-+*\/]/)
144
+
145
+ PARSE TREE:
146
+
147
+ The resulting parse-tree consists of 1 or more instances of the :any_rule_name rule's variant class. Each node has methods for easy access to:
148
+
149
+ left -> the left operand node
150
+ right -> the right operand node
151
+ operator -> the operator as a symbol
152
+ operator_node -> the operator node
153
+
154
+ ignore_whitespace feature added
155
+
156
+ Called in the parser's class. Sets a flag that causes all future parsing to ignore white spaces. Specifically, this means that after each terminal-node match, all trailing-whitespace is consumed before the next terminal match is attempted.
157
+
158
+ This means that terminal nodes can still match any white-spaces they require.
159
+
160
+ The exact matched string, including trailing whitespace, is still available via the "text" method. The "to_s" method, though, now returns the stripped token value (if ignore_whitespace is enabled).
161
+
162
+ 2012-09-09
163
+
164
+ forward_to now scans all pattern elements for the first one that responds to the method
165
+ added shell
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in foiled.gemspec
4
+ gemspec
data/Guardfile ADDED
@@ -0,0 +1,7 @@
1
+ guard 'rspec', :cli => "--color" do
2
+ watch(%r{^spec/.+_spec\.rb$})
3
+ watch(%r{^lib/(.+)\.rb$}) { "spec" }
4
+ watch('spec/spec_helper.rb') { "spec" }
5
+
6
+ end
7
+
data/LICENCE ADDED
@@ -0,0 +1,24 @@
1
+ Copyright (c) 2010, Shane Brinkman-Davis
2
+ All rights reserved.
3
+
4
+ Redistribution and use in source and binary forms, with or without
5
+ modification, are permitted provided that the following conditions are met:
6
+ * Redistributions of source code must retain the above copyright
7
+ notice, this list of conditions and the following disclaimer.
8
+ * Redistributions in binary form must reproduce the above copyright
9
+ notice, this list of conditions and the following disclaimer in the
10
+ documentation and/or other materials provided with the distribution.
11
+ * Neither the name of the <organization> nor the
12
+ names of its contributors may be used to endorse or promote products
13
+ derived from this software without specific prior written permission.
14
+
15
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
16
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
17
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
18
+ DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
19
+ DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
20
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
21
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
22
+ ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
23
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
24
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md ADDED
@@ -0,0 +1,244 @@
1
+ Summary
2
+ -------
3
+
4
+ Babel Bridge lets you generate parsers 100% in Ruby code. It is a memoizing Parsing Expression Grammar (PEG) generator like Treetop, but it doesn't require special file-types or new syntax. Overall focus is on simplicity and usability over performance.
5
+
6
+ Goals
7
+ -----
8
+
9
+ * Allow expression 100% in ruby
10
+ * Productivity through Simplicity and Understandability first
11
+ * Performance second
12
+
13
+
14
+ Example
15
+ -------
16
+
17
+ ``` ruby
18
+ require "babel_bridge"
19
+
20
+ class MyParser < BabelBridge::Parser
21
+
22
+ # foo rule: match "foo" optionally followed by the :bar rule
23
+ rule :foo, "foo", :bar?
24
+
25
+ # bar rule: match "bar"
26
+ rule :bar, "bar"
27
+ end
28
+
29
+ # create one or more instances of your parser
30
+ parser = MyParser.new
31
+
32
+ parser.parse "foo" # matches "foo"
33
+ # => FooNode1 > "foo"
34
+
35
+ parser.parse "foobar" # matches "foobar"
36
+ # => FooNode1
37
+ # "foo"
38
+ # BarNode1 > "bar"
39
+
40
+ parser.parse "fribar" # fails to match
41
+ # => nil
42
+
43
+ parser.parse "foobarbar" # fails to match entire input
44
+ # => nil
45
+ ```
46
+
47
+ More elaborate examples:
48
+ * [Parsing JSON the Not-So-Hard Way](http://www.essenceandartifact.com/2013/01/parsing-json-not-so-hard-way.html)
49
+ * [How to Create a Turing Complete Programming Language in 40 Minutes](http://www.essenceandartifact.com/2012/09/how-to-create-turing-complete.html)
50
+
51
+ Features
52
+ --------
53
+
54
+ ``` ruby
55
+
56
+ # returns the BabelBridge::Rule instance for that rule
57
+ rule = MyParser[:foo]
58
+ # => rule :foo, "foo", :bar?
59
+
60
+ # nice human-readable view of the rule with extra info:
61
+ rule.to_s
62
+ # rule :foo, node_class: MyParser::FooNode
63
+ # variant_class: MyParser::FooNode1, pattern: "foo", :bar?
64
+
65
+ # returns the code necessary for generating the rule and all its variants
66
+ # (minus any class_eval code)
67
+ rule.inspect
68
+ # => rule :foo, "foo", :bar?
69
+
70
+ # returns the Node class for a rule
71
+ MyParser.node_class(:foo)
72
+ # => MyParser::FooNode
73
+
74
+ MyParser.node_class(:foo) do
75
+ # class_eval inside the rule's Node-class
76
+ end
77
+
78
+ # parses Text starting with the MyParser.root_rule
79
+ # The root_rule is defined automatically by the first rule defined, but can be set by:
80
+ # MyParser.root_rule=v
81
+ # where v is the symbol name of the rule or the actual rule object from MyParser[rule]
82
+ text = "foobar"
83
+ parser.parse(text)
84
+
85
+ # do a one-time parse with :bar set as the root-rule
86
+ text = "bar"
87
+ parser.parse(text, :rule => :bar)
88
+
89
+ # relax requirement to match entire input
90
+ parser.parse "foobar and then something", :partial_match => true
91
+
92
+ # parse failure
93
+ parser.parse "foo is not immediately followed by bar"
94
+
95
+ # human readable parser failure info
96
+ puts parser.parser_failure_info
97
+ ```
98
+
99
+ Parser failure info output:
100
+ ```
101
+ Parsing error at line 1 column 4 offset 3
102
+
103
+ Source:
104
+ ...
105
+ foo<HERE> is not immediately followed by bar
106
+ ...
107
+
108
+ Parser did not match entire input.
109
+
110
+ Parse path at failure:
111
+ FooNode1
112
+
113
+ Expecting:
114
+ "bar" BarNode1
115
+ ```
116
+ NOTE: This is an evolving feature; this output is as of 0.5.1 and may not match the current version.
117
+
118
+ Defining Rules
119
+ --------------
120
+
121
+ Inside the parser class, a rule is defined as follows:
122
+
123
+ ``` ruby
124
+ class MyParser < BabelBridge::Parser
125
+ rule :rule_name, pattern
126
+ end
127
+ ```
128
+
129
+ Where:
130
+
131
+ * :rule_name is a symbol
132
+ * pattern: see Patterns below
133
+
134
+ You can also add new rules outside the class definition by:
135
+
136
+ ``` ruby
137
+ MyParser.rule :rule_name, pattern
138
+ ```
139
+
140
+ Patterns
141
+ --------
142
+
143
+ Patterns are a list of pattern elements, matched in order:
144
+
145
+ Example:
146
+
147
+ ``` ruby
148
+ rule :my_rule, "match", "this", "in", "order" # matches "matchthisinorder"
149
+ ```
150
+
151
+ Pattern Elements
152
+ ----------------
153
+
154
+ Pattern elements are either a basic-pattern-element or an extended-pattern-element (expressed as a hash). Internally, they are "compiled" into instances of PatternElement with optimized lambda functions for parsing.
155
+
156
+ ## Basic Pattern Elements (basic_element)
157
+
158
+ ``` ruby
159
+ :my_rule # matches the Rule named :my_rule
160
+ :my_rule? # optional: optionally matches Rule :my_rule
161
+ :my_rule! # negative: success only if it DOESN'T match Rule :my_rule
162
+ "string" # matches the string exactly
163
+ /regex/ # matches the regex exactly
164
+ ```
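The `?` and `!` suffixes are ordinary Ruby symbol characters, so a rule reference and its modifier travel in one symbol. A minimal sketch of how such a symbol could be decoded follows; `decode_rule_symbol` and the hash keys it returns are hypothetical and say nothing about how babel_bridge does this internally.

```ruby
# Hypothetical decoder, not babel_bridge's implementation.
def decode_rule_symbol(sym)
  name = sym.to_s
  case name[-1]
  when "?" then { rule: name.chomp("?").to_sym, optional: true }
  when "!" then { rule: name.chomp("!").to_sym, negative: true }
  else          { rule: sym }
  end
end

p decode_rule_symbol(:my_rule)  # => {:rule=>:my_rule} on Ruby < 3.4; 3.4+ prints {rule: :my_rule}
p decode_rule_symbol(:my_rule?)
p decode_rule_symbol(:my_rule!)
```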
165
+
166
+ ## Advanced Pattern Elements
167
+
168
+ ``` ruby
169
+
170
+ # success if basic_element could be matched, but the input is not consumed
171
+ could.match(pattern_element)
172
+
173
+ # negative (two equivalent methods)
174
+ dont.match(pattern_element)
175
+ match!(pattern_element)
176
+
177
+ # optional (two equivalent methods)
178
+ optionally.match(pattern_element)
179
+ match?(pattern_element)
180
+
181
+ # match 1 or more
182
+ many(pattern_element)
183
+
184
+ # match 1 or more of one basic_element delimited by another basic_element
185
+ many(pattern_element, delimiter_pattern_element)
186
+
187
+ # match 0 or more
188
+ many?(pattern_element)
189
+
190
+ # An array of patterns tells BB to match those patterns in order ("and" matching)
191
+ [pattern_element_a, pattern_element_b, pattern_element_c, ...]
192
+
193
+ # match any one of the listed patterns ("or" matching)
194
+ any(pattern_element_a, pattern_element_b, pattern_element_c, ...)
195
+
196
+ # optionally match any of the patterns
197
+ any?(pattern_element_a, pattern_element_b, pattern_element_c, ...)
198
+
199
+ # don't match any of the patterns
200
+ any!(pattern_element_a, pattern_element_b, pattern_element_c, ...)
201
+
202
+ ```
203
+
204
+ ## Custom Pattern Element Parser
205
+
206
+ Custom pattern elements are not generally needed, but for certain patterns, particularly context-sensitive ones, we provide a way to do it.
207
+
208
+ ``` ruby
209
+ class MyParser < BabelBridge::Parser
210
+
211
+ # custom parser to match an all upper-case word followed by any number of characters before that word is repeated
212
+ rule :foo, (custom_parser do |parent_node|
213
+ offset = parent_node.next
214
+ src = parent_node.src
215
+
216
+ # Note, the \A anchors the search at the beginning of the string
217
+ if src[offset..-1].index(/\A[A-Z]+/) == 0
218
+ endpattern=$~.to_s
219
+ if i = src.index(endpattern, offset + endpattern.length)
220
+ range = offset..(i + endpattern.length)
221
+ BabelBridge::TerminalNode.new(parent_node, range, "endpattern")
222
+ end
223
+ end
224
+ end)
225
+ end
226
+
227
+ parser = MyParser.new
228
+ parser.parse "END this is in the middle END"
229
+ # => FooNode1 > "END this is in the middle END"
230
+
231
+ parser.parse "DRUID this is in the middle DRUID"
232
+ # => FooNode1 > "DRUID this is in the middle DRUID"
233
+
234
+ parser.parse "DRUID this is in the middle DRUI"
235
+ # => nil
236
+ ```
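The matching logic of the custom parser can be exercised on its own, without a parser or nodes. The function below is a stand-alone sketch with a hypothetical name, `match_repeated_word`; it returns the matched text rather than a TerminalNode, uses an exclusive range that ends right after the repeated word, and assumes the offset lies within the string.

```ruby
# Stand-alone version of the matching idea; no babel_bridge classes involved.
# Returns the matched text, or nil when the opening word never repeats.
def match_repeated_word(src, offset = 0)
  opening = src[offset..-1][/\A[A-Z]+/] # upper-case word at the offset, if any
  return nil unless opening

  closing_at = src.index(opening, offset + opening.length)
  return nil unless closing_at

  src[offset...(closing_at + opening.length)] # stop right after the repeat
end

p match_repeated_word("END this is in the middle END")
p match_repeated_word("DRUID this is in the middle DRUI") # => nil
```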
237
+
238
+ Structure
239
+ ---------
240
+
241
+ * Each Rule defines a subclass of Node
242
+ * Each RuleVariant defines a subclass of the parent Rule's node-class
243
+
244
+ Therefore you can easily define code to be shared across all variants as well as define code specific to one variant.