
babel_bridge 0.5.1 → 0.5.3

Files changed (40)

data/CHANGE_LOG +165 -0
data/Gemfile +4 -0
data/Guardfile +7 -0
data/LICENCE +24 -0
data/README.md +244 -0
data/Rakefile +8 -2
data/TODO +100 -0
data/babel_bridge.gemspec +11 -3
data/examples/json/json_parser.rb +23 -0
data/examples/json/json_parser2.rb +37 -0
data/lib/babel_bridge.rb +3 -2
data/lib/{nodes.rb → babel_bridge/nodes.rb} +0 -0
data/lib/{nodes → babel_bridge/nodes}/empty_node.rb +0 -0
data/lib/{nodes → babel_bridge/nodes}/node.rb +1 -1
data/lib/{nodes → babel_bridge/nodes}/non_terminal_node.rb +0 -8
data/lib/{nodes → babel_bridge/nodes}/root_node.rb +0 -0
data/lib/{nodes → babel_bridge/nodes}/rule_node.rb +0 -0
data/lib/{nodes → babel_bridge/nodes}/terminal_node.rb +0 -0
data/lib/{parser.rb → babel_bridge/parser.rb} +7 -14
data/lib/{pattern_element.rb → babel_bridge/pattern_element.rb} +27 -25
data/lib/babel_bridge/pattern_element_hash.rb +22 -0
data/lib/{rule.rb → babel_bridge/rule.rb} +0 -0
data/lib/{rule_variant.rb → babel_bridge/rule_variant.rb} +0 -4
data/lib/{shell.rb → babel_bridge/shell.rb} +0 -0
data/lib/{string.rb → babel_bridge/string.rb} +0 -0
data/lib/{tools.rb → babel_bridge/tools.rb} +0 -0
data/lib/babel_bridge/version.rb +3 -0
data/spec/advanced_parsers_spec.rb +1 -0
data/spec/basic_parsing_spec.rb +43 -0
data/spec/bb_spec.rb +19 -0
data/spec/compound_patterns_spec.rb +61 -0
data/spec/node_spec.rb +3 -3
data/spec/pattern_generators_spec.rb +4 -4
data/spec/spec_helper.rb +3 -0
metadata +115 -33
data/README +0 -144
data/examples/turing/examples.turing +0 -33
data/examples/turing/notes.rb +0 -111
data/examples/turing/turing_demo.rb +0 -71
data/lib/version.rb +0 -4

data/CHANGE_LOG ADDED

@@ -0,0 +1,165 @@
+2013-2-12 v0.5.3
+  fixed bug with 0-length matches' to_s returning non-zero-length strings
+2012-1-25 v0.5.1
+  added parser.relative_source_file
+2012-1-12 v0.5.0
+  added Parser.new :source_file => String
+  Sets parser.source_file value
+  Changed uniform_tabs to NOT include at least one space. If you want to ensure at least one space, you should add a space after your tab.
+  Fixed out-of-date tests in tools_spec.
+2012-1-6 v0.5.0
+  Nodes now have #line and #column methods which return the line and column of the source for the start of that Node's match.
+2012-1-5 v0.5.0
+  Completely reworked ignore_whitespace - again.
+  Now there is a global "delimiter" pattern which is matched between every sub-pattern of every rule AND at the beginning and end of the entire parse.
+  ignore_whitespace sets this delimiter to: /\s*/
+  You can set your own delimiter with the delimiter method:
+  class MyParser < BabelBridge::Parser
+    delimiter :hi, "there", "/[mM]ust/", "be between every sub-pattern!" # delimiter can take any pattern "rule" can
+    rule :hi, "hi"
+  end
+  You can override the delimiter pattern for a single rule to put in special code:
+  class MyParser < BabelBridge::Parser
+    ignore_whitespace
+    rule :root, many(:statement, ';')
+    rule :statement, many(:word, / +/), :delimiter => //  # disable the global delimiter
+  end
+  INCOMPATIBLE CHANGE: node.matches is no longer positional
+  node.matches now includes only things that were matched. This means conditional matches which do not match no longer add an EmptyNode to node.matches.
+  node.matches now contains all delimiter matches.
+  INCOMPATIBLE CHANGE: no more ManyNode
+  The many(rule) parser pattern no longer generates a special kind of parse-tree node. Instead it adds all its matches to the parent rule's .matches list. It also adds all the many-delimiters.
+  NOTE: 'delimiter' refers to the global delimiter pattern or the rule-local override. 'many-delimiter' refers to the optional, explicit delimiter specified for the many-pattern.
+  NOTE: many(:rule,:many_delimiter) will effectively match: [rule]([delimiter][many_delimiter][delimiter][rule])*
+2012-12-31 v0.4.2
+    Bugfix: parser_failure_info now works when nothing is matched
+2012-12-17 v0.4.1
+    rewind_whitespace usage example:
+      rule :end_statement, rewind_whitespace, /([\t ]*[\n;])+/
+    In this example, end_statement is similar to the end-of-statement pattern for the Ruby language. Each statement ends with either a new line or a semicolon. "rewind_whitespace" indicates the parser should back up to the end of the last match and then continue matching.
+2012-11-20 v0.4.0
+    INCOMPATIBLE CHANGE: Removed the post-match pattern option from the "many" pattern matcher. It simplifies things and can easily be reproduced with a custom rule.
+    Did significant code cleanup. NonTerminalNode was renamed RuleNode and a new NonTerminalNode class was created as a parent for RuleNode and ManyNode.
+    ignore_whitespace is now just a regexp. An empty regexp is used if ignore_whitespace is not specified. It is now handled consistently throughout. Every node has postwhitespace_range and prewhitespace_range methods that allow you to find the whitespace after and before that node.
+    node.to_s and node.text now both just return the matched text WITHOUT the preceding and trailing whitespace. Note, however, that it will still include any whitespace in between, as it is just a single slice out of the source.
+2012-11-13
+    ignore_whitespace now optionally takes a regexp for what to ignore after every TerminalNode. Default: /\s*/
+    rewind_whitespace matching pattern added. This allows you to match the string ignored by "ignore_whitespace" after the previous token.
+    Example: Implements the Ruby ";" / new-line parsing rule.
+      class MyParser < BabelBridge::Parser
+        ignore_whitespace
+        rule :pair, :statement, :end_statement, :statement
+        rule :end_statement, rewind_whitespace(/([\t ]*[\n;])+/)
+        rule :statement, "0"
+      end
+      # matches two 0s separated by one or more ";" or "\n" and any whitespace
+2012-09-28
+    Added to_sym on nodes.
+2012-09-19 version 0.3.1
+    Added refinements to the parser-failure output.
+2012-09-13
+    Reversed the precedence order for binary_operators_rule. The first element has the highest precedence, i.e., it is computed first.
+    Now, the correct precedence order for the basic operators is:
+        [["*", "/"], ["+", "-"]]
+2012-09-12
+    using readline for shell
+    added support for infix binary operator precedence resolution:
+        USAGE:
+              binary_operators_rule :any_rule_name, :operands_pattern, operators, [:right_operators => [...]]
+            Where "operators" is an array of operators, ordered by precedence such as: ["+", "-", "*", "/"].
+            The last operators in the array are matched first.
+            You can also group operators into the same precedence level: [["+", "-"], ["*", "/"]]
+            Operators in the same precedence level are matched left-to-right.
+            You optionally can list one or more "right_operators" - which can be strings or regexps - to specify which operators are right-associative.
+        MATCHING:
+              binary_operators_rule :any_rule_name, :operands_pattern, ["+", "-", "*", "/"]
+            matches the same string as:
+              rule :any_rule_name, many(:operands_pattern,/[-+*\/]/)
+        PARSE TREE:
+            The resulting parse-tree consists of 1 or more instances of the :any_rule_name rule's variant class. Each node has methods for easy access to:
+                left -> the left operand node
+                right -> the right operand node
+                operator -> the operator as a symbol
+                operator_node -> the operator node
+    ignore_whitespace feature added
+        Called in the parser's class. Sets a flag that causes all future parsing to ignore white spaces. Specifically, this means that after each terminal-node match, all trailing-whitespace is consumed before the next terminal match is attempted.
+        This means that terminal nodes can still match any white-spaces they require.
+        The exact matched string, including trailing whitespace, is still available via the "text" method. The "to_s" method, though, now returns the stripped token value (if ignore_whitespace is enabled).
+2012-09-09
+    forward_to now scans all pattern elements for the first one that responds to the method
+    added shell
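
The 2012-09-12 and 2012-09-13 changelog entries above describe infix operator precedence resolution: operator groups listed highest precedence first, left-to-right matching within a group, and optional right-associative operators. The following is a minimal sketch of that idea using precedence climbing in plain Ruby. It is not BabelBridge's implementation; `BinaryNode`, `build_tree`, `climb`, `operator_ranks`, and `evaluate` are hypothetical names introduced only for illustration, and the input is assumed to be a flat list of alternating operands and operator strings.

```ruby
# Hypothetical stand-in for a parse-tree node; not a BabelBridge class.
BinaryNode = Struct.new(:left, :operator, :right)

# Operator groups, highest precedence first (the order the 2012-09-13 entry describes).
PRECEDENCE = [["**"], ["*", "/"], ["+", "-"]].freeze
# Operators treated as right-associative (an assumption for this example).
RIGHT_OPERATORS = ["**"].freeze

# Map each operator to a numeric rank; a higher rank binds tighter.
def operator_ranks(precedence)
  ranks = {}
  precedence.each_with_index do |group, index|
    Array(group).each { |op| ranks[op] = precedence.length - index }
  end
  ranks
end

# Precedence climbing over a flat list: operand, operator, operand, ...
def climb(tokens, ranks, right_operators, min_rank)
  left = tokens.shift
  while (op = tokens.first) && ranks.fetch(op) >= min_rank
    tokens.shift
    # A right-associative operator lets the same rank recurse on the right side.
    next_min = right_operators.include?(op) ? ranks[op] : ranks[op] + 1
    right = climb(tokens, ranks, right_operators, next_min)
    left = BinaryNode.new(left, op.to_sym, right)
  end
  left
end

def build_tree(tokens, precedence = PRECEDENCE, right_operators = RIGHT_OPERATORS)
  climb(tokens.dup, operator_ranks(precedence), right_operators, 1)
end

# Walk the tree, applying each operator symbol to its evaluated operands.
def evaluate(node)
  return node unless node.is_a?(BinaryNode)
  evaluate(node.left).public_send(node.operator, evaluate(node.right))
end

tree = build_tree([1, "+", 2, "*", 3])
puts tree.operator                                # root is "+" because "*" binds tighter
puts evaluate(tree)                               # 1 + (2 * 3)
puts evaluate(build_tree([8, "-", 3, "-", 2]))    # left-associative: (8 - 3) - 2
puts evaluate(build_tree([2, "**", 3, "**", 2]))  # right-associative: 2 ** (3 ** 2)
```

Each `BinaryNode` mirrors the `left`, `right`, and `operator` accessors the changelog lists for the generated nodes, with the operator stored as a symbol.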

data/Gemfile ADDED

@@ -0,0 +1,4 @@
+source 'https://rubygems.org'
+# Specify your gem's dependencies in foiled.gemspec
+gemspec

data/Guardfile ADDED

@@ -0,0 +1,7 @@
+guard 'rspec', :cli => "--color" do
+  watch(%r{^spec/.+_spec\.rb$})
+  watch(%r{^lib/(.+)\.rb$})     { "spec" }
+  watch('spec/spec_helper.rb')  { "spec" }
+end

data/LICENCE ADDED

@@ -0,0 +1,24 @@
+Copyright (c) 2010, Shane Brinkman-Davis
+All rights reserved.
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+    * Redistributions of source code must retain the above copyright
+      notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+    * Neither the name of the <organization> nor the
+      names of its contributors may be used to endorse or promote products
+      derived from this software without specific prior written permission.
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

data/README.md ADDED

@@ -0,0 +1,244 @@
+Summary
+-------
+Babel Bridge lets you generate parsers 100% in Ruby code. It is a memoizing Parsing Expression Grammar (PEG) generator like Treetop, but it doesn't require special file-types or new syntax. The overall focus is on simplicity and usability over performance.
+Goals
+-----
+* Allow expression 100% in ruby
+* Productivity through Simplicity and Understandability first
+* Performance second
+Example
+-------
+``` ruby
+require "babel_bridge"
+class MyParser < BabelBridge::Parser
+  # foo rule: match "foo" optionally followed by the :bar rule
+  rule :foo, "foo", :bar?
+  # bar rule: match "bar"
+  rule :bar, "bar"
+end
+# create one or more instances of your parser
+parser = MyParser.new
+parser.parse "foo" # matches "foo"
+#  => FooNode1 > "foo"
+parser.parse "foobar" # matches "foobar"
+# => FooNode1
+#  "foo"
+#  BarNode1 > "bar"
+parser.parse "fribar" # fails to match
+# => nil
+parser.parse "foobarbar" # fails to match entire input
+# => nil
+```
+More elaborate examples:
+* [Parsing JSON the Not-So-Hard Way](http://www.essenceandartifact.com/2013/01/parsing-json-not-so-hard-way.html)
+* [How to Create a Turing Complete Programming Language in 40 Minutes](http://www.essenceandartifact.com/2012/09/how-to-create-turing-complete.html)
+Features
+--------
+``` ruby
+# returns the BabelBridge::Rule instance for that rule
+rule = MyParser[:foo]
+# => rule :foo, "foo", :bar?
+# nice human-readable view of the rule with extra info:
+rule.to_s
+# rule :foo, node_class: MyParser::FooNode
+#         variant_class: MyParser::FooNode1, pattern: "foo", :bar?
+# returns the code necessary for generating the rule and all its variants
+# (minus any class_eval code)
+rule.inspect
+# => rule :foo, "foo", :bar?
+# returns the Node class for a rule
+MyParser.node_class(:foo)
+# => MyParser::FooNode
+MyParser.node_class(:foo) do
+  # class_eval inside the rule's Node-class
+end
+# parses Text starting with the MyParser.root_rule
+# The root_rule is defined automatically by the first rule defined, but can be set by:
+#   MyParser.root_rule=v
+# where v is the symbol name of the rule or the actual rule object from MyParser[rule]
+text = "foobar"
+parser.parse(text)
+# do a one-time parse with :bar set as the root-rule
+text = "bar"
+parser.parse(text, :rule => :bar)
+# relax requirement to match entire input
+parser.parse "foobar and then something", :partial_match => true
+# parse failure
+parser.parse "foo is not immediately followed by bar"
+# human readable parser failure info
+puts parser.parser_failure_info
+```
+Parser failure info output:
+```
+Parsing error at line 1 column 4 offset 3
+Source:
+...
+foo<HERE> is not immediately followed by bar
+...
+Parser did not match entire input.
+Parse path at failure:
+  FooNode1
+Expecting:
+  "bar" BarNode1
+```
+NOTE: This is an evolving feature; this output is as of 0.5.1 and may not match the current version.
+Defining Rules
+--------------
+Inside the parser class, a rule is defined as follows:
+``` ruby
+class MyParser < BabelBridge::Parser
+  rule :rule_name, pattern
+end
+```
+Where:
+* :rule_name    is a symbol
+* pattern       see Patterns below
+You can also add new rules outside the class definition by:
+``` ruby
+MyParser.rule :rule_name, pattern
+```
+Patterns
+--------
+Patterns are a list of pattern elements, matched in order:
+Example:
+``` ruby
+rule :my_rule, "match", "this", "in", "order"  # matches "matchthisinorder"
+```
+Pattern Elements
+----------------
+A pattern element is either a basic pattern element or an extended pattern element (expressed as a hash). Internally, they are "compiled" into instances of PatternElement with optimized lambda functions for parsing.
+## Basic Pattern Elements (basic_element)
+``` ruby
+:my_rule      # matches the Rule named :my_rule
+:my_rule?     # optional: optionally matches Rule :my_rule
+:my_rule!     # negative: success only if it DOESN'T match Rule :my_rule
+"string"      # matches the string exactly
+/regex/       # matches the regex exactly
+```
+## Advanced Pattern Elements
+``` ruby
+# success if basic_element could be matched, but the input is not consumed
+could.match(pattern_element)
+# negative (two equivalent methods)
+dont.match(pattern_element)
+match!(pattern_element)
+# optional (two equivalent methods)
+optionally.match(pattern_element)
+match?(pattern_element)
+# match 1 or more
+many(pattern_element)
+# match 1 or more of one pattern_element delimited by another pattern_element
+many(pattern_element, delimiter_pattern_element)
+# match 0 or more
+many?(pattern_element)
+# An array of patterns tells BB to match those patterns in order ("and" matching)
+[pattern_element_a, pattern_element_b, pattern_element_c, ...]
+# match any one of the listed patterns ("or" matching)
+any(pattern_element_a, pattern_element_b, pattern_element_c, ...)
+# optionally match any of the patterns
+any?(pattern_element_a, pattern_element_b, pattern_element_c, ...)
+# don't match any of the patterns
+any!(pattern_element_a, pattern_element_b, pattern_element_c, ...)
+```
+## Custom Pattern Element Parser
+Custom pattern elements are not generally needed, but for certain patterns, particularly context-sensitive ones, we provide a way to define them.
+``` ruby
+class MyParser < BabelBridge::Parser
+  # custom parser to match an all upper-case word followed by any number of characters before that word is repeated
+  rule :foo, (custom_parser do |parent_node|
+    offset = parent_node.next
+    src = parent_node.src
+    # Note, the \A anchors the search at the beginning of the string
+    if src[offset..-1].index(/\A[A-Z]+/) == 0
+      endpattern=$~.to_s
+      if i = src.index(endpattern, offset + endpattern.length)
+        range = offset..(i + endpattern.length)
+        BabelBridge::TerminalNode.new(parent_node, range, "endpattern")
+      end
+    end
+  end)
+end
+parser = MyParser.new
+parser.parse "END this is in the middle END"
+# => FooNode1 > "END this is in the middle END"
+parser.parse "DRUID this is in the middle DRUID"
+# => FooNode1 > "DRUID this is in the middle DRUID"
+parser.parse "DRUID this is in the middle DRUI"
+# => nil
+```
+Structure
+---------
+* Each Rule defines a subclass of Node
+* Each RuleVariant defines a subclass of the parent Rule's node-class
+Therefore you can easily define code to be shared across all variants as well as define code specific to one variant.
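
The Structure notes above can be pictured with a small class hierarchy. The sketch below uses plain Ruby only and does not load babel_bridge; `Node`, `NumberNode`, `NumberNode1`, and `NumberNode2` are hypothetical stand-ins for the classes a parser would generate for a rule and its variants, named after the `FooNode` / `FooNode1` pattern shown in the README. The `:number` rule with a decimal and a hex variant is an assumption made for illustration.

```ruby
# Hypothetical stand-in for the base node class; not BabelBridge::Node.
class Node
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

# One class per rule: code here is shared by every variant of the rule.
class NumberNode < Node
  def double
    value * 2
  end
end

# One subclass per variant: code here is specific to that variant.
class NumberNode1 < NumberNode
  # Variant 1 (assumed): decimal literal
  def value
    text.to_i
  end
end

class NumberNode2 < NumberNode
  # Variant 2 (assumed): hex literal such as "0x1f"
  def value
    text.hex
  end
end

puts NumberNode1.new("21").double
puts NumberNode2.new("0x1f").double
```

In the gem itself, shared code of this kind would presumably be added through the `MyParser.node_class(:rule_name) do ... end` form shown in the Features section, which evaluates its block inside the rule's node class.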