parslet 0.9.0 → 0.10.1

data/Gemfile CHANGED
@@ -1,7 +1,11 @@
1
1
  # A sample Gemfile
2
2
  source "http://rubygems.org"
3
3
 
4
+ gem 'blankslate', '>= 2.1.2.3'
5
+
4
6
  group :development do
5
7
  gem 'rspec'
6
8
  gem 'flexmock'
9
+
10
+ gem 'sdoc'
7
11
  end
data/HISTORY.txt CHANGED
@@ -1,4 +1,27 @@
1
- = 0.9.0 / ???
1
+ = 0.10.1 / ???
2
+
3
+ + Allow match['a-z'], shortcut for match('[a-z]')
4
+
5
+ ! Fixed output inconsistencies (behaviour in connection to 'maybe')
6
+
7
+ = 0.10.0 / 22Nov2010
8
+
9
+ + Parslet::Transform now takes a block on initialisation, wherein you can
10
+ define all the rules directly.
11
+
12
+ + Parslet::Transform now only passes a hash to the block during transform
13
+ when its arity is 1. Otherwise all hash contents are bound as local
14
+ variables.
15
+
16
+ + Both inline and other documentation have been improved.
17
+
18
+ + You can now use 'subtree(:x)' to bind any subtree to x during tree pattern
19
+ matching.
20
+
21
+ + Transform classes can now include rules into class definition. This makes
22
+ Parser and Transformer behave the same.
23
+
24
+ = 0.9.0 / 28Oct2010
2
25
  * More of everything: Examples, documentation, etc...
3
26
 
4
27
  * Breaking change: Ruby's binary or ('|') is now used for alternatives,
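The 0.10.0 entry above says a transform block receives the bindings hash only when its arity is 1, and that otherwise the hash contents are bound as local names. The following is a minimal sketch of how such a dispatch could work. It is not parslet's implementation: `BindingContext` and `call_rule_block` are hypothetical names introduced only for illustration.

```ruby
# Hypothetical sketch, not parslet's code: hands pattern bindings to a rule
# block either as one hash argument or as names that read like locals.
class BindingContext
  def initialize(bindings)
    bindings.each do |name, value|
      # Each binding becomes a reader method on this object, so a block
      # evaluated against it can write `x` instead of `d[:x]`.
      define_singleton_method(name) { value }
    end
  end
end

def call_rule_block(bindings, &block)
  if block.arity == 1
    # The block asked for exactly one argument: pass the whole hash.
    block.call(bindings)
  else
    # Otherwise evaluate the block with the bindings visible as names.
    BindingContext.new(bindings).instance_eval(&block)
  end
end

with_hash   = call_rule_block(:x => "foo") { |d| "got #{d[:x]}" }  # "got foo"
with_locals = call_rule_block(:x => "foo") { "got #{x}" }          # "got foo"
```

Both calls produce the same string. The second form matches the style of the new README synopsis, where the rule block refers to `x` directly.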
data/README CHANGED
@@ -1,48 +1,17 @@
1
1
  INTRODUCTION
2
2
 
3
- A small library that implements a PEG grammar. PEG means Parsing Expression
4
- Grammars [1]. These are a different kind of grammars that recognize almost the
5
- same languages as your conventional LR parser, except that they are easier to
6
- work with, since they haven't been conceived for generation, but for
7
- recognition of languages. You can read the founding paper of the field by
8
- Bryan Ford here [2].
9
-
10
- Other Ruby projects that work on the same topic are:
11
- http://wiki.github.com/luikore/rsec/
12
- http://github.com/mjijackson/citrus
13
- http://github.com/nathansobo/treetop
14
-
15
- My goal here was to see how a parser/parser generator should be constructed to
16
- allow clean AST construction and good error handling. It seems to me that most
17
- often, parser generators only handle the success-case and forget about
18
- debugging and error generation.
19
-
20
- More specifically, this library is motivated by one of my compiler projects. I
21
- started out using 'treetop' (see the link above), but found it unusable. It
22
- was lacking in
23
-
24
- * error reporting: Hard to see where a grammar fails.
25
-
26
- * stability of generated trees: Intermediary trees were dictated by the
27
- grammar. It was hard to define invariants in that system - what was
28
- convenient when writing the grammar often wasn't in subsequent stages.
29
-
30
- * clarity of parser code: The parser code is generated and is very hard
31
- to read. Add that to the first point to understand my pain.
32
-
33
- So parslet tries to be different. It doesn't generate the parser, but instead
34
- defines it in a DSL which is very close to what you find in [2]. A successful
35
- parse then generates a parser tree consisting entirely of hashes and arrays
36
- and strings (read: instable). This parser tree can then be converted to a real
37
- AST (read: stable) using a pattern matcher that is also part of this library.
38
-
39
- Error reporting is another area where parslet excels: It is able to print not
40
- only the error you are used to seeing ('Parse failed because of REASON at line
41
- 1 and char 2'), but also prints what led to that failure in the form of a
42
- tree (#error_tree method).
43
-
44
- [1] http://en.wikipedia.org/wiki/Parsing_expression_grammar
45
- [2] http://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf
3
+ Parslet makes developing complex parsers easy. It does so by
4
+
5
+ * providing the best *error reporting* possible
6
+ * *not generating* reams of code for you to debug
7
+
8
+ Parslet takes the long way around to make *your job* easier. It allows for
9
+ incremental language construction. Often, you start out small, implementing
10
+ the atoms of your language first; _parslet_ takes pride in making this
11
+ possible.
12
+
13
+ Eager to try this out? Please see the associated web site:
14
+ http://kschiess.github.com/parslet
46
15
 
47
16
  SYNOPSIS
48
17
 
@@ -56,7 +25,7 @@ SYNOPSIS
56
25
  str('"').absnt? >> any
57
26
  ).repeat.as(:string) >>
58
27
  str('"')
59
-
28
+
60
29
  # Parse the string and capture parts of the interpretation (:string above)
61
30
  tree = parser.parse(%Q{
62
31
  "This is a \\"String\\" in which you can escape stuff"
@@ -64,36 +33,24 @@ SYNOPSIS
64
33
 
65
34
  tree # => {:string=>"This is a \\\"String\\\" in which you can escape stuff"}
66
35
 
67
- # Here's how you can grab results from that tree:
36
+ # Here's how you can grab results from that tree, two methods:
37
+
38
+ # 1)
68
39
  Pattern.new(:string => simple(:x)).each_match(tree) do |dictionary|
69
- puts "String contents: #{dictionary[:x]}"
40
+ puts "String contents (method 1): #{dictionary[:x]}"
70
41
  end
71
-
72
- # Here's how to transform that tree into something else ----------------------
73
42
 
74
- # Defines the classes of our new Syntax Tree
75
- class StringLiteral < Struct.new(:text); end
76
-
77
- # Defines a set of transformation rules on tree leaves
78
- transform = Transform.new
79
- transform.rule(:string => simple(:x)) { |d| StringLiteral.new(d[:x]) }
80
-
81
- # Transforms the tree
82
- transform.apply(tree)
83
-
84
- # => #<struct StringLiteral text="This is a \\\"String\\\" ... escape stuff">
43
+ # 2)
44
+ transform = Parslet::Transform.new do
45
+ rule(:string => simple(:x)) {
46
+ puts "String contents (method 2): #{x}" }
47
+ end
48
+ transform.apply(tree)
85
49
 
86
50
  COMPATIBILITY
87
51
 
88
52
  This library should work with both ruby 1.8 and ruby 1.9.
89
53
 
90
- AUTHORS
91
-
92
- My gigantous thanks go to the following cool guys and gals that help make this
93
- rock:
94
-
95
- Florian Hanke <florian.hanke@gmail.com>
96
-
97
54
  STATUS
98
55
 
99
56
  On the road to 1.0; improving documentation, packaging and upgrading to rspec2.
data/Rakefile CHANGED
@@ -18,7 +18,7 @@ spec = Gem::Specification.new do |s|
18
18
 
19
19
  # Change these as appropriate
20
20
  s.name = "parslet"
21
- s.version = "0.9.0"
21
+ s.version = "0.10.1"
22
22
  s.summary = "Parser construction library with great error reporting in Ruby."
23
23
  s.author = "Kaspar Schiess"
24
24
  s.email = "kaspar.schiess@absurd.li"
@@ -34,7 +34,7 @@ spec = Gem::Specification.new do |s|
34
34
 
35
35
  # If you want to depend on other gems, add them here, along with any
36
36
  # relevant versions
37
- # s.add_dependency("some_other_gem", "~> 0.1.0")
37
+ s.add_dependency("blankslate", "~> 2.1.2.3")
38
38
 
39
39
  # If your tests use any gems, include them here
40
40
  s.add_development_dependency("rspec")
@@ -60,11 +60,15 @@ end
60
60
 
61
61
  task :package => :gemspec
62
62
 
63
+ require 'sdoc'
64
+
63
65
  # Generate documentation
64
- Rake::RDocTask.new do |rd|
65
- rd.main = "README"
66
- rd.rdoc_files.include("README", "lib/**/*.rb")
67
- rd.rdoc_dir = "rdoc"
66
+ Rake::RDocTask.new do |rdoc|
67
+ rdoc.options << '--fmt' << 'shtml' # explicitly set shtml generator
68
+ rdoc.template = 'direct' # lighter template used on railsapi.com
69
+ rdoc.main = "README"
70
+ rdoc.rdoc_files.include("README", "lib/**/*.rb")
71
+ rdoc.rdoc_dir = "rdoc"
68
72
  end
69
73
 
70
74
  desc 'Clear out RDoc and generated packages'
data/lib/parslet.rb CHANGED
@@ -4,14 +4,9 @@ require 'stringio'
4
4
  #
5
5
  # require 'parslet'
6
6
  #
7
- # class MyParser
8
- # include Parslet
9
- #
7
+ # class MyParser < Parslet::Parser
10
8
  # rule(:a) { str('a').repeat }
11
- #
12
- # def parse(str)
13
- # a.parse(str)
14
- # end
9
+ # root(:a)
15
10
  # end
16
11
  #
17
12
  # pp MyParser.new.parse('aaaa') # => 'aaaa'
@@ -36,138 +31,19 @@ require 'stringio'
36
31
  # and use the second stage to isolate the rest of your code from the changes
37
32
  # you've effected.
38
33
  #
39
- # = Language Atoms
40
- #
41
- # PEG-style grammars build on a very small number of atoms, or parslets. In
42
- # fact, only three types of parslets exist. Here's how to match a string:
43
- #
44
- # str('a string')
45
- #
46
- # This matches the string 'a string' literally and nothing else. If your input
47
- # doesn't contain the string, it will fail. Here's how to match a character
48
- # set:
49
- #
50
- # match('[abc]')
51
- #
52
- # This matches 'a', 'b' or 'c'. The string matched will always have a length
53
- # of 1; to match longer strings, please see the title below. The last parslet
54
- # of the three is 'any':
55
- #
56
- # any
57
- #
58
- # 'any' functions like the dot in regular expressions - it matches any single
59
- # character.
60
- #
61
- # = Combination and Repetition
62
- #
63
- # Parslets only get useful when combined to grammars. To combine one parslet
64
- # with the other, you have 4 kinds of methods available: repeat and maybe, >>
65
- # (sequence), | (alternation), absnt? and prsnt?.
66
- #
67
- # str('a').repeat # any number of 'a's, including 0
68
- # str('a').maybe # maybe there'll be an 'a', maybe not
69
- #
70
- # Parslets can be joined using >>. This means: Match the left parslet, then
71
- # match the right parslet.
72
- #
73
- # str('a') >> str('b') # would match 'ab'
74
- #
75
- # Keep in mind that all combination and repetition operators themselves return
76
- # a parslet. You can combine the result again:
77
- #
78
- # ( str('a') >> str('b') ) >> str('c') # would match 'abc'
79
- #
80
- # The slash ('|') indicates alternatives:
81
- #
82
- # str('a') | str('b') # would match 'a' OR 'b'
83
- #
84
- # The left side of an alternative is matched first; if it matches, the right
85
- # side is never looked at.
86
- #
87
- # The absnt? and prsnt? qualifiers allow looking at input without consuming
88
- # it:
89
- #
90
- # str('a').absnt? # will match if at the current position there is an 'a'.
91
- # str('a').absnt? >> str('b') # check for 'a' then match 'b'
92
- #
93
- # This means that the second example will not match any input; when the second
94
- # part is parsed, the first part has asserted the presence of 'a', and thus
95
- # str('b') cannot match. The prsnt? method is the opposite of absnt?, it
96
- # asserts presence.
97
- #
98
- # More documentation on these methods can be found in Parslets::Atoms::Base.
99
- #
100
- # = Intermediary Parse Trees
101
- #
102
- # As you have probably seen above, you can hand input (strings or StringIOs) to
103
- # your parslets like this:
104
- #
105
- # parslet.parse(str)
106
- #
107
- # This returns an intermediary parse tree or raises an exception
108
- # (Parslet::ParseFailed) when the input is not well formed.
109
- #
110
- # Intermediary parse trees are essentially just Plain Old Ruby Objects. (PORO
111
- # technology as we call it.) Parslets try very hard to return sensible stuff;
112
- # it is quite easy to use the results for the later stages of your program.
113
- #
114
- # Here a few examples and what their intermediary tree looks like:
115
- #
116
- # str('foo').parse('foo') # => 'foo'
117
- # (str('f') >> str('o') >> str('o')).parse('foo') # => 'foo'
118
- #
119
- # Naming parslets
120
- #
121
- # Construction of lambda blocks
122
- #
123
- # = Intermediary Tree transformation
124
- #
125
- # The intermediary parse tree by itself is most often not very useful. Its
126
- # form is volatile; changing your parser in the slightest might produce
127
- # profound changes in the generated trees.
128
- #
129
- # Generally you will want to construct a more stable tree using your own
130
- # carefully crafted representation of the domain. Parslet provides you with
131
- # an elegant way of transmogrifying your intermediary tree into the output
132
- # format you choose. This is achieved by transformation rules such as this
133
- # one:
134
- #
135
- # transform.rule(:literal => {:string => :_x}) { |d|
136
- # StringLit.new(*d.values) }
137
- #
138
- # The above rule will transform a subtree looking like this:
139
- #
140
- # :literal
141
- # |
142
- # :string
143
- # |
144
- # "somestring"
145
- #
146
- # into just this:
147
- #
148
- # StringLit
149
- # value: "somestring"
150
- #
151
- #
152
- # = Further documentation
153
- #
154
- # Please see the examples subdirectory of the distribution for more examples.
155
- # Check out 'rooc' (github.com/kschiess/rooc) as well - it uses parslet for
156
- # compiler construction.
157
- #
158
34
  module Parslet
159
35
  def self.included(base)
160
36
  base.extend(ClassMethods)
161
37
  end
162
38
 
163
- # This is raised when the parse failed to match or to consume all its input.
164
- # It contains the message that should be presented to the user. If you want
165
- # to display more error explanation, you can print the #error_tree that is
39
+ # Raised when the parse failed to match or to consume all its input. It
40
+ # contains the message that should be presented to the user. If you want to
41
+ # display more error explanation, you can print the #error_tree that is
166
42
  # stored in the parslet. This is a graphical representation of what went
167
43
  # wrong.
168
44
  #
169
45
  # Example:
170
- #
46
+ #
171
47
  # begin
172
48
  # parslet.parse(str)
173
49
  # rescue Parslet::ParseFailed => failure
@@ -181,6 +57,7 @@ module Parslet
181
57
  # Define the parsers #root function. This is the place where you start
182
58
  # parsing; if you have a rule for 'file' that describes what should be
183
59
  # in a file, this would be your root declaration:
60
+ #
184
61
  # class Parser
185
62
  # root :file
186
63
  # rule(:file) { ... }
@@ -205,9 +82,9 @@ module Parslet
205
82
  end
206
83
  end
207
84
 
208
- # Define an entity for the parser. This generates a method of the same name
209
- # that can be used as part of other patterns. Those methods can be freely
210
- # mixed in your parser class with real ruby methods.
85
+ # Define an entity for the parser. This generates a method of the same
86
+ # name that can be used as part of other patterns. Those methods can be
87
+ # freely mixed in your parser class with real ruby methods.
211
88
  #
212
89
  # Example:
213
90
  #
@@ -233,6 +110,14 @@ module Parslet
233
110
  end
234
111
  end
235
112
 
113
+ # Allows for delayed construction of #match.
114
+ #
115
+ class DelayedMatchConstructor
116
+ def [](str)
117
+ Atoms::Re.new("[" + str + "]")
118
+ end
119
+ end
120
+
236
121
  # Returns an atom matching a character class. This is essentially a regular
237
122
  # expression, but you should only match a single character.
238
123
  #
@@ -241,8 +126,10 @@ module Parslet
241
126
  # match('[ab]') # will match either 'a' or 'b'
242
127
  # match('[\n\s]') # will match newlines and spaces
243
128
  #
244
- def match(obj)
245
- Atoms::Re.new(obj)
129
+ def match(str=nil)
130
+ return DelayedMatchConstructor.new unless str
131
+
132
+ return Atoms::Re.new(str)
246
133
  end
247
134
  module_function :match
248
135
 
@@ -263,7 +150,19 @@ module Parslet
263
150
  Atoms::Re.new('.')
264
151
  end
265
152
  module_function :any
266
-
153
+
154
+ # A special kind of atom that allows embedding whole treetop expressions
155
+ # into parslet construction.
156
+ #
157
+ # Example:
158
+ #
159
+ # exp(%Q("a" "b"?)) # => returns the same as str('a') >> str('b').maybe
160
+ #
161
+ def exp(str)
162
+ Parslet::Expression.new(str).to_parslet
163
+ end
164
+ module_function :exp
165
+
267
166
  # Returns a placeholder for a tree transformation that will only match a
268
167
  # sequence of elements. The +symbol+ you specify will be the key for the
269
168
  # matched sequence in the returned dictionary.
@@ -292,10 +191,24 @@ module Parslet
292
191
  Pattern::SimpleBind.new(symbol)
293
192
  end
294
193
  module_function :simple
194
+
195
+ # Returns a placeholder for tree transformation patterns that will match
196
+ # any kind of subtree.
197
+ #
198
+ # Example:
199
+ #
200
+ # { :expression => subtree(:exp) }
201
+ #
202
+ def subtree(symbol)
203
+ Pattern::SubtreeBind.new(symbol)
204
+ end
205
+
206
+ autoload :Expression, 'parslet/expression'
295
207
  end
296
208
 
297
209
  require 'parslet/error_tree'
298
210
  require 'parslet/atoms'
299
211
  require 'parslet/pattern'
300
212
  require 'parslet/pattern/binding'
301
- require 'parslet/transform'
213
+ require 'parslet/transform'
214
+ require 'parslet/parser'
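The `DelayedMatchConstructor` added in this file is what makes `match['a-z']` a shortcut for `match('[a-z]')`: calling `match` with no argument returns an object whose `[]` method wraps the string in brackets. The sketch below reproduces that mechanism without depending on parslet. `ReSketch`, `DelayedMatchSketch` and `match_sketch` are stand-in names, and `ReSketch#matches?` is an illustrative helper that does not exist in the library.

```ruby
# Stand-in for Parslet::Atoms::Re, reduced to the pattern it stores.
class ReSketch
  attr_reader :pattern

  def initialize(pattern)
    @pattern = pattern
  end

  # Illustrative helper: true when the single character +char+ matches.
  def matches?(char)
    !!(Regexp.new("\\A(?:#{pattern})\\z", Regexp::MULTILINE) =~ char)
  end
end

# Returned when no argument is given; completes construction on #[].
class DelayedMatchSketch
  def [](str)
    ReSketch.new("[" + str + "]")
  end
end

def match_sketch(str = nil)
  return DelayedMatchSketch.new unless str

  ReSketch.new(str)
end

match_sketch['a-z'].pattern    # "[a-z]"
match_sketch('[a-z]').pattern  # "[a-z]"
```

Ruby parses `match_sketch['a-z']` as a call with no arguments followed by `[]` on the result, which is why the optional parameter is needed.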
data/lib/parslet/atoms.rb CHANGED
@@ -1,4 +1,7 @@
1
1
  module Parslet::Atoms
2
+ # The precedence module controls parenthesis during the #inspect printing
3
+ # of parslets. It is not relevant to other aspects of the parsing.
4
+ #
2
5
  module Precedence
3
6
  prec = 0
4
7
  BASE = (prec+=1) # everything else
@@ -9,484 +12,14 @@ module Parslet::Atoms
9
12
  OUTER = (prec+=1) # printing is done here.
10
13
  end
11
14
 
12
- # Base class for all parslets, handles orchestration of calls and implements
13
- # a lot of the operator and chaining methods.
14
- #
15
- class Base
16
- def parse(io)
17
- if io.respond_to? :to_str
18
- io = StringIO.new(io)
19
- end
20
-
21
- result = apply(io)
22
-
23
- # If we haven't consumed the input, then the pattern doesn't match. Try
24
- # to provide a good error message (even asking down below)
25
- unless io.eof?
26
- # Do we know why we stopped matching input? If yes, that's a good
27
- # error to fail with. Otherwise just report that we cannot consume the
28
- # input.
29
- if cause
30
- raise Parslet::ParseFailed, "Unconsumed input, maybe because of this: #{cause}"
31
- else
32
- error(io, "Don't know what to do with #{io.string[io.pos,100]}")
33
- end
34
- end
35
-
36
- return flatten(result)
37
- end
38
-
39
- def apply(io)
40
- # p [:start, self, io.string[io.pos, 10]]
41
-
42
- old_pos = io.pos
43
-
44
- # p [:try, self, io.string[io.pos, 20]]
45
- begin
46
- r = try(io)
47
- # p [:return_from, self, flatten(r)]
48
- @last_cause = nil
49
- return r
50
- rescue Parslet::ParseFailed => ex
51
- # p [:failing, self, io.string[io.pos, 20]]
52
- io.pos = old_pos; raise ex
53
- end
54
- end
55
-
56
- def repeat(min=0, max=nil)
57
- Repetition.new(self, min, max)
58
- end
59
- def maybe
60
- Repetition.new(self, 0, 1, :maybe)
61
- end
62
- def >>(parslet)
63
- Sequence.new(self, parslet)
64
- end
65
- def |(parslet)
66
- Alternative.new(self, parslet)
67
- end
68
- def absnt?
69
- Lookahead.new(self, false)
70
- end
71
- def prsnt?
72
- Lookahead.new(self, true)
73
- end
74
- def as(name)
75
- Named.new(self, name)
76
- end
77
-
78
- def flatten(value)
79
- # Passes through everything that isn't an array of things
80
- return value unless value.instance_of? Array
81
-
82
- # Extracts the s-expression tag
83
- tag, *tail = value
84
-
85
- # Merges arrays:
86
- result = tail.
87
- map { |e| flatten(e) } # first flatten each element
88
-
89
- case tag
90
- when :sequence
91
- return flatten_sequence(result)
92
- when :maybe
93
- return result.first
94
- when :repetition
95
- return flatten_repetition(result)
96
- end
97
-
98
- fail "BUG: Unknown tag #{tag.inspect}."
99
- end
100
- def flatten_sequence(list)
101
- list.inject('') { |r, e| # and then merge flat elements
102
- case [r, e].map { |o| o.class }
103
- when [Hash, Hash] # two keyed subtrees: make one
104
- warn_about_duplicate_keys(r, e)
105
- r.merge(e)
106
- # a keyed tree and an array (push down)
107
- when [Hash, Array]
108
- [r] + e
109
- when [Array, Hash]
110
- r + [e]
111
- when [String, String]
112
- r << e
113
- else
114
- if r.instance_of? Hash
115
- r # Ignore e, since its not a hash we can merge
116
- else
117
- e # Whatever e is at this point, we keep it
118
- end
119
- end
120
- }
121
- end
122
- def flatten_repetition(list)
123
- if list.any? { |e| e.instance_of?(Hash) }
124
- # If keyed subtrees are in the array, we'll want to discard all
125
- # strings inbetween. To keep them, name them.
126
- return list.select { |e| e.instance_of?(Hash) }
127
- end
128
-
129
- if list.any? { |e| e.instance_of?(Array) }
130
- # If any arrays are nested in this array, flatten all arrays to this
131
- # level.
132
- return list.
133
- select { |e| e.instance_of?(Array) }.
134
- flatten(1)
135
- end
136
-
137
- # If there are only strings, concatenate them and return that.
138
- list.inject('') { |s,e| s<<(e||'') }
139
- end
140
-
141
- def self.precedence(prec)
142
- define_method(:precedence) { prec }
143
- end
144
- precedence Precedence::BASE
145
- def to_s(outer_prec)
146
- if outer_prec < precedence
147
- "("+to_s_inner(precedence)+")"
148
- else
149
- to_s_inner(precedence)
150
- end
151
- end
152
- def inspect
153
- to_s(Precedence::OUTER)
154
- end
155
-
156
- # Cause should return the current best approximation of this parslet
157
- # of what went wrong with the parse. Not relevant if the parse succeeds,
158
- # but needed for clever error reports.
159
- #
160
- def cause
161
- @last_cause
162
- end
163
-
164
- # Error tree returns what went wrong here plus what went wrong inside
165
- # subexpressions as a tree. The error stored for this node will be equal
166
- # with #cause.
167
- #
168
- def error_tree
169
- Parslet::ErrorTree.new(self) if cause?
170
- end
171
- def cause?
172
- not @last_cause.nil?
173
- end
174
- private
175
- # Report/raise a parse error with the given message, printing the current
176
- # position as well. Appends 'at line X char Y.' to the message you give.
177
- # If +pos+ is given, it is used as the real position the error happened,
178
- # correcting the io's current position.
179
- #
180
- def error(io, str, pos=nil)
181
- pre = io.string[0..(pos||io.pos)]
182
- lines = Array(pre.lines)
183
-
184
- if lines.empty?
185
- formatted_cause = str
186
- else
187
- pos = lines.last.length
188
- formatted_cause = "#{str} at line #{lines.count} char #{pos}."
189
- end
190
-
191
- @last_cause = formatted_cause
192
-
193
- raise Parslet::ParseFailed, formatted_cause, nil
194
- end
195
- def warn_about_duplicate_keys(h1, h2)
196
- d = h1.keys & h2.keys
197
- unless d.empty?
198
- warn "Duplicate subtrees while merging result of \n #{self.inspect}\nonly the values"+
199
- " of the latter will be kept. (keys: #{d.inspect})"
200
- end
201
- end
202
- end
203
-
204
- class Named < Base
205
- attr_reader :parslet, :name
206
- def initialize(parslet, name)
207
- @parslet, @name = parslet, name
208
- end
209
-
210
- def apply(io)
211
- value = parslet.apply(io)
212
-
213
- produce_return_value value
214
- end
215
-
216
- def to_s_inner(prec)
217
- "#{name}:#{parslet.to_s(prec)}"
218
- end
219
-
220
- def error_tree
221
- parslet.error_tree
222
- end
223
- private
224
- def produce_return_value(val)
225
- { name => flatten(val) }
226
- end
227
- end
228
-
229
- class Lookahead < Base
230
- attr_reader :positive
231
- attr_reader :bound_parslet
232
-
233
- def initialize(bound_parslet, positive=true)
234
- # Model positive and negative lookahead by testing this flag.
235
- @positive = positive
236
- @bound_parslet = bound_parslet
237
- end
238
-
239
- def try(io)
240
- pos = io.pos
241
- begin
242
- bound_parslet.apply(io)
243
- rescue Parslet::ParseFailed
244
- return fail(io)
245
- ensure
246
- io.pos = pos
247
- end
248
- return success(io)
249
- end
250
-
251
- def fail(io)
252
- if positive
253
- error(io, "lookahead: #{bound_parslet.inspect} didn't match, but should have")
254
- else
255
- # TODO: Squash this down to nothing? Return value handling here...
256
- return nil
257
- end
258
- end
259
- def success(io)
260
- if positive
261
- return nil # see above, TODO
262
- else
263
- error(
264
- io,
265
- "negative lookahead: #{bound_parslet.inspect} matched, but shouldn't have")
266
- end
267
- end
268
-
269
- precedence Precedence::LOOKAHEAD
270
- def to_s_inner(prec)
271
- char = positive ? '&' : '!'
272
-
273
- "#{char}#{bound_parslet.to_s(prec)}"
274
- end
275
-
276
- def error_tree
277
- bound_parslet.error_tree
278
- end
279
- end
280
-
281
- class Alternative < Base
282
- attr_reader :alternatives
283
- def initialize(*alternatives)
284
- @alternatives = alternatives
285
- end
286
-
287
- def |(parslet)
288
- @alternatives << parslet
289
- self
290
- end
291
-
292
- def try(io)
293
- alternatives.each { |a|
294
- begin
295
- return a.apply(io)
296
- rescue Parslet::ParseFailed => ex
297
- end
298
- }
299
- # If we reach this point, all alternatives have failed.
300
- error(io, "Expected one of #{alternatives.inspect}.")
301
- end
302
-
303
- precedence Precedence::ALTERNATE
304
- def to_s_inner(prec)
305
- alternatives.map { |a| a.to_s(prec) }.join(' | ')
306
- end
307
-
308
- def error_tree
309
- Parslet::ErrorTree.new(self, *alternatives.
310
- map { |child| child.error_tree })
311
- end
312
- end
313
-
314
- # A sequence of parslets, matched from left to right. Denoted by '>>'
315
- #
316
- class Sequence < Base
317
- attr_reader :parslets
318
- def initialize(*parslets)
319
- @parslets = parslets
320
- end
321
-
322
- def >>(parslet)
323
- @parslets << parslet
324
- self
325
- end
326
-
327
- def try(io)
328
- [:sequence]+parslets.map { |p|
329
- # Save each parslet as potentially offending (raising an error).
330
- @offending_parslet = p
331
- p.apply(io)
332
- }
333
- rescue Parslet::ParseFailed
334
- error(io, "Failed to match sequence (#{self.inspect})")
335
- end
336
-
337
- precedence Precedence::SEQUENCE
338
- def to_s_inner(prec)
339
- parslets.map { |p| p.to_s(prec) }.join(' ')
340
- end
341
-
342
- def error_tree
343
- Parslet::ErrorTree.new(self).tap { |t|
344
- t.children << @offending_parslet.error_tree if @offending_parslet }
345
- end
346
- end
347
-
348
- class Repetition < Base
349
- attr_reader :min, :max, :parslet
350
- def initialize(parslet, min, max, tag=:repetition)
351
- @parslet = parslet
352
- @min, @max = min, max
353
- @tag = tag
354
- end
355
-
356
- def try(io)
357
- occ = 0
358
- result = [@tag] # initialize the result array with the tag (for flattening)
359
- loop do
360
- begin
361
- result << parslet.apply(io)
362
- occ += 1
363
-
364
- # If we're not greedy (max is defined), check if that has been
365
- # reached.
366
- return result if max && occ>=max
367
- rescue Parslet::ParseFailed => ex
368
- # Greedy matcher has produced a failure. Check if occ (which will
369
- # contain the number of sucesses) is in {min, max}.
370
- # p [:repetition, occ, min, max]
371
- error(io, "Expected at least #{min} of #{parslet.inspect}") if occ < min
372
- return result
373
- end
374
- end
375
- end
376
-
377
- precedence Precedence::REPETITION
378
- def to_s_inner(prec)
379
- minmax = "{#{min}, #{max}}"
380
- minmax = '?' if min == 0 && max == 1
381
-
382
- parslet.to_s(prec) + minmax
383
- end
384
-
385
- def cause
386
- # Either the repetition failed or the parslet inside failed to repeat.
387
- super || parslet.cause
388
- end
389
- def error_tree
390
- if cause?
391
- Parslet::ErrorTree.new(self, parslet.error_tree)
392
- else
393
- parslet.error_tree
394
- end
395
- end
396
- end
397
-
398
- # Matches a special kind of regular expression that only ever matches one
399
- # character at a time. Useful members of this family are: character ranges,
400
- # \w, \d, \r, \n, ...
401
- #
402
- class Re < Base
403
- attr_reader :match
404
- def initialize(match)
405
- @match = match
406
- end
407
-
408
- def try(io)
409
- r = Regexp.new(match, Regexp::MULTILINE)
410
- s = io.read(1)
411
- error(io, "Premature end of input") unless s
412
- error(io, "Failed to match #{match.inspect[1..-2]}") unless s.match(r)
413
- return s
414
- end
415
-
416
- def to_s_inner(prec)
417
- match.inspect[1..-2]
418
- end
419
- end
420
-
421
- # Matches a string of characters.
422
- #
423
- class Str < Base
424
- attr_reader :str
425
- def initialize(str)
426
- @str = str
427
- end
428
-
429
- def try(io)
430
- old_pos = io.pos
431
- s = io.read(str.size)
432
- error(io, "Premature end of input") unless s && s.size==str.size
433
- error(io, "Expected #{str.inspect}, but got #{s.inspect}", old_pos) \
434
- unless s==str
435
- return s
436
- end
437
-
438
- def to_s_inner(prec)
439
- "'#{str}'"
440
- end
441
- end
442
-
443
- # This wraps pieces of parslet definition and gives them a name. The wrapped
444
- # piece is lazily evaluated and cached. This has two purposes:
445
- #
446
- # a) Avoid infinite recursion during evaluation of the definition
447
- #
448
- # b) Be able to print things by their name, not by their sometimes
449
- # complicated content.
450
- #
451
- # You don't normally use this directly, instead you should generate it by
452
- # using the structuring method Parslet#rule.
453
- #
454
- class Entity < Base
455
- attr_reader :name, :context, :block
456
- def initialize(name, context, block)
457
- super()
458
-
459
- @name = name
460
- @context = context
461
- @block = block
462
- end
463
-
464
- def try(io)
465
- parslet.apply(io)
466
- end
467
-
468
- def parslet
469
- @parslet ||= context.instance_eval(&block).tap { |p|
470
- raise_not_implemented unless p
471
- }
472
- end
473
-
474
- def to_s_inner(prec)
475
- name.to_s.upcase
476
- end
477
-
478
- def error_tree
479
- parslet.error_tree
480
- end
481
-
482
- private
483
- def raise_not_implemented
484
- trace = caller.reject {|l| l =~ %r{#{Regexp.escape(__FILE__)}}} # blatantly stolen from dependencies.rb in activesupport
485
- exception = NotImplementedError.new("rule(#{name.inspect}) { ... } returns nil. Still not implemented, but already used?")
486
- exception.set_backtrace(trace)
487
-
488
- raise exception
489
- end
490
- end
15
+ autoload :Base, 'parslet/atoms/base'
16
+ autoload :Named, 'parslet/atoms/named'
17
+ autoload :Lookahead, 'parslet/atoms/lookahead'
18
+ autoload :Alternative, 'parslet/atoms/alternative'
19
+ autoload :Sequence, 'parslet/atoms/sequence'
20
+ autoload :Repetition, 'parslet/atoms/repetition'
21
+ autoload :Re, 'parslet/atoms/re'
22
+ autoload :Str, 'parslet/atoms/str'
23
+ autoload :Entity, 'parslet/atoms/entity'
491
24
  end
492
25
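The comment added to `Precedence` states that the module only controls parentheses during `#inspect`. The removed `Base#to_s` shows the rule: an atom wraps itself in parentheses when the enclosing precedence is lower than its own. A self-contained sketch of that rule follows. The class names are hypothetical, and the numeric ordering of the constants between BASE and OUTER is an assumption, since the hunk does not show those lines.

```ruby
# Assumed ordering; only BASE (first) and OUTER (last) are visible in the diff.
module PrecSketch
  BASE       = 1
  REPETITION = 2
  SEQUENCE   = 3
  ALTERNATE  = 4
  OUTER      = 5
end

class NodeSketch
  # Same rule as the removed Base#to_s: parenthesize when the caller's
  # precedence is lower than this node's precedence.
  def to_s(outer_prec)
    inner = to_s_inner(precedence)
    outer_prec < precedence ? "(#{inner})" : inner
  end

  def inspect
    to_s(PrecSketch::OUTER)
  end
end

class StrSketch < NodeSketch
  def initialize(str); @str = str; end
  def precedence; PrecSketch::BASE; end
  def to_s_inner(_prec); "'#{@str}'"; end
end

class SeqSketch < NodeSketch
  def initialize(*parts); @parts = parts; end
  def precedence; PrecSketch::SEQUENCE; end
  def to_s_inner(prec); @parts.map { |p| p.to_s(prec) }.join(' '); end
end

class AltSketch < NodeSketch
  def initialize(*alts); @alts = alts; end
  def precedence; PrecSketch::ALTERNATE; end
  def to_s_inner(prec); @alts.map { |a| a.to_s(prec) }.join(' | '); end
end

a, b, c = StrSketch.new('a'), StrSketch.new('b'), StrSketch.new('c')
SeqSketch.new(a, AltSketch.new(b, c)).inspect  # "'a' ('b' | 'c')"
AltSketch.new(SeqSketch.new(a, b), c).inspect  # "'a' 'b' | 'c'"
```

An alternative nested inside a sequence gets parentheses, while a sequence nested inside an alternative does not, which matches how PEG notation is usually read.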