parslet 0.9.0 → 0.10.1

data/Gemfile CHANGED
@@ -1,7 +1,11 @@
 # A sample Gemfile
 source "http://rubygems.org"
 
+gem 'blankslate', '>= 2.1.2.3'
+
 group :development do
   gem 'rspec'
   gem 'flexmock'
+
+  gem 'sdoc'
 end
data/HISTORY.txt CHANGED
@@ -1,4 +1,27 @@
-= 0.9.0 / ???
+= 0.10.1 / ???
+
+  + Allow match['a-z'], shortcut for match('[a-z]')
+
+  ! Fixed output inconsistencies (behaviour in connection to 'maybe')
+
+= 0.10.0 / 22Nov2010
+
+  + Parslet::Transform now takes a block on initialisation, wherein you can
+    define all the rules directly.
+
+  + Parslet::Transform now only passes a hash to the block during transform
+    when its arity is 1. Otherwise all hash contents as bound as local
+    variables.
+
+  + Both inline and other documentation have been improved.
+
+  + You can now use 'subtree(:x)' to bind any subtree to x during tree pattern
+    matching.
+
+  + Transform classes can now include rules into class definition. This makes
+    Parser and Transformer behave the same.
+
+= 0.9.0 / 28Oct2010
   * More of everything: Examples, documentation, etc...
 
   * Breaking change: Ruby's binary or ('|') is now used for alternatives,
data/README CHANGED
@@ -1,48 +1,17 @@
 INTRODUCTION
 
-A small library that implements a PEG grammar. PEG means Parsing Expression
-Grammars [1]. These are a different kind of grammars that recognize almost the
-same languages as your conventional LR parser, except that they are easier to
-work with, since they haven't been conceived for generation, but for
-recognition of languages. You can read the founding paper of the field by
-Bryan Ford here [2].
-
-Other Ruby projects that work on the same topic are:
-http://wiki.github.com/luikore/rsec/
-http://github.com/mjijackson/citrus
-http://github.com/nathansobo/treetop
-
-My goal here was to see how a parser/parser generator should be constructed to
-allow clean AST construction and good error handling. It seems to me that most
-often, parser generators only handle the success-case and forget about
-debugging and error generation.
-
-More specifically, this library is motivated by one of my compiler projects. I
-started out using 'treetop' (see the link above), but found it unusable. It
-was lacking in
-
-* error reporting: Hard to see where a grammar fails.
-
-* stability of generated trees: Intermediary trees were dictated by the
-grammar. It was hard to define invariants in that system - what was
-convenient when writing the grammar often wasn't in subsequent stages.
-
-* clarity of parser code: The parser code is generated and is very hard
-to read. Add that to the first point to understand my pain.
-
-So parslet tries to be different. It doesn't generate the parser, but instead
-defines it in a DSL which is very close to what you find in [2]. A successful
-parse then generates a parser tree consisting entirely of hashes and arrays
-and strings (read: instable). This parser tree can then be converted to a real
-AST (read: stable) using a pattern matcher that is also part of this library.
-
-Error reporting is another area where parslet excels: It is able to print not
-only the error you are used to seeing ('Parse failed because of REASON at line
-1 and char 2'), but also prints what led to that failure in the form of a
-tree (#error_tree method).
-
-[1] http://en.wikipedia.org/wiki/Parsing_expression_grammar
-[2] http://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf
+Parslet makes developing complex parsers easy. It does so by
+
+* providing the best *error reporting* possible
+* *not generating* reams of code for you to debug
+
+Parslet takes the long way around to make *your job* easier. It allows for
+incremental language construction. Often, you start out small, implementing
+the atoms of your language first; _parslet_ takes pride in making this
+possible.
+
+Eager to try this out? Please see the associated web site:
+http://kschiess.github.com/parslet
 
 SYNOPSIS
 
@@ -56,7 +25,7 @@ SYNOPSIS
 str('"').absnt? >> any
 ).repeat.as(:string) >>
 str('"')
-
+
 # Parse the string and capture parts of the interpretation (:string above)
 tree = parser.parse(%Q{
 "This is a \\"String\\" in which you can escape stuff"
@@ -64,36 +33,24 @@ SYNOPSIS
 
 tree # => {:string=>"This is a \\\"String\\\" in which you can escape stuff"}
 
-# Here's how you can grab results from that tree:
+# Here's how you can grab results from that tree, two methods:
+
+# 1)
 Pattern.new(:string => simple(:x)).each_match(tree) do |dictionary|
-  puts "String contents: #{dictionary[:x]}"
+  puts "String contents (method 1): #{dictionary[:x]}"
 end
-
-# Here's how to transform that tree into something else ----------------------
 
-# Defines the classes of our new Syntax Tree
-class StringLiteral < Struct.new(:text); end
-
-# Defines a set of transformation rules on tree leafes
-transform = Transform.new
-transform.rule(:string => simple(:x)) { |d| StringLiteral.new(d[:x]) }
-
-# Transforms the tree
-transform.apply(tree)
-
-# => #<struct StringLiteral text="This is a \\\"String\\\" ... escape stuff">
+# 2)
+transform = Parslet::Transform.new do
+  rule(:string => simple(:x)) {
+    puts "String contents (method 2): #{x}" }
+end
+transform.apply(tree)
 
 COMPATIBILITY
 
 This library should work with both ruby 1.8 and ruby 1.9.
 
-AUTHORS
-
-My gigantous thanks go to the following cool guys and gals that help make this
-rock:
-
-Florian Hanke <florian.hanke@gmail.com>
-
 STATUS
 
 On the road to 1.0; improving documentation, packaging and upgrading to rspec2.
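
The two README methods above differ in how a rule block receives its bindings: per the 0.10.0 HISTORY entry, a block of arity 1 gets the bindings hash, while any other block sees the bindings as local names. The following is only a rough stand-alone sketch of that dispatch, not parslet's actual implementation; `BindingContext` and `call_rule` are hypothetical names invented for illustration, and the sketch assumes Ruby 1.9 or later.

```ruby
# Hypothetical helper (not part of parslet): exposes each binding as a
# singleton method, so a block evaluated against it can write `x`.
class BindingContext
  def initialize(bindings)
    bindings.each do |name, value|
      define_singleton_method(name) { value }
    end
  end
end

# Hypothetical dispatcher mirroring the rule described in HISTORY.txt:
# arity 1 -> pass the hash; otherwise -> bind the hash contents by name.
def call_rule(bindings, &block)
  if block.arity == 1
    block.call(bindings)
  else
    BindingContext.new(bindings).instance_eval(&block)
  end
end

with_hash  = call_rule(:x => 'abc') { |d| "hash: #{d[:x]}" }
with_local = call_rule(:x => 'abc') { "local: #{x}" }

puts with_hash
puts with_local
```

Evaluating the block against a small context object keeps the bindings out of the caller's scope; whether parslet does it this way is an assumption.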
data/Rakefile CHANGED
@@ -18,7 +18,7 @@ spec = Gem::Specification.new do |s|
 
   # Change these as appropriate
   s.name = "parslet"
-  s.version = "0.9.0"
+  s.version = "0.10.1"
   s.summary = "Parser construction library with great error reporting in Ruby."
   s.author = "Kaspar Schiess"
   s.email = "kaspar.schiess@absurd.li"
@@ -34,7 +34,7 @@ spec = Gem::Specification.new do |s|
 
   # If you want to depend on other gems, add them here, along with any
   # relevant versions
-  # s.add_dependency("some_other_gem", "~> 0.1.0")
+  s.add_dependency("blankslate", "~> 2.1.2.3")
 
   # If your tests use any gems, include them here
   s.add_development_dependency("rspec")
@@ -60,11 +60,15 @@ end
 
 task :package => :gemspec
 
+require 'sdoc'
+
 # Generate documentation
-Rake::RDocTask.new do |rd|
-  rd.main = "README"
-  rd.rdoc_files.include("README", "lib/**/*.rb")
-  rd.rdoc_dir = "rdoc"
+Rake::RDocTask.new do |rdoc|
+  rdoc.options << '--fmt' << 'shtml' # explictly set shtml generator
+  rdoc.template = 'direct' # lighter template used on railsapi.com
+  rdoc.main = "README"
+  rdoc.rdoc_files.include("README", "lib/**/*.rb")
+  rdoc.rdoc_dir = "rdoc"
 end
 
 desc 'Clear out RDoc and generated packages'
data/lib/parslet.rb CHANGED
@@ -4,14 +4,9 @@ require 'stringio'
 #
 # require 'parslet'
 #
-# class MyParser
-# include Parslet
-#
+# class MyParser < Parslet::Parser
 # rule(:a) { str('a').repeat }
-#
-# def parse(str)
-# a.parse(str)
-# end
+# root(:a)
 # end
 #
 # pp MyParser.new.parse('aaaa') # => 'aaaa'
@@ -36,138 +31,19 @@ require 'stringio'
 # and use the second stage to isolate the rest of your code from the changes
 # you've effected.
 #
-# = Language Atoms
-#
-# PEG-style grammars build on a very small number of atoms, or parslets. In
-# fact, only three types of parslets exist. Here's how to match a string:
-#
-# str('a string')
-#
-# This matches the string 'a string' literally and nothing else. If your input
-# doesn't contain the string, it will fail. Here's how to match a character
-# set:
-#
-# match('[abc]')
-#
-# This matches 'a', 'b' or 'c'. The string matched will always have a length
-# of 1; to match longer strings, please see the title below. The last parslet
-# of the three is 'any':
-#
-# any
-#
-# 'any' functions like the dot in regular expressions - it matches any single
-# character.
-#
-# = Combination and Repetition
-#
-# Parslets only get useful when combined to grammars. To combine one parslet
-# with the other, you have 4 kinds of methods available: repeat and maybe, >>
-# (sequence), | (alternation), absnt? and prsnt?.
-#
-# str('a').repeat # any number of 'a's, including 0
-# str('a').maybe # maybe there'll be an 'a', maybe not
-#
-# Parslets can be joined using >>. This means: Match the left parslet, then
-# match the right parslet.
-#
-# str('a') >> str('b') # would match 'ab'
-#
-# Keep in mind that all combination and repetition operators themselves return
-# a parslet. You can combine the result again:
-#
-# ( str('a') >> str('b') ) >> str('c') # would match 'abc'
-#
-# The slash ('|') indicates alternatives:
-#
-# str('a') | str('b') # would match 'a' OR 'b'
-#
-# The left side of an alternative is matched first; if it matches, the right
-# side is never looked at.
-#
-# The absnt? and prsnt? qualifiers allow looking at input without consuming
-# it:
-#
-# str('a').absnt? # will match if at the current position there is an 'a'.
-# str('a').absnt? >> str('b') # check for 'a' then match 'b'
-#
-# This means that the second example will not match any input; when the second
-# part is parsed, the first part has asserted the presence of 'a', and thus
-# str('b') cannot match. The prsnt? method is the opposite of absnt?, it
-# asserts presence.
-#
-# More documentation on these methods can be found in Parslets::Atoms::Base.
-#
-# = Intermediary Parse Trees
-#
-# As you have probably seen above, you can hand input (strings or StringIOs) to
-# your parslets like this:
-#
-# parslet.parse(str)
-#
-# This returns an intermediary parse tree or raises an exception
-# (Parslet::ParseFailed) when the input is not well formed.
-#
-# Intermediary parse trees are essentially just Plain Old Ruby Objects. (PORO
-# technology as we call it.) Parslets try very hard to return sensible stuff;
-# it is quite easy to use the results for the later stages of your program.
-#
-# Here a few examples and what their intermediary tree looks like:
-#
-# str('foo').parse('foo') # => 'foo'
-# (str('f') >> str('o') >> str('o')).parse('foo') # => 'foo'
-#
-# Naming parslets
-#
-# Construction of lambda blocks
-#
-# = Intermediary Tree transformation
-#
-# The intermediary parse tree by itself is most often not very useful. Its
-# form is volatile; changing your parser in the slightest might produce
-# profound changes in the generated trees.
-#
-# Generally you will want to construct a more stable tree using your own
-# carefully crafted representation of the domain. Parslet provides you with
-# an elegant way of transmogrifying your intermediary tree into the output
-# format you choose. This is achieved by transformation rules such as this
-# one:
-#
-# transform.rule(:literal => {:string => :_x}) { |d|
-# StringLit.new(*d.values) }
-#
-# The above rule will transform a subtree looking like this:
-#
-# :literal
-# |
-# :string
-# |
-# "somestring"
-#
-# into just this:
-#
-# StringLit
-# value: "somestring"
-#
-#
-# = Further documentation
-#
-# Please see the examples subdirectory of the distribution for more examples.
-# Check out 'rooc' (github.com/kschiess/rooc) as well - it uses parslet for
-# compiler construction.
-#
 module Parslet
   def self.included(base)
     base.extend(ClassMethods)
   end
 
-  # This is raised when the parse failed to match or to consume all its input.
-  # It contains the message that should be presented to the user. If you want
-  # to display more error explanation, you can print the #error_tree that is
+  # Raised when the parse failed to match or to consume all its input. It
+  # contains the message that should be presented to the user. If you want to
+  # display more error explanation, you can print the #error_tree that is
   # stored in the parslet. This is a graphical representation of what went
   # wrong.
   #
   # Example:
-  #
+  #
   # begin
   # parslet.parse(str)
   # rescue Parslet::ParseFailed => failure
@@ -181,6 +57,7 @@ module Parslet
     # Define the parsers #root function. This is the place where you start
     # parsing; if you have a rule for 'file' that describes what should be
     # in a file, this would be your root declaration:
+    #
     # class Parser
     # root :file
     # rule(:file) { ... }
@@ -205,9 +82,9 @@ module Parslet
       end
     end
 
-    # Define an entity for the parser. This generates a method of the same name
-    # that can be used as part of other patterns. Those methods can be freely
-    # mixed in your parser class with real ruby methods.
+    # Define an entity for the parser. This generates a method of the same
+    # name that can be used as part of other patterns. Those methods can be
+    # freely mixed in your parser class with real ruby methods.
     #
     # Example:
     #
@@ -233,6 +110,14 @@ module Parslet
     end
   end
 
+  # Allows for delayed construction of #match.
+  #
+  class DelayedMatchConstructor
+    def [](str)
+      Atoms::Re.new("[" + str + "]")
+    end
+  end
+
   # Returns an atom matching a character class. This is essentially a regular
   # expression, but you should only match a single character.
   #
@@ -241,8 +126,10 @@ module Parslet
   # match('[ab]') # will match either 'a' or 'b'
   # match('[\n\s]') # will match newlines and spaces
   #
-  def match(obj)
-    Atoms::Re.new(obj)
+  def match(str=nil)
+    return DelayedMatchConstructor.new unless str
+
+    return Atoms::Re.new(str)
   end
   module_function :match
 
@@ -263,7 +150,19 @@ module Parslet
     Atoms::Re.new('.')
   end
   module_function :any
-
+
+  # A special kind of atom that allows embedding whole treetop expressions
+  # into parslet construction.
+  #
+  # Example:
+  #
+  # exp(%Q("a" "b"?)) # => returns the same as str('a') >> str('b').maybe
+  #
+  def exp(str)
+    Parslet::Expression.new(str).to_parslet
+  end
+  module_function :exp
+
   # Returns a placeholder for a tree transformation that will only match a
   # sequence of elements. The +symbol+ you specify will be the key for the
   # matched sequence in the returned dictionary.
@@ -292,10 +191,24 @@ module Parslet
     Pattern::SimpleBind.new(symbol)
   end
   module_function :simple
+
+  # Returns a placeholder for tree transformation patterns that will match
+  # any kind of subtree.
+  #
+  # Example:
+  #
+  # { :expression => subtree(:exp) }
+  #
+  def subtree(symbol)
+    Pattern::SubtreeBind.new(symbol)
+  end
+
+  autoload :Expression, 'parslet/expression'
 end
 
 require 'parslet/error_tree'
 require 'parslet/atoms'
 require 'parslet/pattern'
 require 'parslet/pattern/binding'
-require 'parslet/transform'
+require 'parslet/transform'
+require 'parslet/parser'
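
The `match['a-z']` shortcut added in this file works because calling `match` without an argument returns an object whose `[]` method builds the character-class atom. Below is a minimal stand-alone sketch of that pattern which runs without parslet; `ReAtom`, `DelayedMatch` and `MatchSketch` are hypothetical stand-ins for `Atoms::Re`, `DelayedMatchConstructor` and the `Parslet` module, not the library's own classes.

```ruby
# Hypothetical stand-in for Parslet::Atoms::Re: stores the pattern and
# can test a single character against it.
ReAtom = Struct.new(:pattern) do
  def matches?(char)
    Regexp.new(pattern).match(char) ? true : false
  end
end

# Hypothetical stand-in for DelayedMatchConstructor: wraps the argument
# of [] in brackets to form a character class.
class DelayedMatch
  def [](str)
    ReAtom.new("[" + str + "]")
  end
end

module MatchSketch
  # Without an argument, defer construction to DelayedMatch#[].
  def match(str = nil)
    return DelayedMatch.new unless str
    ReAtom.new(str)
  end
  module_function :match
end

short = MatchSketch.match['a-z']    # parsed as MatchSketch.match()['a-z']
long  = MatchSketch.match('[a-z]')

puts short.pattern
puts short == long
puts short.matches?('q')
```

Because Ruby parses `match['a-z']` as a zero-argument call followed by an index operation, the shortcut needs no parser tricks, only the `nil` default on `str`.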
data/lib/parslet/atoms.rb CHANGED
@@ -1,4 +1,7 @@
 module Parslet::Atoms
+  # The precedence module controls parenthesis during the #inspect printing
+  # of parslets. It is not relevant to other aspects of the parsing.
+  #
   module Precedence
     prec = 0
     BASE = (prec+=1) # everything else
@@ -9,484 +12,14 @@ module Parslet::Atoms
     OUTER = (prec+=1) # printing is done here.
   end
 
-  # Base class for all parslets, handles orchestration of calls and implements
-  # a lot of the operator and chaining methods.
-  #
-  class Base
-    def parse(io)
-      if io.respond_to? :to_str
-        io = StringIO.new(io)
-      end
-
-      result = apply(io)
-
-      # If we haven't consumed the input, then the pattern doesn't match. Try
-      # to provide a good error message (even asking down below)
-      unless io.eof?
-        # Do we know why we stopped matching input? If yes, that's a good
-        # error to fail with. Otherwise just report that we cannot consume the
-        # input.
-        if cause
-          raise Parslet::ParseFailed, "Unconsumed input, maybe because of this: #{cause}"
-        else
-          error(io, "Don't know what to do with #{io.string[io.pos,100]}")
-        end
-      end
-
-      return flatten(result)
-    end
-
-    def apply(io)
-      # p [:start, self, io.string[io.pos, 10]]
-
-      old_pos = io.pos
-
-      # p [:try, self, io.string[io.pos, 20]]
-      begin
-        r = try(io)
-        # p [:return_from, self, flatten(r)]
-        @last_cause = nil
-        return r
-      rescue Parslet::ParseFailed => ex
-        # p [:failing, self, io.string[io.pos, 20]]
-        io.pos = old_pos; raise ex
-      end
-    end
-
-    def repeat(min=0, max=nil)
-      Repetition.new(self, min, max)
-    end
-    def maybe
-      Repetition.new(self, 0, 1, :maybe)
-    end
-    def >>(parslet)
-      Sequence.new(self, parslet)
-    end
-    def |(parslet)
-      Alternative.new(self, parslet)
-    end
-    def absnt?
-      Lookahead.new(self, false)
-    end
-    def prsnt?
-      Lookahead.new(self, true)
-    end
-    def as(name)
-      Named.new(self, name)
-    end
-
-    def flatten(value)
-      # Passes through everything that isn't an array of things
-      return value unless value.instance_of? Array
-
-      # Extracts the s-expression tag
-      tag, *tail = value
-
-      # Merges arrays:
-      result = tail.
-        map { |e| flatten(e) } # first flatten each element
-
-      case tag
-      when :sequence
-        return flatten_sequence(result)
-      when :maybe
-        return result.first
-      when :repetition
-        return flatten_repetition(result)
-      end
-
-      fail "BUG: Unknown tag #{tag.inspect}."
-    end
-    def flatten_sequence(list)
-      list.inject('') { |r, e| # and then merge flat elements
-        case [r, e].map { |o| o.class }
-        when [Hash, Hash] # two keyed subtrees: make one
-          warn_about_duplicate_keys(r, e)
-          r.merge(e)
-        # a keyed tree and an array (push down)
-        when [Hash, Array]
-          [r] + e
-        when [Array, Hash]
-          r + [e]
-        when [String, String]
-          r << e
-        else
-          if r.instance_of? Hash
-            r # Ignore e, since its not a hash we can merge
-          else
-            e # Whatever e is at this point, we keep it
-          end
-        end
-      }
-    end
-    def flatten_repetition(list)
-      if list.any? { |e| e.instance_of?(Hash) }
-        # If keyed subtrees are in the array, we'll want to discard all
-        # strings inbetween. To keep them, name them.
-        return list.select { |e| e.instance_of?(Hash) }
-      end
-
-      if list.any? { |e| e.instance_of?(Array) }
-        # If any arrays are nested in this array, flatten all arrays to this
-        # level.
-        return list.
-          select { |e| e.instance_of?(Array) }.
-          flatten(1)
-      end
-
-      # If there are only strings, concatenate them and return that.
-      list.inject('') { |s,e| s<<(e||'') }
-    end
-
-    def self.precedence(prec)
-      define_method(:precedence) { prec }
-    end
-    precedence Precedence::BASE
-    def to_s(outer_prec)
-      if outer_prec < precedence
-        "("+to_s_inner(precedence)+")"
-      else
-        to_s_inner(precedence)
-      end
-    end
-    def inspect
-      to_s(Precedence::OUTER)
-    end
-
-    # Cause should return the current best approximation of this parslet
-    # of what went wrong with the parse. Not relevant if the parse succeeds,
-    # but needed for clever error reports.
-    #
-    def cause
-      @last_cause
-    end
-
-    # Error tree returns what went wrong here plus what went wrong inside
-    # subexpressions as a tree. The error stored for this node will be equal
-    # with #cause.
-    #
-    def error_tree
-      Parslet::ErrorTree.new(self) if cause?
-    end
-    def cause?
-      not @last_cause.nil?
-    end
-    private
-    # Report/raise a parse error with the given message, printing the current
-    # position as well. Appends 'at line X char Y.' to the message you give.
-    # If +pos+ is given, it is used as the real position the error happened,
-    # correcting the io's current position.
-    #
-    def error(io, str, pos=nil)
-      pre = io.string[0..(pos||io.pos)]
-      lines = Array(pre.lines)
-
-      if lines.empty?
-        formatted_cause = str
-      else
-        pos = lines.last.length
-        formatted_cause = "#{str} at line #{lines.count} char #{pos}."
-      end
-
-      @last_cause = formatted_cause
-
-      raise Parslet::ParseFailed, formatted_cause, nil
-    end
-    def warn_about_duplicate_keys(h1, h2)
-      d = h1.keys & h2.keys
-      unless d.empty?
-        warn "Duplicate subtrees while merging result of \n #{self.inspect}\nonly the values"+
-          " of the latter will be kept. (keys: #{d.inspect})"
-      end
-    end
-  end
-
-  class Named < Base
-    attr_reader :parslet, :name
-    def initialize(parslet, name)
-      @parslet, @name = parslet, name
-    end
-
-    def apply(io)
-      value = parslet.apply(io)
-
-      produce_return_value value
-    end
-
-    def to_s_inner(prec)
-      "#{name}:#{parslet.to_s(prec)}"
-    end
-
-    def error_tree
-      parslet.error_tree
-    end
-    private
-    def produce_return_value(val)
-      { name => flatten(val) }
-    end
-  end
-
-  class Lookahead < Base
-    attr_reader :positive
-    attr_reader :bound_parslet
-
-    def initialize(bound_parslet, positive=true)
-      # Model positive and negative lookahead by testing this flag.
-      @positive = positive
-      @bound_parslet = bound_parslet
-    end
-
-    def try(io)
-      pos = io.pos
-      begin
-        bound_parslet.apply(io)
-      rescue Parslet::ParseFailed
-        return fail(io)
-      ensure
-        io.pos = pos
-      end
-      return success(io)
-    end
-
-    def fail(io)
-      if positive
-        error(io, "lookahead: #{bound_parslet.inspect} didn't match, but should have")
-      else
-        # TODO: Squash this down to nothing? Return value handling here...
-        return nil
-      end
-    end
-    def success(io)
-      if positive
-        return nil # see above, TODO
-      else
-        error(
-          io,
-          "negative lookahead: #{bound_parslet.inspect} matched, but shouldn't have")
-      end
-    end
-
-    precedence Precedence::LOOKAHEAD
-    def to_s_inner(prec)
-      char = positive ? '&' : '!'
-
-      "#{char}#{bound_parslet.to_s(prec)}"
-    end
-
-    def error_tree
-      bound_parslet.error_tree
-    end
-  end
-
-  class Alternative < Base
-    attr_reader :alternatives
-    def initialize(*alternatives)
-      @alternatives = alternatives
-    end
-
-    def |(parslet)
-      @alternatives << parslet
-      self
-    end
-
-    def try(io)
-      alternatives.each { |a|
-        begin
-          return a.apply(io)
-        rescue Parslet::ParseFailed => ex
-        end
-      }
-      # If we reach this point, all alternatives have failed.
-      error(io, "Expected one of #{alternatives.inspect}.")
-    end
-
-    precedence Precedence::ALTERNATE
-    def to_s_inner(prec)
-      alternatives.map { |a| a.to_s(prec) }.join(' | ')
-    end
-
-    def error_tree
-      Parslet::ErrorTree.new(self, *alternatives.
-        map { |child| child.error_tree })
-    end
-  end
-
-  # A sequence of parslets, matched from left to right. Denoted by '>>'
-  #
-  class Sequence < Base
-    attr_reader :parslets
-    def initialize(*parslets)
-      @parslets = parslets
-    end
-
-    def >>(parslet)
-      @parslets << parslet
-      self
-    end
-
-    def try(io)
-      [:sequence]+parslets.map { |p|
-        # Save each parslet as potentially offending (raising an error).
-        @offending_parslet = p
-        p.apply(io)
-      }
-    rescue Parslet::ParseFailed
-      error(io, "Failed to match sequence (#{self.inspect})")
-    end
-
-    precedence Precedence::SEQUENCE
-    def to_s_inner(prec)
-      parslets.map { |p| p.to_s(prec) }.join(' ')
-    end
-
-    def error_tree
-      Parslet::ErrorTree.new(self).tap { |t|
-        t.children << @offending_parslet.error_tree if @offending_parslet }
-    end
-  end
-
-  class Repetition < Base
-    attr_reader :min, :max, :parslet
-    def initialize(parslet, min, max, tag=:repetition)
-      @parslet = parslet
-      @min, @max = min, max
-      @tag = tag
-    end
-
-    def try(io)
-      occ = 0
-      result = [@tag] # initialize the result array with the tag (for flattening)
-      loop do
-        begin
-          result << parslet.apply(io)
-          occ += 1
-
-          # If we're not greedy (max is defined), check if that has been
-          # reached.
-          return result if max && occ>=max
-        rescue Parslet::ParseFailed => ex
-          # Greedy matcher has produced a failure. Check if occ (which will
-          # contain the number of sucesses) is in {min, max}.
-          # p [:repetition, occ, min, max]
-          error(io, "Expected at least #{min} of #{parslet.inspect}") if occ < min
-          return result
-        end
-      end
-    end
-
-    precedence Precedence::REPETITION
-    def to_s_inner(prec)
-      minmax = "{#{min}, #{max}}"
-      minmax = '?' if min == 0 && max == 1
-
-      parslet.to_s(prec) + minmax
-    end
-
-    def cause
-      # Either the repetition failed or the parslet inside failed to repeat.
-      super || parslet.cause
-    end
-    def error_tree
-      if cause?
-        Parslet::ErrorTree.new(self, parslet.error_tree)
-      else
-        parslet.error_tree
-      end
-    end
-  end
-
-  # Matches a special kind of regular expression that only ever matches one
-  # character at a time. Useful members of this family are: character ranges,
-  # \w, \d, \r, \n, ...
-  #
-  class Re < Base
-    attr_reader :match
-    def initialize(match)
-      @match = match
-    end
-
-    def try(io)
-      r = Regexp.new(match, Regexp::MULTILINE)
-      s = io.read(1)
-      error(io, "Premature end of input") unless s
-      error(io, "Failed to match #{match.inspect[1..-2]}") unless s.match(r)
-      return s
-    end
-
-    def to_s_inner(prec)
-      match.inspect[1..-2]
-    end
-  end
-
-  # Matches a string of characters.
-  #
-  class Str < Base
-    attr_reader :str
-    def initialize(str)
-      @str = str
-    end
-
-    def try(io)
-      old_pos = io.pos
-      s = io.read(str.size)
-      error(io, "Premature end of input") unless s && s.size==str.size
-      error(io, "Expected #{str.inspect}, but got #{s.inspect}", old_pos) \
-        unless s==str
-      return s
-    end
-
-    def to_s_inner(prec)
-      "'#{str}'"
-    end
-  end
-
-  # This wraps pieces of parslet definition and gives them a name. The wrapped
-  # piece is lazily evaluated and cached. This has two purposes:
-  #
-  # a) Avoid infinite recursion during evaluation of the definition
-  #
-  # b) Be able to print things by their name, not by their sometimes
-  # complicated content.
-  #
-  # You don't normally use this directly, instead you should generated it by
-  # using the structuring method Parslet#rule.
-  #
-  class Entity < Base
-    attr_reader :name, :context, :block
-    def initialize(name, context, block)
-      super()
-
-      @name = name
-      @context = context
-      @block = block
-    end
-
-    def try(io)
-      parslet.apply(io)
-    end
-
-    def parslet
-      @parslet ||= context.instance_eval(&block).tap { |p|
-        raise_not_implemented unless p
-      }
-    end
-
-    def to_s_inner(prec)
-      name.to_s.upcase
-    end
-
-    def error_tree
-      parslet.error_tree
-    end
-
-    private
-    def raise_not_implemented
-      trace = caller.reject {|l| l =~ %r{#{Regexp.escape(__FILE__)}}} # blatantly stolen from dependencies.rb in activesupport
-      exception = NotImplementedError.new("rule(#{name.inspect}) { ... } returns nil. Still not implemented, but already used?")
-      exception.set_backtrace(trace)
-
-      raise exception
-    end
-  end
+  autoload :Base, 'parslet/atoms/base'
+  autoload :Named, 'parslet/atoms/named'
+  autoload :Lookahead, 'parslet/atoms/lookahead'
+  autoload :Alternative, 'parslet/atoms/alternative'
+  autoload :Sequence, 'parslet/atoms/sequence'
+  autoload :Repetition, 'parslet/atoms/repetition'
+  autoload :Re, 'parslet/atoms/re'
+  autoload :Str, 'parslet/atoms/str'
+  autoload :Entity, 'parslet/atoms/entity'
 end
 
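
The hunk above replaces the inline atom classes with `autoload` declarations, so each class file is only required the first time its constant is referenced. As a small illustration of that mechanism (not of parslet's real file layout), the sketch below writes a throwaway class file to a temporary directory and registers it with `Module#autoload`; `SketchAtoms` and `sketch_str.rb` are hypothetical names made up for this example.

```ruby
require 'tmpdir'

# Hypothetical namespace standing in for Parslet::Atoms.
module SketchAtoms; end

loaded_before = nil
loaded_after  = nil
rendered      = nil

Dir.mktmpdir do |dir|
  # Write a tiny class file that the autoload entry will point at.
  path = File.join(dir, 'sketch_str.rb')
  File.write(path, <<-'RUBY')
module SketchAtoms
  class Str
    def initialize(str)
      @str = str
    end

    def to_s
      "'" + @str + "'"
    end
  end
end
  RUBY

  # Register the constant; nothing is loaded yet.
  SketchAtoms.autoload(:Str, path)
  loaded_before = SketchAtoms.autoload?(:Str)   # path string while pending

  # First reference to the constant triggers the require.
  rendered = SketchAtoms::Str.new('a').to_s
  loaded_after = SketchAtoms.autoload?(:Str)    # nil once loaded
end

puts rendered
```

In the real gem the paths are relative (`'parslet/atoms/str'`) and resolved through the load path; the absolute temporary path here is only a convenience for a self-contained run.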