whittle 0.0.1 → 0.0.2

data/README.md CHANGED
@@ -2,32 +2,57 @@
2
2
 
3
3
  Whittle is a LALR(1) parser. It's very small, easy to understand, and what's most important,
4
4
  it's 100% ruby. You write parsers by specifying sequences of allowable rules (which refer to
5
- other rules, or even to themselves), and for each rule in your grammar, you provide a block that
5
+ other rules, or even to themselves). For each rule in your grammar, you provide a block that
6
6
  is invoked when the grammar is recognized.
7
7
 
8
- If you're not familiar with parsing, you should find Whittle to be a very friendly little
8
+ If you're *not* familiar with parsing, you should find Whittle to be a very friendly little
9
9
  parser.
10
10
 
11
- It is related, somewhat, to yacc and bison, which belong to the class of parsers knows as
12
- LALR(1): Lookahead Left-Right (using 1 lookahead token). This class of parsers is both easy to
13
- work with, and powerful.
11
+ It is related, somewhat, to yacc and bison, which belong to the class of parsers known as
12
+ LALR(1): Look-Ahead LR (Left-to-right scan, Rightmost derivation), using 1 lookahead token. This class of parsers is both easy to work with
13
+ and particularly powerful (ruby itself is parsed using a LALR(1) parser). Since the algorithm
14
+ is based around a theory that *never* has to backtrack (that is, each token read takes the
15
+ parse forward, with just a single lookup in a parse table), parse time is also fast. Parse
16
+ time is governed by the size of the input, not by the size of the grammar.
14
17
 
15
- Whittle provides meaningful error reporting and even lets you hook into the error handling logic
16
- if you need to write some sort of crazy madman forgiving parser.
18
+ Whittle provides meaningful error reporting (line number, expected tokens, received token) and
19
+ even lets you hook into the error handling logic if you need to write some sort of crazy
20
+ madman-forgiving parser.
21
+
22
+ If you've had issues with other parsers hitting "stack level too deep" errors, you should find
23
+ that Whittle does not suffer from the same issues, since it uses a state-switching algorithm
24
+ (a pushdown automaton to be precise), rather than simply having one parse function call another
25
+ and so on. Whittle also supports the following concepts:
26
+
27
+ - Left/right recursion
28
+ - Left/right associativity
29
+ - Operator precedences
30
+ - Skipping of silent tokens in the input (e.g. whitespace/comments)
31
+
32
+ ## Installation
33
+
34
+ Via rubygems:
35
+
36
+ gem install whittle
37
+
38
+ Or in your Gemfile, if you're using bundler:
39
+
40
+ gem 'whittle'
17
41
 
18
42
  ## The Basics
19
43
 
20
- Parsers using Whittle are *not* generated. This may strike users of other LALR(1) parsers as
21
- odd, but c'mon, we're using Ruby, right?
44
+ Parsers using Whittle do not generate ruby code from a grammar file. This may strike users of
45
+ other LALR(1) parsers as odd, but c'mon, we're using Ruby, right?
22
46
 
23
47
  I'll avoid discussing the algorithm until we get into the really advanced stuff, but you will
24
48
  need to understand a few fundamental ideas before we begin.
25
49
 
26
- 1. There are two types of rule that make up a complete parser: terminal, and nonterminal
50
+ 1. There are two types of rule that make up a complete parser: *terminal*, and *nonterminal*
27
51
  - A terminal rule is quite simply a chunk of the input string, like '42', or 'function'
28
- - A nonterminal rule is a rule that makes reference to other rules (terminal and nonterminal)
52
+ - A nonterminal rule is a rule that makes reference to other rules (both terminal and
53
+ nonterminal)
29
54
  2. The input to be parsed *always* conforms to just one rule at the topmost level. This is
30
- known as the "start rule".
55
+ known as the "start rule" and describes the structure of the program as a whole.
31
56
 
32
57
  The easiest way to understand how the parser works is just to learn by example, so let's see an
33
58
  example.
@@ -38,9 +63,7 @@ require 'whittle'
38
63
  class Mathematician < Whittle::Parser
39
64
  rule("+")
40
65
 
41
- rule(:int) do |r|
42
- r[/[0-9]+/].as { |num| Integer(num) }
43
- end
66
+ rule(:int => /[0-9]+/).as { |num| Integer(num) }
44
67
 
45
68
  rule(:expr) do |r|
46
69
  r[:int, "+", :int].as { |a, _, b| a + b }
@@ -55,21 +78,32 @@ mathematician.parse("1+2")
55
78
  ```
56
79
 
57
80
  Let's break this down a bit. As you can see, the whole thing is really just `rule` used in
58
- different ways. We also have to set the rule that we can use to describe an entire program,
59
- which in this case is the `:expr` rule that can add two numbers together.
81
+ different ways. We also have to set the start rule that we can use to describe an entire
82
+ program, which in this case is the `:expr` rule that can add two numbers together.
60
83
 
61
84
  There are two terminal rules (`"+"` and `:int`) and one nonterminal (`:expr`) in the above
62
85
  grammar. Each rule can have a block attached to it. The block is invoked with the result of
63
- evaluating the blocks that are attached to each input (recursively). A rule with no block
64
- attached as just a shorthand way of saying "return the input verbatim", so our "+" above receives
65
- the string "+" and returns the string "+". Since this is such a common use-case, Whittle offers
66
- the shorthand.
67
-
68
- As the input string is parsed, it *must* match the start rule `:expr`. Whittle reads the "1",
69
- which matches `:int` (which casts the String "1" to the Integer 1), next the parser looks for the
70
- expected "+", which it gets. Now it looks for another `:int`, which it gets. Upon having
71
- read the sequence `:int`, `"+"`, `:int`, Whittle invokes the block for `:expr` with the arguments
72
- 1, "+", 2, returning the 3 we expect.
86
+ evaluating the blocks attached to each of its inputs (in a depth-first manner). The default
87
+ action, if no block is given, is to return whatever the leftmost input to the rule happens to
88
+ be.
89
+
90
+ We can optionally use the Hash notation to map a name to a pattern (or a fixed string) when
91
+ we declare terminal rules too, as we have done with the `:int` rule above. Note that the
92
+ longer way around defining terminal rules is to do as we have done for `:expr` and define a
93
+ block, but since this is such a common use-case, Whittle offers the shorthand.
94
+
95
+ As the input string is parsed, it *must* match the start rule `:expr`.
96
+
97
+ Let's step through the parse for the above input "1+2". When the parser starts, it looks at
98
+ the start rule `:expr` and decides what tokens would be valid if they were encountered. Since
99
+ `:expr` starts with `:int`, the only thing that would be valid is anything matching
100
+ `/[0-9]+/`. When the parser reads the "1", it recognizes it as an `:int`, puts it aside (puts
101
+ it on the stack, in technical terms). Now it advances through the rule for `:expr` and
102
+ decides the only possible valid input would be a "+", and finally the last `:int`. Upon
103
+ having read the sequence `:int`, "+", `:int`, our block attached to that rule is invoked to
104
+ return a result. First the three inputs are passed through their respective blocks (so the
105
+ "1" and the "2" are cast to integers, according to the rule for `:int`), then they are passed
106
+ to the `:expr`, which adds the 1 and the 2 to make 3. Magic!
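The walk-through above can be sketched in plain Ruby without Whittle at all. This is only an illustration of the shift-then-reduce order described in the README text, not Whittle's actual implementation: `parse_sum` is a hypothetical helper, and it hard-codes the single sequence `:int`, "+", `:int` rather than consulting a parse table.

```ruby
require "strscan"

# Hypothetical helper (not part of Whittle): parses "INT+INT" by hand,
# following the steps described for the input "1+2".
def parse_sum(input)
  scanner = StringScanner.new(input)
  stack   = []

  # The only valid sequence for :expr is :int, "+", :int, in that order.
  [/[0-9]+/, /\+/, /[0-9]+/].each do |pattern|
    token = scanner.scan(pattern)
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" if token.nil?

    stack.push(token) # "shift": put the recognized token aside on the stack
  end

  raise ArgumentError, "unconsumed input at offset #{scanner.pos}" unless scanner.eos?

  # "reduce": each input goes through its own action first (the :int cast),
  # then the results are handed to the action for :expr.
  a, _, b = stack
  Integer(a) + Integer(b)
end

parse_sum("1+2") # => 3
```

A real LALR(1) parser decides what to shift or reduce by looking up the current state and the next token in a table; the fixed list of patterns here merely stands in for that lookup.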
73
107
 
74
108
  ## Nonterminal rules can have more than one valid sequence
75
109
 
@@ -88,9 +122,7 @@ class Mathematician < Whittle::Parser
88
122
  rule("*")
89
123
  rule("/")
90
124
 
91
- rule(:int) do |r|
92
- r[/[0-9]+/].as { |num| Integer(num) }
93
- end
125
+ rule(:int => /[0-9]+/).as { |num| Integer(num) }
94
126
 
95
127
  rule(:expr) do |r|
96
128
  r[:int, "+", :int].as { |a, _, b| a + b }
@@ -117,7 +149,9 @@ mathematician.parse("4/2")
117
149
  # => 2
118
150
  ```
119
151
 
120
- Now you're probably seeing how matching just one rule for the entire input is not a problem.
152
+ Now you're probably beginning to see how matching just one rule for the entire input is not a
153
+ problem. To think about a more real-world example, you can describe most programming
154
+ languages as a series of statements and constructs.
121
155
 
122
156
  ## Rules can refer to themselves
123
157
 
@@ -133,16 +167,14 @@ class Mathematician < Whittle::Parser
133
167
  rule("*")
134
168
  rule("/")
135
169
 
136
- rule(:int) do |r|
137
- r[/[0-9]+/].as { |num| Integer(num) }
138
- end
170
+ rule(:int => /[0-9]+/).as { |num| Integer(num) }
139
171
 
140
172
  rule(:expr) do |r|
141
173
  r[:expr, "+", :expr].as { |a, _, b| a + b }
142
174
  r[:expr, "-", :expr].as { |a, _, b| a - b }
143
175
  r[:expr, "*", :expr].as { |a, _, b| a * b }
144
176
  r[:expr, "/", :expr].as { |a, _, b| a / b }
145
- r[:int].as(:value)
177
+ r[:int]
146
178
  end
147
179
 
148
180
  start(:expr)
@@ -156,14 +188,15 @@ mathematician.parse("1+5-2")
156
188
  Adding a rule of just `:int` to the `:expr` rule means that any integer is also a valid `:expr`.
157
189
  It is now possible to say that any `:expr` can be added to, multiplied by, divided by or
158
190
  subtracted from another `:expr`. It is this ability to self-reference that makes LALR(1)
159
- parsers so powerful and easy to use. Note that because the result each rule is computed
160
- *before* being passed as arguments to the block, each `:expr` in the calculations above will
161
- always be a number, since each `:expr` returns a number.
191
+ parsers so powerful and easy to use. Note that because the result of each input to any given rule
192
+ is computed *before* being passed as arguments to the block, each `:expr` in the calculations
193
+ above will always be a number, since each `:expr` returns a number. The recursion in these rules
194
+ is practically limitless. You can write "1+2-3*4+775/3" and it's still an `:expr`.
162
195
 
163
196
  ## Specifying the associativity
164
197
 
165
- Our mathematician still isn't very clever however. It makes some silly mistakes. Let's see
166
- what happens when we do the following:
198
+ If we poke around for more than a few seconds, we'll soon realize that our mathematician makes
199
+ some silly mistakes. Let's see what happens when we do the following:
167
200
 
168
201
  ``` ruby
169
202
  mathematician.parse("6-3-1")
@@ -196,16 +229,14 @@ class Mathematician < Whittle::Parser
196
229
  rule("*") % :left
197
230
  rule("/") % :left
198
231
 
199
- rule(:int) do |r|
200
- r[/[0-9]+/].as { |num| Integer(num) }
201
- end
232
+ rule(:int => /[0-9]+/).as { |num| Integer(num) }
202
233
 
203
234
  rule(:expr) do |r|
204
235
  r[:expr, "+", :expr].as { |a, _, b| a + b }
205
236
  r[:expr, "-", :expr].as { |a, _, b| a - b }
206
237
  r[:expr, "*", :expr].as { |a, _, b| a * b }
207
238
  r[:expr, "/", :expr].as { |a, _, b| a / b }
208
- r[:int].as(:value)
239
+ r[:int]
209
240
  end
210
241
 
211
242
  start(:expr)
@@ -217,11 +248,12 @@ mathematician.parse("6-3-1")
217
248
  ```
218
249
 
219
250
  Attaching a percent sign followed by either `:left` or `:right` changes the associativity of a
220
- rule. We now get the correct result.
251
+ terminal rule. We now get the correct result.
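To see why associativity matters for "6-3-1", it may help to compute both groupings by hand. The snippet below is plain Ruby and does not use Whittle; it only shows the arithmetic behind the two readings. According to the `rule.rb` hunk in this diff, rules default to `:right`, which corresponds to the second grouping.

```ruby
operands = [6, 3, 1]

# Left-associative reading: (6-3)-1, folding from the left.
left = operands.inject { |acc, n| acc - n }

# Right-associative reading: 6-(3-1), folding from the right.
right = operands.reverse.inject { |acc, n| n - acc }

left  # => 2
right # => 4
```

Marking "-" with `% :left` asks the parser for the first grouping, which is the one ordinary arithmetic expects.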
221
252
 
222
253
  ## Specifying the operator precedence
223
254
 
224
- Well, despite fixing the associativity, we find we still have a problem:
255
+ Basic arithmetic is easy peasy, right? Well, despite fixing the associativity, we find we still
256
+ have a problem:
225
257
 
226
258
  ``` ruby
227
259
  mathematician.parse("1+2*3")
@@ -241,16 +273,14 @@ class Mathematician < Whittle::Parser
241
273
  rule("*") % :left ^ 2
242
274
  rule("/") % :left ^ 2
243
275
 
244
- rule(:int) do |r|
245
- r[/[0-9]+/].as { |num| Integer(num) }
246
- end
276
+ rule(:int => /[0-9]+/).as { |num| Integer(num) }
247
277
 
248
278
  rule(:expr) do |r|
249
279
  r[:expr, "+", :expr].as { |a, _, b| a + b }
250
280
  r[:expr, "-", :expr].as { |a, _, b| a - b }
251
281
  r[:expr, "*", :expr].as { |a, _, b| a * b }
252
282
  r[:expr, "/", :expr].as { |a, _, b| a / b }
253
- r[:int].as(:value)
283
+ r[:int]
254
284
  end
255
285
 
256
286
  start(:expr)
@@ -270,7 +300,7 @@ The same applies to "*" and "/", but these both usually have a higher precedence
270
300
  ## Disambiguating expressions with the use of parentheses
271
301
 
272
302
  Sometimes we really do want "1+2*3" to mean "(1+2)*3", so we should really support this in our
273
- mathematician. Fortunately adjusting the syntax rules in Whittle is a painless exercise.
303
+ mathematician class. Fortunately adjusting the syntax rules in Whittle is a painless exercise.
274
304
 
275
305
  ``` ruby
276
306
  require 'whittle'
@@ -284,9 +314,7 @@ class Mathematician < Whittle::Parser
284
314
  rule("(")
285
315
  rule(")")
286
316
 
287
- rule(:int) do |r|
288
- r[/[0-9]+/].as { |num| Integer(num) }
289
- end
317
+ rule(:int => /[0-9]+/).as { |num| Integer(num) }
290
318
 
291
319
  rule(:expr) do |r|
292
320
  r["(", :expr, ")"].as { |_, exp, _| exp }
@@ -294,7 +322,7 @@ class Mathematician < Whittle::Parser
294
322
  r[:expr, "-", :expr].as { |a, _, b| a - b }
295
323
  r[:expr, "*", :expr].as { |a, _, b| a * b }
296
324
  r[:expr, "/", :expr].as { |a, _, b| a / b }
297
- r[:int].as(:value)
325
+ r[:int]
298
326
  end
299
327
 
300
328
  start(:expr)
@@ -306,22 +334,22 @@ mathematician.parse("(1+2)*3")
306
334
  ```
307
335
 
308
336
  All we had to do was add the new terminal rules for "(" and ")" then specify that the value of
309
- an expression enclosed in parentheses is simply the value of the expression itself.
337
+ an expression enclosed in parentheses is simply the value of the expression itself. We could
338
+ just as easily pick some other characters to surround the grouping (maybe "~1+2~*3"), but then
339
+ people would think we were silly (arguably, we would be a bit silly if we gave the expression a
340
+ curly moustache like that!).
310
341
 
311
342
  ## Skipping whitespace
312
343
 
313
344
  Most languages contain tokens that are ignored when interpreting the input, such as whitespace
314
345
  and comments. Accounting for the possibility of these in all rules would be both wasteful and
315
- tiresome. Instead, we skip them entirely, by declaring a terminal rule without any associated
316
- action, or if you want to be explicit, with `as(:nothing)`.
346
+ tiresome. Instead, we skip them entirely, by declaring a terminal rule with `#skip!`.
317
347
 
318
348
  ``` ruby
319
349
  require 'whittle'
320
350
 
321
351
  class Mathematician < Whittle::Parser
322
- rule(:wsp) do |r|
323
- r[/\s+/]
324
- end
352
+ rule(:wsp => /\s+/).skip!
325
353
 
326
354
  rule("+") % :left ^ 1
327
355
  rule("-") % :left ^ 1
@@ -331,9 +359,7 @@ class Mathematician < Whittle::Parser
331
359
  rule("(")
332
360
  rule(")")
333
361
 
334
- rule(:int) do |r|
335
- r[/[0-9]+/].as { |num| Integer(num) }
336
- end
362
+ rule(:int => /[0-9]+/).as { |num| Integer(num) }
337
363
 
338
364
  rule(:expr) do |r|
339
365
  r["(", :expr, ")"].as { |_, exp, _| exp }
@@ -341,7 +367,7 @@ class Mathematician < Whittle::Parser
341
367
  r[:expr, "-", :expr].as { |a, _, b| a - b }
342
368
  r[:expr, "*", :expr].as { |a, _, b| a * b }
343
369
  r[:expr, "/", :expr].as { |a, _, b| a / b }
344
- r[:int].as(:value)
370
+ r[:int]
345
371
  end
346
372
 
347
373
  start(:expr)
@@ -387,9 +413,7 @@ match nothing at all, which is what we hit in the middle of our nested parenthes
387
413
  This is most useful in constructs like the following:
388
414
 
389
415
  ``` ruby
390
- rule(:id) do |r|
391
- r[/[a-z]+/].as(:value)
392
- end
416
+ rule(:id => /[a-z]+/)
393
417
 
394
418
  rule(:list) do |r|
395
419
  r[].as { [] }
@@ -412,13 +436,9 @@ information.
412
436
 
413
437
  ``` ruby
414
438
  class ListParser < Whittle::Parser
415
- rule(:wsp) do |r|
416
- r[/\s+/]
417
- end
439
+ rule(:wsp => /\s+/).skip!
418
440
 
419
- rule(:id) do |r|
420
- r[/[a-z]+/].as(:value)
421
- end
441
+ rule(:id => /[a-z]+/)
422
442
 
423
443
  rule(",")
424
444
  rule("-")
@@ -447,10 +467,17 @@ something else, or rewinding the parse stack to a point where the error would no
447
467
  need to write some specs on this and explore it fully myself before I document it. 99% of users
448
468
  would never need to do such a thing.
449
469
 
470
+ ## More examples
471
+
472
+ There are some runnable examples included in the examples/ directory. Playing around with these
473
+ would probably be a useful exercise.
474
+
475
+ If you have any examples you'd like to contribute, I will gladly add them to the repository.
476
+
450
477
  ## TODO
451
478
 
452
479
  - Provide a more powerful (state based) lexer algorithm, or at least document how users can
453
- override `#lex`.
480
+ override `#lex`.
454
481
  - Allow inspection of the parse table (it is not very human friendly right now).
455
482
  - Allow inspection of the AST (maybe).
456
483
  - Given an input String, provide a human-readable explanation of the parse.
@@ -0,0 +1,59 @@
1
+ # Whittle: A little LALR(1) parser in pure ruby, without a generator.
2
+ #
3
+ # Copyright (c) Chris Corbyn, 2011
4
+
5
+ # This example creates a simple infix calculator, supporting the four basic arithmetic
6
+ # functions, add, subtract, multiply and divide, along with logic grouping and operator
7
+ # precedence
8
+
9
+ require "whittle"
10
+ require "bigdecimal"
11
+
12
+ class Calculator < Whittle::Parser
13
+ rule(:wsp => /\s+/).skip!
14
+
15
+ rule("+") % :left ^ 1
16
+ rule("-") % :left ^ 1
17
+ rule("*") % :left ^ 2
18
+ rule("/") % :left ^ 2
19
+
20
+ rule("(")
21
+ rule(")")
22
+
23
+ rule(:decimal => /([0-9]*\.)?[0-9]+/).as { |num| BigDecimal(num) }
24
+
25
+ rule(:expr) do |r|
26
+ r["(", :expr, ")"].as { |_, e, _| e }
27
+ r[:expr, "+", :expr].as { |a, _, b| a + b }
28
+ r[:expr, "-", :expr].as { |a, _, b| a - b }
29
+ r[:expr, "*", :expr].as { |a, _, b| a * b }
30
+ r[:expr, "/", :expr].as { |a, _, b| a / b }
31
+ r["-", :expr].as { |_, e| -e }
32
+ r[:decimal]
33
+ end
34
+
35
+ start(:expr)
36
+ end
37
+
38
+ calculator = Calculator.new
39
+
40
+ p calculator.parse("5-2-1").to_f
41
+ # => 2
42
+
43
+ p calculator.parse("5-2*3").to_f
44
+ # => -1
45
+
46
+ p calculator.parse(".7").to_f
47
+ # => 0.7
48
+
49
+ p calculator.parse("3.3 - .7").to_f
50
+ # => 2.6
51
+
52
+ p calculator.parse("5-(2-1)").to_f
53
+ # => 4
54
+
55
+ p calculator.parse("5 - -2").to_f
56
+ # => 7
57
+
58
+ p calculator.parse("5 * 2 - -2").to_f
59
+ # => 12
@@ -58,69 +58,42 @@ module Whittle
58
58
 
59
59
  # Declares a new rule.
60
60
  #
61
- # The are two ways to call this method. The most fundamental way is to pass a Symbol
62
- # in the +name+ parameter, along with a block, in which you will add one more possible
63
- # rules.
61
+ # There are three ways to call this method:
64
62
  #
65
- # @example Specifying multiple rules with a block
63
+ # 1. rule("+")
64
+ # 2. rule(:int => /[0-9]+/)
65
+ # 3. rule(:expr) do |r|
66
+ # r[:int, "+", :int].as { |a, _, b| a + b }
67
+ # end
66
68
  #
67
- # rule(:expr) do |r|
68
- # r[:expr, "+", :expr].as { |a, _, b| a + b }
69
- # r[:expr, "-", :expr].as { |a, _, b| a - b }
70
- # r[:expr, "/", :expr].as { |a, _, b| a / b }
71
- # r[:expr, "*", :expr].as { |a, _, b| a * b }
72
- # r[:integer].as { |i| Integer(i) }
73
- # end
69
+ # Variants (1) and (2) define basic terminal symbols (direct chunks of the input string),
70
+ # while variant (3) takes a block to define one or more nonterminal rules.
74
71
  #
75
- # Each rule specified in this way defines one of many possibilities to describe the input.
76
- # Rules may refer back to themselves, which means in the above, any integer is a valid
77
- # expr:
78
- #
79
- # 42
80
- #
81
- # Therefore any sum of integers as also a valid expr:
82
- #
83
- # 42 + 24
84
- #
85
- # Therefore any multiplication of sums of integers is also a valid expr, and so on.
86
- #
87
- # 42 + 24 * 7 + 52
88
- #
89
- # A rule like the above is called a 'nonterminal', because upon recognizing any expr, it
90
- # is possible for the rule to continue collecting input and becoming a larger expr.
91
- #
92
- # In subtle contrast, a rule like the following:
93
- #
94
- # rule("+") do |r|
95
- # r["+"].as { |plus| plus }
96
- # end
97
- #
98
- # Is called a 'terminal' token, since upon recognizing a "+", the parser cannot
99
- # add further input to the "+" itself... it is the tip of a branch in the parse tree; the
100
- # branch terminates here, and subsequently the rule is terminal.
101
- #
102
- # There is a shorthand way to write the above rule:
103
- #
104
- # rule("+")
105
- #
106
- # Not given a block, #rule treats the name parameter as a literal token.
107
- #
108
- # Note that nonterminal rules are composed of other nonterminal rules and/or terminal
109
- # rules. Terminal rules contain one, and only one Regexp pattern or fixed string.
110
- #
111
- # @param [Symbol, String] name
112
- # the name of the ruleset (note the one ruleset can contain multiple rules)
72
+ # @param [Symbol, String, Hash] name
73
+ # the name of the rule, or a Hash mapping the name to a pattern
113
74
  #
114
75
  # @return [RuleSet, Rule]
115
76
  # the newly created RuleSet if a block was given, otherwise a rule representing a
116
77
  # terminal token for the input string +name+.
117
78
  def rule(name)
118
- rules[name] = RuleSet.new(name)
119
-
120
79
  if block_given?
80
+ raise ArgumentError,
81
+ "Parser#rule does not accept both a Hash and a block" if name.kind_of?(Hash)
82
+
83
+ rules[name] = RuleSet.new(name)
121
84
  rules[name].tap { |r| yield r }
122
85
  else
123
- rules[name][name].as(:value)
86
+ key, value = if name.kind_of?(Hash)
87
+ raise ArgumentError,
88
+ "Only one element allowed in Hash for Parser#rule" unless name.length == 1
89
+
90
+ name.first
91
+ else
92
+ [name, name]
93
+ end
94
+
95
+ rules[key] = RuleSet.new(key)
96
+ rules[key][value].as(:value)
124
97
  end
125
98
  end
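The non-block branch added to `Parser#rule` above boils down to turning its argument into a name and a pattern. A minimal stand-alone sketch of that normalization follows; `terminal_key_and_pattern` is a hypothetical name used only for illustration, and the real method goes on to store a `RuleSet` under the key rather than returning the pair.

```ruby
# Hypothetical helper mirroring the Hash-vs-plain-name branch in Parser#rule.
# Returns a [key, pattern] pair for a terminal rule declaration.
def terminal_key_and_pattern(name)
  if name.kind_of?(Hash)
    # rule(:int => /[0-9]+/): exactly one name-to-pattern mapping is allowed.
    unless name.length == 1
      raise ArgumentError, "Only one element allowed in Hash for Parser#rule"
    end

    name.first # Hash#first yields the single [key, value] pair
  else
    # rule("+"): the literal token serves as both name and pattern.
    [name, name]
  end
end

terminal_key_and_pattern("+")                  # => ["+", "+"]
terminal_key_and_pattern({ :int => /[0-9]+/ }) # => [:int, /[0-9]+/]
```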
126
99
 
data/lib/whittle/rule.rb CHANGED
@@ -26,7 +26,7 @@ module Whittle
26
26
  # a variable list of components that make up the Rule
27
27
  def initialize(name, *components)
28
28
  @components = components
29
- @action = NULL_ACTION
29
+ @action = DUMP_ACTION
30
30
  @name = name
31
31
  @terminal = components.length == 1 && !components.first.kind_of?(Symbol)
32
32
  @assoc = :right
@@ -142,6 +142,8 @@ module Whittle
142
142
  # Given a block, the Rule will be reduced by passing the result of reducing
143
143
  # all inputs as arguments to the block.
144
144
  #
145
+ # The default action is to return the leftmost input unchanged.
146
+ #
145
147
  # Given the Symbol :value, the matched input will be returned verbatim.
146
148
  # Given the Symbol :nothing, nil will be returned; you can use this to
147
149
  # skip whitespace and comments, for example.
@@ -165,6 +167,14 @@ module Whittle
165
167
  end
166
168
  end
167
169
 
170
+ # Alias for as(:nothing).
171
+ #
172
+ # @return [Rule]
173
+ # returns self
174
+ def skip!
175
+ as(:nothing)
176
+ end
177
+
168
178
  # Set the associativity of this Rule.
169
179
  #
170
180
  # Accepts values of :left, :right (default) or :nonassoc.
@@ -1,3 +1,7 @@
1
+ # Whittle: A little LALR(1) parser in pure ruby, without a generator.
2
+ #
3
+ # Copyright (c) Chris Corbyn, 2011
4
+
1
5
  module Whittle
2
- VERSION = "0.0.1"
6
+ VERSION = "0.0.2"
3
7
  end
data/lib/whittle.rb CHANGED
@@ -1,3 +1,7 @@
1
+ # Whittle: A little LALR(1) parser in pure ruby, without a generator.
2
+ #
3
+ # Copyright (c) Chris Corbyn, 2011
4
+
1
5
  require "whittle/version"
2
6
  require "whittle/error"
3
7
  require "whittle/errors/unconsumed_input_error"
@@ -3,13 +3,9 @@ require "spec_helper"
3
3
  describe "a parser encountering unexpected input" do
4
4
  let(:parser) do
5
5
  Class.new(Whittle::Parser) do
6
- rule(:wsp) do |r|
7
- r[/\s+/]
8
- end
6
+ rule(:wsp => /\s+/).skip!
9
7
 
10
- rule(:id) do |r|
11
- r[/[a-z]+/].as(:value)
12
- end
8
+ rule(:id => /[a-z]+/)
13
9
 
14
10
  rule(",")
15
11
  rule("-")
@@ -6,12 +6,10 @@ describe "a parser with logical grouping" do
6
6
  rule(:expr) do |r|
7
7
  r["(", :expr, ")"].as { |_, expr, _| expr }
8
8
  r[:expr, "-", :expr].as { |a, _, b| a - b }
9
- r[:int].as(:value)
9
+ r[:int]
10
10
  end
11
11
 
12
- rule(:int) do |r|
13
- r[/[0-9]+/].as { |int| Integer(int) }
14
- end
12
+ rule(:int => /[0-9]+/).as { |int| Integer(int) }
15
13
 
16
14
  rule("(")
17
15
  rule(")")
@@ -9,12 +9,10 @@ describe "a parser with multiple precedence levels" do
9
9
  r[:expr, "-", :expr].as { |a, _, b| a - b }
10
10
  r[:expr, "*", :expr].as { |a, _, b| a * b }
11
11
  r[:expr, "/", :expr].as { |a, _, b| a / b }
12
- r[:int].as(:value)
12
+ r[:int]
13
13
  end
14
14
 
15
- rule(:int) do |r|
16
- r[/[0-9]+/].as { |int| Integer(int) }
17
- end
15
+ rule(:int => /[0-9]+/).as { |int| Integer(int) }
18
16
 
19
17
  rule("(")
20
18
  rule(")")
@@ -3,12 +3,10 @@ require "spec_helper"
3
3
  describe "a noop parser" do
4
4
  let(:parser) do
5
5
  Class.new(Whittle::Parser) do
6
- rule(:char) do |r|
7
- r[/./].as(:value)
8
- end
6
+ rule(:char => /./)
9
7
 
10
8
  rule(:prog) do |r|
11
- r[:char]
9
+ r[:char].skip!
12
10
  end
13
11
 
14
12
  start(:prog)
@@ -3,9 +3,7 @@ require "spec_helper"
3
3
  describe "a pass-through parser" do
4
4
  let(:parser) do
5
5
  Class.new(Whittle::Parser) do
6
- rule(:foo) do |r|
7
- r["FOO"].as(:value)
8
- end
6
+ rule(:foo => "FOO")
9
7
 
10
8
  start(:foo)
11
9
  end
@@ -6,14 +6,12 @@ describe "a parser depending on operator precedences" do
6
6
  rule("+") % :left ^ 1
7
7
  rule("*") % :left ^ 2
8
8
 
9
- rule(:int) do |r|
10
- r[/[0-9]+/].as { |i| Integer(i) }
11
- end
9
+ rule(:int => /[0-9]+/).as { |i| Integer(i) }
12
10
 
13
11
  rule(:expr) do |r|
14
12
  r[:expr, "+", :expr].as { |a, _, b| a + b }
15
13
  r[:expr, "*", :expr].as { |a, _, b| a * b }
16
- r[:int].as(:value)
14
+ r[:int]
17
15
  end
18
16
 
19
17
  start(:expr)
@@ -7,13 +7,11 @@ describe "a parser with a self-referential rule" do
7
7
  rule(")")
8
8
  rule("+")
9
9
 
10
- rule(:int) do |r|
11
- r[/[0-9]+/].as { |int| Integer(int) }
12
- end
10
+ rule(:int => /[0-9]+/).as { |int| Integer(int) }
13
11
 
14
12
  rule(:expr) do |r|
15
13
  r[:expr, "+", :expr].as { |a, _, b| a + b }
16
- r[:int].as(:value)
14
+ r[:int]
17
15
  end
18
16
 
19
17
  start(:expr)
@@ -3,19 +3,15 @@ require "spec_helper"
3
3
  describe "a parser that skips tokens" do
4
4
  let(:parser) do
5
5
  Class.new(Whittle::Parser) do
6
- rule(:wsp) do |r|
7
- r[/\s+/]
8
- end
6
+ rule(:wsp => /\s+/).skip!
9
7
 
10
8
  rule("-") % :left
11
9
 
12
- rule(:int) do |r|
13
- r[/[0-9]+/].as { |int| Integer(int) }
14
- end
10
+ rule(:int => /[0-9]+/).as { |int| Integer(int) }
15
11
 
16
12
  rule(:expr) do |r|
17
13
  r[:expr, "-", :expr].as { |a, _, b| a - b }
18
- r[:int].as(:value)
14
+ r[:int]
19
15
  end
20
16
 
21
17
  start(:expr)
@@ -5,9 +5,7 @@ describe "a parser returning the sum of two integers" do
5
5
  Class.new(Whittle::Parser) do
6
6
  rule("+")
7
7
 
8
- rule(:int) do |r|
9
- r[/[0-9]+/].as { |int| Integer(int) }
10
- end
8
+ rule(:int => /[0-9]+/).as { |int| Integer(int) }
11
9
 
12
10
  rule(:sum) do |r|
13
11
  r[:int, "+", :int].as { |a, _, b| a + b }
@@ -3,9 +3,7 @@ require "spec_helper"
3
3
  describe "a type-casting parser" do
4
4
  let(:parser) do
5
5
  Class.new(Whittle::Parser) do
6
- rule(:int) do |r|
7
- r[/[0-9]+/].as { |int| Integer(int) }
8
- end
6
+ rule(:int => /[0-9]+/).as { |int| Integer(int) }
9
7
 
10
8
  start(:int)
11
9
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: whittle
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.0.2
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,11 +9,11 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2011-11-27 00:00:00.000000000 Z
12
+ date: 2011-11-28 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: rspec
16
- requirement: &70265816010420 !ruby/object:Gem::Requirement
16
+ requirement: &70351976364700 !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
19
  - - ~>
@@ -21,7 +21,7 @@ dependencies:
21
21
  version: '2.6'
22
22
  type: :development
23
23
  prerelease: false
24
- version_requirements: *70265816010420
24
+ version_requirements: *70351976364700
25
25
  description: ! "Write powerful parsers by defining a series of very simple rules\n
26
26
  \ and operations to perform as those rules are matched. Whittle\n
27
27
  \ parsers are written in pure ruby and as such are extremely
@@ -40,6 +40,7 @@ files:
40
40
  - LICENSE
41
41
  - README.md
42
42
  - Rakefile
43
+ - examples/calculator.rb
43
44
  - lib/whittle.rb
44
45
  - lib/whittle/error.rb
45
46
  - lib/whittle/errors/grammar_error.rb