whittle 0.0.6 → 0.0.7
- data/README.md +49 -7
- data/examples/calculator.rb +10 -0
- data/lib/whittle/parse_error_builder.rb +92 -0
- data/lib/whittle/parser.rb +57 -64
- data/lib/whittle/rule.rb +0 -39
- data/lib/whittle/terminal.rb +46 -0
- data/lib/whittle/version.rb +1 -1
- data/lib/whittle.rb +1 -0
- data/spec/unit/parse_error_builder_spec.rb +78 -0
- data/spec/unit/parser/one_off_start_rule_spec.rb +26 -0
- metadata +9 -4
data/README.md
CHANGED
@@ -91,9 +91,7 @@ be. We use `#as` to provide an action that actually does something meaningful w
 inputs.

 We can optionally use the Hash notation to map a name with a pattern (or a fixed string) when
-we declare terminal rules too, as we have done with the `:int` rule above.
-longer way around defining terminal rules is to do like we have done for `:expr` and define a
-block, but since this is such a common use-case, Whittle offers the shorthand.
+we declare terminal rules too, as we have done with the `:int` rule above.

 As the input string is parsed, it *must* match the start rule `:expr`.

@@ -438,6 +436,17 @@ end
 The following would return the array `["a", "b", "c"]` given the input string "a, b, c", or
 given the input string "" (nothing) it would return the empty array.

+## You can use a different start rule on-demand
+
+While this is not advised in production (requiring such a thing in production would suggest
+you need to re-think your grammar), during development you may wish to specify any of your
+smaller rules as the start rule for a parse. This is particularly useful in debugging, and
+in writing unit tests.
+
+``` ruby
+parser.parse(input_string, :rule => :something_else)
+```
+
 ## Parse errors

 ### The default error reporting
@@ -464,10 +473,23 @@ class ListParser < Whittle::Parser
   start(:list)
 end

-
+str = <<-END
+one, two, three, four, five,
+six, seven, eight, nine, ten,
+eleven, twelve, thirteen,
+fourteen, fifteen - sixteen, seventeen
+END
+
+ListParser.new.parse(str)

 # =>
-# Parse error: expected "," but got "-" on line
+# Parse error: expected "," but got "-" on line 4. (Whittle::ParseError)
+#
+# Exact region marked...
+#
+# fourteen, fifteen - sixteen, seventeen
+#               ... ^ ... right here
+#
 ```

 You can also access `#line`, `#expected` and `#received` if you catch the exception.
@@ -500,6 +522,16 @@ rule("keyword")
 rule(:name => /pattern/)
 ```

+### Matching with case-insensitivity
+
+You can use the Hash notation from above, with a String as the key, mapping to a Regexp.
+
+``` ruby
+rule("function" => /function/i)
+```
+
+Now in all rules that allow case-insensitive "function", just use the String `"function"`.
+
 ### Providing a semantic action for a terminal rule

 ``` ruby
@@ -580,14 +612,20 @@ rule("+") ^ 1
 rule("*") ^ 2
 ```

-###
+### How do I make two expressions mutually reference each other?
+
+Let's say you have two types of expression, `:binary_expr` (like "a + b") and `:invocation_expr` (like "foo(bar)").
+
+What you're saying is that any argument in the invocation expression should support either another invocation, or
+a `:binary_expr`. Likewise, you want any operand of `:binary_expr` to support either another `:binary_expr` or
+an `:invocation_expr`.

 If you can explain it this simply on paper, you can explain it formally in your grammar. If `:binary_expr`
 allows `:invocation_expr` as an operand, and if `:invocation_expr` allows `:binary_expr` as an argument, then
 what you're saying is they can be used in place of each other; thus, define a rule that represents the two of them
 and use that new rule where you want to support both types of expression.

-Assuming your grammar looked something like this
+Assuming your grammar looked something like this pseudo example.

 ``` ruby
 rule("+")
@@ -648,6 +686,10 @@ Now we can parse the more complex expression "1 + foo(2, 3) + 4" without any iss

 ### How do I track state to store variables etc with Whittle?

+In general you build a complete AST to be interpreted if you're writing a program, rather than interpret the input as
+it is parsed (what would happen if something had written to disk and then a parse error occurred?). That said, in
+simple cases it may be useful to simply interpret the input as it is read.
+
 One of the goals of making Whittle all ruby was that I wouldn't have to tie people into any particular way of doing
 something. Your blocks can call any ruby code they like, so create an object of some sort those blocks can reference
 and do as you need during the parse. For example, you could add a method to the class called something like `runtime`,
data/examples/calculator.rb
CHANGED
data/lib/whittle/parse_error_builder.rb
ADDED
@@ -0,0 +1,92 @@
+# Whittle: A little LALR(1) parser in pure ruby, without a generator.
+#
+# Copyright (c) Chris Corbyn, 2011
+module Whittle
+  # Since parse errors diagram the region where the error occurred,
+  # this logic is split out from the main Parser
+  class ParseErrorBuilder
+    class << self
+      # Generates a ParseError for the given set of error conditions
+      #
+      # A ParseError always specifies the line number, the expected inputs and
+      # the received input.
+      #
+      # If possible, it also draws a diagram indicating the point where the
+      # error occurred.
+      #
+      # @param [Hash] state
+      #   all the instructions for the current parser state
+      #
+      # @param [Hash] token
+      #   the unexpected input token
+      #
+      # @param [Hash] context
+      #   the current parser context, providing line number, input string + stack etc
+      #
+      # @return [ParseError]
+      #   a detailed Exception to be raised
+      def exception(state, token, context)
+        region = extract_error_region(token[:offset], context[:input])
+        expected = extract_expected_tokens(state)
+        message = <<-ERROR.gsub(/\n(?!\n)\s+/, " ").strip
+          Parse error:
+          #{expected.count > 1 ? 'expected one of' : 'expected'}
+          #{expected.map { |k| k.inspect }.join(", ")}
+          but got
+          #{token[:name].inspect}
+          on line
+          #{token[:line]}.
+        ERROR
+
+        unless region.nil?
+          region = "\n\nExact region marked...\n\n#{region}"
+        end
+
+        ParseError.new(message + region.to_s, token[:line], expected, token[:name])
+      end
+
+      private
+
+      def extract_error_region(offset, input)
+        return if offset.nil?
+
+        # FIXME: If anybody has a cleaner way to insert the ^ marker, please do :-)
+        width = 100
+        start_offset = [offset - width, 0].max
+        end_offset = offset + width
+        before = input[start_offset, [offset, width].min]
+        after = input[offset, width]
+        before_lines = "~#{before}~".lines.to_a
+        after_lines = "~#{after}~".lines.to_a
+
+        before_lines.first.slice!(0)
+        before_lines.last.chop!
+
+        after_lines.first.slice!(0)
+        after_lines.last.chop!
+
+        region_before = before_lines.pop
+        region_after = after_lines.shift
+        error_line = region_before + region_after
+
+        padding = if region_before.length > 5
+          (" " * (region_before.length - 5)) + " ... "
+        else
+          " " * region_before.length
+        end
+
+        marker = "#{padding}^ ... right here\n\n"
+
+        unless error_line =~ /[\r\n]\Z/
+          marker = "\n#{marker}"
+        end
+
+        "#{error_line}#{marker}"
+      end
+
+      def extract_expected_tokens(state)
+        state.select { |s, i| [:shift, :accept].include?(i[:action]) }.keys
+      end
+    end
+  end
+end
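The region-marking idea in `extract_error_region` can be sketched in isolation. The following is a simplified illustration, not Whittle's implementation: `mark_offset` is a hypothetical helper name, it ignores the 100-character window and the `" ... "` padding, and it only assumes the error is reported as a character offset into the input.

```ruby
# Hypothetical helper (not part of Whittle): prints the line containing
# a character offset and puts a caret under that offset.
def mark_offset(input, offset)
  # Start of the line containing the offset (0 when there is no earlier newline).
  previous_newline = input[0, offset].rindex("\n")
  line_start = previous_newline ? previous_newline + 1 : 0

  # End of that line (end of input when there is no later newline).
  line_end = input.index("\n", offset) || input.length

  error_line = input[line_start...line_end]
  padding = " " * (offset - line_start)

  "#{error_line}\n#{padding}^ ... right here"
end

input = "one, two, three,\nfourteen, fifteen - sixteen"
puts mark_offset(input, input.index("-"))
```

Searching for line boundaries around the offset avoids the sentinel-character trick used above, at the cost of not trimming very long lines.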
data/lib/whittle/parser.rb
CHANGED
@@ -107,19 +107,6 @@ module Whittle
         @start
       end

-      # Returns the numeric value for the initial state (the state ID associated with the start
-      # rule).
-      #
-      # In most LALR(1) parsers, this would be zero, but for implementation reasons, this will
-      # be an unpredictably large (or small) number.
-      #
-      # @return [Fixnum]
-      #   the ID for the initial state in the parse table
-      def initial_state
-        prepare_start_rule
-        [rules[start], 0].hash
-      end
-
       # Returns the entire parse table used to interpret input into the parser.
       #
       # You should not need to call this method, though you may wish to inspect its contents
@@ -133,34 +120,38 @@ module Whittle
       # @return [Hash]
       #   a 2-dimensional Hash representing states with actions to perform for a given lookahead
       def parse_table
-        @parse_table ||=
-          prepare_start_rule
-          rules[start].build_parse_table(
-            {},
-            self,
-            {
-              :initial => true,
-              :state => initial_state,
-              :seen => [],
-              :offset => 0,
-              :prec => 0
-            }
-          )
-        end
+        @parse_table ||= parse_table_for_rule(start)
       end

-
-
-
-
-
-
-
-
-
+      # Prepare the parse table for a given rule instead of the start rule.
+      #
+      # Warning: this method does not memoize the result, so you should not use it in production.
+      #
+      # @param [Symbol, String] name
+      #   the name of the Rule to use as the start rule
+      #
+      # @return [Hash]
+      #   the complete parse table for this rule
+      def parse_table_for_rule(name)
+        raise GrammarError, "Undefined start rule #{name.inspect}" unless rules.key?(name)

-
+        rule = if rules[name].terminal?
+          RuleSet.new(:$start, false).tap { |r| r[name].as { |prog| prog } }
+        else
+          rules[name]
         end
+
+        rule.build_parse_table(
+          {},
+          self,
+          {
+            :initial => true,
+            :state => [rule, 0].hash,
+            :seen => [],
+            :offset => 0,
+            :prec => 0
+          }
+        )
       end
     end

@@ -185,14 +176,28 @@ module Whittle
     #
     # A successful parse returns the result of evaluating the start rule, whatever that may be.
     #
+    # It is possible to specify a different start rule during development.
+    #
+    # @example Using a different start rule
+    #
+    #   parser.parse(str, :rule => :another_rule)
+    #
     # @param [String] input
     #   a complete input string to parse according to the grammar
     #
+    # @param [Hash] options
+    #   currently the only supported option is :rule, to specify a different once-off start rule
+    #
     # @return [Object]
     #   whatever the grammar defines
-    def parse(input)
-      table =
-
+    def parse(input, options = {})
+      table = if options.key?(:rule)
+        self.class.parse_table_for_rule(options[:rule])
+      else
+        self.class.parse_table
+      end
+
+      states = [table.keys.first]
       args = []
       line = 1

@@ -223,7 +228,7 @@ module Whittle
             end
           end

-          error(state, token, :states => states, :args => args)
+          error(state, token, :input => input, :states => states, :args => args)
         end
       end
     end
@@ -246,13 +251,14 @@ module Whittle
           raise UnconsumedInputError,
             "Unmatched input #{input[offset..-1].inspect} on line #{line}" if token.nil?

-
+          token[:offset] = offset
           line, token[:line] = token[:line], line
+          offset += token[:value].length
           yield token unless token[:discarded]
         end
       end

-      yield ({ :name => :$end, :line => line, :value => nil })
+      yield ({ :name => :$end, :line => line, :value => nil, :offset => offset })
     end

     # Invoked when the parser detects an error.
@@ -267,30 +273,21 @@ module Whittle
     # @param [Hash] state
     #   the possible actions for the current parser state
     #
-    # @param [Hash]
-    #   the received token
+    # @param [Hash] token
+    #   the received token
     #
-    # @param [Hash]
-    #   the current parse context (arg stack + state stack)
-    def error(state,
-
-      message = <<-ERROR.gsub(/\n\s+/, " ").strip
-        Parse error:
-        expected
-        #{expected.map { |k| k.inspect }.join("; or ")}
-        but got
-        #{input[:name].inspect}
-        on line
-        #{input[:line]}
-      ERROR
-
-      raise ParseError.new(message, input[:line], expected, input[:name])
+    # @param [Hash] context
+    #   the current parse context (input + arg stack + state stack)
+    def error(state, token, context)
+      raise ParseErrorBuilder.exception(state, token, context)
     end

     private

     def next_token(source, offset, line)
       rules.each do |name, rule|
+        next unless rule.terminal?
+
         if token = rule.scan(source, offset, line)
           token[:name] = name
           return token
@@ -299,9 +296,5 @@ module Whittle

       nil
     end
-
-    def extract_expected_tokens(state)
-      state.select { |s, i| [:shift, :accept].include?(i[:action]) }.keys
-    end
   end
 end
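The lexer change above attaches an `:offset` to every token and advances the offset by the length of the matched text, which is what lets the error builder point at the exact position. A minimal standalone sketch of that bookkeeping follows. The `TERMINALS` table and `each_token` are illustrative names rather than Whittle's API, and line tracking is simplified so that each token simply records the line on which it starts.

```ruby
# Hypothetical terminal table for this sketch only; each pattern is
# anchored with \G so it can only match at the current offset.
TERMINALS = {
  :int   => /\G[0-9]+/,
  :plus  => /\G\+/,
  :space => /\G\s+/
}

# Yields one Hash per token, each carrying the offset where it begins,
# followed by a final :$end token positioned at the end of the input.
def each_token(input)
  return enum_for(:each_token, input) unless block_given?

  offset = 0
  line = 1

  while offset < input.length
    token = nil

    TERMINALS.each do |name, pattern|
      match = input.match(pattern, offset)
      next if match.nil?

      token = { :name => name, :value => match[0], :line => line, :offset => offset }
      break
    end

    raise ArgumentError, "Unmatched input #{input[offset..-1].inspect} on line #{line}" if token.nil?

    # Advance past the matched text, counting any newlines it contained.
    line += token[:value].count("\n")
    offset += token[:value].length

    yield token
  end

  yield({ :name => :$end, :value => nil, :line => line, :offset => offset })
end

each_token("1 +\n22").each { |token| puts token.inspect }
```

Because the end-of-input token also carries an offset, a premature end of input can be marked in the same way as any other unexpected token.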
data/lib/whittle/rule.rb
CHANGED
@@ -36,16 +36,6 @@ module Whittle
           raise ArgumentError, "Unsupported rule component #{c.class}"
         end
       end
-
-      pattern = @components.first
-
-      if terminal?
-        @pattern = if pattern.kind_of?(Regexp)
-          Regexp.new("\\G#{pattern}")
-        else
-          Regexp.new("\\G#{Regexp.escape(pattern)}")
-        end
-      end
     end

     # Predicate check for whether or not the Rule represents a terminal symbol.
@@ -213,35 +203,6 @@ module Whittle
       tap { @prec = prec.to_i }
     end

-    # Invoked for terminal rules during lexing, ignored for nonterminal rules.
-    #
-    # @param [String] source
-    #   the input String to scan
-    #
-    # @param [Fixnum] offset
-    #   the current index in the search
-    #
-    # @param [Fixnum] line
-    #   the line the lexer was up to when the previous token was matched
-    #
-    # @return [Hash]
-    #   a Hash representing the token, containing :rule, :value, :line and
-    #   :discarded, if the token is to be skipped.
-    #
-    # Returns nil if nothing is matched.
-    def scan(source, offset, line)
-      return nil unless terminal?
-
-      if match = source.match(@pattern, offset)
-        {
-          :rule => self,
-          :value => match[0],
-          :line => line + match[0].count("\r\n", "\n"),
-          :discarded => @action.equal?(NULL_ACTION)
-        }
-      end
-    end
-
     private

     def resolve_conflicts(instructions)
data/lib/whittle/terminal.rb
CHANGED
@@ -5,8 +5,54 @@
 module Whittle
   # Represents a terminal Rule, matching a pattern in the input String
   class Terminal < Rule
+    # Hard-coded to always return true
     def terminal?
       true
     end
+
+    # Invoked for terminal rules during lexing, ignored for nonterminal rules.
+    #
+    # @param [String] source
+    #   the input String to scan
+    #
+    # @param [Fixnum] offset
+    #   the current index in the search
+    #
+    # @param [Fixnum] line
+    #   the line the lexer was up to when the previous token was matched
+    #
+    # @return [Hash]
+    #   a Hash representing the token, containing :rule, :value, :line and
+    #   :discarded, if the token is to be skipped.
+    #
+    # Returns nil if nothing is matched.
+    def scan(source, offset, line)
+      if match = source.match(@pattern, offset)
+        {
+          :rule => self,
+          :value => match[0],
+          :line => line + match[0].count("\r\n", "\n"),
+          :discarded => @action.equal?(NULL_ACTION)
+        }
+      end
+    end
+
+    private
+
+    def initialize(name, *components)
+      raise ArgumentError, \
+        "Rule #{name.inspect} is terminal and can only have one rule component" \
+        unless components.length == 1
+
+      super
+
+      pattern = components.first
+
+      @pattern = if pattern.kind_of?(Regexp)
+        Regexp.new("\\G#{pattern}")
+      else
+        Regexp.new("\\G#{Regexp.escape(pattern)}")
+      end
+    end
   end
 end
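The `\G` prefix built in `Terminal#initialize` is what forces a terminal to match exactly at the lexer's current offset instead of anywhere later in the input. A small standalone sketch of the same technique is below. `anchored` and `scan_at` are hypothetical helper names used only for illustration. Interpolating a Regexp into a String uses `Regexp#to_s`, which keeps flags such as `i`, so a case-insensitive terminal stays case-insensitive after anchoring.

```ruby
# Hypothetical helpers (not part of Whittle) showing \G-anchored matching.

# Builds a pattern that can only match at the position passed to String#match.
def anchored(pattern)
  source = pattern.kind_of?(Regexp) ? pattern.to_s : Regexp.escape(pattern)
  Regexp.new("\\G#{source}")
end

# Returns the text matched at exactly +offset+, or nil when nothing matches there.
def scan_at(pattern, source, offset)
  match = source.match(anchored(pattern), offset)
  match && match[0]
end

p scan_at(/[0-9]+/, "1 + 23", 4)            # digits begin at offset 4
p scan_at(/[0-9]+/, "1 + 23", 1)            # offset 1 is a space, so no match
p scan_at("+", "1 + 23", 2)                 # plain Strings are escaped first
p scan_at(/function/i, "FUNCTION foo", 0)   # the i flag survives interpolation
```

Without the anchor, the second call would skip ahead and return `"23"`, which is exactly the behaviour a lexer must avoid.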
data/lib/whittle/version.rb
CHANGED
data/lib/whittle.rb
CHANGED
@@ -7,6 +7,7 @@ require "whittle/error"
 require "whittle/errors/unconsumed_input_error"
 require "whittle/errors/parse_error"
 require "whittle/errors/grammar_error"
+require "whittle/parse_error_builder"
 require "whittle/rule"
 require "whittle/terminal"
 require "whittle/non_terminal"
data/spec/unit/parse_error_builder_spec.rb
ADDED
@@ -0,0 +1,78 @@
+require "spec_helper"
+
+describe Whittle::ParseErrorBuilder do
+  let(:context) do
+    {
+      :input => "one two three four five\nsix seven eight nine ten\neleven twelve"
+    }
+  end
+
+  let(:state) do
+    {
+      "gazillion" => { :action => :shift, :state => 7 }
+    }
+  end
+
+  context "given an error region in the middle of a line" do
+    let(:token) do
+      {
+        :name => "eight",
+        :value => "eight",
+        :offset => 34
+      }
+    end
+
+    let(:indicator) do
+      Regexp.escape(
+        "six seven eight nine ten\n" <<
+        "      ... ^ ..."
+      )
+    end
+
+    it "indicates the exact region" do
+      Whittle::ParseErrorBuilder.exception(state, token, context).message.should =~ /#{indicator}/
+    end
+  end
+
+  context "given an error region near the start of a line" do
+    let(:token) do
+      {
+        :name => "two",
+        :value => "two",
+        :offset => 4
+      }
+    end
+
+    let(:indicator) do
+      Regexp.escape(
+        "one two three four five\n" <<
+        "    ^ ..."
+      )
+    end
+
+    it "indicates the exact region" do
+      Whittle::ParseErrorBuilder.exception(state, token, context).message.should =~ /#{indicator}/
+    end
+  end
+
+  context "given an error region near the end of a line" do
+    let(:token) do
+      {
+        :name => "five",
+        :value => "five",
+        :offset => 19
+      }
+    end
+
+    let(:indicator) do
+      Regexp.escape(
+        "one two three four five\n" <<
+        "               ... ^ ..."
+      )
+    end
+
+    it "indicates the exact region" do
+      Whittle::ParseErrorBuilder.exception(state, token, context).message.should =~ /#{indicator}/
+    end
+  end
+end
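The first line of the error message is derived from the parser state: only lookaheads whose instruction is a `:shift` or `:accept` count as "expected". The sketch below reproduces that selection and the singular/plural wording outside of Whittle. `expected_tokens` and `describe_error` are hypothetical names, and the state Hash is made-up sample data in the same shape as the `state` used in the spec above.

```ruby
# Hypothetical helpers (not part of Whittle) mirroring how the expected
# tokens are picked out of a parser state and turned into a message.

# Keeps only the lookaheads the parser could have consumed in this state.
def expected_tokens(state)
  state.select { |_, instruction| [:shift, :accept].include?(instruction[:action]) }.keys
end

# Builds the one-line summary, switching wording when several tokens are possible.
def describe_error(state, token)
  expected = expected_tokens(state)
  prefix = expected.count > 1 ? "expected one of" : "expected"

  "Parse error: #{prefix} #{expected.map { |k| k.inspect }.join(", ")} " \
    "but got #{token[:name].inspect} on line #{token[:line]}."
end

# Sample state: a goto entry is present but is not something the input can supply.
state = {
  ","   => { :action => :shift, :state => 7 },
  :list => { :action => :goto, :state => 3 }
}

puts describe_error(state, { :name => "-", :line => 4 })
```

Filtering out `:goto` and `:reduce` entries keeps nonterminal names and internal bookkeeping out of the message, so the user only sees input they could actually type.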
data/spec/unit/parser/one_off_start_rule_spec.rb
ADDED
@@ -0,0 +1,26 @@
+require "spec_helper"
+
+describe "parsing according to a different start rule" do
+  let(:parser) do
+    Class.new(Whittle::Parser) do
+      rule("+")
+      rule("-")
+
+      rule(:int => /[0-9]+/).as { |i| Integer(i) }
+
+      rule(:sum) do |r|
+        r[:int, "+", :int].as { |a, _, b| a + b }
+      end
+
+      rule(:sub) do |r|
+        r[:sum, "-", :sum].as { |a, _, b| a - b }
+      end
+
+      start(:sub)
+    end
+  end
+
+  it "ignores the defined start rule and uses the specified one" do
+    parser.new.parse("1+2", :rule => :sum).should == 3
+  end
+end
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: whittle
 version: !ruby/object:Gem::Version
-  version: 0.0.
+  version: 0.0.7
 prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2011-12-
+date: 2011-12-03 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec
-  requirement: &
+  requirement: &70312516285540 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ~>
@@ -21,7 +21,7 @@ dependencies:
       version: '2.6'
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *70312516285540
 description: ! "Write powerful parsers by defining a series of very simple rules\n
   \ and operations to perform as those rules are matched. Whittle\n
   \ parsers are written in pure ruby and as such are extremely
@@ -47,12 +47,14 @@ files:
 - lib/whittle/errors/parse_error.rb
 - lib/whittle/errors/unconsumed_input_error.rb
 - lib/whittle/non_terminal.rb
+- lib/whittle/parse_error_builder.rb
 - lib/whittle/parser.rb
 - lib/whittle/rule.rb
 - lib/whittle/rule_set.rb
 - lib/whittle/terminal.rb
 - lib/whittle/version.rb
 - spec/spec_helper.rb
+- spec/unit/parse_error_builder_spec.rb
 - spec/unit/parser/empty_rule_spec.rb
 - spec/unit/parser/empty_string_spec.rb
 - spec/unit/parser/error_reporting_spec.rb
@@ -60,6 +62,7 @@ files:
 - spec/unit/parser/multiple_precedence_spec.rb
 - spec/unit/parser/non_terminal_ambiguity_spec.rb
 - spec/unit/parser/noop_spec.rb
+- spec/unit/parser/one_off_start_rule_spec.rb
 - spec/unit/parser/pass_through_parser_spec.rb
 - spec/unit/parser/precedence_spec.rb
 - spec/unit/parser/premature_eof_spec.rb
@@ -96,6 +99,7 @@ specification_version: 3
 summary: An efficient, easy to use, LALR parser for Ruby
 test_files:
 - spec/spec_helper.rb
+- spec/unit/parse_error_builder_spec.rb
 - spec/unit/parser/empty_rule_spec.rb
 - spec/unit/parser/empty_string_spec.rb
 - spec/unit/parser/error_reporting_spec.rb
@@ -103,6 +107,7 @@ test_files:
 - spec/unit/parser/multiple_precedence_spec.rb
 - spec/unit/parser/non_terminal_ambiguity_spec.rb
 - spec/unit/parser/noop_spec.rb
+- spec/unit/parser/one_off_start_rule_spec.rb
 - spec/unit/parser/pass_through_parser_spec.rb
 - spec/unit/parser/precedence_spec.rb
 - spec/unit/parser/premature_eof_spec.rb