yaparc 0.0.7 → 0.0.8
- data/README +172 -8
- data/lib/yaparc.rb +402 -69
- data/tests/test_calc.rb +42 -105
- data/tests/test_parser.rb +304 -67
- metadata +2 -2
data/README
CHANGED
@@ -1,10 +1,11 @@
|
|
1
1
|
= Synopsis
|
2
2
|
|
3
|
-
This is a yet another simple combinator parser
|
3
|
+
There are several implementations of parser combinators in Ruby. This is yet another simple combinator parser library in Ruby.
|
4
4
|
|
5
5
|
= Requirements
|
6
6
|
|
7
7
|
* Ruby (http://www.ruby-lang.org/)
|
8
|
+
* RubyGems (http://rubyforge.org/projects/rubygems/)
|
8
9
|
|
9
10
|
= Install
|
10
11
|
|
@@ -12,19 +13,182 @@ This is a yet another simple combinator parser library in ruby.
|
|
12
13
|
|
13
14
|
= Usage
|
14
15
|
|
15
|
-
|
16
|
-
require_gem 'yaparc'
|
16
|
+
In a combinator parser, each parser is constructed as a function that takes the input string as its argument. Larger parsers are built from smaller parsers. Although combinators are higher-order functions in ordinary functional languages, in yaparc they are implemented as classes, because Ruby is more object-oriented than functional.
|
17
17
|
|
18
|
+
All parsers have a 'parse' method, each of which takes the input string as its argument, except SatisfyParser. All of them return an array of arrays as their result: the empty array [] denotes failure, and a singleton array [[v, xs]] indicates success, with value v and unconsumed input xs as a String instance.
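The result convention can be illustrated in plain Ruby. This is a minimal sketch and not yaparc's implementation: the class SketchItemParser and the helper succeeded? are hypothetical names introduced only for this illustration.

```ruby
# Hypothetical stand-in for a parser that follows the result convention:
# [] on failure, [[value, rest]] on success.
class SketchItemParser
  def parse(input)
    return [] if input.empty?        # failure: the empty array
    [[input[0], input[1..-1]]]       # success: first character plus unconsumed input
  end
end

# Hypothetical helper: a result is a success when it is not empty.
def succeeded?(result)
  !result.empty?
end

result = SketchItemParser.new.parse("abc")
if succeeded?(result)
  value, rest = result.first         # destructure the singleton [[v, xs]]
  puts "value=#{value.inspect} rest=#{rest.inspect}"
end
```

A caller only needs to check whether the returned array is empty, then destructure the single [v, xs] pair.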
|
18
19
|
|
19
|
-
|
20
|
+
== Primitive Parsers
|
20
21
|
|
22
|
+
* SucceedParser
|
23
|
+
* FailParser
|
24
|
+
* ItemParser
|
25
|
+
* SatisfyParser
|
21
26
|
|
22
|
-
|
27
|
+
=== SucceedParser class
|
23
28
|
|
24
|
-
|
29
|
+
The parser SucceedParser always succeeds with the result value, without consuming any of the input string.
|
30
|
+
In the following example, SucceedParser#parse takes an input string "blah, blah, blah" and returns the singleton array [[1, "blah, blah, blah"]].
|
25
31
|
|
26
|
-
|
27
|
-
|
32
|
+
parser = SucceedParser.new(1)
|
33
|
+
parser.parse("blah, blah, blah")
|
34
|
+
=> [[1, "blah, blah, blah"]]
|
28
35
|
|
36
|
+
=== FailParser class
|
37
|
+
|
38
|
+
The parser FailParser always fails, regardless of the contents of the input string.
|
39
|
+
|
40
|
+
parser = FailParser.new
|
41
|
+
parser.parse("abc")
|
42
|
+
=> []
|
43
|
+
|
44
|
+
=== ItemParser class
|
45
|
+
|
46
|
+
The parser ItemParser fails if the input string is empty, and succeeds with the first character as the result value otherwise.
|
47
|
+
|
48
|
+
parser = ::Yaparc::ItemParser.new
|
49
|
+
parser.parse("abc")
|
50
|
+
=> [["a", "bc"]]
|
51
|
+
|
52
|
+
=== SatisfyParser class
|
53
|
+
|
54
|
+
The parser SatisfyParser recognizes a single input item that satisfies a given predicate. The predicate decides whether an arbitrary input item is acceptable.
|
55
|
+
|
56
|
+
is_integer = lambda do |i|
|
57
|
+
begin
|
58
|
+
Integer(i)
|
59
|
+
true
|
60
|
+
rescue
|
61
|
+
false
|
62
|
+
end
|
63
|
+
end
|
64
|
+
parser = SatisfyParser.new(is_integer)
|
65
|
+
parser.parse("123")
|
66
|
+
=> [["1", "23"]]
|
67
|
+
|
68
|
+
|
69
|
+
== Combining Parsers
|
70
|
+
|
71
|
+
* AltParser
|
72
|
+
* SeqParser
|
73
|
+
* ManyParser
|
74
|
+
* ManyOneParser
|
75
|
+
|
76
|
+
|
77
|
+
|
78
|
+
=== Sequencing parser
|
79
|
+
|
80
|
+
The SeqParser corresponds to sequencing in BNF. The following parser recognizes whatever Symbol.new('+') and Natural.new would recognize when applied in succession.
|
81
|
+
|
82
|
+
parser = SeqParser.new(Symbol.new('+'), Natural.new)
|
83
|
+
parser.parse("+321")
|
84
|
+
=> [[321,""]]
|
85
|
+
|
86
|
+
If a block is given to SeqParser, the block receives the results of the component parsers and constructs the logical structure of the input from them.
|
87
|
+
|
88
|
+
parser = SeqParser.new(Symbol.new('+'), Natural.new) do | plus, nat|
|
89
|
+
nat
|
90
|
+
end
|
91
|
+
parser.parse("+1234")
|
92
|
+
=> [[1234,""]]
|
93
|
+
|
94
|
+
In this way a parser can produce a parse tree that reflects the semantic structure of the input.
|
95
|
+
|
96
|
+
=== Alternation parser
|
97
|
+
|
98
|
+
The AltParser class is an alternation parser: it returns the result of the first parser that succeeds, and fails if none of them does.
|
99
|
+
|
100
|
+
|
101
|
+
parser = AltParser.new(
|
102
|
+
SeqParser.new(Symbol.new('+'), Natural.new) do | _, nat|
|
103
|
+
nat
|
104
|
+
end,
|
105
|
+
Natural.new
|
106
|
+
)
|
107
|
+
parser.parse("1234")
|
108
|
+
=> [[1234,""]]
|
109
|
+
parser.parse("-1234")
|
110
|
+
=> []
|
111
|
+
|
112
|
+
|
113
|
+
=== ManyParser
|
114
|
+
|
115
|
+
ManyParser applies the given parser zero or more times.
|
116
|
+
|
117
|
+
parser = ManyParser.new(SatisfyParser.new(lambda {|i| i >= '0' and i <= '9'}))
|
118
|
+
parser.parse("123abc")
|
119
|
+
=> [["123", "abc"]]
|
120
|
+
|
121
|
+
|
122
|
+
=== ManyOneParser
|
123
|
+
|
124
|
+
The ManyOneParser requires at least one successful application of the given parser.
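The difference between zero-or-more and one-or-more repetition can be sketched in plain Ruby. The classes below (SketchSatisfy, SketchMany, SketchManyOne) are hypothetical and are not yaparc's implementation. In particular, returning an empty string as the value when nothing matches is an assumption of this sketch.

```ruby
# Hypothetical sketch classes; names and details are not taken from yaparc.
class SketchSatisfy
  def initialize(predicate)
    @predicate = predicate
  end

  # Succeeds with the first character when the predicate accepts it.
  def parse(input)
    return [] if input.empty? || !@predicate.call(input[0])
    [[input[0], input[1..-1]]]
  end
end

class SketchMany
  def initialize(parser)
    @parser = parser
  end

  # Zero or more applications: never fails, may consume nothing.
  def parse(input)
    matched = ""
    rest = input
    until (result = @parser.parse(rest)).empty?
      value, rest = result.first
      matched += value
    end
    [[matched, rest]]
  end
end

class SketchManyOne
  def initialize(parser)
    @parser = parser
  end

  # One or more applications: fails unless the first application succeeds.
  def parse(input)
    first = @parser.parse(input)
    return [] if first.empty?
    value, rest = first.first
    more, rest = SketchMany.new(@parser).parse(rest).first
    [[value + more, rest]]
  end
end

digit = SketchSatisfy.new(lambda { |c| c >= '0' && c <= '9' })
p SketchMany.new(digit).parse("abc")        # zero matches still succeed
p SketchManyOne.new(digit).parse("abc")     # zero matches fail
p SketchManyOne.new(digit).parse("123abc")  # consumes the leading digits
```

In this sketch the one-or-more parser is built from a single mandatory application followed by the zero-or-more parser, which is the usual way the two are related.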
|
125
|
+
|
126
|
+
|
127
|
+
== Tokenized parser
|
128
|
+
|
129
|
+
* Identifier
|
130
|
+
|
131
|
+
Parser for identifier
|
132
|
+
|
133
|
+
* Natural
|
134
|
+
|
135
|
+
Parser for natural number
|
136
|
+
|
137
|
+
* Symbol
|
138
|
+
|
139
|
+
|
140
|
+
== Define your own parser
|
141
|
+
|
142
|
+
|
143
|
+
There are two ways to define your own parser. One is to inherit from the Yaparc::ParserBase class.
|
144
|
+
|
145
|
+
class StringMatch < Yaparc::ParserBase
|
146
|
+
|
147
|
+
def initialize(literal)
|
148
|
+
@parser = Token.new(StringParser.new(literal))
|
149
|
+
end
|
150
|
+
end
|
151
|
+
|
152
|
+
The other is to inherit from the Yaparc::AbstractParser class.
|
153
|
+
|
154
|
+
class Identifier < Yaparc::AbstractParser
|
155
|
+
def initialize
|
156
|
+
@parser = lambda do
|
157
|
+
Token.new(Ident.new)
|
158
|
+
end
|
159
|
+
end
|
160
|
+
end
|
161
|
+
|
162
|
+
If a parser definition needs to refer to its own class recursively, you have to use this second approach.
|
163
|
+
In the following example, note that the Expr class is instantiated inside the Expr#initialize method.
|
164
|
+
|
165
|
+
class Expr < Yaparc::AbstractParser
|
166
|
+
def initialize
|
167
|
+
@parser = lambda do
|
168
|
+
Yaparc::AltParser.new(
|
169
|
+
Yaparc::SeqParser.new(Term.new,
|
170
|
+
Yaparc::Symbol.new('+'),
|
171
|
+
Expr.new) do |term, _, expr|
|
172
|
+
['+', term,expr]
|
173
|
+
end,
|
174
|
+
Term.new
|
175
|
+
)
|
176
|
+
end
|
177
|
+
end
end
|
178
|
+
|
179
|
+
When constructing your parsers, note that left recursion leads to non-termination of the parser.
|
180
|
+
|
181
|
+
== Avoiding left-recursion
|
182
|
+
|
183
|
+
A ::= A B | C
|
184
|
+
|
185
|
+
is equivalent to
|
186
|
+
|
187
|
+
A ::= C B*
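The rewrite can be illustrated with a small example in plain Ruby that does not use yaparc. The grammar and the function names parse_num and parse_expr are hypothetical. A direct translation of the left-recursive rule would call parse_expr as its first step without consuming any input, and so would recurse forever. The rewritten form parses C once and then loops over B.

```ruby
# Hypothetical example grammar (not part of yaparc):
#   expr ::= expr '-' num | num      (left-recursive, A ::= A B | C)
# rewritten as
#   expr ::= num ('-' num)*          (A ::= C B*)
# Both functions return [[value, rest]] on success and [] on failure.

def parse_num(input)
  m = /\A\d+/.match(input)
  m ? [[m[0].to_i, m.post_match]] : []
end

def parse_expr(input)
  first = parse_num(input)           # C
  return [] if first.empty?
  tree, rest = first.first
  # B*: repeat "'-' num", folding to the left to keep left associativity
  while rest.start_with?('-')
    nxt = parse_num(rest[1..-1])
    break if nxt.empty?
    num, rest = nxt.first
    tree = ['-', tree, num]
  end
  [[tree, rest]]
end

p parse_expr("7-2-1")
```

Folding inside the loop keeps the left-associative shape that the original left-recursive rule expressed, so "7-2-1" groups as (7-2)-1.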
|
188
|
+
|
189
|
+
|
190
|
+
== Tokenization
|
191
|
+
|
192
|
+
When you want to tokenize the input stream, use the Token class.
|
29
193
|
|
30
194
|
|