citrus 1.2.1 → 1.2.2

data/README CHANGED
@@ -7,59 +7,319 @@
7
7
 
8
8
  Citrus is a compact and powerful parsing library for Ruby that combines the
9
9
  elegance and expressiveness of the language with the simplicity and power of
10
- parsing expression grammars.
10
+ parsing expressions.
11
11
 
12
- Citrus grammars look very much like Treetop grammars but take a completely
13
- different approach. Instead of generating parsers from your grammars, Citrus
14
- evaluates grammars and rules in memory as Ruby modules. In fact, you can even
15
- define your grammars as Ruby modules in the first place, entirely skipping the
16
- parsing/evaluation step.
17
12
 
18
- Terminals are represented as either strings or regular expressions. Support for
19
- sequences, choices, labels, repetition, and lookahead (both positive and
20
- negative) are all included, as well as character classes and the dot-matches-
21
- anything symbol.
13
+ ** Installation **
22
14
 
23
- To try it out, fire up an IRB session from the root of the project and run one
24
- of the examples.
25
15
 
26
- $ irb -Ilib
27
- > require 'citrus'
28
- => true
29
- > Citrus.load 'examples/calc'
30
- => [Calc]
31
- > match = Calc.parse '1 + 5'
32
- => #<Citrus::Match ...
33
- > match.value
34
- => 6
16
+ Via RubyGems:
35
17
 
36
- Be sure to try requiring `citrus/debug' (instead of just `citrus') if you'd like
37
- some better visualization of the match results.
18
+ $ sudo gem install citrus
38
19
 
39
- The code base is very small and it's well-documented and tested, so it should be
40
- fairly easy to understand for anyone who is familiar with parsing expressions.
20
+ From a local copy:
41
21
 
22
+ $ git clone git://github.com/mjijackson/citrus.git
23
+ $ cd citrus
24
+ $ rake package && sudo rake install
42
25
 
43
- ** Links **
44
26
 
27
+ ** Background **
45
28
 
46
- http://pdos.csail.mit.edu/~baford/packrat/
47
- http://en.wikipedia.org/wiki/Parsing_expression_grammar
48
- http://treetop.rubyforge.org/index.html
49
29
 
30
+ In order to be able to use Citrus effectively, you must first understand the
31
+ difference between syntax and semantics. Syntax is a set of rules that govern
32
+ the way letters and punctuation may be used in a language. For example, English
33
+ syntax dictates that proper nouns should start with a capital letter and that
34
+ sentences should end with a period.
50
35
 
51
- ** Installation **
36
+ Semantics are the rules by which meaning may be derived in a language. For
37
+ example, as you read a book you are able to make some sense of the particular
38
+ way in which words on a page are combined to form thoughts and express ideas
39
+ because you understand what the words themselves mean and you can understand
40
+ what they mean collectively.
52
41
 
42
+ Computers use a similar process when interpreting code. First, the code must be
43
+ parsed into recognizable symbols or tokens. These tokens may then be passed to
44
+ an interpreter which is responsible for forming actual instructions from them.
53
45
 
54
- Via RubyGems:
46
+ Citrus is a pure Ruby library that allows you to perform both lexical analysis
47
+ and semantic interpretation quickly and easily. Using Citrus you can write
48
+ powerful parsers that are simple to understand and easy to create and maintain.
55
49
 
56
- $ sudo gem install citrus
50
+ In Citrus, there are three main types of objects: rules, grammars, and matches.
57
51
 
58
- From a local copy:
52
+ == Rules
53
+
54
+ A rule is an object that specifies some matching behavior on a string. There are
55
+ two types of rules: terminals and non-terminals. Terminals can be either Ruby
56
+ strings or regular expressions that specify some input to match. For example, a
57
+ terminal created from the string "end" would match any sequence of the
58
+ characters "e", "n", and "d", in that order. A terminal created from a regular
59
+ expression uses Ruby's regular expression engine to attempt to create a match.
60
+
61
+ Non-terminals are rules that may contain other rules but do not themselves match
62
+ directly on the input. For example, a Repeat is a non-terminal that may contain
63
+ one other rule that will try and match a certain number of times. Several other
64
+ types of non-terminals are available that will be discussed later.
65
+
66
+ Rule objects may also have semantic information associated with them in the form
67
+ of Ruby modules. These modules contain methods that will be used to extend any
68
+ match objects created by the rule with which they are associated.
69
+
70
+ == Grammars
71
+
72
+ A grammar is a container for rules. Usually the rules in a grammar collectively
73
+ form a complete specification for some language, or a well-defined subset
74
+ thereof.
75
+
76
+ A Citrus grammar is really just a souped-up Ruby module. These modules may be
77
+ included in other grammar modules in the same way that Ruby modules are normally
78
+ used. This property allows you to divide a complex grammar into reusable pieces
79
+ that may be combined dynamically at runtime. Any grammar rule with the same name
80
+ as a rule in an included grammar may access that rule with a mechanism similar
81
+ to Ruby's super keyword.
82
+
83
+ == Matches
84
+
85
+ Matches are created by rule objects when they match on the input. A match
86
+ contains the string of text that made up the match as well as its offset in the
87
+ original input string. During a parse, matches are arranged in a tree structure
88
+ where any match may contain any number of other matches. This structure is
89
+ determined by the way in which the rule that generated each match is used in the
90
+ grammar.
91
+
92
+ For example, a match that is created from a non-terminal rule that contains
93
+ several other terminals will likewise contain several matches, one for each
94
+ terminal.
95
+
96
+ Match objects may be extended with semantic information in the form of methods.
97
+ These methods can interpret the text of a match using the wealth of information
98
+ available to them including the text of the match, its position in the input,
99
+ and any submatches.
100
+
101
+
102
+ ** Syntax **
103
+
104
+
105
+ The most straightforward way to compose a Citrus grammar is to use Citrus' own
106
+ custom grammar syntax. This syntax borrows heavily from Ruby, so it should
107
+ already be familiar to Ruby programmers.
108
+
109
+ == Terminals
110
+
111
+ Terminals may be represented by a string or a regular expression. Both follow
112
+ the same rules as Ruby string and regular expression literals.
113
+
114
+ 'abc'
115
+ "abc\n"
116
+ /\xFF/
117
+
118
+ Character classes and the dot (match anything) symbol are supported as well for
119
+ compatibility with other parsing expression implementations.
120
+
121
+ [a-z0-9] # match any lowercase letter or digit
122
+ [\x00-\xFF] # match any octet
123
+ . # match anything, even new lines
124
+
125
+ == Repetition
126
+
127
+ Quantifiers may be used after any expression to specify a number of times it
128
+ must match. The universal form of a quantifier is N*M where N is the minimum and
129
+ M is the maximum number of times the expression may match.
130
+
131
+ 'abc'1*2 # match "abc" a minimum of one, maximum
132
+ # of two times
133
+ 'abc'1* # match "abc" at least once
134
+ 'abc'*2 # match "abc" a maximum of twice
135
+
136
+ The + and ? operators are supported as well for the common cases of 1* and *1
137
+ respectively.
138
+
139
+ 'abc'+ # match "abc" at least once
140
+ 'abc'? # match "abc" a maximum of once
141
+
142
+ == Lookahead
143
+
144
+ Both positive and negative lookahead are supported in Citrus. Use the & and !
145
+ operators to indicate that an expression either should or should not match. In
146
+ neither case is any input consumed.
147
+
148
+ &'a' 'b' # match a "b" preceded by an "a"
149
+ !'a' 'b' # match a "b" that is not preceded by an "a"
150
+ !'a' . # match any character except for "a"
151
+
152
+ == Sequences
153
+
154
+ Sequences of expressions may be separated by a space to indicate that the rules
155
+ should match in that order.
156
+
157
+ 'a' 'b' 'c' # match "a", then "b", then "c"
158
+ 'a' [0-9] # match "a", then a numeric digit
159
+
160
+ == Choices
161
+
162
+ Ordered choice is indicated by a vertical bar that separates two expressions.
163
+ Note that any operator binds more tightly than the bar.
59
164
 
60
- $ git clone git://github.com/mjijackson/citrus.git
61
- $ cd citrus
62
- $ rake package && sudo rake install
165
+ 'a' | 'b' # match "a" or "b"
166
+ 'a' 'b' | 'c' # match "a" then "b" (in sequence), or "c"
167
+
168
+ == Super
169
+
170
+ When including a grammar inside another, all rules in the child that have the
171
+ same name as a rule in the parent also have access to the super keyword to
172
+ invoke the parent rule.
173
+
174
+ == Labels
175
+
176
+ Match objects may be referred to by a different name than the rule that
177
+ originally generated them. Labels are created by placing the label and a colon
178
+ immediately preceding any expression.
179
+
180
+ chars:/[a-z]+/ # the characters matched by the regular
181
+ # expression may be referred to as "chars"
182
+ # in a block method
183
+
184
+
185
+ ** Example **
186
+
187
+
188
+ Below is an example of a simple grammar that is able to parse strings of
189
+ integers separated by any amount of white space and a + symbol.
190
+
191
+ grammar Addition
192
+ rule additive
193
+ number plus (additive | number)
194
+ end
195
+
196
+ rule number
197
+ [0-9]+ space
198
+ end
199
+
200
+ rule plus
201
+ '+' space
202
+ end
203
+
204
+ rule space
205
+ [ \t]*
206
+ end
207
+ end
208
+
209
+ Several things to note about the above example:
210
+
211
+ * Grammar and rule declarations end with the "end" keyword
212
+ * A Sequence of rules is created by separating expressions with a space
213
+ * Likewise, ordered choice is represented with a vertical bar
214
+ * Parentheses may be used to override the natural binding order
215
+ * Rules may refer to other rules in their own definitions simply by using the
216
+ other rule's name
217
+ * Any expression may be followed by a quantifier
218
+
219
+ == Interpretation
220
+
221
+ The grammar above is able to parse simple mathematical expressions such as "1+2"
222
+ and "1 + 2+3", but it does not have enough semantic information to be able to
223
+ actually interpret these expressions.
224
+
225
+ At this point, when the grammar parses a string it generates a tree of Match
226
+ objects. Each match is created by a rule. A match will know what text it
227
+ contains, its offset in the original input, and what submatches it contains.
228
+
229
+ Submatches are created whenever a rule contains another rule. For example, in
230
+ the grammar above the number rule matches a string of digits followed by white
231
+ space. Thus, a match generated by the number rule will contain two submatches.
232
+
233
+ We can use Ruby's block syntax to create a module that will be attached to these
234
+ matches when they are created and is used to lazily extend them when we want to
235
+ interpret them. The following example shows one way to do this.
236
+
237
+ grammar Addition
238
+ rule additive
239
+ (number plus term) {
240
+ def value
241
+ number.value + term.value
242
+ end
243
+ }
244
+ end
245
+
246
+ rule term
247
+ (additive | number) {
248
+ def value
249
+ first.value
250
+ end
251
+ }
252
+ end
253
+
254
+ rule number
255
+ ([0-9]+ space) {
256
+ def value
257
+ text.strip.to_i
258
+ end
259
+ }
260
+ end
261
+
262
+ rule plus
263
+ '+' space
264
+ end
265
+
266
+ rule space
267
+ [ \t]*
268
+ end
269
+ end
270
+
271
+ In this version of the grammar the additive rule has been refactored to use the
272
+ term rule. This makes it a little cleaner to define our semantic blocks. It's
273
+ easiest to explain what is going on here by starting with the lowest level
274
+ block, which is defined within the number rule.
275
+
276
+ The semantic block associated with the number rule defines one method, value.
277
+ This method will be present on all matches that result from this rule. Inside
278
+ this method, we can see that the value of a number match is determined to be
279
+ its text value, stripped of white space and converted to an integer.
280
+
281
+ Similarly, the block that is applied to term matches also defines a value
282
+ method. However, this method works a bit differently. Since a term matches an
283
+ additive or a number a term match will contain one submatch, the match that
284
+ resulted from either additive or number. The first method retrieves the first
285
+ submatch. So, the value of a term is determined to be the value of its first
286
+ submatch.
287
+
288
+ Finally, the additive rule also extends its matches with a value method. Here,
289
+ the value of an additive is determined to be the values of its number and term
290
+ matches added together using Ruby's addition operator.
291
+
292
+ Since additive is the first rule defined in the grammar, any match that results
293
+ from parsing a string with this grammar will have a value method that can be
294
+ used to recursively calculate the collective value of the entire match tree.
295
+
296
+ To give it a try, save the code for the Addition grammar in a file called
297
+ addition.citrus. Next, assuming you have the Citrus gem installed, try the
298
+ following sequence of commands in a terminal.
299
+
300
+ $ irb
301
+ > require 'citrus'
302
+ => true
303
+ > Citrus.load 'addition'
304
+ => [Addition]
305
+ > m = Addition.parse '1 + 2 + 3'
306
+ => #<Citrus::Match ...
307
+ > m.value
308
+ => 6
309
+
310
+ Congratulations! You just ran your first piece of Citrus code.
311
+
312
+ Take a look at examples/calc.citrus for an example of a calculator that is able
313
+ to parse and evaluate more complex mathematical expressions.
314
+
315
+
316
+ ** Links **
317
+
318
+
319
+ http://mjijackson.com/citrus
320
+ http://pdos.csail.mit.edu/~baford/packrat/
321
+ http://en.wikipedia.org/wiki/Parsing_expression_grammar
322
+ http://treetop.rubyforge.org/index.html
63
323
 
64
324
 
65
325
  ** License **
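Note on the README's new Repetition section: the N*M quantifier bounds can be illustrated in plain Ruby. This is a minimal sketch, not Citrus code. The `repeat` helper and the `INFINITY` constant are hypothetical names introduced here, and greedy matching from the start of the input is an assumption of the sketch.

```ruby
# Unbounded maximum, built the same way as the Infinity constant that
# appears in lib/citrus.rb further down in this diff.
INFINITY = 1.0 / 0

# Hypothetical helper: greedily match +literal+ (non-empty) at the start of
# +input+ between +min+ and +max+ times. Returns the repetition count, or
# nil when fewer than +min+ repetitions are found.
def repeat(input, literal, min = 0, max = INFINITY)
  count = 0
  pos = 0
  while count < max && input[pos, literal.length] == literal
    count += 1
    pos += literal.length
  end
  count >= min ? count : nil
end

p repeat('abcabcabc', 'abc', 1, 2)  # bounds of 'abc'1*2
p repeat('xyz', 'abc', 1)           # bounds of 'abc'1* (same as 'abc'+)
p repeat('xyz', 'abc', 0, 1)        # bounds of 'abc'*1 (same as 'abc'?)
```

An omitted N defaults to zero and an omitted M to no upper limit, which is why + and ? reduce to 1* and *1.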
data/citrus.gemspec CHANGED
@@ -1,7 +1,7 @@
1
1
  Gem::Specification.new do |s|
2
2
  s.name = 'citrus'
3
- s.version = '1.2.1'
4
- s.date = '2010-06-02'
3
+ s.version = '1.2.2'
4
+ s.date = '2010-06-09'
5
5
 
6
6
  s.summary = 'Parsing Expressions for Ruby'
7
7
  s.description = 'Parsing Expressions for Ruby'
@@ -14,6 +14,7 @@ Gem::Specification.new do |s|
14
14
  s.files = Dir['benchmark/*.rb'] +
15
15
  Dir['benchmark/*.citrus'] +
16
16
  Dir['benchmark/*.gnuplot'] +
17
+ Dir['doc/**/*'] +
17
18
  Dir['examples/**/*'] +
18
19
  Dir['extras/**/*'] +
19
20
  Dir['lib/**/*.rb'] +
@@ -29,5 +30,5 @@ Gem::Specification.new do |s|
29
30
  s.rdoc_options = %w< --line-numbers --inline-source --title Citrus --main Citrus >
30
31
  s.extra_rdoc_files = %w< README >
31
32
 
32
- s.homepage = 'http://github.com/mjijackson/citrus'
33
+ s.homepage = 'http://mjijackson.com/citrus'
33
34
  end
data/doc/background.rdoc ADDED
@@ -0,0 +1,72 @@
1
+ = Background
2
+
3
+ In order to be able to use Citrus effectively, you must first understand the
4
+ difference between syntax and semantics. Syntax is a set of rules that govern
5
+ the way letters and punctuation may be used in a language. For example, English
6
+ syntax dictates that proper nouns should start with a capital letter and that
7
+ sentences should end with a period.
8
+
9
+ Semantics are the rules by which meaning may be derived in a language. For
10
+ example, as you read a book you are able to make some sense of the particular
11
+ way in which words on a page are combined to form thoughts and express ideas
12
+ because you understand what the words themselves mean and you can understand
13
+ what they mean collectively.
14
+
15
+ Computers use a similar process when interpreting code. First, the code must be
16
+ parsed into recognizable symbols or tokens. These tokens may then be passed to
17
+ an interpreter which is responsible for forming actual instructions from them.
18
+
19
+ Citrus is a pure Ruby library that allows you to perform both lexical analysis
20
+ and semantic interpretation quickly and easily. Using Citrus you can write
21
+ powerful parsers that are simple to understand and easy to create and maintain.
22
+
23
+ In Citrus, there are three main types of objects: rules, grammars, and matches.
24
+
25
+ == Rules
26
+
27
+ A Rule[link:api/classes/Citrus/Rule.html] is an object that specifies some matching behavior on a string. There are
28
+ two types of rules: terminals and non-terminals. Terminals can be either Ruby
29
+ strings or regular expressions that specify some input to match. For example, a
30
+ terminal created from the string "end" would match any sequence of the
31
+ characters "e", "n", and "d", in that order. A terminal created from a regular
32
+ expression uses Ruby's regular expression engine to attempt to create a match.
33
+
34
+ Non-terminals are rules that may contain other rules but do not themselves match
35
+ directly on the input. For example, a Repeat is a non-terminal that may contain
36
+ one other rule that will try and match a certain number of times. Several other
37
+ types of non-terminals are available that will be discussed later.
38
+
39
+ Rule objects may also have semantic information associated with them in the form
40
+ of Ruby modules. These modules contain methods that will be used to extend any
41
+ match objects created by the rule with which they are associated.
42
+
43
+ == Grammars
44
+
45
+ A Grammar[link:api/classes/Citrus/Grammar.html] is a container for rules. Usually the rules in a grammar collectively
46
+ form a complete specification for some language, or a well-defined subset
47
+ thereof.
48
+
49
+ A Citrus grammar is really just a souped-up Ruby module. These modules may be
50
+ included in other grammar modules in the same way that Ruby modules are normally
51
+ used. This property allows you to divide a complex grammar into reusable pieces
52
+ that may be combined dynamically at runtime. Any grammar rule with the same name
53
+ as a rule in an included grammar may access that rule with a mechanism similar
54
+ to Ruby's super keyword.
55
+
56
+ == Matches
57
+
58
+ Matches are created by rule objects when they match on the input. A Match[link:api/classes/Citrus/Match.html]
59
+ contains the string of text that made up the match as well as its offset in the
60
+ original input string. During a parse, matches are arranged in a tree structure
61
+ where any match may contain any number of other matches. This structure is
62
+ determined by the way in which the rule that generated each match is used in the
63
+ grammar.
64
+
65
+ For example, a match that is created from a non-terminal rule that contains
66
+ several other terminals will likewise contain several matches, one for each
67
+ terminal.
68
+
69
+ Match objects may be extended with semantic information in the form of methods.
70
+ These methods can interpret the text of a match using the wealth of information
71
+ available to them including the text of the match, its position in the input,
72
+ and any submatches.
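Note on the text above: the idea of extending a match object with methods from a module can be shown with Ruby's `Object#extend` alone. This is a rough sketch under stated assumptions, not Citrus's implementation. `ToyMatch` and `NumberValue` are hypothetical names that do not appear in the library.

```ruby
# Hypothetical stand-in for a match: its text, its offset in the input,
# and any submatches.
ToyMatch = Struct.new(:text, :offset, :matches)

# Hypothetical semantic module, playing the role of the module that a
# rule associates with the matches it creates.
module NumberValue
  def value
    text.strip.to_i
  end
end

plain    = ToyMatch.new('7 ', 0, [])
extended = ToyMatch.new('42 ', 2, [])
extended.extend(NumberValue)  # only this one object gains #value

p extended.value
p plain.respond_to?(:value)
```

Because `extend` works per object, two matches of the same class can carry different semantic methods depending on which rule produced them.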
data/doc/example.rdoc ADDED
@@ -0,0 +1,128 @@
1
+ = Example
2
+
3
+ Below is an example of a simple grammar that is able to parse strings of
4
+ integers separated by any amount of white space and a <tt>+</tt> symbol.
5
+
6
+ grammar Addition
7
+ rule additive
8
+ number plus (additive | number)
9
+ end
10
+
11
+ rule number
12
+ [0-9]+ space
13
+ end
14
+
15
+ rule plus
16
+ '+' space
17
+ end
18
+
19
+ rule space
20
+ [ \t]*
21
+ end
22
+ end
23
+
24
+ Several things to note about the above example:
25
+
26
+ * Grammar and rule declarations end with the <tt>end</tt> keyword
27
+ * A Sequence of rules is created by separating expressions with a space
28
+ * Likewise, ordered choice is represented with a vertical bar
29
+ * Parentheses may be used to override the natural binding order
30
+ * Rules may refer to other rules in their own definitions simply by using the
31
+ other rule's name
32
+ * Any expression may be followed by a quantifier
33
+
34
+ == Interpretation
35
+
36
+ The grammar above is able to parse simple mathematical expressions such as "1+2"
37
+ and "1 + 2+3", but it does not have enough semantic information to be able to
38
+ actually interpret these expressions.
39
+
40
+ At this point, when the grammar parses a string it generates a tree of Match[link:api/classes/Citrus/Match.html]
41
+ objects. Each match is created by a rule. A match will know what text it
42
+ contains, its offset in the original input, and what submatches it contains.
43
+
44
+ Submatches are created whenever a rule contains another rule. For example, in
45
+ the grammar above the number rule matches a string of digits followed by white
46
+ space. Thus, a match generated by the number rule will contain two submatches.
47
+
48
+ We can use Ruby's block syntax to create a module that will be attached to these
49
+ matches when they are created and is used to lazily extend them when we want to
50
+ interpret them. The following example shows one way to do this.
51
+
52
+ grammar Addition
53
+ rule additive
54
+ (number plus term) {
55
+ def value
56
+ number.value + term.value
57
+ end
58
+ }
59
+ end
60
+
61
+ rule term
62
+ (additive | number) {
63
+ def value
64
+ first.value
65
+ end
66
+ }
67
+ end
68
+
69
+ rule number
70
+ ([0-9]+ space) {
71
+ def value
72
+ text.strip.to_i
73
+ end
74
+ }
75
+ end
76
+
77
+ rule plus
78
+ '+' space
79
+ end
80
+
81
+ rule space
82
+ [ \t]*
83
+ end
84
+ end
85
+
86
+ In this version of the grammar the additive rule has been refactored to use the
87
+ term rule. This makes it a little cleaner to define our semantic blocks. It's
88
+ easiest to explain what is going on here by starting with the lowest level
89
+ block, which is defined within the number rule.
90
+
91
+ The semantic block associated with the number rule defines one method, value.
92
+ This method will be present on all matches that result from this rule. Inside
93
+ this method, we can see that the value of a number match is determined to be
94
+ its text value, stripped of white space and converted to an integer.
95
+
96
+ Similarly, the block that is applied to term matches also defines a value
97
+ method. However, this method works a bit differently. Since a term matches an
98
+ additive or a number a term match will contain one submatch, the match that
99
+ resulted from either additive or number. The first method retrieves the first
100
+ submatch. So, the value of a term is determined to be the value of its first
101
+ submatch.
102
+
103
+ Finally, the additive rule also extends its matches with a value method. Here,
104
+ the value of an additive is determined to be the values of its number and term
105
+ matches added together using Ruby's addition operator.
106
+
107
+ Since additive is the first rule defined in the grammar, any match that results
108
+ from parsing a string with this grammar will have a value method that can be
109
+ used to recursively calculate the collective value of the entire match tree.
110
+
111
+ To give it a try, save the code for the Addition grammar in a file called
112
+ addition.citrus. Next, assuming you have the Citrus gem installed, try the
113
+ following sequence of commands in a terminal.
114
+
115
+ $ irb
116
+ > require 'citrus'
117
+ => true
118
+ > Citrus.load 'addition'
119
+ => [Addition]
120
+ > m = Addition.parse '1 + 2 + 3'
121
+ => #<Citrus::Match ...
122
+ > m.value
123
+ => 6
124
+
125
+ Congratulations! You just ran your first piece of Citrus code.
126
+
127
+ Take a look at examples/calc.citrus[http://github.com/mjijackson/citrus/blob/master/examples/calc.citrus] for an example of a calculator that is able
128
+ to parse and evaluate more complex mathematical expressions.
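Note on the example above: the way the value of "1 + 2 + 3" is computed recursively can be traced with a hand-written recursive descent parser. This sketch is plain Ruby built on the standard `strscan` library and is not Citrus code. The method names (`parse_additive`, `parse_term`, `parse_number`, `parse_plus`, `add`) are hypothetical and only mirror the rule names in the grammar.

```ruby
require 'strscan'

# number: [0-9]+ space
def parse_number(s)
  digits = s.scan(/[0-9]+/) or return nil
  s.scan(/[ \t]*/)  # trailing space, possibly empty
  digits.to_i
end

# plus: '+' space
def parse_plus(s)
  s.scan(/\+[ \t]*/)
end

# additive: number plus term
def parse_additive(s)
  start = s.pos
  left = parse_number(s)
  if left && parse_plus(s) && (right = parse_term(s))
    return left + right
  end
  s.pos = start  # backtrack so the next alternative starts clean
  nil
end

# term: additive | number (ordered choice, first alternative tried first)
def parse_term(s)
  parse_additive(s) || parse_number(s)
end

# Parse the whole input starting from the additive rule.
def add(input)
  s = StringScanner.new(input)
  value = parse_additive(s)
  value if s.eos?
end

p add('1 + 2 + 3')
```

Each method returns the value of what it matched, so the sum is built up as the calls return, much as the `value` methods in the grammar call down into their submatches.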
data/doc/index.rdoc ADDED
@@ -0,0 +1,15 @@
1
+ Citrus is a compact and powerful parsing library for Ruby[http://ruby-lang.org/] that combines the
2
+ elegance and expressiveness of the language with the simplicity and power of
3
+ parsing expressions.
4
+
5
+ = Installation
6
+
7
+ Via RubyGems[http://rubygems.org/]:
8
+
9
+ $ sudo gem install citrus
10
+
11
+ From a local copy:
12
+
13
+ $ git clone git://github.com/mjijackson/citrus.git
14
+ $ cd citrus
15
+ $ rake package && sudo rake install
data/doc/license.rdoc ADDED
@@ -0,0 +1,21 @@
1
+ = License
2
+
3
+ Copyright 2010 Michael Jackson
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/doc/links.rdoc ADDED
@@ -0,0 +1,18 @@
1
+ = Links
2
+
3
+ The primary resource for all things to do with parsing expressions can be found
4
+ at MIT.
5
+
6
+ http://pdos.csail.mit.edu/~baford/packrat
7
+
8
+ A useful summary of parsing expression grammars can be found on Wikipedia as
9
+ well.
10
+
11
+ http://en.wikipedia.org/wiki/Parsing_expression_grammar
12
+
13
+ Citrus draws inspiration from another Ruby library for writing parsing
14
+ expression grammars, Treetop. While Citrus' syntax is similar to that of
15
+ Treetop, it's not identical. The link is included here for those who may wish to
16
+ explore an alternative implementation.
17
+
18
+ http://treetop.rubyforge.org
data/doc/syntax.rdoc ADDED
@@ -0,0 +1,96 @@
1
+ = Syntax
2
+
3
+ The most straightforward way to compose a Citrus grammar is to use Citrus' own
4
+ custom grammar syntax. This syntax borrows heavily from Ruby, so it should
5
+ already be familiar to Ruby programmers.
6
+
7
+ == Terminals
8
+
9
+ Terminals may be represented by a string or a regular expression. Both follow
10
+ the same rules as Ruby string and regular expression literals.
11
+
12
+ 'abc'
13
+ "abc\n"
14
+ /\xFF/
15
+
16
+ Character classes and the dot (match anything) symbol are supported as well for
17
+ compatibility with other parsing expression implementations.
18
+
19
+ [a-z0-9] # match any lowercase letter or digit
20
+ [\x00-\xFF] # match any octet
21
+ . # match anything, even new lines
22
+
23
+ See FixedWidth[link:api/classes/Citrus/FixedWidth.html] and
24
+ Expression[link:api/classes/Citrus/Expression.html] for more information.
25
+
26
+ == Repetition
27
+
28
+ Quantifiers may be used after any expression to specify a number of times it
29
+ must match. The universal form of a quantifier is N*M where N is the minimum and
30
+ M is the maximum number of times the expression may match.
31
+
32
+ 'abc'1*2 # match "abc" a minimum of one, maximum
33
+ # of two times
34
+ 'abc'1* # match "abc" at least once
35
+ 'abc'*2 # match "abc" a maximum of twice
36
+
37
+ The + and ? operators are supported as well for the common cases of 1* and *1
38
+ respectively.
39
+
40
+ 'abc'+ # match "abc" at least once
41
+ 'abc'? # match "abc" a maximum of once
42
+
43
+ See Repeat[link:api/classes/Citrus/Repeat.html] for more information.
44
+
45
+ == Lookahead
46
+
47
+ Both positive and negative lookahead are supported in Citrus. Use the & and !
48
+ operators to indicate that an expression either should or should not match. In
49
+ neither case is any input consumed.
50
+
51
+ &'a' 'b' # match a "b" preceded by an "a"
52
+ !'a' 'b' # match a "b" that is not preceded by an "a"
53
+ !'a' . # match any character except for "a"
54
+
55
+ See AndPredicate[link:api/classes/Citrus/AndPredicate.html] and
56
+ NotPredicate[link:api/classes/Citrus/NotPredicate.html] for more information.
57
+
58
+ == Sequences
59
+
60
+ Sequences of expressions may be separated by a space to indicate that the rules
61
+ should match in that order.
62
+
63
+ 'a' 'b' 'c' # match "a", then "b", then "c"
64
+ 'a' [0-9] # match "a", then a numeric digit
65
+
66
+ See Sequence[link:api/classes/Citrus/Sequence.html] for more information.
67
+
68
+ == Choices
69
+
70
+ Ordered choice is indicated by a vertical bar that separates two expressions.
71
+ Note that any operator binds more tightly than the bar.
72
+
73
+ 'a' | 'b' # match "a" or "b"
74
+ 'a' 'b' | 'c' # match "a" then "b" (in sequence), or "c"
75
+
76
+ See Choice[link:api/classes/Citrus/Choice.html] for more information.
77
+
78
+ == Super
79
+
80
+ When including a grammar inside another, all rules in the child that have the
81
+ same name as a rule in the parent also have access to the super keyword to
82
+ invoke the parent rule.
83
+
84
+ See Super[link:api/classes/Citrus/Super.html] for more information.
85
+
86
+ == Labels
87
+
88
+ Match objects may be referred to by a different name than the rule that
89
+ originally generated them. Labels are created by placing the label and a colon
90
+ immediately preceding any expression.
91
+
92
+ chars:/[a-z]+/ # the characters matched by the regular
93
+ # expression may be referred to as "chars"
94
+ # in a block method
95
+
96
+ See Label[link:api/classes/Citrus/Label.html] for more information.
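Note on the Lookahead section above: the text says that neither predicate consumes input. Read literally, this means the expression after a predicate is tried at the same position where the predicate was tested. The sketch below uses Ruby's own regexp lookahead as a stand-in to show what that implies for the three examples. It is an assumption that this models the predicates faithfully. It is not Citrus code, and the variable names are hypothetical.

```ruby
# \A anchors each pattern at the start of the input, as a rule would be.
and_then_b   = /\A(?=a)b/   # stand-in for  &'a' 'b'
not_then_b   = /\A(?!a)b/   # stand-in for  !'a' 'b'
not_then_any = /\A(?!a)./m  # stand-in for  !'a' .   (/m lets . match "\n")

p and_then_b.match?('ab')
p and_then_b.match?('b')
p not_then_b.match?('b')
p not_then_any.match?('a')
p not_then_any.match?("\n")
```

Under these stand-in semantics the first pattern can never match, because a single position cannot hold both "a" and "b". The comment 'match a "b" preceded by an "a"' would instead describe the sequence 'a' 'b'. Whether Citrus 1.2.2 behaves the same way is not shown in this diff.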
data/lib/citrus.rb CHANGED
@@ -1,10 +1,10 @@
1
1
  # Citrus is a compact and powerful parsing library for Ruby that combines the
2
2
  # elegance and expressiveness of the language with the simplicity and power of
3
- # parsing expression grammars.
3
+ # parsing expressions.
4
4
  #
5
- # http://github.com/mjijackson/citrus
5
+ # http://mjijackson.com/citrus
6
6
  module Citrus
7
- VERSION = [1, 2, 1]
7
+ VERSION = [1, 2, 2]
8
8
 
9
9
  Infinity = 1.0 / 0
10
10
 
data/lib/citrus/debug.rb CHANGED
@@ -7,7 +7,7 @@ module Citrus
7
7
  # inspecting a nested match. The +xml+ argument may be a Hash of
8
8
  # Builder::XmlMarkup options.
9
9
  def to_markup(xml={})
10
- if xml.is_a?(Hash)
10
+ if Hash === xml
11
11
  opt = { :indent => 2 }.merge(xml)
12
12
  xml = Builder::XmlMarkup.new(opt)
13
13
  xml.instruct!
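Note on the hunk above: `Hash === xml` and `xml.is_a?(Hash)` give the same answer for ordinary objects. The sketch below shows the one case where they differ: `Module#===` does not call the argument's own `is_a?` method. The diff does not state why the change was made, and the `Impostor` class is a hypothetical name used only for illustration.

```ruby
opts = { :indent => 2 }

p Hash === opts          # class on the left, object on the right
p opts.is_a?(Hash)       # same check, asked of the object
p Hash === 'not a hash'

# An object that overrides is_a? can mislead the second form but not the first.
class Impostor
  def is_a?(klass)
    true
  end
end

fake = Impostor.new
p fake.is_a?(Hash)
p Hash === fake
```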
metadata CHANGED
@@ -5,8 +5,8 @@ version: !ruby/object:Gem::Version
5
5
  segments:
6
6
  - 1
7
7
  - 2
8
- - 1
9
- version: 1.2.1
8
+ - 2
9
+ version: 1.2.2
10
10
  platform: ruby
11
11
  authors:
12
12
  - Michael Jackson
@@ -14,7 +14,7 @@ autorequire:
14
14
  bindir: bin
15
15
  cert_chain: []
16
16
 
17
- date: 2010-06-02 00:00:00 -06:00
17
+ date: 2010-06-09 00:00:00 -06:00
18
18
  default_executable:
19
19
  dependencies:
20
20
  - !ruby/object:Gem::Dependency
@@ -53,6 +53,12 @@ files:
53
53
  - benchmark/seqpar.rb
54
54
  - benchmark/seqpar.citrus
55
55
  - benchmark/seqpar.gnuplot
56
+ - doc/background.rdoc
57
+ - doc/example.rdoc
58
+ - doc/index.rdoc
59
+ - doc/license.rdoc
60
+ - doc/links.rdoc
61
+ - doc/syntax.rdoc
56
62
  - examples/calc.citrus
57
63
  - examples/calc.rb
58
64
  - examples/calc_sugar.rb
@@ -83,7 +89,7 @@ files:
83
89
  - Rakefile
84
90
  - README
85
91
  has_rdoc: true
86
- homepage: http://github.com/mjijackson/citrus
92
+ homepage: http://mjijackson.com/citrus
87
93
  licenses: []
88
94
 
89
95
  post_install_message: