ebnf 1.2.0 → 2.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: aae4c0b2a8f6d03654bc5436b36c90e577669c41ea4df25d881be3aef86e5ace
- data.tar.gz: 7f1b7fe448ae87f3755e6da4fc0faf47dbb3d3b573a3e4a8be9b3066f0fac6fa
+ metadata.gz: 06a44ff62e466914a2d6f1d80d4eacae30745c5c7823799af847db98cd27c813
+ data.tar.gz: fa433cd2719ec088cadfd987d69effa5490a3219166bd93406b1e73537956056
  SHA512:
- metadata.gz: 6d0f441731842af8ac38f98352b96a47cf6810bfe81ce620f8ef57af87ce6d65c679d88903c35dbce99617b24bdea8fbb70a23d4873d941ae151275a11561627
- data.tar.gz: f19f527e041a8f1c48da918497660466c5df71a02077e2183472b157a8611cbf87903fe2b6858e889c119211a7fedb279ad4889d811bec5d6132dd9287d74314
+ metadata.gz: 335fb057c452beb803c64fdfcbe87bc908e26323904a9641892987f819beb87edea7c3ca2f2bfa1aa911fcbf4b642c41b5cfdf5ba77568a8455283482fcaeb7e
+ data.tar.gz: f4dd5ab8c0b86c2cf2ca2b1edc4f5d93ce3c86a6b5bb527f43def66c33374090873be59ecadb60874bea62d5f80bfa76c230dc7468caf5e8af3557125d887b85
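A reader who wants to compare a downloaded archive against digests like those listed above could do so with Ruby's standard `digest` library. The sketch below is illustrative only: `sha256_matches?` is a hypothetical helper name, and the sample input is the string `abc` rather than the gem's actual `data.tar.gz`.

```ruby
require 'digest'

# Hypothetical helper (not part of RubyGems or the ebnf gem): returns true
# when the SHA-256 hex digest of +data+ equals +expected+, ignoring case.
def sha256_matches?(data, expected)
  Digest::SHA256.hexdigest(data) == expected.downcase
end

# Well-known SHA-256 test vector for the three-byte string "abc".
ABC_SHA256 = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

puts sha256_matches?("abc", ABC_SHA256)
```

For a file on disk, `Digest::SHA256.file(path).hexdigest` yields the digest to compare.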
data/README.md CHANGED
@@ -2,13 +2,15 @@
2
2
 
3
3
  [EBNF][] parser and generic parser generator.
4
4
 
5
- [![Gem Version](https://badge.fury.io/rb/ebnf.png)](http://badge.fury.io/rb/ebnf)
6
- [![Build Status](https://secure.travis-ci.org/dryruby/ebnf.png?branch=master)](http://travis-ci.org/dryruby/ebnf)
5
+ [![Gem Version](https://badge.fury.io/rb/ebnf.png)](https://badge.fury.io/rb/ebnf)
6
+ [![Build Status](https://secure.travis-ci.org/dryruby/ebnf.png?branch=master)](https://travis-ci.org/dryruby/ebnf)
7
7
  [![Coverage Status](https://coveralls.io/repos/dryruby/ebnf/badge.svg)](https://coveralls.io/r/dryruby/ebnf)
8
- [![Dependency Status](https://gemnasium.com/dryruby/ebnf.png)](https://gemnasium.com/dryruby/ebnf)
9
8
 
10
9
  ## Description
11
- This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator. It parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
10
+ This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator.
11
+
12
+ ### LL(1) Parser
13
+ In one mode, it parses [EBNF][] grammars to [BNF][] and generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
12
14
 
13
15
  As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the EBNF rules into BNF:
14
16
 
@@ -23,8 +25,13 @@ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match
23
25
 
24
26
  Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
25
27
 
28
+ See {EBNF::LL1} and {EBNF::LL1::Parser} for further information.
29
+
30
+ ### [PEG][]/[Packrat][] Parser
31
+ An additional Parsing Expression Grammar ([PEG][]) parser generator is also supported. This performs more minimal transformations on the parsed grammar to extract sub-productions, which allows each component of a rule to generate its own parsing event.
32
+
26
33
  ## Usage
27
- ### Parsing an LL(1) Grammar
34
+ ### Parsing an EBNF Grammar
28
35
 
29
36
  require 'ebnf'
30
37
 
@@ -37,199 +44,165 @@ Output rules and terminals as S-Expressions, Turtle, HTML or BNF
37
44
  puts ebnf.to_html
38
45
  puts ebnf.to_s
39
46
 
40
- Transform EBNF to BNF (generates sub-productions using `alt` or `seq` from `plus`, `star` or `opt`)
47
+ Transform EBNF to PEG (generates sub-rules for embedded expressions) and the RULES table as Ruby for parsing grammars:
48
+
49
+ ebnf.make_peg
50
+ ebnf.to_ruby
51
+
52
+ Transform EBNF to BNF (generates sub-rules using `alt` or `seq` from `plus`, `star` or `opt`)
41
53
 
42
54
  ebnf.make_bnf
43
55
 
44
- Generate [First/Follow][] rules for BNF grammars
56
+ Generate [First/Follow][] rules for BNF grammars (using "ebnf" as the starting production):
45
57
 
46
58
  ebnf.first_follow(:ebnf)
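To give a feel for what a First-set computation involves, here is a minimal, self-contained sketch of the textbook fixed-point algorithm. It does not use the gem or reflect its internals: `first_sets`, the hash-based grammar layout, and the use of `:_eps` as the empty-string marker are all assumptions made for illustration.

```ruby
require 'set'

# Hypothetical sketch: +grammar+ maps each non-terminal to a list of
# alternatives, each alternative being a sequence (Array) of symbols.
# Any symbol that is not a key of +grammar+ is treated as a terminal.
def first_sets(grammar)
  first = Hash.new { |hash, key| hash[key] = Set.new }
  changed = true
  while changed
    changed = false
    grammar.each do |lhs, alternatives|
      alternatives.each do |sequence|
        before = first[lhs].size
        nullable = true # true while every symbol seen so far can be empty
        sequence.each do |sym|
          if grammar.key?(sym)
            first[lhs].merge(first[sym] - [:_eps])
            nullable = first[sym].include?(:_eps)
          else
            first[lhs] << sym
            nullable = false
          end
          break unless nullable
        end
        first[lhs] << :_eps if nullable
        changed ||= first[lhs].size != before
      end
    end
  end
  first
end

# list ::= (empty) | item list ;  item ::= "a" | "b"
TOY_GRAMMAR = {
  list: [[], [:item, :list]],
  item: [["a"], ["b"]]
}.freeze

first_sets(TOY_GRAMMAR).each { |sym, set| puts "#{sym}: #{set.to_a.inspect}" }
```

The loop repeats until no set grows, which is why a right-recursive rule such as `list` settles after a few passes.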
47
59
 
48
- Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars
60
+ Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars:
49
61
 
50
62
  ebnf.build_tables
51
63
  ebnf.to_ruby
52
64
 
53
- Generate formatted grammar using HTML (requires [Haml][Haml] gem)
65
+ Generate formatted grammar using HTML (requires [Haml][Haml] gem):
54
66
 
55
67
  ebnf.to_html
56
68
 
57
- ### Parser S-Expressions
58
- Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule `[1] ebnf ::= (declaration | rule)*` is serialized as `(rule ebnf "1" (star (alt declaration rule)))`.
59
-
60
- Once the [LL(1)][] conversion is made, the [First/Follow][] table is generated, this rule expands as follows:
61
-
62
- (rule ebnf "1"
63
- (start #t)
64
- (first "@pass" "@terminals" LHS _eps)
65
- (follow _eof)
66
- (cleanup star)
67
- (alt _empty _ebnf_2))
68
- (rule _ebnf_1 "1.1"
69
- (first "@pass" "@terminals" LHS)
70
- (follow "@pass" "@terminals" LHS _eof)
71
- (alt declaration rule))
72
- (rule _ebnf_2 "1.2"
73
- (first "@pass" "@terminals" LHS)
74
- (follow _eof)
75
- (cleanup merge)
76
- (seq _ebnf_1 ebnf))
77
- (rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
78
-
79
- ### Creating terminal definitions and parser rules to parse generated grammars
80
- The parser is initialized to callbacks invoked on entry and exit
81
- to each `terminal` and `production`. A trivial parser loop can be described as follows:
82
-
83
- require 'ebnf/ll1/parser'
84
- require 'meta'
85
-
86
- class Parser
87
- include Meta
88
-
89
- terminal(:SYMBOL, /([a-z]|[A-Z]|[0-9]|_)+/) do |prod, token, input|
90
- # Add data based on scanned token to input
91
- input[:symbol] = token.value
92
- end
93
-
94
- start_production(:rule) do |input, current, callback|
95
- # Process on start of production
96
- # Set state for entry into recursed rules through current
97
-
98
- # Callback to parser loop with callback
99
- end
100
-
101
- production(:rule) do |input, current, callback|
102
- # Process on end of production
103
- # return results in input, retrieve results from recursed rules in current
104
-
105
- # Callback to parser loop with callback
106
- end
107
-
108
- def initialize(input)
109
- parser_options = {
110
- branch: BRANCH,
111
- first: FIRST,
112
- follow: FOLLOW,
113
- cleanup: CLEANUP
114
- }
115
- parse(input, start_symbol, parser_options) do |context, *data|
116
- # Process calls from callback from productions
117
-
118
- rescue ArgumentError, RDF::LL1::Parser::Error => e
119
- progress("Parsing completed with errors:\n\t#{e.message}")
120
- raise RDF::ReaderError, e.message if validate?
121
- end
122
-
123
- ### Branch Table
124
- The Branch table is a hash mapping production rules to a hash relating terminals appearing in input to sequence of productions to follow when the corresponding input terminal is found. This allows either the `seq` primitive, where all terminals map to the same sequence of productions, or the `alt` primitive, where each terminal may map to a different production.
125
-
126
- BRANCH = {
127
- :alt => {
128
- "(" => [:seq, :_alt_1],
129
- :ENUM => [:seq, :_alt_1],
130
- :HEX => [:seq, :_alt_1],
131
- :O_ENUM => [:seq, :_alt_1],
132
- :O_RANGE => [:seq, :_alt_1],
133
- :RANGE => [:seq, :_alt_1],
134
- :STRING1 => [:seq, :_alt_1],
135
- :STRING2 => [:seq, :_alt_1],
136
- :SYMBOL => [:seq, :_alt_1],
137
- },
138
- ...
139
- :declaration => {
140
- "@pass" => [:pass],
141
- "@terminals" => ["@terminals"],
142
- },
143
- ...
144
- }
145
-
146
- In this case the `alt` rule, `seq ('|' seq)*`, can match when any of the specified tokens appears on the input stream. They all cause the same token to be passed to the `seq` rule, followed by `_alt_1`, which handles the `('|' seq)*` portion of the rule after the first sequence is matched.
147
-
148
- The `declaration` rule is `@terminals' | pass` using the `alt` primitive determining the production to run based on the terminal appearing on the input stream. Eventually, a terminal production is found and the token is consumed.
149
-
150
- ### First/Follow Table
151
- The [First/Follow][] table is a hash mapping production rules to the terminals that may precede or follow the rule. For example:
152
-
153
- FIRST = {
154
- :alt => [
155
- :HEX,
156
- :SYMBOL,
157
- :ENUM,
158
- :O_ENUM,
159
- :RANGE,
160
- :O_RANGE,
161
- :STRING1,
162
- :STRING2,
163
- "("],
164
- ...
165
- }
166
-
167
- ### Terminals Table
168
- This table is a simple list of the terminal productions found in the grammar. For example:
169
-
170
- TERMINALS = ["(", ")", "-",
171
- "@pass", "@terminals",
172
- :ENUM, :HEX, :LHS, :O_ENUM, :O_RANGE,:POSTFIX,
173
- :RANGE, :STRING1, :STRING2, :SYMBOL,"|"
174
- ].freeze
175
-
176
- ### Cleanup Table
177
- This table identifies productions which used EBNF rules, which are transformed to BNF for actual parsing. This allows the parser, in some cases, to reproduce *star*, *plus*, and *opt* rule matches. For example:
178
-
179
- CLEANUP = {
180
- :_alt_1 => :star,
181
- :_alt_3 => :merge,
182
- :_diff_1 => :opt,
183
- :ebnf => :star,
184
- :_ebnf_2 => :merge,
185
- :_postfix_1 => :opt,
186
- :seq => :plus,
187
- :_seq_1 => :star,
188
- :_seq_2 => :merge,
189
- }.freeze
190
-
191
- In this case the `ebnf` rule was `(declaration | rule)*`. As BNF does not support a star operator, this is decomposed into a set of rules using `alt` and `seq` primitives:
192
-
193
- ebnf ::= _empty _ebnf_2
194
- _ebnf_1 ::= declaration | rule
195
- _ebnf_2 ::= _ebnf_1 ebnf
196
- _ebnf_3 ::= ebnf
197
-
198
- The `_empty` production matches an empty string, so allows for no value. `_ebnf_2` matches `declaration | rule` (using the `alt` primitive) followed by `ebnf`, creating a sequence of zero or more `declaration` or `rule` members.
69
+ ### Parser debugging
199
70
 
200
- ## EBNF Grammar
201
- The [EBNF][] variant used here is based on [W3C](http://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
202
- [XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions:
71
+ Inevitably while implementing a parser for some specific grammar, a developer will need greater insight into the operation of the parser. While this can involve sorting through a tremendous amount of data, the parser can be provided a [Logger][] instance which will output messages at varying levels of detail to document the state of the parser at any given point. Most useful is likely the `INFO` level of debugging, but even more detail is revealed using the `DEBUG` level. `WARN` and `ERROR` statements will typically also be provided as part of an exception if parsing fails, but can be shown in the context of other parsing state with appropriate indentation as part of the logger.
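As a rough illustration of the logging levels mentioned above, the following sketch builds a standard-library `Logger` at the `INFO` level and shows which messages are kept. It does not call the gem: how the logger is handed to the parser is not shown in this excerpt, so that step is deliberately left out, and the message texts are invented for the example.

```ruby
require 'logger'
require 'stringio'

# Capture log output in memory so the effect of the level is easy to see.
log_output = StringIO.new
logger = Logger.new(log_output)
logger.level = Logger::INFO # Logger::DEBUG would keep the debug line too

logger.debug("entering production") # dropped: below the INFO threshold
logger.info("matched terminal")     # kept
logger.warn("unexpected token")     # kept

puts log_output.string
```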
72
+
73
+ ### Parser errors
74
+ On a parsing failure, an exception is raised with information that may be useful in determining the source of the error.
203
75
 
204
- * Comments include `\\` and `#` through end of line (other than hex character) and `/* ... */ (* ... *) which may cross lines`
76
+ ## EBNF Grammar
77
+ The [EBNF][] variant used here is based on [W3C](https://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
78
+ [XML 1.0 recommendation](https://www.w3.org/TR/REC-xml/), with minor extensions:
79
+
80
+ The general form of a rule is:
81
+
82
+ symbol ::= expression
83
+
84
+ which can also be preceded by an optional number enclosed in square brackets to identify the rule number:
85
+
86
+ [1] symbol ::= expression
87
+
88
+ Symbols are written with an initial capital letter if they are the start symbol of a regular language (terminals), otherwise with an initial lowercase letter (non-terminals). Literal strings are quoted.
89
+
90
+ Within the expression on the right-hand side of a rule, the following expressions are used to match strings of one or more characters:
91
+
92
+ <table>
93
+ <tr><td><code>#xN</code></td>
94
+ <td>where <code>N</code> is a hexadecimal integer, the expression matches the character whose number (code point) in ISO/IEC 10646 is <code>N</code>. The number of leading zeros in the <code>#xN</code> form is insignificant.</td></tr>
95
+ <tr><td><code>[a-zA-Z], [#xN-#xN]</code></td>
96
+ <td>matches any Char with a value in the range(s) indicated (inclusive).</td></tr>
97
+ <tr><td><code>[abc], [#xN#xN#xN]</code></td>
98
+ <td>matches any Char with a value among the characters enumerated. Enumerations and ranges can be mixed in one set of brackets.</td></tr>
99
+ <tr><td><code>[^a-z], [^#xN-#xN]</code></td>
100
+ <td>matches any Char with a value outside the range indicated.</td></tr>
101
+ <tr><td><code>[^abc], [^#xN#xN#xN]</code></td>
102
+ <td>matches any Char with a value not among the characters given. Enumerations and ranges of forbidden values can be mixed in one set of brackets.</td></tr>
103
+ <tr><td><code>"string"</code></td>
104
+ <td>matches a literal string matching that given inside the double quotes.</td></tr>
105
+ <tr><td><code>'string'</code></td>
106
+ <td>matches a literal string matching that given inside the single quotes.</td></tr>
107
+ <tr><td><code>A (B | C)</code></td>
108
+ <td><code>(B | C)</code> is treated as a unit and may be combined as described in this list.</td></tr>
109
+ <tr><td><code>A?</code></td>
110
+ <td>matches A or nothing; optional A.</td></tr>
111
+ <tr><td><code>A B</code></td>
112
+ <td>matches <code>A</code> followed by <code>B</code>. This operator has higher precedence than alternation; thus <code>A B | C D</code> is identical to <code>(A B) | (C D)</code>.</td></tr>
113
+ <tr><td><code>A | B</code></td>
114
+ <td>matches <code>A</code> or <code>B</code>.</td></tr>
115
+ <tr><td><code>A - B</code></td>
116
+ <td>matches any string that matches <code>A</code> but does not match <code>B</code>.</td></tr>
117
+ <tr><td><code>A+</code></td>
118
+ <td>matches one or more occurrences of <code>A</code>. Concatenation has higher precedence than alternation; thus <code>A+ | B+</code> is identical to <code>(A+) | (B+)</code>.</td></tr>
119
+ <tr><td><code>A*</code></td>
120
+ <td>matches zero or more occurrences of <code>A</code>. Concatenation has higher precedence than alternation; thus <code>A* | B*</code> is identical to <code>(A*) | (B*)</code>.</td></tr>
121
+ <tr><td><code>@pass " "*</code></td>
122
+ <td>Defines consumed whitespace in the document. Any whitespace found between non-terminal rules is consumed and ignored.</td></tr>
123
+ <tr><td><code>@terminals</code></td>
124
+ <td>Introduces terminal rules. All rules defined after this point are treated as terminals.</td></tr>
125
+ </table>
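The `#xN` row of the table can be made concrete with a small sketch that turns such a token into the character it denotes. `hex_to_char` is a hypothetical helper written for this illustration, not a method of the gem.

```ruby
# Hypothetical helper: converts a "#xN" token (N hexadecimal) into the
# character with that code point. Leading zeros in N are insignificant.
def hex_to_char(token)
  match = /\A#x([0-9a-fA-F]+)\z/.match(token)
  raise ArgumentError, "not a hex character token: #{token.inspect}" unless match
  [match[1].hex].pack("U") # "U" packs an integer code point as UTF-8
end

puts hex_to_char("#x41")
puts hex_to_char("#x0041")
```

Both calls denote the same code point, which reflects the table's note about leading zeros.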
126
+
127
+ * Comments include `//` and `#` through end of line (other than hex character), and `/* ... */` and `(* ... *)`, which may cross lines.
205
128
  * All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
206
- * `@terminals` causes following rules to be treated as terminals. Any terminal which are entirely upper-case are also treated as terminals
129
+ * `@terminals` causes following rules to be treated as terminals. Any rule whose symbol is entirely upper-case (e.g. `TERMINAL`), or whose expression matches characters (`#xN`, `[a-z]`, `[^a-z]`, `[abc]`, `[^abc]`, `"string"`, `'string'`, or `A - B`), is also treated as a terminal.
207
130
  * `@pass` defines the expression used to detect whitespace, which is removed in processing.
208
131
  * No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
209
132
 
210
- Parsing this grammar yields an S-Expression version: {file:etc/ebnf.ll1.sxp}.
133
+ Parsing this grammar yields an S-Expression version: {file:etc/ebnf.sxp} (or [LL(1)][] version {file:etc/ebnf.ll1.sxp} or [PEG][] version {file:etc/ebnf.peg.sxp}).
134
+
135
+ ### Parser S-Expressions
136
+ Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule
137
+
138
+ [1] ebnf ::= (declaration | rule)*
139
+
140
+ is serialized as
141
+
142
+ (rule ebnf "1" (star (alt declaration rule)))
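The serialized form can be reproduced with a few lines of Ruby if a rule is modelled as nested arrays. This is only a sketch of the notation: `to_sxp_text` is a hypothetical helper, and representing operators and symbols as Ruby symbols and rule identifiers as strings is an assumption for the example, not the gem's API.

```ruby
# Hypothetical helper: renders nested arrays as a Lisp-like S-Expression.
# Arrays become parenthesised lists, symbols are written bare, and strings
# are written in double quotes.
def to_sxp_text(expr)
  case expr
  when Array  then "(" + expr.map { |item| to_sxp_text(item) }.join(" ") + ")"
  when Symbol then expr.to_s
  else expr.to_s.inspect
  end
end

RULE = [:rule, :ebnf, "1", [:star, [:alt, :declaration, :rule]]].freeze

puts to_sxp_text(RULE)
```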
143
+
144
+ Different components of an EBNF rule expression are transformed into their own operator:
145
+
146
+ <table>
147
+ <tr><td><code>#xN</code></td><td><code>(hex "#xN")</code></td></tr>
148
+ <tr><td><code>[a-z#xN-#xN]</code></td><td><code>(range "a-z#xN-#xN")</code></td></tr>
149
+ <tr><td><code>[abc#xN]</code></td><td><code>(range "abc#xN")</code></td></tr>
150
+ <tr><td><code>[^a-z#xN-#xN]</code></td><td><code>(range "^a-z#xN-#xN")</code></td></tr>
151
+ <tr><td><code>[^abc#xN]</code></td><td><code>(range "^abc#xN")</code></td></tr>
152
+ <tr><td><code>"string"</code></td><td><code>"string"</code></td></tr>
153
+ <tr><td><code>'string'</code></td><td><code>"string"</code></td></tr>
154
+ <tr><td><code>A (B | C)</code></td><td><code>(seq A (alt B C))</code></td></tr>
155
+ <tr><td><code>A?</code></td><td><code>(opt A)</code></td></tr>
156
+ <tr><td><code>A B</code></td><td><code>(seq A B)</code></td></tr>
157
+ <tr><td><code>A | B</code></td><td><code>(alt A B)</code></td></tr>
158
+ <tr><td><code>A - B</code></td><td><code>(diff A B)</code></td></tr>
159
+ <tr><td><code>A+</code></td><td><code>(plus A)</code></td></tr>
160
+ <tr><td><code>A*</code></td><td><code>(star A)</code></td></tr>
161
+ <tr><td><code>@pass " "*</code></td><td><code>(pass (star " "))</code></td></tr>
162
+ <tr><td><code>@terminals</code></td><td></td></tr>
163
+ </table>
164
+
165
+ Additionally, rules defined with an UPPERCASE symbol are treated as terminals.
166
+
167
+ For an [LL(1)][] parser generator, the {EBNF::BNF.make_bnf} method can be used to transform the EBNF rule into a BNF rule.
168
+
169
+ (rule ebnf "1" (alt _empty _ebnf_2))
170
+ (rule _ebnf_1 "1.1" (alt declaration rule))
171
+ (rule _ebnf_2 "1.2" (seq _ebnf_1 ebnf))
172
+ (rule _ebnf_3 "1.3" (seq ebnf))
173
+
174
+ This allows [First/Follow][] and other tables to be generated, which a parser then uses to parse examples of the associated grammar. For more, see {EBNF::LL1}.
175
+
176
+ For a [PEG][] parser generator, there is a simpler transformation that reduces rules containing sub-expressions (composed of `star`, `alt`, `seq` and similar expressions) and creates named rules, which allows appropriate callbacks and the naming of elements of the generated abstract syntax tree. The {EBNF::PEG.make_peg} method transforms the original rule into the following two rules:
177
+
178
+ (rule ebnf "1" (star _ebnf_1))
179
+ (rule _ebnf_1 "1.1" (alt declaration rule))
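The idea of lifting embedded expressions into named sub-rules can be sketched in plain Ruby. This is not the gem's `make_peg` implementation: `extract_sub_rules` is a hypothetical function, rules are modelled as nested arrays, and the naming scheme for levels deeper than the first is an assumption that merely follows the `_ebnf_1` / `"1.1"` pattern shown above.

```ruby
# Hypothetical sketch: replaces each nested sub-expression (an Array) of
# +expr+ with a generated symbol and emits a separate rule for it.
# Returns a list of [:rule, symbol, id, expression] entries.
def extract_sub_rules(sym, id, expr)
  sub_rules = []
  counter = 0
  operator, *operands = expr
  flattened = operands.map do |operand|
    next operand unless operand.is_a?(Array) # plain symbols/strings stay put
    counter += 1
    sub_sym = :"_#{sym}_#{counter}"
    sub_rules.concat(extract_sub_rules(sub_sym, "#{id}.#{counter}", operand))
    sub_sym
  end
  [[:rule, sym, id, [operator, *flattened]]] + sub_rules
end

PEG_RULES = extract_sub_rules(:ebnf, "1", [:star, [:alt, :declaration, :rule]])
PEG_RULES.each { |rule| p rule }
```

Because every sub-expression receives its own rule name, a parser built this way has a distinct hook for each component of the original rule.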
180
+
181
+ ## Example parsers
182
+ For a [PEG][] parser for a simple grammar implementing a calculator, see [Calc example](http://dryruby.github.io/ebnf/examples/calc/doc/calc.html).
183
+
184
+ For an example parser built using this gem that parses the [EBNF][] grammar, see [EBNF PEG Parser example](http://dryruby.github.io/ebnf/examples/ebnf-peg-parser/doc/parser.html). This example creates a parser for the [EBNF][] grammar which generates the same Abstract Syntax Tree as the built-in parser in the gem.
211
185
 
212
- ## Example parser
213
- For an example parser built using this gem, see {file:examples/ebnf-parser/README EBNF Parser example}. This example creates a parser for the [EBNF][] grammar which generates the same Abstract Syntax Tree as the built-in parser in the gem.
186
+ There is also an
187
+ [EBNF LL(1) Parser example](http://dryruby.github.io/ebnf/examples/ebnf-peg-parser/doc/parser.html).
214
188
 
215
189
  ## Acknowledgements
216
190
  Much of this work, particularly the generic parser, is inspired by work originally done by
217
- Tim Berners-Lee's Python [predictive parser](http://www.w3.org/2000/10/swap/grammar/predictiveParser.py).
191
+ Tim Berners-Lee's Python [predictive parser](https://www.w3.org/2000/10/swap/grammar/predictiveParser.py).
218
192
 
219
- The EBNF parser was inspired by Dan Connolly's
220
- [EBNF to Turtle processor](http://www.w3.org/2000/10/swap/grammar/ebnf2turtle.py),
221
- [EBNF to BNF Notation-3 rules](http://www.w3.org/2000/10/swap/grammar/ebnf2bnf.n3),
222
- and [First Follow Notation-3 rules](http://www.w3.org/2000/10/swap/grammar/first_follow.n3).
193
+ The [LL(1)][] parser was inspired by Dan Connolly's
194
+ [EBNF to Turtle processor](https://www.w3.org/2000/10/swap/grammar/ebnf2turtle.py),
195
+ [EBNF to BNF Notation-3 rules](https://www.w3.org/2000/10/swap/grammar/ebnf2bnf.n3),
196
+ and [First Follow Notation-3 rules](https://www.w3.org/2000/10/swap/grammar/first_follow.n3).
223
197
 
224
198
  ## Documentation
225
199
  Full documentation available on [Rubydoc.info][EBNF doc].
226
200
 
227
201
  ## Future Work
228
202
  * Better LL(1) parser tests
229
- * Either generate [Packrat parser][Packrat] for a [Parsing Regular Expression Grammar][PEG], or integrate with [Treetop][] or similar.
230
203
 
231
204
  ## Author
232
- * [Gregg Kellogg](http://github.com/gkellogg) - <http://greggkellogg.net/>
205
+ * [Gregg Kellogg](https://github.com/gkellogg) - <https://greggkellogg.net/>
233
206
 
234
207
  ## Contributing
235
208
  This repository uses [Git Flow](https://github.com/nvie/gitflow) to manage development and release activity. All submissions _must_ be on a feature branch based on the _develop_ branch to ease staging and integration.
@@ -248,22 +221,24 @@ This repository uses [Git Flow](https://github.com/nvie/gitflow) to mange develo
248
221
 
249
222
  ## License
250
223
  This is free and unencumbered public domain software. For more information,
251
- see <http://unlicense.org/> or the accompanying {file:UNLICENSE} file.
252
-
253
- A copy of the [Turtle EBNF][] and derived parser files are included in the repository, which are not covered under the UNLICENSE. These files are covered via the [W3C Document License](http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231).
254
-
255
- [Ruby]: http://ruby-lang.org/
256
- [YARD]: http://yardoc.org/
257
- [YARD-GS]: http://rubydoc.info/docs/yard/file/docs/GettingStarted.md
258
- [PDD]: http://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0013.html
259
- [EBNF]: http://www.w3.org/TR/REC-xml/#sec-notation
260
- [EBNF doc]: http://rubydoc.info/github/dryruby/ebnf/master/frames
261
- [First/Follow]: http://en.wikipedia.org/wiki/LL_parser#Constructing_an_LL.281.29_parsing_table
262
- [LL(1)]: http://www.csd.uwo.ca/~moreno//CS447/Lectures/Syntax.html/node14.html
263
- [LL(1) Parser]: http://en.wikipedia.org/wiki/LL_parser
264
- [Tokenizer]: http://en.wikipedia.org/wiki/Lexical_analysis#Tokenizer
265
- [Turtle EBNF]: http://dvcs.w3.org/hg/rdf/file/default/rdf-turtle/turtle.bnf
266
- [Packrat]: http://pdos.csail.mit.edu/~baford/packrat/thesis/
267
- [PEG]: http://en.wikipedia.org/wiki/Parsing_expression_grammar
268
- [Treetop]: http://rubygems.org/gems/treetop
269
- [Haml]: http://rubygems.org/gems/haml
224
+ see <https://unlicense.org/> or the accompanying {file:UNLICENSE} file.
225
+
226
+ A copy of the [Turtle EBNF][] and derived parser files are included in the repository, which are not covered under the UNLICENSE. These files are covered via the [W3C Document License](https://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231).
227
+
228
+ [Ruby]: https://ruby-lang.org/
229
+ [YARD]: https://yardoc.org/
230
+ [YARD-GS]: https://rubydoc.info/docs/yard/file/docs/GettingStarted.md
231
+ [PDD]: https://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0013.html
232
+ [BNF]: https://en.wikipedia.org/wiki/Backus–Naur_form
233
+ [EBNF]: https://www.w3.org/TR/REC-xml/#sec-notation
234
+ [EBNF doc]: https://rubydoc.info/github/dryruby/ebnf
235
+ [First/Follow]: https://en.wikipedia.org/wiki/LL_parser#Constructing_an_LL.281.29_parsing_table
236
+ [LL(1)]: https://www.csd.uwo.ca/~moreno//CS447/Lectures/Syntax.html/node14.html
237
+ [LL(1) Parser]: https://en.wikipedia.org/wiki/LL_parser
238
+ [Logger]: https://ruby-doc.org/stdlib-2.4.0/libdoc/logger/rdoc/Logger.html
239
+ [Tokenizer]: https://en.wikipedia.org/wiki/Lexical_analysis#Tokenizer
240
+ [Turtle EBNF]: https://dvcs.w3.org/hg/rdf/file/default/rdf-turtle/turtle.bnf
241
+ [Packrat]: https://pdos.csail.mit.edu/~baford/packrat/thesis/
242
+ [PEG]: https://en.wikipedia.org/wiki/Parsing_expression_grammar
243
+ [Treetop]: https://rubygems.org/gems/treetop
244
+ [Haml]: https://rubygems.org/gems/haml
data/UNLICENSE CHANGED
@@ -21,4 +21,4 @@ OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
  ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  OTHER DEALINGS IN THE SOFTWARE.
 
- For more information, please refer to <http://unlicense.org/>
+ For more information, please refer to <https://unlicense.org/>
data/VERSION CHANGED
@@ -1 +1 @@
- 1.2.0
+ 2.0.0
data/bin/ebnf CHANGED
@@ -28,6 +28,7 @@ OPT_ARGS = [
  ["--input-format", GetoptLong::REQUIRED_ARGUMENT,"Specify input format one of ebnf or sxp"],
  ["--mod-name", GetoptLong::REQUIRED_ARGUMENT,"Module name used when creating ruby tables"],
  ["--output", "-o", GetoptLong::REQUIRED_ARGUMENT,"Output to the specified file path"],
+ ["--peg", GetoptLong::NO_ARGUMENT, "Transform EBNF to PEG"],
  ["--prefix", "-p", GetoptLong::REQUIRED_ARGUMENT,"Prefix to use when generating Turtle"],
  ["--progress", "-v", GetoptLong::NO_ARGUMENT, "Detail on execution"],
  ["--namespace", "-n", GetoptLong::REQUIRED_ARGUMENT,"Namespace to use when generating Turtle"],
@@ -58,9 +59,10 @@ opts.each do |opt, arg|
  when '--evaluate' then input = arg
  when '--input-format' then options[:format] = arg.to_sym
  when '--format' then options[:output_format] = arg.to_sym
- when '--ll1' then (options[:ll1] ||= []) <<arg.to_sym
+ when '--ll1' then (options[:ll1] ||= []) << arg.to_sym
  when '--mod-name' then options[:mod_name] = arg
  when '--output' then out = File.open(arg, "w")
+ when '--peg' then options[:peg] = true
  when '--prefix' then options[:prefix] = arg
  when '--namespace' then options[:namespace] = arg
  when '--progress' then options[:progress] = true
@@ -68,8 +70,8 @@ opts.each do |opt, arg|
  end
  end
 
- if options[:output_format] == :rb && !options[:ll1]
- STDERR.puts "outputing in .rb format requires -ll"
+ if options[:output_format] == :rb && !(options[:ll1] || options[:peg])
+ STDERR.puts "outputting in .rb format requires --ll1 or --peg"
  exit(1)
  end
 
@@ -77,6 +79,7 @@ input = File.open(ARGV[0]) if ARGV[0]
 
  ebnf = EBNF.parse(input || STDIN, **options)
  ebnf.make_bnf if options[:bnf] || options[:ll1]
+ ebnf.make_peg if options[:peg]
  if options[:ll1]
  ebnf.first_follow(*options[:ll1])
  ebnf.build_tables