personify 1.0.0
- data/.gitignore +1 -0
- data/LICENSE +20 -0
- data/README.md +172 -0
- data/Rakefile +53 -0
- data/VERSION +1 -0
- data/doc/syntax_ideas.md +141 -0
- data/lib/personify/context.rb +55 -0
- data/lib/personify/parser/personify.rb +1071 -0
- data/lib/personify/parser/personify.treetop +107 -0
- data/lib/personify/parser/personify_node_classes.rb +121 -0
- data/lib/personify/template.rb +17 -0
- data/lib/personify.rb +8 -0
- data/script/generate_parser.rb +6 -0
- data/test/context_test.rb +122 -0
- data/test/fixtures/multiple_tags.txt +8 -0
- data/test/parse_runner.rb +60 -0
- data/test/parser_test.rb +291 -0
- data/test/test_helper.rb +16 -0
- data/vendor/treetop/.gitignore +5 -0
- data/vendor/treetop/History.txt +9 -0
- data/vendor/treetop/README +164 -0
- data/vendor/treetop/Rakefile +20 -0
- data/vendor/treetop/Treetop.tmbundle/Snippets/grammar ___ end.tmSnippet +20 -0
- data/vendor/treetop/Treetop.tmbundle/Snippets/rule ___ end.tmSnippet +18 -0
- data/vendor/treetop/Treetop.tmbundle/Syntaxes/Treetop Grammar.tmLanguage +251 -0
- data/vendor/treetop/Treetop.tmbundle/info.plist +10 -0
- data/vendor/treetop/bin/tt +28 -0
- data/vendor/treetop/doc/contributing_and_planned_features.markdown +103 -0
- data/vendor/treetop/doc/grammar_composition.markdown +65 -0
- data/vendor/treetop/doc/index.markdown +90 -0
- data/vendor/treetop/doc/pitfalls_and_advanced_techniques.markdown +51 -0
- data/vendor/treetop/doc/semantic_interpretation.markdown +189 -0
- data/vendor/treetop/doc/site.rb +110 -0
- data/vendor/treetop/doc/sitegen.rb +60 -0
- data/vendor/treetop/doc/syntactic_recognition.markdown +100 -0
- data/vendor/treetop/doc/using_in_ruby.markdown +21 -0
- data/vendor/treetop/examples/lambda_calculus/arithmetic.rb +551 -0
- data/vendor/treetop/examples/lambda_calculus/arithmetic.treetop +97 -0
- data/vendor/treetop/examples/lambda_calculus/arithmetic_node_classes.rb +7 -0
- data/vendor/treetop/examples/lambda_calculus/arithmetic_test.rb +54 -0
- data/vendor/treetop/examples/lambda_calculus/lambda_calculus +0 -0
- data/vendor/treetop/examples/lambda_calculus/lambda_calculus.rb +718 -0
- data/vendor/treetop/examples/lambda_calculus/lambda_calculus.treetop +132 -0
- data/vendor/treetop/examples/lambda_calculus/lambda_calculus_node_classes.rb +5 -0
- data/vendor/treetop/examples/lambda_calculus/lambda_calculus_test.rb +89 -0
- data/vendor/treetop/examples/lambda_calculus/test_helper.rb +18 -0
- data/vendor/treetop/lib/treetop/bootstrap_gen_1_metagrammar.rb +45 -0
- data/vendor/treetop/lib/treetop/compiler/grammar_compiler.rb +40 -0
- data/vendor/treetop/lib/treetop/compiler/lexical_address_space.rb +17 -0
- data/vendor/treetop/lib/treetop/compiler/metagrammar.rb +2955 -0
- data/vendor/treetop/lib/treetop/compiler/metagrammar.treetop +404 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/anything_symbol.rb +20 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/atomic_expression.rb +14 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/character_class.rb +22 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/choice.rb +31 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/declaration_sequence.rb +24 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/grammar.rb +28 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/inline_module.rb +27 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/nonterminal.rb +13 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/optional.rb +19 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/parenthesized_expression.rb +9 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/parsing_expression.rb +138 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/parsing_rule.rb +55 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/predicate.rb +45 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/repetition.rb +55 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/sequence.rb +68 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/terminal.rb +20 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/transient_prefix.rb +9 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes/treetop_file.rb +9 -0
- data/vendor/treetop/lib/treetop/compiler/node_classes.rb +19 -0
- data/vendor/treetop/lib/treetop/compiler/ruby_builder.rb +113 -0
- data/vendor/treetop/lib/treetop/compiler.rb +6 -0
- data/vendor/treetop/lib/treetop/ruby_extensions/string.rb +42 -0
- data/vendor/treetop/lib/treetop/ruby_extensions.rb +2 -0
- data/vendor/treetop/lib/treetop/runtime/compiled_parser.rb +95 -0
- data/vendor/treetop/lib/treetop/runtime/interval_skip_list/head_node.rb +15 -0
- data/vendor/treetop/lib/treetop/runtime/interval_skip_list/interval_skip_list.rb +200 -0
- data/vendor/treetop/lib/treetop/runtime/interval_skip_list/node.rb +164 -0
- data/vendor/treetop/lib/treetop/runtime/interval_skip_list.rb +4 -0
- data/vendor/treetop/lib/treetop/runtime/syntax_node.rb +72 -0
- data/vendor/treetop/lib/treetop/runtime/terminal_parse_failure.rb +16 -0
- data/vendor/treetop/lib/treetop/runtime/terminal_syntax_node.rb +17 -0
- data/vendor/treetop/lib/treetop/runtime.rb +5 -0
- data/vendor/treetop/lib/treetop/version.rb +9 -0
- data/vendor/treetop/lib/treetop.rb +11 -0
- data/vendor/treetop/script/generate_metagrammar.rb +14 -0
- data/vendor/treetop/script/svnadd +11 -0
- data/vendor/treetop/script/svnrm +11 -0
- data/vendor/treetop/spec/compiler/and_predicate_spec.rb +36 -0
- data/vendor/treetop/spec/compiler/anything_symbol_spec.rb +52 -0
- data/vendor/treetop/spec/compiler/character_class_spec.rb +188 -0
- data/vendor/treetop/spec/compiler/choice_spec.rb +80 -0
- data/vendor/treetop/spec/compiler/circular_compilation_spec.rb +28 -0
- data/vendor/treetop/spec/compiler/failure_propagation_functional_spec.rb +21 -0
- data/vendor/treetop/spec/compiler/grammar_compiler_spec.rb +84 -0
- data/vendor/treetop/spec/compiler/grammar_spec.rb +41 -0
- data/vendor/treetop/spec/compiler/nonterminal_symbol_spec.rb +40 -0
- data/vendor/treetop/spec/compiler/not_predicate_spec.rb +38 -0
- data/vendor/treetop/spec/compiler/one_or_more_spec.rb +35 -0
- data/vendor/treetop/spec/compiler/optional_spec.rb +37 -0
- data/vendor/treetop/spec/compiler/parenthesized_expression_spec.rb +19 -0
- data/vendor/treetop/spec/compiler/parsing_rule_spec.rb +32 -0
- data/vendor/treetop/spec/compiler/sequence_spec.rb +115 -0
- data/vendor/treetop/spec/compiler/terminal_spec.rb +81 -0
- data/vendor/treetop/spec/compiler/terminal_symbol_spec.rb +37 -0
- data/vendor/treetop/spec/compiler/test_grammar.treetop +7 -0
- data/vendor/treetop/spec/compiler/test_grammar.tt +7 -0
- data/vendor/treetop/spec/compiler/test_grammar_do.treetop +7 -0
- data/vendor/treetop/spec/compiler/zero_or_more_spec.rb +56 -0
- data/vendor/treetop/spec/composition/a.treetop +11 -0
- data/vendor/treetop/spec/composition/b.treetop +11 -0
- data/vendor/treetop/spec/composition/c.treetop +10 -0
- data/vendor/treetop/spec/composition/d.treetop +10 -0
- data/vendor/treetop/spec/composition/grammar_composition_spec.rb +26 -0
- data/vendor/treetop/spec/ruby_extensions/string_spec.rb +32 -0
- data/vendor/treetop/spec/runtime/compiled_parser_spec.rb +101 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/delete_spec.rb +147 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/expire_range_spec.rb +349 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/insert_and_delete_node.rb +385 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/insert_spec.rb +660 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/interval_skip_list_spec.graffle +6175 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/interval_skip_list_spec.rb +58 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/palindromic_fixture.rb +23 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/palindromic_fixture_spec.rb +164 -0
- data/vendor/treetop/spec/runtime/interval_skip_list/spec_helper.rb +84 -0
- data/vendor/treetop/spec/runtime/syntax_node_spec.rb +53 -0
- data/vendor/treetop/spec/spec_helper.rb +106 -0
- data/vendor/treetop/spec/spec_suite.rb +4 -0
- data/vendor/treetop/treetop.gemspec +18 -0
- metadata +196 -0
@@ -0,0 +1,189 @@
|
|
1
|
+
#Semantic Interpretation
|
2
|
+
Let's use the grammar below as an example. It describes parentheses wrapping a single character to an arbitrary depth.
|
3
|
+
|
4
|
+
grammar ParenLanguage
|
5
|
+
rule parenthesized_letter
|
6
|
+
'(' parenthesized_letter ')'
|
7
|
+
/
|
8
|
+
[a-z]
|
9
|
+
end
|
10
|
+
end
|
11
|
+
|
12
|
+
Matches:
|
13
|
+
|
14
|
+
* `'a'`
|
15
|
+
* `'(a)'`
|
16
|
+
* `'((a))'`
|
17
|
+
* etc.
|
18
|
+
|
19
|
+
|
20
|
+
Output from a parser for this grammar looks like this:
|
21
|
+
|
22
|
+
![Tree Returned By ParenLanguageParser](./images/paren_language_output.png)
|
23
|
+
|
24
|
+
This is a parse tree whose nodes are instances of `Treetop::Runtime::SyntaxNode`. What if we could define methods on these node objects? We would then have an object-oriented program whose structure corresponded to the structure of our language. Treetop provides two techniques for doing just this.
|
25
|
+
|
26
|
+
##Associating Methods with Node-Instantiating Expressions
|
27
|
+
Sequences and all types of terminals are node-instantiating expressions. When they match, they create instances of `Treetop::Runtime::SyntaxNode`. Methods can be added to these nodes in the following ways:
|
28
|
+
|
29
|
+
###Inline Method Definition
|
30
|
+
Methods can be added to the nodes instantiated by the successful match of an expression by following the expression with a block of method definitions:
|
31
|
+
|
32
|
+
grammar ParenLanguage
|
33
|
+
rule parenthesized_letter
|
34
|
+
'(' parenthesized_letter ')' {
|
35
|
+
def depth
|
36
|
+
parenthesized_letter.depth + 1
|
37
|
+
end
|
38
|
+
}
|
39
|
+
/
|
40
|
+
[a-z] {
|
41
|
+
def depth
|
42
|
+
0
|
43
|
+
end
|
44
|
+
}
|
45
|
+
end
|
46
|
+
end
|
47
|
+
|
48
|
+
Note that each alternative expression is followed by a block containing a method definition. A `depth` method is defined on both expressions. The recursive `depth` method defined in the block following the first expression determines the depth of the nested parentheses and adds one to it. The base case is implemented in the block following the second expression; a single character has a depth of 0.
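The recursion in the two `depth` methods can be traced in plain Ruby. The sketch below does not use Treetop; `LetterNode`, `NestedNode`, and `parse_paren` are hypothetical stand-ins for the nodes and parser that Treetop would generate, and they only mirror the two alternatives of the rule.

```ruby
# Plain-Ruby stand-in for the two alternatives of parenthesized_letter.
# These classes are illustrative assumptions, not Treetop's generated code.

# Base case: a single letter has depth 0.
LetterNode = Struct.new(:text_value) do
  def depth
    0
  end
end

# Recursive case: one pair of parentheses around an inner node.
NestedNode = Struct.new(:parenthesized_letter) do
  def depth
    parenthesized_letter.depth + 1
  end
end

# Minimal recursive-descent parser for the ParenLanguage grammar.
# Returns a node, or nil when the input does not match.
def parse_paren(input)
  if input =~ /\A[a-z]\z/
    LetterNode.new(input)
  elsif input.length >= 3 && input.start_with?('(') && input.end_with?(')')
    inner = parse_paren(input[1..-2])
    inner && NestedNode.new(inner)
  end
end

puts parse_paren('a').depth      # depth of a bare letter
puts parse_paren('((a))').depth  # depth of two nested pairs
```

Each `NestedNode` delegates to the node it wraps, so the depth is simply the number of wrappers above the letter.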
|
49
|
+
|
50
|
+
|
51
|
+
###Custom `SyntaxNode` Subclass Declarations
|
52
|
+
You can instruct the parser to instantiate a custom subclass of `Treetop::Runtime::SyntaxNode` for an expression by following it with the name of that class enclosed in angle brackets (`<>`). The above inline method definitions could have been moved out into a single class like so:
|
53
|
+
|
54
|
+
# in .treetop file
|
55
|
+
grammar ParenLanguage
|
56
|
+
rule parenthesized_letter
|
57
|
+
'(' parenthesized_letter ')' <ParenNode>
|
58
|
+
/
|
59
|
+
[a-z] <ParenNode>
|
60
|
+
end
|
61
|
+
end
|
62
|
+
|
63
|
+
# in separate .rb file
|
64
|
+
class ParenNode < Treetop::Runtime::SyntaxNode
|
65
|
+
def depth
|
66
|
+
if nonterminal?
|
67
|
+
parenthesized_letter.depth + 1
|
68
|
+
else
|
69
|
+
0
|
70
|
+
end
|
71
|
+
end
|
72
|
+
end
|
73
|
+
|
74
|
+
##Automatic Extension of Results
|
75
|
+
Nonterminal and ordered choice expressions do not instantiate new nodes, but rather pass through nodes that are instantiated by other expressions. They can extend the nodes they propagate with anonymous or declared modules, using constructs similar to those used with expressions that instantiate their own syntax nodes.
|
76
|
+
|
77
|
+
###Extending a Propagated Node with an Anonymous Module
|
78
|
+
rule parenthesized_letter
|
79
|
+
('(' parenthesized_letter ')' / [a-z]) {
|
80
|
+
def depth
|
81
|
+
if nonterminal?
|
82
|
+
parenthesized_letter.depth + 1
|
83
|
+
else
|
84
|
+
0
|
85
|
+
end
|
86
|
+
end
|
87
|
+
}
|
88
|
+
end
|
89
|
+
|
90
|
+
The parenthesized choice above can result in a node matching either of the two choices. That node will be extended with the methods defined in the subsequent block. Note that a choice must always be parenthesized to be associated with a following block.
|
91
|
+
|
92
|
+
###Extending a Propagated Node with a Declared Module
|
93
|
+
# in .treetop file
|
94
|
+
rule parenthesized_letter
|
95
|
+
('(' parenthesized_letter ')' / [a-z]) <ParenNode>
|
96
|
+
end
|
97
|
+
|
98
|
+
# in separate .rb file
|
99
|
+
module ParenNode
|
100
|
+
def depth
|
101
|
+
if nonterminal?
|
102
|
+
parenthesized_letter.depth + 1
|
103
|
+
else
|
104
|
+
0
|
105
|
+
end
|
106
|
+
end
|
107
|
+
end
|
108
|
+
|
109
|
+
Here the result is extended with the `ParenNode` module. Note that, unlike the previous example for node-instantiating expressions, the constant in the declaration must be a module because the result is extended with it.
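The module requirement follows from how Ruby's `Object#extend` works: it accepts modules only. A minimal sketch in plain Ruby, where `DepthMethods`, `NodeClass`, and the bare `Object` standing in for a propagated node are all hypothetical names chosen for illustration:

```ruby
# Object#extend mixes a module into a single, already-created object.
# The names below are illustrative assumptions, not part of Treetop.
module DepthMethods
  def depth
    0
  end
end

class NodeClass; end     # a class, which extend will reject

node = Object.new        # stand-in for a node created by another expression
node.extend(DepthMethods)
puts node.depth          # the object now responds to depth

begin
  Object.new.extend(NodeClass)
rescue TypeError => e
  puts "cannot extend with a class: #{e.message}"
end
```

Because the node already exists when a nonterminal or choice expression propagates it, its class cannot be changed; mixing in a module is the only way to add methods to it.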
|
110
|
+
|
111
|
+
##Automatically-Defined Element Accessor Methods
|
112
|
+
###Default Accessors
|
113
|
+
Nodes instantiated upon the matching of sequences have methods automatically defined for any nonterminals in the sequence.
|
114
|
+
|
115
|
+
rule abc
|
116
|
+
a b c {
|
117
|
+
def to_s
|
118
|
+
a.to_s + b.to_s + c.to_s
|
119
|
+
end
|
120
|
+
}
|
121
|
+
end
|
122
|
+
|
123
|
+
In the above code, the `to_s` method calls automatically-defined element accessors for the nodes returned by parsing nonterminals `a`, `b`, and `c`.
|
124
|
+
|
125
|
+
###Labels
|
126
|
+
Subexpressions can be given an explicit label to have an element accessor method defined for them. This is useful in cases of ambiguity between two references to the same nonterminal or when you need to access an unnamed subexpression.
|
127
|
+
|
128
|
+
rule labels
|
129
|
+
first_letter:[a-z] rest_letters:(', ' letter:[a-z])* {
|
130
|
+
def letters
|
131
|
+
[first_letter] + rest_letters.map do |comma_and_letter|
|
132
|
+
comma_and_letter.letter
|
133
|
+
end
|
134
|
+
end
|
135
|
+
}
|
136
|
+
end
|
137
|
+
|
138
|
+
The above grammar uses label-derived accessors to determine the letters in a comma-delimited list of letters. The labeled expressions _could_ have been extracted to their own rules, but if they aren't used elsewhere, labels still enable them to be referenced by a name within the expression's methods.
|
139
|
+
|
140
|
+
###Overriding Element Accessors
|
141
|
+
The module containing automatically defined element accessor methods is an ancestor of the module in which you define your own methods, meaning you can override them with access to the `super` keyword. Here's an example of how this fact can improve the readability of the example above.
|
142
|
+
|
143
|
+
rule labels
|
144
|
+
first_letter:[a-z] rest_letters:(', ' letter:[a-z])* {
|
145
|
+
def letters
|
146
|
+
[first_letter] + rest_letters
|
147
|
+
end
|
148
|
+
|
149
|
+
def rest_letters
|
150
|
+
super.map { |comma_and_letter| comma_and_letter.letter }
|
151
|
+
end
|
152
|
+
}
|
153
|
+
end
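Why `super` works here can be seen with ordinary Ruby modules. The following is a sketch under assumptions: `GeneratedAccessors` stands in for the module of automatically defined accessors, `LabelMethods` for the module built from the block, and `FakeNode` with hash elements for real syntax nodes. None of these names come from Treetop.

```ruby
# Stand-in for the module holding the automatically defined accessors.
module GeneratedAccessors
  def first_letter
    elements[0]
  end

  def rest_letters
    elements[1]
  end
end

# Stand-in for the module created from the block in the grammar.
module LabelMethods
  include GeneratedAccessors   # makes GeneratedAccessors an ancestor

  def letters
    [first_letter] + rest_letters
  end

  # Overrides the generated accessor and reaches the original via super.
  def rest_letters
    super.map { |comma_and_letter| comma_and_letter[:letter] }
  end
end

FakeNode = Struct.new(:elements)
node = FakeNode.new(['a', [{ letter: 'b' }, { letter: 'c' }]])
node.extend(LabelMethods)

p node.letters
p LabelMethods.ancestors
```

Method lookup visits `LabelMethods` before `GeneratedAccessors`, so the override runs first and `super` falls through to the generated accessor.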
|
154
|
+
|
155
|
+
|
156
|
+
##Methods Available on `Treetop::Runtime::SyntaxNode`
|
157
|
+
|
158
|
+
<table>
|
159
|
+
<tr>
|
160
|
+
<td>
|
161
|
+
<code>terminal?</code>
|
162
|
+
</td>
|
163
|
+
<td>
|
164
|
+
Was this node produced by the matching of a terminal symbol?
|
165
|
+
</td>
|
166
|
+
</tr>
|
167
|
+
<tr>
|
168
|
+
<td>
|
169
|
+
<code>nonterminal?</code>
|
170
|
+
</td>
|
171
|
+
<td>
|
172
|
+
Was this node produced by the matching of a nonterminal symbol?
|
173
|
+
</td>
|
174
|
+
</tr><tr>
|
175
|
+
<td>
|
176
|
+
<code>text_value</code>
|
177
|
+
</td>
|
178
|
+
<td>
|
179
|
+
The substring of the input represented by this node.
|
180
|
+
</td>
|
181
|
+
</tr><tr>
|
182
|
+
<td>
|
183
|
+
<code>elements</code>
|
184
|
+
</td>
|
185
|
+
<td>
|
186
|
+
Available only on nonterminal nodes, returns the nodes parsed by the elements of the matched sequence.
|
187
|
+
</td>
|
188
|
+
</tr>
|
189
|
+
</table>
|
@@ -0,0 +1,110 @@
|
|
1
|
+
require 'rubygems'
|
2
|
+
require 'erector'
|
3
|
+
require "#{File.dirname(__FILE__)}/sitegen"
|
4
|
+
|
5
|
+
class Layout < Erector::Widget
|
6
|
+
def render
|
7
|
+
html do
|
8
|
+
head do
|
9
|
+
link :rel => "stylesheet",
|
10
|
+
:type => "text/css",
|
11
|
+
:href => "./screen.css"
|
12
|
+
|
13
|
+
rawtext %(
|
14
|
+
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
|
15
|
+
</script>
|
16
|
+
<script type="text/javascript">
|
17
|
+
_uacct = "UA-3418876-1";
|
18
|
+
urchinTracker();
|
19
|
+
</script>
|
20
|
+
)
|
21
|
+
end
|
22
|
+
|
23
|
+
body do
|
24
|
+
div :id => 'top' do
|
25
|
+
div :id => 'main_navigation' do
|
26
|
+
main_navigation
|
27
|
+
end
|
28
|
+
end
|
29
|
+
div :id => 'middle' do
|
30
|
+
div :id => 'content' do
|
31
|
+
content
|
32
|
+
end
|
33
|
+
end
|
34
|
+
div :id => 'bottom' do
|
35
|
+
|
36
|
+
end
|
37
|
+
end
|
38
|
+
end
|
39
|
+
end
|
40
|
+
|
41
|
+
def main_navigation
|
42
|
+
ul do
|
43
|
+
li { link_to "Documentation", SyntacticRecognition, Documentation }
|
44
|
+
li { link_to "Contribute", Contribute }
|
45
|
+
li { link_to "Home", Index }
|
46
|
+
end
|
47
|
+
end
|
48
|
+
|
49
|
+
def content
|
50
|
+
end
|
51
|
+
end
|
52
|
+
|
53
|
+
class Index < Layout
|
54
|
+
def content
|
55
|
+
bluecloth "index.markdown"
|
56
|
+
end
|
57
|
+
end
|
58
|
+
|
59
|
+
class Documentation < Layout
|
60
|
+
abstract
|
61
|
+
|
62
|
+
def content
|
63
|
+
div :id => 'secondary_navigation' do
|
64
|
+
ul do
|
65
|
+
li { link_to 'Syntax', SyntacticRecognition }
|
66
|
+
li { link_to 'Semantics', SemanticInterpretation }
|
67
|
+
li { link_to 'Using In Ruby', UsingInRuby }
|
68
|
+
li { link_to 'Advanced Techniques', PitfallsAndAdvancedTechniques }
|
69
|
+
end
|
70
|
+
end
|
71
|
+
|
72
|
+
div :id => 'documentation_content' do
|
73
|
+
documentation_content
|
74
|
+
end
|
75
|
+
end
|
76
|
+
end
|
77
|
+
|
78
|
+
class SyntacticRecognition < Documentation
|
79
|
+
def documentation_content
|
80
|
+
bluecloth "syntactic_recognition.markdown"
|
81
|
+
end
|
82
|
+
end
|
83
|
+
|
84
|
+
class SemanticInterpretation < Documentation
|
85
|
+
def documentation_content
|
86
|
+
bluecloth "semantic_interpretation.markdown"
|
87
|
+
end
|
88
|
+
end
|
89
|
+
|
90
|
+
class UsingInRuby < Documentation
|
91
|
+
def documentation_content
|
92
|
+
bluecloth "using_in_ruby.markdown"
|
93
|
+
end
|
94
|
+
end
|
95
|
+
|
96
|
+
class PitfallsAndAdvancedTechniques < Documentation
|
97
|
+
def documentation_content
|
98
|
+
bluecloth "pitfalls_and_advanced_techniques.markdown"
|
99
|
+
end
|
100
|
+
end
|
101
|
+
|
102
|
+
|
103
|
+
class Contribute < Layout
|
104
|
+
def content
|
105
|
+
bluecloth "contributing_and_planned_features.markdown"
|
106
|
+
end
|
107
|
+
end
|
108
|
+
|
109
|
+
|
110
|
+
Layout.generate_site
|
@@ -0,0 +1,60 @@
|
|
1
|
+
class Layout < Erector::Widget
|
2
|
+
|
3
|
+
class << self
|
4
|
+
def inherited(page_class)
|
5
|
+
puts page_class
|
6
|
+
(@@page_classes ||= []) << page_class
|
7
|
+
end
|
8
|
+
|
9
|
+
def generate_site
|
10
|
+
@@page_classes.each do |page_class|
|
11
|
+
page_class.generate_html unless page_class.abstract?
|
12
|
+
puts page_class
|
13
|
+
end
|
14
|
+
end
|
15
|
+
|
16
|
+
def generate_html
|
17
|
+
File.open(absolute_path, 'w') do |file|
|
18
|
+
file.write(new.render)
|
19
|
+
end
|
20
|
+
end
|
21
|
+
|
22
|
+
def absolute_path
|
23
|
+
absolutize(relative_path)
|
24
|
+
end
|
25
|
+
|
26
|
+
def relative_path
|
27
|
+
"#{name.gsub('::', '_').underscore}.html"
|
28
|
+
end
|
29
|
+
|
30
|
+
def absolutize(relative_path)
|
31
|
+
File.join(File.dirname(__FILE__), "site", relative_path)
|
32
|
+
end
|
33
|
+
|
34
|
+
def abstract
|
35
|
+
@abstract = true
|
36
|
+
end
|
37
|
+
|
38
|
+
def abstract?
|
39
|
+
@abstract
|
40
|
+
end
|
41
|
+
end
|
42
|
+
|
43
|
+
def bluecloth(relative_path)
|
44
|
+
File.open(File.join(File.dirname(__FILE__), relative_path)) do |file|
|
45
|
+
rawtext BlueCloth.new(file.read).to_html
|
46
|
+
end
|
47
|
+
end
|
48
|
+
|
49
|
+
def absolutize(relative_path)
|
50
|
+
self.class.absolutize(relative_path)
|
51
|
+
end
|
52
|
+
|
53
|
+
def link_to(link_text, page_class, section_class=nil)
|
54
|
+
if instance_of?(page_class) || section_class && is_a?(section_class)
|
55
|
+
text link_text
|
56
|
+
else
|
57
|
+
a link_text, :href => page_class.relative_path
|
58
|
+
end
|
59
|
+
end
|
60
|
+
end
|
@@ -0,0 +1,100 @@
|
|
1
|
+
#Syntactic Recognition
|
2
|
+
Treetop grammars are written in a custom language based on parsing expression grammars. Literature on the subject of <a href="http://en.wikipedia.org/wiki/Parsing_expression_grammar">parsing expression grammars</a> is useful in writing Treetop grammars.
|
3
|
+
|
4
|
+
#Grammar Structure
|
5
|
+
Treetop grammars look like this:
|
6
|
+
|
7
|
+
grammar GrammarName
|
8
|
+
rule rule_name
|
9
|
+
...
|
10
|
+
end
|
11
|
+
|
12
|
+
rule rule_name
|
13
|
+
...
|
14
|
+
end
|
15
|
+
|
16
|
+
...
|
17
|
+
end
|
18
|
+
|
19
|
+
The main keywords are:
|
20
|
+
|
21
|
+
* `grammar` : This introduces a new grammar. It is followed by a constant name to which the grammar will be bound when it is loaded.
|
22
|
+
|
23
|
+
* `rule` : This defines a parsing rule within the grammar. It is followed by a name by which this rule can be referenced within other rules. It is then followed by a parsing expression defining the rule.
|
24
|
+
|
25
|
+
#Parsing Expressions
|
26
|
+
Each rule associates a name with a _parsing expression_. Parsing expressions are a generalization of vanilla regular expressions. Their key feature is the ability to reference other expressions in the grammar by name.
|
27
|
+
|
28
|
+
##Terminal Symbols
|
29
|
+
###Strings
|
30
|
+
Strings are surrounded by double or single quotes and must be matched exactly.
|
31
|
+
|
32
|
+
* `"foo"`
|
33
|
+
* `'foo'`
|
34
|
+
|
35
|
+
###Character Classes
|
36
|
+
Character classes are surrounded by brackets. Their semantics are identical to those used in Ruby's regular expressions.
|
37
|
+
|
38
|
+
* `[a-zA-Z]`
|
39
|
+
* `[0-9]`
|
40
|
+
|
41
|
+
###The Anything Symbol
|
42
|
+
The anything symbol is represented by a dot (`.`) and matches any single character.
|
43
|
+
|
44
|
+
##Nonterminal Symbols
|
45
|
+
Nonterminal symbols are unquoted references to other named rules. They are equivalent to an inline substitution of the named expression.
|
46
|
+
|
47
|
+
rule foo
|
48
|
+
"the dog " bar
|
49
|
+
end
|
50
|
+
|
51
|
+
rule bar
|
52
|
+
"jumped"
|
53
|
+
end
|
54
|
+
|
55
|
+
The above grammar is equivalent to:
|
56
|
+
|
57
|
+
rule foo
|
58
|
+
"the dog jumped"
|
59
|
+
end
|
60
|
+
|
61
|
+
##Ordered Choice
|
62
|
+
Parsers attempt to match ordered choices in left-to-right order, and stop after the first successful match.
|
63
|
+
|
64
|
+
"foobar" / "foo" / "bar"
|
65
|
+
|
66
|
+
Note that if `"foo"` in the above expression came first, `"foobar"` would never be matched.
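The effect of ordering can be reproduced with a few lines of plain Ruby. This is only a sketch of the first-match rule for string alternatives; `ordered_choice` is a hypothetical helper, not a Treetop method.

```ruby
# Try each alternative in order and return the first one that is a
# prefix of the input, or nil when none matches.
def ordered_choice(input, alternatives)
  alternatives.find { |alternative| input.start_with?(alternative) }
end

p ordered_choice('foobar', %w[foobar foo bar])  # longest alternative listed first
p ordered_choice('foobar', %w[foo foobar bar])  # "foo" wins, "foobar" is never tried
```

Listing longer alternatives before their prefixes is the usual way to avoid this shadowing.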
|
67
|
+
|
68
|
+
##Sequences
|
69
|
+
|
70
|
+
Sequences are a space-separated list of parsing expressions. They have higher precedence than choices, so choices must be parenthesized to be used as the elements of a sequence.
|
71
|
+
|
72
|
+
"foo" "bar" ("baz" / "bop")
|
73
|
+
|
74
|
+
##Zero or More
|
75
|
+
Parsers will greedily match an expression zero or more times if it is followed by the star (`*`) symbol.
|
76
|
+
|
77
|
+
* `'foo'*` matches the empty string, `"foo"`, `"foofoo"`, etc.
|
78
|
+
|
79
|
+
##One or More
|
80
|
+
Parsers will greedily match an expression one or more times if it is followed by the plus (`+`) symbol.
|
81
|
+
|
82
|
+
* `'foo'+` does not match the empty string, but matches `"foo"`, `"foofoo"`, etc.
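Greedy repetition for both `*` and `+` can be sketched in plain Ruby. The helper `repeat` and its `min:` parameter are assumptions made for illustration, and the sketch handles only non-empty string terminals.

```ruby
# Consume as many consecutive copies of `terminal` as possible, starting
# at `pos`. Returns the position after the last copy, or nil if fewer
# than `min` copies were found. Assumes `terminal` is not empty.
def repeat(input, terminal, pos = 0, min: 0)
  count = 0
  while input[pos, terminal.length] == terminal
    pos += terminal.length
    count += 1
  end
  count >= min ? pos : nil
end

p repeat('foofoobar', 'foo')      # zero or more: consumes both copies
p repeat('bar', 'foo')            # zero or more: succeeds without consuming
p repeat('bar', 'foo', min: 1)    # one or more: fails
```

The loop never gives a copy back once it has been consumed, which is what greedy means here.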
|
83
|
+
|
84
|
+
##Optional Expressions
|
85
|
+
An expression can be declared optional by following it with a question mark (`?`).
|
86
|
+
|
87
|
+
* `'foo'?` matches `"foo"` or the empty string.
|
88
|
+
|
89
|
+
##Lookahead Assertions
|
90
|
+
Lookahead assertions can be used to give parsing expressions a limited degree of context-sensitivity. The parser will look ahead into the buffer and attempt to match an expression without consuming input.
|
91
|
+
|
92
|
+
###Positive Lookahead Assertion
|
93
|
+
Preceding an expression with an ampersand (`&`) indicates that it must match, but no input will be consumed in the process of determining whether this is true.
|
94
|
+
|
95
|
+
|
96
|
+
|
97
|
+
###Negative Lookahead Assertion
|
98
|
+
Preceding an expression with a bang (`!`) indicates that the expression must not match, but no input will be consumed in the process of determining whether this is true.
|
99
|
+
|
100
|
+
* `"foo" !"bar"` matches `"foobaz"` but only consumes up to the end of `"foo"`. It will not match `"foobar"`.
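Ruby's own regular expressions have the same two assertions, written `(?=...)` and `(?!...)`, so the behavior described above can be checked without Treetop. This is an analogy only; the variable names are illustrative.

```ruby
# Regexp analogues of the two lookahead assertions.
positive = /\Afoo(?=bar)/   # like "foo" &"bar"
negative = /\Afoo(?!bar)/   # like "foo" !"bar"

p 'foobar'[positive]   # matches, but only "foo" is consumed
p 'foobaz'[positive]   # no match
p 'foobaz'[negative]   # matches, but only "foo" is consumed
p 'foobar'[negative]   # no match
```

In both successful cases the matched text stops at the end of `"foo"`; the assertion inspects what follows without including it.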
|
@@ -0,0 +1,21 @@
|
|
1
|
+
#Using Treetop Grammars in Ruby
|
2
|
+
##Using the Command Line Compiler
|
3
|
+
You can compile `.treetop` files into Ruby source code with the `tt` command line script. `tt` takes a list of files with a `.treetop` extension and compiles them into `.rb` files of the same name. You can then `require` these files like any other Ruby script. Alternatively, you can supply just one `.treetop` file and a `-o` flag to specify the name of the output file. Improvements to this compilation script are welcome.
|
4
|
+
|
5
|
+
tt foo.treetop bar.treetop
|
6
|
+
tt foo.treetop -o foogrammar.rb
|
7
|
+
|
8
|
+
##Loading A Grammar Directly
|
9
|
+
The Polyglot gem makes it possible to load `.treetop` or `.tt` files directly with `require`. This will invoke `Treetop.load`, which automatically compiles the grammar to Ruby and then evaluates the Ruby source. If you are getting errors in methods you define on the syntax tree, try using the command line compiler for better stack trace feedback. A better solution to this issue is in the works.
|
10
|
+
|
11
|
+
##Instantiating and Using Parsers
|
12
|
+
If a grammar by the name of `Foo` is defined, the compiled Ruby source will define a `FooParser` class. To parse input, create an instance and call its `parse` method with a string. The parser will return the syntax tree of the match or `nil` if there is a failure.
|
13
|
+
|
14
|
+
Treetop.load "arithmetic"
|
15
|
+
|
16
|
+
parser = ArithmeticParser.new
|
17
|
+
if parser.parse('1+1')
|
18
|
+
puts 'success'
|
19
|
+
else
|
20
|
+
puts 'failure'
|
21
|
+
end
|