treetop 1.0.1 → 1.0.2

Files changed (5)
  1. data/README +117 -2
  2. data/Rakefile +2 -2
  3. data/lib/treetop.rb +2 -2
  4. data/test/test_helper.rb +3 -2
  5. metadata +4 -4
data/README CHANGED
@@ -1,3 +1,118 @@
1
- = Treetop
1
+ Tutorial
2
+ ========
3
+ Languages can be split into two components, their *syntax* and their *semantics*. It's your understanding of English syntax that tells you the stream of words "Sleep furiously green ideas colorless" is not a valid sentence. Semantics is deeper. Even if we rearrange the above sentence to be "Colorless green ideas sleep furiously", which is syntactically correct, it remains nonsensical on a semantic level. With Treetop, you'll be dealing with languages that are much simpler than English, but these basic concepts apply. Your programs will need to address both the syntax and the semantics of the languages they interpret.
4
+
5
+ Treetop equips you with powerful tools for each of these two aspects of interpreter writing. You'll describe the syntax of your language with a *parsing expression grammar*. From this description, Treetop will generate a Ruby parser that transforms streams of characters written in your language into *abstract syntax trees* representing their structure. You'll then describe the semantics of your language in Ruby by defining methods on the syntax trees the parser generates.
6
+
7
+ Parsing Expression Grammars, The Basics
8
+ =======================================
9
+ The first step in using Treetop is defining a grammar in a file with the `.treetop` extension. Here's a grammar that's useless because it's empty:
10
+
11
+ # my_grammar.treetop
12
+ grammar MyGrammar
13
+ end
14
+
15
+ Next, you start filling your grammar with rules. Each rule associates a name with a parsing expression, like the following:
16
+
17
+ # my_grammar.treetop
18
+ grammar MyGrammar
19
+ rule hello
20
+ 'hello chomsky'
21
+ end
22
+ end
23
+
24
+ The first rule becomes the *root* of the grammar, causing its expression to be matched when a parser for the grammar is fed a string. The above grammar can now be used in a Ruby program. Notice how a string matching the first rule parses successfully, but a second nonmatching string does not.
25
+
26
+ # use_grammar.rb
27
+ require 'rubygems'
28
+ require 'treetop'
29
+ load_grammar 'my_grammar'
30
+
31
+ parser = MyGrammarParser.new
32
+ puts parser.parse('hello chomsky').success? # => true
33
+ puts parser.parse('silly generativists!').success? # => false
34
+
35
+ Users of *regular expressions* will find parsing expressions familiar. They share the same basic purpose, matching strings against patterns. However, parsing expressions can recognize a broader category of languages than their less expressive brethren. Before we get into demonstrating that, let's cover some basics. At first, parsing expressions won't seem much different. Trust that they are.
36
+
37
+ Terminal Symbols
38
+ ----------------
39
+ The expression in the grammar above is a terminal symbol. It will only match a string that matches it exactly. There are two other kinds of terminal symbols, which we'll revisit later. Terminals are called *atomic expressions* because they aren't composed of smaller expressions.
40
+
41
+ Ordered Choices
42
+ ---------------
43
+ Ordered choices are *composite expressions*, which allow for any of several subexpressions to be matched. These should be familiar from regular expressions, but in parsing expressions, they are delimited by the `/` character. It's important to note that the choices are prioritized in the order they appear. If an earlier expression is matched, no subsequent expressions are tried. Here's an example:
44
+
45
+ # my_grammar.treetop
46
+ grammar MyGrammar
47
+ rule hello
48
+ 'hello chomsky' / 'hello lambek'
49
+ end
50
+ end
51
+
52
+ # fragment of use_grammar.rb
53
+ puts parser.parse('hello chomsky').success? # => true
54
+ puts parser.parse('hello lambek').success? # => true
55
+ puts parser.parse('silly generativists!').success? # => false
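Editorial sketch, not part of the gem: the prioritized behavior of ordered choice can be imitated in plain Ruby without Treetop. The helper name `ordered_choice` is hypothetical and only compares literal prefixes. It is meant to show that the first alternative that matches wins, even when a later alternative would consume more of the input.

```ruby
# Hypothetical helper, not part of Treetop: tries literal alternatives in
# order and returns the first one that is a prefix of the input, or nil.
def ordered_choice(input, alternatives)
  alternatives.find { |alternative| input.start_with?(alternative) }
end

# The shorter alternative is listed first, so it wins.
first = ordered_choice('hello chomsky', ['hello', 'hello chomsky'])
puts first.inspect # prints "hello"

# No alternative matches, so the choice as a whole fails.
none = ordered_choice('silly generativists!', ['hello chomsky', 'hello lambek'])
puts none.inspect # prints nil
```

Because of this prioritization, it is usually safer to list longer or more specific alternatives before shorter ones that share a prefix.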
56
+
57
+ Sequences
58
+ ---------
59
+ Sequences are composed of other parsing expressions separated by spaces. Using sequences, we can tighten up the above grammar.
60
+
61
+ # my_grammar.treetop
62
+ grammar MyGrammar
63
+ rule hello
64
+ 'hello ' ('chomsky' / 'lambek')
65
+ end
66
+ end
67
+
68
+ Note the use of parentheses to override the default precedence rules, which bind sequences more tightly than choices.
69
+
70
+ Nonterminal Symbols
71
+ -------------------
72
+ Here we leave regular expressions behind. Nonterminals allow expressions to refer to other expressions by name. A trivial use of this facility would allow us to make the above grammar more readable should the list of names grow longer.
73
+
74
+ # my_grammar.treetop
75
+ grammar MyGrammar
76
+ rule hello
77
+ 'hello ' linguist
78
+ end
79
+
80
+ rule linguist
81
+ 'chomsky' / 'lambek' / 'jacobsen' / 'frege'
82
+ end
83
+ end
84
+
85
+ The true power of this facility, however, is unleashed when writing *recursive expressions*. Here is a self-referential expression that can match any number of open parentheses followed by the same number of closing parentheses. Regular expressions in the strict theoretical sense cannot do this, as the *pumping lemma* for regular languages shows.
86
+
87
+ # parentheses.treetop
88
+ grammar Parentheses
89
+ rule parens
90
+ '(' parens ')' / ''
91
+ end
92
+ end
93
+
94
+
95
+ The `parens` expression simply states that a `parens` is a set of parentheses surrounding another `parens` expression or, if that doesn't match, the empty string. If you are uncomfortable with recursion, it's time to get comfortable, because it is the basis of language. Here's a tip: Don't try to imagine the parser circling round and round through the same rule. Instead, imagine the rule is *already* defined while you are defining it. If you imagine that `parens` already matches a string of matching parentheses, then it's easy to think of `parens` as an opening and a closing parenthesis around another set of matching parentheses, which, conveniently, you happen to be defining. You know that `parens` is supposed to represent a string of matched parentheses, so trust in that meaning, even if you haven't fully implemented it yet.
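Editorial sketch, not part of the gem: the recursive reading of the `parens` rule can be made concrete with a hand-written matcher in plain Ruby. This is not the code Treetop generates. The method names `match_parens` and `balanced?` are hypothetical, and the sketch only mirrors the rule's two alternatives: a parenthesized `parens`, or else the empty string.

```ruby
# Hand-written analogue of the rule:  '(' parens ')' / ''
# Returns the index just past whatever the rule consumes starting at pos.
def match_parens(input, pos = 0)
  if input[pos] == '('
    # Trust that the recursive call already matches a nested parens.
    after_inner = match_parens(input, pos + 1)
    return after_inner + 1 if input[after_inner] == ')'
  end
  pos # second alternative: the empty string, which consumes nothing
end

# A parse succeeds only if the rule consumes the entire input.
def balanced?(input)
  match_parens(input) == input.length
end

puts balanced?('(())') # prints true
puts balanced?('(()')  # prints false
puts balanced?('')     # prints true
```

Note that `'()()'` is rejected by this sketch, just as it would be by the grammar as written, because the rule describes nesting only, not repetition.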
96
+
97
+
98
+ Features to cover in the talk
99
+ =============================
100
+
101
+ * Treetop files
102
+ * Grammar definition
103
+ * Rules
104
+ * Loading a grammar
105
+ * Compiling a grammar with the `tt` command
106
+ * Accessing a parser for the grammar from Ruby
107
+ * Parsing Expressions of all kinds
108
+ ? Left recursion and factorization
109
+ - Here I can talk about function application, discussing how the operator
110
+ could be an arbitrary expression
111
+ * Inline node class eval blocks
112
+ * Node class declarations
113
+ * Labels
114
+ * Use of super within labels
115
+ * Grammar composition with include
116
+ * Use of super with grammar composition
117
+
2
118
 
3
- To compile a treetop grammar file into Ruby, run `tt` on the file. See the metagrammar for an example of how to write a grammar. More examples soon!
data/Rakefile CHANGED
@@ -15,7 +15,7 @@ end
15
15
 
16
16
  gemspec = Gem::Specification.new do |s|
17
17
  s.name = "treetop"
18
- s.version = "1.0.1"
18
+ s.version = "1.0.2"
19
19
  s.author = "Nathan Sobo"
20
20
  s.email = "nathansobo@gmail.com"
21
21
  s.homepage = "http://functionalform.blogspot.com"
@@ -27,7 +27,7 @@ gemspec = Gem::Specification.new do |s|
27
27
  s.require_path = "lib"
28
28
  s.autorequire = "treetop"
29
29
  s.has_rdoc = false
30
- s.add_dependency "facets"
30
+ s.add_dependency "facets", ">=2.0.2"
31
31
  end
32
32
 
33
33
  Rake::GemPackageTask.new(gemspec) do |pkg|
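Editorial sketch, not part of the gem: the effect of tightening the `facets` dependency from the implicit `> 0.0.0` to `>= 2.0.2` can be checked with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The candidate version strings below are illustrative assumptions, not a list of actual facets releases.

```ruby
require 'rubygems'

# Requirement recorded in the 1.0.1 metadata versus the 1.0.2 metadata.
old_requirement = Gem::Requirement.new('> 0.0.0')
new_requirement = Gem::Requirement.new('>= 2.0.2')

# Illustrative candidates only; not necessarily real facets releases.
['1.9.0', '2.0.2', '2.1.0'].each do |candidate|
  version = Gem::Version.new(candidate)
  old_ok = old_requirement.satisfied_by?(version)
  new_ok = new_requirement.satisfied_by?(version)
  puts "#{candidate}: old=#{old_ok} new=#{new_ok}"
end
```

Under the new constraint, a version below 2.0.2 no longer satisfies the dependency, which is consistent with the switch to the `facets/...` require paths in `lib/treetop.rb`.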
data/lib/treetop.rb CHANGED
@@ -1,6 +1,6 @@
1
1
  require 'rubygems'
2
- require 'facet/string/tab'
3
- require 'facet/string/camelize'
2
+ require 'facets/string/tabs'
3
+ require 'facets/stylize'
4
4
 
5
5
  dir = File.dirname(__FILE__)
6
6
 
data/test/test_helper.rb CHANGED
@@ -1,7 +1,8 @@
1
+ require 'rubygems'
1
2
  dir = File.dirname(__FILE__)
2
- $:.unshift(File.join(dir, *%w[.. lib]))
3
+ $:.unshift(File.expand_path(File.join(dir, '..', 'lib')))
3
4
  require File.expand_path(File.join(dir, 'screw', 'unit'))
4
- require 'treetop'
5
+ gem_original_require 'treetop'
5
6
 
6
7
  include Treetop
7
8
 
metadata CHANGED
@@ -3,8 +3,8 @@ rubygems_version: 0.9.2
3
3
  specification_version: 1
4
4
  name: treetop
5
5
  version: !ruby/object:Gem::Version
6
- version: 1.0.1
7
- date: 2007-09-14 00:00:00 -07:00
6
+ version: 1.0.2
7
+ date: 2007-10-22 00:00:00 -07:00
8
8
  summary: A Ruby-based text parsing and interpretation DSL
9
9
  require_paths:
10
10
  - lib
@@ -162,7 +162,7 @@ dependencies:
162
162
  version_requirement:
163
163
  version_requirements: !ruby/object:Gem::Version::Requirement
164
164
  requirements:
165
- - - ">"
165
+ - - ">="
166
166
  - !ruby/object:Gem::Version
167
- version: 0.0.0
167
+ version: 2.0.2
168
168
  version: