pegex 0.0.2 → 0.0.3

data/.gemspec CHANGED
@@ -2,7 +2,7 @@
 
  GemSpec = Gem::Specification.new do |gem|
  gem.name = 'pegex'
- gem.version = '0.0.2'
+ gem.version = '0.0.3'
  gem.license = 'MIT'
  gem.required_ruby_version = '>= 1.9.1'
 
@@ -17,5 +17,5 @@ that will work equivalently in lots of programming languages!
 
  gem.files = `git ls-files`.lines.map{|l|l.chomp}
 
- gem.add_development_dependency 'testml-lite', '>= 0.0.1'
+ gem.add_development_dependency 'testml', '>= 0.0.2'
  end
@@ -0,0 +1,2 @@
+ export RUBYLIB=../testml-rb/lib
+ export RUBYOPT=-rxxx
@@ -1,3 +1,6 @@
+ - version: 0.0.3
+ date: Sat Apr 20 07:03:20 CST 2013
+ changes: Major refactoring release
  - version: 0.0.2
  date: Wed Dec 19 23:58:14 PST 2012
  changes: Fixed a regex bug
data/LICENSE CHANGED
@@ -1,6 +1,6 @@
  (The MIT License)
 
- Copyright © 2012 Ingy döt Net
+ Copyright © 2013 Ingy döt Net
 
  Permission is hereby granted, free of charge, to any person obtaining a copy of
  this software and associated documentation files (the ‘Software’), to deal in
@@ -1,4 +1,4 @@
- = Pegex - Acmeist PEG Parsing Framework
+ = pegex - Acmeist PEG Parsing Framework
 
  Pegex is a Acmeist parser framework. It allows you to easily create
  parsers that will work equivalently in lots of programming languages!
@@ -29,35 +29,37 @@ or more explicitly:
 
  = Description
 
- Pegex is a Acmeist parser framework. It allows you to easily create parsers
- that will work equivalently in lots of programming languages!
+ Pegex is a Acmeist parser framework. It allows you to easily create
+ parsers that will work equivalently in lots of programming languages!
 
- Pegex gets it name by combining Parsing Expression Grammars (PEG), with
- Regular Expessions (Regex). That's actually what Pegex does.
+ Pegex gets it name by combining Parsing Expression Grammars (PEG),
+ with Regular Expessions (Regex). That's actually what Pegex does.
 
- PEG is the cool new way to elegantly specify recursive descent grammars. The
- Perl 6 language is defined in terms of a self modifying PEG language called
- *Perl 6 Rules*. Regexes are familiar to programmers of most modern
- programming languages. Pegex defines a simple PEG syntax, where all the
- terminals are regexes. This means that Pegex can be quite fast and powerful.
+ PEG is the cool new way to elegantly specify recursive descent
+ grammars. The Perl 6 language is defined in terms of a self modifying
+ PEG language called *Perl 6 Rules*. Regexes are familiar to
+ programmers of most modern programming languages. Pegex defines a
+ simple PEG syntax, where all the terminals are regexes. This means
+ that Pegex can be quite fast and powerful.
 
- Pegex attempts to be the simplest way to define new (or old) Domain Specific
- Languages (DSLs) that need to be used in several programming languages and
- environments.
+ Pegex attempts to be the simplest way to define new (or old) Domain
+ Specific Languages (DSLs) that need to be used in several programming
+ languages and environments.
 
  = Usage
 
- The +pegex.rb+ module itself is just a trivial way to use the
- Pegex framework. It is only intended for the simplest of uses.
+ The +pegex.rb+ module itself is just a trivial way to use the Pegex
+ framework. It is only intended for the simplest of uses.
 
- +pegex.rb+ defines a single function, +pegex+, which takes a Pegex grammar
- string as input. You may also pass in a receiver class or object.
+ +pegex.rb+ defines a single function, +pegex+, which takes a Pegex
+ grammar string as input. You may also pass in a receiver class or
+ object.
 
  parser = pegex(grammar, MyReceiver)
 
- The +pegex+ function returns a Pegex::Parser object, on which you would
- typically call the +parse()+ method, which (on success) will return a data
- structure of the parsed data.
+ The +pegex+ function returns a Pegex::Parser object, on which you
+ would typically call the +parse()+ method, which (on success) will
+ return a data structure of the parsed data.
 
  See Pegex::API for more details.
 
@@ -66,10 +68,10 @@ See Pegex::API for more details.
  This Pegex library was ported to Ruby from the Perl module:
  http://search.cpan.org/dist/Pegex/
 
- The code and tests were fully ported from Perl to Ruby. Pegex should work
- exactly the same in both languages. The documentation and examples have not yet
- been fully ported, but they will be soon enough. For now, refer to the Perl
- docs.
+ The code and tests were fully ported from Perl to Ruby. Pegex should
+ work exactly the same in both languages. The documentation and
+ examples have not yet been fully ported, but they will be soon enough.
+ For now, refer to the Perl docs.
 
  You can start here: http://search.cpan.org/dist/Pegex/lib/Pegex.pod
 
data/Rakefile CHANGED
@@ -11,6 +11,12 @@ DevNull = '2>/dev/null'
  require 'rake'
  require 'rake/testtask'
  require 'rake/clean'
+ if File.exists? 'test/testml.yaml'
+ if File.exists? 'lib/rake/testml.rb'
+ $:.unshift "#{Dir.getwd}/lib"
+ end
+ require 'rake/testml'
+ end
 
  task :default => 'help'
 
@@ -18,9 +24,12 @@ CLEAN.include GemDir, GemFile, 'data.tar.gz', 'metadata.gz'
 
  desc 'Run the tests'
  task :test do
+ load '.env' if File.exists? '.env'
  Rake::TestTask.new do |t|
  t.verbose = true
- t.test_files = FileList['test/*.rb']
+ t.test_files = ENV['DEV_TEST_FILES'] &&
+ FileList[ENV['DEV_TEST_FILES'].split] ||
+ FileList['test/**/*.rb'].sort
  end
  end
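Note on the new test_files expression: it relies on `a && b || c` grouping as `(a && b) || c`, so the override is used only when DEV_TEST_FILES is set. The following is a minimal sketch of that selection logic, assuming plain arrays in place of Rake's FileList; `select_test_files` is a hypothetical helper, not part of the Rakefile.

```ruby
# Hypothetical helper mirroring the Rakefile expression with plain arrays.
# `env` stands in for ENV, `default_files` for the globbed test files.
def select_test_files(env, default_files)
  # (override set?) && (split override on whitespace) || (sorted defaults)
  env['DEV_TEST_FILES'] && env['DEV_TEST_FILES'].split || default_files.sort
end

defaults = ['test/b.rb', 'test/a.rb']

# No override: the sorted defaults are used.
p select_test_files({}, defaults)

# Override: a whitespace-separated list replaces the defaults.
p select_test_files({ 'DEV_TEST_FILES' => 'test/a.rb test/sub/c.rb' }, defaults)
```

In this sketch a variable that is set but empty yields an empty list rather than the defaults, because both the empty string and the empty array are truthy in Ruby.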
 
data/ToDo CHANGED
@@ -1,3 +1,6 @@
+ - Support grammar inheritance
+
+ - Use pp tree instead of JSON in pegex/pegex/grammar
  - Finish porting tests from perl
  - Support grammar self compile:
  > ruby -Ilib -rfoo/grammar -ecompile
@@ -3,15 +3,14 @@ module Pegex;end
  require 'pegex/parser'
  require 'pegex/grammar'
 
- def pegex grammar_text, receiver=nil
- unless receiver
+ def pegex grammar, receiver=nil
+ if not receiver
  require 'pegex/tree/wrap'
  receiver = Pegex::Tree::Wrap.new
  end
- receiver = receiver.new \
- if receiver.class == Class
+ receiver = receiver.new if receiver.class == Class
  return Pegex::Parser.new do |p|
- p.grammar = Pegex::Grammar.new {|g| g.text = grammar_text}
+ p.grammar = Pegex::Grammar.new {|g| g.text = grammar}
  p.receiver = receiver
  end
  end
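Note on the receiver argument: the function accepts nothing, a class, or an already-built object, and normalizes all three to an instance. A minimal sketch of that normalization follows, assuming stand-in classes; `DefaultReceiver`, `MyReceiver` and `normalize_receiver` are hypothetical names (the gem itself defaults to Pegex::Tree::Wrap).

```ruby
# Stand-ins for illustration only; not classes from the pegex gem.
class DefaultReceiver; end
class MyReceiver; end

# Hypothetical helper isolating the receiver handling shown in the hunk.
def normalize_receiver(receiver = nil)
  # No receiver given: fall back to a default instance.
  receiver = DefaultReceiver.new if not receiver
  # A class was given: instantiate it. Existing objects pass through.
  receiver = receiver.new if receiver.class == Class
  receiver
end

p normalize_receiver.class              # default instance
p normalize_receiver(MyReceiver).class  # class is instantiated
obj = MyReceiver.new
p normalize_receiver(obj).equal?(obj)   # object is returned unchanged
```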
@@ -1,6 +1,7 @@
  require 'pegex'
  class Pegex::Grammar
  attr_accessor :text
+ attr_accessor :tree
 
  def initialize
  yield self if block_given?
@@ -2,6 +2,7 @@ require 'pegex'
 
  class Pegex::Input
  attr_accessor :string
+ attr_accessor :file
 
  def initialize
  @is_eof = false
@@ -1,11 +1,15 @@
  require 'pegex/input'
 
- $pegex_nil = []
- $dummy = [1]
+ module Pegex::Constant
+ Null = []
+ Dummy = []
+ end
 
  class Pegex::Parser
  attr_accessor :grammar
  attr_accessor :receiver
+ attr_accessor :input
+
  attr_accessor :parent
  attr_accessor :rule
  attr_accessor :debug
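Note on the new constants: Null and Dummy are both empty arrays, so they compare equal with `==` but remain distinct objects. Elsewhere in this diff the parser tests for Null with `equal?`, which checks object identity. A minimal sketch of that sentinel pattern, assuming a stand-in module `SketchConstant` and a hypothetical predicate `null_result?`:

```ruby
# Stand-in for Pegex::Constant; the names below are illustrative only.
module SketchConstant
  Null  = []
  Dummy = []
end

# Identity check: true only for the Null object itself.
def null_result?(value)
  value.equal?(SketchConstant::Null)
end

p SketchConstant::Null == SketchConstant::Dummy  # equal by value
p null_result?(SketchConstant::Null)             # same object
p null_result?(SketchConstant::Dummy)            # different object
p null_result?([])                               # a fresh empty array
```

Because both sentinels are empty, concatenating either onto a match list adds nothing, which is consistent with the diff dropping the explicit Null guard around `match.concat`.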
@@ -14,18 +18,16 @@ class Pegex::Parser
  @position = 0
  @farthest = 0
  @optimized = false
- @debug = false
  @throw_on_error = true
- # @debug = true
+ @debug = ENV['RUBY_PEGEX_DEBUG'] || $PegexParserDebug || false
  yield self if block_given?
  end
 
  def parse input, start=nil
  @position = 0
- if input.kind_of? String
- input = Pegex::Input.new do |i|
- i.string = input
- end
+
+ if not input.kind_of? Pegex::Input
+ input = Pegex::Input.new {|i| i.string = input}
  end
  @input = input
  @input.open unless @input.open?
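Note on the debug default: the `||` chain returns the first truthy operand, and in Ruby every string is truthy, including "0" and the empty string. A minimal sketch of the same chain, assuming a hash in place of ENV and a parameter in place of the global; `debug_setting` is a hypothetical helper.

```ruby
# Hypothetical helper: `env` stands in for ENV, `global_flag` for the
# $PegexParserDebug global used in the diff.
def debug_setting(env, global_flag = nil)
  env['RUBY_PEGEX_DEBUG'] || global_flag || false
end

p debug_setting({})                             # nothing set
p debug_setting({}, true)                       # global flag only
p debug_setting({ 'RUBY_PEGEX_DEBUG' => '0' })  # any string counts as set
```

Under this reading, setting the variable to any value, even "0", switches debugging on; only leaving it unset falls through to the next operand.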
@@ -40,18 +42,20 @@ class Pegex::Parser
  (@tree['TOP'] ? 'TOP' : nil) or
  fail "No starting rule for Pegex::Parser::parse"
 
- optimize_grammar start_rule_ref
+ optimize_grammar(start_rule_ref)
 
- fail "No 'receiver'. Can't parse" unless @receiver
+ fail "No 'receiver'. Can't parse" unless @receiver
 
- # XXX does ruby have problems with circulat references
+ # XXX does ruby have problems with circulat references?
  @receiver.parser = self
 
  if @receiver.respond_to? 'initial'
- @rule, @parent = $start_rule_ref, {}
+ @rule = start_rule_ref
+ @parent = {}
+ @receiver.initial
  end
 
- match = match_ref start_rule_ref, {}
+ match = match_ref(start_rule_ref, {})
 
  @input.close
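Note on the receiver hooks: this hunk adds the actual `@receiver.initial` call, guarded by `respond_to?`, so receivers may define the hook or omit it. A minimal sketch of optional hooks in that style, assuming simplified stand-in classes; `SketchRunner`, `LoggingReceiver` and `BareReceiver` are hypothetical and do not reproduce the parser's match handling.

```ruby
# Hypothetical driver that calls optional hooks only when they exist.
class SketchRunner
  def initialize(receiver)
    @receiver = receiver
  end

  def run(value)
    @receiver.initial if @receiver.respond_to? 'initial'
    value = @receiver.final(value) if @receiver.respond_to? 'final'
    value
  end
end

# A receiver that defines both hooks and records the calls.
class LoggingReceiver
  attr_reader :events

  def initialize
    @events = []
  end

  def initial
    @events << :initial
  end

  def final(value)
    @events << :final
    value.upcase
  end
end

# A receiver that defines neither hook.
class BareReceiver; end

logging = LoggingReceiver.new
p SketchRunner.new(logging).run('ok')
p logging.events
p SketchRunner.new(BareReceiver.new).run('ok')
```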
 
@@ -61,7 +65,8 @@ class Pegex::Parser
  end
 
  if @receiver.respond_to? 'final'
- @rule, @parent = start_rule_ref, {}
+ @rule = start_rule_ref
+ @parent = {}
  match = [ @receiver.final(match.first) ]
  end
 
@@ -72,9 +77,9 @@ class Pegex::Parser
  return if @optimized
  @tree.each_pair do |name, node|
  next if node.kind_of? String
- optimize_node node
+ optimize_node(node)
  end
- optimize_node '.ref' => start
+ optimize_node('.ref' => start)
  @optimized = true
  end
 
@@ -88,8 +93,8 @@ class Pegex::Parser
  end
  end
  min, max = node.values_at '+min', '+max'
- node['+min'] ||= max == nil ? 1 : 0
- node['+max'] ||= min == nil ? 1 : 0
+ node['+min'] ||= max.nil? ? 1 : 0
+ node['+max'] ||= min.nil? ? 1 : 0
  node['+asr'] ||= nil
  node['+min'] = node['+min'].to_i
  node['+max'] = node['+max'].to_i
@@ -111,12 +116,12 @@ class Pegex::Parser
  node['rule'] = Regexp.new "\\A#{node['.rgx']}"
  end
  if sep = node['.sep']
- optimize_node sep
+ optimize_node(sep)
  end
  end
 
  def match_next next_
- return match_next_with_sep next_ if next_['.sep']
+ return match_next_with_sep(next_) if next_['.sep']
 
  rule, method, kind, min, max, assertion =
  next_.values_at 'rule', 'method', 'kind', '+min', '+max', '+asr'
@@ -126,7 +131,7 @@ class Pegex::Parser
  while return_ = method.call(rule, next_)
  position = @position unless assertion
  count += 1
- match.concat return_ unless return_.equal? $pegex_nil
+ match.concat return_
  break if max == 1
  end
  if max != 1
@@ -134,7 +139,7 @@ class Pegex::Parser
  @farthest = position if (@position = position) > @farthest
  end
  result = (count >= min and (max == 0 or count <= max)) ^ (assertion == -1)
- if not result or assertion
+ if not(result) or assertion
  @farthest = position if (@position = position) > @farthest
  end
 
@@ -146,53 +151,52 @@ class Pegex::Parser
  next_.values_at 'rule', 'method', 'kind', '+min', '+max', '.sep'
 
  position, match, count, scount, smin, smax =
- @position, [], 0, 0, sep.values_at('+min', '+max')
-
+ @position, [], 0, 0, *(sep.values_at('+min', '+max'))
  while return_ = method.call(rule, next_)
  position = @position
  count += 1
- match.concat return_
+ match.concat(return_)
  return_ = match_next(sep) or break
- match.concat return_
+ match.concat(smax == 1 ? return_ : return_[0]) if !return_.empty?
  scount += 1
  end
- if max != 1
- match = [match]
- end
+ match = [match] if max != 1
  result = count >= min and (max == 0 or count <= max)
  if count == scount and not sep['+eok']
  @farthest = position if (@position = position) > @farthest
  end
 
- return result ? next_['-skip'] ? [] : match : false
+ return(result ? next_['-skip'] ? [] : match : false)
  end
 
  def match_ref ref, parent
  rule = @tree[ref]
  match = match_next(rule) or return false
- return $dummy unless rule['action']
+ return Pegex::Constant::Dummy unless rule['action']
  @rule, @parent = ref, parent
  result = rule['action'].call(match.first)
- return (result.equal? $pegex_nil) ? result : [result]
+ return (result.equal? Pegex::Constant::Null) ? result : [result]
  end
 
  def match_rgx regexp, parent=nil
- position = @position
- string = @buffer[position .. -1]
- (m = string.match regexp) or return false
- position += m[0].length
+ buffer = @buffer[@position .. -1]
+ (m = buffer.match regexp) or return false
+ @position += m[0].length
+ # TODO use m.captures
  match = m[1..-1]
  match = [ match ] if m.length > 2
- @farthest = position if (@position = position) > @farthest
+ @farthest = @position if @position > @farthest
  return match
  end
 
  def match_all list, parent=nil
- position, set, len = @position, [], 0
+ position = @position
+ set = []
+ len = 0
  list.each do |elem|
  if match = match_next(elem)
- if !elem['+asr'] and !elem['-skip']
- set.concat match
+ if !(elem['+asr'] or elem['-skip'])
+ set.concat(match)
  len += 1
  end
  else
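Note on the splat added in match_next_with_sep: in a Ruby multiple assignment, an array on the right-hand side counts as a single value unless it is splatted. Without `*`, the array returned by `values_at` lands in `smin` and `smax` is left nil. A minimal sketch of the difference, assuming a hypothetical separator hash `SEP` and helper methods that are not part of the parser:

```ruby
# Hypothetical separator node, shaped like the hashes used in the parser.
SEP = { '+min' => 0, '+max' => 1 }

# Without the splat, the two-element array is ONE right-hand value.
def assign_without_splat(sep)
  _count, smin, smax = 0, sep.values_at('+min', '+max')
  [smin, smax]
end

# With the splat, the array expands into two right-hand values.
def assign_with_splat(sep)
  _count, smin, smax = 0, *(sep.values_at('+min', '+max'))
  [smin, smax]
end

p assign_without_splat(SEP)  # smin holds the whole array, smax is nil
p assign_with_splat(SEP)     # smin and smax hold the two bounds
```

The new `smax == 1` test in the same hunk only behaves as intended once `smax` actually receives the separator's upper bound.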
@@ -206,7 +210,7 @@ class Pegex::Parser
 
  def match_any list, parent=nil
  list.each do |elem|
- if (match = match_next elem)
+ if (match = match_next(elem))
  return match
  end
  end
@@ -214,18 +218,18 @@ class Pegex::Parser
  end
 
  def match_err error, parent=nil
- throw_error error
+ throw_error(error)
  end
 
  def match_ref_trace ref, parent
  rule = @tree[ref]
- trace_on = ! rule['+asr']
- trace "try_#{ref}" if trace_on
+ trace = ! rule['+asr']
+ trace("try_#{ref}") if trace
  result = nil
- if (result = match_ref ref, parent)
- trace "got_#{ref}" if trace_on
+ if (result = match_ref(ref, parent))
+ trace("got_#{ref}") if trace
  else
- trace "not_#{ref}" if trace_on
+ trace("not_#{ref}") if trace
  end
  return result
  end
@@ -243,13 +247,7 @@ class Pegex::Parser
  $stderr.print indent ? " >#{snippet}<\n" : "\n"
  end
 
- def throw_error msg
- raise msg
- end
-
- class PegexParseError < RuntimeError
-
- end
+ class PegexParseError < RuntimeError;end
 
  def throw_error msg
  @error = format_error msg
@@ -57,20 +57,20 @@ class Pegex::Pegex::AST < Pegex::Tree
  group[@prefixes[prefix]] = 1
  end
  unless suffix.empty?
- set_quantity group, suffix
+ set_quantity(group, suffix)
  end
  return group
  end
 
  def got_all_group got
- list = get_group got
+ list = get_group(got)
  fail unless list.length > 0
  return list.first if list.length == 1
  return '.all' => list
  end
 
  def got_any_group got
- list = get_group got
+ list = get_group(got)
  fail unless list.length > 0
  return list.first if list.length == 1
  return '.any' => list
@@ -96,10 +96,10 @@ class Pegex::Pegex::AST < Pegex::Tree
  if (regex = @atoms[ref])
  @extra_rules[ref] = {'.rgx' => regex}
  end
- unless suffix.empty?
- set_quantity node, suffix
+ if !suffix.empty?
+ set_quantity(node, suffix)
  end
- unless prefix.empty?
+ if !prefix.empty?
  if @prefixes[prefix].kind_of? Array
  key, val = @prefixes[prefix]
  else