fupeg 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: e5cd2184b66fd7961603e8e21fbdf82514619001cff01cbe230f218c6f2737b1
-   data.tar.gz: 74af0b350f4f966c7c7da490609f18c982c4aed3541a9e279f5eff318c4f698f
+   metadata.gz: 7fb3eb5f497d0c4294b507b5c8681b39635b4a7d1df96644f2afd73170e552b2
+   data.tar.gz: 682bfdc3c694abeddd1dc31591ef5bd39429ec4c9b2511e71269939eedd26549
  SHA512:
-   metadata.gz: ac6a5f6588125e13597104c4c9969fbcf85825b73c76d677ea525eae81791111658733bc366a0b129af86bd85d017bec36b2b6a5c48d2677a574b63334b32bb7
-   data.tar.gz: 8988b71fe64ea7202cc08c996a58ab0c461df6bdba19e1cf7f32315fa039d3412048235e752a7ae4895847f1215b2dd7716255c8b7c03782154926bfb43fd874
+   metadata.gz: 00d7f7fc3d440d968df79df59dab77c87918437dfee730682f31728fe0e1f43c650d88a1ef8d44d92c18f107d83a14fb47db0758516cf97e520e47690e118af7
+   data.tar.gz: 1653a4e8f7a3302b42682b29fac7b18e28e61dc99f1e63b392058282d73f3f2e54963e081431507b5e9fbb52bf5da7341de72d34e2fdaf7497db25215621f8b2
data/CHANGELOG.md CHANGED
@@ -1,4 +1,26 @@
- ## [0.1.0] - 2023-08-11
+ ## [0.2.0] - 2023-08-15
+
+ - Split Parser and Grammar
+ - Use `_` for both literals and sequences:
+   `_("x")`, `_(/x/)`, `_{ _("x") }`
+ - Use backtick "\`" for string literals
+   `x`
+ - `cont?` used with a block to detect an uncut alternative
+   ```ruby
+   cut {
+     # condition
+     _ { `if` && cut! && ... } ||
+     # loop
+     cont? { `while` && cut! && ...} ||
+     # assignment
+     cont? { (i = ident) && sp? && `=` && cut! && ... } ||
+     # function call
+     cont? { (i = ident) && sp? && `(` && cut! && ... } ||
+     ...
+   }
+   ```
+
+ ## [0.1.0] - 2023-08-14
 
  - Initial release
  - Simplest rule definition in Ruby code without magic
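Note: the `cut` / `cut!` / `cont?` behaviour described in the 0.2.0 changelog entry can be illustrated with a small self-contained sketch. It is not FuPeg's implementation: the class `CutSketch` and its helpers `lit` and `alt` are hypothetical names invented here, and only the standard `strscan` library is assumed. The idea shown is that once an alternative calls `cut!`, later alternatives guarded by `cont?` are skipped even if the committed alternative then fails.

```ruby
require "strscan"

# Hypothetical stand-in for illustration; not FuPeg's implementation.
class CutSketch
  def initialize(str)
    @scan = StringScanner.new(str)
    @cut = false
  end

  # Match a regexp at the current position, or return nil.
  def lit(re)
    @scan.scan(re)
  end

  # Run one alternative; rewind the scanner if the block fails.
  def alt
    pos = @scan.pos
    res = yield
    @scan.pos = pos unless res
    res
  end

  # Open a cut scope; the previous flag is restored on exit.
  def cut
    prev = @cut
    @cut = false
    yield
  ensure
    @cut = prev
  end

  # Commit to the current alternative.
  def cut!
    @cut = true
  end

  # Try the block only if no earlier alternative has committed.
  def cont?(&block)
    !@cut && alt(&block)
  end

  def statement
    cut {
      alt { lit(/if\b/) && cut! && lit(/ \w+/) && :if } ||
        cont? { lit(/while\b/) && cut! && lit(/ \w+/) && :while } ||
        cont? { lit(/\w+/) && :call }
    }
  end
end
```

With this sketch, the input `"if"` commits to the first alternative and then fails, so the generic third alternative is never tried; without the cut, `"if"` would have been accepted as a call.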
data/README.md CHANGED
@@ -1,8 +1,16 @@
- # Fupeg
+ # Fupeg - simplest parser combinator
 
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/fupeg`. To experiment with that code, run `bin/console` for an interactive prompt.
+ PEG-like parser combinator, as simple as possible but still useful.
+ - backtracking, manually specified by user.
+ - no memoization (yet).
+ - no left recursion (yet).
+ - built with StringScanner.
+ - pattern sequences and alternation are implemented with logical operators.
 
- TODO: Delete this and the text above, and describe your gem
+ Grammar code is pure Ruby and is executed as it is written.
+ No grammar tree is built and evaluated.
+
+ As a bonus, a "cut" operator is implemented.
 
  ## Installation
 
@@ -16,7 +24,87 @@ If bundler is not being used to manage dependencies, install the gem by executin
 
  ## Usage
 
- TODO: Write usage instructions here
+ First, define a grammar:
+
+ ```ruby
+ require "fupeg"
+
+ class Calc < FuPeg::Grammar
+   def eof
+     wont! { dot } && :eof
+   end
+
+   def lnsp?
+     # match regular expression
+     _(/[ \t]*/)
+   end
+
+   # Ruby 3.0 flavour
+   def sp? = _(/\s*/)
+
+   def number = (n = _(/\d+/)) && [:num, n]
+
+   def atom
+     # match raw string: _("(") is aliased to `(`
+     #
+     # match sequence of patterns with backtracking:
+     # `_{ x && y && z }` will rewind position, if block returns `nil` or `false`
+     #
+     # store value, returned by subpattern: just store it into variable
+     #
+     # use `||` for alternatives
+     number || _ { _("(") && sp? && (sub = sum) && sp? && `)` && [:sub, sub] }
+   end
+
+   def fact
+     # repetition returns array of block results
+     # it stops if block returns falsey (`nil` or `false`)
+     rep { |fst| # fst == true for first element
+       op = nil
+       # don't expect operator before first term
+       (fst || (op = `*` || _("/") || _(/%/)) && sp?) &&
+         (a = atom) && lnsp? &&
+         [op, a].compact
+       # flat AST tree, returns [:fact, at, op, at, op, at] if matched
+     }&.flatten(1)&.unshift(:fact)
+   end
+
+   def sum
+     _ {
+       op = rest = nil
+       (f = fact) &&
+         # an optional pattern always succeeds
+         opt { lnsp? && (op = `+` || `-`) && sp? && (rest = sum) } &&
+         # recursive AST tree
+         (rest ? [:sum, f, op, rest] : f)
+     }
+   end
+
+   def root
+     _ { sum || eof }
+   end
+ end
+ ```
+
+ Then either parse the string directly, or create a parser and grammar:
+
+ ```ruby
+ # Direct parsing
+ pp Calc.parse(:root, "1")
+ pp Calc.parse(:root, "1 + 2")
+
+ # separate parser and grammar initialization
+ parser = FuPeg::Parser.new("1 - 2*4/7 + 5")
+ grammar = Calc.new(parser)
+ pp grammar.root
+
+ # combined parser and grammar initialization
+ _parser, grammar = Calc.create("(1 -
+ 2)*
+ (4 -10) +
+ 11")
+ pp grammar.root
+ ```
 
  ## Development
 
@@ -26,7 +114,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To
 
  ## Contributing
 
- Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/fupeg.
+ Bug reports and pull requests are welcome on GitHub at https://github.com/funny-falcon/fupeg .
 
  ## License
 
data/examples/calc.rb ADDED
@@ -0,0 +1,69 @@
+ require "fupeg"
+
+ class Calc < FuPeg::Grammar
+   def eof
+     wont! { dot } && :eof
+   end
+
+   def lnsp?
+     # match regular expression
+     _(/[ \t]*/)
+   end
+
+   # Ruby 3.0 flavour
+   def sp? = _(/\s*/)
+
+   def number = (n = _(/\d+/)) && [:num, n]
+
+   def atom
+     # match raw string: _("(") is aliased to `(`
+     #
+     # match sequence of patterns with backtracking:
+     # `_{ x && y && z }` will rewind position, if block returns `nil` or `false`
+     #
+     # store value, returned by subpattern: just store it into variable
+     number || _ { _("(") && sp? && (sub = sum) && sp? && `)` && [:sub, sub] }
+   end
+
+   def fact
+     # repetition returns array of block results
+     # it stops if block returns falsey (`nil` or `false`)
+     rep { |fst| # fst == true for first element
+       op = nil
+       (fst || (op = `*` || `/` || `%`) && sp?) &&
+         (a = atom) && lnsp? &&
+         [op, a].compact
+       # flat AST tree, returns [:fact, at, op, at, op, at] if matched
+     }&.flatten(1)&.unshift(:fact)
+   end
+
+   def sum
+     _ {
+       op = rest = nil
+       (f = fact) &&
+         # an optional pattern always succeeds
+         opt { lnsp? && (op = `+` || `-`) && sp? && (rest = sum) } &&
+         # recursive AST tree
+         (rest ? [:sum, f, op, rest] : f)
+     }
+   end
+
+   def root
+     _ { sum || eof }
+   end
+ end
+
+ pp Calc.parse(:root, "1")
+ pp Calc.parse(:root, "1 + 2")
+
+ # separate parser and grammar initialization
+ parser = FuPeg::Parser.new("1 - 2*4/7 + 5")
+ grammar = Calc.new(parser)
+ pp grammar.root
+
+ # combined parser and grammar initialization
+ _parser, grammar = Calc.create("(1 -
+ 2)*
+ (4 -10) +
+ 11")
+ pp grammar.root
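Note: the example grammar relies on two ideas, sequences written with `&&` and alternatives written with `||`, where a failed sequence rewinds the scanner. The following minimal sketch shows that idea with nothing but the standard `strscan` library. It is an illustration under stated assumptions, not FuPeg code: `TinyPeg`, `m`, `seq` and `pair` are hypothetical names.

```ruby
require "strscan"

# Hypothetical helper names; a sketch of the idea, not FuPeg's code.
class TinyPeg
  def initialize(str)
    @scan = StringScanner.new(str)
  end

  # Match a regexp at the current position; nil on failure.
  def m(re)
    @scan.scan(re)
  end

  # Run a sequence; restore the position if the block returns nil or false.
  def seq
    pos = @scan.pos
    res = yield
    @scan.pos = pos unless res
    res
  end

  # "(1,2)" => [:pair, "1", "2"]; "(7)" => [:single, "7"]
  def pair
    seq { m(/\(/) && (a = m(/\d+/)) && m(/,/) && (b = m(/\d+/)) && m(/\)/) && [:pair, a, b] } ||
      seq { m(/\(/) && (a = m(/\d+/)) && m(/\)/) && [:single, a] }
  end
end
```

For the input `"(7)"` the first sequence consumes `(7`, fails at the comma, and rewinds to position 0, which is what lets the second alternative start from the opening parenthesis again.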
data/lib/fupeg/grammar.rb ADDED
@@ -0,0 +1,88 @@
+ # frozen_string_literal: true
+
+ require_relative "parser"
+
+ module FuPeg
+   class Grammar
+     def self.create(str, pos = 0)
+       parser = Parser.new(str, pos)
+       grammar = new(parser)
+       [parser, grammar]
+     end
+
+     def self.parse(root, str)
+       _, gr = create(str)
+       gr.__send__(root)
+     end
+
+     def initialize(parser)
+       @p = parser
+     end
+
+     def fail!
+       @p.fail!(skip: 3)
+     end
+
+     def dot
+       @p.match(/./m)
+     end
+
+     def `(str)
+       @p.match(str)
+     end
+
+     def _(lit = nil, &block)
+       @p.match(lit, &block)
+     end
+
+     def opt(arg = nil, &block)
+       @p.match(arg, &block) || true
+     end
+
+     def will?(lit = nil, &block)
+       @p.preserve(pos: true) { @p.match(lit, &block) }
+     end
+
+     def wont!(lit = nil, &block)
+       @p.preserve(pos: true, failed: true) { !@p.match(lit, &block) } || @p.fail!
+     end
+
+     def text(lit = nil, &block)
+       @p.text(lit, &block)
+     end
+
+     def bounds(lit = nil, &block)
+       @p.bounds(lit, &block)
+     end
+
+     def cut(&block)
+       @p.with_cut_point(&block)
+     end
+
+     def cut!
+       @p.current_cutpoint.cut!
+     end
+
+     def cont?(&block)
+       @p.current_cutpoint.can_continue? && (block ? @p.backtrack(&block) : true)
+     end
+
+     def rep(range = 0.., lit = nil, &block)
+       range = range..range if Integer === range
+       range = 0..range.max if range.begin.nil?
+       unless Integer === range.min && (range.end.nil? || Integer === range.max)
+         raise "Range malformed #{range}"
+       end
+       @p.backtrack do
+         max = range.end && range.max
+         ar = []
+         (1..max).each do |i|
+           res = @p.backtrack { yield i == 1 }
+           break unless res
+           ar << res
+         end
+         (ar.size >= range.min) ? ar : @p.fail!
+       end
+     end
+   end
+ end
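Note: the `rep` method above collects block results until the block fails, then checks the count against a range. A free-standing sketch of that repetition logic, assuming only the standard `strscan` library, could look as follows. The function name `repeat` and its explicit `scan` parameter are hypothetical and chosen for illustration; the bounds handling is simplified compared to the method in the diff.

```ruby
require "strscan"

# Hypothetical free-standing helper, for illustration only.
# Yields true for the first element and false afterwards.
def repeat(scan, range = (0..))
  range = range..range if range.is_a?(Integer)
  min = range.begin || 0
  max = range.end # nil means "no upper bound"
  max -= 1 if max && range.exclude_end?
  start = scan.pos
  results = []
  while max.nil? || results.size < max
    pos = scan.pos
    res = yield(results.empty?)
    unless res
      scan.pos = pos # undo a partially matched element
      break
    end
    results << res
  end
  return results if results.size >= min
  scan.pos = start # too few elements: rewind everything
  nil
end

scan = StringScanner.new("1,2,3;")
items = repeat(scan) { |first| (first || scan.scan(/,/)) && scan.scan(/\d/) }
```

As in the original, a block that succeeds without consuming input would loop forever when no upper bound is given, so the element pattern should always consume something.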
data/lib/fupeg/parser.rb CHANGED
@@ -28,28 +28,31 @@ module FuPeg
        @scan.pos
      end
 
-     def charpos
-       @str_size - @str.byteslice(@scan.pos..).size
+     def charpos(pos = @scan.pos)
+       @str_size - @str.byteslice(pos..).size
      end
 
-     Fail = Struct.new(:stack, :pos, :bytepos)
+     Fail = Struct.new(:stack, :bytepos)
 
-     def fail!(skip = 2)
+     def fail!(*, skip: 2)
        if !@failed || bytepos > @failed.bytepos
          stack = caller_locations(skip)
          stack.delete_if do |loc|
-           if loc.path == __FILE__
-             loc.label =~ /\b(_bt|each|block)\b/
+           if loc.path.start_with?(__dir__)
+             loc.label =~ /\b(backtrack|each|block)\b/
            end
          end
-         pos = position_for_charpos(charpos)
-         @failed = Fail.new(stack, pos, bytepos)
+         @failed = Fail.new(stack, bytepos)
        end
        nil
      end
 
+     def failed_position
+       position_for_bytepos(@failed.bytepos)
+     end
+
      def report_failed(out)
-       pos = @failed.pos
+       pos = position_for_bytepos(@failed.bytepos)
        out << "Failed at #{pos.lineno}:#{pos.colno} :\n"
        out << pos.line + "\n"
        out << (" " * (pos.colno - 1) + "^\n")
@@ -60,75 +63,36 @@ module FuPeg
        out
      end
 
-     def dot
-       @scan.scan(/./m) || fail!
-     end
-
      begin
        StringScanner.new("x").skip("x")
-       def lit(reg_or_str)
-         @scan.scan(reg_or_str) || fail!
+       def match(lit = //, &block)
+         block ? backtrack(&block) : (@scan.scan(lit) || fail!)
        end
      rescue
-       def lit(reg_or_str)
-         if String === reg_or_str
-           @__match_lit_cache ||= Hash.new { |h, s| h[s] = Regexp.new(Regexp.escape(s)) }
-           reg_or_str = @__match_lit_cache[reg_or_str]
+       def match(lit = //, &block)
+         if String === lit
+           @_lit_cache ||= {}
+           lit = @_lit_cache[lit] ||= Regexp.new(Regexp.escape(lit))
          end
-         @scan.scan(reg_or_str) || fail!
+         block ? backtrack(&block) : (@scan.scan(lit) || fail!)
        end
      end
 
-     def seq(*args, &block)
-       _bt(&block)
-     end
-
-     def opt(&block)
-       _rewind(nil, @failed, _bt(&block) || true)
-     end
-
-     def rep(range = 0.., &block)
-       range = range..range if Integer === range
-       range = 0..range.max if range.begin.nil?
-       unless Integer === range.min && (range.end.nil? || Integer === range.max)
-         raise "Range malformed #{range}"
-       end
-       _bt do
-         max = range.end && range.max
-         ar = []
-         (1..max).each do
-           res = _bt(&block)
-           break unless res
-           ar << res
-         end
-         (ar.size >= range.min) ? ar : fail!
-       end
-     end
-
-     def text(&block)
+     def text(lit = nil, &block)
        pos = @scan.pos
-       _bt(&block) && @str.byteslice(pos, @scan.pos - pos)
+       match(lit, &block) && @str.byteslice(pos, @scan.pos - pos)
      end
 
-     def will?(&block)
-       _rewind(@scan.pos, false, _bt(&block))
-     end
-
-     def wont!(&block)
-       _rewind(@scan.pos, @failed, !_bt(&block)) || fail!
+     def bounds(lit = nil, &block)
+       pos = @scan.pos
+       match(lit, &block) && pos...@scan.pos
      end
 
-     # cut point handling
-     # cut do
-     #   seq { lit("{") && cut! && lit("}") } ||
-     #   !cut? && seq { lit("[") && cut! && lit("]") } ||
-     #   !cut? && dot
-     # end
      class CutPoint
        attr_accessor :next
 
        def initialize
-         @cut = false
+         @cut = nil
          @next = nil
        end
 
@@ -137,13 +101,13 @@ module FuPeg
          @cut = true
        end
 
-       def cut?
-         @cut
+       def can_continue?
+         @cut ? nil : true
        end
      end
 
-     # for use with cut! and cut?
-     def cut
+     # for use with cut! and cont?
+     def with_cut_point
        prev_cut = @cut
        @cut = CutPoint.new
        prev_cut.next = @cut
@@ -153,51 +117,40 @@ module FuPeg
        @cut = prev_cut
      end
 
-     def cut!
-       @cut.cut!
-     end
-
-     def cut?
-       @cut.cut?
+     def current_cutpoint
+       @cut
      end
 
      # Position handling for failures
 
      Position = Struct.new(:lineno, :colno, :line, :charpos)
 
-     private
-
      def init_line_ends
        @line_ends = [-1]
-       pos = 0
-       while (pos = @str.index("\n", pos))
-         @line_ends << @pos
-         pos += 1
+       scan = StringScanner.new(@str)
+       while scan.skip_until(/\n|\r\n?/)
+         @line_ends << scan.pos - 1
        end
-       @line_ends << @str.size
+       @line_ends << @str.bytesize
      end
 
-     public
-
-     def position_for_charpos(charpos)
-       lineno = @line_ends.bsearch_index { |x| x >= charpos }
+     def position_for_bytepos(pos)
+       lineno = @line_ends.bsearch_index { |x| x >= pos }
        case lineno
        when nil
-         raise "Position #{charpos} is larger than string size #{@str.size}"
+         raise "Position #{pos} is larger than string byte size #{@str.bytesize}"
        else
          prev_end = @line_ends[lineno - 1]
          line_start = prev_end + 1
-         column = charpos - prev_end
+         column = @str.byteslice(line_start, pos - prev_end).size
        end
-       line = @str[line_start..@line_ends[lineno]]
-       Position.new(lineno, column, line, charpos)
+       line = @str.byteslice(line_start..@line_ends[lineno])
+       Position.new(lineno, column, line, charpos(pos))
      end
 
      # helper methods
 
-     private
-
-     def _bt
+     def backtrack
        pos = @scan.pos
        res = yield
        if res
@@ -212,10 +165,12 @@ module FuPeg
        raise
      end
 
-     def _rewind(pos, failed, val)
-       @scan.pos = pos if pos
-       @failed = failed if failed != false
-       val
+     def preserve(pos: false, failed: false, &block)
+       p, f = @scan.pos, @failed
+       r = yield
+       @scan.pos = p if pos
+       @failed = f if failed
+       r
      end
    end
  end
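Note: the switch from character positions to byte positions in `init_line_ends` and `position_for_bytepos` can be illustrated in isolation. The sketch below is an assumption-laden restatement, not the gem's code: the free functions `line_ends` and `line_and_column` are hypothetical names, and only `strscan` plus core `Array#bsearch_index` and `String#byteslice` are used. Line ends are stored as byte offsets, and the column is obtained by counting characters in the byte slice from the line start to the position.

```ruby
require "strscan"

# Byte offsets of every line terminator, with sentinels at both ends.
def line_ends(str)
  ends = [-1]
  scan = StringScanner.new(str)
  ends << scan.pos - 1 while scan.skip_until(/\n|\r\n?/)
  ends << str.bytesize
  ends
end

# Returns [line number, column], both 1-based, for a byte offset.
def line_and_column(str, bytepos)
  ends = line_ends(str)
  lineno = ends.bsearch_index { |e| e >= bytepos }
  raise ArgumentError, "position #{bytepos} is past the end of the string" if lineno.nil?
  line_start = ends[lineno - 1] + 1
  # Count characters, not bytes, so multibyte text gets the right column.
  column = str.byteslice(line_start, bytepos - line_start + 1).size
  [lineno, column]
end
```

Storing byte offsets keeps the lookup compatible with `StringScanner#pos`, which is also a byte offset, while the final `size` call converts back to a character column for display.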
data/lib/fupeg/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module FuPeg
-   VERSION = "0.1.0"
+   VERSION = "0.2.0"
  end
data/lib/fupeg.rb CHANGED
@@ -2,10 +2,9 @@
 
  require_relative "fupeg/version"
  require_relative "fupeg/parser"
+ require_relative "fupeg/grammar"
 
  module FuPeg
-   VERSION = "0.1.0"
-
    class Error < StandardError; end
    # Your code goes here...
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: fupeg
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.2.0
  platform: ruby
  authors:
  - Yura Sokolov
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2023-08-14 00:00:00.000000000 Z
+ date: 2023-08-15 00:00:00.000000000 Z
  dependencies: []
  description: "\n Simple backtracing parser, using ruby logical operators for primitive
    sequence/choice\n and slim wrappers for other PEG style operators and backtrace.\n
@@ -26,7 +26,9 @@ files:
  - LICENSE.txt
  - README.md
  - Rakefile
+ - examples/calc.rb
  - lib/fupeg.rb
+ - lib/fupeg/grammar.rb
  - lib/fupeg/parser.rb
  - lib/fupeg/version.rb
  - sig/fupeg.rbs