fupeg 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: e5cd2184b66fd7961603e8e21fbdf82514619001cff01cbe230f218c6f2737b1
4
- data.tar.gz: 74af0b350f4f966c7c7da490609f18c982c4aed3541a9e279f5eff318c4f698f
3
+ metadata.gz: 7fb3eb5f497d0c4294b507b5c8681b39635b4a7d1df96644f2afd73170e552b2
4
+ data.tar.gz: 682bfdc3c694abeddd1dc31591ef5bd39429ec4c9b2511e71269939eedd26549
5
5
  SHA512:
6
- metadata.gz: ac6a5f6588125e13597104c4c9969fbcf85825b73c76d677ea525eae81791111658733bc366a0b129af86bd85d017bec36b2b6a5c48d2677a574b63334b32bb7
7
- data.tar.gz: 8988b71fe64ea7202cc08c996a58ab0c461df6bdba19e1cf7f32315fa039d3412048235e752a7ae4895847f1215b2dd7716255c8b7c03782154926bfb43fd874
6
+ metadata.gz: 00d7f7fc3d440d968df79df59dab77c87918437dfee730682f31728fe0e1f43c650d88a1ef8d44d92c18f107d83a14fb47db0758516cf97e520e47690e118af7
7
+ data.tar.gz: 1653a4e8f7a3302b42682b29fac7b18e28e61dc99f1e63b392058282d73f3f2e54963e081431507b5e9fbb52bf5da7341de72d34e2fdaf7497db25215621f8b2
data/CHANGELOG.md CHANGED
@@ -1,4 +1,26 @@
1
- ## [0.1.0] - 2023-08-11
1
+ ## [0.2.0] - 2023-08-15
2
+
3
+ - Split Parser and Grammar
4
+ - Use `_` for both literals and sequence:
5
+ `_("x")` , `_(/x/)`, `_{ _("x") }`
6
+ - Use backtick "\`" for string literals
7
+ `x`
8
+ - `cont?` used with block to detect an uncut alternative
9
+ ```ruby
10
+ cut {
11
+ # condition
12
+ _ { `if` && cut! && ... } ||
13
+ # loop
14
+ cont? { `while` && cut! && ...} ||
15
+ # assignment
16
+ cont? { (i = ident) && sp? && `=` && cut! && ... } ||
17
+ # function call
18
+ cont? { (i = ident) && sp? && `(` && cut! && ... } ||
19
+ ...
20
+ }
21
+ ```
22
+
23
+ ## [0.1.0] - 2023-08-14
2
24
 
3
25
  - Initial release
4
26
  - Simplest rule definition in Ruby code without magic
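Note on the `cut` / `cut!` / `cont?` entry above: the following is a minimal, self-contained sketch of how a cut point can prevent later alternatives from being tried. It does not use the gem; `CutSketch`, `lit`, `seq` and `statement` are hypothetical names chosen for illustration, and the single boolean flag is a simplification of the `CutPoint` object seen in the parser diff.

```ruby
require "strscan"

# Hypothetical stand-in for a parser with cut support (not the fupeg API).
class CutSketch
  def initialize(str)
    @scan = StringScanner.new(str)
    @cut = false
  end

  # Match a literal string at the current position.
  def lit(str)
    @scan.scan(/#{Regexp.escape(str)}/)
  end

  # Run a block and rewind the position if it returns nil or false.
  def seq
    pos = @scan.pos
    res = yield
    @scan.pos = pos unless res
    res
  end

  # Open a fresh cut scope around a group of alternatives.
  def cut
    prev, @cut = @cut, false
    yield
  ensure
    @cut = prev
  end

  # Commit to the current alternative; returns true so `&&` chains continue.
  def cut!
    @cut = true
  end

  # Try the block only if no earlier alternative has committed.
  def cont?(&block)
    !@cut && seq(&block)
  end

  def statement
    cut {
      seq { lit("if") && cut! && lit(" x") && :if_stmt } ||
        cont? { lit("while") && cut! && lit(" x") && :while_stmt } ||
        cont? { @scan.scan(/\w+/) && :ident }
    }
  end
end
```

In this sketch, the input `"iffy"` matches `if`, commits with `cut!`, then fails on `" x"`; because the cut flag is set, the `cont?` alternatives are skipped and `statement` returns a falsey value instead of falling back to the identifier rule.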
data/README.md CHANGED
@@ -1,8 +1,16 @@
1
- # Fupeg
1
+ # Fupeg - simplest parser combinator
2
2
 
3
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/fupeg`. To experiment with that code, run `bin/console` for an interactive prompt.
3
+ PEG like parser combinator as simple as possible, but still useful.
4
+ - backtracking, manually specified by user.
5
+ - no memoization (yet).
6
+ - no left recursion (yet).
7
+ - built with StringScanner.
8
+ - pattern sequences and alternation are implemented with logical operators.
4
9
 
5
- TODO: Delete this and the text above, and describe your gem
10
+ Grammar code is pure-ruby and is executed as it is written.
11
+ No grammar tree is built and evaluated.
12
+
13
+ As a bonus, the "cut" operator is implemented.
6
14
 
7
15
  ## Installation
8
16
 
@@ -16,7 +24,87 @@ If bundler is not being used to manage dependencies, install the gem by executin
16
24
 
17
25
  ## Usage
18
26
 
19
- TODO: Write usage instructions here
27
+ First, define a grammar:
28
+
29
+ ```ruby
30
+ require "fupeg"
31
+
32
+ class Calc < FuPeg::Grammar
33
+ def eof
34
+ wont! { dot } && :eof
35
+ end
36
+
37
+ def lnsp?
38
+ # match regular expression
39
+ _(/[ \t]*/)
40
+ end
41
+
42
+ # Ruby 3.0 flavour
43
+ def sp? = _(/\s*/)
44
+
45
+ def number = (n = _(/\d+/)) && [:num, n]
46
+
47
+ def atom
48
+ # match raw string: _("(") is aliased to `(`
49
+ #
50
+ # match sequence of patterns with backtracking:
51
+ # `_{ x && y && z }` will rewind position, if block returns `nil` or `false`
52
+ #
53
+ # store value, returned by subpattern: just store it into variable
54
+ #
55
+ # use `||` for alternatives
56
+ number || _ { _("(") && sp? && (sub = sum) && sp? && `)` && [:sub, sub] }
57
+ end
58
+
59
+ def fact
60
+ # repetition returns array of block results
61
+ # it stops if block returns falsey (`nil` or `false`)
62
+ rep { |fst| # fst == true for first element
63
+ op = nil
64
+ # don't expect operator before first term
65
+ (fst || (op = `*` || _("/") || _(/%/)) && sp?) &&
66
+ (a = atom) && lnsp? &&
67
+ [op, a].compact
68
+ # flat AST tree, returns [:fact, at, op, at, op, at] if matched
69
+ }&.flatten(1)&.unshift(:fact)
70
+ end
71
+
72
+ def sum
73
+ _ {
74
+ op = rest = nil
75
+ (f = fact) &&
76
+ # an optional pattern always succeeds
77
+ opt { lnsp? && (op = `+` || `-`) && sp? && (rest = sum) } &&
78
+ # recursive AST tree
79
+ (rest ? [:sum, f, op, rest] : f)
80
+ }
81
+ end
82
+
83
+ def root
84
+ _ { sum || eof }
85
+ end
86
+ end
87
+ ```
88
+
89
+ Then either parse a string directly, or create a parser and grammar:
90
+
91
+ ```ruby
92
+ # Direct parsing
93
+ pp Calc.parse(:root, "1")
94
+ pp Calc.parse(:root, "1 + 2")
95
+
96
+ # separate parser and grammar initialization
97
+ parser = FuPeg::Parser.new("1 - 2*4/7 + 5")
98
+ grammar = Calc.new(parser)
99
+ pp grammar.root
100
+
101
+ # combined parser and grammar initialization
102
+ _parser, grammar = Calc.create("(1 -
103
+ 2)*
104
+ (4 -10) +
105
+ 11")
106
+ pp grammar.root
107
+ ```
20
108
 
21
109
  ## Development
22
110
 
@@ -26,7 +114,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To
26
114
 
27
115
  ## Contributing
28
116
 
29
- Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/fupeg.
117
+ Bug reports and pull requests are welcome on GitHub at https://github.com/funny-falcon/fupeg .
30
118
 
31
119
  ## License
32
120
 
data/examples/calc.rb ADDED
@@ -0,0 +1,69 @@
1
+ require "fupeg"
2
+
3
+ class Calc < FuPeg::Grammar
4
+ def eof
5
+ wont! { dot } && :eof
6
+ end
7
+
8
+ def lnsp?
9
+ # match regular expression
10
+ _(/[ \t]*/)
11
+ end
12
+
13
+ # Ruby 3.0 flavour
14
+ def sp? = _(/\s*/)
15
+
16
+ def number = (n = _(/\d+/)) && [:num, n]
17
+
18
+ def atom
19
+ # match raw string: _("(") is aliased to `(`
20
+ #
21
+ # match sequence of patterns with backtracking:
22
+ # `_{ x && y && z }` will rewind position, if block returns `nil` or `false`
23
+ #
24
+ # store value, returned by subpattern: just store it into variable
25
+ number || _ { _("(") && sp? && (sub = sum) && sp? && `)` && [:sub, sub] }
26
+ end
27
+
28
+ def fact
29
+ # repetition returns array of block results
30
+ # it stops if block returns falsey (`nil` or `false`)
31
+ rep { |fst| # fst == true for first element
32
+ op = nil
33
+ (fst || (op = `*` || `/` || `%`) && sp?) &&
34
+ (a = atom) && lnsp? &&
35
+ [op, a].compact
36
+ # flat AST tree, returns [:fact, at, op, at, op, at] if matched
37
+ }&.flatten(1)&.unshift(:fact)
38
+ end
39
+
40
+ def sum
41
+ _ {
42
+ op = rest = nil
43
+ (f = fact) &&
44
+ # an optional pattern always succeeds
45
+ opt { lnsp? && (op = `+` || `-`) && sp? && (rest = sum) } &&
46
+ # recursive AST tree
47
+ (rest ? [:sum, f, op, rest] : f)
48
+ }
49
+ end
50
+
51
+ def root
52
+ _ { sum || eof }
53
+ end
54
+ end
55
+
56
+ pp Calc.parse(:root, "1")
57
+ pp Calc.parse(:root, "1 + 2")
58
+
59
+ # separate parser and grammar initialization
60
+ parser = FuPeg::Parser.new("1 - 2*4/7 + 5")
61
+ grammar = Calc.new(parser)
62
+ pp grammar.root
63
+
64
+ # combined parser and grammar initialization
65
+ _parser, grammar = Calc.create("(1 -
66
+ 2)*
67
+ (4 -10) +
68
+ 11")
69
+ pp grammar.root
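Note on the example above: it relies on Ruby's `&&` for sequence, `||` for ordered choice, and a block form of `_` that rewinds on failure. The sketch below shows that idea in isolation, built directly on `StringScanner` rather than on the gem. `MiniGrammar`, `pair` and `item` are hypothetical names, and this `_` is a simplified stand-in rather than the gem's implementation.

```ruby
require "strscan"

# Hypothetical miniature of the "logical operators as combinators" idea.
class MiniGrammar
  def initialize(str)
    @scan = StringScanner.new(str)
  end

  # With a block: run it and rewind the position if it returns nil/false.
  # With a pattern: match a Regexp or a literal String at the current position.
  def _(pattern = nil)
    if block_given?
      pos = @scan.pos
      res = yield
      @scan.pos = pos unless res
      res
    else
      pattern = Regexp.new(Regexp.escape(pattern)) if pattern.is_a?(String)
      @scan.scan(pattern)
    end
  end

  # A rule is a plain method: match digits, then build a node.
  def number
    (n = _(/\d+/)) && [:num, n]
  end

  # Sequence with `&&`; captured values live in ordinary local variables.
  def pair
    _ { (a = number) && _(",") && (b = number) && [:pair, a, b] }
  end

  # Ordered choice with `||`: `pair` is tried first, then `number`.
  def item
    pair || number
  end
end
```

For the input `"7;"`, `pair` consumes `7`, fails on the comma, and rewinds to the start, so the `number` alternative can still match from position 0.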
data/lib/fupeg/grammar.rb ADDED
@@ -0,0 +1,88 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "parser"
4
+
5
+ module FuPeg
6
+ class Grammar
7
+ def self.create(str, pos = 0)
8
+ parser = Parser.new(str, pos)
9
+ grammar = new(parser)
10
+ [parser, grammar]
11
+ end
12
+
13
+ def self.parse(root, str)
14
+ _, gr = create(str)
15
+ gr.__send__(root)
16
+ end
17
+
18
+ def initialize(parser)
19
+ @p = parser
20
+ end
21
+
22
+ def fail!
23
+ @p.fail!(skip: 3)
24
+ end
25
+
26
+ def dot
27
+ @p.match(/./m)
28
+ end
29
+
30
+ def `(str)
31
+ @p.match(str)
32
+ end
33
+
34
+ def _(lit = nil, &block)
35
+ @p.match(lit, &block)
36
+ end
37
+
38
+ def opt(arg = nil, &block)
39
+ @p.match(arg, &block) || true
40
+ end
41
+
42
+ def will?(lit = nil, &block)
43
+ @p.preserve(pos: true) { @p.match(lit, &block) }
44
+ end
45
+
46
+ def wont!(lit = nil, &block)
47
+ @p.preserve(pos: true, failed: true) { !@p.match(lit, &block) } || @p.fail!
48
+ end
49
+
50
+ def text(lit = nil, &block)
51
+ @p.text(lit, &block)
52
+ end
53
+
54
+ def bounds(lit = nil, &block)
55
+ @p.bounds(lit, &block)
56
+ end
57
+
58
+ def cut(&block)
59
+ @p.with_cut_point(&block)
60
+ end
61
+
62
+ def cut!
63
+ @p.current_cutpoint.cut!
64
+ end
65
+
66
+ def cont?(&block)
67
+ @p.current_cutpoint.can_continue? && (block ? @p.backtrack(&block) : true)
68
+ end
69
+
70
+ def rep(range = 0.., lit = nil, &block)
71
+ range = range..range if Integer === range
72
+ range = 0..range.max if range.begin.nil?
73
+ unless Integer === range.min && (range.end.nil? || Integer === range.max)
74
+ raise "Range malformed #{range}"
75
+ end
76
+ @p.backtrack do
77
+ max = range.end && range.max
78
+ ar = []
79
+ (1..max).each do |i|
80
+ res = @p.backtrack { yield i == 1 }
81
+ break unless res
82
+ ar << res
83
+ end
84
+ (ar.size >= range.min) ? ar : @p.fail!
85
+ end
86
+ end
87
+ end
88
+ end
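Note on `rep` above: the method collects block results until the block fails or the upper bound is reached, passes a "first iteration" flag to the block, and fails as a whole if fewer than the minimum were collected. A simplified, self-contained sketch of that behaviour follows. `RepSketch` and `list` are hypothetical names, and the sketch assumes an inclusive range with an Integer lower bound (no beginless or exclusive ranges).

```ruby
require "strscan"

# Hypothetical, reduced model of bounded repetition with backtracking.
class RepSketch
  def initialize(str)
    @scan = StringScanner.new(str)
  end

  # Run a block and rewind the position if it returns nil or false.
  def backtrack
    pos = @scan.pos
    res = yield
    @scan.pos = pos unless res
    res
  end

  # Repeat the block while it succeeds, up to range.end times (if bounded).
  # The block receives true on the first iteration, false afterwards.
  def rep(range = (0..))
    range = range..range if range.is_a?(Integer)
    backtrack do
      results = []
      while range.end.nil? || results.size < range.end
        res = backtrack { yield results.empty? }
        break unless res
        results << res
      end
      # Too few repetitions: fail, which makes the outer backtrack rewind.
      (results.size >= range.begin) ? results : nil
    end
  end

  # One to three numbers separated by commas.
  def list
    rep(1..3) { |first| (first || @scan.scan(/,/)) && @scan.scan(/\d+/) }
  end
end
```

As in the diff, each iteration is wrapped in its own backtrack, so a separator that is matched without a following element is given back before the loop stops.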
data/lib/fupeg/parser.rb CHANGED
@@ -28,28 +28,31 @@ module FuPeg
28
28
  @scan.pos
29
29
  end
30
30
 
31
- def charpos
32
- @str_size - @str.byteslice(@scan.pos..).size
31
+ def charpos(pos = @scan.pos)
32
+ @str_size - @str.byteslice(pos..).size
33
33
  end
34
34
 
35
- Fail = Struct.new(:stack, :pos, :bytepos)
35
+ Fail = Struct.new(:stack, :bytepos)
36
36
 
37
- def fail!(skip = 2)
37
+ def fail!(*, skip: 2)
38
38
  if !@failed || bytepos > @failed.bytepos
39
39
  stack = caller_locations(skip)
40
40
  stack.delete_if do |loc|
41
- if loc.path == __FILE__
42
- loc.label =~ /\b(_bt|each|block)\b/
41
+ if loc.path.start_with?(__dir__)
42
+ loc.label =~ /\b(backtrack|each|block)\b/
43
43
  end
44
44
  end
45
- pos = position_for_charpos(charpos)
46
- @failed = Fail.new(stack, pos, bytepos)
45
+ @failed = Fail.new(stack, bytepos)
47
46
  end
48
47
  nil
49
48
  end
50
49
 
50
+ def failed_position
51
+ position_for_bytepos(@failed.bytepos)
52
+ end
53
+
51
54
  def report_failed(out)
52
- pos = @failed.pos
55
+ pos = position_for_bytepos(@failed.bytepos)
53
56
  out << "Failed at #{pos.lineno}:#{pos.colno} :\n"
54
57
  out << pos.line + "\n"
55
58
  out << (" " * (pos.colno - 1) + "^\n")
@@ -60,75 +63,36 @@ module FuPeg
60
63
  out
61
64
  end
62
65
 
63
- def dot
64
- @scan.scan(/./m) || fail!
65
- end
66
-
67
66
  begin
68
67
  StringScanner.new("x").skip("x")
69
- def lit(reg_or_str)
70
- @scan.scan(reg_or_str) || fail!
68
+ def match(lit = //, &block)
69
+ block ? backtrack(&block) : (@scan.scan(lit) || fail!)
71
70
  end
72
71
  rescue
73
- def lit(reg_or_str)
74
- if String === reg_or_str
75
- @__match_lit_cache ||= Hash.new { |h, s| h[s] = Regexp.new(Regexp.escape(s)) }
76
- reg_or_str = @__match_lit_cache[reg_or_str]
72
+ def match(lit = //, &block)
73
+ if String === lit
74
+ @_lit_cache ||= {}
75
+ lit = @_lit_cache[lit] ||= Regexp.new(Regexp.escape(lit))
77
76
  end
78
- @scan.scan(reg_or_str) || fail!
77
+ block ? backtrack(&block) : (@scan.scan(lit) || fail!)
79
78
  end
80
79
  end
81
80
 
82
- def seq(*args, &block)
83
- _bt(&block)
84
- end
85
-
86
- def opt(&block)
87
- _rewind(nil, @failed, _bt(&block) || true)
88
- end
89
-
90
- def rep(range = 0.., &block)
91
- range = range..range if Integer === range
92
- range = 0..range.max if range.begin.nil?
93
- unless Integer === range.min && (range.end.nil? || Integer === range.max)
94
- raise "Range malformed #{range}"
95
- end
96
- _bt do
97
- max = range.end && range.max
98
- ar = []
99
- (1..max).each do
100
- res = _bt(&block)
101
- break unless res
102
- ar << res
103
- end
104
- (ar.size >= range.min) ? ar : fail!
105
- end
106
- end
107
-
108
- def text(&block)
81
+ def text(lit = nil, &block)
109
82
  pos = @scan.pos
110
- _bt(&block) && @str.byteslice(pos, @scan.pos - pos)
83
+ match(lit, &block) && @str.byteslice(pos, @scan.pos - pos)
111
84
  end
112
85
 
113
- def will?(&block)
114
- _rewind(@scan.pos, false, _bt(&block))
115
- end
116
-
117
- def wont!(&block)
118
- _rewind(@scan.pos, @failed, !_bt(&block)) || fail!
86
+ def bounds(lit = nil, &block)
87
+ pos = @scan.pos
88
+ match(lit, &block) && pos...@scan.pos
119
89
  end
120
90
 
121
- # cut point handling
122
- # cut do
123
- # seq { lit("{") && cut! && lit("}") } ||
124
- # !cut? && seq { lit("[") && cut! && lit("]") } ||
125
- # !cut? && dot
126
- # end
127
91
  class CutPoint
128
92
  attr_accessor :next
129
93
 
130
94
  def initialize
131
- @cut = false
95
+ @cut = nil
132
96
  @next = nil
133
97
  end
134
98
 
@@ -137,13 +101,13 @@ module FuPeg
137
101
  @cut = true
138
102
  end
139
103
 
140
- def cut?
141
- @cut
104
+ def can_continue?
105
+ @cut ? nil : true
142
106
  end
143
107
  end
144
108
 
145
- # for use with cut! and cut?
146
- def cut
109
+ # for use with cut! and cont?
110
+ def with_cut_point
147
111
  prev_cut = @cut
148
112
  @cut = CutPoint.new
149
113
  prev_cut.next = @cut
@@ -153,51 +117,40 @@ module FuPeg
153
117
  @cut = prev_cut
154
118
  end
155
119
 
156
- def cut!
157
- @cut.cut!
158
- end
159
-
160
- def cut?
161
- @cut.cut?
120
+ def current_cutpoint
121
+ @cut
162
122
  end
163
123
 
164
124
  # Position handling for failures
165
125
 
166
126
  Position = Struct.new(:lineno, :colno, :line, :charpos)
167
127
 
168
- private
169
-
170
128
  def init_line_ends
171
129
  @line_ends = [-1]
172
- pos = 0
173
- while (pos = @str.index("\n", pos))
174
- @line_ends << @pos
175
- pos += 1
130
+ scan = StringScanner.new(@str)
131
+ while scan.skip_until(/\n|\r\n?/)
132
+ @line_ends << scan.pos - 1
176
133
  end
177
- @line_ends << @str.size
134
+ @line_ends << @str.bytesize
178
135
  end
179
136
 
180
- public
181
-
182
- def position_for_charpos(charpos)
183
- lineno = @line_ends.bsearch_index { |x| x >= charpos }
137
+ def position_for_bytepos(pos)
138
+ lineno = @line_ends.bsearch_index { |x| x >= pos }
184
139
  case lineno
185
140
  when nil
186
- raise "Position #{charpos} is larger than string size #{@str.size}"
141
+ raise "Position #{pos} is larger than string byte size #{@str.bytesize}"
187
142
  else
188
143
  prev_end = @line_ends[lineno - 1]
189
144
  line_start = prev_end + 1
190
- column = charpos - prev_end
145
+ column = @str.byteslice(line_start, pos - prev_end).size
191
146
  end
192
- line = @str[line_start..@line_ends[lineno]]
193
- Position.new(lineno, column, line, charpos)
147
+ line = @str.byteslice(line_start..@line_ends[lineno])
148
+ Position.new(lineno, column, line, charpos(pos))
194
149
  end
195
150
 
196
151
  # helper methods
197
152
 
198
- private
199
-
200
- def _bt
153
+ def backtrack
201
154
  pos = @scan.pos
202
155
  res = yield
203
156
  if res
@@ -212,10 +165,12 @@ module FuPeg
212
165
  raise
213
166
  end
214
167
 
215
- def _rewind(pos, failed, val)
216
- @scan.pos = pos if pos
217
- @failed = failed if failed != false
218
- val
168
+ def preserve(pos: false, failed: false, &block)
169
+ p, f = @scan.pos, @failed
170
+ r = yield
171
+ @scan.pos = p if pos
172
+ @failed = f if failed
173
+ r
219
174
  end
220
175
  end
221
176
  end
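Note on `position_for_bytepos` above: the parser now records failures as byte offsets and converts them to line and column only when reporting. The sketch below illustrates one way to do that conversion with a sorted list of line-end offsets and a binary search. `LinePos`, `line_ends` and `locate` are hypothetical helpers, only `"\n"` is treated as a line break here, and the one-based column convention is an assumption of this sketch.

```ruby
require "strscan"

# Hypothetical result type: 1-based line and column plus the line's text.
LinePos = Struct.new(:lineno, :colno, :line)

# Byte offsets of every "\n", with -1 and the total byte size as sentinels.
def line_ends(str)
  ends = [-1]
  scan = StringScanner.new(str)
  ends << scan.pos - 1 while scan.skip_until(/\n/)
  ends << str.bytesize
  ends
end

def locate(str, bytepos)
  ends = line_ends(str)
  # First line end at or after the position identifies the line.
  lineno = ends.bsearch_index { |e| e >= bytepos }
  raise ArgumentError, "byte position #{bytepos} is past the end" if lineno.nil?
  line_start = ends[lineno - 1] + 1
  # Count characters, not bytes, so multibyte text gets the right column.
  colno = str.byteslice(line_start, bytepos - line_start).size + 1
  line = str.byteslice(line_start...ends[lineno])
  LinePos.new(lineno, colno, line)
end
```

Storing only the byte offset at failure time keeps the hot path cheap; the line index is needed only when an error is actually reported.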
data/lib/fupeg/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module FuPeg
4
- VERSION = "0.1.0"
4
+ VERSION = "0.2.0"
5
5
  end
data/lib/fupeg.rb CHANGED
@@ -2,10 +2,9 @@
2
2
 
3
3
  require_relative "fupeg/version"
4
4
  require_relative "fupeg/parser"
5
+ require_relative "fupeg/grammar"
5
6
 
6
7
  module FuPeg
7
- VERSION = "0.1.0"
8
-
9
8
  class Error < StandardError; end
10
9
  # Your code goes here...
11
10
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fupeg
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Yura Sokolov
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2023-08-14 00:00:00.000000000 Z
11
+ date: 2023-08-15 00:00:00.000000000 Z
12
12
  dependencies: []
13
13
  description: "\n Simple backtracing parser, using ruby logical operators for primitive
14
14
  sequence/choice\n and slim wrappers for other PEG style operators and backtrace.\n
@@ -26,7 +26,9 @@ files:
26
26
  - LICENSE.txt
27
27
  - README.md
28
28
  - Rakefile
29
+ - examples/calc.rb
29
30
  - lib/fupeg.rb
31
+ - lib/fupeg/grammar.rb
30
32
  - lib/fupeg/parser.rb
31
33
  - lib/fupeg/version.rb
32
34
  - sig/fupeg.rbs