paco 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +50 -3
- data/README.md +73 -9
- data/lib/paco/callstack.rb +27 -0
- data/lib/paco/combinators/char.rb +14 -13
- data/lib/paco/combinators.rb +36 -45
- data/lib/paco/context.rb +28 -17
- data/lib/paco/index.rb +15 -0
- data/lib/paco/memoizer.rb +20 -0
- data/lib/paco/parse_error.rb +11 -10
- data/lib/paco/parser.rb +28 -16
- data/lib/paco/rspec/parse_matcher.rb +47 -0
- data/lib/paco/rspec.rb +4 -0
- data/lib/paco/version.rb +1 -1
- data/lib/paco.rb +0 -1
- metadata +7 -4
- data/bin/console +0 -15
- data/bin/setup +0 -8
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ae17ec48820da8d99722dd87e2f834cc238497c948dff292490665cdeb16fbc1
+  data.tar.gz: 30046ad4c3203dcf35430c9512df303f008fe513c11f92e06137a7131ee586fe
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bf9fb9a162ab9c8292047942ebf8cc57e6a0382fa680beed1802129f6abd07d2483a24ba734f260c0fef9ba948e54ac9cf9635bc49abbbc457d2d91c6c5a9fb7
+  data.tar.gz: 0d4ef03c2022c26ae813d1d3f925b539c6cb7e2e715f27f965a80f9b8809e7a4c6d6d78bb6c5086603f3585aaf44e6035b5f72892078d44d63f4cfd5ca83509b
data/CHANGELOG.md
CHANGED
@@ -5,9 +5,55 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog],
 and this project adheres to [Semantic Versioning].

-## [
+## [0.2.0] - 2021-12-28

-
+### Added
+
+- Callstack collection for debugging. ([@skryukov])
+
+Pass `with_callstack: true` to the `Paco::Parser#parse` method to collect a callstack while parsing. To examine the callstack catch the `ParseError` exception:
+
+```ruby
+begin
+  string("Paco").parse("Paco!", with_callstack: true)
+rescue Paco::ParseError => e
+  pp e.callstack.stack # You will probably want to use `binding.irb` or `binding.pry`
+end
+```
+
+- `Paco::Combinators.index` method. ([@skryukov])
+
+Call `Paco::Combinators.index` to get `Paco::Index` representing the current offset into the parse without consuming the input.
+`Paco::Index` has a 0-based character offset attribute `:pos` and 1-based `:line` and `:column` attributes.
+
+```ruby
+index.parse("Paco") #=> #<struct Paco::Index pos=0, line=1, column=1>
+```
+
+- RSpec matcher `#parse`. ([@skryukov])
+
+Add `require "paco/rspec"` to `spec_helper.rb` to enable a special RSpec matcher `#parse`:
+
+```ruby
+subject { string("Paco") }
+
+it { is_expected.to parse("Paco") } # just checks if parser succeeds
+it { is_expected.to parse("Paco").as("Paco") } # checks if parser result is eq to value passed to `#as`
+it { is_expected.to parse("Paco").fully } # checks if parser result is the same as value passed to `#parse`
+```
+
+### Changed
+
+- `Paco::Combinators.seq_map` merged into `Paco::Combinators.seq`. ([@skryukov])
+- `Paco::Combinators.sep_by_1` renamed to `Paco::Combinators.sep_by!`. ([@skryukov])
+
+### Fixed
+
+- `Paco::Combinators::Char#regexp` now uses `\A` instead of `^`. ([@skryukov])
+- `include Paco` now works inside `irb`. ([@skryukov])
+- `Paco::Combinators#not_followed_by` and `Paco::Combinators#seq` now don't consume input on error. ([@skryukov])
+
+## [0.1.0] - 2021-12-12

 ### Added

@@ -15,7 +61,8 @@ and this project adheres to [Semantic Versioning].

 [@skryukov]: https://github.com/skryukov

-[Unreleased]: https://github.com/skryukov/paco/compare/v0.
+[Unreleased]: https://github.com/skryukov/paco/compare/v0.2.0...HEAD
+[0.2.0]: https://github.com/skryukov/paco/compare/v0.1.0...v0.2.0
 [0.1.0]: https://github.com/skryukov/paco/commits/v0.1.0

 [Keep a Changelog]: https://keepachangelog.com/en/1.0.0/
data/README.md
CHANGED
@@ -1,26 +1,83 @@
 # Paco

+[![Gem Version](https://badge.fury.io/rb/paco.svg)](https://rubygems.org/gems/paco)
+[![Build](https://github.com/skryukov/paco/workflows/Build/badge.svg)](https://github.com/skryukov/paco/actions)
+
 Paco is a parser combinator library inspired by Haskell's [Parsec] and [Parsimmon].

-
+"But I don't need to write another JSON parser or a new language, why do I need your library then?"

-
+Well, most probably you don't. But I can think of rare cases when you do. Say, you need to write a validation for [git branch names].
+
+You can go with easy-peasy regex:

 ```ruby
-
+branch_name_regex = /^(?!\/|.*(?:[\/.]\.|\/\/|@{|\\|\.lock$|[\/.]$))[^\040\177 ~^:?*\[]+$/
+
+branch_name_regex.match?("feature/branch-validation")
+```
+
+With Paco, you can go with a little more verbose version of that rule:
+
+```ruby
+module BranchNameParser
+  extend Paco
+
+  class << self
+    def parse(input)
+      parser.parse(input)
+    end
+
+    def parser
+      lookahead(none_of("/")).next(valid_chars.join)
+    end
+
+    def valid_chars
+      any_char.not_followed_by(invalid_sequences).at_least(1)
+    end
+
+    def invalid_sequences
+      alt(invalid_chars, invalid_endings)
+    end
+
+    def invalid_chars
+      alt(
+        string("/."),
+        string(".."),
+        string("//"),
+        string("@{"),
+        string("\\\\"),
+        one_of("\040\177 ~^:?*\\[")
+      )
+    end
+
+    def invalid_endings
+      seq(
+        alt(string(".lock"), one_of("/.")),
+        eof
+      )
+    end
+  end
+end
+
+BranchNameParser.parse("feature/branch-validation")
 ```

-
+Easy? Not really, but there is a chance you can read it. 😅
+
+See [API documentation](docs/paco.md), [examples](examples) and [specs](spec) for more info on usage.

-
+<a href="https://evilmartians.com/"><img src="https://evilmartians.com/badges/sponsored-by-evil-martians.svg" alt="Sponsored by Evil Martians" width="236" height="54"></a>

-
+## Installation

-
+Add to your `Gemfile`:

-
+```ruby
+gem "paco"
+```

-
+And then run `bundle install`.

 ## Development

@@ -32,6 +89,11 @@ To install this gem onto your local machine, run `bundle exec rake install`.

 Bug reports and pull requests are welcome on GitHub at https://github.com/skryukov/paco.

+## Alternatives
+
+- [parslet] - A small (but featureful) PEG based parser library.
+- [parsby] — Parser combinator library for Ruby inspired by Haskell's Parsec.
+
 ## License

 The gem is available as open source under the terms of the [MIT License].
@@ -39,4 +101,6 @@ The gem is available as open source under the terms of the [MIT License].
 [MIT License]: https://opensource.org/licenses/MIT
 [Parsec]: https://github.com/haskell/parsec
 [Parsimmon]: https://github.com/jneen/parsimmon
+[parslet]: https://github.com/kschiess/parslet
 [parsby]: https://github.com/jolmg/parsby
+[git branch names]: https://git-scm.com/docs/git-check-ref-format#_description
data/lib/paco/callstack.rb
ADDED
@@ -0,0 +1,27 @@
+# frozen_string_literal: true
+
+module Paco
+  class Callstack
+    attr_reader :stack
+
+    def initialize
+      @stack = []
+      @depth = 0
+    end
+
+    def failure(**params)
+      @depth -= 1
+      @stack << params.merge(status: :failure, depth: @depth)
+    end
+
+    def start(**params)
+      @depth += 1
+      @stack << params.merge(status: :start, depth: @depth)
+    end
+
+    def success(**params)
+      @depth -= 1
+      @stack << params.merge(status: :success, depth: @depth)
+    end
+  end
+end
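To get a feel for what the new `Paco::Callstack` records, here is a minimal sketch that copies the class from this hunk into a hypothetical `CallstackSketch` and replays a few events by hand. The payload keys (`pos:`, `parser:`, `result:`) follow what `Context` passes in this release; the event order is a simplified assumption, not a trace captured from the gem.

```ruby
# Standalone copy of the Callstack logic from the diff, renamed so it does not
# clash with the real gem. CallstackSketch is a hypothetical name.
class CallstackSketch
  attr_reader :stack

  def initialize
    @stack = []
    @depth = 0
  end

  def start(**params)
    @depth += 1
    @stack << params.merge(status: :start, depth: @depth)
  end

  def success(**params)
    @depth -= 1
    @stack << params.merge(status: :success, depth: @depth)
  end

  def failure(**params)
    @depth -= 1
    @stack << params.merge(status: :failure, depth: @depth)
  end
end

callstack = CallstackSketch.new
# Assumed, simplified event order for matching "Paco" and then expecting eof on "Paco!".
callstack.start(pos: 0, parser: 'string("Paco")')
callstack.success(pos: 4, result: "Paco", parser: 'string("Paco")')
callstack.start(pos: 4, parser: "eof")
callstack.failure(pos: 4, parser: "eof")

statuses = callstack.stack.map { |event| event[:status] }
depths = callstack.stack.map { |event| event[:depth] }
```

Note that a `:start` entry stores the depth after the increment while `:success` and `:failure` store it after the decrement, so the two entries for one parser differ by one.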
data/lib/paco/combinators/char.rb
CHANGED
@@ -32,7 +32,7 @@ module Paco
       # @param [String] matcher
       # @return [Paco::Parser]
       def string(matcher)
-        Parser.new(matcher) do |ctx, parser|
+        Parser.new("string(#{matcher.inspect})") do |ctx, parser|
           src = ctx.read(matcher.length)
           parser.failure(ctx) if src != matcher

@@ -46,9 +46,10 @@ module Paco
       # When `group` is specified, it returns only the text in the specific regexp match group.
       # @param [Regexp] regexp
       # @return [Paco::Parser]
+      # @param [Integer] group
       def regexp(regexp, group: 0)
-        anchored_regexp = Regexp.new("
-        Parser.new(regexp.inspect) do |ctx, parser|
+        anchored_regexp = Regexp.new("\\A(?:#{regexp.source})", regexp.options)
+        Parser.new("regexp(#{regexp.inspect})") do |ctx, parser|
           match = anchored_regexp.match(ctx.read_all)
           parser.failure(ctx) if match.nil?

@@ -61,7 +62,7 @@ module Paco
       # @param [Regexp] regexp
       # @return [Paco::Parser]
       def regexp_char(regexp)
-        satisfy(regexp.inspect) { |char| regexp.match?(char) }
+        satisfy("regexp_char(#{regexp.inspect})") { |char| regexp.match?(char) }
       end

       # Returns a parser that looks for exactly one character from passed
@@ -69,7 +70,7 @@ module Paco
       # @param [String, Array<String>] matcher
       # @return [Paco::Parser]
       def one_of(matcher)
-        satisfy(matcher
+        satisfy("one_of(#{matcher})") { |char| matcher.include?(char) }
       end

       # Returns a parser that looks for exactly one character _NOT_ from passed
@@ -77,7 +78,7 @@ module Paco
       # @param [String, Array<String>] matcher
       # @return [Paco::Parser]
       def none_of(matcher)
-        satisfy("
+        satisfy("none_of(#{matcher})") { |char| !matcher.include?(char) }
       end

       # Returns a parser that consumes and returns the next character of the input.
@@ -90,7 +91,7 @@ module Paco
       # @return [Paco::Parser]
       def remainder
         memoize do
-          Parser.new("remainder
+          Parser.new("remainder") do |ctx, parser|
             result = ctx.read_all
             ctx.pos += result.length
             result
@@ -147,7 +148,7 @@ module Paco
         memoize { alt(newline, eof) }
       end

-      # Alias for `Paco::Combinators.
+      # Alias for `Paco::Combinators.regexp_char(/[a-z]/i)`.
       # @return [Paco::Parser]
       def letter
         memoize { regexp_char(/[a-z]/i) }
@@ -156,16 +157,16 @@ module Paco
       # Alias for `Paco::Combinators.regexp(/[a-z]+/i)`.
       # @return [Paco::Parser]
       def letters
-        memoize {
+        memoize { regexp(/[a-z]+/i) }
       end

       # Alias for `Paco::Combinators.regexp(/[a-z]*/i)`.
       # @return [Paco::Parser]
       def opt_letters
-        memoize {
+        memoize { regexp(/[a-z]*/i) }
       end

-      # Alias for `Paco::Combinators.
+      # Alias for `Paco::Combinators.regexp_char(/[0-9]/)`.
       # @return [Paco::Parser]
       def digit
         memoize { regexp_char(/[0-9]/) }
@@ -174,13 +175,13 @@ module Paco
       # Alias for `Paco::Combinators.regexp(/[0-9]+/)`.
       # @return [Paco::Parser]
       def digits
-        memoize {
+        memoize { regexp(/[0-9]+/) }
       end

       # Alias for `Paco::Combinators.regexp(/[0-9]*/)`.
       # @return [Paco::Parser]
       def opt_digits
-        memoize {
+        memoize { regexp(/[0-9]*/) }
       end

       # Alias for `Paco::Combinators.regexp(/\s+/)`.
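The `regexp` change in this file swaps the anchor to `\A` and wraps the user pattern in a non-capturing group. The plain-Ruby sketch below, which does not use Paco at all, shows why both details matter. The `^` variant is reconstructed from the changelog entry only; the exact old expression is truncated in the diff, so treat it as an assumption.

```ruby
# In Ruby, `^` matches at the start of every line, while `\A` matches only at
# the very start of the string.
pattern = /bar|baz/
input = "foo\nbar"

# Assumed shape of the old anchoring (per the changelog: `^` instead of `\A`).
caret_anchored = Regexp.new("^(?:#{pattern.source})", pattern.options)
# Anchoring as written in the new version of `regexp`.
string_anchored = Regexp.new("\\A(?:#{pattern.source})", pattern.options)

caret_match = caret_anchored.match(input)   # finds "bar" on the second line
string_match = string_anchored.match(input) # nil: the input does not start with bar or baz

# Without the (?: ... ) group, the anchor binds only to the first alternative.
ungrouped = Regexp.new("\\A#{pattern.source}", pattern.options)
ungrouped_hit = ungrouped.match?("xbaz")     # "baz" matches anywhere in the string
grouped_hit = string_anchored.match?("xbaz") # both alternatives stay anchored
```

For a parser that must match at the current position, a `^` anchor could accept text that starts on a later line of the remaining input, which is presumably the bug the changelog refers to.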
data/lib/paco/combinators.rb
CHANGED
@@ -1,22 +1,14 @@
 # frozen_string_literal: true

-require "monitor"
-
 require "paco/combinators/char"
+require "paco/memoizer"

 module Paco
   module Combinators
-
-
-      base.extend Char
-    end
-
-    def self.included(base)
-      base.include MonitorMixin
-      base.include Char
-    end
+    include Char
+    extend Char

-
+    module_function

     # Returns a parser that runs the passed `parser` without consuming the input, and
     # returns `null` if the passed `parser` _does not match_ the input. Fails otherwise.
@@ -28,10 +20,11 @@ module Paco
         begin
           parser._parse(ctx)
         rescue ParseError
-          ctx.pos = start_pos
           nil
         else
           pars.failure(ctx)
+        ensure
+          ctx.pos = start_pos
         end
       end
     end
@@ -39,7 +32,7 @@ module Paco
     # Returns a parser that doesn't consume any input and always returns `result`.
     # @return [Paco::Parser]
     def succeed(result)
-      Parser.new { result }
+      Parser.new("succeed(#{result})") { result }
     end

     # Returns a parser that doesn't consume any input and always fails with passed `message`.
@@ -54,7 +47,7 @@ module Paco
     # @param [Paco::Parser] parser
     # @return [Paco::Parser]
     def lookahead(parser)
-      Parser.new do |ctx|
+      Parser.new("lookahead(#{parser.desc})") do |ctx|
         start_pos = ctx.pos
         parser._parse(ctx)
         ctx.pos = start_pos
@@ -68,7 +61,7 @@ module Paco
     def alt(*parsers)
       raise ArgumentError, "no parsers specified" if parsers.empty?

-      Parser.new do |ctx|
+      Parser.new("alt(#{parsers.map(&:desc).join(", ")})") do |ctx|
         result = nil
         last_error = nil
         start_pos = ctx.pos
@@ -86,26 +79,25 @@ module Paco

     # Accepts one or more parsers, and returns a parser that expects them
     # to match in order, returns an array of all their results.
+    # If `block` specified, passes results of the `parses` as an arguments
+    # to a `block`, and at the end returns its result.
     # @param [Array<Paco::Parser>] parsers
     # @return [Paco::Parser]
     def seq(*parsers)
       raise ArgumentError, "no parsers specified" if parsers.empty?

-      Parser.new do |ctx|
-
+      result = Parser.new("seq(#{parsers.map(&:desc).join(", ")})") do |ctx|
+        start_pos = ctx.pos
+        begin
+          parsers.map { |parser| parser._parse(ctx) }
+        rescue ParseError => e
+          ctx.pos = start_pos
+          raise e
+        end
       end
-
+      return result unless block_given?

-
-    # their results as an arguments to a `block`, and at the end returns its result.
-    # @param [Array<Paco::Parser>] parsers
-    # @return [Paco::Parser]
-    def seq_map(*parsers, &block)
-      raise ArgumentError, "no parsers specified" if parsers.empty?
-
-      seq(*parsers).fmap do |results|
-        block.call(*results)
-      end
+      result.fmap { |results| yield(*results) }
     end

     # Accepts a block that returns a parser, which is evaluated the first time the parser is used.
@@ -121,7 +113,8 @@ module Paco
     # @param [Paco::Parser] separator
     # @return [Paco::Parser]
     def sep_by(parser, separator)
-      alt(
+      alt(sep_by!(parser, separator), succeed([]))
+        .with_desc("sep_by(#{parser.desc}, #{separator.desc})")
     end

     # Returns a parser that expects one or more matches for `parser`,
@@ -129,11 +122,11 @@ module Paco
     # @param [Paco::Parser] parser
     # @param [Paco::Parser] separator
     # @return [Paco::Parser]
-    def
-
-
-      end
+    def sep_by!(parser, separator)
+      seq(parser, many(separator.next(parser))) { |first, arr| [first] + arr }
+        .with_desc("sep_by!(#{parser.desc}, #{separator.desc})")
     end
+    alias_method :sep_by_1, :sep_by!

     # Expects the parser `before` before `parser` and `after` after `parser. Returns the result of the parser.
     # @param [Paco::Parser] before
@@ -148,13 +141,10 @@ module Paco
     # @param [Paco::Parser] parser
     # @return [Paco::Parser]
     def many(parser)
-      Parser.new do |ctx|
+      Parser.new("many(#{parser.desc})") do |ctx|
         results = []
-        # last_pos = ctx.pos
         loop do
           results << parser._parse(ctx)
-          # raise ArgumentError, "smth wrong" if last_pos == ctx.pos
-          # last_pos = ctx.pos
         rescue ParseError
           break
         end
@@ -169,15 +159,16 @@ module Paco
       alt(parser, succeed(nil))
     end

+    # Returns parser that returns `Paco::Index` representing
+    # the current offset into the parse without consuming the input.
+    # @return [Paco::Parser]
+    def index
+      Parser.new { |ctx| ctx.index }
+    end
+
     # Helper used for memoization
     def memoize(&block)
-
-      synchronize do
-        @_paco_memoized ||= {}
-        return @_paco_memoized[key] if @_paco_memoized.key?(key)
-
-        @_paco_memoized[key] = block.call
-      end
+      Memoizer.memoize(block.source_location, &block)
     end
   end
 end
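The `not_followed_by` hunk in this file moves the position reset from the `rescue` branch into `ensure`, so the cursor is rewound on both outcomes. Below is a minimal sketch of that control flow in plain Ruby. `Cursor`, `SketchError` and `negative_lookahead` are hypothetical stand-ins for `Paco::Context`, `Paco::ParseError` and the combinator, not the gem's API.

```ruby
# Hypothetical stand-ins for Paco::Context and Paco::ParseError.
Cursor = Struct.new(:pos)
class SketchError < StandardError; end

# Negative lookahead: returns nil when the block fails, raises when the block
# succeeds, and rewinds the cursor in both cases thanks to `ensure`.
def negative_lookahead(cursor)
  start_pos = cursor.pos
  begin
    yield cursor
  rescue SketchError
    nil
  else
    # Exceptions raised in `else` are not caught by the `rescue` above.
    raise SketchError, "unexpected match"
  ensure
    cursor.pos = start_pos
  end
end

cursor = Cursor.new(0)

# Inner "parser" consumes three characters and then fails.
result = negative_lookahead(cursor) do |c|
  c.pos += 3
  raise SketchError, "no match"
end
pos_after_inner_failure = cursor.pos

# Inner "parser" succeeds, so the lookahead itself must fail.
raised = begin
  negative_lookahead(cursor) { |c| c.pos += 2 }
  false
rescue SketchError
  true
end
pos_after_inner_success = cursor.pos
```

With the reset only in `rescue`, the second case would leave the cursor advanced by the inner parser before the failure propagates; `ensure` covers both paths.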
data/lib/paco/context.rb
CHANGED
@@ -1,17 +1,17 @@
 # frozen_string_literal: true
+
+require "paco/callstack"
+require "paco/index"
+
 module Paco
   class Context
-    attr_reader :input, :last_pos, :
+    attr_reader :input, :last_pos, :callstack
+    attr_accessor :pos

-    def pos
-      # TODO: is that needed?
-      @last_pos = @pos
-      @pos = np
-    end
-
-    def initialize(input, pos = 0)
+    def initialize(input, pos: 0, with_callstack: false)
       @input = input
       @pos = pos
+      @callstack = Callstack.new if with_callstack
     end

     def read(n)
@@ -19,22 +19,33 @@ module Paco
     end

     def read_all
-      input[pos
+      input[pos..]
     end

     def eof?
       pos >= input.length
     end

+    # @param [Integer] from
+    # @return [Paco::Index]
     def index(from = nil)
-      from
-
-
-
-
-
-
-
+      Index.calculate(input: input, pos: from || pos)
+    end
+
+    # @param [Paco::Parser] parser
+    def failure_parse(parser)
+      @callstack&.failure(pos: pos, parser: parser.desc)
+    end
+
+    # @param [Paco::Parser] parser
+    def start_parse(parser)
+      @callstack&.start(pos: pos, parser: parser.desc)
+    end
+
+    # @param [Object] result
+    # @param [Paco::Parser] parser
+    def success_parse(result, parser)
+      @callstack&.success(pos: pos, result: result, parser: parser.desc)
     end
   end
 end
data/lib/paco/index.rb
ADDED
@@ -0,0 +1,15 @@
+module Paco
+  Index = Struct.new(:pos, :line, :column) do
+    # @param [String] input
+    # @param [Integer] pos
+    def self.calculate(input:, pos:)
+      raise ArgumentError, "`pos` must be a non-negative integer" if pos < 0
+      raise ArgumentError, "`pos` is grater then input length" if pos > input.length
+
+      lines = input[0..pos].lines
+      line = lines.empty? ? 1 : lines.length
+      column = lines.last&.length || 1
+      new(pos, line, column)
+    end
+  end
+end
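The line and column arithmetic in `Index.calculate` is easy to check by hand. The sketch below copies the logic into a hypothetical `IndexSketch` struct so it runs without the gem; the sample inputs are chosen for illustration, and the wording of the second error message is adjusted in the copy.

```ruby
# Standalone copy of the Index.calculate logic; IndexSketch is a hypothetical
# name used so the example runs without Paco installed.
IndexSketch = Struct.new(:pos, :line, :column) do
  def self.calculate(input:, pos:)
    raise ArgumentError, "`pos` must be a non-negative integer" if pos < 0
    raise ArgumentError, "`pos` is greater than input length" if pos > input.length

    # Everything up to and including the character at `pos`, split into lines.
    lines = input[0..pos].lines
    line = lines.empty? ? 1 : lines.length
    # The length of the last (possibly partial) line is the 1-based column.
    column = lines.last&.length || 1
    new(pos, line, column)
  end
end

start = IndexSketch.calculate(input: "Paco", pos: 0)
second_line = IndexSketch.calculate(input: "ab\ncd", pos: 3) # points at "c"
empty_input = IndexSketch.calculate(input: "", pos: 0)
```

Here `start` is line 1, column 1, matching the changelog example, and `second_line` is line 2, column 1 because the slice `"ab\nc"` splits into two lines, the last of which has length 1.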
data/lib/paco/memoizer.rb
ADDED
@@ -0,0 +1,20 @@
+# frozen_string_literal: true
+
+require "monitor"
+
+module Paco
+  module Memoizer
+    extend MonitorMixin
+
+    class << self
+      def memoize(key, &block)
+        synchronize do
+          @paco_memoized ||= {}
+          return @paco_memoized[key] if @paco_memoized.key?(key)
+
+          @paco_memoized[key] = block.call
+        end
+      end
+    end
+  end
+end
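`Combinators#memoize` now delegates to `Memoizer.memoize` with `block.source_location` as the cache key. The sketch below reproduces that idea with the standard library only; `MemoizerSketch` and `memoize_here` are hypothetical names, and the cached value is a plain `Object` rather than a parser.

```ruby
require "monitor"

# Standalone copy of the Memoizer logic from the diff, under a hypothetical name.
module MemoizerSketch
  extend MonitorMixin

  class << self
    def memoize(key, &block)
      synchronize do
        @memoized ||= {}
        # `return` leaves the method; the monitor is still released on the way out.
        return @memoized[key] if @memoized.key?(key)

        @memoized[key] = block.call
      end
    end
  end
end

# Hypothetical helper mirroring Combinators#memoize: the key is the place in
# the source where the block literal is written.
def memoize_here(&block)
  MemoizerSketch.memoize(block.source_location, &block)
end

calls = 0
build = -> { memoize_here { calls += 1; Object.new } }

first = build.call
second = build.call
same_object = first.equal?(second)
```

Because both calls pass a block written at the same source location, the block body runs once and the second call returns the cached object. One consequence of this keying, as far as the sketch goes, is that two different blocks written on the same line of the same file would share a cache entry.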
data/lib/paco/parse_error.rb
CHANGED
@@ -1,28 +1,29 @@
 # frozen_string_literal: true
+
 module Paco
   class Error < StandardError; end

   class ParseError < Error
+    attr_reader :ctx, :pos, :expected
+
     # @param [Paco::Context] ctx
     def initialize(ctx, expected)
       @ctx = ctx
       @pos = ctx.pos
       @expected = expected
+    end

-
-
-      # puts "#{ctx.pos}/#{ctx.input.length}: #{ctx.input[ctx.last_pos..ctx.pos]}"
-      # puts "expected: #{expected}"
-      # puts ""
+    def callstack
+      ctx.callstack
     end

     def message
-      index =
+      index = ctx.index(pos)
       <<~MSG
-
-        line #{index
-        unexpected #{
-        expecting #{
+        \nParsing error
+        line #{index.line}, column #{index.column}:
+        unexpected #{ctx.eof? ? "end of file" : ctx.input[pos].inspect}
+        expecting #{expected}
       MSG
     end
   end
data/lib/paco/parser.rb
CHANGED
@@ -6,36 +6,47 @@ module Paco
   class Parser
     attr_reader :desc

+    # @param [String] desc
     def initialize(desc = "", &block)
       @desc = desc
       @block = block
     end

-
-
+    # @param [String] desc
+    # @return [Paco::Parser]
+    def with_desc(desc)
+      @desc = desc
+      self
+    end
+
+    # @param [String, Paco::Context] input
+    # @param [true, false] with_callstack
+    def parse(input, with_callstack: false)
+      ctx = input.is_a?(Context) ? input : Context.new(input, with_callstack: with_callstack)
       skip(Paco::Combinators.eof)._parse(ctx)
     end

+    # @param [Paco::Context] ctx
     def _parse(ctx)
-
-
-
-
-      # puts "#{ctx.input.length}/#{ctx.pos}: " + ctx.input[ctx.last_pos..ctx.pos].inspect
-      # puts ""
-      # res
+      ctx.start_parse(self)
+      res = @block.call(ctx, self)
+      ctx.success_parse(res, self)
+      res
     end

     # Raises ParseError
     # @param [Paco::Context] ctx
     # @raise [Paco::ParseError]
     def failure(ctx)
+      ctx.failure_parse(self)
       raise ParseError.new(ctx, desc), "", []
     end

     # Returns a new parser which tries `parser`, and if it fails uses `other`.
+    # @param [Paco::Parser] other
+    # @return [Paco::Parser]
     def or(other)
-      Parser.new do |ctx|
+      Parser.new("or(#{desc}, #{other.desc})") do |ctx|
         _parse(ctx)
       rescue ParseError
         other._parse(ctx)
@@ -47,7 +58,7 @@ module Paco
     # @param [Poco::Parser] other
     # @return [Paco::Parser]
     def skip(other)
-      Paco::Combinators.seq(self, other).fmap { |results| results[0] }
+      Paco::Combinators.seq(self, other).fmap { |results| results[0] }.with_desc("#{desc}.skip(#{other.desc})")
     end
     alias_method :<, :skip

@@ -55,14 +66,15 @@ module Paco
     # @param [Poco::Parser] other
     # @return [Paco::Parser]
     def next(other)
-
+      Paco::Combinators.seq(self, other).fmap { |results| results[1] }
+        .with_desc("#{desc}.next(#{other.desc})")
     end
     alias_method :>, :next

     # Transforms the output of `parser` with the given block.
     # @return [Paco::Parser]
     def fmap(&block)
-      Parser.new do |ctx|
+      Parser.new("#{desc}.fmap") do |ctx|
         block.call(_parse(ctx))
       end
     end
@@ -74,7 +86,7 @@ module Paco
     # with the other Paco::Combinators.
     # @return [Paco::Parser]
     def bind(&block)
-      Parser.new do |ctx|
+      Parser.new("#{desc}.bind") do |ctx|
         block.call(_parse(ctx))._parse(ctx)
       end
     end
@@ -139,7 +151,7 @@ module Paco
         raise ArgumentError, "invalid attributes: min `#{min}`, max `#{max}`"
       end

-      Parser.new do |ctx|
+      Parser.new("#{desc}.times(#{min}, #{max})") do |ctx|
         results = min.times.map { _parse(ctx) }
         (max - min).times.each do
           results << _parse(ctx)
@@ -154,7 +166,7 @@ module Paco
     # Returns a parser that runs `parser` at least `num` times,
     # and returns an array of the results.
     def at_least(num)
-      Paco::Combinators.
+      Paco::Combinators.seq(times(num), many) do |head, rest|
         head + rest
       end
     end
data/lib/paco/rspec/parse_matcher.rb
ADDED
@@ -0,0 +1,47 @@
+# frozen_string_literal: true
+
+RSpec::Matchers.define(:parse) do |input|
+  chain :as do |expected_output = nil, &block|
+    @expected = expected_output
+    @block = block
+  end
+
+  chain :fully do
+    @expected = input
+  end
+
+  match do |parser|
+    @result = parser.parse(input)
+    return @block.call(@result) if @block
+    return @expected == @result if defined?(@expected)
+
+    true
+  rescue Paco::ParseError => e
+    @error_message = e.message
+    false
+  end
+
+  failure_message do |subject|
+    msg = "expected output of parsing #{input.inspect} with #{subject.inspect} to"
+    was = (@result ? "was #{@result.inspect}" : "raised an error #{@error_message}")
+    return "#{msg} meet block conditions, but it didn't. It #{was}" if @block
+    return "#{msg} equal #{@expected.inspect}, but it #{was}" if defined?(@expected)
+
+    "expected #{subject.inspect} to successfully parse #{input.inspect}, but it #{was}"
+  end
+
+  failure_message_when_negated do |subject|
+    msg = "expected output of parsing #{input.inspect} with #{subject.inspect} not to"
+    return "#{msg} meet block conditions, but it did" if @block
+    return "#{msg} equal #{@expected.inspect}" if defined?(@expected)
+
+    "expected #{subject.inspect} to not parse #{input.inspect}, but it did"
+  end
+
+  description do
+    return "parse #{input.inspect} with block conditions" if @block
+    return "parse #{input.inspect} as #{@expected.inspect}" if defined?(@expected)
+
+    "parse #{input.inspect}"
+  end
+end
data/lib/paco/rspec.rb
ADDED
data/lib/paco/version.rb
CHANGED
data/lib/paco.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: paco
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.2.0
 platform: ruby
 authors:
 - Svyatoslav Kryukov
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2021-12-
+date: 2021-12-28 00:00:00.000000000 Z
 dependencies: []
 description: Paco is a parser combinator library.
 email:
@@ -20,14 +20,17 @@ files:
 - CHANGELOG.md
 - LICENSE.txt
 - README.md
-- bin/console
-- bin/setup
 - lib/paco.rb
+- lib/paco/callstack.rb
 - lib/paco/combinators.rb
 - lib/paco/combinators/char.rb
 - lib/paco/context.rb
+- lib/paco/index.rb
+- lib/paco/memoizer.rb
 - lib/paco/parse_error.rb
 - lib/paco/parser.rb
+- lib/paco/rspec.rb
+- lib/paco/rspec/parse_matcher.rb
 - lib/paco/version.rb
 homepage: https://github.com/skryukov/paco
 licenses:
data/bin/console
DELETED
@@ -1,15 +0,0 @@
-#!/usr/bin/env ruby
-# frozen_string_literal: true
-
-require "bundler/setup"
-require "paco"
-
-# You can add fixtures and/or initialization code here to make experimenting
-# with your gem easier. You can also use a different console, if you like.
-
-# (If you use this, don't forget to add pry to your Gemfile!)
-# require "pry"
-# Pry.start
-
-require "irb"
-IRB.start(__FILE__)