paco 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +50 -3
- data/README.md +73 -9
- data/lib/paco/callstack.rb +27 -0
- data/lib/paco/combinators/char.rb +14 -13
- data/lib/paco/combinators.rb +36 -45
- data/lib/paco/context.rb +28 -17
- data/lib/paco/index.rb +15 -0
- data/lib/paco/memoizer.rb +20 -0
- data/lib/paco/parse_error.rb +11 -10
- data/lib/paco/parser.rb +28 -16
- data/lib/paco/rspec/parse_matcher.rb +47 -0
- data/lib/paco/rspec.rb +4 -0
- data/lib/paco/version.rb +1 -1
- data/lib/paco.rb +0 -1
- metadata +7 -4
- data/bin/console +0 -15
- data/bin/setup +0 -8
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: ae17ec48820da8d99722dd87e2f834cc238497c948dff292490665cdeb16fbc1
|
4
|
+
data.tar.gz: 30046ad4c3203dcf35430c9512df303f008fe513c11f92e06137a7131ee586fe
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: bf9fb9a162ab9c8292047942ebf8cc57e6a0382fa680beed1802129f6abd07d2483a24ba734f260c0fef9ba948e54ac9cf9635bc49abbbc457d2d91c6c5a9fb7
|
7
|
+
data.tar.gz: 0d4ef03c2022c26ae813d1d3f925b539c6cb7e2e715f27f965a80f9b8809e7a4c6d6d78bb6c5086603f3585aaf44e6035b5f72892078d44d63f4cfd5ca83509b
|
data/CHANGELOG.md
CHANGED
@@ -5,9 +5,55 @@ All notable changes to this project will be documented in this file.
|
|
5
5
|
The format is based on [Keep a Changelog],
|
6
6
|
and this project adheres to [Semantic Versioning].
|
7
7
|
|
8
|
-
## [
|
8
|
+
## [0.2.0] - 2021-12-28
|
9
9
|
|
10
|
-
|
10
|
+
### Added
|
11
|
+
|
12
|
+
- Callstack collection for debugging. ([@skryukov])
|
13
|
+
|
14
|
+
Pass `with_callstack: true` to the `Paco::Parser#parse` method to collect a callstack while parsing. To examine the callstack catch the `ParseError` exception:
|
15
|
+
|
16
|
+
```ruby
|
17
|
+
begin
|
18
|
+
string("Paco").parse("Paco!", with_callstack: true)
|
19
|
+
rescue Paco::ParseError => e
|
20
|
+
pp e.callstack.stack # You will probably want to use `binding.irb` or `binding.pry`
|
21
|
+
end
|
22
|
+
```
|
23
|
+
|
24
|
+
- `Paco::Combinators.index` method. ([@skryukov])
|
25
|
+
|
26
|
+
Call `Paco::Combinators.index` to get `Paco::Index` representing the current offset into the parse without consuming the input.
|
27
|
+
`Paco::Index` has a 0-based character offset attribute `:pos` and 1-based `:line` and `:column` attributes.
|
28
|
+
|
29
|
+
```ruby
|
30
|
+
index.parse("Paco") #=> #<struct Paco::Index pos=0, line=1, column=1>
|
31
|
+
```
|
32
|
+
|
33
|
+
- RSpec matcher `#parse`. ([@skryukov])
|
34
|
+
|
35
|
+
Add `require "paco/rspec"` to `spec_helper.rb` to enable a special RSpec matcher `#parse`:
|
36
|
+
|
37
|
+
```ruby
|
38
|
+
subject { string("Paco") }
|
39
|
+
|
40
|
+
it { is_expected.to parse("Paco") } # just checks if parser succeeds
|
41
|
+
it { is_expected.to parse("Paco").as("Paco") } # checks if parser result is eq to value passed to `#as`
|
42
|
+
it { is_expected.to parse("Paco").fully } # checks if parser result is the same as value passed to `#parse`
|
43
|
+
```
|
44
|
+
|
45
|
+
### Changed
|
46
|
+
|
47
|
+
- `Paco::Combinators.seq_map` merged into `Paco::Combinators.seq`. ([@skryukov])
|
48
|
+
- `Paco::Combinators.sep_by_1` renamed to `Paco::Combinators.sep_by!`. ([@skryukov])
|
49
|
+
|
50
|
+
### Fixed
|
51
|
+
|
52
|
+
- `Paco::Combinators::Char#regexp` now uses `\A` instead of `^`. ([@skryukov])
|
53
|
+
- `include Paco` now works inside `irb`. ([@skryukov])
|
54
|
+
- `Paco::Combinators#not_followed_by` and `Paco::Combinators#seq` now don't consume input on error. ([@skryukov])
|
55
|
+
|
56
|
+
## [0.1.0] - 2021-12-12
|
11
57
|
|
12
58
|
### Added
|
13
59
|
|
@@ -15,7 +61,8 @@ and this project adheres to [Semantic Versioning].
|
|
15
61
|
|
16
62
|
[@skryukov]: https://github.com/skryukov
|
17
63
|
|
18
|
-
[Unreleased]: https://github.com/skryukov/paco/compare/v0.
|
64
|
+
[Unreleased]: https://github.com/skryukov/paco/compare/v0.2.0...HEAD
|
65
|
+
[0.2.0]: https://github.com/skryukov/paco/compare/v0.1.0...v0.2.0
|
19
66
|
[0.1.0]: https://github.com/skryukov/paco/commits/v0.1.0
|
20
67
|
|
21
68
|
[Keep a Changelog]: https://keepachangelog.com/en/1.0.0/
|
data/README.md
CHANGED
@@ -1,26 +1,83 @@
|
|
1
1
|
# Paco
|
2
2
|
|
3
|
+
[](https://rubygems.org/gems/paco)
|
4
|
+
[](https://github.com/skryukov/paco/actions)
|
5
|
+
|
3
6
|
Paco is a parser combinator library inspired by Haskell's [Parsec] and [Parsimmon].
|
4
7
|
|
5
|
-
|
8
|
+
"But I don't need to write another JSON parser or a new language, why do I need your library then?"
|
6
9
|
|
7
|
-
|
10
|
+
Well, most probably you don't. But I can think of rare cases when you do. Say, you need to write a validation for [git branch names].
|
11
|
+
|
12
|
+
You can go with easy-peasy regex:
|
8
13
|
|
9
14
|
```ruby
|
10
|
-
|
15
|
+
branch_name_regex = /^(?!\/|.*(?:[\/.]\.|\/\/|@{|\\|\.lock$|[\/.]$))[^\040\177 ~^:?*\[]+$/
|
16
|
+
|
17
|
+
branch_name_regex.match?("feature/branch-validation")
|
18
|
+
```
|
19
|
+
|
20
|
+
With Paco, you can go with a little more verbose version of that rule:
|
21
|
+
|
22
|
+
```ruby
|
23
|
+
module BranchNameParser
|
24
|
+
extend Paco
|
25
|
+
|
26
|
+
class << self
|
27
|
+
def parse(input)
|
28
|
+
parser.parse(input)
|
29
|
+
end
|
30
|
+
|
31
|
+
def parser
|
32
|
+
lookahead(none_of("/")).next(valid_chars.join)
|
33
|
+
end
|
34
|
+
|
35
|
+
def valid_chars
|
36
|
+
any_char.not_followed_by(invalid_sequences).at_least(1)
|
37
|
+
end
|
38
|
+
|
39
|
+
def invalid_sequences
|
40
|
+
alt(invalid_chars, invalid_endings)
|
41
|
+
end
|
42
|
+
|
43
|
+
def invalid_chars
|
44
|
+
alt(
|
45
|
+
string("/."),
|
46
|
+
string(".."),
|
47
|
+
string("//"),
|
48
|
+
string("@{"),
|
49
|
+
string("\\\\"),
|
50
|
+
one_of("\040\177 ~^:?*\\[")
|
51
|
+
)
|
52
|
+
end
|
53
|
+
|
54
|
+
def invalid_endings
|
55
|
+
seq(
|
56
|
+
alt(string(".lock"), one_of("/.")),
|
57
|
+
eof
|
58
|
+
)
|
59
|
+
end
|
60
|
+
end
|
61
|
+
end
|
62
|
+
|
63
|
+
BranchNameParser.parse("feature/branch-validation")
|
11
64
|
```
|
12
65
|
|
13
|
-
|
66
|
+
Easy? Not really, but there is a chance you can read it. 😅
|
67
|
+
|
68
|
+
See [API documentation](docs/paco.md), [examples](examples) and [specs](spec) for more info on usage.
|
14
69
|
|
15
|
-
|
70
|
+
<a href="https://evilmartians.com/"><img src="https://evilmartians.com/badges/sponsored-by-evil-martians.svg" alt="Sponsored by Evil Martians" width="236" height="54"></a>
|
16
71
|
|
17
|
-
|
72
|
+
## Installation
|
18
73
|
|
19
|
-
|
74
|
+
Add to your `Gemfile`:
|
20
75
|
|
21
|
-
|
76
|
+
```ruby
|
77
|
+
gem "paco"
|
78
|
+
```
|
22
79
|
|
23
|
-
|
80
|
+
And then run `bundle install`.
|
24
81
|
|
25
82
|
## Development
|
26
83
|
|
@@ -32,6 +89,11 @@ To install this gem onto your local machine, run `bundle exec rake install`.
|
|
32
89
|
|
33
90
|
Bug reports and pull requests are welcome on GitHub at https://github.com/skryukov/paco.
|
34
91
|
|
92
|
+
## Alternatives
|
93
|
+
|
94
|
+
- [parslet] - A small (but featureful) PEG based parser library.
|
95
|
+
- [parsby] — Parser combinator library for Ruby inspired by Haskell's Parsec.
|
96
|
+
|
35
97
|
## License
|
36
98
|
|
37
99
|
The gem is available as open source under the terms of the [MIT License].
|
@@ -39,4 +101,6 @@ The gem is available as open source under the terms of the [MIT License].
|
|
39
101
|
[MIT License]: https://opensource.org/licenses/MIT
|
40
102
|
[Parsec]: https://github.com/haskell/parsec
|
41
103
|
[Parsimmon]: https://github.com/jneen/parsimmon
|
104
|
+
[parslet]: https://github.com/kschiess/parslet
|
42
105
|
[parsby]: https://github.com/jolmg/parsby
|
106
|
+
[git branch names]: https://git-scm.com/docs/git-check-ref-format#_description
|
data/lib/paco/callstack.rb
ADDED
@@ -0,0 +1,27 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Paco
|
4
|
+
class Callstack
|
5
|
+
attr_reader :stack
|
6
|
+
|
7
|
+
def initialize
|
8
|
+
@stack = []
|
9
|
+
@depth = 0
|
10
|
+
end
|
11
|
+
|
12
|
+
def failure(**params)
|
13
|
+
@depth -= 1
|
14
|
+
@stack << params.merge(status: :failure, depth: @depth)
|
15
|
+
end
|
16
|
+
|
17
|
+
def start(**params)
|
18
|
+
@depth += 1
|
19
|
+
@stack << params.merge(status: :start, depth: @depth)
|
20
|
+
end
|
21
|
+
|
22
|
+
def success(**params)
|
23
|
+
@depth -= 1
|
24
|
+
@stack << params.merge(status: :success, depth: @depth)
|
25
|
+
end
|
26
|
+
end
|
27
|
+
end
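The entries that `Callstack` pushes onto `stack` are plain hashes: the keyword arguments passed by the caller (`:pos`, `:parser`, plus `:result` on success, per `context.rb` in this diff) merged with `:status` and `:depth`. Below is a minimal sketch of turning such entries into an indented trace. `format_callstack` is a hypothetical helper, not part of Paco, and the sample entries are hand-written for illustration rather than captured from a real parse.

```ruby
# Hypothetical helper (not part of Paco): renders callstack-style entries,
# i.e. hashes with :status, :depth, :pos, :parser and optionally :result.
def format_callstack(entries)
  entries.map do |entry|
    # Guard against a negative depth so String#* never raises.
    indent = "  " * [entry[:depth], 0].max
    line = "#{indent}#{entry[:status]} #{entry[:parser]} @#{entry[:pos]}"
    entry.key?(:result) ? "#{line} => #{entry[:result].inspect}" : line
  end
end

# Hand-written sample entries, shaped like the hashes built in callstack.rb.
sample = [
  {pos: 0, parser: 'string("Paco")', status: :start, depth: 1},
  {pos: 4, parser: 'string("Paco")', result: "Paco", status: :success, depth: 0},
  {pos: 4, parser: "eof", status: :start, depth: 1},
  {pos: 4, parser: "eof", status: :failure, depth: 0}
]

puts format_callstack(sample)
```

With a real parse, the same helper could presumably be fed `e.callstack.stack` from a rescued `Paco::ParseError`, assuming the parse was started with `with_callstack: true`.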
|
data/lib/paco/combinators/char.rb
CHANGED
@@ -32,7 +32,7 @@ module Paco
|
|
32
32
|
# @param [String] matcher
|
33
33
|
# @return [Paco::Parser]
|
34
34
|
def string(matcher)
|
35
|
-
Parser.new(matcher) do |ctx, parser|
|
35
|
+
Parser.new("string(#{matcher.inspect})") do |ctx, parser|
|
36
36
|
src = ctx.read(matcher.length)
|
37
37
|
parser.failure(ctx) if src != matcher
|
38
38
|
|
@@ -46,9 +46,10 @@ module Paco
|
|
46
46
|
# When `group` is specified, it returns only the text in the specific regexp match group.
|
47
47
|
# @param [Regexp] regexp
|
48
48
|
# @return [Paco::Parser]
|
49
|
+
# @param [Integer] group
|
49
50
|
def regexp(regexp, group: 0)
|
50
|
-
anchored_regexp = Regexp.new("
|
51
|
-
Parser.new(regexp.inspect) do |ctx, parser|
|
51
|
+
anchored_regexp = Regexp.new("\\A(?:#{regexp.source})", regexp.options)
|
52
|
+
Parser.new("regexp(#{regexp.inspect})") do |ctx, parser|
|
52
53
|
match = anchored_regexp.match(ctx.read_all)
|
53
54
|
parser.failure(ctx) if match.nil?
|
54
55
|
|
@@ -61,7 +62,7 @@ module Paco
|
|
61
62
|
# @param [Regexp] regexp
|
62
63
|
# @return [Paco::Parser]
|
63
64
|
def regexp_char(regexp)
|
64
|
-
satisfy(regexp.inspect) { |char| regexp.match?(char) }
|
65
|
+
satisfy("regexp_char(#{regexp.inspect})") { |char| regexp.match?(char) }
|
65
66
|
end
|
66
67
|
|
67
68
|
# Returns a parser that looks for exactly one character from passed
|
@@ -69,7 +70,7 @@ module Paco
|
|
69
70
|
# @param [String, Array<String>] matcher
|
70
71
|
# @return [Paco::Parser]
|
71
72
|
def one_of(matcher)
|
72
|
-
satisfy(matcher
|
73
|
+
satisfy("one_of(#{matcher})") { |char| matcher.include?(char) }
|
73
74
|
end
|
74
75
|
|
75
76
|
# Returns a parser that looks for exactly one character _NOT_ from passed
|
@@ -77,7 +78,7 @@ module Paco
|
|
77
78
|
# @param [String, Array<String>] matcher
|
78
79
|
# @return [Paco::Parser]
|
79
80
|
def none_of(matcher)
|
80
|
-
satisfy("
|
81
|
+
satisfy("none_of(#{matcher})") { |char| !matcher.include?(char) }
|
81
82
|
end
|
82
83
|
|
83
84
|
# Returns a parser that consumes and returns the next character of the input.
|
@@ -90,7 +91,7 @@ module Paco
|
|
90
91
|
# @return [Paco::Parser]
|
91
92
|
def remainder
|
92
93
|
memoize do
|
93
|
-
Parser.new("remainder
|
94
|
+
Parser.new("remainder") do |ctx, parser|
|
94
95
|
result = ctx.read_all
|
95
96
|
ctx.pos += result.length
|
96
97
|
result
|
@@ -147,7 +148,7 @@ module Paco
|
|
147
148
|
memoize { alt(newline, eof) }
|
148
149
|
end
|
149
150
|
|
150
|
-
# Alias for `Paco::Combinators.
|
151
|
+
# Alias for `Paco::Combinators.regexp_char(/[a-z]/i)`.
|
151
152
|
# @return [Paco::Parser]
|
152
153
|
def letter
|
153
154
|
memoize { regexp_char(/[a-z]/i) }
|
@@ -156,16 +157,16 @@ module Paco
|
|
156
157
|
# Alias for `Paco::Combinators.regexp(/[a-z]+/i)`.
|
157
158
|
# @return [Paco::Parser]
|
158
159
|
def letters
|
159
|
-
memoize {
|
160
|
+
memoize { regexp(/[a-z]+/i) }
|
160
161
|
end
|
161
162
|
|
162
163
|
# Alias for `Paco::Combinators.regexp(/[a-z]*/i)`.
|
163
164
|
# @return [Paco::Parser]
|
164
165
|
def opt_letters
|
165
|
-
memoize {
|
166
|
+
memoize { regexp(/[a-z]*/i) }
|
166
167
|
end
|
167
168
|
|
168
|
-
# Alias for `Paco::Combinators.
|
169
|
+
# Alias for `Paco::Combinators.regexp_char(/[0-9]/)`.
|
169
170
|
# @return [Paco::Parser]
|
170
171
|
def digit
|
171
172
|
memoize { regexp_char(/[0-9]/) }
|
@@ -174,13 +175,13 @@ module Paco
|
|
174
175
|
# Alias for `Paco::Combinators.regexp(/[0-9]+/)`.
|
175
176
|
# @return [Paco::Parser]
|
176
177
|
def digits
|
177
|
-
memoize {
|
178
|
+
memoize { regexp(/[0-9]+/) }
|
178
179
|
end
|
179
180
|
|
180
181
|
# Alias for `Paco::Combinators.regexp(/[0-9]*/)`.
|
181
182
|
# @return [Paco::Parser]
|
182
183
|
def opt_digits
|
183
|
-
memoize {
|
184
|
+
memoize { regexp(/[0-9]*/) }
|
184
185
|
end
|
185
186
|
|
186
187
|
# Alias for `Paco::Combinators.regexp(/\s+/)`.
|
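The `regexp` hunk in this file swaps the `^` anchor for `\A`. In Ruby, `^` matches at the start of every line, while `\A` matches only at the start of the string, so a `^`-anchored pattern can match further down in the remaining input. Below is a small stdlib-only sketch of the difference. `anchor_at_start` is an illustrative name, not a Paco method.

```ruby
# Illustrative helper (not part of Paco): wraps a pattern so that it can
# only match at the very beginning of the string, keeping its options.
def anchor_at_start(regexp)
  Regexp.new("\\A(?:#{regexp.source})", regexp.options)
end

input = "1\nabc"

line_anchored = /^(?:[a-z]+)/                # `^` also matches right after a newline
string_anchored = anchor_at_start(/[a-z]+/)  # `\A` matches only at offset 0

p line_anchored.match(input)&.[](0)          # "abc", found on the second line
p string_anchored.match(input)               # nil, the input starts with "1"
p anchor_at_start(/paco/i).match?("PACO!")   # true, options such as /i survive
```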
data/lib/paco/combinators.rb
CHANGED
@@ -1,22 +1,14 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
2
|
|
3
|
-
require "monitor"
|
4
|
-
|
5
3
|
require "paco/combinators/char"
|
4
|
+
require "paco/memoizer"
|
6
5
|
|
7
6
|
module Paco
|
8
7
|
module Combinators
|
9
|
-
|
10
|
-
|
11
|
-
base.extend Char
|
12
|
-
end
|
13
|
-
|
14
|
-
def self.included(base)
|
15
|
-
base.include MonitorMixin
|
16
|
-
base.include Char
|
17
|
-
end
|
8
|
+
include Char
|
9
|
+
extend Char
|
18
10
|
|
19
|
-
|
11
|
+
module_function
|
20
12
|
|
21
13
|
# Returns a parser that runs the passed `parser` without consuming the input, and
|
22
14
|
# returns `null` if the passed `parser` _does not match_ the input. Fails otherwise.
|
@@ -28,10 +20,11 @@ module Paco
|
|
28
20
|
begin
|
29
21
|
parser._parse(ctx)
|
30
22
|
rescue ParseError
|
31
|
-
ctx.pos = start_pos
|
32
23
|
nil
|
33
24
|
else
|
34
25
|
pars.failure(ctx)
|
26
|
+
ensure
|
27
|
+
ctx.pos = start_pos
|
35
28
|
end
|
36
29
|
end
|
37
30
|
end
|
@@ -39,7 +32,7 @@ module Paco
|
|
39
32
|
# Returns a parser that doesn't consume any input and always returns `result`.
|
40
33
|
# @return [Paco::Parser]
|
41
34
|
def succeed(result)
|
42
|
-
Parser.new { result }
|
35
|
+
Parser.new("succeed(#{result})") { result }
|
43
36
|
end
|
44
37
|
|
45
38
|
# Returns a parser that doesn't consume any input and always fails with passed `message`.
|
@@ -54,7 +47,7 @@ module Paco
|
|
54
47
|
# @param [Paco::Parser] parser
|
55
48
|
# @return [Paco::Parser]
|
56
49
|
def lookahead(parser)
|
57
|
-
Parser.new do |ctx|
|
50
|
+
Parser.new("lookahead(#{parser.desc})") do |ctx|
|
58
51
|
start_pos = ctx.pos
|
59
52
|
parser._parse(ctx)
|
60
53
|
ctx.pos = start_pos
|
@@ -68,7 +61,7 @@ module Paco
|
|
68
61
|
def alt(*parsers)
|
69
62
|
raise ArgumentError, "no parsers specified" if parsers.empty?
|
70
63
|
|
71
|
-
Parser.new do |ctx|
|
64
|
+
Parser.new("alt(#{parsers.map(&:desc).join(", ")})") do |ctx|
|
72
65
|
result = nil
|
73
66
|
last_error = nil
|
74
67
|
start_pos = ctx.pos
|
@@ -86,26 +79,25 @@ module Paco
|
|
86
79
|
|
87
80
|
# Accepts one or more parsers, and returns a parser that expects them
|
88
81
|
# to match in order, returns an array of all their results.
|
82
|
+
# If `block` is specified, passes the results of the `parsers` as arguments
|
83
|
+
# to a `block`, and at the end returns its result.
|
89
84
|
# @param [Array<Paco::Parser>] parsers
|
90
85
|
# @return [Paco::Parser]
|
91
86
|
def seq(*parsers)
|
92
87
|
raise ArgumentError, "no parsers specified" if parsers.empty?
|
93
88
|
|
94
|
-
Parser.new do |ctx|
|
95
|
-
|
89
|
+
result = Parser.new("seq(#{parsers.map(&:desc).join(", ")})") do |ctx|
|
90
|
+
start_pos = ctx.pos
|
91
|
+
begin
|
92
|
+
parsers.map { |parser| parser._parse(ctx) }
|
93
|
+
rescue ParseError => e
|
94
|
+
ctx.pos = start_pos
|
95
|
+
raise e
|
96
|
+
end
|
96
97
|
end
|
97
|
-
|
98
|
+
return result unless block_given?
|
98
99
|
|
99
|
-
|
100
|
-
# their results as an arguments to a `block`, and at the end returns its result.
|
101
|
-
# @param [Array<Paco::Parser>] parsers
|
102
|
-
# @return [Paco::Parser]
|
103
|
-
def seq_map(*parsers, &block)
|
104
|
-
raise ArgumentError, "no parsers specified" if parsers.empty?
|
105
|
-
|
106
|
-
seq(*parsers).fmap do |results|
|
107
|
-
block.call(*results)
|
108
|
-
end
|
100
|
+
result.fmap { |results| yield(*results) }
|
109
101
|
end
|
110
102
|
|
111
103
|
# Accepts a block that returns a parser, which is evaluated the first time the parser is used.
|
@@ -121,7 +113,8 @@ module Paco
|
|
121
113
|
# @param [Paco::Parser] separator
|
122
114
|
# @return [Paco::Parser]
|
123
115
|
def sep_by(parser, separator)
|
124
|
-
alt(
|
116
|
+
alt(sep_by!(parser, separator), succeed([]))
|
117
|
+
.with_desc("sep_by(#{parser.desc}, #{separator.desc})")
|
125
118
|
end
|
126
119
|
|
127
120
|
# Returns a parser that expects one or more matches for `parser`,
|
@@ -129,11 +122,11 @@ module Paco
|
|
129
122
|
# @param [Paco::Parser] parser
|
130
123
|
# @param [Paco::Parser] separator
|
131
124
|
# @return [Paco::Parser]
|
132
|
-
def
|
133
|
-
|
134
|
-
|
135
|
-
end
|
125
|
+
def sep_by!(parser, separator)
|
126
|
+
seq(parser, many(separator.next(parser))) { |first, arr| [first] + arr }
|
127
|
+
.with_desc("sep_by!(#{parser.desc}, #{separator.desc})")
|
136
128
|
end
|
129
|
+
alias_method :sep_by_1, :sep_by!
|
137
130
|
|
138
131
|
# Expects the parser `before` before `parser` and `after` after `parser. Returns the result of the parser.
|
139
132
|
# @param [Paco::Parser] before
|
@@ -148,13 +141,10 @@ module Paco
|
|
148
141
|
# @param [Paco::Parser] parser
|
149
142
|
# @return [Paco::Parser]
|
150
143
|
def many(parser)
|
151
|
-
Parser.new do |ctx|
|
144
|
+
Parser.new("many(#{parser.desc})") do |ctx|
|
152
145
|
results = []
|
153
|
-
# last_pos = ctx.pos
|
154
146
|
loop do
|
155
147
|
results << parser._parse(ctx)
|
156
|
-
# raise ArgumentError, "smth wrong" if last_pos == ctx.pos
|
157
|
-
# last_pos = ctx.pos
|
158
148
|
rescue ParseError
|
159
149
|
break
|
160
150
|
end
|
@@ -169,15 +159,16 @@ module Paco
|
|
169
159
|
alt(parser, succeed(nil))
|
170
160
|
end
|
171
161
|
|
162
|
+
# Returns parser that returns `Paco::Index` representing
|
163
|
+
# the current offset into the parse without consuming the input.
|
164
|
+
# @return [Paco::Parser]
|
165
|
+
def index
|
166
|
+
Parser.new { |ctx| ctx.index }
|
167
|
+
end
|
168
|
+
|
172
169
|
# Helper used for memoization
|
173
170
|
def memoize(&block)
|
174
|
-
|
175
|
-
synchronize do
|
176
|
-
@_paco_memoized ||= {}
|
177
|
-
return @_paco_memoized[key] if @_paco_memoized.key?(key)
|
178
|
-
|
179
|
-
@_paco_memoized[key] = block.call
|
180
|
-
end
|
171
|
+
Memoizer.memoize(block.source_location, &block)
|
181
172
|
end
|
182
173
|
end
|
183
174
|
end
|
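Two behaviours of `seq` change in the hunks above: it remembers `start_pos` and rewinds `ctx.pos` when any parser in the sequence raises, and it accepts an optional block that receives the results as separate arguments (the role `seq_map` had in 0.1.0). Below is a minimal sketch of both ideas using plain lambdas so it runs without the gem. `SketchError`, `SketchContext`, `str` and `seq_sketch` are stand-in names, not Paco's API.

```ruby
# Stand-ins for Paco::ParseError and Paco::Context, for illustration only.
class SketchError < StandardError; end
SketchContext = Struct.new(:input, :pos)

# Parser that consumes `expected`, or raises without moving the position.
def str(expected)
  lambda do |ctx|
    chunk = ctx.input[ctx.pos, expected.length]
    raise SketchError, "expected #{expected.inspect}" unless chunk == expected

    ctx.pos += expected.length
    expected
  end
end

# Runs parsers in order; on failure rewinds to where the sequence started.
# With a block, the results are passed to it as separate arguments.
def seq_sketch(*parsers, &block)
  lambda do |ctx|
    start_pos = ctx.pos
    begin
      results = parsers.map { |parser| parser.call(ctx) }
      block ? block.call(*results) : results
    rescue SketchError
      ctx.pos = start_pos
      raise
    end
  end
end

pair = seq_sketch(str("a"), str("b"))
concat = seq_sketch(str("a"), str("b")) { |a, b| a + b }

p pair.call(SketchContext.new("ab", 0))    # ["a", "b"]
p concat.call(SketchContext.new("ab", 0))  # "ab"

failed = SketchContext.new("ax", 0)
begin
  pair.call(failed)
rescue SketchError
  p failed.pos  # 0, the "a" consumed before the failure is given back
end
```

Rewinding on failure is what lets a surrounding alternative retry from the original position instead of from the middle of a half-consumed sequence.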
data/lib/paco/context.rb
CHANGED
@@ -1,17 +1,17 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require "paco/callstack"
|
4
|
+
require "paco/index"
|
5
|
+
|
2
6
|
module Paco
|
3
7
|
class Context
|
4
|
-
attr_reader :input, :last_pos, :
|
8
|
+
attr_reader :input, :last_pos, :callstack
|
9
|
+
attr_accessor :pos
|
5
10
|
|
6
|
-
def pos
|
7
|
-
# TODO: is that needed?
|
8
|
-
@last_pos = @pos
|
9
|
-
@pos = np
|
10
|
-
end
|
11
|
-
|
12
|
-
def initialize(input, pos = 0)
|
11
|
+
def initialize(input, pos: 0, with_callstack: false)
|
13
12
|
@input = input
|
14
13
|
@pos = pos
|
14
|
+
@callstack = Callstack.new if with_callstack
|
15
15
|
end
|
16
16
|
|
17
17
|
def read(n)
|
@@ -19,22 +19,33 @@ module Paco
|
|
19
19
|
end
|
20
20
|
|
21
21
|
def read_all
|
22
|
-
input[pos
|
22
|
+
input[pos..]
|
23
23
|
end
|
24
24
|
|
25
25
|
def eof?
|
26
26
|
pos >= input.length
|
27
27
|
end
|
28
28
|
|
29
|
+
# @param [Integer] from
|
30
|
+
# @return [Paco::Index]
|
29
31
|
def index(from = nil)
|
30
|
-
from
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
|
35
|
-
|
36
|
-
|
37
|
-
|
32
|
+
Index.calculate(input: input, pos: from || pos)
|
33
|
+
end
|
34
|
+
|
35
|
+
# @param [Paco::Parser] parser
|
36
|
+
def failure_parse(parser)
|
37
|
+
@callstack&.failure(pos: pos, parser: parser.desc)
|
38
|
+
end
|
39
|
+
|
40
|
+
# @param [Paco::Parser] parser
|
41
|
+
def start_parse(parser)
|
42
|
+
@callstack&.start(pos: pos, parser: parser.desc)
|
43
|
+
end
|
44
|
+
|
45
|
+
# @param [Object] result
|
46
|
+
# @param [Paco::Parser] parser
|
47
|
+
def success_parse(result, parser)
|
48
|
+
@callstack&.success(pos: pos, result: result, parser: parser.desc)
|
38
49
|
end
|
39
50
|
end
|
40
51
|
end
|
data/lib/paco/index.rb
ADDED
@@ -0,0 +1,15 @@
|
|
1
|
+
module Paco
|
2
|
+
Index = Struct.new(:pos, :line, :column) do
|
3
|
+
# @param [String] input
|
4
|
+
# @param [Integer] pos
|
5
|
+
def self.calculate(input:, pos:)
|
6
|
+
raise ArgumentError, "`pos` must be a non-negative integer" if pos < 0
|
7
|
+
raise ArgumentError, "`pos` is grater then input length" if pos > input.length
|
8
|
+
|
9
|
+
lines = input[0..pos].lines
|
10
|
+
line = lines.empty? ? 1 : lines.length
|
11
|
+
column = lines.last&.length || 1
|
12
|
+
new(pos, line, column)
|
13
|
+
end
|
14
|
+
end
|
15
|
+
end
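One way to see what `Index.calculate` yields is to run the same arithmetic on a multi-line string. The sketch below restates that logic in a standalone method so it runs without the gem. `line_and_column` is an illustrative name, and the argument checks from the original are left out.

```ruby
# Standalone restatement of the logic in Index.calculate (illustrative name).
# `pos` is a 0-based character offset; line and column are 1-based.
def line_and_column(input, pos)
  lines = input[0..pos].lines
  line = lines.empty? ? 1 : lines.length
  column = lines.last&.length || 1
  [line, column]
end

input = "ab\ncd"

p line_and_column(input, 0)  # [1, 1], the "a"
p line_and_column(input, 2)  # [1, 3], the newline still belongs to line 1
p line_and_column(input, 3)  # [2, 1], the "c"
p line_and_column("", 0)     # [1, 1], empty input falls back to line 1, column 1
```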
|
data/lib/paco/memoizer.rb
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require "monitor"
|
4
|
+
|
5
|
+
module Paco
|
6
|
+
module Memoizer
|
7
|
+
extend MonitorMixin
|
8
|
+
|
9
|
+
class << self
|
10
|
+
def memoize(key, &block)
|
11
|
+
synchronize do
|
12
|
+
@paco_memoized ||= {}
|
13
|
+
return @paco_memoized[key] if @paco_memoized.key?(key)
|
14
|
+
|
15
|
+
@paco_memoized[key] = block.call
|
16
|
+
end
|
17
|
+
end
|
18
|
+
end
|
19
|
+
end
|
20
|
+
end
|
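The module above keeps a single cache guarded by a monitor, and `combinators.rb` keys it by `block.source_location`, so each `memoize { ... }` call site builds its parser once. Below is a minimal stdlib-only sketch of that pattern. `SketchMemoizer` is an illustrative re-creation, not Paco's module, and `Object.new` stands in for an expensive-to-build parser.

```ruby
require "monitor"

# Illustrative re-creation of the pattern: a module-level cache behind a monitor.
module SketchMemoizer
  extend MonitorMixin

  def self.memoize(key, &block)
    synchronize do
      @cache ||= {}
      # Build and store the value only when the key has not been seen yet.
      @cache.fetch(key) { @cache[key] = block.call }
    end
  end
end

builds = 0
builder = proc do
  builds += 1
  Object.new # stands in for an expensive-to-build parser
end

# The block's source location serves as the cache key, as in combinators.rb.
first = SketchMemoizer.memoize(builder.source_location, &builder)
second = SketchMemoizer.memoize(builder.source_location, &builder)

p first.equal?(second)  # true, the second call is served from the cache
p builds                # 1, the block ran only once
```

Keying on the source location means the cache no longer depends on instance state of whatever object included the combinators, which fits the switch to `module_function` in this release.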
data/lib/paco/parse_error.rb
CHANGED
@@ -1,28 +1,29 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Paco
|
3
4
|
class Error < StandardError; end
|
4
5
|
|
5
6
|
class ParseError < Error
|
7
|
+
attr_reader :ctx, :pos, :expected
|
8
|
+
|
6
9
|
# @param [Paco::Context] ctx
|
7
10
|
def initialize(ctx, expected)
|
8
11
|
@ctx = ctx
|
9
12
|
@pos = ctx.pos
|
10
13
|
@expected = expected
|
14
|
+
end
|
11
15
|
|
12
|
-
|
13
|
-
|
14
|
-
# puts "#{ctx.pos}/#{ctx.input.length}: #{ctx.input[ctx.last_pos..ctx.pos]}"
|
15
|
-
# puts "expected: #{expected}"
|
16
|
-
# puts ""
|
16
|
+
def callstack
|
17
|
+
ctx.callstack
|
17
18
|
end
|
18
19
|
|
19
20
|
def message
|
20
|
-
index =
|
21
|
+
index = ctx.index(pos)
|
21
22
|
<<~MSG
|
22
|
-
|
23
|
-
line #{index
|
24
|
-
unexpected #{
|
25
|
-
expecting #{
|
23
|
+
\nParsing error
|
24
|
+
line #{index.line}, column #{index.column}:
|
25
|
+
unexpected #{ctx.eof? ? "end of file" : ctx.input[pos].inspect}
|
26
|
+
expecting #{expected}
|
26
27
|
MSG
|
27
28
|
end
|
28
29
|
end
|
data/lib/paco/parser.rb
CHANGED
@@ -6,36 +6,47 @@ module Paco
|
|
6
6
|
class Parser
|
7
7
|
attr_reader :desc
|
8
8
|
|
9
|
+
# @param [String] desc
|
9
10
|
def initialize(desc = "", &block)
|
10
11
|
@desc = desc
|
11
12
|
@block = block
|
12
13
|
end
|
13
14
|
|
14
|
-
|
15
|
-
|
15
|
+
# @param [String] desc
|
16
|
+
# @return [Paco::Parser]
|
17
|
+
def with_desc(desc)
|
18
|
+
@desc = desc
|
19
|
+
self
|
20
|
+
end
|
21
|
+
|
22
|
+
# @param [String, Paco::Context] input
|
23
|
+
# @param [true, false] with_callstack
|
24
|
+
def parse(input, with_callstack: false)
|
25
|
+
ctx = input.is_a?(Context) ? input : Context.new(input, with_callstack: with_callstack)
|
16
26
|
skip(Paco::Combinators.eof)._parse(ctx)
|
17
27
|
end
|
18
28
|
|
29
|
+
# @param [Paco::Context] ctx
|
19
30
|
def _parse(ctx)
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
# puts "#{ctx.input.length}/#{ctx.pos}: " + ctx.input[ctx.last_pos..ctx.pos].inspect
|
25
|
-
# puts ""
|
26
|
-
# res
|
31
|
+
ctx.start_parse(self)
|
32
|
+
res = @block.call(ctx, self)
|
33
|
+
ctx.success_parse(res, self)
|
34
|
+
res
|
27
35
|
end
|
28
36
|
|
29
37
|
# Raises ParseError
|
30
38
|
# @param [Paco::Context] ctx
|
31
39
|
# @raise [Paco::ParseError]
|
32
40
|
def failure(ctx)
|
41
|
+
ctx.failure_parse(self)
|
33
42
|
raise ParseError.new(ctx, desc), "", []
|
34
43
|
end
|
35
44
|
|
36
45
|
# Returns a new parser which tries `parser`, and if it fails uses `other`.
|
46
|
+
# @param [Paco::Parser] other
|
47
|
+
# @return [Paco::Parser]
|
37
48
|
def or(other)
|
38
|
-
Parser.new do |ctx|
|
49
|
+
Parser.new("or(#{desc}, #{other.desc})") do |ctx|
|
39
50
|
_parse(ctx)
|
40
51
|
rescue ParseError
|
41
52
|
other._parse(ctx)
|
@@ -47,7 +58,7 @@ module Paco
|
|
47
58
|
# @param [Paco::Parser] other
|
48
59
|
# @return [Paco::Parser]
|
49
60
|
def skip(other)
|
50
|
-
Paco::Combinators.seq(self, other).fmap { |results| results[0] }
|
61
|
+
Paco::Combinators.seq(self, other).fmap { |results| results[0] }.with_desc("#{desc}.skip(#{other.desc})")
|
51
62
|
end
|
52
63
|
alias_method :<, :skip
|
53
64
|
|
@@ -55,14 +66,15 @@ module Paco
|
|
55
66
|
# @param [Paco::Parser] other
|
56
67
|
# @return [Paco::Parser]
|
57
68
|
def next(other)
|
58
|
-
|
69
|
+
Paco::Combinators.seq(self, other).fmap { |results| results[1] }
|
70
|
+
.with_desc("#{desc}.next(#{other.desc})")
|
59
71
|
end
|
60
72
|
alias_method :>, :next
|
61
73
|
|
62
74
|
# Transforms the output of `parser` with the given block.
|
63
75
|
# @return [Paco::Parser]
|
64
76
|
def fmap(&block)
|
65
|
-
Parser.new do |ctx|
|
77
|
+
Parser.new("#{desc}.fmap") do |ctx|
|
66
78
|
block.call(_parse(ctx))
|
67
79
|
end
|
68
80
|
end
|
@@ -74,7 +86,7 @@ module Paco
|
|
74
86
|
# with the other Paco::Combinators.
|
75
87
|
# @return [Paco::Parser]
|
76
88
|
def bind(&block)
|
77
|
-
Parser.new do |ctx|
|
89
|
+
Parser.new("#{desc}.bind") do |ctx|
|
78
90
|
block.call(_parse(ctx))._parse(ctx)
|
79
91
|
end
|
80
92
|
end
|
@@ -139,7 +151,7 @@ module Paco
|
|
139
151
|
raise ArgumentError, "invalid attributes: min `#{min}`, max `#{max}`"
|
140
152
|
end
|
141
153
|
|
142
|
-
Parser.new do |ctx|
|
154
|
+
Parser.new("#{desc}.times(#{min}, #{max})") do |ctx|
|
143
155
|
results = min.times.map { _parse(ctx) }
|
144
156
|
(max - min).times.each do
|
145
157
|
results << _parse(ctx)
|
@@ -154,7 +166,7 @@ module Paco
|
|
154
166
|
# Returns a parser that runs `parser` at least `num` times,
|
155
167
|
# and returns an array of the results.
|
156
168
|
def at_least(num)
|
157
|
-
Paco::Combinators.
|
169
|
+
Paco::Combinators.seq(times(num), many) do |head, rest|
|
158
170
|
head + rest
|
159
171
|
end
|
160
172
|
end
|
data/lib/paco/rspec/parse_matcher.rb
ADDED
@@ -0,0 +1,47 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
RSpec::Matchers.define(:parse) do |input|
|
4
|
+
chain :as do |expected_output = nil, &block|
|
5
|
+
@expected = expected_output
|
6
|
+
@block = block
|
7
|
+
end
|
8
|
+
|
9
|
+
chain :fully do
|
10
|
+
@expected = input
|
11
|
+
end
|
12
|
+
|
13
|
+
match do |parser|
|
14
|
+
@result = parser.parse(input)
|
15
|
+
return @block.call(@result) if @block
|
16
|
+
return @expected == @result if defined?(@expected)
|
17
|
+
|
18
|
+
true
|
19
|
+
rescue Paco::ParseError => e
|
20
|
+
@error_message = e.message
|
21
|
+
false
|
22
|
+
end
|
23
|
+
|
24
|
+
failure_message do |subject|
|
25
|
+
msg = "expected output of parsing #{input.inspect} with #{subject.inspect} to"
|
26
|
+
was = (@result ? "was #{@result.inspect}" : "raised an error #{@error_message}")
|
27
|
+
return "#{msg} meet block conditions, but it didn't. It #{was}" if @block
|
28
|
+
return "#{msg} equal #{@expected.inspect}, but it #{was}" if defined?(@expected)
|
29
|
+
|
30
|
+
"expected #{subject.inspect} to successfully parse #{input.inspect}, but it #{was}"
|
31
|
+
end
|
32
|
+
|
33
|
+
failure_message_when_negated do |subject|
|
34
|
+
msg = "expected output of parsing #{input.inspect} with #{subject.inspect} not to"
|
35
|
+
return "#{msg} meet block conditions, but it did" if @block
|
36
|
+
return "#{msg} equal #{@expected.inspect}" if defined?(@expected)
|
37
|
+
|
38
|
+
"expected #{subject.inspect} to not parse #{input.inspect}, but it did"
|
39
|
+
end
|
40
|
+
|
41
|
+
description do
|
42
|
+
return "parse #{input.inspect} with block conditions" if @block
|
43
|
+
return "parse #{input.inspect} as #{@expected.inspect}" if defined?(@expected)
|
44
|
+
|
45
|
+
"parse #{input.inspect}"
|
46
|
+
end
|
47
|
+
end
|
data/lib/paco/rspec.rb
ADDED
data/lib/paco/version.rb
CHANGED
data/lib/paco.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: paco
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Svyatoslav Kryukov
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2021-12-
|
11
|
+
date: 2021-12-28 00:00:00.000000000 Z
|
12
12
|
dependencies: []
|
13
13
|
description: Paco is a parser combinator library.
|
14
14
|
email:
|
@@ -20,14 +20,17 @@ files:
|
|
20
20
|
- CHANGELOG.md
|
21
21
|
- LICENSE.txt
|
22
22
|
- README.md
|
23
|
-
- bin/console
|
24
|
-
- bin/setup
|
25
23
|
- lib/paco.rb
|
24
|
+
- lib/paco/callstack.rb
|
26
25
|
- lib/paco/combinators.rb
|
27
26
|
- lib/paco/combinators/char.rb
|
28
27
|
- lib/paco/context.rb
|
28
|
+
- lib/paco/index.rb
|
29
|
+
- lib/paco/memoizer.rb
|
29
30
|
- lib/paco/parse_error.rb
|
30
31
|
- lib/paco/parser.rb
|
32
|
+
- lib/paco/rspec.rb
|
33
|
+
- lib/paco/rspec/parse_matcher.rb
|
31
34
|
- lib/paco/version.rb
|
32
35
|
homepage: https://github.com/skryukov/paco
|
33
36
|
licenses:
|
data/bin/console
DELETED
@@ -1,15 +0,0 @@
|
|
1
|
-
#!/usr/bin/env ruby
|
2
|
-
# frozen_string_literal: true
|
3
|
-
|
4
|
-
require "bundler/setup"
|
5
|
-
require "paco"
|
6
|
-
|
7
|
-
# You can add fixtures and/or initialization code here to make experimenting
|
8
|
-
# with your gem easier. You can also use a different console, if you like.
|
9
|
-
|
10
|
-
# (If you use this, don't forget to add pry to your Gemfile!)
|
11
|
-
# require "pry"
|
12
|
-
# Pry.start
|
13
|
-
|
14
|
-
require "irb"
|
15
|
-
IRB.start(__FILE__)
|