super-expressive-ruby 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 8e5ab82376e3effc765d550ed042ba0b52b1921ace4913bc773b6cdf355b3a5f
4
+ data.tar.gz: 62c2947b14cd2c745e84efdacb8acb8e521526d202b08f55b3b9c4377c5d550b
5
+ SHA512:
6
+ metadata.gz: 754878b3b36cb750ab0b679ce76d65e70ae60c8939e12a754ea08d4b3690bb032705a506dbc3b9d9e85f7bf286fa010d151d827f7aa9c43d341ea92884d90b1f
7
+ data.tar.gz: 4ff3df8c2f886a6c35e431b15e5a441c9fb5ee93f7a27c4f0d693cac15a38c9bc3ba2933c1a8f835b208547078a96027e788b7fe43ad7304427d05eef2bc055e
@@ -0,0 +1,18 @@
1
+ version: 2.1
2
+ orbs:
3
+ ruby: circleci/ruby@0.1.2
4
+
5
+ jobs:
6
+ build:
7
+ docker:
8
+ - image: circleci/ruby:2.6.3-stretch-node
9
+ executor: ruby/default
10
+ steps:
11
+ - checkout
12
+ - run:
13
+ name: Which bundler?
14
+ command: bundle -v
15
+ - ruby/bundle-install
16
+ - run:
17
+ name: Run rspec
18
+ command: bundle exec rspec
@@ -0,0 +1,12 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /_yardoc/
4
+ /coverage/
5
+ /doc/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
9
+ Gemfile.lock
10
+
11
+ # rspec failure tracking
12
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,3 @@
1
+ --format documentation
2
+ --color
3
+ --require spec_helper
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at hiroshi_yamasaki@eastback.jp. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [https://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: https://contributor-covenant.org
74
+ [version]: https://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,7 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in super-expressive-ruby.gemspec
4
+ gemspec
5
+
6
+ gem "rake", "~> 12.0"
7
+ gem "rspec", "~> 3.0"
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2020 Hiroshi Yamasaki
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,141 @@
1
+ # SuperExpressiveRuby
2
+
3
+ This gem is a port of https://github.com/francisrstokes/super-expressive
4
+
5
+ ## Installation
6
+
7
+ Add this line to your application's Gemfile:
8
+
9
+ ```ruby
10
+ gem 'super-expressive-ruby'
11
+ ```
12
+
13
+ And then execute:
14
+
15
+ $ bundle install
16
+
17
+ Or install it yourself as:
18
+
19
+ $ gem install super-expressive-ruby
20
+
21
+ ## Usage
22
+
23
+ ### Example
24
+
25
+ <pre>
26
+ require 'super-expressive-ruby'
27
+
28
+ myRegex = SuperExpressive.create
29
+ .startOfInput
30
+ .optional.string('0x')
31
+ .capture
32
+ .exactly(4).anyOf
33
+ .range('A', 'F')
34
+ .range('a', 'f')
35
+ .range('0', '9')
36
+ .end
37
+ .end
38
+ .endOfInput
39
+ .toRegex;
40
+
41
+ # Produces the following regular expression:
42
+ /^(?:0x)?((?:[A-Fa-f0-9]){4})$/
43
+ </pre>
44
+
45
+
46
+ ### Snake case is supported as well.
47
+
48
+ <pre>
49
+ require 'super-expressive-ruby'
50
+
51
+ my_regex = SuperExpressive.create
52
+ .start_of_input
53
+ .optional.string('0x')
54
+ .capture
55
+ .exactly(4).any_of
56
+ .range('A', 'F')
57
+ .range('a', 'f')
58
+ .range('0', '9')
59
+ .end
60
+ .end
61
+ .end_of_input
62
+ .to_regex;
63
+
64
+ # Produces the following regular expression:
65
+ /^(?:0x)?((?:[A-Fa-f0-9]){4})$/
66
+ </pre>
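
As a rough illustration of what the generated pattern accepts, the sketch below applies the regular expression shown above to a few inputs. It hard-codes the literal instead of calling the gem so that it runs on its own; the constant `HEX_WORD` and the helper `hex_digits` are hypothetical names introduced only for this example.

```ruby
# Literal copied from the README example output above. In real use this value
# would be returned by SuperExpressive.create ... .to_regex; it is hard-coded
# here so the sketch runs without the gem installed.
HEX_WORD = /^(?:0x)?((?:[A-Fa-f0-9]){4})$/

# Hypothetical helper: returns the four captured hex digits, or nil when the
# input does not match the pattern.
def hex_digits(input)
  match = HEX_WORD.match(input)
  match && match[1]
end

p hex_digits('0xBEEF')  # optional "0x" prefix followed by four hex digits
p hex_digits('c0de')    # the prefix may be omitted
p hex_digits('0xBEEFY') # a trailing non-hex character prevents a match
```

The first capture group holds only the four hex digits because the `0x` prefix sits in a non-capturing group.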
67
+
68
+ ### API Compatibility
69
+
70
+ Unchecked methods are unsupported: they can be called, but they are ignored.
71
+
72
+ - [ ] .allowMultipleMatches (use String#gsub or String#scan as an alternative)
73
+ - [ ] .lineByLine (use \A or \z as an alternative)
74
+ - [x] .caseInsensitive
75
+ - [ ] .sticky (Ruby has no equivalent of JavaScript's `y` regular expression flag)
76
+ - [ ] .unicode
77
+ - [x] .singleLine
78
+ - [x] .anyChar
79
+ - [x] .whitespaceChar
80
+ - [x] .nonWhitespaceChar
81
+ - [x] .digit
82
+ - [x] .nonDigit
83
+ - [x] .word
84
+ - [x] .nonWord
85
+ - [x] .wordBoundary
86
+ - [x] .nonWordBoundary
87
+ - [x] .newline
88
+ - [x] .carriageReturn
89
+ - [x] .tab
90
+ - [x] .nullByte
91
+ - [x] .anyOf
92
+ - [x] .capture
93
+ - [x] .namedCapture(name)
94
+ - [x] .namedBackreference(name)
95
+ - [x] .backreference(index)
96
+ - [x] .group
97
+ - [x] .end()
98
+ - [x] .assertAhead
99
+ - [x] .assertNotAhead
100
+ - [x] .optional
101
+ - [x] .zeroOrMore
102
+ - [x] .zeroOrMoreLazy
103
+ - [x] .oneOrMore
104
+ - [x] .oneOrMoreLazy
105
+ - [x] .exactly(n)
106
+ - [x] .atLeast(n)
107
+ - [x] .between(x, y)
108
+ - [x] .betweenLazy(x, y)
109
+ - [x] .startOfInput
110
+ - [x] .endOfInput
111
+ - [x] .anyOfChars(chars)
112
+ - [x] .anythingButChars(chars)
113
+ - [x] .anythingButString(str)
114
+ - [x] .anythingButRange(a, b)
115
+ - [x] .string(s)
116
+ - [x] .char(c)
117
+ - [x] .range(a, b)
118
+ - [x] .subexpression(expr, opts?)
119
+ - [x] .toRegexString()
120
+ - [x] .toRegex()
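
The alternatives suggested for the unsupported flags can be sketched in plain Ruby. This is a minimal illustration, not part of the gem: `HEX_PAIR` and the helper methods are hypothetical names, and the pattern is written as a literal rather than built with SuperExpressive. It relies only on standard `String#scan`, `String#gsub`, and Ruby's anchor semantics, where `^` and `$` match at line boundaries while `\A` and `\z` match only at the start and end of the whole string.

```ruby
# Hypothetical pattern standing in for a regex built with the gem.
HEX_PAIR = /[A-Fa-f0-9]{2}/

# Collect every match, similar in spirit to JavaScript's g flag.
def all_matches(text, pattern)
  text.scan(pattern)
end

# Replace every match with a mask string.
def mask_matches(text, pattern, mask = '**')
  text.gsub(pattern, mask)
end

# ^ and $ anchor to line boundaries, so a match on any line counts.
def line_anchored?(text)
  text.match?(/^ff$/)
end

# \A and \z anchor to the whole string, so the entire input must match.
def string_anchored?(text)
  text.match?(/\Aff\z/)
end

p all_matches('de ad be ef', HEX_PAIR)
p mask_matches('id=ff;', HEX_PAIR)
p line_anchored?("00\nff")
p string_anchored?("00\nff")
```

Because Ruby's `^` and `$` are always line-oriented, reaching for `\A` and `\z` is the way to require that the whole string matches.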
121
+
122
+
123
+
124
+ ## Development
125
+
126
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
127
+
128
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
129
+
130
+ ## Contributing
131
+
132
+ Bug reports and pull requests are welcome on GitHub at https://github.com/hiy/super-expressive-ruby. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/hiy/super-expressive-ruby/blob/master/CODE_OF_CONDUCT.md).
133
+
134
+
135
+ ## License
136
+
137
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
138
+
139
+ ## Code of Conduct
140
+
141
+ Everyone interacting in the SuperExpressiveRuby project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/hiy/super-expressive-ruby/blob/master/CODE_OF_CONDUCT.md).
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "super-expressive-ruby"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start(__FILE__)
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,10 @@
1
+ require "super-expressive-ruby/version"
2
+
3
+ module SuperExpressive
4
+ require 'super-expressive-ruby/super-expressive-ruby'
5
+ class Error < StandardError; end
6
+
7
+ def self.create
8
+ SuperExpressiveRuby.new
9
+ end
10
+ end
@@ -0,0 +1,811 @@
1
+ # frozen_string_literal: true
2
+
3
+ class SuperExpressiveRuby
4
+ require 'active_support'
5
+ require "active_support/core_ext/object/deep_dup"
6
+
7
+ attr_accessor :state
8
+
9
+ NamedGroupRegex = /^[a-z]+\w*$/i.freeze
10
+ QuantifierTable = {
11
+ oneOrMore: '+',
12
+ oneOrMoreLazy: '+?',
13
+ zeroOrMore: '*',
14
+ zeroOrMoreLazy: '*?',
15
+ optional: '?',
16
+ exactly: proc { |times| "{#{times}}" },
17
+ atLeast: proc { |times| "{#{times},}" },
18
+ between: proc { |times| "{#{times[0]},#{times[1]}}" },
19
+ betweenLazy: proc { |times| "{#{times[0]},#{times[1]}}?" }
20
+ }.freeze
21
+
22
+ class << self
23
+ def evaluate(el)
24
+ case el[:type]
25
+ when 'noop'
26
+ ''
27
+ when 'anyChar'
28
+ '.'
29
+ when 'whitespaceChar'
30
+ '\\s'
31
+ when 'nonWhitespaceChar'
32
+ '\\S'
33
+ when 'digit'
34
+ '\\d'
35
+ when 'nonDigit'
36
+ '\\D'
37
+ when 'word'
38
+ '\\w'
39
+ when 'nonWord'
40
+ '\\W'
41
+ when 'wordBoundary'
42
+ '\\b'
43
+ when 'nonWordBoundary'
44
+ '\\B'
45
+ when 'startOfInput'
46
+ '^'
47
+ when 'endOfInput'
48
+ '$'
49
+ when 'newline'
50
+ '\\n'
51
+ when 'carriageReturn'
52
+ '\\r'
53
+ when 'tab'
54
+ '\\t'
55
+ when 'nullByte'
56
+ '\\0'
57
+ when 'string'
58
+ el[:value]
59
+ when 'char'
60
+ el[:value]
61
+ when 'range'
62
+ "[#{el[:value][0]}-#{el[:value][1]}]"
63
+ when 'anythingButRange'
64
+ "[^#{el[:value][0]}-#{el[:value][1]}]"
65
+ when 'anyOfChars'
66
+ "[#{el[:value]}]"
67
+ when 'anythingButChars'
68
+ "[^#{el[:value]}]"
69
+ when 'namedBackreference'
70
+ "\\k<#{el[:name]}>"
71
+ when 'backreference'
72
+ "\\#{el[:index]}"
73
+ when 'subexpression'
74
+ el[:value].map { |value| evaluate(value) }.join('')
75
+ when 'optional',
76
+ 'zeroOrMore',
77
+ 'zeroOrMoreLazy',
78
+ 'oneOrMore',
79
+ 'oneOrMoreLazy'
80
+ inner = evaluate(el[:value])
81
+ with_group =
82
+ if el[:value][:quantifierRequiresGroup]
83
+ "(?:#{inner})"
84
+ else
85
+ inner
86
+ end
87
+ symbol = QuantifierTable[el[:type].to_sym]
88
+ "#{with_group}#{symbol}"
89
+ when 'betweenLazy',
90
+ 'between',
91
+ 'atLeast',
92
+ 'exactly'
93
+ inner = evaluate(el[:value])
94
+ withGroup =
95
+ if el[:value][:quantifierRequiresGroup]
96
+ "(?:#{inner})"
97
+ else
98
+ inner
99
+ end
100
+ "#{withGroup}#{QuantifierTable[el[:type].to_sym].call(el[:times])}"
101
+ when 'anythingButString'
102
+ chars = el[:value].split('').map { |c| "[^#{c}]" }.join('')
103
+ "(?:#{chars})"
104
+ when 'assertAhead'
105
+ evaluated = el[:value].map { |v| evaluate(v) }.join('')
106
+ "(?=#{evaluated})"
107
+ when 'assertNotAhead'
108
+ evaluated = el[:value].map { |v| evaluate(v) }.join('')
109
+ "(?!#{evaluated})"
110
+ when 'anyOf'
111
+ fused, rest = fuse_elements(el[:value])
112
+ return "[#{fused}]" unless rest.length
113
+
114
+ evaluatedRest = rest.map { |v| evaluate(v) }
115
+ separator = evaluatedRest.length > 0 && fused.length > 0 ? '|' : ''
116
+ "(?:#{evaluatedRest.join('|')}#{separator}#{fused ? "[#{fused}]" : ''})"
117
+ when 'capture'
118
+ evaluated = el[:value].map { |v| evaluate(v) }
119
+ "(#{evaluated.join('')})"
120
+ when 'namedCapture'
121
+ evaluated = el[:value].map { |v| evaluate(v) }
122
+ "(?<#{el[:name]}>#{evaluated.join('')})"
123
+ when 'group'
124
+ evaluated = el[:value].map { |v| evaluate(v) }
125
+ "(?:#{evaluated.join('')})"
126
+ else
127
+ raise "Can't process unsupported element type: #{el[:type]}"
128
+ end
129
+ end
130
+
131
+ def as_type(type, opts={})
132
+ proc { |value| { type: type, value: value }.merge(opts) }
133
+ end
134
+
135
+ def deferred_type(type, opts={})
136
+ type_fn = as_type(type, opts)
137
+ type_fn.call(type_fn)
138
+ end
139
+
140
+ def assert(condition, message)
141
+ raise StandardError, message unless condition
142
+ end
143
+
144
+ def partition(a)
145
+ r = a.each_with_object([[], []]) do |cur, acc|
146
+ if is_fusable(cur)
147
+ acc[0].push(cur)
148
+ else
149
+ acc[1].push(cur)
150
+ end
151
+ acc
152
+ end
153
+ [r[0], r[1]]
154
+ end
155
+
156
+ def is_fusable(element)
157
+ element[:type] == 'range' ||
158
+ element[:type] == 'char' ||
159
+ element[:type] == 'anyOfChars'
160
+ end
161
+
162
+ def fuse_elements(elements)
163
+ fusables, rest = partition(elements)
164
+ fused = fusables.map do |el|
165
+ if %w[char anyOfChars].include?(el[:type])
166
+ el[:value]
167
+ else
168
+ "#{el[:value][0]}-#{el[:value][1]}"
169
+ end
170
+ end.join('')
171
+ [fused, rest]
172
+ end
173
+
174
+ def camelize(snake_case_str)
175
+ snake_case_str.split('_').each_with_object([]).with_index do |(s, acc), idx|
176
+ acc << if idx.zero?
177
+ s
178
+ else
179
+ s.capitalize
180
+ end
181
+ end.join
182
+ end
183
+ end
184
+
185
+ @@t = {
186
+ root: as_type('root').call,
187
+ noop: as_type('noop').call,
188
+ startOfInput: as_type('startOfInput').call,
189
+ endOfInput: as_type('endOfInput').call,
190
+ anyChar: as_type('anyChar').call,
191
+ whitespaceChar: as_type('whitespaceChar').call,
192
+ nonWhitespaceChar: as_type('nonWhitespaceChar').call,
193
+ digit: as_type('digit').call,
194
+ nonDigit: as_type('nonDigit').call,
195
+ word: as_type('word').call,
196
+ nonWord: as_type('nonWord').call,
197
+ wordBoundary: as_type('wordBoundary').call,
198
+ nonWordBoundary: as_type('nonWordBoundary').call,
199
+ newline: as_type('newline').call,
200
+ carriageReturn: as_type('carriageReturn').call,
201
+ tab: as_type('tab').call,
202
+ nullByte: as_type('nullByte').call,
203
+ anyOfChars: as_type('anyOfChars'),
204
+ anythingButString: as_type('anythingButString'),
205
+ anythingButChars: as_type('anythingButChars'),
206
+ anythingButRange: as_type('anythingButRange'),
207
+ char: as_type('char'),
208
+ range: as_type('range'),
209
+ string: as_type('string', { quantifierRequiresGroup: true }),
210
+ namedBackreference: proc { |name| deferred_type('namedBackreference', { name: name }) },
211
+ backreference: proc { |index| deferred_type('backreference', { index: index }) },
212
+ capture: deferred_type('capture', { containsChildren: true }),
213
+ subexpression: as_type('subexpression', { containsChildren: true, quantifierRequiresGroup: true }),
214
+ namedCapture: proc { |name| deferred_type('namedCapture', { name: name, containsChildren: true }) },
215
+ group: deferred_type('group', { containsChildren: true }),
216
+ anyOf: deferred_type('anyOf', { containsChildren: true }),
217
+ assertAhead: deferred_type('assertAhead', { containsChildren: true }),
218
+ assertNotAhead: deferred_type('assertNotAhead', { containsChildren: true }),
219
+ exactly: proc { |times| deferred_type('exactly', { times: times, containsChild: true }) },
220
+ atLeast: proc { |times| deferred_type('atLeast', { times: times, containsChild: true }) },
221
+ between: proc { |x, y| deferred_type('between', { times: [x, y], containsChild: true }) },
222
+ betweenLazy: proc { |x, y| deferred_type('betweenLazy', { times: [x, y], containsChild: true }) },
223
+ zeroOrMore: deferred_type('zeroOrMore', { containsChild: true }),
224
+ zeroOrMoreLazy: deferred_type('zeroOrMoreLazy', { containsChild: true }),
225
+ oneOrMore: deferred_type('oneOrMore', { containsChild: true }),
226
+ oneOrMoreLazy: deferred_type('oneOrMoreLazy', { containsChild: true }),
227
+ optional: deferred_type('optional', { containsChild: true })
228
+ }.freeze
229
+
230
+ def initialize
231
+ self.state = {
232
+ hasDefinedStart: false,
233
+ hasDefinedEnd: false,
234
+ flags: {
235
+ g: false,
236
+ y: false,
237
+ m: false,
238
+ i: false,
239
+ u: false,
240
+ s: false
241
+ },
242
+ stack: [create_stack_frame(t[:root])],
243
+ namedGroups: [],
244
+ totalCaptureGroups: 0
245
+ }
246
+ end
247
+
248
+ def t
249
+ @@t
250
+ end
251
+
252
+ def escape_special(s)
253
+ Regexp.escape(s)
254
+ end
255
+
256
+ def create_stack_frame(type)
257
+ { type: type, quantifier: nil, elements: [] }
258
+ end
259
+
260
+ def allow_multiple_matches
261
+ # warn("Warning: Ruby does not have an allow-multiple-matches option. Use String#gsub or String#scan instead")
262
+ n = clone
263
+ n.state[:flags][:g] = true
264
+ n
265
+ end
266
+
267
+ def line_by_line
268
+ # warn("Warning: Ruby does not have a line by line option. use \A or \z as an alternative")
269
+ n = clone
270
+ n.state[:flags][:m] = true
271
+ n
272
+ end
273
+
274
+ def case_insensitive
275
+ n = clone
276
+ n.state[:flags][:i] = true
277
+ n
278
+ end
279
+
280
+ def sticky
281
+ # warn("Warning: Ruby does not have a sticky option")
282
+ n = clone
283
+ n.state[:flags][:y] = true
284
+ n
285
+ end
286
+
287
+ def unicode
288
+ n = clone
289
+ n.state[:flags][:u] = true
290
+ n
291
+ end
292
+
293
+ def single_line
294
+ n = clone
295
+ n.state[:flags][:s] = true
296
+ n
297
+ end
298
+
299
+ def match_element(type_fn)
300
+ n = clone
301
+ n.get_current_element_array.push(n.apply_quantifier(type_fn))
302
+ n
303
+ end
304
+
305
+ def any_char
306
+ match_element(t[:anyChar])
307
+ end
308
+
309
+ def whitespace_char
310
+ match_element(t[:whitespaceChar])
311
+ end
312
+
313
+ def non_whitespace_char
314
+ match_element(t[:nonWhitespaceChar])
315
+ end
316
+
317
+ def digit
318
+ match_element(t[:digit])
319
+ end
320
+
321
+ def non_digit
322
+ match_element(t[:nonDigit])
323
+ end
324
+
325
+ def word
326
+ match_element(t[:word])
327
+ end
328
+
329
+ def non_word
330
+ match_element(t[:nonWord])
331
+ end
332
+
333
+ def word_boundary
334
+ match_element(t[:wordBoundary])
335
+ end
336
+
337
+ def non_word_boundary
338
+ match_element(t[:nonWordBoundary])
339
+ end
340
+
341
+ def newline
342
+ match_element(t[:newline])
343
+ end
344
+
345
+ def carriage_return
346
+ match_element(t[:carriageReturn])
347
+ end
348
+
349
+ def tab
350
+ match_element(t[:tab])
351
+ end
352
+
353
+ def null_byte
354
+ match_element(t[:nullByte])
355
+ end
356
+
357
+ def named_backreference(name)
358
+ assert(state[:namedGroups].include?(name), "no capture group called '#{name}' exists (create one with .namedCapture())")
359
+ match_element(t[:namedBackreference].call(name))
360
+ end
361
+
362
+ def backreference(index)
363
+ assert(index.is_a?(Integer), 'index must be a number')
364
+ assert(index > 0 && index <= state[:totalCaptureGroups],
365
+ "invalid index #{index}. There are #{state[:totalCaptureGroups]} capture groups on this SuperExpression")
366
+ match_element(t[:backreference].call(index))
367
+ end
368
+
369
+ def frame_creating_element(type_fn)
370
+ n = clone
371
+ new_frame = create_stack_frame(type_fn)
372
+ n.state[:stack].push(new_frame)
373
+ n
374
+ end
375
+
376
+ def any_of
377
+ frame_creating_element(t[:anyOf])
378
+ end
379
+
380
+ def group
381
+ frame_creating_element(t[:group])
382
+ end
383
+
384
+ def assert_ahead
385
+ frame_creating_element(t[:assertAhead])
386
+ end
387
+
388
+ def assert_not_ahead
389
+ frame_creating_element(t[:assertNotAhead])
390
+ end
391
+
392
+ def capture
393
+ n = clone
394
+ new_frame = create_stack_frame(t[:capture])
395
+ n.state[:stack].push(new_frame)
396
+ n.state[:totalCaptureGroups] += 1
397
+ n
398
+ end
399
+
400
+ def track_named_group(name)
401
+ assert(name.is_a?(String), "name must be a string (got #{name})")
402
+ assert(name.length > 0, 'name must be at least one character')
403
+ assert(!state[:namedGroups].include?(name), "cannot use #{name} again for a capture group")
404
+ assert(name.scan(NamedGroupRegex).any?, "name '#{name}' is not valid (only letters, numbers, and underscores)")
405
+
406
+ state[:namedGroups].push name
407
+ end
408
+
409
+ def named_capture(name)
410
+ n = clone
411
+ new_frame = create_stack_frame(t[:namedCapture].call(name))
412
+
413
+ n.track_named_group(name)
414
+ n.state[:stack].push(new_frame)
415
+ n.state[:totalCaptureGroups] += 1
416
+ n
417
+ end
418
+
419
+ def quantifier_element(type_fn_name)
420
+ n = clone
421
+ current_frame = n.get_current_frame
422
+ if current_frame[:quantifier]
423
+ raise StandardError, "cannot quantify regular expression with '#{type_fn_name}' because it's already being quantified with '#{current_frame[:quantifier][:type]}'"
424
+ end
425
+
426
+ current_frame[:quantifier] = t[type_fn_name.to_sym]
427
+ n
428
+ end
429
+
430
+ def optional
431
+ quantifier_element('optional')
432
+ end
433
+
434
+ def zero_or_more
435
+ quantifier_element('zeroOrMore')
436
+ end
437
+
438
+ def zero_or_more_lazy
439
+ quantifier_element('zeroOrMoreLazy')
440
+ end
441
+
442
+ def one_or_more
443
+ quantifier_element('oneOrMore')
444
+ end
445
+
446
+ def one_or_more_lazy
447
+ quantifier_element('oneOrMoreLazy')
448
+ end
449
+
450
+ def exactly(n)
451
+ assert(n.is_a?(Integer) && n > 0, "n must be a positive integer (got #{n})")
452
+
453
+ nxt = clone
454
+ current_frame = nxt.get_current_frame
455
+ if current_frame[:quantifier]
456
+ raise StandardError, "cannot quantify regular expression with 'exactly' because it's already being quantified with '#{current_frame[:quantifier][:type]}'"
457
+ end
458
+
459
+ current_frame[:quantifier] = t[:exactly].call(n)
460
+ nxt
461
+ end
462
+
463
+ def at_least(n)
464
+ assert(n.is_a?(Integer) && n > 0, "n must be a positive integer (got #{n})")
465
+ nxt = clone
466
+ current_frame = nxt.get_current_frame
467
+ if current_frame[:quantifier]
468
+ raise StandardError, "cannot quantify regular expression with 'atLeast' because it's already being quantified with '#{current_frame[:quantifier][:type]}'"
469
+ end
470
+
471
+ current_frame[:quantifier] = t[:atLeast].call(n)
472
+ nxt
473
+ end
474
+
475
+ def between(x, y)
476
+ assert(x.is_a?(Integer) && x >= 0, "x must be an integer (got #{x})")
477
+ assert(y.is_a?(Integer) && y > 0, "y must be an integer greater than 0 (got #{y})")
478
+ assert(x < y, "x must be less than y (x = #{x}, y = #{y})")
479
+
480
+ nxt = clone
481
+ current_frame = nxt.get_current_frame
482
+ if current_frame[:quantifier]
483
+ raise StandardError, "cannot quantify regular expression with 'between' because it's already being quantified with '#{current_frame[:quantifier][:type]}'"
484
+ end
485
+
486
+ current_frame[:quantifier] = t[:between].call(x, y)
487
+ nxt
488
+ end
489
+
490
+ def between_lazy(x, y)
491
+ assert(x.is_a?(Integer) && x >= 0, "x must be an integer (got #{x})")
492
+ assert(y.is_a?(Integer) && y > 0, "y must be an integer greater than 0 (got #{y})")
493
+ assert(x < y, "x must be less than y (x = #{x}, y = #{y})")
494
+
495
+ n = clone
496
+ current_frame = n.get_current_frame
497
+ if current_frame[:quantifier]
498
+ raise StandardError, "cannot quantify regular expression with 'betweenLazy' because it's already being quantified with '#{current_frame[:quantifier][:type]}'"
499
+ end
500
+
501
+ current_frame[:quantifier] = t[:betweenLazy].call(x, y)
502
+ n
503
+ end
504
+
505
+ def start_of_input
506
+ assert(!state[:hasDefinedStart], 'This regex already has a defined start of input')
507
+ assert(!state[:hasDefinedEnd], 'Cannot define the start of input after the end of input')
508
+
509
+ n = clone
510
+ n.state[:hasDefinedStart] = true
511
+ n.get_current_element_array.push(t[:startOfInput])
512
+ n
513
+ end
514
+
515
+ def end_of_input
516
+ assert(!state[:hasDefinedEnd], 'This regex already has a defined end of input')
517
+
518
+ n = clone
519
+ n.state[:hasDefinedEnd] = true
520
+ n.get_current_element_array.push(t[:endOfInput])
521
+ n
522
+ end
523
+
524
+ def any_of_chars(s)
525
+ n = clone
526
+ element_value = t[:anyOfChars].call(escape_special(s))
527
+ current_frame = n.get_current_frame
528
+ current_frame[:elements].push(n.apply_quantifier(element_value))
529
+ n
530
+ end
531
+
532
+ def end
533
+ assert(state[:stack].length > 1, 'Cannot call end while building the root expression.')
534
+
535
+ n = clone
536
+ old_frame = n.state[:stack].pop
537
+ current_frame = n.get_current_frame
538
+ current_frame[:elements].push(n.apply_quantifier(old_frame[:type][:value].call(old_frame[:elements])))
539
+ n
540
+ end
541
+
542
+ def anything_but_string(str)
543
+ assert(str.is_a?(String), "str must be a string (got #{str})")
544
+ assert(str.length > 0, 'str must have at least one character')
545
+
546
+ n = clone
547
+ element_value = t[:anythingButString].call(escape_special(str))
548
+ current_frame = n.get_current_frame
549
+ current_frame[:elements].push(n.apply_quantifier(element_value))
550
+ n
551
+ end
552
+
553
+ def anything_but_chars(chars)
554
+ assert(chars.is_a?(String), "chars must be a string (got #{chars})")
555
+ assert(chars.length > 0, 'chars must have at least one character')
556
+
557
+ n = clone
558
+ element_value = t[:anythingButChars].call(escape_special(chars))
559
+ current_frame = n.get_current_frame
560
+ current_frame[:elements].push(n.apply_quantifier(element_value))
561
+ n
562
+ end
563
+
564
+ def anything_but_range(a, b)
565
+ str_a = a.to_s
566
+ str_b = b.to_s
567
+
568
+ assert(str_a.length === 1, "a must be a single character or number (got #{str_a})")
569
+ assert(str_b.length === 1, "b must be a single character or number (got #{str_b})")
570
+ assert(str_a[0].ord < str_b[0].ord, "a must have a smaller character value than b (a = #{str_a[0].ord}, b = #{str_b[0].ord})")
571
+
572
+ n = clone
573
+ element_value = t[:anythingButRange].call([a, b])
574
+ current_frame = n.get_current_frame
575
+ current_frame[:elements].push(n.apply_quantifier(element_value))
576
+ n
577
+ end
578
+
579
+ def string(str)
580
+ assert('' != str, 'str cannot be an empty string')
581
+ n = clone
582
+
583
+ element_value =
584
+ if str.length > 1
585
+ t[:string].call(escape_special(str))
586
+ else
587
+ t[:char].call(str)
588
+ end
589
+
590
+ current_frame = n.get_current_frame
591
+ current_frame[:elements].push(n.apply_quantifier(element_value))
592
+
593
+ n
594
+ end
595
+
596
+ def char(c)
597
+ assert(c.is_a?(String), "c must be a string (got #{c})")
598
+ assert(c.length == 1, "char() can only be called with a single character (got #{c})")
599
+
600
+ n = clone
601
+ current_frame = n.get_current_frame
602
+ current_frame[:elements].push(n.apply_quantifier(t[:char].call(escape_special(c))))
603
+ n
604
+ end
605
+
606
+ def range(a, b)
607
+ str_a = a.to_s
608
+ str_b = b.to_s
609
+
610
+ assert(str_a.length == 1, "a must be a single character or number (got #{str_a})")
611
+ assert(str_b.length == 1, "b must be a single character or number (got #{str_b})")
612
+ assert(str_a[0].ord < str_b[0].ord, "a must have a smaller character value than b (a = #{str_a[0].ord}, b = #{str_b[0].ord})")
613
+
614
+ n = clone
615
+ element_value = t[:range].call([str_a, str_b])
616
+ current_frame = n.get_current_frame
617
+
618
+ current_frame[:elements].push(n.apply_quantifier(element_value))
619
+ n
620
+ end
621
+
622
+ def merge_subexpression(el, options, parent, increment_capture_groups)
623
+ next_el = el.clone
624
+ next_el[:index] += parent.state[:totalCaptureGroups] if next_el[:type] == 'backreference'
625
+
626
+ increment_capture_groups.call if next_el[:type] == 'capture'
627
+
628
+ if next_el[:type] === 'namedCapture'
629
+ group_name =
630
+ if options[:namespace]
631
+ "#{options[:namespace]}#{next_el[:name]}"
632
+ else
633
+ next_el[:name]
634
+ end
635
+
636
+ parent.track_named_group(group_name)
637
+ next_el[:name] = group_name
638
+ end
639
+
640
+ if next_el[:type] == 'namedBackreference'
641
+ next_el[:name] =
642
+ if options[:namespace]
643
+ "#{options[:namespace]}#{next_el[:name]}"
644
+ else
645
+ next_el[:name]
646
+ end
647
+ end
648
+
649
+ if next_el[:containsChild]
650
+ next_el[:value] = merge_subexpression(
651
+ next_el[:value],
652
+ options,
653
+ parent,
654
+ increment_capture_groups
655
+ )
656
+ elsif next_el[:containsChildren]
657
+ next_el[:value] = next_el[:value].map do |e|
658
+ merge_subexpression(
659
+ e,
660
+ options,
661
+ parent,
662
+ increment_capture_groups
663
+ )
664
+ end
665
+ end
666
+
667
+ if next_el[:type] == 'startOfInput'
668
+
669
+ return @@t[:noop] if options[:ignoreStartAndEnd]
670
+
671
+ assert(
672
+ !parent.state[:hasDefinedStart],
673
+ 'The parent regex already has a defined start of input. ' +
674
+ "You can ignore a subexpression's startOfInput/endOfInput markers with the ignoreStartAndEnd option"
675
+ )
676
+
677
+ assert(
678
+ !parent.state[:hasDefinedEnd],
679
+ 'The parent regex already has a defined end of input. ' +
680
+ "You can ignore a subexpression's startOfInput/endOfInput markers with the ignoreStartAndEnd option"
681
+ )
682
+
683
+ parent.state[:hasDefinedStart] = true
684
+ end
685
+
686
+ if next_el[:type] == 'endOfInput'
687
+ return @@t[:noop] if options[:ignoreStartAndEnd]
688
+
689
+ assert(
690
+ !parent.state[:hasDefinedEnd],
691
+ 'The parent regex already has a defined end of input. ' +
692
+ 'You can ignore a subexpressions startOfInput/endOfInput markers with the ignoreStartAndEnd option'
693
+ )
694
+
695
+ parent.state[:hasDefinedEnd] = true
696
+ end
697
+ next_el
698
+ end
699
+
700
+ def apply_subexpression_defaults(expr)
701
+ out = {}.merge(expr)
702
+
703
+ out[:namespace] = out.fetch(:namespace, '')
704
+ out[:ignoreFlags] = out.fetch(:ignoreFlags, true)
705
+ out[:ignoreStartAndEnd] = out.fetch(:ignoreStartAndEnd, true)
706
+ assert(out[:namespace].is_a?(String), 'namespace must be a string')
707
+ assert(out[:ignoreFlags].is_a?(TrueClass) || out[:ignoreFlags].is_a?(FalseClass), 'ignoreFlags must be a boolean')
708
+ assert(out[:ignoreStartAndEnd].is_a?(TrueClass) || out[:ignoreStartAndEnd].is_a?(FalseClass), 'ignoreStartAndEnd must be a boolean')
709
+
710
+ out
711
+ end
712
+
713
+ def subexpression(expr, opts = {})
714
+ assert(expr.is_a?(SuperExpressiveRuby), 'expr must be a SuperExpressiveRuby instance')
715
+ assert(
716
+ expr.state[:stack].length == 1,
717
+ 'Cannot call subexpression with a not yet fully specified regex object.' +
718
+ "\n(Try adding a .end() call to match the '#{expr.get_current_frame[:type][:type]}' on the subexpression)\n"
719
+ )
720
+
721
+ options = apply_subexpression_defaults(opts)
722
+
723
+ expr_n = expr.clone
724
+ expr_n.state = expr.state.deep_dup
725
+ n = clone
726
+ additional_capture_groups = 0
727
+
728
+ expr_frame = expr_n.get_current_frame
729
+ closure = proc { additional_capture_groups += 1 }
730
+
731
+ expr_frame[:elements] = expr_frame[:elements].map do |e|
732
+ merge_subexpression(e, options, n, closure)
733
+ end
734
+
735
+ n.state[:totalCaptureGroups] += additional_capture_groups
736
+
737
+ unless options[:ignoreFlags]
738
+ expr_n.state[:flags].to_a.each do |e|
739
+ flag_name = e[0]
740
+ enabled = e[1]
741
+ n.state[:flags][flag_name] = enabled || n.state[:flags][flag_name]
742
+ end
743
+ end
744
+
745
+ current_frame = n.get_current_frame
746
+ current_frame[:elements].push(n.apply_quantifier(t[:subexpression].call(expr_frame[:elements])))
747
+ n
748
+ end
749
+
750
+ def to_regex_string
751
+ pattern, flags = get_regex_pattern_and_flags
752
+ Regexp.new(pattern, flags).to_s
753
+ end
754
+
755
+ def to_regex
756
+ pattern, flags = get_regex_pattern_and_flags
757
+ Regexp.new(pattern, flags)
758
+ end
759
+
760
+ def get_regex_pattern_and_flags
761
+ assert state[:stack].length == 1,
762
+ 'Cannot compute the value of a not yet fully specified regex object.' +
763
+ "\n(Try adding a .end() call to match the '#{get_current_frame[:type][:type]}')\n"
764
+ pattern = get_current_element_array.map { |el| self.class.evaluate(el) }.join('')
765
+ flag = nil
766
+ state[:flags].each do |name, is_on|
767
+ if is_on
768
+ flag ||= 0
769
+ case name
770
+ when :s
771
+ flag |= Regexp::MULTILINE
772
+ when :i
773
+ flag |= Regexp::IGNORECASE
774
+ when :x
775
+ flag |= Regexp::EXTENDED
776
+ end
777
+ end
778
+ end
779
+ pat = (pattern == '' ? '(?:)' : pattern)
780
+ [pat, flag]
781
+ end
782
+
783
+ def apply_quantifier(element)
784
+ current_frame = get_current_frame
785
+ if current_frame[:quantifier]
786
+ wrapped = current_frame[:quantifier][:value].call(element)
787
+ current_frame[:quantifier] = nil
788
+ return wrapped
789
+ end
790
+ element
791
+ end
792
+
793
+ def get_current_frame
794
+ state[:stack].last
795
+ end
796
+
797
+ def get_current_element_array
798
+ get_current_frame[:elements]
799
+ end
800
+
801
+ def assert(condition, message)
802
+ self.class.assert(condition, message)
803
+ raise StandardError, message unless condition
804
+ end
805
+
806
+ # generate camel case methods
807
+ public_instance_methods(false).each do |method_name|
808
+ camelized_method_name = camelize(method_name.to_s)
809
+ alias_method camelized_method_name, method_name
810
+ end
811
+ end
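The flag handling in get_regex_pattern_and_flags above can be read as a small fold over the flags hash. The following is a minimal standalone sketch of that idea, not the gem's own code: the names FLAG_BITS and flags_to_options are hypothetical, and it assumes the flags hash uses the symbol keys :s, :i and :x seen above.

```ruby
# Hypothetical lookup table (not part of the gem): flag name => Regexp option bit.
# As in the method above, :s maps to Regexp::MULTILINE, which in Ruby makes
# `.` match newlines.
FLAG_BITS = {
  s: Regexp::MULTILINE,
  i: Regexp::IGNORECASE,
  x: Regexp::EXTENDED
}.freeze

# Hypothetical helper: returns nil when no flag is enabled, otherwise the
# bitwise OR of the option bits of every enabled flag. Unknown flag names
# contribute nothing.
def flags_to_options(flags)
  enabled = flags.select { |_name, is_on| is_on }.keys
  return nil if enabled.empty?

  enabled.reduce(0) { |bits, name| bits | FLAG_BITS.fetch(name, 0) }
end

# Usage: build a regex whose `.` also matches a newline.
options = flags_to_options(s: true, i: false, x: false)
dotall  = Regexp.new('a.b', options)
```

Returning nil rather than 0 when nothing is enabled mirrors the original method, which only initialises its accumulator once it sees an enabled flag.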
@@ -0,0 +1,5 @@
1
+ module SuperExpressive
2
+ module Ruby
3
+ VERSION = "0.1.0"
4
+ end
5
+ end
@@ -0,0 +1,30 @@
1
+ require_relative 'lib/super-expressive-ruby/version'
2
+
3
+ Gem::Specification.new do |spec|
4
+ spec.name = "super-expressive-ruby"
5
+ spec.version = SuperExpressive::Ruby::VERSION
6
+ spec.authors = ["Hiroshi Yamasaki"]
7
+ spec.email = ["ymskhrs@gmail.com"]
8
+
9
+ spec.summary = "Build regular expressions in almost natural language"
10
+ spec.description = "This gem is a port of https://github.com/francisrstokes/super-expressive"
11
+ spec.homepage = "https://github.com/hiy/super-expressive-ruby"
12
+ spec.license = "MIT"
13
+ spec.required_ruby_version = Gem::Requirement.new(">= 2.3.0")
14
+
15
+ spec.metadata["allowed_push_host"] = "https://rubygems.org"
16
+
17
+ spec.metadata["homepage_uri"] = spec.homepage
18
+ spec.metadata["source_code_uri"] = "https://github.com/hiy/super-expressive-ruby"
19
+ spec.metadata["changelog_uri"] = "https://github.com/hiy/super-expressive-ruby"
20
+
21
+ # Specify which files should be added to the gem when it is released.
22
+ # `git ls-files -z` lists the files tracked by git; those become the gem's file list.
23
+ spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
24
+ `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
25
+ end
26
+ spec.bindir = "exe"
27
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
28
+ spec.require_paths = ["lib"]
29
+ spec.add_dependency "activesupport", '6.0'
30
+ end
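The add_dependency line above passes '6.0' with no operator, which RubyGems treats as an exact (=) requirement. As a hedged illustration of what that accepts, the sketch below compares it with a pessimistic ~> 6.0 constraint. The alternative constraint is only an example for comparison, not something the gemspec declares.

```ruby
require 'rubygems'

exact       = Gem::Requirement.new('6.0')     # no operator given, so this means '= 6.0'
pessimistic = Gem::Requirement.new('~> 6.0')  # illustrative alternative only

# Print, for a few sample versions, whether each requirement is satisfied.
%w[6.0.0 6.0.3 6.1.0 7.0.0].each do |v|
  version = Gem::Version.new(v)
  puts format('%-6s exact=%-5s pessimistic=%s', v,
              exact.satisfied_by?(version), pessimistic.satisfied_by?(version))
end
```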
metadata ADDED
@@ -0,0 +1,75 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: super-expressive-ruby
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Hiroshi Yamasaki
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2020-10-27 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: activesupport
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - '='
18
+ - !ruby/object:Gem::Version
19
+ version: '6.0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - '='
25
+ - !ruby/object:Gem::Version
26
+ version: '6.0'
27
+ description: This gem is a port of https://github.com/francisrstokes/super-expressive
28
+ email:
29
+ - ymskhrs@gmail.com
30
+ executables: []
31
+ extensions: []
32
+ extra_rdoc_files: []
33
+ files:
34
+ - ".circleci/config.yml"
35
+ - ".gitignore"
36
+ - ".rspec"
37
+ - CODE_OF_CONDUCT.md
38
+ - Gemfile
39
+ - LICENSE.txt
40
+ - README.md
41
+ - Rakefile
42
+ - bin/console
43
+ - bin/setup
44
+ - lib/super-expressive-ruby.rb
45
+ - lib/super-expressive-ruby/super-expressive-ruby.rb
46
+ - lib/super-expressive-ruby/version.rb
47
+ - super-expressive-ruby.gemspec
48
+ homepage: https://github.com/hiy/super-expressive-ruby
49
+ licenses:
50
+ - MIT
51
+ metadata:
52
+ allowed_push_host: https://rubygems.org
53
+ homepage_uri: https://github.com/hiy/super-expressive-ruby
54
+ source_code_uri: https://github.com/hiy/super-expressive-ruby
55
+ changelog_uri: https://github.com/hiy/super-expressive-ruby
56
+ post_install_message:
57
+ rdoc_options: []
58
+ require_paths:
59
+ - lib
60
+ required_ruby_version: !ruby/object:Gem::Requirement
61
+ requirements:
62
+ - - ">="
63
+ - !ruby/object:Gem::Version
64
+ version: 2.3.0
65
+ required_rubygems_version: !ruby/object:Gem::Requirement
66
+ requirements:
67
+ - - ">="
68
+ - !ruby/object:Gem::Version
69
+ version: '0'
70
+ requirements: []
71
+ rubygems_version: 3.1.2
72
+ signing_key:
73
+ specification_version: 4
74
+ summary: Build regular expressions in almost natural language
75
+ test_files: []