regexp_parser 1.1.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 467bef34ff29198ccbde0063f1b2f62f03cf8237c46fe59be8f06e974c00c367
-   data.tar.gz: b876e662f889449954f0beeddedded8e462b4daec5fbb4692b65f9f7a012fec8
+   metadata.gz: 20ba21704667276107a1041b3bb5943bbbec0078f706cf0d7db85110631dfe8d
+   data.tar.gz: 87886f6cad480ebc62f3e1f243d9b61170097e5419fc8b3972cd3348e5d8d7e0
  SHA512:
-   metadata.gz: b2064de034cf83f157da79225fc587374f744884f37934fd1828e4f3127bc45f899d62986495d9344fd4a5f2e96f7b2cf90c7961c0895fd0ee52ab8e24428d67
-   data.tar.gz: 146440a8fc9e2c48bb1bdedad539f6883ecbc5874276eb3e56876df7b74e42630ec789f6cc6b4ca7e79d5e6b2e986b5a2f5b2fc0be4d273c02d8ff62fa2cd35c
+   metadata.gz: '0678640973741b2ea63053c058809fa075b3b465756bddee9a1914f67f7181a3681d3592662d4eadf5a60e844c550950b371577239924c4d3ce7f07f9fdfefa6'
+   data.tar.gz: 3bf18d0d7989c1f9eef010d1579ac78537c6c083c9b7c7c2f0cda094c0f973e1fdcc17c5992ae35d823720d2cdb10a60424876e08bd4b2b60b125c8b107a62bf
@@ -1,3 +1,16 @@
+ ## [1.2.0] - 2018-09-28 - [Janosch Müller](mailto:janosch84@gmail.com)
+
+ ### Added
+
+ - `Subexpression` (branch node) includes `Enumerable`, allowing to `#select` children etc.
+
+ ### Fixed
+
+ - Fixed missing quantifier in `Conditional::Expression` methods `#to_s`, `#to_re`
+ - `Conditional::Condition` no longer lives outside the recursive `#expressions` tree
+   - it used to be the only expression stored in a custom ivar, complicating traversal
+   - its setter and getter (`#condition=`, `#condition`) still work as before
+
  ## [1.1.0] - 2018-09-17 - [Janosch Müller](mailto:janosch84@gmail.com)

  ### Added
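
Note on the "Added" entry above: including `Enumerable` only requires the class to define `#each`; methods such as `#select`, `#map` and `#each_with_index` are then derived from it. The following is a minimal sketch of that mechanism, assuming a hypothetical `BranchNode` class that stands in for the gem's branch node. It is not the gem's code, and the string children are placeholders.

```ruby
# Hypothetical stand-in for a branch node; not the gem's actual class.
class BranchNode
  include Enumerable

  attr_reader :expressions

  def initialize(*expressions)
    @expressions = expressions
  end

  # Enumerable only needs #each; the other iteration methods build on it.
  def each(&block)
    expressions.each(&block)
  end
end

node = BranchNode.new('a', 'bc', 'd')

# #select is provided by Enumerable, not written by hand.
short = node.select { |child| child.length == 1 }
puts short.inspect

# Without a block, #each_with_index returns an Enumerator that can be chained.
labelled = node.each_with_index.map { |child, i| "#{i}:#{child}" }
puts labelled.inspect
```

Delegating only `#each` keeps the hand-written surface small, which matches the shorter delegation list in the `Subexpression` hunk further down.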
data/README.md CHANGED
@@ -2,14 +2,14 @@

  [![Gem Version](https://badge.fury.io/rb/regexp_parser.svg)](http://badge.fury.io/rb/regexp_parser) [![Build Status](https://secure.travis-ci.org/ammar/regexp_parser.svg?branch=master)](http://travis-ci.org/ammar/regexp_parser) [![Code Climate](https://codeclimate.com/github/ammar/regexp_parser.svg)](https://codeclimate.com/github/ammar/regexp_parser/badges)

- A ruby gem for tokenizing, parsing, and transforming regular expressions.
+ A Ruby gem for tokenizing, parsing, and transforming regular expressions.

  * Multilayered
- * A scanner/tokenizer based on [ragel](http://www.colm.net/open-source/ragel/)
+ * A scanner/tokenizer based on [Ragel](http://www.colm.net/open-source/ragel/)
  * A lexer that produces a "stream" of token objects.
  * A parser that produces a "tree" of Expression objects (OO API)
- * Runs on ruby 1.9, 2.x, and jruby (1.9 mode) runtimes.
- * Recognizes ruby 1.8, 1.9, and 2.x regular expressions [See Supported Syntax](#supported-syntax)
+ * Runs on Ruby 1.9, 2.x, and JRuby (1.9 mode) runtimes.
+ * Recognizes Ruby 1.8, 1.9, and 2.x regular expressions [See Supported Syntax](#supported-syntax)


  _For examples of regexp_parser in use, see [Example Projects](#example-projects)._
@@ -46,7 +46,7 @@ The three main modules are **Scanner**, **Lexer**, and **Parser**. Each of them
  provides a single method that takes a regular expression (as a RegExp object or
  a string) and returns its results. The **Lexer** and the **Parser** accept an
  optional second argument that specifies the syntax version, like 'ruby/2.0',
- which defaults to the host ruby version (using RUBY_VERSION).
+ which defaults to the host Ruby version (using RUBY_VERSION).

  Here are the basic usage examples:

@@ -77,7 +77,7 @@ called with the results as follows:
  ## Components

  ### Scanner
- A ragel generated scanner that recognizes the cumulative syntax of all
+ A Ragel-generated scanner that recognizes the cumulative syntax of all
  supported syntax versions. It breaks a given expression's text into the
  smallest parts, and identifies their type, token, text, and start/end
  offsets within the pattern.
@@ -123,7 +123,7 @@ Regexp::Scanner.scan( /(cat?([bhm]at)){3,5}/ ).map {|token| token[2]}
  balancing punctuation and premature end of pattern. Flavor validity checks
  are performed in the lexer, which uses a syntax object.

- * If the input is a ruby **Regexp** object, the scanner calls #source on it to
+ * If the input is a Ruby **Regexp** object, the scanner calls #source on it to
  get its string representation. #source does not include the options of
  the expression (m, i, and x). To include the options in the scan, #to_s
  should be called on the **Regexp** before passing it to the scanner or the
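
Note on the README text in the hunk above: the difference between `#source` and `#to_s` can be seen with plain Ruby, without the gem. This small sketch uses only core `Regexp` methods; the pattern itself is an arbitrary example.

```ruby
pattern = /cat?/mi

# Regexp#source returns only the pattern text; the m/i/x flags are dropped.
source_text = pattern.source
puts source_text

# Regexp#to_s wraps the pattern in an option group, so the flags survive.
with_options = pattern.to_s
puts with_options

# Rebuilding from #to_s keeps case-insensitivity; rebuilding from #source does not.
puts Regexp.new(with_options) =~ 'CAT' ? 'to_s keeps /i' : 'to_s lost /i'
puts Regexp.new(source_text) =~ 'CAT' ? 'source keeps /i' : 'source lost /i'
```

Passing the `#to_s` form is therefore the way to make the options visible to anything that only sees the pattern as a string.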
@@ -188,7 +188,7 @@ ruby_18.implements? :conditional, :condition # => false
  Sits on top of the scanner and performs lexical analysis on the tokens that
  it emits. Among its tasks are; breaking quantified literal runs, collecting the
  emitted token attributes into Token objects, calculating their nesting depth,
- normalizing tokens for the parser, and checkng if the tokens are implemented by
+ normalizing tokens for the parser, and checking if the tokens are implemented by
  the given syntax version.

  See the [Token Objects](https://github.com/ammar/regexp_parser/wiki/Token-Objects)
@@ -196,7 +196,7 @@ wiki page for more information on Token objects.


  #### Example
- The following example lexes the given pattern, checks it against the ruby 1.9
+ The following example lexes the given pattern, checks it against the Ruby 1.9
  syntax, and prints the token objects' text indented to their level.

  ```ruby
@@ -224,7 +224,7 @@ end

  A one-liner that returns an array of the textual parts of the given pattern.
  Compare the output with that of the one-liner example of the **Scanner**; notably
- how the sequence 'cat' is treated. The 't' is seperated because it's followed
+ how the sequence 'cat' is treated. The 't' is separated because it's followed
  by a quantifier that only applies to it.

  ```ruby
@@ -233,7 +233,7 @@ Regexp::Lexer.scan( /(cat?([b]at)){3,5}/ ).map {|token| token.text}
  ```

  #### Notes
- * The syntax argument is optional. It defaults to the version of the ruby
+ * The syntax argument is optional. It defaults to the version of the Ruby
  interpreter in use, as returned by RUBY_VERSION.

  * The lexer normalizes some tokens, as noted in the Syntax section above.
@@ -308,8 +308,8 @@ Expression class. See the next section for details._


  ## Supported Syntax
- The three modules support all the regular expression syntax features of Ruby 1.8
- , 1.9, and 2.x:
+ The three modules support all the regular expression syntax features of Ruby 1.8,
+ 1.9, and 2.x:

  _Note that not all of these are available in all versions of Ruby_

@@ -318,7 +318,7 @@ _Note that not all of these are available in all versions of Ruby_
  | ------------------------------------- | ------------------------------------------------------- |:--------:|
  | **Alternation** | `a\|b\|c` | ✓ |
  | **Anchors** | `\A`, `^`, `\b` | ✓ |
- | **Character Classes** | `[abc]`, `[^\\]`, `[a-d&&g-h]`, `[a=e=b]` | ✓ |
+ | **Character Classes** | `[abc]`, `[^\\]`, `[a-d&&aeiou]`, `[a=e=b]` | ✓ |
  | **Character Types** | `\d`, `\H`, `\s` | ✓ |
  | **Cluster Types** | `\R`, `\X` | ✓ |
  | **Conditional Exps.** | `(?(cond)yes-subexp)`, `(?(cond)yes-subexp\|no-subexp)` | ✓ |
@@ -362,9 +362,9 @@ _Note that not all of these are available in all versions of Ruby_
  |   _**Blocks**_ | `\p{InArmenian}`, `\P{InKhmer}`, `\p{^InThai}` | ✓ |
  |   _**Classes**_ | `\p{Alpha}`, `\P{Space}`, `\p{^Alnum}` | ✓ |
  |   _**Derived**_ | `\p{Math}`, `\P{Lowercase}`, `\p{^Cased}` | ✓ |
- |   _**General Categories**_ | `\p{Lu}`, `\P{Cs}`, \p{^sc} | ✓ |
- |   _**Scripts**_ | `\p{Arabic}`, `\P{Hiragana}`, \p{^Greek} | ✓ |
- |   _**Simple**_ | `\p{Dash}`, `\p{Extender}`, \p{^Hyphen} | ✓ |
+ |   _**General Categories**_ | `\p{Lu}`, `\P{Cs}`, `\p{^sc}` | ✓ |
+ |   _**Scripts**_ | `\p{Arabic}`, `\P{Hiragana}`, `\p{^Greek}` | ✓ |
+ |   _**Simple**_ | `\p{Dash}`, `\p{Extender}`, `\p{^Hyphen}` | ✓ |

  ##### Inapplicable Features

@@ -389,9 +389,9 @@ or incorrectly return tokens/objects as literals._
  ## Testing
  To run the tests simply run rake from the root directory, as 'test' is the default task.

- It generates the scanner's code from the ragel source files and runs all the tests, thus it requires ragel to be installed.
+ It generates the scanner's code from the Ragel source files and runs all the tests, thus it requires Ragel to be installed.

- The tests use ruby's test/unit. They can also be run with:
+ The tests use Ruby's test/unit. They can also be run with:

  ```
  bin/test
@@ -409,16 +409,16 @@ It is sometimes helpful during development to focus on a specific test case, for
  bin/test test/expression/test_base.rb -n test_expression_to_re
  ```

- Note that changes to ragel files will not be reflected when using `bin/test`, so you might want to run:
+ Note that changes to Ragel files will not be reflected when using `bin/test`, so you might want to run:

  ```
  rake ragel:rb && bin/test test/scanner/test_properties.rb
  ```

  ## Building
- Building the scanner and the gem requires [ragel](http://www.colm.net/open-source/ragel/) to be
+ Building the scanner and the gem requires [Ragel](http://www.colm.net/open-source/ragel/) to be
  installed. The build tasks will automatically invoke the 'ragel:rb' task to generate the
- ruby scanner code.
+ Ruby scanner code.


  The project uses the standard rubygems package tasks, so:
@@ -127,7 +127,7 @@ module Regexp::Expression
      end
      alias :=~ :match

-     def to_h
+     def attributes
        {
          type: type,
          token: token,
@@ -141,6 +141,7 @@ module Regexp::Expression
          quantifier: quantified? ? quantifier.to_h : nil,
        }
      end
+     alias :to_h :attributes
    end

    def self.parsed(exp)
@@ -18,13 +18,6 @@ module Regexp::Expression
      class Branch < Regexp::Expression::Sequence; end

      class Expression < Regexp::Expression::Subexpression
-       attr_reader :condition
-
-       def condition=(exp)
-         @condition = exp
-         expressions << exp
-       end
-
        def <<(exp)
          expressions.last << exp
        end
@@ -35,16 +28,25 @@ module Regexp::Expression
        end
        alias :branch :add_sequence

+       def condition=(exp)
+         expressions.delete(condition)
+         expressions.unshift(exp)
+       end
+
+       def condition
+         find { |subexp| subexp.is_a?(Condition) }
+       end
+
        def branches
-         expressions - [condition]
+         select { |subexp| subexp.is_a?(Sequence) }
        end

        def reference
          condition.reference
        end

-       def to_s(_format = :full)
-         text + condition.text + branches.join('|') + ')'
+       def to_s(format = :full)
+         "#{text}#{condition}#{branches.join('|')})#{quantifier_affix(format)}"
        end
      end
    end
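
Note on the hunk above: the condition is no longer held in its own instance variable; it is looked up by type among the children, and the setter replaces any earlier condition and moves the new one to the front. Below is a minimal standalone sketch of that pattern. `Leaf`, `ConditionalNode` and the explicit `quantifier` argument are hypothetical simplifications (the gem's `to_s` takes a format and uses `quantifier_affix`), so this illustrates the idea rather than the gem's implementation.

```ruby
# Hypothetical stand-ins; the names echo the diff but this is not the gem's code.
class Leaf
  attr_reader :text

  def initialize(text)
    @text = text
  end

  def to_s
    text
  end
end

class Condition < Leaf; end
class Sequence < Leaf; end

class ConditionalNode
  include Enumerable

  attr_reader :expressions

  def initialize
    @expressions = []
  end

  def each(&block)
    expressions.each(&block)
  end

  # Drop the previous condition (if any) and keep the new one first in the list.
  def condition=(exp)
    expressions.delete(condition)
    expressions.unshift(exp)
  end

  # The condition is found by type instead of being read from an ivar.
  def condition
    find { |child| child.is_a?(Condition) }
  end

  def branches
    select { |child| child.is_a?(Sequence) }
  end

  # Assumption: the quantifier text is passed in directly to keep the sketch short.
  def to_s(quantifier = '')
    "(?#{condition}#{branches.join('|')})#{quantifier}"
  end
end

node = ConditionalNode.new
node.expressions << Sequence.new('\d') << Sequence.new('(\w)')
node.condition = Condition.new('(9)')
node.condition = Condition.new('(1)') # replaces the first condition
puts node.to_s('{42}')
```

Because the condition is an ordinary child, a generic traversal over `expressions` now visits it without any special casing, and appending the quantifier text in `to_s` is what produces the `{42}` suffix checked in the test hunk below.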
@@ -1,6 +1,8 @@
  module Regexp::Expression

    class Subexpression < Regexp::Expression::Base
+     include Enumerable
+
      attr_accessor :expressions

      def initialize(token, options = {})
@@ -24,8 +26,7 @@ module Regexp::Expression
        end
      end

-     %w[[] all? any? at collect count each each_with_index empty?
-        fetch find first index join last length map values_at].each do |method|
+     %w[[] at each empty? fetch index join last length values_at].each do |method|
        class_eval <<-RUBY, __FILE__, __LINE__ + 1
          def #{method}(*args, &block)
            expressions.#{method}(*args, &block)
@@ -51,7 +52,7 @@ module Regexp::Expression
      end

      def to_h
-       super.merge({
+       attributes.merge({
          text: to_s(:base),
          expressions: expressions.map(&:to_h)
        })
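
Note on the `super` to `attributes` change above: the diff does not state the reason, but one plausible reading is that `include Enumerable` places `Enumerable` between the subclass and its parent in the ancestor chain, so `super` inside `#to_h` would reach `Enumerable#to_h` instead of the parent's method. The sketch below reproduces that effect with hypothetical classes (`Base`, `NaiveBranch`, `FixedBranch`); it is an illustration of Ruby's method lookup, not the gem's code.

```ruby
# Hypothetical classes that mimic the shape of the change; not the gem's code.
class Base
  def attributes
    { type: :group }
  end
  alias to_h attributes
end

class NaiveBranch < Base
  include Enumerable

  def each(&block)
    [1, 2].each(&block)
  end

  # super resolves to Enumerable#to_h here, which expects [key, value] pairs.
  def to_h
    super.merge(children: to_a)
  end
end

class FixedBranch < Base
  include Enumerable

  def each(&block)
    [1, 2].each(&block)
  end

  # Calling the renamed method bypasses Enumerable#to_h entirely.
  def to_h
    attributes.merge(children: to_a)
  end
end

puts NaiveBranch.ancestors.take(3).inspect

begin
  NaiveBranch.new.to_h
rescue TypeError => e
  puts "NaiveBranch#to_h raised #{e.class}"
end

puts FixedBranch.new.to_h.inspect
```

Keeping `to_h` as an alias of `attributes` on the parent means existing callers of `to_h` on leaf nodes see no change.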
@@ -1,5 +1,5 @@
  class Regexp
    class Parser
-     VERSION = '1.1.0'
+     VERSION = '1.2.0'
    end
  end
@@ -157,7 +157,8 @@ class TestParserConditionals < Test::Unit::TestCase
      conditional = root[1]

      assert conditional.quantified?
-     assert_equal '{42}', conditional.quantifier.text
+     assert_equal '{42}', conditional.quantifier.text
+     assert_equal '(?(1)\d|(\w)){42}', conditional.to_s
      refute conditional.branches.any?(&:quantified?)
    end

metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: regexp_parser
  version: !ruby/object:Gem::Version
-   version: 1.1.0
+   version: 1.2.0
  platform: ruby
  authors:
  - Ammar Ali
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2018-09-17 00:00:00.000000000 Z
+ date: 2018-09-28 00:00:00.000000000 Z
  dependencies: []
  description: A library for tokenizing, lexing, and parsing Ruby regular expressions.
  email: