regexp_parser 1.1.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 467bef34ff29198ccbde0063f1b2f62f03cf8237c46fe59be8f06e974c00c367
- data.tar.gz: b876e662f889449954f0beeddedded8e462b4daec5fbb4692b65f9f7a012fec8
+ metadata.gz: 20ba21704667276107a1041b3bb5943bbbec0078f706cf0d7db85110631dfe8d
+ data.tar.gz: 87886f6cad480ebc62f3e1f243d9b61170097e5419fc8b3972cd3348e5d8d7e0
  SHA512:
- metadata.gz: b2064de034cf83f157da79225fc587374f744884f37934fd1828e4f3127bc45f899d62986495d9344fd4a5f2e96f7b2cf90c7961c0895fd0ee52ab8e24428d67
- data.tar.gz: 146440a8fc9e2c48bb1bdedad539f6883ecbc5874276eb3e56876df7b74e42630ec789f6cc6b4ca7e79d5e6b2e986b5a2f5b2fc0be4d273c02d8ff62fa2cd35c
+ metadata.gz: '0678640973741b2ea63053c058809fa075b3b465756bddee9a1914f67f7181a3681d3592662d4eadf5a60e844c550950b371577239924c4d3ce7f07f9fdfefa6'
+ data.tar.gz: 3bf18d0d7989c1f9eef010d1579ac78537c6c083c9b7c7c2f0cda094c0f973e1fdcc17c5992ae35d823720d2cdb10a60424876e08bd4b2b60b125c8b107a62bf
@@ -1,3 +1,16 @@
+ ## [1.2.0] - 2018-09-28 - [Janosch Müller](mailto:janosch84@gmail.com)
+
+ ### Added
+
+ - `Subexpression` (branch node) includes `Enumerable`, allowing to `#select` children etc.
+
+ ### Fixed
+
+ - Fixed missing quantifier in `Conditional::Expression` methods `#to_s`, `#to_re`
+ - `Conditional::Condition` no longer lives outside the recursive `#expressions` tree
+ - it used to be the only expression stored in a custom ivar, complicating traversal
+ - its setter and getter (`#condition=`, `#condition`) still work as before
+
  ## [1.1.0] - 2018-09-17 - [Janosch Müller](mailto:janosch84@gmail.com)

  ### Added
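
The "Added" entry for 1.2.0 says a branch node now includes `Enumerable`. As a rough illustration of what that buys, here is a minimal sketch using hypothetical stand-in classes (`Leaf` and `Node` are invented for this example and are not the gem's real classes). It assumes only that the node exposes its children through `#each`, which is all `Enumerable` requires.

```ruby
# Hypothetical stand-ins for illustration only; not regexp_parser classes.
class Leaf
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

class Node
  include Enumerable

  attr_reader :expressions

  def initialize(*expressions)
    @expressions = expressions
  end

  # Enumerable derives #select, #map, #count, #find etc. from #each alone.
  def each(&block)
    expressions.each(&block)
  end
end

node = Node.new(Leaf.new('a'), Node.new(Leaf.new('b')), Leaf.new('c'))

# Only direct children are visited; the nested Node is skipped by the filter.
leaves = node.select { |child| child.is_a?(Leaf) }
puts leaves.map(&:text).join(',')
puts node.count
```

With a single `#each` in place, a long list of hand-delegated collection methods becomes unnecessary, which appears to be why the delegation list in the later hunk shrinks.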
data/README.md CHANGED
@@ -2,14 +2,14 @@

  [![Gem Version](https://badge.fury.io/rb/regexp_parser.svg)](http://badge.fury.io/rb/regexp_parser) [![Build Status](https://secure.travis-ci.org/ammar/regexp_parser.svg?branch=master)](http://travis-ci.org/ammar/regexp_parser) [![Code Climate](https://codeclimate.com/github/ammar/regexp_parser.svg)](https://codeclimate.com/github/ammar/regexp_parser/badges)

- A ruby gem for tokenizing, parsing, and transforming regular expressions.
+ A Ruby gem for tokenizing, parsing, and transforming regular expressions.

  * Multilayered
- * A scanner/tokenizer based on [ragel](http://www.colm.net/open-source/ragel/)
+ * A scanner/tokenizer based on [Ragel](http://www.colm.net/open-source/ragel/)
  * A lexer that produces a "stream" of token objects.
  * A parser that produces a "tree" of Expression objects (OO API)
- * Runs on ruby 1.9, 2.x, and jruby (1.9 mode) runtimes.
- * Recognizes ruby 1.8, 1.9, and 2.x regular expressions [See Supported Syntax](#supported-syntax)
+ * Runs on Ruby 1.9, 2.x, and JRuby (1.9 mode) runtimes.
+ * Recognizes Ruby 1.8, 1.9, and 2.x regular expressions [See Supported Syntax](#supported-syntax)


  _For examples of regexp_parser in use, see [Example Projects](#example-projects)._
@@ -46,7 +46,7 @@ The three main modules are **Scanner**, **Lexer**, and **Parser**. Each of them
  provides a single method that takes a regular expression (as a RegExp object or
  a string) and returns its results. The **Lexer** and the **Parser** accept an
  optional second argument that specifies the syntax version, like 'ruby/2.0',
- which defaults to the host ruby version (using RUBY_VERSION).
+ which defaults to the host Ruby version (using RUBY_VERSION).

  Here are the basic usage examples:

@@ -77,7 +77,7 @@ called with the results as follows:
  ## Components

  ### Scanner
- A ragel generated scanner that recognizes the cumulative syntax of all
+ A Ragel-generated scanner that recognizes the cumulative syntax of all
  supported syntax versions. It breaks a given expression's text into the
  smallest parts, and identifies their type, token, text, and start/end
  offsets within the pattern.
@@ -123,7 +123,7 @@ Regexp::Scanner.scan( /(cat?([bhm]at)){3,5}/ ).map {|token| token[2]}
  balancing punctuation and premature end of pattern. Flavor validity checks
  are performed in the lexer, which uses a syntax object.

- * If the input is a ruby **Regexp** object, the scanner calls #source on it to
+ * If the input is a Ruby **Regexp** object, the scanner calls #source on it to
  get its string representation. #source does not include the options of
  the expression (m, i, and x). To include the options in the scan, #to_s
  should be called on the **Regexp** before passing it to the scanner or the
@@ -188,7 +188,7 @@ ruby_18.implements? :conditional, :condition # => false
  Sits on top of the scanner and performs lexical analysis on the tokens that
  it emits. Among its tasks are; breaking quantified literal runs, collecting the
  emitted token attributes into Token objects, calculating their nesting depth,
- normalizing tokens for the parser, and checkng if the tokens are implemented by
+ normalizing tokens for the parser, and checking if the tokens are implemented by
  the given syntax version.

  See the [Token Objects](https://github.com/ammar/regexp_parser/wiki/Token-Objects)
@@ -196,7 +196,7 @@ wiki page for more information on Token objects.


  #### Example
- The following example lexes the given pattern, checks it against the ruby 1.9
+ The following example lexes the given pattern, checks it against the Ruby 1.9
  syntax, and prints the token objects' text indented to their level.

  ```ruby
@@ -224,7 +224,7 @@ end

  A one-liner that returns an array of the textual parts of the given pattern.
  Compare the output with that of the one-liner example of the **Scanner**; notably
- how the sequence 'cat' is treated. The 't' is seperated because it's followed
+ how the sequence 'cat' is treated. The 't' is separated because it's followed
  by a quantifier that only applies to it.

  ```ruby
@@ -233,7 +233,7 @@ Regexp::Lexer.scan( /(cat?([b]at)){3,5}/ ).map {|token| token.text}
  ```

  #### Notes
- * The syntax argument is optional. It defaults to the version of the ruby
+ * The syntax argument is optional. It defaults to the version of the Ruby
  interpreter in use, as returned by RUBY_VERSION.

  * The lexer normalizes some tokens, as noted in the Syntax section above.
@@ -308,8 +308,8 @@ Expression class. See the next section for details._


  ## Supported Syntax
- The three modules support all the regular expression syntax features of Ruby 1.8
- , 1.9, and 2.x:
+ The three modules support all the regular expression syntax features of Ruby 1.8,
+ 1.9, and 2.x:

  _Note that not all of these are available in all versions of Ruby_

@@ -318,7 +318,7 @@ _Note that not all of these are available in all versions of Ruby_
  | ------------------------------------- | ------------------------------------------------------- |:--------:|
  | **Alternation** | `a\|b\|c` | ✓ |
  | **Anchors** | `\A`, `^`, `\b` | ✓ |
- | **Character Classes** | `[abc]`, `[^\\]`, `[a-d&&g-h]`, `[a=e=b]` | ✓ |
+ | **Character Classes** | `[abc]`, `[^\\]`, `[a-d&&aeiou]`, `[a=e=b]` | ✓ |
  | **Character Types** | `\d`, `\H`, `\s` | ✓ |
  | **Cluster Types** | `\R`, `\X` | ✓ |
  | **Conditional Exps.** | `(?(cond)yes-subexp)`, `(?(cond)yes-subexp\|no-subexp)` | ✓ |
@@ -362,9 +362,9 @@ _Note that not all of these are available in all versions of Ruby_
  |   _**Blocks**_ | `\p{InArmenian}`, `\P{InKhmer}`, `\p{^InThai}` | ✓ |
  |   _**Classes**_ | `\p{Alpha}`, `\P{Space}`, `\p{^Alnum}` | ✓ |
  |   _**Derived**_ | `\p{Math}`, `\P{Lowercase}`, `\p{^Cased}` | ✓ |
- |   _**General Categories**_ | `\p{Lu}`, `\P{Cs}`, \p{^sc} | ✓ |
- |   _**Scripts**_ | `\p{Arabic}`, `\P{Hiragana}`, \p{^Greek} | ✓ |
- |   _**Simple**_ | `\p{Dash}`, `\p{Extender}`, \p{^Hyphen} | ✓ |
+ |   _**General Categories**_ | `\p{Lu}`, `\P{Cs}`, `\p{^sc}` | ✓ |
+ |   _**Scripts**_ | `\p{Arabic}`, `\P{Hiragana}`, `\p{^Greek}` | ✓ |
+ |   _**Simple**_ | `\p{Dash}`, `\p{Extender}`, `\p{^Hyphen}` | ✓ |

  ##### Inapplicable Features

@@ -389,9 +389,9 @@ or incorrectly return tokens/objects as literals._
  ## Testing
  To run the tests simply run rake from the root directory, as 'test' is the default task.

- It generates the scanner's code from the ragel source files and runs all the tests, thus it requires ragel to be installed.
+ It generates the scanner's code from the Ragel source files and runs all the tests, thus it requires Ragel to be installed.

- The tests use ruby's test/unit. They can also be run with:
+ The tests use Ruby's test/unit. They can also be run with:

  ```
  bin/test
@@ -409,16 +409,16 @@ It is sometimes helpful during development to focus on a specific test case, for
  bin/test test/expression/test_base.rb -n test_expression_to_re
  ```

- Note that changes to ragel files will not be reflected when using `bin/test`, so you might want to run:
+ Note that changes to Ragel files will not be reflected when using `bin/test`, so you might want to run:

  ```
  rake ragel:rb && bin/test test/scanner/test_properties.rb
  ```

  ## Building
- Building the scanner and the gem requires [ragel](http://www.colm.net/open-source/ragel/) to be
+ Building the scanner and the gem requires [Ragel](http://www.colm.net/open-source/ragel/) to be
  installed. The build tasks will automatically invoke the 'ragel:rb' task to generate the
- ruby scanner code.
+ Ruby scanner code.


  The project uses the standard rubygems package tasks, so:
@@ -127,7 +127,7 @@ module Regexp::Expression
  end
  alias :=~ :match

- def to_h
+ def attributes
  {
  type: type,
  token: token,
@@ -141,6 +141,7 @@ module Regexp::Expression
  quantifier: quantified? ? quantifier.to_h : nil,
  }
  end
+ alias :to_h :attributes
  end

  def self.parsed(exp)
@@ -18,13 +18,6 @@ module Regexp::Expression
  class Branch < Regexp::Expression::Sequence; end

  class Expression < Regexp::Expression::Subexpression
- attr_reader :condition
-
- def condition=(exp)
- @condition = exp
- expressions << exp
- end
-
  def <<(exp)
  expressions.last << exp
  end
@@ -35,16 +28,25 @@ module Regexp::Expression
  end
  alias :branch :add_sequence

+ def condition=(exp)
+ expressions.delete(condition)
+ expressions.unshift(exp)
+ end
+
+ def condition
+ find { |subexp| subexp.is_a?(Condition) }
+ end
+
  def branches
- expressions - [condition]
+ select { |subexp| subexp.is_a?(Sequence) }
  end

  def reference
  condition.reference
  end

- def to_s(_format = :full)
- text + condition.text + branches.join('|') + ')'
+ def to_s(format = :full)
+ "#{text}#{condition}#{branches.join('|')})#{quantifier_affix(format)}"
  end
  end
  end
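
The hunk above replaces the `@condition` ivar with a lookup inside `expressions`. The sketch below reproduces that pattern in isolation so the setter's replace-in-place behavior can be tried out. `Condition`, `Sequence` and `Conditional` here are hypothetical, empty stand-ins written for this example, not the gem's classes, and the sketch assumes the container delegates `#each` to its `expressions` array.

```ruby
# Hypothetical stand-ins for illustration only; not regexp_parser classes.
class Condition; end
class Sequence; end

class Conditional
  include Enumerable

  attr_reader :expressions

  def initialize
    @expressions = []
  end

  def each(&block)
    expressions.each(&block)
  end

  # Drop any previous condition, then put the new one at the front,
  # so the condition always lives inside the traversable children.
  def condition=(exp)
    expressions.delete(condition)
    expressions.unshift(exp)
  end

  # Looked up by type instead of being read from a dedicated ivar.
  def condition
    find { |subexp| subexp.is_a?(Condition) }
  end

  def branches
    select { |subexp| subexp.is_a?(Sequence) }
  end
end

cond = Conditional.new
cond.expressions << Sequence.new << Sequence.new

first  = Condition.new
second = Condition.new
cond.condition = first
cond.condition = second # replaces `first` rather than adding a second condition

puts cond.expressions.length
puts cond.branches.length
puts cond.condition.equal?(second)
```

Because the condition is found by type, a generic walk over `expressions` reaches it like any other child, which matches the changelog's note about traversal.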
@@ -1,6 +1,8 @@
  module Regexp::Expression

  class Subexpression < Regexp::Expression::Base
+ include Enumerable
+
  attr_accessor :expressions

  def initialize(token, options = {})
@@ -24,8 +26,7 @@ module Regexp::Expression
  end
  end

- %w[[] all? any? at collect count each each_with_index empty?
- fetch find first index join last length map values_at].each do |method|
+ %w[[] at each empty? fetch index join last length values_at].each do |method|
  class_eval <<-RUBY, __FILE__, __LINE__ + 1
  def #{method}(*args, &block)
  expressions.#{method}(*args, &block)
@@ -51,7 +52,7 @@ module Regexp::Expression
  end

  def to_h
- super.merge({
+ attributes.merge({
  text: to_s(:base),
  expressions: expressions.map(&:to_h)
  })
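
The diff does not say why `super.merge` became `attributes.merge`. One plausible reading is that once a class includes `Enumerable`, that module sits between the class and its superclass in the ancestor chain, so `super` inside `#to_h` would reach `Enumerable#to_h` instead of the base class's version. The sketch below checks this with hypothetical stand-in classes (`Base` and `Branch` are invented here, not the gem's).

```ruby
# Hypothetical stand-ins for illustration only; not regexp_parser classes.
class Base
  def attributes
    { type: :group }
  end
  alias_method :to_h, :attributes
end

class Branch < Base
  include Enumerable

  def initialize(*children)
    @children = children
  end

  def each(&block)
    @children.each(&block)
  end

  # Calls #attributes explicitly, so the result does not depend on
  # which ancestor `super` would pick.
  def to_h
    attributes.merge(children: count)
  end
end

# Enumerable is inserted directly after the including class.
p Branch.ancestors.first(3)

# The first ancestor above Branch that defines #to_h is what `super` would hit.
super_owner = Branch.ancestors.drop(1).find do |mod|
  mod.instance_methods(false).include?(:to_h)
end
p super_owner

p Branch.new(:a, :b).to_h
```

Under this reading, keeping the attribute hash in a separately named method lets subclasses build on it regardless of which modules they include.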
@@ -1,5 +1,5 @@
  class Regexp
  class Parser
- VERSION = '1.1.0'
+ VERSION = '1.2.0'
  end
  end
@@ -157,7 +157,8 @@ class TestParserConditionals < Test::Unit::TestCase
  conditional = root[1]

  assert conditional.quantified?
- assert_equal '{42}', conditional.quantifier.text
+ assert_equal '{42}', conditional.quantifier.text
+ assert_equal '(?(1)\d|(\w)){42}', conditional.to_s
  refute conditional.branches.any?(&:quantified?)
  end

metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: regexp_parser
  version: !ruby/object:Gem::Version
- version: 1.1.0
+ version: 1.2.0
  platform: ruby
  authors:
  - Ammar Ali
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2018-09-17 00:00:00.000000000 Z
+ date: 2018-09-28 00:00:00.000000000 Z
  dependencies: []
  description: A library for tokenizing, lexing, and parsing Ruby regular expressions.
  email: