character_set 1.7.0 → 1.8.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 778cea0208adb290e09f454e3f88c021531decede99604cf2b67f48c9ab3bcd8
- data.tar.gz: 43672f0afce2846bec846791e7445c9614605194967e061b1d9f0be305298be4
+ metadata.gz: ebb6792f685df02534f1ef04a92d7f0c5fdcb482e5aaa4856d7a39726e17f007
+ data.tar.gz: c6630aab9b6506c46a970ba83c257cd753f8f76760b6ce8d2639f51efba83eeb
  SHA512:
- metadata.gz: 635f9fb21c973b03b9a0556f2b6cf2c608753acb73616aa8681a5bada3418f955ec887164a03fd2add4edae8a60292eb6e4b681d11a8cf33d1083499afe83815
- data.tar.gz: 6a80f4f7f3f6c2357d84dc71bb5086f273471a8ae5af4e09abc502dc9893da90962d49527b40564de664af0a910f2e99720062dd41ba53427882ecb36e1a40a0
+ metadata.gz: 4c773a0546d05939d0b295e50355c6efe870a1ed74901d63c24097ff598d4a43bcd00ce2d03fb492a48fd9c03968a79ee78b789d92836843d6621dca3e8f313c
+ data.tar.gz: 560d3c3aa3f7e4daac3b6d2c89fb9dd6840777fa4d5896fb33564023ef745d81a7e4d0e51fe0ba42f6cd4504bc0b088657cd4ef1ab15d213aa1bb096ba404542
@@ -11,7 +11,7 @@ jobs:
  - name: Set up Ruby
  uses: ruby/setup-ruby@v1
  with:
- ruby-version: 2.7
+ ruby-version: 3.3
  - name: Prepare
  run: |
  bundle install --jobs 4
@@ -13,7 +13,7 @@ jobs:
  - name: Set up Ruby
  uses: ruby/setup-ruby@v1
  with:
- ruby-version: 2.7
+ ruby-version: 3.3
  - name: Cache gems
  uses: actions/cache@v1
  with:
@@ -12,7 +12,7 @@ jobs:
 
  strategy:
  matrix:
- ruby: [ '2.2', '2.7', '3.0', '3.1', 'ruby-head', 'jruby-head' ]
+ ruby: [ '2.4', '2.7', '3.0', '3.1', '3.2', '3.3', 'ruby-head', 'jruby-head' ]
 
  steps:
  - uses: actions/checkout@v2
@@ -24,3 +24,5 @@ jobs:
  run: bundle install --jobs 4
  - name: Test with Rake
  run: bundle exec rake
+ - uses: codecov/codecov-action@v3
+ if: matrix.ruby == '3.2'
data/.rubocop.yml CHANGED
@@ -15,3 +15,6 @@ Lint/AmbiguousOperatorPrecedence:
 
  Lint/AmbiguousRegexpLiteral:
  Enabled: false
+
+ Metrics:
+ Enabled: false
data/CHANGELOG.md CHANGED
@@ -6,6 +6,15 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 
  ## [Unreleased]
 
+ ## [1.8.0] - 2024-01-07
+
+ ### Added
+
+ - support for `#<=>` and `#join`, which were added to `set` in the meantime
+ - support for getting the (overall) character set of a Regexp with multiple expressions
+ - support for global and local case-insensitivity in Regexp inputs
+ - `Regexp#{covered_by_character_set?,uses_character_set?}` methods (if core ext is used)
+
  ## [1.7.0] - 2023-05-12
 
  ### Added
data/Gemfile CHANGED
@@ -7,14 +7,15 @@ gemspec
 
  gem 'benchmark-ips', '~> 2.7'
  gem 'get_process_mem', '~> 0.2.3'
- gem 'rake', '~> 13.0'
+ gem 'rake', '~> 13.1'
  gem 'rake-compiler', '~> 1.1'
  gem 'range_compressor', '~> 1.0'
- gem 'regexp_parser', '~> 2.1'
- gem 'regexp_property_values', '~> 1.0'
+ gem 'regexp_parser', '~> 2.9'
+ gem 'regexp_property_values', '~> 1.5'
  gem 'rspec', '~> 3.8'
- if RUBY_VERSION.to_f >= 2.7
- gem 'codecov', '~> 0.2.12'
+ gem 'warning', '~> 1.3'
+ if RUBY_VERSION.to_f >= 3.0
  gem 'gouteur', '~> 1.0.0'
- gem 'rubocop', '~> 1.8'
+ gem 'rubocop', '~> 1.59'
+ gem 'simplecov-cobertura', require: false
  end
data/LICENSE.txt CHANGED
@@ -1,6 +1,6 @@
  The MIT License (MIT)
 
- Copyright (c) 2018 Janosch Müller
+ Copyright (c) 2018-2023 Janosch Müller
 
  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
data/README.md CHANGED
@@ -3,7 +3,7 @@
  [![Gem Version](https://badge.fury.io/rb/character_set.svg)](http://badge.fury.io/rb/character_set)
  [![Build Status](https://github.com/jaynetics/character_set/workflows/tests/badge.svg)](https://github.com/jaynetics/character_set/actions)
  [![Build Status](https://github.com/jaynetics/character_set/workflows/gouteur/badge.svg)](https://github.com/jaynetics/character_set/actions)
- [![codecov](https://codecov.io/gh/jaynetics/character_set/branch/master/graph/badge.svg)](https://codecov.io/gh/jaynetics/character_set)
+ [![Coverage](https://codecov.io/gh/jaynetics/character_set/branch/main/graph/badge.svg?token=oY7gcWNbIN)](https://codecov.io/gh/jaynetics/character_set)
 
  This is a C-extended Ruby gem to work with sets of Unicode codepoints.
 
@@ -43,7 +43,7 @@ CharacterSet.parse('[a-c]')
  CharacterSet.parse('\U00000061-\U00000063')
  ```
 
- If the gems [`regexp_parser`](https://github.com/ammar/regexp_parser) and [`regexp_property_values`](https://github.com/jaynetics/regexp_property_values) are installed, `Regexp` and unicode property names can also be read. Regexp intersections, negations, and set nesting are covered, but the `i`-flag is ignored; call `#case_insensitive` on the result if needed.
+ If the gems [`regexp_parser`](https://github.com/ammar/regexp_parser) and [`regexp_property_values`](https://github.com/jaynetics/regexp_property_values) are installed, `Regexp` instances and unicode property names can also be read.
 
  ```ruby
  CharacterSet.of(/./) # => #<CharacterSet (size: 1112064)>
@@ -675,6 +675,18 @@ cs_method_proper_superset_p(VALUE self, VALUE other)
  return (is_superset && is_proper) ? Qtrue : Qfalse;
  }
 
+ static VALUE
+ cs_method_spaceship_operator(VALUE self, VALUE other)
+ {
+ if (cs_method_eql_p(self, other))
+ return INT2FIX(0);
+ if (cs_method_proper_subset_p(self, other))
+ return INT2FIX(-1);
+ if (cs_method_proper_superset_p(self, other))
+ return INT2FIX(1);
+ return Qnil;
+ }
+
  // *******************************
  // `CharacterSet`-specific methods
  // *******************************
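Review note: the `cs_method_spaceship_operator` function added in this hunk defines a partial order, so `<=>` yields `nil` for sets that merely overlap or are disjoint. The following is a minimal Ruby sketch of the same branch order, written against the stdlib `Set` rather than `CharacterSet`; the helper name `partial_compare` is hypothetical and not part of the gem.

```ruby
require 'set'

# Hypothetical helper that mirrors the branch order of the C function:
# equality first, then proper subset, then proper superset, else nil.
def partial_compare(a, b)
  return 0  if a == b
  return -1 if a.proper_subset?(b)
  return 1  if a.proper_superset?(b)

  nil # neither set contains the other, so they are not comparable
end

abc = Set[97, 98, 99]
ab  = Set[97, 98]
xy  = Set[120, 121]

p partial_compare(ab, abc)      # => -1
p partial_compare(abc, ab)      # => 1
p partial_compare(abc, abc.dup) # => 0
p partial_compare(ab, xy)       # => nil
```

Because the result can be `nil`, sorting a mixed list of such sets with `<=>` would raise; the operator is mainly useful for pairwise checks.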
@@ -1324,6 +1336,7 @@ void Init_character_set()
  rb_define_method(cs, ">=", cs_method_superset_p, 1);
  rb_define_method(cs, "proper_superset?", cs_method_proper_superset_p, 1);
  rb_define_method(cs, ">", cs_method_proper_superset_p, 1);
+ rb_define_method(cs, "<=>", cs_method_spaceship_operator, 1);
 
  // `CharacterSet`-specific methods
 
@@ -4,6 +4,14 @@ class CharacterSet
  def character_set
  CharacterSet.of_regexp(self)
  end
+
+ def covered_by_character_set?(other)
+ other.superset?(character_set)
+ end
+
+ def uses_character_set?(other)
+ other.intersect?(character_set)
+ end
  end
  end
  end
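Review note: the two new `Regexp` methods reduce to a superset check and an intersection check on the character set derived from the Regexp. The sketch below illustrates that distinction with the stdlib `Set`; `codepoints_of`, `covered_by?` and `uses?` are hypothetical stand-ins, and the real methods derive the set from a Regexp via `CharacterSet.of_regexp`.

```ruby
require 'set'

# Hypothetical stand-in for Regexp#character_set: the codepoints of the
# characters in a plain string.
def codepoints_of(chars)
  Set.new(chars.each_char.map(&:ord))
end

# True if every codepoint in `own` is also in `other`.
def covered_by?(own, other)
  other.superset?(own)
end

# True if at least one codepoint in `own` is also in `other`.
def uses?(own, other)
  other.intersect?(own)
end

ascii_lower = Set.new(97..122)

p covered_by?(codepoints_of('abc'), ascii_lower) # => true
p covered_by?(codepoints_of('abC'), ascii_lower) # => false
p uses?(codepoints_of('abC'), ascii_lower)       # => true
p uses?(codepoints_of('XYZ'), ascii_lower)       # => false
```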
@@ -4,86 +4,61 @@ class CharacterSet
 
  Error = Class.new(ArgumentError)
 
- def convert(expression, to = CharacterSet)
+ def convert(expression, to = CharacterSet, acc = [])
  CharacterSet.require_optional_dependency('regexp_parser', __method__)
 
  case expression
- when Regexp::Expression::Root
- if expression.count != 1
- raise Error, 'Pass a Regexp with exactly one expression, e.g. /[a-z]/'
- end
- convert(expression[0], to)
-
  when Regexp::Expression::CharacterSet
- content = expression.map { |subexp| convert(subexp, to) }.reduce(:+)
- content ||= to[]
- expression.negative? ? content.inversion : content
+ content = expression.map { |subexp| convert(subexp, to) }.reduce(:+) || to[]
+ acc << (expression.negative? ? content.inversion : content)
 
  when Regexp::Expression::CharacterSet::Intersection
- expression.map { |subexp| convert(subexp, to) }.reduce(:&)
-
- when Regexp::Expression::CharacterSet::IntersectedSequence
- expression.map { |subexp| convert(subexp, to) }.reduce(:+) || to[]
+ acc << expression.map { |subexp| convert(subexp, to) }.reduce(:&)
 
  when Regexp::Expression::CharacterSet::Range
  start, finish = expression.map { |subexp| convert(subexp, to) }
- to.new((start.min)..(finish.max))
+ acc << to.new((start.min)..(finish.max))
+
+ when Regexp::Expression::Subexpression # root, group, alternation, etc.
+ expression.each { |subexp| convert(subexp, to, acc) }
 
  when Regexp::Expression::CharacterType::Any
- to.unicode
+ acc << to.unicode
 
  when Regexp::Expression::CharacterType::Base
  /(?<negative>non)?(?<base_name>.+)/ =~ expression.token
  content =
  if expression.unicode_classes?
- # in u-mode, type shortcuts match the same as \p{<long type name>}
- to.of_property(base_name)
+ # in u-mode, most type shortcuts match the same as \p{<long type name>}
+ if base_name == 'linebreak'
+ to.from_ranges(10..13, 133..133, 8232..8233)
+ else
+ to.of_property(base_name)
+ end
  else
  # in normal mode, types match only ascii chars
  case base_name.to_sym
- when :digit then to.from_ranges(48..57)
- when :hex then to.from_ranges(48..57, 65..70, 97..102)
- when :space then to.from_ranges(9..13, 32..32)
- when :word then to.from_ranges(48..57, 65..90, 95..95, 97..122)
+ when :digit then to.from_ranges(48..57)
+ when :hex then to.from_ranges(48..57, 65..70, 97..102)
+ when :linebreak then to.from_ranges(10..13)
+ when :space then to.from_ranges(9..13, 32..32)
+ when :word then to.from_ranges(48..57, 65..90, 95..95, 97..122)
  else raise Error, "Unsupported CharacterType #{base_name}"
  end
  end
- negative ? content.inversion : content
+ acc << (negative ? content.inversion : content)
 
  when Regexp::Expression::EscapeSequence::CodepointList
- to.new(expression.codepoints)
+ content = to.new(expression.codepoints)
+ acc << (expression.i? ? content.case_insensitive : content)
 
  when Regexp::Expression::EscapeSequence::Base
- to[expression.codepoint]
-
- when Regexp::Expression::Group::Capture,
- Regexp::Expression::Group::Passive,
- Regexp::Expression::Group::Named,
- Regexp::Expression::Group::Atomic,
- Regexp::Expression::Group::Options
- case expression.count
- when 0 then to[]
- when 1 then convert(expression.first, to)
- else
- raise Error, 'Groups must contain exactly one expression, e.g. ([a-z])'
- end
-
- when Regexp::Expression::Alternation # rubocop:disable Lint/DuplicateBranch
- expression.map { |subexp| convert(subexp, to) }.reduce(:+)
-
- when Regexp::Expression::Alternative
- case expression.count
- when 0 then to[]
- when 1 then convert(expression.first, to)
- else
- raise Error, 'Alternatives must contain exactly one expression'
- end
+ content = to[expression.codepoint]
+ acc << (expression.i? ? content.case_insensitive : content)
 
  when Regexp::Expression::Literal
- if expression.set_level == 0 && expression.text.size != 1
- raise Error, 'Literal runs outside of sets are codepoint *sequences*'
- end
- to[expression.text.ord]
+ content = to[*expression.text.chars]
+ acc << (expression.i? ? content.case_insensitive : content)
 
  when Regexp::Expression::UnicodeProperty::Base,
  Regexp::Expression::PosixClass
@@ -91,14 +66,22 @@ class CharacterSet
  if expression.type == :posixclass && expression.ascii_classes?
  content = content.ascii_part
  end
- expression.negative? ? content.inversion : content
+ acc << (expression.negative? ? content.inversion : content)
+
+ when Regexp::Expression::Anchor::Base,
+ Regexp::Expression::Backreference::Base,
+ Regexp::Expression::Keep::Mark,
+ Regexp::Expression::Quantifier
+ # ignore zero-length and repeat expressions
 
  when Regexp::Expression::Base
  raise Error, "Unsupported expression class `#{expression.class}`"
 
  else
- raise Error, "Pass an expression (result of Regexp::Parser.parse)"
+ raise Error, 'Pass an expression (result of Regexp::Parser.parse)'
  end
+
+ acc.reduce(:+) || to[]
  end
  end
  end
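Review note: the rewritten `convert` replaces the per-node return values and the "exactly one expression" errors with a shared accumulator. Leaves append their sets to `acc`, container nodes recurse with the same `acc`, zero-length nodes add nothing, and the union is taken at the end. Below is a minimal sketch of that pattern; the node types `Leaf`, `Branch` and `Anchor` and the method `collect` are hypothetical and stand in for regexp_parser expression classes.

```ruby
require 'set'

# Hypothetical, minimal node types for illustration only.
Leaf   = Struct.new(:codepoints) # contributes characters
Branch = Struct.new(:children)   # e.g. root, group, alternation
Anchor = Class.new               # zero-length, contributes nothing

def collect(node, acc = [])
  case node
  when Leaf   then acc << Set.new(node.codepoints)
  when Branch then node.children.each { |child| collect(child, acc) }
  when Anchor then nil # ignored, like anchors and quantifiers in the diff
  else raise ArgumentError, "unsupported node #{node.class}"
  end

  # Union of everything gathered so far, or an empty set if nothing was.
  acc.reduce(:+) || Set[]
end

tree = Branch.new([Anchor.new, Leaf.new([97]), Branch.new([Leaf.new([98, 99])])])

p collect(tree).to_a.sort       # => [97, 98, 99]
p collect(Branch.new([])).to_a  # => []
```

As in the diff, every recursive call recomputes the union of `acc` even though only the outermost result is used; that keeps the method body uniform at the cost of some redundant work.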
@@ -122,10 +122,6 @@ class CharacterSet
  raise ArgumentError, 'pass a String' unless obj.respond_to?(:codepoints)
  obj.encode('utf-8')
  end
-
- def make_new_str(original, &block)
- utf8_str!(original).each_codepoint.with_object('', &block)
- end
  end
  end
  end
@@ -11,7 +11,7 @@ class CharacterSet
  RUBY
  end
 
- %i[< <= > >= === disjoint? include? intersect? member?
+ %i[< <= <=> > >= === disjoint? include? intersect? member?
  proper_subset? proper_superset? subset? superset?].each do |mthd|
  class_eval <<-RUBY, __FILE__, __LINE__ + 1
  def #{mthd}(enum, &block)
@@ -23,9 +23,8 @@ class CharacterSet
  RUBY
  end
 
- %i[<< add add? clear collect! delete delete? delete_if
- each filter! map! keep_if reject!
- select! subtract].each do |mthd|
+ %i[<< add add? clear delete delete? delete_if each filter! keep_if
+ reject! select! subtract].each do |mthd|
  class_eval <<-RUBY, __FILE__, __LINE__ + 1
  def #{mthd}(*args, &block)
  result = @__set.#{mthd}(*args, &block)
@@ -1,492 +1,385 @@
- # set and sorted_set are vendored due to various dependency issues:
- #
- # - issues with default vs. installed gems such as [#2]
- # - issues with the sorted_set dependency rb_tree
- # - long-standing issues in recent versions of sorted_set
- #
- # The RubyFallback (and thus these set classes), are only used for testing,
- # and for exotic rubies which use neither C nor Java.
-
- class CharacterSet
- module RubyFallback
- if RUBY_PLATFORM[/java/i]
- # Vendoring is not needed for JRuby which has sorted_set in the stdlib.
- require 'set'
-
- Set = ::Set
- SortedSet = ::SortedSet
- else
- # set, vendored from https://github.com/ruby/set/blob/master/lib/set.rb,
- # with comments removed and linted.
- class Set
- include Enumerable
-
- def self.[](*ary)
- new(ary)
- end
-
- def initialize(enum = nil, &block)
- @hash = Hash.new(false)
-
- enum.nil? and return
-
- if block
- do_with_enum(enum) { |o| add(block[o]) }
- else
- merge(enum)
- end
- end
-
- def compare_by_identity
- if @hash.respond_to?(:compare_by_identity)
- @hash.compare_by_identity
- self
- else
- raise NotImplementedError, "#{self.class.name}\##{__method__} is not implemented"
- end
- end
+ # set, vendored from https://github.com/ruby/set/blob/master/lib/set.rb,
+ # with comments removed and linted.
+ class CharacterSet::RubyFallback::Set
+ Set = self
+ include Enumerable
+
+ def self.[](*ary)
+ new(ary)
+ end
 
- def compare_by_identity?
- @hash.respond_to?(:compare_by_identity?) && @hash.compare_by_identity?
- end
+ def initialize(enum = nil, &block)
+ @hash = Hash.new(false)
 
- def do_with_enum(enum, &block)
- if enum.respond_to?(:each_entry)
- enum.each_entry(&block) if block
- elsif enum.respond_to?(:each)
- enum.each(&block) if block
- else
- raise ArgumentError, "value must be enumerable"
- end
- end
- private :do_with_enum
-
- def initialize_dup(orig)
- super
- @hash = orig.instance_variable_get(:@hash).dup
- end
+ enum.nil? and return
 
- if Kernel.instance_method(:initialize_clone).arity != 1
- def initialize_clone(orig, **options)
- super
- @hash = orig.instance_variable_get(:@hash).clone(**options)
- end
- else
- def initialize_clone(orig)
- super
- @hash = orig.instance_variable_get(:@hash).clone
- end
- end
-
- def freeze
- @hash.freeze
- super
- end
+ if block
+ do_with_enum(enum) { |o| add(block[o]) }
+ else
+ merge(enum)
+ end
+ end
 
- def size
- @hash.size
- end
- alias length size
+ def do_with_enum(enum, &block)
+ if enum.respond_to?(:each_entry)
+ enum.each_entry(&block) if block
+ elsif enum.respond_to?(:each)
+ enum.each(&block) if block
+ else
+ raise ArgumentError, "value must be enumerable"
+ end
+ end
+ private :do_with_enum
 
- def empty?
- @hash.empty?
- end
+ def initialize_dup(orig)
+ super
+ @hash = orig.instance_variable_get(:@hash).dup
+ end
 
- def clear
- @hash.clear
- self
- end
+ if Kernel.instance_method(:initialize_clone).arity != 1
+ def initialize_clone(orig, **options)
+ super
+ @hash = orig.instance_variable_get(:@hash).clone(**options)
+ end
+ else
+ def initialize_clone(orig)
+ super
+ @hash = orig.instance_variable_get(:@hash).clone
+ end
+ end
 
- def replace(enum)
- if enum.instance_of?(self.class)
- @hash.replace(enum.instance_variable_get(:@hash))
- self
- else
- do_with_enum(enum)
- clear
- merge(enum)
- end
- end
+ def freeze
+ @hash.freeze
+ super
+ end
 
- def to_a
- @hash.keys
- end
+ def size
+ @hash.size
+ end
+ alias length size
 
- def to_set(klass = Set, *args, &block)
- return self if instance_of?(Set) && klass == Set && block.nil? && args.empty?
- klass.new(self, *args, &block)
- end
+ def empty?
+ @hash.empty?
+ end
 
- def flatten_merge(set, seen = Set.new)
- set.each { |e|
- if e.is_a?(Set)
- if seen.include?(e_id = e.object_id)
- raise ArgumentError, "tried to flatten recursive Set"
- end
-
- seen.add(e_id)
- flatten_merge(e, seen)
- seen.delete(e_id)
- else
- add(e)
- end
- }
-
- self
- end
- protected :flatten_merge
+ def clear
+ @hash.clear
+ self
+ end
 
- def flatten
- self.class.new.flatten_merge(self)
- end
+ def to_a
+ @hash.keys
+ end
 
- def flatten!
- replace(flatten()) if any? { |e| e.is_a?(Set) }
- end
+ def include?(o)
+ @hash[o]
+ end
+ alias member? include?
+
+ def superset?(set)
+ case
+ when set.instance_of?(self.class) && @hash.respond_to?(:>=)
+ @hash >= set.instance_variable_get(:@hash)
+ when set.is_a?(Set)
+ size >= set.size && set.all? { |o| include?(o) }
+ else
+ raise ArgumentError, "value must be a set"
+ end
+ end
+ alias >= superset?
+
+ def proper_superset?(set)
+ case
+ when set.instance_of?(self.class) && @hash.respond_to?(:>)
+ @hash > set.instance_variable_get(:@hash)
+ when set.is_a?(Set)
+ size > set.size && set.all? { |o| include?(o) }
+ else
+ raise ArgumentError, "value must be a set"
+ end
+ end
+ alias > proper_superset?
+
+ def subset?(set)
+ case
+ when set.instance_of?(self.class) && @hash.respond_to?(:<=)
+ @hash <= set.instance_variable_get(:@hash)
+ when set.is_a?(Set)
+ size <= set.size && all? { |o| set.include?(o) }
+ else
+ raise ArgumentError, "value must be a set"
+ end
+ end
+ alias <= subset?
+
+ def proper_subset?(set)
+ case
+ when set.instance_of?(self.class) && @hash.respond_to?(:<)
+ @hash < set.instance_variable_get(:@hash)
+ when set.is_a?(Set)
+ size < set.size && all? { |o| set.include?(o) }
+ else
+ raise ArgumentError, "value must be a set"
+ end
+ end
+ alias < proper_subset?
 
- def include?(o)
- @hash[o]
- end
- alias member? include?
-
- def superset?(set)
- case
- when set.instance_of?(self.class) && @hash.respond_to?(:>=)
- @hash >= set.instance_variable_get(:@hash)
- when set.is_a?(Set)
- size >= set.size && set.all? { |o| include?(o) }
- else
- raise ArgumentError, "value must be a set"
- end
- end
- alias >= superset?
-
- def proper_superset?(set)
- case
- when set.instance_of?(self.class) && @hash.respond_to?(:>)
- @hash > set.instance_variable_get(:@hash)
- when set.is_a?(Set)
- size > set.size && set.all? { |o| include?(o) }
- else
- raise ArgumentError, "value must be a set"
- end
- end
- alias > proper_superset?
-
- def subset?(set)
- case
- when set.instance_of?(self.class) && @hash.respond_to?(:<=)
- @hash <= set.instance_variable_get(:@hash)
- when set.is_a?(Set)
- size <= set.size && all? { |o| set.include?(o) }
- else
- raise ArgumentError, "value must be a set"
- end
- end
- alias <= subset?
-
- def proper_subset?(set)
- case
- when set.instance_of?(self.class) && @hash.respond_to?(:<)
- @hash < set.instance_variable_get(:@hash)
- when set.is_a?(Set)
- size < set.size && all? { |o| set.include?(o) }
- else
- raise ArgumentError, "value must be a set"
- end
- end
- alias < proper_subset?
+ def <=>(set)
+ return unless set.is_a?(Set)
 
- def <=>(set)
- return unless set.is_a?(Set)
+ case size <=> set.size
+ when -1 then -1 if proper_subset?(set)
+ when +1 then +1 if proper_superset?(set)
+ else 0 if self.==(set)
+ end
+ end
 
- case size <=> set.size
- when -1 then -1 if proper_subset?(set)
- when +1 then +1 if proper_superset?(set)
- else 0 if self.==(set)
- end
- end
+ def intersect?(set)
+ case set
+ when Set
+ if size < set.size
+ any? { |o| set.include?(o) }
+ else
+ set.any? { |o| include?(o) }
+ end
+ when Enumerable
+ set.any? { |o| include?(o) }
+ else
+ raise ArgumentError, "value must be enumerable"
+ end
+ end
 
- def intersect?(set)
- case set
- when Set
- if size < set.size
- any? { |o| set.include?(o) }
- else
- set.any? { |o| include?(o) }
- end
- when Enumerable
- set.any? { |o| include?(o) }
- else
- raise ArgumentError, "value must be enumerable"
- end
- end
+ def disjoint?(set)
+ !intersect?(set)
+ end
 
- def disjoint?(set)
- !intersect?(set)
- end
+ def each(&block)
+ block_given? or return enum_for(__method__) { size }
+ @hash.each_key(&block)
+ self
+ end
 
- def each(&block)
- block_given? or return enum_for(__method__) { size }
- @hash.each_key(&block)
- self
- end
+ def add(o)
+ @hash[o] = true
+ self
+ end
+ alias << add
 
- def add(o)
- @hash[o] = true
- self
- end
- alias << add
+ def add?(o)
+ add(o) unless include?(o)
+ end
 
- def add?(o)
- add(o) unless include?(o)
- end
+ def delete(o)
+ @hash.delete(o)
+ self
+ end
 
- def delete(o)
- @hash.delete(o)
- self
- end
+ def delete?(o)
+ delete(o) if include?(o)
+ end
 
- def delete?(o)
- delete(o) if include?(o)
- end
+ def delete_if
+ block_given? or return enum_for(__method__) { size }
+ select { |o| yield o }.each { |o| @hash.delete(o) }
+ self
+ end
 
- def delete_if
- block_given? or return enum_for(__method__) { size }
- select { |o| yield o }.each { |o| @hash.delete(o) }
- self
- end
+ def keep_if
+ block_given? or return enum_for(__method__) { size }
+ reject { |o| yield o }.each { |o| @hash.delete(o) }
+ self
+ end
 
- def keep_if
- block_given? or return enum_for(__method__) { size }
- reject { |o| yield o }.each { |o| @hash.delete(o) }
- self
- end
+ def reject!(&block)
+ block_given? or return enum_for(__method__) { size }
+ n = size
+ delete_if(&block)
+ self if size != n
+ end
 
- def collect!
- block_given? or return enum_for(__method__) { size }
- set = self.class.new
- each { |o| set << yield(o) }
- replace(set)
- end
- alias map! collect!
+ def select!(&block)
+ block_given? or return enum_for(__method__) { size }
+ n = size
+ keep_if(&block)
+ self if size != n
+ end
 
- def reject!(&block)
- block_given? or return enum_for(__method__) { size }
- n = size
- delete_if(&block)
- self if size != n
- end
+ alias filter! select!
 
- def select!(&block)
- block_given? or return enum_for(__method__) { size }
- n = size
- keep_if(&block)
- self if size != n
- end
+ def merge(*enums, **_rest)
+ enums.each do |enum|
+ if enum.instance_of?(self.class)
+ @hash.update(enum.instance_variable_get(:@hash))
+ else
+ do_with_enum(enum) { |o| add(o) }
+ end
+ end
 
- alias filter! select!
+ self
+ end
 
- def merge(*enums, **_rest)
- enums.each do |enum|
- if enum.instance_of?(self.class)
- @hash.update(enum.instance_variable_get(:@hash))
- else
- do_with_enum(enum) { |o| add(o) }
- end
- end
+ def subtract(enum)
+ do_with_enum(enum) { |o| delete(o) }
+ self
+ end
 
- self
- end
+ def |(enum)
+ dup.merge(enum)
+ end
+ alias + |
+ alias union |
 
- def subtract(enum)
- do_with_enum(enum) { |o| delete(o) }
- self
- end
+ def -(enum)
+ dup.subtract(enum)
+ end
+ alias difference -
+
+ def &(enum)
+ n = self.class.new
+ if enum.is_a?(Set)
+ if enum.size > size
+ each { |o| n.add(o) if enum.include?(o) }
+ else
+ enum.each { |o| n.add(o) if include?(o) }
+ end
+ else
+ do_with_enum(enum) { |o| n.add(o) if include?(o) }
+ end
+ n
+ end
+ alias intersection &
 
- def |(enum)
- dup.merge(enum)
- end
- alias + |
- alias union |
+ def ^(enum)
+ n = Set.new(enum)
+ each { |o| n.add(o) unless n.delete?(o) }
+ n
+ end
 
- def -(enum)
- dup.subtract(enum)
- end
- alias difference -
-
- def &(enum)
- n = self.class.new
- if enum.is_a?(Set)
- if enum.size > size
- each { |o| n.add(o) if enum.include?(o) }
- else
- enum.each { |o| n.add(o) if include?(o) }
- end
- else
- do_with_enum(enum) { |o| n.add(o) if include?(o) }
- end
- n
- end
- alias intersection &
+ def ==(other)
+ if self.equal?(other)
+ true
+ elsif other.instance_of?(self.class)
+ @hash == other.instance_variable_get(:@hash)
+ elsif other.is_a?(Set) && self.size == other.size
+ other.all? { |o| @hash.include?(o) }
+ else
+ false
+ end
+ end
 
- def ^(enum)
- n = Set.new(enum)
- each { |o| n.add(o) unless n.delete?(o) }
- n
- end
+ def hash
+ @hash.hash
+ end
 
- def ==(other)
- if self.equal?(other)
- true
- elsif other.instance_of?(self.class)
- @hash == other.instance_variable_get(:@hash)
- elsif other.is_a?(Set) && self.size == other.size
- other.all? { |o| @hash.include?(o) }
- else
- false
- end
- end
+ def eql?(o)
+ return false unless o.is_a?(Set)
+ @hash.eql?(o.instance_variable_get(:@hash))
+ end
 
- def hash
- @hash.hash
- end
+ alias === include?
 
- def eql?(o)
- return false unless o.is_a?(Set)
- @hash.eql?(o.instance_variable_get(:@hash))
- end
+ def classify
+ block_given? or return enum_for(__method__) { size }
 
- def reset
- if @hash.respond_to?(:rehash)
- @hash.rehash
- else
- raise FrozenError, "can't modify frozen #{self.class.name}" if frozen?
- end
- self
- end
- alias === include?
+ h = {}
 
- def classify
- block_given? or return enum_for(__method__) { size }
+ each { |i|
+ (h[yield(i)] ||= self.class.new).add(i)
+ }
 
- h = {}
+ h
+ end
 
- each { |i|
- (h[yield(i)] ||= self.class.new).add(i)
- }
+ def divide(&func)
+ func or return enum_for(__method__) { size }
 
- h
- end
+ if func.arity == 2
+ require 'tsort'
 
- def divide(&func)
- func or return enum_for(__method__) { size }
-
- if func.arity == 2
- require 'tsort'
-
- class << dig = {}
- include TSort
-
- alias tsort_each_node each_key
- def tsort_each_child(node, &block)
- fetch(node).each(&block)
- end
- end
-
- each { |u|
- dig[u] = a = []
- each{ |v| func.call(u, v) and a << v }
- }
-
- set = Set.new()
- dig.each_strongly_connected_component { |css|
- set.add(self.class.new(css))
- }
- set
- else
- Set.new(classify(&func).values)
- end
- end
+ class << dig = {}
+ include TSort
 
- def join(separator=nil)
- to_a.join(separator)
+ alias tsort_each_node each_key
+ def tsort_each_child(node, &block)
+ fetch(node).each(&block)
  end
  end
 
- # sorted_set without rbtree dependency, vendored from
- # https://github.com/ruby/set/blob/72f08c4/lib/set.rb#L731-L800
- class SortedSet < Set
- def initialize(*args)
- @keys = nil
- super
- end
+ each { |u|
+ dig[u] = a = []
+ each{ |v| func.call(u, v) and a << v }
+ }
 
- def clear
- @keys = nil
- super
- end
+ set = Set.new()
+ dig.each_strongly_connected_component { |css|
+ set.add(self.class.new(css))
+ }
+ set
+ else
+ Set.new(classify(&func).values)
+ end
+ end
+ end
 
- def replace(enum)
- @keys = nil
- super
- end
+ # sorted_set without rbtree dependency, vendored from
+ # https://github.com/ruby/set/blob/72f08c4/lib/set.rb#L731-L800
+ class CharacterSet::RubyFallback::SortedSet < CharacterSet::RubyFallback::Set
+ def initialize(*args)
+ @keys = nil
+ super
+ end
 
- def add(o)
- o.respond_to?(:<=>) or raise ArgumentError, "value must respond to <=>"
- @keys = nil
- super
- end
- alias << add
+ def clear
+ @keys = nil
+ super
+ end
 
- def delete(o)
- @keys = nil
- @hash.delete(o)
- self
- end
+ def add(o)
+ @keys = nil
+ super
+ end
+ alias << add
 
- def delete_if
- block_given? or return enum_for(__method__) { size }
- n = @hash.size
- super
- @keys = nil if @hash.size != n
- self
- end
+ def delete(o)
+ @keys = nil
+ @hash.delete(o)
+ self
+ end
 
- def keep_if
- block_given? or return enum_for(__method__) { size }
- n = @hash.size
- super
- @keys = nil if @hash.size != n
- self
- end
+ def delete_if
+ block_given? or return enum_for(__method__) { size }
+ n = @hash.size
+ super
+ @keys = nil if @hash.size != n
+ self
+ end
 
- def merge(enum)
- @keys = nil
- super
- end
+ def keep_if
+ block_given? or return enum_for(__method__) { size }
+ n = @hash.size
+ super
+ @keys = nil if @hash.size != n
+ self
+ end
 
- def each(&block)
- block or return enum_for(__method__) { size }
- to_a.each(&block)
- self
- end
+ def merge(enum)
+ @keys = nil
+ super
+ end
 
- def to_a
- (@keys = @hash.keys).sort! unless @keys
- @keys.dup
- end
+ def each(&block)
+ block or return enum_for(__method__) { size }
+ to_a.each(&block)
+ self
+ end
 
- def freeze
- to_a
- super
- end
+ def to_a
+ (@keys = @hash.keys).sort! unless @keys
+ @keys.dup
+ end
 
- def rehash
- @keys = nil
- super
- end
- end
- end
+ def freeze
+ to_a
+ super
  end
  end
@@ -1,6 +1,5 @@
  require 'character_set/ruby_fallback/set_methods'
  require 'character_set/ruby_fallback/character_set_methods'
- require 'character_set/ruby_fallback/vendored_set_classes'
 
  class CharacterSet
  module RubyFallback
@@ -17,3 +16,20 @@ class CharacterSet
  end
  end
  end
+
+ if RUBY_PLATFORM[/java/i]
+ # JRuby has sorted_set in the stdlib.
+ require 'set'
+ CharacterSet::RubyFallback::Set = ::Set
+ CharacterSet::RubyFallback::SortedSet = ::SortedSet
+ else
+ # For other rubies, set/sorted_set are vendored due to dependency issues:
+ #
+ # - issues with default vs. installed gems such as [#2]
+ # - issues with the sorted_set dependency rb_tree
+ # - long-standing issues in recent versions of sorted_set
+ #
+ # The RubyFallback, and thus these set classes, are only used for testing,
+ # and for exotic rubies which use neither C nor Java.
+ require 'character_set/ruby_fallback/vendored_set_classes'
+ end
@@ -22,7 +22,7 @@ class CharacterSet
 
  # Allow some methods to take an Enum just as well as another CharacterSet.
  # Tested by ruby-spec.
- %w[& + - ^ | difference disjoint? intersect? intersection
+ %w[& + - ^ | <=> difference disjoint? intersect? intersection
  subtract union].each do |method|
  class_eval <<-RUBY, __FILE__, __LINE__ + 1
  def #{method}(arg)
@@ -165,9 +165,13 @@ class CharacterSet
  end
 
  def divide(&func)
- require 'character_set/ruby_fallback/vendored_set_classes'
+ require 'character_set/ruby_fallback'
  CharacterSet::RubyFallback::Set.new(to_a).divide(&func)
  end
+
+ def join(separator = '')
+ to_a(true).join(separator)
+ end
  RUBY
 
  # CharacterSet-specific section methods
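Review note: the new `join` delegates to `to_a(true)`. Assuming the `true` argument makes `to_a` return one-character strings instead of integer codepoints (the diff suggests this but does not show it), the behaviour can be sketched with plain arrays as follows. The variable names are illustrative only.

```ruby
# Codepoints as a set would store them; sorting here is an assumption
# about iteration order, made for a predictable result.
codepoints = [99, 97, 98]

# Convert each codepoint to a one-character UTF-8 string.
chars = codepoints.sort.map { |cp| [cp].pack('U') }

p chars.join      # => "abc"
p chars.join('-') # => "a-b-c"
```

Note that the default separator in the diff is `''`, whereas the removed vendored `Set#join` defaulted to `nil`.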
@@ -1,3 +1,3 @@
  class CharacterSet
- VERSION = '1.7.0'
+ VERSION = '1.8.0'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: character_set
  version: !ruby/object:Gem::Version
- version: 1.7.0
+ version: 1.8.0
  platform: ruby
  authors:
  - Janosch Müller
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2023-05-12 00:00:00.000000000 Z
+ date: 2024-01-07 00:00:00.000000000 Z
  dependencies: []
  description:
  email:
@@ -106,7 +106,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.4.1
+ rubygems_version: 3.5.0.dev
  signing_key:
  specification_version: 4
  summary: Build, read, write and compare sets of Unicode codepoints.