string_splitter 0.5.0 → 0.5.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +36 -11
- data/lib/string_splitter.rb +26 -15
- data/lib/string_splitter/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 582dd9d8bae0421a49348bf0ccade081a4cc448e8e27943dcb67004b1b684f6d
+  data.tar.gz: 10990476dec6bf7edc909cd8558d0404fd9295820238ac527ebf3294454815a2
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 666914aa76ca9f425dc7ef60b0110dbb1239fad3ae44ac49ba0ee59531b93d800cb2ca475c524ee359dbde4b21a0b97a89fa3f6910bb78d1b6737729ffddc1a9
+  data.tar.gz: 4c9522bcc4e858a98e4b9c79abe2ecf845b0a8209479b802637936215c0a5c02e9c0853f103779618636774ec5ce55a7157ea8144eaadaa97f918a94e062d4e9
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -11,6 +11,7 @@
 - [SYNOPSIS](#synopsis)
 - [DESCRIPTION](#description)
 - [WHY?](#why)
+- [COMPATIBILITY](#compatibility)
 - [VERSION](#version)
 - [SEE ALSO](#see-also)
 - [Gems](#gems)
@@ -36,30 +37,47 @@ gem "string_splitter"
 require "string_splitter"
 
 ss = StringSplitter.new
+```
+
+**Same as `String#split`**
 
-
+```ruby
 ss.split("foo bar baz quux")
 ss.split("foo bar baz quux", " ")
 ss.split("foo bar baz quux", /\s+/)
 # => ["foo", "bar", "baz", "quux"]
+```
+
+**Split at the first delimiter**
 
-
+```ruby
 ss.split("foo:bar:baz:quux", ":", at: 1)
 # => ["foo", "bar:baz:quux"]
+```
+
+**Split at the last delimiter**
 
-
+```ruby
 ss.split("foo:bar:baz:quux", ":", at: -1)
 # => ["foo:bar:baz", "quux"]
+```
 
-
+**Split at multiple delimiter positions**
+
+```ruby
 ss.split("1:2:3:4:5:6:7:8:9", ":", at: [1..3, -2])
 # => ["1", "2", "3", "4:5:6:7", "8:9"]
+```
 
-
+**Split from the right**
+
+```ruby
 ss.rsplit("1:2:3:4:5:6:7:8:9", ":", at: [1..3, 5])
 # => ["1:2:3:4", "5:6", "7", "8", "9"]
+```
+**Full control via a block**
 
-
+```ruby
 result = ss.split('a:a:a:b:c:c:e:a:a:d:c', ":") do |split|
   split.index > 0 && split.lhs == split.rhs
 end
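The `at:` examples in the hunk above use 1-based positions, with negative positions counting back from the last delimiter. The following is a minimal sketch of how such a position list could be resolved against the number of splits. The helper name `sketch_position_selected?` is hypothetical and is not taken from the gem (whose own `match_positions` implementation is not shown in this diff), and ranges with negative endpoints are deliberately not handled.

```ruby
# Hypothetical helper (not the gem's implementation): decide whether the split
# at zero-based `index`, out of `count` splits, is selected by `positions`.
def sketch_position_selected?(positions, index, count)
  position = index + 1 # positions are 1-based

  Array(positions).any? do |pos|
    case pos
    when Integer
      # a negative position counts back from the last split
      (pos.negative? ? count + 1 + pos : pos) == position
    when Range
      pos.cover?(position)
    else
      false
    end
  end
end

count = 8 # "1:2:3:4:5:6:7:8:9" contains 8 delimiters

# zero-based indices of the splits selected by `at: [1..3, -2]`
selected = (0...count).select do |index|
  sketch_position_selected?([1..3, -2], index, count)
end
```

Under these assumptions, `selected` holds the first three splits and the second-to-last one, which matches the `["1", "2", "3", "4:5:6:7", "8:9"]` result shown in the README example.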
@@ -68,16 +86,16 @@ end
 
 # DESCRIPTION
 
-Many languages have built-in
+Many languages have built-in `split` functions/methods for strings. They behave similarly
 (notwithstanding the occasional [surprise](https://chriszetter.com/blog/2017/10/29/splitting-strings/)),
 and handle a few common cases e.g.:
 
 * limiting the number of splits
-* including the
+* including the separator(s) in the results
 * removing (some) empty fields
 
 But, because the API is squeezed into two overloaded parameters (the delimiter and the limit),
-achieving the desired
+achieving the desired results can be tricky. For instance, while `String#split` removes empty
 trailing fields (by default), it provides no way to remove *all* empty fields. Likewise, the
 cramped API means there's no way to e.g. combine a limit (positive integer) with the option
 to preserve empty fields (negative integer), or use backreferences in a delimiter pattern
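The behaviours of the built-in `String#split` that the DESCRIPTION hunk above refers to can be seen with plain Ruby, no gem required. This is only an illustration of the standard library; the sample string and variable names are made up for the example.

```ruby
line = "a:b::c::"

# By default, trailing empty fields are dropped; interior ones are kept.
default_fields = line.split(":")

# A negative limit preserves the trailing empty fields.
all_fields = line.split(":", -1)

# A positive limit caps the number of fields. The same parameter cannot
# also be negative, so the two options share one overloaded slot.
limited = line.split(":", 2)

# A capture group in the pattern includes the separators in the result.
with_separators = "a:b".split(/(:)/)

# Removing *all* empty fields takes an extra step outside of split.
non_empty = line.split(":").reject(&:empty?)
```

The last line is the kind of post-processing the README says the built-in API forces on callers.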
@@ -117,7 +135,8 @@ ss.split('foo:bar:baz:quux', ':', at: [1, -1]) # => ["foo", "bar:baz", "quux"]
 I wanted to split semi-structured output into fields without having to resort to a regex or a full-blown parser.
 
 As an example, the nominally unstructured output of many Unix commands is often formatted in a way
-that's tantalizingly close to being machine-readable,
+that's tantalizingly close to being [machine-readable](https://en.wikipedia.org/wiki/Delimiter-separated_values),
+apart from a few pesky exceptions e.g.:
 
 ```bash
 $ ls -l
@@ -177,9 +196,14 @@ ss.split(line, at: [1..5, 8])
 # => ["-rw-r--r--", "1", "user", "users", "87", "Jun 18 18:16", "CHANGELOG.md"]
 ```
 
+# COMPATIBILITY
+
+StringSplitter is tested and supported on all versions of Ruby [supported by the ruby-core team](https://www.ruby-lang.org/en/downloads/branches/),
+i.e., currently, Ruby 2.3 and above.
+
 # VERSION
 
-0.5.
+0.5.1
 
 # SEE ALSO
 
@@ -201,3 +225,4 @@ Copyright © 2018 by chocolateboy.
 
 This is free software; you can redistribute it and/or modify it under the
 terms of the [Artistic License 2.0](http://www.opensource.org/licenses/artistic-license-2.0.php).
+
data/lib/string_splitter.rb
CHANGED
@@ -1,13 +1,14 @@
 # frozen_string_literal: true
 
 require 'values'
+require_relative 'string_splitter/version'
 
 # This class extends the functionality of +String#split+ by:
 #
 # - providing full control over which splits are accepted or rejected
 # - adding support for splitting from right-to-left
-# - encapsulating splitting options/preferences in
-# cram them into overloaded method parameters
+# - encapsulating splitting options/preferences in the splitter rather
+# than trying to cram them into overloaded method parameters
 #
 # These enhancements allow splits to handle many cases that otherwise require bigger
 # guns e.g. regex matching or parsing.
@@ -47,7 +48,7 @@ class StringSplitter
     reject: exclude,
     &block
   )
-    result,
+    result, splits, block = split_init(
       string: string,
       delimiter: delimiter,
       select: select,
@@ -55,8 +56,10 @@ class StringSplitter
       block: block
     )
 
-    splits.
-
+    count = splits.length
+
+    splits.each_with_index do |split, index|
+      split = Split.with(split.merge({ index: index, count: count }))
       result << split.lhs if result.empty?
 
       if block.call(split)
@@ -70,7 +73,7 @@ class StringSplitter
 
         result << split.rhs
       else
-        #
+        # concatenate the rhs
         result[-1] = result[-1] + split.separator + split.rhs
       end
     end
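The accept/reject loop in the hunks above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the gem's code: `Split` here is a plain `Struct` standing in for the gem's value class, `sketch_splits` and `sketch_split` are hypothetical names, only string delimiters are handled, and the lines elided between the hunks (between `if block.call(split)` and `result << split.rhs`) are ignored.

```ruby
# Hypothetical stand-in for the gem's Split value class.
Split = Struct.new(:lhs, :separator, :rhs, :index, :count, keyword_init: true)

# Build one hash per delimiter occurrence: the field on its left, the
# separator itself, and the field on its right.
def sketch_splits(string, delimiter)
  parts = string.split(/(#{Regexp.escape(delimiter)})/, -1)
  pairs = parts.each_slice(2).to_a # [field, separator] pairs
  fields = pairs.map(&:first)
  separators = pairs.map { |_, separator| separator }.compact

  separators.each_with_index.map do |separator, i|
    { lhs: fields[i], separator: separator, rhs: fields[i + 1] }
  end
end

# Walk the splits left to right; the block decides which ones are kept.
def sketch_split(string, delimiter, &block)
  splits = sketch_splits(string, delimiter)
  return [string] if splits.empty?

  count = splits.length
  result = []

  splits.each_with_index do |split, index|
    split = Split.new(**split, index: index, count: count)
    result << split.lhs if result.empty?

    if block.call(split)
      result << split.rhs # accepted: start a new field
    else
      # rejected: glue the separator and rhs back onto the last field
      result[-1] = result[-1] + split.separator + split.rhs
    end
  end

  result
end

first_only = sketch_split("foo:bar:baz:quux", ":") { |split| split.index.zero? }
last_only = sketch_split("foo:bar:baz:quux", ":") { |split| split.index == split.count - 1 }
```

Passing `index` and `count` into each split, as the 0.5.1 change does, is what lets a block express "last delimiter only" without knowing the string in advance.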
@@ -89,7 +92,7 @@ class StringSplitter
     reject: exclude,
     &block
   )
-    result,
+    result, splits, block = split_init(
       string: string,
       delimiter: delimiter,
       select: select,
@@ -97,8 +100,10 @@ class StringSplitter
       block: block
     )
 
-    splits.
-
+    count = splits.length
+
+    splits.reverse!.each_with_index do |split, index|
+      split = Split.with(split.merge({ index: index, count: count }))
       result.unshift(split.rhs) if result.empty?
 
       if block.call(split)
@@ -156,11 +161,18 @@ class StringSplitter
     [result, splits]
   end
 
-  #
-
+  # takes a hash of options passed to +split+ or +rsplit+ and returns a:
+  #
+  # [result, splits, block]
+  #
+  # triple, where `result` is the return value of the method, `splits` is an array
+  # of hashes containing the lhs/rhs, separator and captures of each split, and
+  # `block` is a proc which specifies whether each split should be accepted or
+  # rejected
+  def split_init(string:, delimiter:, select:, reject:, block:)
     unless (match = string.match(delimiter))
       result = (@remove_empty && string.empty?) ? [] : [string]
-      return [result,
+      return [result, NO_SPLITS, block]
     end
 
     select = Array(select)
@@ -180,10 +192,9 @@ class StringSplitter
     parts = string.split(/(#{delimiter})/, -1)
     remove_trailing_empty_field!(parts, ncaptures)
     result, splits = splits_for(parts, ncaptures)
-
-    block ||= positions ? match_positions(positions, action, count) : ACCEPT_ALL
+    block ||= positions ? match_positions(positions, action, splits.length) : ACCEPT_ALL
 
-    [result,
+    [result, splits, block]
   end
 
   # increment back-references so they remain valid when the outer capture
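The hunk above wraps the delimiter in an outer capture (`/(#{delimiter})/`) so that `String#split` returns the separators too, and the trailing comment notes that back-references must then be incremented. A rough sketch of why is below. The helper `sketch_wrap_in_capture` is hypothetical and naive (it does not handle escaped backslashes or named references); the gem's own implementation is not shown in this diff and may differ.

```ruby
# Hypothetical helper: shift every numeric back-reference up by one
# (\1 becomes \2, and so on) and wrap the whole pattern in an outer capture,
# so each reference still points at the group it originally meant.
def sketch_wrap_in_capture(pattern)
  shifted = pattern.source.gsub(/\\(\d+)/) do
    "\\#{Regexp.last_match(1).to_i + 1}"
  end

  Regexp.new("(#{shifted})", pattern.options)
end

# Matches a doubled sign: "--" or "++", but not "-+".
delimiter = /(-|\+)\1/
wrapped = sketch_wrap_in_capture(delimiter)

# With two capture groups (outer and inner), split returns each field
# followed by the full separator and then the inner capture.
parts = "a--b++c-+d".split(wrapped, -1)
```

Here `parts` interleaves fields, separators, and inner captures, which is presumably why the gem passes an `ncaptures` count along with `parts` to its helpers.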
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: string_splitter
 version: !ruby/object:Gem::Version
-  version: 0.5.
+  version: 0.5.1
 platform: ruby
 authors:
 - chocolateboy
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-
+date: 2018-07-01 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: values