string_splitter 0.5.0 → 0.5.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +36 -11
- data/lib/string_splitter.rb +26 -15
- data/lib/string_splitter/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 582dd9d8bae0421a49348bf0ccade081a4cc448e8e27943dcb67004b1b684f6d
+data.tar.gz: 10990476dec6bf7edc909cd8558d0404fd9295820238ac527ebf3294454815a2
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 666914aa76ca9f425dc7ef60b0110dbb1239fad3ae44ac49ba0ee59531b93d800cb2ca475c524ee359dbde4b21a0b97a89fa3f6910bb78d1b6737729ffddc1a9
+data.tar.gz: 4c9522bcc4e858a98e4b9c79abe2ecf845b0a8209479b802637936215c0a5c02e9c0853f103779618636774ec5ce55a7157ea8144eaadaa97f918a94e062d4e9
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -11,6 +11,7 @@
 - [SYNOPSIS](#synopsis)
 - [DESCRIPTION](#description)
 - [WHY?](#why)
+- [COMPATIBILITY](#compatibility)
 - [VERSION](#version)
 - [SEE ALSO](#see-also)
 - [Gems](#gems)
@@ -36,30 +37,47 @@ gem "string_splitter"
 require "string_splitter"
 
 ss = StringSplitter.new
+```
+
+**Same as `String#split`**
 
-
+```ruby
 ss.split("foo bar baz quux")
 ss.split("foo bar baz quux", " ")
 ss.split("foo bar baz quux", /\s+/)
 # => ["foo", "bar", "baz", "quux"]
+```
+
+**Split at the first delimiter**
 
-
+```ruby
 ss.split("foo:bar:baz:quux", ":", at: 1)
 # => ["foo", "bar:baz:quux"]
+```
+
+**Split at the last delimiter**
 
-
+```ruby
 ss.split("foo:bar:baz:quux", ":", at: -1)
 # => ["foo:bar:baz", "quux"]
+```
 
-
+**Split at multiple delimiter positions**
+
+```ruby
 ss.split("1:2:3:4:5:6:7:8:9", ":", at: [1..3, -2])
 # => ["1", "2", "3", "4:5:6:7", "8:9"]
+```
 
-
+**Split from the right**
+
+```ruby
 ss.rsplit("1:2:3:4:5:6:7:8:9", ":", at: [1..3, 5])
 # => ["1:2:3:4", "5:6", "7", "8", "9"]
+```
+**Full control via a block**
 
-
+```ruby
 result = ss.split('a:a:a:b:c:c:e:a:a:d:c', ":") do |split|
 split.index > 0 && split.lhs == split.rhs
 end
@@ -68,16 +86,16 @@ end
 
 # DESCRIPTION
 
-Many languages have built-in
+Many languages have built-in `split` functions/methods for strings. They behave similarly
 (notwithstanding the occasional [surprise](https://chriszetter.com/blog/2017/10/29/splitting-strings/)),
 and handle a few common cases e.g.:
 
 * limiting the number of splits
-* including the
+* including the separator(s) in the results
 * removing (some) empty fields
 
 But, because the API is squeezed into two overloaded parameters (the delimiter and the limit),
-achieving the desired
+achieving the desired results can be tricky. For instance, while `String#split` removes empty
 trailing fields (by default), it provides no way to remove *all* empty fields. Likewise, the
 cramped API means there's no way to e.g. combine a limit (positive integer) with the option
 to preserve empty fields (negative integer), or use backreferences in a delimiter pattern
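The limitations described in this README hunk can be reproduced with plain `String#split`, without the gem. The following is a minimal sketch; the sample string and the variable names are illustrative assumptions, not part of the package.

```ruby
# Plain Ruby, no gems: the limit parameter of String#split is overloaded.
line = "a,b,,c,,"

# By default, trailing empty fields are dropped (inner ones are kept).
default = line.split(",")

# A negative limit preserves the trailing empty fields...
preserved = line.split(",", -1)

# ...while a positive limit caps the number of fields. Both settings share
# one parameter, so they cannot be combined in a single call.
limited = line.split(",", 2)

# Removing *all* empty fields needs an extra step after the split.
no_empty = line.split(",").reject(&:empty?)

# A capture group in the delimiter includes the separator in the results.
with_separators = "a:b".split(/(:)/)

p default, preserved, limited, no_empty, with_separators
```

Each case is a separate call or a post-processing step, which is the gap the README says StringSplitter is meant to fill.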
@@ -117,7 +135,8 @@ ss.split('foo:bar:baz:quux', ':', at: [1, -1]) # => ["foo", "bar:baz", "quux"]
 I wanted to split semi-structured output into fields without having to resort to a regex or a full-blown parser.
 
 As an example, the nominally unstructured output of many Unix commands is often formatted in a way
-that's tantalizingly close to being machine-readable,
+that's tantalizingly close to being [machine-readable](https://en.wikipedia.org/wiki/Delimiter-separated_values),
+apart from a few pesky exceptions e.g.:
 
 ```bash
 $ ls -l
@@ -177,9 +196,14 @@ ss.split(line, at: [1..5, 8])
 # => ["-rw-r--r--", "1", "user", "users", "87", "Jun 18 18:16", "CHANGELOG.md"]
 ```
 
+# COMPATIBILITY
+
+StringSplitter is tested and supported on all versions of Ruby [supported by the ruby-core team](https://www.ruby-lang.org/en/downloads/branches/),
+i.e., currently, Ruby 2.3 and above.
+
 # VERSION
 
-0.5.
+0.5.1
 
 # SEE ALSO
 
@@ -201,3 +225,4 @@ Copyright © 2018 by chocolateboy.
 
 This is free software; you can redistribute it and/or modify it under the
 terms of the [Artistic License 2.0](http://www.opensource.org/licenses/artistic-license-2.0.php).
+
data/lib/string_splitter.rb
CHANGED
@@ -1,13 +1,14 @@
 # frozen_string_literal: true
 
 require 'values'
+require_relative 'string_splitter/version'
 
 # This class extends the functionality of +String#split+ by:
 #
 # - providing full control over which splits are accepted or rejected
 # - adding support for splitting from right-to-left
-# - encapsulating splitting options/preferences in
-# cram them into overloaded method parameters
+# - encapsulating splitting options/preferences in the splitter rather
+# than trying to cram them into overloaded method parameters
 #
 # These enhancements allow splits to handle many cases that otherwise require bigger
 # guns e.g. regex matching or parsing.
@@ -47,7 +48,7 @@ class StringSplitter
 reject: exclude,
 &block
 )
-result,
+result, splits, block = split_init(
 string: string,
 delimiter: delimiter,
 select: select,
@@ -55,8 +56,10 @@ class StringSplitter
 block: block
 )
 
-splits.
-
+count = splits.length
+
+splits.each_with_index do |split, index|
+split = Split.with(split.merge({ index: index, count: count }))
 result << split.lhs if result.empty?
 
 if block.call(split)
@@ -70,7 +73,7 @@ class StringSplitter
 
 result << split.rhs
 else
-#
+# concatenate the rhs
 result[-1] = result[-1] + split.separator + split.rhs
 end
 end
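The accept/reject loop visible in the hunks above can be approximated in a few lines. This is a simplified sketch under stated assumptions, not the gem's code: `SketchSplit` is a hypothetical `Struct` standing in for the gem's `Split` value class (which is built with the `values` gem), `accept_splits` is a made-up helper name, and any logic in the lines the diff does not show is omitted.

```ruby
# Hypothetical stand-in for the gem's Split value class.
SketchSplit = Struct.new(:lhs, :rhs, :separator, :index, :count, keyword_init: true)

# Walk the candidate splits left to right. Each split hash is enriched with
# its index and the total count (as in the 0.5.1 loop), then handed to the
# block, which decides whether the split is accepted.
def accept_splits(splits)
  count = splits.length
  result = []

  splits.each_with_index do |hash, index|
    split = SketchSplit.new(**hash, index: index, count: count)
    result << split.lhs if result.empty?

    if yield(split)
      result << split.rhs # accepted: start a new field
    else
      # rejected: concatenate the rhs onto the previous field
      result[-1] = result[-1] + split.separator + split.rhs
    end
  end

  result
end

splits = [
  { lhs: "foo", rhs: "bar", separator: ":" },
  { lhs: "bar", rhs: "baz", separator: ":" },
  { lhs: "baz", rhs: "quux", separator: ":" }
]

first_only = accept_splits(splits) { |split| split.index.zero? }
last_only = accept_splits(splits) { |split| split.index == split.count - 1 }

p first_only # => ["foo", "bar:baz:quux"]
p last_only  # => ["foo:bar:baz", "quux"]
```

Because every split carries both `index` and `count`, the block can tell how far it is from either end of the string.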
@@ -89,7 +92,7 @@ class StringSplitter
 reject: exclude,
 &block
 )
-result,
+result, splits, block = split_init(
 string: string,
 delimiter: delimiter,
 select: select,
@@ -97,8 +100,10 @@ class StringSplitter
 block: block
 )
 
-splits.
-
+count = splits.length
+
+splits.reverse!.each_with_index do |split, index|
+split = Split.with(split.merge({ index: index, count: count }))
 result.unshift(split.rhs) if result.empty?
 
 if block.call(split)
@@ -156,11 +161,18 @@ class StringSplitter
 [result, splits]
 end
 
-#
-
+# takes a hash of options passed to +split+ or +rsplit+ and returns a:
+#
+# [result, splits, block]
+#
+# triple, where `result` is the return value of the method, `splits` is an array
+# of hashes containing the lhs/rhs, separator and captures of each split, and
+# `block` is a proc which specifies whether each split should be accepted or
+# rejected
+def split_init(string:, delimiter:, select:, reject:, block:)
 unless (match = string.match(delimiter))
 result = (@remove_empty && string.empty?) ? [] : [string]
-return [result,
+return [result, NO_SPLITS, block]
 end
 
 select = Array(select)
@@ -180,10 +192,9 @@ class StringSplitter
 parts = string.split(/(#{delimiter})/, -1)
 remove_trailing_empty_field!(parts, ncaptures)
 result, splits = splits_for(parts, ncaptures)
-
-block ||= positions ? match_positions(positions, action, count) : ACCEPT_ALL
+block ||= positions ? match_positions(positions, action, splits.length) : ACCEPT_ALL
 
-[result,
+[result, splits, block]
 end
 
 # increment back-references so they remain valid when the outer capture
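The `string.split(/(#{delimiter})/, -1)` call in the hunk above wraps the delimiter in a capture group, so the separators come back interleaved with the fields. A rough sketch of how such a list could be turned into lhs/rhs/separator hashes follows. The helper name `sketch_splits` is hypothetical, and the sketch assumes the delimiter is a `Regexp` with no capture groups of its own; the gem's `splits_for` is not shown in this diff and may differ.

```ruby
# Hypothetical helper: build one hash per candidate split from the
# interleaved [field, separator, field, separator, ..., field] list.
def sketch_splits(string, delimiter)
  # The outer capture group keeps the separators; -1 keeps trailing empty fields.
  parts = string.split(/(#{delimiter})/, -1)

  pairs = parts.each_slice(2).to_a # [[field, separator], ..., [last_field]]
  fields = pairs.map(&:first)
  separators = pairs.map { |_field, separator| separator }.compact

  separators.each_with_index.map do |separator, i|
    { lhs: fields[i], rhs: fields[i + 1], separator: separator }
  end
end

colon_splits = sketch_splits("foo:bar:baz", /:/)
space_splits = sketch_splits("a  b", /\s+/)

p colon_splits
p space_splits
```

Keeping the matched separator in each hash is what allows a rejected split to be glued back together exactly, even when the delimiter is a pattern such as `/\s+/` that can match different text each time.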
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: string_splitter
 version: !ruby/object:Gem::Version
-version: 0.5.
+version: 0.5.1
 platform: ruby
 authors:
 - chocolateboy
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-
+date: 2018-07-01 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: values