unibits 2.4.0 → 2.9.0
- checksums.yaml +5 -5
- data/CHANGELOG.md +27 -0
- data/Gemfile +1 -1
- data/MIT-LICENSE.txt +1 -1
- data/README.md +17 -9
- data/Rakefile +5 -1
- data/bin/unibits +7 -3
- data/lib/unibits.rb +31 -12
- data/lib/unibits/version.rb +4 -2
- data/spec/unibits_spec.rb +7 -0
- data/unibits.gemspec +7 -7
- metadata +20 -16
- data/.travis.yml +0 -23
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-metadata.gz:
-data.tar.gz:
+SHA256:
+metadata.gz: 129aed0dcdc3467e6acf759da5068f140e352d52f44a23b2fa304db2227ebb88
+data.tar.gz: f64d12e177c3c4c88927bd71273bb13855c8eb0d05203a3cbe0381a38604b580
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 2289fa37584738aa6de0d7239cc1f4cccec419e4db287345ecb440d036a012bfd3055a2ecea4b26b10cacdda50e6965f56516503435789c5a660f90d46394fed
+data.tar.gz: 93d3e88519d3ec7cfebd37a32e7c5e3d51a23c1273c8708de0f82ce8c0cbe5258be1d49ec3270b5901cd15f94fd3cda288383727d8e772dc1e3849ffbc1f6f00
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,32 @@
 ## CHANGELOG
 
+### 2.9.0
+
+* Unicode 13
+* Improve terminal width detection / Windows support
+
+### 2.8.0
+
+* Unicode 12
+
+### 2.7.0
+
+* Unicode 11
+
+### 2.6.0
+
+* Support Unicode 10.0
+
+### 2.5.0
+
+* Double check UTF-32 only on Ruby versions which contain the bug
+* Highlight unassigned codepoints which are ignorable
+* Bump symbolify dependency
+* Add special characters (U+FFF9 - U+FFFC)
+* Non-control separators return ⏎
+* Bump characteristics dependency
+* Allow GB1988 encoding (7bit ascii-like)
+
 ### 2.4.0
 
 * Extract symbolification logic into extra [symbolify](https://github.com/janlelis/symbolify) gem (includes fixes and non-character detection)
data/Gemfile
CHANGED
data/MIT-LICENSE.txt
CHANGED
data/README.md
CHANGED
@@ -1,23 +1,26 @@
-# unibits | Reveal the Unicode [![[version]](https://badge.fury.io/rb/unibits.svg)](
+# unibits | Reveal the Unicode [![[version]](https://badge.fury.io/rb/unibits.svg)](https://badge.fury.io/rb/unibits) [![[ci]](https://github.com/janlelis/unibits/workflows/Test/badge.svg)](https://github.com/janlelis/unibits/actions?query=workflow%3ATest)
 
 Ruby library and CLI command that visualizes various Unicode and ASCII/single byte encodings in the terminal:
 
 - Makes analyzing encodings easier
 - Helps you with debugging strings
 - Highlights invalid/special/blank bytes/characters/codepoints
-- Supports *UTF-8*, *UTF-16LE*/*UTF-16BE*, *UTF-32LE*/*UTF-32BE*, *ISO-8859-X*, *Windows-125X*, *IBMX*, *CP85X*, *macX*, *TIS-620*/*Windows-874*, *KOI8-R*/*KOI8-U*, 7-Bit *ASCII*, and arbitrary *BINARY* data
+- Supports *UTF-8*, *UTF-16LE*/*UTF-16BE*, *UTF-32LE*/*UTF-32BE*, *ISO-8859-X*, *Windows-125X*, *IBMX*, *CP85X*, *macX*, *TIS-620*/*Windows-874*, *KOI8-R*/*KOI8-U*, 7-Bit *ASCII*/*GB1988*, and arbitrary *BINARY* data
 
 ## Color Coding
 
 Each byte of the given string is highlighted using the following mechanism (characters -> codepoints):
 
 - Red for invalid bytes
-- Orange for unassigned bytes/characters
-- Blue for control characters
 - Light blue for blanks
+- Blue for control characters
 - Non-control formatting characters in pink
 - Green for marks (Unicode only)
--
+- Orange for unassigned codepoints
+- Lighter orange for unassigned codepoints which are also ignorable
+- Random color for all other codepoints
+
+The same colors are used in the higher-level companion tool [uniscribe](https://github.com/janlelis/uniscribe).
 
 ## Setup
 
@@ -110,16 +113,21 @@ Example in Ruby: `unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'ascii'
 
 ## Notes
 
-
+More info
 
 - [Ruby's Encoding class](https://ruby-doc.org/core/Encoding.html)
-- [Characteristics gem](https://github.com/janlelis/characteristics)
 - [UTF-8 (Wikipedia)](https://en.wikipedia.org/wiki/UTF-8#Description)
 - [UTF-16 (Wikipedia)](https://en.wikipedia.org/wiki/UTF-16#Description)
 - [UTF-32 (Wikipedia)](https://en.wikipedia.org/wiki/UTF-32)
 - [Difference between BINARY and ASCII](http://idiosyncratic-ruby.com/56-us-ascii-8bit.html)
-
+
+Related gems
+
+- [uniscribe](https://github.com/janlelis/uniscribe)
+- [unicopy](https://github.com/janlelis/unicopy)
+- [symbolify](https://github.com/janlelis/symbolify)
+- [characteristics](https://github.com/janlelis/characteristics)
 
 Lots of thanks to @damienklinnert for the motivation and inspiration required to build this! 🎆
 
-Copyright (C) 2017 Jan Lelis <
+Copyright (C) 2017-2020 Jan Lelis <https://janlelis.com>. Released under the MIT license.
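The color rules in the README's Color Coding list correspond to the branch order visible in `determine_char_color` in the `data/lib/unibits.rb` diff. The following is a minimal sketch of that order, not the gem's code: `CharInfo` is a hypothetical stand-in for the object the characteristics gem returns, `pick_color` is an illustrative name, and the hex values are copied from the `COLORS` hash in this diff. Branches that the diff does not show (format, mark, random color) are left out.

```ruby
# Hex colors copied from the COLORS hash shown in the lib/unibits.rb diff.
COLORS = {
  invalid:    "#FF0000",
  control:    "#0000FF",
  blank:      "#33AADD",
  unassigned: "#FF5500",
  ignorable:  "#FFAA00",
}.freeze

# Hypothetical stand-in for a character info object; only the predicates
# used below are modelled. Omitted keywords default to nil (falsy).
CharInfo = Struct.new(:valid, :assigned, :unicode, :ignorable, :blank, :control,
                      keyword_init: true) do
  def valid?;     valid;     end
  def assigned?;  assigned;  end
  def unicode?;   unicode;   end
  def ignorable?; ignorable; end
  def blank?;     blank;     end
  def control?;   control;   end
end

# Illustrative helper (assumed name). Returns nil for characters that fall
# through to branches the diff does not show.
def pick_color(info)
  if !info.valid?
    COLORS[:invalid]
  elsif !info.assigned?
    # New in this diff: unassigned but ignorable codepoints get a lighter orange.
    info.unicode? && info.ignorable? ? COLORS[:ignorable] : COLORS[:unassigned]
  elsif info.blank?
    COLORS[:blank]
  elsif info.control?
    COLORS[:control]
  end
end
```

Note that the ignorable color only applies when the character info reports a Unicode encoding; for other encodings an unassigned byte keeps the regular unassigned color.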
data/Rakefile
CHANGED
@@ -32,7 +32,11 @@ end
 
 desc "#{gemspec.name} | Spec"
 task :spec do
-
+if RbConfig::CONFIG['host_os'] =~ /mswin|mingw/
+sh "for %f in (spec/\*.rb) do ruby spec/%f"
+else
+sh "for file in spec/*.rb; do ruby $file; done"
+end
 end
 task default: :spec
 
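The new `spec` task branches on `host_os` because the shell loop syntax differs between Windows and POSIX shells. One possible alternative, sketched here as an assumption rather than anything the gem does, is to enumerate the spec files in Ruby and start each one with the running interpreter, which avoids shell-specific loops altogether. `spec_commands` and `run_specs` are hypothetical helper names.

```ruby
require "rbconfig"

# Hypothetical helper: one [interpreter, file] command per spec file,
# in a stable (sorted) order.
def spec_commands(dir = "spec")
  Dir.glob(File.join(dir, "*.rb")).sort.map { |file| [RbConfig.ruby, file] }
end

# Runs the spec files one after another and stops at the first failure.
# Returns true only if every file exited successfully.
def run_specs(dir = "spec")
  spec_commands(dir).all? { |cmd| system(*cmd) }
end
```

Inside a Rakefile this could be called from the task body, for example `task(:spec) { abort unless run_specs }`.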
data/bin/unibits
CHANGED
@@ -44,8 +44,9 @@ if argv[:help]
 --convert <encoding> | -c | which encoding to convert to (if possible)
 --width <n> | -w | force a specific number of terminal columns
 --no-stats | | no stats header with length info
---version | | displays version of unibits
 --wide-ambiguous | | ambiguous characters
+--help | | this help page
+--version | | displays version of unibits
 
 #{Paint["ENCODINGS", :underline]}
 
@@ -53,11 +54,14 @@ if argv[:help]
 #{Paint["COLOR CODING", :underline]}
 
 #{Paint["invalid", Unibits::COLORS[:invalid]]}
-#{Paint["unassigned", Unibits::COLORS[:unassigned]]}
-#{Paint["control", Unibits::COLORS[:control]]}
 #{Paint["blank", Unibits::COLORS[:blank]]}
+#{Paint["control", Unibits::COLORS[:control]]}
 #{Paint["format", Unibits::COLORS[:format]]}
 #{Paint["mark", Unibits::COLORS[:mark]]}
+#{Paint["unassigned", Unibits::COLORS[:unassigned]]}
+#{Paint["unassigned and ignorable", Unibits::COLORS[:ignorable]]}
+
+random color for other characters
 
 #{Paint["STATS", :underline]}
 
data/lib/unibits.rb
CHANGED
@@ -22,32 +22,38 @@ module Unibits
 /^TIS-620$/,
 /^Windows-874$/,
 /^KOI8/,
+/^GB1988$/,
 )
 ).sort.freeze
 
 COLORS = {
 invalid: "#FF0000",
-unassigned: "#FF5500",
 control: "#0000FF",
 blank: "#33AADD",
 format: "#FF00FF",
 mark: "#228822",
+unassigned: "#FF5500",
+ignorable: "#FFAA00",
 }
 
 DEFAULT_TERMINAL_WIDTH = 80
 
 def self.of(string, encoding: nil, convert: nil, stats: true, wide_ambiguous: false, width: nil)
-
-
-
+string = convert_to_encoding_or_raise(string, encoding, convert)
+
+puts stats(string, wide_ambiguous: wide_ambiguous) if stats
+puts visualize(string, wide_ambiguous: wide_ambiguous, width: width)
+end
+
+def self.convert_to_encoding_or_raise(string, encoding, convert)
+raise ArgumentError, "no data given to unibits" if !string || string.empty?
 
 string = string.dup.force_encoding(encoding) if encoding
 string = string.encode(convert) if convert
 
 case string.encoding.name
 when *SUPPORTED_ENCODINGS
-
-puts visualize(string, wide_ambiguous: wide_ambiguous, width: width)
+string
 when 'UTF-16', 'UTF-32'
 raise ArgumentError, "unibits only supports #{string.encoding.name} with specified endianess, please use #{string.encoding.name}LE or #{string.encoding.name}BE"
 else
@@ -81,7 +87,17 @@ module Unibits
 puts
 string.each_char{ |char|
 char_info = Characteristics.create_for_type(char, type)
-
+
+if RUBY_VERSION >= "2.4.1" ||
+RUBY_VERSION < "2.4.0" && RUBY_VERSION >= "2.3.4" ||
+RUBY_VERSION < "2.3.0" && RUBY_VERSION >= "2.2.7" ||
+char_info.encoding.name[0, 6] != "UTF-32" ||
+!char_info.valid?
+# bug is fixed or not relevant
+else
+double_check_utf32_validness!(char, char_info)
+end
+
 current_color = determine_char_color(char_info)
 
 current_encoding_error = nil if char_info.valid?
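The version gate in this hunk compares `RUBY_VERSION` as a String, which is a lexicographic comparison. That matches numeric ordering for the single-digit version segments listed here, but would not for a segment such as `10`. A minimal sketch of the same three version ranges using `Gem::Version`, which compares segment by segment, could look as follows. `utf32_double_check_needed?` is a hypothetical helper name, and it covers only the version part of the condition, not the encoding and validity checks.

```ruby
require "rubygems"

# Hypothetical helper, not part of unibits. Returns true when the given Ruby
# version falls outside the ranges that the diff treats as already fixed:
#   >= 2.4.1, or 2.3.4 up to (not including) 2.4.0,
#   or 2.2.7 up to (not including) 2.3.0.
def utf32_double_check_needed?(ruby_version = RUBY_VERSION)
  v = Gem::Version.new(ruby_version)
  fixed =
    v >= Gem::Version.new("2.4.1") ||
    (v < Gem::Version.new("2.4.0") && v >= Gem::Version.new("2.3.4")) ||
    (v < Gem::Version.new("2.3.0") && v >= Gem::Version.new("2.2.7"))
  !fixed
end

utf32_double_check_needed?("2.4.0") # => true
utf32_double_check_needed?("2.3.4") # => false
```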
@@ -230,16 +246,20 @@ module Unibits
 end
 
 def self.determine_terminal_cols
-STDIN.winsize[1] || DEFAULT_TERMINAL_WIDTH
-rescue Errno::ENOTTY
-return DEFAULT_TERMINAL_WIDTH
+STDIN.winsize[1] || ENV['COLUMNS'] || DEFAULT_TERMINAL_WIDTH
+rescue Errno::ENOTTY, Errno::EBADF
+return ENV['COLUMNS'] || DEFAULT_TERMINAL_WIDTH
 end
 
 def self.determine_char_color(char_info)
 if !char_info.valid?
 COLORS[:invalid]
 elsif !char_info.assigned?
-
+if char_info.unicode? && char_info.ignorable?
+COLORS[:ignorable]
+else
+COLORS[:unassigned]
+end
 elsif char_info.blank?
 COLORS[:blank]
 elsif char_info.control?
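The new `determine_terminal_cols` falls back to `ENV['COLUMNS']` when `STDIN` is not a terminal. Values read from `ENV` are Strings in Ruby, and the diff does not show whether callers convert the result. The sketch below shows the same fallback chain with an explicit Integer conversion. It is an illustration under that assumption, not the gem's implementation; `terminal_cols` and `columns_from_env` are hypothetical names, and the `io:` and `env:` parameters exist only to make the sketch testable.

```ruby
require "io/console"

DEFAULT_TERMINAL_WIDTH = 80

# Hypothetical helper: parses COLUMNS as a base-10 Integer and falls back to
# the default when it is missing or not a number.
def columns_from_env(env)
  Integer(env["COLUMNS"].to_s, 10)
rescue ArgumentError
  DEFAULT_TERMINAL_WIDTH
end

# Hypothetical helper: asks the terminal first, then COLUMNS, then the default.
# Always returns an Integer.
def terminal_cols(io: STDIN, env: ENV)
  cols = io.winsize[1]
  cols && cols > 0 ? cols : columns_from_env(env)
rescue Errno::ENOTTY, Errno::EBADF
  # Same errors the diff rescues: input is not a TTY or the descriptor is bad.
  columns_from_env(env)
end
```

Returning an Integer from every branch means later width arithmetic does not need to care where the value came from.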
@@ -315,7 +335,6 @@ module Unibits
 end
 
 def self.double_check_utf32_validness!(char, char_info)
-return if RUBY_VERSION > "2.4.0" || char_info.encoding.name[0, 6] != "UTF-32" || !char_info.valid?
 byte_values = char.b.unpack("C*")
 le = char_info.encoding.name == 'UTF-32LE'
 if byte_values[le ? 2 : 1] > 16 ||
data/lib/unibits/version.rb
CHANGED
data/spec/unibits_spec.rb
CHANGED
@@ -67,6 +67,13 @@ describe Unibits do
 result.must_match "01000011"
 end
 
+it "works with GB1988" do
+result = Paint.unpaint(Unibits.visualize("ASCII string".force_encoding('GB1988')))
+result.must_match "C"
+result.must_match "43"
+result.must_match "01000011"
+end
+
 it "works with 'ISO-8859-X' encodings" do
 string = "\xBC Idiosyncr\xE4tic\n\x91".force_encoding("ISO-8859-1")
 result = Paint.unpaint(Unibits.visualize(string))
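The three values the GB1988 spec matches (`C`, `43`, `01000011`) are the character, hexadecimal, and binary views of one byte. The sketch below reproduces them with plain Ruby and without unibits, to show where the expected strings come from. `byte_views` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: for each byte, returns [character, hex, binary].
def byte_views(string)
  string.bytes.map { |b| [b.chr, format("%02X", b), format("%08b", b)] }
end

# Tagging a string as GB1988 changes only the encoding label, not the bytes.
str = "ASCII string".dup.force_encoding("GB1988")

str.encoding.name  # => "GB1988"
byte_views(str)[2] # => ["C", "43", "01000011"]
```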
data/unibits.gemspec
CHANGED
@@ -5,10 +5,10 @@ require File.dirname(__FILE__) + "/lib/unibits/version"
 Gem::Specification.new do |gem|
 gem.name = "unibits"
 gem.version = Unibits::VERSION
-gem.summary = "Visualizes encodings
+gem.summary = "Visualizes encodings"
 gem.description = "Visualizes encodings in the terminal. Supports UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE, US-ASCII, ASCII-8BIT, and most of Rubies single-byte encodings. Comes as CLI command and as Ruby Kernel method."
 gem.authors = ["Jan Lelis"]
-gem.email = ["
+gem.email = ["hi@ruby.consulting"]
 gem.homepage = "https://github.com/janlelis/unibits"
 gem.license = "MIT"
 
@@ -18,10 +18,10 @@ Gem::Specification.new do |gem|
 gem.require_paths = ["lib"]
 
 gem.add_dependency 'paint', '>= 0.9', '< 3.0'
-gem.add_dependency 'unicode-display_width', '~>
-gem.add_dependency 'symbolify', '~> 1.
-gem.add_dependency 'characteristics', '
-gem.add_dependency 'rationalist', '~> 2.0'
+gem.add_dependency 'unicode-display_width', '~> 2.0'
+gem.add_dependency 'symbolify', '~> 1.4'
+gem.add_dependency 'characteristics', '~> 1.3'
+gem.add_dependency 'rationalist', '~> 2.0', '>= 2.0.1'
 
-gem.required_ruby_version = "
+gem.required_ruby_version = ">= 2.0"
 end
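The rationalist dependency now carries two constraints, `'~> 2.0'` and `'>= 2.0.1'`, and a version must satisfy both. A small check with RubyGems' own `Gem::Requirement` class shows which versions the combination admits; the candidate version numbers are made up for illustration.

```ruby
require "rubygems"

# Both constraints from the gemspec line for rationalist.
requirement = Gem::Requirement.new("~> 2.0", ">= 2.0.1")

# Illustrative candidate versions (not taken from the diff).
candidates = %w[2.0.0 2.0.1 2.5.0 3.0.0]

results = candidates.map do |v|
  [v, requirement.satisfied_by?(Gem::Version.new(v))]
end

results.to_h
# => {"2.0.0"=>false, "2.0.1"=>true, "2.5.0"=>true, "3.0.0"=>false}
```

In effect the pair excludes only 2.0.0 from the range that `'~> 2.0'` alone would allow.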
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: unibits
 version: !ruby/object:Gem::Version
-version: 2.
+version: 2.9.0
 platform: ruby
 authors:
 - Jan Lelis
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2020-12-30 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: paint
@@ -36,42 +36,42 @@ dependencies:
 requirements:
 - - "~>"
 - !ruby/object:Gem::Version
-version: '
+version: '2.0'
 type: :runtime
 prerelease: false
 version_requirements: !ruby/object:Gem::Requirement
 requirements:
 - - "~>"
 - !ruby/object:Gem::Version
-version: '
+version: '2.0'
 - !ruby/object:Gem::Dependency
 name: symbolify
 requirement: !ruby/object:Gem::Requirement
 requirements:
 - - "~>"
 - !ruby/object:Gem::Version
-version: '1.
+version: '1.4'
 type: :runtime
 prerelease: false
 version_requirements: !ruby/object:Gem::Requirement
 requirements:
 - - "~>"
 - !ruby/object:Gem::Version
-version: '1.
+version: '1.4'
 - !ruby/object:Gem::Dependency
 name: characteristics
 requirement: !ruby/object:Gem::Requirement
 requirements:
-- - "
+- - "~>"
 - !ruby/object:Gem::Version
-version:
+version: '1.3'
 type: :runtime
 prerelease: false
 version_requirements: !ruby/object:Gem::Requirement
 requirements:
-- - "
+- - "~>"
 - !ruby/object:Gem::Version
-version:
+version: '1.3'
 - !ruby/object:Gem::Dependency
 name: rationalist
 requirement: !ruby/object:Gem::Requirement
@@ -79,6 +79,9 @@ dependencies:
 - - "~>"
 - !ruby/object:Gem::Version
 version: '2.0'
+- - ">="
+- !ruby/object:Gem::Version
+version: 2.0.1
 type: :runtime
 prerelease: false
 version_requirements: !ruby/object:Gem::Requirement
@@ -86,18 +89,20 @@ dependencies:
 - - "~>"
 - !ruby/object:Gem::Version
 version: '2.0'
+- - ">="
+- !ruby/object:Gem::Version
+version: 2.0.1
 description: Visualizes encodings in the terminal. Supports UTF-8, UTF-16LE, UTF-16BE,
 UTF-32LE, UTF-32BE, US-ASCII, ASCII-8BIT, and most of Rubies single-byte encodings.
 Comes as CLI command and as Ruby Kernel method.
 email:
--
+- hi@ruby.consulting
 executables:
 - unibits
 extensions: []
 extra_rdoc_files: []
 files:
 - ".gitignore"
-- ".travis.yml"
 - CHANGELOG.md
 - CODE_OF_CONDUCT.md
 - Gemfile
@@ -121,7 +126,7 @@ require_paths:
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
 requirements:
-- - "
+- - ">="
 - !ruby/object:Gem::Version
 version: '2.0'
 required_rubygems_version: !ruby/object:Gem::Requirement
@@ -130,10 +135,9 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 - !ruby/object:Gem::Version
 version: '0'
 requirements: []
-
-rubygems_version: 2.6.8
+rubygems_version: 3.2.3
 signing_key:
 specification_version: 4
-summary: Visualizes encodings
+summary: Visualizes encodings
 test_files:
 - spec/unibits_spec.rb
data/.travis.yml
DELETED
@@ -1,23 +0,0 @@
-sudo: false
-language: ruby
-
-rvm:
-- ruby-head
-- 2.4.1
-- 2.4.0
-- 2.3.3
-- 2.2
-- 2.1
-- 2.0
-- jruby-head
-- jruby-9.1.8.0
-
-cache:
-- bundler
-
-matrix:
-allow_failures:
-- rvm: jruby-head
-- rvm: ruby-head
-- rvm: 2.0
-# fast_finish: true