symbolify 1.0.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 9e0fa391ee606b11f9c99f4f33e9656f08692534
4
+ data.tar.gz: df940cb466c10f7894ca7a582c1dd87c6998a8a7
5
+ SHA512:
6
+ metadata.gz: f970bd71fb817eb2205f45f46b65cfdb6dbead36e302eccc2e15f6f138da921192744ba0800c77fcb379d3e9b3972b3727b3dcb49f9a9930f733ae944f46c1fb
7
+ data.tar.gz: 73ca12130eafeab368588ea617a45c6cf4c1fc27a2d1a7f4439ab99bc8d1792f08175ae403a76c5304759cfd20e48a586eaf8ca506fe233266da9fa560ee99fa
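To check a downloaded package against the digests listed above, a minimal sketch using Ruby's standard `digest` library could look like the following. The helper name `checksum_matches?` and the file path in the usage comment are assumptions made for illustration; they are not part of symbolify or RubyGems.

```ruby
require "digest"

# Hypothetical helper: returns true when the digest of `bytes`
# equals the expected hex string (compared case-insensitively).
def checksum_matches?(bytes, expected_hex, algorithm: Digest::SHA512)
  algorithm.hexdigest(bytes) == expected_hex.downcase
end

# Usage (assumes data.tar.gz was extracted from the .gem archive
# into the current directory):
#
#   checksum_matches?(File.binread("data.tar.gz"), "73ca1213...")
#   checksum_matches?(File.binread("metadata.gz"), "9e0fa391...", algorithm: Digest::SHA1)
```

Pass the full hex string from the listing; the values in the usage comment are shortened for readability only.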
@@ -0,0 +1,2 @@
1
+ Gemfile.lock
2
+ /pkg
@@ -0,0 +1,22 @@
1
+ sudo: false
2
+ language: ruby
3
+
4
+ rvm:
5
+ - ruby-head
6
+ - 2.4.1
7
+ - 2.3.3
8
+ - 2.2
9
+ - 2.1
10
+ - 2.0
11
+ - jruby-head
12
+ - jruby-9.1.8.0
13
+
14
+ cache:
15
+ - bundler
16
+
17
+ matrix:
18
+ allow_failures:
19
+ - rvm: jruby-head
20
+ - rvm: ruby-head
21
+ - rvm: 2.0
22
+ # fast_finish: true
@@ -0,0 +1,10 @@
1
+ ## CHANGELOG
2
+
3
+ ### 1.0.0
4
+
5
+ * Import from unibits gem
6
+ * Freeze all string literals
7
+ * Automatically create the character's characteristics when they are not passed in explicitly
8
+ * Add generic "dump" method of symbolification, which is used by `Symbolify.binary`
9
+ * Fix tag names so that the correct ones are used
10
+ * Support non-characters
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at opensource@janlelis.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,5 @@
1
+ source 'https://rubygems.org'
2
+
3
+ gemspec
4
+
5
+ gem 'minitest'
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2017 Jan Lelis, mail@janlelis.de
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,37 @@
1
+ # Symbolify [![[version]](https://badge.fury.io/rb/symbolify.svg)](http://badge.fury.io/rb/symbolify) [![[travis]](https://travis-ci.org/janlelis/symbolify.svg)](https://travis-ci.org/janlelis/symbolify)
2
+
3
+ Safely print all codepoints from Unicode and single-byte encodings in UTF-8. It replaces control and non-printable characters with readable equivalents and wraps most blank characters in `]` and `[`.
4
+
5
+ Programs that make use of this library: [unibits](https://github.com/janlelis/unibits), [uniscribe](https://github.com/janlelis/uniscribe)
6
+
7
+ ## Setup
8
+
9
+ Add to your `Gemfile`:
10
+
11
+ ```ruby
12
+ gem 'symbolify'
13
+ ```
14
+
15
+ ## Usage
16
+
17
+ ```ruby
18
+ puts Symbolify.symbolify "A" # A
19
+ puts Symbolify.symbolify "🌫" # 🌫
20
+ puts Symbolify.symbolify "\0" # ␀
21
+ puts Symbolify.symbolify "\n" # ␊
22
+ puts Symbolify.symbolify "\x7F" # ␡
23
+ puts Symbolify.symbolify "\u{84}" # IND
24
+ puts Symbolify.symbolify "\u{200F}" # RLM
25
+ puts Symbolify.symbolify "\u{2067}" # RLI
26
+ puts Symbolify.symbolify "\u{0300}" # ◌̀
27
+ puts Symbolify.symbolify " " # ] [
28
+ puts Symbolify.symbolify "\u{E0020}" # TAG ␠
29
+ puts Symbolify.symbolify "\u{E01D7}" # VS232
30
+ puts Symbolify.symbolify "\u{E0000}" # n/a
31
+ puts Symbolify.symbolify "\u{10FFFF}" # n/c
32
+ puts Symbolify.symbolify "\x80" # �
33
+ ```
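The control-character replacements in the usage examples above follow the Unicode Control Pictures block: U+2400 plus the code of a C0 control gives its picture, and U+2421 stands for DELETE. The sketch below illustrates that idea in plain Ruby without loading the gem. The names `visible` and `visible_string` are hypothetical, and the sketch covers only C0 controls, DELETE and the ASCII space, not the full behaviour of `Symbolify.symbolify`.

```ruby
# Sketch only: a simplified stand-in, not the gem's implementation.
CONTROL_PICTURES_BASE = 0x2400 # U+2400 SYMBOL FOR NULL
DELETE_PICTURE = "\u2421"      # U+2421 SYMBOL FOR DELETE

# Returns a printable representation of a single character.
def visible(char)
  ord = char.ord
  if ord == 0x7F
    DELETE_PICTURE
  elsif ord < 0x20
    # C0 controls map one-to-one onto U+2400..U+241F
    [CONTROL_PICTURES_BASE + ord].pack("U")
  elsif char == " "
    # Wrap the blank so that it becomes visible
    "]" + char + "["
  else
    char
  end
end

# Applies `visible` to every character of a string.
def visible_string(string)
  string.each_char.map { |char| visible(char) }.join
end

puts visible("\n")            # ␊
puts visible_string("a b\n")  # a] [b␊
```

With the gem installed, the equivalent loop would call `Symbolify.symbolify` once per character, since the method takes a single character.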
34
+
35
+ ## MIT License
36
+
37
+ Copyright (C) 2017 Jan Lelis <http://janlelis.com>. Released under the MIT license.
@@ -0,0 +1,38 @@
1
+ # # #
2
+ # Get gemspec info
3
+
4
+ gemspec_file = Dir['*.gemspec'].first
5
+ gemspec = eval File.read(gemspec_file), binding, gemspec_file
6
+ info = "#{gemspec.name} | #{gemspec.version} | " \
7
+ "#{gemspec.runtime_dependencies.size} dependencies | " \
8
+ "#{gemspec.files.size} files"
9
+
10
+ # # #
11
+ # Gem build and install task
12
+
13
+ desc info
14
+ task :gem do
15
+ puts info + "\n\n"
16
+ print " "; sh "gem build #{gemspec_file}"
17
+ FileUtils.mkdir_p 'pkg'
18
+ FileUtils.mv "#{gemspec.name}-#{gemspec.version}.gem", 'pkg'
19
+ puts; sh %{gem install --no-document pkg/#{gemspec.name}-#{gemspec.version}.gem}
20
+ end
21
+
22
+ # # #
23
+ # Start an IRB session with the gem loaded
24
+
25
+ desc "#{gemspec.name} | IRB"
26
+ task :irb do
27
+ sh "irb -I ./lib -r #{gemspec.name.gsub '-','/'}"
28
+ end
29
+
30
+ # # #
31
+ # Run specs
32
+
33
+ desc "#{gemspec.name} | Spec"
34
+ task :spec do
35
+ sh "for file in spec/*_spec.rb; do ruby $file; done"
36
+ end
37
+ task default: :spec
38
+
@@ -0,0 +1,609 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "symbolify/version"
4
+
5
+ require "characteristics"
6
+
7
+ module Symbolify
8
+ NO_UTF8_CONVERTER = /^(Windows-1258|IBM864|macCentEuro|macThai)/
9
+
10
+ REPLACEMENT_CHAR = "�"
11
+
12
+ CONTROL_C0_SYMBOLS = [
13
+ "␀",
14
+ "␁",
15
+ "␂",
16
+ "␃",
17
+ "␄",
18
+ "␅",
19
+ "␆",
20
+ "␇",
21
+ "␈",
22
+ "␉",
23
+ "␊",
24
+ "␋",
25
+ "␌",
26
+ "␍",
27
+ "␎",
28
+ "␏",
29
+ "␐",
30
+ "␑",
31
+ "␒",
32
+ "␓",
33
+ "␔",
34
+ "␕",
35
+ "␖",
36
+ "␗",
37
+ "␘",
38
+ "␙",
39
+ "␚",
40
+ "␛",
41
+ "␜",
42
+ "␝",
43
+ "␞",
44
+ "␟",
45
+ ].freeze
46
+
47
+ CONTROL_DELETE_SYMBOL = "␡"
48
+
49
+ CONTROL_C1_NAMES = {
50
+ 0x80 => "PAD",
51
+ 0x81 => "HOP",
52
+ 0x82 => "BPH",
53
+ 0x83 => "NBH",
54
+ 0x84 => "IND",
55
+ 0x85 => "NEL",
56
+ 0x86 => "SSA",
57
+ 0x87 => "ESA",
58
+ 0x88 => "HTS",
59
+ 0x89 => "HTJ",
60
+ 0x8A => "VTS",
61
+ 0x8B => "PLD",
62
+ 0x8C => "PLU",
63
+ 0x8D => "RI",
64
+ 0x8E => "SS2",
65
+ 0x8F => "SS3",
66
+ 0x90 => "DCS",
67
+ 0x91 => "PU1",
68
+ 0x92 => "PU2",
69
+ 0x93 => "STS",
70
+ 0x94 => "CCH",
71
+ 0x95 => "MW",
72
+ 0x96 => "SPA",
73
+ 0x97 => "EPA",
74
+ 0x98 => "SOS",
75
+ 0x99 => "SGC",
76
+ 0x9A => "SCI",
77
+ 0x9B => "CSI",
78
+ 0x9C => "ST",
79
+ 0x9D => "OSC",
80
+ 0x9E => "PM",
81
+ 0x9F => "APC",
82
+ }.freeze
83
+
84
+ BIDI_CONTROL_NAMES = {
85
+ 0x061C => "ALM",
86
+ 0x200E => "LRM",
87
+ 0x200F => "RLM",
88
+ 0x202A => "LRE",
89
+ 0x202B => "RLE",
90
+ 0x202C => "PDF",
91
+ 0x202D => "LRO",
92
+ 0x202E => "RLO",
93
+ 0x2066 => "LRI",
94
+ 0x2067 => "RLI",
95
+ 0x2068 => "FSI",
96
+ 0x2069 => "PDI",
97
+ }.freeze
98
+
99
+ TAG_NAMES = {
100
+ 0xE0001 => "LANG TAG",
101
+ 0xE0020 => "TAG ␠",
102
+ 0xE0021 => "TAG !",
103
+ 0xE0022 => "TAG \"",
104
+ 0xE0023 => "TAG #",
105
+ 0xE0024 => "TAG $",
106
+ 0xE0025 => "TAG %",
107
+ 0xE0026 => "TAG &",
108
+ 0xE0027 => "TAG '",
109
+ 0xE0028 => "TAG (",
110
+ 0xE0029 => "TAG )",
111
+ 0xE002A => "TAG *",
112
+ 0xE002B => "TAG +",
113
+ 0xE002C => "TAG ,",
114
+ 0xE002D => "TAG -",
115
+ 0xE002E => "TAG .",
116
+ 0xE002F => "TAG /",
117
+ 0xE0030 => "TAG 0",
118
+ 0xE0031 => "TAG 1",
119
+ 0xE0032 => "TAG 2",
120
+ 0xE0033 => "TAG 3",
121
+ 0xE0034 => "TAG 4",
122
+ 0xE0035 => "TAG 5",
123
+ 0xE0036 => "TAG 6",
124
+ 0xE0037 => "TAG 7",
125
+ 0xE0038 => "TAG 8",
126
+ 0xE0039 => "TAG 9",
127
+ 0xE003A => "TAG :",
128
+ 0xE003B => "TAG ;",
129
+ 0xE003C => "TAG <",
130
+ 0xE003D => "TAG =",
131
+ 0xE003E => "TAG >",
132
+ 0xE003F => "TAG ?",
133
+ 0xE0040 => "TAG @",
134
+ 0xE0041 => "TAG A",
135
+ 0xE0042 => "TAG B",
136
+ 0xE0043 => "TAG C",
137
+ 0xE0044 => "TAG D",
138
+ 0xE0045 => "TAG E",
139
+ 0xE0046 => "TAG F",
140
+ 0xE0047 => "TAG G",
141
+ 0xE0048 => "TAG H",
142
+ 0xE0049 => "TAG I",
143
+ 0xE004A => "TAG J",
144
+ 0xE004B => "TAG K",
145
+ 0xE004C => "TAG L",
146
+ 0xE004D => "TAG M",
147
+ 0xE004E => "TAG N",
148
+ 0xE004F => "TAG O",
149
+ 0xE0050 => "TAG P",
150
+ 0xE0051 => "TAG Q",
151
+ 0xE0052 => "TAG R",
152
+ 0xE0053 => "TAG S",
153
+ 0xE0054 => "TAG T",
154
+ 0xE0055 => "TAG U",
155
+ 0xE0056 => "TAG V",
156
+ 0xE0057 => "TAG W",
157
+ 0xE0058 => "TAG X",
158
+ 0xE0059 => "TAG Y",
159
+ 0xE005A => "TAG Z",
160
+ 0xE005B => "TAG [",
161
+ 0xE005C => "TAG \\",
162
+ 0xE005D => "TAG ]",
163
+ 0xE005E => "TAG ^",
164
+ 0xE005F => "TAG _",
165
+ 0xE0060 => "TAG `",
166
+ 0xE0061 => "TAG a",
167
+ 0xE0062 => "TAG b",
168
+ 0xE0063 => "TAG c",
169
+ 0xE0064 => "TAG d",
170
+ 0xE0065 => "TAG e",
171
+ 0xE0066 => "TAG f",
172
+ 0xE0067 => "TAG g",
173
+ 0xE0068 => "TAG h",
174
+ 0xE0069 => "TAG i",
175
+ 0xE006A => "TAG j",
176
+ 0xE006B => "TAG k",
177
+ 0xE006C => "TAG l",
178
+ 0xE006D => "TAG m",
179
+ 0xE006E => "TAG n",
180
+ 0xE006F => "TAG o",
181
+ 0xE0070 => "TAG p",
182
+ 0xE0071 => "TAG q",
183
+ 0xE0072 => "TAG r",
184
+ 0xE0073 => "TAG s",
185
+ 0xE0074 => "TAG t",
186
+ 0xE0075 => "TAG u",
187
+ 0xE0076 => "TAG v",
188
+ 0xE0077 => "TAG w",
189
+ 0xE0078 => "TAG x",
190
+ 0xE0079 => "TAG y",
191
+ 0xE007A => "TAG z",
192
+ 0xE007B => "TAG {",
193
+ 0xE007C => "TAG |",
194
+ 0xE007D => "TAG }",
195
+ 0xE007E => "TAG ~",
196
+ 0xE007F => "TAG ␡",
197
+ }.freeze
198
+
199
+ VARIATION_SELECTOR_NAMES = {
200
+ 0x180B => "FVS1",
201
+ 0x180C => "FVS2",
202
+ 0x180D => "FVS3",
203
+
204
+ 0xFE00 => "VS1",
205
+ 0xFE01 => "VS2",
206
+ 0xFE02 => "VS3",
207
+ 0xFE03 => "VS4",
208
+ 0xFE04 => "VS5",
209
+ 0xFE05 => "VS6",
210
+ 0xFE06 => "VS7",
211
+ 0xFE07 => "VS8",
212
+ 0xFE08 => "VS9",
213
+ 0xFE09 => "VS10",
214
+ 0xFE0A => "VS11",
215
+ 0xFE0B => "VS12",
216
+ 0xFE0C => "VS13",
217
+ 0xFE0D => "VS14",
218
+ 0xFE0E => "VS15",
219
+ 0xFE0F => "VS16",
220
+
221
+ 0xE0100 => "VS17",
222
+ 0xE0101 => "VS18",
223
+ 0xE0102 => "VS19",
224
+ 0xE0103 => "VS20",
225
+ 0xE0104 => "VS21",
226
+ 0xE0105 => "VS22",
227
+ 0xE0106 => "VS23",
228
+ 0xE0107 => "VS24",
229
+ 0xE0108 => "VS25",
230
+ 0xE0109 => "VS26",
231
+ 0xE010A => "VS27",
232
+ 0xE010B => "VS28",
233
+ 0xE010C => "VS29",
234
+ 0xE010D => "VS30",
235
+ 0xE010E => "VS31",
236
+ 0xE010F => "VS32",
237
+ 0xE0110 => "VS33",
238
+ 0xE0111 => "VS34",
239
+ 0xE0112 => "VS35",
240
+ 0xE0113 => "VS36",
241
+ 0xE0114 => "VS37",
242
+ 0xE0115 => "VS38",
243
+ 0xE0116 => "VS39",
244
+ 0xE0117 => "VS40",
245
+ 0xE0118 => "VS41",
246
+ 0xE0119 => "VS42",
247
+ 0xE011A => "VS43",
248
+ 0xE011B => "VS44",
249
+ 0xE011C => "VS45",
250
+ 0xE011D => "VS46",
251
+ 0xE011E => "VS47",
252
+ 0xE011F => "VS48",
253
+ 0xE0120 => "VS49",
254
+ 0xE0121 => "VS50",
255
+ 0xE0122 => "VS51",
256
+ 0xE0123 => "VS52",
257
+ 0xE0124 => "VS53",
258
+ 0xE0125 => "VS54",
259
+ 0xE0126 => "VS55",
260
+ 0xE0127 => "VS56",
261
+ 0xE0128 => "VS57",
262
+ 0xE0129 => "VS58",
263
+ 0xE012A => "VS59",
264
+ 0xE012B => "VS60",
265
+ 0xE012C => "VS61",
266
+ 0xE012D => "VS62",
267
+ 0xE012E => "VS63",
268
+ 0xE012F => "VS64",
269
+ 0xE0130 => "VS65",
270
+ 0xE0131 => "VS66",
271
+ 0xE0132 => "VS67",
272
+ 0xE0133 => "VS68",
273
+ 0xE0134 => "VS69",
274
+ 0xE0135 => "VS70",
275
+ 0xE0136 => "VS71",
276
+ 0xE0137 => "VS72",
277
+ 0xE0138 => "VS73",
278
+ 0xE0139 => "VS74",
279
+ 0xE013A => "VS75",
280
+ 0xE013B => "VS76",
281
+ 0xE013C => "VS77",
282
+ 0xE013D => "VS78",
283
+ 0xE013E => "VS79",
284
+ 0xE013F => "VS80",
285
+ 0xE0140 => "VS81",
286
+ 0xE0141 => "VS82",
287
+ 0xE0142 => "VS83",
288
+ 0xE0143 => "VS84",
289
+ 0xE0144 => "VS85",
290
+ 0xE0145 => "VS86",
291
+ 0xE0146 => "VS87",
292
+ 0xE0147 => "VS88",
293
+ 0xE0148 => "VS89",
294
+ 0xE0149 => "VS90",
295
+ 0xE014A => "VS91",
296
+ 0xE014B => "VS92",
297
+ 0xE014C => "VS93",
298
+ 0xE014D => "VS94",
299
+ 0xE014E => "VS95",
300
+ 0xE014F => "VS96",
301
+ 0xE0150 => "VS97",
302
+ 0xE0151 => "VS98",
303
+ 0xE0152 => "VS99",
304
+ 0xE0153 => "VS100",
305
+ 0xE0154 => "VS101",
306
+ 0xE0155 => "VS102",
307
+ 0xE0156 => "VS103",
308
+ 0xE0157 => "VS104",
309
+ 0xE0158 => "VS105",
310
+ 0xE0159 => "VS106",
311
+ 0xE015A => "VS107",
312
+ 0xE015B => "VS108",
313
+ 0xE015C => "VS109",
314
+ 0xE015D => "VS110",
315
+ 0xE015E => "VS111",
316
+ 0xE015F => "VS112",
317
+ 0xE0160 => "VS113",
318
+ 0xE0161 => "VS114",
319
+ 0xE0162 => "VS115",
320
+ 0xE0163 => "VS116",
321
+ 0xE0164 => "VS117",
322
+ 0xE0165 => "VS118",
323
+ 0xE0166 => "VS119",
324
+ 0xE0167 => "VS120",
325
+ 0xE0168 => "VS121",
326
+ 0xE0169 => "VS122",
327
+ 0xE016A => "VS123",
328
+ 0xE016B => "VS124",
329
+ 0xE016C => "VS125",
330
+ 0xE016D => "VS126",
331
+ 0xE016E => "VS127",
332
+ 0xE016F => "VS128",
333
+ 0xE0170 => "VS129",
334
+ 0xE0171 => "VS130",
335
+ 0xE0172 => "VS131",
336
+ 0xE0173 => "VS132",
337
+ 0xE0174 => "VS133",
338
+ 0xE0175 => "VS134",
339
+ 0xE0176 => "VS135",
340
+ 0xE0177 => "VS136",
341
+ 0xE0178 => "VS137",
342
+ 0xE0179 => "VS138",
343
+ 0xE017A => "VS139",
344
+ 0xE017B => "VS140",
345
+ 0xE017C => "VS141",
346
+ 0xE017D => "VS142",
347
+ 0xE017E => "VS143",
348
+ 0xE017F => "VS144",
349
+ 0xE0180 => "VS145",
350
+ 0xE0181 => "VS146",
351
+ 0xE0182 => "VS147",
352
+ 0xE0183 => "VS148",
353
+ 0xE0184 => "VS149",
354
+ 0xE0185 => "VS150",
355
+ 0xE0186 => "VS151",
356
+ 0xE0187 => "VS152",
357
+ 0xE0188 => "VS153",
358
+ 0xE0189 => "VS154",
359
+ 0xE018A => "VS155",
360
+ 0xE018B => "VS156",
361
+ 0xE018C => "VS157",
362
+ 0xE018D => "VS158",
363
+ 0xE018E => "VS159",
364
+ 0xE018F => "VS160",
365
+ 0xE0190 => "VS161",
366
+ 0xE0191 => "VS162",
367
+ 0xE0192 => "VS163",
368
+ 0xE0193 => "VS164",
369
+ 0xE0194 => "VS165",
370
+ 0xE0195 => "VS166",
371
+ 0xE0196 => "VS167",
372
+ 0xE0197 => "VS168",
373
+ 0xE0198 => "VS169",
374
+ 0xE0199 => "VS170",
375
+ 0xE019A => "VS171",
376
+ 0xE019B => "VS172",
377
+ 0xE019C => "VS173",
378
+ 0xE019D => "VS174",
379
+ 0xE019E => "VS175",
380
+ 0xE019F => "VS176",
381
+ 0xE01A0 => "VS177",
382
+ 0xE01A1 => "VS178",
383
+ 0xE01A2 => "VS179",
384
+ 0xE01A3 => "VS180",
385
+ 0xE01A4 => "VS181",
386
+ 0xE01A5 => "VS182",
387
+ 0xE01A6 => "VS183",
388
+ 0xE01A7 => "VS184",
389
+ 0xE01A8 => "VS185",
390
+ 0xE01A9 => "VS186",
391
+ 0xE01AA => "VS187",
392
+ 0xE01AB => "VS188",
393
+ 0xE01AC => "VS189",
394
+ 0xE01AD => "VS190",
395
+ 0xE01AE => "VS191",
396
+ 0xE01AF => "VS192",
397
+ 0xE01B0 => "VS193",
398
+ 0xE01B1 => "VS194",
399
+ 0xE01B2 => "VS195",
400
+ 0xE01B3 => "VS196",
401
+ 0xE01B4 => "VS197",
402
+ 0xE01B5 => "VS198",
403
+ 0xE01B6 => "VS199",
404
+ 0xE01B7 => "VS200",
405
+ 0xE01B8 => "VS201",
406
+ 0xE01B9 => "VS202",
407
+ 0xE01BA => "VS203",
408
+ 0xE01BB => "VS204",
409
+ 0xE01BC => "VS205",
410
+ 0xE01BD => "VS206",
411
+ 0xE01BE => "VS207",
412
+ 0xE01BF => "VS208",
413
+ 0xE01C0 => "VS209",
414
+ 0xE01C1 => "VS210",
415
+ 0xE01C2 => "VS211",
416
+ 0xE01C3 => "VS212",
417
+ 0xE01C4 => "VS213",
418
+ 0xE01C5 => "VS214",
419
+ 0xE01C6 => "VS215",
420
+ 0xE01C7 => "VS216",
421
+ 0xE01C8 => "VS217",
422
+ 0xE01C9 => "VS218",
423
+ 0xE01CA => "VS219",
424
+ 0xE01CB => "VS220",
425
+ 0xE01CC => "VS221",
426
+ 0xE01CD => "VS222",
427
+ 0xE01CE => "VS223",
428
+ 0xE01CF => "VS224",
429
+ 0xE01D0 => "VS225",
430
+ 0xE01D1 => "VS226",
431
+ 0xE01D2 => "VS227",
432
+ 0xE01D3 => "VS228",
433
+ 0xE01D4 => "VS229",
434
+ 0xE01D5 => "VS230",
435
+ 0xE01D6 => "VS231",
436
+ 0xE01D7 => "VS232",
437
+ 0xE01D8 => "VS233",
438
+ 0xE01D9 => "VS234",
439
+ 0xE01DA => "VS235",
440
+ 0xE01DB => "VS236",
441
+ 0xE01DC => "VS237",
442
+ 0xE01DD => "VS238",
443
+ 0xE01DE => "VS239",
444
+ 0xE01DF => "VS240",
445
+ 0xE01E0 => "VS241",
446
+ 0xE01E1 => "VS242",
447
+ 0xE01E2 => "VS243",
448
+ 0xE01E3 => "VS244",
449
+ 0xE01E4 => "VS245",
450
+ 0xE01E5 => "VS246",
451
+ 0xE01E6 => "VS247",
452
+ 0xE01E7 => "VS248",
453
+ 0xE01E8 => "VS249",
454
+ 0xE01E9 => "VS250",
455
+ 0xE01EA => "VS251",
456
+ 0xE01EB => "VS252",
457
+ 0xE01EC => "VS253",
458
+ 0xE01ED => "VS254",
459
+ 0xE01EE => "VS255",
460
+ 0xE01EF => "VS256",
461
+ }.freeze
462
+
463
+ NONCHARACTERS = [
464
+ *0xFDD0..0xFDEF,
465
+ 0xFFFE, 0xFFFF,
466
+ 0x1FFFE, 0x1FFFF,
467
+ 0x2FFFE, 0x2FFFF,
468
+ 0x3FFFE, 0x3FFFF,
469
+ 0x4FFFE, 0x4FFFF,
470
+ 0x5FFFE, 0x5FFFF,
471
+ 0x6FFFE, 0x6FFFF,
472
+ 0x7FFFE, 0x7FFFF,
473
+ 0x8FFFE, 0x8FFFF,
474
+ 0x9FFFE, 0x9FFFF,
475
+ 0xAFFFE, 0xAFFFF,
476
+ 0xBFFFE, 0xBFFFF,
477
+ 0xCFFFE, 0xCFFFF,
478
+ 0xDFFFE, 0xDFFFF,
479
+ 0xEFFFE, 0xEFFFF,
480
+ 0xFFFFE, 0xFFFFF,
481
+ 0x10FFFE, 0x10FFFF,
482
+ ].freeze
483
+
484
+ INTERESTING_BYTES_ENCODINGS = {
485
+ 0xD8 => /^macCroatian/,
486
+ 0xF0 => /^mac(Iceland|Roman|Turkish)/,
487
+ 0xFD => /^(ISO-8859-8|Windows-(1255|1256))/,
488
+ 0xFE => /^(ISO-8859-8|Windows-(1255|1256))/,
489
+ }.freeze
490
+
491
+ INTERESTING_BYTES_VALUES = {
492
+ 0xD8 => "Logo",
493
+ 0xF0 => "Logo",
494
+ 0xFD => "LRM",
495
+ 0xFE => "RLM",
496
+ }.freeze
497
+
498
+ MAC_KEY_SYMBOLS = {
499
+ 0x11 => "⌘",
500
+ 0x12 => "⇧",
501
+ 0x13 => "⌥",
502
+ 0x14 => "⌃",
503
+ }.freeze
504
+
505
+ def self.symbolify(char, char_info = Characteristics.create(char))
506
+ if !char_info.valid?
507
+ REPLACEMENT_CHAR
508
+ else
509
+ case char_info
510
+ when UnicodeCharacteristics
511
+ Symbolify.unicode(char, char_info)
512
+ when ByteCharacteristics
513
+ Symbolify.byte(char, char_info)
514
+ when AsciiCharacteristics
515
+ Symbolify.ascii(char, char_info)
516
+ else
517
+ Symbolify.binary(char)
518
+ end
519
+ end
520
+ end
521
+
522
+ def self.unicode(char, char_info = UnicodeCharacteristics.new(char))
523
+ if !char_info.assigned?
524
+ if NONCHARACTERS.include?(char.ord)
525
+ return "n/c"
526
+ else
527
+ return "n/a"
528
+ end
529
+ end
530
+
531
+ char = char.dup.encode("UTF-8")
532
+ ord = char.ord
533
+
534
+ if char_info.delete?
535
+ char = CONTROL_DELETE_SYMBOL
536
+ elsif char_info.c0?
537
+ char = CONTROL_C0_SYMBOLS[ord]
538
+ elsif char_info.c1?
539
+ char = CONTROL_C1_NAMES[ord]
540
+ elsif char_info.bidi_control?
541
+ char = BIDI_CONTROL_NAMES[ord]
542
+ elsif VARIATION_SELECTOR_NAMES.key?(ord)
543
+ char = VARIATION_SELECTOR_NAMES[ord]
544
+ elsif char_info.category == "Mn"
545
+ char = "◌" + char
546
+ elsif char_info.category == "Me"
547
+ char = " " + char
548
+ elsif char_info.blank?
549
+ char = "]" + char + "["
550
+ elsif TAG_NAMES.key?(ord)
551
+ char = TAG_NAMES[ord]
552
+ end
553
+
554
+ char
555
+ end
556
+
557
+ def self.byte(char, char_info = ByteCharacteristics.new(char))
558
+ return "n/a" if !char_info.assigned?
559
+
560
+ ord = char.ord
561
+ encoding = char_info.encoding
562
+ no_converter = !!(NO_UTF8_CONVERTER =~ encoding.name)
563
+ treat_char_unconverted = false
564
+
565
+ if char_info.delete?
566
+ char = CONTROL_DELETE_SYMBOL
567
+ elsif char_info.c0?
568
+ if ord >= 0x11 && ord <= 0x14 && encoding.name =~ /^mac/
569
+ char = MAC_KEY_SYMBOLS[ord]
570
+ else
571
+ char = CONTROL_C0_SYMBOLS[ord]
572
+ end
573
+ elsif char_info.c1?
574
+ char = CONTROL_C1_NAMES[ord]
575
+ elsif no_converter
576
+ treat_char_unconverted = true
577
+ elsif char_info.blank?
578
+ char = "]".encode(encoding) + char + "[".encode(encoding)
579
+ elsif INTERESTING_BYTES_ENCODINGS[ord] =~ encoding.name
580
+ char = INTERESTING_BYTES_VALUES[ord]
581
+ end
582
+
583
+ if no_converter && treat_char_unconverted
584
+ dump(char)
585
+ else
586
+ char.encode("UTF-8")
587
+ end
588
+ end
589
+
590
+ def self.ascii(char, char_info = AsciiCharacteristics.new(char))
591
+ if char_info.delete?
592
+ char = CONTROL_DELETE_SYMBOL
593
+ elsif char_info.c0?
594
+ char = CONTROL_C0_SYMBOLS[char.ord]
595
+ elsif char_info.blank?
596
+ char = "]" + char + "["
597
+ end
598
+
599
+ char
600
+ end
601
+
602
+ def self.binary(char, _ = nil)
603
+ dump(char)
604
+ end
605
+
606
+ def self.dump(char)
607
+ char.dump
608
+ end
609
+ end
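Two of the lookup tables in the file above follow regular patterns, which makes them easy to cross-check. The sketch below derives variation selector names and the noncharacter list arithmetically. Both helper names are hypothetical and the gem itself uses the literal tables shown in the diff; this is only a way to reason about their contents.

```ruby
# Hypothetical helpers, not part of the gem.

# Derives the short name of a variation selector from its codepoint.
# Returns nil for codepoints that are not variation selectors.
def variation_selector_name(ord)
  case ord
  when 0x180B..0x180D   then "FVS#{ord - 0x180B + 1}"  # Mongolian free variation selectors
  when 0xFE00..0xFE0F   then "VS#{ord - 0xFE00 + 1}"   # VS1..VS16
  when 0xE0100..0xE01EF then "VS#{ord - 0xE0100 + 17}" # VS17..VS256
  end
end

# Builds the list of noncharacters: U+FDD0..U+FDEF plus the last
# two codepoints of each of the 17 planes.
def noncharacters
  plane_ends = (0..0x10).flat_map do |plane|
    [plane * 0x10000 + 0xFFFE, plane * 0x10000 + 0xFFFF]
  end
  [*0xFDD0..0xFDEF, *plane_ends]
end

puts variation_selector_name(0xE01D7) # VS232
puts noncharacters.size               # 66
```

A literal table trades a longer source file for constant-time lookup and easy auditing; the arithmetic form is shorter but hides the individual entries.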
@@ -0,0 +1,7 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Symbolify
4
+ VERSION = "1.0.0".freeze
5
+ UNICODE_VERSION = "9.0.0".freeze
6
+ end
7
+
@@ -0,0 +1,119 @@
1
+ require_relative "../lib/symbolify"
2
+ require "minitest/autorun"
3
+
4
+ describe Symbolify do
5
+ describe ".symbolify" do
6
+ it "will show replacement character for invalid characters" do
7
+ assert_equal "�", Symbolify.symbolify("\x80")
8
+ end
9
+ end
10
+
11
+ describe ".unicode" do
12
+ it "works with normal characters" do
13
+ assert_equal "A", Symbolify.symbolify("A")
14
+ assert_equal "🌫", Symbolify.symbolify("🌫")
15
+ end
16
+
17
+ it "replaces C0 control characters" do
18
+ assert_equal "␀", Symbolify.symbolify("\0")
19
+ assert_equal "␊", Symbolify.symbolify("\n")
20
+ assert_equal "␡", Symbolify.symbolify("\x7F")
21
+ end
22
+
23
+ it "replaces C1 control characters" do
24
+ assert_equal "IND", Symbolify.symbolify("\u{84}")
25
+ end
26
+
27
+ it "replaces bidi control characters" do
28
+ assert_equal "RLM", Symbolify.symbolify("\u{200F}")
29
+ assert_equal "RLI", Symbolify.symbolify("\u{2067}")
30
+ end
31
+
32
+ it "prepends non-spacing marks with a dotted circle" do
33
+ assert_equal "◌\u{0300}", Symbolify.symbolify("\u{0300}")
34
+ end
35
+
36
+ it "prepends enclosing marks with a space" do
37
+ assert_equal " \u{20E3}", Symbolify.symbolify("\u{20E3}")
38
+ end
39
+
40
+ it "wraps blanks" do
41
+ assert_equal "] [", Symbolify.symbolify(" ")
42
+ end
43
+
44
+ it "replaces tags" do
45
+ assert_equal "TAG ␠", Symbolify.symbolify("\u{E0020}")
46
+ end
47
+
48
+ it "replaces variation selectors" do
49
+ assert_equal "VS232", Symbolify.symbolify("\u{E01D7}")
50
+ end
51
+
52
+ it "works with non-characters" do
53
+ assert_equal "n/c", Symbolify.symbolify("\u{10FFFF}")
54
+ end
55
+
56
+ it "works with unassigned characters" do
57
+ assert_equal "n/a", Symbolify.symbolify("\u{E0000}")
58
+ end
59
+ end
60
+
61
+ describe ".byte" do
62
+ it "works with normal characters" do
63
+ assert_equal "A", Symbolify.symbolify("A".force_encoding("ISO-8859-1"))
64
+ end
65
+
66
+ it "works with C0 control characters" do
67
+ assert_equal "␀", Symbolify.symbolify("\0".force_encoding("ISO-8859-1"))
68
+ assert_equal "␊", Symbolify.symbolify("\n".force_encoding("ISO-8859-1"))
69
+ assert_equal "␡", Symbolify.symbolify("\x7F".force_encoding("ISO-8859-1"))
70
+ end
71
+
72
+ it "works with C1 control characters (if encoding has C1)" do
73
+ assert_equal "IND", Symbolify.symbolify("\x84".force_encoding("ISO-8859-1"))
74
+ end
75
+
76
+ it "works with blanks" do
77
+ assert_equal "] [", Symbolify.symbolify(" ".force_encoding("ISO-8859-1"))
78
+ end
79
+
80
+ it "works with encodings that do not convert to UTF-8 (uses .dump)" do
81
+ assert_equal '"a"', Symbolify.symbolify("a".force_encoding("macThai"))
82
+ end
83
+
84
+ it "works with mac symbols and logo bytes" do
85
+ assert_equal "⌘", Symbolify.symbolify("\x11".force_encoding("macRoman"))
86
+ assert_equal "Logo", Symbolify.symbolify("\xF0".force_encoding("macRoman"))
87
+ end
88
+
89
+ it "works with unassigned characters" do
90
+ assert_equal "n/a", Symbolify.symbolify("\x81".force_encoding("Windows-1252"))
91
+ end
92
+ end
93
+
94
+ describe ".ascii" do
95
+ it "works with normal characters" do
96
+ assert_equal "A", Symbolify.symbolify("A".force_encoding("US-ASCII"))
97
+ end
98
+
99
+ it "replaces C0 control characters" do
100
+ assert_equal "␀", Symbolify.symbolify("\0".force_encoding("US-ASCII"))
101
+ assert_equal "␊", Symbolify.symbolify("\n".force_encoding("US-ASCII"))
102
+ assert_equal "␡", Symbolify.symbolify("\x7F".force_encoding("US-ASCII"))
103
+ end
104
+
105
+ it "wraps blanks" do
106
+ assert_equal "] [", Symbolify.symbolify(" ".force_encoding("US-ASCII"))
107
+ end
108
+ end
109
+
110
+ describe ".binary" do
111
+ it "works with printable bytes" do
112
+ assert_equal '"A"', Symbolify.symbolify("A".b)
113
+ end
114
+
115
+ it "works with unknown bytes" do
116
+ assert_equal '"\x87"', Symbolify.symbolify("\x87".b)
117
+ end
118
+ end
119
+ end
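The `.binary` specs above, and the `.byte` spec for encodings without a UTF-8 converter, expect the output of Ruby's `String#dump`. The sketch below shows a related but different technique: try to transcode to UTF-8 and fall back to `dump` only when Ruby raises an encoding error. The helper name `to_utf8_or_dump` is hypothetical. Note that the gem decides by encoding name (`NO_UTF8_CONVERTER`) and always dumps binary strings, so results can differ for printable bytes.

```ruby
# Hypothetical helper, not the gem's approach: transcode when possible,
# otherwise return an escaped, quoted representation via String#dump.
def to_utf8_or_dump(char)
  char.encode("UTF-8")
rescue EncodingError
  # Covers Encoding::UndefinedConversionError,
  # Encoding::InvalidByteSequenceError and
  # Encoding::ConverterNotFoundError
  char.dump
end

puts to_utf8_or_dump("\xE4".dup.force_encoding("ISO-8859-1")) # ä
puts to_utf8_or_dump("\x87".b)                                # "\x87"
```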
@@ -0,0 +1,22 @@
1
+ # -*- encoding: utf-8 -*-
2
+
3
+ require File.dirname(__FILE__) + "/lib/symbolify/version"
4
+
5
+ Gem::Specification.new do |gem|
6
+ gem.name = "symbolify"
7
+ gem.version = Symbolify::VERSION
8
+ gem.summary = "Represent arbitrary codepoints in the terminal."
9
+ gem.description = "Safely print all codepoints from Unicode and single-byte encodings."
10
+ gem.authors = ["Jan Lelis"]
11
+ gem.email = ["mail@janlelis.de"]
12
+ gem.homepage = "https://github.com/janlelis/symbolify"
13
+ gem.license = "MIT"
14
+
15
+ gem.files = Dir["{**/}{.*,*}"].select{ |path| File.file?(path) && path !~ /^pkg/ }
16
+ gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
17
+ gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
18
+ gem.require_paths = ["lib"]
19
+
20
+ gem.required_ruby_version = "~> 2.0"
21
+ gem.add_dependency "characteristics", "~> 0.5"
22
+ end
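The gemspec above uses the pessimistic version operator `~>` for both the Ruby version and the `characteristics` dependency. As a quick illustration of what these constraints accept, the following sketch evaluates them with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The helper name `allowed?` and the sample version numbers are assumptions chosen for illustration.

```ruby
require "rubygems" # already loaded by default; explicit for clarity

# Hypothetical helper: does `version` satisfy the `requirement` string?
def allowed?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# "~> 2.0" means >= 2.0 and < 3.0
puts allowed?("~> 2.0", "2.4.1") # true
puts allowed?("~> 2.0", "3.0.0") # false

# "~> 0.5" means >= 0.5 and < 1.0
puts allowed?("~> 0.5", "0.5.9") # true
puts allowed?("~> 0.5", "1.0.0") # false
```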
metadata ADDED
@@ -0,0 +1,71 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: symbolify
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Jan Lelis
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2017-03-25 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: characteristics
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '0.5'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '0.5'
27
+ description: Safely print all codepoints from Unicode and single-byte encodings.
28
+ email:
29
+ - mail@janlelis.de
30
+ executables: []
31
+ extensions: []
32
+ extra_rdoc_files: []
33
+ files:
34
+ - ".gitignore"
35
+ - ".travis.yml"
36
+ - CHANGELOG.md
37
+ - CODE_OF_CONDUCT.md
38
+ - Gemfile
39
+ - MIT-LICENSE.txt
40
+ - README.md
41
+ - Rakefile
42
+ - lib/symbolify.rb
43
+ - lib/symbolify/version.rb
44
+ - spec/symbolify_spec.rb
45
+ - symbolify.gemspec
46
+ homepage: https://github.com/janlelis/symbolify
47
+ licenses:
48
+ - MIT
49
+ metadata: {}
50
+ post_install_message:
51
+ rdoc_options: []
52
+ require_paths:
53
+ - lib
54
+ required_ruby_version: !ruby/object:Gem::Requirement
55
+ requirements:
56
+ - - "~>"
57
+ - !ruby/object:Gem::Version
58
+ version: '2.0'
59
+ required_rubygems_version: !ruby/object:Gem::Requirement
60
+ requirements:
61
+ - - ">="
62
+ - !ruby/object:Gem::Version
63
+ version: '0'
64
+ requirements: []
65
+ rubyforge_project:
66
+ rubygems_version: 2.6.8
67
+ signing_key:
68
+ specification_version: 4
69
+ summary: Represent arbitrary codepoints in the terminal.
70
+ test_files:
71
+ - spec/symbolify_spec.rb