bytes 0.1.1 → 0.2.0
- checksums.yaml +4 -4
- data/README.md +240 -0
- data/lib/bytes/version.rb +2 -2
- data/lib/bytes.rb +8 -2
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5a8baa99382e45238e96527a5aab45a04daf4b73a2d4e1f64b3bf87530e14a7d
+  data.tar.gz: ec170b28a4efd3d7b0cdf45094831a68a86d60785f313668baa24cbf5a106085
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8f62ad42592bfd8e69fb9bf53526ba19e3568b7140f88bfa9bf524ee1ac8af3258dd2d156f6080d08e63ecbcb92cf0e8f531f56f4c3b32faad8477b7efdf8423
+  data.tar.gz: 728487df6ca9ad9cbe4aedcc2ae76c3f7018c195a780a5e4ef7ee487602cf0ac7fadff8b5c31b38a6e18f6a324947a1ae7e51f3c693bb65ae000ce6461a54969
data/README.md
CHANGED
@@ -8,6 +8,246 @@ bytes - bits 'n' bytes made easy/easier incl. new buffer helper / wrapper class
 * rdoc :: [rubydoc.info/gems/bytes](http://rubydoc.info/gems/bytes)
 
 
+## Background - Programming Bits, Bytes 'n' Blocks Step-by-Step Book / Guide
+
+_Let's start with the three types of strings, that is, bytes, (string) buffers, and (frozen) strings, ..._
+
+
+### Byte vs. Character
+
+
+#### What's a Byte?
+
+A byte is an 8-bit integer number (that is, unsigned from 0 to 255 or
+signed from -128 to 127 using 2's complement).
+Example:
+
+``` ruby
+0b01000001              #=> 65 - base 2  - binary bits
+65                      #=> 65 - base 10 - decimal numbers
+0x41                    #=> 65 - base 16 - hexadecimal numbers
+[0b01000001, 65, 0x41]
+#=> [65, 65, 65]
+```
+
+Or with conversions to 8-bit integer numbers. Example:
+
+``` ruby
+"01000001".to_i(2)      #=> 65
+"65".to_i(10)           #=> 65
+"65".to_i               #=> 65 - same as to_i(10)
+"41".to_i(16)           #=> 65
+"0x41".to_i(16)         #=> 65 - same as "41" - 0x hex prefix gets skipped
+["01000001".to_i(2), "65".to_i(10), "41".to_i(16)]
+#=> [65, 65, 65]
+```
+
+Note: `String#hex` is a built-in short-cut / alias for `String#to_i(16)`. Example:
+
+``` ruby
+"41".hex                #=> 65
+"0x41".hex              #=> 65 - same as "41" - 0x hex prefix gets skipped
+```
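Going the other way, from an integer back to its string forms, may help make the conversions above concrete. The following is a minimal sketch that uses only core `Integer#to_s` and `String#rjust`; the helper name `byte_to_strings` is hypothetical and not part of the bytes gem.

```ruby
# Hypothetical helper (not part of the bytes gem):
# format an integer in 0..255 as binary, decimal, and hex strings.
def byte_to_strings(byte)
  raise ArgumentError, "not a byte: #{byte}" unless (0..255).cover?(byte)

  {
    binary:  byte.to_s(2).rjust(8, "0"),   # pad to 8 binary digits
    decimal: byte.to_s,
    hex:     byte.to_s(16).rjust(2, "0")   # pad to 2 hex digits
  }
end

strings = byte_to_strings(65)
strings[:binary]    #=> "01000001"
strings[:decimal]   #=> "65"
strings[:hex]       #=> "41"

# Parsing the strings back gives the original number again.
[strings[:binary].to_i(2), strings[:decimal].to_i(10), strings[:hex].hex]
#=> [65, 65, 65]
```

The range check mirrors the unsigned 0 to 255 definition of a byte given earlier; a signed variant would need a different range.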
+
+
+
+#### What's a Character?
+
+A character (or char) used to be a byte
+and, thus, a string (array) of characters
+was also an array of bytes. Example:
+
+``` ruby
+?A.ord                  #=> 65 - ASCII character
+"A".ord                 #=> 65 - ASCII character
+"\x41".ord              #=> 65 - ASCII character
+[?A.ord, "A".ord, "\x41".ord]
+#=> [65, 65, 65]
+```
+
+
+
+History Nostalgia Corner: ASCII 7-Bit Character Encoding
+
+![](i/us_ascii_code_chart.png)
+
+> ASCII abbreviated from American Standard Code for Information Interchange,
+> is a character encoding standard for electronic communication. ASCII codes
+> represent text in computers, telecommunications equipment, and other devices.
+> Most modern character-encoding schemes are based on ASCII,
+> although they support many additional characters.
+>
+> ASCII is the traditional name for the encoding system; the Internet Assigned
+> Numbers Authority (IANA) prefers the updated name US-ASCII, which clarifies
+> that this system was developed in the US and based on the typographical
+> symbols predominantly in use there.
+>
+> (Source: [ASCII @ Wikipedia](https://en.wikipedia.org/wiki/ASCII))
+
+
+
+Nowadays a character can have one, two or even more bytes.
+Let's try:
+
+``` ruby
+## Latin Capital Letter A (Unicode) - 1 Byte
+"A".bytes               #=> [65]
+"\x41".bytes            #=> [65] - same as "A"
+"\u0041".bytes          #=> [65]
+"\u{41}".bytes          #=> [65] - same as "\u0041" - leading zeros can be dropped
+"A".bytes.size          #=> 1
+"A".chars.size          #=> 1
+
+## Cyrillic Capital Letter A (Unicode) - 2 Bytes
+"А".bytes               #=> [208, 144]
+"\u0410".bytes          #=> [208, 144] - same as "А"
+"\u{410}".bytes         #=> [208, 144] - same as "\u0410" - leading zeros can be dropped
+"А".bytes.size          #=> 2
+"А".chars.size          #=> 1
+
+## Old Persian Number One (Unicode) - 4 Bytes
+"𐏑".bytes               #=> [240, 144, 143, 145]
+"\u{103D1}".bytes       #=> [240, 144, 143, 145]
+"\u{103d1}".bytes       #=> [240, 144, 143, 145] - same as "\u{103D1}"
+"𐏑".bytes.size          #=> 4
+"𐏑".chars.size          #=> 1
+
+# ...
+```
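As a small illustration of the byte vs. character distinction above, the sketch below compares `String#length` (characters) with `String#bytesize` (bytes) and uses `String#b` plus `String#unpack1` to dump the underlying bytes as hex. It relies only on core String methods; the helper name `char_and_byte_info` is hypothetical.

```ruby
# Hypothetical helper: return [character count, byte count, hex dump] for a string.
def char_and_byte_info(str)
  [str.length, str.bytesize, str.b.unpack1("H*")]
end

char_and_byte_info("A")           #=> [1, 1, "41"]
char_and_byte_info("\u0410")      #=> [1, 2, "d090"]
char_and_byte_info("\u{103D1}")   #=> [1, 4, "f0908f91"]
```

The `\u` escapes are used so that the literals are UTF-8 regardless of the source file encoding.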
+
+
+
+### String of Bytes or String of Characters? Yes, Yes, Yes
+
+
+In Ruby the String class can morph into three types:
+
+- Bytes
+- Mutable String a.k.a. String Buffer
+- Immutable String a.k.a. Frozen String
+
+
+#### Bytes / Binary
+
+
+`String.new` or `"".b` creates new bytes, that is, a new binary string
+buffer with the ASCII_8BIT encoding also known as BINARY.
+Let's try:
+
+``` ruby
+String.new.encoding                 #=> <Encoding::ASCII_8BIT>
+String.new("".b).encoding           #=> <Encoding::ASCII_8BIT>
+"".b.encoding                       #=> <Encoding::ASCII_8BIT>
+
+Encoding::BINARY == Encoding::ASCII_8BIT   #=> true
+
+# or using the "type-safe" Bytes class
+
+Bytes.new.encoding                  #=> <Encoding::ASCII_8BIT>
+Bytes.new("").encoding              #=> <Encoding::ASCII_8BIT>
+Bytes.new("abc").encoding           #=> <Encoding::ASCII_8BIT>
+Bytes.new("\x61\x62\x63").encoding  #=> <Encoding::ASCII_8BIT>
+```
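One way to see what a binary string buffer is good for is to append raw byte values to it. This is a minimal sketch that assumes only core Ruby (`String#<<` with an integer, `Array#pack`, and `String#force_encoding`); the helper name `binary_buffer` is hypothetical and nothing here uses the gem's `Bytes` class.

```ruby
# Hypothetical helper: build a binary buffer from integer byte values.
def binary_buffer(*byte_values)
  buf = String.new                    # empty buffer with ASCII-8BIT / BINARY encoding
  byte_values.each { |b| buf << b }   # an Integer in 0..255 is appended as one raw byte
  buf
end

buf = binary_buffer(0xD0, 0x90)
buf.encoding == Encoding::BINARY   #=> true
buf.bytes                          #=> [208, 144]

# Array#pack("C*") builds the same binary string from the byte values.
[0xD0, 0x90].pack("C*") == buf     #=> true

# Reinterpreting a copy of the same two bytes as UTF-8 yields one character,
# the Cyrillic capital letter A.
buf.dup.force_encoding(Encoding::UTF_8) == "\u0410"   #=> true
```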
+
+
+
+#### String Buffer
+
+If you use `String.new("")` (note the `""` passed in) or
+the string literal `""`, you create a new string buffer
+with the default encoding (usually UTF-8).
+Let's try:
+
+``` ruby
+# encoding: utf-8
+String.new("").encoding      #=> <Encoding::UTF_8>
+"".encoding                  #=> <Encoding::UTF_8>
+```
+
+<!--
+
+# or using the StringBuffer c'tor helper (returning a String)
+
+StringBuffer.new.encoding         #=> <Encoding::UTF_8>
+StringBuffer.new("").encoding     #=> <Encoding::UTF_8>
+StringBuffer.new("abc").encoding  #=> <Encoding::UTF_8>
+
+-->
+
+
+#### Frozen String
+
+If you use the recommended `# frozen_string_literal: true` magic comment
+or pragma you can automagically turn all string literals into
+frozen (immutable) strings with the default encoding (usually UTF-8).
+Let's try:
+
+``` ruby
+# frozen_string_literal: true
+"".frozen?                   #=> true
+"Hello, World!".frozen?      #=> true
+String.new.frozen?           #=> false
+String.new("").frozen?       #=> false
+```
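To show what "frozen" means in practice, the sketch below tries to append to a frozen string and then to an unfrozen copy obtained with the unary plus operator (`String#+@`). It assumes Ruby 2.5 or later, where the error class is `FrozenError`; the helper name `append_refused?` is hypothetical, and `.freeze` is called explicitly so the example does not depend on the magic comment.

```ruby
# Hypothetical helper: try to append to a string and
# report whether the append was refused because the string is frozen.
def append_refused?(str, suffix)
  str << suffix
  false
rescue FrozenError
  true
end

greeting = "Hello".freeze
append_refused?(greeting, ", World!")   #=> true

buffer = +greeting                      # unary plus returns an unfrozen copy of a frozen string
append_refused?(buffer, ", World!")     #=> false
buffer                                  #=> "Hello, World!"
greeting                                #=> "Hello"
```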
+
+
+
+<!--
+
+### Bytes
+
+bytes from hexstring
+
+bytes to hexstring
+
+bytes from string
+
+bytes to string
+
+bytes to array of integers
+
+bytes from array of integers
+
+
+
+#### Bytes to Integer Numbers - Little-Endian vs Big-Endian
+
+4 byte unsigned integer -
+
+Example - 1
+
+bytes to integer
+
+integer to bytes
+
+Big-End first or Little-End first?
+Least significant bit (lsb) or most significant bit (msb) first?
+
+
+
+
+#### Bytes Helper
+
+
+
+### Buffer
+
+#### Buffer Helper
+
+-->
+
+
+
+
+To be continued ...
+
+---
+
+
+
+
+
+
 
 ## Usage
 
data/lib/bytes/version.rb
CHANGED
data/lib/bytes.rb
CHANGED
@@ -64,8 +64,14 @@ class Bytes
   def initialize( bin=String.new )
     ## note: for now will NOT dup(licate) passed in binary array
     ## you only get a new binary array if no arg passed in e.g. Bytes.new
-
-
+    @bin = if bin.encoding != Encoding::ASCII_8BIT
+             puts "!! WARN - Bytes.new - BINARY/ASCII-8BIT encoding expected; got: #{bin.encoding} for string:"
+             pp bin
+
+             bin.b
+           else
+             bin
+           end
   end
 
 
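The effect of the new encoding check can be tried out in isolation. The sketch below is a stand-in, not the gem's code: the class name `BinaryBox` and the `bin` reader are hypothetical, and `warn` (stderr) is swapped in for the `puts` / `pp` calls in the diff. It illustrates one consequence worth noting: `String#b` returns a copy, so a non-binary argument ends up duplicated, while a binary argument is still stored as-is.

```ruby
# Hypothetical stand-in for the constructor shown in the diff (not the gem's Bytes class).
class BinaryBox
  attr_reader :bin

  def initialize(bin = String.new)
    @bin = if bin.encoding != Encoding::ASCII_8BIT
             # warn writes to stderr; the diff uses puts / pp instead
             warn "!! WARN - BINARY/ASCII-8BIT encoding expected; got: #{bin.encoding}"
             bin.b      # String#b returns a binary (ASCII-8BIT) copy
           else
             bin        # already binary: stored without duplicating
           end
  end
end

text = "abc".encode("UTF-8")
box  = BinaryBox.new(text)
box.bin.encoding == Encoding::ASCII_8BIT   #=> true
box.bin.equal?(text)                       #=> false - a copy was made

raw = "abc".b
BinaryBox.new(raw).bin.equal?(raw)         #=> true - same object, no copy
```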