node-marshal 0.2.1 → 0.2.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 5bed8bdc507d12d26f9b1ed0eb8c969fc05853e3
- data.tar.gz: ebc554d61f17751f7033c1c666375f383e9cb5c5
+ metadata.gz: 1f8bf24b07c3adc1df9e6f12ab148f7428fee811
+ data.tar.gz: fa00130d6340112f071c2fc677f3684ecef18447
  SHA512:
- metadata.gz: 2d7c90c64e183b195ebbe6d12ea6f9af7d75ce54955e93b667b1ccbc9a22cc4807a1623c4c3eb0134a044af1c21f4aa94865878f7a93879b250f0d8928357e63
- data.tar.gz: f25344c304ef5786fd44fc50f14afc177b30698b4db00c2a12055707e10ae8964e7aa6a5d84d1c92877d7a0a418fd779032a5e281d3437adfd1c6493ffcb4cc5
+ metadata.gz: 64357a5c91bc8fc0f523683c1a8d83a83d7b112bff8d46bbecf0446249b94c03e9243361e11d5926ed772ca7599a6ae465a88cbd8d59ffd641a858ef3fea786d
+ data.tar.gz: 1707659e29f48e9ad3996ded44fcac8e622edaa7a6bed3d48f05a5e5231e6c6647a6fe6bea2705973289032d8a5a783c562db0fb77f7178f2746fa1a09e26b37
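The SHA1 and SHA512 values above are hex digests of the `metadata.gz` and `data.tar.gz` members of the gem archive. As a rough sketch, a digest of this kind could be recomputed and compared with Ruby's standard `digest` library. The helper name `checksum_matches?` and the sample data are illustrative assumptions and are not part of the gem.

```ruby
require "digest"

# Hypothetical helper: returns true when the hex digest of `data`
# (a String of raw bytes) equals `expected_hex`.
# `algorithm` is a Digest class such as Digest::SHA1 or Digest::SHA512.
def checksum_matches?(data, expected_hex, algorithm: Digest::SHA512)
  algorithm.hexdigest(data) == expected_hex.downcase
end

# Illustrative data only. For a real check, read the archive member in
# binary mode, e.g. File.binread("data.tar.gz") after unpacking the .gem file.
sample = "example bytes"
expected = Digest::SHA1.hexdigest(sample)
puts checksum_matches?(sample, expected, algorithm: Digest::SHA1)
```

Passing the digest class as a keyword keeps one helper usable for both the SHA1 and the SHA512 sections of `checksums.yaml`.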
data/COPYING CHANGED
@@ -1,23 +1,23 @@
- Copyright (C) 2015-2016 Alexey Voskov. All rights reserved.
- Copyright (C) 1993-2013 Yukihiro Matsumoto. All rights reserved.
-
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions
- are met:
- 1. Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
- 2. Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in the
- documentation and/or other materials provided with the distribution.
-
- THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
- ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
- FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- SUCH DAMAGE.
+ Copyright (C) 2015-2017 Alexey Voskov. All rights reserved.
+ Copyright (C) 1993-2013 Yukihiro Matsumoto. All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ SUCH DAMAGE.
@@ -1,49 +1,56 @@
- == node-marshal
-
- This gem is designed for transformation of Ruby source code (eiher in the form of files or strings) to the
- Ruby nodes (syntax trees) used by Ruby MRI internals. Obtained nodes can be serialized to the platform-dependent
- binary or ASCII strings and restored and launched from serialized format. Such kind of transformation is
- irreversible and can be used for source code protection; the similar principle is used by RubyEncoder commercial
- software. It also contains some subroutines that can be useful for code obfuscation by renaming symbols
- inside the program.
-
- The key features of node-marshal gem:
- - Irreversible conversion of a source code to the Ruby node (abstract syntax tree)
- - Ruby 1.9.3, 2.2.x and 2.3.x support (only MRI)
- - Active usage of Ruby internals (mainly AST) and Ruby standard library (Marshal and Zlib)
- - Set of tests for easy addition of new Ruby versions
- - Result of compilation depends on Ruby version and used platform (x86, x64 etc.)
- - Subroutines for obfuscation
- - 2-clause BSD license suitable for creation of custom source code protection system
-
- Changelog:
- - 16.MAR.2016 - 0.2.1
- - Bugfix: garbage collection of symbols kept in the literals table of the node dump
- is now prohibited. Such GC caused hardly reproducible bugs after code loading.
- (thanks to Gregory Siehień for suitable examples)
- - Bugfix: improved parsing of NODE_ARRAY (correct processing of two cases of
- not documented pointers (instead of longs) in 2nd child. It affects arrays,
- NODE_HASH (hashes) and NODE_DSTR (strings in double quotes with #{} inside) ).
- - Bugfix: now NodeMarshal class methods don't change the state of Ruby
- garbage collector
- - Improved NodeMarshal#dump_tree_short output (addresses of nodes are shown)
- - Added NodeMarshal#to_h method (alias for NodeMarshal#to_hash)
- - NodeMarshal#to_a and NodeMarshal#to_ary methods added (they show extended information
- about Ruby AST including rb_args_info and ID *tbl internals)
- - 11.JAN.2016 - 0.2.0
- - Bugfix: || and && in NODE_OP_ASGN1 (e.g. in x['a'] ||= 'b' or x['b'] &&= false)
- (this bug caused segfaults in some cases)
- - Bugfix: NodeMarshal#dump_node_short
- - Format version changed to NODEMARSHAL11 (support of symbols not representable in
- the form of String is added)
- - show_offsets property for controlling verbosity of NodeMarshal#dump_node_short output
- - Improved information about licenses
- - Improved rdoc documentation
- - 24.DEC.2015 - 0.1.2
- - Ruby 2.3.x preliminary support (including &. safe navigation operator)
- - Bugfix: NODE_MATCH3 (a =~ /abc/) issue (reported by Gregory Siehień)
- - Bugfix: NODE_BLOCK_PASS (short map syntax) issue (reported by Gregory Siehień)
- - Bugfix: failure in the case of syntax errors in the input data (now ArgumentError
- exception will be generated instead of it)
- - Ability of symbols renaming (for hiding variables and constants name during code obfuscation)
- - 04.MAY.2015 - 0.1.1 - first public version
+ == node-marshal
+
+ This gem is designed for transformation of Ruby source code (eiher in the form of files or strings) to the
+ Ruby nodes (syntax trees) used by Ruby MRI internals. Obtained nodes can be serialized to the platform-dependent
+ binary or ASCII strings and restored and launched from serialized format. Such kind of transformation is
+ irreversible and can be used for source code protection; the similar principle is used by RubyEncoder commercial
+ software. It also contains some subroutines that can be useful for code obfuscation by renaming symbols
+ inside the program.
+
+ The key features of node-marshal gem:
+ - Irreversible conversion of a source code to the Ruby node (abstract syntax tree)
+ - Ruby 1.9.3, 2.2.x and 2.3.x support (only MRI)
+ - Active usage of Ruby internals (mainly AST) and Ruby standard library (Marshal and Zlib)
+ - Set of tests for easy addition of new Ruby versions
+ - Result of compilation depends on Ruby version and used platform (x86, x64 etc.)
+ - Subroutines for obfuscation
+ - 2-clause BSD license suitable for creation of custom source code protection system
+
+ Changelog:
+ - 01.MAY.2017 - 0.2.2
+ - Bugfix: NODE_KW_ARG processing implementation. Allows to use keyword (named) arguments
+ in Ruby 2.x. (thanks to Jarosław Salik for bugreport).
+ - Bugfix: NODE_LASGN and NODE_DASGN_CURR 2nd child special cases (-1 value). Mainly for
+ their usage inside NODE_KW_ARG child trees.
+ - test_namedarg.rb test was added (for so called keyword argument)
+ - Improved output of dump_tree_short (rb_args_info dump)
+ - 16.MAR.2016 - 0.2.1
+ - Bugfix: garbage collection of symbols kept in the literals table of the node dump
+ is now prohibited. Such GC caused hardly reproducible bugs after code loading.
+ (thanks to Gregory Siehień for suitable examples)
+ - Bugfix: improved parsing of NODE_ARRAY (correct processing of two cases of
+ not documented pointers (instead of longs) in 2nd child. It affects arrays,
+ NODE_HASH (hashes) and NODE_DSTR (strings in double quotes with #{} inside) ).
+ - Bugfix: now NodeMarshal class methods don't change the state of Ruby
+ garbage collector
+ - Improved NodeMarshal#dump_tree_short output (addresses of nodes are shown)
+ - Added NodeMarshal#to_h method (alias for NodeMarshal#to_hash)
+ - NodeMarshal#to_a and NodeMarshal#to_ary methods added (they show extended information
+ about Ruby AST including rb_args_info and ID *tbl internals)
+ - 11.JAN.2016 - 0.2.0
+ - Bugfix: || and && in NODE_OP_ASGN1 (e.g. in x['a'] ||= 'b' or x['b'] &&= false)
+ (this bug caused segfaults in some cases)
+ - Bugfix: NodeMarshal#dump_node_short
+ - Format version changed to NODEMARSHAL11 (support of symbols not representable in
+ the form of String is added)
+ - show_offsets property for controlling verbosity of NodeMarshal#dump_node_short output
+ - Improved information about licenses
+ - Improved rdoc documentation
+ - 24.DEC.2015 - 0.1.2
+ - Ruby 2.3.x preliminary support (including &. safe navigation operator)
+ - Bugfix: NODE_MATCH3 (a =~ /abc/) issue (reported by Gregory Siehień)
+ - Bugfix: NODE_BLOCK_PASS (short map syntax) issue (reported by Gregory Siehień)
+ - Bugfix: failure in the case of syntax errors in the input data (now ArgumentError
+ exception will be generated instead of it)
+ - Ability of symbols renaming (for hiding variables and constants name during code obfuscation)
+ - 04.MAY.2015 - 0.1.1 - first public version
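The 0.2.2 changelog entry in this hunk concerns keyword (named) arguments. For readers unfamiliar with the syntax, the plain-Ruby sketch below shows the kind of method definition involved. The method `connect` and its parameters are made up for illustration and do not come from the gem or from its test_namedarg.rb test. Required keywords (those without a default) assume Ruby 2.1 or later.

```ruby
# Hypothetical method using keyword arguments:
# `host:` is required, `port:` and `secure:` have defaults.
def connect(host:, port: 80, secure: false)
  scheme = secure ? "https" : "http"
  "#{scheme}://#{host}:#{port}"
end

puts connect(host: "example.org")
puts connect(host: "example.org", port: 8443, secure: true)
```

The first call relies on both defaults, while the second overrides them by name, so argument order does not matter.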
@@ -1,63 +1,63 @@
- #!/usr/bin/ruby
- require_relative '../lib/node-marshal.rb'
-
- help = <<-EOS
- Ruby source files compiler from node-marshal gem. It is based
- on NodeMarshal class. Source code is irreversibly transformed to the
- Ruby node (syntax tree) serialized into ASCII string. It can be used
- for code obfuscation and Ruby internals exploration.
-
- (C) 2015-2016 Alexey Voskov. License: BSD-2-Clause.
-
- Usage:
- noderbc inpfile outfile [options]
-
- Required arguments:
- inpfile -- Name of input Ruby script (with extension)
- outfile -- Name of output Ruby (with extension)
-
- Options:
- --compress=none -- No ZLib compression of the source
- --compress=zlib -- Use ZLib compression of the source (default)
- --so_path="str" -- String for inclusion of the node-marshal loader
- Its default value is:
- require_relative '../ext/node-marshal/nodemarshal.so'
-
- EOS
-
- if ARGV.length < 2
- # No required number of input arguments: show short help
- puts help
- else
- # Optional arguments processing
- opts = {}
- if ARGV.length > 2
- ARGV[2..-1].each do |arg|
- case arg
- when '--compress=none'
- opts[:compress] = false
- when '--compress=zlib'
- opts[:compress] = true
- when /^--so_path=.+$/
- str = arg[10..-1]
- opts[:so_path] = str
- else
- puts "Unknown argument #{arg}"
- exit
- end
- end
- end
- # Show given options
- puts "Used options:"
- if opts.size == 0
- puts " default"
- else
- puts " compress: #{opts[:compress]}" if opts.has_key?(:compress)
- puts " so_path: #{opts[:so_path]}" if opts.has_key?(:so_path)
- end
- # Required arguments processing
- inpfile = ARGV[0]
- outfile = ARGV[1]
- raise 'inpfile and outfile cannot be equal' if inpfile == outfile
- NodeMarshal.compile_rb_file(outfile, inpfile, opts)
- end
+ #!/usr/bin/ruby
+ require_relative '../lib/node-marshal.rb'
+
+ help = <<-EOS
+ Ruby source files compiler from node-marshal gem. It is based
+ on NodeMarshal class. Source code is irreversibly transformed to the
+ Ruby node (syntax tree) serialized into ASCII string. It can be used
+ for code obfuscation and Ruby internals exploration.
+
+ (C) 2015-2016 Alexey Voskov. License: BSD-2-Clause.
+
+ Usage:
+ noderbc inpfile outfile [options]
+
+ Required arguments:
+ inpfile -- Name of input Ruby script (with extension)
+ outfile -- Name of output Ruby (with extension)
+
+ Options:
+ --compress=none -- No ZLib compression of the source
+ --compress=zlib -- Use ZLib compression of the source (default)
+ --so_path="str" -- String for inclusion of the node-marshal loader
+ Its default value is:
+ require_relative '../ext/node-marshal/nodemarshal.so'
+
+ EOS
+
+ if ARGV.length < 2
+ # No required number of input arguments: show short help
+ puts help
+ else
+ # Optional arguments processing
+ opts = {}
+ if ARGV.length > 2
+ ARGV[2..-1].each do |arg|
+ case arg
+ when '--compress=none'
+ opts[:compress] = false
+ when '--compress=zlib'
+ opts[:compress] = true
+ when /^--so_path=.+$/
+ str = arg[10..-1]
+ opts[:so_path] = str
+ else
+ puts "Unknown argument #{arg}"
+ exit
+ end
+ end
+ end
+ # Show given options
+ puts "Used options:"
+ if opts.size == 0
+ puts " default"
+ else
+ puts " compress: #{opts[:compress]}" if opts.has_key?(:compress)
+ puts " so_path: #{opts[:so_path]}" if opts.has_key?(:so_path)
+ end
+ # Required arguments processing
+ inpfile = ARGV[0]
+ outfile = ARGV[1]
+ raise 'inpfile and outfile cannot be equal' if inpfile == outfile
+ NodeMarshal.compile_rb_file(outfile, inpfile, opts)
+ end
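The option handling in the script above can be read as a small pure function from an argument list to an options hash. The sketch below restates that logic under one deliberate change: unknown arguments raise `ArgumentError` instead of printing a message and exiting. `parse_noderbc_options` and `SO_PATH_PREFIX` are hypothetical names and are not defined by the gem.

```ruby
# The literal prefix is 10 characters long, which is why the original
# script slices the argument with arg[10..-1].
SO_PATH_PREFIX = "--so_path=".freeze

# Hypothetical helper: converts the optional arguments (everything after
# inpfile and outfile) into the opts hash used by the script.
def parse_noderbc_options(args)
  opts = {}
  args.each do |arg|
    case arg
    when "--compress=none" then opts[:compress] = false
    when "--compress=zlib" then opts[:compress] = true
    when /\A--so_path=.+\z/
      opts[:so_path] = arg[SO_PATH_PREFIX.length..-1]
    else
      raise ArgumentError, "Unknown argument #{arg}"
    end
  end
  opts
end

p parse_noderbc_options(["--compress=none", "--so_path=require 'nodemarshal'"])
```

Keeping the parsing separate from the file handling makes the accepted flags testable without touching `ARGV` or the compiler itself.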
File without changes
@@ -1,194 +1,194 @@
- /*
- * Implementation of own version of BASE85 binary data encoding
- * adapted for the usage inside Ruby source code (i.e. without
- * such symbols as \ " # { } '
- *
- * Format of the output stream:
- * 1) First byte: number of bytes in the last chunk: from 0 to 3
- * (0 means that the last chunk contains 4 bytes, i.e. everything
- * is aligned). See val_to_char array for the used alphabet
- * 2) big-endian 5-byte numbers (base 85)
- * 3) empty string: arbitrary two bytes
- *
- * (C) 2015-2016 Alexey Voskov
- * License: BSD-2-Clause
- */
- #include <stdio.h>
- #include <stdlib.h>
- #include <inttypes.h>
- #include <ruby.h>
- #include <ruby/version.h>
-
- #define BASE85R_STR_WIDTH 14 // Number of 5-byte groups in the string (12 for 60-byte string)
-
- #define D0_VAL 1 // 85**0
- #define D1_VAL 85 // 85**1
- #define D2_VAL 7225 // 85**2
- #define D3_VAL 614125 // 85**3
- #define D4_VAL 52200625 // 85**4
-
- static int di_val[5] = {D4_VAL, D3_VAL, D2_VAL, D1_VAL, D0_VAL};
-
-
- /* Modified BASE85 digits */
- static const char val_to_char[86] = // ASCIIZ string
- "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
- "abcdefghijklmnopqrstuvwxyz"
- "0123456789"
- "!$%&()*-./"
- ":;<=>?@[]^"
- ",_|";
-
- static int char_to_val[128];
-
-
- /* Initializes internal tables that are required
- for recoding */
- void base85r_init_tables()
- {
- int i;
- for (i = 0; i < 128; i++)
- char_to_val[i] = -1;
-
-
- for (i = 0; i < 85; i++)
- {
- if (char_to_val[(int) val_to_char[i]] != -1)
- rb_raise(rb_eArgError, "Internal error");
- char_to_val[(int) val_to_char[i]] = i;
- }
-
- for (i = 0; i < 85; i++)
- if (char_to_val[(int) val_to_char[i]] != i)
- rb_raise(rb_eArgError, "Internal error");
- }
-
-
- /* Calculate length of buffer for base85 encoding */
- static int base85_encode_buf_len(int len)
- {
- // Calculate aligned (32-bit alignment) size of the buffer
- int buf_len = ((len >> 2) << 2);
- if (len % 4) buf_len += 4;
- // Calculate size of the output buffer
- buf_len = (buf_len * 5) / 4;
- buf_len = (buf_len * 105) / 100;
- buf_len += 32;
- // Return buffer size
- return buf_len;
- }
-
- /*
- * Encode string to modified BASE85 ASCII.
- * Note: call base85_init_tables before using of this function
- */
- VALUE base85r_encode(VALUE input)
- {
- VALUE output;
- int inp_len, out_len, out_buf_len;
- int pos, outpos;
- unsigned int val;
- unsigned char *outptr, *inptr;
- int i;
- // Check input data type and allocate string
- if (TYPE(input) != T_STRING)
- rb_raise(rb_eArgError, "base85r_encode: input must be a string");
- inp_len = RSTRING_LEN(input);
- out_buf_len = base85_encode_buf_len(inp_len);
- output = rb_str_new(NULL, out_buf_len);
- // Begin conversion
- outpos = 0;
- outptr = (unsigned char *) RSTRING_PTR(output);
- inptr = (unsigned char *) RSTRING_PTR(input);
- outptr[outpos++] = 32;
- outptr[outpos++] = val_to_char[inp_len % 4];
- out_len = 2;
- for (pos = 0; pos < inp_len; )
- {
- // Get four bytes
- val = 0;
- for (i = 24; i >= 0; i -= 8)
- if (pos < inp_len) val |= inptr[pos++] << i;
- // And transform them to five bytes
- for (i = 0; i < 5; i++)
- {
- int digit = (val / di_val[i]) % 85;
- char sym = val_to_char[digit];
- outptr[outpos++] = sym;
- }
- out_len += 5;
- // Newline addition
- if (pos % (4 * BASE85R_STR_WIDTH) == 0)
- {
- out_len += 2;
- outptr[outpos++] = 10;
- outptr[outpos++]= 32;
- }
- }
- // Check the state of memory
- if (outpos >= out_buf_len)
- rb_raise(rb_eArgError, "base85r_encode: internal memory error");
- // Truncate the empty "tail" of the buffer and return the string
- return rb_str_resize(output, out_len);
- }
-
-
- /*
- * Decode string in modified BASE85 ASCII format.
- * Note: call base85_init_tables before using of this function
- */
- VALUE base85r_decode(VALUE input)
- {
- int inp_len, out_len, pos, shift;
- unsigned int val = 0;
- VALUE output;
- unsigned char *inptr, *outptr;
- int tail_len, i;
- // Check input data type and allocate string
- if (TYPE(input) != T_STRING)
- rb_raise(rb_eArgError, "base85r_decode: input must be a string");
- inp_len = RSTRING_LEN(input);
- if (inp_len < 6 && inp_len != 2)
- { // String with 1 or more symbols
- rb_raise(rb_eArgError, "base85r_decode: input string is too short");
- }
- output = rb_str_new(NULL, inp_len);
- // Begin conversion
- inptr = (unsigned char *) RSTRING_PTR(input);
- outptr = (unsigned char *) RSTRING_PTR(output);
- out_len = 0;
- tail_len = -1;
- shift = 0;
- val = 0;
- for (pos = 0; pos < inp_len; pos++)
- {
- int digit = char_to_val[(int) inptr[pos]];
- if (digit != -1)
- {
- if (tail_len == -1)
- {
- tail_len = digit;
- if (tail_len > 4)
- rb_raise(rb_eArgError, "base85r_decode: input string is corrupted");
- continue;
- }
- val += digit * di_val[shift++];
- if (shift == 5)
- {
- for (i = 24; i >= 0; i -= 8)
- *outptr++ = (val >> i) & 0xFF;
- shift = 0; val = 0;
- out_len += 4;
- }
- }
- }
- // Check if the byte sequence was valid
- if (shift != 0)
- rb_raise(rb_eArgError, "base85r_decode: input string is corrupted");
- // Take into account unaligned "tail"
- if (tail_len != 0)
- {
- out_len -= (4 - tail_len);
- }
- return rb_str_resize(output, out_len);
- }
+ /*
+ * Implementation of own version of BASE85 binary data encoding
+ * adapted for the usage inside Ruby source code (i.e. without
+ * such symbols as \ " # { } '
+ *
+ * Format of the output stream:
+ * 1) First byte: number of bytes in the last chunk: from 0 to 3
+ * (0 means that the last chunk contains 4 bytes, i.e. everything
+ * is aligned). See val_to_char array for the used alphabet
+ * 2) big-endian 5-byte numbers (base 85)
+ * 3) empty string: arbitrary two bytes
+ *
+ * (C) 2015-2016 Alexey Voskov
+ * License: BSD-2-Clause
+ */
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <inttypes.h>
+ #include <ruby.h>
+ #include <ruby/version.h>
+
+ #define BASE85R_STR_WIDTH 14 // Number of 5-byte groups in the string (12 for 60-byte string)
+
+ #define D0_VAL 1 // 85**0
+ #define D1_VAL 85 // 85**1
+ #define D2_VAL 7225 // 85**2
+ #define D3_VAL 614125 // 85**3
+ #define D4_VAL 52200625 // 85**4
+
+ static int di_val[5] = {D4_VAL, D3_VAL, D2_VAL, D1_VAL, D0_VAL};
+
+
+ /* Modified BASE85 digits */
+ static const char val_to_char[86] = // ASCIIZ string
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+ "abcdefghijklmnopqrstuvwxyz"
+ "0123456789"
+ "!$%&()*-./"
+ ":;<=>?@[]^"
+ ",_|";
+
+ static int char_to_val[128];
+
+
+ /* Initializes internal tables that are required
+ for recoding */
+ void base85r_init_tables()
+ {
+ int i;
+ for (i = 0; i < 128; i++)
+ char_to_val[i] = -1;
+
+
+ for (i = 0; i < 85; i++)
+ {
+ if (char_to_val[(int) val_to_char[i]] != -1)
+ rb_raise(rb_eArgError, "Internal error");
+ char_to_val[(int) val_to_char[i]] = i;
+ }
+
+ for (i = 0; i < 85; i++)
+ if (char_to_val[(int) val_to_char[i]] != i)
+ rb_raise(rb_eArgError, "Internal error");
+ }
+
+
+ /* Calculate length of buffer for base85 encoding */
+ static int base85_encode_buf_len(int len)
+ {
+ // Calculate aligned (32-bit alignment) size of the buffer
+ int buf_len = ((len >> 2) << 2);
+ if (len % 4) buf_len += 4;
+ // Calculate size of the output buffer
+ buf_len = (buf_len * 5) / 4;
+ buf_len = (buf_len * 105) / 100;
+ buf_len += 32;
+ // Return buffer size
+ return buf_len;
+ }
+
+ /*
+ * Encode string to modified BASE85 ASCII.
+ * Note: call base85_init_tables before using of this function
+ */
+ VALUE base85r_encode(VALUE input)
+ {
+ VALUE output;
+ int inp_len, out_len, out_buf_len;
+ int pos, outpos;
+ unsigned int val;
+ unsigned char *outptr, *inptr;
+ int i;
+ // Check input data type and allocate string
+ if (TYPE(input) != T_STRING)
+ rb_raise(rb_eArgError, "base85r_encode: input must be a string");
+ inp_len = RSTRING_LEN(input);
+ out_buf_len = base85_encode_buf_len(inp_len);
+ output = rb_str_new(NULL, out_buf_len);
+ // Begin conversion
+ outpos = 0;
+ outptr = (unsigned char *) RSTRING_PTR(output);
+ inptr = (unsigned char *) RSTRING_PTR(input);
+ outptr[outpos++] = 32;
+ outptr[outpos++] = val_to_char[inp_len % 4];
+ out_len = 2;
+ for (pos = 0; pos < inp_len; )
+ {
+ // Get four bytes
+ val = 0;
+ for (i = 24; i >= 0; i -= 8)
+ if (pos < inp_len) val |= inptr[pos++] << i;
+ // And transform them to five bytes
+ for (i = 0; i < 5; i++)
+ {
+ int digit = (val / di_val[i]) % 85;
+ char sym = val_to_char[digit];
+ outptr[outpos++] = sym;
+ }
+ out_len += 5;
+ // Newline addition
+ if (pos % (4 * BASE85R_STR_WIDTH) == 0)
+ {
+ out_len += 2;
+ outptr[outpos++] = 10;
+ outptr[outpos++]= 32;
+ }
+ }
+ // Check the state of memory
+ if (outpos >= out_buf_len)
+ rb_raise(rb_eArgError, "base85r_encode: internal memory error");
+ // Truncate the empty "tail" of the buffer and return the string
+ return rb_str_resize(output, out_len);
+ }
+
+
+ /*
+ * Decode string in modified BASE85 ASCII format.
+ * Note: call base85_init_tables before using of this function
+ */
+ VALUE base85r_decode(VALUE input)
+ {
+ int inp_len, out_len, pos, shift;
+ unsigned int val = 0;
+ VALUE output;
+ unsigned char *inptr, *outptr;
+ int tail_len, i;
+ // Check input data type and allocate string
+ if (TYPE(input) != T_STRING)
+ rb_raise(rb_eArgError, "base85r_decode: input must be a string");
+ inp_len = RSTRING_LEN(input);
+ if (inp_len < 6 && inp_len != 2)
+ { // String with 1 or more symbols
+ rb_raise(rb_eArgError, "base85r_decode: input string is too short");
+ }
+ output = rb_str_new(NULL, inp_len);
+ // Begin conversion
+ inptr = (unsigned char *) RSTRING_PTR(input);
+ outptr = (unsigned char *) RSTRING_PTR(output);
+ out_len = 0;
+ tail_len = -1;
+ shift = 0;
+ val = 0;
+ for (pos = 0; pos < inp_len; pos++)
+ {
+ int digit = char_to_val[(int) inptr[pos]];
+ if (digit != -1)
+ {
+ if (tail_len == -1)
+ {
+ tail_len = digit;
+ if (tail_len > 4)
+ rb_raise(rb_eArgError, "base85r_decode: input string is corrupted");
+ continue;
+ }
+ val += digit * di_val[shift++];
+ if (shift == 5)
+ {
+ for (i = 24; i >= 0; i -= 8)
+ *outptr++ = (val >> i) & 0xFF;
+ shift = 0; val = 0;
+ out_len += 4;
+ }
+ }
+ }
+ // Check if the byte sequence was valid
+ if (shift != 0)
+ rb_raise(rb_eArgError, "base85r_decode: input string is corrupted");
+ // Take into account unaligned "tail"
+ if (tail_len != 0)
+ {
+ out_len -= (4 - tail_len);
+ }
+ return rb_str_resize(output, out_len);
+ }
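The stream format described in the header comment of this file (a tail-length symbol, then big-endian groups of five base-85 digits) can be followed more easily in a pure-Ruby sketch. The code below is an illustrative port written for this walkthrough, not the gem's implementation: the names `ALPHABET`, `GROUPS_PER_LINE`, `base85r_encode` and `base85r_decode` on the Ruby side are assumptions, and the decoder is slightly stricter than the C code (it rejects a tail length above 3).

```ruby
# Alphabet copied from val_to_char in the C source above (85 symbols).
ALPHABET = ((("A".."Z").to_a + ("a".."z").to_a + ("0".."9").to_a).join +
            "!$%&()*-./" + ":;<=>?@[]^" + ",_|").freeze
GROUPS_PER_LINE = 14 # mirrors BASE85R_STR_WIDTH

# Encode raw bytes: a space, one symbol for (length % 4), then five
# base-85 digits per 4-byte chunk (the last chunk is zero-padded).
def base85r_encode(input)
  data = input.b
  out = String.new(" ")
  out << ALPHABET[data.bytesize % 4]
  consumed = 0
  data.bytes.each_slice(4) do |chunk|
    consumed += chunk.size
    padded = chunk + [0] * (4 - chunk.size)
    val = padded.inject(0) { |acc, byte| (acc << 8) | byte }
    4.downto(0) { |power| out << ALPHABET[(val / 85**power) % 85] }
    # Line break plus a leading space after every 14 full groups.
    out << "\n " if consumed % (4 * GROUPS_PER_LINE) == 0
  end
  out
end

# Decode: symbols outside the alphabet (spaces, newlines) are skipped,
# the first valid symbol is the tail length, the rest are digit groups.
def base85r_decode(text)
  digits = text.each_char.map { |c| ALPHABET.index(c) }.compact
  raise ArgumentError, "input string is too short" if digits.empty?
  tail = digits.shift
  if tail > 3 || digits.size % 5 != 0 || (tail != 0 && digits.empty?)
    raise ArgumentError, "input string is corrupted"
  end
  bytes = digits.each_slice(5).flat_map do |group|
    val = group.inject(0) { |acc, d| acc * 85 + d }
    [24, 16, 8, 0].map { |shift| (val >> shift) & 0xFF }
  end
  bytes = bytes[0, bytes.size - (4 - tail)] if tail != 0
  bytes.pack("C*")
end

encoded = base85r_encode("hello")
puts base85r_decode(encoded) == "hello".b
```

Because the decoder skips every symbol outside the alphabet, the leading space and the line breaks inserted by the encoder do not affect the round trip.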