ruby-zstds 1.0.0

Files changed (43)
  1. checksums.yaml +7 -0
  2. data/AUTHORS +1 -0
  3. data/LICENSE +21 -0
  4. data/README.md +498 -0
  5. data/ext/extconf.rb +82 -0
  6. data/ext/zstds_ext/buffer.c +30 -0
  7. data/ext/zstds_ext/buffer.h +23 -0
  8. data/ext/zstds_ext/common.h +16 -0
  9. data/ext/zstds_ext/dictionary.c +106 -0
  10. data/ext/zstds_ext/dictionary.h +16 -0
  11. data/ext/zstds_ext/error.c +81 -0
  12. data/ext/zstds_ext/error.h +35 -0
  13. data/ext/zstds_ext/io.c +512 -0
  14. data/ext/zstds_ext/io.h +14 -0
  15. data/ext/zstds_ext/macro.h +13 -0
  16. data/ext/zstds_ext/main.c +25 -0
  17. data/ext/zstds_ext/option.c +287 -0
  18. data/ext/zstds_ext/option.h +122 -0
  19. data/ext/zstds_ext/stream/compressor.c +241 -0
  20. data/ext/zstds_ext/stream/compressor.h +31 -0
  21. data/ext/zstds_ext/stream/decompressor.c +183 -0
  22. data/ext/zstds_ext/stream/decompressor.h +29 -0
  23. data/ext/zstds_ext/string.c +254 -0
  24. data/ext/zstds_ext/string.h +14 -0
  25. data/lib/zstds.rb +9 -0
  26. data/lib/zstds/dictionary.rb +47 -0
  27. data/lib/zstds/error.rb +22 -0
  28. data/lib/zstds/file.rb +46 -0
  29. data/lib/zstds/option.rb +194 -0
  30. data/lib/zstds/stream/abstract.rb +153 -0
  31. data/lib/zstds/stream/delegates.rb +36 -0
  32. data/lib/zstds/stream/raw/abstract.rb +55 -0
  33. data/lib/zstds/stream/raw/compressor.rb +101 -0
  34. data/lib/zstds/stream/raw/decompressor.rb +70 -0
  35. data/lib/zstds/stream/reader.rb +166 -0
  36. data/lib/zstds/stream/reader_helpers.rb +192 -0
  37. data/lib/zstds/stream/stat.rb +78 -0
  38. data/lib/zstds/stream/writer.rb +145 -0
  39. data/lib/zstds/stream/writer_helpers.rb +93 -0
  40. data/lib/zstds/string.rb +31 -0
  41. data/lib/zstds/validation.rb +48 -0
  42. data/lib/zstds/version.rb +6 -0
  43. metadata +182 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 7c5a2d6aa32e470ebafcf774cbb4d92c0af6bc865ab9c53a25d77ccc28c8530f
4
+ data.tar.gz: 6f1b9dbcd8e0b0deb0f3e8d6d34c2bb0f3fa2883c6f95b10eab56ac36798b457
5
+ SHA512:
6
+ metadata.gz: ca4b8e2055507fd081aec2d3ada1ddc2ebb69511c52ccfe0ef5eae28ffc21d41f293a2b813cd7b96046bd5e33460f486e8199d6c43655cadf51ceb9d58e3d38f
7
+ data.tar.gz: 5e3e4e67091231f9a884556056b1be5911e8926cc996bdae00c89066194166ed36e2d9b28fe3a4005920da7808865432a05582691e9743c577ef9f9dd9bd5141
data/AUTHORS ADDED
@@ -0,0 +1 @@
1
+ Andrew Aladjev
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2019 AUTHORS
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,498 @@
1
+ # Ruby bindings for zstd library
2
+
3
+ | Travis | AppVeyor | Cirrus | Circle |
4
+ | :---: | :---: | :---: | :---: |
5
+ | [![Travis test status](https://travis-ci.com/andrew-aladev/ruby-zstds.svg?branch=master)](https://travis-ci.com/andrew-aladev/ruby-zstds) | [![AppVeyor test status](https://ci.appveyor.com/api/projects/status/github/andrew-aladev/ruby-zstds?branch=master&svg=true)](https://ci.appveyor.com/project/andrew-aladev/ruby-zstds/branch/master) | [![Cirrus test status](https://api.cirrus-ci.com/github/andrew-aladev/ruby-zstds.svg?branch=master)](https://cirrus-ci.com/github/andrew-aladev/ruby-zstds) | [![Circle test status](https://circleci.com/gh/andrew-aladev/ruby-zstds/tree/master.svg?style=shield)](https://circleci.com/gh/andrew-aladev/ruby-zstds/tree/master) |
6
+
7
+ See [zstd library](https://github.com/facebook/zstd).
8
+
9
+ ## Installation
10
+
11
+ Please install the zstd library first; use version 1.4.3 or later.
12
+
13
+ ```sh
14
+ gem install ruby-zstds
15
+ ```
16
+
17
+ You can build it from source.
18
+
19
+ ```sh
20
+ rake gem
21
+ gem install pkg/ruby-zstds-*.gem
22
+ ```
23
+
24
+ ## Usage
25
+
26
+ There are two simple APIs: `String` and `File`. You can also use the generic streaming API: `Stream::Writer` and `Stream::Reader`.
27
+
28
+ ```ruby
29
+ require "zstds"
30
+
31
+ data = ZSTDS::String.compress "sample string"
32
+ puts ZSTDS::String.decompress(data)
33
+
34
+ ZSTDS::File.compress "file.txt", "file.txt.zst"
35
+ ZSTDS::File.decompress "file.txt.zst", "file.txt"
36
+
37
+ ZSTDS::Stream::Writer.open("file.txt.zst") { |writer| writer << "sample string" }
38
+ puts ZSTDS::Stream::Reader.open("file.txt.zst") { |reader| reader.read }
39
+ ```
40
+
41
+ You can create a dictionary using `ZSTDS::Dictionary`.
42
+
43
+ ```ruby
44
+ require "securerandom"
45
+ require "zstds"
46
+
47
+ samples = (Array.new(8) { ::SecureRandom.random_bytes(1 << 8) } + ["sample string"]).shuffle
48
+
49
+ dictionary = ZSTDS::Dictionary.train samples
50
+ File.write "dictionary.bin", dictionary.buffer
51
+
52
+ dictionary_buffer = File.read "dictionary.bin"
53
+ dictionary = ZSTDS::Dictionary.new dictionary_buffer
54
+
55
+ data = ZSTDS::String.compress "sample string", :dictionary => dictionary
56
+ puts ZSTDS::String.decompress(data, :dictionary => dictionary)
57
+ ```
58
+
59
+ You can create and read `tar.zst` archives, for example with `minitar`.
60
+
61
+ ```ruby
62
+ require "zstds"
63
+ require "minitar"
64
+
65
+ ZSTDS::Stream::Writer.open "file.tar.zst" do |writer|
66
+   Minitar::Writer.open writer do |tar|
67
+     tar.add_file_simple "file", :data => "sample string"
68
+   end
69
+ end
70
+
71
+ ZSTDS::Stream::Reader.open "file.tar.zst" do |reader|
72
+   Minitar::Reader.open reader do |tar|
73
+     tar.each_entry do |entry|
74
+       puts entry.name
75
+       puts entry.read
76
+     end
77
+   end
78
+ end
79
+ ```
80
+
81
+ ## Options
82
+
83
+ Each API supports several options:
84
+
85
+ ```
86
+ :source_buffer_length
87
+ :destination_buffer_length
88
+ ```
89
+
90
+ There are internal buffers for compressed and decompressed data.
91
+ For example, if you use 1 KB as the source buffer length for the compressor, use 256 B as the destination buffer length.
92
+ If you use 256 B as the source buffer length for the decompressor, use 1 KB as the destination buffer length.
93
+
94
+ Values: 0 - infinity, default value: 0.
95
+ 0 means automatic buffer length selection.
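
The pairing described above can be sketched as a small helper. This is a minimal sketch, not part of the gem: `buffer_lengths_for` and `BUFFER_RATIO` are hypothetical names, and the 4:1 ratio is only inferred from the 1 KB / 256 B example, not from a documented rule.

```ruby
# Hypothetical helper, not part of zstds.
# Assumption: the 4:1 ratio mirrors the 1 KB / 256 B example in the text.
BUFFER_RATIO = 4

# Returns a hash with both buffer length options for the given role.
def buffer_lengths_for(role, source_buffer_length)
  destination_buffer_length =
    case role
    when :compressor   then source_buffer_length / BUFFER_RATIO
    when :decompressor then source_buffer_length * BUFFER_RATIO
    else raise ArgumentError, "unknown role: #{role.inspect}"
    end

  {
    :source_buffer_length      => source_buffer_length,
    :destination_buffer_length => destination_buffer_length
  }
end

p buffer_lengths_for(:compressor, 1 << 10)
p buffer_lengths_for(:decompressor, 1 << 8)
```

The resulting hash has the shape of the options that `File` is described as accepting. A source length of 0 keeps both values at 0, which means automatic selection.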
96
+
97
+ ```
98
+ :compression_level
99
+ ```
100
+
101
+ Values: `ZSTDS::Option::MIN_COMPRESSION_LEVEL` - `ZSTDS::Option::MAX_COMPRESSION_LEVEL`, default value: `0`.
102
+
103
+ ```
104
+ :window_log
105
+ ```
106
+
107
+ Values: `ZSTDS::Option::MIN_WINDOW_LOG` - `ZSTDS::Option::MAX_WINDOW_LOG`, default value: `0`.
108
+
109
+ ```
110
+ :hash_log
111
+ ```
112
+
113
+ Values: `ZSTDS::Option::MIN_HASH_LOG` - `ZSTDS::Option::MAX_HASH_LOG`, default value: `0`.
114
+
115
+ ```
116
+ :chain_log
117
+ ```
118
+
119
+ Values: `ZSTDS::Option::MIN_CHAIN_LOG` - `ZSTDS::Option::MAX_CHAIN_LOG`, default value: `0`.
120
+
121
+ ```
122
+ :search_log
123
+ ```
124
+
125
+ Values: `ZSTDS::Option::MIN_SEARCH_LOG` - `ZSTDS::Option::MAX_SEARCH_LOG`, default value: `0`.
126
+
127
+ ```
128
+ :min_match
129
+ ```
130
+
131
+ Values: `ZSTDS::Option::MIN_MIN_MATCH` - `ZSTDS::Option::MAX_MIN_MATCH`, default value: `0`.
132
+
133
+ ```
134
+ :target_length
135
+ ```
136
+
137
+ Values: `ZSTDS::Option::MIN_TARGET_LENGTH` - `ZSTDS::Option::MAX_TARGET_LENGTH`, default value: `0`.
138
+
139
+ ```
140
+ :strategy
141
+ ```
142
+
143
+ Values: `ZSTDS::Option::STRATEGIES`, default value: none.
144
+
145
+ ```
146
+ :enable_long_distance_matching
147
+ ```
148
+
149
+ Values: true/false, default value: none.
150
+
151
+ ```
152
+ :ldm_hash_log
153
+ ```
154
+
155
+ Values: `ZSTDS::Option::MIN_LDM_HASH_LOG` - `ZSTDS::Option::MAX_LDM_HASH_LOG`, default value: `0`.
156
+
157
+ ```
158
+ :ldm_min_match
159
+ ```
160
+
161
+ Values: `ZSTDS::Option::MIN_LDM_MIN_MATCH` - `ZSTDS::Option::MAX_LDM_MIN_MATCH`, default value: `0`.
162
+
163
+ ```
164
+ :ldm_bucket_size_log
165
+ ```
166
+
167
+ Values: `ZSTDS::Option::MIN_LDM_BUCKET_SIZE_LOG` - `ZSTDS::Option::MAX_LDM_BUCKET_SIZE_LOG`, default value: `0`.
168
+
169
+ ```
170
+ :ldm_hash_rate_log
171
+ ```
172
+
173
+ Values: `ZSTDS::Option::MIN_LDM_HASH_RATE_LOG` - `ZSTDS::Option::MAX_LDM_HASH_RATE_LOG`, default value: `0`.
174
+
175
+ ```
176
+ :content_size_flag
177
+ ```
178
+
179
+ Values: true/false, default value: true.
180
+
181
+ ```
182
+ :checksum_flag
183
+ ```
184
+
185
+ Values: true/false, default value: false.
186
+
187
+ ```
188
+ :dict_id_flag
189
+ ```
190
+
191
+ Values: true/false, default value: true.
192
+
193
+ ```
194
+ :nb_workers
195
+ ```
196
+
197
+ Values: `ZSTDS::Option::MIN_NB_WORKERS` - `ZSTDS::Option::MAX_NB_WORKERS`, default value: `0`.
198
+
199
+ ```
200
+ :job_size
201
+ ```
202
+
203
+ Values: `ZSTDS::Option::MIN_JOB_SIZE` - `ZSTDS::Option::MAX_JOB_SIZE`, default value: `0`.
204
+
205
+ ```
206
+ :overlap_log
207
+ ```
208
+
209
+ Values: `ZSTDS::Option::MIN_OVERLAP_LOG` - `ZSTDS::Option::MAX_OVERLAP_LOG`, default value: `0`.
210
+
211
+ ```
212
+ :window_log_max
213
+ ```
214
+
215
+ Values: `ZSTDS::Option::MIN_WINDOW_LOG_MAX` - `ZSTDS::Option::MAX_WINDOW_LOG_MAX`, default value: `0`.
216
+
217
+ ```
218
+ :dictionary
219
+ ```
220
+
221
+ Special option for dictionary, default value: none.
222
+
223
+ ```
224
+ :pledged_size
225
+ ```
226
+
227
+ Values: 0 - infinity, default value: 0.
228
+ It is reasonable to provide the size of the input (if known) when using the streaming API.
229
+ `String` and `File` will set `:pledged_size` automatically.
230
+
231
+ Please read zstd docs for more info about options.
232
+
233
+ Possible compressor options:
234
+ ```
235
+ :compression_level
236
+ :window_log
237
+ :hash_log
238
+ :chain_log
239
+ :search_log
240
+ :min_match
241
+ :target_length
242
+ :strategy
243
+ :enable_long_distance_matching
244
+ :ldm_hash_log
245
+ :ldm_min_match
246
+ :ldm_bucket_size_log
247
+ :ldm_hash_rate_log
248
+ :content_size_flag
249
+ :checksum_flag
250
+ :dict_id_flag
251
+ :nb_workers
252
+ :job_size
253
+ :overlap_log
254
+ :dictionary
255
+ :pledged_size
256
+ ```
257
+
258
+ Possible decompressor options:
259
+ ```
260
+ :window_log_max
261
+ :dictionary
262
+ ```
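
When one options hash is shared between both directions, the two lists above can be used to split it. The following is a minimal sketch: `split_options` and both constants are hypothetical names that only copy the lists from this README, and it does not call the gem.

```ruby
# Option names copied from the "Possible compressor options" list above.
COMPRESSOR_OPTION_NAMES = %i[
  compression_level window_log hash_log chain_log search_log min_match
  target_length strategy enable_long_distance_matching ldm_hash_log
  ldm_min_match ldm_bucket_size_log ldm_hash_rate_log content_size_flag
  checksum_flag dict_id_flag nb_workers job_size overlap_log dictionary
  pledged_size
].freeze

# Option names copied from the "Possible decompressor options" list above.
DECOMPRESSOR_OPTION_NAMES = %i[window_log_max dictionary].freeze

# Hypothetical helper: returns [compressor_options, decompressor_options].
# Keys outside both lists (such as the buffer lengths) are ignored here.
def split_options(options)
  [
    options.slice(*COMPRESSOR_OPTION_NAMES),
    options.slice(*DECOMPRESSOR_OPTION_NAMES)
  ]
end

compressor_options, decompressor_options =
  split_options(:compression_level => 5, :window_log_max => 11)

p compressor_options
p decompressor_options
```

Note that `:dictionary` appears in both lists, so it ends up in both hashes.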
263
+
264
+ Example:
265
+
266
+ ```ruby
267
+ require "zstds"
268
+
269
+ data = ZSTDS::String.compress "sample string", :compression_level => 5
270
+ puts ZSTDS::String.decompress(data, :window_log_max => 11)
271
+ ```
272
+
273
+ HTTP encoding (`Content-Encoding: zstd`) using default options:
274
+
275
+ ```ruby
276
+ require "zstds"
277
+ require "sinatra"
278
+
279
+ get "/" do
280
+   headers["Content-Encoding"] = "zstd"
281
+   ZSTDS::String.compress "sample string"
282
+ end
283
+ ```
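
The route above always answers with zstd. A server may want to check the request's `Accept-Encoding` header first. The sketch below is an assumption-laden illustration rather than part of the gem: `accepts_zstd?` is a hypothetical helper, it handles only the explicit `zstd` token with an optional `q` value, and it ignores the `*` wildcard.

```ruby
# Hypothetical helper: true when the Accept-Encoding header value lists
# "zstd" with a non-zero quality. The "*" wildcard is not handled.
def accepts_zstd?(accept_encoding)
  accept_encoding.to_s.split(",").any? do |entry|
    coding, *params = entry.strip.downcase.split(/\s*;\s*/)
    next false unless coding == "zstd"

    # A missing q parameter means the default quality of 1.
    quality = params.map { |param| param[/\Aq=(.+)\z/, 1] }.compact.first
    quality.nil? || quality.to_f > 0
  end
end

p accepts_zstd?("gzip, zstd;q=0.5")
p accepts_zstd?("gzip, deflate")
```

In a Sinatra route the header value would typically come from `request.env["HTTP_ACCEPT_ENCODING"]`.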
284
+
285
+ ## String
286
+
287
+ String maintains destination buffer only, so it accepts `destination_buffer_length` option only.
288
+
289
+ ```
290
+ ::compress(source, options = {})
291
+ ::decompress(source, options = {})
292
+ ```
293
+
294
+ `source` is a source string.
295
+
296
+ ## File
297
+
298
+ File maintains both source and destination buffers, so it accepts both `source_buffer_length` and `destination_buffer_length` options.
299
+
300
+ ```
301
+ ::compress(source, destination, options = {})
302
+ ::decompress(source, destination, options = {})
303
+ ```
304
+
305
+ `source` and `destination` are file paths.
306
+
307
+ ## Stream::Writer
308
+
309
+ Its behaviour is similar to builtin [`Zlib::GzipWriter`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipWriter.html).
310
+
311
+ Writer maintains destination buffer only, so it accepts `destination_buffer_length` option only.
312
+
313
+ ```
314
+ ::open(file_path, options = {}, :external_encoding => nil, :transcode_options => {}, &block)
315
+ ```
316
+
317
+ Open file path and create stream writer associated with opened file.
318
+ Data will be transcoded to `:external_encoding` using `:transcode_options` before compressing.
319
+
320
+ It may be tricky to use both `:pledged_size` and `:transcode_options`. You have to provide size of transcoded input.
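
To see why the transcoded size matters, compare byte sizes before and after transcoding. This is a minimal sketch using only `String#encode` from core Ruby; `transcoded_bytesize` is a hypothetical helper, and using its result as `:pledged_size` is an assumption based on the sentence above.

```ruby
# Hypothetical helper: byte size of text after transcoding it to
# external_encoding with the given transcode options.
def transcoded_bytesize(text, external_encoding, transcode_options = {})
  text.encode(external_encoding, **transcode_options).bytesize
end

text = "sample string \u20AC" # 15 characters, the last one is non-ASCII

p text.bytesize                                   # size in UTF-8
p transcoded_bytesize(text, Encoding::UTF_16LE)   # size after transcoding
```

The two numbers differ, so a pledged size computed from the original string would not match what the compressor receives.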
321
+
322
+ ```
323
+ ::new(destination_io, options = {}, :external_encoding => nil, :transcode_options => {})
324
+ ```
325
+
326
+ Create stream writer associated with destination io.
327
+ Data will be transcoded to `:external_encoding` using `:transcode_options` before compressing.
328
+
329
+ It may be tricky to use both `:pledged_size` and `:transcode_options`. You have to provide size of transcoded input.
330
+
331
+ ```
332
+ #set_encoding(external_encoding, nil, transcode_options)
333
+ ```
334
+
335
+ Sets new encodings; the `nil` argument exists only for compatibility with `IO`.
336
+
337
+ ```
338
+ #io
339
+ #to_io
340
+ #stat
341
+ #external_encoding
342
+ #transcode_options
343
+ #pos
344
+ #tell
345
+ ```
346
+
347
+ See [`IO`](https://ruby-doc.org/core-2.6.1/IO.html) docs.
348
+
349
+ ```
350
+ #write(*objects)
351
+ #flush
352
+ #rewind
353
+ #close
354
+ #closed?
355
+ ```
356
+
357
+ See [`Zlib::GzipWriter`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipWriter.html) docs.
358
+
359
+ ```
360
+ #write_nonblock(object, *options)
361
+ #flush_nonblock(*options)
362
+ #rewind_nonblock(*options)
363
+ #close_nonblock(*options)
364
+ ```
365
+
366
+ Special asynchronous methods missing in `Zlib::GzipWriter`.
367
+ `rewind` wants to `close`, `close` wants to `write` something and `flush`, and `flush` wants to `write` something.
368
+ So it is possible to have asynchronous variants for these synchronous methods.
369
+ Behaviour is the same as `IO#write_nonblock` method.
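
For reference, this is the usual retry loop around `IO#write_nonblock`, shown here on a plain pipe from core Ruby. It is a sketch of the `IO` behaviour the text refers to; `write_all_nonblock` is a hypothetical helper, and applying the same loop to the writer's `*_nonblock` methods is an assumption based on the statement above.

```ruby
# Hypothetical helper: writes all bytes of data to io using write_nonblock,
# waiting with IO.select whenever the write would block.
def write_all_nonblock(io, data)
  remaining = data.b

  until remaining.empty?
    result = io.write_nonblock(remaining, exception: false)

    if result == :wait_writable
      IO.select(nil, [io]) # wait until io becomes writable, then retry
      next
    end

    # result is the number of bytes written; keep the unwritten tail.
    remaining = remaining.byteslice(result, remaining.bytesize - result)
  end

  data.bytesize
end

reader, writer = IO.pipe
write_all_nonblock(writer, "sample string")
writer.close
puts reader.read
reader.close
```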
370
+
371
+ ```
372
+ #<<(object)
373
+ #print(*objects)
374
+ #printf(*args)
375
+ #putc(object, encoding: ::Encoding::BINARY)
376
+ #puts(*objects)
377
+ ```
378
+
379
+ Typical helpers, see [`Zlib::GzipWriter`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipWriter.html) docs.
380
+
381
+ ## Stream::Reader
382
+
383
+ Its behaviour is similar to builtin [`Zlib::GzipReader`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipReader.html).
384
+
385
+ Reader maintains both source and destination buffers, so it accepts both `source_buffer_length` and `destination_buffer_length` options.
386
+
387
+ ```
388
+ ::open(file_path, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {}, &block)
389
+ ```
390
+
391
+ Open file path and create stream reader associated with opened file.
392
+ Data will be force encoded to `:external_encoding` and transcoded to `:internal_encoding` using `:transcode_options` after decompressing.
393
+
394
+ ```
395
+ ::new(source_io, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {})
396
+ ```
397
+
398
+ Create stream reader associated with source io.
399
+ Data will be force encoded to `:external_encoding` and transcoded to `:internal_encoding` using `:transcode_options` after decompressing.
400
+
401
+ ```
402
+ #set_encoding(external_encoding, internal_encoding, transcode_options)
403
+ ```
404
+
405
+ Sets new encodings.
406
+
407
+ ```
408
+ #io
409
+ #to_io
410
+ #stat
411
+ #external_encoding
412
+ #internal_encoding
413
+ #transcode_options
414
+ #pos
415
+ #tell
416
+ ```
417
+
418
+ See [`IO`](https://ruby-doc.org/core-2.6.1/IO.html) docs.
419
+
420
+ ```
421
+ #read(bytes_to_read = nil, out_buffer = nil)
422
+ #eof?
423
+ #rewind
424
+ #close
425
+ #closed?
426
+ ```
427
+
428
+ See [`Zlib::GzipReader`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipReader.html) docs.
429
+
430
+ ```
431
+ #readpartial(bytes_to_read = nil, out_buffer = nil)
432
+ #read_nonblock(bytes_to_read, out_buffer = nil, *options)
433
+ ```
434
+
435
+ See [`IO`](https://ruby-doc.org/core-2.6.1/IO.html) docs.
436
+
437
+ ```
438
+ #getbyte
439
+ #each_byte(&block)
440
+ #readbyte
441
+ #ungetbyte(byte)
442
+
443
+ #getc
444
+ #readchar
445
+ #each_char(&block)
446
+ #ungetc(char)
447
+
448
+ #lineno
449
+ #lineno=
450
+ #gets(separator = $OUTPUT_RECORD_SEPARATOR, limit = nil)
451
+ #readline
452
+ #readlines
453
+ #each(&block)
454
+ #each_line(&block)
455
+ #ungetline(line)
456
+ ```
457
+
458
+ Typical helpers, see [`Zlib::GzipReader`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipReader.html) docs.
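
Because the streams are described as similar to `Zlib::GzipWriter` and `Zlib::GzipReader`, the standard library classes give a feel for the line helpers listed above. The sketch below uses only `zlib` and `stringio`; `gzip_lines_round_trip` is a hypothetical helper, and swapping in the zstds stream classes is an assumption based on the documented similarity.

```ruby
require "stringio"
require "zlib"

# Hypothetical helper: writes lines through GzipWriter, reads them back
# through GzipReader and returns [lines, lineno].
def gzip_lines_round_trip(lines)
  io = StringIO.new

  writer = Zlib::GzipWriter.new(io)
  lines.each { |line| writer.puts line }
  writer.finish # ends the gzip stream without closing io

  io.rewind
  reader = Zlib::GzipReader.new(io)

  result = []
  reader.each_line { |line| result << line.chomp }

  [result, reader.lineno]
ensure
  reader&.close
end

p gzip_lines_round_trip(["first", "second"])
```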
459
+
460
+ ## Dictionary
461
+
462
+ You can train a dictionary from samples using the `train` class method.
463
+
464
+ ```
465
+ ::train(samples, :capacity => 0)
466
+ ```
467
+
468
+ Please review zstd code before using it.
469
+ There are many validation requirements and they change between versions.
470
+
471
+ ```
472
+ #buffer
473
+ ```
474
+
475
+ There is an attribute reader for buffer.
476
+ You can use it to store dictionary somewhere.
477
+
478
+ ```
479
+ ::new(buffer)
480
+ ```
481
+
482
+ Please use regular constructor to create dictionary from buffer.
483
+
484
+ ```
485
+ #id
486
+ ```
487
+
488
+ Read dictionary id from buffer.
489
+
490
+ ## CI
491
+
492
+ Travis and AppVeyor CI use [scripts/ci_test.sh](scripts/ci_test.sh) directly.
493
+ Cirrus and Circle CI use prebuilt [scripts/test-images](scripts/test-images).
494
+ Cirrus CI uses an amd64 image; Circle CI uses an i686 image.
495
+
496
+ ## License
497
+
498
+ MIT license, see LICENSE and AUTHORS.