ruby-zstds 1.0.0

Files changed (43)
  1. checksums.yaml +7 -0
  2. data/AUTHORS +1 -0
  3. data/LICENSE +21 -0
  4. data/README.md +498 -0
  5. data/ext/extconf.rb +82 -0
  6. data/ext/zstds_ext/buffer.c +30 -0
  7. data/ext/zstds_ext/buffer.h +23 -0
  8. data/ext/zstds_ext/common.h +16 -0
  9. data/ext/zstds_ext/dictionary.c +106 -0
  10. data/ext/zstds_ext/dictionary.h +16 -0
  11. data/ext/zstds_ext/error.c +81 -0
  12. data/ext/zstds_ext/error.h +35 -0
  13. data/ext/zstds_ext/io.c +512 -0
  14. data/ext/zstds_ext/io.h +14 -0
  15. data/ext/zstds_ext/macro.h +13 -0
  16. data/ext/zstds_ext/main.c +25 -0
  17. data/ext/zstds_ext/option.c +287 -0
  18. data/ext/zstds_ext/option.h +122 -0
  19. data/ext/zstds_ext/stream/compressor.c +241 -0
  20. data/ext/zstds_ext/stream/compressor.h +31 -0
  21. data/ext/zstds_ext/stream/decompressor.c +183 -0
  22. data/ext/zstds_ext/stream/decompressor.h +29 -0
  23. data/ext/zstds_ext/string.c +254 -0
  24. data/ext/zstds_ext/string.h +14 -0
  25. data/lib/zstds.rb +9 -0
  26. data/lib/zstds/dictionary.rb +47 -0
  27. data/lib/zstds/error.rb +22 -0
  28. data/lib/zstds/file.rb +46 -0
  29. data/lib/zstds/option.rb +194 -0
  30. data/lib/zstds/stream/abstract.rb +153 -0
  31. data/lib/zstds/stream/delegates.rb +36 -0
  32. data/lib/zstds/stream/raw/abstract.rb +55 -0
  33. data/lib/zstds/stream/raw/compressor.rb +101 -0
  34. data/lib/zstds/stream/raw/decompressor.rb +70 -0
  35. data/lib/zstds/stream/reader.rb +166 -0
  36. data/lib/zstds/stream/reader_helpers.rb +192 -0
  37. data/lib/zstds/stream/stat.rb +78 -0
  38. data/lib/zstds/stream/writer.rb +145 -0
  39. data/lib/zstds/stream/writer_helpers.rb +93 -0
  40. data/lib/zstds/string.rb +31 -0
  41. data/lib/zstds/validation.rb +48 -0
  42. data/lib/zstds/version.rb +6 -0
  43. metadata +182 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 7c5a2d6aa32e470ebafcf774cbb4d92c0af6bc865ab9c53a25d77ccc28c8530f
4
+ data.tar.gz: 6f1b9dbcd8e0b0deb0f3e8d6d34c2bb0f3fa2883c6f95b10eab56ac36798b457
5
+ SHA512:
6
+ metadata.gz: ca4b8e2055507fd081aec2d3ada1ddc2ebb69511c52ccfe0ef5eae28ffc21d41f293a2b813cd7b96046bd5e33460f486e8199d6c43655cadf51ceb9d58e3d38f
7
+ data.tar.gz: 5e3e4e67091231f9a884556056b1be5911e8926cc996bdae00c89066194166ed36e2d9b28fe3a4005920da7808865432a05582691e9743c577ef9f9dd9bd5141
data/AUTHORS ADDED
@@ -0,0 +1 @@
1
+ Andrew Aladjev
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2019 AUTHORS
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,498 @@
1
+ # Ruby bindings for zstd library
2
+
3
+ | Travis | AppVeyor | Cirrus | Circle |
4
+ | :---: | :---: | :---: | :---: |
5
+ | [![Travis test status](https://travis-ci.com/andrew-aladev/ruby-zstds.svg?branch=master)](https://travis-ci.com/andrew-aladev/ruby-zstds) | [![AppVeyor test status](https://ci.appveyor.com/api/projects/status/github/andrew-aladev/ruby-zstds?branch=master&svg=true)](https://ci.appveyor.com/project/andrew-aladev/ruby-zstds/branch/master) | [![Cirrus test status](https://api.cirrus-ci.com/github/andrew-aladev/ruby-zstds.svg?branch=master)](https://cirrus-ci.com/github/andrew-aladev/ruby-zstds) | [![Circle test status](https://circleci.com/gh/andrew-aladev/ruby-zstds/tree/master.svg?style=shield)](https://circleci.com/gh/andrew-aladev/ruby-zstds/tree/master) |
6
+
7
+ See [zstd library](https://github.com/facebook/zstd).
8
+
9
+ ## Installation
10
+
11
+ Please install the zstd library first; use version 1.4.3 or later.
12
+
13
+ ```sh
14
+ gem install ruby-zstds
15
+ ```
16
+
17
+ You can build it from source.
18
+
19
+ ```sh
20
+ rake gem
21
+ gem install pkg/ruby-zstds-*.gem
22
+ ```
23
+
24
+ ## Usage
25
+
26
+ There are two simple APIs: `String` and `File`. You can also use the generic streaming API: `Stream::Writer` and `Stream::Reader`.
27
+
28
+ ```ruby
29
+ require "zstds"
30
+
31
+ data = ZSTDS::String.compress "sample string"
32
+ puts ZSTDS::String.decompress(data)
33
+
34
+ ZSTDS::File.compress "file.txt", "file.txt.zst"
35
+ ZSTDS::File.decompress "file.txt.zst", "file.txt"
36
+
37
+ ZSTDS::Stream::Writer.open("file.txt.zst") { |writer| writer << "sample string" }
38
+ puts ZSTDS::Stream::Reader.open("file.txt.zst") { |reader| reader.read }
39
+ ```
40
+
41
+ You can create a dictionary using `ZSTDS::Dictionary`.
42
+
43
+ ```ruby
44
+ require "securerandom"
45
+ require "zstds"
46
+
47
+ samples = (Array.new(8) { ::SecureRandom.random_bytes(1 << 8) } + ["sample string"]).shuffle
48
+
49
+ dictionary = ZSTDS::Dictionary.train samples
50
+ File.write "dictionary.bin", dictionary.buffer
51
+
52
+ dictionary_buffer = File.read "dictionary.bin"
53
+ dictionary = ZSTDS::Dictionary.new dictionary_buffer
54
+
55
+ data = ZSTDS::String.compress "sample string", :dictionary => dictionary
56
+ puts ZSTDS::String.decompress(data, :dictionary => dictionary)
57
+ ```
58
+
59
+ For example, you can create and read `tar.zst` archives with `minitar`.
60
+
61
+ ```ruby
62
+ require "zstds"
63
+ require "minitar"
64
+
65
+ ZSTDS::Stream::Writer.open "file.tar.zst" do |writer|
66
+ Minitar::Writer.open writer do |tar|
67
+ tar.add_file_simple "file", :data => "sample string"
68
+ end
69
+ end
70
+
71
+ ZSTDS::Stream::Reader.open "file.tar.zst" do |reader|
72
+ Minitar::Reader.open reader do |tar|
73
+ tar.each_entry do |entry|
74
+ puts entry.name
75
+ puts entry.read
76
+ end
77
+ end
78
+ end
79
+ ```
80
+
81
+ ## Options
82
+
83
+ Each API supports several options:
84
+
85
+ ```
86
+ :source_buffer_length
87
+ :destination_buffer_length
88
+ ```
89
+
90
+ There are internal buffers for compressed and decompressed data.
91
+ For example, if you use 1 KB as the source buffer length for a compressor, use 256 B as the destination buffer length.
92
+ If you use 256 B as the source buffer length for a decompressor, use 1 KB as the destination buffer length.
93
+
94
+ Values: 0 - infinity, default value: 0.
95
+ 0 means automatic buffer length selection.
96
+
97
+ ```
98
+ :compression_level
99
+ ```
100
+
101
+ Values: `ZSTDS::Option::MIN_COMPRESSION_LEVEL` - `ZSTDS::Option::MAX_COMPRESSION_LEVEL`, default value: `0`.
102
+
103
+ ```
104
+ :window_log
105
+ ```
106
+
107
+ Values: `ZSTDS::Option::MIN_WINDOW_LOG` - `ZSTDS::Option::MAX_WINDOW_LOG`, default value: `0`.
108
+
109
+ ```
110
+ :hash_log
111
+ ```
112
+
113
+ Values: `ZSTDS::Option::MIN_HASH_LOG` - `ZSTDS::Option::MAX_HASH_LOG`, default value: `0`.
114
+
115
+ ```
116
+ :chain_log
117
+ ```
118
+
119
+ Values: `ZSTDS::Option::MIN_CHAIN_LOG` - `ZSTDS::Option::MAX_CHAIN_LOG`, default value: `0`.
120
+
121
+ ```
122
+ :search_log
123
+ ```
124
+
125
+ Values: `ZSTDS::Option::MIN_SEARCH_LOG` - `ZSTDS::Option::MAX_SEARCH_LOG`, default value: `0`.
126
+
127
+ ```
128
+ :min_match
129
+ ```
130
+
131
+ Values: `ZSTDS::Option::MIN_MIN_MATCH` - `ZSTDS::Option::MAX_MIN_MATCH`, default value: `0`.
132
+
133
+ ```
134
+ :target_length
135
+ ```
136
+
137
+ Values: `ZSTDS::Option::MIN_TARGET_LENGTH` - `ZSTDS::Option::MAX_TARGET_LENGTH`, default value: `0`.
138
+
139
+ ```
140
+ :strategy
141
+ ```
142
+
143
+ Values: `ZSTDS::Option::STRATEGIES`, default value: none.
144
+
145
+ ```
146
+ :enable_long_distance_matching
147
+ ```
148
+
149
+ Values: true/false, default value: none.
150
+
151
+ ```
152
+ :ldm_hash_log
153
+ ```
154
+
155
+ Values: `ZSTDS::Option::MIN_LDM_HASH_LOG` - `ZSTDS::Option::MAX_LDM_HASH_LOG`, default value: `0`.
156
+
157
+ ```
158
+ :ldm_min_match
159
+ ```
160
+
161
+ Values: `ZSTDS::Option::MIN_LDM_MIN_MATCH` - `ZSTDS::Option::MAX_LDM_MIN_MATCH`, default value: `0`.
162
+
163
+ ```
164
+ :ldm_bucket_size_log
165
+ ```
166
+
167
+ Values: `ZSTDS::Option::MIN_LDM_BUCKET_SIZE_LOG` - `ZSTDS::Option::MAX_LDM_BUCKET_SIZE_LOG`, default value: `0`.
168
+
169
+ ```
170
+ :ldm_hash_rate_log
171
+ ```
172
+
173
+ Values: `ZSTDS::Option::MIN_LDM_HASH_RATE_LOG` - `ZSTDS::Option::MAX_LDM_HASH_RATE_LOG`, default value: `0`.
174
+
175
+ ```
176
+ :content_size_flag
177
+ ```
178
+
179
+ Values: true/false, default value: true.
180
+
181
+ ```
182
+ :checksum_flag
183
+ ```
184
+
185
+ Values: true/false, default value: false.
186
+
187
+ ```
188
+ :dict_id_flag
189
+ ```
190
+
191
+ Values: true/false, default value: true.
192
+
193
+ ```
194
+ :nb_workers
195
+ ```
196
+
197
+ Values: `ZSTDS::Option::MIN_NB_WORKERS` - `ZSTDS::Option::MAX_NB_WORKERS`, default value: `0`.
198
+
199
+ ```
200
+ :job_size
201
+ ```
202
+
203
+ Values: `ZSTDS::Option::MIN_JOB_SIZE` - `ZSTDS::Option::MAX_JOB_SIZE`, default value: `0`.
204
+
205
+ ```
206
+ :overlap_log
207
+ ```
208
+
209
+ Values: `ZSTDS::Option::MIN_OVERLAP_LOG` - `ZSTDS::Option::MAX_OVERLAP_LOG`, default value: `0`.
210
+
211
+ ```
212
+ :window_log_max
213
+ ```
214
+
215
+ Values: `ZSTDS::Option::MIN_WINDOW_LOG_MAX` - `ZSTDS::Option::MAX_WINDOW_LOG_MAX`, default value: `0`.
216
+
217
+ ```
218
+ :dictionary
219
+ ```
220
+
221
+ Special option for dictionary, default value: none.
222
+
223
+ ```
224
+ :pledged_size
225
+ ```
226
+
227
+ Values: 0 - infinity, default value: 0.
228
+ For the streaming API it is worth providing the size of the input, if known.
229
+ `String` and `File` set `:pledged_size` automatically.
230
+
231
+ Please read zstd docs for more info about options.
232
+
233
+ Possible compressor options:
234
+ ```
235
+ :compression_level
236
+ :window_log
237
+ :hash_log
238
+ :chain_log
239
+ :search_log
240
+ :min_match
241
+ :target_length
242
+ :strategy
243
+ :enable_long_distance_matching
244
+ :ldm_hash_log
245
+ :ldm_min_match
246
+ :ldm_bucket_size_log
247
+ :ldm_hash_rate_log
248
+ :content_size_flag
249
+ :checksum_flag
250
+ :dict_id_flag
251
+ :nb_workers
252
+ :job_size
253
+ :overlap_log
254
+ :dictionary
255
+ :pledged_size
256
+ ```
257
+
258
+ Possible decompressor options:
259
+ ```
260
+ :window_log_max
261
+ :dictionary
262
+ ```
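The two lists above can be mirrored in plain Ruby to catch an option passed to the wrong side. This is only a sketch: `COMPRESSOR_OPTIONS`, `DECOMPRESSOR_OPTIONS` and `unknown_options` are hypothetical names, not part of the gem, and the buffer length options are left out because the README lists them separately.

```ruby
# Hypothetical constants that copy the option lists from the README.
COMPRESSOR_OPTIONS = %i[
  compression_level window_log hash_log chain_log search_log min_match
  target_length strategy enable_long_distance_matching ldm_hash_log
  ldm_min_match ldm_bucket_size_log ldm_hash_rate_log content_size_flag
  checksum_flag dict_id_flag nb_workers job_size overlap_log dictionary
  pledged_size
].freeze

DECOMPRESSOR_OPTIONS = %i[window_log_max dictionary].freeze

# Hypothetical helper: returns the option names that are not in the allowed list.
def unknown_options(options, allowed)
  options.keys - allowed
end

# :window_log_max is a decompressor option, so it is reported for the compressor.
unknown_for_compressor = unknown_options(
  { :compression_level => 5, :window_log_max => 11 },
  COMPRESSOR_OPTIONS
)
unknown_for_decompressor = unknown_options({ :window_log_max => 11 }, DECOMPRESSOR_OPTIONS)
```

How the gem itself reacts to an unsupported option is not stated here, so a check like this is best treated as a local convenience.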
263
+
264
+ Example:
265
+
266
+ ```ruby
267
+ require "zstds"
268
+
269
+ data = ZSTDS::String.compress "sample string", :compression_level => 5
270
+ puts ZSTDS::String.decompress(data, :window_log_max => 11)
271
+ ```
272
+
273
+ HTTP encoding (`Content-Encoding: zstd`) using default options:
274
+
275
+ ```ruby
276
+ require "zstds"
277
+ require "sinatra"
278
+
279
+ get "/" do
280
+ headers["Content-Encoding"] = "zstd"
281
+ ZSTDS::String.compress "sample string"
282
+ end
283
+ ```
284
+
285
+ ## String
286
+
287
+ `String` maintains only a destination buffer, so it accepts only the `destination_buffer_length` option.
288
+
289
+ ```
290
+ ::compress(source, options = {})
291
+ ::decompress(source, options = {})
292
+ ```
293
+
294
+ `source` is a source string.
295
+
296
+ ## File
297
+
298
+ `File` maintains both source and destination buffers, so it accepts both the `source_buffer_length` and `destination_buffer_length` options.
299
+
300
+ ```
301
+ ::compress(source, destination, options = {})
302
+ ::decompress(source, destination, options = {})
303
+ ```
304
+
305
+ `source` and `destination` are file paths.
306
+
307
+ ## Stream::Writer
308
+
309
+ Its behaviour is similar to builtin [`Zlib::GzipWriter`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipWriter.html).
310
+
311
+ Writer maintains destination buffer only, so it accepts `destination_buffer_length` option only.
312
+
313
+ ```
314
+ ::open(file_path, options = {}, :external_encoding => nil, :transcode_options => {}, &block)
315
+ ```
316
+
317
+ Open file path and create stream writer associated with opened file.
318
+ Data will be transcoded to `:external_encoding` using `:transcode_options` before compressing.
319
+
320
+ It may be tricky to use both `:pledged_size` and `:transcode_options`. You have to provide size of transcoded input.
321
+
322
+ ```
323
+ ::new(destination_io, options = {}, :external_encoding => nil, :transcode_options => {})
324
+ ```
325
+
326
+ Create stream writer associated with destination io.
327
+ Data will be transcoded to `:external_encoding` using `:transcode_options` before compressing.
328
+
329
+ It may be tricky to use both `:pledged_size` and `:transcode_options`. You have to provide size of transcoded input.
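The note about `:pledged_size` and transcoding comes down to byte counts: the compressor sees the transcoded bytes, not the original ones. A minimal sketch using only core Ruby follows; the sample text and the choice of UTF-16LE are assumptions made for illustration.

```ruby
# The \u escape makes this literal UTF-8 regardless of the script encoding.
source = "sample string \u20AC"

# Assumed values, chosen only for this example.
external_encoding = Encoding::UTF_16LE
transcode_options = {}

# Transcode the same way the data would be converted before compression.
transcoded = source.encode(external_encoding, **transcode_options)

original_size = source.bytesize     # size of the UTF-8 input
pledged_size  = transcoded.bytesize # size to pledge for the transcoded input
```

Here the two sizes differ (17 bytes against 30), so pledging `source.bytesize` would describe the wrong input.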
330
+
331
+ ```
332
+ #set_encoding(external_encoding, nil, transcode_options)
333
+ ```
334
+
335
+ Set different encodings; the `nil` argument exists only for compatibility with `IO`.
336
+
337
+ ```
338
+ #io
339
+ #to_io
340
+ #stat
341
+ #external_encoding
342
+ #transcode_options
343
+ #pos
344
+ #tell
345
+ ```
346
+
347
+ See [`IO`](https://ruby-doc.org/core-2.6.1/IO.html) docs.
348
+
349
+ ```
350
+ #write(*objects)
351
+ #flush
352
+ #rewind
353
+ #close
354
+ #closed?
355
+ ```
356
+
357
+ See [`Zlib::GzipWriter`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipWriter.html) docs.
358
+
359
+ ```
360
+ #write_nonblock(object, *options)
361
+ #flush_nonblock(*options)
362
+ #rewind_nonblock(*options)
363
+ #close_nonblock(*options)
364
+ ```
365
+
366
+ Special asynchronous methods that are missing in `Zlib::GzipWriter`.
367
+ `rewind` has to `close`, `close` has to `write` something and `flush`, and `flush` has to `write` something.
368
+ So each of these synchronous methods can have an asynchronous variant.
369
+ Behaviour is the same as `IO#write_nonblock` method.
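Since the behaviour is described as matching `IO#write_nonblock`, the usual retry loop for that method is a reasonable mental model. The sketch below uses a plain pipe from core Ruby rather than a ZSTDS writer, and `write_all_nonblock` is a hypothetical helper name.

```ruby
# Hypothetical helper: writes all of data to io using IO#write_nonblock,
# waiting with IO.select whenever the io is not ready for writing.
def write_all_nonblock(io, data)
  remaining = data.b # binary copy, so byte offsets are safe to use

  until remaining.empty?
    begin
      written = io.write_nonblock(remaining)
      remaining = remaining.byteslice(written, remaining.bytesize - written)
    rescue IO::WaitWritable
      IO.select(nil, [io]) # block until io becomes writable, then try again
      retry
    end
  end

  data.bytesize
end

reader, writer = IO.pipe
total = write_all_nonblock(writer, "sample string")
writer.close

result = reader.read
reader.close
```

The same loop shape should apply to the other `*_nonblock` methods, assuming they raise `IO::WaitWritable` in the same way; the README does not spell that out.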
370
+
371
+ ```
372
+ #<<(object)
373
+ #print(*objects)
374
+ #printf(*args)
375
+ #putc(object, encoding: ::Encoding::BINARY)
376
+ #puts(*objects)
377
+ ```
378
+
379
+ Typical helpers, see [`Zlib::GzipWriter`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipWriter.html) docs.
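Because these helpers are documented by reference to `Zlib::GzipWriter`, a short run against that standard library class shows the expected shape of the calls. This sketch uses Zlib, not ZSTDS; that `ZSTDS::Stream::Writer` accepts the same calls is an assumption based on the README.

```ruby
require "stringio"
require "zlib"

io = StringIO.new
writer = Zlib::GzipWriter.new(io)

writer << "sample"                 # append a single object
writer.print " ", "string", "\n"   # write several objects in order
writer.puts "second line"          # write a line with a trailing newline
writer.finish                      # finish the stream without closing io

# Decompress again to see what the helpers actually wrote.
decompressed = Zlib::GzipReader.new(StringIO.new(io.string)).read
```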
380
+
381
+ ## Stream::Reader
382
+
383
+ Its behaviour is similar to builtin [`Zlib::GzipReader`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipReader.html).
384
+
385
+ Reader maintains both source and destination buffers, so it accepts both the `source_buffer_length` and `destination_buffer_length` options.
386
+
387
+ ```
388
+ ::open(file_path, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {}, &block)
389
+ ```
390
+
391
+ Open file path and create stream reader associated with opened file.
392
+ Data will be force encoded to `:external_encoding` and transcoded to `:internal_encoding` using `:transcode_options` after decompressing.
393
+
394
+ ```
395
+ ::new(source_io, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {})
396
+ ```
397
+
398
+ Create stream reader associated with source io.
399
+ Data will be force encoded to `:external_encoding` and transcoded to `:internal_encoding` using `:transcode_options` after decompressing.
400
+
401
+ ```
402
+ #set_encoding(external_encoding, internal_encoding, transcode_options)
403
+ ```
404
+
405
+ Set different encodings.
406
+
407
+ ```
408
+ #io
409
+ #to_io
410
+ #stat
411
+ #external_encoding
412
+ #internal_encoding
413
+ #transcode_options
414
+ #pos
415
+ #tell
416
+ ```
417
+
418
+ See [`IO`](https://ruby-doc.org/core-2.6.1/IO.html) docs.
419
+
420
+ ```
421
+ #read(bytes_to_read = nil, out_buffer = nil)
422
+ #eof?
423
+ #rewind
424
+ #close
425
+ #closed?
426
+ ```
427
+
428
+ See [`Zlib::GzipReader`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipReader.html) docs.
429
+
430
+ ```
431
+ #readpartial(bytes_to_read = nil, out_buffer = nil)
432
+ #read_nonblock(bytes_to_read, out_buffer = nil, *options)
433
+ ```
434
+
435
+ See [`IO`](https://ruby-doc.org/core-2.6.1/IO.html) docs.
436
+
437
+ ```
438
+ #getbyte
439
+ #each_byte(&block)
440
+ #readbyte
441
+ #ungetbyte(byte)
442
+
443
+ #getc
444
+ #readchar
445
+ #each_char(&block)
446
+ #ungetc(char)
447
+
448
+ #lineno
449
+ #lineno=
450
+ #gets(separator = $OUTPUT_RECORD_SEPARATOR, limit = nil)
451
+ #readline
452
+ #readlines
453
+ #each(&block)
454
+ #each_line(&block)
455
+ #ungetline(line)
456
+ ```
457
+
458
+ Typical helpers, see [`Zlib::GzipReader`](https://ruby-doc.org/stdlib-2.6.1/libdoc/zlib/rdoc/Zlib/GzipReader.html) docs.
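The reader helpers can be tried out the same way against `Zlib::GzipReader`, which the README names as the model. Again this is standard library code, and the assumption is that `ZSTDS::Stream::Reader` responds to the same calls.

```ruby
require "stringio"
require "zlib"

# Prepare a small compressed stream in memory.
compressed = StringIO.new
gz = Zlib::GzipWriter.new(compressed)
gz.write "first\nsecond\n"
gz.finish
compressed.rewind

reader = Zlib::GzipReader.new(compressed)

first_byte = reader.getbyte  # read one byte as an integer
reader.ungetbyte first_byte  # push it back so the next read sees it again
first_line = reader.gets     # read up to and including the first newline

rest = []
reader.each_line { |line| rest << line } # iterate over the remaining lines

reader.close
```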
459
+
460
+ ## Dictionary
461
+
462
+ You can train a dictionary from samples using the `train` class method.
463
+
464
+ ```
465
+ ::train(samples, :capacity => 0)
466
+ ```
467
+
468
+ Please review the zstd source code before using it.
469
+ There are many validation requirements, and they change between versions.
470
+
471
+ ```
472
+ #buffer
473
+ ```
474
+
475
+ There is an attribute reader for buffer.
476
+ You can use it to store dictionary somewhere.
477
+
478
+ ```
479
+ ::new(buffer)
480
+ ```
481
+
482
+ Use the regular constructor to create a dictionary from a buffer.
483
+
484
+ ```
485
+ #id
486
+ ```
487
+
488
+ Read dictionary id from buffer.
489
+
490
+ ## CI
491
+
492
+ Travis and AppVeyor CI use [scripts/ci_test.sh](scripts/ci_test.sh) directly.
493
+ Cirrus and Circle CI use prebuilt [scripts/test-images](scripts/test-images).
494
+ Cirrus CI uses an amd64 image; Circle CI uses an i686 image.
495
+
496
+ ## License
497
+
498
+ MIT license, see LICENSE and AUTHORS.