extzstd 0.3 → 0.3.1

Files changed (107)
  1. checksums.yaml +4 -4
  2. data/HISTORY.ja.md +8 -0
  3. data/README.md +1 -1
  4. data/contrib/zstd/CHANGELOG +94 -0
  5. data/contrib/zstd/CONTRIBUTING.md +351 -1
  6. data/contrib/zstd/Makefile +32 -10
  7. data/contrib/zstd/README.md +33 -10
  8. data/contrib/zstd/TESTING.md +2 -2
  9. data/contrib/zstd/appveyor.yml +42 -4
  10. data/contrib/zstd/lib/Makefile +128 -60
  11. data/contrib/zstd/lib/README.md +47 -16
  12. data/contrib/zstd/lib/common/bitstream.h +38 -39
  13. data/contrib/zstd/lib/common/compiler.h +40 -5
  14. data/contrib/zstd/lib/common/cpu.h +1 -1
  15. data/contrib/zstd/lib/common/debug.c +11 -31
  16. data/contrib/zstd/lib/common/debug.h +11 -31
  17. data/contrib/zstd/lib/common/entropy_common.c +13 -33
  18. data/contrib/zstd/lib/common/error_private.c +2 -1
  19. data/contrib/zstd/lib/common/error_private.h +6 -2
  20. data/contrib/zstd/lib/common/fse.h +12 -32
  21. data/contrib/zstd/lib/common/fse_decompress.c +12 -35
  22. data/contrib/zstd/lib/common/huf.h +15 -33
  23. data/contrib/zstd/lib/common/mem.h +75 -2
  24. data/contrib/zstd/lib/common/pool.c +8 -4
  25. data/contrib/zstd/lib/common/pool.h +2 -2
  26. data/contrib/zstd/lib/common/threading.c +50 -4
  27. data/contrib/zstd/lib/common/threading.h +36 -4
  28. data/contrib/zstd/lib/common/xxhash.c +23 -35
  29. data/contrib/zstd/lib/common/xxhash.h +11 -31
  30. data/contrib/zstd/lib/common/zstd_common.c +1 -1
  31. data/contrib/zstd/lib/common/zstd_errors.h +2 -1
  32. data/contrib/zstd/lib/common/zstd_internal.h +154 -26
  33. data/contrib/zstd/lib/compress/fse_compress.c +17 -40
  34. data/contrib/zstd/lib/compress/hist.c +15 -35
  35. data/contrib/zstd/lib/compress/hist.h +12 -32
  36. data/contrib/zstd/lib/compress/huf_compress.c +92 -92
  37. data/contrib/zstd/lib/compress/zstd_compress.c +1191 -1330
  38. data/contrib/zstd/lib/compress/zstd_compress_internal.h +317 -55
  39. data/contrib/zstd/lib/compress/zstd_compress_literals.c +158 -0
  40. data/contrib/zstd/lib/compress/zstd_compress_literals.h +29 -0
  41. data/contrib/zstd/lib/compress/zstd_compress_sequences.c +419 -0
  42. data/contrib/zstd/lib/compress/zstd_compress_sequences.h +54 -0
  43. data/contrib/zstd/lib/compress/zstd_compress_superblock.c +845 -0
  44. data/contrib/zstd/lib/compress/zstd_compress_superblock.h +32 -0
  45. data/contrib/zstd/lib/compress/zstd_cwksp.h +525 -0
  46. data/contrib/zstd/lib/compress/zstd_double_fast.c +65 -43
  47. data/contrib/zstd/lib/compress/zstd_double_fast.h +2 -2
  48. data/contrib/zstd/lib/compress/zstd_fast.c +92 -66
  49. data/contrib/zstd/lib/compress/zstd_fast.h +2 -2
  50. data/contrib/zstd/lib/compress/zstd_lazy.c +74 -42
  51. data/contrib/zstd/lib/compress/zstd_lazy.h +1 -1
  52. data/contrib/zstd/lib/compress/zstd_ldm.c +32 -10
  53. data/contrib/zstd/lib/compress/zstd_ldm.h +7 -2
  54. data/contrib/zstd/lib/compress/zstd_opt.c +81 -114
  55. data/contrib/zstd/lib/compress/zstd_opt.h +1 -1
  56. data/contrib/zstd/lib/compress/zstdmt_compress.c +95 -51
  57. data/contrib/zstd/lib/compress/zstdmt_compress.h +3 -2
  58. data/contrib/zstd/lib/decompress/huf_decompress.c +76 -60
  59. data/contrib/zstd/lib/decompress/zstd_ddict.c +12 -8
  60. data/contrib/zstd/lib/decompress/zstd_ddict.h +2 -2
  61. data/contrib/zstd/lib/decompress/zstd_decompress.c +292 -172
  62. data/contrib/zstd/lib/decompress/zstd_decompress_block.c +459 -338
  63. data/contrib/zstd/lib/decompress/zstd_decompress_block.h +3 -3
  64. data/contrib/zstd/lib/decompress/zstd_decompress_internal.h +18 -4
  65. data/contrib/zstd/lib/deprecated/zbuff.h +9 -8
  66. data/contrib/zstd/lib/deprecated/zbuff_common.c +2 -2
  67. data/contrib/zstd/lib/deprecated/zbuff_compress.c +1 -1
  68. data/contrib/zstd/lib/deprecated/zbuff_decompress.c +1 -1
  69. data/contrib/zstd/lib/dictBuilder/cover.c +164 -54
  70. data/contrib/zstd/lib/dictBuilder/cover.h +52 -7
  71. data/contrib/zstd/lib/dictBuilder/fastcover.c +60 -43
  72. data/contrib/zstd/lib/dictBuilder/zdict.c +43 -19
  73. data/contrib/zstd/lib/dictBuilder/zdict.h +56 -28
  74. data/contrib/zstd/lib/legacy/zstd_legacy.h +8 -4
  75. data/contrib/zstd/lib/legacy/zstd_v01.c +110 -110
  76. data/contrib/zstd/lib/legacy/zstd_v01.h +1 -1
  77. data/contrib/zstd/lib/legacy/zstd_v02.c +23 -13
  78. data/contrib/zstd/lib/legacy/zstd_v02.h +1 -1
  79. data/contrib/zstd/lib/legacy/zstd_v03.c +23 -13
  80. data/contrib/zstd/lib/legacy/zstd_v03.h +1 -1
  81. data/contrib/zstd/lib/legacy/zstd_v04.c +30 -17
  82. data/contrib/zstd/lib/legacy/zstd_v04.h +1 -1
  83. data/contrib/zstd/lib/legacy/zstd_v05.c +113 -102
  84. data/contrib/zstd/lib/legacy/zstd_v05.h +2 -2
  85. data/contrib/zstd/lib/legacy/zstd_v06.c +20 -18
  86. data/contrib/zstd/lib/legacy/zstd_v06.h +1 -1
  87. data/contrib/zstd/lib/legacy/zstd_v07.c +25 -19
  88. data/contrib/zstd/lib/legacy/zstd_v07.h +1 -1
  89. data/contrib/zstd/lib/libzstd.pc.in +3 -2
  90. data/contrib/zstd/lib/zstd.h +265 -88
  91. data/ext/extzstd.h +1 -1
  92. data/ext/libzstd_conf.h +8 -0
  93. data/ext/zstd_common.c +1 -3
  94. data/ext/zstd_compress.c +3 -3
  95. data/ext/zstd_decompress.c +1 -5
  96. data/ext/zstd_dictbuilder.c +2 -3
  97. data/ext/zstd_dictbuilder_fastcover.c +1 -3
  98. data/ext/zstd_legacy_v01.c +2 -0
  99. data/ext/zstd_legacy_v02.c +2 -0
  100. data/ext/zstd_legacy_v03.c +2 -0
  101. data/ext/zstd_legacy_v04.c +2 -0
  102. data/ext/zstd_legacy_v05.c +2 -0
  103. data/ext/zstd_legacy_v06.c +2 -0
  104. data/ext/zstd_legacy_v07.c +2 -0
  105. data/lib/extzstd.rb +18 -10
  106. data/lib/extzstd/version.rb +1 -1
  107. metadata +15 -6
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: c3ec6930c7601090d5974e8a9d163de8a28029b2effe8ebc8ccebe87a7227550
4
- data.tar.gz: cd76622c5d838900587e0064b1fc16249ce3d708cd5e5096a9bcc804c2bcd0d2
3
+ metadata.gz: 3f61d75e29eb86642f72138584ffc02ff36e86318dc48c07dbb176fc26f72b9c
4
+ data.tar.gz: 539f64336ce50da00e14e3920545a3f27842febe3ce8ec47387b408110fc1149
5
5
  SHA512:
6
- metadata.gz: 50c5d80163a87da20da9ed94eb9c639dab474af89b8dbddc5676b8e607ca0dd2a5dc3ab3aa0dce06a4997918411d8562c4b23ad3eeeacc18d7c6644caacbfc42
7
- data.tar.gz: c93ca3099b708d7f55b41a6156067a9ceaddebc6eb2e0700740d4fbd19983bd9f456108fa23027e6891966aef287469f24b89f6598be414bca924793ac878c1e
6
+ metadata.gz: 52279e275616d6d179464deda78ff289abc07b42ebf7f111c79b89d6bc0ba9092290494aaefb7b9631fb3ed37d7fcd60e3f337d691a8990b36155541442fad65
7
+ data.tar.gz: 0d46cccb64f150ecf3b29459bc55b0aecfa1b43dcaa68c46f60c1a863d1827edb71acbfb4491ee777acde3bf68bdd8faf841c83cd00863172a2943689b36e482
data/HISTORY.ja.md CHANGED
@@ -1,5 +1,13 @@
1
1
  # extzstd update history
2
2
 
3
+ ## extzstd-0.3.1 (Saturday, October 3, 2020)
4
+
5
+ * Updated to zstd-1.4.5
6
+ * Fixed the keyword-argument usage that ruby-2.7 warns about
7
+ * Use `require` to load the ".so" file
8
+ See: [extlz4#2](https://github.com/dearblue/ruby-extlz4/issues/2)
9
+
10
+
3
11
  ## extzstd-0.3 (April 2019)
4
12
 
5
13
  * Updated to zstd-1.4.0
data/README.md CHANGED
@@ -76,7 +76,7 @@ end
76
76
  ## Specification
77
77
 
78
78
  * package name: extzstd
79
- * version: 0.3
79
+ * version: 0.3.1
80
80
  * product quality: TECHNICAL PREVIEW, UNSTABLE
81
81
  * license: [BSD-2-clause License](LICENSE)
82
82
  * author: dearblue <mailto:dearblue@users.noreply.github.com>
data/contrib/zstd/CHANGELOG CHANGED
@@ -1,3 +1,97 @@
1
+ v1.4.5
2
+ fix : Compression ratio regression on huge files (> 3 GB) using high levels (--ultra) and multithreading, by @terrelln
3
+ perf: Improved decompression speed: x64 : +10% (clang) / +5% (gcc); ARM : from +15% to +50%, depending on SoC, by @terrelln
4
+ perf: Automatically downsizes ZSTD_DCtx when too large for too long (#2069, by @bimbashrestha)
5
+ perf: Improved fast compression speed on aarch64 (#2040, ~+3%, by @caoyzh)
6
+ perf: Small level 1 compression speed gains (depending on compiler)
7
+ cli : New --patch-from command, create and apply patches from files, by @bimbashrestha
8
+ cli : New --filelist= : Provide a list of files to operate upon from a file
9
+ cli : -b -d command can now benchmark decompression on multiple files
10
+ cli : New --no-content-size command
11
+ cli : New --show-default-cparams information command
12
+ api : ZDICT_finalizeDictionary() is promoted to stable (#2111)
13
+ api : new experimental parameter ZSTD_d_stableOutBuffer (#2094)
14
+ build: Generate a single-file libzstd library (#2065, by @cwoffenden)
15
+ build: Relative includes no longer require -I compiler flags for zstd lib subdirs (#2103, by @felixhandte)
16
+ build: zstd now compiles cleanly under -pedantic (#2099)
17
+ build: zstd now compiles with make-4.3
18
+ build: Support mingw cross-compilation from Linux, by @Ericson2314
19
+ build: Meson multi-thread build fix on windows
20
+ build: Some misc icc fixes backed by new ci test on travis
21
+ misc: bitflip analyzer tool, by @felixhandte
22
+ misc: Extend largeNbDicts benchmark to compression
23
+ misc: Edit-distance match finder in contrib/
24
+ doc : Improved beginner CONTRIBUTING.md docs
25
+ doc : New issue templates for zstd
26
+
27
+ v1.4.4
28
+ perf: Improved decompression speed, by > 10%, by @terrelln
29
+ perf: Better compression speed when re-using a context, by @felixhandte
30
+ perf: Fix compression ratio when compressing large files with small dictionary, by @senhuang42
31
+ perf: zstd reference encoder can generate RLE blocks, by @bimbashrestha
32
+ perf: minor generic speed optimization, by @davidbolvansky
33
+ api: new ability to extract sequences from the parser for analysis, by @bimbashrestha
34
+ api: fixed decoding of magic-less frames, by @terrelln
35
+ api: fixed ZSTD_initCStream_advanced() performance with fast modes, reported by @QrczakMK
36
+ cli: Named pipes support, by @bimbashrestha
37
+ cli: short tar's extension support, by @stokito
38
+ cli: command --output-dir-flat= , generates target files into requested directory, by @senhuang42
39
+ cli: commands --stream-size=# and --size-hint=#, by @nmagerko
40
+ cli: command --exclude-compressed, by @shashank0791
41
+ cli: faster `-t` test mode
42
+ cli: improved some error messages, by @vangyzen
43
+ cli: fix command `-D dictionary` on Windows, reported by @artyompetrov
44
+ cli: fix rare deadlock condition within dictionary builder, by @terrelln
45
+ build: single-file decoder with emscripten compilation script, by @cwoffenden
46
+ build: fixed zlibWrapper compilation on Visual Studio, reported by @bluenlive
47
+ build: fixed deprecation warning for certain gcc version, reported by @jasonma163
48
+ build: fix compilation on old gcc versions, by @cemeyer
49
+ build: improved installation directories for cmake script, by Dmitri Shubin
50
+ pack: modified pkgconfig, for better integration into openwrt, requested by @neheb
51
+ misc: Improved documentation : ZSTD_CLEVEL, DYNAMIC_BMI2, ZSTD_CDict, function deprecation, zstd format
52
+ misc: fixed educational decoder : accept larger literals section, and removed UNALIGNED() macro
53
+
54
+ v1.4.3
55
+ bug: Fix Dictionary Compression Ratio Regression by @cyan4973 (#1709)
56
+ bug: Fix Buffer Overflow in legacy v0.3 decompression by @felixhandte (#1722)
57
+ build: Add support for IAR C/C++ Compiler for Arm by @joseph0918 (#1705)
58
+
59
+ v1.4.2
60
+ bug: Fix bug in zstd-0.5 decoder by @terrelln (#1696)
61
+ bug: Fix seekable decompression in-memory API by @iburinoc (#1695)
62
+ misc: Validate blocks are smaller than size limit by @vivekmg (#1685)
63
+ misc: Restructure source files by @ephiepark (#1679)
64
+
65
+ v1.4.1
66
+ bug: Fix data corruption in niche use cases by @terrelln (#1659)
67
+ bug: Fuzz legacy modes, fix uncovered bugs by @terrelln (#1593, #1594, #1595)
68
+ bug: Fix out of bounds read by @terrelln (#1590)
69
+ perf: Improve decode speed by ~7% by @mgrice (#1668)
70
+ perf: Slightly improved compression ratio of level 3 and 4 (ZSTD_dfast) by @cyan4973 (#1681)
71
+ perf: Slightly faster compression speed when re-using a context by @cyan4973 (#1658)
72
+ perf: Improve compression ratio for small windowLog by @cyan4973 (#1624)
73
+ perf: Faster compression speed in high compression mode for repetitive data by @terrelln (#1635)
74
+ api: Add parameter to generate smaller dictionaries by @tyler-tran (#1656)
75
+ cli: Recognize symlinks when built in C99 mode by @felixhandte (#1640)
76
+ cli: Expose cpu load indicator for each file on -vv mode by @ephiepark (#1631)
77
+ cli: Restrict read permissions on destination files by @chungy (#1644)
78
+ cli: zstdgrep: handle -f flag by @felixhandte (#1618)
79
+ cli: zstdcat: follow symlinks by @vejnar (#1604)
80
+ doc: Remove extra size limit on compressed blocks by @felixhandte (#1689)
81
+ doc: Fix typo by @yk-tanigawa (#1633)
82
+ doc: Improve documentation on streaming buffer sizes by @cyan4973 (#1629)
83
+ build: CMake: support building with LZ4 by @leeyoung624 (#1626)
84
+ build: CMake: install zstdless and zstdgrep by @leeyoung624 (#1647)
85
+ build: CMake: respect existing uninstall target by @j301scott (#1619)
86
+ build: Make: skip multithread tests when built without support by @michaelforney (#1620)
87
+ build: Make: Fix examples/ test target by @sjnam (#1603)
88
+ build: Meson: rename options out of deprecated namespace by @lzutao (#1665)
89
+ build: Meson: fix build by @lzutao (#1602)
90
+ build: Visual Studio: don't export symbols in static lib by @scharan (#1650)
91
+ build: Visual Studio: fix linking by @absotively (#1639)
92
+ build: Fix MinGW-W64 build by @myzhang1029 (#1600)
93
+ misc: Expand decodecorpus coverage by @ephiepark (#1664)
94
+
1
95
  v1.4.0
2
96
  perf: Improve level 1 compression speed in most scenarios by 6% by @gbtucker and @terrelln
3
97
  api: Move the advanced API, including all functions in the staging section, to the stable section
data/contrib/zstd/CONTRIBUTING.md CHANGED
@@ -26,6 +26,356 @@ to do this once to work on any of Facebook's open source projects.
26
26
 
27
27
  Complete your CLA here: <https://code.facebook.com/cla>
28
28
 
29
+ ## Workflow
30
+ Zstd uses a branch-based workflow for making changes to the codebase. Typically, zstd
31
+ will use a new branch per sizable topic. For smaller changes, it is okay to lump multiple
32
+ related changes into a branch.
33
+
34
+ Our contribution process works in three main stages:
35
+ 1. Local development
36
+ * Update:
37
+ * Checkout your fork of zstd if you have not already
38
+ ```
39
+ git clone https://github.com/<username>/zstd
40
+ cd zstd
41
+ ```
42
+ * Update your local dev branch
43
+ ```
44
+ git pull https://github.com/facebook/zstd dev
45
+ git push origin dev
46
+ ```
47
+ * Topic and development:
48
+ * Make a new branch on your fork about the topic you're developing for
49
+ ```
50
+ # branch names should be concise but sufficiently informative
51
+ git checkout -b <branch-name>
52
+ git push origin <branch-name>
53
+ ```
54
+ * Make commits and push
55
+ ```
56
+ # make some changes
57
+ git add -u && git commit -m <message>
58
+ git push origin <branch-name>
59
+ ```
60
+ * Note: run local tests to ensure that your changes didn't break existing functionality
61
+ * Quick check
62
+ ```
63
+ make shortest
64
+ ```
65
+ * Longer check
66
+ ```
67
+ make test
68
+ ```
69
+ 2. Code Review and CI tests
70
+ * Ensure CI tests pass:
71
+ * Before sharing anything to the community, make sure that all CI tests pass on your local fork.
72
+ See our section on setting up your CI environment for more information on how to do this.
73
+ * Ensure that static analysis passes on your development machine. See the Static Analysis section
74
+ below to see how to do this.
75
+ * Create a pull request:
76
+ * When you are ready to share your changes with the community, create a pull request from your branch
77
+ to facebook:dev. You can do this very easily by clicking 'Create Pull Request' on your fork's home
78
+ page.
79
+ * From there, select the branch where you made changes as your source branch and facebook:dev
80
+ as the destination.
81
+ * Examine the diff presented between the two branches to make sure there is nothing unexpected.
82
+ * Write a good pull request description:
83
+ * While there is no strict template that our contributors follow, we would like them to
84
+ sufficiently summarize and motivate the changes they are proposing. We recommend all pull requests,
85
+ at least indirectly, address the following points.
86
+ * Is this pull request important and why?
87
+ * Is it addressing an issue? If so, what issue? (provide links for convenience please)
88
+ * Is this a new feature? If so, why is it useful and/or necessary?
89
+ * Are there background references and documents that reviewers should be aware of to properly assess this change?
90
+ * Note: make sure to point out any design and architectural decisions that you made and the rationale behind them.
91
+ * Note: if you have been working with a specific user and would like them to review your work, make sure you mention them using (@<username>)
92
+ * Submit the pull request and iterate with feedback.
93
+ 3. Merge and Release
94
+ * Getting approval:
95
+ * You will have to iterate on your changes with feedback from other collaborators to reach a point
96
+ where your pull request can be safely merged.
97
+ * To avoid too many comments on style and convention, make sure that you have a
98
+ look at our style section below before creating a pull request.
99
+ * Eventually, someone from the zstd team will approve your pull request and not long after merge it into
100
+ the dev branch.
101
+ * Housekeeping:
102
+ * Most PRs are linked with one or more Github issues. If this is the case for your PR, make sure
103
+ the corresponding issue is mentioned. If your change 'fixes' or completely addresses the
104
+ issue at hand, then please indicate this by requesting that an issue be closed by commenting.
105
+ * Just because your changes have been merged does not mean the topic or larger issue is complete. Remember
106
+ that the change must make it to an official zstd release for it to be meaningful. We recommend
107
+ that contributors track the activity on their pull request and corresponding issue(s) page(s) until
108
+ their change makes it to the next release of zstd. Users will often discover bugs in your code or
109
+ suggest ways to refine and improve your initial changes even after the pull request is merged.
110
+
111
+ ## Static Analysis
112
+ Static analysis is a process for examining the correctness or validity of a program without actually
113
+ executing it. It usually helps us find many simple bugs. Zstd uses clang's `scan-build` tool for
114
+ static analysis. You can install it by following the instructions for your OS on https://clang-analyzer.llvm.org/scan-build.
115
+
116
+ Once installed, you can ensure that our static analysis tests pass on your local development machine
117
+ by running:
118
+ ```
119
+ make staticAnalyze
120
+ ```
121
+
122
+ In general, you can use `scan-build` to static analyze any build script. For example, to static analyze
123
+ just `contrib/largeNbDicts` and nothing else, you can run:
124
+
125
+ ```
126
+ scan-build make -C contrib/largeNbDicts largeNbDicts
127
+ ```
128
+
129
+ ## Performance
130
+ Performance is extremely important for zstd and we only merge pull requests whose performance
131
+ landscape and corresponding trade-offs have been adequately analyzed, reproduced, and presented.
132
+ This high bar for performance means that every PR which has the potential to
133
+ impact performance takes a very long time for us to properly review. That being said, we
134
+ always welcome contributions to improve performance (or worsen performance for the trade-off of
135
+ something else). Please keep the following in mind before submitting a performance related PR:
136
+
137
+ 1. Zstd isn't as old as gzip but it has been around for some time now and its evolution is
138
+ very well documented via past Github issues and pull requests. It may be the case that your
139
+ particular performance optimization has already been considered in the past. Please take some
140
+ time to search through old issues and pull requests using keywords specific to your
141
+ would-be PR. Of course, just because a topic has already been discussed (and perhaps rejected
142
+ on some grounds) in the past, doesn't mean it isn't worth bringing up again. But even in that case,
143
+ it will be helpful for you to have context from that topic's history before contributing.
144
+ 2. The distinction between noise and actual performance gains can unfortunately be very subtle
145
+ especially when microbenchmarking extremely small wins or losses. The only remedy to getting
146
+ something subtle merged is extensive benchmarking. You will be doing us a great favor if you
147
+ take the time to run extensive, long-duration, and potentially cross-(os, platform, process, etc)
148
+ benchmarks on your end before submitting a PR. Of course, you will not be able to benchmark
149
+ your changes on every single processor and os out there (and neither will we) but do the best
150
+ you can :) We've added some things to think about when benchmarking below in the Benchmarking
151
+ Performance section which might be helpful for you.
152
+ 3. Optimizing performance for a certain OS, processor vendor, compiler, or network system is a perfectly
153
+ legitimate thing to do as long as it does not harm the overall performance health of Zstd.
154
+ This is a hard balance to strike but please keep in mind other aspects of Zstd when
155
+ submitting changes that are clang-specific, windows-specific, etc.
156
+
157
+ ## Benchmarking Performance
158
+ Performance microbenchmarking is a tricky subject but also essential for Zstd. We value empirical
159
+ testing over theoretical speculation. This guide is not perfect but for most scenarios, it
160
+ is a good place to start.
161
+
162
+ ### Stability
163
+ Unfortunately, the most important aspect in being able to benchmark reliably is to have a stable
164
+ benchmarking machine. A virtual machine, a machine with shared resources, or your laptop
165
+ will typically not be stable enough to obtain reliable benchmark results. If you can get your
166
+ hands on a desktop, this is usually a better scenario.
167
+
168
+ Of course, benchmarking can be done on non-hyper-stable machines as well. You will just have to
169
+ do a little more work to ensure that you are in fact measuring the changes you've made and not
170
+ noise. Here are some things you can do to make your benchmarks more stable:
171
+
172
+ 1. The most simple thing you can do to drastically improve the stability of your benchmark is
173
+ to run it multiple times and then aggregate the results of those runs. As a general rule of
174
+ thumb, the smaller the change you are trying to measure, the more samples of benchmark runs
175
+ you will have to aggregate over to get reliable results. Here are some additional things to keep in
176
+ mind when running multiple trials:
177
+ * How you aggregate your samples is important. You might be tempted to use the mean of your
178
+ results. While this is certainly going to be a more stable number than a raw single sample
179
+ benchmark number, you might have more luck by taking the median. The mean is not robust to
180
+ outliers whereas the median is. Better still, you could simply take the fastest speed your
181
+ benchmark achieved on each run since that is likely the fastest your process will be
182
+ capable of running your code. In our experience, this (aggregating by just taking the sample
183
+ with the fastest running time) has been the most stable approach.
184
+ * The more samples you have, the more stable your benchmarks should be. You can verify
185
+ your improved stability by looking at the size of your confidence intervals as you
186
+ increase your sample count. These should get smaller and smaller. Eventually hopefully
187
+ smaller than the performance win you are expecting.
188
+ * Most processors will take some time to get `hot` when running anything. The observations
189
+ you collect during that time period will be very different from the true performance number. Having
190
+ a very large number of samples will help alleviate this problem slightly but you can also
191
+ address it directly by simply not including the first `n` iterations of your benchmark in
192
+ your aggregations. You can determine `n` by simply looking at the results from each iteration
193
+ and then hand picking a good threshold after which the variance in results seems to stabilize.
194
+ 2. You cannot really get reliable benchmarks if your host machine is simultaneously running
195
+ another cpu/memory-intensive application in the background. If you are running benchmarks on your
196
+ personal laptop for instance, you should close all applications (including your code editor and
197
+ browser) before running your benchmarks. You might also have invisible background applications
198
+ running. You can see what these are by looking at either Activity Monitor on Mac or Task Manager
199
+ on Windows. You will get more stable benchmark results if you end those processes as well.
200
+ * If you have multiple cores, you can even run your benchmark on a reserved core to prevent
201
+ pollution from other OS and user processes. There are a number of ways to do this depending
202
+ on your OS:
203
+ * On Linux boxes, you can use https://github.com/lpechacek/cpuset.
204
+ * On Windows, you can "Set Processor Affinity" using https://www.thewindowsclub.com/processor-affinity-windows
205
+ * On Mac, you can try to use their dedicated affinity API https://developer.apple.com/library/archive/releasenotes/Performance/RN-AffinityAPI/#//apple_ref/doc/uid/TP40006635-CH1-DontLinkElementID_2
206
+ 3. To benchmark, you will likely end up writing a separate c/c++ program that will link libzstd.
207
+ Dynamically linking your library will introduce some added variation (not a large amount but
208
+ definitely some). Statically linking libzstd will be more stable. Static libraries should
209
+ be enabled by default when building zstd.
210
+ 4. Use a profiler with a good high resolution timer. See the section below on profiling for
211
+ details on this.
212
+ 5. Disable frequency scaling, turbo boost and address space randomization (this will vary by OS)
213
+ 6. Try to avoid storage. On some systems you can use tmpfs. Putting the program, inputs and outputs on
214
+ tmpfs avoids touching a real storage system, which can have a pretty big variability.
215
+
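To make the aggregation advice in item 1 above concrete, here is a minimal Python sketch. It is not part of zstd: the function names, the `warmup` parameter, and the sample values are all hypothetical, and the confidence interval uses a plain normal approximation rather than anything the zstd team prescribes.

```python
import statistics


def aggregate_speeds(speeds_mb_s, warmup=0):
    """Summarize per-run speeds (MB/s), ignoring the first `warmup` runs."""
    kept = list(speeds_mb_s)[warmup:]
    if not kept:
        raise ValueError("no samples left after discarding warm-up runs")
    return {
        "mean": statistics.mean(kept),      # sensitive to outliers
        "median": statistics.median(kept),  # robust to outliers
        "best": max(kept),                  # fastest run observed
    }


def ci_half_width(speeds_mb_s, z=1.96):
    """Approximate 95% confidence half-width of the mean (normal approximation)."""
    n = len(speeds_mb_s)
    return z * statistics.stdev(speeds_mb_s) / n ** 0.5


# Hypothetical samples: two cold runs, then steady-state runs with one slow outlier.
samples = [250.0, 280.0, 301.0, 303.0, 302.0, 240.0, 302.5]
summary = aggregate_speeds(samples, warmup=2)
print(summary)
```

With these made-up samples the single slow run drags the mean down noticeably, while the median and the best run stay close to the steady-state speed. Calling `ci_half_width` on a growing list of samples is one way to check whether the interval has shrunk below the win you expect.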
216
+ Also check out LLVM's guide on benchmarking here: https://llvm.org/docs/Benchmarking.html
217
+
218
+ ### Zstd benchmark
219
+ The fastest signal you can get regarding your performance changes is via the built-in zstd cli
220
+ bench option. You can run Zstd as you typically would for your scenario using some set of options
221
+ and then additionally also specify the `-b#` option. Doing this will run our benchmarking pipeline
222
+ for the options you have just provided. If you want to look at the internals of how this
223
+ benchmarking script works, you can check out programs/benchzstd.c
224
+
225
+ For example: say you have made a change that you believe improves the speed of zstd level 1. The
226
+ very first thing you should use to assess whether you actually achieved any sort of improvement
227
+ is `zstd -b`. You might try to do something like this. Note: you can use the `-i` option to
228
+ specify a running time for your benchmark in seconds (default is 3 seconds).
229
+ Usually, the longer the running time, the more stable your results will be.
230
+
231
+ ```
232
+ $ git checkout <commit-before-your-change>
233
+ $ make && cp zstd zstd-old
234
+ $ git checkout <commit-after-your-change>
235
+ $ make && cp zstd zstd-new
236
+ $ ./zstd-old -i5 -b1 <your-test-data>
237
+ 1<your-test-data> : 8990 -> 3992 (2.252), 302.6 MB/s , 626.4 MB/s
238
+ $ ./zstd-new -i5 -b1 <your-test-data>
239
+ 1<your-test-data> : 8990 -> 3992 (2.252), 302.8 MB/s , 628.4 MB/s
240
+ ```
241
+
242
+ Unless your performance win is large enough to be visible despite the intrinsic noise
243
+ on your computer, benchzstd alone will likely not be enough to validate the impact of your
244
+ changes. For example, the results of the example above indicate that effectively nothing
245
+ changed but there could be a small <3% improvement that the noise on the host machine
246
+ obscured. So unless you see a large performance win (10-15% consistently), using just
247
+ this method of evaluation will not be sufficient.
248
+
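To put a number on "effectively nothing changed", the two result lines from the example above can be compared programmatically. The following Python sketch is an illustration only. It assumes the output format is exactly the one shown in the example (two `MB/s` fields per line, compression first), and the helper names are hypothetical.

```python
import re

# Matches the "<number> MB/s" fields of a result line in the format shown above.
_SPEED = re.compile(r"([\d.]+)\s*MB/s")


def parse_speeds(line):
    """Return (compression_MB_s, decompression_MB_s) from one benchmark line."""
    found = _SPEED.findall(line)
    if len(found) != 2:
        raise ValueError(f"expected two MB/s fields, got {len(found)}: {line!r}")
    return float(found[0]), float(found[1])


def percent_change(old, new):
    """Relative change from old to new, in percent (positive means faster)."""
    return (new - old) / old * 100.0


old_line = "1<your-test-data> : 8990 -> 3992 (2.252), 302.6 MB/s , 626.4 MB/s"
new_line = "1<your-test-data> : 8990 -> 3992 (2.252), 302.8 MB/s , 628.4 MB/s"
old_c, old_d = parse_speeds(old_line)
new_c, new_d = parse_speeds(new_line)
print(f"compression:   {percent_change(old_c, new_c):+.2f}%")
print(f"decompression: {percent_change(old_d, new_d):+.2f}%")
```

Both differences come out well below 1%, which is within the noise a typical machine produces between runs.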
249
+ ### Profiling
250
+ There are a number of great profilers out there. We're going to briefly mention how you can
251
+ profile your code using `instruments` on mac, `perf` on linux and `visual studio profiler`
252
+ on windows.
253
+
254
+ Say you have an idea for a change that you think will provide some good performance gains
255
+ for level 1 compression on Zstd. Typically this means, you have identified a section of
256
+ code that you think can be made to run faster.
257
+
258
+ The first thing you will want to do is make sure that the piece of code is actually taking up
259
+ a notable amount of time to run. It is usually not worth optimizing something which accounts for less than
260
+ 0.0001% of the total running time. Luckily, there are tools to help with this.
261
+ Profilers will let you see how much time your code spends inside a particular function.
262
+ If your target code snippet is only part of a function, it might be worth trying to
263
+ isolate that snippet by moving it to its own function (this is usually not necessary but
264
+ might be).
265
+
266
+ Most profilers (including the profilers discussed below) will generate a call graph of
267
+ functions for you. Your goal will be to find your function of interest in this call graph
268
+ and then inspect the time spent inside of it. You might also want to look at the
269
+ annotated assembly which most profilers will provide you with.
270
+
271
+ #### Instruments
272
+ We will once again consider the scenario where you think you've identified a piece of code
273
+ whose performance can be improved upon. Follow these steps to profile your code using
274
+ Instruments.
275
+
276
+ 1. Open Instruments
277
+ 2. Select `Time Profiler` from the list of standard templates
278
+ 3. Close all other applications except for your instruments window and your terminal
279
+ 4. Run your benchmarking script from your terminal window
280
+ * You will want a benchmark that runs for at least a few seconds (5 seconds will
281
+ usually be long enough). This way the profiler will have something to work with
282
+ and you will have ample time to attach your profiler to this process:)
283
+ * I will just use benchzstd as my benchmarking script for this example:
284
+ ```
285
+ $ zstd -b1 -i5 <my-data> # this will run for 5 seconds
286
+ ```
287
+ 5. Once you run your benchmarking script, switch back over to instruments and attach your
288
+ process to the time profiler. You can do this by:
289
+ * Clicking on the `All Processes` drop down in the top left of the toolbar.
290
+ * Selecting your process from the dropdown. In my case, it is just going to be labeled
291
+ `zstd`
292
+ * Hitting the bright red record circle button on the top left of the toolbar
293
+ 6. Your profiler will now start collecting metrics from your benchmarking script. Once
294
+ you think you have collected enough samples (usually this is the case after 3 seconds of
295
+ recording), stop your profiler.
296
+ 7. Make sure that in the toolbar of the bottom window, `profile` is selected.
297
+ 8. You should be able to see your call graph.
298
+ * If you don't see the call graph or see an incomplete call graph, make sure you have compiled
299
+ zstd and your benchmarking script using debug flags. On mac and linux, this just means
300
+ you will have to supply the `-g` flag along with your build script. You might also
301
+ have to provide the `-fno-omit-frame-pointer` flag.
302
+ 9. Dig down the graph to find your function call and then inspect it by double clicking
303
+ the list item. You will be able to see the annotated source code and the assembly side by
304
+ side.
305
+
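The debug-flag note in step 8 can be sketched as a small shell helper. This is only an illustration, not the project's documented build procedure: `PROFILE_CFLAGS` and `build_for_profiling` are hypothetical names, and the sketch assumes that the top-level Makefile honours a `CFLAGS` value from the environment and provides `clean` and `zstd` targets. Keeping `-O3` is also an assumption, made so that the profile reflects optimized code.

```shell
#!/bin/sh
# -g adds debug symbols; -fno-omit-frame-pointer helps the profiler
# reconstruct complete call stacks. -O3 is an assumption (see above).
PROFILE_CFLAGS="-O3 -g -fno-omit-frame-pointer"

# Hypothetical helper: rebuild the zstd CLI in a checkout with those flags.
# Usage: build_for_profiling [path-to-zstd-checkout]
build_for_profiling() {
    src_dir="${1:-.}"
    make -C "$src_dir" clean &&
        CFLAGS="$PROFILE_CFLAGS" make -C "$src_dir" zstd
}

# Nothing is built until the helper is called explicitly.
echo "CFLAGS for profiling builds: $PROFILE_CFLAGS"
```

Calling `build_for_profiling path/to/zstd` before step 4 should give the profiler the symbols it needs, provided the assumptions above hold for your checkout.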
306
+ #### Perf
307
+
308
+ This wiki has a pretty detailed tutorial on getting started working with perf so we'll
309
+ leave you to check that out if you're getting started:
310
+
311
+ https://perf.wiki.kernel.org/index.php/Tutorial
312
+
313
+ Some general notes on perf:
314
+ * Use `perf stat -r # <bench-program>` to quickly get some relevant timing and
315
+ counter statistics. Perf uses a high resolution timer and this is likely one
316
+ of the first things your team will run when assessing your PR.
317
+ * Perf has a long list of hardware counters that can be viewed with `perf list`.
318
+ When measuring optimizations, something worth trying is to make sure the hardware
319
+ counters you expect to be impacted by your change are in fact being so. For example,
320
+ if you expect the L1 cache misses to decrease with your change, you can look at the
321
+ counter `L1-dcache-load-misses`.
322
+ * Perf hardware counters will generally not work on a virtual machine unless the hypervisor exposes them to the guest.
323
+
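As a rough sketch of the notes above, the helper below assembles a `perf stat` command line that repeats the benchmark and restricts the output to a few counters. `perf_stat_cmd` is a hypothetical name, the choice of `cycles` and `instructions` alongside `L1-dcache-load-misses` is an assumption, and `my-data` stands in for your own input file. The counters actually available depend on your CPU and kernel.

```shell
#!/bin/sh
# Hypothetical helper: print a perf command line for review.
# -r N repeats the run N times and reports mean and variance;
# -e selects the events to count.
# Usage: perf_stat_cmd <data-file> [repeats]
perf_stat_cmd() {
    data_file="$1"
    repeats="${2:-5}"
    printf 'perf stat -r %s -e L1-dcache-load-misses,cycles,instructions zstd -b1 -i5 %s\n' \
        "$repeats" "$data_file"
}

# Print the command; paste it into a shell on a machine with perf installed.
perf_stat_cmd my-data 5
```

Running the printed command before and after your change, on the same data file, gives a direct comparison of the counters you expect to move.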
324
+ #### Visual Studio
325
+
326
+ TODO
327
+
328
+
329
+ ## Setting up continuous integration (CI) on your fork
330
+ Zstd uses a number of different continuous integration (CI) tools to ensure that new changes
331
+ are well tested before they make it to an official release. Specifically, we use the platforms
332
+ travis-ci, circle-ci, and appveyor.
333
+
334
+ Changes cannot be merged into the main dev branch unless they pass all of our CI tests.
335
+ The easiest way to run these CI tests on your own before submitting a PR to our dev branch
336
+ is to configure your personal fork of zstd with each of the CI platforms. Below, you'll find
337
+ instructions for doing this.
338
+
339
+ ### travis-ci
340
+ Follow these steps to link travis-ci with your github fork of zstd
341
+
342
+ 1. Make sure you are logged into your github account
343
+ 2. Go to https://travis-ci.org/
344
+ 3. Click 'Sign in with Github' on the top right
345
+ 4. Click 'Authorize travis-ci'
346
+ 5. Click 'Activate all repositories using Github Apps'
347
+ 6. Select 'Only select repositories' and select your fork of zstd from the drop down
348
+ 7. Click 'Approve and Install'
349
+ 8. Click 'Sign in with Github' again. This time, it will be for travis-pro (which will let you view your tests on the web dashboard)
350
+ 9. Click 'Authorize travis-pro'
351
+ 10. You should have travis set up on your fork now.
352
+
353
+ ### circle-ci
354
+ TODO
355
+
356
+ ### appveyor
357
+ Follow these steps to link appveyor with your github fork of zstd
358
+
359
+ 1. Make sure you are logged into your github account
360
+ 2. Go to https://www.appveyor.com/
361
+ 3. Click 'Sign in' on the top right
362
+ 4. Select 'Github' on the left panel
363
+ 5. Click 'Authorize appveyor'
364
+ 6. You might be asked to select which repositories you want to give appveyor permission to. Select your fork of zstd if you're prompted
365
+ 7. You should have appveyor set up on your fork now.
366
+
367
+ ### General notes on CI
368
+ CI tests run every time a pull request (PR) is created or updated. The exact tests
369
+ that get run will depend on the destination branch you specify. Some tests take
370
+ longer to run than others. Currently, our CI is set up to run a short
371
+ series of tests when creating a PR to the dev branch and a longer series of tests
372
+ when creating a PR to the master branch. You can look in the configuration files
373
+ of the respective CI platform for more information on what gets run when.
374
+
375
+ Most people will just want to create a PR with the destination set to their local dev
376
+ branch of zstd. You can then find the status of the tests on the PR's page. You can also
377
+ re-run tests and cancel running tests from the PR page or from the respective CI's dashboard.
378
+
29
379
  ## Issues
30
380
  We use GitHub issues to track public bugs. Please ensure your description is
31
381
  clear and has sufficient instructions to be able to reproduce the issue.
@@ -34,7 +384,7 @@ Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
34
384
  disclosure of security bugs. In those cases, please go through the process
35
385
  outlined on that page and do not file a public issue.
36
386
 
37
- ## Coding Style
387
+ ## Coding Style
38
388
  * 4 spaces for indentation rather than tabs
39
389
 
40
390
  ## License