uncle_blake3 0.0.1

Files changed (40)
  1. checksums.yaml +7 -0
  2. data/LICENSE.md +27 -0
  3. data/README.md +89 -0
  4. data/ext/Rakefile +55 -0
  5. data/ext/binding/uncle_blake3.c +41 -0
  6. data/ext/blake3/c/Makefile.testing +82 -0
  7. data/ext/blake3/c/README.md +316 -0
  8. data/ext/blake3/c/blake3.c +616 -0
  9. data/ext/blake3/c/blake3.h +60 -0
  10. data/ext/blake3/c/blake3_avx2.c +326 -0
  11. data/ext/blake3/c/blake3_avx2_x86-64_unix.S +1815 -0
  12. data/ext/blake3/c/blake3_avx2_x86-64_windows_gnu.S +1817 -0
  13. data/ext/blake3/c/blake3_avx2_x86-64_windows_msvc.asm +1828 -0
  14. data/ext/blake3/c/blake3_avx512.c +1207 -0
  15. data/ext/blake3/c/blake3_avx512_x86-64_unix.S +2585 -0
  16. data/ext/blake3/c/blake3_avx512_x86-64_windows_gnu.S +2615 -0
  17. data/ext/blake3/c/blake3_avx512_x86-64_windows_msvc.asm +2634 -0
  18. data/ext/blake3/c/blake3_dispatch.c +276 -0
  19. data/ext/blake3/c/blake3_impl.h +282 -0
  20. data/ext/blake3/c/blake3_neon.c +351 -0
  21. data/ext/blake3/c/blake3_portable.c +160 -0
  22. data/ext/blake3/c/blake3_sse2.c +566 -0
  23. data/ext/blake3/c/blake3_sse2_x86-64_unix.S +2291 -0
  24. data/ext/blake3/c/blake3_sse2_x86-64_windows_gnu.S +2332 -0
  25. data/ext/blake3/c/blake3_sse2_x86-64_windows_msvc.asm +2350 -0
  26. data/ext/blake3/c/blake3_sse41.c +560 -0
  27. data/ext/blake3/c/blake3_sse41_x86-64_unix.S +2028 -0
  28. data/ext/blake3/c/blake3_sse41_x86-64_windows_gnu.S +2069 -0
  29. data/ext/blake3/c/blake3_sse41_x86-64_windows_msvc.asm +2089 -0
  30. data/ext/blake3/c/example.c +37 -0
  31. data/ext/blake3/c/main.c +166 -0
  32. data/ext/blake3/c/test.py +97 -0
  33. data/lib/uncle_blake3/binding.rb +20 -0
  34. data/lib/uncle_blake3/build/loader.rb +40 -0
  35. data/lib/uncle_blake3/build/platform.rb +37 -0
  36. data/lib/uncle_blake3/build.rb +4 -0
  37. data/lib/uncle_blake3/digest.rb +119 -0
  38. data/lib/uncle_blake3/version.rb +5 -0
  39. data/lib/uncle_blake3.rb +7 -0
  40. metadata +112 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: '0871620efe59a963bd224a6f78db74d5a59049f508c4a23af9c4bdda560c8475'
4
+ data.tar.gz: 2fd8ec5dd16df12f55deeea14175b54687b6d533eba4080cacd0ba8c061dbe63
5
+ SHA512:
6
+ metadata.gz: f88ab0d4978ecdea48c2afe95a1f602bc501fb48b99a8d03e0c9f00675ee695ccb3da0a07dbb51a6b38166f95d2c63e07eaee80b82fb8f1787598f83cbf9be97
7
+ data.tar.gz: d60a1b305eac39a76b399cff95590add4fc31b82e8ab0585de1f8d0db1147ee288c04251ef04c84059315fd747113a7c7676bef6a2f8c68ccdae3596263af4be
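As an illustration of how the `checksums.yaml` values above might be used, here is a minimal Ruby sketch that compares a file's SHA-256 digest with an expected hex string. The helper name `sha256_matches?` and the example path are hypothetical; the only dependency assumed is `Digest::SHA256` from Ruby's standard library.

```ruby
require 'digest'

# Hypothetical helper (not part of the gem): returns true when the file at
# `path` hashes to `expected_hex` under SHA-256.
def sha256_matches?(path, expected_hex)
  actual = Digest::SHA256.file(path).hexdigest
  # Compare case-insensitively, since hex digests may be written in either case.
  actual.casecmp?(expected_hex.strip)
end

# Example call, assuming data.tar.gz has been extracted from the .gem archive:
# sha256_matches?('data.tar.gz',
#                 '2fd8ec5dd16df12f55deeea14175b54687b6d533eba4080cacd0ba8c061dbe63')
```

The same pattern would apply to the SHA512 entries by swapping in `Digest::SHA512`.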
data/LICENSE.md ADDED
@@ -0,0 +1,27 @@
1
+ # BSD 3-Clause License
2
+
3
+ _Copyright © `2022`, `Sarun Rattanasiri`_
4
+ _All rights reserved._
5
+
6
+ Redistribution and use in source and binary forms, with or without modification,
7
+ are permitted provided that the following conditions are met:
8
+
9
+ * Redistributions of source code must retain the above copyright notice,
10
+ this list of conditions and the following disclaimer.
11
+ * Redistributions in binary form must reproduce the above copyright notice,
12
+ this list of conditions and the following disclaimer in the documentation
13
+ and/or other materials provided with the distribution.
14
+ * Neither the name of the copyright holder nor the names of its contributors
15
+ may be used to endorse or promote products derived from this software
16
+ without specific prior written permission.
17
+
18
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
19
+ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
20
+ THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
21
+ IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
22
+ INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
23
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
24
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
25
+ HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
26
+ OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
27
+ EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md ADDED
@@ -0,0 +1,89 @@
1
+ # UncleBlake3
2
+
3
+ ## What is it?
4
+
5
+ UncleBlake3 is a Ruby binding of [Blake3](https://github.com/BLAKE3-team/BLAKE3), a fast cryptographic hash function.
6
+
7
+ ## What is special?
8
+
9
+ - It builds on top of the [official C implementation](https://github.com/BLAKE3-team/BLAKE3/tree/master/c),
10
+ which is hand-optimized down to the assembly instruction.
11
+ - The implementation supports the `AVX512`, `AVX2`, `SSE4.1`, and `SSE2` instruction sets for acceleration.
12
+ - Thin and stable binding layer
13
+ - Not limited to [Matz's Ruby Interpreter (MRI)](https://en.wikipedia.org/wiki/Ruby_MRI), because the gem opts
14
+ for [Ruby-FFI](https://github.com/ffi/ffi) instead of using the API exposed by `ruby.h`.
15
+ (I only tested on MRI, though.)
16
+
17
+ ## Prerequisites
18
+
19
+ To install the gem, you need:
20
+
21
+ - GCC, the GNU Compiler Collection
22
+ - Ruby and its related tooling
23
+
24
+ ## Installation
25
+
26
+ Add this line to your application's Gemfile:
27
+
28
+ ```ruby
29
+ gem 'uncle_blake3'
30
+ ```
31
+
32
+ And then execute:
33
+
34
+ $ bundle install
35
+
36
+ ## Usage Examples
37
+
38
+ ~~~Ruby
39
+ # basic usage
40
+ ::UncleBlake3::Digest.hexdigest("\x00")
41
+ # => "2d3adedff11b61f14c886e35afa036736dcd87a74d27b5c1510225d0f592e213"
42
+
43
+ # streaming
44
+ digest = ::UncleBlake3::Digest.new
45
+ digest << "\x00\x01"
46
+ digest << "\x02\x03"
47
+ digest.hexdigest
48
+ # => "f30f5ab28fe047904037f77b6da4fea1e27241c5d132638d8bedce9d40494f32"
49
+ # `<<` is an alias of `update`, use the one you like
50
+
51
+ # keyed hash
52
+ digest = ::UncleBlake3::Digest.new(key: 'whats the Elvish word for friend') # the key must be a 32-byte key or UncleBlake will get mad
53
+ digest << "\x00\x01\x02\x03"
54
+ digest.hexdigest
55
+ # => "7671dde590c95d5ac9616651ff5aa0a27bee5913a348e053b8aa9108917fe070"
56
+
57
+ # use key_seed if you want something like a keyed hash but you have an arbitrary length String as a key
58
+ digest = ::UncleBlake3::Digest.new(key_seed: 'BLAKE3 2019-12-27 16:29:52 test vectors context') # key_seed is the context string in the derive_key mode of Blake3
59
+ digest << "\x00\x01\x02\x03"
60
+ digest.hexdigest
61
+ # => "f46085c8190d69022369ce1a18880e9b369c135eb93f3c63550d3e7630e91060"
62
+
63
+ # shortcuts
64
+ ::UncleBlake3::Digest.digest("\x00\x01\x02\x03", key_seed: 'BLAKE3 2019-12-27 16:29:52 test vectors context')
65
+ # => "\xF4`\x85\xC8\x19\ri\x02#i\xCE\x1A\x18\x88\x0E\x9B6\x9C\x13^\xB9?<cU\r>v0\xE9\x10`"
66
+ ::UncleBlake3::Digest.hexdigest("\x00\x01\x02\x03", key_seed: 'BLAKE3 2019-12-27 16:29:52 test vectors context')
67
+ # => "f46085c8190d69022369ce1a18880e9b369c135eb93f3c63550d3e7630e91060"
68
+ ::UncleBlake3::Digest.base64digest("\x00\x01\x02\x03", key_seed: 'BLAKE3 2019-12-27 16:29:52 test vectors context', output_length: 24)
69
+ # => "9GCFyBkNaQIjac4aGIgOmzacE165Pzxj"
70
+ # `digest`, `hexdigest`, and `base64digest` are available as shortcuts and also on `Digest` instances.
71
+ # Same for the options, you may use `key`, `key_seed`, and `output_length` on both instance methods and shortcuts
72
+
73
+ # XOF (extendable-output functions)
74
+ digest = ::UncleBlake3::Digest.new(output_length: 64)
75
+ digest << "\x00"
76
+ digest.hexdigest
77
+ # => "2d3adedff11b61f14c886e35afa036736dcd87a74d27b5c1510225d0f592e213c3a6cb8bf623e20cdb535f8d1a5ffb86342d9c0b64aca3bce1d31f60adfa137b"
78
+ ~~~
79
+
80
+ ## Why not a Rust binding?
81
+
82
+ `gcc` is far more commonly available than the Rust compiler.
83
+ Also, a typical Ruby application rarely hashes large amounts of data; it mostly hashes short messages.
84
+ In that case, C may be the best choice for both usability and performance, because it computes each hash on a single thread.
85
+ A single hash calculation gets no multicore speedup, but many simultaneous calculations on short inputs still benefit, with less overhead.
86
+
87
+ ## License
88
+
89
+ UncleBlake3 is released under the [BSD 3-Clause License](LICENSE.md). :tada:
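The sample values in the README above can be cross-checked without installing the gem. The sketch below uses only core Ruby and strings copied from the usage examples. It confirms that the example key is 32 bytes long and that the shorter outputs (`output_length: 24` and the default 32 bytes) are prefixes of the longer ones. The helper names `hex_to_bytes`, `base64_to_bytes`, and `output_prefix?` are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helpers for inspecting the README's sample outputs.

# Decodes a hex string into raw bytes.
def hex_to_bytes(hex)
  [hex].pack('H*')
end

# Decodes a Base64 string into raw bytes (core Ruby, no extra require).
def base64_to_bytes(b64)
  b64.unpack1('m')
end

# True when `short` (raw bytes) is a prefix of `long` (raw bytes).
def output_prefix?(short, long)
  long.byteslice(0, short.bytesize) == short
end

# Strings copied verbatim from the README usage examples.
key        = 'whats the Elvish word for friend'
full_hex   = 'f46085c8190d69022369ce1a18880e9b369c135eb93f3c63550d3e7630e91060'
short_b64  = '9GCFyBkNaQIjac4aGIgOmzacE165Pzxj'
default_32 = '2d3adedff11b61f14c886e35afa036736dcd87a74d27b5c1510225d0f592e213'
xof_64     = '2d3adedff11b61f14c886e35afa036736dcd87a74d27b5c1510225d0f592e213' \
             'c3a6cb8bf623e20cdb535f8d1a5ffb86342d9c0b64aca3bce1d31f60adfa137b'

puts key.bytesize
puts output_prefix?(base64_to_bytes(short_b64), hex_to_bytes(full_hex))
puts output_prefix?(hex_to_bytes(default_32), hex_to_bytes(xof_64))
```

This checks only that the published examples are internally consistent; it does not compute any BLAKE3 hash itself.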
data/ext/Rakefile ADDED
@@ -0,0 +1,55 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'fileutils'
4
+ require_relative '../lib/uncle_blake3/build'
5
+
6
+ blake3_prefix = 'blake3/c/'
7
+ build_prefix = 'bin/.build/'
8
+ static_target = 'blake3.a'
9
+ object_list = %w[
10
+ blake3.o
11
+ blake3_dispatch.o
12
+ blake3_portable.o
13
+ blake3_sse2_x86-64_unix.o
14
+ blake3_sse41_x86-64_unix.o
15
+ blake3_avx2_x86-64_unix.o
16
+ blake3_avx512_x86-64_unix.o
17
+ ]
18
+
19
+ platform = ::UncleBlake3::Build::Platform.instance
20
+ out_dir = "#{platform.arch}-#{platform.os}"
21
+ lib_name = ::File.join(out_dir, platform.map_library_name('UncleBlake3'))
22
+
23
+ task default: [lib_name]
24
+
25
+ source_files = ::Dir['**/*', base: blake3_prefix].select do |file|
26
+ (/\.[cs]\z/i =~ file) && ::File.file?("#{blake3_prefix}#{file}")
27
+ end
28
+
29
+ file lib_name => FileList["#{build_prefix}uncle_blake3.o", "#{build_prefix}#{static_target}"] do |t|
30
+ ::FileUtils.mkdir_p(::File.dirname(t.name))
31
+ static_lib = t.prerequisites.last
32
+ static_lib_dir = ::File.dirname(static_lib)
33
+ static_lib_file = ::File.basename(static_lib)
34
+ sh "gcc -shared -O3 -flto -o #{t.name} #{t.prerequisites.first} -L#{static_lib_dir} -l:#{static_lib_file} -lm -lc"
35
+ end
36
+
37
+ file "#{build_prefix}uncle_blake3.o" => FileList['binding/uncle_blake3.c'] do |t|
38
+ ::FileUtils.mkdir_p(::File.dirname(t.name))
39
+ sh "gcc -O3 -fPIC -flto -Wall -I./blake3/c/ -c #{t.prerequisites.last} -o #{t.name}"
40
+ end
41
+
42
+ file "#{build_prefix}#{static_target}" => FileList[
43
+ *object_list.map { |object_file| "#{build_prefix}#{object_file}" }
44
+ ] do |t|
45
+ ::FileUtils.mkdir_p(::File.dirname(t.name))
46
+ sh "ar rcs #{t.name} #{t.prerequisites.join(' ')}"
47
+ end
48
+
49
+ source_files.each do |source_file|
50
+ object_name = source_file.sub(/(?:\.[cs])?\z/i, '.o')
51
+ file "#{build_prefix}#{object_name}" => FileList["#{blake3_prefix}#{source_file}"] do |t|
52
+ ::FileUtils.mkdir_p(::File.dirname(t.name))
53
+ sh "gcc -O3 -fPIC -flto -Wall -c #{t.prerequisites.last} -o #{t.name}"
54
+ end
55
+ end
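To make the Rakefile's file-name handling easier to follow, the sketch below reproduces its two regular expressions in isolation. The method names `source_file?` and `object_name_for` are hypothetical wrappers added for this illustration; the patterns themselves are copied from the Rakefile above.

```ruby
# True for names the Rakefile's glob filter would accept (.c, .s, .S).
def source_file?(name)
  /\.[cs]\z/i.match?(name)
end

# Object-file name derived the same way as the Rakefile's `sub` call.
def object_name_for(source_file)
  source_file.sub(/(?:\.[cs])?\z/i, '.o')
end

%w[
  blake3.c
  blake3_sse2_x86-64_unix.S
  blake3_avx2_x86-64_windows_msvc.asm
  blake3.h
].each do |name|
  if source_file?(name)
    puts "#{name} -> #{object_name_for(name)}"
  else
    puts "#{name} (skipped)"
  end
end
```

Under this reading, every `.c` and `.S` file under `blake3/c/` gets a file task, but only the objects named in `object_list` are prerequisites of the static library, so the intrinsics and Windows variants would not be compiled by the default task.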
data/ext/binding/uncle_blake3.c ADDED
@@ -0,0 +1,41 @@
1
+ #include <stdint.h>
2
+ #include <stdlib.h>
3
+ #include "blake3.h"
4
+
5
+ uint16_t UncleBlake3_KEY_LEN() {
6
+ return (BLAKE3_KEY_LEN);
7
+ }
8
+
9
+ uint16_t UncleBlake3_OUT_LEN() {
10
+ return (BLAKE3_OUT_LEN);
11
+ }
12
+
13
+ void * UncleBlake3_Init() {
14
+ blake3_hasher *retVal = malloc(sizeof (blake3_hasher)); // TODO: check result
15
+ blake3_hasher_init(retVal);
16
+ return retVal;
17
+ }
18
+
19
+ void * UncleBlake3_InitWithKey(const uint8_t *key) {
20
+ blake3_hasher *retVal = malloc(sizeof (blake3_hasher)); // TODO: check result
21
+ blake3_hasher_init_keyed(retVal, key);
22
+ return retVal;
23
+ }
24
+
25
+ void * UncleBlake3_InitWithKeySeed(const void *context, size_t context_len) {
26
+ blake3_hasher *retVal = malloc(sizeof (blake3_hasher)); // TODO: check result
27
+ blake3_hasher_init_derive_key_raw(retVal, context, context_len);
28
+ return retVal;
29
+ }
30
+
31
+ void UncleBlake3_Update(void *instance, const void *input, size_t inputByteLen) {
32
+ blake3_hasher_update((blake3_hasher *)instance, input, inputByteLen);
33
+ }
34
+
35
+ void UncleBlake3_Final(void *instance, uint8_t *output, size_t outputByteLen) {
36
+ blake3_hasher_finalize((const blake3_hasher *)instance, output, outputByteLen);
37
+ }
38
+
39
+ void UncleBlake3_Destroy(void *instance) {
40
+ free(instance);
41
+ }
data/ext/blake3/c/Makefile.testing ADDED
@@ -0,0 +1,82 @@
1
+ # This Makefile is only for testing. C callers should follow the instructions
2
+ # in ./README.md to incorporate these C files into their existing build.
3
+
4
+ NAME=blake3
5
+ CC=gcc
6
+ CFLAGS=-O3 -Wall -Wextra -std=c11 -pedantic -fstack-protector-strong -D_FORTIFY_SOURCE=2 -fPIE -fvisibility=hidden
7
+ LDFLAGS=-pie -Wl,-z,relro,-z,now
8
+ TARGETS=
9
+ ASM_TARGETS=
10
+ EXTRAFLAGS=-Wa,--noexecstack
11
+
12
+ ifdef BLAKE3_NO_SSE2
13
+ EXTRAFLAGS += -DBLAKE3_NO_SSE2
14
+ else
15
+ TARGETS += blake3_sse2.o
16
+ ASM_TARGETS += blake3_sse2_x86-64_unix.S
17
+ endif
18
+
19
+ ifdef BLAKE3_NO_SSE41
20
+ EXTRAFLAGS += -DBLAKE3_NO_SSE41
21
+ else
22
+ TARGETS += blake3_sse41.o
23
+ ASM_TARGETS += blake3_sse41_x86-64_unix.S
24
+ endif
25
+
26
+ ifdef BLAKE3_NO_AVX2
27
+ EXTRAFLAGS += -DBLAKE3_NO_AVX2
28
+ else
29
+ TARGETS += blake3_avx2.o
30
+ ASM_TARGETS += blake3_avx2_x86-64_unix.S
31
+ endif
32
+
33
+ ifdef BLAKE3_NO_AVX512
34
+ EXTRAFLAGS += -DBLAKE3_NO_AVX512
35
+ else
36
+ TARGETS += blake3_avx512.o
37
+ ASM_TARGETS += blake3_avx512_x86-64_unix.S
38
+ endif
39
+
40
+ ifdef BLAKE3_USE_NEON
41
+ EXTRAFLAGS += -DBLAKE3_USE_NEON=1
42
+ TARGETS += blake3_neon.o
43
+ endif
44
+
45
+ ifdef BLAKE3_NO_NEON
46
+ EXTRAFLAGS += -DBLAKE3_USE_NEON=0
47
+ endif
48
+
49
+ all: blake3.c blake3_dispatch.c blake3_portable.c main.c $(TARGETS)
50
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) $^ -o $(NAME) $(LDFLAGS)
51
+
52
+ blake3_sse2.o: blake3_sse2.c
53
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) -c $^ -o $@ -msse2
54
+
55
+ blake3_sse41.o: blake3_sse41.c
56
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) -c $^ -o $@ -msse4.1
57
+
58
+ blake3_avx2.o: blake3_avx2.c
59
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) -c $^ -o $@ -mavx2
60
+
61
+ blake3_avx512.o: blake3_avx512.c
62
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) -c $^ -o $@ -mavx512f -mavx512vl
63
+
64
+ blake3_neon.o: blake3_neon.c
65
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) -c $^ -o $@
66
+
67
+ test: CFLAGS += -DBLAKE3_TESTING -fsanitize=address,undefined
68
+ test: all
69
+ ./test.py
70
+
71
+ asm: blake3.c blake3_dispatch.c blake3_portable.c main.c $(ASM_TARGETS)
72
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) $^ -o $(NAME) $(LDFLAGS)
73
+
74
+ test_asm: CFLAGS += -DBLAKE3_TESTING -fsanitize=address,undefined
75
+ test_asm: asm
76
+ ./test.py
77
+
78
+ example: example.c blake3.c blake3_dispatch.c blake3_portable.c $(ASM_TARGETS)
79
+ $(CC) $(CFLAGS) $(EXTRAFLAGS) $^ -o $@ $(LDFLAGS)
80
+
81
+ clean:
82
+ rm -f $(NAME) *.o
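The `ifdef` blocks in the Makefile above decide which objects and flags end up in the build. As a rough model of that logic, the Ruby sketch below returns the lists the Makefile would assemble for a given set of make variables. The method name `makefile_selection` and its return shape are hypothetical. It assumes a variable counts as defined when it appears in the list, which simplifies make's actual `ifdef` rule (true for a non-empty value).

```ruby
# Models the TARGETS, ASM_TARGETS and EXTRAFLAGS lists built by
# Makefile.testing for the given make variable names.
def makefile_selection(set_vars = [])
  targets = []
  asm_targets = []
  extraflags = ['-Wa,--noexecstack']

  # Each x86 instruction set is included unless its BLAKE3_NO_* switch is set.
  %w[SSE2 SSE41 AVX2 AVX512].each do |isa|
    if set_vars.include?("BLAKE3_NO_#{isa}")
      extraflags << "-DBLAKE3_NO_#{isa}"
    else
      targets << "blake3_#{isa.downcase}.o"
      asm_targets << "blake3_#{isa.downcase}_x86-64_unix.S"
    end
  end

  # NEON is opt-in through BLAKE3_USE_NEON and can be forced off separately.
  if set_vars.include?('BLAKE3_USE_NEON')
    extraflags << '-DBLAKE3_USE_NEON=1'
    targets << 'blake3_neon.o'
  end
  extraflags << '-DBLAKE3_USE_NEON=0' if set_vars.include?('BLAKE3_NO_NEON')

  { targets: targets, asm_targets: asm_targets, extraflags: extraflags }
end

selection = makefile_selection(['BLAKE3_NO_AVX512'])
puts selection[:targets].join(' ')
puts selection[:extraflags].join(' ')
```

For example, `make -f Makefile.testing BLAKE3_NO_AVX512=1` would correspond to the call shown at the end of the sketch.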
data/ext/blake3/c/README.md ADDED
@@ -0,0 +1,316 @@
1
+ The official C implementation of BLAKE3.
2
+
3
+ # Example
4
+
5
+ An example program that hashes bytes from standard input and prints the
6
+ result:
7
+
8
+ ```c
9
+ #include "blake3.h"
10
+ #include <errno.h>
11
+ #include <stdio.h>
12
+ #include <stdlib.h>
13
+ #include <string.h>
14
+ #include <unistd.h>
15
+
16
+ int main() {
17
+ // Initialize the hasher.
18
+ blake3_hasher hasher;
19
+ blake3_hasher_init(&hasher);
20
+
21
+ // Read input bytes from stdin.
22
+ unsigned char buf[65536];
23
+ while (1) {
24
+ ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
25
+ if (n > 0) {
26
+ blake3_hasher_update(&hasher, buf, n);
27
+ } else if (n == 0) {
28
+ break; // end of file
29
+ } else {
30
+ fprintf(stderr, "read failed: %s\n", strerror(errno));
31
+ exit(1);
32
+ }
33
+ }
34
+
35
+ // Finalize the hash. BLAKE3_OUT_LEN is the default output length, 32 bytes.
36
+ uint8_t output[BLAKE3_OUT_LEN];
37
+ blake3_hasher_finalize(&hasher, output, BLAKE3_OUT_LEN);
38
+
39
+ // Print the hash as hexadecimal.
40
+ for (size_t i = 0; i < BLAKE3_OUT_LEN; i++) {
41
+ printf("%02x", output[i]);
42
+ }
43
+ printf("\n");
44
+ return 0;
45
+ }
46
+ ```
47
+
48
+ The code above is included in this directory as `example.c`. If you're
49
+ on x86\_64 with a Unix-like OS, you can compile a working binary like
50
+ this:
51
+
52
+ ```bash
53
+ gcc -O3 -o example example.c blake3.c blake3_dispatch.c blake3_portable.c \
54
+ blake3_sse2_x86-64_unix.S blake3_sse41_x86-64_unix.S blake3_avx2_x86-64_unix.S \
55
+ blake3_avx512_x86-64_unix.S
56
+ ```
57
+
58
+ # API
59
+
60
+ ## The Struct
61
+
62
+ ```c
63
+ typedef struct {
64
+ // private fields
65
+ } blake3_hasher;
66
+ ```
67
+
68
+ An incremental BLAKE3 hashing state, which can accept any number of
69
+ updates. This implementation doesn't allocate any heap memory, but
70
+ `sizeof(blake3_hasher)` itself is relatively large, currently 1912 bytes
71
+ on x86-64. This size can be reduced by restricting the maximum input
72
+ length, as described in Section 5.4 of [the BLAKE3
73
+ spec](https://github.com/BLAKE3-team/BLAKE3-specs/blob/master/blake3.pdf),
74
+ but this implementation doesn't currently support that strategy.
75
+
76
+ ## Common API Functions
77
+
78
+ ```c
79
+ void blake3_hasher_init(
80
+ blake3_hasher *self);
81
+ ```
82
+
83
+ Initialize a `blake3_hasher` in the default hashing mode.
84
+
85
+ ---
86
+
87
+ ```c
88
+ void blake3_hasher_update(
89
+ blake3_hasher *self,
90
+ const void *input,
91
+ size_t input_len);
92
+ ```
93
+
94
+ Add input to the hasher. This can be called any number of times.
95
+
96
+ ---
97
+
98
+ ```c
99
+ void blake3_hasher_finalize(
100
+ const blake3_hasher *self,
101
+ uint8_t *out,
102
+ size_t out_len);
103
+ ```
104
+
105
+ Finalize the hasher and return an output of any length, given in bytes.
106
+ This doesn't modify the hasher itself, and it's possible to finalize
107
+ again after adding more input. The constant `BLAKE3_OUT_LEN` provides
108
+ the default output length, 32 bytes, which is recommended for most
109
+ callers.
110
+
111
+ Outputs shorter than the default length of 32 bytes (256 bits) provide
112
+ less security. An N-bit BLAKE3 output is intended to provide N bits of
113
+ first and second preimage resistance and N/2 bits of collision
114
+ resistance, for any N up to 256. Longer outputs don't provide any
115
+ additional security.
116
+
117
+ Shorter BLAKE3 outputs are prefixes of longer ones. Explicitly
118
+ requesting a short output is equivalent to truncating the default-length
119
+ output. (Note that this is different between BLAKE2 and BLAKE3.)
120
+
121
+ ## Less Common API Functions
122
+
123
+ ```c
124
+ void blake3_hasher_init_keyed(
125
+ blake3_hasher *self,
126
+ const uint8_t key[BLAKE3_KEY_LEN]);
127
+ ```
128
+
129
+ Initialize a `blake3_hasher` in the keyed hashing mode. The key must be
130
+ exactly 32 bytes.
131
+
132
+ ---
133
+
134
+ ```c
135
+ void blake3_hasher_init_derive_key(
136
+ blake3_hasher *self,
137
+ const char *context);
138
+ ```
139
+
140
+ Initialize a `blake3_hasher` in the key derivation mode. The context
141
+ string is given as an initialization parameter, and afterwards input key
142
+ material should be given with `blake3_hasher_update`. The context string
143
+ is a null-terminated C string which should be **hardcoded, globally
144
+ unique, and application-specific**. The context string should not
145
+ include any dynamic input like salts, nonces, or identifiers read from a
146
+ database at runtime. A good default format for the context string is
147
+ `"[application] [commit timestamp] [purpose]"`, e.g., `"example.com
148
+ 2019-12-25 16:18:03 session tokens v1"`.
149
+
150
+ This function is intended for application code written in C. For
151
+ language bindings, see `blake3_hasher_init_derive_key_raw` below.
152
+
153
+ ---
154
+
155
+ ```c
156
+ void blake3_hasher_init_derive_key_raw(
157
+ blake3_hasher *self,
158
+ const void *context,
159
+ size_t context_len);
160
+ ```
161
+
162
+ As `blake3_hasher_init_derive_key` above, except that the context string
163
+ is given as a pointer to an array of arbitrary bytes with a provided
164
+ length. This is intended for writing language bindings, where C string
165
+ conversion would add unnecessary overhead and new error cases. Unicode
166
+ strings should be encoded as UTF-8.
167
+
168
+ Application code in C should prefer `blake3_hasher_init_derive_key`,
169
+ which takes the context as a C string. If you need to use arbitrary
170
+ bytes as a context string in application code, consider whether you're
171
+ violating the requirement that context strings should be hardcoded.
172
+
173
+ ---
174
+
175
+ ```c
176
+ void blake3_hasher_finalize_seek(
177
+ const blake3_hasher *self,
178
+ uint64_t seek,
179
+ uint8_t *out,
180
+ size_t out_len);
181
+ ```
182
+
183
+ The same as `blake3_hasher_finalize`, but with an additional `seek`
184
+ parameter for the starting byte position in the output stream. To
185
+ efficiently stream a large output without allocating memory, call this
186
+ function in a loop, incrementing `seek` by the output length each time.
187
+
188
+ ---
189
+
190
+ ```c
191
+ void blake3_hasher_reset(
192
+ blake3_hasher *self);
193
+ ```
194
+
195
+ Reset the hasher to its initial state, prior to any calls to
196
+ `blake3_hasher_update`. Currently this is no different from calling
197
+ `blake3_hasher_init` or similar again. However, if this implementation gains
198
+ multithreading support in the future, and if `blake3_hasher` holds (optional)
199
+ threading resources, this function will reuse those resources. Until then, this
200
+ is mainly for feature compatibility with the Rust implementation.
201
+
202
+
203
+ # Building
204
+
205
+ This implementation is just C and assembly files. It doesn't include a
206
+ public-facing build system. (The `Makefile` in this directory is only
207
+ for testing.) Instead, the intention is that you can include these files
208
+ in whatever build system you're already using. This section describes
209
+ the commands your build system should execute, or which you can execute
210
+ by hand. Note that these steps may change in future versions.
211
+
212
+ ## x86
213
+
214
+ Dynamic dispatch is enabled by default on x86. The implementation will
215
+ query the CPU at runtime to detect SIMD support, and it will use the
216
+ widest instruction set available. By default, `blake3_dispatch.c`
217
+ expects to be linked with code for five different instruction sets:
218
+ portable C, SSE2, SSE4.1, AVX2, and AVX-512.
219
+
220
+ For each of the x86 SIMD instruction sets, four versions are available:
221
+ three flavors of assembly (Unix, Windows MSVC, and Windows GNU) and one
222
+ version using C intrinsics. The assembly versions are generally
223
+ preferred. They perform better, they perform more consistently across
224
+ different compilers, and they build more quickly. On the other hand, the
225
+ assembly versions are x86\_64-only, and you need to select the right
226
+ flavor for your target platform.
227
+
228
+ Here's an example of building a shared library on x86\_64 Linux using
229
+ the assembly implementations:
230
+
231
+ ```bash
232
+ gcc -shared -O3 -o libblake3.so blake3.c blake3_dispatch.c blake3_portable.c \
233
+ blake3_sse2_x86-64_unix.S blake3_sse41_x86-64_unix.S blake3_avx2_x86-64_unix.S \
234
+ blake3_avx512_x86-64_unix.S
235
+ ```
236
+
237
+ When building the intrinsics-based implementations, you need to build
238
+ each implementation separately, with the corresponding instruction set
239
+ explicitly enabled in the compiler. Here's the same shared library using
240
+ the intrinsics-based implementations:
241
+
242
+ ```bash
243
+ gcc -c -fPIC -O3 -msse2 blake3_sse2.c -o blake3_sse2.o
244
+ gcc -c -fPIC -O3 -msse4.1 blake3_sse41.c -o blake3_sse41.o
245
+ gcc -c -fPIC -O3 -mavx2 blake3_avx2.c -o blake3_avx2.o
246
+ gcc -c -fPIC -O3 -mavx512f -mavx512vl blake3_avx512.c -o blake3_avx512.o
247
+ gcc -shared -O3 -o libblake3.so blake3.c blake3_dispatch.c blake3_portable.c \
248
+ blake3_avx2.o blake3_avx512.o blake3_sse41.o blake3_sse2.o
249
+ ```
250
+
251
+ Note above that building `blake3_avx512.c` requires both `-mavx512f` and
252
+ `-mavx512vl` under GCC and Clang. Under MSVC, the single `/arch:AVX512`
253
+ flag is sufficient. The MSVC equivalent of `-mavx2` is `/arch:AVX2`.
254
+ MSVC enables SSE2 and SSE4.1 by default, and it doesn't have a
255
+ corresponding flag.
256
+
257
+ If you want to omit SIMD code entirely, you need to explicitly disable
258
+ each instruction set. Here's an example of building a shared library on
259
+ x86 with only portable code:
260
+
261
+ ```bash
262
+ gcc -shared -O3 -o libblake3.so -DBLAKE3_NO_SSE2 -DBLAKE3_NO_SSE41 -DBLAKE3_NO_AVX2 \
263
+ -DBLAKE3_NO_AVX512 blake3.c blake3_dispatch.c blake3_portable.c
264
+ ```
265
+
266
+ ## ARM NEON
267
+
268
+ The NEON implementation is enabled by default on AArch64, but not on
269
+ other ARM targets, since not all of them support it. To enable it, set
270
+ `BLAKE3_USE_NEON=1`. Here's an example of building a shared library on
271
+ ARM Linux with NEON support:
272
+
273
+ ```bash
274
+ gcc -shared -O3 -o libblake3.so -DBLAKE3_USE_NEON=1 blake3.c blake3_dispatch.c \
275
+ blake3_portable.c blake3_neon.c
276
+ ```
277
+
278
+ To explicitly disable using NEON instructions on AArch64, set
279
+ `BLAKE3_USE_NEON=0`.
280
+
281
+ ```bash
282
+ gcc -shared -O3 -o libblake3.so -DBLAKE3_USE_NEON=0 blake3.c blake3_dispatch.c \
283
+ blake3_portable.c
284
+ ```
285
+
286
+ Note that on some targets (ARMv7 in particular), extra flags may be
287
+ required to activate NEON support in the compiler. If you see an error
288
+ like...
289
+
290
+ ```
291
+ /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/9.2.0/include/arm_neon.h:635:1: error: inlining failed
292
+ in call to always_inline ‘vaddq_u32’: target specific option mismatch
293
+ ```
294
+
295
+ ...then you may need to add something like `-mfpu=neon-vfpv4
296
+ -mfloat-abi=hard`.
297
+
298
+ ## Other Platforms
299
+
300
+ The portable implementation should work on most other architectures. For
301
+ example:
302
+
303
+ ```bash
304
+ gcc -shared -O3 -o libblake3.so blake3.c blake3_dispatch.c blake3_portable.c
305
+ ```
306
+
307
+ # Multithreading
308
+
309
+ Unlike the Rust implementation, the C implementation doesn't currently support
310
+ multithreading. A future version of this library could add support by taking an
311
+ optional dependency on OpenMP or similar. Alternatively, we could expose a
312
+ lower-level API to allow callers to implement concurrency themselves. The
313
+ former would be more convenient and less error-prone, but the latter would give
314
+ callers the maximum possible amount of control. The best choice here depends on
315
+ the specific use case, so if you have a use case for multithreaded hashing in
316
+ C, please file a GitHub issue and let us know.
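The C README's description of `blake3_hasher_finalize_seek` suggests streaming a long output by calling it in a loop and advancing `seek` by the output length each time. The Ruby sketch below works out only the arithmetic of such a loop: the `(seek, out_len)` pair for each iteration. The method name `seek_plan` is hypothetical, and no BLAKE3 function is called here.

```ruby
# Splits `total_len` output bytes into [seek, out_len] pairs of at most
# `chunk_len` bytes: the arguments one would pass on each loop iteration.
def seek_plan(total_len, chunk_len)
  raise ArgumentError, 'chunk_len must be positive' unless chunk_len.positive?

  (0...total_len).step(chunk_len).map do |seek|
    # The last chunk may be shorter than chunk_len.
    [seek, [chunk_len, total_len - seek].min]
  end
end

seek_plan(100, 32).each do |seek, out_len|
  puts "seek=#{seek} out_len=#{out_len}"
end
```

Because each chunk begins where the previous one ended, the chunks concatenate to the same bytes that a single call requesting `total_len` bytes would return, while needing a buffer of only `chunk_len` bytes.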