fast-aes 0.1.0

@@ -0,0 +1,110 @@
1
+ = FastAES - Fast AES implementation for Ruby in C
2
+
3
+ This is a lightweight, fast implementation of AES (the US government's Advanced Encryption Standard,
4
+ aka "Rijndael"), written in C for speed. You can read more on the {Wikipedia AES Page}[http://en.wikipedia.org/wiki/Advanced_Encryption_Standard].
5
+ The algorithm itself was extracted from work by Christophe Devine for the open source Netcat clone
6
+ {sbd}[http://www.cycom.se/dl/sbd]. According to the community, this is
7
+ {one of the best performing AES implementations available}[http://www.derkeiler.com/Newsgroups/sci.crypt/2003-07/0162.html]:
8
+
9
+ > With some exceptions your code performs better than all others in
10
+ > enc[ryption]/dec[ryption]. Do you have an explanation of that fact? Thanks.
11
+ >
12
+ Well, I've tried to make the code as simple and straightforward as
13
+ possible; I also used a few basic tricks, like loop unrolling.
14
+
15
+ Since this library wraps the sbd implementation, it supports a subset of AES, specifically:
16
+
17
+ * 128, 192, and 256-bit ciphers
18
+ * Cipher Block Chaining (CBC) mode only
19
+ * Input is zero-padded to 16-byte boundaries ({read more on padding}[http://www.di-mgt.com.au/cryptopad.html#whatispadding])
20
+
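To make the padding rule concrete, here is a minimal Ruby sketch of zero-padding to a 16-byte boundary, mirroring what the C source does with the final partial block. The constant `BLOCK_SIZE` and the helper `zero_pad` are illustrative names only and are not part of the FastAES API.

```ruby
BLOCK_SIZE = 16 # AES block size in bytes

# Pad a binary string with zero bytes up to the next 16-byte boundary.
# Input that is already block-aligned is returned unchanged (no extra block).
def zero_pad(data)
  data = data.b # work on raw bytes, not characters
  remainder = data.bytesize % BLOCK_SIZE
  return data if remainder.zero?
  data + ("\0".b * (BLOCK_SIZE - remainder))
end

padded = zero_pad("Hey there, how are you?") # 23 bytes of input
puts padded.bytesize # 32
```

Since the padding is plain zeros with no length marker, the ciphertext length is always the input length rounded up to a multiple of 16.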
21
+ You can read specifics about AES-CBC in the IPSec-related {RFC 3602}[http://www.rfc-archive.org/getrfc.php?rfc=3602],
22
+ if you really care that much.
23
+
24
+ Bottom line, this gem works. Fast.
25
+
26
+ === Other Ruby AES gems
27
+
28
+ I couldn't find any that worked worth a crap. The {ruby-aes}[http://rubyforge.org/projects/ruby-aes/]
29
+ project has Ruby 1.9 bugs that have been open over _two_ _years_ now, {crypt/rijndael}[http://crypt.rubyforge.org/rijndael.html]
30
+ doesn't work on Ruby 1.9 and is *SLOOOOOOW* (as it's written in Ruby), and some people even report getting
31
+ {inconsistent encryption results from other libraries}[http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/228214].
32
+
33
+ So I grabbed some C reference code, wrapped a Ruby interface around it, and voilà.
34
+
35
+ C'mon people, it's not that hard. It's called Google. In my day, you had to actually *WRITE* the code.
36
+
37
+ == Installation
38
+
39
+ gem install gemcutter
40
+ gem install fast-aes
41
+
42
+ == Example
43
+
44
+ Simple encryption/decryption:
45
+
46
+ require 'fast-aes'
47
+
48
+ # key must be exactly 16, 24, or 32 bytes long (128, 192, or 256 bits)
49
+ key = '424b3b5c4d454c7a51376748255d7b71'
50
+
51
+ aes = FastAES.new(key)
52
+
53
+ text = "Hey there, how are you?"
54
+
55
+ data = aes.encrypt(text)
56
+
57
+ puts aes.decrypt(data) # "Hey there, how are you?"
58
+
59
+ Pretty simple, jah?
60
+
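The constructor derives the key size from the byte length of the string it is given, so only 16-, 24-, and 32-byte keys are accepted. A hex string is not decoded: each hex character counts as a full byte. The sketch below shows that check in plain Ruby; `VALID_KEY_BYTES` and `aes_key_bits` are hypothetical names used for illustration, not part of the gem.

```ruby
VALID_KEY_BYTES = [16, 24, 32].freeze # 128, 192, and 256 bits

# Return the key size in bits, or raise ArgumentError for an unsupported length.
def aes_key_bits(key)
  bytes = key.bytesize
  unless VALID_KEY_BYTES.include?(bytes)
    raise ArgumentError, "AES key must be 16, 24, or 32 bytes long (got #{bytes})"
  end
  bytes * 8
end

puts aes_key_bits("0123456789abcdef") # 128
```

Running a check like this before calling `FastAES.new` gives an early, readable error for keys of the wrong length.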
61
+ == Why AES?
62
+
63
+ === SSL vs AES
64
+
65
+ I'm going to guess you're using Ruby with Rails, which means you're doing 90+% web development.
66
+ In that case, if you need security, SSL is the obvious choice (and the right one).
67
+
68
+ But there will probably come a time, padawan, when you need a couple backend servers to talk -
69
+ maybe job servers, or an admin port, or whatever. Maybe even a simple chat server.
70
+
71
+ You can use SSL for this if you want it to be time-consuming to set up, painful to maintain, and
72
+ slow. Or you can use a different algorithm, such as AES. Setting up an SSH tunnel is another good
73
+ alternative (although AES is faster, and setup is slightly easier).
74
+
75
+ === AES vs Other Encryption Standards
76
+
77
+ There are a bazillion (literally!) different encryption standards out there. If you have
78
+ a PhD, and can't find a job, writing an encryption algorithm is a good thing to put on your resume -
79
+ on that outside chance that someone will hire you and use it. If you don't possess the talent to
80
+ write an encryption standard, you can spend hours trying to crack one - for similar reasons. As a
81
+ result, of the many encryption alternatives, most are either (a) cracked or (b) covered by patents.
82
+
83
+ Personally, when it comes to encryption, I think choosing what the US government chooses is a decent
84
+ choice. They tend to be "security conscious."
85
+
86
+ === Special Note
87
+
88
+ As this software deals with encryption/decryption, please note there is *NO* *WARRANTY*, not even
89
+ with regards to FITNESS FOR A PARTICULAR PURPOSE or NONINFRINGEMENT. This means if you use this
90
+ library, and it turns out there's a flaw in the implementation that results in your data being
91
+ hacked, *IT* *IS* *NOT* *MY* *FAULT*. It's YOUR responsibility to check the implementation of this
92
+ library and algorithm. If you can't understand C code, that's NOT MY PROBLEM.
93
+
94
+ == Author
95
+
96
+ Original AES C reference code by Christophe Devine. Thanks Christophe!
97
+
98
+ This gem copyright (c) 2010 {Nate Wiger}[http://nate.wiger.org]. Released under the MIT License.
99
+
100
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation
101
+ files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use,
102
+ copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the
103
+ Software is furnished to do so, subject to the following conditions:
104
+
105
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
106
+
107
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
108
+ OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
109
+ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
110
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,11 @@
1
+ # Loads mkmf which is used to make makefiles for Ruby extensions
2
+ require 'mkmf'
3
+
4
+ # Give it a name
5
+ extension_name = 'fast_aes'
6
+
7
+ # The destination
8
+ dir_config(extension_name)
9
+
10
+ # Do the work
11
+ create_makefile(extension_name)
@@ -0,0 +1,1023 @@
1
+ /*//////////////////////////////////////////////////////////////////////////////
2
+ ////////////////////////////////////////////////////////////////////////////////
3
+ //
4
+ // Part of the FastAES Ruby/C library implementation.
5
+ // Implementation in C originally by Christophe Devine.
6
+ //
7
+ ////////////////////////////////////////////////////////////////////////////////
8
+ //////////////////////////////////////////////////////////////////////////////*/
9
+
10
+ #include <string.h>
11
+ #include <stdio.h>
12
+ #include <stdint.h>
13
+
14
+ #include "ruby.h"
15
+ #include "fast_aes.h"
16
+
17
+ /* Global boolean */
18
+ int fast_aes_do_gen_tables = 1;
19
+
20
+ /* Old school. Oh yeah */
21
+ #ifndef RSTRING_PTR
22
+ #define RSTRING_PTR(s) (RSTRING(s)->ptr)
23
+ #define RSTRING_LEN(s) (RSTRING(s)->len)
24
+ #endif
25
+
26
+ /* Ruby buckets */
27
+ VALUE rb_cFastAES;
28
+
29
+ void Init_fast_aes()
30
+ {
31
+ rb_cFastAES = rb_define_class("FastAES", rb_cObject);
32
+
33
+ rb_define_alloc_func(rb_cFastAES, fast_aes_alloc);
34
+ rb_define_method(rb_cFastAES, "initialize", fast_aes_initialize, 1);
35
+ rb_define_method(rb_cFastAES, "encrypt", fast_aes_encrypt, 1);
36
+ rb_define_method(rb_cFastAES, "decrypt", fast_aes_decrypt, 1);
37
+ rb_define_method(rb_cFastAES, "key", fast_aes_key, 0);
38
+ }
39
+
40
+ VALUE fast_aes_key(VALUE self)
41
+ {
42
+ /* get our "self" data structure (eg, member vars) */
43
+ fast_aes_t* fast_aes;
44
+ Data_Get_Struct(self, fast_aes_t, fast_aes);
45
+ VALUE new_str = rb_str_new(fast_aes->key, fast_aes->key_bits/8);
46
+ return new_str;
47
+ }
48
+
49
+ VALUE fast_aes_alloc(VALUE klass)
50
+ {
51
+ /* Initialize our structs */
52
+ fast_aes_t *fast_aes = malloc(sizeof(fast_aes_t));
53
+
54
+ /* Clear out memory */
55
+ memset(fast_aes->key, 0, sizeof(fast_aes->key));
56
+ memset(fast_aes->erk, 0, sizeof(fast_aes->erk));
57
+ memset(fast_aes->drk, 0, sizeof(fast_aes->drk));
58
+ memset(fast_aes->initial_erk, 0, sizeof(fast_aes->initial_erk));
59
+ memset(fast_aes->initial_drk, 0, sizeof(fast_aes->initial_drk));
60
+
61
+ return Data_Wrap_Struct(klass, fast_aes_mark, fast_aes_free, fast_aes);
62
+ }
63
+
64
+ VALUE fast_aes_initialize(VALUE self, VALUE key)
65
+ {
66
+ /* get our "self" data structure (eg, member vars) */
67
+ fast_aes_t* fast_aes;
68
+ Data_Get_Struct(self, fast_aes_t, fast_aes);
69
+ char error_mesg[350];
70
+
71
+ int key_bits;
72
+ char* key_data = StringValuePtr(key);
73
+
74
+ /* since the tables are global there's no need to generate them more than once
75
+ * regardless of how many instances there are of this object
76
+ */
77
+ if( fast_aes_do_gen_tables == 1 )
78
+ {
79
+ fast_aes_gen_tables();
80
+ fast_aes_do_gen_tables = 0;
81
+ }
82
+
83
+ /* the key size is taken from the byte length of the string passed in;
84
+ * only 128, 192, and 256-bit keys (16, 24, or 32 bytes) are accepted.
85
+ * any other length raises ArgumentError instead of being truncated or padded
86
+ */
87
+ key_bits = (int)RSTRING_LEN(key)*8; /* byte length of the Ruby string; unlike strlen, safe for keys containing NUL bytes */
88
+ switch(key_bits)
89
+ {
90
+ case 128:
91
+ case 192:
92
+ case 256:
93
+ fast_aes->key_bits = key_bits;
94
+ memcpy(fast_aes->key, key_data, key_bits/8);
95
+ /*printf("AES key=%s, bits=%d\n", fast_aes->key, fast_aes->key_bits);*/
96
+ break;
97
+ default:
98
+ snprintf(error_mesg, sizeof(error_mesg), "AES key must be 128, 192, or 256 bits in length (got %d): %s", key_bits, key_data);
99
+ rb_raise(rb_eArgError, "%s", error_mesg);
100
+ return Qnil;
101
+ }
102
+
103
+ if (fast_aes_initialize_state(fast_aes)) {
104
+ rb_raise(rb_eRuntimeError, "Failed to initialize AES internal state");
105
+ return Qnil;
106
+ }
107
+ return Qtrue;
108
+ }
109
+
110
+ void fast_aes_module_shutdown( fast_aes_t* fast_aes )
111
+ {
112
+ }
113
+
114
+ /* This method **MUST** be present even if it does nothing */
115
+ void fast_aes_mark( fast_aes_t* fast_aes )
116
+ {
117
+ /*rb_gc_mark(??);
118
+ //should we mark each member here? */
119
+ }
120
+
121
+ void fast_aes_free( fast_aes_t* fast_aes )
122
+ {
123
+ fast_aes_module_shutdown(fast_aes);
124
+ free(fast_aes);
125
+ }
126
+
127
+ VALUE fast_aes_encrypt(
128
+ VALUE self,
129
+ VALUE buffer
130
+ )
131
+ {
132
+ /* get our "self" data structure (eg, member vars) */
133
+ fast_aes_t* fast_aes;
134
+ Data_Get_Struct(self, fast_aes_t, fast_aes);
135
+
136
+ char* pDataIn = StringValuePtr(buffer);
137
+ int uiNumBytesIn = RSTRING_LEN(buffer);
138
+ char* pDataOut = malloc((uiNumBytesIn + 15) & -16); /* auto-malloc min size in 16-byte increments */
139
+
140
+ unsigned char *pRead, *pWrite;
141
+ pRead = (unsigned char*)pDataIn;
142
+ pWrite = (unsigned char*)pDataOut;
143
+
144
+ /*//////////////////////////////////////////////////////////////////////////
145
+ ////////////////////////////////////////////////////////////////////////////
146
+ // This routine will encode all input bytes in entirety (AES always "succeeds")
147
+ */
148
+ int puiNumBytesOut = 0;
149
+
150
+ /* set the state back to the start to allow for correct encryption
151
+ * every time we are passed data to encrypt
152
+ */
153
+ if (fast_aes_reinitialize_state(fast_aes)) {
154
+ rb_raise(rb_eRuntimeError, "Failed to reinitialize AES internal state");
155
+ return Qnil;
156
+ }
157
+
158
+ /*//////////////////////////////////////////////////////////////////////////
159
+ ////////////////////////////////////////////////////////////////////////////
160
+ // Perform block encodes 16 bytes at a time while we still have at least
161
+ // 16 bytes of input remaining.
162
+ */
163
+ while( uiNumBytesIn >= 16 )
164
+ {
165
+ fast_aes_encrypt_block(fast_aes, pRead, pWrite);
166
+ pRead += 16; pWrite += 16;
167
+ uiNumBytesIn -= 16;
168
+ puiNumBytesOut += 16;
169
+ }
170
+
171
+ /*//////////////////////////////////////////////////////////////////////////
172
+ ////////////////////////////////////////////////////////////////////////////
173
+ // Have to catch any straggling bytes that are left after encoding the
174
+ // 16-byte blocks. The policy here will be to pad the input with zeros.
175
+ */
176
+ if( uiNumBytesIn > 0 )
177
+ {
178
+ unsigned char temp[16];
179
+ memset(temp, 0, sizeof(temp)); /* pad with 0's */
180
+ memcpy(temp, pRead, uiNumBytesIn);
181
+ fast_aes_encrypt_block(fast_aes, temp, pWrite);
182
+ puiNumBytesOut += 16;
183
+ }
184
+
185
+ /* return the encrypted string */
186
+ VALUE new_str = rb_str_new(pDataOut, puiNumBytesOut);
187
+ free(pDataOut);
188
+ return new_str;
189
+ }
190
+
191
+ VALUE fast_aes_decrypt(
192
+ VALUE self,
193
+ VALUE buffer
194
+ )
195
+ {
196
+ /* get our "self" data structure (eg, member vars) */
197
+ fast_aes_t* fast_aes;
198
+ Data_Get_Struct(self, fast_aes_t, fast_aes);
199
+
200
+ char* pDataIn = StringValuePtr(buffer);
201
+ int uiNumBytesIn = RSTRING_LEN(buffer);
202
+ /* output buffer, rounded up to a whole number of 16-byte blocks */
203
+ char* pDataOut = malloc((uiNumBytesIn + 15) & -16);
204
+
205
+ unsigned char *pRead, *pWrite;
206
+ pRead = (unsigned char*)pDataIn;
207
+ pWrite = (unsigned char*)pDataOut;
208
+
209
+ /*//////////////////////////////////////////////////////////////////////////
210
+ ////////////////////////////////////////////////////////////////////////////
211
+ // AES does not fail, and this routine will decode all input bytes
212
+ // entirely.
213
+ */
214
+ int puiNumBytesOut = 0;
215
+
216
+ /* set the state back to the start to allow for correct decryption
217
+ // every time we are passed data to decrypt
218
+ */
219
+ if (fast_aes_reinitialize_state(fast_aes)) {
220
+ rb_raise(rb_eRuntimeError, "Failed to reinitialize AES internal state");
221
+ return Qnil;
222
+ }
223
+
224
+ /*//////////////////////////////////////////////////////////////////////////
225
+ ////////////////////////////////////////////////////////////////////////////
226
+ // Perform block decodes 16 bytes at a time while we still have at least
227
+ // 16 bytes of input remaining.
228
+ */
229
+ while( uiNumBytesIn >= 16 )
230
+ {
231
+ fast_aes_decrypt_block(fast_aes, pRead, pWrite);
232
+ pRead += 16; pWrite += 16;
233
+ uiNumBytesIn -= 16;
234
+ puiNumBytesOut += 16;
235
+ }
236
+
237
+ /*//////////////////////////////////////////////////////////////////////////
238
+ ////////////////////////////////////////////////////////////////////////////
239
+ // Have to catch any straggling bytes that are left after decoding the
240
+ // 16-byte blocks. Ciphertext from encrypt is always a multiple of 16
241
+ // bytes, so this only runs on malformed input: the partial block is
242
+ // zero-padded and decrypted. Trailing zeros are stripped further down.
243
+ */
244
+ if( uiNumBytesIn > 0 )
245
+ {
246
+ unsigned char temp[16];
247
+ memset(temp, 0, sizeof(temp)); /* pad with 0's */
248
+ memcpy(temp, pRead, uiNumBytesIn);
249
+ fast_aes_decrypt_block(fast_aes, temp, pWrite);
250
+ puiNumBytesOut += 16;
251
+ }
252
+
253
+ /* Strip the trailing zero padding added by encrypt. Note that this also
254
+ * drops any zero bytes that were genuinely at the end of the plaintext.
255
+ */
256
+ while (puiNumBytesOut > 0) {
257
+ if (pDataOut[puiNumBytesOut - 1] != 0) break;
258
+ puiNumBytesOut -= 1;
259
+ }
260
+
261
+ /* return the decrypted string */
262
+ VALUE new_str = rb_str_new(pDataOut, puiNumBytesOut);
263
+ free(pDataOut);
264
+ return new_str;
265
+ }
266
+
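The zero-stripping loop at the end of `fast_aes_decrypt` cannot tell padding from data, so plaintext that legitimately ends in zero bytes comes back shorter. A small Ruby sketch of the same loop shows the effect; the helper name `strip_trailing_zeros` is made up for illustration.

```ruby
# Remove trailing zero bytes from a binary string, as the C loop does.
def strip_trailing_zeros(data)
  bytes = data.b
  length = bytes.bytesize
  length -= 1 while length > 0 && bytes.getbyte(length - 1).zero?
  bytes[0, length]
end

# Text padded with zeros round-trips cleanly.
p strip_trailing_zeros("hello\0\0\0").bytesize          # 5
# Binary data ending in zero bytes loses them as well.
p strip_trailing_zeros("\x01\x02\x00\x00".b).bytes      # [1, 2]
```

This is harmless for ordinary text, but callers encrypting binary payloads may want to carry the original length alongside the ciphertext.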
267
+ /*//////////////////////////////////////////////////////////////////////////////
268
+ //////////////////////////////////////////////////////////////////////////////*/
269
+
270
+ /* uncomment the following line to use pre-computed tables */
271
+ /* otherwise the tables will be generated at the first run */
272
+ //#define FIXED_TABLES
273
+
274
+ #ifndef FIXED_TABLES
275
+
276
+ /* forward S-box & tables */
277
+
278
+ uint32_t FSb[256];
279
+ uint32_t FT0[256];
280
+ uint32_t FT1[256];
281
+ uint32_t FT2[256];
282
+ uint32_t FT3[256];
283
+
284
+ /* reverse S-box & tables */
285
+
286
+ uint32_t RSb[256];
287
+ uint32_t RT0[256];
288
+ uint32_t RT1[256];
289
+ uint32_t RT2[256];
290
+ uint32_t RT3[256];
291
+
292
+ /* round constants */
293
+
294
+ uint32_t RCON[10];
295
+
296
+ /* tables generation flag */
297
+
298
+ /* tables generation routine */
299
+
300
+ #define ROTR8(x) ( ( ( x << 24 ) & 0xFFFFFFFF ) | \
301
+ ( ( x & 0xFFFFFFFF ) >> 8 ) )
302
+
303
+ #define XTIME(x) ( ( x << 1 ) ^ ( ( x & 0x80 ) ? 0x1B : 0x00 ) )
304
+ #define MUL(x,y) ( ( x && y ) ? pow[(log[x] + log[y]) % 255] : 0 )
305
+
306
+ void fast_aes_gen_tables( void )
307
+ {
308
+ int i;
309
+ uint8_t x, y;
310
+ uint8_t pow[256];
311
+ uint8_t log[256];
312
+
313
+ /* compute pow and log tables over GF(2^8) */
314
+
315
+ for( i = 0, x = 1; i < 256; i++, x ^= XTIME( x ) )
316
+ {
317
+ pow[i] = x;
318
+ log[x] = i;
319
+ }
320
+
321
+ /* calculate the round constants */
322
+
323
+ for( i = 0, x = 1; i < 10; i++, x = XTIME( x ) )
324
+ {
325
+ RCON[i] = (uint32_t) x << 24;
326
+ }
327
+
328
+ /* generate the forward and reverse S-boxes */
329
+
330
+ FSb[0x00] = 0x63;
331
+ RSb[0x63] = 0x00;
332
+
333
+ for( i = 1; i < 256; ++i )
334
+ {
335
+ x = pow[255 - log[i]];
336
+
337
+ y = x; y = ( y << 1 ) | ( y >> 7 );
338
+ x ^= y; y = ( y << 1 ) | ( y >> 7 );
339
+ x ^= y; y = ( y << 1 ) | ( y >> 7 );
340
+ x ^= y; y = ( y << 1 ) | ( y >> 7 );
341
+ x ^= y ^ 0x63;
342
+
343
+ FSb[i] = x;
344
+ RSb[x] = i;
345
+ }
346
+
347
+ /* generate the forward and reverse tables */
348
+
349
+ for( i = 0; i < 256; ++i )
350
+ {
351
+ x = (unsigned char) FSb[i]; y = XTIME( x );
352
+
353
+ FT0[i] = (uint32_t) ( x ^ y ) ^
354
+ ( (uint32_t) x << 8 ) ^
355
+ ( (uint32_t) x << 16 ) ^
356
+ ( (uint32_t) y << 24 );
357
+
358
+ FT0[i] &= 0xFFFFFFFF;
359
+
360
+ FT1[i] = ROTR8( FT0[i] );
361
+ FT2[i] = ROTR8( FT1[i] );
362
+ FT3[i] = ROTR8( FT2[i] );
363
+
364
+ y = (unsigned char) RSb[i];
365
+
366
+ RT0[i] = ( (uint32_t) MUL( 0x0B, y ) ) ^
367
+ ( (uint32_t) MUL( 0x0D, y ) << 8 ) ^
368
+ ( (uint32_t) MUL( 0x09, y ) << 16 ) ^
369
+ ( (uint32_t) MUL( 0x0E, y ) << 24 );
370
+
371
+ RT0[i] &= 0xFFFFFFFF;
372
+
373
+ RT1[i] = ROTR8( RT0[i] );
374
+ RT2[i] = ROTR8( RT1[i] );
375
+ RT3[i] = ROTR8( RT2[i] );
376
+ }
377
+ }
378
+
379
+ #else
380
+
381
+ /* forward S-box */
382
+
383
+ static const uint32_t FSb[256] =
384
+ {
385
+ 0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5,
386
+ 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
387
+ 0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0,
388
+ 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0,
389
+ 0xB7, 0xFD, 0x93, 0x26, 0x36, 0x3F, 0xF7, 0xCC,
390
+ 0x34, 0xA5, 0xE5, 0xF1, 0x71, 0xD8, 0x31, 0x15,
391
+ 0x04, 0xC7, 0x23, 0xC3, 0x18, 0x96, 0x05, 0x9A,
392
+ 0x07, 0x12, 0x80, 0xE2, 0xEB, 0x27, 0xB2, 0x75,
393
+ 0x09, 0x83, 0x2C, 0x1A, 0x1B, 0x6E, 0x5A, 0xA0,
394
+ 0x52, 0x3B, 0xD6, 0xB3, 0x29, 0xE3, 0x2F, 0x84,
395
+ 0x53, 0xD1, 0x00, 0xED, 0x20, 0xFC, 0xB1, 0x5B,
396
+ 0x6A, 0xCB, 0xBE, 0x39, 0x4A, 0x4C, 0x58, 0xCF,
397
+ 0xD0, 0xEF, 0xAA, 0xFB, 0x43, 0x4D, 0x33, 0x85,
398
+ 0x45, 0xF9, 0x02, 0x7F, 0x50, 0x3C, 0x9F, 0xA8,
399
+ 0x51, 0xA3, 0x40, 0x8F, 0x92, 0x9D, 0x38, 0xF5,
400
+ 0xBC, 0xB6, 0xDA, 0x21, 0x10, 0xFF, 0xF3, 0xD2,
401
+ 0xCD, 0x0C, 0x13, 0xEC, 0x5F, 0x97, 0x44, 0x17,
402
+ 0xC4, 0xA7, 0x7E, 0x3D, 0x64, 0x5D, 0x19, 0x73,
403
+ 0x60, 0x81, 0x4F, 0xDC, 0x22, 0x2A, 0x90, 0x88,
404
+ 0x46, 0xEE, 0xB8, 0x14, 0xDE, 0x5E, 0x0B, 0xDB,
405
+ 0xE0, 0x32, 0x3A, 0x0A, 0x49, 0x06, 0x24, 0x5C,
406
+ 0xC2, 0xD3, 0xAC, 0x62, 0x91, 0x95, 0xE4, 0x79,
407
+ 0xE7, 0xC8, 0x37, 0x6D, 0x8D, 0xD5, 0x4E, 0xA9,
408
+ 0x6C, 0x56, 0xF4, 0xEA, 0x65, 0x7A, 0xAE, 0x08,
409
+ 0xBA, 0x78, 0x25, 0x2E, 0x1C, 0xA6, 0xB4, 0xC6,
410
+ 0xE8, 0xDD, 0x74, 0x1F, 0x4B, 0xBD, 0x8B, 0x8A,
411
+ 0x70, 0x3E, 0xB5, 0x66, 0x48, 0x03, 0xF6, 0x0E,
412
+ 0x61, 0x35, 0x57, 0xB9, 0x86, 0xC1, 0x1D, 0x9E,
413
+ 0xE1, 0xF8, 0x98, 0x11, 0x69, 0xD9, 0x8E, 0x94,
414
+ 0x9B, 0x1E, 0x87, 0xE9, 0xCE, 0x55, 0x28, 0xDF,
415
+ 0x8C, 0xA1, 0x89, 0x0D, 0xBF, 0xE6, 0x42, 0x68,
416
+ 0x41, 0x99, 0x2D, 0x0F, 0xB0, 0x54, 0xBB, 0x16
417
+ };
418
+
419
+ /* forward tables */
420
+
421
+ #define FT \
422
+ \
423
+ V(C6,63,63,A5), V(F8,7C,7C,84), V(EE,77,77,99), V(F6,7B,7B,8D), \
424
+ V(FF,F2,F2,0D), V(D6,6B,6B,BD), V(DE,6F,6F,B1), V(91,C5,C5,54), \
425
+ V(60,30,30,50), V(02,01,01,03), V(CE,67,67,A9), V(56,2B,2B,7D), \
426
+ V(E7,FE,FE,19), V(B5,D7,D7,62), V(4D,AB,AB,E6), V(EC,76,76,9A), \
427
+ V(8F,CA,CA,45), V(1F,82,82,9D), V(89,C9,C9,40), V(FA,7D,7D,87), \
428
+ V(EF,FA,FA,15), V(B2,59,59,EB), V(8E,47,47,C9), V(FB,F0,F0,0B), \
429
+ V(41,AD,AD,EC), V(B3,D4,D4,67), V(5F,A2,A2,FD), V(45,AF,AF,EA), \
430
+ V(23,9C,9C,BF), V(53,A4,A4,F7), V(E4,72,72,96), V(9B,C0,C0,5B), \
431
+ V(75,B7,B7,C2), V(E1,FD,FD,1C), V(3D,93,93,AE), V(4C,26,26,6A), \
432
+ V(6C,36,36,5A), V(7E,3F,3F,41), V(F5,F7,F7,02), V(83,CC,CC,4F), \
433
+ V(68,34,34,5C), V(51,A5,A5,F4), V(D1,E5,E5,34), V(F9,F1,F1,08), \
434
+ V(E2,71,71,93), V(AB,D8,D8,73), V(62,31,31,53), V(2A,15,15,3F), \
435
+ V(08,04,04,0C), V(95,C7,C7,52), V(46,23,23,65), V(9D,C3,C3,5E), \
436
+ V(30,18,18,28), V(37,96,96,A1), V(0A,05,05,0F), V(2F,9A,9A,B5), \
437
+ V(0E,07,07,09), V(24,12,12,36), V(1B,80,80,9B), V(DF,E2,E2,3D), \
438
+ V(CD,EB,EB,26), V(4E,27,27,69), V(7F,B2,B2,CD), V(EA,75,75,9F), \
439
+ V(12,09,09,1B), V(1D,83,83,9E), V(58,2C,2C,74), V(34,1A,1A,2E), \
440
+ V(36,1B,1B,2D), V(DC,6E,6E,B2), V(B4,5A,5A,EE), V(5B,A0,A0,FB), \
441
+ V(A4,52,52,F6), V(76,3B,3B,4D), V(B7,D6,D6,61), V(7D,B3,B3,CE), \
442
+ V(52,29,29,7B), V(DD,E3,E3,3E), V(5E,2F,2F,71), V(13,84,84,97), \
443
+ V(A6,53,53,F5), V(B9,D1,D1,68), V(00,00,00,00), V(C1,ED,ED,2C), \
444
+ V(40,20,20,60), V(E3,FC,FC,1F), V(79,B1,B1,C8), V(B6,5B,5B,ED), \
445
+ V(D4,6A,6A,BE), V(8D,CB,CB,46), V(67,BE,BE,D9), V(72,39,39,4B), \
446
+ V(94,4A,4A,DE), V(98,4C,4C,D4), V(B0,58,58,E8), V(85,CF,CF,4A), \
447
+ V(BB,D0,D0,6B), V(C5,EF,EF,2A), V(4F,AA,AA,E5), V(ED,FB,FB,16), \
448
+ V(86,43,43,C5), V(9A,4D,4D,D7), V(66,33,33,55), V(11,85,85,94), \
449
+ V(8A,45,45,CF), V(E9,F9,F9,10), V(04,02,02,06), V(FE,7F,7F,81), \
450
+ V(A0,50,50,F0), V(78,3C,3C,44), V(25,9F,9F,BA), V(4B,A8,A8,E3), \
451
+ V(A2,51,51,F3), V(5D,A3,A3,FE), V(80,40,40,C0), V(05,8F,8F,8A), \
452
+ V(3F,92,92,AD), V(21,9D,9D,BC), V(70,38,38,48), V(F1,F5,F5,04), \
453
+ V(63,BC,BC,DF), V(77,B6,B6,C1), V(AF,DA,DA,75), V(42,21,21,63), \
454
+ V(20,10,10,30), V(E5,FF,FF,1A), V(FD,F3,F3,0E), V(BF,D2,D2,6D), \
455
+ V(81,CD,CD,4C), V(18,0C,0C,14), V(26,13,13,35), V(C3,EC,EC,2F), \
456
+ V(BE,5F,5F,E1), V(35,97,97,A2), V(88,44,44,CC), V(2E,17,17,39), \
457
+ V(93,C4,C4,57), V(55,A7,A7,F2), V(FC,7E,7E,82), V(7A,3D,3D,47), \
458
+ V(C8,64,64,AC), V(BA,5D,5D,E7), V(32,19,19,2B), V(E6,73,73,95), \
459
+ V(C0,60,60,A0), V(19,81,81,98), V(9E,4F,4F,D1), V(A3,DC,DC,7F), \
460
+ V(44,22,22,66), V(54,2A,2A,7E), V(3B,90,90,AB), V(0B,88,88,83), \
461
+ V(8C,46,46,CA), V(C7,EE,EE,29), V(6B,B8,B8,D3), V(28,14,14,3C), \
462
+ V(A7,DE,DE,79), V(BC,5E,5E,E2), V(16,0B,0B,1D), V(AD,DB,DB,76), \
463
+ V(DB,E0,E0,3B), V(64,32,32,56), V(74,3A,3A,4E), V(14,0A,0A,1E), \
464
+ V(92,49,49,DB), V(0C,06,06,0A), V(48,24,24,6C), V(B8,5C,5C,E4), \
465
+ V(9F,C2,C2,5D), V(BD,D3,D3,6E), V(43,AC,AC,EF), V(C4,62,62,A6), \
466
+ V(39,91,91,A8), V(31,95,95,A4), V(D3,E4,E4,37), V(F2,79,79,8B), \
467
+ V(D5,E7,E7,32), V(8B,C8,C8,43), V(6E,37,37,59), V(DA,6D,6D,B7), \
468
+ V(01,8D,8D,8C), V(B1,D5,D5,64), V(9C,4E,4E,D2), V(49,A9,A9,E0), \
469
+ V(D8,6C,6C,B4), V(AC,56,56,FA), V(F3,F4,F4,07), V(CF,EA,EA,25), \
470
+ V(CA,65,65,AF), V(F4,7A,7A,8E), V(47,AE,AE,E9), V(10,08,08,18), \
471
+ V(6F,BA,BA,D5), V(F0,78,78,88), V(4A,25,25,6F), V(5C,2E,2E,72), \
472
+ V(38,1C,1C,24), V(57,A6,A6,F1), V(73,B4,B4,C7), V(97,C6,C6,51), \
473
+ V(CB,E8,E8,23), V(A1,DD,DD,7C), V(E8,74,74,9C), V(3E,1F,1F,21), \
474
+ V(96,4B,4B,DD), V(61,BD,BD,DC), V(0D,8B,8B,86), V(0F,8A,8A,85), \
475
+ V(E0,70,70,90), V(7C,3E,3E,42), V(71,B5,B5,C4), V(CC,66,66,AA), \
476
+ V(90,48,48,D8), V(06,03,03,05), V(F7,F6,F6,01), V(1C,0E,0E,12), \
477
+ V(C2,61,61,A3), V(6A,35,35,5F), V(AE,57,57,F9), V(69,B9,B9,D0), \
478
+ V(17,86,86,91), V(99,C1,C1,58), V(3A,1D,1D,27), V(27,9E,9E,B9), \
479
+ V(D9,E1,E1,38), V(EB,F8,F8,13), V(2B,98,98,B3), V(22,11,11,33), \
480
+ V(D2,69,69,BB), V(A9,D9,D9,70), V(07,8E,8E,89), V(33,94,94,A7), \
481
+ V(2D,9B,9B,B6), V(3C,1E,1E,22), V(15,87,87,92), V(C9,E9,E9,20), \
482
+ V(87,CE,CE,49), V(AA,55,55,FF), V(50,28,28,78), V(A5,DF,DF,7A), \
483
+ V(03,8C,8C,8F), V(59,A1,A1,F8), V(09,89,89,80), V(1A,0D,0D,17), \
484
+ V(65,BF,BF,DA), V(D7,E6,E6,31), V(84,42,42,C6), V(D0,68,68,B8), \
485
+ V(82,41,41,C3), V(29,99,99,B0), V(5A,2D,2D,77), V(1E,0F,0F,11), \
486
+ V(7B,B0,B0,CB), V(A8,54,54,FC), V(6D,BB,BB,D6), V(2C,16,16,3A)
487
+
488
+ #define V(a,b,c,d) 0x##a##b##c##d
489
+ static const uint32_t FT0[256] = { FT };
490
+ #undef V
491
+
492
+ #define V(a,b,c,d) 0x##d##a##b##c
493
+ static const uint32_t FT1[256] = { FT };
494
+ #undef V
495
+
496
+ #define V(a,b,c,d) 0x##c##d##a##b
497
+ static const uint32_t FT2[256] = { FT };
498
+ #undef V
499
+
500
+ #define V(a,b,c,d) 0x##b##c##d##a
501
+ static const uint32_t FT3[256] = { FT };
502
+ #undef V
503
+
504
+ #undef FT
505
+
506
+ /* reverse S-box */
507
+
508
+ static const uint32_t RSb[256] =
509
+ {
510
+ 0x52, 0x09, 0x6A, 0xD5, 0x30, 0x36, 0xA5, 0x38,
511
+ 0xBF, 0x40, 0xA3, 0x9E, 0x81, 0xF3, 0xD7, 0xFB,
512
+ 0x7C, 0xE3, 0x39, 0x82, 0x9B, 0x2F, 0xFF, 0x87,
513
+ 0x34, 0x8E, 0x43, 0x44, 0xC4, 0xDE, 0xE9, 0xCB,
514
+ 0x54, 0x7B, 0x94, 0x32, 0xA6, 0xC2, 0x23, 0x3D,
515
+ 0xEE, 0x4C, 0x95, 0x0B, 0x42, 0xFA, 0xC3, 0x4E,
516
+ 0x08, 0x2E, 0xA1, 0x66, 0x28, 0xD9, 0x24, 0xB2,
517
+ 0x76, 0x5B, 0xA2, 0x49, 0x6D, 0x8B, 0xD1, 0x25,
518
+ 0x72, 0xF8, 0xF6, 0x64, 0x86, 0x68, 0x98, 0x16,
519
+ 0xD4, 0xA4, 0x5C, 0xCC, 0x5D, 0x65, 0xB6, 0x92,
520
+ 0x6C, 0x70, 0x48, 0x50, 0xFD, 0xED, 0xB9, 0xDA,
521
+ 0x5E, 0x15, 0x46, 0x57, 0xA7, 0x8D, 0x9D, 0x84,
522
+ 0x90, 0xD8, 0xAB, 0x00, 0x8C, 0xBC, 0xD3, 0x0A,
523
+ 0xF7, 0xE4, 0x58, 0x05, 0xB8, 0xB3, 0x45, 0x06,
524
+ 0xD0, 0x2C, 0x1E, 0x8F, 0xCA, 0x3F, 0x0F, 0x02,
525
+ 0xC1, 0xAF, 0xBD, 0x03, 0x01, 0x13, 0x8A, 0x6B,
526
+ 0x3A, 0x91, 0x11, 0x41, 0x4F, 0x67, 0xDC, 0xEA,
527
+ 0x97, 0xF2, 0xCF, 0xCE, 0xF0, 0xB4, 0xE6, 0x73,
528
+ 0x96, 0xAC, 0x74, 0x22, 0xE7, 0xAD, 0x35, 0x85,
529
+ 0xE2, 0xF9, 0x37, 0xE8, 0x1C, 0x75, 0xDF, 0x6E,
530
+ 0x47, 0xF1, 0x1A, 0x71, 0x1D, 0x29, 0xC5, 0x89,
531
+ 0x6F, 0xB7, 0x62, 0x0E, 0xAA, 0x18, 0xBE, 0x1B,
532
+ 0xFC, 0x56, 0x3E, 0x4B, 0xC6, 0xD2, 0x79, 0x20,
533
+ 0x9A, 0xDB, 0xC0, 0xFE, 0x78, 0xCD, 0x5A, 0xF4,
534
+ 0x1F, 0xDD, 0xA8, 0x33, 0x88, 0x07, 0xC7, 0x31,
535
+ 0xB1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xEC, 0x5F,
536
+ 0x60, 0x51, 0x7F, 0xA9, 0x19, 0xB5, 0x4A, 0x0D,
537
+ 0x2D, 0xE5, 0x7A, 0x9F, 0x93, 0xC9, 0x9C, 0xEF,
538
+ 0xA0, 0xE0, 0x3B, 0x4D, 0xAE, 0x2A, 0xF5, 0xB0,
539
+ 0xC8, 0xEB, 0xBB, 0x3C, 0x83, 0x53, 0x99, 0x61,
540
+ 0x17, 0x2B, 0x04, 0x7E, 0xBA, 0x77, 0xD6, 0x26,
541
+ 0xE1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0C, 0x7D
542
+ };
543
+
544
+ /* reverse tables */
545
+
546
+ #define RT \
547
+ \
548
+ V(51,F4,A7,50), V(7E,41,65,53), V(1A,17,A4,C3), V(3A,27,5E,96), \
549
+ V(3B,AB,6B,CB), V(1F,9D,45,F1), V(AC,FA,58,AB), V(4B,E3,03,93), \
550
+ V(20,30,FA,55), V(AD,76,6D,F6), V(88,CC,76,91), V(F5,02,4C,25), \
551
+ V(4F,E5,D7,FC), V(C5,2A,CB,D7), V(26,35,44,80), V(B5,62,A3,8F), \
552
+ V(DE,B1,5A,49), V(25,BA,1B,67), V(45,EA,0E,98), V(5D,FE,C0,E1), \
553
+ V(C3,2F,75,02), V(81,4C,F0,12), V(8D,46,97,A3), V(6B,D3,F9,C6), \
554
+ V(03,8F,5F,E7), V(15,92,9C,95), V(BF,6D,7A,EB), V(95,52,59,DA), \
555
+ V(D4,BE,83,2D), V(58,74,21,D3), V(49,E0,69,29), V(8E,C9,C8,44), \
556
+ V(75,C2,89,6A), V(F4,8E,79,78), V(99,58,3E,6B), V(27,B9,71,DD), \
557
+ V(BE,E1,4F,B6), V(F0,88,AD,17), V(C9,20,AC,66), V(7D,CE,3A,B4), \
558
+ V(63,DF,4A,18), V(E5,1A,31,82), V(97,51,33,60), V(62,53,7F,45), \
559
+ V(B1,64,77,E0), V(BB,6B,AE,84), V(FE,81,A0,1C), V(F9,08,2B,94), \
560
+ V(70,48,68,58), V(8F,45,FD,19), V(94,DE,6C,87), V(52,7B,F8,B7), \
561
+ V(AB,73,D3,23), V(72,4B,02,E2), V(E3,1F,8F,57), V(66,55,AB,2A), \
562
+ V(B2,EB,28,07), V(2F,B5,C2,03), V(86,C5,7B,9A), V(D3,37,08,A5), \
563
+ V(30,28,87,F2), V(23,BF,A5,B2), V(02,03,6A,BA), V(ED,16,82,5C), \
564
+ V(8A,CF,1C,2B), V(A7,79,B4,92), V(F3,07,F2,F0), V(4E,69,E2,A1), \
565
+ V(65,DA,F4,CD), V(06,05,BE,D5), V(D1,34,62,1F), V(C4,A6,FE,8A), \
566
+ V(34,2E,53,9D), V(A2,F3,55,A0), V(05,8A,E1,32), V(A4,F6,EB,75), \
567
+ V(0B,83,EC,39), V(40,60,EF,AA), V(5E,71,9F,06), V(BD,6E,10,51), \
568
+ V(3E,21,8A,F9), V(96,DD,06,3D), V(DD,3E,05,AE), V(4D,E6,BD,46), \
569
+ V(91,54,8D,B5), V(71,C4,5D,05), V(04,06,D4,6F), V(60,50,15,FF), \
570
+ V(19,98,FB,24), V(D6,BD,E9,97), V(89,40,43,CC), V(67,D9,9E,77), \
571
+ V(B0,E8,42,BD), V(07,89,8B,88), V(E7,19,5B,38), V(79,C8,EE,DB), \
572
+ V(A1,7C,0A,47), V(7C,42,0F,E9), V(F8,84,1E,C9), V(00,00,00,00), \
573
+ V(09,80,86,83), V(32,2B,ED,48), V(1E,11,70,AC), V(6C,5A,72,4E), \
574
+ V(FD,0E,FF,FB), V(0F,85,38,56), V(3D,AE,D5,1E), V(36,2D,39,27), \
575
+ V(0A,0F,D9,64), V(68,5C,A6,21), V(9B,5B,54,D1), V(24,36,2E,3A), \
576
+ V(0C,0A,67,B1), V(93,57,E7,0F), V(B4,EE,96,D2), V(1B,9B,91,9E), \
577
+ V(80,C0,C5,4F), V(61,DC,20,A2), V(5A,77,4B,69), V(1C,12,1A,16), \
578
+ V(E2,93,BA,0A), V(C0,A0,2A,E5), V(3C,22,E0,43), V(12,1B,17,1D), \
579
+ V(0E,09,0D,0B), V(F2,8B,C7,AD), V(2D,B6,A8,B9), V(14,1E,A9,C8), \
580
+ V(57,F1,19,85), V(AF,75,07,4C), V(EE,99,DD,BB), V(A3,7F,60,FD), \
581
+ V(F7,01,26,9F), V(5C,72,F5,BC), V(44,66,3B,C5), V(5B,FB,7E,34), \
582
+ V(8B,43,29,76), V(CB,23,C6,DC), V(B6,ED,FC,68), V(B8,E4,F1,63), \
583
+ V(D7,31,DC,CA), V(42,63,85,10), V(13,97,22,40), V(84,C6,11,20), \
584
+ V(85,4A,24,7D), V(D2,BB,3D,F8), V(AE,F9,32,11), V(C7,29,A1,6D), \
585
+ V(1D,9E,2F,4B), V(DC,B2,30,F3), V(0D,86,52,EC), V(77,C1,E3,D0), \
586
+ V(2B,B3,16,6C), V(A9,70,B9,99), V(11,94,48,FA), V(47,E9,64,22), \
587
+ V(A8,FC,8C,C4), V(A0,F0,3F,1A), V(56,7D,2C,D8), V(22,33,90,EF), \
588
+ V(87,49,4E,C7), V(D9,38,D1,C1), V(8C,CA,A2,FE), V(98,D4,0B,36), \
589
+ V(A6,F5,81,CF), V(A5,7A,DE,28), V(DA,B7,8E,26), V(3F,AD,BF,A4), \
590
+ V(2C,3A,9D,E4), V(50,78,92,0D), V(6A,5F,CC,9B), V(54,7E,46,62), \
591
+ V(F6,8D,13,C2), V(90,D8,B8,E8), V(2E,39,F7,5E), V(82,C3,AF,F5), \
592
+ V(9F,5D,80,BE), V(69,D0,93,7C), V(6F,D5,2D,A9), V(CF,25,12,B3), \
593
+ V(C8,AC,99,3B), V(10,18,7D,A7), V(E8,9C,63,6E), V(DB,3B,BB,7B), \
594
+ V(CD,26,78,09), V(6E,59,18,F4), V(EC,9A,B7,01), V(83,4F,9A,A8), \
595
+ V(E6,95,6E,65), V(AA,FF,E6,7E), V(21,BC,CF,08), V(EF,15,E8,E6), \
596
+ V(BA,E7,9B,D9), V(4A,6F,36,CE), V(EA,9F,09,D4), V(29,B0,7C,D6), \
597
+ V(31,A4,B2,AF), V(2A,3F,23,31), V(C6,A5,94,30), V(35,A2,66,C0), \
598
+ V(74,4E,BC,37), V(FC,82,CA,A6), V(E0,90,D0,B0), V(33,A7,D8,15), \
599
+ V(F1,04,98,4A), V(41,EC,DA,F7), V(7F,CD,50,0E), V(17,91,F6,2F), \
600
+ V(76,4D,D6,8D), V(43,EF,B0,4D), V(CC,AA,4D,54), V(E4,96,04,DF), \
601
+ V(9E,D1,B5,E3), V(4C,6A,88,1B), V(C1,2C,1F,B8), V(46,65,51,7F), \
602
+ V(9D,5E,EA,04), V(01,8C,35,5D), V(FA,87,74,73), V(FB,0B,41,2E), \
603
+ V(B3,67,1D,5A), V(92,DB,D2,52), V(E9,10,56,33), V(6D,D6,47,13), \
604
+ V(9A,D7,61,8C), V(37,A1,0C,7A), V(59,F8,14,8E), V(EB,13,3C,89), \
605
+ V(CE,A9,27,EE), V(B7,61,C9,35), V(E1,1C,E5,ED), V(7A,47,B1,3C), \
606
+ V(9C,D2,DF,59), V(55,F2,73,3F), V(18,14,CE,79), V(73,C7,37,BF), \
607
+ V(53,F7,CD,EA), V(5F,FD,AA,5B), V(DF,3D,6F,14), V(78,44,DB,86), \
608
+ V(CA,AF,F3,81), V(B9,68,C4,3E), V(38,24,34,2C), V(C2,A3,40,5F), \
609
+ V(16,1D,C3,72), V(BC,E2,25,0C), V(28,3C,49,8B), V(FF,0D,95,41), \
610
+ V(39,A8,01,71), V(08,0C,B3,DE), V(D8,B4,E4,9C), V(64,56,C1,90), \
611
+ V(7B,CB,84,61), V(D5,32,B6,70), V(48,6C,5C,74), V(D0,B8,57,42)
612
+
613
+ #define V(a,b,c,d) 0x##a##b##c##d
614
+ static const uint32_t RT0[256] = { RT };
615
+ #undef V
+
+ #define V(a,b,c,d) 0x##d##a##b##c
+ static const uint32_t RT1[256] = { RT };
+ #undef V
+
+ #define V(a,b,c,d) 0x##c##d##a##b
+ static const uint32_t RT2[256] = { RT };
+ #undef V
+
+ #define V(a,b,c,d) 0x##b##c##d##a
+ static const uint32_t RT3[256] = { RT };
+ #undef V
+
+ #undef RT
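The four `RT` tables above are produced from one list that is expanded four times under different definitions of `V`, each of which pastes the same four bytes together in a rotated order. A minimal sketch of the same trick on a four-entry excerpt of that list; the names `DEMO_RT`, `demo_rt0`, `demo_rt1` and `demo_rotr32` are hypothetical and not part of the gem:

```c
#include <stdint.h>

/* Hypothetical four-entry excerpt; the real RT list has 256 entries. */
#define DEMO_RT \
    V(1D,9E,2F,4B), V(DC,B2,30,F3), \
    V(0D,86,52,EC), V(77,C1,E3,D0)

/* Same byte order as RT0: a b c d */
#define V(a,b,c,d) 0x##a##b##c##d
static const uint32_t demo_rt0[4] = { DEMO_RT };
#undef V

/* Same byte order as RT1: d a b c */
#define V(a,b,c,d) 0x##d##a##b##c
static const uint32_t demo_rt1[4] = { DEMO_RT };
#undef V

#undef DEMO_RT

/* Rotate a 32-bit word right by n bits (valid for 0 < n < 32). */
static uint32_t demo_rotr32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32u - n));
}
```

With this ordering every `demo_rt1` entry equals the matching `demo_rt0` entry rotated right by 8 bits, so the extra tables presumably exist to avoid doing that rotation on every lookup.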
+
+ /* round constants */
+
+ static const uint32_t RCON[10] =
+ {
+ 0x01000000, 0x02000000, 0x04000000, 0x08000000,
+ 0x10000000, 0x20000000, 0x40000000, 0x80000000,
+ 0x1B000000, 0x36000000
+ };
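The ten `RCON` values are not arbitrary: each is the previous one doubled in GF(2^8), reduced by the AES polynomial, and stored in the top byte of the word. A short sketch that regenerates them, assuming hypothetical helper names `demo_xtime` and `demo_make_rcon` that do not exist in this file:

```c
#include <stdint.h>

/* Multiply by x (that is, by 2) in GF(2^8), reducing by the AES polynomial 0x11B. */
static uint8_t demo_xtime(uint8_t v)
{
    return (uint8_t) ((v << 1) ^ ((v & 0x80) ? 0x1B : 0x00));
}

/* Fill out[0..9] with the round constants, each placed in the top byte of a word. */
static void demo_make_rcon(uint32_t out[10])
{
    uint8_t r = 0x01;
    int i;

    for (i = 0; i < 10; ++i) {
        out[i] = (uint32_t) r << 24;
        r = demo_xtime(r);
    }
}
```

The reduction first takes effect on the ninth value, which is why the table jumps from `0x80000000` to `0x1B000000`.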
+
+ void aes_gen_tables( void )
+ {
+ }
+
+ #endif
+
+ /* platform-independent 32-bit integer manipulation macros */
+
+ #define GET_UINT32(n,b,i) \
+ { \
+ (n) = ( (uint32_t) (b)[(i) ] << 24 ) \
+ | ( (uint32_t) (b)[(i) + 1] << 16 ) \
+ | ( (uint32_t) (b)[(i) + 2] << 8 ) \
+ | ( (uint32_t) (b)[(i) + 3] ); \
+ }
+
+ #define PUT_UINT32(n,b,i) \
+ { \
+ (b)[(i) ] = (uint8_t) ( (n) >> 24 ); \
+ (b)[(i) + 1] = (uint8_t) ( (n) >> 16 ); \
+ (b)[(i) + 2] = (uint8_t) ( (n) >> 8 ); \
+ (b)[(i) + 3] = (uint8_t) ( (n) ); \
+ }
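`GET_UINT32` and `PUT_UINT32` read and write a 32-bit word in big-endian byte order regardless of the host's endianness, because they work only with shifts on individual bytes. The same behaviour can be sketched as two small functions; `demo_load_be32` and `demo_store_be32` are hypothetical names used here for illustration:

```c
#include <stdint.h>

/* Read four bytes starting at b[i] as one big-endian 32-bit word. */
static uint32_t demo_load_be32(const uint8_t *b, int i)
{
    return ((uint32_t) b[i]     << 24)
         | ((uint32_t) b[i + 1] << 16)
         | ((uint32_t) b[i + 2] <<  8)
         | ((uint32_t) b[i + 3]);
}

/* Write n to b[i..i+3], most significant byte first. */
static void demo_store_be32(uint32_t n, uint8_t *b, int i)
{
    b[i]     = (uint8_t) (n >> 24);
    b[i + 1] = (uint8_t) (n >> 16);
    b[i + 2] = (uint8_t) (n >>  8);
    b[i + 3] = (uint8_t) (n);
}
```

Storing a word and loading it back returns the original bytes on both little-endian and big-endian machines, which keeps ciphertext portable between platforms.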
+
+ /* decryption key schedule tables */
+
+ int KT_init = 1;
+
+ uint32_t KT0[256];
+ uint32_t KT1[256];
+ uint32_t KT2[256];
+ uint32_t KT3[256];
+
+ uint32_t initial_KT0[256];
+ uint32_t initial_KT1[256];
+ uint32_t initial_KT2[256];
+ uint32_t initial_KT3[256];
+
+ /* AES key scheduling routine */
+
+ int
+ fast_aes_initialize_state(fast_aes_t* fast_aes)
+ {
+ int i;
+ uint32_t *RK, *SK;
+
+ switch( fast_aes->key_bits )
+ {
+ case 128: fast_aes->nr = 10; break;
+ case 192: fast_aes->nr = 12; break;
+ case 256: fast_aes->nr = 14; break;
+ default : return( 1 );
+ }
+
+ RK = fast_aes->erk;
+
+ for( i = 0; i < (fast_aes->key_bits >> 5); ++i )
+ {
+ GET_UINT32(
+ fast_aes->erk[i],
+ ((unsigned char*)fast_aes->key),
+ i * 4
+ );
+ }
+
+ /* setup encryption round keys */
+
+ switch( fast_aes->key_bits )
+ {
+ case 128:
+
+ for( i = 0; i < 10; ++i, RK += 4 )
+ {
+ RK[4] = RK[0] ^ RCON[i] ^
+ ( FSb[ (uint8_t) ( RK[3] >> 16 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( RK[3] >> 8 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( RK[3] ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( RK[3] >> 24 ) ] );
+
+ RK[5] = RK[1] ^ RK[4];
+ RK[6] = RK[2] ^ RK[5];
+ RK[7] = RK[3] ^ RK[6];
+ }
+ break;
+
+ case 192:
+
+ for( i = 0; i < 8; ++i, RK += 6 )
+ {
+ RK[6] = RK[0] ^ RCON[i] ^
+ ( FSb[ (uint8_t) ( RK[5] >> 16 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( RK[5] >> 8 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( RK[5] ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( RK[5] >> 24 ) ] );
+
+ RK[7] = RK[1] ^ RK[6];
+ RK[8] = RK[2] ^ RK[7];
+ RK[9] = RK[3] ^ RK[8];
+ RK[10] = RK[4] ^ RK[9];
+ RK[11] = RK[5] ^ RK[10];
+ }
+ break;
+
+ case 256:
+
+ for( i = 0; i < 7; ++i, RK += 8 )
+ {
+ RK[8] = RK[0] ^ RCON[i] ^
+ ( FSb[ (uint8_t) ( RK[7] >> 16 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( RK[7] >> 8 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( RK[7] ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( RK[7] >> 24 ) ] );
+
+ RK[9] = RK[1] ^ RK[8];
+ RK[10] = RK[2] ^ RK[9];
+ RK[11] = RK[3] ^ RK[10];
+
+ RK[12] = RK[4] ^
+ ( FSb[ (uint8_t) ( RK[11] >> 24 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( RK[11] >> 16 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( RK[11] >> 8 ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( RK[11] ) ] );
+
+ RK[13] = RK[5] ^ RK[12];
+ RK[14] = RK[6] ^ RK[13];
+ RK[15] = RK[7] ^ RK[14];
+ }
+ break;
+ }
+
+ /* setup decryption round keys */
+
+ if( KT_init )
+ {
+ for( i = 0; i < 256; ++i )
+ {
+ KT0[i] = RT0[ FSb[i] ];
+ KT1[i] = RT1[ FSb[i] ];
+ KT2[i] = RT2[ FSb[i] ];
+ KT3[i] = RT3[ FSb[i] ];
+ }
+
+ KT_init = 0;
+ }
+
+ SK = fast_aes->drk;
+
+ *SK++ = *RK++;
+ *SK++ = *RK++;
+ *SK++ = *RK++;
+ *SK++ = *RK++;
+
+ for( i = 1; i < fast_aes->nr; ++i )
+ {
+ RK -= 8;
+
+ *SK++ = KT0[ (uint8_t) ( *RK >> 24 ) ] ^
+ KT1[ (uint8_t) ( *RK >> 16 ) ] ^
+ KT2[ (uint8_t) ( *RK >> 8 ) ] ^
+ KT3[ (uint8_t) ( *RK ) ]; RK++;
+
+ *SK++ = KT0[ (uint8_t) ( *RK >> 24 ) ] ^
+ KT1[ (uint8_t) ( *RK >> 16 ) ] ^
+ KT2[ (uint8_t) ( *RK >> 8 ) ] ^
+ KT3[ (uint8_t) ( *RK ) ]; RK++;
+
+ *SK++ = KT0[ (uint8_t) ( *RK >> 24 ) ] ^
+ KT1[ (uint8_t) ( *RK >> 16 ) ] ^
+ KT2[ (uint8_t) ( *RK >> 8 ) ] ^
+ KT3[ (uint8_t) ( *RK ) ]; RK++;
+
+ *SK++ = KT0[ (uint8_t) ( *RK >> 24 ) ] ^
+ KT1[ (uint8_t) ( *RK >> 16 ) ] ^
+ KT2[ (uint8_t) ( *RK >> 8 ) ] ^
+ KT3[ (uint8_t) ( *RK ) ]; RK++;
+ }
+
+ RK -= 8;
+
+ *SK++ = *RK++;
+ *SK++ = *RK++;
+ *SK++ = *RK++;
+ *SK++ = *RK++;
+
+ /* setup values for fast re-initialization */
+ memcpy(fast_aes->initial_erk, fast_aes->erk, sizeof(fast_aes->initial_erk));
+ memcpy(fast_aes->initial_drk, fast_aes->drk, sizeof(fast_aes->initial_drk));
+ return 0;
+ }
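The 128-bit branch of the key schedule above can be checked against the key expansion example in FIPS-197 (Appendix A.1). The sketch below mirrors that branch but, as an assumption made only to keep it self-contained, computes S-box values at runtime from the field inverse and affine map instead of reading the `FSb` table; `demo_gmul`, `demo_sbox` and `demo_expand_128` are hypothetical names:

```c
#include <stdint.h>

/* Multiply two elements of GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1. */
static uint8_t demo_gmul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;

    while (b) {
        if (b & 1) p ^= a;
        a = (uint8_t) ((a << 1) ^ ((a & 0x80) ? 0x1B : 0x00));
        b >>= 1;
    }
    return p;
}

/* Forward S-box value: multiplicative inverse (0 maps to 0), then the affine map. */
static uint8_t demo_sbox(uint8_t x)
{
    uint8_t inv = 0, s;
    int y, k;

    for (y = 1; y < 256 && x != 0; ++y) {
        if (demo_gmul(x, (uint8_t) y) == 1) { inv = (uint8_t) y; break; }
    }
    s = inv;
    for (k = 1; k <= 4; ++k) {
        s ^= (uint8_t) ((inv << k) | (inv >> (8 - k)));  /* rotate left by k */
    }
    return (uint8_t) (s ^ 0x63);
}

/* Expand a 16-byte key into 44 round-key words, following the 128-bit case. */
static void demo_expand_128(const uint8_t key[16], uint32_t w[44])
{
    uint32_t *rk = w;
    uint8_t rcon = 0x01;
    int i;

    for (i = 0; i < 4; ++i) {
        w[i] = ((uint32_t) key[4 * i]     << 24) | ((uint32_t) key[4 * i + 1] << 16)
             | ((uint32_t) key[4 * i + 2] <<  8) | ((uint32_t) key[4 * i + 3]);
    }
    for (i = 0; i < 10; ++i, rk += 4) {
        /* rotate the last word by one byte, substitute, add the round constant */
        rk[4] = rk[0] ^ ((uint32_t) rcon << 24) ^
                ((uint32_t) demo_sbox((uint8_t) (rk[3] >> 16)) << 24) ^
                ((uint32_t) demo_sbox((uint8_t) (rk[3] >>  8)) << 16) ^
                ((uint32_t) demo_sbox((uint8_t) (rk[3]      )) <<  8) ^
                ((uint32_t) demo_sbox((uint8_t) (rk[3] >> 24)));
        rk[5] = rk[1] ^ rk[4];
        rk[6] = rk[2] ^ rk[5];
        rk[7] = rk[3] ^ rk[6];
        rcon = (uint8_t) ((rcon << 1) ^ ((rcon & 0x80) ? 0x1B : 0x00));
    }
}
```

For the FIPS-197 example key `2b7e1516 28aed2a6 abf71588 09cf4f3c`, the first derived word is `a0fafe17` and the last is `b6630ca6`. The brute-force inverse search is far slower than a table lookup and is meant only for checking, not for production use.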
+
+ int
+ fast_aes_reinitialize_state(fast_aes_t* fast_aes)
+ {
+ /* put round keys for encryption and decryption back to their initial
+ * states so we can encrypt and decrypt new items properly
+ */
+ memcpy(fast_aes->erk, fast_aes->initial_erk, sizeof(fast_aes->initial_erk));
+ memcpy(fast_aes->drk, fast_aes->initial_drk, sizeof(fast_aes->initial_drk));
+
+ return 0;
+ }
+
+ /* AES encryption routine for a single 128-bit (16-byte) block; works with any supported key size */
+
+ void
+ fast_aes_encrypt_block(fast_aes_t* fast_aes, uint8_t input[16], uint8_t output[16])
+ {
+ uint32_t *RK, X0, X1, X2, X3, Y0, Y1, Y2, Y3;
+
+ RK = fast_aes->erk;
+
+ GET_UINT32( X0, input, 0 ); X0 ^= RK[0];
+ GET_UINT32( X1, input, 4 ); X1 ^= RK[1];
+ GET_UINT32( X2, input, 8 ); X2 ^= RK[2];
+ GET_UINT32( X3, input, 12 ); X3 ^= RK[3];
+
+ #define AES_FROUND(X0,X1,X2,X3,Y0,Y1,Y2,Y3) \
+ { \
+ RK += 4; \
+ \
+ X0 = RK[0] ^ FT0[ (uint8_t) ( Y0 >> 24 ) ] ^ \
+ FT1[ (uint8_t) ( Y1 >> 16 ) ] ^ \
+ FT2[ (uint8_t) ( Y2 >> 8 ) ] ^ \
+ FT3[ (uint8_t) ( Y3 ) ]; \
+ \
+ X1 = RK[1] ^ FT0[ (uint8_t) ( Y1 >> 24 ) ] ^ \
+ FT1[ (uint8_t) ( Y2 >> 16 ) ] ^ \
+ FT2[ (uint8_t) ( Y3 >> 8 ) ] ^ \
+ FT3[ (uint8_t) ( Y0 ) ]; \
+ \
+ X2 = RK[2] ^ FT0[ (uint8_t) ( Y2 >> 24 ) ] ^ \
+ FT1[ (uint8_t) ( Y3 >> 16 ) ] ^ \
+ FT2[ (uint8_t) ( Y0 >> 8 ) ] ^ \
+ FT3[ (uint8_t) ( Y1 ) ]; \
+ \
+ X3 = RK[3] ^ FT0[ (uint8_t) ( Y3 >> 24 ) ] ^ \
+ FT1[ (uint8_t) ( Y0 >> 16 ) ] ^ \
+ FT2[ (uint8_t) ( Y1 >> 8 ) ] ^ \
+ FT3[ (uint8_t) ( Y2 ) ]; \
+ }
+
+ AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 1 */
+ AES_FROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 2 */
+ AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 3 */
+ AES_FROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 4 */
+ AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 5 */
+ AES_FROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 6 */
+ AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 7 */
+ AES_FROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 8 */
+ AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 9 */
+
+ if( fast_aes->nr > 10 )
+ {
+ AES_FROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 10 */
+ AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 11 */
+ }
+
+ if( fast_aes->nr > 12 )
+ {
+ AES_FROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 12 */
+ AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 13 */
+ }
+
+ /* last round */
+
+ RK += 4;
+
+ X0 = RK[0] ^ ( FSb[ (uint8_t) ( Y0 >> 24 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( Y1 >> 16 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( Y2 >> 8 ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( Y3 ) ] );
+
+ X1 = RK[1] ^ ( FSb[ (uint8_t) ( Y1 >> 24 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( Y2 >> 16 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( Y3 >> 8 ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( Y0 ) ] );
+
+ X2 = RK[2] ^ ( FSb[ (uint8_t) ( Y2 >> 24 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( Y3 >> 16 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( Y0 >> 8 ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( Y1 ) ] );
+
+ X3 = RK[3] ^ ( FSb[ (uint8_t) ( Y3 >> 24 ) ] << 24 ) ^
+ ( FSb[ (uint8_t) ( Y0 >> 16 ) ] << 16 ) ^
+ ( FSb[ (uint8_t) ( Y1 >> 8 ) ] << 8 ) ^
+ ( FSb[ (uint8_t) ( Y2 ) ] );
+
+ PUT_UINT32( X0, output, 0 );
+ PUT_UINT32( X1, output, 4 );
+ PUT_UINT32( X2, output, 8 );
+ PUT_UINT32( X3, output, 12 );
+ }
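Each `AES_FROUND` needs only four table lookups per output word because the `FT` tables fold the S-box substitution and the column mix into one precomputed word. The `FT0` definition is outside this section, so the sketch below assumes the usual layout in which an entry packs 2·s, s, s and 3·s (products in GF(2^8)) for an S-box output s; `demo_xtime` and `demo_ft0_entry` are hypothetical names:

```c
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing by the AES polynomial 0x11B. */
static uint8_t demo_xtime(uint8_t v)
{
    return (uint8_t) ((v << 1) ^ ((v & 0x80) ? 0x1B : 0x00));
}

/* Build one table word from an S-box output s: bytes {2*s, s, s, 3*s}. */
static uint32_t demo_ft0_entry(uint8_t s)
{
    uint8_t s2 = demo_xtime(s);        /* 2*s */
    uint8_t s3 = (uint8_t) (s2 ^ s);   /* 3*s = 2*s + s in GF(2^8) */

    return ((uint32_t) s2 << 24) | ((uint32_t) s << 16)
         | ((uint32_t) s  <<  8) | ((uint32_t) s3);
}
```

Under that assumption, the S-box output `0x63` (the substitution of input byte 0) yields the word `0xC66363A5`. The last round uses `FSb` directly because it omits the column mix.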
+
+ /* AES decryption routine for a single 128-bit (16-byte) block; works with any supported key size */
+
+ void
+ fast_aes_decrypt_block(fast_aes_t* fast_aes, uint8_t input[16], uint8_t output[16])
+ {
+ uint32_t *RK, X0, X1, X2, X3, Y0, Y1, Y2, Y3;
+
+ RK = fast_aes->drk;
+
+ GET_UINT32( X0, input, 0 ); X0 ^= RK[0];
+ GET_UINT32( X1, input, 4 ); X1 ^= RK[1];
+ GET_UINT32( X2, input, 8 ); X2 ^= RK[2];
+ GET_UINT32( X3, input, 12 ); X3 ^= RK[3];
+
+ #define AES_RROUND(X0,X1,X2,X3,Y0,Y1,Y2,Y3) \
+ { \
+ RK += 4; \
+ \
+ X0 = RK[0] ^ RT0[ (uint8_t) ( Y0 >> 24 ) ] ^ \
+ RT1[ (uint8_t) ( Y3 >> 16 ) ] ^ \
+ RT2[ (uint8_t) ( Y2 >> 8 ) ] ^ \
+ RT3[ (uint8_t) ( Y1 ) ]; \
+ \
+ X1 = RK[1] ^ RT0[ (uint8_t) ( Y1 >> 24 ) ] ^ \
+ RT1[ (uint8_t) ( Y0 >> 16 ) ] ^ \
+ RT2[ (uint8_t) ( Y3 >> 8 ) ] ^ \
+ RT3[ (uint8_t) ( Y2 ) ]; \
+ \
+ X2 = RK[2] ^ RT0[ (uint8_t) ( Y2 >> 24 ) ] ^ \
+ RT1[ (uint8_t) ( Y1 >> 16 ) ] ^ \
+ RT2[ (uint8_t) ( Y0 >> 8 ) ] ^ \
+ RT3[ (uint8_t) ( Y3 ) ]; \
+ \
+ X3 = RK[3] ^ RT0[ (uint8_t) ( Y3 >> 24 ) ] ^ \
+ RT1[ (uint8_t) ( Y2 >> 16 ) ] ^ \
+ RT2[ (uint8_t) ( Y1 >> 8 ) ] ^ \
+ RT3[ (uint8_t) ( Y0 ) ]; \
+ }
+
+ AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 1 */
+ AES_RROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 2 */
+ AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 3 */
+ AES_RROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 4 */
+ AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 5 */
+ AES_RROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 6 */
+ AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 7 */
+ AES_RROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 8 */
+ AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 9 */
+
+ if( fast_aes->nr > 10 )
+ {
+ AES_RROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 10 */
+ AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 11 */
+ }
+
+ if( fast_aes->nr > 12 )
+ {
+ AES_RROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 ); /* round 12 */
+ AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 ); /* round 13 */
+ }
+
+ /* last round */
+
+ RK += 4;
+
+ X0 = RK[0] ^ ( RSb[ (uint8_t) ( Y0 >> 24 ) ] << 24 ) ^
+ ( RSb[ (uint8_t) ( Y3 >> 16 ) ] << 16 ) ^
+ ( RSb[ (uint8_t) ( Y2 >> 8 ) ] << 8 ) ^
+ ( RSb[ (uint8_t) ( Y1 ) ] );
+
+ X1 = RK[1] ^ ( RSb[ (uint8_t) ( Y1 >> 24 ) ] << 24 ) ^
+ ( RSb[ (uint8_t) ( Y0 >> 16 ) ] << 16 ) ^
+ ( RSb[ (uint8_t) ( Y3 >> 8 ) ] << 8 ) ^
+ ( RSb[ (uint8_t) ( Y2 ) ] );
+
+ X2 = RK[2] ^ ( RSb[ (uint8_t) ( Y2 >> 24 ) ] << 24 ) ^
+ ( RSb[ (uint8_t) ( Y1 >> 16 ) ] << 16 ) ^
+ ( RSb[ (uint8_t) ( Y0 >> 8 ) ] << 8 ) ^
+ ( RSb[ (uint8_t) ( Y3 ) ] );
+
+ X3 = RK[3] ^ ( RSb[ (uint8_t) ( Y3 >> 24 ) ] << 24 ) ^
+ ( RSb[ (uint8_t) ( Y2 >> 16 ) ] << 16 ) ^
+ ( RSb[ (uint8_t) ( Y1 >> 8 ) ] << 8 ) ^
+ ( RSb[ (uint8_t) ( Y0 ) ] );
+
+ PUT_UINT32( X0, output, 0 );
+ PUT_UINT32( X1, output, 4 );
+ PUT_UINT32( X2, output, 8 );
+ PUT_UINT32( X3, output, 12 );
+ }
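Both block routines unroll the rounds instead of looping: nine rounds always run, two more when `nr > 10`, another two when `nr > 12`, and then the final round. A small sketch that mirrors this control flow and the number of round-key words it consumes; `demo_rounds_executed` and `demo_round_key_words` are hypothetical helpers, not functions from this file:

```c
/* Count the rounds the unrolled code performs for a given nr (10, 12 or 14). */
static int demo_rounds_executed(int nr)
{
    int rounds = 9;             /* rounds 1..9 always run */

    if (nr > 10) rounds += 2;   /* rounds 10 and 11 */
    if (nr > 12) rounds += 2;   /* rounds 12 and 13 */

    return rounds + 1;          /* plus the final round, which skips the column mix */
}

/* Round-key words consumed: four before round 1, then four per round. */
static int demo_round_key_words(int nr)
{
    return 4 * (nr + 1);
}
```

For 128, 192 and 256-bit keys this gives 10, 12 and 14 rounds, which matches the `nr` values set in `fast_aes_initialize_state`.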
+