secp256k1-native 0.17.0 → 0.18.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +14 -0
- data/README.md +1 -1
- data/ext/secp256k1_native/field.c +69 -9
- data/ext/secp256k1_native/jacobian.c +2 -2
- data/ext/secp256k1_native/scalar.c +91 -40
- data/ext/secp256k1_native/secp256k1_native.h +31 -1
- data/lib/secp256k1/version.rb +1 -1
- data/lib/secp256k1.rb +104 -10
- metadata +1 -2
- data/lib/secp256k1_native.bundle +0 -0
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: f869c9c197727fdab65f1517decd680fa332a79a2409cbf08a3cc975dd679da7
|
|
4
|
+
data.tar.gz: bdcbb3a7fa8964600a6baf0e99249d2ce9dd76b5bbdf17ff0b70547f4c016b50
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 2aff9d5c272393b74a4df7abed789bd40be5b06d8c90ed13c9458326129ceac2051de76c01e5ac7ba42d6c9c3c7d6fe0150a76b8b7e68dd35481d21ea48a467e
|
|
7
|
+
data.tar.gz: c45c61f5380c35774d8026b9cab72b21d7183b2fafb7b513b63caf520996c9d664e87759158b6cb4af359ff9e26064b60d70730a07066d4a480b2c976313db3f
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,19 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## [0.18.0] - 2026-06-30
|
|
4
|
+
|
|
5
|
+
### Security
|
|
6
|
+
|
|
7
|
+
- **Compiler-reconstructed timing side-channel in `Point#mul` (the secret-scalar Montgomery ladder).** Bare-metal dudect verification (issue #25; AMD Ryzen 9 9950X, GCC 15.2, `-O2`) found that GCC 15.2 reconstructs the branchless `(a & ~mask) | (b & mask)` select idiom in `uint256_select` into a secret-dependent conditional jump, leaking the scalar at dudect |t| ≈ 21. This silently undid the 0.17.0 |t|=875 fix, which relied on that select being branchless. Fixed with a value barrier (`ct_value_barrier_u64`) applied via a single `ct_mask_u64` helper to **every** constant-time select mask in the extension (`uint256_select`, `fred`/`fadd`/`fsub`/`fneg`, `scalar_reduce`/`scalar_add`, the `jp_double` infinity select, and the ladder `cswap`). Re-verified: disassembly shows no branch/`cmov` at any select line, ctgrind clean, dudect `scalar_multiply_ct` |t| → 0.68 mean (0/20 runs over 4.5). See [advisory 0001](docs/advisories/0001-compiler-reconstructed-ct-branch.md). Only `uint256_select` actively branchified under this compiler; the other sites are hardened as defence-in-depth.
|
|
8
|
+
|
|
9
|
+
### Changed
|
|
10
|
+
|
|
11
|
+
- Bare-metal dudect timing verification is now a **required pre-tag release gate** (not a one-off): a constant-time *source* is not a constant-time *binary*, and a compiler upgrade can silently reintroduce a branch that only a statistical run on the shipping compiler observes. Documented in [`docs/security.md`](docs/security.md#empirical-timing-verification) and [`docs/timing-verification-runbook.md`](docs/timing-verification-runbook.md).
|
|
12
|
+
|
|
13
|
+
### Build
|
|
14
|
+
|
|
15
|
+
- Timing harness (`timing/`) now builds on modern GCC/glibc toolchains: define `_POSIX_C_SOURCE` for `clock_gettime` under `-std=c99`, and add `-fcommon` for the `rb_mSecp256k1Native` tentative definition under GCC 10+ `-fno-common`.
|
|
16
|
+
|
|
3
17
|
## [0.17.0] - 2026-05-01
|
|
4
18
|
|
|
5
19
|
### Added
|
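The |t| figures quoted in the Security entry are test statistics of the kind dudect reports: it compares two classes of timing measurements using Welch's t-test. The following is a minimal sketch of that statistic only, assuming two arrays of cycle counts. The method name `welch_t` and the sample data are hypothetical; this is not the dudect implementation or part of the gem.

```ruby
# Welch's t statistic for two samples of timing measurements.
# Hypothetical helper for illustration only.
def welch_t(xs, ys)
  mean = ->(v) { v.sum.to_f / v.size }
  # Unbiased sample variance (divides by n - 1).
  var = ->(v, m) { v.sum { |e| (e - m)**2 } / (v.size - 1) }

  mx = mean.call(xs)
  my = mean.call(ys)
  (mx - my) / Math.sqrt(var.call(xs, mx) / xs.size + var.call(ys, my) / ys.size)
end

# Made-up cycle counts: class B is consistently about 10 cycles slower.
class_a = [100, 102, 101, 99, 100, 101, 98, 100]
class_b = [110, 112, 111, 109, 110, 111, 108, 110]

t = welch_t(class_a, class_b)
leaky = t.abs > 4.5 # 4.5 is the threshold quoted in the changelog entry
puts "t = #{t.round(2)}, leak suspected: #{leaky}"
```

A large |t| means the two input classes produce distinguishable timing distributions. A value near zero, as reported after the fix, means the test found no evidence of a difference on that run.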
data/README.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
> **Before using a custom cryptographic implementation, read [Evaluating the risks](https://sgbett.github.io/secp256k1-native/risks/) — it examines what the empirical evidence says about rolling your own crypto and where this gem sits in that landscape.**
|
|
4
4
|
|
|
5
|
-
Pure native
|
|
5
|
+
Pure native secp256k1 implementation for Ruby (no libsecp256k1 dependency).
|
|
6
6
|
|
|
7
7
|
Provides secp256k1 elliptic curve cryptography for Ruby — field arithmetic, scalar operations, Jacobian point arithmetic, and constant-time scalar multiplication — via an optional native C extension. The gem ships a pure-Ruby base layer that works out of the box on any Ruby 2.7+ platform, with the C extension providing constant-time guarantees and ~22x acceleration when available.
|
|
8
8
|
|
|
data/ext/secp256k1_native/field.c
CHANGED
|
@@ -24,8 +24,24 @@
|
|
|
24
24
|
* so that only one copy of each function exists in the linked extension.
|
|
25
25
|
* ----------------------------------------------------------------------- */
|
|
26
26
|
|
|
27
|
+
/*
|
|
28
|
+
* Marshal a Ruby Integer into a uint256_t.
|
|
29
|
+
*
|
|
30
|
+
* @raise [TypeError] if rb_int is not an Integer (L-1: rejects Float,
|
|
31
|
+
* Rational, BigDecimal, nil, anything responding to #to_int).
|
|
32
|
+
* @raise [ArgumentError] if rb_int is negative or exceeds 256 bits.
|
|
33
|
+
*/
|
|
27
34
|
uint256_t rb_to_uint256(VALUE rb_int)
|
|
28
35
|
{
|
|
36
|
+
/* L-1: reject non-Integer before reaching rb_integer_pack, which itself
|
|
37
|
+
* calls rb_to_int and would silently coerce Float / Rational / objects
|
|
38
|
+
* responding to #to_int. This check is the single load-bearing guard
|
|
39
|
+
* for ALL 16 wrappers' Integer contract — it must come FIRST, before
|
|
40
|
+
* rb_integer_pack mutates the input. */
|
|
41
|
+
if (!RB_INTEGER_TYPE_P(rb_int)) {
|
|
42
|
+
rb_raise(rb_eTypeError, "expected Integer");
|
|
43
|
+
}
|
|
44
|
+
|
|
29
45
|
uint256_t n;
|
|
30
46
|
memset(&n, 0, sizeof(n));
|
|
31
47
|
int result = rb_integer_pack(rb_int, n.d, 4, sizeof(uint64_t), 0, U256_PACK_FLAGS);
|
|
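The guard added to `rb_to_uint256` checks the type before any conversion runs, so values that merely respond to `#to_int` are rejected rather than coerced. A rough Ruby model of that contract is sketched below. The method name `strict_uint256` is hypothetical, and the error messages are paraphrased from the doc comment rather than copied from the extension.

```ruby
# Hypothetical Ruby model of the boundary check; not the extension's code.
def strict_uint256(value)
  # Type check comes first: no implicit #to_int coercion is attempted.
  raise TypeError, 'expected Integer' unless value.is_a?(Integer)
  raise ArgumentError, 'negative or exceeds 256 bits' if value.negative? || value.bit_length > 256

  value
end

# For contrast, a coercing conversion silently accepts a Float.
coerced = Integer(1.9) # truncates toward zero
puts coerced
puts strict_uint256(2**256 - 1).bit_length
```

The point of ordering the checks this way is that a caller passing `1.9` gets a `TypeError` at the boundary instead of an arithmetic result computed on `1`.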
@@ -183,7 +199,6 @@ void fred_internal(uint256_t *r, const uint256_t *hi, const uint256_t *lo)
|
|
|
183
199
|
|
|
184
200
|
/* Compute c × hi with carry. c fits in 33 bits, hi fits in 64 bits
|
|
185
201
|
* each, so each product fits in 97 bits — safe in uint128_t. */
|
|
186
|
-
acc = 0;
|
|
187
202
|
carry = 0;
|
|
188
203
|
for (i = 0; i < 4; i++) {
|
|
189
204
|
acc = (uint128_t)hi->d[i] * FRED_C + lo->d[i] + carry;
|
|
@@ -239,7 +254,7 @@ void fred_internal(uint256_t *r, const uint256_t *hi, const uint256_t *lo)
|
|
|
239
254
|
uint64_t borrow = uint256_sub(&reduced, r, &FIELD_P);
|
|
240
255
|
|
|
241
256
|
/* mask = all 1s if borrow == 1 (keep r), all 0s if borrow == 0 (keep reduced). */
|
|
242
|
-
uint64_t mask =
|
|
257
|
+
uint64_t mask = ct_mask_u64(borrow);
|
|
243
258
|
for (i = 0; i < 4; i++) {
|
|
244
259
|
r->d[i] = (r->d[i] & mask) | (reduced.d[i] & ~mask);
|
|
245
260
|
}
|
|
@@ -307,6 +322,10 @@ void fsqr_internal(uint256_t *r, const uint256_t *a)
|
|
|
307
322
|
* fadd_internal — modular addition.
|
|
308
323
|
*
|
|
309
324
|
* Computes a + b, then branchlessly subtracts P if the result >= P.
|
|
325
|
+
*
|
|
326
|
+
* Precondition: a, b < P (canonical). Pre-reduction is the wrapper's
|
|
327
|
+
* responsibility — see rb_fadd. The Jacobian path (jacobian.c) only feeds
|
|
328
|
+
* canonical intermediates produced by other internals.
|
|
310
329
|
*/
|
|
311
330
|
void fadd_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
312
331
|
{
|
|
@@ -323,7 +342,7 @@ void fadd_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
|
323
342
|
* If overflow == 0 and borrow == 1 : sum < P, want sum.
|
|
324
343
|
* Combined: keep sum iff (overflow == 0 && borrow == 1). */
|
|
325
344
|
uint64_t keep_original = (~overflow) & borrow;
|
|
326
|
-
uint64_t mask =
|
|
345
|
+
uint64_t mask = ct_mask_u64(keep_original); /* all 1s iff sum < P */
|
|
327
346
|
int i;
|
|
328
347
|
for (i = 0; i < 4; i++) {
|
|
329
348
|
r->d[i] = (sum.d[i] & mask) | (reduced.d[i] & ~mask);
|
|
@@ -334,6 +353,8 @@ void fadd_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
|
334
353
|
* fsub_internal — modular subtraction.
|
|
335
354
|
*
|
|
336
355
|
* Computes a - b; if the result underflows, adds P back — branchlessly.
|
|
356
|
+
*
|
|
357
|
+
* Precondition: a, b < P (canonical) — see fadd_internal.
|
|
337
358
|
*/
|
|
338
359
|
void fsub_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
339
360
|
{
|
|
@@ -346,7 +367,7 @@ void fsub_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
|
346
367
|
(void)carry; /* carry is 0 here since diff + P < 2^256 when borrow == 1 */
|
|
347
368
|
|
|
348
369
|
/* mask: all 1s if borrow == 1 (use corrected), all 0s otherwise (use diff). */
|
|
349
|
-
uint64_t mask =
|
|
370
|
+
uint64_t mask = ct_mask_u64(borrow);
|
|
350
371
|
int i;
|
|
351
372
|
for (i = 0; i < 4; i++) {
|
|
352
373
|
r->d[i] = (corrected.d[i] & mask) | (diff.d[i] & ~mask);
|
|
@@ -357,6 +378,8 @@ void fsub_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
|
357
378
|
* fneg_internal — modular negation.
|
|
358
379
|
*
|
|
359
380
|
* Returns P - a for non-zero a, and 0 for a == 0 — branchlessly.
|
|
381
|
+
*
|
|
382
|
+
* Precondition: a < P (canonical) — see fadd_internal.
|
|
360
383
|
*/
|
|
361
384
|
void fneg_internal(uint256_t *r, const uint256_t *a)
|
|
362
385
|
{
|
|
@@ -365,7 +388,7 @@ void fneg_internal(uint256_t *r, const uint256_t *a)
|
|
|
365
388
|
|
|
366
389
|
/* If a == 0 the result should be 0, not P. */
|
|
367
390
|
uint64_t is_zero = uint256_is_zero(a);
|
|
368
|
-
uint64_t mask =
|
|
391
|
+
uint64_t mask = ct_mask_u64(is_zero); /* all 1s if a is zero */
|
|
369
392
|
int i;
|
|
370
393
|
for (i = 0; i < 4; i++) {
|
|
371
394
|
/* zero mask: 0 where is_zero, negated.d[i] where not */
|
|
@@ -435,7 +458,13 @@ int fsqrt_internal(uint256_t *r, const uint256_t *a)
|
|
|
435
458
|
diff |= (check.d[i] ^ a_reduced.d[i]);
|
|
436
459
|
}
|
|
437
460
|
|
|
438
|
-
if (diff != 0)
|
|
461
|
+
if (diff != 0) {
|
|
462
|
+
/* Not a quadratic residue. Honour the docstring contract by
|
|
463
|
+
* writing a defined value to *r so callers cannot inadvertently
|
|
464
|
+
* read uninitialised memory if they ignore the return code. */
|
|
465
|
+
uint256_copy(r, &zero);
|
|
466
|
+
return 0;
|
|
467
|
+
}
|
|
439
468
|
|
|
440
469
|
uint256_copy(r, &result);
|
|
441
470
|
return 1;
|
|
@@ -455,6 +484,14 @@ int fsqrt_internal(uint256_t *r, const uint256_t *a)
|
|
|
455
484
|
static VALUE rb_fred(VALUE self, VALUE x)
|
|
456
485
|
{
|
|
457
486
|
(void)self;
|
|
487
|
+
/* L-1: reject non-Integer before rb_integer_pack, which would silently
|
|
488
|
+
* coerce Float / Rational / objects responding to #to_int. rb_fred packs
|
|
489
|
+
* 8 limbs (vs 4 in rb_to_uint256), so it does not flow through that
|
|
490
|
+
* helper — guard locally to honour the same boundary contract. */
|
|
491
|
+
if (!RB_INTEGER_TYPE_P(x)) {
|
|
492
|
+
rb_raise(rb_eTypeError, "expected Integer");
|
|
493
|
+
}
|
|
494
|
+
|
|
458
495
|
/* fred is used for reducing wide intermediates. Pack into 8 limbs. */
|
|
459
496
|
uint64_t limbs[8];
|
|
460
497
|
memset(limbs, 0, sizeof(limbs));
|
|
@@ -519,8 +556,17 @@ static VALUE rb_fadd(VALUE self, VALUE a, VALUE b)
|
|
|
519
556
|
(void)self;
|
|
520
557
|
uint256_t ua = rb_to_uint256(a);
|
|
521
558
|
uint256_t ub = rb_to_uint256(b);
|
|
559
|
+
|
|
560
|
+
/* L-3: pre-reduce operands so fadd_internal's `a, b < P` precondition is
|
|
561
|
+
* always satisfied (mirrors rb_finv / rb_fsqrt). fred handles 512-bit
|
|
562
|
+
* inputs; here we use hi=0 so it's a single fast pass on each operand. */
|
|
563
|
+
uint256_t zero_limbs = {{ 0ULL, 0ULL, 0ULL, 0ULL }};
|
|
564
|
+
uint256_t ua_reduced, ub_reduced;
|
|
565
|
+
fred_internal(&ua_reduced, &zero_limbs, &ua);
|
|
566
|
+
fred_internal(&ub_reduced, &zero_limbs, &ub);
|
|
567
|
+
|
|
522
568
|
uint256_t r;
|
|
523
|
-
fadd_internal(&r, &
|
|
569
|
+
fadd_internal(&r, &ua_reduced, &ub_reduced);
|
|
524
570
|
return uint256_to_rb(&r);
|
|
525
571
|
}
|
|
526
572
|
|
|
@@ -535,8 +581,15 @@ static VALUE rb_fsub(VALUE self, VALUE a, VALUE b)
|
|
|
535
581
|
(void)self;
|
|
536
582
|
uint256_t ua = rb_to_uint256(a);
|
|
537
583
|
uint256_t ub = rb_to_uint256(b);
|
|
584
|
+
|
|
585
|
+
/* L-3: pre-reduce operands (see rb_fadd). */
|
|
586
|
+
uint256_t zero_limbs = {{ 0ULL, 0ULL, 0ULL, 0ULL }};
|
|
587
|
+
uint256_t ua_reduced, ub_reduced;
|
|
588
|
+
fred_internal(&ua_reduced, &zero_limbs, &ua);
|
|
589
|
+
fred_internal(&ub_reduced, &zero_limbs, &ub);
|
|
590
|
+
|
|
538
591
|
uint256_t r;
|
|
539
|
-
fsub_internal(&r, &
|
|
592
|
+
fsub_internal(&r, &ua_reduced, &ub_reduced);
|
|
540
593
|
return uint256_to_rb(&r);
|
|
541
594
|
}
|
|
542
595
|
|
|
@@ -550,8 +603,15 @@ static VALUE rb_fneg(VALUE self, VALUE a)
|
|
|
550
603
|
{
|
|
551
604
|
(void)self;
|
|
552
605
|
uint256_t ua = rb_to_uint256(a);
|
|
606
|
+
|
|
607
|
+
/* L-3 / I-3: pre-reduce the operand so fneg_internal's `a < P`
|
|
608
|
+
* precondition is always satisfied (mirrors rb_finv / rb_fsqrt). */
|
|
609
|
+
uint256_t zero_limbs = {{ 0ULL, 0ULL, 0ULL, 0ULL }};
|
|
610
|
+
uint256_t ua_reduced;
|
|
611
|
+
fred_internal(&ua_reduced, &zero_limbs, &ua);
|
|
612
|
+
|
|
553
613
|
uint256_t r;
|
|
554
|
-
fneg_internal(&r, &
|
|
614
|
+
fneg_internal(&r, &ua_reduced);
|
|
555
615
|
return uint256_to_rb(&r);
|
|
556
616
|
}
|
|
557
617
|
|
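The select logic in `fadd_internal` (keep the raw sum only when there was no overflow and the trial subtraction borrowed) can be modelled with Ruby integers. This is a sketch under assumptions: `fadd_model` is a hypothetical name, 256-bit wraparound is emulated by masking, and Ruby arithmetic is not constant-time, so only the truth table is being illustrated.

```ruby
P = 2**256 - 2**32 - 977 # secp256k1 field prime
FULL = 2**256 - 1        # emulates a 256-bit register

# Hypothetical model of the overflow/borrow select; assumes a, b < P.
def fadd_model(a, b)
  raw = a + b
  overflow = raw >> 256            # carry out of bit 255 (0 or 1)
  sum = raw & FULL                 # wrapped 256-bit sum
  borrow = sum < P ? 1 : 0         # borrow out of the trial (sum - P)
  reduced = (sum - P) & FULL       # wrapped result of the trial subtraction
  keep_original = (overflow ^ 1) & borrow
  mask = -keep_original & FULL     # all ones iff sum < P with no overflow
  (sum & mask) | (reduced & ~mask & FULL)
end

puts fadd_model(P - 1, P - 1) == (2 * P - 2) % P
```

Because only one subtraction of P is performed, the model (like the C internal it imitates) is correct only for canonical operands, which is why the wrappers in the hunk above reduce their inputs first.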
|
data/ext/secp256k1_native/jacobian.c
CHANGED
|
@@ -131,7 +131,7 @@ void jp_double_internal(uint256_t r[3], const uint256_t p[3])
|
|
|
131
131
|
* Compute mask = all 1s if Y1 is zero, all 0s otherwise.
|
|
132
132
|
* Use the mask to select between [x3, y3, z3] and JP_INFINITY. */
|
|
133
133
|
uint64_t is_zero = uint256_is_zero(&p[1]);
|
|
134
|
-
uint64_t mask =
|
|
134
|
+
uint64_t mask = ct_mask_u64(is_zero); /* all 1s if Y1 == 0 */
|
|
135
135
|
int i;
|
|
136
136
|
for (i = 0; i < 4; i++) {
|
|
137
137
|
r[0].d[i] = (x3.d[i] & ~mask) | (JP_INF_X.d[i] & mask);
|
|
@@ -383,7 +383,7 @@ static VALUE rb_jp_neg(VALUE self, VALUE rb_point)
|
|
|
383
383
|
*/
|
|
384
384
|
static void cswap(uint64_t bit, uint256_t a[3], uint256_t b[3])
|
|
385
385
|
{
|
|
386
|
-
uint64_t mask =
|
|
386
|
+
uint64_t mask = ct_mask_u64(bit); /* all-ones if bit==1, all-zeros if bit==0 */
|
|
387
387
|
int j, k;
|
|
388
388
|
for (j = 0; j < 3; j++) {
|
|
389
389
|
for (k = 0; k < 4; k++) {
|
|
data/ext/secp256k1_native/scalar.c
CHANGED
|
@@ -28,9 +28,9 @@
|
|
|
28
28
|
*
|
|
29
29
|
* Constant-time discipline
|
|
30
30
|
* ------------------------
|
|
31
|
-
*
|
|
32
|
-
*
|
|
33
|
-
* is safe.
|
|
31
|
+
* scalar_reduce_limbs and scalar_add_internal use branchless conditional
|
|
32
|
+
* selection — no operand-dependent branches in either. scalar_inv_internal
|
|
33
|
+
* iterates over bits of the public constant N-2, which is safe.
|
|
34
34
|
*/
|
|
35
35
|
|
|
36
36
|
/* -----------------------------------------------------------------------
|
|
@@ -90,11 +90,16 @@ static const uint256_t SCALAR_ONE = {{ 1ULL, 0ULL, 0ULL, 0ULL }};
|
|
|
90
90
|
* After the first fold the 512-bit value has been reduced to at most
|
|
91
91
|
* ~385 bits. The overflow above bit 255 (stored in the temporary carry
|
|
92
92
|
* words) requires a second fold. After two folds the result fits in
|
|
93
|
-
* 256 bits + at most 1 bit
|
|
93
|
+
* 256 bits + at most 1 bit; the residual fold then propagates that
|
|
94
|
+
* remaining bit (the "topcarry") and a branchless conditional-subtract of N
|
|
95
|
+
* selects whichever of {r, r-N} is the canonical residue. When topcarry is
|
|
96
|
+
* set, the subtract also folds the dropped 2^256 back as c_N (= 2^256 - N).
|
|
94
97
|
*
|
|
95
98
|
* We accumulate into an 8-limb array and reuse the upper limbs as
|
|
96
99
|
* temporaries for the folded-in contributions, so no extra allocation is
|
|
97
100
|
* needed.
|
|
101
|
+
*
|
|
102
|
+
* Branchless throughout — no operand-dependent control flow.
|
|
98
103
|
*/
|
|
99
104
|
static void scalar_reduce_limbs(uint256_t *r, uint64_t product[8])
|
|
100
105
|
{
|
|
@@ -161,9 +166,12 @@ static void scalar_reduce_limbs(uint256_t *r, uint64_t product[8])
|
|
|
161
166
|
uint64_t hi2[4];
|
|
162
167
|
for (i = 0; i < 4; i++) { hi2[i] = t[4 + i]; t[4 + i] = 0; }
|
|
163
168
|
|
|
169
|
+
/* Second fold — unconditional loop body (no branch on h being zero).
|
|
170
|
+
* The body is a faithful no-op when h == 0 (each `h * CONST` term is 0
|
|
171
|
+
* and the carries propagate identically), so removing the guard changes
|
|
172
|
+
* no result, only the timing. (Closes I-11 secret-dependent branch.) */
|
|
164
173
|
for (i = 0; i < 4; i++) {
|
|
165
174
|
uint64_t h = hi2[i];
|
|
166
|
-
if (h == 0) continue;
|
|
167
175
|
|
|
168
176
|
acc = (uint128_t)h * CN_LO + t[i];
|
|
169
177
|
t[i] = (uint64_t)acc;
|
|
@@ -183,30 +191,46 @@ static void scalar_reduce_limbs(uint256_t *r, uint64_t product[8])
|
|
|
183
191
|
/* After two folds, any carry here is negligible (< 2). */
|
|
184
192
|
}
|
|
185
193
|
|
|
186
|
-
/*
|
|
187
|
-
*
|
|
194
|
+
/* Result is now in t[0..3] with a small residual in t[4].
|
|
195
|
+
*
|
|
196
|
+
* Bound: after the first fold the value is < 2^386 (the original 512-bit
|
|
197
|
+
* product reduced by c_N ≈ 2^129). The second fold reduces that overflow
|
|
198
|
+
* by another factor of c_N, so the post-second-fold residual is < 2^259
|
|
199
|
+
* — i.e. t[4] is a few bits wide (at most a small single-digit value),
|
|
200
|
+
* and the residual fold below produces V < 2N. V < 2N means a single
|
|
201
|
+
* conditional subtract of N is sufficient to canonicalise. */
|
|
188
202
|
r->d[0] = t[0]; r->d[1] = t[1]; r->d[2] = t[2]; r->d[3] = t[3];
|
|
189
203
|
|
|
190
|
-
/*
|
|
191
|
-
*
|
|
204
|
+
/* Residual fold — unconditional (I-11: no branch on the carry) and
|
|
205
|
+
* capturing the carry OUT of the top limb (H-1: previously dropped at
|
|
206
|
+
* bit 255). After two folds the value here is < 2^257, so topcarry
|
|
207
|
+
* is 0 or 1. */
|
|
192
208
|
uint64_t carry3 = t[4];
|
|
193
|
-
|
|
194
|
-
|
|
195
|
-
|
|
196
|
-
|
|
197
|
-
|
|
198
|
-
|
|
199
|
-
|
|
200
|
-
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
209
|
+
uint128_t a0 = (uint128_t)carry3 * CN_LO + r->d[0];
|
|
210
|
+
r->d[0] = (uint64_t)a0;
|
|
211
|
+
uint128_t a1 = (uint128_t)carry3 * CN_MID + r->d[1] + (a0 >> 64);
|
|
212
|
+
r->d[1] = (uint64_t)a1;
|
|
213
|
+
uint128_t a2 = (uint128_t)carry3 + r->d[2] + (a1 >> 64);
|
|
214
|
+
r->d[2] = (uint64_t)a2;
|
|
215
|
+
uint128_t a3 = (uint128_t)r->d[3] + (a2 >> 64);
|
|
216
|
+
r->d[3] = (uint64_t)a3;
|
|
217
|
+
uint64_t topcarry = (uint64_t)(a3 >> 64); /* 0 or 1 — was H-1 dropped bit */
|
|
218
|
+
|
|
219
|
+
/* Branchless final reduction: keep (r - N) when topcarry is set OR r >= N.
|
|
220
|
+
*
|
|
221
|
+
* c_N == 2^256 - N, so subtracting N once when topcarry is set converts
|
|
222
|
+
* the dropped 2^256 into the correct +c_N residue. Total value V < 2N
|
|
223
|
+
* (from V < 2^257 and N ≈ 2^256), so a single conditional subtract of N
|
|
224
|
+
* is sufficient.
|
|
225
|
+
*
|
|
226
|
+
* Using (1 ^ borrow) instead of (borrow == 0) avoids any compiler
|
|
227
|
+
* latitude to emit a compare-and-branch for the predicate. */
|
|
205
228
|
uint256_t reduced;
|
|
206
|
-
uint64_t borrow = uint256_sub(&reduced, r, &CURVE_N);
|
|
207
|
-
uint64_t
|
|
229
|
+
uint64_t borrow = uint256_sub(&reduced, r, &CURVE_N); /* borrow==0 <=> r >= N */
|
|
230
|
+
uint64_t keep_reduced = topcarry | (1 ^ borrow);
|
|
231
|
+
uint64_t mask = ct_mask_u64(keep_reduced);
|
|
208
232
|
for (i = 0; i < 4; i++) {
|
|
209
|
-
r->d[i] = (
|
|
233
|
+
r->d[i] = (reduced.d[i] & mask) | (r->d[i] & ~mask);
|
|
210
234
|
}
|
|
211
235
|
}
|
|
212
236
|
|
|
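The final-reduction comments rely on the identity 2^256 ≡ c_N (mod N): when the top carry is set, the wrapped 256-bit result of `r - N` already equals `r + c_N`, which is the correct residue. A small Ruby check of that argument is sketched below. `final_reduce_model` is a hypothetical name, the value is assumed to satisfy V < 2N as the comments state, and Ruby integers are used only to verify the arithmetic, not the timing behaviour.

```ruby
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
TWO256 = 1 << 256
FULL = TWO256 - 1
C_N = TWO256 - N # the fold constant called c_N in the comments

# r is the low 256 bits of V; topcarry is the bit above bit 255 (0 or 1).
def final_reduce_model(r, topcarry)
  diff = r - N
  borrow = diff.negative? ? 1 : 0
  reduced = diff & FULL                  # wraps modulo 2^256 like a 4-limb subtract
  keep_reduced = topcarry | (1 ^ borrow) # take r - N if the carry is set or r >= N
  mask = -keep_reduced & FULL
  (reduced & mask) | (r & ~mask & FULL)
end

v = TWO256 + 3 # a value whose top bit was previously dropped
puts final_reduce_model(v & FULL, v >> 256) == v % N
```

With `topcarry` ignored, the same input would return 3 instead of `3 + C_N`, which is the dropped-bit error the hunk describes as H-1.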
@@ -269,6 +293,9 @@ static void scalar_sqr_internal(uint256_t *r, const uint256_t *a)
|
|
|
269
293
|
* scalar_add_internal — modular addition mod N.
|
|
270
294
|
*
|
|
271
295
|
* Computes a + b, then branchlessly subtracts N if the result >= N.
|
|
296
|
+
*
|
|
297
|
+
* Precondition: a, b < N (canonical). Pre-reduction is the wrapper's
|
|
298
|
+
* responsibility — see rb_scalar_add.
|
|
272
299
|
*/
|
|
273
300
|
void scalar_add_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
274
301
|
{
|
|
@@ -283,7 +310,7 @@ void scalar_add_internal(uint256_t *r, const uint256_t *a, const uint256_t *b)
|
|
|
283
310
|
* If overflow == 0 and borrow == 0: sum >= N, want reduced.
|
|
284
311
|
* If overflow == 0 and borrow == 1: sum < N, want sum. */
|
|
285
312
|
uint64_t keep_original = (~overflow) & borrow;
|
|
286
|
-
uint64_t mask =
|
|
313
|
+
uint64_t mask = ct_mask_u64(keep_original);
|
|
287
314
|
int i;
|
|
288
315
|
for (i = 0; i < 4; i++) {
|
|
289
316
|
r->d[i] = (sum.d[i] & mask) | (reduced.d[i] & ~mask);
|
|
@@ -322,26 +349,30 @@ void scalar_inv_internal(uint256_t *r, const uint256_t *a)
|
|
|
322
349
|
* call-seq:
|
|
323
350
|
* Secp256k1Native.scalar_mod(a) -> Integer
|
|
324
351
|
*
|
|
325
|
-
* Reduce +a+ modulo the curve order N.
|
|
326
|
-
*
|
|
352
|
+
* Reduce +a+ modulo the curve order N. Accepts any Ruby Integer — negative,
|
|
353
|
+
* positive, and arbitrary width (including values >= 2^256).
|
|
327
354
|
*/
|
|
328
355
|
static VALUE rb_scalar_mod(VALUE self, VALUE a)
|
|
329
356
|
{
|
|
330
357
|
(void)self;
|
|
331
358
|
|
|
332
|
-
/*
|
|
333
|
-
*
|
|
334
|
-
|
|
335
|
-
|
|
336
|
-
|
|
337
|
-
|
|
338
|
-
if (negative) {
|
|
339
|
-
/* Ruby % is always non-negative when the modulus is positive */
|
|
340
|
-
a_norm = rb_funcall(a, rb_intern("%"), 1, n_rb);
|
|
341
|
-
} else {
|
|
342
|
-
a_norm = a;
|
|
359
|
+
/* L-1: reject non-Integer BEFORE Ruby `%` is dispatched on the receiver.
|
|
360
|
+
* Without this, a String would raise NoMethodError (no `%` of Integer),
|
|
361
|
+
* and any object whose `%` happens to return an Integer would silently
|
|
362
|
+
* succeed — both bypass the wrapper's documented TypeError contract. */
|
|
363
|
+
if (!RB_INTEGER_TYPE_P(a)) {
|
|
364
|
+
rb_raise(rb_eTypeError, "expected Integer");
|
|
343
365
|
}
|
|
344
366
|
|
|
367
|
+
/* L-4: pre-reduce via Ruby `%` unconditionally. This is intentionally
|
|
368
|
+
* different from the other scalar wrappers (which use the C-level
|
|
369
|
+
* scalar_reduce): Ruby `%` handles both negative inputs (returns the
|
|
370
|
+
* non-negative residue) and arbitrary width (rb_to_uint256 would raise
|
|
371
|
+
* "exceeds 256 bits" on values >= 2^256 otherwise), so it is the right
|
|
372
|
+
* canonicalisation primitive at this boundary. */
|
|
373
|
+
VALUE n_rb = uint256_to_rb(&CURVE_N);
|
|
374
|
+
VALUE a_norm = rb_funcall(a, rb_intern("%"), 1, n_rb);
|
|
375
|
+
|
|
345
376
|
uint256_t ua = rb_to_uint256(a_norm);
|
|
346
377
|
uint256_t zero_limbs = {{ 0ULL, 0ULL, 0ULL, 0ULL }};
|
|
347
378
|
uint256_t r;
|
|
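The L-1 comment in `rb_scalar_mod` notes that dispatching `%` on an arbitrary receiver lets any object with a suitable `%` method through. The sketch below models that in Ruby. `FakeNumber`, `scalar_mod_unguarded`, and `scalar_mod_guarded` are hypothetical names introduced for illustration and are not the gem's API.

```ruby
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

# Hypothetical stand-in for an object that happens to define `%`.
class FakeNumber
  def %(_modulus)
    42
  end
end

# Without a type check, duck typing decides what is accepted.
def scalar_mod_unguarded(a)
  a % N
end

# With the check, only real Integers reach the reduction.
def scalar_mod_guarded(a)
  raise TypeError, 'expected Integer' unless a.is_a?(Integer)

  a % N
end

puts scalar_mod_unguarded(FakeNumber.new) # the fake object is accepted
puts scalar_mod_guarded(-1) == N - 1      # negatives map to the non-negative residue
```

Ruby's `%` with a positive modulus always returns a value in [0, N), for negative and arbitrarily wide Integers alike, which is why the wrapper can hand the result to a 256-bit marshalling step afterwards.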
@@ -360,8 +391,18 @@ static VALUE rb_scalar_mul(VALUE self, VALUE a, VALUE b)
|
|
|
360
391
|
(void)self;
|
|
361
392
|
uint256_t ua = rb_to_uint256(a);
|
|
362
393
|
uint256_t ub = rb_to_uint256(b);
|
|
394
|
+
|
|
395
|
+
/* Defence in depth: pre-reduce both operands mod N before multiplying.
|
|
396
|
+
* scalar_mul_internal is correct on any 256-bit operand pair after the
|
|
397
|
+
* H-1 fix, so this is belt-and-braces — but it makes the Ruby boundary's
|
|
398
|
+
* input contract explicit and consistent with rb_scalar_inv. */
|
|
399
|
+
uint256_t zero_limbs = {{ 0ULL, 0ULL, 0ULL, 0ULL }};
|
|
400
|
+
uint256_t ua_reduced, ub_reduced;
|
|
401
|
+
scalar_reduce(&ua_reduced, &zero_limbs, &ua);
|
|
402
|
+
scalar_reduce(&ub_reduced, &zero_limbs, &ub);
|
|
403
|
+
|
|
363
404
|
uint256_t r;
|
|
364
|
-
scalar_mul_internal(&r, &
|
|
405
|
+
scalar_mul_internal(&r, &ua_reduced, &ub_reduced);
|
|
365
406
|
return uint256_to_rb(&r);
|
|
366
407
|
}
|
|
367
408
|
|
|
@@ -403,8 +444,18 @@ static VALUE rb_scalar_add(VALUE self, VALUE a, VALUE b)
|
|
|
403
444
|
(void)self;
|
|
404
445
|
uint256_t ua = rb_to_uint256(a);
|
|
405
446
|
uint256_t ub = rb_to_uint256(b);
|
|
447
|
+
|
|
448
|
+
/* M-1 correctness fix: scalar_add_internal subtracts N at most once and is
|
|
449
|
+
* therefore correct only when both operands are already < N. Pre-reduce
|
|
450
|
+
* both operands mod N so the wrapper's documented `(a + b) mod N` contract
|
|
451
|
+
* holds for any 256-bit input (mirrors rb_scalar_inv / rb_scalar_mul). */
|
|
452
|
+
uint256_t zero_limbs = {{ 0ULL, 0ULL, 0ULL, 0ULL }};
|
|
453
|
+
uint256_t ua_reduced, ub_reduced;
|
|
454
|
+
scalar_reduce(&ua_reduced, &zero_limbs, &ua);
|
|
455
|
+
scalar_reduce(&ub_reduced, &zero_limbs, &ub);
|
|
456
|
+
|
|
406
457
|
uint256_t r;
|
|
407
|
-
scalar_add_internal(&r, &
|
|
458
|
+
scalar_add_internal(&r, &ua_reduced, &ub_reduced);
|
|
408
459
|
return uint256_to_rb(&r);
|
|
409
460
|
}
|
|
410
461
|
|
|
data/ext/secp256k1_native/secp256k1_native.h
CHANGED
|
@@ -107,11 +107,41 @@ void register_scalar_methods(VALUE mod);
|
|
|
107
107
|
* Branchless selection helper
|
|
108
108
|
* ----------------------------------------------------------------------- */
|
|
109
109
|
|
|
110
|
+
/* Opaque value barrier: returns x unchanged, but the empty volatile asm forces
|
|
111
|
+
* the compiler to treat the result as an unknown register value. Without it,
|
|
112
|
+
* GCC (observed on 15.2, -O2) recognises the all-0s/all-1s select masks used
|
|
113
|
+
* throughout this extension, reconstructs the original boolean, and emits a
|
|
114
|
+
* secret-dependent conditional jump — defeating the branchless intent. This is
|
|
115
|
+
* the same technique libsecp256k1/BoringSSL use to keep constant-time selects
|
|
116
|
+
* flat. On compilers without GNU asm (e.g. MSVC, where this extension is a
|
|
117
|
+
* no-op anyway) it degrades to an identity function. */
|
|
118
|
+
static inline uint64_t ct_value_barrier_u64(uint64_t x) {
|
|
119
|
+
#if defined(__GNUC__) || defined(__clang__)
|
|
120
|
+
__asm__ volatile("" : "+r"(x));
|
|
121
|
+
#endif
|
|
122
|
+
return x;
|
|
123
|
+
}
|
|
124
|
+
|
|
125
|
+
/* Build a constant-time select mask: all-ones (0xFFFF...FF) when flag != 0,
|
|
126
|
+
* all-zeros otherwise. The value barrier is applied here so that EVERY mask in
|
|
127
|
+
* this extension is opaque to the optimiser before it feeds a branchless
|
|
128
|
+
* mask-select — both polarities are used in this codebase:
|
|
129
|
+
* (a & mask) | (b & ~mask) — selects `a` when mask is all-ones
|
|
130
|
+
* (a & ~mask) | (b & mask) — selects `b` when mask is all-ones (e.g. uint256_select)
|
|
131
|
+
* Either form is equivalent for an all-0/all-1 mask; the comment lists both so
|
|
132
|
+
* an auditor reading a call site knows the polarity is intentional, not a bug.
|
|
133
|
+
* All constant-time masks MUST be constructed through this helper — a raw
|
|
134
|
+
* `-(uint64_t)(cond)` is a latent branch waiting for the compiler to
|
|
135
|
+
* reconstruct it. */
|
|
136
|
+
static inline uint64_t ct_mask_u64(uint64_t flag) {
|
|
137
|
+
return ct_value_barrier_u64(-(uint64_t)(flag != 0));
|
|
138
|
+
}
|
|
139
|
+
|
|
110
140
|
/* Branchless conditional select: if flag is non-zero, *r = *b; else *r = *a.
|
|
111
141
|
* Constant-time: no branch on flag. */
|
|
112
142
|
static inline void uint256_select(uint256_t *r, const uint256_t *a,
|
|
113
143
|
const uint256_t *b, uint64_t flag) {
|
|
114
|
-
uint64_t mask =
|
|
144
|
+
uint64_t mask = ct_mask_u64(flag);
|
|
115
145
|
r->d[0] = (a->d[0] & ~mask) | (b->d[0] & mask);
|
|
116
146
|
r->d[1] = (a->d[1] & ~mask) | (b->d[1] & mask);
|
|
117
147
|
r->d[2] = (a->d[2] & ~mask) | (b->d[2] & mask);
|
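The two select polarities listed in the `ct_mask_u64` comment can be checked with a short Ruby model. This sketch only illustrates which operand each form picks. The names are hypothetical, 64-bit words are emulated by masking, and nothing here reproduces the value barrier or any constant-time property, since Ruby Integer operations make no timing guarantees.

```ruby
MASK64 = 2**64 - 1

# Model of the mask construction: all ones when flag != 0, otherwise zero.
def ct_mask_model(flag)
  -(flag.zero? ? 0 : 1) & MASK64
end

# (a & mask) | (b & ~mask): picks a when the mask is all ones.
def select_a_when_set(a, b, mask)
  (a & mask) | (b & ~mask & MASK64)
end

# (a & ~mask) | (b & mask): picks b when the mask is all ones.
def select_b_when_set(a, b, mask)
  (a & ~mask & MASK64) | (b & mask)
end

a = 0x1111_2222_3333_4444
b = 0xAAAA_BBBB_CCCC_DDDD
puts select_a_when_set(a, b, ct_mask_model(1)) == a
puts select_b_when_set(a, b, ct_mask_model(1)) == b
```

Reading a call site therefore requires knowing which form it uses: the same all-ones mask selects opposite operands in the two expressions.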
data/lib/secp256k1/version.rb
CHANGED
data/lib/secp256k1.rb
CHANGED
|
@@ -138,12 +138,28 @@ module Secp256k1
|
|
|
138
138
|
end
|
|
139
139
|
|
|
140
140
|
# Modular subtraction in the field.
|
|
141
|
+
#
|
|
142
|
+
# Canonicalises both operands so the result matches the C wrapper for any
|
|
143
|
+
# *non-negative* 256-bit input — load-bearing for the dfuzz differential,
|
|
144
|
+
# where pure-Ruby serves as the oracle. The dfuzz harness only feeds
|
|
145
|
+
# non-negative inputs (xorshift output, plus structured P-band vectors),
|
|
146
|
+
# so the differential never observes the negative case.
|
|
147
|
+
#
|
|
148
|
+
# Note: pure-Ruby accepts negative inputs (Ruby `%` canonicalises them);
|
|
149
|
+
# the C wrapper rejects negatives via `rb_to_uint256`. Backend parity
|
|
150
|
+
# holds for all >= 0 inputs; intentional divergence on negatives.
|
|
141
151
|
def fsub(a, b)
|
|
152
|
+
a %= P
|
|
153
|
+
b %= P
|
|
142
154
|
a >= b ? a - b : P - (b - a)
|
|
143
155
|
end
|
|
144
156
|
|
|
145
157
|
# Modular negation in the field.
|
|
158
|
+
#
|
|
159
|
+
# Canonicalises the operand so the result matches the C wrapper for any
|
|
160
|
+
# non-negative 256-bit input — see {#fsub} for the negative-input note.
|
|
146
161
|
def fneg(a)
|
|
162
|
+
a %= P
|
|
147
163
|
a.zero? ? 0 : P - a
|
|
148
164
|
end
|
|
149
165
|
|
|
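The effect of the two added `%= P` lines in `fsub` can be seen by comparing the method body with and without them. The sketch below copies the expression from the hunk into two standalone methods. The names `fsub_before` and `fsub_after` are hypothetical labels for the two sides of the diff.

```ruby
P = 2**256 - 2**32 - 977 # secp256k1 field prime

# Body without the canonicalising lines (the pre-change behaviour).
def fsub_before(a, b)
  a >= b ? a - b : P - (b - a)
end

# Body with both operands reduced first (the post-change behaviour).
def fsub_after(a, b)
  a %= P
  b %= P
  a >= b ? a - b : P - (b - a)
end

puts fsub_before(0, P + 1) # a negative value, outside [0, P)
puts fsub_after(0, P + 1) == P - 1
```

For operands already in [0, P) the two versions agree, so the change only affects non-canonical inputs such as `P + 1` or negative Integers.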
@@ -291,7 +307,10 @@ module Secp256k1
|
|
|
291
307
|
|
|
292
308
|
# @!visibility private
|
|
293
309
|
# Cache for precomputed wNAF tables, keyed by "window:x:y".
|
|
294
|
-
#
|
|
310
|
+
# FIFO eviction: the oldest *inserted* entry is dropped when the cap
|
|
311
|
+
# is reached (Hash preserves insertion order; we delete the first key).
|
|
312
|
+
# Bounded at WNAF_CACHE_MAX entries; keyed only on the public base
|
|
313
|
+
# point — no secret-scalar exposure.
|
|
295
314
|
WNAF_TABLE_CACHE = {} # rubocop:disable Style/MutableConstant
|
|
296
315
|
|
|
297
316
|
# @!visibility private
|
|
@@ -317,7 +336,7 @@ module Secp256k1
|
|
|
317
336
|
tbl = WNAF_TABLE_CACHE[cache_key]
|
|
318
337
|
|
|
319
338
|
if tbl.nil?
|
|
320
|
-
#
|
|
339
|
+
# FIFO eviction: drop the oldest *inserted* entry when the cache is full.
|
|
321
340
|
WNAF_TABLE_CACHE.delete(WNAF_TABLE_CACHE.keys.first) if WNAF_TABLE_CACHE.size >= WNAF_CACHE_MAX
|
|
322
341
|
|
|
323
342
|
tbl_size = 1 << (window - 1) # e.g. w=5 -> 16 entries
|
|
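The FIFO wording matters because a lookup does not refresh an entry's position: the first key inserted is the first one evicted, even if it was read most recently. A minimal sketch of that behaviour follows, assuming a plain Hash and a hypothetical `FifoCache` class that is not part of the gem.

```ruby
# Hypothetical bounded cache using the same eviction expression as the hunk.
class FifoCache
  def initialize(max)
    @max = max
    @store = {}
  end

  # Returns the cached value, or computes it with the block and stores it.
  def fetch(key)
    hit = @store[key]
    return hit unless hit.nil?

    # Hash preserves insertion order, so keys.first is the oldest insert.
    @store.delete(@store.keys.first) if @store.size >= @max
    @store[key] = yield
  end

  def keys
    @store.keys
  end
end

cache = FifoCache.new(2)
cache.fetch(:a) { 1 }
cache.fetch(:b) { 2 }
cache.fetch(:a) { 0 } # hit: block not used, order unchanged
cache.fetch(:c) { 3 } # evicts :a, the oldest insert
p cache.keys
```

An LRU policy would have evicted `:b` in this sequence instead, which is the distinction the updated comment draws.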
@@ -437,7 +456,24 @@ module Secp256k1
|
|
|
437
456
|
|
|
438
457
|
# @param x [Integer, nil] x-coordinate (nil for infinity)
|
|
439
458
|
# @param y [Integer, nil] y-coordinate (nil for infinity)
|
|
459
|
+
# @raise [ArgumentError] if x and y are not both nil and not both
|
|
460
|
+
# Integers in [0, P)
|
|
440
461
|
def initialize(x, y)
|
|
462
|
+
# I-3 mitigation, hardened: only two valid shapes are accepted —
|
|
463
|
+
# the point at infinity (nil, nil), or a finite point with both
|
|
464
|
+
# coordinates canonical in [0, P). Catches Point.new(1, P-of-range),
|
|
465
|
+
# Point.new(-1, 5), Point.new(nil, 5), and similar half-states at
|
|
466
|
+
# construction so no downstream path (negate, to_octet_string,
|
|
467
|
+
# on_curve?) has to second-guess the invariant.
|
|
468
|
+
if x.nil? && y.nil?
|
|
469
|
+
# point at infinity — both coordinates absent
|
|
470
|
+
elsif x.is_a?(Integer) && y.is_a?(Integer) && x >= 0 && x < P && y >= 0 && y < P
|
|
471
|
+
# finite point with canonical coordinates
|
|
472
|
+
else
|
|
473
|
+
raise ArgumentError,
|
|
474
|
+
'Point requires (nil, nil) for infinity or two Integers in [0, P)'
|
|
475
|
+
end
|
|
476
|
+
|
|
441
477
|
@x = x
|
|
442
478
|
@y = y
|
|
443
479
|
end
|
|
@@ -449,6 +485,37 @@ module Secp256k1
|
|
|
449
485
|
new(nil, nil)
|
|
450
486
|
end
|
|
451
487
|
|
|
488
|
+
# Construct a Point from raw (x, y) coordinates with curve-membership
|
|
489
|
+
# validation. This is the **required** entry point for caller-supplied
|
|
490
|
+
# coordinates (e.g. from an external protocol or user input).
|
|
491
|
+
#
|
|
492
|
+
# `Point.new` is intended for always-on-curve intermediates produced by
|
|
493
|
+
# `mul` / `mul_vt` / `add` / `negate`; it validates only the range of
|
|
494
|
+
# the coordinates, not that they satisfy y² = x³ + 7 (mod P). Calling
|
|
495
|
+
# `mul` on a Point constructed via `Point.new` with off-curve
|
|
496
|
+
# coordinates is an invalid-curve precondition that this method
|
|
497
|
+
# exists to close (L-5).
|
|
498
|
+
#
|
|
499
|
+
# @param x [Integer] x-coordinate in [0, P)
|
|
500
|
+
# @param y [Integer] y-coordinate in [0, P)
|
|
501
|
+
# @return [Point]
|
|
502
|
+
# @raise [ArgumentError] if x or y is nil (use `Point.infinity` for
|
|
503
|
+
# infinity); if x or y is not an Integer in [0, P) (raised by `new`);
|
|
504
|
+
# or if (x, y) is not on the curve
|
|
505
|
+
def self.from_coordinates(x, y)
|
|
506
|
+
# Reject the (nil, nil) infinity shape that Point.new accepts. This
|
|
507
|
+
# method's contract is "raw (x, y) Integers"; callers wanting infinity
|
|
508
|
+
# should use Point.infinity (or Point.new(nil, nil) on the internal path).
|
|
509
|
+
# Without this check, on_curve? returns true for infinity and we would
|
|
510
|
+
# silently return it.
|
|
511
|
+
raise ArgumentError, 'x and y must be Integers' if x.nil? || y.nil?
|
|
512
|
+
|
|
513
|
+
pt = new(x, y)
|
|
514
|
+
raise ArgumentError, 'point is not on the secp256k1 curve' unless pt.on_curve?
|
|
515
|
+
|
|
516
|
+
pt
|
|
517
|
+
end
|
|
518
|
+
|
|
452
519
|
# The generator point G.
|
|
453
520
|
#
|
|
454
521
|
# @return [Point]
|
|
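The split between the range-only constructor and the curve-validating `from_coordinates` can be sketched in isolation. `SketchPoint` below is a hypothetical stand-in for the gem's `Point`, and treating infinity as on-curve is an assumption taken from the comment in the hunk above:

```ruby
# Hypothetical stand-in for illustration; not the gem's Point class.
class SketchPoint
  P = 2**256 - 2**32 - 977 # secp256k1 field prime

  attr_reader :x, :y

  # Range-only validation: (nil, nil) or two Integers in [0, P).
  def initialize(x, y)
    finite = [x, y].all? { |v| v.is_a?(Integer) && v >= 0 && v < P }
    unless (x.nil? && y.nil?) || finite
      raise ArgumentError,
            'Point requires (nil, nil) for infinity or two Integers in [0, P)'
    end

    @x = x
    @y = y
  end

  def infinity?
    @x.nil? && @y.nil?
  end

  # Assumption: infinity counts as on-curve; a finite point must
  # satisfy y^2 = x^3 + 7 (mod P).
  def on_curve?
    return true if infinity?

    ((@y * @y) - (@x**3) - 7) % P == 0
  end

  # Validating entry point for caller-supplied coordinates.
  def self.from_coordinates(x, y)
    # Reject nil first, otherwise infinity would pass on_curve? silently.
    raise ArgumentError, 'x and y must be Integers' if x.nil? || y.nil?

    pt = new(x, y)
    raise ArgumentError, 'point is not on the secp256k1 curve' unless pt.on_curve?

    pt
  end
end

# Standard secp256k1 generator coordinates.
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

g = SketchPoint.from_coordinates(GX, GY)
puts g.on_curve?
```

In this sketch, `SketchPoint.new(1, 1)` succeeds because both values are in range, while `SketchPoint.from_coordinates(1, 1)` raises, which is the gap the new method is described as closing.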
@@ -464,6 +531,15 @@ module Secp256k1
|
|
|
464
531
|
# @raise [ArgumentError] if the encoding is invalid or the point
|
|
465
532
|
# is not on the curve
|
|
466
533
|
def self.from_bytes(bytes)
|
|
534
|
+
# I-4: reject non-String / empty input up front with a clean
|
|
535
|
+
# ArgumentError. Without this, nil / Float / Integer raise
|
|
536
|
+
# NoMethodError (on `.encoding`), and an empty String raises
|
|
537
|
+
# NoMethodError (on `nil.to_s` in the else-branch error formatting).
|
|
538
|
+
# All fail closed either way, but the error type is wrong.
|
|
539
|
+
unless bytes.is_a?(String) && !bytes.empty?
|
|
540
|
+
raise ArgumentError, 'bytes must be a non-empty String'
|
|
541
|
+
end
|
|
542
|
+
|
|
467
543
|
bytes = bytes.b if bytes.encoding != Encoding::BINARY
|
|
468
544
|
prefix = bytes.getbyte(0)
|
|
469
545
|
|
|
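The up-front guard added to `from_bytes` can be illustrated on its own. The helper name `leading_prefix_byte` is hypothetical; it only reproduces the guard and the first two context lines of the hunk above, not the rest of the decoding:

```ruby
# Hypothetical helper for illustration; not part of the gem.
def leading_prefix_byte(bytes)
  # Reject nil, numbers, and empty strings with one consistent error type.
  unless bytes.is_a?(String) && !bytes.empty?
    raise ArgumentError, 'bytes must be a non-empty String'
  end

  # Normalise to a binary-encoded copy before reading raw bytes.
  bytes = bytes.b if bytes.encoding != Encoding::BINARY
  bytes.getbyte(0)
end

p leading_prefix_byte("\x04" + 'payload') # prints 4

[nil, 1.0, 4, ''].each do |bad|
  begin
    leading_prefix_byte(bad)
  rescue ArgumentError => e
    puts "#{bad.inspect}: #{e.message}"
  end
end
```

Checking the type before touching `.encoding` is what turns the `NoMethodError` cases described in the comment into `ArgumentError`.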
@@ -558,10 +634,8 @@ module Secp256k1
|
|
|
558
634
|
'Set SECP256K1_ALLOW_PURE_RUBY_CT=1 or call Secp256k1.allow_pure_ruby_ct! to override.'
|
|
559
635
|
end
|
|
560
636
|
|
|
561
|
-
|
|
562
|
-
|
|
563
|
-
scalar %= N
|
|
564
|
-
return self.class.infinity if scalar.zero?
|
|
637
|
+
scalar = normalise_scalar(scalar)
|
|
638
|
+
return self.class.infinity if scalar.nil?
|
|
565
639
|
|
|
566
640
|
jp = Secp256k1.scalar_multiply_ct(scalar, @x, @y)
|
|
567
641
|
affine = Secp256k1.jp_to_affine(jp)
|
|
@@ -582,10 +656,8 @@ module Secp256k1
|
|
|
582
656
|
# @param scalar [Integer] the public scalar multiplier
|
|
583
657
|
# @return [Point] the resulting point
|
|
584
658
|
def mul_vt(scalar)
|
|
585
|
-
|
|
586
|
-
|
|
587
|
-
scalar %= N
|
|
588
|
-
return self.class.infinity if scalar.zero?
|
|
659
|
+
scalar = normalise_scalar(scalar)
|
|
660
|
+
return self.class.infinity if scalar.nil?
|
|
589
661
|
|
|
590
662
|
jp = Secp256k1.scalar_multiply_wnaf(scalar, @x, @y)
|
|
591
663
|
affine = Secp256k1.jp_to_affine(jp)
|
|
@@ -594,6 +666,28 @@ module Secp256k1
|
|
|
594
666
|
self.class.new(affine[0], affine[1])
|
|
595
667
|
end
|
|
596
668
|
|
|
669
|
+
private
|
|
670
|
+
|
|
671
|
+
# Validate and canonicalise a scalar for multiplication (L-2).
|
|
672
|
+
#
|
|
673
|
+
# @param scalar [Integer] the scalar multiplier
|
|
674
|
+
# @return [Integer, nil] the scalar reduced mod N, or nil if the
|
|
675
|
+
# product would be infinity (scalar is zero mod N, or self is the
|
|
676
|
+
# point at infinity)
|
|
677
|
+
# @raise [ArgumentError] if scalar is not an Integer
|
|
678
|
+
def normalise_scalar(scalar)
|
|
679
|
+
raise ArgumentError, 'scalar must be an Integer' unless scalar.is_a?(Integer)
|
|
680
|
+
|
|
681
|
+
return nil if infinity?
|
|
682
|
+
|
|
683
|
+
scalar %= N
|
|
684
|
+
return nil if scalar.zero?
|
|
685
|
+
|
|
686
|
+
scalar
|
|
687
|
+
end
|
|
688
|
+
|
|
689
|
+
public
|
|
690
|
+
|
|
597
691
|
# Point addition: self + other.
|
|
598
692
|
#
|
|
599
693
|
# @param other [Point]
|
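The shared scalar handling that `mul` and `mul_vt` now delegate to can be sketched as a standalone function. The names `ORDER_N` and `normalise_scalar_sketch`, and the `point_is_infinity:` keyword standing in for the receiver's `infinity?` check, are assumptions made for illustration:

```ruby
# Hypothetical standalone version of the private helper in the diff.
# secp256k1 group order, called N in the diff.
ORDER_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

# Returns the scalar reduced mod N, or nil when the product would be
# the point at infinity.
def normalise_scalar_sketch(scalar, point_is_infinity: false)
  raise ArgumentError, 'scalar must be an Integer' unless scalar.is_a?(Integer)

  # k * infinity is infinity for every k, so no reduction is needed.
  return nil if point_is_infinity

  # Ruby's % with a positive modulus yields a value in [0, N),
  # so negative scalars are canonicalised as well.
  reduced = scalar % ORDER_N
  reduced.zero? ? nil : reduced
end

p normalise_scalar_sketch(ORDER_N + 5) # prints 5
p normalise_scalar_sketch(ORDER_N)     # prints nil
```

Returning `nil` gives the callers a single early-exit condition for both "zero scalar" and "infinity receiver", which is how the diff uses it.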
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: secp256k1-native
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 0.18.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Simon Bettison
|
|
@@ -32,7 +32,6 @@ files:
|
|
|
32
32
|
- ext/secp256k1_native/secp256k1_native.h
|
|
33
33
|
- lib/secp256k1.rb
|
|
34
34
|
- lib/secp256k1/version.rb
|
|
35
|
-
- lib/secp256k1_native.bundle
|
|
36
35
|
- secp256k1-native.gemspec
|
|
37
36
|
homepage: https://github.com/sgbett/secp256k1-native
|
|
38
37
|
licenses:
|
data/lib/secp256k1_native.bundle
DELETED
|
Binary file
|