amoradbx 2.0.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,43 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [2.0.0] - 2026-04-03
9
+
10
+ ### Added
11
+ - **Multi-threading support**: Added `WorkerPool` and `SharedArrayBuffer` support for parallel operations (up to 8 threads).
12
+ - **SIMD acceleration**: Integrated `wasm_simd128` for faster group probing in Swiss Tables and bulk memory operations.
13
+ - **Swiss Table design**: Replaced standard hash map with a high-performance Swiss Table implementation for better cache locality and probing efficiency.
14
+ - **Bloom Filters**: Added 256KB Bloom Filters per shard (64 shards total) to minimize expensive lookups for non-existent keys.
15
+ - **Skip List index**: Implemented a 16-level sorted index for efficient prefix and range scans.
16
+ - **Write-Ahead Log (WAL)**: Added durability via an append-only WAL with CRC32C integrity checks and automatic recovery.
17
+ - **Slab Allocator**: Custom 32-class slab allocator to reduce memory fragmentation and allocation overhead.
18
+ - **Atomic Batches**: Support for ACID-like batch writes with rollback on failure.
19
+ - **Buffered Writes**: Added command buffer (512KB) for high-throughput asynchronous writes.
20
+ - **Snapshot Export/Import**: Full database snapshots with per-record CRC32C checksums.
21
+ - **RapidHash**: Switched to RapidHash for improved hash distribution and performance.
22
+ - **Real-world Benchmark**: Introduced `benchmark.js` for accurate performance profiling including JS bridge overhead and latency percentiles (P50/P99).
23
+ - **npm package**: Added `package.json` for publishing as `amoradbx`.
24
+
25
+ ### Changed
26
+ - **Increased Limits**: Maximum value size increased to 1MB (16x previous limit).
27
+ - **Key Optimization**: Added 22-byte inline fast path for small keys to eliminate heap allocation.
28
+ - **Memory Management**: Optimized memory growth and boundary checks for large datasets.
29
+
30
+ ### Fixed
31
+ - **Windows WAL fix**: Optimized file-writing strategy in `amora.js` to prevent `EPERM` errors during WAL updates on Windows.
32
+ - Improved race condition handling in multi-threaded environments using `_Atomic` types and explicit memory ordering.
33
+ - Enhanced CRC32C performance with a slice-by-4 unrolled implementation.
34
+
35
+ ## [1.0.0] - 2026-04-02
36
+
37
+ ### Added
38
+ - Initial release of AmoraDB.
39
+ - Core key-value engine in C and WebAssembly.
40
+ - Basic Node.js bindings.
41
+ - Support for `set`, `get`, `has`, and `delete` operations.
42
+ - Simple in-memory storage.
43
+ - Basic benchmarking harness.
package/CONTRIBUTING.md ADDED
@@ -0,0 +1,77 @@
1
+ # Contributing to AmoraDB
2
+
3
+ Thank you for your interest in contributing to AmoraDB! We're excited to have you join our community.
4
+
5
+ ## Code of Conduct
6
+
7
+ Please follow our Code of Conduct in all your interactions with the project.
8
+
9
+ ## How to Contribute
10
+
11
+ 1. **Report Bugs**: If you find a bug, please open an issue with a clear description and steps to reproduce.
12
+ 2. **Suggest Features**: Have an idea for a new feature? Open an issue to discuss it.
13
+ 3. **Submit Pull Requests**:
14
+ * Fork the repository.
15
+ * Create a new branch for your changes.
16
+ * Implement your changes and add tests.
17
+ * Ensure all tests pass (`node test.js`).
18
+ * Submit a pull request.
19
+
20
+ ## Development Environment Setup
21
+
22
+ ### Prerequisites
23
+
24
+ * **Node.js**: Version 18 or higher.
25
+ * **Emscripten** or **wasi-sdk**: To compile the C core to WebAssembly.
26
+
27
+ ### Building from Source
28
+
29
+ To build the `amora_core_mt_simd.wasm` binary with full multi-threading and SIMD support:
30
+
31
+ ```bash
32
+ clang --target=wasm32 -O3 -nostdlib -std=c11 \
33
+ -msimd128 -matomics -mbulk-memory \
34
+ -fvisibility=hidden -ffunction-sections -fdata-sections \
35
+ -Wl,--no-entry -Wl,--export-dynamic \
36
+ -Wl,--import-memory -Wl,--shared-memory \
37
+ -Wl,--max-memory=4294967296 \
38
+ -Wl,--gc-sections -Wl,--allow-undefined -Wl,--lto-O3 \
39
+ -flto -DCACHE_LINE=256 \
40
+ -o amora_core_mt_simd.wasm amora_core.c
41
+ ```
42
+
43
+ ### Running Tests
44
+
45
+ Run the full test suite using Node.js:
46
+
47
+ ```bash
48
+ node test.js
49
+ ```
50
+
51
+ The test suite covers performance benchmarks, stress tests, and core functionality checks.
52
+
53
+ ## Coding Standards
54
+
55
+ ### C Core (`amora_core.c`)
56
+
57
+ * Standard C11.
58
+ * No standard library dependencies (`-nostdlib`).
59
+ * Use `u8`, `u16`, `u32`, `u64` for fixed-width integers.
60
+ * Optimize for performance: use `INLINE` for small functions and minimize allocations.
61
+ * Maintain thread safety using `_Atomic` types and memory ordering where necessary.
62
+
63
+ ### Node.js Binding (`amora.js`)
64
+
65
+ * Follow standard Node.js practices.
66
+ * Keep the binding compatible with every supported Node.js version (minimum v18).
67
+ * Maintain the high-performance philosophy: avoid unnecessary copying and allocations.
68
+ * Use `SharedArrayBuffer` for multi-threaded communication.
69
+
70
+ ## Documentation
71
+
72
+ * Update the `README.md` and `SPEC.md` for any major architectural changes or new features.
73
+ * Keep comments clear and concise.
74
+
75
+ ## Licensing
76
+
77
+ By contributing to AmoraDB, you agree that your contributions will be licensed under the MIT License.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Amoracoin
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,360 @@
1
+ <div align="center">
2
+
3
+ <img src="https://github.com/amoracoin-org/AmoraDb/blob/main/assets/amoradb-logo.jpeg" alt="AmoraDB" width="180"/>
4
+
5
+ # AmoraDB
6
+
7
+ **Ultra-High-Performance Embedded Key-Value Engine**
8
+
9
+ *Built in C · Compiled to WebAssembly · Bound to Node.js*
10
+
11
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blueviolet.svg)](#license)
12
+ [![WebAssembly](https://img.shields.io/badge/WebAssembly-SIMD128-654ff0?logo=webassembly)](https://webassembly.org/)
13
+ [![Node.js](https://img.shields.io/badge/Node.js-Binding-339933?logo=nodedotjs)](https://nodejs.org/)
14
+ [![Version](https://img.shields.io/badge/version-2.0-ff69b4)](#)
15
+
16
+ </div>
17
+
18
+ ---
19
+
20
+ ### 📖 Documentation
21
+
22
+ - **[CHANGELOG.md](CHANGELOG.md)**: Track all notable changes and version history.
23
+ - **[CONTRIBUTING.md](CONTRIBUTING.md)**: Guide for setting up development environment and contributing code.
24
+ - **[SPEC.md](SPEC.md)**: Deep dive into the internal architecture, sharding, and binary formats.
25
+
26
+ ---
27
+
28
+ AmoraDB is a hand-crafted, zero-dependency key-value store written entirely in C and compiled to WebAssembly. It is designed for scenarios where you need **millions of operations per second**, atomic batch writes, WAL-backed durability, and a minimal footprint, all inside a Node.js process with no native addons.
29
+
30
+ In the in-memory benchmarks listed under Performance, it outperforms LevelDB and is competitive with LMDB, with the portability advantage of running anywhere WebAssembly runs.
31
+
32
+ ---
33
+
34
+ ## ✨ Features
35
+
36
+ | Capability | Detail |
37
+ |---|---|
38
+ | **Hash Map** | Swiss Table design with SIMD-accelerated group probing (`wasm_simd128`) |
39
+ | **Sharding** | 64 independent shards with shard-level spinlocks for low-contention parallelism |
40
+ | **Bloom Filters** | 256 KB per shard (2M bits) for near-zero-cost negative lookups |
41
+ | **Skip List** | 16-level sorted index, supports billions of entries |
42
+ | **WAL** | Append-only Write-Ahead Log with CRC32C integrity, up to 32 MB |
43
+ | **Snapshots** | Full export/import with per-record CRC32C checksums |
44
+ | **Atomic Batches** | ACID-like batch writes with rollback on failure |
45
+ | **Worker Threads** | Shared `SharedArrayBuffer` memory across up to 8 threads |
46
+ | **Slab Allocator** | 32 size classes, `SLAB_MIN_SIZE` 16B → `SLAB_MAX_SIZE` per class |
47
+ | **GC / Compaction** | Tombstone-aware compaction with auto-compact threshold |
48
+ | **Large Values** | Up to 1 MB per value (16× previous limit) |
49
+ | **Large Keys** | Up to 4 KB per key, with 22-byte inline fast path |
50
+
51
+ ---
52
+
53
+ ## 📊 Performance
54
+
55
+ Benchmarks run with the built-in `db.bench(1_000_000)` harness (C-level, 1M operations, in-memory):
56
+
57
+ | Operation | Throughput |
58
+ |---|---|
59
+ | Write | ~1.5M+ ops/s |
60
+ | Read | ~1.7M+ ops/s |
61
+ | Delete | ~1.7M+ ops/s |
62
+
63
+ ### Comparison (in-memory, single node)
64
+
65
+ | Engine | Write/s | Read/s | Notes |
66
+ |---|---|---|---|
67
+ | **AmoraDB v2.0** | ~1.5M+ | ~1.7M+ | WASM · SIMD · 64 shards |
68
+ | LMDB | ~1.2M | ~2.0M | mmap · B+Tree · C addon |
69
+ | RocksDB | ~700K | ~900K | LSM · disk-tuned · C addon |
70
+ | LevelDB | ~400K | ~600K | LSM · disk · C addon |
71
+ | Redis (local) | ~500K | ~800K | TCP overhead · RAM |
72
+
73
+ ### Internal benchmark 1,000,000 ops (pure C)
74
+
75
+ The internal benchmark measures raw performance at the C level, bypassing the Node.js bridge.
76
+
77
+ ### Real-World Benchmark (Node.js)
78
+
79
+ For a more accurate measure of performance including JavaScript overhead and real-world latency, run:
80
+
81
+ ```bash
82
+ node benchmark.js
83
+ ```
84
+
85
+ This benchmark covers:
86
+ - **JS <-> WASM Bridge**: Real overhead of calling the engine from Node.js.
87
+ - **Latency (P50/P99)**: Tracks sub-millisecond response times.
88
+ - **Concurrency**: Parallel execution across worker threads.
89
+ - **Persistence**: Impact of WAL durability (Async vs Sync).
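
To make the latency bullet concrete, here is a minimal sketch of how P50/P99 figures can be derived from per-call timings. It is not taken from `benchmark.js`: the helper names `sampleLatencies` and `percentile` are hypothetical, a plain `Map` stands in for the database, and the nearest-rank method is just one common way to define a percentile.

```javascript
// Time each call to fn and return per-call latencies in nanoseconds, sorted ascending.
function sampleLatencies(fn, iterations) {
  const samples = new Array(iterations);
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint();
    fn(i);
    samples[i] = Number(process.hrtime.bigint() - start);
  }
  return samples.sort((a, b) => a - b);
}

// Nearest-rank percentile over an ascending array.
function percentile(sorted, p) {
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// A plain Map stands in for the database in this sketch.
const store = new Map();
const latencies = sampleLatencies((i) => store.set(`key:${i}`, `value:${i}`), 10_000);
console.log(`P50=${percentile(latencies, 50)}ns P99=${percentile(latencies, 99)}ns`);
```

Reporting P99 alongside P50 matters because a few slow calls (GC pauses, WAL syncs) barely move the median but dominate tail latency.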
90
+
91
+ > Benchmarks are illustrative. Results vary by hardware, key size, value size, and access pattern.
92
+ >
93
+ > ⚠️ Note: LevelDB and RocksDB are disk-first engines optimized for persistence and compaction. Redis includes TCP overhead. This comparison reflects raw in-process throughput only, not overall capability. Choose the right tool for your use case.
94
+ ---
95
+
96
+ ## 🏗 Architecture
97
+
98
+ ```
99
+ ┌─────────────────────────────────────────────────┐
100
+ │                    amora.js                     │  Node.js Binding
101
+ │ LRU string cache · encodeInto · SharedArrayBuf  │
102
+ │  Command buffer (512KB) · Worker pool (8 max)   │
103
+ └────────────────────┬────────────────────────────┘
104
+                      │ WebAssembly
105
+ ┌────────────────────▼────────────────────────────┐
106
+ │              amora_core.c → .wasm               │  Core Engine
107
+ │                                                 │
108
+ │  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
109
+ │  │ Shard 0  │  │ Shard 1  │  │  ...63   │       │  64 Shards
110
+ │  │ SwissMap │  │ SwissMap │  │ SwissMap │       │
111
+ │  │  Bloom   │  │  Bloom   │  │  Bloom   │       │  256KB Bloom/shard
112
+ │  └──────────┘  └──────────┘  └──────────┘       │
113
+ │                                                 │
114
+ │  ┌──────────────────────────────────────────┐   │
115
+ │  │          Skip List (16 levels)           │   │  Sorted index
116
+ │  └──────────────────────────────────────────┘   │
117
+ │                                                 │
118
+ │  ┌──────────────────────────────────────────┐   │
119
+ │  │    WAL (CRC32C · 32MB · append-only)     │   │  Durability
120
+ │  └──────────────────────────────────────────┘   │
121
+ └─────────────────────────────────────────────────┘
122
+ ```
123
+
124
+ **Hash function:** [RapidHash](https://github.com/Nicoshev/rapidhash): excellent distribution, minimal collisions.
125
+
126
+ **Integrity:** CRC32C with a 256-entry lookup table, processed 4 bytes per iteration (slice-by-4 unroll).
127
+
128
+ **Atomics:** When compiled with `__wasm_threads__`, all shard counters use `_Atomic` types with explicit memory ordering (acquire/release/relaxed).
129
+
130
+ ---
131
+
132
+ ## 🚀 Getting Started
133
+
134
+ ### Prerequisites
135
+
136
+ - Node.js ≥ 18
137
+ - Bundled WebAssembly binary (`amora_core_mt_simd.wasm`) when installed from npm (or build from source, see below)
138
+
139
+ ### Installation
140
+
141
+ ```bash
142
+ npm install amoradbx
143
+ ```
144
+
145
+ ### Basic Usage
146
+
147
+ ```js
148
+ const AmoraDB = require('amoradbx');
149
+
150
+ // Open a database (in-memory + optional WAL persistence)
151
+ const db = AmoraDB.open(null, {
152
+ threads: 4,
153
+ cap: 65536, // Initial capacity (entries)
154
+ walPath: './my.wal', // Omit for pure in-memory
155
+ walSync: true, // fsync on every WAL flush
156
+ });
157
+
158
+ // Basic operations
159
+ db.set('user:1', 'alice');
160
+ db.get('user:1'); // → 'alice'
161
+ db.has('user:1'); // → true
162
+ db.delete('user:1');
163
+
164
+ // Multi-get (assumes 'user:1' and 'user:2' currently hold 'alice' and 'bob')
165
+ const results = db.mget(['user:1', 'user:2', 'missing']);
166
+ // → ['alice', 'bob', null]
167
+
168
+ // Prefix scan
169
+ const entries = db.scan('user:');
170
+ // → [{ key: 'user:1', value: 'alice' }, ...]
171
+
172
+ // Range scan
173
+ const range = db.range('user:1', 'user:9');
174
+ ```
175
+
176
+ ### Atomic Batch Writes
177
+
178
+ ```js
179
+ db.batch([
180
+ { op: 'set', key: 'account:1', value: '500' },
181
+ { op: 'set', key: 'account:2', value: '300' },
182
+ { op: 'delete', key: 'account:old' },
183
+ ]);
184
+ // All succeed or all roll back: no partial writes.
185
+ ```
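
The all-or-nothing behaviour can be pictured with an undo log. The sketch below is only an illustration over a plain `Map`, not AmoraDB's implementation (which lives in the C core); the function name `applyBatch` and the error raised for unknown operations are assumptions.

```javascript
// Apply ops in order; on any failure, replay the undo log in reverse and rethrow.
function applyBatch(store, ops) {
  const undo = [];
  try {
    for (const { op, key, value } of ops) {
      // Remember the previous state of the key before touching it.
      undo.push([key, store.has(key), store.get(key)]);
      if (op === 'set') store.set(key, value);
      else if (op === 'delete') store.delete(key);
      else throw new Error(`unknown op: ${op}`);
    }
  } catch (err) {
    for (const [key, existed, previous] of undo.reverse()) {
      if (existed) store.set(key, previous);
      else store.delete(key);
    }
    throw err;
  }
}

const accounts = new Map([['account:1', '100']]);
try {
  applyBatch(accounts, [
    { op: 'set', key: 'account:1', value: '500' },
    { op: 'set', key: 'account:2', value: '300' },
    { op: 'bogus', key: 'account:3' }, // forces a rollback
  ]);
} catch (err) {
  console.log(err.message, [...accounts]);
}
```

Replaying the log in reverse matters when the same key appears more than once in a batch, since the earliest recorded state must be the one restored last.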
186
+
187
+ ### Buffered Writes (High Throughput)
188
+
189
+ ```js
190
+ // setBuffered queues writes in a 512KB command buffer
191
+ for (let i = 0; i < 100_000; i++) {
192
+ db.setBuffered(`key:${i}`, `value:${i}`);
193
+ }
194
+ db.flush(); // Commit all at once
195
+ ```
196
+
197
+ ### Snapshot Export / Import
198
+
199
+ ```js
200
+ // Export entire database to a Buffer
201
+ const snapshot = db.export(128 * 1024 * 1024); // 128MB max
202
+
203
+ // Import into a fresh instance
204
+ const db2 = AmoraDB.open(null, { cap: 65536 });
205
+ const count = db2.import(snapshot);
206
+ console.log(`Imported ${count} entries`);
207
+ ```
208
+
209
+ ---
210
+
211
+ ## ⚙️ Configuration Options
212
+
213
+ | Option | Type | Default | Description |
214
+ |---|---|---|---|
215
+ | `threads` | `number` | `1` | Number of worker threads (max 8) |
216
+ | `cap` | `number` | `65536` | Initial hash map capacity per shard |
217
+ | `walPath` | `string\|null` | `null` | Path for WAL file. `null` = no persistence |
218
+ | `walSync` | `boolean` | `true` | `fsync` after every WAL flush |
219
+
220
+ ---
221
+
222
+ ## 🔧 API Reference
223
+
224
+ ### Core Operations
225
+
226
+ | Method | Returns | Description |
227
+ |---|---|---|
228
+ | `db.heartbeat()` | `boolean` | Check if WASM core is responsive |
229
+ | `db.set(key, value)` | `void` | Insert or update a key |
230
+ | `db.get(key)` | `string\|null` | Retrieve a value |
231
+ | `db.has(key)` | `boolean` | Check existence (uses bloom filter) |
232
+ | `db.delete(key)` | `void` | Remove a key |
233
+ | `db.setBuffered(key, value)` | `void` | Buffered write (flush manually) |
234
+ | `db.mget(keys[])` | `(string\|null)[]` | Batch get (up to 8192 keys) |
235
+ | `db.batch(ops[])` | `void` | Atomic multi-operation batch |
236
+ | `db.scan(prefix)` | `{key,value}[]` | All keys with given prefix |
237
+ | `db.range(from, to)` | `{key,value}[]` | Lexicographic range scan |
238
+
239
+ ### Maintenance
240
+
241
+ | Method | Returns | Description |
242
+ |---|---|---|
243
+ | `db.gc()` | `number` | Compact tombstones across all shards |
244
+ | `db.autoCompact()` | `number` | GC if fragmentation > 25% |
245
+ | `db.fragmentation()` | `number` | Current fragmentation percentage |
246
+ | `db.flush()` | `void` | Flush pending command buffer |
247
+ | `db.persist()` | `void` | Flush + WAL write |
248
+ | `db.restore()` | `void` | Reload from WAL |
249
+ | `db.reset(cap?)` | `void` | Wipe and reinitialize |
250
+ | `db.close()` | `Promise<void>` | Async: flush, persist, terminate workers |
251
+ | `db.import(buf)` | `number` | Import snapshot buffer |
252
+ | `db.export(max?)` | `Buffer` | Export database to snapshot buffer |
253
+
254
+ ### Observability
255
+
256
+ ```js
257
+ const s = db.stats();
258
+ // {
259
+ // count: number, // Live entries
260
+ // capacity: number, // Allocated slots
261
+ // deleted: number, // Tombstone count
262
+ // shards: 64,
263
+ // threads: number,
264
+ // load: '73.2%',
265
+ // fragmentation: '12%',
266
+ // hit_rate: '98.7%',
267
+ // total_ops: number,
268
+ // write_errors: number,
269
+ // wal_errors: number,
270
+ // compactions: number,
271
+ // arena_kb: string,
272
+ // wal_kb: string,
273
+ // wasm_mb: string,
274
+ // mem_shared: boolean,
275
+ // }
276
+ ```
277
+
278
+ ### Benchmark
279
+
280
+ ```js
281
+ const b = db.bench(1_000_000);
282
+ // {
283
+ // ops: 1000000,
284
+ // write_ms: '312ms', write_ops_s: '3.20M',
285
+ // read_ms: '276ms', read_ops_s: '3.62M',
286
+ // delete_ms: '289ms', delete_ops_s: '1.73M',
287
+ // scan_ms: '1.42ms',
288
+ // }
289
+ ```
290
+
291
+ ---
292
+
293
+ ## 🔒 Limits & Safety
294
+
295
+ | Constraint | Limit |
296
+ |---|---|
297
+ | Maximum key size | 4,096 bytes |
298
+ | Maximum value size | 1,048,576 bytes (1 MB) |
299
+ | Inline key fast path | ≤ 22 bytes (zero heap allocation) |
300
+ | Shards | 64 |
301
+ | Worker threads | 8 |
302
+ | Scan results per call | 4,096 |
303
+ | WAL max size | 32 MB |
304
+
305
+ All limits are validated on the JavaScript side before reaching WASM. Violations throw a `RangeError` immediately.
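
A JavaScript-side check of this kind might look like the sketch below. It is an assumption-laden illustration rather than the code in `amora.js`: the function name `assertWithinLimits`, the error messages, and the choice to measure strings as UTF-8 bytes are all hypothetical.

```javascript
const MAX_KEY_BYTES = 4096;        // 4 KB key limit from the table above
const MAX_VALUE_BYTES = 1_048_576; // 1 MB value limit from the table above

// Throw a RangeError before any data would be handed to WASM.
function assertWithinLimits(key, value) {
  const keyBytes = Buffer.byteLength(key, 'utf8');
  if (keyBytes > MAX_KEY_BYTES) {
    throw new RangeError(`key is ${keyBytes} bytes; limit is ${MAX_KEY_BYTES}`);
  }
  const valueBytes = Buffer.byteLength(value, 'utf8');
  if (valueBytes > MAX_VALUE_BYTES) {
    throw new RangeError(`value is ${valueBytes} bytes; limit is ${MAX_VALUE_BYTES}`);
  }
}

try {
  assertWithinLimits('k'.repeat(4097), 'v');
} catch (err) {
  console.log(err instanceof RangeError, err.message);
}
```

Measuring with `Buffer.byteLength` rather than `string.length` matters because a multi-byte character counts once in JavaScript but occupies several bytes once encoded.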
306
+
307
+ ---
308
+
309
+ ## 🛠 Building from Source
310
+
311
+ AmoraDB core is written in standard C11 with optional SIMD and atomics extensions for WebAssembly. Build with [Emscripten](https://emscripten.org/) or [wasi-sdk](https://github.com/WebAssembly/wasi-sdk):
312
+
313
+ ```bash
314
+ clang --target=wasm32 -O3 -nostdlib -std=c11 \
315
+ -msimd128 -matomics -mbulk-memory \
316
+ -fvisibility=hidden -ffunction-sections -fdata-sections \
317
+ -Wl,--no-entry -Wl,--export-dynamic \
318
+ -Wl,--import-memory -Wl,--shared-memory \
319
+ -Wl,--max-memory=4294967296 \
320
+ -Wl,--gc-sections -Wl,--allow-undefined -Wl,--lto-O3 \
321
+ -flto -DCACHE_LINE=256 \
322
+ -o amora_core_mt_simd.wasm amora_core.c
323
+ ```
324
+
325
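+ > The `-msimd128`, `-matomics`, and `-mbulk-memory` flags enable full multi-threaded SIMD mode. To build a single-threaded fallback, remove them along with `-Wl,--shared-memory`, which the linker only accepts when atomics and bulk memory are enabled.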
326
+
327
+ ---
328
+
329
+ ## 📁 Project Structure
330
+
331
+ ```
332
+ amora/
333
+ ├── amora_core.c             # Core engine: hash map, WAL, bloom, skip list, slab
334
+ ├── amora_core_mt_simd.wasm  # Compiled WebAssembly binary: multi-threaded, SIMD acceleration
335
+ ├── amora.js                 # Node.js binding: workers, caching, serialization
336
+ └── test.js                  # Full test suite with stress and benchmarks
337
+ ```
338
+
339
+ ---
340
+
341
+ ## 🧪 Running Tests
342
+
343
+ ```bash
344
+ node test.js
345
+ ```
346
+
347
+ The test suite covers: heartbeat, set/get/has/delete, large values (512 KB), key/value size validation, update / 1000 rewrites, mget, atomic batch, batch rollback (single and repeated), prefix scan, fragmentation + GC, snapshot export/import with CRC corruption detection, stats, stress (100K keys), and the internal 1M-op C benchmark.
348
+
349
+ ---
350
+
351
+ ## 📄 License
352
+
353
+ MIT © AmoraDB Authors
354
+
355
+ ---
356
+
357
+ <div align="center">
358
+ <sub>Built with obsession for performance. No dependencies. No compromises.</sub>
359
+ </div>
360
+
package/SPEC.md ADDED
@@ -0,0 +1,85 @@
1
+ # AmoraDB Technical Specification
2
+
3
+ AmoraDB is an ultra-high-performance, zero-dependency, sharded key-value storage engine built in C and compiled to WebAssembly for Node.js environments.
4
+
5
+ ## 1. Core Architecture
6
+
7
+ ### 1.1 Sharding Strategy
8
+ - **Shards**: 64 independent shards (`N_SHARDS = 64`).
9
+ - **Parallelism**: Shard-level locking using atomic spinlocks for multi-threaded operations.
10
+ - **Distribution**: Keys are hashed using `RapidHash`, and the top 6 bits determine the shard index.
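
As a rough illustration of the distribution rule, the sketch below takes a 64-bit hash as a `BigInt` and keeps only its top 6 bits. The function name `shardIndex` and the sample hash values are hypothetical; the engine itself does this in C on its RapidHash output.

```javascript
const N_SHARDS = 64; // 2^6, so 6 bits select a shard

// Map a 64-bit hash to a shard index in [0, 63] using its top 6 bits.
function shardIndex(hash64) {
  const h = BigInt.asUintN(64, hash64); // clamp to an unsigned 64-bit value
  return Number(h >> 58n);              // 64 - 6 = 58 low bits are discarded
}

console.log(shardIndex(0xffffffffffffffffn)); // 63
console.log(shardIndex(0x0400000000000000n)); // 1
```

Taking the top bits for shard selection leaves the low bits free for in-shard slot selection, so the two choices do not reuse the same part of the hash.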
11
+
12
+ ### 1.2 Hash Table (Swiss Table)
13
+ - **Design**: Google's Swiss Table inspired design for high cache locality and SIMD probing.
14
+ - **Group Size**: 16 slots per group (`GROUP_SIZE = 16`).
15
+ - **Control Bytes**: Each slot has a 1-byte control value (Empty `0x80`, Deleted `0xFE`, or 7-bit hash suffix).
16
+ - **Probing**: Linear probing with `wasm_simd128` group filtering.
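
The group filter can be pictured without SIMD as a loop that compares all 16 control bytes against one target byte and returns a bitmask of matches. The sketch below is a scalar stand-in for the `wasm_simd128` compare, with a hypothetical function name `matchGroup` and made-up slot contents; only the constants come from the list above.

```javascript
const GROUP_SIZE = 16;
const CTRL_EMPTY = 0x80;
const CTRL_DELETED = 0xfe;

// Return a 16-bit mask with bit i set when ctrl[start + i] equals `byte`.
function matchGroup(ctrl, start, byte) {
  let mask = 0;
  for (let i = 0; i < GROUP_SIZE; i++) {
    if (ctrl[start + i] === byte) mask |= 1 << i;
  }
  return mask;
}

// One group: mostly empty, two occupied slots sharing a 7-bit hash value, one tombstone.
const ctrl = new Uint8Array(GROUP_SIZE).fill(CTRL_EMPTY);
ctrl[3] = 0x2a;
ctrl[7] = CTRL_DELETED;
ctrl[9] = 0x2a;

const candidates = matchGroup(ctrl, 0, 0x2a);           // slots worth a full key compare
const hasEmpty = matchGroup(ctrl, 0, CTRL_EMPTY) !== 0; // an empty slot ends the probe
console.log(candidates.toString(2), hasEmpty);
```

Only the slots flagged in `candidates` need a full key comparison, which is where most of the probing savings come from.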
17
+
18
+ ### 1.3 Bloom Filters
19
+ - **Memory**: 256 KB per shard (2M bits total per shard).
20
+ - **Functions**: 6 hash functions per lookup.
21
+ - **Effectiveness**: Near-zero cost for negative lookups (non-existent keys).
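
For readers unfamiliar with the structure, here is a minimal Bloom filter sketch sized like one shard's filter (2,097,152 bits, 6 probes). Everything beyond those two numbers is an assumption: the class name `Bloom`, the use of FNV-1a as a stand-in hash (not RapidHash), and the double-hashing scheme used to derive the 6 bit positions.

```javascript
const BLOOM_BITS = 256 * 1024 * 8; // 2,097,152 bits = 256 KB
const BLOOM_K = 6;                 // probes per key

// Stand-in 32-bit FNV-1a hash; the seed perturbs the offset basis.
function fnv1a(str, seed) {
  let h = (2166136261 ^ seed) >>> 0;
  for (const byte of Buffer.from(str, 'utf8')) {
    h ^= byte;
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h;
}

class Bloom {
  constructor() {
    this.bits = new Uint8Array(BLOOM_BITS / 8);
  }
  // Derive K bit positions from two hashes: h1 + i * h2 (mod number of bits).
  *positions(key) {
    const h1 = fnv1a(key, 0);
    const h2 = (fnv1a(key, 0x9e3779b9) | 1) >>> 0; // odd step
    for (let i = 0; i < BLOOM_K; i++) yield (h1 + i * h2) % BLOOM_BITS;
  }
  add(key) {
    for (const p of this.positions(key)) this.bits[p >>> 3] |= 1 << (p & 7);
  }
  // false means definitely absent; true means possibly present.
  mightContain(key) {
    for (const p of this.positions(key)) {
      if ((this.bits[p >>> 3] & (1 << (p & 7))) === 0) return false;
    }
    return true;
  }
}

const bloom = new Bloom();
bloom.add('user:1');
console.log(bloom.mightContain('user:1')); // true
```

A `false` answer lets a lookup skip the hash table entirely, while a `true` answer still requires the real lookup because false positives are possible.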
22
+
23
+ ### 1.4 Sorted Index (Skip List)
24
+ - **Levels**: 16-level probabilistic Skip List.
25
+ - **Capacity**: Supports up to 1M nodes.
26
+ - **Operations**: Prefix scans and lexicographic range queries.
27
+
28
+ ### 1.5 Memory Management
29
+ - **Slab Allocator**: Custom 20-class slab allocator for key/value storage.
30
+ - **Classes**: 16B to 1MB size classes.
31
+ - **GC/Compaction**: Tombstone-aware garbage collection with a 25% fragmentation threshold.
32
+ - **Inline Keys**: Keys ≤ 22 bytes are stored directly within the hash table slot to avoid extra allocations.
33
+
34
+ ## 2. Durability & Integrity
35
+
36
+ ### 2.1 Write-Ahead Log (WAL)
37
+ - **Format**: Append-only log with CRC32C checksums for every record.
38
+ - **Header**: 8-byte header (`WAL_MAGIC = 0x414D5257`, `WAL_VERSION = 20`).
39
+ - **Size Limit**: 32 MB per WAL file.
40
+ - **Recovery**: Automatic replay on database open with error reporting for corrupted entries.
41
+
42
+ ### 2.2 Data Integrity
43
+ - **Checksums**: Slice-by-4 CRC32C unrolled implementation for high-speed verification.
44
+ - **Snapshots**: Full database exports include per-record CRC32C checksums to detect data corruption.
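
For reference, CRC32C is the CRC-32 variant built on the Castagnoli polynomial (reflected form `0x82F63B78`). The sketch below is a plain byte-at-a-time table implementation, not the slice-by-4 variant described above; it is meant only to show what value the checksum fields hold. The names `CRC32C_TABLE` and `crc32c` are illustrative.

```javascript
// Build the 256-entry lookup table for the reflected Castagnoli polynomial.
const CRC32C_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) c = c & 1 ? (c >>> 1) ^ 0x82f63b78 : c >>> 1;
  CRC32C_TABLE[n] = c >>> 0;
}

// Byte-at-a-time CRC32C over a Buffer or Uint8Array.
function crc32c(bytes) {
  let crc = 0xffffffff;
  for (const b of bytes) crc = CRC32C_TABLE[(crc ^ b) & 0xff] ^ (crc >>> 8);
  return (crc ^ 0xffffffff) >>> 0;
}

// Standard check value for the ASCII string "123456789".
console.log(crc32c(Buffer.from('123456789')).toString(16)); // e3069283
```

A slice-by-4 version produces the same result but consumes four input bytes per loop iteration, trading larger tables for fewer dependent lookups.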
45
+
46
+ ## 3. Concurrency & Performance
47
+
48
+ ### 3.1 Multi-threading
49
+ - **Workers**: Up to 8 worker threads using Node.js `worker_threads`.
50
+ - **Memory**: Shared `SharedArrayBuffer` memory between main thread and workers.
51
+ - **Synchronization**: `stdatomic.h` with `acquire/release/relaxed` memory ordering for lock-free counters and spinlocks.
52
+
53
+ ### 3.2 Performance Targets
54
+ - **Writes**: 1.5M+ ops/s (in-memory, single node).
55
+ - **Reads**: 1.7M+ ops/s (in-memory, single node).
56
+ - **Latency**: Sub-microsecond access times for small key/value pairs.
57
+
58
+ ## 4. API Specification
59
+
60
+ ### 4.1 Core Methods
61
+ - `set(key, value)`: Atomic insertion or update.
62
+ - `get(key)`: Fast retrieval; the Bloom filter short-circuits lookups for keys that are not present.
63
+ - `has(key)`: Existence check that consults the Bloom filter first.
64
+ - `delete(key)`: Logical deletion with tombstone marking.
65
+ - `batch(ops[])`: ACID-like atomic batch operations with rollback capability.
66
+
67
+ ### 4.2 Scanning
68
+ - `scan(prefix)`: Prefix-based range scan.
69
+ - `range(from, to)`: Inclusive lexicographic range scan.
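
Both scans reduce to "seek to the first key at or after a bound, then walk forward in order". The sketch below shows that idea over a sorted array, which stands in for the skip list's bottom level purely for brevity; the function names are hypothetical, and JavaScript's string comparison only matches byte-wise ordering for ASCII keys.

```javascript
// Index of the first key >= target in an ascending array (binary search).
function lowerBound(keys, target) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (keys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Prefix scan: seek to the prefix, then walk while keys still start with it.
function scanPrefix(keys, prefix) {
  const out = [];
  for (let i = lowerBound(keys, prefix); i < keys.length && keys[i].startsWith(prefix); i++) {
    out.push(keys[i]);
  }
  return out;
}

// Inclusive range scan: seek to `from`, then walk while keys are <= `to`.
function rangeInclusive(keys, from, to) {
  const out = [];
  for (let i = lowerBound(keys, from); i < keys.length && keys[i] <= to; i++) {
    out.push(keys[i]);
  }
  return out;
}

const keys = ['order:1', 'user:1', 'user:2', 'user:9', 'video:1'].sort();
console.log(scanPrefix(keys, 'user:'));
console.log(rangeInclusive(keys, 'user:1', 'user:2'));
```

Because all keys sharing a prefix are contiguous in sorted order, the prefix scan can stop at the first key that no longer matches.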
70
+
71
+ ## 5. Limits & Constraints
72
+
73
+ - **Maximum Key Size**: 4,096 bytes (4 KB).
74
+ - **Maximum Value Size**: 1,048,576 bytes (1 MB).
75
+ - **Inline Key Fast Path**: ≤ 22 bytes.
76
+ - **Max Threads**: 8.
77
+ - **Max WAL Size**: 32 MB.
78
+ - **Max Shards**: 64.
79
+ - **Max Memory**: Up to 4GB (WASM limit).
80
+
81
+ ## 6. Binary Format (WAL/Snapshots)
82
+
83
+ - **WAL Record**: `[Type:1][KLen:2][VLen:4][Key:KLen][Value:VLen][CRC:4]`
84
+ - **Snapshot Header**: `[Magic:4][Version:4][Count:4][Reserved:20]`
85
+ - **Snapshot Record**: `[KLen:2][VLen:4][Key:KLen][Value:VLen][CRC:4]`
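
To show how the WAL record layout reads in practice, the sketch below encodes and decodes one record with Node.js `Buffer`. Several details are assumptions not stated in this specification: little-endian integers, a CRC32C computed over every byte that precedes the CRC field, and the type code `1` for a set operation. The function names are hypothetical.

```javascript
// Bitwise CRC32C (reflected polynomial 0x82F63B78); slow but compact.
function crc32c(bytes) {
  let crc = 0xffffffff;
  for (const b of bytes) {
    crc ^= b;
    for (let k = 0; k < 8; k++) crc = crc & 1 ? (crc >>> 1) ^ 0x82f63b78 : crc >>> 1;
  }
  return (crc ^ 0xffffffff) >>> 0;
}

const HEADER_LEN = 1 + 2 + 4; // [Type:1][KLen:2][VLen:4]

// Layout: [Type:1][KLen:2][VLen:4][Key:KLen][Value:VLen][CRC:4]
function encodeRecord(type, key, value) {
  const k = Buffer.from(key, 'utf8');
  const v = Buffer.from(value, 'utf8');
  const buf = Buffer.alloc(HEADER_LEN + k.length + v.length + 4);
  let off = buf.writeUInt8(type, 0);
  off = buf.writeUInt16LE(k.length, off);
  off = buf.writeUInt32LE(v.length, off);
  off += k.copy(buf, off);
  off += v.copy(buf, off);
  buf.writeUInt32LE(crc32c(buf.subarray(0, off)), off); // assumed CRC coverage
  return buf;
}

function decodeRecord(buf) {
  const type = buf.readUInt8(0);
  const kLen = buf.readUInt16LE(1);
  const vLen = buf.readUInt32LE(3);
  const end = HEADER_LEN + kLen + vLen;
  if (crc32c(buf.subarray(0, end)) !== buf.readUInt32LE(end)) {
    throw new Error('CRC mismatch');
  }
  return {
    type,
    key: buf.toString('utf8', HEADER_LEN, HEADER_LEN + kLen),
    value: buf.toString('utf8', HEADER_LEN + kLen, end),
  };
}

const record = encodeRecord(1, 'user:1', 'alice'); // type 1 = set (assumed)
console.log(record.length, decodeRecord(record));
```

The 2-byte key length and 4-byte value length comfortably cover the 4 KB key and 1 MB value limits listed in section 5.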