slatedb 0.3.2.beta.3-x86_64-linux

data/README.md ADDED
@@ -0,0 +1,632 @@
1
+ # SlateDB Ruby
2
+
3
+ Ruby bindings for [SlateDB](https://slatedb.io), a cloud-native embedded key-value store built on object storage.
4
+
5
+ [![Build status](https://badge.buildkite.com/a7ae51f3a0bc7809cf66981641ec47b3c70db8cf349a5e462f.svg)](https://buildkite.com/catkins-test/slatedb-rb) [![Gem Version](https://badge.fury.io/rb/slatedb.svg)](https://badge.fury.io/rb/slatedb)
6
+
7
+ ## Production Readiness
8
+
9
+ These bindings are still in early development. SlateDB itself is used in production, but these bindings have not yet been. Contributions are welcome!
10
+
11
+ ### TODO
12
+
13
+ - [ ] Cross-compile native extensions
14
+
15
+ ## Installation
16
+
17
+ Add this line to your application's Gemfile:
18
+
19
+ ```ruby
20
+ gem 'slatedb'
21
+ ```
22
+
23
+ And then execute:
24
+
25
+ ```bash
26
+ bundle install
27
+ ```
28
+
29
+ Or install it yourself as:
30
+
31
+ ```bash
32
+ gem install slatedb
33
+ ```
34
+
35
+ > [!IMPORTANT]
36
+ > This gem currently requires a working Rust toolchain to install until the dependencies are cross-compiled.
37
+
38
+ ## Usage
39
+
40
+ ### Basic Operations
41
+
42
+ ```ruby
43
+ require 'slatedb'
44
+
45
+ # Open a database; without a url: option it uses in-memory storage (for testing)
46
+ db = SlateDb::Database.open("/tmp/mydb")
47
+
48
+ # Store a value
49
+ db.put("hello", "world")
50
+
51
+ # Retrieve a value
52
+ value = db.get("hello") # => "world"
53
+
54
+ # Delete a value
55
+ db.delete("hello")
56
+
57
+ # Close the database
58
+ db.close
59
+ ```
60
+
61
+ ### Block Form (Recommended)
62
+
63
+ The block form automatically closes the database when the block exits:
64
+
65
+ ```ruby
66
+ SlateDb::Database.open("/tmp/mydb") do |db|
67
+   db.put("key", "value")
68
+   db.get("key") # => "value"
69
+ end # automatically closed
70
+ ```
71
+
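The automatic close follows the usual Ruby `ensure` idiom. The sketch below shows that pattern with a hypothetical `FakeStore` stand-in so it runs without the gem; the class and its methods are assumptions for illustration, not SlateDB's implementation.

```ruby
# Hypothetical stand-in for a resource with open/close semantics.
class FakeStore
  attr_reader :closed

  def initialize
    @closed = false
    @data = {}
  end

  # With a block: yield the store, then close it even if the block raises.
  # Without a block: return the open store and leave closing to the caller.
  def self.open
    store = new
    return store unless block_given?

    begin
      yield store
    ensure
      store.close
    end
  end

  def put(key, value)
    @data[key] = value
  end

  def get(key)
    @data[key]
  end

  def close
    @closed = true
  end
end

seen = nil
result = FakeStore.open do |store|
  seen = store
  store.put("key", "value")
  store.get("key")
end

puts result       # value of the block
puts seen.closed  # the store was closed when the block exited
```

Because the close sits in an `ensure` clause, it also runs when the block raises, which is the main reason to prefer the block form.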
72
+ ### Persistent Storage
73
+
74
+ For persistent storage, provide an object store URL:
75
+
76
+ ```ruby
77
+ # Local filesystem
78
+ SlateDb::Database.open("/tmp/mydb", url: "file:///tmp/mydb") do |db|
79
+ db.put("key", "value")
80
+ end
81
+
82
+ # S3 (requires AWS credentials)
83
+ SlateDb::Database.open("mydb", url: "s3://mybucket/path") do |db|
84
+ db.put("key", "value")
85
+ end
86
+
87
+ # Azure Blob Storage
88
+ SlateDb::Database.open("mydb", url: "az://container/path") do |db|
89
+ db.put("key", "value")
90
+ end
91
+
92
+ # Google Cloud Storage
93
+ SlateDb::Database.open("mydb", url: "gs://bucket/path") do |db|
94
+ db.put("key", "value")
95
+ end
96
+ ```
97
+
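The URL scheme selects the storage backend. As a rough illustration, a configuration check might map schemes to the backends named in the comments above before opening the database. `backend_for` and `BACKENDS` are hypothetical helpers, not part of the gem, and the gem may accept schemes this table does not list.

```ruby
require "uri"

# Scheme-to-backend table taken from the examples above.
BACKENDS = {
  "file" => "local filesystem",
  "s3"   => "AWS S3",
  "az"   => "Azure Blob Storage",
  "gs"   => "Google Cloud Storage"
}.freeze

# Returns a human-readable backend name for an object store URL,
# or raises ArgumentError for a scheme this table does not cover.
def backend_for(url)
  scheme = URI.parse(url).scheme
  BACKENDS.fetch(scheme) do
    raise ArgumentError, "scheme not covered here: #{scheme.inspect}"
  end
end

puts backend_for("file:///tmp/mydb")
puts backend_for("s3://mybucket/path")
```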
98
+ #### Cloud Storage Credentials
99
+
100
+ SlateDB uses the [object_store](https://docs.rs/object_store) crate, which automatically discovers credentials from standard environment variables and configuration files:
101
+
102
+ **AWS S3:**
103
+ - Environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_REGION`
104
+ - Credential files: `~/.aws/credentials`, `~/.aws/config`
105
+ - IAM roles (when running on EC2/ECS/EKS)
106
+ - Web identity tokens (for IRSA on EKS)
107
+
108
+ **Azure Blob Storage:**
109
+ - Environment variables: `AZURE_STORAGE_ACCOUNT_NAME`, `AZURE_STORAGE_ACCOUNT_KEY`, `AZURE_STORAGE_SAS_TOKEN`
110
+ - Azure CLI credentials: `az login`
111
+ - Managed Identity (when running on Azure)
112
+
113
+ **Google Cloud Storage:**
114
+ - Environment variables: `GOOGLE_SERVICE_ACCOUNT`, `GOOGLE_SERVICE_ACCOUNT_PATH`, `GOOGLE_SERVICE_ACCOUNT_KEY`
115
+ - Application Default Credentials: `gcloud auth application-default login`
116
+ - Service account key file: `GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json`
117
+
118
+ Example with explicit AWS credentials:
119
+
120
+ ```ruby
121
+ # Set credentials via environment
122
+ ENV['AWS_ACCESS_KEY_ID'] = 'your-access-key'
123
+ ENV['AWS_SECRET_ACCESS_KEY'] = 'your-secret-key'
124
+ ENV['AWS_REGION'] = 'us-east-1'
125
+
126
+ SlateDb::Database.open("mydb", url: "s3://mybucket/path") do |db|
127
+ db.put("key", "value")
128
+ end
129
+ ```
130
+
131
+ ### Options
132
+
133
+ #### Put Options
134
+
135
+ ```ruby
136
+ # Set TTL (time-to-live) in milliseconds
137
+ db.put("key", "value", ttl: 60_000) # expires in 60 seconds
138
+
139
+ # Don't wait for durability
140
+ db.put("key", "value", await_durable: false)
141
+
142
+ # Supply an explicit sequence number (SlateDB >= 0.13.0)
143
+ db.put("key", "value", seqnum: 42)
144
+ ```
145
+
146
+ #### User-Supplied Sequence Numbers
147
+
148
+ By default SlateDB assigns a monotonically increasing sequence number to every
149
+ write. Since SlateDB 0.13.0 you can instead supply your own via `seqnum:`. The
150
+ value must be **strictly greater** than the current maximum sequence number, or
151
+ the write is rejected with `SlateDb::InvalidArgumentError`. This is useful when
152
+ replaying an external log or coordinating sequence numbers across systems.
153
+
154
+ ```ruby
155
+ db.put("key", "value", seqnum: 1_000)
156
+ db.delete("old", seqnum: 1_001)
157
+ db.merge("counter", "5", seqnum: 1_002) # requires a merge operator
158
+ db.write(batch, seqnum: 1_003) # applied across the batch
159
+ db.batch(seqnum: 1_004) { |b| b.put("k", "v") }
160
+
161
+ # The sequence number is reflected in the stored record
162
+ db.put("key", "value", seqnum: 2_000)
163
+ db.get_key_value("key")[:seq] # => 2000
164
+
165
+ # On a transaction it is supplied at commit time
166
+ txn = db.begin_transaction
167
+ txn.put("k", "v")
168
+ txn.commit(seqnum: 3_000)
169
+ ```
170
+
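The "strictly greater" rule can be modelled in a few lines of plain Ruby. This is a hypothetical in-memory model (`SeqModel` and its error class are made up for illustration, and starting the counter at 0 is an assumption); it only mirrors the acceptance rule described above, not the gem's internals.

```ruby
# Hypothetical model of the sequence number rule; not the gem's code.
class SeqModel
  class InvalidArgumentError < StandardError; end

  attr_reader :max_seq

  def initialize
    @max_seq = 0
  end

  # Returns the sequence number assigned to a write.
  # Without seqnum, the next number is assigned automatically.
  # With seqnum, it must be strictly greater than the current maximum.
  def write(seqnum: nil)
    seq = seqnum || @max_seq + 1
    unless seq > @max_seq
      raise InvalidArgumentError, "seqnum #{seq} must be > #{@max_seq}"
    end
    @max_seq = seq
  end
end

model = SeqModel.new
model.write                 # assigned 1
model.write(seqnum: 1_000)  # accepted, maximum is now 1000
model.write                 # assigned 1001

begin
  model.write(seqnum: 1_001) # equal to the maximum, so rejected
rescue SeqModel::InvalidArgumentError => e
  puts "rejected: #{e.message}"
end
```

When replaying an external log, this means entries have to be applied in increasing sequence order, since any value at or below the current maximum is refused.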
171
+ #### Get Options
172
+
173
+ ```ruby
174
+ # Filter by durability level
175
+ db.get("key", durability_filter: "memory")
176
+ db.get("key", durability_filter: "remote")
177
+
178
+ # Include uncommitted data
179
+ db.get("key", dirty: true)
180
+ ```
181
+
182
+ #### Key-Value Metadata
183
+
184
+ SlateDB can return the full key-value record, including storage metadata:
185
+
186
+ ```ruby
187
+ db.put("key", "value")
188
+ entry = db.get_key_value("key")
189
+ # => { key: "key", value: "value", seq: 1, create_ts: 1_765_000_000_000, expire_ts: nil }
190
+
191
+ entry[:value] # => "value"
192
+ entry[:seq] # SlateDB sequence number
193
+ entry[:create_ts] # creation timestamp in milliseconds
194
+ entry[:expire_ts] # expiration timestamp in milliseconds, or nil
195
+
196
+ # Alias for the same API
197
+ db.get_entry("key")
198
+
199
+ # The same read options accepted by #get are supported
200
+ db.get_key_value("key", durability_filter: "memory", cache_blocks: false)
201
+ ```
202
+
203
+ Missing keys return `nil`, matching `#get`.
204
+
205
+ #### Delete Options
206
+
207
+ ```ruby
208
+ # Don't wait for durability
209
+ db.delete("key", await_durable: false)
210
+
211
+ # Supply an explicit sequence number (SlateDB >= 0.13.0)
212
+ db.delete("key", seqnum: 42)
213
+ ```
214
+
215
+ ### Scanning
216
+
217
+ Iterate over key ranges using the `scan` method:
218
+
219
+ ```ruby
220
+ # Scan all keys from "a" onwards
221
+ db.scan("a").each do |key, value|
222
+ puts "#{key}: #{value}"
223
+ end
224
+
225
+ # Scan a specific range [start, end)
226
+ db.scan("a", "z").each do |key, value|
227
+ puts "#{key}: #{value}"
228
+ end
229
+
230
+ # Scan in descending key order
231
+ db.scan("a", "z", order: :desc).each do |key, value|
232
+ puts "#{key}: #{value}"
233
+ end
234
+
235
+ # Use Enumerable methods
236
+ keys = db.scan("user:").map { |k, v| k }
237
+ users = db.scan("user:").select { |k, v| v.include?("active") }
238
+
239
+ # Convert to array
240
+ all_entries = db.scan("").to_a
241
+ ```
242
+
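The half-open range `[start, end)` includes the start key and excludes the end key. The plain-Ruby model below illustrates those semantics on a small in-memory table; `model_scan` and `TABLE` are hypothetical and say nothing about how the gem iterates internally.

```ruby
# Hypothetical model: a small in-memory table, not the gem's implementation.
TABLE = { "a" => "1", "m" => "2", "z" => "3", "zz" => "4" }.freeze

# Returns [key, value] pairs with start_key <= key < end_key, in key order.
# A nil end_key means "no upper bound".
def model_scan(table, start_key, end_key = nil, order: :asc)
  rows = table.sort.select do |key, _|
    key >= start_key && (end_key.nil? || key < end_key)
  end
  order == :desc ? rows.reverse : rows
end

p model_scan(TABLE, "a", "z").map(&:first)               # => ["a", "m"]
p model_scan(TABLE, "a", "z", order: :desc).map(&:first) # => ["m", "a"]
p model_scan(TABLE, "m").map(&:first)                    # => ["m", "z", "zz"]
```

Note that `"z"` itself is excluded from the range `"a"` to `"z"`; to include it, the end key has to sort after it.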
243
+ #### Prefix Scanning
244
+
245
+ Scan all keys with a given prefix using `scan_prefix`:
246
+
247
+ ```ruby
248
+ # Scan all keys starting with "user:"
249
+ db.scan_prefix("user:").each do |key, value|
250
+ puts "#{key}: #{value}"
251
+ end
252
+
253
+ # Block form
254
+ db.scan_prefix("order:") do |key, value|
255
+ puts "#{key}: #{value}"
256
+ end
257
+
258
+ # Prefix scans can also run in descending key order
259
+ db.scan_prefix("user:", order: :desc).each do |key, value|
260
+ puts "#{key}: #{value}"
261
+ end
262
+
263
+ # Works with transactions, snapshots, and readers too
264
+ db.transaction do |txn|
265
+   txn.scan_prefix("item:").each do |k, v|
266
+     puts "#{k}: #{v}"
267
+   end
268
+ end
269
+ ```
270
+
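A prefix scan can be thought of as a range scan whose end key is the smallest byte string that sorts after every key with that prefix. The sketch below shows this general technique in plain Ruby; `prefix_upper_bound` is a hypothetical helper, and whether the gem implements `scan_prefix` this way is not stated here.

```ruby
# Smallest byte string greater than every string starting with prefix,
# or nil when no such bound exists (empty prefix or all bytes 0xFF).
def prefix_upper_bound(prefix)
  bytes = prefix.bytes
  bytes.pop while bytes.last == 0xFF
  return nil if bytes.empty?

  bytes[-1] += 1
  bytes.pack("C*").force_encoding(prefix.encoding)
end

keys = ["order:1", "user:1", "user:2", "user;", "v"]
lower = "user:"
upper = prefix_upper_bound(lower)

# Range filter [lower, upper) and a direct prefix filter select the same keys.
in_range = keys.select { |k| k >= lower && (upper.nil? || k < upper) }
by_prefix = keys.select { |k| k.start_with?(lower) }

p upper      # => "user;"
p in_range   # => ["user:1", "user:2"]
```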
271
+ ### Merge Operations
272
+
273
+ Merge operations let you combine values without reading them first, which is useful for counters, append-only logs, and similar patterns:
274
+
275
+ ```ruby
276
+ # Open with a built-in merge operator
277
+ SlateDb::Database.open("/tmp/mydb", merge_operator: :string_concat) do |db|
278
+   # Merge appends to existing values (or creates if key doesn't exist)
279
+   db.merge("log", "line1\n")
280
+   db.merge("log", "line2\n")
281
+   db.merge("log", "line3\n")
282
+
283
+   db.get("log") # => "line1\nline2\nline3\n"
284
+ end
285
+
286
+ # Merge with options
287
+ db.merge("key", "value", ttl: 60_000, await_durable: false)
288
+
289
+ # Works in transactions and batches
290
+ db.transaction do |txn|
291
+ txn.merge("counter", "1")
292
+ end
293
+
294
+ db.batch do |b|
295
+   b.merge("key", "a")
296
+    .merge("key", "b")
297
+ end
298
+ ```
299
+
300
+ #### Custom Merge Operators
301
+
302
+ You can provide a Ruby Proc/lambda as a custom merge operator:
303
+
304
+ ```ruby
305
+ # Counter merge operator (adds numbers)
306
+ counter_merge = ->(key, existing, new_value) {
307
+   existing_num = existing ? existing.to_i : 0
308
+   (existing_num + new_value.to_i).to_s
309
+ }
310
+
311
+ SlateDb::Database.open("/tmp/mydb", merge_operator: counter_merge) do |db|
312
+   db.merge("visits", "1")
313
+   db.merge("visits", "1")
314
+   db.merge("visits", "1")
315
+
316
+   db.get("visits") # => "3"
317
+ end
318
+
319
+ # Max value merge operator
320
+ max_merge = ->(key, existing, new_value) {
321
+   existing_num = existing ? existing.to_i : 0
322
+   new_num = new_value.to_i
323
+   [existing_num, new_num].max.to_s
324
+ }
325
+
326
+ SlateDb::Database.open("/tmp/mydb", merge_operator: max_merge) do |db|
327
+   db.merge("high_score", "100")
328
+   db.merge("high_score", "250")
329
+   db.merge("high_score", "150")
330
+
331
+   db.get("high_score") # => "250"
332
+ end
333
+ ```
334
+
335
+ The proc receives three arguments:
336
+ - `key` - The key being merged
337
+ - `existing` - The existing value (nil if no value exists)
338
+ - `new_value` - The new merge operand
339
+
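One way to picture how the three arguments are used is as a left fold over the operands for a key, starting from `nil`. The sketch below models that in plain Ruby; `apply_merges` is a hypothetical helper, and the order and grouping in which the storage engine actually applies operands is an assumption here.

```ruby
# Counter operator with the same (key, existing, new_value) signature as above.
counter_merge = ->(key, existing, new_value) {
  ((existing ? existing.to_i : 0) + new_value.to_i).to_s
}

# Max operator, as in the earlier example.
max_merge = ->(key, existing, new_value) {
  [existing ? existing.to_i : 0, new_value.to_i].max.to_s
}

# Hypothetical model: fold the operands for one key, oldest first,
# starting from nil because no value exists yet.
def apply_merges(operator, key, operands)
  operands.reduce(nil) { |existing, operand| operator.call(key, existing, operand) }
end

puts apply_merges(counter_merge, "visits", %w[1 1 1])         # prints 3
puts apply_merges(max_merge, "high_score", %w[100 250 150])   # prints 250
```

Since the grouping of operands may differ from this simple fold, an operator whose result does not depend on grouping (such as addition or max) is the safer choice.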
340
+ **Note:** Custom Proc merge operators work best with direct `db.merge()` calls. When used with transactions or batches, some merge operations may be processed on background threads and fall back to string concatenation.
341
+
342
+ #### Available Merge Operators
343
+
344
+ - `:string_concat` (or `:concat`) - Concatenates byte values (built-in)
345
+ - Any `Proc` or `lambda` - Custom merge logic
346
+
347
+ ### Write Batches
348
+
349
+ Perform multiple writes atomically:
350
+
351
+ ```ruby
352
+ # Create a batch manually
353
+ batch = SlateDb::WriteBatch.new
354
+ batch.put("key1", "value1")
355
+ batch.put("key2", "value2", ttl: 60_000)
356
+ batch.delete("old_key")
357
+ db.write(batch)
358
+
359
+ # Or use the block helper
360
+ db.batch do |b|
361
+ b.put("key1", "value1")
362
+ b.put("key2", "value2")
363
+ b.delete("old_key")
364
+ end
365
+ ```
366
+
367
+ ### Transactions
368
+
369
+ ACID transactions with snapshot or serializable isolation:
370
+
371
+ ```ruby
372
+ # Block form (recommended) - auto-commits on success, rolls back on exception
373
+ db.transaction do |txn|
374
+ balance = txn.get("balance").to_i
375
+ txn.put("balance", (balance - 100).to_s)
376
+ txn.put("withdrawal", "100")
377
+ end
378
+
379
+ # With serializable isolation for strict consistency
380
+ db.transaction(isolation: :serializable) do |txn|
381
+ counter = txn.get("counter").to_i
382
+ txn.put("counter", (counter + 1).to_s)
383
+ end
384
+
385
+ # Manual transaction management
386
+ txn = db.begin_transaction(isolation: :snapshot)
387
+ txn.put("key", "value")
388
+ txn.commit # or txn.rollback
389
+ ```
390
+
391
+ Transaction operations:
392
+
393
+ ```ruby
394
+ db.transaction do |txn|
395
+ # Read
396
+ value = txn.get("key")
397
+
398
+ # Write
399
+ txn.put("key", "value")
400
+ txn.put("expiring", "data", ttl: 30_000)
401
+
402
+ # Delete
403
+ txn.delete("old_key")
404
+
405
+ # Scan
406
+ txn.scan("prefix:").each do |k, v|
407
+ puts "#{k}: #{v}"
408
+ end
409
+
410
+ # Scan with prefix
411
+ txn.scan_prefix("user:").each do |k, v|
412
+ puts "#{k}: #{v}"
413
+ end
414
+ end
415
+ ```
416
+
417
+ #### Explicit Read Tracking
418
+
419
+ In serializable transactions, use `mark_read` to explicitly track keys for conflict detection without actually reading them:
420
+
421
+ ```ruby
422
+ db.transaction(isolation: :serializable) do |txn|
423
+ # Mark keys as read for conflict detection
424
+ txn.mark_read(["key1", "key2", "key3"])
425
+
426
+ # Now if another transaction modifies key1/key2/key3,
427
+ # this transaction will fail on commit
428
+ txn.put("result", "computed_value")
429
+ end
430
+ ```
431
+
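Because a serializable transaction can fail on commit when it conflicts with another writer, callers often wrap it in a bounded retry loop. The sketch below shows that pattern in plain Ruby with a hypothetical `ConflictError` standing in for `SlateDb::TransactionError`, so it runs without the gem; `with_conflict_retry` is not part of the gem's API.

```ruby
# Hypothetical stand-in for SlateDb::TransactionError so the sketch runs alone.
class ConflictError < StandardError; end

# Runs the block, retrying when it raises the given error class.
# After max_attempts failed attempts the error is re-raised to the caller.
def with_conflict_retry(max_attempts: 3, error: ConflictError)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue error
    retry if attempts < max_attempts
    raise
  end
end

# Simulated workload: conflicts on the first two attempts, then succeeds.
result = with_conflict_retry do |attempt|
  raise ConflictError, "simulated conflict" if attempt < 3
  "committed on attempt #{attempt}"
end
puts result
```

With the gem, the block would presumably contain a `db.transaction(isolation: :serializable) { ... }` call and the helper would be given `error: SlateDb::TransactionError`; that the block form raises this error on conflict is an assumption based on the exception list in this README.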
432
+ ### Checkpoints
433
+
434
+ Create durable checkpoints for backup or read replica purposes:
435
+
436
+ ```ruby
437
+ SlateDb::Database.open("/tmp/mydb", url: "file:///tmp/mydb") do |db|
438
+ db.put("key", "value")
439
+ db.flush
440
+
441
+ # Create a checkpoint
442
+ checkpoint = db.create_checkpoint
443
+ puts "Checkpoint ID: #{checkpoint[:id]}"
444
+ puts "Manifest ID: #{checkpoint[:manifest_id]}"
445
+
446
+ # Create a named checkpoint with lifetime
447
+ checkpoint = db.create_checkpoint(
448
+   name: "before-migration",
449
+   lifetime: 3_600_000 # 1 hour in milliseconds
450
+ )
451
+ end
452
+ ```
453
+
454
+ ### Snapshots
455
+
456
+ Point-in-time consistent reads:
457
+
458
+ ```ruby
459
+ # Block form (recommended)
460
+ db.snapshot do |snap|
461
+ # All reads see the same consistent state
462
+ value1 = snap.get("key1")
463
+ value2 = snap.get("key2")
464
+
465
+ snap.scan("prefix:").each do |k, v|
466
+ puts "#{k}: #{v}"
467
+ end
468
+ end # automatically closed
469
+
470
+ # Manual management
471
+ snap = db.snapshot
472
+ value = snap.get("key")
473
+ snap.close
474
+ ```
475
+
476
+ ### Reader (Read-Only Access)
477
+
478
+ Open a database in read-only mode, useful for replicas:
479
+
480
+ ```ruby
481
+ # Basic read-only access
482
+ SlateDb::Reader.open("/tmp/mydb", url: "s3://bucket/path") do |reader|
483
+ value = reader.get("key")
484
+
485
+ reader.scan("prefix:").each do |k, v|
486
+ puts "#{k}: #{v}"
487
+ end
488
+ end
489
+
490
+ # Open at a specific checkpoint
491
+ SlateDb::Reader.open("/tmp/mydb",
492
+ url: "s3://bucket/path",
493
+ checkpoint_id: "uuid-here") do |reader|
494
+ reader.get("key")
495
+ end
496
+
497
+ # Enable the reader's on-disk cache and cap its open file handles
498
+ # (max_open_file_handles, added in SlateDB 0.13.0, only takes effect when
499
+ # cache_root is set, since that is what enables the cached object store).
500
+ SlateDb::Reader.open("/tmp/mydb",
501
+ url: "s3://bucket/path",
502
+ cache_root: "/var/cache/slatedb",
503
+ max_open_file_handles: 256) do |reader|
504
+ reader.get("key")
505
+ end
506
+ ```
507
+
508
+ ### Admin Operations
509
+
510
+ Administrative operations for database management:
511
+
512
+ ```ruby
513
+ admin = SlateDb::Admin.new("/tmp/mydb", url: "s3://bucket/path")
514
+
515
+ # Manifests
516
+ json = admin.read_manifest # Latest manifest as JSON
517
+ json = admin.read_manifest(123) # Specific manifest by ID
518
+ json = admin.list_manifests # List all manifests
519
+ json = admin.list_manifests(start: 1, end_id: 10) # Range query
520
+
521
+ # Checkpoints
522
+ result = admin.create_checkpoint(name: "backup-2024")
523
+ # => { id: "uuid-string", manifest_id: 7 }
524
+
525
+ checkpoints = admin.list_checkpoints
526
+ checkpoints = admin.list_checkpoints(name: "backup") # Filter by name
527
+
528
+ admin.refresh_checkpoint("uuid", lifetime: 3_600_000) # Extend lifetime (1 hour)
529
+ admin.delete_checkpoint("uuid")
530
+
531
+ # Garbage Collection
532
+ admin.run_gc # Run with default settings
533
+ admin.run_gc(min_age: 3_600_000) # Set min age for all directories (1 hour)
534
+ admin.run_gc(manifest_min_age: 86_400_000) # Custom age for manifest (1 day)
535
+ admin.run_gc(wal_min_age: 60_000) # Custom age for WAL (1 minute)
536
+ admin.run_gc(compacted_min_age: 60_000) # Custom age for compacted (1 minute)
537
+ ```
538
+
539
+ ### Flushing
540
+
541
+ Ensure all writes are persisted:
542
+
543
+ ```ruby
544
+ db.put("key", "value")
545
+ db.flush
546
+ ```
547
+
548
+ ## Thread Safety
549
+
550
+ **SlateDB is fully thread-safe and optimized for concurrent access.**
551
+
552
+ - The `Database` class can be safely shared across multiple Ruby threads
553
+ - All operations (get, put, delete, scan, transactions) are thread-safe
554
+ - The Ruby bindings release the Global VM Lock (GVL) during I/O operations, allowing other Ruby threads to run concurrently
555
+ - Suitable for multi-threaded Ruby applications such as Puma and Sidekiq, and for concurrent test suites
556
+
557
+ ```ruby
558
+ db = SlateDb::Database.open("/tmp/mydb")
559
+
560
+ # Safe to use from multiple threads
561
+ threads = 10.times.map do |i|
562
+   Thread.new do
563
+     db.put("key-#{i}", "value-#{i}")
564
+     db.get("key-#{i}")
565
+   end
566
+ end
567
+
568
+ threads.each(&:join)
569
+ ```
570
+
571
+ **Implementation details:**
572
+ - The underlying SlateDB library uses `Arc` (atomic reference counting) and `RwLock` for internal state management
573
+ - I/O operations release the Ruby GVL using `rb_thread_call_without_gvl`, so they do not block other threads
574
+ - A shared Tokio multi-threaded runtime handles all async operations efficiently
575
+
576
+ ## Error Handling
577
+
578
+ SlateDB defines several exception classes:
579
+
580
+ ```ruby
581
+ begin
582
+   db.put("", "value") # empty key
583
+ rescue SlateDb::InvalidArgumentError => e
584
+   puts "Invalid argument: #{e.message}"
585
+ rescue SlateDb::TransactionError => e
586
+   puts "Transaction conflict: #{e.message}"
587
+ rescue SlateDb::Error => e
588
+   puts "SlateDB error: #{e.message}"
589
+ end
590
+ ```
591
+
592
+ Exception hierarchy:
593
+
594
+ - `SlateDb::Error` - Base class (inherits from `StandardError`)
595
+   - `SlateDb::TransactionError` - Transaction conflicts
596
+   - `SlateDb::ClosedError` - Database has been closed
597
+   - `SlateDb::UnavailableError` - Storage/network unavailable
598
+   - `SlateDb::InvalidArgumentError` - Invalid arguments
599
+   - `SlateDb::DataError` - Data corruption or format errors
600
+   - `SlateDb::InternalError` - Internal errors
601
+
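With a hierarchy like this, specific classes have to be matched before the base class, otherwise the base class catches everything. The sketch below shows that ordering with hypothetical stand-in classes in a `Demo` module so it runs without the gem; that the specific errors subclass `SlateDb::Error` is inferred from the "Base class" label and the rescue order above.

```ruby
# Hypothetical stand-ins mirroring part of the hierarchy listed above.
module Demo
  class Error < StandardError; end
  class TransactionError < Error; end
  class InvalidArgumentError < Error; end
end

# Maps an exception to a handling strategy, checking specific classes
# before the base class.
def classify(error)
  case error
  when Demo::TransactionError     then :retry
  when Demo::InvalidArgumentError then :fix_input
  when Demo::Error                then :report
  else :unknown
  end
end

p classify(Demo::TransactionError.new)      # => :retry
p classify(Demo::InvalidArgumentError.new)  # => :fix_input
p classify(Demo::Error.new)                 # => :report
p classify(RuntimeError.new)                # => :unknown
```

The same ordering applies to `rescue` clauses: a `rescue SlateDb::Error` placed first would handle every error in the list.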
602
+ ## Requirements
603
+
604
+ - Ruby 3.3+
605
+ - Rust toolchain (for building from source)
606
+
607
+ ## Development
608
+
609
+ After checking out the repo, run:
610
+
611
+ ```bash
612
+ bundle install
613
+ bundle exec rake compile
614
+ bundle exec rake spec
615
+ ```
616
+
617
+ To run specific tests:
618
+
619
+ ```bash
620
+ bundle exec rspec spec/database_spec.rb
621
+ bundle exec rspec spec/transaction_spec.rb
622
+ ```
623
+
624
+ ## Contributing
625
+
626
+ Bug reports and pull requests are welcome on GitHub at https://github.com/catkins/slatedb-rb.
627
+
628
+ Also, find me on the [SlateDB Discord Server](https://discord.gg/mHYmGy5MgA).
629
+
630
+ ## License
631
+
632
+ Apache-2.0
Binary file
Binary file
Binary file