sequel-duckdb 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. checksums.yaml +7 -0
  2. data/.kiro/specs/advanced-sql-features-implementation/design.md +24 -0
  3. data/.kiro/specs/advanced-sql-features-implementation/requirements.md +43 -0
  4. data/.kiro/specs/advanced-sql-features-implementation/tasks.md +24 -0
  5. data/.kiro/specs/duckdb-sql-syntax-compatibility/design.md +258 -0
  6. data/.kiro/specs/duckdb-sql-syntax-compatibility/requirements.md +84 -0
  7. data/.kiro/specs/duckdb-sql-syntax-compatibility/tasks.md +94 -0
  8. data/.kiro/specs/edge-cases-and-validation-fixes/requirements.md +32 -0
  9. data/.kiro/specs/integration-test-database-setup/design.md +0 -0
  10. data/.kiro/specs/integration-test-database-setup/requirements.md +117 -0
  11. data/.kiro/specs/sequel-duckdb-adapter/design.md +542 -0
  12. data/.kiro/specs/sequel-duckdb-adapter/requirements.md +202 -0
  13. data/.kiro/specs/sequel-duckdb-adapter/tasks.md +247 -0
  14. data/.kiro/specs/sql-expression-handling-fix/design.md +298 -0
  15. data/.kiro/specs/sql-expression-handling-fix/requirements.md +86 -0
  16. data/.kiro/specs/sql-expression-handling-fix/tasks.md +22 -0
  17. data/.kiro/specs/test-infrastructure-improvements/requirements.md +106 -0
  18. data/.kiro/steering/product.md +22 -0
  19. data/.kiro/steering/structure.md +88 -0
  20. data/.kiro/steering/tech.md +124 -0
  21. data/.kiro/steering/testing.md +192 -0
  22. data/.rubocop.yml +103 -0
  23. data/.yardopts +8 -0
  24. data/API_DOCUMENTATION.md +919 -0
  25. data/CHANGELOG.md +131 -0
  26. data/LICENSE +21 -0
  27. data/MIGRATION_EXAMPLES.md +740 -0
  28. data/PERFORMANCE_OPTIMIZATIONS.md +723 -0
  29. data/README.md +692 -0
  30. data/Rakefile +27 -0
  31. data/TASK_10.2_IMPLEMENTATION_SUMMARY.md +164 -0
  32. data/docs/DUCKDB_SQL_PATTERNS.md +410 -0
  33. data/docs/TASK_12_VERIFICATION_SUMMARY.md +122 -0
  34. data/lib/sequel/adapters/duckdb.rb +256 -0
  35. data/lib/sequel/adapters/shared/duckdb.rb +2349 -0
  36. data/lib/sequel/duckdb/version.rb +16 -0
  37. data/lib/sequel/duckdb.rb +43 -0
  38. data/sig/sequel/duckdb.rbs +6 -0
  39. metadata +235 -0
data/PERFORMANCE_OPTIMIZATIONS.md
@@ -0,0 +1,723 @@
# Performance Optimization Guide for Sequel-DuckDB

This guide provides comprehensive strategies for optimizing performance when using Sequel with DuckDB, leveraging DuckDB's unique strengths as an analytical database engine.

## Table of Contents

1. [Understanding DuckDB's Architecture](#understanding-duckdbs-architecture)
2. [Query Optimization](#query-optimization)
3. [Schema Design](#schema-design)
4. [Bulk Operations](#bulk-operations)
5. [Memory Management](#memory-management)
6. [Connection Optimization](#connection-optimization)
7. [Monitoring and Profiling](#monitoring-and-profiling)
8. [Best Practices](#best-practices)
9. [Performance Testing](#performance-testing)

## Understanding DuckDB's Architecture

DuckDB is designed as an analytical database with several key characteristics that affect performance optimization:

### Columnar Storage
- Data is stored column-wise, making analytical queries very efficient
- SELECT queries that touch only a few columns are much faster
- Aggregations and analytical functions are highly optimized (see the sketch below)

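For example, a single-column aggregate only needs to scan that column. A minimal sketch, assuming the DuckDB adapter is installed (the `events` table and its columns are illustrative):

```ruby
require "sequel"

# In-memory DuckDB database for demonstration
db = Sequel.connect(adapter: "duckdb", database: ":memory:")

db.create_table(:events) do
  primary_key :id
  String :payload                 # wide column the aggregate never reads
  Decimal :amount, size: [10, 2]
end

# Columnar storage means only the `amount` column is scanned here;
# `payload` stays untouched no matter how wide it is.
total = db[:events].sum(:amount)
```
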
25
+ ### Vectorized Execution
26
+ - Operations are performed on batches of data (vectors) rather than row-by-row
27
+ - This reduces function call overhead and improves CPU cache utilization
28
+ - Particularly beneficial for analytical workloads
29
+
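To benefit from vectorization, keep computation inside the engine rather than looping in Ruby. A hedged sketch, reusing the illustrative `events` table from above:

```ruby
# ❌ Row-by-row in Ruby: one Ruby hash per row, no vectorized execution
total = 0
db[:events].select(:amount).each { |row| total += row[:amount] }

# ✅ A single aggregate executed inside DuckDB's vectorized engine
total = db[:events].sum(:amount)
```
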
### In-Memory Processing
- DuckDB can efficiently process data that fits in memory
- Automatic memory management with spill-to-disk for larger datasets
- Memory-mapped files for efficient file-based database access (see the connection sketch below)

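Both modes use the same Sequel API; a minimal sketch (the file path is a placeholder):

```ruby
# Non-persistent and fastest: everything lives in memory
mem_db = Sequel.connect(adapter: "duckdb", database: ":memory:")

# Persistent, memory-mapped file; DuckDB spills to disk as data grows
file_db = Sequel.connect(adapter: "duckdb", database: "/path/to/database.duckdb")
```
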
## Query Optimization

### 1. Column Selection Optimization

**Always select only the columns you need:**

```ruby
# ❌ Inefficient - selects all columns
users = db[:users].where(active: true).all

# ✅ Efficient - selects only needed columns
users = db[:users].select(:id, :name, :email).where(active: true).all

# ✅ Even better for large result sets
db[:users].select(:id, :name, :email).where(active: true).each do |user|
  # Process user
end
```

### 2. Predicate Pushdown

**Apply filters as early as possible:**

```ruby
# ❌ Less efficient - filtering after the join
result = db[:users]
  .join(:orders, user_id: :id)
  .where(Sequel[:users][:active] => true, Sequel[:orders][:status] => 'completed')

# ✅ More efficient - filter before the join when possible
active_users = db[:users].where(active: true)
completed_orders = db[:orders].where(status: 'completed')
result = active_users.join(completed_orders, user_id: :id)
```

### 3. Index Utilization

**Create indexes for frequently queried columns:**

```ruby
# Create indexes for common query patterns
db.add_index :users, :email
db.add_index :orders, [:user_id, :status]
db.add_index :products, [:category_id, :active]

# Composite indexes for multi-column queries
db.add_index :order_items, [:order_id, :product_id]

# Partial indexes for filtered queries
db.add_index :products, :price, where: { active: true }
```

### 4. Query Plan Analysis

**Use EXPLAIN to understand query execution:**

```ruby
# Analyze query performance
query = db[:users].join(:orders, user_id: :id).where(status: 'completed')
puts query.explain

# Look for:
# - Index usage
# - Join algorithms
# - Filter pushdown
# - Estimated row counts
```
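
If `Dataset#explain` is not available in your adapter version, the same plan can usually be obtained by prepending `EXPLAIN` to the SQL Sequel generates; a fallback sketch (DuckDB's EXPLAIN output columns vary by version):

```ruby
# Fallback: run EXPLAIN against the generated SQL string
db.fetch("EXPLAIN #{query.sql}").each do |row|
  puts row.values.join(' ')
end
```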

### 5. Analytical Query Optimization

**Leverage DuckDB's analytical capabilities:**

```ruby
# ✅ Efficient analytical queries
sales_summary = db[:sales]
  .select(
    :product_category,
    Sequel.function(:sum, :amount).as(:total_sales),
    Sequel.function(:avg, :amount).as(:avg_sale),
    Sequel.function(:count, :id).as(:transaction_count),
    Sequel.function(:percentile_cont, 0.5).within_group(:amount).as(:median_sale)
  )
  .group(:product_category)
  .order(Sequel.desc(:total_sales))

# ✅ Window functions for advanced analytics
monthly_trends = db[:sales]
  .select(
    :month,
    :amount,
    Sequel.function(:lag, :amount, 1).over(order: :month).as(:prev_month),
    Sequel.function(:sum, :amount).over(order: :month).as(:running_total),
    Sequel.function(:rank).over(partition: :category, order: Sequel.desc(:amount)).as(:category_rank)
  )
```

## Schema Design

### 1. Optimal Data Types

**Choose appropriate data types for performance:**

```ruby
# ✅ Efficient data types
db.create_table :products do
  primary_key :id                 # INTEGER is efficient
  String :name, size: 255         # Bounded-length strings when possible
  Decimal :price, size: [10, 2]   # Precise for monetary values
  Integer :stock_quantity         # INTEGER for counts
  Boolean :active                 # BOOLEAN is very efficient
  Date :created_date              # DATE for date-only values
  DateTime :created_at            # TIMESTAMP for full datetime

  # DuckDB-specific optimized types
  column :tags, 'VARCHAR[]'       # Arrays for multi-value attributes
  column :metadata, 'JSON'        # JSON for flexible data
end

# ❌ Avoid oversized types
# String :description, size: 10000 # Use TEXT instead
# Float :price                     # Use DECIMAL for money
```

### 2. Partitioning Strategy

**Design tables for analytical workloads:**

```ruby
# ✅ Time-based partitioning pattern
db.create_table :sales_2024_q1 do
  primary_key :id
  foreign_key :product_id, :products
  Decimal :amount, size: [10, 2]
  Date :sale_date
  DateTime :created_at

  # Constraint to enforce partition bounds
  constraint(:date_range) { (sale_date >= '2024-01-01') & (sale_date < '2024-04-01') }
end

# Create a view for unified access
db.run <<~SQL
  CREATE VIEW sales AS
  SELECT * FROM sales_2024_q1
  UNION ALL
  SELECT * FROM sales_2024_q2
  -- Add more partitions as needed
SQL
```

### 3. Denormalization for Analytics

**Consider denormalization for read-heavy analytical workloads:**

```ruby
# ✅ Denormalized table for analytics
db.create_table :order_analytics do
  primary_key :id
  Integer :order_id
  Integer :user_id
  String :user_name      # Denormalized from users table
  String :user_email     # Denormalized from users table
  Integer :product_id
  String :product_name   # Denormalized from products table
  String :category_name  # Denormalized from categories table
  Decimal :unit_price, size: [10, 2]
  Integer :quantity
  Decimal :total_amount, size: [10, 2]
  Date :order_date
  DateTime :created_at
end

# Populate with materialized view pattern
db.run <<~SQL
  INSERT INTO order_analytics
  SELECT
    oi.id,
    o.id as order_id,
    u.id as user_id,
    u.name as user_name,
    u.email as user_email,
    p.id as product_id,
    p.name as product_name,
    c.name as category_name,
    oi.unit_price,
    oi.quantity,
    oi.unit_price * oi.quantity as total_amount,
    o.created_at::DATE as order_date,
    o.created_at
  FROM order_items oi
  JOIN orders o ON oi.order_id = o.id
  JOIN users u ON o.user_id = u.id
  JOIN products p ON oi.product_id = p.id
  JOIN categories c ON p.category_id = c.id
SQL
```

## Bulk Operations

### 1. Efficient Bulk Inserts

**Use multi_insert for large data loads:**

```ruby
# ✅ Efficient bulk insert
data = []
1000.times do |i|
  data << {
    name: "User #{i}",
    email: "user#{i}@example.com",
    created_at: Time.now
  }
end

# Single transaction for all inserts
db.transaction do
  db[:users].multi_insert(data)
end

# ✅ For very large datasets, use batching
def bulk_insert_batched(db, table, data, batch_size = 1000)
  data.each_slice(batch_size) do |batch|
    db.transaction do
      db[table].multi_insert(batch)
    end
  end
end

bulk_insert_batched(db, :users, large_dataset, 5000)
```

### 2. Bulk Updates

**Efficient bulk update patterns:**

```ruby
# ✅ Single UPDATE statement for bulk changes
db[:products].where(category_id: 1).update(
  active: false,
  updated_at: Time.now
)

# ✅ Conditional bulk updates
db.run <<~SQL
  UPDATE products
  SET
    status = CASE
      WHEN stock_quantity = 0 THEN 'out_of_stock'
      WHEN stock_quantity < 10 THEN 'low_stock'
      ELSE 'in_stock'
    END,
    updated_at = NOW()
  WHERE status != CASE
    WHEN stock_quantity = 0 THEN 'out_of_stock'
    WHEN stock_quantity < 10 THEN 'low_stock'
    ELSE 'in_stock'
  END
SQL
```

### 3. Data Loading from Files

**Leverage DuckDB's file reading capabilities:**

```ruby
# ✅ Direct CSV import (very fast)
db.run <<~SQL
  CREATE TABLE temp_sales AS
  SELECT * FROM read_csv_auto('sales_data.csv')
SQL

# ✅ Parquet files (excellent for analytics)
db.run <<~SQL
  CREATE TABLE sales_archive AS
  SELECT * FROM read_parquet('sales_archive.parquet')
SQL

# ✅ JSON files
db.run <<~SQL
  CREATE TABLE user_events AS
  SELECT * FROM read_json_auto('user_events.json')
SQL
```
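
Files can also be queried in place, without materializing a table first; a hedged sketch using `db.fetch` with DuckDB's `read_parquet` table function (file and column names are placeholders):

```ruby
# ✅ Aggregate directly over a Parquet file - no intermediate table
top_products = db.fetch(<<~SQL).all
  SELECT product_id, sum(amount) AS revenue
  FROM read_parquet('sales_archive.parquet')
  GROUP BY product_id
  ORDER BY revenue DESC
  LIMIT 10
SQL
```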

## Memory Management

### 1. Connection Configuration

**Optimize DuckDB memory settings:**

```ruby
# ✅ Configure memory limits
db = Sequel.connect(
  adapter: 'duckdb',
  database: '/path/to/database.duckdb',
  config: {
    memory_limit: '4GB',          # Memory budget before spilling to disk
                                  # (DuckDB's max_memory is an alias for this)
    threads: 8,                   # Use multiple threads
    temp_directory: '/tmp/duckdb' # Temporary spill-file location
  }
)

# ✅ Runtime memory configuration
db.run "SET memory_limit='2GB'"
db.run "SET threads=4"
```

### 2. Result Set Management

**Handle large result sets efficiently:**

```ruby
# ❌ Loads entire result set into memory
all_orders = db[:orders].all

# ✅ Process results in batches
db[:orders].paged_each(rows_per_fetch: 1000) do |order|
  # Process each order
  process_order(order)
end

# ✅ Stream very large datasets instead of calling .all
# (Sequel's `use_cursor` is PostgreSQL-specific; with DuckDB, iterate
# with `each`/`paged_each`, which avoid materializing the full result)
db[:large_table].each do |row|
  # Process row by row without loading everything into memory
  process_row(row)
end

# ✅ Limit result sets when possible
recent_orders = db[:orders]
  .where { created_at > Date.today - 30 }
  .order(Sequel.desc(:created_at))
  .limit(1000)
```

### 3. Connection Pooling

**Optimize connection management:**

```ruby
# ✅ Connection pool configuration
db = Sequel.connect(
  adapter: 'duckdb',
  database: '/path/to/database.duckdb',
  max_connections: 10,    # Pool size
  pool_timeout: 5,        # Connection timeout
  pool_sleep_time: 0.001, # Sleep between retries
  pool_connection_validation: true
)

# ✅ Proper connection cleanup
begin
  db.transaction do
    # Database operations
  end
ensure
  db.disconnect if db
end
```

## Connection Optimization

### 1. Connection Reuse

**Minimize connection overhead:**

```ruby
# ✅ Reuse connections
class DatabaseManager
  def self.connection
    @connection ||= Sequel.connect('duckdb:///app.duckdb')
  end

  def self.disconnect
    @connection&.disconnect
    @connection = nil
  end
end

# Use throughout application
db = DatabaseManager.connection
```

### 2. Transaction Management

**Optimize transaction usage:**

```ruby
# ✅ Group related operations in transactions
db.transaction do
  user_id = db[:users].insert(name: 'John', email: 'john@example.com')
  profile_id = db[:profiles].insert(user_id: user_id, bio: 'Developer')
  db[:preferences].insert(user_id: user_id, theme: 'dark')
end

# ✅ Use savepoints for nested operations
db.transaction do
  user_id = db[:users].insert(name: 'Jane', email: 'jane@example.com')

  begin
    db.transaction(savepoint: true) do
      # Risky operation that might fail
      db[:audit_log].insert(user_id: user_id, action: 'created')
    end
  rescue Sequel::DatabaseError
    # Continue even if audit logging fails
  end
end
```

## Monitoring and Profiling

### 1. Query Logging

**Enable comprehensive logging:**

```ruby
# ✅ Enable SQL logging
require 'logger'
db.loggers << Logger.new($stdout)

# ✅ Custom logger with timing
class PerformanceLogger < Logger
  def initialize(*args)
    super
    @start_time = nil
  end

  def info(message)
    if message.include?('SELECT') || message.include?('INSERT') || message.include?('UPDATE')
      @start_time = Time.now
      super("SQL: #{message}")
    else
      super
    end
  end

  def debug(message)
    if @start_time && message.include?('rows')
      duration = Time.now - @start_time
      super("Duration: #{duration.round(3)}s - #{message}")
      @start_time = nil
    else
      super
    end
  end
end

db.loggers << PerformanceLogger.new($stdout)
```

### 2. Performance Monitoring

**Monitor key performance metrics:**

```ruby
# ✅ Query performance monitoring
class QueryMonitor
  def self.monitor_query(description)
    start_time = Time.now
    result = yield
    duration = Time.now - start_time

    if duration > 1.0 # Log slow queries
      puts "SLOW QUERY (#{duration.round(3)}s): #{description}"
    end

    result
  end
end

# Usage
users = QueryMonitor.monitor_query("Fetch active users") do
  db[:users].where(active: true).all
end
```
509
+
510
+ ### 3. Database Statistics
511
+
512
+ **Monitor database performance:**
513
+
514
+ ```ruby
515
+ # ✅ Check database statistics
516
+ def print_db_stats(db)
517
+ # Table sizes
518
+ puts "Table Statistics:"
519
+ db.tables.each do |table|
520
+ count = db[table].count
521
+ puts " #{table}: #{count} rows"
522
+ end
523
+
524
+ # Index usage (if available)
525
+ puts "\nIndex Information:"
526
+ db.tables.each do |table|
527
+ indexes = db.indexes(table)
528
+ puts " #{table}: #{indexes.keys.join(', ')}" if indexes.any?
529
+ end
530
+ end
531
+
532
+ print_db_stats(db)
533
+ ```

## Best Practices

### 1. Query Design Patterns

```ruby
# ✅ Efficient analytical query pattern
def sales_report(db, start_date, end_date)
  db[:order_analytics]
    .where(order_date: start_date..end_date)
    .select(
      :category_name,
      Sequel.function(:sum, :total_amount).as(:revenue),
      Sequel.function(:count, :order_id).as(:order_count),
      Sequel.function(:avg, :total_amount).as(:avg_order_value)
    )
    .group(:category_name)
    .order(Sequel.desc(:revenue))
end

# ✅ Efficient pagination pattern
def paginated_orders(db, page = 1, per_page = 50)
  offset = (page - 1) * per_page

  db[:orders]
    .select(:id, :user_id, :total, :status, :created_at)
    .order(Sequel.desc(:created_at), :id) # Stable sort
    .limit(per_page)
    .offset(offset)
end
```

### 2. Caching Strategies

```ruby
# ✅ Application-level caching
class CachedQueries
  def self.user_stats(db, user_id)
    @cache ||= {}
    cache_key = "user_stats_#{user_id}"

    @cache[cache_key] ||= db[:orders]
      .where(user_id: user_id)
      .select(
        Sequel.function(:count, :id).as(:order_count),
        Sequel.function(:sum, :total).as(:total_spent),
        Sequel.function(:avg, :total).as(:avg_order)
      )
      .first
  end

  def self.clear_cache
    @cache = {}
  end
end
```
590
+
591
+ ### 3. Error Handling and Retry Logic
592
+
593
+ ```ruby
594
+ # ✅ Robust error handling
595
+ def execute_with_retry(db, max_retries = 3)
596
+ retries = 0
597
+
598
+ begin
599
+ yield
600
+ rescue Sequel::DatabaseConnectionError => e
601
+ retries += 1
602
+ if retries <= max_retries
603
+ sleep(0.1 * retries) # Exponential backoff
604
+ retry
605
+ else
606
+ raise e
607
+ end
608
+ end
609
+ end
610
+
611
+ # Usage
612
+ result = execute_with_retry(db) do
613
+ db[:users].where(active: true).count
614
+ end
615
+ ```

### 4. Development vs Production Optimization

```ruby
# ✅ Environment-specific configuration
class DatabaseConfig
  def self.connection_options
    if ENV['RAILS_ENV'] == 'production'
      {
        adapter: 'duckdb',
        database: '/var/lib/app/production.duckdb',
        config: {
          memory_limit: '8GB',
          threads: 16
        },
        max_connections: 20,
        pool_timeout: 10
      }
    else
      {
        adapter: 'duckdb',
        database: ':memory:',
        config: {
          memory_limit: '1GB',
          threads: 4
        },
        max_connections: 5
      }
    end
  end
end

db = Sequel.connect(DatabaseConfig.connection_options)
```

## Performance Testing

### 1. Benchmarking Queries

```ruby
require 'benchmark'

# ✅ Query performance testing
def benchmark_query(description, iterations = 100)
  puts "Benchmarking: #{description}"

  time = Benchmark.measure do
    iterations.times { yield }
  end

  puts "  Total time: #{time.real.round(3)}s"
  puts "  Average: #{(time.real / iterations * 1000).round(3)}ms per query"
  puts "  Queries/sec: #{(iterations / time.real).round(1)}"
end

# Usage
benchmark_query("User lookup by email") do
  db[:users].where(email: 'test@example.com').first
end

benchmark_query("Sales aggregation") do
  db[:orders].where(status: 'completed').sum(:total)
end
```
681
+
682
+ ### 2. Load Testing
683
+
684
+ ```ruby
685
+ # ✅ Concurrent load testing
686
+ require 'thread'
687
+
688
+ def load_test(db, concurrent_users = 10, queries_per_user = 100)
689
+ threads = []
690
+ results = Queue.new
691
+
692
+ concurrent_users.times do |user_id|
693
+ threads << Thread.new do
694
+ start_time = Time.now
695
+
696
+ queries_per_user.times do
697
+ # Simulate user queries
698
+ db[:users].where(id: rand(1000)).first
699
+ db[:orders].where(user_id: rand(1000)).count
700
+ end
701
+
702
+ duration = Time.now - start_time
703
+ results << { user_id: user_id, duration: duration }
704
+ end
705
+ end
706
+
707
+ threads.each(&:join)
708
+
709
+ # Collect results
710
+ total_queries = concurrent_users * queries_per_user
711
+ total_time = results.size.times.map { results.pop[:duration] }.max
712
+
713
+ puts "Load test results:"
714
+ puts " #{concurrent_users} concurrent users"
715
+ puts " #{total_queries} total queries"
716
+ puts " #{total_time.round(3)}s total time"
717
+ puts " #{(total_queries / total_time).round(1)} queries/sec"
718
+ end
719
+
720
+ load_test(db)
721
+ ```

This comprehensive performance optimization guide should help you get the most out of DuckDB's analytical capabilities while using Sequel. Remember that DuckDB excels at analytical workloads, so design your queries and schema to take advantage of its columnar storage and vectorized execution engine.