dwh 0.1.1 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 1d0cb4848d96ff20f5b4c80e00cbfde5f1150e0e975cf55827f8466b9d4eaf49
4
- data.tar.gz: ce79d54a93388718886c8ec7892e9da544c4ddc52333e1a765a742518552df69
3
+ metadata.gz: 899fb3e403f0362cb21132d9af44f622e870b339776aac7bd4d3e7ed333d59e4
4
+ data.tar.gz: a3d536eb6da5885d4817be013d80e6b5a5ed422bce1466ac582c2b478724e278
5
5
  SHA512:
6
- metadata.gz: e61d14edd0b8e5818eb2eb3fe5ec25adf69a49279c9fd8b8db0ddb271f746de30573d97d90a6fe3bb9acf350f2ad80a72f076a19bd72a1270d9418aee6eca320
7
- data.tar.gz: a53cc7c72c222b579777e7796bbdefeada81ac77a8bf12d5ab46be73e2d76b7721deba91deb2019597013a1f1a46173a7406d6d95d1f8d11eb53f62e7ba19b26
6
+ metadata.gz: 5703c4ee14bf336b92f15740bfd0a6bce086ac9b636d67bcf269ea337a550a80bdd6a2f9d4265dd0a476d8d693a313dd2e6c8fa2fc587ea0f4653f6e0b528860
7
+ data.tar.gz: fa72a56f3eac7c5b4eae524a70b002f93b23e54513209d182de92e119aa43ea57a08d31d0f183b19d7df36601472045d9d2a001b26dda8bd56ee886556d4b21e
data/CHANGELOG.md CHANGED
@@ -1,5 +1,36 @@
1
1
  ## [Unreleased]
2
2
 
3
+ ## [0.2.0] - 2025-10-12
4
+
5
+ ### Added
6
+
7
+ - **SQLite adapter** with performance optimizations
8
+ - WAL (Write-Ahead Logging) mode enabled by default for concurrent reads
9
+ - Performance-tuned pragmas: cache_size, mmap_size, temp_store, synchronous
10
+ - Custom date truncation for year, quarter, month, week, day, hour, minute, second
11
+ - Custom day/month name extraction via CASE statements (SQLite lacks strftime %A/%B support)
12
+ - Proper date casting using `date()` function
13
+ - Comprehensive test suite and documentation
14
+ - **Redshift adapter** for AWS data warehouse
15
+ - Native Redshift SQL function support
16
+ - Full metadata and table introspection
17
+ - `date_time_literal` method for creating timestamp literals
18
+ - `date_lit` method for creating date literals
19
+
20
+ ### Changed
21
+
22
+ - Removed ActiveSupport dependency
23
+ - Replaced `symbolize_keys` with `transform_keys(&:to_sym)`
24
+ - Replaced `demodulize` with `split('::').last.downcase`
25
+ - Removed core extensions
26
+ - Standardized all SQL function names in settings to UPPERCASE for consistency
27
+
28
+ ### Fixed
29
+
30
+ - Config defaults now properly set even when config key is passed with nil value
31
+ - Table instantiation issues resolved
32
+ - Test suite no longer requires Trino gem for default tests
33
+
3
34
  ## [0.1.0] - 2025-07-03
4
35
 
5
36
  - Initial release
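The ActiveSupport replacements listed under "Changed" can be sketched in plain Ruby. This is a minimal illustration, not the gem's code: the `DWH::Adapters::Postgres` stub and the sample hash are hypothetical, defined here only so the snippet runs on its own.

```ruby
# Plain-Ruby equivalents of the ActiveSupport helpers removed in 0.2.0.
# The module/class below is a hypothetical stub so the example is self-contained.
module DWH
  module Adapters
    class Postgres; end
  end
end

config = { 'host' => 'localhost', 'port' => 5432 }

# Hash#transform_keys (Ruby 2.5+) stands in for ActiveSupport's symbolize_keys.
symbolized = config.transform_keys(&:to_sym)

# split('::').last.downcase stands in for demodulize, with the result lowercased.
adapter_name = DWH::Adapters::Postgres.name.split('::').last.downcase

puts symbolized.inspect
puts adapter_name
```

One difference is worth keeping in mind: `demodulize` alone would return `Postgres`, while the replacement shown in the changelog also lowercases the result.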
data/README.md CHANGED
@@ -25,16 +25,17 @@ The adapter only has 5 core methods (6 including the connection method). A YAML
25
25
 
26
26
  - **Snowflake** - High performance cloud warehouse
27
27
  - **Trino** (formerly Presto) - Distributed SQL query engine
28
+ - **Redshift** - AWS data warehouse platform
28
29
  - **AWS Athena** - AWS big data warehouse
29
30
  - **Apache Druid** - Real-time analytics database
30
31
  - **DuckDB** - In-process analytical database
32
+ - **SQLite** - Lightweight embedded database
31
33
  - **PostgreSQL** - Full-featured RDBMS with advanced SQL support
32
34
  - **MySQL** - Popular open-source database
33
35
  - **SQL Server** - Microsoft's enterprise database
34
36
 
35
37
  ## Integrations Coming Soon
36
38
 
37
- - **Redshift** - AWS data warehouse platform
38
39
  - **ClickHouse** - High performance analytical db
39
40
  - **Databricks** - Big data compute engine
40
41
  - **MotherDuck** - Hosted DuckDB service
@@ -61,6 +62,14 @@ druid = DWH.create(:druid, {
61
62
 
62
63
  # basic query execution
63
64
  results = druid.execute("SELECT * FROM web_sales", format: :csv)
65
+
66
+ # Connect to SQLite for local analytics
67
+ sqlite = DWH.create(:sqlite, {
68
+ file: 'path/to/analytics.db'
69
+ })
70
+
71
+ # Query with optimized WAL mode enabled by default
72
+ results = sqlite.execute("SELECT * FROM sales_data", format: :array)
64
73
  ```
65
74
 
66
75
  ## Core API
@@ -70,6 +70,71 @@ postgres = DWH.create(:postgres, {
70
70
  })
71
71
  ```
72
72
 
73
+ ## Redshift Adapter
74
+
75
+ The Redshift adapter uses the `pg` gem and provides full-featured RDBMS support.
76
+
77
+ ### Basic Configuration
78
+
79
+ ```ruby
80
+ redshift = DWH.create(:redshift, {
81
+ host: 'localhost',
82
+ port: 5439, # Default: 5439
83
+ database: 'mydb',
84
+ schema: 'public', # Default: 'public'
85
+ username: 'user',
86
+ password: 'password',
87
+ client_name: 'My Application' # Default: 'DWH Ruby Gem'
88
+ })
89
+ ```
90
+
91
+ ### SSL Configuration
92
+
93
+ ```ruby
94
+ # Basic SSL
95
+ redshift = DWH.create(:redshift, {
96
+ host: 'localhost',
97
+ database: 'mydb',
98
+ username: 'user',
99
+ password: 'password',
100
+ ssl: true,
101
+ extra_connection_params: {
102
+ sslmode: 'require' # disable, prefer, require, verify-ca, verify-full
103
+ }
104
+ })
105
+
106
+ # Certificate-based SSL
107
+ redshift = DWH.create(:redshift, {
108
+ host: 'localhost',
109
+ database: 'mydb',
110
+ username: 'user',
111
+ ssl: true,
112
+ extra_connection_params: {
113
+ sslmode: 'verify-full',
114
+ sslrootcert: '/path/to/ca-cert.pem',
115
+ sslcert: '/path/to/client-cert.pem',
116
+ sslkey: '/path/to/client-key.pem'
117
+ }
118
+ })
119
+ ```
120
+
121
+ ### Advanced Configuration
122
+
123
+ ```ruby
124
+ redshift = DWH.create(:redshift, {
125
+ host: 'localhost',
126
+ database: 'mydb',
127
+ username: 'user',
128
+ password: 'password',
129
+ query_timeout: 3600, # seconds, default: 3600
130
+ extra_connection_params: {
131
+ application_name: 'Data Analysis Tool',
132
+ connect_timeout: 10,
133
+ options: '-c maintenance_work_mem=256MB'
134
+ }
135
+ })
136
+ ```
137
+
73
138
  ## Snowflake
74
139
 
75
140
  The Snowflake adapter uses the REST APIs (HTTPS) to connect and query. This adapter also supports Multi-Database
@@ -287,6 +352,99 @@ duckdb = DWH.create(:duckdb, {
287
352
  })
288
353
  ```
289
354
 
355
+ ## SQLite Adapter
356
+
357
+ The SQLite adapter uses the `sqlite3` gem for lightweight embedded database analytics. It's optimized for analytical workloads with WAL mode enabled by default for better concurrent read performance.
358
+
359
+ ### Basic Configuration
360
+
361
+ ```ruby
362
+ # File-based database
363
+ sqlite = DWH.create(:sqlite, {
364
+ file: '/path/to/my/database.sqlite'
365
+ })
366
+
367
+ # In-memory database
368
+ sqlite = DWH.create(:sqlite, {
369
+ file: ':memory:'
370
+ })
371
+ ```
372
+
373
+ ### Read-Only Mode
374
+
375
+ ```ruby
376
+ sqlite = DWH.create(:sqlite, {
377
+ file: '/path/to/readonly/database.sqlite',
378
+ readonly: true
379
+ })
380
+ ```
381
+
382
+ ### Performance Optimization
383
+
384
+ The adapter includes default optimizations for analytical workloads:
385
+ - WAL mode enabled by default for concurrent reads
386
+ - 64MB cache size
387
+ - Memory-mapped I/O (128MB)
388
+ - Temp tables stored in memory
389
+
390
+ ```ruby
391
+ # Customize performance settings
392
+ sqlite = DWH.create(:sqlite, {
393
+ file: '/path/to/my/database.sqlite',
394
+ timeout: 5000, # busy timeout in milliseconds, default: 5000
395
+ pragmas: {
396
+ cache_size: -128000, # 128MB cache (negative means KB)
397
+ mmap_size: 268435456, # 256MB memory-mapped I/O
398
+ temp_store: 'MEMORY', # Store temp tables in memory
399
+ synchronous: 'NORMAL' # Faster than FULL, safe with WAL
400
+ }
401
+ })
402
+ ```
403
+
404
+ ### Disable WAL Mode
405
+
406
+ ```ruby
407
+ # Disable WAL mode if needed (e.g., for NFS or network filesystems)
408
+ sqlite = DWH.create(:sqlite, {
409
+ file: '/path/to/my/database.sqlite',
410
+ enable_wal: false
411
+ })
412
+ ```
413
+
414
+ ### Advanced Configuration
415
+
416
+ ```ruby
417
+ sqlite = DWH.create(:sqlite, {
418
+ file: '/path/to/analytics.sqlite',
419
+ readonly: false,
420
+ enable_wal: true, # Default: true
421
+ timeout: 10000, # 10 second busy timeout
422
+ pragmas: {
423
+ journal_mode: 'WAL', # Explicitly set WAL (done by default)
424
+ cache_size: -256000, # 256MB cache
425
+ page_size: 8192, # Larger page size for analytics
426
+ mmap_size: 536870912, # 512MB memory-mapped I/O
427
+ temp_store: 'MEMORY', # Keep temp data in memory
428
+ synchronous: 'NORMAL', # Balance between safety and speed
429
+ locking_mode: 'NORMAL' # Allow multiple connections
430
+ }
431
+ })
432
+ ```
433
+
434
+ ### Multiple Connections
435
+
436
+ Unlike DuckDB, SQLite allows multiple independent connections to the same database file:
437
+
438
+ ```ruby
439
+ # Multiple readers/writers to the same file
440
+ reader = DWH.create(:sqlite, { file: '/path/to/data.sqlite', readonly: true })
441
+ writer = DWH.create(:sqlite, { file: '/path/to/data.sqlite' })
442
+
443
+ # Both can operate concurrently with WAL mode enabled
444
+ data = reader.execute('SELECT * FROM sales')
445
+ writer.execute('INSERT INTO sales VALUES (...)')
446
+ ```
447
+
290
448
  ## Trino Adapter
291
449
 
292
450
  The Trino adapter requires the `trino-client-ruby` gem and works with both Trino and Presto.
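To make the SQLite `pragmas` option more concrete, the sketch below renders a pragma hash as the `PRAGMA` statements it implies. `pragma_statements` is a hypothetical helper written for this illustration; the quoting rule is copied from `apply_pragmas` in the SQLite adapter source further down in this diff, and nothing here touches a database.

```ruby
# Hypothetical helper: render a pragma hash as SQL statements without executing them.
def pragma_statements(pragmas)
  pragmas.map do |name, value|
    # Same rule as the adapter: quote strings unless they are all-uppercase keywords.
    formatted = value.is_a?(String) && value.upcase != value ? "'#{value}'" : value
    "PRAGMA #{name} = #{formatted}"
  end
end

statements = pragma_statements(
  cache_size: -128_000,   # negative values are sizes in KiB rather than page counts
  mmap_size: 268_435_456, # bytes (256 * 1024 * 1024)
  temp_store: 'MEMORY',
  synchronous: 'normal'   # lowercase on purpose, to show the quoting branch
)

puts statements
```

Under this rule an uppercase keyword such as `'MEMORY'` is passed through bare, while a lowercase or mixed-case string is wrapped in single quotes.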
@@ -40,9 +40,14 @@ postgres = DWH.create(:postgres, {
40
40
  password: 'password'
41
41
  })
42
42
 
43
+ # Connect to SQLite (lightweight, embedded)
44
+ sqlite = DWH.create(:sqlite, {
45
+ file: '/path/to/analytics.db'
46
+ })
47
+
43
48
  # Connect to DuckDB (in-memory)
44
49
  duckdb = DWH.create(:duckdb, {
45
- database: ':memory:'
50
+ file: ':memory:'
46
51
  })
47
52
  ```
48
53
 
data/docs/guides/usage.md CHANGED
@@ -293,7 +293,7 @@ native = adapter.execute(sql, format: :native) # Database's native format
293
293
  # Use streaming for large result sets
294
294
  def export_large_table(adapter, table_name, output_file)
295
295
  query = "SELECT * FROM #{table_name}"
296
-
296
+
297
297
  File.open(output_file, 'w') do |file|
298
298
  adapter.execute_stream(query, file)
299
299
  end
@@ -309,6 +309,38 @@ def process_large_dataset(adapter, query)
309
309
  end
310
310
  ```
311
311
 
312
+ ### SQLite Performance Tuning
313
+
314
+ The SQLite adapter comes with optimized defaults for analytical workloads, but it can be tuned further:
315
+
316
+ ```ruby
317
+ # High-performance SQLite configuration for analytics
318
+ sqlite = DWH.create(:sqlite, {
319
+ file: '/path/to/large_analytics.db',
320
+ enable_wal: true, # WAL mode for concurrent reads (default: true)
321
+ timeout: 30000, # 30 second busy timeout for heavy writes
322
+ pragmas: {
323
+ cache_size: -512000, # 512MB cache for large datasets
324
+ page_size: 8192, # Larger pages for sequential scans
325
+ mmap_size: 1073741824, # 1GB memory-mapped I/O
326
+ temp_store: 'MEMORY', # Keep temp tables in RAM
327
+ synchronous: 'NORMAL', # Balance safety/speed (safe with WAL)
328
+ journal_size_limit: 67108864 # 64MB journal limit
329
+ }
330
+ })
331
+
332
+ # Read-only analytics queries with maximum performance
333
+ readonly_analytics = DWH.create(:sqlite, {
334
+ file: '/path/to/data.db',
335
+ readonly: true, # Read-only for maximum concurrency
336
+ pragmas: {
337
+ cache_size: -256000, # 256MB cache
338
+ mmap_size: 2147483648, # 2GB memory mapping for large files
339
+ temp_store: 'MEMORY' # Fast temp operations
340
+ }
341
+ })
342
+ ```
343
+
312
344
  ## Error Handling and Debugging
313
345
 
314
346
  ### Comprehensive Error Handling
@@ -150,7 +150,7 @@ module DWH
150
150
 
151
151
  # True if the configuration was set up with a schema.
152
152
  def schema?
153
- config[:schema].present?
153
+ !config[:schema].nil? && !config[:schema]&.strip&.empty?
154
154
  end
155
155
 
156
156
  # (see Adapter#execute)
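The `schema?` change above swaps ActiveSupport's `present?` for an explicit nil-and-blank check. A minimal sketch of that predicate on plain values, with `schema_set?` as a hypothetical stand-in for the adapter method:

```ruby
# Hypothetical stand-in for the adapter's schema? predicate, taking the value directly.
def schema_set?(schema)
  # nil and whitespace-only strings both count as "no schema".
  !schema.nil? && !schema.strip.empty?
end

puts schema_set?('public')
puts schema_set?('   ')
puts schema_set?(nil)
```

Because `&&` short-circuits on the nil check, the safe-navigation operators in the diff's version are not strictly needed, though they do no harm.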
@@ -27,7 +27,7 @@ module DWH
27
27
  config :schema, String, default: 'public', message: 'schema name. defaults to "public"'
28
28
  config :username, String, required: true, message: 'connection username'
29
29
  config :password, String, required: false, default: nil, message: 'connection password'
30
- config :query_timeout, String, required: false, default: 3600, message: 'query execution timeout in seconds'
30
+ config :query_timeout, Integer, required: false, default: 3600, message: 'query execution timeout in seconds'
31
31
  config :ssl, Boolean, required: false, default: false, message: 'use ssl'
32
32
  config :client_name, String, required: false, default: 'DWH Ruby Gem', message: 'The name of the connecting app'
33
33
 
@@ -45,7 +45,7 @@ module DWH
45
45
  password: config[:password],
46
46
  application_name: config[:client_name]
47
47
  }.merge(extra_connection_params)
48
- properties[:options] = "#{properties[:options]} -c statement_timeout=#{config[:query_timeout]}s"
48
+ properties[:options] = "#{properties[:options]} -c statement_timeout=#{config[:query_timeout] * 1000}"
49
49
 
50
50
  @connection = PG.connect(properties)
51
51
 
@@ -114,7 +114,7 @@ module DWH
114
114
  db_table = Table.new table, schema: qualifiers[:schema]
115
115
 
116
116
  schema_where = ''
117
- if db_table.schema.present?
117
+ if db_table.schema?
118
118
  schema_where = "AND table_schema = '#{db_table.schema}'"
119
119
  elsif schema?
120
120
  schema_where = "AND table_schema in (#{qualified_schema_name})"
@@ -143,7 +143,7 @@ module DWH
143
143
 
144
144
  # True if the configuration was set up with a schema.
145
145
  def schema?
146
- config[:schema].present?
146
+ !config[:schema].nil? && !config[:schema]&.strip&.empty?
147
147
  end
148
148
 
149
149
  # (see Adapter#execute)
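The `statement_timeout` hunk above converts the configured seconds to milliseconds, the unit PostgreSQL assumes when the setting carries no unit suffix. A minimal sketch of the string being built, with `timeout_options` as a hypothetical helper that is not part of the gem:

```ruby
# Hypothetical helper mirroring how the connection options string is assembled.
def timeout_options(existing_options, query_timeout_seconds)
  # statement_timeout without a unit suffix is interpreted as milliseconds.
  "#{existing_options} -c statement_timeout=#{query_timeout_seconds * 1000}"
end

puts timeout_options('-c maintenance_work_mem=256MB', 3600)
```

This also suggests why `query_timeout` needs to be an `Integer`: in Ruby, `'3600' * 1000` repeats the string a thousand times instead of multiplying.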
@@ -0,0 +1,48 @@
1
+ module DWH
2
+ module Adapters
3
+ # Redshift adapter. Please ensure the pg gem is available before using this adapter.
4
+ # Generally, adapters should be created using {DWH::Factory#create DWH.create}, where a configuration
5
+ # is passed in as an options hash or argument list.
6
+ #
7
+ # @example Basic connection with required only options
8
+ # DWH.create(:redshift, {host: 'localhost', database: 'redshift',
9
+ # username: 'redshift'})
10
+ #
11
+ # @example Connection with cert based SSL connection
12
+ # DWH.create(:redshift, {host: 'localhost', database: 'redshift',
13
+ # username: 'redshift', ssl: true,
14
+ # extra_connection_params: { sslmode: 'require' }})
15
+ #
16
+ # valid sslmodes: disable, prefer, require, verify-ca, verify-full
17
+ # For modes requiring certs, make sure you add the appropriate params
18
+ # to extra_connection_params (e.g. sslrootcert, sslcert, etc.).
19
+ #
20
+ # @example Connection sending custom application name
21
+ # DWH.create(:redshift, {host: 'localhost', database: 'redshift',
22
+ # username: 'redshift', application_name: "Strata CLI" })
23
+ class Redshift < Postgres
24
+ config :host, String, required: true, message: 'server host ip address or domain name'
25
+ config :port, Integer, required: false, default: 5439, message: 'port to connect to'
26
+ config :database, String, required: true, message: 'name of database to connect to'
27
+ config :schema, String, default: 'public', message: 'schema name. defaults to "public"'
28
+ config :username, String, required: true, message: 'connection username'
29
+ config :password, String, required: false, default: nil, message: 'connection password'
30
+ config :query_timeout, Integer, required: false, default: 3600, message: 'query execution timeout in seconds'
31
+ config :client_name, String, required: false, default: 'DWH Ruby Gem', message: 'The name of the connecting app'
32
+ config :ssl, Boolean, required: false, default: false, message: 'use ssl'
33
+
34
+ # Need to override default add method
35
+ # since redshift doesn't support quarter as an
36
+ # interval.
37
+ # @param unit [String] Should be one of day, month, quarter etc
38
+ # @param val [String, Integer] The number of units to add
39
+ # @param exp [String] The sql expression to modify
40
+ def date_add(unit, val, exp)
41
+ gsk(:date_add)
42
+ .gsub(/@unit/i, unit)
43
+ .gsub(/@val/i, val.to_s)
44
+ .gsub(/@exp/i, exp)
45
+ end
46
+ end
47
+ end
48
+ end
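`date_add` fills a SQL template by replacing the `@unit`, `@val`, and `@exp` placeholders. The sketch below reproduces that substitution in isolation. The template string and the `fill_date_add` name are assumptions for illustration, since the Redshift settings file is not part of this excerpt.

```ruby
# Assumed template; the real one comes from the adapter's settings via gsk(:date_add).
DATE_ADD_TEMPLATE = 'DATEADD(@unit, @val, @exp)'

# Hypothetical helper performing the same case-insensitive placeholder substitution.
def fill_date_add(unit, val, exp, template: DATE_ADD_TEMPLATE)
  template
    .gsub(/@unit/i, unit)
    .gsub(/@val/i, val.to_s)
    .gsub(/@exp/i, exp)
end

puts fill_date_add('quarter', 2, 'order_date')
```

The `val.to_s` call is what lets callers pass either an `Integer` or a `String`, matching the documented parameter types.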
@@ -142,7 +142,7 @@ module DWH
142
142
  change_current_database(db_table.catalog)
143
143
 
144
144
  schema_where = ''
145
- schema_where = "AND table_schema = '#{db_table.schema}'" if db_table.schema.present?
145
+ schema_where = "AND table_schema = '#{db_table.schema}'" if db_table.schema?
146
146
 
147
147
  sql = <<-SQL
148
148
  SELECT column_name, data_type, character_maximum_length, numeric_precision,numeric_scale
@@ -0,0 +1,364 @@
1
+ module DWH
2
+ module Adapters
3
+ # SQLite adapter optimized for analytical workloads.
4
+ #
5
+ # This requires the ruby {https://github.com/sparklemotion/sqlite3-ruby sqlite3} gem.
6
+ #
7
+ # Generally, adapters should be created using {DWH::Factory#create DWH.create}, where a configuration
8
+ # is passed in as an options hash or argument list.
9
+ #
10
+ # @example Basic connection with required only options
11
+ # DWH.create(:sqlite, {file: 'path/to/my/database.db' })
12
+ #
13
+ # @example Open in read only mode
14
+ # DWH.create(:sqlite, {file: 'path/to/my/database.db', readonly: true})
15
+ #
16
+ # @example Configure with custom performance pragmas
17
+ # DWH.create(:sqlite, {file: 'path/to/my/database.db',
18
+ # pragmas: { cache_size: -128000, mmap_size: 268435456 }})
19
+ #
20
+ # @note This adapter enables WAL mode by default for better concurrent read performance.
21
+ # Set `enable_wal: false` to disable this behavior.
22
+ class Sqlite < Adapter
23
+ config :file, String, required: true, message: 'path/to/sqlite/db'
24
+ config :readonly, Boolean, required: false, default: false, message: 'open database in read-only mode'
25
+ config :enable_wal, Boolean, required: false, default: true, message: 'enable WAL mode for better concurrency'
26
+ config :pragmas, Hash, required: false, message: 'hash of PRAGMA statements for performance tuning'
27
+ config :timeout, Integer, required: false, default: 5000, message: 'busy timeout in milliseconds'
28
+
29
+ # Default pragmas optimized for analytical workloads
30
+ DEFAULT_PRAGMAS = {
31
+ cache_size: -64_000, # 64MB cache (negative means KB)
32
+ temp_store: 'MEMORY', # Store temp tables in memory
33
+ mmap_size: 134_217_728, # 128MB memory-mapped I/O
34
+ page_size: 4096, # Standard page size
35
+ synchronous: 'NORMAL' # Faster than FULL, safe with WAL
36
+ }.freeze
37
+
38
+ # (see Adapter#connection)
39
+ def connection
40
+ return @connection if @connection
41
+
42
+ options = build_open_options
43
+ @connection = SQLite3::Database.new(config[:file], options)
44
+
45
+ # Set busy timeout to handle concurrent access
46
+ @connection.busy_timeout(config[:timeout])
47
+
48
+ # Don't return results as hash by default for performance
49
+ @connection.results_as_hash = false
50
+
51
+ # Enable WAL mode for concurrent reads (unless disabled or readonly)
52
+ @connection.execute('PRAGMA journal_mode = WAL') if config[:enable_wal] && !config[:readonly]
53
+
54
+ # Apply default pragmas
55
+ apply_pragmas(DEFAULT_PRAGMAS)
56
+
57
+ # Apply user-specified pragmas (will override defaults)
58
+ apply_pragmas(config[:pragmas]) if config.key?(:pragmas)
59
+
60
+ @connection
61
+ rescue StandardError => e
62
+ raise ConfigError, e.message
63
+ end
64
+
65
+ # (see Adapter#close)
66
+ def close
67
+ return if @connection.nil?
68
+
69
+ @connection.close unless @connection.closed?
70
+ @connection = nil
71
+ end
72
+
73
+ # (see Adapter#test_connection)
74
+ def test_connection(raise_exception: false)
75
+ connection
76
+ connection.execute('SELECT 1')
77
+ true
78
+ rescue StandardError => e
79
+ raise ConnectionError, e.message if raise_exception
80
+
81
+ false
82
+ end
83
+
84
+ # (see Adapter#tables)
85
+ def tables(**qualifiers)
86
+ sql = "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
87
+
88
+ res = execute(sql)
89
+ res.flatten
90
+ end
91
+
92
+ # (see Adapter#stats)
93
+ def stats(table, date_column: nil, **qualifiers)
94
+ db_table = Table.new table, **qualifiers
95
+
96
+ sql = <<-SQL
97
+ SELECT count(*) AS ROW_COUNT
98
+ #{date_column.nil? ? '' : ", min(#{date_column}) AS DATE_START"}
99
+ #{date_column.nil? ? '' : ", max(#{date_column}) AS DATE_END"}
100
+ FROM #{db_table.physical_name}
101
+ SQL
102
+
103
+ result = execute(sql)
104
+ TableStats.new(
105
+ row_count: result.first[0],
106
+ date_start: date_column ? result.first[1] : nil,
107
+ date_end: date_column ? result.first[2] : nil
108
+ )
109
+ end
110
+
111
+ # (see Adapter#metadata)
112
+ def metadata(table, **qualifiers)
113
+ db_table = Table.new table, **qualifiers
114
+
115
+ # SQLite uses PRAGMA table_info for metadata
116
+ sql = "PRAGMA table_info(#{db_table.physical_name})"
117
+
118
+ cols = execute(sql)
119
+ cols.each do |col|
120
+ # PRAGMA table_info returns: cid, name, type, notnull, dflt_value, pk
121
+ db_table << Column.new(
122
+ name: col[1],
123
+ data_type: col[2],
124
+ precision: nil,
125
+ scale: nil,
126
+ max_char_length: nil
127
+ )
128
+ end
129
+
130
+ db_table
131
+ end
132
+
133
+ # (see Adapter#execute)
134
+ def execute(sql, format: :array, retries: 0)
135
+ begin
136
+ result = with_debug(sql) { with_retry(retries) { connection.execute(sql) } }
137
+ rescue StandardError => e
138
+ raise ExecutionError, e.message
139
+ end
140
+
141
+ format = format.downcase if format.is_a?(String)
142
+ case format.to_sym
143
+ when :array
144
+ result
145
+ when :object
146
+ result_to_hash(sql, result)
147
+ when :csv
148
+ result_to_csv(sql, result)
149
+ when :native
150
+ result
151
+ else
152
+ raise UnsupportedCapability, "Unsupported format: #{format} for this #{name}"
153
+ end
154
+ end
155
+
156
+ # (see Adapter#execute_stream)
157
+ def execute_stream(sql, io, stats: nil, retries: 0)
158
+ with_debug(sql) do
159
+ with_retry(retries) do
160
+ stmt = connection.prepare(sql)
161
+ columns = stmt.columns
162
+
163
+ io.write(CSV.generate_line(columns))
164
+
165
+ stmt.execute.each do |row|
166
+ stats << row unless stats.nil?
167
+ io.write(CSV.generate_line(row))
168
+ end
169
+
170
+ stmt.close
171
+ end
172
+ end
173
+
174
+ io.rewind
175
+ io
176
+ rescue StandardError => e
177
+ raise ExecutionError, e.message
178
+ end
179
+
180
+ # (see Adapter#stream)
181
+ def stream(sql, &block)
182
+ with_debug(sql) do
183
+ stmt = connection.prepare(sql)
184
+ stmt.execute.each do |row|
185
+ block.call(row)
186
+ end
187
+ stmt.close
188
+ end
189
+ end
190
+
191
+ # Custom date truncation implementation. SQLite doesn't offer
192
+ # a native DATE_TRUNC function. We use 'start of' modifiers
193
+ # for year, month, and day, and custom logic for quarter and week.
194
+ # @see Dates#truncate_date
195
+ def truncate_date(unit, exp)
196
+ unit = unit.strip.downcase
197
+
198
+ case unit
199
+ when 'year'
200
+ "date(#{exp}, 'start of year')"
201
+ when 'quarter'
202
+ # Calculate quarter start using CASE statement
203
+ # Q1: Jan-Mar (months 1-3) -> start of year
204
+ # Q2: Apr-Jun (months 4-6) -> start of year + 3 months
205
+ # Q3: Jul-Sep (months 7-9) -> start of year + 6 months
206
+ # Q4: Oct-Dec (months 10-12) -> start of year + 9 months
207
+ '(CASE ' \
208
+ "WHEN CAST(strftime('%m', #{exp}) AS INTEGER) BETWEEN 1 AND 3 THEN date(#{exp}, 'start of year') " \
209
+ "WHEN CAST(strftime('%m', #{exp}) AS INTEGER) BETWEEN 4 AND 6 THEN date(#{exp}, 'start of year', '+3 months') " \
210
+ "WHEN CAST(strftime('%m', #{exp}) AS INTEGER) BETWEEN 7 AND 9 THEN date(#{exp}, 'start of year', '+6 months') " \
211
+ "ELSE date(#{exp}, 'start of year', '+9 months') " \
212
+ 'END)'
213
+ when 'month'
214
+ "date(#{exp}, 'start of month')"
215
+ when 'week'
216
+ # Use week start day from settings
217
+ gsk("#{settings[:week_start_day].downcase}_week_start_day")
218
+ .gsub(/@exp/i, exp)
219
+ when 'day', 'date'
220
+ "date(#{exp})"
221
+ when 'hour'
222
+ # SQLite datetime returns timestamp, truncate to hour
223
+ "datetime(strftime('%Y-%m-%d %H:00:00', #{exp}))"
224
+ when 'minute'
225
+ "datetime(strftime('%Y-%m-%d %H:%M:00', #{exp}))"
226
+ when 'second'
227
+ "datetime(strftime('%Y-%m-%d %H:%M:%S', #{exp}))"
228
+ else
229
+ raise UnsupportedCapability, "Currently not supporting truncation at #{unit} level"
230
+ end
231
+ end
232
+
233
+ # SQLite's strftime doesn't support %A (day name) or %B (month name)
234
+ # We need to implement these using CASE statements based on day/month numbers
235
+ def extract_day_name(exp, abbreviate: false)
236
+ day_num = "CAST(strftime('%w', #{exp}) AS INTEGER)"
237
+
238
+ if abbreviate
239
+ # Abbreviated day names: SUN, MON, TUE, etc.
240
+ "(CASE #{day_num} " \
241
+ "WHEN 0 THEN 'SUN' " \
242
+ "WHEN 1 THEN 'MON' " \
243
+ "WHEN 2 THEN 'TUE' " \
244
+ "WHEN 3 THEN 'WED' " \
245
+ "WHEN 4 THEN 'THU' " \
246
+ "WHEN 5 THEN 'FRI' " \
247
+ "WHEN 6 THEN 'SAT' " \
248
+ 'END)'
249
+ else
250
+ # Full day names: SUNDAY, MONDAY, TUESDAY, etc.
251
+ "(CASE #{day_num} " \
252
+ "WHEN 0 THEN 'SUNDAY' " \
253
+ "WHEN 1 THEN 'MONDAY' " \
254
+ "WHEN 2 THEN 'TUESDAY' " \
255
+ "WHEN 3 THEN 'WEDNESDAY' " \
256
+ "WHEN 4 THEN 'THURSDAY' " \
257
+ "WHEN 5 THEN 'FRIDAY' " \
258
+ "WHEN 6 THEN 'SATURDAY' " \
259
+ 'END)'
260
+ end
261
+ end
262
+
263
+ def extract_month_name(exp, abbreviate: false)
264
+ month_num = "CAST(strftime('%m', #{exp}) AS INTEGER)"
265
+
266
+ if abbreviate
267
+ # Abbreviated month names: JAN, FEB, MAR, etc.
268
+ "(CASE #{month_num} " \
269
+ "WHEN 1 THEN 'JAN' " \
270
+ "WHEN 2 THEN 'FEB' " \
271
+ "WHEN 3 THEN 'MAR' " \
272
+ "WHEN 4 THEN 'APR' " \
273
+ "WHEN 5 THEN 'MAY' " \
274
+ "WHEN 6 THEN 'JUN' " \
275
+ "WHEN 7 THEN 'JUL' " \
276
+ "WHEN 8 THEN 'AUG' " \
277
+ "WHEN 9 THEN 'SEP' " \
278
+ "WHEN 10 THEN 'OCT' " \
279
+ "WHEN 11 THEN 'NOV' " \
280
+ "WHEN 12 THEN 'DEC' " \
281
+ 'END)'
282
+ else
283
+ # Full month names: JANUARY, FEBRUARY, MARCH, etc.
284
+ "(CASE #{month_num} " \
285
+ "WHEN 1 THEN 'JANUARY' " \
286
+ "WHEN 2 THEN 'FEBRUARY' " \
287
+ "WHEN 3 THEN 'MARCH' " \
288
+ "WHEN 4 THEN 'APRIL' " \
289
+ "WHEN 5 THEN 'MAY' " \
290
+ "WHEN 6 THEN 'JUNE' " \
291
+ "WHEN 7 THEN 'JULY' " \
292
+ "WHEN 8 THEN 'AUGUST' " \
293
+ "WHEN 9 THEN 'SEPTEMBER' " \
294
+ "WHEN 10 THEN 'OCTOBER' " \
295
+ "WHEN 11 THEN 'NOVEMBER' " \
296
+ "WHEN 12 THEN 'DECEMBER' " \
297
+ 'END)'
298
+ end
299
+ end
300
+
301
+ # SQLite's CAST(... AS DATE) doesn't work properly - it just extracts the year
302
+ # We need to override cast to use the date() function for DATE types
303
+ def cast(exp, type)
304
+ if type.to_s.downcase == 'date'
305
+ "date(#{exp})"
306
+ else
307
+ super
308
+ end
309
+ end
310
+
311
+ def valid_config?
312
+ super
313
+ require 'sqlite3'
314
+ rescue LoadError
315
+ raise ConfigError, "Required 'sqlite3' gem missing. Please add it to your Gemfile."
316
+ end
317
+
318
+ private
319
+
320
+ def build_open_options
321
+ options = {}
322
+ options[:readonly] = true if config[:readonly]
323
+ options
324
+ end
325
+
326
+ def apply_pragmas(pragmas)
327
+ return unless pragmas
328
+
329
+ pragmas.each do |pragma, value|
330
+ # Format value appropriately (quote strings, leave numbers/keywords as-is)
331
+ formatted_value = value.is_a?(String) && value.upcase != value ? "'#{value}'" : value
332
+ @connection.execute("PRAGMA #{pragma} = #{formatted_value}")
333
+ end
334
+ end
335
+
336
+ def result_to_hash(sql, result)
337
+ return [] if result.empty?
338
+
339
+ # Get column names by preparing statement
340
+ stmt = connection.prepare(sql)
341
+ columns = stmt.columns
342
+ stmt.close
343
+
344
+ result.map do |row|
345
+ columns.zip(row).to_h
346
+ end
347
+ end
348
+
349
+ def result_to_csv(sql, result)
350
+ # Get column names by preparing statement
351
+ stmt = connection.prepare(sql)
352
+ columns = stmt.columns
353
+ stmt.close
354
+
355
+ CSV.generate do |csv|
356
+ csv << columns
357
+ result.each do |row|
358
+ csv << row
359
+ end
360
+ end
361
+ end
362
+ end
363
+ end
364
+ end
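The quarter branch of `truncate_date` maps a month to the first day of its quarter. The same mapping in plain Ruby can serve as a sanity check on the CASE logic. `quarter_start` is a hypothetical helper that uses only the standard `date` library and does not generate SQL.

```ruby
require 'date'

# Hypothetical helper: first day of the quarter containing the given date.
def quarter_start(date)
  # Months 1-3 -> January, 4-6 -> April, 7-9 -> July, 10-12 -> October.
  first_month = ((date.month - 1) / 3) * 3 + 1
  Date.new(date.year, first_month, 1)
end

puts quarter_start(Date.new(2025, 10, 12))
puts quarter_start(Date.new(2025, 5, 31))
```

Comparing values like these against the adapter's generated SQL on a few sample dates is one way to check the `'start of year', '+N months'` arithmetic.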
data/lib/dwh/adapters.rb CHANGED
@@ -80,12 +80,12 @@ module DWH
80
80
  attr_reader :config
81
81
 
82
82
  def initialize(config)
83
- @config = config.symbolize_keys
83
+ @config = config.transform_keys(&:to_sym)
84
84
  # Per instance customization of general settings
85
85
  # So you can have multiple connections to Trino
86
86
  # but exhibit diff behavior
87
87
  @settings = self.class.adapter_settings.merge(
88
- (config[:settings] || {}).symbolize_keys
88
+ (config[:settings] || {}).transform_keys(&:to_sym)
89
89
  )
90
90
 
91
91
  valid_config?
@@ -300,7 +300,7 @@ module DWH
300
300
  # Adapter name from the class name
301
301
  # @return [String]
302
302
  def adapter_name
303
- self.class.name.demodulize
303
+ self.class.name.split('::').last.downcase
304
304
  end
305
305
 
306
306
  # If any extra connection params were passed in the config
data/lib/dwh/column.rb CHANGED
@@ -22,7 +22,7 @@ module DWH
22
22
 
23
23
  DEFAULT_RULES = { /[_+]+/ => ' ', /\s+id$/i => ' ID', /desc/i => 'Description' }.freeze
24
24
  def namify(rules = DEFAULT_RULES)
25
- named = name.titleize keep_id_suffix: true
25
+ named = titleize(name)
26
26
  rules.each do |k, v|
27
27
  named = named.gsub(Regexp.new(k), v)
28
28
  end
@@ -75,5 +75,16 @@ module DWH
75
75
  def to_s
76
76
  "<Column:#{name}:#{data_type}>"
77
77
  end
78
+
79
+ def titleize(name)
80
+ # Handle underscores, dashes, and multiple spaces
81
+ # Also preserves existing spacing patterns better
82
+ name.gsub(/[_-]/, ' ') # Convert underscores and dashes to spaces
83
+ .gsub(/\s+/, ' ') # Normalize multiple spaces to single spaces
84
+ .strip # Remove leading/trailing whitespace
85
+ .split(' ') # Split into words
86
+ .map(&:capitalize) # Capitalize each word
87
+ .join(' ') # Join with single spaces
88
+ end
78
89
  end
79
90
  end
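The hand-rolled `titleize` above can be tried in isolation. The snippet below is a standalone copy of the same chain, lifted out of the `Column` class for experimentation; the sample inputs are made up.

```ruby
# Standalone copy of the titleize chain from the diff.
def titleize(name)
  name.gsub(/[_-]/, ' ')  # underscores and dashes become spaces
      .gsub(/\s+/, ' ')   # collapse runs of whitespace
      .strip
      .split(' ')
      .map(&:capitalize)  # capitalize also lowercases the rest of each word
      .join(' ')
end

puts titleize('customer_id')
puts titleize('order-total  amount')
```

Since `String#capitalize` lowercases everything after the first letter, an input such as `SKU` comes back as `Sku`. The `DEFAULT_RULES` in `namify` then restore a trailing ` ID`.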
@@ -124,12 +124,27 @@ module DWH
124
124
  gsk(:date_literal).gsub(/@val/i, val)
125
125
  end
126
126
 
127
+ # @see #date_literal
128
+ def date_lit(val)
129
+ date_literal(val)
130
+ end
131
+
127
132
  # @param val [String, Date, DateTime, Time]
128
133
  def date_time_literal(val)
129
134
  val = DATE_CLASSES.include?(val.class) ? val.strftime(date_time_format) : val
130
135
  gsk(:date_time_literal).gsub(/@val/i, val)
131
136
  end
132
137
 
138
+ # @see #date_time_literal
139
+ def timestamp_lit(val)
140
+ date_time_literal(val)
141
+ end
142
+
143
+ # @see #date_time_literal
144
+ def timestamp_literal(val)
145
+ date_time_literal(val)
146
+ end
147
+
133
148
  # The current default week start day. This is how
134
149
  # the db is currently set up. Should be either monday or sunday
135
150
  def default_week_start_day
@@ -13,20 +13,20 @@ abbreviated_day_name_format: "EEE"
13
13
  month_name_format: "MMMM"
14
14
  abbreviated_month_name_format: "MMM"
15
15
 
16
- date_add: "date_add(@unit, @val, @exp)"
17
- date_diff: "date_diff(@unit, @start_exp, @end_exp)"
18
- date_format_sql: "date_format(@exp, '@format')"
19
- extract_day_of_year: 'dayofyear(@exp)'
20
- extract_day_of_week: 'dayofweek(@exp)'
21
- extract_week_of_year: 'weekofyear(@exp)'
22
- extract_year_month: 'cast(concat(year(@exp), lpad(month(@exp), 2, "0")) as int)'
16
+ date_add: "DATE_ADD(@unit, @val, @exp)"
17
+ date_diff: "DATE_DIFF(@unit, @start_exp, @end_exp)"
18
+ date_format_sql: "DATE_FORMAT(@exp, '@format')"
19
+ extract_day_of_year: 'DAYOFYEAR(@exp)'
20
+ extract_day_of_week: 'DAYOFWEEK(@exp)'
21
+ extract_week_of_year: 'WEEKOFYEAR(@exp)'
22
+ extract_year_month: 'CAST(CONCAT(YEAR(@exp), LPAD(MONTH(@exp), 2, "0")) as INT)'
23
23
 
24
24
  cast: "CAST(@exp AS @type)"
25
25
 
26
26
  # string functions
27
- trim: "trim(@exp)"
28
- lower_case: "lower(@exp)"
29
- upper_case: "upper(@exp)"
27
+ trim: "TRIM(@exp)"
28
+ lower_case: "LOWER(@exp)"
29
+ upper_case: "UPPER(@exp)"
30
30
 
31
31
  # null handling
32
32
  if_null: "COALESCE(@exp, @when_null)"
@@ -45,7 +45,7 @@ supports_window_functions: true
45
45
  extend_ending_date_to_last_hour_of_day: false # druid needs this for inclusive filtering
46
46
 
47
47
  # array operations
48
- array_in_list: "exists(@exp, x -> x IN (@list))"
49
- array_exclude_list: "not exists(@exp, x -> x IN (@list))"
50
- array_unnest_join: "LATERAL VIEW explode(@exp) AS @alias"
48
+ array_in_list: "EXISTS(@exp, x -> x IN (@list))"
49
+ array_exclude_list: "NOT EXISTS(@exp, x -> x IN (@list))"
50
+ array_unnest_join: "LATERAL VIEW EXPLODE(@exp) AS @alias"
51
51
 
@@ -25,9 +25,9 @@ sunday_week_start_day: "TIME_FLOOR(@exp, 'P7D', TIMESTAMP '1970-01-04 00:00:00')
25
25
  monday_week_start_day: "TIME_FLOOR(@exp, 'P7D', TIMESTAMP '1970-01-05 00:00:00')"
26
26
 
27
27
  # string functions
28
- trim: "trim(@exp)"
29
- lower_case: "lower(@exp)"
30
- upper_case: "upper(@exp)"
28
+ trim: "TRIM(@exp)"
29
+ lower_case: "LOWER(@exp)"
30
+ upper_case: "UPPER(@exp)"
31
31
 
32
32
  # Relevant db capabilities
33
33
  supports_table_join: true
@@ -38,7 +38,7 @@ upper_case: "UPPER(@exp)"
38
38
  create_temp_table_template: "CREATE TEMP TABLE @table AS \n@sql"
39
39
 
40
40
  # array operations
41
- array_in_list: "array_length(array_intersect(@exp, @list)) > 0"
42
- array_exclude_list: "array_length(array_intersect(@exp, @list)) = 0"
41
+ array_in_list: "ARRAY_LENGTH(ARRAY_INTERSECT(@exp, @list)) > 0"
42
+ array_exclude_list: "ARRAY_LENGTH(ARRAY_INTERSECT(@exp, @list)) = 0"
43
43
  array_unnest_join: ", LATERAL (SELECT UNNEST(@exp)) AS @alias"
44
44
 
@@ -24,8 +24,8 @@ extract_minute: 'MINUTE(@exp)'
24
24
  extract_year_month: 'CAST(CONCAT(YEAR(@exp), LPAD(MONTH(@exp), 2, "0")) AS UNSIGNED)'
25
25
  default_week_start_day: "sunday"
26
26
  week_start_day: "monday"
27
- sunday_week_start_day: "DATE(DATE_SUB(@exp, INTERVAL dayofweek(@exp)-1 DAY ))"
28
- monday_week_start_day: "DATE(DATE_SUB(@exp, INTERVAL dayofweek(@exp)-2 DAY ))"
27
+ sunday_week_start_day: "DATE(DATE_SUB(@exp, INTERVAL DAYOFWEEK(@exp)-1 DAY ))"
28
+ monday_week_start_day: "DATE(DATE_SUB(@exp, INTERVAL DAYOFWEEK(@exp)-2 DAY ))"
29
29
  cast: "CAST(@exp AS @type)"
30
30
 
31
31
  # string functions
@@ -1,6 +1,6 @@
1
1
 
2
2
  date_add: "(@exp + '@val @unit'::interval)"
3
- date_diff: "age(@start_exp, @end_exp)"
3
+ date_diff: "AGE(@start_exp, @end_exp)"
4
4
  date_format_sql: "TO_CHAR(@exp, '@format')"
5
5
  date_literal: "'@val'::DATE"
6
6
  date_time_literal: "'@val'::TIMESTAMP"
@@ -12,16 +12,16 @@ abbreviated_month_name_format: "Mon"
12
12
  sunday_week_start_day: "( DATE_TRUNC('WEEK', @exp + INTERVAL '1 DAY') - INTERVAL '1 DAY' )"
13
13
  monday_week_start_day: "( DATE_TRUNC('WEEK', @exp - INTERVAL '1 DAY') + INTERVAL '1 DAY' )"
14
14
 
15
- extract_year: 'extract(year from @exp)'
16
- extract_month: 'extract(month from @exp)'
17
- extract_quarter: 'extract(quarter from @exp)'
18
- extract_day_of_year: 'extract(DOY from @exp)'
19
- extract_day_of_month: 'extract(DAY from @exp)'
20
- extract_day_of_week: 'extract(DOW from @exp)'
21
- extract_week_of_year: 'extract(WEEK from @exp)'
22
- extract_hour: 'extract(HOUR from @exp)'
23
- extract_minute: 'extract(MINUTE from @exp)'
24
- extract_year_month: "cast((extract(year from @exp)::varchar || TO_CHAR(@exp, 'MM')) as integer)"
15
+ extract_year: 'EXTRACT(year from @exp)'
16
+ extract_month: 'EXTRACT(month from @exp)'
17
+ extract_quarter: 'EXTRACT(quarter from @exp)'
18
+ extract_day_of_year: 'EXTRACT(DOY from @exp)'
19
+ extract_day_of_month: 'EXTRACT(DAY from @exp)'
20
+ extract_day_of_week: 'EXTRACT(DOW from @exp)'
21
+ extract_week_of_year: 'EXTRACT(WEEK from @exp)'
22
+ extract_hour: 'EXTRACT(HOUR from @exp)'
23
+ extract_minute: 'EXTRACT(MINUTE from @exp)'
24
+ extract_year_month: "CAST((EXTRACT(YEAR FROM @exp)::varchar || TO_CHAR(@exp, 'MM')) as INTEGER)"
25
25
 
26
26
  cast: "@exp::@type"
27
27
 
@@ -1,28 +1,19 @@
1
-
2
- # quotes and string lit
3
- quote: "\"@exp\""
4
- string_literal: "'@exp'"
5
-
6
1
  # Date Literal Formats
7
- date_format: "%Y-%m-%d"
8
- date_time_format: "%Y-%m-%d %H:%M:%S"
9
- date_time_tz_format: "%Y-%m-%d %H:%M:%S %Z"
10
- date_type: "string" # alternative is int, integer, dateint
11
2
  day_name_format: "Day"
12
3
  abbreviated_day_name_format: "Dy"
13
4
  month_name_format: "Month"
14
5
  abbreviated_month_name_format: "Mon"
15
6
 
16
7
  # Date functions patterns
17
- current_date: "current_date"
18
- current_time: "current_time"
19
- current_timestamp: "current_timestamp"
20
- truncate_date: "date_trunc('@unit', @exp)"
21
- date_add: "dateadd(@unit, @val, @exp)"
22
- date_diff: "datediff(@unit, @start_exp, @end_exp)"
8
+ current_date: "CURRENT_DATE"
9
+ current_time: "CURRENT_TIME"
10
+ current_timestamp: "CURRENT_TIMESTAMP"
11
+ truncate_date: "DATE_TRUNC('@unit', @exp)"
12
+ date_add: "DATEADD(@unit, @val, @exp)"
13
+ date_diff: "DATEDIFF(@unit, @start_exp, @end_exp)"
23
14
  date_format_sql: "TO_CHAR(@exp, '@format')"
24
- date_literal: "'@val'"
25
- date_time_literal: "TIMESTAMP '@val'"
15
+ date_literal: "'@val'::DATE"
16
+ date_time_literal: "'@val'::TIMESTAMP"
26
17
  extract_year: 'EXTRACT(YEAR FROM @exp)'
27
18
  extract_month: 'EXTRACT(MONTH FROM @exp)'
28
19
  extract_quarter: 'EXTRACT(QUARTER FROM @exp)'
@@ -33,15 +24,15 @@ extract_week_of_year: 'EXTRACT(WEEK FROM @exp)'
33
24
  extract_hour: 'EXTRACT(HOUR FROM @exp)'
34
25
  extract_minute: 'EXTRACT(MINUTE FROM @exp)'
35
26
  extract_year_month: "TO_CHAR(@exp, 'YYYYMM')::INTEGER"
36
- default_week_start_day: "sunday" # Redshift uses Sunday as default
37
- week_start_day: "sunday"
38
- sunday_week_start_day: "DATEADD(day, -1, DATE_TRUNC(WEEK, DATEADD(DAY, 1, @exp)))"
39
- monday_week_start_day: "DATEADD(day, 1, DATE_TRUNC(WEEK, DATEADD(day, -1, @exp)))"
27
+ default_week_start_day: "monday" # Redshift uses Sunday as default
28
+ week_start_day: "monday"
29
+ sunday_week_start_day: "DATEADD(day, -1, DATE_TRUNC('WEEK', DATEADD(DAY, 1, @exp)))"
30
+ monday_week_start_day: "DATEADD(day, 1, DATE_TRUNC('WEEK', DATEADD(day, -1, @exp)))"
40
31
 
41
32
  # string functions
42
- trim: "trim(@exp)"
43
- lower_case: "lower(@exp)"
44
- upper_case: "upper(@exp)"
33
+ trim: "TRIM(@exp)"
34
+ lower_case: "LOWER(@exp)"
35
+ upper_case: "UPPER(@exp)"
45
36
 
46
37
  # null handling
47
38
  if_null: "COALESCE(@exp, @when_null)"
@@ -10,29 +10,29 @@ month_name_format: "MMMM"
10
10
  abbreviated_month_name_format: "MON"
11
11
 
12
12
  # Date functions patterns
13
- current_date: "current_date()"
14
- current_time: "current_time()"
15
- current_timestamp: "current_timestamp()"
13
+ current_date: "CURRENT_DATE()"
14
+ current_time: "CURRENT_TIME()"
15
+ current_timestamp: "CURRENT_TIMESTAMP()"
16
16
 
17
- date_add: "dateadd(@unit, @val, @exp)"
18
- date_diff: "datediff(@unit, @start_exp, @end_exp)"
17
+ date_add: "DATEADD(@unit, @val, @exp)"
18
+ date_diff: "DATEDIFF(@unit, @start_exp, @end_exp)"
19
19
  date_format_sql: "TO_VARCHAR(@exp, '@format')"
20
20
  date_literal: "'@val'::DATE"
21
21
  date_time_literal: "'@val'::TIMESTAMP"
22
- extract_year: 'year(@exp)'
23
- extract_month: 'month(@exp)'
24
- extract_quarter: 'quarter(@exp)'
22
+ extract_year: 'YEAR(@exp)'
23
+ extract_month: 'MONTH(@exp)'
24
+ extract_quarter: 'QUARTER(@exp)'
25
25
  extract_day_of_year: 'DAYOFYEAR(@exp)'
26
- extract_day_of_month: 'day(@exp)'
26
+ extract_day_of_month: 'DAY(@exp)'
27
27
  extract_day_of_week: 'DAYOFWEEK(@exp)'
28
- extract_week_of_year: 'week(@exp)'
29
- extract_hour: 'hour(@exp)'
30
- extract_minute: 'minute(@exp)'
31
- extract_year_month: 'cast((TO_VARCHAR(@exp, ''YYYY'') || TO_VARCHAR(@exp, ''MM'')) as integer)'
28
+ extract_week_of_year: 'WEEK(@exp)'
29
+ extract_hour: 'HOUR(@exp)'
30
+ extract_minute: 'MINUTE(@exp)'
31
+ extract_year_month: 'CAST((TO_VARCHAR(@exp, ''YYYY'') || TO_VARCHAR(@exp, ''MM'')) as integer)'
32
32
  default_week_start_day: "monday"
33
33
  week_start_day: "monday"
34
- sunday_week_start_day: "dateadd(day, -1,date_trunc('week', dateadd(day,1, @exp)))"
35
- monday_week_start_day: "dateadd(day, 1,date_trunc('week', dateadd(day,-1, @exp)))"
34
+ sunday_week_start_day: "DATEADD(day, -1,DATE_TRUNC('week', DATEADD(day,1, @exp)))"
35
+ monday_week_start_day: "DATEADD(day, 1,DATE_TRUNC('week', DATEADD(day,-1, @exp)))"
36
36
 
37
37
  # array operations
38
38
  array_in_list: "ARRAY_CONTAINS(@exp, ARRAY_CONSTRUCT(@list))"
@@ -0,0 +1,42 @@
1
+
2
+
3
+ # Date Literal Formats
4
+ day_name_format: "%A"
5
+ abbreviated_day_name_format: "%a"
6
+ month_name_format: "%B"
7
+ abbreviated_month_name_format: "%b"
8
+
9
+ # Date functions patterns
10
+ # SQLite uses strftime for most date operations
11
+ date_add: "datetime(@exp, '+@val @unit')"
12
+ date_diff: "CAST((julianday(@end_exp) - julianday(@start_exp)) AS INTEGER)"
13
+ date_format_sql: "strftime('@format', @exp)"
14
+ extract_year: "CAST(strftime('%Y', @exp) AS INTEGER)"
15
+ extract_month: "CAST(strftime('%m', @exp) AS INTEGER)"
16
+ extract_quarter: "CAST(((strftime('%m', @exp) - 1) / 3) + 1 AS INTEGER)"
17
+ extract_day_of_year: "CAST(strftime('%j', @exp) AS INTEGER)"
18
+ extract_day_of_month: "CAST(strftime('%d', @exp) AS INTEGER)"
19
+ extract_day_of_week: "CAST(strftime('%w', @exp) AS INTEGER)"
20
+ extract_week_of_year: "CAST(strftime('%W', @exp) AS INTEGER)"
21
+ extract_hour: "CAST(strftime('%H', @exp) AS INTEGER)"
22
+ extract_minute: "CAST(strftime('%M', @exp) AS INTEGER)"
23
+ extract_year_month: "CAST(strftime('%Y%m', @exp) AS INTEGER)"
24
+ default_week_start_day: "monday"
25
+ week_start_day: "monday"
26
+ sunday_week_start_day: "date(@exp, '-' || strftime('%w', @exp) || ' days')"
27
+ monday_week_start_day: "date(@exp, '-' || ((CAST(strftime('%w', @exp) AS INTEGER) + 6) % 7) || ' days')"
28
+
29
+ cast: "CAST(@exp AS @type)"
30
+
31
+ # string functions
32
+ trim: "TRIM(@exp)"
33
+ lower_case: "LOWER(@exp)"
34
+ upper_case: "UPPER(@exp)"
35
+
36
+ # Relevant db capabilities for query gen
37
+ create_temp_table_template: "CREATE TEMP TABLE @table AS \n@sql"
38
+
39
+ # array operations - SQLite has limited native array support
40
+ # These would need JSON extension or custom implementation
41
+ # array_in_list: "EXISTS(SELECT 1 FROM json_each(@exp) WHERE value IN @list)"
42
+ # array_exclude_list: "NOT EXISTS(SELECT 1 FROM json_each(@exp) WHERE value IN @list)"
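Note on the SQLite week-start templates above: both subtract a number of days derived from `strftime('%w', ...)`, which is 0 for Sunday. The arithmetic can be checked outside the database with the sketch below. The helper names are hypothetical, and Ruby's `Date#wday` stands in for `%w` because it uses the same 0-for-Sunday numbering.

```ruby
require 'date'

# Days to step back to reach the Sunday on or before the date.
# Mirrors: date(@exp, '-' || strftime('%w', @exp) || ' days')
def days_back_to_sunday(date)
  date.wday
end

# Days to step back to reach the Monday on or before the date.
# Mirrors: (CAST(strftime('%w', @exp) AS INTEGER) + 6) % 7
def days_back_to_monday(date)
  (date.wday + 6) % 7
end

def sunday_week_start(date)
  date - days_back_to_sunday(date)
end

def monday_week_start(date)
  date - days_back_to_monday(date)
end

sunday = Date.new(2025, 10, 12)    # a Sunday
wednesday = Date.new(2025, 10, 15) # the following Wednesday

puts sunday_week_start(sunday)     # 2025-10-12
puts monday_week_start(sunday)     # 2025-10-06
puts sunday_week_start(wednesday)  # 2025-10-12
puts monday_week_start(wednesday)  # 2025-10-13
```

The `+ 6` before the modulo shifts the numbering so that Monday maps to 0 and Sunday to 6, which is why a Sunday input lands on the Monday six days earlier.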
data/lib/dwh/settings.rb CHANGED
@@ -44,7 +44,11 @@ module DWH
44
44
  logger.debug "#{adapter_name} Adapter didn't have a settings YAML file. Using only base settings."
45
45
  end
46
46
 
47
- @adapter_settings.symbolize_keys!
47
+ @adapter_settings.transform_keys! do |key|
48
+ key.to_sym
49
+ rescue StandardError
50
+ key
51
+ end
48
52
  end
49
53
 
50
54
  # By default settings_file are expected to be in a
@@ -69,7 +73,7 @@ module DWH
69
73
  end
70
74
 
71
75
  def adapter_name
72
- name.demodulize.downcase
76
+ name.split('::').last.downcase
73
77
  end
74
78
 
75
79
  def using_base_settings?
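Note on the settings.rb hunks above: both changes replace ActiveSupport helpers with plain Ruby. A minimal sketch of the two replacements follows. The method names, the `Sketch` namespace, and the sample hash are hypothetical, and the integer key is contrived to exercise the `rescue` branch.

```ruby
# Symbolizes keys in place; keys that do not respond to #to_sym are kept as-is.
# Same block shape as the diff.
def symbolize_settings!(hash)
  hash.transform_keys! do |key|
    key.to_sym
  rescue StandardError
    key
  end
end

# Last segment of the class name, lowercased.
def adapter_name_for(klass)
  klass.name.split('::').last.downcase
end

# Hypothetical namespace standing in for DWH::Adapters.
module Sketch
  module Adapters
    class MySql; end
    class Redshift; end
  end
end

settings = { 'date_add' => 'DATEADD(@unit, @val, @exp)', 42 => 'odd key' }
p symbolize_settings!(settings)
puts adapter_name_for(Sketch::Adapters::MySql)    # mysql
puts adapter_name_for(Sketch::Adapters::Redshift) # redshift
```

`Integer` has no `#to_sym`, so the resulting `NoMethodError` is rescued and the key `42` stays unchanged, while the string key becomes `:date_add`.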
data/lib/dwh/table.rb CHANGED
@@ -22,16 +22,16 @@ module DWH
22
22
 
23
23
  @physical_name = parts.last
24
24
  @table_stats = table_stats
25
- @catalog = catalog
26
- @schema = schema
25
+ @catalog = catalog&.strip
26
+ @schema = schema&.strip
27
27
 
28
- @catalog = parts.first if @catalog.nil? && parts.length > 2
28
+ @catalog = parts.first.strip if @catalog.nil? && parts.length > 2
29
29
 
30
30
  if @schema.nil?
31
31
  if parts.length == 2
32
- @schema = parts.first
32
+ @schema = parts.first.strip
33
33
  elsif parts.length > 2
34
- @schema = parts[1]
34
+ @schema = parts[1].strip
35
35
  end
36
36
  end
37
37
 
@@ -50,12 +50,20 @@ module DWH
50
50
  [catalog, schema].compact.join('.')
51
51
  end
52
52
 
53
+ def catalog?
54
+ catalog && !catalog.empty?
55
+ end
56
+
57
+ def schema?
58
+ schema && !schema.empty?
59
+ end
60
+
53
61
  def catalog_and_schema?
54
- catalog && schema
62
+ catalog? && schema?
55
63
  end
56
64
 
57
65
  def catalog_or_schema?
58
- catalog || schema
66
+ catalog? || schema?
59
67
  end
60
68
 
61
69
  def stats
@@ -82,13 +90,13 @@ module DWH
82
90
 
83
91
  def self.from_hash_or_json(physical_name, metadata)
84
92
  metadata = JSON.parse(metadata) if metadata.is_a?(String)
85
- metadata.symbolize_keys!
93
+ metadata.transform_keys!(&:to_sym)
86
94
 
87
- stats = TableStats.new(**metadata[:stats].symbolize_keys) if metadata.key?(:stats)
95
+ stats = TableStats.new(**metadata[:stats]&.transform_keys!(&:to_sym)) if metadata.key?(:stats)
88
96
  table = new(physical_name, table_stats: stats)
89
97
 
90
98
  metadata[:columns]&.each do |col|
91
- col.symbolize_keys!
99
+ col.transform_keys!(&:to_sym)
92
100
  table << Column.new(
93
101
  name: col[:name],
94
102
  data_type: col[:data_type],
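Note on the table.rb hunks above: stripping whitespace and adding `catalog?` / `schema?` means a blank catalog or schema no longer counts as present. The sketch below reproduces only that behaviour. `TableSketch` is a hypothetical stand-in, and splitting the physical name on `.` is an assumption, since the diff does not show how `parts` is built.

```ruby
class TableSketch
  attr_reader :catalog, :schema, :physical_name

  def initialize(physical_name, schema: nil, catalog: nil)
    parts = physical_name.split('.') # assumption: not shown in the diff
    @physical_name = parts.last
    @catalog = catalog&.strip
    @schema = schema&.strip

    @catalog = parts.first.strip if @catalog.nil? && parts.length > 2

    if @schema.nil?
      if parts.length == 2
        @schema = parts.first.strip
      elsif parts.length > 2
        @schema = parts[1].strip
      end
    end
  end

  def catalog?
    catalog && !catalog.empty?
  end

  def schema?
    schema && !schema.empty?
  end

  def catalog_and_schema?
    catalog? && schema?
  end

  def catalog_or_schema?
    catalog? || schema?
  end
end

padded = TableSketch.new(' warehouse . sales .orders')
puts padded.catalog              # warehouse
puts padded.schema               # sales
puts padded.catalog_and_schema?  # true

blank = TableSketch.new('orders', schema: '  ')
puts blank.catalog_or_schema? ? 'qualified' : 'unqualified' # unqualified
```

Under the 0.1.1 predicates (`catalog || schema`), the blank schema in the second case would have been truthy, because any string, including an empty one, is truthy in Ruby.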
data/lib/dwh/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module DWH
4
- VERSION = '0.1.1'
4
+ VERSION = '0.2.0'
5
5
  end
data/lib/dwh.rb CHANGED
@@ -1,8 +1,4 @@
1
1
  require 'faraday'
2
- require 'active_support/core_ext/string/inflections'
3
- require 'active_support/core_ext/hash/keys'
4
- require 'active_support/core_ext/object/blank'
5
- require 'active_support/duration'
6
2
 
7
3
  require_relative 'dwh/version'
8
4
  require_relative 'dwh/errors'
@@ -19,7 +15,9 @@ require_relative 'dwh/adapters/snowflake'
19
15
  require_relative 'dwh/adapters/my_sql'
20
16
  require_relative 'dwh/adapters/sql_server'
21
17
  require_relative 'dwh/adapters/duck_db'
18
+ require_relative 'dwh/adapters/sqlite'
22
19
  require_relative 'dwh/adapters/athena'
20
+ require_relative 'dwh/adapters/redshift'
23
21
 
24
22
  # DWH encapsulates the full functionality of this gem.
25
23
  #
@@ -48,7 +46,9 @@ module DWH
48
46
  register(:mysql, Adapters::MySql)
49
47
  register(:sqlserver, Adapters::SqlServer)
50
48
  register(:duckdb, Adapters::DuckDb)
49
+ register(:sqlite, Adapters::Sqlite)
51
50
  register(:athena, Adapters::Athena)
51
+ register(:redshift, Adapters::Redshift)
52
52
 
53
53
  # start_reaper
54
54
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: dwh
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.1
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ajo Abraham
@@ -9,20 +9,6 @@ bindir: exe
9
9
  cert_chain: []
10
10
  date: 1980-01-02 00:00:00.000000000 Z
11
11
  dependencies:
12
- - !ruby/object:Gem::Dependency
13
- name: activesupport
14
- requirement: !ruby/object:Gem::Requirement
15
- requirements:
16
- - - "~>"
17
- - !ruby/object:Gem::Version
18
- version: 8.0.2
19
- type: :runtime
20
- prerelease: false
21
- version_requirements: !ruby/object:Gem::Requirement
22
- requirements:
23
- - - "~>"
24
- - !ruby/object:Gem::Version
25
- version: 8.0.2
26
12
  - !ruby/object:Gem::Dependency
27
13
  name: connection_pool
28
14
  requirement: !ruby/object:Gem::Requirement
@@ -173,8 +159,10 @@ files:
173
159
  - lib/dwh/adapters/my_sql.rb
174
160
  - lib/dwh/adapters/open_authorizable.rb
175
161
  - lib/dwh/adapters/postgres.rb
162
+ - lib/dwh/adapters/redshift.rb
176
163
  - lib/dwh/adapters/snowflake.rb
177
164
  - lib/dwh/adapters/sql_server.rb
165
+ - lib/dwh/adapters/sqlite.rb
178
166
  - lib/dwh/adapters/trino.rb
179
167
  - lib/dwh/behaviors.rb
180
168
  - lib/dwh/capabilities.rb
@@ -197,6 +185,7 @@ files:
197
185
  - lib/dwh/settings/postgres.yml
198
186
  - lib/dwh/settings/redshift.yml
199
187
  - lib/dwh/settings/snowflake.yml
188
+ - lib/dwh/settings/sqlite.yml
200
189
  - lib/dwh/settings/sqlserver.yml
201
190
  - lib/dwh/settings/trino.yml
202
191
  - lib/dwh/streaming_stats.rb
@@ -225,7 +214,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
225
214
  - !ruby/object:Gem::Version
226
215
  version: '0'
227
216
  requirements: []
228
- rubygems_version: 3.7.1
217
+ rubygems_version: 3.6.9
229
218
  specification_version: 4
230
219
  summary: Data warehouse adapters for interacting with popular data warehouses.
231
220
  test_files: []