RubyGems - dwh - Versions diffs - 0.1.1 → 0.3.0 - Mend

dwh 0.1.1 → 0.3.0

Files changed (31)

checksums.yaml +4 -4
data/CHANGELOG.md +43 -0
data/README.md +10 -1
data/docs/guides/adapters.md +158 -0
data/docs/guides/getting-started.md +6 -1
data/docs/guides/usage.md +33 -1
data/lib/dwh/adapters/athena.rb +8 -1
data/lib/dwh/adapters/databricks.rb +328 -0
data/lib/dwh/adapters/duck_db.rb +8 -2
data/lib/dwh/adapters/my_sql.rb +7 -1
data/lib/dwh/adapters/postgres.rb +11 -5
data/lib/dwh/adapters/redshift.rb +48 -0
data/lib/dwh/adapters/sql_server.rb +8 -2
data/lib/dwh/adapters/sqlite.rb +364 -0
data/lib/dwh/adapters/trino.rb +7 -1
data/lib/dwh/adapters.rb +3 -3
data/lib/dwh/column.rb +12 -1
data/lib/dwh/functions/dates.rb +15 -0
data/lib/dwh/settings/databricks.yml +14 -15
data/lib/dwh/settings/druid.yml +3 -3
data/lib/dwh/settings/duckdb.yml +2 -2
data/lib/dwh/settings/mysql.yml +2 -2
data/lib/dwh/settings/postgres.yml +11 -11
data/lib/dwh/settings/redshift.yml +15 -24
data/lib/dwh/settings/snowflake.yml +15 -15
data/lib/dwh/settings/sqlite.yml +42 -0
data/lib/dwh/settings.rb +6 -2
data/lib/dwh/table.rb +18 -10
data/lib/dwh/version.rb +1 -1
data/lib/dwh.rb +6 -4
metadata +6 -16

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 1d0cb4848d96ff20f5b4c80e00cbfde5f1150e0e975cf55827f8466b9d4eaf49
-  data.tar.gz: ce79d54a93388718886c8ec7892e9da544c4ddc52333e1a765a742518552df69
+  metadata.gz: e70f914cc994c4be7a9d76b0d72d170ae6bf4d895427ec90adcdf9e3099774fe
+  data.tar.gz: 3ef66bc3d9a326bbae4b51d2bb3ec6af45425971617aee3a3131c7d496cf9127
 SHA512:
-  metadata.gz: e61d14edd0b8e5818eb2eb3fe5ec25adf69a49279c9fd8b8db0ddb271f746de30573d97d90a6fe3bb9acf350f2ad80a72f076a19bd72a1270d9418aee6eca320
-  data.tar.gz: a53cc7c72c222b579777e7796bbdefeada81ac77a8bf12d5ab46be73e2d76b7721deba91deb2019597013a1f1a46173a7406d6d95d1f8d11eb53f62e7ba19b26
+  metadata.gz: d05640e86dc5a6df2135173dd7a513df0805b4035fa5c4c5fa182659b1286fb6fd78f67ff2e259899f3d4ecab19b44bf5e7bfb427acfa5daa6065d71fb2fa18d
+  data.tar.gz: 28e5c623c8401dea1d222a1b318325543d4c209e083944d2fd43367ae6721c2ebe1c2b7e69190462a19887770d722edb7f2d1d851f90e5cc53b4c51cf2b65953

data/CHANGELOG.md CHANGED

@@ -1,5 +1,48 @@
 ## [Unreleased]
+## [0.3.0] - 2026-04-22
+### Changed
+- Added Databricks Adapter
+## [0.2.1] - 2025-01-27
+### Changed
+- **Adapter missing-gem error messages** (Athena, DuckDB, MySQL, PostgreSQL, SQL Server, Trino): replace platform-specific system library install instructions with links to official documentation. Messages now include `gem install` and a single link for system libraries.
+## [0.2.0] - 2025-10-12
+### Added
+- **SQLite adapter** with performance optimizations
+  - WAL (Write-Ahead Logging) mode enabled by default for concurrent reads
+  - Performance-tuned pragmas: cache_size, mmap_size, temp_store, synchronous
+  - Custom date truncation for year, quarter, month, week, day, hour, minute, second
+  - Custom day/month name extraction via CASE statements (SQLite lacks strftime %A/%B support)
+  - Proper date casting using `date()` function
+  - Comprehensive test suite and documentation
+- **Redshift adapter** for AWS data warehouse
+  - Native Redshift SQL function support
+  - Full metadata and table introspection
+- `date_time_literal` method for creating timestamp literals
+- `date_lit` method for creating date literals
+### Changed
+- Removed ActiveSupport dependency
+  - Replaced `symbolize_keys` with `transform_keys(&:to_sym)`
+  - Replaced `demodulize` with `split('::').last.downcase`
+  - Removed core extensions
+- Standardized all SQL function names in settings to UPPERCASE for consistency
+### Fixed
+- Config defaults now properly set even when config key is passed with nil value
+- Table instantiation issues resolved
+- Test suite no longer requires Trino gem for default tests
 ## [0.1.0] - 2025-07-03
 - Initial release
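
The two ActiveSupport replacements listed under 0.2.0 can be tried in plain Ruby. The sketch below is illustrative only: the `config` hash and the `'DWH::Adapters::Postgres'` class-name string are assumed sample inputs, not code taken from the gem.

```ruby
# Plain-Ruby stand-ins for the ActiveSupport helpers named in the changelog.
# The sample hash and class name below are assumptions for illustration.

# symbolize_keys -> transform_keys(&:to_sym)
config = { 'host' => 'localhost', 'port' => 5432 }
symbolized = config.transform_keys(&:to_sym)

# demodulize (plus downcase) -> split('::').last.downcase
class_name = 'DWH::Adapters::Postgres'
adapter_key = class_name.split('::').last.downcase

puts symbolized.inspect
puts adapter_key
```

One behavioral point worth keeping in mind: `transform_keys(&:to_sym)` raises `NoMethodError` for keys that do not respond to `to_sym` (an Integer key, for example), so it suits hashes whose keys are already strings or symbols.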

data/README.md CHANGED

@@ -25,16 +25,17 @@ The adapter only has 5 core methods (6 including the connection method).  A YAML
 - **Snowflake** - High performance cloud warehouse
 - **Trino** (formerly Presto) - Distributed SQL query engine
+- **Redshift** - AWS data warehouse platform
 - **AWS Athena** - AWS big data warehouse
 - **Apache Druid** - Real-time analytics database
 - **DuckDB** - In-process analytical database
+- **SQLite** - Lightweight embedded database
 - **PostgreSQL** - Full-featured RDBMS with advanced SQL support
 - **MySQL** - Popular open-source database
 - **SQL Server** - Microsoft's enterprise database
 ## Integrations Coming Soon
-- **Redshift** - AWS data warehouse platform
 - **ClickHouse** - High performance analytical db
 - **Databricks** - Big data compute engine
 - **MotherDuck** - Hosted DuckDB service
@@ -61,6 +62,14 @@ druid = DWH.create(:druid, {
 # basic query execution
 results = druid.execute("SELECT * FROM web_sales", format: :csv)
+# Connect to SQLite for local analytics
+sqlite = DWH.create(:sqlite, {
+  file: 'path/to/analytics.db'
+})
+# Query with optimized WAL mode enabled by default
+results = sqlite.execute("SELECT * FROM sales_data", format: :array)
 ```
 ## Core API

data/docs/guides/adapters.md CHANGED

@@ -70,6 +70,71 @@ postgres = DWH.create(:postgres, {
 })
 ```
+## Redshift Adapter
+The Redshift adapter uses the `pg` gem and provides full-featured RDBMS support.
+### Basic Configuration
+```ruby
+redshift = DWH.create(:redshift, {
+  host: 'localhost',
+  port: 5432,                    # Default: 5432
+  database: 'mydb',
+  schema: 'public',              # Default: 'public'
+  username: 'user',
+  password: 'password',
+  client_name: 'My Application'  # Default: 'DWH Ruby Gem'
+})
+```
+### SSL Configuration
+```ruby
+# Basic SSL
+redshift = DWH.create(:redshift, {
+  host: 'localhost',
+  database: 'mydb',
+  username: 'user',
+  password: 'password',
+  ssl: true,
+  extra_connection_params: {
+    sslmode: 'require'  # disable, prefer, require, verify-ca, verify-full
+  }
+})
+# Certificate-based SSL
+redshift = DWH.create(:redshift, {
+  host: 'localhost',
+  database: 'mydb',
+  username: 'user',
+  ssl: true,
+  extra_connection_params: {
+    sslmode: 'verify-full',
+    sslrootcert: '/path/to/ca-cert.pem',
+    sslcert: '/path/to/client-cert.pem',
+    sslkey: '/path/to/client-key.pem'
+  }
+})
+```
+### Advanced Configuration
+```ruby
+redshift = DWH.create(:redshift, {
+  host: 'localhost',
+  database: 'mydb',
+  username: 'user',
+  password: 'password',
+  query_timeout: 3600,  # seconds, default: 3600
+  extra_connection_params: {
+    application_name: 'Data Analysis Tool',
+    connect_timeout: 10,
+    options: '-c maintenance_work_mem=256MB'
+  }
+})
+```
 ## Snowflake
 The Snowflake adapter uses the REST APIs (HTTPS) to connect and query. This adapter also supports Multi-Database
@@ -287,6 +352,99 @@ duckdb = DWH.create(:duckdb, {
 })
 ```
+## SQLite Adapter
+The SQLite adapter uses the `sqlite3` gem for lightweight embedded database analytics. It's optimized for analytical workloads with WAL mode enabled by default for better concurrent read performance.
+### Basic Configuration
+```ruby
+# File-based database
+sqlite = DWH.create(:sqlite, {
+  file: '/path/to/my/database.sqlite'
+})
+# In-memory database
+sqlite = DWH.create(:sqlite, {
+  file: ':memory:'
+})
+```
+### Read-Only Mode
+```ruby
+sqlite = DWH.create(:sqlite, {
+  file: '/path/to/readonly/database.sqlite',
+  readonly: true
+})
+```
+### Performance Optimization
+The adapter includes default optimizations for analytical workloads:
+- WAL mode enabled by default for concurrent reads
+- 64MB cache size
+- Memory-mapped I/O (128MB)
+- Temp tables stored in memory
+```ruby
+# Customize performance settings
+sqlite = DWH.create(:sqlite, {
+  file: '/path/to/my/database.sqlite',
+  timeout: 5000,  # busy timeout in milliseconds, default: 5000
+  pragmas: {
+    cache_size: -128000,      # 128MB cache (negative means KB)
+    mmap_size: 268435456,     # 256MB memory-mapped I/O
+    temp_store: 'MEMORY',     # Store temp tables in memory
+    synchronous: 'NORMAL'     # Faster than FULL, safe with WAL
+  }
+})
+```
+### Disable WAL Mode
+```ruby
+# Disable WAL mode if needed (e.g., for NFS or network filesystems)
+sqlite = DWH.create(:sqlite, {
+  file: '/path/to/my/database.sqlite',
+  enable_wal: false
+})
+```
+### Advanced Configuration
+```ruby
+sqlite = DWH.create(:sqlite, {
+  file: '/path/to/analytics.sqlite',
+  readonly: false,
+  enable_wal: true,           # Default: true
+  timeout: 10000,             # 10 second busy timeout
+  pragmas: {
+    journal_mode: 'WAL',      # Explicitly set WAL (done by default)
+    cache_size: -256000,      # 256MB cache
+    page_size: 8192,          # Larger page size for analytics
+    mmap_size: 536870912,     # 512MB memory-mapped I/O
+    temp_store: 'MEMORY',     # Keep temp data in memory
+    synchronous: 'NORMAL',    # Balance between safety and speed
+    locking_mode: 'NORMAL'    # Allow multiple connections
+  }
+})
+```
+### Multiple Connections
+Unlike DuckDB, SQLite allows multiple independent connections to the same database file:
+```ruby
+# Multiple readers/writers to the same file
+reader = DWH.create(:sqlite, { file: '/path/to/data.sqlite', readonly: true })
+writer = DWH.create(:sqlite, { file: '/path/to/data.sqlite' })
+# Both can operate concurrently with WAL mode enabled
+data = reader.execute('SELECT * FROM sales')
+writer.execute('INSERT INTO sales VALUES (...)')
+```
 ## Trino Adapter
 The Trino adapter requires the `trino-client-ruby` gem and works with both Trino and Presto.

data/docs/guides/getting-started.md CHANGED

@@ -40,9 +40,14 @@ postgres = DWH.create(:postgres, {
   password: 'password'
 })
+# Connect to SQLite (lightweight, embedded)
+sqlite = DWH.create(:sqlite, {
+  file: '/path/to/analytics.db'
+})
 # Connect to DuckDB (in-memory)
 duckdb = DWH.create(:duckdb, {
-  database: ':memory:'
+  file: ':memory:'
 })
 ```

data/docs/guides/usage.md CHANGED

@@ -293,7 +293,7 @@ native = adapter.execute(sql, format: :native)   # Database's native format
 # Use streaming for large result sets
 def export_large_table(adapter, table_name, output_file)
   query = "SELECT * FROM #{table_name}"
   File.open(output_file, 'w') do |file|
     adapter.execute_stream(query, file)
   end
@@ -309,6 +309,38 @@ def process_large_dataset(adapter, query)
 end
 ```
+### SQLite Performance Tuning
+The SQLite adapter ships with optimized defaults for analytical workloads, but they can be tuned further:
+```ruby
+# High-performance SQLite configuration for analytics
+sqlite = DWH.create(:sqlite, {
+  file: '/path/to/large_analytics.db',
+  enable_wal: true,           # WAL mode for concurrent reads (default: true)
+  timeout: 30000,             # 30 second busy timeout for heavy writes
+  pragmas: {
+    cache_size: -512000,      # 512MB cache for large datasets
+    page_size: 8192,          # Larger pages for sequential scans
+    mmap_size: 1073741824,    # 1GB memory-mapped I/O
+    temp_store: 'MEMORY',     # Keep temp tables in RAM
+    synchronous: 'NORMAL',    # Balance safety/speed (safe with WAL)
+    journal_size_limit: 67108864  # 64MB journal limit
+  }
+})
+# Read-only analytics queries with maximum performance
+readonly_analytics = DWH.create(:sqlite, {
+  file: '/path/to/data.db',
+  readonly: true,             # Read-only for maximum concurrency
+  pragmas: {
+    cache_size: -256000,      # 256MB cache
+    mmap_size: 2147483648,    # 2GB memory mapping for large files
+    temp_store: 'MEMORY'      # Fast temp operations
+  }
+})
+```
 ## Error Handling and Debugging
 ### Comprehensive Error Handling
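
The byte-sized pragma values in the SQLite tuning examples in this file are easier to audit when derived from a size in MiB. The helpers below are hypothetical (not part of the gem) and assume SQLite's documented convention that a negative `cache_size` is a size in KiB, while `mmap_size` is a byte count.

```ruby
# Hypothetical helpers (not part of dwh) for deriving SQLite pragma values.
# Assumes SQLite semantics: negative cache_size = KiB, mmap_size = bytes.

# Cache size for `mib` MiB, expressed as a negative KiB count.
def cache_size_pragma(mib)
  -(mib * 1024)
end

# Memory-map size for `mib` MiB, expressed in bytes.
def mmap_size_pragma(mib)
  mib * 1024 * 1024
end

pragmas = {
  cache_size: cache_size_pragma(128),
  mmap_size: mmap_size_pragma(256),
  temp_store: 'MEMORY'
}

puts pragmas.inspect
```

By this arithmetic the guide's `-128000` works out to 125 MiB, which its comment rounds to 128MB, while `268435456` is exactly 256 MiB.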

data/lib/dwh/adapters/athena.rb CHANGED

@@ -202,8 +202,15 @@ module DWH
       def valid_config?
         super
         require 'aws-sdk-athena'
+        require 'aws-sdk-s3'
       rescue LoadError
-        raise ConfigError, "Required 'aws-sdk-athena' and 'aws-sdk-s3' gems missing. Please add them to your Gemfile."
+        raise ConfigError, <<~MSG
+          Athena adapter requires the 'aws-sdk-athena' and 'aws-sdk-s3' gems.
+          Install with: gem install aws-sdk-athena aws-sdk-s3
+          No system libraries required (pure Ruby).
+        MSG
       end
       private
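
The missing-gem handling changed in this file follows a common optional-dependency pattern: attempt the `require`, and convert `LoadError` into an error that tells the user what to install. A minimal standalone sketch, where `ConfigError` is a local stand-in for the gem's error class and `require_optional_gem` is a hypothetical helper name:

```ruby
# Local stand-in for the gem's ConfigError, defined so the sketch runs alone.
class ConfigError < StandardError; end

# Hypothetical helper: load an optional gem or raise with install guidance.
def require_optional_gem(name)
  require name
  true
rescue LoadError
  raise ConfigError, <<~MSG
    This adapter requires the '#{name}' gem.
    Install with: gem install #{name}
  MSG
end

begin
  require_optional_gem('a_gem_that_is_not_installed_for_this_demo')
rescue ConfigError => e
  puts e.message
end
```

Deferring the `require` to configuration time, as the adapters do, means users only need the driver gems for the warehouses they actually connect to.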

data/lib/dwh/adapters/databricks.rb ADDED

@@ -0,0 +1,328 @@
+require 'csv'
+require 'base64'
+module DWH
+  module Adapters
+    # Databricks adapter for executing SQL queries against Databricks SQL warehouses.
+    #
+    # Supports OAuth M2M (service principal) authentication only.
+    #
+    # @example Connection with OAuth (service principal)
+    #   DWH.create(:databricks, {
+    #     host: 'adb-1234567890123456.7.azuredatabricks.net',
+    #     warehouse: 'abc123def456',
+    #     oauth_client_id: 'service-principal-app-id',
+    #     oauth_client_secret: 'your-oauth-secret-here',
+    #     catalog: 'main',
+    #     schema: 'default'
+    #   })
+    class Databricks < Adapter
+      config :host, String, required: true, message: 'Databricks workspace host (e.g., adb-xxx.databricks.cloud.com)'
+      config :oauth_client_id, String, required: true, message: 'OAuth client ID (service principal application ID)'
+      config :oauth_client_secret, String, required: true, message: 'OAuth client secret'
+      config :client_name, String, required: false, default: 'Ruby DWH Gem', message: 'Client name sent to Databricks'
+      config :query_timeout, Integer, required: false, default: 3600, message: 'Query execution timeout in seconds'
+      config :warehouse, String, required: true, message: 'Databricks SQL warehouse ID to use for query execution'
+      config :catalog, String, required: false, message: 'Default catalog (Unity Catalog)'
+      config :schema, String, required: false, message: 'Default schema'
+      DEFAULT_POLL_INTERVAL = 0.25
+      MAX_POLL_INTERVAL = 30
+      STATEMENTS_API = '/api/2.0/sql/statements'.freeze
+      def initialize(config)
+        super
+        validate_auth_config
+      end
+      def connection
+        return @connection if @connection && !token_expired?
+        reset_connection if token_expired?
+        @connection = Faraday.new(
+          url: "https://#{workspace_host}",
+          headers: {
+            'Content-Type' => 'application/json',
+            'Authorization' => "Bearer #{auth_token}",
+            'User-Agent' => config[:client_name]
+          },
+          request: {
+            timeout: config[:query_timeout]
+          }.merge(extra_connection_params)
+        )
+      end
+      def test_connection(raise_exception: false)
+        execute('SELECT 1')
+        true
+      rescue StandardError => e
+        raise ConnectionError, "Failed to connect to Databricks: #{e.message}" if raise_exception
+        logger.error "Connection test failed: #{e.message}"
+        false
+      end
+      # (see Adapter#execute)
+      def execute(sql, format: :array, retries: 0)
+        result = with_retry(retries + 1) do
+          with_debug(sql) do
+            response = submit_query(sql)
+            fetch_data(handle_query_response(response))
+          end
+        end
+        format_result(result, format)
+      end
+      def execute_stream(sql, io, stats: nil, retries: 0)
+        with_retry(retries) do
+          with_debug(sql) do
+            response = submit_query(sql)
+            fetch_data(handle_query_response(response), io: io, stats: stats)
+          end
+        end
+        io.rewind
+        io
+      end
+      # Execute SQL query and yield streamed results
+      # @param sql [String] SQL query to execute
+      # @yield [chunk] yields each chunk of data as it's processed
+      def stream(sql, &block)
+        with_debug(sql) do
+          response = submit_query(sql)
+          fetch_data(handle_query_response(response), proc: block)
+        end
+      end
+      def tables(**qualifiers)
+        catalog = qualifiers[:catalog] || config[:catalog]
+        schema = qualifiers[:schema] || config[:schema]
+        raise ConfigError, 'catalog is required for Databricks tables query' unless catalog
+        sql = "SELECT table_name FROM #{catalog}.information_schema.tables"
+        sql += " WHERE table_schema = '#{schema}'" if schema
+        result = execute(sql)
+        result.flatten
+      end
+      def metadata(table, **qualifiers)
+        catalog = qualifiers[:catalog] || config[:catalog]
+        schema = qualifiers[:schema] || config[:schema]
+        raise ConfigError, 'catalog is required for Databricks metadata query' unless catalog
+        db_table = Table.new(table, schema: schema, catalog: catalog)
+        sql = <<~SQL
+          SELECT column_name, data_type, numeric_precision, numeric_scale, character_maximum_length
+          FROM #{catalog}.information_schema.columns
+          WHERE table_name = '#{db_table.physical_name}'
+        SQL
+        sql += " AND table_schema = '#{db_table.schema}'" if db_table.schema
+        columns = execute(sql)
+        columns.each do |col|
+          db_table << Column.new(
+            name: col[0]&.downcase,
+            data_type: col[1]&.downcase,
+            precision: col[2],
+            scale: col[3],
+            max_char_length: col[4]
+          )
+        end
+        db_table
+      end
+      def stats(table, date_column: nil)
+        date_fields = if date_column
+                        ", MIN(#{date_column}) AS date_start, MAX(#{date_column}) AS date_end"
+                      else
+                        ', NULL AS date_start, NULL AS date_end'
+                      end
+        data = execute("SELECT COUNT(*) AS row_count#{date_fields} FROM #{table}")
+        cols = data.first
+        TableStats.new(
+          row_count: cols[0],
+          date_start: cols[1],
+          date_end: cols[2]
+        )
+      end
+      private
+      def validate_auth_config
+        raise ConfigError, 'oauth_client_id is required' unless config[:oauth_client_id]
+        raise ConfigError, 'oauth_client_secret is required' unless config[:oauth_client_secret]
+      end
+      def auth_token
+        return @oauth_access_token if @oauth_access_token && !token_expired?
+        request_oauth_access_token!
+        @oauth_access_token
+      end
+      def request_oauth_access_token!
+        credentials = Base64.strict_encode64("#{config[:oauth_client_id]}:#{config[:oauth_client_secret]}")
+        response = Faraday.post(
+          "https://#{workspace_host}/oidc/v1/token",
+          'grant_type=client_credentials&scope=all-apis',
+          'Authorization' => "Basic #{credentials}",
+          'Content-Type' => 'application/x-www-form-urlencoded'
+        )
+        raise AuthenticationError, "OAuth M2M token request failed (#{response.status}): #{response.body}" unless response.status == 200
+        data = JSON.parse(response.body)
+        @oauth_access_token = data['access_token']
+        expires_in = data['expires_in'] || 3600
+        @token_expires_at = Time.now + [expires_in - 60, 60].max
+      end
+      def reset_connection
+        @oauth_access_token = nil
+        @token_expires_at = nil
+        close
+      end
+      def submit_query(sql)
+        connection.post(STATEMENTS_API) do |req|
+          req.body = {
+            statement: sql,
+            warehouse_id: config[:warehouse],
+            catalog: config[:catalog],
+            schema: config[:schema],
+            wait_timeout: '30s',
+            on_wait_timeout: 'CONTINUE',
+            format: 'JSON_ARRAY',
+            disposition: 'INLINE'
+          }.compact.merge(extra_query_params).to_json
+        end
+      end
+      def handle_query_response(response)
+        body = JSON.parse(response.body)
+        case response.status
+        when 200
+          state = body.dig('status', 'state')
+          state == 'SUCCEEDED' ? body : poll(body['statement_id'])
+        when 202
+          poll(body['statement_id'])
+        else
+          error_message = body['message'] || body['error_code'] || response.body
+          raise ExecutionError, "Databricks query failed (#{response.status}): #{error_message}"
+        end
+      end
+      def poll(statement_id)
+        sleep_interval = DEFAULT_POLL_INTERVAL
+        logger.debug "Polling for query completion: #{statement_id}"
+        loop do
+          response = connection.get("#{STATEMENTS_API}/#{statement_id}")
+          body = JSON.parse(response.body)
+          state = body.dig('status', 'state')
+          case state
+          when 'SUCCEEDED'
+            return body
+          when 'FAILED', 'CANCELED', 'CLOSED'
+            error_msg = body.dig('status', 'error', 'message') || state
+            raise ExecutionError, "Databricks query #{state}: #{error_msg}"
+          else
+            logger.debug "Query still running (state: #{state}). Sleeping #{sleep_interval}s..."
+            sleep(sleep_interval)
+            sleep_interval = sleep_interval == MAX_POLL_INTERVAL ? DEFAULT_POLL_INTERVAL : sleep_interval
+            sleep_interval = [sleep_interval * 2, MAX_POLL_INTERVAL].min
+          end
+        end
+      end
+      def fetch_data(result, io: nil, stats: nil, proc: nil)
+        columns = result.dig('manifest', 'schema', 'columns')&.map { |col| col['name'] } || []
+        chunks = result.dig('manifest', 'chunks') || []
+        collector = {
+          columns: columns,
+          data: [],
+          io: io,
+          stats: stats,
+          wrote_header: false
+        }
+        write_data(result.dig('result', 'data_array') || [], collector, io, stats, proc)
+        return collector unless chunks.size > 1
+        statement_id = result['statement_id']
+        chunks[1..].each do |chunk|
+          chunk_index = chunk['chunk_index']
+          logger.debug "Fetching chunk #{chunk_index} of #{chunks.size} for statement: #{statement_id}"
+          resp = connection.get("#{STATEMENTS_API}/#{statement_id}/result/chunks/#{chunk_index}")
+          raise ExecutionError, "Failed to fetch chunk #{chunk_index}: #{resp.body}" unless resp.status == 200
+          chunk_data = JSON.parse(resp.body)
+          write_data(chunk_data['data_array'] || [], collector, io, stats, proc)
+        end
+        collector
+      end
+      def write_data(data, collector, io = nil, stats = nil, proc = nil)
+        if io
+          unless collector[:wrote_header]
+            io << CSV.generate_line(collector[:columns])
+            collector[:wrote_header] = true
+          end
+          data.each do |row|
+            stats << row if stats
+            io << CSV.generate_line(row)
+          end
+        elsif proc
+          data.each { proc.call(it) }
+        else
+          data.each { collector[:data] << it }
+        end
+        collector
+      end
+      def format_result(result, format)
+        data = result[:data]
+        columns = result[:columns]
+        case format
+        when :array
+          data
+        when :object
+          data.map { |row| columns.zip(row).to_h }
+        when :csv
+          CSV.generate do |csv|
+            csv << columns
+            data.each { |row| csv << row }
+          end
+        when :native
+          result
+        else
+          raise UnsupportedCapability, "Unknown result format: #{format}"
+        end
+      end
+      def workspace_host
+        config[:host].to_s.gsub(%r{\Ahttps?://}, '').gsub(%r{/+\z}, '')
+      end
+    end
+  end
+end
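
The wait schedule produced by `poll` is easier to see in isolation. The sketch below re-implements only the two interval-update lines as a pure function; `next_poll_interval` and `poll_schedule` are hypothetical names, and the keyword defaults mirror `DEFAULT_POLL_INTERVAL` (0.25) and `MAX_POLL_INTERVAL` (30) from the adapter.

```ruby
# Hypothetical re-statement of the interval update inside Databricks#poll.
# base and cap mirror DEFAULT_POLL_INTERVAL and MAX_POLL_INTERVAL.
def next_poll_interval(current, base: 0.25, cap: 30)
  # Once the cap has been reached, restart from the base interval...
  current = base if current == cap
  # ...then double, never exceeding the cap.
  [current * 2, cap].min
end

# Returns the first `count` sleep durations a polling loop would use.
def poll_schedule(count, base: 0.25, cap: 30)
  interval = base
  Array.new(count) do
    slept = interval
    interval = next_poll_interval(interval, base: base, cap: cap)
    slept
  end
end

puts poll_schedule(9).inspect
```

Under these assumptions the waits double from 0.25s up to the 30s cap, and the wait that follows a capped one drops back to 0.5s instead of staying at 30s.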

data/lib/dwh/adapters/duck_db.rb CHANGED

@@ -150,7 +150,7 @@ module DWH
       # True if the configuration was setup with a schema.
       def schema?
-        config[:schema].present?
+        !config[:schema].nil? && !config[:schema]&.strip&.empty?
       end
       # (see Adapter#execute)
@@ -209,7 +209,13 @@ module DWH
         super
         require 'duckdb'
       rescue LoadError
-        raise ConfigError, "Required 'duckdb' gem missing. Please add it to your Gemfile."
+        raise ConfigError, <<~MSG
+          DuckDB adapter requires the 'duckdb' gem.
+          Install with: gem install duckdb
+          See https://github.com/suketa/ruby-duckdb for installation details.
+        MSG
       end
       private