ducklake 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 9e31c96f2bcf9728685f6eb72f3cdfad44174eb28347dbe549391703f2795414
4
+ data.tar.gz: 63befb2ceaad4c0587b6ef10672e687aa2c51cdc78c95451c85599595d9e83d7
5
+ SHA512:
6
+ metadata.gz: 1a62d39d7962cbdd8b60a49ba0f7a20d47bbf2a3a18061ae59330cab07b67d91d7dab533487da99cdfbfc3d2b1d7cceaf407bfdd471ffe2bee6d84d2f5567413
7
+ data.tar.gz: 7032e761c77d93463beb4fb650e17fba8a8e3d54554ca5debc9f14a1625a18806c9167f5719cd7bab7079b068adae151b53a404600ac09b02eb1974cfe90c06e
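The digests above cover `metadata.gz` and `data.tar.gz`, the two archive members inside the `.gem` file. As a minimal sketch, assuming the `.gem` has already been unpacked so that `data.tar.gz` exists on disk, one of them could be checked as follows. The helper name `sha256_matches?` is hypothetical and not part of the gem or RubyGems.

```ruby
require "digest"

# Hypothetical helper: true when the file's SHA-256 hex digest equals the
# expected value (the expected value is compared case-insensitively).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end
```

For example, `sha256_matches?("data.tar.gz", "63befb2c...")` would be called with the full SHA256 value listed above, not the shortened form shown here.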
data/CHANGELOG.md ADDED
@@ -0,0 +1,3 @@
1
+ ## 0.1.0 (2025-08-17)
2
+
3
+ - First release
data/README.md ADDED
@@ -0,0 +1,330 @@
1
+ # DuckLake Ruby
2
+
3
+ :fire: [DuckLake](https://ducklake.select/) for Ruby
4
+
5
+ Run your own data lake with a SQL database and file/object storage
6
+
7
+ ```ruby
8
+ DuckLake::Client.new(
9
+ catalog_url: "postgres://user:pass@host:5432/db",
10
+ storage_url: "s3://my-bucket/"
11
+ )
12
+ ```
13
+
14
+ [Learn more](https://duckdb.org/2025/05/27/ducklake.html)
15
+
16
+ Note: DuckLake is [not considered production-ready](https://ducklake.select/faq#is-ducklake-production-ready) at the moment
17
+
18
+ [![Build Status](https://github.com/ankane/ducklake-ruby/actions/workflows/build.yml/badge.svg)](https://github.com/ankane/ducklake-ruby/actions)
19
+
20
+ ## Installation
21
+
22
+ First, install libduckdb. For Homebrew, use:
23
+
24
+ ```sh
25
+ brew install duckdb
26
+ ```
27
+
28
+ Then add this line to your application’s Gemfile:
29
+
30
+ ```ruby
31
+ gem "ducklake"
32
+ ```
33
+
34
+ ## Getting Started
35
+
36
+ Create a client - this one stores everything locally
37
+
38
+ ```ruby
39
+ ducklake =
40
+ DuckLake::Client.new(
41
+ catalog_url: "sqlite:///ducklake.sqlite",
42
+ storage_url: "data_files/",
43
+ create_if_not_exists: true
44
+ )
45
+ ```
46
+
47
+ Create a table
48
+
49
+ ```ruby
50
+ ducklake.sql("CREATE TABLE events (id bigint, name text)")
51
+ ```
52
+
53
+ Load data from a file
54
+
55
+ ```ruby
56
+ ducklake.sql("COPY events FROM 'data.csv'")
57
+ ```
58
+
59
+ Confirm a new Parquet file was added to the data lake
60
+
61
+ ```ruby
62
+ ducklake.list_files("events")
63
+ ```
64
+
65
+ Query the data
66
+
67
+ ```ruby
68
+ ducklake.sql("SELECT COUNT(*) FROM events").to_a
69
+ ```
70
+
71
+ ## Catalog Database
72
+
73
+ Catalog information can be stored in:
74
+
75
+ - Postgres: `postgres://user:pass@host:5432/dbname`
76
+ - SQLite: `sqlite:///path/to/dbname.sqlite`
77
+ - DuckDB: `duckdb:///path/to/dbname.duckdb`
78
+
79
+ Note: MySQL and MariaDB are not currently supported due to [duckdb/ducklake#70](https://github.com/duckdb/ducklake/issues/70) and [duckdb/ducklake#210](https://github.com/duckdb/ducklake/issues/210)
80
+
81
+ There are two ways to set up the schema:
82
+
83
+ 1. Run [this script](https://ducklake.select/docs/stable/specification/tables/overview#full-schema-creation-script)
84
+ 2. Configure the client to do it
85
+
86
+ ```ruby
87
+ DuckLake::Client.new(create_if_not_exists: true, ...)
88
+ ```
89
+
90
+ ## Data Storage
91
+
92
+ Data can be stored in:
93
+
94
+ - Local files: `data_files/`
95
+ - Amazon S3: `s3://my-bucket/path/`
96
+ - [Other providers](https://ducklake.select/docs/stable/duckdb/usage/choosing_storage): todo
97
+
98
+ ### Amazon S3
99
+
100
+ Credentials are detected in the standard AWS SDK locations, or you can pass them manually
101
+
102
+ ```ruby
103
+ DuckLake::Client.new(
104
+ storage_options: {
105
+ aws_access_key_id: "...",
106
+ aws_secret_access_key: "...",
107
+ region: "us-east-1"
108
+ },
109
+ ...
110
+ )
111
+ ```
112
+
113
+ IAM permissions
114
+
115
+ - Read: `s3:ListBucket`, `s3:GetObject`
116
+ - Write: `s3:ListBucket`, `s3:PutObject`
117
+ - Maintenance: `s3:ListBucket`, `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`
118
+
119
+ ## Operations
120
+
121
+ Create an empty table
122
+
123
+ ```ruby
124
+ ducklake.sql("CREATE TABLE events (id bigint, name text)")
125
+ ```
126
+
127
+ Or a table from a file
128
+
129
+ ```ruby
130
+ ducklake.sql("CREATE TABLE events AS FROM 'data.csv'")
131
+ ```
132
+
133
+ Load data from a file
134
+
135
+ ```ruby
136
+ ducklake.sql("COPY events FROM 'data.csv'")
137
+ ```
138
+
139
+ You can also load data directly from other [data sources](https://duckdb.org/docs/stable/data/data_sources)
140
+
141
+ ```ruby
142
+ ducklake.attach("blog", "postgres://localhost:5432/blog")
143
+ ducklake.sql("INSERT INTO events SELECT * FROM blog.ahoy_events")
144
+ ```
145
+
146
+ Or [register existing data files](https://ducklake.select/docs/stable/duckdb/metadata/adding_files)
147
+
148
+ ```ruby
149
+ ducklake.add_data_files("events", "data.parquet")
150
+ ```
151
+
152
+ Note: This transfers ownership of the file to DuckLake, so DuckLake may delete it during maintenance such as `cleanup_old_files`
153
+
154
+ Update data
155
+
156
+ ```ruby
157
+ ducklake.sql("UPDATE events SET name = ? WHERE id = ?", ["Test", 1])
158
+ ```
159
+
160
+ Delete data
161
+
162
+ ```ruby
163
+ ducklake.sql("DELETE FROM events WHERE id = ?", [1])
164
+ ```
165
+
166
+ Update the schema
167
+
168
+ ```ruby
169
+ ducklake.sql("ALTER TABLE events ADD COLUMN active BOOLEAN")
170
+ ```
171
+
172
+ ## Snapshots
173
+
174
+ Get snapshots
175
+
176
+ ```ruby
177
+ ducklake.snapshots
178
+ ```
179
+
180
+ Query the data at a specific snapshot version or time
181
+
182
+ ```ruby
183
+ ducklake.sql("SELECT * FROM events AT (VERSION => ?)", [3])
184
+ # or
185
+ ducklake.sql("SELECT * FROM events AT (TIMESTAMP => ?)", [Date.today - 7])
186
+ ```
187
+
188
+ You can also specify a snapshot when creating the client
189
+
190
+ ```ruby
191
+ DuckLake::Client.new(snapshot_version: 3, ...)
192
+ # or
193
+ DuckLake::Client.new(snapshot_time: Date.today - 7, ...)
194
+ ```
195
+
196
+ ## Maintenance
197
+
198
+ Merge files
199
+
200
+ ```ruby
201
+ ducklake.merge_adjacent_files
202
+ ```
203
+
204
+ Expire snapshots
205
+
206
+ ```ruby
207
+ ducklake.expire_snapshots(older_than: Date.today - 7)
208
+ ```
209
+
210
+ Clean up old files
211
+
212
+ ```ruby
213
+ ducklake.cleanup_old_files(older_than: Date.today - 7)
214
+ ```
215
+
216
+ ## Configuration
217
+
218
+ Get [options](https://ducklake.select/docs/stable/duckdb/usage/configuration)
219
+
220
+ ```ruby
221
+ ducklake.options
222
+ ```
223
+
224
+ Set an option globally
225
+
226
+ ```ruby
227
+ ducklake.set_option("parquet_compression", "zstd")
228
+ ```
229
+
230
+ Or for a specific table
231
+
232
+ ```ruby
233
+ ducklake.set_option("parquet_compression", "zstd", table_name: "events")
234
+ ```
235
+
236
+ ## Security
237
+
238
+ See [best practices](https://duckdb.org/docs/stable/operations_manual/securing_duckdb/overview.html) for DuckDB security.
239
+
240
+ Grant minimal permissions for the catalog database and data storage.
241
+
242
+ ### External Access
243
+
244
+ [Restrict external access](https://duckdb.org/docs/stable/operations_manual/securing_duckdb/overview.html#restricting-file-access) to the DuckDB engine
245
+
246
+ ```ruby
247
+ ducklake.disable_external_access
248
+ ```
249
+
250
+ Allow specific directories and paths
251
+
252
+ ```ruby
253
+ ducklake.disable_external_access(
254
+ allowed_directories: ["/path/to/directory"],
255
+ allowed_paths: ["/path/to/file.txt"]
256
+ )
257
+ ```
258
+
259
+ The storage URL is automatically included in `allowed_directories`
260
+
261
+ ## Reference
262
+
263
+ Get table info
264
+
265
+ ```ruby
266
+ ducklake.table_info
267
+ ```
268
+
269
+ Get column info
270
+
271
+ ```ruby
272
+ ducklake.column_info("events")
273
+ ```
274
+
275
+ Drop a table
276
+
277
+ ```ruby
278
+ ducklake.drop_table("events")
279
+ # or
280
+ ducklake.drop_table("events", if_exists: true)
281
+ ```
282
+
283
+ List files
284
+
285
+ ```ruby
286
+ ducklake.list_files("events")
287
+ ```
288
+
289
+ List files at a specific snapshot version or time
290
+
291
+ ```ruby
292
+ ducklake.list_files("events", snapshot_version: 3)
293
+ # or
294
+ ducklake.list_files("events", snapshot_time: Date.today - 7)
295
+ ```
296
+
297
+ ## History
298
+
299
+ View the [changelog](https://github.com/ankane/ducklake-ruby/blob/master/CHANGELOG.md)
300
+
301
+ ## Contributing
302
+
303
+ Everyone is encouraged to help improve this project. Here are a few ways you can help:
304
+
305
+ - [Report bugs](https://github.com/ankane/ducklake-ruby/issues)
306
+ - Fix bugs and [submit pull requests](https://github.com/ankane/ducklake-ruby/pulls)
307
+ - Write, clarify, or fix documentation
308
+ - Suggest or add new features
309
+
310
+ To get started with development:
311
+
312
+ ```sh
313
+ git clone https://github.com/ankane/ducklake-ruby.git
314
+ cd ducklake-ruby
315
+ bundle install
316
+
317
+ # Postgres
318
+ createdb ducklake_ruby_test
319
+ bundle exec rake test:postgres
320
+
321
+ # MySQL and MariaDB
322
+ mysqladmin create ducklake_ruby_test
323
+ bundle exec rake test:mysql
324
+
325
+ # SQLite
326
+ bundle exec rake test:sqlite
327
+
328
+ # DuckDB
329
+ bundle exec rake test:duckdb
330
+ ```
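The catalog URL forms listed in the README's Catalog Database section differ mainly in how the part after the scheme is interpreted. The sketch below is a simplified, standalone illustration of that idea using only `URI` from the standard library. The helper `describe_catalog_url` and its return shape are hypothetical, and it covers only the three catalog types the README lists as supported.

```ruby
require "uri"

# Hypothetical helper, not part of the gem: summarize a catalog URL.
def describe_catalog_url(url)
  uri = URI.parse(url)
  case uri.scheme
  when "postgres", "postgresql"
    # Server-based catalog: the whole URL identifies the database.
    {type: "postgres", target: url}
  when "sqlite", "duckdb"
    # File-based catalog: drop the leading "/" of the URI path, so
    # sqlite:///ducklake.sqlite refers to the relative file ducklake.sqlite.
    {type: uri.scheme, target: uri.path[1..]}
  else
    raise ArgumentError, "Unsupported catalog type: #{uri.scheme}"
  end
end

p describe_catalog_url("sqlite:///ducklake.sqlite")
```

Stripping the first slash matches what the README examples suggest: `sqlite:///ducklake.sqlite` in Getting Started stores the catalog in a local file next to `data_files/`.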
data/lib/ducklake/client.rb ADDED
@@ -0,0 +1,430 @@
1
+ module DuckLake
2
+ class Client
3
+ def initialize(
4
+ catalog_url:,
5
+ storage_url:,
6
+ storage_options: {},
7
+ snapshot_version: nil,
8
+ snapshot_time: nil,
9
+ data_inlining_row_limit: 0,
10
+ create_if_not_exists: false,
11
+ _read_only: false # experimental
12
+ )
13
+ catalog_uri = URI.parse(catalog_url)
14
+ storage_uri = URI.parse(storage_url)
15
+
16
+ extension = nil
17
+ case catalog_uri.scheme
18
+ when "postgres", "postgresql"
19
+ extension = "postgres"
20
+ attach = "postgres:#{catalog_uri}"
21
+ when "mysql", "mariadb"
22
+ extension = "mysql"
23
+ attach = "mysql:#{catalog_uri}"
24
+ when "sqlite"
25
+ extension = "sqlite"
26
+ attach = "sqlite:#{catalog_path(catalog_uri)}"
27
+ when "duckdb"
28
+ attach = "duckdb:#{catalog_path(catalog_uri)}"
29
+ else
30
+ raise ArgumentError, "Unsupported catalog type: #{catalog_uri.scheme}"
31
+ end
32
+
33
+ secret_options = nil
34
+ storage_options = storage_options.dup
35
+
36
+ case storage_uri.scheme
37
+ when "s3"
38
+ # https://duckdb.org/docs/stable/core_extensions/httpfs/s3api.html
39
+ key_id = storage_options.delete(:aws_access_key_id)
40
+ secret = storage_options.delete(:aws_secret_access_key)
41
+ region = storage_options.delete(:region)
42
+
43
+ secret_options = {
44
+ type: "s3",
45
+ provider: "credential_chain"
46
+ }
47
+ secret_options[:key_id] = key_id if key_id
48
+ secret_options[:secret] = secret if secret
49
+ secret_options[:region] = region if region
50
+ end
51
+
52
+ if storage_options.any?
53
+ raise ArgumentError, "Unsupported #{storage_uri.scheme || "file"} storage options: #{storage_options.keys.map(&:inspect).join(", ")}"
54
+ end
55
+
56
+ attach_options = {data_path: storage_url}
57
+ attach_options[:read_only] = true if _read_only
58
+ attach_options[:snapshot_version] = snapshot_version if !snapshot_version.nil?
59
+ attach_options[:snapshot_time] = snapshot_time if !snapshot_time.nil?
60
+ attach_options[:data_inlining_row_limit] = data_inlining_row_limit if data_inlining_row_limit > 0
61
+ attach_options[:create_if_not_exists] = false unless create_if_not_exists
62
+
63
+ @catalog = "ducklake"
64
+ @storage_url = storage_url
65
+
66
+ if _read_only
67
+ config = DuckDB::Config.new
68
+ config["access_mode"] = "READ_ONLY"
69
+
70
+ # make the entire database read-only, not just DuckLake
71
+ # read-only mode can only be set when the database is opened
72
+ # and cannot be used on in-memory database, so create a temporary one
73
+ @tmpdir = Dir.mktmpdir
74
+ ObjectSpace.define_finalizer(@tmpdir, self.class.finalize(@tmpdir.dup))
75
+ dbpath = File.join(@tmpdir, "memory.duckdb")
76
+ DuckDB::Database.open(dbpath) { }
77
+
78
+ @db = DuckDB::Database.open(dbpath, config)
79
+ else
80
+ @db = DuckDB::Database.open
81
+ end
82
+
83
+ @conn = @db.connect
84
+
85
+ install_extension("ducklake")
86
+ install_extension(extension) if extension
87
+ create_secret(secret_options) if secret_options
88
+ attach_with_options(@catalog, "ducklake:#{attach}", attach_options)
89
+ execute("USE #{quote_identifier(@catalog)}")
90
+ detach("memory")
91
+ end
92
+
93
+ # https://duckdb.org/docs/stable/operations_manual/securing_duckdb/overview.html#restricting-file-access
94
+ def disable_external_access(allowed_directories: [], allowed_paths: [])
95
+ allowed_directories += [@storage_url]
96
+ execute("SET allowed_directories = #{quote_array(allowed_directories)}")
97
+ execute("SET allowed_paths = #{quote_array(allowed_paths)}")
98
+ execute("SET enable_external_access = false")
99
+ nil
100
+ end
101
+
102
+ def sql(sql, params = [])
103
+ execute(sql, params)
104
+ end
105
+
106
+ def attach(alias_, url)
107
+ type = nil
108
+ extension = nil
109
+
110
+ uri = URI.parse(url)
111
+ case uri.scheme
112
+ when "postgres", "postgresql"
113
+ type = "postgres"
114
+ extension = "postgres"
115
+ else
116
+ raise ArgumentError, "Unsupported data source type: #{uri.scheme}"
117
+ end
118
+
119
+ install_extension(extension) if extension
120
+
121
+ options = {
122
+ type: type,
123
+ read_only: true
124
+ }
125
+ attach_with_options(alias_, url, options)
126
+ end
127
+
128
+ def detach(alias_)
129
+ execute("DETACH #{quote_identifier(alias_)}")
130
+ nil
131
+ end
132
+
133
+ def table_info
134
+ symbolize_keys execute("SELECT * FROM ducklake_table_info(?)", [@catalog])
135
+ end
136
+
137
+ def column_info(table)
138
+ sql = <<~SQL
139
+ SELECT column_name AS name, LOWER(data_type) AS type
140
+ FROM information_schema.columns
141
+ WHERE table_catalog = ? AND table_schema = ? AND table_name = ?
142
+ ORDER BY ordinal_position
143
+ SQL
144
+ result = execute(sql, [@catalog, "main", table])
145
+ if result.empty?
146
+ raise CatalogError, "Table does not exist!"
147
+ end
148
+ symbolize_keys result
149
+ end
150
+
151
+ # TODO more DDL methods?
152
+ def drop_table(table, if_exists: nil)
153
+ execute("DROP TABLE#{" IF EXISTS" if if_exists} #{quote_identifier(table)}")
154
+ nil
155
+ end
156
+
157
+ # https://ducklake.select/docs/stable/duckdb/usage/snapshots
158
+ def snapshots
159
+ symbolize_keys execute("SELECT * FROM ducklake_snapshots(?)", [@catalog])
160
+ end
161
+
162
+ # https://ducklake.select/docs/stable/duckdb/usage/configuration
163
+ def options
164
+ symbolize_keys execute("SELECT * FROM ducklake_options(?)", [@catalog])
165
+ end
166
+
167
+ # https://ducklake.select/docs/stable/duckdb/usage/configuration
168
+ def set_option(name, value, table_name: nil)
169
+ args = ["?", "?", "?"]
170
+ params = [@catalog, name, value]
171
+
172
+ if !table_name.nil?
173
+ args << "table_name => ?"
174
+ params << table_name
175
+ end
176
+
177
+ execute("CALL ducklake_set_option(#{args.join(", ")})", params)
178
+ nil
179
+ end
180
+
181
+ def format_version
182
+ execute("SELECT value FROM ducklake_options(?) WHERE option_name = ?", [@catalog, "version"]).first["value"]
183
+ end
184
+
185
+ # https://ducklake.select/docs/stable/duckdb/maintenance/merge_adjacent_files
186
+ def merge_adjacent_files
187
+ execute("CALL merge_adjacent_files()")
188
+ nil
189
+ end
190
+
191
+ # https://ducklake.select/docs/stable/duckdb/maintenance/expire_snapshots
192
+ def expire_snapshots(versions: nil, older_than: nil, dry_run: false)
193
+ args = ["?"]
194
+ params = [@catalog]
195
+
196
+ if !versions.nil?
197
+ # inline since duckdb gem does not support array params
198
+ args << "versions => #{quote_array(versions)}"
199
+ end
200
+
201
+ if !older_than.nil?
202
+ args << "older_than => ?"
203
+ params << older_than
204
+ end
205
+
206
+ if dry_run
207
+ args << "dry_run => ?"
208
+ params << dry_run
209
+ end
210
+
211
+ symbolize_keys execute("CALL ducklake_expire_snapshots(#{args.join(", ")})", params)
212
+ end
213
+
214
+ # https://ducklake.select/docs/stable/duckdb/maintenance/cleanup_old_files
215
+ def cleanup_old_files(cleanup_all: false, older_than: nil, dry_run: false)
216
+ args = ["?"]
217
+ params = [@catalog]
218
+
219
+ if cleanup_all
220
+ args << "cleanup_all => ?"
221
+ params << cleanup_all
222
+ end
223
+
224
+ if !older_than.nil?
225
+ args << "older_than => ?"
226
+ params << older_than
227
+ end
228
+
229
+ if dry_run
230
+ args << "dry_run => ?"
231
+ params << dry_run
232
+ end
233
+
234
+ symbolize_keys execute("CALL ducklake_cleanup_old_files(#{args.join(", ")})", params)
235
+ end
236
+
237
+ # https://ducklake.select/docs/stable/duckdb/advanced_features/data_inlining
238
+ def flush_inlined_data(table_name: nil)
239
+ args = ["?"]
240
+ params = [@catalog]
241
+
242
+ if !table_name.nil?
243
+ args << "table_name => ?"
244
+ params << table_name
245
+ end
246
+
247
+ symbolize_keys execute("CALL ducklake_flush_inlined_data(#{args.join(", ")})", params)
248
+ end
249
+
250
+ # https://ducklake.select/docs/stable/duckdb/metadata/list_files
251
+ def list_files(table, snapshot_version: nil, snapshot_time: nil)
252
+ args = ["?", "?"]
253
+ params = [@catalog, table]
254
+
255
+ if !snapshot_version.nil?
256
+ args << "snapshot_version => ?"
257
+ params << snapshot_version
258
+ end
259
+
260
+ if !snapshot_time.nil?
261
+ snapshot_time = snapshot_time.utc if snapshot_time.is_a?(Time)
262
+ args << "snapshot_time => ?"
263
+ params << snapshot_time
264
+ end
265
+
266
+ symbolize_keys execute("SELECT * FROM ducklake_list_files(#{args.join(", ")})", params)
267
+ end
268
+
269
+ # https://ducklake.select/docs/stable/duckdb/metadata/adding_files
270
+ def add_data_files(table, data, allow_missing: nil, ignore_extra_columns: nil)
271
+ params = [@catalog, table, data]
272
+ args = ["?", "?", "?"]
273
+
274
+ if !allow_missing.nil?
275
+ args << "allow_missing => ?"
276
+ params << allow_missing
277
+ end
278
+
279
+ if !ignore_extra_columns.nil?
280
+ args << "ignore_extra_columns => ?"
281
+ params << ignore_extra_columns
282
+ end
283
+
284
+ execute("CALL ducklake_add_data_files(#{args.join(", ")})", params)
285
+ nil
286
+ end
287
+
288
+ # libduckdb does not provide function
289
+ # https://duckdb.org/docs/stable/sql/dialect/keywords_and_identifiers.html
290
+ def quote_identifier(value)
291
+ "\"#{encoded(value).gsub('"', '""')}\""
292
+ end
293
+
294
+ # libduckdb does not provide function
295
+ # TODO support more types
296
+ def quote(value)
297
+ if value.nil?
298
+ "NULL"
299
+ elsif value == true
300
+ "true"
301
+ elsif value == false
302
+ "false"
303
+ elsif defined?(BigDecimal) && value.is_a?(BigDecimal)
304
+ value.to_s("F")
305
+ elsif value.is_a?(Numeric)
306
+ value.to_s
307
+ else
308
+ if value.is_a?(Time)
309
+ value = value.utc.iso8601(9)
310
+ elsif value.is_a?(DateTime)
311
+ value = value.iso8601(9)
312
+ elsif value.is_a?(Date)
313
+ value = value.strftime("%Y-%m-%d")
314
+ end
315
+ "'#{encoded(value).gsub("'", "''")}'"
316
+ end
317
+ end
318
+
319
+ def disconnect
320
+ @conn.disconnect
321
+ @db.close
322
+ nil
323
+ end
324
+
325
+ # hide internal state
326
+ def inspect
327
+ to_s
328
+ end
329
+
330
+ def self.finalize(dir)
331
+ proc { FileUtils.remove_entry(dir) }
332
+ end
333
+
334
+ private
335
+
336
+ def execute(sql, params = [])
337
+ # use prepare instead of query to prevent multiple statements at once
338
+ result =
339
+ @conn.prepare(sql) do |stmt|
340
+ params.each_with_index do |v, i|
341
+ stmt.bind(i + 1, v)
342
+ end
343
+ stmt.execute
344
+ end
345
+
346
+ # TODO add column types
347
+ Result.new(result.columns.map(&:name), result.to_a)
348
+ rescue DuckDB::Error => e
349
+ raise map_error(e), cause: nil
350
+ end
351
+
352
+ def error_mapping
353
+ @error_mapping ||= {
354
+ "Catalog Error: " => CatalogError,
355
+ "Conversion Error: " => ConversionError,
356
+ "Invalid Input Error: " => InvalidInputError,
357
+ "IO Error: " => IOError,
358
+ "Permission Error: " => PermissionError
359
+ }
360
+ end
361
+
362
+ # not ideal to base on prefix, but do not see a better way at the moment
363
+ def map_error(e)
364
+ error_mapping.each do |prefix, cls|
365
+ if e.message&.start_with?(prefix)
366
+ return cls.new(e.message.delete_prefix(prefix))
367
+ end
368
+ end
369
+ Error.new(e.message)
370
+ end
371
+
372
+ def install_extension(extension)
373
+ execute("INSTALL #{quote_identifier(extension)}")
374
+ end
375
+
376
+ def create_secret(options)
377
+ execute("CREATE SECRET (#{options_args(options)})")
378
+ end
379
+
380
+ def attach_with_options(alias_, url, options)
381
+ execute("ATTACH #{quote(url)} AS #{quote_identifier(alias_)} (#{options_args(options)})")
382
+ end
383
+
384
+ def options_args(options)
385
+ options.map { |k, v| "#{option_name(k)} #{quote(v)}" }.join(", ")
386
+ end
387
+
388
+ def option_name(k)
389
+ name = k.to_s.upcase
390
+ # should never contain user input, but just to be safe
391
+ unless name.match?(/\A[A-Z_]+\z/)
392
+ raise "Invalid option name"
393
+ end
394
+ name
395
+ end
396
+
397
+ def symbolize_keys(result)
398
+ result.map { |v| v.transform_keys(&:to_sym) }
399
+ end
400
+
401
+ def catalog_path(uri)
402
+ # custom message for sqlite://db.sqlite
403
+ # TODO improve message
404
+ if !uri.host.empty?
405
+ raise ArgumentError, "Unexpected host in catalog_url"
406
+ end
407
+
408
+ if uri.path.length < 2 || uri.user || uri.password || uri.port || uri.query || uri.fragment
409
+ raise ArgumentError, "Invalid catalog_url"
410
+ end
411
+
412
+ uri.path[1..]
413
+ end
414
+
415
+ def quote_array(value)
416
+ "[#{value.map { |v| quote(v) }.join(", ")}]"
417
+ end
418
+
419
+ def encoded(value)
420
+ value = value.to_s if value.is_a?(Symbol)
421
+ if !value.respond_to?(:to_str)
422
+ raise TypeError, "no implicit conversion of #{value.class.name} into String"
423
+ end
424
+ if ![Encoding::UTF_8, Encoding::US_ASCII].include?(value.encoding) || !value.valid_encoding?
425
+ raise ArgumentError, "Unsupported encoding"
426
+ end
427
+ value
428
+ end
429
+ end
430
+ end
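`quote_identifier` and `quote` in the client above escape by doubling the delimiter: a double quote inside an identifier becomes two double quotes, and a single quote inside a string literal becomes two single quotes. The following standalone sketch shows only that doubling rule. The `sketch_` method names are hypothetical, and the encoding checks and type handling from the real methods are left out.

```ruby
# Standalone sketch of the doubling rule (method names are hypothetical).
def sketch_quote_identifier(name)
  "\"#{name.to_s.gsub('"', '""')}\""
end

def sketch_quote_string(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

puts sketch_quote_identifier('my"table') # prints "my""table"
puts sketch_quote_string("O'Reilly")     # prints 'O''Reilly'
```

Because the delimiter can never appear unescaped, the quoted value cannot terminate the identifier or literal early. Values that can be bound as `?` parameters are still better passed that way, as the `sql` method allows.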
data/lib/ducklake/result.rb ADDED
@@ -0,0 +1,22 @@
1
+ module DuckLake
2
+ class Result
3
+ include Enumerable
4
+
5
+ attr_reader :columns, :rows
6
+
7
+ def initialize(columns, rows)
8
+ @columns = columns
9
+ @rows = rows
10
+ end
11
+
12
+ def each
13
+ @rows.each do |row|
14
+ yield @columns.zip(row).to_h
15
+ end
16
+ end
17
+
18
+ def empty?
19
+ rows.empty?
20
+ end
21
+ end
22
+ end
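Since `Result` includes `Enumerable` and its `each` yields one hash per row, the usual collection methods (`to_a`, `map`, `first`, and so on) work on query results. The sketch below copies the class into a hypothetical `ResultSketch` module so that it runs without the gem or DuckDB installed. The column names and rows are made-up sample data.

```ruby
# Copy of the Result class shown above, wrapped in a hypothetical module
# so the example is self-contained.
module ResultSketch
  class Result
    include Enumerable

    attr_reader :columns, :rows

    def initialize(columns, rows)
      @columns = columns
      @rows = rows
    end

    # Yield each row as a hash keyed by column name.
    def each
      @rows.each do |row|
        yield @columns.zip(row).to_h
      end
    end
  end
end

result = ResultSketch::Result.new(["id", "name"], [[1, "signup"], [2, "login"]])
p result.to_a                     # array of row hashes with string keys
p result.map { |row| row["name"] } # values of the name column
```

Note that keys are strings here. The client's `symbolize_keys` helper converts them to symbols for methods such as `snapshots` and `list_files`, while `sql` returns the `Result` unchanged.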
data/lib/ducklake/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module DuckLake
2
+ VERSION = "0.1.0"
3
+ end
data/lib/ducklake.rb ADDED
@@ -0,0 +1,19 @@
1
+ # dependencies
2
+ require "duckdb"
3
+
4
+ # stdlib
5
+ require "uri"
6
+
7
+ # modules
8
+ require_relative "ducklake/client"
9
+ require_relative "ducklake/result"
10
+ require_relative "ducklake/version"
11
+
12
+ module DuckLake
13
+ class Error < StandardError; end
14
+ class CatalogError < Error; end
15
+ class ConversionError < Error; end
16
+ class InvalidInputError < Error; end
17
+ class IOError < Error; end
18
+ class PermissionError < Error; end
19
+ end
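The error classes above are selected in `client.rb` by matching the prefix of the DuckDB error message. A reduced, standalone sketch of that prefix mapping is shown below. The `ErrorSketch` module is hypothetical and only two of the five prefixes are included, to keep the example short.

```ruby
# Hypothetical standalone module illustrating prefix-based error mapping.
module ErrorSketch
  class Error < StandardError; end
  class CatalogError < Error; end
  class IOError < Error; end

  PREFIXES = {
    "Catalog Error: " => CatalogError,
    "IO Error: " => IOError
  }.freeze

  # Return an instance of the first class whose prefix matches, with the
  # prefix removed from the message; fall back to the base Error.
  def self.map_error(message)
    PREFIXES.each do |prefix, cls|
      return cls.new(message.delete_prefix(prefix)) if message.start_with?(prefix)
    end
    Error.new(message)
  end
end

e = ErrorSketch.map_error("Catalog Error: Table does not exist!")
puts e.class   # prints ErrorSketch::CatalogError
puts e.message # prints Table does not exist!
```

Because every class inherits from the base `Error`, a caller can rescue one specific class or the whole family with a single `rescue` clause.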
metadata ADDED
@@ -0,0 +1,58 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: ducklake
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Andrew Kane
8
+ bindir: bin
9
+ cert_chain: []
10
+ date: 1980-01-02 00:00:00.000000000 Z
11
+ dependencies:
12
+ - !ruby/object:Gem::Dependency
13
+ name: duckdb
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - ">="
17
+ - !ruby/object:Gem::Version
18
+ version: '0'
19
+ type: :runtime
20
+ prerelease: false
21
+ version_requirements: !ruby/object:Gem::Requirement
22
+ requirements:
23
+ - - ">="
24
+ - !ruby/object:Gem::Version
25
+ version: '0'
26
+ email: andrew@ankane.org
27
+ executables: []
28
+ extensions: []
29
+ extra_rdoc_files: []
30
+ files:
31
+ - CHANGELOG.md
32
+ - README.md
33
+ - lib/ducklake.rb
34
+ - lib/ducklake/client.rb
35
+ - lib/ducklake/result.rb
36
+ - lib/ducklake/version.rb
37
+ homepage: https://github.com/ankane/ducklake-ruby
38
+ licenses:
39
+ - MIT
40
+ metadata: {}
41
+ rdoc_options: []
42
+ require_paths:
43
+ - lib
44
+ required_ruby_version: !ruby/object:Gem::Requirement
45
+ requirements:
46
+ - - ">="
47
+ - !ruby/object:Gem::Version
48
+ version: '3.2'
49
+ required_rubygems_version: !ruby/object:Gem::Requirement
50
+ requirements:
51
+ - - ">="
52
+ - !ruby/object:Gem::Version
53
+ version: '0'
54
+ requirements: []
55
+ rubygems_version: 3.6.9
56
+ specification_version: 4
57
+ summary: DuckLake for Ruby
58
+ test_files: []