databricks_sql 0.1.0
- checksums.yaml +7 -0
- data/.rspec +3 -0
- data/CHANGELOG.md +8 -0
- data/CODE_OF_CONDUCT.md +10 -0
- data/LICENSE.txt +21 -0
- data/README.md +215 -0
- data/Rakefile +10 -0
- data/lib/databricks_sql/client.rb +477 -0
- data/lib/databricks_sql/configuration.rb +22 -0
- data/lib/databricks_sql/errors.rb +53 -0
- data/lib/databricks_sql/external_link_handler.rb +89 -0
- data/lib/databricks_sql/result.rb +25 -0
- data/lib/databricks_sql/type_coercer.rb +59 -0
- data/lib/databricks_sql/version.rb +5 -0
- data/lib/databricks_sql.rb +31 -0
- data/sig/databricks_sql.rbs +182 -0
- data/spec/databricks_sql/client_spec.rb +533 -0
- data/spec/databricks_sql_spec.rb +37 -0
- data/spec/spec_helper.rb +16 -0
- metadata +77 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: fb1784eb43d904710d121f1718ce325b43fad687ca320fa0b5bbe2d370dcefd0
  data.tar.gz: e4ea467babed9260d712d7f566921ad1dbcf4a6fb2a5c7bf1acf7fee3b13974a
SHA512:
  metadata.gz: 28a3767b8b015dddd8c26140d3911906858b20c677188c9e78999d119c08915f8f66f37aef294be3e3b006e025ec6085f204b39425f7f78d9709e891c9cc132a
  data.tar.gz: ddd0c3ecb4725068cd3605db468ce1c0bee0472f262eeb63b91403f6a76337d798b593cdfcc3d60bee9fa83cc9a3462f18fce45546b7fb221cf82ebe5fb4efba
data/.rspec
ADDED
data/CHANGELOG.md
ADDED
@@ -0,0 +1,8 @@
## [0.1.0] - 2026-04-06

- First public release of the gem.
- Client for executing SQL queries on Databricks.
- Flexible configuration via code.
- Custom error handling.
- Automatic type coercion for query results.
- Support for external links in results.
data/CODE_OF_CONDUCT.md
ADDED
@@ -0,0 +1,10 @@
# Code of Conduct

"databricks_sql" follows [The Ruby Community Conduct Guideline](https://www.ruby-lang.org/en/conduct) in all "collaborative space", which is defined as community communications channels (such as mailing lists, submitted patches, commit comments, etc.):

* Participants will be tolerant of opposing views.
* Participants must ensure that their language and actions are free of personal attacks and disparaging personal remarks.
* When interpreting the words and actions of others, participants should always assume good intentions.
* Behaviour which can be reasonably considered harassment will not be tolerated.

If you have any concerns about behaviour within this project, please contact us at ["lairton.mendes@gmail.com"](mailto:"lairton.mendes@gmail.com").
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2026 Lairton Mendes

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,215 @@
# databricks_sql

Ruby gem for the Databricks SQL Statements API with support for:

- Personal Access Token (PAT) authentication
- synchronous and asynchronous execution (polling)
- format: JSON_ARRAY
- disposition: INLINE
- disposition: EXTERNAL_LINK with automatic file download and parsing
- HTTP and SQL execution error handling

## Installation

Add to your Gemfile:

```ruby
gem "databricks_sql"
```

Or install directly:

```bash
gem install databricks_sql
```

## Global Configuration (Recommended)

Configure connection settings once and reuse them across your application:

```ruby
require "databricks_sql"

Databricks.configure do |config|
  config.host = "https://adb-1234567890123456.7.azuredatabricks.net"
  config.token = ENV.fetch("DATABRICKS_TOKEN")
  config.warehouse_id = ENV.fetch("DATABRICKS_WAREHOUSE_ID")
  config.timeout = 30
  config.open_timeout = 10
  config.external_link_require_https = true
  config.external_link_allowed_hosts = ["files.example.com", "s3.amazonaws.com"]
end
```

Security notes:

- The API host must use HTTPS.
- EXTERNAL_LINK URLs are HTTPS-only by default.
- If `external_link_allowed_hosts` is set, downloads are allowed only from those domains.
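
The two EXTERNAL_LINK rules above can be pictured as a small URL check. The following is only a sketch under stated assumptions: `external_link_allowed?` is a hypothetical helper, not the gem's code, and treating subdomains of an allowed entry as acceptable is a guess about how "those domains" might be matched.

```ruby
require "uri"

# Hypothetical helper (not part of databricks_sql): applies the HTTPS rule
# and the optional host allow-list to a single download URL.
def external_link_allowed?(url, require_https: true, allowed_hosts: nil)
  uri = URI.parse(url)
  return false if uri.host.nil?
  return false if require_https && uri.scheme != "https"
  # No allow-list configured: any host that passed the scheme check is fine.
  return true if allowed_hosts.nil? || allowed_hosts.empty?

  host = uri.host.downcase
  # Assumption: an entry matches itself and any of its subdomains.
  allowed_hosts.any? do |allowed|
    allowed = allowed.downcase
    host == allowed || host.end_with?(".#{allowed}")
  end
rescue URI::InvalidURIError
  # Unparseable URLs are rejected rather than raised.
  false
end

puts external_link_allowed?(
  "https://bucket.s3.amazonaws.com/part-0.json",
  allowed_hosts: ["files.example.com", "s3.amazonaws.com"]
)
```

Matching on a leading dot (`.s3.amazonaws.com`) rather than a bare suffix keeps a look-alike host such as `evils3.amazonaws.com` from slipping through.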

Then initialize your client without passing credentials again:

```ruby
client = DatabricksSql::Client.new
```

You can also configure through DatabricksSql.configure:

```ruby
DatabricksSql.configure do |config|
  config.host = "https://adb-1234567890123456.7.azuredatabricks.net"
  config.token = ENV.fetch("DATABRICKS_TOKEN")
  config.warehouse_id = ENV.fetch("DATABRICKS_WAREHOUSE_ID")
end
```

If needed, override per client instance:

```ruby
client = DatabricksSql::Client.new(
  host: "https://adb-1234567890123456.7.azuredatabricks.net",
  token: ENV.fetch("DATABRICKS_TOKEN"),
  warehouse_id: ENV.fetch("DATABRICKS_WAREHOUSE_ID")
)
```
```
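
How a global configuration and per-instance overrides can coexist is easiest to see in miniature. The sketch below is a stand-in for the pattern only: `SketchSql`, its `Configuration` struct, and the fallback logic are hypothetical and are not taken from the gem's source.

```ruby
# Illustrative stand-in for the configure-then-override pattern.
# All names in this module are hypothetical.
module SketchSql
  Configuration = Struct.new(:host, :token, :warehouse_id)

  # Lazily created, process-wide settings object.
  def self.configuration
    @configuration ||= Configuration.new
  end

  def self.configure
    yield configuration
  end

  class Client
    attr_reader :host, :token, :warehouse_id

    # Each keyword falls back to the global value when it is not given.
    def initialize(host: nil, token: nil, warehouse_id: nil)
      config = SketchSql.configuration
      @host = host || config.host
      @token = token || config.token
      @warehouse_id = warehouse_id || config.warehouse_id
      raise ArgumentError, "host must use HTTPS" unless @host.to_s.start_with?("https://")
    end
  end
end

SketchSql.configure do |config|
  config.host = "https://example.cloud.databricks.com"
  config.token = "global-token"
  config.warehouse_id = "wh-1"
end

puts SketchSql::Client.new.warehouse_id                       # wh-1
puts SketchSql::Client.new(warehouse_id: "wh-2").warehouse_id # wh-2
```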

## Synchronous Usage

execute_statement submits the query and waits for a terminal status (SUCCEEDED, FAILED, CANCELED, or CLOSED).

```ruby
result = client.execute_statement(
  statement: "SELECT id, name FROM analytics.users LIMIT 5",
  format: "JSON_ARRAY",
  disposition: "INLINE"
)

puts result.status
puts result.columns.inspect
puts result.rows.inspect
```

### SQL Context (catalog/schema)

```ruby
result = client.execute_statement(
  statement: "SELECT current_catalog(), current_schema()",
  catalog: "main",
  schema: "analytics"
)
```

### Type Mapping with column_schema

column_schema allows optional per-column coercion.

```ruby
result = client.execute_statement(
  statement: "SELECT id, is_active, created_at FROM analytics.users LIMIT 2",
  column_schema: {
    "id" => :integer,
    "is_active" => :boolean,
    "created_at" => :datetime
  }
)

result.rows.each do |row|
  puts [row["id"].class, row["is_active"].class, row["created_at"].class].inspect
end
```
```
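
To make the idea of per-column coercion concrete, here is a minimal sketch of what applying a `column_schema` to string values could look like. `COERCERS` and `coerce_row` are hypothetical names, and mapping `:datetime` to `Time` via ISO 8601 parsing is an assumption; the gem's own type coercer may behave differently.

```ruby
require "time"

# Hypothetical coercers keyed by the symbols used in the example above.
COERCERS = {
  integer: ->(value) { Integer(value.to_s, 10) },
  boolean: ->(value) { value.to_s.downcase == "true" },
  datetime: ->(value) { Time.iso8601(value) } # assumes ISO 8601 strings
}.freeze

# Applies column_schema to one row hash. Nil values and columns without a
# schema entry pass through unchanged.
def coerce_row(row, column_schema)
  row.to_h do |name, value|
    coercer = COERCERS[column_schema[name]]
    coerced = (value.nil? || coercer.nil?) ? value : coercer.call(value)
    [name, coerced]
  end
end

row = { "id" => "42", "is_active" => "true", "created_at" => "2026-04-06T12:00:00Z", "name" => "Ada" }
schema = { "id" => :integer, "is_active" => :boolean, "created_at" => :datetime }
p coerce_row(row, schema).transform_values(&:class)
```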

## Asynchronous Usage (Polling)

### 1) Submit without blocking

```ruby
submission = client.execute_statement_async(
  statement: "SELECT * FROM large_table",
  format: "JSON_ARRAY",
  disposition: "EXTERNAL_LINK",
  wait_timeout: "10s",
  on_wait_timeout: "CONTINUE"
)

statement_id = submission.fetch("statement_id")
puts "Statement ID: #{statement_id}"
```

### 2) Manual polling

```ruby
loop do
  state = client.get_statement(statement_id: statement_id)
  puts "Current status: #{state["status"]}"
  break if %w[SUCCEEDED FAILED CANCELED CLOSED].include?(state["status"])
  sleep 1
end
```

### 3) Automatic polling with global timeout

```ruby
result = client.wait_for_statement(
  statement_id: statement_id,
  disposition: "EXTERNAL_LINK",
  poll_interval: 1.0,
  max_wait: 120,
  cancel_on_timeout: true
)

puts result.rows.size
```
```
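
The interplay of `poll_interval`, `max_wait`, and `cancel_on_timeout` can be sketched as a deadline-bounded loop. This is an illustration, not the gem's implementation: `poll_until_terminal`, `PollTimeout`, and the callable-based interface are hypothetical, chosen so the example runs without a Databricks workspace.

```ruby
TERMINAL_STATES = %w[SUCCEEDED FAILED CANCELED CLOSED].freeze

# Hypothetical error raised when the deadline passes.
class PollTimeout < StandardError; end

# Hypothetical helper: `fetch_state` is any callable returning the current
# status string; `cancel` (optional) is invoked once if max_wait elapses.
def poll_until_terminal(fetch_state:, cancel: nil, poll_interval: 1.0, max_wait: 120)
  # A monotonic clock is unaffected by wall-clock adjustments.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + max_wait
  loop do
    state = fetch_state.call
    return state if TERMINAL_STATES.include?(state)

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      cancel&.call
      raise PollTimeout, "statement still #{state} after #{max_wait}s"
    end
    sleep poll_interval
  end
end

# Simulated status sequence standing in for repeated API calls.
states = %w[PENDING RUNNING SUCCEEDED].each
puts poll_until_terminal(fetch_state: -> { states.next }, poll_interval: 0)
```

Checking the deadline after each status fetch means a statement that finishes right at the limit is still reported as finished rather than cancelled.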

## INLINE vs EXTERNAL_LINK

- INLINE returns results directly in the API payload.
- EXTERNAL_LINK extracts the download URL, downloads the file, and returns parsed content.

In EXTERNAL_LINK mode, JSON and CSV are parsed automatically.
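
The difference between the two dispositions shows up in the shape of the response. The sketch below assumes the Statement Execution API layout as commonly documented (`manifest.schema.columns`, `result.data_array`, `result.external_links[].external_link`); the helper names are hypothetical, and the gem's own result handling may differ.

```ruby
require "json"

# Hypothetical helper: turns an INLINE JSON_ARRAY payload into row hashes
# by pairing column names with each value array.
def rows_from_inline(payload)
  names = payload.dig("manifest", "schema", "columns").map { |column| column["name"] }
  (payload.dig("result", "data_array") || []).map { |values| names.zip(values).to_h }
end

# Hypothetical helper: collects the download URLs from an EXTERNAL_LINK
# payload. Downloading and parsing the files is left out of this sketch.
def external_link_urls(payload)
  (payload.dig("result", "external_links") || []).map { |link| link["external_link"] }
end

inline = JSON.parse(<<~PAYLOAD)
  {
    "manifest": { "schema": { "columns": [{ "name": "id" }, { "name": "name" }] } },
    "result": { "data_array": [["1", "Ada"], ["2", "Linus"]] }
  }
PAYLOAD

p rows_from_inline(inline)
```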

## Error Handling

Main error classes:

- DatabricksSql::AuthenticationError (401)
- DatabricksSql::AuthorizationError (403)
- DatabricksSql::NotFoundError (404)
- DatabricksSql::RateLimitError (429)
- DatabricksSql::ServerError (5xx)
- DatabricksSql::TimeoutError
- DatabricksSql::ConnectionError
- DatabricksSql::ExecutionError (logical SQL execution failure)
- DatabricksSql::ParseError

Example:

```ruby
begin
  result = client.execute_statement(statement: "SELECT * FROM missing_table")
  p result.rows
rescue DatabricksSql::ExecutionError => e
  warn "SQL execution failed: #{e.message}"
rescue DatabricksSql::HTTPError => e
  warn "HTTP error #{e.status_code}: #{e.message}"
rescue DatabricksSql::Error => e
  warn "DatabricksSql error: #{e.message}"
end
```
```
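
The status codes listed next to the error classes suggest a simple lookup from HTTP status to exception type. The sketch below uses a stand-in `SketchErrors` module so it runs on its own; placing the status-specific classes under `HTTPError` and the constructor signature are assumptions, not confirmed details of the gem.

```ruby
# Stand-in hierarchy mirroring the class names listed above. The real gem's
# inheritance and constructor signatures may differ.
module SketchErrors
  class Error < StandardError; end

  class HTTPError < Error
    attr_reader :status_code

    def initialize(message, status_code:)
      super(message)
      @status_code = status_code
    end
  end

  class AuthenticationError < HTTPError; end
  class AuthorizationError < HTTPError; end
  class NotFoundError < HTTPError; end
  class RateLimitError < HTTPError; end
  class ServerError < HTTPError; end

  STATUS_MAP = {
    401 => AuthenticationError,
    403 => AuthorizationError,
    404 => NotFoundError,
    429 => RateLimitError
  }.freeze

  # Picks the most specific class; any 5xx maps to ServerError and every
  # other unmapped status falls back to the generic HTTPError.
  def self.error_for(status_code, message)
    klass = STATUS_MAP.fetch(status_code) { status_code >= 500 ? ServerError : HTTPError }
    klass.new(message, status_code: status_code)
  end
end

error = SketchErrors.error_for(429, "Too many requests")
puts "#{error.class} (#{error.status_code}): #{error.message}"
```

With a hierarchy like this, `rescue` clauses should run from most specific to most general, as in the example above, because Ruby takes the first clause that matches.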

## Development

```bash
bin/setup
bundle exec rubocop
bundle exec rspec
```

Install locally:

```bash
bundle exec rake install
```

## License

MIT. See [LICENSE.txt](LICENSE.txt).