@rip-lang/db 0.9.0 → 1.0.1
- package/INTERNALS.md +324 -0
- package/README.md +93 -237
- package/bin/rip-db +3 -3
- package/build.zig +88 -0
- package/db.rip +162 -74
- package/lib/darwin-arm64/duckdb.node +0 -0
- package/lib/duckdb.mjs +246 -333
- package/package.json +11 -8
- package/src/duckdb.zig +1156 -0
- package/PROTOCOL.md +0 -258
- package/db.html +0 -76
- package/lib/duckdb-binary.rip +0 -528
package/INTERNALS.md
ADDED
@@ -0,0 +1,324 @@

# DuckDB Internals

This document covers the internal architecture, binary protocol, and implementation
details for rip-db's native DuckDB integration. For usage, see README.md.

## Architecture

```
┌────────────────────────────────────────────────────────────────────────────┐
│                                  Browser                                   │
│  ┌──────────────────────────────────────────────────────────────────────┐  │
│  │  DuckDB UI (React App)                                               │  │
│  │  - Loaded from http://localhost:4213/ (proxied from ui.duckdb.org)   │  │
│  │  - Makes API calls to relative URLs (/ddb/run, /ddb/tokenize, etc.)  │  │
│  │  - Uses EventSource for /localEvents (catalog updates)               │  │
│  └──────────────────────────────────────────────────────────────────────┘  │
│                                      │                                     │
│                                      │ Same-origin requests                │
│                                      ▼                                     │
└────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌────────────────────────────────────────────────────────────────────────────┐
│                           rip-db Server (:4213)                            │
│                                                                            │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────────────┐   │
│  │ Static Proxy    │   │ Binary API      │   │ SSE Events              │   │
│  │ GET /*          │   │ POST /ddb/run   │   │ GET /localEvents        │   │
│  │ → ui.duckdb.org │   │ POST /ddb/token │   │ → catalog updates       │   │
│  └─────────────────┘   │ POST /ddb/intr  │   └─────────────────────────┘   │
│                        └─────────────────┘                                 │
│                                 │                                          │
│                                 ▼                                          │
│  ┌──────────────────────────────────────────────────────────────────────┐  │
│  │  High-Performance Zig Bindings                                       │  │
│  │  - Single FFI call per query                                         │  │
│  │  - Zero-copy for numeric columns (direct memory access)              │  │
│  │  - Binary serialization done in Zig (no JS overhead)                 │  │
│  │  - Pre-allocated output buffer (no allocations per query)            │  │
│  └──────────────────────────────────────────────────────────────────────┘  │
│                                 │                                          │
│                                 ▼                                          │
│                        ┌─────────────────┐                                 │
│                        │     DuckDB      │                                 │
│                        │    (native)     │                                 │
│                        └─────────────────┘                                 │
└────────────────────────────────────────────────────────────────────────────┘
```

### Why Zig?

The naive approach (per-value FFI calls) has severe problems:

```
┌─────────┐    FFI call     ┌──────────┐   per-value    ┌──────────┐
│ DuckDB  │ ──────────────▶ │   Zig    │ ─────────────▶ │  Bun/JS  │
│ Result  │ ◀────────────── │ Wrapper  │ ◀───────────── │ Extracts │
└─────────┘   ptr to data   └──────────┘   100k calls   └──────────┘
                                                              │
                                                              ▼
                                                ┌──────────────────────────┐
                                                │    JavaScript Objects    │
                                                │    (100k allocations)    │
                                                └──────────────────────────┘
```

**Problems:**
- ~100,000 FFI calls for 10k rows × 10 columns
- String pointers become invalid once the result is freed, which causes segfaults
- JavaScript heap pressure from intermediate objects (Bun runs on JavaScriptCore, not V8)

**Our solution:** Do everything in Zig with a single FFI call:

```
┌─────────┐   single call   ┌──────────────────────────────┐
│ DuckDB  │ ──────────────▶ │             Zig              │
│ Result  │                 │  ┌─────────────────────────┐ │
└─────────┘                 │  │   Binary Serializer     │ │
     │                      │  │  (direct memory access) │ │
     │ column data ptrs     │  └───────────┬─────────────┘ │
     └──────────────────────┼──────────────┘               │
                            │              ▼               │
                            │  ┌─────────────────────────┐ │
                            │  │     Output Buffer       │ │
                            │  │    (pre-allocated)      │ │
                            │  └─────────────────────────┘ │
                            └──────────────────────────────┘
                                           │
                                           ▼
                            ┌──────────────────────────────┐
                            │        HTTP Response         │
                            └──────────────────────────────┘
```

| Metric | Naive (JS) | Zig | Improvement |
|--------|------------|-----|-------------|
| FFI calls (10k×10) | ~100,000 | 1 | 100,000× |
| Allocations | O(rows×cols) | O(1) | Constant instead of linear |
| Memory copies | 3-4 per value | 1 total | 3-4× |
| String handling | Unsafe (crash) | Safe | ✓ |

---

## Binary Protocol

The DuckDB UI uses a custom binary format for query results. All numbers are little-endian.

### Primitives

#### varint (Variable-length Integer)
```
result = 0; shift = 0
do:
  byte = read next byte
  result |= (byte & 0x7F) << shift
  shift += 7
while (byte & 0x80)
```
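
The pseudocode above describes a base-128 encoding with seven data bits per byte and the high bit as a continuation flag. The following is a minimal JavaScript sketch of an encoder and decoder under that reading; the function names `encodeVarint` and `decodeVarint` are illustrative and are not taken from rip-db.

```javascript
// Sketch only: helper names are hypothetical, not part of rip-db.

// Encode a non-negative integer as a base-128 varint, low 7 bits first.
function encodeVarint(value) {
  let v = BigInt(value);
  if (v < 0n) throw new RangeError("varint must be non-negative");
  const bytes = [];
  do {
    let byte = Number(v & 0x7fn);
    v >>= 7n;
    if (v > 0n) byte |= 0x80; // continuation bit: more bytes follow
    bytes.push(byte);
  } while (v > 0n);
  return Uint8Array.from(bytes);
}

// Decode a varint starting at `offset`.
// Returns the value (as a BigInt) and the offset of the next unread byte.
function decodeVarint(bytes, offset = 0) {
  let result = 0n;
  let shift = 0n;
  let pos = offset;
  while (true) {
    if (pos >= bytes.length) throw new RangeError("truncated varint");
    const byte = bytes[pos++];
    result |= BigInt(byte & 0x7f) << shift;
    if ((byte & 0x80) === 0) break; // final byte has the high bit clear
    shift += 7n;
  }
  return { value: result, next: pos };
}
```

BigInt is used so that values above 32 bits do not overflow JavaScript's bitwise operators; small values can be converted with `Number(...)`.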

#### Field ID
```
id: uint16 (little-endian)
end marker: 0xFFFF
```

#### string
```
length: varint
data: UTF-8 bytes
```

#### list<T>
```
count: varint
items: T[]
```

#### nullable<T>
```
present: uint8 (0 = null, non-zero = present)
value: T (only if present)
```
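
To make the primitives concrete, here is a small cursor-style reader in JavaScript. It is a sketch under the layouts listed above; the class name `Reader` and its method names are assumptions made for illustration, not the API of rip-db or of the DuckDB UI client.

```javascript
// Sketch only: `Reader` is a hypothetical helper, not part of rip-db.
class Reader {
  constructor(bytes) {
    this.bytes = bytes;
    this.view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    this.pos = 0;
  }

  uint8() {
    return this.view.getUint8(this.pos++);
  }

  // Field ID: uint16, little-endian. 0xFFFF marks the end of an object.
  fieldId() {
    const id = this.view.getUint16(this.pos, true);
    this.pos += 2;
    return id;
  }

  // varint: 7 data bits per byte, high bit set while more bytes follow.
  varint() {
    let result = 0;
    let shift = 0;
    let byte;
    do {
      byte = this.uint8();
      result += (byte & 0x7f) * 2 ** shift; // multiplication avoids 32-bit overflow
      shift += 7;
    } while (byte & 0x80);
    return result;
  }

  // string: varint length followed by UTF-8 bytes.
  string() {
    const length = this.varint();
    const slice = this.bytes.subarray(this.pos, this.pos + length);
    this.pos += length;
    return new TextDecoder().decode(slice);
  }

  // list<T>: varint count followed by that many items.
  list(readItem) {
    const count = this.varint();
    const items = [];
    for (let i = 0; i < count; i++) items.push(readItem(this));
    return items;
  }

  // nullable<T>: presence byte, then the value only if present.
  nullable(readValue) {
    return this.uint8() === 0 ? null : readValue(this);
  }
}
```

For example, `new Reader(bytes).list((r) => r.string())` reads a `list<string>`. This version returns varints as plain numbers, so it is only exact up to 2^53.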

### Response Types

#### SuccessResult
```
field_100: boolean (true)
field_101: ColumnNamesAndTypes
field_102: list<DataChunk>
0xFFFF
```

#### ErrorResult
```
field_100: boolean (false)
field_101: string (error message)
0xFFFF
```
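
A client can tell the two result types apart by reading field 100 first. The sketch below assumes that every field value is preceded by its uint16 field ID (which is consistent with the `6400 01` prefix mentioned under Debugging) and that a boolean is a single byte. The function name `parseResultHeader` is hypothetical.

```javascript
// Sketch only: assumes each value is preceded by its uint16 field ID.
function parseResultHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let pos = 0;

  const expectField = (id) => {
    const got = view.getUint16(pos, true);
    pos += 2;
    if (got !== id) throw new Error(`expected field ${id}, got ${got}`);
  };

  const readVarint = () => {
    let result = 0;
    let shift = 0;
    let byte;
    do {
      byte = view.getUint8(pos++);
      result += (byte & 0x7f) * 2 ** shift;
      shift += 7;
    } while (byte & 0x80);
    return result;
  };

  expectField(100);
  const success = view.getUint8(pos++) !== 0;
  if (success) {
    // The caller continues with field 101 (ColumnNamesAndTypes) at bodyOffset.
    return { success: true, bodyOffset: pos };
  }

  expectField(101);
  const length = readVarint();
  const message = new TextDecoder().decode(bytes.subarray(pos, pos + length));
  pos += length;
  expectField(0xffff); // end marker
  return { success: false, message };
}
```

For an ErrorResult the function returns the decoded message; for a SuccessResult it returns the offset at which the column metadata starts, leaving the rest of the decoding to the caller.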

#### TokenizeResult
```
field_100: list<varint> (offsets)
field_101: list<varint> (token types)
0xFFFF
```

### ColumnNamesAndTypes
```
field_100: list<string> (column names)
field_101: list<Type> (column types)
0xFFFF
```

### Type
```
field_100: uint8 (LogicalTypeId)
field_101: nullable<TypeInfo>
0xFFFF
```

### DataChunk
```
field_100: varint (row count)
field_101: list<Vector>
0xFFFF
```

### Vector

```
field_100: uint8 (allValid flag: 0 = all values valid, non-zero = validity bitmap follows)
field_101: data (validity bitmap, present only if field_100 != 0)
field_102: data (values)
0xFFFF
```

**Validity bitmap:** LSB first, 1 = valid, 0 = NULL, size = ceil(rows/8) bytes.
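
The bitmap rule can be expressed in a few lines of JavaScript. This is a sketch with hypothetical helper names; it assumes the bitmap has already been extracted as a `Uint8Array`, or is `null` when the vector has no bitmap.

```javascript
// Sketch only: helper names are hypothetical.

// Number of bytes needed for `rowCount` validity bits.
function validityBitmapSize(rowCount) {
  return Math.ceil(rowCount / 8);
}

// Bit `row % 8` of byte `row / 8`, counting from the least significant bit.
// 1 means the row holds a value, 0 means NULL.
function isRowValid(bitmap, row) {
  return ((bitmap[row >> 3] >> (row & 7)) & 1) === 1;
}

// Replace values at NULL positions with null. A null bitmap means all rows are valid.
function applyValidity(values, bitmap) {
  if (bitmap === null) return values.slice();
  return values.map((value, row) => (isRowValid(bitmap, row) ? value : null));
}
```

For instance, a bitmap byte of `0b00000101` marks rows 0 and 2 as valid and row 1 as NULL.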

### LogicalTypeId

| ID | Type | Bytes |
|----|------|-------|
| 10 | BOOLEAN | 1 |
| 11 | TINYINT | 1 |
| 12 | SMALLINT | 2 |
| 13 | INTEGER | 4 |
| 14 | BIGINT | 8 |
| 15 | DATE | 4 (days since epoch) |
| 16 | TIME | 8 (microseconds) |
| 19 | TIMESTAMP | 8 (microseconds since epoch) |
| 22 | FLOAT | 4 |
| 23 | DOUBLE | 8 |
| 25 | VARCHAR | variable (list<string>) |
| 26 | BLOB | variable |
| 50 | HUGEINT | 16 |
| 54 | UUID | 16 |
| 100 | STRUCT | nested |
| 101 | LIST | nested |
| 102 | MAP | nested |
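
As an illustration of how the fixed-width rows of this table might be consumed, the sketch below decodes a packed little-endian column with a `DataView`. It assumes the integer types are signed and the values are tightly packed; the names `decodeFixedColumn`, `daysToDate`, and `microsToDate` are hypothetical, and HUGEINT, UUID, and the variable-width and nested types are left out.

```javascript
// Sketch only: covers a subset of the fixed-width types in the table above.
const FIXED_WIDTH_BYTES = { 10: 1, 11: 1, 12: 2, 13: 4, 14: 8, 15: 4, 16: 8, 19: 8, 22: 4, 23: 8 };

function decodeFixedColumn(typeId, bytes, rowCount) {
  const width = FIXED_WIDTH_BYTES[typeId];
  if (width === undefined) throw new Error(`type ${typeId} is not handled by this sketch`);
  if (bytes.byteLength < width * rowCount) throw new RangeError("column data too short");

  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const readAt = {
    10: (o) => view.getUint8(o) !== 0,       // BOOLEAN
    11: (o) => view.getInt8(o),              // TINYINT
    12: (o) => view.getInt16(o, true),       // SMALLINT
    13: (o) => view.getInt32(o, true),       // INTEGER
    14: (o) => view.getBigInt64(o, true),    // BIGINT
    15: (o) => view.getInt32(o, true),       // DATE: days since epoch
    16: (o) => view.getBigInt64(o, true),    // TIME: microseconds
    19: (o) => view.getBigInt64(o, true),    // TIMESTAMP: microseconds since epoch
    22: (o) => view.getFloat32(o, true),     // FLOAT
    23: (o) => view.getFloat64(o, true),     // DOUBLE
  }[typeId];

  const values = [];
  for (let row = 0; row < rowCount; row++) values.push(readAt(row * width));
  return values;
}

// Convert a DATE value (days since 1970-01-01) to a JavaScript Date.
function daysToDate(days) {
  return new Date(days * 86400000);
}

// Convert a TIMESTAMP value (microseconds since epoch, BigInt) to a Date.
// Sub-millisecond precision is dropped.
function microsToDate(micros) {
  return new Date(Number(micros / 1000n));
}
```

Validity is handled separately: positions marked NULL in the bitmap still occupy a slot in the values buffer, so the decoded array should be masked afterwards.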

---

## HTTP Endpoints

### Required for DuckDB UI

| Endpoint | Method | Request | Response |
|----------|--------|---------|----------|
| `/ddb/run` | POST | SQL text | Binary result |
| `/ddb/interrupt` | POST | Empty | Empty result |
| `/ddb/tokenize` | POST | SQL text | Binary tokens |
| `/info` | GET | - | Headers only |
| `/version` | GET | - | JSON `{"origin":"host",...}` |
| `/config` | GET | - | Proxied + version headers |
| `/localToken` | GET | - | Empty (no MotherDuck) |
| `/localEvents` | GET | - | SSE stream |

### Request Headers

| Header | Encoding | Purpose |
|--------|----------|---------|
| `Origin` | - | Security check (must match server URL) |
| `X-DuckDB-UI-Result-Row-Limit` | Plain | Max rows to return |
| `X-DuckDB-UI-Database-Name` | Base64 | Target database |
| `X-DuckDB-UI-Schema-Name` | Base64 | Target schema |
| `X-DuckDB-UI-Parameter-Count` | Plain | Prepared statement params |
| `X-DuckDB-UI-Parameter-Value-{n}` | Base64 | Param values |
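
A server handler might unpack these headers roughly as follows. This is a sketch, not rip-db's implementation: `parseUiHeaders` is a hypothetical name, `headers` is any object with a `get(name)` method such as the Fetch API `Headers`, the Base64 payloads are assumed to be UTF-8 text, and the parameter index `{n}` is assumed to start at 0.

```javascript
// Sketch only: header names come from the table above; everything else is assumed.

// Decode a Base64 header value into a UTF-8 string.
function decodeBase64Utf8(value) {
  const binary = atob(value);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

function parseUiHeaders(headers) {
  const base64OrNull = (name) => {
    const value = headers.get(name);
    return value === null || value === undefined ? null : decodeBase64Utf8(value);
  };

  const rowLimitRaw = headers.get("X-DuckDB-UI-Result-Row-Limit");
  const countRaw = headers.get("X-DuckDB-UI-Parameter-Count");
  const parameterCount = countRaw ? Number.parseInt(countRaw, 10) : 0;

  const parameters = [];
  for (let n = 0; n < parameterCount; n++) {
    // Assumption: parameter indices are zero-based.
    parameters.push(base64OrNull(`X-DuckDB-UI-Parameter-Value-${n}`));
  }

  return {
    rowLimit: rowLimitRaw ? Number.parseInt(rowLimitRaw, 10) : null,
    database: base64OrNull("X-DuckDB-UI-Database-Name"),
    schema: base64OrNull("X-DuckDB-UI-Schema-Name"),
    parameters,
  };
}
```

Missing optional headers come back as `null`, so the caller can fall back to its own defaults.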

### Response Headers (Required)

```
X-DuckDB-Version: 1.4.1
X-DuckDB-Platform: rip-db
X-DuckDB-UI-Extension-Version: 139-944c08a214
```

The UI checks `X-DuckDB-UI-Extension-Version` to decide between HTTP and WASM modes.
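
One way to make sure every response carries these headers is to merge them in a single helper. The sketch below uses the Fetch API `Headers` class available in Bun and recent Node versions; the helper name `uiResponseHeaders` is hypothetical and the values are copied from the listing above.

```javascript
// Sketch only: values copied from the listing above; helper name is hypothetical.
const UI_HEADERS = {
  "X-DuckDB-Version": "1.4.1",
  "X-DuckDB-Platform": "rip-db",
  "X-DuckDB-UI-Extension-Version": "139-944c08a214",
};

// Return a Headers object containing `extra` plus the required UI headers.
function uiResponseHeaders(extra = {}) {
  const headers = new Headers(extra);
  for (const [name, value] of Object.entries(UI_HEADERS)) {
    headers.set(name, value); // required headers win over caller-supplied ones
  }
  return headers;
}
```

The result can be passed as the `headers` option when constructing a response, for example `new Response(body, { headers: uiResponseHeaders({ "Content-Type": "application/octet-stream" }) })`.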

### Security

The official DuckDB server checks the `Origin` header and rejects requests that do not come from its own URL:
```cpp
if (origin != local_url) {
  res.status = 401; // UNAUTHORIZED
}
```

Our solution: Proxy UI assets from `ui.duckdb.org` so all requests are same-origin.
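
For comparison, an equivalent check on the JavaScript side could look like the sketch below. It mirrors the C++ fragment above rather than rip-db's actual code: `originStatus` is a hypothetical name, and a missing `Origin` header is treated as a mismatch, as it would be in the C++ comparison.

```javascript
// Sketch only: mirrors the C++ check above; not rip-db's actual code.
// Returns the HTTP status to use: 200 if the Origin matches, 401 otherwise.
function originStatus(headers, serverUrl) {
  const expected = new URL(serverUrl).origin; // e.g. "http://localhost:4213"
  const origin = headers.get("Origin");
  return origin === expected ? 200 : 401;
}
```

Because the UI assets are proxied through the same server, requests issued by the page carry the server's own origin and pass this check.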

---

## Building

```bash
cd packages/db

# Build (outputs to lib/{os}-{arch}/duckdb.node)
# ReleaseFast is the default, so no -Doptimize flag needed
zig build --prefix .

# Run tests
zig build test

# Debug build
zig build -Doptimize=Debug --prefix .
```

Building requires the DuckDB headers and library. Set `DUCKDB_DIR` if they are not installed under `/opt/homebrew`:
```bash
DUCKDB_DIR=/path/to/duckdb zig build
```

---

## Debugging

### Test endpoints

```bash
# Start server
rip db.rip :memory: --port 4213

# Test binary endpoint
curl -X POST http://localhost:4213/ddb/run \
  -H "Origin: http://localhost:4213" \
  -d "SELECT 42" | xxd | head -5

# The dump should start with 6400 01:
# field ID 100 (0x0064, little-endian), then 01 (true = success)
```

### Common Issues

| Issue | Cause | Fix |
|-------|-------|-----|
| 401 on /ddb/run | Wrong Origin | Load UI from localhost, not ui.duckdb.org |
| UI shows "WASM" | Missing version header | Add `X-DuckDB-UI-Extension-Version` |
| Syntax highlighting broken | /ddb/tokenize error | Check server logs |
| Segfault | Memory lifetime | Use Zig bindings, not JS FFI per-value |

---

## References

- [DuckDB UI GitHub](https://github.com/duckdb/duckdb-ui)
- [DuckDB UI Client](https://github.com/duckdb/duckdb-ui/tree/main/ts/pkgs/duckdb-ui-client)
- [BinaryDeserializer.ts](https://github.com/duckdb/duckdb-ui/blob/main/ts/pkgs/duckdb-ui-client/src/serialization/classes/BinaryDeserializer.ts)