dirsql 0.2.4 → 0.2.9

@@ -0,0 +1,124 @@
1
+ ---
2
+ canonical: https://thekevinscott.github.io/dirsql/guide/cli
3
+ ---
4
+
5
+ # Command-Line Interface
6
+
7
+ > Online: <https://thekevinscott.github.io/dirsql/guide/cli>
8
+
9
+ The `dirsql` command starts an HTTP server that exposes the same functionality as the SDK.
10
+
11
+ ## Installation
12
+
13
+ ::: code-group
14
+
15
+ ```bash [npm]
16
+ npx dirsql
17
+ ```
18
+
19
+ ```bash [PyPI]
20
+ uvx dirsql
21
+ ```
22
+
23
+ ```bash [Cargo]
24
+ # Installs the binary only (non-default feature)
25
+ cargo install dirsql --features cli
26
+ dirsql
27
+ ```
28
+
29
+ :::
30
+
31
+ ::: tip For Rust library consumers
32
+ The `cli` feature is **opt-in**. Adding `dirsql` as a library dependency (`cargo add dirsql`) pulls no CLI dependencies — only the core library. See the [Rust library README](https://github.com/thekevinscott/dirsql/tree/main/packages/rust) for details.
33
+ :::
34
+
35
+ ## Running the server
36
+
37
+ Run `dirsql` from the directory containing your files:
38
+
39
+ ```bash
40
+ dirsql
41
+
42
+ $ Running at localhost:7117
43
+ ```
44
+
45
+ ### Flags
46
+
47
+ | Flag | Default | Description |
48
+ |---|---|---|
49
+ | `--config <path>` | `./.dirsql.toml` | Path to the config file. The index is rooted at the directory containing this file. |
50
+ | `--host <addr>` | `localhost` | Bind address |
51
+ | `--port <n>` | `7117` | TCP port to bind |
52
+
53
+ ## HTTP API
54
+
55
+ ### `POST /query`
56
+
57
+ Run a SQL query. Request body is JSON:
58
+
59
+ ```json
60
+ {"sql": "SELECT title, author FROM posts WHERE draft = 0"}
61
+ ```
62
+
63
+ Response is a JSON array of row objects:
64
+
65
+ ```json
66
+ [
67
+ {"title": "Hello World", "author": "alice"},
68
+ {"title": "Second Post", "author": "bob"}
69
+ ]
70
+ ```
71
+
72
+ On error, the server returns a non-2xx status with a JSON body:
73
+
74
+ ```json
75
+ {"error": "syntax error near \"SLECT\""}
76
+ ```
77
+
78
+ Malformed SQL returns `400`, not `500`, because the client sent bad input. A missing or unreadable config returns `503`.
79
+
80
+ ```bash
81
+ curl -s http://localhost:7117/query \
82
+ -H 'content-type: application/json' \
83
+ -d '{"sql":"SELECT COUNT(*) AS n FROM posts"}' \
84
+ | jq
85
+ ```
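The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the function names `build_query_request` and `run_query` are illustrative and not part of `dirsql`, and the error handling assumes the `{"error": "..."}` body shape shown above.

```python
import json
import urllib.error
import urllib.request


def build_query_request(sql: str, base_url: str = "http://localhost:7117") -> urllib.request.Request:
    """Build the POST /query request (no network I/O happens here)."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, base_url: str = "http://localhost:7117") -> list:
    """Send the query and return the decoded JSON array of row objects."""
    try:
        with urllib.request.urlopen(build_query_request(sql, base_url)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as exc:
        # Assumption: non-2xx responses carry a JSON body of the form {"error": "..."}.
        detail = json.loads(exc.read().decode("utf-8")).get("error", "")
        raise RuntimeError(f"HTTP {exc.code}: {detail}") from exc
```

With a server running, `run_query("SELECT COUNT(*) AS n FROM posts")` would return the decoded row list; keeping request construction separate makes it easy to inspect or test without a server.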
86
+
87
+ ### `GET /events`
88
+
89
+ Opens a [Server-Sent Events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events) stream of change events. Each `data:` payload is the same JSON schema the SDK emits from [`db.watch()`](./watching.md#event-types):
90
+
91
+ ```
92
+ event: row
93
+ data: {"action":"insert","table":"posts","file_path":"posts/hello.json","row":{"title":"Hello World","author":"alice"},"old_row":null}
94
+
95
+ event: row
96
+ data: {"action":"update","table":"posts","file_path":"posts/hello.json","row":{"title":"Hello, world","author":"alice"},"old_row":{"title":"Hello World","author":"alice"}}
97
+
98
+ event: row
99
+ data: {"action":"delete","table":"posts","file_path":"posts/second.json","row":{"title":"Second Post","author":"bob"},"old_row":null}
100
+ ```
101
+
102
+ Errors during extraction appear as `{"action":"error",...}` events on the same stream. They do **not** terminate the stream — a malformed file is a per-event problem, not a server-wide one.
103
+
104
+ ```bash
105
+ curl -N http://localhost:7117/events
106
+ ```
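If you consume the stream from code rather than `curl`, the framing is simple to decode. This is a hedged sketch, not `dirsql`'s own client: `iter_sse_events` is a hypothetical helper, and it assumes one `data:` line per frame, as in the sample stream above.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of every `data:` line.

    `event:` lines and blank frame separators are skipped. Multi-line
    `data:` fields are not handled in this sketch.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


# Two frames copied from the sample stream above.
sample = [
    "event: row\n",
    'data: {"action":"insert","table":"posts","file_path":"posts/hello.json","row":{"title":"Hello World","author":"alice"},"old_row":null}\n',
    "\n",
    "event: row\n",
    'data: {"action":"delete","table":"posts","file_path":"posts/second.json","row":{"title":"Second Post","author":"bob"},"old_row":null}\n',
    "\n",
]

events = list(iter_sse_events(sample))
deleted_paths = [e["file_path"] for e in events if e["action"] == "delete"]
```

Any iterable of text lines works as input, so the same function can be fed from a streaming HTTP response.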
107
+
108
+ ## Piping event streams
109
+
110
+ The SSE stream is easy to tee into shell tools with `curl -N` plus `jq`:
111
+
112
+ ```bash
113
+ # Log every delete to a file
114
+ curl -N http://localhost:7117/events \
115
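+   | jq --unbuffered -cR 'ltrimstr("data: ") | fromjson? | select(.action=="delete")' \
116
+   >> deletes.log
117
+
118
+ # Alert on errors
119
+ curl -N http://localhost:7117/events \
120
+   | jq --unbuffered -cR 'ltrimstr("data: ") | fromjson? | select(.action=="error")' \
121
+   | while read -r line; do notify-send "dirsql error" "$line"; done
122
+ ```
123
+
124
+ (`-R` reads each line as a raw string, `ltrimstr("data: ")` strips the SSE `data:` framing, and `fromjson?` together with `select` drops the `event:` and blank lines. `--unbuffered` flushes each match immediately. Drop the framing steps if your SSE client is already parsing frames.)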
@@ -0,0 +1,205 @@
1
+ ---
2
+ canonical: https://thekevinscott.github.io/dirsql/guide/config
3
+ ---
4
+
5
+ # Configuration File
6
+
7
+ > Online: <https://thekevinscott.github.io/dirsql/guide/config>
8
+
9
+ `dirsql` can be configured with a `.dirsql.toml` file, allowing you to define tables declaratively without writing code.
10
+
11
+ ## Basic Example
12
+
13
+ ```toml
14
+ [dirsql]
15
+ ignore = ["node_modules/**", ".git/**"]
16
+
17
+ [[table]]
18
+ ddl = "CREATE TABLE posts (title TEXT, author TEXT)"
19
+ glob = "posts/*.json"
20
+ ```
21
+
22
+ The `format` is inferred from the glob extension (`.json` -> JSON, `.jsonl` -> JSONL, `.csv` -> CSV, etc.). Each JSON key maps to a column with the same name.
23
+
24
+ ## Loading a Config File
25
+
26
+ Pass the config file path to the `DirSQL` constructor:
27
+
28
+ ::: code-group
29
+
30
+ ```python [Python]
31
+ from dirsql import DirSQL
32
+
33
+ db = DirSQL(config="./my-project/.dirsql.toml")
34
+ await db.ready()
35
+ ```
36
+
37
+ ```rust [Rust]
38
+ use dirsql::DirSQL;
39
+
40
+ let db = DirSQL::builder()
41
+ .config("./my-project/.dirsql.toml")
42
+ .build()?;
43
+ ```
44
+
45
+ ```typescript [TypeScript]
46
+ import { DirSQL } from "dirsql";
47
+
48
+ // String argument is interpreted as a config file path.
49
+ const db = new DirSQL("./my-project/.dirsql.toml");
50
+ await db.ready;
51
+ ```
52
+
53
+ :::
54
+
55
+ By default, the scan root is the config file's parent directory. To override it, either pass `root` explicitly or declare `[dirsql].root` in the config file itself. If both are set, the explicit value wins and a warning is emitted.
56
+
57
+ ## Root Directory
58
+
59
+ By default, the config file's parent directory is the scan root. To index a different location, declare `[dirsql].root` (relative paths are resolved relative to the config file's parent):
60
+
61
+ ```toml
62
+ [dirsql]
63
+ root = "../data"
64
+ ignore = ["node_modules/**"]
65
+ ```
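The resolution rule can be sketched in a few lines. This is an illustration of the behaviour described above, not `dirsql`'s implementation; `resolve_root` is a hypothetical name and the paths are made up.

```python
import os.path
from typing import Optional


def resolve_root(config_path: str, root: Optional[str] = None) -> str:
    """Return the directory to scan for a given config file.

    With no `root`, the config file's parent directory is used. A relative
    `root` is resolved against that parent; an absolute `root` is used as-is.
    """
    base = os.path.dirname(os.path.abspath(config_path))
    if root is None:
        return base
    return os.path.normpath(os.path.join(base, root))


default_root = resolve_root("/srv/project/.dirsql.toml")
sibling_root = resolve_root("/srv/project/.dirsql.toml", "../data")
```

Here `"../data"` resolves next to the project directory rather than relative to the process's working directory.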
66
+
67
+ ## Supported Formats
68
+
69
+ | Extension | Format | Rows |
70
+ |---|---|---|
71
+ | `.json` | JSON | Object = 1 row, Array = many rows |
72
+ | `.jsonl`, `.ndjson` | JSONL | One row per line |
73
+ | `.csv` | CSV | One row per data line (header = columns) |
74
+ | `.tsv` | TSV | One row per data line (tab-separated) |
75
+ | `.toml` | TOML | One row per file |
76
+ | `.yaml`, `.yml` | YAML | Mapping = 1 row, Sequence = many rows |
77
+ | `.md` | Frontmatter | YAML frontmatter + body column |
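As a rough illustration of extension-based inference, the table can be expressed as a lookup keyed on the glob's suffix. This is a sketch under the assumption that only the final extension matters; `infer_format` is a hypothetical helper, not a `dirsql` API.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension -> format name, copied from the table above.
FORMATS = {
    ".json": "JSON",
    ".jsonl": "JSONL",
    ".ndjson": "JSONL",
    ".csv": "CSV",
    ".tsv": "TSV",
    ".toml": "TOML",
    ".yaml": "YAML",
    ".yml": "YAML",
    ".md": "Frontmatter",
}


def infer_format(glob: str) -> Optional[str]:
    """Return the format implied by the glob's extension, or None if unknown."""
    return FORMATS.get(PurePosixPath(glob).suffix.lower())


inferred = {g: infer_format(g) for g in ("posts/*.json", "_comments/{thread_id}/index.jsonl", "**/index.md")}
```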
78
+
79
+ ## Path Captures
80
+
81
+ Use `{name}` in glob patterns to extract path segments as columns:
82
+
83
+ ```toml
84
+ [[table]]
85
+ ddl = "CREATE TABLE comments (thread_id TEXT, body TEXT, author TEXT)"
86
+ glob = "_comments/{thread_id}/index.jsonl"
87
+ ```
88
+
89
+ The directory name (e.g., `abc123`) becomes the `thread_id` column value for every row in that file.
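To make the capture behaviour concrete, here is one way a `{name}` glob could be turned into a regular expression with named groups. It is a sketch of the idea only: `dirsql`'s actual matcher may differ, and `glob_to_regex` and `capture` are hypothetical names.

```python
import re
from typing import Optional

# Glob tokens handled by this sketch: {name}, **/, **, * and ?.
_TOKEN = re.compile(r"\{([A-Za-z_]\w*)\}|\*\*/|\*\*|\*|\?")


def glob_to_regex(glob: str) -> re.Pattern:
    """Translate a glob with {name} captures into a compiled regex."""
    out, pos = [], 0
    for m in _TOKEN.finditer(glob):
        out.append(re.escape(glob[pos:m.start()]))  # literal text between tokens
        token = m.group(0)
        if m.group(1):
            out.append(f"(?P<{m.group(1)}>[^/]+)")  # capture one path segment
        elif token == "**/":
            out.append("(?:.*/)?")                  # zero or more directories
        elif token == "**":
            out.append(".*")
        elif token == "*":
            out.append("[^/]*")                     # anything within a segment
        else:
            out.append("[^/]")                      # `?` matches one character
        pos = m.end()
    out.append(re.escape(glob[pos:]))
    return re.compile("".join(out))


def capture(glob: str, path: str) -> Optional[dict]:
    """Return the captured segments for `path`, or None if it does not match."""
    m = glob_to_regex(glob).fullmatch(path)
    return m.groupdict() if m else None


captured = capture("_comments/{thread_id}/index.jsonl", "_comments/abc123/index.jsonl")
```

Each captured value would then be added to every row extracted from that file.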
90
+
91
+ ## Nested Data
92
+
93
+ Use `each` to navigate into nested JSON structures:
94
+
95
+ ```toml
96
+ [[table]]
97
+ ddl = "CREATE TABLE items (name TEXT, price REAL)"
98
+ glob = "catalog/*.json"
99
+ each = "data.items"
100
+ ```
101
+
102
+ This extracts rows from `{"data": {"items": [...]}}`.
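The dotted path can be read as a sequence of key lookups. The sketch below shows that reading; `select_each` is a hypothetical helper, and wrapping a non-list result in a single-element list is an assumption rather than documented behaviour.

```python
import json


def select_each(document: dict, each: str) -> list:
    """Follow a dotted path such as "data.items" and return the rows found there."""
    node = document
    for key in each.split("."):
        node = node[key]
    # Assumption: a single object at the end of the path counts as one row.
    return node if isinstance(node, list) else [node]


content = '{"data": {"items": [{"name": "widget", "price": 9.5}, {"name": "gadget", "price": 12.0}]}}'
rows = select_each(json.loads(content), "data.items")
```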
103
+
104
+ ## Column Mapping
105
+
106
+ Use `columns` to map SQL column names to nested fields or path captures:
107
+
108
+ ```toml
109
+ [[table]]
110
+ ddl = "CREATE TABLE posts (display_name TEXT, body TEXT)"
111
+ glob = "posts/*.json"
112
+
113
+ [table.columns]
114
+ display_name = "metadata.author.name"
115
+ body = "body"
116
+ ```
117
+
118
+ ::: warning `[table.columns]` is a complete projection, not a partial rename
119
+ When a `[table.columns]` section is present, `dirsql` switches to fully
120
+ declarative projection: **only the columns listed in the mapping are
121
+ populated**. Any column in the DDL that is not mentioned in the mapping
122
+ is set to `NULL` for every row — the original key from the file is not
123
+ auto-copied.
124
+
125
+ This is intentional: `[table.columns]` means "here is exactly where
126
+ every column comes from", not "rename these specific keys".
127
+
128
+ **Trap to avoid.** A config like this:
129
+
130
+ ```toml
131
+ [[table]]
132
+ ddl = "CREATE TABLE comments (id TEXT, body TEXT, display_name TEXT)"
133
+ glob = "*.json"
134
+
135
+ [table.columns]
136
+ display_name = "author" # intended: "just rename author -> display_name"
137
+ ```
138
+
139
+ against a file `one.json`:
140
+
141
+ ```json
142
+ {"id": "a1", "body": "hello", "author": "Alice"}
143
+ ```
144
+
145
+ produces:
146
+
147
+ ```json
148
+ [{"id": null, "body": null, "display_name": "Alice"}]
149
+ ```
150
+
151
+ `id` and `body` are `NULL` because they are not listed in
152
+ `[table.columns]`. To keep them populated, add them to the mapping
153
+ explicitly:
154
+
155
+ ```toml
156
+ [table.columns]
157
+ id = "id"
158
+ body = "body"
159
+ display_name = "author"
160
+ ```
161
+ :::
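The projection rule in the warning above can be modelled in a few lines. This is a sketch of the semantics only, with hypothetical helper names (`lookup`, `project`); path captures are left out for brevity.

```python
from typing import Optional


def lookup(record: dict, dotted: str):
    """Resolve a dotted path like "metadata.author.name"; None if any step is missing."""
    node = record
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def project(record: dict, ddl_columns: list, mapping: Optional[dict] = None) -> dict:
    """Build one row for the given DDL columns.

    Without a mapping, each column is copied from the key of the same name.
    With a mapping, ONLY mapped columns are populated; the rest are None (NULL).
    """
    if mapping is None:
        return {col: record.get(col) for col in ddl_columns}
    return {
        col: lookup(record, mapping[col]) if col in mapping else None
        for col in ddl_columns
    }


record = {"id": "a1", "body": "hello", "author": "Alice"}
columns = ["id", "body", "display_name"]

partial = project(record, columns, {"display_name": "author"})
complete = project(record, columns, {"id": "id", "body": "body", "display_name": "author"})
```

`partial` reproduces the trap (`id` and `body` come back as `None`), while `complete` populates all three columns.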
162
+
163
+ ## Ignore Patterns
164
+
165
+ Paths matching the `ignore` list are skipped entirely; they are never scanned:
166
+
167
+ ```toml
168
+ [dirsql]
169
+ ignore = ["node_modules/**", ".git/**", "*.pyc", "__pycache__/**"]
170
+ ```
171
+
172
+ ## Strict Mode
173
+
174
+ By default, extra keys in file content are ignored and missing keys become NULL. Enable strict mode to error on mismatches:
175
+
176
+ ```toml
177
+ [[table]]
178
+ ddl = "CREATE TABLE posts (title TEXT, author TEXT)"
179
+ glob = "posts/*.json"
180
+ strict = true
181
+ ```
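The difference between the two modes can be sketched as follows. This assumes that "mismatch" covers both extra keys and missing keys, which the text above implies but does not spell out; `to_row` is a hypothetical helper, not a `dirsql` function.

```python
def to_row(record: dict, columns: list, strict: bool = False) -> dict:
    """Map a parsed record onto table columns.

    Lenient (default): extra keys are dropped, missing keys become None (NULL).
    Strict: any extra or missing key raises ValueError (assumed behaviour).
    """
    extra = sorted(set(record) - set(columns))
    missing = [col for col in columns if col not in record]
    if strict and (extra or missing):
        raise ValueError(f"strict mode: extra keys {extra}, missing keys {missing}")
    return {col: record.get(col) for col in columns}


lenient = to_row({"title": "Hi", "tags": ["a"]}, ["title", "author"])

try:
    to_row({"title": "Hi", "tags": ["a"]}, ["title", "author"], strict=True)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```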
182
+
183
+ ## Full Example
184
+
185
+ ```toml
186
+ [dirsql]
187
+ ignore = ["node_modules/**", ".git/**", "dist/**"]
188
+
189
+ [[table]]
190
+ ddl = "CREATE TABLE comments (thread_id TEXT, body TEXT, author TEXT, resolved INTEGER)"
191
+ glob = "_comments/{thread_id}/index.jsonl"
192
+
193
+ [[table]]
194
+ ddl = "CREATE TABLE documents (title TEXT, draft INTEGER, body TEXT)"
195
+ glob = "**/index.md"
196
+
197
+ [[table]]
198
+ ddl = "CREATE TABLE metrics (date TEXT, requests INTEGER, errors INTEGER)"
199
+ glob = "logs/*.csv"
200
+
201
+ [[table]]
202
+ ddl = "CREATE TABLE config (key TEXT, value TEXT)"
203
+ glob = "config/*.toml"
204
+ strict = true
205
+ ```
@@ -0,0 +1,160 @@
1
+ ---
2
+ canonical: https://thekevinscott.github.io/dirsql/guide/crdt
3
+ ---
4
+
5
+ # Collaboration with CRDTs
6
+
7
+ > Online: <https://thekevinscott.github.io/dirsql/guide/crdt>
8
+
9
+ `dirsql` treats the filesystem as the source of truth. That works well when a single process (or a single human) is writing, but breaks down for multi-writer collaboration: two peers editing the same file concurrently produce a merge conflict, not a merged result.
10
+
11
+ [Conflict-free Replicated Data Types](https://crdt.tech/) (CRDTs) solve that merge problem at the data-structure level, not the filesystem level. Two replicas that apply the same set of edits -- in any order, with any network partitions in between -- converge on the same final state, without a central arbiter.
12
+
13
+ This guide is **opinionated**. It picks one library, explains the integration pattern with `dirsql`, and names the alternatives so you can steer if your situation is different.
14
+
15
+ ## Recommendation: Automerge
16
+
17
+ Use [Automerge](https://automerge.org/) (specifically the 2.x series with [automerge-repo](https://automerge.org/automerge-repo/) for sync).
18
+
19
+ Why Automerge over the alternatives:
20
+
21
+ - **JSON-shaped document model.** Automerge docs look like nested maps, lists, and text. That matches `dirsql`'s one-object-per-file workflow -- each Automerge document is the thing your `extract` function projects into rows.
22
+ - **Cross-language SDKs that mirror `dirsql`'s parity story.** First-class Rust ([`automerge`](https://crates.io/crates/automerge)), TypeScript ([`@automerge/automerge`](https://www.npmjs.com/package/@automerge/automerge)), and Python ([`automerge`](https://pypi.org/project/automerge/)) implementations exist today, all driven by the same Rust core. If you already have all three `dirsql` SDKs in play, Automerge won't force a language monoculture on you.
23
+ - **Filesystem-friendly sync primitives.** `automerge-repo` ships a [`NodeFSStorageAdapter`](https://automerge.org/docs/repositories/storage/) that shards document history into regular files under a directory. That directory is exactly the kind of tree `dirsql` is designed to watch.
24
+ - **Binary format with a deterministic JSON view.** You never hand-edit a CRDT file, but every replica projects the same canonical JSON from the binary state. That canonical JSON is what you feed to `dirsql`'s extract function, so two peers that have synced will produce identical rows.
25
+
26
+ ## The integration shape
27
+
28
+ There are two files per logical "document":
29
+
30
+ ```
31
+ workspace/
32
+ posts/
33
+ hello/
34
+ doc.automerge <-- binary CRDT state (the source of truth for writers)
35
+ view.json <-- materialized JSON snapshot (written on each merge)
36
+ ```
37
+
38
+ - Writers (editors, sync peers, etc.) mutate `doc.automerge` through Automerge APIs.
39
+ - After every mutation, the writer serializes the current document to `view.json`. This is the file `dirsql` indexes.
40
+ - `dirsql` watches `view.json`, not `doc.automerge`. The CRDT file is an implementation detail of how the JSON got there.
41
+
42
+ This keeps `dirsql`'s extract function oblivious to CRDT semantics: it reads a plain JSON file, exactly as it would without Automerge.
43
+
44
+ ::: tip Why not `extract` directly from `.automerge`?
45
+ You could -- the Rust and Python Automerge SDKs let you load a binary doc and walk its fields. But it couples your table schema to the CRDT library version, makes `extract` non-pure (it allocates CRDT state on every file change), and buys nothing: the writer is the only place that can produce a valid Automerge blob, so it might as well produce the JSON view at the same time.
46
+ :::
47
+
48
+ ### Example: posts as Automerge documents
49
+
50
+ ::: code-group
51
+
52
+ ```python [Python]
53
+ from dirsql import DirSQL, Table
54
+ import json
55
+
56
+ db = DirSQL(
57
+ "./workspace",
58
+ tables=[
59
+ Table(
60
+ ddl="CREATE TABLE posts (id TEXT, title TEXT, body TEXT, updated INTEGER)",
61
+ # Match the JSON view, not the raw CRDT binary.
62
+ glob="posts/*/view.json",
63
+ extract=lambda path, content: [json.loads(content)],
64
+ ),
65
+ ],
66
+ )
67
+ ```
68
+
69
+ ```rust [Rust]
70
+ use dirsql::{DirSQL, Table};
71
+ // See `row_from_json` in getting-started.md.
72
+
73
+ let db = DirSQL::new(
74
+ "./workspace",
75
+ vec![
76
+ Table::new(
77
+ "CREATE TABLE posts (id TEXT, title TEXT, body TEXT, updated INTEGER)",
78
+ "posts/*/view.json",
79
+ |_path, content| vec![row_from_json(content)],
80
+ ),
81
+ ],
82
+ )?;
83
+ ```
84
+
85
+ ```typescript [TypeScript]
86
+ import { DirSQL, type TableDef } from 'dirsql';
87
+
88
+ const tables: TableDef[] = [
89
+ {
90
+ ddl: 'CREATE TABLE posts (id TEXT, title TEXT, body TEXT, updated INTEGER)',
91
+ glob: 'posts/*/view.json',
92
+ extract: (_path, content) => [JSON.parse(content)],
93
+ },
94
+ ];
95
+
96
+ const db = new DirSQL('./workspace', tables);
97
+ ```
98
+
99
+ :::
100
+
101
+ The Automerge writer (sketch, TypeScript):
102
+
103
+ ```typescript
104
+ import * as Automerge from '@automerge/automerge';
105
+ import { writeFileSync, readFileSync } from 'node:fs';
106
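+ type Post = { title: string; updated: number }; // document shape assumed by this sketch
107
+ // Load the existing CRDT doc (creating a new one is not shown here).
108
+ const bytes = readFileSync('workspace/posts/hello/doc.automerge');
109
+ let doc = Automerge.load<Post>(bytes);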
110
+
111
+ // Apply an edit.
112
+ doc = Automerge.change(doc, (d) => {
113
+ d.title = 'Hello, world';
114
+ d.updated = Date.now();
115
+ });
116
+
117
+ // Persist both the CRDT state and the materialized view.
118
+ writeFileSync('workspace/posts/hello/doc.automerge', Automerge.save(doc));
119
+ writeFileSync('workspace/posts/hello/view.json', JSON.stringify(doc));
120
+ ```
121
+
122
+ `dirsql`'s watcher picks up the change to `view.json`, re-runs `extract`, and emits an `update` row event. Queries over `posts` reflect the merged state without `dirsql` knowing Automerge exists.
123
+
124
+ ## Multi-writer, in practice
125
+
126
+ 1. Each peer runs an `automerge-repo` instance with the filesystem storage adapter pointed at its local `workspace/`.
127
+ 2. Peers sync via any transport `automerge-repo` supports ([WebSocket](https://automerge.org/docs/repositories/networking/), [BroadcastChannel](https://automerge.org/docs/repositories/networking/), or a custom adapter).
128
+ 3. On every sync, the repo applies incoming ops to the local CRDT, writes the updated `doc.automerge`, and the writer code re-serializes `view.json`.
129
+ 4. Every peer's `dirsql` sees the same eventual `view.json` and produces the same rows.
130
+
131
+ The key invariant: **`view.json` is a deterministic projection of `doc.automerge`**. Two peers that have converged on the CRDT state must produce byte-identical (or at least semantically identical) JSON views. Otherwise you get spurious `update` events that flap with sync order. To guarantee this, serialize with a stable key order: pass `sort_keys=True` to `json.dumps` in Python, or sort the keys yourself before calling `JSON.stringify` in TypeScript, which has no built-in sorting option.
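A minimal sketch of a deterministic view writer in Python, assuming the document has already been converted to plain dicts and lists. `render_view` is a hypothetical helper, and the fixed separators are an extra choice made here so that whitespace cannot differ between peers.

```python
import json


def render_view(doc: dict) -> str:
    """Serialize a document so that equal content always yields equal bytes."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# Same content, different key insertion order (as two peers might produce).
peer_a = {"title": "Hello, world", "updated": 1700000000000, "id": "hello"}
peer_b = {"id": "hello", "updated": 1700000000000, "title": "Hello, world"}

view_a = render_view(peer_a)
view_b = render_view(peer_b)
```

Because both peers write the same bytes, a writer can also compare the new view with the file on disk and skip the write when nothing changed.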
132
+
133
+ ## Tradeoffs vs plain files
134
+
135
+ When **plain files** are the right answer:
136
+
137
+ - Single writer. A solo user editing `posts/*.json` will never hit a merge conflict. Adding a CRDT is overhead.
138
+ - Human-readable history matters. `git diff` on a JSON file tells a story; `git diff` on a CRDT binary does not.
139
+ - Schema churn is frequent. Renaming a field in plain JSON is a `sed`; in a CRDT it's a migration.
140
+
141
+ When **CRDTs** earn their complexity:
142
+
143
+ - Multi-writer without a central server (local-first, peer-to-peer).
144
+ - Offline edits that need to merge on reconnect.
145
+ - Fine-grained collaborative editing (cursor-level merging of a shared text field).
146
+
147
+ Hybrid is common: keep configuration and reference data as plain files, and use CRDTs only for the documents that genuinely have multiple writers.
148
+
149
+ ## Alternatives we considered
150
+
151
+ - [**Yjs**](https://docs.yjs.dev/) -- the dominant JS CRDT, excellent for rich-text collaboration (it backs many of the production collab editors you've used). Skipped as the primary recommendation because its Rust port ([`yrs`](https://crates.io/crates/yrs)) and Python bindings lag the JS implementation. If your workload is browser-first and text-heavy, prefer Yjs.
152
+ - [**Loro**](https://loro.dev/) -- Rust-first CRDT with a clean API and good cross-language story. Worth watching; we'd consider it once its Python bindings are GA. Try it if you're Rust-centric and don't need Automerge's ecosystem.
153
+ - **Operational Transform / hand-rolled merge logic** -- don't. OT is correct but hard to implement right, and you lose the offline-peer story that CRDTs give you for free.
154
+ - **Git as the merge engine** -- tempting because `dirsql` already lives on the filesystem, but three-way merges of structured JSON produce garbage conflict markers that no extract function can parse. Use a CRDT.
155
+
156
+ ## See also
157
+
158
+ - [Ink & Switch's local-first essay](https://www.inkandswitch.com/local-first/) -- the design space CRDTs sit in.
159
+ - [Automerge documentation](https://automerge.org/docs/) -- API reference and sync-adapter guides.
160
+ - [`crdt.tech`](https://crdt.tech/) -- library survey across languages.
@@ -0,0 +1,216 @@
1
+ ---
2
+ canonical: https://thekevinscott.github.io/dirsql/guide/querying
3
+ ---
4
+
5
+ # Querying
6
+
7
+ > Online: <https://thekevinscott.github.io/dirsql/guide/querying>
8
+
9
+ Once a `DirSQL` instance is created, the initial directory scan is complete and you can run SQL queries against the indexed data.
10
+
11
+ ## Basic queries
12
+
13
+ ::: code-group
14
+
15
+ ```python [Python]
16
+ # All rows from a table
17
+ results = db.query("SELECT * FROM comments")
18
+
19
+ # Filter with WHERE
20
+ results = db.query("SELECT * FROM comments WHERE author = 'alice'")
21
+
22
+ # Aggregations
23
+ results = db.query("SELECT author, COUNT(*) as n FROM comments GROUP BY author")
24
+
25
+ # JOINs across tables
26
+ results = db.query("""
27
+ SELECT posts.title, authors.name
28
+ FROM posts
29
+ JOIN authors ON posts.author_id = authors.id
30
+ """)
31
+ ```
32
+
33
+ ```rust [Rust]
34
+ // All rows from a table
35
+ let results = db.query("SELECT * FROM comments")?;
36
+
37
+ // Filter with WHERE
38
+ let results = db.query("SELECT * FROM comments WHERE author = 'alice'")?;
39
+
40
+ // Aggregations
41
+ let results = db.query("SELECT author, COUNT(*) as n FROM comments GROUP BY author")?;
42
+
43
+ // JOINs across tables
44
+ let results = db.query(
45
+ "SELECT posts.title, authors.name \
46
+ FROM posts JOIN authors ON posts.author_id = authors.id"
47
+ )?;
48
+ ```
49
+
50
+ ```typescript [TypeScript]
51
+ // All rows from a table
52
+ const results = await db.query('SELECT * FROM comments');
53
+
54
+ // Filter with WHERE
55
+ const filtered = await db.query("SELECT * FROM comments WHERE author = 'alice'");
56
+
57
+ // Aggregations
58
+ const counts = await db.query('SELECT author, COUNT(*) as n FROM comments GROUP BY author');
59
+
60
+ // JOINs across tables
61
+ const joined = await db.query(`
62
+ SELECT posts.title, authors.name
63
+ FROM posts
64
+ JOIN authors ON posts.author_id = authors.id
65
+ `);
66
+ ```
67
+
68
+ :::
69
+
70
+ Any valid SQLite **SELECT** works. The in-memory database supports the full SQLite dialect including subqueries, CTEs, window functions, and aggregate functions. See [Read-only queries](#read-only-queries) below for why write statements (`INSERT`, `UPDATE`, `DELETE`, `DROP`, etc.) are rejected.
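To give a feel for what the dialect allows, here is a CTE combined with a window function. The sketch runs against Python's built-in `sqlite3` module with made-up rows rather than a real `DirSQL` instance, and assumes an SQLite build with window-function support (3.25 or newer); the SQL string itself is what you would pass to `db.query()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (author TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [("alice", "first"), ("alice", "second"), ("bob", "third")],
)

# CTE computes per-author counts; the window function ranks authors by count.
sql = """
    WITH counts AS (
        SELECT author, COUNT(*) AS n FROM comments GROUP BY author
    )
    SELECT author, n, RANK() OVER (ORDER BY n DESC) AS rnk
    FROM counts
    ORDER BY rnk
"""
ranked = conn.execute(sql).fetchall()
```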
71
+
72
+ ## Return format
73
+
74
+ `query()` returns a list of dicts (Python), a `Vec<HashMap>` (Rust), or an array of objects (TypeScript). Each entry maps column names to values.
75
+
76
+ ::: code-group
77
+
78
+ ```python [Python]
79
+ results = db.query("SELECT title, author FROM posts")
80
+ # [
81
+ # {"title": "Hello World", "author": "alice"},
82
+ # {"title": "Second Post", "author": "bob"},
83
+ # ]
84
+ ```
85
+
86
+ ```rust [Rust]
87
+ let results = db.query("SELECT title, author FROM posts")?;
88
+ // Vec<HashMap<String, Value>>
89
+ // [{"title": "Hello World", "author": "alice"}, ...]
90
+ ```
91
+
92
+ ```typescript [TypeScript]
93
+ const results = await db.query('SELECT title, author FROM posts');
94
+ // [
95
+ // { title: 'Hello World', author: 'alice' },
96
+ // { title: 'Second Post', author: 'bob' },
97
+ // ]
98
+ ```
99
+
100
+ :::
101
+
102
+ SQLite types map back to Python types:
103
+
104
+ | SQLite type | Python type |
105
+ |-------------|-------------|
106
+ | TEXT | `str` |
107
+ | INTEGER | `int` |
108
+ | REAL | `float` |
109
+ | BLOB | `bytes` |
110
+ | NULL | `None` |
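The same mapping can be observed with Python's built-in `sqlite3` module, which is used here only as a stand-in to illustrate the table above; it is not a claim about how `dirsql` performs the conversion internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One literal per SQLite storage class: TEXT, INTEGER, REAL, BLOB, NULL.
row = conn.execute(
    "SELECT 'alice' AS t, 42 AS i, 1.5 AS r, x'00ff' AS b, NULL AS n"
).fetchone()

python_types = [type(value).__name__ for value in row]
```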
111
+
112
+ ## Internal columns
113
+
114
+ `dirsql` adds internal tracking columns (`_dirsql_file_path`, `_dirsql_row_index`) to each table for file-change diffing. These columns are automatically excluded from `SELECT *` results, so day-to-day queries don't need to account for them.
115
+
116
+ If you want to know which file a row came from, you can name the tracking columns explicitly in the projection:
117
+
118
+ ::: code-group
119
+
120
+ ```python [Python]
121
+ rows = db.query("SELECT title, _dirsql_file_path FROM posts")
122
+ # [{"title": "Hello World", "_dirsql_file_path": "posts/hello.json"}, ...]
123
+ ```
124
+
125
+ ```rust [Rust]
126
+ let rows = db.query("SELECT title, _dirsql_file_path FROM posts")?;
127
+ // [{"title": "Hello World", "_dirsql_file_path": "posts/hello.json"}, ...]
128
+ ```
129
+
130
+ ```typescript [TypeScript]
131
+ const rows = await db.query('SELECT title, _dirsql_file_path FROM posts');
132
+ // [{ title: 'Hello World', _dirsql_file_path: 'posts/hello.json' }, ...]
133
+ ```
134
+
135
+ :::
136
+
137
+ Tracking columns are only returned when named explicitly — `SELECT *` continues to exclude them.
138
+
139
+ ## Read-only queries
140
+
141
+ `query()` accepts only read-only statements. Each statement is prepared on SQLite and then classified via `sqlite3_stmt_readonly`; anything SQLite itself flags as a write — `INSERT`, `UPDATE`, `DELETE`, `DROP`, `CREATE`, `ALTER`, `REPLACE`, `VACUUM`, `ANALYZE`, etc. — is rejected before any rows are produced.
142
+
143
+ This keeps the in-memory index consistent with the on-disk files that back it. Mutations only happen through the watcher/indexer pipeline: to change data, edit the underlying file and let the watcher re-extract rows.
144
+
145
+ ::: code-group
146
+
147
+ ```python [Python]
148
+ # Raises a RuntimeError; the index is unchanged.
149
+ db.query("DELETE FROM posts")
150
+ ```
151
+
152
+ ```rust [Rust]
153
+ // Returns DirSqlError::WriteForbidden; the index is unchanged.
154
+ let err = db.query("DELETE FROM posts").unwrap_err();
155
+ assert!(matches!(err, dirsql::DirSqlError::WriteForbidden));
156
+ ```
157
+
158
+ ```typescript [TypeScript]
159
+ // Rejects with an Error whose message explains writes are not accepted.
160
+ await expect(db.query('DELETE FROM posts')).rejects.toThrow(/read-only/i);
161
+ ```
162
+
163
+ :::
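You can get a feel for this behaviour with plain SQLite. Note that the sketch below uses a different mechanism from the one described above: it sets `PRAGMA query_only` on a standard-library `sqlite3` connection instead of classifying statements with `sqlite3_stmt_readonly`, which the Python module does not appear to expose. The table and rows are made up.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so no
# implicit transaction is opened around the statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE posts (title TEXT, author TEXT)")
conn.execute("INSERT INTO posts VALUES ('Hello World', 'alice')")

# From here on, SQLite itself refuses anything that would modify the database.
conn.execute("PRAGMA query_only = ON")


def try_query(sql: str):
    """Run a statement; return its rows, or a message if SQLite rejects it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"rejected: {exc}"


read_result = try_query("SELECT title FROM posts")
write_result = try_query("DELETE FROM posts")
remaining = try_query("SELECT COUNT(*) FROM posts")
```

The `DELETE` is refused and the row count is unchanged, which mirrors the guarantee that a rejected write leaves the index untouched.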
164
+
165
+ ## Error handling
166
+
167
+ Invalid SQL raises an exception:
168
+
169
+ ::: code-group
170
+
171
+ ```python [Python]
172
+ try:
173
+ db.query("NOT VALID SQL")
174
+ except Exception as e:
175
+ print(f"Query error: {e}")
176
+ ```
177
+
178
+ ```rust [Rust]
179
+ match db.query("NOT VALID SQL") {
180
+ Ok(results) => println!("{:?}", results),
181
+ Err(e) => eprintln!("Query error: {}", e),
182
+ }
183
+ ```
184
+
185
+ ```typescript [TypeScript]
186
+ try {
187
+ await db.query('NOT VALID SQL');
188
+ } catch (e) {
189
+ console.error(`Query error: ${e}`);
190
+ }
191
+ ```
192
+
193
+ :::
194
+
195
+ ## Empty results
196
+
197
+ Queries that match no rows return an empty collection:
198
+
199
+ ::: code-group
200
+
201
+ ```python [Python]
202
+ results = db.query("SELECT * FROM posts WHERE author = 'nobody'")
203
+ assert results == []
204
+ ```
205
+
206
+ ```rust [Rust]
207
+ let results = db.query("SELECT * FROM posts WHERE author = 'nobody'")?;
208
+ assert!(results.is_empty());
209
+ ```
210
+
211
+ ```typescript [TypeScript]
212
+ const results = await db.query("SELECT * FROM posts WHERE author = 'nobody'");
213
+ console.assert(results.length === 0);
214
+ ```
215
+
216
+ :::