sqlrite 0.1.25__tar.gz → 0.2.0__tar.gz
- sqlrite-0.2.0/CLAUDE.md +94 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/Cargo.lock +7 -7
- {sqlrite-0.1.25 → sqlrite-0.2.0}/Cargo.toml +11 -3
- {sqlrite-0.1.25 → sqlrite-0.2.0}/PKG-INFO +1 -1
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/package.json +1 -1
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/_index.md +8 -5
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/architecture.md +5 -4
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/file-format.md +31 -9
- sqlrite-0.2.0/docs/fts.md +319 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/mcp.md +29 -6
- sqlrite-0.2.0/docs/phase-8-plan.md +299 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/roadmap.md +29 -3
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/smoke-test.md +22 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/sql-engine.md +37 -2
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/supported-sql.md +40 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/README.md +8 -0
- sqlrite-0.2.0/examples/hybrid-retrieval/README.md +87 -0
- sqlrite-0.2.0/examples/hybrid-retrieval/hybrid_retrieval.rs +137 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/pyproject.toml +1 -1
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/python/Cargo.toml +1 -1
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/Cargo.toml +1 -1
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/README.md +1 -1
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/lib.rs +1 -1
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/main.rs +16 -5
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/meta_command/mod.rs +11 -6
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/db/table.rs +59 -2
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/executor.rs +276 -8
- sqlrite-0.2.0/src/sql/fts/bm25.rs +302 -0
- sqlrite-0.2.0/src/sql/fts/mod.rs +24 -0
- sqlrite-0.2.0/src/sql/fts/posting_list.rs +540 -0
- sqlrite-0.2.0/src/sql/fts/tokenizer.rs +101 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/mod.rs +315 -15
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/cell.rs +24 -0
- sqlrite-0.2.0/src/sql/pager/fts_cell.rs +317 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/header.rs +26 -7
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/mod.rs +663 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/overflow.rs +2 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/pager.rs +21 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/parser/create.rs +8 -5
- {sqlrite-0.1.25 → sqlrite-0.2.0}/.github/workflows/ci.yml +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/.github/workflows/release-pr.yml +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/.github/workflows/release.yml +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/.github/workflows/rust.yml +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/.gitignore +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/CODE_OF_CONDUCT.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/LICENSE +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/MAINTAINERS +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/Makefile +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/README.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/index.html +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/package-lock.json +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/src/App.svelte +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/src/app.css +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/src/main.ts +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/src/vite-env.d.ts +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/svelte.config.js +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/tsconfig.json +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/desktop/vite.config.ts +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/ask-backend-examples.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/ask.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/design-decisions.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/desktop.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/embedding.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/getting-started.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/pager.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/phase-7-plan.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/release-plan.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/release-secrets.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/storage-model.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/docs/usage.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/c/Makefile +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/c/hello.c +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/go/go.mod +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/go/hello.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/nodejs/hello.mjs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/python/hello.py +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/rust/quickstart.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/wasm/Makefile +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/wasm/index.html +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/examples/wasm/server.mjs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/images/SQLRite - Desktop.png +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/images/SQLRite Data Structures.png +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/images/SQLRite Simple SQL Execution High Level Diagram.png +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/images/SQLRite Simple SQL INSERT Execution High Level Diagram (Insert Row).png +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/images/SQLRite Simple SQL INSERT Execution High Level Diagram.png +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/images/SQLRite_logo.png +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/images/architecture.png +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/rust-toolchain.toml +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/AST.delete.example +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/AST.insert.exemple +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/AST.select.example +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/AST.update.example +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/CREATE TABLE sqlrite_schema.sql +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/CREATE_TABLE with duplicate.sql +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/CREATE_TABLE.sql +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/samples/INSERT.sql +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/scripts/bump-version.sh +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/README.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/ask.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/ask_test.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/conn.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/go.mod +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/rows.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/sqlrite.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/sqlrite_test.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/go/stmt.go +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/python/README.md +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/python/src/lib.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/python/tests/test_ask.py +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sdk/python/tests/test_sqlrite.py +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/src/lib.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/src/prompt.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/src/provider/anthropic.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/src/provider/mock.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/src/provider/mod.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/sqlrite-ask/tests/anthropic_http.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/ask/mod.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/ask/schema.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/connection.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/error.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/repl/mod.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/db/database.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/db/mod.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/db/secondary_index.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/hnsw.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/file.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/hnsw_cell.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/index_cell.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/interior_page.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/page.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/table_page.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/varint.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/pager/wal.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/parser/insert.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/parser/mod.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/parser/select.rs +0 -0
- {sqlrite-0.1.25 → sqlrite-0.2.0}/src/sql/tokenizer.rs +0 -0
sqlrite-0.2.0/CLAUDE.md
ADDED
|
@@ -0,0 +1,94 @@
|
|
|
1
|
+
# CLAUDE.md
|
|
2
|
+
|
|
3
|
+
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
|
4
|
+
|
|
5
|
+
## Project
|
|
6
|
+
|
|
7
|
+
SQLRite is a from-scratch SQLite-style embedded database written in Rust. It's published on crates.io as `sqlrite-engine` (imported as `use sqlrite::…` — the lib target keeps the short name) and ships as: a REPL binary (`sqlrite`), a Tauri 2 + Svelte 5 desktop app, a Model Context Protocol stdio server (`sqlrite-mcp`), a C FFI shim (`sqlrite-ffi`), and language SDKs (Python via PyO3, Node via napi-rs, Go via cgo, WASM via wasm-bindgen). Phases 1–7 are shipped; the current branch `phase-8-plan` drafts inverted-index + BM25 full-text search and hybrid retrieval.
|
|
8
|
+
|
|
9
|
+
## Workspace layout
|
|
10
|
+
|
|
11
|
+
`Cargo.toml` is a workspace whose members are: `.` (the engine, package `sqlrite-engine`, lib `sqlrite`), `desktop/src-tauri`, `sqlrite-ffi`, `sqlrite-ask`, `sqlrite-mcp`, `sdk/python`, `sdk/nodejs`. `sdk/wasm` and `sdk/go` are deliberately **not** workspace members (wasm32 target / cgo separation).
|
|
12
|
+
|
|
13
|
+
- `src/` — engine. Public API is `Connection`/`Statement`/`Rows`/`Row`/`Value` from [src/connection.rs](src/connection.rs), re-exported via [src/lib.rs](src/lib.rs). Any new SDK should bind only to this surface.
|
|
14
|
+
- `sqlrite-ask/` — pure-Rust LLM adapter (Anthropic/OpenAI/Ollama) for natural-language → SQL. The engine's `ask` feature provides the thin `ConnectionAskExt::ask` glue.
|
|
15
|
+
- `sqlrite-mcp/` — MCP stdio server. Seven tools: `list_tables`, `describe_table`, `query`, `execute`, `schema_dump`, `vector_search`, `ask`. `--read-only` opens with a shared lock and hides `execute`.
|
|
16
|
+
- `sqlrite-ffi/` — C ABI cdylib + generated `sqlrite.h` header. Backs the Go SDK and any C consumer.
|
|
17
|
+
- `desktop/` — Tauri 2 + Svelte 5 GUI. Embeds the engine directly (no FFI hop).
|
|
18
|
+
|
|
19
|
+
Architecture deep-dive: [docs/architecture.md](docs/architecture.md). The full doc index is [docs/_index.md](docs/_index.md).
|
|
20
|
+
|
|
21
|
+
## Engine data flow
|
|
22
|
+
|
|
23
|
+
SQL string → [src/sql/mod.rs](src/sql/mod.rs) `process_command` parses with the external `sqlparser` crate (SQLite dialect) → [src/sql/parser/](src/sql/parser/) trims the AST into internal structs (`CreateQuery`, `InsertQuery`, `SelectQuery`) → [src/sql/executor.rs](src/sql/executor.rs) runs the statement against the in-memory `Database` ([src/sql/db/database.rs](src/sql/db/database.rs)). On any write, auto-save serializes changed pages through [src/sql/pager/](src/sql/pager/) — 4 KiB pages, cell-encoded B-trees per table and index, WAL + crash-safe checkpoint, fs2 advisory locks. Vector search (Phase 7d) goes through [src/sql/hnsw.rs](src/sql/hnsw.rs); KNN uses a bounded-heap top-k in the executor. Transactions snapshot the in-memory state; ROLLBACK restores it. There is no query optimizer beyond the KNN/HNSW shortcut, no joins, no aggregates yet.
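The bounded-heap top-k mentioned above can be illustrated with a small sketch. This is not the engine's code: `Candidate` and `top_k_nearest` are hypothetical names, and the sketch assumes each row has already been reduced to a `(rowid, distance)` pair. It keeps a max-heap of at most `k` entries, so the farthest of the current best candidates is always the one evicted.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical (distance, rowid) candidate. Ordering is by distance first,
/// so a max-heap keeps the farthest of the current best-k on top.
#[derive(Debug)]
struct Candidate {
    distance: f32,
    rowid: i64,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp gives f32 a total order, so NaN cannot poison the heap.
        self.distance
            .total_cmp(&other.distance)
            .then_with(|| self.rowid.cmp(&other.rowid))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Return the k nearest (rowid, distance) pairs, closest first.
/// Memory stays O(k) no matter how many rows are scanned.
fn top_k_nearest(rows: impl IntoIterator<Item = (i64, f32)>, k: usize) -> Vec<(i64, f32)> {
    if k == 0 {
        return Vec::new();
    }
    let mut heap: BinaryHeap<Candidate> = BinaryHeap::with_capacity(k + 1);
    for (rowid, distance) in rows {
        heap.push(Candidate { distance, rowid });
        if heap.len() > k {
            heap.pop(); // evict the current farthest candidate
        }
    }
    // into_sorted_vec is ascending, i.e. closest first.
    heap.into_sorted_vec()
        .into_iter()
        .map(|c| (c.rowid, c.distance))
        .collect()
}
```

Compared with sorting every row and truncating, this costs O(N log k) time rather than O(N log N).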
|
|
24
|
+
|
|
25
|
+
## Commands
|
|
26
|
+
|
|
27
|
+
CI is the source of truth — the workspace excludes that follow are required because the desktop crate needs a Svelte build first and the PyO3/napi-rs cdylibs can't link standalone test binaries.
|
|
28
|
+
|
|
29
|
+
```sh
|
|
30
|
+
# Build / test the Rust workspace (matches CI)
|
|
31
|
+
cargo build --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs --all-targets
|
|
32
|
+
cargo test --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs
|
|
33
|
+
|
|
34
|
+
# Single test (exact name; --nocapture to see println!)
|
|
35
|
+
cargo test <test_name> -- --nocapture
|
|
36
|
+
|
|
37
|
+
# Lint (CI runs all three)
|
|
38
|
+
cargo fmt --all -- --check
|
|
39
|
+
cargo clippy --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs --all-targets
|
|
40
|
+
cargo doc --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs --no-deps
|
|
41
|
+
|
|
42
|
+
# Run the REPL (default features include cli + ask + file-locks)
|
|
43
|
+
cargo run # in-memory
|
|
44
|
+
cargo run -- path/to/db.sqlrite # open/create file
|
|
45
|
+
cargo run -- --readonly path/to/db.sqlrite # shared-lock open
|
|
46
|
+
|
|
47
|
+
# Crate-specific
|
|
48
|
+
cargo build --release -p sqlrite-ffi # C cdylib + sqlrite.h
|
|
49
|
+
cd desktop && npm install && npm run tauri dev # desktop app dev mode
|
|
50
|
+
cargo run -p sqlrite-mcp -- /path/to.sqlrite # MCP server (stdio)
|
|
51
|
+
|
|
52
|
+
# Release plumbing
|
|
53
|
+
scripts/bump-version.sh 0.2.0 # bumps version across 11 manifests
|
|
54
|
+
```
|
|
55
|
+
|
|
56
|
+
`SQLRITE_LLM_API_KEY` is required for the `.ask` REPL command, the engine's `ask` feature, and the MCP `ask` tool. Clippy is **not** `-D warnings` yet (intentional — see top of [.github/workflows/ci.yml](.github/workflows/ci.yml)); deny-by-default lints still fail CI.
|
|
57
|
+
|
|
58
|
+
## Project-specific conventions
|
|
59
|
+
|
|
60
|
+
- **Errors.** Single `SQLRiteError` enum (thiserror, six variants) with a project-wide `Result<T>` alias. All public APIs return typed errors; no panics. The enum hand-rolls `PartialEq` because `std::io::Error` doesn't derive it.
|
|
61
|
+
- **Storage isn't bincode.** Tables and indexes share a cell-encoded B-tree format with a 4 KiB page size; the file header carries a format version (currently v4 after the Phase 7a vector column work). The diff-based pager only writes changed pages. See [docs/file-format.md](docs/file-format.md) and [docs/pager.md](docs/pager.md).
|
|
62
|
+
- **B-tree commit strategy.** Bottom-up rebuild on every commit (O(N), correct-by-construction). No in-place splits — deferred design decision.
|
|
63
|
+
- **Feature gates matter.** `default = ["cli", "ask", "file-locks"]`. The REPL `[[bin]]` `required-features = ["cli", "ask"]`. WASM and lean library embeddings build with `default-features = false` to avoid rustyline / clap / fs2 / sqlrite-ask. Don't pull these into the always-on dependency set.
|
|
64
|
+
- **Don't reinvent the SQL parser.** `sqlparser` is the tokenizer and AST source; project code only narrows that AST. New SQL features start by mapping the existing `sqlparser` AST node, not by extending a custom grammar.
|
|
65
|
+
- **Phase numbering is real.** The roadmap is sequenced in [docs/roadmap.md](docs/roadmap.md); design discussions live at github.com/sqlrite/design and feature work generally tracks an open phase plan in `docs/phase-*-plan.md`. Treat in-flight phase plans as load-bearing context.
|
|
66
|
+
- **Concurrency.** Engine mutates state through `Arc<Mutex<_>>` (Tauri-friendly). On-disk concurrency uses fs2 advisory locks: shared for readers, exclusive for the single writer.
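As an illustration of the **Errors** convention in the list above, here is a minimal sketch of a hand-rolled `PartialEq` on an error enum that wraps `std::io::Error`. `DbError`, `DbResult`, and `parse_page_size` are hypothetical stand-ins, not the real `SQLRiteError` API, and the sketch uses only the standard library rather than thiserror. Comparing I/O errors by `ErrorKind` is one reasonable choice; the project may compare them differently.

```rust
use std::fmt;
use std::io;

/// Hypothetical error enum (the real one is said to have six variants).
#[derive(Debug)]
enum DbError {
    General(String),
    NotImplemented(String),
    Io(io::Error),
}

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DbError::General(msg) => write!(f, "general error: {msg}"),
            DbError::NotImplemented(msg) => write!(f, "not implemented: {msg}"),
            DbError::Io(err) => write!(f, "io error: {err}"),
        }
    }
}

impl std::error::Error for DbError {}

impl From<io::Error> for DbError {
    fn from(err: io::Error) -> Self {
        DbError::Io(err)
    }
}

// `io::Error` does not implement PartialEq, so `#[derive(PartialEq)]`
// would not compile here. This sketch compares I/O errors by kind.
impl PartialEq for DbError {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (DbError::General(a), DbError::General(b)) => a == b,
            (DbError::NotImplemented(a), DbError::NotImplemented(b)) => a == b,
            (DbError::Io(a), DbError::Io(b)) => a.kind() == b.kind(),
            _ => false,
        }
    }
}

/// Crate-wide alias in the spirit of the `Result<T>` alias described above.
type DbResult<T> = std::result::Result<T, DbError>;

/// Typed error instead of a panic on bad input.
fn parse_page_size(raw: &str) -> DbResult<u16> {
    raw.trim()
        .parse::<u16>()
        .map_err(|e| DbError::General(format!("bad page size: {e}")))
}
```

With `PartialEq` in place, tests can assert on whole `DbResult` values with `assert_eq!` instead of matching on variants by hand.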
|
|
67
|
+
|
|
68
|
+
## Knowledge Base
|
|
69
|
+
|
|
70
|
+
### Project-specific — `~/Documents/josh-obsidian-synced/Projects/rust_sqlite/`
|
|
71
|
+
|
|
72
|
+
- **Code:** `/Users/joaoh82/projects/rust_sqlite`
|
|
73
|
+
- **Context (read first):** `~/Documents/josh-obsidian-synced/Projects/rust_sqlite/context.md`
|
|
74
|
+
- **Notes (running journal):** `~/Documents/josh-obsidian-synced/Projects/rust_sqlite/notes.md`
|
|
75
|
+
- **Project wiki:** `~/Documents/josh-obsidian-synced/Projects/rust_sqlite/wiki/`
|
|
76
|
+
|
|
77
|
+
**How to use each:**
|
|
78
|
+
|
|
79
|
+
- `context.md` — stable background (product goals, stakeholders, domain). Read before starting non-trivial work. Update only when underlying facts change.
|
|
80
|
+
- `notes.md` — append-only dated journal. Add entries under `## YYYY-MM-DD` headings for decisions, blockers, TODOs, and incidents — anything worth preserving but not stable enough for `context.md`.
|
|
81
|
+
- `wiki/` — reference sub-docs (e.g. `Architecture.md`, `Local Dev Setup.md`, `Tech Services.md`). Create new files as topics emerge.
|
|
82
|
+
|
|
83
|
+
**When to save:**
|
|
84
|
+
|
|
85
|
+
- New stable fact about the product/domain → update `context.md`.
|
|
86
|
+
- A decision, incident, or working note → append a dated entry to `notes.md`.
|
|
87
|
+
- Reusable reference material (setup steps, credential locations, architecture) → new/updated file in `wiki/`.
|
|
88
|
+
|
|
89
|
+
### Cross-project knowledge — `~/Documents/josh-obsidian-synced/vault/`
|
|
90
|
+
|
|
91
|
+
- **General wiki:** `~/Documents/josh-obsidian-synced/vault/wiki/` — start at `_master-index.md`, then drill into the relevant topic's `_index.md`.
|
|
92
|
+
- **Raw dumps:** `~/Documents/josh-obsidian-synced/vault/raw/` — drop unprocessed research here as `YYYY-MM-DD-{slug}.md`.
|
|
93
|
+
|
|
94
|
+
Read the general wiki when the question isn't specific to this project. Drop raw research or imported notes into `vault/raw/` so it's captured even before it's distilled.
|
|
@@ -3817,7 +3817,7 @@ dependencies = [
|
|
|
3817
3817
|
|
|
3818
3818
|
[[package]]
|
|
3819
3819
|
name = "sqlrite-ask"
|
|
3820
|
-
version = "0.
|
|
3820
|
+
version = "0.2.0"
|
|
3821
3821
|
dependencies = [
|
|
3822
3822
|
"serde",
|
|
3823
3823
|
"serde_json",
|
|
@@ -3828,7 +3828,7 @@ dependencies = [
|
|
|
3828
3828
|
|
|
3829
3829
|
[[package]]
|
|
3830
3830
|
name = "sqlrite-desktop"
|
|
3831
|
-
version = "0.
|
|
3831
|
+
version = "0.2.0"
|
|
3832
3832
|
dependencies = [
|
|
3833
3833
|
"serde",
|
|
3834
3834
|
"serde_json",
|
|
@@ -3840,7 +3840,7 @@ dependencies = [
|
|
|
3840
3840
|
|
|
3841
3841
|
[[package]]
|
|
3842
3842
|
name = "sqlrite-engine"
|
|
3843
|
-
version = "0.
|
|
3843
|
+
version = "0.2.0"
|
|
3844
3844
|
dependencies = [
|
|
3845
3845
|
"clap",
|
|
3846
3846
|
"env_logger",
|
|
@@ -3857,7 +3857,7 @@ dependencies = [
|
|
|
3857
3857
|
|
|
3858
3858
|
[[package]]
|
|
3859
3859
|
name = "sqlrite-ffi"
|
|
3860
|
-
version = "0.
|
|
3860
|
+
version = "0.2.0"
|
|
3861
3861
|
dependencies = [
|
|
3862
3862
|
"cbindgen",
|
|
3863
3863
|
"serde",
|
|
@@ -3867,7 +3867,7 @@ dependencies = [
|
|
|
3867
3867
|
|
|
3868
3868
|
[[package]]
|
|
3869
3869
|
name = "sqlrite-mcp"
|
|
3870
|
-
version = "0.
|
|
3870
|
+
version = "0.2.0"
|
|
3871
3871
|
dependencies = [
|
|
3872
3872
|
"clap",
|
|
3873
3873
|
"libc",
|
|
@@ -3878,7 +3878,7 @@ dependencies = [
|
|
|
3878
3878
|
|
|
3879
3879
|
[[package]]
|
|
3880
3880
|
name = "sqlrite-nodejs"
|
|
3881
|
-
version = "0.
|
|
3881
|
+
version = "0.2.0"
|
|
3882
3882
|
dependencies = [
|
|
3883
3883
|
"napi",
|
|
3884
3884
|
"napi-build",
|
|
@@ -3888,7 +3888,7 @@ dependencies = [
|
|
|
3888
3888
|
|
|
3889
3889
|
[[package]]
|
|
3890
3890
|
name = "sqlrite-python"
|
|
3891
|
-
version = "0.
|
|
3891
|
+
version = "0.2.0"
|
|
3892
3892
|
dependencies = [
|
|
3893
3893
|
"pyo3",
|
|
3894
3894
|
"sqlrite-engine",
|
|
@@ -20,14 +20,14 @@ resolver = "3"
|
|
|
20
20
|
# dependency declaration change:
|
|
21
21
|
#
|
|
22
22
|
# [dependencies]
|
|
23
|
-
# sqlrite-engine = "0.
|
|
23
|
+
# sqlrite-engine = "0.2"
|
|
24
24
|
# # then in code: use sqlrite::{Database, …};
|
|
25
25
|
#
|
|
26
26
|
# Any workspace member here that depends on the engine uses the
|
|
27
27
|
# `package =` key so the import name stays `sqlrite` internally:
|
|
28
28
|
# sqlrite = { package = "sqlrite-engine", path = "…" }
|
|
29
29
|
name = "sqlrite-engine"
|
|
30
|
-
version = "0.
|
|
30
|
+
version = "0.2.0"
|
|
31
31
|
authors = ["Joao Henrique Machado Silva <joaoh82@gmail.com>"]
|
|
32
32
|
edition = "2024"
|
|
33
33
|
rust-version = "1.85"
|
|
@@ -65,6 +65,14 @@ required-features = ["cli", "ask"]
|
|
|
65
65
|
name = "quickstart"
|
|
66
66
|
path = "examples/rust/quickstart.rs"
|
|
67
67
|
|
|
68
|
+
# Phase 8d — hybrid retrieval (BM25 + vector cosine via raw arithmetic).
|
|
69
|
+
# Run with `cargo run --example hybrid-retrieval`. Pre-baked vectors
|
|
70
|
+
# (no embedding model dependency) over a 6-doc tech-blurb corpus,
|
|
71
|
+
# showing where pure-BM25, pure-vector, and 50/50 hybrid each win.
|
|
72
|
+
[[example]]
|
|
73
|
+
name = "hybrid-retrieval"
|
|
74
|
+
path = "examples/hybrid-retrieval/hybrid_retrieval.rs"
|
|
75
|
+
|
|
68
76
|
[features]
|
|
69
77
|
# Default build includes everything: the REPL binary (cli) and
|
|
70
78
|
# POSIX/Windows advisory file locks on the Pager (file-locks).
|
|
@@ -130,4 +138,4 @@ fs2 = { version = "0.4", optional = true }
|
|
|
130
138
|
# crate publishes to crates.io, and a path-only dep without a
|
|
131
139
|
# version field fails the manifest verification step. See PR #58
|
|
132
140
|
# retrospective in docs/roadmap.md.
|
|
133
|
-
sqlrite-ask = { version = "0.
|
|
141
|
+
sqlrite-ask = { version = "0.2", path = "sqlrite-ask", optional = true }
|
|
@@ -22,7 +22,11 @@ A small, hand-written guide to the SQLRite codebase — how it's structured, how
|
|
|
22
22
|
|
|
23
23
|
- [Ask — natural-language → SQL](ask.md) — the canonical reference for the `ask()` feature across every product surface (REPL, desktop, Rust library, Python / Node / Go / WASM SDKs, MCP server); env vars, defaults, prompt caching, security
|
|
24
24
|
- [Ask backend proxy templates](ask-backend-examples.md) — copy-paste backend examples for the WASM SDK's split design: Cloudflare Workers, Vercel Edge, Deno Deploy, Firebase Functions, AWS Lambda, Express, pure Node
|
|
25
|
-
- [MCP server (`sqlrite-mcp`)](mcp.md) — Phase 7h: SQLRite as a Model Context Protocol stdio server. Wiring into Claude Code / Cursor / `mcp-inspector`; the
|
|
25
|
+
- [MCP server (`sqlrite-mcp`)](mcp.md) — Phase 7h + 8e: SQLRite as a Model Context Protocol stdio server. Wiring into Claude Code / Cursor / `mcp-inspector`; the eight tools (`list_tables`, `describe_table`, `query`, `execute`, `schema_dump`, `vector_search`, `bm25_search`, `ask`); read-only mode; the JSON-RPC wire format
|
|
26
|
+
|
|
27
|
+
## Phase 8 — Full-text search + hybrid retrieval
|
|
28
|
+
|
|
29
|
+
- [FTS — full-text search + hybrid retrieval](fts.md) — the canonical reference for `CREATE INDEX … USING fts`, the `fts_match` / `bm25_score` scalar functions, the `try_fts_probe` optimizer hook, hybrid retrieval via raw arithmetic with `vec_distance_cosine`, persistence + the on-demand v4 → v5 file-format bump, and the `bm25_search` MCP tool
|
|
26
30
|
|
|
27
31
|
## Internals
|
|
28
32
|
|
|
@@ -42,11 +46,10 @@ As of May 2026, SQLRite has:
|
|
|
42
46
|
- WAL-backed persistence with crash-safe checkpointing, shared/exclusive lock modes, and real `BEGIN` / `COMMIT` / `ROLLBACK` transactions (Phase 4 complete)
|
|
43
47
|
- A stable public Rust embedding API plus C FFI shim and SDKs for Python, Node.js, Go, and WASM (Phase 5 complete except the optional 5f crate-polish task)
|
|
44
48
|
- A Tauri 2.0 + Svelte desktop app (Phase 2.5 complete)
|
|
45
|
-
- AI-era extensions across the product surface (Phase 7 complete
|
|
49
|
+
- AI-era extensions across the product surface (Phase 7 complete): VECTOR columns + HNSW indexes (7a-7d), JSON columns (7e), the `ask()` natural-language → SQL family across the REPL / desktop / Rust / Python / Node / Go / WASM (7g.1-7g.7), and the [`sqlrite-mcp`](mcp.md) Model Context Protocol server (7h + 7g.8)
|
|
50
|
+
- Full-text search + hybrid retrieval (Phase 8 complete): FTS5-style inverted index with BM25 ranking + `fts_match` / `bm25_score` scalar functions + `try_fts_probe` optimizer hook + on-disk persistence with on-demand v4 → v5 file-format bump (8a-8c), a worked hybrid-retrieval example combining BM25 with vector cosine via raw arithmetic (8d), and a `bm25_search` MCP tool symmetric with `vector_search` (8e). See [`docs/fts.md`](fts.md).
|
|
46
51
|
- A fully-automated release pipeline that ships every product to its registry on every release with one human action — Rust engine + `sqlrite-ask` + `sqlrite-mcp` to crates.io, Python wheels to PyPI (`sqlrite`), Node.js + WASM to npm (`@joaoh82/sqlrite` + `@joaoh82/sqlrite-wasm`), Go module via `sdk/go/v*` git tag, plus C FFI tarballs, MCP binary tarballs, and unsigned desktop installers as GitHub Release assets (Phase 6 complete)
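The BM25 ranking and hybrid scoring named in the Phase 8 bullet above can be sketched as follows. This is not the code in `src/sql/fts/`: the function names, the whitespace-style tokenizer, the conventional `k1 = 1.2` / `b = 0.75` defaults, the IDF variant, and the weighted-sum combination are all assumptions made for illustration.

```rust
use std::collections::HashMap;

// Conventional BM25 defaults; the engine's actual values are not shown in this diff.
const K1: f64 = 1.2;
const B: f64 = 0.75;

/// Hypothetical tokenizer: lowercase, split on non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// BM25 score of every document against `query`, in document order.
fn bm25_scores(docs: &[&str], query: &str) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    if tokenized.is_empty() {
        return Vec::new();
    }
    let n = tokenized.len() as f64;
    let avgdl = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / n;

    // Document frequency: in how many documents does each term appear?
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut seen: Vec<&str> = doc.iter().map(String::as_str).collect();
        seen.sort_unstable();
        seen.dedup();
        for term in seen {
            *df.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            let dl = doc.len() as f64;
            query_terms
                .iter()
                .map(|term| {
                    let tf = doc.iter().filter(|t| *t == term).count() as f64;
                    if tf == 0.0 {
                        return 0.0;
                    }
                    let n_t = *df.get(term.as_str()).unwrap_or(&0) as f64;
                    // Non-negative IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
                    let idf = (1.0 + (n - n_t + 0.5) / (n_t + 0.5)).ln();
                    idf * tf * (K1 + 1.0) / (tf + K1 * (1.0 - B + B * dl / avgdl))
                })
                .sum::<f64>()
        })
        .collect()
}

/// Assumed hybrid formula: weighted sum of BM25 and cosine similarity
/// (1 - cosine distance). `weight = 0.5` gives a 50/50 blend.
fn hybrid_score(bm25: f64, cosine_distance: f64, weight: f64) -> f64 {
    weight * bm25 + (1.0 - weight) * (1.0 - cosine_distance)
}
```

Raw BM25 scores are unbounded while cosine similarity is not, so a real blend may want to normalize BM25 first; the sketch leaves that out.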
|
|
47
52
|
|
|
48
|
-
The active frontier is **Phase 8 — Full-text search + hybrid retrieval** (the deferred 7f scope).
|
|
49
|
-
|
|
50
53
|
See the [Roadmap](roadmap.md) for the full phase plan.
|
|
51
54
|
|
|
52
55
|
## Release engineering
|
|
@@ -58,7 +61,7 @@ See the [Roadmap](roadmap.md) for the full phase plan.
|
|
|
58
61
|
## Future work
|
|
59
62
|
|
|
60
63
|
- [Phase 7 plan](phase-7-plan.md) — AI-era extensions (vector column type + HNSW, JSON, NL→SQL `ask()` API across REPL/library/SDKs/desktop/MCP, MCP server). **Implementation complete except 7f, which was deferred to Phase 8.**
|
|
61
|
-
- Phase 8 — Full-text search (FTS5-style BM25) + hybrid retrieval
|
|
64
|
+
- [Phase 8 plan](phase-8-plan.md) — Full-text search (FTS5-style BM25) + hybrid retrieval. The deferred 7f scope. **All six sub-phases (8a–8f) shipped.** Canonical reference: [`docs/fts.md`](fts.md).
|
|
62
65
|
|
|
63
66
|
## Conventions
|
|
64
67
|
|
|
@@ -89,18 +89,19 @@ The engine never depends on the SDK crates; the SDK crates each depend on the en
|
|
|
89
89
|
| Module | What it owns |
|
|
90
90
|
|---|---|
|
|
91
91
|
| [`src/main.rs`](../src/main.rs) | Binary entry: init env_logger, build rustyline editor, run the REPL loop, route input to either the meta or SQL dispatcher |
|
|
92
|
-
| [`src/lib.rs`](../src/lib.rs) | Library entry: re-exports `Connection`, `Statement`, `Rows`, `Value`, `Database`, `process_command`, the `ask` module (when feature on), etc. — the stable public surface every SDK binds against |
|
|
92
|
+
| [`src/lib.rs`](../src/lib.rs) | Library entry: re-exports `Connection`, `Statement`, `Rows`, `Value`, `Database`, `process_command` / `process_command_with_render` / `CommandOutput`, the `ask` module (when feature on), etc. — the stable public surface every SDK binds against |
|
|
93
93
|
| [`src/connection.rs`](../src/connection.rs) | `Connection` / `Statement` / `Rows` / `Row` / `OwnedRow` / `FromValue` — the Phase 5a public API |
|
|
94
94
|
| [`src/ask/`](../src/ask/) | Engine integration with `sqlrite-ask`: `ConnectionAskExt`, `ask_with_database`, the `schema::dump_schema_for_database` helper. The `schema` submodule is always available; the rest is gated behind the `ask` feature. Phase 7g.2. |
|
|
95
95
|
| [`src/repl/`](../src/repl/) | `REPLHelper` (implements rustyline's `Helper` trait: completer, hinter, highlighter, validator). Also `get_config` and `get_command_type` |
|
|
96
96
|
| [`src/meta_command/`](../src/meta_command/) | `MetaCommand` enum, parsing (`.open FOO.db` → `Open(PathBuf)`, `.ask <Q>` → `Ask(String)`), and dispatch to persistence + ask helpers |
|
|
97
97
|
| [`src/error.rs`](../src/error.rs) | `SQLRiteError` (thiserror-derived), `Result<T>` alias, hand-rolled `PartialEq` that handles `io::Error` |
|
|
98
|
-
| [`src/sql/mod.rs`](../src/sql/mod.rs) | `SQLCommand` classifier, `process_command` — the top-level
|
|
98
|
+
| [`src/sql/mod.rs`](../src/sql/mod.rs) | `SQLCommand` classifier, `process_command` / `process_command_with_render` — the top-level entries that parse a SQL string and route to the right executor. Also triggers auto-save. **Never writes to stdout** — for SELECT statements, the rendered prettytable comes back inside `CommandOutput.rendered` so the REPL can print it (the engine itself doesn't); the SDK / FFI / MCP callers ignore it. |
|
|
99
99
|
| [`src/sql/parser/`](../src/sql/parser/) | Takes a `sqlparser::ast::Statement` and produces internal query structs (`CreateQuery`, `InsertQuery`, `SelectQuery`) with only the fields we actually use |
|
|
100
|
-
| [`src/sql/executor.rs`](../src/sql/executor.rs) | `execute_select`, `execute_delete`, `execute_update`, plus the shared expression evaluator `eval_expr` / `eval_predicate`. Also the bounded-heap top-k optimization (Phase 7c)
|
|
100
|
+
| [`src/sql/executor.rs`](../src/sql/executor.rs) | `execute_select`, `execute_delete`, `execute_update`, plus the shared expression evaluator `eval_expr` / `eval_predicate`. Also the bounded-heap top-k optimization (Phase 7c), the HNSW probe shortcut (Phase 7d.2), and the FTS probe shortcut (Phase 8b). |
|
|
101
101
|
| [`src/sql/db/database.rs`](../src/sql/db/database.rs) | `Database`: table map + optional `source_path` + optional long-lived `Pager` + transaction-snapshot state |
|
|
102
102
|
| [`src/sql/db/table.rs`](../src/sql/db/table.rs) | `Table`, `Column`, `Row`, `Value` (in-memory storage incl. VECTOR + JSON columns); helpers for row iteration (`rowids`, `get_value`, `set_value`, `delete_row`, `insert_row`) |
|
|
103
103
|
| [`src/sql/hnsw.rs`](../src/sql/hnsw.rs) | Standalone HNSW algorithm — insert / search / layer assignment / beam search. Phase 7d.1. |
|
|
104
|
+
| [`src/sql/fts/`](../src/sql/fts/) | Full-text search — standalone tokenizer, BM25 scorer, and in-memory `PostingList` inverted index. Wired into the executor via the `fts_match` / `bm25_score` scalar functions and the `try_fts_probe` optimizer hook. Phase 8a-8b; persistence in 8c. See [`docs/fts.md`](fts.md). |
|
|
104
105
|
| [`src/sql/json.rs`](../src/sql/json.rs) | JSON column type + path-extraction functions (`json_extract`, `json_type`, `json_array_length`, `json_object_keys`). Phase 7e. |
|
|
105
106
|
| [`src/sql/pager/`](../src/sql/pager/) | On-disk file format and I/O — see [file-format.md](file-format.md) and [pager.md](pager.md) for details. WAL + checkpointer + shared/exclusive lock modes (Phase 4a-4e) live here. |
|
|
106
107
|
|
|
@@ -127,7 +128,7 @@ Steps 1–7 are purely in-memory; step 8 is the only disk contact, and after the
|
|
|
127
128
|
- **Planning**: intentionally not a thing yet. Execution is direct — a query plan is implicit in the executor code path.
|
|
128
129
|
- **Execution**: `src/sql/executor.rs` walks the internal structs, drives reads against `Table`, and writes via `Table::set_value` / `insert_row` / `delete_row`.
|
|
129
130
|
- **Storage (in memory)**: `src/sql/db/table.rs` — column-oriented `BTreeMap<rowid, value>` per column; indexes as separate `BTreeMap`s on UNIQUE/PK columns.
|
|
130
|
-
- **Storage (on disk)**: `src/sql/pager/` — 4 KiB pages, real B-Tree per table (Phase 3d), secondary indexes (3e), HNSW indexes as their own page tree (7d.3), WAL + crash-safe checkpointer (4c-4d), shared/exclusive lock modes (4e).
|
|
131
|
+
- **Storage (on disk)**: `src/sql/pager/` — 4 KiB pages, real B-Tree per table (Phase 3d), secondary indexes (3e), HNSW indexes as their own page tree (7d.3), FTS posting lists as their own page tree (8c, on-demand v5 file format), WAL + crash-safe checkpointer (4c-4d), shared/exclusive lock modes (4e).
|
|
131
132
|
- **Persistence policy**: `src/sql/mod.rs::process_command` for when to auto-save; `src/sql/pager/mod.rs::save_database` for how. Inside a `BEGIN`/`COMMIT` block, auto-save is suppressed and changes accumulate against an in-memory snapshot — `COMMIT` flushes the whole batch in one WAL frame; `ROLLBACK` restores the snapshot.
|
|
132
133
|
- **Error handling**: `src/error.rs` defines a single `SQLRiteError` enum used throughout, with `#[from]` conversions from `ParserError` and `io::Error`.
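The persistence policy in the list above (auto-save per statement, suppressed inside `BEGIN`/`COMMIT`, snapshot restored on `ROLLBACK`) can be sketched with a toy model. `MiniDb` and `Session` are hypothetical types, and `saves` is only a counter standing in for a real write to disk; the engine's `Database` and pager are considerably more involved.

```rust
use std::collections::BTreeMap;

/// Toy in-memory database: rowid -> value.
#[derive(Clone, Debug, Default, PartialEq)]
struct MiniDb {
    rows: BTreeMap<i64, String>,
}

/// Hypothetical session that snapshots state on BEGIN.
#[derive(Default)]
struct Session {
    db: MiniDb,
    /// `Some` while a transaction is open; holds the pre-BEGIN state.
    snapshot: Option<MiniDb>,
    /// Counts simulated saves to disk.
    saves: usize,
}

impl Session {
    fn begin(&mut self) {
        if self.snapshot.is_none() {
            self.snapshot = Some(self.db.clone());
        }
    }

    fn insert(&mut self, rowid: i64, value: &str) {
        self.db.rows.insert(rowid, value.to_string());
        // Outside a transaction every write auto-saves;
        // inside one, changes only accumulate in memory.
        if self.snapshot.is_none() {
            self.save();
        }
    }

    fn commit(&mut self) {
        // Flush the whole batch once and drop the snapshot.
        if self.snapshot.take().is_some() {
            self.save();
        }
    }

    fn rollback(&mut self) {
        // Restore the pre-BEGIN state; nothing is written.
        if let Some(previous) = self.snapshot.take() {
            self.db = previous;
        }
    }

    fn save(&mut self) {
        self.saves += 1;
    }
}
```

Cloning the whole state on `BEGIN` is the simplest correct approach and makes `ROLLBACK` trivial, at the cost of memory proportional to the database size.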
|
|
133
134
|
|
|
@@ -4,7 +4,7 @@ A SQLRite database is a single file, by convention named `*.sqlrite`. The file i
|
|
|
4
4
|
|
|
5
5
|
All multi-byte integers in this format are **little-endian**.
|
|
6
6
|
|
|
7
|
-
The current on-disk format is **version
|
|
7
|
+
The current on-disk format is **version 4** (Phase 7) by default, with **version 5** written on demand whenever an FTS index is attached to the database (Phase 8c). Decoders accept both v4 and v5; writers preserve the existing version on no-op resaves so a v4 database without FTS stays v4. Files produced by versions 1 – 3 are rejected on open.
|
|
8
8
|
|
|
9
9
|
## Page 0 — the database header
|
|
10
10
|
|
|
@@ -15,7 +15,7 @@ The first 4096 bytes of every file are the header page. Only the first 28 bytes
|
|
|
15
15
|
│ offset │ length │ content │
|
|
16
16
|
├────────┼────────┼─────────────────────────────────────────────────┤
|
|
17
17
|
│ 0 │ 16 │ magic: "SQLRiteFormat\0\0\0" │
|
|
18
|
-
│ 16 │ 2 │ format version (u16 LE) =
|
|
18
|
+
│ 16 │ 2 │ format version (u16 LE) = 4 or 5 │
|
|
19
19
|
│ 18 │ 2 │ page size (u16 LE) = 4096 │
|
|
20
20
|
│ 20 │ 4 │ total page count (u32 LE), includes page 0 │
|
|
21
21
|
│ 24 │ 4 │ root page of sqlrite_master (u32 LE) │
|
|
@@ -25,7 +25,7 @@ The first 4096 bytes of every file are the header page. Only the first 28 bytes
|
|
|
25
25
|
|
|
26
26
|
The magic string is 13 ASCII bytes (`SQLRiteFormat`) padded with three NUL bytes to fill 16 bytes. It's deliberately different from SQLite's `"SQLite format 3\0"` so the two formats can't be confused on inspection.
|
|
27
27
|
|
|
28
|
-
`decode_header` in [`src/sql/pager/header.rs`](../src/sql/pager/header.rs) validates all three of (magic, format version, page size) on open. A wrong magic produces `not a SQLRite database`; a wrong version or page size produces `unsupported ...` errors.
|
|
28
|
+
`decode_header` in [`src/sql/pager/header.rs`](../src/sql/pager/header.rs) validates all three of (magic, format version, page size) on open. A wrong magic produces `not a SQLRite database`; a wrong version or page size produces `unsupported ...` errors. The decoder accepts both v4 and v5 (anything else is rejected); the parsed `format_version` is propagated through the in-memory `DbHeader` so the writer can preserve it on resave when no version-bumping feature has been added.
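
The validation order described above can be sketched as follows. This is an illustration built only from the header table in this document, not the crate's `decode_header`: `HeaderSketch`, `decode_header_sketch`, and the exact error strings after the `unsupported` prefix are assumptions.

```rust
/// Hypothetical, reduced view of the header fields (not the crate's `DbHeader`).
#[derive(Debug, PartialEq)]
struct HeaderSketch {
    format_version: u16,
    page_size: u16,
    page_count: u32,
    master_root: u32,
}

/// 13 ASCII bytes plus three NUL bytes, 16 bytes in total.
const MAGIC: &[u8; 16] = b"SQLRiteFormat\0\0\0";
const PAGE_SIZE: u16 = 4096;

fn decode_header_sketch(page0: &[u8]) -> Result<HeaderSketch, String> {
    // Only the first 28 bytes of page 0 carry header fields.
    if page0.len() < 28 {
        return Err("header too short".to_string());
    }
    if page0[..16] != MAGIC[..] {
        return Err("not a SQLRite database".to_string());
    }
    // All multi-byte integers are little-endian.
    let format_version = u16::from_le_bytes([page0[16], page0[17]]);
    if format_version != 4 && format_version != 5 {
        return Err(format!("unsupported format version {}", format_version));
    }
    let page_size = u16::from_le_bytes([page0[18], page0[19]]);
    if page_size != PAGE_SIZE {
        return Err(format!("unsupported page size {}", page_size));
    }
    let page_count = u32::from_le_bytes([page0[20], page0[21], page0[22], page0[23]]);
    let master_root = u32::from_le_bytes([page0[24], page0[25], page0[26], page0[27]]);
    // The parsed version is kept so a writer could preserve it on resave.
    Ok(HeaderSketch { format_version, page_size, page_count, master_root })
}
```

Keeping `format_version` in the returned struct mirrors the point made above: the writer needs the parsed value to decide whether a resave stays at the same version.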
|
|
29
29
|
|
|
30
30
|
## Pages 1..page_count — payload pages
|
|
31
31
|
|
|
@@ -126,14 +126,16 @@ A cell is length-prefixed; its body starts with a `kind_tag` byte:
|
|
|
126
126
|
|
|
127
127
|
```
|
|
128
128
|
cell_length varint excludes itself; total bytes of kind_tag + body
|
|
129
|
-
kind_tag u8 0x01 = Local
|
|
130
|
-
0x02 = Overflow
|
|
131
|
-
0x03 = Interior
|
|
132
|
-
0x04 = Index
|
|
129
|
+
kind_tag u8 0x01 = Local (full row on a leaf)
|
|
130
|
+
0x02 = Overflow (pointer to spilled body)
|
|
131
|
+
0x03 = Interior (divider on an interior node)
|
|
132
|
+
0x04 = Index (entry in a secondary-index leaf)
|
|
133
|
+
0x05 = HNSW (Phase 7d.3 — one HNSW node)
|
|
134
|
+
0x06 = FTS Posting (Phase 8c — one FTS posting list)
|
|
133
135
|
body variable depends on kind_tag
|
|
134
136
|
```
|
|
135
137
|
|
|
136
|
-
The shared prefix means `Cell::peek_rowid` works uniformly across all
|
|
138
|
+
The shared prefix means `Cell::peek_rowid` works uniformly across all kinds — useful for binary search over a page's slot directory without decoding full bodies.
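
For reference, the tag values listed above map naturally onto an enum. The sketch below is illustrative only: `CellKind`, `from_tag`, and `tag` are hypothetical names, and the crate itself may represent the tags as plain constants such as `KIND_INDEX` and `KIND_FTS_POSTING`.

```rust
/// Hypothetical enum over the `kind_tag` byte values listed in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CellKind {
    Local,      // 0x01: full row on a leaf
    Overflow,   // 0x02: pointer to spilled body
    Interior,   // 0x03: divider on an interior node
    Index,      // 0x04: entry in a secondary-index leaf
    Hnsw,       // 0x05: one HNSW node
    FtsPosting, // 0x06: one FTS posting list
}

impl CellKind {
    /// Map a raw tag byte to a kind; unknown tags yield `None`.
    fn from_tag(tag: u8) -> Option<CellKind> {
        match tag {
            0x01 => Some(CellKind::Local),
            0x02 => Some(CellKind::Overflow),
            0x03 => Some(CellKind::Interior),
            0x04 => Some(CellKind::Index),
            0x05 => Some(CellKind::Hnsw),
            0x06 => Some(CellKind::FtsPosting),
            _ => None,
        }
    }

    /// Inverse of `from_tag`.
    fn tag(self) -> u8 {
        match self {
            CellKind::Local => 0x01,
            CellKind::Overflow => 0x02,
            CellKind::Interior => 0x03,
            CellKind::Index => 0x04,
            CellKind::Hnsw => 0x05,
            CellKind::FtsPosting => 0x06,
        }
    }
}
```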
|
|
137
139
|
|
|
138
140
|
### Local cell body
|
|
139
141
|
|
|
@@ -205,6 +207,25 @@ value_body variable encoded per the Local cell's value-block rules
|
|
|
205
207
|
|
|
206
208
|
NULL values are never indexed — `SecondaryIndex::insert` skips them — so there's no null bitmap here; a non-null value is always present.
|
|
207
209
|
|
|
210
|
+
### FTS posting cell body — `KIND_FTS_POSTING` (0x06, Phase 8c)
|
|
211
|
+
|
|
212
|
+
Used on the leaves of an FTS index B-Tree. Each cell carries either a posting list for one term (`term`-bytes non-empty), or — in a single sidecar cell — the per-doc length map (`term`-bytes empty). The B-Tree key is `cell_id`, a sequential integer assigned at save time; it has no meaning beyond ordering cells within their tree (so `Cell::peek_rowid`'s slot-directory ordering still works without FTS-specific page plumbing).
|
|
213
|
+
|
|
214
|
+
```
|
|
215
|
+
cell_id zigzag varint sequential B-Tree slot key (1, 2, 3, ...)
|
|
216
|
+
term_len varint byte length of `term` (0 → sidecar cell)
|
|
217
|
+
term term_len bytes ASCII-lowercased term per Phase 8 Q3
|
|
218
|
+
count varint number of (rowid, value) pairs
|
|
219
|
+
for each:
|
|
220
|
+
rowid zigzag varint the row this entry refers to
|
|
221
|
+
value varint term frequency for this (term, row),
|
|
222
|
+
or doc length when term_len == 0
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
One sidecar cell with `term_len == 0` exists per index, holding `(rowid, doc_len)` pairs for every indexed doc — including any with zero-token text — so `total_docs` and `avg_doc_len` round-trip in BM25 even on degenerate corner cases. Posting cells follow, one per unique term in lexicographic order.
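
The body layout above can be exercised with a small encoder/decoder pair. This is a sketch under stated assumptions, not the crate's code: it assumes "varint" means unsigned LEB128 (7 data bits per byte, high bit as continuation) and "zigzag varint" means the usual zigzag mapping over `i64`. All function names are hypothetical, and the outer `cell_length` / `kind_tag` prefix is left out.

```rust
/// Append `v` as an unsigned LEB128 varint (assumed encoding).
fn write_varint(out: &mut Vec<u8>, mut v: u64) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Read one varint at `*pos`, advancing it; `None` on truncated or oversized input.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

fn zigzag(v: i64) -> u64 {
    ((v << 1) ^ (v >> 63)) as u64
}

fn unzigzag(u: u64) -> i64 {
    ((u >> 1) as i64) ^ -((u & 1) as i64)
}

/// Encode cell_id, term, and (rowid, value) pairs in the order shown above.
/// An empty `term` produces the sidecar cell shape.
fn encode_posting_body(cell_id: i64, term: &[u8], entries: &[(i64, u64)]) -> Vec<u8> {
    let mut out = Vec::new();
    write_varint(&mut out, zigzag(cell_id));
    write_varint(&mut out, term.len() as u64);
    out.extend_from_slice(term);
    write_varint(&mut out, entries.len() as u64);
    for &(rowid, value) in entries {
        write_varint(&mut out, zigzag(rowid));
        write_varint(&mut out, value);
    }
    out
}

/// Decode a body produced by `encode_posting_body`.
fn decode_posting_body(buf: &[u8]) -> Option<(i64, Vec<u8>, Vec<(i64, u64)>)> {
    let mut pos = 0usize;
    let cell_id = unzigzag(read_varint(buf, &mut pos)?);
    let term_len = read_varint(buf, &mut pos)? as usize;
    let end = pos.checked_add(term_len)?;
    let term = buf.get(pos..end)?.to_vec();
    pos = end;
    let count = read_varint(buf, &mut pos)?;
    let mut entries = Vec::new();
    for _ in 0..count {
        let rowid = unzigzag(read_varint(buf, &mut pos)?);
        let value = read_varint(buf, &mut pos)?;
        entries.push((rowid, value));
    }
    Some((cell_id, term, entries))
}
```

Calling `encode_posting_body(1, b"", &[(1, 12), (2, 0)])` yields the sidecar shape: `term_len` is 0 and each `value` is read as a doc length rather than a term frequency.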
|
|
226
|
+
|
|
227
|
+
A single posting cell that exceeds page capacity (~4 KiB) errors at save time; overflow chaining is a Phase 8.1 stretch goal. Every `(rowid, value)` pair costs at least two bytes under the varint encoding above, so one cell holds at most roughly 2,000 postings, and a term present in more rows than that will hit this limit.
|
|
228
|
+
|
|
208
229
|
## The schema catalog: `sqlrite_master`
|
|
209
230
|
|
|
210
231
|
The schema catalog is itself a table named `sqlrite_master`, stored in the same `TableLeaf` format as any user table. Its schema is hardcoded into the engine so the open path can bootstrap:
|
|
@@ -284,7 +305,8 @@ These are not all enforced on open — we validate the header strictly and rely
|
|
|
284
305
|
- **v1** (Phases 2 / 3a / 3b) — schema catalog and table data were opaque `bincode` blobs chained across typed payload pages.
|
|
285
306
|
- **v2** (Phases 3c / 3d) — cell-based storage and `sqlrite_master`. Phase 3d added interior pages without a version bump.
|
|
286
307
|
- **v3** (Phase 3e) — `sqlrite_master` gains a `type` column; secondary indexes persist as their own cell-based B-Trees whose leaves carry `KIND_INDEX` cells.
|
|
287
|
-
- **v4** (Phase 7a
|
|
308
|
+
- **v4** (Phase 7a) — value block dispatch gains the `0x04 Vector` tag for the new `VECTOR(N)` column type. Per the [Phase 7 plan's Q8](phase-7-plan.md#q8-file-format-version-bump), later Phase 7 sub-phases (JSON storage, HNSW indexes) added their own value/cell tags inside this same v4 envelope. The `CREATE TABLE` SQL stored in `sqlrite_master` carries vector columns as `VECTOR(N)` in the type position; on open, the engine re-parses that SQL and reconstructs `DataType::Vector(N)` from the `Custom` AST node sqlparser produces.
|
|
309
|
+
- **v5** (Phase 8c, current for FTS-bearing files) — adds the `KIND_FTS_POSTING` cell tag for persisted FTS posting lists. Bumped **on demand** per the [Phase 8 plan's Q10](phase-8-plan.md#q10-file-format-version-bump-strategy): existing v4 databases without FTS keep writing v4 across non-FTS saves; the first save with at least one FTS index attached promotes the file to v5. Decoders accept both v4 and v5; opening a v4 file with a build that supports v5 is a no-op until the user creates an FTS index.
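
The on-demand bump described in the v5 entry amounts to a small decision at save time. The sketch below is one possible reading of that rule, not the crate's writer: `version_to_write` is a hypothetical helper, and leaving a v5 file at v5 after its FTS indexes are gone is an assumption this document does not confirm.

```rust
const FORMAT_V4: u16 = 4;
const FORMAT_V5: u16 = 5;

/// Hypothetical helper: pick the version to stamp into the header on save.
/// `existing` is the version parsed on open (or `FORMAT_V4` for a new file).
fn version_to_write(existing: u16, has_fts_index: bool) -> u16 {
    if has_fts_index {
        // The first save with an FTS index attached promotes the file.
        FORMAT_V5
    } else {
        // Assumption: with no FTS index, keep whatever version the file had.
        existing
    }
}
```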
|
|
288
310
|
|
|
289
311
|
The page header (7 bytes) and chaining mechanism are stable across future phases. Phase 4's WAL introduces a sibling file (`.sqlrite-wal`) rather than changing the main file format.
|
|
290
312
|
|