latticesql 4.0.1 → 4.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -19,11 +19,61 @@ Every AI agent session starts cold — no memory of what happened yesterday, wha
19
19
  3. **Ingests** agent-written output back into the DB via the writeback pipeline
20
20
  4. **Manages** state with full CRUD, natural-key operations, seeding, and soft-delete
21
21
  5. **Optimizes** context with token budgets, relevance filtering, enrichment pipelines, and reward-scored memory
22
- 6. **Searches** full-text (FTS5 / `tsvector`, with a LIKE fallback) and semantically (bring-your-own embeddings + cosine similarity)
22
+ 6. **Searches** full-text (FTS5 / `tsvector`, with a LIKE fallback), semantically (bring-your-own embeddings, chunked + indexed), **hybrid** (Reciprocal Rank Fusion), and **graph-augmented** — with retrieval **eval metrics**, a health **`doctor`**, and a reproducible **benchmark** (see [docs/retrieval.md](docs/retrieval.md))
23
23
  7. **Organizes** everything into `.lattice` workspaces with a local browser GUI, a workspace dashboard, changelog/version history, and a SQL↔markdown context bridge that auto-renders on every write
24
24
 
25
25
  Lattice has no opinions about your schema, your agents, or your file format. You define the tables. You control the rendering. Lattice runs the sync loop.
26
26
 
27
+ **New in 4.2 (additive — every 4.1 caller runs unchanged):** **import structured
28
+ data by dropping a file into the assistant.** The **structured-source importer**
29
+ turns a JSON or `.xlsx` source into a schema (entities / dimensions / junctions)
30
+ and materializes it: Excel sheets become records (header + data-region detection),
31
+ per-slice tabs become read-only **views** (no duplicated rows), an **as-of date**
32
+ is detected (contents → name → Excel preamble → Claude fallback, or per-row from a
33
+ date column) so re-importing a newer period keeps a **point-in-time snapshot**
34
+ beside the prior one, and a re-upload is fingerprinted + matched to existing tables
35
+ (new snapshot, not duplicate tables). It's reachable only by dropping a file in the
36
+ assistant rail: a confident match + detected date imports silently; otherwise an
37
+ **inline confirm card** proposes the schema, date, and mode before anything is
38
+ written (applied via `POST /api/import/apply`). New exports: `inferSchema`,
39
+ `inferFieldType`, `normalizeName`, `sourceRecords`, `materializeImport`,
40
+ `detectAsOf`, `detectAsOfCandidates`, `detectAsOfColumns`, `parseCellDate`,
41
+ `matchSchemaToExisting`, `renameEntities`, `excelToRecords`,
42
+ `dedupeAndDetectViews` (+ types). See **[docs/importing.md](docs/importing.md)**.
43
+ 4.2 also tightens correctness and the read/egress posture: **retrieval bounding**
44
+ (`/api/history` clamps its `limit`; semantic search clamps `topK` before the
45
+ candidate fan-out; the no-index embedding scan takes an opt-in `maxScanChunks`
46
+ that fails loudly with `EmbeddingScanTooLargeError` rather than silently truncate
47
+ — off by default), an **import file-size cap enforced on BOTH the upload and the
48
+ apply-time read** (50 MB), a **genuinely failable retrieval-quality gate**
49
+ (cross-topic golden corpus + a generated sub-perfect committed baseline +
50
+ `npm run eval:gate` in CI; the benchmark now asserts a real pgvector index before
51
+ timing the vector phase; an advisory `npm run slo:gate`), **per-recipient scoping
52
+ of realtime delete events** (a deleted row's pk/existence is no longer fanned out
53
+ to members who couldn't read it), **symmetric many-to-many junction rendering**
54
+ (both sides show the remote entity), and a **Windows credential-store fix** (the
55
+ cross-process lock now retries transient `EPERM`/`EACCES`). All opt-in; absent the
56
+ opt-in, behavior is byte-identical to 4.1.
57
+
58
+ **New in 4.1 (additive — every 4.0 caller runs unchanged):** a measurable,
59
+ production-grade **retrieval & data substrate**. Retrieval gets _measurable_:
60
+ `evaluateRetrieval` reports the standard IR metrics (Precision@k / Recall@k / MRR /
61
+ nDCG / MAP) over any ranked retriever, `lattice doctor` reports index/embedding
62
+ health, and `benchmarkRetrieval` produces reproducible latency/throughput numbers.
63
+ Search gets _better_: **chunked + contextual embeddings** (`semanticChunker`,
64
+ `EmbeddingsConfig.chunker`), an opt-in **indexed vector** path (`buildVectorIndex` —
65
+ pgvector HNSW / sqlite-vec, with the in-process scan as fallback), **hybrid search**
66
+ (`hybridSearch` — Reciprocal Rank Fusion of vector + full-text) with **ranking
67
+ signals** + an optional **reranker**, and **graph-augmented retrieval** (a typed-edge
68
+ graph with bounded BFS + `graphSearch` adjacency boosting). The query surface grows:
69
+ **bounded reads** (`maxRows` / `defaultMaxRows` → `BoundedReadError`), **projection**,
70
+ **OR/AND + `jsonPath` filters**, **SQL-side `aggregate`**, **keyset `queryPage`**,
71
+ **`distinctOn`**, and batched relation **`include`**. Plus governance (**immutable
72
+ provenance** + a **trust/verification** workflow), reliability (**`withRetry`** +
73
+ **online resumable migrations**), **computed columns / materialized rollups**, and
74
+ seamless **keyless cloud file-byte access** (an in-database SigV4 presigner). All
75
+ opt-in per table/call; absent the opt-in, behavior is byte-identical to 4.0.
76
+
27
77
  **New in 4.0 (major release — mostly drop-in):** a major version that decomposes the three largest internal modules and hardens the cloud path for many simultaneous users, while keeping the `Lattice` / GUI surface stable. **Existing 3.0+ configs and databases are migrated forward SILENTLY on open** — you usually need to do nothing. The breaking changes are auto-handled on open (or clearly documented): the per-field `ref:` shorthand is still parsed (and the GUI rewrites it to the explicit `relations:` block on disk); a legacy empty-string `deleted_at` is normalized to `NULL`; a legacy `files.path`-only row is backfilled into the reference model; the render manifest is v2-only and self-upgrades on first render. The one consumer-code change: the exported `MEMBER_GROUP` constant is replaced by **`memberGroupFor(db)`** (the member group role is now **per-cloud** — derived from the database/schema — so unrelated clouds on one Postgres cluster no longer share a group). Opening a cloud workspace is now **much faster** (one batched schema introspection instead of per-table round-trips; the owner-side RLS/grant convergence runs in the background since the owner is BYPASSRLS). See **[docs/MIGRATING-4.0.md](docs/MIGRATING-4.0.md)** for the (mostly no-op) migration and the manual steps for library/non-GUI consumers.
28
78
 
29
79
  **New in 3.4:** the browser GUI now **updates itself** — launched from an npm install it silently installs the latest published version and keeps checking in the background, relaunching on the same port while the open tab auto-reloads onto the new build (a git checkout / `npx` copy is left untouched); **file loopback** — editing a rendered `.md` context file on disk flows back into the database through the normal write path (changelog/versioned/undoable, live in the GUI), with a public [`reverseSyncFromFiles()`](docs/api-reference.md) for embedders; on a cloud, a member's **rendered context is now scoped to their own visibility** (rendered through their RLS connection + masking view, so their assistant only ever reads rows they may see); the assistant gains a **`get_row_context`** tool (reads a record's pre-joined rendered context in one call) and **`add_column`** (add a field to an existing table on request); and the cloud gets resilience + search fixes — the open-time converge is **per-table fault-isolated** (one un-manageable table no longer breaks the whole workspace), `migrate-to-cloud` now builds the full-text index (with a public [`rebuildFtsIndexes()`](docs/api-reference.md)), and a plaintext `postgres://` URL in a config is healed into an encrypted credential reference on open. New `GET /api/version`, `GET /api/update/status`, and `POST /api/workspaces/reload` endpoints. See [docs/workspaces.md](docs/workspaces.md), [docs/cloud.md](docs/cloud.md), and [docs/assistant.md](docs/assistant.md).