kaos-tabular 0.1.0__tar.gz

@@ -0,0 +1,11 @@
1
+ .coverage
2
+ .mypy_cache/
3
+ .pytest_cache/
4
+ .ruff_cache/
5
+ .venv/
6
+ __pycache__/
7
+ *.pyc
8
+ dist/
9
+ *.egg-info/
10
+ .mcp.json
11
+ .kaos-vfs/
@@ -0,0 +1,501 @@
1
+ # Changelog
2
+
3
+ All notable changes to `kaos-tabular` are documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [Unreleased]
9
+
10
+
11
+ ## [0.1.0] — 2026-05-20
12
+
13
+ ### Released
14
+
15
+ - 0.1.0 GA — WU-L of GA plan. First stable release. Public API frozen.
16
+ - Pin floor raised to `>=0.1.0,<0.2` across all kaos-* runtime and
17
+ optional dependencies. Refreshed `uv.lock` to pick up the 0.1.0
18
+ line of every upstream.
19
+
20
+ ### Internal
21
+
22
+ - WU-L of the 0.1.0 GA plan
23
+ (`kaos-modules/docs/plans/2026-05-20-0.1.0-ga-plan.md`).
24
+
25
+
26
+ ## [0.1.0rc1] — 2026-05-20
27
+
28
+ ### Changed
29
+
30
+ - Pin floor raised to `>=0.1.0rc1,<0.2` across kaos-* runtime and
31
+ optional dependencies (`kaos-core`, `kaos-content`, `kaos-mcp`).
32
+ Refreshed `uv.lock` to pick up the rc1 line of every upstream.
33
+
34
+ ### Internal
35
+
36
+ - WU-J of the 0.1.0 GA plan
37
+ (`kaos-modules/docs/plans/2026-05-20-0.1.0-ga-plan.md`).
38
+ Release candidate; freezes the public API for `kaos-tabular`
39
+ ahead of 0.1.0 GA.
40
+
41
+
42
+ ## [0.1.0a5] — 2026-05-20
43
+
44
+ ### Changed
45
+
46
+ - **kaos-core floor raised to `>=0.1.0a12`** (post-URI-redesign +
47
+ Capability type). WU-F.2 of the 0.1.0 GA plan; catch-up to
48
+ kaos-core 0.1.0a12. No public API changes in kaos-tabular.
49
+
50
+ ## [0.1.0a4] — 2026-05-17
51
+
52
+ ### Changed
53
+
54
+ - **kaos-core floor raised to `>=0.1.0a10`** to pick up the URI
55
+ contract redesign. Pass-through for kaos-tabular file-input tools.
56
+ See `kaos-modules/docs/plans/uri-contract-redesign.md`.
57
+
58
+ ## [0.1.0a3] — 2026-05-17
59
+
60
+ ### Changed
61
+
62
+ - **Both file-input MCP tools — `kaos-tabular-register` and
63
+ `kaos-tabular-read-file` — now route agent-supplied paths through
64
+ `kaos_core.path_resolver.resolve_input_path` instead of raw
65
+ `Path(p).exists()`.** Both tools accept four input shapes for the
66
+ `path` argument: an absolute filesystem path (CLI / tests), a
67
+ `kaos://artifacts/<id>` URI returned by a previous tool, a `kaos://`
68
+ VFS URI, and a session-scoped VFS-relative path (e.g. a CSV uploaded
69
+ through a host UI like `kaos-ui`'s single-user-chat SPA). Both
70
+ tools' `path` parameter schema now documents these four shapes so the
71
+ LLM can discover the feature without reading source. Implements
72
+ Stage 3 of `kaos-modules/docs/plans/vfs-blind-tools-audit-and-fix-plan.md`;
73
+ closes the kaos-tabular slice of the production NDA-hallucination
74
+ incident (session `01KRVYAEA3B1HG95DBAG6H0DJ3`) where file-input
75
+ tools could not see SPA uploads and the agent fabricated a
76
+ jurisdiction / term-length analysis citing the unreadable files. The
77
+ resolver performs the artifact-store / VFS reads inside an `async
78
+ with` context manager; bytes are materialised to a temp file just
79
+ for the duration of the eager DuckDB `CREATE TABLE ... AS SELECT *
80
+ FROM read_csv(...)` (register) or the synchronous `_read_file`
81
+ build (read-file), then cleaned up. On `InputPathResolutionError`
82
+ both tools return the resolver's agent-friendly three-part message
83
+ via `ToolResult.create_error(exc.to_agent_message())`. When the
84
+ input was itself a `kaos://artifacts/<id>` URI, the existing
85
+ artifact id round-trips into `structured_content` so downstream
86
+ tools and the SPA's `ArtifactCard` can re-resolve the same handle.
87
+ - **`kaos-core` dependency floor raised from `>=0.1.0a4` to
88
+ `>=0.1.0a9`** to pick up `kaos_core.path_resolver` (released in
89
+ kaos-core 0.1.0a9).
90
+
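The four accepted `path` shapes can be told apart by prefix. The sketch below is only an illustration of that dispatch order, not the logic of `kaos_core.path_resolver.resolve_input_path`, whose internals are not shown in this changelog. The function name `classify_input_path`, the returned labels, and the sample URIs are all hypothetical.

```python
from pathlib import PurePosixPath


def classify_input_path(path: str) -> str:
    """Label which of the four documented input shapes `path` looks like."""
    # Check the artifact prefix first: it is also a kaos:// URI.
    if path.startswith("kaos://artifacts/"):
        return "artifact-uri"
    if path.startswith("kaos://"):
        return "vfs-uri"
    if PurePosixPath(path).is_absolute():
        return "absolute-path"
    # Anything else is treated as a session-scoped, VFS-relative path.
    return "vfs-relative"


for candidate in (
    "kaos://artifacts/abc123",   # hypothetical artifact id
    "kaos://uploads/sales.csv",  # hypothetical VFS URI
    "/tmp/sales.csv",
    "uploads/sales.csv",
):
    print(candidate, "->", classify_input_path(candidate))
```

The ordering matters: testing the generic `kaos://` prefix before the artifact prefix would misclassify artifact URIs as plain VFS URIs.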
91
+ ## [0.1.0a2] — 2026-05-15
92
+
93
+ ### Fixed
94
+
95
+ - **Nine array parameters across the MCP tool catalog now declare
96
+ their element types.** Each was previously `type=array` with no
97
+ `items`, which OpenAI's strict JSON Schema validator rejected
98
+ with HTTP 400 `invalid_function_parameters`, taking down the
99
+ whole tool catalog for openai-provider turns. Per-tool fixes:
100
+ - `kaos-tabular-dedupe.columns`, `-correlate.columns`,
101
+ `-melt.columns`, `-join.on`, `-pivot.group_by`,
102
+ `-aggregate.group_by`, `-top-n.by` → `items: {type: "string"}`.
103
+ - `kaos-tabular-aggregate.aggregates` → typed object schema
104
+ `{func: enum[sum/avg/min/max/count/count_distinct/median/stddev/
105
+ variance/first/last], column: string, alias?: string}`. The
106
+ LLM now produces correct payloads on the first try instead of
107
+ trial-and-erroring across two ReAct iterations.
108
+ - `kaos-tabular-aggregate.order_by` → typed object schema
109
+ `{column: string, direction?: enum[asc/desc]}`.
110
+ kaos-core 0.1.0a7's defensive `items: {}` floor is belt +
111
+ suspenders.
112
+
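The untyped-array failure described above can be checked mechanically. This is a minimal sketch, assuming tool parameter schemas are plain JSON Schema dictionaries. The helper name `arrays_missing_items` and the sample schema are hypothetical and are not part of kaos-tabular.

```python
from typing import Any


def arrays_missing_items(schema: dict[str, Any], path: str = "$") -> list[str]:
    """Return locations of array schemas that do not declare `items`."""
    missing: list[str] = []
    if schema.get("type") == "array":
        if "items" not in schema:
            missing.append(path)
        elif isinstance(schema["items"], dict):
            # Recurse into the element schema (it may contain nested arrays).
            missing.extend(arrays_missing_items(schema["items"], f"{path}[]"))
    for name, sub in schema.get("properties", {}).items():
        if isinstance(sub, dict):
            missing.extend(arrays_missing_items(sub, f"{path}.{name}"))
    return missing


# Hypothetical parameter schema in the spirit of the aggregate tool.
params = {
    "type": "object",
    "properties": {
        "group_by": {"type": "array"},  # untyped: strict validators reject this
        "aggregates": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "func": {"type": "string", "enum": ["sum", "avg", "count"]},
                    "column": {"type": "string"},
                    "alias": {"type": "string"},
                },
                "required": ["func", "column"],
            },
        },
    },
}

print(arrays_missing_items(params))  # ['$.group_by']
```

A check like this could run as a unit test over every tool's schema so that a new array parameter without `items` fails locally rather than at the provider.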
113
+ ### Security
114
+
115
+ - **Documented the SQL-quoting safety contract on six query sites in
116
+ ``engine.py`` (bandit B608).** ``TabularEngine`` builds SQL via
117
+ f-strings against table/column/path inputs and routes every dynamic
118
+ fragment through one of two validating quoters:
119
+ ``_quote_ident`` (validates + double-quotes the identifier) or
120
+ ``_q_lit`` (doubles single quotes for SQL string literals). Bandit's
121
+ static B608 heuristic can't see the quoter — it just sees an
122
+ f-string concatenating SQL fragments — so every call site is
123
+ flagged as a possible SQL-injection vector. Added inline
124
+ ``# nosec B608`` comments at each site with a one-line justification
125
+ pointing at the relevant quoter; the quoting contract itself is
126
+ unchanged. Files: ``kaos_tabular/engine.py``.
127
+ - **bandit + vulture now run in both pre-commit and CI.** Two new
128
+ hooks in ``.pre-commit-config.yaml`` (bandit + vulture), mirrored
129
+ by two new jobs in ``security.yml`` (``bandit (static security)``
130
+ + ``vulture (dead-code scan)``). Pre-commit gives contributors fast
131
+ feedback before push; CI makes the scan publicly visible on every
132
+ PR. Skip lists justified inline. Mirrors the rollout from
133
+ kaos-core. **Depends on PR #1** (bandit B608 nosec justifications
134
+ in engine.py) — bandit will fail on this branch's first run until
135
+ #1 merges, then rebase clears it.
136
+ ### Changed
137
+
138
+ - **uv.lock bumped to the current PyPI-latest of three kaos-* siblings:**
139
+ ``kaos-content`` 0.1.0a2 → 0.1.0a4, ``kaos-core`` 0.1.0a4 →
140
+ 0.1.0a5, and ``kaos-mcp`` 0.1.0a1 → 0.1.0a2. All three bumps are
141
+ no-op for kaos-tabular's public API. 276 unit tests continue to
142
+ pass.
143
+
144
+ ## [0.1.0a1] — 2026-05-08
145
+
146
+ ### Added (structured shape tools + did-you-mean error suggestions)
147
+
148
+ A second pre-tag pass reconsidered the "tools earn their weight when
149
+ SQL is genuinely awkward" framing. The framing held for `pivot`,
150
+ `unpivot`, `join`, and `correlation`, but was too narrow for the
151
+ `GROUP BY` / `WHERE` / `ORDER BY ... LIMIT` trio: agents write that
152
+ SQL correctly, yes, but typed wrappers buy validation at the boundary,
153
+ structured-event audit (the call shows up in `engine.history()` as
154
+ `aggregate:<table>` instead of an opaque `query:` string), and
155
+ dialect-insulation if the engine ever grows a non-DuckDB backend.
156
+
157
+ Three new MCP tools (14 → 17) and matching public engine methods:
158
+
159
+ - **`kaos-tabular-aggregate`** + **`engine.aggregate(table, *, aggregates,
160
+ group_by=None, where=None, having=None, order_by=None, limit=None,
161
+ target=None)`**. Composed `GROUP BY`. `aggregates` is a list of
162
+ `(func, column[, alias])` tuples; `func` ∈ `{sum, avg, min, max,
163
+ count, count_distinct, median, stddev, variance, first, last}`.
164
+ Validates the table, every column, and every aggregate function
165
+ *before* SQL is generated, with did-you-mean suggestions on a miss.
166
+ `where` / `having` remain opaque DuckDB SQL fragments (predicate
167
+ shapes are unbounded). `order_by` items must reference either a
168
+ group_by column or an explicit aggregate alias; bare aggregate
169
+ expressions in `ORDER BY` are rejected at the wrapper.
170
+ - **`kaos-tabular-filter`** + **`engine.filter(table, *, where,
171
+ limit=None, target=None)`**. Typed `SELECT * WHERE`. The table is
172
+ validated; `where` is opaque DuckDB SQL. Useful when the caller
173
+ wants the call to show up in the structured history log under
174
+ `filter:<table>` instead of inside an opaque `query:` event.
175
+ - **`kaos-tabular-top-k`** + **`engine.top_k(table, *, by, n=10,
176
+ ascending=False, target=None)`**. `ORDER BY ... LIMIT N`. Defaults
177
+ to descending so "top N by units" reads naturally; pass
178
+ `ascending=True` for bottom-N.
179
+
180
+ ### Added (did-you-mean suggestions across the engine)
181
+
182
+ Every error path that mentions a missing table or column now carries
183
+ a `Did you mean '<closest match>'?` suggestion using
184
+ `difflib.get_close_matches` with a 0.6 cutoff. The cutoff is high
185
+ enough to avoid spurious matches on short identifiers (`id` / `ip`)
186
+ but low enough to forgive single-character typos on typical 6+
187
+ character column names.
188
+
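The cutoff behaviour described above can be reproduced with the standard library alone. This is a minimal sketch: `difflib.get_close_matches` and the 0.6 cutoff come from the entry, while the helper name `did_you_mean` and the sample column names are hypothetical and may differ from the module's own `_suggestions` and `_did_you_mean_fragment` helpers.

```python
import difflib


def did_you_mean(name: str, universe: list[str], cutoff: float = 0.6) -> str:
    """Return a " Did you mean ...?" fragment, or "" when nothing is close."""
    matches = difflib.get_close_matches(name, universe, n=1, cutoff=cutoff)
    return f" Did you mean '{matches[0]}'?" if matches else ""


columns = ["revenue", "region", "units"]
print(repr(did_you_mean("revenu", columns)))  # " Did you mean 'revenue'?"
print(repr(did_you_mean("id", ["ip"])))       # '' (similarity 0.5 is below 0.6)
print(repr(did_you_mean("revenue", [])))      # '' (empty universe)
```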
189
+ The mechanism is wired into `describe_table`, `sample`, `count`,
190
+ `find_duplicates`, `correlation`, `join` (both sides + `on=`),
191
+ `pivot`, `unpivot`, `export_table`, and the three new structured
192
+ shape methods (`aggregate`, `filter`, `top_k`). The aggregate
193
+ function whitelist also gets did-you-mean against the supported
194
+ function names. Module-level `_suggestions` and
195
+ `_did_you_mean_fragment` helpers are unit-tested in isolation against
196
+ the cutoff edge-cases (empty universe, no near-match, plural form);
197
+ the `TestExistingErrorPathsRetrofit` class pins the retrofit so a
198
+ future refactor can't silently drop suggestions from the older
199
+ analytical surfaces.
200
+
201
+ Test count: 216 → 276 unit tests (60 new in
202
+ `tests/unit/test_structured_ops.py`); coverage stays above the 70%
203
+ `fail_under` floor.
204
+
205
+ Quick benchmark (100k-row CSV, 5 distinct group keys): structured
206
+ `aggregate` runs 7.5 ms median vs. 3.4 ms for the equivalent raw
207
+ `execute` — ~4 ms validation overhead from two `information_schema`
208
+ lookups per call. The overhead is constant regardless of data size,
209
+ acceptable for interactive agent use; throughput-bound batch loops
210
+ should reach for `kaos-tabular-query` instead.
211
+
212
+ ### Fixed (post-release-review pass before tag)
213
+
214
+ External review found gaps the audit-01 sweep missed; all addressed
215
+ before tagging:
216
+
217
+ - **#1 P0: SQLite table-name SQL injection in `_register_sqlite`.**
218
+ `src_table` values from `sqlite_master` were interpolated raw into
219
+ the next `sqlite_scan('{path}', '{src_table}')` call. A crafted
220
+ SQLite file with a hostile table name could escape the literal and
221
+ execute injected DuckDB SQL. New module-level `_q_lit` helper
222
+ performs the standard `'` → `''` escape; both the path and the
223
+ `src_table` now flow through it. Adversarial test in
224
+ `tests/unit/test_sqlite_register.py::test_register_sqlite_hostile_table_name_does_not_inject`.
225
+ - **#2 P0: `save()` path SQL injection.** `EXPORT DATABASE '{p}'`
226
+ pasted the caller-supplied path directly. `save("'; ATTACH ...; --")`
227
+ could break out of the literal. Same `_q_lit` mitigation.
228
+ `export_table` (added in this release) was already correct but is
229
+ now consolidated onto `_q_lit` for consistency. Adversarial tests
230
+ in `tests/unit/test_path_injection.py`.
231
+ - **#3 P0: `duckdb` minimum lifted from `>=1.0` to `>=1.4.2`.** 1.0.0
232
+ has no cp313 wheel; 1.1.1 was the first cp313 release; 1.4.2 was
233
+ the first cp314 release. Since we support both 3.13 and 3.14, the
234
+ floor must clear both — pre-1.4.2 made the lowest-direct CI job
235
+ build duckdb from source on cp314, which is why min-deps took
236
+ 20+ minutes.
237
+ - **#4 P0: MCP tool annotations now match real behaviour.** Pre-fix,
238
+ every tool used `_TABULAR_ANNOTATIONS` with `openWorldHint=False`,
239
+ including ones that genuinely reach the filesystem
240
+ (`Register` / `Query` / `ReadFile`); `ExportTool` used
241
+ `_TABULAR_WRITE_ANNOTATIONS` with `destructiveHint=False` despite
242
+ writing/overwriting files. Split into three classes:
243
+ `_TABULAR_READ_ANNOTATIONS` (closed-world catalog reads — `List` /
244
+ `Describe` / `Sample` / `Count`), `_TABULAR_OPEN_READ_ANNOTATIONS`
245
+ (open-world filesystem reads — `Register` / `Query` / `ReadFile`),
246
+ `_TABULAR_WRITE_ANNOTATIONS` (open-world destructive writes —
247
+ `Export`, now `destructiveHint=True`). Agents make auto-approval
248
+ decisions on these flags; getting them right is the largest
249
+ actual safety improvement in this commit.
250
+ - **#5 P1: `_ENGINES` cache bounded with LRU + close-on-evict.**
251
+ Pre-fix, the per-session engine cache was an unbounded `dict`;
252
+ long-running streamable-HTTP servers leaked DuckDB connections
253
+ forever. Now an `OrderedDict` capped at
254
+ `_ENGINES_MAX_SESSIONS = 64`; the oldest engine is closed on
255
+ insert past capacity. TODO: replace with a proper kaos-mcp
256
+ per-session lifecycle hook at 0.1.0a2. Coverage in
257
+ `tests/unit/test_session_engines.py`.
258
+ - **#6 P1: stale integration assertion fixed.**
259
+ `tests/integration/test_mcp_tabular_pipeline.py` asserted the
260
+ pre-KTAB-007 error string `"Cannot infer format"`. Updated to the
261
+ current `"Cannot infer export format"`. CI doesn't gate the
262
+ integration tier today; raised as a separate platform tracker.
263
+ - **#7 P1: SECURITY.md scope rewritten for kaos-tabular.** The
264
+ template carried over from kaos-mcp listed LLM/program-execution/
265
+ cache/provider concerns that don't apply here. New scope names:
266
+ the DuckDB SQL boundary, file registration paths, export/write
267
+ paths, MCP tool surface, the SQLite extension network fetch, the
268
+ transitive dep supply chain.
269
+
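The `'` → `''` escape behind fixes #1 and #2 can be sketched in a few lines. Two assumptions here: the helper `q_lit` is a hypothetical stand-in that returns the literal with its surrounding quotes (the real `_q_lit` may differ), and the round-trip runs against the standard library's `sqlite3` instead of DuckDB so the example has no third-party dependency. Both engines use the same quote-doubling rule for string literals.

```python
import sqlite3


def q_lit(value: str) -> str:
    """Quote `value` as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


hostile = "x'); DROP TABLE t; --"

con = sqlite3.connect(":memory:")
# The hostile text stays inside the literal, so the query returns it as data.
(round_tripped,) = con.execute(f"SELECT {q_lit(hostile)}").fetchone()
con.close()

print(q_lit("a'b"))              # 'a''b'
print(round_tripped == hostile)  # True
```

Quote doubling only protects string-literal positions. Identifiers such as table and column names need their own validating quoter.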
270
+ ### Added (post-Kelvin-comparison surface expansion)
271
+
272
+ A pre-tag review against the legacy ``kelvin_tabular`` package
273
+ (roughly 60 MCP tools across inspection / manipulation / statistics /
274
+ quality / transformation categories) found that most of those tools
275
+ were SELECT one-liners that don't earn their weight when the agent
276
+ already has free-form SQL. The ones that *do* earn their weight are
277
+ the SQL-is-genuinely-awkward cases — joins where column ambiguity
278
+ catches agents writing `JOIN ON l.x = r.x` by hand, the
279
+ ``PIVOT`` / ``UNPIVOT`` syntax, long-form correlation matrices,
280
+ provenance tracing — and those are the six we ported. The package
281
+ explicitly does NOT ship Kelvin's full tree; SQL is the expression
282
+ layer for everything else.
283
+
284
+ Six new MCP tools (8 → 14) and matching public engine methods:
285
+
286
+ - **``kaos-tabular-history``** + **``engine.history(*, last_n=20)``**
287
+ + ``EngineEvent`` exported on the public surface. Returns the
288
+ recent register / query / drop events for the session — provenance
289
+ for agents tracing back what's been loaded.
290
+ - **``kaos-tabular-find-duplicates``** + **``engine.find_duplicates(table, *, columns=None)``**.
291
+ Returns rows that share their key with at least one other row,
292
+ via DuckDB ``QUALIFY COUNT(*) OVER (PARTITION BY …) > 1``. Default
293
+ ``columns=None`` uses every column (full-row duplicate detection).
294
+ - **``kaos-tabular-correlation``** + **``engine.correlation(table, *, columns=None)``**.
295
+ Pairwise Pearson correlation between numeric columns, returned as
296
+ long-form ``(col_a, col_b, corr)`` rows. Default auto-selects
297
+ every numeric column from the catalog.
298
+ - **``kaos-tabular-join``** + **``engine.join(left, right, *, on, how="inner", target=None)``**.
299
+ Wraps DuckDB's ``USING (col)`` clause so the join key appears
300
+ once in the result. ``how`` ∈ ``{inner, left, right, outer, semi,
301
+ anti, cross}``; ``target`` materializes via
302
+ ``CREATE OR REPLACE TABLE`` and registers.
303
+ - **``kaos-tabular-pivot``** + **``engine.pivot(table, *, on, using,
304
+ aggregate="sum", group_by=None, target=None)``**. Wraps DuckDB
305
+ ``PIVOT``. ``aggregate`` ∈ ``{sum, avg, min, max, count, first}``.
306
+ - **``kaos-tabular-unpivot``** + **``engine.unpivot(table, *, columns,
307
+ name_column="variable", value_column="value", target=None)``**.
308
+ Wraps DuckDB ``UNPIVOT``.
309
+
310
+ Each tool declares its own per-tool ``ToolAnnotations`` literal
311
+ (closed-world for catalog-only ops, open-world for arbitrary SQL,
312
+ destructive-write for ``export``). Engine methods emit 3-part
313
+ errors via ``EngineError`` and the MCP layer forwards them through
314
+ ``ToolResult.create_error``. New unit-test file
315
+ ``tests/unit/test_analytical_methods.py`` covers all five engine
316
+ methods + their tool wrappers — 27 tests, including round-trips
317
+ (pivot then unpivot), edge cases (empty columns list, missing
318
+ column, invalid ``how``), and tool-side error translation.
319
+
320
+ Test count: 189 → 216 unit tests; coverage stays at ~75%, above
321
+ the 70% ``fail_under`` floor.
322
+
323
+ ### Refactored (post-review code-quality pass)
324
+
325
+ A self-review against `docs/python/{boundaries,modules,errors,
326
+ dry-abstraction}.md` flagged five items worth addressing before tag.
327
+ All landed; none change the public API:
328
+
329
+ - **Item 3: `_ENGINES` global → `EngineRegistry` class.** New
330
+ module `kaos_tabular/_session.py` owning the bounded LRU.
331
+ `EngineRegistry(max_sessions=..., engine_factory=...)` lets tests
332
+ build isolated registries and inject a `_CountingEngine` factory
333
+ to spy on `close()` without monkey-patching module state. The
334
+ process singleton `SESSION_REGISTRY` keeps live MCP-session
335
+ behaviour identical. `tools._get_engine` is now a thin async
336
+ wrapper that delegates to the registry (with the same
337
+ `context is None` ephemeral-engine policy).
338
+ - **Item 4: `cast(Literal[...], fmt)` → typed inference helpers.**
339
+ New `_coerce_export_format(value: Any) -> ExportFormat | None`
340
+ and `_infer_export_format_from_extension(ext: str) -> ExportFormat | None`
341
+ return literal types directly so ty sees the narrow without a
342
+ `cast`. ExportTool's `execute` gets simpler too.
343
+ - **Item 5: brittle eviction test → `_CountingEngine` subclass.**
344
+ Replaced the `engine.close = lambda: ...` monkey-patch with a
345
+ real `TabularEngine` subclass that bumps a counter. Bonus:
346
+ asserts the evicted engine's DuckDB connection actually raises
347
+ `duckdb.ConnectionException` post-eviction.
348
+ - **Item 6: focused `_q_lit` unit tests.** New
349
+ `tests/unit/test_engine_helpers.py` pins six properties + a
350
+ parametrized 7-input round-trip through real DuckDB
351
+ (`SELECT {_q_lit(s)}` → `s`). The adversarial tests still cover
352
+ the engine-end-to-end path; this catches contract drift before it
353
+ reaches them.
354
+ - **Item 7: shared annotation constants → per-tool literals.**
355
+ Removed `_TABULAR_READ_ANNOTATIONS` / `_TABULAR_OPEN_READ_ANNOTATIONS`
356
+ / `_TABULAR_WRITE_ANNOTATIONS`. Each of the 8 tools now declares
357
+ its own `ToolAnnotations(...)` literal in its `metadata` property,
358
+ matching the kaos-reference / kaos-citations pattern. Eliminates
359
+ the misclassification-via-shared-constant risk that motivated
360
+ review #4 in the first place.
361
+
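The bounded LRU with close-on-evict and the injectable factory from Items 3 and 5 can be sketched as follows. Only the constructor shape `EngineRegistry(max_sessions=..., engine_factory=...)` comes from the entry above. The `get` method name, the `FakeEngine` stand-in, and the eviction details are assumptions made for illustration, and the real registry manages `TabularEngine` instances.

```python
from collections import OrderedDict
from typing import Callable


class FakeEngine:
    """Stand-in for an engine; it only records whether close() was called."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class EngineRegistry:
    """Bounded LRU of per-session engines that closes whatever it evicts."""

    def __init__(self, max_sessions: int, engine_factory: Callable[[], FakeEngine]) -> None:
        self._max = max_sessions
        self._factory = engine_factory
        self._engines: OrderedDict[str, FakeEngine] = OrderedDict()

    def get(self, session_id: str) -> FakeEngine:
        if session_id in self._engines:
            self._engines.move_to_end(session_id)  # mark as most recently used
            return self._engines[session_id]
        engine = self._factory()
        self._engines[session_id] = engine
        if len(self._engines) > self._max:
            _, evicted = self._engines.popitem(last=False)  # oldest entry
            evicted.close()
        return engine


registry = EngineRegistry(max_sessions=2, engine_factory=FakeEngine)
a = registry.get("a")
b = registry.get("b")
registry.get("a")  # touch "a" so "b" becomes the oldest entry
registry.get("c")  # exceeds capacity: "b" is evicted and closed
print(a.closed, b.closed)  # False True
```

Passing the factory in lets a test observe `close()` calls through the objects it created, without patching module-level state.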
362
+ Tests: 173 → **189** unit tests, 32 integration tests still green,
363
+ coverage 75% → 73% (more code under coverage tracking; gate still
364
+ above the 70% floor).
365
+
366
+ ### Deferred to next release (tracked, not blocking 0.1.0a1)
367
+
368
+ - Make `INSTALL sqlite` / `LOAD sqlite` opt-in via a settings flag
369
+ (post-release-review #8). Currently the actionable error path is
370
+ in place (KTAB-010); making the network fetch opt-in is a real
371
+ API change worth doing in a settled release.
372
+ - Include `SECURITY.md` in the sdist (post-release-review #9). Cheap
373
+ to do at the cross-package level alongside other sdist policy.
374
+ - Pin GitHub Actions and gitleaks Docker image references to SHAs
375
+ for stronger supply-chain posture (post-release-review #10). Best
376
+ done as a platform-wide sweep across all kaos-* repos at once.
377
+
378
+ ## [0.1.0a1-original] — superseded entries below
379
+
380
+ The remainder of this entry documents the pre-review release
381
+ preparation; left intact so the audit-01 / OSS Phase A trail is
382
+ preserved.
383
+
384
+ First public alpha. DuckDB-powered tabular data engine with 8 MCP
385
+ tools for register / query / describe / list / sample / count /
386
+ export / read-file workflows. Closes every finding in
387
+ `docs/audit-01/kaos-tabular.md` (KTAB-001..KTAB-010).
388
+
389
+ ### Removed (dep minimization)
390
+
391
+ - **`polars` dropped from required dependencies.** A pre-release
392
+ audit confirmed nothing in `kaos_tabular` source or tests imports
393
+ polars; the DuckDB bridge in `kaos-content` doesn't need it
394
+ either (the polars bridge lives behind kaos-content's own
395
+ `[polars]` extra, which kaos-tabular never pulled). Result: the
396
+ resolved tree shrinks 56 → 54 packages and the install no longer
397
+ fetches the polars + polars-runtime-32 native binaries (~30 MB
398
+ combined). The `polars` keyword and the README polars mentions
399
+ are also dropped.
400
+
401
+ ### Compliance
402
+
403
+ - **License audit (50 distinct deps in the resolved tree).** Every
404
+ inbound license is on the `docs/oss/10-licensing-legal/dep-license-policy.md`
405
+ allowlist: MIT, Apache-2.0, BSD-2/3-Clause, ISC, MPL-2.0 (certifi,
406
+ weak-copyleft permitted), PSF-2.0 (typing-extensions). Zero
407
+ matches against the denylist (GPL family, AGPL family,
408
+ Commons-Clause, SSPL, BUSL, anyone else's proprietary). Audit
409
+ evidence: `uv tree --no-dedupe` × per-PyPI license metadata.
410
+
411
+ ### Added
412
+
413
+ - **`LICENSE`, `NOTICE`, `CHANGELOG.md`** seeded for the public release.
414
+ License flips from `LicenseRef-Proprietary` to Apache-2.0 via PEP 639
415
+ (`license = "Apache-2.0"`, `license-files = ["LICENSE", "NOTICE"]`).
416
+ `License ::` classifier removed (PEP 639 supersedes).
417
+
418
+ - **`TabularEngine.export_table(table_name, output_path, format=...)`**
419
+ — public engine method that owns DuckDB COPY, format mapping, and
420
+ path quoting. ExportTool MCP and `kaos-tabular export` CLI now call
421
+ it instead of reaching into `engine._con` and importing the private
422
+ `kaos_content.bridges.duckdb._quote_ident`. Closes audit-01 KTAB-003.
423
+
424
+ - **`docs/security.md`** — canonical statement of the trust contract
425
+ (DuckDB is in-process; SQL has filesystem access matching the running
426
+ process; deployments wanting stricter isolation should run
427
+ kaos-tabular in a constrained working directory or container; the
428
+ strict-isolation alternative is `kaos_content.bridges.duckdb.create_safe_connection`,
429
+ which cannot register files). Closes audit-01 KTAB-001 alongside the
430
+ description honesty fix.
431
+
432
+ - **`kaos_tabular/py.typed`** marker so the `Typing :: Typed` classifier
433
+ is honored by downstream type checkers. Closes audit-01 KTAB-004.
434
+
435
+ - **`benchmark` pytest marker** registered in `pyproject.toml`. Wall-
436
+ clock performance tests relocated from `tests/unit/test_adversarial.py`
437
+ → `tests/benchmarks/test_engine_perf.py`. Bounded unit gates can now
438
+ exclude them with `-m "not benchmark"`. Closes audit-01 KTAB-006.
439
+
440
+ - **`tests/unit/test_sqlite_register.py`** — positive (real SQLite
441
+ fixture) and negative (forced INSTALL/LOAD failure) coverage for the
442
+ new SQLite registration error path. Closes audit-01 KTAB-010.
443
+
444
+ - **`tests/unit/test_serve.py`** — argparse + import-error coverage for
445
+ `kaos_tabular.serve.main`, lifting `serve.py` from 0% to ~55% and
446
+ total coverage from 63% (audit baseline) to 73%.
447
+
448
+ - **`fail_under = 70` coverage gate** in
449
+ `[tool.coverage.report]`. Locks the new floor against regression.
450
+ Closes audit-01 KTAB-005.
451
+
452
+ ### Changed
453
+
454
+ - **`QueryTool.metadata.description` is now honest** about the trust
455
+ contract: "Execute arbitrary DuckDB SQL against the session's
456
+ in-process engine ... SQL has filesystem access matching the running
457
+ process — for stricter isolation, run kaos-tabular in a constrained
458
+ working directory or container." Previously the description claimed
459
+ "queries against registered tables" while the engine accepted
460
+ arbitrary DuckDB SQL including `read_csv_auto('...')`. Closes
461
+ audit-01 KTAB-001.
462
+
463
+ - **`_register_sqlite` now raises `RegistrationError` with a 3-part
464
+ message** when DuckDB's `INSTALL sqlite` / `LOAD sqlite` fails. The
465
+ message names the install command, the offline workaround
466
+ (pre-bundled extension), and the fallback (export tables to CSV /
467
+ Parquet first). Closes audit-01 KTAB-010.
468
+
469
+ - **MCP error messages standardized to the what / how-to-fix /
470
+ alternative-tool shape** across `tools.py`. The audit explicitly
471
+ flagged the sample (`tools.py:359`) and read-file (`tools.py:489`)
472
+ errors as incomplete; both rewritten plus the file-not-found, no-
473
+ tables-registered, and register-failed paths. Closes audit-01
474
+ KTAB-007.
475
+
476
+ - **Stale comment in `tests/unit/test_tools.py`** removed. The module
477
+ docstring claimed "Several tools have a bug where _get_engine(context)
478
+ is called without await" — current source awaits correctly. Closes
479
+ audit-01 KTAB-009.
480
+
481
+ ### Removed
482
+
483
+ - **`[xlsx]` extra and `_register_xlsx` method dropped.** Both
484
+ introduced an undocumented sideways
485
+ `kaos-tabular -> kaos-office` extraction-module dependency that the
486
+ architecture DAG explicitly forbids. Callers wanting XLSX support
487
+ parse the file with `kaos_office.parse_xlsx(path)` (in kaos-office,
488
+ which is the right home for OPC reading) and pass each `Table` to
489
+ `engine.register_table(table, name=...)` (already public). The
490
+ workspace dependency on `kaos-office` is removed; `[tool.uv.sources]`
491
+ drops the kaos-office editable entry. Closes audit-01 KTAB-002.
492
+
493
+ ### Notes (audit findings already resolved)
494
+
495
+ - **KTAB-008** — `kaos_tabular/__init__.py` `__all__` is already
496
+ alphabetically sorted under Python's default ordering (uppercase <
497
+ underscore < lowercase per ASCII). No change needed; documented here
498
+ as verified against `sorted()`.
499
+
500
+ [Unreleased]: https://github.com/273v/kaos-tabular/compare/v0.1.0a1...HEAD
501
+ [0.1.0a1]: https://github.com/273v/kaos-tabular/releases/tag/v0.1.0a1