@hachej/boring-bi-dashboard 0.1.60

@@ -0,0 +1,638 @@
1
+ # Data Access Unification Plan
2
+
3
+ ## Status
4
+
5
+ Draft plan. This plan supersedes the current split-brain prototype where:
6
+
7
+ - `@hachej/boring-data-bridge` parses workspace CSV/JSON/NDJSON files itself.
8
+ - BI dashboard falls back to `/api/v1/files/records` and repeats aggregation in the browser.
9
+ - Perspective-like panels render JSON rows in an HTML table instead of using a real Perspective/Arrow preparation path.
10
+
11
+ ## Goal
12
+
13
+ Unify local file data, semantic queries, database-backed queries, DuckDB workspace files, and Perspective preparation behind a clean set of responsibilities:
14
+
15
+ 1. **Workspace file APIs own file-backed tabular reads.**
16
+ 2. **`data-bridge` owns semantic/query execution and adapter routing.**
17
+ 3. **Perspective consumers negotiate transport; server selects the safe transport.**
18
+ 4. **DuckDB/SQLite workspace files are file-backed databases: preview via file API, query via data bridge.**
19
+
20
+ The end state should make dashboards, data explorer, agents, and future report/notebook plugins share one data model instead of each plugin inventing its own CSV parser, query runner, or Perspective loader.
21
+
22
+ ## Current Findings
23
+
24
+ ### Existing `/api/v1/files/records`
25
+
26
+ Implemented in:
27
+
28
+ - `packages/agent/src/server/http/routes/file.ts`
29
+ - `packages/agent/src/server/http/routes/fileRecords.ts`
30
+
31
+ Current behavior:
32
+
33
+ - Supports JSON array, NDJSON/JSONL, and CSV.
34
+ - Accepts `path`, `offset`, `limit`, `q`.
35
+ - Enforces limits on maximum file bytes, output bytes, rows scanned, rows returned, and columns sampled.
36
+ - Uses the workspace adapter (`workspace.stat`, `workspace.readFile`) rather than raw filesystem paths.
37
+ - Returns:
38
+
39
+ ```ts
40
+ interface FileRecordsResult {
41
+ source: { kind: "file"; path: string; format: "json-array" | "ndjson" | "csv" }
42
+ path: string
43
+ format: string
44
+ columns: { name: string; type: string }[]
45
+ rows: Record<string, unknown>[]
46
+ total: number
47
+ hasMore: boolean
48
+ offset: number
49
+ limit: number
50
+ mtimeMs?: number
51
+ }
52
+ ```
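
As a rough illustration of how the `columns` field could be derived under a column sampling limit, here is a minimal sketch. The helper name `inferColumns`, the `maxSampledRows` parameter, and the type labels (`"number"`, `"mixed"`, `"unknown"`, and so on) are assumptions for illustration; the plan does not specify how the existing route infers types.

```ts
type Row = Record<string, unknown>

interface Column {
  name: string
  type: string
}

// Hypothetical helper: infer a coarse type per column from at most
// `maxSampledRows` rows. Type labels are illustrative, not the real route's.
function inferColumns(rows: Row[], maxSampledRows = 100): Column[] {
  const typesByColumn = new Map<string, Set<string>>()
  for (const row of rows.slice(0, maxSampledRows)) {
    for (const [name, value] of Object.entries(row)) {
      let types = typesByColumn.get(name)
      if (types === undefined) {
        types = new Set<string>()
        typesByColumn.set(name, types)
      }
      // Nulls do not vote, so a sparse numeric column still infers as "number".
      if (value === null || value === undefined) continue
      types.add(Array.isArray(value) ? "array" : typeof value)
    }
  }
  return Array.from(typesByColumn, ([name, types]) => {
    const distinct = Array.from(types)
    const type =
      distinct.length === 0 ? "unknown" : distinct.length === 1 ? String(distinct[0]) : "mixed"
    return { name, type }
  })
}
```

Sampling keeps the cost bounded for large files, at the price of possibly mislabelling a column whose later rows differ from the sampled ones.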
53
+
54
+ ### Current `data-bridge` prototype
55
+
56
+ Implemented in:
57
+
58
+ - `plugins/data-bridge/src/server/index.ts`
59
+
60
+ Current behavior:
61
+
62
+ - Registers `data.v1.query.run` through WorkspaceBridge.
63
+ - Has a workspace-file adapter that reads files directly from `workspaceRoot`.
64
+ - Has its own CSV/JSON/NDJSON parsing and aggregation.
65
+ - Has BSL subprocess execution.
66
+ - Has realpath containment checks, but still duplicates file API behavior.
67
+
68
+ ### Current BI dashboard runtime
69
+
70
+ Implemented in:
71
+
72
+ - `plugins/bi-dashboard/src/front/dashboardData.ts`
73
+
74
+ Current behavior:
75
+
76
+ - Calls `data.v1.query.run` first for each dashboard query.
77
+ - Falls back to `/api/v1/files/records` if bridge fails and `dataRef.kind === "workspace-file"`.
78
+ - Browser fallback pages file records and aggregates client-side.
79
+ - `BSLPerspectiveViewer` renders a plain table from JSON rows.
80
+
81
+ ## Architecture Decision
82
+
83
+ ### Responsibility Split
84
+
85
+ | Data/source shape | Owner | API |
86
+ | --- | --- | --- |
87
+ | Plain workspace CSV/JSON/NDJSON | workspace file subsystem | `/api/v1/files/records` and future file data endpoints |
88
+ | Plain workspace Parquet/Arrow | workspace file subsystem | future `/api/v1/files/data`/`arrow`/`perspective` endpoints |
89
+ | DuckDB/SQLite file preview | workspace file subsystem | future preview endpoints |
90
+ | DuckDB/SQLite query execution | `data-bridge` adapter | `data.v1.query.run` |
91
+ | BSL semantic model query | `data-bridge` adapter | `data.v1.query.run` |
92
+ | Remote DB/warehouse query | `data-bridge` adapter | `data.v1.query.run` |
93
+ | Perspective-ready data | negotiated server operation | file API for local files, `data.v1.perspective.prepare` for query/adapters |
94
+
95
+ ### Why not put everything in file APIs?
96
+
97
+ File APIs should answer: “read this workspace file as data.”
98
+
99
+ They should not become a general SQL/BSL/warehouse execution surface. Query execution needs adapter policy, credentials, trusted callers, timeouts, capabilities, and runtime tokens. That belongs in `data-bridge`.
100
+
101
+ ### Why not put workspace-file reads only in `data-bridge`?
102
+
103
+ Because `/api/v1/files/records` already exists, is paginated, enforces file limits, uses the workspace adapter abstraction, and is useful outside BI dashboards. Duplicating CSV/JSON parsing in `data-bridge` creates inconsistent semantics and security drift.
104
+
105
+ ## Target APIs
106
+
107
+ ## 1. File Records API: keep and extract shared implementation
108
+
109
+ Keep existing HTTP endpoint:
110
+
111
+ ```http
112
+ GET /api/v1/files/records?path=data.csv&offset=0&limit=100&q=engineer
113
+ ```
114
+
115
+ Extract the core logic into a reusable server module that can be called by both HTTP routes and trusted server plugins.
116
+
117
+ Suggested package location:
118
+
119
+ ```txt
120
+ packages/file-data/src/server/
121
+ records.ts // parser + paging + type inference, no Fastify
122
+ workspaceRecords.ts // workspace adapter integration
123
+ formats/
124
+ csv.ts
125
+ json.ts
126
+ ndjson.ts
127
+ leases.ts // read-only local materialization for DB files
128
+ ```
129
+
130
+ The reusable implementation must live in a domain-neutral server-only package, **not** in `@hachej/boring-agent/server` and preferably not in workspace core unless the team explicitly decides that tabular workspace-file parsing is part of the workspace file subsystem. Recommended package: `@hachej/file-data` under `packages/file-data/`.
131
+
132
+ Rationale: `@hachej/boring-data-bridge` is a workspace server plugin; making it depend on the agent HTTP layer would couple reusable data adapters to the host app/agent package. Conversely, making `@hachej/boring-workspace` understand CSV/Arrow/Perspective risks violating the workspace package's domain-neutral boundary. A small server-only file-data package lets the agent route, workspace file APIs, and data-bridge share implementation without teaching workspace about BSL/SQL/Perspective.
133
+
134
+ Public/server exports from `@hachej/file-data/server`:
135
+
136
+ ```ts
137
+ export async function readWorkspaceFileRecords(args: {
138
+ workspace: WorkspaceFileAdapter
139
+ path: string
140
+ offset?: number
141
+ limit?: number
142
+ q?: string | null
143
+ maxFileBytes?: number
144
+ maxRowsScanned?: number
145
+ }): Promise<FileRecordsResult>
146
+
147
+ export function buildFileRecordsResult(args: {
148
+ path: string
149
+ content: string
150
+ mtimeMs?: number
151
+ offset: number
152
+ limit: number
153
+ q: string | null
154
+ }): FileRecordsResult
155
+ ```
156
+
157
+ Then move both the HTTP route and data-bridge adapter to use this same implementation.
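
A minimal sketch of the framework-free paging step that both callers could share is shown below. The names `pageRows` and `matchesQuery` are hypothetical, and the `q` semantics (case-insensitive substring match across all cell values) are an assumption; the existing route may filter differently.

```ts
type Row = Record<string, unknown>

interface PageArgs {
  rows: Row[]
  offset: number
  limit: number
  q: string | null
}

interface Page {
  rows: Row[]
  total: number
  hasMore: boolean
  offset: number
  limit: number
}

// Assumed `q` semantics: case-insensitive substring match on any cell.
function matchesQuery(row: Row, q: string | null): boolean {
  if (q === null || q === "") return true
  const needle = q.toLowerCase()
  return Object.values(row).some((value) =>
    String(value ?? "").toLowerCase().includes(needle),
  )
}

// Pure function: no Fastify, no filesystem, so an HTTP route and a
// server plugin can call it with identical results.
function pageRows(args: PageArgs): Page {
  const offset = Math.max(0, Math.floor(args.offset))
  const limit = Math.max(0, Math.floor(args.limit))
  const filtered = args.rows.filter((row) => matchesQuery(row, args.q))
  const rows = filtered.slice(offset, offset + limit)
  return {
    rows,
    total: filtered.length,
    hasMore: offset + rows.length < filtered.length,
    offset,
    limit,
  }
}
```

Because the function takes already-parsed rows and returns plain data, the contract tests in Phase 1 could compare route output and bridge output directly.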
158
+
159
+ Acceptance:
160
+
161
+ - Existing `/api/v1/files/records` tests remain green.
162
+ - `data-bridge` no longer contains a separate CSV parser.
163
+ - BI dashboard fallback and bridge produce the same rows/columns for the same workspace file.
164
+
165
+ ## 2. File data negotiation API
166
+
167
+ Add a file-centric endpoint for tabular data transport negotiation.
168
+
169
+ Preferred shape:
170
+
171
+ ```http
172
+ GET /api/v1/files/data?path=data.csv&representation=records&offset=0&limit=100
173
+ GET /api/v1/files/data?path=data.parquet&representation=arrow
174
+ GET /api/v1/files/data?path=data.csv&representation=perspective&transport.preferred=auto
175
+ ```
176
+
177
+ Alternative: separate endpoints:
178
+
179
+ ```http
180
+ GET /api/v1/files/records
181
+ GET /api/v1/files/arrow
182
+ GET /api/v1/files/perspective
183
+ ```
184
+
185
+ Recommendation: use one negotiated endpoint once we add Arrow/Perspective, but keep `/records` as a stable compatibility endpoint.
186
+
187
+ Request:
188
+
189
+ ```ts
190
+ type FileDataRepresentation = "records" | "arrow" | "perspective"
191
+ type DataTransport = "inline" | "artifact" | "websocket"
192
+ type PreferredDataTransport = "auto" | DataTransport
193
+ type PayloadFormat = "arrow" | "json"
194
+
195
+ interface FileDataRequest {
196
+ path: string
197
+ representation?: FileDataRepresentation
198
+ transport?: {
199
+ preferred?: PreferredDataTransport
200
+ accepted?: DataTransport[] // never includes "auto"
201
+ payloadFormat?: PayloadFormat
202
+ maxInlineBytes?: number
203
+ }
204
+ offset?: number
205
+ limit?: number
206
+ q?: string
207
+ table?: string // source-specific preview selector for DB-like files; no arbitrary SQL
208
+ sheet?: string // future spreadsheet selector
209
+ }
210
+ ```
211
+
212
+ Response variants:
213
+
214
+ ```ts
215
+ type FileDataResponse =
216
+ | FileRecordsResult
217
+ | FileArrowResponse
218
+ | FilePerspectiveResponse
219
+ ```
220
+
221
+ Arrow response:
222
+
223
+ ```ts
224
+ interface InlinePayloadDescriptor {
225
+ bytes?: number
226
+ data?: unknown
227
+ base64?: string
228
+ }
229
+
230
+ interface ArtifactPayloadDescriptor {
231
+ url: string
232
+ contentType: string
233
+ expiresAt?: string
234
+ bytes?: number
235
+ }
236
+
237
+ interface FileArrowResponse {
238
+ kind: "workspace-file.arrow"
239
+ version: 1
240
+ transport: "inline" | "artifact"
241
+ payloadFormat: "arrow"
242
+ contentType: "application/vnd.apache.arrow.file" | "application/vnd.apache.arrow.stream"
243
+ inline?: InlinePayloadDescriptor
244
+ artifact?: ArtifactPayloadDescriptor
245
+ schema?: Array<{ name: string; type: string }>
246
+ rowCount?: number
247
+ source: { kind: "file"; path: string; fileFormat: string }
248
+ }
249
+ ```
250
+
251
+ File Perspective response:
252
+
253
+ ```ts
254
+ interface FilePerspectiveResponse {
255
+ kind: "workspace-file.perspective"
256
+ version: 1
257
+ transport: DataTransport
258
+ payloadFormat: PayloadFormat
259
+ schema?: Array<{ name: string; type: string }>
260
+ rowCount?: number
261
+ source: { kind: "file"; path: string; fileFormat: string }
262
+ inline?: InlinePayloadDescriptor
263
+ artifact?: ArtifactPayloadDescriptor
264
+ websocket?: { url: string; protocol: "perspective" | "data-bridge-arrow-delta"; sessionId: string }
265
+ viewer?: PerspectiveViewerConfig
266
+ }
267
+ ```
268
+
269
+ ## 3. Data bridge query API
270
+
271
+ Keep:
272
+
273
+ ```ts
274
+ data.v1.query.run
275
+ ```
276
+
277
+ Use it for:
278
+
279
+ - BSL semantic queries.
280
+ - Remote DB/warehouse queries.
281
+ - DuckDB/SQLite file-backed queries.
282
+ - Structured dashboard queries over queryable sources.
283
+
284
+ Do **not** use it as the primary implementation for plain CSV/JSON file previews. If a dashboard query points directly at a plain workspace file, the bridge may delegate to shared file-record services internally, but the canonical file data surface remains the file API.
285
+
286
+ Request direction:
287
+
288
+ ```ts
289
+ type DataRef =
290
+ | { kind: "workspace-file"; path: string; fileFormat?: "csv" | "json" | "ndjson" | "parquet" | "arrow"; limit?: number }
291
+ | { kind: "duckdb-file"; path: string; table?: string }
292
+ | { kind: "sqlite-file"; path: string; table?: string }
293
+ | { kind: "semantic-model"; model: string }
294
+
295
+ interface DataBridgeQueryRunInput {
296
+ source?: string
297
+ query:
298
+ | DataBridgeDashboardQuery
299
+ | DataBridgeSemanticQuery
300
+ | DataBridgeSqlQuery
301
+ }
302
+
303
+ interface DataBridgeDashboardQuery {
304
+ language: "bsl-dashboard"
305
+ model: string
306
+ dataRef?: DataRef
307
+ groupBy?: string[]
308
+ measures?: string[]
309
+ dimensions?: string[]
310
+ filters?: DataBridgeFilterExpression[]
311
+ orderBy?: Array<[field: string, direction: "asc" | "desc"]>
312
+ limit?: number
313
+ }
314
+ ```
315
+
316
+ Response remains:
317
+
318
+ ```ts
319
+ interface DataBridgeTableResult {
320
+ kind: "data-bridge.table"
321
+ version: 1
322
+ columns: DataBridgeColumn[]
323
+ rows: Record<string, unknown>[]
324
+ rowCount: number
325
+ truncated?: boolean
326
+ source?: string
327
+ }
328
+ ```
329
+
330
+ Important: `data.v1.query.run` should generally return small/medium aggregated JSON table results for metrics/charts. Large raw datasets should not come through this path for visualization.
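
One way an adapter could keep results small is to cap rows before building the response and set `truncated`. The sketch below is illustrative only: `capTableResult` is a hypothetical helper, the `DataBridgeColumn` shape is assumed, and treating `rowCount` as the number of rows actually returned is an assumption the plan does not confirm.

```ts
// Assumed column shape; the plan references DataBridgeColumn without defining it.
interface DataBridgeColumn {
  name: string
  type: string
}

interface DataBridgeTableResult {
  kind: "data-bridge.table"
  version: 1
  columns: DataBridgeColumn[]
  rows: Record<string, unknown>[]
  rowCount: number
  truncated?: boolean
  source?: string
}

// Hypothetical helper: enforce an output row cap and report truncation.
function capTableResult(
  columns: DataBridgeColumn[],
  rows: Record<string, unknown>[],
  maxRows: number,
): DataBridgeTableResult {
  const cap = Math.max(0, Math.floor(maxRows))
  const overflow = rows.length > cap
  const kept = overflow ? rows.slice(0, cap) : rows
  return {
    kind: "data-bridge.table",
    version: 1,
    columns,
    rows: kept,
    rowCount: kept.length, // assumption: counts returned rows, not source rows
    truncated: overflow,
  }
}
```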
331
+
332
+ ## 4. Data bridge Perspective prepare API
333
+
334
+ Add:
335
+
336
+ ```ts
337
+ data.v1.perspective.prepare
338
+ ```
339
+
340
+ This is the query/adapter equivalent of file perspective preparation. It handles BSL, DuckDB, SQL, remote DB, and any other adapter-backed source.
341
+
342
+ Request:
343
+
344
+ ```ts
345
+ type DataTransport = "inline" | "artifact" | "websocket"
346
+ type PreferredDataTransport = "auto" | DataTransport
347
+ type PayloadFormat = "arrow" | "json"
348
+
349
+ interface PerspectiveViewerConfig {
350
+ plugin?: "Datagrid" | "Y Line" | "X Bar" | string
351
+ columns?: string[]
352
+ group_by?: string[]
353
+ split_by?: string[]
354
+ sort?: Array<[string, "asc" | "desc"]>
355
+ filter?: unknown[][]
356
+ aggregates?: Record<string, string>
357
+ }
358
+
359
+ interface DataBridgePerspectivePrepareInput {
360
+ source?: string
361
+ datasetId?: string
362
+ query: DataBridgeQueryRunInput["query"]
363
+ viewer?: PerspectiveViewerConfig
364
+ transport?: {
365
+ preferred?: PreferredDataTransport
366
+ accepted: DataTransport[] // never includes "auto"
367
+ payloadFormat?: PayloadFormat
368
+ maxInlineBytes?: number
369
+ }
370
+ }
371
+ ```
372
+
373
+ Response:
374
+
375
+ ```ts
376
+ interface DataBridgePerspectiveResult {
377
+ kind: "data-bridge.perspective"
378
+ version: 1
379
+ transport: DataTransport
380
+ payloadFormat: PayloadFormat
381
+ schema?: DataBridgeColumn[]
382
+ rowCount?: number
383
+ source?: string
384
+ viewer?: PerspectiveViewerConfig
385
+ inline?: InlinePayloadDescriptor
386
+ artifact?: ArtifactPayloadDescriptor
387
+ websocket?: {
388
+ url: string
389
+ protocol: "perspective" | "data-bridge-arrow-delta"
390
+ sessionId: string
391
+ }
392
+ }
393
+ ```
394
+
395
+ ### Transport negotiation rule
396
+
397
+ The consumer advertises capability and preference; the server decides.
398
+
399
+ Consumer controls:
400
+
401
+ - `transport.preferred`
402
+ - `transport.accepted`
403
+ - `transport.payloadFormat`
404
+ - `transport.maxInlineBytes`
405
+ - viewer configuration hints
406
+
407
+ Server controls:
408
+
409
+ - actual transport
410
+ - whether inline is allowed
411
+ - maximum bytes/rows
412
+ - whether Arrow conversion is available
413
+ - artifact lifetime
414
+ - websocket availability
415
+ - security/auth/capability policy
416
+
417
+ Server must never honor `inline` blindly for large results.
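
The negotiation rule can be sketched as a pure function. Everything beyond the `DataTransport` types is an assumption made for illustration: the `ServerPolicy` shape, the `estimatedBytes` input, the default of accepting all transports when `accepted` is omitted, and the server's fallback order of inline, artifact, websocket.

```ts
type DataTransport = "inline" | "artifact" | "websocket"
type PreferredDataTransport = "auto" | DataTransport

interface TransportRequest {
  preferred?: PreferredDataTransport
  accepted?: DataTransport[]
  maxInlineBytes?: number
}

// Hypothetical server-side policy; not part of the plan's contracts.
interface ServerPolicy {
  maxInlineBytes: number
  artifactAvailable: boolean
  websocketAvailable: boolean
}

// Returns the transport the server will use, or null when nothing the
// consumer accepts is currently allowed (the caller should then reject).
function selectTransport(
  request: TransportRequest,
  policy: ServerPolicy,
  estimatedBytes: number,
): DataTransport | null {
  const accepted: DataTransport[] = request.accepted ?? ["inline", "artifact", "websocket"]
  // The consumer may lower the inline cap but never raise it above policy.
  const inlineCap = Math.min(policy.maxInlineBytes, request.maxInlineBytes ?? policy.maxInlineBytes)
  const allowed: Record<DataTransport, boolean> = {
    inline: estimatedBytes <= inlineCap,
    artifact: policy.artifactAvailable,
    websocket: policy.websocketAvailable,
  }
  const candidates: DataTransport[] = []
  if (request.preferred !== undefined && request.preferred !== "auto") {
    candidates.push(request.preferred)
  }
  candidates.push("inline", "artifact", "websocket")
  for (const transport of candidates) {
    if (accepted.includes(transport) && allowed[transport]) return transport
  }
  return null
}
```

With this shape, a consumer that prefers `inline` for a 10 MB result under a 1 MB policy cap is downgraded to `artifact` if it accepts artifacts, and rejected otherwise.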
418
+
419
+ ## DuckDB Workspace Files
420
+
421
+ DuckDB files are physically workspace files but semantically queryable databases.
422
+
423
+ ### Preview path: file API
424
+
425
+ Use file API for discovery and preview:
426
+
427
+ ```http
428
+ GET /api/v1/files/duckdb/tables?path=analytics.duckdb
429
+ GET /api/v1/files/duckdb/records?path=analytics.duckdb&table=orders&offset=0&limit=100
430
+ GET /api/v1/files/data?path=analytics.duckdb&representation=records&table=orders
431
+ ```
432
+
433
+ The file subsystem should enforce:
434
+
435
+ - path validation through workspace adapter
436
+ - read-only mode
437
+ - table name validation
438
+ - page limits
439
+ - no arbitrary SQL in preview endpoints
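
A minimal sketch of the table-name and page-limit checks follows. The helper `validatePreviewRequest`, the identifier pattern, the default page size of 100, and the cap of 1000 are all assumptions; `knownTables` is assumed to come from the table discovery endpoint.

```ts
// Conservative identifier pattern (assumption): letters, digits, underscore.
const SAFE_TABLE_NAME = /^[A-Za-z_][A-Za-z0-9_]*$/

interface PreviewArgs {
  table: string
  knownTables: string[]
  offset?: number
  limit?: number
}

interface ValidatedPreview {
  table: string
  offset: number
  limit: number
}

// Hypothetical helper: rejects anything that is not a known, plainly named
// table and clamps paging so a preview can never scan unbounded rows.
function validatePreviewRequest(args: PreviewArgs, maxLimit = 1000): ValidatedPreview {
  if (!SAFE_TABLE_NAME.test(args.table)) {
    throw new Error("invalid table name")
  }
  if (!args.knownTables.includes(args.table)) {
    throw new Error("unknown table")
  }
  const offset = Math.max(0, Math.floor(args.offset ?? 0))
  const limit = Math.min(maxLimit, Math.max(1, Math.floor(args.limit ?? 100)))
  return { table: args.table, offset, limit }
}
```

Checking against the discovered table list in addition to the pattern means the preview path only ever selects from a table the file actually contains.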
440
+
441
+ ### Query path: data bridge
442
+
443
+ Use data bridge for SQL/semantic queries:
444
+
445
+ ```ts
446
+ data.v1.query.run({
447
+ source: "duckdb-file",
448
+ query: {
449
+ language: "sql",
450
+ dialect: "duckdb",
451
+ sql: "select role, count(*) as count from people group by role",
452
+ dataRef: { kind: "duckdb-file", path: "analytics.duckdb" }
453
+ }
454
+ })
455
+ ```
456
+
457
+ Safer dashboard query shape:
458
+
459
+ ```ts
460
+ data.v1.query.run({
461
+ query: {
462
+ language: "bsl-dashboard",
463
+ model: "orders",
464
+ groupBy: ["month"],
465
+ measures: ["revenue"],
466
+ dataRef: {
467
+ kind: "duckdb-file",
468
+ path: "analytics.duckdb",
469
+ table: "orders"
470
+ }
471
+ }
472
+ })
473
+ ```
474
+
475
+ DuckDB adapter requirements:
476
+
477
+ - Obtain the DB file through a workspace-owned `materializeWorkspaceFileReadOnly()`/`WorkspaceFileLease` abstraction rather than raw `workspaceRoot` paths.
478
+ - The lease resolves path through adapter policy, rejects symlink escapes, creates an immutable temp copy or read-only local path when needed, works for remote/sandbox workspaces, and is scoped to request/session lifetime.
479
+ - Open DuckDB against the lease in read-only mode and prevent WAL/sidecar writes to the workspace.
480
+ - Disable extension auto-install/loading unless explicitly allowed.
481
+ - Restrict filesystem/network functions.
482
+ - Run with timeout/cancellation.
483
+ - Enforce output row/byte limits.
484
+ - For dashboard structured queries, compile to parameterized/quoted DuckDB SQL.
485
+ - For direct SQL, require stronger capability such as `data:sql-query` and trusted caller class depending on policy.
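
The "compile to parameterized/quoted DuckDB SQL" requirement might look roughly like the sketch below. It assumes measure names have already been resolved to a column, an aggregate function, and an alias (`CompiledMeasure` is hypothetical), and it only supports equality filters. Identifiers are double-quoted with embedded quotes doubled, and values are passed as `?` positional parameters rather than interpolated.

```ts
type Aggregate = "sum" | "count" | "avg" | "min" | "max"
const ALLOWED_AGGREGATES: readonly Aggregate[] = ["sum", "count", "avg", "min", "max"]

// Hypothetical resolved form of a dashboard measure.
interface CompiledMeasure {
  column: string
  fn: Aggregate
  alias: string
}

interface StructuredTableQuery {
  table: string
  groupBy: string[]
  measures: CompiledMeasure[]
  equals?: Record<string, string | number | boolean>
  limit: number
}

// Quote an identifier so it can never terminate the quoted name.
function quoteIdentifier(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"'
}

function compileToDuckDbSql(query: StructuredTableQuery): {
  sql: string
  params: Array<string | number | boolean>
} {
  const dims = query.groupBy.map(quoteIdentifier)
  const aggs = query.measures.map((m) => {
    // Runtime allow-list check, since input may arrive as untyped JSON.
    if (!ALLOWED_AGGREGATES.includes(m.fn)) throw new Error("unsupported aggregate")
    return `${m.fn}(${quoteIdentifier(m.column)}) as ${quoteIdentifier(m.alias)}`
  })
  const selected = [...dims, ...aggs]
  if (selected.length === 0) throw new Error("query selects nothing")
  const params: Array<string | number | boolean> = []
  const where = Object.entries(query.equals ?? {}).map(([column, value]) => {
    params.push(value)
    return `${quoteIdentifier(column)} = ?`
  })
  const limit = Number.isFinite(query.limit) ? Math.max(1, Math.floor(query.limit)) : 100
  const parts = [`select ${selected.join(", ")}`, `from ${quoteIdentifier(query.table)}`]
  if (where.length > 0) parts.push(`where ${where.join(" and ")}`)
  if (dims.length > 0) parts.push(`group by ${dims.join(", ")}`)
  parts.push(`limit ${limit}`)
  return { sql: parts.join(" "), params }
}
```

The caller never supplies SQL text, which is why this path can plausibly be offered under a weaker capability than direct SQL.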
486
+
487
+ ## BI Dashboard Runtime Changes
488
+
489
+ ### Metrics and charts
490
+
491
+ Use `data.v1.query.run` for non-file semantic/query sources.
492
+
493
+ For plain `workspace-file` dataRefs:
494
+
495
+ - Near-term: `data.v1.query.run` may be used, but the bridge must delegate to the shared file-record service so semantics match the file API.
496
+ - The existing browser-side `/files/records` aggregation fallback is temporary compatibility debt. Remove it in Phase 2 or guard it behind an explicit dev-only feature flag with a tracked removal criterion.
497
+
498
+ Avoid two independent parsers and avoid permanent dashboard-local query execution.
499
+
500
+ ### Perspective panels
501
+
502
+ For `BSLPerspectiveViewer`:
503
+
504
+ 1. If query `dataRef.kind === "workspace-file"`, call file data/perspective endpoint.
505
+ 2. Else call `data.v1.perspective.prepare`.
506
+ 3. Load the result into real Perspective:
507
+ - inline JSON/Arrow for small results
508
+ - artifact Arrow for larger static results
509
+ - websocket once server-side replicated Perspective is available
510
+
511
+ Do not route large raw Perspective datasets through `data.v1.query.run` JSON rows.
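
The routing decision in steps 1 and 2 can be sketched as a small pure function. `planPerspectiveLoad` and `PerspectiveLoadPlan` are hypothetical names, and the URL assumes the proposed `/api/v1/files/data` endpoint shape from this plan, which does not exist yet.

```ts
type DataRef =
  | { kind: "workspace-file"; path: string }
  | { kind: "duckdb-file"; path: string; table?: string }
  | { kind: "sqlite-file"; path: string; table?: string }
  | { kind: "semantic-model"; model: string }

type PerspectiveLoadPlan =
  | { via: "file-api"; url: string }
  | { via: "data-bridge"; operation: "data.v1.perspective.prepare" }

// Plain workspace files go to the file API; everything else, including
// file-backed databases and queries without a dataRef, goes to the bridge.
function planPerspectiveLoad(dataRef: DataRef | undefined): PerspectiveLoadPlan {
  if (dataRef !== undefined && dataRef.kind === "workspace-file") {
    const params = new URLSearchParams({
      path: dataRef.path,
      representation: "perspective",
      "transport.preferred": "auto",
    })
    return { via: "file-api", url: `/api/v1/files/data?${params.toString()}` }
  }
  return { via: "data-bridge", operation: "data.v1.perspective.prepare" }
}
```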
512
+
513
+ ### Eval/E2E stitching
514
+
515
+ Add a true end-to-end workflow test:
516
+
517
+ 1. Run the BI dashboard agent eval.
518
+ 2. Locate generated `*.dashboard.json` in the eval workspace.
519
+ 3. Start playground against that workspace.
520
+ 4. Open the generated dashboard in browser.
521
+ 5. Assert:
522
+ - dashboard title visible
523
+ - at least one metric has non-placeholder value
524
+ - at least one chart/table has rows
525
+ - no “No live data source configured yet” placeholder
526
+ - if Perspective component exists, it loads through Perspective prepare path
527
+
528
+ ## Implementation Phases
529
+
530
+ ### Phase 0 — Document and freeze architecture
531
+
532
+ - Add this plan.
533
+ - Update existing data-bridge and dashboard plans to reference this split.
534
+ - Mark current `data-bridge` workspace-file parser as temporary.
535
+
536
+ Acceptance:
537
+
538
+ - Reviewers agree on file API vs data bridge ownership.
539
+
540
+ ### Phase 1 — Extract file records core
541
+
542
+ - Move file-record parsing/paging/type inference into reusable server module.
543
+ - Keep existing `/api/v1/files/records` behavior unchanged.
544
+ - Add unit tests around extracted module.
545
+ - Add contract tests proving `/api/v1/files/records` and `data-bridge` delegated workspace-file reads return compatible rows/columns/limits.
546
+
547
+ Acceptance:
548
+
549
+ - Existing file route tests pass unchanged.
550
+ - Extracted module can be imported without Fastify.
551
+
552
+ ### Phase 2 — Make `data-bridge` delegate workspace-file reads
553
+
554
+ - Replace local CSV/JSON/NDJSON parser in `plugins/data-bridge` with shared file-record service.
555
+ - Preserve path/security semantics via workspace adapter, not raw `workspaceRoot` reads.
556
+ - Keep traversal/symlink tests or adapt them to workspace adapter semantics.
557
+
558
+ Acceptance:
559
+
560
+ - `@hachej/boring-data-bridge` has no custom CSV parser.
561
+ - Bridge and `/files/records` agree on rows/columns/limits.
562
+ - BI dashboard no longer has a permanent browser aggregation fallback; if retained temporarily, it is feature-flagged and documented for removal.
563
+
564
+ ### Phase 3 — Add file data negotiation endpoint
565
+
566
+ - Add `/api/v1/files/data`, or explicit `/files/arrow` and `/files/perspective` endpoints.
567
+ - Initially support `representation=records` and small `representation=perspective` with `transport.preferred=inline` and `payloadFormat=json`.
568
+ - Add Arrow support behind capability/dependency detection.
569
+
570
+ Acceptance:
571
+
572
+ - File endpoint can return records and a Perspective inline descriptor.
573
+ - Large inline requests are rejected or downgraded to artifact when artifact support exists.
574
+
575
+ ### Phase 4 — Add `data.v1.perspective.prepare`
576
+
577
+ - Add shared contract to `@hachej/boring-data-bridge`.
578
+ - Implement inline JSON first for small results.
579
+ - Add Arrow artifact once artifact storage exists.
580
+ - Add websocket later.
581
+
582
+ Acceptance:
583
+
584
+ - `BSLPerspectiveViewer` no longer calls `data.v1.query.run` for large/raw Perspective loads.
585
+ - Server chooses actual transport from accepted transports.
586
+
587
+ ### Phase 5 — DuckDB file support
588
+
589
+ - Add file API table discovery and paginated preview for `.duckdb`.
590
+ - Add `duckdb-file` data bridge adapter for query execution.
591
+ - Add `WorkspaceFileLease`/read-only materialization support for local and non-local workspace adapters.
592
+ - Add direct SQL gating policy.
593
+ - Add structured dashboard query compiler for DuckDB tables.
594
+
595
+ Acceptance:
596
+
597
+ - A workspace `.duckdb` file can be previewed without SQL.
598
+ - A dashboard can query a DuckDB table through `data.v1.query.run`.
599
+ - Direct SQL requires explicit capability and respects timeouts/output limits.
600
+
601
+ ### Phase 6 — True generated dashboard render E2E
602
+
603
+ - Extend eval harness or add a script that runs agent eval then browser render against generated output.
604
+ - Assert real data rendering, not just valid JSON.
605
+
606
+ Acceptance:
607
+
608
+ - CI can prove: agent creates dashboard → file exists → dashboard opens → data renders.
609
+
610
+ ## Security Requirements
611
+
612
+ - File APIs must use workspace adapter path validation, not raw filesystem reads.
613
+ - Symlink traversal must remain impossible for local filesystem workspaces.
614
+ - Direct SQL must be gated separately from structured dashboard queries.
615
+ - DuckDB/SQLite workspace files must be accessed through read-only leases/materialized snapshots, never raw unchecked paths.
616
+ - Direct BSL Python query strings remain trusted-only with `data:bsl-query-string`.
617
+ - Inline transports must enforce byte/row limits server-side.
618
+ - Websocket transports must bind to workspace/session auth and expire.
619
+ - Artifacts must have content type, no-sniff headers, expiry, and workspace/session authorization.
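
For the artifact requirement, a minimal sketch of the response headers is shown below. The helper `artifactResponseHeaders` and the `ArtifactMeta` shape are assumptions, and workspace/session authorization is deliberately out of scope here; it would need to run before any headers are produced.

```ts
// Hypothetical stored metadata for a prepared artifact.
interface ArtifactMeta {
  contentType: string
  bytes: number
  expiresAt: Date
}

// Returns headers for a still-valid artifact, or null once it has expired
// (the caller should then respond with an error instead of the payload).
function artifactResponseHeaders(
  meta: ArtifactMeta,
  now: Date = new Date(),
): Record<string, string> | null {
  const remainingMs = meta.expiresAt.getTime() - now.getTime()
  if (remainingMs <= 0) return null
  return {
    "Content-Type": meta.contentType,
    "Content-Length": String(meta.bytes),
    // Stops browsers from guessing a different type for binary Arrow data.
    "X-Content-Type-Options": "nosniff",
    // Never let shared caches keep it, and never cache past expiry.
    "Cache-Control": `private, max-age=${Math.floor(remainingMs / 1000)}`,
  }
}
```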
620
+
621
+ ## Open Questions
622
+
623
+ 1. Should `/api/v1/files/data` replace `/api/v1/files/records` long term, or should `/records` stay as the simple stable API?
624
+ 2. Where should Arrow conversion live: `@hachej/boring-agent`, `@hachej/boring-data-bridge`, or a dedicated optional package?
625
+ 3. What artifact store should be used for Arrow artifacts in local/dev/serverless modes?
626
+ 4. Should DuckDB direct SQL be available to browser callers with a constrained read-only policy, or only runtime/server callers?
627
+ 5. Should `WorkspaceFileLease` live in `@hachej/file-data/server` or as a minimal workspace adapter capability consumed by that package?
628
+ 6. Should Perspective websocket mode live in `data-bridge`, the workspace server, or a dedicated `perspective-bridge` plugin?
629
+
630
+ ## Recommended Immediate Next Step
631
+
632
+ Do **Phase 1 + Phase 2** before adding more dashboard features:
633
+
634
+ - Extract the existing file-record implementation.
635
+ - Make `data-bridge` delegate workspace-file reads to it.
636
+ - Keep `data.v1.query.run` for current dashboard metrics/charts.
637
+
638
+ This removes duplicated parsers and gives a safe foundation for Arrow/Perspective transport negotiation.