@c3-oss/prosa 0.6.0 → 0.7.0

Files changed (2)
  1. package/README.md +163 -455
  2. package/package.json +3 -3
package/README.md CHANGED
@@ -1,506 +1,246 @@
- # prosa
+ # prosa - Query your agent history. Keep the raw trail.
 
- `prosa` is a local-first CLI for compiling, searching, auditing, and exporting
- agent session histories.
+ [![npm version](https://img.shields.io/npm/v/@c3-oss/prosa.svg)](https://www.npmjs.com/package/@c3-oss/prosa)
+ [![Node version](https://img.shields.io/node/v/@c3-oss/prosa.svg)](https://www.npmjs.com/package/@c3-oss/prosa)
+ [![License](https://img.shields.io/npm/l/@c3-oss/prosa.svg)](https://www.npmjs.com/package/@c3-oss/prosa)
 
- It imports local histories from Codex CLI, Claude Code, Gemini CLI, and Cursor
- into one canonical bundle so you can search across tools, inspect prior work,
- export readable transcripts, and run analytical queries without giving up the
- original raw data.
+ `prosa` imports local AI agent session histories into one durable, searchable
+ bundle on your machine.
 
- ## What it does
+ It understands Codex CLI, Claude Code, Gemini CLI, and Cursor. It preserves the
+ original raw files, normalizes sessions into SQLite, builds search indexes,
+ exports readable transcripts, writes Parquet for DuckDB, opens a terminal UI,
+ and serves the same local memory over MCP.
 
- - Imports session histories from multiple agent CLIs into a single local store.
- - Preserves raw source files and raw records for future re-processing.
- - Normalizes sessions, messages, tool calls, tool results, artifacts, and graph
- edges into SQLite tables.
- - Builds searchable derived indexes over messages, commands, paths, and result
- previews.
- - Lists and filters sessions by source and timestamp.
- - Exports individual sessions as Markdown.
- - Exports canonical tables to Parquet for DuckDB analytics.
- - Runs built-in analytics reports over Parquet with DuckDB.
- - Provides an Ink-based terminal UI for browsing sessions and search results.
- - Serves a read-only MCP server over the local bundle for agent memory access.
+ Use it when you want to know what your agents already did, which files or
+ commands they touched, where a failure happened, or how to reuse prior work
+ without reading provider-specific JSONL and SQLite files by hand.
 
- `prosa` is early software, but the main CLI surfaces described below are
- implemented.
+ Package: [`@c3-oss/prosa`](https://www.npmjs.com/package/@c3-oss/prosa)
 
- ## Quick start
+ ## Installation
 
- From this repository:
+ Install the published package globally:
 
  ```bash
- devbox shell
- pnpm install
- pnpm build
+ npm install -g @c3-oss/prosa
  ```
 
- During development, run commands through SWC:
+ Or run it without a global install:
 
  ```bash
- pnpm dev -- init
- pnpm dev -- compile codex
- pnpm dev -- sessions
- pnpm dev -- search "terraform"
+ npx --package @c3-oss/prosa prosa --help
  ```
 
- After building or installing the package, use the `prosa` binary:
+ The package provides the `prosa` binary and requires Node.js 22.15.1 through
+ 26.x.
+
+ ## Quickstart
+
+ Build a local bundle from every supported default history location:
 
  ```bash
  prosa init
  prosa compile-all
-
- prosa sessions --source codex --since 2026-01-01
+ prosa sessions
  prosa search "package.json"
+ ```
+
+ Export a transcript and run analytics:
+
+ ```bash
  prosa export session <session-id> --format markdown --out session.md
  prosa export parquet
  prosa query duckdb "select source_tool, count(*) from sessions group by 1"
  prosa analytics tools --refresh
- prosa tui
- prosa mcp serve
  ```
 
- By default, the bundle is stored at `~/.prosa`. Override it with `--store` or
- the `PROSA_STORE` environment variable:
+ Open the terminal UI or serve the bundle through MCP:
 
  ```bash
- PROSA_STORE=/tmp/prosa-demo prosa init
- prosa sessions --store /tmp/prosa-demo
+ prosa tui
+ prosa mcp serve
  ```
 
- ## Supported sources
-
- `prosa compile` imports one source at a time. If `--sessions-path` is omitted,
- the provider default is used:
+ By default, the bundle lives at `~/.prosa`. Override it with `--store` or
+ `PROSA_STORE`:
 
  ```bash
- prosa compile codex [--sessions-path <path>]
- prosa compile claude [--sessions-path <path>]
- prosa compile gemini [--sessions-path <path>]
- prosa compile cursor [--sessions-path <path>]
- prosa compile-all [--verbose] [--json-logs]
+ PROSA_STORE=/tmp/prosa-demo prosa init
+ prosa compile codex --store /tmp/prosa-demo
+ prosa search "migration" --store /tmp/prosa-demo
  ```
 
- Supported importers:
+ ## Why prosa
 
- | Source | Typical path | Imported files |
- |---|---|---|
- | Codex CLI | `~/.codex/sessions` | Recursive `.jsonl` session files |
- | Claude Code | `~/.claude/projects` | Project JSONL files and subagent JSONL files |
- | Gemini CLI | `~/.gemini/tmp` | `chats/session-*.json` snapshots |
- | Cursor | `~/.cursor/chats` | `store.db` SQLite agent stores |
+ Agent tools store useful work in different formats and directories. That makes
+ history hard to search, audit, export, or share with another agent.
 
- Imports are idempotent for already-seen source files. Each import reports counts
- for source files, sessions, messages, tool calls, tool results, artifacts,
- edges, and errors.
-
- `prosa compile` always disables FTS5 triggers during the import loop and
- rebuilds the FTS5 index in bulk at the end (mirroring how the Tantivy sidecar
- is rebuilt). Sidecars stay in sync without a manual step. For recovery, the
- standalone `prosa index fts5` command is still available.
+ `prosa` turns those histories into a local-first data layer:
 
- ## CLI reference
+ - raw source files and raw records stay available for audit and re-processing;
+ - canonical SQLite tables make sessions, messages, tool calls, artifacts, and
+ graph edges queryable;
+ - search surfaces cover messages, commands, paths, diffs, and result previews;
+ - derived exports give humans Markdown and analytics tools Parquet;
+ - MCP exposes the same bundle to agents as reusable local memory.
 
- ### `prosa init`
+ ## Features
 
- Initialize a bundle:
+ - Import Codex CLI, Claude Code, Gemini CLI, and Cursor session histories.
+ - Preserve raw bytes alongside normalized records.
+ - Search with SQLite FTS5 by default, or build an optional Tantivy sidecar.
+ - List sessions with filters for source, time range, columns, and output format.
+ - Export individual sessions as readable Markdown transcripts.
+ - Export canonical tables to Parquet and query them with DuckDB.
+ - Run built-in analytics reports for sessions, tools, errors, models, and
+ projects.
+ - Browse sessions and search results in an Ink-based terminal UI.
+ - Serve MCP tools and prompts over stdio or HTTP Streamable transport.
+ - Run bundle health checks with `prosa doctor`.
 
- ```bash
- prosa init
- prosa init --store /path/to/bundle
- ```
-
- If the bundle already exists, `init` exits with an error unless
- `--force-existing` is passed:
+ ## Supported sources
 
- ```bash
- prosa init --force-existing
- ```
+ `prosa compile` imports one source at a time. `prosa compile-all` imports every
+ supported source from its default location.
 
- ### `prosa compile`
+ | Source | Default path | Imported files |
+ |---|---|---|
+ | Codex CLI | `~/.codex/sessions` | Recursive `.jsonl` session files |
+ | Claude Code | `~/.claude/projects` | Project JSONL files and subagent JSONL files |
+ | Gemini CLI | `~/.gemini/tmp` | `chats/session-*.json` snapshots |
+ | Cursor | `~/.cursor/chats` | `store.db` SQLite agent stores |
 
- Import session histories into the bundle:
+ Examples:
 
  ```bash
  prosa compile codex
- prosa compile claude
+ prosa compile claude --sessions-path ~/custom/claude/projects
  prosa compile gemini
  prosa compile cursor
+ prosa compile-all --verbose
  ```
 
- Override a provider source path:
+ Imports are idempotent for already-seen source files. Each import reports counts
+ for source files, sessions, messages, tool calls, tool results, artifacts,
+ edges, and errors.
 
- ```bash
- prosa compile codex --sessions-path ~/custom/codex/sessions
- ```
+ ## Common workflows
 
- Import every supported provider with default paths:
+ Import everything local:
 
  ```bash
+ prosa init --force-existing
  prosa compile-all
- ```
-
- Options:
-
- | Option | Description |
- |---|---|
- | `--sessions-path <path>` | Root of the selected provider's session history |
- | `--store <path>` | Bundle directory |
- | `--verbose` | Emit debug logs during compilation |
- | `--json-logs` | Emit raw JSON logs instead of pretty logs |
-
- `prosa compile-all` accepts only the logging flags. It uses provider defaults and
- the normal `PROSA_STORE` environment variable when the bundle path must be
- overridden.
-
- ### `prosa index`
-
- Build or inspect derived search indexes:
-
- ```bash
  prosa index status
- prosa index fts5
- prosa index tantivy
  ```
 
- `fts5` is the default SQLite full-text index. `prosa compile` rebuilds it in
- bulk at the end of every import; `prosa index fts5` is a standalone recovery
- path that repopulates the index from `search_docs`.
-
- `tantivy` is an optional sidecar search index. Build it before searching with
- `--engine tantivy`:
+ Find prior work on a topic or file:
 
  ```bash
- prosa index tantivy
- prosa search "migration error" --engine tantivy
- ```
-
- `index status` supports machine-readable output:
-
- ```bash
- prosa index status --output-format json
- ```
-
- ### `prosa sessions`
-
- List sessions in the bundle:
-
- ```bash
- prosa sessions
- prosa sessions --source claude
- prosa sessions --since 2026-01-01
- prosa sessions --until 2026-02-01
- prosa sessions --limit 100
- ```
-
- Count sessions with the same filters:
-
- ```bash
- prosa sessions count
- prosa sessions count --source cursor --since 2026-01-01
- ```
-
- Session list output includes timestamp, source tool, a 12-char `session_id`
- prefix, model, message count, tool call count, and title by default. Use
- `--columns all` to include `cwd_initial`, `source_session_id`,
- `parent_session_id`, `is_subagent`, `git_branch_initial`, `model_first`,
- `status`, `timeline_confidence`, and `end_ts`. Pass a CSV list to pick a
- subset (`--columns start_ts,session_id,title`).
-
- Output formats:
-
- ```bash
- prosa sessions --output-format table
- prosa sessions --output-format json
- prosa sessions --output-format csv
- prosa sessions --columns all
- prosa sessions --columns start_ts,session_id,title
- ```
-
- `table` and `interactive` outputs are width-aware: long values are truncated
- with `…` to fit the terminal (or 200 columns when piped). `json` and `csv`
- always emit full values. Use `prosa tui` for the interactive browser.
-
- ### `prosa search`
-
- Search messages, tool calls, paths, commands, and result previews:
-
- ```bash
- prosa search "terraform"
- prosa search "package.json" --limit 20
- prosa search "failed migration" --output-format json
- prosa search "schema update" --engine fts5
- prosa search "schema update" --engine tantivy
- ```
-
- The default engine is `fts5`. The Tantivy engine requires a sidecar index:
-
- ```bash
- prosa index tantivy
- prosa search "indexing" --engine tantivy
+ prosa search "auth middleware"
+ prosa search "src/server/routes.ts" --limit 20
+ prosa sessions --source codex --since 2026-01-01
  ```
 
- Search output includes timestamp, role, tool name, session ID, and a snippet.
-
- ### `prosa export session`
-
- Export a single session as Markdown:
+ Export a useful transcript:
 
  ```bash
- prosa export session <session-id> --format markdown
+ prosa sessions --limit 20
  prosa export session <session-id> --format markdown --out transcript.md
  ```
 
- Markdown exports include source metadata, prosa and source session IDs,
- timestamps, working directory, branch, model span, timeline confidence,
- messages, and related tool calls.
-
- Large outputs are not intended to be dumped wholesale into Markdown. The export
- renders useful previews while the full bytes remain in the content-addressed
- object store.
-
- ### `prosa export parquet`
-
- Export canonical SQLite tables to Parquet:
+ Audit failed tool usage:
 
  ```bash
- prosa export parquet
- prosa export parquet --out /tmp/prosa-parquet
- ```
-
- The export writes one `.parquet` file per canonical table plus a manifest. These
- files are derived analytics snapshots, not the source of truth.
-
- Exported tables include:
-
- ```text
- objects, source_files, import_batches, raw_records, import_errors,
- uncertainties, projects, sessions, turns, events, messages, content_blocks,
- tool_calls, tool_results, artifacts, edges, search_docs
+ prosa analytics tools --refresh --errors-only
+ prosa analytics errors --output-format json
  ```
 
- ### `prosa query duckdb`
-
- Run DuckDB SQL over exported Parquet tables:
+ Run custom SQL over Parquet:
 
  ```bash
  prosa export parquet
- prosa query duckdb "select source_tool, count(*) from sessions group by 1"
- ```
-
- Use a custom Parquet directory:
-
- ```bash
- prosa query duckdb \
- --parquet-dir /tmp/prosa-parquet \
- "select tool_name, count(*) from tool_calls group by 1 order by 2 desc"
- ```
-
- Output formats:
-
- ```bash
- prosa query duckdb "select count(*) as n from sessions" --output-format json
- prosa query duckdb "select * from sessions limit 10" --output-format csv
- ```
-
- `prosa query duckdb` also exposes derived analytics views:
-
- ```text
- session_facts, tool_usage_facts, error_facts, model_usage, project_activity
+ prosa query duckdb "
+ select tool_name, status, count(*) as n
+ from tool_calls
+ group by 1, 2
+ order by n desc
+ "
  ```
 
- See [`docs/recipes/duckdb.md`](./docs/recipes/duckdb.md) for copy-pasteable
- queries.
-
- ### `prosa analytics`
-
- Run built-in reports over exported Parquet files:
+ Use a faster sidecar search index:
 
  ```bash
- prosa analytics sessions --refresh
- prosa analytics tools --source codex
- prosa analytics errors --output-format json
- prosa analytics models --since 2026-01-01
- prosa analytics projects --project /Users/me/app
+ prosa index tantivy
+ prosa search "slow test" --engine tantivy
+ prosa index status
  ```
 
- Reports require Parquet files. Add `--refresh` to export Parquet before running
- the report. All reports support `--store`, `--parquet-dir`, `--source`,
- `--since`, `--until`, `--limit`, `--output-format table|json|csv`, and
- `--columns <list>` for column selection.
+ ## Command map
 
- Table output is curated to fit a normal terminal: `analytics sessions` shows
- 9 columns by default (drops `source_file_path`, `session_id`,
- `source_session_id`, `tool_result_count`, `tool_duration_ms`, and
- `timeline_confidence`), `analytics projects` drops `project_path`, and
- `analytics errors` drops `session_id` and the full `message` (the shorter
- `preview` keeps the signal). Use `--columns all` to get every column the SQL
- returns, or `--columns col1,col2` to pick specific ones:
+ | Command | Purpose |
+ |---|---|
+ | `prosa init` | Create a bundle directory with manifest, SQLite, lock file, and object store. |
+ | `prosa compile <source>` | Import one source: `codex`, `claude`, `gemini`, or `cursor`. |
+ | `prosa compile-all` | Import every supported source from default paths. |
+ | `prosa sessions` | List or count sessions with filters and table, JSON, or CSV output. |
+ | `prosa search <query>` | Full-text search across messages, tool calls, paths, commands, and previews. |
+ | `prosa index` | Inspect or rebuild FTS5 and Tantivy search indexes. |
+ | `prosa export session` | Export one session as Markdown. |
+ | `prosa export parquet` | Export canonical tables to Parquet for analytics. |
+ | `prosa query duckdb` | Run DuckDB SQL over exported Parquet tables and derived views. |
+ | `prosa analytics` | Run built-in reports for sessions, tools, errors, models, and projects. |
+ | `prosa tui` | Open the interactive terminal explorer. |
+ | `prosa mcp serve` | Serve the bundle through MCP over stdio or HTTP. |
+ | `prosa doctor` | Run bundle health checks. |
+
+ Most commands accept `--store <path>`. `PROSA_STORE` sets the default bundle
+ path for a shell session.
+
+ Useful output flags:
 
  ```bash
+ prosa sessions --output-format json
+ prosa sessions --columns start_ts,session_id,title
+ prosa search "schema update" --output-format json
  prosa analytics sessions --columns all
- prosa analytics sessions --columns start_ts,project_name,source_file_path
- prosa analytics errors --columns all # includes the full `message`
  ```
 
- `json` and `csv` output always include every column regardless of `--columns`.
+ ## MCP
 
- Additional filters:
-
- ```bash
- prosa analytics tools --tool-name Bash --errors-only
- prosa analytics tools --canonical-type shell
- prosa analytics errors --category tool_result
- prosa analytics models --model gpt-5.4
- ```
-
- ### `prosa tui`
-
- Open the Ink-based interactive explorer:
-
- ```bash
- prosa tui
- prosa tui --store /path/to/bundle
- ```
-
- Key bindings:
-
- | Key | Action |
- |---|---|
- | `j` / `k` or arrows | Move selection or scroll detail view |
- | `Enter` | Open the selected session |
- | `/` | Search |
- | `s` | Cycle source filter |
- | `R` | Reload |
- | `Esc` | Return to the session list |
- | `gg` / `G` | Jump to top or bottom |
- | `Ctrl-d` / `Ctrl-u` | Half-page down or up |
- | `q` | Quit from the session list |
-
- ### `prosa mcp serve`
-
- Start a local read-only MCP server over the bundle. The default transport is
- stdio, suitable for MCP clients that launch a command through `npx` or a local
- binary:
+ Start a local MCP server over the bundle:
 
  ```bash
  prosa mcp serve
- npx @c3-oss/prosa mcp serve
- prosa mcp serve --transport stdio
- prosa mcp serve --search-engine tantivy
+ npx --package @c3-oss/prosa prosa mcp serve
+ prosa mcp serve --transport http --host 127.0.0.1 --port 7331 --path /mcp
  ```
 
- Example MCP client command config:
+ Example stdio client config:
 
  ```json
  {
  "command": "npx",
- "args": ["@c3-oss/prosa", "mcp", "serve"]
+ "args": ["--package", "@c3-oss/prosa", "prosa", "mcp", "serve"]
  }
  ```
 
- In stdio mode, stdout is reserved for MCP JSON-RPC frames. Do not expect normal
- human-readable startup logs on stdout.
-
- To expose MCP over HTTP Streamable transport, pass `--transport http`:
-
- ```bash
- prosa mcp serve --transport http
- prosa mcp serve --transport http --host 127.0.0.1 --port 7331 --path /mcp
- prosa mcp serve --transport http --search-engine tantivy
- ```
-
- By default, HTTP mode listens at:
-
- ```text
- http://127.0.0.1:7331/mcp
- ```
-
- Registered MCP tools (six in total):
+ MCP tools include:
 
  | Tool | Purpose |
  |---|---|
- | `search` | Full-text search over messages, commands, paths, diffs, and previews. Optional `engine`, `field_kind`, `since`/`until`, `raw`, `limit`. |
- | `sessions` | Without `session_id`, lists candidates filtered by source/time/limit. With `session_id`, opens it: `format=detail` (default) returns metadata + timeline, `format=summary` returns the row only, `format=markdown` renders the transcript. |
- | `tool_calls` | Audit commands and tool usage by tool_name, canonical_type, session_id, errors_only, time bounds. When `path_substring` is set, also returns matching artifacts. |
- | `analytics` | Built-in aggregate reports backed by SQLite views: `report=sessions\|tools\|errors\|models\|projects` with the matching filters. |
- | `artifact` | Fetch full text for an `artifact_id`. Binary artifacts return a placeholder. |
- | `compile` | Without args, returns a status snapshot (search index health). With `source` (and optional `sessions_path`), imports that provider into the bundle. |
-
- Registered MCP prompts include:
-
- | Prompt | Purpose |
- |---|---|
- | `investigate_prior_work` | Search prior work on a topic and cite evidence |
- | `find_file_history` | Investigate history for a file or path |
- | `audit_tool_failures` | Group and explain failed tool calls |
+ | `search` | Search messages, commands, paths, diffs, and previews. |
+ | `sessions` | List sessions or open a session as detail, summary, or Markdown. |
+ | `tool_calls` | Audit tool usage, errors, commands, and path-related artifacts. |
+ | `analytics` | Run aggregate reports over SQLite-backed views. |
+ | `artifact` | Fetch full stored text for an artifact. |
+ | `compile` | Return compile/index status, or import a source when args are provided. |
 
- Five tools are read-only; `compile` is dual-mode (status without args, mutating import with args). All tools use the same services as the CLI.
-
- ## Common workflows
+ MCP prompts include `investigate_prior_work`, `find_file_history`, and
+ `audit_tool_failures`.
 
- ### Import everything local
-
- ```bash
- prosa init --force-existing
- prosa compile-all
- prosa index status
- ```
-
- ### Find prior work on a topic
-
- ```bash
- prosa search "auth middleware"
- prosa sessions --source codex --limit 20
- prosa export session <session-id> --format markdown
- ```
-
- ### Audit failed or suspicious tool usage
-
- Use the built-in analytics report for quick aggregates:
-
- ```bash
- prosa analytics tools --refresh --errors-only
- prosa analytics errors --output-format json
- ```
-
- Use MCP `tool_calls` for the richest session-level filtering, or query
- Parquet directly when you need custom SQL:
-
- ```bash
- prosa export parquet
- prosa query duckdb "
- select tool_name, status, count(*) as n
- from tool_calls
- group by 1, 2
- order by n desc
- "
- ```
-
- ### Summarize a custom session store through MCP
-
- After compiling a non-default sessions path, use MCP `analytics report=sessions`
- with `source_path_substring` to keep analysis inside prosa instead of reading
- the source JSONL directly. This is useful for stores such as
- `~/.codex-mz/sessions` that share the same provider name as the default Codex
- store.
-
- ### Search faster with a sidecar index
-
- ```bash
- prosa index tantivy
- prosa search "slow test" --engine tantivy
- prosa index status
- ```
-
- ### Keep an isolated test bundle
-
- ```bash
- prosa init --store /tmp/prosa-demo
- prosa compile codex --store /tmp/prosa-demo
- prosa tui --store /tmp/prosa-demo
- ```
+ In stdio mode, stdout is reserved for MCP JSON-RPC frames.
 
  ## Bundle layout
 
@@ -525,17 +265,14 @@ The layers are:
 
  | Layer | Contents |
  |---|---|
- | Raw immutable layer | Preserved source files, raw records, import batches, import errors |
- | Canonical projection | Projects, sessions, turns, events, messages, blocks, tool calls, tool results, artifacts, edges |
- | Derived read surfaces | `search_docs`, SQLite FTS5, Tantivy sidecar, Markdown, Parquet |
+ | Raw immutable layer | Source files, raw records, import batches, and import errors. |
+ | Canonical projection | Projects, sessions, turns, events, messages, tool calls, artifacts, and edges. |
+ | Derived read surfaces | `search_docs`, SQLite FTS5, Tantivy, Markdown, and Parquet. |
 
  SQLite is the canonical catalog. Large payloads such as raw records, tool
  outputs, diffs, and JSON payloads are stored in the content-addressed object
  store. Object IDs use BLAKE3 and object bytes are compressed with zstd.
 
- Raw source files are preserved so future importer versions can rebuild better
- projections without re-reading the original tool history directories.
-
  Search indexes, Markdown exports, and Parquet files are derived. Do not treat
  them as the source of truth.
 
@@ -543,31 +280,22 @@ them as the source of truth.
 
  Requirements:
 
- - Node.js 22 or newer
+ - Node.js 22.15.1 through 26.x
  - pnpm
  - devbox, recommended for the local shell
 
- Useful commands:
+ From this repository:
 
  ```bash
+ devbox shell
  pnpm install
- pnpm dev -- <command>
- just build
- just test
- just lint
- just typecheck
  pnpm build
  pnpm test
- pnpm test:watch
- pnpm test:coverage
  pnpm typecheck
  pnpm lint
- pnpm lint:fix
- pnpm format
- pnpm clean
  ```
 
- Examples:
+ Run the CLI through SWC while developing:
 
  ```bash
  pnpm dev -- init --store /tmp/prosa-dev
@@ -580,9 +308,9 @@ Project layout:
  | Path | Purpose |
  |---|---|
  | `src/cli/commands/` | CLI command implementations |
- | `src/core/` | Bundle, schema, CAS, domain IDs, ingest helpers |
+ | `src/core/` | Bundle, schema, CAS, domain IDs, and ingest helpers |
  | `src/importers/` | Codex, Claude, Gemini, and Cursor importers |
- | `src/services/` | Sessions, search, indexing, exports |
+ | `src/services/` | Sessions, search, indexing, exports, and analytics |
  | `src/mcp/` | MCP server, tools, and prompts |
  | `src/tui/` | Ink terminal UI |
  | `test/` | Vitest tests and fixtures |
@@ -590,39 +318,28 @@ Project layout:
 
  ## Documentation
 
- `docs/` holds the architecture and source-format references. Start with
- [`docs/README.md`](./docs/README.md) for an index. Highlights:
+ Start with [`docs/README.md`](./docs/README.md) for the full documentation
+ index.
 
  | Doc | Purpose |
  |---|---|
- | [`docs/architecture/bundle-format.md`](./docs/architecture/bundle-format.md) | Bundle layout, full SQLite schema, CAS, idempotency keys |
- | [`docs/architecture/import-pipeline.md`](./docs/architecture/import-pipeline.md) | How `compile` walks sources, stages CAS, commits, and rebuilds indexes |
- | [`docs/architecture/search-engines.md`](./docs/architecture/search-engines.md) | FTS5 default vs. Tantivy sidecar |
- | [`docs/sources/codex.md`](./docs/sources/codex.md) | `~/.codex/sessions/` JSONL format |
- | [`docs/sources/claude-code.md`](./docs/sources/claude-code.md) | `~/.claude/projects/` JSONL + artifacts |
- | [`docs/sources/cursor.md`](./docs/sources/cursor.md) | `~/.cursor/chats/**/store.db` SQLite |
- | [`docs/sources/gemini.md`](./docs/sources/gemini.md) | `~/.gemini/tmp/` JSON |
+ | [`docs/architecture/bundle-format.md`](./docs/architecture/bundle-format.md) | Bundle layout, SQLite schema, CAS, and idempotency keys |
+ | [`docs/architecture/import-pipeline.md`](./docs/architecture/import-pipeline.md) | How imports walk sources, stage CAS, commit, and rebuild indexes |
+ | [`docs/architecture/search-engines.md`](./docs/architecture/search-engines.md) | FTS5 default search vs. Tantivy sidecar search |
+ | [`docs/recipes/duckdb.md`](./docs/recipes/duckdb.md) | Copy-pasteable DuckDB analytics queries |
+ | [`docs/sources/codex.md`](./docs/sources/codex.md) | Codex CLI source format |
+ | [`docs/sources/claude-code.md`](./docs/sources/claude-code.md) | Claude Code source format |
+ | [`docs/sources/cursor.md`](./docs/sources/cursor.md) | Cursor source format |
+ | [`docs/sources/gemini.md`](./docs/sources/gemini.md) | Gemini CLI source format |
 
  ## Releasing
 
- `prosa` uses Changesets for local npm releases to the official npm registry.
- The package is published publicly as `@c3-oss/prosa`.
-
- Create a changeset for user-facing changes:
+ `prosa` uses Changesets for npm releases to the official npm registry. The
+ package is published publicly as `@c3-oss/prosa`.
 
  ```bash
  just changeset
- ```
-
- Apply pending changesets to `package.json` and `CHANGELOG.md`:
-
- ```bash
  just version-packages
- ```
-
- Build and publish:
-
- ```bash
  just release
  ```
 
@@ -632,8 +349,9 @@ unless you intend to publish to `https://registry.npmjs.org/`.
 
  ## Status and limitations
 
- - Version is currently `0.1.0`.
- - Source formats for agent tools can change; importers preserve raw bytes so
+ - `prosa` is early software. The main CLI surfaces documented here are
+ implemented, but source formats and importer coverage will continue to evolve.
+ - Agent tools can change their on-disk formats; importers preserve raw bytes so
  projections can be improved later.
  - FTS5 is available by default; Tantivy search requires `prosa index tantivy`
  before use.
@@ -643,13 +361,3 @@ unless you intend to publish to `https://registry.npmjs.org/`.
  dumping every stored byte inline.
  - The default store may contain private local history. Be careful before
  sharing exports, Parquet snapshots, or the bundle itself.
-
- ## Why this exists
-
- Every agent CLI keeps history in its own format. Searching across tools is
- painful, auditing tool calls is harder, and exporting human-readable transcripts
- is inconsistent.
-
- `prosa` reduces that fragmentation into one queryable local store while
- preserving provenance and raw source bytes for auditability and future
- re-processing.
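
Review note: the updated README narrows the Node.js requirement from "22 or newer" to "22.15.1 through 26.x", and the `engines` field in `package.json` changes to match (`>=22.15.1 <27`). The sketch below shows one way to check a version string against that range. It is only an illustration: `parseVersion`, `compareVersions`, and `satisfiesProsaEngines` are hypothetical helpers that are not part of `prosa`, and the comparison ignores prerelease and build tags, so it is not a full semver implementation.

```typescript
// Hypothetical helpers, not part of prosa. They check a Node.js version
// string against the 0.7.0 engines range ">=22.15.1 <27".
// Assumption: prerelease/build suffixes (e.g. "-rc.1") are ignored.

interface Version {
  major: number;
  minor: number;
  patch: number;
}

function parseVersion(raw: string): Version {
  // Accept "v22.15.1" or "22.15.1".
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(raw.trim());
  if (!match) {
    throw new Error(`Unrecognized version string: ${raw}`);
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: Version, b: Version): number {
  if (a.major !== b.major) return a.major - b.major;
  if (a.minor !== b.minor) return a.minor - b.minor;
  return a.patch - b.patch;
}

function satisfiesProsaEngines(raw: string): boolean {
  const version = parseVersion(raw);
  const minimum: Version = { major: 22, minor: 15, patch: 1 };
  // Lower bound is inclusive (>=22.15.1); upper bound is exclusive (<27).
  return compareVersions(version, minimum) >= 0 && version.major < 27;
}

console.log(satisfiesProsaEngines("22.15.1")); // true
console.log(satisfiesProsaEngines("22.15.0")); // false
console.log(satisfiesProsaEngines("v24.1.0")); // true
console.log(satisfiesProsaEngines("27.0.0")); // false
```

Installs that passed under 0.6.0 (for example Node.js 22.0.0) fall outside the new range. For anything beyond a quick check, the `satisfies` function from the `semver` package handles prerelease tags and full range syntax more robustly than this sketch.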
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@c3-oss/prosa",
- "version": "0.6.0",
+ "version": "0.7.0",
  "description": "Compile, search, and export local agent session histories (Cursor, Codex, Claude Code, Gemini CLI) into a single canonical store.",
  "author": "Caian Ertl <hi@caian.org>",
  "license": "MIT",
@@ -32,14 +32,14 @@
  "registry": "https://registry.npmjs.org/"
  },
  "engines": {
- "node": ">=22"
+ "node": ">=22.15.1 <27"
  },
  "dependencies": {
  "@duckdb/node-api": "1.5.2-r.1",
  "@modelcontextprotocol/sdk": "^1.29.0",
  "@noble/hashes": "^1.7.0",
  "@oxdev03/node-tantivy-binding": "0.2.0",
- "better-sqlite3": "^11.5.0",
+ "better-sqlite3": "^12.10.0",
  "commander": "^12.1.0",
  "filtrex": "^3.1.0",
  "ink": "^7.0.1",