agentel 0.2.6 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -17,7 +17,7 @@ All supported sources are normalized into the same archive shape before write:
17
17
  - `cwd`: working directory if the source exposes one or agentlog can infer one.
18
18
  - `repoCanonical`: git remote key from `cwd`, such as `github.com/org/repo`.
19
19
  - `scopeCanonical`: non-repo storage scope for sessions without a reliable
20
- working directory, such as `claude-desktop/uncategorized`.
20
+ working directory, such as `claude-cowork/uncategorized`.
21
21
  - `messages`: normalized `user`, `assistant`, `system`, or `tool` messages with
22
22
  ISO timestamps.
23
23
  - `sourcePath`: local source file, directory, database, or export file.
@@ -59,6 +59,44 @@ usage is preserved separately and repeated provider request ids are counted once
59
59
  which avoids inflating Claude Code/Desktop sessions that repeat the same request
60
60
  usage across assistant text and tool-call rows.
61
61
 
62
+ Session metadata stores compact `toolUsage` summaries derived from canonical
63
+ `tool.called`/`tool.completed` events so global and project stats can aggregate
64
+ most-used tools without rereading transcripts. Existing archives without that
65
+ field need a clean reimport/rebuild before most-used tool stats are complete;
66
+ the web stats layer does not reread event files in the request path.
67
+
68
+ Archive schema v6 adds import-time summaries for work and durable context.
69
+ `outputTokenWork`
70
+ classifies assistant output tokens into `text`, `toolUse`, `reasoning`, and
71
+ `unknown` buckets when message usage and message shape support it; mixed
72
+ text/tool messages are counted as `unknown` instead of being split by a hidden
73
+ heuristic. `outcomes` stores lightweight counts for edit tool calls, unique
74
+ files touched by edit tools, and durable knowledge captures such as `AGENTS.md`,
75
+ `CLAUDE.md`, provider memory files, skill definitions, project rules, and
76
+ planning/decision docs. Canonical events v5 additionally emits `memory.read`,
77
+ `memory.write`, and `memory.loaded` events for memory-file activity, and
78
+ `outcomes` records `memoryReads`, `memoryWrites`, and `memoryLoads` separately
79
+ from generic edit/tool counts. The stats API aggregates these fields into daily
80
+ rows for output job-mix charts and tokens-per-meaningful-event ratios. Older
81
+ archives should be rebuilt or reimported for coverage; the viewer does not scan
82
+ old transcripts to recover these fields.
83
+ View schema v9 carries those memory events through the compact browser payload
84
+ and renders memory-file reads, writes, and loads as memory activity rather than
85
+ ordinary file diff cards.
86
+
87
+ Spend charts are derived from actual provider/export `costUsd` values when
88
+ available, otherwise from a versioned model-pricing table and known token
89
+ directions. Tokens that cannot be priced confidently remain in unpriced coverage
90
+ fields rather than being multiplied by a single blended rate. Estimated spend
91
+ payloads carry pricing source/version metadata from `src/pricing.js`, while
92
+ provider-reported cost stays labeled as actual. The stats API also emits a
93
+ 30-day usage summary for the web viewer: latest-day spend/tokens/prompts,
94
+ 7-day spend, 30-day spend/tokens/prompts/sessions, and the top known model in
95
+ that window. Browser stats payloads are range-shaped: the default response
96
+ includes all-time scalar totals, the visible chart window, and the rolling
97
+ activity heatmap; older activity years and all-time daily breakdowns are fetched
98
+ on demand instead of being bundled into every first load.
99
+
62
100
  When provider token usage is missing, message-level estimates use visible text
63
101
  length plus visible tool-call summaries at roughly four characters per token.
64
102
  The generic archive estimate is stored as `metadata.tokenEstimate` on each
@@ -84,6 +122,12 @@ own SDK aggregate fields plus `split_stats.sdk`, and the web view renders an SDK
84
122
  jobs card and heatmap. This keeps high-volume batch automation searchable and
85
123
  auditable without letting it swamp interactive usage stats.
86
124
 
125
+ Subagent child sessions are also excluded from primary stats by default. Parent
126
+ sessions keep compact subagent run metadata and provider-level thread counters,
127
+ while child sessions remain direct-addressable for transcript inspection and can
128
+ be included by the web stats Subagents toggle or other explicit
129
+ subagent-inclusive stats/search paths.
130
+
87
131
  Cursor sessions that still lack provider-reported usage get a separate
88
132
  `estimatedUsage` metadata field instead of synthetic `usage`. The estimate uses
89
133
  empirical per-assistant-turn Cursor rates by model family, with visible
@@ -95,15 +139,35 @@ split as non-assistant input, assistant output, and Claude thinking output, not
95
139
  reconstructed billing context windows.
96
140
 
97
141
  ```sh
98
- agentlog update --yes --since all
142
+ agentlog update --yes
99
143
  ```
100
144
 
101
145
  `agentlog update` preserves config preferences, redaction settings, web account
102
- labels, source histories, and recall integrations. It removes derived local
103
- archive/import/index state and reimports configured local sources. Manual web
104
- exports still need to be imported again from the original export file when those
105
- archives need to be rebuilt. `agentlog reset` is the heavier path: it removes
106
- agentlog state and archive objects, including config, while still leaving source
146
+ labels, manually imported ChatGPT/Claude.ai archive objects, local archive
147
+ objects whose original source files are no longer present, source histories,
148
+ and recall integrations. It removes derived local agent archive/import/index
149
+ state and reimports configured local sources. Sessions whose source transcripts
150
+ have already been cleaned up by the source application are restored from the
151
+ previous agentlog archive instead of being silently dropped, and update/doctor
152
+ report a source/provider/sourceType breakdown for those preserved unavailable
153
+ sessions. When the preserved session is an old Claude Code archive with an
154
+ Agentlog raw backup, restore the source transcript explicitly before a clean
155
+ reimport:
156
+
157
+ ```sh
158
+ agentlog repair claude-code-backups --dry-run
159
+ agentlog repair claude-code-backups --yes
160
+ agentlog update --yes --since all --sources claude
161
+ ```
162
+
163
+ The default rebuild window is `imports.updateSince`, saved from the initial
164
+ backfill or explicit all-source imports, falling back to `all` for legacy configs.
165
+ The watcher's rolling
166
+ `imports.defaultSinceDays` is not used by `agentlog update`; `--since` still
167
+ overrides it for one run. Manual web exports only need to be
168
+ imported again from the original export file when those chat archives themselves
169
+ need to be rebuilt. `agentlog reset` is the heavier path: it removes agentlog
170
+ state and archive objects, including config, while still leaving source
107
171
  application histories such as Cursor, Codex, Claude, Gemini, or Devin logs
108
172
  untouched.
109
173
 
@@ -137,13 +201,16 @@ agentlog does not blindly copy entire source directories.
137
201
  ## Canonical Events
138
202
 
139
203
  `events.jsonl` is the provider-independent archive/search substrate. It uses
140
- schema version `agentlog.events.v2` and these event kinds:
204
+ schema version `agentlog.events.v5` and these event kinds:
141
205
 
142
206
  - `session.started`
143
207
  - `prompt.submitted`
144
208
  - `response.generated`
145
209
  - `tool.called`
146
210
  - `tool.completed`
211
+ - `memory.read`
212
+ - `memory.write`
213
+ - `memory.loaded`
147
214
 
148
215
  Agentlog intentionally ports only the portable Forge idea here: canonical
149
216
  prompt/response/tool events with parser versions. It does not port Forge's
@@ -156,10 +223,12 @@ events add viewer-facing display metadata:
156
223
 
157
224
  - `metadata.toolCalls[]`: `id`, `name`, `displayName`, `category`, `title`,
158
225
  `status`, `argument`, `rawInputSummary`, `inputPreview`, `target`, `icon`,
159
- `categoryLabel`, and `provider`.
226
+ `categoryLabel`, `provider`, and optional `structuredPatch` hunks when the
227
+ source exposes line-aware unified diffs.
160
228
  - `metadata.toolResult`: `id`, `name`, `provider`, `kind`, `title`, `summary`,
161
- `output`, `lineCount`, `collapsed`, `category`, `categoryLabel`, `icon`, and
162
- optional `status`.
229
+ `output`, `lineCount`, `collapsed`, `category`, `categoryLabel`, `icon`,
230
+ optional `status`, and optional provider diff hunks such as
231
+ `structuredPatch`.
163
232
 
164
233
  `tool.completed.parentEventId` links to the matching `tool.called` event when a
165
234
  provider exposes stable ids or matching tool names. When those are absent,
@@ -168,6 +237,9 @@ such as Devin CLI still preserve the call/result relationship.
168
237
 
169
238
  The viewer reads canonical events or normalized metadata first. Text patterns
170
239
  such as `Grep(...)` are legacy fallback only.
240
+ For memory-file activity, the viewer follows the canonical
241
+ `tool.called -> tool.completed -> memory.*` chain and labels the visible tool
242
+ card as a memory read/write/load instead of a generic read or edit.
171
243
 
172
244
  Provider-generated context sometimes appears in upstream logs as `role: user`.
173
245
  Agentlog preserves those records in transcripts, but reclassifies known shapes
@@ -195,36 +267,49 @@ package-prefixed scheme.
195
267
 
196
268
  | Source type | Version |
197
269
  | --- | --- |
198
- | `codex-cli-history` | `0.2.6.0` |
199
- | `codex-desktop-history` | `0.2.6.0` |
200
- | `codex-sdk-history` | `0.2.6.0` |
201
- | `cli-history` | `0.2.6.0` |
202
- | `claude-sdk-history` | `0.2.6.0` |
203
- | `claude-code-desktop-metadata` | `0.2.6.0` |
204
- | `claude-workspace-desktop` | `0.2.6.0` |
205
- | `cursor-workspace-sqlite` | `0.2.6.0` |
206
- | `cursor-global-sqlite` | `0.2.6.0` |
207
- | `cursor-raw-sqlite-salvage` | `0.2.6.0` |
208
- | `cursor-agent-transcripts` | `0.2.6.0` |
209
- | `devin-cli-history` | `0.2.6.0` |
210
- | `gemini-cli-history` | `0.2.6.0` |
211
- | `cline-task-history` | `0.2.6.0` |
212
- | `opencode-cli-history` | `0.2.6.0` |
213
- | `opencode-cli-sqlite-history` | `0.2.6.0` |
214
- | `opencode-desktop-history` | `0.2.6.0` |
215
- | `opencode-desktop-sqlite-history` | `0.2.6.0` |
216
- | `opencode-web-sqlite-history` | `0.2.6.0` |
217
- | `opencode-history` | `0.2.6.0` |
218
- | `opencode-sqlite-history` | `0.2.6.0` |
219
- | `aider-chat-history` | `0.2.6.0` |
220
- | `antigravity-history` | `0.2.6.0` |
221
- | `antigravity-trajectory-summary` | `0.2.6.0` |
222
- | `windsurf-trajectory-export` | `0.2.6.0` |
223
- | `web-chat-export` | `0.2.6.0` |
224
- | `chatgpt-export` | `0.2.6.0` |
225
- | `claude-web-export` | `0.2.6.0` |
226
- | `claude-web-memory` | `0.2.6.0` |
227
- | `import` | `0.2.6.0` |
270
+ | `codex-cli-history` | `0.3.0.0` |
271
+ | `codex-desktop-history` | `0.3.0.0` |
272
+ | `codex-sdk-history` | `0.3.0.0` |
273
+ | `cli-history` | `0.3.0.0` |
274
+ | `claude-sdk-history` | `0.3.0.0` |
275
+ | `claude-code-desktop-metadata` | `0.3.0.0` |
276
+ | `claude-workspace-desktop` | `0.3.0.0` |
277
+ | `cursor-workspace-sqlite` | `0.3.0.0` |
278
+ | `cursor-global-sqlite` | `0.3.0.0` |
279
+ | `cursor-raw-sqlite-salvage` | `0.3.0.0` |
280
+ | `cursor-agent-transcripts` | `0.3.0.0` |
281
+ | `devin-cli-history` | `0.3.0.0` |
282
+ | `devin-desktop-acp-events` | `0.3.0.0` |
283
+ | `copilot-cli-history` | `0.3.0.0` |
284
+ | `factory-droid-history` | `0.3.0.0` |
285
+ | `grok-build-history` | `0.3.0.0` |
286
+ | `pi-cli-history` | `0.3.0.0` |
287
+ | `gemini-cli-history` | `0.3.0.0` |
288
+ | `cline-task-history` | `0.3.0.0` |
289
+ | `opencode-cli-history` | `0.3.0.0` |
290
+ | `opencode-cli-sqlite-history` | `0.3.0.0` |
291
+ | `opencode-desktop-history` | `0.3.0.0` |
292
+ | `opencode-desktop-sqlite-history` | `0.3.0.0` |
293
+ | `opencode-web-sqlite-history` | `0.3.0.0` |
294
+ | `opencode-history` | `0.3.0.0` |
295
+ | `opencode-sqlite-history` | `0.3.0.0` |
296
+ | `aider-chat-history` | `0.3.0.0` |
297
+ | `antigravity-history` | `0.3.0.0` |
298
+ | `antigravity-transcript-log` | `0.3.0.0` |
299
+ | `antigravity-cli-transcript-log` | `0.3.0.0` |
300
+ | `antigravity-cli-brain` | `0.3.0.0` |
301
+ | `antigravity-ide-transcript-log` | `0.3.0.0` |
302
+ | `antigravity-ide-brain` | `0.3.0.0` |
303
+ | `antigravity-summary-proto` | `0.3.0.0` |
304
+ | `antigravity-trajectory-summary` | `0.3.0.0` |
305
+ | `windsurf-cascade-brain` | `0.3.0.0` |
306
+ | `windsurf-cascade-protobuf` | `0.3.0.0` |
307
+ | `windsurf-trajectory-export` | `0.3.0.0` |
308
+ | `web-chat-export` | `0.3.0.0` |
309
+ | `chatgpt-export` | `0.3.0.0` |
310
+ | `claude-web-export` | `0.3.0.0` |
311
+ | `claude-web-memory` | `0.3.0.0` |
312
+ | `import` | `0.3.0.0` |
228
313
 
229
314
  `cursor-sqlite-history` and `antigravity-brain` are compatibility aliases for
230
315
  older labels. Fingerprints include the parser version prefix, so changing the
@@ -238,21 +323,25 @@ back to sessions for CLI/skill compatibility. Archives without `events.jsonl`
238
323
  remain searchable through transcript/markdown fallback, and missing
239
324
  `conversation.md` files are materialized from transcripts when needed.
240
325
  The web session API reads pre-baked `view.json` for the default readable pane.
241
- When the raw Markdown source view is requested, the browser asks for a
242
- Markdown-only payload instead of downloading the full transcript again. Browser
243
- session payloads compact duplicated tool output and use ETag revalidation so
326
+ `view.json` is a display cache, not the source of truth: it keeps transcript
327
+ message content visible but omits duplicated tool-output bodies from structured
328
+ metadata and canonical-event text so very long sessions can still be written and
329
+ loaded. The full redacted transcript and canonical events remain in
330
+ `transcript.jsonl` and `events.jsonl`. When the raw Markdown source view is
331
+ requested, the browser asks for a Markdown-only payload instead of downloading
332
+ the full transcript again. Browser session payloads use ETag revalidation so
244
333
  revisits and live refresh checks can avoid reparsing unchanged transcripts.
245
- The local BM25 JSON index stores term postings plus document metadata for
246
- compatibility. A SQLite FTS5 sidecar stores the same chunks for interactive
247
- search so browser, terminal, and MCP recall queries do not parse a large JSON
248
- index in short-lived search processes. Index format bumps trigger a rebuild from
249
- existing `transcript.jsonl` and `events.jsonl`; they do not require reparsing
250
- provider source files. The web search endpoint is optimized for typing: it uses
251
- the compatible warm FTS/index when present, skips obsolete or stale indexes
252
- rather than parsing/rebuilding inline, and does not scan every rendered Markdown
253
- archive as a fallback. Terminal and MCP recall search also avoid synchronous
254
- rebuilds and BM25 JSON parses, then fall back to the bounded Markdown search for
255
- legacy archives or misses.
334
+ The normal local index rebuild writes a small JSON summary plus a SQLite FTS5
335
+ sidecar for browser, terminal, and MCP recall queries. The older full BM25 JSON
336
+ index still exists for explicit compatibility callers, but routine update and
337
+ rebuild flows avoid generating that large serialized object. Index format bumps
338
+ trigger a rebuild from existing `transcript.jsonl` and `events.jsonl`; they do
339
+ not require reparsing provider source files. The web search endpoint is
340
+ optimized for typing: it uses the compatible warm FTS/index when present, skips
341
+ obsolete or stale indexes rather than parsing/rebuilding inline, and does not
342
+ scan every rendered Markdown archive as a fallback. Terminal and MCP recall
343
+ search also avoid synchronous rebuilds and BM25 JSON parses, then fall back to
344
+ the bounded Markdown search for legacy archives or misses.
256
345
 
257
346
  Recall quality has deterministic tests in `test/recall-eval.test.js` with
258
347
  fixtures under `test/fixtures/recall-evals.json`. Add a fixture when a vague
@@ -263,25 +352,33 @@ real-world query should reliably find a representative archived session.
263
352
  The setup UI, import defaults, and history source filters use this grouped order:
264
353
 
265
354
  1. OpenAI: Codex CLI, Codex Desktop, Codex SDK jobs, ChatGPT
266
- 2. Anthropic: Claude Code CLI, Claude Code Desktop, Claude Workspace,
355
+ 2. Anthropic: Claude Code CLI, Claude Code Desktop, Claude Cowork,
267
356
  Claude.ai, Claude SDK jobs
268
- 3. Google: Gemini CLI, Antigravity
269
- 4. Cognition: Devin CLI
270
- 5. Other: Cursor, Cline, OpenCode CLI, OpenCode Desktop, OpenCode Web, Aider
357
+ 3. Google: Gemini CLI, Antigravity CLI, Antigravity 2.0, Antigravity IDE
358
+ 4. Cognition: Devin CLI, Devin Desktop, Windsurf
359
+ 5. GitHub: GitHub Copilot CLI
360
+ 6. Factory: Factory Droid
361
+ 7. xAI: Grok Build
362
+ 8. Other: Cursor, pi, Cline, OpenCode CLI, OpenCode Desktop, OpenCode Web, Aider
271
363
 
272
364
  `agentlog import --source all` uses the default import order from
273
365
  `src/sources.js`: `codex-cli`, `codex-desktop`, `claude`,
274
- `claude-code-desktop`, `claude-workspace`, `gemini-cli`, `antigravity`,
275
- `devin-cli`, `cursor`, `cline`, `opencode-cli`, `opencode-desktop`,
366
+ `claude-code-desktop`, `claude-cowork`, `gemini-cli`, `antigravity-cli`, `antigravity`,
367
+ `antigravity-ide`, `devin-cli`, `devin-desktop`, `windsurf`, `copilot-cli`, `factory`, `grok-build`,
368
+ `cursor`, `pi`, `cline`, `opencode-cli`, `opencode-desktop`,
276
369
  `opencode-web`, `aider`. Codex SDK jobs and Claude SDK jobs are intentionally
277
- opt-in. Windsurf local cache scanning is disabled for now because current
278
- Cascade transcripts are encrypted binary stores, but downloaded trajectory
279
- Markdown exports are importable with an explicit path.
280
-
281
- The background watcher polls the watcher source list selected near the end of
282
- `agentlog init`. New configs still support `imports.autoDiscoverSources=true`,
283
- but init now records the chosen watcher list exactly by setting
284
- `imports.autoDiscoverSources=false`.
370
+ opt-in. Windsurf local imports are intentionally partial: readable Cascade plan
371
+ artifacts are archived when present, matching Cascade protobuf files are
372
+ preserved as raw sources, and downloaded trajectory Markdown remains the stable
373
+ full transcript path.
374
+
375
+ The background watcher covers the watcher source list selected near the end of
376
+ `agentlog init`. Sources with watchable history roots (`src/source-watch.js`)
377
+ import a few seconds after a filesystem event and are otherwise re-polled on a
378
+ 15-minute heartbeat; sources without watch roots poll every 30 seconds with a
379
+ 5-minute idle cadence. New configs still support
380
+ `imports.autoDiscoverSources=true`, but init now records the chosen watcher
381
+ list exactly by setting `imports.autoDiscoverSources=false`.
285
382
 
286
383
  Supervisor imports use `imports.defaultSinceDays` as a rolling window. Cursor
287
384
  SQLite store scans and raw recovery are disabled in supervisor ticks, so old
@@ -344,7 +441,7 @@ stable local command for the archived source.
344
441
  | Claude Code CLI | `claude -r <session-id>` | Uses the Claude Code JSONL session id. |
345
442
  | Devin CLI | `devin -r <session-id>` | agentlog archives these as `devin-<session-id>` and strips that prefix for the resume command, for example `devin -r selective-lotus`. |
346
443
  | Claude Code Desktop | No stable local resume command known. | Use Claude's own desktop/history surface or `agentlog show <session-id>`. |
347
- | Claude Workspace | No stable local resume command known. | Workspace/local-agent session ids are not known to be accepted by Claude Code's CLI resume flag. |
444
+ | Claude Cowork | No stable local resume command known. | Cowork/local-agent session ids are not known to be accepted by Claude Code's CLI resume flag. |
348
445
  | Claude SDK jobs | No interactive resume command. | These are programmatic/batch runs. |
349
446
  | ChatGPT export | No local resume command. | Official exports are imported snapshots. |
350
447
  | Claude.ai export | No local resume command. | Official exports are imported snapshots. |
@@ -360,29 +457,76 @@ stable local command for the archived source.
360
457
  - Import selector: `codex-cli`
361
458
  - Provider: `codex`
362
459
  - Source type: `codex-cli-history`
363
- - Primary store: `~/.codex/state_5.sqlite`
460
+ - Primary stores: `~/.codex/state_5.sqlite` and
461
+ `~/.codex/session_index.jsonl`
364
462
  - Session files: rollout paths referenced by the `threads` table, plus
365
463
  unindexed `rollout-*.jsonl` files under `sessions` and `archived_sessions`
366
464
  - Source split: `threads.source = "cli"`
367
465
  - Overrides:
368
466
  - `CODEX_STATE_DB` overrides the state database path.
467
+ - `CODEX_SESSION_INDEX` overrides the session index path.
369
468
  - `CODEX_HOME` is used for the fallback sessions root.
370
469
 
371
470
  The importer reads `id`, `rollout_path`, `created_at`, `updated_at`, `source`,
372
- `cwd`, and `title` from the Codex state database using `sqlite3`. When the
373
- database has the newer `stage1_outputs` table, agentlog also reads
471
+ `cwd`, `title`, and available subagent metadata columns from the Codex state
472
+ database using `sqlite3`. It also reads `thread_spawn_edges` when present. It then
473
+ prefers `~/.codex/session_index.jsonl` when a matching `thread_name` entry is
474
+ present, because Codex Desktop can now keep the sidebar title there while
475
+ leaving `threads.title` as the full first prompt. If the index has no title for
476
+ a session, the parser falls back to the rollout `thread_name_updated` event
477
+ when Codex emits one, then to non-prompt-shaped state titles and finally to
478
+ first-user-message inference. When a prompt starts with `$agentlog-recall` and
479
+ then continues with a separate task paragraph, fallback title inference skips
480
+ the recall lookup line and titles the session from the task body. If existing
481
+ Codex archives show long context titles, recall-query titles, or stale first
482
+ prompts instead of the Codex sidebar title, reimport them with
483
+ `agentlog import --source codex-desktop --since all` or
484
+ `agentlog import --source codex-cli --since all`. When the database has the
485
+ newer `stage1_outputs` table, agentlog also reads
374
486
  `rollout_summary` and `raw_memory` as supplementary Codex summary documents and
375
487
  adds them to the archived transcript. The importer also scans
376
488
  `~/.codex/sessions` and `~/.codex/archived_sessions` for `rollout-*.jsonl` and
377
489
  `rollout-*.jsonl.zst` files that are not referenced by the state database, so
378
490
  older archived rollouts still get backed up.
379
491
 
492
+ Codex subagents are stored as ordinary rollout threads whose `threads.source`
493
+ can be a JSON `subagent.thread_spawn` object, with parent/child relationships in
494
+ `thread_spawn_edges` and optional `agent_nickname`, `agent_role`, and
495
+ `agent_path` columns. Agentlog resolves those rows back to the parent's source
496
+ split (`codex-cli-history`, `codex-desktop-history`, or `codex-sdk-history`),
497
+ imports each child as `conversationKind = "codex_subagent"` with
498
+ `parentComposerId` set to the parent thread id, and attaches compact run
499
+ metadata to the parent as `metadata.sessionSummary.codexSubagentRuns`. The web
500
+ viewer renders those runs inline and opens the child transcript in the same
501
+ subagent modal used for Claude Code. Existing Codex archives need a full
502
+ reimport, for example `agentlog import --source codex-desktop --since all`, to
503
+ gain the child-session links.
504
+ Subagent child sessions remain archived as direct-addressable session records so
505
+ the modal, raw archive, and direct `agentlog show <child-id>` path can load the
506
+ full transcript, but normal history lists, stats, and search hide
507
+ `*_subagent` sessions unless an explicit subagent-inclusive path opts in.
508
+
380
509
  The rollout JSONL parser captures readable `response_item` reasoning summaries,
381
510
  Codex `event_msg` assistant/user messages, task and compaction markers, local
382
511
  shell calls, web search calls, custom tool calls such as `apply_patch`, tool
383
512
  outputs, and token-count usage deltas. Shell calls that run `apply_patch`
384
513
  through a heredoc are promoted to edit tool calls with `patch`, `diff`, and
385
- target path metadata. The working directory comes from the parsed transcript
514
+ target path metadata. Codex token totals are normalized
515
+ from `event_msg.token_count.info.total_token_usage`: `input_tokens` is split
516
+ into fresh input and `cached_input_tokens`, output tokens are preserved, and
517
+ `reasoning_output_tokens` is stored as a visible sub-count that is already
518
+ included in Codex output totals. When the Codex state database exposes
519
+ `threads.tokens_used`, agentlog stores it as the session-level provider total so
520
+ the stats page can reconcile rollout splits with Codex's own thread counter.
521
+ Because these fields are import-time metadata, changing Codex token semantics
522
+ requires a full reimport, for example:
523
+
524
+ ```bash
525
+ agentlog import --source codex-desktop --since all
526
+ agentlog import --source codex-cli --since all
527
+ ```
528
+
529
+ The working directory comes from the parsed transcript
386
530
  first, then the `threads.cwd` column. If neither is available, the session is
387
531
  archived under `codex/uncategorized` instead of inheriting the supervisor's
388
532
  current directory. Repo attribution is computed from the resolved directory.
@@ -393,15 +537,17 @@ Reading `.zst` sessions requires `zstd` or `unzstd`.
393
537
  - Import selector: `codex-desktop`
394
538
  - Provider: `codex`
395
539
  - Source type: `codex-desktop-history`
396
- - Primary store: `~/.codex/state_5.sqlite`
540
+ - Primary stores: `~/.codex/state_5.sqlite` and
541
+ `~/.codex/session_index.jsonl`
397
542
  - Session files: rollout paths referenced by the `threads` table
398
543
  - Source split: `threads.source = "vscode"`
399
544
  - Overrides: same as Codex CLI
400
545
 
401
- Codex Desktop uses the same state database, summary-document handling, and
402
- rollout parser as Codex CLI. The only distinction is the `threads.source` value.
403
- This is why the web source dropdown can split Codex CLI and Codex Desktop even
404
- though both archive under the same `codex` provider.
546
+ Codex Desktop uses the same state database, session-index title handling,
547
+ summary-document handling, and rollout parser as Codex CLI. The only distinction
548
+ is the `threads.source` value. This is why the web source dropdown can split
549
+ Codex CLI and Codex Desktop even though both archive under the same `codex`
550
+ provider.
405
551
 
406
552
  ## Codex SDK Jobs
407
553
 
@@ -427,22 +573,49 @@ in the separate SDK jobs aggregate instead of primary interactive totals.
427
573
  - Import command: `agentlog import chatgpt <path> [--scope local|team]`
428
574
  - Provider: `chatgpt`
429
575
  - Source type: `chatgpt-export`
430
- - Source file: ChatGPT JSON export or ZIP containing a JSON export
576
+ - Source file: ChatGPT JSON export, OpenAI export ZIP, extracted
577
+ `OpenAI-export`, or `User Online Activity` folder
431
578
  - Default archive scope: `chatgpt`
432
579
 
433
- ChatGPT is not scanned automatically from a desktop app. The import command
434
- without a path prints official export instructions for OpenAI's Privacy Portal
435
- and ChatGPT Data Controls. The user then provides the downloaded official export
436
- file. ZIP imports prefer `conversations.json`, then another JSON file with
437
- `chat` in the name, then the first JSON file in the ZIP.
580
+ ChatGPT is not scanned automatically from a desktop app. In a terminal, the
581
+ import command without a path starts a walkthrough that asks for the export path
582
+ or paths, account username/email, and display name.
583
+ Use `agentlog import chatgpt --instructions` for static Privacy Portal and
584
+ ChatGPT Data Controls instructions. Older ChatGPT exports usually contain a
585
+ single `conversations.json`. Newer OpenAI privacy exports can arrive as
586
+ `OpenAI-export/User Online Activity` with conversation data split across ZIPs or
587
+ folders such as
588
+ `Conversations__<account-hash>-chatgpt-0001-part-0001` and
589
+ `...part-0002`. Import the parent `User Online Activity` folder when possible.
590
+ The walkthrough also accepts each split `Conversations__...chatgpt...part`
591
+ folder one at a time, ending on a blank line, so agentlog sees all split JSON
592
+ files, manifests, `chat.html`, conversation ZIPs, and attached files together. Very large outer
593
+ `OpenAI-export.zip` files should be unzipped first because Node and unzip tooling
594
+ can hit multi-gigabyte file limits.
595
+
596
+ ChatGPT attachment files are preserved in the shared raw export archive and are
597
+ shown from normalized message metadata in the readable transcript. Fresh imports
598
+ render image/file attachment cards instead of folding `[Attachment: ...]`
599
+ placeholders into message text. Reimport ChatGPT exports after upgrading to
600
+ populate the attachment metadata and viewer URLs.
601
+ File cards are only linked when the exported raw archive actually contains the
602
+ file bytes; ChatGPT privacy exports may list some uploaded PDFs or documents in
603
+ conversation metadata without including the original file. ChatGPT tool calls
604
+ such as `web.run` are normalized into tool-call cards, uploaded-file parsing
605
+ messages are normalized as file tool results, and private-use citation markers
606
+ including file citations render as citation labels instead of unsupported glyph
607
+ boxes.
438
608
 
439
609
  For OpenAI export mappings, agentlog reads each node message, normalizes
440
- `author.role`, extracts `content.parts`, and uses `create_time` or `update_time`
441
- as the timestamp. Web imports are scope-based by default because they generally
442
- do not have a reliable local working directory. Since official exports do not
443
- usually include usage, the importer archives estimated per-message
444
- `metadata.usage` from native message content and marks the resulting session
445
- usage as estimated.
610
+ `author.role`, extracts `content.parts`, records attachment and asset-pointer
611
+ metadata, and uses `create_time` or `update_time` as the timestamp. Non-chat JSON
612
+ such as `user_settings.json` is available for account metadata but is not counted
613
+ as a conversation. Extensionless binary attachment files are preserved as raw
614
+ files rather than parsed as JSON. Web imports are scope-based by default because
615
+ they generally do not have a reliable local working directory. Since official
616
+ exports do not usually include usage, the importer archives estimated
617
+ per-message `metadata.usage` from native message content and marks the resulting
618
+ session usage as estimated.
446
619
 
447
620
  ## Claude Code CLI
448
621
 
@@ -476,6 +649,12 @@ Tool calls and results are normalized into the shared
476
649
  `metadata.toolCalls[]`, `metadata.toolResult`, and `metadata.usage` shapes.
477
650
  Bash or shell tool calls that invoke `apply_patch` are reclassified as edit
478
651
  calls and retain the patch text under `arguments.diff`.
652
+ Claude Code `Edit`/`Write` results also preserve provider `structuredPatch`
653
+ hunks with absolute line starts so the web viewer can render numbered diffs.
654
+ Existing Claude archives need a source reimport to gain this field; run
655
+ `agentlog import --source claude --since all` and repeat for
656
+ `claude-code-desktop`, `claude-cowork`, or `claude-sdk` if those sources are
657
+ enabled.
479
658
  Tool results are matched back to prior `tool_use` ids when possible so result
480
659
  cards inherit the tool name instead of displaying only the raw tool-use id.
481
660
  Remote Control lifecycle records are also converted into provider-generated
@@ -489,6 +668,29 @@ also include Remote Control attachment counts/details, available tool names,
489
668
  MCP server names, queue timing/content, agent ids, slugs, API error counts, and
490
669
  MCP structured-content counts.
491
670
 
671
+ For each Claude Code session with a working directory, agentlog also snapshots
672
+ Claude subagent definitions from the user-level `~/.claude/agents` directory and
673
+ the nearest project `.claude/agents` directory. It parses the Markdown
674
+ frontmatter fields that Claude uses for subagents (`name`, `description`,
675
+ `tools`, and `model`), records the effective project-over-user definition set in
676
+ `metadata.sessionSummary.claudeSubagents`, and preserves the source `.md` files
677
+ in the session raw manifest. The transcript is not padded with full subagent
678
+ instructions; use the raw archive when the complete definition body is needed.
679
+
680
+ Claude Code subagent run transcripts stored under
681
+ `~/.claude/projects/<project>/<parent-session-id>/subagents/*.jsonl` are also
682
+ attached to the parent session as `metadata.sessionSummary.claudeSubagentRuns`
683
+ and imported as child sessions with `conversationKind = "claude_subagent"` and
684
+ `parentComposerId` set to the parent Claude Code session id. The parent summary
685
+ keeps compact run metadata, prompts, result previews, model names, usage totals,
686
+ and tool counts; the child session carries the full normalized transcript and
687
+ preserves both the JSONL and any sibling `.meta.json` file in raw storage. The
688
+ web viewer renders run summaries inline at their transcript timestamps and links
689
+ to the child session instead of dumping every subagent run at the top. Child run
690
+ sessions are direct-addressable archive records, but normal history lists,
691
+ stats, and search hide `*_subagent` sessions unless subagents are explicitly
692
+ included.
693
+
492
694
  When the Claude desktop app has a matching
493
695
  `~/Library/Application Support/Claude/claude-code-sessions/**/local_*.json`
494
696
  record with `cliSessionId`, the CLI importer uses that sidecar's generated
@@ -552,33 +754,41 @@ when present.
552
754
 
553
755
  Discovery scans the Claude app storage once, but the user-facing source rows are
554
756
  split by kind. `claude-code-desktop` is the Claude Code desktop-launch metadata
555
- path; `claude-workspace` is Claude app local-agent/workspace mode. The older
556
- generic `claude-desktop` aggregate is kept only as a compatibility import
557
- selector and is not shown as a separate discovery row.
757
+ path; `claude-cowork` is Claude app Cowork/local-agent mode. The older
758
+ `claude-workspace` selector is accepted as an alias, and the generic
759
+ `claude-desktop` aggregate is kept only as a compatibility import selector and
760
+ is not shown as a separate discovery row.
558
761
 
559
762
  Working directory attribution comes from `originCwd`, then `cwd`, then the first
560
763
  existing folder in `userSelectedFolders`. If no existing directory is available,
561
764
  the session is archived under `claude-code-desktop/uncategorized` instead of
562
765
  being assigned to whatever repo agentlog happens to run from.
563
766
 
564
- ## Claude Workspace
767
+ ## Claude Cowork
565
768
 
566
- - Import selector: `claude-workspace`
769
+ - Import selector: `claude-cowork` (`claude-workspace` is accepted as a legacy alias)
567
770
  - Provider: `claude_desktop`
568
771
  - Source type: `claude-workspace-desktop`
569
772
  - Primary store:
570
773
  `~/Library/Application Support/Claude/local-agent-mode-sessions/local_*.json`
571
774
  - Audit transcript path:
572
775
  `~/Library/Application Support/Claude/local-agent-mode-sessions/local_<id>/audit.jsonl`
573
- - Fallback scope: `claude-desktop/uncategorized`
776
+ - Fallback scope: `claude-cowork/uncategorized`
574
777
 
575
- Claude Workspace uses the same parser as Claude Code Desktop but reads from the
778
+ Claude Cowork uses the same parser as Claude Code Desktop but reads from the
576
779
  Claude app local-agent mode directory. `audit.jsonl` is preferred when present.
577
- Metadata fallback imports the initial prompt and selected folder context.
780
+ Metadata fallback imports the initial prompt and selected folder context. The
781
+ session-level `model` in the local-agent metadata is recorded as authoritative
782
+ `modelUsage`; audit `tool_use` rows are normalized back to that session model so
783
+ internal tool orchestration does not appear as an extra user-visible model.
578
784
 
579
785
  As with Claude Code Desktop, repo attribution only happens when an existing
580
- working directory can be found. Otherwise the archive is intentionally
581
- uncategorized.
786
+ project directory can be found. Claude's synthetic `/sessions/...` and
787
+ `local-agent-mode-sessions/.../outputs` directories are ignored; selected user
788
+ folders are preferred before falling back to the Cowork uncategorized scope.
789
+ Existing archives with the old `claude-desktop/uncategorized` fallback or
790
+ synthetic cwd attribution should be rebuilt with a full reimport, for example
791
+ `agentlog import --source claude-cowork --since all`.
582
792
 
583
793
  ## Claude.ai Export
584
794
 
@@ -655,41 +865,135 @@ Gemini archives need `agentlog import --source gemini-cli --since all` after
655
865
  this parser bump to populate those titles. When Gemini shutdown interaction
656
866
  summaries appear in structured history, they are parsed into session metadata
657
867
  for tool-call counts, timing, model usage, and resume commands instead of being
658
- inserted into the chat transcript. Markdown
659
- files are split into messages by role headings such as
868
+ inserted into the chat transcript.
869
+
870
+ Gemini CLI subagents stored under
871
+ `~/.gemini/tmp/<project>/chats/<parent-session-id>/<agent-id>.jsonl` are
872
+ imported as child sessions with `conversationKind = "gemini_subagent"` and
873
+ `parentComposerId` set to the parent Gemini session id. Parent `invoke_agent`
874
+ tool calls preserve agent id/name, prompt, status, and progress/result summaries
875
+ when those fields are present, and the parent session gets
876
+ `metadata.sessionSummary.geminiSubagentRuns`. Re-run
877
+ `agentlog import --source gemini-cli --since all` to populate normalized Gemini
878
+ subagent links in existing archives.
879
+
880
+ Markdown files are split into messages by role headings such as
660
881
  `# User`, `# Assistant`, or bold role labels. The working directory comes from
661
882
  parsed cwd fields or Gemini tmp `.project_root` metadata. If no working
662
883
  directory can be resolved, the session is archived under
663
884
  `gemini-cli/uncategorized`.
664
885
 
886
+ ## Antigravity CLI
887
+
888
+ - Import selector: `antigravity-cli`
889
+ - Provider: `antigravity_cli` (separate from the desktop app's `antigravity`)
890
+ - Source types: `antigravity-cli-transcript-log`, `antigravity-cli-brain`
891
+ - Primary transcript store:
892
+ `~/.gemini/antigravity-cli/brain/*/.system_generated/logs/transcript_full.jsonl`
893
+ (untruncated), falling back to `transcript.jsonl` (truncates long tool
894
+ outputs and marks them with `truncated_fields`)
895
+ - Workspace attribution: `~/.gemini/antigravity-cli/history.jsonl` (one
896
+ `{display, timestamp, workspace}` line per user prompt)
897
+ - Binary conversation store preserved but not decoded:
898
+ `~/.gemini/antigravity-cli/conversations/*.pb`
899
+ - Environment overrides: `AGENTLOG_ANTIGRAVITY_CLI_HOME_DIR`
900
+
901
+ The Antigravity CLI (Google's Gemini CLI successor, May 2026) reuses the
902
+ desktop brain/conversations layout under its own home, so the importer reuses
903
+ the whole desktop pipeline with the home redirected and CLI-specific source
904
+ types. Transcript logs do not record the workspace, and deriving cwd from file
905
+ links in responses can land on a subdirectory, so the prompt history file is
906
+ the authoritative cwd source — it is matched by first-user-message text and
907
+ nearest timestamp, and intentionally excluded from sourceFiles because it
908
+ grows with every prompt and would churn session fingerprints. The desktop
909
+ app's Electron state DB is never read for CLI sessions, and CLI/desktop
910
+ conversation ids are distinct UUIDs, so the two sources do not collide.
911
+
912
+ The transcript logs carry no model field or token usage. The only model
913
+ signal is `<USER_SETTINGS_CHANGE>` text injected into USER_INPUT events
914
+ ("The user changed setting \`Model Selection\` from X to Y."); the parser
915
+ tracks these changes, stamps the active model onto subsequent
916
+ PLANNER_RESPONSE messages, and records all observed models in
917
+ `sessionSummary.modelUsage`. This also applies to desktop Antigravity
918
+ transcript logs.
919
+
920
+ ## Antigravity IDE
921
+
922
+ - Import selector: `antigravity-ide`
923
+ - Provider: `antigravity_ide`
924
+ - Source types: `antigravity-ide-transcript-log`, `antigravity-ide-brain`
925
+ - Primary store: `~/.gemini/antigravity-ide/` (same brain/conversations layout
926
+ as the 2.0 app)
927
+ - Environment overrides: `AGENTLOG_ANTIGRAVITY_IDE_HOME_DIR`
928
+
929
+ The Antigravity IDE split off from the 2.0 agent platform but kept the same
930
+ data layout, and its home starts as a migration copy of the 2.0 home. The
931
+ importer therefore drops conversations whose 2.0 copy is byte-identical or at
932
+ least as fresh; the IDE only owns conversations it created or continued after
933
+ the split. IDE session ids are prefixed `antigravity-ide-` because migrated
934
+ conversation ids collide with 2.0's. The IDE's own knowledge/implicit memory
935
+ stores are covered by memory backup under provider `antigravity_ide`.
936
+
665
937
  ## Antigravity
666
938
 
667
939
  - Import selector: `antigravity`
668
940
  - Provider: `antigravity`
669
- - Source types: `antigravity-brain`, `antigravity-trajectory-summary`
670
- - Primary readable store: `~/.gemini/antigravity/brain/*`
941
+ - Source types: `antigravity-transcript-log`, `antigravity-brain`,
942
+ `antigravity-summary-proto`, `antigravity-trajectory-summary`
943
+ - Primary transcript store:
944
+ `~/.gemini/antigravity/brain/*/.system_generated/logs/transcript.jsonl`
945
+ - Legacy readable store: `~/.gemini/antigravity/brain/*`
946
+ - Summary protobuf store: `~/.gemini/antigravity/agyhub_summaries_proto.pb`
671
947
  - Partial metadata store:
672
948
  `Application Support/Antigravity/User/globalStorage/state.vscdb`
673
- - Binary store counted but not decoded:
949
+ - Binary conversation store preserved but not decoded:
674
950
  `~/.gemini/antigravity/conversations/*.pb`
675
951
 
676
- agentlog imports readable Markdown artifacts from each task directory. Recognized
677
- artifact names are `task.md`, `implementation_plan.md`, `walkthrough.md`, and
678
- `plan.md`. Each artifact becomes an assistant message with a heading naming the
679
- artifact. Timestamps come from artifact file mtimes.
680
-
681
- When a conversation has no readable Markdown artifacts, agentlog also imports
682
- Antigravity trajectory summaries from the app's VS Code-style global state DB.
683
- Those summaries preserve the conversation id, visible prompt/title, timestamps,
684
- and workspace path when present. They are marked as partial summaries and do not
685
- claim to be full transcripts. The global state DB is referenced in the raw
686
- manifest but not copied, since it can contain auth-bearing settings.
687
-
688
- The importer tries to infer a working directory from `file://...` links inside
689
- the Markdown artifacts or trajectory summary. If none can be inferred, it
690
- archives under `antigravity/uncategorized`. Binary protobuf transcripts are
691
- counted in discovery details but not imported as conversation messages yet.
692
- Antigravity token usage is therefore not populated from the current local stores.
952
+ agentlog ranks Antigravity sources by fidelity and keeps one archive per
953
+ conversation id: transcript logs first, then readable brain Markdown artifacts,
954
+ then `agyhub_summaries_proto.pb`, then the legacy state DB summary row, then a
955
+ binary-only discovery placeholder if no readable source exists. This keeps the
956
+ v1 archive contract simple while still preserving the lower-fidelity evidence.
957
+
958
+ Transcript logs import user messages, planner responses, system/error/history
959
+ messages, and tool-like step outputs from `.system_generated/logs/transcript.jsonl`.
960
+ Older `.system_generated/logs/overview.txt` files are also discovered; JSONL
961
+ events are parsed when present, and plain overview text is imported as a partial
962
+ assistant summary. Matching `conversations/<id>.pb` files are copied into the
963
+ raw archive manifest for the session, but are not parsed into messages because
964
+ they are not a stable plain protobuf transcript format.
965
+
966
+ When transcript logs contain `INVOKE_SUBAGENT` steps, agentlog links the spawned
967
+ Antigravity conversation id back to the parent. The child transcript is imported
968
+ as a normal direct-addressable session with `conversationKind =
969
+ "antigravity_subagent"` and `parentComposerId` set to the parent conversation
970
+ id, while the parent gets `metadata.sessionSummary.antigravitySubagentRuns`.
971
+ The web viewer displays those runs with the same subagent card/modal path used
972
+ for Claude, Codex, and Cursor child sessions. Default history/search paths hide
973
+ `*_subagent` child sessions unless subagents are explicitly included.
974
+
975
+ Readable Markdown artifacts remain supported as the legacy Antigravity CLI path.
976
+ Recognized artifact names are `task.md`, `implementation_plan.md`,
977
+ `walkthrough.md`, and `plan.md`. Each artifact becomes an assistant message with
978
+ a heading naming the artifact. If a summary protobuf or state DB summary exists
979
+ for the same conversation id, agentlog uses it to fill title, timestamps, and
980
+ workspace metadata while keeping the Markdown artifact as the imported content.
981
+
982
+ `agyhub_summaries_proto.pb` and the VS Code-style global state DB row both
983
+ produce partial summary sessions. They preserve conversation id, visible
984
+ prompt/title, timestamps, step count, and workspace path when present, and are
985
+ marked as partial summaries so they do not claim to be full transcripts. The
986
+ global state DB is referenced in the raw manifest but not copied, since it can
987
+ contain auth-bearing settings.
988
+
989
+ The importer tries to infer a working directory from summary metadata,
990
+ `CWD:` lines, and `file://...` links in Antigravity content. If none can be
991
+ inferred, it archives under `antigravity/uncategorized`. Antigravity token usage
992
+ is not populated from the current local stores.
993
+
994
+ Existing Antigravity archives created before transcript-log and summary-protobuf
995
+ support should be rebuilt with `agentlog import --source antigravity --since all`
996
+ so the archive uses the new source ranking and raw protobuf preservation.
693
997
 
694
998
  ## Devin CLI
695
999
 
@@ -731,6 +1035,53 @@ attribution follows the project directory Devin was launched from.
731
1035
  If `message_nodes` contains no importable messages, agentlog falls back to
732
1036
  `prompt_history` so at least direct user prompts can be archived.
733
1037
 
1038
+ Devin spawns subagents through the `run_subagent` tool (built-in profiles
1039
+ `subagent_explore` / `subagent_general`, plus custom `AGENT.md` profiles).
1040
+ agentlog records each spawn in the parent's
1041
+ `sessionSummary.devinSubagentRuns` with the profile, prompt preview, and
1042
+ foreground/background mode. When the tool arguments expose a child session id
1043
+ that matches another imported Devin session, the child is marked
1044
+ `conversationKind = "devin_subagent"` with `parentComposerId` pointing at the
1045
+ parent, so it collapses out of top-level lists like other `*_subagent`
1046
+ sessions. Spawns without a resolvable child still produce run entries with
1047
+ status `spawned`. Re-run `agentlog import --source devin-cli --since all` to
1048
+ populate links in existing archives.
1049
+
1050
+ ## Devin Desktop
1051
+
1052
+ - Import selector: `devin-desktop`
1053
+ - Provider: `devin`
1054
+ - Source type: `devin-desktop-acp-events`
1055
+ - Event logs on macOS:
1056
+ `~/Library/Application Support/Devin/User/acp-events/*.ndjson`
1057
+ - Metadata/index DB on macOS:
1058
+ `~/Library/Application Support/Devin/User/globalStorage/state.vscdb`
1059
+ - Linux/Windows app roots follow the same VS Code/Electron layout under
1060
+ `~/.config/Devin` or `%APPDATA%\Devin`.
1061
+ - Overrides:
1062
+ - `AGENTLOG_DEVIN_DESKTOP_APP_SUPPORT_DIR` points at an alternate Devin app
1063
+ support root.
1064
+ - `AGENTLOG_DEVIN_DESKTOP_ACP_EVENTS_DIR` points directly at an ACP event log
1065
+ directory.
1066
+ - `AGENTLOG_DEVIN_DESKTOP_GLOBAL_STORAGE_DB` points at one explicit
1067
+ `state.vscdb`.
1068
+
1069
+ Devin Desktop writes ACP session event logs as UUID-named NDJSON files and keeps
1070
+ the session id to UUID mapping in `ItemTable['windsurf.acp.eventLog.index']`.
1071
+ The key prefix is inherited from Windsurf, but agentlog treats this as a Devin
1072
+ Desktop source and imports only `acp/devin-cli/*` and `acp/devin-cloud/*`
1073
+ sessions under provider `devin`. Cascade/Windsurf history remains the separate
1074
+ `windsurf` source.
1075
+
1076
+ The importer joins streamed `user_message_chunk`, `agent_message_chunk`, and
1077
+ `agent_thought_chunk` rows. Devin thinking is preserved as supplementary
1078
+ assistant messages with `summaryKind=thinking`. ACP `tool_call` and
1079
+ `tool_call_update` rows are normalized into `metadata.toolCalls[]` and
1080
+ `metadata.toolResult`, and `usage_update` rows are aggregated into
1081
+ `sessionSummary.usage` for stats. The global state DB is used as an index but is
1082
+ recorded as a raw reference rather than copied into every session raw folder
1083
+ because it can contain auth/session material.
1084
+
734
1085
  ## Cursor
735
1086
 
736
1087
  - Import selector: `cursor`
@@ -883,12 +1234,31 @@ for `tool_calls`, `toolCalls`, `toolResults`, command outputs, edit records,
883
1234
  diff records, model/status/request metadata, and token usage. When no
884
1235
  per-message timestamp exists, it uses the source file's mtime with stable
885
1236
  millisecond offsets so imports do not get stamped with the time of import.
1237
+ Transcript folders under
1238
+ `agent-transcripts/<parent-composer-id>/subagents/<subagent-id>` are imported as
1239
+ child sessions with `conversationKind = "cursor_subagent"` and
1240
+ `parentComposerId` set to the parent composer id. When the parent transcript is
1241
+ available in the same import, agentlog also attaches compact run metadata to the
1242
+ parent as `metadata.sessionSummary.cursorSubagentRuns`, and the web viewer shows
1243
+ those runs inline with a link to the child transcript. Existing Cursor transcript
1244
+ archives need a full reimport, for example
1245
+ `agentlog import --source cursor --since all`, to gain the normalized subagent
1246
+ metadata. As with Codex and Claude Code child runs, those child session records
1247
+ are direct-addressable but hidden from normal history lists, stats, and search
1248
+ unless subagents are explicitly included.
886
1249
 
887
1250
  Cursor project slugs are decoded back to local paths when possible. For example,
888
- `Users-bzhou-Documents-GitHub-spring-next` resolves to
889
- `/Users/bzhou/Documents/GitHub/spring-next` if that directory exists. If no
890
- working directory can be resolved for a newer transcript, it archives under
891
- `cursor/uncategorized` instead of assigning the session to the current repo.
1251
+ `Users-alex-Documents-GitHub-spring-next` resolves to
1252
+ `/Users/alex/Documents/GitHub/spring-next` if that directory exists. Newer
1253
+ agent-transcript imports prefer explicit paths in the transcript itself (tool
1254
+ arguments, command text, and nested metadata) before falling back to that slug,
1255
+ so a transcript stored under a stale Cursor project directory can still be
1256
+ attributed to the repository it actually read or edited. Existing Cursor
1257
+ transcript archives with stale project-slug attribution need a full reimport,
1258
+ for example `agentlog import --source cursor --since all`, to regenerate their
1259
+ metadata and archive location. If no working directory can be resolved for a
1260
+ newer transcript, it archives under `cursor/uncategorized` instead of assigning
1261
+ the session to the current repo.
892
1262
 
893
1263
  ## Cline
894
1264
 
@@ -925,6 +1295,19 @@ supplementary assistant event. Those tool calls carry unified diff text or
925
1295
  old/new string payloads so the web viewer can render the edits inline while the
926
1296
  original checkpoint files remain in raw backups.
927
1297
 
1298
+ Cline spawn linking distinguishes two tool shapes. Explicit subagent tools
1299
+ (`subagent`, `run_subagent`, `spawn_subagent`) mark the spawned task as
1300
+ `conversationKind = "cline_subagent"` with `parentComposerId` set to the
1301
+ parent task id. The `new_task` tool is a context handoff: the new task replaces
1302
+ the old one rather than running under it, so the child stays a top-level
1303
+ session but is still linked to the parent (`parentComposerId` plus
1304
+ `sessionSummary.clineSpawnedBy`) and listed in the parent's
1305
+ `sessionSummary.clineSubagentRuns`. Children are resolved by explicit task id
1306
+ when the arguments carry one, else by matching the handoff context against the
1307
+ child task's first user message inside a 30-minute window, else by an
1308
+ unambiguous single task started within a few minutes of the call. Ambiguous
1309
+ spawns produce run entries with status `spawned` and no link.
1310
+
928
1311
  ## OpenCode
929
1312
 
930
1313
  - Import selectors: `opencode-cli`, `opencode-desktop`, `opencode-web`, or `opencode` for all three
@@ -969,14 +1352,27 @@ created by Desktop, CLI, and Web clients, so agentlog classifies each SQLite
969
1352
  session row individually. Desktop sessions are identified by session ids in
970
1353
  OpenCode Desktop sidecar state such as `ai.opencode.desktop/*.dat`; sub-sessions
971
1354
  inherit Desktop classification from a Desktop parent. CLI sessions are
972
- identified by session-level `agent` or `model` metadata. Remaining non-`local`
973
- shared core rows are labeled as Web sessions. Rows without reliable client
974
- evidence stay on the legacy `opencode-sqlite-history` source type. The
1355
+ identified by session-level `agent` or `model` metadata, or by CLI evidence in
1356
+ the sanitized message metadata when session rows omit those fields. Remaining
1357
+ non-`local` shared core rows are labeled as Web sessions. Rows without reliable
1358
+ client evidence stay on the legacy `opencode-sqlite-history` source type. The
975
1359
  `session`, `message`, `part`, and `project` tables provide session
976
1360
  metadata, working directory, user/assistant messages, reasoning text, tool
977
- calls, tool outputs, model/provider ids, cost, and token usage. Because the
978
- database is a multi-session source, raw preservation stores it as a shared raw
979
- source instead of duplicating the same file into every session archive.
1361
+ calls, tool outputs, model/provider ids, cost, and token usage. During SQLite
1362
+ reads, agentlog removes the bulky `message.data.summary` object before JSON
1363
+ transport; canonical transcript text still comes from the `part` table, and raw
1364
+ preservation keeps the original database byte-for-byte. Because the database is a
1365
+ multi-session source, raw preservation stores it as a shared raw source instead
1366
+ of duplicating the same file into every session archive.
1367
+
1368
+ OpenCode subagent sessions are linked from the structured `session.parent_id`
1369
+ field. Child sessions are imported as `conversationKind = "opencode_subagent"`
1370
+ with `parentComposerId` set to the raw OpenCode parent session id, and the parent
1371
+ gets `metadata.sessionSummary.opencodeSubagentRuns`. Task-tool metadata such as
1372
+ `subagent_type`, `prompt`, `description`, and child `sessionId` is preserved when
1373
+ present so the web viewer can show the same subagent run cards used for other
1374
+ providers. Existing OpenCode archives should be rebuilt with `agentlog import
1375
+ --source opencode --since all` to populate normalized subagent links.
980
1376
 
981
1377
  agentlog also reads OpenCode's JSON session store directly. Sessions provide the
982
1378
  archive id and project id; message and part files provide role text, reasoning
@@ -998,6 +1394,106 @@ When `session_diff/<session-id>.json` is present, agentlog adds a supplementary
998
1394
  edit tool call with the diff payload. Unified diff text is rendered inline in
999
1395
  the history web UI, and the original diff JSON remains in the raw archive.
1000
1396
 
1397
+ ## GitHub Copilot CLI
1398
+
1399
+ - Import selector: `copilot-cli`
1400
+ - Provider: `copilot`
1401
+ - Source type: `copilot-cli-history`
1402
+ - Primary store: `~/.copilot/session-state/<session-uuid>/events.jsonl`
1403
+ - Metadata sidecar: `workspace.yaml` (flat scalars: id, name, cwd, git_root,
1404
+ repository, branch, created_at, updated_at); `plan.md` preserved as a source
1405
+ file when present
1406
+ - Environment overrides: `AGENTLOG_COPILOT_SESSION_STATE_DIR`,
1407
+ `AGENTLOG_COPILOT_HOME`, `COPILOT_HOME`
1408
+
1409
+ Each events.jsonl line is `{type, data, id, timestamp, parentId}`. agentlog maps
1410
+ `user.message`/`assistant.message` to messages, `tool.execution_start` to
1411
+ assistant tool calls (arguments may arrive as a JSON string and are re-parsed),
1412
+ `tool.execution_complete` to tool results, and `session.model_change`,
1413
+ `session.compaction_complete`, and `subagent.started/completed` to system
1414
+ messages. Session-level usage comes from the `session.shutdown` event, which is
1415
+ only written for cleanly ended sessions on CLI 0.0.422+: per-model
1416
+ `modelMetrics` token splits are aggregated into `sessionSummary.usage` and
1417
+ `modelUsage`, with premium requests, AI credits (`totalNanoAiu` / 1e9), code
1418
+ change counters, and `session.task_complete` summaries recorded under
1419
+ `sessionSummary.copilotCli`. The `session-store.db` SQLite database is a
1420
+ rebuildable index of the same data and is not read. Older
1421
+ `~/.copilot/history-session-state/` legacy sessions are not imported.
1422
+
1423
+ ## Factory Droid
1424
+
1425
+ - Import selector: `factory`
1426
+ - Provider: `factory`
1427
+ - Source type: `factory-droid-history`
1428
+ - Primary store: `~/.factory/sessions/<cwd-slug>/<session-uuid>.jsonl`, plus the
1429
+ legacy flat layout `~/.factory/sessions/<session-uuid>.jsonl`
1430
+ - Metadata sidecar: `<session-uuid>.settings.json`
1431
+ - Environment overrides: `AGENTLOG_FACTORY_SESSIONS_DIR`, `FACTORY_HOME_OVERRIDE`
1432
+
1433
+ Transcripts require a `session_start` header line carrying id, title, owner, and
1434
+ cwd. `message` lines wrap Anthropic-style content blocks: text, thinking
1435
+ (stored as `metadata.thinking`), tool_use (assistant tool calls), and
1436
+ tool_result (tool messages). Block keys drift between snake_case and camelCase
1437
+ across droid versions (`tool_use_id` vs `toolUseId`), and both are accepted.
1438
+ File-op tool results that carry `diffLines` with per-line old/new line numbers
1439
+ are converted to `structuredPatch` hunks. `todo_state` lines and unknown types
1440
+ are skipped. The same session id can exist in both legacy and per-project
1441
+ layouts after migration; the newer file by mtime wins. There is no per-message
1442
+ token usage on disk — the sidecar's aggregate `tokenUsage` becomes
1443
+ `sessionSummary.usage`, with model (BYOK `custom:`/`[Provider]` wrappers
1444
+ stripped for `modelUsage`), reasoning effort, autonomy mode, Factory credits,
1445
+ archive state, and subagent attribution (`decompSessionType`,
1446
+ `callingSessionId`, `childInclusiveTokenUsageBySessionId`) recorded under
1447
+ `sessionSummary.factoryDroid`.
1448
+
1449
+ ## Grok Build
1450
+
1451
+ - Import selector: `grok-build`
1452
+ - Provider: `grok`
1453
+ - Source type: `grok-build-history`
1454
+ - Primary store: `~/.grok/sessions/<percent-encoded-cwd>/<session-id>/updates.jsonl`
1455
+ - Metadata sidecars: `summary.json`, `signals.json`; `events.jsonl` and
1456
+ `plan.md` preserved as source files when present
1457
+ - Environment overrides: `AGENTLOG_GROK_SESSIONS_DIR`, `AGENTLOG_GROK_HOME`,
1458
+ `GROK_HOME`
1459
+
1460
+ updates.jsonl lines are ACP (Agent Client Protocol) JSON-RPC `session/update`
1461
+ notifications. Streaming chunks (`user_message_chunk`, `agent_message_chunk`,
1462
+ `agent_thought_chunk`) are concatenated into whole messages; `tool_call`
1463
+ becomes an assistant tool call and `tool_call_update` with completed/failed
1464
+ status becomes a tool result; `plan` and `current_mode_update` become system
1465
+ messages. The cwd comes from percent-decoding the parent directory name. Token
1466
+ telemetry is a cumulative `_meta.totalTokens` counter that can rewind during
1467
+ streaming, so the maximum observed value is recorded as an authoritative
1468
+ session total. `signals.json` rollups (turn count, context tokens, models used,
1469
+ session duration) land under `sessionSummary.grokBuild`. The old unofficial
1470
+ superagent grok-cli also uses `~/.grok` but stores transcripts in
1471
+ `~/.grok/grok.db` SQLite; it is a different product and is not imported.
1472
+
1473
+ ## pi
1474
+
1475
+ - Import selector: `pi`
1476
+ - Provider: `pi`
1477
+ - Source type: `pi-cli-history`
1478
+ - Primary store: `~/.pi/agent/sessions/--<encoded-cwd>--/<timestamp>_<session-id>.jsonl`
1479
+ - Environment overrides: `AGENTLOG_PI_SESSION_DIR`,
1480
+ `PI_CODING_AGENT_SESSION_DIR`, `PI_CODING_AGENT_DIR`
1481
+
1482
+ pi session files start with a `session` header line (id, timestamp, cwd; v1
1483
+ headers also carry provider/modelId, v3 adds `parentSession` fork lineage).
1484
+ Entries form a tree via `id`/`parentId`; agentlog imports all entries in file
1485
+ order, including abandoned branches, and preserves the ids as
1486
+ `providerMessageId`/`parentMessageId`. Assistant messages carry per-message
1487
+ usage with token splits and dollar cost (`usage.cost.total` →
1488
+ `metadata.usage.costUsd`); `toolResult` messages become tool messages;
1489
+ `bashExecution` (`!` commands) become a user message plus a shell tool result
1490
+ with exit code. `model_change`, `thinking_level_change`, `compaction`,
1491
+ `branch_summary`, and `custom_message` entries become system messages;
1492
+ `custom` extension state and `label` entries are skipped. Note that custom
1493
+ `--session-dir` configurations store files flat rather than in per-cwd
1494
+ subdirectories; both layouts are scanned. The oh-my-pi fork uses `~/.omp` with
1495
+ a diverged schema and is not covered by this importer.
1496
+
1001
1497
  ## Aider
1002
1498
 
1003
1499
  - Import selector: `aider`
@@ -1034,18 +1530,39 @@ backups.
1034
1530
 
1035
1531
  - Import selector: `windsurf`
1036
1532
  - Provider: `windsurf`
1037
- - Source types: `windsurf-trajectory-export`, `windsurf-cascade-brain`
1533
+ - Source types: `windsurf-trajectory-export`, `windsurf-cascade-brain`,
1534
+ `windsurf-cascade-protobuf`
1038
1535
  - Explicit export import: `agentlog import windsurf <downloaded-trajectory.md>`
1039
1536
  or `agentlog import windsurf <folder-of-trajectories>`
1537
+ - Repair-stub import: `agentlog import windsurf --claim <token>
1538
+ <downloaded-trajectory.md>`
1040
1539
  - Primary readable store: `~/.codeium/windsurf/brain/*`
1041
- - Binary store counted but not decoded: `~/.codeium/windsurf/cascade/*.pb`
1042
- - Status: encrypted local cache scanning is disabled from setup, default imports,
1043
- and history filters; downloaded trajectory Markdown exports are importable
1044
-
1045
- Windsurf local cache scanning is currently disabled. Current Cascade sessions
1046
- are written as encrypted binary stores, so agentlog can detect session IDs and
1047
- workspace metadata but cannot archive readable conversation text from those
1048
- local files.
1540
+ - Binary store preserved/counted but not decoded:
1541
+ `~/.codeium/windsurf/cascade/*.pb`
1542
+ - Metadata cache: Windsurf global state `ItemTable['windsurf.acp.metadataCache']`
1543
+ in `~/Library/Application Support/Windsurf/User/globalStorage/state.vscdb`
1544
+ and backups
1545
+ - Environment overrides: `AGENTLOG_WINDSURF_HOME_DIR`,
1546
+ `AGENTLOG_CODEIUM_HOME_DIR`, `AGENTLOG_WINDSURF_BRAIN_DIR`,
1547
+ `AGENTLOG_WINDSURF_CASCADE_DIR`, `AGENTLOG_WINDSURF_GLOBAL_STORAGE_DB`,
1548
+ `AGENTLOG_WINDSURF_APP_SUPPORT_DIR`
1549
+
1550
+ Windsurf local imports are partial. Current Cascade sessions keep the full
1551
+ conversation body in high-entropy binary protobuf files, so agentlog preserves
1552
+ matching `cascade/<conversation-id>.pb` files as raw sources but does not decode
1553
+ them into transcript messages. When `brain/<conversation-id>/plan.md`,
1554
+ `task.md`, `implementation_plan.md`, or `walkthrough.md` exists, each readable
1555
+ artifact is archived as an assistant message and the matching metadata/pb files
1556
+ are preserved in the session raw folder.
1557
+
1558
+ Agentlog also reads Windsurf's ACP metadata cache from global storage to attach
1559
+ titles, working directories, created timestamps, and updated timestamps to local
1560
+ Cascade session IDs. Protobuf-only conversations are archived as zero-message
1561
+ repair stubs when there is no readable brain artifact. The stubs preserve the
1562
+ raw `.pb`, show up in the web viewer with a deterministic `ws-...` repair token,
1563
+ and produce no canonical events, so normal history search and MCP recall do not
1564
+ index them as transcript content. The state DB is referenced in raw metadata but
1565
+ not copied because it can contain auth tokens.
1049
1566
 
1050
1567
  Windsurf's "Download trajectory" action produces Markdown headed
1051
1568
  `# Cascade Chat Conversation`. Agentlog imports those files explicitly with
@@ -1054,21 +1571,31 @@ provider transcript surface: `### User Input` sections become user messages,
1054
1571
  `### Planner Response` / assistant-like sections become assistant messages, and
1055
1572
  the original Markdown file is preserved in raw backups.
1056
1573
 
1574
+ When a web-viewer repair stub corresponds to the export, copy its token and run
1575
+ `agentlog import windsurf --claim <token> <downloaded-trajectory.md>`. Claimed
1576
+ imports replace the zero-message stub with the full Markdown transcript under
1577
+ the same `windsurf-<conversation-id>` archive id.
1578
+
1057
1579
  Bulk export tools that automate Windsurf's hidden "Download trajectory" button
1058
1580
  can write many `*.md` files into one directory. Agentlog intentionally treats
1059
- that directory as user-selected export input rather than scanning Windsurf's
1060
- encrypted cache: run `agentlog import windsurf ~/windsurf-cascade-export` after
1061
- the bulk exporter finishes.
1581
+ that directory as user-selected export input rather than local cache recovery:
1582
+ run `agentlog import windsurf ~/windsurf-cascade-export` after the bulk exporter
1583
+ finishes.
1062
1584
 
1063
- The older experimental helper can still read Markdown artifacts from Windsurf
1064
- Cascade brain directories when present. Recognized artifact names are `plan.md`,
1065
- `task.md`, `implementation_plan.md`, and `walkthrough.md`. Each artifact becomes
1066
- an assistant message with a heading naming the artifact. Timestamps come from
1067
- file mtimes.
1585
+ The local importer reads Markdown artifacts from Windsurf Cascade brain
1586
+ directories when present. Recognized artifact names are `plan.md`, `task.md`,
1587
+ `implementation_plan.md`, and `walkthrough.md`. Each artifact becomes an
1588
+ assistant message with a heading naming the artifact. Timestamps come from file
1589
+ mtimes.
1068
1590
 
1069
1591
  The importer tries to infer a working directory from `file://...` links in the
1070
1592
  Markdown. If none can be inferred, it archives under `windsurf/uncategorized`.
1071
- Binary Cascade protobuf stores are counted in discovery details but not decoded.
1593
+ Binary Cascade protobuf stores are counted in discovery details and preserved
1594
+ when they match an imported brain artifact or a repair stub, but not decoded. If
1595
+ import warnings name partial Windsurf conversations, use Download trajectory for
1596
+ a stable full transcript, or reopen the conversation in Windsurf and send
1597
+ `/recall` or a short message if a newer Windsurf build starts writing readable
1598
+ artifacts, then rerun `agentlog import --source windsurf --since all`.
1072
1599
 
1073
1600
  ## Collector And Live Monitoring
1074
1601
 
@@ -1094,8 +1621,13 @@ as importing transcript history.
1094
1621
 
1095
1622
  - ChatGPT and Claude.ai are import-by-export only; agentlog does not read their
1096
1623
  desktop app local stores.
1097
- - Windsurf encrypted cache scanning is disabled; downloaded trajectory Markdown
1098
- exports are supported.
1624
+ - Windsurf local imports are partial; downloaded trajectory Markdown exports are
1625
+ the supported path for stable full Cascade transcripts.
1626
+ - Windsurf has no subagent linking: Cascade conversation bodies are undecoded
1627
+ protobufs and the ACP metadata cache exposes no parent/child or spawn fields,
1628
+ so there is nothing to parse into `windsurf_subagent` sessions today. The
1629
+ viewer accepts `sessionSummary.windsurfSubagentRuns` so runs render if a
1630
+ future Windsurf build exposes spawn metadata.
1099
1631
  - Antigravity protobuf transcripts are counted but not decoded.
1100
1632
  - Cursor older `state.vscdb` stores are best-effort because Cursor has changed
1101
1633
  local storage layouts over time.