agentel 0.2.0

@@ -0,0 +1,551 @@
1
+ # agentlog — spec v0.2
2
+
3
+ A weekend-buildable archive and recall layer for agent coding sessions across Codex, ChatGPT exports, Claude, Gemini, Antigravity, Devin, and Cursor. Local-first, optionally cloud-backed via any S3-compatible storage. Web chats from ChatGPT and Claude.ai are importable via their official export flows. Windsurf is disabled until Cascade exposes a readable local transcript path.
4
+
5
+ ## What it does
6
+
7
+ Agentlog optimizes for four product constraints:
8
+
9
+ 1. **Preserve the source history as-is.** Import copies raw source files into the archive before normalizing them, so agentlog can re-parse sessions as provider formats change.
10
+ 2. **Create a durable recall substrate.** Readable markdown plus canonical event JSONL is the source of truth for `/recall` skills, MCP tools, coding agents, and standardization across providers.
11
+ 3. **Show full histories clearly.** The CLI and local web viewer must make complete conversation histories easy to browse, search, export, and inspect without hiding tool calls or context.
12
+ 4. **Sync across machines.** Local storage remains canonical, but every archive object should be syncable to S3-compatible storage or another cloud target for backup, restore, and multi-device recall.
13
+
14
+ Three layers, separable, in priority order:
15
+
16
+ 1. **Archive** — every agent session is captured as readable markdown plus raw transcripts, redacted at ingest, written to S3-compatible storage keyed by canonical repo identity.
17
+ 2. **Recall** — an MCP server exposing one tool, `search_past_sessions(query, repo, limit)`, that searches canonical event JSONL first and falls back to markdown/transcript retrieval. Available to any MCP-capable agent and wrapped by installable agent commands/skills.
18
+ 3. **Notify** — optional `/buzz`-style Slack skill that posts session summaries on completion. Cut from v0; ships separately as `agentlog-slack`.
19
+
20
+ ## Architecture
21
+
22
+ ```
23
+ ┌─────────────────────────────────────────────────┐
24
+ │ Agents (Claude Code, Codex, Devin, Cursor) │
25
+ └──────────────┬──────────────────────────────────┘
26
+
27
+ ┌──────┴───────┐
28
+ │ │
29
+ OTel push File tail / poll
30
+ (Claude Code) (Codex JSONL,
31
+ (Cowork) Cursor SQLite)
32
+ │ │
33
+ └──────┬───────┘
34
+
35
+ ┌───────────────────────┐ ┌──────────────────┐
36
+ │ agentlog supervisor │◄────────│ Web chat import │
37
+ │ ├─ collector │ │ (Claude.ai, │
38
+ │ ├─ openobserve │ │ ChatGPT export │
39
+ │ ├─ codex-watcher │ │ files) │
40
+ │ ├─ cursor-poller │ └──────────────────┘
41
+ │ ├─ indexer │
42
+ │ └─ importer │
43
+ └──────────┬────────────┘
44
+
45
+ ┌───────────────────────┐
46
+ │ S3-compatible bucket │
47
+ │ (R2 / S3 / B2 / │
48
+ │ MinIO / etc) │
49
+ └──────────┬────────────┘
50
+
51
+ ┌───────────────────────┐
52
+ │ agentlog-recall MCP │ ┌─────────────────────┐
53
+ │ (stdio, on-demand) │ │ agentlog history │
54
+ │ search_past_sessions │ │ (cchv-based) │
55
+ └───────────────────────┘ └─────────────────────┘
56
+ ```
57
+
58
+ ## Process model
59
+
60
+ **One supervisor, several workers.** The supervisor is the only process the user thinks about. It manages child processes, handles restarts with backoff, exposes a control socket at `~/.agentlog/control.sock`, and unifies logging.
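The restart-with-backoff behavior can be sketched as a small delay function. This is a minimal sketch, assuming exponential backoff with a cap; the `restartDelayMs` name, the 1-second base, and the 60-second cap are illustrative assumptions, not values taken from the agentlog source.

```javascript
// Hypothetical helper: delay before the Nth consecutive restart of a child.
// Assumes exponential backoff (base 1s, doubling) capped at 60s.
function restartDelayMs(attempt, { baseMs = 1000, maxMs = 60000 } = {}) {
  if (!Number.isInteger(attempt) || attempt < 0) {
    throw new RangeError("attempt must be a non-negative integer");
  }
  // Clamp the exponent so 2 ** attempt cannot overflow to Infinity.
  const exponent = Math.min(attempt, 30);
  return Math.min(maxMs, baseMs * 2 ** exponent);
}
```

A supervisor would typically reset `attempt` to zero once a child has stayed up for some healthy interval, so a single crash does not leave it on the longest delay.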
61
+
62
+ **Always-on workers (run inside the supervisor):**
63
+
64
+ - OTel collector — receives Claude Code's OTLP pushes, ~40MB RAM
65
+ - OpenObserve — local-mode storage and query, ~100MB RAM (skipped in remote/team mode)
66
+ - Codex watcher — `fsnotify` on `~/.codex/sessions/`, ~10MB RAM, idle when no Codex activity
67
+ - Devin/Cursor pollers — SQLite/transcript scans for configured local sources
68
+ - Indexer — runs every 10 minutes if there are unindexed sessions; pauses on battery
69
+ - Importer — runs at low priority during backfill operations; idle otherwise
70
+
71
+ **On-demand workers (launched by their caller):**
72
+
73
+ - `agentlog-recall` MCP server — spawned by the agent client over stdio when a session starts; killed when the session ends. No always-on cost.
74
+
75
+ **Total always-on footprint, solo install:** ~150MB RAM, near-zero idle CPU. Comparable to a menu-bar chat app.
76
+
77
+ **Team mode inverts this:** developer machines run only the collector and watchers (~50MB total) and forward OTLP to a team server that runs OpenObserve, the indexer, and the recall HTTP endpoint.
78
+
79
+ ## Auto-start at login
80
+
81
+ `init` offers to install a platform-native login agent. Default: yes, with explicit opt-out.
82
+
83
+ **Prompt during init:**
84
+
85
+ ```
86
+ agentlog can start automatically when you log in.
87
+
88
+ This runs a small background service (~150MB RAM) that captures
89
+ your agent conversations as they happen. You can stop it anytime
90
+ with `agentlog stop` or remove auto-start with `agentlog uninstall`.
91
+
92
+ Start agentlog automatically at login? [Y/n]
93
+ ```
94
+
95
+ **Per-platform implementation:**
96
+
97
+ - **macOS:** `~/Library/LaunchAgents/com.agentlog.supervisor.plist`. `RunAtLoad: true`, `KeepAlive: { SuccessfulExit: false }` (restart on crash, not on clean shutdown). Logs to `~/Library/Logs/agentlog/`. Loaded with `launchctl load -w`.
98
+ - **Linux:** systemd user unit at `~/.config/systemd/user/agentlog.service`. `Type=simple`, `Restart=on-failure`, `RestartSec=5`. Enabled with `systemctl --user enable --now`. Init detects whether `loginctl enable-linger` is needed and prompts separately: "Keep agentlog running when you're logged out? [y/N]" — default no.
99
+ - **Windows:** Scheduled Task triggered at logon (`schtasks /create /tn "agentlog" /tr ... /sc onlogon`). Service-based install is a v1 enhancement.
100
+
101
+ **Critical detail:** the launch agent runs `agentlog start --foreground`, not `agentlog start`. Foreground mode keeps the OS supervisor (launchd/systemd/Task Scheduler) as the parent — detaching breaks process tracking and crash restart.
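As an illustration of the foreground requirement, the Linux unit could be rendered roughly as follows. This is a sketch only: `renderSystemdUnit` and the `binPath` argument are hypothetical, the `Description` text and `WantedBy=default.target` are assumptions, and only `Type`, `Restart`, `RestartSec`, and the `start --foreground` command come from the text above.

```javascript
// Hypothetical renderer for the systemd user unit described above.
// binPath is an assumption; the real install location is not specified here.
function renderSystemdUnit(binPath) {
  return [
    "[Unit]",
    "Description=agentlog supervisor",
    "",
    "[Service]",
    "Type=simple",
    // Foreground mode keeps systemd as the direct parent of the supervisor.
    `ExecStart=${binPath} start --foreground`,
    "Restart=on-failure",
    "RestartSec=5",
    "",
    "[Install]",
    "WantedBy=default.target",
    "",
  ].join("\n");
}
```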
102
+
103
+ **Lifecycle commands:**
104
+
105
+ ```
106
+ agentlog autostart enable # writes the launch agent/unit/task
107
+ agentlog autostart disable # removes auto-start, keeps agentlog installed
108
+ agentlog autostart status # shows current state
109
+ agentlog uninstall [--keep-data]
110
+ ```
111
+
112
+ `uninstall` is exhaustively tested: it removes the launch agent, the config, the binaries, and optionally the data (with confirmation). For a tool handling sensitive data, "remove all traces" must actually work.
113
+
114
+ ## Resource awareness
115
+
116
+ The supervisor respects laptop realities:
117
+
118
+ - **Power state.** On battery, the indexer interval lengthens from 10 to 30 minutes, compaction is skipped, and speculative pre-fetching is disabled.
119
+ - **Network state.** If the storage backend is remote and the machine is offline, writes spool to `~/.agentlog/spool/` and are flushed on reconnect. The OTLP collector already does this for spans; the same pattern extends to S3 writes.
120
+ - **Sleep/wake.** On wake, supervisor verifies child health and restarts any that died during sleep.
121
+ - **Cursor presence.** Cursor poller checks `pgrep -x Cursor` before each poll cycle; sleeps entirely when Cursor is not running.
122
+
123
+ These are v0 features, not v1, because they're the difference between a tool people keep installed and a tool people uninstall after a week.
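The power and network rules above reduce to a small decision function. The sketch below is illustrative: `resourcePlan`, its input flags, and the returned field names are hypothetical, while the 10 and 30 minute intervals and the spool behavior come from the list above.

```javascript
// Hypothetical planner: turns machine state into supervisor settings.
function resourcePlan({ onBattery, online, remoteBackend }) {
  let writeTarget = "local";
  if (remoteBackend) {
    // Remote writes are spooled to disk while offline and flushed on reconnect.
    writeTarget = online ? "remote" : "spool";
  }
  return {
    indexerIntervalMinutes: onBattery ? 30 : 10,
    runCompaction: !onBattery,
    prefetch: !onBattery,
    writeTarget,
  };
}
```

Keeping the policy in one pure function makes it cheap to re-evaluate on every power or network change event.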
124
+
125
+ ## Storage layer
126
+
127
+ **Backend: anything S3-compatible.** Backend choice is a config matter, not a code matter. Supported in `init`:
128
+
129
+ - **Local** (default first-run) — `~/.agentlog/data/`, no cloud account
130
+ - **R2** (recommended for personal/small team) — Cloudflare, no egress fees, free 10GB tier
131
+ - **S3** (recommended for AWS-native teams)
132
+ - **Custom endpoint** — covers B2, MinIO, Wasabi, Tigris, Hetzner, etc.
133
+
134
+ **Bucket layout:**
135
+
136
+ ```
137
+ s3://<bucket>/agentlog/
138
+ devices/
139
+ <device-name>/
140
+ sessions/
141
+ repo=<canonical-repo-key>/
142
+ provider=<claude_code|codex|cursor|devin>/
143
+ year=2026/month=04/day=26/
144
+ session=<session_id>.conversation.md
145
+ session=<session_id>.transcript.jsonl
146
+ session=<session_id>.metadata.json
147
+ session=<session_id>.events.jsonl
148
+ scope=<claude-web|chatgpt>/
149
+ year=2026/month=04/day=26/
150
+ session=<session_id>.conversation.md
151
+ session=<session_id>.transcript.jsonl
152
+ session=<session_id>.metadata.json
153
+ session=<session_id>.events.jsonl
154
+ indexes/
155
+ bm25/... # local keyword/BM25-style index over events/transcripts
156
+ snapshots/
157
+ 20260504T173000Z/
158
+ <device-name>/
159
+ sessions/...
160
+ ```
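A key builder for this layout might look like the sketch below. The function name and input fields are hypothetical. Two choices are assumptions rather than statements from the spec: the date partition uses UTC, and the repo key is percent-encoded so that it stays a single path segment.

```javascript
// Hypothetical builder for archive object keys following the layout above.
const ARTIFACT_SUFFIX = {
  conversation: "conversation.md",
  transcript: "transcript.jsonl",
  metadata: "metadata.json",
  events: "events.jsonl",
};

function sessionObjectKey({ device, repo, scope, provider, startedAt, sessionId, kind }) {
  const suffix = ARTIFACT_SUFFIX[kind];
  if (!suffix) throw new Error(`unknown artifact kind: ${kind}`);
  const date = new Date(startedAt);
  if (Number.isNaN(date.getTime())) throw new Error("invalid startedAt");
  const pad = (n) => String(n).padStart(2, "0");
  // Assumption: partitions are computed in UTC so devices agree on the day.
  const partition =
    `year=${date.getUTCFullYear()}/month=${pad(date.getUTCMonth() + 1)}/day=${pad(date.getUTCDate())}`;
  // Web chats are filed under scope=...; coding sessions under repo=.../provider=...
  const owner = scope
    ? `scope=${scope}`
    : `repo=${encodeURIComponent(repo)}/provider=${provider}`;
  return `agentlog/devices/${device}/sessions/${owner}/${partition}/session=${sessionId}.${suffix}`;
}
```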
161
+
162
+ Markdown conversations are the primary human-readable representation because
163
+ agents and humans can inspect them with ordinary filesystem tools. Raw
164
+ transcripts are stored alongside as immutable JSONL for provenance and
165
+ re-indexing. `events.jsonl` is the canonical machine-readable recall substrate:
166
+ one provider-independent JSONL event stream with `session.started`,
167
+ `prompt.submitted`, `response.generated`, `tool.called`, and `tool.completed`.
168
+ Structured analytics artifacts such as Parquet/OTel spans are optional siblings,
169
+ not the default recall substrate.
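To make the event stream concrete, here is a minimal sketch of serializing normalized events to JSONL. Only the five `kind` values come from the paragraph above; the other fields, the index-based `event_id`, and the `toEventsJsonl` name are assumptions for illustration.

```javascript
// Event kinds named in the spec text above.
const EVENT_KINDS = new Set([
  "session.started",
  "prompt.submitted",
  "response.generated",
  "tool.called",
  "tool.completed",
]);

// Hypothetical serializer: one JSON object per line, in event order.
function toEventsJsonl(events) {
  return events
    .map((event, index) => {
      if (!EVENT_KINDS.has(event.kind)) {
        throw new Error(`unknown event kind: ${event.kind}`);
      }
      // JSON.stringify escapes embedded newlines, so each event stays on one line.
      return JSON.stringify({ event_id: index, ...event }) + "\n";
    })
    .join("");
}
```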
170
+
171
+ Every importer has a centralized semantic parser version in
172
+ `src/parser-versions.js`. Parser versions are included in archive metadata and
173
+ import fingerprints. The first npm release uses `1.0.0` as the baseline for
174
+ every source type. After release, when parser output changes for the same raw
175
+ input, the source-type version must be bumped in the same change so stale
176
+ archives can be replaced.
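One way the version bump could invalidate stale archives is to fold the parser version into the import fingerprint. The sketch below is an assumption about how that might work; the registry contents beyond the `1.0.0` baseline, the hashing scheme, and the function name are not taken from `src/parser-versions.js`.

```javascript
const crypto = require("node:crypto");

// Hypothetical registry; the spec only states that every source type starts at 1.0.0.
const PARSER_VERSIONS = Object.freeze({
  "codex-cli": "1.0.0",
  claude: "1.0.0",
  cursor: "1.0.0",
});

// Fingerprint covers source type, parser version, and raw bytes, so a version
// bump changes the fingerprint even when the raw input is byte-identical.
function importFingerprint(sourceType, rawBytes, versions = PARSER_VERSIONS) {
  const version = versions[sourceType];
  if (!version) throw new Error(`no parser version registered for ${sourceType}`);
  return crypto
    .createHash("sha256")
    .update(`${sourceType}\0${version}\0`)
    .update(rawBytes)
    .digest("hex");
}
```

An importer could compare the stored fingerprint with a freshly computed one and re-parse the raw copy whenever they differ.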
177
+
178
+ **Migration:** `agentlog migrate --to <backend>` records the remote target; `agentlog sync` uploads the same markdown-primary object layout to any S3-compatible target under `agentlog/devices/<device-name>/...`. Local→R2 is a one-shot upload and then an incremental supervisor upload. Sync does not delete remote objects. Receive-only and two-way sync should read other device namespaces and merge normalized archive metadata without interpreting absence on one device as a delete. `agentlog snapshot` writes redundant point-in-time copies under `agentlog/snapshots/<timestamp>/<device-name>/...`.
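The "sync does not delete" rule can be expressed as an upload-only plan. This sketch assumes each side can be listed as a map from object key to content hash; `planSync` and that listing shape are hypothetical.

```javascript
// Hypothetical planner: local and remote listings are Maps of key -> content hash.
// Remote-only keys are never scheduled for deletion, since absence on one
// device is not treated as a delete.
function planSync(localObjects, remoteObjects) {
  const uploads = [];
  for (const [key, hash] of localObjects) {
    if (remoteObjects.get(key) !== hash) uploads.push(key);
  }
  return { uploads: uploads.sort(), deletes: [] };
}
```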
179
+
180
+ ## Repo keying
181
+
182
+ Every span and record gets `agentlog.repo.canonical`, derived in this order:
183
+
184
+ 1. `git config --get remote.origin.url` from the session's `cwd`, normalized: lowercase the host and path, strip the protocol and `user@` prefix, convert the scp-style `:` to `/`, strip `.git`, strip the trailing slash. `git@github.com:User/Repo.git` → `github.com/user/repo`.
185
+ 2. First-commit SHA fallback: `git rev-list --max-parents=0 HEAD` → `firstcommit:<sha>`.
186
+ 3. Non-git fallback: SHA-256 hash of the `cwd` path, normalized to home-relative → `path:<sha256>`.
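The first and third derivation steps can be sketched as follows. Both function names are hypothetical. The normalizer lowercases the whole key so that it reproduces the `github.com/user/repo` example, and the path fallback assumes POSIX-style paths and a `~` prefix for home-relative form; neither detail is confirmed by the source.

```javascript
const crypto = require("node:crypto");

// Hypothetical normalizer for a git remote URL.
function canonicalRepoKey(remoteUrl) {
  let key = remoteUrl.trim();
  key = key.replace(/^[a-z][a-z0-9+.-]*:\/\//i, ""); // strip protocol (https://, ssh://, git://)
  key = key.replace(/^[^@\/]+@/, "");                // strip user@ prefix
  key = key.replace(/^([^\/:]+):(?!\d+\/)/, "$1/");  // scp-style host:path -> host/path
  key = key.replace(/\/+$/, "");                     // strip trailing slashes
  key = key.replace(/\.git$/i, "");                  // strip .git suffix
  return key.toLowerCase();
}

// Hypothetical non-git fallback: hash the home-relative path so the same
// folder layout yields the same key under different home directories.
function pathRepoKey(cwd, home) {
  const insideHome = cwd === home || cwd.startsWith(home + "/");
  const normalized = insideHome ? "~" + cwd.slice(home.length) : cwd;
  const digest = crypto.createHash("sha256").update(normalized).digest("hex");
  return `path:${digest}`;
}
```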
187
+
188
+ Repo-level override at `.agentlog.yaml`:
189
+
190
+ ```yaml
191
+ canonical_repo: github.com/myorg/private-name
192
+ aliases:
193
+ - github.com/myorg/old-name
194
+ ```
195
+
196
+ Web chat imports use `agentlog.scope.canonical` instead (e.g. `claude-web`, `chatgpt`) — see Web chat import section.
197
+
198
+ ## Redaction (at ingest, not at query)
199
+
200
+ Three layers in the collector before anything hits storage:
201
+
202
+ 1. **Built-in patterns** (always on):
203
+ - AWS keys, OpenAI/Anthropic keys, GitHub tokens, Slack tokens
204
+ - JWT-shaped strings, private key blocks
205
+ - High-entropy strings >32 chars in `KEY=value` shapes
206
+
207
+ 2. **Env-var value scrubbing.** If `.env` files are read in a session, configured variable values are scrubbed wherever they appear in transcripts.
208
+
209
+ 3. **User-defined patterns** in `~/.agentlog/redaction.yaml`:
210
+
211
+ ```yaml
212
+ patterns:
213
+ - name: internal_api
214
+ regex: 'https://[a-z]+\.internal\.acme\.com/[^\s]+'
215
+ env_vars: [API_KEY, DATABASE_URL, STRIPE_SECRET]
216
+ allowlist_repos:
217
+ - github.com/acme/public-docs
218
+ ```
219
+
220
+ Each session gets a `redaction_summary` span: counts by category, no content. Users can audit "did this leak anything" without seeing what leaked.
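A minimal sketch of pattern redaction with a content-free summary follows. The two patterns shown (AWS access key IDs and classic GitHub personal access tokens) stand in for the much broader built-in set; the placeholder format, category names, and `redact` signature are assumptions.

```javascript
// Hypothetical pattern set; the real built-in list covers more credential types.
const PATTERNS = [
  { category: "aws_access_key", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  { category: "github_token", regex: /\bghp_[A-Za-z0-9]{36}\b/g },
];

function redact(text, envValues = []) {
  const summary = {};
  const bump = (category, n = 1) => {
    summary[category] = (summary[category] || 0) + n;
  };
  let out = text;
  for (const { category, regex } of PATTERNS) {
    out = out.replace(regex, () => {
      bump(category);
      return `[REDACTED:${category}]`;
    });
  }
  // Env-var value scrubbing: replace literal occurrences of known secret values.
  for (const value of envValues) {
    if (!value) continue;
    const parts = out.split(value);
    if (parts.length > 1) {
      bump("env_value", parts.length - 1);
      out = parts.join("[REDACTED:env_value]");
    }
  }
  // The summary holds counts by category only, never the matched content.
  return { text: out, summary };
}
```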
221
+
222
+ `agentlog reveal <session-id>` re-renders un-redacted from local cache. Local-only — never works on remote/team archives.
223
+
224
+ **Honesty about limits:** pattern-based redaction catches credentials. It does not catch personal/sensitive content (medical, legal, financial conversations). This matters especially for web chat imports, which is why those default to local-only storage.
225
+
226
+ ## Per-provider capture
227
+
228
+ ### Claude Code & Claude Cowork
229
+
230
+ Native OTel. `init` merges into `~/.claude/settings.json`:
231
+
232
+ ```json
233
+ {
234
+ "env": {
235
+ "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
236
+ "OTEL_METRICS_EXPORTER": "otlp",
237
+ "OTEL_LOGS_EXPORTER": "otlp",
238
+ "OTEL_EXPORTER_OTLP_PROTOCOL": "http/json",
239
+ "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4318",
240
+ "OTEL_LOG_USER_PROMPTS": "1",
241
+ "OTEL_RESOURCE_ATTRIBUTES": "service.name=claude-code,agentlog.user=<user>"
242
+ }
243
+ }
244
+ ```
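The merge step might be as simple as the sketch below, which assumes a shallow merge of the `env` object is enough. `mergeClaudeSettings` is a hypothetical name, and in this sketch agentlog's values win over existing values for the same key; a real `init` might prompt instead.

```javascript
// Hypothetical merge: add agentlog's env keys without dropping unrelated settings.
function mergeClaudeSettings(existing, agentlogEnv) {
  return {
    ...existing,
    env: { ...(existing.env || {}), ...agentlogEnv },
  };
}
```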
245
+
246
+ ### Codex CLI
247
+
248
+ `fsnotify` watcher on `~/.codex/sessions/YYYY/MM/DD/`. Decompresses `.jsonl.zst`, normalizes to OTel `gen_ai.*` spans, posts to local collector. State (file offsets) in `~/.agentlog/state/codex-cursor.db` SQLite. ~200 lines of Go. Deleted when OpenAI ships native OTel.
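The offset-tracking part of the watcher can be illustrated in isolation. The spec describes a Go implementation; the sketch below uses JavaScript only to match the package's other code, skips zstd decompression, and operates on already-decompressed bytes. `readNewRecords` is a hypothetical name.

```javascript
// Given a file's bytes and the last committed offset, return the complete JSONL
// records past that offset plus the new offset. A trailing partial line (a write
// still in progress) is left for the next pass.
function readNewRecords(buffer, offset) {
  const lastNewline = buffer.lastIndexOf(0x0a);
  if (lastNewline < offset) return { records: [], offset };
  const chunk = buffer.subarray(offset, lastNewline + 1).toString("utf8");
  const records = chunk
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
  return { records, offset: lastNewline + 1 };
}
```

Persisting the returned offset only after the records have been posted to the collector gives at-least-once delivery across restarts.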
249
+
250
+ ### Cursor
251
+
252
+ SQLite/transcript poller, 30-second interval, only active when `Cursor` process detected. Reads older `~/Library/Application Support/Cursor/User/workspaceStorage/*/state.vscdb` stores and newer `~/.cursor/projects/<project>/agent-transcripts/` JSON/JSONL transcripts on macOS/Linux. macOS Full Disk Access prompt documented with screenshots.
253
+
254
+ ### Devin
255
+
256
+ SQLite importer for Devin for Terminal. Reads
257
+ `~/.local/share/devin/cli/sessions.db`, reconstructs the visible message branch
258
+ from `sessions.main_chain_id` plus `message_nodes.parent_node_id`, skips Devin's
259
+ injected context messages, and archives user, assistant, and tool messages under
260
+ provider `devin`. `AGENTLOG_DEVIN_SESSIONS_DB` can point at an alternate
261
+ database.
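Reconstructing the visible branch amounts to walking parent pointers from the tip of the main chain back to the root. The sketch below assumes the tip node id has already been resolved from `sessions.main_chain_id` and that rows have been loaded as plain objects with `id` and `parent_node_id`; how Devin actually encodes the chain is not confirmed here.

```javascript
// Hypothetical walk: nodes are rows from message_nodes, tipId is the last
// node of the visible chain. Abandoned sibling branches are never visited.
function visibleBranch(nodes, tipId) {
  const byId = new Map(nodes.map((node) => [node.id, node]));
  const branch = [];
  const seen = new Set();
  let id = tipId;
  while (id !== null && id !== undefined) {
    const node = byId.get(id);
    if (!node) throw new Error(`missing node ${id}`);
    if (seen.has(id)) throw new Error(`cycle at node ${id}`);
    seen.add(id);
    branch.push(node);
    id = node.parent_node_id;
  }
  // Collected tip-to-root; reverse into conversation order.
  return branch.reverse();
}
```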
262
+
263
+ ### Web chat import (Claude.ai, ChatGPT)
264
+
265
+ Web chats don't have a real-time hook — Anthropic and OpenAI provide periodic export files only. agentlog imports these as one-shot operations per export.
266
+
267
+ **Flow:**
268
+
269
+ 1. User exports from Claude.ai (Settings → Privacy → Export) or ChatGPT (Settings → Data Controls → Export)
270
+ 2. Email arrives with download link
271
+ 3. User runs: `agentlog import claude-web --file <downloaded file>` or `agentlog import chatgpt --file <downloaded file>`
272
+
273
+ **Storage scope:** local-only by default, **even if the agentlog instance is configured for a team backend.** Web chats often contain personal content; opt-in required to share with team. Override with `--scope team`.
274
+
275
+ **Repo keying:** web chats are stored under `scope=claude-web` or `scope=chatgpt` rather than a repo key. Heuristic repo inference (matching code blocks and error messages against known repos) deferred to v1.
276
+
277
+ **Recall behavior:** excluded from agent-initiated `search_past_sessions` calls by default. Included in human-initiated `agentlog history` searches. Override via `--include-web-chats` flag on recall queries.
278
+
279
+ **Frequency:** designed for periodic re-import, not continuous capture. Realistic cadence is "monthly or when the user remembers." Detection of new exports in `~/Downloads/` and prompting is a v1 enhancement.
280
+
281
+ **Confirmation prompt for team-configured installs:**
282
+
283
+ ```
284
+ $ agentlog import claude-web --file ...
285
+ Storage backend: team (s3://acme-agentlog/)
286
+ Web chats often contain personal content. By default they'll be
287
+ stored only in your local archive, not the team archive.
288
+
289
+ 1) Local only (recommended)
290
+ 2) Team archive (your conversations will be visible to teammates)
291
+ 3) Cancel
292
+ ```
293
+
294
+ ## Importing existing CLI history
295
+
296
+ `init` scans for existing CLI conversations and offers to import them. The default scope is "last 30 days": recent enough to be useful for recall, and narrow enough that long-forgotten sessions are not archived by surprise.
297
+
298
+ **Discovery during init:**
299
+
300
+ ```
301
+ Scanning for existing conversations...
302
+ ✓ Codex CLI: 89 sessions across 12 projects (oldest: 2025-09-15)
303
+ ✓ Codex Desktop: 14 sessions across 3 projects (oldest: 2026-02-01)
304
+ ✓ Claude Code CLI: 247 sessions across 18 projects (oldest: 2025-11-03)
305
+ ✓ Claude Code Desktop: 4 sessions (oldest: 2026-02-04)
306
+ ✓ Claude Workspace: 7 sessions (oldest: 2026-02-04)
307
+ ✓ Gemini CLI: 2 sessions (oldest: 2026-03-01)
308
+ ✓ Antigravity: 2 sessions (oldest: 2025-11-19)
309
+ ✓ Devin CLI: 3 sessions (oldest: 2026-04-28)
310
+ ✓ Cursor: 31 sessions across 4 workspaces (oldest: 2026-01-02)
311
+
312
+ Import existing history?
313
+ 1) Last 30 days (default)
314
+ 2) Everything
315
+ 3) Choose specific repos
316
+ 4) Skip for now
317
+ ```
318
+
319
+ Import runs as a background worker at lower priority than live ingestion. Progress visible via `agentlog import status`. Idempotent on re-run (tracks imported session IDs in state DB). Sessions whose `cwd` no longer exists fall back to path-hash repo keying.
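Idempotent selection with a `--since` window can be sketched like this. The session shape (`id`, `updatedAt` in epoch milliseconds) and both function names are assumptions; the `30d|all` grammar comes from the command synopsis.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Parse "--since" values of the form "<n>d" or "all" into a cutoff timestamp.
function parseSince(value, now = Date.now()) {
  if (value === "all") return 0;
  const match = /^(\d+)d$/.exec(value);
  if (!match) throw new Error(`unsupported --since value: ${value}`);
  return now - Number(match[1]) * DAY_MS;
}

// Hypothetical selector: skip sessions already recorded in the state DB
// (modeled here as a Set of ids) and sessions older than the cutoff.
function selectForImport(sessions, importedIds, since, now = Date.now()) {
  const cutoff = parseSince(since, now);
  return sessions.filter((s) => !importedIds.has(s.id) && s.updatedAt >= cutoff);
}
```

Because already-imported ids are filtered out up front, re-running the same import selects nothing new.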
320
+
321
+ **Standalone command:**
322
+
323
+ ```
324
+ agentlog import [--source codex-cli|codex-desktop|claude|claude-code-desktop|claude-workspace|claude-sdk|gemini-cli|antigravity|devin-cli|cursor|all]
325
+ [--since 30d|all]
326
+ [--repos <list>]
327
+ [--dry-run]
328
+ ```
329
+
330
+ `--dry-run` shows what would be imported without doing it.
331
+
332
+ **Web chat import is a separate command** — it requires a file argument and has different default storage scope. Not part of the init flow because exports take time to generate.
333
+
334
+ ## Collector
335
+
336
+ A single Go binary wrapping the upstream OTel collector with three custom processors:
337
+
338
+ 1. `repokeyprocessor` — derives canonical repo key from `cwd`, or scope key for web chats
339
+ 2. `redactionprocessor` — runs the three redaction layers
340
+ 3. `agentnormalizer` — for file-tail providers and import sources, converts ingested events into OTel spans matching `gen_ai.*` semantic conventions
341
+
342
+ Exporters: OTLP→OpenObserve for spans/metrics/logs; direct `s3exporter` for raw transcripts. Both share S3 credentials.
343
+
344
+ ## Recall layer
345
+
346
+ Separate binary, `agentlog-recall`. Spawned by agent clients over stdio (preferred) or run as HTTP server for team mode.
347
+
348
+ **One MCP tool:**
349
+
350
+ ```
351
+ search_past_sessions(query: string, repo?: string, limit?: int = 10,
352
+ include_web_chats?: bool = false)
353
+ → list of message excerpts with session links
354
+ ```
355
+
356
+ **Recall pipeline:**
357
+
358
+ - Builds a local keyword/BM25-style index over `events.jsonl` when present
359
+ - Indexes prompt, response, tool-call, and tool-result event text independently
360
+ - Aggregates event hits back to sessions for CLI/MCP compatibility
361
+ - Returns optional `event_id`, `event_kind`, `message_index`, and matched text
362
+ - Falls back to transcript/markdown search for legacy archives without events
363
+
364
+ **Retrieval:** event-first over canonical JSONL. `repo` parameter is a hard filter.
365
+ Without it, results are weighted toward the calling agent's current `cwd` repo.
366
+ Web chats are excluded unless `include_web_chats=true`.
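The event-first retrieval described above can be approximated with a small in-memory BM25 scorer. This is a sketch under several assumptions: the event shape (`session_id`, `repo`, `text`), the tokenizer, the common `k1 = 1.2` and `b = 0.75` defaults, and aggregation by best-scoring event per session are all illustrative choices. It also omits the `cwd`-based weighting and the web-chat exclusion.

```javascript
// Lowercase word tokenizer; a real index would likely treat code identifiers more carefully.
const tokenize = (text) => String(text).toLowerCase().match(/[a-z0-9_]+/g) || [];

// events: [{ session_id, repo, text }], flattened from events.jsonl files.
function searchPastSessions(events, query, { repo, limit = 10 } = {}) {
  // `repo` is a hard filter: events from other repos never enter the pool.
  const pool = repo ? events.filter((e) => e.repo === repo) : events;
  const docs = pool.map((event) => ({ event, tokens: tokenize(event.text) }));
  const terms = [...new Set(tokenize(query))];
  if (docs.length === 0 || terms.length === 0) return [];

  const k1 = 1.2; // term-frequency saturation
  const b = 0.75; // length-normalization strength
  const avgLen = docs.reduce((sum, d) => sum + d.tokens.length, 0) / docs.length || 1;
  const docFreq = new Map(
    terms.map((term) => [term, docs.filter((d) => d.tokens.includes(term)).length])
  );

  const bestBySession = new Map();
  for (const { event, tokens } of docs) {
    let score = 0;
    for (const term of terms) {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = docFreq.get(term);
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * tokens.length) / avgLen));
    }
    // Aggregate event hits back to sessions by keeping the best-scoring event.
    const previous = bestBySession.get(event.session_id);
    if (score > 0 && (!previous || score > previous.score)) {
      bestBySession.set(event.session_id, {
        session_id: event.session_id,
        score,
        matched_text: event.text,
      });
    }
  }
  return [...bestBySession.values()].sort((x, y) => y.score - x.score).slice(0, limit);
}
```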
367
+
368
+ **No memory promotion in v0.** Raw evidence with good retrieval beats lossy summarization. Add summarization in v1 only if v0 retrieval proves insufficient.
369
+
370
+ **Adding to agents:**
371
+
372
+ ```
373
+ agentlog recall add-to claude # writes ~/.claude/mcp.json, /commands/recall.md, and /skills/agentlog-recall/SKILL.md
374
+ agentlog recall add-to cursor # writes ~/.cursor/mcp.json
375
+ agentlog recall add-to codex # writes ~/.codex/config.toml
376
+ ```
377
+
378
+ Generated recall commands and skills should let the agent choose the first
379
+ `agentlog history` query from the user's request. They should prefer concise,
380
+ distinctive search terms over blindly passing the full `/recall` argument string
381
+ through to the CLI. Skill-style files should include a concise command table,
382
+ workflow, query-selection guidance, archive/filter hints, important rules, and
383
+ troubleshooting. Archive hints should note that sessions live under
384
+ `~/.agentlog/data/agentlog/sessions/repo=<repo-or-path-key>/provider=<provider>/...`,
385
+ git repos use canonical keys like `github.com/org/repo`, non-git directories may
386
+ use stable `path:<hash>` keys, and `agentlog history --repo "<repo-or-path>"`
387
+ matches canonical repo keys, local `cwd`, display labels, web scopes, and path
388
+ fragments.
389
+
390
+ ## History viewer
391
+
392
+ v0 ships a dependency-free local viewer behind `agentlog history --web`. It lists sessions in a repo tree sorted by last updated time, pages large folders with a load-more control, searches the same event-first recall index, filters by repo/provider/date, and opens full conversations through the CLI API. The static viewer follows shadcn/ui-style tokens and compact button/input/select/sidebar patterns without requiring a frontend build step. Stable `path:<hash>` keys remain valid archive identifiers for folders without git identity, but the viewer displays the local folder path. The transcript pane defaults to readable chat bubbles for user, assistant, system, and tool messages, with a markdown toggle for the canonical archive file. Tool rendering reads canonical events or normalized metadata first, uses category/icon/target fields for consistent Bash/edit/read/search/web/task/skill/MCP cards, and uses raw text patterns only for legacy archives.
393
+
394
+ **Commands:**
395
+
396
+ ```
397
+ agentlog history # native app pointed at archive
398
+ agentlog history --web # web UI on localhost:7824
399
+ agentlog history "query" --provider codex-cli
400
+ agentlog history --repo github.com/org/repo
401
+ agentlog history --since 7d
402
+ agentlog history --include-web-chats
403
+ agentlog show <session-id>
404
+ ```
405
+
406
+ The viewer's search hits the same retrieval endpoint as the recall MCP server — humans and agents see the same world (with the human-vs-agent default scope difference for web chats).
407
+
408
+ **For headless/server contexts** (SSH'd into a dev box), `--web` mode serves the UI on a local port with a bearer-token-in-URL auth pattern.
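A bearer-token-in-URL scheme could be sketched as below. The `token` query parameter name, the 32-byte token size, and the helper names are assumptions; the comparison hashes both values first so `crypto.timingSafeEqual` always receives buffers of equal length.

```javascript
const crypto = require("node:crypto");

// Generate an unguessable, URL-safe token for one viewer session.
function newViewerToken() {
  return crypto.randomBytes(32).toString("base64url");
}

// Hypothetical URL shape; the port would be whatever `--web` binds to.
function viewerUrl(port, token) {
  return `http://127.0.0.1:${port}/?token=${encodeURIComponent(token)}`;
}

// Constant-time comparison of the presented token against the expected one.
function tokenMatches(expected, presented) {
  const a = crypto.createHash("sha256").update(String(expected)).digest();
  const b = crypto.createHash("sha256").update(String(presented)).digest();
  return crypto.timingSafeEqual(a, b);
}
```

Tokens in URLs can end up in shell history and browser history, so a short lifetime or a one-time exchange for a cookie would be a reasonable hardening step.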
409
+
410
+ **Team mode** serves cchv as a web service at the same endpoint as the OTLP collector, gated by the same auth. v0 team mode: "everyone sees everything," documented as such. Per-user filtering and permissions in v1.
411
+
412
+ ## CLI surface
413
+
414
+ Complete user-facing commands:
415
+
416
+ ```
417
+ # Setup and lifecycle
418
+ agentlog init [--storage local|r2|s3|custom] [--remote URL]
419
+ agentlog start [--foreground]
420
+ agentlog stop
421
+ agentlog status
422
+ agentlog logs [--follow]
423
+ agentlog config <get|set> <key> [value]
424
+ agentlog migrate --to <backend>
425
+ agentlog sync [--endpoint <url>] [--bucket <name>] [--access-key-id <id>] [--secret-access-key <key>] [--prefix agentlog]
426
+ agentlog autostart <enable|disable|status>
427
+ agentlog doctor [--json]
428
+ agentlog uninstall [--keep-data]
429
+
430
+ # Capture management
431
+ agentlog reveal <session-id>
432
+ agentlog redact reapply
433
+ agentlog index <pause|resume|status>
434
+
435
+ # Import
436
+ agentlog import [--source codex-cli|codex-desktop|claude|claude-code-desktop|claude-workspace|claude-sdk|gemini-cli|antigravity|devin-cli|cursor|all] [--since 30d|all] [--repos <list>] [--dry-run]
437
+ agentlog import claude-web --file <path> [--scope local|team]
438
+ agentlog import chatgpt --file <path> [--scope local|team]
439
+ agentlog import status
440
+
441
+ # Recall
442
+ agentlog recall start
443
+ agentlog recall add-to <codex|claude|gemini|antigravity|cursor>
444
+ agentlog recall reindex
445
+
446
+ # Viewing
447
+ agentlog history [query] [--web] [--repo <repo>] [--provider <provider>] [--since <duration>] [--include-web-chats]
448
+ agentlog show <session-id> [--json|--path|--open]
449
+
450
+ # Team mode
451
+ agentlog server
452
+ ```
453
+
454
+ ## Setup flows
455
+
456
+ ### Solo, local only (~90 seconds)
457
+
458
+ ```
459
+ brew install agentlog
460
+ agentlog init # picks local storage, prompts auto-start
461
+ # → "Start agentlog automatically at login? [Y/n]"
462
+ # → scans for existing history
463
+ # → "Import last 30 days? [Y/n]"
464
+ # → writes launch agent, starts supervisor
465
+ # → import runs in background
466
+ ```
467
+
468
+ After `init` completes, prints:
469
+
470
+ ```
471
+ ✓ Launch agent installed at ~/Library/LaunchAgents/com.agentlog.supervisor.plist
472
+ ✓ Service started (PID 47291)
473
+ ✓ Collector listening on localhost:4318
474
+ ✓ Claude Code config updated at ~/.claude/settings.json
475
+ ✓ Background import started: 247 sessions queued (~25min)
476
+
477
+ View your history: `agentlog history`
478
+ Try it out: open Claude Code, have a quick conversation, then run
479
+ `agentlog status` to see it captured.
480
+ ```
481
+
482
+ ### Solo, R2-backed (~5 minutes)
483
+
484
+ ```
485
+ brew install agentlog
486
+ agentlog init --storage r2
487
+ # → opens dash.cloudflare.com/?to=/:account/r2/api-tokens
488
+ # → user pastes credentials back into CLI
489
+ # → agentlog creates bucket if needed, validates write
490
+ # → same auto-start and import prompts as above
491
+ ```
492
+
493
+ ### Team (afternoon for operator, ~1 minute per developer)
494
+
495
+ ```
496
+ # Operator, once:
497
+ terraform apply # ships agentlog/deploy-aws or deploy-cloudflare
498
+ # outputs OTLP endpoint and bootstrap token
499
+
500
+ # Each developer:
501
+ agentlog init --remote https://agentlog.myteam.com --token <token>
502
+ # auto-start prompt; import scoped to team policy;
503
+ # no local OpenObserve (team server runs it)
504
+ ```
505
+
506
+ Companion deployment modules: `agentlog/deploy-aws` (ECS + ALB + S3), `agentlog/deploy-cloudflare` (R2 + Workers for auth proxy), `agentlog/deploy-fly` (Fly.io single-region). The Terraform module is the unsexy linchpin for team adoption.
507
+
508
+ ## Privacy and data handling commitments
509
+
510
+ Codified because they shape every other decision:
511
+
512
+ 1. **No phone-home telemetry.** agentlog itself sends no usage analytics anywhere. Any future opt-in metrics live on a separate channel.
513
+ 2. **No filesystem scanning.** Only specific known paths: `~/.codex/`, `~/.claude/`, `~/.gemini/`, `~/.local/share/devin/cli/sessions.db`, Cursor's storage, `~/.agentlog/`, plus user-specified import file paths. Windsurf paths are excluded while Cascade transcripts remain encrypted.
514
+ 3. **No process inspection beyond `pgrep`.** We check whether Cursor is running. We don't introspect what it's doing.
515
+ 4. **Redaction at ingest, not query.** Credentials that match a redaction pattern never reach archive storage in the first place.
516
+ 5. **Reveal is local-only.** Un-redacted content is never reconstructible from team/remote archives.
517
+ 6. **Web chats default to local-only.** Even on team-configured installs. Explicit override required to share.
518
+ 7. **Agent recall excludes web chats by default.** Human-initiated history viewing includes them.
519
+ 8. **Redaction limits are stated honestly.** Pattern-based redaction catches credentials, not sensitive personal content.
520
+ 9. **Uninstall removes everything.** Tested.
521
+
522
+ ## What's deferred
523
+
524
+ - **Web UI beyond cchv** — v0.2 if cchv proves insufficient
525
+ - **Slack notify skill** (`agentlog-slack`) — separate repo, consumes from archive; v0.2
526
+ - **Other web chat sources** (Gemini, Perplexity, Grok) — v1, same pattern, different parsers
527
+ - **Heuristic repo inference for web chats** — v1
528
+ - **Auto-detection of new export files in `~/Downloads/`** — v1
529
+ - **Summary generation / "revival packets"** — v1, only if raw retrieval proves insufficient
530
+ - **Cross-machine session linking** — v1, depends on canonical repo keying landing solid
531
+ - **SSO** — v1, enterprise tier
532
+ - **Per-user permissions for team viewing** — v1
533
+ - **Cursor extension for richer capture** — v1 if SQLite proves too lossy
534
+ - **Windows Service install** — v1, currently Scheduled Task only
535
+ - **Direct integration with Claude.ai/ChatGPT desktop app local stores** — probably never; respect the boundary
536
+
537
+ ## What this is not
538
+
539
+ - Not a memory-curation product. Memorix and Hindsight occupy that space; agentlog can be their substrate.
540
+ - Not an enterprise observability tool. SigNoz/Datadog/Honeycomb are better and accept the same OTel feeds.
541
+ - Not a session-replay tool.
542
+ - Not a Slack bot. Slack integration is downstream of the archive, intentionally.
543
+ - Not a real-time capture tool for web chats. Those are import-only by design.
544
+
545
+ ## The differentiator
546
+
547
+ A redaction-first, repo-keyed, S3-compatible archive substrate that the agent itself can query via MCP, with backfill of existing CLI history and import paths for web chats. Every piece exists separately. The value is in composing them under one CLI with a setup flow that doesn't take an afternoon and a privacy story that doesn't require trusting a vendor.
548
+
549
+ If it's good, it disappears: the user forgets it's running, their agents quietly get smarter at their repos over time, and accumulated debugging knowledge stops evaporating between sessions.
550
+
551
+ That's v0.2.
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env node
2
+
3
+ const { runMcpServer } = require("../src/mcp");
4
+
5
+ runMcpServer().catch((error) => {
6
+ console.error(error && error.stack ? error.stack : String(error));
7
+ process.exitCode = 1;
8
+ });
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env node
2
+
3
+ const { main } = require("../src/cli");
4
+
5
+ main(process.argv.slice(2)).catch((error) => {
6
+ console.error(formatCliError(error));
7
+ process.exitCode = 1;
8
+ });
9
+
10
+ function formatCliError(error) {
11
+ if (process.env.AGENTLOG_DEBUG && error?.stack) return error.stack;
12
+ const message = error?.message || String(error);
13
+ return /^Error:/i.test(message) ? message : `Error: ${message}`;
14
+ }