@misha_misha/agentwatch 0.0.1 → 0.0.3

package/README.md CHANGED
@@ -1,14 +1,67 @@
1
+ <div align="center">
2
+
1
3
  # agentwatch
2
4
 
3
- **Local-only observability for AI coding agents.** One terminal timeline across Claude Code, Cursor, and OpenClaw: what each agent is reading, writing, running, and what each is actually allowed to do.
5
+ **See what every AI coding agent on your machine is doing, in one terminal.**
6
+
7
+ Local-only observability for Claude Code, Codex, Gemini CLI, Cursor, and
8
+ OpenClaw — unified timeline, real token + cost accounting, compaction and
9
+ anomaly detection, an MCP server agents can query their own history from,
10
+ and an OpenTelemetry exporter with `gen_ai.*` semantic conventions. All
11
+ local. No cloud. No telemetry. No sign-in.
12
+
13
+ [![npm](https://img.shields.io/npm/v/@misha_misha/agentwatch.svg)](https://www.npmjs.com/package/@misha_misha/agentwatch)
14
+ [![CI](https://github.com/mishanefedov/agentwatch/actions/workflows/ci.yml/badge.svg)](https://github.com/mishanefedov/agentwatch/actions/workflows/ci.yml)
15
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)
16
+ [![Node >=20](https://img.shields.io/badge/node-%E2%89%A520-brightgreen.svg)](./package.json)
17
+
18
+ </div>
19
+
20
+ <div align="center">
21
+ <img src="./docs/demo.gif" alt="agentwatch demo" width="820" />
22
+ </div>
23
+
24
+ ---
25
+
26
+ ## Table of contents
27
+
28
+ - [Why this exists](#why-this-exists)
29
+ - [Install](#install)
30
+ - [First 60 seconds](#first-60-seconds)
31
+ - [Agent coverage](#agent-coverage)
32
+ - [Features](#features)
33
+ - [Keyboard reference](#keyboard-reference)
34
+ - [Configuration](#configuration)
35
+ - [What agentwatch reads](#what-agentwatch-reads)
36
+ - [MCP server mode](#mcp-server-mode)
37
+ - [OpenTelemetry exporter](#opentelemetry-exporter)
38
+ - [How it compares](#how-it-compares)
39
+ - [Limitations](#limitations)
40
+ - [Non-goals](#non-goals)
41
+ - [Architecture](#architecture)
42
+ - [Development](#development)
43
+ - [Security](#security)
44
+ - [License](#license)
45
+
46
+ ---
4
47
 
5
- No cloud. No Docker. No telemetry. `npm i -g agentwatch` and go.
48
+ ## Why this exists
6
49
 
7
- ## Why
50
+ You run three AI coding agents on one laptop. Claude Code in a terminal,
51
+ Codex alongside it, Cursor as your IDE, maybe Gemini CLI for a quick
52
+ review, maybe an OpenClaw sub-agent churning on a long task. Every one of
53
+ them has its own log file, its own permission model, its own idea of what
54
+ a "session" is. None of them tells you what the others are doing.
8
55
 
9
- You're running Claude Code + Cursor + OpenClaw on the same machine. Each has its own config (`CLAUDE.md`, `.cursorrules`, OpenClaw workspaces), its own activity log in a different place, and its own permission model. Nothing shows you one unified view: *right now, which agent just touched which file, ran which command, and why.*
56
+ When something goes wrong (a file rewritten unexpectedly, a spend spike,
57
+ an `rm` you don't remember running), you're piecing it together from five
58
+ JSONLs and guessing.
10
59
 
11
- [claude-devtools](https://github.com/matt1398/claude-devtools) solves this beautifully — for Claude only. agentwatch does the same thing for the whole multi-agent setup.
60
+ [`claude-devtools`](https://github.com/matt1398/claude-devtools) does this
61
+ well for Claude Code. **agentwatch does it for the whole multi-agent
62
+ stack, in the terminal, with zero infrastructure and zero network.**
63
+
64
+ ---
12
65
 
13
66
  ## Install
14
67
 
@@ -17,67 +70,499 @@ npm i -g @misha_misha/agentwatch
17
70
  agentwatch
18
71
  ```
19
72
 
20
- That's it. No config, no accounts, no daemon. (Published under a
21
- scope because `agentwatch` was blocked by npm's anti-typosquatting
22
- check; the binary is still `agentwatch`.)
73
+ Requires:
74
+
75
+ - **Node ≥ 20** (tested on 20 + 22 in CI)
76
+ - **macOS or Linux** (Windows intentionally out of scope for v0.x)
77
+
78
+ Published under the `@misha_misha` npm scope — the unscoped `agentwatch`
79
+ name was already taken by a CyberArk tool. The installed binary on your
80
+ `$PATH` is simply `agentwatch`.
81
+
82
+ ---
83
+
84
+ ## First 60 seconds
85
+
86
+ ```bash
87
+ agentwatch doctor # detects installed agents + readiness
88
+ agentwatch # launches the TUI
89
+ agentwatch mcp # runs the MCP stdio server (for agents, not humans)
90
+ agentwatch --help
91
+ ```
92
+
93
+ `doctor` output looks like:
94
+
95
+ ```
96
+ workspace: /Users/you/IdeaProjects
97
+
98
+ agents:
99
+ ● Claude Code installed (events captured)
100
+ ● Codex installed (events captured)
101
+ ● Gemini CLI installed (events captured)
102
+ ● Cursor installed (config-level only)
103
+ ● OpenClaw installed (events captured)
104
+ ○ Aider not detected
105
+ ○ Cline (VS Code) not detected
106
+ ```
107
+
108
+ Launch the TUI and every event your agents emit streams in. The last 4 MB
109
+ of each active session is backfilled on startup so you have immediate
110
+ context. Press **`?`** to see every hotkey.
111
+
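The bounded backfill described above can be sketched roughly as follows. This is a minimal illustration, not agentwatch's actual code: `backfillStart` and `parseTail` are hypothetical names, and dropping the first line of a mid-file chunk is an assumption about how a truncated JSONL line would be handled.

```typescript
// Hypothetical sketch of a bounded JSONL backfill. Names are illustrative only.
const BACKFILL_BYTES = 4 * 1024 * 1024; // the "last 4 MB" window from the README

// Byte offset at which to start reading a session file of the given size.
function backfillStart(fileSize: number, windowBytes: number = BACKFILL_BYTES): number {
  return Math.max(0, fileSize - windowBytes);
}

// Parse a tail chunk of JSONL. When the read started mid-file, the first line
// is assumed to be truncated and is discarded. Malformed lines are skipped.
function parseTail(chunk: string, startedMidFile: boolean): unknown[] {
  const lines = chunk.split("\n");
  if (startedMidFile) lines.shift();
  const events: unknown[] = [];
  for (const line of lines) {
    if (line.trim() === "") continue;
    try {
      events.push(JSON.parse(line));
    } catch {
      // Skip lines that are not valid JSON.
    }
  }
  return events;
}
```

One consequence of this design: if the window happens to begin exactly on a line boundary, one valid event is dropped, which is a small price for never parsing a half line.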
112
+ ---
113
+
114
+ ## Agent coverage
115
+
116
+ What actually works per agent, as of v0.0.3. Features not listed here
117
+ work across every agent (timeline, export, syntax highlighting, notifications,
118
+ triggers, search, stale detection, clipboard yank).
119
+
120
+ | Feature | Claude Code | Codex | Gemini CLI | Cursor | OpenClaw |
121
+ | ------------------------------ | :---------: | :---: | :--------: | :----: | :------: |
122
+ | Live events on timeline | ✅ | ✅ | ✅ | 🟡 | ✅ |
123
+ | Token usage + cost | ✅ | ✅ | ✅ | ❌ | ✅ |
124
+ | Tool call + result pairing | ✅ | ✅ | ✅ | ❌ | 🟡 |
125
+ | Per-turn token attribution | ✅ | ✅ | ✅ | ❌ | ✅ |
126
+ | Budget alarms (session + day) | ✅ | ✅ | ✅ | ❌ | ✅ |
127
+ | Anomaly detection (cost/loops) | ✅ | ✅ | ✅ | 🟡 | ✅ |
128
+ | Compaction visualizer | ✅ | ✅ | ❌ | — | ❌ |
129
+ | Permissions view | ✅ | ✅ | ✅ | ✅ | ✅ |
130
+ | Cross-session search | ✅ | ✅ | ✅ | ❌ | ❌ |
131
+ | Subagent drilldown | ✅ | — | 🟡 | — | 🟡 |
132
+ | Agent memory file overhead | `CLAUDE.md` | `AGENTS.md` | `GEMINI.md` | `.cursorrules` | `OPENCLAW.md` |
133
+ | OTel span coverage | ✅ | ✅ | ✅ | 🟡 | ✅ |
134
+ | MCP server exposes history | ✅ | ✅ | ✅ (raw) | ❌ | ❌ |
135
+
136
+ - **Cursor** exposes config state (MCP servers, `.cursorrules`, approval
137
+ mode, sandbox) but its actual AI activity lives in a SQLite database we
138
+ haven't parsed yet. A thin read-only adapter is a follow-up.
139
+ - **Gemini CLI** doesn't persist context-compaction markers to disk, so
140
+ compaction detection is Claude + Codex only.
141
+ - **OpenClaw** doesn't persist tool_result content or compaction markers
142
+ to its JSONL — structural limit of what's on disk, not an adapter gap.
143
+
144
+ ---
145
+
146
+ ## Features
147
+
148
+ ### Live multi-agent timeline
149
+
150
+ Main screen. Every event your agents emit, ordered by event timestamp (not
151
+ arrival order, so backfill from different sessions merges correctly).
152
+ Columns: time · agent · type · `[project]` summary · duration · error.
153
+
154
+ ```
155
+ 09:54:01 openclaw response [content_agent] <think> Checked the KB…
156
+ 09:52:53 claude-code response [auraqu] Commit bddc363. q now exits instantly…
157
+ 09:52:48 codex shell_exec [dataset_research] ls -la · 12ms
158
+ 09:52:43 claude-code tool_call [auraqu] Edit: src/ui/App.tsx · 7ms
159
+ 09:51:51 gemini file_write [landing] write_file: public/llms.txt
160
+ 09:51:51 claude-code tool_call [auraqu] Agent: Competitive landscape ▸ 52 child events
161
+ ```
162
+
163
+ Rows flagged as anomalous show a red `◎` prefix in the type column.
164
+
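Ordering by event timestamp rather than arrival order can be sketched as a sorted insert into a capped buffer. The `TimelineEvent` shape and `insertEvent` helper are hypothetical (the real schema is in `src/schema.ts`), and evicting the oldest events at the 500-event cap is an assumption.

```typescript
// Hypothetical minimal event shape, for illustration only.
interface TimelineEvent {
  id: string;
  ts: number; // event timestamp in epoch milliseconds
  agent: string;
}

const BUFFER_LIMIT = 500; // size of the live buffer mentioned in this README

// Insert an event so the buffer stays sorted newest-first by event timestamp,
// no matter in which order events arrive (live tail vs. backfill).
function insertEvent(buffer: TimelineEvent[], ev: TimelineEvent): TimelineEvent[] {
  // Binary search for the first slot whose timestamp is <= the new one.
  let lo = 0;
  let hi = buffer.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (buffer[mid]!.ts > ev.ts) lo = mid + 1;
    else hi = mid;
  }
  const next = [...buffer.slice(0, lo), ev, ...buffer.slice(lo)];
  return next.slice(0, BUFFER_LIMIT); // assumed policy: drop the oldest past the cap
}
```

Because the function returns a new array instead of mutating its input, it fits a reducer-style UI where the timeline is derived from the buffer.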
165
+ ### Event detail pane
166
+
167
+ Press **`Enter`** on any row. Opens a full-screen pane with:
168
+
169
+ - Metadata (time, agent, type, tool, path, cmd)
170
+ - Tokens / cost / duration (`in=6 cache_create=25508 cache_read=16827 out=353` · `$0.08 (claude-opus-4-6)` · `151ms`)
171
+ - Tool result — stdout for Bash, file content for Read/Write, search matches for Grep — with syntax highlighting inferred from the tool + file extension
172
+ - Full prompt or response text
173
+ - Extended thinking block when present
174
+ - Tool input JSON
23
175
 
24
- Requires Node 20. Works on macOS and Linux.
176
+ Scrollable with `↑↓` or `j/k`. `esc` closes.
25
177
 
26
- ## What it shows
178
+ ### Subagent drilldown
27
179
 
28
- - **Claude Code** — tails `~/.claude/projects/**/*.jsonl` and emits every prompt, response, tool call, file read/write, and shell exec with attribution and risk scoring.
29
- - **OpenClaw**: watches `~/.openclaw/agents/*/sessions/*.jsonl` across every sub-agent (content, research, docs, main) with sub-agent attribution in the event stream, plus `config-audit.jsonl` with elevated risk scoring for config writes.
30
- - **Cursor**: config-level visibility: MCP server list, permissions (`cli-config.json`), recently-viewed files (`ide_state.json`), discovered `.cursorrules` anywhere in your workspace.
31
- - **Workspace filesystem** — chokidar-backed watcher over `$WORKSPACE_ROOT` (default `~/IdeaProjects`) with sensible ignores (`node_modules`, `.git`, `dist`).
32
- - **Permissions (Claude)** — press `p` in the TUI to open a full-screen view of `~/.claude/settings.json`. Renders the allow / deny lists, `defaultMode`, and flags dangerous patterns: `Bash(*)`, missing `~/.ssh`/`.aws`/`.gnupg` denies, auto/bypass modes.
180
+ Parent `Agent` tool_use events show `▸ 52 child events`. Press **`x`** to
181
+ scope the timeline to only that subagent's inner tool calls. `X` unscopes.
182
+ Applies to Claude Code (Task tool) and partially to OpenClaw (per-agent
183
+ delegation) and Gemini (subagent sessions).
33
184
 
34
- ## Hotkeys
185
+ ### Project + session navigation
35
186
 
36
187
  ```
37
- q quit
38
- a toggle agent side panel
39
- f cycle agent filter
40
- p toggle full-screen permission view
41
- space pause / resume event stream
42
- c clear events
188
+ P → projects grid (one workspace per row, across all agents)
189
+ enter → sessions list (grouped Today / Yesterday / 7d / Older)
190
+ enter → scoped timeline
191
+ ```
192
+
193
+ Projects grid aggregates across agents: per-agent session counts, total
194
+ cost, last activity. `esc` walks back one level.
195
+
196
+ ### Cross-session search (`?`)
197
+
198
+ Press **`?`** — fuzzy-substring search across every session file on disk
199
+ (`~/.claude`, `~/.codex`, `~/.gemini`). Uses ripgrep if installed, falls
200
+ back to a native scan. Enter on a hit scopes the timeline to that session.
201
+
202
+ Different from in-buffer search:
203
+ - **`/`** — search the 500-event live buffer
204
+ - **`?`** — search every session file ever written
205
+
206
+ ### Per-session cost with cache accounting
207
+
208
+ Naive token counters are 3–10× wrong on Claude because `cache_read` is
209
+ billed at 10% of input and `cache_creation` at 125%. agentwatch ships a
210
+ per-model rate table (Claude opus/sonnet/haiku, GPT-5 / GPT-5-mini,
211
+ Gemini 2.5 Pro/Flash) and computes true USD cost per turn. Cost shows:
212
+
213
+ - Per-agent total in the side panel
214
+ - Per-event in the detail pane
215
+ - Per-session in the sessions list
216
+ - Aggregate in the session's token attribution view (`[t]`)
217
+
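The cache-aware arithmetic above can be sketched like this. The 10% and 125% multipliers come from the text; the `Usage` and `Rate` shapes, the function names, and the per-million-token prices in the usage note are placeholders, not agentwatch's real rate table.

```typescript
// Hypothetical shapes for illustration; not agentwatch's actual types.
interface Usage {
  input: number;       // fresh (uncached) input tokens
  cacheRead: number;   // tokens served from the prompt cache
  cacheCreate: number; // tokens written to the prompt cache
  output: number;
}

interface Rate {
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

const CACHE_READ_MULTIPLIER = 0.1;    // cache_read billed at 10% of input
const CACHE_CREATE_MULTIPLIER = 1.25; // cache_creation billed at 125% of input

// Cost of one turn with cache reads and writes priced separately.
function turnCostUsd(u: Usage, r: Rate): number {
  const inputEquivalent =
    u.input +
    u.cacheRead * CACHE_READ_MULTIPLIER +
    u.cacheCreate * CACHE_CREATE_MULTIPLIER;
  return (inputEquivalent * r.inputPerMTok + u.output * r.outputPerMTok) / 1_000_000;
}

// The naive version: every input-side token billed at the full input price.
function naiveCostUsd(u: Usage, r: Rate): number {
  const allInput = u.input + u.cacheRead + u.cacheCreate;
  return (allInput * r.inputPerMTok + u.output * r.outputPerMTok) / 1_000_000;
}
```

With a made-up rate of 10 USD input and 50 USD output per million tokens, a turn with one million tokens in each category costs 73.50 USD cache-aware versus 80.00 USD naive. The gap grows as `cache_read` dominates, which is the usual case in long sessions.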
218
+ ### Per-turn token attribution (`[t]`)
219
+
220
+ Inside a scoped session, press **`t`**. Stacked bar per turn showing:
221
+
222
+ - `user` — the preceding prompt (tokenized with `gpt-tokenizer`)
223
+ - `memory file` — CLAUDE.md / AGENTS.md / GEMINI.md / .cursorrules / etc., read from the session's cwd
224
+ - `tool I/O` — tool_input JSON + tool_result text
225
+ - `thinking` — extended thinking block
226
+ - `input (fresh)` / `cache read` / `cache create` / `output` — exact from the model's own usage record
227
+
228
+ ### Compaction visualizer (`[C]`)
229
+
230
+ Inside a scoped session, press **`C`**. Horizontal bar of context fill %
231
+ across turns, with `⋈` markers where the agent auto-compacted. Selected
232
+ compaction shows before / after token counts and the dropped-token delta.
233
+ Works on Claude Code (via `isCompactSummary`) and Codex (via
234
+ `event_msg/turn_truncated`).
235
+
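A rough sketch of the numbers behind this view, assuming each turn carries a context-token count and a flag set when a compaction marker was seen. The `Turn` and `CompactionPoint` shapes are hypothetical; only the 200000-token default window comes from this README.

```typescript
const CONTEXT_WINDOW = 200_000; // default of AGENTWATCH_CONTEXT_WINDOW

// Hypothetical per-turn record, for illustration only.
interface Turn {
  contextTokens: number; // tokens in context when the turn ran
  compacted: boolean;    // true if a compaction marker was seen on this turn
}

interface CompactionPoint {
  turnIndex: number;
  before: number;
  after: number;
  dropped: number;
}

// Context fill as a percentage of the window, capped at 100.
function fillPct(tokens: number, windowTokens: number = CONTEXT_WINDOW): number {
  return Math.min(100, (tokens / windowTokens) * 100);
}

// For each compacted turn, report the token counts on either side of it.
function compactionPoints(turns: Turn[]): CompactionPoint[] {
  const points: CompactionPoint[] = [];
  for (let i = 1; i < turns.length; i++) {
    const prev = turns[i - 1]!;
    const cur = turns[i]!;
    if (!cur.compacted) continue;
    points.push({
      turnIndex: i,
      before: prev.contextTokens,
      after: cur.contextTokens,
      dropped: prev.contextTokens - cur.contextTokens,
    });
  }
  return points;
}
```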
236
+ ### Budget alarms
237
+
238
+ `~/.agentwatch/budgets.json`:
239
+
240
+ ```json
241
+ { "perSessionUsd": 5, "perDayUsd": 20 }
43
242
  ```
44
243
 
45
- ## CLI
244
+ Red banner in the Header when either cap is crossed; OS notification
245
+ fires once per crossing. No kill switch — we don't control agents; we
246
+ just shout.
247
+
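The "notify once per crossing" behaviour could look something like the sketch below. `makeBudgetChecker` is a hypothetical helper, and keying the once-only state by session ID and by day is an assumption about what counts as a single crossing.

```typescript
// Mirrors the two keys shown in budgets.json above.
interface Budgets {
  perSessionUsd?: number;
  perDayUsd?: number;
}

type Breach = "session" | "day";

// Returns a checker that reports each breach only the first time it is seen.
function makeBudgetChecker(budgets: Budgets) {
  const alreadyNotified = new Set<string>();

  const firstTime = (key: string): boolean => {
    if (alreadyNotified.has(key)) return false;
    alreadyNotified.add(key);
    return true;
  };

  return function check(
    sessionId: string,
    sessionUsd: number,
    dayKey: string, // e.g. "2025-01-31"; the format is an assumption
    dayUsd: number,
  ): Breach[] {
    const fresh: Breach[] = [];
    if (budgets.perSessionUsd !== undefined && sessionUsd > budgets.perSessionUsd) {
      if (firstTime(`session:${sessionId}`)) fresh.push("session");
    }
    if (budgets.perDayUsd !== undefined && dayUsd > budgets.perDayUsd) {
      if (firstTime(`day:${dayKey}`)) fresh.push("day");
    }
    return fresh;
  };
}
```

The checker only reports; consistent with the "no kill switch" stance, nothing here touches the agents themselves.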
248
+ ### Anomaly detection
249
+
250
+ Three detectors, all fully local, all running on the 500-event buffer:
251
+
252
+ - **MAD z-score outliers** on cost, duration, and input tokens per agent
253
+ (`|z| > 3.5` by default — tune in `~/.agentwatch/anomaly.json`)
254
+ - **Stuck-loop detector** with periods 1–4 — catches `A-A-A-…` and
255
+ `A-B-A-B-…` "apologize and retry" loops
256
+ - Per-session rollup + OS notification on first flag + timeline `◎` marker
257
+ + `[D]` to dismiss the banner
258
+
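As a sketch of the first two detectors: the block below uses the common modified z-score formulation (`0.6745 * (x - median) / MAD`) and a tail-repetition check for loop periods 1 to 4. Whether agentwatch uses exactly this constant, or these function names and defaults for `minRepeats`, is an assumption; only the `3.5` threshold and the period range come from the text.

```typescript
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1]! + sorted[mid]!) / 2
    : sorted[mid]!;
}

// Indices of values whose modified z-score exceeds the threshold.
function madOutliers(values: number[], threshold: number = 3.5): number[] {
  if (values.length === 0) return [];
  const med = median(values);
  const mad = median(values.map((v) => Math.abs(v - med)));
  if (mad === 0) return []; // no spread, so nothing can be scored
  const outliers: number[] = [];
  values.forEach((v, i) => {
    const z = (0.6745 * (v - med)) / mad;
    if (Math.abs(z) > threshold) outliers.push(i);
  });
  return outliers;
}

// Smallest period (1..maxPeriod) with which the most recent actions repeat
// at least minRepeats times in a row, or null if there is no such loop.
function loopPeriod(
  actions: string[],
  maxPeriod: number = 4,
  minRepeats: number = 3,
): number | null {
  for (let period = 1; period <= maxPeriod; period++) {
    const span = period * minRepeats;
    if (actions.length < span) continue;
    const tail = actions.slice(-span);
    let repeats = true;
    for (let i = period; i < tail.length; i++) {
      if (tail[i] !== tail[i - period]) {
        repeats = false;
        break;
      }
    }
    if (repeats) return period;
  }
  return null;
}
```

Median and MAD are used instead of mean and standard deviation because a single huge turn would otherwise inflate the spread and hide itself.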
259
+ ### User-defined notification triggers
260
+
261
+ `~/.agentwatch/triggers.json` — live-reloaded via chokidar:
46
262
 
263
+ ```json
264
+ [
265
+ { "match": "curl .* \\| (bash|sh)", "title": "pipe-to-shell", "body": "{{agent}}: {{cmd}}" },
266
+ { "type": "file_write", "pathMatch": "^/etc/", "title": "/etc write" },
267
+ { "thresholdUsd": 0.5, "title": "expensive turn", "body": "cost {{cost}}" }
268
+ ]
47
269
  ```
48
- agentwatch launch the TUI
49
- agentwatch doctor detect installed agents and print config paths
50
- agentwatch --help usage
270
+
271
+ Placeholders: `{{agent}} {{type}} {{cmd}} {{path}} {{tool}} {{summary}} {{cost}}`.
272
+
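How a rule from `triggers.json` might be evaluated, as a hedged sketch. The field names (`match`, `type`, `pathMatch`, `thresholdUsd`, `title`, `body`) and the placeholder names are taken from the example above; everything else is assumed, in particular that `match` is tested against the command (falling back to the summary), that all conditions in a rule must hold, and that `{{cost}}` renders as a dollar amount.

```typescript
interface Trigger {
  match?: string;        // regex source; assumed to be tested against cmd
  type?: string;
  pathMatch?: string;    // regex source tested against path
  thresholdUsd?: number;
  title: string;
  body?: string;
}

// Hypothetical event shape, for illustration only.
interface EventLike {
  agent: string;
  type: string;
  cmd?: string;
  path?: string;
  tool?: string;
  summary?: string;
  costUsd?: number;
}

// A trigger fires when every condition it specifies holds.
function triggerFires(t: Trigger, ev: EventLike): boolean {
  if (t.type !== undefined && t.type !== ev.type) return false;
  if (t.match !== undefined && !new RegExp(t.match).test(ev.cmd ?? ev.summary ?? "")) return false;
  if (t.pathMatch !== undefined && !new RegExp(t.pathMatch).test(ev.path ?? "")) return false;
  if (t.thresholdUsd !== undefined && (ev.costUsd ?? 0) < t.thresholdUsd) return false;
  return true;
}

// Substitute {{placeholder}} tokens; unknown or missing values become "".
function renderTemplate(template: string, ev: EventLike): string {
  const values: Record<string, string> = {
    agent: ev.agent,
    type: ev.type,
    cmd: ev.cmd ?? "",
    path: ev.path ?? "",
    tool: ev.tool ?? "",
    summary: ev.summary ?? "",
    cost: ev.costUsd !== undefined ? `$${ev.costUsd.toFixed(2)}` : "",
  };
  return template.replace(/\{\{(\w+)\}\}/g, (_m, key: string) => values[key] ?? "");
}
```

Note that in this sketch a rule with no conditions at all would fire on every event, so a real loader would probably want to reject such rules.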
273
+ ### Desktop notifications
274
+
275
+ Built-in alerts fire on sensitive events — `.env` access, `~/.ssh` /
276
+ `~/.aws` / `~/.gnupg` paths, `rm -rf`, `sudo`, `curl | sh`, tool errors,
277
+ budget breach, anomaly. Rate-limited (60s per rule key). Silent during
278
+ backfill.
279
+
280
+ Platform dispatch: `osascript` on macOS, `notify-send` on Linux,
281
+ PowerShell `MessageBox` on Windows. Zero third-party dependencies.
282
+
283
+ ### Per-agent permission surface (`[p]`)
284
+
285
+ Scrollable view showing:
286
+
287
+ - **Claude Code** — allow / deny / defaultMode; flagged risks (`Bash(*)`, missing `.ssh` denies, `auto` / `bypass` modes in red)
288
+ - **Codex** — config.toml projects + trust_level; latest session's sandbox_policy, approval_policy, writable_roots, network_access, model
289
+ - **Gemini CLI** — auth type, selected model, tool allow/block lists, trusted folders
290
+ - **Cursor** — approval mode, sandbox state, MCP servers, discovered `.cursorrules`
291
+ - **OpenClaw** — default workspace + per-sub-agent (name, emoji, model, workspace)
292
+
293
+ ### Session export (`[e]`)
294
+
295
+ From a session list or scoped timeline, press **`e`**. Writes
296
+ `./agentwatch-export/<agent>-<session>-<ts>.md` (human-readable transcript
297
+ with tool calls as fenced blocks) and `.json` (raw events). Path copied to
298
+ clipboard.
299
+
300
+ ### Syntax highlighting in the detail pane
301
+
302
+ `cli-highlight` (tiny ANSI highlighter) applies to:
303
+ - Tool input JSON
304
+ - Tool result when the tool is Bash or the file extension is known (`.ts`, `.py`, `.rs`, `.go`, etc.)
305
+ - Fenced blocks in user/assistant text
306
+
307
+ ### Stale-session detection
308
+
309
+ Sessions and projects idle for > 5 minutes render dimmed with a `⊘ stale`
310
+ badge. Un-greys on the next event.
311
+
312
+ ### Clipboard yank (`[y]`)
313
+
314
+ Copies the most useful payload (tool result > full text > cmd / path /
315
+ summary). Uses `pbcopy`, `wl-copy` / `xclip` / `xsel`, or `clip`.
316
+ Confirmation flashes at the footer.
317
+
318
+ ---
319
+
320
+ ## Keyboard reference
321
+
322
+ Press **`?`** anytime to open this inside the TUI.
323
+
324
+ ### Navigate
325
+
326
+ | Key | Action |
327
+ | ------------------ | ---------------------------------------------- |
328
+ | `↑ ↓` / `j k` | move selection in the timeline |
329
+ | `Enter` | open event detail pane |
330
+ | `esc` | close current view / clear selection |
331
+ | `P` | projects grid |
332
+ | `Enter` on project | sessions list for that project |
333
+ | `Enter` on session | scoped timeline for that session |
334
+ | `q` / `Ctrl-C` | quit |
335
+
336
+ ### Filter & scope
337
+
338
+ | Key | Action |
339
+ | ---- | ------------------------------------------------------------ |
340
+ | `/` | in-buffer search (last 500 events) |
341
+ | `?` | cross-session search (every session file on disk) |
342
+ | `f` | cycle agent filter |
343
+ | `a` | toggle agent side panel |
344
+ | `x` | drill selected Agent event into its subagent run |
345
+ | `X` | unscope subagent |
346
+ | `A` | clear project filter |
347
+ | `Z` | clear all filters |
348
+
349
+ ### Actions
350
+
351
+ | Key | Action |
352
+ | --------- | ------------------------------------------- |
353
+ | `y` | yank selected event content to clipboard |
354
+ | `e` | export current session to `.md` + `.json` |
355
+ | `space` | pause / resume live event stream |
356
+ | `c` | clear event buffer |
357
+ | `D` | dismiss the current anomaly banner |
358
+
359
+ ### Info overlays (only in a scoped session)
360
+
361
+ | Key | Action |
362
+ | ------ | ----------------------------------------- |
363
+ | `t` | per-turn token attribution |
364
+ | `C` | context compaction visualizer |
365
+ | `p` | permissions view (works anywhere) |
366
+
367
+ ---
368
+
369
+ ## Configuration
370
+
371
+ Three config files, all optional. Loaded on startup; triggers reload live.
372
+
373
+ | File | Purpose |
374
+ | -------------------------------- | -------------------------------------------------------- |
375
+ | `~/.agentwatch/triggers.json` | User-defined notification rules (live-reloaded) |
376
+ | `~/.agentwatch/budgets.json` | `perSessionUsd` / `perDayUsd` spend caps |
377
+ | `~/.agentwatch/anomaly.json` | `zScore`, `loopWindow`, `loopMinRepeats`, `minSamples` |
378
+
379
+ Environment variables:
380
+
381
+ | Variable | Default | Purpose |
382
+ | ------------------------------ | --------------------------- | ----------------------------------------------------- |
383
+ | `WORKSPACE_ROOT` | `~/IdeaProjects` (fallback) | Where the generic filesystem watcher looks for edits |
384
+ | `AGENTWATCH_CONTEXT_WINDOW` | `200000` | Tokens per window — used by compaction % calculation |
385
+ | `AGENTWATCH_OTLP_ENDPOINT` | unset | Enables the OTel exporter when set |
386
+ | `NO_COLOR` | unset | Standard honoring: disables ANSI colors if set |
387
+
388
+ Workspace fallback chain (used when `WORKSPACE_ROOT` isn't set):
389
+ `~/IdeaProjects` → `~/src` → `~/code` → `~/Projects` → `~/dev` → `$HOME`.
390
+
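The fallback chain can be read as "first candidate that exists wins". A minimal sketch under that assumption follows; `resolveWorkspaceRoot` is a hypothetical name, and the existence check is injected so the logic stays independent of the filesystem.

```typescript
// Candidate directories under $HOME, in the order listed above.
const FALLBACK_DIRS = ["IdeaProjects", "src", "code", "Projects", "dev"];

// Resolve the workspace root: an explicit WORKSPACE_ROOT wins, then the first
// fallback directory that exists, then $HOME itself.
function resolveWorkspaceRoot(
  env: Record<string, string | undefined>,
  home: string,
  exists: (path: string) => boolean,
): string {
  const explicit = env["WORKSPACE_ROOT"];
  if (explicit) return explicit;
  for (const dir of FALLBACK_DIRS) {
    const candidate = `${home}/${dir}`;
    if (exists(candidate)) return candidate;
  }
  return home;
}
```

In a Node program the predicate would typically be backed by `fs.existsSync`, with `process.env` and `os.homedir()` supplying the other two arguments.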
391
+ ---
392
+
393
+ ## What agentwatch reads
394
+
395
+ Read-only. agentwatch writes to exactly three places: your terminal, the
396
+ clipboard (on explicit `y`), and disk (on explicit `e` to export).
397
+
398
+ | Path | What |
399
+ | ------------------------------------------------------------ | ---------------------------------------- |
400
+ | `~/.claude/projects/**/*.jsonl` | Claude Code session transcripts |
401
+ | `~/.claude/projects/**/subagents/*.jsonl` | Claude Code Task-spawned subagents |
402
+ | `~/.claude/settings.json` | Claude permissions |
403
+ | `~/.codex/sessions/**/rollout-*.jsonl` | Codex session transcripts |
404
+ | `~/.codex/config.toml` | Codex permissions + trust levels |
405
+ | `~/.gemini/tmp/**/chats/*.json` | Gemini CLI transcripts + tool calls |
406
+ | `~/.gemini/settings.json` + `trustedFolders.json` | Gemini permissions |
407
+ | `~/.openclaw/agents/*/sessions/*.jsonl` | OpenClaw sub-agent sessions |
408
+ | `~/.openclaw/logs/config-audit.jsonl` + `openclaw.json` | OpenClaw config audit + agent roster |
409
+ | `~/.cursor/{mcp.json, cli-config.json, ide_state.json}` | Cursor config state |
410
+ | Any `.cursorrules` / `.cursor/rules/*.mdc` under WORKSPACE | Cursor project rules |
411
+ | `{CLAUDE,AGENTS,GEMINI,OPENCLAW}.md` + `.windsurfrules` etc. | Per-agent memory files for token attribution |
412
+ | `~/.agentwatch/*.json` | User config (triggers / budgets / anomaly) |
413
+ | `$WORKSPACE_ROOT` tree | Filesystem change events |
414
+
415
+ `SECURITY.md` carries the authoritative list and details of what is *not* read.
416
+
417
+ ---
418
+
419
+ ## MCP server mode
420
+
421
+ Run agentwatch as an MCP server so other agents can query their own
422
+ history. Install:
423
+
424
+ ```bash
425
+ claude mcp add agentwatch -- npx -y @misha_misha/agentwatch mcp
426
+ # or edit ~/.claude.json / ~/.cursor/mcp.json manually
51
427
  ```
52
428
 
53
- `$WORKSPACE_ROOT` overrides the detected workspace root.
429
+ Tools exposed:
430
+
431
+ | Tool | Args | Returns |
432
+ | ------------------------- | --------------------------------- | ----------------------------------------------------- |
433
+ | `list_recent_sessions` | `limit?: 1-100` | `[{agent, sessionId, project, lastActivity, sizeBytes}]` |
434
+ | `get_session_events` | `sessionId`, `maxBytes?: 1K-10M` | Raw JSONL (tail-capped) for that session |
435
+ | `search_sessions` | `query`, `limit?: 1-50` | `[{session, agent, line}]` substring hits |
436
+ | `get_tool_usage_stats` | `sessionId?`, `limit?: 1-500` | Per-tool counts, totalDurationMs, errorCount |
437
+ | `get_session_cost` | `sessionId` | `{totalCostUsd, turns, tokens, byModel}` |
438
+
439
+ See [`docs/features/mcp-server.md`](./docs/features/mcp-server.md).
440
+
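For orientation, this is roughly what a client sends when it invokes one of these tools. MCP uses JSON-RPC 2.0 with a `tools/call` method; the tool names and the `limit` range are from the table above, while the `toolCall` and `clamp` helpers are illustrative and not part of agentwatch.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Build a tools/call request for the named tool.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Keep an optional numeric argument inside the documented range.
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// list_recent_sessions documents limit as 1-100, so 250 is clamped to 100.
const request = toolCall("list_recent_sessions", { limit: clamp(250, 1, 100) });
const wireMessage = JSON.stringify(request);
```

In practice an MCP-capable agent builds these messages itself once the server is registered; the sketch is only meant to show what travels over stdio.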
441
+ ---
442
+
443
+ ## OpenTelemetry exporter
444
+
445
+ Set `AGENTWATCH_OTLP_ENDPOINT=http://localhost:4318/v1/traces` to emit
446
+ OTLP/HTTP spans for every agent event. Uses the OpenTelemetry GenAI
447
+ semantic conventions so any consumer (Jaeger, Tempo, Honeycomb, Grafana)
448
+ can interpret the data without custom dashboards.
449
+
450
+ Attributes emitted:
451
+
452
+ - `gen_ai.system` (anthropic | openai | google | cursor | …)
453
+ - `gen_ai.operation.name` (chat | tool_use | context_compaction | …)
454
+ - `gen_ai.request.model` / `gen_ai.response.model`
455
+ - `gen_ai.usage.input_tokens` / `gen_ai.usage.output_tokens`
456
+ - `gen_ai.tool.name` / `gen_ai.tool.call.id`
457
+ - `error.type` on tool errors
458
+ - `agentwatch.session.id` / `agentwatch.cost_usd`
459
+ - `agentwatch.cache_read_tokens` / `agentwatch.cache_create_tokens` / `agentwatch.cache_hit_ratio`
460
+ - `agentwatch.context.fill_pct`
461
+ - `agentwatch.risk_score`
462
+
463
+ OTel deps are loaded dynamically only when the env var is set — zero
464
+ runtime cost when disabled.
465
+
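A sketch of how one event could be flattened into the attribute keys listed above. The keys are from this README; the `UsageEvent` shape and the `cache_hit_ratio` definition (cache reads divided by all input-side tokens) are assumptions, and no OpenTelemetry SDK calls are shown.

```typescript
// Hypothetical input shape, for illustration only.
interface UsageEvent {
  system: string;    // e.g. "anthropic"
  operation: string; // e.g. "chat"
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheCreateTokens: number;
  sessionId: string;
  costUsd: number;
}

// Flatten an event into a span-attribute map.
function toSpanAttributes(ev: UsageEvent): Record<string, string | number> {
  const inputSide = ev.inputTokens + ev.cacheReadTokens + ev.cacheCreateTokens;
  return {
    "gen_ai.system": ev.system,
    "gen_ai.operation.name": ev.operation,
    "gen_ai.request.model": ev.model,
    "gen_ai.usage.input_tokens": ev.inputTokens,
    "gen_ai.usage.output_tokens": ev.outputTokens,
    "agentwatch.session.id": ev.sessionId,
    "agentwatch.cost_usd": ev.costUsd,
    "agentwatch.cache_read_tokens": ev.cacheReadTokens,
    "agentwatch.cache_create_tokens": ev.cacheCreateTokens,
    // Assumed definition: share of input-side tokens served from cache.
    "agentwatch.cache_hit_ratio": inputSide === 0 ? 0 : ev.cacheReadTokens / inputSide,
  };
}
```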
466
+ ---
54
467
 
55
468
  ## How it compares
56
469
 
57
- | | agentwatch | claude-devtools | Unfucked | Langfuse / Phoenix |
58
- |---|---|---|---|---|
59
- | Runs locally only | | | | self-host possible |
60
- | Multi-agent | Claude + Cursor + OpenClaw | Claude only | agent-agnostic (file-level) | production apps, not CLI agents |
61
- | Per-agent attribution | | | (file-level only) | N/A |
62
- | Permission surface view | | | | |
63
- | Install | `npm i -g` | Homebrew / Electron app | Homebrew / Rust binary | Docker + Postgres |
470
+ | | **agentwatch** | claude-devtools | Claudex | ccflare | Langfuse / Phoenix |
471
+ | ---------------------------------- | ---------------------------------------------- | --------------------- | --------------------- | ------------------ | ---------------------------- |
472
+ | Runs locally only | | | | | self-host possible |
473
+ | Multi-agent | Claude, Codex, Gemini, Cursor (config), OpenClaw | Claude only | Claude only | Claude only | production LLM apps |
474
+ | Real token + cost with cache | ✅ | | 🟡 | (proxy-level) | |
475
+ | Per-turn token attribution | | | | | |
476
+ | Compaction visualizer | ✅ | | | | |
477
+ | **Anomaly detection** | **✅ MAD + stuck-loop** | rule-based only | ❌ | ❌ | ❌ |
478
+ | **Budget alarms w/ OS notification** | **✅** | ❌ | ❌ | ❌ | ❌ |
479
+ | **User triggers (regex/threshold)** | **✅ live-reload** | ❌ | ❌ | ❌ | ❌ |
480
+ | **OTel exporter (gen_ai.*)** | **✅** | ❌ | ❌ | ❌ | ✅ (its own format) |
481
+ | MCP server (self-query) | ✅ | ❌ | ✅ | ❌ | ❌ |
482
+ | Permission surface view | ✅ 5 agents | ❌ | ❌ | ❌ | ❌ |
483
+ | Subagent drilldown | ✅ | ✅ | ❌ | ❌ | ✅ (LangChain-specific) |
484
+ | Install | `npm i -g` | Homebrew / Electron | `npm i -g` | Bun repo | Docker + Postgres |
485
+ | UI | TUI (Ink) | Electron + standalone | Web UI | Web + TUI | Web |
486
+ | Telemetry | none | none | none | none | opt-in |
487
+
488
+ Three moats are genuinely unique: **anomaly detection** (statistical, not
489
+ rule-based), **budget alarms**, and **OTel with gen_ai.* conventions**.
490
+
491
+ ---
492
+
493
+ ## Limitations
494
+
495
+ - **agentwatch is a viewer, not a daemon.** It captures events only while
496
+ the TUI is running. A background-capture daemon is planned.
497
+ - **Backfill is bounded.** On launch we read the last ~4 MB of each
498
+ active session file (roughly hundreds of events). For long gaps on
499
+ very active sessions, earliest events may fall out of the backfill
500
+ window. Keep agentwatch open in a tmux pane for zero gaps.
501
+ - **Cursor activity is config-level only.** Cursor's AI activity lives in
502
+ a SQLite database we don't parse yet. We capture config changes +
503
+ `.cursorrules` + MCP servers + `.cursor/rules/*.mdc`. Full activity
504
+ parsing is a follow-up.
505
+ - **Gemini and OpenClaw have data-structure gaps.** Gemini CLI doesn't
506
+ persist compaction markers to disk. OpenClaw doesn't persist
507
+ tool_result content or compaction markers. Not fixable from our side.
508
+ - **Windsurf, Aider, Cline** are detected but not instrumented yet.
509
+ - **macOS and Linux only.** Windows needs more chokidar + notifier
510
+ testing before we promise it.
511
+ - **The tokenizer is cl100k_base (gpt-tokenizer)**, which is ~5% off for
512
+ Claude. Exact tokens for input / cache / output come from the model's
513
+ own usage record; the ~5% approximation only affects the user /
514
+ thinking / tool I/O / memory-file categories in the attribution view.
515
+
516
+ ---
64
517
 
65
518
  ## Non-goals
66
519
 
67
- - Not cloud. Not a SaaS. Not ever.
68
- - Not an agent itself.
69
- - Not production LLM-app tracing — [Langfuse](https://langfuse.com) owns that space.
70
- - Not enterprise compliance: Anthropic's Compliance API covers that.
71
- - Not orchestration. Use [Mission Control](https://github.com/MeisnerDan/mission-control) or [DevSwarm](https://devswarm.ai) for running agents in parallel.
520
+ Hard scope boundaries so agentwatch stays small and maintainable.
521
+
522
+ - **Not cloud. Not SaaS. Not ever.**
523
+ - **Not an agent itself.** It watches agents; it doesn't take actions.
524
+ - **Not production LLM-app tracing.** [Langfuse](https://langfuse.com) owns that.
525
+ - **Not enterprise compliance.** Anthropic's Compliance API covers that.
526
+ - **Not orchestration.** Use Mission Control / Stoneforge for running agents in parallel.
527
+ - **Not memory.** Use [claude-mem](https://github.com/thedotmack/claude-mem).
528
+ - **Not governance / policy enforcement.** Use DashClaw / Castra.
529
+
530
+ ---
72
531
 
73
- ## Roadmap
532
+ ## Architecture
74
533
 
75
- - Codex + Gemini CLI adapters
76
- - Deeper Cursor activity (SQLite AI-tracking DB)
77
- - MCP proxy mode (`agentwatch wrap <agent>`)
78
- - Permission viewer for OpenClaw + Cursor + Codex + Gemini
534
+ TypeScript monorepo. Three-layer mental model:
79
535
 
80
- Feature requests → [GitHub issues](https://github.com/mishanefedov/agentwatch/issues).
536
+ ```
537
+ ┌─────────────────────────────────────────────────────────────┐
538
+ │ TUI layer (ink / React) │
539
+ │ Timeline · EventDetail · Permissions · Projects │
540
+ │ Sessions · Tokens · Compaction · CrossSearch · Header │
541
+ │ │
542
+ │ MCP server (stdio — programmatic, not a UI) │
543
+ │ list_recent_sessions · get_session_events │
544
+ │ search_sessions · get_tool_usage_stats · get_session_cost │
545
+ └─────────────────────────▲───────────────────────────────────┘
546
+ │ EventSink.emit / enrich
547
+ ┌─────────────────────────┴───────────────────────────────────┐
548
+ │ Adapter layer (one per agent) │
549
+ │ claude-code · codex · gemini · cursor · openclaw │
550
+ │ fs-watcher (generic) │
551
+ └─────────────────────────▲───────────────────────────────────┘
552
+ │ files read-only
553
+ ┌─────────────────────────┴───────────────────────────────────┐
554
+ │ OS (log files, config files, clipboard, notifier) │
555
+ └─────────────────────────────────────────────────────────────┘
556
+ ```
557
+
558
+ - Adapters read files, translate raw log lines into canonical `AgentEvent`s, emit through an `EventSink`.
559
+ - `EventSink.enrich(id, patch)` lets an adapter update a previously-emitted event (e.g. when a tool_result arrives late and needs to attach duration + output to the original tool_use).
560
+ - The TUI is a pure reducer over the event buffer. Filtering, search, scope are derived views — no mutation.
561
+ - The MCP server is a peer of the TUI: it reads the same session files on demand, via its own scan (no shared in-memory state with the TUI). This is a known duplication; see Linear for the refactor ticket.
562
+
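The emit / enrich contract described above might look like this in miniature. `InMemorySink` and the `AgentEvent` fields shown here are hypothetical; the canonical shape is whatever `src/schema.ts` defines.

```typescript
// Hypothetical subset of an event, for illustration only.
interface AgentEvent {
  id: string;
  agent: string;
  type: string;
  durationMs?: number;
  output?: string;
}

class InMemorySink {
  private events = new Map<string, AgentEvent>();

  // Record a new event.
  emit(ev: AgentEvent): void {
    this.events.set(ev.id, ev);
  }

  // Patch a previously emitted event, e.g. when a tool_result arrives late.
  // Returns false if the event is unknown.
  enrich(id: string, patch: Partial<Omit<AgentEvent, "id">>): boolean {
    const existing = this.events.get(id);
    if (existing === undefined) return false;
    this.events.set(id, { ...existing, ...patch }); // replace, do not mutate
    return true;
  }

  // Current events in emission order.
  snapshot(): AgentEvent[] {
    return Array.from(this.events.values());
  }
}
```

Replacing the stored object rather than mutating it keeps earlier snapshots stable, which suits a UI that treats the buffer as reducer state.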
563
+ See `src/schema.ts` for the canonical event shape.
564
+
565
+ ---
81
566
 
82
567
  ## Development
83
568
 
@@ -85,11 +570,44 @@ Feature requests → [GitHub issues](https://github.com/mishanefedov/agentwatch/
85
570
  git clone https://github.com/mishanefedov/agentwatch.git
86
571
  cd agentwatch
87
572
  npm install
88
- npm run dev
573
+ npm run dev # launch the TUI directly from source (tsx)
574
+ npm test # vitest — 97 tests
575
+ npm run typecheck # strict TypeScript
576
+ npm run build # tsup → dist/
89
577
  ```
90
578
 
91
- Run tests with `npm test`. Typecheck with `npm run typecheck`.
579
+ See [CONTRIBUTING.md](./CONTRIBUTING.md) for the contribution workflow.
580
+
581
+ ### Docs
582
+
583
+ - **[`docs/features/`](./docs/features/)** — feature specs (scope, inputs, outputs, failure modes). Being extended feature-by-feature.
584
+ - **[`docs/testing/`](./docs/testing/)** — manual test procedures + a pre-release walkthrough.
585
+ - **[`docs/use-cases/`](./docs/use-cases/)** — multi-agent triage, cost-overrun investigation, security audit, stuck-loop detection, subagent post-mortem, .env leak alert.
586
+
587
+ ---
588
+
589
+ ## Security
590
+
591
+ Local-first is a hard invariant.
592
+
593
+ - **Zero network calls** unless you explicitly set `AGENTWATCH_OTLP_ENDPOINT` (to a host *you* chose, OTel output only).
594
+ - **Zero telemetry.** Not opt-in, not opt-out — simply not there.
595
+ - **All files read-only** except the clipboard (on `y`) and `./agentwatch-export/` (on `e`).
596
+ - Every path agentwatch reads is documented in [SECURITY.md](./SECURITY.md).
597
+
598
+ Report vulnerabilities privately: `misha@auraqu.com` or via a
599
+ [Security Advisory](https://github.com/mishanefedov/agentwatch/security/advisories/new).
600
+
601
+ ---
92
602
 
93
603
  ## License
94
604
 
95
605
  MIT © Misha Nefedov. See [LICENSE](./LICENSE).
606
+
607
+ ---
608
+
609
+ <div align="center">
610
+
611
+ If agentwatch saves you a debugging hour, a ⭐ on the repo makes the effort worth it.
612
+
613
+ </div>