@misha_misha/agentwatch 0.0.3 → 0.0.4

package/README.md CHANGED
@@ -2,13 +2,16 @@
2
2
 
3
3
  # agentwatch
4
4
 
5
- **See what every AI coding agent on your machine is doing — in one terminal.**
5
+ **Local observability + control plane for every AI coding agent on your machine.**
6
6
 
7
- Local-only observability for Claude Code, Codex, Gemini CLI, Cursor, and
8
- OpenClaw: unified timeline, real token + cost accounting, compaction and
9
- anomaly detection, an MCP server agents can query their own history from,
10
- and an OpenTelemetry exporter with `gen_ai.*` semantic conventions. All
11
- local. No cloud. No telemetry. No sign-in.
7
+ A terminal live-tail *and* a browser dashboard: one process, one event
8
+ stream, served from `localhost`. Unified timeline across Claude Code,
9
+ Codex, Gemini CLI, Cursor, Hermes, and OpenClaw. Token + cost accounting,
10
+ compaction + anomaly detection, hybrid search, SVG call graphs,
11
+ monaco-style diff attribution, agent-aware replay ("what would the agent
12
+ say if I edited the prompt?"), policy editor, an MCP server agents can query
13
+ their own history from, and an OpenTelemetry exporter with `gen_ai.*`
14
+ semantic conventions. All local. No cloud. No telemetry. No sign-in.
12
15
 
13
16
  [![npm](https://img.shields.io/npm/v/@misha_misha/agentwatch.svg)](https://www.npmjs.com/package/@misha_misha/agentwatch)
14
17
  [![CI](https://github.com/mishanefedov/agentwatch/actions/workflows/ci.yml/badge.svg)](https://github.com/mishanefedov/agentwatch/actions/workflows/ci.yml)
@@ -18,9 +21,16 @@ local. No cloud. No telemetry. No sign-in.
18
21
  </div>
19
22
 
20
23
  <div align="center">
21
- <img src="./docs/demo.gif" alt="agentwatch demo" width="820" />
24
+ <img src="./docs/timeline.png" alt="agentwatch web UI — unified timeline across 5 agents, each in its own workspace" width="1100" />
25
+ <br />
26
+ <img src="./docs/event-detail.png" alt="agentwatch event detail view — full command, tool I/O, usage + cost" width="1100" />
22
27
  </div>
23
28
 
29
+ **The TUI is the live tail. The web UI is where you drill in** — projects,
30
+ sessions, token charts, compaction sparklines, SVG call graphs, diff
31
+ attribution, replay, anomaly triage, policy editing. Both run in one
32
+ process. Press `w` in the TUI to open the browser.
33
+
24
34
  ---
25
35
 
26
36
  ## Table of contents
@@ -63,6 +73,37 @@ stack, in the terminal, with zero infrastructure and zero network.**
63
73
 
64
74
  ---
65
75
 
76
+ ## Why this over `claude-devtools` if you run multiple agents?
77
+
78
+ Short, factual diff. `claude-devtools` is a great tool for Claude-only
79
+ workflows — if you only use Claude Code, it's probably the better pick.
80
+ agentwatch is the answer when you run more than one agent on the same
81
+ machine and want one timeline + one cost ledger + one alerting surface
82
+ across all of them.
83
+
84
+ | What | claude-devtools | **agentwatch** |
85
+ | -------------------------------------------- | ----------------------- | ------------------------------------- |
86
+ | Claude Code coverage | ✅ full | ✅ full |
87
+ | Codex coverage | ❌ | ✅ tokens + tools + cost + compaction |
88
+ | Gemini CLI coverage | ❌ | ✅ tokens + tools + cost |
89
+ | OpenClaw coverage | ❌ | ✅ tokens + cost |
90
+ | Hermes Agent coverage | ❌ | ✅ tokens + tools + cost (SQLite) |
91
+ | Cursor coverage | ❌ | 🟡 config level |
92
+ | Per-agent budget alarms | ❌ | ✅ session + daily caps |
93
+ | Statistical anomaly detection (loops / spikes) | rule-based only | ✅ MAD z-score + period-1-to-4 loops |
94
+ | OpenTelemetry exporter (`gen_ai.*`) | ❌ | ✅ Jaeger / Tempo / Grafana ready |
95
+ | MCP server — agents query their own history | ❌ | ✅ 5 tools over stdio |
96
+ | User-defined regex/threshold triggers | ❌ | ✅ live-reloaded |
97
+ | Install | Homebrew / Electron ~150 MB | `npm i -g` · 220 KB · TUI |
98
+ | Data boundary | local | local |
99
+
100
+ If "every agent on one pane of glass + programmatic access via MCP +
101
+ pipeline-friendly OTel" matches your setup, agentwatch is the tool.
102
+ If you're Claude-only and want the Electron polish, `claude-devtools`
103
+ is still excellent.
104
+
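The comparison table names "MAD z-score + period-1-to-4 loops" without showing what those checks look like. The following is a minimal sketch of both ideas, not agentwatch's actual implementation: the function names, the `0.6745` scaling constant (the usual choice for a modified z-score), and the `minRepeats` default are assumptions made for illustration.

```typescript
// Median of a numeric array (does not mutate the input).
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty array");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Modified z-score of x against a history of samples (e.g. per-turn cost),
// using the median absolute deviation (MAD) instead of the standard deviation.
function madZScore(history: number[], x: number): number {
  const med = median(history);
  const mad = median(history.map((v) => Math.abs(v - med)));
  if (mad === 0) {
    // Degenerate history: more than half the samples equal the median.
    if (x === med) return 0;
    return x > med ? Infinity : -Infinity;
  }
  return (0.6745 * (x - med)) / mad;
}

// Returns the smallest period p in 1..maxPeriod such that the last
// p * minRepeats entries repeat with period p, or null if none does.
// `calls` is assumed to be a list of tool-call signatures (hypothetical).
function detectLoop(
  calls: string[],
  maxPeriod = 4,
  minRepeats = 3,
): number | null {
  for (let period = 1; period <= maxPeriod; period++) {
    const needed = period * minRepeats;
    if (calls.length < needed) continue;
    const tail = calls.slice(-needed);
    let periodic = true;
    for (let i = period; i < needed; i++) {
      if (tail[i] !== tail[i - period]) {
        periodic = false;
        break;
      }
    }
    if (periodic) return period;
  }
  return null;
}
```

A caller would typically flag a sample when the absolute score exceeds some threshold (3.5 is a common convention, assumed here) and flag a loop when `detectLoop` returns a non-null period. MAD is preferred over mean and standard deviation in this setting because a single expensive turn does not inflate the baseline it is measured against.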
105
+ ---
106
+
66
107
  ## Install
67
108
 
68
109
  ```bash
@@ -85,11 +126,18 @@ name was already taken by a CyberArk tool. The installed binary on your
85
126
 
86
127
  ```bash
87
128
  agentwatch doctor # detects installed agents + readiness
88
- agentwatch # launches the TUI
129
+ agentwatch # TUI live-tail + web UI at http://127.0.0.1:3456
130
+ agentwatch serve # web UI only (remote boxes / server cron)
89
131
  agentwatch mcp # runs the MCP stdio server (for agents, not humans)
90
132
  agentwatch --help
91
133
  ```
92
134
 
135
+ Flags:
136
+
137
+ - `--no-web` — TUI only, don't start the web server
138
+ - `--port <n>` / `--host <addr>` — override web server bind
139
+ - `AGENTWATCH_PORT=… AGENTWATCH_HOST=…` — env equivalents
140
+
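To make the flag and environment-variable overrides concrete, here is a small sketch of how a bind address could be resolved. It is an illustration under stated assumptions, not the package's code: `resolveBind` is a hypothetical helper, and the precedence (flag, then environment variable, then default) is assumed, since the README does not say which wins. The defaults `127.0.0.1` and `3456` come from the URL shown in the usage block.

```typescript
interface Bind {
  host: string;
  port: number;
}

// Hypothetical helper: resolve the web server bind from CLI args and env.
// Assumed precedence: --flag > AGENTWATCH_* env var > built-in default.
function resolveBind(
  argv: string[],
  env: Record<string, string | undefined>,
): Bind {
  // Return the value following a flag such as "--port", if present.
  const flag = (name: string): string | undefined => {
    const i = argv.indexOf(name);
    return i >= 0 ? argv[i + 1] : undefined;
  };

  const host = flag("--host") ?? env.AGENTWATCH_HOST ?? "127.0.0.1";
  const rawPort = flag("--port") ?? env.AGENTWATCH_PORT ?? "3456";

  const port = Number.parseInt(rawPort, 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${rawPort}`);
  }
  return { host, port };
}
```

For example, `resolveBind(["--port", "4000"], { AGENTWATCH_PORT: "5000" })` yields port `4000` under the assumed precedence, while an empty argument list and empty environment fall back to `127.0.0.1:3456`.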
93
141
  `doctor` output looks like:
94
142
 
95
143
  ```
@@ -99,15 +147,40 @@ agents:
99
147
  ● Claude Code installed (events captured)
100
148
  ● Codex installed (events captured)
101
149
  ● Gemini CLI installed (events captured)
150
+ ● Hermes Agent installed (events captured)
102
151
  ● Cursor installed (config-level only)
103
152
  ● OpenClaw installed (events captured)
104
153
  ○ Aider not detected
105
154
  ○ Cline (VS Code) not detected
106
155
  ```
107
156
 
108
- Launch the TUI and every event your agents emit streams in. The last 4 MB
109
- of each active session is backfilled on startup so you have immediate
110
- context. Press **`?`** to see every hotkey.
157
+ Launch `agentwatch` and every event your agents emit streams in. The TUI
158
+ shows a live tail; the web UI at `http://127.0.0.1:3456` is where you
159
+ drill in: projects, sessions, token charts, SVG call graphs, diff
160
+ attribution, prompt replay, trends. Press `w` in the TUI to open it.
161
+
162
+ ### Web UI map
163
+
164
+ | Route | What it is |
165
+ | ------------------------------------ | ------------------------------------------------------- |
166
+ | `/` | Live timeline (SSE-streamed) with agent + type filters |
167
+ | `/projects` | Grid of detected projects + cost + session counts |
168
+ | `/projects/:name` | Sessions table for one project |
169
+ | `/sessions/:id` | Chronological event list · export .md / .json |
170
+ | `/sessions/:id/tokens` | Stacked-area token chart per turn |
171
+ | `/sessions/:id/compaction` | Context fill % over time + compaction markers |
172
+ | `/sessions/:id/graph` | Call graph (d3-hierarchy SVG) — click nodes to drill |
173
+ | `/sessions/:id/diffs` | Writes paired with the prompt that triggered them |
174
+ | `/sessions/:id/replay` | Edit prompt → re-run the agent in single-turn exec |
175
+ | `/search` | Unified search (live / cross / semantic) |
176
+ | `/agents` | Grid of every supported agent + install status |
177
+ | `/permissions` | Per-agent permission config |
178
+ | `/cron` | OpenClaw cron jobs + heartbeats |
179
+ | `/trends` | Cost, cache-hit ratio, events per agent (30d default) |
180
+ | `/settings/{budgets,anomaly,triggers}` | Form editors for `~/.agentwatch/*.json` |
181
+
182
+ `⌘K` / `Ctrl+K` opens the command palette.
183
+ `/` focuses the timeline filter.
111
184
 
112
185
  ---
113
186
 
@@ -117,21 +190,22 @@ What actually works per agent, as of v0.0.3. Features not listed here
117
190
  work across every agent (timeline, export, syntax highlighting, notifications,
118
191
  triggers, search, stale detection, clipboard yank).
119
192
 
120
- | Feature | Claude Code | Codex | Gemini CLI | Cursor | OpenClaw |
121
- | ------------------------------ | :---------: | :---: | :--------: | :----: | :------: |
122
- | Live events on timeline | ✅ | ✅ | ✅ | 🟡 | ✅ |
123
- | Token usage + cost | ✅ | ✅ | ✅ | ❌ | ✅ |
124
- | Tool call + result pairing | ✅ | ✅ | ✅ | ❌ | 🟡 |
125
- | Per-turn token attribution | ✅ | ✅ | ✅ | ❌ | ✅ |
126
- | Budget alarms (session + day) | ✅ | ✅ | ✅ | ❌ | ✅ |
127
- | Anomaly detection (cost/loops) | ✅ | ✅ | ✅ | 🟡 | ✅ |
128
- | Compaction visualizer | ✅ | ✅ | ❌ | — | ❌ |
129
- | Permissions view | ✅ | ✅ | ✅ | ✅ | ✅ |
130
- | Cross-session search | ✅ | ✅ | ✅ | ❌ | ❌ |
131
- | Subagent drilldown | ✅ | — | 🟡 | — | 🟡 |
132
- | Agent memory file overhead | `CLAUDE.md` | `AGENTS.md` | `GEMINI.md` | `.cursorrules` | `OPENCLAW.md` |
133
- | OTel span coverage | | | | 🟡 | |
134
- | MCP server exposes history | ✅ | ✅ | ✅ (raw) | | |
193
+ | Feature | Claude Code | Codex | Gemini CLI | Cursor | OpenClaw | Hermes |
194
+ | ------------------------------ | :---------: | :---: | :--------: | :----: | :------: | :----: |
195
+ | Live events on timeline | ✅ | ✅ | ✅ | 🟡 | ✅ | ✅ |
196
+ | Token usage + cost | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
197
+ | Tool call + result pairing | ✅ | ✅ | ✅ | ❌ | 🟡 | ✅ |
198
+ | Per-turn token attribution | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
199
+ | Budget alarms (session + day) | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
200
+ | Anomaly detection (cost/loops) | ✅ | ✅ | ✅ | 🟡 | ✅ | ✅ |
201
+ | Compaction visualizer | ✅ | ✅ | ❌ | — | ❌ | ❌ |
202
+ | Permissions view | ✅ | ✅ | ✅ | ✅ | ✅ | — |
203
+ | Cross-session search | ✅ | ✅ | ✅ | ❌ | ❌ | 🟡 |
204
+ | Subagent drilldown | ✅ | — | 🟡 | — | 🟡 | 🟡 |
205
+ | Replay (agent-aware exec) | | | | | | |
206
+ | Agent memory file overhead | `CLAUDE.md` | `AGENTS.md` | `GEMINI.md` | `.cursorrules` | `OPENCLAW.md` | `SOUL.md` |
207
+ | OTel span coverage | ✅ | ✅ | ✅ | 🟡 | | 🟡 |
208
+ | MCP server exposes history | ✅ | ✅ | ✅ (raw) | ❌ | ❌ | ❌ |
135
209
 
136
210
  - **Cursor** exposes config state (MCP servers, `.cursorrules`, approval
137
211
  mode, sandbox) but its actual AI activity lives in a SQLite database we
@@ -140,6 +214,12 @@ triggers, search, stale detection, clipboard yank).
140
214
  compaction detection is Claude + Codex only.
141
215
  - **OpenClaw** doesn't persist tool_result content or compaction markers
142
216
  to its JSONL — structural limit of what's on disk, not an adapter gap.
217
+ - **[Hermes Agent](https://github.com/NousResearch/hermes-agent)** (by
218
+ Nous Research — the OpenClaw successor with a closed learning loop)
219
+ persists sessions to `~/.hermes/state.db` (SQLite + FTS5). The adapter
220
+ polls the DB on chokidar change events, with a 2s safety-net interval, and emits the full
221
+ session/prompt/response/tool-call stream. Replay re-runs single turns
222
+ via `hermes chat -q <prompt> -Q --max-turns 1`.
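The single-turn replay command quoted above can be assembled as an argument vector rather than a shell string, so an edited prompt containing quotes or shell metacharacters is passed through untouched. This is a sketch only: `hermesReplayArgs` is a hypothetical helper, and the flags are copied verbatim from the README line rather than verified against the Hermes CLI.

```typescript
// Hypothetical helper: build argv for `hermes chat -q <prompt> -Q --max-turns 1`.
// The prompt stays a single array element, so no shell quoting is needed
// when the array is handed to a process-spawning API.
function hermesReplayArgs(prompt: string): string[] {
  if (prompt.trim() === "") {
    throw new Error("prompt must not be empty");
  }
  return ["chat", "-q", prompt, "-Q", "--max-turns", "1"];
}
```

The resulting array could be passed to Node's `child_process.spawn("hermes", args)`, which runs the binary without a shell by default.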
143
223
 
144
224
  ---
145
225
 
@@ -406,6 +486,7 @@ clipboard (on explicit `y`) / disk (on explicit `e` to export).
406
486
  | `~/.gemini/settings.json` + `trustedFolders.json` | Gemini permissions |
407
487
  | `~/.openclaw/agents/*/sessions/*.jsonl` | OpenClaw sub-agent sessions |
408
488
  | `~/.openclaw/logs/config-audit.jsonl` + `openclaw.json` | OpenClaw config audit + agent roster |
489
+ | `~/.hermes/state.db` (SQLite) | Hermes Agent sessions + messages |
409
490
  | `~/.cursor/{mcp.json, cli-config.json, ide_state.json}` | Cursor config state |
410
491
  | Any `.cursorrules` / `.cursor/rules/*.mdc` under WORKSPACE | Cursor project rules |
411
492
  | `{CLAUDE,AGENTS,GEMINI,OPENCLAW}.md` + `.windsurfrules` etc. | Per-agent memory files for token attribution |
@@ -546,7 +627,7 @@ TypeScript monorepo. Three-layer mental model:
546
627
  │ EventSink.emit / enrich
547
628
  ┌─────────────────────────┴───────────────────────────────────┐
548
629
  │ Adapter layer (one per agent) │
549
- │ claude-code · codex · gemini · cursor · openclaw
630
+ │ claude-code · codex · gemini · cursor · openclaw · hermes
550
631
  │ fs-watcher (generic) │
551
632
  └─────────────────────────▲───────────────────────────────────┘
552
633
  │ files read-only
package/bin/agentwatch.js CHANGED
File without changes