@taewooopark/agent-blackbox 0.47.1 → 0.47.3

package/README.md ADDED
@@ -0,0 +1,89 @@
+ # Agent-Blackbox
+
+ **Open your coding agent's black box.**
+
+ A **local-first flight recorder and context-efficiency profiler** for **[Claude Code](https://www.claude.com/product/claude-code)** and **[OpenCode](https://opencode.ai)**. It turns every agent run into a **live, replayable map** of what the agent actually *did* — what it read, changed, ran, decided, delegated, blocked on, and verified — reconstructed from observed events, not from the agent's own summary. Then it **scores how economically the run used its context window** and tells you, concretely, how to make the next one cheaper and faster.
+
+ Everything runs on your machine. **No API key. Nothing leaves your computer.**
+
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/TaewoooPark/Agent-Blackbox/main/docs/screenshots/session-map.jpeg" alt="Agent-Blackbox session map — a real multi-agent run rendered as a monochrome Mark Lombardi network of hollow rings and sweeping arcs, with a context-efficiency score on the right." width="100%">
+ </p>
+
+ > *"The transcript is what the agent said. The black box is what it did — and what it cost."*
+
+ ## Quickstart
+
+ One command (needs Node 20+):
+
+ ```bash
+ # Record Claude Code — nothing to install; the daemon tails the session
+ # transcripts it already writes (~/.claude/projects/)
+ npx @taewooopark/agent-blackbox up --host claude-code
+
+ # …or record OpenCode (installs the recorder into OpenCode's global plugin dir)
+ npx @taewooopark/agent-blackbox up
+
+ # …or record both hosts at once, into one dashboard
+ npx @taewooopark/agent-blackbox up --host all
+ ```
+
+ It starts a local daemon and **opens the dashboard** at `http://127.0.0.1:5173/`. Now use your agent exactly the way you already do — the map fills in live:
+
+ ```bash
+ claude # Claude Code, in any folder — zero setup, just run it
+ opencode # …or OpenCode (terminal or the desktop app)
+ ```
+
+ Stop recording any time with `npx @taewooopark/agent-blackbox uninstall`.
+
+ ## Why
+
+ You can't just **ask** an agent what a task cost. A 2026 study of eight frontier models on agentic coding found they predict their own token usage with a correlation of just **0.39 — and systematically underestimate** the bill; the same task varies **up to 30×** in tokens, and agentic runs burn **~1000× more tokens** than ordinary coding. So don't ask — **measure.**
+
+ <sub>Bai et al., *How Do AI Agents Spend Your Money?*, [arXiv:2604.22750](https://arxiv.org/abs/2604.22750) (2026).</sub>
+
+ ## What you get
+
+ | Feature | What it does |
+ |---|---|
+ | **Live session map** | the run forms in real time — reads, edits, commands, subagents, decisions — over a WebSocket, no refresh |
+ | **Replay** | scrub the timeline to any moment; the graph and files rewind to that exact point |
+ | **Subagent genealogy** | real delegations fork into their own lane, attributed to the subagent that did the work |
+ | **Context-efficiency score** | cache reuse, redundant re-reads, read-vs-edit amplification, oversized tool dumps, retry waste — with reclaimable tokens |
+ | **Concrete fixes** | rule-based by default, or tailored by a **free/local model with no API key** — and optionally written back to `AGENTS.md` so the next run avoids the waste |
+ | **Handoff export** | one-click Markdown summary (objective, files, decisions, blockers, next step) to resume elsewhere |
+ | **Local-first** | traces stay on your machine; prompts, secrets, and file contents are redacted by default |
+
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/TaewoooPark/Agent-Blackbox/main/docs/screenshots/features.jpeg" alt="Four-panel overview: the live session map, the same console in dark mode, the context-efficiency co-pilot with metric meters, and the handoff export panel." width="100%">
+ </p>
+
+ ## Hosts
+
+ - **Claude Code** — **no install at all.** The daemon tails the JSONL transcripts the CLI already writes, so any folder, any session is recorded the moment you run `claude`. Add `--optimize` to also install the opt-in in-run actuator hooks.
+ - **OpenCode** — records via a recorder dropped into OpenCode's **global** plugin directory (`~/.config/opencode/plugins/`), so any session is captured, the desktop app included. Scope to one project with `up --project <dir>`.
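
As a rough illustration of the Claude Code bullet above: tailing a JSONL transcript mostly comes down to splitting newly appended text into complete lines and holding back a partial last line until the rest arrives. The sketch below shows that general technique only and is not taken from the package; `makeJsonlTail` and `push` are hypothetical names, and wiring it to a file watcher is left out.

```javascript
// Sketch under stated assumptions: incremental JSONL parsing for a "tail" loop.
// makeJsonlTail and push are hypothetical names, not part of agent-blackbox.
function makeJsonlTail() {
  let remainder = ""; // text after the last newline seen so far

  // Feed newly appended text; returns the records completed by this chunk.
  return function push(chunk) {
    const lines = (remainder + chunk).split("\n");
    remainder = lines.pop(); // "" if the chunk ended on a newline, else a partial line
    const records = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue; // ignore blank lines
      try {
        records.push(JSON.parse(trimmed));
      } catch {
        // Skip a malformed line rather than abort the whole tail.
      }
    }
    return records;
  };
}
```

A caller would create one `push` function per transcript file and pass it each newly read chunk, so a record that is split across two writes is still parsed exactly once.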
+
+ ## Common flags
+
+ ```bash
+ up --host claude-code|opencode|all # which agent(s) to record (default: opencode)
+ up --suggest free # tailored fixes from a rotating pool of free models
+ up --port 48000 --ui-port 4000 # custom daemon / dashboard ports
+ up --no-open # don't auto-open the browser
+ uninstall # remove the global recorder (+ any Claude Code hooks)
+ ```
+
+ ## Documentation
+
+ Full docs, screenshots, architecture, and the optimization actuator:
+ **https://github.com/TaewoooPark/Agent-Blackbox**
+
+ [English](https://github.com/TaewoooPark/Agent-Blackbox/blob/main/README.md) ·
+ [한국어](https://github.com/TaewoooPark/Agent-Blackbox/blob/main/README.ko.md) ·
+ [中文](https://github.com/TaewoooPark/Agent-Blackbox/blob/main/README.zh.md) ·
+ [日本語](https://github.com/TaewoooPark/Agent-Blackbox/blob/main/README.ja.md)
+
+ ## License
+
+ MIT © [Taewoo Park](https://taewoopark.com)
package/dist/cli.js CHANGED
@@ -2785,20 +2785,49 @@ async function broadcastSnapshot(clients, eventsFile) {
   if (clients.size === 0) {
     return;
   }
-  await Promise.allSettled([...clients].map((client) => sendSnapshot(client, eventsFile)));
+  let frame;
+  try {
+    frame = JSON.stringify({ type: "snapshot", data: await buildTraceSnapshot(eventsFile) });
+  } catch (error) {
+    const errFrame = JSON.stringify({
+      type: "error",
+      error: { message: error instanceof Error ? error.message : String(error) }
+    });
+    for (const client of clients)
+      if (client.readyState === WebSocket.OPEN)
+        client.send(errFrame);
+    return;
+  }
+  for (const client of clients)
+    if (client.readyState === WebSocket.OPEN)
+      client.send(frame);
 }
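
Review note: the change above replaces a per-client `sendSnapshot` call with one build and one `JSON.stringify`, then sends the same frame to every open client. A minimal standalone sketch of that serialize-once fan-out follows. It is condensed from the hunk, not the shipped code: `broadcastOnce`, `makeFakeClient`, and the `build` callback (standing in for `buildTraceSnapshot`) are hypothetical, and `OPEN = 1` mirrors the standard `WebSocket.OPEN` value so the sketch runs without a socket library.

```javascript
// Sketch only: names below are hypothetical stand-ins, not package exports.
const OPEN = 1; // same numeric value as the standard WebSocket.OPEN constant

// Build and serialize the payload once, then reuse the string for every open client.
// Returns the number of clients the frame was sent to.
async function broadcastOnce(clients, build) {
  if (clients.size === 0) return 0;
  let frame;
  try {
    frame = JSON.stringify({ type: "snapshot", data: await build() });
  } catch (error) {
    // On a failed build, every client gets the same error frame instead.
    frame = JSON.stringify({
      type: "error",
      error: { message: error instanceof Error ? error.message : String(error) },
    });
  }
  let sent = 0;
  for (const client of clients) {
    if (client.readyState === OPEN) {
      client.send(frame);
      sent += 1;
    }
  }
  return sent;
}

// Fake client that records frames instead of writing to a socket.
function makeFakeClient(readyState) {
  return {
    readyState,
    received: [],
    send(frame) {
      this.received.push(frame);
    },
  };
}
```

With N connected dashboards, the snapshot is built and serialized once rather than N times, which is the likely motivation for the change.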
 function makeBroadcastScheduler(clients, eventsFile, delayMs = 150) {
   let timer = null;
-  return () => {
+  let building = false;
+  let pending = false;
+  const schedule = () => {
+    if (building) {
+      pending = true;
+      return;
+    }
     if (timer)
       return;
     timer = setTimeout(() => {
       timer = null;
-      void broadcastSnapshot(clients, eventsFile);
+      building = true;
+      void broadcastSnapshot(clients, eventsFile).finally(() => {
+        building = false;
+        if (pending) {
+          pending = false;
+          schedule();
+        }
+      });
     }, delayMs);
     if (typeof timer.unref === "function")
       timer.unref();
   };
+  return schedule;
 }
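
Review note: the scheduler now keeps at most one snapshot build in flight. Calls that arrive during a build set a `pending` flag and are collapsed into a single follow-up run once the build finishes. The sketch below isolates that pattern so it can be run without the daemon. `makeCoalescingScheduler` and its `task` parameter are hypothetical stand-ins for `makeBroadcastScheduler` and `broadcastSnapshot(clients, eventsFile)`. Unlike the hunk, the sketch swallows task errors so a rejected task cannot surface as an unhandled rejection; that is a choice made here, not something the diff does.

```javascript
// Sketch only: `task` is a hypothetical stand-in for broadcastSnapshot(clients, eventsFile).
function makeCoalescingScheduler(task, delayMs = 150) {
  let timer = null; // debounce timer for the next run, if one is queued
  let running = false; // true while a task is in flight
  let pending = false; // true if schedule() was called during a run

  const schedule = () => {
    if (running) {
      pending = true; // remember the request; run once more after the current task
      return;
    }
    if (timer) return; // a run is already queued, so this call is absorbed by it
    timer = setTimeout(() => {
      timer = null;
      running = true;
      Promise.resolve()
        .then(task)
        .catch(() => {}) // sketch-only choice: ignore task errors
        .finally(() => {
          running = false;
          if (pending) {
            pending = false;
            schedule();
          }
        });
    }, delayMs);
    // As in the hunk: do not let the debounce timer keep the process alive.
    if (typeof timer.unref === "function") timer.unref();
  };
  return schedule;
}
```

A burst of `schedule()` calls therefore costs one run, plus one more if anything arrived while that run was in progress, no matter how many calls were made.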
 async function sendSnapshot(client, eventsFile) {
   if (client.readyState !== WebSocket.OPEN) {