claudeye 1.0.9 → 1.1.0

Files changed (2)
  1. package/README.md +17 -2643
  2. package/package.json +10 -10
package/README.md CHANGED
@@ -1,2656 +1,30 @@
1
- ```
2
- ____ _ _
3
- / ___| | __ _ _ _ __| | ___ _ _ ___
4
- | | | |/ _` | | | |/ _` |/ _ \ | | |/ _ \
5
- | |___| | (_| | |_| | (_| | __/ |_| | __/
6
- \____|_|\__,_|\__,_|\__,_|\___|\__, |\___|
7
- |___/
8
- ```
1
+ # Claudeye is now FailproofAI — and we're open source!
9
2
 
10
- # Claudeye: Oversight and Observability for Claude Code Agents
3
+ Hey there, fellow agent wranglers!
11
4
 
12
- **Uncover** what your agents did.
13
- **Understand** where they struggle.
14
- **Utilize** insights to improve.
5
+ We've got some big news: **Claudeye has evolved.** New name. New home. Same mission of keeping your AI agents honest — but now with the doors wide open.
15
6
 
16
- [![npm version](https://img.shields.io/npm/v/claudeye)](https://www.npmjs.com/package/claudeye)
17
- [![npm downloads](https://img.shields.io/npm/dm/claudeye)](https://www.npmjs.com/package/claudeye)
18
- [![node](https://img.shields.io/node/v/claudeye)](https://nodejs.org)
19
- [![TypeScript](https://img.shields.io/badge/Built%20with-TypeScript-blue)](https://www.typescriptlang.org/)
20
- [![Discord](https://badgen.net/discord/members/zT92CAgvkj)](https://discord.com/invite/zT92CAgvkj)
21
- [![Docs](https://img.shields.io/badge/docs-claudeye.exosphere.host-blue)](https://claudeye.exosphere.host)
7
+ **Claudeye is now [FailproofAI](https://befailproof.ai)**, and it's fully open source.
22
8
 
23
- ## Table of Contents
9
+ That's right: all the oversight, observability, session replay, custom evals, and hook policies you know and love? Free. Open. Yours to fork, extend, and contribute to.
24
10
 
25
- - [What is Claudeye?](#what-is-claudeye)
26
- - [Quick Start](#quick-start)
27
- - [Hook Policies](#protect-hook-policies)
28
- - [Enterprise](#enterprise)
29
- - [Why Claudeye?](#why-claudeye)
30
- - [Features](#features)
31
- - [CLI Reference](#cli-reference)
32
- - [Custom Evals & Enrichments](#custom-evals--enrichments)
33
- - [API Reference](#api-reference)
34
- - [`createApp()`](#createapp)
35
- - [`app.condition(fn)`](#appconditionfn)
36
- - [`app.queueCondition(fn, options?)`](#appqueueconditionfn-options)
37
- - [`app.cacheInvalidation(fn)`](#appcacheinvalidationfn)
38
- - [`app.eval(name, fn, options?)`](#appevalname-fn-options)
39
- - [`app.enrich(name, fn, options?)`](#appenrichname-fn-options)
40
- - [`app.action(name, fn, options?)`](#appactionname-fn-options)
41
- - [`app.alert(name, fn, options?)`](#appalertname-fn-options)
42
- - [`app.dashboard.view(name, options?)`](#appdashboardviewname-options)
43
- - [`app.dashboard.filter(name, fn, options?)`](#appdashboardfiltername-fn-options)
44
- - [`app.dashboard.filter({ preBuilt })`](#appdashboardfilter-prebuilt-)
45
- - [`app.dashboard.aggregate(name, definition, options?)`](#appdashboardaggregatename-definition-options)
46
- - [`app.auth(options)`](#appauthoptions)
47
- - [`app.listen(port?, options?)`](#applistenport-options)
48
- - [Subagent Scope](#subagent-scope)
49
- - [Evaluation Order](#evaluation-order)
50
- - [UI Behavior](#ui-behavior)
51
- - [Types](#types)
52
- - [Examples](#examples)
53
- - [Basic Evals & Enrichments](#example-basic-evals--enrichments)
54
- - [Dashboard Filters](#example-dashboard-filters)
55
- - [Multi-View Dashboard](#example-multi-view-dashboard)
56
- - [Eval Score Filters (cachedOnly)](#example-eval-score-filters-cachedonly)
57
- - [Actions](#example-actions)
58
- - [Alerts](#example-alerts)
59
- - [Minimal Filters Only](#example-minimal-filters-only)
60
- - [Background Queue Processing](#background-queue-processing)
61
- - [Caching](#caching)
62
- - [Authentication](#authentication)
63
- - [Deployment with PM2](#deployment-with-pm2)
64
- - [Telemetry](#telemetry)
65
- - [How It Works](#how-it-works)
66
- - [Community](#community)
67
- - [License](#license)
11
+ ## Where to find us
68
12
 
69
- ---
13
+ | | |
14
+ |---|---|
15
+ | **GitHub** | [github.com/exospherehost/failproofai](https://github.com/exospherehost/failproofai) |
16
+ | **Website** | [befailproof.ai](https://befailproof.ai) |
70
17
 
71
- ## What is Claudeye?
18
+ ## Why the rename?
72
19
 
73
- Claudeye is an oversight and observability tool for Claude Code agents (CLI and Agents SDK).
20
+ Because "keeping an eye on Claude" was just the beginning. FailproofAI is about making **all** your AI agents failproof — not just watching, but actively protecting, evaluating, and improving every session.
74
21
 
75
- AI agents execute shell commands, read files, and make changes autonomously. Without guardrails, they can `rm -rf /`, read your `.env` secrets, `sudo` install packages, or force-push to main. Claudeye ships **27 built-in hook policies** (free, with more added every update) that block dangerous actions in real time, before they execute. Beyond protection, it gives you full session replay, custom evals, enrichments, and a local-first dashboard. One command and you're in:
22
+ Plus, let's be honest: *failproofai* just rolls off the tongue better at conferences.
76
23
 
77
- ## Quick Start
24
+ ## What's next?
78
25
 
79
- ```bash
80
- npm install -g claudeye && claudeye
81
- # or: bun install -g claudeye && claudeye
82
- ```
26
+ Head over to the new repo, give it a star, and join the fun. Whether you're building evals, writing hook policies, or just want to replay that one session where your agent tried to `rm -rf /` — we've got you covered.
83
27
 
84
- Opens your browser at `localhost:8020`. Reads from `~/.claude/projects` by default.
28
+ See you on the other side!
85
29
 
86
- Works with [Claude Code](https://docs.anthropic.com/en/docs/claude-code) sessions and [Claude Agents SDK](https://github.com/anthropics/anthropic-sdk-python) logs.
87
-
88
- > **Note:** Claudeye is distributed as a compiled native binary per platform. When you install `claudeye`, npm automatically pulls the correct platform-specific package (`@claudeye/linux-x64`, `@claudeye/darwin-arm64`, etc.) as an optional dependency. Node.js >= 20.9.0 must be available on PATH for the dashboard server.
89
-
90
- > **Important — Session retention:** Claude Code automatically deletes
91
- > sessions older than 30 days by default. To keep sessions visible in Claudeye,
92
- > set the retention period in `~/.claude/settings.json`:
93
- >
94
- > ```json
95
- > { "cleanupPeriodDays": 9999 }
96
- > ```
97
- >
98
- > Set `cleanupPeriodDays` to the number of days you want to retain session
99
- > history. **Do NOT set it to `0`** — this wipes all stored sessions
100
- > immediately. See [Claude Code settings](https://code.claude.com/docs/en/settings)
101
- > for details.
102
-
103
- ## Protect: Hook Policies
104
-
105
- AI agents can execute arbitrary shell commands. Without guardrails, Claude can accidentally `rm -rf /`, read your `.env` secrets, `sudo` install packages, force-push to main, or pipe a curl download to `sh`. Claudeye's hook policies sit in Claude Code's hook pipeline and block dangerous actions in real time — before they execute.
106
-
107
- ### Quick Start
108
-
109
- ```bash
110
- # Install with interactive policy selector (Enter to toggle, S to submit)
111
- claudeye --install-hooks
112
-
113
- # Or install all 27 policies at once
114
- claudeye --install-hooks all
115
-
116
- # Install specific policies
117
- claudeye --install-hooks block-sudo block-env-files sanitize-jwt
118
-
119
- # See what's enabled
120
- claudeye --list-hooks
121
-
122
- # Disable a specific policy
123
- claudeye --remove-hooks block-sudo
124
-
125
- # Remove all hooks from Claude Code
126
- claudeye --remove-hooks
127
- ```
128
-
129
- ### Available Policies
130
-
131
- | Policy | Default | What it blocks / warns |
132
- |--------|:-------:|----------------|
133
- | `sanitize-jwt` | On | Scrubs JWT tokens (`eyJ...`) from tool output before Claude sees them |
134
- | `sanitize-api-keys` | On | Scrubs API keys from tool output (OpenAI, Anthropic, GitHub, AWS, Stripe, Google) |
135
- | `sanitize-connection-strings` | On | Scrubs database connection strings with embedded credentials (`postgresql://user:pass@host`) from tool output |
136
- | `sanitize-private-key-content` | On | Scrubs PEM private key blocks (`-----BEGIN PRIVATE KEY-----`) from tool output |
137
- | `sanitize-bearer-tokens` | On | Scrubs `Authorization: Bearer <token>` headers from tool output |
138
- | `protect-env-vars` | On | `env`, `printenv`, `echo $SECRET`, `export API_KEY=...` |
139
- | `block-env-files` | On | Reading or writing `.env`, `.env.local`, `.env.production` |
140
- | `block-sudo` | On | Any `sudo` command |
141
- | `block-curl-pipe-sh` | On | `curl ... \| sh`, `wget ... \| bash` — remote code execution |
142
- | `block-push-master` | On | `git push origin main`, `git push origin master` |
143
- | `block-claudeye-commands` | On | Direct `claudeye` CLI invocations and `npm/bun/yarn/pnpm uninstall claudeye` |
144
- | `block-rm-rf` | Off | `rm -rf /`, `rm -rf ~`, `rm -rf /*` — catastrophic deletions |
145
- | `block-force-push` | Off | `git push --force`, `git push -f` |
146
- | `block-secrets-write` | Off | Writing to `.pem`, `.key`, `id_rsa`, `credentials` files |
147
- | `block-read-outside-cwd` | Off | Read/Glob/Grep/Bash read commands targeting files outside session CWD (`~/.claude/` allowed except `settings*.json` files) |
148
- | `block-work-on-main` | Off | `git commit`, `git merge`, `git rebase`, `git cherry-pick` when current branch is `main` or `master` |
149
- | `warn-destructive-sql` | Off | Warns before `DROP TABLE/DATABASE`, `TRUNCATE`, or `DELETE FROM` (without `WHERE`) via `psql`, `mysql`, `sqlite3`, etc. |
150
- | `warn-large-file-write` | Off | Warns when a Write tool payload exceeds 100KB (catches runaway generation loops) |
151
- | `warn-package-publish` | Off | Warns before `npm publish`, `cargo publish`, `gem push`, `poetry publish`, and similar |
152
- | `warn-repeated-tool-calls` | Off | Detects 3+ identical tool calls in session transcript; warns Claude to try a different approach |
153
- | `warn-git-amend` | Off | Warns before `git commit --amend`, which rewrites history and can diverge shared branches |
154
- | `warn-git-stash-drop` | Off | Warns before `git stash drop` or `git stash clear`, which permanently deletes stashed changes |
155
- | `warn-all-files-staged` | Off | Warns before `git add -A`, `git add --all`, or `git add .`, which may stage unintended files |
156
- | `warn-schema-alteration` | Off | Warns before `ALTER TABLE` with column or rename operations via `psql`, `mysql`, `sqlite3`, etc. |
157
- | `warn-global-package-install` | Off | Warns before `npm install -g`, `cargo install`, `yarn global add`, and similar global installs |
158
- | `warn-background-process` | Off | Warns before `nohup`, `screen -d`, `tmux new -d`, or commands backgrounded with `&` |
159
- | `verify-intent` | ⚗ Beta | LLM-powered: verifies all user intents were addressed before Claude stops. Retries up to 3 times. Requires `--configure-llm` setup. |
160
-
161
- Selection is saved to `~/.claudeye/hooks-config.json`. In non-TTY environments (CI/piped), the 11 default policies are used automatically. Re-running `--install-hooks` re-opens the selector with your current choices pre-loaded.
162
-
163
- > **Web UI** — You can also toggle individual policies on/off from the **Policies** page in the Claudeye dashboard without re-running the CLI. Changes take effect immediately for the next hook invocation.
164
-
165
- ### Scoped Installation
166
-
167
- By default, hooks are installed at **user** scope (`~/.claude/settings.json`). Use `--scope` to target a different settings file:
168
-
169
- | Scope | File | Use case |
170
- |-------|------|----------|
171
- | `user` (default) | `~/.claude/settings.json` | Machine-wide — applies to all projects |
172
- | `project` | `{cwd}/.claude/settings.json` | Committed to git — shared across the team |
173
- | `local` | `{cwd}/.claude/settings.local.json` | Per-developer, gitignored |
174
-
175
- ```bash
176
- # Install to project scope (committed, shared with team)
177
- claudeye --install-hooks all --scope project
178
-
179
- # Install to local scope (gitignored, personal)
180
- claudeye --install-hooks --scope local
181
-
182
- # Install to a specific project directory (without cd-ing into it)
183
- claudeye --install-hooks all --scope project --cwd /path/to/project
184
-
185
- # Remove from all scopes (default)
186
- claudeye --remove-hooks
187
-
188
- # Remove from a specific scope only
189
- claudeye --remove-hooks --scope project
190
-
191
- # See which scopes have hooks installed
192
- claudeye --list-hooks
193
- ```
194
-
195
- #### The `--cwd` flag
196
-
197
- `--cwd` overrides `process.cwd()` when resolving settings paths for `--scope project` and `--scope local`. It has **no effect** with `--scope user` because user-scope settings always live at `~/.claude/settings.json` regardless of the working directory.
198
-
199
- This is intentional: the `user` scope is machine-wide and has no concept of a project directory, while `project` and `local` scopes resolve relative to a project root. `--cwd` lets you target a project without `cd`-ing into it — useful in scripts, CI pipelines, or when managing hooks across multiple repositories.
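
The scope rules above can be sketched in a few lines of Python. This is only an illustration of the documented path resolution, not Claudeye's actual code; the function name `resolve_settings_path` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_settings_path(scope: str = "user", cwd: Optional[str] = None) -> Path:
    """Return the settings file a given scope would target (illustrative only)."""
    if scope == "user":
        # User scope is machine-wide, so any cwd override is ignored.
        return Path.home() / ".claude" / "settings.json"
    # Project and local scopes resolve relative to the project root.
    root = Path(cwd) if cwd is not None else Path.cwd()
    if scope == "project":
        return root / ".claude" / "settings.json"
    if scope == "local":
        return root / ".claude" / "settings.local.json"
    raise ValueError(f"unknown scope: {scope!r}")
```

Calling `resolve_settings_path("user", "/path/to/project")` still returns the home-directory file, which mirrors why `--cwd` has no effect at user scope.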
200
-
201
- ```bash
202
- # These two are equivalent:
203
- cd /path/to/project && claudeye --install-hooks all --scope project
204
- claudeye --install-hooks all --scope project --cwd /path/to/project
205
-
206
- # --cwd is silently ignored here (user scope doesn't need a project directory)
207
- claudeye --install-hooks all --cwd /path/to/project
208
- # ↑ installs to ~/.claude/settings.json, NOT /path/to/project/.claude/settings.json
209
-
210
- # Always pair --cwd with --scope project or --scope local:
211
- claudeye --install-hooks all --scope project --cwd /path/to/project
212
- claudeye --remove-hooks --scope local --cwd /path/to/project
213
- claudeye --list-hooks --cwd /path/to/project
214
- ```
215
-
216
- > **Why `--cwd` instead of `--project-dir`?** The flag was originally named `--project-dir`, but that was confusingly similar to `--projects-path` (which points to `~/.claude/projects` for the dashboard). `--cwd` is a widely understood convention (npm, pnpm, turbo all use it) and precisely describes the behavior: it sets the working directory for path resolution.
217
-
218
- **Avoid installing hooks in multiple scopes for the same project.** Claude Code merges settings from all scopes, so duplicate hooks may cause the same policy to evaluate twice. If `--list-hooks` shows hooks in multiple scopes, remove the extra installation with `--remove-hooks --scope <scope>`.
219
-
220
- ### Policy Params
221
-
222
- Tune builtin policies without replacing them. Add a `policyParams` key to any `.claudeye/hooks-config.json` file:
223
-
224
- ```json
225
- {
226
- "enabledPolicies": ["block-sudo", "block-push-master", "warn-large-file-write"],
227
- "policyParams": {
228
- "block-sudo": {
229
- "allowPatterns": ["sudo systemctl status *", "sudo journalctl *"]
230
- },
231
- "block-push-master": {
232
- "protectedBranches": ["main", "master", "release"]
233
- },
234
- "warn-large-file-write": {
235
- "thresholdKb": 512
236
- }
237
- }
238
- }
239
- ```
240
-
241
- Config files are read from three scopes in priority order (first wins for params):
242
- 1. `{cwd}/.claudeye/hooks-config.json` — project (can be committed)
243
- 2. `{cwd}/.claudeye/hooks-config.local.json` — local (gitignore this)
244
- 3. `~/.claudeye/hooks-config.json` — global (managed by `--install-hooks`)
245
-
246
- `enabledPolicies` are unioned across all three scopes.
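
A minimal sketch of that merge, assuming "first wins" applies per policy name (the README does not say whether it applies per policy or per individual param key). The function name `merge_hook_configs` is hypothetical and this is not Claudeye's implementation.

```python
def merge_hook_configs(configs):
    """Merge config dicts given in priority order (project, local, global)."""
    enabled = []
    params = {}
    for cfg in configs:
        # enabledPolicies: union across scopes, keeping first-seen order.
        for name in cfg.get("enabledPolicies", []):
            if name not in enabled:
                enabled.append(name)
        # policyParams: the first scope that defines a policy's params wins
        # (assumption: whole-policy granularity, not per-key merging).
        for policy, policy_params in cfg.get("policyParams", {}).items():
            params.setdefault(policy, policy_params)
    return {"enabledPolicies": enabled, "policyParams": params}
```

With this reading, a project-level `block-sudo` entry fully shadows a global one, while a policy enabled only in the global file stays enabled.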
247
-
248
- | Policy | Param | Type | Default | Description |
249
- |---|---|---|---|---|
250
- | `block-sudo` | `allowPatterns` | `string[]` | `[]` | Token-matched patterns to allow (e.g. `"sudo systemctl *"`) |
251
- | `block-rm-rf` | `allowPaths` | `string[]` | `[]` | Paths exempt from catastrophic-deletion blocking |
252
- | `block-read-outside-cwd` | `allowPaths` | `string[]` | `[]` | Paths outside cwd allowed for reading |
253
- | `block-push-master` | `protectedBranches` | `string[]` | `["main","master"]` | Branch names blocked from `git push` |
254
- | `block-work-on-main` | `protectedBranches` | `string[]` | `["main","master"]` | Branch names blocked for commits/merges |
255
- | `sanitize-api-keys` | `additionalPatterns` | `{regex,label}[]` | `[]` | Extra credential patterns to redact from output |
256
- | `block-secrets-write` | `additionalPatterns` | `string[]` | `[]` | Extra filename substrings to block writing |
257
- | `warn-large-file-write` | `thresholdKb` | `number` | `1024` | Threshold in KB above which file writes warn |
258
-
259
- > **`allowPatterns` safety** — `block-sudo` matches patterns against **parsed argv tokens**, not the raw command string. This prevents shell injection bypass like `sudo systemctl status; rm -rf /` from matching the pattern `sudo systemctl status *`.
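
To see why token matching is safer than matching the raw string, here is a small Python sketch. It is an assumption about how such a matcher could work, not Claudeye's parser: `is_allowed` is a hypothetical name, a trailing `*` is assumed to match zero or more remaining tokens, and any command containing a shell operator is rejected outright.

```python
import shlex

# Characters that shlex treats as shell punctuation when punctuation_chars=True.
SHELL_PUNCTUATION = set("();<>|&")


def tokenize(command: str) -> list[str]:
    """Split a command into argv-style tokens, keeping operators as separate tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


def is_allowed(pattern: str, command: str) -> bool:
    """Return True only if the command's tokens match the allow pattern."""
    tokens = tokenize(command)
    # Any operator token (;, &&, |, ...) means more than one command: never allow.
    if any(tok and set(tok) <= SHELL_PUNCTUATION for tok in tokens):
        return False
    expected = pattern.split()
    if expected and expected[-1] == "*":
        head = expected[:-1]
        return tokens[: len(head)] == head
    return tokens == expected
```

A raw glob match of `sudo systemctl status *` against `sudo systemctl status nginx; rm -rf /` would succeed, because `*` swallows the injected command. The token-based check rejects it because `;` shows up as its own token.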
260
-
261
- > **Web UI** — Policy params can also be edited directly from the **Policies tab** in the dashboard. Click the gear icon next to any policy with configurable parameters to open the editor.
262
-
263
- ### Custom Hooks
264
-
265
- Register arbitrary hook logic in a JavaScript file using the `custom` keyword with `--install-hooks`:
266
-
267
- ```bash
268
- claudeye --install-hooks custom ./my-hooks.js
269
- ```
270
-
271
- **`my-hooks.js`**:
272
- ```js
273
- import { customPolicies, allow, deny, instruct } from "claudeye";
274
-
275
- customPolicies.add({
276
- name: "block-production-writes",
277
- description: "Prevent writes to production config files",
278
- match: { events: ["PreToolUse"] },
279
- fn: async (ctx) => {
280
- if (ctx.toolName === "Write") {
281
- const path = ctx.toolInput?.file_path ?? "";
282
- if (path.includes("production") || path.includes("prod.")) {
283
- return deny("Writing to production config is blocked");
284
- }
285
- }
286
- return allow();
287
- },
288
- });
289
- ```
290
-
291
- - **`ctx`** fields: `eventType`, `toolName`, `toolInput`, `payload`, `session`, `params`
292
- - **Return values**: `allow()`, `deny(message)`, `instruct(message)` — `deny(message)` is surfaced to Claude as `"Blocked by claudeye: <message>"`, consistent with builtin policy output
293
- - **Transitive imports**: files imported by your hooks entry point are automatically rewritten
294
- - **Fail-open**: any error or 10-second timeout returns `allow` and logs a warning — Claude is never blocked by a broken hook
295
- - **View loaded hooks**: `claudeye --list-hooks` shows a Custom Hooks section when a path is configured; the Policies tab in the dashboard shows each custom policy as a rich row with its name, description, and event scope (where defined), aligned with the built-in policy layout — edit the JS file to add, remove, or reorder them
296
- - **Remove**: `claudeye --remove-hooks custom` clears the path from global config
297
- - **Validation**: the file is loaded and validated at install time — if it has syntax errors, import failures, or registers no hooks, the install fails with an error and config is left unchanged
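
The fail-open behavior described in the list above can be illustrated with a short Python sketch. It assumes a hook is a plain callable returning a dict shaped like `PolicyResult`; `run_fail_open` is a hypothetical helper and does not reflect how Claudeye runs hooks internally.

```python
import concurrent.futures
import logging


def run_fail_open(hook, ctx, timeout_s=10.0):
    """Run a hook; on any error or timeout, log a warning and allow."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(hook, ctx).result(timeout=timeout_s)
    except Exception as exc:  # includes the timeout raised by result()
        logging.warning("hook failed (%s: %s); allowing", type(exc).__name__, exc)
        return {"decision": "allow"}
    finally:
        # Do not wait for a stuck hook; its thread finishes in the background.
        pool.shutdown(wait=False)
```

The trade-off is deliberate: a buggy custom hook degrades to "no protection from that hook" rather than "agent cannot work".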
298
-
299
- **TypeScript types** (exported from `claudeye`):
300
-
301
- ```ts
302
- interface PolicyContext {
303
- eventType: string;
304
- toolName?: string;
305
- toolInput?: Record<string, unknown>;
306
- payload: Record<string, unknown>;
307
- session?: {
308
- sessionId?: string;
309
- transcriptPath?: string;
310
- cwd?: string;
311
- permissionMode?: string;
312
- hookEventName?: string;
313
- };
314
- params?: Record<string, unknown>;
315
- }
316
-
317
- type PolicyDecision = "allow" | "deny" | "instruct";
318
-
319
- interface PolicyResult {
320
- decision: PolicyDecision;
321
- reason?: string;
322
- message?: string;
323
- }
324
-
325
- interface CustomHook {
326
- name: string;
327
- description?: string;
328
- match?: { events?: HookEventType[] };
329
- fn: (ctx: PolicyContext) => PolicyResult | Promise<PolicyResult>;
330
- }
331
- ```
332
-
333
- ### LLM-Powered Policies
334
-
335
- Some policies (like `verify-intent`) use an external LLM to make intelligent decisions. These require a one-time configuration of an OpenAI-compatible API provider.
336
-
337
- #### Configure LLM Provider
338
-
339
- ```bash
340
- # Interactive setup (prompts for API key, base URL, model)
341
- claudeye --configure-llm
342
-
343
- # Non-interactive (flags)
344
- claudeye --configure-llm --llm-api-key sk-xxx
345
-
346
- # Full configuration
347
- claudeye --configure-llm --llm-api-key sk-xxx --llm-base-url https://api.groq.com/openai/v1 --llm-model llama-3-70b
348
- ```
349
-
350
- Any OpenAI-compatible API works — OpenAI, Groq, Together, Ollama (`http://localhost:11434/v1`), etc. Configuration is saved to `~/.claudeye/hooks-config.json`:
351
-
352
- ```json
353
- {
354
- "enabledPolicies": ["verify-intent"],
355
- "llm": {
356
- "apiKey": "sk-...",
357
- "baseUrl": "https://api.openai.com/v1",
358
- "model": "gpt-4o-mini"
359
- }
360
- }
361
- ```
362
-
363
- You can also override at runtime with environment variables (useful for CI):
364
-
365
- | Variable | Description | Default |
366
- |----------|-------------|---------|
367
- | `CLAUDEYE_LLM_API_KEY` | API key | - |
368
- | `CLAUDEYE_LLM_BASE_URL` | Base URL | `https://api.openai.com/v1` |
369
- | `CLAUDEYE_LLM_MODEL` | Model name | `gpt-4o-mini` |
370
-
371
- #### verify-intent Policy
372
-
373
- The `verify-intent` policy fires when Claude says it's done (Stop event) and uses two LLM calls to check whether all user requests were actually completed:
374
-
375
- 1. **Extract** — Reads the session transcript and asks the LLM to list all actionable user intents
376
- 2. **Verify** — Sends the intents and transcript to the LLM, which checks if each was satisfied
377
-
378
- If any intents are unsatisfied, Claude is instructed to continue working. This repeats up to 3 times before allowing the stop.
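
The extract/verify loop can be sketched as follows. The two LLM calls are passed in as plain callables so the example stays self-contained. The names `verify_intent`, `extract_intents`, and `find_unsatisfied` are hypothetical, and how the real policy tracks the retry count is not documented here, so it is an explicit argument.

```python
MAX_RETRIES = 3


def verify_intent(transcript, extract_intents, find_unsatisfied, retries_so_far):
    """Decide whether a Stop event may proceed (illustrative sketch only)."""
    # After the retry budget is spent, always let the agent stop.
    if retries_so_far >= MAX_RETRIES:
        return {"decision": "allow", "reason": "retry limit reached"}
    # Step 1: ask the LLM for the actionable user intents.
    intents = extract_intents(transcript)
    # Step 2: ask the LLM which of those intents are still unsatisfied.
    unsatisfied = find_unsatisfied(intents, transcript)
    if unsatisfied:
        return {
            "decision": "instruct",
            "message": "Unfinished requests: " + "; ".join(unsatisfied),
        }
    return {"decision": "allow"}
```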
379
-
380
- ```bash
381
- # Enable after configuring LLM
382
- claudeye --install-hooks verify-intent
383
- ```
384
-
385
- ### Hook Activity Dashboard
386
-
387
- Every policy evaluation is logged to `~/.claudeye/cache/hook-activity/` and displayed at `/policies` in the dashboard. You can filter by decision (allow/deny), event type, and policy name — with pagination and auto-refresh.
388
-
389
- Deny events are also reported to anonymous telemetry (if enabled) so you can track how often policies fire across machines.
390
-
391
- ### Hook Logging
392
-
393
- Hook processes write structured log output to **stderr**, controlled by two environment variables:
394
-
395
- | Variable | Purpose | Values | Default |
396
- |----------|---------|--------|---------|
397
- | `CLAUDEYE_LOG_LEVEL` | Log level threshold for both stderr and file output | `info`, `warn`, `error` | `warn` |
398
- | `CLAUDEYE_HOOK_LOG_FILE` | Enable file logging | unset/empty = off, `1` or `true` = `~/.claudeye/logs/`, or a custom directory path | disabled |
399
-
400
- #### What gets logged
401
-
402
- | Level | What is logged |
403
- |-------|---------------|
404
- | `info` | Event type, enabled policy count, matched policies, evaluation result (decision, policy name, duration), deny/instruct reasons |
405
- | `warn` | Stdin read failures, JSON payload parse failures, activity persistence failures |
406
- | `error` | Critical failures |
407
-
408
- Stderr output format: `[claudeye:hook] LEVEL message`
409
-
410
- #### Stderr-only logging (default)
411
-
412
- Set `CLAUDEYE_LOG_LEVEL` in your shell environment before launching Claude Code:
413
-
414
- ```bash
415
- # See all hook activity in Claude Code's stderr
416
- CLAUDEYE_LOG_LEVEL=info claude
417
-
418
- # Only see warnings and errors (default behavior)
419
- CLAUDEYE_LOG_LEVEL=warn claude
420
- ```
421
-
422
- #### File logging (opt-in)
423
-
424
- Enable persistent log files via `CLAUDEYE_HOOK_LOG_FILE`:
425
-
426
- ```bash
427
- # Enable file logging to default directory (~/.claudeye/logs/)
428
- CLAUDEYE_HOOK_LOG_FILE=1 CLAUDEYE_LOG_LEVEL=info claude
429
-
430
- # Enable file logging to a custom directory
431
- CLAUDEYE_HOOK_LOG_FILE=/var/log/claudeye CLAUDEYE_LOG_LEVEL=info claude
432
-
433
- # In .bashrc or .zshrc for persistent config
434
- export CLAUDEYE_LOG_LEVEL=info
435
- export CLAUDEYE_HOOK_LOG_FILE=1
436
- ```
437
-
438
- File logging details:
439
- - **Active log file:** `hooks.log` in the log directory
440
- - **Rotation:** Size-based at 500 KB — when `hooks.log` exceeds this, it is renamed to `hooks-{timestamp}.log` and a fresh `hooks.log` is created
441
- - **Format:** `[ISO timestamp] LEVEL message` (plain text, human-readable)
442
- - **Write mode:** Synchronous (hook processes are short-lived)
443
- - **Failure handling:** File logging is best-effort — if writes fail, the hook still completes normally
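
A rough Python sketch of the rotation scheme described above: size-based, synchronous, and best-effort. The exact timestamp format in the rotated file name is not specified in this README, so the one used here is an assumption, as is the helper name `append_log`.

```python
import time
from pathlib import Path

MAX_BYTES = 500 * 1024  # 500 KB rotation threshold


def append_log(log_dir, level, message, max_bytes=MAX_BYTES):
    """Append one line to hooks.log, rotating first if the file is too large."""
    log_dir = Path(log_dir)
    try:
        log_dir.mkdir(parents=True, exist_ok=True)
        active = log_dir / "hooks.log"
        if active.exists() and active.stat().st_size > max_bytes:
            # Assumed timestamp format; the real naming may differ.
            stamp = time.strftime("%Y%m%dT%H%M%S")
            active.rename(log_dir / f"hooks-{stamp}.log")
        iso = time.strftime("%Y-%m-%dT%H:%M:%S%z")
        with active.open("a", encoding="utf-8") as fh:
            fh.write(f"[{iso}] {level.upper()} {message}\n")
    except OSError:
        # Best-effort: a logging failure must never break the hook itself.
        pass
```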
444
-
445
- ### Using Hook Policies with Claude Agents SDK
446
-
447
- Claudeye hook policies work with both **Claude Code** and the **Claude Agents SDK** (Python and TypeScript). The same `claudeye --install-hooks` installation works for both — the difference is how the SDK discovers the hooks.
448
-
449
- #### Why it matters
450
-
451
- When you run `claudeye --install-hooks`, hook entries are written to `~/.claude/settings.json`. Claude Code reads this file automatically. The Claude Agents SDK also supports these shell-command hooks, but **only if you explicitly tell it to load settings**.
452
-
453
- Without this, your Agent SDK apps run with **zero hook protection** — even if you've installed hooks for Claude Code on the same machine.
454
-
455
- #### Python (Claude Agents SDK)
456
-
457
- Add `setting_sources=["user"]` to load hooks from `~/.claude/settings.json`:
458
-
459
- ```python
- import asyncio
-
- from claude_agent_sdk import query, ClaudeAgentOptions
-
- options = ClaudeAgentOptions(
-     setting_sources=["user"],  # Loads ~/.claude/settings.json (includes claudeye hooks)
- )
-
-
- async def main():
-     # query() is an async iterator, so it must be consumed with `async for`
-     async for message in query(prompt="Refactor the auth module", options=options):
-         print(message)
-
-
- asyncio.run(main())
- ```
469
-
470
- To also load project-level settings (`.claude/settings.json` in the working directory):
471
-
472
- ```python
473
- options = ClaudeAgentOptions(
474
- setting_sources=["user", "project"],
475
- )
476
- ```
477
-
478
- #### TypeScript (Claude Agents SDK)
479
-
480
- ```typescript
481
- import { query } from "@anthropic-ai/claude-agent-sdk";
482
-
483
- for await (const message of query({
484
- prompt: "Refactor the auth module",
485
- options: {
486
- settingSources: ["user"], // Loads ~/.claude/settings.json (includes claudeye hooks)
487
- },
488
- })) {
489
- console.log(message);
490
- }
491
- ```
492
-
493
- #### What happens under the hood
494
-
495
- The flow is identical to Claude Code:
496
-
497
- 1. Agent SDK reads `~/.claude/settings.json` and finds claudeye hook entries
498
- 2. Before a tool executes, the SDK invokes `claudeye --hook PreToolUse` with the tool call payload on stdin
499
- 3. Claudeye evaluates enabled policies and returns allow/deny via stdout
500
- 4. The SDK blocks the tool call if the policy denies it
501
-
502
- The hook response format (`hookSpecificOutput` with `permissionDecision` and `permissionDecisionReason`) is the same protocol used by Claude Code — claudeye doesn't need to know which client is calling it.
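
As a rough illustration of that protocol, the sketch below builds a `PreToolUse` response from a tool-call payload. The payload field names (`tool_name`, `tool_input`) and the `hookEventName` key follow the Claude Code hooks documentation as best understood here and should be treated as assumptions. The single `sudo` rule and both function names are hypothetical, not Claudeye's policy engine.

```python
import json


def build_response(decision: str, reason: str = "") -> str:
    """Serialize a PreToolUse decision in the hookSpecificOutput shape."""
    return json.dumps({
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": decision,
            "permissionDecisionReason": reason,
        }
    })


def handle_pretooluse(raw_payload: str) -> str:
    """Evaluate one toy policy against the payload received on stdin."""
    payload = json.loads(raw_payload)
    command = payload.get("tool_input", {}).get("command", "")
    # Toy rule standing in for the real policy set: deny any sudo command.
    if payload.get("tool_name") == "Bash" and command.split()[:1] == ["sudo"]:
        return build_response("deny", "Blocked by claudeye: sudo is not allowed")
    return build_response("allow")
```

In a real hook process the payload would be read from stdin and the returned string written to stdout.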
503
-
504
- #### Verification
505
-
506
- After installing hooks and configuring `setting_sources`, you can verify by checking the `/policies` activity dashboard. Both Claude Code and Agent SDK hook events appear in the same activity log.
507
-
508
- #### Common Pitfall
509
-
510
- If you installed hooks with `claudeye --install-hooks` but your Agent SDK app isn't blocking anything, check that you've set `setting_sources`. Without it, the SDK ignores `~/.claude/settings.json` entirely — hooks, permissions, and all other settings.
511
-
512
- ### CLI Examples
513
-
514
- ```bash
515
- # Custom projects path
516
- claudeye --projects-path /path/to/projects
517
-
518
- # Different port, no browser
519
- claudeye --port 3000 --no-open
520
-
521
- # LAN access
522
- claudeye --host 0.0.0.0
523
-
524
- # Load custom evals and enrichments
525
- claudeye --evals ./my-evals.js
526
-
527
- # Password-protect the dashboard
528
- claudeye --auth-user admin:secret
529
-
530
- # Multiple auth users
531
- claudeye --auth-user admin:secret --auth-user viewer:readonly
532
-
533
- # Clear cached results
534
- claudeye --cache-clear
535
-
536
- # Enable background queue processing (scan every 60 seconds)
537
- claudeye --evals ./my-evals.js --queue-interval 60
538
-
539
- # Background processing with higher concurrency
540
- claudeye --evals ./my-evals.js --queue-interval 30 --queue-concurrency 5
541
- ```
542
-
543
- ## Enterprise
544
-
545
- The free tier ships 27 built-in policies with more added in every update. For teams that need deeper control, Claudeye Enterprise provides:
546
-
547
- - **Custom hook policies** — define organization-specific rules beyond the built-in set. Block access to internal endpoints, enforce branch naming conventions, restrict tool usage by project, or implement any policy your security team requires.
548
- - **Failure pattern knowledge base** — an intelligent agent that learns from your agents' past failures. It checks every session against a growing library of known failure patterns — broken tool call sequences, retry loops, permission escalations, misapplied fixes — and flags issues before they compound. The knowledge base improves continuously as your agents encounter new edge cases.
549
-
550
- [Contact us](https://discord.com/invite/zT92CAgvkj) to learn more.
551
-
552
- ## Why Claudeye?
553
-
554
- | Feature | Claudeye | Langfuse | Dev-Agent-Lens | ccusage | Raw JSONL |
555
- |---------|:--------:|:--------:|:--------------:|:-------:|:---------:|
556
- | Local-first (no cloud) | **Yes** | Self-host option | Proxy required | Yes | Yes |
557
- | Session replay | **Yes** | Traces only | Traces only | No | Manual |
558
- | Custom evals | **Yes** | Limited | No | No | No |
559
- | Subagent expansion | **Yes** | No | No | No | No |
560
- | Zero config | **Yes** | Setup required | Proxy setup | Yes | N/A |
561
- | Visual dashboard | **Yes** | Yes | Yes (Phoenix) | CLI only | No |
562
- | Hook security policies | **Yes** | No | No | No | No |
563
-
564
- ## Features
565
-
566
- ### Uncover
567
-
568
- - **Projects & sessions browser** - filter by date range or keyword, paginated and sorted newest-first
569
- - **Full execution trace viewer** - every message, tool call, thinking block, and system event
570
- - **Nested subagent logs** - expand to see subagent executions inline, pre-loaded with the session
571
- - **Virtual scrolling** - handles sessions with thousands of entries without performance issues
572
-
573
- ### Understand
574
-
575
- - **Session stats bar** - turns, tool calls, subagents, duration, and models at a glance
576
- - **Custom evals** - grade sessions with pass/fail results and 0-1 scores
577
- - **Per-eval recompute** - re-run a single eval without reprocessing all others
578
- - **Conditional evals** - gate evals globally or per-item, with session/subagent scope control
579
- - **Cache invalidation hook** - register a custom function via `app.cacheInvalidation()` to invalidate stale cached results based on age, score, or any custom logic
580
-
581
- ### Utilize
582
-
583
- - **Hook security policies** - 27 built-in policies (free, with more every update) that block dangerous commands (`sudo`, `rm -rf /`, `.env` reads, `curl | sh`, force-pushes, commits on main, and more) in real time via Claude Code's hook system, before they execute
584
- - **Hook activity dashboard** - see every policy check at `/hooks`: what was allowed, what was blocked, with filters by decision, event type, and policy name
585
- - **Hook logging** - structured stderr output for every hook invocation (event type, policy count, decision, duration), plus opt-in file logging with automatic rotation via `CLAUDEYE_HOOK_LOG_FILE`
586
- - **Custom enrichments** - compute metadata (token counts, quality signals, labels) as key-value pairs
587
- - **Custom actions** - on-demand tasks triggered from the dashboard via `app.action()` — generate summaries, export metrics, or run side-effects with full access to eval and enrichment results
588
- - **Alerts** - register callbacks via `app.alert()` that fire after all evals and enrichments complete (Slack webhooks, CI notifications, logging)
589
- - **Dashboard views & filters** - organize filters into named views, each with focused filter tiles (boolean toggles, range sliders, multi-select dropdowns) and a filterable sessions table
590
- - **Dashboard aggregates** - define cross-session summary tables with `app.dashboard.aggregate()`, using `{ collect, reduce }` for full control over output
591
- - **Unified queue** - all evals and enrichments (session, subagent, UI, background) go through a single priority queue with bounded concurrency, live tracking at `/queue`
592
- - **JSONL export** - download raw session logs
593
- - **Auto-refresh** - monitor live sessions at 5s, 10s, or 30s intervals
594
- - **Light/dark theme** - with system preference detection
595
-
596
- ## CLI Reference
597
-
598
- | Flag | Description | Default |
599
- |------|-------------|---------|
600
- | `--projects-path, -p <path>` | Path to Claude projects directory | `~/.claude/projects` |
601
- | `--port <number>` | Port to bind | `8020` |
602
- | `--host <address>` | Host to bind (`0.0.0.0` for LAN) | `localhost` |
603
- | `--evals <path>` | Path to evals/enrichments file | - |
604
- | `--auth-user <user:pass>` | Add an auth user (repeatable) | - |
605
- | `--cache-path <path>` | Custom cache directory | `~/.claudeye/cache` |
606
- | `--cache-clear` | Clear all cached results and exit | - |
607
- | `--no-open` | Don't auto-open the browser | - |
608
- | `--queue-interval <secs>` | Background scan interval in seconds | disabled |
609
- | `--queue-concurrency <num>` | Max parallel items per batch | `2` |
610
- | `--queue-history-ttl <secs>` | Seconds to keep completed items | `3600` |
611
- | `--max-queue-items <num>` | Max items to enqueue per scan (0=unlimited) | `500` |
612
- | `--logging <level>` | Log level: `info`, `warn`, `error` (applies to dashboard server; hooks read `CLAUDEYE_LOG_LEVEL` env var) | `warn` |
613
- | `--disable-telemetry` | Disable anonymous usage analytics (telemetry is on unless this flag is passed) | - |
614
- | `--install-hooks [policies\|custom <path>]` | Install hooks (interactive, or: `all`, `name1 name2 ...`, or `custom <path>` to register a custom hooks JS file) | - |
615
- | `--remove-hooks [policies\|custom]` | Remove hooks (all, or: `name1 name2 ...` to disable specific, or `custom` to clear the custom hooks path) | - |
616
- | `--list-hooks` | Show hook policies in a table with enabled status, descriptions, and a separate Beta section | - |
617
- | `--configure-llm` | Configure LLM provider for smart policies (interactive, or pass flags below) | - |
618
- | `--llm-api-key <key>` | API key for the LLM provider | - |
619
- | `--llm-base-url <url>` | Base URL for OpenAI-compatible API | `https://api.openai.com/v1` |
620
- | `--llm-model <model>` | Model name | `gpt-4o-mini` |
621
- | `--scope <scope>` | Scope for `--install-hooks` / `--remove-hooks`: `user`, `project`, `local`, or `all` (remove only) | `user` (install), `all` (remove) |
622
- | `--cwd <path>` | Working directory for `--scope project`/`local` (default: cwd). No effect with `--scope user`. | cwd |
623
- | `-h, --help` | Show help | - |
624
-
625
- ## Custom Evals & Enrichments
626
-
627
- Define evals and enrichments in a single JS file and load with `--evals`:
628
-
629
- ```js
630
- import { createApp } from 'claudeye';
631
-
632
- const app = createApp();
633
-
634
- // Evals: grade your sessions
635
- app.eval('under-50-turns', ({ stats }) => ({
636
- pass: stats.turnCount <= 50,
637
- score: Math.max(0, 1 - stats.turnCount / 100),
638
- message: `${stats.turnCount} turn(s)`,
639
- }));
640
-
641
- app.eval('tool-success', ({ entries }) => {
642
- const results = entries.filter(e => e.type === 'tool_result');
643
- const errors = results.filter(e => e.is_error);
644
- const rate = results.length ? 1 - errors.length / results.length : 1;
645
- return { pass: rate >= 0.9, score: rate };
646
- });
647
-
648
- // Enrichments: add metadata to sessions
649
- app.enrich('session-summary', ({ entries, stats }) => ({
650
- 'Total Tokens': entries.reduce((s, e) => s + (e.usage?.total_tokens || 0), 0),
651
- 'Primary Model': stats.models[0] || 'unknown',
652
- 'Tool Calls': stats.toolCallCount,
653
- }));
654
- ```
655
-
656
- ```bash
657
- claudeye --evals ./my-evals.js
658
- ```
659
-
660
- ---
661
-
662
- ## API Reference
663
-
664
- > **Distribution note (v1.0.0+):** Claudeye is distributed as a compiled native binary. The `claudeye` npm package is a thin JS wrapper that proxies `createApp()` calls to the binary via JSON IPC. User-defined functions (eval closures, enrich callbacks, etc.) remain in your Node.js process — they are never serialized into the binary. The public API surface is identical to previous versions; no code changes are required.
665
-
666
- ### `createApp()`
667
-
668
- Returns a `ClaudeyeApp` instance. All methods are chainable.
669
-
670
- ```ts
671
- import { createApp } from 'claudeye';
672
- const app = createApp();
673
- ```
674
-
675
- ---
676
-
677
- ### `app.condition(fn)`
678
-
679
- Set a global condition that gates all evals, enrichments, and actions. Calling this multiple times replaces the previous condition.
680
-
681
- ```ts
682
- app.condition(({ entries, stats, projectName, sessionId }) => boolean | Promise<boolean>);
683
- ```
684
-
685
- If the global condition returns `false` (or throws), every registered eval, enrichment, and action is skipped.
686
-
687
- #### Examples
688
-
689
- ```js
690
- // Only run for sessions with actual content
691
- app.condition(({ entries }) => entries.length > 0);
692
-
693
- // Only run for non-test projects
694
- app.condition(({ projectName }) => !projectName.includes('test'));
695
-
696
- // Only run for sessions with at least 5 turns
697
- app.condition(({ stats }) => stats.turnCount >= 5);
698
-
699
- // Async condition
700
- app.condition(async ({ sessionId }) => {
701
- // You could check an external service, database, etc.
702
- return sessionId !== 'skip-this-one';
703
- });
704
- ```
705
-
706
- #### Combining Global and Per-Item Conditions
707
-
708
- Global and per-item conditions stack. The global condition runs first; if it passes, per-item conditions are checked individually:
709
-
710
- ```js
711
- const app = createApp();
712
-
713
- // Global: skip everything for empty sessions
714
- app.condition(({ entries }) => entries.length > 0);
715
-
716
- // Per-eval: only check tool-call efficiency for sessions with tool calls
717
- app.eval('efficient-tools',
718
- ({ stats }) => ({
719
- pass: stats.toolCallCount <= stats.turnCount * 2,
720
- score: Math.max(0, 1 - (stats.toolCallCount / (stats.turnCount * 4))),
721
- }),
722
- { condition: ({ stats }) => stats.toolCallCount > 0 }
723
- );
724
-
725
- // Per-enrichment: only compute model info for sessions that used a model
726
- app.enrich('model-info',
727
- ({ stats }) => ({
728
- 'Primary Model': stats.models[0] || 'unknown',
729
- 'Model Count': stats.models.length,
730
- }),
731
- { condition: ({ stats }) => stats.models.length > 0 }
732
- );
733
- ```
734
-
735
- > **Note:** Calling `app.condition()` multiple times replaces the previous condition. Only the last one is active. The global condition applies to evals, enrichments, and actions alike; there's no way to set separate global conditions for each.
736
-
737
- ---
738
-
739
- ### `app.queueCondition(fn, options?)`
740
-
741
- Set a condition that gates **background queue processing**. This is separate from `app.condition()` — it only affects the background scanner (`scanAndEnqueue`), not UI-triggered runs.
742
-
743
- ```ts
744
- app.queueCondition(fn: ConditionFunction, options?: QueueConditionOptions): ClaudeyeApp;
745
- ```
746
-
747
- If the queue condition returns `false` (or throws), the session is skipped entirely — no evals or enrichments are enqueued for that session during background scanning.
748
-
749
- #### Options
750
-
751
- | Option | Type | Default | Description |
752
- |--------|------|---------|-------------|
753
- | `cacheable` | `boolean` | `false` | When `true`, condition results are cached per-session with hash-based invalidation. When `false` (default), the condition is evaluated fresh every scan cycle. |
754
-
755
- #### Caching
756
-
757
- By default (`cacheable: false`), the condition is re-evaluated every scan cycle. This is safe for time-dependent or external-state-dependent conditions (e.g., business hours, feature flags).
758
-
759
- When `cacheable: true`, condition results are cached per-session. The cache auto-invalidates when:
760
- - The **session file changes** (new content → different `contentHash`)
761
- - The **condition function is edited** (different `conditionCodeHash` via `fn.toString()` → SHA-256)
762
-
763
- Use `cacheable: true` only when the condition depends solely on session data.
764
-
765
- Log parsing is **lazy** — only triggered when the condition is actually evaluated (every cycle when non-cacheable, or on cache miss when cacheable). The parsed session data comes from `getCachedSessionLog()` which is itself LRU-cached in memory.
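
The two invalidation triggers above can be illustrated with a small sketch. It only assumes what the text states (a `contentHash` for the session file and a SHA-256 of `fn.toString()` for the condition). The `conditionCacheKey` helper and its key layout are hypothetical, not Claudeye's actual implementation.

```js
const { createHash } = require('node:crypto');

// SHA-256 of the function's source text, as described for conditionCodeHash.
function hashConditionCode(fn) {
  return createHash('sha256').update(fn.toString()).digest('hex');
}

// Hypothetical cache-key layout: a change to either part produces a new key,
// which is one simple way to get "auto-invalidation".
function conditionCacheKey(contentHash, fn) {
  return `${contentHash}:${hashConditionCode(fn)}`;
}

const minTurns = ({ stats }) => stats.turnCount >= 3;
const minTurnsEdited = ({ stats }) => stats.turnCount >= 4;

// Same session content and same function source: the key is stable.
console.log(conditionCacheKey('abc123', minTurns) === conditionCacheKey('abc123', minTurns));
// Edited function source: the key changes, so a cached result would be missed.
console.log(conditionCacheKey('abc123', minTurns) === conditionCacheKey('abc123', minTurnsEdited));
```

Because the hash covers only the function's source text, values captured from an outer scope are invisible to it, which is consistent with the advice to cache only conditions that depend solely on session data.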
766
-
767
- #### Examples
768
-
769
- ```js
770
- // Only process sessions with more than 5 entries (evaluated fresh every cycle)
771
- app.queueCondition(({ entries }) => entries.length > 5);
772
-
773
- // Only process sessions from specific projects (safe to cache — depends only on session data)
774
- app.queueCondition(({ projectName }) => projectName.startsWith('production'), { cacheable: true });
775
-
776
- // Time-dependent condition — must not be cached
777
- app.queueCondition(() => {
778
- const hour = new Date().getHours();
779
- return hour >= 9 && hour < 17; // business hours only
780
- });
781
-
782
- // Skip sessions that are too short to be meaningful (safe to cache)
783
- app.queueCondition(({ stats }) => stats.turnCount >= 3, { cacheable: true });
784
- ```
785
-
786
- #### Difference from `app.condition()`
787
-
788
- | | `app.condition()` | `app.queueCondition()` |
789
- |---|---|---|
790
- | **Scope** | Gates all evals/enrichments/actions | Gates background queue only |
791
- | **Affects UI runs** | Yes | No |
792
- | **Affects background scanner** | No (but still checked when each queued item executes) | Yes |
793
- | **Caching** | No (re-evaluated each run) | Opt-in via `{ cacheable: true }` |
794
-
795
- Both can be set simultaneously. `app.queueCondition()` acts as a pre-filter for the background scanner; sessions that pass then go through the normal `app.condition()` check when individual items execute.
796
-
797
- ---
798
-
799
- ### `app.cacheInvalidation(fn)`
800
-
801
- Register a cache invalidation hook that runs before serving any cached eval or enrichment result. If the hook returns `true`, the cached entry is discarded and the item re-runs. Only affects evals and enrichments — actions, alerts, and conditions pass through unchanged.
802
-
803
- ```ts
804
- app.cacheInvalidation((ctx) => boolean | Promise<boolean>);
805
- ```
806
-
807
- The `ctx` object includes `itemName`, `itemKind` (`"evals"` or `"enrichments"`), `cachedAt` (ISO timestamp), `cachedValue` (the full cached result), `projectName`, `sessionId`, `contentHash`, `itemCodeHash` (hash from the cached entry), and `currentItemCodeHash` (hash of the current function code).
808
-
809
- ```js
810
- // Invalidate cache entries older than 24 hours
811
- app.cacheInvalidation(({ cachedAt }) => {
812
- const age = Date.now() - new Date(cachedAt).getTime();
813
- return age > 24 * 60 * 60 * 1000;
814
- });
815
-
816
- // Re-run a specific eval when cached score is 0
817
- app.cacheInvalidation((ctx) => {
818
- if (ctx.itemKind === 'evals' && ctx.itemName === 'no-hallucination') {
819
- return ctx.cachedValue.score === 0;
820
- }
821
- return false;
822
- });
823
-
824
- // Re-run when the eval/enrichment function code has changed
825
- app.cacheInvalidation(({ itemCodeHash, currentItemCodeHash }) => {
826
- return itemCodeHash !== currentItemCodeHash;
827
- });
828
- ```
829
-
830
- ---
831
-
832
- ### `app.eval(name, fn, options?)`
833
-
834
- Register an eval function. Evals grade sessions with a pass/fail result and an optional 0-1 score.
835
-
836
- - **`name`** - unique string identifier for the eval
837
- - **`fn`** - function receiving an `EvalContext` and returning an `EvalResult`
838
- - **`options.condition`** - optional condition function to gate this eval
839
- - **`options.scope`** - `'session'` (default), `'subagent'`, or `'both'`
840
- - **`options.subagentType`** - only run for subagents of this type (e.g. `'Explore'`)
841
-
842
- If a per-eval condition returns `false`, the eval is marked as **skipped** in the results panel. If the condition throws, the eval is marked as **errored** with the message `Condition error: <message>`.
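
As a rough illustration of the gating order described in this reference (global condition first, then the per-eval condition), here is a minimal sketch. `resolveGate` and the shape of its return value are hypothetical names chosen for this example; only the skipped/errored outcomes and the `Condition error: <message>` text come from the documentation.

```js
// Hypothetical helper: decide whether an eval runs, is skipped, or errors.
async function resolveGate(globalCondition, itemCondition, ctx) {
  // Global condition first: returning false or throwing skips the item.
  if (globalCondition) {
    try {
      if (!(await globalCondition(ctx))) return { state: 'skipped' };
    } catch {
      return { state: 'skipped' };
    }
  }
  // Per-item condition next: false means skipped, a throw means errored.
  if (itemCondition) {
    try {
      if (!(await itemCondition(ctx))) return { state: 'skipped' };
    } catch (err) {
      return { state: 'errored', error: `Condition error: ${err.message}` };
    }
  }
  return { state: 'run' };
}

// The global condition passes (one entry), the per-eval condition does not (2 < 5).
const ctx = { entries: [{}], stats: { turnCount: 2 } };
resolveGate(
  ({ entries }) => entries.length > 0,
  ({ stats }) => stats.turnCount >= 5,
  ctx
).then((result) => console.log(result.state)); // logs "skipped"
```

Awaiting both conditions lets synchronous and `Promise`-returning conditions share one code path.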
843
-
844
- #### `EvalResult`
845
-
846
- ```ts
847
- interface EvalResult {
848
- pass: boolean; // Did the eval pass?
849
- score?: number; // 0-1, clamped automatically (default: 1.0)
850
- message?: string; // Shown in the UI
851
- metadata?: Record<string, unknown>; // Arbitrary data
852
- }
853
- ```
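
The score handling noted in the interface comments (a 0-1 range, clamped automatically, defaulting to 1.0) might look roughly like the following. `normalizeEvalResult` is a hypothetical name; this is a sketch of the documented behavior, not the library's code.

```js
// Hypothetical normalizer mirroring the documented EvalResult defaults.
function normalizeEvalResult(result) {
  // A missing score falls back to the documented default of 1.0.
  const raw = result.score ?? 1.0;
  // Clamp into the 0-1 range.
  return { ...result, score: Math.min(1, Math.max(0, raw)) };
}

// A 150-turn session scored as 1 - 150 / 100 gives -0.5, which clamps to 0.
console.log(normalizeEvalResult({ pass: false, score: 1 - 150 / 100 }).score); // 0
```

If clamping works this way, the `Math.max(0, ...)` guards in the examples are a readability choice rather than a requirement.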
854
-
855
- #### Examples
856
-
857
- ```js
858
- // Simple: check if session stayed under a turn budget
859
- app.eval('under-50-turns', ({ stats }) => ({
860
- pass: stats.turnCount <= 50,
861
- score: Math.max(0, 1 - stats.turnCount / 100),
862
- message: `${stats.turnCount} turn(s)`,
863
- }));
864
-
865
- // Check tool success rate
866
- app.eval('tool-success-rate', ({ entries }) => {
867
- const toolResults = entries.filter(e =>
868
- e.type === 'user' &&
869
- Array.isArray(e.message?.content) &&
870
- e.message.content.some(b => b.type === 'tool_result')
871
- );
872
- const errors = toolResults.filter(e =>
873
- e.message?.content?.some(b => b.is_error === true)
874
- );
875
- const rate = toolResults.length > 0
876
- ? 1 - (errors.length / toolResults.length)
877
- : 1;
878
- return {
879
- pass: rate >= 0.9,
880
- score: rate,
881
- message: `${errors.length}/${toolResults.length} tool errors`,
882
- };
883
- });
884
-
885
- // Check that the session ended with a text response
886
- app.eval('has-completion', ({ entries }) => {
887
- const lastAssistant = [...entries].reverse().find(e => e.type === 'assistant');
888
- const hasText = lastAssistant?.message?.content?.some?.(b => b.type === 'text');
889
- return {
890
- pass: !!hasText,
891
- score: hasText ? 1.0 : 0,
892
- message: hasText ? 'Session completed with text response' : 'No final text response',
893
- };
894
- });
895
-
896
- // With a per-eval condition: only run for longer sessions
897
- app.eval('under-budget',
898
- ({ stats }) => ({
899
- pass: stats.turnCount <= 30,
900
- score: Math.max(0, 1 - stats.turnCount / 60),
901
- message: `${stats.turnCount} turns`,
902
- }),
903
- { condition: ({ stats }) => stats.turnCount >= 5 }
904
- );
905
-
906
- // Subagent-scoped eval
907
- app.eval('explore-depth', ({ entries, source }) => {
908
- const myEntries = entries.filter(e => e._source === source);
909
- return {
910
- pass: myEntries.length > 5,
911
- score: Math.min(myEntries.length / 20, 1),
912
- };
913
- }, { scope: 'subagent', subagentType: 'Explore' });
914
- ```
915
-
916
- ---
917
-
918
- ### `app.enrich(name, fn, options?)`
919
-
920
- Register an enricher function. Enrichments compute key-value metadata displayed in the dashboard.
921
-
922
- - **`name`** - unique string identifier for the enricher
923
- - **`fn`** - function receiving an `EvalContext` and returning a `Record<string, string | number | boolean>`
924
- - **`options.condition`** - optional condition function to gate this enricher
925
- - **`options.scope`** - `'session'` (default), `'subagent'`, or `'both'`
926
- - **`options.subagentType`** - only run for subagents of this type (e.g. `'Explore'`)
927
-
928
- #### `EnrichmentResult`
929
-
930
- ```ts
931
- // Enrichers return a flat key-value map
932
- type EnrichmentResult = Record<string, string | number | boolean>;
933
- ```
934
-
935
- #### Examples
936
-
937
- ```js
938
- // Session overview
939
- app.enrich('overview', ({ stats }) => ({
940
- 'Turns': stats.turnCount,
941
- 'Tool Calls': stats.toolCallCount,
942
- 'Duration': stats.duration,
943
- 'Models': stats.models.join(', ') || 'none',
944
- }));
945
-
946
- // Token and cost breakdown
947
- app.enrich('token-usage', ({ entries }) => {
948
- const inputTokens = entries.reduce((s, e) => s + (e.usage?.input_tokens || 0), 0);
949
- const outputTokens = entries.reduce((s, e) => s + (e.usage?.output_tokens || 0), 0);
950
- return {
951
- 'Input Tokens': inputTokens,
952
- 'Output Tokens': outputTokens,
953
- 'Total Tokens': inputTokens + outputTokens,
954
- 'Est. Cost': `$${((inputTokens * 0.003 + outputTokens * 0.015) / 1000).toFixed(4)}`,
955
- };
956
- });
957
-
958
- // Error analysis (only when errors exist)
959
- app.enrich('error-analysis',
960
- ({ entries }) => {
961
- const errors = entries.filter(e => e.is_error === true);
962
- return {
963
- 'Total Errors': errors.length,
964
- 'Error Rate': `${((errors.length / entries.length) * 100).toFixed(1)}%`,
965
- };
966
- },
967
- { condition: ({ entries }) => entries.some(e => e.is_error === true) }
968
- );
969
-
970
- // Subagent info (only when subagents were spawned)
971
- app.enrich('subagent-info',
972
- ({ entries, stats }) => {
973
- const subagentEntries = entries.filter(e => e.type === 'assistant' && e.parentUuid);
974
- return {
975
- 'Subagent Count': stats.subagentCount,
976
- 'Subagent Entries': subagentEntries.length,
977
- };
978
- },
979
- { condition: ({ stats }) => stats.subagentCount > 0 }
980
- );
981
-
982
- // Advanced metrics (async condition)
983
- app.enrich('advanced-metrics',
984
- ({ entries }) => ({
985
- 'Entry Count': entries.length,
986
- 'Avg Entry Size': Math.round(
987
- entries.reduce((s, e) => s + JSON.stringify(e).length, 0) / entries.length
988
- ),
989
- }),
990
- {
991
- condition: async ({ entries }) => {
992
- return entries.length > 10;
993
- },
994
- }
995
- );
996
- ```
997
-
998
- ---
999
-
1000
- ### `app.action(name, fn, options?)`
1001
-
1002
- Register a user-defined action. Actions are a flexible primitive for on-demand tasks — generating summaries, exporting metrics, running side-effects, or anything else that doesn't fit the eval (pass/fail) or enrichment (key-value) model. Actions are never auto-run; they are triggered manually from the dashboard.
1003
-
1004
- - **`name`** - unique string identifier for the action
1005
- - **`fn`** - function receiving an `ActionContext` and returning an `ActionResult`
1006
- - **`options.condition`** - optional condition function to gate this action
1007
- - **`options.scope`** - `'session'` (default), `'subagent'`, or `'both'`
1008
- - **`options.subagentType`** - only run for subagents of this type (e.g. `'Explore'`)
1009
- - **`options.cache`** - cache results (default: `true`). Set to `false` for side-effect actions that should always re-run.
1010
- - **`options.inputs`** - optional array of `ActionInputField` definitions. When provided, the dashboard renders an inline form before running the action. Collected values are available via `context.inputs`. Actions without inputs work exactly as before.
1011
-
1012
- #### `ActionContext`
1013
-
1014
- Actions receive an extended context that includes cached eval and enrichment results:
1015
-
1016
- ```ts
1017
- interface ActionContext extends EvalContext {
1018
- evalResults: Record<string, EvalRunResult>; // Cached eval results for the session
1019
- enrichmentResults: Record<string, EnrichRunResult>; // Cached enrichment results
1020
- inputs: ActionInputValues; // Values collected from the input form (empty object if no inputs defined)
1021
- }
1022
- ```
1023
-
1024
- This means actions can build on prior analysis — check which evals passed, read enrichment data, and combine it with raw session data. When `options.inputs` is defined, `context.inputs` contains the user-provided values keyed by field name.
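
To make the relationship between field definitions and `context.inputs` concrete, here is one plausible way submitted form values could be merged with declared defaults. How the dashboard actually resolves defaults and required fields is not specified here, so `resolveInputs` and its error message are assumptions for illustration only.

```js
// Hypothetical: merge submitted form values over each field's declared default.
function resolveInputs(fields, submitted = {}) {
  const values = {};
  for (const field of fields) {
    const value = submitted[field.name] ?? field.default;
    if (value === undefined) {
      if (field.required) throw new Error(`Missing required input: ${field.name}`);
      continue; // optional field with no default stays absent
    }
    values[field.name] = value;
  }
  return values;
}

const fields = [
  { name: 'maxLines', type: 'number', label: 'Max lines', default: 20 },
  { name: 'includeEvals', type: 'boolean', label: 'Include eval results', default: true },
  { name: 'format', type: 'select', label: 'Output format' },
];

// Only maxLines was submitted; includeEvals falls back to its default.
console.log(resolveInputs(fields, { maxLines: 5 }));
```

Under this assumption the action body can still use `inputs.maxLines ?? 20` as a second line of defense, which is what the `custom-summary` example in this section does.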
1025
-
1026
- #### `ActionResult`
1027
-
1028
- ```ts
1029
- interface ActionResult {
1030
- output?: string; // Free-form text (rendered in monospace block with copy button)
1031
- status: 'success' | 'error'; // Action outcome
1032
- message?: string; // Short summary shown in the UI
1033
- }
1034
- ```
1035
-
1036
- Put any free-form text in `output`; it is rendered in a scrollable monospace block with a copy-to-clipboard button. The `status` field determines the icon in the results panel.
1037
-
1038
- #### Examples
1039
-
1040
- ```js
1041
- // Session summary: combine stats with eval results
1042
- app.action('session-summary', ({ stats, evalResults }) => {
1043
- const evalNames = Object.keys(evalResults);
1044
- const passCount = evalNames.filter(n => evalResults[n]?.pass).length;
1045
- return {
1046
- output: [
1047
- `Session: ${stats.turnCount} turns, ${stats.toolCallCount} tool calls`,
1048
- `Duration: ${stats.duration}`,
1049
- `Models: ${stats.models.join(', ') || 'unknown'}`,
1050
- `Evals: ${passCount}/${evalNames.length} passed`,
1051
- ].join('\n'),
1052
- status: 'success',
1053
- message: 'Summary generated',
1054
- };
1055
- });
1056
-
1057
- // Export metrics: gather enrichment data into a text report
1058
- app.action('export-metrics', ({ stats, enrichmentResults }) => {
1059
- const enrichData = {};
1060
- for (const [name, result] of Object.entries(enrichmentResults)) {
1061
- if (result.data) Object.assign(enrichData, result.data);
1062
- }
1063
- const lines = [
1064
- ...Object.entries(enrichData).map(([k, v]) => `${k}: ${v}`),
1065
- `turnCount: ${stats.turnCount}`,
1066
- `toolCallCount: ${stats.toolCallCount}`,
1067
- ];
1068
- return {
1069
- output: lines.join('\n'),
1070
- status: 'success',
1071
- message: `Exported ${Object.keys(enrichData).length + 2} metrics`,
1072
- };
1073
- });
1074
-
1075
- // Action with user-defined inputs: prompt for parameters before running
1076
- app.action('custom-summary', ({ stats, evalResults, inputs }) => {
1077
- const maxLines = inputs.maxLines ?? 20;
1078
- const includeEvals = inputs.includeEvals ?? true;
1079
- const lines = [`Session: ${stats.turnCount} turns, ${stats.toolCallCount} tool calls`];
1080
- if (includeEvals) {
1081
- const evalNames = Object.keys(evalResults);
1082
- const passCount = evalNames.filter(n => evalResults[n]?.pass).length;
1083
- lines.push(`Evals: ${passCount}/${evalNames.length} passed`);
1084
- }
1085
- return { output: lines.slice(0, maxLines).join('\n'), status: 'success' };
1086
- }, {
1087
- inputs: [
1088
- { name: 'maxLines', type: 'number', label: 'Max lines', default: 20 },
1089
- { name: 'includeEvals', type: 'boolean', label: 'Include eval results', default: true },
1090
- { name: 'format', type: 'select', label: 'Output format', options: [
1091
- { label: 'Plain text', value: 'text' },
1092
- { label: 'Markdown', value: 'markdown' },
1093
- ]},
1094
- ],
1095
- });
1096
-
1097
- // Side-effect action: write results to a file (disable caching)
1098
- app.action('write-report', async ({ projectName, sessionId, stats, evalResults }) => {
1099
- const fs = await import('fs/promises');
1100
- const report = {
1101
- projectName, sessionId,
1102
- turns: stats.turnCount,
1103
- evals: Object.fromEntries(
1104
- Object.entries(evalResults).map(([name, r]) => [name, { pass: r.pass, score: r.score }])
1105
- ),
1106
- timestamp: new Date().toISOString(),
1107
- };
1108
- await fs.appendFile('session-reports.jsonl', JSON.stringify(report) + '\n');
1109
- return {
1110
- status: 'success',
1111
- message: `Report appended to session-reports.jsonl`,
1112
- output: 'Written to session-reports.jsonl',
1113
- };
1114
- }, { cache: false });
1115
-
1116
- // Conditional action: only available for sessions with tool usage
1117
- app.action('tool-analysis', ({ entries, source }) => {
1118
- const toolUses = entries.filter(e =>
1119
- e._source === (source || 'session') &&
1120
- e.type === 'assistant' &&
1121
- Array.isArray(e.message?.content) &&
1122
- e.message.content.some(b => b.type === 'tool_use')
1123
- );
1124
- const toolNames = [...new Set(toolUses.flatMap(e =>
1125
- (e.message?.content || []).filter(b => b.type === 'tool_use').map(b => b.name)
1126
- ))];
1127
- return {
1128
- output: toolNames.length > 0
1129
- ? `Tools used: ${toolNames.join(', ')}`
1130
- : 'No tools used',
1131
- status: 'success',
1132
- };
1133
- }, { condition: ({ stats }) => stats.toolCallCount > 0 });
1134
-
1135
- // Subagent-scoped action
1136
- app.action('agent-report', ({ entries, source, stats }) => {
1137
- const myEntries = entries.filter(e => e._source === source);
1138
- return {
1139
- output: [
1140
- `Source: ${source}`,
1141
- `Entries: ${myEntries.length}`,
1142
- `Turns: ${stats.turnCount}`,
1143
- `Tool calls: ${stats.toolCallCount}`,
1144
- ].join('\n'),
1145
- status: 'success',
1146
- message: `Agent ${source}: ${myEntries.length} entries`,
1147
- };
1148
- }, { scope: 'subagent' });
1149
- ```
1150
-
1151
- #### Action UI Behavior
1152
-
1153
- The Actions panel appears on session pages (and in expanded subagent cards for subagent-scoped actions) when matching actions are registered. The panel:
1154
-
1155
- - Shows all registered actions as idle (not run) on initial load, with cached results displayed immediately
1156
- - Provides a **Run** button per action and a **Run All** button in the header
1157
- - For actions with `inputs`, clicking **Run** (or **Run All**) opens an inline form to collect values before executing. When re-running a completed action, the form is pre-filled with the previous values so the user can review or modify them.
1158
- - Displays `output` in a scrollable monospace block with a copy-to-clipboard button
1159
- - Shows a "cached" badge for results served from cache
1160
- - Collapses by default (click the header to expand)
1161
- - Error count, success count, and loading spinners are visible in the header even when collapsed
1162
-
1163
- #### Action Types
1164
-
1165
- ```ts
1166
- type ActionFunction = (context: ActionContext) => ActionResult | Promise<ActionResult>;
1167
-
1168
- interface ActionInputField {
1169
- name: string;
1170
- type: 'string' | 'number' | 'boolean' | 'select';
1171
- label?: string; // Display label (defaults to name)
1172
- required?: boolean;
1173
- default?: string | number | boolean;
1174
- options?: Array<{ label: string; value: string }>; // For 'select' type only
1175
- }
1176
-
1177
- type ActionInputValues = Record<string, string | number | boolean>;
1178
-
1179
- interface RegisteredAction {
1180
- name: string;
1181
- fn: ActionFunction;
1182
- condition?: ConditionFunction;
1183
- scope: EvalScope; // 'session' | 'subagent' | 'both'
1184
- subagentType?: string;
1185
- cache: boolean; // default: true
1186
- inputs?: ActionInputField[];
1187
- }
1188
-
1189
- interface ActionRunResult {
1190
- name: string;
1191
- output?: string;
1192
- status: 'success' | 'error';
1193
- message?: string;
1194
- durationMs: number;
1195
- error?: string; // Present when the action threw or its condition threw
1196
- skipped?: boolean; // Present when the global/per-action condition returned false
1197
- inputs?: ActionInputValues; // Snapshot of input values used for this run
1198
- }
1199
-
1200
- interface ActionRunSummary {
1201
- results: ActionRunResult[];
1202
- totalDurationMs: number;
1203
- errorCount: number;
1204
- skippedCount: number;
1205
- }
1206
- ```
1207
-
1208
- ---
1209
-
1210
- ### `app.alert(name, fn, options?)`
1211
-
1212
- Register an alert callback that fires after all evals and enrichments complete for a session. Alerts are the hook point for Slack webhooks, CI notifications, logging, or any post-processing logic.
1213
-
1214
- - **`name`** - unique string identifier for the alert (re-registering replaces the previous callback)
1215
- - **`fn`** - function receiving an `AlertContext` and returning `void | Promise<void>`
1216
- - **`options`** - optional `AlertOptions` object:
1217
- - **`suppressOnRecompute`** (default: `true`) — when `true`, this alert is suppressed during recomputes triggered by code changes (e.g., modifying an eval function). Alerts still fire when new session data arrives. Set to `false` for alerts that should always fire regardless of the trigger.
1218
-
1219
- #### When Alerts Fire
1220
-
1221
- Alerts fire **once per session content version** — the unified queue checks after each item completes whether all eval/enrichment work for that session is done (no pending or processing items remain). When the last item completes, a debounced check fires alerts if the dedup marker allows it.
1222
-
1223
- | Trigger | Behavior |
1224
- |---------|----------|
1225
- | **Initial page load** | `queuePerItem()` per eval/enrichment → alerts fire when last item completes |
1226
- | **Background processing** | `scanAndEnqueue()` → individual items at LOW priority → alerts fire when last item completes |
1227
- | **Page reload (all cached)** | No items enter the queue → no alerts fire |
1228
- | **Re-run All** | Clears dedup marker first, then parallel `queuePerItem()` calls → alerts fire exactly once |
1229
- | **Re-run single** | `queuePerItem()` for one item → alerts do NOT re-fire (dedup marker still valid) |
1230
- | **Session content changes** | New `contentHash` invalidates the dedup marker → alerts fire again |
1231
- | **Alert registrations change** | New `alertsHash` invalidates the dedup marker → alerts fire again |
1232
-
1233
- Each alert callback is individually try/caught via `Promise.allSettled`. A throwing alert never blocks other alerts or eval processing. Errors are logged to console.
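
The isolation that `Promise.allSettled` provides can be sketched as follows. `runAlerts` and the `{ name, fn }` record shape are stand-ins for this example (loosely modeled on the `RegisteredAlert` type) and should not be read as Claudeye internals.

```js
// Hypothetical runner: invoke every alert, isolating failures from one another.
async function runAlerts(alerts, context) {
  const settled = await Promise.allSettled(
    // Wrapping in an async arrow turns synchronous throws into rejections too.
    alerts.map(async ({ fn }) => fn(context))
  );
  const failures = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === 'rejected') {
      failures.push({ name: alerts[i].name, reason: outcome.reason });
      console.error(`[alert:${alerts[i].name}] failed:`, outcome.reason);
    }
  });
  return failures;
}

const delivered = [];
const alerts = [
  { name: 'broken', fn: () => { throw new Error('webhook down'); } },
  { name: 'log-results', fn: ({ sessionId }) => { delivered.push(sessionId); } },
];

// The throwing alert is reported, and the second alert still runs.
runAlerts(alerts, { projectName: 'demo', sessionId: 's-1' })
  .then((failures) => console.log(failures.map((f) => f.name), delivered));
```

Unlike `Promise.all`, `Promise.allSettled` never rejects early, so one failing webhook cannot prevent the remaining callbacks from completing.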
1234
-
1235
- #### `AlertContext`
1236
-
1237
- ```ts
1238
- interface AlertContext {
1239
- projectName: string; // Encoded project folder name
1240
- sessionId: string; // Session UUID
1241
- evalSummary?: EvalRunSummary; // Present when evals registered & ran
1242
- enrichSummary?: EnrichRunSummary; // Present when enrichments registered & ran
1243
- }
1244
- ```
1245
-
1246
- `evalSummary` and `enrichSummary` contain **all** results for the session — both cached and freshly computed. The alert always sees the complete picture.
1247
-
1248
- #### Examples
1249
-
1250
- ```js
1251
- // Slack webhook on eval failure
1252
- app.alert('slack-on-failure', async ({ projectName, sessionId, evalSummary }) => {
1253
- if (evalSummary && evalSummary.failCount > 0) {
1254
- await fetch('https://hooks.slack.com/services/T.../B.../xxx', {
1255
- method: 'POST',
1256
- headers: { 'Content-Type': 'application/json' },
1257
- body: JSON.stringify({
1258
- text: `${evalSummary.failCount} evals failed for ${projectName}/${sessionId}`,
1259
- }),
1260
- });
1261
- }
1262
- });
1263
-
1264
- // Console logging
1265
- app.alert('log-results', ({ projectName, sessionId, evalSummary, enrichSummary }) => {
1266
- const evals = evalSummary
1267
- ? `${evalSummary.passCount} pass, ${evalSummary.failCount} fail, ${evalSummary.errorCount} error`
1268
- : 'no evals';
1269
- const enrichments = enrichSummary
1270
- ? `${enrichSummary.results.length} enrichments (${enrichSummary.errorCount} errors)`
1271
- : 'no enrichments';
1272
- console.log(`[ALERT] ${projectName}/${sessionId}: ${evals} | ${enrichments}`);
1273
- });
1274
-
1275
- // Write results to a file for CI
1276
- app.alert('ci-report', async ({ projectName, sessionId, evalSummary }) => {
1277
- if (!evalSummary) return;
1278
- const fs = await import('fs/promises');
1279
- await fs.appendFile('eval-results.jsonl', JSON.stringify({
1280
- projectName,
1281
- sessionId,
1282
- passCount: evalSummary.passCount,
1283
- failCount: evalSummary.failCount,
1284
- results: evalSummary.results.map(r => ({ name: r.name, pass: r.pass, score: r.score })),
1285
- timestamp: new Date().toISOString(),
1286
- }) + '\n');
1287
- });
1288
-
1289
- // Alert that fires even on code-change recomputes (e.g., audit logging)
1290
- app.alert('audit-log', ({ projectName, sessionId }) => {
1291
- console.log(`[AUDIT] Processed ${projectName}/${sessionId}`);
1292
- }, { suppressOnRecompute: false });
1293
- ```
1294
-
1295
- #### Alert Types
1296
-
1297
- ```ts
1298
- type AlertFunction = (context: AlertContext) => void | Promise<void>;
1299
-
1300
- interface AlertOptions {
1301
- suppressOnRecompute?: boolean; // Default: true
1302
- }
1303
-
1304
- interface RegisteredAlert {
1305
- name: string;
1306
- fn: AlertFunction;
1307
- suppressOnRecompute: boolean;
1308
- }
1309
-
1310
- interface EvalRunSummary {
1311
- results: EvalRunResult[];
1312
- totalDurationMs: number;
1313
- passCount: number;
1314
- failCount: number;
1315
- errorCount: number;
1316
- skippedCount: number;
1317
- }
1318
-
1319
- interface EnrichRunSummary {
1320
- results: EnrichRunResult[];
1321
- totalDurationMs: number;
1322
- errorCount: number;
1323
- skippedCount: number;
1324
- }
1325
- ```
1326
-
1327
- ---
1328
-
1329
- ### `app.dashboard.view(name, options?)`
1330
-
1331
- Create a named dashboard view. Views group related filters into focused sets. Each view appears as a card on `/dashboard` and has its own route at `/dashboard/[viewName]`.
1332
-
1333
- - **`name`** - unique string identifier for the view
1334
- - **`options.label`** - human-readable label displayed on the card (defaults to the name)
1335
- - **`options.cachedOnly`** - only show sessions with complete cache entries (default: `false`)
1336
-
1337
- Returns a `DashboardViewBuilder` with chainable `.filter()` and `.aggregate()` methods.
1338
-
1339
- #### `DashboardViewBuilder`
1340
-
1341
- ```ts
1342
- interface DashboardViewBuilder {
1343
- filter(name: string, fn: FilterFunction, options?: FilterOptions): DashboardViewBuilder;
1344
- filter(options: { preBuilt: string[] }): DashboardViewBuilder;
1345
- aggregate(name: string, definition: AggregateDefinition, options?: AggregateOptions): DashboardViewBuilder;
1346
- }
1347
- ```
1348
-
1349
- The view builder's `.filter()` returns the view builder (not the app), so you can chain multiple filters within a view:
1350
-
1351
- ```js
1352
- app.dashboard.view('performance', { label: 'Performance Metrics' })
1353
- .filter('turn-count', ({ stats }) => stats.turnCount, { label: 'Turn Count' })
1354
- .filter('tool-calls', ({ stats }) => stats.toolCallCount, { label: 'Tool Calls' });
1355
- ```
1356
-
1357
- #### `ViewOptions`
1358
-
1359
- ```ts
1360
- interface ViewOptions {
1361
- label?: string; // Human-readable label (defaults to name)
1362
- cachedOnly?: boolean; // Only show sessions with complete cache entries (default: false)
1363
- }
1364
- ```
1365
-
1366
- #### `cachedOnly` Views
1367
-
1368
- When `cachedOnly: true`, uncached sessions are **pre-filtered** before any filter computation runs — sessions without complete cache entries are skipped entirely. This avoids expensive log parsing, eval cache reads, and filter function execution for uncached sessions. Sessions appear in the view as the background queue processes them.
1369
-
1370
- Filter functions in `cachedOnly` views receive `ctx.evalResults` — a record of cached eval results keyed by eval name. Each entry has `pass`, `score`, and optional `error`/`message` fields. This lets you create per-eval score filters (range sliders) and composite boolean filters without re-running evals.
1371
-
1372
- ```js
1373
- // Only show sessions that have been fully processed
1374
- app.dashboard.view('quality', { cachedOnly: true })
1375
- .filter('has-errors', ({ entries }) =>
1376
- entries.some(e => e.type === 'assistant' &&
1377
- Array.isArray(e.message?.content) &&
1378
- e.message.content.some(b => b.type === 'tool_use' && b.is_error)),
1379
- { label: 'Has Errors' });
1380
-
1381
- // Per-eval score filters using evalResults
1382
- app.dashboard.view('eval-results', { cachedOnly: true, label: 'Eval Score Filters' })
1383
- .filter('quality-score', (ctx) => ctx.evalResults?.['quality']?.score ?? 0,
1384
- { label: 'Quality Score' })
1385
- .filter('all-passing', (ctx) => {
1386
- if (!ctx.evalResults) return false;
1387
- return Object.values(ctx.evalResults).every(r => r.pass);
1388
- }, { label: 'All Evals Passing' });
1389
- ```
1390
-
1391
- #### Routing
1392
-
1393
- | URL | Behavior |
1394
- |-----|----------|
1395
- | `/dashboard` | If named views exist, shows a view index (card grid). If only default filters, shows them directly. If nothing registered, shows an empty state. |
1396
- | `/dashboard/[viewName]` | Specific named view with its filters and sessions table. |
1397
-
1398
- #### Examples
1399
-
1400
- ```js
1401
- // Two focused views
1402
- app.dashboard.view('performance', { label: 'Performance Metrics' })
1403
- .filter('turn-count', ({ stats }) => stats.turnCount, { label: 'Turn Count' })
1404
- .filter('tool-calls', ({ stats }) => stats.toolCallCount, { label: 'Tool Calls' });
1405
-
1406
- app.dashboard.view('quality', { label: 'Quality Checks' })
1407
- .filter('has-errors', ({ entries }) =>
1408
- entries.some(e => e.type === 'assistant' &&
1409
- Array.isArray(e.message?.content) &&
1410
- e.message.content.some(b => b.type === 'tool_use' && b.is_error)),
1411
- { label: 'Has Errors' })
1412
- .filter('primary-model', ({ stats }) => stats.models[0] || 'unknown',
1413
- { label: 'Primary Model' });
1414
-
1415
- // Backward-compat: app.dashboard.filter() still works (goes to "default" view)
1416
- app.dashboard.filter('uses-subagents', ({ stats }) => stats.subagentCount > 0,
1417
- { label: 'Uses Subagents' }
1418
- );
1419
- ```
1420
-
1421
- ---
1422
-
1423
- ### `app.dashboard.filter(name, fn, options?)`
1424
-
1425
- Register a dashboard filter on the **default** view. For organizing filters into named views, see `app.dashboard.view()` above.
1426
-
1427
- - **`name`** - unique string identifier for the filter
1428
- - **`fn`** - function receiving an `EvalContext` and returning a `FilterValue` (`boolean`, `number`, or `string`)
1429
- - **`options.label`** - human-readable label for the filter tile (defaults to the name)
1430
- - **`options.condition`** - optional condition function to gate this filter
1431
-
1432
- The return type auto-determines the UI control:
1433
-
1434
- | Return type | UI control | Behavior |
1435
- |-------------|-----------|----------|
1436
- | `boolean` | Three-state toggle | Cycle: All &rarr; Yes &rarr; No &rarr; All |
1437
- | `number` | Range slider | Dual-handle slider with min/max inputs. Step auto-adapts: integer data uses steps of 1/5/10; float data uses 0.01 (range&le;1), 0.1 (range&le;10), or 1 |
1438
- | `string` | Multi-select dropdown | Checkboxes with Select All / Clear |
1439
-
1440
- Filter values are computed server-side with an incremental index (only new/changed sessions are reprocessed). Filtering and pagination happen server-side, returning only the matching page of results.
1441
-
1442
- #### Examples
1443
-
1444
- ```js
1445
- // Boolean filter: toggle sessions that have tool errors
1446
- app.dashboard.filter('has-errors', ({ entries }) =>
1447
- entries.some(e =>
1448
- e.type === 'assistant' &&
1449
- Array.isArray(e.message?.content) &&
1450
- e.message.content.some(b => b.type === 'tool_use' && b.is_error)
1451
- ),
1452
- { label: 'Has Errors' }
1453
- );
1454
-
1455
- // Number filter: range slider for turn count
1456
- app.dashboard.filter('turn-count', ({ stats }) => stats.turnCount,
1457
- { label: 'Turn Count' }
1458
- );
1459
-
1460
- // String filter: multi-select for primary model
1461
- app.dashboard.filter('primary-model', ({ stats }) => stats.models[0] || 'unknown',
1462
- { label: 'Primary Model' }
1463
- );
1464
-
1465
- // Number filter: range slider for tool call count
1466
- app.dashboard.filter('tool-calls', ({ stats }) => stats.toolCallCount,
1467
- { label: 'Tool Calls' }
1468
- );
1469
-
1470
- // Boolean filter: sessions with subagents
1471
- app.dashboard.filter('uses-subagents', ({ stats }) => stats.subagentCount > 0,
1472
- { label: 'Uses Subagents' }
1473
- );
1474
-
1475
- // String filter: session duration bucket
1476
- app.dashboard.filter('duration-bucket', ({ stats }) => {
1477
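- const ms = [...String(stats.duration).matchAll(/(\d+)\s*([hms])/g)].reduce((t, [, n, u]) => t + Number(n) * { h: 3600000, m: 60000, s: 1000 }[u], 0); // duration is formatted text such as "2m 15s" (h/m/s units assumed), so parseInt alone would not yield milliseconds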
1478
- if (ms < 60000) return 'Under 1m';
1479
- if (ms < 300000) return '1-5m';
1480
- if (ms < 900000) return '5-15m';
1481
- return 'Over 15m';
1482
- }, { label: 'Duration' });
1483
-
1484
- // With a per-filter condition: only compute for non-empty sessions
1485
- app.dashboard.filter('avg-tools-per-turn',
1486
- ({ stats }) => stats.turnCount > 0
1487
- ? Math.round(stats.toolCallCount / stats.turnCount * 10) / 10
1488
- : 0,
1489
- {
1490
- label: 'Avg Tools/Turn',
1491
- condition: ({ entries }) => entries.length > 0,
1492
- }
1493
- );
1494
- ```
1495
-
1496
- #### How Filters Work
1497
-
1498
- 1. When the `/dashboard` page loads, the server action discovers all projects and sessions
1499
- 2. An incremental `DashboardIndex` diffs the discovered sessions against previously computed rows — only new or changed sessions are processed (unchanged sessions are skipped entirely)
1500
- 3. For new/changed sessions, it checks the per-session disk cache first, then falls back to parsing the JSONL log and running filters
1501
- 4. Filter metadata (min/max for numbers, unique values for strings) is rebuilt from accumulators only when the session set changes
1502
- 5. Server-side filtering and pagination are applied — only the matching page of results is sent to the client
1503
- 6. User interactions (toggle, slider, dropdown) trigger a debounced (300ms) server re-fetch with the new filter state
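
Step 2 above (diffing discovered sessions against previously computed rows) could look roughly like the sketch below. The `diffSessions` helper, the `project/session` key format, and the use of `lastModified` as the change signal are assumptions for illustration, not the actual `DashboardIndex` implementation.

```javascript
// Hypothetical incremental diff: decide which sessions need (re)processing.
function diffSessions(index, discovered) {
  const toProcess = [];
  const seen = new Set();
  for (const s of discovered) {
    const key = `${s.projectName}/${s.sessionId}`;
    seen.add(key);
    const prev = index.get(key);
    // New sessions and sessions whose timestamp changed are reprocessed;
    // everything else is skipped.
    if (!prev || prev.lastModified !== s.lastModified) toProcess.push(s);
  }
  // Rows for sessions that no longer exist on disk can be dropped.
  const removed = [...index.keys()].filter((key) => !seen.has(key));
  return { toProcess, removed };
}

const previousIndex = new Map([
  ['p/a', { lastModified: 't1' }],
  ['p/b', { lastModified: 't1' }],
  ['p/gone', { lastModified: 't0' }],
]);
const sessionDiff = diffSessions(previousIndex, [
  { projectName: 'p', sessionId: 'a', lastModified: 't1' }, // unchanged
  { projectName: 'p', sessionId: 'b', lastModified: 't2' }, // changed
  { projectName: 'p', sessionId: 'c', lastModified: 't1' }, // new
]);
```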
1504
-
1505
- #### Global Condition
1506
-
1507
- Dashboard filters respect the global condition set via `app.condition()`. If the global condition returns `false` for a session, all filters are skipped for that session.
1508
-
1509
- ```js
1510
- // Skip empty sessions across evals, enrichments, AND dashboard filters
1511
- app.condition(({ entries }) => entries.length > 0);
1512
- ```
1513
-
1514
- ---
1515
-
1516
- ### `app.dashboard.filter({ preBuilt })`
1517
-
1518
- Enable pre-built filters on the dashboard. Pre-built filters provide common filtering functionality with optimized UX — no custom filter function needed.
1519
-
1520
- - **`preBuilt`** - array of pre-built filter names to enable
1521
-
1522
- #### Available Pre-Built Filters
1523
-
1524
- | Name | UI control | Description |
1525
- |------|-----------|-------------|
1526
- | `'lastModified'` | Date range picker (from/to) | Filters sessions by their last modified date |
1527
-
1528
- #### Examples
1529
-
1530
- ```js
1531
- // Enable on the default view
1532
- app.dashboard.filter({ preBuilt: ['lastModified'] });
1533
-
1534
- // Combine with custom filters
1535
- app.dashboard.filter({ preBuilt: ['lastModified'] });
1536
- app.dashboard.filter('model', ({ stats }) => stats.models[0] || 'unknown',
1537
- { label: 'Model' });
1538
-
1539
- // Enable on a named view
1540
- app.dashboard.view('overview', { label: 'Overview' })
1541
- .filter({ preBuilt: ['lastModified'] })
1542
- .filter('turns', ({ stats }) => stats.turnCount, { label: 'Turns' });
1543
- ```
1544
-
1545
- Pre-built date filters operate directly on the `DashboardSessionRow.lastModified` field — they don't require parsing session logs or running a filter function. The date range (min/max) is computed from the observed session data. Filtering is done server-side using normalized server-local calendar dates.
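
As a rough illustration of filtering on normalized server-local calendar dates, the sketch below converts each `lastModified` timestamp to a local `YYYY-MM-DD` string and compares strings. The helper names `toLocalDay` and `inDateRange` are hypothetical, and the real filter may normalize differently.

```javascript
// Convert an ISO 8601 timestamp to a local calendar day (YYYY-MM-DD).
function toLocalDay(iso) {
  const d = new Date(iso);
  const pad = (n) => String(n).padStart(2, '0');
  return `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}`;
}

// Inclusive range check; an omitted bound leaves that side open.
// Zero-padded YYYY-MM-DD strings sort chronologically, so string
// comparison is sufficient.
function inDateRange(row, from, to) {
  const day = toLocalDay(row.lastModified);
  return (!from || day >= from) && (!to || day <= to);
}

// A session modified late in the evening, local time.
const lateRow = { lastModified: new Date(2024, 0, 15, 23, 30).toISOString() };
const sameDay = inDateRange(lateRow, '2024-01-15', '2024-01-15');
const nextDay = inDateRange(lateRow, '2024-01-16');
```

Comparing calendar days rather than raw timestamps keeps a session modified at 23:30 inside a range that ends on that same day.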
1546
-
1547
- ---
1548
-
1549
- ### `app.dashboard.aggregate(name, definition, options?)`
1550
-
1551
- Register a cross-session aggregate on the **default** view. For organizing aggregates into named views, use `app.dashboard.view().aggregate()`.
1552
-
1553
- - **`name`** - unique string identifier for the aggregate
1554
- - **`definition`** - a `{ collect, reduce }` object
1555
- - **`options.label`** - human-readable label for the aggregate section (defaults to the name)
1556
- - **`options.condition`** - optional condition function to gate this aggregate per session
1557
-
1558
- The `collect` function runs once per session and returns a record of values; `reduce` receives every collected session and turns them into the rows of your output table.
1559
-
1560
- #### Examples
1561
-
1562
- ```js
1563
- // Eval pass rate summary table
1564
- app.dashboard.aggregate('eval-summary', {
1565
- collect: ({ evalResults }) => {
1566
- const result = {};
1567
- for (const [name, r] of Object.entries(evalResults)) {
1568
- result[`${name}_pass`] = r.pass;
1569
- result[`${name}_score`] = r.score;
1570
- }
1571
- return result;
1572
- },
1573
- reduce: (collected) => {
1574
- const evalNames = new Set();
1575
- for (const s of collected) {
1576
- for (const key of Object.keys(s.values)) {
1577
- if (key.endsWith('_pass')) evalNames.add(key.slice(0, -'_pass'.length));
1578
- }
1579
- }
1580
- return Array.from(evalNames).map(name => ({
1581
- 'Eval': name,
1582
- 'Pass Rate': collected.filter(s => s.values[`${name}_pass`]).length / collected.length,
1583
- 'Avg Score': collected.reduce((sum, s) => {
1584
- const v = s.values[`${name}_score`];
1585
- return sum + (typeof v === 'number' ? v : 0);
1586
- }, 0) / collected.length,
1587
- }));
1588
- },
1589
- });
1590
-
1591
- // Aggregates on named views alongside filters
1592
- app.dashboard.view('quality', { label: 'Quality' })
1593
- .aggregate('session-metrics', {
1594
- collect: ({ stats }) => ({
1595
- turnCount: stats.turnCount,
1596
- toolCalls: stats.toolCallCount,
1597
- }),
1598
- reduce: (collected) => {
1599
- const n = collected.length || 1;
1600
- let turns = 0, tools = 0;
1601
- for (const s of collected) {
1602
- turns += typeof s.values.turnCount === 'number' ? s.values.turnCount : 0;
1603
- tools += typeof s.values.toolCalls === 'number' ? s.values.toolCalls : 0;
1604
- }
1605
- return [
1606
- { Metric: 'Avg Turns', Value: +(turns / n).toFixed(1) },
1607
- { Metric: 'Avg Tool Calls', Value: +(tools / n).toFixed(1) },
1608
- ];
1609
- },
1610
- })
1611
- .filter('turns', ({ stats }) => stats.turnCount, { label: 'Turns' });
1612
- ```
1613
-
1614
- #### `AggregateContext`
1615
-
1616
- The collect function receives an extended context:
1617
-
1618
- ```ts
1619
- interface AggregateContext {
1620
- entries: Record<string, unknown>[]; // Raw JSONL lines
1621
- stats: EvalLogStats; // Computed stats
1622
- projectName: string;
1623
- sessionId: string;
1624
- source: string;
1625
- evalResults: Record<string, { pass: boolean; score: number; error?: string; message?: string }>;
1626
- enrichResults: Record<string, Record<string, EnrichmentValue>>;
1627
- filterValues: Record<string, FilterValue>;
1628
- }
1629
- ```
1630
-
1631
- #### Aggregate Types
1632
-
1633
- ```ts
1634
- type AggregateValue = boolean | number | string;
1635
-
1636
- type AggregateCollectFunction = (
1637
- context: AggregateContext,
1638
- ) => Record<string, AggregateValue> | Promise<Record<string, AggregateValue>>;
1639
-
1640
- type AggregateReduceFunction = (
1641
- collected: CollectedSession[],
1642
- ) => AggregateTableRow[] | Promise<AggregateTableRow[]>;
1643
-
1644
- type AggregateDefinition = {
1645
- collect: AggregateCollectFunction;
1646
- reduce: AggregateReduceFunction;
1647
- };
1648
-
1649
- interface AggregateOptions {
1650
- label?: string;
1651
- condition?: ConditionFunction;
1652
- }
1653
-
1654
- interface CollectedSession {
1655
- projectName: string;
1656
- sessionId: string;
1657
- values: Record<string, AggregateValue>;
1658
- }
1659
-
1660
- type AggregateTableRow = Record<string, AggregateValue>;
1661
-
1662
- interface AggregatePayload {
1663
- aggregates: {
1664
- name: string;
1665
- label: string;
1666
- rows: AggregateTableRow[];
1667
- columns: string[];
1668
- }[];
1669
- totalSessions: number;
1670
- totalDurationMs: number;
1671
- }
1672
- ```
1673
-
1674
- ---
1675
-
1676
- ### `app.auth(options)`
1677
-
1678
- Configure username/password authentication. When at least one user is configured (via `app.auth()`, `--auth-user`, or `CLAUDEYE_AUTH_USERS` env var), all UI routes are protected by a login page. Users from all sources are merged.
1679
-
1680
- - **`options.users`** - array of `{ username: string; password: string }` objects
1681
-
1682
- ```ts
1683
- app.auth({ users: [
1684
- { username: 'admin', password: 'secret' },
1685
- { username: 'viewer', password: 'readonly' },
1686
- ] });
1687
- ```
1688
-
1689
- Chainable — returns the app instance:
1690
-
1691
- ```js
1692
- app
1693
- .auth({ users: [{ username: 'admin', password: 'secret' }] })
1694
- .eval('my-eval', fn)
1695
- .listen();
1696
- ```
1697
-
1698
- When auth is active:
1699
- - All UI routes redirect to `/login` for unauthenticated users
1700
- - A signed HMAC-SHA256 session cookie (`claudeye_session`) is set on login, with 24h expiry
1701
- - The navbar shows a **Sign out** button
1702
- - If no users are configured, auth is completely disabled (no login page, no blocking)
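
For readers unfamiliar with signed session cookies, here is a minimal sketch of the general technique using Node's built-in `crypto` module. The cookie layout (`payload.signature`), the `SECRET` constant, and the `signSession`/`verifySession` helpers are assumptions for illustration; the README does not specify Claudeye's actual cookie format.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

const SECRET = 'replace-with-a-random-secret'; // assumption: server-side secret
const DAY_MS = 24 * 60 * 60 * 1000;            // 24h expiry, as described above

function signSession(username, now = Date.now()) {
  const payload = Buffer.from(JSON.stringify({ username, exp: now + DAY_MS }))
    .toString('base64url');
  const mac = createHmac('sha256', SECRET).update(payload).digest('base64url');
  return `${payload}.${mac}`;
}

function verifySession(cookie, now = Date.now()) {
  const [payload, mac] = String(cookie).split('.');
  if (!payload || !mac) return null;
  const expected = createHmac('sha256', SECRET).update(payload).digest();
  const given = Buffer.from(mac, 'base64url');
  // Constant-time comparison; lengths must match before timingSafeEqual.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const session = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
  return session.exp > now ? session : null;
}

const cookie = signSession('admin', 0);
const valid = verifySession(cookie, 1000);
const expired = verifySession(cookie, DAY_MS + 1);
// Swap in a payload for another user while keeping the old signature.
const forged = verifySession(`${signSession('root', 0).split('.')[0]}.${cookie.split('.')[1]}`, 1000);
```

The signature is checked before the payload is parsed, so a forged or altered cookie is rejected without trusting any of its contents.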
1703
-
1704
- #### Multiple Sources
1705
-
1706
- Users from CLI, environment, and API are merged:
1707
-
1708
- ```bash
1709
- # CLI
1710
- claudeye --evals ./my-evals.js --auth-user ops:pass123
1711
-
1712
- # Environment (comma-separated user:password pairs)
1713
- CLAUDEYE_AUTH_USERS=admin:secret claudeye --evals ./my-evals.js
1714
-
1715
- # API (in my-evals.js)
1716
- app.auth({ users: [{ username: 'dev', password: 'devpass' }] });
1717
- ```
1718
-
1719
- All three users (`ops`, `admin`, `dev`) would be valid.
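
A sketch of how the three sources might be combined is shown below. `parseUserPairs` and `mergeUsers` are hypothetical helpers, and the rule that a later source wins on a duplicate username is an assumption, since the README only says the users are merged.

```javascript
// Parse "user:password,user2:password2" into user objects.
function parseUserPairs(value) {
  return String(value || '')
    .split(',')
    .map((pair) => pair.trim())
    // Require a non-empty username before the first colon.
    .filter((pair) => pair.indexOf(':') > 0)
    .map((pair) => {
      const i = pair.indexOf(':'); // first colon, so passwords may contain ':'
      return { username: pair.slice(0, i), password: pair.slice(i + 1) };
    });
}

// Merge any number of user lists; on duplicate usernames the later list
// wins (an assumption for this sketch).
function mergeUsers(...sources) {
  const byName = new Map();
  for (const user of sources.flat()) byName.set(user.username, user);
  return [...byName.values()];
}

const mergedUsers = mergeUsers(
  parseUserPairs('ops:pass123'),                 // --auth-user
  parseUserPairs('admin:secret'),                // CLAUDEYE_AUTH_USERS
  [{ username: 'dev', password: 'devpass' }],    // app.auth()
);
```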
1720
-
1721
- ---
1722
-
1723
- ### `app.listen(port?, options?)`
1724
-
1725
- Start the Claudeye dashboard server.
1726
-
1727
- - **`port`** - port number (default: 8020)
1728
- - **`options.host`** - bind address (default: `"localhost"`, use `"0.0.0.0"` for LAN)
1729
- - **`options.open`** - auto-open browser (default: `true`)
1730
-
1731
- When the file is loaded via `--evals` or `CLAUDEYE_EVALS_MODULE`, `listen()` is a no-op. It won't spawn a duplicate server.
1732
-
1733
- ```js
1734
- const app = createApp();
1735
-
1736
- app.eval('my-eval', fn);
1737
- app.enrich('my-enricher', fn);
1738
-
1739
- // Only starts a server when run directly with `node` or `bun`
1740
- app.listen(3000, { host: '0.0.0.0', open: false });
1741
- ```
1742
-
1743
- You can also run your evals file directly with `bun my-evals.js` (or `node my-evals.js`) if you include `app.listen()`. This spawns the dashboard as a child process.
1744
-
1745
- ---
1746
-
1747
- ## Subagent Scope
1748
-
1749
- Evals, enrichments, and actions run at the session level by default. Use the `scope` option to target subagent logs.
1750
-
1751
- ### Scope Options
1752
-
1753
- | Scope | Runs at session level | Runs at subagent level |
1754
- |-------|:---:|:---:|
1755
- | `'session'` (default) | Yes | No |
1756
- | `'subagent'` | No | Yes |
1757
- | `'both'` | Yes | Yes |
1758
-
1759
- ### Subagent Context
1760
-
1761
- When running at subagent level, the `EvalContext` includes additional metadata:
1762
-
1763
- ```js
1764
- app.eval('adaptive-eval', (ctx) => {
1765
- if (ctx.source !== 'session') {
1766
- // Running at subagent level — source is "agent-{id}"
1767
- console.log(ctx.source); // e.g. 'agent-a1b2c3'
1768
- console.log(ctx.subagentType); // e.g. 'Explore'
1769
- console.log(ctx.subagentDescription); // e.g. 'Search for auth code'
1770
- console.log(ctx.parentSessionId); // parent session ID
1771
- }
1772
- return { pass: true };
1773
- }, { scope: 'both' });
1774
- ```
1775
-
1776
- ### Combined Data in Subagent Scope
1777
-
1778
- Subagent-scoped evals and enrichments receive the full combined data (session + all subagents), not just the subagent's own entries. The `source` field in `EvalContext` directly matches the `_source` value on entries, so you can filter easily:
1779
-
1780
- ```js
1781
- // Subagent-scoped eval that filters to its own entries
1782
- app.eval('explore-thoroughness', ({ entries, source }) => {
1783
- const myEntries = entries.filter(e => e._source === source);
1784
- return {
1785
- pass: myEntries.length > 5,
1786
- score: Math.min(myEntries.length / 20, 1),
1787
- };
1788
- }, { scope: 'subagent', subagentType: 'Explore' });
1789
- ```
1790
-
1791
- ### SubagentType Filtering
1792
-
1793
- When you specify `subagentType`, the eval/enrichment runs only for subagents of that type. For subagents of other types, the eval panel is not shown at all.
1794
-
1795
- ```js
1796
- // Only runs for Explore subagents
1797
- app.eval('explore-thoroughness', ({ entries, source }) => {
1798
- const myEntries = entries.filter(e => e._source === source);
1799
- return {
1800
- pass: myEntries.length > 5,
1801
- score: Math.min(myEntries.length / 20, 1),
1802
- message: `${myEntries.length} entries explored`,
1803
- };
1804
- }, { scope: 'subagent', subagentType: 'Explore' });
1805
-
1806
- // Runs for all subagent types
1807
- app.eval('agent-efficiency', ({ stats }) => ({
1808
- pass: stats.turnCount <= 10,
1809
- score: Math.max(0, 1 - stats.turnCount / 20),
1810
- message: `${stats.turnCount} turns`,
1811
- }), { scope: 'subagent' });
1812
-
1813
- // Subagent-scoped enrichment
1814
- app.enrich('agent-summary',
1815
- ({ stats, entries }) => ({
1816
- 'Agent Turns': stats.turnCount,
1817
- 'Agent Tool Calls': stats.toolCallCount,
1818
- 'Agent Entries': entries.length,
1819
- }),
1820
- { scope: 'subagent' }
1821
- );
1822
-
1823
- // Scoped to both session and subagent level
1824
- app.eval('quality-check', ({ stats, source }) => ({
1825
- pass: stats.toolCallCount <= 20,
1826
- score: Math.max(0, 1 - stats.toolCallCount / 40),
1827
- message: `${source}: ${stats.toolCallCount} tool calls`,
1828
- }), { scope: 'both' });
1829
- ```
1830
-
1831
- ---
1832
-
1833
- ## Evaluation Order
1834
-
1835
- When a session is loaded, conditions are evaluated in this order:
1836
-
1837
- ```
1838
- 1. Global condition checked
1839
- |-- Returns false or throws -> ALL evals/enrichments marked "skipped"
1840
- \-- Returns true -> proceed to step 2
1841
-
1842
- 2. For each eval/enrichment:
1843
- |-- Has per-item condition?
1844
- | |-- Returns false -> that item marked "skipped"
1845
- | |-- Throws -> that item marked "errored" (not skipped)
1846
- | \-- Returns true -> run the function
1847
- \-- No condition -> run the function
1848
-
1849
- 3. Function executes
1850
- |-- Returns result -> recorded normally
1851
- \-- Throws -> marked "errored", other items still run
1852
- ```
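
The flow above can be sketched as a small runner. This is an illustrative reimplementation under the rules listed in the diagram, not Claudeye's own code; `runItems` and the `status` strings are names chosen for this example.

```javascript
async function runItems(items, globalCondition, ctx) {
  // Step 1: a global condition that returns false or throws skips everything.
  if (globalCondition) {
    let ok = false;
    try { ok = await globalCondition(ctx); } catch { ok = false; }
    if (!ok) return items.map((item) => ({ name: item.name, status: 'skipped' }));
  }
  const results = [];
  for (const item of items) {
    try {
      // Step 2: a per-item condition returning false skips only that item.
      if (item.condition && !(await item.condition(ctx))) {
        results.push({ name: item.name, status: 'skipped' });
        continue;
      }
      // Step 3: run the function and record its result.
      results.push({ name: item.name, status: 'ok', value: await item.fn(ctx) });
    } catch (err) {
      // A throwing condition or function marks the item errored; others still run.
      results.push({ name: item.name, status: 'errored', error: String(err) });
    }
  }
  return results;
}

const demoItems = [
  { name: 'runs', fn: () => ({ pass: true }) },
  { name: 'gated', condition: () => false, fn: () => ({ pass: true }) },
  { name: 'bad-condition', condition: () => { throw new Error('oops'); }, fn: () => ({ pass: true }) },
  { name: 'throws', fn: () => { throw new Error('boom'); } },
];
const orderRun = runItems(demoItems, ({ entries }) => entries.length > 0, { entries: [{}] });
```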
1853
-
1854
- ---
1855
-
1856
- ## UI Behavior
1857
-
1858
- In the dashboard, conditional results appear as follows:
1859
-
1860
- | Status | Evals Panel | Enrichments Panel |
1861
- |--------|-------------|-------------------|
1862
- | **Skipped** | Grayed-out row with "skipped" label | Grayed-out row with "skipped" label |
1863
- | **Condition error** | Row with warning icon and error message | Row with warning icon and error message |
1864
- | **Passed / Data** | Green check with score bar | Key-value pairs grouped by enricher |
1865
- | **Failed** | Red X with score bar | N/A |
1866
-
1867
- Skipped items are counted separately in the summary bar (e.g. "2 passed, 1 skipped").
1868
-
1869
- ---
1870
-
1871
- ## Types
1872
-
1873
- All TypeScript types exported from `claudeye`:
1874
-
1875
- ### `EvalContext`
1876
-
1877
- Both evals and enrichers receive the same context object:
1878
-
1879
- ```ts
1880
- interface EvalContext {
1881
- entries: Record<string, unknown>[]; // Combined session + subagent JSONL lines, each tagged with `_source`
1882
- stats: EvalLogStats; // Computed stats across all entries (session + subagent)
1883
- projectName: string; // Encoded project folder name
1884
- sessionId: string; // Session UUID
1885
- source: string; // "session" or "agent-{id}" — matches entry._source directly
1886
- subagentType?: string; // e.g. 'Explore', 'Bash' (subagent scope only)
1887
- subagentDescription?: string; // Short description (subagent scope only)
1888
- parentSessionId?: string; // Parent session ID (subagent scope only)
1889
- evalResults?: Record<string, { // Cached eval results (cachedOnly views only)
1890
- pass: boolean;
1891
- score: number;
1892
- error?: string;
1893
- message?: string;
1894
- }>;
1895
- }
1896
- ```
1897
-
1898
- `entries` contains the **raw JSONL data** from the session and all its subagents combined. Every line from the session log file and its subagent log files is parsed as JSON and included. Each entry has a `_source` field: `"session"` for main session entries, or `"agent-{id}"` for subagent entries. This means:
1899
-
1900
- - Tool-result lines (which the display view merges into tool_use blocks) are present as separate entries
1901
- - All entry types are included: `user`, `assistant`, `system`, `tool_result`, `queue-operation`, etc.
1902
- - Properties are accessed directly (e.g. `e.usage?.total_tokens`) rather than through a `.raw` wrapper
1903
- - Filter by `e._source === "session"` to get only main session data
1904
- - Filter by `e._source` starting with `"agent-"` to get subagent data
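
To make the `_source` tagging concrete, here is a small helper that partitions combined entries into main-session entries and per-subagent groups. `splitBySource` is a hypothetical utility, and the sample entries are invented for illustration.

```javascript
// Partition combined entries by their `_source` tag.
function splitBySource(entries) {
  const session = [];
  const bySubagent = new Map();
  for (const e of entries) {
    if (e._source === 'session') {
      session.push(e);
    } else if (typeof e._source === 'string' && e._source.startsWith('agent-')) {
      if (!bySubagent.has(e._source)) bySubagent.set(e._source, []);
      bySubagent.get(e._source).push(e);
    }
  }
  return { session, bySubagent };
}

const split = splitBySource([
  { type: 'user', _source: 'session' },
  { type: 'assistant', _source: 'agent-a1b2c3' },
  { type: 'assistant', _source: 'session' },
  { type: 'user', _source: 'agent-a1b2c3' },
]);
```

Inside an eval, the same helper could be called on `ctx.entries` before computing metrics scoped to a single source.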
1905
-
1906
- ### `EvalLogEntry` (helper type)
1907
-
1908
- `EvalLogEntry` is exported as a convenience type for describing the display-oriented parsed entries, but it is **not** the type of `EvalContext.entries`. The entries passed to evals and enrichments are raw JSONL objects (`Record<string, unknown>[]`).
1909
-
1910
- ```ts
1911
- interface EvalLogEntry {
1912
- type: string;
1913
- _source?: string; // "session" or "agent-{id}"
1914
- uuid: string;
1915
- parentUuid: string | null;
1916
- timestamp: string;
1917
- timestampMs: number;
1918
- timestampFormatted: string;
1919
- message?: {
1920
- role: string;
1921
- content: string | EvalContentBlock[];
1922
- model?: string;
1923
- };
1924
- raw?: Record<string, unknown>;
1925
- label?: string;
1926
- }
1927
- ```
1928
-
1929
- ### `EvalLogStats`
1930
-
1931
- > Stats are computed across all entries (session + subagent combined). Use `_source` filtering on entries before computing custom scoped metrics if needed.
1932
-
1933
- ```ts
1934
- interface EvalLogStats {
1935
- turnCount: number; // Number of conversation turns
1936
- userCount: number; // Number of user messages
1937
- assistantCount: number; // Number of assistant responses
1938
- toolCallCount: number; // Total tool invocations
1939
- subagentCount: number; // Number of subagent spawns
1940
- duration: string; // Formatted duration (e.g. "2m 15s")
1941
- models: string[]; // Distinct model IDs used
1942
- }
1943
- ```
1944
-
1945
- ### `EvalResult`
1946
-
1947
- ```ts
1948
- interface EvalResult {
1949
- pass: boolean; // Did the eval pass?
1950
- score?: number; // 0-1, clamped automatically (default: 1.0)
1951
- message?: string; // Shown in the UI
1952
- metadata?: Record<string, unknown>; // Arbitrary data
1953
- }
1954
- ```
1955
-
1956
- ### `EnrichmentResult`
1957
-
1958
- ```ts
1959
- type EnrichmentResult = Record<string, string | number | boolean>;
1960
- ```
1961
-
1962
- ### `ConditionFunction`
1963
-
1964
- ```ts
1965
- type ConditionFunction = (context: EvalContext) => boolean | Promise<boolean>;
1966
- ```
1967
-
1968
- ### `FilterValue`
1969
-
1970
- ```ts
1971
- type FilterValue = boolean | number | string;
1972
- ```
1973
-
1974
- ### `FilterFunction`
1975
-
1976
- ```ts
1977
- type FilterFunction = (context: EvalContext) => FilterValue | Promise<FilterValue>;
1978
- ```
1979
-
1980
- ### `FilterOptions`
1981
-
1982
- ```ts
1983
- interface FilterOptions {
1984
- label?: string; // Human-readable tile label (defaults to name)
1985
- condition?: ConditionFunction; // Per-filter gate
1986
- }
1987
- ```
1988
-
1989
- ### `FilterMeta`
1990
-
1991
- Metadata auto-derived from computed filter values. Discriminated union by `type`:
1992
-
1993
- ```ts
1994
- type FilterMeta =
1995
- | { type: 'boolean'; name: string; label: string }
1996
- | { type: 'number'; name: string; label: string; min: number; max: number }
1997
- | { type: 'string'; name: string; label: string; values: string[] }
1998
- | { type: 'date'; name: string; label: string; min: string; max: string };
1999
- ```
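
As a sketch of how such metadata could be derived from computed values, the function below inspects the value type and builds the matching variant. `deriveMeta` is a hypothetical helper; treating an empty value list as boolean and sorting string values are assumptions, and the `date` variant is omitted.

```javascript
// Build a FilterMeta-shaped object from one filter's values across sessions.
function deriveMeta(name, label, values) {
  const sample = values[0];
  if (typeof sample === 'number') {
    let min = Infinity;
    let max = -Infinity;
    for (const v of values) {
      if (v < min) min = v;
      if (v > max) max = v;
    }
    return { type: 'number', name, label, min, max };
  }
  if (typeof sample === 'string') {
    // Unique values feed the multi-select dropdown.
    return { type: 'string', name, label, values: [...new Set(values)].sort() };
  }
  return { type: 'boolean', name, label };
}

const turnMeta = deriveMeta('turn-count', 'Turn Count', [12, 3, 40]);
const modelMeta = deriveMeta('primary-model', 'Primary Model', ['b', 'a', 'b']);
const flagMeta = deriveMeta('has-errors', 'Has Errors', [true, false]);
```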
2000
-
2001
- ### `DashboardPayload`
2002
-
2003
- ```ts
2004
- interface DashboardPayload {
2005
- sessions: DashboardSessionRow[]; // One page of matching sessions
2006
- filterMeta: FilterMeta[]; // One per registered filter
2007
- totalDurationMs: number; // Server-side computation time
2008
- totalCount: number; // Total sessions before filtering
2009
- matchingCount: number; // Total sessions after filtering
2010
- page: number; // Current page (1-based)
2011
- pageSize: number; // Items per page
2012
- }
2013
-
2014
- interface DashboardSessionRow {
2015
- projectName: string;
2016
- sessionId: string;
2017
- lastModified: string; // ISO 8601
2018
- lastModifiedFormatted: string; // Human-readable
2019
- filterValues: Record<string, FilterValue>;
2020
- }
2021
- ```
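
The relationship between `totalCount`, `matchingCount`, and the 1-based `page` can be shown with a short sketch. `buildPage` is a hypothetical helper that fills only the paging fields of the payload; it is not the server action itself.

```javascript
// Filter first, then slice out one page of the matching rows.
function buildPage(rows, predicate, page, pageSize) {
  const matching = rows.filter(predicate);
  const start = (page - 1) * pageSize; // page is 1-based
  return {
    sessions: matching.slice(start, start + pageSize),
    totalCount: rows.length,        // before filtering
    matchingCount: matching.length, // after filtering
    page,
    pageSize,
  };
}

const allRows = Array.from({ length: 10 }, (_, i) => ({
  sessionId: `s${i}`,
  filterValues: { 'turn-count': i },
}));
const secondPage = buildPage(allRows, (r) => r.filterValues['turn-count'] >= 4, 2, 4);
```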
2022
-
2023
- ### `EvalScope`
2024
-
2025
- ```ts
2026
- type EvalScope = 'session' | 'subagent' | 'both';
2027
- ```
2028
-
2029
- ---
2030
-
2031
- ## Examples
2032
-
2033
- Complete, runnable example files. Save any of these as a `.js` file and run with `claudeye --evals ./your-file.js`.
2034
-
2035
- ### Example: Basic Evals & Enrichments
2036
-
2037
- The quickstart example — define evals and enrichments in one file:
2038
-
2039
- ```js
2040
- import { createApp } from 'claudeye';
2041
-
2042
- const app = createApp();
2043
-
2044
- // ── Global condition ────────────────────────────────────────────
2045
- // Skip empty sessions across evals, enrichments, AND dashboard filters.
2046
- app.condition(({ entries }) => entries.length > 0);
2047
-
2048
- // ── Evals ───────────────────────────────────────────────────────
2049
-
2050
- app.eval('under-50-turns', ({ stats }) => ({
2051
- pass: stats.turnCount <= 50,
2052
- score: Math.max(0, 1 - stats.turnCount / 100),
2053
- message: `${stats.turnCount} turn(s)`,
2054
- }));
2055
-
2056
- app.eval('has-completion', ({ entries }) => {
2057
- const last = [...entries].reverse().find(e => e.type === 'assistant');
2058
- const hasText = last?.message?.content?.some?.(b => b.type === 'text');
2059
- return {
2060
- pass: !!hasText,
2061
- score: hasText ? 1.0 : 0,
2062
- message: hasText ? 'Ended with text' : 'No final text response',
2063
- };
2064
- });
2065
-
2066
- app.eval('session-tool-count', ({ entries }) => {
2067
- const sessionTools = entries
2068
- .filter(e => e._source === 'session' && e.type === 'assistant')
2069
- .flatMap(e => (e.message?.content || []).filter(b => b.type === 'tool_use'));
2070
- return {
2071
- pass: sessionTools.length <= 100,
2072
- score: Math.max(0, 1 - sessionTools.length / 200),
2073
- message: `${sessionTools.length} session-level tool calls`,
2074
- };
2075
- });
2076
-
2077
- // ── Enrichments ─────────────────────────────────────────────────
2078
-
2079
- app.enrich('session-overview', ({ stats }) => ({
2080
- 'Turns': stats.turnCount,
2081
- 'Tool Calls': stats.toolCallCount,
2082
- 'Subagents': stats.subagentCount,
2083
- 'Duration': stats.duration,
2084
- 'Models': stats.models.join(', ') || 'none',
2085
- }));
2086
-
2087
- app.listen();
2088
- ```
2089
-
2090
- ### Example: Dashboard Filters
2091
-
2092
- Named dashboard views with focused filter sets, evals, and enrichments:
2093
-
2094
- ```js
2095
- import { createApp } from 'claudeye';
2096
-
2097
- const app = createApp();
2098
-
2099
- // ── Global condition ────────────────────────────────────────────
2100
- app.condition(({ entries }) => entries.length > 0);
2101
-
2102
- // ── Performance view ────────────────────────────────────────────
2103
- app.dashboard.view('performance', { label: 'Performance Metrics' })
2104
- .filter({ preBuilt: ['lastModified'] })
2105
- .filter('turn-count', ({ stats }) => stats.turnCount, { label: 'Turn Count' })
2106
- .filter('tool-calls', ({ stats }) => stats.toolCallCount, { label: 'Tool Calls' })
2107
- .filter('avg-tools-per-turn',
2108
- ({ stats }) => stats.turnCount > 0
2109
- ? Math.round(stats.toolCallCount / stats.turnCount * 10) / 10
2110
- : 0,
2111
- {
2112
- label: 'Avg Tools/Turn',
2113
- condition: ({ stats }) => stats.toolCallCount > 0,
2114
- }
2115
- );
2116
-
2117
- // ── Quality view ────────────────────────────────────────────────
2118
- app.dashboard.view('quality', { label: 'Quality Checks' })
2119
- .filter('has-errors', ({ entries }) =>
2120
- entries.some(e =>
2121
- e.type === 'assistant' &&
2122
- Array.isArray(e.message?.content) &&
2123
- e.message.content.some(b => b.type === 'tool_use' && b.is_error)
2124
- ),
2125
- { label: 'Has Errors' }
2126
- )
2127
- .filter('primary-model', ({ stats }) => stats.models[0] || 'unknown',
2128
- { label: 'Primary Model' }
2129
- )
2130
- .filter('uses-subagents', ({ stats }) => stats.subagentCount > 0,
2131
- { label: 'Uses Subagents' }
2132
- );
2133
-
2134
- // ── Evals ───────────────────────────────────────────────────────
2135
-
2136
- app.eval('under-50-turns', ({ stats }) => ({
2137
- pass: stats.turnCount <= 50,
2138
- score: Math.max(0, 1 - stats.turnCount / 100),
2139
- message: `${stats.turnCount} turn(s)`,
2140
- }));
2141
-
2142
- app.eval('has-completion', ({ entries }) => {
2143
- const last = [...entries].reverse().find(e => e.type === 'assistant');
2144
- const hasText = last?.message?.content?.some?.(b => b.type === 'text');
2145
- return {
2146
- pass: !!hasText,
2147
- score: hasText ? 1.0 : 0,
2148
- message: hasText ? 'Ended with text' : 'No final text response',
2149
- };
2150
- });
2151
-
2152
- app.eval('session-tool-count', ({ entries }) => {
2153
- const sessionTools = entries
2154
- .filter(e => e._source === 'session' && e.type === 'assistant')
2155
- .flatMap(e => (e.message?.content || []).filter(b => b.type === 'tool_use'));
2156
- return {
2157
- pass: sessionTools.length <= 100,
2158
- score: Math.max(0, 1 - sessionTools.length / 200),
2159
- message: `${sessionTools.length} session-level tool calls`,
2160
- };
2161
- });
2162
-
2163
- // ── Enrichments ─────────────────────────────────────────────────
2164
-
2165
- app.enrich('session-overview', ({ stats }) => ({
2166
- 'Turns': stats.turnCount,
2167
- 'Tool Calls': stats.toolCallCount,
2168
- 'Subagents': stats.subagentCount,
2169
- 'Duration': stats.duration,
2170
- 'Models': stats.models.join(', ') || 'none',
2171
- }));
2172
-
2173
- app.listen();
2174
- ```
2175
-
2176
- ### Example: Multi-View Dashboard
2177
-
2178
- Multiple named views with focused filter sets:
2179
-
2180
- ```js
2181
- import { createApp } from 'claudeye';
2182
-
2183
- const app = createApp();
2184
-
2185
- // ── Performance view ────────────────────────────────────────────
2186
- // Metrics about session length and tool usage.
2187
- app.dashboard.view('performance', { label: 'Performance Metrics' })
2188
- .filter('turn-count', ({ stats }) => stats.turnCount,
2189
- { label: 'Turn Count' })
2190
- .filter('tool-calls', ({ stats }) => stats.toolCallCount,
2191
- { label: 'Tool Calls' })
2192
- .filter('uses-subagents', ({ stats }) => stats.subagentCount > 0,
2193
- { label: 'Uses Subagents' });
2194
-
2195
- // ── Quality view ────────────────────────────────────────────────
2196
- // Error and model analysis.
2197
- app.dashboard.view('quality', { label: 'Quality Checks' })
2198
- .filter('has-errors', ({ entries }) =>
2199
- entries.some(e =>
2200
- e.type === 'assistant' &&
2201
- Array.isArray(e.message?.content) &&
2202
- e.message.content.some(b => b.type === 'tool_use' && b.is_error)
2203
- ),
2204
- { label: 'Has Errors' })
2205
- .filter('primary-model', ({ stats }) => stats.models[0] || 'unknown',
2206
- { label: 'Primary Model' });
2207
-
2208
- // ── Backward-compatible default filter ──────────────────────────
2209
- // app.dashboard.filter() still works — goes to the "default" view.
2210
- // Default filters show below the view cards on /dashboard.
2211
- app.dashboard.filter('model', ({ stats }) => stats.models[0] || 'unknown',
2212
- { label: 'Model' }
2213
- );
2214
-
2215
- app.listen();
2216
- ```
2217
-
2218
- ### Example: Eval Score Filters (cachedOnly)
2219
-
2220
- Per-eval score filters on a `cachedOnly` dashboard view. When a view uses `cachedOnly: true`, filter functions receive `ctx.evalResults` containing cached eval results:
2221
-
2222
- ```js
2223
- import { createApp } from 'claudeye';
2224
-
2225
- const app = createApp();
2226
-
2227
- // ── Register evals that produce scores ──────────────────────────
2228
- app.eval('quality', ({ entries }) => {
2229
- const assistantMsgs = entries.filter(e => e.type === 'assistant');
2230
- const avgLength = assistantMsgs.reduce((sum, e) => {
2231
- const content = typeof e.message?.content === 'string' ? e.message.content : '';
2232
- return sum + content.length;
2233
- }, 0) / (assistantMsgs.length || 1);
2234
- const score = Math.min(avgLength / 500, 1);
2235
- return { pass: score > 0.5, score };
2236
- });
2237
-
2238
- app.eval('speed', ({ stats }) => {
2239
- const durationSec = parseFloat(stats.duration) || 60;
2240
- const score = Math.max(0, 1 - durationSec / 120);
2241
- return { pass: score > 0.3, score };
2242
- });
2243
-
2244
- // ── cachedOnly view with per-eval score filters ─────────────────
2245
- app.dashboard.view('eval-results', { cachedOnly: true, label: 'Eval Score Filters' })
2246
- .filter('quality-score',
2247
- (ctx) => ctx.evalResults?.['quality']?.score ?? 0,
2248
- { label: 'Quality Score' })
2249
- .filter('speed-score',
2250
- (ctx) => ctx.evalResults?.['speed']?.score ?? 0,
2251
- { label: 'Speed Score' })
2252
- .filter('all-passing', (ctx) => {
2253
- if (!ctx.evalResults) return false;
2254
- return Object.values(ctx.evalResults).every(r => r.pass);
2255
- }, { label: 'All Evals Passing' });
2256
-
2257
- app.listen();
2258
- ```
2259
-
2260
- ### Example: Actions
2261
-
2262
- On-demand tasks triggered manually from the dashboard. Actions receive the full session context plus cached eval and enrichment results:
2263
-
2264
- ```js
2265
- import { createApp } from 'claudeye';
2266
-
2267
- const app = createApp();
2268
-
2269
- // ── Evals (actions can read these results) ──────────────────────
2270
-
2271
- app.eval('under-50-turns', ({ stats }) => ({
2272
- pass: stats.turnCount <= 50,
2273
- score: Math.max(0, 1 - stats.turnCount / 100),
2274
- message: `${stats.turnCount} turn(s)`,
2275
- }));
2276
-
2277
- app.eval('has-completion', ({ entries }) => {
2278
- const last = [...entries].reverse().find(e => e.type === 'assistant');
2279
- const hasText = last?.message?.content?.some?.(b => b.type === 'text');
2280
- return {
2281
- pass: !!hasText,
2282
- score: hasText ? 1.0 : 0,
2283
- message: hasText ? 'Ended with text' : 'No final text response',
2284
- };
2285
- });
2286
-
2287
- // ── Enrichments (actions can read these results too) ────────────
2288
-
2289
- app.enrich('overview', ({ stats }) => ({
2290
- 'Turns': stats.turnCount,
2291
- 'Tool Calls': stats.toolCallCount,
2292
- 'Models': stats.models.join(', ') || 'none',
2293
- }));
2294
-
2295
- // ── Actions ─────────────────────────────────────────────────────
2296
-
2297
- // Session summary: combines stats with eval pass counts
2298
- app.action('session-summary', ({ stats, evalResults }) => {
2299
- const evalNames = Object.keys(evalResults);
2300
- const passCount = evalNames.filter(n => evalResults[n]?.pass).length;
2301
- return {
2302
- output: [
2303
- `Session: ${stats.turnCount} turns, ${stats.toolCallCount} tool calls`,
2304
- `Duration: ${stats.duration}`,
2305
- `Models: ${stats.models.join(', ') || 'unknown'}`,
2306
- `Evals: ${passCount}/${evalNames.length} passed`,
2307
- ].join('\n'),
2308
- status: 'success',
2309
- message: 'Summary generated',
2310
- };
2311
- });
2312
-
2313
- // Export metrics: gathers enrichment data into a text report
2314
- app.action('export-metrics', ({ stats, enrichmentResults }) => {
2315
- const enrichData = {};
2316
- for (const result of Object.values(enrichmentResults)) {
2317
- if (result.data) Object.assign(enrichData, result.data);
2318
- }
2319
- const lines = [
2320
- ...Object.entries(enrichData).map(([k, v]) => `${k}: ${v}`),
2321
- `turnCount: ${stats.turnCount}`,
2322
- `toolCallCount: ${stats.toolCallCount}`,
2323
- ];
2324
- return {
2325
- output: lines.join('\n'),
2326
- status: 'success',
2327
- message: `Exported ${Object.keys(enrichData).length + 2} metrics`,
2328
- };
2329
- });
2330
-
2331
- // Tool inventory: lists unique tools used in the session
2332
- app.action('tool-inventory', ({ entries }) => {
2333
- const toolUses = entries.filter(e =>
2334
- e.type === 'assistant' &&
2335
- Array.isArray(e.message?.content) &&
2336
- e.message.content.some(b => b.type === 'tool_use')
2337
- );
2338
- const toolNames = [...new Set(toolUses.flatMap(e =>
2339
- (e.message?.content || []).filter(b => b.type === 'tool_use').map(b => b.name)
2340
- ))];
2341
- return {
2342
- output: toolNames.length > 0
2343
- ? `Tools used:\n${toolNames.map(t => ` - ${t}`).join('\n')}`
2344
- : 'No tools used in this session',
2345
- status: 'success',
2346
- };
2347
- }, { condition: ({ stats }) => stats.toolCallCount > 0 });
2348
-
2349
- // Side-effect action: always re-runs (cache: false)
2350
- app.action('write-report', async ({ projectName, sessionId, stats }) => {
2351
- const fs = await import('fs/promises');
2352
- await fs.appendFile('session-reports.jsonl', JSON.stringify({
2353
- projectName, sessionId, turns: stats.turnCount,
2354
- timestamp: new Date().toISOString(),
2355
- }) + '\n');
2356
- return { status: 'success', message: 'Report appended' };
2357
- }, { cache: false });
2358
-
2359
- app.listen();
2360
- ```
2361
-
2362
- ### Example: Alerts
2363
-
2364
- Callbacks that fire after all evals and enrichments complete for a session:
2365
-
2366
- ```js
2367
- import { createApp } from 'claudeye';
2368
-
2369
- const app = createApp();
2370
-
2371
- // ── Evals ───────────────────────────────────────────────────────
2372
-
2373
- app.eval('under-50-turns', ({ stats }) => ({
2374
- pass: stats.turnCount <= 50,
2375
- score: Math.max(0, 1 - stats.turnCount / 100),
2376
- message: `${stats.turnCount} turn(s)`,
2377
- }));
2378
-
2379
- app.eval('has-completion', ({ entries }) => {
2380
- const last = [...entries].reverse().find(e => e.type === 'assistant');
2381
- const hasText = last?.message?.content?.some?.(b => b.type === 'text');
2382
- return {
2383
- pass: !!hasText,
2384
- score: hasText ? 1.0 : 0,
2385
- message: hasText ? 'Ended with text' : 'No final text response',
2386
- };
2387
- });
2388
-
2389
- // ── Enrichments ─────────────────────────────────────────────────
2390
-
2391
- app.enrich('overview', ({ stats }) => ({
2392
- 'Turns': stats.turnCount,
2393
- 'Tool Calls': stats.toolCallCount,
2394
- 'Models': stats.models.join(', ') || 'none',
2395
- }));
2396
-
2397
- // ── Alerts ──────────────────────────────────────────────────────
2398
-
2399
- // Console log: always fires, logs a summary line
2400
- app.alert('log-results', ({ projectName, sessionId, evalSummary, enrichSummary }) => {
2401
- const evals = evalSummary
2402
- ? `${evalSummary.passCount} pass, ${evalSummary.failCount} fail, ${evalSummary.errorCount} error`
2403
- : 'no evals';
2404
- const enrichments = enrichSummary
2405
- ? `${enrichSummary.results.length} enrichments`
2406
- : 'no enrichments';
2407
- console.log(`[ALERT] ${projectName}/${sessionId}: ${evals} | ${enrichments}`);
2408
- });
2409
-
2410
- // Failure alert: only logs when evals fail
2411
- app.alert('warn-on-failure', ({ projectName, sessionId, evalSummary }) => {
2412
- if (evalSummary && evalSummary.failCount > 0) {
2413
- const failedNames = evalSummary.results
2414
- .filter(r => !r.error && !r.skipped && !r.pass)
2415
- .map(r => r.name);
2416
- console.warn(
2417
- `[FAILURE] ${projectName}/${sessionId}: ${failedNames.join(', ')} failed`
2418
- );
2419
- }
2420
- });
2421
-
2422
- // Slack webhook example (uncomment and replace the URL to enable):
2423
- // app.alert('slack-on-failure', async ({ projectName, sessionId, evalSummary }) => {
2424
- // if (evalSummary && evalSummary.failCount > 0) {
2425
- // await fetch('https://hooks.slack.com/services/T.../B.../xxx', {
2426
- // method: 'POST',
2427
- // headers: { 'Content-Type': 'application/json' },
2428
- // body: JSON.stringify({
2429
- // text: `${evalSummary.failCount} evals failed for ${projectName}/${sessionId}`,
2430
- // }),
2431
- // });
2432
- // }
2433
- // });
2434
-
2435
- app.listen();
2436
- ```
2437
-
2438
- ### Example: Minimal Filters Only
2439
-
2440
- A minimal example — a single named view with filters, no evals or enrichments:
2441
-
2442
- ```js
2443
- import { createApp } from 'claudeye';
2444
-
2445
- const app = createApp();
2446
-
2447
- app.dashboard.view('overview', { label: 'Session Overview' })
2448
- .filter({ preBuilt: ['lastModified'] })
2449
- .filter('model', ({ stats }) => stats.models[0] || 'unknown',
2450
- { label: 'Model' })
2451
- .filter('turns', ({ stats }) => stats.turnCount,
2452
- { label: 'Turns' })
2453
- .filter('used-tools', ({ stats }) => stats.toolCallCount > 0,
2454
- { label: 'Used Tools' });
2455
-
2456
- app.listen();
2457
- ```
2458
-
2459
- ---
2460
-
2461
- ## Background Queue Processing
2462
-
2463
- Enable background processing to automatically scan and evaluate all sessions on a timer:
2464
-
2465
- ```bash
2466
- claudeye --evals ./my-evals.js --queue-interval 60
2467
- ```
2468
-
2469
- Use `app.queueCondition()` to gate which sessions the background queue processes:
2470
-
2471
- ```js
2472
- // Only process sessions with more than 5 entries
2473
- app.queueCondition(({ entries }) => entries.length > 5);
2474
- ```
2475
-
2476
- The condition receives the full `EvalContext` and returns a boolean. If it returns false, the session is skipped entirely. Results are cached per session and auto-invalidate when the session file or condition function changes.
2477
-
2478
- ### How the Queue Works
2479
-
2480
- The queue is **unified** — every individual eval and enrichment (session-scoped, subagent-scoped, UI-triggered, or background-scanned) passes through a single priority queue with bounded concurrency.
2481
-
2482
- 1. **Foreground (always active):** When a session page loads or a re-run is triggered, each uncached eval/enrichment is enqueued at HIGH priority via `queuePerItem()` and processed immediately (up to the concurrency limit)
2483
- 2. **Background (opt-in):** When `CLAUDEYE_QUEUE_INTERVAL` is set, a timer scans all projects for uncached sessions and enqueues individual uncached evals/enrichments at LOW priority
2484
- 3. **Priority:** HIGH (foreground/UI) items are always processed before LOW (background) items
2485
- 4. **Dedup:** If the same item is enqueued twice, the existing entry is upgraded to the higher priority
2486
- 5. **Subagent support:** Subagent evals/enrichments go through the same queue. The session ID is encoded as `sessionId/agent-agentId` for tracking purposes
2487
- 6. **Alerts:** After each successful item completes, the queue checks if any pending/processing work remains for that session. When no work remains, alerts fire once per content version
2488
-
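The dedup and priority rules in the list above can be illustrated with a small standalone sketch. This is not Claudeye's implementation: the class name `DedupPriorityQueue`, the `HIGH`/`LOW` constants, and the key format are all hypothetical, and concurrency limits are left out.

```javascript
// Hypothetical sketch of a priority queue with dedup and priority upgrade.
// Lower number = higher priority.
const HIGH = 0;
const LOW = 1;

class DedupPriorityQueue {
  constructor() {
    this.items = new Map(); // key -> { key, priority, seq }
    this.seq = 0;           // insertion counter, used for FIFO order within a priority
  }

  // Returns true if a new entry was added, false if the key was already queued.
  enqueue(key, priority) {
    const existing = this.items.get(key);
    if (existing) {
      // Dedup: keep one entry and upgrade it if the new request is more urgent.
      if (priority < existing.priority) existing.priority = priority;
      return false;
    }
    this.items.set(key, { key, priority, seq: this.seq++ });
    return true;
  }

  // Removes and returns the most urgent key, oldest first within a priority.
  dequeue() {
    let best = null;
    for (const item of this.items.values()) {
      if (
        best === null ||
        item.priority < best.priority ||
        (item.priority === best.priority && item.seq < best.seq)
      ) {
        best = item;
      }
    }
    if (best === null) return undefined;
    this.items.delete(best.key);
    return best.key;
  }

  get size() {
    return this.items.size;
  }
}
```

A linear scan in `dequeue()` keeps the sketch short; a heap would be the usual choice if the queue grows large.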
2489
- ### Queue Status UI
2490
-
2491
- The queue status is visible in three places:
2492
-
2493
- **Navbar dropdown** — shows current processing items (max 7) and pending items with priority badges. Badge count = pending + processing.
2494
-
2495
- **`/queue` details page** — three tabs:
2496
- - **In Queue** — pending items with type badge, item name, session link, priority, queued time
2497
- - **Processing** — active items with spinner, type badge, item name, session link, started time
2498
- - **Processed** — completed items with type badge, item name, session link, duration, success/fail icon, completed time. Data is loaded from disk via paginated JSONL files (25 entries per page), so history survives process restarts
2499
-
2500
- **Dashboard panel** — collapsible panel showing queue state with processing and pending tables, background processor indicator, and error list.
2501
-
2502
- All views auto-refresh and self-hide when there's no queue activity.
2503
-
2504
- ### Environment Variables
2505
-
2506
- | Variable | Description | Default |
2507
- |----------|-------------|---------|
2508
- | `CLAUDEYE_QUEUE_INTERVAL` | Background scan interval in seconds | disabled |
2509
- | `CLAUDEYE_QUEUE_CONCURRENCY` | Max parallel items per batch | `2` |
2510
- | `CLAUDEYE_QUEUE_HISTORY_TTL` | Seconds to keep completed items | `3600` |
2511
- | `CLAUDEYE_QUEUE_MAX_ITEMS` | Max items to enqueue per scan (0=unlimited) | `500` |
2512
- | `CLAUDEYE_LOG_LEVEL` | Log verbosity for both dashboard server and hook processes: `info`, `warn`, `error` | `warn` |
2513
- | `CLAUDEYE_HOOK_LOG_FILE` | Enable hook file logging: `1` or `true` for default dir (`~/.claudeye/logs/`), or an absolute path for a custom directory | disabled |
2514
- | `CLAUDEYE_DISABLE_PAGES` | Comma-separated pages to hide from nav and block direct access: `policies`, `dashboard`, `projects`. The first non-disabled page (policies → projects → dashboard) becomes the root `/` landing page. | unset |
2515
-
2516
- At `info` level, all log lines (including `ACTIVITY` lines for user actions) are emitted. At `warn` (default), only warnings and errors appear. At `error`, only errors are shown.
2517
-
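The three verbosity levels behave like a threshold. A minimal sketch of that rule, assuming a simple numeric ordering (`shouldLog` and `LEVELS` are hypothetical names, not part of Claudeye):

```javascript
// Hypothetical sketch: a message is emitted only if its level is at or above
// the configured threshold. Ordering assumed: info < warn < error.
const LEVELS = new Map([
  ['info', 0],
  ['warn', 1],
  ['error', 2],
]);

function shouldLog(messageLevel, configuredLevel = 'warn') {
  // Fall back to the documented default ("warn") for unrecognized settings.
  const threshold = LEVELS.get(configuredLevel) ?? LEVELS.get('warn');
  const level = LEVELS.get(messageLevel);
  // Unknown message levels are dropped in this sketch.
  return level !== undefined && level >= threshold;
}
```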
2518
- **Hook logging:** `CLAUDEYE_LOG_LEVEL` controls the verbosity of hook stderr output (event type, policy count, evaluation result at `info`; failures at `warn`). When `CLAUDEYE_HOOK_LOG_FILE` is additionally set, hooks write to persistent log files with automatic size-based rotation at 500 KB. See the [Hook Logging](#hook-logging) section for details and examples.
2519
-
2520
- ---
2521
-
2522
- ## Caching
2523
-
2524
- Caching is **always on**. Results are cached to `~/.claudeye/cache/` and automatically invalidated when session logs or eval definitions change. Click **Re-run** in the dashboard to bypass the cache.
2525
-
2526
- ```bash
2527
- claudeye --cache-path /tmp/cc # Custom cache location
2528
- claudeye --cache-clear # Clear cache and exit
2529
- ```
2530
-
2531
- ---
2532
-
2533
- ## Authentication
2534
-
2535
- Claudeye ships with **opt-in** username/password auth. When no users are configured, everything works exactly as before — no login page, no blocking.
2536
-
2537
- ### Enable via CLI
2538
-
2539
- ```bash
2540
- # Single user
2541
- claudeye --auth-user admin:secret
2542
-
2543
- # Multiple users
2544
- claudeye --auth-user admin:secret --auth-user viewer:readonly
2545
- ```
2546
-
2547
- ### Enable via environment variable
2548
-
2549
- ```bash
2550
- CLAUDEYE_AUTH_USERS=admin:secret claudeye
2551
- CLAUDEYE_AUTH_USERS=admin:secret,viewer:readonly claudeye
2552
- ```
2553
-
2554
- ### Enable via the programmatic API
2555
-
2556
- ```js
2557
- import { createApp } from 'claudeye';
2558
-
2559
- const app = createApp();
2560
-
2561
- app.auth({ users: [
2562
- { username: 'admin', password: 'secret' },
2563
- { username: 'viewer', password: 'readonly' },
2564
- ] });
2565
-
2566
- app.listen();
2567
- ```
2568
-
2569
- All three methods can be combined — users from CLI flags, the env var, and `app.auth()` are merged together.
2570
-
2571
- When auth is active, all UI routes redirect to `/login`. After signing in, a signed session cookie (24h expiry) grants access. A **Sign out** button appears in the navbar.
2572
-
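As an illustration of the `username:password` list format and the merge described above, here is a standalone sketch. The helpers `parseAuthUsers` and `mergeUsers` are hypothetical, and the rule that a later source overrides an earlier one for the same username is an assumption; the README does not say how duplicates are resolved.

```javascript
// Hypothetical helper: parse "admin:secret,viewer:readonly" into user objects.
// Splits each pair on the first colon, so passwords may contain ":" but not ",".
function parseAuthUsers(value) {
  if (!value) return [];
  return value
    .split(',')
    .map(pair => pair.trim())
    .filter(Boolean)
    .map(pair => {
      const idx = pair.indexOf(':');
      if (idx <= 0) {
        throw new Error('Invalid user entry: expected "username:password"');
      }
      return { username: pair.slice(0, idx), password: pair.slice(idx + 1) };
    });
}

// Hypothetical helper: merge user lists from several sources.
// Assumption: for a duplicate username, the later source wins.
function mergeUsers(...sources) {
  const byName = new Map();
  for (const users of sources) {
    for (const user of users) byName.set(user.username, user);
  }
  return [...byName.values()];
}
```

For example, `mergeUsers(parseAuthUsers('admin:secret'), [{ username: 'viewer', password: 'readonly' }])` yields two users under these assumptions.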
2573
- ---
2574
-
2575
- ## Deployment with PM2
2576
-
2577
- For production deployments, use PM2 with Bun as the interpreter:
2578
-
2579
- ```js
2580
- // ecosystem.config.cjs
2581
- module.exports = {
2582
- apps: [{
2583
- name: 'claudeye',
2584
- script: 'node_modules/.bin/next',
2585
- args: 'start',
2586
- interpreter: 'bun',
2587
- cwd: '/path/to/claudeye',
2588
- env: {
2589
- PORT: 8020,
2590
- HOSTNAME: '0.0.0.0',
2591
- CLAUDE_PROJECTS_PATH: '/home/user/.claude/projects',
2592
- CLAUDEYE_EVALS_MODULE: './my-evals.js',
2593
- CLAUDEYE_QUEUE_INTERVAL: '60',
2594
- },
2595
- }],
2596
- };
2597
- ```
2598
-
2599
- ```bash
2600
- # Start
2601
- pm2 start ecosystem.config.cjs
2602
-
2603
- # Monitor
2604
- pm2 monit
2605
-
2606
- # Auto-restart on reboot
2607
- pm2 startup
2608
- pm2 save
2609
- ```
2610
-
2611
- ---
2612
-
2613
- ## Telemetry
2614
-
2615
- Claudeye collects anonymous, non-PII usage analytics (e.g. `app_started`, `queue_scan_completed`) to understand feature adoption. Events are keyed by a random instance UUID — no project names, session IDs, eval names, or log content are ever sent.
2616
-
2617
- **Opt out:**
2618
-
2619
- ```bash
2620
- # CLI flag
2621
- claudeye --disable-telemetry
2622
-
2623
- # Or environment variable
2624
- CLAUDEYE_TELEMETRY_DISABLED=1 claudeye
2625
- ```
2626
-
2627
- When disabled, all telemetry code becomes a zero-cost no-op (no network requests, no dynamic imports).
2628
-
2629
- ---
2630
-
2631
- ## How It Works
2632
-
2633
- 1. `createApp()` + `app.eval()` / `app.enrich()` / `app.action()` / `app.alert()` / `app.condition()` / `app.queueCondition()` / `app.dashboard.view()` / `app.dashboard.filter()` / `app.dashboard.aggregate()` register functions in global registries
2634
- 2. When you run `claudeye --evals ./my-file.js`, the server dynamically imports your file, populating the registries
2635
- 3. All eval/enrichment execution routes through a unified priority queue. Each individual eval and enrichment is a separate queue item. UI requests use HIGH priority; background scanning uses LOW priority
2636
- 4. Each item runs through: cache check → execute if uncached → cache result → check if session complete → fire alerts if complete
2637
- 5. The global condition is checked first. If it fails, everything is skipped
2638
- 6. Per-item conditions are checked individually. Skipped items don't block others
2639
- 7. Each function is individually error-isolated. If one throws, the others still run
2640
- 8. After all evals and enrichments complete, registered alerts fire with the complete `AlertContext` (eval summary + enrichment summary)
2641
- 9. Results are serialized and displayed in separate panels in the dashboard UI
2642
- 10. Named dashboard views (`/dashboard`) show a view index; each view (`/dashboard/[viewName]`) computes filter values incrementally (only new/changed sessions are processed), then filters and paginates server-side for efficiency
2643
- 11. Dashboard aggregates run a separate server action that collects per-session values (with eval/enrichment/filter results) and reduces them via user-defined reduce functions into sortable summary tables
2644
- 12. When `CLAUDEYE_QUEUE_INTERVAL` is set, a background processor scans for uncached items on a timer. Track queue state at `/queue` or via the navbar dropdown
2645
-
2646
- ---
2647
-
2648
- ## Community
2649
-
2650
- - [Website & Docs](https://claudeye.exosphere.host) - documentation, guides, and examples
2651
- - [Discord](https://discord.com/invite/zT92CAgvkj) - get help and connect with other developers
2652
- - [Issues](https://github.com/exospherehost/claudeye/issues) - bug reports and feature requests
2653
-
2654
- ## License
2655
-
2656
- MIT + Commons Clause. See [LICENSE](./LICENSE).
30
+ **[github.com/exospherehost/failproofai](https://github.com/exospherehost/failproofai)** | **[befailproof.ai](https://befailproof.ai)**