tuneloop 0.1.0
- package/LICENSE +21 -0
- package/README.md +256 -0
- package/dist/chunk-RB45XK57.js +6941 -0
- package/dist/chunk-RB45XK57.js.map +1 -0
- package/dist/cli.d.ts +1 -0
- package/dist/cli.js +154 -0
- package/dist/cli.js.map +1 -0
- package/dist/client/app.js +4937 -0
- package/dist/client/app.js.map +1 -0
- package/dist/client/favicon.svg +7 -0
- package/dist/client/index.html +74 -0
- package/dist/client/styles.css +698 -0
- package/dist/index.d.ts +1623 -0
- package/dist/index.js +49 -0
- package/dist/index.js.map +1 -0
- package/package.json +68 -0
package/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Tuneloop
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
package/README.md
ADDED
|
@@ -0,0 +1,256 @@
|
|
|
1
|
+
# tuneloop
|
|
2
|
+
|
|
3
|
+
Local analytics for your AI coding sessions. **Count outcomes, not tokens.**
|
|
4
|
+
|
|
5
|
+
tuneloop turns the session transcripts your AI coding tools already write into a
|
|
6
|
+
local dashboard of what you actually shipped, what it cost, and your work patterns.
|
|
7
|
+
|
|
8
|
+
<br>
|
|
9
|
+
|
|
10
|
+
<p align="center">
|
|
11
|
+
<img src="docs/img/cost_per_artifact.png" alt="tuneloop dashboard — headline metrics (outcome rate, cost per shipped artifact, spend, sessions, tool error rate) above a per-PR cost breakdown treemap" width="900">
|
|
12
|
+
</p>
|
|
13
|
+
|
|
14
|
+
<br>
|
|
15
|
+
|
|
16
|
+
Concretely, it enriches each session with:
|
|
17
|
+
|
|
18
|
+
- **Outcome links** — merged PRs, features shipped, files changed
|
|
19
|
+
- **Granular cost attribution to outcomes**
|
|
20
|
+
- **Task complexity**
|
|
21
|
+
- **Agent autonomy**
|
|
22
|
+
- **Work type**
|
|
23
|
+
- **Key decisions**
|
|
24
|
+
- **Tool error categories**
|
|
25
|
+
|
|
26
|
+
Combined with the data already in the transcript — model, agent harness, repo, and
|
|
27
|
+
more — this data lets you answer questions like:
|
|
28
|
+
|
|
29
|
+
- How much of my AI spend went into PR #2, or feature *X*?
|
|
30
|
+
- Are my agents getting more autonomous over time on complex tasks?
|
|
31
|
+
- What's my success rate on repo *X* vs. repo *Y* — or any other dimension you care about?
|
|
32
|
+
|
|
33
|
+
Works with Claude Code, Codex, and OpenCode. Everything runs and stays
|
|
34
|
+
on your machine; enrichments that need an LLM can use your own provider key or a
|
|
35
|
+
local model. The built-ins above are just the defaults — tuneloop is extensible,
|
|
36
|
+
and adding your own enrichment is straightforward.
|
|
37
|
+
|
|
38
|
+
> Built by the team at [Tuneloop](https://tuneloop.io).
|
|
39
|
+
|
|
40
|
+
## Quick start
|
|
41
|
+
|
|
42
|
+
```bash
|
|
43
|
+
npx tuneloop analyze
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
This scans typical session folders like `~/.claude/projects`, builds a local
|
|
47
|
+
store, and prints a summary. The **first run** processes every transcript, so
|
|
48
|
+
expect a few minutes (roughly 4 minutes for ~80 sessions with [LLM
|
|
49
|
+
enrichment](#llm-enrichment) on; static-only runs are faster); later runs are
|
|
50
|
+
incremental and only re-process sessions that changed, so they finish quickly. On completion the
|
|
51
|
+
CLI prints the dashboard URL — press **Enter** to open it in your browser
|
|
52
|
+
(`Ctrl+C` to stop). Point it at other locations with a comma-separated list:
|
|
53
|
+
|
|
54
|
+
```bash
|
|
55
|
+
npx tuneloop analyze ~/.claude/projects,/path/to/more/sessions
|
|
56
|
+
```
|
|
57
|
+
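The incremental behaviour described above (later runs only re-process sessions that changed) can be pictured with a small sketch. This README does not say how tuneloop detects a change, so the `SessionFile` shape, the size-plus-mtime fingerprint, and the `selectChanged` helper are all hypothetical and only illustrate the general idea.

```typescript
// Hypothetical shape: tuneloop's real change detection is not documented here.
interface SessionFile {
  path: string;
  sizeBytes: number;
  modifiedMs: number;
}

// A cheap fingerprint (assumption); a content hash would be stricter but slower.
function fingerprint(file: SessionFile): string {
  return `${file.sizeBytes}:${file.modifiedMs}`;
}

// Keep only files that are new or whose fingerprint differs from the stored one.
function selectChanged(
  stored: Map<string, string>,
  current: SessionFile[],
): SessionFile[] {
  return current.filter((file) => stored.get(file.path) !== fingerprint(file));
}

// Fingerprints remembered from the previous run.
const storedFingerprints = new Map<string, string>([
  ["a.jsonl", "100:1000"],
  ["b.jsonl", "200:2000"],
]);

const filesOnDisk: SessionFile[] = [
  { path: "a.jsonl", sizeBytes: 100, modifiedMs: 1000 }, // unchanged
  { path: "b.jsonl", sizeBytes: 250, modifiedMs: 2500 }, // grew since last run
  { path: "c.jsonl", sizeBytes: 50, modifiedMs: 3000 }, // new
];

const changedPaths = selectChanged(storedFingerprints, filesOnDisk).map((f) => f.path);
console.log(changedPaths.join(", ")); // b.jsonl, c.jsonl
```

Only `b.jsonl` and `c.jsonl` would be re-processed in this example, which is why a second run over a mostly unchanged corpus finishes quickly.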
|
|
58
|
+
Handy flags:
|
|
59
|
+
|
|
60
|
+
- `--no-serve` — build the store and exit, no dashboard
|
|
61
|
+
- `--port <n>` — serve on a different port
|
|
62
|
+
- `npx tuneloop serve` — open the dashboard over an already-analyzed store, without re-analyzing
|
|
63
|
+
|
|
64
|
+
## What you get
|
|
65
|
+
|
|
66
|
+
The dashboard reads everything live from a local SQLite store:
|
|
67
|
+
|
|
68
|
+
- **Session outcome rate** — how many of your sessions ended in a win (you pick what counts).
|
|
69
|
+
- **Cost per shipped artifact** — dollars of AI spend per merged PR or per shipped feature.
|
|
70
|
+
- **Total spend** — over time, split by model, work type, or repo.
|
|
71
|
+
- **Tool & skill usage** — call counts, error rates, and error categories across every session.
|
|
72
|
+
- A **filterable session viewer**, with the full transcript and file changes behind each one.
|
|
73
|
+
- Easy transcript navigation (turn-by-turn, errors, free-text search, and
|
|
74
|
+
outcomes). For example, you can jump to the part of the session where
|
|
75
|
+
you worked on a particular feature or code change.
|
|
76
|
+
- Filters for sessions that touched a particular file, PR, or feature.
|
|
77
|
+
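As a rough illustration of two of the headline numbers listed above, the sketch below computes an outcome rate and a spend split over an in-memory list of sessions. The `SessionRow` fields and both function names are assumptions made for this example; the authoritative metric definitions are the ones in ARCHITECTURE.md.

```typescript
// Hypothetical per-session row; field names are illustrative only.
interface SessionRow {
  model: string;
  repo: string;
  costUsd: number;
  win: boolean; // whatever you decided counts as a win
}

// Share of sessions that ended in a win (0 when there are no sessions).
function outcomeRate(rows: SessionRow[]): number {
  if (rows.length === 0) return 0;
  return rows.filter((r) => r.win).length / rows.length;
}

// Total spend grouped by one dimension.
function spendBy(rows: SessionRow[], key: "model" | "repo"): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    totals.set(row[key], (totals.get(row[key]) ?? 0) + row.costUsd);
  }
  return totals;
}

const sampleRows: SessionRow[] = [
  { model: "model-a", repo: "web", costUsd: 2, win: true },
  { model: "model-b", repo: "web", costUsd: 1, win: false },
  { model: "model-a", repo: "api", costUsd: 3, win: true },
  { model: "model-a", repo: "api", costUsd: 2, win: false },
];

console.log(outcomeRate(sampleRows)); // 0.5
console.log(spendBy(sampleRows, "model").get("model-a")); // 7
console.log(spendBy(sampleRows, "repo").get("web")); // 3
```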
|
|
78
|
+
Cost, tools, files, and git/PR outcomes come from static analysis — no setup or
|
|
79
|
+
API key. Work type, complexity, autonomy, and feature names come from [LLM
|
|
80
|
+
enrichment](#llm-enrichment), which is worth setting up: much of what makes the
|
|
81
|
+
dashboard useful depends on it.
|
|
82
|
+
|
|
83
|
+
**Highlights** turns the same data into plain-English insights about your recent work:
|
|
84
|
+
|
|
85
|
+
<br>
|
|
86
|
+
|
|
87
|
+
<p align="center">
|
|
88
|
+
<img src="docs/img/highlights_tab.png" alt="tuneloop Highlights tab — a question-led digest: sessions run, most AI spend on shipped vs unshipped work, share of spend that shipped, and success rate by complexity" width="900">
|
|
89
|
+
</p>
|
|
90
|
+
|
|
91
|
+
<br>
|
|
92
|
+
|
|
93
|
+
And every session is a readable transcript you can navigate turn-by-turn, by work
|
|
94
|
+
type, or by error — with the files it changed alongside:
|
|
95
|
+
|
|
96
|
+
<br>
|
|
97
|
+
|
|
98
|
+
<p align="center">
|
|
99
|
+
<img src="docs/img/session_transcript_viewer.png" alt="tuneloop session viewer — turn-by-turn transcript with work-type filter pills, tool calls, and a Files tab, next to the filterable session list" width="900">
|
|
100
|
+
</p>
|
|
101
|
+
|
|
102
|
+
<br>
|
|
103
|
+
|
|
104
|
+
## How it works
|
|
105
|
+
|
|
106
|
+
**Enrichment** labels each session in one LLM call:
|
|
107
|
+
|
|
108
|
+
- **Work type** — one of `plan` · `implement` · `debug` · `research` · `review` · `docs` · `other`.
|
|
109
|
+
- **Complexity** — one of `trivial` · `routine` · `substantial` · `open-ended`.
|
|
110
|
+
- **Autonomy** — how much the agent drove itself: `autonomous` · `guided` · `minimal`.
|
|
111
|
+
- **Feature** — links the session to a shipped feature, reusing your existing feature
|
|
112
|
+
names and proposing new ones. The taxonomy grows as you analyze, so related work
|
|
113
|
+
lands under one feature instead of fragmenting.
|
|
114
|
+
- **Success** — a judged outcome (`success` / `partial` / `failure`), surfaced as the
|
|
115
|
+
`session_success` outcome you can count.
|
|
116
|
+
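The label vocabularies above map naturally onto closed string unions. The following is a minimal sketch of validating one enrichment result against them; the allowed values come from the list above, while the field names (`workType`, `feature`, and so on) and the `parseLabels` helper are hypothetical and not taken from tuneloop's published types.

```typescript
// Allowed values, copied from the enrichment description above.
const WORK_TYPES = ["plan", "implement", "debug", "research", "review", "docs", "other"] as const;
const COMPLEXITIES = ["trivial", "routine", "substantial", "open-ended"] as const;
const AUTONOMY_LEVELS = ["autonomous", "guided", "minimal"] as const;
const SUCCESS_VALUES = ["success", "partial", "failure"] as const;

// Hypothetical result shape; field names are illustrative only.
interface EnrichmentLabels {
  workType: (typeof WORK_TYPES)[number];
  complexity: (typeof COMPLEXITIES)[number];
  autonomy: (typeof AUTONOMY_LEVELS)[number];
  success: (typeof SUCCESS_VALUES)[number];
  feature: string | null;
}

// Return the value if it is one of the allowed strings, otherwise throw.
function pick<T extends string>(allowed: readonly T[], value: unknown, field: string): T {
  if (typeof value === "string" && (allowed as readonly string[]).indexOf(value) !== -1) {
    return value as T;
  }
  throw new Error(`invalid ${field}: ${JSON.stringify(value)}`);
}

// Validate an untrusted object, such as the arguments of a model's tool call.
function parseLabels(raw: Record<string, unknown>): EnrichmentLabels {
  return {
    workType: pick(WORK_TYPES, raw["workType"], "workType"),
    complexity: pick(COMPLEXITIES, raw["complexity"], "complexity"),
    autonomy: pick(AUTONOMY_LEVELS, raw["autonomy"], "autonomy"),
    success: pick(SUCCESS_VALUES, raw["success"], "success"),
    feature: typeof raw["feature"] === "string" ? raw["feature"] : null,
  };
}

const parsedLabels = parseLabels({
  workType: "debug",
  complexity: "routine",
  autonomy: "guided",
  success: "partial",
});
console.log(parsedLabels.workType, parsedLabels.feature); // debug null
```

Rejecting unknown values at this boundary keeps a model's free-form answer from leaking into facets that the dashboard later filters on.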
|
|
117
|
+
**PR linking** connects a session to the PRs it produced, two ways:
|
|
118
|
+
|
|
119
|
+
- **Explicit** — the transcript shows the agent creating, merging, or reviewing a PR
|
|
120
|
+
(`gh pr create` / `gh pr review` / a GitHub MCP tool); live status comes from your
|
|
121
|
+
local `gh`.
|
|
122
|
+
- **Content-match** — for the common case where the agent writes the code and *you*
|
|
123
|
+
commit and push it (no `gh pr create` in the transcript), tuneloop matches the lines
|
|
124
|
+
the agent authored against your own PRs' diffs and links the best match.
|
|
125
|
+
|
|
126
|
+
See [ARCHITECTURE.md](./ARCHITECTURE.md#built-in-processors) for the detection rules.
|
|
127
|
+
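To make the content-match idea concrete, here is a minimal sketch that scores each PR by the fraction of agent-authored lines found among the PR's added lines, then picks the highest score. The `PullRequestDiff` shape, the whitespace-trimming normalisation, and the 0.5 threshold are assumptions for illustration; the actual detection rules are the ones documented in ARCHITECTURE.md.

```typescript
// Hypothetical view of a PR: its number and the lines its diff adds.
interface PullRequestDiff {
  number: number;
  addedLines: string[];
}

// Ignore indentation differences and blank lines when comparing.
function meaningfulLines(lines: string[]): string[] {
  return lines.map((l) => l.trim()).filter((l) => l.length > 0);
}

// Fraction of the agent's non-blank lines that also appear in the PR's added lines.
function matchScore(authored: string[], pr: PullRequestDiff): number {
  const prLines = new Set(meaningfulLines(pr.addedLines));
  const agentLines = meaningfulLines(authored);
  if (agentLines.length === 0) return 0;
  const hits = agentLines.filter((l) => prLines.has(l)).length;
  return hits / agentLines.length;
}

// Pick the best-scoring PR, or null when nothing clears the (assumed) threshold.
function bestMatch(
  authored: string[],
  prs: PullRequestDiff[],
  minScore = 0.5,
): PullRequestDiff | null {
  let best: PullRequestDiff | null = null;
  let bestScore = 0;
  for (const pr of prs) {
    const score = matchScore(authored, pr);
    if (score >= minScore && score > bestScore) {
      best = pr;
      bestScore = score;
    }
  }
  return best;
}

const agentAuthored = ["const total = a + b;", "  return total;", "", "console.log(total);"];
const candidatePrs: PullRequestDiff[] = [
  { number: 101, addedLines: ["const total = a + b;", "return total;"] },
  { number: 102, addedLines: ["import x from 'y';"] },
];

const linkedPr = bestMatch(agentAuthored, candidatePrs);
console.log(linkedPr === null ? "no match" : linkedPr.number); // 101
```

Two of the three non-blank agent lines appear in PR 101, so it clears the threshold, while PR 102 shares nothing and is ignored.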
|
|
128
|
+
**Block-level cost attribution** — a long session that touches several things isn't
|
|
129
|
+
billed as one lump. tuneloop splits it into blocks and attributes token cost per
|
|
130
|
+
block, so a per-PR or per-feature cost reflects only the work that went into it.
|
|
131
|
+
→ [how blocks work](./ARCHITECTURE.md#blocks-and-cost-attribution-srccoreblocksts)
|
|
132
|
+
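A small worked example may help here. Assuming each block already carries its own cost and, where known, the outcome it contributed to, a per-outcome total is just a grouped sum. The `Block` shape, the `"unattributed"` bucket, and the dollar figures below are invented for illustration and are not tuneloop's actual data model.

```typescript
// Hypothetical block: a slice of one session with its own priced token cost.
interface Block {
  costUsd: number;
  outcome: string | null; // e.g. "PR #2" or a feature name; null if nothing shipped
}

// Sum block costs per outcome; blocks without an outcome go to "unattributed".
function costByOutcome(blocks: Block[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const block of blocks) {
    const key = block.outcome ?? "unattributed";
    totals.set(key, (totals.get(key) ?? 0) + block.costUsd);
  }
  return totals;
}

// One $10 session that touched two PRs and did some exploratory work.
const sessionBlocks: Block[] = [
  { costUsd: 2.5, outcome: "PR #2" },
  { costUsd: 1.25, outcome: null },
  { costUsd: 1.25, outcome: "PR #2" },
  { costUsd: 5, outcome: "PR #5" },
];

const outcomeCosts = costByOutcome(sessionBlocks);
console.log(outcomeCosts.get("PR #2")); // 3.75
console.log(outcomeCosts.get("PR #5")); // 5
console.log(outcomeCosts.get("unattributed")); // 1.25
```

Billed as one lump, the whole $10 would land on whichever PR the session was linked to; split by block, PR #2 carries only the $3.75 of work that actually went into it.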
|
|
133
|
+
**Metrics** — the five dashboard headlines (outcome rate, cost per shipped artifact,
|
|
134
|
+
total spend, sessions, tool error rate) are each explained in
|
|
135
|
+
[ARCHITECTURE.md](./ARCHITECTURE.md#the-metrics-explained).
|
|
136
|
+
|
|
137
|
+
## Query it from your coding agent
|
|
138
|
+
|
|
139
|
+
Everything on the dashboard is a query over the store — and so is anything it
|
|
140
|
+
*doesn't* show. `tuneloop query` runs read-only SQL over that store, straight from
|
|
141
|
+
your terminal or your coding agent's shell:
|
|
142
|
+
|
|
143
|
+
```bash
|
|
144
|
+
tuneloop query "SELECT model, SUM(cost_usd) FROM usage_facts GROUP BY 1 ORDER BY 2 DESC"
|
|
145
|
+
tuneloop query --schema # tables, facets, and measures — learn the shape first
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
Only `SELECT` / `WITH … SELECT` run; writes and raw transcripts are off-limits.
|
|
149
|
+
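For readers wondering what a `SELECT`-only rule can look like in practice, below is a deliberately conservative sketch of a textual guard. It is not tuneloop's implementation, which this README does not show; the `isReadOnlyQuery` name and the keyword list are assumptions. A check like this is best treated as a first line of defence next to opening the database in read-only mode.

```typescript
// Conservative textual check: true only for a single SELECT / WITH ... SELECT statement.
// It may reject valid queries (for example ones calling the SQL replace() function,
// or containing ";" or "--" inside a string literal); that is the safe direction to err in.
function isReadOnlyQuery(sql: string): boolean {
  // Drop SQL comments so they cannot hide the first keyword.
  const stripped = sql
    .replace(/--[^\n]*/g, " ")
    .replace(/\/\*[\s\S]*?\*\//g, " ")
    .trim()
    .replace(/;\s*$/, ""); // allow one trailing semicolon

  // Empty input, or a second statement after a separator, is refused.
  if (stripped.length === 0 || stripped.indexOf(";") !== -1) return false;

  // The statement must begin with SELECT or WITH.
  if (!/^(select|with)\b/i.test(stripped)) return false;

  // SQLite accepts WITH ... INSERT/UPDATE/DELETE, so refuse write keywords anywhere.
  const writeKeywords =
    /\b(insert|update|delete|replace|drop|alter|create|attach|detach|pragma|vacuum)\b/i;
  return !writeKeywords.test(stripped);
}

console.log(isReadOnlyQuery("SELECT model, SUM(cost_usd) FROM usage_facts GROUP BY 1")); // true
console.log(isReadOnlyQuery("WITH t AS (SELECT 1) SELECT * FROM t;")); // true
console.log(isReadOnlyQuery("WITH t AS (SELECT 1) DELETE FROM usage_facts")); // false
console.log(isReadOnlyQuery("SELECT 1; DROP TABLE usage_facts")); // false
```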
|
|
150
|
+
Because it needs no server and speaks plain SQL, it's a natural fit for Claude Code
|
|
151
|
+
and other agents. Install the bundled skill so your agent knows the schema and the
|
|
152
|
+
grain rules before it writes a query:
|
|
153
|
+
|
|
154
|
+
```bash
|
|
155
|
+
npx skills add tuneloop/tuneloop
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
Then just ask — *"Query tuneloop: what did I spend per model last week?"* — and the agent writes
|
|
159
|
+
the SQL, runs it, and reads back the answer.
|
|
160
|
+
|
|
161
|
+
## LLM enrichment
|
|
162
|
+
|
|
163
|
+
To label each session with a work type, complexity, autonomy, and an LLM-judged
|
|
164
|
+
success signal — and to name the features you shipped — point tuneloop at **your
|
|
165
|
+
own** LLM key. Your session data goes only to the provider you choose:
|
|
166
|
+
|
|
167
|
+
```bash
|
|
168
|
+
export TUNELOOP_LLM_PROVIDER=anthropic
|
|
169
|
+
export ANTHROPIC_API_KEY=sk-ant-...
|
|
170
|
+
# optional: export TUNELOOP_LLM_MODEL=claude-haiku-4-5
|
|
171
|
+
npx tuneloop analyze
|
|
172
|
+
```
|
|
173
|
+
|
|
174
|
+
Pick a preset and supply its key; the model defaults sensibly and is overridable
|
|
175
|
+
with `TUNELOOP_LLM_MODEL` (or `--llm-model`). Anthropic and OpenAI are native;
|
|
176
|
+
everything else speaks the OpenAI-compatible API.
|
|
177
|
+
|
|
178
|
+
| `TUNELOOP_LLM_PROVIDER` | Key env | Notes |
|
|
179
|
+
|---|---|---|
|
|
180
|
+
| `anthropic` | `ANTHROPIC_API_KEY` | native |
|
|
181
|
+
| `openai` | `OPENAI_API_KEY` | native |
|
|
182
|
+
| `openrouter` | `OPENROUTER_API_KEY` | 400+ models via one key |
|
|
183
|
+
| `groq` | `GROQ_API_KEY` | fast; free tier |
|
|
184
|
+
| `deepseek` | `DEEPSEEK_API_KEY` | |
|
|
185
|
+
| `gemini` | `GEMINI_API_KEY` | Google, OpenAI-compatible endpoint |
|
|
186
|
+
| `together` / `fireworks` / `xai` | `TOGETHER_API_KEY` / `FIREWORKS_API_KEY` / `XAI_API_KEY` | |
|
|
187
|
+
| `ollama` | _(none)_ | local; `http://localhost:11434` |
|
|
188
|
+
| `openai-compatible` | `TUNELOOP_LLM_API_KEY` | any other host; set `TUNELOOP_LLM_BASE_URL` |
|
|
189
|
+
|
|
190
|
+
```bash
|
|
191
|
+
# A hosted provider — name it, never type a URL:
|
|
192
|
+
TUNELOOP_LLM_PROVIDER=openrouter OPENROUTER_API_KEY=sk-or-... \
|
|
193
|
+
npx tuneloop analyze --llm-model deepseek/deepseek-chat
|
|
194
|
+
|
|
195
|
+
# Fully local, no key, nothing leaves your machine:
|
|
196
|
+
npx tuneloop analyze --llm-provider ollama --llm-model qwen2.5
|
|
197
|
+
|
|
198
|
+
# Any other OpenAI-compatible host:
|
|
199
|
+
TUNELOOP_LLM_PROVIDER=openai-compatible TUNELOOP_LLM_BASE_URL=https://host/v1 \
|
|
200
|
+
TUNELOOP_LLM_API_KEY=… npx tuneloop analyze --llm-model my-model
|
|
201
|
+
```
|
|
202
|
+
|
|
203
|
+
Enrichment is one structured **tool call** per session, so use a
|
|
204
|
+
tool-call-capable model (all the hosted defaults qualify). Flags override the env
|
|
205
|
+
for one run; the API key is always env-only. It's cheap — a typical corpus of ~80
|
|
206
|
+
sessions runs about **$0.60** with Claude Haiku. This cost shows up as **Analysis
|
|
207
|
+
spend** in the summary, priced from a built-in table with an OpenRouter public
|
|
208
|
+
price list filling gaps (cached under `~/.tuneloop/`).
|
|
209
|
+
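The precedence rules above (flags override the environment for one run, while the API key is read from the environment only) can be sketched as a small resolver. The provider-to-variable table is copied from this section; the `resolveLlmSettings` function, its types, and the choice to return `null` when no provider is set are assumptions for illustration, not tuneloop's internal code.

```typescript
// Key environment variable per provider, as listed in the table above.
// `null` means the provider needs no key (local Ollama).
const KEY_ENV: Record<string, string | null> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
  groq: "GROQ_API_KEY",
  deepseek: "DEEPSEEK_API_KEY",
  gemini: "GEMINI_API_KEY",
  together: "TOGETHER_API_KEY",
  fireworks: "FIREWORKS_API_KEY",
  xai: "XAI_API_KEY",
  ollama: null,
  "openai-compatible": "TUNELOOP_LLM_API_KEY",
};

// Hypothetical shapes for parsed CLI flags and the process environment.
interface Flags { llmProvider?: string; llmModel?: string }
type Env = Record<string, string | undefined>;
interface LlmSettings { provider: string; model: string | null; apiKey: string | null }

function resolveLlmSettings(flags: Flags, env: Env): LlmSettings | null {
  // Flags win over the environment for this run.
  const provider = flags.llmProvider ?? env["TUNELOOP_LLM_PROVIDER"];
  if (provider === undefined) return null; // assumption: no provider means no enrichment

  if (!Object.prototype.hasOwnProperty.call(KEY_ENV, provider)) {
    throw new Error(`unknown provider: ${provider}`);
  }

  // The key is never taken from a flag, only from the provider's env variable.
  const keyEnv = KEY_ENV[provider] ?? null;
  const apiKey = keyEnv === null ? null : env[keyEnv] ?? null;
  if (keyEnv !== null && apiKey === null) {
    throw new Error(`${keyEnv} is not set`);
  }

  // A null model stands for "use the preset's default".
  const model = flags.llmModel ?? env["TUNELOOP_LLM_MODEL"] ?? null;
  return { provider, model, apiKey };
}

// A flag overrides the provider named in the environment.
const localRun = resolveLlmSettings(
  { llmProvider: "ollama", llmModel: "qwen2.5" },
  { TUNELOOP_LLM_PROVIDER: "anthropic" },
);
console.log(localRun?.provider, localRun?.apiKey); // ollama null
```

Keeping the key out of the flag path means it never shows up in shell history or a process listing, which is one plausible reason for an env-only rule.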
|
|
210
|
+
**Local Ollama** needs a bigger context window and a capable model: the enrichment
|
|
211
|
+
prompt is ~4–6k tokens but Ollama's ~2k default silently truncates it, so start the
|
|
212
|
+
server with `OLLAMA_CONTEXT_LENGTH=8192 ollama serve` and use a tool-strong ≥7B
|
|
213
|
+
model like `qwen2.5:7b` (tiny models tool-call unreliably).
|
|
214
|
+
|
|
215
|
+
## Privacy
|
|
216
|
+
|
|
217
|
+
Transcripts are processed locally and results are written to a local SQLite store
|
|
218
|
+
(`~/.tuneloop/` by default). tuneloop never posts your **session data** anywhere —
|
|
219
|
+
the only thing that ever leaves is a transcript sent to the LLM provider whose key
|
|
220
|
+
you supply, and only if you enable enrichment. Its other network calls are
|
|
221
|
+
read-only and carry none of your data: your local `gh` for PR status and diffs
|
|
222
|
+
(your own GitHub auth), and OpenRouter's public price list to cost models the
|
|
223
|
+
built-in table doesn't know. To avoid sending transcripts off the machine at all,
|
|
224
|
+
enrich against a local model (`--llm-provider ollama`).
|
|
225
|
+
|
|
226
|
+
## Run from source
|
|
227
|
+
|
|
228
|
+
`npx tuneloop` is all most people need. To hack on tuneloop itself, run it from a
|
|
229
|
+
local checkout:
|
|
230
|
+
|
|
231
|
+
```bash
|
|
232
|
+
npm install
|
|
233
|
+
npm run dev -- analyze # builds, runs the CLI (args after `--`), then serves the dashboard
|
|
234
|
+
```
|
|
235
|
+
|
|
236
|
+
Or build once and call the binary directly:
|
|
237
|
+
|
|
238
|
+
```bash
|
|
239
|
+
npm run build
|
|
240
|
+
node dist/cli.js analyze
|
|
241
|
+
```
|
|
242
|
+
|
|
243
|
+
`npm link` gives you a global `tuneloop` backed by your local build. LLM
|
|
244
|
+
enrichment works the same way — set `TUNELOOP_LLM_PROVIDER` and its key before
|
|
245
|
+
running.
|
|
246
|
+
|
|
247
|
+
## Extending
|
|
248
|
+
|
|
249
|
+
Adding a new analysis takes one file: implement the `Processor` interface, declare any
|
|
250
|
+
sliceable facets, and register it — it shows up in the store and the dashboard (as
|
|
251
|
+
a card and a filter) automatically, no migration. To support a new AI tool, write
|
|
252
|
+
a `SourceAdapter`. See [ARCHITECTURE.md](./ARCHITECTURE.md).
|
|
253
|
+
|
|
254
|
+
## License
|
|
255
|
+
|
|
256
|
+
MIT
|