extract-tracker 0.2.2__tar.gz
- extract_tracker-0.2.2/MANIFEST.in +7 -0
- extract_tracker-0.2.2/PKG-INFO +256 -0
- extract_tracker-0.2.2/README.md +229 -0
- extract_tracker-0.2.2/manual/config.md +139 -0
- extract_tracker-0.2.2/manual/mcp.md +49 -0
- extract_tracker-0.2.2/manual/packaging.md +187 -0
- extract_tracker-0.2.2/manual/sync.md +59 -0
- extract_tracker-0.2.2/manual/tui.md +89 -0
- extract_tracker-0.2.2/manual/usage.md +124 -0
- extract_tracker-0.2.2/pyproject.toml +53 -0
- extract_tracker-0.2.2/python/src/extract/__init__.py +8 -0
- extract_tracker-0.2.2/python/src/extract/__main__.py +139 -0
- extract_tracker-0.2.2/python/src/extract/experiment.py +114 -0
- extract_tracker-0.2.2/python/src/extract/init.py +532 -0
- extract_tracker-0.2.2/python/src/extract/mcp.py +859 -0
- extract_tracker-0.2.2/python/src/extract/metrics.py +25 -0
- extract_tracker-0.2.2/python/src/extract/release_versioning.py +278 -0
- extract_tracker-0.2.2/python/src/extract/run.py +419 -0
- extract_tracker-0.2.2/python/src/extract/store.py +424 -0
- extract_tracker-0.2.2/python/src/extract/sync.py +307 -0
- extract_tracker-0.2.2/python/src/extract_tracker.egg-info/PKG-INFO +256 -0
- extract_tracker-0.2.2/python/src/extract_tracker.egg-info/SOURCES.txt +55 -0
- extract_tracker-0.2.2/python/src/extract_tracker.egg-info/dependency_links.txt +1 -0
- extract_tracker-0.2.2/python/src/extract_tracker.egg-info/entry_points.txt +2 -0
- extract_tracker-0.2.2/python/src/extract_tracker.egg-info/requires.txt +7 -0
- extract_tracker-0.2.2/python/src/extract_tracker.egg-info/top_level.txt +1 -0
- extract_tracker-0.2.2/rust/Cargo.lock +2755 -0
- extract_tracker-0.2.2/rust/Cargo.toml +24 -0
- extract_tracker-0.2.2/rust/src/app.rs +1523 -0
- extract_tracker-0.2.2/rust/src/artifact.rs +89 -0
- extract_tracker-0.2.2/rust/src/config.rs +550 -0
- extract_tracker-0.2.2/rust/src/db.rs +1343 -0
- extract_tracker-0.2.2/rust/src/event.rs +54 -0
- extract_tracker-0.2.2/rust/src/keys.rs +64 -0
- extract_tracker-0.2.2/rust/src/main.rs +192 -0
- extract_tracker-0.2.2/rust/src/model.rs +150 -0
- extract_tracker-0.2.2/rust/src/ui/chart.rs +90 -0
- extract_tracker-0.2.2/rust/src/ui/compare.rs +746 -0
- extract_tracker-0.2.2/rust/src/ui/dashboard.rs +269 -0
- extract_tracker-0.2.2/rust/src/ui/detail.rs +938 -0
- extract_tracker-0.2.2/rust/src/ui/diff.rs +575 -0
- extract_tracker-0.2.2/rust/src/ui/heatmap.rs +140 -0
- extract_tracker-0.2.2/rust/src/ui/help.rs +192 -0
- extract_tracker-0.2.2/rust/src/ui/layout.rs +557 -0
- extract_tracker-0.2.2/rust/src/ui/lineage.rs +335 -0
- extract_tracker-0.2.2/rust/src/ui/mod.rs +18 -0
- extract_tracker-0.2.2/rust/src/ui/popup.rs +742 -0
- extract_tracker-0.2.2/rust/src/ui/registry.rs +202 -0
- extract_tracker-0.2.2/rust/src/ui/search.rs +254 -0
- extract_tracker-0.2.2/rust/src/ui/selection.rs +222 -0
- extract_tracker-0.2.2/rust/src/ui/statusbar.rs +157 -0
- extract_tracker-0.2.2/rust/src/ui/summary.rs +824 -0
- extract_tracker-0.2.2/rust/src/ui/theme.rs +134 -0
- extract_tracker-0.2.2/rust/src/ui/todo.rs +587 -0
- extract_tracker-0.2.2/rust/src/ui/tree.rs +493 -0
- extract_tracker-0.2.2/setup.cfg +4 -0
- extract_tracker-0.2.2/setup.py +24 -0
+++ extract_tracker-0.2.2/PKG-INFO
@@ -0,0 +1,256 @@
Metadata-Version: 2.4
Name: extract-tracker
Version: 0.2.2
Summary: Local-first experiment tracking for deep learning
Author: Phil Oh
Keywords: experiment-tracking,machine-learning,deep-learning,sqlite,tui,mcp
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: python-ulid>=3.0
Requires-Dist: numpy>=1.21
Requires-Dist: typing-extensions>=4.15.0
Requires-Dist: tomli>=2.0
Requires-Dist: mcp>=1.0
Requires-Dist: rich>=13.0
Requires-Dist: questionary>=2.0

# Extract

Local-first experiment tracking for deep learning. Extract pairs a small Python SDK with a fast Rust TUI so you can log runs, watch training live, compare variants, inspect artifacts, and keep everything in a project-local SQLite store.

No hosted service. No daemon. No account. One `.extract/` directory per project.

## Why Extract

- **Local-first by default**: experiments live beside your code in `.extract/`.
- **Hierarchical experiments**: organize runs as `benchmark > model > variant` or any hierarchy you choose.
- **Live terminal UI**: stream curves during training, compare marked runs, browse artifacts, tags, notes, TODOs, lineage, and model registry entries.
- **Separated metric surfaces**: `run.curve()` stores dense per-step training series; `run.log()` stores headline metrics for summaries and rankings.
- **Portable stores**: sync with `rsync`, archive as `tar.gz`, or move one directory between machines.
- **Agent-readable**: optional read-only MCP server exposes experiments to Claude Code, Claude Desktop, and other MCP hosts.

## Install

```bash
pip install extract-tracker
```

Then initialize a store in your project:

```bash
extract init
```

`extract init` writes `.extract/config.toml` and asks for your experiment hierarchy. Example hierarchy:

```text
benchmark > model > variant
```

## 60-second quickstart

```python
from extract import Store

store = Store()
exp = store.experiment({
    "benchmark": "imagenet",
    "model": "resnet50",
    "variant": "lr_0.01",
})

with exp.run(config={"lr": 0.01}, total_steps=1000) as run:
    for step in range(1000):
        loss, acc = train_step(...)
        run.curve(step=step, train_loss=loss, train_acc=acc)

    run.log(final_acc=acc, final_loss=loss)
    run.tag("baseline")
    run.note("Stable run; use as comparison anchor.")
```

Open another terminal while training runs:

```bash
extract tui
```

Curves update live. Mark runs with `Space`, press `c` to compare, press `d` to diff configs.

## Core concepts

### Store

A store is a project-local `.extract/` directory:

```text
.extract/
├── extract.db
├── config.toml
├── artifacts/
└── models/
```

SQLite uses WAL mode, so training scripts can write while the TUI reads.

### Experiment

An experiment is a node in your configured hierarchy. With `benchmark > model > variant`, this call creates or reuses each path component:

```python
exp = store.experiment({
    "benchmark": "mmlu",
    "model": "llama-3.2-1b",
    "variant": "lora-r16",
})
```

### Run

A run is one execution under an experiment. Runs track config, status, hostname, git SHA, timestamps, tags, notes, metrics, artifacts, models, lineage, and TODOs.

```python
with exp.run(config=config, name="seed-1", total_steps=len(loader)) as run:
    ...
```

### Metrics

Use `curve()` for dense per-step values and `log()` for final or headline metrics.

```python
run.curve(step=step, train_loss=loss, eval_acc=acc)  # live charts
run.log(best_acc=best_acc, final_loss=final_loss)  # summary/ranking columns
```

### Artifacts and models

```python
run.log_table("confusion_matrix", matrix)
run.log_text("notes", "Observed better calibration after warmup.")
run.register_model("resnet50", "v1", "checkpoints/best.pt", framework="pytorch")
```

## CLI

```bash
extract init  # create .extract/config.toml
extract tui  # open Rust TUI
extract tui --store path/to/.extract  # browse another store
extract sync push user@hpc:/path/.extract/
extract sync pull user@hpc:/path/.extract/
extract sync export backup.tar.gz
extract sync import backup.tar.gz
python -m extract.mcp --store .extract  # read-only MCP server
```

## TUI highlights

| Key | Action |
|---|---|
| `j` / `k` | Move down / up |
| `Enter` | Expand / select |
| `Space` | Mark run for comparison |
| `c` | Compare marked runs |
| `d` | Diff marked run configs |
| `r` | Run browser |
| `/` | Search experiments and runs |
| `t` | Edit tags |
| `n` | Append note |
| `M` | Model registry |
| `T` | TODO view |
| `L` | Lineage DAG |
| `?` | Help |
| `q` | Quit |

Full keymap: [manual/tui.md](manual/tui.md).

## Configuration

Edit `.extract/config.toml`:

```toml
[store]
hierarchy = "benchmark > model > variant"

[summary]
sections = ["runs", "metrics", "tables", "curves"]
curve_width = 80
curve_smooth = false

[metrics]
minimize = ["loss", "forgetting_rate"]
maximize = ["accuracy", "f1", "custom_score"]
order = "alpha"

[[tags.definitions]]
name = "baseline"
color = "blue"

[theme]
accent = "#89b4fa"
error = "#f38ba8"
```

Full config reference: [manual/config.md](manual/config.md).

## Sync between machines

```bash
extract sync push user@hpc:/scratch/project/.extract/
extract sync pull user@hpc:/scratch/project/.extract/
```

Pull merges by experiment path and run ULID, so stores can move between laptop, workstation, and HPC jobs without a central server.

More: [manual/sync.md](manual/sync.md).

## MCP server

Expose your store to LLM agents with a read-only MCP server:

```bash
python -m extract.mcp --store .extract
```

Agents can list experiments, inspect runs, compare metrics, search tags/status, list TODOs, walk lineage, and read model registry metadata.

More: [manual/mcp.md](manual/mcp.md).

## Development

Use Nix for project dependencies:

```bash
nix develop
pip install -e .
pytest python/tests
```

Build distributions locally:

```bash
python -m build
python -m twine check dist/*
pip install dist/*.whl
extract --help
```

Packaging and release notes: [manual/packaging.md](manual/packaging.md).

## Project status

Extract is early-stage and optimized for local ML research workflows. Current package name is `extract-tracker`; Python import and CLI are both `extract`.

License not declared yet. Choose and add `LICENSE` before public distribution.
+++ extract_tracker-0.2.2/README.md
@@ -0,0 +1,229 @@
# Extract

Local-first experiment tracking for deep learning. Extract pairs a small Python SDK with a fast Rust TUI so you can log runs, watch training live, compare variants, inspect artifacts, and keep everything in a project-local SQLite store.

No hosted service. No daemon. No account. One `.extract/` directory per project.

## Why Extract

- **Local-first by default**: experiments live beside your code in `.extract/`.
- **Hierarchical experiments**: organize runs as `benchmark > model > variant` or any hierarchy you choose.
- **Live terminal UI**: stream curves during training, compare marked runs, browse artifacts, tags, notes, TODOs, lineage, and model registry entries.
- **Separated metric surfaces**: `run.curve()` stores dense per-step training series; `run.log()` stores headline metrics for summaries and rankings.
- **Portable stores**: sync with `rsync`, archive as `tar.gz`, or move one directory between machines.
- **Agent-readable**: optional read-only MCP server exposes experiments to Claude Code, Claude Desktop, and other MCP hosts.

## Install

```bash
pip install extract-tracker
```

Then initialize a store in your project:

```bash
extract init
```

`extract init` writes `.extract/config.toml` and asks for your experiment hierarchy. Example hierarchy:

```text
benchmark > model > variant
```

## 60-second quickstart

```python
from extract import Store

store = Store()
exp = store.experiment({
    "benchmark": "imagenet",
    "model": "resnet50",
    "variant": "lr_0.01",
})

with exp.run(config={"lr": 0.01}, total_steps=1000) as run:
    for step in range(1000):
        loss, acc = train_step(...)
        run.curve(step=step, train_loss=loss, train_acc=acc)

    run.log(final_acc=acc, final_loss=loss)
    run.tag("baseline")
    run.note("Stable run; use as comparison anchor.")
```

Open another terminal while training runs:

```bash
extract tui
```

Curves update live. Mark runs with `Space`, press `c` to compare, press `d` to diff configs.

## Core concepts

### Store

A store is a project-local `.extract/` directory:

```text
.extract/
├── extract.db
├── config.toml
├── artifacts/
└── models/
```

SQLite uses WAL mode, so training scripts can write while the TUI reads.
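
The WAL behaviour can be checked with Python's standard `sqlite3` module alone. The sketch below is not Extract's code: the `demo.db` file and the `points` table are invented for illustration and say nothing about the real schema. It opens a writer and a reader on the same file, leaves a write transaction open, and shows that the reader is not blocked and sees the last committed snapshot.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file and table; Extract's real schema is not shown here.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the connection in autocommit mode,
# so transactions are controlled explicitly with BEGIN / COMMIT.
writer = sqlite3.connect(db_path, isolation_level=None)
journal_mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE points (step INTEGER, loss REAL)")
writer.execute("INSERT INTO points VALUES (0, 1.0)")

reader = sqlite3.connect(db_path)

# Start a write transaction and leave it open, as a training loop might.
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO points VALUES (1, 0.5)")

# In WAL mode the reader proceeds and sees only committed rows.
rows_during_write = reader.execute("SELECT COUNT(*) FROM points").fetchone()[0]

writer.execute("COMMIT")
rows_after_commit = reader.execute("SELECT COUNT(*) FROM points").fetchone()[0]

reader.close()
writer.close()
```

Here the reader plays the role of the TUI and the writer the role of a training script.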

### Experiment

An experiment is a node in your configured hierarchy. With `benchmark > model > variant`, this call creates or reuses each path component:

```python
exp = store.experiment({
    "benchmark": "mmlu",
    "model": "llama-3.2-1b",
    "variant": "lora-r16",
})
```

### Run

A run is one execution under an experiment. Runs track config, status, hostname, git SHA, timestamps, tags, notes, metrics, artifacts, models, lineage, and TODOs.

```python
with exp.run(config=config, name="seed-1", total_steps=len(loader)) as run:
    ...
```

### Metrics

Use `curve()` for dense per-step values and `log()` for final or headline metrics.

```python
run.curve(step=step, train_loss=loss, eval_acc=acc)  # live charts
run.log(best_acc=best_acc, final_loss=final_loss)  # summary/ranking columns
```

### Artifacts and models

```python
run.log_table("confusion_matrix", matrix)
run.log_text("notes", "Observed better calibration after warmup.")
run.register_model("resnet50", "v1", "checkpoints/best.pt", framework="pytorch")
```

## CLI

```bash
extract init  # create .extract/config.toml
extract tui  # open Rust TUI
extract tui --store path/to/.extract  # browse another store
extract sync push user@hpc:/path/.extract/
extract sync pull user@hpc:/path/.extract/
extract sync export backup.tar.gz
extract sync import backup.tar.gz
python -m extract.mcp --store .extract  # read-only MCP server
```

## TUI highlights

| Key | Action |
|---|---|
| `j` / `k` | Move down / up |
| `Enter` | Expand / select |
| `Space` | Mark run for comparison |
| `c` | Compare marked runs |
| `d` | Diff marked run configs |
| `r` | Run browser |
| `/` | Search experiments and runs |
| `t` | Edit tags |
| `n` | Append note |
| `M` | Model registry |
| `T` | TODO view |
| `L` | Lineage DAG |
| `?` | Help |
| `q` | Quit |

Full keymap: [manual/tui.md](manual/tui.md).

## Configuration

Edit `.extract/config.toml`:

```toml
[store]
hierarchy = "benchmark > model > variant"

[summary]
sections = ["runs", "metrics", "tables", "curves"]
curve_width = 80
curve_smooth = false

[metrics]
minimize = ["loss", "forgetting_rate"]
maximize = ["accuracy", "f1", "custom_score"]
order = "alpha"

[[tags.definitions]]
name = "baseline"
color = "blue"

[theme]
accent = "#89b4fa"
error = "#f38ba8"
```

Full config reference: [manual/config.md](manual/config.md).

## Sync between machines

```bash
extract sync push user@hpc:/scratch/project/.extract/
extract sync pull user@hpc:/scratch/project/.extract/
```

Pull merges by experiment path and run ULID, so stores can move between laptop, workstation, and HPC jobs without a central server.
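
A minimal sketch of what merging by experiment path and run ULID could look like. It is not the implementation in `sync.py`: the in-memory layout (experiment path mapping to runs keyed by ULID), the `merge_pull` name, and the rule that an existing local copy wins are all assumptions made for illustration. The IDs are short placeholders, not valid 26-character ULIDs.

```python
from typing import Dict, Tuple

# Hypothetical layout: experiment path -> {run ULID -> run record}.
Runs = Dict[str, dict]
StoreIndex = Dict[Tuple[str, ...], Runs]


def merge_pull(local: StoreIndex, remote: StoreIndex) -> StoreIndex:
    """Union two stores; a run is added only if its ULID is unseen under that path."""
    merged: StoreIndex = {path: dict(runs) for path, runs in local.items()}
    for path, runs in remote.items():
        target = merged.setdefault(path, {})
        for ulid, record in runs.items():
            # Assumption: an existing local copy wins; the real conflict rule may differ.
            target.setdefault(ulid, record)
    return merged


laptop = {("imagenet", "resnet50", "lr_0.01"): {"01HZX0A": {"status": "done"}}}
hpc = {
    ("imagenet", "resnet50", "lr_0.01"): {
        "01HZX0A": {"status": "done"},  # same run, already present locally
        "01HZX0B": {"status": "done"},  # new run from the cluster
    },
    ("mmlu", "llama-3.2-1b", "lora-r16"): {"01HZX0C": {"status": "running"}},
}

merged = merge_pull(laptop, hpc)
```

Because the merge key is a globally unique run ID scoped by experiment path, two machines can create runs independently without coordinating through a server.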

More: [manual/sync.md](manual/sync.md).

## MCP server

Expose your store to LLM agents with a read-only MCP server:

```bash
python -m extract.mcp --store .extract
```

Agents can list experiments, inspect runs, compare metrics, search tags/status, list TODOs, walk lineage, and read model registry metadata.

More: [manual/mcp.md](manual/mcp.md).

## Development

Use Nix for project dependencies:

```bash
nix develop
pip install -e .
pytest python/tests
```

Build distributions locally:

```bash
python -m build
python -m twine check dist/*
pip install dist/*.whl
extract --help
```

Packaging and release notes: [manual/packaging.md](manual/packaging.md).

## Project status

Extract is early-stage and optimized for local ML research workflows. Current package name is `extract-tracker`; Python import and CLI are both `extract`.

License not declared yet. Choose and add `LICENSE` before public distribution.
+++ extract_tracker-0.2.2/manual/config.md
@@ -0,0 +1,139 @@
# Configuration

Edit `.extract/config.toml`. `extract init` writes the initial file.

## Store setup

```toml
[store]
hierarchy = "benchmark > model > variant"
```

Hierarchy levels define how experiments are created from dict specs:

```python
store.experiment({
    "benchmark": "imagenet",
    "model": "resnet50",
    "variant": "lr_0.01",
})
```

## Summary tab

Controls the Detail panel Summary tab (`S`).

```toml
[summary]
sections = ["runs", "metrics", "tables", "curves"]
curve_width = 80  # chart width as % of panel
# curve_height = 10  # chart height in lines; default auto-scales by metric count
curve_smooth = false
```

## Info tab and config fields

Controls the Detail panel Info tab (`I`) and config sections in Compare/Diff views.
Nested configs are flattened with dot notation, e.g. `model.lora_r`, `task.num_train_epochs`.

```toml
[info]
fields = ["model.*", "task.num_train_epochs"]
time_format = "%Y-%m-%d %H:%M:%S"
```
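
The dot notation is easiest to see on a concrete config. The helper below is an illustrative sketch, not Extract's own flattening code; the `flatten` name and the sample values are hypothetical, and how Extract treats lists or empty dicts is not covered here.

```python
from typing import Any, Dict


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dot-separated keys; non-dict values are leaves."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


config = {
    "model": {"name": "llama-3.2-1b", "lora_r": 16},
    "task": {"num_train_epochs": 3},
    "seed": 1,
}
flat = flatten(config)
```

The resulting keys (`model.name`, `model.lora_r`, `task.num_train_epochs`, `seed`) are the names that `fields` patterns are matched against.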

Glob syntax:

- `*` matches one segment.
- `**` matches multiple segments.
- `?` matches one character.
- `{a,b}` matches alternatives.
- Prefix with `!` to exclude, e.g. `["model.**", "!model.parent"]`.

Empty `fields` means show all.
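
One way to read these rules is as a translation from glob to regular expression. The sketch below is an assumption-laden illustration, not the matcher Extract ships: `glob_to_regex` and `select_fields` are hypothetical names, `**` is taken to mean "one or more segments", and `*` is taken to stay within a single dot-separated segment.

```python
import re
from typing import Iterable, List


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a dotted-key glob into an anchored regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**", i):
            out.append(r".+")  # any characters, dots included
            i += 2
        elif ch == "*":
            out.append(r"[^.]+")  # stays inside one segment
            i += 1
        elif ch == "?":
            out.append(r"[^.]")  # one character, never a dot
            i += 1
        elif ch == "{":
            end = pattern.index("}", i)  # raises ValueError on an unclosed brace
            options = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
            i = end + 1
        else:
            out.append(re.escape(ch))
            i += 1
    return re.compile("".join(out) + r"\Z")


def select_fields(keys: Iterable[str], fields: List[str]) -> List[str]:
    """Keep keys matching any include pattern and no '!'-prefixed exclude pattern."""
    includes = [glob_to_regex(p) for p in fields if not p.startswith("!")]
    excludes = [glob_to_regex(p[1:]) for p in fields if p.startswith("!")]
    selected = []
    for key in keys:
        included = not includes or any(r.match(key) for r in includes)
        excluded = any(r.match(key) for r in excludes)
        if included and not excluded:
            selected.append(key)
    return selected


keys = ["model.name", "model.lora_r", "model.parent", "model.opt.lr", "task.num_train_epochs"]
one_segment = select_fields(keys, ["model.*"])
deep_minus_parent = select_fields(keys, ["model.**", "!model.parent"])
alternatives = select_fields(keys, ["model.{name,parent}"])
everything = select_fields(keys, [])
```

With these assumptions, `model.*` skips `model.opt.lr` because it spans two segments, while `model.**` includes it.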

## Compare view

Controls Compare view (`c` with marked runs).

```toml
[compare]
sections = ["pivot", "config", "tables", "curves"]
curve_width = 50
# curve_height = 10
```

## Metric interpretation

```toml
[metrics]
minimize = ["forgetting_rate"]
maximize = ["custom_score"]
order = "alpha"
```

Unlisted metrics use name heuristics. For example, names containing `loss` are minimized.

`order` supports:

- `alpha`
- `rev_alpha`
- explicit order such as `"accuracy > loss > f1"`
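
The two settings above can be pictured as a pair of small functions. This is a sketch under stated assumptions rather than Extract's logic: `direction` and `order_metrics` are hypothetical names, only the documented `loss` heuristic is implemented (other names are left undecided), and metrics missing from an explicit order are assumed to sort last, alphabetically.

```python
from typing import List, Optional


def direction(name: str, minimize: List[str], maximize: List[str]) -> Optional[str]:
    """Explicit lists win; otherwise fall back to a name heuristic."""
    if name in minimize:
        return "min"
    if name in maximize:
        return "max"
    # Only the `loss` heuristic is documented; anything else stays undecided here.
    if "loss" in name:
        return "min"
    return None


def order_metrics(names: List[str], order: str) -> List[str]:
    """Apply `alpha`, `rev_alpha`, or an explicit "a > b > c" ordering."""
    if order == "alpha":
        return sorted(names)
    if order == "rev_alpha":
        return sorted(names, reverse=True)
    explicit = [part.strip() for part in order.split(">")]
    rank = {name: i for i, name in enumerate(explicit)}
    # Assumption: metrics not named in the explicit order go last, alphabetically.
    return sorted(names, key=lambda n: (rank.get(n, len(explicit)), n))


names = ["f1", "loss", "accuracy", "val_loss"]
alpha = order_metrics(names, "alpha")
rev_alpha = order_metrics(names, "rev_alpha")
explicit = order_metrics(names, "accuracy > loss > f1")
val_loss_direction = direction("val_loss", ["forgetting_rate"], ["custom_score"])
```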

## Table highlighting

First matching rule wins.

```toml
[[tables.highlight]]
min = 0.7
color = "red"
```

Rule fields:

- `eq`: exact match
- `min`: inclusive lower bound
- `max`: exclusive upper bound
- `pattern`: substring match
- `color`: named color or hex

Colors: `red`, `green`, `yellow`, `blue`, `cyan`, `magenta`, `white`, `orange`, `none`, or hex such as `#ff6600`.
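
The rule fields combine into a simple first-match lookup. The following sketch shows one plausible reading and is not taken from the TUI source: `rule_matches` and `highlight_color` are hypothetical names, and it assumes that when a rule carries several conditions, all of them must hold.

```python
from typing import Any, Dict, List, Optional


def rule_matches(rule: Dict[str, Any], cell: Any) -> bool:
    """Check one rule; every condition present on the rule must hold (assumption)."""
    if "eq" in rule and cell != rule["eq"]:
        return False
    if "pattern" in rule and rule["pattern"] not in str(cell):
        return False
    if "min" in rule or "max" in rule:
        if isinstance(cell, bool) or not isinstance(cell, (int, float)):
            return False  # numeric bounds never match non-numeric cells
        if "min" in rule and cell < rule["min"]:  # inclusive lower bound
            return False
        if "max" in rule and cell >= rule["max"]:  # exclusive upper bound
            return False
    return True


def highlight_color(rules: List[Dict[str, Any]], cell: Any) -> Optional[str]:
    """Return the color of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule_matches(rule, cell):
            return rule.get("color")
    return None


rules = [
    {"min": 0.7, "color": "red"},
    {"min": 0.4, "max": 0.7, "color": "yellow"},
    {"pattern": "n/a", "color": "none"},
]
at_boundary = highlight_color(rules, 0.7)
just_below = highlight_color(rules, 0.69)
text_cell = highlight_color(rules, "n/a")
unmatched = highlight_color(rules, 0.1)
```

Since `min` is inclusive and `max` is exclusive, adjacent bands such as `[0.4, 0.7)` and `[0.7, ...)` do not overlap, so rule order only matters when ranges are deliberately nested.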

## Tags

Predefine tags and colors:

```toml
[[tags.definitions]]
name = "baseline"
color = "blue"

[[tags.definitions]]
name = "production"
color = "#a6e3a1"

[[tags.definitions]]
name = "deprecated"
color = "red"
```

Press `t` in the TUI to open the tag picker. Type to fuzzy-filter, Enter to toggle, or create new tags.

## Theme

```toml
[theme]
fg = "#cdd6f4"
bg = "#1e1e2e"
accent = "#89b4fa"
accent_dim = "#585b70"
success = "#a6e3a1"
warning = "#f9e2af"
error = "#f38ba8"
border = "#585b70"
border_focused = "#89b4fa"

[notifications]
timeout = 3
```
+++ extract_tracker-0.2.2/manual/mcp.md
@@ -0,0 +1,49 @@
# MCP Server

Extract can expose a store through a read-only MCP server. Use this with Claude Code, Claude Desktop, or any MCP-capable host.

## Run server

```bash
python -m extract.mcp --store .extract
```

The MCP host normally launches this command as a subprocess over stdio. Relative `--store` paths resolve against the host process cwd.
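
Because a relative path is resolved against whatever directory the launching process happens to be in, the same `--store .extract` argument can point at different stores. The illustration below uses only `pathlib` and does not call Extract; the two temporary directories are stand-ins for a project root and some other host working directory.

```python
import os
import tempfile
from pathlib import Path

store_arg = Path(".extract")  # what `--store .extract` would pass through

# Stand-in directories (hypothetical): a project root and an unrelated host cwd.
project = Path(tempfile.mkdtemp()).resolve()
elsewhere = Path(tempfile.mkdtemp()).resolve()

original_cwd = Path.cwd()
try:
    os.chdir(project)
    from_project = store_arg.resolve()
    os.chdir(elsewhere)
    from_elsewhere = store_arg.resolve()
finally:
    os.chdir(original_cwd)  # restore the caller's working directory
```

Passing an absolute path to `--store` removes the dependence on the host's working directory.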

## Claude Code config

Create `.mcp.json` in project root:

```json
{
  "mcpServers": {
    "extract": {
      "command": ".venv/bin/python",
      "args": ["-m", "extract.mcp"]
    }
  }
}
```

Then ask questions such as:

- "Compare the two resnet50 runs and tell me which had the lowest final loss."
- "What experiments are tagged production-candidate?"
- "Show lineage for the best MMLU run."

## Tools

All tools are read-only.

| Tool | Purpose |
|---|---|
| `list_experiments` | Browse experiment hierarchy with run counts |
| `list_runs` | List runs, optionally for one experiment, with labels and config summaries |
| `get_run` | Full run detail: config, final metrics, params, artifacts, TODOs |
| `compare_runs` | Compare 2–10 runs with rankings, optional histories, config diffs |
| `search` | Substring search plus tag/status/prefix/date filters |
| `list_todos` | TODOs scoped global / experiment / run |
| `get_lineage` | BFS walk of lineage DAG: ancestors, descendants, or both |
| `list_models` | Registered models with metadata |
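
To make the `get_lineage` row concrete, here is a generic breadth-first walk over a lineage DAG. It is a sketch, not the tool's implementation or schema: the `walk_lineage` name, the `(parent, child)` edge list, the `direction` argument, and the run names are all assumptions, and "both" is taken to mean ancestors followed by descendants of the starting run.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def walk_lineage(edges: List[Tuple[str, str]], start: str, direction: str = "both") -> List[str]:
    """Breadth-first walk over (parent, child) edges, excluding `start` itself."""
    parents: Dict[str, List[str]] = {}
    children: Dict[str, List[str]] = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        parents.setdefault(child, []).append(parent)

    def bfs(neighbors: Dict[str, List[str]]) -> List[str]:
        seen: Set[str] = {start}
        queue = deque([start])
        order: List[str] = []
        while queue:
            node = queue.popleft()
            for nxt in neighbors.get(node, []):
                if nxt not in seen:  # a DAG node can be reachable by several paths
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order

    if direction == "ancestors":
        return bfs(parents)
    if direction == "descendants":
        return bfs(children)
    if direction == "both":
        return bfs(parents) + bfs(children)
    raise ValueError(f"unknown direction: {direction!r}")


# Hypothetical run names standing in for run IDs.
edges = [
    ("pretrain", "finetune"),
    ("finetune", "distill"),
    ("finetune", "quantize"),
    ("data-v2", "finetune"),
]
ancestors = walk_lineage(edges, "finetune", "ancestors")
descendants = walk_lineage(edges, "finetune", "descendants")
both = walk_lineage(edges, "finetune")
```

Breadth-first order returns the nearest relatives first, which suits questions like "what was this run fine-tuned from?".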

Full schemas, response shapes, and error catalog live in [`DOC.md`](../DOC.md#mcp-server).