agent-rollouts 0.1.0__tar.gz

@@ -0,0 +1,4 @@
+ .venv/
+ __pycache__/
+ *.pyc
+ .env
@@ -0,0 +1,16 @@
+ repos:
+   - repo: https://github.com/astral-sh/ruff-pre-commit
+     rev: v0.14.3
+     hooks:
+       - id: ruff
+         args: [--fix]
+         stages: [pre-commit]
+       - id: ruff-format
+   - repo: local
+     hooks:
+       - id: ty
+         name: ty
+         entry: uv run ty check src/
+         language: system
+         files: ^src/.*\.py$
+         pass_filenames: false
@@ -0,0 +1 @@
+ 3.12
@@ -0,0 +1,430 @@
+ Metadata-Version: 2.4
+ Name: agent-rollouts
+ Version: 0.1.0
+ Summary: A CLI to get training data from your own coding agents. Track rollouts and codebase snapshots at every turn. The data can be used for SFT, RL, and Continual Learning.
+ Author: 13point5
+ License: MIT
+ Requires-Python: <3.14,>=3.12
+ Requires-Dist: huggingface-hub>=0.34.4
+ Requires-Dist: pydantic>=2.11.0
+ Requires-Dist: rich>=14.0.0
+ Requires-Dist: typer>=0.16.0
+ Provides-Extra: dev
+ Requires-Dist: pre-commit>=3.5.0; extra == 'dev'
+ Requires-Dist: ruff>=0.12.0; extra == 'dev'
+ Requires-Dist: ty>=0.0.1a0; extra == 'dev'
+ Description-Content-Type: text/markdown
+
+ # Rollouts
+
+ A CLI to get training data from your own coding agents. Track rollouts and codebase snapshots at every turn. The data can be used for SFT, RL, and Continual Learning.
+
+ ## Current status
+
+ Current prototype:
+
+ - `uv`-managed Python CLI
+ - globally installable `rollouts` command
+ - `rollouts snapshot [workspace] --session --message --metadata`
+ - `rollouts restore [workspace] --session --message --dest`
+ - `rollouts restore --repo <repo> --session --message --dest`
+ - `rollouts delete [workspace] [--session] [--message]` and `rollouts delete --all`
+ - `rollouts export --agent opencode --session <session-id> --out <file>`
+ - `rollouts export --agent opencode --all --out <file.jsonl>`
+ - `rollouts hf push --agent opencode --name <dataset>`
+ - `rollouts remote set [workspace] --url <repo>`
+ - `rollouts remote clear [workspace]` and `rollouts remote clear --all`
+ - `rollouts remote defaults set --owner <owner> [--prefix <prefix>]`
+ - `rollouts push [workspace] [--session] [--message]` and `rollouts push --all`
+ - SQLite bootstrap with `workspaces`, `snapshots`, and `remote_defaults` tables
+ - one bare Git store per registered workspace
+
+ ## Install
+
+ For local development:
+
+ ```bash
+ uv sync --extra dev
+ ```
+
+ To install the CLI for use anywhere on your machine:
+
+ ```bash
+ uv tool install --python 3.12 --editable .
+ ```
+
+ After that, you can run:
+
+ ```bash
+ rollouts --help
+ ```
+
+ ## Usage
+
+ There is no separate `init` command. The first `snapshot` call automatically registers the workspace, creates the app home at `~/.rollouts` or `$ROLLOUTS_HOME`, bootstraps `rollouts.sqlite`, and creates the workspace bare store.
+
+ Create a snapshot for a session message:
+
+ ```bash
+ rollouts snapshot . \
+   --session ses_123 \
+   --message msg_001 \
+   --metadata '{"timestamp":"2026-03-29T20:02:43.622Z","kind":"hooks","name":"chat.message"}'
+ ```
+
+ The `snapshot` command:
+
+ - defaults the workspace path to `.`
+ - requires `--session`
+ - requires `--message`
+ - requires `--metadata`
+ - expects `--metadata` to be an inline JSON string
+ - automatically initializes the workspace if it has not been registered yet
+ - enforces that each `session_id` belongs to exactly one workspace across Rollouts
+ - uses the Git repository root when the path is inside a Git repo
+ - otherwise uses the directory path itself as the workspace root
+ - can update an existing workspace root if the same directory later becomes Git-backed
+ - snapshots the current workspace state
+ - if the source is a Git repo, snapshots tracked and untracked non-ignored files
+ - if the source is a plain directory, snapshots files recursively and excludes `.git`
+ - stores per-snapshot VCS metadata in the `vcs` column
+ - excludes the Rollouts app home if it lives inside the source directory
+ - stores the resulting Git commit in the workspace bare store
+ - inserts a row into `snapshots`
+
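Since `--metadata` has to arrive as an inline JSON string, a hook written in Python may find it easier to build the argument list programmatically than to quote JSON by hand. The following is a minimal sketch, not part of the package: the helper name `build_snapshot_argv` is hypothetical, and only the flags documented above are used.

```python
import json


def build_snapshot_argv(session_id, message_id, metadata, workspace="."):
    """Build an argument list for `rollouts snapshot` (hypothetical helper)."""
    if not session_id or not message_id:
        raise ValueError("--session and --message are both required")
    return [
        "rollouts", "snapshot", workspace,
        "--session", session_id,
        "--message", message_id,
        # --metadata must be an inline JSON string, so serialize the dict here.
        "--metadata", json.dumps(metadata, separators=(",", ":")),
    ]


argv = build_snapshot_argv(
    "ses_123",
    "msg_001",
    {"kind": "hooks", "name": "chat.message"},
)
print(argv)

# To actually run it (requires the `rollouts` CLI on PATH):
# import subprocess
# subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting issues with the JSON payload.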
+ Restore a snapshot for a session message:
+
+ ```bash
+ rollouts restore . \
+   --session ses_123 \
+   --message msg_001 \
+   --dest /tmp/restored
+ ```
+
+ The `restore` command:
+
+ - takes the source workspace path as its argument, defaulting to `.`
+ - also accepts `--repo` to restore directly from a remote archive repo instead of a local workspace
+ - requires `--session`
+ - requires `--message`
+ - requires `--dest`
+ - with no `--repo`, finds the matching local snapshot by `session_id` and `message_id`
+ - with `--repo`, looks up the remote annotated tag for the given `session_id` and `message_id`
+ - extracts the stored snapshot into the destination as a plain codebase directory
+ - works for snapshots created from both Git repos and plain directories
+ - fails if the destination already exists
+
+ Restore a snapshot directly from a remote archive repo:
+
+ ```bash
+ rollouts restore \
+   --repo https://github.com/13point5/rollouts-opencode-rollouts-plugin-2c1c8861 \
+   --session ses_123 \
+   --message msg_001 \
+   --dest /tmp/restored
+ ```
+
+ Configure an archive remote for a workspace:
+
+ ```bash
+ rollouts remote set . --url git@github.com:you/my-project-rollouts.git
+ ```
+
+ The `remote set` command:
+
+ - defaults the workspace path to `.`
+ - automatically initializes the workspace if it has not been registered yet
+ - stores one archive repo URL per workspace
+ - uses the user’s normal local Git credentials when later pushing
+
+ Clear stored archive remotes without deleting snapshots:
+
+ ```bash
+ rollouts remote clear .
+ rollouts remote clear --all
+ ```
+
+ The `remote clear` command:
+
+ - defaults the workspace path to `.`
+ - without `--all`, clears the stored `remote_url` for one workspace
+ - with `--all`, clears the stored `remote_url` of every registered workspace
+ - does not delete snapshots, workspace records, or remote defaults
+ - is useful when you have deleted archive repos and want `push --create-remote` to recreate them
+
+ Configure defaults for auto-created GitHub archive repos:
+
+ ```bash
+ rollouts remote defaults set --owner you --prefix rollouts- --visibility private
+ ```
+
+ The `remote defaults set` command:
+
+ - stores one global owner/prefix/visibility config for repo auto-creation
+ - requires the GitHub CLI `gh` to be installed
+ - does not create any repos by itself
+
+ Push stored snapshots to archive remotes:
+
+ ```bash
+ rollouts push . --session ses_123 --message msg_001
+ rollouts push . --session ses_123
+ rollouts push .
+ rollouts push --all
+ rollouts push . --create-remote
+ rollouts push --all --create-remote
+ ```
+
+ The `push` command:
+
+ - defaults the workspace path to `.`
+ - with `--session` and `--message`, pushes one snapshot
+ - with only `--session`, pushes all snapshots for that session in the workspace
+ - with no `--session`, pushes all snapshots for that workspace
+ - with `--all`, pushes all snapshots for all workspaces that have a configured remote
+ - with `--create-remote`, auto-creates and stores a GitHub archive repo for any workspace in scope that does not already have a configured remote
+ - requires `--session` when `--message` is provided
+ - does not allow combining `--all` with `--session` or `--message`
+ - skips snapshots whose remote tag already exists
+ - stores remote metadata in annotated tags, not just in SQLite
+ - uses `gh repo create` for repo creation and then normal `git push` for snapshot upload
+
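The scope rules listed for `push` can be read as a small decision function. The sketch below only restates those documented rules; it is not the CLI's actual implementation, and the function name and the returned labels are hypothetical.

```python
def resolve_push_scope(all_workspaces=False, session=None, message=None):
    """Map flag combinations to a push scope, following the rules listed above."""
    if all_workspaces and (session or message):
        raise ValueError("--all cannot be combined with --session or --message")
    if message and not session:
        raise ValueError("--message requires --session")
    if all_workspaces:
        return "all-workspaces"  # every workspace with a configured remote
    if session and message:
        return "one-snapshot"
    if session:
        return "session"         # all snapshots of that session in the workspace
    return "workspace"           # all snapshots of the workspace


print(resolve_push_scope(session="ses_123", message="msg_001"))  # one-snapshot
print(resolve_push_scope(all_workspaces=True))                   # all-workspaces
```

The README lists the same two flag constraints for `delete`, so the same shape of check would apply there.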
+ Delete stored Rollouts data:
+
+ ```bash
+ rollouts delete . --session ses_123 --message msg_001
+ rollouts delete . --session ses_123
+ rollouts delete .
+ rollouts delete --all
+ ```
+
+ The `delete` command:
+
+ - defaults the workspace path to `.`
+ - asks for confirmation in every mode
+ - with `--session` and `--message`, deletes one stored snapshot
+ - with only `--session`, deletes all stored snapshots for that session in the workspace
+ - with no `--session`, deletes the whole workspace entry, its stored snapshots, and its local bare store
+ - with `--all`, deletes the entire Rollouts app home, including the SQLite DB and all workspace stores
+ - requires `--session` when `--message` is provided
+ - does not allow combining `--all` with `--session` or `--message`
+
+ Export one OpenCode session as JSON with Rollouts metadata:
+
+ ```bash
+ rollouts export \
+   --agent opencode \
+   --session ses_123 \
+   --out /tmp/opencode-session.json
+ ```
+
+ The `export` command:
+
+ - uses `opencode export <sessionID>` under the hood
+ - requires the `opencode` CLI to be installed
+ - currently supports `--agent opencode`
+ - requires `--session` unless `--all` is set
+ - writes an envelope object with:
+   - top-level `session_id`, `agent`, and `exported_at`
+   - nested `session`, which contains the raw JSON returned by `opencode export`
+   - top-level `metadata`, which is either `null` or an object with `remote_url`
+ - sets `metadata` to `null` when Rollouts has no stored workspace for the session
+ - sets `metadata.remote_url` to `null` when the session's workspace exists but has no configured remote
+
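A consumer of the export file has to distinguish the two `null` cases described above. Here is a minimal sketch, assuming the envelope has already been loaded with `json.load`; the function name and the status labels are hypothetical.

```python
def remote_status(envelope):
    """Classify an export envelope by its `metadata` field.

    Returns a (status, url) pair:
      ("untracked", None): `metadata` is null, no stored workspace for the session
      ("no-remote", None): workspace exists but `remote_url` is null
      ("remote", url):     workspace exists and has a configured remote
    """
    metadata = envelope.get("metadata")
    if metadata is None:
        return ("untracked", None)
    url = metadata.get("remote_url")
    if url is None:
        return ("no-remote", None)
    return ("remote", url)


example = {
    "session_id": "ses_123",
    "agent": "opencode",
    "exported_at": "2026-03-30T12:34:56.000Z",
    "session": {"info": {"id": "ses_123"}, "messages": []},
    "metadata": {"remote_url": "https://github.com/you/my-project-rollouts.git"},
}
print(remote_status(example))
```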
+ Export all Rollouts-tracked sessions as JSONL:
+
+ ```bash
+ rollouts export \
+   --agent opencode \
+   --all \
+   --out /tmp/opencode-sessions.jsonl
+ ```
+
+ With `--all`, the command:
+
+ - looks up all distinct tracked `session_id`s from the Rollouts database
+ - exports one session record per line using the same payload shape as single-session export
+ - writes newline-delimited JSON to the output file
+
+ Upload tracked OpenCode sessions to a Hugging Face dataset:
+
+ ```bash
+ rollouts hf push \
+   --agent opencode \
+   --name your-dataset-name
+ ```
+
+ The `hf push` command:
+
+ - pushes all stored snapshots to their archive remotes before syncing the dataset
+ - auto-creates and stores missing GitHub archive repos for workspaces in scope
+ - uploads all Rollouts-tracked sessions to `train.jsonl` in a Hugging Face dataset repo
+ - uploads a `README.md` dataset card with YAML config that maps the `train` split to `train.jsonl`
+ - creates the dataset repo if it does not already exist
+ - appends a new row for each new or changed session export
+ - preserves older session rows so past batch states remain visible
+ - increments `batch_id` only when there are new or changed session rows to append
+ - exports session rows after the snapshot push, so `metadata.remote_url` reflects the stored archive repo
+ - defaults to creating a public dataset; pass `--private` to create a private one
+ - uses Hugging Face's existing authentication sources:
+   - `HF_TOKEN`, if set
+   - otherwise the token saved by `hf auth login`
+ - if `--name` does not include a namespace, uses your authenticated Hugging Face username
+ - requires `rollouts remote defaults set` if any tracked workspace still needs an auto-created archive repo
+
+ After pushing, the dataset should load with:
+
+ ```python
+ from datasets import load_dataset
+
+ dataset = load_dataset("username/dataset-name", split="train")
+ ```
+
+ Each HF dataset row has these top-level fields:
+
+ - `batch_id`
+ - `session_id`
+ - `agent`
+ - `exported_at`
+ - `session`
+ - `metadata`
+
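Because older session rows are preserved, a dataset can contain several rows for the same `session_id`. One way to keep only the most recent state of each session is sketched below. It assumes `batch_id` is an integer that grows with each push (the README says it is incremented, but does not spell out its type), and the function name is hypothetical.

```python
def latest_rows(rows):
    """Keep, per session_id, the row with the highest batch_id."""
    latest = {}
    for row in rows:
        current = latest.get(row["session_id"])
        if current is None or row["batch_id"] > current["batch_id"]:
            latest[row["session_id"]] = row
    return latest


# Illustrative rows only; `session` and `metadata` are omitted for brevity.
rows = [
    {"batch_id": 1, "session_id": "ses_123", "agent": "opencode"},
    {"batch_id": 1, "session_id": "ses_456", "agent": "opencode"},
    {"batch_id": 2, "session_id": "ses_123", "agent": "opencode"},
]
newest = latest_rows(rows)
print({sid: row["batch_id"] for sid, row in newest.items()})
# {'ses_123': 2, 'ses_456': 1}
```

The same function works on rows iterated from `load_dataset(..., split="train")`, since each row is a mapping with these top-level fields.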
+ Example export shape:
+
+ ```json
+ {
+   "session_id": "ses_123",
+   "agent": "opencode",
+   "exported_at": "2026-03-30T12:34:56.000Z",
+   "session": {
+     "info": {
+       "id": "ses_123"
+     },
+     "messages": []
+   },
+   "metadata": {
+     "remote_url": "https://github.com/you/my-project-rollouts.git"
+   }
+ }
+ ```
+
+ The current on-disk layout is:
+
+ ```text
+ ~/.rollouts/
+   rollouts.sqlite
+   workspaces/
+     <workspace_id>/
+       store.git/
+ ```
+
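The layout above, together with the `$ROLLOUTS_HOME` override mentioned under Usage, can be expressed as a pair of path helpers. This is an illustrative sketch only: the function names are hypothetical, and the package may resolve paths differently (for example, by normalizing or resolving symlinks).

```python
import os
from pathlib import Path


def rollouts_home(environ=None):
    """App home: $ROLLOUTS_HOME if set, otherwise ~/.rollouts."""
    env = os.environ if environ is None else environ
    return Path(env.get("ROLLOUTS_HOME", "~/.rollouts")).expanduser()


def store_path(workspace_id, environ=None):
    """Bare Git store for one workspace, following the documented layout."""
    return rollouts_home(environ) / "workspaces" / workspace_id / "store.git"


def database_path(environ=None):
    """Location of the SQLite database inside the app home."""
    return rollouts_home(environ) / "rollouts.sqlite"


# "ws_1" is a placeholder workspace id used only for this example.
print(store_path("ws_1", {"ROLLOUTS_HOME": "/tmp/rollouts-test"}))
```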
+ If you want to test without touching your real home directory:
+
+ ```bash
+ ROLLOUTS_HOME="$(mktemp -d)" rollouts snapshot . --session ses_test --message msg_test --metadata '{"event":"test"}'
+ ```
+
+ ## Schema
+
+ Current database schema:
+
+ ### `workspaces`
+
+ | Column | Type | Notes |
+ | ------------ | -------------- | --------------------------------------------------------------------------------------------- |
+ | `id` | `TEXT` | Primary key. Internal workspace id. |
+ | `root_path` | `TEXT` | Unique resolved root path for the tracked source directory. |
+ | `store_path` | `TEXT` | Path to the workspace bare Git store under `~/.rollouts/workspaces/<workspace_id>/store.git`. |
+ | `remote_url` | `TEXT \| NULL` | Optional archive repo URL used by `rollouts push`. |
+ | `created_at` | `TEXT` | UTC ISO 8601 timestamp for workspace registration. |
+
+ ### `remote_defaults`
+
+ | Column | Type | Notes |
+ | ------------- | --------- | ----------------------------------------------------------- |
+ | `id` | `INTEGER` | Fixed to `1`. Single-row defaults table. |
+ | `owner` | `TEXT` | GitHub user or organization for auto-created archive repos. |
+ | `repo_prefix` | `TEXT` | Prefix used when deriving auto-created archive repo names. |
+ | `visibility` | `TEXT` | `private`, `public`, or `internal`. |
+
+ ### `snapshots`
+
+ | Column | Type | Notes |
+ | ------------------ | ------ | ------------------------------------------------------------------------- |
+ | `id` | `TEXT` | Primary key. Internal snapshot id. |
+ | `workspace_id` | `TEXT` | Foreign key to `workspaces.id`. |
+ | `session_id` | `TEXT` | External chat session identifier. |
+ | `message_id` | `TEXT` | External message identifier. |
+ | `store_commit_sha` | `TEXT` | Commit SHA in the workspace bare store for this snapshot. |
+ | `vcs` | `TEXT` | JSON string with per-snapshot VCS context. |
+ | `metadata` | `TEXT` | Raw inline metadata JSON string from the hook or caller. |
+ | `captured_at` | `TEXT` | UTC ISO 8601 timestamp generated by the CLI when the snapshot is created. |
+
+ Rollouts enforces that a given `session_id` can only appear under one `workspace_id`.
+
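To see how the `workspaces` and `snapshots` tables relate, the sketch below recreates them in an in-memory SQLite database and looks up a snapshot by `session_id` and `message_id`. The column names and types come from the tables above; the exact DDL, the `NOT NULL` constraints, and the sample values are assumptions made for illustration, not the package's real schema script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE workspaces (
        id TEXT PRIMARY KEY,
        root_path TEXT NOT NULL UNIQUE,
        store_path TEXT NOT NULL,
        remote_url TEXT,
        created_at TEXT NOT NULL
    );
    CREATE TABLE snapshots (
        id TEXT PRIMARY KEY,
        workspace_id TEXT NOT NULL REFERENCES workspaces(id),
        session_id TEXT NOT NULL,
        message_id TEXT NOT NULL,
        store_commit_sha TEXT NOT NULL,
        vcs TEXT NOT NULL,
        metadata TEXT NOT NULL,
        captured_at TEXT NOT NULL
    );
    """
)

# Placeholder rows; ids, paths, and the SHA are made up for the example.
conn.execute(
    "INSERT INTO workspaces VALUES (?, ?, ?, ?, ?)",
    ("ws_1", "/tmp/project", "/tmp/home/workspaces/ws_1/store.git", None,
     "2026-03-29T20:00:00Z"),
)
conn.execute(
    "INSERT INTO snapshots VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("snap_1", "ws_1", "ses_123", "msg_001", "abc123",
     '{"vcs": null}', '{"event": "test"}', "2026-03-29T20:02:43Z"),
)

# Find the bare store and commit that hold a given session message.
row = conn.execute(
    """
    SELECT w.store_path, s.store_commit_sha
    FROM snapshots AS s
    JOIN workspaces AS w ON w.id = s.workspace_id
    WHERE s.session_id = ? AND s.message_id = ?
    """,
    ("ses_123", "msg_001"),
).fetchone()
print(row)
```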
+ ### `snapshots.vcs`
+
+ | Field | Type | When present | Notes |
+ | --------------- | ---------------- | ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
+ | `vcs` | `string \| null` | Always | `"git"` for Git-backed snapshots, otherwise `null`. |
+ | `worktree_path` | `string` | Git only | Resolved top-level path returned by `git rev-parse --show-toplevel`. For linked worktrees, this is the linked worktree root, not the shared common Git dir. |
+ | `branch` | `string \| null` | Git only | Current branch name when `HEAD` is attached. `null` in detached HEAD state. |
+ | `head_commit` | `string \| null` | Git only | Current `HEAD` commit SHA at snapshot time. |
+
+ ### `snapshots.metadata`
+
+ | Field | Type | Notes |
+ | --------------------- | ----------- | ------------------------------------------------------------------------------------------------------------ |
+ | varies by integration | JSON object | Stored as-is for later post-processing. Rollouts does not currently enforce a fixed schema for this payload. |
+
+ ## Remote Tags
+
+ Pushed snapshots use annotated tags in the configured archive repo:
+
+ ```text
+ refs/tags/rollouts/session/<session_hash>/message/<message_hash>
+ ```
+
+ The tag annotation stores:
+
+ - `schema_version`
+ - `snapshot_id`
+ - raw `session_id`
+ - raw `message_id`
+ - `captured_at`
+ - `store_commit_sha`
+ - parsed `vcs`
+ - parsed `metadata`
+
+ Auto-created archive repos use a derived name like:
+
+ ```text
+ <prefix><workspace-slug>-<workspace-id-prefix>
+ ```
+
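The naming pattern can be illustrated with a small helper. This is a sketch under stated assumptions: the README does not say how the workspace slug is computed or how many characters of the workspace id are kept, so the slug is passed in ready-made and the default of 8 characters is only inferred from the example repo name `rollouts-opencode-rollouts-plugin-2c1c8861` used in the restore example. The function name and the full workspace id below are hypothetical.

```python
def derive_repo_name(prefix, workspace_slug, workspace_id, id_chars=8):
    """Compose <prefix><workspace-slug>-<workspace-id-prefix>."""
    return f"{prefix}{workspace_slug}-{workspace_id[:id_chars]}"


# The trailing characters of this workspace id are invented for the example.
name = derive_repo_name("rollouts-", "opencode-rollouts-plugin", "2c1c8861f00d")
print(name)  # rollouts-opencode-rollouts-plugin-2c1c8861
```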
+ ## Development
+
+ Quality checks:
+
+ ```bash
+ uv run ruff check .
+ uv run ty check src
+ ```
+
+ Pre-commit hooks are configured for:
+
+ - `ruff --fix`
+ - `ruff format`
+ - `ty check src/`
+
+ Install the Git hook locally with:
+
+ ```bash
+ uv run pre-commit install
+ ```
+
+ Run hooks manually with:
+
+ ```bash
+ uv run pre-commit run --all-files
+ ```