rad-experiment 0.1.0 → 0.1.1
- package/ARCHITECTURE.md +70 -0
- package/README.md +137 -59
- package/package.json +2 -2
package/ARCHITECTURE.md ADDED
@@ -0,0 +1,70 @@
+## Architecture
+
+The implementation provides two binaries that plug into Radicle's external COB system:
+
+```
+rad-experiment publish ...
+       │
+       ▼
+rad cob create ──stdin──▶ rad-cob-experiment ──stdout──▶ new state
+       │                  (external COB helper)
+       ▼
+git refs/cobs/cc.experiment/<oid>
+```
+
+### `rad-cob-experiment` — External COB helper
+
+Radicle's [external COB protocol](https://radicle.xyz) delegates state evaluation to a helper binary named `rad-cob-{type}`. When any `rad cob` command operates on a `cc.experiment` COB, Radicle spawns `rad-cob-experiment` and communicates via JSON Lines on stdin/stdout:
+
+1. Radicle sends `{ value, op, concurrent }` on stdin
+2. Helper applies the operation to the current state
+3. Helper writes the new state as a JSON Line to stdout
+4. Repeat for each operation, then exit 0
+
+This is the only binary that Radicle itself invokes. It has no CLI flags.
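The four protocol steps above can be sketched as a pure function over the input text. This is a minimal illustration, not the package's actual helper: the source names only the `value`, `op`, and `concurrent` keys, so the `HelperRequest` type, the `Reducer` signature, and the toy `countOps` reducer are all assumptions made for the example.

```typescript
// Hypothetical request shape: only the three key names come from the document.
interface HelperRequest {
  value: unknown;      // current state
  op: unknown;         // operation to apply
  concurrent: unknown; // concurrent operations (shape not specified in the source)
}

type Reducer = (state: unknown, op: unknown, concurrent: unknown) => unknown;

// Read JSON Lines, apply each operation, and return one JSON line of new state per request.
function processLines(input: string, apply: Reducer): string[] {
  const out: string[] = [];
  for (const line of input.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines such as a trailing newline
    const req = JSON.parse(line) as HelperRequest;
    out.push(JSON.stringify(apply(req.value, req.op, req.concurrent)));
  }
  return out;
}

// Toy reducer for illustration only: it counts how many operations were applied.
const countOps: Reducer = (state) => {
  const prev = (state as { count: number }).count;
  return { count: prev + 1 };
};

const demoOutput = processLines(
  '{"value":{"count":0},"op":{"type":"publish"},"concurrent":[]}\n' +
    '{"value":{"count":1},"op":{"type":"reproduce"},"concurrent":[]}\n',
  countOps,
);
console.log(demoOutput.join("\n"));
```

A real helper would read the lines from stdin, write each result to stdout as it is produced, and exit with status 0 once the input ends.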
+
+### `rad-experiment` — User-facing CLI
+
+Thin wrapper that shells out to `rad cob create/update/list/show`. All COB storage, signing, and replication are handled by Radicle — this CLI just constructs the right action JSON and formats output.
+
+**Commands:**
+
+| Command | What it does |
+|--------------------|------------------------------------------------------------------------|
+| `publish` | Create a new experiment COB with benchmark results |
+| `list` | List all experiments (optional `--reproduced`/`--unverified` filters) |
+| `show <id>` | Display experiment details (text or `--json`) |
+| `reproduce <id>` | Add an independent verification to an experiment |
+
+## Source files
+
+```
+src/
+├── rad-experiment.ts         CLI entry point — arg routing and dispatch
+├── rad-cob-experiment.ts     COB helper entry point — stdin/stdout JSON Lines protocol
+├── types.ts                  Shared type definitions (Experiment, Action, Op, etc.)
+├── cob/                      COB domain logic (used by both entry points)
+│   ├── actions.ts            Builds Publish/Reproduce action JSON
+│   └── state.ts              Replays actions into Experiment state
+└── cli/                      Everything specific to the rad-experiment CLI
+    ├── helpers.ts            Arg parsing helpers (die, requireArg, buildMeasurement, etc.)
+    ├── format.ts             Display formatting (deltaDisplay, measurementDisplay, etc.)
+    ├── rad.ts                Wrappers around rad CLI commands (cobCreate, cobShow, etc.)
+    └── commands/
+        ├── publish.ts        rad-experiment publish
+        ├── list.ts           rad-experiment list
+        ├── show.ts           rad-experiment show
+        └── reproduce.ts      rad-experiment reproduce
+```
+
+## Field naming conventions
+
+The Rust serde serialization produces mixed naming that this implementation matches exactly:
+
+| Layer | Convention | Example |
+|-----------------------------------------|-------------|----------------------------------|
+| **Action fields** (stored in COB) | snake_case | `metric_name`, `delta_pct_x100` |
+| **Measurement fields** (nested struct) | camelCase | `medianX1000`, `stdX1000` |
+| **Experiment state** (helper output) | camelCase | `metricName`, `deltaPctX100` |
+
+The helper accepts both conventions when reading actions for backward compatibility.
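One way to accept both naming conventions when reading an action, as ARCHITECTURE.md describes, is to normalize keys to camelCase before use. The sketch below is an assumption about how that could be done, not the package's code: `snakeToCamel` and `normalizeKeys` are hypothetical names, and only top-level keys are converted.

```typescript
// Convert a snake_case key to camelCase; keys that are already camelCase pass through unchanged.
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Shallow normalization: rename top-level keys only, leaving values untouched.
function normalizeKeys(action: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(action)) {
    out[snakeToCamel(key)] = value;
  }
  return out;
}

// Mixed input: one snake_case key and one camelCase key.
const normalized = normalizeKeys({ metric_name: "p95_latency", deltaPctX100: 3074 });
console.log(JSON.stringify(normalized));
```

Normalizing once at the boundary keeps the rest of the state logic working with a single convention.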
package/README.md CHANGED
@@ -1,87 +1,165 @@
-# rad-experiment
+# rad-experiment
 
-
+CLI for publishing, reproducing, and browsing AI-generated optimization experiments on the [Radicle](https://radicle.xyz) network. Each experiment is a signed Collaborative Object (COB) of type `cc.experiment`.
 
-##
+## Install
+
+```sh
+npm install -g rad-experiment
+```
 
-
+Requires [Radicle](https://radicle.dev/install) and Node.js >= 18. Both
+`rad-experiment` and `rad-cob-experiment` (the Radicle COB helper) are installed onto `$PATH`.
 
+### From source
+
+```sh
+git clone <repo> && cd rad-experiment-ts
+npm install && npm run build && npm link
 ```
-
-
-
-
-
-
-
+
+## Usage
+
+All commands run inside a Radicle-tracked git repository.
+
+### Publish an experiment
+
+Record benchmark results for a candidate commit vs. its base:
+
+```sh
+rad-experiment publish \
+  --base abc123 --head def456 \
+  --metric p95_latency --unit ms --direction lower_is_better \
+  --runner amd64 --os linux --cpu xeon-8375c \
+  --baseline-median 142500 --baseline-n 10 --baseline-std 8200 \
+  --candidate-median 98700 --candidate-n 10 --candidate-std 5100 \
+  --delta 3074 \
+  -d "Connection pooling for /api/search handler"
 ```
 
-
+```
+Experiment published: 5dfc3d665c12dbb6247454c44c4d2dc6cc422a5e
+metric: p95_latency +30.74%
+baseline: 142.500 ms
+candidate: 98.700 ms
+```
 
-
+Numeric values are integers scaled x1000 to avoid floating-point ambiguity (e.g. `142500` = 142.5 ms). Delta is percentage x100 (e.g. `3074` = +30.74%).
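The scaled-integer convention can be turned back into display text with integer arithmetic alone. The following is a small sketch under stated assumptions: `formatScaled` and `formatDelta` are hypothetical helpers written for this example (the package's own formatters are not shown in the diff), and inputs are assumed to be integers.

```typescript
// Format an integer that was scaled by `scale` (e.g. 1000) as a decimal string
// with `digits` fractional digits, using integer arithmetic only.
function formatScaled(value: number, scale: number, digits: number): string {
  const abs = Math.abs(value);
  const whole = Math.floor(abs / scale);
  const frac = String(abs % scale).padStart(digits, "0");
  return `${value < 0 ? "-" : ""}${whole}.${frac}`;
}

// Format a percentage stored as x100, with an explicit "+" on non-negative values.
function formatDelta(deltaX100: number): string {
  return `${deltaX100 >= 0 ? "+" : ""}${formatScaled(deltaX100, 100, 2)}%`;
}

console.log(`${formatScaled(142500, 1000, 3)} ms`); // 142.500 ms
console.log(`${formatScaled(98700, 1000, 3)} ms`);  // 98.700 ms
console.log(formatDelta(3074));                     // +30.74%
```

Splitting the value into whole and fractional parts avoids rounding through a floating-point division, which matches the stated reason for storing scaled integers.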
 
-
-2. Helper applies the operation to the current state
-3. Helper writes the new state as a JSON Line to stdout
-4. Repeat for each operation, then exit 0
+#### Secondary metrics
 
-
+Track additional metrics alongside the primary one:
 
-
+```sh
+rad-experiment publish \
+  ... \
+  --secondary "rss_peak:MB:256000:241000:-586" \
+  --secondary "p99_latency:ms:310000:285000:-806"
+```
 
-
+Format: `name:unit:baseline_x1000:candidate_x1000:delta_x100[:regressed]`
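A parser for the `--secondary` format string might look like the sketch below. It is illustrative only: the `SecondaryMetric` field names and the `parseSecondary` function are hypothetical, and because the README does not say how the optional sixth field is encoded, the sketch assumes the literal strings `regressed` or `true` mark a regression.

```typescript
// Hypothetical parsed form; the field names are not taken from the package.
interface SecondaryMetric {
  name: string;
  unit: string;
  baselineX1000: number;
  candidateX1000: number;
  deltaX100: number;
  regressed: boolean;
}

// Parse "name:unit:baseline_x1000:candidate_x1000:delta_x100[:regressed]".
function parseSecondary(spec: string): SecondaryMetric {
  const parts = spec.split(":");
  if (parts.length < 5 || parts.length > 6) {
    throw new Error(`expected 5 or 6 colon-separated fields, got ${parts.length}`);
  }
  const field = (i: number): string => parts[i] ?? "";
  const toInt = (text: string, label: string): number => {
    if (!/^-?\d+$/.test(text)) throw new Error(`${label} must be an integer, got "${text}"`);
    return Number(text);
  };
  const flag = field(5);
  return {
    name: field(0),
    unit: field(1),
    baselineX1000: toInt(field(2), "baseline_x1000"),
    candidateX1000: toInt(field(3), "candidate_x1000"),
    deltaX100: toInt(field(4), "delta_x100"),
    // Assumption: the encoding of the optional flag is not documented in the README.
    regressed: flag === "regressed" || flag === "true",
  };
}

const rss = parseSecondary("rss_peak:MB:256000:241000:-586");
console.log(rss.name, rss.unit, rss.deltaX100);
```

Rejecting non-integer fields up front keeps malformed values out of the stored action.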
 
-
+### List experiments
 
-
-
-
-
-
-
+```sh
+rad-experiment list
+```
+
+```
+5dfc3d6 p95_latency +30.74%
+bd3f5b8 p95_latency +12.40% [1 verified]
+```
 
-
+Filter by verification status:
 
+```sh
+rad-experiment list --reproduced # only experiments with confirmed reproductions
+rad-experiment list --unverified # only experiments without reproductions
 ```
-
-
-
-
-
-│   ├── actions.ts            Builds Publish/Reproduce action JSON
-│   └── state.ts              Replays actions into Experiment state
-└── cli/                      Everything specific to the rad-experiment CLI
-    ├── helpers.ts            Arg parsing helpers (die, requireArg, buildMeasurement, etc.)
-    ├── format.ts             Display formatting (deltaDisplay, measurementDisplay, etc.)
-    ├── rad.ts                Wrappers around rad CLI commands (cobCreate, cobShow, etc.)
-    └── commands/
-        ├── publish.ts        rad-experiment publish
-        ├── list.ts           rad-experiment list
-        ├── show.ts           rad-experiment show
-        └── reproduce.ts      rad-experiment reproduce
+
+### Show experiment details
+
+```sh
+rad-experiment show 5dfc3d665c12dbb6247454c44c4d2dc6cc422a5e
 ```
 
-
+```
+Experiment 5dfc3d665c12dbb6247454c44c4d2dc6cc422a5e
 
-
+Connection pooling for /api/search handler
 
-
-
-| **Action fields** (stored in COB) | snake_case | `metric_name`, `delta_pct_x100` |
-| **Measurement fields** (nested struct) | camelCase | `medianX1000`, `stdX1000` |
-| **Experiment state** (helper output) | camelCase | `metricName`, `deltaPctX100` |
+base: abc123...
+head: def456...
 
-
+metric: p95_latency (ms)
+direction: lower_is_better
 
-
+baseline: 142.500 ms (n=10)
+candidate: 98.700 ms (n=10)
+delta: +30.74%
+
+runner: amd64 (linux, xeon-8375c)
+build: ok
+tests: ok
+agent: claude-code/claude-opus-4-6
+author: did:key:z6Mk...
+```
+
+Use `--json` for machine-readable output:
+
+```sh
+rad-experiment show --json 5dfc3d66...
+```
+
+### Reproduce (verify) an experiment
+
+Anyone can independently verify an experiment's results:
+
+```sh
+rad-experiment reproduce 5dfc3d665c12dbb6247454c44c4d2dc6cc422a5e \
+  --verdict confirmed --runner arm64 \
+  --baseline-median 148200 --baseline-n 10 \
+  --candidate-median 101500 --candidate-n 10 \
+  --delta 3148 \
+  --notes "Reproduced on staging cluster"
+```
+
+```
+Reproduction added to 5dfc3d6
+verdict: confirmed
+```
+
+Verdicts: `confirmed`, `failed`, `inconclusive`.
+
+### Global options
 
 ```sh
-
-
+rad-experiment --repo /path/to/repo <command> # specify repo path (default: cwd)
+```
+
+## How it works
 
-
-
-
+The CLI shells out to `rad cob create/update/list/show` for all storage operations. Radicle handles signing, replication, and conflict resolution.
+
+When Radicle evaluates a `cc.experiment` COB, it invokes the
+`rad-cob-experiment` helper binary via the [external COB protocol](https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5/tree/main/radicle/src/cob/external.rs) (JSON Lines on stdin/stdout). This helper applies
+operations to experiment state — it's installed alongside the CLI and found on `$PATH` by name.
+
+## Architecture
+
+See [ARCHITECTURE.md](ARCHITECTURE.md) for the data flow diagram, source file map, and field naming conventions.
+
+## Testing
+
+```sh
+npm test # type-check + all tests (87 tests)
+npm run test:unit # pure function + golden-file serialization tests
+npm run test:protocol # rad-cob-experiment stdin/stdout protocol tests
+npm run test:integration # full E2E with a real radicle repo (requires rad)
 ```
 
-
+Golden files in `src/__tests__/golden/` contain reference JSON generated by the Rust implementation to verify byte-identical COB serialization.
+
+## License
+
+MIT OR Apache-2.0
package/package.json CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "rad-experiment",
-  "version": "0.1.
-  "description": "Radicle COB type for AI-generated optimization experiments
+  "version": "0.1.1",
+  "description": "Radicle COB type for AI-generated optimization experiments",
   "type": "module",
   "bin": {
     "rad-experiment": "./dist/rad-experiment.js",