@variantlab/core 0.1.4 → 0.1.6
- package/README.md +1209 -39
- package/docs/API.md +692 -0
- package/docs/ARCHITECTURE.md +430 -0
- package/docs/CONTRIBUTING.md +264 -0
- package/docs/ROADMAP.md +292 -0
- package/docs/SECURITY.md +323 -0
- package/docs/design/api-philosophy.md +347 -0
- package/docs/design/config-format.md +442 -0
- package/docs/design/design-principles.md +212 -0
- package/docs/design/targeting-dsl.md +433 -0
- package/docs/features/codegen.md +351 -0
- package/docs/features/crash-rollback.md +399 -0
- package/docs/features/debug-overlay.md +328 -0
- package/docs/features/hmac-signing.md +330 -0
- package/docs/features/killer-features.md +308 -0
- package/docs/features/multivariate.md +339 -0
- package/docs/features/qr-sharing.md +372 -0
- package/docs/features/targeting.md +481 -0
- package/docs/features/time-travel.md +306 -0
- package/docs/features/value-experiments.md +487 -0
- package/docs/phases/phase-2-expansion.md +307 -0
- package/docs/phases/phase-3-ecosystem.md +289 -0
- package/docs/phases/phase-4-advanced.md +306 -0
- package/docs/phases/phase-5-v1-stable.md +350 -0
- package/docs/research/bundle-size-analysis.md +279 -0
- package/docs/research/competitors.md +327 -0
- package/docs/research/framework-ssr-quirks.md +394 -0
- package/docs/research/naming-rationale.md +238 -0
- package/docs/research/origin-story.md +179 -0
- package/docs/research/security-threats.md +312 -0
- package/package.json +2 -1

@@ -0,0 +1,308 @@

# Killer features

The 10 features that make variantlab different from every free and paid A/B testing tool we surveyed. Each links to its own detailed spec.

## Table of contents

1. [On-device debug overlay](#1-on-device-debug-overlay)
2. [Route-scoped experiments](#2-route-scoped-experiments)
3. [Type-safe codegen from JSON](#3-type-safe-codegen-from-json)
4. [Value AND render experiments](#4-value-and-render-experiments)
5. [Deep-link + QR state sharing](#5-deep-link--qr-state-sharing)
6. [Crash-triggered automatic rollback](#6-crash-triggered-automatic-rollback)
7. [Time-travel inspector](#7-time-travel-inspector)
8. [Offline-first by default](#8-offline-first-by-default)
9. [HMAC-signed remote configs](#9-hmac-signed-remote-configs)
10. [Multivariate + feature flags in one tool](#10-multivariate--feature-flags-in-one-tool)

---

## 1. On-device debug overlay

**What it is**: A floating button you embed in dev builds. Tap it to open a bottom sheet listing every experiment active on the current route, with a tappable picker to switch variants live.

**Why it matters**: Existing tools either make you edit a config file + restart (slow), or open a web dashboard in a browser (outside the app context). Neither lets a designer or PM stand next to the phone and *feel* the difference in real time.

**Where we got the idea**: Directly from the Drishtikon small-phone card problem. See [`origin-story.md`](../research/origin-story.md).

**Who has it**:

- variantlab: **Yes**, first-class, on device
- Firebase Remote Config: **No**
- GrowthBook: **No** (web dashboard only)
- Statsig: **No** (web dashboard only)
- LaunchDarkly: **No** (web dashboard only)
- Amplitude: **No**
- react-native-ab: **No**

Spec: [`debug-overlay.md`](./debug-overlay.md)

---

## 2. Route-scoped experiments

**What it is**: Every experiment can declare a `routes` field (glob patterns). The debug overlay automatically filters to show only experiments relevant to the current screen.

**Why it matters**: Real apps have dozens of experiments. Showing all of them on every screen drowns the signal. Route scoping turns the debug overlay from a giant list into a focused "what's on this page" view.

**Extra benefit**: Experiments evaluate faster because non-matching ones are skipped entirely.
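
A route-scoped entry could look roughly like this (a hedged sketch: the `routes` field is the one described above, but the experiment ID, variant IDs, and exact glob syntax here are illustrative — the real syntax is defined in `targeting.md`):

```json
{
  "id": "checkout-cta",
  "type": "value",
  "routes": ["/checkout", "/checkout/*"],
  "default": "pay-now",
  "variants": [
    { "id": "pay-now", "value": "Pay now" },
    { "id": "complete-order", "value": "Complete order" }
  ]
}
```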

**Who has it**:

- variantlab: **Yes**
- GrowthBook: Partial (URL targeting in dashboard, no per-route client filtering)
- Everyone else: **No**

Spec: [`targeting.md`](./targeting.md#routes)

---

## 3. Type-safe codegen from JSON

**What it is**: The `variantlab generate` CLI reads `experiments.json` and emits a TypeScript `.d.ts` with literal-union types for every experiment ID and variant ID. The hooks are overloaded on these types.

```ts
const variant = useVariant("news-card-layout");
// variant is typed as:
// "responsive" | "scale-to-fit" | "pip-thumbnail"

const broken = useVariant("news-card-lay0ut");
// ❌ Type error: argument of type '"news-card-lay0ut"' is not assignable
```

**Why it matters**: Stringly-typed experiments are a source of production bugs. One typo and you're silently stuck on the default variant. Codegen turns every typo into a compile error.
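
As an illustration, the generated declarations might look roughly like this (a hypothetical shape, not the actual generator output — see `codegen.md` for the real format):

```ts
// experiments.generated.d.ts (hypothetical shape)
declare module "@variantlab/core" {
  interface ExperimentMap {
    "news-card-layout": "responsive" | "scale-to-fit" | "pip-thumbnail";
    "cta-copy": "buy" | "get";
  }
  export function useVariant<K extends keyof ExperimentMap>(id: K): ExperimentMap[K];
}
```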

**Who has it**:

- variantlab: **Yes** (first-class, via CLI)
- Statsig: Partial (via Statsig Console, requires network)
- LaunchDarkly: Partial (enterprise feature only)
- Everyone else: **No**

Spec: [`codegen.md`](./codegen.md)

---

## 4. Value AND render experiments

**What it is**: A single config format supports two experiment types:

- `type: "render"` — component swaps via `<Variant experimentId="...">`
- `type: "value"` — returned values via `useVariantValue<T>(id)`

Same JSON, same targeting, same debug overlay.

```json
{ "id": "cta-copy", "type": "value", "variants": [
  { "id": "buy", "value": "Buy now" },
  { "id": "get", "value": "Get started" }
]}
```

```ts
const copy = useVariantValue("cta-copy"); // "Buy now" | "Get started"
```

**Why it matters**: Other libraries force you to pick: either feature flags (values) or component-swap tools (render). variantlab is both. You learn one API, use it everywhere.

**Who has it**:

- variantlab: **Yes**, unified
- Firebase Remote Config: Values only
- LaunchDarkly: Values + custom JSON (no component-swap helper)
- react-native-ab: Render only
- Others: Usually one or the other

Spec: [`value-experiments.md`](./value-experiments.md)

---

## 5. Deep-link + QR state sharing

**What it is**: Any variant override can be encoded as a URL or QR code. A QA engineer can say "try this" by sharing a link. Scanning sets the exact same variants on another device.

```
drishtikon://variantlab?override=bmV3cy1jYXJkLWxheW91dDoyNQ
```

The payload is:

- Base64url-encoded
- Length-limited
- Signed with HMAC (if enabled)
- Rejected if the experiment has `overridable: false`
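
A minimal sketch of the base64url round trip, assuming the payload is a plain `experimentId:variant` string (the real wire format is specified in `qr-sharing.md`):

```typescript
// Sketch: round-tripping an override payload as base64url.
// Assumption: the payload is a plain "experimentId:variant" string.
function encodeOverride(payload: string): string {
  // btoa emits standard base64; swap URL-unsafe chars and drop padding.
  return btoa(payload).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function decodeOverride(encoded: string): string {
  const base64 = encoded.replace(/-/g, "+").replace(/_/g, "/");
  // Restore the padding stripped during encoding.
  return atob(base64 + "=".repeat((4 - (base64.length % 4)) % 4));
}

const link = `drishtikon://variantlab?override=${encodeOverride("news-card-layout:25")}`;
// → drishtikon://variantlab?override=bmV3cy1jYXJkLWxheW91dDoyNQ
```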

**Why it matters**: Remote debugging becomes trivial. No screen recordings, no "press button X then Y". Just send a link or a QR.

**Who has it**:

- variantlab: **Yes**
- Everyone else: **No** (you can build it yourself in every tool; no one ships it)

Spec: [`qr-sharing.md`](./qr-sharing.md)

---

## 6. Crash-triggered automatic rollback

**What it is**: Wrap a variant in `<VariantErrorBoundary>`. If that variant crashes `threshold` times within `window` ms, the engine clears the assignment and forces the default. A warning is emitted and (optionally) persisted so subsequent sessions stay on default.

```json
"rollback": { "threshold": 3, "window": 60000, "persistent": true }
```

**Why it matters**: A bad variant can crash for 100% of the users it targets. Without rollback, your only fix is to ship a new config or a new app version. With rollback, the app self-heals as soon as the crash threshold is hit.

**Where we got the idea**: From building a card layout that crashed on certain article data. We needed a safety net.
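
The threshold-within-window rule above can be sketched as a sliding-window counter (names are illustrative, not the real internals; the actual behavior is specified in `crash-rollback.md`):

```typescript
// Sketch of the threshold-within-window rollback rule: roll back once
// `threshold` crashes land inside a sliding window of `windowMs`.
class RollbackTracker {
  private crashes: number[] = [];
  constructor(private threshold: number, private windowMs: number) {}

  // Record a crash; returns true when the variant should be rolled back.
  recordCrash(now: number): boolean {
    this.crashes.push(now);
    // Keep only crashes inside the sliding window.
    this.crashes = this.crashes.filter((t) => now - t < this.windowMs);
    return this.crashes.length >= this.threshold;
  }
}

const tracker = new RollbackTracker(3, 60_000);
tracker.recordCrash(0);      // false (1 crash)
tracker.recordCrash(10_000); // false (2 crashes)
tracker.recordCrash(20_000); // true  (3 crashes within 60 s)
```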

**Who has it**:

- variantlab: **Yes**, built in
- Sentry: Can detect crashes but can't roll back experiments
- LaunchDarkly: Manual kill-switch only
- Everyone else: **No**

Spec: [`crash-rollback.md`](./crash-rollback.md)

---

## 7. Time-travel inspector

**What it is**: The engine records every assignment, override, config load, and rollback event with a timestamp. The debug overlay exposes a "history" tab that lets you scrub backwards to see what was active at any point.

**Why it matters**:

- "Why did user X see variant Y?" becomes easy to answer
- QA can reproduce issues without guessing
- Rollback events become visible

**Who has it**:

- variantlab: **Yes** (dev-only)
- Redux DevTools: time-travel for state, not experiments
- Everyone else: **No**

Spec: [`time-travel.md`](./time-travel.md)

---

## 8. Offline-first by default

**What it is**: The engine loads its config from local storage on boot. Network fetches are background refreshes. If the network fails, nothing breaks. If the device is offline for a week, experiments still resolve.

**Why it matters**: Mobile apps spend 30%+ of their runtime offline or on flaky networks. A tool that requires the network to resolve experiments is a tool that degrades user experience.

**How we do it**:

- The initial config ships bundled with the app
- Remote configs are cached after the first fetch
- The engine resolves from cache synchronously
- Background refreshes update the cache without blocking
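
The four steps above can be sketched as a cache-first engine (a sketch under stated assumptions: the store interface, key name, and function names here are illustrative, not the real API):

```typescript
// Sketch of cache-first resolution with background refresh.
interface ConfigStore {
  get(key: string): string | null;
  set(key: string, value: string): void;
}

function createEngine(store: ConfigStore, bundledConfig: string) {
  // 1. Boot from cache, falling back to the config bundled with the app.
  let config = JSON.parse(store.get("variantlab:config") ?? bundledConfig);

  return {
    // 2. Resolution is synchronous and never touches the network.
    resolve: (experimentId: string) =>
      config.experiments.find((e: { id: string }) => e.id === experimentId),

    // 3. Refreshes run in the background; a failure leaves the cache intact.
    async refresh(fetchRemote: () => Promise<string>) {
      try {
        const raw = await fetchRemote();
        store.set("variantlab:config", raw);
        config = JSON.parse(raw);
      } catch {
        /* offline: keep serving the cached config */
      }
    },
  };
}
```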

**Who has it**:

- variantlab: **Yes**, by design
- Firebase Remote Config: Yes, but blocks app startup in many configurations
- GrowthBook: Partial
- LaunchDarkly: Yes (caching SDK)
- Unleash: Partial

Nothing unique here, but it's table stakes and we refuse to ship without it.

---

## 9. HMAC-signed remote configs

**What it is**: Remote configs are optionally signed with HMAC-SHA256 using a shared secret. The engine verifies the signature before applying the config. Tampered or unauthorized configs are rejected.

```json
{
  "signature": "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk",
  "version": 1,
  "experiments": [...]
}
```

**Why it matters**: Without signing, anyone who MITMs your CDN can push a malicious config that changes which variant your users see — including weaponized variants that collect data.

**How we do it**:

- Web Crypto API (universal across runtimes)
- Constant-time verification
- Secret is embedded at build time, not transmitted
- Signing is a CLI command: `variantlab sign experiments.json --key $SECRET`
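
A minimal sign-then-verify sketch. The library itself uses the Web Crypto API; this sketch uses Node's `crypto` module for brevity, and the key and payload are illustrative:

```typescript
// Sketch: HMAC-SHA256 signing with constant-time verification.
import { createHmac, timingSafeEqual } from "node:crypto";

function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("base64url");
}

function verify(payload: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(sign(payload, secret));
  const actual = Buffer.from(signature);
  // Constant-time comparison; a length mismatch is an immediate reject.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

const config = '{"version":1,"experiments":[]}';
const sig = sign(config, "build-time-secret");
verify(config, sig, "build-time-secret");              // true
verify(config + "tampered", sig, "build-time-secret"); // false
```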

**Who has it**:

- variantlab: **Yes** (optional)
- LaunchDarkly: Yes (enterprise)
- Statsig: Partial (TLS only)
- Everyone else: **No**

Spec: [`hmac-signing.md`](./hmac-signing.md)

---

## 10. Multivariate + feature flags in one tool

**What it is**: variantlab isn't just A/B testing. The same engine handles:

- **2-variant A/B**: `[control, treatment]`
- **Multivariate**: `[A, B, C, D, E]` with arbitrary splits
- **Feature flags**: `[off, on]` with targeting and rollback
- **Kill switches**: global `enabled: false` or per-experiment
- **Time-boxed experiments**: `startDate` / `endDate`
- **Weighted rollouts**: 10% → 50% → 100% via split updates

**Why it matters**: Most teams use 2-3 different tools: one for A/B, one for feature flags, one for rollouts. variantlab does all of them with one config file, one API, one mental model.

**Who has it**:

- variantlab: **Yes**
- LaunchDarkly: Yes (but $$$)
- Statsig: Yes
- GrowthBook: Yes
- Firebase Remote Config: Partial (weak A/B support)
- Unleash: Flags only
- ConfigCat: Flags only

variantlab is unique in being **free + open-source + lightweight + all-in-one**.

Spec: [`multivariate.md`](./multivariate.md)

---

## What's NOT in the killer-feature list (and why)

We deliberately don't market these as killer features, even though we support them:

- **Sticky hashing** — every tool has it
- **User targeting** — every tool has it
- **Percentage rollouts** — every tool has it
- **Session persistence** — table stakes
- **Telemetry hooks** — we expose the interface but don't ship telemetry

Our differentiators are the 10 above. Everything else is baseline.

---

## The "one product" positioning

Looking at all 10 features together, variantlab's positioning is:

> "The developer experience of Storybook + the power of LaunchDarkly + the footprint of a tweet, for every frontend framework."

- **Storybook-like DX**: on-device picker, route-scoped, time-travel
- **LaunchDarkly-like power**: targeting, rollback, signing, multivariate
- **Tweet-sized footprint**: < 3 KB gzipped core
- **Framework-agnostic**: 10+ adapters from the same core

No competitor hits all four quadrants. That's the gap we fill.

---

## See also

- [`origin-story.md`](../research/origin-story.md) — why we built it
- [`competitors.md`](../research/competitors.md) — how we compare
- [`ROADMAP.md`](../../ROADMAP.md) — when each feature ships

@@ -0,0 +1,339 @@

# Multivariate experiments

Running experiments with 3 or more variants, including weighted splits and mutually exclusive groups.

## Table of contents

- [What is a multivariate experiment](#what-is-a-multivariate-experiment)
- [When to use one](#when-to-use-one)
- [Defining 3+ variants](#defining-3-variants)
- [Weighted splits](#weighted-splits)
- [Mutual exclusion](#mutual-exclusion)
- [Statistical considerations](#statistical-considerations)
- [Real-world examples](#real-world-examples)

---

## What is a multivariate experiment

An experiment with more than 2 variants. In the classic A/B sense, it's an "A/B/C/D/..." test. variantlab treats all experiments uniformly — whether you have 2 variants or 30, the config and API are the same.

**Upper limit**: 100 variants per experiment. More than that and the picker UI becomes unusable anyway.

---

## When to use one

### Good fits

- **Copy testing with many candidates** — "Buy now", "Get started", "Try free", "Start trial", "Shop now"
- **Layout exploration** — the Drishtikon 30 card modes
- **Color palette experiments** — 5 palettes, pick the best
- **Pricing sweeps** — $4.99, $6.99, $9.99, $14.99
- **Feature sets** — "minimal", "standard", "all features"

### Bad fits

- **Anything with fewer than 100 users per variant** — statistical noise will dominate. Stick to 2 variants.
- **When variants are very different** — measuring differences becomes meaningless if variants are apples and oranges. Split into multiple experiments.
- **When you don't have a success metric** — multivariate tests burn traffic. Without a clear metric, you're just randomly showing different UIs.

---

## Defining 3+ variants

Just add more variants to the array:

```json
{
  "id": "cta-copy",
  "type": "value",
  "default": "buy-now",
  "variants": [
    { "id": "buy-now", "value": "Buy now" },
    { "id": "get-started", "value": "Get started" },
    { "id": "try-free", "value": "Try it free" },
    { "id": "start-trial", "value": "Start your trial" },
    { "id": "shop-now", "value": "Shop now" }
  ]
}
```

No special config. The assignment strategy handles the rest.

### With the render-prop component

```tsx
<Variant experimentId="cta-copy">
  {{
    "buy-now": <Button>Buy now</Button>,
    "get-started": <Button>Get started</Button>,
    "try-free": <Button>Try it free</Button>,
    "start-trial": <Button>Start your trial</Button>,
    "shop-now": <Button>Shop now</Button>,
  }}
</Variant>
```

TypeScript enforces exhaustiveness — with codegen active, a missing key is a compile error.

---

## Weighted splits

By default, multivariate experiments split traffic evenly. To customize, use `assignment: "weighted"` with a `split`:

```json
{
  "id": "cta-copy",
  "assignment": "weighted",
  "split": {
    "buy-now": 40,
    "get-started": 30,
    "try-free": 20,
    "start-trial": 5,
    "shop-now": 5
  },
  "default": "buy-now",
  "variants": [...]
}
```

Rules:

- All variant IDs must appear in `split`
- Values are integer percentages
- Must sum to exactly 100
- Assignment is deterministic via sticky-hash of `(userId, experimentId)`
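
The deterministic assignment can be sketched as: hash `(userId, experimentId)` into a 0-99 bucket, then walk the cumulative split. The FNV-1a hash below is for illustration only; the engine's actual hash function is an internal detail:

```typescript
// Sketch: deterministic weighted assignment via a 0-99 bucket.
function bucket(userId: string, experimentId: string): number {
  let h = 0x811c9dc5; // FNV-1a, illustrative only
  for (const ch of userId + ":" + experimentId) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h % 100;
}

function assign(userId: string, experimentId: string, split: Record<string, number>): string {
  const b = bucket(userId, experimentId);
  let cumulative = 0;
  // Walk variants in split order; the bucket lands in exactly one range.
  for (const [variantId, pct] of Object.entries(split)) {
    cumulative += pct;
    if (b < cumulative) return variantId;
  }
  throw new Error("split must sum to 100");
}

// Same user + same experiment → same variant, every session.
assign("alice", "cta-copy", { "buy-now": 40, "get-started": 30, "try-free": 20, "start-trial": 5, "shop-now": 5 });
```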

### Staged rollout pattern

Start with a heavy default:

```json
"split": { "buy-now": 90, "get-started": 5, "try-free": 5 }
```

Gradually shift traffic:

```json
"split": { "buy-now": 50, "get-started": 25, "try-free": 25 }
```

Until the winner is clear:

```json
"split": { "buy-now": 0, "get-started": 100, "try-free": 0 }
```

Then archive the experiment and bake the winner into the code.

### Zero-percent variants

A variant with `0` in the split still appears in the config and still has types generated, but never gets assigned. Useful for:

- Temporary kill switches on one arm
- Preparing a variant before ramping it up

---

## Mutual exclusion

When two experiments should never run on the same user:

```json
{ "id": "card-layout-a", "mutex": "card-layout", ... },
{ "id": "card-layout-b", "mutex": "card-layout", ... },
{ "id": "card-layout-c", "mutex": "card-layout", ... }
```

All three experiments with `mutex: "card-layout"` form a group. A single user is enrolled in **at most one** experiment from the group, chosen by stable hash.

### Why?

Competing experiments can interact:

- Two card-layout experiments both redesign the card; which one won?
- Two onboarding experiments both change the flow; which one did the user complete?

Mutex groups prevent these interactions by ensuring one experiment wins per user.

### Rules

- All experiments in a mutex group must have the same targeting (or else the "winner" is unpredictable)
- A user who matches multiple experiments gets exactly one, deterministically
- The choice is stable across sessions (sticky-hash)
- The debug overlay shows mutex status on each experiment card
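
The stable-hash selection can be sketched like this (hash function and names are illustrative, not the engine's internals):

```typescript
// Sketch: pick exactly one experiment from a mutex group, stably per user.
function fnv1a(s: string): number {
  let h = 0x811c9dc5; // FNV-1a, illustrative only
  for (const ch of s) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

function pickFromMutexGroup(userId: string, group: string, experimentIds: string[]): string {
  // Sort so the choice doesn't depend on config ordering.
  const sorted = [...experimentIds].sort();
  return sorted[fnv1a(userId + ":" + group) % sorted.length];
}

pickFromMutexGroup("alice", "card-layout",
  ["card-layout-a", "card-layout-b", "card-layout-c"]);
```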

---

## Statistical considerations

variantlab does not ship analytics, but we have opinions about how to run multivariate tests.

### Traffic cost

With 5 variants and an even split, each variant gets 20% of your traffic. To reach statistical significance, you need roughly 5x the users a 2-variant test would need.

Rule of thumb: **fewer variants is better**. Only use multivariate if you're explicitly exploring a design space.

### Winner selection

Don't pick a winner until:

- Each variant has at least 1000 exposures (per your metric)
- The p-value is below 0.05 (your telemetry tool should compute this)
- The effect is practically significant, not just statistically significant

### Holdout arm

Consider keeping a "control" variant at a fixed percentage (e.g., 10%) even after picking a winner. This lets you measure long-term effects vs. the original baseline.

```json
"split": {
  "control": 10,
  "new-winner": 90
}
```

### Novelty effects

Users react to any change. Measure for at least 2 weeks post-launch, not just the first day.

---

## Real-world examples

### Drishtikon card layout (30 variants)

The origin story. See [`origin-story.md`](../research/origin-story.md).

```json
{
  "id": "news-card-detailed-layout",
  "type": "render",
  "default": "responsive-image",
  "assignment": "default",
  "variants": [
    { "id": "responsive-image", "label": "Responsive image" },
    { "id": "no-scroll", "label": "No scroll" },
    { "id": "tap-collapse", "label": "Tap to collapse" },
    { "id": "drag-handle", "label": "Drag handle" },
    { "id": "scale-to-fit", "label": "Scale to fit" },
    { "id": "pip-thumbnail", "label": "Picture-in-picture" },
    // ... 24 more
  ]
}
```

During development, `assignment: "default"` means everyone sees the default and the developer uses the debug overlay to switch manually. Once the best variant is identified, the experiment is archived and the winner is baked in.

### Pricing experiment (4 variants)

```json
{
  "id": "pro-monthly-price",
  "type": "value",
  "assignment": "weighted",
  "split": {
    "low": 25,
    "mid-low": 25,
    "mid-high": 25,
    "high": 25
  },
  "default": "mid-low",
  "variants": [
    { "id": "low", "value": 4.99 },
    { "id": "mid-low", "value": 7.99 },
    { "id": "mid-high", "value": 9.99 },
    { "id": "high", "value": 12.99 }
  ],
  "targeting": { "attributes": { "isNewUser": true } }
}
```

Shows different prices to new users to find the optimum.

### Onboarding flow (3 variants)

```json
{
  "id": "onboarding-flow",
  "type": "render",
  "assignment": "weighted",
  "split": {
    "current": 50,
    "3-step": 25,
    "1-page": 25
  },
  "default": "current",
  "mutex": "onboarding",
  "variants": [
    { "id": "current", "label": "Current flow" },
    { "id": "3-step", "label": "3 steps" },
    { "id": "1-page", "label": "Single page" }
  ]
}
```

The mutex ensures the onboarding test doesn't collide with a future onboarding experiment.

---

## Debugging multivariate experiments

### Debug overlay

Shows:

- All variants of the experiment
- Which one is currently active
- Why (targeting, assignment, override)
- A picker to switch manually

### Eval CLI

```bash
variantlab eval experiments.json \
  --experiment cta-copy \
  --context '{"userId":"alice"}'
```

Output:

```
Experiment: cta-copy
Targeting:  ✅ pass (no targeting)
Assignment: weighted
Variant:    get-started (40% bucket)
Value:      "Get started"
```

### Variant distribution check

```bash
variantlab distribution experiments.json \
  --experiment cta-copy \
  --users 10000
```

Output:

```
cta-copy distribution over 10000 simulated users:
  buy-now:      4028 (40.3%)  expected: 40%  ✅
  get-started:  3012 (30.1%)  expected: 30%  ✅
  try-free:     1987 (19.9%)  expected: 20%  ✅
  start-trial:   491 (4.9%)   expected: 5%   ✅
  shop-now:      482 (4.8%)   expected: 5%   ✅
```

Useful for verifying that the assignment logic is working correctly.

---

## See also

- [`config-format.md`](../design/config-format.md) — the `split` and `mutex` fields
- [`value-experiments.md`](./value-experiments.md) — when variants are values
- [`targeting.md`](./targeting.md) — scoping multivariate tests