@ishlabs/cli 0.8.3 → 0.8.5

package/dist/lib/docs.js CHANGED
@@ -43,9 +43,12 @@ Two top-level run verbs:
43
43
  ## Where to look next
44
44
 
45
45
  - New here? \`ish docs get-page concepts/workspace\`, then \`concepts/study\`.
46
+ - **Cold start?** Run \`ish status\` (alias \`ish whoami\`) — confirms login
47
+ and prints active workspace/study/ask. See \`concepts/active-context\`.
46
48
  - Running your first study? \`ish docs get-page guides/first-study\`.
47
49
  - Comparing study vs ask? \`ish docs get-page concepts/run-verbs\`.
48
- - Need machine-readable output? \`ish docs get-page reference/json-mode\`.
50
+ - **Output modes** (display vs capture vs chain — \`--human\`, \`--get\`,
51
+ \`--json\`)? \`ish docs get-page reference/json-mode\`.
49
52
  - Auth gated URL? \`ish docs get-page concepts/site-access\`.
50
53
 
51
54
  ## Install the skill into this project
@@ -72,6 +75,22 @@ A workspace carries:
72
75
  - Site-access credentials (encrypted at rest) — see \`concepts/site-access\`.
73
76
  - Tester profiles + sources visible to every study/ask in the workspace.
74
77
 
78
+ ## Selecting a workspace per command
79
+
80
+ \`--workspace <id>\` works at the **program root** as well as on each
81
+ subcommand — both forms are equivalent, and the subcommand-level flag
82
+ wins on conflict:
83
+
84
+ \`\`\`
85
+ ish --workspace w-6ec study list # program root
86
+ ish study list --workspace w-6ec # subcommand (same effect)
87
+ ish --workspace w-6ec study list --workspace w-other # w-other wins
88
+ \`\`\`
89
+
90
+ Use whichever is most natural for your scripting. Without either, the
91
+ CLI falls back to \`ISH_WORKSPACE\` (env var) and then the
92
+ \`workspace\` saved in \`~/.ish/config.json\`.
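The fallback order described above can be sketched in JavaScript. This is an illustration, not code from the package: `resolveWorkspace` and its argument shape are hypothetical, and only the precedence itself comes from the text.

```javascript
// Hypothetical helper (not part of @ishlabs/cli): resolves the effective
// workspace using the precedence the docs describe. Names are assumptions.
function resolveWorkspace({ subcommandFlag, rootFlag, env = {}, config = {} }) {
  // 1. A --workspace flag on the subcommand wins on conflict.
  if (subcommandFlag) return { id: subcommandFlag, source: "subcommand flag" };
  // 2. Otherwise the program-root --workspace flag applies.
  if (rootFlag) return { id: rootFlag, source: "root flag" };
  // 3. Then the ISH_WORKSPACE environment variable.
  if (env.ISH_WORKSPACE) return { id: env.ISH_WORKSPACE, source: "ISH_WORKSPACE" };
  // 4. Finally the workspace saved in ~/.ish/config.json.
  if (config.workspace) return { id: config.workspace, source: "config" };
  return null; // nothing selected anywhere
}
```

For the third example in the hunk, `resolveWorkspace({ subcommandFlag: "w-other", rootFlag: "w-6ec" })` would pick `w-other`.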
93
+
75
94
  ## Common commands
76
95
 
77
96
  \`\`\`
@@ -81,6 +100,10 @@ ish workspace use w-6ec # set as active
81
100
  ish workspace get # show the active workspace
82
101
  ish workspace site-access status
83
102
  \`\`\`
103
+
104
+ ## Related
105
+
106
+ - \`reference/billing-limits\` — \`maxProducts\` cap on workspace creation.
84
107
  `;
85
108
  const CONCEPT_STUDY = `# concept: study
86
109
 
@@ -101,17 +124,78 @@ its iterations. Think: a study is the recipe; an iteration is one batch.
101
124
 
102
125
  ## Lifecycle
103
126
 
104
- 1. \`ish study create --name "Onboarding UX" --modality interactive --assignment "Sign up:Complete the signup flow" --question "How easy was it?"\`
105
- 2. \`ish iteration create --url https://example.com\` (creates the first iteration)
106
- 3. \`ish study run --sample 5 --country SE\` (dispatches simulations)
127
+ 1. \`ish study create --name "Onboarding UX" --modality interactive --assignment "Sign up:Complete the signup flow" --question "How easy was it?"\` — creates the recipe with **zero iterations**.
128
+ 2. \`ish iteration create --url https://example.com\` (the first iteration becomes label \`A\`).
129
+ 3. \`ish study run --sample 5 --country SE\` (dispatches simulations).
107
130
  4. \`ish study results\` or \`ish study wait\` to gather outputs.
108
131
 
132
+ ### One-shot variant
133
+
134
+ \`study create\` now accepts \`--content-text\` (text modality) or
135
+ \`--url\` (interactive modality) inline; iteration A is created in the
136
+ same call. Useful when you have a single test artifact and don't need
137
+ to A/B iterations:
138
+
139
+ \`\`\`
140
+ ish study create --modality text --content-type email \\
141
+ --name "Daily Brief concept" \\
142
+ --assignment "Read:Read the email and react" \\
143
+ --question "What stood out?" \\
144
+ --content-text @./brief.md
145
+ # → study + iteration A in one call, ready for \`study run\`.
146
+ \`\`\`
147
+
148
+ Without those flags no iteration is created — agents can no longer
149
+ trip the old "empty A" footgun where \`study run\` silently targeted a
150
+ placeholder.
151
+
152
+ ## Status fields (read \`runtime_status\`, not \`status\`)
153
+
154
+ Every study response carries two status-shaped fields:
155
+
156
+ - \`status\` — the raw lifecycle column on the row, values
157
+ \`draft | running | completed | cancelled\`. Updated lazily; can
158
+ disagree with what the testers actually did.
159
+ - \`runtime_status\` — derived by aggregating the iteration testers'
160
+ states. Values: \`draft | running | completed |
161
+ completed_with_errors | cancelled\`. **Never reports \`failed\` while
162
+ completed runs exist** (the Bk2 invariant). Prefer this for any
163
+ agent decision.
164
+
165
+ The CLI also surfaces a \`status_inferred\` field + stderr warning when
166
+ it detects raw-vs-derived inconsistencies. See \`reference/json-mode\`.
167
+
168
+ ## Deleting a study
169
+
170
+ \`ish study delete <id>\` requires explicit confirmation:
171
+
172
+ - **Interactive (TTY)**: prompts on stderr; type \`y\` to proceed.
173
+ - **Non-interactive** (\`--json\`, piped, or non-TTY stdin): pass
174
+ \`-y\` / \`--yes\` to confirm. Without it, the CLI exits with usage
175
+ code 2 rather than deleting silently.
176
+
177
+ \`\`\`
178
+ ish study delete s-b2c # interactive prompt
179
+ ish study delete s-b2c --yes # skip prompt
180
+ ish study delete s-b2c --json --yes # JSON consumers must be explicit
181
+ \`\`\`
182
+
183
+ ## Generate vs create
184
+
185
+ \`ish study generate --problem "..."\` runs an LLM-backed flow that
186
+ picks a sensible modality from your brief and returns a
187
+ \`modality_rationale\` field (≤30 words) explaining the choice.
188
+ Override before adding iterations via
189
+ \`ish study update <id> --modality text\` if the rationale shows the
190
+ pick was wrong.
191
+
109
192
  ## Related
110
193
 
111
194
  - \`concepts/iteration\` — the unit of execution within a study.
112
195
  - \`concepts/assignment\` — task definition syntax.
113
196
  - \`concepts/questionnaire\` — question types and timing.
114
197
  - \`concepts/run-verbs\` — when to use \`study run\` vs \`ask run\`.
198
+ - \`reference/billing-limits\` — \`maxStudiesPerProduct\` cap on study creation.
115
199
  `;
116
200
  const CONCEPT_ITERATION = `# concept: iteration
117
201
 
@@ -157,11 +241,42 @@ ish iteration list --study s-b2c
157
241
  ish iteration get i-d4e
158
242
  \`\`\`
159
243
 
244
+ ## No more auto-empty iteration A
245
+
246
+ \`ish study create\` and \`ish study generate\` **do not auto-create
247
+ iteration A** anymore (Pattern E remediation, ish-cli v0.8.x). The
248
+ first explicit \`ish iteration create\` becomes label A, second is B,
249
+ etc. Running \`ish study run\` on a study with zero iterations exits
250
+ 2 with a clear error pointing you to \`ish iteration create\`.
251
+
252
+ If you do somehow run against an interactive iteration without a URL
253
+ (or a media iteration without content), \`study run\` exits 2 with:
254
+
255
+ \`\`\`
256
+ Iteration "A" (i-...) has no URL configured yet. Add a URL with
257
+ \`ish iteration create --study s-... --url <url>\` (or update the
258
+ existing iteration via \`ish iteration update i-... --details-json '{...}'\`),
259
+ then retry.
260
+ \`\`\`
261
+
262
+ Treat this as actionable, not transient — re-running won't change anything.
263
+
264
+ ## Default segmentation for text/image iterations
265
+
266
+ For text-modality iterations created with just \`--content-text\` (and
267
+ similarly \`--image-urls\` for image), the worker now synthesises a
268
+ single whole-content section if no \`segmentation\` was supplied. This
269
+ means a minimal \`ish iteration create --study s-XYZ --content-text
270
+ "..."\` actually runs end-to-end without you needing to author a
271
+ SegmentationConfig manually. Author your own segmentation when you
272
+ want section-level reactions; otherwise the default just works.
273
+
160
274
  ## Related
161
275
 
162
276
  - \`concepts/study\` — the parent artifact.
163
277
  - \`concepts/run-verbs\` — how \`ish study run\` selects the iteration.
164
278
  - \`concepts/audience\` — how testers are picked for a run.
279
+ - \`reference/billing-limits\` — \`maxIterationsPerStudy\` cap on iteration creation.
165
280
  `;
166
281
  const CONCEPT_ASSIGNMENT = `# concept: assignment
167
282
 
@@ -213,7 +328,7 @@ replaces the full assignment list — additive editing is not supported.
213
328
  const CONCEPT_QUESTIONNAIRE = `# concept: questionnaire
214
329
 
215
330
  The **questionnaire** is the list of \`interview_questions\` a tester
216
- answers before, during, or after their assignments. A study has 0..N
331
+ answers before or after their assignments. A study has 0..N
217
332
  questions, each with a type and a timing.
218
333
 
219
334
  ## Question shape
@@ -221,12 +336,12 @@ questions, each with a type and a timing.
221
336
  \`\`\`json
222
337
  {
223
338
  "question": "How easy was checkout?",
224
- "type": "slider", // text | slider | likert | choice_single |
225
- // choice_multiple | number |
226
- "timing": "after", // before | during | after
339
+ "type": "slider", // text | slider | likert |
340
+ // single-choice | multiple-choice | number
341
+ "timing": "after", // before | after
227
342
  "min": 1, "max": 7, "step": 1,
228
343
  "labels": ["Hard", "Easy"],
229
- "options": ["A", "B", "C"] // only for choice_*
344
+ "options": ["A", "B", "C"] // only for single-choice / multiple-choice
230
345
  }
231
346
  \`\`\`
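An agent that builds question objects may want to check them against the new value sets before sending them. The sketch below is a hypothetical client-side check, not the CLI's own validation. The type and timing lists come from the hunk above; the rule that choice questions need a non-empty `options` array is an assumption.

```javascript
// Value sets as documented in 0.8.5 (the old choice_single / choice_multiple
// spellings and the "during" timing are no longer listed).
const QUESTION_TYPES = ["text", "slider", "likert", "single-choice", "multiple-choice", "number"];
const QUESTION_TIMINGS = ["before", "after"];

// Hypothetical validator: returns a list of problems, empty when the shape looks fine.
function validateQuestion(q) {
  const errors = [];
  if (typeof q.question !== "string" || q.question.trim() === "") {
    errors.push("question must be a non-empty string");
  }
  if (!QUESTION_TYPES.includes(q.type)) {
    errors.push("unknown type: " + q.type);
  }
  if (!QUESTION_TIMINGS.includes(q.timing)) {
    errors.push("unknown timing: " + q.timing);
  }
  // Assumption: choice questions are only useful with at least one option.
  const isChoice = q.type === "single-choice" || q.type === "multiple-choice";
  if (isChoice && (!Array.isArray(q.options) || q.options.length === 0)) {
    errors.push("choice questions need a non-empty options array");
  }
  return errors;
}
```

Under these assumptions a question still using `"timing": "during"` or `"type": "choice_single"` is flagged, which makes a stale payload easy to spot after upgrading.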
232
347
 
@@ -289,8 +404,58 @@ ish ask run --prompt "And now which?" \\
289
404
  ish ask list
290
405
  ish ask get a-6ec --round 2
291
406
  ish ask results a-6ec
407
+ ish ask results a-6ec --json | jq '.rounds[0].aggregates'
408
+ \`\`\`
409
+
410
+ ## Reading the verdict
411
+
412
+ For \`--wants-pick\` / \`--wants-ratings\` rounds, \`ask results --json\`
413
+ includes an \`aggregates\` field per round so you don't have to parse
414
+ prose. Each individual pick also carries a **\`pick_confidence\`** score
415
+ (0..1) — the model's self-reported confidence in its variant choice.
416
+ Use it to break ties: when two variants are nominally close on count,
417
+ the variant with higher mean \`pick_confidence\` is the more decisive
418
+ choice. \`pick_confidence\` is only present on rounds run with
419
+ \`--wants-pick\`.
420
+
421
+ \`\`\`json
422
+ {
423
+ "picks": { "A": 3, "B": 0 },
424
+ "ratings": { "A": { "mean": 4.667, "n": 3 },
425
+ "B": { "mean": 2.000, "n": 3 } },
426
+ "winner": { "letter": "A", "count": 3, "tied": false }
427
+ }
292
428
  \`\`\`
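To make the `aggregates` shape concrete, here is a sketch of how such a verdict could be computed from per-tester responses. It follows the rules stated in `reference/json-mode` later in this diff: `picks` only with `wants_pick`, `ratings` only when at least one rating exists, `mean` rounded to 3 decimals, and `tied` when several variants share the top count. The input shape (`{ pick, ratings }` per response), the function name, and reporting the first leader on a tie are assumptions, not the CLI's implementation.

```javascript
// Hypothetical re-implementation for illustration; assumes at least one variant.
function computeAggregates(responses, variants, { wantsPick, wantsRatings }) {
  const aggregates = {};
  if (wantsPick) {
    const picks = Object.fromEntries(variants.map((v) => [v, 0]));
    for (const r of responses) {
      if (Object.prototype.hasOwnProperty.call(picks, r.pick)) picks[r.pick] += 1;
    }
    const top = Math.max(...Object.values(picks));
    const leaders = variants.filter((v) => picks[v] === top);
    aggregates.picks = picks;
    // Assumption: on a tie, the first leading variant is reported as the letter.
    aggregates.winner = { letter: leaders[0], count: top, tied: leaders.length > 1 };
  }
  if (wantsRatings) {
    const ratings = {};
    for (const v of variants) {
      const values = responses
        .map((r) => (r.ratings ? r.ratings[v] : undefined))
        .filter((x) => typeof x === "number");
      if (values.length > 0) {
        const mean = values.reduce((a, b) => a + b, 0) / values.length;
        ratings[v] = { mean: Math.round(mean * 1000) / 1000, n: values.length };
      }
    }
    // ratings is only present when at least one rating was submitted.
    if (Object.keys(ratings).length > 0) aggregates.ratings = ratings;
  }
  return aggregates;
}
```

Three responses that all pick `A` and rate it 5, 5 and 4 reproduce the example above: `picks` of `{ A: 3, B: 0 }` and a mean of `4.667` for `A`.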
293
429
 
430
+ When the ask has 2+ rounds, \`ask results\` also includes a top-level
431
+ \`cross_round_summary\` block with per-round picks/winner and a
432
+ \`picks_delta\` (R1 → last round). Skip the manual diffing of two
433
+ \`ask results\` calls.
434
+
435
+ \`\`\`json
436
+ "cross_round_summary": {
437
+ "rounds": [
438
+ { "round_number": 1, "picks": {"A": 1, "B": 2}, "winner": {"letter": "B", "count": 2, "tied": false } },
439
+ { "round_number": 2, "picks": {"A": 3, "B": 0}, "winner": {"letter": "A", "count": 3, "tied": false } }
440
+ ],
441
+ "picks_delta": { "A": 2, "B": -2 }
442
+ }
443
+ \`\`\`
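The `picks_delta` value is the last round's pick counts minus the first round's. A minimal sketch of that arithmetic follows, assuming the `rounds` array shape shown above; `picksDelta` is a hypothetical name, and treating a variant missing from a round as 0 picks is an assumption.

```javascript
// Hypothetical helper: difference in pick counts between round 1 and the last round.
function picksDelta(rounds) {
  const sorted = [...rounds].sort((a, b) => a.round_number - b.round_number);
  const first = sorted[0].picks;
  const last = sorted[sorted.length - 1].picks;
  const letters = new Set([...Object.keys(first), ...Object.keys(last)]);
  const delta = {};
  for (const letter of letters) {
    // Assumption: a variant absent from a round counts as 0 picks.
    delta[letter] = (last[letter] || 0) - (first[letter] || 0);
  }
  return delta;
}
```

With the two rounds from the example (`A` going from 1 to 3 picks, `B` from 2 to 0) this yields a delta of 2 for `A` and -2 for `B`.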
444
+
445
+ ## Adding follow-up questions to a round
446
+
447
+ \`ish ask add-questions --round N --questions ./qs.json\` is **additive
448
+ by default**: prior phase-1 outputs (comment, pick, ratings) are
449
+ preserved on every non-errored response, and the worker only answers
450
+ the newly-added questions for each tester. Existing picks stay stable.
451
+
452
+ Pass \`--redispatch-all\` for the legacy reset behavior — useful when a
453
+ question is sufficiently different that you want fresh first
454
+ impressions, not augmentation. Without that flag, agents iterating on
455
+ copy can safely append questions without losing prior round results.
456
+
457
+ See \`reference/json-mode\` for the full shape.
458
+
294
459
  ## Variant syntax
295
460
 
296
461
  \`--variant <type>:<value>[::label=<label>]\`
@@ -327,6 +492,15 @@ ish ask wait a-6ec --round 2 --timeout 600
327
492
  ish ask results a-6ec --round 1
328
493
  \`\`\`
329
494
 
495
+ ## \`add-questions\` is additive
496
+
497
+ Appending questions to a completed round preserves prior data — variant
498
+ comments, picks, ratings, and earlier-question answers all stay. Only
499
+ the new question(s) get dispatched to the existing testers. Cost is
500
+ roughly N phase-2 LLM calls instead of 2N (no phase-1 re-run). Errored
501
+ responses are skipped entirely; completed responses flip to PENDING and
502
+ re-finalize after the new question is answered.
503
+
330
504
  ## Related
331
505
 
332
506
  - \`concepts/ask\` — the parent artifact.
@@ -380,10 +554,29 @@ ish profile create --file profile.json
380
554
  Expected JSON: \`{ "name": "...", "type": "ai", "gender": "female",
381
555
  "country": "US", "occupation": "...", "bio": "..." }\`
382
556
 
557
+ ## Generation behavior to expect
558
+
559
+ - **Latency**: \`profile generate\` is LLM-backed and typically takes
560
+ 10–20s for 1–5 profiles. The CLI emits stderr progress lines
561
+ (\`generating N profiles…\` then \`generated N profiles\`) so you
562
+ know it's not stuck. Suppress with \`--quiet\`.
563
+ - **Brief fidelity**: bios reference domain-specific terms from your
564
+ description verbatim or as close paraphrase. If you mention
565
+ \`F-skatt\`, "manual Excel invoicing", "Stripe payouts", or similar
566
+ tools/jargon, expect those terms (or paraphrases) to appear in
567
+ each generated bio's daily-routine framing — not sanded down to
568
+ generic prose.
569
+ - **DOB diversity**: month-and-day are derived from a deterministic
570
+ per-profile hash so birthdays spread across the year (no more
571
+ every-profile-on-\`06-15\`). Year follows the requested age.
572
+ Re-generating the same name/country/occupation/age yields the
573
+ same DOB.
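The DOB behaviour can be pictured as hashing the profile's identifying fields and mapping the hash onto a day of the year. The sketch below is only an illustration of that idea. The real hash function and key format are not stated in the diff, so FNV-1a, the `|`-joined key, and computing the year as `currentYear - age` are all assumptions.

```javascript
// 32-bit FNV-1a, used here purely as a stand-in deterministic hash.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

const DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

// Hypothetical: same name/country/occupation/age always yields the same date.
function deriveDob(profile, currentYear) {
  const key = [profile.name, profile.country, profile.occupation, profile.age].join("|");
  let day = fnv1a(key) % 365; // 0-based day within a non-leap year
  let month = 0;
  while (day >= DAYS_IN_MONTH[month]) {
    day -= DAYS_IN_MONTH[month];
    month += 1;
  }
  const mm = String(month + 1).padStart(2, "0");
  const dd = String(day + 1).padStart(2, "0");
  return (currentYear - profile.age) + "-" + mm + "-" + dd;
}
```

Because the date depends only on the hashed fields, two calls with identical inputs agree, while different names spread across the calendar instead of clustering on one day.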
574
+
383
575
  ## Related
384
576
 
385
577
  - \`concepts/source\` — the inputs to \`profile generate\`.
386
578
  - \`concepts/audience\` — how profiles get selected into a run.
579
+ - \`reference/billing-limits\` — \`maxCustomTesterProfiles\` cap on profile creation.
387
580
  `;
388
581
  const CONCEPT_SOURCE = `# concept: source
389
582
 
@@ -439,6 +632,24 @@ flags. Two ways to select:
439
632
  The two modes are **mutually exclusive** — pass either \`--profile\` or
440
633
  the filter set, not both.
441
634
 
635
+ ## Empty-pool suggestions
636
+
637
+ When a filter combination matches zero profiles, the error message
638
+ includes the top three populated countries that satisfy your *other*
639
+ filters — so you can pivot to a country with actual coverage without a
640
+ second \`profile list\` round-trip:
641
+
642
+ \`\`\`
643
+ $ ish study run --country XX --min-age 35 --sample 5
644
+ Error: No simulatable AI tester profiles in workspace w-b32 match:
645
+ --country XX --min-age 35.
646
+ Populated countries with these other filters: SE (12), DE (8), NL (3).
647
+ Broaden your filters or run \`ish profile list\` to inspect the pool.
648
+ \`\`\`
649
+
650
+ The suggestion is best-effort — it never replaces the original error,
651
+ just augments it.
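The suggestion amounts to re-running the filter without the country constraint and counting what remains per country. A rough sketch under stated assumptions: the profile objects carry `country` and `age`, only a minimum-age filter is modelled, and ties are broken alphabetically. `suggestCountries` is a hypothetical name, not a CLI internal.

```javascript
// Hypothetical: top populated countries that satisfy every filter except --country.
function suggestCountries(profiles, filters, limit = 3) {
  const counts = new Map();
  for (const p of profiles) {
    // Apply the non-country filters only (here: minimum age).
    if (filters.minAge !== undefined && p.age < filters.minAge) continue;
    counts.set(p.country, (counts.get(p.country) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([country, count]) => ({ country, count }));
}

// Formats the result in the style of the error message above.
function formatSuggestion(suggestions) {
  return suggestions.map((s) => s.country + " (" + s.count + ")").join(", ");
}
```

An agent that receives this error can pick the first listed country and retry, rather than listing the whole pool first.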
652
+
442
653
  ## Defaults
443
654
 
444
655
  - \`ish study run\` with no audience flags → reuses the iteration's
@@ -571,6 +782,13 @@ ish study cancel <tester_id> # cancel a running simulation
571
782
  \`<tester_id>\` accepts a tester alias (\`t-…\`) or a full UUID. The
572
783
  study-level \`poll\`/\`wait\` forms also exist (\`--study <id>\` /
573
784
  \`--iteration <id>\`) for whole-batch progress.
785
+
786
+ ## Related
787
+
788
+ - \`reference/json-mode\` — output modes (display vs capture vs chain).
789
+ Use \`--get tester_aliases\` to capture the run's testers without
790
+ piping through \`jq\`. \`--human\` forces table output even through
791
+ \`tee\`/redirection.
574
792
  `;
575
793
  const REFERENCE_ALIASES = `# reference: aliases
576
794
 
@@ -602,32 +820,120 @@ ish profile generate --source tps-3a4 --count 4
602
820
  The full UUID is also always accepted. Add \`--verbose\` to JSON output
603
821
  to see UUIDs alongside aliases.
604
822
  `;
605
- const REFERENCE_JSON_MODE = `# reference: JSON output for agents
823
+ const REFERENCE_JSON_MODE = `# reference: output modes for agents
824
+
825
+ \`ish\` distinguishes **three output modes** so agents don't have to
826
+ post-process CLI output with \`jq\` or \`python\` for routine tasks:
827
+
828
+ 1. **Display mode (human)** — readable tables and key/value blocks.
829
+ Default on a TTY. Force it anywhere with \`--human\` (e.g. \`ish
830
+ workspace list --human | tee /tmp/x.txt\` keeps the table layout
831
+ even though stdout is redirected).
832
+ 2. **Capture mode (single value)** — \`--get <field>\` extracts the
833
+ value at a dotted path and prints it bare (no JSON quotes, no
834
+ indentation). Use this to feed one CLI's output into another:
835
+ \`ASK=$(ish ask create … --get alias)\` instead of
836
+ \`ASK=$(ish ask create … --json | jq -r .alias)\`.
837
+ 3. **Chain mode (full JSON)** — \`--json\` (or auto-enabled when stdout
838
+ is piped). Returns structured payloads for downstream parsing.
839
+ Reach for this only when you actually need multiple fields or a
840
+ nested shape; for one value, \`--get\` is shorter.
841
+
842
+ ## Picking the right mode
843
+
844
+ | You want to… | Mode |
845
+ |-------------------------------------------|--------------------------------------------------|
846
+ | Show the user a list of workspaces | bare command (TTY) or \`--human\` if redirecting |
847
+ | Capture an alias for a follow-up command | \`--get alias\` |
848
+ | Inspect a specific nested field | \`--get tester_profile.name\` |
849
+ | Compare 2+ fields, or pipe into jq | \`--json\` (or auto-on when piped) |
850
+ | Force human output through \`tee\` | \`--human\` |
851
+ | Force JSON on a TTY | \`--json\` |
852
+
853
+ \`--get\` and \`--human\` are mutually exclusive — capture and display are
854
+ different intents; pick one. \`--get\` implies \`--json\` internally so the
855
+ renderer always has structured data to extract from; you don't need to
856
+ add \`--json\` yourself.
857
+
858
+ ### Worked example: display vs. capture
859
+
860
+ \`\`\`bash
861
+ # Display: bare command on a TTY → human table.
862
+ ish workspace list
863
+
864
+ # Capture: feed one alias into the next command, no jq required.
865
+ ASK=$(ish ask create --new --name demo \\
866
+ --prompt "Which?" --variant text:A --variant text:B \\
867
+ --sample 30 --get alias)
868
+ ish ask wait "$ASK" --timeout 600
606
869
 
607
- Every command that produces output supports machine-readable JSON. JSON
608
- mode is **auto-enabled when stdout is piped**, so an agent rarely needs
609
- \`--json\` explicitly.
870
+ # Capture across an entire list: one value per line.
871
+ ish workspace list --get alias
872
+ # w-6ec
873
+ # w-d02
874
+ # …
875
+
876
+ # Display preserved through tee:
877
+ ish ask results "$ASK" --human | tee /tmp/transcript.txt
878
+ \`\`\`
610
879
 
611
880
  ## Flags
612
881
 
613
- - \`--json\` — force JSON output even on a TTY.
882
+ - \`--human\` — force human-readable output regardless of TTY
883
+ state (overrides the auto-flip-to-JSON when
884
+ stdout is piped). Mutually exclusive with
885
+ \`--get\`.
886
+ - \`--get <field>\` — extract a single field from the JSON response
887
+ and print only its bare value. Supports dotted
888
+ paths (\`tester_profile.name\`). On a paginated
889
+ \`{items: [...]}\` response, the path
890
+ auto-descends into \`items\` so \`--get alias\`
891
+ on a list yields one value per line. Implies
892
+ \`--json\` internally; mutually exclusive with
893
+ \`--human\`. Strings/numbers/bools are printed
894
+ unquoted; \`null\` prints as an empty line;
895
+ arrays print one element per line; objects
896
+ print as compact one-line JSON. Missing field
897
+ → exit 2 with a usage error.
898
+ - \`--json\` — force JSON output even on a TTY. Auto-enabled
899
+ when stdout is piped (unless \`--human\` is
900
+ set).
614
901
  - \`--fields a,b,c\` — keep only these fields in JSON output (e.g.
615
902
  \`alias,name,status\`). Filters per item only;
616
- paginated wrappers (\`{items, total, limit,
617
- offset}\`) keep their shape.
903
+ list wrappers (\`{items, total, returned,
904
+ limit, offset, has_more}\`) keep their shape.
618
905
  - \`--verbose\` — include full UUIDs, timestamps, and (on
619
906
  write paths) the full server payload instead
620
907
  of the compact response.
621
908
  - \`-q, --quiet\` — suppress progress messages on stderr (errors
622
- still go to stderr).
909
+ still go to stderr). \`--get\` implies
910
+ \`--quiet\` so the bare value is the only
911
+ thing on stdout.
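The `--get` printing rules in the list above are compact enough to sketch. The function below is a hypothetical model of those rules, not the CLI's renderer. In particular, always descending into `items` whenever the response has an `items` array is an assumption, and the thrown error stands in for the documented exit code 2.

```javascript
// Prints one value the way the --get rules describe.
function formatValue(value) {
  if (value === null) return ""; // null prints as an empty line
  if (typeof value === "object") return JSON.stringify(value); // compact one-line JSON
  return String(value); // strings, numbers, booleans unquoted
}

// Hypothetical model of `--get <path>`; returns the text that would reach stdout.
function renderGet(response, path) {
  const isList = response !== null && typeof response === "object" && Array.isArray(response.items);
  const targets = isList ? response.items : [response]; // auto-descend into items
  const lines = [];
  for (const target of targets) {
    let value = target;
    for (const key of path.split(".")) {
      if (value === null || typeof value !== "object" || !(key in value)) {
        throw new Error("usage error: field not found: " + path); // the CLI exits 2 here
      }
      value = value[key];
    }
    if (Array.isArray(value)) {
      for (const element of value) lines.push(formatValue(element)); // one element per line
    } else {
      lines.push(formatValue(value));
    }
  }
  return lines.join("\n");
}
```

This also shows why `for t in $TESTERS` works with `--get tester_aliases`: an array field comes back as one alias per line, which the shell splits on whitespace.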
623
912
 
624
913
  ## Stable shape rules
625
914
 
626
915
  The CLI guarantees these contracts so agents can chain safely:
627
916
 
628
- - **Lists keep their wrapper.** \`--fields\` strips per-item, never the
629
- envelope. A paginated list with \`{items, total, limit, offset}\` will
630
- always have those four keys.
917
+ - **Every list response is a six-key envelope.** All
918
+ \`<entity> list --json\` responses (workspace, study, iteration, ask,
919
+ profile, config) return:
920
+
921
+ \`\`\`json
922
+ {
923
+ "items": [...],
924
+ "total": 121, // server-provided when paginated; else items.length
925
+ "returned": 50, // items.length, always present
926
+ "limit": 50,
927
+ "offset": 0,
928
+ "has_more": true // total > offset + returned
929
+ }
930
+ \`\`\`
931
+
932
+ When the server doesn't paginate, \`total = returned = limit\`,
933
+ \`offset = 0\`, \`has_more = false\` (synthesized client-side).
934
+ \`--fields\` strips per-item, never the envelope — those six keys are
935
+ always present. Use \`has_more\` to detect truncation rather than
936
+ counting items yourself.
631
937
  - **Write paths always include \`id\` AND \`alias\`.** Even with
632
938
  \`--fields\` set, you can identify the affected resource. Default
633
939
  write-path JSON is compact (\`{id, alias, name, updated_at,
@@ -635,9 +941,102 @@ The CLI guarantees these contracts so agents can chain safely:
635
941
  - **\`profile generate\` trims \`simulation_config\` by default** (~9×
636
942
  smaller than the raw response). Pass \`--include-simulation-config\`
637
943
  if you need it.
944
+ - **\`<entity> get\` accepts multiple IDs.** \`profile get\`, \`study get\`,
945
+ \`iteration get\`, and \`ask get\` all take \`<ids...>\` — pass two or
946
+ more aliases (space- or comma-separated) and the response is a
947
+ \`{items:[...], total:N}\` envelope. Use this instead of piping
948
+ \`list --json\` to \`jq\`/\`python\` to filter by alias.
949
+ - **Ask detail JSON includes denormalized counts** so agents don't
950
+ have to count nested arrays. \`ask get\`, \`ask create --wait\`,
951
+ \`ask run --wait\`, and \`ask wait --verbose\` all add:
952
+
953
+ \`\`\`json
954
+ {
955
+ "testers_count": 3,
956
+ "responses_total": 9,
957
+ "responses_complete": 9,
958
+ "rounds": [
959
+ { "responses_total": 3, "responses_complete": 3, "...": "..." }
960
+ ]
961
+ }
962
+ \`\`\`
963
+
964
+ \`responses_errored\` only appears when at least one response errored.
965
+ Use these instead of \`jq '.testers | length'\` /
966
+ \`jq '.rounds[0].responses | length'\`.
638
967
  - **\`study run --json\` exposes tester handles.** The top-level
639
968
  \`tester_ids[]\` and \`tester_aliases[]\` arrays are the canonical
640
969
  inputs to \`ish study poll/wait/cancel\`.
970
+ - **Study responses carry a derived \`runtime_status\` field**
971
+ (\`draft | running | completed | completed_with_errors | cancelled\`).
972
+ Prefer this over the raw \`status\` field — \`runtime_status\` is
973
+ computed from the iteration testers' actual run state and never
974
+ reports \`failed\` while completed runs exist. Available on
975
+ \`study get\`, \`study results\`, and the response from
976
+ \`study generate\`. The CLI also surfaces a \`status_inferred\` field
977
+ alongside the raw \`status\` when it detects a partial-failure
978
+ inconsistency, plus a stderr warning ("Warning: study reports
979
+ status='failed' but N/M testers completed…").
980
+ - **\`study generate --json\` includes a \`modality_rationale\`** —
981
+ one short sentence explaining why the LLM picked that modality. Use
982
+ it to detect mis-classifications (e.g. brief was a static concept doc
983
+ but rationale says "live UI flow") and override via
984
+ \`study update <id> --modality text\` before adding iterations.
985
+ - **\`ask add-questions\` is additive by default.** Appending questions
986
+ preserves variant comments / picks / ratings / prior-question
987
+ answers; only the new question(s) get dispatched. Cost: roughly N
988
+ phase-2 LLM calls instead of 2N. Pass \`--redispatch-all\` for the
989
+ legacy reset behavior when you want fresh first impressions.
990
+ - **\`ask results --json\` includes \`cross_round_summary\` for 2+
991
+ rounds.** Top-level field with per-round picks/winner snapshots and
992
+ a \`picks_delta\` (R1 → last round). Replaces hand-rolled diffing of
993
+ two \`ask results\` calls.
994
+ - **No more auto-empty iteration A.** \`study create\` and
995
+ \`study generate\` no longer produce a placeholder iteration A. The
996
+ first explicit \`ish iteration create\` becomes label A.
997
+ \`study create\` now accepts \`--content-text\` (text) or \`--url\`
998
+ (interactive) inline so a single call yields a runnable study.
999
+ Running \`study run\` on a study with zero iterations exits 2 with
1000
+ a suggestion to run \`ish iteration create\` first.
1001
+ - **Tester responses include \`error_message\`.** When a tester is
1002
+ \`status: failed\`, the JSON exposes \`error_message: "<reason>"\` so
1003
+ agents can act without drilling into logs. \`study results\` rolls
1004
+ this up: top-level \`failed_count\`, plus per-tester \`error_message\`
1005
+ in the \`testers[]\` array, and a "Failed testers" subsection in
1006
+ human output. Empty when the tester succeeded.
1007
+ - **\`profile list\` emits a stderr pagination hint** when
1008
+ \`has_more=true\` and \`--quiet\` is not set. The hint goes to **stderr
1009
+ in every mode** including \`--json\` and piped stdout — it never
1010
+ pollutes machine-readable stdout but is visible to any agent that
1011
+ reads stderr (which they should, for warnings and progress). Format:
1012
+ "showing N–M of TOTAL; pass --offset M --limit N for more."
1013
+ JSON consumers can also read \`has_more\` directly off the envelope.
1014
+ - **\`ask results --json\` adds an \`aggregates\` field per round.** For
1015
+ rounds with \`wants_pick\`/\`wants_ratings\`, the CLI computes the
1016
+ verdict locally so agents don't have to parse comment prose:
1017
+
1018
+ \`\`\`json
1019
+ {
1020
+ "aggregates": {
1021
+ "picks": { "A": 3, "B": 0 },
1022
+ "ratings": { "A": { "mean": 4.667, "n": 3 },
1023
+ "B": { "mean": 2.000, "n": 3 } },
1024
+ "winner": { "letter": "A", "count": 3, "tied": false }
1025
+ }
1026
+ }
1027
+ \`\`\`
1028
+
1029
+ \`picks\` is present iff \`wants_pick\`; \`ratings\` is present iff
1030
+ \`wants_ratings\` and ≥ 1 rating was submitted; \`winner\` is the
1031
+ highest pick count (\`tied: true\` if multiple variants share the
1032
+ top). \`mean\` is rounded to 3 decimal places; \`n\` is the rating
1033
+ count for that variant.
1034
+ - **\`ask results --json\` deduplicates tester profile snapshots.** When
1035
+ \`tester_profile\` and \`tester_profile_snapshot\` share all
1036
+ overlapping fields (the common case — they only diverge if the
1037
+ profile was edited after dispatch), the snapshot is collapsed to
1038
+ \`{snapshotted_at, snapshot_version, _matches_tester_profile: true}\`.
1039
+ Use \`--verbose\` to keep both copies in full.
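The six-key envelope rule from the list above can be written down directly. This is a sketch with hypothetical names (`toListEnvelope`, the `page` argument); the formulas for `has_more` and for the non-paginated case are taken from the text.

```javascript
// Hypothetical: wraps a list of items in the documented six-key envelope.
// `page` is { total, limit, offset } when the server paginates, otherwise omitted.
function toListEnvelope(items, page) {
  const returned = items.length;
  if (!page) {
    // Non-paginated: total = returned = limit, offset = 0, has_more = false.
    return { items, total: returned, returned, limit: returned, offset: 0, has_more: false };
  }
  const { total, limit, offset } = page;
  return { items, total, returned, limit, offset, has_more: total > offset + returned };
}
```

With 121 matching rows and a first page of 50, `has_more` is `true` because 121 > 0 + 50. On the last page (offset 100, 21 items returned) it flips to `false`.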
641
1040
 
642
1041
  ## Exit codes
643
1042
 
@@ -693,25 +1092,43 @@ a structured error object on **stdout** and a human message on
693
1092
  ## Examples
694
1093
 
695
1094
  \`\`\`
696
- ish workspace list --json | jq '.[].alias'
697
- ish study get s-b2c --fields alias,name,status,iterations
1095
+ # Display (table on TTY, JSON when piped):
1096
+ ish workspace list
1097
+
1098
+ # Display preserved through tee/pipe (force human):
1099
+ ish ask results a-6ec --human | tee /tmp/results.txt
1100
+
1101
+ # Capture a single alias to feed into the next command:
1102
+ WS=$(ish workspace list --get alias | head -1)
1103
+
1104
+ # Inspect a nested field:
1105
+ ish study tester t-a17 --get tester_profile.name
1106
+
1107
+ # Chain (full JSON for jq when you need multiple fields):
1108
+ ish study get s-b2c --fields alias,name,status,iterations --json
698
1109
  ish ask results a-6ec --round 1 --json
699
- ish profile generate --description "..." --count 3 --json | jq '.[].alias'
700
1110
  \`\`\`
701
1111
 
702
1112
  ## Composing commands
703
1113
 
704
- JSON mode + alias resolution makes pipelines safe:
1114
+ \`--get\` removes most of the \`jq\` shims agents reach for. Capture in
1115
+ a script, then display the final result back to the user:
705
1116
 
706
1117
  \`\`\`
707
- ITER=$(ish iteration create --url https://example.com --json | jq -r .alias)
708
- TESTERS=$(ish study run --iteration "$ITER" --sample 5 --country SE \\
709
- --json | jq -r '.tester_aliases[]')
1118
+ # Capture bare values, no jq needed:
1119
+ ITER=$(ish iteration create --url https://example.com --get alias)
1120
+ TESTERS=$(ish study run --iteration "$ITER" --sample 5 --country SE --get tester_aliases)
710
1121
  for t in $TESTERS; do
711
1122
  ish study wait "$t" --timeout 600
712
1123
  done
713
- ish study results --json | jq .
1124
+
1125
+ # Display the final results to the user, even though we're in a script:
1126
+ ish study results --human
714
1127
  \`\`\`
1128
+
1129
+ When you genuinely need multiple fields in one parse pass, \`--json\` is
1130
+ still the right tool — \`--get\` is for single-value capture, not for
1131
+ reshaping output.
715
1132
  `;
716
1133
  const GUIDE_FIRST_STUDY = `# guide: your first study, end to end
717
1134
 
@@ -774,6 +1191,168 @@ ish study results --json | jq .
774
1191
  - Want a quick reaction test instead of an interactive study? Skip to
775
1192
  \`ish docs get-page concepts/ask\`.
776
1193
  `;
1194
+ const CONCEPT_ACTIVE_CONTEXT = `# concept: active context
1195
+
1196
+ The CLI keeps a small amount of session state in \`~/.ish/config.json\`
1197
+ (or wherever \`ISH_HOME\` points) so commands don't need to repeat IDs:
1198
+
1199
+ - \`access_token\` / \`refresh_token\` — the OAuth pair from \`ish login\`.
1200
+ - \`workspace\` — set by \`ish workspace use <id>\`.
1201
+ - \`study\` — set by \`ish study use <id>\`.
1202
+ - \`ask\` — set by \`ish ask use <id>\`.
1203
+
1204
+ Most commands fall back to these when their corresponding flag is
1205
+ omitted (\`--workspace\`, \`--study\`, \`--ask\`).
1206
+
1207
+ ## Inspecting active context
1208
+
1209
+ \`ish status\` (alias: \`ish whoami\`) is the canonical way to see what's
1210
+ configured. **Run it as the first command on a cold start** — it
1211
+ confirms login, prints the active workspace/study/ask handles, and
1212
+ shows how long the token has left.
1213
+
1214
+ \`\`\`bash
1215
+ ish status
1216
+ # User: you@example.com (token valid, expires in 47m)
1217
+ # Workspace: Onboarding revamp (w-6ec)
1218
+ # Study: —
1219
+ # Ask: a-6ec "tagline AB"
1220
+ # Home: /home/you/.ish
1221
+ # API: https://api.ishlabs.io
1222
+ \`\`\`
1223
+
1224
+ JSON shape (\`ish status --json\` or piped):
1225
+
1226
+ \`\`\`json
1227
+ {
1228
+ "user": { "email": "...", "token_valid": true, "expires_in_seconds": 2820 },
1229
+ "workspace": { "id": "...", "alias": "w-6ec", "name": "Onboarding revamp" },
1230
+ "study": null,
1231
+ "ask": { "id": "...", "alias": "a-6ec", "name": "tagline AB" },
1232
+ "api_url": "https://api.ishlabs.io",
1233
+ "home": "/home/you/.ish"
1234
+ }
1235
+ \`\`\`
1236
+
1237
+ \`status\` does not error when the user is logged out — it returns
1238
+ \`user: null\` plus a \`hint\` field telling the caller to run
1239
+ \`ish login\`. Safe to run unconditionally at the start of any
1240
+ script or agent session.
1241
+
1242
+ ## Setting / clearing active context
1243
+
1244
+ \`\`\`bash
1245
+ ish workspace use w-6ec # set
1246
+ ish workspace use --clear # clear
1247
+
1248
+ ish study use s-b2c
1249
+ ish study use --clear
1250
+
1251
+ ish ask use a-6ec
1252
+ ish ask use --clear
1253
+ \`\`\`
1254
+
1255
+ ## Overriding without persisting
1256
+
1257
+ Every read command accepts \`--workspace <id>\`, \`--study <id>\`, or
1258
+ \`--ask <id>\` to override the saved active value for one invocation
1259
+ without touching the config. Useful for one-off pokes at another
1260
+ resource.
1261
+
1262
+ \`--workspace\` is accepted on **every workspace-scoped subcommand**
1263
+ (\`ask\`, \`study\`, \`iteration\`, \`profile\`, \`source\` and their
1264
+ descendants). When the workspace is inferable from the subject's ID alias
1265
+ (e.g. \`ish ask delete a-6ec\`), the flag's value is silently ignored, so agents
1266
+ can pass it reflexively without tripping "unknown option" errors. Out
1267
+ of scope: \`workspace\`, \`config\`, \`docs\`, \`init\`, \`login\`,
1268
+ \`logout\`, \`whoami\`, \`upgrade\` (none of these need a workspace).
1269
+
1270
+ ## Related
1271
+
1272
+ - \`reference/aliases\` — the prefix scheme used by every entity.
1273
+ - \`reference/json-mode\` — output modes (display vs capture vs chain),
1274
+ including \`--get workspace.alias\` to capture the active workspace
1275
+ without piping \`ish status --json\` through \`jq\`.
1276
+ `;
1277
+ const REFERENCE_BILLING_LIMITS = `# reference: billing tier limits
1278
+
1279
+ Some create operations are gated by your account's billing tier. The
1280
+ backend enforces these caps; the CLI only renders the structured rejection.
1281
+ There is no way to bypass enforcement from the CLI; running the same
1282
+ \`POST\` with \`curl\` will hit the same gate.
1283
+
1284
+ The web UI reads these caps at runtime from
1285
+ \`GET /api/v1/billing/limits\` (cached for one hour) and falls back to
1286
+ its build-time snapshot if the endpoint is unreachable. The table below
1287
+ is the CLI's own snapshot, intentionally release-pinned for offline
1288
+ use; re-pull it after each \`ish-cli\` release. The source of truth at
1289
+ request time, for any client, is the backend's \`TIER_LIMITS\` dict in
1290
+ \`tier_limits.py\`.
1291
+
1292
+ ## Limits enforced
1293
+
1294
+ | Limit | Free | Media | Starter | Pro | Enterprise |
1295
+ |-----------------------------|------|-------|---------|-----|------------|
1296
+ | \`maxProducts\` | 1 | 1 | ∞ | ∞ | ∞ |
1297
+ | \`maxStudiesPerProduct\` | 3 | ∞ | ∞ | ∞ | ∞ |
1298
+ | \`maxIterationsPerStudy\` | 2 | ∞ | ∞ | ∞ | ∞ |
1299
+ | \`maxCustomTesterProfiles\` | 3 | 10 | 10 | ∞ | ∞ |
1300
+
1301
+ Commands that may hit a limit: \`ish workspace create\`,
1302
+ \`ish study create\`, \`ish study generate\`, \`ish iteration create\`,
1303
+ \`ish profile create\`, \`ish profile generate\`.
1304
+
1305
+ ## What you see when a limit is hit
1306
+
1307
+ Human output (stderr):
1308
+
1309
+ \`\`\`
1310
+ Error: Free plan allows 3 studies per workspace. Upgrade to add more.
1311
+ → Upgrade your plan at https://app.ishlabs.io/billing
1312
+ → Run \`ish docs get-page reference/billing-limits\` for the tier table
1313
+ \`\`\`
1314
+
1315
+ JSON output (stdout — \`--json\` or piped):
1316
+
1317
+ \`\`\`json
1318
+ {
1319
+ "error": "Free plan allows 3 studies per workspace. Upgrade to add more.",
1320
+ "error_code": "usage_limit_reached",
1321
+ "status": 403,
1322
+ "retryable": false,
1323
+ "tier": "free",
1324
+ "limit": "maxStudiesPerProduct",
1325
+ "current": 3,
1326
+ "max": 3,
1327
+ "upgrade_url": "https://app.ishlabs.io/billing",
1328
+ "suggestions": ["Upgrade your plan at https://app.ishlabs.io/billing", "..."]
1329
+ }
1330
+ \`\`\`
1331
+
1332
+ Exit code: \`1\` (general — non-retryable). Don't retry; the user has to
1333
+ upgrade or delete an existing resource to free up headroom.
1334
+
1335
+ ## Agent-side handling
1336
+
1337
+ - Branch on \`error_code === "usage_limit_reached"\` (preferred), or on
1338
+ \`status === 403\` combined with that \`error_code\` in the body. Plain \`forbidden\`
1339
+ errors that are *not* tier-related keep \`error_code: "forbidden"\`.
1340
+ - Use \`limit\`, \`current\`, \`max\`, \`tier\` to construct your own
1341
+ recovery message. The \`limit\` value matches the table above and is
1342
+ stable.
1343
+ - The \`generate\` endpoints (\`study generate\`, \`profile generate\`)
1344
+ refuse the entire batch when the post-generation count would exceed
1345
+ the cap, rather than partially fulfilling — re-issue with a smaller
1346
+ \`--count\` after upgrading or pruning.
1347
+
1348
+ ## Related
1349
+
1350
+ - \`concepts/workspace\` — \`maxProducts\` is per-account.
1351
+ - \`concepts/study\` — \`maxStudiesPerProduct\` gates study creation.
1352
+ - \`concepts/iteration\` — \`maxIterationsPerStudy\` gates iteration creation.
1353
+ - \`concepts/profile\` — \`maxCustomTesterProfiles\` gates profile creation.
1354
+ - \`reference/json-mode\` — full error envelope shape and exit codes.
1355
+ `;
777
1356
  const PAGES = [
778
1357
  {
779
1358
  slug: "overview",
@@ -853,6 +1432,12 @@ const PAGES = [
853
1432
  description: "Side-by-side; decision rule for choosing one over the other.",
854
1433
  body: CONCEPT_RUN_VERBS,
855
1434
  },
1435
+ {
1436
+ slug: "concepts/active-context",
1437
+ title: "concept: active context",
1438
+ description: "Saved workspace/study/ask state and how to inspect it (ish status).",
1439
+ body: CONCEPT_ACTIVE_CONTEXT,
1440
+ },
856
1441
  {
857
1442
  slug: "reference/aliases",
858
1443
  title: "reference: aliases",
@@ -861,10 +1446,16 @@ const PAGES = [
861
1446
  },
862
1447
  {
863
1448
  slug: "reference/json-mode",
864
- title: "reference: JSON output for agents",
865
- description: "JSON, --fields, --verbose, exit codes, pipe behaviour.",
1449
+ title: "reference: output modes for agents (display, capture, chain)",
1450
+ description: "Display vs capture vs chain: --human, --get, --json, --fields, exit codes, pipe behavior.",
866
1451
  body: REFERENCE_JSON_MODE,
867
1452
  },
1453
+ {
1454
+ slug: "reference/billing-limits",
1455
+ title: "reference: billing tier limits",
1456
+ description: "Per-tier caps on workspaces/studies/iterations/profiles; usage_limit_reached error shape.",
1457
+ body: REFERENCE_BILLING_LIMITS,
1458
+ },
868
1459
  {
869
1460
  slug: "guides/first-study",
870
1461
  title: "guide: your first study, end to end",