@ishlabs/cli 0.13.0 → 0.14.1

package/dist/lib/docs.js CHANGED
@@ -123,8 +123,43 @@ A \`null\` value on a \`*_max\` field means "unlimited" (paid tiers).
123
123
  Branch on \`studies_used >= studies_max\` before \`study create\`,
124
124
  likewise for \`testers_used\` before \`study run --sample\`.
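The cap check described above can be sketched as a small guard. This is an illustration only: `hasHeadroom` is a hypothetical helper, and the `usage` object is assumed to carry the `*_used` / `*_max` pairs exactly as named in the text.

```javascript
// Hypothetical helper, not part of the CLI. `usage` is assumed to look like
// { studies_used: 3, studies_max: 5, testers_used: 10, testers_max: null }.
function hasHeadroom(usage, resource) {
  const used = usage[`${resource}_used`];
  const max = usage[`${resource}_max`];
  // A null max means "unlimited" (paid tiers), so there is always headroom.
  if (max === null) return true;
  // Documented branch: used >= max means the cap is already reached.
  // A missing or non-numeric max is treated as "no headroom" (assumption).
  return typeof max === "number" && used < max;
}
```

An agent could call `hasHeadroom(usage, "studies")` before `study create` and `hasHeadroom(usage, "testers")` before `study run --sample`.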
125
125
 
126
+ ## Cold start — \`workspace_create\` is not safe to call blind
127
+
128
+ On a saturated account, calling \`workspace_create\` (or
129
+ \`ish workspace create\`) without first inspecting state returns
130
+ \`error_code: usage_limit_reached\` immediately. A first-time agent
131
+ that was told "create a fresh workspace, then run a study" will trip
132
+ the cap on the very first call. **Always inspect existing workspaces
133
+ first** — \`ish workspace list\` / \`workspace_get\` returns per-row
134
+ metadata so you can pick a reuse target rather than blindly creating.
135
+
136
+ Each row in the list response carries:
137
+
138
+ - \`last_activity_at\` — most recent run, iteration, ask, or write on
139
+ this workspace. Pick the most recently active workspace if you want
140
+ one the user is likely already thinking about.
141
+ - \`child_counts\` — \`{ studies, asks, tester_profiles }\`. Zero across
142
+ the board = a quiet/empty workspace, safe to reuse without
143
+ cluttering anyone's view.
144
+ - \`has_headroom\` — \`true\` if the workspace is below
145
+ \`maxStudiesPerProduct\`, \`maxIterationsPerStudy\`, and
146
+ \`maxCustomTesterProfiles\` for the caller's tier. Branch on this
147
+ before \`study create\` / \`profile generate\` — \`false\` here will be
148
+ \`usage_limit_reached\` on the next call.
149
+
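One possible way to pick a reuse target from the list rows is sketched here. The field names `last_activity_at`, `child_counts`, and `has_headroom` come from the text above. The function name, the `id` field, and the assumption that `last_activity_at` is an ISO-8601 string are hypothetical.

```javascript
// Hypothetical selection policy: among workspaces with headroom, prefer the
// most recently active one. Returns null when nothing is reusable.
function pickReuseTarget(rows) {
  const candidates = rows.filter((row) => row.has_headroom === true);
  if (candidates.length === 0) return null;
  // Rows with a missing or unparsable timestamp sort last (treated as 0).
  const ts = (row) => Date.parse(row.last_activity_at) || 0;
  return candidates.slice().sort((a, b) => ts(b) - ts(a))[0];
}

// True when child_counts is zero across the board (a quiet workspace).
function isQuiet(row) {
  const c = row.child_counts;
  return c.studies === 0 && c.asks === 0 && c.tester_profiles === 0;
}
```

A script that prefers an empty workspace over a busy one could filter with `isQuiet` first and fall back to `pickReuseTarget` on the full list.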
150
+ For the idempotent create-or-reuse-by-name path, use
151
+ \`ish workspace create --name <name> --ensure\`: returns the existing
152
+ workspace owned by the caller if one with that name exists, otherwise
153
+ creates a fresh one. Safe to call from a cold-start script without
154
+ first scraping the list.
155
+
156
+ The full saturated-account walkthrough (with branch logic + a worked
157
+ transcript) lives at \`guides/cold-start\`.
158
+
126
159
  ## Related
127
160
 
161
+ - \`guides/cold-start\` — saturated-account first-step playbook
162
+ (\`workspace_get\` → inspect headroom → reuse or \`--ensure\`).
128
163
  - \`concepts/secret\` — per-workspace secrets used in chatbot endpoint
129
164
  headers via \`{{secret:KEY}}\` placeholders.
130
165
  - \`reference/billing-limits\` — \`maxProducts\` cap on workspace creation.
@@ -134,7 +169,8 @@ const CONCEPT_STUDY = `# concept: study
134
169
  A **study** is the persistent research artifact. It defines:
135
170
  - \`modality\`: \`interactive\` (the tester drives a real browser), one of
136
171
  \`text | video | audio | image | document\` (media reaction studies),
137
- or \`chat\` (multi-turn probe against an external chatbot endpoint).
172
+ or \`chat\` (multi-turn conversation either with an external chatbot
173
+ endpoint or between two AI personas via tester_pair mode).
138
174
  - \`content_type\` (media studies only): \`email | social_post | ad | …\` —
139
175
  controls the framing the tester is given.
140
176
  - \`assignments\`: the tasks the tester performs. See \`concepts/assignment\`.
@@ -168,7 +204,7 @@ test artifact and don't need to A/B iterations:
168
204
  | \`video\` | \`--content-url <url>\` |
169
205
  | \`audio\` | \`--content-url <url>\` |
170
206
  | \`document\` | \`--content-url <url>\` |
171
- | \`chat\` | \`--endpoint <id>\` or \`--endpoint-config <file>\` |
207
+ | \`chat\` | \`--endpoint <id>\` or \`--endpoint-config <file>\` (external_chatbot mode), or \`--chat-mode tester_pair --audience-a/-b --scenario-a/-b\` (two-AI rehearsal) |
172
208
 
173
209
  \`\`\`
174
210
  # Text — single email artifact:
@@ -261,14 +297,22 @@ pick was wrong.
261
297
  - \`concepts/questionnaire\` — question types and timing.
262
298
  - \`concepts/run-verbs\` — when to use \`study run\` vs \`ask run\`.
263
299
  - \`reference/billing-limits\` — \`maxStudiesPerProduct\` cap on study creation.
300
+ - \`reference/credits\` — per-run credit cost & how to preview before dispatch.
264
301
  `;
265
302
  const CONCEPT_ITERATION = `# concept: iteration
266
303
 
267
304
  An **iteration** is one configured run of a study. It carries the
268
305
  volatile bits — the URL (interactive), the media (video/text/etc.), or
269
- the chatbot endpoint (chat) — while the study carries the persistent
306
+ the chat payload (chat) — while the study carries the persistent
270
307
  shape (assignments, questionnaire, modality).
271
308
 
309
+ For chat modality, the iteration's \`details.mode_details\` discriminator
310
+ selects between **external_chatbot** (testers probe a customer chatbot
311
+ endpoint) and **tester_pair** (two AI tester audiences converse with
312
+ each other, one Conversation per pair index). Wire-shape examples and
313
+ pair-mode rules live under the "## Chat modality" section below; the
314
+ full chat-author workflow is at \`guides/chat\`.
315
+
272
316
  - Alias prefix: \`i-\`
273
317
  - A study has 1..N iterations. \`ish study run\` defaults to the latest.
274
318
  - Local files passed to \`--content-url\`, \`--image-urls\`, etc. are
@@ -311,9 +355,15 @@ ish iteration create --image-urls "./a.png,./b.png"
311
355
  # Document (PDF):
312
356
  ish iteration create --content-url ./report.pdf
313
357
 
314
- # Chat — probe a saved chatbot endpoint:
358
+ # Chat (external_chatbot) — probe a saved chatbot endpoint:
315
359
  ish iteration create --chat-endpoint-id ce-... --max-turns 10 --early-termination
316
360
 
361
+ # Chat (tester_pair) — rehearse a conversation between two AI audiences:
362
+ ish iteration create --chat-mode tester_pair \\
363
+ --audience-a tp-a1,tp-a2 --audience-b tp-b1,tp-b2 \\
364
+ --scenario-a "You're a senior sales rep pitching ish." \\
365
+ --scenario-b "You're a skeptical CTO evaluating ish."
366
+
317
367
  # Inspect:
318
368
  ish iteration list --study s-b2c
319
369
  ish iteration get i-d4e
@@ -401,22 +451,279 @@ paragraph-by-paragraph reactions to a long caption. Use the
401
451
 
402
452
  ## Chat modality
403
453
 
404
- Chat iterations probe an external chatbot endpoint by having a tester
405
- hold a multi-turn conversation against it. Two ways to wire the
406
- endpoint:
454
+ Chat iterations hold a multi-turn conversation. The conversation can
455
+ take one of two shapes, picked by the \`mode_details.mode\` discriminator
456
+ on the iteration:
457
+
458
+ - **\`external_chatbot\`** — a tester talks to a customer chatbot
459
+ endpoint (the original chat behaviour). The endpoint config or saved
460
+ chatbot-endpoint reference lives at
461
+ \`details.mode_details.endpoint\` / \`details.mode_details.chatbot_endpoint_id\`.
462
+ - **\`tester_pair\`** — two AI tester profiles talk to each other.
463
+ audience_a and audience_b pair 1:1 by index when counts match (N
464
+ pairs → N conversations); a side of exactly 1 broadcasts across the
465
+ other side (so 1 × N → N conversations all sharing the lone profile).
466
+ Each side carries its own scenario + goal; the other side does not
467
+ see it (the **asymmetry contract**). Useful for rehearsing a pitch, a
468
+ difficult conversation, a sales call, or any two-role scenario before
469
+ it happens.
470
+
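The pairing rule for `tester_pair` (1:1 by index on equal counts, broadcast when one side has exactly one profile) can be illustrated with a short sketch. `pairAudiences` is a hypothetical function written for this note, and throwing on other length mismatches is an assumption about how a client might surface the error.

```javascript
// Hypothetical illustration of the pairing rule. Returns one [a, b] tuple
// per conversation.
function pairAudiences(audienceA, audienceB) {
  if (audienceA.length === 0 || audienceB.length === 0) {
    throw new Error("each side needs at least one profile");
  }
  // Equal counts: zip 1:1 by index (N pairs, N conversations).
  if (audienceA.length === audienceB.length) {
    return audienceA.map((a, i) => [a, audienceB[i]]);
  }
  // A side of exactly 1 broadcasts across the other side (1 x N).
  if (audienceA.length === 1) return audienceB.map((b) => [audienceA[0], b]);
  if (audienceB.length === 1) return audienceA.map((a) => [a, audienceB[0]]);
  // Any other mismatch cannot be paired (assumed to be an error).
  throw new Error("audience lengths must match, or one side must be exactly 1");
}
```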
471
+ Wire-shape:
407
472
 
408
- \`\`\`
409
- # Reference a saved endpoint row (recommended — reproducible):
410
- ish iteration create --chat-endpoint-id ce-...
473
+ \`\`\`json
474
+ // external_chatbot
475
+ {
476
+ "type": "chat",
477
+ "mode_details": {
478
+ "mode": "external_chatbot",
479
+ "endpoint": { "url": "https://...", "headers": {} },
480
+ "chatbot_endpoint_id": "ep-uuid"
481
+ },
482
+ "max_turns": 14,
483
+ "early_termination": true
484
+ }
411
485
 
412
- # Inline endpoint config (one-off):
413
- ish iteration create --chat-endpoint-json '{"url":"https://...","headers":{...}}'
486
+ // tester_pair (with explicit audiences)
487
+ {
488
+ "type": "chat",
489
+ "mode_details": {
490
+ "mode": "tester_pair",
491
+ "audience_a": ["tp-uuid-1", "tp-uuid-2"],
492
+ "audience_b": ["tp-uuid-3", "tp-uuid-4"],
493
+ "scenario_a": "You are a senior sales rep pitching ish.",
494
+ "scenario_b": "You are a skeptical CTO evaluating ish.",
495
+ "initiator_side": "a"
496
+ },
497
+ "max_turns": 14,
498
+ "early_termination": true
499
+ }
500
+
501
+ // tester_pair (with role criteria — backend resolves the pool)
502
+ {
503
+ "type": "chat",
504
+ "mode_details": {
505
+ "mode": "tester_pair",
506
+ "audience_a": [],
507
+ "audience_b": [],
508
+ "role_criteria_a": {
509
+ "occupation": ["founder", "ceo"],
510
+ "min_age": 28, "max_age": 55,
511
+ "country": ["US", "SE"]
512
+ },
513
+ "role_criteria_b": { "occupation": ["investor", "vc"] },
514
+ "scenario_a": "...",
515
+ "scenario_b": "...",
516
+ "initiator_side": "a"
517
+ },
518
+ "max_turns": 14,
519
+ "early_termination": true
520
+ }
414
521
  \`\`\`
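A consumer of these payloads would typically branch on the `mode_details.mode` discriminator. The sketch below assumes the wire shapes shown above. `describeChatDetails` is a hypothetical helper, and preferring `chatbot_endpoint_id` over an inline `endpoint` when both are present is an assumption, not documented precedence.

```javascript
// Hypothetical reader for a chat iteration's details object.
function describeChatDetails(details) {
  const md = details.mode_details;
  switch (md.mode) {
    case "external_chatbot":
      // Assumption: a saved endpoint reference wins over an inline config.
      return md.chatbot_endpoint_id
        ? `external_chatbot via saved endpoint ${md.chatbot_endpoint_id}`
        : `external_chatbot via inline endpoint ${md.endpoint.url}`;
    case "tester_pair": {
      // initiator_side defaults to "a" per the pair-mode rules.
      const side = md.initiator_side ?? "a";
      return `tester_pair: ${md.audience_a.length} x ${md.audience_b.length} profiles, side ${side} speaks first`;
    }
    default:
      throw new Error(`unknown chat mode: ${md.mode}`);
  }
}
```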
415
522
 
416
- Tunables:
417
- - \`--max-turns N\` — cap the conversation length (default 12, max 50).
523
+ ## Audience selection (tester_pair)
524
+
525
+ Each side of a pair needs **either** an explicit audience list **or** a
526
+ role-criteria filter (or both). Four input combinations:
527
+
528
+ | Side A input | Side B input | Behaviour |
529
+ | --- | --- | --- |
530
+ | \`--audience-a\` (UUIDs) | \`--audience-b\` (UUIDs) | Explicit pairing. Equal counts zip 1:1 by index; a side of exactly 1 broadcasts to the other. |
531
+ | \`--role-criteria-a\` (JSON) | \`--role-criteria-b\` (JSON) | Backend resolves matching pool from each side's criteria and persists the IDs back to the iteration. |
532
+ | Either flag pair | Either flag pair | Mixed (e.g. explicit A + criteria B). Backend handles each side independently. |
533
+ | Both flags on one side | (any) | Criteria validates the explicit list; mismatch blocks run with a clear error. |
534
+
535
+ **Persona-first principle**: the tester's persona is sacred — never
536
+ altered by the scenario. Criteria filter the *eligible pool* upstream
537
+ so that by the time a tester reaches the LLM prompt, their persona is
538
+ already plausible for the role. The prompt construction itself does
539
+ not change between explicit-audience and criteria-driven flows.
540
+
541
+ \`RoleCriteria\` keys (all optional):
542
+
543
+ - \`occupation: string[]\` (job titles, case-insensitive match)
544
+ - \`min_age: int\`, \`max_age: int\`
545
+ - \`gender: string[]\` (e.g. \`["female", "male"]\`)
546
+ - \`country: string[]\` (ISO-3166-alpha-2 codes)
547
+ - \`education_level_in: string[]\` (less_than_secondary, secondary, some_post_secondary, vocational_or_associate, bachelor, graduate)
548
+ - \`household_in: string[]\` (single, couple_no_kids, couple_with_kids, single_parent, shared_housing, adult_with_parents, multi_generational). MECE: a couple raising children is \`couple_with_kids\`, not \`couple_no_kids\`; \`single\` means lives alone with no partner, roommates, parents, or children sharing the household.
549
+ - \`locale_type_in: string[]\` (urban, suburban, small_town, rural)
550
+ - \`income_level_in: string[]\` (lower, lower_middle, middle, upper_middle, upper, prefer_not_to_say)
551
+ - \`employment_status_in: string[]\` (employed_full_time, employed_part_time, self_employed, unemployed_seeking, student, homemaker, retired, unable_to_work, other). Primary daytime activity wins: a student who works part-time is \`student\`; a retiree who freelances is \`retired\`.
552
+ - \`requires_captions: bool\`, \`uses_screen_reader: bool\`, \`prefers_reduced_motion: bool\`, \`prefers_high_contrast: bool\`, \`has_any_accessibility_need: bool\` (coarse boolean filters over \`accessibility_profile\`)
553
+
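Because the enum-valued keys accept only the listed values, a criteria object can be linted locally before it is sent. The sketch below copies the allowed values from the list above. `findCriteriaProblems` is a hypothetical helper, and the `min_age` / `max_age` ordering check is an added assumption rather than a documented backend rule.

```javascript
// Allowed values copied from the RoleCriteria key list above.
const ROLE_CRITERIA_ENUMS = {
  education_level_in: ["less_than_secondary", "secondary", "some_post_secondary", "vocational_or_associate", "bachelor", "graduate"],
  household_in: ["single", "couple_no_kids", "couple_with_kids", "single_parent", "shared_housing", "adult_with_parents", "multi_generational"],
  locale_type_in: ["urban", "suburban", "small_town", "rural"],
  income_level_in: ["lower", "lower_middle", "middle", "upper_middle", "upper", "prefer_not_to_say"],
  employment_status_in: ["employed_full_time", "employed_part_time", "self_employed", "unemployed_seeking", "student", "homemaker", "retired", "unable_to_work", "other"],
};

// Hypothetical local lint: returns a list of human-readable problems.
function findCriteriaProblems(criteria) {
  const problems = [];
  for (const [key, allowed] of Object.entries(ROLE_CRITERIA_ENUMS)) {
    for (const value of criteria[key] ?? []) {
      if (!allowed.includes(value)) problems.push(`${key}: unknown value "${value}"`);
    }
  }
  // Assumption: an inverted age range is never intended.
  if (typeof criteria.min_age === "number" && typeof criteria.max_age === "number" && criteria.min_age > criteria.max_age) {
    problems.push("min_age exceeds max_age");
  }
  return problems;
}
```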
554
+ If the resolved pool is smaller than the requested conversation count
555
+ for a side, \`ish study run\` exits 2 with the backend's error envelope
556
+ intact. No silent fallback. Broaden the criteria, generate more
557
+ profiles, or pass an explicit \`--audience-*\` list to recover.
558
+
559
+ ## Pair-mode flag names (CLI ↔ MCP alignment)
560
+
561
+ CLI flags on \`ish study create\` / \`ish iteration create\` use the
562
+ same nouns the MCP \`study_iterate.chat_pair\` payload uses, so an
563
+ agent doesn't pay a translation tax when switching surfaces:
564
+
565
+ | CLI flag | MCP field | What it carries |
566
+ |---------------------------|----------------------------|-----------------------------------------------------|
567
+ | \`--audience-a\` / \`-b\` | \`audience_a\` / \`audience_b\` | Explicit tester profile IDs (UUIDs or aliases) for that side. |
568
+ | \`--role-criteria-a\` / \`-b\` | \`role_criteria_a\` / \`role_criteria_b\` | JSON filter (occupation, country, …) the backend resolves into a pool. |
569
+ | \`--scenario-a\` / \`-b\` | \`scenario_a\` / \`scenario_b\` | The system-prompt-shaped role text injected into one side's prompt only (asymmetry contract). |
570
+ | \`--initiator-side\` | \`initiator_side\` | Which side speaks first (\`a\` default). |
571
+ | \`--max-turns\` | \`max_turns\` | Conversation cap (default 14). |
572
+ | \`--early-termination\` | \`early_termination\` | Allow the worker to end early when parties signal. |
573
+
574
+ The pre-2026-05 \`--profile-a\` / \`--profile-b\` CLI flags were
575
+ renamed to \`--audience-a\` / \`--audience-b\` to match the MCP and
576
+ the wire shape (\`mode_details.audience_a\` /
577
+ \`mode_details.audience_b\`). Same intent, same accepted inputs
578
+ (comma-separated UUIDs or aliases, repeatable). \`--role-criteria-a\`
579
+ / \`--role-criteria-b\` were already aligned with MCP and did not
580
+ change.
581
+
582
+ CLI authoring:
583
+
584
+ \`\`\`
585
+ # external_chatbot — reference a saved endpoint (recommended):
586
+ ish iteration create --endpoint ep-abc --max-turns 10 --early-termination
587
+
588
+ # external_chatbot — inline endpoint config:
589
+ ish iteration create --endpoint-config ./bot.json
590
+
591
+ # external_chatbot — legacy escape-hatch flags still work:
592
+ ish iteration create --chat-endpoint-id ep-abc --max-turns 10
593
+ ish iteration create --chat-endpoint-json '{"url":"https://..."}'
594
+
595
+ # tester_pair — two AI audiences, asymmetric per-side scenarios:
596
+ ish iteration create --chat-mode tester_pair \\
597
+ --audience-a tp-a1,tp-a2 --audience-b tp-b1,tp-b2 \\
598
+ --scenario-a @./sales_rep.md --scenario-b @./skeptical_cto.md \\
599
+ --max-turns 14
600
+
601
+ # tester_pair — criteria-driven audience (persona-first filtering):
602
+ ish iteration create --chat-mode tester_pair \\
603
+ --role-criteria-a '{"occupation":["founder","ceo"],"min_age":28}' \\
604
+ --role-criteria-b @./criteria_investor.json \\
605
+ --scenario-a @./sales_rep.md --scenario-b @./skeptical_cto.md \\
606
+ --max-turns 14
607
+ \`\`\`
608
+
609
+ Tunables (both modes):
610
+ - \`--max-turns N\` — cap the conversation length (default 12 for
611
+ external_chatbot, 14 for tester_pair; persona drift starts ~20 turns
612
+ so cap accordingly).
418
613
  - \`--early-termination\` — let the worker end the session early when
419
- the tester signals the conversation is over.
614
+ the parties signal the conversation is over.
615
+
616
+ Pair-mode rules:
617
+ - Each side needs **either** \`--audience-*\` (explicit IDs) **or**
618
+ \`--role-criteria-*\` (filter the backend resolves). The two can also
619
+ be combined — criteria then acts as validation on the explicit list.
620
+ - When both sides use explicit \`--audience-a\` / \`--audience-b\`, they
621
+ must be the same length (≥ 1), unless one side is a singleton (see 1×N broadcast below). Same profile on both sides is allowed
622
+ (self-talk rehearsal). When either side defers to criteria, the
623
+ length match is enforced server-side after pool resolution.
624
+ - **1×N broadcast**: pass exactly one profile on one side and N
625
+ profiles on the other to rehearse the fixed side against N
626
+ variations. The CLI auto-broadcasts the singleton to match length
627
+ N. Example: \`--audience-a tp-rep --audience-b tp-cto1,tp-cto2,tp-cto3\`
628
+ produces 3 conversations, all sharing tp-rep on side A. The CLI
629
+ prints a stderr notice so you know broadcasting kicked in.
630
+ - Both \`--scenario-a\` and \`--scenario-b\` are required and asymmetric.
631
+ - \`--initiator-side\` defaults to \`a\` (side A speaks first).
632
+ - \`--chat-mode\` accepts both \`tester_pair\` and \`tester-pair\`
633
+ (hyphenated variants are normalised). Same normalisation applies to
634
+ \`--screen-format\` (\`mobile_portrait\` ↔ \`mobile-portrait\`),
635
+ \`--kind\` on \`source upload\` (\`text_file\` ↔ \`text-file\`), and the
636
+ \`type\` field in \`--questionnaire\` / \`--questions\` manifests
637
+ (\`single-choice\` ↔ \`single_choice\`).
638
+ - Audiences are pinned to the iteration. \`ish study run\` refuses
639
+ run-time audience overrides (\`--profile\` / \`--sample\` / \`--all\` /
640
+ filters) on a pair iteration — change the audiences via
641
+ \`ish iteration update <id> --details-json '{...}'\` instead.
642
+ - \`--max-turns\` / \`--early-termination\` on \`ish study run\` override
643
+ the iteration's saved values for that single dispatch (they are not
644
+ persisted back to the iteration).
645
+ - One Conversation row is created per pair index, server-side. The
646
+ per-conversation summary (\`end_reason\`, \`dominant_dynamic\`) lands on
647
+ the iteration response under \`conversations[]\`. Inspect via
648
+ \`ish iteration get <id>\`.
649
+
650
+ ## Writing a good scenario
651
+
652
+ Thin scenarios produce thin rehearsals. Both \`scenario_a\` and
653
+ \`scenario_b\` are injected into their own side's prompt as
654
+ role-playing context — the partner does **not** see the other side's
655
+ scenario or goal. Treat each scenario as a system prompt for one
656
+ character in a play. Cover five things:
657
+
658
+ 1. **Role / identity** — who is this person?
659
+ 2. **Voice** — how do they speak? Formal, casual, technical, blunt?
660
+ 3. **What they know** — the context they came in with.
661
+ 4. **What they don't know** — the asymmetry that makes the rehearsal
662
+ interesting.
663
+ 5. **Goal** — what counts as success for *them*.
664
+
665
+ Example (\`scenario_a\` — the sales rep):
666
+
667
+ \`\`\`
668
+ You are Maya, a senior account executive at ish — three years of
669
+ experience selling research tooling to product orgs. You speak in
670
+ clear, plain sentences, push back when you disagree, and quantify
671
+ claims when you can. You know this is a 30-minute discovery call;
672
+ you've read the prospect's LinkedIn and that's it. You do NOT know
673
+ the prospect's current tooling, budget, or internal politics — your
674
+ job is to find out by listening and asking. Success = end the call
675
+ with a clear next step (a pilot, a follow-up demo, or a "no, here's
676
+ why"). A polite "we'll get back to you" is not success.
677
+ \`\`\`
678
+
679
+ Example (\`scenario_b\` — the buyer):
680
+
681
+ \`\`\`
682
+ You are Devon, the CTO at a 60-person Series B SaaS company. You
683
+ distrust new vendors by default — your team has been burned by
684
+ "AI for research" tools twice in the past 18 months. You speak in
685
+ short, sceptical sentences and interrupt vendor pitches with
686
+ specifics: pricing, integrations, where the data lives. You know
687
+ your team currently runs unmoderated tests via UserTesting and
688
+ Pendo; the budget for new tooling is tight (€8k/year max). You do
689
+ NOT know how ish prices, what it integrates with, or whether it
690
+ handles your stack (Mixpanel + Heap + Linear). Success = leave the
691
+ call with either a concrete proof point that addresses your top
692
+ risk, OR a clean way to decline without burning the relationship.
693
+ \`\`\`
694
+
695
+ Read those back to back: the personas are asymmetric (different
696
+ goals, different knowledge), grounded (specific tools, specific
697
+ numbers), and constrained (each has a stake). That's the difference
698
+ between a rehearsal that produces signal and one that produces
699
+ generic dialogue. Keep each scenario under ~250 words — past that,
700
+ persona drift starts to dominate.
701
+
702
+ ### Don't put demographics in the scenario
703
+
704
+ A scenario describes **voice, knowledge, and goal** for one role —
705
+ *not* the demographics of who plays it. Demographic constraints
706
+ ("you are a 35-year-old Swedish founder") belong in
707
+ \`--role-criteria-a\` / \`--role-criteria-b\` instead. The tester's
708
+ persona stays sacred; criteria filter the eligible pool upstream so
709
+ the persona is already plausible for the role by the time the LLM
710
+ sees the prompt. Mixing demographics into the scenario text
711
+ short-circuits the asymmetry contract and produces incoherent
712
+ characters (a retired farmer suddenly "pitching a Series A").
713
+
714
+ Paired with the Maya / Devon scenarios above, the criteria might
715
+ look like:
716
+
717
+ \`\`\`
718
+ # --role-criteria-a (the sales rep filter):
719
+ {"occupation":["sales","account executive"],"min_age":28,"max_age":50}
720
+
721
+ # --role-criteria-b (the skeptical CTO filter):
722
+ {"occupation":["cto","vp engineering","head of engineering"],
723
+ "country":["US","SE"],"education_level_in":["bachelor","graduate"]}
724
+ \`\`\`
725
+
726
+ Scenarios describe the role; criteria pick who plays it.
420
727
 
421
728
  ## No more auto-empty iteration A
422
729
 
@@ -444,6 +751,7 @@ Treat this as actionable, not transient — re-running won't change anything.
444
751
  - \`concepts/run-verbs\` — how \`ish study run\` selects the iteration.
445
752
  - \`concepts/audience\` — how testers are picked for a run.
446
753
  - \`reference/billing-limits\` — \`maxIterationsPerStudy\` cap on iteration creation.
754
+ - \`reference/credits\` — per-iteration-run credit cost & preview shape (\`pair_preview.credit_estimate\` for tester-pair, top-level \`credit_estimate\` otherwise).
447
755
  `;
448
756
  const CONCEPT_ASSIGNMENT = `# concept: assignment
449
757
 
@@ -527,6 +835,11 @@ ish study create … --questionnaire ./questionnaire.json
527
835
  \`questionnaire.json\` is an array of question objects in the shape above.
528
836
  The same shape is accepted by \`ish ask add-questions … --questions …\`.
529
837
 
838
+ The \`type\` field is hyphenated for the multi-word values (\`single-choice\`,
839
+ \`multiple-choice\`). The CLI normalises the underscored variants
840
+ (\`single_choice\`, \`multiple_choice\`) back to the canonical hyphenated form,
841
+ so either works in your manifest.
842
+
530
843
  ## Related
531
844
 
532
845
  - \`concepts/ask\` — asks have per-round questions, similar shape.
@@ -700,6 +1013,33 @@ copy can safely append questions without losing prior round results.
700
1013
 
701
1014
  See \`reference/json-mode\` for the full shape.
702
1015
 
1016
+ ## Response-shape ergonomics
1017
+
1018
+ A few non-obvious shape rules on the MCP / ask endpoints that save
1019
+ round-trips when you know them up front:
1020
+
1021
+ - **\`cross_round_summary\` requires \`wants_pick=true\` on every
1022
+ round.** \`ask results\` / \`ask_get\` only compute the top-level
1023
+ \`cross_round_summary\` when *every* round in the ask was dispatched
1024
+ with \`wants_pick=true\` — picks across rounds are the only
1025
+ comparable signal. When even one round is a free-text drill
1026
+ question (\`wants_pick=false\`), the field is omitted and the
1027
+ response carries a \`cross_round_summary_reason\` string explaining
1028
+ which round(s) lacked \`wants_pick\` (e.g.
1029
+ \`"omitted: rounds 2, 3 lack wants_pick=true"\`). Branch on the
1030
+ reason, don't poll for the field to appear.
1031
+ - **\`audience_get\` omits \`accessibility_profile\` by default.** The
1032
+ field is ~1KB per row; on a 50-profile page it overflows
1033
+ agent-tool result budgets. Pass
1034
+ \`include_accessibility_profile=true\` to include it. Mirrors the
1035
+ existing \`include_bio=false\` default — same opt-in pattern.
1036
+ - **\`ask_testers\` uses \`dispatch_into_round\`, not \`round\`.** The
1037
+ parameter was renamed from the ambiguous \`round\` (which read
1038
+ as "start from round N") to the more explicit \`dispatch_into_round\`
1039
+ ("add these new testers into round N"). Behavior is unchanged —
1040
+ it appends testers to the named round on an existing ask, it does
1041
+ not roll back or restart any prior round.
1042
+
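The first rule above suggests branching on the reason string rather than polling. A minimal sketch, assuming the response is a plain object carrying either `cross_round_summary` or `cross_round_summary_reason` as described. `readCrossRoundSummary` is a hypothetical helper.

```javascript
// Hypothetical reader: returns the summary when present, otherwise the
// documented reason string, so the caller never waits for the field.
function readCrossRoundSummary(response) {
  if (response.cross_round_summary != null) {
    return { available: true, summary: response.cross_round_summary };
  }
  return {
    available: false,
    // Fallback text is an assumption for responses that carry no reason.
    reason: response.cross_round_summary_reason ?? "no reason provided",
  };
}
```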
703
1043
  ## Variant syntax
704
1044
 
705
1045
  \`--variant <type>:<value>[::label=<label>]\`
@@ -714,6 +1054,7 @@ See \`reference/json-mode\` for the full shape.
714
1054
  - \`concepts/round\` — what a round is and how it executes.
715
1055
  - \`concepts/audience\` — how testers are chosen at ask creation.
716
1056
  - \`concepts/run-verbs\` — \`ish ask run\` vs \`ish study run\`.
1057
+ - \`reference/credits\` — ask rounds bill 1 credit per successful response.
717
1058
  `;
718
1059
  const CONCEPT_ROUND = `# concept: round
719
1060
 
@@ -816,6 +1157,58 @@ Expected JSON: \`{ "name": "...", "type": "ai", "gender": "female",
816
1157
  Re-generating the same name/country/occupation/age yields the
817
1158
  same DOB.
818
1159
 
1160
+ ## Structured profile fields
1161
+
1162
+ Five universal enums + a versioned accessibility JSONB live on every
1163
+ TesterProfile. Values are snake_case and match
1164
+ \`https://ishlabs.io/spec/profile-enums.v1.json\` byte-for-byte.
1165
+
1166
+ - \`education_level\`: \`less_than_secondary\`, \`secondary\`,
1167
+ \`some_post_secondary\`, \`vocational_or_associate\`, \`bachelor\`, \`graduate\`
1168
+ - \`household\` (MECE): \`single\`, \`couple_no_kids\`, \`couple_with_kids\`,
1169
+ \`single_parent\`, \`shared_housing\`, \`adult_with_parents\`,
1170
+ \`multi_generational\`. A couple raising children is \`couple_with_kids\`,
1171
+ not \`couple_no_kids\`. \`single\` means lives alone (no partner,
1172
+ roommates, parents, or children sharing the household).
1173
+ - \`locale_type\`: \`urban\`, \`suburban\`, \`small_town\`, \`rural\`
1174
+ - \`income_level\`: \`lower\`, \`lower_middle\`, \`middle\`, \`upper_middle\`,
1175
+ \`upper\`, \`prefer_not_to_say\`
1176
+ - \`employment_status\`: \`employed_full_time\`, \`employed_part_time\`,
1177
+ \`self_employed\`, \`unemployed_seeking\`, \`student\`, \`homemaker\`,
1178
+ \`retired\`, \`unable_to_work\`, \`other\`. Pick the primary daytime
1179
+ activity: a student who works part-time is \`student\`; a retiree who
1180
+ freelances is \`retired\`.
1181
+ - \`accessibility_profile\`: JSONB v1.0 with optional \`visual\`,
1182
+ \`auditory\`, \`motor\`, \`cognitive\`, \`data\` groups, plus
1183
+ \`assistive_tech: string[]\` and \`notes\`. Empty \`{}\` means "no
1184
+ accessibility configuration declared". Schema:
1185
+ \`https://ishlabs.io/spec/accessibility-profile-schema.v1.json\`.
1186
+
1187
+ Set them on \`ish profile update\`:
1188
+
1189
+ \`\`\`
1190
+ ish profile update tp-1b9 \\
1191
+ --education-level bachelor \\
1192
+ --household couple_with_kids \\
1193
+ --locale-type suburban \\
1194
+ --income-level middle \\
1195
+ --employment-status employed_full_time
1196
+
1197
+ # accessibility_profile accepts inline JSON or a path:
1198
+ ish profile update tp-1b9 --accessibility-profile '{
1199
+ "version": "1.0",
1200
+ "visual": {"uses_screen_reader": true, "text_size": "large"},
1201
+ "cognitive": {"reduce_motion": true},
1202
+ "assistive_tech": ["VoiceOver"]
1203
+ }'
1204
+
1205
+ ish profile update tp-1b9 --accessibility-profile ./a11y.json
1206
+ \`\`\`
1207
+
1208
+ The legacy \`--tech-savviness\` flag was removed in
1209
+ \`profile-schema-v2\`; passing it now produces commander's standard
1210
+ "unknown option" error.
1211
+
819
1212
  ## Related
820
1213
 
821
1214
  - \`concepts/source\` — the inputs to \`profile generate\`.
@@ -829,7 +1222,7 @@ audio file, image, or PDF that an LLM reads to ground generated profiles
829
1222
  in real customer evidence.
830
1223
 
831
1224
  - Alias prefix: \`tps-\`
832
- - Source kinds: \`text_file | audio | image\` (auto-detected from extension).
1225
+ - Source kinds: \`text_file | audio | image\` (auto-detected from extension; \`text-file\` is accepted as a hyphen variant).
833
1226
  - Audio supports speaker diarization via \`--diarize\`.
834
1227
 
835
1228
  ## Two ways to use a source
@@ -894,6 +1287,52 @@ Error: No simulatable AI tester profiles in workspace w-b32 match:
894
1287
  The suggestion is best-effort — it never replaces the original error,
895
1288
  just augments it.
896
1289
 
1290
+ ## Audience-build behaviors to know before dispatch
1291
+
1292
+ Two adjacent footguns surface most often on first-time audience
1293
+ construction. Both are documented here because they cost a round-trip
1294
+ to discover by experiment.
1295
+
1296
+ ### \`occupation\` is a loose substring match
1297
+
1298
+ \`audience_build\` (and the \`--search\` flag) treats \`occupation\` as
1299
+ a **loose, case-insensitive substring filter**, not a whole-token /
1300
+ taxonomy match. \`occupation=["manager"]\` will match hotel managers,
1301
+ retail store managers, bank branch managers — anything containing
1302
+ the literal string "manager". Three patterns that recover the
1303
+ specificity you usually want:
1304
+
1305
+ - **Whole-token alternation**: \`occupation=["engineering manager",
1306
+ "software engineering manager", "vp engineering", "tech lead"]\` —
1307
+ exhaustive enumeration of the role surface beats one short token.
1308
+ - **Pair with other filters**: \`occupation=["manager"]\` +
1309
+ \`min_age=28\` + \`country=["US","SE"]\` narrows even a loose substring
1310
+ meaningfully.
1311
+ - **Preview before dispatch**: \`audience_build\` returns a
1312
+ \`match_preview\` summary on the response — a 1-line histogram of
1313
+ matched occupations (e.g. \`"matched 17 — software developer (12),
1314
+ DevOps engineer (3), other (2)"\`). Read it before
1315
+ \`ask_run\` / \`study_run\` to confirm the substring is matching what
1316
+ you intended; iterate on the filter cheaply if not.
1317
+
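The loose-substring behaviour is easy to reproduce locally, which helps when choosing filter tokens. The sketch below mimics the described semantics (case-insensitive substring, any filter may match). It is not the backend's code, and `matchesOccupation` is a hypothetical name.

```javascript
// Mimics the documented semantics: a profile matches when its occupation
// contains any filter string, ignoring case.
function matchesOccupation(occupation, filters) {
  const haystack = occupation.toLowerCase();
  return filters.some((f) => haystack.includes(f.toLowerCase()));
}
```

With this model, `["manager"]` matches "Hotel Manager" and "Engineering Manager" alike, while `["engineering manager"]` matches only the latter, which is why enumerating longer tokens recovers specificity.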
1318
+ ### The public profile pool skews non-tech / non-Western
1319
+
1320
+ The default public tester-profile pool was built from a broad
1321
+ demographic sample — so a substring like \`"software engineering
1322
+ manager"\` may return only a handful of matches, while \`"hotel
1323
+ manager"\` or \`"retail associate"\` return many. Two adaptations:
1324
+
1325
+ - **Don't assume Silicon Valley defaults.** A criteria-driven audience
1326
+ that works on a private testing pool may resolve to a much smaller
1327
+ count in the public pool. Read the \`match_preview\` (or count) on
1328
+ every \`audience_build\` before dispatching a run that depends on
1329
+ reaching N matches.
1330
+ - **Seed your own pool when you need a specific archetype.** If the
1331
+ public pool is genuinely thin for your role, generate the audience
1332
+ yourself via \`ish profile generate --description "..."\` — that
1333
+ produces profiles plausible for the role you described, regardless
1334
+ of public-pool composition. See \`concepts/profile\`.
1335
+
897
1336
  ## Defaults
898
1337
 
899
1338
  - \`ish study run\` with no audience flags → reuses the iteration's
@@ -1347,6 +1786,15 @@ The CLI guarantees these contracts so agents can chain safely:
1347
1786
  is collapsed to one batch entry per study (M13) with nested
1348
1787
  \`tester_ids[]\`, \`tester_aliases[]\`, \`job_ids[]\`, and \`count\` —
1349
1788
  an N-sample dispatch is a single row, not N near-duplicate rows.
1789
+ - **\`study\` JSON includes a \`url\` field.** \`study create\`,
1790
+ \`study generate\`, \`study get\`, \`study list\` (per item), and
1791
+ \`study run\` each return a top-level \`url\` pointing to the study
1792
+ in the web app — \`/<workspace>/<study>/overview\` on the read /
1793
+ write paths, \`/<workspace>/<study>/timeline\` on \`study run\`.
1794
+ Print it to the user instead of composing the host + path yourself.
1795
+ The base host follows the active backend: \`https://app.ishlabs.io\`
1796
+ on production, \`http://localhost:3000\` under \`--dev\`. Override
1797
+ with the \`ISH_APP_URL\` env var for staging or self-hosted UIs.
1350
1798
  - **\`study results --json\` includes per-answer sentiment** (M10).
1351
1799
  Every \`interview_answers[].answers[]\` row carries \`sentiment\`
1352
1800
  (the tester's session-level label from \`tester_summary.sentiment\`),
@@ -1357,16 +1805,30 @@ The CLI guarantees these contracts so agents can chain safely:
1357
1805
  error_message}. Drops \`interview_answers\` and per-interaction
1358
1806
  breakdowns. Cheapest "did this run land?" shape.
1359
1807
  - **\`study results --transcript <tester_id>\`** is the chat-modality
1360
- projection. Returns \`{tester_id, tester_alias, transcript: [...],
1361
- unique_bot_replies, tester_summary}\`. Each transcript entry is
1362
- \`{role, text, turn_index, ...}\` — bot turns add \`failure\`
1363
- (set when the dispatch crashed); tester turns add \`action_type\`,
1364
- \`option_label\`, and \`sentiment\`. \`text\` is null on tester
1365
- turns whose action carries no text (\`select_option\`,
1366
- \`ignore_offered\`); read intent from \`action_type\` +
1367
- \`option_label\`. Same shape as the MCP \`get_chat_transcript\`
1368
- tool. \`unique_bot_replies = 1\` on a multi-turn run is the M2 loop
1369
- signature.
1808
+ projection (**external_chatbot mode only in v1**). Returns
1809
+ \`{tester_id, tester_alias, transcript: [...], unique_bot_replies,
1810
+ tester_summary}\`. Each transcript entry is \`{role, text, turn_index,
1811
+ ...}\` — bot turns add \`failure\` (set when the dispatch crashed);
1812
+ tester turns add \`action_type\`, \`option_label\`, and \`sentiment\`.
1813
+ \`text\` is null on tester turns whose action carries no text
1814
+ (\`select_option\`, \`ignore_offered\`); read intent from
1815
+ \`action_type\` + \`option_label\`. Same shape as the MCP
1816
+ \`get_chat_transcript\` tool. \`unique_bot_replies = 1\` on a
1817
+ multi-turn run is the M2 loop signature.
1818
+
1819
+ **For tester_pair conversations**, the bot/tester role pair doesn't
1820
+ apply (both speakers are testers). Inspect pair transcripts via the
1821
+ iteration response instead:
1822
+
1823
+ \`\`\`bash
1824
+ ish iteration get <iter-id> --json | jq '.conversations[]'
1825
+ # → [{ id, pair_index, started_at, ended_at, end_reason, summary, ... }]
1826
+ \`\`\`
1827
+
1828
+ Per-side tester summaries still land on each tester row
1829
+ (\`ish study tester <id> --json\`); the conversation-level summary
1830
+ (\`end_reason\`, \`dominant_dynamic\`, \`who_steered\`) lands on
1831
+ \`iteration.conversations[]\`.
1370
1832
  - **\`study tester --summary\`** drops the action timeline and
1371
1833
  returns just \`{tester, interaction_count, sentiment, comment,
1372
1834
  error_message?, error_kind?}\`.
@@ -1438,9 +1900,24 @@ The CLI guarantees these contracts so agents can chain safely:
1438
1900
  phase-2 LLM calls instead of 2N. Pass \`--redispatch-all\` for the
1439
1901
  legacy reset behavior when you want fresh first impressions.
1440
1902
  - **\`ask results --json\` includes \`cross_round_summary\` for 2+
1441
- rounds.** Top-level field with per-round picks/winner snapshots and
1442
- a \`picks_delta\` (R1 to last round). Replaces hand-rolled diffing of
1443
- two \`ask results\` calls.
1903
+ rounds when every round used \`wants_pick=true\`.** Top-level
1904
+ field with per-round picks/winner snapshots and a \`picks_delta\`
1905
+ (R1 to last round). Replaces hand-rolled diffing of two
1906
+ \`ask results\` calls. When **any** round was dispatched with
1907
+ \`wants_pick=false\` (typical for free-text follow-up rounds), the
1908
+ summary is omitted and \`cross_round_summary_reason\` carries the
1909
+ explanation (e.g. \`"omitted: rounds 2, 3 lack wants_pick=true"\`).
1910
+ Branch on the reason field, don't poll for the summary.
1911
+ - **\`audience_get\` omits \`accessibility_profile\` by default.** The
1912
+ block is ~1KB per row; including it on a 50-row page overflows
1913
+ agent tool result budgets. Pass
1914
+ \`include_accessibility_profile=true\` to opt in. Mirrors the
1915
+ existing \`include_bio=false\` opt-in.
1916
+ - **\`ask_testers\` parameter is \`dispatch_into_round\`, not
1917
+ \`round\`.** Reads verbatim — "dispatch these new testers into round
1918
+ N". The old name (\`round\`) read as "start from round N", which
1919
+ was wrong: the call never restarts prior rounds, it only appends
1920
+ testers to the named round. Behavior unchanged across the rename.
1444
1921
  - **No more auto-empty iteration A.** \`study create\` and
1445
1922
  \`study generate\` no longer produce a placeholder iteration A. The
1446
1923
  first explicit \`ish iteration create\` becomes label A.
@@ -1739,6 +2216,168 @@ of scope: \`workspace\`, \`config\`, \`docs\`, \`init\`, \`login\`,
1739
2216
  including \`--get workspace.alias\` to capture the active workspace
1740
2217
  without piping \`ish status --json\` through \`jq\`.
1741
2218
  `;
2219
+ const REFERENCE_CREDITS = `# reference: credits & cost preview
2220
+
2221
+ Every billable run (study, ask, insight) costs **credits**. The CLI
2222
+ surfaces a cost upper bound *before* you dispatch so you can budget. The
2223
+ backend is the authoritative source — its rejection envelope on
2224
+ \`insufficient_credits\` carries the live required/available pair.
2225
+
2226
+ ## How costs are shaped
2227
+
2228
+ The formula has the same shape across modalities — \`max(1, round(N / 10))\`
2229
+ per principal — but the inputs differ. **Treat the rates below as the
2230
+ current calibration**; they will evolve as we differentiate per-modality
2231
+ compute cost. Agents should:
2232
+
2233
+ - For prospective cost preview: read \`credit_estimate\` from \`study run\`'s
2234
+ JSON envelope (top-level for solo/media runs; under \`pair_preview\` for
2235
+ tester-pair chat).
2236
+ - For hard budget checks: catch the backend's \`insufficient_credits\`
2237
+ rejection (HTTP 402; envelope shape below) and react to
2238
+ \`required\` / \`available\`.
2239
+
2240
+ | Surface | Per-principal cost | Total formula | Example |
2241
+ |---------------------|---------------------------------|--------------------------------------------------|--------------------------------------|
2242
+ | Interactive (URL) | \`max(1, round(steps/10))\` | \`testers × per-tester\` | 10 testers × 30 steps → 30 credits |
2243
+ | Text/image/video/audio/document | same | same | 5 testers × 20 steps → 10 credits |
2244
+ | Chat (external chatbot, solo) | \`max(1, round(turns/10))\` | \`testers × per-tester\` | 5 testers × 12 turns → 5 credits |
2245
+ | Chat (tester pair) | \`max(1, round(turns/10))\` per side | \`conv × per-side × 2\` | 3 conv × 14 turns → 6 credits |
2246
+ | Ask round | 1 / successful response | \`successful_testers\` | 50 responses → 50 credits |
2247
+ | Study insights | first free, then **10 flat** | n/a | 2nd analysis → 10 credits |
2248
+
2249
+ All numbers are **upper bounds**. Early termination, refusals, or
2250
+ backend audience trimming can reduce actual charge.
2251
+
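The table's arithmetic can be sketched as follows. This is an illustration of the stated formula, not the CLI's actual code: the function names are hypothetical, and it assumes `round` means JavaScript's `Math.round` (half rounds up), which the docs do not spell out.

```javascript
// Per-principal cost: max(1, round(N / 10)), where N is steps or turns.
// Assumption: rounding follows Math.round.
function perPrincipal(n) {
  return Math.max(1, Math.round(n / 10));
}

// Interactive, media, and solo chat runs: testers × per-tester cost.
function estimateSolo(testers, stepsOrTurns) {
  return testers * perPrincipal(stepsOrTurns);
}

// Tester-pair chat: conversations × per-side cost × 2 sides.
// Returns null when maxTurns is not a finite number, matching the
// documented behavior of the pair-chat credit_estimate.
function estimatePair(conversations, maxTurns) {
  if (!Number.isFinite(maxTurns)) return null;
  return conversations * perPrincipal(maxTurns) * 2;
}

console.log(estimateSolo(10, 30)); // 30
console.log(estimateSolo(5, 20)); // 10
console.log(estimatePair(3, 14)); // 6
```

These are upper bounds only; the backend's charge can be lower and remains authoritative.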
2252
+ ## Capping interactive/media spend (\`--max-interactions\`)
2253
+
2254
+ \`ish study run\` always sends \`max_interactions\` to the backend for
2255
+ interactive and media runs. Precedence: \`--max-interactions <n>\` flag
2256
+ > the iteration's stored \`details.max_interactions\` > **CLI default
2257
+ of 20**. The default exists to prevent runaway spend when a tester
2258
+ gets stuck on a broken or non-responsive surface — without a cap, one
2259
+ stuck tester can rack up 100+ steps before the SDK gives up. Pass
2260
+ \`--max-interactions\` to override (e.g. \`--max-interactions 50\` for
2261
+ deeper exploration, \`--max-interactions 5\` for a cheap smoke test).
2262
+ The confirmation block shows the resolved value and where it came
2263
+ from (flag / iteration / CLI default). The JSON envelope's
2264
+ \`credit_estimate.breakdown\` reflects the dispatched value.
2265
+
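The precedence rule above (flag, then the iteration's stored value, then the CLI default of 20) can be sketched like this. The function name and the "positive integer" validity check are assumptions made for illustration; only the ordering and the default come from the text.

```javascript
const CLI_DEFAULT_MAX_INTERACTIONS = 20;

// Assumption: a value only counts if it is a positive integer.
function isUsableCap(value) {
  return Number.isInteger(value) && value > 0;
}

// Resolve the cap and report where it came from, mirroring the
// "flag / iteration / CLI default" labels in the confirmation block.
function resolveMaxInteractions(flagValue, iterationDetails) {
  if (isUsableCap(flagValue)) {
    return { value: flagValue, source: "flag" };
  }
  const stored = iterationDetails?.max_interactions;
  if (isUsableCap(stored)) {
    return { value: stored, source: "iteration" };
  }
  return { value: CLI_DEFAULT_MAX_INTERACTIONS, source: "CLI default" };
}

console.log(resolveMaxInteractions(5, { max_interactions: 50 }));
// { value: 5, source: 'flag' }
console.log(resolveMaxInteractions(undefined, { max_interactions: 50 }));
// { value: 50, source: 'iteration' }
console.log(resolveMaxInteractions(undefined, {}));
// { value: 20, source: 'CLI default' }
```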
2266
+ ## Where the CLI surfaces it
2267
+
2268
+ **Human output — \`study run\` confirmation block:**
2269
+
2270
+ \`\`\`
2271
+ Run settings:
2272
+ ...
2273
+ Scale: 3 conv × 14 turns × 2 sides ≈ 84 LLM calls (upper bound — early-termination may shorten)
2274
+ Credits (est): ≈ 6 credit(s) upper bound — see \`ish docs get-page reference/credits\`
2275
+ \`\`\`
2276
+
2277
+ **JSON envelope — \`study run --json\`:**
2278
+
2279
+ Pair chat — under \`pair_preview\`:
2280
+
2281
+ \`\`\`json
2282
+ {
2283
+ "pair_preview": {
2284
+ "conversation_count": 3,
2285
+ "max_turns": 14,
2286
+ "llm_calls_upper_bound": 84,
2287
+ "credit_estimate": {
2288
+ "upper_bound": 6,
2289
+ "formula": "chat_pair",
2290
+ "breakdown": "3 conv × max(1, round(14 turns / 10)) × 2 sides = 3 × 1 × 2 = 6",
2291
+ "unit": "credits"
2292
+ }
2293
+ }
2294
+ }
2295
+ \`\`\`
2296
+
2297
+ Solo media/interactive/chat — top-level \`credit_estimate\`:
2298
+
2299
+ \`\`\`json
2300
+ {
2301
+ "iteration_id": "…",
2302
+ "credit_estimate": {
2303
+ "upper_bound": 30,
2304
+ "formula": "media_per_tester",
2305
+ "breakdown": "10 tester(s) × max(1, round(30 steps / 10)) = 10 × 3 = 30",
2306
+ "unit": "credits"
2307
+ }
2308
+ }
2309
+ \`\`\`
2310
+
2311
+ The \`formula\` key is stable: agents can branch on it (\`media_per_tester\`,
2312
+ \`chat_solo\`, \`chat_pair\`, \`ask_per_response\`).
2313
+
2314
+ ## Tier allotments
2315
+
2316
+ | Tier | Credit allotment | Notes |
2317
+ |-------------|---------------------------|--------------------------------|
2318
+ | FREE | 200 (one-time signup) | Never refilled |
2319
+ | STARTER | 1,000 / month | Monthly reset |
2320
+ | PRO | 3,000 / month | Monthly reset |
2321
+ | ENTERPRISE | unlimited | Custom contract |
2322
+
2323
+ The CLI does not enforce these — the backend does. The CLI's job is to
2324
+ *preview*, so an agent doesn't dispatch a 5,000-credit run on a
2325
+ 200-credit account.
2326
+
2327
+ ## Insufficient-credit rejection shape
2328
+
2329
+ When you try to dispatch beyond what's available, the backend returns
2330
+ HTTP 402. The CLI surfaces it as a structured error envelope:
2331
+
2332
+ \`\`\`json
2333
+ {
2334
+ "error": "Insufficient credits.",
2335
+ "error_code": "insufficient_credits",
2336
+ "status": 402,
2337
+ "retryable": false,
2338
+ "required": 30,
2339
+ "available": 8,
2340
+ "upgrade_url": "https://app.ishlabs.io/billing"
2341
+ }
2342
+ \`\`\`
2343
+
2344
+ Exit code \`1\` (non-retryable). Don't poll — the user has to upgrade or
2345
+ free credits before re-dispatch.
2346
+
2347
+ ## Agent recipe
2348
+
2349
+ 1. Build/draft the run (\`study create\`, \`iteration create\`).
2350
+ 2. Call \`study run\` *without* \`--dispatch\` to read the
2351
+ \`credit_estimate\` upper bound from JSON. (Or \`--dry-run\` where
2352
+ supported — see modality concept pages.)
2353
+ 3. If \`upper_bound\` fits your budget, re-call with \`--dispatch\`.
2354
+ 4. If you hit \`error_code: insufficient_credits\`, surface
2355
+ \`required\` / \`available\` / \`upgrade_url\` to the human.
2356
+
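A rough sketch of steps 2 to 4 of the recipe, operating on already-parsed JSON. The field names (`credit_estimate`, `pair_preview`, `upper_bound`, `error_code`, `required`, `available`, `upgrade_url`) come from the envelopes shown on this page; the function names, return shapes, and message wording are hypothetical.

```javascript
// The estimate sits at the top level for solo/media runs and under
// pair_preview for tester-pair chat; it may be null or absent.
function readEstimate(runJson) {
  return runJson.credit_estimate ?? runJson.pair_preview?.credit_estimate ?? null;
}

// Decide whether to re-call with dispatch, given a caller-chosen budget.
function shouldDispatch(runJson, budget) {
  const estimate = readEstimate(runJson);
  if (!estimate || typeof estimate.upper_bound !== "number") {
    return { dispatch: false, reason: "no_estimate" };
  }
  if (estimate.upper_bound > budget) {
    return { dispatch: false, reason: "over_budget", upperBound: estimate.upper_bound };
  }
  return { dispatch: true, upperBound: estimate.upper_bound };
}

// Turn an insufficient_credits envelope into a message for the human.
// Returns null for any other error so the caller handles it separately.
function describeRejection(envelope) {
  if (envelope.error_code !== "insufficient_credits") return null;
  return (
    `Need ${envelope.required} credits, have ${envelope.available}. ` +
    `Upgrade: ${envelope.upgrade_url}`
  );
}

console.log(shouldDispatch({ credit_estimate: { upper_bound: 30 } }, 50));
// { dispatch: true, upperBound: 30 }
```

Refusing to dispatch when no estimate is available is a conservative choice in this sketch; an agent with a generous budget might instead dispatch and rely on the backend's rejection.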
2357
+ ## Caveats
2358
+
2359
+ - The CLI's preview uses the **same formula** the backend bills with,
2360
+ but does **not** make a network preflight call — it's pure math
2361
+ client-side. If the backend formula changes mid-version, the preview
2362
+ will drift until the CLI is updated. The \`insufficient_credits\`
2363
+ rejection envelope is always authoritative.
2364
+ - Pair-chat \`credit_estimate\` is \`null\` if \`max_turns\` isn't a finite
2365
+ number (e.g. the iteration doesn't specify one and there's no
2366
+ \`--max-turns\` flag).
2367
+ - Audience criteria that resolve server-side won't have a precise
2368
+ estimate at preview time — the CLI prints the shape (\`N × … × 2\`)
2369
+ instead of a number.
2370
+
2371
+ ## Related
2372
+
2373
+ - \`reference/billing-limits\` — per-tier *entity* caps (max
2374
+ workspaces/studies/iterations/profiles), separate from credit budget.
2375
+ - \`reference/json-mode\` — full error envelope shape and exit codes.
2376
+ - \`concepts/study\`, \`concepts/iteration\`, \`concepts/ask\` —
2377
+ per-modality run shapes.
2378
+ - \`guides/chat\` — worked example of a pair-chat run including
2379
+ \`pair_preview.credit_estimate\`.
2380
+ `;
1742
2381
  const REFERENCE_BILLING_LIMITS = `# reference: billing tier limits
1743
2382
 
1744
2383
  Some create operations are gated by your account's billing tier. The
@@ -1812,6 +2451,9 @@ upgrade or delete an existing resource to free up headroom.
1812
2451
 
1813
2452
  ## Related
1814
2453
 
2454
+ - \`reference/credits\` — per-run credit cost & preview (separate from
2455
+ these entity caps; this page is about *how many things you can have*,
2456
+ that page is about *how much each run costs*).
1815
2457
  - \`concepts/workspace\` — \`maxProducts\` is per-account.
1816
2458
  - \`concepts/study\` — \`maxStudiesPerProduct\` gates study creation.
1817
2459
  - \`concepts/iteration\` — \`maxIterationsPerStudy\` gates iteration creation.
@@ -1820,6 +2462,51 @@ upgrade or delete an existing resource to free up headroom.
1820
2462
  `;
1821
2463
  const GUIDE_CHAT = `# guide: chat-modality studies
1822
2464
 
2465
+ Chat-modality studies cover two distinct shapes:
2466
+
2467
+ - **external_chatbot** — testers probe a customer chatbot endpoint
2468
+ (sections 1-3 below: configure → smoke test → run).
2469
+ - **tester_pair** — two AI personas converse with each other for
2470
+ rehearsal scenarios. Pitch rehearsals, difficult-conversation
2471
+ prep, founder-vs-investor archetypes. See section 7a/7b and the
2472
+ TL;DR below.
2473
+
2474
+ ## TL;DR — rehearse a pitch in one shot
2475
+
2476
+ For "rehearse my pitch against 3 different skeptical CTOs" (the
2477
+ canonical 1×N variations shape), this is the whole flow. Inline
2478
+ scenarios — no extra files needed:
2479
+
2480
+ \`\`\`bash
2481
+ # Capture aliases for the rep (1) and CTOs (3) via subshell:
2482
+ REP=$(ish profile generate \\
2483
+ --description "Senior B2B SaaS account executive; concise, technical" \\
2484
+ --count 1 --json | jq -r '.items[0].alias')
2485
+ CTOS=$(ish profile generate \\
2486
+ --description "Skeptical CTO at Series B SaaS; distrusts AI vendors" \\
2487
+ --count 3 --json | jq -r '[.items[].alias] | join(",")')
2488
+
2489
+ # One-shot study + iteration A (1×N broadcast does the rest):
2490
+ ish study create --modality chat --chat-mode tester_pair \\
2491
+ --name "Pitch rehearsal" \\
2492
+ --audience-a "$REP" --audience-b "$CTOS" \\
2493
+ --scenario-a "You are pitching <your product>. Be concise, push back on vague objections. Goal: land a pilot or a clear next step." \\
2494
+ --scenario-b "You are a skeptical CTO. Probe for technical depth, distrust marketing-speak, refuse to commit without evidence. Goal: leave with either a concrete proof point or a graceful 'no'." \\
2495
+ --assignment "Pitch:Land a pilot" --max-turns 14
2496
+
2497
+ # Run all 3 conversations:
2498
+ ish study run -y --wait
2499
+
2500
+ # Compare side-by-side:
2501
+ ish iteration get <iter-id> --json \\
2502
+ | jq '.conversations[] | {pair_index, end_reason, dynamic: .summary.dominant_dynamic}'
2503
+ \`\`\`
2504
+
2505
+ Section 7b below has the longer version with scenario-writing
2506
+ guidance, criteria-driven audiences, and the broadcast rule.
2507
+
2508
+ ---
2509
+
1823
2510
  Goal: from a customer chatbot endpoint to a finished chat-modality
1824
2511
  study with parsed transcripts, end to end via the CLI. The flow has
1825
2512
  three phases: configure the endpoint, smoke test it, run a study.
@@ -2113,13 +2800,20 @@ cat ./bot-config.json | ish study create \\
2113
2800
 
2114
2801
  Optional \`--max-turns <n>\` (default 12) caps the chat per tester.
2115
2802
 
2116
- Audience size is set at run time. Use \`--sample <N>\` to pick N
2117
- random simulatable profiles, or \`--all\` for the full pool.
2118
- \`--profile <id>\` is also supported for explicit selection:
2803
+ Audience size is set at run time for **external_chatbot** chat
2804
+ studies. Use \`--sample <N>\` to pick N random simulatable profiles,
2805
+ or \`--all\` for the full pool. \`--profile <id>\` is also supported
2806
+ for explicit selection:
2119
2807
  \`\`\`
2120
2808
  ish study run stu-xyz --sample 5 --wait
2121
2809
  \`\`\`
2122
2810
 
2811
+ > **Pair-mode is different.** \`--sample\` / \`--profile\` / demographic
2812
+ > filters on \`study run\` are **refused** for tester_pair iterations
2813
+ > — pair audiences live on the iteration itself. Set them at
2814
+ > iteration-create time via \`--audience-a/-b\` (with 1×N broadcast)
2815
+ > or \`--role-criteria-a/-b\`. See the tester_pair section below.
2816
+
2123
2817
  Pull raw interactions:
2124
2818
  \`\`\`
2125
2819
  ish study results stu-xyz --json | jq '.interactions'
@@ -2141,6 +2835,171 @@ ish iteration create --study stu-xyz --endpoint-config ./bot.json
2141
2835
 
2142
2836
  Same flag set as \`study create\`'s chat shortcut.
2143
2837
 
2838
+ ## tester_pair: rehearse a conversation between two AI personas
2839
+
2840
+ \`Modality.CHAT\` also supports a **tester_pair** mode where two AI
2841
+ tester profiles converse with each other — useful for rehearsing a
2842
+ sales pitch, a difficult conversation, a fundraising chat, or any
2843
+ two-role scenario. Each side has its own scenario + goal text; the
2844
+ other side does NOT see it (the asymmetry contract). Audiences are
2845
+ 1:1 paired by index (audience_a[i] talks to audience_b[i]).
2846
+
2847
+ One-shot study + iteration:
2848
+
2849
+ \`\`\`
2850
+ ish study create \\
2851
+ --modality chat --chat-mode tester_pair \\
2852
+ --name "Pitch rehearsal" \\
2853
+ --audience-a tp-sales-1,tp-sales-2 \\
2854
+ --audience-b tp-cto-skeptic-1,tp-cto-skeptic-2 \\
2855
+ --scenario-a @./sales_rep.md \\
2856
+ --scenario-b @./skeptical_cto.md \\
2857
+ --assignment "Pitch:Try to win the meeting"
2858
+ \`\`\`
2859
+
2860
+ Or add a pair iteration to an existing chat study:
2861
+
2862
+ \`\`\`
2863
+ ish iteration create --study stu-xyz --chat-mode tester_pair \\
2864
+ --audience-a tp-a1,tp-a2 --audience-b tp-b1,tp-b2 \\
2865
+ --scenario-a "..." --scenario-b "..." \\
2866
+ --max-turns 14
2867
+ \`\`\`
2868
+
2869
+ ### Rehearsing against N variations of one side (1×N)
2870
+
2871
+ The most common rehearsal shape: fix one side (your role) and vary
2872
+ the other (the audience you're rehearsing against). E.g. "pitch this
2873
+ once and see how it lands against 3 different skeptical CTOs."
2874
+
2875
+ Step 1 — produce N distinct profiles for the varying side:
2876
+
2877
+ \`\`\`bash
2878
+ # Generate 3 skeptical-CTO profiles (or any archetype):
2879
+ ish profile generate \\
2880
+ --description "Skeptical CTO at a Series B SaaS startup; distrusts AI vendors" \\
2881
+ --count 3 --json | jq -r '.items[].alias'
2882
+ # → tp-cto1, tp-cto2, tp-cto3
2883
+ \`\`\`
2884
+
2885
+ If you already have profiles you want to reuse, list them:
2886
+
2887
+ \`\`\`bash
2888
+ ish profile list --search "cto" --json | jq -r '.items[].alias'
2889
+ \`\`\`
2890
+
2891
+ Step 2 — author the two scenarios as separate files (\`sales_rep.md\`
2892
+ and \`skeptical_cto.md\`). **Each scenario is a system prompt for one
2893
+ role — the other side never sees it.** Cover voice, what they know,
2894
+ what they don't know, and what counts as success for them. Don't
2895
+ cram demographic constraints into the text; that's what
2896
+ \`--role-criteria-\*\` is for. See the **"Writing a good scenario"**
2897
+ section below for the Maya/Devon worked example and the 5-point
2898
+ template.
2899
+
2900
+ Step 3 — create the iteration with **one profile** on the fixed
2901
+ side and **N profiles** on the varying side. The CLI auto-broadcasts
2902
+ the singleton to match length N (and prints a stderr notice like
2903
+ \`Broadcasting --audience-a (1 profile) to length 3 to match --audience-b\`
2904
+ when it does, so you can see it happen):
2905
+
2906
+ \`\`\`bash
2907
+ ish study create \\
2908
+ --modality chat --chat-mode tester_pair \\
2909
+ --name "Pitch rehearsal — 3 CTO variants" \\
2910
+ --audience-a tp-rep \\
2911
+ --audience-b tp-cto1,tp-cto2,tp-cto3 \\
2912
+ --scenario-a @./sales_rep.md \\
2913
+ --scenario-b @./skeptical_cto.md \\
2914
+ --assignment "Pitch:Land a pilot or a clear next step"
2915
+
2916
+ # Result: 3 conversations, all using tp-rep on side A, one each
2917
+ # of tp-cto1/2/3 on side B. Same scenario for the CTOs (they share
2918
+ # the role description) but different underlying personas, so the
2919
+ # conversations diverge in tone and pressure points.
2920
+ \`\`\`
2921
+
2922
+ Run it (\`--yes\` to skip the confirmation prompt):
2923
+
2924
+ \`\`\`bash
2925
+ ish study run -y --wait
2926
+ \`\`\`
2927
+
2928
+ Inspect the per-conversation summaries side-by-side:
2929
+
2930
+ \`\`\`bash
2931
+ ish iteration get <iter-id> --json \\
2932
+ | jq '.conversations[] | {pair_index, end_reason, dominant_dynamic: .summary.dominant_dynamic}'
2933
+ \`\`\`
2934
+
2935
+ **When to use criteria instead**: if you don't care about specific
2936
+ profile IDs and just want "any 3 CTOs the backend can find", pass
2937
+ \`--role-criteria-b '{"occupation":["cto"]}'\` (alone or with a single
2938
+ \`--audience-a tp-rep\`). The backend resolves the matching pool at
2939
+ iteration-create time. Caveat: the resolved pool may collapse onto
2940
+ similar personas — for guaranteed distinctness, generate explicit
2941
+ profiles first.
2942
+
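The 1×N broadcast and index pairing described in this section can be sketched as below. This is an illustration of the rule, not the CLI's implementation: the function name is hypothetical, and both the zero-based `pair_index` and the choice to throw on mismatched lengths are assumptions.

```javascript
// Pair audience A with audience B by index. If exactly one side has a
// single profile, repeat it to match the other side's length.
function pairAudiences(audienceA, audienceB) {
  let a = [...audienceA];
  let b = [...audienceB];
  if (a.length === 1 && b.length > 1) {
    a = Array(b.length).fill(a[0]);
  } else if (b.length === 1 && a.length > 1) {
    b = Array(a.length).fill(b[0]);
  }
  if (a.length !== b.length) {
    // Assumption: mismatched non-singleton lengths are an error.
    throw new Error(`audience lengths differ: ${a.length} vs ${b.length}`);
  }
  return a.map((profile, i) => ({ pair_index: i, a: profile, b: b[i] }));
}

console.log(pairAudiences(["tp-rep"], ["tp-cto1", "tp-cto2", "tp-cto3"]));
// [
//   { pair_index: 0, a: 'tp-rep', b: 'tp-cto1' },
//   { pair_index: 1, a: 'tp-rep', b: 'tp-cto2' },
//   { pair_index: 2, a: 'tp-rep', b: 'tp-cto3' }
// ]
```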
2943
+ ### Criteria-driven audience (persona-first filtering)
2944
+
2945
+ When you don't want to hand-pick UUIDs, pass a **role-criteria
2946
+ filter** per side. The backend resolves it into an eligible pool of
2947
+ tester profiles and pairs them 1:1. The persona itself is never
2948
+ altered — criteria filter the pool upstream so the persona is
2949
+ already plausible for the role:
2950
+
2951
+ \`\`\`
2952
+ ish study create \\
2953
+ --modality chat --chat-mode tester_pair \\
2954
+ --name "Pitch rehearsal" \\
2955
+ --role-criteria-a '{"occupation":["sales","account executive"],"min_age":28}' \\
2956
+ --role-criteria-b '{"occupation":["cto","vp engineering"],"country":["US","SE"]}' \\
2957
+ --scenario-a @./sales_rep.md --scenario-b @./skeptical_cto.md \\
2958
+ --assignment "Pitch:Try to land a pilot"
2959
+ \`\`\`
2960
+
2961
+ Keys (all optional): \`occupation\`, \`min_age\`, \`max_age\`,
2962
+ \`gender\`, \`country\`, \`education_level_in\`, \`household_in\`,
2963
+ \`locale_type_in\`, \`income_level_in\`, \`employment_status_in\`,
2964
+ \`requires_captions\`, \`uses_screen_reader\`, \`prefers_reduced_motion\`,
2965
+ \`prefers_high_contrast\`, \`has_any_accessibility_need\`. The five \`*_in\`
2966
+ arrays accept snake_case spec values; the five accessibility filters are
2967
+ booleans. Combine \`--profile-*\` and \`--role-criteria-*\` on the same side
2968
+ to make criteria validate an explicit list (mismatch blocks the run).
2969
+
2970
+ MECE notes for the list filters:
2971
+ - \`household_in\`: \`couple_with_kids\` covers couples raising children;
2972
+ \`couple_no_kids\` is strictly child-free. \`single\` means lives alone
2973
+ (no partner, no roommates, no parents, no children in the household).
2974
+ - \`employment_status_in\`: pick the tester's primary daytime activity.
2975
+ A student who works 15 hrs/week is \`student\`; a retiree who freelances
2976
+ is \`retired\`.
2977
+
2978
+ If the resolved pool is too small, \`ish study run\` exits 2 with the
2979
+ backend's error message intact — no silent fallback. Broaden the
2980
+ criteria or generate more matching profiles via
2981
+ \`ish profile generate --description "..."\`.
2982
+
2983
+ Dispatch is per-Conversation (one task per pair index). Run-time
2984
+ audience overrides (\`--profile\`, \`--sample\`, \`--all\`, demographic
2985
+ filters) are refused on pair iterations — the iteration's audiences
2986
+ are authoritative. To change them, update the iteration:
2987
+
2988
+ \`\`\`
2989
+ ish study run --study stu-xyz --iteration i-pair -y
2990
+ ish iteration update i-pair --details-json '{...}' # change audiences
2991
+ \`\`\`
2992
+
2993
+ Inspect:
2994
+
2995
+ \`\`\`
2996
+ ish iteration get i-pair --json | jq '.details.mode_details.mode, .conversations[]'
2997
+ \`\`\`
2998
+
2999
+ Per-Conversation summaries (\`end_reason\`, \`dominant_dynamic\`,
3000
+ \`who_steered\`) land on \`iteration.conversations[]\`. Per-tester
3001
+ summaries land on \`tester.summary\` as before.
3002
+
2144
3003
  ## Active-endpoint convention
2145
3004
 
2146
3005
  \`ish chat endpoint use <id>\` writes the endpoint to
@@ -2171,12 +3030,211 @@ Mirrors \`workspace use\` / \`study use\` / \`ask use\`.
2171
3030
 
2172
3031
  ## Related
2173
3032
 
2174
- - \`concepts/iteration\` — chat iteration shape (\`details.endpoint\`,
2175
- \`details.chatbot_endpoint_id\`, \`details.max_turns\`).
3033
+ - \`concepts/iteration\` — chat iteration shape
3034
+ (\`details.mode_details\` discriminator, \`mode_details.endpoint\` /
3035
+ \`mode_details.chatbot_endpoint_id\` for external_chatbot,
3036
+ \`mode_details.audience_a/_b\` + \`scenario_a/_b\` for tester_pair,
3037
+ \`details.max_turns\`).
2176
3038
  - \`concepts/study\` — modality + assignments + iteration nesting.
2177
3039
  - \`reference/json-mode\` — JSON output, error envelope, exit codes.
2178
3040
  - \`guides/first-study\` — the same pattern for an interactive
2179
3041
  modality study.
3042
+ - \`guides/cold-start\` — the saturated-account first-step playbook
3043
+ if \`workspace_create\` returns \`usage_limit_reached\`.
3044
+ `;
3045
+ const GUIDE_COLD_START = `# guide: cold start on a saturated account
3046
+
3047
+ The naive cold-start instruction — "create a fresh workspace, then run
3048
+ a study" — fails immediately on any account that has accumulated state.
3049
+ \`workspace_create\` (CLI: \`ish workspace create\`) returns
3050
+ \`error_code: usage_limit_reached\` once the caller hits
3051
+ \`maxProducts\` for their tier (1 on FREE). On a saturated dogfood
3052
+ account this is the first call an agent burns. This guide is the
3053
+ recovery path: inspect existing state, pick a reuse target, or call
3054
+ the idempotent create-or-reuse-by-name path.
3055
+
3056
+ ## The shape of the failure
3057
+
3058
+ \`\`\`json
3059
+ // workspace_create / POST /products on a FREE-tier account with 22 workspaces:
3060
+ {
3061
+ "error": "Free plan allows 1 workspace (you have 22).",
3062
+ "error_code": "usage_limit_reached",
3063
+ "status": 403,
3064
+ "retryable": false,
3065
+ "tier": "free",
3066
+ "limit": "maxProducts",
3067
+ "current": 22,
3068
+ "max": 1,
3069
+ "upgrade_url": "https://app.ishlabs.io/billing"
3070
+ }
3071
+ \`\`\`
3072
+
3073
+ Don't retry. The cap is server-enforced. You have three recovery
3074
+ paths:
3075
+
3076
+ 1. **Reuse an existing workspace** (most cases).
3077
+ 2. **Use the idempotent \`--ensure\` path** if you have a stable name
3078
+ the user wants to claim.
3079
+ 3. **Surface the upgrade link** if neither fits.
3080
+
3081
+ ## Step 1 — inspect before you create
3082
+
3083
+ Always start a cold-start session by listing what's already there.
3084
+ \`workspace_get\` / \`ish workspace list --json\` returns rows with
3085
+ the metadata you need to pick safely:
3086
+
3087
+ \`\`\`bash
3088
+ ish workspace list --json
3089
+ \`\`\`
3090
+
3091
+ \`\`\`json
3092
+ {
3093
+ "items": [
3094
+ {
3095
+ "id": "...", "alias": "w-6ec", "name": "Onboarding revamp",
3096
+ "base_url": "https://example.com",
3097
+ "last_activity_at": "2026-05-10T14:22:00Z",
3098
+ "child_counts": { "studies": 2, "asks": 1, "tester_profiles": 4 },
3099
+ "has_headroom": true
3100
+ },
3101
+ {
3102
+ "id": "...", "alias": "w-d02", "name": "Demo",
3103
+ "last_activity_at": "2025-11-02T09:11:00Z",
3104
+ "child_counts": { "studies": 3, "asks": 0, "tester_profiles": 0 },
3105
+ "has_headroom": false
3106
+ }
3107
+ ],
3108
+ "total": 22, "returned": 22, "limit": 50, "offset": 0, "has_more": false
3109
+ }
3110
+ \`\`\`
3111
+
3112
+ Read three fields per row:
3113
+
3114
+ - **\`last_activity_at\`** — most recent run, iteration, ask, or write
3115
+ on this workspace. The most recently active one is usually the
3116
+ workspace the user is mentally already in.
3117
+ - **\`child_counts\`** — \`{ studies, asks, tester_profiles }\`. Zero
3118
+ across the board = quiet/empty, ideal reuse target without
3119
+ cluttering anyone's view. A workspace with content the user owns is
3120
+ also fine to reuse if there's still headroom.
3121
+ - **\`has_headroom\`** — \`true\` if the workspace still has room under
3122
+ \`maxStudiesPerProduct\`, \`maxIterationsPerStudy\`, and
3123
+ \`maxCustomTesterProfiles\` for the caller's tier. If \`false\`, the
3124
+ next \`study create\` / \`profile generate\` against this workspace
3125
+ will be \`usage_limit_reached\`. Filter these out unless the user
3126
+ explicitly wants to free space by deleting state.
3127
+
3128
+ ## Step 2 — pick a reuse target (decision rule)
3129
+
3130
+ \`\`\`
3131
+ For each workspace in workspace_get():
3132
+ if has_headroom == false: skip (next call would fail)
3133
+ if name matches the user's intent: use it (early return)
3134
+ if child_counts == 0 across board: candidate (empty workspace)
3135
+ else candidate (active but not user's intent)
3136
+
3137
+ If candidates exist:
3138
+ prefer name-match > most-recent last_activity_at > lowest child_counts
3139
+
3140
+ If zero candidates with has_headroom == true:
3141
+ the account is genuinely saturated — surface upgrade_url
3142
+ from the next workspace_create's error envelope.
3143
+ \`\`\`
3144
+
3145
+ \`\`\`bash
3146
+ ish workspace use w-6ec # commit the choice; saves to ~/.ish/config.json
3147
+ \`\`\`
3148
+
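The decision rule above can be sketched as a small function over the parsed `items` array from `ish workspace list --json`. The function names are hypothetical, and two details are assumptions: "name matches the user's intent" is read as a case-insensitive exact match, and "lowest child_counts" is read as the smallest sum of the three counters.

```javascript
// Sum of studies + asks + tester_profiles; missing counters count as 0.
function totalChildren(ws) {
  const c = ws.child_counts ?? {};
  return (c.studies ?? 0) + (c.asks ?? 0) + (c.tester_profiles ?? 0);
}

// Milliseconds since epoch for last_activity_at; 0 when absent or unparseable.
function activityMs(ws) {
  const ms = Date.parse(ws.last_activity_at);
  return Number.isNaN(ms) ? 0 : ms;
}

// Returns the workspace to reuse, or null when nothing has headroom.
function pickWorkspace(items, intentName) {
  const candidates = items.filter((ws) => ws.has_headroom === true);
  if (candidates.length === 0) return null;

  const wanted = (intentName ?? "").trim().toLowerCase();
  if (wanted) {
    const byName = candidates.find(
      (ws) => (ws.name ?? "").trim().toLowerCase() === wanted
    );
    if (byName) return byName;
  }

  // Most recent activity first; ties go to the emptier workspace.
  return [...candidates].sort((x, y) => {
    const byRecency = activityMs(y) - activityMs(x);
    return byRecency !== 0 ? byRecency : totalChildren(x) - totalChildren(y);
  })[0];
}

const exampleRows = [
  { alias: "w-6ec", name: "Onboarding revamp", last_activity_at: "2026-05-10T14:22:00Z",
    child_counts: { studies: 2, asks: 1, tester_profiles: 4 }, has_headroom: true },
  { alias: "w-d02", name: "Demo", last_activity_at: "2025-11-02T09:11:00Z",
    child_counts: { studies: 3, asks: 0, tester_profiles: 0 }, has_headroom: false },
];
console.log(pickWorkspace(exampleRows).alias); // w-6ec
```

A `null` result corresponds to the "zero candidates" branch of the rule: surface the upgrade link rather than creating or deleting anything.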
3149
+ ## Step 3 — or use \`--ensure\` to skip the decision tree
3150
+
3151
+ When you have a stable workspace name the user owns (e.g. a brand
3152
+ name, a project codename), use the idempotent path:
3153
+
3154
+ \`\`\`bash
3155
+ ish workspace create --name "Acme — onboarding revamp" --ensure
3156
+ \`\`\`
3157
+
3158
+ Behavior:
3159
+
3160
+ - If a workspace with that exact name exists and is owned by the
3161
+ caller, returns it (HTTP 200, no quota consumed, no error).
3162
+ - Otherwise creates a fresh one (HTTP 201; consumes one
3163
+ \`maxProducts\` slot, so still subject to the tier cap).
3164
+ - The returned envelope is the same shape either way — agents don't
3165
+ branch on create vs. reuse.
3166
+
3167
+ This is the right call when you don't want to scrape the list
3168
+ yourself or risk a name clash. Pair it with the inspection step
3169
+ when the saturated state matters (e.g. you also need to know
3170
+ \`has_headroom\` before \`study create\`).
3171
+
3172
+ ## Worked transcript — saturated account, agent recovery
3173
+
3174
+ \`\`\`bash
3175
+ # 1. Probe state before doing anything else.
3176
+ ish workspace list --json --fields alias,name,last_activity_at,child_counts,has_headroom \\
3177
+ | jq '.items | sort_by(.last_activity_at) | reverse | .[0:5]'
3178
+
3179
+ # Output (truncated to top-5 most-recently-active):
3180
+ # [
3181
+ # {"alias":"w-6ec","name":"Onboarding revamp",
3182
+ # "last_activity_at":"2026-05-10T14:22:00Z",
3183
+ # "child_counts":{"studies":2,"asks":1,"tester_profiles":4},
3184
+ # "has_headroom":true},
3185
+ # {"alias":"w-d02","name":"Demo",
3186
+ # "last_activity_at":"2025-11-02T09:11:00Z",
3187
+ # "child_counts":{"studies":3,"asks":0,"tester_profiles":0},
3188
+ # "has_headroom":false},
3189
+ # ...
3190
+ # ]
3191
+
3192
+ # 2. Pick a workspace with has_headroom=true (w-6ec here).
3193
+ ish workspace use w-6ec
3194
+
3195
+ # 3. Carry on as if the workspace_create had succeeded.
3196
+ ish profile generate --description "..." --count 3
3197
+ ish study create --modality interactive --name "..." \\
3198
+ --url https://example.com \\
3199
+ --assignment "..." --question "..."
3200
+ ish study run --all --wait
3201
+ \`\`\`
3202
+
3203
+ If the agent prefers \`--ensure\` (e.g. so the user sees their
3204
+ preferred name in the UI):
3205
+
3206
+ \`\`\`bash
3207
+ WS=$(ish workspace create --name "Cold-start probe" --ensure --get alias)
3208
+ ish workspace use "$WS"
3209
+ \`\`\`
3210
+
3211
+ ## When the account is genuinely saturated
3212
+
3213
+ If every workspace has \`has_headroom: false\` AND \`maxProducts\` is
3214
+ at cap (\`current == max\`), there is no path to a new study without
3215
+ either upgrading the plan or deleting an existing workspace. Surface
3216
+ the \`upgrade_url\` from the \`usage_limit_reached\` envelope to the
3217
+ human and stop — don't guess which workspace to delete on the user's
3218
+ behalf.
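The stop condition can be written down explicitly. The sketch below is illustrative only: `isGenuinelySaturated` and `handleLimitError` are hypothetical names, the `products` object with `current` and `max` fields is an assumed shape for the `maxProducts` counter, and only `error_code` and `upgrade_url` are taken from the envelope described in this guide.

```javascript
// Hypothetical check: every workspace is full AND the workspace cap is hit.
function isGenuinelySaturated(workspaces, products) {
  const noHeadroom = workspaces.every((ws) => ws.has_headroom !== true);
  // A null max is treated as "unlimited" (assumption).
  const atCap = products.max !== null && products.current >= products.max;
  return noHeadroom && atCap;
}

// Hypothetical handler: decide what an agent does with an error envelope.
function handleLimitError(workspaces, products, envelope) {
  if (envelope.error_code !== "usage_limit_reached") {
    return { action: "rethrow" };
  }
  if (isGenuinelySaturated(workspaces, products)) {
    // Non-retryable: hand the link to the human, never delete on their behalf.
    return { action: "ask_human", upgrade_url: envelope.upgrade_url };
  }
  // Some headroom remains: go back to the reuse decision tree.
  return { action: "reuse_existing" };
}
```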
3219
+
3220
+ ## Why this matters
3221
+
3222
+ Two of four dogfood agents stopped on \`workspace_create\` on a
3223
+ saturated account before producing any signal — the very first call
3224
+ in the cold-start script was the cap-hitter. Inspecting
3225
+ \`workspace_get\` first (or going through \`--ensure\`) cuts that
3226
+ class of failure to zero. The \`last_activity_at\` / \`child_counts\` /
3227
+ \`has_headroom\` fields exist specifically so an agent can branch
3228
+ without a second round-trip.
3229
+
3230
+ ## Related
3231
+
3232
+ - \`concepts/workspace\` — workspace fundamentals, including
3233
+ \`workspace info\` for in-workspace usage counters.
3234
+ - \`reference/billing-limits\` — the full tier × cap table; \`maxProducts\`
3235
+ drives \`workspace_create\` rejections.
3236
+ - \`reference/json-mode\` — error envelope shape and exit code mapping
3237
+ (\`usage_limit_reached\` is HTTP 403, exit 1, non-retryable).
2180
3238
  `;
2181
3239
  const PAGES = [
2182
3240
  {
@@ -2200,7 +3258,7 @@ const PAGES = [
2200
3258
  {
2201
3259
  slug: "concepts/iteration",
2202
3260
  title: "concept: iteration",
2203
- description: "One configured run of a study (URL, media, or chat). Covers segments, segment labels, and HTML content.",
3261
+ description: "One configured run of a study (URL, media, or chat). Covers segments, segment labels, HTML content, and chat mode_details (external_chatbot vs tester_pair).",
2204
3262
  body: CONCEPT_ITERATION,
2205
3263
  },
2206
3264
  {
@@ -2293,6 +3351,12 @@ const PAGES = [
2293
3351
  description: "Per-tier caps on workspaces/studies/iterations/profiles; usage_limit_reached error shape.",
2294
3352
  body: REFERENCE_BILLING_LIMITS,
2295
3353
  },
3354
+ {
3355
+ slug: "reference/credits",
3356
+ title: "reference: credits & cost preview",
3357
+ description: "Per-modality credit cost formulas, where the CLI surfaces cost estimates (Scale line, pair_preview.credit_estimate, top-level credit_estimate), tier allotments, insufficient_credits error shape.",
3358
+ body: REFERENCE_CREDITS,
3359
+ },
2296
3360
  {
2297
3361
  slug: "guides/first-study",
2298
3362
  title: "guide: your first study, end to end",
@@ -2302,9 +3366,15 @@ const PAGES = [
2302
3366
  {
2303
3367
  slug: "guides/chat",
2304
3368
  title: "guide: chat-modality studies",
2305
- description: "Configure a chatbot endpoint (slots-only model), smoke test it, run a chat-modality study. Covers slot bindings, streaming endpoints, and built-in templates.",
3369
+ description: "Configure a chatbot endpoint (slots-only model), smoke test it, run a chat-modality study (external_chatbot mode). Also covers tester_pair mode, where two AI personas talk to each other for rehearsal scenarios.",
2306
3370
  body: GUIDE_CHAT,
2307
3371
  },
3372
+ {
3373
+ slug: "guides/cold-start",
3374
+ title: "guide: cold start on a saturated account",
3375
+ description: "What to do when workspace_create returns usage_limit_reached on a saturated account. Inspect workspace_get (has_headroom / child_counts / last_activity_at), pick a reuse target, or call ish workspace create --name <name> --ensure.",
3376
+ body: GUIDE_COLD_START,
3377
+ },
2308
3378
  ];
2309
3379
  const PAGES_BY_SLUG = new Map(PAGES.map((p) => [p.slug, p]));
2310
3380
  export function listPages() {