@ishlabs/cli 0.21.0 → 0.23.0

package/dist/lib/docs.js CHANGED
@@ -635,7 +635,7 @@ Tunables (both modes):
635
635
  the parties signal the conversation is over.
636
636
 
637
637
  Pair-mode rules:
638
- - Each side needs **either** \`--profile-*\` (explicit IDs) **or**
638
+ - Each side needs **either** \`--group-a\` / \`--group-b\` (explicit IDs) **or**
639
639
  \`--role-criteria-*\` (filter the backend resolves). The two can also
640
640
  be combined — criteria then acts as validation on the explicit list.
641
641
  - When both sides use explicit \`--group-a\` / \`--group-b\`, they
@@ -657,7 +657,7 @@ Pair-mode rules:
657
657
  \`type\` field in \`--questionnaire\` / \`--questions\` manifests
658
658
  (\`single-choice\` ↔ \`single_choice\`).
659
659
  - Audiences are pinned to the iteration. \`ish study run\` refuses
660
- run-time people overrides (\`--profile\` / \`--sample\` / \`--all\` /
660
+ run-time people overrides (\`--person\` / \`--sample\` / \`--all\` /
661
661
  filters) on a pair iteration — change the peoples via
662
662
  \`ish iteration update <id> --details-json '{...}'\` instead.
663
663
  - \`--max-turns\` / \`--early-termination\` on \`ish study run\` override
@@ -1174,7 +1174,7 @@ const CONCEPT_PROFILE = `# concept: person
1174
1174
  A **person** is a reusable persona — the simulated
1175
1175
  human whose behaviour drives a participant instance during a study or ask.
1176
1176
 
1177
- - Alias prefix: \`tp-\`
1177
+ - Alias prefix: \`p-\`
1178
1178
  - Lives at the workspace level, reusable across studies and asks.
1179
1179
  - Distinct from a "participant" (\`pt-\`) — a participant is one *instance* of a
1180
1180
  profile inside one iteration.
@@ -1336,7 +1336,7 @@ A **source** is an input to \`ish person generate\`: a transcript,
1336
1336
  audio file, image, or PDF that an LLM reads to ground generated profiles
1337
1337
  in real customer evidence.
1338
1338
 
1339
- - Alias prefix: \`tps-\`
1339
+ - Alias prefix: \`ps-\`
1340
1340
  - Source kinds: \`text_file | audio | image\` (auto-detected from extension; \`text-file\` is accepted as a hyphen variant).
1341
1341
  - Audio supports speaker diarization via \`--diarize\`.
1342
1342
 
@@ -1406,7 +1406,7 @@ flags. Two ways to select:
1406
1406
  \`platform\` until the next release with a server-side
1407
1407
  deprecation warning)
1408
1408
 
1409
- The two modes are **mutually exclusive** — pass either \`--profile\` or
1409
+ The two modes are **mutually exclusive** — pass either \`--person\` or
1410
1410
  the filter set, not both.
1411
1411
 
1412
1412
  ## Empty-pool suggestions
@@ -1658,7 +1658,7 @@ and what they target differ.
1658
1658
  | Default | latest iteration of the active study | append a round to the active ask |
1659
1659
  | Fresh setup | \`ish iteration create …\` first, then run | \`--new\` (creates ask + round 1 in one shot) |
1660
1660
  | Specific target| \`--iteration <id>\` | positional ask id (\`a-6ec\`) |
1661
- | Audience | \`--profile\` OR filters with \`--sample\`/\`--all\` — else reuse iteration's participants | only at \`--new\`; fixed for the ask afterwards |
1661
+ | Audience | \`--person\` OR filters with \`--sample\`/\`--all\` — else reuse iteration's participants | only at \`--new\`; fixed for the ask afterwards |
1662
1662
  | Output unit | per-participant interactions + questionnaire answers | per-participant reactions per round |
1663
1663
 
1664
1664
  ## Decision rule
@@ -1711,6 +1711,23 @@ removed); \`extend\` then spawns a fresh participant branched from the
1711
1711
  cancelled participant's last interaction. See
1712
1712
  \`concepts/extending-a-simulation\` for the full mental model.
1713
1713
 
1714
+ ## Stuck runs are auto-failed (no manual intervention)
1715
+
1716
+ If a worker dies mid-run (instance preemption, OOM, infra restart), the
1717
+ backend reaper transitions the participant to
1718
+ \`status: failed, error_kind: stale_worker\` within ~15 min — you don't
1719
+ need to \`cancel\` it. The status payload returned by
1720
+ \`/simulation/status/{participant_id}\` (and surfaced on \`study wait\`,
1721
+ \`study run --wait\`, \`study poll\`) includes \`age_seconds\` so agents
1722
+ can tell "just slow" from "the worker is gone." Once \`age_seconds\`
1723
+ exceeds ~900s for a non-terminal participant the wait-timeout envelope
1724
+ explicitly flags it as likely stuck — stop polling and let the reaper
1725
+ finish the row.
1726
+
1727
+ \`error_kind: self_timeout\` is the same idea written by the worker
1728
+ itself when it self-detects passing its 25-min ceiling; \`stale_worker\`
1729
+ is the reaper's verdict when the row simply stopped reporting.
1730
+
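The liveness rules added in this hunk (`age_seconds`, the ~900s threshold, `stale_worker` versus `self_timeout`) could be consumed by a polling agent roughly as follows. This is a minimal sketch, not CLI code: `classifyRow`, `TERMINAL_STATUSES`, and `STUCK_AFTER_SECONDS` are hypothetical names, and the set of terminal status strings is an assumption, since the diff only shows `failed` and `completed` / `complete` explicitly.

```javascript
// Hypothetical agent-side helper; not part of @ishlabs/cli.
// Assumption: these status strings are terminal. `cancelled` is a guess
// based on the docs saying extend accepts a cancelled source.
const TERMINAL_STATUSES = new Set(["completed", "complete", "failed", "cancelled"]);
const STUCK_AFTER_SECONDS = 900; // the "~900s" threshold quoted in the docs

function classifyRow(row) {
  const status = String(row.status || "unknown").toLowerCase();
  if (TERMINAL_STATUSES.has(status)) {
    if (status === "failed") {
      // Branch on error_kind instead of parsing the prose error message.
      if (row.error_kind === "stale_worker") return "failed:worker_died";
      if (row.error_kind === "self_timeout") return "failed:timed_out";
      return "failed:unknown";
    }
    return "done";
  }
  // A missing age_seconds may just mean an older backend: keep polling.
  if (typeof row.age_seconds !== "number") return "running";
  return row.age_seconds > STUCK_AFTER_SECONDS ? "likely_stuck" : "running";
}

console.log(classifyRow({ status: "running", age_seconds: 1200 })); // likely_stuck
```

On `likely_stuck` the docs suggest the agent stop polling and let the reaper finish the row, rather than issuing a `cancel`.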
1714
1731
  ## Related
1715
1732
 
1716
1733
  - \`reference/json-mode\` — output modes (display vs capture vs chain).
@@ -1744,9 +1761,12 @@ mid-run?" scenario without restarting from scratch.
1744
1761
  When extend is **not** the right verb:
1745
1762
 
1746
1763
  - Source participant is still RUNNING. \`cancel\` it first, then extend.
1747
- Extend refuses non-terminal sources server-side.
1764
+ Extend refuses non-terminal sources server-side. **Exception:** a
1765
+ stale-heartbeat RUNNING row (worker died mid-run) is reaped to
1766
+ \`failed, error_kind: stale_worker\` automatically within ~15 min — no
1767
+ manual \`cancel\` needed; just wait for the reaper, then extend.
1748
1768
  - You want a fresh cohort with new people flags. Use \`study run\`
1749
- with \`--profile\` / \`--sample\` / \`--all\` instead — extend is a
1769
+ with \`--person\` / \`--sample\` / \`--all\` instead — extend is a
1750
1770
  per-participant resume, not a batch op.
1751
1771
  - You want to change the iteration's URL or content. Edit the iteration
1752
1772
  itself (\`iteration update\` or a fresh iteration) — extend always
@@ -1906,8 +1926,8 @@ time the CLI sees an entity.
1906
1926
  - \`s-\` study
1907
1927
  - \`i-\` iteration
1908
1928
  - \`pt-\` participant (instance of a person in an iteration)
1909
- - \`tp-\` person
1910
- - \`tps-\` person source
1929
+ - \`p-\` person
1930
+ - \`ps-\` person source
1911
1931
  - \`a-\` ask
1912
1932
  - \`r-\` ask round
1913
1933
  - \`c-\` config (simulation config)
@@ -2223,7 +2243,30 @@ The CLI guarantees these contracts so agents can chain safely:
2223
2243
  envelope carries \`progress: {study_id, iteration_id?,
2224
2244
  timeout_seconds, done, total, pending, rows[]}\` so the agent
2225
2245
  can resume by polling rather than re-dispatching. Same shape on
2226
- \`study wait\` (single-participant rows[] has length 1).
2246
+ \`study wait\` (single-participant rows[] has length 1). Each row
2247
+ in \`progress.rows[]\` carries \`age_seconds\` (server-computed
2248
+ liveness from \`started_at\`) plus \`error_kind\` when populated;
2249
+ when any non-terminal row's \`age_seconds\` exceeds ~900s the
2250
+ envelope's \`error\` message explicitly flags "the worker likely
2251
+ died" — don't keep polling, the backend reaper will mark it
2252
+ \`failed, error_kind=stale_worker\` within ~15 min.
2253
+ - **Participant \`error_kind\` enumeration.** Failed participants
2254
+ carry a classified \`error_kind\` so agents branch without parsing
2255
+ prose. Lifecycle/infra kinds: \`stale_worker\` (worker died mid-run,
2256
+ reaper transitioned the row), \`self_timeout\` (worker self-aborted
2257
+ past its 25-min runtime ceiling). Modality kinds:
2258
+ \`first_impression_llm_failed\`, \`interview_llm_failed\`,
2259
+ \`variant_preparation_failed\` (ask responses). CLI-side kinds:
2260
+ \`ConfirmationRequired\` (destructive op in \`--json\` mode without
2261
+ \`--yes\`), \`TunnelInactive\`, \`BotAuthError\`, \`BotShapeError\`,
2262
+ \`BotInvalidResponseError\`. The full set is open — branch on the
2263
+ ones you handle and treat the rest as "unknown failure, surface to
2264
+ user."
2265
+ - **Per-participant status payload (\`/simulation/status/{id}\`)** carries
2266
+ \`{job_id, status, create_time, completion_time?, error?, error_kind?,
2267
+ started_at?, last_heartbeat_at?, age_seconds?}\`. \`age_seconds\` is
2268
+ server-computed so clock skew between caller and backend doesn't
2269
+ matter; treat absent fields as "older backend, info unavailable."
2227
2270
  - **\`study run\` accepts \`--dispatch-timeout <s>\`** (default 120)
2228
2271
  for the per-POST participants/batch + simulation/start budget. On
2229
2272
  timeout (or any dispatch failure), the error envelope includes
@@ -2423,7 +2466,7 @@ not branch on \`status: 0\` — that value is never emitted as of 0.20.
2423
2466
  - Lists print as JSON arrays (or paginated wrappers). Single resources
2424
2467
  as JSON objects.
2425
2468
  - Field names match the underlying API resource (snake_case).
2426
- - Aliases (\`s-…\`, \`a-…\`, \`tp-…\`, …) appear alongside UUIDs in
2469
+ - Aliases (\`s-…\`, \`a-…\`, \`p-…\`, …) appear alongside UUIDs in
2427
2470
  \`--verbose\` mode and replace UUIDs in default lean mode.
2428
2471
 
2429
2472
  ## Examples
@@ -2473,11 +2516,14 @@ reshaping output.
2473
2516
  \`--turn\`, \`--side\`, \`--assignment\`, \`--step\`, \`--sentiment\`,
2474
2517
  \`--actor\`, \`--iteration\`, \`--participant\`) and projection flags
2475
2518
  (\`--group-by iteration|frame|segment|turn|assignment|step\`). When any
2476
- filter is passed, the envelope gains a \`totals_unfiltered\` field
2477
- (\`{participant_count, interaction_count}\`) so an agent can sanity-check
2478
- coverage: "matched 12 / 80 participants". A zero-match filter returns
2479
- the stable envelope with \`participant_count: 0\` and exit code **0**
2480
- (not 4) — slicing never errors on no-match.
2519
+ filter is passed on the default \`study results\` envelope, the envelope
2520
+ gains a \`totals_unfiltered\` field (\`{participant_count,
2521
+ interaction_count}\`) so an agent can sanity-check coverage: "matched
2522
+ 12 / 80 participants". A zero-match filter returns the stable envelope
2523
+ with \`participant_count: 0\` and exit code **0** (not 4) — slicing
2524
+ never errors on no-match. \`--group-by\` returns a different shape — a
2525
+ uniform envelope \`{axis, rows, totals_unfiltered, modality_warnings,
2526
+ study_id, modality}\` (see \`guides/slicing-results\`).
2481
2527
 
2482
2528
  \`--group-by\` is **router-gated by modality**: \`frame\` requires
2483
2529
  interactive, \`segment\` requires media (video / audio / text / document),
@@ -2509,7 +2555,7 @@ client-side; no extra round trip beyond the standard study fetch.
2509
2555
  | \`--step <ref>\` | Filters \`participant_assignments[].step_results[]\` to verdicts matching the step id or name. | interactive + external_chatbot chat (steps live there) |
2510
2556
  | \`--sentiment <labels>\` | Comma-separated, case-insensitive label list (repeatable). Drops null-sentiment rows. | all |
2511
2557
  | \`--actor <ai\|human\|user>\` | Restrict by actor. | all |
2512
- | \`--iteration <ref>\` | Iteration UUID or label (\`A\`, \`B\`, … case-insensitive). | all |
2558
+ | \`--iteration <ref>\` | Iteration UUID, iteration alias (\`i-…\`), or label (\`A\`, \`B\`, … case-insensitive). | all |
2513
2559
  | \`--participant <ref>\` | Participant UUID or \`pt-…\` alias. | all |
2514
2560
  | \`--include-unmatched\` | With \`--frame\`, keep degraded captures (\`frame_version_id: null\`) under a synthetic \`_unmatched\` bucket instead of dropping them. | interactive |
2515
2561
  | \`--include-evidence\` | With \`--step\`, also drop interactions not listed in any surviving \`step_results[].evidence_interaction_ids[]\`. | interactive + external_chatbot chat |
@@ -2520,33 +2566,52 @@ The exception is \`--group-by\` — see below.
2520
2566
 
2521
2567
  ## Projection flags (--group-by)
2522
2568
 
2523
- | Axis | Output shape | Modality |
2569
+ Every \`--group-by\` axis returns the same envelope:
2570
+ \`{axis, rows, totals_unfiltered, modality_warnings, study_id, modality}\`.
2571
+ Top-level \`axis\` echoes the requested axis; \`study_id\` is the \`s-…\`
2572
+ alias; \`modality\` echoes the study's modality. \`rows\` is an
2573
+ axis-specific array of slice objects (see the table below for the per-row
2574
+ shape). \`modality_warnings\` carries any filter-flag mismatches
2575
+ (e.g. \`--turn\` on a non-chat study); empty array when none.
2576
+
2577
+ | Axis | Row shape (one element of \`rows[]\`) | Modality |
2524
2578
  |-------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
2525
- | \`iteration\` | \`{study, slices: [{iteration_id, iteration_label, participant_count, interaction_count, sentiment, sample_comments, top_actions}, ...], totals_unfiltered, warnings}\` | all |
2526
- | \`frame\` | \`[{frame_id, frame_label, interaction_count, sentiment_histogram, sample_comments, participant_aliases}, ...]\` | interactive (router errors on non-interactive) |
2527
- | \`segment\` | \`[{segment_index, segment_label, interaction_count, sentiment_histogram, engagement_histogram, sample_comments}, ...]\` | media (router errors on non-media) |
2528
- | \`turn\` | \`[{turn_index, interaction_count, sentiment_histogram, sample_replies, failures}, ...]\` | chat (router errors on non-chat) |
2529
- | \`assignment\` | \`[{assignment_id, assignment_name, interaction_count, sentiment_histogram, step_completion}, ...]\` | all |
2530
- | \`step\` | \`[{assignment_id, assignment_name, step_id, step_name, total, passed, inconclusive, failed, rate, participant_verdicts: [{participant_alias, verdict, reason, evidence_interaction_ids}, ...]}, ...]\` | interactive + external_chatbot chat |
2579
+ | \`iteration\` | \`{iteration_id, iteration_label, participant_count, interaction_count, sentiment, sample_comments, top_actions}\` | all |
2580
+ | \`frame\` | \`{frame_id, frame_label, interaction_count, sentiment_histogram, sample_comments, participant_aliases}\` | interactive (router errors on non-interactive) |
2581
+ | \`segment\` | \`{segment_index, segment_label, interaction_count, sentiment_histogram, engagement_histogram, sample_comments}\` | media (router errors on non-media) |
2582
+ | \`turn\` | \`{turn_index, interaction_count, sentiment_histogram, sample_replies, failures}\` | chat (router errors on non-chat) |
2583
+ | \`assignment\` | \`{assignment_id, assignment_name, interaction_count, sentiment_histogram, step_completion}\` | all |
2584
+ | \`step\` | \`{assignment_id, assignment_name, step_id, step_name, total, passed, inconclusive, failed, rate, participant_verdicts: [{participant_alias, verdict, reason, evidence_interaction_ids}]}\` | interactive + external_chatbot chat |
2531
2585
 
2532
2586
  \`--group-by\` is **mutually exclusive with \`--summary\` and
2533
2587
  \`--transcript\`**. \`--group-by frame\` on a chat study, \`--group-by
2534
2588
  turn\` on a video study, etc. error at the surface (exit 2) with a
2535
- clear message before any IO.
2589
+ clear message before any IO. The error envelope includes a \`hint\`
2590
+ field naming the axis that DOES apply to the study's modality
2591
+ (\`use --group-by segment\` on audio/video/text/document, \`use --group-by
2592
+ turn\` on chat, \`use --group-by frame\` on interactive) — agents can
2593
+ branch on it to retry productively in one hop.
2536
2594
 
2537
2595
  ## The empty-slice contract
2538
2596
 
2539
2597
  A filter combination that matches zero interactions returns the
2540
- **stable envelope shape** with:
2598
+ **uniform envelope** with:
2541
2599
 
2542
- - \`participant_count: 0\`
2600
+ - \`rows: []\`
2543
2601
  - \`totals_unfiltered: {participant_count: <N>, interaction_count: <M>}\` populated
2602
+ - \`axis\`, \`study_id\`, \`modality\` still populated
2544
2603
  - exit code **0** (not 4)
2545
2604
 
2546
2605
  \`totals_unfiltered\` is the agent's sanity check: *"my filter matched
2547
2606
  0 of 80 participants — is the filter too tight, or did the run not
2548
2607
  produce data?"*. The shape never collapses to \`null\` or a different
2549
- envelope; \`--get participant_count\` is always safe.
2608
+ envelope; \`--get participant_count\` is always safe on the default
2609
+ (non-\`--group-by\`) envelope.
2610
+
2611
+ The default+filter envelope (no \`--group-by\`) also carries
2612
+ \`modality_warnings: string[]\` — any filter flags that were dropped as
2613
+ off-modality (e.g. \`--turn 1\` on an interactive study) appear here.
2614
+ Agents piping stderr to \`/dev/null\` get the same signal on stdout.
2550
2615
 
2551
2616
  ## Worked examples
2552
2617
 
@@ -2617,22 +2682,26 @@ No match at all errors and lists the available frame names.
2617
2682
 
2618
2683
  \`\`\`
2619
2684
  # Sanity-check coverage:
2685
+ --get axis
2686
+ --get study_id
2687
+ --get modality
2620
2688
  --get totals_unfiltered.participant_count
2621
2689
  --get totals_unfiltered.interaction_count
2690
+ --get modality_warnings
2622
2691
 
2623
- # Per-iteration projection:
2624
- --get slices.iteration_label # one label per line
2625
- --get slices.0.participant_count
2626
- --get slices.0.sentiment
2692
+ # Per-iteration projection rows:
2693
+ --get rows.iteration_label # one label per line
2694
+ --get rows.0.participant_count
2695
+ --get rows.0.sentiment
2627
2696
 
2628
- # Per-frame / per-segment / per-turn (bare array):
2629
- --get 0.frame_label
2630
- --get 0.segment_index
2631
- --get 0.sentiment_histogram
2697
+ # Per-frame / per-segment / per-turn (rows[] is the axis array):
2698
+ --get rows.0.frame_label
2699
+ --get rows.0.segment_index
2700
+ --get rows.0.sentiment_histogram
2632
2701
 
2633
2702
  # Per-step:
2634
- --get 0.rate
2635
- --get 0.participant_verdicts.verdict # one verdict per participant
2703
+ --get rows.0.rate
2704
+ --get rows.0.participant_verdicts.verdict
2636
2705
  \`\`\`
2637
2706
 
2638
2707
  ## Related
@@ -3013,6 +3082,8 @@ free credits before re-dispatch.
3013
3082
  estimate at preview time — the CLI prints the shape (\`N × … × 2\`)
3014
3083
  instead of a number.
3015
3084
 
3085
+ **Naming note:** "tier" in ish means **billing** tier (FREE / STARTER / PRO / ENTERPRISE — a credit-budget knob). It is NOT a simulation-quality dial. Per-run simulation behaviour (model, timing, retries) is controlled via \`ish config\` — see \`ish config --help\`. \`docs search tier\` returns billing results by design.
3086
+
3016
3087
  ## Related
3017
3088
 
3018
3089
  - \`reference/billing-limits\` — per-tier *entity* caps (max
@@ -3447,13 +3518,13 @@ Optional \`--max-turns <n>\` (default 12) caps the chat per participant.
3447
3518
 
3448
3519
  Audience size is set at run time for **external_chatbot** chat
3449
3520
  studies. Use \`--sample <N>\` to pick N random simulatable profiles,
3450
- or \`--all\` for the full pool. \`--profile <id>\` is also supported
3521
+ or \`--all\` for the full pool. \`--person <ids>\` is also supported
3451
3522
  for explicit selection:
3452
3523
  \`\`\`
3453
3524
  ish study run stu-xyz --sample 5 --wait
3454
3525
  \`\`\`
3455
3526
 
3456
- > **Pair-mode is different.** \`--sample\` / \`--profile\` / demographic
3527
+ > **Pair-mode is different.** \`--sample\` / \`--person\` / demographic
3457
3528
  > filters on \`study run\` are **refused** for participant_pair iterations
3458
3529
  > — pair groups live on the iteration itself. Set them at
3459
3530
  > iteration-create time via \`--group-a/-b\` (with 1×N broadcast)
@@ -3609,7 +3680,7 @@ Keys (all optional): \`occupation\`, \`min_age\`, \`max_age\`,
3609
3680
  \`requires_captions\`, \`uses_screen_reader\`, \`prefers_reduced_motion\`,
3610
3681
  \`prefers_high_contrast\`, \`has_any_accessibility_need\`. The five \`*_in\`
3611
3682
  arrays accept snake_case spec values; the five accessibility filters are
3612
- booleans. Combine \`--profile-*\` and \`--role-criteria-*\` on the same side
3683
+ booleans. Combine \`--group-a\` / \`--group-b\` and \`--role-criteria-*\` on the same side
3613
3684
  to make criteria validate an explicit list (mismatch blocks the run).
3614
3685
 
3615
3686
  MECE notes for the list filters:
@@ -3995,7 +4066,7 @@ cap at 40 entries.
3995
4066
  - \`concepts/person\` — what a person is; structured fields.
3996
4067
  - \`concepts/source\` — interview transcripts / audio / PDF inputs
3997
4068
  for the people-generation flow.
3998
- - \`reference/aliases\` — \`tp-…\` is the profile alias prefix.
4069
+ - \`reference/aliases\` — \`p-…\` is the person alias prefix.
3999
4070
  `;
4000
4071
  const GUIDE_MCP_ADD = `# guide: wire ish into your AI clients (\`ish mcp add\`)
4001
4072
 
@@ -35,10 +35,16 @@ export declare function outputList(rows: unknown[], json: boolean): void;
35
35
  /**
36
36
  * Error with valid options — used for content_type and similar validation.
37
37
  * Surfaces valid_options in JSON so agents can self-correct.
38
+ *
39
+ * Optional `hint` is the agent's *actionable next step* (e.g. for a wrong
40
+ * --group-by axis on the current modality, the axis that DOES apply). Distinct
41
+ * from `valid_options`, which describes where the supplied value WOULD be
42
+ * valid. Both serialize into the error envelope when present.
38
43
  */
39
44
  export declare class ValidationError extends Error {
40
45
  valid_options: string[];
41
- constructor(message: string, valid_options: string[]);
46
+ hint?: string | undefined;
47
+ constructor(message: string, valid_options: string[], hint?: string | undefined);
42
48
  }
43
49
  export declare function outputError(err: unknown, json: boolean): void;
44
50
  export declare function printTable(headers: string[], rows: string[][]): void;
@@ -110,13 +116,12 @@ export declare function formatAskResults(ask: Record<string, unknown>, json: boo
110
116
  export declare function formatConfigList(configs: Record<string, unknown>[], json: boolean): void;
111
117
  export type StudyResultsGroupByKind = "iteration" | "frame" | "segment" | "turn" | "assignment" | "step";
112
118
  /**
113
- * Render a `--group-by <kind>` projection. JSON mode is a thin pass-through
114
- * to jsonOutput with `preProjected: true` so the lean transform doesn't
115
- * strip our stable empties. Human mode renders one section per slice plus
116
- * a small ASCII sentiment histogram.
117
- *
118
- * The renderer accepts both the wrapped `{study, slices, ...}` shape (per-
119
- * iteration) and the bare-array shape (every other --group-by); the
120
- * surface (T5) doesn't need to know the difference.
119
+ * Render a `--group-by <kind>` projection wrapped in the uniform
120
+ * `SliceResponse` envelope (`{ axis, rows, totals_unfiltered,
121
+ * modality_warnings, study_id, modality }`). JSON mode is a thin
122
+ * pass-through to jsonOutput with `preProjected: true` so the lean
123
+ * transform doesn't strip our stable empties. Human mode pulls slices
124
+ * out of `rows` and renders one section per slice plus a small ASCII
125
+ * sentiment histogram.
121
126
  */
122
127
  export declare function formatStudyResultsGroupBy(projection: unknown, kind: StudyResultsGroupByKind, json: boolean): void;
@@ -278,6 +278,53 @@ function pickFields(data, fields) {
278
278
  }
279
279
  return data;
280
280
  }
281
+ /**
282
+ * Pattern A: when an agent passes `--fields foo,bar` and one of those names
283
+ * doesn't exist on the response, emit a one-line stderr warning naming the
284
+ * missing fields plus a sample of what IS available. Otherwise unknown names
285
+ * silently drop and the agent assumes the field doesn't exist on the wire,
286
+ * when the more common cause is a typo or the wrong projection.
287
+ *
288
+ * Probes the response shape: for an object response, the top-level keys;
289
+ * for a list-wrapper response, the keys of `items[0]`; for a bare array,
290
+ * the keys of element 0. Warns at most once per command invocation
291
+ * (the caller invokes this from jsonOutput before pickFields).
292
+ */
293
+ function warnOnUnknownFields(data, fields) {
294
+ let probe = null;
295
+ if (Array.isArray(data) && data.length > 0 && typeof data[0] === "object" && data[0] !== null) {
296
+ probe = data[0];
297
+ }
298
+ else if (data && typeof data === "object" && !Array.isArray(data)) {
299
+ const obj = data;
300
+ if (isListWrapper(obj) && Array.isArray(obj.items) && obj.items.length > 0
301
+ && typeof obj.items[0] === "object" && obj.items[0] !== null) {
302
+ probe = obj.items[0];
303
+ }
304
+ else {
305
+ probe = obj;
306
+ }
307
+ }
308
+ if (!probe)
309
+ return;
310
+ const missing = fields.filter((f) => !(f in probe));
311
+ if (missing.length === 0)
312
+ return;
313
+ // Pattern DD: surface↔backend rename hints. The agent-friendly noun is
314
+ // "workspace" but the backend stores `product_id`; agents who guess the
315
+ // surface name need a did-you-mean to find the actual response key.
316
+ const RENAME_MAP = {
317
+ workspace_id: "product_id",
318
+ workspace: "product",
319
+ };
320
+ const renameHints = missing
321
+ .filter((m) => RENAME_MAP[m] && RENAME_MAP[m] in probe)
322
+ .map((m) => `${m} → ${RENAME_MAP[m]}`);
323
+ const available = Object.keys(probe).slice(0, 12).join(", ");
324
+ const more = Object.keys(probe).length > 12 ? `, … (${Object.keys(probe).length - 12} more)` : "";
325
+ const didYouMean = renameHints.length > 0 ? ` Did you mean: ${renameHints.join(", ")}?` : "";
326
+ console.error(`warning: --fields requested ${missing.length === 1 ? "name" : "names"} not on the response: ${missing.join(", ")}.${didYouMean} Available: ${available}${more}.`);
327
+ }
281
328
  /** Serialize data as JSON, applying lean transform and field selection. */
282
329
  function jsonOutput(data, options = {}) {
283
330
  let out;
@@ -297,6 +344,7 @@ function jsonOutput(data, options = {}) {
297
344
  out = leanJson(data, options.writePath);
298
345
  }
299
346
  if (_fields && _fields.length > 0) {
347
+ warnOnUnknownFields(out, _fields);
300
348
  out = pickFields(out, _fields);
301
349
  }
302
350
  // Pattern Ω capture mode: --get <field> returns bare values instead of
@@ -396,12 +444,19 @@ export function outputList(rows, json) {
396
444
  /**
397
445
  * Error with valid options — used for content_type and similar validation.
398
446
  * Surfaces valid_options in JSON so agents can self-correct.
447
+ *
448
+ * Optional `hint` is the agent's *actionable next step* (e.g. for a wrong
449
+ * --group-by axis on the current modality, the axis that DOES apply). Distinct
450
+ * from `valid_options`, which describes where the supplied value WOULD be
451
+ * valid. Both serialize into the error envelope when present.
399
452
  */
400
453
  export class ValidationError extends Error {
401
454
  valid_options;
402
- constructor(message, valid_options) {
455
+ hint;
456
+ constructor(message, valid_options, hint) {
403
457
  super(message);
404
458
  this.valid_options = valid_options;
459
+ this.hint = hint;
405
460
  this.name = "ValidationError";
406
461
  }
407
462
  }
@@ -434,6 +489,11 @@ function suggestionsForError(err) {
434
489
  return [
435
490
  "Run a list command to see available resources",
436
491
  "Check that the alias or ID is correct",
492
+ // Pattern R: an active workspace / study / ask saved in config can
493
+ // outlive the resource on the server. Implicit lookups then 404
494
+ // with no indication that the ID came from config. `ish status`
495
+ // flags orphans; `<entity> use --clear` resets the active value.
496
+ "If you didn't pass the resource explicitly, your saved active workspace/study/ask may be stale — run `ish status` to check, then `ish workspace use --clear` (or `ish study use --clear` / `ish ask use --clear`) to reset.",
437
497
  ];
438
498
  case "insufficient_credits":
439
499
  return ["Purchase more credits at https://app.ishlabs.io"];
@@ -593,11 +653,14 @@ export function outputError(err, json) {
593
653
  error_code: "validation_error",
594
654
  retryable: false,
595
655
  valid_options: err.valid_options,
656
+ ...(err.hint && { hint: err.hint }),
596
657
  ...(suggestions.length > 0 && { suggestions }),
597
658
  }));
598
659
  }
599
660
  else {
600
661
  console.error(`Error: ${err.message}`);
662
+ if (err.hint)
663
+ console.error(` hint: ${err.hint}`);
601
664
  for (const s of suggestions)
602
665
  console.error(` → ${s}`);
603
666
  }
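With `hint` now serialized into the validation error envelope, an agent could retry a wrong `--group-by` axis in one hop. The sketch below is one possible consumer, written under assumptions: `retryAxisFromError` is a hypothetical name, and the regular expression assumes the hint text has the `use --group-by <axis>` form quoted in the docs hunk earlier in this diff.

```javascript
// Hypothetical agent-side helper; not part of @ishlabs/cli.
// Axis names are taken from the StudyResultsGroupByKind type in this diff.
const AXES = ["iteration", "frame", "segment", "turn", "assignment", "step"];

function retryAxisFromError(stderrJson) {
  let envelope;
  try {
    envelope = JSON.parse(stderrJson);
  } catch {
    return null; // stderr was not a JSON error envelope
  }
  if (!envelope || envelope.error_code !== "validation_error") return null;
  if (typeof envelope.hint !== "string") return null;
  // Assumption: the hint reads like "use --group-by segment".
  const match = /--group-by\s+([a-z]+)/.exec(envelope.hint);
  return match && AXES.includes(match[1]) ? match[1] : null;
}

const sampleError = JSON.stringify({
  error_code: "validation_error",
  retryable: false,
  valid_options: ["interactive"], // illustrative value
  hint: "use --group-by segment",
});
console.log(retryAxisFromError(sampleError)); // segment
```

Returning `null` for anything unrecognized keeps the agent from retrying on a guess when the hint is absent or phrased differently.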
@@ -635,6 +698,9 @@ export function outputError(err, json) {
635
698
  ? tagged.suggestions.filter((s) => typeof s === "string")
636
699
  : [];
637
700
  const mergedSuggestions = [...new Set([...suggestions, ...taggedSuggestions])];
701
+ const availableValues = Array.isArray(tagged.available_values)
702
+ ? tagged.available_values.filter((s) => typeof s === "string")
703
+ : undefined;
638
704
  if (json) {
639
705
  console.error(JSON.stringify({
640
706
  // Generic Error: CLI-thrown (we control the message), so we don't
@@ -647,6 +713,7 @@ export function outputError(err, json) {
647
713
  ...(errorKind && { error_kind: errorKind }),
648
714
  ...(example && { example }),
649
715
  ...(progress !== undefined && { progress }),
716
+ ...(availableValues && availableValues.length > 0 && { available_values: availableValues }),
650
717
  ...(seededIds && { seeded_but_not_dispatched_ids: seededIds }),
651
718
  ...(seededAliases && { seeded_but_not_dispatched_aliases: seededAliases }),
652
719
  ...(mergedSuggestions.length > 0 && { suggestions: mergedSuggestions }),
@@ -998,6 +1065,14 @@ export function buildStudyResultsEnvelope(study, participants) {
998
1065
  ? deterministicAlias(ALIAS_PREFIX.study, String(study.id))
999
1066
  : null;
1000
1067
  const completedCount = allParticipants.filter((t) => t.status === "completed" || t.status === "complete").length;
1068
+ // Pattern N: per-status breakdown so callers can distinguish running /
1069
+ // pending / cancelled from terminal completed/failed. Additive — the
1070
+ // aggregate counts (`completed_count` / `failed_count`) stay alongside.
1071
+ const participantStatusCounts = {};
1072
+ for (const t of allParticipants) {
1073
+ const key = (t.status || "unknown").toLowerCase();
1074
+ participantStatusCounts[key] = (participantStatusCounts[key] || 0) + 1;
1075
+ }
1001
1076
  // Aggregate sentiment across all interactions on all participants.
1002
1077
  const sentimentCounts = {};
1003
1078
  let sentimentTotal = 0;
@@ -1066,6 +1141,7 @@ export function buildStudyResultsEnvelope(study, participants) {
1066
1141
  participant_count: allParticipants.length,
1067
1142
  completed_count: completedCount,
1068
1143
  failed_count: failedCount,
1144
+ participant_status_counts: participantStatusCounts,
1069
1145
  sentiment,
1070
1146
  interview_answers: interviewAnswers,
1071
1147
  participants: participantRows,
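To illustrate what the new `participant_status_counts` field contains, here is a standalone copy of the counting loop from this hunk, run on made-up data. `countStatuses` and `sampleParticipants` are hypothetical names used only for this sketch; in the package the loop lives inline in `buildStudyResultsEnvelope`.

```javascript
// Standalone copy of the counting loop shown in the hunk, for illustration.
function countStatuses(participants) {
  const counts = {};
  for (const t of participants) {
    // Lowercasing merges "RUNNING" and "running" into one bucket.
    const key = (t.status || "unknown").toLowerCase();
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Made-up sample rows; real rows carry many more fields.
const sampleParticipants = [
  { status: "completed" },
  { status: "RUNNING" },
  { status: "failed" },
  { status: "running" },
  {}, // a missing status falls into the "unknown" bucket
];
console.log(countStatuses(sampleParticipants));
// { completed: 1, running: 2, failed: 1, unknown: 1 }
```

Because the keys are whatever statuses the backend reports, a consumer should probably treat the object as an open map rather than a fixed set of fields.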
@@ -2253,16 +2329,13 @@ function asciiHistogram(hist, options = {}) {
2253
2329
  });
2254
2330
  }
2255
2331
  function slicesFromProjection(projection) {
2256
- // Iteration projection wraps `{ study, slices, totals_unfiltered, warnings }`;
2257
- // all others are bare arrays. Both come through here.
2258
- if (Array.isArray(projection)) {
2259
- return projection.filter((s) => Boolean(s) && typeof s === "object" && !Array.isArray(s));
2260
- }
2261
- if (projection && typeof projection === "object") {
2262
- const wrapped = projection;
2263
- const slices = wrapped.slices;
2264
- if (Array.isArray(slices)) {
2265
- return slices.filter((s) => Boolean(s) && typeof s === "object" && !Array.isArray(s));
2332
+ // Surface wraps every --group-by axis in the uniform SliceResponse envelope
2333
+ // `{ axis, rows, totals_unfiltered, modality_warnings, study_id, modality }`;
2334
+ // slices live under `rows`.
+ if (projection && typeof projection === "object" && !Array.isArray(projection)) {
+ const rows = projection.rows;
+ if (Array.isArray(rows)) {
+ return rows.filter((s) => Boolean(s) && typeof s === "object" && !Array.isArray(s));
  }
  }
  return [];
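Reader's note on the envelope change: the rewritten `slicesFromProjection` reads slices only from `rows` and no longer accepts a bare array. A caller-side sketch of the same guard might look like the following. The function name `describeSliceResponse` and its return shape are hypothetical; the envelope keys (`axis`, `rows`, `totals_unfiltered`) and `totals_unfiltered.participant_count` are taken from the diff.

```javascript
// Hypothetical consumer (not part of @ishlabs/cli) of the uniform
// SliceResponse envelope: { axis, rows, totals_unfiltered, ... }.
function describeSliceResponse(envelope) {
  const isObject = (v) => Boolean(v) && typeof v === "object" && !Array.isArray(v);
  if (!isObject(envelope)) {
    // Bare arrays (the pre-0.23 shape) are no longer treated as slices.
    return { axis: null, rowCount: 0, unfilteredParticipants: null };
  }
  // Same filter as slicesFromProjection: keep only plain-object rows.
  const rows = Array.isArray(envelope.rows) ? envelope.rows.filter(isObject) : [];
  const totals = isObject(envelope.totals_unfiltered) ? envelope.totals_unfiltered : {};
  return {
    axis: envelope.axis ?? null,
    rowCount: rows.length,
    unfilteredParticipants: totals.participant_count ?? null,
  };
}

const summary = describeSliceResponse({
  axis: "step",
  rows: [{ participant_verdicts: [] }, null],
  totals_unfiltered: { participant_count: 7 },
});
console.log(summary.rowCount); // 1
```

Returning an empty summary instead of throwing mirrors the `return [];` fallback in the hunk above.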
@@ -2393,14 +2466,13 @@ function renderStepSlice(slice) {
  }
  }
  /**
- * Render a `--group-by <kind>` projection. JSON mode is a thin pass-through
- * to jsonOutput with `preProjected: true` so the lean transform doesn't
- * strip our stable empties. Human mode renders one section per slice plus
- * a small ASCII sentiment histogram.
- *
- * The renderer accepts both the wrapped `{study, slices, ...}` shape (per-
- * iteration) and the bare-array shape (every other --group-by); the
- * surface (T5) doesn't need to know the difference.
+ * Render a `--group-by <kind>` projection wrapped in the uniform
+ * `SliceResponse` envelope (`{ axis, rows, totals_unfiltered,
+ * modality_warnings, study_id, modality }`). JSON mode is a thin
+ * pass-through to jsonOutput with `preProjected: true` so the lean
+ * transform doesn't strip our stable empties. Human mode pulls slices
+ * out of `rows` and renders one section per slice plus a small ASCII
+ * sentiment histogram.
  */
  export function formatStudyResultsGroupBy(projection, kind, json) {
  if (json) {
@@ -218,6 +218,7 @@ When in doubt: side-by-side comparison usually beats in-place edits. Ids are che
  - **Chatbot endpoint response-shape mismatch**: \`chat_endpoint_test\` succeeds shallowly if the bot responds at all, but a wrong response path (e.g. bot returns \`{ data: { reply } }\` instead of \`{ reply }\`) produces empty transcripts on the actual run. Inspect one full test response before dispatching participants.
  - **Chatbot auth drift**: tokens/sessions baked into \`--from-curl\` expire. If transcripts come back as identical short error strings, re-run \`chat_endpoint_test\` and refresh the curl spec.
  - **401 surfaces as fake blocker**: an unauthenticated endpoint produces "participant got stuck on auth screen" — looks like a UX blocker but is config. Always confirm endpoint auth before reading transcripts as user-research data.
+ - **Don't poll a stuck run forever**: a participant whose worker died will sit in \`status: running\` until the backend reaper transitions it to \`failed, error_kind: stale_worker\` (~15 min). The per-participant status payload exposes \`age_seconds\` (server-computed from \`started_at\`); once it's above ~900s on a non-terminal row, the run is almost certainly stuck. The CLI's \`wait_timeout\` envelope explicitly flags this case in its \`error\` message — when you see "the worker likely died," stop polling and surface the failure rather than retrying. \`error_kind: self_timeout\` is the same idea but written by the worker itself when it self-aborts past its 25-min ceiling.
  - **No per-page/per-timestamp scoping for media**: there's no "evaluate just slide 14" or "react to seconds 0-30" API. State the focus explicitly in the \`assignment\` text, or pre-stitch the artifact (e.g. replace one slide locally, upload as a new iteration).
  - **\`study get --json\` participants live at the top level**, not nested under \`iterations[*].participants\`. The backend split made \`/studies/{id}\` lite (metadata + iteration shells, no participant graph) and added \`/studies/{id}/participants\`; the CLI joins them so \`study get --json\` carries a flat \`participants[]\` with \`iteration_id\` on each row. Read \`.participants[]\`, not \`.iterations[].participants[]\`.
  - **All destructive deletes require \`--yes\` in non-TTY mode**: \`ish workspace delete\`, \`study delete\`, \`ask delete\`, \`person delete\`, \`source delete\`, \`chat endpoint delete\`. In \`--json\` mode (or any piped/non-TTY invocation), omitting \`--yes\` refuses with \`error_kind: "ConfirmationRequired"\` + an \`example\` field showing the same command with \`--yes\` appended. \`workspace delete\` is the highest-blast-radius: it removes ALL nested studies, asks, people, secrets, configs, sources, and chat endpoints — the prompt names them explicitly.
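Reader's note on the stuck-run guidance added in this hunk: the "stop polling once `age_seconds` passes ~900s on a non-terminal row" rule can be sketched as below. The helper names, the 900-second constant as a hard cutoff, and the exact set of terminal statuses are assumptions for illustration (the docs name `completed` / `failed`; `complete` and `cancelled` are treated as terminal here). Nothing in this sketch is CLI API.

```javascript
// Hypothetical helpers (not part of @ishlabs/cli).
// Assumption: these statuses are terminal for polling purposes.
const TERMINAL_STATUSES = new Set(["completed", "complete", "failed", "cancelled"]);
const STALE_AFTER_SECONDS = 900; // ~15 min, the reaper window named in the docs

function isTerminal(participant) {
  return TERMINAL_STATUSES.has((participant.status || "unknown").toLowerCase());
}

function isLikelyStuck(participant) {
  if (isTerminal(participant)) return false;
  // `age_seconds` is optional and may be null; a missing value is not
  // treated as evidence of a dead worker.
  return typeof participant.age_seconds === "number"
    && participant.age_seconds > STALE_AFTER_SECONDS;
}

function shouldStopPolling(participants) {
  const open = participants.filter((p) => !isTerminal(p));
  // Stop when nothing is open, or when every open row looks stuck.
  return open.length === 0 || open.every(isLikelyStuck);
}

console.log(shouldStopPolling([
  { status: "completed", age_seconds: 120 },
  { status: "running", age_seconds: 1200 },
])); // true
```

A caller that stops for the second reason would surface the failure rather than retry, as the bullet recommends.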
@@ -954,6 +955,12 @@ ish study results s-b2c --frame doesnotexist --json
  # degraded captures (frame_version_id: null) back.
  \`\`\`
 
+ Every \`--group-by <axis>\` call returns the same envelope:
+ \`{axis, rows, totals_unfiltered, modality_warnings, study_id, modality}\`.
+ The \`rows\` array holds axis-specific slice objects. The envelope is
+ uniform across all six axes — agents can code one shape and key on
+ \`axis\` / \`modality\` to dispatch on what's inside \`rows\`.
+
  Rules to remember:
  - **Filters compose with AND across flags; OR within \`--sentiment\`.**
  \`--frame login --sentiment Frustrated,Confused\` keeps only login-frame
@@ -974,7 +981,8 @@ Rules to remember:
  the filtered set. \`--transcript\` is single-participant and errors
  (exit 2) when **any** filter or \`--group-by\` is set.
  - Per-step output exposes \`participant_verdicts: [{participant_alias,
- verdict, reason, evidence_interaction_ids}]\` not
+ verdict, reason, evidence_interaction_ids}]\` on **each row of
+ \`rows[]\`** (one per \`(assignment, step)\` pair) — not
  \`per_participant_verdicts\`. The verdict enum is \`passed\` /
  \`inconclusive\` / \`failed\`.
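Reader's note on the per-step shape: since `participant_verdicts` now sits on each row of `rows[]`, a tally across steps could be sketched as follows. The function name `tallyVerdicts` and the sample aliases are hypothetical; the field names and the `passed` / `inconclusive` / `failed` enum come from the documentation text in this hunk.

```javascript
// Hypothetical helper (not part of @ishlabs/cli): tallies verdicts across
// the rows of a `--group-by step` envelope.
function tallyVerdicts(envelope) {
  const tally = { passed: 0, inconclusive: 0, failed: 0 };
  const rows = envelope && Array.isArray(envelope.rows) ? envelope.rows : [];
  for (const row of rows) {
    const verdicts = row && Array.isArray(row.participant_verdicts)
      ? row.participant_verdicts
      : [];
    for (const entry of verdicts) {
      const verdict = entry ? entry.verdict : undefined;
      // Ignore anything outside the documented enum.
      if (Object.prototype.hasOwnProperty.call(tally, verdict)) {
        tally[verdict] += 1;
      }
    }
  }
  return tally;
}

console.log(tallyVerdicts({
  axis: "step",
  rows: [
    { participant_verdicts: [{ participant_alias: "pt-1", verdict: "passed" }] },
    { participant_verdicts: [{ participant_alias: "pt-1", verdict: "failed" }] },
  ],
})); // { passed: 1, inconclusive: 0, failed: 1 }
```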
 
@@ -1078,6 +1086,7 @@ table, projection shapes, and the defensive null-handling rules.
  | Per-step pass/fail with reasons inline | \`study participant --json\` per participant + jq | \`ish study results <id> --step verify-email --group-by step --json\` |
  | Frustrated reactions to one media segment | \`study results --json\` + jq | \`ish study results <id> --segment 3 --sentiment Frustrated --json\` |
  | Sanity-check filter coverage | hand-count \`.participants\` vs total | \`--get totals_unfiltered.participant_count\` (set on every sliced envelope) |
+ | Know the sliced-results envelope shape | guess per axis | \`{axis, rows[], totals_unfiltered, modality_warnings, study_id, modality}\` — every \`--group-by\` axis |
  | Chat transcript for one participant (external_chatbot) | \`study participant --json\` + jq | \`ish study results <id> --transcript <participant_id> --json\` |
  | Pair-mode conversation transcripts | \`study participant --json\` per participant | \`ish iteration get <iter-id> --json \\| jq '.conversations[]'\` |
  | Participant headline only (no action timeline) | \`study participant --json\` + jq | \`ish study participant <id> --summary --json\` |
@@ -38,6 +38,9 @@ export interface StudyParticipant extends Participant {
  conversation_id?: string | null;
  error_message?: string | null;
  error_kind?: string | null;
+ started_at?: string | null;
+ last_heartbeat_at?: string | null;
+ age_seconds?: number | null;
  [k: string]: unknown;
  }
  export declare function fetchStudyParticipants(client: ApiClient, studyId: string, opts?: {