npm - @ishlabs/cli - Versions diffs - 0.21.0 → 0.23.0 - Mend

@ishlabs/cli 0.21.0 → 0.23.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/dist/commands/chat.js +2 -2
package/dist/commands/config.js +17 -3
package/dist/commands/source.js +1 -1
package/dist/commands/study-analyze.js +15 -2
package/dist/commands/study-participant.js +19 -0
package/dist/commands/study-run.d.ts +2 -0
package/dist/commands/study-run.js +71 -20
package/dist/commands/study.js +96 -34
package/dist/lib/command-helpers.js +4 -3
package/dist/lib/docs.js +114 -43
package/dist/lib/output.d.ts +14 -9
package/dist/lib/output.js +91 -19
package/dist/lib/skill-content.js +10 -1
package/dist/lib/study-participants.d.ts +3 -0
package/dist/lib/study-results-filters.js +35 -14
package/dist/lib/study-results-projections.d.ts +47 -17
package/dist/lib/study-results-projections.js +39 -36
package/dist/lib/types.d.ts +4 -0
package/package.json +1 -1

package/dist/lib/docs.js CHANGED

@@ -635,7 +635,7 @@ Tunables (both modes):
   the parties signal the conversation is over.
 Pair-mode rules:
-- Each side needs **either** \`--profile-*\` (explicit IDs) **or**
+- Each side needs **either** \`--group-a\` / \`--group-b\` (explicit IDs) **or**
   \`--role-criteria-*\` (filter the backend resolves). The two can also
   be combined — criteria then acts as validation on the explicit list.
 - When both sides use explicit \`--group-a\` / \`--group-b\`, they
@@ -657,7 +657,7 @@ Pair-mode rules:
   \`type\` field in \`--questionnaire\` / \`--questions\` manifests
   (\`single-choice\` ↔ \`single_choice\`).
 - Audiences are pinned to the iteration. \`ish study run\` refuses
-  run-time people overrides (\`--profile\` / \`--sample\` / \`--all\` /
+  run-time people overrides (\`--person\` / \`--sample\` / \`--all\` /
   filters) on a pair iteration — change the peoples via
   \`ish iteration update <id> --details-json '{...}'\` instead.
 - \`--max-turns\` / \`--early-termination\` on \`ish study run\` override
@@ -1174,7 +1174,7 @@ const CONCEPT_PROFILE = `# concept: person
 A **person** is a reusable persona — the simulated
 human whose behaviour drives a participant instance during a study or ask.
-- Alias prefix: \`tp-\`
+- Alias prefix: \`p-\`
 - Lives at the workspace level, reusable across studies and asks.
 - Distinct from a "participant" (\`pt-\`) — a participant is one *instance* of a
   profile inside one iteration.
@@ -1336,7 +1336,7 @@ A **source** is an input to \`ish person generate\`: a transcript,
 audio file, image, or PDF that an LLM reads to ground generated profiles
 in real customer evidence.
-- Alias prefix: \`tps-\`
+- Alias prefix: \`ps-\`
 - Source kinds: \`text_file | audio | image\` (auto-detected from extension; \`text-file\` is accepted as a hyphen variant).
 - Audio supports speaker diarization via \`--diarize\`.
@@ -1406,7 +1406,7 @@ flags. Two ways to select:
      \`platform\` until the next release with a server-side
      deprecation warning)
-The two modes are **mutually exclusive** — pass either \`--profile\` or
+The two modes are **mutually exclusive** — pass either \`--person\` or
 the filter set, not both.
 ## Empty-pool suggestions
@@ -1658,7 +1658,7 @@ and what they target differ.
 | Default        | latest iteration of the active study             | append a round to the active ask              |
 | Fresh setup    | \`ish iteration create …\` first, then run         | \`--new\` (creates ask + round 1 in one shot) |
 | Specific target| \`--iteration <id>\`                               | positional ask id (\`a-6ec\`)                 |
-| Audience       | \`--profile\` OR filters with \`--sample\`/\`--all\` — else reuse iteration's participants | only at \`--new\`; fixed for the ask afterwards |
+| Audience       | \`--person\` OR filters with \`--sample\`/\`--all\` — else reuse iteration's participants | only at \`--new\`; fixed for the ask afterwards |
 | Output unit    | per-participant interactions + questionnaire answers  | per-participant reactions per round                |
 ## Decision rule
@@ -1711,6 +1711,23 @@ removed); \`extend\` then spawns a fresh participant branched from the
 cancelled participant's last interaction. See
 \`concepts/extending-a-simulation\` for the full mental model.
+## Stuck runs are auto-failed (no manual intervention)
+If a worker dies mid-run (instance preemption, OOM, infra restart), the
+backend reaper transitions the participant to
+\`status: failed, error_kind: stale_worker\` within ~15 min — you don't
+need to \`cancel\` it. The status payload returned by
+\`/simulation/status/{participant_id}\` (and surfaced on \`study wait\`,
+\`study run --wait\`, \`study poll\`) includes \`age_seconds\` so agents
+can tell "just slow" from "the worker is gone." Once \`age_seconds\`
+exceeds ~900s for a non-terminal participant the wait-timeout envelope
+explicitly flags it as likely stuck — stop polling and let the reaper
+finish the row.
+\`error_kind: self_timeout\` is the same idea written by the worker
+itself when it self-detects passing its 25-min ceiling; \`stale_worker\`
+is the reaper's verdict when the row simply stopped reporting.
 ## Related
 - \`reference/json-mode\` — output modes (display vs capture vs chain).
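Reviewer sketch for the stuck-run hunk above: one way an agent might turn `age_seconds` into a polling decision. The field names `status`, `age_seconds`, and `error_kind` come from the added docs text. The `classifyRow` helper, the set of terminal statuses, and the exact 900-second constant are assumptions for illustration and are not part of `@ishlabs/cli`.

```typescript
// Hypothetical helper, not shipped by the CLI.
type StatusRow = {
  status: string;
  age_seconds?: number; // absent on older backends, per the docs text
  error_kind?: string;
};

// Assumption: the diff does not enumerate terminal statuses; this set is a guess.
const TERMINAL = new Set(["completed", "complete", "failed", "cancelled"]);

// The docs say "~900s"; the exact cut-off is approximate.
const STUCK_AFTER_SECONDS = 900;

type Verdict = "done" | "likely_stuck" | "keep_polling" | "unknown_liveness";

function classifyRow(row: StatusRow): Verdict {
  if (TERMINAL.has(row.status.toLowerCase())) return "done";
  // Missing age_seconds means "older backend, info unavailable".
  if (row.age_seconds === undefined) return "unknown_liveness";
  return row.age_seconds > STUCK_AFTER_SECONDS ? "likely_stuck" : "keep_polling";
}
```

On `likely_stuck` the docs recommend stopping the poll loop and letting the reaper mark the row `failed` rather than issuing a manual `cancel`.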
@@ -1744,9 +1761,12 @@ mid-run?" scenario without restarting from scratch.
 When extend is **not** the right verb:
 - Source participant is still RUNNING. \`cancel\` it first, then extend.
-  Extend refuses non-terminal sources server-side.
+  Extend refuses non-terminal sources server-side. **Exception:** a
+  stale-heartbeat RUNNING row (worker died mid-run) is reaped to
+  \`failed, error_kind: stale_worker\` automatically within ~15 min — no
+  manual \`cancel\` needed; just wait for the reaper, then extend.
 - You want a fresh cohort with new people flags. Use \`study run\`
-  with \`--profile\` / \`--sample\` / \`--all\` instead — extend is a
+  with \`--person\` / \`--sample\` / \`--all\` instead — extend is a
   per-participant resume, not a batch op.
 - You want to change the iteration's URL or content. Edit the iteration
   itself (\`iteration update\` or a fresh iteration) — extend always
@@ -1906,8 +1926,8 @@ time the CLI sees an entity.
 - \`s-\`    study
 - \`i-\`    iteration
 - \`pt-\`   participant (instance of a person in an iteration)
-- \`tp-\`   person
-- \`tps-\`  person source
+- \`p-\`    person
+- \`ps-\`   person source
 - \`a-\`    ask
 - \`r-\`    ask round
 - \`c-\`    config (simulation config)
@@ -2223,7 +2243,30 @@ The CLI guarantees these contracts so agents can chain safely:
   envelope carries \`progress: {study_id, iteration_id?,
   timeout_seconds, done, total, pending, rows[]}\` so the agent
   can resume by polling rather than re-dispatching. Same shape on
-  \`study wait\` (single-participant rows[] has length 1).
+  \`study wait\` (single-participant rows[] has length 1). Each row
+  in \`progress.rows[]\` carries \`age_seconds\` (server-computed
+  liveness from \`started_at\`) plus \`error_kind\` when populated;
+  when any non-terminal row's \`age_seconds\` exceeds ~900s the
+  envelope's \`error\` message explicitly flags "the worker likely
+  died" — don't keep polling, the backend reaper will mark it
+  \`failed, error_kind=stale_worker\` within ~15 min.
+- **Participant \`error_kind\` enumeration.** Failed participants
+  carry a classified \`error_kind\` so agents branch without parsing
+  prose. Lifecycle/infra kinds: \`stale_worker\` (worker died mid-run,
+  reaper transitioned the row), \`self_timeout\` (worker self-aborted
+  past its 25-min runtime ceiling). Modality kinds:
+  \`first_impression_llm_failed\`, \`interview_llm_failed\`,
+  \`variant_preparation_failed\` (ask responses). CLI-side kinds:
+  \`ConfirmationRequired\` (destructive op in \`--json\` mode without
+  \`--yes\`), \`TunnelInactive\`, \`BotAuthError\`, \`BotShapeError\`,
+  \`BotInvalidResponseError\`. The full set is open — branch on the
+  ones you handle and treat the rest as "unknown failure, surface to
+  user."
+- **Per-participant status payload (\`/simulation/status/{id}\`)** carries
+  \`{job_id, status, create_time, completion_time?, error?, error_kind?,
+  started_at?, last_heartbeat_at?, age_seconds?}\`. \`age_seconds\` is
+  server-computed so clock skew between caller and backend doesn't
+  matter; treat absent fields as "older backend, info unavailable."
 - **\`study run\` accepts \`--dispatch-timeout <s>\`** (default 120)
   for the per-POST participants/batch + simulation/start budget. On
   timeout (or any dispatch failure), the error envelope includes
@@ -2423,7 +2466,7 @@ not branch on \`status: 0\` — that value is never emitted as of 0.20.
 - Lists print as JSON arrays (or paginated wrappers). Single resources
   as JSON objects.
 - Field names match the underlying API resource (snake_case).
-- Aliases (\`s-…\`, \`a-…\`, \`tp-…\`, …) appear alongside UUIDs in
+- Aliases (\`s-…\`, \`a-…\`, \`p-…\`, …) appear alongside UUIDs in
   \`--verbose\` mode and replace UUIDs in default lean mode.
 ## Examples
@@ -2473,11 +2516,14 @@ reshaping output.
 \`--turn\`, \`--side\`, \`--assignment\`, \`--step\`, \`--sentiment\`,
 \`--actor\`, \`--iteration\`, \`--participant\`) and projection flags
 (\`--group-by iteration|frame|segment|turn|assignment|step\`). When any
-filter is passed, the envelope gains a \`totals_unfiltered\` field
-(\`{participant_count, interaction_count}\`) so an agent can sanity-check
-coverage: "matched 12 / 80 participants". A zero-match filter returns
-the stable envelope with \`participant_count: 0\` and exit code **0**
-(not 4) — slicing never errors on no-match.
+filter is passed on the default \`study results\` envelope, the envelope
+gains a \`totals_unfiltered\` field (\`{participant_count,
+interaction_count}\`) so an agent can sanity-check coverage: "matched
+12 / 80 participants". A zero-match filter returns the stable envelope
+with \`participant_count: 0\` and exit code **0** (not 4) — slicing
+never errors on no-match. \`--group-by\` returns a different shape — a
+uniform envelope \`{axis, rows, totals_unfiltered, modality_warnings,
+study_id, modality}\` (see \`guides/slicing-results\`).
 \`--group-by\` is **router-gated by modality**: \`frame\` requires
 interactive, \`segment\` requires media (video / audio / text / document),
@@ -2509,7 +2555,7 @@ client-side; no extra round trip beyond the standard study fetch.
 | \`--step <ref>\`              | Filters \`participant_assignments[].step_results[]\` to verdicts matching the step id or name. | interactive + external_chatbot chat (steps live there)           |
 | \`--sentiment <labels>\`      | Comma-separated, case-insensitive label list (repeatable). Drops null-sentiment rows.         | all                                                              |
 | \`--actor <ai\|human\|user>\` | Restrict by actor.                                                                            | all                                                              |
-| \`--iteration <ref>\`         | Iteration UUID or label (\`A\`, \`B\`, … case-insensitive).                                    | all                                                              |
+| \`--iteration <ref>\`         | Iteration UUID, iteration alias (\`i-…\`), or label (\`A\`, \`B\`, … case-insensitive).        | all                                                              |
 | \`--participant <ref>\`       | Participant UUID or \`pt-…\` alias.                                                            | all                                                              |
 | \`--include-unmatched\`       | With \`--frame\`, keep degraded captures (\`frame_version_id: null\`) under a synthetic \`_unmatched\` bucket instead of dropping them. | interactive                                                      |
 | \`--include-evidence\`        | With \`--step\`, also drop interactions not listed in any surviving \`step_results[].evidence_interaction_ids[]\`. | interactive + external_chatbot chat                              |
@@ -2520,33 +2566,52 @@ The exception is \`--group-by\` — see below.
 ## Projection flags (--group-by)
-| Axis        | Output shape                                                                                                                                                              | Modality |
+Every \`--group-by\` axis returns the same envelope:
+\`{axis, rows, totals_unfiltered, modality_warnings, study_id, modality}\`.
+Top-level \`axis\` echoes the requested axis; \`study_id\` is the \`s-…\`
+alias; \`modality\` echoes the study's modality. \`rows\` is an
+axis-specific array of slice objects (see the table below for the per-row
+shape). \`modality_warnings\` carries any filter-flag mismatches
+(e.g. \`--turn\` on a non-chat study); empty array when none.
+| Axis        | Row shape (one element of \`rows[]\`)                                                                                                                                       | Modality |
 |-------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
-| \`iteration\` | \`{study, slices: [{iteration_id, iteration_label, participant_count, interaction_count, sentiment, sample_comments, top_actions}, ...], totals_unfiltered, warnings}\` | all      |
-| \`frame\`   | \`[{frame_id, frame_label, interaction_count, sentiment_histogram, sample_comments, participant_aliases}, ...]\`                                                            | interactive (router errors on non-interactive) |
-| \`segment\` | \`[{segment_index, segment_label, interaction_count, sentiment_histogram, engagement_histogram, sample_comments}, ...]\`                                                    | media (router errors on non-media)             |
-| \`turn\`    | \`[{turn_index, interaction_count, sentiment_histogram, sample_replies, failures}, ...]\`                                                                                   | chat (router errors on non-chat)               |
-| \`assignment\` | \`[{assignment_id, assignment_name, interaction_count, sentiment_histogram, step_completion}, ...]\`                                                                      | all      |
-| \`step\`    | \`[{assignment_id, assignment_name, step_id, step_name, total, passed, inconclusive, failed, rate, participant_verdicts: [{participant_alias, verdict, reason, evidence_interaction_ids}, ...]}, ...]\` | interactive + external_chatbot chat            |
+| \`iteration\` | \`{iteration_id, iteration_label, participant_count, interaction_count, sentiment, sample_comments, top_actions}\`                                                       | all      |
+| \`frame\`   | \`{frame_id, frame_label, interaction_count, sentiment_histogram, sample_comments, participant_aliases}\`                                                                   | interactive (router errors on non-interactive) |
+| \`segment\` | \`{segment_index, segment_label, interaction_count, sentiment_histogram, engagement_histogram, sample_comments}\`                                                           | media (router errors on non-media)             |
+| \`turn\`    | \`{turn_index, interaction_count, sentiment_histogram, sample_replies, failures}\`                                                                                          | chat (router errors on non-chat)               |
+| \`assignment\` | \`{assignment_id, assignment_name, interaction_count, sentiment_histogram, step_completion}\`                                                                             | all      |
+| \`step\`    | \`{assignment_id, assignment_name, step_id, step_name, total, passed, inconclusive, failed, rate, participant_verdicts: [{participant_alias, verdict, reason, evidence_interaction_ids}]}\` | interactive + external_chatbot chat            |
 \`--group-by\` is **mutually exclusive with \`--summary\` and
 \`--transcript\`**. \`--group-by frame\` on a chat study, \`--group-by
 turn\` on a video study, etc. error at the surface (exit 2) with a
-clear message before any IO.
+clear message before any IO. The error envelope includes a \`hint\`
+field naming the axis that DOES apply to the study's modality
+(\`use --group-by segment\` on audio/video/text/document, \`use --group-by
+turn\` on chat, \`use --group-by frame\` on interactive) — agents can
+branch on it to retry productively in one hop.
 ## The empty-slice contract
 A filter combination that matches zero interactions returns the
-**stable envelope shape** with:
+**uniform envelope** with:
-- \`participant_count: 0\`
+- \`rows: []\`
 - \`totals_unfiltered: {participant_count: <N>, interaction_count: <M>}\` populated
+- \`axis\`, \`study_id\`, \`modality\` still populated
 - exit code **0** (not 4)
 \`totals_unfiltered\` is the agent's sanity check: *"my filter matched
 0 of 80 participants — is the filter too tight, or did the run not
 produce data?"*. The shape never collapses to \`null\` or a different
-envelope; \`--get participant_count\` is always safe.
+envelope; \`--get participant_count\` is always safe on the default
+(non-\`--group-by\`) envelope.
+The default+filter envelope (no \`--group-by\`) also carries
+\`modality_warnings: string[]\` — any filter flags that were dropped as
+off-modality (e.g. \`--turn 1\` on an interactive study) appear here.
+Agents piping stderr to \`/dev/null\` get the same signal on stdout.
 ## Worked examples
@@ -2617,22 +2682,26 @@ No match at all errors and lists the available frame names.
 \`\`\`
 # Sanity-check coverage:
+--get axis
+--get study_id
+--get modality
 --get totals_unfiltered.participant_count
 --get totals_unfiltered.interaction_count
+--get modality_warnings
-# Per-iteration projection:
---get slices.iteration_label             # one label per line
---get slices.0.participant_count
---get slices.0.sentiment
+# Per-iteration projection rows:
+--get rows.iteration_label               # one label per line
+--get rows.0.participant_count
+--get rows.0.sentiment
-# Per-frame / per-segment / per-turn (bare array):
---get 0.frame_label
---get 0.segment_index
---get 0.sentiment_histogram
+# Per-frame / per-segment / per-turn (rows[] is the axis array):
+--get rows.0.frame_label
+--get rows.0.segment_index
+--get rows.0.sentiment_histogram
 # Per-step:
---get 0.rate
---get 0.participant_verdicts.verdict     # one verdict per participant
+--get rows.0.rate
+--get rows.0.participant_verdicts.verdict
 \`\`\`
 ## Related
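Reviewer sketch for the uniform `--group-by` envelope described in the hunks above: a minimal typed reader for captured JSON output. The six top-level keys are taken from the docs text. The `SliceResponse` interface as written here, the `describeCoverage` helper, and the sample payload in the usage note are illustrative assumptions, not the package's own types or real output.

```typescript
// Envelope keys as listed in the docs text; row contents vary by axis.
interface SliceResponse {
  axis: string;
  rows: Array<Record<string, unknown>>;
  totals_unfiltered: { participant_count: number; interaction_count: number };
  modality_warnings: string[];
  study_id: string;
  modality: string;
}

// Hypothetical helper: summarise coverage from a captured JSON string.
function describeCoverage(raw: string): string {
  const env = JSON.parse(raw) as SliceResponse;
  const total = env.totals_unfiltered.participant_count;
  if (env.rows.length === 0) {
    // Empty-slice contract: rows is [] but the totals stay populated.
    return `${env.axis}: no rows matched (study has ${total} participants)`;
  }
  return `${env.axis}: ${env.rows.length} rows (study has ${total} participants)`;
}
```

Because every axis now shares this shape, a consumer that previously indexed a bare array (`0.frame_label`) would need to go through `rows` (`rows.0.frame_label`), as the updated `--get` examples show.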
@@ -3013,6 +3082,8 @@ free credits before re-dispatch.
   estimate at preview time — the CLI prints the shape (\`N × … × 2\`)
   instead of a number.
+**Naming note:** "tier" in ish means **billing** tier (FREE / STARTER / PRO / ENTERPRISE — a credit-budget knob). It is NOT a simulation-quality dial. Per-run simulation behaviour (model, timing, retries) is controlled via \`ish config\` — see \`ish config --help\`. \`docs search tier\` returns billing results by design.
 ## Related
 - \`reference/billing-limits\` — per-tier *entity* caps (max
@@ -3447,13 +3518,13 @@ Optional \`--max-turns <n>\` (default 12) caps the chat per participant.
 Audience size is set at run time for **external_chatbot** chat
 studies. Use \`--sample <N>\` to pick N random simulatable profiles,
-or \`--all\` for the full pool. \`--profile <id>\` is also supported
+or \`--all\` for the full pool. \`--person <ids>\` is also supported
 for explicit selection:
 \`\`\`
 ish study run stu-xyz --sample 5 --wait
 \`\`\`
-> **Pair-mode is different.** \`--sample\` / \`--profile\` / demographic
+> **Pair-mode is different.** \`--sample\` / \`--person\` / demographic
 > filters on \`study run\` are **refused** for participant_pair iterations
 > — pair groups live on the iteration itself. Set them at
 > iteration-create time via \`--group-a/-b\` (with 1×N broadcast)
@@ -3609,7 +3680,7 @@ Keys (all optional): \`occupation\`, \`min_age\`, \`max_age\`,
 \`requires_captions\`, \`uses_screen_reader\`, \`prefers_reduced_motion\`,
 \`prefers_high_contrast\`, \`has_any_accessibility_need\`. The five \`*_in\`
 arrays accept snake_case spec values; the five accessibility filters are
-booleans. Combine \`--profile-*\` and \`--role-criteria-*\` on the same side
+booleans. Combine \`--group-a\` / \`--group-b\` and \`--role-criteria-*\` on the same side
 to make criteria validate an explicit list (mismatch blocks the run).
 MECE notes for the list filters:
@@ -3995,7 +4066,7 @@ cap at 40 entries.
 - \`concepts/person\` — what a person is; structured fields.
 - \`concepts/source\` — interview transcripts / audio / PDF inputs
   for the people-generation flow.
-- \`reference/aliases\` — \`tp-…\` is the profile alias prefix.
+- \`reference/aliases\` — \`p-…\` is the person alias prefix.
 `;
 const GUIDE_MCP_ADD = `# guide: wire ish into your AI clients (\`ish mcp add\`)
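Reviewer sketch for the `error_kind` enumeration added in this file's contracts hunk: a small lookup that groups the documented kinds into the three families the docs name (lifecycle/infra, modality, CLI-side) and falls back to "unknown" for anything else, since the docs describe the set as open. The kind strings are copied from the docs text. The `categorizeErrorKind` helper and the category labels are assumptions for illustration.

```typescript
type KindCategory = "infra" | "modality" | "cli" | "unknown";

// Kind names copied from the docs text in this diff.
const INFRA_KINDS = new Set(["stale_worker", "self_timeout"]);
const MODALITY_KINDS = new Set([
  "first_impression_llm_failed",
  "interview_llm_failed",
  "variant_preparation_failed",
]);
const CLI_KINDS = new Set([
  "ConfirmationRequired",
  "TunnelInactive",
  "BotAuthError",
  "BotShapeError",
  "BotInvalidResponseError",
]);

// Hypothetical helper: the enumeration is open, so unknown kinds are expected.
function categorizeErrorKind(kind: string | undefined): KindCategory {
  if (kind === undefined) return "unknown";
  if (INFRA_KINDS.has(kind)) return "infra";
  if (MODALITY_KINDS.has(kind)) return "modality";
  if (CLI_KINDS.has(kind)) return "cli";
  return "unknown";
}
```

Matching on exact strings keeps the agent from parsing error prose, which is the stated purpose of the field.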

package/dist/lib/output.d.ts CHANGED

@@ -35,10 +35,16 @@ export declare function outputList(rows: unknown[], json: boolean): void;
 /**
  * Error with valid options — used for content_type and similar validation.
  * Surfaces valid_options in JSON so agents can self-correct.
+ *
+ * Optional `hint` is the agent's *actionable next step* (e.g. for a wrong
+ * --group-by axis on the current modality, the axis that DOES apply). Distinct
+ * from `valid_options`, which describes where the supplied value WOULD be
+ * valid. Both serialize into the error envelope when present.
  */
 export declare class ValidationError extends Error {
     valid_options: string[];
-    constructor(message: string, valid_options: string[]);
+    hint?: string | undefined;
+    constructor(message: string, valid_options: string[], hint?: string | undefined);
 }
 export declare function outputError(err: unknown, json: boolean): void;
 export declare function printTable(headers: string[], rows: string[][]): void;
@@ -110,13 +116,12 @@ export declare function formatAskResults(ask: Record<string, unknown>, json: boo
 export declare function formatConfigList(configs: Record<string, unknown>[], json: boolean): void;
 export type StudyResultsGroupByKind = "iteration" | "frame" | "segment" | "turn" | "assignment" | "step";
 /**
- * Render a `--group-by <kind>` projection. JSON mode is a thin pass-through
- * to jsonOutput with `preProjected: true` so the lean transform doesn't
- * strip our stable empties. Human mode renders one section per slice plus
- * a small ASCII sentiment histogram.
- *
- * The renderer accepts both the wrapped `{study, slices, ...}` shape (per-
- * iteration) and the bare-array shape (every other --group-by); the
- * surface (T5) doesn't need to know the difference.
+ * Render a `--group-by <kind>` projection wrapped in the uniform
+ * `SliceResponse` envelope (`{ axis, rows, totals_unfiltered,
+ * modality_warnings, study_id, modality }`). JSON mode is a thin
+ * pass-through to jsonOutput with `preProjected: true` so the lean
+ * transform doesn't strip our stable empties. Human mode pulls slices
+ * out of `rows` and renders one section per slice plus a small ASCII
+ * sentiment histogram.
  */
 export declare function formatStudyResultsGroupBy(projection: unknown, kind: StudyResultsGroupByKind, json: boolean): void;

package/dist/lib/output.js CHANGED

@@ -278,6 +278,53 @@ function pickFields(data, fields) {
     }
     return data;
 }
+/**
+ * Pattern A: when an agent passes `--fields foo,bar` and one of those names
+ * doesn't exist on the response, emit a one-line stderr warning naming the
+ * missing fields plus a sample of what IS available. Otherwise unknown names
+ * silently drop and the agent assumes the field doesn't exist on the wire,
+ * when the more common cause is a typo or the wrong projection.
+ *
+ * Probes the response shape: for an object response, the top-level keys;
+ * for a list-wrapper response, the keys of `items[0]`; for a bare array,
+ * the keys of element 0. Warns at most once per command invocation
+ * (the caller invokes this from jsonOutput before pickFields).
+ */
+function warnOnUnknownFields(data, fields) {
+    let probe = null;
+    if (Array.isArray(data) && data.length > 0 && typeof data[0] === "object" && data[0] !== null) {
+        probe = data[0];
+    }
+    else if (data && typeof data === "object" && !Array.isArray(data)) {
+        const obj = data;
+        if (isListWrapper(obj) && Array.isArray(obj.items) && obj.items.length > 0
+            && typeof obj.items[0] === "object" && obj.items[0] !== null) {
+            probe = obj.items[0];
+        }
+        else {
+            probe = obj;
+        }
+    }
+    if (!probe)
+        return;
+    const missing = fields.filter((f) => !(f in probe));
+    if (missing.length === 0)
+        return;
+    // Pattern DD: surface↔backend rename hints. The agent-friendly noun is
+    // "workspace" but the backend stores `product_id`; agents who guess the
+    // surface name need a did-you-mean to find the actual response key.
+    const RENAME_MAP = {
+        workspace_id: "product_id",
+        workspace: "product",
+    };
+    const renameHints = missing
+        .filter((m) => RENAME_MAP[m] && RENAME_MAP[m] in probe)
+        .map((m) => `${m} → ${RENAME_MAP[m]}`);
+    const available = Object.keys(probe).slice(0, 12).join(", ");
+    const more = Object.keys(probe).length > 12 ? `, … (${Object.keys(probe).length - 12} more)` : "";
+    const didYouMean = renameHints.length > 0 ? ` Did you mean: ${renameHints.join(", ")}?` : "";
+    console.error(`warning: --fields requested ${missing.length === 1 ? "name" : "names"} not on the response: ${missing.join(", ")}.${didYouMean} Available: ${available}${more}.`);
+}
 /** Serialize data as JSON, applying lean transform and field selection. */
 function jsonOutput(data, options = {}) {
     let out;
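Reviewer sketch of the check that `warnOnUnknownFields` performs in the hunk above, reduced to a flat object probe and returning the hints instead of printing them. The `RENAME_MAP` entries are copied from the diff. The `missingFieldHints` name and the sample probe are illustrative assumptions, and the list-wrapper and array probing from the real function are left out.

```typescript
// Entries copied from the diff: surface name to backend key.
const RENAME_MAP: Record<string, string> = {
  workspace_id: "product_id",
  workspace: "product",
};

// Simplified sketch: handles only a plain object probe.
function missingFieldHints(
  probe: Record<string, unknown>,
  fields: string[],
): { missing: string[]; renameHints: string[] } {
  const missing = fields.filter((f) => !(f in probe));
  const renameHints: string[] = [];
  for (const m of missing) {
    const target = RENAME_MAP[m];
    // Only suggest a rename when the backend key is actually on the response.
    if (target !== undefined && target in probe) {
      renameHints.push(`${m} → ${target}`);
    }
  }
  return { missing, renameHints };
}

// Illustrative probe: "workspace_id" is the surface name, "nmae" is a typo.
const result = missingFieldHints(
  { id: "abc", product_id: "def" },
  ["id", "workspace_id", "nmae"],
);
// result.missing     -> ["workspace_id", "nmae"]
// result.renameHints -> ["workspace_id → product_id"]
```

Note that in the diff the warning goes to stderr, so stdout stays parseable JSON even when a requested field is missing.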
@@ -297,6 +344,7 @@ function jsonOutput(data, options = {}) {
         out = leanJson(data, options.writePath);
     }
     if (_fields && _fields.length > 0) {
+        warnOnUnknownFields(out, _fields);
         out = pickFields(out, _fields);
     }
     // Pattern Ω capture mode: --get <field> returns bare values instead of
@@ -396,12 +444,19 @@ export function outputList(rows, json) {
 /**
  * Error with valid options — used for content_type and similar validation.
  * Surfaces valid_options in JSON so agents can self-correct.
+ *
+ * Optional `hint` is the agent's *actionable next step* (e.g. for a wrong
+ * --group-by axis on the current modality, the axis that DOES apply). Distinct
+ * from `valid_options`, which describes where the supplied value WOULD be
+ * valid. Both serialize into the error envelope when present.
  */
 export class ValidationError extends Error {
     valid_options;
-    constructor(message, valid_options) {
+    hint;
+    constructor(message, valid_options, hint) {
         super(message);
         this.valid_options = valid_options;
+        this.hint = hint;
         this.name = "ValidationError";
     }
 }
@@ -434,6 +489,11 @@ function suggestionsForError(err) {
                 return [
                     "Run a list command to see available resources",
                     "Check that the alias or ID is correct",
+                    // Pattern R: an active workspace / study / ask saved in config can
+                    // outlive the resource on the server. Implicit lookups then 404
+                    // with no indication that the ID came from config. `ish status`
+                    // flags orphans; `<entity> use --clear` resets the active value.
+                    "If you didn't pass the resource explicitly, your saved active workspace/study/ask may be stale — run `ish status` to check, then `ish workspace use --clear` (or `ish study use --clear` / `ish ask use --clear`) to reset.",
                 ];
             case "insufficient_credits":
                 return ["Purchase more credits at https://app.ishlabs.io"];
@@ -593,11 +653,14 @@ export function outputError(err, json) {
                 error_code: "validation_error",
                 retryable: false,
                 valid_options: err.valid_options,
+                ...(err.hint && { hint: err.hint }),
                 ...(suggestions.length > 0 && { suggestions }),
             }));
         }
         else {
             console.error(`Error: ${err.message}`);
+            if (err.hint)
+                console.error(`  hint: ${err.hint}`);
             for (const s of suggestions)
                 console.error(`  → ${s}`);
         }
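Reviewer sketch of how the new optional `hint` travels from `ValidationError` into the JSON error envelope, based on the two hunks above. The class mirrors the declaration in the diff. The `validationFields` helper is hypothetical and shows only the keys visible in the hunk (`error_code`, `retryable`, `valid_options`, `hint`); the message key and `suggestions` are omitted because the hunk does not show how they are built. The message and option values in the usage note are made up.

```typescript
class ValidationError extends Error {
  valid_options: string[];
  hint?: string;
  constructor(message: string, valid_options: string[], hint?: string) {
    super(message);
    this.valid_options = valid_options;
    this.hint = hint;
    this.name = "ValidationError";
  }
}

// Hypothetical helper: builds only the envelope keys shown in the diff.
function validationFields(err: ValidationError): Record<string, unknown> {
  return {
    error_code: "validation_error",
    retryable: false,
    valid_options: err.valid_options,
    // Same intent as `...(err.hint && { hint: err.hint })` in the diff:
    // the key is present only when a non-empty hint was supplied.
    ...(err.hint ? { hint: err.hint } : {}),
  };
}
```

For example, `validationFields(new ValidationError("bad axis", ["interactive"], "use --group-by turn"))` carries a `hint` key, while the same call without the third argument does not, so existing callers of the two-argument constructor see an unchanged envelope.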
@@ -635,6 +698,9 @@ export function outputError(err, json) {
             ? tagged.suggestions.filter((s) => typeof s === "string")
             : [];
         const mergedSuggestions = [...new Set([...suggestions, ...taggedSuggestions])];
+        const availableValues = Array.isArray(tagged.available_values)
+            ? tagged.available_values.filter((s) => typeof s === "string")
+            : undefined;
         if (json) {
             console.error(JSON.stringify({
                 // Generic Error: CLI-thrown (we control the message), so we don't
@@ -647,6 +713,7 @@ export function outputError(err, json) {
                 ...(errorKind && { error_kind: errorKind }),
                 ...(example && { example }),
                 ...(progress !== undefined && { progress }),
+                ...(availableValues && availableValues.length > 0 && { available_values: availableValues }),
                 ...(seededIds && { seeded_but_not_dispatched_ids: seededIds }),
                 ...(seededAliases && { seeded_but_not_dispatched_aliases: seededAliases }),
                 ...(mergedSuggestions.length > 0 && { suggestions: mergedSuggestions }),
@@ -998,6 +1065,14 @@ export function buildStudyResultsEnvelope(study, participants) {
         ? deterministicAlias(ALIAS_PREFIX.study, String(study.id))
         : null;
     const completedCount = allParticipants.filter((t) => t.status === "completed" || t.status === "complete").length;
+    // Pattern N: per-status breakdown so callers can distinguish running /
+    // pending / cancelled from terminal completed/failed. Additive — the
+    // aggregate counts (`completed_count` / `failed_count`) stay alongside.
+    const participantStatusCounts = {};
+    for (const t of allParticipants) {
+        const key = (t.status || "unknown").toLowerCase();
+        participantStatusCounts[key] = (participantStatusCounts[key] || 0) + 1;
+    }
     // Aggregate sentiment across all interactions on all participants.
     const sentimentCounts = {};
     let sentimentTotal = 0;
@@ -1066,6 +1141,7 @@ export function buildStudyResultsEnvelope(study, participants) {
         participant_count: allParticipants.length,
         completed_count: completedCount,
         failed_count: failedCount,
+        participant_status_counts: participantStatusCounts,
         sentiment,
         interview_answers: interviewAnswers,
         participants: participantRows,
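Reviewer sketch of the `participant_status_counts` breakdown added in the two hunks above, lifted into a standalone function so its behaviour on mixed-case and missing statuses is easy to see. The counting logic follows the diff. The `countByStatus` name and the sample rows are illustrative assumptions.

```typescript
// Standalone version of the per-status counting loop from the diff.
function countByStatus(
  participants: Array<{ status?: string | null }>,
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const p of participants) {
    // Missing or empty statuses fall into an "unknown" bucket; keys are lowercased.
    const key = (p.status || "unknown").toLowerCase();
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const counts = countByStatus([
  { status: "Completed" },
  { status: "completed" },
  { status: "running" },
  {},
]);
// counts -> { completed: 2, running: 1, unknown: 1 }
```

One thing to be aware of when reading the breakdown: `completed_count` in the same function treats `"completed"` and `"complete"` as one bucket, whereas this map keys on the raw lowercased status, so those two values would appear as separate entries.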
@@ -2253,16 +2329,13 @@ function asciiHistogram(hist, options = {}) {
     });
 }
 function slicesFromProjection(projection) {
-    // Iteration projection wraps `{ study, slices, totals_unfiltered, warnings }`;
-    // all others are bare arrays. Both come through here.
-    if (Array.isArray(projection)) {
-        return projection.filter((s) => Boolean(s) && typeof s === "object" && !Array.isArray(s));
-    }
-    if (projection && typeof projection === "object") {
-        const wrapped = projection;
-        const slices = wrapped.slices;
-        if (Array.isArray(slices)) {
-            return slices.filter((s) => Boolean(s) && typeof s === "object" && !Array.isArray(s));
+    // Surface wraps every --group-by axis in the uniform SliceResponse envelope
+    // `{ axis, rows, totals_unfiltered, modality_warnings, study_id, modality }`;
+    // slices live under `rows`.
+    if (projection && typeof projection === "object" && !Array.isArray(projection)) {
+        const rows = projection.rows;
+        if (Array.isArray(rows)) {
+            return rows.filter((s) => Boolean(s) && typeof s === "object" && !Array.isArray(s));
         }
     }
     return [];
@@ -2393,14 +2466,13 @@ function renderStepSlice(slice) {
     }
 }
 /**
- * Render a `--group-by <kind>` projection. JSON mode is a thin pass-through
- * to jsonOutput with `preProjected: true` so the lean transform doesn't
- * strip our stable empties. Human mode renders one section per slice plus
- * a small ASCII sentiment histogram.
- *
- * The renderer accepts both the wrapped `{study, slices, ...}` shape (per-
- * iteration) and the bare-array shape (every other --group-by); the
- * surface (T5) doesn't need to know the difference.
+ * Render a `--group-by <kind>` projection wrapped in the uniform
+ * `SliceResponse` envelope (`{ axis, rows, totals_unfiltered,
+ * modality_warnings, study_id, modality }`). JSON mode is a thin
+ * pass-through to jsonOutput with `preProjected: true` so the lean
+ * transform doesn't strip our stable empties. Human mode pulls slices
+ * out of `rows` and renders one section per slice plus a small ASCII
+ * sentiment histogram.
  */
 export function formatStudyResultsGroupBy(projection, kind, json) {
     if (json) {
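
One consequence of the `slicesFromProjection` change above is that the pre-0.23.0 shapes (a bare array, or an object with `slices`) no longer yield any slices. The sketch below copies the new logic under the illustrative name `rowsOf` and runs it on made-up inputs; the `step` field inside the sample rows is a placeholder, not a documented row field.

```javascript
// Same logic as the 0.23.0 slicesFromProjection in the diff, under an
// illustrative name so it can be run standalone.
function rowsOf(projection) {
  if (projection && typeof projection === "object" && !Array.isArray(projection)) {
    const rows = projection.rows;
    if (Array.isArray(rows)) {
      // Keep only plain-object entries; drop null, primitives, nested arrays.
      return rows.filter((s) => Boolean(s) && typeof s === "object" && !Array.isArray(s));
    }
  }
  return [];
}

// Made-up inputs for illustration.
const wrapped = { axis: "step", rows: [{ step: "verify-email" }, null, "junk"] };
const legacyBareArray = [{ step: "verify-email" }];
const legacyWrapped = { slices: [{ step: "verify-email" }] };

console.log(rowsOf(wrapped).length); // 1
console.log(rowsOf(legacyBareArray).length); // 0
console.log(rowsOf(legacyWrapped).length); // 0
```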

package/dist/lib/skill-content.js CHANGED

@@ -218,6 +218,7 @@ When in doubt: side-by-side comparison usually beats in-place edits. Ids are che
 - **Chatbot endpoint response-shape mismatch**: \`chat_endpoint_test\` succeeds shallowly if the bot responds at all, but a wrong response path (e.g. bot returns \`{ data: { reply } }\` instead of \`{ reply }\`) produces empty transcripts on the actual run. Inspect one full test response before dispatching participants.
 - **Chatbot auth drift**: tokens/sessions baked into \`--from-curl\` expire. If transcripts come back as identical short error strings, re-run \`chat_endpoint_test\` and refresh the curl spec.
 - **401 surfaces as fake blocker**: an unauthenticated endpoint produces "participant got stuck on auth screen" — looks like a UX blocker but is config. Always confirm endpoint auth before reading transcripts as user-research data.
+- **Don't poll a stuck run forever**: a participant whose worker died will sit in \`status: running\` until the backend reaper transitions it to \`failed, error_kind: stale_worker\` (~15 min). The per-participant status payload exposes \`age_seconds\` (server-computed from \`started_at\`); once it's above ~900s on a non-terminal row, the run is almost certainly stuck. The CLI's \`wait_timeout\` envelope explicitly flags this case in its \`error\` message — when you see "the worker likely died," stop polling and surface the failure rather than retrying. \`error_kind: self_timeout\` is the same idea but written by the worker itself when it self-aborts past its 25-min ceiling.
 - **No per-page/per-timestamp scoping for media**: there's no "evaluate just slide 14" or "react to seconds 0-30" API. State the focus explicitly in the \`assignment\` text, or pre-stitch the artifact (e.g. replace one slide locally, upload as a new iteration).
 - **\`study get --json\` participants live at the top level**, not nested under \`iterations[*].participants\`. The backend split made \`/studies/{id}\` lite (metadata + iteration shells, no participant graph) and added \`/studies/{id}/participants\`; the CLI joins them so \`study get --json\` carries a flat \`participants[]\` with \`iteration_id\` on each row. Read \`.participants[]\`, not \`.iterations[].participants[]\`.
 - **All destructive deletes require \`--yes\` in non-TTY mode**: \`ish workspace delete\`, \`study delete\`, \`ask delete\`, \`person delete\`, \`source delete\`, \`chat endpoint delete\`. In \`--json\` mode (or any piped/non-TTY invocation), omitting \`--yes\` refuses with \`error_kind: "ConfirmationRequired"\` + an \`example\` field showing the same command with \`--yes\` appended. \`workspace delete\` is the highest-blast-radius: it removes ALL nested studies, asks, people, secrets, configs, sources, and chat endpoints — the prompt names them explicitly.
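The stuck-run guidance added in this hunk can be sketched as a small check over a per-participant status row. This is a minimal sketch, not the CLI's implementation: the helper name `looksStuck` and the set of statuses treated as terminal are assumptions, while the `status` and `age_seconds` fields and the ~900s threshold come from the note above.

```javascript
// ~15 minutes, matching the reaper window described in the note above.
const STALE_AFTER_SECONDS = 900;

// Assumption: the statuses treated as terminal here are a guess for
// illustration; the diff does not list the full status enum.
const ASSUMED_TERMINAL = new Set(["completed", "complete", "failed", "cancelled"]);

// Hypothetical helper: true when a non-terminal row is older than the threshold.
function looksStuck(participant, thresholdSeconds = STALE_AFTER_SECONDS) {
  const status = String(participant.status || "unknown").toLowerCase();
  if (ASSUMED_TERMINAL.has(status)) return false;
  const age = participant.age_seconds;
  // age_seconds is optional and nullable in the type declaration, so a
  // missing value is treated as "not enough information".
  return typeof age === "number" && age > thresholdSeconds;
}

console.log(looksStuck({ status: "running", age_seconds: 1200 })); // true
console.log(looksStuck({ status: "running", age_seconds: 120 })); // false
console.log(looksStuck({ status: "failed", error_kind: "stale_worker", age_seconds: 1200 })); // false
console.log(looksStuck({ status: "running", age_seconds: null })); // false
```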
@@ -954,6 +955,12 @@ ish study results s-b2c --frame doesnotexist --json
 #   degraded captures (frame_version_id: null) back.
 \`\`\`
+Every \`--group-by <axis>\` call returns the same envelope:
+\`{axis, rows, totals_unfiltered, modality_warnings, study_id, modality}\`.
+The \`rows\` array holds axis-specific slice objects. The envelope is
+uniform across all six axes — agents can code one shape and key on
+\`axis\` / \`modality\` to dispatch on what's inside \`rows\`.
 Rules to remember:
 - **Filters compose with AND across flags; OR within \`--sentiment\`.**
   \`--frame login --sentiment Frustrated,Confused\` keeps only login-frame
@@ -974,7 +981,8 @@ Rules to remember:
   the filtered set. \`--transcript\` is single-participant and errors
   (exit 2) when **any** filter or \`--group-by\` is set.
 - Per-step output exposes \`participant_verdicts: [{participant_alias,
-  verdict, reason, evidence_interaction_ids}]\` — not
+  verdict, reason, evidence_interaction_ids}]\` on **each row of
+  \`rows[]\`** (one per \`(assignment, step)\` pair) — not
   \`per_participant_verdicts\`. The verdict enum is \`passed\` /
   \`inconclusive\` / \`failed\`.
@@ -1078,6 +1086,7 @@ table, projection shapes, and the defensive null-handling rules.
 | Per-step pass/fail with reasons inline    | \`study participant --json\` per participant + jq | \`ish study results <id> --step verify-email --group-by step --json\` |
 | Frustrated reactions to one media segment | \`study results --json\` + jq | \`ish study results <id> --segment 3 --sentiment Frustrated --json\` |
 | Sanity-check filter coverage              | hand-count \`.participants\` vs total | \`--get totals_unfiltered.participant_count\` (set on every sliced envelope) |
+| Know the sliced-results envelope shape    | guess per axis                         | \`{axis, rows[], totals_unfiltered, modality_warnings, study_id, modality}\` — every \`--group-by\` axis |
 | Chat transcript for one participant (external_chatbot) | \`study participant --json\` + jq      | \`ish study results <id> --transcript <participant_id> --json\`           |
 | Pair-mode conversation transcripts        | \`study participant --json\` per participant       | \`ish iteration get <iter-id> --json \\| jq '.conversations[]'\`     |
 | Participant headline only (no action timeline) | \`study participant --json\` + jq            | \`ish study participant <id> --summary --json\`                           |
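
Because every `--group-by` axis now shares one envelope, a consumer can check `axis` once and then walk `rows`. The sketch below tallies `participant_verdicts` for a step-axis envelope. It assumes the `axis` value for `--group-by step` is the string `"step"`, which the diff does not state explicitly; the helper name and the sample data are likewise made up for illustration.

```javascript
// Hypothetical helper: counts verdicts across all rows of a step-axis envelope.
function tallyStepVerdicts(envelope) {
  const tally = { passed: 0, inconclusive: 0, failed: 0 };
  // Assumption: --group-by step reports axis === "step".
  if (!envelope || envelope.axis !== "step" || !Array.isArray(envelope.rows)) {
    return tally;
  }
  for (const row of envelope.rows) {
    const verdicts = (row && row.participant_verdicts) || [];
    for (const v of verdicts) {
      // Ignore anything outside the documented passed/inconclusive/failed enum.
      if (Object.prototype.hasOwnProperty.call(tally, v.verdict)) {
        tally[v.verdict] += 1;
      }
    }
  }
  return tally;
}

// Made-up envelope for illustration.
const sampleStepEnvelope = {
  axis: "step",
  rows: [
    {
      participant_verdicts: [
        { participant_alias: "p-1", verdict: "passed", reason: "", evidence_interaction_ids: [] },
        { participant_alias: "p-2", verdict: "failed", reason: "", evidence_interaction_ids: [] },
      ],
    },
    {
      participant_verdicts: [
        { participant_alias: "p-1", verdict: "inconclusive", reason: "", evidence_interaction_ids: [] },
      ],
    },
  ],
};

console.log(JSON.stringify(tallyStepVerdicts(sampleStepEnvelope)));
// {"passed":1,"inconclusive":1,"failed":1}
```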

package/dist/lib/study-participants.d.ts CHANGED

@@ -38,6 +38,9 @@ export interface StudyParticipant extends Participant {
     conversation_id?: string | null;
     error_message?: string | null;
     error_kind?: string | null;
+    started_at?: string | null;
+    last_heartbeat_at?: string | null;
+    age_seconds?: number | null;
     [k: string]: unknown;
 }
 export declare function fetchStudyParticipants(client: ApiClient, studyId: string, opts?: {