npm - @ishlabs/cli - Versions diffs - 0.8.3 → 0.8.5 - Mend

@ishlabs/cli 0.8.3 → 0.8.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/README.md +7 -1
package/dist/auth.d.ts +16 -0
package/dist/auth.js +52 -3
package/dist/commands/ask.js +86 -17
package/dist/commands/iteration.js +45 -11
package/dist/commands/profile.js +79 -13
package/dist/commands/study-run.js +49 -0
package/dist/commands/study-tester.js +5 -2
package/dist/commands/study.js +82 -19
package/dist/connect.js +94 -19
package/dist/index.js +122 -2
package/dist/lib/api-client.js +29 -7
package/dist/lib/command-helpers.d.ts +51 -0
package/dist/lib/command-helpers.js +206 -7
package/dist/lib/docs.js +621 -30
package/dist/lib/output.d.ts +6 -0
package/dist/lib/output.js +570 -65
package/dist/lib/skill-content.js +216 -9
package/dist/lib/types.d.ts +3 -1
package/dist/upgrade.js +3 -3
package/package.json +1 -1

package/dist/lib/docs.js CHANGED

@@ -43,9 +43,12 @@ Two top-level run verbs:
 ## Where to look next
 - New here? \`ish docs get-page concepts/workspace\`, then \`concepts/study\`.
+- **Cold start?** Run \`ish status\` (alias \`ish whoami\`) — confirms login
+  and prints active workspace/study/ask. See \`concepts/active-context\`.
 - Running your first study? \`ish docs get-page guides/first-study\`.
 - Comparing study vs ask? \`ish docs get-page concepts/run-verbs\`.
-- Need machine-readable output? \`ish docs get-page reference/json-mode\`.
+- **Output modes** (display vs capture vs chain — \`--human\`, \`--get\`,
+  \`--json\`)? \`ish docs get-page reference/json-mode\`.
 - Auth gated URL? \`ish docs get-page concepts/site-access\`.
 ## Install the skill into this project
@@ -72,6 +75,22 @@ A workspace carries:
 - Site-access credentials (encrypted at rest) — see \`concepts/site-access\`.
 - Tester profiles + sources visible to every study/ask in the workspace.
+## Selecting a workspace per command
+\`--workspace <id>\` works at the **program root** as well as on each
+subcommand — both forms are equivalent, and the subcommand-level flag
+wins on conflict:
+\`\`\`
+ish --workspace w-6ec study list           # program root
+ish study list --workspace w-6ec           # subcommand (same effect)
+ish --workspace w-6ec study list --workspace w-other   # w-other wins
+\`\`\`
+Use whichever is most natural for your scripting. Without either, the
+CLI falls back to \`ISH_WORKSPACE\` (env var) and then the
+\`workspace\` saved in \`~/.ish/config.json\`.
 ## Common commands
 \`\`\`
@@ -81,6 +100,10 @@ ish workspace use w-6ec        # set as active
 ish workspace get              # show the active workspace
 ish workspace site-access status
 \`\`\`
+## Related
+- \`reference/billing-limits\` — \`maxProducts\` cap on workspace creation.
 `;
 const CONCEPT_STUDY = `# concept: study
@@ -101,17 +124,78 @@ its iterations. Think: a study is the recipe; an iteration is one batch.
 ## Lifecycle
-1. \`ish study create --name "Onboarding UX" --modality interactive --assignment "Sign up:Complete the signup flow" --question "How easy was it?"\`
-2. \`ish iteration create --url https://example.com\` (creates the first iteration)
-3. \`ish study run --sample 5 --country SE\` (dispatches simulations)
+1. \`ish study create --name "Onboarding UX" --modality interactive --assignment "Sign up:Complete the signup flow" --question "How easy was it?"\` — creates the recipe with **zero iterations**.
+2. \`ish iteration create --url https://example.com\` — first iteration becomes label \`A\`.
+3. \`ish study run --sample 5 --country SE\` — dispatches simulations.
 4. \`ish study results\` or \`ish study wait\` to gather outputs.
+### One-shot variant
+\`study create\` now accepts \`--content-text\` (text modality) or
+\`--url\` (interactive modality) inline; iteration A is created in the
+same call. Useful when you have a single test artifact and don't need
+to A/B iterations:
+\`\`\`
+ish study create --modality text --content-type email \\
+  --name "Daily Brief concept" \\
+  --assignment "Read:Read the email and react" \\
+  --question "What stood out?" \\
+  --content-text @./brief.md
+# → study + iteration A in one call, ready for \`study run\`.
+\`\`\`
+Without those flags no iteration is created — agents can no longer
+trip the old "empty A" footgun where \`study run\` silently targeted a
+placeholder.
+## Status fields (read \`runtime_status\`, not \`status\`)
+Every study response carries two status-shaped fields:
+- \`status\` — the raw lifecycle column on the row, values
+  \`draft | running | completed | cancelled\`. Updated lazily; can
+  disagree with what the testers actually did.
+- \`runtime_status\` — derived by aggregating the iteration testers'
+  states. Values: \`draft | running | completed |
+  completed_with_errors | cancelled\`. **Never reports \`failed\` while
+  completed runs exist** (the Bk2 invariant). Prefer this for any
+  agent decision.
+The CLI also surfaces a \`status_inferred\` field + stderr warning when
+it detects raw-vs-derived inconsistencies. See \`reference/json-mode\`.
+## Deleting a study
+\`ish study delete <id>\` requires explicit confirmation:
+- **Interactive (TTY)**: prompts on stderr; type \`y\` to proceed.
+- **Non-interactive** (\`--json\`, piped, or non-TTY stdin): pass
+  \`-y\` / \`--yes\` to confirm. Without it, the CLI exits with usage
+  code 2 rather than deleting silently.
+\`\`\`
+ish study delete s-b2c              # interactive prompt
+ish study delete s-b2c --yes        # skip prompt
+ish study delete s-b2c --json --yes # JSON consumers must be explicit
+\`\`\`
+## Generate vs create
+\`ish study generate --problem "..."\` runs an LLM-backed flow that
+picks a sensible modality from your brief and returns a
+\`modality_rationale\` field (≤30 words) explaining the choice.
+Override before adding iterations via
+\`ish study update <id> --modality text\` if the rationale shows the
+pick was wrong.
 ## Related
 - \`concepts/iteration\` — the unit of execution within a study.
 - \`concepts/assignment\` — task definition syntax.
 - \`concepts/questionnaire\` — question types and timing.
 - \`concepts/run-verbs\` — when to use \`study run\` vs \`ask run\`.
+- \`reference/billing-limits\` — \`maxStudiesPerProduct\` cap on study creation.
 `;
 const CONCEPT_ITERATION = `# concept: iteration
@@ -157,11 +241,42 @@ ish iteration list --study s-b2c
 ish iteration get i-d4e
 \`\`\`
+## No more auto-empty iteration A
+\`ish study create\` and \`ish study generate\` **do not auto-create
+iteration A** anymore (Pattern E remediation, ish-cli v0.8.x). The
+first explicit \`ish iteration create\` becomes label A, second is B,
+etc. Running \`ish study run\` on a study with zero iterations exits
+2 with a clear error pointing you to \`ish iteration create\`.
+If you do somehow run against an interactive iteration without a URL
+(or a media iteration without content), \`study run\` exits 2 with:
+\`\`\`
+Iteration "A" (i-...) has no URL configured yet. Add a URL with
+\`ish iteration create --study s-... --url <url>\` (or update the
+existing iteration via \`ish iteration update i-... --details-json '{...}'\`),
+then retry.
+\`\`\`
+Treat this as actionable, not transient — re-running won't change anything.
+## Default segmentation for text/image iterations
+For text-modality iterations created with just \`--content-text\` (and
+similarly \`--image-urls\` for image), the worker now synthesises a
+single whole-content section if no \`segmentation\` was supplied. This
+means a minimal \`ish iteration create --study s-XYZ --content-text
+"..."\` actually runs end-to-end without you needing to author a
+SegmentationConfig manually. Author your own segmentation when you
+want section-level reactions; otherwise the default just works.
 ## Related
 - \`concepts/study\` — the parent artifact.
 - \`concepts/run-verbs\` — how \`ish study run\` selects the iteration.
 - \`concepts/audience\` — how testers are picked for a run.
+- \`reference/billing-limits\` — \`maxIterationsPerStudy\` cap on iteration creation.
 `;
 const CONCEPT_ASSIGNMENT = `# concept: assignment
@@ -213,7 +328,7 @@ replaces the full assignment list — additive editing is not supported.
 const CONCEPT_QUESTIONNAIRE = `# concept: questionnaire
 The **questionnaire** is the list of \`interview_questions\` a tester
-answers before, during, or after their assignments. A study has 0..N
+answers before or after their assignments. A study has 0..N
 questions, each with a type and a timing.
 ## Question shape
@@ -221,12 +336,12 @@ questions, each with a type and a timing.
 \`\`\`json
 {
   "question": "How easy was checkout?",
-  "type": "slider",          // text | slider | likert | choice_single |
-                             // choice_multiple | number | …
-  "timing": "after",         // before | during | after
+  "type": "slider",          // text | slider | likert |
+                             // single-choice | multiple-choice | number
+  "timing": "after",         // before | after
   "min": 1, "max": 7, "step": 1,
   "labels": ["Hard", "Easy"],
-  "options": ["A", "B", "C"] // only for choice_*
+  "options": ["A", "B", "C"] // only for single-choice / multiple-choice
 }
 \`\`\`
@@ -289,8 +404,58 @@ ish ask run --prompt "And now which?" \\
 ish ask list
 ish ask get a-6ec --round 2
 ish ask results a-6ec
+ish ask results a-6ec --json | jq '.rounds[0].aggregates'
+\`\`\`
+## Reading the verdict
+For \`--wants-pick\` / \`--wants-ratings\` rounds, \`ask results --json\`
+includes an \`aggregates\` field per round so you don't have to parse
+prose. Each individual pick also carries a **\`pick_confidence\`** score
+(0..1) — the model's self-reported confidence in its variant choice.
+Use it to break ties: when two variants are nominally close on count,
+the variant with higher mean \`pick_confidence\` is the more decisive
+choice. \`pick_confidence\` is only present on rounds run with
+\`--wants-pick\`.
+\`\`\`json
+{
+  "picks":   { "A": 3, "B": 0 },
+  "ratings": { "A": { "mean": 4.667, "n": 3 },
+               "B": { "mean": 2.000, "n": 3 } },
+  "winner":  { "letter": "A", "count": 3, "tied": false }
+}
 \`\`\`
+When the ask has 2+ rounds, \`ask results\` also includes a top-level
+\`cross_round_summary\` block with per-round picks/winner and a
+\`picks_delta\` (R1 → last round). Skip the manual diffing of two
+\`ask results\` calls.
+\`\`\`json
+"cross_round_summary": {
+  "rounds": [
+    { "round_number": 1, "picks": {"A": 1, "B": 2}, "winner": {"letter": "B", "count": 2, "tied": false } },
+    { "round_number": 2, "picks": {"A": 3, "B": 0}, "winner": {"letter": "A", "count": 3, "tied": false } }
+  ],
+  "picks_delta": { "A": +2, "B": -2 }
+}
+\`\`\`
+## Adding follow-up questions to a round
+\`ish ask add-questions --round N --questions ./qs.json\` is **additive
+by default**: prior phase-1 outputs (comment, pick, ratings) are
+preserved on every non-errored response, and the worker only answers
+the newly-added questions for each tester. Existing picks stay stable.
+Pass \`--redispatch-all\` for the legacy reset behavior — useful when a
+question is sufficiently different that you want fresh first
+impressions, not augmentation. Without that flag, agents iterating on
+copy can safely append questions without losing prior round results.
+See \`reference/json-mode\` for the full shape.
 ## Variant syntax
 \`--variant <type>:<value>[::label=<label>]\`
@@ -327,6 +492,15 @@ ish ask wait a-6ec --round 2 --timeout 600
 ish ask results a-6ec --round 1
 \`\`\`
+## \`add-questions\` is additive
+Appending questions to a completed round preserves prior data — variant
+comments, picks, ratings, and earlier-question answers all stay. Only
+the new question(s) get dispatched to the existing testers. Cost is
+roughly N phase-2 LLM calls instead of 2N (no phase-1 re-run). Errored
+responses are skipped entirely; completed responses flip to PENDING and
+re-finalize after the new question is answered.
 ## Related
 - \`concepts/ask\` — the parent artifact.
@@ -380,10 +554,29 @@ ish profile create --file profile.json
 Expected JSON: \`{ "name": "...", "type": "ai", "gender": "female",
 "country": "US", "occupation": "...", "bio": "..." }\`
+## Generation behavior to expect
+- **Latency**: \`profile generate\` is LLM-backed and typically takes
+  10–20s for 1–5 profiles. The CLI emits stderr progress lines
+  (\`generating N profiles…\` then \`generated N profiles\`) so you
+  know it's not stuck. Suppress with \`--quiet\`.
+- **Brief fidelity**: bios reference domain-specific terms from your
+  description verbatim or as close paraphrase. If you mention
+  \`F-skatt\`, "manual Excel invoicing", "Stripe payouts", or similar
+  tools/jargon, expect those terms (or paraphrases) to appear in
+  each generated bio's daily-routine framing — not sanded down to
+  generic prose.
+- **DOB diversity**: month-and-day are derived from a deterministic
+  per-profile hash so birthdays spread across the year (no more
+  every-profile-on-\`06-15\`). Year follows the requested age.
+  Re-generating the same name/country/occupation/age yields the
+  same DOB.
 ## Related
 - \`concepts/source\` — the inputs to \`profile generate\`.
 - \`concepts/audience\` — how profiles get selected into a run.
+- \`reference/billing-limits\` — \`maxCustomTesterProfiles\` cap on profile creation.
 `;
 const CONCEPT_SOURCE = `# concept: source
@@ -439,6 +632,24 @@ flags. Two ways to select:
 The two modes are **mutually exclusive** — pass either \`--profile\` or
 the filter set, not both.
+## Empty-pool suggestions
+When a filter combination matches zero profiles, the error message
+includes the top three populated countries that satisfy your *other*
+filters — so you can pivot to a country with actual coverage without a
+second \`profile list\` round-trip:
+\`\`\`
+$ ish study run --country XX --min-age 35 --sample 5
+Error: No simulatable AI tester profiles in workspace w-b32 match:
+       --country XX --min-age 35.
+       Populated countries with these other filters: SE (12), DE (8), NL (3).
+       Broaden your filters or run \`ish profile list\` to inspect the pool.
+\`\`\`
+The suggestion is best-effort — it never replaces the original error,
+just augments it.
 ## Defaults
 - \`ish study run\` with no audience flags → reuses the iteration's
@@ -571,6 +782,13 @@ ish study cancel <tester_id>              # cancel a running simulation
 \`<tester_id>\` accepts a tester alias (\`t-…\`) or a full UUID. The
 study-level \`poll\`/\`wait\` forms also exist (\`--study <id>\` /
 \`--iteration <id>\`) for whole-batch progress.
+## Related
+- \`reference/json-mode\` — output modes (display vs capture vs chain).
+  Use \`--get tester_aliases\` to capture the run's testers without
+  piping through \`jq\`. \`--human\` forces table output even through
+  \`tee\`/redirection.
 `;
 const REFERENCE_ALIASES = `# reference: aliases
@@ -602,32 +820,120 @@ ish profile generate --source tps-3a4 --count 4
 The full UUID is also always accepted. Add \`--verbose\` to JSON output
 to see UUIDs alongside aliases.
 `;
-const REFERENCE_JSON_MODE = `# reference: JSON output for agents
+const REFERENCE_JSON_MODE = `# reference: output modes for agents
+\`ish\` distinguishes **three output modes** so agents don't have to
+post-process CLI output with \`jq\` or \`python\` for routine tasks:
+1. **Display mode (human)** — readable tables and key/value blocks.
+   Default on a TTY. Force it anywhere with \`--human\` (e.g. \`ish
+   workspace list --human | tee /tmp/x.txt\` keeps the table layout
+   even though stdout is redirected).
+2. **Capture mode (single value)** — \`--get <field>\` extracts the
+   value at a dotted path and prints it bare (no JSON quotes, no
+   indentation). Use this to feed one CLI's output into another:
+   \`ASK=$(ish ask create … --get alias)\` instead of
+   \`ASK=$(ish ask create … --json | jq -r .alias)\`.
+3. **Chain mode (full JSON)** — \`--json\` (or auto-enabled when stdout
+   is piped). Returns structured payloads for downstream parsing.
+   Reach for this only when you actually need multiple fields or a
+   nested shape; for one value, \`--get\` is shorter.
+## Picking the right mode
+| You want to…                              | Mode                                             |
+|-------------------------------------------|--------------------------------------------------|
+| Show the user a list of workspaces        | bare command (TTY) or \`--human\` if redirecting   |
+| Capture an alias for a follow-up command  | \`--get alias\`                                   |
+| Inspect a specific nested field           | \`--get tester_profile.name\`                     |
+| Compare 2+ fields, or pipe into jq        | \`--json\` (or auto-on when piped)                |
+| Force human output through \`tee\`         | \`--human\`                                       |
+| Force JSON on a TTY                       | \`--json\`                                        |
+\`--get\` and \`--human\` are mutually exclusive — capture and display are
+different intents; pick one. \`--get\` implies \`--json\` internally so the
+renderer always has structured data to extract from; you don't need to
+add \`--json\` yourself.
+### Worked example: display vs. capture
+\`\`\`bash
+# Display: bare command on a TTY → human table.
+ish workspace list
+# Capture: feed one alias into the next command, no jq required.
+ASK=$(ish ask create --new --name demo \\
+        --prompt "Which?" --variant text:A --variant text:B \\
+        --sample 30 --get alias)
+ish ask wait "$ASK" --timeout 600
-Every command that produces output supports machine-readable JSON. JSON
-mode is **auto-enabled when stdout is piped**, so an agent rarely needs
-\`--json\` explicitly.
+# Capture across an entire list: one value per line.
+ish workspace list --get alias
+# w-6ec
+# w-d02
+# …
+# Display preserved through tee:
+ish ask results "$ASK" --human | tee /tmp/transcript.txt
+\`\`\`
 ## Flags
-- \`--json\`            — force JSON output even on a TTY.
+- \`--human\`            — force human-readable output regardless of TTY
+                          state (overrides the auto-flip-to-JSON when
+                          stdout is piped). Mutually exclusive with
+                          \`--get\`.
+- \`--get <field>\`      — extract a single field from the JSON response
+                          and print only its bare value. Supports dotted
+                          paths (\`tester_profile.name\`). On a paginated
+                          \`{items: [...]}\` response, the path
+                          auto-descends into \`items\` so \`--get alias\`
+                          on a list yields one value per line. Implies
+                          \`--json\` internally; mutually exclusive with
+                          \`--human\`. Strings/numbers/bools are printed
+                          unquoted; \`null\` prints as an empty line;
+                          arrays print one element per line; objects
+                          print as compact one-line JSON. Missing field
+                          → exit 2 with a usage error.
+- \`--json\`             — force JSON output even on a TTY. Auto-enabled
+                          when stdout is piped (unless \`--human\` is
+                          set).
 - \`--fields a,b,c\`    — keep only these fields in JSON output (e.g.
                           \`alias,name,status\`). Filters per item only;
-                          paginated wrappers (\`{items, total, limit,
-                          offset}\`) keep their shape.
+                          list wrappers (\`{items, total, returned,
+                          limit, offset, has_more}\`) keep their shape.
 - \`--verbose\`          — include full UUIDs, timestamps, and (on
                           write paths) the full server payload instead
                           of the compact response.
 - \`-q, --quiet\`        — suppress progress messages on stderr (errors
-                          still go to stderr).
+                          still go to stderr). \`--get\` implies
+                          \`--quiet\` so the bare value is the only
+                          thing on stdout.
 ## Stable shape rules
 The CLI guarantees these contracts so agents can chain safely:
-- **Lists keep their wrapper.** \`--fields\` strips per-item, never the
-  envelope. A paginated list with \`{items, total, limit, offset}\` will
-  always have those four keys.
+- **Every list response is a six-key envelope.** All
+  \`<entity> list --json\` responses (workspace, study, iteration, ask,
+  profile, config) return:
+  \`\`\`json
+  {
+    "items":    [...],
+    "total":    121,    // server-provided when paginated; else items.length
+    "returned": 50,     // items.length, always present
+    "limit":    50,
+    "offset":   0,
+    "has_more": true    // total > offset + returned
+  }
+  \`\`\`
+  When the server doesn't paginate, \`total = returned = limit\`,
+  \`offset = 0\`, \`has_more = false\` (synthesized client-side).
+  \`--fields\` strips per-item, never the envelope — those six keys are
+  always present. Use \`has_more\` to detect truncation rather than
+  counting items yourself.
 - **Write paths always include \`id\` AND \`alias\`.** Even with
   \`--fields\` set, you can identify the affected resource. Default
   write-path JSON is compact (\`{id, alias, name, updated_at,
@@ -635,9 +941,102 @@ The CLI guarantees these contracts so agents can chain safely:
 - **\`profile generate\` trims \`simulation_config\` by default** (~9×
   smaller than the raw response). Pass \`--include-simulation-config\`
   if you need it.
+- **\`<entity> get\` accepts multiple IDs.** \`profile get\`, \`study get\`,
+  \`iteration get\`, and \`ask get\` all take \`<ids...>\` — pass two or
+  more aliases (space- or comma-separated) and the response is a
+  \`{items:[...], total:N}\` envelope. Use this instead of piping
+  \`list --json\` to \`jq\`/\`python\` to filter by alias.
+- **Ask detail JSON includes denormalized counts** so agents don't
+  have to count nested arrays. \`ask get\`, \`ask create --wait\`,
+  \`ask run --wait\`, and \`ask wait --verbose\` all add:
+  \`\`\`json
+  {
+    "testers_count":      3,
+    "responses_total":    9,
+    "responses_complete": 9,
+    "rounds": [
+      { "responses_total": 3, "responses_complete": 3, "...": "..." }
+    ]
+  }
+  \`\`\`
+  \`responses_errored\` only appears when at least one response errored.
+  Use these instead of \`jq '.testers | length'\` /
+  \`jq '.rounds[0].responses | length'\`.
 - **\`study run --json\` exposes tester handles.** The top-level
   \`tester_ids[]\` and \`tester_aliases[]\` arrays are the canonical
   inputs to \`ish study poll/wait/cancel\`.
+- **Study responses carry a derived \`runtime_status\` field**
+  (\`draft | running | completed | completed_with_errors | cancelled\`).
+  Prefer this over the raw \`status\` field — \`runtime_status\` is
+  computed from the iteration testers' actual run state and never
+  reports \`failed\` while completed runs exist. Available on
+  \`study get\`, \`study results\`, and the response from
+  \`study generate\`. The CLI also surfaces a \`status_inferred\` field
+  alongside the raw \`status\` when it detects a partial-failure
+  inconsistency, plus a stderr warning ("Warning: study reports
+  status='failed' but N/M testers completed…").
+- **\`study generate --json\` includes a \`modality_rationale\`** —
+  one short sentence explaining why the LLM picked that modality. Use
+  it to detect mis-classifications (e.g. brief was a static concept doc
+  but rationale says "live UI flow") and override via
+  \`study update <id> --modality text\` before adding iterations.
+- **\`ask add-questions\` is additive by default.** Appending questions
+  preserves variant comments / picks / ratings / prior-question
+  answers; only the new question(s) get dispatched. Cost: roughly N
+  phase-2 LLM calls instead of 2N. Pass \`--redispatch-all\` for the
+  legacy reset behavior when you want fresh first impressions.
+- **\`ask results --json\` includes \`cross_round_summary\` for 2+
+  rounds.** Top-level field with per-round picks/winner snapshots and
+  a \`picks_delta\` (R1 → last round). Replaces hand-rolled diffing of
+  two \`ask results\` calls.
+- **No more auto-empty iteration A.** \`study create\` and
+  \`study generate\` no longer produce a placeholder iteration A. The
+  first explicit \`ish iteration create\` becomes label A.
+  \`study create\` now accepts \`--content-text\` (text) or \`--url\`
+  (interactive) inline so a single call yields a runnable study.
+  Running \`study run\` on a study with zero iterations exits 2 with
+  a suggestion to run \`ish iteration create\` first.
+- **Tester responses include \`error_message\`.** When a tester is
+  \`status: failed\`, the JSON exposes \`error_message: "<reason>"\` so
+  agents can act without drilling into logs. \`study results\` rolls
+  this up: top-level \`failed_count\`, plus per-tester \`error_message\`
+  in the \`testers[]\` array, and a "Failed testers" subsection in
+  human output. Empty when the tester succeeded.
+- **\`profile list\` emits a stderr pagination hint** when
+  \`has_more=true\` and \`--quiet\` is not set. The hint goes to **stderr
+  in every mode** including \`--json\` and piped stdout — it never
+  pollutes machine-readable stdout but is visible to any agent that
+  reads stderr (which they should, for warnings and progress). Format:
+  "showing N–M of TOTAL; pass --offset M --limit N for more."
+  JSON consumers can also read \`has_more\` directly off the envelope.
+- **\`ask results --json\` adds an \`aggregates\` field per round.** For
+  rounds with \`wants_pick\`/\`wants_ratings\`, the CLI computes the
+  verdict locally so agents don't have to parse comment prose:
+  \`\`\`json
+  {
+    "aggregates": {
+      "picks":   { "A": 3, "B": 0 },
+      "ratings": { "A": { "mean": 4.667, "n": 3 },
+                   "B": { "mean": 2.000, "n": 3 } },
+      "winner":  { "letter": "A", "count": 3, "tied": false }
+    }
+  }
+  \`\`\`
+  \`picks\` is present iff \`wants_pick\`; \`ratings\` is present iff
+  \`wants_ratings\` and ≥ 1 rating was submitted; \`winner\` is the
+  highest pick count (\`tied: true\` if multiple variants share the
+  top). \`mean\` is rounded to 3 decimal places; \`n\` is the rating
+  count for that variant.
+- **\`ask results --json\` deduplicates tester profile snapshots.** When
+  \`tester_profile\` and \`tester_profile_snapshot\` share all
+  overlapping fields (the common case — they only diverge if the
+  profile was edited after dispatch), the snapshot is collapsed to
+  \`{snapshotted_at, snapshot_version, _matches_tester_profile: true}\`.
+  Use \`--verbose\` to keep both copies in full.
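The `aggregates` rules added in the bullet list above (pick counts, a `winner` with a `tied` flag, and rating means rounded to 3 decimal places) can be illustrated with a short JavaScript sketch. It is not the package's implementation. The function name `computeAggregates`, the input shape (`responses` with a `pick` letter and a `ratings` map), and the choice to report the first tied variant as `letter` are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the documented aggregates rules; the input shape is assumed.
function computeAggregates(responses, variants, { wantsPick, wantsRatings }) {
  const aggregates = {};
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

  if (wantsPick) {
    // Start every variant at zero so unpicked variants still appear.
    const picks = Object.fromEntries(variants.map((v) => [v, 0]));
    for (const r of responses) {
      if (has(picks, r.pick)) picks[r.pick] += 1;
    }
    const top = Math.max(...Object.values(picks));
    const leaders = variants.filter((v) => picks[v] === top);
    aggregates.picks = picks;
    // Assumption: on a tie, report the first leader in variant order.
    aggregates.winner = { letter: leaders[0], count: top, tied: leaders.length > 1 };
  }

  if (wantsRatings) {
    const ratings = {};
    for (const v of variants) {
      const values = responses
        .map((r) => (r.ratings ? r.ratings[v] : undefined))
        .filter((x) => typeof x === "number");
      if (values.length > 0) {
        const mean = values.reduce((a, b) => a + b, 0) / values.length;
        // Round the mean to 3 decimal places; n is the rating count.
        ratings[v] = { mean: Math.round(mean * 1000) / 1000, n: values.length };
      }
    }
    // `ratings` is only present when at least one rating was submitted.
    if (Object.keys(ratings).length > 0) aggregates.ratings = ratings;
  }

  return aggregates;
}

const sample = [
  { pick: "A", ratings: { A: 5, B: 2 } },
  { pick: "A", ratings: { A: 5, B: 2 } },
  { pick: "A", ratings: { A: 4, B: 2 } },
];
console.log(JSON.stringify(
  computeAggregates(sample, ["A", "B"], { wantsPick: true, wantsRatings: true })
));
```

With the three sample responses, the result has the same shape as the `aggregates` example in the diff: three picks for A, none for B, and an untied winner.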
 ## Exit codes
@@ -693,25 +1092,43 @@ a structured error object on **stdout** and a human message on
 ## Examples
 \`\`\`
-ish workspace list --json | jq '.[].alias'
-ish study get s-b2c --fields alias,name,status,iterations
+# Display (table on TTY, JSON when piped):
+ish workspace list
+# Display preserved through tee/pipe (force human):
+ish ask results a-6ec --human | tee /tmp/results.txt
+# Capture a single alias to feed into the next command:
+WS=$(ish workspace list --get alias | head -1)
+# Inspect a nested field:
+ish study tester t-a17 --get tester_profile.name
+# Chain (full JSON for jq when you need multiple fields):
+ish study get s-b2c --fields alias,name,status,iterations --json
 ish ask results a-6ec --round 1 --json
-ish profile generate --description "..." --count 3 --json | jq '.[].alias'
 \`\`\`
 ## Composing commands
-JSON mode + alias resolution makes pipelines safe:
+\`--get\` removes most of the \`jq\` shims agents reach for. Capture in
+a script, then display the final result back to the user:
 \`\`\`
-ITER=$(ish iteration create --url https://example.com --json | jq -r .alias)
-TESTERS=$(ish study run --iteration "$ITER" --sample 5 --country SE \\
-            --json | jq -r '.tester_aliases[]')
+# Capture — bare values, no jq needed:
+ITER=$(ish iteration create --url https://example.com --get alias)
+TESTERS=$(ish study run --iteration "$ITER" --sample 5 --country SE --get tester_aliases)
 for t in $TESTERS; do
   ish study wait "$t" --timeout 600
 done
-ish study results --json | jq .
+# Display the final results to the user, even though we're in a script:
+ish study results --human
 \`\`\`
+When you genuinely need multiple fields in one parse pass, \`--json\` is
+still the right tool — \`--get\` is for single-value capture, not for
+reshaping output.
 `;
 const GUIDE_FIRST_STUDY = `# guide: your first study, end to end
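The `--get` behavior documented in the hunks above (dotted paths, auto-descent into `items` on list responses, and bare printing of values) can be sketched in JavaScript as follows. This is an illustration of the documented rules, not the code shipped in the package. The helper names `resolvePath`, `formatBare`, and `getField` are hypothetical, trying the top-level path before descending into `items` is an assumption, and a thrown `Error` stands in for the CLI's exit code 2.

```javascript
// Hypothetical helpers illustrating the documented --get rules.

// Walk a dotted path such as "tester_profile.name" through nested objects.
function resolvePath(obj, path) {
  let current = obj;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object" || !(key in current)) {
      return { found: false };
    }
    current = current[key];
  }
  return { found: true, value: current };
}

// Bare printing: scalars unquoted, null as an empty line,
// arrays one element per line, objects as compact one-line JSON.
function formatBare(value) {
  if (value === null) return "";
  if (Array.isArray(value)) return value.map(formatBare).join("\n");
  if (typeof value === "object") return JSON.stringify(value);
  return String(value);
}

function getField(payload, path) {
  // Assumption: a top-level match wins over descending into `items`.
  const direct = resolvePath(payload, path);
  if (direct.found) return formatBare(direct.value);

  // List envelope: apply the path to every item, one value per line.
  if (payload !== null && typeof payload === "object" && Array.isArray(payload.items)) {
    return payload.items
      .map((item) => {
        const hit = resolvePath(item, path);
        if (!hit.found) throw new Error(`field not found: ${path}`);
        return formatBare(hit.value);
      })
      .join("\n");
  }

  // Stand-in for the documented "missing field → exit 2" usage error.
  throw new Error(`field not found: ${path}`);
}

const list = { items: [{ alias: "w-6ec" }, { alias: "w-d02" }], total: 2 };
console.log(getField(list, "alias"));
console.log(getField({ tester_profile: { name: "Ada" } }, "tester_profile.name"));
```

The first call prints one alias per line, which is what makes `WS=$(ish workspace list --get alias | head -1)` in the examples above work without `jq`.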
@@ -774,6 +1191,168 @@ ish study results --json | jq .
 - Want a quick reaction test instead of an interactive study? Skip to
   \`ish docs get-page concepts/ask\`.
 `;
+const CONCEPT_ACTIVE_CONTEXT = `# concept: active context
+The CLI keeps a small amount of session state in \`~/.ish/config.json\`
+(or wherever \`ISH_HOME\` points) so commands don't need to repeat IDs:
+- \`access_token\` / \`refresh_token\` — the OAuth pair from \`ish login\`.
+- \`workspace\`  — set by \`ish workspace use <id>\`.
+- \`study\`      — set by \`ish study use <id>\`.
+- \`ask\`        — set by \`ish ask use <id>\`.
+Most commands fall back to these when their corresponding flag is
+omitted (\`--workspace\`, \`--study\`, \`--ask\`).
+## Inspecting active context
+\`ish status\` (alias: \`ish whoami\`) is the canonical way to see what's
+configured. **Run it as the first command on a cold start** — it
+confirms login, prints the active workspace/study/ask handles, and
+shows how long the token has left.
+\`\`\`bash
+ish status
+# User:       you@example.com  (token valid, expires in 47m)
+# Workspace:  Onboarding revamp (w-6ec)
+# Study:      —
+# Ask:        a-6ec "tagline AB"
+# Home:       /home/you/.ish
+# API:        https://api.ishlabs.io
+\`\`\`
+JSON shape (\`ish status --json\` or piped):
+\`\`\`json
+{
+  "user":      { "email": "...", "token_valid": true, "expires_in_seconds": 2820 },
+  "workspace": { "id": "...", "alias": "w-6ec", "name": "Onboarding revamp" },
+  "study":     null,
+  "ask":       { "id": "...", "alias": "a-6ec", "name": "tagline AB" },
+  "api_url":   "https://api.ishlabs.io",
+  "home":      "/home/you/.ish"
+}
+\`\`\`
+\`status\` does not error when the user is logged out — it returns
+\`user: null\` plus a \`hint\` field telling the caller to run
+\`ish login\`. Safe to run unconditionally at the start of any
+script or agent session.
+## Setting / clearing active context
+\`\`\`bash
+ish workspace use w-6ec        # set
+ish workspace use --clear      # clear
+ish study use s-b2c
+ish study use --clear
+ish ask use a-6ec
+ish ask use --clear
+\`\`\`
+## Overriding without persisting
+Every read command accepts \`--workspace <id>\`, \`--study <id>\`, or
+\`--ask <id>\` to override the saved active value for one invocation
+without touching the config. Useful for one-off pokes at another
+resource.
+\`--workspace\` is accepted on **every workspace-scoped subcommand**
+(\`ask\`, \`study\`, \`iteration\`, \`profile\`, \`source\` and their
+descendants). When workspace is inferable from the subject ID alias
+(e.g. \`ish ask delete a-6ec\`) the value is silently ignored — agents
+can pass it reflexively without tripping "unknown option" errors. Out
+of scope: \`workspace\`, \`config\`, \`docs\`, \`init\`, \`login\`,
+\`logout\`, \`whoami\`, \`upgrade\` (none of these need a workspace).
+## Related
+- \`reference/aliases\` — the prefix scheme used by every entity.
+- \`reference/json-mode\` — output modes (display vs capture vs chain),
+  including \`--get workspace.alias\` to capture the active workspace
+  without piping \`ish status --json\` through \`jq\`.
+`;
+const REFERENCE_BILLING_LIMITS = `# reference: billing tier limits
+Some create operations are gated by your account's billing tier. The
+backend enforces these. The CLI just renders the structured rejection.
+There is no way to bypass enforcement from the CLI; running the same
+\`POST\` with \`curl\` will hit the same gate.
+The web UI reads these caps at runtime from
+\`GET /api/v1/billing/limits\` (cached for one hour) and falls back to
+its build-time snapshot if the endpoint is unreachable. The table below
+is the CLI's own snapshot, intentionally release-pinned for offline
+use; re-pull it after each \`ish-cli\` release. The source of truth at
+request time, for any client, is the backend's \`TIER_LIMITS\` dict in
+\`tier_limits.py\`.
+## Limits enforced
+| Limit                       | Free | Media | Starter | Pro | Enterprise |
+|-----------------------------|------|-------|---------|-----|------------|
+| \`maxProducts\`               | 1    | 1     | ∞       | ∞   | ∞          |
+| \`maxStudiesPerProduct\`      | 3    | ∞     | ∞       | ∞   | ∞          |
+| \`maxIterationsPerStudy\`     | 2    | ∞     | ∞       | ∞   | ∞          |
+| \`maxCustomTesterProfiles\`   | 3    | 10    | 10      | ∞   | ∞          |
+Commands that may hit a limit: \`ish workspace create\`,
+\`ish study create\`, \`ish study generate\`, \`ish iteration create\`,
+\`ish profile create\`, \`ish profile generate\`.
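For a client that wants to pre-check headroom before calling one of these commands, the table above can be transcribed into a lookup. The following is a sketch only (not part of the package): `TIER_LIMITS_SNAPSHOT` and `headroom` are hypothetical names, and the lowercase tier keys other than `free` are an assumption. As with the CLI's own table, the backend remains the source of truth.

```javascript
// Transcription of the table above; Infinity stands in for the ∞ cells.
// Tier keys are assumed to be lowercase (only "free" appears in the docs).
const UNLIMITED = Infinity;
const TIER_LIMITS_SNAPSHOT = {
  free:       { maxProducts: 1, maxStudiesPerProduct: 3, maxIterationsPerStudy: 2, maxCustomTesterProfiles: 3 },
  media:      { maxProducts: 1, maxStudiesPerProduct: UNLIMITED, maxIterationsPerStudy: UNLIMITED, maxCustomTesterProfiles: 10 },
  starter:    { maxProducts: UNLIMITED, maxStudiesPerProduct: UNLIMITED, maxIterationsPerStudy: UNLIMITED, maxCustomTesterProfiles: 10 },
  pro:        { maxProducts: UNLIMITED, maxStudiesPerProduct: UNLIMITED, maxIterationsPerStudy: UNLIMITED, maxCustomTesterProfiles: UNLIMITED },
  enterprise: { maxProducts: UNLIMITED, maxStudiesPerProduct: UNLIMITED, maxIterationsPerStudy: UNLIMITED, maxCustomTesterProfiles: UNLIMITED },
};

// How many more resources of this kind the snapshot says can be created.
function headroom(tier, limit, current) {
  const max = TIER_LIMITS_SNAPSHOT[tier]?.[limit];
  if (max === undefined) {
    throw new Error(`unknown tier or limit: ${tier}/${limit}`);
  }
  return Math.max(0, max - current);
}

console.log(headroom("free", "maxStudiesPerProduct", 3));     // 0
console.log(headroom("media", "maxCustomTesterProfiles", 4)); // 6
console.log(headroom("pro", "maxProducts", 12));              // Infinity
```

A result of `0` here is advisory; the structured rejection described in the next section is what actually decides.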
+## What you see when a limit is hit
+Human output (stderr):
+\`\`\`
+Error: Free plan allows 3 studies per workspace. Upgrade to add more.
+  → Upgrade your plan at https://app.ishlabs.io/billing
+  → Run \`ish docs get-page reference/billing-limits\` for the tier table
+\`\`\`
+JSON output (stdout — \`--json\` or piped):
+\`\`\`json
+{
+  "error": "Free plan allows 3 studies per workspace. Upgrade to add more.",
+  "error_code": "usage_limit_reached",
+  "status": 403,
+  "retryable": false,
+  "tier": "free",
+  "limit": "maxStudiesPerProduct",
+  "current": 3,
+  "max": 3,
+  "upgrade_url": "https://app.ishlabs.io/billing",
+  "suggestions": ["Upgrade your plan at https://app.ishlabs.io/billing", "..."]
+}
+\`\`\`
+Exit code: \`1\` (general — non-retryable). Don't retry; the user has to
+upgrade or delete an existing resource to free up headroom.
+## Agent-side handling
+- Branch on \`error_code === "usage_limit_reached"\`. Do not branch on
+  \`status === 403\` alone: \`forbidden\` errors that are *not*
+  tier-related share that status but keep \`error_code: "forbidden"\`.
+- Use \`limit\`, \`current\`, \`max\`, \`tier\` to construct your own
+  recovery message. The \`limit\` value matches the table above and is
+  stable.
+- The \`generate\` endpoints (\`study generate\`, \`profile generate\`)
+  refuse the entire batch when the post-generation count would exceed
+  the cap, rather than partially fulfilling — re-issue with a smaller
+  \`--count\` after upgrading or pruning.
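The handling rules above might be sketched like this on the agent side. This is an illustration only (not part of the package): `describeCliError` and its return shape are hypothetical, while the envelope fields match the JSON error output documented earlier.

```javascript
// Hypothetical helper: turn a parsed CLI error envelope into an agent decision.
function describeCliError(envelope) {
  if (envelope.error_code === "usage_limit_reached") {
    // Tier limit: never retry; build a recovery message from the structured fields.
    const { tier, limit, current, max, upgrade_url } = envelope;
    return {
      kind: "tier_limit",
      retry: false,
      message: `${limit}: ${current}/${max} used on the ${tier} tier. ` +
        `Upgrade at ${upgrade_url} or delete an existing resource.`,
    };
  }
  if (envelope.error_code === "forbidden") {
    // Same HTTP status as a tier limit, but not fixable by upgrading.
    return { kind: "forbidden", retry: false, message: envelope.error };
  }
  // Anything else: defer to the envelope's own retryable flag.
  return { kind: "other", retry: envelope.retryable === true, message: envelope.error };
}

// Envelope copied from the documented JSON output.
const limitHit = {
  error: "Free plan allows 3 studies per workspace. Upgrade to add more.",
  error_code: "usage_limit_reached",
  status: 403,
  retryable: false,
  tier: "free",
  limit: "maxStudiesPerProduct",
  current: 3,
  max: 3,
  upgrade_url: "https://app.ishlabs.io/billing",
};

console.log(describeCliError(limitHit).message);
// maxStudiesPerProduct: 3/3 used on the free tier. Upgrade at https://app.ishlabs.io/billing or delete an existing resource.
```

Keying on `error_code` rather than `status` keeps the two 403 cases apart, which is the distinction the first bullet draws.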
+## Related
+- \`concepts/workspace\` — \`maxProducts\` is per-account.
+- \`concepts/study\`     — \`maxStudiesPerProduct\` gates study creation.
+- \`concepts/iteration\` — \`maxIterationsPerStudy\` gates iteration creation.
+- \`concepts/profile\`   — \`maxCustomTesterProfiles\` gates profile creation.
+- \`reference/json-mode\` — full error envelope shape and exit codes.
+`;
 const PAGES = [
     {
         slug: "overview",
@@ -853,6 +1432,12 @@ const PAGES = [
         description: "Side-by-side; decision rule for choosing one over the other.",
         body: CONCEPT_RUN_VERBS,
     },
+    {
+        slug: "concepts/active-context",
+        title: "concept: active context",
+        description: "Saved workspace/study/ask state and how to inspect it (ish status).",
+        body: CONCEPT_ACTIVE_CONTEXT,
+    },
     {
         slug: "reference/aliases",
         title: "reference: aliases",
@@ -861,10 +1446,16 @@ const PAGES = [
     },
     {
         slug: "reference/json-mode",
-        title: "reference: JSON output for agents",
-        description: "JSON, --fields, --verbose, exit codes, pipe behaviour.",
+        title: "reference: output modes for agents (display, capture, chain)",
+        description: "Display vs capture vs chain: --human, --get, --json, --fields, exit codes, pipe behavior.",
         body: REFERENCE_JSON_MODE,
     },
+    {
+        slug: "reference/billing-limits",
+        title: "reference: billing tier limits",
+        description: "Per-tier caps on workspaces/studies/iterations/profiles; usage_limit_reached error shape.",
+        body: REFERENCE_BILLING_LIMITS,
+    },
     {
         slug: "guides/first-study",
         title: "guide: your first study, end to end",