npm - @onkernel/cua-cli - Versions diffs - 0.1.0 - Mend

@onkernel/cua-cli 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/README.md +170 -0
package/dist/cli.js +1758 -0
package/dist/harness-models-GT8Ke1vt.js +106 -0
package/dist/main-Bphx_zOj.js +899 -0
package/package.json +48 -0

package/README.md ADDED Viewed

@@ -0,0 +1,170 @@
+# `@onkernel/cua-cli`
+The CLI / TUI binary for the [`cua`](../../README.md) monorepo. Wires
+[`@onkernel/cua-agent`](../agent)'s `CuaAgentHarness` to
+[`pi-tui`](https://www.npmjs.com/package/@earendil-works/pi-tui) for an
+interactive front-end and to
+[`pi-coding-agent`](https://www.npmjs.com/package/@earendil-works/pi-coding-agent)'s
+coding tools for workspace access.
+## Install (from the monorepo)
+```bash
+# from the repo root:
+npm install
+# run directly from source via tsx (no global install required):
+npx tsx packages/cli/src/cli.ts --help
+# optional: pin a shell function in your rc so `cua` works from any cwd
+# while preserving the caller's directory (so `--out`, transcript
+# bucketing, and `.agents/skills` discovery use the directory you
+# invoked from):
+#   CUA_REPO=/absolute/path/to/cua
+#   cua() { "$CUA_REPO/node_modules/.bin/tsx" "$CUA_REPO/packages/cli/src/cli.ts" "$@"; }
+```
+## Usage
+```bash
+# Interactive TUI:
+cua
+# Single-shot prompt:
+cua --print "open https://example.com and tell me the heading"
+# Constrained one-shot subcommands (deterministic exit codes):
+cua open https://example.com
+cua click "Sign in button"
+cua type "email field" "alice@example.com"
+cua press ctrl l                              # Ctrl+L (focus address bar)
+cua url
+cua observe "what page is loaded?"
+cua screenshot --out shot.png
+cua do "buy a pair of socks on amazon" --max-steps 20
+# List and pick supported models:
+cua models
+cua models -p openai
+cua --print --model openai:gpt-5.5 "..."
+cua --print --model anthropic:claude-opus-4-7 "..."
+cua --print --model google:gemini-3-flash-preview "..."
+cua --print --model yutori:n1.5-latest "..."
+# Named sessions (browser stays alive across calls):
+cua session start login                       # provisions Kernel browser
+cua -s login open https://github.com/login
+cua -s login type "email field" "$EMAIL"
+cua -s login click "Sign in"
+cua session stop login
+cua session list                              # NAME / KERNEL_ID / AGE / LIVE_URL
+cua session show login                        # full JSON metadata
+# Resume a prior session transcript into a fresh browser:
+cua --continue
+cua --resume                                  # picker
+cua --session abc12345                        # by id prefix
+```
+## Models
+Run `cua models` to list every supported `-m` / `--model` value and the
+provider it routes to. Filter by provider with `cua models -p openai`,
+`cua models -p anthropic`, `cua models -p google` (alias: `gemini`), or
+`cua models -p yutori`.
+`-m` / `--model` accepts a provider-qualified `provider:model` ref (e.g.
+`openai:gpt-5.5`) or a bare model id when it matches exactly one catalog
+entry. The default is `openai:gpt-5.5`.
+## Configuration
+Configuration is by environment variable. There is no config file.
+| Env                  | Used for                                       |
+| -------------------- | ---------------------------------------------- |
+| `KERNEL_API_KEY`     | Kernel API key (required)                      |
+| `OPENAI_API_KEY`     | OpenAI API key (required when `-m openai:…`)   |
+| `ANTHROPIC_API_KEY`  | Anthropic API key (required when `-m anthropic:…`) |
+| `GOOGLE_API_KEY`     | Google API key (required when `-m google:…`)   |
+| `GEMINI_API_KEY`     | alias of `GOOGLE_API_KEY`                      |
+| `TZAFON_API_KEY`     | Tzafon API key (required when `-m tzafon:…`)   |
+| `YUTORI_API_KEY`     | Yutori API key (required when `-m yutori:…`)   |
+| `KERNEL_BASE_URL`    | override Kernel base URL                       |
+| `OPENAI_BASE_URL`    | override OpenAI base URL                       |
+| `ANTHROPIC_BASE_URL` | override Anthropic base URL                    |
+| `GOOGLE_BASE_URL`    | override Google base URL                       |
+| `TZAFON_BASE_URL`    | override Tzafon base URL                       |
+| `YUTORI_BASE_URL`    | override Yutori base URL                       |
+| `XDG_DATA_HOME`      | sessions dir base (defaults to `~/.local/share`) |
+| `CUA_IMAGE_PROTOCOL` | force inline image protocol (`kitty`/`iterm2`/`none`/`auto`) |
+Use `--thinking <level>` (`off | minimal | low | medium | high | xhigh`,
+default `low`) for providers that support reasoning effort.
+## Output formats
+`--print` defaults to streaming text. Pass `-o jsonl` for one
+structured event per line (good for scripting):
+```bash
+cua --print -o jsonl "open https://example.com" \
+  | jq -c 'select(.type=="tool_call" or .type=="assistant_text_done")'
+```
+Add `--jsonl-include-deltas` for assistant-token deltas and
+`--jsonl-include-images` for base64 screenshots in `tool_result` events.
+The first event of every `--print -o jsonl` run is
+`session_created` with a `schema_version` field. The current schema
+version is `1`. The `model` field carries a provider-qualified ref
+(e.g. `openai:gpt-5.5`); use `parseCuaModelRef` from `@onkernel/cua-ai`
+if you only need the bare model id.
+## Sessions and transcripts
+`--print`, the interactive TUI, and any `-s <name>` invocation persist
+a JSONL transcript to
+`$XDG_DATA_HOME/cua/sessions/<cwd-hash>/<id>.jsonl` by default
+(typically `~/.local/share/cua/sessions/...`). Pass `--no-session` to
+keep a run in-memory only, or `--session-dir <path>` to override the
+location.
+For named sessions, the exact transcript path is in
+`cua session show <name>` under `transcript_path`. See the
+[Session transcripts section in the top-level README](../../README.md#session-transcripts)
+for the JSONL schema and `jq` analysis examples.
+## Skills
+`cua` follows the cross-agent
+[`~/.agents/skills/`](https://agentskills.io) standard. Discovery
+defaults:
+- `~/.agents/skills/` (user-global)
+- `<cwd>/.agents/skills/` (project-local)
+Plus any explicit `--skill <path>` flags. Disable with `--no-skills`
+(`-ns`).
+Each skill's `name`, `description`, and file `location` are added to
+the system prompt; the model uses the `read` tool to load a skill's
+full body when its description matches the task. Use `/skill:<name>`
+in a prompt to force-load a skill body inline.
+## Image protocol
+Force the inline-screenshot protocol with `--image-protocol` or
+`CUA_IMAGE_PROTOCOL`:
+- `kitty`  — Kitty graphics protocol (also covers Ghostty / WezTerm).
+- `iterm2` — iTerm2 inline images.
+- `none`   — disable inline images; show a compact text card instead.
+- `auto`   — auto-detect based on `TERM_PROGRAM` / `TMUX` / etc. (default).
+The TUI prints the resolved capability as the second header line so
+you can see at a glance whether inline images will render.
+## License
+MIT.