
@freibergergarcia/phone-a-friend 2.2.0 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/.claude-plugin/plugin.json +1 -1
package/README.md +35 -43
package/commands/curiosity-engine.md +18 -13
package/commands/phone-a-friend.md +42 -38
package/commands/phone-a-team.md +30 -38
package/dist/index.js +478 -150
package/package.json +1 -1
package/skills/curiosity-engine/COMMAND.opencode.md +1 -1
package/skills/curiosity-engine/SKILL.md +18 -13
package/skills/phone-a-friend/COMMAND.opencode.md +1 -1
package/skills/phone-a-friend/SKILL.md +42 -38

package/.claude-plugin/plugin.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "phone-a-friend",
   "description": "CLI relay that lets AI coding agents collaborate by sending prompts and repository context to backend agents.",
-  "version": "2.2.0",
+  "version": "2.3.0",
   "author": {
     "name": "Bruno Freiberger"
   }

package/README.md CHANGED

@@ -24,17 +24,18 @@ Relay tasks to any backend, spin up multi-model teams, or run persistent multi-a
 | **Team** | Iterative multi-backend refinement over N rounds | Collaborative review, converging on a solution |
 | **Agentic** | Persistent multi-agent sessions with @mention routing | Autonomous collaboration, adversarial review, deep analysis |
-<div align="center">
-### TUI Dashboard
-<img src="https://raw.githubusercontent.com/freibergergarcia/phone-a-friend/main/assets/tui-dashboard.gif" alt="TUI dashboard" width="600">
-### Web Dashboard
+### Host parity
-<img src="https://raw.githubusercontent.com/freibergergarcia/phone-a-friend/main/assets/web-dashboard.gif" alt="Web dashboard" width="700">
+| Feature | Claude Code | OpenCode |
+|---|:---:|:---:|
+| `/phone-a-friend` (single-backend relay) | ✓ | ✓ |
+| `/curiosity-engine` (Q&A rally) | ✓ | ✓ |
+| `/phone-a-team` (iterative multi-model team) | ✓ | — |
+| Plugin marketplace install | ✓ | — |
+| CLI plugin install (`phone-a-friend plugin install --<host>`) | ✓ | ✓ |
+| Skill auto-discovery | ✓ | ✓ |
-</div>
+OpenCode users can replicate `/phone-a-team` by running repeated `/phone-a-friend` calls and synthesizing manually.
 ## Quick Start
@@ -57,26 +58,15 @@ The setup wizard detects your backends, offers to install detected host integrat
 **Claude Code marketplace (commands and skills only):**
-If you use [Claude Code](https://docs.anthropic.com/en/docs/claude-code/setup),
-you can install directly from the marketplace:
 ```
 /plugin marketplace add freibergergarcia/phone-a-friend
 /plugin install phone-a-friend@phone-a-friend-marketplace
 ```
-This fetches the latest version from npm automatically. To update later:
-```
-/plugin marketplace update phone-a-friend-marketplace
-/plugin update phone-a-friend@phone-a-friend-marketplace
-```
+To update: `/plugin marketplace update phone-a-friend-marketplace` then `/plugin update phone-a-friend@phone-a-friend-marketplace`.
 > [!NOTE]
-> Marketplace install gives you Claude Code integration (slash commands
-> and skills). For the full CLI including agentic mode, the TUI dashboard, and
-> the web dashboard on localhost, install globally with
-> `npm install -g @freibergergarcia/phone-a-friend`.
+> Marketplace install ships only the slash commands and skills. For the full CLI (agentic mode, TUI, web dashboard), install via `npm install -g @freibergergarcia/phone-a-friend`.
 **OpenCode commands and skills:**
@@ -113,10 +103,10 @@ Build a team with Claude and Ollama. Have them review the website copy,
 loop through 3 rounds, and converge on final suggestions.
 ```
-No slash commands needed. Once the host integration is installed (the setup wizard offers to do this), the host can route single-backend tasks through `/phone-a-friend`. In Claude Code, mention multiple backends and Claude can use `/phone-a-team` for iterative multi-agent refinement; `/phone-a-team` is Claude-only because it depends on Claude Agent Teams primitives. In OpenCode, use repeated `/phone-a-friend` calls and synthesize the results manually. You can explicitly invoke `/phone-a-friend` in both hosts, and `/phone-a-team` in Claude Code only.
+No slash commands needed once the host integration is installed (see [Host parity](#host-parity) for which slash commands work in which host).
 > [!TIP]
-> **Power-user setup:** Run Claude Code in [**tmux**](https://formulae.brew.sh/formula/tmux) and enable [**bypass permissions**](https://docs.anthropic.com/en/docs/claude-code/security) (`⏵⏵`) for trusted repos. [**Agent teams**](https://docs.anthropic.com/en/docs/claude-code/agent-teams) show up in split panes, so you can watch agents work in parallel without approval pauses. Pair it with **phone-a-friend agentic mode** for fully autonomous multi-agent sessions.
+> **Claude Code power-user setup:** Run in [**tmux**](https://formulae.brew.sh/formula/tmux) with [**bypass permissions**](https://docs.anthropic.com/en/docs/claude-code/security) (`⏵⏵`) and [**Agent Teams**](https://docs.anthropic.com/en/docs/claude-code/agent-teams) to watch agents work in parallel split panes. Pair with **phone-a-friend agentic mode** for fully autonomous sessions.
 ## CLI Usage
@@ -126,7 +116,7 @@ Delegate a task to any backend and get the result back:
 ```bash
 phone-a-friend --to codex --prompt "Review this code"
-phone-a-friend --to gemini --prompt "Analyze the architecture" --model gemini-2.5-flash
+phone-a-friend --to gemini --prompt "Analyze the architecture"
 phone-a-friend --to claude --prompt "Refactor this module"
 phone-a-friend --to ollama --prompt "Explain this function"
 phone-a-friend --to opencode --prompt "Audit this repo" --model qwen3-coder  # Local agentic (OpenCode + Ollama)
@@ -213,21 +203,22 @@ phone-a-friend config edit     # Open in $EDITOR
 ## Backends
-| Backend | Type | Streaming | How it works |
-|---------|------|-----------|-------------|
-| **Codex** | CLI subprocess | No | Runs `codex exec` with sandbox and repo context |
-| **Gemini** | CLI subprocess | No | Runs `gemini --prompt` with `--yolo` auto-approve |
-| **Ollama** | HTTP API | Yes (NDJSON) | POSTs to `localhost:11434/api/chat` via native fetch |
-| **Claude** | CLI subprocess | Yes (JSON) | Runs `claude` with sandbox-to-tool mapping |
-| **OpenCode** | CLI subprocess | Yes (NDJSON) | Runs `opencode run` with repo-local tool access |
+| Backend | Type | Streaming |
+|---------|------|-----------|
+| **Codex** | CLI subprocess | No |
+| **Gemini** | CLI subprocess | No |
+| **Ollama** | HTTP API | Yes (NDJSON) |
+| **Claude** | CLI subprocess | Yes (JSON) |
+| **OpenCode** | CLI subprocess | Yes (NDJSON) |
 Ollama configuration via environment variables:
 - `OLLAMA_HOST` -- custom host (default: `http://localhost:11434`)
 - `OLLAMA_MODEL` -- default model (overridden by `--model` flag)
 Phone-a-friend environment variables:
-- `PHONE_A_FRIEND_INCLUDE_DIFF=false` -- disable diff inclusion across every relay; equivalent to passing `--no-include-diff` on every command. Useful when `defaults.include_diff = true` in your config but you want a session without the diff. Also the canonical mechanism used by OpenCode shims for stale-binary compatibility (the `--no-include-diff` flag was added in v2.2.0+; older binaries reject it but accept this env var since v1.7.2).
-- `PHONE_A_FRIEND_HOST=opencode` -- mark the calling process as OpenCode for the recursion guard. Set automatically by the OpenCode shims; only relevant if you're invoking PaF programmatically from inside an OpenCode session.
+- `PHONE_A_FRIEND_INCLUDE_DIFF=false` -- disable diff inclusion globally (equivalent to `--no-include-diff` on every call).
+- `PHONE_A_FRIEND_HOST=opencode` -- mark the calling process as OpenCode for the recursion guard (set automatically by the OpenCode shims).
+- `PHONE_A_FRIEND_GEMINI_DEAD_CACHE=false` -- bypass the Gemini dead-model cache (debugging stale entries).
 OpenCode configuration via TOML:
 ```toml
@@ -272,14 +263,6 @@ phone-a-friend agentic dashboard              # default: localhost:7777
 phone-a-friend agentic dashboard --port 8080
 ```
-**How it works:**
-1. The orchestrator spawns each agent with the initial prompt and a unique name (e.g., `ada.reviewer`, `fern.critic`)
-2. Agents respond and `@mention` other agents (or `@all` / `@user`)
-3. The orchestrator routes messages to the targeted agents
-4. Agents reply in subsequent turns, building on accumulated context
-5. The session ends when agents converge (no new messages), hit the turn limit, or time out
 **What you get:**
 - **Live web dashboard** -- watch agents collaborate in real time at `localhost:7777` (SSE-powered)
@@ -297,11 +280,20 @@ Full usage guide, examples, CLI reference, and configuration details:
 ## Uninstall
+**npm install:**
 ```bash
 npm uninstall -g @freibergergarcia/phone-a-friend
 ```
-The Claude Code plugin and OpenCode commands/skills are removed automatically when installed through the CLI.
+Automatically removes the Claude Code plugin (CLI-installed), OpenCode commands and skills, and the `~/.config/phone-a-friend` directory (config, sessions, jobs).
+**Claude Code marketplace:**
+```
+/plugin uninstall phone-a-friend@phone-a-friend-marketplace
+/plugin marketplace remove phone-a-friend-marketplace
+```
 ## Contributing

package/commands/curiosity-engine.md CHANGED

@@ -351,28 +351,33 @@ Present the full session summary:
 <2-3 questions raised during the rally that weren't followed up on, worth exploring in a future session>
 ```
-## Gemini Model Priority
+## Gemini model selection
-When BACKEND=gemini, always pass `--model` using the first available model from this list:
+For BACKEND=gemini, **omit `--model` by default** and let Gemini CLI's
+auto-routing pick. Set `--model` only when reproducibility, specific
+capability, or debugging requires a pin.
-1. `gemini-2.5-flash` — reliable, confirmed working
-2. `gemini-2.5-pro` — higher capability, frequently at capacity (429)
-3. `gemini-2.5-flash-lite` — last resort
+**Binary mode** (preferred — when a pinned model returns a strong 404
+`ModelNotFoundError`, PaF caches it for 24h at
+`~/.config/phone-a-friend/gemini-models.json` and surfaces a clear error
+with cache path, expiry, and bypass instructions; no auto-substitution):
-When BACKEND=gemini, the relay command must include `--model`:
-**Binary mode:**
 ```bash
-phone-a-friend --to gemini --model gemini-2.5-flash --repo "$PWD" --sandbox read-only --fast $PAF_NO_DIFF --prompt "<relay-prompt>"
+phone-a-friend --to gemini --repo "$PWD" --sandbox read-only --fast $PAF_NO_DIFF --prompt "<relay-prompt>"
 ```
-**Direct mode:**
+To bypass the cache: `PHONE_A_FRIEND_GEMINI_DEAD_CACHE=false`. Or delete the
+cache file to clear it.
+**Direct mode** (no PaF wrapper — orchestrator handles retry):
 ```bash
-gemini --sandbox --yolo --include-directories "$PWD" --output-format text -m gemini-2.5-flash --prompt "<relay-prompt>"
+gemini --sandbox --yolo --include-directories "$PWD" --output-format text --prompt "<relay-prompt>"
 ```
-On capacity/transient errors (429, 500, 503), try the next model before treating as round failure.
-Do NOT use aliases like `auto`, `pro`, or `flash` — always use the full model name.
+In direct mode, on capacity/transient errors (429, 500, 503), retry with a
+different model before treating as round failure. On `ModelNotFoundError`,
+surface immediately. Either omit `--model` entirely or pass a concrete
+model name; do NOT use aliases like `auto`, `pro`, or `flash`.
 ## Constraints
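
Reviewer sketch (not part of the package): the new "omit `--model` by default" rule in the hunk above can be pictured as a small argv builder. The helper name `build_gemini_relay_argv` and the `no_diff_flag` parameter (standing in for `$PAF_NO_DIFF`) are hypothetical; only the flags themselves come from the documented command line.

```python
from typing import List, Optional

# Aliases the command docs say must not be passed as --model.
FORBIDDEN_ALIASES = {"auto", "pro", "flash"}


def build_gemini_relay_argv(
    prompt: str,
    repo: str,
    model: Optional[str] = None,
    no_diff_flag: str = "",
) -> List[str]:
    """Build a binary-mode argv; --model is added only for an explicit pin."""
    if model is not None and model in FORBIDDEN_ALIASES:
        raise ValueError(f"use a concrete model name, not the alias {model!r}")
    argv = ["phone-a-friend", "--to", "gemini", "--repo", repo,
            "--sandbox", "read-only", "--fast"]
    if no_diff_flag:  # hypothetical stand-in for $PAF_NO_DIFF, which may be empty
        argv.append(no_diff_flag)
    if model is not None:  # default path: let Gemini CLI auto-routing pick
        argv += ["--model", model]
    argv += ["--prompt", prompt]
    return argv
```

Calling it without `model` yields a command with no `--model` flag at all, which matches the 2.3.0 default; passing a concrete name such as `gemini-2.5-pro` pins it.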

package/commands/phone-a-friend.md CHANGED

@@ -169,9 +169,9 @@ I'm working on this task and got the above response. Please review it and return
    **Binary mode** (`RELAY_MODE = binary`):
    ```bash
    phone-a-friend --to codex --repo "$PWD" --prompt "<relay-prompt>" --context-text "<context-payload>" $PAF_NO_DIFF [--fast] [--session <id>]
-   # For gemini, always include --model (see "Gemini Model Priority" below).
+   # For gemini, omit --model by default (let auto-routing pick); see "Gemini model selection" below.
    # Do NOT pass --session to gemini — it will error (see "Session continuity" below):
-   phone-a-friend --to gemini --repo "$PWD" --prompt "<relay-prompt>" --context-text "<context-payload>" --model <model> $PAF_NO_DIFF [--fast]
+   phone-a-friend --to gemini --repo "$PWD" --prompt "<relay-prompt>" --context-text "<context-payload>" $PAF_NO_DIFF [--fast]
    ```
    `$PAF_NO_DIFF` comes from the probe in "Diff suppression" above. It
@@ -186,8 +186,8 @@ I'm working on this task and got the above response. Please review it and return
    ```bash
    # Codex:
    codex exec -C "$PWD" --skip-git-repo-check --sandbox read-only "<combined-prompt>" < /dev/null
-   # Gemini (always include -m, see "Gemini Model Priority" below):
-   gemini --sandbox --yolo --include-directories "$PWD" --output-format text -m <model> --prompt "<combined-prompt>"
+   # Gemini (omit -m for auto-routing; pin only when reproducibility/capability is needed):
+   gemini --sandbox --yolo --include-directories "$PWD" --output-format text --prompt "<combined-prompt>"
    ```
    In direct mode, build `<combined-prompt>` using the template from the
@@ -284,51 +284,55 @@ This is rarely the right move from inside a Claude Code conversation — the
 common case is `--session <label>` with a fresh label. Only use
 `--backend-session` when the user supplied a specific backend thread ID.
-## Gemini Model Priority
+## Gemini model selection
-When using `--to gemini`, **always** pass `--model` using the first model from
-this priority list. Never use aliases (`auto`, `pro`, `flash`) — use concrete
-model names only:
+By default, **omit `--model`** for `--to gemini` and let Gemini CLI's
+auto-routing pick. This mirrors how `--to codex` and `--to claude` are used
+in this command — the CLI's own default is the right default. Pinning
+`--model` ages docs poorly; auto-routing tracks deployed models for you.
-### Why we bypass auto-routing
+Set `--model` explicitly only when you need:
-Gemini CLI has built-in model fallback via auto mode, but it does NOT work in
-headless/non-interactive mode. `--yolo` (and `--approval-mode yolo`) only
-auto-approve tool calls, not model switch prompts. When Gemini hits a capacity
-error in headless mode, it tries to prompt for consent and fails
-(`google-gemini/gemini-cli#13561`). By passing `--model` explicitly, we bypass
-this broken behavior and handle retry/fallback ourselves.
+- **Reproducibility** — pinning produces deterministic behavior across runs.
+- **Capability** — a more capable model for a specific task (e.g.,
+  `--model gemini-2.5-pro` for a hard review, accepting more 429s).
+- **Debugging** — isolating model behavior from auto-routing changes.
-### Priority rationale
+### Cache-aware failure for explicit pins
-As of 2026-02-22, `gemini-3.1-pro-preview-*` models return 404 (not yet
-deployed) and `gemini-2.5-pro` is perpetually at capacity (429). Based on
-empirical testing across 10+ relay sessions, `gemini-2.5-flash` is the only
-model that reliably works. Lead with what works; fall forward to newer models
-as they become available.
+When you pin a model and it returns a strong 404 (`ModelNotFoundError`),
+PaF caches it as unavailable for 24h at
+`~/.config/phone-a-friend/gemini-models.json` and surfaces a clear error
+that includes the cache path, expiry timestamp, and bypass instructions.
+PaF does **not** auto-substitute another model — explicit pins surface
+explicit failures so the caller decides whether to retry, switch model,
+or omit `--model` and rely on auto-routing.
-1. `gemini-2.5-flash` — reliable, fast, confirmed working
-2. `gemini-2.5-pro` — higher capability but frequently at capacity (429)
-3. `gemini-2.5-flash-lite` — last resort
-4. `gemini-3.1-pro-preview-customtools` — not yet deployed (404 as of 2026-02-22)
-5. `gemini-3.1-pro-preview` — not yet deployed (404 as of 2026-02-22)
+What is and isn't cached:
-### Fallback rule
+- **Cached** (24h): strong 404 (`ModelNotFoundError` from gemini-cli's own classifier).
+- **Not cached**: ambiguous 404s (could be a missing project / file, not the model), 429 / RESOURCE_EXHAUSTED, authentication failures, any other error class.
+- **Not consulted**: when `--model` is unset (auto-routing), or during session resume (`--resume`).
-On Gemini relay failure, retry with the next model **only** for transient or
-capacity errors:
+To bypass the cache (debugging stale entries, testing recovery):
-- **Retry with next model**: HTTP 429, 499, 500, 503, 504; RESOURCE_EXHAUSTED;
-  "high demand"; model not found; transient/timeout errors
-- **Do NOT retry**: authentication failures, invalid arguments, prompt errors,
-  permission errors
-- **Default**: if an error cannot be confidently classified as transient, do
-  NOT model-fallback — report the error immediately
+```bash
+PHONE_A_FRIEND_GEMINI_DEAD_CACHE=false phone-a-friend --to gemini --model X --prompt "..."
+```
+Or delete `~/.config/phone-a-friend/gemini-models.json` to clear it.
+### Direct Gemini CLI mode
+When the orchestrator is calling `gemini` directly (no PaF wrapper), the
+dead-model cache does NOT apply — the orchestrator is responsible for any
+retry. Retry rules in direct mode:
-After exhausting all models, stop and report the error with the list of
-attempted models.
+- **Retry**: HTTP 429, 499, 500, 503, 504; RESOURCE_EXHAUSTED; transient/timeout errors.
+- **Do NOT retry**: authentication failures, invalid arguments, permission errors, model-not-found.
+- **Default**: if an error cannot be confidently classified as transient, surface it immediately.
-This does NOT apply to `--to codex`.
+This does NOT apply to `--to codex` or `--to claude`.
 ## Notes
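
Reviewer sketch (not the package's implementation): the dead-model cache described in the hunk above amounts to a small JSON file with a 24h expiry that is consulted only for pinned models and can be bypassed with `PHONE_A_FRIEND_GEMINI_DEAD_CACHE=false`. The on-disk layout below (`{model: {"expires_at": ...}}`) and the function names are assumptions made for illustration; the real format of `gemini-models.json` is not shown in this diff.

```python
import json
import os
import time
from pathlib import Path
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # the 24h window stated in the docs


def cache_enabled(env=os.environ) -> bool:
    # Documented bypass: PHONE_A_FRIEND_GEMINI_DEAD_CACHE=false
    return env.get("PHONE_A_FRIEND_GEMINI_DEAD_CACHE", "") != "false"


def load_cache(path: Path) -> dict:
    try:
        return json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}  # a missing or corrupt file behaves like an empty cache


def mark_dead(path: Path, model: str, now: Optional[float] = None) -> None:
    """Record a strong 404 (ModelNotFoundError) for a pinned model."""
    now = time.time() if now is None else now
    cache = load_cache(path)
    cache[model] = {"expires_at": now + TTL_SECONDS}  # assumed schema
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cache))


def is_dead(path: Path, model: Optional[str], env=os.environ,
            now: Optional[float] = None) -> bool:
    """True only when a pinned model has an unexpired dead entry."""
    if model is None or not cache_enabled(env):
        return False  # auto-routing and the bypass never consult the cache
    now = time.time() if now is None else now
    entry = load_cache(path).get(model)
    return entry is not None and entry["expires_at"] > now
```

Under these assumptions, deleting the file has the same effect as an expired entry, which is consistent with the "delete the cache file to clear it" instruction.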

package/commands/phone-a-team.md CHANGED

@@ -269,7 +269,7 @@ Resolution matrix:
 | Friend backend | Include when |
 |----------------|--------------|
 | `codex`        | `command -v codex` AND `codex --version` succeeds |
-| `gemini`       | `command -v gemini` succeeds (auth verified at first relay; transient errors handled by Gemini Model Priority retry rules) |
+| `gemini`       | `command -v gemini` succeeds (auth verified at first relay; transient errors handled by Gemini auto-routing) |
 | `ollama`       | `curl -sf "${OLLAMA_HOST:-http://localhost:11434}/api/tags"` succeeds AND parsed `models[]` has at least one entry |
 | `claude`       | `command -v claude` AND `claude --version` succeeds. Claude is excluded by default when this skill is running inside Claude Code (we are already orchestrating with Claude). Include only when the user explicitly asked for Claude in addition |
 | `opencode`     | `command -v opencode` succeeds AND the host is NOT OpenCode (`PHONE_A_FRIEND_HOST=opencode` means we are inside OpenCode; relaying back to opencode is blocked by the recursion guard regardless) |
@@ -525,7 +525,7 @@ Delegate the task to the backend via the relay. The lead's job is to
   "Direct call reference" section. If `--include-diff` is used, run
   `git diff HEAD` and append the output to the template's "Git Diff" section.
-  For gemini, always include `--model` / `-m` per the Gemini Model Priority section.
+  For gemini, omit `--model` by default and let auto-routing pick (see "Gemini model selection" section).
   For ollama, always include `--model` / model field using `OLLAMA_SELECTED_MODEL` from preflight.
 - **Both backends**: Relay to each backend (in parallel if using teams,
   sequentially otherwise). You may give them the same task or different
@@ -804,53 +804,45 @@ happened and whether the result is complete.
   targets *unsolicited* repo-content dumps, not user-requested
   diff-scoped reviews.
-## Gemini Model Priority
+## Gemini model selection
-When using `--to gemini` (including the gemini side of `--backend both`),
-**always** pass `--model` using the first model from this priority list. Never
-use aliases (`auto`, `pro`, `flash`) — use concrete model names only:
+For `--to gemini` (including the gemini side of `--backend both`), **omit
+`--model` by default** and let Gemini CLI's auto-routing pick. This mirrors
+how `--to codex` and `--to claude` are used in this command — the CLI's own
+default is the right default.
-### Why we bypass auto-routing
+Set `--model` explicitly only when reproducibility, specific capability, or
+debugging requires a pin.
-Gemini CLI has built-in model fallback via auto mode, but it does NOT work in
-headless/non-interactive mode. `--yolo` (and `--approval-mode yolo`) only
-auto-approve tool calls, not model switch prompts. When Gemini hits a capacity
-error in headless mode, it tries to prompt for consent and fails
-(`google-gemini/gemini-cli#13561`). By passing `--model` explicitly, we bypass
-this broken behavior and handle retry/fallback ourselves.
+### Cache-aware failure for explicit pins
-### Priority rationale
+PaF binary mode (`phone-a-friend --to gemini --model X`) caches strong 404s
+(`ModelNotFoundError`) at `~/.config/phone-a-friend/gemini-models.json` for
+24h and surfaces a clear error with the cache path, expiry, and bypass
+instructions. PaF does **not** auto-substitute another model — explicit
+pins surface explicit failures.
-As of 2026-02-22, `gemini-3.1-pro-preview-*` models return 404 (not yet
-deployed) and `gemini-2.5-pro` is perpetually at capacity (429). Based on
-empirical testing across 10+ relay sessions, `gemini-2.5-flash` is the only
-model that reliably works. Lead with what works; fall forward to newer models
-as they become available.
+What is and isn't cached:
-1. `gemini-2.5-flash` — reliable, fast, confirmed working
-2. `gemini-2.5-pro` — higher capability but frequently at capacity (429)
-3. `gemini-2.5-flash-lite` — last resort
-4. `gemini-3.1-pro-preview-customtools` — not yet deployed (404 as of 2026-02-22)
-5. `gemini-3.1-pro-preview` — not yet deployed (404 as of 2026-02-22)
+- **Cached** (24h): strong 404 (`ModelNotFoundError` from gemini-cli's own classifier).
+- **Not cached**: ambiguous 404s, 429 / RESOURCE_EXHAUSTED, authentication failures, any other error class.
+- **Not consulted**: when `--model` is unset (auto-routing), or during session resume.
-### Fallback rule
+To bypass the cache: `PHONE_A_FRIEND_GEMINI_DEAD_CACHE=false`. Or delete the
+cache file to clear it.
-On Gemini relay failure, retry with the next model **only** for transient or
-capacity errors:
+### Direct Gemini CLI mode
-- **Retry with next model**: HTTP 429, 499, 500, 503, 504; RESOURCE_EXHAUSTED;
-  "high demand"; model not found; transient/timeout errors
-- **Do NOT retry**: authentication failures, invalid arguments, prompt errors,
-  permission errors
-- **Default**: if an error cannot be confidently classified as transient, do
-  NOT model-fallback — treat as immediate round failure
+When invoking `gemini` directly (no PaF wrapper), the dead-model cache does
+NOT apply. Orchestrator-level retry rules:
-Model fallback happens **within the current round** (see Step 7 precedence
-rule). After exhausting all models in a round, escalate to round-level retry
-or stop per Step 7. Each new round resets to model #1.
+- **Retry**: HTTP 429, 499, 500, 503, 504; RESOURCE_EXHAUSTED; transient/timeout.
+- **Do NOT retry**: auth failures, invalid args, permission errors, model-not-found.
+- **Default**: surface unclassified errors immediately, do not loop.
-When reporting errors in synthesis, list all attempted models and the error
-from each.
+Round-level retry: each new round can attempt a different `--model` if the
+prior round failed (see Step 7). When reporting errors in synthesis, list
+the attempted models and each error.
 This does NOT apply to `--to codex` or `--to ollama`.
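
Reviewer sketch (not part of the package): the direct-mode retry rules added in this hunk reduce to a three-way classification in which fatal classes never retry, listed transient classes retry, and anything unclassified is surfaced immediately. The function name and the message substrings used as markers are hypothetical; real gemini-cli error text may differ.

```python
from typing import Optional

# HTTP statuses the docs list as retryable in direct mode.
RETRYABLE_STATUS = {429, 499, 500, 503, 504}
# Hypothetical message markers; actual error strings may differ.
RETRYABLE_MARKERS = ("resource_exhausted", "timeout", "timed out")
FATAL_MARKERS = ("modelnotfounderror", "authentication",
                 "permission", "invalid argument")


def should_retry(status: Optional[int], message: str) -> bool:
    """Return True only for errors confidently classified as transient."""
    text = message.lower()
    # Fatal classes win even when the status code looks transient.
    if any(marker in text for marker in FATAL_MARKERS):
        return False
    if status in RETRYABLE_STATUS:
        return True
    if any(marker in text for marker in RETRYABLE_MARKERS):
        return True
    return False  # unclassified: surface immediately, do not loop
```

Note the behavioral change this encodes relative to 2.2.0: model-not-found moved from the retry list to the do-not-retry list.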