llm-party-cli 0.7.1 → 0.10.1

package/README.md CHANGED
@@ -17,7 +17,7 @@
 
 <br/>
 
-A peer orchestrator that puts **Claude**, **Codex**, **Copilot**, and **GLM** in the same terminal. You talk, they listen. They talk to each other. Nobody is the boss except you.
+A peer orchestrator that puts **Claude**, **Codex**, **Copilot**, and any Claude-compatible API (GLM, Ollama, etc.) in the same terminal. You talk, they listen. They talk to each other. Nobody is the boss except you.
 
 ```
 YOU > @claude review this function
@@ -40,8 +40,9 @@ No MCP. No master/servant. No window juggling. Just peers at a terminal table.
 | ---------------------- | ------------------------------ | -------------------------------------- |
 | **Architecture** | MCP (master controls servants) | Peer orchestration (you control all) |
 | **Integration** | CLI wrapping, output scraping | Direct SDK adapters |
-| **Sessions** | Fresh each time | Persistent per provider |
+| **Sessions** | Fresh each time | Persistent per provider, resumable |
 | **Context** | Agents are siloed | Every agent sees the full conversation |
+| **Concurrency** | Sequential or blocked | Non-blocking per-agent queues |
 | **API tokens** | Separate keys per tool | Uses your existing CLI auth |
 
 <br/>
@@ -121,6 +122,10 @@ That's it. No paths, no prompts, no usernames to configure. Just name, tag, prov
 
 Agents can pass the conversation to each other by ending their response with `@next:<tag>`. The orchestrator picks it up and dispatches automatically. Max 15 hops per cycle to prevent loops.
 
+### Non-blocking queue
+
+You can type while agents are working. Each agent has its own queue. If an agent is busy when a new message arrives, the message is queued and processed when the agent finishes. No blocking, no waiting for slow agents to finish before fast ones can respond.
+
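The non-blocking queue added in this hunk can be sketched as a per-agent promise chain: a busy agent buffers incoming messages on its own tail while idle agents start immediately. This is an illustrative sketch, not the shipped llm-party implementation; all names are hypothetical.

```typescript
type Handler = (message: string) => Promise<void>;

// One AgentQueue per agent: enqueue() returns immediately, and each message
// runs only after the agent's earlier messages finish. A real implementation
// would also catch handler errors so one failure does not poison the chain.
class AgentQueue {
  private tail: Promise<void> = Promise.resolve();

  constructor(private handler: Handler) {}

  enqueue(message: string): Promise<void> {
    this.tail = this.tail.then(() => this.handler(message));
    return this.tail;
  }
}

// A slow agent never blocks a fast one, because the queues are independent.
const order: string[] = [];
const slow = new AgentQueue(async (m) => {
  await new Promise((r) => setTimeout(r, 50)); // simulate a slow model
  order.push(`slow:${m}`);
});
const fast = new AgentQueue(async (m) => {
  order.push(`fast:${m}`);
});

await Promise.all([slow.enqueue("hi"), fast.enqueue("hi")]);
// order now lists the fast agent's response before the slow agent's
```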
  ## **WARNING: FULL AUTONOMY.**
 
  All agents run with full permissions. They can read, write, edit files and execute shell commands with zero approval gates. There is no confirmation step before any action. Run in a disposable environment. You are responsible for any changes, data loss, costs, or side effects. Do not run against production systems.
@@ -145,18 +150,31 @@ llm-party uses **official, publicly available SDKs and CLIs** published by each
 | Codex | [`@openai/codex-sdk`](https://www.npmjs.com/package/@openai/codex-sdk) | OpenAI |
 | Copilot | [`@github/copilot-sdk`](https://www.npmjs.com/package/@github/copilot-sdk) | GitHub |
 
+Custom providers (GLM, Ollama, etc.) route through a native CLI's SDK with environment overrides. No additional SDKs are required.
+
 All authentication flows through the provider's own CLI. llm-party does not implement its own auth flow, store credentials, or intercept authentication traffic.
 
 <br/>
 
 ## Supported providers
 
+### Native providers (detected automatically)
+
 | Provider | SDK | Session | Prompt Support |
 | ----------------- | ---------------------------------- | -------------------------------------- | -------------------------------------------------- |
 | **Claude** | `@anthropic-ai/claude-agent-sdk` | Persistent via session ID resume | Full control |
 | **Codex** | `@openai/codex-sdk` | Persistent thread with `run()` turns | Via `developer_instructions` (limitations below) |
-| **Copilot** | `@github/copilot-sdk` | Persistent via `sendAndWait()` | Full control |
-| **GLM** | Claude SDK + env proxy | Same as Claude | Full control |
+| **Copilot** | `@github/copilot-sdk` | Persistent via session ID resume | Full control |
+
+### Custom providers (config-driven)
+
+Any AI that exposes a Claude-compatible API can be added as a custom provider. Custom providers route through a native CLI (currently Claude) with environment overrides.
+
+| Provider | API | Notes |
+| ----------------- | ---------------------------------- | -------------------------------------------------- |
+| **GLM** | Zhipu AI (`api.z.ai`) | Full Claude SDK compatibility via proxy |
+| **Ollama** | Local (`localhost:11434`) | Any model Ollama supports |
+| **Any** | Any Claude-compatible endpoint | Just set `AUTH_URL` and `AUTH_TOKEN` |
 
 <br/>
 
@@ -173,19 +191,23 @@ Terminal (you)
   |
   v
 Orchestrator
+  |
+  +-- Agent Queue Manager (per-agent queues, non-blocking dispatch)
   |
   +-- Agent Registry
   |     +-- Claude  -> ClaudeAdapter  (SDK session, resume by ID)
-  |     +-- Codex   -> CodexAdapter   (SDK thread, persistent turns)
-  |     +-- Copilot -> CopilotAdapter (SDK session, sendAndWait)
-  |     +-- GLM     -> GlmAdapter     (Claude SDK + env proxy)
+  |     +-- Codex   -> CodexAdapter   (SDK thread, resumeThread by ID)
+  |     +-- Copilot -> CopilotAdapter (SDK session, resumeSession by ID)
+  |     +-- Custom  -> CustomAdapter  (routes through native CLI + env override)
   |
   +-- Conversation Log (ordered, all messages, agent-prefixed)
   |
   +-- Transcript Writer (JSONL, append-only, per session)
+  |
+  +-- Session Manifest (per-agent cursors + SDK session IDs, for resume)
 ```
 
-Each agent receives a rolling window of recent messages (configurable, default 16) plus any unseen messages since its last turn. Messages from other agents are included so everyone sees the full multi-party conversation.
+Each agent has its own processing queue. When you send a message, idle agents start immediately while busy agents queue it. You can keep typing while agents work. Each agent receives only unseen messages since its last turn, so no duplicate processing on resume or during concurrent dispatch.
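The "only unseen messages" behavior described in this hunk amounts to a per-agent cursor over the shared conversation log. A minimal sketch, with hypothetical names (the actual adapter code may differ):

```typescript
interface Message { index: number; author: string; text: string }

// Shared log plus one cursor per agent: taking unseen messages advances the
// cursor, so neither a resume nor concurrent dispatch replays a message.
class ConversationLog {
  private messages: Message[] = [];
  private cursors = new Map<string, number>(); // agent tag -> next unseen index

  append(author: string, text: string): void {
    this.messages.push({ index: this.messages.length, author, text });
  }

  takeUnseen(agentTag: string): Message[] {
    const from = this.cursors.get(agentTag) ?? 0;
    const unseen = this.messages.slice(from);
    this.cursors.set(agentTag, this.messages.length);
    return unseen;
  }
}

const log = new ConversationLog();
log.append("you", "@claude review this function");
log.append("claude", "Done. @next:codex");
const firstBatch = log.takeUnseen("codex").length;  // both messages
const secondBatch = log.takeUnseen("codex").length; // nothing new
```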
 
  `~/.llm-party/config.json` is your global config. Every agent receives a base system prompt automatically. The `prompts` field in config adds extra prompt files on top of it.
 
@@ -207,7 +229,7 @@ Each agent receives a rolling window of recent messages (configurable, default 1
 | | |
 | ------- | ---------------------------------------------------------------------------------------------------------------- |
 | SDK | `@openai/codex-sdk` |
-| Session | Persistent thread.`startThread()` creates it, `thread.run()` adds turns to the same conversation. |
+| Session | Persistent thread. `startThread()` creates it, `resumeThread()` restores it. `thread.run()` adds turns. |
 | Prompt | Injected via `developer_instructions` config key. Appended alongside Codex's built-in 13k token system prompt. |
 | Tools | exec_command, apply_patch, file operations |
 
@@ -218,20 +240,20 @@ Each agent receives a rolling window of recent messages (configurable, default 1
 | | |
 | ------- | ----------------------------------------------------------- |
 | SDK | `@github/copilot-sdk` |
-| Session | Persistent via `CopilotClient.createSession()`. |
+| Session | Persistent via `createSession()` with session ID. Resumable via `resumeSession()`. |
 | Prompt | Set as `systemMessage` on session creation. Full control. |
 | Tools | Copilot built-in toolset |
 
-### GLM
+### Custom (GLM, Ollama, etc.)
 
-| | |
-| ------- | --------------------------------------------------------- |
-| SDK | `@anthropic-ai/claude-agent-sdk` (same as Claude) |
-| Session | Same as Claude, routed through a proxy via env overrides. |
-| Prompt | Same as Claude. Full control. |
-| Tools | Same as Claude |
+| | |
+| ------- | -------------------------------------------------------------------------------- |
+| SDK | Uses the native CLI's SDK (currently Claude's `@anthropic-ai/claude-agent-sdk`) |
+| Session | Same as the underlying CLI |
+| Prompt | Same as the underlying CLI. Full control. |
+| Tools | Same as the underlying CLI |
 
-GLM uses the Claude SDK as a transport layer. The adapter routes API calls through a proxy by setting `ANTHROPIC_BASE_URL` and model aliases via the `env` config field.
+Custom providers route API calls through a native CLI by overriding `AUTH_URL` and `AUTH_TOKEN` in the agent's `env` block. The `cli` field selects which native CLI to use (defaults to `claude`).
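The override described in this hunk can be sketched as a small env-translation step. Assuming the Claude CLI reads `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` (the variables the old GLM config set directly), the generic keys would map like this; the helper name is hypothetical, not the shipped adapter:

```typescript
// Translate the generic AUTH_URL / AUTH_TOKEN keys into the environment
// variables the selected native CLI understands. Only the Claude route is
// sketched here, mirroring the "currently Claude" note above.
function mapCustomEnv(
  cli: string,
  env: { AUTH_URL: string; AUTH_TOKEN: string },
): Record<string, string> {
  if (cli === "claude") {
    return {
      ANTHROPIC_BASE_URL: env.AUTH_URL,
      ANTHROPIC_AUTH_TOKEN: env.AUTH_TOKEN,
    };
  }
  throw new Error(`custom provider routing not implemented for cli "${cli}"`);
}

const glmEnv = mapCustomEnv("claude", {
  AUTH_URL: "https://api.z.ai/api/anthropic",
  AUTH_TOKEN: "your-glm-api-key",
});
```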
 
  <br/>
 
@@ -247,7 +269,7 @@ Override with `LLM_PARTY_CONFIG` env var to point to a different file.
 | --------------- | -------- | -------------------------- | ---------------------------------------------------------------------------- |
 | `humanName` | No | Your system username | Display name in the terminal prompt and passed to agents |
 | `humanTag` | No | derived from `humanName` | Tag for human handoff detection (`@next:you`) |
-| `maxAutoHops` | No | `15` | Max agent-to-agent handoffs per cycle. Use `"unlimited"` to remove the cap |
+| `maxAutoHops` | No | `15` | Max agent-to-agent handoffs per cycle. `0` = unlimited |
 | `timeout` | No | `600` | Default timeout in seconds for all agents |
 | `agents` | Yes | | Array of agent definitions |
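The handoff detection and the `maxAutoHops` cap can be sketched together; this is an illustrative sketch under the rules stated in the README (`@next:<tag>` at the end of a response, `0` meaning unlimited), with hypothetical function names:

```typescript
// A response hands off by ending with `@next:<tag>`; tags are letters,
// numbers, hyphens, and underscores only, matching the tag rules above.
const HANDOFF = /@next:([A-Za-z0-9_-]+)\s*$/;

function nextTarget(response: string): string | null {
  const match = response.match(HANDOFF);
  return match ? match[1] : null;
}

// Dispatch continues only while the hop budget allows; 0 removes the cap.
function mayHop(hopsSoFar: number, maxAutoHops: number): boolean {
  return maxAutoHops === 0 || hopsSoFar < maxAutoHops;
}
```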
 
@@ -257,7 +279,9 @@ Override with `LLM_PARTY_CONFIG` env var to point to a different file.
 | ------------------ | -------- | ------------------------ | --------------------------------------------------------------------------------------------------- |
 | `name` | Yes | | Display name shown in responses as `[AGENT NAME]`. Must be unique. |
 | `tag` | Yes | | Routing tag for `@tag` targeting. Letters, numbers, hyphens, underscores only. No spaces. |
-| `provider` | Yes | | SDK adapter:`claude`, `codex`, `copilot`, or `glm` |
+| `provider` | Yes | | SDK adapter: `claude`, `codex`, `copilot`, or `custom` |
+| `cli` | No | `"claude"` | For custom providers: which native CLI to route through |
+| `active` | No | `true` | Set to `false` to disable an agent without removing its config |
 | `model` | Yes | | Model ID passed to the provider. Examples:`opus`, `sonnet`, `gpt-5.2`, `gpt-4.1`, `glm-5` |
 | `prompts` | No | none | Array of extra prompt file paths, concatenated after `base.md`. Relative to project root |
 | `executablePath` | No | PATH lookup | Path to the CLI binary. Supports `~/`. Only needed if the CLI is not in your PATH |
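A minimal `agents` array using the fields above, including the new `active` flag; the values are illustrative:

```json
{
  "agents": [
    {
      "name": "Claude",
      "tag": "claude",
      "provider": "claude",
      "model": "sonnet"
    },
    {
      "name": "Codex",
      "tag": "codex",
      "provider": "codex",
      "model": "gpt-5.2",
      "active": false
    }
  ]
}
```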
@@ -294,25 +318,42 @@ Template variables available in prompt files:
 | `{{validHandoffTargets}}` | Valid `@next:tag` targets |
 | `{{preloadedSkills}}` | Skills assigned to this agent via `preloadSkills` |
 
-### GLM environment setup
+### Custom provider setup
+
+Custom providers use `AUTH_URL` and `AUTH_TOKEN` in the `env` block. The adapter maps these to the correct environment variables for the underlying CLI.
 
-GLM requires environment overrides to route through a proxy. The adapter tries to load env variables from your shell `glm` alias automatically. Without the alias, provide everything in the `env` block:
+**GLM (Zhipu AI):**
 
 ```json
 {
   "name": "GLM",
-  "provider": "glm",
+  "provider": "custom",
+  "cli": "claude",
   "model": "glm-5",
   "env": {
-    "ANTHROPIC_AUTH_TOKEN": "your-glm-api-key",
-    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
-    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.5-air",
-    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-4.5",
-    "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-5"
+    "AUTH_URL": "https://api.z.ai/api/anthropic",
+    "AUTH_TOKEN": "your-glm-api-key"
   }
 }
 ```
 
+**Ollama (local):**
+
+```json
+{
+  "name": "Ollama",
+  "provider": "custom",
+  "cli": "claude",
+  "model": "llama3",
+  "env": {
+    "AUTH_URL": "http://localhost:11434/v1",
+    "AUTH_TOKEN": "ollama"
+  }
+}
+```
+
+Any endpoint that speaks the Anthropic API protocol works. Set `AUTH_URL` to the base URL and `AUTH_TOKEN` to the API key.
+
  <br/>
 
  ## Skills
@@ -353,6 +394,26 @@ Every run generates a unique session ID and appends messages to a JSONL transcri
 
 File changes made by agents are detected via `git status` after each response. Newly modified files are printed with timestamps.
 
+### Resume a session
+
+Pick up where you left off by passing the session ID:
+
+```bash
+llm-party --resume 20260402-102915-74722-ba956b96
+```
+
+Or use the `/resume` command as your first input in a fresh session:
+
+```
+/resume 20260402-102915-74722-ba956b96
+```
+
+The session ID is shown at startup or via `/session`. Resume loads the full transcript, restores per-agent SDK sessions (Claude session IDs, Codex thread IDs, Copilot session IDs), and tracks which messages each agent has already seen. Agents pick up exactly where they left off with no duplicate processing.
+
+A `.manifest.json` file alongside each transcript stores the session state: agent cursors, SDK session IDs, sticky targets. This is what makes cross-provider resume possible.
+
+Resume only works before the first message is sent. Once a conversation has started, resuming another session into it is not allowed.
+
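Based on the description above (agent cursors, SDK session IDs, sticky targets), a manifest might look roughly like this; the field names are illustrative guesses, not the actual on-disk schema:

```json
{
  "sessionId": "20260402-102915-74722-ba956b96",
  "agents": {
    "claude": { "cursor": 12, "sdkSessionId": "claude-session-id-here" },
    "codex": { "cursor": 10, "threadId": "codex-thread-id-here" },
    "copilot": { "cursor": 11, "sdkSessionId": "copilot-session-id-here" }
  },
  "stickyTargets": []
}
```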
  <br/>
 
  ## Terminal commands
@@ -364,6 +425,7 @@ File changes made by agents are detected via `git status` after each response. N
 | `/info` | Commands and keyboard shortcuts panel |
 | `/save <path>` | Export conversation as JSON |
 | `/session` | Show session ID and transcript path |
+| `/resume <id>` | Resume a previous session (first message only) |
 | `/changes` | Show git-modified files |
 | `/clear` | Clear chat display (Ctrl+L also works) |
 | `/exit` | Quit (graceful shutdown, all adapters cleaned up) |
@@ -409,7 +471,7 @@ LLM_PARTY_CONFIG=/path/to/config.json bun run dev
 Run `/agents` to see available tags. Tags match against agent `tag`, `name`, and `provider`.
 
 **"Unsupported provider"**
-Valid providers: `claude`, `codex`, `copilot`, `glm`.
+Valid providers: `claude`, `codex`, `copilot`, `custom`.
 
 **"Duplicate agent name"**
 Agent names must be unique (case-insensitive). Rename one of the duplicates in config.