moa-cli 0.1.0__tar.gz → 0.2.1__tar.gz
- moa_cli-0.2.1/PKG-INFO +242 -0
- moa_cli-0.2.1/README.md +231 -0
- {moa_cli-0.1.0 → moa_cli-0.2.1}/pyproject.toml +1 -1
- {moa_cli-0.1.0 → moa_cli-0.2.1}/src/moa_cli/__init__.py +1 -1
- moa_cli-0.2.1/src/moa_cli/cli.py +1356 -0
- moa_cli-0.1.0/PKG-INFO +0 -127
- moa_cli-0.1.0/README.md +0 -116
- moa_cli-0.1.0/src/moa_cli/cli.py +0 -543
moa_cli-0.2.1/PKG-INFO
ADDED
|
@@ -0,0 +1,242 @@
|
|
|
1
|
+
Metadata-Version: 2.3
Name: moa-cli
Version: 0.2.1
Summary: Ask one question to multiple local AI coding CLIs in parallel and collect their answers.
Keywords: llm,agents,cli,claude,codex,agy,opencode,peer-review
Author: Paul-Louis Pröve
Author-email: Paul-Louis Pröve <plp@workgenius.com>
Requires-Dist: typer>=0.25.0
Requires-Python: >=3.12
Description-Content-Type: text/markdown
|
|
11
|
+
|
|
12
|
+
<p align="center">
<img src="assets/logo-full-white.png" alt="moa - mixture of agents" width="360">
</p>

<p align="center">
<a href="https://github.com/pietz/moa-cli/actions/workflows/ci.yml"><img src="https://github.com/pietz/moa-cli/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
</p>
|
|
19
|
+
|
|
20
|
+
# MOA - Mixture of Agents
|
|
21
|
+
|
|
22
|
+
Ask one question to multiple local AI coding CLIs **in parallel** and collect their answers. MOA detects which agent CLIs you have installed (Claude Code, Codex, agy, opencode), fans your prompt out to them, and streams each answer back the moment that agent finishes. Or run `moa distill` to have a strong aggregator merge those answers into a single unified response, or `moa debate` to have them critique each other across rounds before a neutral judge gives the verdict.
|
|
23
|
+
|
|
24
|
+
It's a drop-in, batteries-included replacement for hand-rolling parallel `claude -p` / `codex exec` / `opencode run` calls (or a "peer review" agent skill): one command, clean attributed output, made to be called by a human **or** by another agent.
|
|
25
|
+
|
|
26
|
+
The package is named `moa-cli` but installs the command `moa`.
|
|
27
|
+
|
|
28
|
+
```bash
uv tool install moa-cli
moa ask "Is Postgres or SQLite better for a desktop app?"
```
|
|
32
|
+
|
|
33
|
+
Or run it once without installing:
|
|
34
|
+
|
|
35
|
+
```bash
uvx --from moa-cli moa ask "Review this plan."
```
|
|
38
|
+
|
|
39
|
+
## Why
|
|
40
|
+
|
|
41
|
+
A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
|
|
42
|
+
|
|
43
|
+
## Usage
|
|
44
|
+
|
|
45
|
+
MOA has three prompt verbs that share the same selection/output options:
|
|
46
|
+
|
|
47
|
+
- **`moa ask PROMPT`** - council / peer review: N agents answer the same prompt in parallel; every answer is returned with attribution, streamed as it lands.
|
|
48
|
+
- **`moa distill PROMPT`** - synthesis: run the council, then one strong aggregator merges the answers into a single unified response.
|
|
49
|
+
- **`moa debate PROMPT`** - sequential debate: two debaters answer and adversarially critique each other across rounds, then a separate neutral judge writes the final verdict. The costliest mode; read the caveats below before reaching for it.
|
|
50
|
+
|
|
51
|
+
```bash
moa doctor                                    # show installed CLIs and their default models
moa ask "Should this feature use SQLite?"     # ask the top 3 installed agents (read-only)
moa ask -n 2 "..."                            # ask only the top 2 (priority order)
moa ask -p claude -p agy "..."                # pin specific agents
moa ask -x claude "..."                       # drop an agent (e.g. exclude the caller's own model)
moa ask -m claude=sonnet "..."                # override which model a tool uses
moa ask --yolo "..."                          # grant full write access (default is read-only)
moa ask --json "..."                          # machine-readable JSONL (for agents/pipes)
git diff | moa ask -f - "Review this diff."   # read the prompt from stdin
moa distill "Design a rate limiter."          # council, then merge into one answer
moa distill -s codex "..."                    # pick who distills (auto | random | provider)
moa debate "Is this race condition real?"     # 2 debaters + a judge (default n=3)
moa debate -r 3 "..."                         # more rounds (default 2, hard max 4)
moa debate -j claude "..."                    # pin who judges (must not be a debater)
```
|
|
67
|
+
|
|
68
|
+
The shared options (`-n/--num`, `-p/--provider`, `-x/--exclude`, `-m/--model`, `-t/--timeout`, `-f/--file`, `--json`, `--yolo`) work identically on all three verbs. `distill` adds `-s/--synthesizer`; `debate` adds `-r/--rounds` and `-j/--judge`.
|
|
69
|
+
|
|
70
|
+
### Read-only by default
|
|
71
|
+
|
|
72
|
+
MOA is built to be called autonomously, so by default each agent runs in its tool's safest mode: it may read local files (and, where the tool allows, research online), but it is **not given permission to write files or run mutating commands**. The one exception is `agy`, which has no true read-only mode and can still edit files (see the notes after the table). The defaults are enforced by spawning each CLI with its own restrictive flags:
|
|
76
|
+
|
|
77
|
+
| Provider   | Read-only (default)                                      | Reads files | Web research                    |
| ---------- | -------------------------------------------------------- | ----------- | ------------------------------- |
| `claude`   | `--permission-mode plan`                                 | yes         | yes                             |
| `codex`    | `-s read-only`                                           | yes         | **no** (sandbox blocks network) |
| `opencode` | `--agent plan`                                           | yes         | yes                             |
| `agy`      | `--sandbox` (partial: shell only - can still edit files) | yes         | yes                             |
|
|
83
|
+
|
|
84
|
+
`codex`'s read-only mode is a kernel sandbox that also blocks network access, so codex does no web research in the default mode (it still reads local files). `agy` has **no true read-only mode**: its `--sandbox` flag restricts agy's terminal/shell but does **not** stop its `write_file` tool, so agy **can still edit files** even in the default mode. This is **partial** protection (it closes the shell vector only), not read-only. MOA applies `--sandbox` as the next-best safeguard, and the selection note on stderr states plainly that `agy` is shell-sandboxed but can still edit files.
|
|
91
|
+
|
|
92
|
+
### `--yolo` (full write access)
|
|
93
|
+
|
|
94
|
+
Pass `--yolo` to grant every agent full write access (file edits and shell commands, auto-approved). Use it only when you actually want the agents to change your working tree.

```bash
moa ask --yolo "Refactor this module and run the tests."
```

For `agy`, `--yolo` means dropping `--sandbox`, so it runs with no shell restrictions at all. In the default mode, `agy` runs with `--sandbox` (partial protection: shell only - it can still edit files), and MOA says so on stderr.
|
|
105
|
+
|
|
106
|
+
### How agents are selected
|
|
107
|
+
|
|
108
|
+
`-n/--num` (default 3) picks the first N **installed** agents from a popularity-ordered priority list:
|
|
109
|
+
|
|
110
|
+
```
claude -> codex -> agy -> opencode
```
|
|
113
|
+
|
|
114
|
+
So `moa ask -n 3` on a machine with all four installed asks Claude, Codex, and agy (opencode is #4). `agy` has no true read-only mode, so in the default mode it runs with `--sandbox` (partial protection: shell only - it can still edit files) and MOA flags that with an honest note on stderr; it is **not** excluded. Use `-p/--provider` (repeatable) to pin an exact set and ignore `-n`.
|
|
115
|
+
|
|
116
|
+
Use `-x/--exclude` (repeatable) to drop one or more agents from the run. Exclusion is applied *before* `-n` takes the first N, and it also drops excluded names from an explicit `-p` set. It is off by default. The motivating case: an agent (e.g. Claude Code) calls `moa` for *other* opinions; `moa ask -x claude` makes sure one "peer" isn't just the caller's own model. So `moa ask -n 3 -x claude` asks Codex, agy, and opencode.
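
The selection rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in `src/moa_cli/cli.py`: the function name `select_agents` and its parameters are hypothetical, and it assumes a pinned `-p` set is used in the order given and that uninstalled pinned providers are handled elsewhere.

```python
# Priority order as listed in the README.
PRIORITY = ["claude", "codex", "agy", "opencode"]


def select_agents(installed, num=3, providers=None, exclude=None):
    """Return the agents to ask (hypothetical helper, for illustration)."""
    excluded = set(exclude or [])
    if providers:
        # An explicit -p set ignores -n, but excluded names are still dropped.
        return [p for p in providers if p not in excluded]
    # Exclusion is applied before -n takes the first N installed agents.
    candidates = [p for p in PRIORITY if p in installed and p not in excluded]
    return candidates[:num]


everything = set(PRIORITY)
print(select_agents(everything))                      # ['claude', 'codex', 'agy']
print(select_agents(everything, exclude=["claude"]))  # ['codex', 'agy', 'opencode']
```

Filtering before slicing is what makes `-n 3 -x claude` still return three agents when a fourth is installed.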
|
|
117
|
+
|
|
118
|
+
### Choosing models
|
|
119
|
+
|
|
120
|
+
Each tool ships with a reasonable default model, but you can override which model any tool uses with `-m/--model PROVIDER=MODEL` (repeatable). Only the providers you name change; the rest keep their defaults.
|
|
121
|
+
|
|
122
|
+
```bash
moa ask -m claude=sonnet -m agy="Gemini 3.1 Pro (Low)" "..."
```
|
|
125
|
+
|
|
126
|
+
The model-string format differs per tool and is passed through verbatim (the tool's own CLI validates it):
|
|
127
|
+
|
|
128
|
+
| Provider   | Default                 | `-m` format                                                      |
| ---------- | ----------------------- | ---------------------------------------------------------------- |
| `claude`   | `opus`                  | short id, e.g. `claude=sonnet`                                   |
| `codex`    | `gpt-5.5`               | model id, e.g. `codex=gpt-5.5`                                   |
| `agy`      | `Gemini 3.1 Pro (High)` | exact display name, e.g. `agy="Gemini 3.1 Pro (Low)"`            |
| `opencode` | (tool's authed default) | `provider/model` slug, e.g. `opencode=anthropic/claude-sonnet-4` |
|
|
134
|
+
|
|
135
|
+
`opencode` has no built-in default; without an override it omits `-m` and lets opencode pick. Pass `-m opencode=provider/model` to pin one.
|
|
136
|
+
|
|
137
|
+
### Configuration
|
|
138
|
+
|
|
139
|
+
To avoid repeating the same flags on every call, persist your own defaults in a config file. MOA reads it for every verb and merges it under your flags.
|
|
140
|
+
|
|
141
|
+
**Location.** `~/.moa/config.toml` (the dir is created on first write). Set `$MOA_CONFIG_DIR` to point the whole config layer somewhere else (useful in tests/CI).
|
|
142
|
+
|
|
143
|
+
**Precedence.** `built-in default < config file < CLI flag`. A flag always wins; the config file only changes a default when that flag is omitted; an absent file means today's built-in behaviour.
|
|
144
|
+
|
|
145
|
+
**Keys** (all shared across `ask`/`distill`/`debate`):
|
|
146
|
+
|
|
147
|
+
| Key           | Type                     | Example                 |
| ------------- | ------------------------ | ----------------------- |
| `num`         | int (>= 1)               | `num = 2`               |
| `timeout`     | seconds (> 0)            | `timeout = 120`         |
| `exclude`     | list of provider names   | `exclude = ["claude"]`  |
| `synthesizer` | `auto`/`random`/provider | `synthesizer = "codex"` |
| `[models]`    | provider -> model table  | `claude = "sonnet"`     |
|
|
154
|
+
|
|
155
|
+
```toml
# ~/.moa/config.toml
num = 2
timeout = 120
exclude = ["claude"]
synthesizer = "auto"

[models]
claude = "sonnet"
agy = "Gemini 3.1 Pro (Low)"
```
|
|
166
|
+
|
|
167
|
+
**`moa config`** inspects and edits the file (it creates the dir/file as needed and validates provider names):
|
|
168
|
+
|
|
169
|
+
```bash
moa config show                       # effective config (defaults + file) + path
moa config path                       # print the config file path
moa config set num 2                  # set a scalar
moa config set exclude claude,codex   # set the exclude list (comma-separated)
moa config set model claude=sonnet    # set one entry in [models]
moa config unset num                  # remove a key
moa config unset model claude         # remove one [models] entry
```
|
|
178
|
+
|
|
179
|
+
The synthesizer default is persistable too (e.g. `moa config set synthesizer codex`); `debate`'s `-r/--rounds` and `-j/--judge` are not persisted. CLI `-m` overrides win per-provider over the config `[models]` table.
|
|
180
|
+
|
|
181
|
+
### Output
|
|
182
|
+
|
|
183
|
+
- **stdout** carries only content: each agent's answer is fronted by a centered separator rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines around it for clear separation, flushed the instant that agent finishes. `moa distill` then appends the merged block (`──── synthesis · via claude · OK · ... ────`) once the aggregator finishes.
|
|
184
|
+
- **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
|
|
185
|
+
- `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill` then adds a `{"type": "synthesis", ...}` record. `debate` instead emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
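
A caller that consumes the `--json` stream might group records by their `type` field, as in the sketch below. Only the `type` and `round` fields are documented above; the helper name `group_records` is hypothetical, and the sample lines are made up for illustration rather than captured from a real run.

```python
import json
from collections import defaultdict


def group_records(lines):
    """Group JSONL lines by their "type" field, skipping blank lines."""
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        grouped[record.get("type", "unknown")].append(record)
    return dict(grouped)


# Illustrative input shaped like `moa debate --json` output.
sample = [
    '{"type": "debate_turn", "round": 1}',
    '{"type": "debate_turn", "round": 2}',
    '{"type": "verdict"}',
]
records = group_records(sample)
print(sorted(records))  # ['debate_turn', 'verdict']
```

Because each line is a complete JSON object, the same loop works on a live pipe, handling records as they arrive instead of waiting for the run to finish.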
|
|
186
|
+
|
|
187
|
+
### `moa distill` (synthesis)
|
|
188
|
+
|
|
189
|
+
`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it streams what it has and skips the merge. The aggregator is chosen with `-s/--synthesizer`:
|
|
190
|
+
|
|
191
|
+
- `auto` (default) - the highest-priority agent that ran (deterministic)
|
|
192
|
+
- `random` - pick one of the agents that ran, at random
|
|
193
|
+
- a provider name (`claude`, `codex`, `agy`, `opencode`)
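
The three `-s/--synthesizer` choices can be sketched as a small function. This is a hypothetical illustration (`pick_synthesizer` is not a documented MOA function), and it assumes a named provider is passed through unchanged without checking whether it took part in the council.

```python
import random

# Priority order as listed in the README.
PRIORITY = ["claude", "codex", "agy", "opencode"]


def pick_synthesizer(ran, choice="auto", rng=None):
    """Pick the aggregator from the list of agents that ran."""
    if choice == "auto":
        # Deterministic: the highest-priority agent among those that ran.
        return min(ran, key=PRIORITY.index)
    if choice == "random":
        return (rng or random).choice(ran)
    return choice  # an explicit provider name


print(pick_synthesizer(["agy", "codex"]))  # codex
```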
|
|
194
|
+
|
|
195
|
+
The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synthesize" prompt (Wang et al. 2024): it tells the aggregator to critically evaluate the inputs (some may be biased or incorrect) and not to simply replicate them but offer a refined, accurate, comprehensive reply.
|
|
196
|
+
|
|
197
|
+
### `moa debate` (sequential debate + neutral judge)
|
|
198
|
+
|
|
199
|
+
`debate` is the opt-in, highest-cost mode. Instead of fanning out in parallel, it runs a sequential, adversarial exchange and then asks a **separate neutral judge** to write the final answer.
|
|
200
|
+
|
|
201
|
+
**Roles.** By default the top **2** selected agents are the debaters and the **3rd** is the judge - so the default `-n 3` maps to *2 debaters + 1 judge*. Pin a specific judge with `-j/--judge PROVIDER`; the judge must be one of the selected agents and must **not** also be a debater. Debate needs at least 2 debaters and 1 distinct judge, so it needs at least 3 agents; with fewer it exits with a clear message rather than silently degrading.
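
A rough sketch of the role assignment follows. The helper `assign_roles` is hypothetical, and one detail is an assumption rather than documented behaviour: when a judge is pinned with `-j`, the sketch removes it from the selected list and takes the top two remaining agents as debaters.

```python
def assign_roles(selected, judge=None):
    """Split selected agents into (debaters, judge) or raise ValueError."""
    if judge is None:
        if len(selected) < 3:
            raise ValueError("debate needs 2 debaters and 1 distinct judge")
        return selected[:2], selected[2]
    if judge not in selected:
        raise ValueError("the judge must be one of the selected agents")
    # Assumption: a pinned judge is taken out of the debater pool.
    debaters = [agent for agent in selected if agent != judge][:2]
    if len(debaters) < 2:
        raise ValueError("debate needs 2 debaters and 1 distinct judge")
    return debaters, judge


print(assign_roles(["claude", "codex", "agy"]))  # (['claude', 'codex'], 'agy')
```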
|
|
202
|
+
|
|
203
|
+
**Rounds.** `-r/--rounds` defaults to **2** (gains plateau around 2-3 rounds while token cost grows multiplicatively) and is hard-capped at **4** - higher values are clamped with a warning on stderr.
|
|
204
|
+
|
|
205
|
+
**The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
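
The early-stop check described above might look like the sketch below. The function name `all_unchanged` is hypothetical, and the matching rules (leading whitespace ignored, case-insensitive prefix match) are assumptions; the README only says a debater may open its reply with `NO SUBSTANTIVE CHANGE`.

```python
MARKER = "NO SUBSTANTIVE CHANGE"


def all_unchanged(replies):
    """True when every debater's reply for the round opens with the marker."""
    # An empty round never counts as agreement.
    return bool(replies) and all(
        reply.lstrip().upper().startswith(MARKER) for reply in replies
    )


print(all_unchanged(["NO SUBSTANTIVE CHANGE. I stand by my answer.",
                     "  No substantive change."]))  # True
```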
|
|
206
|
+
|
|
207
|
+
**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed, per item 002) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
|
|
208
|
+
|
|
209
|
+
**Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
|
|
210
|
+
|
|
211
|
+
**Safety.** Debaters and the judge run in the same read-only (or `--yolo`) mode as the other verbs - there is no permission bypass. agy's partial-sandbox caveat (shell only; it can still edit files) applies here too.
|
|
212
|
+
|
|
213
|
+
> **Caveat - use sparingly.** Debate is the costliest mode (roughly `debaters x rounds + 1` model calls) **and the least reliably beneficial.** The research is mixed-to-negative: multi-agent debate can converge on a *wrong* answer through conformity, a confident-but-incorrect debater can win on persuasiveness over correctness, and more rounds can entrench an error rather than fix it. The separate neutral judge and the adversarial-stance prompt are there to fight these failure modes, but they do not eliminate them. For most questions, `ask` or `distill` is the better default; reach for `debate` when you specifically want to surface and stress-test disagreement. (See *Can LLM Agents Really Debate?* arXiv:2511.07784, *Talk Isn't Always Cheap* arXiv:2509.05396, and the conformity/position-bias work cited in the design notes.)
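
To make the rough cost estimate concrete, the sketch below computes the upper bound on model calls from the `debaters x rounds + 1` rule of thumb, with the hard cap of 4 rounds applied. The function names are hypothetical, and the result is an upper bound because a debate can stop early.

```python
MAX_ROUNDS = 4  # hard cap stated in the README


def clamp_rounds(requested):
    """Apply the hard cap on debate rounds."""
    return min(requested, MAX_ROUNDS)


def max_model_calls(debaters=2, rounds=2):
    """Upper bound: one call per debater per round, plus one judge call."""
    return debaters * clamp_rounds(rounds) + 1


print(max_model_calls())      # 5  (defaults: 2 debaters, 2 rounds)
print(max_model_calls(2, 7))  # 9  (rounds clamped to 4)
```

By comparison, a default `ask` makes 3 calls and a default `distill` makes 4 (three proposers plus the aggregator).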
|
|
214
|
+
|
|
215
|
+
### Attribution policy
|
|
216
|
+
|
|
217
|
+
The human (or agent) reading MOA's output **always gets correct attribution**: every response block shows the real provider name. There is no human-facing anonymization toggle.
|
|
218
|
+
|
|
219
|
+
The `distill` aggregator is a different story. To stop it picking favourites by brand, it **always** receives the proposer answers anonymized as "Response A / B / C" and order-shuffled (no toggle). The merged answer itself is brand-agnostic prose, and the A/B/C labels never leak into stdout, stderr, or the JSON.
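
The anonymize-and-shuffle step can be illustrated as below. This is a sketch of the idea, not MOA's implementation: `anonymize` is a hypothetical name, and the exact block layout (`Response A:` followed by the text) is an assumption.

```python
import random
import string


def anonymize(answers, rng=None):
    """Shuffle provider answers and relabel them Response A, B, C, ...

    answers maps provider name -> answer text. Returns (blocks, key), where
    key maps each letter back to its provider and stays private to the caller.
    """
    rng = rng or random.Random()
    items = list(answers.items())
    rng.shuffle(items)  # order-shuffle so position carries no brand signal
    blocks, key = [], {}
    for label, (provider, text) in zip(string.ascii_uppercase, items):
        blocks.append(f"Response {label}:\n{text}")
        key[label] = provider
    return blocks, key


blocks, key = anonymize({"claude": "Use SQLite.", "codex": "Use Postgres."})
print("\n\n".join(blocks))
```

Only `blocks` would go into the aggregator prompt; keeping `key` on the caller's side is what stops the labels from leaking into the output.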
|
|
220
|
+
|
|
221
|
+
## Supported agents
|
|
222
|
+
|
|
223
|
+
Invocations below show the default (read-only) flags; `--yolo` swaps in each tool's full-access mode.
|
|
224
|
+
|
|
225
|
+
| Provider   | CLI        | Invocation (read-only default)                                                                         |
| ---------- | ---------- | ------------------------------------------------------------------------------------------------------ |
| `claude`   | `claude`   | `claude --model opus --permission-mode plan -p PROMPT`                                                 |
| `codex`    | `codex`    | `codex exec -m gpt-5.5 --skip-git-repo-check -s read-only PROMPT`                                      |
| `agy`      | `agy`      | `agy --sandbox --model "Gemini 3.1 Pro (High)" -p PROMPT` (partial: shell only - can still edit files) |
| `opencode` | `opencode` | `opencode run --agent plan PROMPT`                                                                     |
|
|
231
|
+
|
|
232
|
+
Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
|
|
233
|
+
|
|
234
|
+
## Development
|
|
235
|
+
|
|
236
|
+
```bash
uv sync
uv run pytest
uv run ruff check src tests
```
|
|
241
|
+
|
|
242
|
+
MIT licensed.
|
moa_cli-0.2.1/README.md
ADDED
|
@@ -0,0 +1,231 @@
|
|
|
1
|
+
<p align="center">
<img src="assets/logo-full-white.png" alt="moa - mixture of agents" width="360">
</p>

<p align="center">
<a href="https://github.com/pietz/moa-cli/actions/workflows/ci.yml"><img src="https://github.com/pietz/moa-cli/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
</p>
|
|
8
|
+
|
|
9
|
+
# MOA - Mixture of Agents
|
|
10
|
+
|
|
11
|
+
Ask one question to multiple local AI coding CLIs **in parallel** and collect their answers. MOA detects which agent CLIs you have installed (Claude Code, Codex, agy, opencode), fans your prompt out to them, and streams each answer back the moment that agent finishes. Or run `moa distill` to have a strong aggregator merge those answers into a single unified response, or `moa debate` to have them critique each other across rounds before a neutral judge gives the verdict.
|
|
12
|
+
|
|
13
|
+
It's a drop-in, batteries-included replacement for hand-rolling parallel `claude -p` / `codex exec` / `opencode run` calls (or a "peer review" agent skill): one command, clean attributed output, made to be called by a human **or** by another agent.
|
|
14
|
+
|
|
15
|
+
The package is named `moa-cli` but installs the command `moa`.
|
|
16
|
+
|
|
17
|
+
```bash
uv tool install moa-cli
moa ask "Is Postgres or SQLite better for a desktop app?"
```
|
|
21
|
+
|
|
22
|
+
Or run it once without installing:
|
|
23
|
+
|
|
24
|
+
```bash
uvx --from moa-cli moa ask "Review this plan."
```
|
|
27
|
+
|
|
28
|
+
## Why
|
|
29
|
+
|
|
30
|
+
A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
|
|
31
|
+
|
|
32
|
+
## Usage
|
|
33
|
+
|
|
34
|
+
MOA has three prompt verbs that share the same selection/output options:
|
|
35
|
+
|
|
36
|
+
- **`moa ask PROMPT`** - council / peer review: N agents answer the same prompt in parallel; every answer is returned with attribution, streamed as it lands.
|
|
37
|
+
- **`moa distill PROMPT`** - synthesis: run the council, then one strong aggregator merges the answers into a single unified response.
|
|
38
|
+
- **`moa debate PROMPT`** - sequential debate: two debaters answer and adversarially critique each other across rounds, then a separate neutral judge writes the final verdict. The costliest mode; read the caveats below before reaching for it.
|
|
39
|
+
|
|
40
|
+
```bash
moa doctor                                    # show installed CLIs and their default models
moa ask "Should this feature use SQLite?"     # ask the top 3 installed agents (read-only)
moa ask -n 2 "..."                            # ask only the top 2 (priority order)
moa ask -p claude -p agy "..."                # pin specific agents
moa ask -x claude "..."                       # drop an agent (e.g. exclude the caller's own model)
moa ask -m claude=sonnet "..."                # override which model a tool uses
moa ask --yolo "..."                          # grant full write access (default is read-only)
moa ask --json "..."                          # machine-readable JSONL (for agents/pipes)
git diff | moa ask -f - "Review this diff."   # read the prompt from stdin
moa distill "Design a rate limiter."          # council, then merge into one answer
moa distill -s codex "..."                    # pick who distills (auto | random | provider)
moa debate "Is this race condition real?"     # 2 debaters + a judge (default n=3)
moa debate -r 3 "..."                         # more rounds (default 2, hard max 4)
moa debate -j claude "..."                    # pin who judges (must not be a debater)
```
|
|
56
|
+
|
|
57
|
+
The shared options (`-n/--num`, `-p/--provider`, `-x/--exclude`, `-m/--model`, `-t/--timeout`, `-f/--file`, `--json`, `--yolo`) work identically on all three verbs. `distill` adds `-s/--synthesizer`; `debate` adds `-r/--rounds` and `-j/--judge`.
|
|
58
|
+
|
|
59
|
+
### Read-only by default
|
|
60
|
+
|
|
61
|
+
MOA is built to be called autonomously, so by default each agent runs in its tool's safest mode: it may read local files (and, where the tool allows, research online), but it is **not given permission to write files or run mutating commands**. The one exception is `agy`, which has no true read-only mode and can still edit files (see the notes after the table). The defaults are enforced by spawning each CLI with its own restrictive flags:
|
|
65
|
+
|
|
66
|
+
| Provider   | Read-only (default)                                      | Reads files | Web research                    |
| ---------- | -------------------------------------------------------- | ----------- | ------------------------------- |
| `claude`   | `--permission-mode plan`                                 | yes         | yes                             |
| `codex`    | `-s read-only`                                           | yes         | **no** (sandbox blocks network) |
| `opencode` | `--agent plan`                                           | yes         | yes                             |
| `agy`      | `--sandbox` (partial: shell only - can still edit files) | yes         | yes                             |
|
|
72
|
+
|
|
73
|
+
`codex`'s read-only mode is a kernel sandbox that also blocks network access, so codex does no web research in the default mode (it still reads local files). `agy` has **no true read-only mode**: its `--sandbox` flag restricts agy's terminal/shell but does **not** stop its `write_file` tool, so agy **can still edit files** even in the default mode. This is **partial** protection (it closes the shell vector only), not read-only. MOA applies `--sandbox` as the next-best safeguard, and the selection note on stderr states plainly that `agy` is shell-sandboxed but can still edit files.
|
|
80
|
+
|
|
81
|
+
### `--yolo` (full write access)
|
|
82
|
+
|
|
83
|
+
Pass `--yolo` to grant every agent full write access (file edits and shell commands, auto-approved). Use it only when you actually want the agents to change your working tree.

```bash
moa ask --yolo "Refactor this module and run the tests."
```

For `agy`, `--yolo` means dropping `--sandbox`, so it runs with no shell restrictions at all. In the default mode, `agy` runs with `--sandbox` (partial protection: shell only - it can still edit files), and MOA says so on stderr.
|
|
94
|
+
|
|
95
|
+
### How agents are selected
|
|
96
|
+
|
|
97
|
+
`-n/--num` (default 3) picks the first N **installed** agents from a popularity-ordered priority list:
|
|
98
|
+
|
|
99
|
+
```
claude -> codex -> agy -> opencode
```
|
|
102
|
+
|
|
103
|
+
So `moa ask -n 3` on a machine with all four installed asks Claude, Codex, and agy (opencode is #4). `agy` has no true read-only mode, so in the default mode it runs with `--sandbox` (partial protection: shell only - it can still edit files) and MOA flags that with an honest note on stderr; it is **not** excluded. Use `-p/--provider` (repeatable) to pin an exact set and ignore `-n`.
|
|
104
|
+
|
|
105
|
+
Use `-x/--exclude` (repeatable) to drop one or more agents from the run. Exclusion is applied *before* `-n` takes the first N, and it also drops excluded names from an explicit `-p` set. It is off by default. The motivating case: an agent (e.g. Claude Code) calls `moa` for *other* opinions; `moa ask -x claude` makes sure one "peer" isn't just the caller's own model. So `moa ask -n 3 -x claude` asks Codex, agy, and opencode.
|
|
106
|
+
|
|
107
|
+
### Choosing models
|
|
108
|
+
|
|
109
|
+
Each tool ships with a reasonable default model, but you can override which model any tool uses with `-m/--model PROVIDER=MODEL` (repeatable). Only the providers you name change; the rest keep their defaults.
|
|
110
|
+
|
|
111
|
+
```bash
moa ask -m claude=sonnet -m agy="Gemini 3.1 Pro (Low)" "..."
```
|
|
114
|
+
|
|
115
|
+
The model-string format differs per tool and is passed through verbatim (the tool's own CLI validates it):
|
|
116
|
+
|
|
117
|
+
| Provider   | Default                 | `-m` format                                                      |
| ---------- | ----------------------- | ---------------------------------------------------------------- |
| `claude`   | `opus`                  | short id, e.g. `claude=sonnet`                                   |
| `codex`    | `gpt-5.5`               | model id, e.g. `codex=gpt-5.5`                                   |
| `agy`      | `Gemini 3.1 Pro (High)` | exact display name, e.g. `agy="Gemini 3.1 Pro (Low)"`            |
| `opencode` | (tool's authed default) | `provider/model` slug, e.g. `opencode=anthropic/claude-sonnet-4` |
|
|
123
|
+
|
|
124
|
+
`opencode` has no built-in default; without an override it omits `-m` and lets opencode pick. Pass `-m opencode=provider/model` to pin one.
|
|
125
|
+
|
|
126
|
+
### Configuration
|
|
127
|
+
|
|
128
|
+
To avoid repeating the same flags on every call, persist your own defaults in a config file. MOA reads it for every verb and merges it under your flags.
|
|
129
|
+
|
|
130
|
+
**Location.** `~/.moa/config.toml` (the dir is created on first write). Set `$MOA_CONFIG_DIR` to point the whole config layer somewhere else (useful in tests/CI).
|
|
131
|
+
|
|
132
|
+
**Precedence.** `built-in default < config file < CLI flag`. A flag always wins; the config file only changes a default when that flag is omitted; an absent file means today's built-in behaviour.
|
|
133
|
+
|
|
134
|
+
**Keys** (all shared across `ask`/`distill`/`debate`):
|
|
135
|
+
|
|
136
|
+
| Key           | Type                     | Example                 |
| ------------- | ------------------------ | ----------------------- |
| `num`         | int (>= 1)               | `num = 2`               |
| `timeout`     | seconds (> 0)            | `timeout = 120`         |
| `exclude`     | list of provider names   | `exclude = ["claude"]`  |
| `synthesizer` | `auto`/`random`/provider | `synthesizer = "codex"` |
| `[models]`    | provider -> model table  | `claude = "sonnet"`     |
|
|
143
|
+
|
|
144
|
+
```toml
# ~/.moa/config.toml
num = 2
timeout = 120
exclude = ["claude"]
synthesizer = "auto"

[models]
claude = "sonnet"
agy = "Gemini 3.1 Pro (Low)"
```
|
|
155
|
+
|
|
156
|
+
**`moa config`** inspects and edits the file (it creates the dir/file as needed and validates provider names):
|
|
157
|
+
|
|
158
|
+
```bash
moa config show                       # effective config (defaults + file) + path
moa config path                       # print the config file path
moa config set num 2                  # set a scalar
moa config set exclude claude,codex   # set the exclude list (comma-separated)
moa config set model claude=sonnet    # set one entry in [models]
moa config unset num                  # remove a key
moa config unset model claude         # remove one [models] entry
```
|
|
167
|
+
|
|
168
|
+
The synthesizer default is persistable too (e.g. `moa config set synthesizer codex`); `debate`'s `-r/--rounds` and `-j/--judge` are not persisted. CLI `-m` overrides win per-provider over the config `[models]` table.
|
|
169
|
+
|
|
170
|
+
### Output
|
|
171
|
+
|
|
172
|
+
- **stdout** carries only content: each agent's answer is fronted by a centered separator rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines around it for clear separation, flushed the instant that agent finishes. `moa distill` then appends the merged block (`──── synthesis · via claude · OK · ... ────`) once the aggregator finishes.
|
|
173
|
+
- **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
|
|
174
|
+
- `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill` then adds a `{"type": "synthesis", ...}` record. `debate` instead emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
|
|
175
|
+
|
|
176
|
+
### `moa distill` (synthesis)
|
|
177
|
+
|
|
178
|
+
`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it streams what it has and skips the merge. The aggregator is chosen with `-s/--synthesizer`:
|
|
179
|
+
|
|
180
|
+
- `auto` (default) - the highest-priority agent that ran (deterministic)
|
|
181
|
+
- `random` - pick one of the agents that ran, at random
|
|
182
|
+
- a provider name (`claude`, `codex`, `agy`, `opencode`)
|
|
183
|
+
|
|
184
|
+
The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synthesize" prompt (Wang et al. 2024): it tells the aggregator to critically evaluate the inputs (some may be biased or incorrect) and not to simply replicate them but offer a refined, accurate, comprehensive reply.
|
|
185
|
+
|
|
186
|
+
### `moa debate` (sequential debate + neutral judge)
|
|
187
|
+
|
|
188
|
+
`debate` is the opt-in, highest-cost mode. Instead of fanning out in parallel, it runs a sequential, adversarial exchange and then asks a **separate neutral judge** to write the final answer.
|
|
189
|
+
|
|
190
|
+
**Roles.** By default the top **2** selected agents are the debaters and the **3rd** is the judge - so the default `-n 3` maps to *2 debaters + 1 judge*. Pin a specific judge with `-j/--judge PROVIDER`; the judge must be one of the selected agents and must **not** also be a debater. Debate needs at least 2 debaters and 1 distinct judge, so it needs at least 3 agents; with fewer it exits with a clear message rather than silently degrading.
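
The role rules above can be sketched as a small function. `assign_roles` is a hypothetical helper, and one detail is an assumption: when a pinned judge would otherwise be a top-2 debater, this sketch promotes the next agent to debater, whereas the real tool may handle that case differently.

```python
def assign_roles(selected, judge=None):
    """Split selected agents (in priority order) into two debaters and a judge."""
    if len(selected) < 3:
        raise ValueError("debate needs at least 3 agents (2 debaters + 1 judge)")
    if judge is None:
        # Default: top 2 debate, the 3rd judges.
        return selected[:2], selected[2]
    if judge not in selected:
        raise ValueError(f"judge {judge!r} is not one of the selected agents")
    # Assumed behaviour: the judge is excluded, the next two agents debate.
    debaters = [name for name in selected if name != judge][:2]
    return debaters, judge


default_roles = assign_roles(["claude", "codex", "agy"])
pinned_roles = assign_roles(["claude", "codex", "agy"], judge="claude")
```

Raising on fewer than three agents mirrors the "exit with a clear message rather than silently degrading" behaviour described above.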
|
|
191
|
+
|
|
192
|
+
**Rounds.** `-r/--rounds` defaults to **2** (gains plateau around 2-3 rounds while token cost grows multiplicatively) and is hard-capped at **4** - higher values are clamped with a warning on stderr.
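
Clamping with a warning might look like the sketch below. The function name and the warning wording are hypothetical; only the default of 2, the cap of 4, and the use of stderr come from the text above.

```python
import io
import sys

DEFAULT_ROUNDS = 2
MAX_ROUNDS = 4


def clamp_rounds(requested=DEFAULT_ROUNDS, stream=sys.stderr):
    """Return the effective round count, warning on `stream` if it was clamped."""
    if requested > MAX_ROUNDS:
        # Warning text is illustrative, not MOA's actual message.
        print(f"warning: --rounds {requested} exceeds the cap; using {MAX_ROUNDS}",
              file=stream)
        return MAX_ROUNDS
    return requested


# Capture the warning in a buffer instead of real stderr for the demo.
buffer = io.StringIO()
effective = clamp_rounds(9, stream=buffer)
warning = buffer.getvalue()
```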
|
|
193
|
+
|
|
194
|
+
**The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
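
The loop can be sketched as follows. Everything here is illustrative: `ask(provider, prompt)` is a hypothetical stand-in for invoking an agent CLI, the prompt wording only paraphrases the stance quoted above, and keeping a debater's previous answer as its "latest" when it signals no change is an assumption of this sketch.

```python
NO_CHANGE = "NO SUBSTANTIVE CHANGE"
STANCE = ("Identify errors/weaknesses before giving your own answer; "
          "do not agree merely to reach consensus.")


def critique_prompt(question, other_answer):
    """Build the adversarial prompt shown to a debater (wording is assumed)."""
    return f"{question}\n\nOther debater's latest answer:\n{other_answer}\n\n{STANCE}"


def run_debate(question, debaters, rounds, ask):
    """Run up to `rounds` rounds; return a list of (round, provider, reply)."""
    a, b = debaters
    transcript, latest = [], {}

    # Round 1: A answers cold, B responds to A's answer.
    latest[a] = ask(a, question)
    transcript.append((1, a, latest[a]))
    latest[b] = ask(b, critique_prompt(question, latest[a]))
    transcript.append((1, b, latest[b]))

    for round_no in range(2, rounds + 1):
        unchanged = 0
        for me, other in ((a, b), (b, a)):
            reply = ask(me, critique_prompt(question, latest[other]))
            transcript.append((round_no, me, reply))
            if reply.lstrip().startswith(NO_CHANGE):
                unchanged += 1          # keep the previous answer as "latest"
            else:
                latest[me] = reply
        if unchanged == len(debaters):
            break                       # everyone is done: stop before the cap
    return transcript


# Scripted fake agent: two real answers, then both debaters signal no change.
scripted = iter(["answer A", "answer B", NO_CHANGE, NO_CHANGE + ": still B"])
turns = run_debate("example question", ["claude", "codex"], 4,
                   lambda provider, prompt: next(scripted))
```

With the scripted replies, the debate ends after round 2 even though the cap passed in is 4.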
|
|
195
|
+
|
|
196
|
+
**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (the judge is itself a model, so this removes the brand and position cues that could bias it, per item 002) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
|
|
197
|
+
|
|
198
|
+
**Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
|
|
199
|
+
|
|
200
|
+
**Safety.** Debaters and the judge run in the same read-only (or `--yolo`) mode as the other verbs - there is no permission bypass. agy's partial-sandbox caveat (shell only; it can still edit files) applies here too.
|
|
201
|
+
|
|
202
|
+
> **Caveat - use sparingly.** Debate is the costliest mode (roughly `debaters x rounds + 1` model calls) **and the least reliably beneficial.** The research is mixed-to-negative: multi-agent debate can converge on a *wrong* answer through conformity, a confident-but-incorrect debater can win on persuasiveness over correctness, and more rounds can entrench an error rather than fix it. The separate neutral judge and the adversarial-stance prompt are there to fight these failure modes, but they do not eliminate them. For most questions, `ask` or `distill` is the better default; reach for `debate` when you specifically want to surface and stress-test disagreement. (See *Can LLM Agents Really Debate?* arXiv:2511.07784, *Talk Isn't Always Cheap* arXiv:2509.05396, and the conformity/position-bias work cited in the design notes.)
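
As a worked example of the rough cost formula in the caveat (`debaters x rounds + 1`), the sketch below computes the call count for the default and capped settings. `estimated_calls` is a hypothetical helper, and an early stop would make the real number smaller.

```python
def estimated_calls(debaters, rounds):
    """Rough model-call count: one call per debater per round, plus the judge."""
    return debaters * rounds + 1


default_cost = estimated_calls(2, 2)  # 2 debaters, default 2 rounds
capped_cost = estimated_calls(2, 4)   # 2 debaters, hard cap of 4 rounds
```

So the default debate costs roughly 5 model calls and a capped one roughly 9, compared with one call per agent for `ask`.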
|
|
203
|
+
|
|
204
|
+
### Attribution policy
|
|
205
|
+
|
|
206
|
+
The human (or agent) reading MOA's output **always gets correct attribution**: every response block shows the real provider name. There is no human-facing anonymization toggle.
|
|
207
|
+
|
|
208
|
+
The `distill` aggregator is a different story. To stop it picking favourites by brand, it **always** receives the proposer answers anonymized as "Response A / B / C" and order-shuffled (no toggle). The merged answer itself is brand-agnostic prose, and the A/B/C labels never leak into stdout, stderr, or the JSON.
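
Anonymizing and order-shuffling the proposer answers might be done along these lines. This is a sketch under assumptions: `anonymize` and its return shape are hypothetical, and the private label-to-provider key is only kept so the caller can still attribute correctly outside the aggregator prompt.

```python
import random
import string


def anonymize(answers, rng=None):
    """Shuffle provider answers and relabel them "Response A", "Response B", ...

    Returns (labeled, key): `labeled` is what the aggregator would see,
    `key` maps each label back to the real provider and is never shown to it.
    """
    rng = rng or random.Random()
    items = list(answers.items())
    rng.shuffle(items)  # order-shuffle so position carries no brand signal
    labeled, key = [], {}
    for letter, (provider, text) in zip(string.ascii_uppercase, items):
        label = f"Response {letter}"
        labeled.append((label, text))
        key[label] = provider
    return labeled, key


answers = {"claude": "answer 1", "codex": "answer 2", "agy": "answer 3"}
labeled, key = anonymize(answers, random.Random(0))
```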
|
|
209
|
+
|
|
210
|
+
## Supported agents
|
|
211
|
+
|
|
212
|
+
Invocations below show the default (read-only) flags; `--yolo` swaps in each tool's full-access mode.
|
|
213
|
+
|
|
214
|
+
| Provider    | CLI        | Invocation (read-only default)                                      |
| ----------- | ---------- | ------------------------------------------------------------------- |
| `claude`    | `claude`   | `claude --model opus --permission-mode plan -p PROMPT`              |
| `codex`     | `codex`    | `codex exec -m gpt-5.5 --skip-git-repo-check -s read-only PROMPT`   |
| `agy`       | `agy`      | `agy --sandbox --model "Gemini 3.1 Pro (High)" -p PROMPT` (partial: shell only - can still edit files) |
| `opencode`  | `opencode` | `opencode run --agent plan PROMPT`                                  |
|
|
220
|
+
|
|
221
|
+
Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
|
|
222
|
+
|
|
223
|
+
## Development
|
|
224
|
+
|
|
225
|
+
```bash
uv sync
uv run pytest
uv run ruff check src tests
```
|
|
230
|
+
|
|
231
|
+
MIT licensed.
|