moa-cli 0.2.0__tar.gz → 0.2.2__tar.gz
@@ -1,6 +1,6 @@
 Metadata-Version: 2.3
 Name: moa-cli
-Version: 0.2.0
+Version: 0.2.2
 Summary: Ask one question to multiple local AI coding CLIs in parallel and collect their answers.
 Keywords: llm,agents,cli,claude,codex,agy,opencode,peer-review
 Author: Paul-Louis Pröve
@@ -36,10 +36,37 @@ Or run it once without installing:
 uvx --from moa-cli moa ask "Review this plan."
 ```
 
+> **Requirements.** MOA drives agent CLIs you install separately - it ships no model
+> or API key of its own. You need at least two of `claude` (Claude Code), `codex`,
+> `agy` (Antigravity), and `opencode` on your `PATH` and logged in. Run **`moa doctor`**
+> first to see which ones MOA can find; with only one installed, the "council" collapses
+> to a single answer.
+
 ## Why
 
 A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
 
+### Example
+
+```text
+$ moa ask "Is Postgres or SQLite better for a desktop app?"
+Asking claude, codex, agy (timeout 180s, read-only)
+
+──────────────── claude (opus) · OK · 3.2s ────────────────
+
+For a single-user desktop app, SQLite is almost always the right call:
+zero-config, serverless, the whole DB is one file you can ship... [trimmed]
+
+─────────────── codex (gpt-5.5) · OK · 4.1s ───────────────
+
+Use SQLite unless you expect concurrent writers or need network access.
+For a desktop app neither is likely, so SQLite wins on simplicity... [trimmed]
+```
+
+The selection note goes to stderr; the attributed answers go to stdout. In a terminal
+each answer gets the rule shown above; when piped or read by another agent, the same
+blocks render as plain `## ...` headings. Add `--json` for machine-readable JSONL.
+
 ## Usage
 
 MOA has three prompt verbs that share the same selection/output options:
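Note on the hunk above: the terminal-versus-pipe switch described in the new README text can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the package's code. `heading` and `RULE_WIDTH` are names made up for this sketch, the width of 60 is borrowed from the `_RULE_WIDTH` constant added in a later hunk of this diff, and the only real API it relies on is `sys.stdout.isatty()`.

```python
import sys

# Assumption: fixed rule width, mirroring the _RULE_WIDTH constant in this diff.
RULE_WIDTH = 60


def heading(label: str, plain: bool) -> str:
    """Return a '## label' heading when plain, else a centered box-drawing rule."""
    if plain:
        return f"## {label}"
    text = f" {label} "
    if len(text) >= RULE_WIDTH:
        # Label too wide for the rule: fall back to the bare label.
        return text.strip()
    pad = RULE_WIDTH - len(text)
    left = pad // 2
    return "─" * left + text + "─" * (pad - left)


# stdout is a TTY only when attached to an interactive terminal, so a pipe
# or redirect selects the plain heading.
plain = not sys.stdout.isatty()
print(heading("claude (opus) · OK · 3.2s", plain))
```

Running the script directly in a terminal should print the box-drawing rule, while `python sketch.py | cat` should print the `## ...` heading, which matches the behavior the README paragraph describes.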
@@ -180,13 +207,13 @@ The synthesizer default is persistable too (e.g. `moa config set synthesizer cod
 
 ### Output
 
-- **stdout** carries only content
+- **stdout** carries only content. In a terminal, each agent's answer is fronted by a centered box-drawing rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines for separation, flushed the instant that agent finishes. When stdout is **piped or read by an agent** (not a TTY), the same block renders as a plain, low-noise `## claude (opus) · OK · 3.5s` heading instead - no box-drawing. `moa distill` emits only the final merged block.
 - **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
-- `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill`
+- `--json` emits one JSON object per line (JSONL): `ask` writes a `{"type": "response", ...}` record per agent as it completes; `distill` writes a single `{"type": "synthesis", ...}` record (only the merged answer); `debate` writes a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
 
 ### `moa distill` (synthesis)
 
-`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it
+`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. **It returns only that merged answer** - the individual proposer responses are intermediates and are not printed (each one's arrival is noted on stderr so the wait isn't silent). It needs at least two successful proposer answers; with fewer it skips the merge and says so on stderr. The aggregator is chosen with `-s/--synthesizer`:
 
 - `auto` (default) - the highest-priority agent that ran (deterministic)
 - `random` - pick one of the agents that ran, at random
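Note on the hunk above: a caller consuming the `--json` stream could bucket records by their `type` field roughly as follows. This is a hedged sketch. `group_records` is a hypothetical helper, and the sample lines are hand-written with only the `type` and `round` keys that the README names; the diff elides the remaining fields, so none are assumed here.

```python
import json
from collections import defaultdict
from typing import Iterable


def group_records(lines: Iterable[str]) -> dict[str, list[dict]]:
    """Parse JSONL lines and bucket the records by their "type" field."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        grouped[record.get("type", "unknown")].append(record)
    return dict(grouped)


# Hand-written sample: real records carry more fields than shown here.
sample = [
    '{"type": "debate_turn", "round": 1}',
    '{"type": "debate_turn", "round": 2}',
    '{"type": "verdict"}',
]
grouped = group_records(sample)
rounds = [record["round"] for record in grouped["debate_turn"]]
print(sorted(grouped), rounds)
```

To feed it real output, one option is to capture stdout from a `moa ... --json` run with `subprocess.run(..., capture_output=True, text=True)` and pass `stdout.splitlines()` to the helper. Because progress notes go to stderr, stdout should contain only the JSONL records.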
@@ -204,9 +231,9 @@ The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synth
 
 **The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
 
-**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed
+**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
 
-**Streaming/output.** Each debater's turn streams as it completes (
+**Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
 
 **Safety.** Debaters and the judge run in the same read-only (or `--yolo`) mode as the other verbs - there is no permission bypass. agy's partial-sandbox caveat (shell only; it can still edit files) applies here too.
 
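Note on the hunk above: the diff does not show how MOA detects the early-stop signal or prepares the transcript for the judge, so the following is only one plausible reading of the README text. `debate_should_stop`, `anonymize`, and the "Debater A/B" labels are hypothetical, and the prefix match on `NO SUBSTANTIVE CHANGE` is an assumption about how the signal could be recognized.

```python
import random

MARKER = "NO SUBSTANTIVE CHANGE"


def debate_should_stop(latest_replies: dict[str, str]) -> bool:
    """True when every debater's latest reply opens with the marker (assumed check)."""
    return bool(latest_replies) and all(
        reply.lstrip().upper().startswith(MARKER) for reply in latest_replies.values()
    )


def anonymize(answers: dict[str, str], rng: random.Random) -> list[tuple[str, str]]:
    """Shuffle the answers and relabel them so neither provider name nor order survives."""
    texts = list(answers.values())
    rng.shuffle(texts)
    return [(f"Debater {chr(ord('A') + i)}", text) for i, text in enumerate(texts)]


replies = {
    "claude": "NO SUBSTANTIVE CHANGE. My earlier answer stands.",
    "codex": "I still disagree about concurrent writers.",
}
print(debate_should_stop(replies))
for label, text in anonymize(replies, random.Random(0)):
    print(label, "->", text)
```

Passing an explicit `random.Random` keeps the shuffle reproducible in tests while leaving the caller free to use an unseeded generator in normal runs.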
@@ -231,6 +258,13 @@ Invocations below show the default (read-only) flags; `--yolo` swaps in each too
 
 Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
 
+## Agent skill
+
+If you drive MOA from an agent (e.g. Claude Code), there's a ready-made skill at
+[`skills/moa/SKILL.md`](skills/moa/SKILL.md): it tells an agent when to reach for MOA and
+how to use it (verb choice, self-exclusion via `-x <self>`, parsing the JSONL output). It
+supersedes hand-rolling a "peer review" skill.
+
 ## Development
 
 ```bash
@@ -25,10 +25,37 @@ Or run it once without installing:
 uvx --from moa-cli moa ask "Review this plan."
 ```
 
+> **Requirements.** MOA drives agent CLIs you install separately - it ships no model
+> or API key of its own. You need at least two of `claude` (Claude Code), `codex`,
+> `agy` (Antigravity), and `opencode` on your `PATH` and logged in. Run **`moa doctor`**
+> first to see which ones MOA can find; with only one installed, the "council" collapses
+> to a single answer.
+
 ## Why
 
 A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
 
+### Example
+
+```text
+$ moa ask "Is Postgres or SQLite better for a desktop app?"
+Asking claude, codex, agy (timeout 180s, read-only)
+
+──────────────── claude (opus) · OK · 3.2s ────────────────
+
+For a single-user desktop app, SQLite is almost always the right call:
+zero-config, serverless, the whole DB is one file you can ship... [trimmed]
+
+─────────────── codex (gpt-5.5) · OK · 4.1s ───────────────
+
+Use SQLite unless you expect concurrent writers or need network access.
+For a desktop app neither is likely, so SQLite wins on simplicity... [trimmed]
+```
+
+The selection note goes to stderr; the attributed answers go to stdout. In a terminal
+each answer gets the rule shown above; when piped or read by another agent, the same
+blocks render as plain `## ...` headings. Add `--json` for machine-readable JSONL.
+
 ## Usage
 
 MOA has three prompt verbs that share the same selection/output options:
@@ -169,13 +196,13 @@ The synthesizer default is persistable too (e.g. `moa config set synthesizer cod
 
 ### Output
 
-- **stdout** carries only content
+- **stdout** carries only content. In a terminal, each agent's answer is fronted by a centered box-drawing rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines for separation, flushed the instant that agent finishes. When stdout is **piped or read by an agent** (not a TTY), the same block renders as a plain, low-noise `## claude (opus) · OK · 3.5s` heading instead - no box-drawing. `moa distill` emits only the final merged block.
 - **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
-- `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill`
+- `--json` emits one JSON object per line (JSONL): `ask` writes a `{"type": "response", ...}` record per agent as it completes; `distill` writes a single `{"type": "synthesis", ...}` record (only the merged answer); `debate` writes a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
 
 ### `moa distill` (synthesis)
 
-`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it
+`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. **It returns only that merged answer** - the individual proposer responses are intermediates and are not printed (each one's arrival is noted on stderr so the wait isn't silent). It needs at least two successful proposer answers; with fewer it skips the merge and says so on stderr. The aggregator is chosen with `-s/--synthesizer`:
 
 - `auto` (default) - the highest-priority agent that ran (deterministic)
 - `random` - pick one of the agents that ran, at random
@@ -193,9 +220,9 @@ The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synth
 
 **The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
 
-**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed
+**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
 
-**Streaming/output.** Each debater's turn streams as it completes (
+**Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
 
 **Safety.** Debaters and the judge run in the same read-only (or `--yolo`) mode as the other verbs - there is no permission bypass. agy's partial-sandbox caveat (shell only; it can still edit files) applies here too.
 
@@ -220,6 +247,13 @@ Invocations below show the default (read-only) flags; `--yolo` swaps in each too
 
 Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
 
+## Agent skill
+
+If you drive MOA from an agent (e.g. Claude Code), there's a ready-made skill at
+[`skills/moa/SKILL.md`](skills/moa/SKILL.md): it tells an agent when to reach for MOA and
+how to use it (verb choice, self-exclusion via `-x <self>`, parsing the JSONL output). It
+supersedes hand-rolling a "peer review" skill.
+
 ## Development
 
 ```bash
@@ -532,11 +532,27 @@ def build_judge_prompt(
 
 _STATUS_LABELS = {"ok": "OK", "failed": "FAILED", "timeout": "TIMEOUT", "missing": "MISSING"}
 
+# Width of the separator rule that fronts each answer block. Fixed (not terminal-
+# derived) so output is identical whether shown live or piped to a file.
+_RULE_WIDTH = 60
+
 
 def _status_label(status: str) -> str:
     return _STATUS_LABELS.get(status, status.upper())
 
 
+def _rule(label: str) -> str:
+    """A centered, box-drawing separator that names the block, e.g.
+    `──────── claude (opus) · OK · 2.3s ────────`. Falls back to the bare label
+    when it's wider than the rule."""
+    text = f" {label} "
+    if len(text) >= _RULE_WIDTH:
+        return text.strip()
+    pad = _RULE_WIDTH - len(text)
+    left = pad // 2
+    return "─" * left + text + "─" * (pad - left)
+
+
 def _body(result: RunResult) -> list[str]:
     if result.status == "ok":
         return [result.stdout.strip(), ""]
@@ -544,15 +560,36 @@ def _body(result: RunResult) -> list[str]:
     return ["```text", detail[-1200:], "```", ""]
 
 
-def
+def _plain_output() -> bool:
+    """True when stdout is not an interactive terminal - piped, redirected, or
+    read by another agent (the common "an agent shells out to moa" case). There
+    we drop the decorative box-drawing rule and extra blank lines for a plain,
+    low-noise `## label` heading that is cheaper for a model to consume."""
+    return not sys.stdout.isatty()
+
+
+def _render(label: str, result: RunResult, plain: bool) -> str:
+    """One answer block. In a terminal: two leading blank lines and a centered
+    box-drawing rule, for clear visual separation as blocks stream in. When
+    piped: a plain `## label` heading with a single blank line, no box-drawing."""
+    if plain:
+        return "\n".join(["", f"## {label}", "", *_body(result)])
+    return "\n".join(["", "", _rule(label), "", *_body(result)])
+
+
+def render_block(result: RunResult, plain: bool | None = None) -> str:
+    if plain is None:
+        plain = _plain_output()
     model = f" ({result.model})" if result.model else ""
-
-    return
+    label = f"{result.provider}{model} · {_status_label(result.status)} · {result.elapsed:.1f}s"
+    return _render(label, result, plain)
 
 
-def render_synthesis_block(result: RunResult, synthesizer: str) -> str:
-
-
+def render_synthesis_block(result: RunResult, synthesizer: str, plain: bool | None = None) -> str:
+    if plain is None:
+        plain = _plain_output()
+    label = f"synthesis · via {synthesizer} · {_status_label(result.status)} · {result.elapsed:.1f}s"
+    return _render(label, result, plain)
 
 
 def result_record(result: RunResult) -> dict:
@@ -579,18 +616,22 @@ def synthesis_record(result: RunResult, synthesizer: str) -> dict:
     }
 
 
-def render_debate_turn_block(result: RunResult, round_num: int) -> str:
+def render_debate_turn_block(result: RunResult, round_num: int, plain: bool | None = None) -> str:
+    if plain is None:
+        plain = _plain_output()
     model = f" ({result.model})" if result.model else ""
-
-        f"
-        f"{_status_label(result.status)}
+    label = (
+        f"round {round_num} · {result.provider}{model} · "
+        f"{_status_label(result.status)} · {result.elapsed:.1f}s"
     )
-    return
+    return _render(label, result, plain)
 
 
-def render_judge_block(result: RunResult, judge: str) -> str:
-
-
+def render_judge_block(result: RunResult, judge: str, plain: bool | None = None) -> str:
+    if plain is None:
+        plain = _plain_output()
+    label = f"verdict · judge {judge} · {_status_label(result.status)} · {result.elapsed:.1f}s"
+    return _render(label, result, plain)
 
 
 def debate_turn_record(result: RunResult, round_num: int) -> dict:
@@ -855,11 +896,20 @@ async def _collect(
     json_output: bool,
     models: dict[str, str] | None = None,
     yolo: bool = False,
+    emit_blocks: bool = True,
 ) -> list[RunResult]:
+    """Gather every agent's result. With emit_blocks (ask), each complete answer
+    is flushed to stdout the instant it arrives. Without it (distill), the
+    individual answers are intermediates the user shouldn't see - only the final
+    distilled block is content - so we keep stdout clean and just heartbeat each
+    arrival to stderr so a multi-agent run doesn't look frozen while it waits."""
     results: list[RunResult] = []
     async for result in stream(providers, prompt, timeout, models, yolo):
         results.append(result)
-
+        if emit_blocks:
+            _emit(json.dumps(result_record(result)) if json_output else render_block(result))
+        else:
+            _note(f" {result.provider} responded ({_status_label(result.status)}, {result.elapsed:.1f}s)")
     return results
 
 
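Note on the hunk above: the `emit_blocks` switch in `_collect` follows a common pattern, in which results are consumed in completion order and each one is either printed as content on stdout or only acknowledged on stderr. A self-contained sketch of that pattern is below. `fake_agent` and `collect` are stand-ins invented for illustration, and `asyncio.as_completed` replaces the package's own `stream(...)` generator, whose implementation is not part of this diff.

```python
import asyncio
import sys


async def fake_agent(name: str, delay: float) -> tuple[str, str]:
    """Stand-in for an agent CLI call: sleeps, then returns (name, answer)."""
    await asyncio.sleep(delay)
    return name, f"answer from {name}"


async def collect(agents: dict[str, float], emit_blocks: bool) -> list[tuple[str, str]]:
    """Gather results in completion order; print each one or just note its arrival."""
    tasks = [asyncio.create_task(fake_agent(name, delay)) for name, delay in agents.items()]
    results = []
    for finished in asyncio.as_completed(tasks):
        name, answer = await finished
        results.append((name, answer))
        if emit_blocks:
            # "ask"-style: each answer is content, flushed as soon as it lands.
            print(f"## {name}\n\n{answer}\n", flush=True)
        else:
            # "distill"-style: keep stdout clean and heartbeat on stderr instead.
            print(f"  {name} responded", file=sys.stderr, flush=True)
    return results


results = asyncio.run(collect({"slow": 0.1, "fast": 0.01}, emit_blocks=False))
order = [name for name, _ in results]
print(order)
```

With `emit_blocks=False`, stdout carries only the final `print`, so piping the script's output stays clean while the per-agent notes still show up on the terminal via stderr.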
@@ -1029,8 +1079,13 @@ def distill(
     # it merges through the same precedence: CLI flag > config file > built-in.
     synthesizer = resolve_option(synthesizer, "synthesizer", _read_config_or_empty(), "auto")
 
+    # distill returns only the merged answer, so the proposer responses are
+    # intermediates: collect them without printing each to stdout.
     results = asyncio.run(
-        _collect(
+        _collect(
+            cfg.selected, cfg.prompt, cfg.timeout, cfg.json_output, cfg.models, cfg.yolo,
+            emit_blocks=False,
+        )
     )
     successes = [r for r in results if r.status == "ok"]
 