moa-cli 0.2.1__tar.gz → 0.2.2__tar.gz
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.3
|
|
2
2
|
Name: moa-cli
|
|
3
|
-
Version: 0.2.1
|
|
3
|
+
Version: 0.2.2
|
|
4
4
|
Summary: Ask one question to multiple local AI coding CLIs in parallel and collect their answers.
|
|
5
5
|
Keywords: llm,agents,cli,claude,codex,agy,opencode,peer-review
|
|
6
6
|
Author: Paul-Louis Pröve
|
|
@@ -36,10 +36,37 @@ Or run it once without installing:
|
|
|
36
36
|
uvx --from moa-cli moa ask "Review this plan."
|
|
37
37
|
```
|
|
38
38
|
|
|
39
|
+
> **Requirements.** MOA drives agent CLIs you install separately - it ships no model
|
|
40
|
+
> or API key of its own. You need at least two of `claude` (Claude Code), `codex`,
|
|
41
|
+
> `agy` (Antigravity), and `opencode` on your `PATH` and logged in. Run **`moa doctor`**
|
|
42
|
+
> first to see which ones MOA can find; with only one installed, the "council" collapses
|
|
43
|
+
> to a single answer.
|
|
44
|
+
|
|
39
45
|
## Why
|
|
40
46
|
|
|
41
47
|
A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
|
|
42
48
|
|
|
49
|
+
### Example
|
|
50
|
+
|
|
51
|
+
```text
|
|
52
|
+
$ moa ask "Is Postgres or SQLite better for a desktop app?"
|
|
53
|
+
Asking claude, codex, agy (timeout 180s, read-only)
|
|
54
|
+
|
|
55
|
+
──────────────── claude (opus) · OK · 3.2s ────────────────
|
|
56
|
+
|
|
57
|
+
For a single-user desktop app, SQLite is almost always the right call:
|
|
58
|
+
zero-config, serverless, the whole DB is one file you can ship... [trimmed]
|
|
59
|
+
|
|
60
|
+
─────────────── codex (gpt-5.5) · OK · 4.1s ───────────────
|
|
61
|
+
|
|
62
|
+
Use SQLite unless you expect concurrent writers or need network access.
|
|
63
|
+
For a desktop app neither is likely, so SQLite wins on simplicity... [trimmed]
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
The selection note goes to stderr; the attributed answers go to stdout. In a terminal
|
|
67
|
+
each answer gets the rule shown above; when piped or read by another agent, the same
|
|
68
|
+
blocks render as plain `## ...` headings. Add `--json` for machine-readable JSONL.
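
The terminal-versus-pipe switch described here can be sketched in a few lines. This is a minimal illustration, not MOA's own renderer: `heading`, its `stream` argument, and the `width` default are hypothetical names, and the centered rule is only approximated with `str.center`. The one grounded piece is the check itself, `isatty()` on the output stream.

```python
import io
import sys


def heading(label: str, stream=None, width: int = 60) -> str:
    """Return a decorated rule for a terminal, or a plain heading otherwise."""
    stream = sys.stdout if stream is None else stream
    if not stream.isatty():
        # Piped, redirected, or read by another program: keep it low-noise.
        return f"## {label}"
    # Interactive terminal: center the label inside a box-drawing rule.
    return f" {label} ".center(width, "─")


class FakeTTY(io.StringIO):
    """Stand-in stream that claims to be a terminal (for demonstration only)."""

    def isatty(self) -> bool:
        return True


print(heading("claude (opus) · OK · 3.2s", io.StringIO()))
print(heading("claude (opus) · OK · 3.2s", FakeTTY()))
```

Passing the stream in explicitly, rather than reading `sys.stdout` inside the function, keeps the branch easy to exercise in tests without a real terminal.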
|
|
69
|
+
|
|
43
70
|
## Usage
|
|
44
71
|
|
|
45
72
|
MOA has three prompt verbs that share the same selection/output options:
|
|
@@ -180,13 +207,13 @@ The synthesizer default is persistable too (e.g. `moa config set synthesizer cod
|
|
|
180
207
|
|
|
181
208
|
### Output
|
|
182
209
|
|
|
183
|
-
- **stdout** carries only content
|
|
210
|
+
- **stdout** carries only content. In a terminal, each agent's answer is fronted by a centered box-drawing rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines for separation, flushed the instant that agent finishes. When stdout is **piped or read by an agent** (not a TTY), the same block renders as a plain, low-noise `## claude (opus) · OK · 3.5s` heading instead - no box-drawing. `moa distill` emits only the final merged block.
|
|
184
211
|
- **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
|
|
185
|
-
- `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill`
|
|
212
|
+
- `--json` emits one JSON object per line (JSONL): `ask` writes a `{"type": "response", ...}` record per agent as it completes; `distill` writes a single `{"type": "synthesis", ...}` record (only the merged answer); `debate` writes a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
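
A caller that consumes the `--json` stream might group records by their `type` field, roughly as sketched below. Only `type` and `round` are taken from the description above; the `text` field in the sample data and the `group_records` helper are assumptions made for illustration, so check the real record shape before relying on any other key.

```python
import json
from collections import defaultdict


def group_records(jsonl_text: str) -> dict[str, list[dict]]:
    """Group JSONL records by their "type" field, skipping blank lines."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        grouped[record.get("type", "unknown")].append(record)
    return dict(grouped)


# Hypothetical sample: the "text" key is invented for this example.
sample = "\n".join([
    json.dumps({"type": "debate_turn", "round": 1, "text": "opening answer"}),
    json.dumps({"type": "debate_turn", "round": 1, "text": "rebuttal"}),
    json.dumps({"type": "verdict", "text": "final answer"}),
])

records = group_records(sample)
print(sorted(records))
```

Because each line is a complete JSON object, the same loop works on a live pipe: read one line, parse it, act on it, without waiting for the run to finish.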
|
|
186
213
|
|
|
187
214
|
### `moa distill` (synthesis)
|
|
188
215
|
|
|
189
|
-
`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it
|
|
216
|
+
`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. **It returns only that merged answer** - the individual proposer responses are intermediates and are not printed (each one's arrival is noted on stderr so the wait isn't silent). It needs at least two successful proposer answers; with fewer it skips the merge and says so on stderr. The aggregator is chosen with `-s/--synthesizer`:
|
|
190
217
|
|
|
191
218
|
- `auto` (default) - the highest-priority agent that ran (deterministic)
|
|
192
219
|
- `random` - pick one of the agents that ran, at random
|
|
@@ -204,7 +231,7 @@ The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synth
|
|
|
204
231
|
|
|
205
232
|
**The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
|
|
206
233
|
|
|
207
|
-
**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed
|
|
234
|
+
**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
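
One way to picture "anonymized and order-shuffled" is the sketch below. It is an assumption-laden illustration rather than MOA's implementation: it treats the transcript as one final answer per debater, and `blind_answers` and the `Debater A/B` aliases are hypothetical names.

```python
import random


def blind_answers(
    answers: dict[str, str], rng: random.Random
) -> tuple[list[tuple[str, str]], dict[str, str]]:
    """Swap provider names for neutral aliases and shuffle presentation order.

    Returns the blinded (alias, answer) pairs plus the provider -> alias
    mapping, which the caller keeps so results can be attributed afterwards.
    """
    providers = sorted(answers)
    rng.shuffle(providers)  # alias assignment is randomized too
    aliases = {p: f"Debater {chr(ord('A') + i)}" for i, p in enumerate(providers)}
    blinded = [(aliases[p], answers[p]) for p in answers]
    rng.shuffle(blinded)  # position no longer hints at who answered first
    return blinded, aliases


answers = {
    "claude": "Use SQLite.",
    "codex": "Use SQLite unless you need concurrent writers.",
}
blinded, aliases = blind_answers(answers, random.Random(0))
for alias, answer in blinded:
    print(f"{alias}: {answer}")
```

Shuffling both the alias assignment and the presentation order addresses the two biases named above separately: the judge sees neither a brand name nor a stable first position.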
|
|
208
235
|
|
|
209
236
|
**Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
|
|
210
237
|
|
|
@@ -231,6 +258,13 @@ Invocations below show the default (read-only) flags; `--yolo` swaps in each too
|
|
|
231
258
|
|
|
232
259
|
Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
|
|
233
260
|
|
|
261
|
+
## Agent skill
|
|
262
|
+
|
|
263
|
+
If you drive MOA from an agent (e.g. Claude Code), there's a ready-made skill at
|
|
264
|
+
[`skills/moa/SKILL.md`](skills/moa/SKILL.md): it tells an agent when to reach for MOA and
|
|
265
|
+
how to use it (verb choice, self-exclusion via `-x <self>`, parsing the JSONL output). It
|
|
266
|
+
supersedes hand-rolling a "peer review" skill.
|
|
267
|
+
|
|
234
268
|
## Development
|
|
235
269
|
|
|
236
270
|
```bash
|
|
@@ -25,10 +25,37 @@ Or run it once without installing:
|
|
|
25
25
|
uvx --from moa-cli moa ask "Review this plan."
|
|
26
26
|
```
|
|
27
27
|
|
|
28
|
+
> **Requirements.** MOA drives agent CLIs you install separately - it ships no model
|
|
29
|
+
> or API key of its own. You need at least two of `claude` (Claude Code), `codex`,
|
|
30
|
+
> `agy` (Antigravity), and `opencode` on your `PATH` and logged in. Run **`moa doctor`**
|
|
31
|
+
> first to see which ones MOA can find; with only one installed, the "council" collapses
|
|
32
|
+
> to a single answer.
|
|
33
|
+
|
|
28
34
|
## Why
|
|
29
35
|
|
|
30
36
|
A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
|
|
31
37
|
|
|
38
|
+
### Example
|
|
39
|
+
|
|
40
|
+
```text
|
|
41
|
+
$ moa ask "Is Postgres or SQLite better for a desktop app?"
|
|
42
|
+
Asking claude, codex, agy (timeout 180s, read-only)
|
|
43
|
+
|
|
44
|
+
──────────────── claude (opus) · OK · 3.2s ────────────────
|
|
45
|
+
|
|
46
|
+
For a single-user desktop app, SQLite is almost always the right call:
|
|
47
|
+
zero-config, serverless, the whole DB is one file you can ship... [trimmed]
|
|
48
|
+
|
|
49
|
+
─────────────── codex (gpt-5.5) · OK · 4.1s ───────────────
|
|
50
|
+
|
|
51
|
+
Use SQLite unless you expect concurrent writers or need network access.
|
|
52
|
+
For a desktop app neither is likely, so SQLite wins on simplicity... [trimmed]
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
The selection note goes to stderr; the attributed answers go to stdout. In a terminal
|
|
56
|
+
each answer gets the rule shown above; when piped or read by another agent, the same
|
|
57
|
+
blocks render as plain `## ...` headings. Add `--json` for machine-readable JSONL.
|
|
58
|
+
|
|
32
59
|
## Usage
|
|
33
60
|
|
|
34
61
|
MOA has three prompt verbs that share the same selection/output options:
|
|
@@ -169,13 +196,13 @@ The synthesizer default is persistable too (e.g. `moa config set synthesizer cod
|
|
|
169
196
|
|
|
170
197
|
### Output
|
|
171
198
|
|
|
172
|
-
- **stdout** carries only content
|
|
199
|
+
- **stdout** carries only content. In a terminal, each agent's answer is fronted by a centered box-drawing rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines for separation, flushed the instant that agent finishes. When stdout is **piped or read by an agent** (not a TTY), the same block renders as a plain, low-noise `## claude (opus) · OK · 3.5s` heading instead - no box-drawing. `moa distill` emits only the final merged block.
|
|
173
200
|
- **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
|
|
174
|
-
- `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill`
|
|
201
|
+
- `--json` emits one JSON object per line (JSONL): `ask` writes a `{"type": "response", ...}` record per agent as it completes; `distill` writes a single `{"type": "synthesis", ...}` record (only the merged answer); `debate` writes a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
|
|
175
202
|
|
|
176
203
|
### `moa distill` (synthesis)
|
|
177
204
|
|
|
178
|
-
`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it
|
|
205
|
+
`distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. **It returns only that merged answer** - the individual proposer responses are intermediates and are not printed (each one's arrival is noted on stderr so the wait isn't silent). It needs at least two successful proposer answers; with fewer it skips the merge and says so on stderr. The aggregator is chosen with `-s/--synthesizer`:
|
|
179
206
|
|
|
180
207
|
- `auto` (default) - the highest-priority agent that ran (deterministic)
|
|
181
208
|
- `random` - pick one of the agents that ran, at random
|
|
@@ -193,7 +220,7 @@ The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synth
|
|
|
193
220
|
|
|
194
221
|
**The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
|
|
195
222
|
|
|
196
|
-
**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed
|
|
223
|
+
**The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
|
|
197
224
|
|
|
198
225
|
**Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
|
|
199
226
|
|
|
@@ -220,6 +247,13 @@ Invocations below show the default (read-only) flags; `--yolo` swaps in each too
|
|
|
220
247
|
|
|
221
248
|
Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
|
|
222
249
|
|
|
250
|
+
## Agent skill
|
|
251
|
+
|
|
252
|
+
If you drive MOA from an agent (e.g. Claude Code), there's a ready-made skill at
|
|
253
|
+
[`skills/moa/SKILL.md`](skills/moa/SKILL.md): it tells an agent when to reach for MOA and
|
|
254
|
+
how to use it (verb choice, self-exclusion via `-x <self>`, parsing the JSONL output). It
|
|
255
|
+
supersedes hand-rolling a "peer review" skill.
|
|
256
|
+
|
|
223
257
|
## Development
|
|
224
258
|
|
|
225
259
|
```bash
|
|
@@ -560,21 +560,36 @@ def _body(result: RunResult) -> list[str]:
|
|
|
560
560
|
return ["```text", detail[-1200:], "```", ""]
|
|
561
561
|
|
|
562
562
|
|
|
563
|
-
def
|
|
564
|
-
"""
|
|
565
|
-
|
|
563
|
+
def _plain_output() -> bool:
|
|
564
|
+
"""True when stdout is not an interactive terminal - piped, redirected, or
|
|
565
|
+
read by another agent (the common "an agent shells out to moa" case). There
|
|
566
|
+
we drop the decorative box-drawing rule and extra blank lines for a plain,
|
|
567
|
+
low-noise `## label` heading that is cheaper for a model to consume."""
|
|
568
|
+
return not sys.stdout.isatty()
|
|
569
|
+
|
|
570
|
+
|
|
571
|
+
def _render(label: str, result: RunResult, plain: bool) -> str:
|
|
572
|
+
"""One answer block. In a terminal: two leading blank lines and a centered
|
|
573
|
+
box-drawing rule, for clear visual separation as blocks stream in. When
|
|
574
|
+
piped: a plain `## label` heading with a single blank line, no box-drawing."""
|
|
575
|
+
if plain:
|
|
576
|
+
return "\n".join(["", f"## {label}", "", *_body(result)])
|
|
566
577
|
return "\n".join(["", "", _rule(label), "", *_body(result)])
|
|
567
578
|
|
|
568
579
|
|
|
569
|
-
def render_block(result: RunResult) -> str:
|
|
580
|
+
def render_block(result: RunResult, plain: bool | None = None) -> str:
|
|
581
|
+
if plain is None:
|
|
582
|
+
plain = _plain_output()
|
|
570
583
|
model = f" ({result.model})" if result.model else ""
|
|
571
584
|
label = f"{result.provider}{model} · {_status_label(result.status)} · {result.elapsed:.1f}s"
|
|
572
|
-
return _render(label, result)
|
|
585
|
+
return _render(label, result, plain)
|
|
573
586
|
|
|
574
587
|
|
|
575
|
-
def render_synthesis_block(result: RunResult, synthesizer: str) -> str:
|
|
588
|
+
def render_synthesis_block(result: RunResult, synthesizer: str, plain: bool | None = None) -> str:
|
|
589
|
+
if plain is None:
|
|
590
|
+
plain = _plain_output()
|
|
576
591
|
label = f"synthesis · via {synthesizer} · {_status_label(result.status)} · {result.elapsed:.1f}s"
|
|
577
|
-
return _render(label, result)
|
|
592
|
+
return _render(label, result, plain)
|
|
578
593
|
|
|
579
594
|
|
|
580
595
|
def result_record(result: RunResult) -> dict:
|
|
@@ -601,18 +616,22 @@ def synthesis_record(result: RunResult, synthesizer: str) -> dict:
|
|
|
601
616
|
}
|
|
602
617
|
|
|
603
618
|
|
|
604
|
-
def render_debate_turn_block(result: RunResult, round_num: int) -> str:
|
|
619
|
+
def render_debate_turn_block(result: RunResult, round_num: int, plain: bool | None = None) -> str:
|
|
620
|
+
if plain is None:
|
|
621
|
+
plain = _plain_output()
|
|
605
622
|
model = f" ({result.model})" if result.model else ""
|
|
606
623
|
label = (
|
|
607
624
|
f"round {round_num} · {result.provider}{model} · "
|
|
608
625
|
f"{_status_label(result.status)} · {result.elapsed:.1f}s"
|
|
609
626
|
)
|
|
610
|
-
return _render(label, result)
|
|
627
|
+
return _render(label, result, plain)
|
|
611
628
|
|
|
612
629
|
|
|
613
|
-
def render_judge_block(result: RunResult, judge: str) -> str:
|
|
630
|
+
def render_judge_block(result: RunResult, judge: str, plain: bool | None = None) -> str:
|
|
631
|
+
if plain is None:
|
|
632
|
+
plain = _plain_output()
|
|
614
633
|
label = f"verdict · judge {judge} · {_status_label(result.status)} · {result.elapsed:.1f}s"
|
|
615
|
-
return _render(label, result)
|
|
634
|
+
return _render(label, result, plain)
|
|
616
635
|
|
|
617
636
|
|
|
618
637
|
def debate_turn_record(result: RunResult, round_num: int) -> dict:
|
|
@@ -877,11 +896,20 @@ async def _collect(
|
|
|
877
896
|
json_output: bool,
|
|
878
897
|
models: dict[str, str] | None = None,
|
|
879
898
|
yolo: bool = False,
|
|
899
|
+
emit_blocks: bool = True,
|
|
880
900
|
) -> list[RunResult]:
|
|
901
|
+
"""Gather every agent's result. With emit_blocks (ask), each complete answer
|
|
902
|
+
is flushed to stdout the instant it arrives. Without it (distill), the
|
|
903
|
+
individual answers are intermediates the user shouldn't see - only the final
|
|
904
|
+
distilled block is content - so we keep stdout clean and just heartbeat each
|
|
905
|
+
arrival to stderr so a multi-agent run doesn't look frozen while it waits."""
|
|
881
906
|
results: list[RunResult] = []
|
|
882
907
|
async for result in stream(providers, prompt, timeout, models, yolo):
|
|
883
908
|
results.append(result)
|
|
884
|
-
|
|
909
|
+
if emit_blocks:
|
|
910
|
+
_emit(json.dumps(result_record(result)) if json_output else render_block(result))
|
|
911
|
+
else:
|
|
912
|
+
_note(f" {result.provider} responded ({_status_label(result.status)}, {result.elapsed:.1f}s)")
|
|
885
913
|
return results
|
|
886
914
|
|
|
887
915
|
|
|
@@ -1051,8 +1079,13 @@ def distill(
|
|
|
1051
1079
|
# it merges through the same precedence: CLI flag > config file > built-in.
|
|
1052
1080
|
synthesizer = resolve_option(synthesizer, "synthesizer", _read_config_or_empty(), "auto")
|
|
1053
1081
|
|
|
1082
|
+
# distill returns only the merged answer, so the proposer responses are
|
|
1083
|
+
# intermediates: collect them without printing each to stdout.
|
|
1054
1084
|
results = asyncio.run(
|
|
1055
|
-
_collect(
|
|
1085
|
+
_collect(
|
|
1086
|
+
cfg.selected, cfg.prompt, cfg.timeout, cfg.json_output, cfg.models, cfg.yolo,
|
|
1087
|
+
emit_blocks=False,
|
|
1088
|
+
)
|
|
1056
1089
|
)
|
|
1057
1090
|
successes = [r for r in results if r.status == "ok"]
|
|
1058
1091
|
|