moa-cli 0.2.0__tar.gz → 0.2.2__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.3
  Name: moa-cli
- Version: 0.2.0
+ Version: 0.2.2
  Summary: Ask one question to multiple local AI coding CLIs in parallel and collect their answers.
  Keywords: llm,agents,cli,claude,codex,agy,opencode,peer-review
  Author: Paul-Louis Pröve
@@ -36,10 +36,37 @@ Or run it once without installing:
  uvx --from moa-cli moa ask "Review this plan."
  ```
 
+ > **Requirements.** MOA drives agent CLIs you install separately - it ships no model
+ > or API key of its own. You need at least two of `claude` (Claude Code), `codex`,
+ > `agy` (Antigravity), and `opencode` on your `PATH` and logged in. Run **`moa doctor`**
+ > first to see which ones MOA can find; with only one installed, the "council" collapses
+ > to a single answer.
+
  ## Why
 
  A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
 
+ ### Example
+
+ ```text
+ $ moa ask "Is Postgres or SQLite better for a desktop app?"
+ Asking claude, codex, agy (timeout 180s, read-only)
+
+ ──────────────── claude (opus) · OK · 3.2s ────────────────
+
+ For a single-user desktop app, SQLite is almost always the right call:
+ zero-config, serverless, the whole DB is one file you can ship... [trimmed]
+
+ ─────────────── codex (gpt-5.5) · OK · 4.1s ───────────────
+
+ Use SQLite unless you expect concurrent writers or need network access.
+ For a desktop app neither is likely, so SQLite wins on simplicity... [trimmed]
+ ```
+
+ The selection note goes to stderr; the attributed answers go to stdout. In a terminal
+ each answer gets the rule shown above; when piped or read by another agent, the same
+ blocks render as plain `## ...` headings. Add `--json` for machine-readable JSONL.
+
  ## Usage
 
  MOA has three prompt verbs that share the same selection/output options:
@@ -180,13 +207,13 @@ The synthesizer default is persistable too (e.g. `moa config set synthesizer cod
 
  ### Output
 
- - **stdout** carries only content: each agent's answer as a Markdown block (`## claude (opus) - OK - 3.5s`), flushed the instant that agent finishes. `moa distill` then appends the merged block (`## synthesis · via claude - OK - ...`) once the aggregator finishes.
+ - **stdout** carries only content. In a terminal, each agent's answer is fronted by a centered box-drawing rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines for separation, flushed the instant that agent finishes. When stdout is **piped or read by an agent** (not a TTY), the same block renders as a plain, low-noise `## claude (opus) · OK · 3.5s` heading instead - no box-drawing. `moa distill` emits only the final merged block.
  - **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
- - `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill` then adds a `{"type": "synthesis", ...}` record. `debate` instead emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
+ - `--json` emits one JSON object per line (JSONL): `ask` writes a `{"type": "response", ...}` record per agent as it completes; `distill` writes a single `{"type": "synthesis", ...}` record (only the merged answer); `debate` writes a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
 
  ### `moa distill` (synthesis)
 
- `distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it streams what it has and skips the merge. The aggregator is chosen with `-s/--synthesizer`:
+ `distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. **It returns only that merged answer** - the individual proposer responses are intermediates and are not printed (each one's arrival is noted on stderr so the wait isn't silent). It needs at least two successful proposer answers; with fewer it skips the merge and says so on stderr. The aggregator is chosen with `-s/--synthesizer`:
 
  - `auto` (default) - the highest-priority agent that ran (deterministic)
  - `random` - pick one of the agents that ran, at random
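A caller that consumes `--json` output could group the record types named in the Output hunk above with a few lines of Python. This is only a sketch under stated assumptions: the `type` and `round` keys come from the README text, while the `provider` and `text` keys in the sample lines and the helper name `group_records` are made up for illustration and are not MOA's documented schema.

```python
import json
from collections import defaultdict


def group_records(lines):
    """Group JSONL records by their "type" field, skipping blank lines."""
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        grouped[record.get("type", "unknown")].append(record)
    return dict(grouped)


# Hypothetical sample output: only "type" and "round" are taken from the README;
# "provider" and "text" are assumed field names.
sample = [
    '{"type": "debate_turn", "round": 1, "provider": "claude", "text": "..."}',
    '{"type": "debate_turn", "round": 1, "provider": "codex", "text": "..."}',
    '{"type": "verdict", "provider": "agy", "text": "..."}',
]

records = group_records(sample)
print(sorted(records))
print(len(records["debate_turn"]))
```

Reading line by line rather than parsing the whole stream at once fits the README's description of records being written as each agent completes.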
@@ -204,9 +231,9 @@ The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synth
 
  **The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
 
- **The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed, per item 002) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`## verdict · judge <name> ...`).
+ **The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
 
- **Streaming/output.** Each debater's turn streams as it completes (`## round N · <provider> ...`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
+ **Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
 
  **Safety.** Debaters and the judge run in the same read-only (or `--yolo`) mode as the other verbs - there is no permission bypass. agy's partial-sandbox caveat (shell only; it can still edit files) applies here too.
 
@@ -231,6 +258,13 @@ Invocations below show the default (read-only) flags; `--yolo` swaps in each too
 
  Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
 
+ ## Agent skill
+
+ If you drive MOA from an agent (e.g. Claude Code), there's a ready-made skill at
+ [`skills/moa/SKILL.md`](skills/moa/SKILL.md): it tells an agent when to reach for MOA and
+ how to use it (verb choice, self-exclusion via `-x <self>`, parsing the JSONL output). It
+ supersedes hand-rolling a "peer review" skill.
+
  ## Development
 
  ```bash
@@ -25,10 +25,37 @@ Or run it once without installing:
  uvx --from moa-cli moa ask "Review this plan."
  ```
 
+ > **Requirements.** MOA drives agent CLIs you install separately - it ships no model
+ > or API key of its own. You need at least two of `claude` (Claude Code), `codex`,
+ > `agy` (Antigravity), and `opencode` on your `PATH` and logged in. Run **`moa doctor`**
+ > first to see which ones MOA can find; with only one installed, the "council" collapses
+ > to a single answer.
+
  ## Why
 
  A single model gives you one perspective. Asking three frontier models the same question - and seeing where they agree, diverge, or contradict - is a fast, cheap way to pressure-test an answer. MOA makes that a one-liner using the CLIs you already pay for, with no API keys of its own.
 
+ ### Example
+
+ ```text
+ $ moa ask "Is Postgres or SQLite better for a desktop app?"
+ Asking claude, codex, agy (timeout 180s, read-only)
+
+ ──────────────── claude (opus) · OK · 3.2s ────────────────
+
+ For a single-user desktop app, SQLite is almost always the right call:
+ zero-config, serverless, the whole DB is one file you can ship... [trimmed]
+
+ ─────────────── codex (gpt-5.5) · OK · 4.1s ───────────────
+
+ Use SQLite unless you expect concurrent writers or need network access.
+ For a desktop app neither is likely, so SQLite wins on simplicity... [trimmed]
+ ```
+
+ The selection note goes to stderr; the attributed answers go to stdout. In a terminal
+ each answer gets the rule shown above; when piped or read by another agent, the same
+ blocks render as plain `## ...` headings. Add `--json` for machine-readable JSONL.
+
  ## Usage
 
  MOA has three prompt verbs that share the same selection/output options:
@@ -169,13 +196,13 @@ The synthesizer default is persistable too (e.g. `moa config set synthesizer cod
 
  ### Output
 
- - **stdout** carries only content: each agent's answer as a Markdown block (`## claude (opus) - OK - 3.5s`), flushed the instant that agent finishes. `moa distill` then appends the merged block (`## synthesis · via claude - OK - ...`) once the aggregator finishes.
+ - **stdout** carries only content. In a terminal, each agent's answer is fronted by a centered box-drawing rule naming it (`──── claude (opus) · OK · 3.5s ────`) with blank lines for separation, flushed the instant that agent finishes. When stdout is **piped or read by an agent** (not a TTY), the same block renders as a plain, low-noise `## claude (opus) · OK · 3.5s` heading instead - no box-drawing. `moa distill` emits only the final merged block.
  - **stderr** carries progress and selection notes (`Asking claude, codex ...`), so piping stdout stays clean.
- - `--json` emits one JSON object per line (JSONL): a `{"type": "response", ...}` record per agent as it completes; `distill` then adds a `{"type": "synthesis", ...}` record. `debate` instead emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
+ - `--json` emits one JSON object per line (JSONL): `ask` writes a `{"type": "response", ...}` record per agent as it completes; `distill` writes a single `{"type": "synthesis", ...}` record (only the merged answer); `debate` writes a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record. Ideal when another agent calls MOA and parses the result.
 
  ### `moa distill` (synthesis)
 
- `distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. It needs at least two successful proposer answers; with fewer it streams what it has and skips the merge. The aggregator is chosen with `-s/--synthesizer`:
+ `distill` runs the same council fan-out as `ask`, then one more pass where a strong aggregator merges the collected answers into a single, unified answer. **It returns only that merged answer** - the individual proposer responses are intermediates and are not printed (each one's arrival is noted on stderr so the wait isn't silent). It needs at least two successful proposer answers; with fewer it skips the merge and says so on stderr. The aggregator is chosen with `-s/--synthesizer`:
 
  - `auto` (default) - the highest-priority agent that ran (deterministic)
  - `random` - pick one of the agents that ran, at random
@@ -193,9 +220,9 @@ The aggregator prompt is adapted from the Mixture-of-Agents "Aggregate-and-Synth
 
  **The loop.** Round 1: debater A answers cold; debater B sees A's answer with an adversarial-stance instruction ("identify errors/weaknesses before giving your own answer; do not agree merely to reach consensus"). Each later round, every debater sees the other's latest answer and responds in the same spirit. If every debater signals it has *no substantive change* (it may open its reply with `NO SUBSTANTIVE CHANGE`), the debate stops early before the cap.
 
- **The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed, per item 002) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`## verdict · judge <name> ...`).
+ **The judge.** A model that is **not** a debater reads the full transcript - presented **anonymized and order-shuffled** (a model is judging, so brand/position bias is killed) - and writes the final answer. Its prompt instructs it to weigh correctness and evidence **above** confidence and fluency. The judge's verdict is the final block (`──── verdict · judge <name> · ... ────`).
 
- **Streaming/output.** Each debater's turn streams as it completes (`## round N · <provider> ...`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
+ **Streaming/output.** Each debater's turn streams as it completes (`──── round N · <provider> · ... ────`), then the judge's verdict last. `--json` emits a `{"type": "debate_turn", "round": N, ...}` record per turn plus a final `{"type": "verdict", ...}` record.
 
  **Safety.** Debaters and the judge run in the same read-only (or `--yolo`) mode as the other verbs - there is no permission bypass. agy's partial-sandbox caveat (shell only; it can still edit files) applies here too.
 
@@ -220,6 +247,13 @@ Invocations below show the default (read-only) flags; `--yolo` swaps in each too
 
  Adding a new agent is a single entry in the `PROVIDERS` table in `src/moa_cli/cli.py` (executable, default model, command builder, permission flags); it then participates in detection, `-n` selection, and `distill` automatically.
 
+ ## Agent skill
+
+ If you drive MOA from an agent (e.g. Claude Code), there's a ready-made skill at
+ [`skills/moa/SKILL.md`](skills/moa/SKILL.md): it tells an agent when to reach for MOA and
+ how to use it (verb choice, self-exclusion via `-x <self>`, parsing the JSONL output). It
+ supersedes hand-rolling a "peer review" skill.
+
  ## Development
 
  ```bash
@@ -1,6 +1,6 @@
  [project]
  name = "moa-cli"
- version = "0.2.0"
+ version = "0.2.2"
  description = "Ask one question to multiple local AI coding CLIs in parallel and collect their answers."
  readme = "README.md"
  authors = [
@@ -1,3 +1,3 @@
  """MOA CLI package."""
 
- __version__ = "0.2.0"
+ __version__ = "0.2.2"
@@ -532,11 +532,27 @@ def build_judge_prompt(
 
  _STATUS_LABELS = {"ok": "OK", "failed": "FAILED", "timeout": "TIMEOUT", "missing": "MISSING"}
 
+ # Width of the separator rule that fronts each answer block. Fixed (not terminal-
+ # derived) so output is identical whether shown live or piped to a file.
+ _RULE_WIDTH = 60
+
 
  def _status_label(status: str) -> str:
      return _STATUS_LABELS.get(status, status.upper())
 
 
+ def _rule(label: str) -> str:
+     """A centered, box-drawing separator that names the block, e.g.
+     `──────── claude (opus) · OK · 2.3s ────────`. Falls back to the bare label
+     when it's wider than the rule."""
+     text = f" {label} "
+     if len(text) >= _RULE_WIDTH:
+         return text.strip()
+     pad = _RULE_WIDTH - len(text)
+     left = pad // 2
+     return "─" * left + text + "─" * (pad - left)
+
+
  def _body(result: RunResult) -> list[str]:
      if result.status == "ok":
          return [result.stdout.strip(), ""]
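The centering arithmetic added in `_rule` can be tried in isolation. The following is a standalone sketch that mirrors the added lines. The function name `rule` and its `width` parameter are illustrative only; the package itself uses the module constant `_RULE_WIDTH`.

```python
def rule(label: str, width: int = 60) -> str:
    """Center `label` in a box-drawing rule of `width` characters."""
    text = f" {label} "
    if len(text) >= width:
        # Label too wide for the rule: fall back to the bare label.
        return text.strip()
    pad = width - len(text)
    left = pad // 2
    # When the padding is odd, the extra dash lands on the right side.
    return "─" * left + text + "─" * (pad - left)


line = rule("claude (opus) · OK · 3.2s")
print(line)
print(len(line))  # 60
```

Because `len` counts code points, the `·` and `─` characters each contribute one unit, so the rule comes out at exactly `width` characters whenever the label fits.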
@@ -544,15 +560,36 @@ def _body(result: RunResult) -> list[str]:
      return ["```text", detail[-1200:], "```", ""]
 
 
- def render_block(result: RunResult) -> str:
+ def _plain_output() -> bool:
+     """True when stdout is not an interactive terminal - piped, redirected, or
+     read by another agent (the common "an agent shells out to moa" case). There
+     we drop the decorative box-drawing rule and extra blank lines for a plain,
+     low-noise `## label` heading that is cheaper for a model to consume."""
+     return not sys.stdout.isatty()
+
+
+ def _render(label: str, result: RunResult, plain: bool) -> str:
+     """One answer block. In a terminal: two leading blank lines and a centered
+     box-drawing rule, for clear visual separation as blocks stream in. When
+     piped: a plain `## label` heading with a single blank line, no box-drawing."""
+     if plain:
+         return "\n".join(["", f"## {label}", "", *_body(result)])
+     return "\n".join(["", "", _rule(label), "", *_body(result)])
+
+
+ def render_block(result: RunResult, plain: bool | None = None) -> str:
+     if plain is None:
+         plain = _plain_output()
      model = f" ({result.model})" if result.model else ""
-     heading = f"## {result.provider}{model} - {_status_label(result.status)} - {result.elapsed:.1f}s"
-     return "\n".join([heading, "", *_body(result)])
+     label = f"{result.provider}{model} · {_status_label(result.status)} · {result.elapsed:.1f}s"
+     return _render(label, result, plain)
 
 
- def render_synthesis_block(result: RunResult, synthesizer: str) -> str:
-     heading = f"## synthesis · via {synthesizer} - {_status_label(result.status)} - {result.elapsed:.1f}s"
-     return "\n".join([heading, "", *_body(result)])
+ def render_synthesis_block(result: RunResult, synthesizer: str, plain: bool | None = None) -> str:
+     if plain is None:
+         plain = _plain_output()
+     label = f"synthesis · via {synthesizer} · {_status_label(result.status)} · {result.elapsed:.1f}s"
+     return _render(label, result, plain)
 
 
  def result_record(result: RunResult) -> dict:
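The TTY switch introduced in this hunk can be illustrated with a small self-contained sketch. The names `is_plain` and `render` are hypothetical stand-ins, the fixed four-dash rule replaces the real centered `_rule`, and a plain string stands in for the `RunResult` body.

```python
import io
import sys


def is_plain(stream=None) -> bool:
    """True when the stream is not an interactive terminal (piped, redirected)."""
    stream = sys.stdout if stream is None else stream
    return not stream.isatty()


def render(label: str, body: str, plain: bool) -> str:
    """Plain mode: a '## label' heading. Terminal mode: a box-drawing rule."""
    if plain:
        return "\n".join(["", f"## {label}", "", body, ""])
    return "\n".join(["", "", f"──── {label} ────", "", body, ""])


piped = io.StringIO()  # an in-memory stream is never a TTY
print(is_plain(piped))
print(render("claude (opus) · OK · 3.5s", "SQLite.", plain=True))
```

Taking `plain` as an explicit argument, as the new `plain: bool | None = None` parameters do, lets a test pin either rendering without needing a real terminal.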
@@ -579,18 +616,22 @@ def synthesis_record(result: RunResult, synthesizer: str) -> dict:
      }
 
 
- def render_debate_turn_block(result: RunResult, round_num: int) -> str:
+ def render_debate_turn_block(result: RunResult, round_num: int, plain: bool | None = None) -> str:
+     if plain is None:
+         plain = _plain_output()
      model = f" ({result.model})" if result.model else ""
-     heading = (
-         f"## round {round_num} · {result.provider}{model} - "
-         f"{_status_label(result.status)} - {result.elapsed:.1f}s"
+     label = (
+         f"round {round_num} · {result.provider}{model} · "
+         f"{_status_label(result.status)} · {result.elapsed:.1f}s"
      )
-     return "\n".join([heading, "", *_body(result)])
+     return _render(label, result, plain)
 
 
- def render_judge_block(result: RunResult, judge: str) -> str:
-     heading = f"## verdict · judge {judge} - {_status_label(result.status)} - {result.elapsed:.1f}s"
-     return "\n".join([heading, "", *_body(result)])
+ def render_judge_block(result: RunResult, judge: str, plain: bool | None = None) -> str:
+     if plain is None:
+         plain = _plain_output()
+     label = f"verdict · judge {judge} · {_status_label(result.status)} · {result.elapsed:.1f}s"
+     return _render(label, result, plain)
 
 
  def debate_turn_record(result: RunResult, round_num: int) -> dict:
@@ -855,11 +896,20 @@ async def _collect(
      json_output: bool,
      models: dict[str, str] | None = None,
      yolo: bool = False,
+     emit_blocks: bool = True,
  ) -> list[RunResult]:
+     """Gather every agent's result. With emit_blocks (ask), each complete answer
+     is flushed to stdout the instant it arrives. Without it (distill), the
+     individual answers are intermediates the user shouldn't see - only the final
+     distilled block is content - so we keep stdout clean and just heartbeat each
+     arrival to stderr so a multi-agent run doesn't look frozen while it waits."""
      results: list[RunResult] = []
      async for result in stream(providers, prompt, timeout, models, yolo):
          results.append(result)
-         _emit(json.dumps(result_record(result)) if json_output else render_block(result))
+         if emit_blocks:
+             _emit(json.dumps(result_record(result)) if json_output else render_block(result))
+         else:
+             _note(f" {result.provider} responded ({_status_label(result.status)}, {result.elapsed:.1f}s)")
      return results
 
 
@@ -1029,8 +1079,13 @@ def distill(
      # it merges through the same precedence: CLI flag > config file > built-in.
      synthesizer = resolve_option(synthesizer, "synthesizer", _read_config_or_empty(), "auto")
 
+     # distill returns only the merged answer, so the proposer responses are
+     # intermediates: collect them without printing each to stdout.
      results = asyncio.run(
-         _collect(cfg.selected, cfg.prompt, cfg.timeout, cfg.json_output, cfg.models, cfg.yolo)
+         _collect(
+             cfg.selected, cfg.prompt, cfg.timeout, cfg.json_output, cfg.models, cfg.yolo,
+             emit_blocks=False,
+         )
      )
      successes = [r for r in results if r.status == "ok"]
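The `emit_blocks` switch in `_collect`, which `distill` now turns off, amounts to routing each streamed result either to stdout as content or to stderr as a heartbeat. A minimal sketch of that pattern follows. Everything in it is a stand-in: `fake_stream` replaces the real agent fan-out, plain dicts replace `RunResult`, and the `out`/`err` parameters exist only so the example can be checked without a terminal.

```python
import asyncio
import io


async def fake_stream(names):
    """Stand-in for the real fan-out: yields one result per agent name."""
    for name in names:
        await asyncio.sleep(0)  # give control back to the event loop
        yield {"provider": name, "answer": f"answer from {name}"}


async def collect(names, emit_blocks, out, err):
    """Gather all results; print answers (ask) or only heartbeats (distill)."""
    results = []
    async for result in fake_stream(names):
        results.append(result)
        if emit_blocks:
            print(result["answer"], file=out)  # content goes to stdout
        else:
            print(f"{result['provider']} responded", file=err)  # progress only
    return results


out, err = io.StringIO(), io.StringIO()
results = asyncio.run(collect(["claude", "codex"], emit_blocks=False, out=out, err=err))
print(len(results))
print(repr(out.getvalue()))
print(err.getvalue(), end="")
```

With `emit_blocks=False` the content stream stays empty while every arrival is still reported, which matches the README's statement that `distill` prints only the merged answer.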