@maestrofrontier/frontier 1.4.4 → 1.5.0
- package/.agents/plugins/marketplace.json +21 -0
- package/.codex-plugin/plugin.json +29 -0
- package/.cursorrules +197 -194
- package/AGENTS.md +214 -214
- package/CLAUDE.md +29 -29
- package/README.md +368 -278
- package/bin/maestro.cjs +75 -75
- package/commands/compress.md +36 -36
- package/commands/frontier.md +124 -124
- package/commands/terse.md +23 -23
- package/docs/codex.md +167 -98
- package/docs/orchestration.md +168 -168
- package/frontier/cli.cjs +279 -248
- package/frontier/config.cjs +468 -441
- package/frontier/dispatch.cjs +267 -255
- package/frontier/judge.cjs +92 -92
- package/frontier/run.cjs +201 -148
- package/frontier/schema.cjs +112 -112
- package/frontier/semaphore.cjs +49 -49
- package/frontier/synthesize.cjs +79 -79
- package/hooks/frontier-autorun.cjs +127 -124
- package/hooks/hooks.json +103 -103
- package/hooks/maestro-doctrine-guard.cjs +81 -81
- package/hooks/maestro-gate-reminder.cjs +22 -7
- package/hooks/maestro-gate-telemetry.cjs +79 -77
- package/hooks/maestro-phase-scope.cjs +118 -118
- package/hooks/maestro-statusline-sync.cjs +152 -152
- package/hooks/maestro-subagent-guard.cjs +148 -148
- package/hooks/maestro-terse-mode.cjs +189 -189
- package/hooks/maestro-toolbudget-advisory.cjs +127 -127
- package/integrations/README.md +111 -94
- package/integrations/cline/skills/frontier/SKILL.md +75 -75
- package/integrations/codex/prompts/frontier.md +70 -66
- package/integrations/codex/prompts/update.md +39 -36
- package/integrations/codex/skills/maestro-frontier/SKILL.md +122 -0
- package/integrations/codex/skills/{settings → maestro-settings}/SKILL.md +55 -46
- package/integrations/codex/skills/{terse → maestro-terse}/SKILL.md +58 -49
- package/integrations/codex/skills/maestro-update/SKILL.md +31 -0
- package/integrations/cursor/commands/frontier.md +63 -63
- package/integrations/cursor/commands/update.md +34 -34
- package/integrations/gemini/commands/frontier.toml +76 -76
- package/integrations/windsurf/workflows/frontier.md +70 -70
- package/package.json +58 -55
- package/scripts/install.cjs +1014 -605
- package/settings/cli.cjs +140 -140
- package/settings/config.cjs +309 -309
- package/skills/maestro-frontier/SKILL.md +122 -0
- package/skills/maestro-settings/SKILL.md +55 -0
- package/skills/maestro-terse/SKILL.md +58 -0
- package/skills/maestro-update/SKILL.md +31 -0
- package/skills/terse/SKILL.md +74 -0
- package/integrations/codex/skills/frontier/SKILL.md +0 -91
- package/integrations/codex/skills/update/SKILL.md +0 -29
package/README.md
CHANGED
|
@@ -1,278 +1,368 @@
|
|
|
1
|
-
<p align="center">
|
|
2
|
-
<img src="assets/maestro-frontier-banner.png" width="100%" alt="Maestro Frontier: the mascot conducts a panel of local model CLIs through a judge model into a grounded synthesis">
|
|
3
|
-
</p>
|
|
4
|
-
|
|
5
|
-
<p align="center">
|
|
6
|
-
<strong>Achieve Frontier AI performance in your CLI</strong> — by fusing the model CLIs you already run. Fan one prompt across a panel of 1-8 of your local CLIs in parallel, have a judge model you pick read every answer into a structured analysis, then a synthesizer you pick write one grounded answer that does not majority-vote. On a 100-task benchmark, every fusion panel outscored its individual member models. It runs on Maestro's discipline layer: verified done-claims, surgical scope, and a research-backed multi-agent gate.
|
|
7
|
-
</p>
|
|
8
|
-
|
|
9
|
-
<p align="center">
|
|
10
|
-
<a href="https://github.com/mbanderas/maestro/actions/workflows/ci.yml"><img src="https://github.com/mbanderas/maestro/actions/workflows/ci.yml/badge.svg" alt="CI status"></a>
|
|
11
|
-
<a href="https://github.com/mbanderas/maestro/tags"><img src="https://img.shields.io/github/v/tag/mbanderas/maestro?label=version&color=5b82d6" alt="Latest version"></a>
|
|
12
|
-
<a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT License"></a>
|
|
13
|
-
<img src="https://img.shields.io/badge/dependencies-zero-brightgreen" alt="Zero Dependencies">
|
|
14
|
-
</p>
|
|
15
|
-
|
|
16
|
-
<p align="center">
|
|
17
|
-
<a href="#the-frontier-engine"><img src="https://img.shields.io/badge/Frontier-fusion%20engine-f59e0b" alt="Frontier fusion engine"></a>
|
|
18
|
-
<img src="https://img.shields.io/badge/agents-Claude%20Code%20%7C%20Gemini%20%7C%20Codex%20%7C%20Cursor-5b82d6" alt="Claude Code | Gemini | Codex | Cursor">
|
|
19
|
-
<img src="https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-lightgrey" alt="macOS | Linux | Windows">
|
|
20
|
-
<img src="https://img.shields.io/badge/install-
|
|
21
|
-
<a href="#contributing"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen" alt="PRs welcome"></a>
|
|
22
|
-
</p>
|
|
23
|
-
|
|
24
|
-
<p align="center">
|
|
25
|
-
<sub>13 fixture tasks · 123 valid A/B runs · 11 voids excluded & re-run · 6 hooks, all tested · ~8 KB always-on kernel ·
|
|
26
|
-
</p>
|
|
27
|
-
|
|
28
|
-
> **Install — run the
|
|
1
|
+
<p align="center">
|
|
2
|
+
<img src="assets/maestro-frontier-banner.png" width="100%" alt="Maestro Frontier: the mascot conducts a panel of local model CLIs through a judge model into a grounded synthesis">
|
|
3
|
+
</p>
|
|
4
|
+
|
|
5
|
+
<p align="center">
|
|
6
|
+
<strong>Achieve Frontier AI performance in your CLI</strong> — by fusing the model CLIs you already run. Fan one prompt across a panel of 1-8 of your local CLIs in parallel, have a judge model you pick read every answer into a structured analysis, then a synthesizer you pick write one grounded answer that does not majority-vote. On a 100-task benchmark, every fusion panel outscored its individual member models. It runs on Maestro's discipline layer: verified done-claims, surgical scope, and a research-backed multi-agent gate.
|
|
7
|
+
</p>
|
|
8
|
+
|
|
9
|
+
<p align="center">
|
|
10
|
+
<a href="https://github.com/mbanderas/maestro/actions/workflows/ci.yml"><img src="https://github.com/mbanderas/maestro/actions/workflows/ci.yml/badge.svg" alt="CI status"></a>
|
|
11
|
+
<a href="https://github.com/mbanderas/maestro/tags"><img src="https://img.shields.io/github/v/tag/mbanderas/maestro?label=version&color=5b82d6" alt="Latest version"></a>
|
|
12
|
+
<a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT License"></a>
|
|
13
|
+
<img src="https://img.shields.io/badge/dependencies-zero-brightgreen" alt="Zero Dependencies">
|
|
14
|
+
</p>
|
|
15
|
+
|
|
16
|
+
<p align="center">
|
|
17
|
+
<a href="#the-frontier-engine"><img src="https://img.shields.io/badge/Frontier-fusion%20engine-f59e0b" alt="Frontier fusion engine"></a>
|
|
18
|
+
<img src="https://img.shields.io/badge/agents-Claude%20Code%20%7C%20Gemini%20%7C%20Codex%20%7C%20Cursor-5b82d6" alt="Claude Code | Gemini | Codex | Cursor">
|
|
19
|
+
<img src="https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-lightgrey" alt="macOS | Linux | Windows">
|
|
20
|
+
<img src="https://img.shields.io/badge/install-plugins%20%2B%20portable-blueviolet" alt="Native plugins and portable installs">
|
|
21
|
+
<a href="#contributing"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen" alt="PRs welcome"></a>
|
|
22
|
+
</p>
|
|
23
|
+
|
|
24
|
+
<p align="center">
|
|
25
|
+
<sub>13 fixture tasks · 123 valid A/B runs · 11 voids excluded & re-run · 6 hooks, all tested · ~8 KB always-on kernel · plugin + portable installs</sub>
|
|
26
|
+
</p>
|
|
27
|
+
|
|
28
|
+
> **Install — run the command block for your tool, inside that tool.**
|
|
29
|
+
> Claude Code and Codex use native plugins. Cursor, Gemini, Cline,
|
|
30
|
+
> Windsurf, and other CLIs use the portable installer, which copies
|
|
31
|
+
> Maestro's runtime-agnostic files into the current project/workspace by
|
|
32
|
+
> default. Global/user installs are optional when you intentionally want
|
|
33
|
+
> cross-project behavior.
|
|
34
|
+
|
|
35
|
+
**Claude Code / Desktop** — native plugin (enforcement hooks, `/maestro:*` commands, skills, status line, Frontier auto-run):
|
|
36
|
+
|
|
37
|
+
```text
|
|
38
|
+
/plugin marketplace add mbanderas/maestro
|
|
39
|
+
/plugin install maestro@maestro
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
**Codex CLI / Desktop** — native Codex plugin via the Maestro repo
|
|
43
|
+
marketplace (skills, trusted hooks, and Frontier auto-run after you review
|
|
44
|
+
and trust the hooks):
|
|
45
|
+
|
|
46
|
+
```text
|
|
47
|
+
codex plugin marketplace add mbanderas/maestro
|
|
48
|
+
codex plugin add maestro@maestro
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
Start a new Codex thread after installing or changing plugin trust so the
|
|
52
|
+
bundled skills and hooks reload.
|
|
53
|
+
|
|
54
|
+
**Portable installs for other CLI / Desktop apps** — run the matching line
|
|
55
|
+
in that tool's terminal, or ask its agent to run it. These integrations are
|
|
56
|
+
prompt/skill/workflow shortcuts around the portable CLI unless the tool has
|
|
57
|
+
native hook support.
|
|
58
|
+
|
|
59
|
+
| Tool | Install (run inside the tool) |
|
|
60
|
+
|------|-------------------------------|
|
|
61
|
+
| Cursor | `npx github:mbanderas/maestro install --target cursor` |
|
|
62
|
+
| Gemini CLI | `npx github:mbanderas/maestro install --target gemini` |
|
|
63
|
+
| Cline | `npx github:mbanderas/maestro install --target cline` |
|
|
64
|
+
| Windsurf / Devin | `npx github:mbanderas/maestro install --target windsurf` |
|
|
65
|
+
| Not sure / auto-detect | `npx github:mbanderas/maestro install --target auto` |
|
|
66
|
+
|
|
67
|
+
Portable installs lay down `AGENTS.md` plus that tool's adapter or
|
|
68
|
+
integration file, `docs/orchestration.md`, the zero-dependency Frontier
|
|
69
|
+
engine, and the relevant command/skill files. Codex does not need that copy
|
|
70
|
+
path for normal use: the repo is its marketplace
|
|
71
|
+
(`.agents/plugins/marketplace.json`) and plugin
|
|
72
|
+
(`.codex-plugin/plugin.json`), bundling the Codex skills, hooks, Frontier
|
|
73
|
+
engine, settings CLI, commands, and `docs/codex.md`. The older
|
|
74
|
+
`maestro install --target codex` path remains a manual fallback when you
|
|
75
|
+
intentionally want project files instead of the plugin. Once published, swap
|
|
76
|
+
`github:mbanderas/maestro` for `@maestrofrontier/frontier` for portable npm
|
|
77
|
+
installs.
|
|
78
|
+
|
|
79
|
+
> **Want the bare `maestro` command on your `PATH`?** Native plugins can run
|
|
80
|
+
> the bundled engine from their plugin cache, and portable installs can run
|
|
81
|
+
> `node bin/maestro.cjs frontier ...` from the installed project root. Install
|
|
82
|
+
> globally only if you want to type `maestro` from any shell:
|
|
83
|
+
> `npm install -g github:mbanderas/maestro` (swap for
|
|
84
|
+
> `@maestrofrontier/frontier` once published), then fully restart the tool so
|
|
85
|
+
> it picks up the new `PATH`.
|
|
86
|
+
|
|
87
|
+
Frontier stays **off** until you arm it. Once armed, in runtimes with trusted
|
|
88
|
+
hooks, normal prompts auto-route through it until you disable it.
|
|
89
|
+
|
|
90
|
+
- Claude Code: `/maestro:frontier fusion opus-gpt` (or
|
|
91
|
+
`/maestro:frontier off`).
|
|
92
|
+
- Codex: after installing the plugin and trusting hooks, ask the bundled
|
|
93
|
+
`maestro-frontier` skill to arm the project scope, for example "Use
|
|
94
|
+
Maestro Frontier with ChatGPT duo", "Show Maestro Frontier status", or
|
|
95
|
+
"Turn Maestro Frontier off." The skill runs the plugin-bundled engine; no
|
|
96
|
+
`npx` or global `maestro` binary is required.
|
|
97
|
+
- Shell/advanced: from a checkout or global install, the equivalent CLI form
|
|
98
|
+
is:
|
|
99
|
+
|
|
100
|
+
```text
|
|
101
|
+
maestro frontier mode fusion --preset chatgpt-duo --scope codex-project
|
|
102
|
+
maestro frontier mode fusion --preset frontier-trio --judge chatgpt --synth chatgpt --scope codex-project
|
|
103
|
+
maestro frontier mode off --scope codex-project
|
|
104
|
+
```
|
|
105
|
+
|
|
106
|
+
Use `node bin/maestro.cjs frontier ...` in place of `maestro frontier ...`
|
|
107
|
+
when the binary is not on `PATH`. `codex-project` expands to the repo's
|
|
108
|
+
`codex-<8hex>` workspace scope, matching the trusted Codex plugin hook.
|
|
109
|
+
`maestro frontier run "<prompt>" ...` is still available for advanced/debug
|
|
110
|
+
one-offs, but arming mode is the normal autorun flow.
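The arming commands above could be assembled programmatically along these lines. This is a minimal sketch, not Maestro's own code: `buildFrontierArgs` and its option object are hypothetical names, and only the `frontier mode` subcommand, the `--preset`, `--judge`, `--synth`, and `--scope` flags, and the `node bin/maestro.cjs` fallback come from the text above.

```javascript
// Hypothetical helper: builds the argument list for arming or disabling Frontier.
// Only the subcommand and flag names are taken from the README.
function buildFrontierArgs({ mode, preset, judge, synth, scope, onPath = true }) {
  if (mode !== 'fusion' && mode !== 'off') {
    throw new Error(`this sketch only covers fusion and off, got: ${mode}`);
  }
  // Fall back to the bundled script when the bare `maestro` binary is not on PATH.
  const base = onPath ? ['maestro'] : ['node', 'bin/maestro.cjs'];
  const args = [...base, 'frontier', 'mode', mode];
  if (mode === 'fusion') {
    if (!preset) throw new Error('fusion mode needs a preset');
    args.push('--preset', preset);
    if (judge) args.push('--judge', judge);
    if (synth) args.push('--synth', synth);
  }
  if (scope) args.push('--scope', scope);
  return args;
}

console.log(
  buildFrontierArgs({ mode: 'fusion', preset: 'chatgpt-duo', scope: 'codex-project' }).join(' ')
);
```

Returning an array rather than a string keeps the arguments safe to hand to a process spawner without shell quoting.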
|
|
111
|
+
|
|
112
|
+
---
|
|
113
|
+
|
|
114
|
+
> **Agents:** start with [`docs/agent-map.md`](docs/agent-map.md) for
|
|
115
|
+
> repo navigation. This README is the user-facing product narrative.
|
|
116
|
+
|
|
117
|
+
## The Frontier Engine
|
|
118
|
+
|
|
119
|
+
**Achieve Frontier AI performance in your CLI.** Maestro Frontier is an
|
|
120
|
+
opt-in, zero-dependency multi-CLI fusion engine built from the AI CLIs
|
|
121
|
+
already on your machine. It fans a prompt out to a parallel panel of any
|
|
122
|
+
1-8 local CLIs you pick, has a judge model you choose read their answers
|
|
123
|
+
into a structured analysis (consensus, contradictions, unique insights,
|
|
124
|
+
blind spots; compare, not merge), then has a synthesizer you choose write
|
|
125
|
+
a grounded answer that does not majority-vote. The payoff is measured: on
|
|
126
|
+
a 100-task benchmark, fused panels beat the best of their individual
|
|
127
|
+
members — fusing the CLIs you already run buys frontier-tier results. It
|
|
128
|
+
is the project's new default identity; the doctrine, hooks, skills, and
|
|
129
|
+
benchmarks are unchanged; the discipline layer is its foundation.
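The panel, judge, and synthesizer flow described above can be sketched as follows. This is an illustration under stated assumptions, not the engine in `frontier/`: `fuse`, the stub panel members, and the shape of the analysis object are all hypothetical, and real panel members would invoke local CLIs rather than return canned strings.

```javascript
// Stand-in "CLIs": each panel member is an async function from prompt to answer.
async function fuse(prompt, panel, judge, synthesize) {
  if (panel.length < 1 || panel.length > 8) {
    throw new RangeError('panel must have 1-8 members');
  }
  // Fan the prompt out to every panel member in parallel.
  const answers = await Promise.all(
    panel.map(async ({ name, ask }) => ({ name, text: await ask(prompt) }))
  );
  // The judge compares the answers; it does not merge them.
  const analysis = await judge(prompt, answers);
  // The synthesizer writes one answer grounded in the analysis, not a vote count.
  const answer = await synthesize(prompt, answers, analysis);
  return { answers, analysis, answer };
}

// Toy usage with deterministic stubs (all hypothetical).
const demoPanel = [
  { name: 'a', ask: async (p) => `A says: ${p}` },
  { name: 'b', ask: async (p) => `B says: ${p}` },
];
const demoJudge = async (_prompt, answers) => ({
  consensus: [],
  contradictions: [],
  uniqueInsights: answers.map((x) => x.name),
  blindSpots: [],
});
const demoSynth = async (_prompt, answers, analysis) =>
  `grounded answer from ${analysis.uniqueInsights.join('+')} (${answers.length} inputs)`;

fuse('hello', demoPanel, demoJudge, demoSynth).then((r) => console.log(r.answer));
```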
|
|
130
|
+
|
|
131
|
+
It ships with the native plugins. Claude Code drives it with
|
|
132
|
+
`/maestro:frontier`, Codex drives it with the `maestro-frontier` skill, and
|
|
133
|
+
other CLIs can use `maestro frontier ...` or the `node bin/maestro.cjs
|
|
134
|
+
frontier ...` fallback. Three modes, switched at will, **`off` by default**
|
|
135
|
+
so installing or upgrading changes nothing until you opt in. **Arming it —
|
|
136
|
+
`single` or `fusion` — makes it auto-run on every prompt**: a
|
|
137
|
+
`UserPromptSubmit` hook routes each prompt through the engine and the live
|
|
138
|
+
session relays the synthesized answer. `off` is the disable path.
|
|
139
|
+
|
|
140
|
+
| Mode | Behavior |
|
|
141
|
+
|---|---|
|
|
142
|
+
| `off` | Normal Maestro. Engine never invoked; zero behavior change. The default, and the way to disable auto-run. |
|
|
143
|
+
| `single <model>` | Auto-runs every prompt through one local CLI and relays its answer. No panel, no judge, no synth. |
|
|
144
|
+
| `fusion <preset>` | Auto-runs every prompt through your panel -> a judge model's analysis -> a grounded synthesis, with graceful degradation and one-level recursion bounds. |
|
|
145
|
+
|
|
146
|
+
Claude Code examples:
|
|
147
|
+
|
|
148
|
+
```text
|
|
149
|
+
/maestro:frontier status # show current mode
|
|
150
|
+
/maestro:frontier single opus # arm one-CLI auto-run
|
|
151
|
+
/maestro:frontier fusion opus-gpt # arm panel auto-run (Opus + GPT-5.5)
|
|
152
|
+
/maestro:frontier run "your prompt here" # manual one-off (armed modes also auto-run)
|
|
153
|
+
/maestro:frontier off # disable auto-run; back to normal Maestro
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
In Codex, ask the bundled `maestro-frontier` skill for the same status,
|
|
157
|
+
single, fusion, run, or off operation; it resolves the plugin-bundled engine
|
|
158
|
+
when the bare `maestro` command is not on `PATH`.
|
|
159
|
+
|
|
160
|
+
Presets define the panel; the judge and synthesizer default to Opus 4.8
|
|
161
|
+
(`claude -p`), and you override either with `--judge` / `--synth`:
|
|
162
|
+
|
|
163
|
+
- **`opus-duo`**: two independent Opus runs, isolating the synthesis lift.
|
|
164
|
+
- **`opus-gpt`**: Opus + GPT-5.5 (via `codex exec`); the recommended default for bounded spend.
|
|
165
|
+
- **`chatgpt-duo`** (`gpt-duo` alias): two ChatGPT/Codex runs whose judge and synthesizer also run on ChatGPT/Codex, giving a Codex-only fusion that needs no `claude`.
|
|
166
|
+
- **`frontier-trio`**: Opus + GPT-5.5 + Gemini 3.1 Pro (via `gemini -p`).
|
|
167
|
+
- **`custom`**: 1-8 of the known models.
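One way to picture the preset list is as a lookup table with per-stage overrides. The sketch below is an assumption about shape only: `PRESETS`, `ALIASES`, and `resolvePreset` are hypothetical names, the model identifiers (`opus`, `gpt-5.5`, `gemini`) are borrowed from the settings table, and the real configuration in `frontier/config.cjs` may be organized differently.

```javascript
// Judge and synthesizer default to Opus unless a preset or the caller overrides them.
const DEFAULT_STAGE = 'opus';

const PRESETS = new Map([
  ['opus-duo', { panel: ['opus', 'opus'] }],
  ['opus-gpt', { panel: ['opus', 'gpt-5.5'] }],
  // Codex-only fusion: the judge and synthesizer stay on the same model family.
  ['chatgpt-duo', { panel: ['gpt-5.5', 'gpt-5.5'], judge: 'gpt-5.5', synth: 'gpt-5.5' }],
  ['frontier-trio', { panel: ['opus', 'gpt-5.5', 'gemini'] }],
]);
const ALIASES = new Map([['gpt-duo', 'chatgpt-duo']]);

function resolvePreset(name, { judge, synth, models } = {}) {
  const key = ALIASES.get(name) ?? name;
  let preset = PRESETS.get(key);
  if (key === 'custom') {
    // This sketch checks only the panel size, not whether each model is known.
    if (!Array.isArray(models) || models.length < 1 || models.length > 8) {
      throw new RangeError('custom panels take 1-8 models');
    }
    preset = { panel: models };
  }
  if (!preset) throw new Error(`unknown preset: ${name}`);
  return {
    panel: [...preset.panel],
    judge: judge ?? preset.judge ?? DEFAULT_STAGE,
    synth: synth ?? preset.synth ?? DEFAULT_STAGE,
  };
}

console.log(resolvePreset('opus-gpt', { synth: 'gpt-5.5' }));
```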
|
|
168
|
+
|
|
169
|
+
Three model CLIs ship as adapters today: Opus 4.8 (`claude`), GPT-5.5
|
|
170
|
+
(`codex`), and Gemini 3.1 Pro (`gemini`). Kimi, DeepSeek, GLM, and Qwen
|
|
171
|
+
adapters follow in an update soon.
|
|
172
|
+
|
|
173
|
+
Pass `--judge <model>` / `--synth <model>` to run those stages on any
|
|
174
|
+
model for any preset (e.g. `--judge opus --synth gpt-5.5`), so you can mix
|
|
175
|
+
the panel and the judge/synth freely. Degradation is graceful: a partial
|
|
176
|
+
panel failure still returns a synthesis plus `failed_models`; a judge
|
|
177
|
+
failure synthesizes from the raw responses; a hard failure returns a typed
|
|
178
|
+
`failure_reason`. A `FUSION_DEPTH` guard bounds recursion to one level.
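The degradation rules in this paragraph can be sketched with `Promise.allSettled`. The field names `failed_models` and `failure_reason` come from the text; everything else is an assumption for illustration, including the function name, the reason strings, the `judged` flag, and the reading of "one level" as "the user's run may nest once, and anything deeper is refused".

```javascript
// Assumed reading of the depth guard: depth 0 is the user's run, depth 1 is one
// nested run, and anything deeper is refused.
const MAX_FUSION_DEPTH = 1;

async function fuseGracefully({ prompt, panel, judge, synthesize, depth = 0 }) {
  if (depth > MAX_FUSION_DEPTH) {
    return { ok: false, failure_reason: 'recursion_limit' };
  }
  // allSettled lets one failing panel member coexist with successful ones.
  const settled = await Promise.allSettled(panel.map(async ({ ask }) => ask(prompt)));
  const answers = [];
  const failed_models = [];
  settled.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      answers.push({ name: panel[i].name, text: result.value });
    } else {
      failed_models.push(panel[i].name);
    }
  });
  if (answers.length === 0) {
    return { ok: false, failure_reason: 'all_panel_models_failed', failed_models };
  }
  let analysis = null;
  try {
    analysis = await judge(prompt, answers);
  } catch {
    // Judge failed: continue and synthesize from the raw responses.
  }
  try {
    const answer = await synthesize(prompt, answers, analysis);
    return { ok: true, answer, failed_models, judged: analysis !== null };
  } catch {
    return { ok: false, failure_reason: 'synthesis_failed', failed_models };
  }
}
```

Returning a typed result object instead of throwing keeps every outcome, including partial ones, inspectable by the caller.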
|
|
179
|
+
|
|
180
|
+
Honest scope, measured rather than implied: the **engine is built,
|
|
181
|
+
unit-tested (degradation, recursion, budget, anti-majority all covered),
|
|
182
|
+
and verified end-to-end on real runs of `single` mode and the
|
|
183
|
+
`opus-gpt`, `opus-duo`, and `frontier-trio` presets**. The `chatgpt-duo`
|
|
184
|
+
preset and `--judge`/`--synth` selection share that same code path and
|
|
185
|
+
are unit-tested, but not yet live-run. The quality *lift* of local fusion
|
|
186
|
+
is **measured, not asserted**: on a 100-task suite (93 scored) every
|
|
187
|
+
fusion panel outscored its own member models, with the strongest fusion
|
|
188
|
+
leading the field. That fusion-vs-solo result is a separate axis from the
|
|
189
|
+
in-repo A/B harness, which measures Maestro doctrine ON vs OFF; numbers
|
|
190
|
+
are never mixed across the two.
|
|
191
|
+
Operational caveats: headless web access differs per CLI (Codex confirmed
|
|
192
|
+
live; Claude and Gemini are gated `webTools:false` in this build), and
|
|
193
|
+
each cold `claude -p` panel/judge/synth call is non-trivial in cost; use
|
|
194
|
+
small prompts, and prefer `opus-gpt` to bound spend. The budget cap is
|
|
195
|
+
opt-in (`tokenBudget`, default disabled). The engine is zero-dependency
|
|
196
|
+
CommonJS under [`frontier/`](frontier/); each CLI is resolved from your
|
|
197
|
+
`PATH` (`claude`, `codex`, `gemini`). Binary overrides and the full
|
|
198
|
+
operational reference are in
|
|
199
|
+
[`commands/frontier.md`](commands/frontier.md#binary-overrides).
|
|
200
|
+
|
|
201
|
+
## What You Get
|
|
202
|
+
|
|
203
|
+
Frontier is the headline; the discipline layer beneath it is what runs on
|
|
204
|
+
every task. Drop two markdown files into your repo and your agent gains
|
|
205
|
+
five things:
|
|
206
|
+
|
|
207
|
+
1. **Done means done.** Completion reports carry a verification status (`VERIFIED` / `UNVERIFIED` / `FAIL`) backed by an actual type-check, lint, or test run, with an optional hook enforcing it structurally.
|
|
208
|
+
2. **It stays in its lane.** Under the surgical-scope rules, every changed line traces back to what you asked for: no drive-by refactors, no formatting sweeps, no deleting code it couldn't verify was dead.
|
|
209
|
+
3. **Long runs that land.** Overnight tasks and recurring loops get checkpoint artifacts, explicit end conditions, iteration caps, and re-grounding rules. This repo's own benchmark loops run on exactly these rules.
|
|
210
|
+
4. **Multi-agent only when it pays.** A counted Decision Gate routes work single-agent by default and demands an explicit verdict line before the first edit; orchestration stays behind it.
|
|
211
|
+
5. **Receipts.** A reproducible A/B benchmark harness ships in-repo, with our own retractions and nulls. Rerun every number yourself.
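As a rough illustration of item 1, a check for the verification status in a completion report might look like this. The three tokens come from the list above; the report format and the function names are hypothetical, and the shipped hook may detect status differently.

```javascript
// Status tokens named in the list above.
const STATUS_TOKENS = ['VERIFIED', 'UNVERIFIED', 'FAIL'];

// Whole-word match, so "UNVERIFIED" is never read as "VERIFIED"
// and "FAILED" is not read as "FAIL".
const STATUS_PATTERN = new RegExp(`\\b(${STATUS_TOKENS.join('|')})\\b`);

function checkCompletionReport(report) {
  const match = report.match(STATUS_PATTERN);
  return match
    ? { ok: true, status: match[1] }
    : { ok: false, warning: 'completion report carries no verification status' };
}

console.log(checkCompletionReport('Refactor done. Status: VERIFIED (tests passed).'));
console.log(checkCompletionReport('Refactor done, looks good.'));
```

Like the hook described later in this README, the sketch only reports a missing token; it does not block anything.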
|
|
212
|
+
|
|
213
|
+
The price, measured rather than implied: ON spends about 10% more than a
|
|
214
|
+
clean agent on a 10-module refactor and 38% more on a 16-file feature
|
|
215
|
+
(n=9 medians, t08/t12); you are buying verification and auditability, not
|
|
216
|
+
speed. The premium earns its keep on unattended work (overnight loops,
|
|
217
|
+
scheduled runs, CI agents) where nobody reads the 3am transcript and the
|
|
218
|
+
close-out claim is all you have.
|
|
219
|
+
|
|
220
|
+
## Discipline, Benchmarks, and Research
|
|
221
|
+
|
|
222
|
+
The discipline layer (verification, scope, honest status) applies to
|
|
223
|
+
every task, fusion or not. The full orchestration protocol lives in
|
|
224
|
+
[`docs/orchestration.md`](docs/orchestration.md). Benchmark data,
|
|
225
|
+
retractions, and methodology — including the honest reading that Maestro
|
|
226
|
+
ON has never beaten OFF on success rate in any measured cell and that the
|
|
227
|
+
early efficiency story did not survive replication — are in
|
|
228
|
+
[`docs/benchmarks.md`](docs/benchmarks.md) and
|
|
229
|
+
[`benchmarks/README.md`](benchmarks/README.md). The architecture is
|
|
230
|
+
grounded in 700+ sources; the key driver is that
|
|
231
|
+
[79% of multi-agent failures come from coordination, not model
|
|
232
|
+
capability](https://marklaursen.com/blog/why-your-multi-agent-ai-system-keeps-failing),
|
|
233
|
+
and that three optimized agents outperform seven.
|
|
234
|
+
|
|
235
|
+
## Runtime Adapters
|
|
236
|
+
|
|
237
|
+
Maestro separates **portable orchestration doctrine** from **runtime-specific adapters**. The core logic lives in `AGENTS.md` and works across any agent runtime; adapters are thin wrappers that import it and add only what is runtime-specific.
|
|
238
|
+
|
|
239
|
+
| File | Role | What it adds |
|
|
240
|
+
|---|---|---|
|
|
241
|
+
| `AGENTS.md` | Portable core | Full orchestration doctrine, runtime-agnostic |
|
|
242
|
+
| `CLAUDE.md` | Claude Code adapter | Subagent/team routing, hooks, context limits, tool scoping, long-horizon mapping (/loop, schedules) |
|
|
243
|
+
| `GEMINI.md` | Gemini adapter | Execution mapping, instruction precedence, verification notes, long-horizon note |
|
|
244
|
+
| `.cursorrules` | Cursor adapter | Kernel copy (Cursor does not support imports); full S2-S6 in docs/orchestration.md |
|
|
245
|
+
| [`docs/codex.md`](docs/codex.md) | Codex guide | AGENTS.md precedence and 32 KiB cap, Codex subagent mapping, Automations long-horizon mapping (Codex reads `AGENTS.md` natively) |
|
|
246
|
+
|
|
247
|
+
Maestro's tools run on **both Claude Code and Codex** — in Claude Code as
|
|
248
|
+
`/maestro:*` slash commands, and in Codex as plugin-bundled skills plus
|
|
249
|
+
trusted hooks. The portable `node settings/cli.cjs` and `maestro frontier ...`
|
|
250
|
+
CLIs also work on any other agent. The Codex skills
|
|
251
|
+
(`maestro-frontier`, `maestro-terse`, `maestro-settings`, `maestro-update`)
|
|
252
|
+
ship from the Maestro plugin; the older `maestro install --target codex` path
|
|
253
|
+
still works for manual project copies. When Frontier mode is on, the
|
|
254
|
+
`maestro-frontier` skill leads each Codex reply with `Maestro Frontier ON
|
|
255
|
+
(<label>)` (`single · <model>` or `fusion · <preset>`) — the Codex analog of
|
|
256
|
+
Claude Code's armed Frontier indicator; ask the skill to show status, or run
|
|
257
|
+
`maestro frontier status --scope codex-project` from a shell when using the
|
|
258
|
+
CLI directly.
|
|
259
|
+
|
|
260
|
+
GitHub Copilot, Cline, and Windsurf read `AGENTS.md` directly, so the portable core works there with no adapter. Maestro's always-on kernel (`AGENTS.md`) is ~8 KB, under Windsurf's 12,000-character limit and roughly a quarter of Codex's 32 KiB budget; the full multi-agent protocol loads on demand from `docs/orchestration.md`.
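The size claims above are easy to check for your own kernel file. The sketch below uses the two limits quoted in this paragraph; `checkKernelSize` is a hypothetical helper, and it takes the file contents as a string so that reading `AGENTS.md` from disk is left to the caller.

```javascript
// Limits quoted in the paragraph above.
const WINDSURF_CHAR_LIMIT = 12000;
const CODEX_BYTE_BUDGET = 32 * 1024; // 32 KiB

function checkKernelSize(text) {
  // String length counts UTF-16 code units, which equals characters for ASCII-heavy markdown.
  const chars = text.length;
  const bytes = Buffer.byteLength(text, 'utf8');
  return {
    chars,
    bytes,
    fitsWindsurf: chars <= WINDSURF_CHAR_LIMIT,
    codexBudgetUsed: bytes / CODEX_BYTE_BUDGET,
  };
}

// An 8 KiB ASCII file fits under the Windsurf limit and uses a quarter of the Codex budget.
console.log(checkKernelSize('x'.repeat(8 * 1024)));
```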
|
|
261
|
+
|
|
262
|
+
**Subagents vs Agent Teams (Claude Code):** Maestro's `CLAUDE.md` adapter
|
|
263
|
+
routes automatically. **Subagents** run within one session and report
|
|
264
|
+
results to the parent; this is the default for narrow independent work.
|
|
265
|
+
**[Agent teams](https://code.claude.com/docs/en/agent-teams)** coordinate
|
|
266
|
+
multiple sessions with peer-to-peer messaging, used only for long-running
|
|
267
|
+
parallel workstreams, competing-hypothesis debugging, or cross-layer
|
|
268
|
+
builds. Agent teams are experimental and Claude Code-only.
|
|
269
|
+
|
|
270
|
+
## Claude Code Tools
|
|
271
|
+
|
|
272
|
+
Optional Claude Code machinery; full install steps in the linked docs.
|
|
273
|
+
|
|
274
|
+
- **Verification Hook**: a `SubagentStop` hook that enforces S7.3 structurally by warning when a file-modifying subagent skips a checker or omits a status token. It never blocks. [`docs/hooks.md`](docs/hooks.md)
|
|
275
|
+
- **Hook Pack**: five more zero-dependency hooks (doctrine guard, loop guard, phase-scope, gate reminder, opt-in gate telemetry) enforcing the rest of the doctrine. [`docs/hooks.md`](docs/hooks.md)
|
|
276
|
+
- **Context Bar**: a status-line context-window progress bar that shifts green to amber to red and detects the model's window (including the 1M Opus tier). [`docs/context-bar.md`](docs/context-bar.md)
|
|
277
|
+
- **Terse Mode + Compress**: opt-in output-token reduction (`/maestro:terse`) and a memory-file compressor (`/maestro:compress`), adapted from the MIT-licensed Caveman plugin. [`docs/context-bar.md`](docs/context-bar.md)
|
|
278
|
+
- **Settings**: `/maestro:settings` changes any toggle in one line (`set terse off`, `frontier fusion opus-gpt`, `help`) or opens a keyboard picker with no arguments, plus a portable `node settings/cli.cjs status|list|help|set` for Codex and any other CLI. [`docs/settings.md`](docs/settings.md)
|
|
279
|
+
|
|
280
|
+
## Commands & Settings
|
|
281
|
+
|
|
282
|
+
Every Maestro slash command in Claude Code is namespaced `/maestro:<name>`.
|
|
283
|
+
The same tools run on Codex as plugin-bundled skills; on any CLI the same
|
|
284
|
+
actions also run through the portable scripts noted below.
|
|
285
|
+
|
|
286
|
+
| Command | What it does | Usage |
|
|
287
|
+
|---|---|---|
|
|
288
|
+
| `/maestro:settings` | See or change all toggles. With arguments it runs the change directly; with no arguments it opens a keyboard picker. | `/maestro:settings`, `… status`, `… list`, `… help`, `… set terse off`, `… frontier fusion opus-gpt` |
|
|
289
|
+
| `/maestro:frontier` | Drive the local multi-CLI fusion engine: switch mode, pick a model/preset, or run a prompt through it. | `… off`, `… single opus`, `… fusion opus-gpt`, `… status`, `… run "<prompt>"` |
|
|
290
|
+
| `/maestro:terse` | Switch terse output mode for the session (off by default). | `… lite`, `… full`, `… ultra`, `… off` |
|
|
291
|
+
| `/maestro:context-bar` | Toggle the status-line context progress bar (and the Maestro badges on it). | `/maestro:context-bar`, `… on`, `… off` |
|
|
292
|
+
| `/maestro:compress <file>` | Rewrite a natural-language memory file in terse form to cut input tokens; keeps a backup and validates deterministically. | `… path/to/NOTES.md` |
|
|
293
|
+
|
|
294
|
+
### Settings toggles
|
|
295
|
+
|
|
296
|
+
`/maestro:settings` and the portable `node settings/cli.cjs` cover three persisted toggles:
|
|
297
|
+
|
|
298
|
+
| Toggle | Values | What it controls |
|
|
299
|
+
|---|---|---|
|
|
300
|
+
| `terse` | `off`, `lite`, `full`, `ultra` | Output-token reduction. Shows an amber level badge (`ULTRA`) on the status bar. |
|
|
301
|
+
| `frontier` | `off`; `single:` `opus` / `gpt-5.5` / `gemini`; `fusion:` `opus-duo` / `opus-gpt` / `chatgpt-duo` / `frontier-trio` / `custom`, each with optional `--judge` / `--synth` | The local fusion engine. When armed it auto-runs on every prompt. The blue `f` panel badge means auto-run is on: `fO+C`, `fO+C+G`, `f*3` (`O`=Opus, `C`=ChatGPT/GPT-5.5, `G`=Gemini). |
|
|
302
|
+
| `context-bar` | `on`, `off` | The status-line context-window progress bar. |
|
|
303
|
+
|
|
304
|
+
Portable everywhere, Codex included: `node settings/cli.cjs status | list | help | set <key> <value>` (frontier also takes `--judge`, `--synth`, `--models a,b,c`, and `--scope <scope>`). Full references: [`docs/settings.md`](docs/settings.md) and [`docs/context-bar.md`](docs/context-bar.md).
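The allowed values in the toggles table can be expressed as a small validator. This is a hypothetical sketch, not the logic in `settings/cli.cjs`: the function name and the way a frontier value is split into words are assumptions, and the `--judge`, `--synth`, `--models`, and `--scope` options are left out.

```javascript
// Allowed values copied from the toggles table.
const SIMPLE_TOGGLES = new Map([
  ['terse', ['off', 'lite', 'full', 'ultra']],
  ['context-bar', ['on', 'off']],
]);
const FRONTIER_CHOICES = new Map([
  ['single', ['opus', 'gpt-5.5', 'gemini']],
  ['fusion', ['opus-duo', 'opus-gpt', 'chatgpt-duo', 'frontier-trio', 'custom']],
]);

// `words` is the value split on whitespace, e.g. ['fusion', 'opus-gpt'].
function isValidSetting(key, words) {
  if (key === 'frontier') {
    const [mode, choice] = words;
    if (mode === 'off') return words.length === 1;
    const choices = FRONTIER_CHOICES.get(mode);
    return words.length === 2 && choices !== undefined && choices.includes(choice);
  }
  const allowed = SIMPLE_TOGGLES.get(key);
  return allowed !== undefined && words.length === 1 && allowed.includes(words[0]);
}

console.log(isValidSetting('terse', ['ultra']));
console.log(isValidSetting('frontier', ['fusion', 'opus-gpt']));
console.log(isValidSetting('frontier', ['single', 'opus-duo']));
```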
|
|
305
|
+
|
|
306
|
+
## Updating Maestro
|
|
307
|
+
|
|
308
|
+
Maestro's marketplaces track `main`, so updating is a refresh rather than a
|
|
309
|
+
manual version edit.
|
|
310
|
+
|
|
311
|
+
### Claude Code
|
|
312
|
+
|
|
313
|
+
`/maestro:update` is the one-command path — it pulls the latest marketplace code, reports what changed, and tells you when to reload:
|
|
314
|
+
|
|
315
|
+
```text
|
|
316
|
+
/maestro:update
|
|
317
|
+
```
|
|
318
|
+
|
|
319
|
+
It can't run the reload for you (a slash command can't invoke another slash command), so it ends by prompting you to run `/reload-plugins` (or restart). The manual equivalent is two steps:
|
|
320
|
+
|
|
321
|
+
```text
|
|
322
|
+
/plugin marketplace update maestro
|
|
323
|
+
/reload-plugins
|
|
324
|
+
```
|
|
325
|
+
|
|
326
|
+
`/reload-plugins` applies the update in the running session; if Claude Code warns that a restart is required, restart it. Non-interactive equivalent of the pull: `claude plugin marketplace update maestro`.
|
|
327
|
+
|
|
328
|
+
### Codex
|
|
329
|
+
|
|
330
|
+
```text
|
|
331
|
+
codex plugin marketplace upgrade maestro
|
|
332
|
+
codex plugin add maestro@maestro
|
|
333
|
+
```
|
|
334
|
+
|
|
335
|
+
Open a new thread after reinstalling so Codex reloads bundled skills and hook
|
|
336
|
+
definitions.
|
|
337
|
+
|
|
338
|
+
### Cursor / Portable Installs
|
|
339
|
+
|
|
340
|
+
- **Git clone:** `git pull` inside the Maestro clone directory.
|
|
341
|
+
- **Downloaded copy:** re-run `npx github:mbanderas/maestro install --target auto --project .` from the project root, or re-download the tarball and re-copy `frontier/`, `bin/maestro.cjs`, plus your integration command file from the latest `main`.
|
|
342
|
+
|
|
343
|
+
### Gemini / other CLIs
|
|
344
|
+
|
|
345
|
+
Re-pull or re-copy `frontier/` and the relevant integration file from `main`. If your CLI supports custom commands and you have a `/update` wired, run that instead.
|
|
346
|
+
|
|
347
|
+
## Contributing
|
|
348
|
+
|
|
349
|
+
Contributions are welcome. Before opening a PR:
|
|
350
|
+
|
|
351
|
+
1. Read the research foundation. Maestro's constraints (4-agent cap, Decision Gate bias toward single-agent) are intentional and research-backed.
|
|
352
|
+
2. Keep it zero-dependency: no npm packages, no external imports
|
|
353
|
+
3. Test with real tasks across Claude Code, Gemini, Codex, and Cursor
|
|
354
|
+
4. Docs changes: run `npx --yes markdownlint-cli2` from the repo root (no install footprint; config in `.markdownlint-cli2.jsonc`)
|
|
355
|
+
|
|
356
|
+
If you have benchmarks, case studies, or research that challenges or extends the current architecture, open an issue. The design should evolve with evidence.
|
|
357
|
+
|
|
358
|
+
## Related Projects
|
|
359
|
+
|
|
360
|
+
- **[Govyn](https://github.com/govynAI/govyn)**: Open-source AI agent governance proxy. Maestro orchestrates your agents; Govyn ensures they never hold real API keys, stay within budget, and follow policy. They are designed to work together.
|
|
361
|
+
|
|
362
|
+
## Community
|
|
363
|
+
|
|
364
|
+
Using Maestro Frontier, or running the discipline layer on your own agent? [Open a discussion](https://github.com/mbanderas/maestro/discussions) or [file an issue](https://github.com/mbanderas/maestro/issues).
|
|
365
|
+
|
|
366
|
+
## License
|
|
367
|
+
|
|
368
|
+
MIT
|