@maestrofrontier/frontier 1.4.5 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (50)
  1. package/.agents/plugins/marketplace.json +21 -21
  2. package/.codex-plugin/plugin.json +29 -29
  3. package/.cursorrules +197 -194
  4. package/AGENTS.md +3 -3
  5. package/README.md +368 -368
  6. package/bin/maestro.cjs +75 -75
  7. package/commands/compress.md +36 -36
  8. package/commands/frontier.md +124 -124
  9. package/commands/terse.md +23 -23
  10. package/docs/codex.md +167 -167
  11. package/docs/orchestration.md +168 -168
  12. package/frontier/cli.cjs +279 -252
  13. package/frontier/config.cjs +468 -468
  14. package/frontier/dispatch.cjs +267 -255
  15. package/frontier/judge.cjs +92 -92
  16. package/frontier/run.cjs +201 -180
  17. package/frontier/schema.cjs +112 -112
  18. package/frontier/semaphore.cjs +49 -49
  19. package/frontier/synthesize.cjs +79 -79
  20. package/hooks/frontier-autorun.cjs +127 -120
  21. package/hooks/hooks.json +103 -103
  22. package/hooks/maestro-doctrine-guard.cjs +81 -81
  23. package/hooks/maestro-gate-reminder.cjs +22 -7
  24. package/hooks/maestro-gate-telemetry.cjs +79 -77
  25. package/hooks/maestro-phase-scope.cjs +118 -118
  26. package/hooks/maestro-statusline-sync.cjs +152 -152
  27. package/hooks/maestro-subagent-guard.cjs +148 -148
  28. package/hooks/maestro-terse-mode.cjs +189 -189
  29. package/hooks/maestro-toolbudget-advisory.cjs +127 -127
  30. package/integrations/README.md +111 -111
  31. package/integrations/cline/skills/frontier/SKILL.md +75 -75
  32. package/integrations/codex/prompts/frontier.md +70 -70
  33. package/integrations/codex/prompts/update.md +39 -39
  34. package/integrations/codex/skills/maestro-frontier/SKILL.md +122 -122
  35. package/integrations/codex/skills/maestro-settings/SKILL.md +55 -55
  36. package/integrations/codex/skills/maestro-terse/SKILL.md +58 -58
  37. package/integrations/codex/skills/maestro-update/SKILL.md +31 -31
  38. package/integrations/cursor/commands/frontier.md +63 -63
  39. package/integrations/cursor/commands/update.md +34 -34
  40. package/integrations/gemini/commands/frontier.toml +76 -76
  41. package/integrations/windsurf/workflows/frontier.md +70 -70
  42. package/package.json +58 -58
  43. package/scripts/install.cjs +1014 -1014
  44. package/settings/cli.cjs +140 -140
  45. package/settings/config.cjs +309 -309
  46. package/skills/maestro-frontier/SKILL.md +122 -122
  47. package/skills/maestro-settings/SKILL.md +55 -55
  48. package/skills/maestro-terse/SKILL.md +58 -58
  49. package/skills/maestro-update/SKILL.md +31 -31
  50. package/skills/terse/SKILL.md +74 -74
@@ -1,168 +1,168 @@
- # Maestro Multi-Agent Orchestration: Full Protocol (S2-S6)
-
- Loaded on demand: read this file when the Decision Gate
- ([AGENTS.md](../AGENTS.md) S1) returns a multi-agent verdict. The
- kernel's compact protocol is a subset of this document and suffices
- when this file is unavailable. Relocated verbatim from the always-on
- doctrine in v1.2, content here extends the kernel, never overrides
- it.
-
- ---
-
- ## Gate constraints (S1 detail)
-
- - Max 4 specialists per group.
- - >60% shared files or <=3 files in one chain: single-agent.
- - Overlapping ownership erases parallelism; high-centrality: bias
- single.
- - Specialists must differ in role or context, not split identical
- work, homogeneous splits underperform one agent with the same
- budget. Split-design rule for the Planner, not a gate downgrade.
- - Parallelizability first: specialization pays only when subtasks are
- structurally independent. Coupled subtasks: single-agent wins at
- equal token budget, gains that ignore total compute don't count.
- - Adversarial review is the best-evidenced multi-agent win. Review
- and debate panels: 3 specialists (odd, no ties); 4 stays the cap
- for parallel workstreams.
- - How to split (and whether a split is too homogeneous) is the
- Planner's call (S2), made after the spawn, never the gate's.
-
- ---
-
- ## 2. Planner [MULTI-AGENT]
-
- First sub-agent, created by calling the Task/Agent tool, never
- simulated inline by the orchestrator. No specialist work before
- Planner returns.
-
- Produces: subtasks with boundaries, dependency map, parallel groups
- (max 4), per-task file scope + objective + acceptance criteria, flags
- for single-agent subtasks and high-risk items, cross-talk pairs,
- token-cost assessment (flag >60% overlap), task-class match.
-
- Fewer broader > many narrow. Flag ambiguity, don't assume.
-
- Reading: recommends single-agent -> switch. Ambiguities -> surface.
-
- Task classes: Feature (spec/implement/test/integrate),
- Bug (reproduce/root-cause/fix/regress),
- Refactor (scope/refactor/test/verify),
- Audit (discover/analyze/consolidate), Docs+code (change/update/check).
-
- ---
-
- ## 3. Specialists [MULTI-AGENT]
-
- Manifest fields: ROLE, TASK, FILES (read/modify), UPSTREAM,
- ORIENTATION, ASSUMPTIONS, OUTPUT, ACCEPT, TOOLS (scoped), RULES (S7
- injected). ROLE = procedural workflow (step sequence + acceptance
- criteria), never a bare job title, identity labels alone don't
- change behavior.
-
- No conversation history, other tasks, full plan, or unrelated
- context. Isolation is the advantage. Out of scope: report and stop.
-
- ---
-
- ## 4. Cross-Talk [MULTI-AGENT]
-
- After each group: check if A modified B's files, changed B's
- interfaces, invalidated B's assumptions, or produced B's inputs.
-
- Route minimum context from A to B. If B completed, spawn correction
- agent. Orchestrator: spawn, sequence, detect, route, deliver. Never
- plan, code, review, or do specialist work.
-
- ---
-
- ## 5. Staff Engineer [MULTI-AGENT]
-
- Final sub-agent. Reviews integrated output.
-
- Packet: changed files + diffs, objective, decisions, risks,
- questions. Expand for: core architecture, security, central
- abstractions.
-
- Check: requirements met, specialist contradictions, cross-breakage
- (interfaces/imports/types/state), architectural drift, verification
- (S7.3), dead code/orphaned imports/incomplete renames,
- surgical-scope violations (S7.4).
-
- Returns PASS or FAIL (issues + owner + fix). Max 2 cycles, then
- deliver with issues listed.
-
- High-risk or contested verdicts: adversarial panel of 3 (odd, no
- ties), each prompted to refute, not confirm.
-
- ---
-
- ## 6. Orchestrator Discipline [MULTI-AGENT]
-
- - Route minimum viable info (signature, not 200-line diff)
- - Checkpoint before spawns/handoffs/resumes: objective, files,
- requirements, decisions, risks, next action
- - Structured artifacts > transcript carryover
- - Stable scaffolds for cache reuse; no per-specialist rephrasing
- - Track agent status; report blocks immediately
- - Resume from latest artifact, not full history
- - Specialist fails: report, ask user. No silent retry >1
- - Deliver what asked. No gold-plating. Hooks > prompt reminders
-
- ---
-
- ## 9. Model Routing: full table
-
- Pick the cheapest model that handles the task. Orchestrator decides
- at spawn time; Planner (S2) assigns per subtask.
-
- | Tier | When | Examples |
- |------|------|----------|
- | Haiku | No edits, single source, low reasoning | Status lookup, chat, format, classify, extract |
- | **Sonnet** | 1-3 file edits, known scope. **Default** | Bug fix, refactor, test, review, docs |
- | Opus | 4+ files, novel design, high reversal cost | Architecture, security review, complex debug |
- | Frontier (Fable-class) | Orchestrator tier: long-horizon autonomous work, 1M-context audits, frontier reasoning | Orchestration, system design, deep multi-file debug, adversarial synthesis |
-
- When unsure: Sonnet.
-
- ### Output caps
-
- Agent prompts MUST specify max response length. Oversized results
- bloat parent context and trigger compaction.
-
- | Agent tier | Cap | Exception |
- |------------|-----|-----------|
- | Haiku | 100 words | - |
- | Sonnet | 500 words | Code output (uncapped) |
- | Opus | Uncapped | - |
- | Frontier | Uncapped | - |
- | Explore | 200 words | Always, regardless of model |
-
- Explore agents: "report in under 200 words" in every prompt.
-
- ### Tool-call budgets
-
- Action tokens are the third cost lever, beside output caps (above)
- and S8 input compression. Every subagent prompt carries a tool-call
- budget (manifest field `toolBudget`); idea adapted from
- claude-token-efficient (MIT).
-
- | Task type | Budget |
- |-----------|--------|
- | Routine subtask, known scope | ~20 calls |
- | Read-only research / Explore | ~10 calls |
- | Multi-file implementation | scale with file count; state it explicitly |
-
- Discipline inside the budget: read-first-write-once (read each
- needed file once, then edit, no re-read loops); one diagnostic read
- per failure, then the S7.3 two-attempt rule applies (stop, re-read
- from scratch, change approach). Budget exhausted: report progress
- and the named gap, never burn calls polling.
- Research agents returning raw dumps waste more tokens than they save.
-
- ---
-
- ## Self-evaluation (relocated S7.6)
-
- - Two perspectives: perfectionist critique + pragmatist accept
- - Bug autopsy: root cause vs symptom, prevention
- - After 2 failures: stop, re-read from scratch, different approach
+ # Maestro Multi-Agent Orchestration: Full Protocol (S2-S6)
+
+ Loaded on demand: read this file when the Decision Gate
+ ([AGENTS.md](../AGENTS.md) S1) returns a multi-agent verdict. The
+ kernel's compact protocol is a subset of this document and suffices
+ when this file is unavailable. Relocated verbatim from the always-on
+ doctrine in v1.2, content here extends the kernel, never overrides
+ it.
+
+ ---
+
+ ## Gate constraints (S1 detail)
+
+ - Max 4 specialists per group.
+ - >60% shared files or <=3 files in one chain: single-agent.
+ - Overlapping ownership erases parallelism; high-centrality: bias
+ single.
+ - Specialists must differ in role or context, not split identical
+ work, homogeneous splits underperform one agent with the same
+ budget. Split-design rule for the Planner, not a gate downgrade.
+ - Parallelizability first: specialization pays only when subtasks are
+ structurally independent. Coupled subtasks: single-agent wins at
+ equal token budget, gains that ignore total compute don't count.
+ - Adversarial review is the best-evidenced multi-agent win. Review
+ and debate panels: 3 specialists (odd, no ties); 4 stays the cap
+ for parallel workstreams.
+ - How to split (and whether a split is too homogeneous) is the
+ Planner's call (S2), made after the spawn, never the gate's.
+
+ ---
+
+ ## 2. Planner [MULTI-AGENT]
+
+ First sub-agent, created by calling the Task/Agent tool, never
+ simulated inline by the orchestrator. No specialist work before
+ Planner returns.
+
+ Produces: subtasks with boundaries, dependency map, parallel groups
+ (max 4), per-task file scope + objective + acceptance criteria, flags
+ for single-agent subtasks and high-risk items, cross-talk pairs,
+ token-cost assessment (flag >60% overlap), task-class match.
+
+ Fewer broader > many narrow. Flag ambiguity, don't assume.
+
+ Reading: recommends single-agent -> switch. Ambiguities -> surface.
+
+ Task classes: Feature (spec/implement/test/integrate),
+ Bug (reproduce/root-cause/fix/regress),
+ Refactor (scope/refactor/test/verify),
+ Audit (discover/analyze/consolidate), Docs+code (change/update/check).
+
+ ---
+
+ ## 3. Specialists [MULTI-AGENT]
+
+ Manifest fields: ROLE, TASK, FILES (read/modify), UPSTREAM,
+ ORIENTATION, ASSUMPTIONS, OUTPUT, ACCEPT, TOOLS (scoped), RULES (S7
+ injected). ROLE = procedural workflow (step sequence + acceptance
+ criteria), never a bare job title, identity labels alone don't
+ change behavior.
+
+ No conversation history, other tasks, full plan, or unrelated
+ context. Isolation is the advantage. Out of scope: report and stop.
+
+ ---
+
+ ## 4. Cross-Talk [MULTI-AGENT]
+
+ After each group: check if A modified B's files, changed B's
+ interfaces, invalidated B's assumptions, or produced B's inputs.
+
+ Route minimum context from A to B. If B completed, spawn correction
+ agent. Orchestrator: spawn, sequence, detect, route, deliver. Never
+ plan, code, review, or do specialist work.
+
+ ---
+
+ ## 5. Staff Engineer [MULTI-AGENT]
+
+ Final sub-agent. Reviews integrated output.
+
+ Packet: changed files + diffs, objective, decisions, risks,
+ questions. Expand for: core architecture, security, central
+ abstractions.
+
+ Check: requirements met, specialist contradictions, cross-breakage
+ (interfaces/imports/types/state), architectural drift, verification
+ (S7.3), dead code/orphaned imports/incomplete renames,
+ surgical-scope violations (S7.4).
+
+ Returns PASS or FAIL (issues + owner + fix). Max 2 cycles, then
+ deliver with issues listed.
+
+ High-risk or contested verdicts: adversarial panel of 3 (odd, no
+ ties), each prompted to refute, not confirm.
+
+ ---
+
+ ## 6. Orchestrator Discipline [MULTI-AGENT]
+
+ - Route minimum viable info (signature, not 200-line diff)
+ - Checkpoint before spawns/handoffs/resumes: objective, files,
+ requirements, decisions, risks, next action
+ - Structured artifacts > transcript carryover
+ - Stable scaffolds for cache reuse; no per-specialist rephrasing
+ - Track agent status; report blocks immediately
+ - Resume from latest artifact, not full history
+ - Specialist fails: report, ask user. No silent retry >1
+ - Deliver what asked. No gold-plating. Hooks > prompt reminders
+
+ ---
+
+ ## 9. Model Routing: full table
+
+ Pick the cheapest model that handles the task. Orchestrator decides
+ at spawn time; Planner (S2) assigns per subtask.
+
+ | Tier | When | Examples |
+ |------|------|----------|
+ | Haiku | No edits, single source, low reasoning | Status lookup, chat, format, classify, extract |
+ | **Sonnet** | 1-3 file edits, known scope. **Default** | Bug fix, refactor, test, review, docs |
+ | Opus | 4+ files, novel design, high reversal cost | Architecture, security review, complex debug |
+ | Frontier (Fable-class) | Orchestrator tier: long-horizon autonomous work, 1M-context audits, frontier reasoning | Orchestration, system design, deep multi-file debug, adversarial synthesis |
+
+ When unsure: Sonnet.
+
+ ### Output caps
+
+ Agent prompts MUST specify max response length. Oversized results
+ bloat parent context and trigger compaction.
+
+ | Agent tier | Cap | Exception |
+ |------------|-----|-----------|
+ | Haiku | 100 words | - |
+ | Sonnet | 500 words | Code output (uncapped) |
+ | Opus | Uncapped | - |
+ | Frontier | Uncapped | - |
+ | Explore | 200 words | Always, regardless of model |
+
+ Explore agents: "report in under 200 words" in every prompt.
+
+ ### Tool-call budgets
+
+ Action tokens are the third cost lever, beside output caps (above)
+ and S8 input compression. Every subagent prompt carries a tool-call
+ budget (manifest field `toolBudget`); idea adapted from
+ claude-token-efficient (MIT).
+
+ | Task type | Budget |
+ |-----------|--------|
+ | Routine subtask, known scope | ~20 calls |
+ | Read-only research / Explore | ~10 calls |
+ | Multi-file implementation | scale with file count; state it explicitly |
+
+ Discipline inside the budget: read-first-write-once (read each
+ needed file once, then edit, no re-read loops); one diagnostic read
+ per failure, then the S7.3 two-attempt rule applies (stop, re-read
+ from scratch, change approach). Budget exhausted: report progress
+ and the named gap, never burn calls polling.
+ Research agents returning raw dumps waste more tokens than they save.
+
+ ---
+
+ ## Self-evaluation (relocated S7.6)
+
+ - Two perspectives: perfectionist critique + pragmatist accept
+ - Bug autopsy: root cause vs symptom, prevention
+ - After 2 failures: stop, re-read from scratch, different approach
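
The Model Routing and Output caps tables in the hunk above can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not code from the package: the names `Subtask`, `pick_tier`, and `output_cap`, and the fields on `Subtask`, are hypothetical and chosen only for illustration. The Frontier tier is left out of `pick_tier` because the table describes it as the orchestrator tier rather than a per-subtask choice.

```python
from dataclasses import dataclass
from typing import Optional

# Word caps per agent tier, transcribed from the "Output caps" table.
# None stands for "Uncapped".
OUTPUT_CAPS = {"Haiku": 100, "Sonnet": 500, "Opus": None, "Frontier": None}
EXPLORE_CAP = 200  # Explore agents: always 200 words, regardless of model


@dataclass
class Subtask:
    # All fields are hypothetical; the document does not define a schema.
    files_edited: int            # number of files the subtask will edit
    novel_design: bool = False   # novel design or high reversal cost
    low_reasoning: bool = False  # single source, low reasoning demand


def pick_tier(task: Subtask) -> str:
    """Pick the cheapest tier whose 'When' column matches the subtask."""
    if task.novel_design or task.files_edited >= 4:
        return "Opus"
    if task.files_edited == 0 and task.low_reasoning:
        return "Haiku"
    # 1-3 file edits with known scope, and the "When unsure" default.
    return "Sonnet"


def output_cap(tier: str, explore: bool = False,
               code_output: bool = False) -> Optional[int]:
    """Return the word cap to state in the agent prompt, or None if uncapped."""
    if explore:
        return EXPLORE_CAP       # the Explore cap overrides the model tier
    if tier == "Sonnet" and code_output:
        return None              # Sonnet exception: code output is uncapped
    return OUTPUT_CAPS[tier]


if __name__ == "__main__":
    task = Subtask(files_edited=2)
    tier = pick_tier(task)
    print(tier, output_cap(tier))
```

Checking `explore` before the tier lookup mirrors the "Always, regardless of model" exception; a subtask with no edits that is not clearly low-reasoning falls through to Sonnet, following "When unsure: Sonnet".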